kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-25 03:02:17 +00:00

Author	SHA1	Message	Date
Gabriela Cervantes	3e07c89d39	metrics: Remove unused variable in tensorflow nhwc script This PR removes unused variable in tensorflow nhwc script. Fixes #7750 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `32a778b6da`)	2023-09-21 13:25:56 +02:00
Fabiano Fidêncio	5b9a69433d	kata-deploy: Don't try to remove /opt/kata The directory is a host path mount and cannot be removed from within the container. What we actually want to remove is whatever is inside that directory. This may raise errors like: ``` rm: cannot remove '/opt/kata/': Device or resource busy ``` Fixes: #7746 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `d8f3ce6497`)	2023-09-21 13:25:48 +02:00
Jeremi Piotrowski	e99a13d26c	gha: vfio: Run on Ubuntu 23.04 runner The vfio test requires nested-nested virtualization: L0 Azure host -> L1 Ubuntu VM -> L2 Fedora VM -> L3 Kata This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out v6.2. Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com> (cherry picked from commit `936e8091a7`)	2023-09-21 13:25:35 +02:00
Jeremi Piotrowski	394d146b89	local-build: Remove GID before creating group docker install now creates a group with gid 999 which happens to match what we need to get docker-in-docker to work. Remove the group first as we don't need it. Fixes: #7726 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com> (cherry picked from commit `3b881fbc0e`)	2023-09-21 13:25:18 +02:00
Gabriela Cervantes	7421737229	metrics: Add TensorFlow ResNet50 fp32 Dockerfile This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `959ca49447`)	2023-09-21 13:25:10 +02:00
Gabriela Cervantes	9acbf2faf7	metrics: Add TensorFlow ResNet50 FP32 benchmark This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics. Fixes #7735 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `4b7d72c4a8`)	2023-09-21 13:25:03 +02:00
Fabiano Fidêncio	4f2c9372c3	kata-deploy: Avoid failing on content removal We can simply use `rm -f` all over the place and avoid the container returning any error. Fixes: #7733 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `5cba38c175`)	2023-09-21 13:24:56 +02:00
Gabriela Cervantes	6ea1d3bffd	metrics: Add disk link to README This PR adds disk link to README documentation for kata metrics. Fixes #7721 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `8afd158cef`)	2023-09-21 13:24:35 +02:00
Gabriela Cervantes	ad2036927f	metrics: Fix FIO path This PR fixes the FIO path for the FIO files. Fixes #7711 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `eee2ee6eeb`)	2023-09-21 13:24:06 +02:00
Gabriela Cervantes	abcb225ce3	metrics: Use function from metrics common in pytorch script This PR uses a common function into the pytorch script. Fixes #7709 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `39bc3488f5`)	2023-09-21 13:23:58 +02:00
Dan Mihai	508f1bba15	gha: capture additional kata-deploy output 10 lines can be insufficient for diagnostics. Fixes: #7707 Signed-off-by: Dan Mihai <dmihai@microsoft.com> (cherry picked from commit `400eb88743`)	2023-09-21 13:23:48 +02:00
David Esparza	d46c300608	metrics: Enable kata runtime in K8s for FIO test. This PR configures the corresponding kata runtime in K8s based on the tested hypervisor. This PR also enables FIO metrics test in the kata metrics-ci. Fixes: #7665 Signed-off-by: David Esparza <david.esparza.borquez@intel.com> (cherry picked from commit `fb571f8be9`)	2023-09-21 13:23:36 +02:00
Gabriela Cervantes	3d3882a06a	metrics: Update tensorflow name in gha run script This PR updates the tensorflow name in the gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `85c02828e1`)	2023-09-21 13:23:17 +02:00
Gabriela Cervantes	7d0a3dbf24	metrics: Fix check results for tensorflow benchmark This PR fixes the check results for tensorflow benchmark now that we change the name of the test. Fixes #7684 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `e8a5119343`)	2023-09-21 13:23:09 +02:00
Fabiano Fidêncio	3e2a383b7d	gha: kata-deploy: Do the runtime class cleanup as part of the cleanup Instead of doing this as part of the test itself, let's ensure it's done before running the tests and during the tests cleanup. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `2d896ad12f`)	2023-09-21 13:23:02 +02:00
Fabiano Fidêncio	2c5db14a1a	gha: kata-deploy: Add the first kata-deploy test This test, at least for now, only checks whether the runtimeclasses have been properly created. This is just a migration from a test we had as part of the k8s suite. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `4ffc2c86f3`)	2023-09-21 13:22:56 +02:00
Gabriela Cervantes	0b4fb826de	metrics: Remove unused variable in tensorflow mobilenet script This PR removes unused variable in tensorflow mobilenet script. Fixes #7679 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `8616c050ae`)	2023-09-21 13:22:47 +02:00
Fabiano Fidêncio	b38624e2b3	tests: common: Ensure test_type is used as part of the cluster's name By doing this we can make sure there won't be any clash on the cluster name created for either the k8s or the kata-deploy tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `285e616b5e`)	2023-09-21 13:22:40 +02:00
Fabiano Fidêncio	cdfcd9aba8	tests: common: Don't fail if yq is not part of the cache This may happen on external runners. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `790bd3548d`)	2023-09-21 13:22:33 +02:00
Fabiano Fidêncio	74edbaac96	gha: kata-deploy: Add run-kata-deploy-tests.sh This will have the same function as run-k8s-tests.sh has, but for kata-deploy. Right now it doesn't have any tests, and the command to actually run the tests is commented out, but right now this is just a placeholder that will be populated sooner than later. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `ce6adecd0a`)	2023-09-21 13:22:27 +02:00
Fabiano Fidêncio	d7130f48b0	gha: k8s: Stop running kata-deploy tests as part of the k8s suite In a follow-up series, we'll add a whole suite for the kata-deploy tests. With this in mind, let's already get rid of this one and avoid more kata-deploy tests to land here. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `cfc29c11a3`)	2023-09-21 13:22:21 +02:00
Aurélien Bombo	810507e8a3	tests: k8s: Call ensure_yq() in setup.sh It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing the yq error so let's try this instead. Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341 Signed-off-by: Aurélien Bombo <abombo@microsoft.com> (cherry picked from commit `f4dd152863`)	2023-09-21 13:22:10 +02:00
Aurélien Bombo	915bace795	kata-deploy: Properly create default runtime class The default `kata` runtime class would get created with the `kata` handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong hypervisor and broke CI. Fixes: #7663 Signed-off-by: Aurélien Bombo <abombo@microsoft.com> (cherry picked from commit `339569b69c`)	2023-09-21 13:22:00 +02:00
Gabriela Cervantes	870d8004a0	metrics: Fix MobileNet help me description This PR fixes MobileNet help me description in the tensorflow script. Fixes #7661 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `2a491e9b1f`)	2023-09-21 13:21:54 +02:00
Fabiano Fidêncio	145450544d	gha: ci: Start running kata-deploy tests Let's add the tests as part of the ci.yaml, so they can be triggered as part of each PR. For this PR those tests won't be triggered, courtesy of the `pull_request_target` event we rely on. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `d19a75e80c`)	2023-09-21 13:21:46 +02:00
Gabriela Cervantes	bd29413721	docs: Fix TensorFlow word across the document This PR fixes the TensorFlow word across the document to have uniformity across all the document. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `bade6a5c3b`)	2023-09-21 13:21:28 +02:00
Gabriela Cervantes	a845e94139	docs: Add Tensorflow Resnet50 documentation This PR adds the Tensorflow Resnet50 documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `1a1b207760`)	2023-09-21 13:21:21 +02:00
Gabriela Cervantes	6e5a5b8249	metrics: Add Dockerfile for ResNet50 int8 This PR adds the dockerfile for ResNet50 int8 benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `24baededc0`)	2023-09-21 13:21:13 +02:00
Gabriela Cervantes	5d85cac1d6	metrics: Add Tensorflow ResNet50 int8 benchmark This PR adds the Tensorflow ResNet50 int8 script for kata metrics. Fixes #7652 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `6d971ba8df`)	2023-09-21 13:21:07 +02:00
Fabiano Fidêncio	7474e50ae2	gha: cri-containerd: Enable tests As the cri-containerd tests have been fully migrated to GHA, let's make sure we get them running. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `b3592ab25c`)	2023-09-21 13:19:36 +02:00
Fabiano Fidêncio	20be3d93d5	gha: cri-containerd: Add timeout to the crictl calls on testContainerStop As part of the runners, we're hitting a timeout that I cannot reproduce, at all, when allocating the same instance and running the tests manually. The default timeout to connect to the server is 2s when using `crictl`. Let's increase this to 20s. It's fairly important to mention that in the first tests I used a timeout of 10s, and that helped but we still hit issues every now and then. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `84dd02e0f9`)	2023-09-21 13:19:28 +02:00
Fabiano Fidêncio	10058f718a	gha: cri-containerd: Show pod before deleting it It'll help us to debug failures with the pod stop / pod delete. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `b29782984a`)	2023-09-21 13:19:22 +02:00
Fabiano Fidêncio	585d5fba03	gha: cri-containerd: Print kata logs in case of error We need this to fully understand what are the issues we're facing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `ae0930824a`)	2023-09-21 13:19:17 +02:00
Fabiano Fidêncio	2fea5a5f8b	gha: cri-containerd: Group containerd logs This improves readability in case of failures by a lot. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `6c8b2ffa60`)	2023-09-21 13:19:11 +02:00
Fabiano Fidêncio	3c7597f4ba	gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account Short commit log says it all. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `9e898701f5`)	2023-09-21 13:19:04 +02:00
Gabriela Cervantes	738d808cac	metrics: Rename tensorflow scripts This PR renames the tensorflow scripts to include the data format that is being used as we will have multiple tests with different data and model formats for tensorflow so this will help us to distinguish them. Fixes #7645 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `18a7fd8e4e`)	2023-09-21 13:18:52 +02:00
Fabiano Fidêncio	4bb8fcc0c0	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build atop of that in a coming up series. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `e55fa93db9`)	2023-09-21 13:18:42 +02:00
Fabiano Fidêncio	f5e14ef283	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build atop of that in a coming up series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `d9ee17aaec`)	2023-09-21 13:18:35 +02:00
Fabiano Fidêncio	e812c437fe	tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder Right now this file does nothing, as it's not even called by any GHA. However, it'll be populated later on as part of a different series, where we'll have kata-deploy specific tests running here. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `831e73ff91`)	2023-09-21 13:18:21 +02:00
Fabiano Fidêncio	c19cebfa80	tests: Add gha-run-k8s-common.sh Let's split a good portion of `tests/integration/kubernetes/gha-run.sh` out, and put them in a place where they can be used by the soon-to-come kata-deploy specific tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `af1b46bbf2`)	2023-09-21 13:18:07 +02:00
David Esparza	4e8c512346	metrics: fix the loop used to stop kata components #7629 This PR fixed the loop that stops the kata-shim and the hypervisors used in metrics checks. Fixes: #7628 Signed-off-by: David Esparza <david.esparza.borquez@intel.com> (cherry picked from commit `767434d50a`)	2023-09-21 13:17:32 +02:00
Gabriela Cervantes	47f32c4983	metrics: Add cassandra statefulset yaml This PR adds cassandra statefulset yaml for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `5d0f0d43c7`)	2023-09-21 13:17:26 +02:00
Gabriela Cervantes	d5a14449fc	metrics: Add cassandra service yaml This PR adds the cassandra service yaml for the benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `c1dcc1396f`)	2023-09-21 13:17:20 +02:00
Gabriela Cervantes	1292b51092	metrics: Add block loop pvc yaml for cassandra This PR adds block loop pvc yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `2297a0d1c5`)	2023-09-21 13:17:13 +02:00
Gabriela Cervantes	105a556a30	metrics: Add block loop pv yaml for cassandra test This PR adds the block loop pv yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `e3d511946f`)	2023-09-21 13:17:04 +02:00
Gabriela Cervantes	1b126eb4ce	metrics: Add block loop pvc for cassandra test This PR adds the block loop pvc for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `9890271594`)	2023-09-21 13:16:59 +02:00
Gabriela Cervantes	671ad98451	metrics: Add Cassandra Kubernetes benchmark for kata metrics This PR adds Cassandra Kubernetes benchmark for kata metrics tests. Fixes #7625 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `349b89969a`)	2023-09-21 13:16:53 +02:00
Fabiano Fidêncio	058b304455	gha: static-checks: Move to the Azure instances The GHA runners are not exactly powerful, which makes the static-checks take way too long (almost an hour). Let's give a try and move those to the same size of Azure instances used as part of our CI, and probably have this time reduced. Fixes: #7446 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `c52d090522`)	2023-09-21 13:16:47 +02:00
Gabriela Cervantes	b600659df2	metrics: Add check containers are running in tensorflow mobilenet This PR adds check containers are running in tensorflow mobilenet that is being defined in common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `fdcd52ff78`)	2023-09-21 13:16:33 +02:00
Gabriela Cervantes	1b30aa818e	metrics: Add check containers are up in tensorflow script This PR adds the check containers are up function from common in tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> (cherry picked from commit `36337ee146`)	2023-09-21 13:16:26 +02:00

11296 Commits