Commit Graph

430 Commits

Author SHA1 Message Date
David Esparza
d2c130ea69 Merge pull request #7710 from GabyCT/topic/fixpytorch1
metrics: Use function from metrics common in pytorch script
2023-08-21 15:31:24 -06:00
Gabriela Cervantes
eee2ee6eeb metrics: Fix FIO path
This PR fixes the FIO path for the FIO files.

Fixes #7711

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 21:06:04 +00:00
David Esparza
9347051592 Merge pull request #7666 from dborquez/metrics_improve_fio_test
metrics: Enable kata runtime in K8s for FIO test.
2023-08-21 13:51:57 -06:00
Gabriela Cervantes
39bc3488f5 metrics: Use function from metrics common in pytorch script
This PR uses a common function into the pytorch script.

Fixes #7709

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 16:12:35 +00:00
Dan Mihai
400eb88743 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-21 15:58:57 +00:00
GabyCT
700759232f Merge pull request #7690 from GabyCT/topic/fixpytorch
metrics: Fix README for pytorch
2023-08-21 09:50:14 -06:00
Jiang Liu
6e038e66e4 Merge pull request #7680 from GabyCT/topic/removetime
metrics: Remove unused variable in tensorflow mobilenet script
2023-08-21 23:39:07 +08:00
Gabriela Cervantes
c8b43f8b3e metrics: Fix README for pytorch
This PR fixes the pytorch reference in the README file.

Fixes #7689

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-18 20:14:49 +00:00
Fabiano Fidêncio
7e66d1f6b5 Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests
gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
2023-08-18 07:47:26 +02:00
David Esparza
fb571f8be9 metrics: Enable kata runtime in K8s for FIO test.
This PR configures the corresponding kata runtime in K8s
based on the tested hypervisor.

This PR also enables FIO metrics test in the kata metrics-ci.

Fixes: #7665

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-17 17:11:27 -06:00
Gabriela Cervantes
85c02828e1 metrics: Update tensorflow name in gha run script
This PR update tensorflow name in gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 20:17:48 +00:00
Gabriela Cervantes
e8a5119343 metrics: Fix check results for tensorflow benchmark
This PR fixes the check results for tensorflow benchmark now
that we change the name of the test.

Fixes #7684

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 19:52:45 +00:00
Fabiano Fidêncio
2d896ad12f gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
Instead of doing this as part of the test itself, let's ensure it's done
before running the tests and during the tests cleanup.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
Fabiano Fidêncio
4ffc2c86f3 gha: kata-deploy: Add the first kata-deploy test
This test, at least for now, only checks whether the runtimeclasses
have been properly created.

This is just a migration from a test we had as part of the k8s suite.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
GabyCT
4ba684e6e4 Merge pull request #7653 from GabyCT/topic/tensorflowfp32
metrics: Add Tensorflow ResNet50 int8 benchmark
2023-08-17 10:44:25 -06:00
Gabriela Cervantes
8616c050ae metrics: Remove unused variable in tensorflow mobilenet script
This PR removes unused variable in tensorflow mobilenet script.

Fixes #7679

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 16:04:18 +00:00
Fabiano Fidêncio
285e616b5e tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:16 +02:00
Fabiano Fidêncio
790bd3548d tests: commob: Don't fail if yq is not part of the cache
This may happen on external runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:14 +02:00
Fabiano Fidêncio
ce6adecd0a gha: kata-deploy: Add run-kata-deploy-tests.sh
This will have the same function as run-k8s-tests.sh has, but for
kata-deploy.

Right now it doesn't have any tests, and the command to actually run the
tests is commented out, but right now this is just a placeholder that
will be populated sooner than later.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:49:03 +02:00
Fabiano Fidêncio
cfc29c11a3 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and avoid
more kata-deploy tests to land here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:48:54 +02:00
Fabiano Fidêncio
e470a650e0 Merge pull request #7654 from sprt/ci-fixes
kata-deploy: Properly create default runtime class
2023-08-17 09:43:34 +02:00
Aurélien Bombo
f4dd152863 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 14:13:56 -07:00
Aurélien Bombo
339569b69c kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 11:04:44 -07:00
Gabriela Cervantes
2a491e9b1f metrics: Fix MobileNet help me description
This PR fixes MobileNet help me description in the
tensorflow script.

Fixes #7661

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-16 15:25:39 +00:00
Gabriela Cervantes
bade6a5c3b docs: Fix TensorFlow word across the document
This PR fixes the TensorFlow word across the document to have uniformity
across all the document.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 20:13:05 +00:00
Fabiano Fidêncio
0bc48eab60 Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests
gha: cri-containerd: Enable tests
2023-08-15 21:18:28 +02:00
Gabriela Cervantes
1a1b207760 docs: Add Tensorflow Resnet50 documentation
This PR adds the Tensorflow Resnet50 documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:46:44 +00:00
Gabriela Cervantes
24baededc0 metrics: Add Dockerfile for ResNet50 int8
This PR adds the dockerfile for ResNet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:38:26 +00:00
Gabriela Cervantes
6d971ba8df metrics: Add Tensorflow ResNet50 int8 benchmark
This PR adds the Tensorflow ResNet50 int8 script for kata metrics.

Fixes #7652

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:30:22 +00:00
GabyCT
0bbabeaaf8 Merge pull request #7644 from GabyCT/topic/renametensorflow
metrics: Rename tensorflow scripts
2023-08-15 09:23:24 -06:00
Fabiano Fidêncio
46d25d908d Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests
gha: tests: Add kata-deploy functional tests -- Part 1
2023-08-15 15:23:48 +02:00
Fabiano Fidêncio
b3592ab25c gha: cri-containerd: Enable tests
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:32:42 +02:00
Fabiano Fidêncio
84dd02e0f9 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.

The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.

It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
b29782984a gha: cri-containerd: Show pod before deleting it
It'll help us to debug failures with the pod stop / pod delete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
ae0930824a gha: cri-containerd: Print kata logs in case of error
We need this to fully understand what are the issues we're facing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
6c8b2ffa60 gha: cri-containerd: Group containerd logs
This improves readability in case of failures by a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
9e898701f5 gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
Short commit log says it all.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Gabriela Cervantes
18a7fd8e4e metrics: Rename tensorflow scripts
This PR renames the tensorflow scripts to include the data format
that is being used as we will have multiple tests with different
data and model formats for tensorflow so this will help us to
distinguish them.

Fixes #7645

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-14 20:40:35 +00:00
GabyCT
a740c80251 Merge pull request #7626 from GabyCT/topic/cassandrak
metrics: Add Cassandra Kubernetes benchmark for kata metrics
2023-08-14 14:22:52 -06:00
GabyCT
4e5e39e8b3 Merge pull request #7618 from GabyCT/topic/addfunctionscommon
metrics: Add common functions to the common script
2023-08-14 14:22:30 -06:00
Fabiano Fidêncio
831e73ff91 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
Right now this file does nothing, as it's not even called by any GHA.
However, it'll be populated later on as part of a different series,
where we'll have kata-deploy specific tests running here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:46:10 +02:00
Fabiano Fidêncio
af1b46bbf2 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kuberentes/gha-run.sh`
out, and put them in a place where they can be used to the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:45:58 +02:00
David Esparza
767434d50a metrics: fix the loop used to stop kata components #7629
This PR fixed the loop that stops the kata-shim and the
hypervisors used in metrics checks.

Fixes: #7628

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-11 12:32:41 -06:00
Gabriela Cervantes
5d0f0d43c7 metrics: Add cassandra statefulset yaml
This PR adds cassandra statefulset yaml for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:39 +00:00
Gabriela Cervantes
c1dcc1396f metrics: Add cassandra service yaml
This PR adds the cassandra service yaml for the benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:36 +00:00
Gabriela Cervantes
2297a0d1c5 metrics: Add block loop pvc yaml for cassandra
This PR adds block loop pvc yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:33 +00:00
Gabriela Cervantes
e3d511946f metrics: Add block loop pv yaml for cassandra test
This PR adds the block loop pv yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:29 +00:00
Gabriela Cervantes
9890271594 metrics: Add block loop pvc for cassandra test
This PR adds the block loop pvc for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:19 +00:00
Gabriela Cervantes
349b89969a metrics: Add Cassandra Kubernetes benchmark for kata metrics
This PR adds Cassandra Kubernetes benchmark for kata metrics tests.

Fixes #7625

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:21:48 +00:00
Gabriela Cervantes
fdcd52ff78 metrics: Add check containers are running in tensorflow mobilenet
This PR adds check containers are running in tensorflow mobilenet
that is being defined in common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:17:20 +00:00