Commit Graph

1402 Commits

Author SHA1 Message Date
David Esparza
3a419ba3b1
metrics: common: Add function to update kata config.
Add an extra function that updates kata config
to use the max num. of vcpus available and
to use the available memory in the system.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-04-09 09:05:51 -06:00
Beraldo Leal
959e56525c docs: adding an initial CI documentation
This is actually a first attempt to document our CI, and all this
content was based on the document created by Fabiano Fidencio (kudos to
him). We are just moving the content and discussion from Google Docs to
here.

I used the "poetic license" to add some notes on what I believe our CI
will look like in the future.

Fixes #9006

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-04-09 09:21:47 -04:00
Saul Paredes
51498ba99a genpolicy: toggle containerd pull in tests
- Add v1 image test case
- Install protobuf-compiler in build check
- Reset containerd config to default in kubernetes test if we are testing genpolicy
- Update docker_credential crate
- Add test that uses default pull method
- Use GENPOLICY_PULL_METHOD in test

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-08 19:28:29 -07:00
Dan Mihai
f60c9eaec3
Merge pull request #9398 from microsoft/danmihai1/policy-test-cleanup
tests: k8s: improve the Agent Policy tests
2024-04-08 15:37:07 -07:00
Gabriela Cervantes
fb4c359cc2 tests: Improve the kbs_k8s_delete function
This PR improves the kbs_k8s_delete function to verify that the
resources were properly deleted for baremetal environments.

Fixes #9379

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-08 18:03:07 +00:00
stevenhorsman
a284a20a14 tests: Filter CoCo tests on ppc64le/arm
- At the moment we aren't supporting ppc64le or
aarch64 for
CoCo, so filter out these tests from running

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
Gabriela Cervantes
6d85025e59 test/k8s: Add basic attestation test
- Add basic test case to check that a ruuning
pod can use the api-server-rest (and attestation-agent
and confidential-data-hub indirectly) to get a resource
from a remote KBS

Fixes #9057

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Co-authored-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-08 11:38:53 +01:00
GabyCT
9d2c5b180e
Merge pull request #9419 from GabyCT/topic/fxlatency
metrics: Improve latency test cleanup
2024-04-05 16:31:00 -06:00
Wainer Moschetta
aae7048d4f
Merge pull request #9273 from ldoktor/kcli-coco-kbs
tests: Support for kbs setup on kcli
2024-04-05 18:55:58 -03:00
Dan Mihai
6f9f8ae285
Merge pull request #9413 from microsoft/saulparedes/ensure_unique_rg_in_gha
gha: ensure unique resource group name
2024-04-04 17:13:09 -07:00
GabyCT
80d926c357
Merge pull request #9411 from microsoft/danmihai1/k8s-job
tests: k8s-job: wait for job successful create
2024-04-04 15:14:56 -06:00
Gabriela Cervantes
8e5d401be0 metrics: Improve latency test cleanup
This PR improves the latency test cleanup in order to avoid random
failures of leaving the pods.

Fixes #9418

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-04 20:43:53 +00:00
Saul Paredes
f20caac1c0 gha: ensure unique resource group name
There's an rg name duplication situation that got introduced by #9385
where 2 different test runs might have same rg name.

Add back uniqueness by including the first letter of GENPOLICY_PULL_METHOD to
cluster name.

Fixes: #9412

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-04 13:13:32 -07:00
Dan Mihai
3e72b3f360 tests: k8s-job: wait for job successful create
Don't just verify SuccessfulCreate - wait for it if needed.

Fixes: #9138

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 22:11:15 +00:00
Gabriela Cervantes
73f27e28d1 gha: Define GH_PR_NUMBER variable in gha run k8s common script
This PR defines the GH_PR_NUMBER variable in gha run k8s common
script to avoid failures like unbound variable when running
locally the scripts just like the GHA CI.

Fixes #9408

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-03 18:25:00 +00:00
Dan Mihai
f800bd86f6 tests: k8s-sandbox-vcpus-allocation.bats policy
Use the "allow all" policy for k8s-sandbox-vcpus-allocation.bats,
instead of relying on the Kata Guest image to use the same policy
as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:33 +00:00
Dan Mihai
4211d93b87 tests: k8s-nginx-connectivity.bats policy
Use the "allow all" policy for k8s-nginx-connectivity.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:26 +00:00
Dan Mihai
5dcf64ef34 tests: k8s-volume.bats allow all policy
Use the "allow all" policy for k8s-volume.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:18 +00:00
Dan Mihai
04085d8442 tests: k8s-sysctls.bats allow all policy
Use the "allow all" policy for k8s-sysctls.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:10 +00:00
Dan Mihai
839993f245 tests: k8s-security-context.bats allow all policy
Use the "allow all" policy for k8s-security-context.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:01:03 +00:00
Dan Mihai
02a050b47e tests: k8s-seccomp.bats allow all policy
Use the "allow all" policy for k8s-seccomp.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:56 +00:00
Dan Mihai
543e40b80c tests: k8s-projected-volume.bats allow all policy
Use the "allow all" policy for k8s-projected-volume.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:47 +00:00
Dan Mihai
3f94e2ee1b tests: k8s-pod-quota.bats allow all policy
Use the "allow all" policy for k8s-pod-quota.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:37 +00:00
Dan Mihai
ba23758a42 tests: k8s-optional-empty-secret.bats policy
Use the "allow all" policy for k8s-optional-empty-secret.bats,
instead of relying on the Kata Guest image to use the same policy as
its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:30 +00:00
Dan Mihai
e4ff6b1d91 tests: k8s-measured-rootfs.bats allow all policy
Use the "allow all" policy for k8s-measured-rootfs.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:23 +00:00
Dan Mihai
2821326a7e tests: k8s-liveness-probes.bats allow all policy
Use the "allow all" policy for k8s-liveness-probes.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:15 +00:00
Dan Mihai
9af3e4cc4a tests: k8s-inotify.bats allow all policy
Use the "allow all" policy for k8s-inotify.bats, instead of relying
on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:08 +00:00
Dan Mihai
bd45e948cc tests: k8s-guest-pull-image.bats policy
Use the "allow all" policy for k8s-guest-pull-image.bats, instead of
relying on the Kata Guest image to use the same policy as its default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 03:00:00 +00:00
Dan Mihai
be3797ef7c tests: k8s-footloose.bats allow all policy
Use the "allow all" policy for k8s-footloose.bats, instead of
relying on the Kata Guest image to use the same policy as its
default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:50 +00:00
Dan Mihai
18f5e55667 tests: k8s-empty-dirs.bats allow all policy
Use the "allow all" policy for k8s-empty-dirs.bats, instead of
relying on the Kata Guest image to use the same policy as its
default.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:44 +00:00
Dan Mihai
ef22bd8a2b tests: k8s: replace run_policy_specific_tests
Check from:

- k8s-exec-rejected.bats
- k8s-policy-set-keys.bats

if policy testing is enabled or not, to reduce the complexity of
run_kubernetes_tests.sh. After these changes, there are no policy
specific commands left in run_kubernetes_tests.sh.

add_allow_all_policy_to_yaml() is moving out of run_kubernetes_tests.sh
too, but it not used yet. It will be used in future commits.

Fixes: #9395

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-03 02:59:28 +00:00
Dan Mihai
39805822fc tests: k8s: reduce policy testing complexity
Don't add the "allow all" policy to all the test YAML files anymore.

After this change, the k8s tests assume that all the Kata CI Guest
rootfs image files either:

- Don't support Agent Policy at all, or
- Include an "allow all" default policy.

This relience/assumption will be addressed in a future commit.

Fixes: #9395

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-02 16:18:31 +00:00
Alex Lyn
aa9cd232cd
Merge pull request #9358 from GabyCT/topic/nerdrandom
gha: Update journal log names for nerdctl artifacts
2024-04-01 09:50:16 +08:00
Steve Horsman
53fa1fd82d
Merge pull request #9349 from fidencio/topic/ci-k8s-update-cpuid
k8s: confidential: Update cpuid to its latest release
2024-03-27 16:57:36 +00:00
ChengyuZhu6
c50d3ebacc tests:k8s: Add a test to pull large images in the guest
Add a test to pull large images in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 21:58:44 +08:00
Gabriela Cervantes
a997e282be gha: Update journal log names for nerdctl artifacts
This PR updates the journal log name for nerdctl artifacts to make
sure that we have different names in case we add a parallel GHA job.

Fixes #9357

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-26 20:03:54 +00:00
Lukáš Doktor
a671b3fc6e
tests: Use full svc address to check kbs service
the service might not listen on the default port, use the full service
address to ensure we are talking to the right resource.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-26 16:59:02 +01:00
Lukáš Doktor
6b0eaca4d4
tests: Add support for nodeport ingress for the kbs setup
this can be used on kcli or other systems where cluster nodes are
accessible from all places where the tests are running.

Fixes: #9272

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-03-26 16:59:00 +01:00
Fabiano Fidêncio
cfe75f9422
k8s: confidential: Update cpuid to its latest release
Since v2.2.6 it can detect TDX guests on Azure, so let's bump it even if
Azure peer-pods are not currently used as part of our CI.

Fixes: #9348

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-26 10:21:12 +01:00
Gabriela Cervantes
d54cdd3f0c scripts: Fix unbound variables in k8s setup script
This PR fixes the unbound variables error when trying to run
the setup script locally in order to avoid errors.

Fixes #9328

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-21 19:10:16 +00:00
James O. D. Hunt
1e684f5848
Merge pull request #9259 from jodh-intel/tests-add-static-checks-announce
tests: static checker: Add announce message
2024-03-21 13:59:36 +00:00
GabyCT
03f3d3491d
Merge pull request #9265 from GabyCT/topic/fixnydusclean
gha: Fix nydus namespace clean up
2024-03-20 16:17:38 -06:00
Gabriela Cervantes
a855ecf21b gha: Update journal log names for kubernetes artifacts
This PR updates the journal log names for kubernetes artifacts
in order to make sure that we have different names when we are
running parallel GHA jobs.

Fixes #9308

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 15:44:20 +00:00
Gabriela Cervantes
4fb8f8705f gha: Fix nydus namespace clean up
This PR terminates the nydus namespace to avoid the error of
that the flag needs an argument.

Fixes #9264

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-20 15:41:39 +00:00
James O. D. Hunt
577abd014b tests: static checker: Add announce message
Added an announcement message to the `static-checks.sh` script. It runs
platform / architecture specific code so it would be useful to display
details of the platform the checker is running on to help with
debugging.

Fixes: #9258.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-20 13:41:26 +00:00
James O. D. Hunt
4af4a8ad2b tests: static checker: Create setup function
Move some of the common code into a setup function.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-20 11:58:28 +00:00
Fabiano Fidêncio
19eb45a27d
Merge pull request #8484 from ChengyuZhu6/guest-pull
Merge basic guest pull image code to main
2024-03-19 23:15:39 +01:00
Fabiano Fidêncio
8911d3565f
gha: tests: Filter out confidential tests for aarch64 / ppc64le
Those two architectures are not TEE capable, thus we can just skip
running those tests there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-19 18:06:01 +01:00
Fabiano Fidêncio
d14e9802b6
gha: k8s: Set {https,no}_proxy correctly for TDX
This is needed as the TDX machine is hosted inside Intel and relies on
proxies in order to connect to the external world.  Not having those set
causes issues when pulling the image inside the guest.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
e23737a103
gha: refactor code with yq for better clarity
refactor code with yq for better clarity:

Before:
```bash
yq write -i "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml"
'spec.template.spec.containers[0].env[7].value' "${KATA_HYPERVISOR}:${SNAPSHOTTER}"
```

After:
```bash
yq write -i \
  "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" \
  'spec.template.spec.containers[0].env[7].value' \
  "${KATA_HYPERVISOR}:${SNAPSHOTTER}"
```

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
2c0bc8855b
tests: Make sure to install yq before using it
Make sure to install yq before using it to modify YAML files.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
c52b356482
tests: add guest pull image test
Add a test case of pulling image inside the guest for confidential
containers.

Signed-off-by: Da Li Liu <liudali@cn.ibm.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
e8c4effc07
tests: refactor the check for hypervisor to a function
Extract two reusable functions for confidential tests in confidential_common.sh

- check_hypervisor_for_confidential_tests: verifies if the input hypervisor supports confidential tests.
- confidential_setup: performs the common setup for confidential tests.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
cd6a84cfc5
kata-deploy: Setting up snapshotters per runtime handler
Setting up snapshotters per runtime handler as the commit
(6cc6ca5a7f) described.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
Hyounggyu Choi
b381743dd5 CI|k8s: Handle skipped tests with a comment for filter_out_per_arch
This commit updates `filter_k8s_test.sh` to handle skipped tests that
include comments. In addition to the existing parameter expansion,
the following expansions have been added:

- Removal of a comment
- Stripping of trailing spaces

Fixes: #9304

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-19 17:21:25 +01:00
Chelsea Mafrica
2c50d3c393
Merge pull request #9278 from wainersm/github_env_fix
tests: fix nounset error with $GITHUB_ENV
2024-03-14 16:39:13 -07:00
Dan Mihai
6094f1e31d
Merge pull request #9250 from microsoft/danmihai1/k8s-pid-ns2
tests: k8s: k8s-pid-ns.bats auto-generated policy
2024-03-14 10:10:24 -07:00
Wainer dos Santos Moschetta
981f95df55 tests: fix nounset error with $GITHUB_ENV
Initialize $GITHUB_ENV to avoid nounset error when running the scripts locally
out of Github Actions.

Fixed commit 9ba5e3d2a8

Fixes #9217
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-13 14:57:38 -03:00
Dan Mihai
ac27caf1b4
Merge pull request #9248 from microsoft/danmihai1/k8s-exec.bats2
tests: k8s: k8s-exec.bats auto-generated policy
2024-03-13 09:21:12 -07:00
Dan Mihai
e8c2a45ce0 tests: k8s: k8s-pid-ns.bats auto-generated policy
Auto-generate policy for k8s-pid-ns.bats.

Fixes: #9249

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-12 22:34:46 +00:00
Alex Lyn
a116b252c8
Merge pull request #9236 from jodh-intel/docs-improve-install-details
docs: install: Simplify instructions
2024-03-12 14:29:38 +08:00
Dan Mihai
88b7a44271 tests: k8s: k8s-exec.bats auto-generated policy
Auto-generate policy for k8s-exec.bats.

Fixes: #9247

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-08 17:48:20 +00:00
GabyCT
35d8f82232
Merge pull request #9242 from GabyCT/topic/enabldebugnerd
gha: Add collect artifacts step to nerdctl workflow
2024-03-07 13:34:40 -06:00
Wainer Moschetta
91998af173
Merge pull request #9114 from wainersm/ci_kbs_cli
CI: add KBS utilities for attestation tests
2024-03-07 16:34:03 -03:00
Wainer dos Santos Moschetta
8ea9ac515e tests/k8s: update kbs repository
Recently confidential-containers/kbs repository was renamed to
confidential-containers/trustee. Github will automatically resolve the
old URL but we better adjust it in code.

The trustee repository will be cloned to $COCO_TRUSTEE_DIR. Adjusted
file paths and pushd/popd's to use $COCO_KBS_DIR
($COCO_TRUSTEE_DIR/kbs).

On versions.yaml changed from `coco-kbs` to `coco-trustee` as in the
future we might need other trustee components, so keeping it generic.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
c669567cd3 tests/k8s: add utils to set KBS policies
Added the kbs_set_resources_policy() function to set the KBS policy. Also the
kbs_set_allow_all_resources() and kbs_set_deny_all_resources to set the
"allow all" and "deny all" policy, respectively.

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
6f0d38094d tests/k8s: add utils to set KBS resources
Added utility functions to manage resources in KBS:
- kbs_set_resource(), where the resource data is passed via argument
- kbs_set_resource_from_file(), where the resource data is found in a
  file

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
2a374422c5 tests/k8s: add function to install kbs-client
Added kbs_install_cli function to build and install the kbs-client
executable if not present into the system.

Removed the stub from gha-run.sh; now the install kbs-client in the
.github/workflows/run-kata-deploy-tests-on-aks.yaml will effectively
install the executable.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta
e410aef4fa tests/k8s: add utils to get kbs service address
Added functions to return the service host, port or full-qualified
HTTP address, respectively, kbs_k8s_svc_host(), kbs_k8s_svc_port(),
and kbs_k8s_svc_http_addr().

Fixes #9056
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-07 11:20:36 +00:00
Dan Mihai
c08b696d9e tests: k8s: k8s-shared-volume generated policy
Auto-generate policy for k8s-shared-volume.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
b24758fad8 tests: k8s: k8s-scale-nginx auto-generated policy
Auto-generate policy for k8s-scale-nginx.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
af9ac8d194 tests: k8s: k8s-replication auto-generated policy
Auto-generate policy for k8s-replication.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
56689c6800 tests: k8s: k8s-qos-pods auto-generated policy
Auto-generate policy for k8s-qos-pods.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Dan Mihai
0179f53469 tests: k8s: k8s-parallel auto-generated policy
Auto-generate policy for k8s-parallel.bats.

Fixes: #9096

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-03-07 05:57:30 +00:00
Gabriela Cervantes
94fdcda7f7 scripts: Add collect artifacts function in nerdctl gha run script
This PR adds the collect artifacts function in nerdctl gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-06 19:48:12 +00:00
James O. D. Hunt
b1d4cbd9d1 utils: spell-checker: Fix grep warning
Fix the `grep(1)` warning caused by the unnecessary escaping of the
hash/sharp symbol.

Fixes: #9235.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-06 13:21:15 +00:00
James O. D. Hunt
a67ed2f1c2 tests: Add k3s artifacts
The k3s distribution of k8s uses an embedded version of containerd and
configures it to log to a file, not the journal. Hence, although we
collect the journal as a test artifact, we also need to collect the
actual log files for containerd.

Also collect the k3s containerd config files to help with debugging.

Fixes: #9104.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-05 17:54:20 +00:00
Wainer dos Santos Moschetta
9ba5e3d2a8 gha: export start_time to collect artifacts properly
The jobs running on garm will collect journal information. The data gathered
is based on the time the tests started running. The $start_time is
exported on run_tests() and used in collect_artifacts(). It happens that
run_tests() and collect_artifacts() are called on different steps of the
workflow and the environment variables aren't preserved between them,
i.e, $start_time exported on the first step is not available on the
subsequents.

To solve that issue, let's save $start_time in the file pointed out by
$GITHUB_ENV that Github actions uses to export variables. In case $GITHUB_ENV is
empty then probably it is running locally outside of Github, so it won't
save the start time value.

Fixes #9217
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-05 12:15:20 -03:00
Greg Kurz
0320198889
Merge pull request #9206 from lifupan/main
CI: fix the issue of ci failure on crio
2024-03-05 09:52:13 +01:00
Wainer Moschetta
38088a934b
Merge pull request #9184 from wainersm/fix_kata_deploy_bats
tests/kata-deploy: fix checker for kata-deploy running
2024-03-04 20:50:37 -03:00
GabyCT
77d048da4d
Merge pull request #9065 from wainersm/ci_install_kbs
CI: Install KBS on k8s for attestation tests
2024-03-04 16:59:01 -06:00
Gabriela Cervantes
5d50262422 docs: Add general tests documentation in main README
This PR adds the general tests documentation in main README of the
kata containers repository.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 21:53:01 +00:00
Gabriela Cervantes
d5fa2bebd5 docs: Add general README for tests section
This PR adds general README documentation for the tests section
in the kata containers repository.

Fixes #9209

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 21:50:37 +00:00
GabyCT
4dea9019ab
Merge pull request #9126 from GabyCT/topic/addartifactsk
gha: Storing artifacts for logs of k8s tests garm
2024-03-04 15:41:54 -06:00
Gabriela Cervantes
fc5e040d96 scripts: Apply general fixes to variables in gha-run script
This PR applies general fixes to variables in gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-04 18:54:15 +00:00
Fupan Li
07e0cf1855 CI: fix the issue of ci failure on crio
PR #8760 tentatively tried to have the shim to run in its own mount
namespace for the sake of improving isolation between the sandbox and
the host. Thus crio storage drivers shouldn't create a PRIVATE
bind mount on their home directory. Otherwise, the container's rootfs
mount wouldn't be propagated to kata runtime's mount namespace, and
kata runtime couldn't access the container's rootfs files.

So, when kata cooperated with crio, crio should set
skip_mount_home=true for its storage overlay.

Fixes: #9028

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-03-03 20:53:36 +08:00
Wainer dos Santos Moschetta
2c24977cb1 tests/k8s: allow to overwrite the cluster name
_print_cluster_name() create a string based information like the
pull request number and commit SHA. However, when you are developing the
scripts you might want to use an arbitrary name, so it was introduced
the $AKS_NAME variable that once exported it will overwrite the
generated name.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
5e4b7bbd04 tests/k8s: expose KBS service externally
Until this point the deployed KBS service is only reachable from within
the cluster. This introduces a generic mechanism to apply an Ingress
configuration to expose the service externally.

The first implemened ingress is for AKS. In case the HTTP application
routing isn't enabled in the cluster (this is required for ingress), an
add-on is applied.

It was added the get_cluster_specific_dns_zone() and
enable_cluster_http_application_routing() helper functions
to gha-run-k8s-common.sh.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
e1e0b94975 tests/k8s: introduce the CoCo kbs library
Introduce the tests/integration/kubernetes/confidential_kbs.sh library
that contains functions to manage the KBS on CI. Initially implemented
the kbs_k8s_deploy() and kbs_k8s_delete() functions to, respectively,
deploy and delete KBS on Kubernetes. Also hooked those functions in the
tests/integration/kubernetes/gha-run.sh script to follow the convention
of running commands from Github Workflows:

$ .tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
$ .tests/integration/kubernetes/gha-run.sh delete-coco-kbs

Fixes #9058
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:39:26 -03:00
Wainer dos Santos Moschetta
6a28c94d99 tests/k8s: add a kustomize installer
Kustomize has been used on some of our internal components (e.g.
kata-deploy) to manage k8s deployments. On CI it has been used
the `sed` tool to edit kustomization.yaml files, but `kustomize` is
more suitable for that purpose. So in order to use that tool on CI
scripts in the future, this commit introduces the `install_kustomize()`
function that is going to download and install the binary in
/usr/local/bin in case it's found on $PATH.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:39:26 -03:00
GabyCT
4a0cfc4e3f
Merge pull request #9199 from GabyCT/topic/enablecri
gha: Enable cri-containerd tests for cloud hypervisor runtime-rs
2024-03-01 12:23:16 -06:00
Gabriela Cervantes
7299dbdb43 gha: Store journalctl logs
This PR stores the journalctl logs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-01 15:17:20 +00:00
Gabriela Cervantes
342d3a320d gha: Add collect artifacts function in gha-run script
This PR adds the collect artifacts function in gha-run script for
the kubernetes tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-03-01 15:17:20 +00:00
Greg Kurz
dc6bda19bf
Merge pull request #9179 from gkurz/fix-k8s-sandbox-vcpus-allocation-check
tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29
2024-03-01 15:55:07 +01:00
Wainer dos Santos Moschetta
24c163e6e1 tests/kata-deploy: fix checker for kata-deploy running
Currently, the checking for kata-deploy is running assume that the
daemonset scheduled at least one pod, however it might not had and the
kubectl wait command fails due to "error: no matching resources found".

On CI I've observed that fail intermittently. I suspect the service
account kata-deploy-sa take a while to show up then no kata-deploy is
scheduled in meanwhile.

Changed the checker logic to use waitForProcess() to keep testing if it is
already running, or hit the timeout (still 10m).

Fixes #9183
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-29 22:26:27 -03:00
Gabriela Cervantes
beb592b309 gha: Enable cri-containerd tests for cloud hypervisor runtime-rs
This PR enables the cri-containerd tests for cloud hypervisor runtime-rs.

Fixes #9198

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-29 20:18:16 +00:00
Gabriela Cervantes
0f595cf15b gha: General variable fixes to gha-run script
This PR adds general variable fixes to gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-29 18:15:27 +00:00
Greg Kurz
f3442cdef9 tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29
Kubernetes v1.29 introduced a new `PodReadyToStartContainers` condition
that gets inserted at index 0 in the conditions array. This means that
the expected `PodCompleted` reason can now be either at index 0 with
kubernetes v1.28 and older or at index 1 starting with kubernetes v1.29.
This is fragile at best since the `kubectl wait` doesn't allow to combine
multiple checks. Also, checking the reason is dubious as it doesn't really
tell if the pods have actually completed or not.

Check the pod phase to be `Succeeded` instead, this guarantees that :

> All containers in the Pod have terminated in success, and will not
> be restarted.

Fixes #9178

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-29 17:00:31 +01:00
Greg Kurz
f89120662d tests: k8s: Wait for all pods concurrently
A single invocation of `kubectl wait` can handle all pods.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-29 17:00:31 +01:00
Gabriela Cervantes
3cd319fcc2 scripts: General fixes to the gha-run script
This PR implements general fixes to the gha-run script for the
cri-containerd tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 19:32:51 +00:00
Gabriela Cervantes
5a498948c8 scripts: Skip cri-containerd in gha-run script
This PR skips the cri-containerd in gha-run script for cloud hypervisor
runtime-rs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-28 19:30:38 +00:00
Wainer Moschetta
c4b8270073
Merge pull request #9009 from wainersm/runk_bats
tests/runk: fix the "run ps command" flaky test
2024-02-28 15:58:36 -03:00
Wainer Moschetta
129ce84705
Merge pull request #9116 from wainersm/ci_install_kbs-workflow
gha: k8s: prepare AKS workflow to install the CoCo KBS
2024-02-28 14:43:41 -03:00
Wainer dos Santos Moschetta
b44e0c4e7c gha: k8s: prepare AKS workflow to install the CoCo KBS
Changed the "run k8s tests on AKS" workflows to get the CoCo KBS
installed so that we can run attestation tests.

The plan is to run attestation tests only on a subset of non-TEE jobs
initially, so this commit restricts to install KBS only on kata-qemu
configuration. Actually at this point it is added only stubs commands
to tests/integration/kubernetes/gha-run.sh that should be implemented
in a future commit.

Fixes #9058
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-27 13:51:15 -03:00
Wainer dos Santos Moschetta
0f8c36d990 tests/nydus: refactor the teardown()
This refactor the teardown() of tests/integration/nydus/nydus_tests.sh:

* Moved boilerplate code that kill process to a loop;
* Doesn't leave teardown() if a process failed to get killed, so that
  other clean up routines are ran;
* Check if the pid exist then attempt to kill the process, so avoid this
  misleading message:
```
Usage:
 kill [options] <pid> [...]

Options:
 <pid> [...]            send signal to every <pid> listed
 -<signal>, -s, --signal <signal>
                        specify the <signal> to be sent
 -q, --queue <value>    integer value to be sent with the signal
 -l, --list=[<signal>]  list all signal names, or convert one to a name
 -L, --table            list all signal names in a nice table

 -h, --help     display this help and exit
 -V, --version  output version information and exit

For more details see kill(1).
```

Fixes #8948
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:21:43 -03:00
Wainer dos Santos Moschetta
0f0ce9a81b tests/runk: replace the busybox image
It's recommended to avoid images from docker.io to avoid errors related
with hitting the pull limits that happens mostly on bare-metal machines.

So this replaced the docker.io's busybox with
quay.io/prometheus/busybox.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:11:05 -03:00
Wainer dos Santos Moschetta
bba8b5b2b4 tests/runk: fix flaky test
The "run ps command" test has failed once in a while because it doesn't
wait the sh command to start within the container, consequently `ps`
won't report the amount of lines expected.

Fixes #8975
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta
8a606eb94d tests/runk: convert to bats
Migrated runk tests from pure shell script to bats to be consistent with
other test suites.

The install_dependencies() will install the bats tool locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-26 11:09:23 -03:00
GabyCT
4f3c83cd12
Merge pull request #9115 from GabyCT/topic/adddief
scripts: Add an enhanced die function
2024-02-23 12:03:02 -06:00
Steve Horsman
dfa6e932bb
Merge pull request #9122 from ChengyuZhu6/snapshotter-clean
gha: try to cleanup nydus snapshotter before deploying it
2024-02-22 13:30:04 +00:00
ChengyuZhu6
8ab3894dc5 gha: try to cleanup nydus snapshotter before deploying it
CI failed to deploy nydus snapshotter because it was not cleaned up last time.
So we can try to cleanup nydus snapshotter before deploying it.

Fixes: #9121

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-22 18:51:14 +08:00
Dan Mihai
b3c3f992ab tests: k8s: common clean-up on teardown
teardown() gets executed after each test case, so there is no need to
clean-up before teardown.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9c164698d3 tests: k8s: k8s-optional-empty-configmap policy
Auto-generate policy for k8s-optional-empty-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
74a52c6d25 tests: k8s: k8s-oom.bats auto-generated policy
Auto-generate policy for k8s-oom.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
26a77d67f4 tests: k8s: k8s-number-cpus auto-generated policy
Auto-generate policy for k8s-number-cpus.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
9cbdce15fd tests: k8s: k8s-memory.bats auto-generated policy
Auto-generate policy for k8s-memory.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
40209cc0b7 tests: k8s: k8s-limit-range auto-generated policy
Auto-generate policy for k8s-limit-range.bats.

Also, fix teardown() namespace.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
df3c0318c6 tests: k8s: add set_namespace_to_policy_settings
Add set_namespace_to_policy_settings() for changing the pod namespace
in genpolicy settings.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:08 +00:00
Dan Mihai
6e14ce93c9 tests: k8s-kill-all-process-in-container policy
Auto-generate policy for k8s-kill-all-process-in-container.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
fad7ba0aea tests: k8s: k8s-job.bats auto-generated policy
Auto-generate policy for 8s-job.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
41c2bcbdc5 tests: k8s: k8s-file-volume auto-generated policy
Auto-generate policy for k8s-file-volume.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
81e641814f tests: k8s: k8s-cpu-ns auto-generated policy
Auto-generate policy for k8s-cpu-ns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
bc6d3fc238 tests: k8s: k8s-env.bats auto-generated policy
Auto-generate policy for k8s-env.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
0a4fc071ac tests: k8s: k8s-custom-dns auto-generated policy
Auto-generate policy for k8s-custom-dns.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
f693f49e92 tests: k8s: k8s-credentials-secrets policy
Auto-generate policy for k8s-credentials-secrets.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
d3d27bbb5b tests: k8s: k8s-configmap auto-generated policy
Auto-generate policy for k8s-configmap.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Dan Mihai
b318535536 tests: k8s: auto-generate k8s-caps.bats policy
Auto-generated policy for k8s-caps.bats.

Fixes: #9072

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-21 18:08:07 +00:00
Chelsea Mafrica
241a56989a
Merge pull request #9090 from GabyCT/topic/pulldockerimage
gha: docker: Pull docker image as part of the dependencies
2024-02-20 14:28:53 -08:00
GabyCT
64c09fe6c5
Merge pull request #9088 from GabyCT/topic/fixnydus
gha: nydus: Fix indentation in gha run script
2024-02-20 14:09:54 -06:00
Gabriela Cervantes
ff8a6fa9ef scripts: Add error script
This PR adds the error script to display the error message with
much more information to help debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:30:03 +00:00
Gabriela Cervantes
43a46d5a6b scripts: Add an enhanced die function
This PR adds an enhanced die function in order to dump more information
in a yaml format that will help with the debugging.

Fixes #9105

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-20 18:27:44 +00:00
Fabiano Fidêncio
3468ac3b6e
ci: k8s: Fix checks used to skip confidential tests
This has been introduced by 53bc4a432b,
where the condition was changed.

The correct condition is:
* If the list of supported tees does not contain the kata hypervisor
  and the list of supported non tees does not contain the kata
  hypervisor.

The error is that we were checking whether kata-hypervisor would contain
the list of supported tees, and that would almost always be false
(unless in the case where the list had an one and only one element).

Fixes: #9055 -- part II

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-18 10:10:45 +01:00
Hyounggyu Choi
8b3f7f353d CI|k8s: Skip vcpu allocation test for s390x
A test `vcpu allocation k8s test` exhibits different behavior on s390x
For more details, please refer to issue #9093.
This commit is to make the test skipped until the issue is resolved on
the platform.

Fixes: #9093

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-15 12:26:35 +01:00
GabyCT
9cf343779f
Merge pull request #9062 from GabyCT/topic/nonteet
tests: Add ability to run non-TEE environments
2024-02-13 14:28:07 -06:00
Gabriela Cervantes
598c77409a gha: docker: Pull docker image as part of the dependencies
This PR pulls the docker image needed for the test as part of the dependencies
in order to avoid failures of timeouts mainly because the image was not
properly download it and it is unable to find it.

Fixes #9089

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 17:48:31 +00:00
Gabriela Cervantes
53bc4a432b tests: Add ability to run non-TEE environments
This PR adds the ability to run k8s confidential tests in a
non-TEE environment.

Fixes #9055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 17:27:55 +00:00
Gabriela Cervantes
54d1f34650 gha: nydus: Fix indentation in gha run script
This PR fixes the indentation in gha run script for nydus.

Fixes #9087

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-13 16:53:28 +00:00
Fabiano Fidêncio
3877a9f49a
ci: Clean up kata-deploy ds before starting the tests
This will ensure no leftovers are in the node, which has been cause the
TDX CI to fail every now and then.

Fixes: #9081

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-13 14:10:44 +01:00
GabyCT
00be9ae872
Merge pull request #9070 from microsoft/danmihai1/debug-containers
tests: k8s: avoid deleting unrelated pods
2024-02-12 15:24:15 -06:00
Greg Kurz
532567bfe9
Merge pull request #8936 from fidencio/topic/fix-cri-o-ci
tests: cri-o: Use packages from pkgs.k8s.io
2024-02-12 10:04:53 +01:00
Dan Mihai
a21ca9b7c9 tests: k8s: avoid deleting unrelated pods
Delete the debugger pod created during the test, rather than already
existing debugger pods.

Also, send the output of "kubectl delete" to stderr, just in case it's
useful for debugging.

Fixes: #9069

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-09 22:48:41 +00:00
Dan Mihai
a054462eb7
Merge pull request #9051 from microsoft/danmihai1/k8s-copy-file
tests: k8s: k8s-copy-file auto-generated policy
2024-02-09 12:30:49 -08:00
ChengyuZhu6
97fbf360cc
gha: Cleanup nydus snapshotter by the daemonset
Cleanup nydus snapshotter by the daemonset.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-09 14:47:13 +01:00
ChengyuZhu6
43b04fd0c0
gha: Deploy nydus snapshotter by the daemonset
We can use daemonset to deploy nydus snapshotter, which will decrease
one manual step both for Kata Containers and Confidential Containers CI.

Fixes: #8584

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-09 14:47:09 +01:00
Fabiano Fidêncio
344e0580ca
tests: cri-o: Use packages from pkgs.k8s.io
CRI-O has moved, for a long time, towards pkgs.k8s.io, see:
https://kubernetes.io/blog/2023/10/10/cri-o-community-package-infrastructure/

With this the OBS repo won't be used anymore.

Fixes: #8935

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-02-09 12:15:55 +01:00
Gabriela Cervantes
0b508f301b tests:k8s: make add_kernel_initrd_anotations function generic
This PR replaces the add_kernel_initrd_annotations_to_yaml function
more generic so later can be used for other components.

Fixes #9054

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-08 19:30:43 +00:00
Dan Mihai
f139c7dc60 tests: k8s: k8s-copy-file auto-generated policy
Auto-generate policy for k8s-copy-file.bats.

Fixes: #9050

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 13:26:05 +00:00
Dan Mihai
1179306afa tests: k8s: additional policy testing utilities
1. add_requests_to_policy_settings allows one or more ttrpc requests
   from the Host to the Guest. Example:

add_requests_to_policy_settings "${policy_settings_dir}" \
   "ReadStreamRequest" "WriteStreamRequest"

2. add_copy_from_host_to_policy_settings allows executing on the Guest
   the commands initiated behind the scenes by "kubectl cp" from the
   Host to the Guest. Example:

add_copy_from_host_to_policy_settings "${policy_settings_dir}"

3. add_copy_from_guest_to_policy_settings allows executing on the Guest
   the commands initiated behind the scenes by "kubectl cp" from the
   Guest to the Host. Example:

add_copy_from_guest_to_policy_settings "${policy_settings_dir}" \
   "/tmp/file.txt"

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-08 13:25:41 +00:00
Dan Mihai
2bb91c9d8f
Merge pull request #8922 from microsoft/danmihai1/k8s-attach-handlers
tests: k8s-attach-handlers auto-generated policy
2024-02-07 13:29:50 -08:00
Dan Mihai
6b5e57f7c7 tests: k8s: address PR review feedback
1. Rename install_kata_common to install_kata_core.

2. Add TODO for better way to install the Kata tools.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 18:51:56 +00:00
ChengyuZhu6
d0b8e6d8f3 nydus: Bump nydus snapshotter version to v0.13.7
Bump nydus snapshotter version to v0.13.7.
The new release name of nydus snapshotter is `nydus-snapshotter-v0.13.7-linux-amd64.tar.gz`,
which differs from the version used by kata (`nydus-snapshotter-v0.12.0-x86_64.tgz`).
Therefore, we need to update the script to obtain the correct nydus snapshotter name.

Fixes: #9044

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-07 22:17:05 +08:00
Dan Mihai
dd16bc393f tests: k8s: k8s-attach-handlers generated policy
Automatically generate the test policy for k8s-attach-handlers.bats,
if AUTO_GENERATE_POLICY is enabled.

Steps:

- Create a temporary directory for the current test and copy the
  common genpolicy settings into this new directory.

- Change genpolicy settings in the temp directory to allow the
  "kubectl exec" command that this test needs. (For CoCo, exec is
  blocked by the default policy settings)

- Auto-generate the policy for the test YAML file.

- Test as usual, using the YAML file.

- Clean-up the temporary settings described above.

Fixes: #8921

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:26:03 +00:00
Dan Mihai
0de407f8b7 tests: k8s: enable AUTO_GENERATE_POLICY
Enable AUTO_GENERATE_POLICY for one of the Kata CI K8s test platforms.
Additional platforms will be enabled after testing them.

When AUTO_GENERATE_POLICY is enabled, create genpolicy settings that
are common for all tests. Some of the tests will make temporary copies
of these common settings and customize them as needed.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:25:54 +00:00
Dan Mihai
05b2e4f606 tests: k8s: install genpolicy
Install the genpolicy app before starting test execution.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:25:42 +00:00
Dan Mihai
8aa8b70573 tests: k8s: add policy test utilities
Add script functions useful for auto-generating and testing policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:24:06 +00:00
Dan Mihai
24a17a2e1b tests: k8s: output the names of test files
Output the names of test files, for easier search through logs.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:54 +00:00
Dan Mihai
bf533de31a tests: k8s: add DEBUG support for test scripts
Make these scripts easier to debug.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:46 +00:00
Dan Mihai
1b4ef672ef tests: k8s: reduce namespace name duplication
1. Avoid repeating "kata-containers-k8s-tests".
2. Allow users to specify a different test namespace.
3. Introduce the TEST_CLUSTER_NAMESPACE variable, that will also be
   useful when auto-generating the Agent Policy for these tests.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:38 +00:00
Dan Mihai
8a5ba5fb34 tests: k8s: allow run_kubernetes_tests.sh exec
Allow everyone to directly execute run_kubernetes_tests.sh, for easier
local testing.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:30 +00:00
GabyCT
d74b6e143f
Merge pull request #8951 from GabyCT/topic/udf
metrics: Update packages for TensorFlow ResNet Int8 Dockerfile
2024-02-06 14:29:41 -06:00
GabyCT
6337f300a8
Merge pull request #8628 from GabyCT/topic/enablek8stclh
tests: k8s: Enable tests for cloud hypervisor runtime-rs without devicemapper
2024-02-06 14:28:35 -06:00
Gabriela Cervantes
cf049fc718 k8s: Skip k8s tests that are not working
This PR skips the k8s tests that are not working with cloud hypervisor
runtime-rs with its proper issue.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-06 16:52:02 +00:00
Wainer Moschetta
f1ca5d1563
Merge pull request #8953 from ChengyuZhu6/ci-guest-pull
gha: Enable nydus snapshotter in CoCo ci tests
2024-02-06 09:36:59 -03:00
Wainer dos Santos Moschetta
106e1af497 cri-containerd: fix loop in TestContainerMemoryUpdate()
The loop that generate test cases for virtio-mem enabled/disabled
doesn't return the integers '1' and '0' as expected. Instead it returns
the strings '{1,' and '0}'.

Fixes #9024
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-05 10:59:39 -03:00
ChengyuZhu6
a214bd8d13 gha: Enable nydus snapshotter in CoCo ci tests
This PR is a split of #8585.
make the changes on the Github workflows, and the skeleton to deploy_snapshotter()
and cleanup_snapshotter() in tests/integration/kubernetes/gha-run.sh in this commit.

After initially merging this patch to trigger CI jobs for CoCo, which will begin executing
the dummy functions deploy_snapshotter() and cleanup_snapshotter(), the implementation details for these functions
remain in #8585. Our subsequent step involves transferring this logic to the PR #8484, enabling the PR to undergo CI testing prior to its merge.

Fixes: #8997

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-05 18:51:59 +08:00
Xuewei Niu
fa01a86334
Merge pull request #9007 from wainersm/aks_delete_rg
gha: delete azure RG only if it exists
2024-02-04 16:34:17 +08:00
Wainer dos Santos Moschetta
a04b215bcc gha: delete azure RG only if it exists
delete_cluster() has tried to delete the az resources group regardless
if it exists. In some cases the result of that operation is ignored,
i.e., fail to resource group not found, but the log messages get a
little dirty. Let's delete the RG only if it exists then.

Fixes #8989
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-02 16:57:20 -03:00
Gabriela Cervantes
eb5b7d3bf8 tests: k8s: Enable tests for cloud hypervisor runtime-rs
This PR enable the k8s tests for cloud hypervisor runtime-rs.

Fixes #8627

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-02-02 17:58:58 +00:00
Amulya Meka
e4252a3fe2
Merge pull request #8957 from Amulyam24/add-k8s-test-ppc64le
gha: add kubernetes tests workflow for ppc64le
2024-02-02 10:22:00 +05:30
Aurélien Bombo
0ace31f041 ci: aks: switch from eastus2 to eastus region
This addresses an internal AKS issue that intermittently prevents
clusters from getting created. The fix has been rolled out to eastus but
not yet eastus2, so we unblock the CI by switching. No downsides in
general.

This supersedes #8990.

Fixes: #8989

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-02-01 19:22:42 +00:00
Amulyam24
f8585db8d9 gha: add kubernetes tests workflow for ppc64le
This PR adds workflow for running kubernetes test suite on ppc64le.

It uses scripts to create and delete the cluster using kubeadm as none of the current cluster creation tools are supported on Power.

Fixes: #7950

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2024-02-01 12:23:11 +05:30
Gabriela Cervantes
78b517ccc8 tests: Re-arranged nerdctl tests
This PR re-arranged the nerdctl tests to avoid random failures.
In this PR first will run the tests with RunC and then with the kata hypervisor.
This PR tries to avoid the random failures that is happening with cloud-hypervisor
and clh.

Fixes #8963

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-30 16:07:12 +00:00
Gabriela Cervantes
31813cf8d8 metrics: Update packages for TensorFlow ResNet Int8 Dockerfile
This PR updates the required packages for the TensorFlow ResNet50
Int8 Dockerfile.

Fixes #8950

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-29 16:11:09 +00:00
Fabiano Fidêncio
448c0aaecb
gha: azure: Set the correct subscription to the account
Due to the changes done in the CI, we need to set the correct
subscription to be used with the account from now on, otherwise we'd end
up using CoCo subscription.

Fixes: #8946

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-29 15:00:38 +01:00
Hyounggyu Choi
ab462a4b89 tests: Add IBM SE to the basic confidential test
The existing confidential basic test titled `Test unencrypted
confidential container launch success and verify that we are
running in a secure enclave` has been updated to incorporate
IBM Secure Execution (`qemu-se`).
Previously, a secure image was absent from kata-deploy, hindering
the inclusion of IBM SE in the test.
Thanks to the #6755 update, it is now possible to test the TEE.

This modification extends the existing test by introducing
`qemu-se`. The specific changes are outlined below:

- Add an additional test `cc-se-e2e-tests` to s390x nightly
- Expansion of `REMOTE_COMMAND_PER_HYPERVISOR` for `qemu-se`
- Temporary exclusion of two test cases currently incompatible with IBM SE
(`cpu-ns` is a common issue across all TEEs, while `inotify`
will be addressed in a subsequent pull request).

Fixes: #8913

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-01-26 06:04:39 +01:00
GabyCT
36fc2fd83f
Merge pull request #8876 from GabyCT/topic/dockerrestfp
metrics: Update packages needed for ResNet50 FP32 Dockerfile
2024-01-25 13:51:16 -06:00
Dan Mihai
66c012d052 tests: k8s: bats --show-output-of-passing-tests
Add --show-output-of-passing-tests to the k8s integration tests. The
output of a passing test can be helpful when investigating a failure
of the same test.

Fixes: #8885

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-24 03:04:28 +00:00
Gabriela Cervantes
eb7e123de8 metrics: Update packages needed for ResNet50 FP32 Dockerfile
This PR updates the packages necessary to build the ResNet50 fp32
Dockerfile to run properly the benchmark.

Fixes #8875

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-22 16:15:36 +00:00
Dan Mihai
ea9c659d36 gha: get ready to install genpolicy
The changes to install and test genpolicy must come later, after CI
picks up these gha changes.

Fixes: #8856

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-19 23:37:49 +00:00
David Esparza
e11c520ffa
Merge pull request #8808 from kata-containers/memory_usage_test_skip_virtiofs_when_req
tests: Ignore virtiofs contribution to memory usage when it is disabled.
2024-01-16 16:50:06 -06:00
David Esparza
4b772d2480 tests: Ignore virtiofs contribution to memory usage when it is disabled.
This PR removes the references to virtiofs from memory average
calculation when the container uses a shared file system other than
virtiofs.

Fixes: #8807

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-01-15 08:07:06 -08:00
Gabriela Cervantes
dff800a8ff metrics: Remove iperf3 server protocol
This PR removes the iperf3 server protocol as this server definition is
also used for the UDP iperf3 benchmarks to avoid duplication of the
same yaml files.

Fixes #8829

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-15 15:44:24 +00:00
Dan Mihai
b7c31e3b98 tests: cbl-mariner: disable k8s-oom.bats
Disable k8s-oom.bats on cbl-mariner until it passes more often.

Fixes: #8824

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-14 17:39:25 +00:00
GabyCT
a7114a35a8
Merge pull request #8792 from GabyCT/topic/updatenhwc
metrics: Use a specific python version to run tensorflow benchmark
2024-01-12 11:24:54 -06:00
Alex.Lyn
ffcd95b6b4
Merge pull request #8737 from Apokleos/test-ci-dgb-cri-containerd
ci: enable test dragonball stability and cri-containerd
2024-01-12 11:56:22 +08:00
Gabriela Cervantes
12a41f89b1 metrics: Use a specific python version to run tensorflow benchmark
This PR uses a specific python version to run tensorflow benchmark
as it needs python 3.8 to run correctly and avoid failures.

Fixes #8791

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-11 22:15:31 +00:00
GabyCT
69be050ff9
Merge pull request #8657 from WenyuanLau/8656/Fix_StratoVirt_on_gha_metrics
gha: Fix the failure of gha metrics for StratoVirt
2024-01-11 11:41:25 -06:00
alex.lyn
b97efc3139 CI: enable test container memory update for dragonball
Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 19:07:33 +08:00
alex.lyn
6c85e95c34 CI: bugfix for dragonball when CI running with cri-containerd
Containerd runtime options with wrong setting cause it failed.
Correct it as below:
...
 [plugins.cri.containerd.runtimes.${runtime}.options]
   ConfigPath= "${KATA_CONFIG_PATH}"
...

Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 17:35:33 +08:00
alex.lyn
cd59d31a15 CI: make CI work for dragonball to test stability and cri-containerd
It needs to remove the skip setting, and make it work for dragonball.

Fixes: #8746

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2024-01-11 17:35:13 +08:00
Dan Mihai
de61b4d4e2
Merge pull request #8772 from microsoft/danmihai1/wait-for-delete
tests: list the current k8s pods
2024-01-09 13:45:55 -08:00
Gabriela Cervantes
24fab19f6f tests: Remove check images function from stressng test
This PR removes the check images function from stressng test as now
it will part of the install dependencies function from gha-run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-08 17:40:39 +00:00
Gabriela Cervantes
aceba94d95 tests: Add check images as part of install dependencies
To avoid random failures while trying to build and install the stressng image,
this PR moves that step as part of the install dependencies in order to move
the stability tests and avoid timeouts.

Fixes #8787

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-08 17:38:14 +00:00
Dan Mihai
90c782f928 tests: list the current k8s pods
Log the list of the current pods between tests because these pods
might be related to cluster nodes occasionally running out of memory.

Fixes: #8769

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-01-05 16:41:43 +00:00
Gabriela Cervantes
4ad1971a0a tests: Add hypervisor component to kill kata components function
This PR adds the qemu-experimental hypervisor in the function to
kill kata components.

Fixes #8775

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-04 17:07:12 +00:00
GabyCT
f056ffe5ef
Merge pull request #8759 from fadecoder/update_docs_for_stratoVirt_VMM
docs: Update docs for new StratoVirt VMM introduction
2024-01-04 10:39:37 -06:00
Zhigang Wang
44b5b88f4c docs: Update docs for new StratoVirt VMM introduction
As the StratoVirt VMM has been added, we can update the docs
and make some intoduction to StratoVirt, thus users can know more
about the hypervisor choices.

Fixes: #8645

Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com>
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2024-01-04 14:26:48 +08:00
Gabriela Cervantes
4bc67dba08 metrics: Improve iperf3 cleanup
This PR improves the iperf3 cleanup to ensure all the components are
being deleted properly to avoid the random failures of leaving
the iperf3 clients on the kata metrics CI.

Fixes #8765

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-01-03 17:14:38 +00:00
Xuewei Niu
206ed6d77d tests: Load vhost modules explicitly while Kata installing
The default network backend of runtime-rs with Dragonball is vhost-net
after #8609 merged. The tests might be failed if vhost modules are not
loaded.

Fixes: #8717

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-12-22 11:07:37 +08:00
Dan Mihai
d916da15dd
Merge pull request #8688 from microsoft/danmihai1/k8s-confidential
tests: retry connection to pod SSH server
2023-12-20 15:01:26 -08:00
stevenhorsman
9e718b4e23 gha: kata-deploy: Add containerd status check
After kata-deploy has installed, check that the worker nodes
are still in Ready state and don't have a containerd://Unknown
container runtime versions, identicating that container isn't working
to ensure that we didn't corrupt the containerd config during kata-deploy's edits

Fixes: #8678
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-12-20 09:10:43 +00:00
Dan Mihai
8aa390279e tests: retry connection to pod SSH server
To become more resilient against these kinds of errors:

deployment.apps/confidential-unencrypted created
pod/confidential-unencrypted-c5fdd6964-rrb6q condition met
ssh: connect to host 10.42.0.109 port 22: Connection refused

Fixes: #8687

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-20 02:48:05 +00:00
GabyCT
5504176e9a
Merge pull request #8699 from GabyCT/topic/fixconfidentialscript
tests: k8s: Fix indentation in confidential common script
2023-12-19 16:01:28 -06:00
Dan Mihai
551a50cd72 tests: additional run-runk logging
Add logging to run-runk, for debugging possible failures.

Fixes: #8696

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-12-19 14:08:01 +00:00
Gabriela Cervantes
1469a5efca tests: k8s: Fix indentation in confidential common script
This PR fixes the indentation of the confidential common
script for kubernetes tests.

Fixes #8698

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-18 20:25:06 +00:00
Liu Wenyuan
61fe20cf9a gha: Fix some of gha metrics failure for StratoVirt
Update the Speed & Density metric tests baseline for StratoVirt
and re-enable them, and skip other metric tests temporarily.

Fixes: #8656

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-12-15 17:45:01 +08:00
GabyCT
4a49dd73db
Merge pull request #8676 from GabyCT/topic/fixins
tests: k8s: Fix indentation in setup script
2023-12-14 13:57:47 -06:00
GabyCT
7a606a19c4
Merge pull request #8659 from GabyCT/topic/improvecleanuplatency
metrics: Improve latency network cleanup
2023-12-14 13:57:28 -06:00
GabyCT
0831529279
Merge pull request #8644 from GabyCT/topic/updadockerresint
metrics: Update TensorFlow ResNet50 Int8 Dockerfile
2023-12-14 13:56:41 -06:00
Gabriela Cervantes
c92b14da97 tests: k8s: Fix indentation in setup script
This PR fixes the indentation of the kubernetes setup script.

Fixes #8675

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-14 16:26:22 +00:00
Gabriela Cervantes
8151117f73 metrics: Improve latency network cleanup
This PR improves the latency network cleanup by removing the pods
even if the test fails.

Fixes #8658

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-13 17:56:01 +00:00
Chelsea Mafrica
63636b869c static-checks: Update copyright dates
Some copyright dates were not updated with the most recent changes to
code; update them.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 16:34:06 -08:00
Chelsea Mafrica
b11c772865 static-checks: Change dir for building tools
Change directory for running make due to local errors when building with
make -C.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 16:34:06 -08:00
Gabriela Cervantes
23f76653e5 metrics: Update command to run the tensorflow int8 benchmark
This PR updates the command to run the tensorflow resnet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-12 16:24:09 +00:00
Gabriela Cervantes
8fd5ef7fb7 metrics: Update TensorFlow ResNet50 Int8 Dockerfile
This PR updates the TensorFlow ResNet50 Int8 Dockerfile to use the
proper python version for kata metrics.

Fixes #8643

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-12 16:20:56 +00:00
Chelsea Mafrica
a9d360728e static-checks: Fix directory for github labels
Fix paths for yqdir (where the install_yq.sh script currently is) so
that static checks can run without error.

Fixes #8595

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-12-12 02:16:35 -08:00
GabyCT
ee74fca92c
Merge pull request #8617 from GabyCT/topic/enabletestnerdctl
tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs
2023-12-11 14:09:58 -06:00
David Esparza
584a26dab0
Merge pull request #8542 from dborquez/metrics_fix_deployment_cleaning
metrics: cleans k8s iperf deployment when the test finishes.
2023-12-11 13:14:39 -06:00
GabyCT
43410e1918
Merge pull request #8560 from GabyCT/topic/enablek8srs
gha: k8s: Add cloud-hypervisor (runtime-rs) support
2023-12-11 09:42:49 -06:00
James O. D. Hunt
2a35541af7
Merge pull request #8592 from jodh-intel/static-checks-try-multiple-user-agents
CI: static-checks: Try multiple user agents
2023-12-11 11:52:29 +00:00
Hyounggyu Choi
40f0c8fbb7 GHA: Use --client=true for k3s kubectl version
This is to fix a broken usage for `k3s kubectl version` by switching
an option `--short` to `--client=true`.

Fixes: #8621

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-11 08:26:39 +01:00
Gabriela Cervantes
1662a3e859 common: Add cloud hypervisor in enabling hypervisor function
This PR adds the cloud hypervisor in the enabling hypervisor function.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-08 21:32:00 +00:00
James O. D. Hunt
5d085a3042 CI: static-checks: Try multiple user agents
Make the URL checker cycle through a list of user agent values until we
hit one the remote server is happy with.

This is required since, unfortunately, we really, really want to check
these URLs, but some sites block clients based on their `User-Agent`
(UA) request header value. And of course, each site is different and can
change its behaviour at any time.

Our strategy therefore is to try various UA's until we find one the
server accepts:

- No explicit UA (use `curl`'s default)
- Explicitly no UA.
- A blank UA.
- Partial UA values for various CLI tools.
- Partial UA values for various console web browsers.
- Partial UA for Emacs's built-in browser.
- The existing UA which is used as a "last ditch" attempt where the UA implies multiple platforms and browser.

> **Notes:**
>
> - The "partial UA" values specify specify the UA "product" but not the
>   UA "product version": we specify `foo` and not `foo/1.2.3`). We do
>   this since most sites tested appear to not care about the version.
>   This is as expected given that the version is strictly optional (see `[*]`).
>
> - We now log all errors and display an error summary if none of the UAs
>   worked, in addition to the simple list of the URLs we believe to be
>   invalid. This should make future debugging simpler.

`[*]` - https://www.rfc-editor.org/rfc/rfc9110#section-10.1.5

Fixes: #8553.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 18:02:41 +00:00
James O. D. Hunt
613def0328 CI: static-checks: Move curl to a separate function
Split the call to `curl` in the URL checker out into a new
`run_url_check_cmd()` function to make `check_url()` slightly clearer.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
6d859f97ee CI: static-checks: Lint fixes
Declare and then define a couple of variables separately.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
efa8e6547c CI: static-checks: Check params have a value
Check that the `check_url()` parameters have a value.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
563ea020b0 CI: static-checks: Fold long line
Break up a long line as little to make it easier to read.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
James O. D. Hunt
3ad43df946 CI: static-checks: Improve markdown checker test
Only attempt to build the markdown checker if it doesn't already exist.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-12-08 17:11:20 +00:00
Gabriela Cervantes
f3eeab10ab tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs
This PR enables the nerdctl tests for cloud hypervisor runtime-rs.

Fixes #8616

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-08 16:12:36 +00:00
David Esparza
b2577000e7
metrics: Expose iperf3 pods over a k8s networks.
A prerequisite for measuring kata network bandwidth is
run Iperf3 tool at a the transport layer provided by a
k8s service for exposing a network where the clients
inside the cluster can use to contact Pods in the service.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-12-07 18:07:05 -06:00
David Esparza
a062ba166b
metrics: cleans k8s iperf deployment when the test finishes.
This PR fixes small issues like:
1. Cleaning up the k8s environment by removing the iperf test
implementation even when the test fails.
2. Checks if the workload returned a result before generating
an empty results json file as it was bein done.
3. Removes the redundancy of calls to functions that process
subtests and should compose the results json file only when
all results are ready and not before.
4. The tcp service manifest was added to the server deployment
which targets TCP port 5201.

Fixes: #8534

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-12-07 18:02:39 -06:00
GabyCT
0e0a7d9410
Merge pull request #8604 from GabyCT/topic/enablenerdctlrs
gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI
2023-12-07 14:35:26 -06:00
David Esparza
298be4aa1c
Merge pull request #8594 from GabyCT/topic/updatedockerfilet
metrics: Update TensorFlow ResNet FP32 dockerfile
2023-12-07 11:14:48 -06:00
Gabriela Cervantes
ce694b905b tests: Fix indentation of gha-run script
This PR fixes the indentation of gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:56:19 +00:00
Gabriela Cervantes
33b300431e tests: Enable but do not run k8s tests for cloud hypervisor
This PR enables but do not run k8s tests for cloud hypervisor
for runtime-rs.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:39:15 +00:00
Gabriela Cervantes
50a5fa9a65 tests: Enable but do not run the nerdctl tests for cloud hypervisor
This PR enables but do not run the nerdctl tests for cloud hypervisor
runtime-rs until we find out how stable they are.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-07 16:29:51 +00:00
Hyounggyu Choi
0d5a970e54 GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict
It is to remove a GITHUB_WORKSPACE directory for self-hosted runners
when a workflow fails due to the merge conflict. This will prevent
the subsequent workflows from getting stuck in the same situation.

Fixes: #8600

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-07 10:25:57 +01:00
Gabriela Cervantes
56dddab04f metrics: Update command to run tensorflow resnet fp32 benchmark
This PR updates the command needed to run the tensorflow benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-06 17:02:10 +00:00
Gabriela Cervantes
62fdebeeb5 metrics: Update TensorFlow ResNet FP32 dockerfile
This PR updates the python version for the TensorFlow ResNet FP32
dockerfile so the benchmark can run without issues.

Fixes #8593

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-12-06 16:53:21 +00:00
Fabiano Fidêncio
d149b9f9ca
Merge pull request #7231 from wainersm/measured_rootfs-improvements
Build for measured rootfs improvements
2023-12-05 22:20:33 +01:00
Fabiano Fidêncio
05ce52d746 devmapper: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Fixes: #8569

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
Fabiano Fidêncio
a8a156b1af stability: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
Fabiano Fidêncio
16ad721eda cri-containerd: dragonball: Enable, but do not run, the tests
This will make the life easier for dragonball developers to properly
enable the tests once the tests are ready.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-12-05 15:29:23 +01:00
GabyCT
1c00a9a6a9
Merge pull request #8524 from GabyCT/topic/addiperfinfo
docs: Update iperf3 network documentation
2023-12-04 14:03:30 -06:00
Fabiano Fidêncio
852021e416
Merge pull request #8483 from fidencio/topic/move-rust-config-files-to-subdir-based-on-jodh-approach
build/kata-deploy: Move rust runtime config files to runtime-rs directory -- based on #8445
2023-12-01 16:22:51 +01:00
Chelsea Mafrica
818b8f93b1
Merge pull request #8288 from cmaf/migrate-static-checks
Migrate static checks
2023-11-30 17:44:16 -08:00
GabyCT
2bd21f7831
Merge pull request #8531 from GabyCT/topic/fixiperfli
metrics: Fix iperf parallel bandwidth limit
2023-11-30 13:47:00 -06:00
Gabriela Cervantes
37633d3cc2 metrics: Fix iperf parallel bandwidth limit
This PR fixes the iperf parallel bandwidth limit for the kata
metrics CI.

Fixes #8530

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-29 19:59:45 +00:00
Dan Mihai
96deea52f2 tests: more k8s-exec-rejected debug output
Print more information useful for debugging. Also, use a separate YAML
file for this test, instead of reusing someone else's file.

Fixes: #8270

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-11-29 18:05:15 +00:00
Fabiano Fidêncio
8fd39d11c4 tests: Adapt enable_hypervisorto the runtime-rs config location change
As the configuration for the runtime-rs based drivers are now placed in
a different location than the golang ones, we should adapt this script
accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:35 +01:00
Fabiano Fidêncio
38183acbcb tests: Use kata-ctl instead of kata-runtime for runtime-rs
`kata-ctl` is the tool for runtime-rs, and it should be used instead of
`kata-runtime`.

`kata-ctl` requires sudo, and that's the reason it's also been added as
part of the calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:35 +01:00
Fabiano Fidêncio
a5a73a11cb tests: Replace kata-runtime kata-env by kata-runtime env
`kata-runtime env` is an alias for `kata-runtime kata-env, and calling
it with the `env` paramenter allows us to easily extend the scripts to
use `kata-ctl` instead of `kata-runtime` when dealing with runtime-rs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-29 14:51:31 +01:00
Chelsea Mafrica
05efb23261 tests: update go.mod and go.sum
Generate a go.sum file for tests.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-11-28 17:40:41 -08:00
Fabiano Fidêncio
30acb5a0c0 tests: nydus: Adapt the default config file for runtime-rs based drivers
As we've done some changes in the runtime-rs based drivers to install
their configuration into a different location, this should also be
reflected as part of this test.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-28 20:37:59 +01:00
Chelsea Mafrica
6d9cb9325d tests: update scripts for static checks migration
Updates to scripts for static-checks.sh functionality, including common
functions location, the move of several common functions to the existing
common.bash, adding hadolint and xurls to the versions file, and changes
to static checks for running in the main kata containers repo.

The changes to the vendor check include searching for existing go.mod
files but no other changes to expand the test.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
66f3944b52 tests: move github-labels to main repo
Move tool as part of static checks migration.

Fixes #8187

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
7f3c12f1dd tests: move spell check tool to main repo
Move tool as part of static checks migration.

Fixes #8187

Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Dan Middleton <dan.middleton@intel.com>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: Hui Zhu <teawater@antfin.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Jimmy Xu <xjmmyshcn@gmail.com>
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
8ad433d4ad tests: move markdown check tool to main repo
Move the tool as a dependency for static checks migration.

Fixes #8187

Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Julio Montes <julio.montes@intel.com>
2023-11-28 11:13:55 -08:00
Chelsea Mafrica
eaa6b1b274 tests: move static checks and dependencies from tests
Move static checks scripts and dependencies from tests to
kata-containers repo.

Fixes #8187

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Bin Liu <bin@hyper.sh>
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Dan Middleton <dan.middleton@intel.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Derek Lee <derlee@redhat.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Xu Wang <xu@hyper.sh>
Signed-off-by: Yang Bo <bo@hyper.sh>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-11-28 11:13:55 -08:00
Gabriela Cervantes
9166d0aabb docs: Update iperf3 network documentation
This PR updates the iperf3 network documentation to include
the parallel bandwidth.

Fixes #8523

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-28 15:59:38 +00:00
Wainer dos Santos Moschetta
48bdca4c49 tests/k8s: add k8s-measured-rootfs.bats
Implements the following test case:

  Scenario: Check incorrect hash fails
  **Given** I have a version of kata installed that has a kernel with the
  initramfs built and config with rootfs_verity.scheme=dm-verity
  rootfs_verity.hash=<incorrect hash of rootfs> set in the kernel_params
  **When** I try and create a container a basic pod
  **Then** The pod is doesn't run
  **And**  Ideally we'd get a helpful message to indicate why

Currently on CI only qemu-tdx is built with measured
rootfs support in the kernel, so the test is restriced to that
runtimeclass.

Fixes #7415
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:54 -03:00
Wainer dos Santos Moschetta
1eae657b91 tests/k8s: add set_node() to lib.sh
Use this new function to set the node where the pod should be scheduled
to.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
c6075c8627 tests/k8s: add setup common
Bring the setup_common() from CCv0 branch test's
integration/kubernetes/confidential/tests_common.sh. It should be used
to reduce boilerplates on the setup() of the tests.

Unlike the original code, this won't export the `test_start_time` variable
as it wouldn't be accurate to grab logs from the worker nodes due
date/time mismatch between the running tests machine and the worker
node. The function export the `node` variable which holds the name of
a random node which has kata installed. Apart from that, it exports the
`node_start_time` which capture the date/time when the test started,
relative to the `node`.

Tests that should inspect the logs can schedule pods/resources to the `node`
and use `node_start_time` as the value reference to grep the logs.

Fixes #7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
220a2d9a15 tests/k8s: add assert_logs_contain() to lib.sh
Bring the assert_logs_contain() from CCv0 branch tests'
integration/kubernetes/confidential/lib.sh.

Introduced the print_node_journal() which uses `kubectl debug` to print
the systemd's journal of a k8s's node.

Fixes #7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
9a9c7a5c6f tests/k8s: add set_metadata_annotation() to lib.sh
This new function allow to the annotations to metadata section in a yaml
configuration file.

Co-authored-by: Ryan Savino <ryan.savino@amd.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
36ea1b8ee7 tests/k8s: add new_pod_config() to lib.sh
Copied the new_pod_config() and pod-config.yaml.in from CCv0 branch
tests' integration/kubernetes/confidential/tests_common.sh and fixtures.
Unlike the original version, new_pod_config() now gets the runtimeclass
by parameter as the RUNTIMECLASS environment variable seems not broadly
used on main branch's CI.

The pod-config.yaml.in was changed as the diff shows below. In
particular the imagePullSecrets was removed to avoid it throwing a
warning on the pod's log.

```
--- a/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
+++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
@@ -5,12 +5,10 @@
 apiVersion: v1
 kind: Pod
 metadata:
-  name: busybox-cc
+  name: test-e2e
 spec:
   runtimeClassName: $RUNTIMECLASS
   containers:
-  - name: nginx
+  - name: test_container
     image: $IMAGE
-    imagePullPolicy: Always
-  imagePullSecrets:
-  - name: cococred
\ No newline at end of file
+    imagePullPolicy: Always
\ No newline at end of file
```

Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta
428daf9ebc tests/k8s: add utilities functions for the tests
The following functions were copied from CCv0's branch test's
integration/kubernetes/confidential/lib.sh. I did just smalls
refactorings (shortened their names and delinted shellcheck warnings):

- k8s_delete_all_pods_if_any_exists()
- k8s_wait_pod_be_ready()
- k8s_create_pod()
- assert_pod_fail()

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Jordan Jackson <jordan.jackson@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
2023-11-28 11:21:53 -03:00
Amulyam24
754aec02c3 gha: add cri-containerd workflow for ppc64le
This PR adds workflow to run containerd tests on Power as a part of CI migration.

Fixes: #8500

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2023-11-27 17:58:58 +05:30
Gabriela Cervantes
37916e7a58 metrics: Fix result finding
This PR fixes the result finding for the general throughput for
the tensorflow benchmark.

Fixes #8466

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-17 15:59:51 +00:00
Fabiano Fidêncio
f8322ffad2
Merge pull request #7796 from WenyuanLau/7794/StratoVirt_VMM_support
StratoVirt: add support for a lightweight VMM StratoVirt in Kata
2023-11-17 10:53:17 +01:00
Hyounggyu Choi
ffe1ea52cf tests|gha: add containerd and k8s tests for s390x
As part of the CI migration, this PR is to add workflows for containerd and k8s for s390x.

Fixes: #7930
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-11-16 18:14:26 +01:00
GabyCT
8586308dcd
Merge pull request #8453 from GabyCT/topic/udpreadme
metrics: Add iperf udp information to README
2023-11-16 10:38:56 -06:00
Liu Wenyuan
c77e990c3e tests: Enable tests for StratoVirt hypervisor
This commit enables StratoVirt hypervisor to be tested in kata GHA,
incluing k8s, metrics, cri-containerd, nydus and so on.

Meanwhile, adding some unit tests for StratoVirt to make sure it works.

Fixes: #7794

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
2023-11-16 20:47:26 +08:00
Gabriela Cervantes
9cc6908b09 stability: Update stressng to run on the gha
This PR updates the stressng test to run on the gha for kata CI.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 19:34:36 +00:00
Gabriela Cervantes
9d8eb298c3 metrics: Add iperf udp information to README
This PR adds the iperf udp information to the network README
for the kata metrics CI.

Fixes #8452

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 15:22:06 +00:00
Gabriela Cervantes
4b7854b668 stability: Add missing dependencies
This PR adds missing dependencies to run stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Gabriela Cervantes
79177bb9cb tests: Enable stressng scalability test
This PR enables the stressng scalability test for kata CI.

Fixes #8420

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-11-15 14:51:14 +00:00
Fabiano Fidêncio
fd9b6d6837
Merge pull request #7623 from fidencio/topic/runtime-improve-vcpu-allocation-on-host-side
runtime: Improve vCPU allocation for the VMMs
2023-11-14 14:10:54 +01:00
Fabiano Fidêncio
c858ea1460
Merge pull request #8174 from fidencio/topic/re-revert-8115
ci: Re-add tracing tests and move docker/nerdctl to the basic-ci-amd64.yaml file
2023-11-13 18:19:40 +01:00
David Esparza
98ec34b04c
Merge pull request #8338 from dborquez/improve_metrics_init_environment
metrics: Fix function that completely stops kata containers before running a test
2023-11-13 09:35:27 -06:00
Fabiano Fidêncio
ee17fe9d20 Revert "gha: ci: Revert tracing test PR to unbreak CI"
This reverts commit e9bd852113.
2023-11-13 15:27:39 +01:00
Fabiano Fidêncio
849253e55c tests: Add a simple test to check the VMM vcpu allocation
As we've done some changes in the VMM vcpu allocation, let's introduce
basic tests to make sure that we're getting the expected behaviour.

The test consists in checking 3 scenarios:
* default_vcpus = 0 | no limits set
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 0.25
  * this should allocate 1 vcpu
* default_vcpus = 0.75 | limits set to 1.2
  * this should allocate 2 vcpus

The tests are very basic, but they do ensure we're rounding things up to
what the new logic is supposed to do.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-10 18:26:01 +01:00
Fabiano Fidêncio
1a81989d20 tests: k8s: Use the "ALLOWED_HYPERVISOR_ANNOTATIONS"
The current kata-deploy code has been doing a `sed` to add allowed
hypervisor annotations, so CBL mariner can be tested with their own
kernel and initrd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 13:42:31 +01:00
Fabiano Fidêncio
455b7bf776 gha: k3s: Avoid unnecessary escape
There's no reason to escape the first + on the +k3s[0-9]\+ regex, as
shown here:
```sh
ubuntu@k3s:~$ /usr/local/bin/k3s kubectl version --short 2>/dev/null | \
	grep "Client Version" | \
	sed \
		-e 's/Client Version: //' \
		-e 's/+k3s[0-9]\+//'
v1.27.7
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 08:42:25 +01:00
Fabiano Fidêncio
e7890ee8f6 gha: Fix regex used to get kubectl version from the k3s version
It seems that with the new k3s release, they've bumped their kubectl
version from x.y.z+k3s1 to x.y.z+k3s2.

Let's ensure our regexp is more generic and future proof for such
changes.

Fixes: #8410

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 07:08:02 +01:00
Archana Shinde
92a517156c
Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test
network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests
2023-11-08 11:45:13 -08:00
Xuewei Niu
136fb76222 tests: Add a integrated test for device cgroup
`TestDeviceCgroup` is added to cri-containerd's integration tests. The test
launches two containers. Each container has a block device. It checks the
validity of device cgroup.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Archana Shinde
c075fa6817 tests: Add test with nerdctl to verify macvlan support
Add test to verify kata supports macvlan networks.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
Archana Shinde
07db673eb9 tests: Add test with nerdctl to verify ipvlan support
Add test to verify kata supports ipvlan networks.
This test can be bit tricky as it requires knowledge about host interfaces
to be used as a master for the ipvlan network.
However, with github actions, we can assume interface called eth0 to be
present on the host and functioning.

Fixes: #8366

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
Wainer Moschetta
949ac4d810
Merge pull request #8217 from beraldoleal/issues/8216
tests: fixes permission denied when running test
2023-11-07 12:25:23 -03:00
David Esparza
28e7b3467b
metrics: improving stop and remove running containers
This PR makes the change to using the SIGKILL signal instead
of SIGTERM to force stop each kata component before start
running any metric test.

Fixes: #8336

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-11-06 09:54:32 -06:00
David Esparza
c232869af9
metrics: removes double-quotes in checkemtrics when parsing results
This PR removes double quotes in jq output to return raw strings
as input of checkmetrics tool.

Fixes: #8331

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:43:03 -06:00
David Esparza
c42a2f2eda
metrics: increase the number of attempts to stop kata
This PR increases the number of attempts to stop kata components
when it is required usually before starting a metrics test.

Fixes: #8307

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:43:03 -06:00
David Esparza
1626253d9e
metrics: FIO ci test enablement
This PR enables the new FIO test based on the containerd client
which is used to track the I/O metrics in the kata-ci environment.

Additionally this PR fixes the parsing of results.

Fixes: #8199

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 09:42:54 -06:00
David Esparza
873386a349
metrics: update iodepth and job size fio parameters to improve workload
This PR updates the values of the fio parameters for iodepth
requests and for the number of jobs, in order to increase the
number of sequential operations.

Additionally, it adds the list of packages needed to parse the
results.

Fixes: #8198

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-30 08:43:06 -06:00
Wainer dos Santos Moschetta
0ce0abffa6 tests/git-helper: cancel any previous rebase left halfway
In bare-metal machines the git tree might get on unstable state with the
previous rebase left halfway. So let's attempt to abort any rebase before.

Fixes #8318
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-26 11:50:12 -03:00
Gabriela Cervantes
2d0518cbe6 metrics: Add parallel udp iperf3 benchmark
This PR adds the parallel udp iperf3 benchmark for network metrics.

Fixes #8277

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-20 19:54:06 +00:00
GabyCT
8486283012
Merge pull request #8247 from GabyCT/topic/iperfudp
metrics: Add iperf udp benchmark
2023-10-20 09:21:37 -06:00
Fabiano Fidêncio
468a3e4b53
Merge pull request #8260 from gkurz/fix-8259
ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
2023-10-19 23:58:22 +02:00
GabyCT
5d6bdbd0a1
Merge pull request #8241 from GabyCT/topic/enableagenttest
tests: Enable agent stability test
2023-10-19 14:12:49 -06:00
Greg Kurz
36109da93f ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
Fixes #8259

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-19 21:53:23 +02:00
Gabriela Cervantes
d01daf749b tests: Adjust timeout for agent stability test
This PR adjusts the timeout for the agent stability test
to run on the gha.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-19 16:55:23 +00:00
Gabriela Cervantes
a58afe70b8 metrics: Add iperf udp benchmark
This PR adds the iperf udp benchmark for bandwdith measurement
for network metrics.

Fixes #8246

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-18 15:52:03 +00:00
Gabriela Cervantes
82a0814fc2 tests: Enable agent stability test
This PR enables the agent stability test for stability gha CI.

Fixes #8240

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-17 15:16:06 +00:00
Dan Mihai
32be8e3a87 tests: query data from the OPA service
Add example for querying json data from the OPA service.

Fixes: #8231

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-17 13:31:43 +00:00
David Esparza
d90d1c5c10
Merge pull request #8243 from dborquez/fix_systemctl_masked_query
metrics: fixes common.sh function to always return true
2023-10-16 20:17:24 -06:00
Dan Mihai
b81c0a6693 tests: encode policy file during test
Encode policy file during test - easier to understand than hard-coding
the encoded file contents.

Fixes: #8214

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-10-16 15:58:12 -07:00
David Esparza
4f9681b411
metrics: fixes common.sh function to always return true
This PR corrects the init env() helper function, to make that
systemctl always returns true when enumerating masked services,
and preventing the test from failing

Fixes: #8242

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-16 15:57:57 -06:00
David Esparza
59e8b1d5a7
Merge pull request #8206 from dborquez/memory_footprint_test_removing_trailing_commas_to_make_json_results_file_valid
Memory footprint test removing trailing commas to make json results file valid
2023-10-16 14:31:28 -06:00
Chao Wu
157caea9fe Revert "nydus: Temporarily skip tests on dragonball"
This reverts commit aba36ab188.

Fixes: #8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
David Esparza
908519db9d metrics: skips docker restart when it is not installed or is masked.
To avoid errors when initializing the test environment, the
kill_processes_before_start() helper function needs to verify that
docker is installed before attempting to stop it.

Fixes: #8218

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:02:00 +00:00
David Esparza
c2763120aa metrics: removing trailing comma characters from json file.
This PR removes trailing commas so that the json results
file is valid.

This PR also changes the way data results are collected by
terating through the array of memory values to calculate
their average.

Fixes: #8204

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-13 18:00:57 +00:00
Beraldo Leal
5ef691528d tests: fixes permission denied when running test
After running cri-containerd/integration-tests twice we receive
permission denied during containerd clean.

Fixes: #8216

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-10-12 19:23:40 +00:00
GabyCT
1974d13122
Merge pull request #8188 from dborquez/metrics_add_fio_readme.md
metrics: removal of reference in the documentation to the fio dax subtest.
2023-10-12 10:53:55 -06:00
Gabriela Cervantes
ef6388e815 tests: Remove unused function from scability test
This PR removes an unused function from scability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-11 19:44:21 +00:00
Gabriela Cervantes
c6463cb5ae tests: Fix path for versions yaml for soak parallel test
This PR fixes the path for versions yaml for soak parallel test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 22:29:20 +00:00
David Esparza
89c9454fca
metrics: removal of reference in the documentation to the dax test.
This PR removes the reference in the documentation to the DAX
subtest of the FIO benchmark, because this metric is currently
WIP.

Fixes: #8159

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-10 15:55:59 -06:00
Gabriela Cervantes
30ff58904e tests: Enable scability test for stability CI
This PR enables the scability test for stability CI gha.

Fixes #8196

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 19:59:57 +00:00
GabyCT
538131ab44
Merge pull request #8154 from GabyCT/topic/addstability
tests: Enable soak parallel stability test
2023-10-10 13:53:14 -06:00
Gabriela Cervantes
e786b2b019 gha: Add install dependencies for stability tests
This PR adds the install dependencies for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-10 16:05:48 +00:00
Wainer Moschetta
d311c3dd04
Merge pull request #7621 from wainersm/gha-run-local
ci: k8s: adapt gha-run.sh to run locally
2023-10-10 11:19:19 -03:00
David Esparza
bba34910df
metrics: stops kata components and k8s deployment when test finishes
This PR adds a trap whenever the scrip exits, it deletes the iperf
k8s deployment and k8s services, and deletes the kata components.

This way, when the script finishes, it verifies that there are
indeed no kata components still running.

Fixes: #8126

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-09 13:41:43 -06:00
Gabriela Cervantes
84e3d884e4 gha: Add general dependencies to stability tests
This PR adds the general dependencies to stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
dec3951ca5 tests: Add soak parallel stability test
This PR adds the soak parallel stability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Gabriela Cervantes
0f04d527d9 tests: Enable soak parallel test
This PR enables the soak parallel test for stability test.

Fixes #8153

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-09 17:02:49 +00:00
Wainer dos Santos Moschetta
e669282c25 ci: k8s: set KUBERNETES default value
The KUBERNETES variable is mostly used by kata-deploy whether to apply
k3s specific deployments or not. It is used to select the type of
kubernetes to be installed (k3s, k0s, rancher...etc) and it is always
set on CI. Running the script locally we want to set a value by default
to avoid `KUBERNETES: unbound variable` errors.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
c30c3ff185 tests: run k8s-volume on a given node
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
666993da8d tests: run k8s-file-volume on a given node
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta
3a00fc9101 tests: exec_host() now gets the node name
The exec_host() simply fails on cluster with multi-nodes because
`kubectl get node -o name" will return a list o names. Moreover, it will
return control nodes names which usually don't have kata installed.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
61c9c17bff tests: add get_one_kata_node() to tests_common.sh
The introduced get_one_kata_node() returns the first node that
has the kata-runtime=true label, i.e., supposedly a node with
kata installed.

This is useful for tests that should run on a determined worker
node on a multi-nodes cluster.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
68f083c4d0 ci: k8s: set KATA_HYPERVISOR default value
Let KATA_HYPERVISOR be qemu by default in gh-run.sh as this variable
is required to tweak some configurations of kata-deploy.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
6677a61fe4 ci: k8s: configurable deploy kata timeout
The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata
deploy installation finish. This allow users of the script to overwrite
that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment
variable.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
200e542921 ci: k8s: shellcheck fixes to gha-run.sh
Fixed a couple of warns shellcheck emitted and disabled others:
 * SC2154 (var is referenced but not assigned)
 * SC2086 (Double quote to prevent globbing and word splitting)

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d54e6d9cda ci: k8s: run_tests() for kcli
The only difference to the other platforms is that it needs to
export KUBECONFIG.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
c2ef1f0fb0 ci: k8s: add deploy-kata-kcli() to gh-run.sh
The cleanup-kcli() behaves like other deploy kata for
bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG
should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
d2be8eef1a ci: k8s: add cleanup-kcli() to gha-run.sh
The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev,
tdx...etc) except that KUBECONFIG should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
cbb9aa15b6 ci: k8s: set default image for deploy_kata()
On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and
DOCKER_TAG are exported to match the built image. However, when running
the script outside of CI context, a developer might just use the latest
image which in this case will be
`quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta
89bef7d036 ci: k8s: create k8s clusters with kcli
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.

Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Aurélien Bombo
e9bd852113 gha: ci: Revert tracing test PR to unbreak CI
Revert "Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests"

This unbreaks CI as seen in https://github.com/kata-containers/kata-containers/actions/runs/6434757133

Fixes: #8161

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-10-06 14:13:17 -07:00
Fabiano Fidêncio
fa6786d1d7
Merge pull request #8117 from fidencio/topic/ci-add-runk-tests
gha: ci: Port runk tests over
2023-10-06 11:19:55 +02:00
Fabiano Fidêncio
8fec654716
Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests
ci: gha: Port tracing tests over
2023-10-06 10:06:57 +02:00
GabyCT
265f53e594
Merge pull request #8082 from dborquez/enable_fio_on_ctr
Enable fio test using containerd client
2023-10-05 17:26:22 -06:00
Fabiano Fidêncio
da91c9df88 ci: Port runk tests to this repo
I'm basically moving the runk tests from the tests repo to this one, and
I'm adding the "Signed-off-by:" of every single contributor the tests.

Fixes: #8116

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 20:41:29 +02:00
Fabiano Fidêncio
9205acc3d2 ci: Move tracing tests here
I'm basically moving the tracing tests from the tests repo to this one,
and I'm adding the "Signed-off-by:" of every single contributor to the
tests.

Fixes: #8114

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-10-04 20:02:27 +02:00
Gabriela Cervantes
85d290a048 gha: Add stability gha run script
This PR adds the stability gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-04 17:45:45 +00:00
Fabiano Fidêncio
2c3bf406dc ci: Create a function to install docker
This will be re-used in other tests as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-04 15:01:51 +02:00
Steve Horsman
c430cc3707
Merge pull request #8098 from stevenhorsman/k8s-registry-suite
versions: migrate out of k8s.gcr.io
2023-10-04 10:51:39 +01:00
David Esparza
8c498ef5ee
metrics: Use jq tool to pretty-print json metrics output
This PR enables the use of jq pretty-print feature to
improve the formatting of metric results json files.

Fixes: #8081

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:33:19 -06:00
David Esparza
a2159a6361
metrics: Enables FIO test for kata containers
FIO benchmark is enabled to measure IO in Kata
at different latencies using containerd client,
in order to complement the CI metrics testing set.

This PR asl deprecated the previous Fio bench
based on k8s.

Fixes: #8080

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-10-03 23:32:38 -06:00
Fabiano Fidêncio
f337315952
Merge pull request #8106 from fidencio/topic/gha-fix-k0s-related-cis
gha: Fix k0s deployment
2023-10-03 21:47:40 +02:00
Fabiano Fidêncio
70e7ec3e23 gha: Fix k0s deployment
The tests are failing when setting up k0s, and that happens because we
download a kubectl binary matching the kubernetes version k0s is using,
and we do that by:
```
sudo k0s kubectl version --short 2>/dev/null | ...
```

With kubectl 1.28, which is now the default on k0s, `kubectl version
--short` has been removed, leading us to an empty stringm causing then
the error in the CI.

Fixes: #8105

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 17:21:40 +02:00
Wainer dos Santos Moschetta
0db8fb8f98 versions: migrate out of k8s.gcr.io
The k8s.gcr.io is deprecated for a while now and has been redirected to
registry.k8s.io. However on some bare-metal machines in our testing
pools that redirection is not working, so let's just replace the
registries.

Fixes #8098
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit b2c3bca558c38deff2117d5909d9071c23c05590)
2023-10-03 11:52:59 +01:00
Gabriela Cervantes
6339605a14 tests: Add general stability fixes
This PR adds general stability fixes.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-10-02 19:42:46 +00:00
Gabriela Cervantes
fd19f4082f tests: Add agent stability test
This PR adds the agent stability test to stability test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:37:02 +00:00
Gabriela Cervantes
215577032f tests: Add cassandra stress in stability tests
This PR adds the cassandra stress at the stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 22:34:45 +00:00
Gabriela Cervantes
f2d3ea988d tests: Add stressng dockerfile for stability tests
This PR adds the stressng dockerfile for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:35:22 +00:00
Gabriela Cervantes
6493aa309e tests: Add stressor CPU test for stability tests
This PR adds the stressor CPU test for stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:33:08 +00:00
Gabriela Cervantes
ef68a3a36b metrics: Add stability test for kata CI
This PR adds the stability test for kata containers repository.

Fixes #8084

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-28 16:23:36 +00:00
GabyCT
fcc755fc3b
Merge pull request #8068 from GabyCT/topic/limitlatency
metrics: Add latency value limits for kata CI
2023-09-27 13:28:41 -06:00
Gabriela Cervantes
8d66ef5185 metrics: Increase qemu jitter value
This PR increases qemu jitter value.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:31:07 +00:00
Gabriela Cervantes
5600e28b54 metrics: Increase jitter value for clh
This PR increases jitter value for clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-27 17:30:19 +00:00
Fabiano Fidêncio
8b25e90027
Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests
ci: Port kata-monitor tests from Jenkins to GHA
2023-09-27 15:48:46 +02:00
Fabiano Fidêncio
489caf1ad0 ci: kata-monitor: Move tests over
Let's move, adapt, and use the kata-monitor tests from the tests repo.
In this PR I'm keeping the SoB from every single contributor from who
touched those tests in the past.

Fixes: #8074

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-27 11:40:31 +02:00
Fabiano Fidêncio
57cb4ce204 ci: Make install_kata aware of container engines
This will help us when running tests using CRI-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:31:17 +02:00
Fabiano Fidêncio
de1eeee334 ci: Create a generic install_crio function
This will serve us quite will in the upcoming tests addition, which will
also have to be executed using CRi-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
64a2000859 ci: Add install_cni_plugins helper
This will become handy when doing tests with CRI-O, as CRI-O doesn't
install the CNI plugins for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
8132fe15c9 ci: Modify containerd default config
Let's ensure we have runc running with `SystemdCgroups = false`,
otherwise we'll face failures when running tests depending on runc on
Ubuntu 22.04, woth LTS containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:16:12 +02:00
Gabriela Cervantes
8cb7df1bed metrics: Add checkmetrics for latency test
This PR adds the checkmetrics for latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 19:11:08 +00:00
Gabriela Cervantes
e90440ae24 metrics: Add qemu latency value limit
This PR adds the qemu latency value limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:30:09 +00:00
Gabriela Cervantes
a74a8f8a9d metrics: Add latency value limits for kata CI
This PR adds latency value limits for kata CI.

Fixes #8067

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-26 17:29:07 +00:00
GabyCT
309103169d
Merge pull request #8056 from GabyCT/topic/fixlatencypath
metrics: Fix latency yamls path
2023-09-26 10:16:55 -06:00
GabyCT
5c0afaacf4
Merge pull request #8018 from GabyCT/topic/fixreadme
metrics: Fix metrics README
2023-09-26 09:51:47 -06:00
David Esparza
83326f89b3
Merge pull request #8054 from GabyCT/topic/fixcrdoc
metrics: Fix C-Ray documentation
2023-09-26 09:50:19 -06:00
Gabriela Cervantes
9ac29b8d38 metrics: Add init_env function to latency test
This Pr adds the init_env function to latency test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 22:06:00 +00:00
Gabriela Cervantes
81c8babca9 metrics: Fix latency yamls path
This PR fixes the latency yamls path for the latency test for
kata metrics.

Fixes #8055

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:52:24 +00:00
Gabriela Cervantes
4815736820 metrics: Fix C-Ray documentation
This PR fixes the C-Ray documentation for kata metrics.

Fixes #8052

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-25 15:27:58 +00:00
Fabiano Fidêncio
ef63d67c41 ci: crio: Trail '\r' from exec_host() output
We've faced this as part of the CI, only happening with the CRI-O tests:
```
 not ok 1 Test readonly volume for pods
 # (from function `exec_host' in file tests_common.sh, line 51,
 #  in test file k8s-file-volume.bats, line 25)
 #   `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass
 # bash: line 1: $'\r': command not found
 #
 # Error from server (NotFound): pods "test-file-volume" not found
```

I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just trail the '\r', as long as all the tests keep
passing on containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 16:42:18 +02:00
Fabiano Fidêncio
74c12b2927 ci: crio: Enable default capabilities
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order to have tests accessing the host to pass.

A huge thanks to Greg Kurz for spotting this and suggesting the fix.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
ebaa4fa4c1 ci: crio: Pass -y to apt
That was something overlooked during my tests. :-/

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
Gabriela Cervantes
97e73b2234 metrics: Fix spelling warnings
This PR fixes general spelling warnings detected by the spelling check.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:50:51 +00:00
Gabriela Cervantes
36c8cd6f1f metrics: Fix metrics README
This PR fixes the network metrics section at the README by leaving
the current tests that we have in our kata metrics.

Fixes #8017

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-22 15:28:58 +00:00
Gabriela Cervantes
6776b55d7e metrics: Enable latency test in gha run script
This PR enables the latency test for gha run script for kata metrics.

Fixes #8037

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-21 16:11:58 +00:00
Fabiano Fidêncio
07a6e63a6b ci: k8s: rke2: Use sudo to call systemd
Otherwise we'll face the following error:
```
Failed to enable unit: Interactive authentication required.
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 08:48:29 +02:00
Fabiano Fidêncio
d7105cf7a4 ci: k8s: Add a method to install CRI-O
This is based on official CRI-O documentations[0] and right now we're
making this specific to Ubuntu as that's what we have as runners.

We may want to expand this in the future, but we're good for now.

[0]:
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
54c0a471b1 ci: k8s: k0s: Allow passing parameters to the k0s installer
We'll need this in order to setup k0s with a different container engine.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
GabyCT
6111ef6fb6
Merge pull request #7990 from GabyCT/topic/parallelbandwidth
metrics: Enable parallel bandwidth iperf limit
2023-09-19 14:52:21 -06:00
Fabiano Fidêncio
5560e72024
Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support
ci: kata-deploy: Enable all k8s flavours that we support
2023-09-19 17:44:34 +02:00
Fabiano Fidêncio
2c908b598c ci: kata-deploy: Add the ability to deploy rke2
This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
eaf6164916 ci: kata-deploy: Add the ability to deploy k0s
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
0015257636 ci: kata-deploy: Add deploy-k8s argument to gha-run.sh
We'll be using exactly the same code used for the k8s tests, which are
already deploying k3s on GARM.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
bf2cb02283 ci: kata-deploy: Expland tests to run on k0s / rke2
We just need to make sure the correct overlay is applied, following what
we already have been doing for k3s.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
9e1fb8a966 ci: kata-deploy: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.

This was also done as part of fa62a4c01b,
for the k8s tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
09cc0ed438 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
486fe14c99 ci: Properly set K8S_TEST_UNION
Otherwise only the first test will be executed

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
d9ef1352af ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum amount of characteres allowed for the cluster
name.  With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
68267a3996 ci: Create clusters in individual resource groups
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.

This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.

The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.

Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.

Fixes: #7982

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:55 +02:00
Gabriela Cervantes
9aa8d1c917 metrics: Add parallel bandwidth limit for qemu
This PR adds the parallel bandwidth limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 21:08:54 +00:00
Gabriela Cervantes
af59d4bf4a metrics: Enable parallel bandwidth iperf limit
This PR enables the parallel bandwidth iperf limit for kata metrics.

Fixes #7989

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-18 16:32:11 +00:00
Fabiano Fidêncio
aba36ab188 nydus: Temporarily skip tests on dragonball
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.

The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
   /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
   error: Failed to Mount backend
...

Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b8a8dfcd15 nydus: Use kata-${KATA_HYPERVISOR} instead of kata
This will ensure we're testing with the correct runtime, instead of
using the `default` one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
2f9c9e2e63 tests: nydus: Update nydus tests
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b73bde320d gha: nydus: Populate run()
And with this we finally enable the nydus tests to run as part of our
GHA CI.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b3904a1a30 gha: nydus: Populate install_dependencies()
Let's have all the dependencies needed for running the nydus tests
installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
d2b3b67f5d gha: nydus: Actually install kata when install-kata is called
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.

Now, let's install kata, as expected. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
0ec00ad42e gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.

Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
568439c77b tests: nydus: Add timeout to the crictl calls
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
5ac3b76eb1 tests: nydus: Add uid / namespace to the nydus container / sandbox
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
376574a16c tests: nydus: Decorate some calls with sudo
Otherwise we canoot properly start the nydus snapshotter, nor properly
kill it after it's been started.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
4290fd4b67 tests: nydus: Adapt "source ..." to GHA
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
a84efa3e87 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
As that's what we've been using as part of the GHA.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
56a14b3950 tests: common: Add install_nydus_snapshotter()
This function will be used to download and install the
nydus-snapshotter, and it follows the same pattern we already have
introduced for downloading and installing another dependencies from
GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b6563783e2 tests: common: Add install_nydus()
This function will be used to download and install nydus, and it follows
the same pattern we already have introduced for downloading and
installing another dependencies from GitHub.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Greg Kurz
cab46c9e23
Merge pull request #7973 from fidencio/topic/ci-use-bigger-machine-sizes-for-the-needed-tests-part-0
ci: Use variable size of VMs depending on the tests running
2023-09-18 12:06:44 +02:00
Fabiano Fidêncio
e125775863 tests: install_rust: Also install clippy
clippy is used as part our tests, so it's useful to have it installed
while we're already installing rust.

In case of developers, they also better be using it. :-)

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:31 +02:00
Fabiano Fidêncio
6794d4c843 tests: Move install_rust.sh from the tests repo
We'll use it as part of the refactoring we're doing in the static check
tests.

I can see a lot of other uses of this, but changing all of them to this
one is out of the scope for this PR.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:29 +02:00
Fabiano Fidêncio
e64508c308 tests: install_go: Remove tests repo dependency
We can rely on the functions that are now part of the common.bash.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
11dff731b7 tests: Move functions from kata_arch script here
We can use this a lot as part of our CI, but right now I'm just moving
those here with the intent to use later on in this series.

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:52:28 +02:00
Fabiano Fidêncio
c69a1e33bd ci: Use variable size of VMs depending on the tests running
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.

Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that required the normal Azure instance (D4s_v5)

With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.

Fixes: #7972

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 09:13:54 +02:00
Jeremi Piotrowski
6f30d00ae7
Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used
ci: Reduce the size of the AKS VMs
2023-09-15 08:49:08 +02:00
Fabiano Fidêncio
094b6b2cf8 ci: k8s: Temporarily disable tests that require a bigger VM instance
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI

We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 01:33:19 +02:00
Fabiano Fidêncio
92fff129fd ci: k8s: Don't set cpu limit request for k8s-inotofy test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
faf98c0623 ci: Reduce the size of the AKS VMs
We do **not** need a very powerful machine for our tests, as we're not
building anything there.

The instance we switched to (Standard_D2s_v5) still has nested virt
available, as shown here[0], but has half of the amount of vCPUs /
Memory, which should be fine only for running the tests, costing us
basically half of the price[1].

[0]:
https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series
[1]:
https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing

Fixes: #7955

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Gabriela Cervantes
cd4fd1292a metrics: Add iperf cpu utilization limit for qemu
This PR adds the iperf cpu utilization limit for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 17:17:47 +00:00
Gabriela Cervantes
df5cd10ea0 metrics: Add iperf value for cpu utilization
This PR adds the iperf value for cpu utilization for kata metrics.

Fixes #7936

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-14 16:06:49 +00:00
Jeremi Piotrowski
a96050a7ad tests: Apply timeout to 'ctr t kill'
This task has been observed to hang at times.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9d93036783 tests/vfio: Bump VM image to Fedora 38
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
faee59b520 tests/vfio: Accept single device in vfio group for CLH
cloud hypervisor does not emulate pcie switches or pci bridges, so we need to
accept a lonely device.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df3dc1105c tests/vfio: Get rid of sync's
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
9f1a42c6cc tests/vfio: Give commands 30s to execute
This is a to catch the case of the guest getting stuck.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
b46b0ecf8b tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms
This shouldn't be hiding behind only a qemu check, we need this for clh as
well.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5f6475a28a tests/vfio: Gather debug info and disable tdp_mmu
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly bad when running nested on Hyper-V. Reload the module at the start
of the test and disable the tdp_mmu param.

Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses github actions group syntax so that each section can be
collapsed.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
8fffdc81c5 tests/vfio: Capture journal from vm
For debugging (though this doesn't get exposed yet).

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
df815087e7 tests/vfio: Change to get the test working in GHA
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
a92ddeea15 tests/vfio: Move dependency installation to gha-run.sh
To match the flow of other github actions workflows.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Jeremi Piotrowski
5a551a85b1 gha: vfio: Import jobs scripts from tests repo
This imports the vfio test scripts github.com/kata-containers/tests. The test
case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh

Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-09-14 14:23:28 +02:00
Fabiano Fidêncio
a1e3fa7ac4
Merge pull request #7905 from microsoft/danmihai1/mariner-annotations
tests: fix kernel and initrd annotations
2023-09-14 10:37:42 +02:00
GabyCT
1d331124ad
Merge pull request #7925 from GabyCT/topic/bandwidthlimit
metrics: Add iperf bandwidth value for kata metrics
2023-09-13 17:43:55 -06:00
Gabriela Cervantes
49e2fa189c metrics: Increase jitter value for qemu
This PR increases the jitter value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 22:36:09 +00:00
Gabriela Cervantes
49234433a7 metrics: Increase value limit for jitter in clh
This PR increases the value limit for jitter in clh.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-13 21:27:08 +00:00
David Esparza
0a24d3f718
Merge pull request #7923 from GabyCT/topic/addcassandradoc
metrics: Add Cassandra Metrics documentation
2023-09-13 10:17:00 -06:00
GabyCT
c565053bac
Merge pull request #7895 from GabyCT/topic/removewarning
metrics: Remove warning from metrics documentation
2023-09-13 10:16:38 -06:00
Fabiano Fidêncio
813bfdec01 ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
This will ensure that we're calling the correct binary for the
hypervisor.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:10:14 +02:00
Fabiano Fidêncio
46bc0b1c01 ci: nerdctl: Create the containerd config
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.

This is mostly needed because the nerdctl-full tarball doesn't provide a
contaienrd configuration, just the binary, as contaienrd does not
actually require a configuration file to run with the default config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
13968aa7f6 ci: nerdctl: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
e0c811678b ci: docker: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Gabriela Cervantes
0aa073967d metrics: Add iperf bandwidth value for qemu
This PR adds the iperf bandwidth value for qemu for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 20:57:14 +00:00
Dan Mihai
c0ad914766 tests: fix kernel and initrd annotations
Fix kernel and initrd annotations in the k8s tests on Mariner. These
annotations must be applied to the spec.template for Deployment, Job
and ReplicationController resources.

Fixes: #7764

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-12 20:15:25 +00:00
Gabriela Cervantes
615c1cbf19 metrics: Add iperf bandwidth value for kata metrics
This PR adds the iperf bandwidth value for kata metrics.

Fixes #7924

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:30:24 +00:00
Gabriela Cervantes
d53eb73eec metrics: Ensure docker is running in init_env
This PR ensures that docker is running as part of the init_env function
in kata metrics to avoid failures like docker is not running and making
the kata metrics CI to fail.

Fixes #7898

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 19:13:09 +00:00
Gabriela Cervantes
ad08321b83 metrics: Add Cassandra Metrics documentation
This PR adds the Cassandra Metrics documentation for kata metrics.

Fixes #7922

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-12 16:30:35 +00:00
David Esparza
a58ea66592
metrics: this PR skips the FIO test temprarily to fix issues
FIO test is showing ongoing issues when running in k8s.
Working on running FIO on the ctr client which has been
shown to be stable.

Fixes: #7920

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-12 10:23:57 -06:00
Fabiano Fidêncio
f536ef5ce1 ci: docker: Also run the smoke test with runc
This will help us to make sure that the failure is actually related to
Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:54:02 +02:00
Fabiano Fidêncio
12d833d07d ci: Add a very basic nerdctl sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using nerdctl + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7911

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:52:55 +02:00
Fabiano Fidêncio
348b8644d6 ci: Add a very basic docker sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using docker + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 15:15:26 +02:00
Gabriela Cervantes
060499dcae metrics: Remove warning from metrics documentation
Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation.

Fixes #7894

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-11 16:41:48 +00:00
GabyCT
b384757ac7
Merge pull request #7874 from fidencio/topic/manually-rebase-branches-atop-of-the-target-one
gha: Manually rebase PR atop of the target branch before testing
2023-09-11 10:35:01 -06:00
GabyCT
fa818bfad1
Merge pull request #7867 from GabyCT/topic/optimizedimage
metrics: Use TensorFlow optimized image
2023-09-08 11:34:21 -06:00
Fabiano Fidêncio
bd24afcf73 gha: Manually rebase PR atop of the target branch before testing
We're changing what's been done as part of ac939c458c, as we've
notcied issues using `github.event.pull_request.merge_commit_sha`.

Basically, whenever a force-push would happen, the reference of
merge_commit_sha wouldn't be updated, leading us to test PRs with the
old code. :-/

In order to get the rebase properly working, we need to ensure we pull
the hash of the commit as part of checkout action, and ensure
fetch-depth is set to 0.

Fixes: #7414

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 18:56:31 +02:00
GabyCT
dc7414f5c1
Merge pull request #7870 from dborquez/metrics_fio_fix_clean_env_order
metrics: fix FIO test initialization
2023-09-08 10:28:10 -06:00
Fabiano Fidêncio
9d74b7ccc9 k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 15:51:46 +02:00
Fabiano Fidêncio
f6cd3930c5 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:29 +02:00
Fabiano Fidêncio
3cc20b47a6 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:24 +02:00
Fabiano Fidêncio
b5bad3cb0f ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:04:50 +02:00
Fabiano Fidêncio
27fa7d828d ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
fa62a4c01b ci: k8s: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
3de23034f8 ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:46:58 +02:00
David Esparza
adfea55b8f
metrics: fix FIO test initialization
This PR changes the order in which the FIO test first
cleans the environment and then checks if the environment
is indeed clean.

Fixes: #7869

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-07 15:41:59 -06:00
Fabiano Fidêncio
2df183fd99 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--apend`) aptton, the
containerd config would be overwritten, leading to a NotReady state of
the Node.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
369a8af8f7 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ada65b988a ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ad45ab5d33 ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.

As --write-config-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
028a97e0d5 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not a number of seconds.  Not sure
how I got that wrong in the first place, but it's what it's.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
David Esparza
34f580901f
Merge pull request #7824 from dborquez/fix_memory_usage_initialization
metrics: re-enable memory-usage initialization step
2023-09-07 14:24:27 -06:00
Gabriela Cervantes
3a427795ea metrics: Use TensorFlow optimized image
This PR replaces the ubuntu image for one which has TensorFlow optimized
for kata metrics.

Fixes #7866

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-07 15:38:51 +00:00
Fabiano Fidêncio
b28b54df04 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-06 23:08:17 +02:00
Fabiano Fidêncio
54f7117212 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:07:41 +02:00
Gabriela Cervantes
438fbf9669 metrics: Add write 95 percentile for FIO for qemu
This PR adds the write 95 percentile for FIO for qemu for
checkmetrics for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 22:50:31 +00:00
Gabriela Cervantes
024b4d2ffe metrics: Add write 95 percentile FIO value
This PR adds the write 95 percentile FIO value for checkmetrics
for kata metrics.

Fixes #7842

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 21:00:05 +00:00
Gabriela Cervantes
e98e5cdea2 metrics: Add checkmetrics to gha run script
This PR adds the checkmetrics to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 17:05:03 +00:00
Gabriela Cervantes
c1edfe5511 metrics: Add checkmetrics value for qemu for iperf
This PR adds the checkmetrics value for qemu for iperf benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
6a79ecedf9 metrics: Add jitter value for clh
This PR adds jitter value for clh for iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
f609a9a754 metrics: Add test selector to iperf metrics
This PR adds test selector to iperf metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Gabriela Cervantes
5b8db30422 metrics: Enable iperf benchmark on gha for kata metrics
This PR enables the iperf benchmark to run on the gha for kata metrics.

Fixes #7575

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-05 16:04:52 +00:00
Fabiano Fidêncio
b663ec21ac
Merge pull request #7803 from GabyCT/topic/readmereportdoc
metrics: Add README for kata metrics report
2023-09-03 21:57:13 +02:00
David Esparza
b151cfd140
metrics: re-enable memory-usage initialization step
This PR re-enables the initialization step disabled
on 538c965c2b.

Fixes: #7804

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-09-01 14:29:34 -06:00
Dan Mihai
bf21411e90 tests: add policy to k8s tests
Use AGENT_POLICY=yes when building the Guest images, and add a
permissive test policy to the k8s tests for:
- CBL-Mariner
- SEV
- SNP
- TDX

Also, add an example of policy rejecting ExecProcessRequest.

Fixes: #7667

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Gabriela Cervantes
6668825752 metrics: Add grabdata script for metrics report
This PR adds the grabdata script so it can be used for the metrics report
for kata metrics.

Fixes #7812

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-31 16:17:29 +00:00
GabyCT
b467f2ef68
Merge pull request #7772 from GabyCT/topic/fiolimit
metrics: Enable FIO limits for kata metrics
2023-08-30 14:49:04 -06:00
Gabriela Cervantes
9f21fa9b39 metrics: Add report generator link to general documentation
This PR adds the report generator link to general documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:55:14 +00:00
Gabriela Cervantes
c0ed5ea0ad metrics: Add README for kata metrics report
This PR adds the README for kata metrics report.

Fixes #7802

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 16:36:08 +00:00
Fabiano Fidêncio
aa2b51a831
Merge pull request #7783 from GabyCT/topic/makereport
metrics: Add metrics report script
2023-08-30 17:11:39 +02:00
Gabriela Cervantes
a7b59a5bf9 metrics: Add limit for 90 percentile for qemu value
This PR adds the limit for 90 percentile for qemu value for
FIO kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
99db6568e9 metrics: Add limit for write 90 percentile value for clh
This PR adds the limit for write 90 percentile value for clh for
FIO metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
6e06392c55 metrics: Enable FIO limits for kata metrics
This PR enables the FIO limits for kata metrics.

Fixes #7771

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
c8dd3c0737 metrics: Fix memory footprint qemu limit
This PR fixes the memory footprint qemu limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 22:51:21 +00:00
Gabriela Cervantes
8877ec62fb metrics: Fix memory inside limits for kata metrics
This PR fixes the memory inside limit for clh for kata metrics due
to the recent changes that we had in the script which impacted
in the performance measurement.

Fixes #7786

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 21:38:18 +00:00
Gabriela Cervantes
7e364716dd metrics: Add test setup details to metrics report
This PR adds test setup details to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:56:53 +00:00
Gabriela Cervantes
17dc1b9760 metrics: Add boot lifecycle times to metrics report
This PR adds the boot lifecycle times to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:55:44 +00:00
Gabriela Cervantes
3b0d6538f2 metrics: Add memory inside container to metrics report
This PR adds memory inside container to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:53:17 +00:00
Gabriela Cervantes
79fbb9d243 metrics: Add scaling system footprint in metrics report
This PR adds scaling system footprint in metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:51:27 +00:00
Gabriela Cervantes
8e6d4e6f3d metrics: Add metrics reportgen
This PR adds metrics reportgen for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:45:36 +00:00
Gabriela Cervantes
139ffd4f75 metrics: Add report file titles
This PR adds report file titles for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:43:06 +00:00
GabyCT
8f2dae7b53
Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results
metrics: fix parsing issue on memory-usage test
2023-08-29 11:26:13 -06:00
Gabriela Cervantes
878d1a2e7d metrics: Generate PNGs alongside the PDF report
This PR generates the PNGs for the kata metrics PDF report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:50:32 +00:00
Gabriela Cervantes
fce2487971 metrics: Add metrics report R files
This PR adds the metrics report R files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:45:22 +00:00
Gabriela Cervantes
08812074d1 metrics: Add report dockerfile
This PR adds the report dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:28:32 +00:00
Gabriela Cervantes
69781fc027 metrics: Add metrics report script
This PR adds metrics report script for kata metrics.

Fixes #7782

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:25:14 +00:00
Fabiano Fidêncio
e286e842c1 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
e31f099be1 tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
c3b9d4945e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:34 +02:00
David Esparza
538c965c2b
metrics: fix parsing issue on memory-usage test
This PR fixes an issues in the parsing results stage,
by collecting just the n-results from the n-running
containers, discarding irrelevant data.

Fixes: #7774

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-28 23:39:46 -06:00
Fabiano Fidêncio
02a08c956b
Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment
tests: delete k8s deployment at the test's end
2023-08-27 17:52:00 +02:00
Fabiano Fidêncio
98037ced52
Merge pull request #7755 from microsoft/danmihai1/unique-test-name
tests: use unique test name
2023-08-27 17:27:40 +02:00
Dan Mihai
183f51d6f6 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:41:06 +00:00
Dan Mihai
6a974679f2 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:34:37 +00:00
Gabriela Cervantes
32a778b6da metrics: Remove unused variable in tensorflow nhwc script
This PR removes unused variable in tensorflow nhwc script.

Fixes #7750

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-24 15:54:27 +00:00
Gabriela Cervantes
959ca49447 metrics: Add TensorFlow ResNet50 fp32 Dockerfile
This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:24:58 +00:00
Gabriela Cervantes
4b7d72c4a8 metrics: Add TensorFlow ResNet50 FP32 benchmark
This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics.

Fixes #7735

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-23 16:21:09 +00:00
GabyCT
b8990c0490
Merge pull request #7722 from GabyCT/topic/adddiskreadme
metrics: Add disk link to README
2023-08-22 12:29:54 -06:00
GabyCT
514d3d42b8
Merge pull request #7712 from GabyCT/topic/fixfiopath
metrics: Fix FIO path
2023-08-22 12:28:28 -06:00
Gabriela Cervantes
8afd158cef metrics: Add disk link to README
This PR adds disk link to README documentation for kata metrics.

Fixes #7721

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-22 16:20:31 +00:00
Fabiano Fidêncio
8032797418
Merge pull request #7708 from microsoft/danmihai1/kata-deploy-log
gha: capture additional kata-deploy output
2023-08-21 23:43:51 +02:00
David Esparza
d2c130ea69
Merge pull request #7710 from GabyCT/topic/fixpytorch1
metrics: Use function from metrics common in pytorch script
2023-08-21 15:31:24 -06:00
Gabriela Cervantes
eee2ee6eeb metrics: Fix FIO path
This PR fixes the FIO path for the FIO files.

Fixes #7711

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 21:06:04 +00:00
David Esparza
9347051592
Merge pull request #7666 from dborquez/metrics_improve_fio_test
metrics: Enable kata runtime in K8s for FIO test.
2023-08-21 13:51:57 -06:00
Gabriela Cervantes
39bc3488f5 metrics: Use function from metrics common in pytorch script
This PR uses a common function into the pytorch script.

Fixes #7709

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-21 16:12:35 +00:00
Dan Mihai
400eb88743 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-21 15:58:57 +00:00
GabyCT
700759232f
Merge pull request #7690 from GabyCT/topic/fixpytorch
metrics: Fix README for pytorch
2023-08-21 09:50:14 -06:00
Jiang Liu
6e038e66e4
Merge pull request #7680 from GabyCT/topic/removetime
metrics: Remove unused variable in tensorflow mobilenet script
2023-08-21 23:39:07 +08:00
Gabriela Cervantes
c8b43f8b3e metrics: Fix README for pytorch
This PR fixes the pytorch reference in the README file.

Fixes #7689

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-18 20:14:49 +00:00
Fabiano Fidêncio
7e66d1f6b5
Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests
gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy
2023-08-18 07:47:26 +02:00
David Esparza
fb571f8be9
metrics: Enable kata runtime in K8s for FIO test.
This PR configures the corresponding kata runtime in K8s
based on the tested hypervisor.

This PR also enables FIO metrics test in the kata metrics-ci.

Fixes: #7665

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-17 17:11:27 -06:00
Gabriela Cervantes
85c02828e1 metrics: Update tensorflow name in gha run script
This PR update tensorflow name in gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 20:17:48 +00:00
Gabriela Cervantes
e8a5119343 metrics: Fix check results for tensorflow benchmark
This PR fixes the check results for tensorflow benchmark now
that we change the name of the test.

Fixes #7684

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 19:52:45 +00:00
Fabiano Fidêncio
2d896ad12f gha: kata-deploy: Do the runtime class cleanup as part of the cleanup
Instead of doing this as part of the test itself, let's ensure it's done
before running the tests and during the tests cleanup.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
Fabiano Fidêncio
4ffc2c86f3 gha: kata-deploy: Add the first kata-deploy test
This test, at least for now, only checks whether the runtimeclasses
have been properly created.

This is just a migration from a test we had as part of the k8s suite.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 18:54:46 +02:00
GabyCT
4ba684e6e4
Merge pull request #7653 from GabyCT/topic/tensorflowfp32
metrics: Add Tensorflow ResNet50 int8 benchmark
2023-08-17 10:44:25 -06:00
Gabriela Cervantes
8616c050ae metrics: Remove unused variable in tensorflow mobilenet script
This PR removes unused variable in tensorflow mobilenet script.

Fixes #7679

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-17 16:04:18 +00:00
Fabiano Fidêncio
285e616b5e tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:16 +02:00
Fabiano Fidêncio
790bd3548d tests: commob: Don't fail if yq is not part of the cache
This may happen on external runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:14 +02:00
Fabiano Fidêncio
ce6adecd0a gha: kata-deploy: Add run-kata-deploy-tests.sh
This will have the same function as run-k8s-tests.sh has, but for
kata-deploy.

Right now it doesn't have any tests, and the command to actually run the
tests is commented out, but right now this is just a placeholder that
will be populated sooner than later.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:49:03 +02:00
Fabiano Fidêncio
cfc29c11a3 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and avoid
more kata-deploy tests to land here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 09:48:54 +02:00
Fabiano Fidêncio
e470a650e0
Merge pull request #7654 from sprt/ci-fixes
kata-deploy: Properly create default runtime class
2023-08-17 09:43:34 +02:00
Aurélien Bombo
f4dd152863 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 14:13:56 -07:00
Aurélien Bombo
339569b69c kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-08-16 11:04:44 -07:00
Gabriela Cervantes
2a491e9b1f metrics: Fix MobileNet help me description
This PR fixes MobileNet help me description in the
tensorflow script.

Fixes #7661

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-16 15:25:39 +00:00
Gabriela Cervantes
bade6a5c3b docs: Fix TensorFlow word across the document
This PR fixes the TensorFlow word across the document to have uniformity
across all the document.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 20:13:05 +00:00
Fabiano Fidêncio
0bc48eab60
Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests
gha: cri-containerd: Enable tests
2023-08-15 21:18:28 +02:00
Gabriela Cervantes
1a1b207760 docs: Add Tensorflow Resnet50 documentation
This PR adds the Tensorflow Resnet50 documentation.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:46:44 +00:00
Gabriela Cervantes
24baededc0 metrics: Add Dockerfile for ResNet50 int8
This PR adds the dockerfile for ResNet50 int8 benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:38:26 +00:00
Gabriela Cervantes
6d971ba8df metrics: Add Tensorflow ResNet50 int8 benchmark
This PR adds the Tensorflow ResNet50 int8 script for kata metrics.

Fixes #7652

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-15 17:30:22 +00:00
GabyCT
0bbabeaaf8
Merge pull request #7644 from GabyCT/topic/renametensorflow
metrics: Rename tensorflow scripts
2023-08-15 09:23:24 -06:00
Fabiano Fidêncio
46d25d908d
Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests
gha: tests: Add kata-deploy functional tests -- Part 1
2023-08-15 15:23:48 +02:00
Fabiano Fidêncio
b3592ab25c gha: cri-containerd: Enable tests
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:32:42 +02:00
Fabiano Fidêncio
84dd02e0f9 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.

The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.

It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
b29782984a gha: cri-containerd: Show pod before deleting it
It'll help us to debug failures with the pod stop / pod delete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
ae0930824a gha: cri-containerd: Print kata logs in case of error
We need this to fully understand what are the issues we're facing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
6c8b2ffa60 gha: cri-containerd: Group containerd logs
This improves readability in case of failures by a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Fabiano Fidêncio
9e898701f5 gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
Short commit log says it all.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-15 14:31:54 +02:00
Gabriela Cervantes
18a7fd8e4e metrics: Rename tensorflow scripts
This PR renames the tensorflow scripts to include the data format
that is being used as we will have multiple tests with different
data and model formats for tensorflow so this will help us to
distinguish them.

Fixes #7645

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-14 20:40:35 +00:00
GabyCT
a740c80251
Merge pull request #7626 from GabyCT/topic/cassandrak
metrics: Add Cassandra Kubernetes benchmark for kata metrics
2023-08-14 14:22:52 -06:00
GabyCT
4e5e39e8b3
Merge pull request #7618 from GabyCT/topic/addfunctionscommon
metrics: Add common functions to the common script
2023-08-14 14:22:30 -06:00
Fabiano Fidêncio
831e73ff91 tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder
Right now this file does nothing, as it's not even called by any GHA.
However, it'll be populated later on as part of a different series,
where we'll have kata-deploy specific tests running here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:46:10 +02:00
Fabiano Fidêncio
af1b46bbf2 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kuberentes/gha-run.sh`
out, and put them in a place where they can be used to the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-14 17:45:58 +02:00
David Esparza
767434d50a
metrics: fix the loop used to stop kata components #7629
This PR fixed the loop that stops the kata-shim and the
hypervisors used in metrics checks.

Fixes: #7628

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-11 12:32:41 -06:00
Gabriela Cervantes
5d0f0d43c7 metrics: Add cassandra statefulset yaml
This PR adds cassandra statefulset yaml for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:39 +00:00
Gabriela Cervantes
c1dcc1396f metrics: Add cassandra service yaml
This PR adds the cassandra service yaml for the benchmark.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:36 +00:00
Gabriela Cervantes
2297a0d1c5 metrics: Add block loop pvc yaml for cassandra
This PR adds block loop pvc yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:33 +00:00
Gabriela Cervantes
e3d511946f metrics: Add block loop pv yaml for cassandra test
This PR adds the block loop pv yaml for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:29 +00:00
Gabriela Cervantes
9890271594 metrics: Add block loop pvc for cassandra test
This PR adds the block loop pvc for cassandra test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:22:19 +00:00
Gabriela Cervantes
349b89969a metrics: Add Cassandra Kubernetes benchmark for kata metrics
This PR adds Cassandra Kubernetes benchmark for kata metrics tests.

Fixes #7625

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-11 17:21:48 +00:00
Gabriela Cervantes
fdcd52ff78 metrics: Add check containers are running in tensorflow mobilenet
This PR adds check containers are running in tensorflow mobilenet
that is being defined in common script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:17:20 +00:00
Gabriela Cervantes
36337ee146 metrics: Add check containers are up in tensorflow script
This PR adds the check containers are up function from common
in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:15:18 +00:00
Gabriela Cervantes
f700f9b0ba metrics: Remove unused variable in tensorflow script
This PR removes an unused variable in tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:13:37 +00:00
Gabriela Cervantes
833cf7a684 metrics: Add check containers are running function
This PR adds the check containers are running function the common metrics
script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:12:22 +00:00
Gabriela Cervantes
918c783084 metrics: Add check containers are up in tensorflow mobilenet script
This PR adds the check containers are up in the common script
in the tensorflow mobilenet script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 20:06:40 +00:00
Gabriela Cervantes
9d57a1fab4 metrics: Use check containers are up in tensorflow script
This PR uses the check containers are up from the common script
in the tensorflow script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:42:09 +00:00
Gabriela Cervantes
1c84680d8c metrics: Add check containers are up in common script
This PR adds check containers are up in common script for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:39:24 +00:00
Gabriela Cervantes
d3e57cf454 metrics: Use collect_results function in tensorflow mobilenet test
This PR uses the collect results function defined in common for
the tensorflow mobilenet test.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:34:30 +00:00
Gabriela Cervantes
286de046af metrics: Remove collect results function definition
This PR removes the collect results function from tensorflow script
as it is going to be referenced in the common metrics script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:31:23 +00:00
Gabriela Cervantes
9879709aae metrics: Add common functions to the common script
This PR adds the collect results function to the common metrics
script.

Fixes #7617

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-10 17:27:11 +00:00
David Esparza
7bf994827d
Merge pull request #7609 from dborquez/tensorflow_check_completion
metrics: compute tensorflow statistics
2023-08-09 18:47:47 -06:00
David Esparza
dcdb3b067f
Merge pull request #7606 from GabyCT/topic/nginx
metrics: Add network nginx benchmark
2023-08-09 16:14:13 -06:00
David Esparza
2defdcc598
Merge pull request #7579 from dborquez/simplify_gha_metrics_workflow
metrics: install kata once and run multiple checks
2023-08-09 14:45:09 -06:00
David Esparza
473b0d3a31
metrics: compute tensorflow statistics
This PR computes average results for TF bench.
Additionally, it improves the data parsing from
all running containers.

Fixes: #7603

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-09 14:42:30 -06:00
Fabiano Fidêncio
eb463b38ec ci: unencrypted-image: Don't fail to build on s390x
Let's make sure that we don't fail in case we're building non x86_64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 20:32:36 +02:00
Gabriela Cervantes
d1a6296221 metrics: Add nginx documentation to network README
This PR adds nginx documentation to network README for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:17:46 +00:00
Gabriela Cervantes
498f7c0549 metrics: Add nginx kubernetes yaml
This PR adds the nginx kubernetes yaml.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:14:04 +00:00
Gabriela Cervantes
f8a5255cf7 metrics: Add network nginx benchmark
This PR adds the network nginx benchmark for kata metrics.

Fixes #7605

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-09 16:12:21 +00:00
Fabiano Fidêncio
5cdf981a2b
Merge pull request #7596 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests
tests: Create image that will be used in the unencrypted confidential tests
2023-08-09 17:06:07 +02:00
Fabiano Fidêncio
c932369f42
Merge pull request #7492 from fidencio/topic/adapt-tests-to-the-new-kata-deploy-env-vars
kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests
2023-08-09 12:55:03 +02:00
Fabiano Fidêncio
034d7aab87 tests: k8s: Ensure the runtime classes are properly created
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.

Fixes: #7491

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:46:04 +02:00
Fabiano Fidêncio
ab5f603ffa ci: k8s: Add the image used for unencrypted confidential tests
Let's add here the image we'll be using for unencrypted confidential
tests.  Later on, we'll make sure to build and use this image as part of
our CI.

The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:33:18 +02:00
Fabiano Fidêncio
1e8fe131bd k8s: tests: Take advantage of SHIMS and DEFAULT_SHIM env vars
We don't have to do any sed to replace the runtimeclass being used by
the moment we start taking advantage of the `DEFAULT_SHIM` environment
variable exposed merged in the previous commits.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-09 11:15:34 +02:00
Unmesh Deodhar
aeaec9dae9 tests: upgrade bats version
Instead of using package manager to install bats, building
this from source. This gives us the updated version of bats
which supports functions such as setup_file and
teardown_file.
We can use these functions into our current tests.

Fixes: #7597

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-08 18:16:39 -05:00
David Esparza
e664969862
metrics: install kata once and run multiple checks
This PR changes the metrics workflow in order to just install
kata once, and run the checks for multiple hypervisor variations.

In this way we save time avoiding installing kata for each
hypervisor to be tested.

Fixes: #7578

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-08 10:25:13 -06:00
Chelsea Mafrica
553fd79ea9
Merge pull request #7572 from GabyCT/topic/resnet50fp32
metrics: General improvements to mobilenet tensorflow test
2023-08-07 13:33:28 -07:00
Gabriela Cervantes
863283716d metrics: General improvements to mobilenet tensorflow test
This PR renames the mobilenet tensorflow test to have a more specific
tensorflow name mainly because tensorflow has different configurations
and we will add more tensorflow tests so we want to distinguish each
tensorflow test.

Fixes #7571

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-07 16:50:00 +00:00
Gabriela Cervantes
3c319d8d4c metrics: Add iperf to gha run script
This PR adds iperf to gha run script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-07 16:20:00 +00:00
GabyCT
7144acb2a5
Merge pull request #7527 from GabyCT/topic/latency
metrics: Add network latency test
2023-08-04 15:54:07 -06:00