This PR removes the CI_JOB variable which previously was used but
not longer being supported of the metrics sysbench test.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR updates the container name to put a random name instead
of using a hard coded name. This PR is a general improvement
to avoid random bug failures specially when we are running on
baremetal environments.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Test reports that it is a onednn test when it is openvino; update
description.
Fixes: #9948
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
This commit extends the vfio-ap hotplug test to include the use of `zcrypttest`.
A newly introduced test by the tool consists of several test rounds as follows:
- ioctl_test
- simple_test
- simple_one_thread_test
- simple_multi_threads_test
- multi_thread_stress_test
- hang_after_offline_online_test
A writable root filesystem is required for testing because the reference count
needs to be reset after each test round. The current containerd kata containers
support does not include `--privileged_without_host_devices`, which is necessary
to configure a writable filesystem along with `--privileged`. (Please check out
https://github.com/kata-containers/kata-containers/issues/9791 for details)
So `crictl` is chosen to extend the test.
The commit also includes the removal of old commands previously used for the
tests repository but no longer in use.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This commit copies an internal testing tool `zcrypttest` to the
test image. A base image is changed to `ubuntu:22.04` due to a
library dependency issue.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As part of archiving the tests repo, we are eliminating the dependency on
`clone_tests_repo()`. The scripts using the function is as follows:
- `ci/install_rust.sh`.
- `ci/setup.sh`
- `ci/lib.sh`
This commit removes or replaces the files, and makes an adjustment accordingly.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
A multi-arch image for `alpine-bash-curl` has been pushed to and available
at `quay.io/kata-containers`.
This commit switches the test image to `quay.io/kata-containers/alpine-bash-curl`.
Fixes: #9935
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Observed instability in the API server after deploying kata-deploy caused test failures.
(see: https://github.com/kata-containers/kata-containers/actions/runs/9681494440/job/26743286861)
Specifically, `kubectl_retry logs` failed before the API server could respond properly.
This commit increases the interval and max_tries for kubectl_retry(), allowing sufficient
time to handle this situation.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This PR increases the timeout to crictl calls on kata monitor
tests to avoid to hit issues every now and avoid random failures.
This PR is very similar to PR #7640.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This is the first part of adding a job to clean up potentially dangling
Azure resources. This will be based on Jeremi's tool from
https://github.com/jepio/kata-azure-automation.
At first, we'll only clean up AKS clusters, as this is what has been
causing us problems lately, but this could very well be extended to
cleaning up entire resource groups, which is why I left the different
names pretty generic (i.e. "resources" instead of "clusters").
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
- We only want to enable the sample verifier in the KBS for non-TEE
tests, so prevent an edge case where the TEE platform isn't set up
correctly and we might fall back to the sample and get false positives.
To prevent this we add guards around the sample policy enablement and
only run it for non confidential hardware
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add --layers-cache-file-path flag to allow the user to
specify where the cache file for the container layers
is saved. This allows e.g. to have one cache file
independent of the user's working directory.
Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
This PR fixes the variables names for the network that was created as well
removes the network that were created for the tests to ensure a clean environment
when running all the tests and avoid failures specially on baremental environments
that network already exists.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR improves the variable definition in memory inside
the container script for metrics. This change declares and assigns
the variables separately to avoid masking return values.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The following error was observed during the deployment of nydus snapshotter:
```
Error from server (NotFound):
the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v)
'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries
Error: Process completed with exit code 1.
```
This error can occur when a pod is re-created by a daemonset during the retry interval.
This commit addresses the issue by using `--selector` rather than the pod name
for `kubectl logs/describe`.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
The following tests are disabled because they fail (alike with dragonball):
- k8s-cpu-ns.bats
- k8s-number-cpus.bats
- k8s-sandbox-vcpus-allocation.bats
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
This PR uses the nodeport deployment from upstream trustee.
To ensure our deployment is as close to upstream trustee replace
the custom nodeport handling and replace it with nodeport
kustomized flavour from the trustee project.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR makes general improvements like definition of variables and
the use of them to improve the general setup script for kubernetes
tests.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR uses the function definition to have uniformity across
all the launch times script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Frequent errors have been observed during k8s e2e tests:
- The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
- Error from server (ServiceUnavailable): the server is currently unable to handle the request
- Error from server (NotFound): the server could not find the requested resource
These errors can be resolved by retrying the kubectl command.
This commit introduces a wrapper function in common.sh that runs kubectl up to 3 times
with a 5-second interval. Initially, this change only covers gha-run.sh for Kubernetes.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.
Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.
Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.
Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.
Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It seems I was very lose on disabling some of the tests, and the issues
I faced could be related to other instabilities in the CI.
Let's re-enable this one, following what was done for the SEV, SNP, and
coco-qemu-dev.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's stop skipping the CDH tests for TDX, as know we should have an
environmemnt where it can run and should pass. :-)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We must ensure we use the host ip to connect to the PCCS running on the
host side, instead of using localhost (which has a different meaning
from inside the KBS pod).
The reason we're using `hostname -i` isntead of the helper functions, is
because the helper functions need the coco-kbs deployed for them to
work, and what we do is before the deployment.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is generally useful debug output on test failures,
and specifically this has been useful for nydus-related
issues recently.
Signed-off-by: Chris Porter <porter@ibm.com>