The latest stable Kubernetes version advertised by dl.k8s.io may
temporarily have unresolvable package dependencies (e.g. missing
cri-tools or kubernetes-cni for the newest minor). This causes CI
failures during k8s deployment.
Refactor do_deploy_k8s to resolve the version once, perform a dry-run
apt-get install check, and if it fails, automatically fall back to the
previous minor version (e.g. v1.36 -> v1.35) before retrying.
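
A minimal sketch of the fallback step, for illustration only. The
helper names previous_minor and resolve_installable_version are
hypothetical and are not taken from do_deploy_k8s; the check command
stands in for whatever dry-run install check the script performs.

```shell
# Hypothetical helper: map "vMAJOR.MINOR[.PATCH]" to the previous minor,
# e.g. v1.36 -> v1.35. Fails when there is no previous minor.
previous_minor() {
    local version="${1#v}"
    local major="${version%%.*}"
    local rest="${version#*.}"
    local minor="${rest%%.*}"
    if [ "${minor}" -le 0 ]; then
        echo "no previous minor for v${major}.${minor}" >&2
        return 1
    fi
    echo "v${major}.$((minor - 1))"
}

# Hypothetical wrapper: print the candidate if the check passes,
# otherwise retry exactly once with the previous minor.
# $2 is any command that takes a version and returns 0 when the
# packages for that version are installable (assumed to print nothing).
resolve_installable_version() {
    local candidate="$1"
    local check_cmd="$2"
    if "${check_cmd}" "${candidate}"; then
        echo "${candidate}"
        return 0
    fi
    local fallback
    fallback="$(previous_minor "${candidate}")" || return 1
    "${check_cmd}" "${fallback}" || return 1
    echo "${fallback}"
}
```

Passing the check as a command keeps the version arithmetic testable
without touching the package manager.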
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The ci-coco-stability.yaml workflow has its weekly schedule
commented out with a note that the workload is not maintained.
Remove the entire chain: ci-coco-stability.yaml, ci-weekly.yaml,
run-kata-coco-stability-tests.yaml, and the kubernetes stability
test scripts that were only used through this path.
The local containerd stability tests (tests/stability/gha-run.sh)
remain as they are actively used by basic-ci workflows.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This manifest is not referenced by any .bats test file and
is effectively dead code.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The tests/integration/stdio/ directory has a gha-run.sh script
but no workflow in .github/workflows/ references it, so these
tests never run in CI.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The run-tracing job in basic-ci-amd64.yaml has been disabled
(if: false) due to issue #9763, with no path to re-enablement.
Remove the job definition and the backing
tests/functional/tracing/ directory.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The run-vfio job in basic-ci-amd64.yaml has been disabled
(if: false) due to issues #9764, #9851, and #9940, with no
path to re-enablement. Remove the job definition and the
backing tests/functional/vfio/ directory.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The run-metrics.yaml workflow is a reusable workflow_call with no
caller in the repository, making it effectively dead code. Remove
the workflow, the entire tests/metrics/ directory (~586 files
including vendored Go for checkmetrics), and the "metrics"
self-hosted runner label from actionlint.yaml.
Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
"cloud-hypervisor" is also a runtime-rs hypervisor, so include it in the settings selection logic.
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
`k8s-confidential.bats` technically doesn't need attestation, but it only runs
on TEE hardware, so include it in the attestation list so we can test it in PRs.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The skip conditional is wrong, but it's not needed, as the setup
and teardown only allow confidential hardware anyway.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Remove --tail=N limits from `kubectl logs` for kata-deploy pods so
the complete output is visible in CI job logs for debugging.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
During tests, the following error shows up:
```
..k8s-kill-all-process-in-container.bats: line 40: [: too many arguments
```
This commit addresses the issue as follows:
(1) Update the process query command to "ps aux || ps" to ensure
compatibility across different container images while maximizing
process visibility.
(2) Use "[t]ail" in grep to reliably match the process without
self-matching.
(3) Quote the variable in the assertion to resolve the "too many
arguments" bash error.
(4) Improve test reliability by ensuring the process list is actually
visible to the verification logic.
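
Points (2) and (3) can be illustrated with a small standalone sketch.
The process listings below are made-up sample data standing in for the
output of "ps aux || ps"; they are not taken from the test itself.

```shell
# Hypothetical listings, including the grep process doing the search.
listing_naive='root   1  tail -f /dev/null
root  12  grep tail'
listing_bracket='root   1  tail -f /dev/null
root  12  grep [t]ail'

# A plain "tail" pattern also matches the grep command line itself.
naive_count="$(printf '%s\n' "${listing_naive}" | grep -c 'tail')"

# "[t]ail" still matches the text "tail", but the literal string
# "[t]ail" on the grep command line does not contain "tail".
bracket_count="$(printf '%s\n' "${listing_bracket}" | grep -c '[t]ail')"

# The matched line contains several words. Unquoted, it would expand
# to multiple arguments and make "[" fail with "too many arguments".
matched="$(printf '%s\n' "${listing_bracket}" | grep '[t]ail')"
if [ -n "${matched}" ]; then
    echo "process still running"
fi
```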
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The 'no-layer-image' test case was failing because the underlying shim
returned an "unsupported rootfs mounts count" error instead of the
expected application-level "file not found" or "ENOENT" error.
This change updates the BATS test to accept the shim-level rootfs
validation error as a valid failure condition for this unsupported
image scenario, ensuring the CI remains green while reflecting
current runtime behavior.
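
A rough sketch of such a check, assuming the failure message is
available as a string. The helper name is_expected_no_layer_failure is
hypothetical, and the actual assertion in the BATS test may differ.

```shell
# Hypothetical helper: succeed when the message contains any of the
# accepted failure strings, application-level or shim-level.
is_expected_no_layer_failure() {
    local message="$1"
    printf '%s\n' "${message}" |
        grep -qE 'file not found|ENOENT|unsupported rootfs mounts count'
}
```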
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Document the end-to-end workflow for using the containerd EROFS
snapshotter with Kata Containers runtime-rs, covering containerd
configuration, Kata QEMU settings, and pod deployment examples
via crictl/ctr/Kubernetes.
Include prerequisites (containerd >= 2.2, runtime-rs main branch),
QEMU VMDK format verification command, architecture diagram,
VMDK descriptor format reference, and troubleshooting guide.
Note that Cloud Hypervisor, Firecracker, and Dragonball do not
support VMDK block devices and are currently unsupported for
fsmerged EROFS rootfs.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The image used has some special (as in weird) properties that are being
taken advantage of to implement policy-related tests.
Changing the image is a no-go at this point, otherwise we break the
tests ... so let's just skip those for now.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add qemu-coco-dev-runtime-rs to the arm64 k8s test matrix so that the
CoCo non-TEE configuration is exercised on aarch64 runners.
Also enable auto-generated policy for qemu-coco-dev on aarch64 (matching
the existing x86_64 behavior) and register the new job as a required
gatekeeper check.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
Add aarch64/arm64 to the list of supported architectures for
qemu-coco-dev and qemu-coco-dev-runtime-rs shims across kata-deploy
configuration, Helm chart values, and test helper scripts.
Note that guest-components and the related build dependencies are not
yet wired for arm64 in these configurations; those will be addressed
separately.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
This test uses YAML files from a different directory than the other
k8s CI tests, so annotations have to be added into these separate
files.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>