kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-18 13:46:06 +00:00

Author	SHA1	Message	Date
Alex Lyn	0f04363ea8	tests: Disable CPU elasticity tests for non-TEE scenarios This commit updates the non-TEE tests to disable two specific test cases: `k8s-number-cpus.bats` and `k8s-sandbox-vcpus-allocation.bats`. These tests are designed to cover CPU elasticity/dynamic scaling capabilities. In the non-TEE scenario, we are enforcing the disabling of this capability by setting the default configuration to `static_sandbox_resource_mgmt=true`. Although the tests currently pass, allowing them to run is logically inconsistent with the intended non-TEE configuration. Therefore, we are disabling them for all non-TEE runtimes, specifically targeting: - `qemu-coco-dev` - `qemu-coco-dev-runtime-rs` This change ensures that our non-TEE CI accurately reflects the static resource management policy and prevents misleading test results. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	beaf44dd2e	tests: disable block volume test for s390 arch As runtime-rs doesn't support block device hotplug on the s390 arch, we skip the test when running on s390. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	28371dbec5	tests: Enable cloud-hypervisor and qemu-runtime-rs within the CI Enable the cpu hotplug tests within the k8s-number-cpus.bats for both cloud-hypervisor and qemu-runtime-rs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	82a72b4564	tests: Enable cpu hotplug for dragonball and clh in vcpus allocation We now support cpu hotplug in dragonball and clh; this commit enables the test within the CI. Fixes: #8660 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	6196d3d646	tests: Enable cpu hotplug tests in k8s-cpu-ns.bats The case was skipped because of a previous failure; now that cpu hotplug has been corrected, it's time to re-enable it. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Alex Lyn	96bd13e85d	tests: Add support for qemu-runtime-rs We now support the virtio-scsi driver, so the CI should be enabled. Fixes: #10373 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-12-10 22:11:56 +01:00
Fabiano Fidêncio	69a0ac979c	tests: Adjust install_bats() The function assumes that the runner is an Ubuntu machine, which so far has been true as part of our CI. However, the new ARM runner is running on Debian, and those mirror additions would simply break. With this in mind, for any distro that's not Ubuntu, let's just make sure to inform the owner of the system to have bats already installed as part of the environment provided. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-10 12:05:04 +01:00
Fabiano Fidêncio	406f6b1d15	Revert "tests: Add workaround to override CDI files" This reverts commit `5a81b010f2`, as we now have all the infrastructure properly set up as part of our CI node. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-09 23:18:11 +01:00
Fabiano Fidêncio	71f78cc87e	tests: cc: gpu: Lower the amount of memory required by the pods We've made the pods require a ridiculous amount of memory, just for the sake of getting them running. Now that those are running, tests are passing, CI is required, let's work to lower the amount of memory needed as everything else is working as expected. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-06 00:16:43 +01:00
Dan Mihai	965ad10cf2	tests: k8s: tests_common.sh local modification Clean-up shellcheck warnings: SC2030 (info): Modification of cmd_out is local (to subshell caused by (..) group). SC2031 (info): cmd_out was modified in a subshell. That change might be lost. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-12-06 00:16:23 +01:00
Dan Mihai	8199171cc4	tests: k8s: tests_common.sh braces around variables Clean-up shellcheck warnings: SC2250 (style): Prefer putting braces around variable references even when not strictly required. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-12-06 00:16:23 +01:00
Fabiano Fidêncio	5a81b010f2	tests: Add workaround to override CDI files Let's add a simple backup and restore logic for the CDI configuration file nvidia.com-pgpu.yaml in the k8s-nvidia-*.bats and k8s-confidential-attestation.bats test files. Although not optimal, this is a temporary workaround needed until NVIDIA releases what's needed for the GPU Operator to properly deal with cold plugged devices for the Confidential Containers cases, which is work in progress right now. After that's released, we can revert/drop this patch. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-12-05 18:58:35 +01:00
Zvonko Kaiser	147e9f188e	Merge pull request #12080 from manuelh-dev/mahuber/cc-gpu-ci-attestation tests: nvidia: cc: Add attestation test	2025-12-05 09:31:57 -05:00
Steve Horsman	2f1b98c232	Merge pull request #12197 from stevenhorsman/logrus-1.9.3-bump version: Bump sirupsen/logrus	2025-12-05 14:18:50 +00:00
Manuel Huber	e5861cde20	tests: use Authorization when GH_TOKEN is set Same as for other uses of GH_TOKEN, use it when set in order to avoid rate limiting issues. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 14:08:43 +01:00
stevenhorsman	9eba559bd6	version: Bump sirupsen/logrus Bump the github.com/sirupsen/logrus version to 1.9.3 across our components where it is back-level to bring us up-to-date and resolve high severity CVE-2025-65637 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-05 11:12:04 +00:00
Manuel Huber	34efa83afc	tests: nvidia: cc: Add attestation test Add the attestation bats test case to the NVIDIA CI and provide a second pod manifest for the attestation test with a GPU. This will enable composite attestation in a subsequent step. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	116a72ad0d	tests: cc: Fix command evaluation This brings two fixes: - use the test_key variable to check against the aatest value. - properly check the run command invocation (run w/o bash does not seem to like the pipe, which leads to ALWAYS evaluating the status result to 1). With this, the deny-all test would ALWAYS succeed regardless of whether aatest was actually returned or not. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	23675c784b	tests: cc: Reset default policy When running these tests repeatedly locally, the default policy is not being reset after the test completes, then subsequent runs fail. Similar to k8s-sealed-secrets.bats, we set the default policy in an if condition. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	f70c3adaf1	tests: cc: Add kbs_set_gpu0_resource_policy This allows setting a GPU0 resource policy, enabling GPU attestation tests to not use the default resource policy. For now, the policy requires attestation's ear status to not be contraindicated. In a future change we will require this to be affirming once our CI runners' vBIOS version is properly configured. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	c2d1e2dcc9	tests: cc: Add is_confidential_gpu_hardware This enables attestation tests to figure out whether composite attestation with a GPU can be executed. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
Manuel Huber	53e94df203	tests: nvidia: cc: add SUPPORTED_TEE_HYPERVISORS Add the NVIDIA TEE hypervisors. With this, attestation tests can be run against the NVIDIA handlers, for instance. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-12-05 11:48:55 +01:00
stevenhorsman	403de2161f	version: Update golang to 1.24.11 Needed to fix: ``` Vulnerability #1: GO-2025-4155 Excessive resource consumption when printing error string for host certificate validation in crypto/x509 More info: https://pkg.go.dev/vuln/GO-2025-4155 Standard library Found in: crypto/x509@go1.24.9 Fixed in: crypto/x509@go1.24.11 Vulnerable symbols found: #1: x509.HostnameError.Error ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-04 22:50:07 +01:00
Hyounggyu Choi	1dd3426adc	tests: Extend vfio-ap test for runtime-rs vfio-ap passthrough has been introduced for runtime-rs, requiring that the existing test verify this new functionality. This commit adds: - containerd config specific to runtime-rs - extensions to the existing test functions to cover vfio-ap Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
Hyounggyu Choi	aa326fb9b8	tests: Remove usage of crictl for vfio-ap `crictl` is not used any more after #10767. Let's clean up all places where the tool is used. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-12-04 15:05:23 +01:00
stevenhorsman	79a75b63bf	tests: Switch nginx test image ref to digest As tags are mutable and digests are not, let's pin our image by digest to give our CI a better chance of stability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-02 13:02:50 +00:00
stevenhorsman	5c618dc8e2	tests: Switch nginx images to use version.yaml details - Swap out the hard-coded nginx registry and versions for reading the test image details from version.yaml, which can also ensure that the quay.io mirror is used rather than the docker hub versions, which can hit pull limits - Try setting imagePullPolicy Always to fix issues with the arm CI Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-02 10:04:09 +01:00
Manuel Huber	5a5c43429e	ci: nvidia: remove kubectl_retry calls When tests regress, the CI wait time can increase significantly with the current kubectl_retry attempt logic. Thus, align with other tests and remove kubectl_retry invocations. Instead, rely on proper timeouts. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-28 19:00:57 +01:00
Alex Lyn	4e450691f4	tests: Unify nydus configuration to containerd v3 schema Containerd configuration syntax (`config.toml`) varies across versions, requiring per-version logic for fields like `runtime`. However, testing confirms that containerd LTS (1.7.x) and newer versions fully support the v3 schema for the nydus remote snapshotter. This commit changes the previous containerd v1 settings in `config.toml`. Instead, it introduces a unified v3-style configuration for nydus, which is valid for both LTS and active containerd versions. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-26 17:58:16 +08:00
Alex Lyn	ebe084e093	Merge pull request #12122 from fidencio/topic/configs-do-no-have-commented-out-options runtimes: config: Do NOT have commented fields	2025-11-26 10:33:32 +08:00
Fabiano Fidêncio	e859537c74	runtimes: config: Do NOT have commented fields In order to have a better way to set things up using a toml editor, we should take the containerd approach and actually have everything uncommented. This will help us to unify how we deal with such values in the future from the kata-deploy POV. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-25 19:26:56 +01:00
Manuel Huber	331515e1b8	ci: enable security policy for openvpn test With issue 11777 being resolved, this commit enables openvpn policy testing. The remaining work on the security policy required to successfully run this test case was to enable UDP ports for Service kinds and to use the mount path's last component instead of the volume name to construct the expected storage source path. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-23 17:23:43 +00:00
Manuel Huber	dfc229f51e	tests: nvidia: cc: Remove nvrc.smi.srs=1 parameter Remove the nvrc.smi.srs=1 parameter from the kernel command line. In CC use cases, the attestation agent is expected to set the GPU ready state. For the CUDA vectorAdd case where attestation agent is not being used, we set the ready state by adding the kernel command line parameter through an annotation. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:35:05 +01:00
Manuel Huber	6c6fc50aa5	tests: nvidia: cc: allow-all policy and init-data Add an allow-all policy for the CC GPU tests and ensure the init-data device is being created (hypervisor annotations). Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	7e20118c8e	tests: nvidia: move secret definitions to bottom The add_allow_all_policy_to_yaml in tests_common.sh needs some improvements so that this function can support pod manifests with different resource kinds. For now, moving the Secret definition to the bottom so that we can create a default policy for the Pod. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	ffd5443637	tests: nvidia: adapt is_aks_cluster The qemu-nvidia-gpu handlers should not cause is_aks_cluster to return 1. Otherwise, CI logic will assume these hypervisors run on AKS hosts, see the following message in CI w/o this change: INFO: Adapting common policy settings for AKS Hosts Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Manuel Huber	f2bdd12e5e	tests: nvidia: Check KATA_HYPERVISOR var Fail explicitly when a wrong KATA_HYPERVISOR variable is provided. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-21 09:24:15 +01:00
Fabiano Fidêncio	6b40b59861	tests: Reduce KBS deployment check flakiness We currently start a pod that does a `wget` to the KBS address, and fails after 5 seconds. By the time it fails and reports back, we can see that KBS is actually running, but the workflow failed as the checker failed. :-/ Let's give it more time for the KBS to show up, and the flakiness should go away. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-20 19:29:26 +01:00
Fabiano Fidêncio	35672ec5ee	tests: cc: Test authenticated images with force guest pull As this should simply work. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-20 19:02:15 +01:00
Manuel Huber	477ca3980b	tests: nvidia: cc: Re-enable multi GPU test case Use the pod name variable so that kubectl wait finds the pod. Currently, kubectl waits for nvidia-nim-llama-3-2-nv-embedqa-1b-v2, not for nvidia-nim-llama-3-2-nv-embedqa-1b-v2-tee Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-20 10:05:46 +01:00
Fabiano Fidêncio	ae463642ed	tests: k8s: Fix typo in authenticated tests The person who introduced the check, someone named Fabiano Fidêncio, forgot a `$` in a variable assignment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-19 11:59:59 +01:00
Alex Lyn	1da225efc5	tests: Enable AUTO_GENERATE_POLICY for qemu-coco-dev-runtime-rs Enable auto-generate policy on cbl-mariner Hosts for qemu-coco-dev-runtime-rs if the user didn't specify an AUTO_GENERATE_POLICY value. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-19 10:44:03 +08:00
Fabiano Fidêncio	8c02b5b913	tests: nvidia: cc: Temporarily skip multi GPU for nim tests We will re-enable this one later on once the changes to properly cold plug multi GPUs are merged. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	94ed4051b0	tests: nvidia: cc: Increase RAM for NIM pods Those need to pull the models inside the guest, and the guest has 50% of its memory "allowed" to be used as tmpfs, so, we gotta use the RAM that we have. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	e5062a056e	tests: nvidia: cc: Adjust timeouts on NIM pods Timeout increases for confidential computing slowness: * livenessProbe: * initialDelaySeconds: 15 → 120 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 3 → 10 * readinessProbe: * initialDelaySeconds: 15 → 120 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 3 → 10 * startupProbe: * initialDelaySeconds: 40 → 180 seconds * timeoutSeconds: 1 → 10 seconds * failureThreshold: 180 → 300 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	6be43b2308	tests: nvidia: Retry kubectl commands As with CoCo some of the commands may take longer, way longer than expected. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	bb5bf6b864	tests: nvidia: nims: Use the current auths format for KBS We cannot use the same format used for docker, as it includes username and password, while what's expected when using Trustee does not. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	92da54c088	tests: nvidia: cc: Enable NIM tests Now that we've bumped Trustee to a version that supports the NVIDIA remote verifier, let's re-enable the tests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 22:29:42 +01:00
Fabiano Fidêncio	8eca0814bd	tests: Run authenticated tests with experimental_force_guest_pull As it should be supported. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-18 14:46:48 +01:00
Fabiano Fidêncio	75996945aa	kata-deploy: try-kata-values.yaml -> values.yaml This makes the user experience better, as the admin can deploy Kata Containers without having to download / set up any additional file. Of course, if the admin wants something more specific, examples are provided. Tests and documentation are updated to reflect this change. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-17 12:16:17 +01:00


2085 Commits