kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-18 10:44:10 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	aa3a795cd4	tests: k8s: coco: rely more on free runners Run all CoCo non-TEE variants in a single job on the free runner with an explicit environment matrix (vmm, snapshotter, pull_type, kbs, containerd_version). Here we're testing CoCo only with the "active" version of containerd. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-19 21:21:57 +01:00
Fabiano Fidêncio	50fab83173	tests: k8s: rely more on free runners We were running most of the k8s integration tests on AKS. The ones that don't actually depend on AKS's environment now run on normal ubuntu-24.04 GitHub runners instead: we bring up a kubeadm cluster there, test with both containerd lts and active, and skip attestation tests since those runtimes don't need them. AKS is left only for the jobs that do depend on it. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-19 20:44:41 +01:00
Dan Mihai	ea53779b90	ci: k8s: temporarily disable mariner host Disable mariner host testing in CI, and auto-generated policy testing for the temporary replacements of these hosts (based on ubuntu), to work around missing: 1. cloud-hypervisor/cloud-hypervisor@0a5e79a, that will allow Kata in the future to disable the nested property of guest VPs. Nested is enabled by default and doesn't work yet with mariner's MSHV. 2. cloud-hypervisor/cloud-hypervisor@bf6f0f8, exposed by the large ttrpc replies intentionally produced by the Kata CI Policy tests. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-19 20:42:50 +01:00
Dan Mihai	3e2153bbae	ci: k8s: easier to modify az aks create command Make `az aks create` command easier to change when needed, by moving the arguments specific to mariner nodes onto a separate line of this script. This change also removes the need for `shellcheck disable=SC2046` here. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-19 20:42:50 +01:00
Steve Horsman	6a67250397	Merge commit from fork runtime-go/rs: Disable virtio-pmem for Cloud Hypervisor	2026-02-19 09:00:56 +00:00
Chiranjeevi Uddanti	88203cbf8d	tests: Add regression test for sandbox_cgroup_only=false Add unit test for get_ch_vcpu_tids() and integration test that creates a pod with sandbox_cgroup_only=false to verify it starts successfully. Signed-off-by: Chiranjeevi Uddanti <244287281+chiranjeevi-max@users.noreply.github.com> Co-authored-by: Antigravity <antigravityagent@google.com>	2026-02-18 20:20:14 +01:00
Aurélien Bombo	8ff9cd1f12	Merge pull request #12455 from ajaypvictor/secret-cm-without-sharedfs ci: Add integration tests for secret & configmap propagation	2026-02-18 12:06:48 -06:00
Aurélien Bombo	336b922d4f	tests/cbl-mariner: Stop disabling NVDIMM explicitly This is not needed anymore since now disable_image_nvdimm=true for Cloud Hypervisor. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-02-18 11:52:51 -06:00
Dan Mihai	eee25095b5	tests: mariner annotations for k8s-openvpn This test uses YAML files from a different directory than the other k8s CI tests, so annotations have to be added into these separate files. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-18 07:17:04 +01:00
Fabiano Fidêncio	80a175d09b	kata-deploy: Add TEE nodeSelectors for TEE shims when NFD is detected When NFD is detected (deployed by the chart or existing in the cluster), apply shim-specific nodeSelectors only for TEE runtime classes (snp, tdx, and se). Non-TEE shims keep existing behavior (e.g. runtimeClass.nodeSelector for nvidia GPU from `f3bba0885` is unchanged). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-16 12:07:51 +01:00
Ajay Victor	83935e005c	ci: Add integration tests for secret & configmap propagation Enhance k8s-configmap.bats and k8s-credentials-secrets.bats to test that ConfigMap and Secret updates propagate to volume-mounted pods. - Enhanced k8s-configmap.bats to test ConfigMap propagation * Added volume mount test for ConfigMap consumption * Added verification that ConfigMap updates propagate to volume-mounted pods - Enhanced k8s-credentials-secrets.bats to test Secret propagation * Added verification that Secret updates propagate to volume-mounted pods Fixes #8015 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2026-02-14 08:56:21 +05:30
Fabiano Fidêncio	2930c68c0b	ci: tdx: properly skip k8s-sandbox-vcpus-allocation.bats This is a follow-up for `25962e9325` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-13 20:56:08 +01:00
Fabiano Fidêncio	8cb7d0be9d	tests: nvidia: Fix genpolicy error when pulling nvcr.io images genpolicy pulls image manifests from nvcr.io to generate policy and was failing with 'UnauthorizedError' because it had no registry credentials. Genpolicy (src/tools/genpolicy) uses docker_credential::get_credential() in registry.rs, which reads from DOCKER_CONFIG/config.json. Add setup_genpolicy_registry_auth() to create a Docker config with nvcr.io auth (NGC_API_KEY) and set DOCKER_CONFIG before running genpolicy so it can authenticate when pulling manifests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-13 13:12:55 +01:00
Fabiano Fidêncio	6a3bbb1856	tests: Retry k8s deployment We've seen a lot of spurious issues when deploying the infra needed for the tests. Let's give it a few tries before actually failing. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-12 20:13:59 +01:00
Mikko Ylinen	25962e9325	tests/coco: disable k8s-sandbox-vcpus-allocation.bats for TDX After the move to Linux 6.17 and QEMU 10.2 from Kata, k8s-sandbox-vcpus-allocation.bats started failing on TDX. 2026-02-10T16:39:39.1305813Z # pod/vcpus-less-than-one-with-no-limits created 2026-02-10T16:39:39.1306474Z # pod/vcpus-less-than-one-with-limits created 2026-02-10T16:39:39.1307090Z # pod/vcpus-more-than-one-with-limits created 2026-02-10T16:39:39.1307672Z # pod/vcpus-less-than-one-with-limits condition met 2026-02-10T16:39:39.1308373Z # timed out waiting for the condition on pods/vcpus-less-than-one-with-no-limits 2026-02-10T16:39:39.1309132Z # timed out waiting for the condition on pods/vcpus-more-than-one-with-limits 2026-02-10T16:39:39.1310370Z # Error from server (BadRequest): container "vcpus-less-than-one-with-no-limits" in pod "vcpus-less-than-one-with-no-limits" is waiting to start: ContainerCreating A manual test without agent policies added it seems to work OK but disable the test for now to get CI stable. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-02-11 22:02:59 +01:00
Hyounggyu Choi	c84e37f6ac	Merge pull request #12486 from BbolroC/cpu-hotplug-s390x-runtime-rs runtime-rs: Skip sockets and threads for hotplug_vcpus on Z/P	2026-02-11 09:40:21 +01:00
Hyounggyu Choi	67f54bdcb5	tests: Remove skip condition for runtime-rs on s390x in k8s-cpu-ns This commit removes the skip condition for qemu-runtime-rs on s390x in k8s-cpu-ns.bats. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-02-11 05:52:13 +01:00
stevenhorsman	15d6a681ed	doc: Fix spelling issues Put things in backticks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-10 21:58:28 +01:00
Fabiano Fidêncio	5c0269881e	tests: Make editorconfig-checker happy - Trim trailing whitespace and ensure final newline in non-vendor files - Add .editorconfig-checker.json excluding vendor dirs, .patch, .img, .dtb, .drawio, *.svg, and pkg/cloud-hypervisor/client so CI only checks project code - Leave generated and binary assets unchanged (excluded from checker) Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-10 21:58:28 +01:00
Fabiano Fidêncio	cb652e0da1	tests: Update NVRC trace to use drop-in config mechanism Update the enable_nvrc_trace() function to use the new drop-in configuration mechanism instead of directly modifying the base configuration file. The function now creates a 90-nvrc-trace.toml drop-in file that properly combines existing kernel parameters with the nvrc.log=trace setting. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-10 18:12:17 +01:00
Manuel Huber	a6ca5c6628	ci: add editorconfig checker This adds a basic configuration for editorconfig checker. The supplied configuration checks against trailing whitespaces and issues with newlines. Example: \| tools/packaging/kernel/configs/fragments/x86_64/numa.conf: \| Wrong line endings or no final newline \| tools/packaging/release/generate_vendor.sh: \| 44: Trailing whitespace Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-09 15:03:26 -08:00
Manuel Huber	525192832f	tests: Clean up superfluous GPU annotation This annotation was required for GPU cold-plug before using a newer device plugin and before querying the pod resources API. As this annotation is no longer required, cleaning it up. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-09 11:28:24 -08:00
Alex Lyn	3fda59e27d	tests: rename pod_exec_with_retries to pod_exec and update callers It will do following works in this commit: (1) Rename pod_exec_with_retries() to pod_exec(). (2) Update implementation to call container_exec(). (3) Replace all usages of pod_exec_with_retries across tests with pod_exec. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Alex Lyn	861d39305c	tests: drop kubectl exec retries in container_exec This commit aims to drop retries when kubectl exec a container: (1) Rename container_exec_with_retries() to container_exec(). (2) Remove the retry loop and sleep backoff around kubectl exec. Keep the same logging and container-selection logic and return kubectl exec exit status directly. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
stevenhorsman	b29312289f	versions: Bump go to 1.24.13 Bump go to 1.24.13 to fix CVE GO-2026-4337 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 14:49:31 +01:00
Manuel Huber	cf7f340b39	tests: Read and overwrite kernel_verity_parameters Read the kernel_verity_paramers from the shim config and adjust the root hash for the negative test. Further, improve some of the test logic by using shared functions. This especially ensures we don't read the full journalctl logs on a node but only the portion of the logs we are actually supposed to look at. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-05 23:04:35 +01:00
Manuel Huber	e120dd4cc6	tests: cc: Remove quotes from kernel command line With dm-mod.create parameters using quotes, we remove the backslashes used to escape these quotes from the output we retrieve. This will enable attestation tests to work with the kernelinit dm-verity mode. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-05 23:04:35 +01:00
Manuel Huber	282014000f	tests: cc: support initrd, image for attestation Allow using an image instead of an initrd. For confidential guests using images, the assumption is that the guest kernel uses dm-verity protection, implicitly measuring the rootfs image via the kernel command line's dm-verity information. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-05 23:04:35 +01:00
Fabiano Fidêncio	dda1b30c34	tests: nvidia-nim: Use sealed secrets for NGC_API_KEY Convert the NGC_API_KEY from a regular Kubernetes secret to a sealed secret for the CC GPU tests. This ensures the API key is only accessible within the confidential enclave after successful attestation. The sealed secret uses the "vault" type which points to a resource stored in the Key Broker Service (KBS). The Confidential Data Hub (CDH) inside the guest will unseal this secret by fetching it from KBS after attestation. The initdata file is created AFTER create_tmp_policy_settings_dir() copies the empty default file, and BEFORE auto_generate_policy() runs. This allows genpolicy to add the generated policy.rego to our custom CDH configuration. The sealed secret format follows the CoCo specification: sealed.<JWS header>.<JWS payload>.<signature> Where the payload contains: - version: "0.1.0" - type: "vault" (pointer to KBS resource) - provider: "kbs" - resource_uri: KBS path to the actual secret Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-04 12:34:44 +01:00
Fabiano Fidêncio	c9061f9e36	tests: kata-deploy: Increase post-deployment wait time Increase the sleep time after kata-deploy deployment from 10s to 60s to give more time for runtimes to be configured. This helps avoid race conditions on slower K8s distributions like k3s where the RuntimeClass may not be immediately available after the DaemonSet rollout completes. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-04 12:13:53 +01:00
Fabiano Fidêncio	0fb2c500fd	tests: kata-deploy: Merge E2E tests to avoid timing issues Merge the two E2E tests ("Custom RuntimeClass exists with correct properties" and "Custom runtime can run a pod") into a single test, as those 2 are very much dependent of each other. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-04 12:13:53 +01:00
Fabiano Fidêncio	fef93f1e08	tests: kata-deploy: Use die() instead of fail() for error handling Replace fail() calls with die() which is already provided by common.bash. The fail() function doesn't exist in the test infrastructure, causing "command not found" errors when tests fail. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-04 12:13:53 +01:00
Dan Mihai	d7ff54769c	tests: policy: remove the need for using sudo Modify the copy of root user's settings file, instead of modifying the original file. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-01 20:09:50 +01:00
Dan Mihai	4d860dcaf5	tests: policy: avoid redundant debug output Avoid redundant and confusing teardown_common() debug output for k8s-policy-pod.bats and k8s-policy-pvc.bats. The Policy tests skip the Message field when printing information about their pods, because unfortunately that field might contain a truncated Policy log - for the test cases that intentiocally cause Policy failures. The non-truncated Policy log is already available from other "kubectl describe" fields. So, avoid the redundant pod information from teardown_common(), that also included the confusing Message field. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-01 20:09:50 +01:00
Steve Horsman	4d1095e653	Merge pull request #12350 from manuelh-dev/mahuber/term-grace-period tests: Remove terminationGracePeriod in manifests	2026-01-29 15:17:17 +00:00
Fabiano Fidêncio	500146bfee	versions: Bump Go to 1.24.12 Update Go from 1.24.11 to 1.24.12 to address security vulnerabilities in the standard library: - GO-2026-4342: Excessive CPU consumption in archive/zip - GO-2026-4341: Memory exhaustion in net/url query parsing - GO-2026-4340: TLS handshake encryption level issue in crypto/tls Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-29 00:23:26 +01:00
Dan Mihai	20ca4d2d79	runtime: DEFDISABLEBLOCK := true 1. Add disable_block_device_use to CLH settings file, for parity with the already existing QEMU settings. 2. Set DEFDISABLEBLOCK := true by default for both QEMU and CLH. After this change, Kata Guests will use by default virtio-fs to access container rootfs directories from their Hosts. Hosts that were designed to use Host block devices attached to the Guests can re-enable these rootfs block devices by changing the value of disable_block_device_use back to false in their settings files. 3. Add test using container image without any rootfs layers. Depending on the container runtime and image snapshotter being used, the empty container rootfs image might get stored on a host block device that cannot be safely hotplugged to a guest VM, because the host is using the same block device. 4. Add block device hotplug safety warning into the Kata Shim configuration files. Signed-off-by: Dan Mihai <dmihai@microsoft.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Cameron McDermott <cameron@northflank.com>	2026-01-28 19:47:49 +01:00
Fabiano Fidêncio	d0fe60e784	tests: Fix empty string handling for helm Fix empty string handling in format conversion When HELM_ALLOWED_HYPERVISOR_ANNOTATIONS, HELM_AGENT_HTTPS_PROXY, or HELM_AGENT_NO_PROXY are empty, the pattern matching condition `!= :` or `!= =` evaluates to true, causing the conversion loop to create invalid entries like "qemu-tdx: qemu-snp:". Add -n checks to ensure conversion only runs when variables are non-empty. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-26 20:50:01 +01:00
Fabiano Fidêncio	4b2d4e96ae	tests: Add qemu-{tdx,snp}-runtime-rs to the list of tee shims We missed doing this as part of `b5a986eacf`. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-26 20:50:01 +01:00
Fabiano Fidêncio	26c534d610	tests: Use shims.disableAll in test helpers Update the CI and functional test helpers to use the new shims.disableAll option instead of iterating over every shim to disable them individually. Also adds helm repo for node-feature-discovery before building dependencies to fix CI failures on some distributions. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-26 20:50:01 +01:00
Fabiano Fidêncio	d8a3272f85	kata-deploy: Add tests for custom runtimes Helm templates Add Bats tests to verify the custom runtimes Helm template rendering, and that the we can start a pod with the custom runtime. Tests were written with Cursor's help. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-26 20:50:01 +01:00
Manuel Huber	6438fe7f2d	tests: Remove terminationGracePeriod in manifests Do not kill containers immediately, instead use Kubernetes' default termination grace period. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-23 16:18:44 -08:00
Manuel Huber	0d35b36652	Revert "ci: Ensure the KBS resources are created" This reverts commit `c0d7222194`. Soon, guest components will switch to using a DB instead of storing resources in the filesystem. Further, I don't see any more indicators why kbs-client would struggle to set simple resources. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-23 16:18:10 -08:00
Fabiano Fidêncio	5b82b160e2	runtime-rs: Add arm64 QEMU support Add the necessary configuration and code changes to support QEMU on arm64 architecture in runtime-rs. Changes: - Set MACHINETYPE to "virt" for arm64 - Add machine accelerators "usb=off,gic-version=host" required for proper arm64 virtualization - Add arm64-specific kernel parameter "iommu.passthrough=0" - Guard vIOMMU (Intel IOMMU) to skip on arm64 since it's not supported These changes align runtime-rs with the Go runtime's arm64 QEMU support. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2026-01-23 19:48:31 +01:00
Fabiano Fidêncio	ec18dd79ba	tests: Simplify kata-deploy test to use helm directly The kata-deploy test was using helm_helper which made it hard to debug failures (die() calls would cause "Executed 0 tests" errors) and added unnecessary complexity. The test now calls helm directly like a user would, making it simpler and more representative of real-world usage. The verification job status is explicitly checked with proper failure detection instead of relying on helm --wait. Timeouts are configurable via environment variables to account for different network speeds and image sizes: - KATA_DEPLOY_TIMEOUT (default: 600s) - KATA_DEPLOY_DAEMONSET_TIMEOUT (default: 300s) - KATA_DEPLOY_VERIFICATION_TIMEOUT (default: 120s) Documentation has been added to explain what each timeout controls and how to customize them. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-21 20:14:33 +01:00
Fabiano Fidêncio	2369cf585d	tests: Fix retry loop bugs in helm_helper The retry loop in helm_helper had two bugs: 1. Counter initialized to 10 instead of 0, causing immediate failure 2. Exit condition used -eq instead of -ge, incorrect for loop logic These bugs would cause helm_helper to fail immediately on the first retry attempt instead of properly retrying up to max_tries times. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-21 20:14:33 +01:00
Fabiano Fidêncio	e0158869b1	tests: Add common bats test runner function Add run_bats_tests() function to common.bash that provides consistent test execution and reporting across all test suites (k8s, nvidia, kata-deploy). This removes duplicated test runner code from run_kubernetes_tests.sh, run_kubernetes_nv_tests.sh, and run-kata-deploy-tests.sh. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-20 12:31:55 +01:00
Fabiano Fidêncio	b5a986eacf	kata-deploy: Add runtime-rs TDX / SNP runtimeclasses https://github.com/kata-containers/kata-containers/pull/11534 has been merged and it added all the needed bits to deploy the QEMU SNP / TDX runtime-rs variants, apart from the kata-deploy additions, which is done by this PR. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-19 22:41:50 +01:00
Fabiano Fidêncio	c7570427d2	tests: Add report generation to NVIDIA tests The NVIDIA GPU test runner script was not generating test reports, causing the report_tests() function in gha-run.sh to have nothing to display. This aligns the script with run_kubernetes_tests.sh by: - Adding set -o pipefail for proper pipeline error handling - Creating a reports directory with timestamped subdirectory - Capturing test output to files with ok-/not_ok- prefixes - Adding --timing flag to bats for timing information Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-19 18:21:43 +01:00
Fabiano Fidêncio	96e1fb4ca6	tools: Remove runk The runk tool hasn't been supported for a few years, with no maintainers since ManaSugi stopped being involved in the project and the CI was disabled in 2024. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-01-19 14:43:53 +01:00

1 2 3 4 5 ...

1872 Commits