Enable post-install verification in kata-deploy CI tests. When
HELM_VERIFY_DEPLOYMENT is set, a simple verification pod is created
that runs with the Kata runtime to confirm deployment succeeded.
The verification pod prints kernel info and exits - success indicates
the Kata runtime is properly configured and functional.
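A minimal sketch of what such a verification pod could look like (the
runtime class name and image below are illustrative, not necessarily the
exact values used by the test):
```
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: kata-deploy-verify
spec:
  runtimeClassName: kata-qemu
  restartPolicy: Never
  containers:
  - name: verify
    image: quay.io/prometheus/busybox:latest
    command: ["uname", "-a"]
EOF

kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/kata-deploy-verify --timeout=120s
```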
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
set_container_command() previously appended command arguments
one-by-one with
'.command += [...]'. This makes the helper non-idempotent and can
lead to unexpected command arrays when invoked multiple times.
Update the helper to set the full command array in a single yq v4
expression and print the target YAML path plus the command being
applied to simplify debugging when tests fail.
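A minimal sketch of the reworked helper, assuming yq v4 (mikefarah); the
YAML path and argument handling below are illustrative:
```
set_container_command() {
	local pod_config="$1"
	local container_idx="$2"
	shift 2

	local command_path=".spec.containers[${container_idx}].command"
	echo "pod config: ${pod_config}, path: ${command_path}, command: $*"

	# Build the whole array in one expression instead of appending
	# element by element, so repeated calls stay idempotent.
	local elements="" arg
	for arg in "$@"; do
		elements+="\"${arg}\","
	done
	yq -i "${command_path} = [${elements%,}]" "${pod_config}"
}
```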
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The pod config file created by new_pod_config() was generated via
mktemp using the template "pod-config.yaml.in.XXX", which produces
filenames that do not end with ".yaml" (e.g. pod-config.yaml.in.ABC).
If the random suffix happens to make the name end in something like
".Csv" or ".Xml", the subsequent yq operations will fail.
Some helpers and tooling assume the config path ends with ".yaml".
Switch the mktemp template to place the random suffix before the
extension so the returned path always ends with ".yaml".
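For illustration (the directory and exact template string used by the
helper may differ):
```
# Before: the random part comes after the extension, so the result may
# end in e.g. ".Csv" or ".Xml":
pod_config="$(mktemp "${BATS_FILE_TMPDIR}/pod-config.yaml.in.XXX")"

# After: the random part sits before the extension, so the result always
# ends in ".yaml":
pod_config="$(mktemp "${BATS_FILE_TMPDIR}/pod-config.XXX.yaml")"
```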
Fixes: #12268, #12319
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Change the secure_storage_integrity option's default value to true.
With this, integrity protection for encrypted block device contents
will be requested from the confidential data hub by default, see the
agent's cdh_handler_trusted_storage function in rpc.rs.
This behavior can be disabled by explicitly setting the
agent.secure_storage_integrity parameter to 0 or false via kernel
command line parameters.
This will affect the trusted storage implementation for the guest-pull
mechanism, and it will affect future implementations using this code
path, such as implementations for ephemeral secure storage.
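For example, the opt-out can be passed the same way as other agent
parameters, via the hypervisor's kernel_params entry; the configuration
file path below is illustrative:
```
# Append the opt-out to the guest kernel command line.
sudo sed -i \
	's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.secure_storage_integrity=false"/' \
	/etc/kata-containers/configuration.toml
```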
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Create a new page for a reference implementation for Kubernetes
using QEMU, the Go shim and an NVIDIA rootfs. The new page
contains information on:
- components involved in the NVIDIA (TEE) GPU scenario
- orchestration flow for GPU passthrough scenarios
- deployment guidance
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Up to k8s 1.34 we could grep for "Started containerd". From k8s 1.35
onwards the event message changed, so we should instead grep for
"Container started".
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Since #12204 was merged, the following error has been observed:
```
bats warning: Executed 1 instead of expected 2 tests
[run_kubernetes_tests.sh:162] ERROR: Tests FAILED from suites: k8s-empty-dirs.bats
```
The cause is that `pod_logs_file` is re-declared as a local variable
in the second test before skipping, which makes it inaccessible
in `teardown()` and leads to an error.
This commit removes the re-declaration of the variable.
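Roughly, the fix amounts to the following (not the actual test code, just
the shape of the change):
```
@test "second test" {
	# Keep the assignment global so teardown() can still read it.
	pod_logs_file="$(mktemp)"   # previously: local pod_logs_file=...
	# ... rest of the test ...
}

teardown() {
	[ -n "${pod_logs_file:-}" ] && rm -f "${pod_logs_file}"
}
```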
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Changes in NIM/RAG samples:
- update image references
- update memory requirements, timeouts, model name
- sanitize some of the probes and print-out
Further refinements can be made in the future.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Add two attestation tests. The first one sets a resource policy that
requires CPU0 to have an affirming trust level. This is a negative test
which can run on any platform. Setting this policy without setting any
reference values should result in an attestation failure.
Next, a second test will set the same policy, but this time it will use
the journal log to find the QEMU command line from the previous test and
calculate the expected reference values. Currently this is only
supported on SNP using the sev-snp-measure tool, but the same flow
should work on other platforms.
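A rough sketch of the second test's flow on SNP (the journal filter,
paths, and sev-snp-measure arguments are placeholders; the real test
derives them from the QEMU command line found in the journal):
```
# Recover the QEMU command line used by the previous test from the journal.
qemu_cmdline="$(journalctl --since "${test_start_time}" | grep -m1 'qemu-system' || true)"

# Compute the expected launch measurement from that command line.
expected_measurement="$(sev-snp-measure \
	--mode snp \
	--vcpus 1 \
	--vmm-type QEMU \
	--ovmf /path/to/OVMF.fd \
	--kernel /path/to/vmlinuz \
	--initrd /path/to/initrd.img \
	--append "console=hvc0")"
```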
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
This addresses the following issue:
"# bats warning: Executed 0 instead of expected 1 tests"
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
As each test case needs the get_pod_config_dir preparation, a better
approach is to move that call directly into the setup_common method.
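Sketch of the resulting shape:
```
setup_common() {
	# ... existing common setup ...
	get_pod_config_dir
}

setup() {
	setup_common
	# No per-test get_pod_config_dir call is needed any more.
}
```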
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
To measure the journal duration, we need to clearly print the journal
start time and end time for each case, which helps ensure the journal
log covers exactly the period of that case.
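For example (variable names are illustrative):
```
journal_start="$(date '+%Y-%m-%d %H:%M:%S')"
echo "journal start time: ${journal_start}"

# ... run the test case ...

journal_end="$(date '+%Y-%m-%d %H:%M:%S')"
echo "journal end time: ${journal_end}"
journalctl --since "${journal_start}" --until "${journal_end}"
```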
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Currently policy_settings_dir is created only when
BATS_TEST_NUMBER == "1",
but delete_tmp_policy_settings_dir "${policy_settings_dir}" is
called in teardown() for every test. This means that for tests
after the first one teardown() may attempt to delete a directory
that was already removed by a previous test, or rely on a value
that does not belong to the current test execution.
Adjust teardown logic so that policy_settings_dir is only deleted
for the first test case (BATS_TEST_NUMBER == "1") and ignored for
subsequent tests. This keeps the original optimization of running
genpolicy only once, while avoiding unnecessary or confusing cleanup
attempts in later test cases.
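A sketch of the adjusted logic (helper names follow the existing tests;
the exact setup()/teardown() structure may differ):
```
setup() {
	if [ "${BATS_TEST_NUMBER}" = "1" ]; then
		# Run genpolicy only once for the whole file.
		policy_settings_dir="$(create_tmp_policy_settings_dir "${pod_config_dir}")"
	fi
}

teardown() {
	# Only the first test case owns the settings dir; later tests have
	# nothing to clean up here.
	if [ "${BATS_TEST_NUMBER}" = "1" ]; then
		delete_tmp_policy_settings_dir "${policy_settings_dir}"
	fi
}
```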
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Previously pod_name was declared as local, so it could not be accessed
within the teardown() function, causing a failure.
This commit simply removes the `local pod_name` declaration to make it
a global variable.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Align with other test logic: declare KATA_HYPERVISOR in the run bash
script, then declare the RUNTIME_CLASS_NAME variable in the bats files.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Now that we have a more restrictive resource policy for KBS, let
us start adopting it across all NVIDIA test cases. This policy was
previously introduced by the NVIDIA attestation test.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
With these changes, we create pod security policies when running
against NVIDIA TEE GPU handlers where AUTO_GENERATE_POLICY is set.
For the non-TEE GPU tests, the added functions bail out by design.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Following existing patterns, we adapt the common policy settings
for NVIDIA GPU CI platforms. For instance, for our CI runners, we
use containerd 2.x.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Enable auto-generate policy for qemu-nvidia-gpu-* if the user
didn't specify an AUTO_GENERATE_POLICY value.
Setting this in run_kubernetes_nv_tests.sh is too late as
gha-run.sh calls into run_tests, setup.sh, and then into
create_common_genpolicy_settings() where the rules.rego and
genpolicy-settings file are being copied to the right locations.
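A sketch of the intended default (the exact value and guard location are
illustrative):
```
case "${KATA_HYPERVISOR}" in
	qemu-nvidia-gpu*)
		# Only set a default when the user didn't specify a value.
		export AUTO_GENERATE_POLICY="${AUTO_GENERATE_POLICY:-yes}"
		;;
esac
```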
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Set the attestation policy for GPU0 to affirming. This requires
the GPU, for instance, to have production properties, such as
properly signed VBIOS firmware.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
As this CI job keeps failing for various reasons, we'd like to
temporarily skip it on the s390x platform. It will be re-enabled once
the related issues are addressed.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This commit updates the non-TEE tests to disable two specific test
cases: `k8s-number-cpus.bats` and `k8s-sandbox-vcpus-allocation.bats`.
These tests are designed to cover CPU elasticity/dynamic scaling
capabilities. In the non-TEE scenario, we are enforcing the disabling of
this capability by setting the default configuration to
`static_sandbox_resource_mgmt=true`.
Although the tests currently pass, allowing them to run is logically
inconsistent with the intended non-TEE configuration. Therefore, we are
disabling them for all non-TEE runtimes, specifically targeting:
- `qemu-coco-dev`
- `qemu-coco-dev-runtime-rs`
This change ensures that our non-TEE CI accurately reflects the static
resource management policy and prevents misleading test results.
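A sketch of how the two cases could be filtered out for those handlers
(variable and list names are illustrative):
```
case "${KATA_HYPERVISOR}" in
	qemu-coco-dev|qemu-coco-dev-runtime-rs)
		# CPU elasticity is intentionally disabled
		# (static_sandbox_resource_mgmt=true), so skip the related tests.
		skipped_tests+=("k8s-number-cpus.bats" "k8s-sandbox-vcpus-allocation.bats")
		;;
esac
```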
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
As runtime-rs doesn't support block device hotplug on the s390
architecture, simply disable or skip the test when running on s390.
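For instance (the exact guard in the test may differ):
```
if [ "$(uname -m)" = "s390x" ] && [ "${KATA_HYPERVISOR}" = "qemu-runtime-rs" ]; then
	skip "runtime-rs does not support block device hotplug on s390x"
fi
```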
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Enable the CPU hotplug tests within k8s-number-cpus.bats for both
cloud-hypervisor and qemu-runtime-rs.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
We now support CPU hotplug within dragonball and clh; this commit
enables the corresponding test in the CI.
Fixes: #8660
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Due to a previous failure in this case we chose to skip it, but the
CPU hotplug issue has now been fixed, so it's time to re-enable it.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The function assumes that the runner is an Ubuntu machine, which so far
has been true as part of our CI.
However, the new ARM runner is running on Debian, and those mirror
additions would simply break.
With this in mind, for any distro that's not Ubuntu, let's simply
inform the owner of the system that bats must already be installed as
part of the provided environment.
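A sketch of the resulting guard (the die helper and package steps are
illustrative):
```
install_bats() {
	source /etc/os-release

	if [ "${ID}" != "ubuntu" ]; then
		command -v bats >/dev/null 2>&1 || \
			die "bats must be pre-installed on non-Ubuntu runners"
		return 0
	fi

	# Ubuntu-only path: add the mirrors and install bats via apt.
	sudo apt-get update
	sudo apt-get install -y bats
}
```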
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This reverts commit 5a81b010f2, as we now
have all the infrastructure properly set up as part of our CI node.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We've made the pods require a ridiculous amount of memory, just for the
sake of getting them running.
Now that those are running, tests are passing, and the CI is required,
let's work on lowering the amount of memory needed, as everything else
is working as expected.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Clean-up shellcheck warnings:
SC2030 (info): Modification of cmd_out is local (to subshell caused by (..) group).
SC2031 (info): cmd_out was modified in a subshell. That change might be lost.
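Illustrative shape of the fix (not the exact test code):
```
# Before: the assignment happens inside a (..) subshell, so the new
# value of cmd_out is lost outside of it (SC2030/SC2031).
(cmd_out=$(kubectl get pods); echo "${cmd_out}")

# After: assign in the current shell so later code can use the value.
cmd_out=$(kubectl get pods)
echo "${cmd_out}"
```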
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Clean-up shellcheck warnings:
SC2250 (style): Prefer putting braces around variable references even
when not strictly required.
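For example:
```
pod_name="test-pod"
echo "Pod name: $pod_name"     # flagged by SC2250
echo "Pod name: ${pod_name}"   # preferred style
```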
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Let's add simple backup and restore logic for the CDI configuration
file nvidia.com-pgpu.yaml in the k8s-nvidia-*.bats and
k8s-confidential-attestation.bats test files.
Although not optimal, this is a temporary workaround needed until
NVIDIA releases what's needed for the GPU Operator to properly deal with
cold plugged devices for the Confidential Containers cases, which is
work in progress right now.
After that's released, we can revert/drop this patch.
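A sketch of the backup/restore pattern (the CDI spec directory and backup
location below are assumptions):
```
backup_cdi_spec() {
	sudo cp /var/run/cdi/nvidia.com-pgpu.yaml /tmp/nvidia.com-pgpu.yaml.bak
}

restore_cdi_spec() {
	sudo cp /tmp/nvidia.com-pgpu.yaml.bak /var/run/cdi/nvidia.com-pgpu.yaml
}
```
backup_cdi_spec() would be called from setup() and restore_cdi_spec() from
teardown() in the affected bats files.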
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>