kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-04 11:03:52 +00:00

Author	SHA1	Message	Date
Manuel Huber	dd868dee6d	tests: nvidia: onboard NIM service test Onboard a test case for deploying a NIM service using the NIM operator. We install the operator helm chart on the fly as this is a fast operation, spinning up a single operand. Once a NIM service is scheduled, the operator creates a deployment with a single pod. For now, the TEE-based flow uses an allow-all policy. In future work, we strive to support generating pod security policies for the scenario where NIM services are deployed and the pod manifest is being generated on the fly. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-04-02 16:58:54 +02:00
Fabiano Fidêncio	2131147360	tests: add kata-deploy lifecycle tests for restart resilience and cleanup Add functional tests that cover three previously untested kata-deploy behaviors: 1. Restart resilience (regression test for #12761): deploys a long-running kata pod, triggers a kata-deploy DaemonSet restart via rollout restart, and verifies the kata pod survives with the same UID and zero additional container restarts. 2. Artifact cleanup: after helm uninstall, verifies that RuntimeClasses are removed, the kata-runtime node label is cleared, /opt/kata is gone from the host filesystem, and containerd remains healthy. 3. Artifact presence: after install, verifies /opt/kata and the shim binary exist on the host, RuntimeClasses are created, and the node is labeled. Host filesystem checks use a short-lived privileged pod with a hostPath mount to inspect the node directly. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 15:20:53 +02:00
Fabiano Fidêncio	8b9ce3b6cb	tests: remove k3s/rke2 V3 containerd template workaround Remove the workaround that wrote a synthetic containerd V3 config template for k3s/rke2 in CI. This was added to test kata-deploy's drop-in support before the upstream k3s/rke2 patch shipped. Now that k3s and rke2 include the drop-in imports in their default template, the workaround is no longer needed and breaks newer versions. Removed: - tests/containerd-config-v3.tmpl (synthetic Go template) - _setup_containerd_v3_template_if_needed() and its k3s/rke2 wrappers - Calls from deploy_k3s() and deploy_rke2() This reverts the test infrastructure part of `a2216ec05`. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 14:24:55 +02:00
Manuel Huber	177f5c308e	tests: gpu: use container image layer storage Use the container image layer storage feature for the k8s-nvidia-nim.bats test pod manifests. This reduces the pods' memory requirements. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-04-01 10:22:26 +02:00
Manuel Huber	b6cf00a374	tests: parametrize storage parameters - trusted-storage.yaml.in: use $PV_STORAGE_CAPACITY and $PVC_STORAGE_REQUEST so that PV/PVC size can vary per test. - confidential_common.sh: add optional size (MB) argument to create_loop_device. - k8s-guest-pull-image.bats: pass PV_STORAGE_CAPACITY and PVC_STORAGE_REQUEST when generating storage config. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-04-01 10:22:26 +02:00
Hyounggyu Choi	11cd5f2808	tests: Configure devmapper properly regardless of containerd version The following differences are observed between containerd 1.x and 2.x: ``` [plugins.'io.containerd.snapshotter.v1.devmapper'] snapshotter = 'overlayfs' ``` and ``` [plugins."io.containerd.snapshotter.v1.devmapper"] snapshotter = "overlayfs" ``` The current devmapper configuration only works with double quotes. Make it work with both single and double quotes via tomlq. In the default configuration for containerd 2.x, the following configuration block is missing: ``` [[plugins.'io.containerd.transfer.v1.local'.unpack_config]] platform = "linux/s390x" # system architecture snapshotter = "devmapper" ``` Ensure the configuration block is added for containerd 2.x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-04-01 07:14:52 +02:00
Alex Lyn	119a145923	docs: Upgrade architecture documentation from 3.0 to 4.0 Replace Kata 3.0 architecture docs with Kata 4.0 (Rust Runtime) documentation. Key changes: - Remove deprecated architecture 3.0 documentation - Add comprehensive Kata 4.0 architecture guide covering: - Unified single-binary architecture - Built-in Dragonball VMM integration - Async I/O model with Tokio - Layered architecture design - Modular resource manager - Extensible framework for multiple container types The new documentation reflects the production-ready Rust runtime with improved performance and reduced resource consumption. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	004333ed71	docs: Update containerd-kata.md with clear settings In this commit: (1) Update containerd config with kata configurations (2) Add more comments to guide how to use containerd/kata with default setting and customized configure setting; (3) Update the usage of containerd cmd tool ctr with explicitly specified runtime-config-path options to make it work. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	a923bb2917	docs: Add document for how-to-use passthroughfd-IO within runtime-rs This document describes the Passthrough-FD (pass-fd) technology implemented in Kata Containers to optimize IO performance. By bypassing the intermediate proxy layers, this technology significantly reduces latency and CPU overhead for container IO streams. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Hyounggyu Choi	8cebcf0113	Merge pull request #12742 from BbolroC/remove-skipped-emptydir-tests-for-ibm-sel tests: Remove skip condition for emptyDir-related tests on IBM SEL	2026-03-27 14:35:48 +01:00
Fabiano Fidêncio	f0ad9f1709	tests: snp: policy: Adjust to containerd 2.3.0 The AMD maintainers switched to the 2.3.0-beta.0 containerd (due to the nydus fixes that landed there). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Fabiano Fidêncio	1b8189731a	tests: hand nydus snapshotter setup over to kata-deploy Now that kata-deploy deploys and manages nydus-for-kata-tee on all platforms, the separate standalone nydus-snapshotter DaemonSet deployment is no longer needed. - Short-circuit deploy_nydus_snapshotter and cleanup_nydus_snapshotter to no-ops with an explanatory message. - Add qemu-snp to the workaround case so AMD SEV-SNP baremetal runners also get USE_EXPERIMENTAL_SETUP_SNAPSHOTTER=true and kata-deploy picks up the snapshotter setup on every run. - Drop the x86_64 arch guard and the hypervisor sub-case from the EXPERIMENTAL_SETUP_SNAPSHOTTER block, allowing any architecture and hypervisor to use the kata-deploy-managed path when the flag is set. Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Hyounggyu Choi	de3afd3076	tests: Remove skip condition for s390x in trusted ephemeral storage test Remove the skip condition for s390x in k8s-trusted-ephemeral-data-storage.bats. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-03-26 18:58:13 +01:00
Hyounggyu Choi	911aee5ad7	tests: Remove skip condition for emptyDir-related tests on IBM SEL Fixes: #10002 Since #11537 resolves the issue, remove the skip conditions for the k8s e2e tests involving emptyDir volume mounts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-03-26 15:39:33 +01:00
Fabiano Fidêncio	814ae53d77	tests: Use the helm chart to setup nydus for TDX Now that containerd 2.3.0-beta.0 has been released, it brings fixes for multi-snapshotters that allow us to test the baremetal machines in the same way we test the non-baremetal ones. Let's start doing the switch for TDX as timezone is friendlier with Mikko. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-24 19:13:59 +01:00
Manuel Huber	79efe3e041	tests: gpu: use container data storage feature Use the container data storage feature for the k8s-nvidia-nim.bats test pod manifests. This reduces the pods' memory requirements. For this, enable the block-encrypted emptydir_mode for the NVIDIA GPU TEE handlers. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-23 11:43:11 -07:00
Steve Horsman	2728b493d5	Merge pull request #12681 from manuelh-dev/mahuber/ci-pip-py-venv tests: cc: setup function for python venv	2026-03-23 14:33:30 +00:00
Fabiano Fidêncio	fe817bb47b	Merge pull request #12705 from fidencio/topic/tests-nginx-connectibity-2nd-try tests: nginx-connectivity: Use `-O index.html` to override the downloaded file	2026-03-23 13:08:51 +01:00
Fabiano Fidêncio	514a2b1a7c	Merge pull request #12264 from fidencio/topic/nvidia-gpu-cc-use-nydus-snapshotter nvidia: cc: Use nydus-snapshotter	2026-03-23 12:50:15 +01:00
Fabiano Fidêncio	83f37f4beb	tests: nginx-connectivity: Override index.html (2nd try) We need to explicitly pass `-O index.html` as the busybox' wget has a different behaviour than GNU's wget. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-23 11:11:44 +01:00
Fabiano Fidêncio	e44dfccf7a	Revert "tests: nginx-connectivity: Allow overriding the downloded file" This reverts commit `4403289123`. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-23 11:06:23 +01:00
Hyounggyu Choi	1035504492	Merge pull request #12701 from fidencio/topic/tests-arm-nginx-connectivity tests: nginx-connectivity: Allow overriding the downloded file	2026-03-23 10:37:25 +01:00
Fabiano Fidêncio	642b5661ff	Merge pull request #12651 from manuelh-dev/mahuber/doc-update-nvidia-gpu-op docs: Update NVIDIA GPU passthrough QEMU scenario	2026-03-23 09:01:02 +01:00
Fabiano Fidêncio	4403289123	tests: nginx-connectivity: Allow overriding the downloded file In case a wget fails for one reason or another, it'll leave behind an 'index.html' file. Let's make sure we allow overriding that file so the retry loop doesn't fail for no reason. Fixes: #12670 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-23 04:08:24 +01:00
Alex Lyn	d2c2ec6e23	Merge pull request #12633 from LandonTClipp/docs_materialx docs: Move to mkdocs-material, port Helm to docs site	2026-03-23 09:29:25 +08:00
Fabiano Fidêncio	740d380b8e	tests: nvidia: cc: Use nydus-snapshotter So we can test what we just changed in the config files. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-22 10:10:34 +01:00
Agam Dua	f6319da73d	tests: Add eBPF and dwarves to spell check dictionary Add missing terms to the spell check dictionary to fix CI failures for kernel debug documentation: - eBPF - dwarves: Linux package with DWARF/BTF tools (pahole) required for CONFIG_DEBUG_INFO_BTF kernel option Also fix the casing of "ebpf" to "eBPF" in the kernel README to match the official naming convention. Signed-off-by: Agam Dua <agam_dua@apple.com>	2026-03-20 15:04:08 -07:00
LandonTClipp	5333e45313	docs: Fix static-checks.sh when running locally This fixes the test_dir variable in static-checks.sh so that when a --repo-path is provided, the test_dir variable uses that for the location instead of the GOPATH location. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2026-03-20 14:51:45 -05:00
Manuel Huber	476f550977	docs: Update NVIDIA GPU passthrough QEMU scenario With the upcoming GPU operator 26.3 release and recent changes to kata-containers, we adapt this documentation with notes on multi GPU passthrough, support for TDX, changed deployment instructions, and with various other minor improvements. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-20 10:53:14 -07:00
stevenhorsman	e62df07b6a	static-checks: Delete kata-spell-check The old hunspell based spell-check was causing contributors challenges and proving a barrier to doc updates. We've replaced it with a cspell based-solution, so clean up the old approach. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:22:54 +00:00
stevenhorsman	d06dadd8ef	docs: Spelling updates Either fixing typos, or including program/repo name in backticks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:22:54 +00:00
stevenhorsman	829a32ee67	spellcheck: Add cspell files Add cspell config and initial dictionary Assisted-by: Bob (dictionary ordering and categorisation) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:22:54 +00:00
Manuel Huber	5765bc97b4	tests: cc: setup function for python venv We recently had a failure on a new CI runner where ${HOME}/.cicd/venv/bin/activate was not present. The relevant call originated from ensure_sev_snp_measure. Thus, add a function ensure_cicd_python_venv before callers to pip install. Currently, the NVIDIA NIM test and the confidential attestation tests use pip to install dependencies. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-18 17:07:47 -07:00
Aurélien Bombo	352b4cdad2	Merge pull request #12660 from LandonTClipp/ci_docs ci: Don't run CI builds on doc PRs	2026-03-17 12:19:11 -05:00
Aurélien Bombo	f8e234c6f9	Merge pull request #12650 from kata-containers/sprt/remove-csi ci: Stop building/deploying CSI driver	2026-03-16 16:53:02 -05:00
Manuel Huber	e13748f46d	tests: Adapt trusted ephemeral storage test With the new CDH version, the LUKS header is moved off of the disk into guest memory. We hence adapt the test's filesystem type checks. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-16 09:43:17 -07:00
Manuel Huber	5bbc0abb81	tests: use pre-created, signed sealed secrets With signature support for sealed secret, use pre-created signed sealed secrets and provision the signing public key to the KBS. Add instructions for re-creating these signed secrets. Improve k8s-sealed-secrets.bats by reducing repeated kubectl logs calls. A test run showed a SIGPIPE error on one of the grep-logs while the printouts of the initial kubectl logs invocation showed that the expected values were actually in the logs. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-16 09:43:17 -07:00
Manuel Huber	4a7022d2f4	tests: nvidia: call genpolicy auth for all tests Call the setup_genpolicy_registry_auth in run_kubernetes_nv_tests.sh. Authenticate before exercising any tests. Recently, we have seen UnauthorizedError messages for the CUDA vectorAdd image. While this image is not gated behind authentication, rate limiting may be a possible issue. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-13 09:03:01 -07:00
LandonTClipp	9a8932412d	docs: remove URL and markdown reference checks This URL check performed a CURL command to see if it was real. This will not work in the mkdocs world because the docs might reference a link that is not yet built on the main page. This is a chicken-and-egg problem. For reference: ``` ERROR: Invalid URL 'https://kata-containers.github.io/kata-containers/installation/#helm-chart' found in the following files: tools/packaging/kata-deploy/helm-chart/README.md ``` The markdown reference requirement was put in place for the old docs system, but this will not apply anymore in the new mkdocs system. I'm removing this entirely because it will only get in the way and cause confusion. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2026-03-12 15:48:35 -05:00
Aurélien Bombo	dd2c4c0db3	Revert "coco: ci: Add no-op steps to deploy CSI driver" This reverts commit `5e4990bcf5`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-11 12:55:23 -05:00
Steve Horsman	ba0f5b98fe	Merge pull request #12643 from stevenhorsman/bump-golang-to-1.25.8 versions: bump golang to 1.25.8	2026-03-11 08:53:21 +00:00
Fabiano Fidêncio	374b0abe29	tests: Fix kubelet data dir for k0s in trusted ephemeral storage test k0s uses /var/lib/k0s/kubelet instead of /var/lib/kubelet as its kubelet data directory. Introduce get_kubelet_data_dir() in tests_common.sh and use it in k8s-trusted-ephemeral-data-storage.bats instead of hardcoding /var/lib/kubelet. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	68bdbef676	tests: Improve logging for some tests Use modern test semantics to ease debugging. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	3dd77bf576	tests: Introduce new env variables to ease development It can be useful to set these variables during local testing: * AZ_REGION: Region for the cluster. * AZ_NODEPOOL_TAGS: Node pool tags for the cluster. * GENPOLICY_BINARY: Path to the genpolicy binary. * GENPOLICY_SETTINGS_DIR: Directory holding the genpolicy settings. I've also made it so that tests_common.sh modifies the duplicated genpolicy-settings.json (used for testing) instead of the original git-tracked one. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	a98e328359	tests: Add test for trusted ephemeral data storage This tests the feature on CoCo machines. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
stevenhorsman	8ae0e36737	versions: bump golang to 1.25.8 Bump the builder image and versions to resolve CVEs: - GO-2026-4601 - GO-2026-4602 - GO-2026-4603 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-09 09:10:01 +00:00
Alex Lyn	a35dcf952e	ci: Fix YAML parsing flakiness caused by mktemp random suffixes In some CI runs, `mktemp` generates random characters that accidentally form file extensions like `.cSV` or `.Xml`. This triggers downstream parsing errors because the YAML content is misidentified as CSV/XML. The issue looks like this: ``` '/tmp/bats-run-KodZEA/.../pod-guest-pull-in-trusted-storage.yaml.in.cSV': ... ``` This commit fixes the issue by: 1. Moving the `XXXXXX` placeholder before the `.yaml` extension. 2. Ensuring the generated file always ends in `.yaml`. This prevents format misidentification while maintaining filename uniqueness and security. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:21:29 +08:00
Fabiano Fidêncio	8f35c31b30	Merge pull request #12542 from fidencio/topic/genpolicy-distribute-different-settings-rather-than-patching-for-ci genpolicy: settings.d drop-ins and scenario example drop-ins	2026-03-05 07:37:30 +01:00
Fabiano Fidêncio	b5e0a5b7d6	Merge pull request #12555 from fidencio/topic/tests-use-local-pv-pvc-for-policy-tests k8s-policy-pvc: use local PV/PVC when no default StorageClass exists	2026-03-05 07:37:11 +01:00
Fabiano Fidêncio	a0b9d965e5	k8s-policy-pvc: use local PV/PVC when no default StorageClass exists Create local block storage (loop device, StorageClass, PV) in the test only when the cluster has no default StorageClass, matching the approach used in k8s-volume.bats. Set our StorageClass as default so the PVC binds to our PV; tear it down after the test. When a default already exists (e.g. AKS), skip creation and cleanup so we do not change the cluster's default storage class. Fixes: #9846 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 21:50:51 +01:00

1957 Commits