kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-27 11:03:40 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	ff655b0200	tools: Fix shellcheck issues in build-kernel.sh Fix shellcheck warnings and notes identified by running shellcheck --severity=style. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-24 08:14:08 +02:00
Fabiano Fidêncio	8cd8210611	tools: Fix shellcheck issues in artifact-list.sh Fix shellcheck warnings and notes identified by running shellcheck --severity=style. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-24 08:14:08 +02:00
Fabiano Fidêncio	0959f02b76	gatekeeper: Make arm64 CI unrequired We have only one machine up and running the CIs, thus no capacity to keep it as required for now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-23 16:12:06 +02:00
Aurélien Bombo	87a3318151	Merge pull request #12695 from microsoft/saulparedes/test_mariner_runtime-rs ci: k8s-tests: test mariner and runtime-rs	2026-04-22 16:01:08 -05:00
Fabiano Fidêncio	ed3f8b4efe	release: Bump version to 3.29.0 Bump VERSION and helm-charts versions. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-22 15:57:39 +02:00
Fabiano Fidêncio	9b62021049	kata-deploy: Remove untested arm64 and qemu-cca shim support We should not ship configurations that we do not actively test. This commit drops the following from the kata-deploy helm chart: values.yaml: - arm64 from supportedArches for the clh shim - arm64 from supportedArches for the cloud-hypervisor shim - arm64 from supportedArches for the dragonball shim - arm64 from supportedArches for the fc shim - arm64 from supportedArches for the qemu-nvidia-gpu shim - the entire qemu-cca shim definition try-kata-tee.values.yaml: - CCA from the file description comment - qemu-cca from the TEE shims list comment - the entire qemu-cca shim definition - arm64: qemu-cca from the defaultShim mapping, replaced with arm64: qemu-coco-dev-runtime-rs (which is tested) try-kata-nvidia-gpu.values.yaml: - arm64 from supportedArches for the qemu-nvidia-gpu shim - arm64: qemu-nvidia-gpu from the defaultShim mapping Once arm64 and qemu-cca support are properly tested, they can be re-added. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-22 10:55:29 +02:00
Saul Paredes	baf0f16804	ci: k8s-tests: test mariner and runtime-rs Disable policy tests when using mariner and runtime-rs. These are not supported yet. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2026-04-21 14:08:21 -07:00
Fabiano Fidêncio	0c80372cf5	Merge pull request #12881 from stevenhorsman/bump-web-pki-to-0.103.12 Bump web pki to 0.103.12	2026-04-21 18:11:26 +02:00
stevenhorsman	9fbdf513ca	kata-deploy: Delete Cargo.lock In #12776 kata-deploy's binary was moved to the main cargo workspace, but the Cargo.lock wasn't deleted. As it shares the main Cargo.lock tidy this up. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-20 17:09:21 +01:00
Fabiano Fidêncio	b64673196a	ci: cache: qemu: Take configure-hypervisor.sh into account The script is used to change the options used to build QEMU and must be taken into consideration in case something changes, otherwise the QEMU used by the CI would be the old cached one (ignoring any flag newly added). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-20 14:52:57 +02:00
Fabiano Fidêncio	d6f0b15578	ci: erofs: restrict to runtime-rs only The erofs snapshotter configuration is node-wide (a single containerd drop-in) and cannot be split per runtime handler. The Go runtime does not support fsmerged EROFS — it rejects fsmeta.erofs mount sources with "unsupported mount source" — so erofs is only usable with runtime-rs. Drop qemu-coco-dev (Go) from the erofs CI matrix and add a check in kata-deploy's configure_erofs_snapshotter() that inspects the SNAPSHOTTER_HANDLER_MAPPING: if any Go shim is explicitly mapped to erofs, emit a prominent warning and bail out with a clear error telling the operator to fix the mapping. Since all shims are now guaranteed to be runtime-rs when erofs is active, remove the conditional is_rust_shim gating and always emit the full erofs configuration (differ options, default_size, max_unmerged_layers=1). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-19 13:24:31 +02:00
Alex Lyn	7f7cca16fa	kata-deploy: Complete containerd config for erofs snapshotter Add missing containerd configuration items for erofs snapshotter to enable fsmerged erofs feature: Add snapshotter plugin configuration: - default_size: "10G" # can be customized - max_unmerged_layers: 1 # Fixed with 1 These configurations align with the documentation in docs/how-to/how-to-use-fsmerged-erofs-with-kata.md Step 2, ensuring the CI workflow run-k8s-tests-coco-nontee-with-erofs-snapshotter can properly configure containerd for erofs fsmerged rootfs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Fabiano Fidêncio	04e0f1c403	qemu: Enable VMDK block format support The multi-layer EROFS rootfs feature relies on QEMU's VMDK flat-extent driver to merge multiple EROFS layers into a single virtual block device. Replace --disable-vmdk with an explicit --enable-vmdk so the Kata static QEMU build includes VMDK support. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-19 13:24:31 +02:00
Fabiano Fidêncio	588a67a3fb	kata-deploy: add arm64 support for qemu-coco-dev shims Add aarch64/arm64 to the list of supported architectures for qemu-coco-dev and qemu-coco-dev-runtime-rs shims across kata-deploy configuration, Helm chart values, and test helper scripts. Note that guest-components and the related build dependencies are not yet wired for arm64 in these configurations; those will be addressed separately. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-18 00:48:13 +02:00
Fabiano Fidêncio	861f15cdc4	build: add arm64 coco-dev build dependencies Build coco-guest-components, pause-image, and rootfs-image-confidential for arm64, which are required by qemu-coco-dev-runtime-rs. Enable MEASURED_ROOTFS on the arm64 shim-v2 build, add the aarch64 case to install_kernel() so the default kernel is built as a unified kernel (with confidential guest support, like x86_64), and adjust the kernel install naming so only CCA builds get the -confidential suffix. Also wire rootfs-image-confidential-tarball into the aarch64 local-build Makefile. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-18 00:48:13 +02:00
Fabiano Fidêncio	e1f8b8e8b4	build: add arm64 tools build (genpolicy only) The arm64 build workflow was missing the tools build entirely. Add build-tools-asset and create-kata-tools-tarball jobs mirroring the amd64 workflow so that genpolicy and the other tools are available for coco-dev tests that need auto-generated policy. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-18 00:48:02 +02:00
Fabiano Fidêncio	64c139208f	agent: add GetDiagnosticData RPC with termination log support Add a new extensible GetDiagnosticData RPC that retrieves diagnostic information from the guest VM. The request carries a log_type string field to specify what kind of data is requested, and a container_id field to identify the target container. The first supported log_type is "termination_log", which reads the Kubernetes termination message file from inside the guest. This is needed for shared_fs=none configurations where the host cannot directly access the guest filesystem. On the Go runtime side, the container stop() path now calls GetDiagnosticData to copy the termination message to the host when running with NoSharedFS and the terminationMessagePolicy annotation is set to "File". The call is best-effort: failures are logged as warnings rather than blocking container teardown. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2026-04-17 13:01:13 +02:00
LandonTClipp	fd896e4e76	ci: Add kata-dictionary.txt to required_tests.yaml This makes it so that changes to the kata-dictionary.txt file only trigger the static checks to run. Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>	2026-04-15 14:48:01 +01:00
Fabiano Fidêncio	df1d02d3cf	kata-deploy: Allow overriding containerd config path and file name Add two new Helm values under `containerd`: - `configDir`: overrides the host directory where the containerd config lives, taking precedence over the k8sDistribution-based auto-detection. - `configFileName`: overrides the containerd config file name, propagated to the kata-deploy binary via the new CONTAINERD_CONFIG_FILE_NAME environment variable. These are useful for non-standard containerd setups that don't match any of the built-in k8sDistribution presets (k8s, k3s, rke2, k0s, microk8s). The config file name override only affects the default runtime branch in get_containerd_paths(). The k0s/microk8s/k3s/rke2 branches are left untouched since those runtimes have mandatory file naming conventions. Also fixes a spurious leading space in the k3s containerdConfPath branch. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-13 22:31:55 +02:00
stevenhorsman	8be3a24112	ci: Update cargo-deny in gatekeeper Update the name and move it to the static checks as we don't need to ensure it's running for none code changes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-11 08:46:32 +01:00
LizZhang315	2312f67c9b	helm: add overheadEnabled switch for runtimeclass Add a global and per-shim configurable switch to enable/disable the overhead section in generated RuntimeClasses. This allows users to omit overhead when it's not needed or managed externally. Priority: per-shim > global > default(true). Signed-off-by: LizZhang315 <123134987@qq.com>	2026-04-10 10:26:11 +02:00
Fabiano Fidêncio	dca89485f0	Merge pull request #12802 from stevenhorsman/bump-golang-1.25.9 versions: bump golang to 1.25.9	2026-04-10 06:50:35 +02:00
Fabiano Fidêncio	72fb41d33b	kata-deploy: Symlink original config to per-shim runtime copy Users were confused about which configuration file to edit because kata-deploy copied the base config into a per-shim runtime directory (runtimes/<shim>/) for config.d support, leaving the original file in place untouched. This made it look like the original was the authoritative config, when in reality the runtime was loading the copy from the per-shim directory. Replace the original config file with a symlink pointing to the per-shim runtime copy after the copy is made. The runtime's ResolvePath / EvalSymlinks follows the symlink and lands in the per-shim directory, where it naturally finds config.d/ with all drop-in fragments. This makes it immediately obvious that the real configuration lives in the per-shim directory and removes the ambiguity about which file to inspect or modify. During cleanup, the symlink at the original location is explicitly removed before the runtime directory is deleted. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-09 17:16:40 +02:00
stevenhorsman	31f9a5461b	versions: bump golang to 1.25.9 Bump the go version to resolve CVEs: - GO-2026-4947 - GO-2026-4946 - GO-2026-4870 - GO-2026-4869 - GO-2026-4865 - GO-2026-4864 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-09 08:59:40 +01:00
Harshitha Gowda	bb1165b23f	tests: Set sev-snp, qemu-snp CIs as required run-k8s-tests-on-tee (sev-snp, qemu-snp) Signed-off-by: Harshitha Gowda <hgowda@amd.com>	2026-04-08 22:36:58 +02:00
Fabiano Fidêncio	21466eb4e5	kata-deploy: Fix clippy warnings across crate Fix all clippy warnings triggered by -D warnings: - install.rs: remove useless .into() conversions on PathBuf values and replace vec! with an array literal where a Vec is not needed - utils/toml.rs: replace while-let-on-iterator with a for loop and drop the now-unnecessary mut on the iterator binding - main.rs: replace match-with-single-pattern with if-let in two places dealing with experimental_setup_snapshotter - utils/yaml.rs: extract repeated serde_yaml::Value::String key into a local variable, removing needless borrows on temporary values Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 20:47:59 +02:00
Fabiano Fidêncio	1874d4617b	kata-deploy: Run cargo clippy during build Ensure code formatting and compilation are verified early in the Docker build pipeline, before tests and the release build. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 20:47:59 +02:00
Greg Kurz	817580e35d	Merge pull request #12795 from fidencio/topic/kata-deploy-do-not-try-to-install-a-snapshotter-when-using-crio kata-deploy: Skip snapshotter install/uninstall on CRI-O	2026-04-08 17:18:05 +02:00
Fabiano Fidêncio	f27def1a5b	kata-deploy: Skip snapshotter install/uninstall on CRI-O Snapshotters (nydus, erofs) are containerd-specific. The validation code already warned that EXPERIMENTAL_SETUP_SNAPSHOTTER would be ignored on CRI-O, but the actual install/configure and uninstall loops still ran unconditionally, attempting containerd-specific operations on CRI-O nodes. Guard both the install and cleanup snapshotter loops with a `runtime != "crio"` check so the binary itself skips snapshotter work when it detects CRI-O as the container runtime. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:49 +02:00
Fabiano Fidêncio	bc719a66eb	kata-deploy: nvidia: Align force_guest_pull with default values.yaml The defdault is already false, but let's keep those aligned on explicitly setting the default. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Fabiano Fidêncio	78f02f2155	kata-deploy: nvidia: Align labels with default values.yaml Joji's added the labels for the default values.yaml, but we missed adding those to the nvidia specific values.yaml file. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Fabiano Fidêncio	f00b589ccd	Revert "kata-deploy: Temporarily comment GPU specific labels" This reverts commit `02c9a4b23c`, as GPU Operator v26.3.0 is out, and becomes a requirement. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Alex Lyn	c00f895338	kata-deploy: Fix noisy caused by unformatted code When do cargo fmt --all, some files changes as unformatted with `cargo fmt`. This commit is just to address it. Just use this as an example: ``` // Generate the common drop-in files (shared with standard // runtimes) - write_common_drop_ins(config, &runtime.base_config, &config_d_dir, container_runtime)?; + write_common_drop_ins( + config, + &runtime.base_config, + &config_d_dir, + container_runtime, + )?; ``` Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-08 14:39:57 +02:00
Fabiano Fidêncio	a12e0f1204	build: cache: Take NVRC & NVAT version into consideration Without those, we'd end up pulling the same / old rootfs that's cached without re-building it in case of a bump in any of those components. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 10:14:11 +02:00
Fabiano Fidêncio	b3ae6ef99c	Merge pull request #12760 from fitzthum/bump-nvat Bump trustee and guest-components to add nvswitch / ppcie support	2026-04-07 19:07:50 +02:00
Fabiano Fidêncio	461907918d	kata-deploy: pin nydus-snapshotter via versions.yaml Resolve externals.nydus-snapshotter version and url in the Docker image build with yq from the repo-root versions.yaml instead of Dockerfile ARG defaults. Drop the redundant workflow that only enforced parity between those two sources. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-07 10:07:06 +08:00
Fabiano Fidêncio	9e1f595160	kata-deploy: add Rust binary to root workspace Add tools/packaging/kata-deploy/binary as a workspace member, inherit shared dependency versions from the root manifest, and refresh Cargo.lock. Build the kata-deploy image from the repository root: copy the workspace layout into the rust-builder stage, run cargo test/build with -p kata-deploy, and adjust artifact and static asset COPY paths. Update the payload build script to invoke docker buildx with -f .../Dockerfile from the repo root. Add a repo-root .dockerignore to keep the Docker build context smaller. Document running unit tests with cargo test -p kata-deploy from the root. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-07 10:07:06 +08:00
Tobin Feldman-Fitzthum	0444d70704	rootfs: add runtime support for NVAT Update NVIDIA rootfs builder to include runtime dependencies for NVAT Rust bindings. The nvattest package does not include the .so file, so we need to build from source. Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>	2026-04-06 17:51:12 +00:00
Tobin Feldman-Fitzthum	78c61459f8	packaging: add built-time support for NVAT The attestation agent will soon rely on the NVAT rust bindings, which have some built-time dependencies. There is currently no nvattest-dev package, so we need to build from source to get the headers and .so file. Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>	2026-04-06 17:51:12 +00:00
Fabiano Fidêncio	47770daa3b	helm: Align values.yaml with try-kata-nvidia-gpu.values.yaml We've switched to nydus there, but never did for the values.yaml. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-06 18:51:54 +02:00
Fabiano Fidêncio	b4b62417ed	kata-deploy: skip cleanup on pod restart to avoid crashing kata pods When a kata-deploy DaemonSet pod is restarted (e.g. due to a label change or rolling update), the SIGTERM handler runs cleanup which unconditionally removes kata artifacts and restarts containerd. This causes containerd to lose the kata shim binary, crashing all running kata pods on the node. Fix this by implementing a three-stage cleanup decision: 1. If this pod's owning DaemonSet still exists (exact name match via DAEMONSET_NAME env var), this is a pod restart — skip all cleanup. The replacement pod will re-run install, which is idempotent. 2. If this DaemonSet is gone but other kata-deploy DaemonSets still exist (multi-install scenario), perform instance-specific cleanup only (snapshotters, CRI config, artifacts) but skip shared resources (node label removal, CRI restart) to avoid disrupting the other instances. 3. If no kata-deploy DaemonSets remain, perform full cleanup including node label removal and CRI restart. The Helm chart injects a DAEMONSET_NAME environment variable with the exact DaemonSet name (including any multi-install suffix), ensuring instance-aware lookup rather than broadly matching any DaemonSet containing "kata-deploy". Fixes: #12761 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 15:20:52 +02:00
Fabiano Fidêncio	28414a614e	kata-deploy: detect k3s/rke2 via systemd services instead of version string Newer k3s releases (v1.34+) no longer include "k3s" in the containerd version string at all (e.g. "containerd://2.2.2-bd1.34" instead of "containerd://2.1.5-k3s1"). This caused kata-deploy to fall through to the default "containerd" runtime, configuring and restarting the system containerd service instead of k3s's embedded containerd — leaving the kata runtime invisible to k3s. Fix by detecting k3s/rke2 via their systemd service names (k3s, k3s-agent, rke2-server, rke2-agent) rather than parsing the containerd version string. This is more robust and works regardless of how k3s formats its containerd version. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 14:24:55 +02:00
Fabiano Fidêncio	fe1f804543	kata-deploy: Restart nydus-snapshotter in case of failure Let's ensure that in case nydus-snapshotter crashes for one reason or another, the service is restarted. This follows containerd approach, and avoids manual intervention in the node. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 11:00:21 +02:00
Fabiano Fidêncio	789abe6fdf	kata-deploy: Make nydus a soft dep of containerd Let's relax our RequiredBy and use a WantedBy in the nydus systemd unit file as, in case of a nydus crash, containerd would also be put down, causing the node to become NotReady. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-01 10:52:29 +02:00
Fabiano Fidêncio	3a1683ccdc	gatekeeper: unrequire kata-deploy k3s tests Those are breaking, and I need time to investigate why. For now, unrequire those tests. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-31 18:32:17 +02:00
Fabiano Fidêncio	4fad88499c	kata-deploy: rename nydus-snapshotter to nydus-for-kata-tee Rename all host-visible names of the nydus-snapshotter instance managed by kata-deploy from the generic "nydus-snapshotter" to "nydus-for-kata-tee". This covers the systemd service name, the containerd proxy plugin key, the runtime class snapshotter field, the data directory (/var/lib/nydus-for-kata-tee), the socket path (/run/nydus-for-kata-tee/), and the host install subdirectory. The rename makes it immediately clear that this nydus-snapshotter instance is the one deployed and managed by kata-deploy specifically for Kata TEE use cases, rather than any general-purpose nydus-snapshotter that might be present on the host. Because the old code operated under a completely separate set of paths (nydus-snapshotter.*), any previously deployed installation continues to run without interference during the transition to this new naming. CI pipelines and operators can upgrade kata-deploy on their own schedule without having to coordinate an atomic cutover: the old service keeps serving its existing workloads until it is explicitly replaced, and the new deployment lands cleanly alongside it. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Steve Horsman	8c2b7ed619	Merge pull request #12729 from fidencio/topic/kata-deploy-nydus-dont-touch-data-dir-on-install kata-deploy: nydus: never remove the data dir	2026-03-25 10:28:50 +00:00
Steve Horsman	0d8186ae16	Merge pull request #12730 from fidencio/topic/bump-nydus-snapshotter versions: Bump nydus-snapshotter to v0.15.13	2026-03-25 10:20:23 +00:00
Fabiano Fidêncio	bcfb2354e0	gatekeeper: Unrequire NVIDIA GPU SNP tests till auth is fixed SSIA, the NIM tests are breaking due to authentication issues, and those issues are blocking other PRs. Let's unrequire the test for now, and mark it as required again once we fixed the auth issues. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-25 10:23:53 +01:00
Fabiano Fidêncio	caf6b244e6	versions: Bump nydus-snapshotter to v0.15.13 As this brings in a fix for using images with too many layers. https://github.com/containerd/nydus-snapshotter/releases/tag/v0.15.13 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-25 08:31:48 +01:00

1 2 3 4 5 ...

2208 Commits