kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-18 10:44:10 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	83a8b257d1	Merge pull request #12265 from fidencio/topic/nvidia-bump-container-toolkit nvidia: Bump nvidia-container-toolkit to 1.18.1	2026-03-05 15:25:15 +01:00
Fabiano Fidêncio	079fac1309	Merge pull request #12591 from fidencio/topic/kernel-add-mmio-back-to-the-unified-kernels kernel: include mmio fragment in unified build for firecracker	2026-03-05 13:45:41 +01:00
Fabiano Fidêncio	e9894c0bd8	nvidia: Bump nvidia-container-toolkit to 1.18.1 Let's update the nvidia-container-toolkit to 1.18.1 (from 1.17.6). We're, from now on, relying on the version set in the versions.yaml file. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-05 11:53:09 +01:00
Zachary Spar	bda9f6491f	kata-deploy: add per-shim configurable pod overhead Allow users to override the default RuntimeClass pod overhead for any shim via shims.<name>.runtimeClass.overhead.{memory,cpu}. When the field is absent the existing hardcoded defaults from the dict are used, so this is fully backward compatible. Signed-off-by: Zachary Spar <zspar@coreweave.com>	2026-03-05 08:00:01 +01:00
Fabiano Fidêncio	cb0d02e40b	kernel: include mmio fragment in unified build for firecracker Remove # !confidential from mmio.conf so CONFIG_VIRTIO_MMIO and CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES are included when building the unified x86_64/s390x kernel with -x. Firecracker requires virtio-mmio for block devices; without it the guest kernel panics (no /dev/vda). Fixes: #12581 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 21:18:35 +01:00
Fabiano Fidêncio	d40afe592c	genpolicy: add settings drop-in directory and RFC 6902 JSON Patch support Allow genpolicy -j to accept a directory instead of a single file. When given a directory, genpolicy loads genpolicy-settings.json from it and applies all genpolicy-settings.d/*.json files (sorted by name) as RFC 6902 JSON Patches. This gives precise control over settings with explicit operations (add, remove, replace, move, copy, test), including array index manipulation and assertions. Ship composable drop-in examples in drop-in-examples/: - 10-* files set platform base settings (non-CoCo, AKS, CBL-Mariner) - 20-* files overlay specific adjustments (OCI version, guest pull) Users copy the combination they need into genpolicy-settings.d/. Replace the old adapt_common_policy_settings_* jq-patching functions in tests_common.sh with install_genpolicy_drop_ins(), which copies the right combination of 10-* and 20-* drop-ins for the CI scenario. Tests still generate 99-test-overrides.json on the fly for per-test request/exec overrides. Packaging installs 10-* and 20-* drop-ins from drop-in-examples/ into the tarball; the default genpolicy-settings.d/ is left empty. Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 20:13:21 +01:00
Steve Horsman	a4a4683ec7	Merge pull request #12626 from kata-containers/topic/kata-deploy-k3s-rke2-use-imports kata-deploy: a bunch of fixes regarding uninstall, rke2 and k3s tests	2026-03-04 14:01:09 +00:00
Steve Horsman	8e11bb2526	Merge pull request #12611 from mythi/coco-kernel-v6.18.15 versions: bump to Linux v6.18.15 (LTS)	2026-03-04 14:00:00 +00:00
Fabiano Fidêncio	ebe75cc3e3	kata-deploy: make verification job resilient to CRI runtime restarts kata-deploy restarts the CRI runtime (k3s/containerd) during install, which can kill the verification job pod or cause transient API server errors. Bump backoffLimit from 0 to 3 so the job can retry after being killed, and add a retry loop around kubectl rollout status to handle transient connection failures. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 11:26:31 +01:00
Fabiano Fidêncio	7a08ef2f8d	kata-deploy: run cleanup on SIGTERM instead of preStop hook Move the cleanup logic from a preStop lifecycle hook (separate exec) into the main process's SIGTERM handler. This simplifies the architecture: the install process now handles its own teardown when the pod is terminated. The SIGTERM handler is registered before install begins, and tokio::select! races install against SIGTERM so cleanup always runs even if SIGTERM arrives mid-install (e.g. helm uninstall while the container is restarting after a failed install attempt). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 11:26:31 +01:00
Fabiano Fidêncio	01895bf87e	kata-deploy: use k3s/rke2 drop-in Check the rendered containerd config for the versioned drop-in dir import (config.toml.d or config-v3.toml.d) and bail with a clear error if it is missing. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-04 11:08:26 +01:00
Fabiano Fidêncio	b0345d50e8	build: kernel: Do not expect a modules tarball for vanilla kernel When I added this I had in mind the period that we still relied on the SEV module being generated, which we don't do for quite a long time. This wrong assumption caused the cache to ALWAYS fail, increasing our build time considerably for no reason. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-03 20:14:42 +01:00
Mikko Ylinen	2cf9018e35	versions: bump to Linux v6.18.15 (LTS) Bump to the latest LTS kernel to get a fix for TDX: efi: Fix reservation of unaccepted memory table See details in: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0862438c90487e79822d5647f854977d50381505 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-03-03 07:56:24 +02:00
Mikko Ylinen	0b2af07b02	build: kernel: fix checksum checks for RC kernels get_kernel() thinks it knows when it needs to skip sha256sum validation for RC kernels since sha256sums.asc is not available: INFO: Config version: 176 INFO: Kernel version: 6.18-rc5 INFO: kernel path does not exist, will download kernel INFO: Release candidate kernels are not part of the official sha256sums.asc -- skipping sha256sum validation But continues to check it anyway since ${rc} matches with -n. sha256sum should only be checked when ${rc} is NOT set. Fixes a problem where downloaded RC kernels are always removed and downloaded again. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-03-03 07:56:24 +02:00
Steve Horsman	b147cb1319	Merge pull request #12587 from fidencio/topic/runtime-add-configurable-kubelet-root-dir runtimes: add configurable kubelet root dir	2026-02-28 19:06:14 +00:00
Fabiano Fidêncio	330bfff4be	kata-deploy: Fix nydus snapshotter config (on v3 config version) On containerd v3 config, disable_snapshot_annotations must be set under the images plugin, not the runtime plugin. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-27 18:20:30 +01:00
Fabiano Fidêncio	0a73638744	runtime: add configurable kubelet root dir Different kubernetes distributions, such as k0s, use a different kubelet root dir location instead of the default /var/lib/kubelet, so ConfigMap and Secret volume propagation were failing. This adds a kubelet_root_dir config option that the go runtime uses when matching volume paths and kata-deploy now sets it automatically for k0s via a drop-in file. runtime-rs does not need this option: it identifies ConfigMap/Secret, projected, and downward-api volumes by volume-type path segment (kubernetes.io~configmap, etc.), not by kubelet root prefix. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-27 14:10:57 +01:00
Hyounggyu Choi	be5ae7d1e1	Merge pull request #12573 from BbolroC/support-memory-hotplug-go-runtime-s390x runtime: Support memory hotplug via virtio-mem on s390x	2026-02-27 09:59:40 +01:00
Steve Horsman	c6014ddfe4	Merge pull request #12574 from sathieu/kata-deploy-kubectl-image kata-deploy: allow to configure kubectl image	2026-02-27 08:42:06 +00:00
Fabiano Fidêncio	8c91e7889c	helm-chart: support digest pinning for images When image.reference or kubectlImage.reference already contains a digest (e.g. quay.io/...@sha256:...), use the reference as-is instead of appending :tag. This avoids invalid image strings like 'image@sha256:abc:' when tag is empty and allows users to pin by digest. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-26 13:39:51 +01:00
Mathieu Parent	b61d169472	kata-deploy: allow to configure kubectl image This can be used to: - pin tag (current is 20260112) - pin digest - use another image Signed-off-by: Mathieu Parent <mathieu.parent@insee.fr>	2026-02-26 13:12:03 +01:00
stevenhorsman	82c27181d8	kata-deploy: Remove unused crates cargo machete has identified `serde` and `thiserror` as being unused, so remove them from Cargo.toml Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:38:35 +00:00
Hyounggyu Choi	2860e68534	kernel: Enable CONFIG_VIRTIO_MEM for s390x Since QEMU v10.0.0 and Linux v6.13, virtio-mem-ccw is supported. Let's enable the required kernel configs for s390x. This commit enables `CONFIG_VIRTIO_MEM` and `CONFIG_MEMORY_HOTREMOVE` to support memory hotplug in the VM guest. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-02-25 08:17:48 +01:00
Fabiano Fidêncio	b082cf1708	kata-deploy: validate defaultShim is enabled before propagating it getDefaultShimForArch previously returned whatever string was set in defaultShim.<arch> without any validation. A typo, a non-existent shim, or a shim that is disabled via disableAll would all silently produce a bogus DEFAULT_SHIM_* env var, causing kata-deploy to fail at runtime. Guard the return value by checking whether the configured shim is present in the list of shims that are both enabled and support the requested architecture. If not, return empty string so the env var is simply omitted. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-21 14:01:11 +01:00
Fabiano Fidêncio	4ff7f67278	kata-deploy: fix nil pointer when custom runtime omits containerd/crio Using `$runtime.containerd.snapshotter` and `$runtime.crio.pullType` panics with a nil pointer error when the containerd or crio block is absent from the custom runtime definition. Let's use the `dig` function which safely traverses nested keys and returns an empty string as the default when any key in the path is missing. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-21 13:59:41 +01:00
Zvonko Kaiser	6d1eaa1065	Merge pull request #12461 from manuelh-dev/mahuber/guest-pull-bats tests: enable more scenarios for k8s-guest-pull-image.bats	2026-02-20 08:48:54 -05:00
Zvonko Kaiser	67d154fe47	gpu: Enable NVL5 based platform NVL5 based HGX systems need ib_umad and fabricmanager and nvlsm installed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-02-20 11:39:59 +01:00
Dan Mihai	d8b403437f	static-build: delete cloud-hypervisor directory This cloud-hypervisor is a directory, so it needs "rm -rf" instead of "rm -f". Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-02-19 20:42:50 +01:00
Fabiano Fidêncio	855f4dc7fa	release: Bump version to 3.27.0 Bump VERSION and helm-charts versions. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-19 14:01:26 +01:00
Amulyam24	a22c59a204	kata-deploy: enable kata-remote for ppc64le When kata-deploy is deployed with cloud-api-adaptor, it defaults to qemu instead of configuring the remote shim. Support ppc64le to enable it correctly when shims.remote.enabled=true Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2026-02-19 11:14:27 +01:00
Zvonko Kaiser	1d09e70233	Merge pull request #12538 from fidencio/topic/kata-deploy-fix-regression-on-hardcopying-symlinks kata-deploy: preserve symlinks when installing artifacts	2026-02-18 12:44:46 -05:00
Mikko Ylinen	5622ab644b	versions: bump QEMU to v10.2.1 v10.2.1 is the latest patch release in v10.2 series. Changes: https://github.com/qemu/qemu/compare/v10.2.0...v10.2.1 Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-02-18 18:18:52 +01:00
Mikko Ylinen	d68adc54da	versions: bump to Linux v6.18.12 (LTS) Latest changelog in https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.18.12 Also other changes for 6..11 updates are available. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-02-18 18:18:52 +01:00
Fabiano Fidêncio	34336f87c7	kata-deploy: convert install.rs get_hypervisor_name tests to rstest Use rstest parameterized tests for QEMU variants, other hypervisors, and unknown/empty shim cases. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-18 12:41:55 +01:00
Fabiano Fidêncio	bb11bf0403	kata-deploy: preserve symlinks when installing artifacts When copying artifacts from the container to the host, detect source entries that are symlinks and recreate them as symlinks at the destination instead of copying the target file. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-18 12:29:14 +01:00
Manuel Huber	4c760fd031	build: add CONFIDENTIAL_GUEST variable for kernel This change adds the CONFIDENTIAL_GUEST variable to the kernel build logic. Similar to commit `976df22119`, we would like to enable the cryptsetup functionalities not only when building a measured root file system, but also when building for a confidential guest. The current state is that not all confidential guests use a measured root filesystem, and as a matter of fact, we should indeed decouple these aspects. With the current convention, a confidential guest is a user of CDH with its storage features. A better naming of the CONFIDENTIAL_GUEST variable could have been a naming related to CDH storage functionality. Further, the kernel build script's -m parameter could be improved too - as indicated by this change, not only measured rootfs builds will need the cryptsetup.conf file. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-02-17 12:44:50 -08:00
Fabiano Fidêncio	f0a0425617	kata-deploy: convert a few toml.rs tests to rstest Turn test_toml_value_types into a parameterized test with one case per type (string, bool, int). Merge the two invalid-TOML tests (get and set) into one rstest with two cases, and the two "not an array" tests into one rstest with two cases. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-17 09:33:39 +01:00
Fabiano Fidêncio	899005859c	kata-deploy: avoid leading/blank lines in written TOML config When writing containerd drop-in or other TOML (e.g. initially empty file), the serialized document could start with many newlines. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-17 09:33:39 +01:00
Fabiano Fidêncio	cfa8188cad	kata-deploy: convert containerd version support tests to rstest Replace multiple #[test] functions for snapshotter and erofs version checks with parameterized #[rstest] #[case] tests for consistency and easier extension. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-17 09:33:39 +01:00
Fabiano Fidêncio	cadac7a960	kata-deploy: runtime_platform -> runtime_platforms Fix runtime_platforms typo. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-17 09:33:39 +01:00
Hyounggyu Choi	8bc60a0761	Merge pull request #12521 from fidencio/topic/kata-deploy-auto-add-nfd-tee-labels-to-the-runtime-class kata-deploy: Add TEE nodeSelectors for TEE shims when NFD is detected	2026-02-16 18:06:18 +01:00
Fabiano Fidêncio	a04df4f4cb	kata-deploy: disable provenance/SBOM for quay.io compatibility Disable provenance and SBOM when building per-arch kata-deploy images so each tag is a single image manifest. quay.io rejects pushing multi-arch manifest lists that include attestation manifests (400 manifest invalid). Add a note in the release script documenting this. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-16 13:32:25 +01:00
Fabiano Fidêncio	0e8e30d6b5	kata-deploy: fix default RuntimeClass + nodeSelectors The default RuntimeClass (e.g. kata) is meant to point at the default shim handler (e.g. kata-qemu-$tee). We were building it in a separate block and only sometimes adding the same TEE nodeSelectors as the shim-specific RuntimeClass, leading to kata ending up without the SE/SNP/TDX nodeSelector while kata-qemu-$tee had it. The fix is to stop duplicating the RuntimeClass definition, having a single template that renders one RuntimeClass (name, handler, overhead, nodeSelectors). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-16 13:09:03 +01:00
Fabiano Fidêncio	80a175d09b	kata-deploy: Add TEE nodeSelectors for TEE shims when NFD is detected When NFD is detected (deployed by the chart or existing in the cluster), apply shim-specific nodeSelectors only for TEE runtime classes (snp, tdx, and se). Non-TEE shims keep existing behavior (e.g. runtimeClass.nodeSelector for nvidia GPU from `f3bba0885` is unchanged). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-16 12:07:51 +01:00
Fabiano Fidêncio	d000acfe08	infra: fix multi-arch manifest publish Per-arch images were failing publish-multiarch-manifest with 'X is a manifest list' because Buildx now enables attestations by default, so each arch tag became an image index. Use 'docker buildx imagetools create' instead of 'docker manifest create' so we can merge those indexes into the final multi-arch manifest while keeping provenance and SBOM on per-arch images. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-14 19:49:00 +01:00
Fabiano Fidêncio	02c9a4b23c	kata-deploy: Temporarily comment GPU specific labels We depend on GPU Operator v26.3 release, which is not out yet. Although we have been testing with it, it's not yet publicly available, which would break anyone actually trying to use the GPU runtime classes. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-14 09:25:14 +01:00
Fabiano Fidêncio	5106e7b341	build: Add gnupg to the agent's builder container Otherwise we'll fail to check gperf's GPG signing key when needed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-14 00:33:45 +01:00
Fabiano Fidêncio	d8acc403c8	kata-deploy: set CRI images runtime_platform snapshotter for containerd v3 In containerd config v3 the CRI plugin is split into runtime and images, and setting the snapshotter only on the runtime plugin is not enough for image pull/prepare. The images plugin must have runtime_platform.<runtime>.snapshotter so it uses the correct snapshotter per runtime (e.g. nydus, erofs). A PR on the containerd side is open so we can rely on the runtime plugin snapshotter alone: https://github.com/containerd/containerd/pull/12836 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-13 22:15:02 +01:00
Fabiano Fidêncio	f6e0a7c33c	scripts: use temporary GPG home when verifying cached gperf tarball In CI the default GPG keyring is often read-only or missing, so 'gpg --import' of the cached keyring fails and verification cannot succeed. Use a temporary GNUPGHOME for import and verify so cached gperf can be verified without writing to the system keyring. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-13 19:39:55 +01:00
Joji Mekkattuparamban	f3bba08851	kata-deploy: add node selector to nvidia runtime classes The CC runtime classes kata-qemu-nvidia-gpu-snp and kata-qemu-nvidia-gpu-tdx are mutually exclusive with kata-qemu-nvidia-gpu, as dictated by the gpu cc mode setting. In order to properly support a cluster that has both CC and non-CC nodes, we use a node selector so the scheduling is consistent with the GPU mode. The GPU operator sets a label nvidia.com/cc.ready.state=[true, false] to indicate the gpu mode setting Fixes #12431 Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>	2026-02-13 15:58:06 +01:00


1563 Commits