kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-10 14:02:59 +00:00

Author	SHA1	Message	Date
Alex Lyn	caf0275645	agent: Refactor multi-layer EROFS handling with unified flow Refactor the multi-layer EROFS storage handling to improve code maintainability and reduce duplication. Key changes: (1) Extract update_storage_device() to unify device state management for both multi-layer and standard storages (2) Simplify handle_multi_layer_storage() to focus on device creation, returning MultiLayerProcessResult struct instead of managing state (3) Unify the processing flow in add_storages() with clear separation: (4) Support multiple EROFS lower layers with dynamic lower-N mount paths (5) Improve mkdir directive handling with deferred {{ mount 1 }} resolution This reduces code duplication, improves readability, and makes the storage handling logic more consistent across different storage types. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-10 10:54:20 +08:00
Alex Lyn	47a501be80	agent: Register MultiLayerErofsHandler and process multiple EROFS Introduce MultiLayerErofsHandler and method of handle_multi_layer_storage for multi-layer storage: (1) Register MultiLayerErofsHandler to STORAGE_HANDLERS to handle multi-layer EROFS storage with driver type 'multi-layer-erofs'. (2) Add handle_multi_layer_erofs function to process multiple EROFS storages with X-kata.multi-layer marker together in guest. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-10 10:54:20 +08:00
Alex Lyn	1d2c5b8e27	agent: Add support for EROFS rootfs handling in kata-agent Add multi_layer_erofs.rs implementing the guest-side processing logic for multi-layer EROFS rootfs with overlay mount support. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-10 10:54:20 +08:00
Alex Lyn	d90a947411	runtime-rs: Add erofs rootfs handling logic in handler_rootfs Add handling for multi-layer EROFS rootfs in the RootFsResource handler_rootfs method so that a multi-layer EROFS rootfs is handled correctly. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-10 10:54:20 +08:00
Alex Lyn	cda094c023	runtime-rs: Add support for erofs rootfs with multi-layer Add erofs_rootfs.rs implementing ErofsMultiLayerRootfs for multi-layer EROFS rootfs with VMDK descriptor generation. It's the core implementation of Erofs rootfs within runtime. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-10 10:54:15 +08:00
Alex Lyn	740e724e82	runtime-rs: Change Rootfs::get_storage return type Change Rootfs::get_storage to return Option<Vec<Storage>> to support multi-layer rootfs with multiple storages. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-09 21:26:30 +02:00
Alex Lyn	609ac57967	runtime-rs: Add format argument to hotplug_block_device method Add a format argument to hotplug_block_device so that different block formats can be specified. Raw and vmdk are currently supported, and other formats may be added in the future. Besides the format itself, each format needs corresponding handling logic for the options it requires in QMP blockdev-add. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-09 21:26:30 +02:00
Alex Lyn	df91ade143	runtime-rs: Add BlockDeviceFormat enum to support more block formats In practice, we need more block formats than just `Raw`. This commit adds the BlockDeviceFormat enum to support multiple block device formats, such as RAW and VMDK. It also makes the follow-up changes needed for this to work, including adding a format field to BlockConfig. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-09 21:26:30 +02:00
Alex Lyn	983fb42f7a	runtime-rs: Add RUNTIME_ALLOW_MOUNTS to RuntimeInfo Add RUNTIME_ALLOW_MOUNTS annotation to RuntimeInfo to specify custom mount types allowed by the runtime. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-09 21:26:30 +02:00
Fabiano Fidêncio	72fb41d33b	kata-deploy: Symlink original config to per-shim runtime copy Users were confused about which configuration file to edit because kata-deploy copied the base config into a per-shim runtime directory (runtimes/<shim>/) for config.d support, leaving the original file in place untouched. This made it look like the original was the authoritative config, when in reality the runtime was loading the copy from the per-shim directory. Replace the original config file with a symlink pointing to the per-shim runtime copy after the copy is made. The runtime's ResolvePath / EvalSymlinks follows the symlink and lands in the per-shim directory, where it naturally finds config.d/ with all drop-in fragments. This makes it immediately obvious that the real configuration lives in the per-shim directory and removes the ambiguity about which file to inspect or modify. During cleanup, the symlink at the original location is explicitly removed before the runtime directory is deleted. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-09 17:16:40 +02:00
Steve Horsman	9e8069569e	Merge pull request #12734 from Apokleos/rm-v9p-rs runtime-rs: Remove virtio-9p Shared Filesystem Support	2026-04-09 16:15:55 +01:00
Hyounggyu Choi	f15f7f49f1	Merge pull request #12787 from fidencio/topic/runtime-rs-qemu-arm64-use-static-sandbox-resource-mgmt runtime: qemu: Enable static sandbox resource management on ARM & s390x	2026-04-09 09:18:11 +02:00
Fabiano Fidêncio	80b0ed273f	Merge pull request #12784 from hgowda-amd/sev-snp-tests-required Add sev-snp, qemu-snp CIs as required	2026-04-09 00:22:49 +02:00
Harshitha Gowda	bb1165b23f	tests: Set sev-snp, qemu-snp CIs as required run-k8s-tests-on-tee (sev-snp, qemu-snp) Signed-off-by: Harshitha Gowda <hgowda@amd.com>	2026-04-08 22:36:58 +02:00
Fabiano Fidêncio	2148afe243	Merge pull request #12796 from fidencio/topic/kata-deploy-run-cargo-fmt-and-cargo-check kata-deploy: Run cargo clippy during build	2026-04-08 22:32:31 +02:00
Fabiano Fidêncio	8ff630059a	Merge pull request #12778 from amd-aliem/enable-img-rootfs-snp runtime: SNP img-based rootfs with dm-verity	2026-04-08 22:06:31 +02:00
Fabiano Fidêncio	4561ae3e29	Merge pull request #12799 from fitzthum/fixup-nv-doc-1 docs: update flow for setting nvidia devices to ready	2026-04-08 21:32:55 +02:00
Tobin Feldman-Fitzthum	9119b4982c	docs: update flow for setting nvidia devices to ready Now, we include the nvrc.smi.srs=1 flag in the default kernel cmdline. Thus, we can remove the guidance for people to add it themselves when not using attestation. In fact, users don't really need to know about this flag at all. Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>	2026-04-08 18:59:51 +00:00
Fabiano Fidêncio	21466eb4e5	kata-deploy: Fix clippy warnings across crate Fix all clippy warnings triggered by -D warnings: - install.rs: remove useless .into() conversions on PathBuf values and replace vec! with an array literal where a Vec is not needed - utils/toml.rs: replace while-let-on-iterator with a for loop and drop the now-unnecessary mut on the iterator binding - main.rs: replace match-with-single-pattern with if-let in two places dealing with experimental_setup_snapshotter - utils/yaml.rs: extract repeated serde_yaml::Value::String key into a local variable, removing needless borrows on temporary values Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 20:47:59 +02:00
Fabiano Fidêncio	1874d4617b	kata-deploy: Run cargo clippy during build Ensure code formatting and compilation are verified early in the Docker build pipeline, before tests and the release build. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 20:47:59 +02:00
Amanda Liem	79f844d057	runtime: SNP img-based rootfs with dm-verity Follow-on to kata-containers/kata-containers#12396 Switch SNP config from initrd-based to image-based rootfs with dm-verity. The runtime assembles the dm-mod.create kernel cmdline from kernel_verity_params, and with kernel-hashes=on the root hash is included in the SNP launch measurement. Also add qemu-snp to the measured rootfs integration test. Signed-off-by: Amanda Liem <aliem@amd.com>	2026-04-08 16:46:32 +00:00
Greg Kurz	817580e35d	Merge pull request #12795 from fidencio/topic/kata-deploy-do-not-try-to-install-a-snapshotter-when-using-crio kata-deploy: Skip snapshotter install/uninstall on CRI-O	2026-04-08 17:18:05 +02:00
Fabiano Fidêncio	e93bfbe01a	tests: Remove qemu-coco-dev* skip from sandbox vCPU allocation test With static_sandbox_resource_mgmt calculation fixed for runtime-rs, the VM is correctly pre-sized at creation time. The vCPU allocation test no longer depends on CPU hotplug, so the qemu-coco-dev* skip is no longer needed. Fixes: #10928 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 16:36:00 +02:00
Fabiano Fidêncio	6bc2452664	tests: Remove aarch64 skip from sandbox vCPU allocation test With static_sandbox_resource_mgmt now enabled for ARM on runtime-rs, the VM is correctly pre-sized at creation time. The vCPU allocation test no longer depends on CPU hotplug, so the aarch64 skip (issue #10928) is no longer needed. Fixes: #10928 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 16:36:00 +02:00
Fabiano Fidêncio	e0141991d3	runtime-rs: Enable static sandbox resource management on s390x runtime-rs memory hotplug hard-codes the `pc-dimm` device driver, which is an x86-only QEMU device model. On s390x, the `s390-ccw-virtio` machine type does not support `pc-dimm` at all; the Go runtime handles this by using `virtio-mem-ccw` instead (controlled by the `enable_virtio_mem` config knob, defaulting to true on s390x). runtime-rs has no virtio-mem support, so any attempt to dynamically hotplug memory on s390x fails with: 'pc-dimm' is not a valid device model name This is a pre-existing limitation on main; it has never worked. It is now visible because commit `45dfb6ff25` ("runtime-rs: Fix initial vCPU / memory with static_sandbox_resource_mgmt") expanded runtime-rs test coverage, causing k8s-memory.bats and k8s-oom.bats to actually exercise this code path on s390x. Let's enforce using static_sandbox_resource_mgmt also for s390x so the VM is sized upfront at creation time, bypassing the broken dynamic hotplug path entirely. If someone decides to implement hotplug support for s390x, the work would basically be an implementation of virtio-mem-ccw support in the runtime-rs QEMU backend (boot-time device creation, qom-set based resize, and virtio-mem aware memory accounting), mirroring what the Go runtime already does, but I'm not game for this (sorry). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 16:36:00 +02:00
Fabiano Fidêncio	ffab9b7eee	runtime: qemu: Enable static sandbox resource management on ARM runtime-rs lacks several features needed for CPU hotplug on ARM: pflash/UEFI firmware passthrough, SMP topology in -smp, nr_cpus kernel parameter, and QMP vCPU add handling for the virt machine type (which requires core-id only placement with socket/thread/die set to -1). Without static sandbox resource management, these gaps cause failures in tests like k8s-memory.bats where the VM is not correctly sized for the workload. Enable static_sandbox_resource_mgmt for aarch64 in the QEMU runtime-rs configuration so the VM is pre-sized at creation time, sidestepping the need for hotplug entirely. Together with this we're aligning the go runtime to the very same behaviour. Fixes: #10928 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 16:36:00 +02:00
Fabiano Fidêncio	0e5e4802d7	runtime-rs: Fix initial vCPU / memory with static_sandbox_resource_mgmt InitialSizeManager::setup_config() is responsible for applying the sandbox workload sizing (computed from containerd/CRI-O sandbox annotations) to the hypervisor configuration before VM creation. Previously, the workload vCPU count was only logged but never actually added to default_vcpus, so the VM was always created with only the base vCPUs from the configuration/annotations. This caused the k8s-sandbox-vcpus-allocation test to fail with qemu-snp-runtime-rs: a pod with default_vcpus=0.75 and a container CPU limit of 1.2 should see ceil(0.75 + 1.2) = 2 vCPUs, but only got 1. Additionally, the workload memory was being added to default_memory unconditionally, diverging from the Go runtime which only applies both CPU and memory additions when static_sandbox_resource_mgmt is enabled. In the non-static path, adding workload resources here would cause double-counting: once from setup_config() at sandbox creation, and again from update_cpu_resources()/update_mem_resources() when individual containers are added. Guard both additions behind static_sandbox_resource_mgmt, matching the Go runtime's behavior in src/runtime/pkg/oci/utils.go: if sandboxConfig.StaticResourceMgmt { sandboxConfig.HypervisorConfig.NumVCPUsF += sandboxConfig.SandboxResources.WorkloadCPUs sandboxConfig.HypervisorConfig.MemorySize += sandboxConfig.SandboxResources.WorkloadMemMB } Fixes: k8s-sandbox-vcpus-allocation test failure on qemu-snp-runtime-rs Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-08 16:36:00 +02:00
Fabiano Fidêncio	bb051bb16a	Merge pull request #12788 from fidencio/topic/kata-deploy-re-apply-GPU-specific-labels kata-deploy: re-apply labels for the GPU runtime classes	2026-04-08 16:27:59 +02:00
Fabiano Fidêncio	bacc3f4ef1	Merge pull request #12785 from fidencio/topic/runtime-rs-deny-config runtime-rs: Deny config of unknown fields & change dbg_monitor_socket name	2026-04-08 15:12:53 +02:00
Fabiano Fidêncio	f27def1a5b	kata-deploy: Skip snapshotter install/uninstall on CRI-O Snapshotters (nydus, erofs) are containerd-specific. The validation code already warned that EXPERIMENTAL_SETUP_SNAPSHOTTER would be ignored on CRI-O, but the actual install/configure and uninstall loops still ran unconditionally, attempting containerd-specific operations on CRI-O nodes. Guard both the install and cleanup snapshotter loops with a `runtime != "crio"` check so the binary itself skips snapshotter work when it detects CRI-O as the container runtime. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:49 +02:00
Fabiano Fidêncio	bc719a66eb	kata-deploy: nvidia: Align force_guest_pull with default values.yaml The default is already false, but let's keep those aligned by explicitly setting the default. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Fabiano Fidêncio	78f02f2155	kata-deploy: nvidia: Align labels with default values.yaml Joji's added the labels for the default values.yaml, but we missed adding those to the nvidia specific values.yaml file. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Fabiano Fidêncio	f00b589ccd	Revert "kata-deploy: Temporarily comment GPU specific labels" This reverts commit `02c9a4b23c`, as GPU Operator v26.3.0 is out, and becomes a requirement. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 14:41:21 +02:00
Alex Lyn	c00f895338	kata-deploy: Fix noise caused by unformatted code Running cargo fmt --all reports some files as unformatted. This commit only reformats them. For example: ``` // Generate the common drop-in files (shared with standard // runtimes) - write_common_drop_ins(config, &runtime.base_config, &config_d_dir, container_runtime)?; + write_common_drop_ins( + config, + &runtime.base_config, + &config_d_dir, + container_runtime, + )?; ``` Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-08 14:39:57 +02:00
Fabiano Fidêncio	6269b3ecde	Merge pull request #12792 from fidencio/topic/nvidia-rootfs-take-nvrc-and-nvat-versions-in-consideration build: cache: Take NVRC & NVAT version into consideration	2026-04-08 12:44:41 +02:00
Fabiano Fidêncio	a12e0f1204	build: cache: Take NVRC & NVAT version into consideration Without those, we'd end up pulling the same / old rootfs that's cached without re-building it in case of a bump in any of those components. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-08 10:14:11 +02:00
RuoqingHe	a4fb9aef54	Merge pull request #12789 from kata-containers/pin-actions-rs-toolchain gha: Pin action for cargo-deny workflow	2026-04-08 08:36:13 +08:00
Fabiano Fidêncio	995767330d	Merge pull request #12782 from pavithiran34/pavi-ras-version-update fix: updated image-rs to v0.18.0	2026-04-07 23:32:05 +02:00
Alex Lyn	38382a59c4	kata-ctl: remove msize_9p from kata-ctl hypervisor info Remove the msize_9p field from HypervisorInfo struct and get_hypervisor_info() function in kata-ctl tool. This aligns with the removal of 9p filesystem support from the configuration and agent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
Alex Lyn	2bac201364	agent: Remove virtio-9p storage handler Remove the Virtio9pHandler implementation and its registration from the storage handler manager: (1) Remove Virtio9pHandler struct and StorageHandler implementation. (2) Remove DRIVER_9P_TYPE and Virtio9pHandler from STORAGE_HANDLERS registration. (3) Update watcher.rs comments to remove 9p references. This completes the removal of virtio-9p support in the agent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
Alex Lyn	10b24a19c8	kata-types: Remove virtio-9p shared filesystem support Remove all virtio-9p related code and configurations: (1) Remove DRIVER_9P_TYPE and VIRTIO_9P. (2) Remove 9p validation and adjustment logic from SharedFsInfo. (3) Remove KATA_ANNO_CFG_HYPERVISOR_MSIZE_9P annotation handling. (4) Update test configurations to remove msize_9p settings. (5) Update documentation and proto comments. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
Alex Lyn	f133b81579	docs: update shared filesystem documentation and tests (1) Update annotations documentation to reflect new shared filesystem options (virtio-fs, inline-virtio-fs, virtio-fs-nydus, none). (2) Replace virtio-9p references with inline-virtio-fs in config doc. (3) Update drop-in configuration tests to use 'none' instead of 'virtio-9p' Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
Alex Lyn	d6546f2a56	runtime-rs: Remove virtio-9p from configuration*.toml.in As virtio-9p has never been supported in runtime-rs, it can be replaced with the blockfile snapshotter or, in the future, the erofs snapshotter. It's time to remove its documentation and avoid misleading guidance. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
Aurélien Bombo	8916f5f301	gha: Pin action for cargo-deny workflow The cargo-deny workflow should be the last workflow to not use a pinned version. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-04-07 15:41:09 -05:00
pavithiran34	528fa80953	fix: updated image-rs to v0.18.0 - Updated image-rs from rev 026694d4 to tag v0.18.0 - This update brings rsa 0.9.10 which fixes CVE-2026-21895 - Resolves vulnerability in indirect dependencies Signed-off-by: pavithiran34 <pavithiran.p@ibm.com>	2026-04-07 21:40:01 +02:00
Fabiano Fidêncio	b3ae6ef99c	Merge pull request #12760 from fitzthum/bump-nvat Bump trustee and guest-components to add nvswitch / ppcie support	2026-04-07 19:07:50 +02:00
Aurélien Bombo	79fab93041	Merge pull request #12779 from rophy/fix/strip-cr-from-tty-exec tests: strip \r from kubectl exec output for TTY containers	2026-04-07 10:19:21 -05:00
Tobin Feldman-Fitzthum	e40abcf72d	nvidia: add nvrc.smi.srs=1 to default nvidia kernel params The attestation-agent no longer sets nvidia devices to ready automatically. Instead, we should use nvrc for this. Since this is required for all nvidia workloads, add it to the default nv kernel params. With bounce buffers, the timing of attesting a device versus setting it to ready is not so important. Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>	2026-04-07 14:28:50 +00:00
Manuel Huber	0fd4559f7e	docs: Update NVIDIA GPU passthrough QEMU scenario Updates for the NVIDIA GPU passthrough scenario for the kata-containers release 3.29.0. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-04-07 14:58:40 +02:00
Fabiano Fidêncio	9a5aaf7ecb	runtime-rs: move create_container_timeout before [mem_agent] section The create_container_timeout key was placed after the [agent.@PROJECT_TYPE@.mem_agent] TOML section header, which meant TOML parsed it as a field of mem_agent rather than of the parent agent table. This was silently ignored before, but now that MemAgent has #[serde(deny_unknown_fields)] it causes a parse error. Move the key above the [mem_agent] section so it belongs to the correct [agent.@PROJECT_TYPE@] table. Also fix configuration-qemu-coco-dev which had a duplicate entry: keep only the correctly placed one with the COCO timeout value. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-07 11:23:59 +02:00
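
Note on 0e5e4802d7 (initial vCPU / memory with static_sandbox_resource_mgmt): the guarded addition described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the runtime-rs code: InitialSize, apply_workload and boot_vcpus are hypothetical names, and only the arithmetic (workload resources added to the base values when static management is enabled, vCPUs rounded up) follows the message.

```rust
/// Hypothetical sizing record; not the actual runtime-rs type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct InitialSize {
    vcpus: f64,
    memory_mb: u64,
}

/// Add the workload sizing to the base sizing only when static sandbox
/// resource management is enabled. In the non-static path the base is
/// returned unchanged, so resources are not counted a second time when
/// containers are added later.
fn apply_workload(base: InitialSize, workload: InitialSize, static_mgmt: bool) -> InitialSize {
    if static_mgmt {
        InitialSize {
            vcpus: base.vcpus + workload.vcpus,
            memory_mb: base.memory_mb + workload.memory_mb,
        }
    } else {
        base
    }
}

/// Fractional vCPUs are rounded up to whole vCPUs for the VM.
fn boot_vcpus(size: InitialSize) -> u32 {
    size.vcpus.ceil() as u32
}
```

With a base of 0.75 vCPUs and a container CPU limit of 1.2, boot_vcpus(apply_workload(base, workload, true)) gives 2, matching the ceil(0.75 + 1.2) = 2 expectation quoted in the commit message; with static management disabled it stays at 1.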

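Note on 72fb41d33b (symlink original config to per-shim runtime copy): the copy-then-symlink flow from that commit message might look like the sketch below. It is Unix-only and uses just the Rust standard library; link_original_to_copy is a hypothetical helper, and the real kata-deploy code may be structured differently.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Copy `original` to `runtime_copy`, then replace `original` with a
/// symlink pointing at the copy. Returns the path that the symlink
/// resolves to. `runtime_copy` should be an absolute path, since a
/// relative target would be resolved relative to the symlink's directory.
fn link_original_to_copy(original: &Path, runtime_copy: &Path) -> io::Result<PathBuf> {
    // Make the per-shim copy first so the content is never lost.
    fs::copy(original, runtime_copy)?;
    // Remove the original file and put a symlink in its place.
    fs::remove_file(original)?;
    symlink(runtime_copy, original)?;
    // Resolving the symlink lands in the per-shim directory, which is
    // where a config.d/ directory with drop-in fragments would sit.
    fs::canonicalize(original)
}
```

Anything that resolves symlinks before loading the file at the original location then ends up in the per-shim directory, which is the behaviour the commit message relies on.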
18430 Commits