Commit Graph

18561 Commits

Author SHA1 Message Date
Fabiano Fidêncio
48669a894e runtime-rs: Add vCPU thread pinning support
Port the Go runtime's enable_vcpus_pinning feature to runtime-rs.

The Go runtime already lets users pin each vCPU thread to a specific
host CPU when the vCPU count matches the sandbox cpuset size, using
sched_setaffinity. This is useful for latency-sensitive workloads that
benefit from eliminating cross-CPU migration of vCPU threads.

The approach mirrors the Go implementation:

After VM start and on every container add/update/delete, we fetch the
vCPU thread IDs (via QMP query-cpus-fast for QEMU), compute the union of
all containers' OCI cpusets, and if the two counts match, pin vCPU i to
cpuset[i]. If they diverge (hotplug, container removal, etc.) we reset
all threads back to the full cpuset so nothing gets stuck on a single
core.
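
A minimal sketch of that decision, assuming `vcpu_tids` comes from QMP
query-cpus-fast and `cpuset` is the sorted union of the containers' OCI
cpusets (the helper name is hypothetical):

```
use nix::sched::{sched_setaffinity, CpuSet};
use nix::unistd::Pid;

// Hypothetical helper: pin vCPU i to cpuset[i] on an exact count match,
// otherwise reset every vCPU thread to the full cpuset.
fn pin_or_reset_vcpus(vcpu_tids: &[i32], cpuset: &[usize]) -> nix::Result<()> {
    if vcpu_tids.len() == cpuset.len() {
        for (tid, cpu) in vcpu_tids.iter().zip(cpuset) {
            let mut set = CpuSet::new();
            set.set(*cpu)?;
            sched_setaffinity(Pid::from_raw(*tid), &set)?;
        }
    } else {
        // Counts diverged (hotplug, container removal, ...): widen the
        // affinity so no thread stays stuck on a single core.
        let mut full = CpuSet::new();
        for cpu in cpuset {
            full.set(*cpu)?;
        }
        for tid in vcpu_tids {
            sched_setaffinity(Pid::from_raw(*tid), &full)?;
        }
    }
    Ok(())
}
```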

The pinning check lives in CgroupsResourceInner::update_sandbox_cgroups,
which already runs at exactly the right points in the lifecycle. The
enable_vcpus_pinning flag flows from the TOML config through
CgroupConfig into the cgroup resource layer, and can also be overridden
per-pod via the io.katacontainers.config.runtime.enable_vcpus_pinning
annotation.

The QEMU config templates default to false. The NV GPU configs will get
their own default (true) in a follow-up once those templates are added.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-21 12:45:56 +02:00
Fabiano Fidêncio
1c2d5cb57d Merge pull request #12848 from kata-containers/sprt/fix-block-vol-test
tests: make k8s-block-volume more robust
2026-04-21 11:27:43 +02:00
Fabiano Fidêncio
3b481813f9 Merge pull request #12887 from kata-containers/sprt/fix-runtime-rs-ch-cleanup
runtime-rs/ch: Fix pod deletion hang and make deletion idempotent
2026-04-21 11:21:09 +02:00
Aurélien Bombo
a401266f0e Merge pull request #11704 from microsoft/saulparedes/allow_default_gateway_neigh
network: preseed default-gateway neighbor
2026-04-20 15:43:55 -05:00
Aurélien Bombo
d64fce3998 Revert "ci: k8s: Adjust timeout on free runners"
This reverts commit 8d6f1d6f34.
2026-04-20 15:36:35 -05:00
Aurélien Bombo
3cf9581fbe runtime-rs/ch: Fix errors on pod deletion
* get_rootless_symlink_sandbox_path() would be called without first checking
   is_rootless(), meaning cleanup() would ALWAYS fail (see below error), even
   though the shim/CH would NOT leak thanks to containerd's recovery routine.

 * Cleanup wasn't idempotent (the CRI can issue multiple shutdown requests).
   This was fixed by introducing remove_dir_all_if_exists(); see the sketch
   after the error log below.

   Apr 17 17:53:21 containerd[4078033]: time="2026-04-17T17:53:21.821624475-05:00" level=error msg="failed to shutdown shim task and the shim might be leaked" error="Others(\"failed to handle message handler TaskRequest\\n\\nCaused by:\\n    0: do shutdown\\n    1: do the clean up\\n    2: delete hypervisor\\n    3: No such file or directory (os error 2)\\n\\nStack backtrace:\\n   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from\\n   1: <hypervisor::ch::CloudHypervisor as hypervisor::Hypervisor>::cleanup::{{closure}}\\n   2: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::cleanup::{{closure}}\\n   3: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::shutdown::{{closure}}\\n   4: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}::{{closure}}\\n   5: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}\\n   6: <service::task_service::TaskService as containerd_shim_protos::shim::shim_ttrpc_async::Task>::shutdown::{{closure}}\\n   7: <containerd_shim_protos::shim::shim_ttrpc_async::ShutdownMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}\\n   8: ttrpc::asynchronous::server::HandlerContext::handle_msg::{{closure}}\\n   9: <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll\\n  10: <ttrpc::asynchronous::server::ServerReader as ttrpc::asynchronous::connection::ReaderDelegate>::handle_msg::{{closure}}::{{closure}}\\n  11: tokio::runtime::task::core::Core<T,S>::poll\\n  12: tokio::runtime::task::harness::Harness<T,S>::poll\\n  13: tokio::runtime::scheduler::multi_thread::worker::Context::run_task\\n  14: tokio::runtime::scheduler::multi_thread::worker::Context::run\\n  15: tokio::runtime::context::scoped::Scoped<T>::set\\n  16: tokio::runtime::context::runtime::enter_runtime\\n  17: tokio::runtime::scheduler::multi_thread::worker::run\\n  18: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll\\n  19: tokio::runtime::task::core::Core<T,S>::poll\\n  20: tokio::runtime::task::harness::Harness<T,S>::poll\\n  21: tokio::runtime::blocking::pool::Inner::run\\n  22: std::sys::backtrace::__rust_begin_short_backtrace\\n  23: core::ops::function::FnOnce::call_once{{vtable.shim}}\\n  24: std::sys::thread::unix::Thread::new::thread_start\\n  25: <unknown>\\n  26: <unknown>\")" id=fca6a162b8f0ed7ef2b33cd99b6f1b58124e85c5489c193ceac487db0e4acdde
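
A minimal sketch of the new helper, assuming it simply treats NotFound as
success (only the name comes from this commit; the body is an assumption):

```
use std::{fs, io, path::Path};

// Treat an already-removed directory as success so repeated shutdown
// requests stay idempotent.
fn remove_dir_all_if_exists<P: AsRef<Path>>(path: P) -> io::Result<()> {
    match fs::remove_dir_all(path) {
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}
```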

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-20 15:36:18 -05:00
Aurélien Bombo
93bd2899fb runtime-rs/ch: Fix hang on pod deletion
This serializes CH API calls to avoid a race condition where deleting a pod
would hang indefinitely and leak both the shim and CH processes.

The race happened because the CRI can send multiple shutdown requests for the
same pod, but the CH socket wasn't guarded against concurrent usage, so HTTP
responses could interleave (see below) on the shutdown path, leading to an
error.

This would repro in <15 iterations (sometimes 2-3) using a 2-container pod.
With this commit, I haven't observed a repro in 200+ iterations.
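
A rough sketch of the serialization idea (an assumption, not the actual
patch): hold one async lock across each request/response round-trip so
concurrent callers can no longer interleave on the socket:

```
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::UnixStream;
use tokio::sync::Mutex;

struct ChApi {
    // One guarded socket: a full round-trip happens under the lock.
    socket: Mutex<UnixStream>,
}

impl ChApi {
    async fn request(&self, req: &[u8]) -> std::io::Result<Vec<u8>> {
        let mut sock = self.socket.lock().await;
        sock.write_all(req).await?;
        // Read the response while still holding the lock, so another
        // caller cannot consume (or interleave with) it.
        let mut buf = vec![0u8; 64 * 1024];
        let n = sock.read(&mut buf).await?;
        buf.truncate(n);
        Ok(buf)
    }
}
```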

Fixes: #12858

ORIGINAL REPRO:

while true; do
  kubectl apply -f busybox.yaml
  kubectl wait --for=condition=ready po busybox
  kubectl exec busybox -- echo foo
  kubectl delete po busybox
done

ORIGINAL ERROR:

 Apr 17 20:15:54 kata[2297383]: Failed to stop process, process = ContainerProcess { container_id: ContainerID { container_id: "d4eb8984d630111bbf808c7ea30b7a21274c0193cdb8d501d20e4f26a0a69151" }, exec_id: "", process_type: Container }, err = failed to update_mem_resource

                               Caused by:
                                   0: resize memory
                                   1: get vminfo
                                   2: failed to serde {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packages":1},"kvm_hyperv":false,"max_phys_bits":46,"affinity":null,"features":{"amx":false},"nested":null},"memory":{"size":2147483648,"mergeable":false,"hotplug_method":"Acpi","hotplug_size":132024107008,"hotplugged_size":null,"shared":true,"hugepages":false,"hugepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","initramfs":null},"rate_limit_groups":null,"disks":[{"path":"/usr/share/kata-containers/kata-containers.img","readonly":true,"direct":false,"iommu":false,"num_queues":1,"queue_size":128,"vhost_user":false,"vhost_socket":null,"rate_limit_group":null,"rate_limiter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":null,"mtu":null,"iommu":false,"num_queues":2,"queue_size":256,"vhost_user":false,"vhost_socket":null,"vhost_mode":"Client","id":"_net1","fds":[-1],"rate_limiter_config":null,"pci_segment":0,"offload_tso":true,"offload_ufo":true,"offload_csum":true}],"rng":{"src":"/dev/urandom","iommu":false},"balloon":null,"fs":[{"tag":"kataShared","socket":"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/root/virtiofsd.sock","num_queues":1,"queue_size":1024,"id":"_fs2","pci_segment":0}],"pmem":null:"/run/kata/e1ae0a05f575a13a535aa95a9990d1fded4766a759f76be0e528c7912d3a5e39/ch-vm.sock","iommu":false,"id":"_vsock3","pci_segment":0},"pvpanic":false,"iommu":false,"numa":null,"watchdog":false,"pci_segments":null,"platform":null,"tpm":null,"landlock_enabl"index":0,"base":3891789824,"size":524288,"type_":"Mmio32","prefetchable":false}}],"parent":null,"children":["_disk0"],"pci_bdf":"0000:00:01.0"},"_virtio-pci-_vsock3":{"id":"_virtio-pci-_vsock3","resources":[{"PciBar":{"index":0,"base":70367622201344,"sizee":false}}],"parent":null,"children":["_fs2"],"pci_bdf":"0000:00:04.0"},"_vsock3":{"id":"_vsock3","resources":[],"parent":"_virtio-pci-_vsock3","children":[],"pci_bdf":null},"_net1":{"id":"_net1","resources":[],"parent":"_virtio-pci-_net1","children":[],"presources":[{"PciBar":{"index":0,"base":70367623774208,"size":524288,"type_":"Mmio64","prefetchable":false}}],"parent":null,"children":["_net1"],"pci_bdf":"0000:00:02.0"},"_virtio-pci-__rng":{"id":"_virtio-pci-__rng","resources":[{"PciBar":{"index":0,"baseesources":[],"parent":null,"children":[],"pci_bdf":null}}}HTTP/1.1 200
                                      Server: Cloud Hypervisor API
                                      Connection: keep-alive
                                      Content-Type: application/json
                                      Content-Length: 4285

                                      {"config":{"cpus":{"boot_vcpus":1,"max_vcpus":32,"topology":{"threads_per_core":1,"cores_per_die":32,"dies_per_package":1,"packagesepage_size":null,"prefault":false,"zones":null,"thp":true},"payload":{"firmware":null,"kernel":"/usr/share/cloud-hypervisor/vmlinux.bin","cmdline":"reboot=k panic=1 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service agent.log_vport=1025 console=ttyS0,115200n8 root=/dev/vda1 rootflags=data=ordered,errors=remount-ro ro rootfstype=ext4 no_timer_check noreplace-smp systemd.log_target=console agent.container_pipe_size=1 agent.log=debug cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1","miter_config":null,"id":"_disk0","disable_io_uring":false,"disable_aio":false,"pci_segment":0,"serial":null,"queue_affinity":null,"backing_files":false}],"net":[{"tap":null,"ip":"192.168.249.1","mask":"255.255.255.0","mac":"9e:7e:13:ee:03:5c","host_mac":nu,"serial":{"file":null,"mode":"Tty","iommu":false,"socket":null},"console":{"file":null,"mode":"Off","iommu":false,"socket":null},"debug_console":{"file":null,"mode":"Off","iobase":233},"devices":[],"user_devices":null,"vdpa":null,"vsock":{"cid":3,"socket"
                                   3: expected `,` or `}` at line 1 column 1924

                               Stack backtrace:
                                  0: <E as anyhow::context::ext::StdError>::ext_context
                                  1: anyhow::context::<impl anyhow::Context<T,E> for core::result::Result<T,E>>::with_context
                                  2: <hypervisor::ch::CloudHypervisor as hypervisor::Hypervisor>::resize_memory::{{closure}}
                                  3: resource::manager_inner::ResourceManagerInner::update_linux_resource::{{closure}}
                                  4: virt_container::container_manager::container::Container::stop_process::{{closure}}
                                  5: virt_container::container_manager::process::Process::run_io_wait::{{closure}}::{{closure}}
                                  6: tokio::runtime::task::core::Core<T,S>::poll
                                  7: tokio::runtime::task::harness::Harness<T,S>::poll
                                  8: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
                                  9: tokio::runtime::scheduler::multi_thread::worker::Context::run
                                 10: tokio::runtime::context::scoped::Scoped<T>::set
                                 11: tokio::runtime::context::runtime::enter_runtime
                                 12: tokio::runtime::scheduler::multi_thread::worker::run
                                 13: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
                                 14: tokio::runtime::task::core::Core<T,S>::poll
                                 15: tokio::runtime::task::harness::Harness<T,S>::poll
                                 16: tokio::runtime::blocking::pool::Inner::run
                                 17: std::sys::backtrace::__rust_begin_short_backtrace
                                 18: core::ops::function::FnOnce::call_once{{vtable.shim}}
                                 19: std::sys::thread::unix::Thread::new::thread_start
                                 20: <unknown>
                                 21: <unknown>

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-20 15:36:00 -05:00
Fabiano Fidêncio
847f0f40cb Merge pull request #12880 from fidencio/topic/improve-qemu-cache
ci: cache: qemu: Take configure-hypervisor.sh into account
2026-04-20 19:16:01 +02:00
Saul Paredes
f1bcfb8a62 policy: allow neighbors with reachable state
Related to the previous commit, which adds the default-gateway neighbor;
that entry has a state of reachable.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-04-20 10:00:23 -07:00
Saul Paredes
83bbfedc08 network: preseed default-gateway neighbor
This change mirrors host networking into the guest as before, but now also
includes the default gateway neighbor entry for each interface.

Pods using overlay/synthetic gateways (e.g., 169.254.1.1) can hit a
first-connect race while the guest performs the initial ARP. Preseeding the
gateway neighbor removes that latency and makes early connections (e.g.,
to the API Service) deterministic.
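
Illustrative only; the real change goes through the runtime's netlink
plumbing rather than a shell-out, but the guest-side effect is roughly:

```
use std::process::Command;

// Preseed the gateway neighbor in a REACHABLE state so the first
// connection doesn't wait on ARP. Arguments are illustrative.
fn preseed_gateway_neighbor(gw_ip: &str, gw_mac: &str, dev: &str) -> std::io::Result<bool> {
    Command::new("ip")
        .args(["neigh", "replace", gw_ip, "lladdr", gw_mac, "dev", dev, "nud", "reachable"])
        .status()
        .map(|s| s.success())
}
```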

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-04-20 10:00:19 -07:00
Dan Mihai
b2ea9a8fc6 Merge pull request #12460 from microsoft/danmihai1/k8s-openvpn-runtime
tests: annotations for all k8s-openvpn yaml files
2026-04-20 09:47:02 -07:00
Fabiano Fidêncio
b64673196a ci: cache: qemu: Take configure-hypervisor.sh into account
The script controls the options used to build QEMU and **must** be taken
into account by the CI cache: otherwise, when it changes, the CI would keep
using the old cached QEMU (ignoring any newly added flag).

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-20 14:52:57 +02:00
Fabiano Fidêncio
07731cde21 Merge pull request #12879 from stevenhorsman/confidential-tests-fixes
Confidential tests fixes
2026-04-20 14:33:02 +02:00
stevenhorsman
c75c432c01 ci: Update TEE scope
`k8s-confidential.bats` technically doesn't need attestation, but it only runs
on TEE hardware, so include it in the attestation list so it can be tested in PRs.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-20 09:36:10 +01:00
stevenhorsman
7179e92142 tests/confidentials: Remove pointless skip
The skip conditional is wrong, but it's not needed, as the setup
and teardown only allow confidential hardware anyway.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-20 09:36:10 +01:00
Fupan Li
2629df2785 Merge pull request #12763 from Apokleos/fsmerged-erofs-rs
runtime-rs: support erofs snapshotter with Fsmerge enabled
2026-04-20 11:54:19 +08:00
Alex Lyn
e975b3158b Merge pull request #12837 from stevenhorsman/rand-bump-GHSA-cq8v-f236-94qc
versions: Bump rand crate where possible
2026-04-20 10:05:19 +08:00
Fabiano Fidêncio
d6f0b15578 ci: erofs: restrict to runtime-rs only
The erofs snapshotter configuration is node-wide (a single containerd
drop-in) and cannot be split per runtime handler.  The Go runtime does
not support fsmerged EROFS — it rejects fsmeta.erofs mount sources with
"unsupported mount source" — so erofs is only usable with runtime-rs.

Drop qemu-coco-dev (Go) from the erofs CI matrix and add a check in
kata-deploy's configure_erofs_snapshotter() that inspects the
SNAPSHOTTER_HANDLER_MAPPING: if any Go shim is explicitly mapped to
erofs, emit a prominent warning and bail out with a clear error telling
the operator to fix the mapping.

Since all shims are now guaranteed to be runtime-rs when erofs is
active, remove the conditional is_rust_shim gating and always emit the
full erofs configuration (differ options, default_size,
max_unmerged_layers=1).
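
A hedged sketch of that check, assuming the "shim:snapshotter,..." mapping
format (the real check lives in kata-deploy's shell script):

```
// Reject any Go-runtime shim explicitly mapped to erofs; names and the
// mapping format are assumptions based on this commit message.
fn validate_erofs_mapping(mapping: &str, go_shims: &[&str]) -> Result<(), String> {
    for entry in mapping.split(',').filter(|e| !e.is_empty()) {
        if let Some((shim, snapshotter)) = entry.split_once(':') {
            if snapshotter == "erofs" && go_shims.contains(&shim) {
                return Err(format!(
                    "shim '{shim}' uses the Go runtime, which does not support \
                     fsmerged EROFS; fix SNAPSHOTTER_HANDLER_MAPPING"
                ));
            }
        }
    }
    Ok(())
}
```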

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-19 13:24:31 +02:00
Fabiano Fidêncio
cf1e6f82f2 tests: Show full kata-deploy pod logs in CI
Remove --tail=N limits from `kubectl logs` for kata-deploy pods so
the complete output is visible in CI job logs for debugging.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
c26f647a3a test: Improve process verification and robustness in kill test
During tests, an error like the one below occurred:
```
..k8s-kill-all-process-in-container.bats: line 40: [: too many arguments
```
This commit addresses the issue as follows:
(1) Update process query command to "ps aux || ps" to ensure
  compatibility across different container images while maximizing
  process visibility.
(2) Use "[t]ail" in grep to reliably match the process without
  self-matching.
(3) Quote variable in assertion to resolve "too many arguments" bash
  error.
(4) Improve test reliability by ensuring the process list is actually
  visible to the verification logic.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
f4f6c78e9e tests: Update expectation for no-layer-image test case
The 'no-layer-image' test case was failing because the underlying shim
returned a "unsupported rootfs mounts count" error instead of the
expected application-level "file not found" or "ENOENT" error.

This change updates the BATS test to accept the shim-level rootfs
validation error as a valid failure condition for this unsupported
image scenario, ensuring the CI remains green while reflecting
current runtime behavior.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
be47c2e932 runtime-rs: Avoid share-rw on readonly virtio-scsi/blk devices
Hotplugging a readonly block device could fail with:

  Block node is read-only

The backend block node was created readonly, but the virtio-scsi/blk
frontend path still forced share-rw=true. This is unnecessary and can
cause QEMU to reject the attach because the frontend configuration
does not match the readonly backend.

Fix the virtio-scsi/blk hotplug path by:
- setting read-only for readonly devices where supported
- skipping share-rw for readonly devices

Readonly handling remains in the backend block node configuration,
while the frontend keeps normal disk semantics for block devices.
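
A sketch of the frontend argument selection, with illustrative property
names (whether a given frontend accepts read-only varies by device type,
hence "where supported"):

```
use serde_json::{json, Value};

// Build device_add arguments for the disk frontend; property support
// differs between virtio-blk and scsi-hd, so this is illustrative only.
fn disk_frontend_args(driver: &str, drive_id: &str, readonly: bool) -> Value {
    let mut args = json!({ "driver": driver, "drive": drive_id });
    if readonly {
        // Match the readonly backend node instead of forcing share-rw=on,
        // which QEMU rejects with "Block node is read-only".
        args["read-only"] = json!(true);
    } else {
        args["share-rw"] = json!(true);
    }
    args
}
```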

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
02f975f88b runtime-rs: Enforce read-only and shared access for RO block devices
Explicitly configure `read_only` and `force_share` for readonly block
devices to ensure consistency between the image's read-only state and
QEMU's access mode.

Motivation:
Previously, EROFS images were being accessed in a way that triggered
QEMU's exclusive locking (e.g., the 'resize' lock), even when the images
were intended to be read-only. This conflicted with external processes
(e.g., containerd snapshotter) that held read-only handles, resulting in
"Failed to get shared 'resize' lock" errors during blockdev-add.

Changes:
- Set `read_only=true` and `force_share=true` on both format and file
  nodes for VMDK descriptors and Raw images.
- This ensures QEMU requests shared locks, correctly matching the
  read-only nature of EROFS filesystems and preventing write-mode
  locking conflicts with concurrent processes.
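
A minimal sketch of the resulting blockdev-add arguments for a readonly raw
image, with read-only and force-share set on both the format and file nodes
so QEMU requests shared locks:

```
use serde_json::{json, Value};

// Backend node arguments for a readonly raw image; the VMDK descriptor
// case is analogous, with "driver": "vmdk".
fn readonly_raw_blockdev(node_name: &str, path: &str) -> Value {
    json!({
        "node-name": node_name,
        "driver": "raw",
        "read-only": true,
        "force-share": true,
        "file": {
            "driver": "file",
            "filename": path,
            "read-only": true,
            "force-share": true
        }
    })
}
```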

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Fabiano Fidêncio
9c803d86a6 ci: erofs: Bump containerd to v2.3
To ensure we're using the latest released version of the project, as I
think we're missing patches on v2.2.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-19 13:24:31 +02:00
Fabiano Fidêncio
cdd09c3c65 ci: enable erofs tests with runtime-rs
Now that erofs snapshotter support has been added, let's make sure this is
tested.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
7f7cca16fa kata-deploy: Complete containerd config for erofs snapshotter
Add missing containerd configuration items for erofs snapshotter to
enable fsmerged erofs feature:

Add snapshotter plugin configuration:
 - default_size: "10G" # can be customized
 - max_unmerged_layers: 1 # Fixed with 1

These configurations align with the documentation in
docs/how-to/how-to-use-fsmerged-erofs-with-kata.md Step 2,
ensuring the CI workflow run-k8s-tests-coco-nontee-with-erofs-snapshotter
can properly configure containerd for erofs fsmerged rootfs.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Fabiano Fidêncio
04e0f1c403 qemu: Enable VMDK block format support
The multi-layer EROFS rootfs feature relies on QEMU's VMDK flat-extent
driver to merge multiple EROFS layers into a single virtual block
device. Replace --disable-vmdk with an explicit --enable-vmdk so the
Kata static QEMU build includes VMDK support.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
27341f45f1 docs: Add how-to guide for using fsmerged EROFS rootfs with Kata
Document the end-to-end workflow for using the containerd EROFS
snapshotter with Kata Containers runtime-rs, covering containerd
configuration, Kata QEMU settings, and pod deployment examples
via crictl/ctr/Kubernetes.

Include prerequisites (containerd >= 2.2, runtime-rs main branch),
QEMU VMDK format verification command, architecture diagram,
VMDK descriptor format reference, and troubleshooting guide.

Note that Cloud Hypervisor, Firecracker, and Dragonball do not
support VMDK block devices and are currently unsupported for
fsmerged EROFS rootfs.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
526126904e runtime-rs: Add support for handling vmdk hotplugging with scsi
We should also support the virtio-scsi driver for handling VMDK-format
block devices, as this will help address more cases.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
ce3473d272 agent: Kill processes before removing container directory in destroy()
When using multi-layer EROFS snapshotter, the destroy() method fails to
kill container processes, causing process leaks in shared PID namespace
scenarios.

Problem Background:
1. Multi-layer EROFS creates temporary mount points under the container's
  root directory:
  - /run/kata-containers/<cid>/multi-layer/upper (ext4, writable)
  - /run/kata-containers/<cid>/multi-layer/lower-0 (EROFS, read-only)
2. The original destroy() method executed in this order:
  (1) umount rootfs
  (2) fs::remove_dir_all(&self.root) <- FAILS with "Read-only file system"
  (3) cgroup cleanup and process killing <- NEVER EXECUTED
3. When remove_dir_all() encounters the read-only EROFS mount point, it
  returns EROFS error (os error 30), causing destroy() to exit early
  without killing processes.

Why This Fix:
1. The test case k8s-kill-all-process-in-container.bats creates an init
  container with a background process (tail -f /dev/null), expecting it
  to be killed when the init container is destroyed.
2. With shared PID namespace (shareProcessNamespace: true), the orphaned
  process continues running, causing the test to fail.

Solution:
1. Reorder the destroy() method to kill processes BEFORE attempting to
  remove the container directory:
  (1) Get PIDs from cgroup and send SIGKILL
  (2) Destroy cgroup
  (3) umount rootfs
  (4) fs::remove_dir_all(&self.root)
2. This ensures processes are always killed regardless of filesystem
  cleanup status, matching the behavior of overlayfs snapshotter.
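
A condensed sketch of the reordered flow; the cgroup and umount helpers are
hypothetical stand-ins for the agent's real routines:

```
use anyhow::Result;
use nix::sys::signal::{kill, Signal};
use nix::unistd::Pid;
use std::{fs, path::Path};

// Hypothetical stand-ins for the agent's existing helpers:
fn cgroup_pids(_root: &Path) -> Result<Vec<i32>> { unimplemented!() }
fn destroy_cgroup(_root: &Path) -> Result<()> { unimplemented!() }
fn umount_rootfs(_rootfs: &Path) -> Result<()> { unimplemented!() }

fn destroy(root: &Path, rootfs: &Path) -> Result<()> {
    // (1) Kill processes first: a later filesystem failure can no longer
    // leak them into a shared PID namespace.
    for pid in cgroup_pids(root)? {
        let _ = kill(Pid::from_raw(pid), Signal::SIGKILL);
    }
    // (2) Destroy the cgroup.
    destroy_cgroup(root)?;
    // (3) Unmount the rootfs (drops the read-only EROFS mounts).
    umount_rootfs(rootfs)?;
    // (4) Only now remove the container directory; a failure here (e.g.
    // EROFS, os error 30) no longer skips the kill path.
    fs::remove_dir_all(root)?;
    Ok(())
}
```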

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
c745d18e00 agent: Add virtio-scsi for multilayer erofs storage handler
It aims to support the virtio-scsi driver for handling vmdk and rwlayer
storage in kata-agent.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
37a542c20f agent: Refactor multi-layer EROFS handling with unified flow
Refactor the multi-layer EROFS storage handling to improve code
maintainability and reduce duplication.

Key changes:
(1) Extract update_storage_device() to unify device state management
  for both multi-layer and standard storages
(2) Simplify handle_multi_layer_storage() to focus on device creation,
  returning MultiLayerProcessResult struct instead of managing state
(3) Unify the processing flow in add_storages() with clear separation:
(4) Support multiple EROFS lower layers with dynamic lower-N mount paths
(5) Improve mkdir directive handling with deferred {{ mount 1 }}
  resolution

This reduces code duplication, improves readability, and makes the
storage handling logic more consistent across different storage types.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
27c59f15a0 agent: Register MultiLayerErofsHandler and process multiple EROFS
Introduce MultiLayerErofsHandler and a handle_multi_layer_storage method
for multi-layer storage:
(1) Register MultiLayerErofsHandler in STORAGE_HANDLERS to handle
multi-layer EROFS storage with driver type 'multi-layer-erofs'.
(2) Add a handle_multi_layer_erofs function to process multiple EROFS
storages carrying the X-kata.multi-layer marker together in the guest.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
6ce9180333 agent: Add support for EROFS rootfs handling in kata-agent
Add multi_layer_erofs.rs implementing the guest-side processing logic
for multi-layer EROFS rootfs with overlay mount support.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
d8db044c63 runtime-rs: Add erofs rootfs handling logic in handler_rootfs
Add handling for multi-layer EROFS rootfs in the RootFsResource
handler_rootfs method, so multi-layer EROFS rootfs is handled
correctly.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
8d7051436a runtime-rs: Add support for erofs rootfs with multi-layer
Add erofs_rootfs.rs implementing ErofsMultiLayerRootfs for
multi-layer EROFS rootfs with VMDK descriptor generation.

It's the core implementation of EROFS rootfs within the runtime.
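
A hedged sketch of the descriptor generation: each EROFS layer becomes a
FLAT extent sized in 512-byte sectors (layout per the VMDK descriptor
format; the fields the runtime actually emits may differ):

```
// Build a flat-extent VMDK descriptor over (path, size-in-bytes) layers;
// illustrative, not the runtime's exact output.
fn vmdk_descriptor(layers: &[(&str, u64)]) -> String {
    let mut desc = String::from(
        "# Disk DescriptorFile\nversion=1\nCID=fffffffe\nparentCID=ffffffff\n\
         createType=\"monolithicFlat\"\n\n# Extent description\n",
    );
    for (path, bytes) in layers {
        // VMDK extents are sized in 512-byte sectors.
        let sectors = bytes / 512;
        desc.push_str(&format!("RW {sectors} FLAT \"{path}\" 0\n"));
    }
    desc
}
```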

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-19 13:24:31 +02:00
Alex Lyn
cb706219ae runtime-rs: Change Rootfs::get_storage return type
Change Rootfs::get_storage to return Option<Vec<Storage>>
to support multi-layer rootfs with multiple storages.
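
The shape of the change, per the commit message (trait context trimmed; the
real trait is more involved):

```
struct Storage; // placeholder for the agent protocol's Storage type

trait Rootfs {
    // Previously Option<Storage>: a multi-layer rootfs now returns all of
    // its backing storages at once.
    fn get_storage(&self) -> Option<Vec<Storage>>;
}
```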

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-18 22:46:33 +02:00
Alex Lyn
c06bc388c2 runtime-rs: Add format argument to hotplug_block_device method
Add a format argument to hotplug_block_device for flexibly specifying
different block formats.
With this we can support multiple formats: currently raw and vmdk are
supported, and other formats will follow in the future.

Besides the format itself, corresponding handling logic is also required
to properly set the options needed in QMP blockdev-add.
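
A hedged sketch of the format-dependent handling: the format selects the
blockdev-add driver, with vmdk letting QEMU parse the descriptor and open
the flat extents itself:

```
use serde_json::{json, Value};

// Illustrative blockdev-add arguments keyed on the new format argument.
fn blockdev_add_args(node_name: &str, path: &str, format: &str) -> Value {
    let file = json!({ "driver": "file", "filename": path });
    let driver = match format {
        "vmdk" => "vmdk", // descriptor file; extents are opened by QEMU
        _ => "raw",
    };
    json!({ "node-name": node_name, "driver": driver, "file": file })
}
```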

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-18 22:46:33 +02:00
Alex Lyn
15740439eb runtime-rs: Add BlockDeviceFormat enum to support more block formats
In practice, we need more kinds of block formats, not limited to `Raw`.

This commit adds a BlockDeviceFormat enum to support various block device
formats, like RAW, VMDK, etc. It also makes the follow-up changes needed
for this to work, including a format field in BlockConfig.
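
A minimal sketch of the enum and the BlockConfig field it threads through
(variant set per the commit; other fields elided):

```
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub enum BlockDeviceFormat {
    #[default]
    Raw,
    Vmdk,
}

// Trimmed illustration of the new BlockConfig field.
pub struct BlockConfig {
    pub path_on_host: String,
    pub format: BlockDeviceFormat,
    // ... other existing fields elided
}
```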

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-18 19:00:44 +02:00
Alex Lyn
8ed4fa1406 runtime-rs: Add RUNTIME_ALLOW_MOUNTS to RuntimeInfo
Add RUNTIME_ALLOW_MOUNTS annotation to RuntimeInfo to specify
custom mount types allowed by the runtime.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-04-18 19:00:44 +02:00
Fabiano Fidêncio
614cd0618e Merge pull request #12841 from kata-containers/topic/arm-add-qemu-coco-dev
runtime-rs: arm64: ci: Enable qemu-coco-dev tests
2026-04-18 12:22:58 +02:00
Fabiano Fidêncio
edfaeec316 tests: arm64: Skip tests which do not have a multi-arch image
The image used has some special (read: weird) properties that are being
taken advantage of to implement policy-related tests.

Changing the image is a no-go at this point, otherwise we break the
tests ... so let's just skip those for now.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-18 00:48:13 +02:00
Fabiano Fidêncio
d04bb98e09 runtime-rs: Increase reconnect_timeout_ms for confidential VMs
The Go runtime's CoCo dev config uses dial_timeout = 45s, but all
runtime-rs confidential VM configs had reconnect_timeout_ms set to
3000ms (3s) or 5000ms (SE). This is too short for confidential VMs,
especially on arm64 where UEFI firmware (AAVMF) adds significant
boot time on top of the measured boot process, causing ECONNRESET
errors on the vsock connection before the agent is ready.

Bump reconnect_timeout_ms to 45000ms across all confidential VM
configs (coco-dev, SNP, TDX, SE) to match the Go runtime.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-18 00:48:13 +02:00
Fabiano Fidêncio
35e48fdfd1 ci: run qemu-coco-dev-runtime-rs tests on arm64
Add qemu-coco-dev-runtime-rs to the arm64 k8s test matrix so that the
CoCo non-TEE configuration is exercised on aarch64 runners.

Also enable auto-generated policy for qemu-coco-dev on aarch64 (matching
the existing x86_64 behavior) and register the new job as a required
gatekeeper check.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-18 00:48:13 +02:00
Fabiano Fidêncio
588a67a3fb kata-deploy: add arm64 support for qemu-coco-dev shims
Add aarch64/arm64 to the list of supported architectures for
qemu-coco-dev and qemu-coco-dev-runtime-rs shims across kata-deploy
configuration, Helm chart values, and test helper scripts.

Note that guest-components and the related build dependencies are not
yet wired for arm64 in these configurations; those will be addressed
separately.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-18 00:48:13 +02:00
Fabiano Fidêncio
861f15cdc4 build: add arm64 coco-dev build dependencies
Build coco-guest-components, pause-image, and rootfs-image-confidential
for arm64, which are required by qemu-coco-dev-runtime-rs.

Enable MEASURED_ROOTFS on the arm64 shim-v2 build, add the aarch64 case
to install_kernel() so the default kernel is built as a unified kernel
(with confidential guest support, like x86_64), and adjust the kernel
install naming so only CCA builds get the -confidential suffix.

Also wire rootfs-image-confidential-tarball into the aarch64 local-build
Makefile.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-18 00:48:13 +02:00
Fabiano Fidêncio
e1f8b8e8b4 build: add arm64 tools build (genpolicy only)
The arm64 build workflow was missing the tools build entirely.
Add build-tools-asset and create-kata-tools-tarball jobs mirroring
the amd64 workflow so that genpolicy and the other tools are
available for coco-dev tests that need auto-generated policy.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-18 00:48:02 +02:00
Fabiano Fidêncio
0ee556a40a Merge pull request #12874 from fidencio/topic/nydus-update-to-v0.15.15
versions: Update nydus-snapshotter to v0.15.15
2026-04-17 22:21:34 +02:00
Saul Paredes
6f6e45522e Merge pull request #11562 from Apokleos/clh-initdata
runtime-rs: Add CoCo/protected device for initdata within runtime-rs/Cloud Hypervisor
2026-04-17 11:09:19 -07:00
Dan Mihai
0828784a03 tests: k8s: fix add_annotations_to_yaml
Don't hard-code caller's "${K8S_TEST_YAML}" - use the local
"${yaml_file}" as intended.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-04-17 17:38:11 +00:00