kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-27 02:56:50 +00:00

Author	SHA1	Message	Date
Markus Rudy	639ff3578d	genpolicy: restrict symlinks in CopyFile Allowing arbitrary symlinks in the shared directory is unsafe for confidential VM use cases. In order to make CopyFile safe both for the VM and for the consuming containers, we implement the following rules for symlinks (in addition to the existing rules for other files): 1. Symlinks may not be placed directly into the shared directory. 2. Symlinks must not point 'upwards', i.e. contain `..` as a path element. 3. Symlinks must be relative. These rules ensure that all writes initiated by CopyFile are restricted to the shared directory (protecting the VM), and that symlinks can't point outside their mount points (protecting the container). These new restrictions mean that we can't support arbitrary mount sources (which might not follow these rules), but the usual k8s suspects (ConfigMap, Secret, ServiceAccountToken) should still pass. In order to aid writing the policy, we convert the CopyFileRequest to a structure that does not contain binary data, but well-defined strings and types. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-04-22 15:46:12 +02:00
Markus Rudy	d6bd666b3f	agent: fix naming for symlinks in CopyFile The agent referred to the `data` field of an incoming CopyFileRequest as the 'src'. This is misleading, because 'source' is not mentioned in the specification (where links are just a path with attached bytes), and because the documentation for the `ln` utility calls the path LINK_NAME and the data TARGET. This commit fixes the glitch and calls the first argument to `symlinkat` the target. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-04-22 15:46:12 +02:00
Markus Rudy	5c362adcff	agent: add required features for standalone build Building the kata-agent-policy crate only succeeded when its parents (agent and genpolicy) pulled in the required features. This commit adds the required features to the crate itself, such that it can be built standalone and IDEs don't show errors while browsing it. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-04-22 15:46:12 +02:00
Alex Lyn	ce3473d272	agent: Kill processes before removing container directory in destroy() When using multi-layer EROFS snapshotter, the destroy() method fails to kill container processes, causing process leaks in shared PID namespace scenarios. Problem Background: 1. Multi-layer EROFS creates temporary mount points under the container's root directory: - /run/kata-containers/<cid>/multi-layer/upper (ext4, writable) - /run/kata-containers/<cid>/multi-layer/lower-0 (EROFS, read-only) 2. The original destroy() method executed in this order: (1) umount rootfs (2) fs::remove_dir_all(&self.root) <- FAILS with "Read-only file system" (3) cgroup cleanup and process killing <- NEVER EXECUTED 3. When remove_dir_all() encounters the read-only EROFS mount point, it returns EROFS error (os error 30), causing destroy() to exit early without killing processes. Why This Fix: 1. The test case k8s-kill-all-process-in-container.bats creates an init container with a background process (tail -f /dev/null), expecting it to be killed when the init container is destroyed. 2. With shared PID namespace (shareProcessNamespace: true), the orphaned process continues running, causing the test to fail. Solution: 1. Reorder the destroy() method to kill processes BEFORE attempting to remove the container directory: (1) Get PIDs from cgroup and send SIGKILL (2) Destroy cgroup (3) umount rootfs (4) fs::remove_dir_all(&self.root) 2. This ensures processes are always killed regardless of filesystem cleanup status, matching the behavior of overlayfs snapshotter. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Alex Lyn	c745d18e00	agent: Add virtio-scsi for multilayer erofs storage handler It aims to support the virtio-scsi driver for handling vmdk and rwlayer storage in kata-agent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Alex Lyn	37a542c20f	agent: Refactor multi-layer EROFS handling with unified flow Refactor the multi-layer EROFS storage handling to improve code maintainability and reduce duplication. Key changes: (1) Extract update_storage_device() to unify device state management for both multi-layer and standard storages (2) Simplify handle_multi_layer_storage() to focus on device creation, returning MultiLayerProcessResult struct instead of managing state (3) Unify the processing flow in add_storages() with clear separation: (4) Support multiple EROFS lower layers with dynamic lower-N mount paths (5) Improve mkdir directive handling with deferred {{ mount 1 }} resolution This reduces code duplication, improves readability, and makes the storage handling logic more consistent across different storage types. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Alex Lyn	27c59f15a0	agent: Register MultiLayerErofsHandler and process multiple EROFS Introduce MultiLayerErofsHandler and method of handle_multi_layer_storage for multi-layer storage: (1) Register MultiLayerErofsHandler to STORAGE_HANDLERS to handle multi-layer EROFS storage with driver type 'multi-layer-erofs'. (2) Add handle_multi_layer_erofs function to process multiple EROFS storages with X-kata.multi-layer marker together in guest. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Alex Lyn	6ce9180333	agent: Add support for EROFS rootfs handling in kata-agent Add multi_layer_erofs.rs implementing the guest-side processing logic for multi-layer EROFS rootfs with overlay mount support. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-19 13:24:31 +02:00
Fabiano Fidêncio	64c139208f	agent: add GetDiagnosticData RPC with termination log support Add a new extensible GetDiagnosticData RPC that retrieves diagnostic information from the guest VM. The request carries a log_type string field to specify what kind of data is requested, and a container_id field to identify the target container. The first supported log_type is "termination_log", which reads the Kubernetes termination message file from inside the guest. This is needed for shared_fs=none configurations where the host cannot directly access the guest filesystem. On the Go runtime side, the container stop() path now calls GetDiagnosticData to copy the termination message to the host when running with NoSharedFS and the terminationMessagePolicy annotation is set to "File". The call is best-effort: failures are logged as warnings rather than blocking container teardown. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2026-04-17 13:01:13 +02:00
Fabiano Fidêncio	36a2d8e7f2	agent: Make launch_process_timeout configurable The hardcoded DEFAULT_LAUNCH_PROCESS_TIMEOUT of 6 seconds in the kata agent is insufficient for environments with NVIDIA GPUs and NVSwitches, where the attestation-agent needs significantly more time to collect evidence during initialization (e.g. ~2 seconds per NVSwitch). When the timeout expires, the agent (PID 1) exits with an error, causing the guest kernel to perform an orderly shutdown before the attestation-agent has finished starting. Make this timeout configurable via the kernel parameter agent.launch_process_timeout (in seconds), preserving the 6-second default for backward compatibility. The Go runtime is wired up to pass this value from the TOML config's [agent.kata] section through to the kernel command line. The NVIDIA GPU configs set the new default to 15 seconds. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-10 14:47:01 +02:00
Alex Lyn	2bac201364	agent: Remove virtio-9p storage handler Remove the Virtio9pHandler implementation and its registration from the storage handler manager: (1) Remove Virtio9pHandler struct and StorageHandler implementation. (2) Remove DRIVER_9P_TYPE and Virtio9pHandler from STORAGE_HANDLERS registration. (3) Update watcher.rs comments to remove 9p references. This completes the removal of virtio-9p support in the agent. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-04-07 23:15:39 +02:00
stevenhorsman	5390e470d3	agent: Remove Cargo.lock Following on from #12690, the agent is part of the repo workspace, so no longer needs a lock file. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-01 09:11:28 +01:00
Jiahao Wang	1163b6581f	agent: Change TARGET_PATH to root workspace After agent was moved to root workspace, the products are now under the repo root. Change the TARGET_PATH accordingly to tell Makefile where to lookup output. Signed-off-by: Jiahao Wang <jiahao.wang@lingcage.com>	2026-03-29 06:35:54 +00:00
Jiahao Wang	29e5d5d951	build: Move agent to root workspace This commit adds kata agent to the root workspace, as a follow up work of #12413. Remove agent from exclude list, and make it as a member of root workspace. Signed-off-by: Jiahao Wang <jiahao.wang@lingcage.com>	2026-03-29 06:35:38 +00:00
Fabiano Fidêncio	aa6890eae1	Merge pull request #12675 from manuelh-dev/mahuber/cdh-storage-options agent: add mkfs_opts parameter to cdh_secure_mount	2026-03-23 15:18:38 +01:00
stevenhorsman	38a655487f	vsock-exporter: Switch bincode for serde_json bincode is not maintained, so switch to serde_json to resolve RUSTSEC-2025-0141 Assisted-By: Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:45:17 +00:00
stevenhorsman	e1d7d5bef8	agent: Remove async-std It's a dev-dependency that doesn't seem to be used, so remove it and resolve RUSTSEC-2025-0052 Assisted-By: Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:45:17 +00:00
stevenhorsman	e4eda5e1d8	agent: Bump tracing-subscriber - Bump tracing-subscriber to 0.3.20 to resolve RUSTSEC-2025-0055 - Switch deprecated `slog_info!` for `slog::info!` Generated-By: Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-19 10:45:17 +00:00
Manuel Huber	62d74bb1fd	agent: add mkfs_opts parameter to cdh_secure_mount Add an mkfs_opts parameter to cdh_secure_mount so that its users can parametrize these options depending on their needs. For now, there are two users providing explicit values (container image layer storage and container data storage features). Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-18 15:37:32 -07:00
stevenhorsman	501578cc5a	agent: Remove non-idiomatic unwrap Calling .unwrap() after an .is_some() check is considered non-idiomatic, as it performs redundant work and makes the code more verbose. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-17 16:04:58 +00:00
Manuel Huber	169f92ff09	agent: cdh: Update CDH and API With the new CDH version, the secure_mount API changes. Further, the new CDH version no longer uses the luks-encrypt-storage script but utilizes libcryptsetup as well as mkfs.ext4 and dd. Hence, adapt some of the CDH and Kata components build steps Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-16 09:43:17 -07:00
Aurélien Bombo	9fe03fb170	genpolicy: Support trusted ephemeral data storage * Introduces a new cluster_config setting encrypted_emptydir defaulting to true. * Adapts genpolicy for encrypted emptyDirs. Crucially, the rules.rego change checks that the mount and the storage are well-formed together: * i_storage.source matches a known regex. * i_storage.mount_point == $(spath)/BASE64(i_storage.source) * i_storage.mount_point == p_storage.mount_point * i_storage.mount_point == i_mount.source Note that policy enforcement is necessary to prevent rogue device injection. E.g. the agent could not blindly encrypt all block devices as some use cases only need dm-verity. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	eaa711617e	agent: Support trusted ephemeral data storage Handles block-based emptyDirs plugged via virtio-blk and virtio-scsi by encrypting and formatting them. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
stevenhorsman	8177a440ca	libs: Remove unused crates Remove unused crates to reduce our size and the work needed to do updates Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:37:46 +00:00
Alex Lyn	d298df7014	kata-types: Add cross-platform host_memory_mib() helper for host memory Introduce host_memory_mib() with OS-specific implementations (Linux/Android via nix::sysinfo, macOS via sysctl) selected at compile time. This improves portability and allows consistent host memory sizing/validation across different platforms. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 21:04:26 +08:00
Alex Lyn	b3d60698af	runtime-rs: move host memory adjustment into MemoryInfo using nix sysinfo The memory-related information has already been serialized at sandbox initialization, specifically when the configuration TOML is parsed. This commit therefore refactors the MemoryInfo initialization logic: (1) Remove memory sizing/host-memory adjustment logic from QEMU cmdline Memory::new() (2) Initialize/adjust memory values via kata-types MemoryInfo (single source of truth) (3) Replace sysinfo::System::new_with_specifics with nix::sys::sysinfo::sysinfo() to get host RAM Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 19:32:44 +08:00
Jacek Tomasiak	8025fa0457	agent: Don't pass empty options to mount With some older kernels some fs implementations don't handle empty options strings well. This leads to failures in "setup rootfs" step. E.g. `cgroup: cgroup2: unknown option ""`. This is fixed by mapping empty string to `None` before passing to `nix::mount`. Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2026-02-16 14:55:59 +01:00
stevenhorsman	90dbd3f562	agent: Bump rkyv version to 0.7.46 Bump to remediate RUSTSEC-2026-0001 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-14 00:33:45 +01:00
stevenhorsman	ffcb10b6a3	agent: Bump time crate to 0.3.47 Update time to resolve CVE-2026-25727. Note: this involved bumping the versions of slog-term and slog-json and bumping the MSRV to 1.88.0 which time 0.3.47 requires. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:44:51 +01:00
stevenhorsman	e49a61eea2	agent: Bump bytes to 1.11.1 To remediate CVE-2026-25541 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:43:23 +01:00
Alex Lyn	ffb8a6a9c3	agent: fix misleading tokio::select! biased comment in do_read_stream The previous comment incorrectly implied that `biased` prevents data loss and that the exit notifier would never be polled before all buffered data is read. The details can be found in the documentation: https://docs.rs/tokio/latest/src/tokio/macros/select.rs.html#67 Tokio's `biased` only makes polling order deterministic (top-to-bottom) when multiple branches are ready in the same poll, and it makes fairness the caller's responsibility. Output can still be truncated if the exit notification becomes ready while `read_stream` is pending. This change updates the comment to reflect the actual semantics and caveats. No functional behavior change. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Alex Lyn	1080f6d87e	agent: Introduce drain after exit mechanism to address truncation race Short-lived processes (e.g., `kubectl exec echo`) in legacy-io mode occasionally lose the last segments of their output. The root cause is a race condition where the `term_exit_notifier` triggers before the pipe buffers are fully drained. In the previous implementation, once the exit notification was received, the agent immediately returned an EOF, causing the runtime's `run_io_copy` to terminate and drop any residual data in the pipe. This patch introduces a "drain after exit" mechanism: - Upon receiving an exit notification, the agent enters a 500ms window for polling `read_stream` to flush remaining data from the buffer. - A true EOF is only returned if the stream is confirmed empty or the timeout is reached. This ensures reliable output delivery for transient exec tasks under high concurrency. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Alex Lyn	700bddeecc	agent: treat EOF as normal for read_stdout/stderr stream Legacy IO uses shim polling via read_stdout/read_stderr. The agent previously mapped pipe EOF (read() == 0) and term_exit_notifier to errors ("read meet eof"/"eof"), which became ttrpc INTERNAL failures. This caused runtime IO copy to abort early, leading to lost stdout/stderr for short-lived exec (e.g."echo") and spurious failures. Normalize EOF semantics: read_stream now returns Ok(empty) on EOF instead of Err("read meet eof"). This makes legacy IO behave like a proper stream: data until EOF, no INTERNAL errors for normal termination. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Zvonko Kaiser	7af306de13	agent: Update aarch64 create_pci_root_bus_path aarch64 is also a supported architecture for NUMA. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-02-09 10:19:41 +01:00
Zvonko Kaiser	8185c015ad	gpu: Add Agent NUMA Support 1 of N We're introducing a root_complex to assign each and every device to a NUMA node or to the default root_complex="00" aka pcie.0. This patch introduces proper handling of the qom path: currently it is bus/device == "00/02"; with NUMA we need to extend it to root_complex/bus/device == "10/00/02". We're defaulting to root_complex="00". Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-02-09 10:19:41 +01:00
Qingyuan Hou	ca43a8cbb8	agent: remove redundant func comment This comment was first introduced in `e111093` with secure_join() but then we forgot to remove it when we switched to the safe-path lib in `c0ceaf6` Signed-off-by: Qingyuan Hou <lenohou@gmail.com>	2026-01-27 03:07:57 +00:00
stevenhorsman	78824e0181	agent: Remove unnecessary unwrap Switch `is_some()` and then `unwrap()` for `if let` which is "more idiomatic" Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-21 08:53:40 +00:00
Greg Kurz	cf3441bd2c	agent: Refresh `Cargo.lock` Downstream builders at Red Hat complain that `Cargo.lock` doesn't match `Cargo.toml`. Run `cargo check` to refresh `Cargo.lock`. `git bisect` shows that `7cfb97d41b` is the first commit where `cargo check` has an effect in `src/agent`. Signed-off-by: Greg Kurz <groug@kaod.org>	2026-01-20 14:44:47 +01:00
Manuel Huber	183507beeb	agent: change secure_storage_integrity default Change the secure_storage_integrity option's default value to true. With this, integrity protection for encrypted block device contents will be requested from the confidential data hub by default, see the agent's cdh_handler_trusted_storage function in rpc.rs. This behavior can be disabled by explicitly setting the agent.secure_storage_integrity parameter to 0 or false via kernel command line parameters. This will affect the trusted storage implementation for the guest-pull mechanism, and it will affect future implementations using this code path, such as implementations for ephemeral secure storage. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-10 16:54:03 +01:00
stevenhorsman	b07899f8dc	agent: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:17 +00:00
stevenhorsman	2af88dbb48	agent: bump cdi-rs In #12151 the version was bumped in Cargo.toml, but the update was not done, so run `cargo update -p container-device-interface` to apply it Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-20 10:08:45 +00:00
stevenhorsman	0027f6cae0	agent: Fix dead_code warning VirtioBlkCcwDeviceHandler and VirtioBlkCcwHandler are only constructed on s390x, so add #[cfg(target_arch = "s390x")] to all the code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	9ec7109712	agent: Fix clippy::io_other_error issue We can use the new `Error::other` option rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	34d299ae44	vsock-exporter: Fix clippy::io_other_error issue We can use the new `Error::other` option rather than `Error::new(ErrorKind::Other, ...)` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
Fabiano Fidêncio	fb326b53df	agent: Ensure MS_REMOUNT is respected When updating ephemeral storages, MS_REMOUNT is explicitly passed as, for instance, `/dev/shm` should be remounted after memory is hotplugged. Till now Kata Containers has been explicitly ignoring such updates, leading to the containers' `/dev/shm` having the size of "half of the memory allocated, during the startup time", which goes against the expected behaviour. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-12-16 15:11:34 +01:00
Adeet Phanse	db09912808	agent: add SandboxError enum for typed error handling - Replace generic errors in sandbox operations with typed SandboxError variants (InvalidContainerId, InitProcessNotFound, InvalidExecId). - This enables the kata shim to handle specific failure cases differently. Fixes #12120 Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>	2025-12-12 12:33:18 -05:00
Zvonko Kaiser	9dfa6df2cb	agent: Bump CDI-rs to latest Latest version of container-device-interface is v0.1.1 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-27 22:57:50 +01:00
shwetha-s-poojary	4510e6b49e	agent: fix the list_routes failure relax list_routes tests so not every route requires a device Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>	2025-11-25 20:25:46 -08:00
Dan Mihai	22d60a36c0	agent: allow disabling detect_initdata_device Allow users to build the Kata Agent using INIT_DATA=no to disable the detect_initdata_device() code loop and associated debug log output. Future additional improvements related to Init Data are tracked by #11532. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-25 02:44:28 +00:00
dependabot[bot]	ede5ac9c2d	build(deps): bump the bit-vec group across 2 directories with 1 update Bumps the bit-vec group with 1 update in the /src/agent directory: [bit-vec](https://github.com/contain-rs/bit-vec). Bumps the bit-vec group with 1 update in the /src/tools/agent-ctl directory: [bit-vec](https://github.com/contain-rs/bit-vec). Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) --- updated-dependencies: - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-19 10:43:25 +01:00
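
The three symlink rules listed in commit 639ff3578d (genpolicy: restrict symlinks in CopyFile) can be illustrated with a small standalone check. This is only a sketch of the rules as the commit message states them, not the genpolicy implementation: the function name `symlink_allowed` and its two parameters are hypothetical, and it assumes the link location is given relative to the shared directory.

```rust
use std::path::{Component, Path};

/// Hypothetical helper (not part of genpolicy or the agent).
/// `link_path` is the symlink's location relative to the shared directory;
/// `target` is the path the symlink points to.
fn symlink_allowed(link_path: &Path, target: &Path) -> bool {
    // Rule 1: the symlink may not sit directly in the shared directory, so
    // its relative location needs at least two plain path components.
    let link_components: Vec<_> = link_path.components().collect();
    let nested = link_components.len() >= 2
        && link_components
            .iter()
            .all(|c| matches!(c, Component::Normal(_)));

    // Rule 2: the target must not contain `..` as a path element.
    // A name such as `..data` is an ordinary component and is not affected.
    let no_parent = !target
        .components()
        .any(|c| matches!(c, Component::ParentDir));

    // Rule 3: the target must be relative.
    let relative = target.is_relative();

    nested && no_parent && relative
}
```

Under these assumptions, a link at `configmap-vol/ca.crt` pointing to `..data/ca.crt` passes, while the same link pointing to `../secret` or `/etc/passwd`, or a link named `ca.crt` placed directly in the shared directory, is rejected.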

1425 Commits