kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-18 10:44:10 +00:00

Author	SHA1	Message	Date
stevenhorsman	8177a440ca	libs: Remove unused crates Remove unused crates to reduce our size and the work needed to do updates Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:37:46 +00:00
Alex Lyn	d298df7014	kata-types: Add cross-platform host_memory_mib() helper for host memory Introduce host_memory_mib() with OS-specific implementations (Linux/Android via nix::sysinfo, macOS via sysctl) selected at compile time. This improves portability and allows consistent host memory sizing/validation across different platforms. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 21:04:26 +08:00
Alex Lyn	b3d60698af	runtime-rs: move host memory adjustment into MemoryInfo using nix sysinfo As the memory-related information is serialized at sandbox initialization, specifically when the configuration TOML is parsed, this commit refactors the MemoryInfo initialization logic: (1) Remove memory sizing/host-memory adjustment logic from QEMU cmdline Memory::new() (2) Initialize/adjust memory values via kata-types MemoryInfo (single source of truth) (3) Replace sysinfo::System::new_with_specifics with nix::sys::sysinfo::sysinfo() to get host RAM Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 19:32:44 +08:00
Jacek Tomasiak	8025fa0457	agent: Don't pass empty options to mount With some older kernels some fs implementations don't handle empty options strings well. This leads to failures in "setup rootfs" step. E.g. `cgroup: cgroup2: unknown option ""`. This is fixed by mapping empty string to `None` before passing to `nix::mount`. Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2026-02-16 14:55:59 +01:00
stevenhorsman	90dbd3f562	agent: Bump rkyv version to 0.7.46 Bump to remediate RUSTSEC-2026-0001 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-14 00:33:45 +01:00
stevenhorsman	ffcb10b6a3	agent: Bump time crate to 0.3.47 Update time to resolve CVE-2026-25727. Note: this involved bumping the versions of slog-term and slog-json and bumping the MSRV to 1.88.0 which time 0.3.47 requires. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:44:51 +01:00
stevenhorsman	e49a61eea2	agent: Bump bytes to 1.11.1 To remediate CVE-2026-25541 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:43:23 +01:00
Alex Lyn	ffb8a6a9c3	agent: fix misleading tokio::select! biased comment in do_read_stream The previous comment incorrectly implied that `biased` prevents data loss and that the exit notifier would never be polled before all buffered data is read. Details are in the Tokio documentation: https://docs.rs/tokio/latest/src/tokio/macros/select.rs.html#67 Tokio's `biased` only makes polling order deterministic (top-to-bottom) when multiple branches are ready in the same poll, and it makes fairness the caller's responsibility. Output can still be truncated if the exit notification becomes ready while `read_stream` is pending. This change updates the comment to reflect the actual semantics and caveats. No functional behavior change. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Alex Lyn	1080f6d87e	agent: Introduce drain after exit mechanism to address truncation race Short-lived processes (e.g., `kubectl exec echo`) in legacy-io mode occasionally lose the last segments of their output. The root cause is a race condition where the `term_exit_notifier` triggers before the pipe buffers are fully drained. In the previous implementation, once the exit notification was received, the agent immediately returned an EOF, causing the runtime's `run_io_copy` to terminate and drop any residual data in the pipe. This patch introduces a "drain after exit" mechanism: - Upon receiving an exit notification, the agent enters a 500ms window for polling `read_stream` to flush remaining data from the buffer. - A true EOF is only returned if the stream is confirmed empty or the timeout is reached. This ensures reliable output delivery for transient exec tasks under high concurrency. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Alex Lyn	700bddeecc	agent: treat EOF as normal for read_stdout/stderr stream Legacy IO uses shim polling via read_stdout/read_stderr. The agent previously mapped pipe EOF (read() == 0) and term_exit_notifier to errors ("read meet eof"/"eof"), which became ttrpc INTERNAL failures. This caused runtime IO copy to abort early, leading to lost stdout/stderr for short-lived exec (e.g."echo") and spurious failures. Normalize EOF semantics: read_stream now returns Ok(empty) on EOF instead of Err("read meet eof"). This makes legacy IO behave like a proper stream: data until EOF, no INTERNAL errors for normal termination. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-09 15:56:13 +01:00
Zvonko Kaiser	7af306de13	agent: Update aarch64 create_pci_root_bus_path aarch64 is also a supported architecture for NUMA. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-02-09 10:19:41 +01:00
Zvonko Kaiser	8185c015ad	gpu: Add Agent NUMA Support 1 of N We're introducing a root_complex to assign each and every device to a NUMA node or to the default root_complex="00" aka pcie.0. This patch introduces the proper handling of the current qom path being bus/device == "00/02"; with NUMA we need to extend it with the root_complex/bus/device == "10/00/02". We're defaulting to root_complex="00". Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-02-09 10:19:41 +01:00
Qingyuan Hou	ca43a8cbb8	agent: remove redundant func comment This comment was first introduced in `e111093` with secure_join() but then we forgot to remove it when we switched to the safe-path lib in `c0ceaf6` Signed-off-by: Qingyuan Hou <lenohou@gmail.com>	2026-01-27 03:07:57 +00:00
stevenhorsman	78824e0181	agent: Remove unnecessary unwrap Switch `is_some()` and then `unwrap()` for `if let` which is "more idiomatic" Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-21 08:53:40 +00:00
Greg Kurz	cf3441bd2c	agent: Refresh `Cargo.lock` Downstream builders at Red Hat complain that `Cargo.lock` doesn't match `Cargo.toml`. Run `cargo check` to refresh `Cargo.lock`. `git bisect` shows that `7cfb97d41b` is the first commit where `cargo check` has an effect in `src/agent`. Signed-off-by: Greg Kurz <groug@kaod.org>	2026-01-20 14:44:47 +01:00
Manuel Huber	183507beeb	agent: change secure_storage_integrity default Change the secure_storage_integrity option's default value to true. With this, integrity protection for encrypted block device contents will be requested from the confidential data hub by default, see the agent's cdh_handler_trusted_storage function in rpc.rs. This behavior can be disabled by explicitly setting the agent.secure_storage_integrity parameter to 0 or false via kernel command line parameters. This will affect the trusted storage implementation for the guest-pull mechanism, and it will affect future implementations using this code path, such as implementations for ephemeral secure storage. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-01-10 16:54:03 +01:00
stevenhorsman	b07899f8dc	agent: Fix uninlined_format_args Clippy is recommending that format args are inlined for better clarity, so update our code to remove these warnings Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-22 19:49:17 +00:00
stevenhorsman	2af88dbb48	agent: bump cdi-rs In #12151 the version was bumped in Cargo.toml, but the update was not done, so run `cargo update -p container-device-interface` to apply it Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-20 10:08:45 +00:00
stevenhorsman	0027f6cae0	agent: Fix dead_code warning VirtioBlkCcwDeviceHandler and VirtioBlkCcwHandler are only constructed on s390x, so add #[cfg(target_arch = "s390x")] to all the code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:27 +00:00
stevenhorsman	9ec7109712	agent: Fix clippy::io_other_error issue We can use the new Error::other constructor rather than Error::new(ErrorKind::Other, ...) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
stevenhorsman	34d299ae44	vsock-exporter: Fix clippy::io_other_error issue We can use the new Error::other constructor rather than Error::new(ErrorKind::Other, ...) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-12-18 07:45:26 +00:00
Fabiano Fidêncio	fb326b53df	agent: Ensure MS_REMOUNT is respected When updating ephemeral storages, MS_REMOUNT is explicitly passed as, for instance, `/dev/shm` should be remounted after memory is hotplugged. Till now Kata Containers has been explicitly ignoring such updates, leading to the containers' `/dev/shm` having the size of "half of the memory allocated, during the startup time", which goes against the expected behaviour. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-12-16 15:11:34 +01:00
Adeet Phanse	db09912808	agent: add SandboxError enum for typed error handling - Replace generic errors in sandbox operations with typed SandboxError variants (InvalidContainerId, InitProcessNotFound, InvalidExecId). - This enables the kata shim to handle specific failure cases differently. Fixes #12120 Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>	2025-12-12 12:33:18 -05:00
Zvonko Kaiser	9dfa6df2cb	agent: Bump CDI-rs to latest Latest version of container-device-interface is v0.1.1 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-11-27 22:57:50 +01:00
shwetha-s-poojary	4510e6b49e	agent: fix the list_routes failure relax list_routes tests so not every route requires a device Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>	2025-11-25 20:25:46 -08:00
Dan Mihai	22d60a36c0	agent: allow disabling detect_initdata_device Allow users to build the Kata Agent using INIT_DATA=no to disable the detect_initdata_device() code loop and associated debug log output. Future additional improvements related to Init Data are tracked by #11532. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-25 02:44:28 +00:00
dependabot[bot]	ede5ac9c2d	build(deps): bump the bit-vec group across 2 directories with 1 update Bumps the bit-vec group with 1 update in the /src/agent directory: [bit-vec](https://github.com/contain-rs/bit-vec). Bumps the bit-vec group with 1 update in the /src/tools/agent-ctl directory: [bit-vec](https://github.com/contain-rs/bit-vec). Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) Updates `bit-vec` from 0.6.3 to 0.8.0 - [Changelog](https://github.com/contain-rs/bit-vec/blob/master/RELEASES.md) - [Commits](https://github.com/contain-rs/bit-vec/commits) --- updated-dependencies: - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec - dependency-name: bit-vec dependency-version: 0.8.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: bit-vec ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-19 10:43:25 +01:00
Markus Rudy	b771bb6ed3	genpolicy: log requests as jsonlines The current format of genpolicy request logs looks a bit like JSON, but it does not parse out of the box and needs post-processing with sed, for example. This commit changes the log format to jsonlines[1], which is basically newline-delimited compact JSON values. Compared to standard JSON, this allows streaming output. The resulting file can be converted and processed programmatically, for example with `jq -s`. The fields are also adjusted to match the field names of TestRequest, so that the logged requests can be used immediately in tests. [1]: https://jsonlines.org/ Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-11-17 09:01:00 +01:00
Alex Lyn	7423eb7a30	agent: Support both virtio-blk and virtio-scsi devices for initdata Currently, the initdata module only detects virtio-blk devices (/dev/vd) when searching for the initdata block device. However, when using virtio-scsi, the devices appear as /dev/sd in the guest, causing the initdata detection to fail. This commit extends the device detection logic to support both device types: - virtio-blk devices: /dev/vda, /dev/vdb, etc. - virtio-scsi devices: /dev/sda, /dev/sdb, etc. This commit aims to address the issue of the initdata device not being found when using virtio-scsi. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-10 18:03:23 +01:00
Dan Mihai	7b10f4c72a	agent: update version.rs when VERSION file changed - version.rs gets generated from version.rs.in - version.rs.in contains values read from VERSION - so version.rs (and maybe other Agent files too) must be re-generated when the VERSION file changes Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 17:53:09 +00:00
Hyounggyu Choi	32da38273a	agent/tests: Skip if kernel module is not found On IBM actionspz Z runners, the following error occurs when running `modprobe`: ``` modprobe: FATAL: Module bridge not found in directory /lib/modules/6.8.0-85-generic ``` Additionally, there are no files under `/lib/modules`, for example: ``` total 0 drwxr-xr-x 1 root root 0 Aug 5 13:09 . drwxr-xr-x 1 root root 2.0K Oct 1 22:59 .. ``` This commit skips the `test_load_kernel_module` test if the module is not found or if running `modprobe` is not permitted. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	075de4dc62	agent/tests: Skip test if error is EACCES (permission denied) On IBM actionspz Z runners, write operations on network interfaces are not allowed, even for the root user. This commit skips the `add_update_addresses` test if the operation fails with EACCES (-13, permission denied). Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	3f84b623a3	agent/tests: Skip RNG reseeding test on restricted environments On IBM actionspz Z runners, the ioctl system call is not allowed even for the root user. There is likely an additional security mechanism (such as AppArmor or seccomp) in place on Ubuntu runners. This commit introduces a new helper, `is_permission_error()`, which skips the test if ioctl operations in `reseed_rng()` are not permitted. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	c2abc4da34	agent/tests: Use detected filesystem for baremounted points The IBM actionspz Z runners mount /dev as tmpfs, while other systems use devtmpfs. This difference causes an assertion failure for test_already_baremounted. This commit sets the detected filesystem for bare-mounted points as the expected value. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Hyounggyu Choi	faa048893d	agent/tests: Handle error messages differently based on root filesystem The root filesystem for IBM actionspz Z runners is `btrfs` instead of `ext4`. The error message differs when an unprivileged user tries to perform a bind mount. This commit adjusts the handling of error messages based on the detected root filesystem type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-11-05 16:35:04 +00:00
Ruoqing He	7bb28d8da7	libs: Move `mem-agent` into `src/libs` `mem-agent` now does not ship example binaries and serves as a library for `agent` to reference, so we move it into `libs` to better manage it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Ruoqing He	f0e223c535	mem-agent: Rename `mem-agent-lib` to `mem-agent` Rename `mem-agent-lib` to `mem-agent` before we move it into `src/libs`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-10-22 03:26:35 +00:00
Hyounggyu Choi	88c333f2a6	agent: Fix race in tests calling LinuxContainer::new() We fix the following error: ``` thread 'sandbox::tests::add_and_get_container' panicked at src/sandbox.rs:901:10: called `Result::unwrap()` on an `Err` value: Create cgroupfs manager Caused by: 0: fs error caused by: Os { code: 17, kind: AlreadyExists, message: "File exists" } 1: File exists (os error 17) note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` by ensuring that the cgroup path is unique for tests run in the same millisecond. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-15 11:32:22 +02:00
Hyounggyu Choi	8412af919d	agent/netlink: Attempt to fix ARP and routes tests test_add_one_arp_neighbor ========================= We attempt to fix the following error: ``` thread 'netlink::tests::test_add_one_arp_neighbor' panicked at src/netlink.rs:1163:9: assertion `left == right` failed left: "" right: "192.0.2.127 lladdr 6a:92:3a:59:70:aa PERMANENT" ``` by adding a sleep to prepare_env_for_test_add_one_arp_neighbor() to wait for the kernel interfaces to settle. list_routes =========== We attempt to fix the following error (notice that the available devices contain "dummy_for_arp"): ``` thread 'netlink::tests::list_routes' panicked at src/netlink.rs:986:14: Failed to list routes: available devices: [Interface { device: "", name: "lo", IPAddresses: [IPAddress { family: v6, address: "127.0.0.1", mask: "8", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v6, address: "169.254.1.1", mask: "31", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "2001:db8:85a3::8a2e:370:7334", mask: "128", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "::1", mask: "128", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 65536, hwAddr: "00:00:00:00:00:00", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "enc0", IPAddresses: [IPAddress { family: v6, address: "10.249.65.4", mask: "24", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::4ff:fe57:b3e4", mask: "64", 
special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "02:00:04:57:B3:E4", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "docker0", IPAddresses: [IPAddress { family: v6, address: "172.17.0.1", mask: "16", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::42:56ff:fe5c:d9f9", mask: "64", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "02:42:56:5C:D9:F9", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, Interface { device: "", name: "dummy_for_arp", IPAddresses: [IPAddress { family: v6, address: "192.0.2.2", mask: "24", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }, IPAddress { family: v4, address: "fe80::f4f2:64ff:fe46:2b01", mask: "64", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }], mtu: 1500, hwAddr: "4A:73:DE:A3:07:64", devicePath: "", type_: "", raw_flags: 0, special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }] Caused by: 0: error looking up device 19888 1: Received a netlink error message No such device (os error 19) ``` by calling clean_env_for_test_add_one_arp_neighbor() at the start of the test. However this fix is uncertain: the original assumption for the fix was that the "dummy_for_arp" interface left over from test_add_one_arp_neighbor was the cause of the error. 
But (3) below shows that running list_routes in isolation while that interface is present is NOT enough to repro the error: 1. Running all tests + no clean_env in list_routes => list_routes FAILS (before this PR) 2. Running all tests + clean_env in list_routes => list_routes PASSES (after this PR) 3. Running only list_routes + dummy_for_arp present => list_routes PASSES (manual test, see below) ``` $ ip a l 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet 169.254.1.1/31 brd 169.254.1.1 scope global lo valid_lft forever preferred_lft forever inet6 2001:db8:85a3::8a2e:370:7334/128 scope global valid_lft forever preferred_lft forever inet6 ::1/128 scope host noprefixroute valid_lft forever preferred_lft forever 2: enc0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 02:00:01:02:e2:47 brd ff:ff:ff:ff:ff:ff inet 10.240.64.4/24 metric 100 brd 10.240.64.255 scope global dynamic enc0 valid_lft 159sec preferred_lft 159sec inet6 fe80::1ff:fe02:e247/64 scope link valid_lft forever preferred_lft forever 311: dummy_for_arp: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether ee:79:66:3a:dc:bc brd ff:ff:ff:ff:ff:ff inet 192.0.2.2/24 scope global dummy_for_arp valid_lft forever preferred_lft forever inet6 fe80::4c2e:83ff:fe7d:ef00/64 scope link valid_lft forever preferred_lft forever $ sudo -E PATH=$PATH make test ../../utils.mk:162: "WARNING: s390x-unknown-linux-musl target is unavailable" Finished `test` profile [unoptimized + debuginfo] target(s) in 0.25s Running unittests src/main.rs (target/s390x-unknown-linux-gnu/debug/deps/kata_agent-b2b5b200deca712e) running 1 test test netlink::tests::list_routes ... ok test result: ok. 
1 passed; 0 failed; 0 ignored; 0 measured; 224 filtered out; finished in 0.00s ``` Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-15 11:32:22 +02:00
Zvonko Kaiser	10f8ec0c20	cdi: Add Crate remove Github Hash Use CDI exclusively from crates.io and not from a GH repository. Cargo can easily check if a new version is available and we can bump it far more easily if needed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-15 09:22:20 +02:00
Aurélien Bombo	eeecd6d72b	Merge pull request #11872 from kata-containers/sprt/rust-use-uninit agent/rustjail: Fix potentially uninitialized memory read in unsafe code	2025-10-02 10:39:25 -05:00
Markus Rudy	507a0e09f3	agent: use TEST-NET-1 addresses for netlink tests test_add_one_arp_neighbor modifies the root network namespace, so we should ensure that it does not interfere with normal network setup. Adding an IP to a device results in automatic routes, which may affect routing to non-test endpoints. Thus, we change the addresses used in the test to come from TEST-NET-1, which is designated for tests and usually not routable. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-01 09:00:52 +02:00
Markus Rudy	bbc006ab7c	agent: add debug info to netlink tests list_routes and test_add_one_arp_neighbor have been flaky in the past (#10856), but it's been hard to tell what exactly is going wrong. This commit adds debug information for the most likely problem in list_routes: devices being added/removed/modified concurrently. Furthermore, it adds the exit code and stderr of the ip command, in case it failed to list the ARP neighborhood. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-10-01 09:00:52 +02:00
Aurélien Bombo	a3669d499a	agent/rustjail: Fix potentially uninitialized memory read in unsafe code The previous code only checked the result of with_nix_path(), not statfs(), thus leading to an uninitialized memory read if statfs() failed. No functional change otherwise. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-30 15:48:07 -05:00
Markus Rudy	369aed0203	kata-types: conditionally include safe-path Most of the kata-types code is reusable across platforms. However, some functions in the mount module require safe-path, which is Linux-specific and can't be used on other platforms, notably darwin. This commit adds a new feature `safe-path` to kata-types, which enables the functions that use safe-path. The Linux-only callers kata-ctl and runtime-rs enable this feature, whereas genpolicy only needs initdata and does not need the functions from the mount module. Using a feature instead of a target_os restriction ensures that the developer experience for genpolicy remains the same. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-09-29 09:48:32 +02:00
Aurélien Bombo	dedd833cdd	agent: Add note about future breaking change in nix Tracked in #11842. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-23 16:23:54 -05:00
Aurélien Bombo	ecb22cb3e3	agent/rustjail: Fix double free in TTY handling The repro below would show this error in the logs (in debug mode only): fatal runtime error: IO Safety violation: owned file descriptor already closed The issue was that the `pseudo.slave` file descriptor was being owned by multiple variables simultaneously. When any of those variables would go out of scope, they would close the same file descriptor, which is undefined behavior. To fix this, we clone: we create a new file descriptOR that refers to the same file descriptION as the original. When the cloned descriptor is closed, this affects neither the original descriptor nor the description. Only when the last descriptor is closed does the kernel clean up the description. Note that we purposely consume (not clone) the original descriptor with `child_stdin` as `pseudo` is NOT dropped automatically. Repro ----- Prerequisites: - Use Rust 1.80+. - Build the agent in debug mode. $ cat busybox.yaml apiVersion: v1 kind: Pod metadata: name: busybox spec: containers: - image: busybox:latest name: busybox runtimeClassName: kata $ kubectl apply -f busybox.yaml pod/busybox created $ kubectl exec -it busybox -- sh error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "e6c602352849647201860c1e1888d99ea3166512f1cc548b9d7f2533129508a9": cannot enter container 76a499cbf747b9806689e51f6ba35e46d735064a3f176f9be034777e93a242d5, with err ttrpc: closed Fixes: #11054 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-09-23 16:23:50 -05:00
Fabiano Fidêncio	bfc54d904a	agent: Fix format issues In the previous commit we've added some code that broke `cargo fmt -- --check` without even noticing, as the code didn't go through the CI process (due to it being a security advisory). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-09-23 16:47:39 +02:00
Fabiano Fidêncio	96108006f2	agent: Panic on errors accessing the attestation agent binary Let's make sure that whenever we try to access the attestation agent binary, we only proceed with the startup in case: * the binary is found (CoCo case) * the binary is not present (non-CoCo case) In case of any error that's not `NotFound`, we should simply abort as that could mean a potential tampering with the binary (which would be reported as an EIO). Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-09-16 21:35:00 +02:00
Dan Mihai	bc75f6a158	Merge pull request #11783 from billionairiam/agenttypo kata-agent: Rename misleading variable in config parsing	2025-09-16 11:07:17 -07:00
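
Commit 8025fa0457 in the listing above describes mapping an empty mount options string to `None` before it is passed to `nix::mount`. The following is a minimal sketch of that mapping using only the standard library. The helper name `mount_data` is hypothetical and is not taken from the agent source; the actual change may be structured differently.

```rust
/// Hypothetical helper (not from the agent source): map an empty mount
/// options string to `None`, so that no empty data string is handed to
/// the mount call.
fn mount_data(options: &str) -> Option<&str> {
    if options.is_empty() {
        // Some older kernels reject an empty option string, e.g. with
        // `cgroup: cgroup2: unknown option ""`.
        None
    } else {
        Some(options)
    }
}
```

The returned value could then be supplied as the `data` argument of `nix::mount::mount`, which takes an `Option`.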

1402 Commits
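
Commits 9ec7109712 and 34d299ae44 in the listing replace the older `ErrorKind::Other` construction with `Error::other`, as suggested by the `clippy::io_other_error` lint. The sketch below contrasts the two forms with the standard library only. The function names are illustrative assumptions and do not come from the agent or vsock-exporter code.

```rust
use std::io;

/// Older form, the one flagged by `clippy::io_other_error`.
/// (Illustrative function name, not from the repository.)
fn other_error_old(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::Other, msg)
}

/// Equivalent form using `io::Error::other`, available in the
/// standard library since Rust 1.74.
fn other_error_new(msg: &str) -> io::Error {
    io::Error::other(msg)
}
```

Both functions produce an error whose kind is `ErrorKind::Other` and whose display text is the given message, so the change is purely a readability improvement.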