kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-17 04:52:23 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	346119108e	kata-deploy: drop unused kube features The binary doesn't use kube::runtime (controllers, watchers, reflectors) or kube::derive (the CustomResource macro). Pulling them in only added transitive deps (kube-runtime, kube-derive, backon, educe, ahash, async-broadcast, ...) and inflated the binary's static data segment for no functional gain. Set default-features = false and select only what the binary actually calls into: the kube-client surface plus the rustls-tls backend that hyper-rustls already pulled in transitively. Behaviour is unchanged. Fixes: https://github.com/kata-containers/kata-containers/discussions/12976 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Cursor <cursoragent@cursor.com>	2026-05-07 13:40:55 +02:00
Fabiano Fidêncio	8a33007806	runtime-rs: Add configuration-qemu-nvidia-gpu-tdx-runtime-rs.toml.in Add a new runtime-rs configuration template that combines the NVIDIA GPU cold-plug stack with Intel TDX confidential guest support. This is the runtime-rs counterpart of the Go runtime's configuration-qemu-nvidia-gpu-tdx template. The template merges the GPU NV settings (VFIO cold-plug, Pod Resources API, NV-specific kernel/image/firmware, extended timeouts) with TDX confidential guest settings (confidential_guest, OVMF.inteltdx.fd firmware, TDX Quote Generation Service socket, confidential NV kernel and image). The Makefile is updated with the new config file registration and the FIRMWARETDVFPATH_NV variable pointing to OVMF.inteltdx.fd. Also removes a stray tdx_quote_generation_service_socket_port setting from the SNP GPU template where it did not belong. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-05-07 10:33:26 +02:00
Alex Lyn	4f618d09d5	runtime-rs: Add Pod Resources CDI discovery in sandbox Query the kubelet Pod Resources API during sandbox setup to discover which GPU devices have been allocated to the pod. When cold_plug_vfio is enabled, the sandbox resolves CDI device specs, extracts host PCI addresses and IOMMU groups from sysfs, and creates VfioModernCfg device entries that get passed to the hypervisor for cold-plug. Add pod-resources and cdi crate dependencies to the runtimes and virt_container workspace members. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-07 10:33:26 +02:00
Alex Lyn	e72ed1c12e	runtime-rs: Add VFIO modern device driver Add the VfioDeviceModern driver for VFIO device passthrough in runtime-rs. The driver handles device discovery through sysfs, detects whether the host uses iommufd cdev or legacy VFIO group interfaces, resolves PCI BDF addresses and IOMMU groups, and implements the Device and PCIeDevice traits for hypervisor integration. The module is structured as: - core.rs: sysfs discovery, BDF parsing, IOMMU group resolution, device-node path logic for both iommufd cdev and legacy group paths - device.rs: VfioDeviceModern/VfioDeviceModernHandle types, Device and PCIeDevice trait implementations - mod.rs: host capability detection (iommufd vs legacy), backend selection logic The DeviceType::VfioModern enum variant and stub PCIeTopology methods (reserve_bus_for_device, release_bus_for_device) are added so the driver compiles; full topology wiring follows in a subsequent commit. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-07 10:33:26 +02:00
Alex Lyn	b4768cfc61	dragonball: Adapt VFIO DMA calls to vfio-ioctls 0.6 API The vfio-ioctls 0.6.0 crate changed the vfio_dma_map signature: the host address parameter is now a raw pointer (*mut u8) instead of u64, and the size parameter is usize instead of u64. Since the kernel uses the host address to set up DMA mappings to physical memory — and the caller must guarantee the memory behind that pointer remains valid for the lifetime of the mapping — upstream marked vfio_dma_map as unsafe fn. Wrap vfio_dma_map calls in unsafe blocks and adjust the type casts accordingly. vfio_dma_unmap only needed the usize cast for the size parameter (it does not take a host address, so it remains safe). Bump workspace dependencies: - vfio-bindings 0.6.1 -> 0.6.2 - vfio-ioctls 0.5.0 -> 0.6.0 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-07 10:33:26 +02:00
Alex Lyn	0bb9b66815	kata-sys-util: Add PCI helpers for VFIO cold-plug paths The VFIO cold-plug path needs to resolve a PCI device's sysfs address from its /dev/vfio/ group or iommufd cdev node. Extend the PCI helpers in kata-sys-util to support this: add a function that walks /sys/bus/pci/devices to find a device by its IOMMU group, and expose the guest BDF that the QEMU command line will reference. These helpers are consumed by the runtime-rs hypervisor crate when building VFIO device descriptors for the QEMU command line. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-07 10:33:26 +02:00
Alex Lyn	1e96e75bf3	pod-resources-rs: Add kubelet Pod Resources API client Add a gRPC client crate that speaks the kubelet PodResourcesLister service (v1). The runtime-rs VFIO cold-plug path needs this to discover which GPU devices the kubelet has assigned to a pod so they can be passed through to the guest before the VM boots. The crate is intentionally kept minimal: it wraps the upstream pod_resources.proto, exposes a Unix-domain-socket client, and re-exports the generated types. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-07 10:33:26 +02:00
dependabot[bot]	8cc9325fee	build(deps): bump openssl from 0.10.78 to 0.10.79 Bumps [openssl](https://github.com/rust-openssl/rust-openssl) from 0.10.78 to 0.10.79. - [Release notes](https://github.com/rust-openssl/rust-openssl/releases) - [Commits](https://github.com/rust-openssl/rust-openssl/compare/openssl-v0.10.78...openssl-v0.10.79) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.79 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-05-06 10:19:15 +00:00
Fabiano Fidêncio	210ad5de98	runtime-rs: Bump netlinks for Linux 6.17+ IPv6 dev conf RTNetlink Upgrade netlink-packet-route and rtnetlink so IFLA_INET6_CONF matches the kernel's 240-byte layout (DEVCONF_FORCE_FORWARDING). Adapt to API changes: NeighbourAttribute::LinkLayerAddress and bool MulticastSnooping. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-05 13:56:44 +02:00
stevenhorsman	efe62c9280	kata-ctl: Move into root workspace Add kata-ctl to be a workspace member to simplify the dependency management. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-30 08:45:27 +01:00
stevenhorsman	7664ebda7e	trace-forwarder: Move into root workspace Add trace-forwarder to be a workspace member to simplify the dependency management. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-29 12:11:04 +01:00
Fabiano Fidêncio	cbd71f534e	kata-sys-util: add oci_docker module for Docker netns detection Docker 26+ with `runtimeType` shims may not include a network namespace in the OCI spec's `linux.namespaces` and instead uses `libnetwork-setkey` hooks to communicate the sandbox ID. Add helpers to detect Docker containers and resolve the netns path from hook arguments, matching the Go runtime's `DockerNetnsPath` and `IsDockerContainer` utilities. Fixes: #9340 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-28 10:20:18 +02:00
Fabiano Fidêncio	74d9d043f0	agent: raise regorus policy length limits regorus 0.9.0 introduced a hard, per-engine ceiling on parsed-policy size (1024 columns / 1 MiB / 20 000 lines, see lexer.rs:30 in microsoft/regorus). The 1024-column cap rejects realistic policies emitted by `genpolicy`: the `NVIDIA_REQUIRE_CUDA` environment variable on `nvcr.io/nvidia/k8s/cuda-sample` is roughly 1.3 KiB on a single line, so the agent's `set_policy()` returns an error, the agent (PID 1) exits, the guest kernel reboots, and the runtime eventually times out connecting to the agent's vsock. regorus PR #624 ("feat: make policy length limits configurable per engine") adds `Engine::set_policy_length_config`, but it has not been released yet -- the latest published version is still 0.9.1, which predates that change. Pin `regorus` to the upstream commit that includes #624 and call the new setter from `AgentPolicy::new_engine()` with values that comfortably fit any policy we expect to evaluate (64 KiB per line, 16 MiB per file, 200 000 lines) while still rejecting pathological/minified input. Once a regorus release > 0.9.1 ships with #624, the dependency can be moved back to crates.io. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-26 10:18:26 +02:00
Markus Rudy	c8fe6a60d0	genpolicy: update regorus to 0.9.1 The version we used before was released in 2024, it's about time to use a newer version. The new version of the crate comes with a license, which addresses a `cargo deny` finding. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-04-26 10:18:26 +02:00
Steve Horsman	fc359d2140	Merge pull request #12901 from kata-containers/dependabot/cargo/openssl-0.10.78 build(deps): bump openssl from 0.10.76 to 0.10.78	2026-04-25 20:59:51 +01:00
dependabot[bot]	151a797fc0	build(deps): bump openssl from 0.10.76 to 0.10.78 Bumps [openssl](https://github.com/rust-openssl/rust-openssl) from 0.10.76 to 0.10.78. - [Release notes](https://github.com/rust-openssl/rust-openssl/releases) - [Commits](https://github.com/rust-openssl/rust-openssl/compare/openssl-v0.10.76...openssl-v0.10.78) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.78 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-25 10:28:48 +00:00
stevenhorsman	d6df75853b	versions: Update rustls-webpki to 0.103.13 Simple bump to fix CVE GHSA-82j2-j2ch-gfr8: Denial of service via panic on malformed CRL BIT STRING Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-25 11:27:02 +01:00
Fabiano Fidêncio	e0927e0e0c	Merge pull request #12846 from RainaYL/rainax/split_irqchip_pr dragonball: Implement userspace IOAPIC to enable split irqchip	2026-04-24 19:07:45 +02:00
Anjana A R K	d2e0e277cc	kata-agent: Bump serde-enum-str to v0.5.0 Upgraded the serde-enum-str to v0.5.0 which bumps serde-attributes to 0.3.0 version Signed-off-by: Anjana A R K <anjana.a.r.k1@ibm.com>	2026-04-24 15:57:59 +05:30
Xiaofan Xxf	fd39117a21	dragonball: Implement userspace IOAPIC to enable split irqchip From Linux 6.14, creating a TDX VM requires that split irqchip is enabled. Under this circumstance, device IOAPIC would be managed in userspace, instead of KVM, so a manager is needed to handle MMIO read/write to emulated IOAPIC registers. Also, with split irqchip, irqfd is no longer able to trigger an interrupt after device IO is completed. Instead, KVM_SIGNAL_MSI is used for interrupt triggering. Note that only legacy irq with edge-triggered interrupt is implemented here. And split irqchip feature is only enabled when confidential VM type is set to TDX. Signed-off-by: Xiaofan Xxf <xiaofan.xxf@antgroup.com>	2026-04-24 10:33:05 +08:00
Fupan Li	18378145d2	Merge pull request #12821 from fidencio/topic/runtime-rs-cpu-pinning runtime-rs: Add vCPU thread pinning support	2026-04-23 16:49:18 +08:00
Markus Rudy	639ff3578d	genpolicy: restrict symlinks in CopyFile Allowing arbitrary symlinks in the shared directory is unsafe for confidential VM use cases. In order to make CopyFile safe both for the VM and for the consuming containers, we implement the following rules for symlinks (in addition to the existing rules for other files): 1. Symlinks may not be placed directly into the shared directory. 2. Symlinks must not point 'upwards', i.e. contain `..` as a path element. 3. Symlinks must be relative. These rules ensure that all writes initiated by CopyFile are restricted to the shared directory (protecting the VM), and that symlinks can't point outside their mount points (protecting the container). These new restrictions mean that we can't support arbitrary mount sources (which might not follow these rules), but the usual k8s suspects (ConfigMap, Secret, ServiceAccountToken) should still pass. In order to aid writing the policy, we convert the CopyFileRequest to a structure that does not contain binary data, but well-defined strings and types. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-04-22 15:46:12 +02:00
Fabiano Fidêncio	48669a894e	runtime-rs: Add vCPU thread pinning support Port the Go runtime's enable_vcpus_pinning feature to runtime-rs. The Go runtime already lets users pin each vCPU thread to a specific host CPU when the vCPU count matches the sandbox cpuset size, using sched_setaffinity. This is useful for latency-sensitive workloads that benefit from eliminating cross-CPU migration of vCPU threads. The approach mirrors the Go implementation: After VM start and on every container add/update/delete, we fetch the vCPU thread IDs (via QMP query-cpus-fast for QEMU), compute the union of all containers' OCI cpusets, and if the two counts match, pin vCPU i to cpuset[i]. If they diverge (hotplug, container removal, etc.) we reset all threads back to the full cpuset so nothing gets stuck on a single core. The pinning check lives in CgroupsResourceInner::update_sandbox_cgroups, which already runs at exactly the right points in the lifecycle. The enable_vcpus_pinning flag flows from the TOML config through CgroupConfig into the cgroup resource layer, and can also be overridden per-pod via the io.katacontainers.config.runtime.enable_vcpus_pinning annotation. The QEMU config templates default to false. The NV GPU configs will get their own default (true) in a follow-up once those templates are added. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-21 12:45:56 +02:00
stevenhorsman	a59afa3154	versions: Update rustls-webpki to 0.103.12 Simple bump to fix CVEs: - RUSTSEC-2026-0098 - RUSTSEC-2026-0099 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-20 16:24:20 +01:00
stevenhorsman	35be1a938d	versions: Bump rand crate where possible Update all versions of rand that are controlled by us to remediate GHSA-cq8v-f236-94qc. Note: There are still some usages of rand 0.8.5 that are from transitive dependencies which we can't currently update: - fail - phf_generator - opentelemetry due to them being archived, or our usage being 17 versions out of date Also update the rand API breakages e.g. : - rand::thread_rng() → rand::rng() (function renamed) - rand::distributions::Alphanumeric → rand::distr::Alphanumeric (module renamed) - rng.gen_range() → rng.random_range() (function renamed) Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-17 15:58:58 +01:00
Fabiano Fidêncio	9e1f595160	kata-deploy: add Rust binary to root workspace Add tools/packaging/kata-deploy/binary as a workspace member, inherit shared dependency versions from the root manifest, and refresh Cargo.lock. Build the kata-deploy image from the repository root: copy the workspace layout into the rust-builder stage, run cargo test/build with -p kata-deploy, and adjust artifact and static asset COPY paths. Update the payload build script to invoke docker buildx with -f .../Dockerfile from the repo root. Add a repo-root .dockerignore to keep the Docker build context smaller. Document running unit tests with cargo test -p kata-deploy from the root. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-07 10:07:06 +08:00
Ruoqing He	2a024f55d0	libs: Move libs into root workspace Remove libs from exclude list, and move them explicitly into root workspace to make sure our core components are in a consistent state. This is a follow up of #12413. Signed-off-by: Ruoqing He <ruoqing.he@lingcage.com>	2026-04-06 11:03:38 +02:00
Jiahao Wang	29e5d5d951	build: Move agent to root workspace This commit adds kata agent to the root workspace, as a follow up work of #12413. Remove agent from exclude list, and make it as a member of root workspace. Signed-off-by: Jiahao Wang <jiahao.wang@lingcage.com>	2026-03-29 06:35:38 +00:00
stevenhorsman	9871256771	versions: Bump cloud-hypervisor to v51 In v51 the license was added, so try bumping to this version to solve the cargo deny issue Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-23 10:34:28 +00:00
dependabot[bot]	ef32923461	build(deps): bump tar from 0.4.44 to 0.4.45 Bumps [tar](https://github.com/alexcrichton/tar-rs) from 0.4.44 to 0.4.45. - [Commits](https://github.com/alexcrichton/tar-rs/compare/0.4.44...0.4.45) --- updated-dependencies: - dependency-name: tar dependency-version: 0.4.45 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-23 10:34:27 +00:00
stevenhorsman	85e17c2e77	deps: Bump rustls-webpki Bump rustls-webpki to 0.103.10 to remediate RUSTSEC-2026-0049 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-23 10:34:27 +00:00
stevenhorsman	c3868f8e60	deps: Bump aws-lc-rs to 1.16.2 Bump aws-lc-rs, so that aws-lc-sys updates to 0.39.0 to remediate RUSTSEC-2026-0044 and https://osv.dev/vulnerability/RUSTSEC-2026-0048 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-23 10:34:27 +00:00
Fupan Li	608f378bff	dragonball: make sure the nydus's worker thread access network Since the dragonball's vmm thread had been joined in the pod's netns, which wouldn't access the network, thus we should make sure the nydus's worker thread join into the runD's main thread's netns which would access the network. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2026-03-22 22:44:24 +08:00
Fupan Li	fddd1e8b6e	dragonball: update the Cargo.lock and rm the unused crate update the Cargo.lock and rm the unused crate Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2026-03-12 10:58:04 +00:00
Fupan Li	e9bda42b01	dragonball: fix the failed UT tests Fix dragonball make check: clippy and format errors Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2026-03-12 10:58:03 +00:00
Markus Rudy	6643b258bb	genpolicy: update oci-client to v0.16.1 The older version we used transitively depends on an unmaintained crate. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:48 +01:00
Markus Rudy	8dfeeea924	genpolicy: add to Cargo workspace This commit adds the genpolicy utility to the root workspace. For now, only dependencies that are already in the root workspace are consumed from there, the genpolicy-specific ones should be added later. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:46 +01:00
Xuewei Niu	8a4ae090e6	Merge pull request #12513 from lifupan/event_publish send the task create/start/delete event to containerd	2026-02-28 14:41:46 +08:00
stevenhorsman	e43a17c2ba	runtime-rs: Remove unused crates - Remove unused crates to reduce our size and the work needed to do updates - Also update package.metadata.cargo-machete with some crates that are incorrectly coming up as unused Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:37:46 +00:00
stevenhorsman	8177a440ca	libs: Remove unused crates Remove unused crates to reduce our size and the work needed to do updates Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:37:46 +00:00
stevenhorsman	ed7ef68510	dragonball: Remove unused crates Remove the crates that cargo machete has assessed as being unused Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-26 09:37:15 +00:00
Alex Lyn	d298df7014	kata-types: Add cross-platform host_memory_mib() helper for host memory Introduce host_memory_mib() with OS-specific implementations (Linux/Android via nix::sysinfo, macOS via sysctl) selected at compile time. This improves portability and allows consistent host memory sizing/validation across different platforms. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 21:04:26 +08:00
Alex Lyn	b3d60698af	runtime-rs: move host memory adjustment into MemoryInfo using nix sysinfo As the memory related information has been serialized at the sandbox initialization, specifically at the moment of parsing the configuration toml, this commit aims to refactor the MemoryInfo initialization logic: (1) Remove memory sizing/host-memory adjustment logic from QEMU cmdline Memory::new() (2) Initialize/adjust memory values via kata-types MemoryInfo (single source of truth) (3) Replace sysinfo::System::new_with_specifics with nix::sys::sysinfo::sysinfo() to get host RAM Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-02-25 19:32:44 +08:00
Fupan Li	499e18c876	runtime-rs: send the task start event to container According to shimv2 proto, it should send task start event to containerd once a container task starts successfully. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2026-02-14 12:44:03 +08:00
stevenhorsman	7f77948658	versions: Bump rkyv version to 0.7.46 Bump to remediate RUSTSEC-2026-0001 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-14 00:33:45 +01:00
Fabiano Fidêncio	34199b09eb	runtime-rs: Properly parse containerd runtime options to extract ConfigPath The runtime-rs shim was failing to load its configuration when deployed via kata-deploy because it couldn't correctly parse the ConfigPath passed by containerd. The previous implementation naively skipped the first 2 bytes of the options and interpreted the rest as a UTF-8 string, which doesn't work since containerd passes a properly serialized protobuf message of type runtimeoptions.v1.Options. This change adds the runtimeoptions.proto definition to the protocols crate and updates the load_config function to correctly deserialize the protobuf message and extract the config_path field, matching how the Go runtime handles this via typeurl.UnmarshalAny. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-02-10 18:12:17 +01:00
stevenhorsman	f840f9ad54	rust: Bump time to 0.3.47 To remediate CVE-2026-25727 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:44:51 +01:00
stevenhorsman	bc45788356	versions: Bump bytes to 1.11.1 To remediate CVE-2026-25541 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-02-09 21:43:23 +01:00
tak-ka3	5471fa133c	runtime-rs: Add -info flag support for containerd v2.0+ Add -info flag handling to containerd-shim-kata-v2 (Rust version). This outputs RuntimeInfo protobuf (name, version, revision) to stdout, providing compatibility with containerd v2.0+ which queries runtime information via this flag. This is the runtime-rs counterpart to the Go implementation. Fixes #12133 Signed-off-by: tak-ka3 <takumi.hiraoka@acompany-ac.com>	2026-01-26 13:38:07 +01:00
stevenhorsman	aace7a7336	versions: Bump openssl-src This is a vulnerability (CVE-2025-9230) in openssl, so move to 3.5.4 which has a fix for this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-01-14 14:05:48 +01:00
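
Note on commit 48669a894e (vCPU thread pinning): the decision rule in that message can be sketched as a pure function. This is only an illustration, not the code from the commit. The names `Affinity` and `plan_vcpu_affinity` are hypothetical, the handling of an empty cpuset is an assumption, and the real change applies the result with sched_setaffinity inside CgroupsResourceInner::update_sandbox_cgroups.

```rust
use std::collections::BTreeSet;

/// Hypothetical type: the host CPUs one vCPU thread is allowed to run on.
#[derive(Debug, PartialEq, Eq)]
pub struct Affinity {
    pub tid: u32,
    pub cpus: Vec<u32>,
}

/// Hypothetical helper that plans the affinity of every vCPU thread.
///
/// `vcpu_tids` are the vCPU thread IDs (the commit obtains them through QMP
/// query-cpus-fast for QEMU); `container_cpusets` holds one OCI cpuset per
/// container in the sandbox.
pub fn plan_vcpu_affinity(vcpu_tids: &[u32], container_cpusets: &[Vec<u32>]) -> Vec<Affinity> {
    // Union of all containers' cpusets, sorted and de-duplicated.
    let cpuset: BTreeSet<u32> = container_cpusets.iter().flatten().copied().collect();
    let cpus: Vec<u32> = cpuset.into_iter().collect();

    if !cpus.is_empty() && cpus.len() == vcpu_tids.len() {
        // Counts match: pin vCPU i to cpuset[i].
        vcpu_tids
            .iter()
            .zip(cpus.iter())
            .map(|(&tid, &cpu)| Affinity { tid, cpus: vec![cpu] })
            .collect()
    } else {
        // Counts diverge (hotplug, container removal, ...): give every thread
        // the full cpuset again so nothing stays stuck on a single core.
        vcpu_tids
            .iter()
            .map(|&tid| Affinity { tid, cpus: cpus.clone() })
            .collect()
    }
}
```

With thread IDs [101, 102] and container cpusets [4] and [6, 4], the union is {4, 6}, so thread 101 is planned onto CPU 4 and thread 102 onto CPU 6. Adding a third thread makes the counts diverge, and every thread falls back to the full set.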


56 Commits
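
Note on commit 639ff3578d (symlinks in CopyFile): the three symlink rules listed in that message can be written as a small check. The commit itself expresses them in the genpolicy policy, so the Rust below is a stand-in sketch rather than the author's implementation. `SymlinkError` and `check_symlink` are hypothetical names, and `link_path` is assumed to be already normalized and relative to the shared directory (the "existing rules for other files" are not modelled).

```rust
use std::path::{Component, Path};

/// Hypothetical error type, one variant per rule from the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum SymlinkError {
    DirectlyInSharedDir,
    PointsUpwards,
    NotRelative,
}

/// Hypothetical check. `link_path` is where the symlink would be created,
/// relative to the shared directory; `target` is what the symlink points to.
pub fn check_symlink(link_path: &Path, target: &Path) -> Result<(), SymlinkError> {
    // Rule 1: the symlink must sit below at least one subdirectory of the
    // shared directory, never directly inside it.
    let depth = link_path
        .components()
        .filter(|c| matches!(c, Component::Normal(_)))
        .count();
    if depth < 2 {
        return Err(SymlinkError::DirectlyInSharedDir);
    }

    // Rule 3: the target must be relative.
    if target.is_absolute() {
        return Err(SymlinkError::NotRelative);
    }

    // Rule 2: the target must not contain `..` as a path element.
    if target.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(SymlinkError::PointsUpwards);
    }

    Ok(())
}
```

A target such as `..data/token` passes this check, because `..data` is an ordinary file name rather than a parent-directory element; a target such as `../token` is rejected.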