Module images are now built as part of the kernel-tarball target
via build-kernel.sh build-modules-images, so the separate CI
matrix entry is no longer needed.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Document the kernel_modules_images feature: building modules
volumes, TOML and Helm chart configuration, agent behavior,
and security considerations for both confidential and
non-confidential deployments.
Prominently warn that custom modules will not work with
official Kata kernel releases because the KBUILD_SIGN_PIN
used to sign modules is not public, requiring users to
rebuild the kernel with their own signing key.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In addition to per-set module images (kata-modules-mlx5.img,
kata-modules-ntfs.img), build a combined image containing all
module sets. This reduces the number of virtio-blk devices and
dm-mod.create kernel command line entries needed when a user
wants all available modules.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add kernel config fragment for the NTFS3 filesystem driver as a
loadable module and register it in the orchestrator script so that
a kata-modules-ntfs.img disk image is produced alongside the MLNX
image in the same CI build.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add config fragment, build script, and CI integration for building
Mellanox MLX5/InfiniBand kernel modules as a standalone disk image.
The orchestrator script (build-kernel-modules-images.sh) builds the
kernel with extra module config fragments, runs modules_install,
filters modules by subsystem into per-set staging trees, and
packages each into its own disk image using build-modules-volume.sh.
Since these modules are built within the Kata CI using the same
KBUILD_SIGN_PIN, they are signed and loadable on the official
released Kata kernel.
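The filtering step can be sketched roughly like this; the directory layout, set names, and patterns below are illustrative stand-ins, not taken from the actual script:

```shell
# Hypothetical sketch of the per-set filtering step; the tree layout,
# set names, and patterns are illustrative, not the real script's.
set -euo pipefail

staging="$(mktemp -d)"
all="${staging}/all"

# Stand-ins for the modules_install output tree.
mkdir -p "${all}/kernel/drivers/net/ethernet/mellanox/mlx5/core" \
         "${all}/kernel/fs/ntfs3"
touch "${all}/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko" \
      "${all}/kernel/fs/ntfs3/ntfs3.ko"

# Copy only the .ko files matching a subsystem pattern into a per-set
# staging tree, preserving their relative paths.
filter_set() {
    local name="$1" pattern="$2"
    mkdir -p "${staging}/${name}"
    ( cd "${all}" && \
      find . -path "${pattern}" -name '*.ko' \
           -exec cp --parents {} "${staging}/${name}/" \; )
}

filter_set mlx5 './kernel/drivers/net/ethernet/mellanox/*'
filter_set ntfs './kernel/fs/ntfs3/*'

find "${staging}/mlx5" "${staging}/ntfs" -name '*.ko'
```

Each per-set tree would then be handed to build-modules-volume.sh for packaging.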
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Allow deploying kernel modules images via the Helm chart. Users
specify a list of images with paths and optional verity params
in values.yaml. These are rendered as a ConfigMap, mounted into
the kata-deploy pod, and used to generate a TOML drop-in with
[[hypervisor.<name>.kernel_modules_images]] array of tables.
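The drop-in generation can be sketched as follows; the `qemu` hypervisor name, the image paths, and the root hash are invented for illustration and are not the chart's actual defaults:

```shell
# Illustrative generation of the TOML drop-in; hypervisor name, paths,
# and root hash are made-up examples.
set -euo pipefail

hypervisor="qemu"
dropin="$(mktemp)"

# Emit one array-of-tables entry per configured modules image.
emit_entry() {
    local path="$1" verity="$2"
    cat <<EOF
[[hypervisor.${hypervisor}.kernel_modules_images]]
path = "${path}"
verity_params = "${verity}"

EOF
}

{
    emit_entry "/opt/kata/share/kata-modules-mlx5.img" ""
    emit_entry "/opt/kata/share/kata-modules-ntfs.img" "root_hash=0123abcd"
} > "${dropin}"

cat "${dropin}"
```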
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Reorder create_sandbox to call add_storages before
load_kernel_module so that modules on separate volumes are
available when modprobe runs.
After mounting, detect any storages targeting
/lib/modules/kata-modules-* and, if any are present, write a
/etc/depmod.d/kata-modules.conf with search directives for
those directories and run depmod -a to rebuild the module
dependency database.
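Rendered as shell for illustration only; the real logic lives in the Rust agent, and the directive layout shown here is a sketch (depmod.d(5) is authoritative):

```shell
# Sketch of the agent's post-mount step; paths are illustrative.
set -euo pipefail

# Pretend these mount points were just created by add_storages.
lib_modules="$(mktemp -d)"
mkdir -p "${lib_modules}/kata-modules-0" "${lib_modules}/kata-modules-1"

conf="$(mktemp)"
for d in "${lib_modules}"/kata-modules-*; do
    [ -d "${d}" ] || continue
    echo "search ${d##*/} built-in" >> "${conf}"
done

cat "${conf}"
# In the guest the file would be /etc/depmod.d/kata-modules.conf and,
# when at least one directory was found, it would be followed by:
#   depmod -a
```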
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The kernel modules images feature requires modprobe and depmod
to be available inside the guest VM. Add the kmod package to
the Ubuntu, Alpine, and CentOS rootfs package lists.
Debian inherits from Ubuntu's config so it picks up kmod
automatically. The NVIDIA rootfs already installs kmod
separately in nvidia_chroot.sh.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add support for attaching multiple kernel modules disk images in
the Rust runtime, mirroring the Go runtime implementation.
Each configured image is cold-plugged as a read-only block device
and a Storage entry is sent to the agent to mount it at
/lib/modules/kata-modules-<N>.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add support for attaching multiple kernel modules disk images to
the guest VM as additional block devices. This enables loading
out-of-tree kernel modules from separate, independently managed
volumes without modifying the dm-verity measured rootfs.
Configuration uses a TOML array of tables:

    [[hypervisor.qemu.kernel_modules_images]]
    path = "/path/to/modules-volume-1.img"
    verity_params = ""

    [[hypervisor.qemu.kernel_modules_images]]
    path = "/path/to/modules-volume-2.img"
    verity_params = "root_hash=..."
Each image is cold-plugged as a virtio-blk device (vdb, vdc, ...)
and a Storage entry is sent to the agent to mount it read-only at
/lib/modules/kata-modules-<N>.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add build-modules-volume.sh to package signed kernel modules
into a standalone ext4 disk image that can be attached to a
kata guest VM as a secondary block device.
This allows loading out-of-tree modules without modifying the
dm-verity measured rootfs. The rootfs image and its root hash
remain unchanged.
The script optionally supports dm-verity on the modules volume
itself (-V flag), providing defense-in-depth alongside kernel
module signing.
Security risks documented in the script header:
- Without dm-verity, the volume relies solely on kernel module
signing (CONFIG_MODULE_SIG_FORCE) for integrity.
- With dm-verity, the hash must be verified during attestation
to provide actual security benefit.
- Host-side file permissions on the volume image must prevent
unauthorized modification.
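A minimal sketch of the packaging step, assuming e2fsprogs' `mkfs.ext4 -d` for rootless population; names and sizes are illustrative, and the real script additionally handles signing expectations and the -V dm-verity path:

```shell
# Minimal sketch: package a staged module tree into an ext4 image.
set -euo pipefail

staged="$(mktemp -d)"
mkdir -p "${staged}/extra"
touch "${staged}/extra/example.ko"   # stand-in for a signed module

img="$(mktemp --suffix=.img)"
truncate -s 4M "${img}"

# -d populates the filesystem from the staging directory at mkfs time,
# so no loop mount (and therefore no root privileges) is needed.
mkfs.ext4 -q -F -d "${staged}" "${img}"

# The optional dm-verity step would look roughly like (not run here):
#   veritysetup format "${img}" "${img}.hash"
ls -l "${img}"
```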
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Remove CONFIG_MODULES, CONFIG_MODULE_UNLOAD, and CONFIG_MODULE_SIG
from the NVIDIA GPU config fragments (nvidia.x86_64.conf.in and
nvidia.arm64.conf.in) since these are now provided by the shared
common/modules/modules.conf and common/signing/module_signing.conf
fragments, which are always included for confidential builds.
NVIDIA GPU builds always use -x (confidential), so these options
were redundant. CONFIG_FW_LOADER is kept as it is specific to
GPU firmware loading needs.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
For confidential builds (-x), always include modules/modules.conf
(CONFIG_MODULES=y, CONFIG_MODULE_UNLOAD=y) and
signing/module_signing.conf (CONFIG_MODULE_SIG_FORCE=y, etc.).
This enables two important capabilities for confidential guests:
1. Loadable module support: allows out-of-tree kernel modules
to be loaded from separate modules volume images without
modifying the dm-verity measured rootfs.
2. Module signature enforcement: the kernel rejects any unsigned
or wrongly-signed module, maintaining the trust chain from
the attested kernel to loaded modules.
Previously, module signing was only included when KBUILD_SIGN_PIN
was set. For non-confidential builds, that behavior is preserved.
For confidential builds, module signing is now always enabled
since it is essential for the security model.
Security notes:
- CONFIG_MODULE_SIG_FORCE=y ensures the kernel rejects unsigned
modules, preventing arbitrary code execution in the guest.
- The signing key is generated during kernel build. Users need
this key (protected by KBUILD_SIGN_PIN) to sign out-of-tree
modules.
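One way to spot-check whether a .ko carries an appended signature is to look for the magic trailer that the kernel's scripts/sign-file appends; the module file below is a fake stand-in, purely to demonstrate the check:

```shell
# Check for the appended-signature trailer on a (fake) module file.
set -euo pipefail

ko="$(mktemp --suffix=.ko)"
printf 'fake-module-bytes' > "${ko}"
printf '~Module signature appended~\n' >> "${ko}"

# The trailer is the last 28 bytes of a signed module.
if tail -c 28 "${ko}" | grep -q 'Module signature appended'; then
    echo "signature trailer present"
else
    echo "no signature trailer"
fi
```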
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add a new conditional kernel config fragment in a subdirectory
(following the pattern of signing/ and confidential_containers/)
so it is not auto-included by the common/*.conf wildcard:
- common/modules/modules.conf: Enables CONFIG_MODULES and
CONFIG_MODULE_UNLOAD for out-of-tree kernel module support.
This is required for loading user-compiled modules delivered
via separate modules volume images.
This fragment will be explicitly included by build-kernel.sh
for confidential builds.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
cleanup_and_fail() prints nothing to stdout and returns 1. The
callers used `return "$(cleanup_and_fail ...)"` which expands to
`return ""`, causing bash to error with "numeric argument required".
Replace the command substitution with a compound command that calls
the cleanup function and propagates its exit code via `$?`.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
regorus 0.9.0 introduced a hard, per-engine ceiling on parsed-policy
size (1024 columns / 1 MiB / 20 000 lines, see lexer.rs:30 in
microsoft/regorus). The 1024-column cap rejects realistic policies
emitted by `genpolicy`: the `NVIDIA_REQUIRE_CUDA` environment variable
on `nvcr.io/nvidia/k8s/cuda-sample` is roughly 1.3 KiB on a single line,
so the agent's `set_policy()` returns an error, the agent (PID 1) exits,
the guest kernel reboots, and the runtime eventually times out
connecting to the agent's vsock.
regorus PR #624 ("feat: make policy length limits configurable per
engine") adds `Engine::set_policy_length_config`, but it has not been
released yet -- the latest published version is still 0.9.1, which
predates that change.
Pin `regorus` to the upstream commit that includes #624 and call the
new setter from `AgentPolicy::new_engine()` with values that comfortably
fit any policy we expect to evaluate (64 KiB per line, 16 MiB per file,
200 000 lines) while still rejecting pathological/minified input. Once
a regorus release > 0.9.1 ships with #624, the dependency can be moved
back to crates.io.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The version we used before was released in 2024; it's about time to use
a newer one. The new version of the crate comes with a license,
which addresses a `cargo deny` finding.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
No need to deviate from how other CoCo targets use Trustee; this also
enables us to add more tests (e.g., RVPS) that the ITA Trustee
implementation does not support.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
For HGX B300 systems we need the 595 driver branch; bump
the guest fs driver to support those systems.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Add the nvidia driver version to the artefact cache keys so that
a driver bump triggers image and initrd rebuilds.
Also rename the helper functions to follow a consistent
get_latest_nvidia_* naming convention.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Without this commit any attempt to exec a command in a container will fail
if SELinux is disabled in the guest but an SELinux label is given for
the new process. That will happen pretty much any time SELinux is enabled
on the host (and the container is not privileged).
Signed-off-by: Pavel Mores <pmores@redhat.com>
We'll need to get the `disable_guest_linux` value in the exec handler, too.
This will allow us to avoid duplicating the lookup.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Simple bump to fix CVE GHSA-82j2-j2ch-gfr8:
Denial of service via panic on malformed CRL BIT STRING
Assisted-by: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Device plugins may set PCIDEVICE_* environment variables with
non-PCI identifiers (e.g. "mlx5_core.sf.10" for mlx5 Scalable
Functions). The update_env_pci() function assumed all values were
PCI BDF addresses and failed to parse them, causing container
creation to fail with:
"PCI address mlx5_core.sf.10 should have the format DDDD:BB:SS.F"
Skip PCIDEVICE_* entries whose values don't parse as PCI addresses,
leaving them untouched for the workload. The corresponding _INFO
variable is also left as-is since no mapping is collected.
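The skip logic can be mirrored in a small sketch; the helper name and the regex are assumptions for illustration, not the agent's actual code:

```shell
# Only values matching the DDDD:BB:SS.F BDF format get remapped;
# anything else is passed through to the workload unchanged.
is_pci_bdf() {
    [[ "$1" =~ ^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$ ]]
}

for value in "0000:3b:00.0" "mlx5_core.sf.10"; do
    if is_pci_bdf "${value}"; then
        echo "${value}: remap to guest BDF"
    else
        echo "${value}: skipped, left untouched for the workload"
    fi
done
```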
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Same fix as the Go runtime: interfaces whose drivers do not register
a specific netlink kind (e.g. mlx5 Scalable Functions) are reported
with the generic type "device", which is not handled by the endpoint
creation match, causing sandbox creation to fail with:
"unsupported link type: device"
Add "device" as an alternative pattern alongside "veth" so these
interfaces are connected through a TAP + TC-filter bridge.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Interfaces whose drivers do not register a specific netlink kind
(e.g. mlx5 Scalable Functions) are reported with the generic type
"device". The endpoint creation code did not handle this type,
causing sandbox creation to fail with:
"Unsupported network interface: device"
This is particularly visible on arm64 with Mellanox ConnectX NICs
using Scalable Functions, where the ethtool BusInfo returns a
non-PCI identifier (e.g. "mlx5_core.sf.4") so isPhysicalIface()
cannot classify the interface as physical either.
Handle "device" type interfaces the same way as veth endpoints,
connecting them through a TAP + TC-filter bridge.
Additionally, relax getLinkForEndpoint() for VethEndpoint so it
accepts the concrete link type returned by the kernel instead of
asserting *netlink.Veth. A "device" type interface wrapped in a
VethEndpoint returns *netlink.Device from LinkByName(), which
would fail the strict type assertion. All callers only need
link.Attrs(), so accepting any link type is safe.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
At first we thought this only happened with AKS, but it seems this is a
change in k8s 1.36.0, as the tests have now started failing outside of
AKS as well.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
All the CIs are failing on this test; in order to avoid blocking
upstream while allowing the developers enough time to properly fix it,
let's just not execute the test.
This commit should be reverted once a fix is proposed.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Dragonball is only supported on x86_64 and aarch64, so using it as the
default hypervisor means architectures like s390x, powerpc64le, and
riscv64gc have no working default. Switch to QEMU, which is available
across all supported architectures.
Dragonball is still compiled as a feature on x86_64 and aarch64 via
USE_BUILTIN_DB, and users can still override the default with
HYPERVISOR=dragonball.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>