kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-17 13:04:23 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	a3d6829ed4	Merge pull request #12964 from kata-containers/dependabot/github_actions/editorconfig-checker/action-editorconfig-checker-2.2.0 build(deps): bump editorconfig-checker/action-editorconfig-checker from 2.1.0 to 2.2.0	2026-05-04 19:37:42 +02:00
Fabiano Fidêncio	7c61c55011	Merge pull request #12966 from kata-containers/dependabot/github_actions/streetsidesoftware/cspell-action-8.4.0 build(deps): bump streetsidesoftware/cspell-action from 8.3.0 to 8.4.0	2026-05-04 19:37:28 +02:00
Fabiano Fidêncio	3d43259463	Merge pull request #12974 from fidencio/topic/ci-tdx-nightly-run-with-runtime-rs ci: tdx: Remove ITA key usage and run qemu-tdx-runtime-rs on nightly	2026-05-04 19:04:03 +02:00
Fabiano Fidêncio	b195dcca65	Merge pull request #12975 from kata-containers/topic/ci-nvidia-run-nightly-without-trace-log-level ci: nvidia: Disable NVRC trace logging on nightly runs	2026-05-04 19:02:14 +02:00
Fabiano Fidêncio	d9722ba4be	Merge pull request #12960 from microsoft/saul/update_mariner_test_configs kata-deploy: configure_mariner: update test configs	2026-05-04 18:26:41 +02:00
Fabiano Fidêncio	cd51003f3f	Merge pull request #12947 from fidencio/topic/runtime-rs-s390x-docker runtime-rs: qemu: add CCW network hotplug & retry update_interface	2026-05-03 22:06:00 +02:00
Fabiano Fidêncio	746d182c1a	runtime-rs: qemu: add CCW network hotplug & retry update_interface On s390x, QEMU uses the CCW bus instead of PCI. The network device hotplug path was hardcoded to find a PCI slot, which fails with "no free slots on PCI bridges" on s390x. Add CCW support to `hotplug_network_device`: when running on a native CCW bus, allocate a CCW subchannel address and use `devno` instead of PCI `bus`/`addr`/`vectors`. Additionally, after hotplugging a network device, the guest kernel needs time to probe the CCW device before the network interface appears. Add a retry loop (up to 10 attempts, 100ms apart) to `handle_interfaces` so that `update_interface` succeeds once the guest has created the link. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-05-03 19:26:39 +02:00
Fabiano Fidêncio	8655d87892	ci: nvidia: Disable NVRC trace logging on nightly runs On nightly CI, run the NVIDIA GPU tests without setting nvrc.log=trace. This gives us end-to-end test coverage that more closely matches how users would actually run Kata Containers with NVIDIA GPUs, since trace logging is not enabled by default in production. NVRC trace logging remains enabled for PR runs, where the extra verbosity is useful for debugging failures. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-03 18:13:07 +02:00
Fabiano Fidêncio	51d5f2ea7b	ci: Run runtime-rs tests for TDX on nightly As we're in the process of stabilising runtime-rs for the coming 4.0.0 release, we had better start running as many tests as possible with it. The TDX runtime-rs job is gated to nightly runs only (pr-number == "nightly") since we only have a single TDX machine and cannot afford to run both qemu-tdx and qemu-tdx-runtime-rs on every PR. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-03 18:05:58 +02:00
Fabiano Fidêncio	8c3c7aa871	ci: Drop ITA_KEY usage from CI workflows The ITA_KEY secret was conditionally passed to TDX jobs for Intel Trust Authority attestation, but it is no longer needed. Remove it from all workflow files and the test helper export. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-05-03 18:05:51 +02:00
Steve Horsman	86e5975ad6	Merge pull request #12973 from stevenhorsman/release-concurrency-fix release: fix release workflow concurrency deadlock 3.30.0	2026-05-02 20:16:29 +01:00
stevenhorsman	9715a7cca2	release: fix release workflow concurrency deadlock Architecture-specific release workflows were using the same concurrency group when called from release.yaml, causing GitHub Actions to detect a deadlock and cancel the builds. Fix by appending architecture suffix to each workflow's concurrency group, allowing parallel execution without conflicts. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-05-02 20:13:17 +01:00
Fabiano Fidêncio	5540f50198	Merge pull request #12972 from stevenhorsman/release/3.30.0 release: Bump version to 3.30.0	2026-05-02 20:54:54 +02:00
Steve Horsman	fd2b85f8ad	Merge pull request #12969 from burgerdev/require-codegen gatekeeper: require codegen	2026-05-02 18:38:53 +01:00
stevenhorsman	a1a6a9a150	release: Bump version to 3.30.0 Bump VERSION and helm-charts versions. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-05-02 17:57:39 +01:00
Steve Horsman	3ae3a0437b	Merge pull request #12963 from zvonkok/copyfail kernel: Bump Kernel Version	2026-05-02 16:58:53 +01:00
Markus Rudy	22598a34b2	gatekeeper: require codegen The codegen check ensures that generated files are up-to-date and correspond to the tool versions used in CI. Requiring this check prevents us from accidentally merging, e.g., proto changes without the corresponding Rust/Go updates. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-05-02 12:28:58 +02:00
dependabot[bot]	7a1fa7842d	build(deps): bump streetsidesoftware/cspell-action from 8.3.0 to 8.4.0 Bumps [streetsidesoftware/cspell-action](https://github.com/streetsidesoftware/cspell-action) from 8.3.0 to 8.4.0. - [Release notes](https://github.com/streetsidesoftware/cspell-action/releases) - [Changelog](https://github.com/streetsidesoftware/cspell-action/blob/main/CHANGELOG.md) - [Commits](`9cd41bb518...de2a73e963`) --- updated-dependencies: - dependency-name: streetsidesoftware/cspell-action dependency-version: 8.4.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2026-05-01 18:06:22 +00:00
dependabot[bot]	883edd798f	build(deps): bump editorconfig-checker/action-editorconfig-checker Bumps [editorconfig-checker/action-editorconfig-checker](https://github.com/editorconfig-checker/action-editorconfig-checker) from 2.1.0 to 2.2.0. - [Release notes](https://github.com/editorconfig-checker/action-editorconfig-checker/releases) - [Commits](`4b6cd6190d...840e866d93`) --- updated-dependencies: - dependency-name: editorconfig-checker/action-editorconfig-checker dependency-version: 2.2.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2026-05-01 18:05:35 +00:00
Saul Paredes	cbb06545f7	kata-deploy: configure_mariner: also apply test config to runtime-rs Apply same test configs we use in runtime-go config to runtime-rs config. These are: - runtime.static_sandbox_resource_mgmt = true - hypervisor.clh.valid_hypervisor_paths includes cloud-hypervisor-glibc - hypervisor.clh.path = cloud-hypervisor-glibc Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2026-05-01 08:15:52 -07:00
Saul Paredes	564d381b79	kata-deploy: configure_mariner: correctly set static_sandbox_resource_mgmt static_sandbox_resource_mgmt is under the runtime config, not the hypervisor one. See `31f7438ecd/src/runtime/config/configuration-clh.toml.in (L439)` Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2026-05-01 08:15:52 -07:00
Zvonko Kaiser	803531dd9c	kernel: Bump Kernel Version "Copy Fail" (CVE-2026-31431) is a high-severity local privilege escalation (LPE) vulnerability found in the Linux kernel in April 2026, which affects all major Linux distributions, including those using Long Term Support (LTS) kernels, released since 2017. The bug allows an unprivileged user to gain root access, escape containers, and modify the in-memory page cache reliably using a tiny 732-byte script. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-05-01 14:21:49 +00:00
Steve Horsman	62b847fd6c	Merge pull request #12850 from burgerdev/remove-standard-oci-runtime agent: remove standard-oci-runtime feature	2026-05-01 12:44:10 +01:00
Fabiano Fidêncio	79ba4e2dd0	Merge pull request #12937 from fidencio/topic/kata-deploy-support-containerd-config-version-4 kata-deploy: support containerd config version 4	2026-05-01 07:46:36 +02:00
Fabiano Fidêncio	96b68e77a7	kata-deploy: support containerd config schema version 4 and newer Containerd 2.3.0 introduces config schema version 4 (see upstream RELEASES.md and the version-4 server-plugin documentation). The default file still uses the same split-CRI layout as version 3 (plugins under io.containerd.cri.v1.runtime and io.containerd.cri.v1.images). Schema v4 mainly moves gRPC, TTRPC, debug, and metrics listener settings under io.containerd.server.v1.*; kata-deploy does not edit those server tables except for containerd log verbosity when DEBUG=true. Fixes: #12936 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-30 16:23:43 +02:00
Steve Horsman	31f7438ecd	Merge pull request #12949 from stevenhorsman/kata-ctl/move-into-root-workspace kata-ctl: Move into root workspace	2026-04-30 11:45:50 +01:00
stevenhorsman	b61b3d2f20	kata-deploy: Update default tool binary location Now that all of the tools except agent-ctl (still WIP) are in the root workspace, switch the default to that and add an exception for agent-ctl, as it's the odd one out. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-30 08:46:22 +01:00
stevenhorsman	f8cf47d17c	kata-ctl: fix clippy to_string_in_format_args warnings With the workspace unification we've bumped anyhow from 1.0.31 to 1.0.102, so update the code to reflect that error implements `Display` now in the newer version. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-30 08:45:27 +01:00
stevenhorsman	efe62c9280	kata-ctl: Move into root workspace Add kata-ctl to be a workspace member to simplify the dependency management. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-30 08:45:27 +01:00
Fabiano Fidêncio	1e6c54cbcf	Merge pull request #12856 from harshitgupta1337/cbl-mariner-config-return-0 rootfs: Suppress condition check failure errors in cbl-mariner/config.sh	2026-04-30 08:35:06 +02:00
Fabiano Fidêncio	3b978c77ed	Merge pull request #12950 from stevenhorsman/trace-forwarder/move-to-root-workspace trace-forwarder: Move into root workspace	2026-04-29 23:54:43 +02:00
Harshit Gupta	3b796c6579	rootfs: mariner: suppress condition check failure errors Avoid returning failure from sourced scripts when condition check evaluates to false. Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>	2026-04-29 14:11:32 -04:00
Fabiano Fidêncio	5f59e20032	Merge pull request #12944 from fidencio/topic/run-arm64-ci-on-PR-again Revert "ci: Only run arm64 k8s tests on nightly builds"	2026-04-29 15:30:22 +02:00
stevenhorsman	9cae783f14	kata-deploy: fix binary location for trace-forwarder Moving the trace-forwarder into the root workspace moves the target directory, so update this target. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-29 13:27:09 +01:00
stevenhorsman	7664ebda7e	trace-forwarder: Move into root workspace Add trace-forwarder to be a workspace member to simplify the dependency management. Assisted-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-29 12:11:04 +01:00
Fabiano Fidêncio	1a22c3adec	Merge pull request #12942 from stevenhorsman/fix-cri-containerd-test-names ci: Fix cri-containerd-test names	2026-04-29 09:56:43 +02:00
Fabiano Fidêncio	ef15324b04	Revert "ci: Only run arm64 k8s tests on nightly builds" This reverts commit `c5b159c556`, as now we have 3 runners plugged into the CI. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-29 07:38:12 +02:00
Steve Horsman	2435970fe8	Merge pull request #12933 from fidencio/topic/runtime-rs-decouple-dragonball-from-non-x86-checks runtime-rs: drop misleading unsupported arches gating	2026-04-28 18:36:16 +01:00
stevenhorsman	4d4dee3af2	ci: Fix cri-containerd-test names During the zizmor refactoring I changed the name of two jobs to make all the architectures match. I forgot to update required_tests and, as this was a workflow-only change, the PR didn't check this, so update them now. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-04-28 18:30:53 +01:00
Aurélien Bombo	886d05c7ee	Merge pull request #12898 from kata-containers/sprt/clh-runtime-rs runtime-rs: rename `cloud-hypervisor` to `clh-runtime-rs`	2026-04-28 11:50:56 -05:00
Aurélien Bombo	f3dc71a770	Revert "tests: k8s: policy: improve settings selection for runtime-rs hypervisors" This reverts commit `cafdd278ba`.	2026-04-28 10:58:01 -05:00
Aurélien Bombo	dc0f1795de	kata-deploy: remove useless unit tests These essentially merely test format!(), which is not our job. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-04-28 10:58:01 -05:00
Aurélien Bombo	cf6a91a104	runtime-rs/config: rename cloud-hypervisor to clh This aligns with the previous commit and runtime-go. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-04-28 10:58:01 -05:00
Aurélien Bombo	e4fbddb91a	ci: rename cloud-hypervisor to clh-runtime-rs This aligns with qemu-runtime-rs and makes more sense. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-04-28 10:58:01 -05:00
Fabiano Fidêncio	99b6dcf411	Merge pull request #12935 from fidencio/topic/rootfs-add-mlx5-modules kernel: bake in Mellanox MLX5 Ethernet support	2026-04-28 17:24:31 +02:00
Fabiano Fidêncio	7e5cc37fab	runtime-rs: resource: discover hugetlbfs page sizes from sysfs in test `volume::hugepage::tests::test_get_huge_page_size` was hard-coded to exercise the round-trip through `get_huge_page_option` / `get_page_size` for two hugetlbfs page sizes: let format_sizes = ["1Gi", "2Mi"]; These are the sizes x86_64 Ubuntu kernels expose by default (`/sys/kernel/mm/hugepages/hugepages-{1048576,2048}kB`), but other architectures use different sizes: * s390x: typically `hugepages-1048576kB` only (1 GiB; no 2 MiB pool) -- the kernel returns `EINVAL` for the missing 2 MiB iteration: thread 'volume::hugepage::tests::test_get_huge_page_size' panicked at .../resource/src/volume/hugepage.rs:242:14: called `Result::unwrap()` on an `Err` value: EINVAL * ppc64le: page sizes vary by kernel build (e.g. 16M/16G with 64K base pages, 2M/1G with 4K base pages), and may not match `["1Gi", "2Mi"]` exactly. Same EINVAL on the iteration whose size isn't a registered hstate. The reason this never bit before is the same as the SELinux test in the previous-but-one commit: the runtime-rs `Makefile` wrapped `test` in an `ifeq UNSUPPORTED_ARCHS` block that turned it into `echo ...; exit 0` on s390x/ppc64le/riscv64gc, so the test was only ever exercised on x86_64 (and aarch64, which happens to have the same default hugetlb page sizes). Dropping that gate is what exposed the latent assumption. Replace the hard-coded list with a small helper that lists the hugetlbfs page sizes the running kernel actually exposes via `/sys/kernel/mm/hugepages/hugepages-NkB`, rendered as binary-unit strings (e.g. "2Mi", "1Gi") that are accepted both by the kernel's `pagesize=...` mount option and by `byte_unit::Byte::parse_str(s, /allow_binary=/ true)`. If `/sys/kernel/mm/hugepages` doesn't exist or the directory is empty (e.g. hugetlbfs is unconfigured in the test environment) the test simply returns -- there's nothing meaningful to round-trip. 
On x86_64 the discovered list comes out as `["1Gi", "2Mi"]` (the same coverage as before). On s390x it becomes `["1Gi"]`, on ppc64le whatever that kernel build supports. Sysfs alone, however, is a necessary-but-not-sufficient signal: it tells us the kernel registered the page size, not whether this process is allowed to mount hugetlbfs. The ubuntu-24.04-s390x GHA runner demonstrates the gap -- it exposes `hugepages-1048576kB` via /sys but runs the build inside a user/mount namespace where mount(2) of hugetlbfs returns EPERM even when the test is invoked through sudo: thread 'volume::hugepage::tests::test_get_huge_page_size' panicked at .../resource/src/volume/hugepage.rs:292:14: called `Result::unwrap()` on an `Err` value: EPERM There's no portable capability bit we can sniff for that, so probe once with the first discovered size before iterating; if the probe mount fails, skip the test (rather than panic on something it can't control). A real regression on a host where mount() does work will still surface inside the loop below, since the per-size mount calls there continue to assert via `.unwrap()`. While here, feed the kernel-native shorthand (e.g. "2M", "1G") rather than the IEC form ("2Mi", "1Gi") to mount(2). hugetlbfs parses `pagesize=` via `memparse()`, which understands K/M/G but not the IEC `Ki/Mi/Gi`; today the kernel happens to silently drop the trailing `i` (memparse just stops scanning), but that leniency is incidental. /proc/mounts in turn always renders the option back as `pagesize=<N>{K,M,G}`, which is exactly the form `get_page_size()` already expects -- it strips `pagesize=` and unconditionally appends `i` before handing the result to byte_unit. Stripping the `i` for the mount option keeps the test's input aligned with the kernel's canonical syntax, while leaving the IEC form intact for the `Byte::parse_str(..., /allow_binary=/ true)` comparison. Also drop the unused `Ok` re-export from `use anyhow::{anyhow, Context, Ok, Result}`. 
Every existing `Ok(...)` site in this module is the variant-constructor form, for which the prelude's `Result::Ok` already works fine in `anyhow::Result<T>` context (same enum, with `E = anyhow::Error` inferred from the surrounding return type), so nothing actually needed `anyhow::Ok` to begin with. Removing the import lets the new helper use plain `let Ok(entries) = ... else` / `let Ok(name) = ... else` patterns directly instead of funneling everything through `.ok()` + `if let Some(...)` to dodge the shadowing. Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-28 16:25:31 +02:00
Fabiano Fidêncio	cd67638618	runtime-rs: hypervisor: don't assert kernel LSM behaviour in selinux test `selinux::tests::test_set_exec_label` had two branches: when SELinux is enabled it asserts that `set_exec_label` succeeds and round-trips the label through `/proc/thread-self/attr/exec`, and when SELinux is NOT enabled it asserted that `set_exec_label` returns `Err`. The second assertion is wrong -- it's a claim about the kernel/LSM interface, not about `set_exec_label` itself. `/proc/thread-self/attr/exec` is a generic LSM interface, not SELinux-specific. When no LSM owns the slot, kernel behaviour is arch/distro/build dependent: some kernels return `EINVAL` (observed on x86_64 Ubuntu CI runners, where the test was originally written and was passing), others silently accept the write (observed on ppc64le Ubuntu CI runners, which is what made this surface): thread 'selinux::tests::test_set_exec_label' panicked at src/runtime-rs/crates/hypervisor/src/selinux.rs:62:13: Expecting error, Got Ok(()) The reason this never blew up before is that the previous-but-one commit's `ifeq UNSUPPORTED_ARCHS ... exit 0` block in the runtime-rs `Makefile` made `make test` a no-op on s390x/ppc64le/riscv64gc. Dropping that gate (so `make test` actually runs on every arch that runtime-rs builds on) is what surfaced the latent bug. Drop the `else { assert!(ret.is_err(), ...); }` branch and replace it with a comment explaining why we deliberately don't assert on `ret` in that path. The "SELinux is enabled" branch is the only side that exercises anything we own; the no-SELinux path is a kernel detail that's not ours to normalize. Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-28 16:25:31 +02:00
Fabiano Fidêncio	8ab97a60f3	ci: install protobuf-compiler for runtime-rs build-checks The `runtime-rs` component of `build-checks.yaml` declared `rust` as its only dependency, but the runtime-rs build pulls in `prost-build v0.8.0` (via `ttrpc-codegen` -> `containerd-shim-protos`, and via the in-tree `hypervisor` crate), and `prost-build`'s build script needs a `protoc` binary at compile time. This worked on x86_64 and aarch64 only because `prost-build v0.8.0` ships bundled `protoc` binaries for those targets. On s390x (and ppc64le, when the matrix gets there) there is no bundled binary, so the build fails with: Failed to find the protoc binary. The PROTOC environment variable is not set, there is no bundled protoc for this platform, and protoc is not in the PATH The reason this didn't show up in CI before is that `make test` and `make check` for runtime-rs were wrapped in arch-specific `ifeq` blocks in `src/runtime-rs/Makefile` that turned them into no-ops on s390x/ppc64le/riscv64gc. The previous commit dropped those gates so `make {test,check}` now actually run on every arch, which exposes this latent CI gap. Match what `agent`, `libs`, `agent-ctl`, `kata-ctl` and `genpolicy` already declare and add `protobuf-compiler` to runtime-rs's needs. The existing `Install protobuf-compiler` step in this workflow already runs `sudo apt-get -y install protobuf-compiler`, which the s390x/ppc64le runners support (those other components have been using it on s390x for some time). Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-04-28 16:25:31 +02:00
Fabiano Fidêncio	48ef1be3be	runtime-rs: Drop misleading "unsupported arch" gates The Makefile pretended to reject s390x, powerpc64le and riscv64gc by wrapping `default`, `test` and `install` in `ifeq UNSUPPORTED_ARCHS`, and `check` in `ifeq ($(ARCH),x86_64)`. In reality `default` and `install` were byte-for-byte identical in both branches, so only `test` and `check` were ever skipped. The user-visible "$(ARCH) is not currently supported" message and the bare `exit 0` made it look like the build was a no-op when in fact builds and installs were proceeding -- which has burned at least one maintainer trying to debug a downstream packaging failure (issue #12914). The original reasons those targets were skipped were: * `test` (commit `389ae9702`, 2022): `cargo test` would pull in the dragonball crate, which only builds on x86_64/aarch64. * `check`: delegates to `standard_rust_check` in utils.mk, which runs `cargo clippy --all-targets --all-features`. `--all-features` unconditionally turns on the `dragonball` (and `cloud-hypervisor`) feature regardless of arch, breaking the build wherever those crates can't compile. Both are now obsolete. The preceding commit arch-gated the dragonball and firecracker drivers (and their dependencies) at the Cargo and Rust source level, so on s390x/ppc64le/riscv64gc: * the `dragonball` cargo feature is a safe no-op -- enabling it just doesn't pull in the dep, * the `cloud-hypervisor` cargo feature still pulls in `ch-config` (which is portable Rust), but the `ch` driver module that uses it remains arch-gated at the source level, * `dbs-utils` and `hyperlocal` are not built at all. That means `cargo clippy --all-targets --all-features` -- exactly what `standard_rust_check` runs -- is safe on every architecture, and no runtime-rs-local override of `check` is needed. Drop both `ifeq` blocks and let `test` and `check` run on every arch the way `default` and `install` already did. 
Net result: `make {default,test,check,install}` now Just Work everywhere, with no arch-specific code paths in this Makefile and no misleading "not currently supported" messages. Fixes: #12914 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-28 16:25:31 +02:00
Fabiano Fidêncio	6a1d7f7d85	runtime-rs: Arch-gate dragonball and firecracker hypervisors Two of the in-tree hypervisor drivers, dragonball and firecracker, along with three of their transitive dependencies (the dragonball crate itself, dbs-utils, hyperlocal), are built unconditionally on every architecture even though both upstream projects only support x86_64 and aarch64: * dragonball: the dragonball VMM crate is x86_64+aarch64 only. The runtime-rs `dragonball` cargo feature is already gated via `USE_BUILTIN_DB` -> `ARCH_SUPPORT_DB` in the Makefile, so the default `make` flow does the right thing today. But anything that bypasses that gate -- a contributor running `cargo clippy --all-features`, a CI matrix that forces the feature on, etc. -- fails to build on s390x/ppc64le/riscv64gc, because the optional `dragonball` dependency is declared without a target predicate and Rust source sites reference it under a feature gate alone. * firecracker: firecracker upstream only releases for x86_64 and aarch64 (https://github.com/firecracker-microvm/firecracker/releases/tag/v1.15.1). The Makefile already reflects this -- `FCCMD` is only defined in the x86_64/aarch64 arch options files -- but the in-tree `firecracker` driver module compiles unconditionally, so on s390x/ppc64le/riscv64gc we still ship a runtime that thinks it can drive a hypervisor binary that doesn't exist on the platform. Decouple both at the Cargo and Rust source level, mirroring the existing cloud-hypervisor pattern. * Cargo.toml: move the optional `dragonball` dependency, plus `dbs-utils` and `hyperlocal` (whose only consumers are the dragonball and firecracker driver modules), into a target- specific dependency block: [target.'cfg(any(target_arch = "x86_64", target_arch = "aarch64"))'.dependencies] dbs-utils = { workspace = true } hyperlocal = { workspace = true } dragonball = { workspace = true, features = [ ... ], optional = true } On x86_64/aarch64 the resolved dep graph is unchanged. 
On s390x/ppc64le/riscv64gc enabling the `dragonball` feature becomes a safe no-op, and the dep graph for the `hypervisor` crate is completely free of any dragonball or firecracker artifacts. This also makes the gating self-policing: any future `use dbs_utils::...` or `use hyperlocal::...` outside an arch-gated module will fail to build on non-x86 instead of silently shipping dead code. * Rust modules: combine the existing `feature = "dragonball"` gate with `target_arch = "x86_64"\|"aarch64"` on `pub mod dragonball;` and the dragonball-only constants (`DEV_HUGEPAGES`, `SHMEM`, `HUGE_SHMEM`) in `crates/hypervisor/src/lib.rs`. Add the same target_arch gate to `pub mod firecracker;` (matching the existing gate on `pub mod ch;`) and to every site in `crates/runtimes/virt_container/src/{lib,sandbox}.rs` that names a now-gated type (`Dragonball`, `Firecracker`, `DragonballConfig`, `FirecrackerConfig`). * `pub(crate) enum VmmState` in `crates/hypervisor/src/lib.rs` gets the same target_arch gate -- its only consumers are the `ch`, `dragonball` and `firecracker` modules, all of which are gated to x86_64+aarch64. Without it, `cargo clippy --all-features -- -D warnings` (i.e. what `make check` runs via `standard_rust_check`) would fail on non-x86 with "enum `VmmState` is never used". The plain `HYPERVISOR_DRAGONBALL` and `HYPERVISOR_FIRECRACKER` string constants stay ungated, and the persist-side match arms in `sandbox.rs` that only compare against those strings also stay ungated, mirroring how `HYPERVISOR_NAME_CH` is already handled. Verified with `cargo tree --target=<triple> --features dragonball -p hypervisor` for x86_64/aarch64/s390x/powerpc64le/riscv64gc: * x86_64/aarch64: full dragonball stack (dbs_address_space, dbs_allocator, dbs_arch, dbs_boot, dbs_device, dbs_interrupt, dbs_legacy_devices, dbs_pci, dbs_upcall, dbs-utils, hyperlocal, ...) is pulled in, as before. 
* s390x/ppc64le/riscv64gc: the dep graph for the `hypervisor` crate is completely free of any dragonball or firecracker artifacts, even with `--features dragonball` explicitly enabled. `cargo clippy --target=s390x-unknown-linux-gnu --all-targets --all-features --release --locked -- -D warnings` is also clean, and `make check` on x86_64 with the default `USE_BUILTIN_DB=true` still passes. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-04-28 16:25:31 +02:00
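
The hugetlbfs page-size discovery described in commit 7e5cc37fab can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from that commit: the function names (`parse_hugepage_entry`, `render_binary_unit`, `discover_hugepage_sizes`, `mount_pagesize`) are hypothetical. It covers only listing `hugepages-NkB` entries, rendering them as binary-unit strings, and stripping the trailing `i` for the `pagesize=` mount option. The probe mount is omitted.

```rust
use std::fs;
use std::path::Path;

/// Parse a sysfs entry name such as "hugepages-2048kB" into a size in KiB.
/// Returns None for names that do not follow that pattern.
fn parse_hugepage_entry(name: &str) -> Option<u64> {
    name.strip_prefix("hugepages-")?
        .strip_suffix("kB")?
        .parse()
        .ok()
}

/// Render a size given in KiB as a binary-unit string, e.g. "2Mi" or "1Gi".
fn render_binary_unit(kib: u64) -> String {
    const KIB_PER_MIB: u64 = 1024;
    const KIB_PER_GIB: u64 = 1024 * 1024;
    if kib >= KIB_PER_GIB && kib % KIB_PER_GIB == 0 {
        format!("{}Gi", kib / KIB_PER_GIB)
    } else if kib >= KIB_PER_MIB && kib % KIB_PER_MIB == 0 {
        format!("{}Mi", kib / KIB_PER_MIB)
    } else {
        format!("{}Ki", kib)
    }
}

/// List the page sizes found under `dir` (normally /sys/kernel/mm/hugepages),
/// largest first. A missing or unreadable directory yields an empty list,
/// which a caller can treat as "nothing to test".
fn discover_hugepage_sizes(dir: &Path) -> Vec<String> {
    let Ok(entries) = fs::read_dir(dir) else {
        return Vec::new();
    };
    let mut sizes: Vec<u64> = entries
        .flatten()
        .filter_map(|entry| {
            let name = entry.file_name();
            parse_hugepage_entry(name.to_str()?)
        })
        .collect();
    sizes.sort_unstable_by(|a, b| b.cmp(a));
    sizes.into_iter().map(render_binary_unit).collect()
}

/// Convert the IEC form ("2Mi") to the kernel-native shorthand ("2M")
/// used for the `pagesize=` mount option.
fn mount_pagesize(iec: &str) -> &str {
    iec.strip_suffix('i').unwrap_or(iec)
}
```

A caller would pass `Path::new("/sys/kernel/mm/hugepages")` to `discover_hugepage_sizes` and return early when the result is empty, matching the skip behaviour the commit message describes.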

18855 Commits