Commit Graph

18842 Commits

Author SHA1 Message Date
Fabiano Fidêncio
8655d87892 ci: nvidia: Disable NVRC trace logging on nightly runs
On nightly CI, run the NVIDIA GPU tests without setting nvrc.log=trace.
This gives us end-to-end test coverage that more closely matches how
users would actually run Kata Containers with NVIDIA GPUs, since trace
logging is not enabled by default in production.

NVRC trace logging remains enabled for PR runs, where the extra
verbosity is useful for debugging failures.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-03 18:13:07 +02:00
Steve Horsman
86e5975ad6 Merge pull request #12973 from stevenhorsman/release-concurrency-fix
release: fix release workflow concurrency deadlock
3.30.0
2026-05-02 20:16:29 +01:00
stevenhorsman
9715a7cca2 release: fix release workflow concurrency deadlock
Architecture-specific release workflows were using the same concurrency
group when called from release.yaml, causing GitHub Actions to detect
a deadlock and cancel the builds.

Fix by appending architecture suffix to each workflow's concurrency
group, allowing parallel execution without conflicts.

Assisted-by: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-05-02 20:13:17 +01:00
Fabiano Fidêncio
5540f50198 Merge pull request #12972 from stevenhorsman/release/3.30.0
release: Bump version to 3.30.0
2026-05-02 20:54:54 +02:00
Steve Horsman
fd2b85f8ad Merge pull request #12969 from burgerdev/require-codegen
gatekeeper: require codegen
2026-05-02 18:38:53 +01:00
stevenhorsman
a1a6a9a150 release: Bump version to 3.30.0
Bump VERSION and helm-charts versions.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-05-02 17:57:39 +01:00
Steve Horsman
3ae3a0437b Merge pull request #12963 from zvonkok/copyfail
kernel: Bump Kernel Version
2026-05-02 16:58:53 +01:00
Markus Rudy
22598a34b2 gatekeeper: require codegen
The codegen check ensures that generated files are up-to-date and
correspond to the tool versions used in CI. Requiring this check
prevents us from accidentally merging, e.g., proto changes without the
corresponding Rust/Go updates.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2026-05-02 12:28:58 +02:00
Zvonko Kaiser
803531dd9c kernel: Bump Kernel Version
"Copy Fail" (CVE-2026-31431) is a high-severity local privilege escalation (LPE)
vulnerability found in the Linux kernel in April 2026, which affects all major
Linux distributions, including those using Long Term Support (LTS) kernels, released since 2017.
The bug allows an unprivileged user to gain root access, escape containers,
and modify the in-memory page cache reliably using a tiny 732-byte script.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-05-01 14:21:49 +00:00
Steve Horsman
62b847fd6c Merge pull request #12850 from burgerdev/remove-standard-oci-runtime
agent: remove standard-oci-runtime feature
2026-05-01 12:44:10 +01:00
Fabiano Fidêncio
79ba4e2dd0 Merge pull request #12937 from fidencio/topic/kata-deploy-support-containerd-config-version-4
kata-deploy: support containerd config version 4
2026-05-01 07:46:36 +02:00
Fabiano Fidêncio
96b68e77a7 kata-deploy: support containerd config schema version 4 and newer
Containerd 2.3.0 introduces config schema version 4 (see upstream
RELEASES.md and the version-4 server-plugin documentation). The default file
still uses the same split-CRI layout as version 3 (plugins under
io.containerd.cri.v1.runtime and io.containerd.cri.v1.images). Schema v4
mainly moves gRPC, TTRPC, debug, and metrics listener settings under
io.containerd.server.v1.*; kata-deploy does not edit those server tables except
for containerd log verbosity when DEBUG=true.
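
As a rough illustration of the layout described above, the sketch below maps a config schema version to the plugin table that holds the CRI runtime settings. The function name is hypothetical and is not taken from kata-deploy. The version 2 table name is an assumption from containerd's documentation rather than from this commit, and treating versions newer than 4 like version 3 is an assumption based on the commit subject.

```rust
/// Hypothetical helper: pick the plugin table that carries the CRI
/// runtime settings for a given containerd config schema version.
///
/// Assumptions: version 2 uses the monolithic `io.containerd.grpc.v1.cri`
/// table, and versions 3 and newer use the split-CRI layout named in the
/// commit message. Older versions are not modeled here.
fn cri_runtime_plugin_id(config_version: u32) -> Option<&'static str> {
    match config_version {
        2 => Some("io.containerd.grpc.v1.cri"),
        v if v >= 3 => Some("io.containerd.cri.v1.runtime"),
        _ => None,
    }
}
```

Under these assumptions, versions 3 and 4 resolve to the same table, which is why the version bump needs no new layout handling for the runtime plugin.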

Fixes: #12936

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-30 16:23:43 +02:00
Steve Horsman
31f7438ecd Merge pull request #12949 from stevenhorsman/kata-ctl/move-into-root-workspace
kata-ctl: Move into root workspace
2026-04-30 11:45:50 +01:00
stevenhorsman
b61b3d2f20 kata-deploy: Update default tool binary location
Now that all of the tools except agent-ctl (still WIP) are
in the root workspace, switch the default to that and add
an exception for agent-ctl, as it's the odd one out.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-30 08:46:22 +01:00
stevenhorsman
f8cf47d17c kata-ctl: fix clippy to_string_in_format_args warnings
With the workspace unification we've bumped anyhow
from 1.0.31 to 1.0.102, so update the code to reflect that
error implements `Display` now in the newer version.
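
For readers unfamiliar with the lint named in the subject, here is a minimal sketch of the pattern it flags. `DemoError` is a hypothetical stand-in for the real error type, and the messages are invented for illustration; this is not the kata-ctl code.

```rust
use std::fmt;

/// Hypothetical error type standing in for the real one.
#[derive(Debug)]
struct DemoError(&'static str);

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "demo failure: {}", self.0)
    }
}

/// Form flagged by clippy's `to_string_in_format_args`: the explicit
/// `to_string()` builds a temporary `String` that `format!` formats again.
fn message_with_to_string(err: &DemoError) -> String {
    format!("command failed: {}", err.to_string())
}

/// Preferred form: `format!` uses the `Display` impl directly.
fn message_with_display(err: &DemoError) -> String {
    format!("command failed: {}", err)
}
```

Both functions produce the same string; the second one skips the intermediate allocation.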

Assisted-by: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-30 08:45:27 +01:00
stevenhorsman
efe62c9280 kata-ctl: Move into root workspace
Add kata-ctl as a workspace member to simplify
dependency management.

Assisted-by: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-30 08:45:27 +01:00
Fabiano Fidêncio
1e6c54cbcf Merge pull request #12856 from harshitgupta1337/cbl-mariner-config-return-0
rootfs: Suppress condition check failure errors in cbl-mariner/config.sh
2026-04-30 08:35:06 +02:00
Fabiano Fidêncio
3b978c77ed Merge pull request #12950 from stevenhorsman/trace-forwarder/move-to-root-workspace
trace-forwarder: Move into root workspace
2026-04-29 23:54:43 +02:00
Harshit Gupta
3b796c6579 rootfs: mariner: suppress condition check failure errors
Avoid returning failure from sourced scripts when a condition check evaluates
to false.

Signed-off-by: Harshit Gupta <guptaharshit@microsoft.com>
2026-04-29 14:11:32 -04:00
Fabiano Fidêncio
5f59e20032 Merge pull request #12944 from fidencio/topic/run-arm64-ci-on-PR-again
Revert "ci: Only run arm64 k8s tests on nightly builds"
2026-04-29 15:30:22 +02:00
stevenhorsman
9cae783f14 kata-deploy: fix binary location for trace-forwarder
Moving the trace-forwarder into the root workspace moves the target
directory, so update this target.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-29 13:27:09 +01:00
stevenhorsman
7664ebda7e trace-forwarder: Move into root workspace
Add trace-forwarder as a workspace member to simplify
dependency management.

Assisted-by: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-29 12:11:04 +01:00
Fabiano Fidêncio
1a22c3adec Merge pull request #12942 from stevenhorsman/fix-cri-containerd-test-names
ci: Fix cri-containerd-test names
2026-04-29 09:56:43 +02:00
Fabiano Fidêncio
ef15324b04 Revert "ci: Only run arm64 k8s tests on nightly builds"
This reverts commit c5b159c556, as now we
have 3 runners plugged into the CI.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-29 07:38:12 +02:00
Steve Horsman
2435970fe8 Merge pull request #12933 from fidencio/topic/runtime-rs-decouple-dragonball-from-non-x86-checks
runtime-rs: drop misleading unsupported arches gating
2026-04-28 18:36:16 +01:00
stevenhorsman
4d4dee3af2 ci: Fix cri-containerd-test names
During the zizmor refactoring I changed the names of two jobs
to make all the architectures match. I forgot to update required_tests,
and because it was a workflow-only change the PR didn't check this, so update
them now.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 18:30:53 +01:00
Aurélien Bombo
886d05c7ee Merge pull request #12898 from kata-containers/sprt/clh-runtime-rs
runtime-rs: rename `cloud-hypervisor` to `clh-runtime-rs`
2026-04-28 11:50:56 -05:00
Aurélien Bombo
f3dc71a770 Revert "tests: k8s: policy: improve settings selection for runtime-rs hypervisors"
This reverts commit cafdd278ba.
2026-04-28 10:58:01 -05:00
Aurélien Bombo
dc0f1795de kata-deploy: remove useless unit tests
These essentially only test format!(), which is not our job.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Aurélien Bombo
cf6a91a104 runtime-rs/config: rename cloud-hypervisor to clh
This aligns with the previous commit and runtime-go.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Aurélien Bombo
e4fbddb91a ci: rename cloud-hypervisor to clh-runtime-rs
This aligns with qemu-runtime-rs and makes more sense.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Fabiano Fidêncio
99b6dcf411 Merge pull request #12935 from fidencio/topic/rootfs-add-mlx5-modules
kernel: bake in Mellanox MLX5 Ethernet support
2026-04-28 17:24:31 +02:00
Fabiano Fidêncio
7e5cc37fab runtime-rs: resource: discover hugetlbfs page sizes from sysfs in test
`volume::hugepage::tests::test_get_huge_page_size` was hard-coded to
exercise the round-trip through `get_huge_page_option` /
`get_page_size` for two hugetlbfs page sizes:

    let format_sizes = ["1Gi", "2Mi"];

These are the sizes x86_64 Ubuntu kernels expose by default
(`/sys/kernel/mm/hugepages/hugepages-{1048576,2048}kB`), but other
architectures use different sizes:

  * s390x: typically `hugepages-1048576kB` only (1 GiB; no 2 MiB pool)
    -- the kernel returns `EINVAL` for the missing 2 MiB iteration:

      thread 'volume::hugepage::tests::test_get_huge_page_size'
      panicked at .../resource/src/volume/hugepage.rs:242:14:
      called `Result::unwrap()` on an `Err` value: EINVAL

  * ppc64le: page sizes vary by kernel build (e.g. 16M/16G with 64K
    base pages, 2M/1G with 4K base pages), and may not match
    `["1Gi", "2Mi"]` exactly. Same EINVAL on the iteration whose
    size isn't a registered hstate.

The reason this never bit before is the same as the SELinux test
in the previous-but-one commit: the runtime-rs `Makefile` wrapped
`test` in an `ifeq UNSUPPORTED_ARCHS` block that turned it into
`echo ...; exit 0` on s390x/ppc64le/riscv64gc, so the test was
only ever exercised on x86_64 (and aarch64, which happens to have
the same default hugetlb page sizes). Dropping that gate is what
exposed the latent assumption.

Replace the hard-coded list with a small helper that lists the
hugetlbfs page sizes the running kernel actually exposes via
`/sys/kernel/mm/hugepages/hugepages-NkB`, rendered as binary-unit
strings (e.g. "2Mi", "1Gi") that are accepted both by the kernel's
`pagesize=...` mount option and by `byte_unit::Byte::parse_str(s,
/*allow_binary=*/ true)`. If `/sys/kernel/mm/hugepages` doesn't
exist or the directory is empty (e.g. hugetlbfs is unconfigured in
the test environment) the test simply returns -- there's nothing
meaningful to round-trip.

On x86_64 the discovered list comes out as `["1Gi", "2Mi"]` (the
same coverage as before). On s390x it becomes `["1Gi"]`, on ppc64le
whatever that kernel build supports.
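
The discovery step described above could look roughly like the following. This is a sketch using only the standard library, not the helper from the commit: the function names are hypothetical, and the largest-first ordering is an assumption chosen to reproduce the `["1Gi", "2Mi"]` order mentioned for x86_64.

```rust
use std::fs;
use std::path::Path;

/// Extract the size in kB from a sysfs name such as `hugepages-2048kB`.
fn parse_sysfs_kib(name: &str) -> Option<u64> {
    name.strip_prefix("hugepages-")?
        .strip_suffix("kB")?
        .parse()
        .ok()
}

/// Render a size given in KiB as a binary-unit string ("2Mi", "1Gi").
fn format_iec(kib: u64) -> String {
    const MI: u64 = 1024;
    const GI: u64 = 1024 * 1024;
    if kib >= GI && kib % GI == 0 {
        format!("{}Gi", kib / GI)
    } else if kib >= MI && kib % MI == 0 {
        format!("{}Mi", kib / MI)
    } else {
        format!("{}Ki", kib)
    }
}

/// List the page sizes found under `dir`, largest first (assumed order).
/// A missing or unreadable directory yields an empty list so that the
/// caller can return early.
fn discover_huge_page_sizes(dir: &Path) -> Vec<String> {
    let Ok(entries) = fs::read_dir(dir) else {
        return Vec::new();
    };
    let mut sizes_kib: Vec<u64> = entries
        .flatten()
        .filter_map(|entry| {
            let name = entry.file_name();
            parse_sysfs_kib(name.to_str()?)
        })
        .collect();
    sizes_kib.sort_unstable_by(|a, b| b.cmp(a));
    sizes_kib.into_iter().map(format_iec).collect()
}
```

A caller would pass `Path::new("/sys/kernel/mm/hugepages")` and skip the round-trip when the returned list is empty.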

Sysfs alone, however, is a necessary-but-not-sufficient signal: it
tells us the kernel registered the page size, not whether *this
process* is allowed to mount hugetlbfs. The ubuntu-24.04-s390x GHA
runner demonstrates the gap -- it exposes `hugepages-1048576kB`
via /sys but runs the build inside a user/mount namespace where
mount(2) of hugetlbfs returns EPERM even when the test is invoked
through sudo:

    thread 'volume::hugepage::tests::test_get_huge_page_size'
    panicked at .../resource/src/volume/hugepage.rs:292:14:
    called `Result::unwrap()` on an `Err` value: EPERM

There's no portable capability bit we can sniff for that, so probe
once with the first discovered size before iterating; if the probe
mount fails, skip the test (rather than panic on something it
can't control). A real regression on a host where mount() *does*
work will still surface inside the loop below, since the per-size
mount calls there continue to assert via `.unwrap()`.
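
The probe-then-iterate flow from the paragraph above can be sketched as follows. The `mount` closure is a hypothetical stand-in for the real hugetlbfs mount call, injected so the control flow can be exercised without privileges; `Outcome` and `run_round_trip` are illustrative names only.

```rust
/// Result of the sketch: either the test was skipped, or it ran over
/// the given number of page sizes.
#[derive(Debug, PartialEq)]
enum Outcome {
    Skipped,
    Ran(usize),
}

/// Probe once with the first size. If the probe fails, skip. Otherwise
/// iterate over every size and panic on any later failure, as the
/// per-size `.unwrap()` in the commit message does.
fn run_round_trip<F>(sizes: &[&str], mut mount: F) -> Outcome
where
    F: FnMut(&str) -> Result<(), String>,
{
    let Some(&first) = sizes.first() else {
        // Nothing discovered: nothing meaningful to round-trip.
        return Outcome::Skipped;
    };
    if mount(first).is_err() {
        // Mounting is not possible in this environment (for example EPERM).
        return Outcome::Skipped;
    }
    for &size in sizes {
        // A failure here is a real regression, so it must surface.
        mount(size).unwrap();
    }
    Outcome::Ran(sizes.len())
}
```

The design choice is that only the first mount is allowed to fail quietly; once the environment has proven it can mount hugetlbfs, every later failure is treated as a bug.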

While here, feed the kernel-native shorthand (e.g. "2M", "1G")
rather than the IEC form ("2Mi", "1Gi") to mount(2). hugetlbfs
parses `pagesize=` via `memparse()`, which understands K/M/G but
not the IEC `Ki/Mi/Gi`; today the kernel happens to silently drop
the trailing `i` (memparse just stops scanning), but that leniency
is incidental. /proc/mounts in turn always renders the option back
as `pagesize=<N>{K,M,G}`, which is exactly the form
`get_page_size()` already expects -- it strips `pagesize=` and
unconditionally appends `i` before handing the result to byte_unit.
Stripping the `i` for the mount option keeps the test's input
aligned with the kernel's canonical syntax, while leaving the IEC
form intact for the `Byte::parse_str(..., /*allow_binary=*/ true)`
comparison.
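
A minimal sketch of the two string conversions this paragraph describes, with hypothetical function names (the real `get_page_size()` is not reproduced here):

```rust
/// Build the mount option in the kernel's shorthand by dropping the
/// trailing `i` from an IEC string ("2Mi" becomes "pagesize=2M").
fn mount_option_from_iec(iec: &str) -> String {
    let kernel_form = iec.strip_suffix('i').unwrap_or(iec);
    format!("pagesize={kernel_form}")
}

/// Reverse direction, as described for `get_page_size()`: strip
/// `pagesize=` and append `i` to get back the IEC form.
fn iec_from_mount_option(option: &str) -> Option<String> {
    let size = option.strip_prefix("pagesize=")?;
    Some(format!("{size}i"))
}
```

Feeding the output of the first function into the second returns the original IEC string, which is the round-trip the test relies on.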

Also drop the unused `Ok` re-export from
`use anyhow::{anyhow, Context, Ok, Result}`. Every existing
`Ok(...)` site in this module is the variant-constructor form, for
which the prelude's `Result::Ok` already works fine in
`anyhow::Result<T>` context (same enum, with `E = anyhow::Error`
inferred from the surrounding return type), so nothing actually
needed `anyhow::Ok` to begin with. Removing the import lets the
new helper use plain `let Ok(entries) = ... else` /
`let Ok(name) = ... else` patterns directly instead of funneling
everything through `.ok()` + `if let Some(...)` to dodge the
shadowing.

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-28 16:25:31 +02:00
Fabiano Fidêncio
cd67638618 runtime-rs: hypervisor: don't assert kernel LSM behaviour in selinux test
`selinux::tests::test_set_exec_label` had two branches: when SELinux
is enabled it asserts that `set_exec_label` succeeds and round-trips
the label through `/proc/thread-self/attr/exec`, and when SELinux is
NOT enabled it asserted that `set_exec_label` returns `Err`. The
second assertion is wrong -- it's a claim about the kernel/LSM
interface, not about `set_exec_label` itself.

`/proc/thread-self/attr/exec` is a generic LSM interface, not
SELinux-specific. When no LSM owns the slot, kernel behaviour is
arch/distro/build dependent: some kernels return `EINVAL` (observed
on x86_64 Ubuntu CI runners, where the test was originally written
and was passing), others silently accept the write (observed on
ppc64le Ubuntu CI runners, which is what made this surface):

  thread 'selinux::tests::test_set_exec_label' panicked at
  src/runtime-rs/crates/hypervisor/src/selinux.rs:62:13:
  Expecting error, Got Ok(())

The reason this never blew up before is that the previous-but-one
commit's `ifeq UNSUPPORTED_ARCHS ... exit 0` block in the runtime-rs
`Makefile` made `make test` a no-op on s390x/ppc64le/riscv64gc.
Dropping that gate (so `make test` actually runs on every arch
that runtime-rs builds on) is what surfaced the latent bug.

Drop the `else { assert!(ret.is_err(), ...); }` branch and replace
it with a comment explaining why we deliberately don't assert on
`ret` in that path. The "SELinux is enabled" branch is the only
side that exercises anything we own; the no-SELinux path is a
kernel detail that's not ours to normalize.
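
The resulting test logic can be summarized with the sketch below. The function is hypothetical and only models the decision; it does not call the real `set_exec_label`.

```rust
/// Decide whether a result from a hypothetical `set_exec_label` call
/// is acceptable for the test.
fn exec_label_result_is_acceptable(selinux_enabled: bool, ret: &Result<(), String>) -> bool {
    if selinux_enabled {
        // The only branch that exercises code the project owns.
        ret.is_ok()
    } else {
        // With no LSM owning the slot, the kernel may return EINVAL or
        // accept the write; neither outcome is asserted.
        true
    }
}
```

With SELinux disabled, both `Ok(())` and `Err(..)` pass, which matches the x86_64 and ppc64le observations quoted above.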

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-28 16:25:31 +02:00
Fabiano Fidêncio
8ab97a60f3 ci: install protobuf-compiler for runtime-rs build-checks
The `runtime-rs` component of `build-checks.yaml` declared `rust`
as its only dependency, but the runtime-rs build pulls in
`prost-build v0.8.0` (via `ttrpc-codegen` -> `containerd-shim-protos`,
and via the in-tree `hypervisor` crate), and `prost-build`'s build
script needs a `protoc` binary at compile time.

This worked on x86_64 and aarch64 only because `prost-build v0.8.0`
ships bundled `protoc` binaries for those targets. On s390x (and
ppc64le, when the matrix gets there) there is no bundled binary,
so the build fails with:

  Failed to find the protoc binary. The PROTOC environment variable
  is not set, there is no bundled protoc for this platform, and
  protoc is not in the PATH

The reason this didn't show up in CI before is that `make test`
and `make check` for runtime-rs were wrapped in arch-specific
`ifeq` blocks in `src/runtime-rs/Makefile` that turned them into
no-ops on s390x/ppc64le/riscv64gc. The previous commit dropped
those gates so `make {test,check}` now actually run on every arch,
which exposes this latent CI gap.

Match what `agent`, `libs`, `agent-ctl`, `kata-ctl` and `genpolicy`
already declare and add `protobuf-compiler` to runtime-rs's needs.
The existing `Install protobuf-compiler` step in this workflow
already runs `sudo apt-get -y install protobuf-compiler`, which
the s390x/ppc64le runners support (those other components have
been using it on s390x for some time).

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-28 16:25:31 +02:00
Fabiano Fidêncio
48ef1be3be runtime-rs: Drop misleading "unsupported arch" gates
The Makefile pretended to reject s390x, powerpc64le and riscv64gc
by wrapping `default`, `test` and `install` in `ifeq UNSUPPORTED_ARCHS`,
and `check` in `ifeq ($(ARCH),x86_64)`. In reality `default` and
`install` were byte-for-byte identical in both branches, so only
`test` and `check` were ever skipped. The user-visible "$(ARCH) is
not currently supported" message and the bare `exit 0` made it look
like the build was a no-op when in fact builds and installs were
proceeding -- which has burned at least one maintainer trying to
debug a downstream packaging failure (issue #12914).

The original reasons those targets were skipped were:

  * `test` (commit 389ae9702, 2022): `cargo test` would pull in the
    dragonball crate, which only builds on x86_64/aarch64.
  * `check`: delegates to `standard_rust_check` in utils.mk, which
    runs `cargo clippy --all-targets --all-features`. `--all-features`
    unconditionally turns on the `dragonball` (and `cloud-hypervisor`)
    feature regardless of arch, breaking the build wherever those
    crates can't compile.

Both are now obsolete. The preceding commit arch-gated the
dragonball and firecracker drivers (and their dependencies) at the
Cargo and Rust source level, so on s390x/ppc64le/riscv64gc:

  * the `dragonball` cargo feature is a safe no-op -- enabling it
    just doesn't pull in the dep,
  * the `cloud-hypervisor` cargo feature still pulls in `ch-config`
    (which is portable Rust), but the `ch` driver module that uses
    it remains arch-gated at the source level,
  * `dbs-utils` and `hyperlocal` are not built at all.

That means `cargo clippy --all-targets --all-features` -- exactly
what `standard_rust_check` runs -- is safe on every architecture,
and no runtime-rs-local override of `check` is needed. Drop both
`ifeq` blocks and let `test` and `check` run on every arch the way
`default` and `install` already did. Net result: `make {default,
test,check,install}` now Just Work everywhere, with no arch-specific
code paths in this Makefile and no misleading "not currently
supported" messages.

Fixes: #12914

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-28 16:25:31 +02:00
Fabiano Fidêncio
6a1d7f7d85 runtime-rs: Arch-gate dragonball and firecracker hypervisors
Two of the in-tree hypervisor drivers, dragonball and firecracker,
along with three of their transitive dependencies (the dragonball
crate itself, dbs-utils, hyperlocal), are built unconditionally on
every architecture even though both upstream projects only support
x86_64 and aarch64:

  * dragonball: the dragonball VMM crate is x86_64+aarch64 only.
    The runtime-rs `dragonball` cargo feature is already gated via
    `USE_BUILTIN_DB` -> `ARCH_SUPPORT_DB` in the Makefile, so the
    default `make` flow does the right thing today. But anything
    that bypasses that gate -- a contributor running `cargo clippy
    --all-features`, a CI matrix that forces the feature on, etc.
    -- fails to build on s390x/ppc64le/riscv64gc, because the
    optional `dragonball` dependency is declared without a target
    predicate and Rust source sites reference it under a feature
    gate alone.

  * firecracker: firecracker upstream only releases for x86_64 and
    aarch64
    (https://github.com/firecracker-microvm/firecracker/releases/tag/v1.15.1).
    The Makefile already reflects this -- `FCCMD` is only defined
    in the x86_64/aarch64 arch options files -- but the in-tree
    `firecracker` driver module compiles unconditionally, so on
    s390x/ppc64le/riscv64gc we still ship a runtime that thinks it
    can drive a hypervisor binary that doesn't exist on the platform.

Decouple both at the Cargo and Rust source level, mirroring the
existing cloud-hypervisor pattern.

  * Cargo.toml: move the optional `dragonball` dependency, plus
    `dbs-utils` and `hyperlocal` (whose only consumers are the
    dragonball and firecracker driver modules), into a target-
    specific dependency block:

        [target.'cfg(any(target_arch = "x86_64",
                         target_arch = "aarch64"))'.dependencies]
        dbs-utils  = { workspace = true }
        hyperlocal = { workspace = true }
        dragonball = { workspace = true, features = [ ... ],
                       optional = true }

    On x86_64/aarch64 the resolved dep graph is unchanged. On
    s390x/ppc64le/riscv64gc enabling the `dragonball` feature
    becomes a safe no-op, and the dep graph for the `hypervisor`
    crate is completely free of any dragonball or firecracker
    artifacts. This also makes the gating self-policing: any
    future `use dbs_utils::...` or `use hyperlocal::...` outside
    an arch-gated module will fail to build on non-x86 instead of
    silently shipping dead code.

  * Rust modules: combine the existing `feature = "dragonball"`
    gate with `target_arch = "x86_64"|"aarch64"` on
    `pub mod dragonball;` and the dragonball-only constants
    (`DEV_HUGEPAGES`, `SHMEM`, `HUGE_SHMEM`) in
    `crates/hypervisor/src/lib.rs`. Add the same target_arch gate
    to `pub mod firecracker;` (matching the existing gate on
    `pub mod ch;`) and to every site in
    `crates/runtimes/virt_container/src/{lib,sandbox}.rs` that
    names a now-gated type (`Dragonball`, `Firecracker`,
    `DragonballConfig`, `FirecrackerConfig`).

  * `pub(crate) enum VmmState` in `crates/hypervisor/src/lib.rs`
    gets the same target_arch gate -- its only consumers are the
    `ch`, `dragonball` and `firecracker` modules, all of which
    are gated to x86_64+aarch64. Without it, `cargo clippy
    --all-features -- -D warnings` (i.e. what `make check` runs
    via `standard_rust_check`) would fail on non-x86 with
    "enum `VmmState` is never used".

The plain `HYPERVISOR_DRAGONBALL` and `HYPERVISOR_FIRECRACKER`
string constants stay ungated, and the persist-side match arms in
`sandbox.rs` that only compare against those strings also stay
ungated, mirroring how `HYPERVISOR_NAME_CH` is already handled.
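
The source-level gating described above can be illustrated with a single-file sketch. Module and function names are stand-ins rather than the real crate layout. The real gate on `dragonball` additionally combines `feature = "dragonball"` with the `target_arch` predicate, which a single file without Cargo features cannot show.

```rust
/// Stand-in for a driver module that only builds where its VMM exists.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
mod firecracker {
    pub const NAME: &str = "firecracker";
}

/// Stand-in for a portable driver that is compiled on every architecture.
mod qemu {
    pub const NAME: &str = "qemu";
}

/// Names of the drivers compiled into this build. Every site that
/// names a gated module carries the same `cfg` predicate as the module.
fn compiled_drivers() -> Vec<&'static str> {
    let mut drivers = vec![qemu::NAME];
    #[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
    drivers.push(firecracker::NAME);
    drivers
}
```

On s390x, ppc64le, or riscv64gc the `firecracker` module does not exist at all, so an ungated reference to it would be a compile error rather than dead code.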

Verified with `cargo tree --target=<triple> --features dragonball
-p hypervisor` for x86_64/aarch64/s390x/powerpc64le/riscv64gc:

  * x86_64/aarch64: full dragonball stack (dbs_address_space,
    dbs_allocator, dbs_arch, dbs_boot, dbs_device, dbs_interrupt,
    dbs_legacy_devices, dbs_pci, dbs_upcall, dbs-utils,
    hyperlocal, ...) is pulled in, as before.
  * s390x/ppc64le/riscv64gc: the dep graph for the `hypervisor`
    crate is completely free of any dragonball or firecracker
    artifacts, even with `--features dragonball` explicitly
    enabled.

`cargo clippy --target=s390x-unknown-linux-gnu --all-targets
--all-features --release --locked -- -D warnings` is also clean,
and `make check` on x86_64 with the default `USE_BUILTIN_DB=true`
still passes.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-28 16:25:31 +02:00
Steve Horsman
a85ed99252 Merge pull request #12938 from stevenhorsman/build-checks-concurrency-fix
workflows: Remove workflow concurrency
2026-04-28 15:13:42 +01:00
stevenhorsman
09ac10e8df workflows: Remove workflow concurrency
It seems like some of our workflow concurrency rules are clashing
with the job-level ones for some reason and cancelling jobs, so
remove these problematic workflow rules.

Co-authored-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 14:56:07 +01:00
Steve Horsman
47f5de85bb Merge pull request #12323 from kata-containers/zizmor-update-to-1.20
workflows: zizmor update to 1.22
2026-04-28 13:30:14 +01:00
stevenhorsman
d5411e00f6 workflows: Fix version on pinned action
docker/build-push-action@bcafcacb16
seemed to be given the wrong version in the comment, so
update this to be correct

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 13:10:36 +01:00
stevenhorsman
063a13ccd0 workflows: Bump zizmor to 1.22
Bump zizmor to the 1.22 version to pick up new rule updates.
Later bumps to follow once this has proven stable

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 13:10:36 +01:00
stevenhorsman
92ded7ff98 workflows: Add timeouts
Recently I've seen a couple of occasions where
jobs have seemed to run indefinitely. Add timeouts
for these jobs to stop this from happening if things
get into a bad state.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 13:10:36 +01:00
stevenhorsman
af4ced32f4 workflows: Add concurrency limits
It is good practice to add concurrency limits to automatically
cancel jobs that have been superseded and potentially stop
race conditions if we try and get artifacts by workflows and job id
rather than run id.

See https://docs.zizmor.sh/audits/#concurrency-limits

Assisted-by: IBM Bob

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 13:10:36 +01:00
Fabiano Fidêncio
5eefbbafb3 Merge pull request #12899 from kata-containers/topic/runtime-rs-docker-qemu
runtime-rs: qemu: Add docker support and tests
2026-04-28 13:23:22 +02:00
Fabiano Fidêncio
2aa09df780 Merge pull request #12860 from Xynnn007/release/slsa-for-artifacts
ci: enforce SLSA provenance for published artifacts
2026-04-28 12:36:38 +02:00
Xynnn007
f4a9847877 ci: enforce SLSA provenance for published artifacts
Published artifacts are consumed as security-critical runtime inputs, so
they need verifiable provenance that binds each binary back to the exact
source and build context.

Without provenance, downstream users cannot reliably distinguish trusted
CI outputs from repackaged or substituted artifacts.

Recording provenance in Sigstore's immutable transparency infrastructure
provides auditable evidence that survives mirror/registry movement and
strengthens supply-chain forensics and policy enforcement.

This also aligns artifact publication with a zero-trust verification
model expected by confidential-computing consumers and automated
admission controls.

Remove workflow-level attestation gating so published artifacts are
consistently accompanied by build provenance.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2026-04-28 11:40:15 +02:00
Fabiano Fidêncio
a5e1521727 kernel: bake in Mellanox MLX5 Ethernet support
The MLX5 Ethernet driver is useful well beyond the DPU/SmartNIC use case
(any guest sitting on top of a Mellanox/ConnectX NIC benefits from it),
yet the existing config fragment lived under dpu/ and was only pulled in
when the kernel was built with `-D nvidia`.

Promote it to a first-class common fragment so every Kata kernel gets
MLX5 Ethernet built in.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-28 11:02:39 +02:00
Markus Rudy
044c96a9d6 agent: remove standard-oci-runtime feature
This feature was only added for runk, which was removed entirely in
96e1fb4ca6.

Fixes: #12849
Signed-off-by: Markus Rudy <mr@edgeless.systems>
2026-04-28 10:35:14 +02:00
Fabiano Fidêncio
3ef2c5db65 docs: docker: Update docs to mention runtime-rs and what's tested
Now that we're adding support for the rust runtime, let's also update
the docs.

We may also need to update the docs again once we start testing with
different VMMs, but that's not in the scope for this PR.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-28 10:22:21 +02:00