18993 Commits

Author SHA1 Message Date
Fabiano Fidêncio
c19bdbf23b tests: nvidia-nim: use trusted storage templates for runtime-rs
Now that runtime-rs supports block-encrypted emptyDir volumes, remove
the no-trusted-storage workaround templates and the is_runtime_rs
branching in the NIM test. Runtime-rs now uses the same TEE templates
as the Go runtime with emptyDir + PVC at 48Gi memory, instead of the
128Gi workaround that compensated for lacking trusted storage.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 22:56:11 +02:00
Fabiano Fidêncio
54aaa1ea2a tests: enable trusted ephemeral storage for runtime-rs
Remove the runtime-rs skip from the trusted ephemeral data storage
test now that runtime-rs implements block-encrypted emptyDir volumes.

Also remove the genpolicy drop-in that disabled encrypted_emptydir
for runtime-rs and the corresponding copy logic in tests_common.sh.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 22:56:11 +02:00
Fabiano Fidêncio
aa7392b1b9 runtime-rs: add emptydir_mode to config templates
Add the emptydir_mode configuration option to all runtime-rs config
template files. CoCo configs (snp, tdx, se, coco-dev, nvidia-gpu-snp,
nvidia-gpu-tdx) default to block-encrypted via @DEFEMPTYDIRMODE_COCO@,
while non-CoCo configs (qemu, nvidia-gpu, fc) default to shared-fs
via @DEFEMPTYDIRMODE@.

Also add DEFEMPTYDIRMODE and DEFEMPTYDIRMODE_COCO variables to the
runtime-rs Makefile for template substitution.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 22:56:11 +02:00
Fabiano Fidêncio
5e2ca6d6ee runtime-rs: skip local type conversion for block-encrypted emptyDirs
When emptydir_mode is "block-encrypted", host emptyDir paths must
remain as "bind" mounts so the EncryptedEmptyDirVolume handler can
intercept them in the volume dispatch chain.  Previously,
update_ephemeral_storage_type() would unconditionally convert them
to "local" type, causing them to be handled as plain local volumes
instead.

Add the emptydir_mode parameter to update_ephemeral_storage_type()
and its call chain (amend_spec in container.rs) and skip the
host-emptyDir-to-local conversion when the mode is block-encrypted.
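
The skip described above can be sketched roughly as follows. This is an illustration, not the runtime-rs code: the `Mount` struct, the `is_host_empty_dir` helper, the exact constant name, and the path fragment used to recognise a host emptyDir are all assumptions; only the function name, the `emptydir_mode` parameter, and the "bind"/"local" types come from the commit message.

```rust
/// Mode string from the commit message; the constant name is an assumption.
pub const EMPTYDIR_MODE_BLOCK_ENCRYPTED: &str = "block-encrypted";

/// Hypothetical, simplified stand-in for an OCI mount entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Mount {
    pub source: String,
    pub mount_type: String,
}

/// Assumption: a host emptyDir is recognised by the kubelet path fragment.
fn is_host_empty_dir(source: &str) -> bool {
    source.contains("/kubernetes.io~empty-dir/")
}

/// Convert host emptyDir bind mounts to "local", unless the mode is
/// block-encrypted, in which case they stay "bind" so a later handler
/// can intercept them.
pub fn update_ephemeral_storage_type(mounts: &mut [Mount], emptydir_mode: &str) {
    for m in mounts.iter_mut() {
        if m.mount_type != "bind" || !is_host_empty_dir(&m.source) {
            continue;
        }
        if emptydir_mode == EMPTYDIR_MODE_BLOCK_ENCRYPTED {
            // Leave the mount untouched for the encrypted emptyDir handler.
            continue;
        }
        m.mount_type = "local".to_string();
    }
}
```

Passing the mode down as a plain parameter keeps the conversion function free of any global configuration lookup, which matches the commit's description of threading it through the call chain.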

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 22:56:11 +02:00
Fabiano Fidêncio
d3a9669be5 runtime-rs: implement EncryptedEmptyDirVolume
Add the core volume handler for block-encrypted emptyDir support
in runtime-rs, bringing it to parity with the Go runtime (PR #10559).

When emptydir_mode is set to "block-encrypted", host emptyDir bind
mounts are intercepted and handled as follows:

  1. A sparse disk image (disk.img) is created inside the emptyDir
     folder, sized to match the host filesystem capacity.
  2. A mountInfo.json is written under the kata direct-volume root
     with volume_type "blk", fs_type "ext4", and metadata
     encryptionKey=ephemeral.
  3. The disk image is plugged into the guest VM as a virtio-blk
     device via the hypervisor device manager.
  4. An agent::Storage is built with driver_options containing
     encryption_key=ephemeral and shared=true, so the kata-agent
     delegates formatting and encryption to CDH using LUKS2.
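
Steps 1 and 4 above could look roughly like the sketch below. It is not the EncryptedEmptyDirVolume implementation: the `Storage` struct is a trimmed-down hypothetical stand-in for agent::Storage, the function names are invented for illustration, the `driver` value is a placeholder, and treating `shared` as a struct field rather than a driver option is one reading of the commit message. Steps 2 and 3 (mountInfo.json and the virtio-blk hotplug) are omitted.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, trimmed-down stand-in for the agent storage description.
#[derive(Debug, PartialEq, Eq)]
pub struct Storage {
    pub driver: String,
    pub driver_options: Vec<String>,
    pub fs_type: String,
    pub shared: bool,
}

/// Step 1 (sketch): create `disk.img` inside the emptyDir and extend it to
/// `size_bytes` without writing data. On filesystems that support holes the
/// result is typically a sparse file.
pub fn create_sparse_disk(empty_dir: &Path, size_bytes: u64) -> io::Result<PathBuf> {
    let image = empty_dir.join("disk.img");
    let file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(&image)?;
    file.set_len(size_bytes)?;
    Ok(image)
}

/// Step 4 (sketch): describe the volume to the agent. The "blk" driver value
/// is a placeholder assumption; the option string and fs type come from the
/// commit message.
pub fn encrypted_emptydir_storage() -> Storage {
    Storage {
        driver: "blk".to_string(),
        driver_options: vec!["encryption_key=ephemeral".to_string()],
        fs_type: "ext4".to_string(),
        shared: true,
    }
}
```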

The volume is registered in the dispatch chain before the regular
block-volume check, and ephemeral disk metadata is tracked for
sandbox-level cleanup at teardown.

Also re-exports EMPTYDIR_MODE_* constants from kata-types::config
so downstream crates can reference them.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 22:56:11 +02:00
Fabiano Fidêncio
0b1e103886 runtime-rs: agent: add shared field to Storage struct
The proto Storage message already has a "shared" field (field 8),
but the runtime-rs agent crate's internal Storage struct was
missing it, so it was never forwarded to the kata-agent.

Add the field to the Rust struct and its From<Storage> translation,
and update all explicit struct initialisers across the resource
crate to include shared: false so the build stays clean.
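
As a rough illustration of the change, the sketch below forwards a `shared` flag through a `From` conversion. Both structs are hypothetical, cut-down stand-ins (the real Storage types carry more fields), and `ProtoStorage` is an invented name for the generated protobuf message.

```rust
/// Hypothetical stand-in for the runtime-side Storage struct.
#[derive(Debug, Clone, Default)]
pub struct Storage {
    pub driver: String,
    pub source: String,
    pub mount_point: String,
    /// The newly added field; existing initialisers would set it to false.
    pub shared: bool,
}

/// Hypothetical stand-in for the generated protobuf message.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ProtoStorage {
    pub driver: String,
    pub source: String,
    pub mount_point: String,
    pub shared: bool,
}

impl From<Storage> for ProtoStorage {
    fn from(s: Storage) -> Self {
        ProtoStorage {
            driver: s.driver,
            source: s.source,
            mount_point: s.mount_point,
            // Without this line the flag would silently stay at its default.
            shared: s.shared,
        }
    }
}
```

Because Rust requires every field in an explicit struct initialiser, adding the field forces each construction site to state `shared` explicitly, which is why the commit touches initialisers across the resource crate.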

This is needed for trusted ephemeral data storage, where the
agent uses the shared flag to avoid premature cleanup of volumes
that are shared across containers in a pod.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 15:42:20 +02:00
Fabiano Fidêncio
00d4ee2344 kata-types: add direct-volume write/remove helpers
Add add_volume_mount_info(), is_volume_mounted(), and
remove_volume_path() to the mount module. These mirror the Go
helpers (AddMountInfo, IsVolumeMounted, Remove) in
src/runtime/pkg/direct-volume/utils.go and are needed by the
upcoming EncryptedEmptyDirVolume to write and clean up
mountInfo.json metadata for block-encrypted emptyDir volumes.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 15:42:20 +02:00
Fabiano Fidêncio
b4a9d3256b kata-types: add emptydir_mode configuration option
Add the emptydir_mode field to the Runtime configuration struct,
allowing runtime-rs to read the emptyDir handling mode from the
TOML config file. This is groundwork for trusted ephemeral data
storage support in runtime-rs (parity with the Go runtime).

Two modes are supported:
  - shared-fs (default): share emptyDir via virtio-fs/9p.
  - block-encrypted: plug a block device encrypted in-guest via
    CDH/LUKS2.

Empty values default to "shared-fs"; unknown values are rejected
during validation.
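
The defaulting and validation rule can be sketched as follows. This is a minimal illustration rather than the kata-types code: the `EmptyDirMode` enum and `parse_emptydir_mode` function are hypothetical names, and only the two mode strings and the default come from the commit message.

```rust
/// Hypothetical representation of the two emptyDir handling modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmptyDirMode {
    SharedFs,
    BlockEncrypted,
}

/// Hypothetical parser: an empty value falls back to shared-fs and any
/// unrecognised value is rejected.
pub fn parse_emptydir_mode(value: &str) -> Result<EmptyDirMode, String> {
    match value {
        "" | "shared-fs" => Ok(EmptyDirMode::SharedFs),
        "block-encrypted" => Ok(EmptyDirMode::BlockEncrypted),
        other => Err(format!("invalid emptydir_mode: {other:?}")),
    }
}
```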

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-14 11:29:40 +02:00
Fabiano Fidêncio
c8f6f17269 Merge pull request #13027 from PiotrProkop/fix-loop-blockfile-sandbox-cgroup
runtime: allow loopback devices when sandbox_cgroup_only is enabled
2026-05-14 11:18:45 +02:00
Fabiano Fidêncio
44b356c654 Merge pull request #13033 from microsoft/saul/static_maxvcpus
runtime-rs: static resources: always set maxvcpus equal to vcpus
2026-05-14 11:16:35 +02:00
Fabiano Fidêncio
9af625d3f1 Merge pull request #13006 from manuelh-dev/mahuber/cdh-upgrade
rootfs: cdh: Update CDH to new version
2026-05-14 10:09:18 +02:00
Saul Paredes
d930fc42b8 runtime-rs: static resources: always set maxvcpus equal to vcpus
Based on the current runtime-go behaviour introduced in https://github.com/kata-containers/kata-containers/pull/9195

When using static resources, always set maxvcpus value equal to the vcpus value.
This is because the static resources case does not support dynamic CPU hotplugging,
and therefore the maximum number of vCPUs should be limited to the number of vCPUs.
Booting with a high number of max vCPUs is a bit slower compared to a lower number.
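
A minimal sketch of the rule, assuming a hypothetical `CpuInfo` struct and `apply_static_resources` function (neither name is taken from runtime-rs):

```rust
/// Hypothetical, trimmed-down CPU settings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CpuInfo {
    pub default_vcpus: u32,
    pub default_maxvcpus: u32,
}

/// With static resources there is no CPU hotplug, so the maximum is
/// clamped to the boot-time vCPU count. Otherwise it is left untouched.
pub fn apply_static_resources(cpu: &mut CpuInfo, static_resources: bool) {
    if static_resources {
        cpu.default_maxvcpus = cpu.default_vcpus;
    }
}
```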

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-05-13 13:21:56 -07:00
Manuel Huber
ed4233bf91 rootfs: cdh: Update CDH to new version
Update CDH to a newer version and:
- adjust the NVIDIA root filesystem build to reflect the change from
  using libcryptsetup to using the cryptsetup binary.
- adjust image-pull test cases to conduct parallel write operations
  on the /dev/trusted_store backed guest image pull location since
  issue #12721 has been solved on CDH side.

Fixes #12721

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-13 20:20:45 +02:00
Wainer Moschetta
54674d4a90 Merge pull request #12797 from ldoktor/ci-docs
ci.ocp: Remove workaround and update docs
2026-05-13 14:52:27 -03:00
Lukáš Doktor
5322c5d228 ci.ocp: Remove workaround which force-skipped nydus
Commit f27def1a5b resolved the setup issue, so
we can start using the defaults again.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-05-13 12:58:53 -03:00
Lukáš Doktor
7a4a2cbab5 ci.ocp: Update links to pipeline results
We expanded the test matrix; update the links in the docs to make
the results easier to find.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-05-13 12:58:53 -03:00
Steve Horsman
6d5d2a1c20 Merge pull request #13037 from pavithiran34/pavi_fix_CVE-2026-7246
fix: add click 8.3.3 to docs requirements
2026-05-13 15:27:30 +01:00
Fabiano Fidêncio
edf5a968d9 Merge pull request #13034 from Amulyam24/static-check-runners
gha: move static checks to self hosted runners for ppc64le
2026-05-13 15:49:37 +02:00
Amulyam24
631dc72ceb gha: move static checks to self hosted runners for ppc64le
Move build checks to self hosted runners for Power.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2026-05-13 11:07:52 +01:00
pavithiran34
83ea8e0915 fix: add click 8.3.3 to docs requirements
- Added click==8.3.3 to docs/requirements.txt
- Click 8.3.3 is the latest version for Python >=3.10
- Required for mkdocs toolchain compatibility and resolves vulnerability in indirect dependencies
- Ref: CVE-2026-7246

Signed-off-by: pavithiran34 <pavithiran.p@ibm.com>
2026-05-13 10:11:58 +01:00
Greg Kurz
d2dc0a923c Merge pull request #13030 from stevenhorsman/go-1.25.10-bump
Go 1.25.10 bump
2026-05-13 08:09:51 +02:00
Aurélien Bombo
dcafae9645 Merge pull request #13032 from kata-containers/sprt/fix-virtiofsd-args
runtime-rs: align virtiofsd args on runtime-go
2026-05-12 19:55:54 -05:00
Dan Mihai
3799473041 Merge pull request #13010 from microsoft/danmihai1/label-references
genpolicy: support env variable values sourced from metadata.labels values
2026-05-12 15:41:11 -07:00
Aurélien Bombo
555b7738fe runtime-rs: align virtiofsd args on runtime-go
Runtime-go doesn't hardcode --sandbox none --seccomp none [1],
so mirror that in runtime-rs.

 [1]: 733ccb3254/src/runtime/virtcontainers/virtiofsd.go (L183)

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-05-12 12:51:32 -05:00
Greg Kurz
733ccb3254 Merge pull request #12996 from stevenhorsman/swap-agent-ctl-to-skopeo&umoci
agent-ctl: Swap rootfs bundle pull implementation
2026-05-12 19:12:27 +02:00
Zvonko Kaiser
7d25934fef Merge pull request #13019 from fidencio/topic/nvidia-rootfs-use-erofs-instead-of-ext4
nvidia: rootfs: (try to) use erofs for the image instead of ext4
2026-05-12 18:54:21 +02:00
PiotrProkop
5065058d4a runtime: fix device allowlist detection comparing pointers
Because intptr() returns a fresh pointer on every call, the existing checks compared addresses,
never values, so every check evaluated to false.
As a result, /dev/null, /dev/urandom, /dev/ptmx, /dev/loop-control and /dev/loop*
were appended to the sandbox_cgroup device allowlist
even when the runtime spec already listed them, producing duplicate entries.

Switch to nil-safe value comparisons via a type switch on the cgroup device type
and dereferenced *d.Major / *d.Minor,
keeping the same detection semantics but actually matching existing entries.
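
The fix itself is in Go and is not reproduced here. The Rust analogy below only illustrates the class of bug: comparing the address of a freshly allocated value never matches, while a nil-safe value comparison does. The function names are hypothetical, and `intptr` here is a Rust stand-in, not the Go helper.

```rust
/// Rust stand-in for a helper that returns a fresh allocation per call.
fn intptr(v: i64) -> Box<i64> {
    Box::new(v)
}

/// Buggy pattern: compares addresses. Two live allocations never share an
/// address, so this is always false.
pub fn same_by_address(existing: &Option<Box<i64>>, want: i64) -> bool {
    let fresh = intptr(want);
    match existing {
        Some(p) => std::ptr::eq(&**p, &*fresh),
        None => false,
    }
}

/// Fixed pattern: handle the "no value" case, then compare the numbers.
pub fn same_by_value(existing: &Option<Box<i64>>, want: i64) -> bool {
    matches!(existing, Some(p) if **p == want)
}
```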

Assisted-By: Claude 4.7
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-05-12 18:52:53 +02:00
PiotrProkop
5cd187619e runtime: allow loopback devices for sandbox cgroup only
When sandbox_cgroup_only is enabled, the kata shim threads inherit
the sandbox device cgroup. For container rootfs whose mount source
is a regular file backed by a loop device (notably the blockfile
snapshotter), containerd's mount package opens /dev/loop-control to
allocate a free /dev/loopN and then opens that block node to attach
the backing file. Neither device is on the sandbox cgroup allowlist,
so both opens fail with EPERM.

This change adds /dev/loop-control (char 10:237) and the /dev/loopN
block nodes (block major 7, any minor) to the sandbox device cgroup
allowlist when sandbox_cgroup_only is true, mirroring the existing
treatment of /dev/null, /dev/urandom and /dev/ptmx. The additions
are gated on SandboxCgroupOnly because that is the only mode in
which the shim itself is constrained by this cgroup.
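
The gating can be pictured with the sketch below. It is written in Rust purely for illustration (the actual change is in the Go runtime), and `DeviceRule`, `loop_device_rules` and `extend_allowlist` are hypothetical names. The device numbers are the ones quoted in the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceKind {
    Char,
    Block,
}

/// Hypothetical allowlist entry; `minor: None` stands for "any minor".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeviceRule {
    pub kind: DeviceKind,
    pub major: i64,
    pub minor: Option<i64>,
}

/// /dev/loop-control (char 10:237) and /dev/loopN (block major 7, any minor).
pub fn loop_device_rules() -> Vec<DeviceRule> {
    vec![
        DeviceRule { kind: DeviceKind::Char, major: 10, minor: Some(237) },
        DeviceRule { kind: DeviceKind::Block, major: 7, minor: None },
    ]
}

/// Add the loop rules only when the shim itself is confined by the sandbox
/// cgroup, and skip rules that are already present (compared by value).
pub fn extend_allowlist(allowlist: &mut Vec<DeviceRule>, sandbox_cgroup_only: bool) {
    if !sandbox_cgroup_only {
        return;
    }
    for rule in loop_device_rules() {
        if !allowlist.contains(&rule) {
            allowlist.push(rule);
        }
    }
}
```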

Assisted-By: Claude 4.7
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-05-12 18:48:58 +02:00
stevenhorsman
7cc72b933d versions: bump golang.org/x/net to v0.53.0
Bump golang.org/x/net to resolve CVE:
- GO-2026-4918

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:26 +01:00
stevenhorsman
4a65aca9cf versions: bump golang to 1.25.10
Bump the go version to resolve CVEs:
- GO-2026-4918
- GO-2026-4971
- GO-2026-4976
- GO-2026-4977
- GO-2026-4980
- GO-2026-4981
- GO-2026-4982
- GO-2026-4986

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:13 +01:00
Steve Horsman
2b329074f1 Merge pull request #13023 from manuelh-dev/mahuber/nim-journal-fix
tests: nvidia: avoid NIM journal dumps on success
2026-05-12 09:32:07 +01:00
Fabiano Fidêncio
ea5755572c Merge pull request #13026 from stevenhorsman/fix-fixed-datae-stale-issue
ci: correct environment variable syntax in stale issues workflow
2026-05-11 20:57:41 +02:00
stevenhorsman
37e7bf0773 ci: correct environment variable syntax in stale issues workflow
The stale issues workflow was using shell syntax ${AGE} instead of
GitHub Actions syntax ${{ env.AGE }} for the days-before-issue-stale
parameter. This prevented the workflow from correctly reading the
calculated AGE value.

Also added days-before-stale: -1 to disable default stale behavior
and ensure only issue-specific settings apply.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-By: IBM Bob
2026-05-11 09:31:36 +01:00
Manuel Huber
c265e4905f tests: nvidia: avoid NIM journal dumps on success
BATS_TEST_COMPLETED is per-test and remains empty in teardown_file.
Track file-level state so successful NIM runs skip the journal dump
while setup or test failures still include node diagnostics.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-10 09:10:01 -07:00
Fabiano Fidêncio
93e02944fa image-builder/nvidia: skip DAX header for virtio-blk-pci images
The DAX header (2 MiB of NVDIMM metadata + a duplicate MBR) is
unconditionally prepended to every image by set_dax_header(). NVIDIA
images use virtio-blk-pci with disable_image_nvdimm=true, so the
kernel reads MBR #1 directly and never touches the DAX metadata --
it is dead weight.

Add a SKIP_DAX_HEADER environment variable (default "no") that, when
set to "yes", skips the DAX header entirely:
- Removes the 2 MiB DAX overhead from image size calculations in
  both the erofs and ext4 paths
- Skips the set_dax_header() call, avoiding compilation and
  execution of the nsdax tool
- Passes the variable through to containerised builds

Enable SKIP_DAX_HEADER=yes for both install_image_nvidia_gpu() and
install_image_nvidia_gpu_confidential() in the build pipeline. All
other image builds are unaffected (default remains "no").

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
b72bb7243e image-builder: bump base image from Fedora 42 to 44
Fedora 42 reaches end-of-life in May 2026. Move the image-builder
container to Fedora 44, which is the current stable release.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
6b802a4e30 nvidia: switch GPU rootfs images to erofs
Switch the NVIDIA GPU rootfs images (both standard and confidential)
from ext4 to erofs (Enhanced Read-Only File System).

Unlike ext4, which is a read-write filesystem mounted read-only by
convention, erofs is structurally read-only -- no journal, no write
metadata, no superblock write path. This eliminates accidental
mutation and reduces the attack surface inside the guest VM, which
is particularly important for confidential workloads using dm-verity.

Introduce a DEFROOTFSTYPE_NV Makefile variable (set to erofs) for
both Go and Rust runtimes, keeping the global DEFROOTFSTYPE as ext4
so non-NVIDIA configurations are unaffected.

Update all six NVIDIA GPU configuration templates (base, SNP, TDX
for both runtimes) to use @DEFROOTFSTYPE_NV@ instead of the global
@DEFROOTFSTYPE@.

Export FS_TYPE=erofs in install_image_nvidia_gpu() and
install_image_nvidia_gpu_confidential() so the build pipeline
produces erofs images via the image builder.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
bfcd249f40 image-builder: add erofs dm-verity support and lz4hc compression
Add full dm-verity and measured rootfs support to
create_erofs_rootfs_image(), bringing it to parity with the ext4 path.

Unlike ext4, which is a read-write filesystem mounted read-only by
convention, erofs is structurally read-only -- no journal, no write
metadata, no superblock write path.

This is a natural fit for dm-verity: erofs never attempts writes, so
verity never has to reject anything. With ext4, the kernel must skip
journal replay on verity-protected devices, which is a fragile
assumption.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
d2e0555cf0 image-builder: refactor dm-verity setup into shared functions
Extract build_kernel_verity_params() and setup_verity() from the
inline block inside create_rootfs_image() into top-level functions.

This is a pure refactoring with no behavior change. The verity logic
is moved verbatim, with the only difference being that
build_kernel_verity_params() now takes the image path as an explicit
parameter instead of capturing it from the enclosing scope.

The extracted functions will be reused by create_erofs_rootfs_image()
in a subsequent commit to add dm-verity support for erofs images.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
manuelh-dev
2ffd1538a2 Merge pull request #13021 from fidencio/topic/kata-deploy-log-level-containerd-version-4
kata-deploy: Fix containerd debug level path for config schema v4
2026-05-10 07:28:26 -07:00
Fabiano Fidêncio
341a0d366c kata-deploy: Fix containerd debug level path for config schema v4
Containerd 2.3 (config schema v4) uses the top-level [debug] table
for log level configuration, not plugins."io.containerd.server.v1.debug"
as was the case in the RC builds.

Update containerd_debug_level_toml_path() to use .debug.level for all
schema versions, matching the released containerd behavior.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 12:02:24 +02:00
Fabiano Fidêncio
46b46589a6 Merge pull request #13020 from manuelh-dev/mahuber/nim-op-placement
tests: nvidia: place NIM service into namespace
2026-05-10 12:01:58 +02:00
Manuel Huber
1c081ff434 tests: nvidia: place NIM service into namespace
Place the NIM service into our test namespace. We are still observing
situations in our CI where, for some reason, the NIM service appears
in the default namespace.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-10 07:36:23 +00:00
Fabiano Fidêncio
905303b6b0 Merge pull request #13013 from BbolroC/filter-vfio-gk-only-runtime-rs
runtime-rs: filter VFIO devices only in guest-kernel mode
2026-05-08 23:49:50 +02:00
Fabiano Fidêncio
a447a1fb03 Merge pull request #13015 from stevenhorsman/kernel-6.18.28-bump
version: Bump to latest 6.18 kernel
2026-05-08 21:12:50 +02:00
Fabiano Fidêncio
f7be57efe2 Merge pull request #13007 from manuelh-dev/mahuber/dbg-nim-svc
tests: nvidia: Wait for NIM operator pod and print
2026-05-08 20:58:51 +02:00
stevenhorsman
87664c608d version: Bump to latest 6.18 kernel
Pick up the latest kernel that fixes CVE-2026-43284

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-05-08 17:15:24 +01:00
Hyounggyu Choi
754707fe83 runtime-rs: filter VFIO devices only in guest-kernel mode
After #12857, the VFIO-AP hotplug test fails because runtime-rs
unconditionally removes all /dev/vfio/* devices from the OCI spec
before sending it to the kata agent. The agent then rejects
the container creation with:

```
Missing devices in OCI spec
```

Filter devices from the OCI spec conditionally based on the
vfio_mode configuration (e.g. guest-kernel). Also factor the
filtering logic out into a separate function and add unit tests.
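
A minimal sketch of such a conditional filter, assuming a hypothetical `LinuxDevice` struct and `filter_vfio_devices` function (the names and signature in runtime-rs may differ):

```rust
/// Hypothetical, trimmed-down OCI device entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LinuxDevice {
    pub path: String,
}

/// Drop /dev/vfio/* entries only in guest-kernel mode; in any other mode the
/// device list is passed through unchanged so the agent still sees them.
pub fn filter_vfio_devices(devices: Vec<LinuxDevice>, vfio_mode: &str) -> Vec<LinuxDevice> {
    if vfio_mode != "guest-kernel" {
        return devices;
    }
    devices
        .into_iter()
        .filter(|d| !d.path.starts_with("/dev/vfio/"))
        .collect()
}
```

Keeping the filter in a standalone function that takes the mode as an argument makes it straightforward to unit test both branches without building a full OCI spec.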

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-05-08 15:39:16 +02:00
Fabiano Fidêncio
8e65e89ade Merge pull request #13011 from kata-containers/fix-warnings
runtime-rs: Fix warnings in rust runtime
2026-05-08 15:12:53 +02:00
Fabiano Fidêncio
a541827a7e Merge pull request #12984 from fidencio/topic/network-pair-use-name-for-lookup
runtime-rs: network: use provided name for virt interface lookup
2026-05-08 14:31:58 +02:00