18993 Commits

Author SHA1 Message Date
Hyounggyu Choi
540986bc8f test: skip CDH resource test for qemu-se without reference values
Since guest-components and Trustee were bumped (#13046), the test
"Cannot get CDH resource when affirming policy is set without reference values"
has started failing for IBM SEL.

The attestation policy for IBM SEL returns an "affirming"
result whenever the claim can be parsed successfully,
meaning the evidence verification succeeds. As a result,
the negative test above always produces a positive result.

Skip this negative test for IBM SEL environments
(e.g. qemu-se*).

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-05-18 08:40:16 +02:00
manuelh-dev
48671ad525 Merge pull request #13046 from fitzthum/bump-coco-0210
Bump guest components and Trustee for CoCo v0.21.0
2026-05-14 14:59:50 -07:00
Tobin Feldman-Fitzthum
1cfed3c20b release: update guest-components for release
Pick up the latest version of guest-components.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-05-14 09:40:06 -07:00
Tobin Feldman-Fitzthum
79ea56f24e versions: bump Trustee version for release
Pick up the latest versions of Trustee.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-05-14 09:37:06 -07:00
Steve Horsman
557fb5187b Merge pull request #12853 from kata-containers/dependabot/go_modules/src/runtime/github.com/sirupsen/logrus-1.9.4
build(deps): bump github.com/sirupsen/logrus from 1.9.3 to 1.9.4 in /src/runtime
2026-05-14 13:56:10 +01:00
Steve Horsman
aade0f5fbe Merge pull request #12854 from kata-containers/dependabot/go_modules/tools/testing/kata-webhook/github.com/sirupsen/logrus-1.9.4
build(deps): bump github.com/sirupsen/logrus from 1.9.3 to 1.9.4 in /tools/testing/kata-webhook
2026-05-14 13:55:44 +01:00
Fabiano Fidêncio
c8f6f17269 Merge pull request #13027 from PiotrProkop/fix-loop-blockfile-sandbox-cgroup
runtime: allow loopback devices when sandbox_cgroup_only is enabled
2026-05-14 11:18:45 +02:00
Fabiano Fidêncio
44b356c654 Merge pull request #13033 from microsoft/saul/static_maxvcpus
runtime-rs: static resources: always set maxvcpus equal to vcpus
2026-05-14 11:16:35 +02:00
Fabiano Fidêncio
9af625d3f1 Merge pull request #13006 from manuelh-dev/mahuber/cdh-upgrade
rootfs: cdh: Update CDH to new version
2026-05-14 10:09:18 +02:00
Saul Paredes
d930fc42b8 runtime-rs: static resources: always set maxvcpus equal to vcpus
Based on the current runtime-go behaviour introduced in https://github.com/kata-containers/kata-containers/pull/9195

When using static resources, always set maxvcpus value equal to the vcpus value.
This is because the static resources case does not support dynamic CPU hotplugging,
and therefore the maximum number of vCPUs should be limited to the number of vCPUs.
Booting with a high number of max vCPUs is a bit slower compared to a lower number.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-05-13 13:21:56 -07:00
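
The rule described in d930fc42b8 can be illustrated with a minimal Go sketch. The actual change lives in runtime-rs; the function name `effectiveMaxVCPUs` and its parameters are hypothetical and only mirror the behaviour stated in the commit message.

```go
package main

import "fmt"

// effectiveMaxVCPUs is a hypothetical helper, not the runtime's API.
// With static resources there is no CPU hotplug, so the maximum is
// clamped to the boot-time vCPU count; otherwise the configured
// maximum is kept.
func effectiveMaxVCPUs(staticResources bool, vcpus, configuredMax uint32) uint32 {
	if staticResources {
		return vcpus
	}
	return configuredMax
}

func main() {
	fmt.Println(effectiveMaxVCPUs(true, 4, 240))  // 4
	fmt.Println(effectiveMaxVCPUs(false, 4, 240)) // 240
}
```

The values 4 and 240 are arbitrary example inputs.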
Manuel Huber
ed4233bf91 rootfs: cdh: Update CDH to new version
Update CDH to a newer version and:
- adjust the NVIDIA root filesystem build to reflect the change from
  using libcryptsetup to using the cryptsetup binary.
- adjust image-pull test cases to conduct parallel write operations
  on the /dev/trusted_store backed guest image pull location since
  issue #12721 has been solved on CDH side.

Fixes #12721

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-13 20:20:45 +02:00
Wainer Moschetta
54674d4a90 Merge pull request #12797 from ldoktor/ci-docs
ci.ocp: Remove workaround and update docs
2026-05-13 14:52:27 -03:00
Lukáš Doktor
5322c5d228 ci.ocp: Remove workaround which force-skipped nydus
Commit f27def1a5b resolved the setup issue,
so we can start using the defaults again.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-05-13 12:58:53 -03:00
Lukáš Doktor
7a4a2cbab5 ci.ocp: Update links to pipeline results
We expanded the test matrix; update the links in the docs to make
the results easier to find.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-05-13 12:58:53 -03:00
Steve Horsman
6d5d2a1c20 Merge pull request #13037 from pavithiran34/pavi_fix_CVE-2026-7246
fix: add click 8.3.3 to docs requirements
2026-05-13 15:27:30 +01:00
Fabiano Fidêncio
edf5a968d9 Merge pull request #13034 from Amulyam24/static-check-runners
gha: move static checks to self hosted runners for ppc64le
2026-05-13 15:49:37 +02:00
Amulyam24
631dc72ceb gha: move static checks to self hosted runners for ppc64le
Move build checks to self hosted runners for Power.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2026-05-13 11:07:52 +01:00
pavithiran34
83ea8e0915 fix: add click 8.3.3 to docs requirements
- Added click==8.3.3 to docs/requirements.txt
- Click 8.3.3 is the latest version for Python >=3.10
- Required for mkdocs toolchain compatibility and resolves vulnerability in indirect dependencies
- Ref: CVE-2026-7246

Signed-off-by: pavithiran34 <pavithiran.p@ibm.com>
2026-05-13 10:11:58 +01:00
dependabot[bot]
408e15641c build(deps): bump github.com/sirupsen/logrus in /src/runtime
Bumps [github.com/sirupsen/logrus](https://github.com/sirupsen/logrus) from 1.9.3 to 1.9.4.
- [Release notes](https://github.com/sirupsen/logrus/releases)
- [Changelog](https://github.com/sirupsen/logrus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sirupsen/logrus/compare/v1.9.3...v1.9.4)

---
updated-dependencies:
- dependency-name: github.com/sirupsen/logrus
  dependency-version: 1.9.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-13 06:11:21 +00:00
dependabot[bot]
18a13773da build(deps): bump github.com/sirupsen/logrus
Bumps [github.com/sirupsen/logrus](https://github.com/sirupsen/logrus) from 1.9.3 to 1.9.4.
- [Release notes](https://github.com/sirupsen/logrus/releases)
- [Changelog](https://github.com/sirupsen/logrus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sirupsen/logrus/compare/v1.9.3...v1.9.4)

---
updated-dependencies:
- dependency-name: github.com/sirupsen/logrus
  dependency-version: 1.9.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-13 06:11:16 +00:00
Greg Kurz
d2dc0a923c Merge pull request #13030 from stevenhorsman/go-1.25.10-bump
Go 1.25.10 bump
2026-05-13 08:09:51 +02:00
Aurélien Bombo
dcafae9645 Merge pull request #13032 from kata-containers/sprt/fix-virtiofsd-args
runtime-rs: align virtiofsd args on runtime-go
2026-05-12 19:55:54 -05:00
Dan Mihai
3799473041 Merge pull request #13010 from microsoft/danmihai1/label-references
genpolicy: support env variable values sourced from metadata.labels values
2026-05-12 15:41:11 -07:00
Aurélien Bombo
555b7738fe runtime-rs: align virtiofsd args on runtime-go
Runtime-go doesn't hardcode --sandbox none --seccomp none [1],
so mirror that in runtime-rs.

 [1]: 733ccb3254/src/runtime/virtcontainers/virtiofsd.go (L183)

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-05-12 12:51:32 -05:00
Greg Kurz
733ccb3254 Merge pull request #12996 from stevenhorsman/swap-agent-ctl-to-skopeo&umoci
agent-ctl: Swap rootfs bundle pull implementation
2026-05-12 19:12:27 +02:00
Zvonko Kaiser
7d25934fef Merge pull request #13019 from fidencio/topic/nvidia-rootfs-use-erofs-instead-of-ext4
nvidia: rootfs: (try to) use erofs for the image instead of ext4
2026-05-12 18:54:21 +02:00
PiotrProkop
5065058d4a runtime: fix device allowlist detection comparing pointers
Because intptr() returns a fresh pointer on every call, the allowlist checks compared addresses,
never values, so every check evaluated to false.
As a result, /dev/null, /dev/urandom, /dev/ptmx, /dev/loop-control and /dev/loop*
were appended to the device allowlist for sandbox_cgroup
even when the runtime spec already listed them, producing duplicate entries.

Switch to nil-safe value comparisons via a type switch on the cgroup device type
and dereferenced *d.Major / *d.Minor,
keeping the same detection semantics but actually matching existing entries.

Assisted-By: Claude 4.7
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-05-12 18:52:53 +02:00
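
The pointer-versus-value pitfall described in 5065058d4a can be reproduced in a few lines of Go. This is a simplified sketch, not the runtime's code: the `Device` type and both match functions are hypothetical, and the type switch on the cgroup device type mentioned in the commit is left out.

```go
package main

import "fmt"

// Device is a hypothetical stand-in for a device cgroup entry whose
// Major and Minor are optional and therefore stored as pointers.
type Device struct {
	Type  string
	Major *int64
	Minor *int64
}

// intptr returns a pointer to a fresh copy of v on every call.
func intptr(v int64) *int64 { return &v }

// matchesByPointer shows the bug: it compares addresses, so it is false
// for any device whose Major/Minor were allocated elsewhere.
func matchesByPointer(d Device, major, minor int64) bool {
	return d.Major == intptr(major) && d.Minor == intptr(minor)
}

// matchesByValue is nil-safe and compares the pointed-to numbers.
func matchesByValue(d Device, major, minor int64) bool {
	if d.Major == nil || d.Minor == nil {
		return false
	}
	return *d.Major == major && *d.Minor == minor
}

func main() {
	devNull := Device{Type: "c", Major: intptr(1), Minor: intptr(3)} // /dev/null is char 1:3
	fmt.Println(matchesByPointer(devNull, 1, 3)) // false
	fmt.Println(matchesByValue(devNull, 1, 3))   // true
}
```

Because the pointer comparison never matches, a caller that appends a device "only if missing" would append it every time, which is how duplicate entries arise.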
PiotrProkop
5cd187619e runtime: allow loopback devices for sandbox cgroup only
When sandbox_cgroup_only is enabled, the kata shim threads inherit
the sandbox device cgroup. For container rootfs whose mount source
is a regular file backed by a loop device (notably the blockfile
snapshotter), containerd's mount package opens /dev/loop-control to
allocate a free /dev/loopN and then opens that block node to attach
the backing file. Neither device is on the sandbox cgroup allowlist,
so both opens fail with EPERM.

This change adds /dev/loop-control (char 10:237) and the /dev/loopN
block nodes (block major 7, any minor) to the sandbox device cgroup
allowlist when sandbox_cgroup_only is true, mirroring the existing
treatment of /dev/null, /dev/urandom and /dev/ptmx. The additions
are gated on SandboxCgroupOnly because that is the only mode in
which the shim itself is constrained by this cgroup.

Assisted-By: Claude 4.7
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-05-12 18:48:58 +02:00
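
The allowlist additions in 5cd187619e could be sketched roughly as follows. The device numbers come from the commit message; the `deviceRule` type, the `anyMinor` wildcard convention and the function names are assumptions made for this illustration and are not the runtime's real types.

```go
package main

import "fmt"

// anyMinor is this sketch's wildcard marker for "any minor number".
const anyMinor = -1

// deviceRule is a hypothetical, simplified allow rule.
type deviceRule struct {
	Type  string // "c" for character, "b" for block devices
	Major int64
	Minor int64
}

// loopDeviceRules returns the extra allow rules, gated on the
// sandbox_cgroup_only setting as the commit describes.
func loopDeviceRules(sandboxCgroupOnly bool) []deviceRule {
	if !sandboxCgroupOnly {
		return nil
	}
	return []deviceRule{
		{Type: "c", Major: 10, Minor: 237},     // /dev/loop-control
		{Type: "b", Major: 7, Minor: anyMinor}, // /dev/loopN
	}
}

// allowed reports whether any rule permits the given device node.
func allowed(rules []deviceRule, typ string, major, minor int64) bool {
	for _, r := range rules {
		if r.Type == typ && r.Major == major && (r.Minor == anyMinor || r.Minor == minor) {
			return true
		}
	}
	return false
}

func main() {
	rules := loopDeviceRules(true)
	fmt.Println(allowed(rules, "c", 10, 237))               // true
	fmt.Println(allowed(rules, "b", 7, 5))                  // true
	fmt.Println(allowed(loopDeviceRules(false), "b", 7, 5)) // false
}
```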
stevenhorsman
7cc72b933d versions: bump golang.org/x/net to v0.53.0
Bump golang.org/x/net to resolve CVE:
- GO-2026-4918

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:26 +01:00
stevenhorsman
4a65aca9cf versions: bump golang to 1.25.10
Bump the go version to resolve CVEs:
- GO-2026-4918
- GO-2026-4971
- GO-2026-4976
- GO-2026-4977
- GO-2026-4980
- GO-2026-4981
- GO-2026-4982
- GO-2026-4986

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:13 +01:00
Steve Horsman
2b329074f1 Merge pull request #13023 from manuelh-dev/mahuber/nim-journal-fix
tests: nvidia: avoid NIM journal dumps on success
2026-05-12 09:32:07 +01:00
Fabiano Fidêncio
ea5755572c Merge pull request #13026 from stevenhorsman/fix-fixed-datae-stale-issue
ci: correct environment variable syntax in stale issues workflow
2026-05-11 20:57:41 +02:00
stevenhorsman
37e7bf0773 ci: correct environment variable syntax in stale issues workflow
The stale issues workflow was using shell syntax ${AGE} instead of
GitHub Actions syntax ${{ env.AGE }} for the days-before-issue-stale
parameter. This prevented the workflow from correctly reading the
calculated AGE value.

Also added days-before-stale: -1 to disable default stale behavior
and ensure only issue-specific settings apply.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-By: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-05-11 09:31:36 +01:00
Manuel Huber
c265e4905f tests: nvidia: avoid NIM journal dumps on success
BATS_TEST_COMPLETED is per-test and remains empty in teardown_file.
Track file-level state so successful NIM runs skip the journal dump
while setup or test failures still include node diagnostics.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-10 09:10:01 -07:00
Fabiano Fidêncio
93e02944fa image-builder/nvidia: skip DAX header for virtio-blk-pci images
The DAX header (2 MiB of NVDIMM metadata + a duplicate MBR) is
unconditionally prepended to every image by set_dax_header(). NVIDIA
images use virtio-blk-pci with disable_image_nvdimm=true, so the
kernel reads MBR #1 directly and never touches the DAX metadata --
it is dead weight.

Add a SKIP_DAX_HEADER environment variable (default "no") that, when
set to "yes", skips the DAX header entirely:
- Removes the 2 MiB DAX overhead from image size calculations in
  both the erofs and ext4 paths
- Skips the set_dax_header() call, avoiding compilation and
  execution of the nsdax tool
- Passes the variable through to containerised builds

Enable SKIP_DAX_HEADER=yes for both install_image_nvidia_gpu() and
install_image_nvidia_gpu_confidential() in the build pipeline. All
other image builds are unaffected (default remains "no").

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
b72bb7243e image-builder: bump base image from Fedora 42 to 44
Fedora 42 reaches end-of-life in May 2026. Move the image-builder
container to Fedora 44, which is the current stable release.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
6b802a4e30 nvidia: switch GPU rootfs images to erofs
Switch the NVIDIA GPU rootfs images (both standard and confidential)
from ext4 to erofs (Enhanced Read-Only File System).

Unlike ext4, which is a read-write filesystem mounted read-only by
convention, erofs is structurally read-only -- no journal, no write
metadata, no superblock write path. This eliminates accidental
mutation and reduces the attack surface inside the guest VM, which
is particularly important for confidential workloads using dm-verity.

Introduce a DEFROOTFSTYPE_NV Makefile variable (set to erofs) for
both Go and Rust runtimes, keeping the global DEFROOTFSTYPE as ext4
so non-NVIDIA configurations are unaffected.

Update all six NVIDIA GPU configuration templates (base, SNP, TDX
for both runtimes) to use @DEFROOTFSTYPE_NV@ instead of the global
@DEFROOTFSTYPE@.

Export FS_TYPE=erofs in install_image_nvidia_gpu() and
install_image_nvidia_gpu_confidential() so the build pipeline
produces erofs images via the image builder.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
bfcd249f40 image-builder: add erofs dm-verity support and lz4hc compression
Add full dm-verity and measured rootfs support to
create_erofs_rootfs_image(), bringing it to parity with the ext4 path.

Unlike ext4, which is a read-write filesystem mounted read-only by
convention, erofs is structurally read-only -- no journal, no write
metadata, no superblock write path.

This is a natural fit for dm-verity: erofs never attempts writes, so
verity never has to reject anything. With ext4, the kernel must skip
journal replay on verity-protected devices, which is a fragile
assumption.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
Fabiano Fidêncio
d2e0555cf0 image-builder: refactor dm-verity setup into shared functions
Extract build_kernel_verity_params() and setup_verity() from the
inline block inside create_rootfs_image() into top-level functions.

This is a pure refactoring with no behavior change. The verity logic
is moved verbatim, with the only difference being that
build_kernel_verity_params() now takes the image path as an explicit
parameter instead of capturing it from the enclosing scope.

The extracted functions will be reused by create_erofs_rootfs_image()
in a subsequent commit to add dm-verity support for erofs images.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:18:05 +02:00
manuelh-dev
2ffd1538a2 Merge pull request #13021 from fidencio/topic/kata-deploy-log-level-containerd-version-4
kata-deploy: Fix containerd debug level path for config schema v4
2026-05-10 07:28:26 -07:00
Fabiano Fidêncio
341a0d366c kata-deploy: Fix containerd debug level path for config schema v4
Containerd 2.3 (config schema v4) uses the top-level [debug] table
for log level configuration, not plugins."io.containerd.server.v1.debug"
as was the case in the RC builds.

Update containerd_debug_level_toml_path() to use .debug.level for all
schema versions, matching the released containerd behavior.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 12:02:24 +02:00
Fabiano Fidêncio
46b46589a6 Merge pull request #13020 from manuelh-dev/mahuber/nim-op-placement
tests: nvidia: place NIM service into namespace
2026-05-10 12:01:58 +02:00
Manuel Huber
1c081ff434 tests: nvidia: place NIM service into namespace
Place the NIM service into our test namespace. We are still observing
various situations where, for some reason, the NIM service appears in
the default namespace in our CI.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-10 07:36:23 +00:00
Fabiano Fidêncio
905303b6b0 Merge pull request #13013 from BbolroC/filter-vfio-gk-only-runtime-rs
runtime-rs: filter VFIO devices only in guest-kernel mode
2026-05-08 23:49:50 +02:00
Fabiano Fidêncio
a447a1fb03 Merge pull request #13015 from stevenhorsman/kernel-6.18.28-bump
version: Bump to latest 6.18 kernel
2026-05-08 21:12:50 +02:00
Fabiano Fidêncio
f7be57efe2 Merge pull request #13007 from manuelh-dev/mahuber/dbg-nim-svc
tests: nvidia: Wait for NIM operator pod and print
2026-05-08 20:58:51 +02:00
stevenhorsman
87664c608d version: Bump to latest 6.18 kernel
Pick up the latest kernel that fixes CVE-2026-43284

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-05-08 17:15:24 +01:00
Hyounggyu Choi
754707fe83 runtime-rs: filter VFIO devices only in guest-kernel mode
After #12857, the VFIO-AP hotplug test fails because runtime-rs
unconditionally removes all /dev/vfio/* devices from the OCI spec
before sending it to the kata agent. The agent then rejects
the container creation with:

```
Missing devices in OCI spec
```

Filter devices from the OCI spec conditionally based on the
vfio_mode configuration (e.g. guest-kernel). Also factor the
filtering logic out into a separate function and add unit tests.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-05-08 15:39:16 +02:00
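
The conditional filtering described in 754707fe83 might look like the sketch below. The real change is in runtime-rs; this Go version is only illustrative, and the function name, the use of plain path strings and the non-guest-kernel mode value "vfio" are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// filterVFIODevices is a hypothetical helper that returns the device
// paths to keep in the OCI spec sent to the agent.
func filterVFIODevices(paths []string, vfioMode string) []string {
	if vfioMode != "guest-kernel" {
		// Other modes: leave the spec untouched so the agent still
		// finds the VFIO devices it expects.
		return paths
	}
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		if !strings.HasPrefix(p, "/dev/vfio/") {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	devs := []string{"/dev/null", "/dev/vfio/vfio", "/dev/vfio/12"}
	fmt.Println(filterVFIODevices(devs, "guest-kernel")) // [/dev/null]
	fmt.Println(filterVFIODevices(devs, "vfio"))         // [/dev/null /dev/vfio/vfio /dev/vfio/12]
}
```

Keeping the filter in its own function, as the commit does, makes both branches straightforward to unit test.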
Fabiano Fidêncio
8e65e89ade Merge pull request #13011 from kata-containers/fix-warnings
runtime-rs: Fix warnings in rust runtime
2026-05-08 15:12:53 +02:00
Fabiano Fidêncio
a541827a7e Merge pull request #12984 from fidencio/topic/network-pair-use-name-for-lookup
runtime-rs: network: use provided name for virt interface lookup
2026-05-08 14:31:58 +02:00