Commit Graph

18964 Commits

Author SHA1 Message Date
stevenhorsman
e1d7d5bef8 agent: Remove async-std
It's a dev-dependency that doesn't seem to be used, so
remove it and resolve RUSTSEC-2025-0052

Assisted-By: Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:45:17 +00:00
stevenhorsman
e4eda5e1d8 agent: Bump tracing-subscriber
- Bump tracing-subscriber to 0.3.20 to resolve RUSTSEC-2025-0055
- Switch deprecated `slog_info!` for `slog::info!`

Generated-By: Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:45:17 +00:00
stevenhorsman
e62df07b6a static-checks: Delete kata-spell-check
The old hunspell based spell-check was causing contributors
challenges and proving a barrier to doc updates. We've replaced
it with a cspell based-solution, so clean up the old approach.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
stevenhorsman
44ec815f77 gatekeeper: Add check-spelling to required
Add the new job to the list of static checks that runs on doc updates.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
stevenhorsman
c2cedd7c02 workflows: Add spellcheck workflow
Add a separate spellcheck workflow, so we can replace
the complex hunspell approach embedded in static-checks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
stevenhorsman
d06dadd8ef docs: Spelling updates
Either fixing typos, or including program/repo name in
backticks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
stevenhorsman
829a32ee67 spellcheck: Add cspell files
Add cspell config and initial dictionary

Assisted-by: Bob (dictionary ordering and catergorisation)
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
dependabot[bot]
2f5415d8f5 build(deps): bump google.golang.org/grpc
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.63.2 to 1.79.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.63.2...v1.79.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-19 10:03:45 +00:00
dependabot[bot]
3876a80208 build(deps): bump google.golang.org/grpc in /src/runtime
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.72.0 to 1.79.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.72.0...v1.79.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-19 10:03:30 +00:00
Alex Lyn
de2ddf6ed9 Merge pull request #12637 from stevenhorsman/bump-rust-to-1.92
versions: Bump rust to 1.92
2026-03-19 10:02:18 +08:00
Manuel Huber
5765bc97b4 tests: cc: setup function for python venv
We recently had a failure on a new CI runner where
${HOME}/.cicd/venv/bin/activate was not present. The relevant call
originated from ensure_sev_snp_measure. Thus, add a function
ensure_cicd_python_venv before callers to pip install.
Currently, the NVIDIA NIM test and the confidential attestation
tests use pip to install dependencies.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-18 17:07:47 -07:00
Manuel Huber
62d74bb1fd agent: add mkfs_opts parameter to cdh_secure_mount
Add an mkfs_opts parameter to cdh_secure_mount so that its users
can parametrize these options depending on their needs. For now,
there is two users providing explicit values (container image
layer storage and container data storage features).

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-18 15:37:32 -07:00
Aurélien Bombo
352b4cdad2 Merge pull request #12660 from LandonTClipp/ci_docs
ci: Don't run CI builds on doc PRs
2026-03-17 12:19:11 -05:00
stevenhorsman
56b6917adf versions: Bump rust to 1.92
Now that 1.94 has release, in compliance with our toolchain guidance
we should bump to rust 1.92

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
stevenhorsman
2a4227e02e kata-ctl: Try fixing unused_assignement error
`allow(unused_assignments)` isn't working as it's
in macro generated code, so referencing the command
in the error, to use it

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
stevenhorsman
ca7cdcd732 kata-ctl: Rewrite path_join test
This test was failing clippy by calling .unwrap() after
an .is_ok(), but after I looked at it, it seemed a bit messy,
so I split it up and tried rewriting it to make it more readable
IMHO.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
stevenhorsman
501578cc5a agent: Remove non-idiomatic unwrap
Calling .unwrap() after an .is_some() check is considered non-idiomatic in
as it performs redundant work and makes the code more verbose.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
Alex Lyn
833b72470c Merge pull request #12647 from sprt/gp-improve
genpolicy: Improve emptyDir storage options and mount point validation
2026-03-17 13:56:42 +08:00
Manuel Huber
660e3bb653 gpu: Obsolete the NVIDIA initrd build
As the NVIDIA stack has shifted to using an image for both the
confidential and non-confidential variants, we retire the initrd
build.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
3.28.0
2026-03-16 21:29:58 -04:00
Aurélien Bombo
f8e234c6f9 Merge pull request #12650 from kata-containers/sprt/remove-csi
ci: Stop building/deploying CSI driver
2026-03-16 16:53:02 -05:00
Steve Horsman
294c367063 Merge pull request #12668 from manuelh-dev/release/3.28.0
release: Bump version to 3.28.0
2026-03-16 19:47:12 +00:00
Manuel Huber
5210584f95 release: Bump version to 3.28.0
Bump VERSION and helm-charts versions.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:52:35 -07:00
Manuel Huber
e13748f46d tests: Adapt trusted ephemeral storage test
With the new CDH version, the LUKS header is moved off of the disk
into guest memory. We hence adapt the test's filesystem type checks.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:43:17 -07:00
Manuel Huber
5bbc0abb81 tests: use pre-created, signed sealed secrets
With signature support for sealed secret, use pre-created signed
sealed secrets and provision the signing public key to the KBS.

Add instructions for re-creating these signed secrets.

Improve k8s-sealed-secrets.bats by reducing repeated kubectl logs
calls. A test run showed a SIGPIPE error one one of the grep-logs
while the printouts of the initial kubectl logs invocation showed
that the expected values were actually in the logs.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:43:17 -07:00
Manuel Huber
a9b222f91e gpu: Update chiseled rootfs with new CDH deps
With CDH requiring libcryptsetup, mkfs.ext4, dd, and their
dependencies, we will need to update the chiseled NVIDIA rootfs
accordingly.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:43:17 -07:00
Manuel Huber
169f92ff09 agent: cdh: Update CDH and API
With the new CDH version, the secure_mount API changes.
Further, the new CDH version no longer uses the luks-encrypt-storage
script but utilizes libcryptsetup as well as mkfs.ext4 and dd. Hence, adapt
some of the CDH and Kata components build steps

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:43:17 -07:00
Fupan Li
eabb98ecab dragonball: fix the issue of type miss match with api change
For aarch64, it should match the api for get_one_reg and
set_one_reg.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-16 14:41:53 +08:00
Alex Lyn
ef5db0a01f Merge pull request #12607 from zvonkok/system-map
kernel: Ship System.map as part of the kernel build
2026-03-16 09:37:44 +08:00
Zvonko Kaiser
99f32de1e5 kata-deploy: Update RuntimeClass PodOverhead
Align the podOverhead with the default_memory updated
in the previous commit.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
6a853a9684 gpu: Bump NVRC
We have a new release add this one to the next
Kata release.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
8ff5d164c6 runtime: make CDI annotation vendor-agnostic with lookup table
Replace hardcoded NVIDIA vendor ID (0x10de) and class (0x030) checks
with a vendor-agnostic lookup table (cdiDeviceKind) that maps PCI
vendor/class pairs to CDI device kinds. This makes it straightforward
to add support for new device types by adding entries to the table.

Refactor siblingAnnotation to resolve device BDFs once upfront and
reuse them for both CDI type detection and sibling matching, eliminating
redundant sysfs reads. Devices not in the lookup table (e.g. NVSwitches)
are skipped with errNoSiblingFound, while known device types that fail
to match a sibling produce a hard error.

Consolidate the hot-plug and cold-plug device loops into a single loop
over extracted container paths, removing duplicated filtering logic.

Export GetPCIDeviceProperty from the device drivers package to allow
vendor/class lookup from sysfs in the container annotation path.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
d4c21f50b5 gpu: Bump default memory to 8G for GPU runtimes
We need enough inital memory to prepare more complex
platforms like HGX H100 or HGX B200 systems.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
5c9683f006 gpu: Remove devtmpfs.mount=0
With the newest NVRC release this is solved and does
not need to be overriden.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
d22c314e91 gpu: Increase dial_timeout=1200
For cold-plug when running with nerdctl the timeouts in the config
are being used, increase the dial_timeout (e.g. for CreateSandbox) to match
create_container_timeout.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
7fe84c8038 gpu: HGX Rootfs Fixes
Various smaller fixes to enable HGX systems.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Joji Mekkattuparamban
1fd66db271 nvidia-gpu: add missing libraries to rootfs
Added the missing packages to the nvidia rootfs.

Fixes #12534

Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>
2026-03-13 16:24:32 -07:00
Dan Mihai
9332b75c04 Merge pull request #12661 from stevenhorsman/runtime-go-1.25.8
runtime: bump go.mod version
2026-03-13 14:06:08 -07:00
Zvonko Kaiser
d382379571 kernel: Ship System.map as part of the kernel build
Some use-cases need the System.map of the running kernel,
ship it via kernel-artifact.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-13 19:27:18 +00:00
Manuel Huber
4a7022d2f4 tests: nvidia: call genpolicy auth for all tests
Call the setup_genpolicy_registry_auth in run_kubernetes_nv_tests.sh.
Authenticate before exercising any tests.

Recently, we have seen UnauthorizedError messages for the CUDA
vectorAdd image. While this image is not gated behind authentication,
rate limiting may be a possible issue.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-13 09:03:01 -07:00
LandonTClipp
9a8932412d docs: remove URL and markdown reference checks
This URL check performed a CURL command to see if it was real. This will
not work in the mkdocs world because the docs might reference a link that
is not yet built on the main page. This is a chicken-and-egg problem.

For reference:

```
ERROR: Invalid URL 'https://kata-containers.github.io/kata-containers/installation/#helm-chart' found in the following files:

tools/packaging/kata-deploy/helm-chart/README.md
```

The markdown reference requirement was put in place for the old docs system, but this
will not apply anymore in the new mkdocs system. I'm removing this
entirely because it will only get in the way and cause confusion.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-03-12 15:48:35 -05:00
LandonTClipp
d5d741f4e3 ci: Don't run CI builds on doc PRs
We disable the Kata artifact builds and testing if the PR is only
related to documentation. Regular static checks will remain.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-03-12 15:48:05 -05:00
Zvonko Kaiser
4c450a5b01 Merge pull request #12648 from manuelh-dev/mahuber/trustee-upgrade
versions: bump trustee to latest version
2026-03-12 14:09:15 -04:00
Steve Horsman
7d2e18575c Merge pull request #12343 from zvonkok/release-model
doc: Release model update
2026-03-12 14:44:51 +00:00
Zvonko Kaiser
7f662662cf lint: Fix 80 char column size
Make markdownlint happy

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-12 12:03:29 +00:00
Zvonko Kaiser
6e03a95730 doc: Update Release Process
Add how Kata is doing the rolling release.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-12 12:03:29 +00:00
Fupan Li
c1b7069e50 tools: fix the genpolicy building issue
Add the new helper item bring by the cargo bump

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
fddd1e8b6e dragonball: update the Cargo.lock and rm the unused crate
update the Cargo.lock  and rm the unused crate

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
d6178d78b1 dragonball: fix the tcp test address for 127.0.0.2
Fix TCP test addresses from 127.0.0.2 to 127.0.0.1 for vsock backend
tests

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
1c7b14e282 dragonball: Fix the feature gating for host devices
Fix feature-gating for PCI/virtio in dragonball
device_manager and mptable test

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
e9bda42b01 dragonball: fix the failed UT tests
Fix dragonball make check: clippy and format errors

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00