6131 Commits

Author SHA1 Message Date
Alex Lyn
65b2a75aca runtime-rs: Fix typo USE_BUILDIN_DB with USE_BUILTIN_DB
Corrects the typo 'BUILDIN' to the standard 'BUILTIN' across the
codebase to improve code quality and documentation consistency.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-29 19:17:03 +02:00
PiotrProkop
64735222c6 runtime: allow specifying logical/physical sector size for block devices
Add two new configuration knobs that control the logical and physical
sector sizes advertised by virtio-blk devices to the guest:

  block_device_logical_sector_size  (config file)
  block_device_physical_sector_size (config file)

  io.katacontainers.config.hypervisor.blk_logical_sector_size  (annotation)
  io.katacontainers.config.hypervisor.blk_physical_sector_size (annotation)

The annotation names are abbreviated relative to the config file keys
because Kubernetes enforces a 63-character limit on annotation name
segments, and the full names would exceed it.

Both settings default to 0 (let QEMU decide). When set, they are passed
as logical_block_size and physical_block_size in the QMP device_add
command during block device hotplug.

Setting logical_sector_size smaller than the container filesystem
block size will cause EINVAL on mount. The physical_sector_size can
always be set independently.

Values must be 0 or a power of 2 in the range [512, 65536]; other
values are rejected with an error at sandbox creation time.

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2026-03-27 18:56:54 +01:00
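The validation rule and the annotation-length constraint described in commit 64735222c6 can be sketched as follows. This is only an illustration under stated assumptions: the function names `validate_sector_size` and `annotation_name_fits` are hypothetical, and the actual runtime code is not shown in this log.

```rust
/// Kubernetes limit on the length of an annotation name segment.
const MAX_ANNOTATION_NAME_LEN: usize = 63;

/// Bounds for non-zero sector sizes, as stated in the commit message.
const MIN_SECTOR_SIZE: u32 = 512;
const MAX_SECTOR_SIZE: u32 = 65536;

/// Hypothetical validator: 0 means "let QEMU decide"; any other value
/// must be a power of two within [512, 65536].
fn validate_sector_size(size: u32) -> Result<(), String> {
    if size == 0 {
        return Ok(());
    }
    let in_range = (MIN_SECTOR_SIZE..=MAX_SECTOR_SIZE).contains(&size);
    if !size.is_power_of_two() || !in_range {
        return Err(format!(
            "invalid sector size {}: must be 0 or a power of 2 in [{}, {}]",
            size, MIN_SECTOR_SIZE, MAX_SECTOR_SIZE
        ));
    }
    Ok(())
}

/// Hypothetical helper: true when an annotation name fits the limit.
fn annotation_name_fits(name: &str) -> bool {
    name.len() <= MAX_ANNOTATION_NAME_LEN
}
```

With these helpers, 4096 is accepted and 1000 is rejected, and the abbreviated `blk_logical_sector_size` annotation fits the limit while a name built from the full config key `block_device_logical_sector_size` would not.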
Aurélien Bombo
30e030e18e Merge pull request #12679 from microsoft/user/romoh/gpu-fix
clh: Add VFIO device cold-plug support
2026-03-27 11:12:51 -05:00
Hyounggyu Choi
cd931d4905 runtime: Set emptydir_mode to DEFEMPTYDIRMODE_COCO for IBM SEL
The enablement of the trusted ephemeral storage for IBM SEL was
missed in #10559. Set the emptydir_mode properly for the TEE.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-03-26 15:55:30 +01:00
Roaa Sakr
858620d2e7 clh: Add VFIO device cold-plug support
Enable VFIO device pass-through at VM creation time on Cloud Hypervisor,
in addition to the existing hot-plug path.

Signed-off-by: Roaa Sakr <romoh@microsoft.com>
2026-03-25 16:39:25 -07:00
Manuel Huber
79efe3e041 tests: gpu: use container data storage feature
Use the container data storage feature for the k8s-nvidia-nim.bats
test pod manifests. This reduces the pods' memory requirements.
For this, enable the block-encrypted emptydir_mode for the NVIDIA
GPU TEE handlers.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-23 11:43:11 -07:00
Fabiano Fidêncio
1ec97d25e7 Merge pull request #12704 from stevenhorsman/security-fixes-23-mar-26
Security fixes 23 mar 26
2026-03-23 15:27:07 +01:00
Fabiano Fidêncio
aa6890eae1 Merge pull request #12675 from manuelh-dev/mahuber/cdh-storage-options
agent: add mkfs_opts parameter to cdh_secure_mount
2026-03-23 15:18:38 +01:00
stevenhorsman
2edb588ed9 kata-ctl: Pin micro_http
The micro_http crate was just pointing to the main branch and hadn't been updated for
around 3 years, so pin to the latest for stability and update to remediate RUSTSEC-2024-0002.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-23 10:34:28 +00:00
stevenhorsman
9871256771 versions: Bump cloud-hypervisor to v51
In v51 the license was added, so try bumping to this version
to solve the cargo deny issue

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-23 10:34:28 +00:00
dependabot[bot]
8de7f29981 agent-ctl: Bump aws-lc-rs to 1.16.2
Bump aws-lc-rs, so that aws-lc-sys updates to 0.39.0 to remediate
RUSTSEC-2026-0044 and https://osv.dev/vulnerability/RUSTSEC-2026-0048

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-23 10:34:28 +00:00
dependabot[bot]
1c63738b80 build(deps): bump aws-lc-fips-sys in /src/tools/agent-ctl
Bumps [aws-lc-fips-sys](https://github.com/aws/aws-lc-rs) from 0.13.12 to 0.13.13.
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](https://github.com/aws/aws-lc-rs/compare/aws-lc-fips-sys/v0.13.12...aws-lc-fips-sys/v0.13.13)

---
updated-dependencies:
- dependency-name: aws-lc-fips-sys
  dependency-version: 0.13.13
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-23 10:34:28 +00:00
dependabot[bot]
6e79a9d6ad build(deps): bump rustls-webpki in /src/tools/agent-ctl
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.103.3 to 0.103.10.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.103.3...v/0.103.10)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-version: 0.103.10
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-23 10:34:27 +00:00
dependabot[bot]
ef32923461 build(deps): bump tar from 0.4.44 to 0.4.45
Bumps [tar](https://github.com/alexcrichton/tar-rs) from 0.4.44 to 0.4.45.
- [Commits](https://github.com/alexcrichton/tar-rs/compare/0.4.44...0.4.45)

---
updated-dependencies:
- dependency-name: tar
  dependency-version: 0.4.45
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-23 10:34:27 +00:00
Steve Horsman
20cb65b1fb Merge pull request #12624 from lifupan/bump_rust_vmms
runtime-rs: Bump rust vmms for dragonball
2026-03-23 08:56:47 +00:00
Fupan Li
608f378bff dragonball: make sure the nydus worker thread can access the network
The dragonball vmm thread has joined the pod's netns, which cannot
access the network, so make sure the nydus worker thread joins the
netns of runD's main thread, which can access the network.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-22 22:44:24 +08:00
RuoqingHe
cfc1836a31 Merge pull request #12672 from stevenhorsman/agent-security-fixes
agent: Bump tracing-subscriber
2026-03-20 17:37:16 +08:00
Steve Horsman
7ab6e11e10 Merge pull request #12678 from kata-containers/dependabot/go_modules/src/runtime/google.golang.org/grpc-1.79.3
build(deps): bump google.golang.org/grpc from 1.72.0 to 1.79.3 in /src/runtime
2026-03-20 08:49:35 +00:00
Steve Horsman
e475fb2116 Merge pull request #12680 from kata-containers/dependabot/go_modules/src/tools/csi-kata-directvolume/google.golang.org/grpc-1.79.3
build(deps): bump google.golang.org/grpc from 1.63.2 to 1.79.3 in /src/tools/csi-kata-directvolume
2026-03-20 08:49:27 +00:00
stevenhorsman
38a655487f vsock-exporter: Switch bincode for serde_json
bincode is not maintained, so switch to serde_json to
resolve RUSTSEC-2025-0141

Assisted-By: Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:45:17 +00:00
stevenhorsman
e1d7d5bef8 agent: Remove async-std
It's a dev-dependency that doesn't seem to be used, so
remove it and resolve RUSTSEC-2025-0052

Assisted-By: Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:45:17 +00:00
stevenhorsman
e4eda5e1d8 agent: Bump tracing-subscriber
- Bump tracing-subscriber to 0.3.20 to resolve RUSTSEC-2025-0055
- Switch deprecated `slog_info!` for `slog::info!`

Generated-By: Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:45:17 +00:00
stevenhorsman
d06dadd8ef docs: Spelling updates
Either fixing typos, or including program/repo name in
backticks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
dependabot[bot]
2f5415d8f5 build(deps): bump google.golang.org/grpc
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.63.2 to 1.79.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.63.2...v1.79.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-19 10:03:45 +00:00
dependabot[bot]
3876a80208 build(deps): bump google.golang.org/grpc in /src/runtime
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.72.0 to 1.79.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.72.0...v1.79.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-19 10:03:30 +00:00
Manuel Huber
62d74bb1fd agent: add mkfs_opts parameter to cdh_secure_mount
Add an mkfs_opts parameter to cdh_secure_mount so that its users
can parametrize these options depending on their needs. For now,
there are two users providing explicit values (container image
layer storage and container data storage features).

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-18 15:37:32 -07:00
stevenhorsman
2a4227e02e kata-ctl: Try fixing unused_assignments error
`allow(unused_assignments)` isn't working because the warning comes
from macro-generated code, so reference the command in the error
to make sure it is used.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
stevenhorsman
ca7cdcd732 kata-ctl: Rewrite path_join test
This test was failing clippy by calling .unwrap() after
an .is_ok(), but after I looked at it, it seemed a bit messy,
so I split it up and tried rewriting it to make it more readable
IMHO.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
stevenhorsman
501578cc5a agent: Remove non-idiomatic unwrap
Calling .unwrap() after an .is_some() check is considered non-idiomatic,
as it performs redundant work and makes the code more verbose.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-17 16:04:58 +00:00
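The pattern that commit 501578cc5a removes can be illustrated with a small sketch. The function names below are hypothetical and are not taken from the agent code; the point is only the contrast between checking and then unwrapping, and binding the value in a single step.

```rust
/// Non-idiomatic: checks for a value, then unwraps it in a second step.
fn name_len_checked(name: Option<&str>) -> usize {
    if name.is_some() {
        return name.unwrap().len();
    }
    0
}

/// Idiomatic: `if let` tests and binds the value in one step.
fn name_len_if_let(name: Option<&str>) -> usize {
    if let Some(n) = name {
        return n.len();
    }
    0
}

/// Equivalent combinator form for simple cases.
fn name_len_map(name: Option<&str>) -> usize {
    name.map_or(0, |n| n.len())
}
```

All three functions return the same result for the same input; the second and third avoid the redundant check. The same reasoning applies to calling .unwrap() after .is_ok() on a Result.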
Alex Lyn
833b72470c Merge pull request #12647 from sprt/gp-improve
genpolicy: Improve emptyDir storage options and mount point validation
2026-03-17 13:56:42 +08:00
Manuel Huber
660e3bb653 gpu: Obsolete the NVIDIA initrd build
As the NVIDIA stack has shifted to using an image for both the
confidential and non-confidential variants, we retire the initrd
build.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 21:29:58 -04:00
Manuel Huber
169f92ff09 agent: cdh: Update CDH and API
With the new CDH version, the secure_mount API changes.
Further, the new CDH version no longer uses the luks-encrypt-storage
script but utilizes libcryptsetup as well as mkfs.ext4 and dd. Hence, adapt
some of the CDH and Kata component build steps.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-03-16 09:43:17 -07:00
Fupan Li
eabb98ecab dragonball: fix the type mismatch caused by the api change
For aarch64, it should match the api for get_one_reg and
set_one_reg.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-16 14:41:53 +08:00
Zvonko Kaiser
6a853a9684 gpu: Bump NVRC
We have a new release; add it to the next
Kata release.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
8ff5d164c6 runtime: make CDI annotation vendor-agnostic with lookup table
Replace hardcoded NVIDIA vendor ID (0x10de) and class (0x030) checks
with a vendor-agnostic lookup table (cdiDeviceKind) that maps PCI
vendor/class pairs to CDI device kinds. This makes it straightforward
to add support for new device types by adding entries to the table.

Refactor siblingAnnotation to resolve device BDFs once upfront and
reuse them for both CDI type detection and sibling matching, eliminating
redundant sysfs reads. Devices not in the lookup table (e.g. NVSwitches)
are skipped with errNoSiblingFound, while known device types that fail
to match a sibling produce a hard error.

Consolidate the hot-plug and cold-plug device loops into a single loop
over extracted container paths, removing duplicated filtering logic.

Export GetPCIDeviceProperty from the device drivers package to allow
vendor/class lookup from sysfs in the container annotation path.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
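The lookup-table idea from commit 8ff5d164c6 can be sketched as below. The runtime itself is written in Go and its table is not shown in this log, so everything here is an assumption made for illustration: the struct layout, the prefix match on the class string, and the CDI kind string "nvidia.com/gpu". Only the vendor ID 0x10de and the class value 0x030 come from the commit message.

```rust
/// One row of a hypothetical vendor/class to CDI-kind lookup table.
struct CdiDeviceKind {
    vendor_id: &'static str,
    class_prefix: &'static str,
    kind: &'static str,
}

/// Supporting a new device type means adding one more row here.
const CDI_DEVICE_KINDS: &[CdiDeviceKind] = &[CdiDeviceKind {
    vendor_id: "0x10de",    // NVIDIA, from the commit message
    class_prefix: "0x030",  // class value from the commit message
    kind: "nvidia.com/gpu", // assumed CDI kind, for illustration only
}];

/// Returns the CDI kind for a PCI vendor/class pair, or None when the
/// device is not in the table and should be skipped.
fn cdi_kind_for(vendor_id: &str, class: &str) -> Option<&'static str> {
    CDI_DEVICE_KINDS
        .iter()
        .find(|e| e.vendor_id == vendor_id && class.starts_with(e.class_prefix))
        .map(|e| e.kind)
}
```

In this sketch, a device whose vendor and class match a row resolves to a kind, while any other device (for example one with a different class) yields None, mirroring the skip behaviour the commit describes for devices outside the table.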
Zvonko Kaiser
d4c21f50b5 gpu: Bump default memory to 8G for GPU runtimes
We need enough initial memory to prepare more complex
platforms like HGX H100 or HGX B200 systems.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
5c9683f006 gpu: Remove devtmpfs.mount=0
With the newest NVRC release this is solved and does
not need to be overridden.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Zvonko Kaiser
d22c314e91 gpu: Increase dial_timeout=1200
For cold-plug when running with nerdctl, the timeouts in the config
are used, so increase the dial_timeout (e.g. for CreateSandbox) to match
create_container_timeout.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-03-15 09:53:32 -07:00
Fupan Li
c1b7069e50 tools: fix the genpolicy building issue
Add the new helper item brought in by the cargo bump

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
fddd1e8b6e dragonball: update the Cargo.lock and rm the unused crate
update the Cargo.lock and rm the unused crate

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
d6178d78b1 dragonball: fix the tcp test address for 127.0.0.2
Fix TCP test addresses from 127.0.0.2 to 127.0.0.1 for vsock backend
tests

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
1c7b14e282 dragonball: Fix the feature gating for host devices
Fix feature-gating for PCI/virtio in dragonball
device_manager and mptable test

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:04 +00:00
Fupan Li
e9bda42b01 dragonball: fix the failed UT tests
Fix dragonball make check: clippy and format errors

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
a66c93caaa dragonball: add GuestRegionCollection error
add GuestRegionCollection error variant with proper
error context preservation

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
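One common way to preserve error context when adding a variant such as the one in commit a66c93caaa is to wrap the underlying error and expose it through `source()`. The sketch below uses only the standard library; the type names `RegionCollectionError` and `AddressSpaceError` and the function `build_regions` are hypothetical and do not reflect the actual dragonball definitions.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error raised while collecting guest regions.
#[derive(Debug)]
struct RegionCollectionError(String);

impl fmt::Display for RegionCollectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for RegionCollectionError {}

/// Hypothetical higher-level error with a variant that keeps the cause.
#[derive(Debug)]
enum AddressSpaceError {
    GuestRegionCollection(RegionCollectionError),
}

impl fmt::Display for AddressSpaceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::GuestRegionCollection(_) => {
                write!(f, "failed to build guest region collection")
            }
        }
    }
}

impl Error for AddressSpaceError {
    /// Exposes the wrapped error so callers can walk the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::GuestRegionCollection(e) => Some(e),
        }
    }
}

impl From<RegionCollectionError> for AddressSpaceError {
    fn from(e: RegionCollectionError) -> Self {
        Self::GuestRegionCollection(e)
    }
}

/// Hypothetical builder that fails when regions overlap.
fn build_regions(overlap: bool) -> Result<(), AddressSpaceError> {
    if overlap {
        return Err(RegionCollectionError("regions overlap".to_string()).into());
    }
    Ok(())
}
```

Because the variant stores the original error instead of a formatted string, a caller can still inspect the underlying cause via `source()` after the error has been converted.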
Fupan Li
17454c0969 dragonball: Fix remaining warnings
remove unused imports (ioctl_ioc_nr, std::io) that cause -D warnings failures

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
f8617241f4 dragonball: Fix dragonball compilation errors for upgraded
Fix dragonball compilation errors for upgraded
dependencies (vm-memory 0.17.1, kvm-ioctls 0.24.0, vfio-ioctls 0.5.2)

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
d0f0dc2008 dragonball: fix the dbs-virtio-devices compiled errors
Update dbs-virtio-devices to compile with:
- virtio-bindings 0.2.x: VIRTIO_F_VERSION_1, VIRTIO_F_NOTIFY_ON_EMPTY,
  VIRTIO_F_RING_PACKED moved from virtio_blk/virtio_net/virtio_ring to
  virtio_config module.
- virtio-queue 0.17.0: Descriptor no longer exported at top level, use
  desc::split::Descriptor instead.
- vhost 0.15.0: Master->Frontend, VhostUserMaster->VhostUserFrontend,
  MasterReqHandler->FrontendReqHandler,
  VhostUserMasterReqHandler->VhostUserFrontendReqHandler,
  SLAVE_REQ->BACKEND_REQ, SLAVE_SEND_FD->BACKEND_SEND_FD,
  set_slave_request_fd->set_backend_request_fd.
  FS slave messages (VhostUserFSSlaveMsg etc.) removed from vhost crate;
  SlaveReqHandler now implements VhostUserFrontendReqHandler with
  handle_config_change only.
- fuse-backend-rs 0.14.0: Handle CachePolicy::Metadata variant,
  fix get_rootfs() returning tuple, use buffer-based I/O for Ufile
  since ReadVolatile/WriteVolatile are not implemented for Box<dyn Ufile>.
- vm-memory 0.17.1: GuestRegionMmap::new returns Option instead of
  Result, mmap::Error removed.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
3e39c1fad3 dragonball: fix the issue of kvm-binding upgraded
Fix the compile errors caused by the kvm-bindings and
kvm-ioctls upgrades.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
a6a81124cb runtime-rs: fix the api changes for vm-memory 0.17.1 API
Rename vm-memory GuestMemory methods for the 0.17.1 upgrade:
- Rename read_from -> read_volatile_from, write_to -> write_volatile_to,
  read_exact_from -> read_exact_volatile_from, and write_all_to ->
  write_all_volatile_to across all dragonball Rust source files.
- Change bitmap() return type from &Self::B to BS<'_, Self::B>.
- Move as_slice/as_mut_slice from GuestMemoryRegion trait impl to inherent
  impl block, using get_host_address for mmap regions.
- Update GuestMemory impl: remove type I, use impl Iterator return type.
- Replace Error with GuestRegionCollectionError for region collection errors.
- Fix VolatileSlice::with_bitmap call to include mmap parameter.
- Fix test: use ptr_guard().as_ptr() instead of removed as_ptr().

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00
Fupan Li
8d09a0e7e7 runtime-rs: Bump the rust-vmm related crates
vm-memory 0.10.0 → =0.17.1
vmm-sys-util 0.11.0 → 0.15.0
kvm-bindings 0.6.0 → 0.14.0
kvm-ioctls =0.12.1 → 0.24.0
virtio-queue 0.7.0 → 0.17.0
virtio-bindings 0.1.0 → 0.2.0
fuse-backend-rs 0.10.5 → 0.14.0

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2026-03-12 10:58:03 +00:00