Commit Graph

19340 Commits

Author SHA1 Message Date
Fabiano Fidêncio
56da8097c2 Merge pull request #13204 from fidencio/topic/versions-bump-qemu
versions: Bump QEMU to 11.0.1
2026-06-12 17:14:57 +02:00
Fabiano Fidêncio
110843d6e1 Merge pull request #13138 from manuelh-dev/mahuber/runt-rs-mem-file-removal
runtime(-rs): remove file_mem_backend config option
2026-06-12 17:13:04 +02:00
Hyounggyu Choi
edead9e97b Merge pull request #13189 from stevenhorsman/osv-scanner-refactor
workflows: refactor osv-scanner workflows
2026-06-12 12:04:12 +02:00
Fabiano Fidêncio
e758f4b280 Merge pull request #13202 from gkurz/fix-generate-vendor
generate_vendor: Fix heavily broken logic
2026-06-12 11:48:50 +02:00
Fabiano Fidêncio
a016fd0485 Merge pull request #13198 from fidencio/topic/fix-ci-tee-static-sizing-overhead
tests: raise k8s memory/QoS pod limits for TEE runtime-rs CI
2026-06-12 11:46:56 +02:00
Fabiano Fidêncio
723f74e782 Merge pull request #13209 from fidencio/topic/fix-kata-monitor-runc-pod-runtime
tests: launch kata-monitor runc workload with explicit runtime
2026-06-12 11:40:19 +02:00
Fabiano Fidêncio
54bb736ab4 Merge pull request #13193 from BbolroC/set-nightly-for-qemu-coco-dev-runtime-rs-on-s390x
GHA: Set nightly/dev builds for qemu-coco-dev-runtime-rs on s390x
2026-06-12 11:21:33 +02:00
Greg Kurz
eac5dd2907 generate_vendor: Fix heavily broken logic
While checking the content of the vendor tarball artifact in the 3.31.0
release page, I realized that it is lacking most of the Rust code and
all the Go code. It turns out that the script is badly broken in many
ways:

1. Cargo workspace conflicts: Vendored dependencies were treated as
   workspace members, causing "current package believes it's in a
   workspace when it's not" errors. Fixed by adding vendor directory
   exclusions to root Cargo.toml.

2. Missing Go vendoring: Script only searched for Cargo.lock files,
   never processing go.mod files despite having a case statement for
   them. Fixed by adding go.mod to the find command with '-o -name go.mod'.

3. Wrong tar execution directory: Script ran tar from release/ directory
   but vendor_dir_list contained paths relative to repo root (./vendor,
   ./src/agent/vendor, etc.), causing "Cannot stat" errors. Fixed by
   moving tar command before final popd.

4. Relative tarball path: Since tar now runs from repo root, converted
   tarball path to absolute to ensure it's created in the release
   directory.

5. Vendored go.mod pollution: Added '-path ./vendor -prune' to find
   command to exclude vendor directory, preventing the script from
   finding go.mod files inside vendored Rust dependencies.

The fixes are simple enough they can be squashed into a single
commit.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-06-12 10:06:53 +02:00
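The search behaviour described in fixes 2 and 5 of the commit above (look for both Cargo.lock and go.mod, but prune the top-level ./vendor directory) can be sketched as follows. This is a Python illustration under stated assumptions, not the actual script, which the message describes in terms of find, tar and popd. The function name find_lock_files is hypothetical.

```python
import os


def find_lock_files(repo_root):
    """Return repo-relative paths of Cargo.lock and go.mod files.

    The top-level vendor directory is pruned, mirroring the
    '-path ./vendor -prune' fix described in the commit message.
    """
    wanted = {"Cargo.lock", "go.mod"}
    root = os.path.abspath(repo_root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune only ./vendor at the repository root, so go.mod files
        # shipped inside vendored dependencies are never visited.
        if dirpath == root and "vendor" in dirnames:
            dirnames.remove("vendor")
        for name in filenames:
            if name in wanted:
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)
```

Calling find_lock_files(".") from the repository root would list the files that each need a vendoring step, without picking up anything under ./vendor.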
Fupan Li
9553614f32 Merge pull request #12772 from Apokleos/nydus-standalone
runtime-rs: Nydus standalone mode support in runtime-rs
2026-06-12 10:36:17 +08:00
Manuel Huber
70d8f1bf3d runtime: remove file_mem_backend config option
Remove the Go runtime file_mem_backend and valid_file_mem_backends
config knobs, along with the corresponding sandbox annotation handling.

The runtime still enables file-backed shared memory automatically for
virtio-fs by using /dev/shm as the backing directory. This only removes
the user-selectable backend path.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-12 00:07:16 +00:00
Manuel Huber
86fd65271c runtime-rs: remove file_mem_backend config option
Although the config knob is parsed, it is unused in the Rust shim,
which renders it useless. Remove the file_mem_backend config option,
as it currently has no users. As this option is usable in the Go
shim, we leave it intact there.

For the Rust shim, /dev/shm is still used in a similar way to the Go
shim when filesystem sharing is enabled (virtio-fs). Future use cases
that rely on other file memory backends are currently expected to
define these backends in a similar manner: based on the
configuration/platform, determine the proper file memory backend, but
do not let end users determine the file memory backend.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-06-12 00:07:16 +00:00
Fabiano Fidêncio
b323697f37 Merge pull request #13111 from Apokleos/monitor-disk-usage
Metrics: Add support for monitoring disk usage via statfs
2026-06-12 00:41:31 +02:00
Fabiano Fidêncio
780c242bfd Merge pull request #12832 from Apokleos/indep-iothreads
runtime-rs: Add support Independent iothreads
2026-06-12 00:24:41 +02:00
Fabiano Fidêncio
cda6c8c6e0 tests: raise k8s memory/QoS pod limits for TEE runtime-rs CI
Increase memory request/limit values used by k8s memory and QoS
integration workloads so SNP/TDX static-sized sandboxes boot reliably
under the new sizing defaults.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-11 22:03:36 +02:00
Fabiano Fidêncio
46add95802 versions: Bump QEMU to 11.0.1
Bump QEMU to its latest release.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-11 22:01:26 +02:00
Fabiano Fidêncio
9e597d33f2 tests: launch kata-monitor runc workload with explicit runtime
The kata-monitor negative test creates a non-kata pod and asserts it does
not appear in the kata-monitor cache (built from /run/vc/sbs, where only
kata sandboxes register).

However, the workload was started without a runtime handler, so it used
containerd's default runtime, which in the CI containerd config is set
to kata, so the "runc" pod was actually launched as a kata sandbox,
registered under /run/vc/sbs, and tripped the assertion ("cache: got
runc pod ...").

Start the workload with an explicit runc handler (configurable via
RUNC_RUNTIME) so it is a genuine runc sandbox that never touches
/run/vc/sbs.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-11 21:59:53 +02:00
Alex Lyn
1034d7fc46 tests: Add support nydus tests for qemu-runtime-rs and clh-runtime-rs
Enable the nydus tests for qemu-runtime-rs and clh-runtime-rs, and
make them compatible with both.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
e21621140f ci: Add qemu-runtime-rs and clh-runtime-rs test with nydus
Enable nydus tests for qemu-runtime-rs and clh-runtime-rs.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
4eb7512e7b docs: Update how-to guide for virtio-fs-nydus with runtime-rs
Add comprehensive documentation for using virtio-fs-nydus shared
filesystem with Kata Containers. This guide covers:
(1) Clarify configuration options for virtio-fs-nydus and nydus image
    preparation and usage.
(2) Update daemon configuration and lifecycle management and introduce
    standalone, inline nydus architecture.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
fa84eecd2d runtime-rs: Implement ShareVirtioFsNydus for standalone mode
Introduce `ShareVirtioFsNydus` to enable standalone Nydus rootfs
support. This implementation acts as the bridge between runtime-rs
and the external `nydusd` daemon.

Key Capabilities:
(1) Trait Implementation: Implements `ShareFs` (for VM device/storage) and
  `NydusShareFs` (for RAFS lifecycle) traits.
(2) Daemon Lifecycle Management: Handles `nydusd` spawning, supervision,
  and graceful shutdown.
(3) Native Overlay Support: Configures `nydusd` with `passthrough_fs`
  backend to provide native overlay (upperdir/workdir) support.
(4) API Integration: Utilizes `NydusClient` for granular control over RAFS
  mount/umount operations.
(5) QEMU Integration: Enables `virtio-fs-nydus` device support,
  facilitating standalone mode execution.

This implementation allows Kata containers to utilize an external `nydusd`
process for Nydus rootfs management, providing a cleaner separation between
the runtime and the Nydus daemon lifecycle.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
edfe9ea403 runtime-rs: refine ShareFs abstraction with lifecycle and Nydus traits
Refactor the `ShareFs` trait to improve modularity and support
standalone Nydus mode:

(1) Added `stop()` method to manage daemon teardown.
(2) Introduced a dedicated trait for Nydus-specific data-plane
operations.

This refactoring cleans up the `ShareFs` trait by consolidating
daemon lifecycle handling and isolating Nydus-specific extensions,
paving the way for cleaner standalone Nydus implementation.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
720a8688b4 runtime-rs: Add daemon manager for nydusd process lifecycle
Implement Nydusd to manage the nydusd daemon process:
(1) start: spawn process, validate paths, wait for API ready,
    setup passthrough fs.
(2) stop: kill process, cleanup socket files.
(3) mount_rafs/mount_rafs_with_overlay: high-level filesystem
    mount operations.
(4) build_args: construct virtiofs mode command line arguments.

This provides process lifecycle management with an internal NydusClient.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
Alex Lyn
c1ebf269f7 runtime-rs: Add nydus client for nydusd API communication via HTTP
Implement NydusClient to interact with the nydusd daemon via a Unix
socket:
(1) check_status: query daemon state via GET /api/v1/daemon.
(2) mount/umount: manage filesystem mounts via POST/DELETE
  /api/v1/mount.
(3) wait_until_ready: poll daemon until RUNNING state.

This provides a lightweight, stateless HTTP client layer for nydusd
API.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:42:48 +02:00
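The wait_until_ready step listed in the commit above (poll the daemon until it reports the RUNNING state) can be sketched as a generic polling loop. This is a Python illustration, not the NydusClient code. The check_status callable stands in for the GET /api/v1/daemon query and is assumed to return the state as a string. The timeout, interval, sleep and clock parameters, and the "INIT" state used in the example, are hypothetical.

```python
import time


def wait_until_ready(check_status, timeout=5.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll check_status() until it returns "RUNNING" or the timeout expires."""
    deadline = clock() + timeout
    while True:
        try:
            if check_status() == "RUNNING":
                return
        except OSError:
            # Assumption: the API socket may not exist yet while the
            # daemon is starting, so connection errors are retried.
            pass
        if clock() >= deadline:
            raise TimeoutError("nydusd did not reach RUNNING state in time")
        sleep(interval)


# Example with a fake status source instead of a real daemon.
_states = iter(["INIT", "INIT", "RUNNING"])
wait_until_ready(lambda: next(_states), timeout=1.0, interval=0.0)
```

Injecting sleep and clock keeps the loop testable without real delays; a real client would leave them at their defaults.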
Alex Lyn
4c63b8e3de agent: handle ENOSYS in overlayfs storage handler
In standalone nydusd mode with virtio-fs passthrough, the guest-side
mkdir may fail with ENOSYS. Update the overlayfs storage handler to
skip directory creation when the directory already exists, logging a
warning instead of failing.

This ensures container rootfs setup succeeds when nydusd's native
overlay manages the directory structure.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:25:18 +02:00
Alex Lyn
8eb564dfb8 kata-sys-util: handle ENOSYS gracefully in mount destination creation
When using virtio-fs with nydusd's passthrough_fs, mkdir operations may
return ENOSYS on certain filesystem configurations. This causes mount
destination creation to fail unexpectedly.

Handle ENOSYS errors gracefully alongside AlreadyExists by verifying the
directory exists after the failed mkdir attempt, allowing the mount to
proceed if the directory is already present.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:25:18 +02:00
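The tolerance described in the commit above (treat ENOSYS like AlreadyExists, but only if the directory really exists afterwards) can be sketched as follows. This is a Python illustration of the idea, not the kata-sys-util code, which is written in Rust. The function name ensure_mount_destination and the injectable mkdir parameter are hypothetical.

```python
import errno
import os


def ensure_mount_destination(path, mkdir=os.mkdir):
    """Create path, tolerating EEXIST and ENOSYS when the directory exists."""
    try:
        mkdir(path)
    except OSError as err:
        tolerated = err.errno in (errno.EEXIST, errno.ENOSYS)
        # Verify after the failed mkdir: only proceed if the directory
        # is actually present, otherwise surface the original error.
        if tolerated and os.path.isdir(path):
            return
        raise
```

If mkdir fails with ENOSYS and the directory is missing, the error is re-raised, so a genuinely broken destination is still reported.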
Alex Lyn
b50f803a4e kata-types: add virtio-fs-nydus shared fs configuration support
Add "virtio-fs-nydus" as a recognized shared filesystem type in the
hypervisor configuration. This enables the standalone nydusd mode where
nydusd runs as a separate process alongside virtiofsd.

The key changes:
(1) Add VIRTIO_FS_NYDUS constant for the new shared fs type.
(2) Register virtio-fs-nydus in adjust() and validate() paths, reusing
  the same virtio-fs validation logic since both use the vhost-user protocol.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 21:25:18 +02:00
Alex Lyn
854e76fb47 kata-types: Enhance related stuff for independent io threads
Refactor the comments and tests for independent iothreads.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
b0ebbc685d runtime-rs: Add support for independent iothreads for virtio blk devices
As independent iothreads can work with both virtio-scsi and virtio-blk
devices, this commit enables the feature for virtio-blk-pci
devices.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
980ecfdd96 runtime-rs: Add support independent iothreads within virtio-blk
1. Determine the iothread for virtio-blk devices. Only attach an
iothread when:
(1) enable_iothreads is true
(2) indep_iothreads > 0
(3) the block driver is not virtio-scsi (i.e., it is
virtio-blk)
More complex cases will be handled by future enhancements.

2. Add the iothread parameter for virtio-blk devices if specified.
If iothreads are set and passed, they must be set correctly for
virtio-blk devices via the QMP device_add arguments.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
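The three attach conditions listed in the commit above amount to a single predicate. A minimal sketch in Python, assuming the block driver is identified by a string such as "virtio-scsi" or "virtio-blk-pci"; the function name should_attach_iothread is hypothetical and the real logic lives in the Rust runtime.

```python
def should_attach_iothread(enable_iothreads, indep_iothreads, block_driver):
    """Return True only when all three conditions from the commit hold."""
    return (
        bool(enable_iothreads)               # (1) iothreads are enabled
        and indep_iothreads > 0              # (2) independent threads requested
        and block_driver != "virtio-scsi"    # (3) driver is virtio-blk
    )
```

With enable_iothreads left false, or indep_iothreads set to 0, no iothread is attached regardless of the driver.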
Alex Lyn
36e626649d runtime-rs: Add support independent IO threads in qemu cmdline
Add a new method that sets up independent IO threads for virtio-blk
hotplug devices on the qemu command line, so that independent IO
threads work well for virtio-blk devices.

Note that ObjectIoThread already exists, so it can be directly
reused in this case.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
86d165c0cc kata-types: Introduce a dedicated annotation for indep_iothreads
To give users more flexibility in enabling this feature, it can
also be set via annotations.

The dedicated annotation
"io.katacontainers.config.hypervisor.indep_iothreads" is introduced
for use within k8s clusters.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
bdc57b16e5 runtime-rs: Add configurable indep_iothreads in configurations
Setting indep_iothreads together with enable_iothreads helps achieve
high IO performance, so we need to provide a way for people to set it
if needed.

This commit introduces two configurable items:
- Makefile: DEFINDEPIOTHREADS, set at build time.
- configurations: indep_iothreads for people to set.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
d086d324e0 kata-types: Introduce independent IO thread for virtio-blk devices
The 'indep_iothreads' field is introduced in Hypervisor to make the
number of independent IO threads for virtio-blk devices configurable.
When set to a value greater than 0, it creates independent IO threads
that can be attached to virtio-blk devices during hotplug.

Note that it requires 'enable_iothreads' to be true for virtio-blk
devices to use these threads.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:20 +02:00
Alex Lyn
5a00053b38 kata-agent: Implement filesystem space usage collection via statfs
Add update_guest_filesystem_metrics() that collects disk space usage
(total/used/available) for all read-write mounted filesystems inside
the guest VM. This enables monitoring guest disk usage in kata/coco
pod through the existing GetMetrics RPC.

The output metrics look like this:
- kata_guest_filesystem_bytes{mount="/",device="vda",item="total|used|available"}
- kata_guest_filesystem_inodes{mount="/",device="vda",item="total|used|available"}

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:05 +02:00
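The total/used/available values named in the commit above can be derived from statfs-style fields roughly as follows. This is a Python sketch, not the kata-agent code, which is written in Rust. The arithmetic follows the conventional df calculation and is an assumption about how the agent computes its values. The function name filesystem_metric_lines and the text output format are hypothetical.

```python
import os


def filesystem_metric_lines(mount, device, st):
    """Format byte and inode gauges from an os.statvfs()-like result."""
    size = st.f_frsize
    series = {
        "kata_guest_filesystem_bytes": {
            "total": st.f_blocks * size,
            "used": (st.f_blocks - st.f_bfree) * size,
            "available": st.f_bavail * size,  # space usable by unprivileged users
        },
        "kata_guest_filesystem_inodes": {
            "total": st.f_files,
            "used": st.f_files - st.f_ffree,
            "available": st.f_favail,
        },
    }
    lines = []
    for name, items in series.items():
        for item, value in items.items():
            lines.append(
                f'{name}{{mount="{mount}",device="{device}",item="{item}"}} {value}'
            )
    return lines
```

For example, filesystem_metric_lines("/", "vda", os.statvfs("/")) yields six lines, one per item for each of the two metrics.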
Alex Lyn
6c66724591 kata-agent: Add filesystem space usage metric declarations
Add two new GaugeVec metrics to expose guest filesystem space usage:
(1) kata_guest_filesystem_bytes{mount, device, item}: space in bytes
  (total/used/available)
(2) kata_guest_filesystem_inodes{mount, device, item}: inode counts
  (total/used/available)

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-11 20:47:05 +02:00
Fabiano Fidêncio
8f5b898e6d Merge pull request #13206 from stevenhorsman/fix-required-payload-name
ci: Update required tests
2026-06-11 20:46:37 +02:00
stevenhorsman
fb4600d66a runtime-rs: Fix test breakage
In #13147, for some reason a test block was added in the middle of code
and the code was stale when merged, which meant that a second
`mod test` section was added, breaking our tests. Merge the two
to fix this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-06-11 19:03:33 +02:00
stevenhorsman
1d854ad7af ci: Update required tests
publish-kata-deploy-payload got renamed in #13107, which broke the CI.

Now, instead of tracking all those intermediate steps, let's make sure
we only track the tests themselves.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-11 19:02:23 +02:00
Hyounggyu Choi
ae97194349 GHA: Run qemu-coco-dev-runtime-rs k8s test on zVSI nightly only
Run qemu-coco-dev-runtime-rs k8s test workflow on the zVSI
only during nightly builds.

Changes:
- Modified run-k8s-tests-on-zvsi.yaml to accept vmm as workflow
inputs instead of hardcoded matrix values
- run-k8s-tests-on-zvsi passes a conditional vmm value; 4 vmms
for nightly/dev builds and 3 vmms for all other PRs.

This ensures qemu-coco-dev-runtime-rs is only tested with nydus
snapshotter during nightly CI runs, reducing PR test time while
maintaining comprehensive nightly coverage.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-06-11 13:10:09 +02:00
Hyounggyu Choi
ed9d24d111 GHA: Add qemu-coco-dev-runtime-rs VMM to zVSI k8s tests
Add qemu-coco-dev-runtime-rs to the VMM matrix in the zVSI K8S
test workflow, configured to run only with the nydus snapshotter.

Changes:
- Add qemu-coco-dev-runtime-rs to the vmm matrix
- Exclude overlayfs + qemu-coco-dev-runtime-rs combination
- Exclude devmapper + qemu-coco-dev-runtime-rs combination
- Update CoCo-related conditional steps to include the new VMM:
* KBS environment variable setup
* kbs-client uninstall/install steps
* CoCo KBS deployment

This ensures qemu-coco-dev-runtime-rs is only tested with nydus
snapshotter, while maintaining existing test configurations for
other VMMs.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-06-11 13:10:09 +02:00
Fabiano Fidêncio
21657b9cd9 Merge pull request #13147 from manuelh-dev/mahuber/debug-go-rust
runtime-rs: Honor enable_debug for logs and adjust debugging documentation
2026-06-11 08:57:36 +02:00
Fabiano Fidêncio
38416f78ec Merge pull request #13190 from manuelh-dev/mahuber/fix-num-cpus-bats
tests: fix k8s-number-cpus expectation
2026-06-10 21:59:21 +02:00
Steve Horsman
150c7648cf Merge pull request #13197 from fidencio/topic/kata-monitor-tests-fixups
tests: align kata-monitor containerd version selector
2026-06-10 15:12:43 +01:00
Fabiano Fidêncio
4935bf8bc6 tests: align kata-monitor containerd version selector
Switch kata-monitor workflows from the deprecated "active" key to
"latest" so CI resolves containerd versions from versions.yaml correctly
after the key rename.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-09 21:25:45 +02:00
Hyounggyu Choi
6d2066b692 Merge pull request #13188 from BbolroC/set-static-resource-mgmt-properly-for-ibm-sel
runtime*: use static_sandbox_resource_mgmt defaults for qemu-se
2026-06-09 18:38:09 +02:00
Fabiano Fidêncio
6b06bf4ba5 Merge pull request #13107 from kata-containers/topic/kata-monitor-ship-image-as-part-of-the-release
kata-monitor: ship as a standalone multi-arch image starting with 3.32.0
2026-06-09 17:14:09 +02:00
Hyounggyu Choi
7cc6767fa2 runtime*: use static_sandbox_resource_mgmt defaults for qemu-se
Switch qemu-se config templates to use the TEE/CoCo-specific
static_sandbox_resource_mgmt defaults instead of the generic
QEMU defaults.

qemu-se-runtime-rs config now uses DEFSTATICRESOURCEMGMT_COCO
while runtime qemu-se config now uses DEFSTATICRESOURCEMGMT_TEE.
This aligns static sandbox resource management behavior with confidential
container expectations for qemu-se variants.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-06-09 14:45:50 +02:00
stevenhorsman
86b8afb761 workflows: refactor osv-scanner workflows
When I implemented the OSV scanner I followed the
guidance on the action repo to use a single workflow for
both PR and main tests and rely on a re-usable workflow.
Since then I've realised some negatives of this approach:
- Unlike actions, dependabot needs custom logic to bump
workflow pins, so we are more likely to be out of date
- A lack of transparency/notification of when updates
are needed, due to bugs/security fixes
- The dual workflow results in skipped jobs that
clutter the UI
- No ability to customise the pre-steps, or config

As such let's take the hit of managing two workflows,
in order to give us better flexibility.

Also add the `--call-analysis=none` option, as we run govulncheck
separately and don't want to have to compile and slow down the build.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Generated-By: IBM Bob
2026-06-09 13:38:17 +01:00
Fabiano Fidêncio
620d641458 ci: rename kata-deploy publish jobs
These jobs build and push the kata-deploy OCI image, so call them
publish-kata-deploy-image-* instead of *-payload-*, matching the
kata-monitor image jobs and making the workflow easier to read.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-09 14:33:30 +02:00
Fabiano Fidêncio
92a9691470 tests: add kata-monitor helm chart k8s test
Add a single-job k8s test that installs the kata-deploy helm chart
with monitor.enabled=true, pointed at the per-PR kata-monitor image
built earlier in the same run, and exercises both the rollout and the
user-visible behaviour:

  * the kata-monitor DaemonSet rolls out and the pod stays up without
    container restarts;
  * a real kata-runtime probe pod is scheduled, then /metrics and
    /sandboxes are scraped through the apiserver pod-proxy to prove
    kata-monitor sees the sandbox (non-zero running-shim count plus at
    least one per-sandbox kata_shim_* metric);
  * after the probe pod is deleted, /metrics drops back to a zero
    running-shim count.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: OpenAI Codex <codex@openai.com>
2026-06-09 14:33:30 +02:00