18136 Commits

Author SHA1 Message Date
Aurélien Bombo
cbfdc4b764 Revert "ci: Implement build step for CSI driver"
This partially reverts commit fb87bf221f.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-10 13:51:21 -05:00
Aurélien Bombo
b6c60d9229 Merge pull request #10559 from sprt/conf-local-storage
coco: Implement trusted ephemeral data storage
2026-03-10 10:39:40 -05:00
Dan Mihai
f9a8eb6ecc genpolicy: allow_mount improvements for emptyDir
1. Reduce the complexity of the new allow_mount rules for emptyDir.

2. Reverse the order of the two allow_mount versions, as a hint to the
   rego engine that the first version is more often matching the input.

3. Remove `p_mount.source != ""` from mount_source_allows, because:
 - Policy rules typically test the values from input, not values read
   from Policy.
 - mount_source_allows is no longer called for emptyDir mounts after
   these changes, so p_mount.source is not empty.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-03-09 14:52:17 -05:00
Fabiano Fidêncio
374b0abe29 tests: Fix kubelet data dir for k0s in trusted ephemeral storage test
k0s uses /var/lib/k0s/kubelet instead of /var/lib/kubelet as its
kubelet data directory. Introduce get_kubelet_data_dir() in
tests_common.sh and use it in k8s-trusted-ephemeral-data-storage.bats
instead of hardcoding /var/lib/kubelet.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
718632bfe0 build: Add artifacts to .gitignore
This adds various files that are generated during development.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
68bdbef676 tests: Improve logging for some tests
Use modern test semantics to ease debugging.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
3dd77bf576 tests: Introduce new env variables to ease development
It can be useful to set these variables during local testing:

 * AZ_REGION: Region for the cluster.
 * AZ_NODEPOOL_TAGS: Node pool tags for the cluster.
 * GENPOLICY_BINARY: Path to the genpolicy binary.
 * GENPOLICY_SETTINGS_DIR: Directory holding the genpolicy settings.

I've also made it so that tests_common.sh modifies the duplicated
genpolicy-settings.json (used for testing) instead of the original git-tracked
one.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
aae54f704c ci: Stop deploying the CSI driver
The design moved away from the CSI driver, so stop deploying it.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
a98e328359 tests: Add test for trusted ephemeral data storage
This tests the feature on CoCo machines.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
9fe03fb170 genpolicy: Support trusted ephemeral data storage
* Introduces a new cluster_config setting encrypted_emptydir defaulting to true.
 * Adapts genpolicy for encrypted emptyDirs.

Crucially, the rules.rego change checks that the mount and the storage are
well-formed together:

 * i_storage.source matches a known regex.
 * i_storage.mount_point == $(spath)/BASE64(i_storage.source)
 * i_storage.mount_point == p_storage.mount_point
 * i_storage.mount_point == i_mount.source

Note that policy enforcement is necessary to prevent rogue device injection.
E.g. the agent could not blindly encrypt all block devices as some use cases
only need dm-verity.
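
As an illustration only, the three equality checks listed above can be sketched in Rust rather than Rego. The `Storage` struct, the `storage_and_mount_agree` helper, and the choice of standard padded base64 are assumptions made for this sketch; the regex check on `i_storage.source` is left out because the pattern is not given here.

```rust
/// Standard base64 (RFC 4648 alphabet with `=` padding). Whether the real
/// rules use this exact variant is an assumption of this sketch.
fn base64_encode(input: &[u8]) -> String {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Hypothetical stand-in for the storage objects compared by the policy.
struct Storage<'a> {
    source: &'a str,
    mount_point: &'a str,
}

/// Returns true when the input storage, the policy storage, and the input
/// mount all agree on the mount point derived from the storage source.
fn storage_and_mount_agree(
    spath: &str,
    i_storage: &Storage,
    p_storage: &Storage,
    i_mount_source: &str,
) -> bool {
    let expected = format!("{}/{}", spath, base64_encode(i_storage.source.as_bytes()));
    i_storage.mount_point == expected
        && i_storage.mount_point == p_storage.mount_point
        && i_storage.mount_point == i_mount_source
}
```

Deriving the mount point from the source means a device injected under a different source cannot reuse a mount point that the policy already allows.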

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
eaa711617e agent: Support trusted ephemeral data storage
Handles block-based emptyDirs plugged via virtio-blk and virtio-scsi by
encrypting and formatting them.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Aurélien Bombo
a4fd32a29a runtime: Support trusted ephemeral data storage
* Introduces the `emptydir_mode` config flag to allow instructing the runtime
   to create a block device for emptyDir volumes.
 * The block device is created in the original emptyDir folder on the host
   so that Kubelet can monitor its disk usage and evict the pod if it exceeds
   its sizeLimit. This matches runc and virtio-fs.
 * The block device's disk image file is sparse to minimize host disk
   footprint.
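
A minimal sketch of how a sparse disk image file can be created, written in Rust for illustration. The helper name `create_sparse_image` is hypothetical and this is not the runtime's actual code. It relies on `File::set_len`, which extends a file without writing data blocks; on filesystems that support sparse files, the extended range then takes up little or no space on disk.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Creates (or truncates) `path` and extends it to `size_bytes` without
/// writing any data, so the file is a hole where the filesystem allows it.
fn create_sparse_image(path: &Path, size_bytes: u64) -> io::Result<()> {
    let file = File::create(path)?;
    file.set_len(size_bytes)?;
    Ok(())
}
```

The file reports its full logical size while consuming host disk space only as the guest writes to it.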

Fixes: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Alex Lyn
fb743a304c runtime: Support plugging a disk as an image file
Some VMMs support plugging a disk as an image file instead of a block device,
so we adapt the runtime to support that.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Co-authored-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-09 14:52:17 -05:00
Alex Lyn
22c4cab237 Merge pull request #12623 from Apokleos/fix-dgb-ut
runtime-rs: Fix dragonball's flaky unit tests
2026-03-09 11:38:02 +08:00
Alex Lyn
62b0f63e37 dragonball: Generate unique TAP names to avoid conflicts
The vhost-kern net unit test used a fixed TAP interface name
("test_vhosttap"). When tests run in parallel or a previous run
leaves the interface behind, TAP creation can fail with
EBUSY ("Resource busy"), making CI flaky.

Introduce a unique_tap_name() helper in the tests and use it to
generate a per-test TAP name (based on pid/thread/counter),
avoiding name collisions and stabilizing CI.
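
A minimal sketch of what such a helper could look like, not the code from this commit. The `tap` prefix and the digit layout are assumptions. The sketch combines the pid with a process-wide atomic counter and omits the thread id, since the counter already distinguishes threads within one process. Linux interface names are limited to 15 bytes plus a terminating NUL, so the name is kept at exactly 15 bytes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Process-wide sequence number shared by all test threads.
static TAP_COUNTER: AtomicU32 = AtomicU32::new(0);

/// Builds a 15-byte interface name: "tap" + 5 hex digits of the pid +
/// 7 hex digits of a per-process counter.
fn unique_tap_name() -> String {
    let pid = std::process::id() & 0xf_ffff;
    let seq = TAP_COUNTER.fetch_add(1, Ordering::Relaxed) & 0xfff_ffff;
    format!("tap{:05x}{:07x}", pid, seq)
}
```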

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 17:33:40 +08:00
Alex Lyn
b2932f963a Merge pull request #12631 from Apokleos/fix-suffix
ci: keep mktemp output suffix stable with .yaml
2026-03-06 14:15:49 +08:00
Alex Lyn
1c8c0089da dragonball: fix flaky signal_handler test using libc::raise
The signal_handler test was intermittently failing because it used
kill(pid, sig), which sends signals asynchronously to the process.
This created a race condition where the child thread could exit and
be joined before the signal was delivered or processed.

This fix includes:
1. Replaces `kill` with `libc::raise` to ensure signals are delivered
   synchronously to the calling thread.
2. Reorders triggers to verify standard signals before installing
   seccomp filters.
3. Guarantees that metrics are incremented before the child thread
   terminates and is joined by the main thread.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
d0718f6001 dragonball: Fix unnecessary parentheses around type
warning: unnecessary parentheses around type
   --> src/dragonball/dbs_legacy_devices/src/serial.rs:245:39
    |
245 |         let out: Arc<Mutex<Option<Box<(dyn std::io::Write + Send + 'static)>>>> =
    |                                       ^                                   ^
    |
    = note: `#[warn(unused_parens)]` (part of `#[warn(unused)]`) on by default
help: remove these parentheses

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
b4161198ee dragonball: Remove unused imports variables in dbs_pci
Fix warnings of unused imports as below:
```
warning: unused imports: `DEVICE_ACKNOWLEDGE`, `DEVICE_DRIVER_OK`, `DEVICE_DRIVER`, `DEVICE_FEATURES_OK`, and `DEVICE_INIT`
    --> src/dragonball/dbs_pci/src/virtio_pci.rs:1177:9
     |
1177 |         DEVICE_ACKNOWLEDGE, DEVICE_DRIVER, DEVICE_DRIVER_OK, DEVICE_FEATURES_OK, DEVICE_INIT,
     |         ^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^
     |
     = note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default
```

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
ca4e14086f runtime-rs: Fix warnings of unformatted codes
Fix warnings from unformatted code.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
ce800b7c37 dragonball: Fix flaky test_vhost_user_net_virtio_device_activate hang
The vhost-user-net tests could hang in CI because
VhostUserNet::new_server() blocks indefinitely on listener.accept()
when the slave fails to connect in time
(e.g. due to scheduler delays or flaky socket paths). This also caused
panics when connect_slave() returned None and the test unwrapped it.

Fix the tests by:
- using a `/tmp`, absolute, unique unix socket path per test run
- retrying slave connect with a deadline
- running new_server() in a separate thread and waiting via
  recv_timeout() to ensure the test never blocks indefinitely
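
The last point follows a general pattern that can be sketched with the standard library alone. `run_with_timeout` is a hypothetical helper, not the code from this commit, and the closure stands in for a blocking call such as `new_server()`.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a separate thread and waits at most `timeout` for its
/// result. Returns `None` if the result does not arrive in time.
fn run_with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already have given up, so a failed send is ignored.
        let _ = tx.send(work());
    });
    rx.recv_timeout(timeout).ok()
}
```

The worker thread is detached, not cancelled: on a timeout it keeps running in the background, but the test itself can fail promptly without hanging.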

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
a988b10440 dragonball: Fix flaky test_vhost_user_net_virtio_device_normal hang
Fix the flaky test hang by implementing thread timeouts.

The `test_vhost_user_net_virtio_device_normal` was hanging in CI
when master/slave threads drifted.

This commit stabilizes the test by:
- Using `tempfile` and unique paths to ensure socket isolation.
- Adding a 5s deadline for slave connections to handle CI jitter.
- Running `new_server` in a separate thread with a `recv_timeout`
  to prevent the CI pipeline from deadlocking.
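
A connection deadline of this kind can be sketched as a small retry loop. `retry_until` and its parameters are hypothetical names for this illustration; the closure stands in for a single connection attempt that returns `None` on failure.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `attempt` until it yields a value or `deadline` has elapsed,
/// sleeping `interval` between attempts.
fn retry_until<T>(
    deadline: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    let start = Instant::now();
    loop {
        if let Some(value) = attempt() {
            return Some(value);
        }
        if start.elapsed() >= deadline {
            return None;
        }
        thread::sleep(interval);
    }
}
```

With a 5s deadline, a caller would pass `Duration::from_secs(5)` and handle the `None` case explicitly, not by unwrapping.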

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
f36218d566 dragonball: Fix flaky test_inner_stream_timeout in inner backend
The `test_inner_stream_timeout` test case was prone to failure due to a
race condition between the main thread and the background handler. The
test relied on hardcoded `thread::sleep` durations, which could cause
the second read operation to time out (150ms window) before the main
thread performed its write (after a 300ms sleep) under high system load.

This commit stabilizes the test by:
1. Replacing fixed sleep durations with a `Condvar` and a `stage`
   variable to implement a deterministic state machine.
2. Synchronizing the threads so that the main thread only writes data
   after the background handler has confirmed it is ready or has
   completed its previous phase.
3. Ensuring the read timeout is explicitly managed between different
   validation stages to prevent accidental `TimedOut` errors.

This change eliminates the flakiness and ensures the test passes
consistently across different CI environments.
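
A minimal sketch of the `Condvar` plus `stage` technique, under stated assumptions: the stage numbers, helper names, and logged events are invented for illustration and do not mirror the actual test. Each thread advances the shared stage when it finishes a phase and blocks until the stage it depends on has been reached, so ordering no longer depends on sleep durations.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Blocks until the shared stage reaches at least `target`.
fn wait_for_stage(pair: &(Mutex<u32>, Condvar), target: u32) {
    let (lock, cvar) = pair;
    let mut stage = lock.lock().unwrap();
    while *stage < target {
        stage = cvar.wait(stage).unwrap();
    }
}

/// Sets the shared stage to `target` and wakes all waiters.
fn advance_to(pair: &(Mutex<u32>, Condvar), target: u32) {
    let (lock, cvar) = pair;
    *lock.lock().unwrap() = target;
    cvar.notify_all();
}

/// Runs a handler thread and the calling thread in lock step and returns
/// the order in which their phases ran.
fn run_staged_handshake() -> Vec<&'static str> {
    let pair = Arc::new((Mutex::new(0u32), Condvar::new()));
    let log = Arc::new(Mutex::new(Vec::new()));

    let handler = {
        let (pair, log) = (Arc::clone(&pair), Arc::clone(&log));
        thread::spawn(move || {
            log.lock().unwrap().push("handler ready");
            advance_to(&pair, 1); // stage 1: handler is ready for data
            wait_for_stage(&pair, 2); // wait until the writer has written
            log.lock().unwrap().push("handler read");
            advance_to(&pair, 3); // stage 3: handler finished its phase
        })
    };

    wait_for_stage(&pair, 1);
    log.lock().unwrap().push("main wrote");
    advance_to(&pair, 2); // stage 2: data is available
    wait_for_stage(&pair, 3);
    handler.join().unwrap();

    let events = log.lock().unwrap().clone();
    events
}
```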

Fixes #12618

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
c8a39ad28d dragonball: Fix flaky test_epoll_manager by improving synchronization
This commit aims to address issues of "Infinite loop in epoll_manager
tests" and improve stability.

Root causes as below:
1. Using `handle_events(-1)` caused the worker thread to block forever
   if an event was missed or if the internal `kick()` signal was not
   accounted for correctly.
2. Relying on event counts was unreliable because internal signals could
   change the total count, causing the test to enter an infinite loop.
3. An EventFd registered with `EventSet::OUT` is often continuously ready,
   leading to non-deterministic trigger behavior.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:28:56 +08:00
Alex Lyn
a35dcf952e ci: Fix YAML parsing flakiness caused by mktemp random suffixes
In some CI runs, `mktemp` generates random characters that accidentally
form file extensions like `.cSV` or `.Xml`. This triggers downstream
parsing errors because the YAML content is misidentified as CSV/XML.
The issue looks like this:
```
'/tmp/bats-run-KodZEA/.../pod-guest-pull-in-trusted-storage.yaml.in.cSV':
...
```

This commit fixes the issue by:
1. Moving the `XXXXXX` placeholder before the `.yaml` extension.
2. Ensuring the generated file always ends in `.yaml`.

This prevents format misidentification while maintaining filename
uniqueness and security.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-03-06 09:21:29 +08:00
Fabiano Fidêncio
2fff33cfa4 Merge pull request #12628 from stevenhorsman/agent-ctl-bump-aws-lc-rs
agent-ctl: Update aws-lc-rs
2026-03-05 20:52:03 +01:00
Fabiano Fidêncio
83a8b257d1 Merge pull request #12265 from fidencio/topic/nvidia-bump-container-toolkit
nvidia: Bump nvidia-container-toolkit to 1.18.1
2026-03-05 15:25:15 +01:00
Fabiano Fidêncio
079fac1309 Merge pull request #12591 from fidencio/topic/kernel-add-mmio-back-to-the-unified-kernels
kernel: include mmio fragment in unified build for firecracker
2026-03-05 13:45:41 +01:00
Steve Horsman
5df7c4aa9c Merge pull request #12630 from zachspar/spar/kata-deploy-helm/configurable-pod-overhead
kata-deploy: add per-shim configurable pod overhead
2026-03-05 12:42:53 +00:00
Fabiano Fidêncio
e9894c0bd8 nvidia: Bump nvidia-container-toolkit to 1.18.1
Let's update the nvidia-container-toolkit to 1.18.1 (from 1.17.6).

We're, from now on, relying on the version set in the versions.yaml
file.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-05 11:53:09 +01:00
stevenhorsman
c57f2be18e agent-ctl: Update aws-lc-rs
aws-lc has multiple high severity CVEs:
- GHSA-vw5v-4f2q-w9xf
- GHSA-65p9-r9h6-22vj
- GHSA-hfpc-8r3f-gw53

so try and bump to the latest `aws-lc-rs` crate to pull in the available fixed versions

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-05 10:02:22 +00:00
Zachary Spar
bda9f6491f kata-deploy: add per-shim configurable pod overhead
Allow users to override the default RuntimeClass pod overhead for
any shim via shims.<name>.runtimeClass.overhead.{memory,cpu}.

When the field is absent, the existing hardcoded defaults from the
dict are used, so this is fully backward compatible.

Signed-off-by: Zachary Spar <zspar@coreweave.com>
2026-03-05 08:00:01 +01:00
Fabiano Fidêncio
8f35c31b30 Merge pull request #12542 from fidencio/topic/genpolicy-distribute-different-settings-rather-than-patching-for-ci
genpolicy: settings.d drop-ins and scenario example drop-ins
2026-03-05 07:37:30 +01:00
Fabiano Fidêncio
b5e0a5b7d6 Merge pull request #12555 from fidencio/topic/tests-use-local-pv-pvc-for-policy-tests
k8s-policy-pvc: use local PV/PVC when no default StorageClass exists
2026-03-05 07:37:11 +01:00
Dan Mihai
cb97ebd067 Merge pull request #12615 from microsoft/danmihai1/subPathExpr
tests: k8s: basic test for subPathExpr
2026-03-04 13:10:57 -08:00
Fabiano Fidêncio
a0b9d965e5 k8s-policy-pvc: use local PV/PVC when no default StorageClass exists
Create local block storage (loop device, StorageClass, PV) in the test
only when the cluster has no default StorageClass, matching the approach
used in k8s-volume.bats. Set our StorageClass as default so the PVC
binds to our PV; tear it down after the test.

When a default already exists (e.g. AKS), skip creation and cleanup so
we do not change the cluster's default storage class.

Fixes: #9846

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 21:50:51 +01:00
Fabiano Fidêncio
83dd7dcc75 runtimes: reject virtio-blk-mmio when confidential_guest is true
Virtio-mmio transport is not hardened for confidential computing (unlike
virtio-pci). Reject config that would use virtio-blk-mmio for rootfs/block
when confidential_guest is set, so CoCo guests only use virtio-blk-pci.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 21:41:27 +01:00
Fabiano Fidêncio
cb0d02e40b kernel: include mmio fragment in unified build for firecracker
Remove # !confidential from mmio.conf so CONFIG_VIRTIO_MMIO and
CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES are included when building the
unified x86_64/s390x kernel with -x

Firecracker requires virtio-mmio for block devices; without it the
guest kernel panics (no /dev/vda).

Fixes: #12581
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 21:18:35 +01:00
Fabiano Fidêncio
d40afe592c genpolicy: add settings drop-in directory and RFC 6902 JSON Patch support
Allow genpolicy -j to accept a directory instead of a single file.
When given a directory, genpolicy loads genpolicy-settings.json from it
and applies all genpolicy-settings.d/*.json files (sorted by name) as
RFC 6902 JSON Patches. This gives precise control over settings with
explicit operations (add, remove, replace, move, copy, test), including
array index manipulation and assertions.
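
The discovery order described above can be sketched as follows. This covers only how drop-ins might be collected and sorted, not how the patches are applied, and `list_drop_ins` is a hypothetical helper, not genpolicy's actual function. Plain lexicographic path ordering is assumed for "sorted by name".

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the `*.json` files under `<settings_dir>/genpolicy-settings.d`,
/// sorted by name, in the order they would be applied.
fn list_drop_ins(settings_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let drop_in_dir = settings_dir.join("genpolicy-settings.d");
    let mut patches = Vec::new();
    for entry in fs::read_dir(&drop_in_dir)? {
        let path = entry?.path();
        let is_json = path.extension().and_then(|ext| ext.to_str()) == Some("json");
        if path.is_file() && is_json {
            patches.push(path);
        }
    }
    patches.sort();
    Ok(patches)
}
```

Numeric prefixes such as `10-` and `20-` make this ordering explicit: a `20-*` overlay is always applied after the `10-*` base it adjusts.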

Ship composable drop-in examples in drop-in-examples/:
- 10-* files set platform base settings (non-CoCo, AKS, CBL-Mariner)
- 20-* files overlay specific adjustments (OCI version, guest pull)
Users copy the combination they need into genpolicy-settings.d/.

Replace the old adapt_common_policy_settings_* jq-patching functions
in tests_common.sh with install_genpolicy_drop_ins(), which copies the
right combination of 10-* and 20-* drop-ins for the CI scenario.
Tests still generate 99-test-overrides.json on the fly for per-test
request/exec overrides.

Packaging installs 10-* and 20-* drop-ins from drop-in-examples/ into
the tarball; the default genpolicy-settings.d/ is left empty.

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 20:13:21 +01:00
Dan Mihai
e40d962b13 genpolicy: improve allow_mount logging
Add a simple -------- separator line to the beginning of the
allow_mount log output, to help log readers more easily separate the ~30
lines of text generated while verifying each mount.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-03-04 16:28:29 +00:00
Dan Mihai
3f845af9d4 tests: k8s: basic test for subPathExpr
Add basic genpolicy test coverage for subPathExpr and corresponding
container mounts.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-03-04 16:28:29 +00:00
Steve Horsman
a4a4683ec7 Merge pull request #12626 from kata-containers/topic/kata-deploy-k3s-rke2-use-imports
kata-deploy: a bunch of fixes regarding uninstall, rke2 and k3s tests
2026-03-04 14:01:09 +00:00
Steve Horsman
2687ad75c1 Merge pull request #12617 from BbolroC/skip-cgroup-device-check-for-remote
runtime: Skip to call sandboxDevices() for remote hypervisor
2026-03-04 14:00:23 +00:00
Steve Horsman
8e11bb2526 Merge pull request #12611 from mythi/coco-kernel-v6.18.15
versions: bump to Linux v6.18.15 (LTS)
2026-03-04 14:00:00 +00:00
Steve Horsman
94f850979f Merge pull request #12613 from stevenhorsman/tooling-bump-x/net-to-v0.51.0
Tooling bump x/net to v0.51.0
2026-03-04 13:44:22 +00:00
stevenhorsman
8640f27516 ci: Remove SNP tests from required
The SNP tests have been unstable on nightlies, and even when the machine
seems to have been manually cleaned up, PR tests are consistently failing,
so we should remove them from the required list until they are reliable.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-04 14:41:09 +01:00
Fabiano Fidêncio
56c3618c1d tests: kata-deploy: wait for API recovery after uninstall
kata-deploy's SIGTERM cleanup restarts the CRI runtime, which on
k3s/rke2 takes down the API server temporarily. The helm uninstall
may complete with errors, and the next test suite would start with
a dead API. Add a wait loop after uninstall to ensure the API is
available before proceeding.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 11:26:31 +01:00
Fabiano Fidêncio
966d710df5 tests: increase kata-deploy wait timeout to 15 minutes
kata-deploy restarts the CRI runtime during install, which can cause
the kata-deploy pod to be killed and recreated by the DaemonSet
controller. On k3s and rke2 in particular, the restart can take
several minutes. Increase the default timeout from 600s (10m) to
900s (15m) to accommodate this.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 11:26:31 +01:00
Fabiano Fidêncio
ebe75cc3e3 kata-deploy: make verification job resilient to CRI runtime restarts
kata-deploy restarts the CRI runtime (k3s/containerd) during install,
which can kill the verification job pod or cause transient API server
errors. Bump backoffLimit from 0 to 3 so the job can retry after being
killed, and add a retry loop around kubectl rollout status to handle
transient connection failures.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 11:26:31 +01:00
Fabiano Fidêncio
7a08ef2f8d kata-deploy: run cleanup on SIGTERM instead of preStop hook
Move the cleanup logic from a preStop lifecycle hook (separate exec)
into the main process's SIGTERM handler. This simplifies the
architecture: the install process now handles its own teardown when
the pod is terminated.

The SIGTERM handler is registered before install begins, and
tokio::select! races install against SIGTERM so cleanup always runs
even if SIGTERM arrives mid-install (e.g. helm uninstall while the
container is restarting after a failed install attempt).

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-04 11:26:31 +01:00