kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-17 10:12:24 +00:00

Author	SHA1	Message	Date
Steve Horsman	7d2e18575c	Merge pull request #12343 from zvonkok/release-model doc: Release model update	2026-03-12 14:44:51 +00:00
Zvonko Kaiser	7f662662cf	lint: Fix 80 char column size Make markdownlint happy Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-03-12 12:03:29 +00:00
Zvonko Kaiser	6e03a95730	doc: Update Release Process Add how Kata is doing the rolling release. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2026-03-12 12:03:29 +00:00
Steve Horsman	a29eb3751a	Merge pull request #12517 from kata-containers/osv-scanner-bump-2.3.3 workflows: Bump OSV scanner	2026-03-12 08:48:52 +00:00
stevenhorsman	064a960aaa	workflows: Bump OSV scanner Bump to the latest version to pick up bug fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-12 07:00:11 +00:00
Steve Horsman	f41edcb4c0	Merge pull request #12653 from kata-containers/dependabot/cargo/src/tools/agent-ctl/quinn-proto-0.11.14 build(deps): bump quinn-proto from 0.11.8 to 0.11.14 in /src/tools/agent-ctl	2026-03-12 06:53:59 +00:00
Manuel Huber	8162d15b46	nvidia: fix invalid CTK reference Use proper reference from versions yaml structure. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2026-03-11 12:49:29 -07:00
dependabot[bot]	d366d103cc	build(deps): bump quinn-proto in /src/tools/agent-ctl Bumps [quinn-proto](https://github.com/quinn-rs/quinn) from 0.11.8 to 0.11.14. - [Release notes](https://github.com/quinn-rs/quinn/releases) - [Commits](https://github.com/quinn-rs/quinn/compare/quinn-proto-0.11.8...quinn-proto-0.11.14) --- updated-dependencies: - dependency-name: quinn-proto dependency-version: 0.11.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-11 16:04:34 +00:00
Dan Mihai	04f180434e	Merge pull request #12640 from burgerdev/genpolicy-workspace genpolicy: add to Cargo workspace	2026-03-11 09:02:39 -07:00
Steve Horsman	ba0f5b98fe	Merge pull request #12643 from stevenhorsman/bump-golang-to-1.25.8 versions: bump golang to 1.25.8	2026-03-11 08:53:21 +00:00
Markus Rudy	cf7d4c33b3	kata-deploy: fix binary location for genpolicy Moving the genpolicy crate into the root workspace causes the build outputs to go into the root workspace's target directory, instead of src/tools/genpolicy/target, invalidating assumptions made by the kata-deploy-binaries script. This commit adds a special case for the lookup path of the genpolicy binary, and fixes two bugs that made identifying this problem harder. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:48 +01:00
Markus Rudy	221a22bd7d	genpolicy: ignore RUSTSEC-2024-0320 The yaml-rust dependency is unmaintained, but no suitable alternatives exist. We log an exception for this now and will revisit the topic after some time. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:48 +01:00
Markus Rudy	6643b258bb	genpolicy: update oci-client to v0.16.1 The older version we used transitively depends on an unmaintained crate. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:48 +01:00
Markus Rudy	8dfeeea924	genpolicy: add to Cargo workspace This commit adds the genpolicy utility to the root workspace. For now, only dependencies that are already in the root workspace are consumed from there, the genpolicy-specific ones should be added later. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:30:46 +01:00
Markus Rudy	fc4eaf8b66	runtime-rs: specify the subpackage to build Before this change, `make test` for runtime-rs used to test all crates in the root workspace (due to the `--all` flag). This was not intended but happened to be mostly working. However, genpolicy needs additional steps before it can build, so this behavior blocks adding genpolicy to the root workspace. The solution here is to only build the intended packages. For the build and run commands, this is the runtime-rs crate itself. For testing, we need to include the sub-crates, too, which needs a bit of cargo metadata scraping. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2026-03-11 09:28:24 +01:00
Aurélien Bombo	b6c60d9229	Merge pull request #10559 from sprt/conf-local-storage coco: Implement trusted ephemeral data storage	2026-03-10 10:39:40 -05:00
Dan Mihai	f9a8eb6ecc	genpolicy: allow_mount improvements for emptyDir 1. Reduce the complexity of the new allow_mount rules for emptyDir. 2. Reverse the order of the two allow_mount versions, as a hint to the rego engine that the first version is more often matching the input. 3. Remove `p_mount.source != ""` from mount_source_allows, because: - Policy rules typically test the values from input, not values read from Policy. - mount_source_allows is no longer called for emptyDir mounts after these changes, so p_mount.source is not empty. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2026-03-09 14:52:17 -05:00
Fabiano Fidêncio	374b0abe29	tests: Fix kubelet data dir for k0s in trusted ephemeral storage test k0s uses /var/lib/k0s/kubelet instead of /var/lib/kubelet as its kubelet data directory. Introduce get_kubelet_data_dir() in tests_common.sh and use it in k8s-trusted-ephemeral-data-storage.bats instead of hardcoding /var/lib/kubelet. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	718632bfe0	build: Add artifacts to .gitignore This adds various files that are generated during development. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	68bdbef676	tests: Improve logging for some tests Use modern test semantics to ease debugging. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	3dd77bf576	tests: Introduce new env variables to ease development It can be useful to set these variables during local testing: * AZ_REGION: Region for the cluster. * AZ_NODEPOOL_TAGS: Node pool tags for the cluster. * GENPOLICY_BINARY: Path to the genpolicy binary. * GENPOLICY_SETTINGS_DIR: Directory holding the genpolicy settings. I've also made it so that tests_common.sh modifies the duplicated genpolicy-settings.json (used for testing) instead of the original git-tracked one. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	aae54f704c	ci: Stop deploying the CSI driver The design moved away from CSI driver so stop deploying that. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	a98e328359	tests: Add test for trusted ephemeral data storage This tests the feature on CoCo machines. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	9fe03fb170	genpolicy: Support trusted ephemeral data storage * Introduces a new cluster_config setting encrypted_emptydir defaulting to true. * Adapts genpolicy for encrypted emptyDirs. Crucially, the rules.rego change checks that the mount and the storage are well-formed together: * i_storage.source matches a known regex. * i_storage.mount_point == $(spath)/BASE64(i_storage.source) * i_storage.mount_point == p_storage.mount_point * i_storage.mount_point == i_mount.source Note that policy enforcement is necessary to prevent rogue device injection. E.g. the agent could not blindly encrypt all block devices as some use cases only need dm-verity. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	eaa711617e	agent: Support trusted ephemeral data storage Handles block-based emptyDirs plugged via virtio-blk and virtio-scsi by encrypting and formatting them. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Aurélien Bombo	a4fd32a29a	runtime: Support trusted ephemeral data storage * Introduces the `emptydir_mode` config flag to allow instructing the runtime to create a block device for emptyDir volumes. * The block device is created in the original emptyDir folder on the host so that Kubelet can monitor its disk usage and evict the pod if it exceeds its sizeLimit. This matches runc and virtio-fs. * The block device's disk image file is sparse to minimize host disk footprint. Fixes: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
Alex Lyn	fb743a304c	runtime: Support plugging a disk as an image file Some VMMs support plugging a disk as an image file instead of a block device, so we adapt the runtime to support that. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Aurélien Bombo <abombo@microsoft.com> Co-authored-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-09 14:52:17 -05:00
stevenhorsman	8ae0e36737	versions: bump golang to 1.25.8 Bump the builder image and versions to resolve CVEs: - GO-2026-4601 - GO-2026-4602 - GO-2026-4603 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-09 09:10:01 +00:00
Alex Lyn	22c4cab237	Merge pull request #12623 from Apokleos/fix-dgb-ut runtime-rs: Fix dragonball's flaky unit tests	2026-03-09 11:38:02 +08:00
Alex Lyn	62b0f63e37	dragonball: Generate unique TAP names to avoid conflicts The vhost-kern net unit test used a fixed TAP interface name ("test_vhosttap"). When tests run in parallel or a previous run leaves the interface behind, TAP creation can fail with EBUSY ("Resource busy"), making CI flaky. Introduce a unique_tap_name() helper in the tests and use it to generate a per-test TAP name (based on pid/thread/counter), avoiding name collisions and stabilizing CI. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 17:33:40 +08:00
Alex Lyn	b2932f963a	Merge pull request #12631 from Apokleos/fix-suffix ci: keep mktemp output suffix stable with .yaml	2026-03-06 14:15:49 +08:00
Alex Lyn	1c8c0089da	dragonball: fix flaky signal_handler test using libc::raise The signal_handler test was intermittently failing because it used kill(pid, sig), which sends signals asynchronously to the process. This created a race condition where the child thread could exit and be joined before the signal was delivered or processed. This fix: 1. Replaces `kill` with `libc::raise` to ensure signals are delivered synchronously to the calling thread. 2. Reorders triggers to verify standard signals before installing seccomp filters. 3. Guarantees that metrics are incremented before the child thread terminates and is joined by the main thread. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	d0718f6001	dragonball: Fix unnecessary parentheses around type warning: unnecessary parentheses around type --> src/dragonball/dbs_legacy_devices/src/serial.rs:245:39 | 245 | let out: Arc<Mutex<Option<Box<(dyn std::io::Write + Send + 'static)>>>> = | ^ ^ | = note: `#[warn(unused_parens)]` (part of `#[warn(unused)]`) on by default help: remove these parentheses Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	b4161198ee	dragonball: Remove unused imports variables in dbs_pci Fix warnings of unused imports as below: ``` warning: unused imports: `DEVICE_ACKNOWLEDGE`, `DEVICE_DRIVER_OK`, `DEVICE_DRIVER`, `DEVICE_FEATURES_OK`, and `DEVICE_INIT` --> src/dragonball/dbs_pci/src/virtio_pci.rs:1177:9 | 1177 | DEVICE_ACKNOWLEDGE, DEVICE_DRIVER, DEVICE_DRIVER_OK, DEVICE_FEATURES_OK, DEVICE_INIT, | ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^ | = note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default ``` Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	ca4e14086f	runtime-rs: Fix warnings of unformatted codes Fix warnings from unformatted code. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	ce800b7c37	dragonball: Fix flaky test_vhost_user_net_virtio_device_activate hang The vhost-user-net tests could hang in CI because VhostUserNet::new_server() blocks indefinitely on listener.accept() when the slave fails to connect in time (e.g. due to scheduler delays or flaky socket paths). This also caused panics when connect_slave() returned None and the test unwrapped it. Fix the tests by: - using a `/tmp`, absolute, unique unix socket path per test run - retrying slave connect with a deadline - running new_server() in a separate thread and waiting via recv_timeout() to ensure the test never blocks indefinitely Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	a988b10440	dragonball: Fix flaky test_vhost_user_net_virtio_device_normal hang It aims to fix flaky test hang by implementing thread timeouts. The `test_vhost_user_net_virtio_device_normal` was hanging in CI when master/slave threads drifted. This commit stabilizes the test by: - Using `tempfile` and unique paths to ensure socket isolation. - Adding a 5s deadline for slave connections to handle CI jitter. - Running `new_server` in a separate thread with a `recv_timeout` to prevent the CI pipeline from deadlocking. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	f36218d566	dragonball: Fix flaky test_inner_stream_timeout in inner backend The `test_inner_stream_timeout` test case was prone to failure due to a race condition between the main thread and the background handler. The test relied on hardcoded `thread::sleep` durations, which could cause the second read operation to time out (150ms window) before the main thread performed its write (after a 300ms sleep) under high system load. This commit stabilizes the test by: 1. Replacing fixed sleep durations with a `Condvar` and a `stage` variable to implement a deterministic state machine. 2. Synchronizing the threads so that the main thread only writes data after the background handler has confirmed it is ready or has completed its previous phase. 3. Ensuring the read timeout is explicitly managed between different validation stages to prevent accidental `TimedOut` errors. This change eliminates the flakiness and ensures the test passes consistently across different CI environments. Fixes #12618 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	c8a39ad28d	dragonball: Fix flaky test_epoll_manager by improving synchronization This commit aims to address issues of "Infinite loop in epoll_manager tests" and improve stability. Root causes as below: 1. Using `handle_events(-1)` caused the worker thread to block forever if an event was missed or if the internal `kick()` signal was not accounted for correctly. 2. Relying on event counts was unreliable because internal signals could fluctuate the total count, causing the test to enter an infinite loop. 3. Using `EventSet::OUT` on an EventFd is often continuously ready, leading to non-deterministic trigger behavior. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:28:56 +08:00
Alex Lyn	a35dcf952e	ci: Fix YAML parsing flakiness caused by mktemp random suffixes In some CI runs, `mktemp` generates random characters that accidentally form file extensions like `.cSV` or `.Xml`. This triggers downstream parsing errors because the YAML content is misidentified as CSV/XML. The failures look like this: ``` '/tmp/bats-run-KodZEA/.../pod-guest-pull-in-trusted-storage.yaml.in.cSV': ... ``` This commit fixes the issue by: 1. Moving the `XXXXXX` placeholder before the `.yaml` extension. 2. Ensuring the generated file always ends in `.yaml`. This prevents format misidentification while maintaining filename uniqueness and security. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-06 09:21:29 +08:00
Fabiano Fidêncio	2fff33cfa4	Merge pull request #12628 from stevenhorsman/agent-ctl-bump-aws-lc-rs agent-ctl: Update aws-lc-rs	2026-03-05 20:52:03 +01:00
Fabiano Fidêncio	83a8b257d1	Merge pull request #12265 from fidencio/topic/nvidia-bump-container-toolkit nvidia: Bump nvidia-container-toolkit to 1.18.1	2026-03-05 15:25:15 +01:00
Fabiano Fidêncio	079fac1309	Merge pull request #12591 from fidencio/topic/kernel-add-mmio-back-to-the-unified-kernels kernel: include mmio fragment in unified build for firecracker	2026-03-05 13:45:41 +01:00
Steve Horsman	5df7c4aa9c	Merge pull request #12630 from zachspar/spar/kata-deploy-helm/configurable-pod-overhead kata-deploy: add per-shim configurable pod overhead	2026-03-05 12:42:53 +00:00
Fabiano Fidêncio	e9894c0bd8	nvidia: Bump nvidia-container-toolkit to 1.18.1 Let's update the nvidia-container-toolkit to 1.18.1 (from 1.17.6). We're, from now on, relying on the version set in the versions.yaml file. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-05 11:53:09 +01:00
stevenhorsman	c57f2be18e	agent-ctl: Update aws-lc-rs aws-lc has multiple high severity CVEs: - GHSA-vw5v-4f2q-w9xf - GHSA-65p9-r9h6-22vj - GHSA-hfpc-8r3f-gw53 so try and bump to the latest `aws-lc-rs` crate to pull in the available fixed versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-03-05 10:02:22 +00:00
Zachary Spar	bda9f6491f	kata-deploy: add per-shim configurable pod overhead Allow users to override the default RuntimeClass pod overhead for any shim via shims.<name>.runtimeClass.overhead.{memory,cpu}. When the field is absent the existing hardcoded defaults from the dict are used, so this is fully backward compatible. Signed-off-by: Zachary Spar <zspar@coreweave.com>	2026-03-05 08:00:01 +01:00
Fabiano Fidêncio	8f35c31b30	Merge pull request #12542 from fidencio/topic/genpolicy-distribute-different-settings-rather-than-patching-for-ci genpolicy: settings.d drop-ins and scenario example drop-ins	2026-03-05 07:37:30 +01:00
Fabiano Fidêncio	b5e0a5b7d6	Merge pull request #12555 from fidencio/topic/tests-use-local-pv-pvc-for-policy-tests k8s-policy-pvc: use local PV/PVC when no default StorageClass exists	2026-03-05 07:37:11 +01:00
Dan Mihai	cb97ebd067	Merge pull request #12615 from microsoft/danmihai1/subPathExpr tests: k8s: basic test for subPathExpr	2026-03-04 13:10:57 -08:00
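
Commit 62b0f63e37 above describes a unique_tap_name() test helper that derives a per-test TAP interface name from pid/thread/counter. The following is only a rough sketch of that idea, not the code from the dragonball tree: the "tap" prefix, the field widths, and the TAP_COUNTER name are assumptions made for illustration. It relies on a process-wide atomic counter in place of a thread id, which already keeps names distinct across threads of one process, and it keeps the result within the 15 characters Linux allows for an interface name.

```rust
use std::process;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Longest interface name Linux accepts (IFNAMSIZ is 16, including the NUL).
pub const MAX_IFNAME_LEN: usize = 15;

/// Process-wide counter (hypothetical name); each call gets a fresh value.
static TAP_COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Build a TAP name such as "tap00a1b200003" from the pid and a counter.
/// Fixed-width hex fields keep the name at 14 characters and prevent two
/// different (pid, counter) pairs from formatting to the same string.
pub fn unique_tap_name() -> String {
    // Keep 24 bits of the pid and 20 bits of the counter so the
    // formatted fields never exceed their fixed widths.
    let pid = process::id() & 0x00ff_ffff;
    let n = TAP_COUNTER.fetch_add(1, Ordering::Relaxed) & 0x000f_ffff;
    format!("tap{:06x}{:05x}", pid, n)
}
```

A test would call unique_tap_name() once per TAP device it creates instead of passing a fixed string such as "test_vhosttap"; the counter wraps after 2^20 names, which is assumed to be far more than one test process needs.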


18151 Commits
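
Commits ce800b7c37 and a988b10440 in the log above both stabilize tests by running a blocking call in a separate thread and waiting on it with recv_timeout(). A minimal sketch of that pattern using only the Rust standard library might look like the following; run_with_timeout is a hypothetical helper name, not a function from the Kata Containers tree, and the closure stands in for a call such as new_server().

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `f` on a worker thread and wait at most `timeout` for its result.
/// Returns None if the deadline passes first (or if `f` panics), so the
/// caller never blocks indefinitely on a call that may hang.
pub fn run_with_timeout<T, F>(f: F, timeout: Duration) -> Option<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore send errors: the receiver may already have given up.
        let _ = tx.send(f());
    });
    // Err covers both a timeout and a disconnected (panicked) worker.
    rx.recv_timeout(timeout).ok()
}
```

Note that on timeout the worker thread is left running in the background; this sketch only guarantees that the waiting side returns, which is the property the commit messages describe for the tests.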