kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-07-02 07:02:16 +00:00

Author	SHA1	Message	Date
Steve Horsman	20bcff185f	Merge pull request #13254 from kata-containers/dependabot/go_modules/src/runtime/go.mongodb.org/mongo-driver-1.17.7 build(deps): bump go.mongodb.org/mongo-driver from 1.14.0 to 1.17.7 in /src/runtime	2026-06-22 11:17:29 +01:00
Fabiano Fidêncio	f9682356ce	Merge pull request #13216 from Apokleos/hotunplug-blk runtime-rs: Add support for hot-unplugging block devices	2026-06-22 12:14:30 +02:00
Fabiano Fidêncio	337b600268	Merge pull request #13256 from fidencio/release/3.32.0 release: Bump version to 3.32.0 3.32.0	2026-06-22 10:33:25 +02:00
Alex Lyn	9550a323ac	Merge pull request #13245 from kata-containers/unify-nix-version Unify nix version	2026-06-22 15:25:10 +08:00
Alex Lyn	7aaa4e63d1	Merge pull request #13241 from PiotrProkop/exit-code agent: report 128+signal as exit code for signal-terminated processes	2026-06-22 09:13:24 +08:00
Fabiano Fidêncio	dc70b93573	release: Bump version to 3.32.0 Bump VERSION and helm-charts versions. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-06-22 01:15:24 +02:00
PiotrProkop	c2d737c9d7	agent: report 128+signal as exit code for signal-terminated processes When a container process is terminated by a signal, the agent's SIGCHLD reaper stored the raw signal number as the process exit code. As a result a process killed by SIGKILL(9) reported exit code 9 instead of the conventional 137 (128+9). Apply the standard shell convention of 128+signal_number so that signal-terminated processes report the expected exit codes, e.g. SIGKILL(9) -> 137, SIGTERM(15) -> 143, SIGINT(2) -> 130. This mimics runc, which encodes wait-status exit codes the same way: https://github.com/opencontainers/runc/blob/v1.4.3/libcontainer/utils/utils.go#L19 Both runc and this new Kata behaviour follow the conventional exit code semantics documented at https://tldp.org/LDP/abs/html/exitcodes.html. The conversion is factored into a small helper and covered by a unit test. The runtime and shim already pass the exit code through unchanged, so no further changes are needed for the corrected value to surface. Fixes: signal-terminated containers reporting raw signal numbers Signed-off-by: PiotrProkop <pprokop@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-21 16:34:17 +02:00
Fabiano Fidêncio	374a867774	Merge pull request #13196 from microsoft/cameronbaird/upstream/runtime-go-clh-templating runtime: Enable VM Templating Support for CLH	2026-06-21 16:31:19 +02:00
Alex Lyn	0a63aebea9	runtime-rs: Implement remove_device for block device hot removal Replace the "Not yet implemented" stub in QemuInner::remove_device() with a working implementation that calls hotunplug_device() to perform the QMP-level device removal, then cleans up the internal devices list via retain() to remove stale coldplug entries. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-20 22:08:57 +08:00
Alex Lyn	d4212bcb74	runtime-rs: Add hotunplug_device dispatcher for device type routing Introduce hotunplug_device() as the device-type dispatcher that routes hot removal requests to the appropriate QMP method. Currently supports Block and BlockModern device types, which are forwarded to Qmp::hotunplug_block_device(). All other device types return an explicit "unsupported" error. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-20 22:08:57 +08:00
Alex Lyn	281b6aa61a	runtime-rs: Add hotunplug_block_device for block device hot removal Implement QMP-level block device hot-unplug by issuing device_del to remove the frontend device and blockdev_del to remove the backend blockdev node. For virtio-blk-ccw on s390x, the CCW subchannel slot is also released. Since QMP device_del is asynchronous and only initiates the removal request, introduce wait_for_device_deleted() to poll for the DEVICE_DELETED event before tearing down the backend. This prevents blockdev_del from failing with "Node is still in use". If blockdev_del fails, the error is logged but CCW cleanup still proceeds before the error is propagated, ensuring consistent subchannel state. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-20 22:08:57 +08:00
Alex Lyn	431720025c	runtime-rs: Enhance hotplug_block_device error handling and rollback Improve the reliability of block device hotplug by ensuring that blockdev-add nodes are properly cleaned up when subsequent device_add operations fail. To address this, a new method, device_add_with_rollback, is introduced; it performs device_add and cleans up properly when that fails. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-20 22:08:57 +08:00
dependabot[bot]	399c863cd2	build(deps): bump go.mongodb.org/mongo-driver in /src/runtime Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.14.0 to 1.17.7. - [Release notes](https://github.com/mongodb/mongo-go-driver/releases) - [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.14.0...v1.17.7) --- updated-dependencies: - dependency-name: go.mongodb.org/mongo-driver dependency-version: 1.17.7 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2026-06-20 10:22:56 +00:00
Cameron Baird	730307f32c	factory: Default to normal sandbox boot path when factory init not done The behavior we had before was that, for a starting k8s pod, it sees enable_template=true and therefore: 1. Tries NewFactory with fetchOnly=true 2. When that fails (because template.Fetch fails to find the artifacts), we retry with fetchOnly=false. This creates a direct factory which creates the template from scratch (hence we pay a full pod sandbox boot time here) and then restores from that. Hence the boot times are strictly worse on this path. Now, even when enable_template=true, we don't try to force a direct factory. Instead we just revert to the standard sandbox boot path. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2026-06-19 18:00:02 +00:00
Cameron Baird	65a5f272f8	ci: Introduce tests for VM template factory Add k8s-vm-templating-test.bats which exercises pod create with the factory initialized on the target node. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2026-06-19 18:00:02 +00:00
Cameron Baird	c0f9744225	runtime: Implement support for VM Template factory in clh Add support for VM Template factory on the clh path. In order to support snapshot/restore-based VM templating, the following changes were needed: 1. For clh.go, implement SaveVM, PauseVM, restoreVM, ResumeVM 2. Remove initrd config check for VM Templating path. The root disk image (when using image mode) is created in memory and therefore captured in the VM snapshot. 3. Truncate the memory file to the size of the VM at factory VM create time. This allows CLH to use the memory file as the backing for the template VM memory, allowing O(1) snapshot times. 4. CLH uses memory zones as backing for its memory on the template paths 5. Update StartVM in CLH to use the restore path when template is configured and available Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2026-06-19 18:00:02 +00:00
stevenhorsman	d09d1959c2	libs: Update mem-agent to use nix workspace version Now that the workspace version has been updated, switch the mem-agent to pick up the new workspace version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-19 03:49:16 -07:00
stevenhorsman	531877f28f	deps: Upgrade nix crate from 0.26.4 to 0.31.3 Upgrade the nix crate across the workspace to version 0.30.1 to address security vulnerabilities and adopt safer file descriptor handling patterns. ### Breaking Changes in nix 0.28.0 1. File Descriptor Type Changes - Functions now return `OwnedFd` instead of `RawFd` (i32) - Functions requiring file descriptors now expect types implementing `AsFd` trait - This provides RAII-based automatic cleanup and prevents fd leaks 2. API Signature Changes - `pipe()`, `pipe2()`, `openpty()` now return `OwnedFd` tuples - `socket()` returns `OwnedFd` instead of `RawFd` - `open()`, `memfd_create()` return `OwnedFd` - `setns()`, `write()`, `fcntl()` require `AsFd` trait - `madvise()` requires `NonNull<c_void>` instead of raw pointer - `bind()`, `listen()`, `connect()` require `AsFd` and `Backlog` type 3. Module Feature Flags - Modules now require explicit feature flags (mman, reboot, etc.) ### Additional Breaking Changes in nix 0.30.1 1. symlinkat() API Change - `dirfd` parameter now requires `AsFd` trait instead of `Option<RawFd>` - Use `BorrowedFd::borrow_raw(libc::AT_FDCWD)` for current directory 2. Type Alias Deprecation - `MemFdCreateFlag` renamed to `MFdFlags` for consistency ### Changes Made Workspace Configuration (Cargo.toml) - Updated nix to 0.30.1 with features: fs, mount, sched, process, ioctl, signal, socket, feature, user, hostname, term, event, mman, reboot File Descriptor Handling Patterns - Use `BorrowedFd::borrow_raw(raw_fd)` to wrap RawFd for AsFd requirements - Use `.as_fd().as_raw_fd()` to extract raw fd without ownership transfer - Use `.into_raw_fd()` only when ownership transfer is needed - Use `NonNull::new().unwrap()` for madvise pointer conversion Deprecated API Replacements - `eventfd()` → `EventFd::from_value_and_flags()` - `Errno::from_i32()` → `Errno::from_raw()` - `listen(fd, backlog)` → `listen(&fd, Backlog::new(backlog).unwrap())` - `MemFdCreateFlag` → `MFdFlags` Generated by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-19 03:49:16 -07:00
stevenhorsman	ac508b093d	runtime-rs: Use workspace nix version See if we can sync to use the workspace version for easier dependency management Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-19 03:49:16 -07:00
stevenhorsman	2b8b09469d	dragonball: Use workspace nix version See if we can sync to use the workspace version for easier dependency management Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-19 03:49:16 -07:00
stevenhorsman	b37b81bb75	lib: Use workspace nix version We have a note in the workspace Cargo.toml that the version there needs to be in sync with the libs versions, so just update them to use the workspace version rather than manually managing this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-19 03:49:16 -07:00
Steve Horsman	ec55a74969	Merge pull request #11485 from kata-containers/zvonkok-patch-1 Create SECURITY.md	2026-06-19 09:19:26 +01:00
Hyounggyu Choi	103b0b2cbc	Merge pull request #13078 from SantoshMadhukar-K/improved-test-coverage test: Improve test coverage for device handlers	2026-06-19 03:52:14 +02:00
SantoshMadhukar-K	736e07d18e	test: Improve test coverage for device handlers Add comprehensive test coverage for the device handler modules under src/agent/src/device, including matcher behavior, edge cases, and shared helper coverage across block, network, nvdimm, scsi, and vfio device paths. Assisted-by: IBM Bob Signed-off-by: SantoshMadhukar-K <SantoshMadhukar.Khandyana@ibm.com>	2026-06-18 07:18:36 -07:00
stevenhorsman	4bbbcb813e	doc: Create SECURITY.md Explicit SECURITY.md that reflects Kata’s rolling-release model (monthly cadence, no long-term branches) and sets clear expectations for reporters and downstream users. With the SECURITY.md in place we need also the SECURITY_CONTACTS - Add alternative reporting method (email) for non-GitHub users - Add section for downstream distributions and vendors with early notification details - Clarify that timelines are independent objectives, not sequential steps - Reorder disclosure process to emphasize patch releases are exceptions - Update git tag command in version table (remove unnecessary pipe) - Expand FAQ with downstream distribution and non-GitHub reporter questions - Update timestamp to reflect current changes (2026-04-01) - Update SECURITY_CONTACTS with email contact and downstream notification info - Clarify CVE assignment process through GitHub Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk> Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-18 14:23:52 +01:00
Steve Horsman	49ce886f20	Merge pull request #13242 from charludo/fix/runtime-rs-safe-path runtime-rs: change `safe-path` dependency from crates.io to workspace	2026-06-18 11:39:19 +01:00
Charlotte Hartmann Paludo	b4be5fdcca	runtime-rs: change `safe-path` dependency from crates.io to workspace `safe-path` is resolved from the local workspace in all other workspace member crates. This commit changes the dependency to a local one for runtime-rs as well. Signed-off-by: Charlotte Hartmann Paludo <git@charlotteharludo.com> Co-authored-by: Markus Rudy <mr@edgeless.systems>	2026-06-18 06:32:06 +02:00
Steve Horsman	66e938e02d	Merge pull request #13244 from BbolroC/use-ibm-actionspz-runners-for-publishing-jobs GHA: Use IBM ActionsPZ runners for publish jobs on s390x	2026-06-17 15:45:20 +01:00
Hyounggyu Choi	308eb34af6	GHA: Use IBM ActionsPZ runners for publish jobs on s390x Let's use the ActionsPZ runners for the following jobs: - publish-kata-deploy-image-s390x - publish-kata-monitor-image-s390x to improve CI experiences. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-06-17 15:34:39 +02:00
Alex Lyn	47ac08b419	Merge pull request #13239 from Apokleos/remove-9p runtime-rs: Remove unused msize_9p totally from configurations	2026-06-17 20:17:52 +08:00
Greg Kurz	f0f8233759	Merge pull request #13237 from gkurz/osbuilder-version osbuilder: Simplify version fetching	2026-06-17 13:56:13 +02:00
Greg Kurz	c3d98fe323	osbuilder: Simplify version fetching `tools/osbuilder/VERSION` points to the root `VERSION` file, just like the code does. Use that file. Signed-off-by: Greg Kurz <groug@kaod.org>	2026-06-17 10:08:23 +02:00
Alex Lyn	854eef0312	runtime-rs: Remove unused msize_9p totally from configurations As virtio-9p is already deprecated, its msize_9p option should be deprecated too. This commit removes the unused msize_9p. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-17 14:53:48 +08:00
Fabiano Fidêncio	0ddb2ee1f1	Merge pull request #13160 from LandonTClipp/kata_visible_devices feat(agent): translate KATA_VISIBLE_DEVICES into CDI GPU requests	2026-06-16 19:10:35 +02:00
Fabiano Fidêncio	3ca5742338	Merge pull request #13129 from pmores/fix-default_memory_annontation runtime-rs: fix default_memory annotation processing	2026-06-16 18:11:19 +02:00
Fabiano Fidêncio	3e98f925cf	Merge pull request #13142 from davidweisse/dav/genpolicy-pod-resources genpolicy: support pod-level resources	2026-06-16 15:31:50 +02:00
davidweisse	ac56ea21d8	genpolicy: support pod-level resources Add support for resource requests and limits in the PodSpec. Fixes #12816 Signed-off-by: davidweisse <98460960+davidweisse@users.noreply.github.com>	2026-06-16 15:30:22 +02:00
Fabiano Fidêncio	774e698aeb	Merge pull request #12293 from Apokleos/graceful-errors runtime-rs: make OOM watcher and signal handling lifecycle-aware	2026-06-16 15:02:54 +02:00
Fabiano Fidêncio	c76c82ce1c	Merge pull request #13229 from hgowda-amd/skip-qos-tests-snp-tdx-runtime-rs tests: skip Guaranteed QoS test for SNP/TDX runtime-rs	2026-06-16 14:02:51 +02:00
Fabiano Fidêncio	492d604daf	Merge pull request #13214 from fidencio/topic/block-volume-readonly-propagation runtime(-rs): Propagate host block device read-only flag to the VMM	2026-06-16 13:39:23 +02:00
Pavel Mores	9b31e06c20	runtime-rs: bump the byte-unit dependency version The unit tests added by the previous commit exposed a malfunction of the byte-unit crate on big-endian systems(*), causing s390x CI to fail. Bump the dependency's version to include a fix. Signed-off-by: Pavel Mores <pmores@redhat.com>	2026-06-16 13:15:23 +02:00
Pavel Mores	5ba5046e97	runtime-rs: fix default_memory annotation processing The annotation value is implicitly in MiB but when presented to the byte-unit crate this is interpreted as bytes. When a common value like 2048, meant to mean 2048 MiB but interpreted as 2048 B, is then converted to MiB the result is zero, which is less than the minimal allowable memory, and the runtime fails to launch. This is fixed by detecting whether the annotation value contains units. If it doesn't, it's first converted to MiB and the rest of the processing then goes like before. This way we allow for the implicit MiB units when no units are given, thus keeping compatibility with existing Go shim behaviour, while also allowing for any legal units to be given as well. We take the opportunity to add some unit tests as well. Signed-off-by: Pavel Mores <pmores@redhat.com>	2026-06-16 13:15:23 +02:00
LandonTClipp	4a9da5d37a	chore(docs): Add info on building and running custom artifacts I created this over the course of testing my VISIBLE_CDI_DEVICES changes. I think this will be useful to folks who don't understand the right way to deploy custom artifacts. Signed-off-by: LandonTClipp <lclipp@coreweave.com>	2026-06-16 11:44:09 +02:00
LandonTClipp	a1dd28cb52	feat(runtime): plumb VISIBLE_CDI_DEVICES through the Go runtime Add a `visible_cdi_devices` TOML option to the Go runtime so the agent.visible_cdi_devices=true kernel parameter is emitted to the guest when enabled. Wire the option through the NVIDIA GPU configuration templates and add tests verifying the kernel-params flow. Signed-off-by: LandonTClipp <lclipp@coreweave.com>	2026-06-16 11:44:09 +02:00
LandonTClipp	b49eb577b2	feat(runtime-rs): expose visible_cdi_devices in config Declare the `visible_cdi_devices` agent option (kernel param agent.visible_cdi_devices) in kata-types so runtime-rs can opt into emitting it to the guest, and expose it in the three NVIDIA GPU configuration templates (qemu, qemu-snp, qemu-tdx) at runtime-rs/config/. The agent consumes the corresponding VISIBLE_CDI_DEVICES env var to drive CDI device requests. Signed-off-by: LandonTClipp <lclipp@coreweave.com>	2026-06-16 11:44:09 +02:00
LandonTClipp	676fc90d0b	feat(agent): translate VISIBLE_CDI_DEVICES into CDI device requests Add an opt-in `visible_cdi_devices` agent option that lets a container select which of the VM's CDI-known devices it sees via a VISIBLE_CDI_DEVICES env var. The schema is `<cdi-kind>=<devices>` (e.g. "nvidia.com/gpu=all", or "kata.com/gpu=0,1"), with multiple kinds delimited by ':'. When enabled, the agent maps the value to CDI device requests and feeds them through the existing CDI injection path, so device nodes, mounts, env and createContainer hooks from the guest CDI spec (e.g. /var/run/cdi/nvidia.yaml, generated by NVRC/nvidia-ctk) are applied. The variable is intentionally distinct from NVIDIA_VISIBLE_DEVICES and does not promise identical semantics. If a requested kind is present in the guest CDI registry but the specific device index is not, the agent fails fast rather than waiting for the CDI-spec watch/timeout path. An entirely absent kind falls through to the existing wait/timeout behavior. Defaults to false; containers that don't set the env var are unaffected. Signed-off-by: LandonTClipp <lclipp@coreweave.com>	2026-06-16 11:44:09 +02:00
Alex Lyn	8fc1a16225	runtime-rs: Make signal_process idempotent for exited init processes Address the issue where signal_process returns an INTERNAL error when the container's init process has already exited, and ensure teardown is never aborted by signal failures. Introduce is_no_such_process_error() to detect "no such process" conditions (ESRCH/ENOENT codes or equivalent messages). When the init process is already gone, treat it as success with an info log instead of an error. In stop_process(), never propagate signal failures. During sandbox shutdown the agent connection is often already closed, causing AgentConnectionClosed errors that bypass is_no_such_process_error(). If stop_process() aborts on such errors, cleanup_container() is skipped and leftover mounts cause "Resource busy" failures in sandbox cleanup. Restore "always proceed to cleanup" semantics: log the failure as a warning, but never skip resource cleanup. Resource cleanup must be best-effort and idempotent regardless of kill outcome. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-06-16 15:12:28 +08:00
Alex Lyn	44dd2b1f34	runtime-rs: Refine OOM watcher error reporting for sandbox teardown This commit refines the error handling within the OOM watcher to distinguish between genuine failures and errors that occur as a natural consequence of sandbox shutdown, via the helper is_normal_shutdown_error. Previously, various connection-related errors during teardown were logged as warnings, contributing to noisy logs. The logic now differentiates between "normal shutdown" errors (e.g., Connection reset by peer, broken pipe) and actual OOM watcher failures. This makes OOM event logs more informative and less cluttered during normal sandbox termination. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-16 15:12:24 +08:00
Alex Lyn	3095bd379b	runtime-rs: Introduce cancellation for OOM watcher during teardown This commit introduces an explicit cancellation mechanism for the OOM watcher loop within VirtSandbox. This addresses the issue where the watcher continues to poll for OOM events even when the sandbox is being stopped, leading to spurious "Connection reset by peer" errors. Key changes: (1) A CancellationToken is added to VirtSandbox to signal the watcher loop when the sandbox is undergoing teardown. (2) The OOM watcher loop in VirtSandbox::start() is now wrapped in a tokio::select! statement. This allows it to concurrently listen for two events: - cancel_token.cancelled(): Triggered when the sandbox/VM is stopping. - agent.get_oom_event(): The regular OOM event polling. (3) In the sandbox stop/teardown path, cancel_token.cancel() is called before stopping the VM. This ensures the OOM watcher loop exits cleanly via the cancellation token, preventing the occurrence of ECONNRESET/EOF errors on a closed channel. This change improves the robustness of OOM event handling during sandbox lifecycle management. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-16 12:56:54 +08:00
Alex Lyn	0ffdc576d3	runtime-rs: Introduce a helper to check if process/container exists Returns `true` if the error indicates that the target process/container no longer exists. This is used to determine if an operation, like signaling a process, failed because the target is no longer available. The function checks for standard OS error codes (`ESRCH`, `ENOENT`) and common error message patterns. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-06-16 12:56:54 +08:00
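
The exit-code convention described in commit c2d737c9d7 (a signal-terminated process reports 128 plus the signal number) can be sketched in Rust as below. This is a minimal illustration under stated assumptions, not the agent's actual helper: the `Termination` enum, the `exit_code` function and the `SIGNAL_EXIT_BASE` constant are hypothetical names chosen for this sketch, and the real code in the agent's SIGCHLD reaper may be structured differently.

```rust
/// Offset that shells conventionally add to a signal number to form an
/// exit code (hypothetical constant name, used only in this sketch).
const SIGNAL_EXIT_BASE: i32 = 128;

/// How a child process ended, as a reaper might observe it.
/// Hypothetical type for illustration; not the agent's real data model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Termination {
    /// The process called exit() or returned from main with this status.
    Exited(i32),
    /// The process was killed by the signal with this number.
    Signaled(i32),
}

/// Map a termination record to the exit code reported to the caller.
/// Normal exits pass through unchanged; signal deaths become 128 + signal.
fn exit_code(termination: Termination) -> i32 {
    match termination {
        Termination::Exited(code) => code,
        Termination::Signaled(signo) => SIGNAL_EXIT_BASE + signo,
    }
}
```

With this mapping, `exit_code(Termination::Signaled(9))` yields 137 for SIGKILL and `exit_code(Termination::Signaled(15))` yields 143 for SIGTERM, matching the examples in the commit message, while `exit_code(Termination::Exited(3))` stays 3.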


19427 Commits