kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-02 18:13:57 +00:00

Author	SHA1	Message	Date
Steve Horsman	468abea97a	Merge pull request #12719 from kata-containers/sprt/env-no-deploy gha: Avoid noisy deployment logs in PRs	2026-03-31 17:12:07 +01:00
Aurélien Bombo	78289d19f7	gha: Pin actionlint version Pin to the latest released version as a security measure. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-31 10:51:17 -05:00
Aurélien Bombo	3122fa651e	gha: Avoid noisy deployment logs in PRs GitHub recently announced that developers can now use environments without auto-deployment, which allows us to avoid the noisy deployment logs in our PRs: https://github.blog/changelog/2026-03-19-github-actions-late-march-2026-updates/#github-actions-now-allows-developers-to-use-environments-without-auto-deployment Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-31 10:51:13 -05:00
RuoqingHe	a0e99a86cf	Merge pull request #12690 from Jiahao1226/put-agent-into-root-workspace build: Move agent to root workspace	2026-03-30 18:07:28 +08:00
Steve Horsman	012bf4b333	Merge pull request #12635 from Apokleos/update-docs-rs runtime-rs: Update docs for runtime-rs	2026-03-30 10:42:31 +01:00
Alex Lyn	7dce05b5fc	docs: Update the pictures of kata 4.0 with mermaid codes Mermaid code makes it simple and flexible to update pictures and diagrams. This commit also removes the legacy PNG pictures to reduce the kata-statics release file size. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	3c584a474f	docs: Update libs README with complete library documentation Add all 9 library crates which are missing in workspace including: (1) kata-types with annotations, hypervisor configs, and K8s utilities. (2) kata-sys-util with all sub-modules: cpu, device, fs, hooks, k8s, mount, netns, numa, pcilibs, protection, spec, validate. (3) protocols with ttrpc bindings: agent, health, remote, csi, oci, confidential_data_hub. (4) runtime-spec with OCI container state types and namespace constants. (5) shim-interface with RESTful API and Unix socket path. (6) logging with slog framework features: JSON, journal, filtering. (7) safe-path with security-focused path resolution utilities. (8) mem-agent with memory management: memcg, compact, psi. (9) test-utils with privilege and KVM test macros. And one more thing, uniformly adopt TOCTOU in place of the redundant TOCTTOU abbreviation. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	48ef2220e8	docs: Update runtime-rs README with accurate architecture documentation Add comprehensive hypervisor support table (Dragonball, QEMU, Cloud Hypervisor, Firecracker, Remote). Document all runtime handlers (VirtContainer, LinuxContainer, WasmContainer) and resource types. List all configuration files including CoCo variants (TDX, SNP, SE). Add shim-ctl crate to crates table for development tooling reference. Add Feature Flags section documenting dragonball and cloud-hypervisor options. Simplify and restructure content for clarity while preserving technical accuracy. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	c96b2034dc	docs: Update kata-types README with comprehensive module documentation Add detailed module documentation table describing all available modules including: - annotations - capabilities - config - container ... Document configuration module features including TOML-based loading, drop-in files, and hypervisor-specific configurations (QEMU, Cloud Hypervisor, Firecracker, Dragonball, Remote). Improve formatting with Markdown tables and structured sections for better readability. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	b8576ef476	docs: Update kata-sys-util README with comprehensive feature documentation Expand README.md to include detailed documentation for all modules: - File system operations (fs) - Mount operations (mount) - CPU utilities (cpu) - NUMA support (numa) - Device management (device) - Kubernetes support (k8s) - Network namespace (netns) - OCI specification utilities (spec) - Validation (validate) - Hooks (hooks) - Guest protection (protection) - Random generation (rand) - PCI device management (pcilibs) Add supported architectures list and improve overview section. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	a747b9f774	docs: Improve and refine hypervisor README documentation Enhance documentation in the hypervisor README.md file with: (1) Standardized terminology and formatting (VMM capitalization) (2) Improved paragraph transitions and logical flow (3) Fixed punctuation errors in code blocks Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	302b2c8d75	docs: Restructure and modernize virtualization design document Comprehensive rewrite of docs/design/virtualization.md to improve clarity, completeness, and usability. This document now serves as the authoritative guide for understanding and selecting hypervisors in Kata Containers deployments. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	7fa68ffd52	docs: Consolidate hypervisor documentation in virtualization.md Add 'Choose a Hypervisor', 'Hypervisor Configuration Files', and 'Hypervisor Versions' sections to virtualization.md. Key changes: - Integrate hypervisor comparison table from hypervisors.md - Add configuration file reference table for both go and rust runtimes - Add current hypervisor versions from versions.yaml: - Cloud Hypervisor: v51.1 - Firecracker: v1.12.1 - QEMU: v10.2.1 - StratoVirt: v2.3.0 - Dragonball: builtin (part of rust runtime) - Preserve original structure documenting each hypervisor's device model and features - Add reference links for all hypervisors This consolidates hypervisor selection guidance and version information into a single comprehensive virtualization design document. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	119a145923	docs: Upgrade architecture documentation from 3.0 to 4.0 Replace Kata 3.0 architecture docs with Kata 4.0 (Rust Runtime) documentation. Key changes: - Remove deprecated architecture 3.0 documentation - Add comprehensive Kata 4.0 architecture guide covering: - Unified single-binary architecture - Built-in Dragonball VMM integration - Async I/O model with Tokio - Layered architecture design - Modular resource manager - Extensible framework for multiple container types The new documentation reflects the production-ready Rust runtime with improved performance and reduced resource consumption. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	9f6bce9517	docs: Remove containerd settings from crio dedicated document As the document is just for CRI-O, we need to remove the containerd-related settings from it to make it clear for users. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	b04260f926	docs: Rename run-kata-with-k8s with adding crio The previous document name, run-kata-with-k8s.md, did not make it clear to newcomers how to run kata with k8s/crio. This commit just renames the document to make that clear. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	26d41b8f6e	docs: Remove the dedicated installation guide for runtime-rs When runtime-rs becomes default runtime, everything just for runtime-rs will be changed, and the dedicated installation for runtime-rs will be deprecated. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	004333ed71	docs: Update containerd-kata.md with clear settings In this commit: (1) Update containerd config with kata configurations (2) Add more comments to guide how to use containerd/kata with default setting and customized configure setting; (3) Update the usage of containerd cmd tool ctr with explicitly specified runtime-config-path options to make it work. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	8dae67794a	docs: switch to blockfile snapshotter for SEV-SNP in runtime-rs Updated the configuration guide to use `shared_fs = "none"`. This change reflects that `virtio-9p` is deprecated in `runtime-rs` and recommends the blockfile snapshotter as a stable alternative to the buggy `virtio-fs` in SEV-SNP QEMU versions. However, this is limited in the nerdctl or ctr tools. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	65b2a75aca	runtime-rs: Fix typo USE_BUILDIN_DB with USE_BUILTIN_DB Corrects the typo 'BUILDIN' to the standard 'BUILTIN' across the codebase to improve code quality and documentation consistency. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	75ecfe3fe2	docs: Fix volume type and fs type Correct the volume type with `volume-type` and fix the fs type with `fstype`. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Alex Lyn	a923bb2917	docs: Add document for how-to-use passthroughfd-IO within runtime-rs This document describes the Passthrough-FD (pass-fd) technology implemented in Kata Containers to optimize IO performance. By bypassing the intermediate proxy layers, this technology significantly reduces latency and CPU overhead for container IO streams. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2026-03-29 19:17:03 +02:00
Jiahao Wang	1163b6581f	agent: Change TARGET_PATH to root workspace After agent was moved to root workspace, the products are now under the repo root. Change the TARGET_PATH accordingly to tell Makefile where to lookup output. Signed-off-by: Jiahao Wang <jiahao.wang@lingcage.com>	2026-03-29 06:35:54 +00:00
Jiahao Wang	29e5d5d951	build: Move agent to root workspace This commit adds kata agent to the root workspace, as a follow up work of #12413. Remove agent from exclude list, and make it as a member of root workspace. Signed-off-by: Jiahao Wang <jiahao.wang@lingcage.com>	2026-03-29 06:35:38 +00:00
Fabiano Fidêncio	0cf3243801	Merge pull request #12683 from PiotrProkop/logical-physical-block-size runtime: allow specifying logical/physical sector size for block devices	2026-03-27 22:55:29 +01:00
PiotrProkop	64735222c6	runtime: allow specifying logical/physical sector size for block devices Add two new configuration knobs that control the logical and physical sector sizes advertised by virtio-blk devices to the guest: block_device_logical_sector_size (config file) block_device_physical_sector_size (config file) io.katacontainers.config.hypervisor.blk_logical_sector_size (annotation) io.katacontainers.config.hypervisor.blk_physical_sector_size (annotation) The annotation names are abbreviated relative to the config file keys because Kubernetes enforces a 63-character limit on annotation name segments, and the full names would exceed it. Both settings default to 0 (let QEMU decide). When set, they are passed as logical_block_size and physical_block_size in the QMP device_add command during block device hotplug. Setting logical_sector_size smaller than the container filesystem block size will cause EINVAL on mount. The physical_sector_size can always be set independently. Values must be 0 or a power of 2 in the range [512, 65536]; other values are rejected with an error at sandbox creation time. Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2026-03-27 18:56:54 +01:00
Aurélien Bombo	30e030e18e	Merge pull request #12679 from microsoft/user/romoh/gpu-fix clh: Add VFIO device cold-plug support	2026-03-27 11:12:51 -05:00
Hyounggyu Choi	8cebcf0113	Merge pull request #12742 from BbolroC/remove-skipped-emptydir-tests-for-ibm-sel tests: Remove skip condition for emptyDir-related tests on IBM SEL	2026-03-27 14:35:48 +01:00
Fabiano Fidêncio	237729d728	Merge pull request #12739 from fidencio/topic/kata-deploy-nydus-use-a-different-namespace kata-deploy: rename nydus-snapshotter to nydus-for-kata-tee	2026-03-27 14:32:58 +01:00
Fabiano Fidêncio	f0ad9f1709	tests: snp: policy: Adjust to containerd 2.3.0 As the AMD maintainers switched to the 2.3.0-beta.0 containerd (due to the nydus fixes that landed there). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Fabiano Fidêncio	1b8189731a	tests: hand nydus snapshotter setup over to kata-deploy Now that kata-deploy deploys and manages nydus-for-kata-tee on all platforms, the separate standalone nydus-snapshotter DaemonSet deployment is no longer needed. - Short-circuit deploy_nydus_snapshotter and cleanup_nydus_snapshotter to no-ops with an explanatory message. - Add qemu-snp to the workaround case so AMD SEV-SNP baremetal runners also get USE_EXPERIMENTAL_SETUP_SNAPSHOTTER=true and kata-deploy picks up the snapshotter setup on every run. - Drop the x86_64 arch guard and the hypervisor sub-case from the EXPERIMENTAL_SETUP_SNAPSHOTTER block, allowing any architecture and hypervisor to use the kata-deploy-managed path when the flag is set. Made-with: Cursor Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Fabiano Fidêncio	4fad88499c	kata-deploy: rename nydus-snapshotter to nydus-for-kata-tee Rename all host-visible names of the nydus-snapshotter instance managed by kata-deploy from the generic "nydus-snapshotter" to "nydus-for-kata-tee". This covers the systemd service name, the containerd proxy plugin key, the runtime class snapshotter field, the data directory (/var/lib/nydus-for-kata-tee), the socket path (/run/nydus-for-kata-tee/), and the host install subdirectory. The rename makes it immediately clear that this nydus-snapshotter instance is the one deployed and managed by kata-deploy specifically for Kata TEE use cases, rather than any general-purpose nydus-snapshotter that might be present on the host. Because the old code operated under a completely separate set of paths (nydus-snapshotter.*), any previously deployed installation continues to run without interference during the transition to this new naming. CI pipelines and operators can upgrade kata-deploy on their own schedule without having to coordinate an atomic cutover: the old service keeps serving its existing workloads until it is explicitly replaced, and the new deployment lands cleanly alongside it. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-27 11:14:54 +01:00
Fabiano Fidêncio	fb77c357f4	Merge pull request #12743 from BbolroC/enable-trusted-ephemeral-storage-ibm-sel runtime: Set emptydir_mode to DEFEMPTYDIRMODE_COCO for IBM SEL	2026-03-27 09:48:28 +01:00
Hyounggyu Choi	de3afd3076	tests: Remove skip condition for s390x in trusted ephemeral storage test Remove the skip condition for s390x in k8s-trusted-ephemeral-data-storage.bats. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-03-26 18:58:13 +01:00
Hyounggyu Choi	cd931d4905	runtime: Set emptydir_mode to DEFEMPTYDIRMODE_COCO for IBM SEL The enablement of the trusted ephemeral storage for IBM SEL was missed in #10559. Set the emptydir_mode properly for the TEE. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-03-26 15:55:30 +01:00
Hyounggyu Choi	911aee5ad7	tests: Remove skip condition for emptyDir-related tests on IBM SEL Fixes: #10002 Since #11537 resolves the issue, remove the skip conditions for the k8s e2e tests involving emptyDir volume mounts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-03-26 15:39:33 +01:00
Roaa Sakr	858620d2e7	clh: Add VFIO device cold-plug support Enable VFIO device pass-through at VM creation time on Cloud Hypervisor, in addition to the existing hot-plug path. Signed-off-by: Roaa Sakr <romoh@microsoft.com>	2026-03-25 16:39:25 -07:00
Steve Horsman	8c2b7ed619	Merge pull request #12729 from fidencio/topic/kata-deploy-nydus-dont-touch-data-dir-on-install kata-deploy: nydus: never remove the data dir	2026-03-25 10:28:50 +00:00
Steve Horsman	af7fdd5cd1	Merge pull request #12725 from kata-containers/sprt/cargo-check-fix build: Don't fail `cargo check` on a dirty tree	2026-03-25 10:21:16 +00:00
Steve Horsman	0d8186ae16	Merge pull request #12730 from fidencio/topic/bump-nydus-snapshotter versions: Bump nydus-snapshotter to v0.15.13	2026-03-25 10:20:23 +00:00
Steve Horsman	7e0f5e533a	Merge pull request #12733 from fidencio/topic/unrequire-nvidia-gpu-snp-tests-till-we-fix-auth-issues gatekeeper: Unrequire NVIDIA GPU SNP tests till auth is fixed	2026-03-25 10:11:10 +00:00
Fabiano Fidêncio	bcfb2354e0	gatekeeper: Unrequire NVIDIA GPU SNP tests till auth is fixed SSIA, the NIM tests are breaking due to authentication issues, and those issues are blocking other PRs. Let's unrequire the test for now, and mark it as required again once we fixed the auth issues. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-25 10:23:53 +01:00
Fabiano Fidêncio	caf6b244e6	versions: Bump nydus-snapshotter to v0.15.13 As this brings in a fix for using images with too many layers. https://github.com/containerd/nydus-snapshotter/releases/tag/v0.15.13 Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-25 08:31:48 +01:00
Fabiano Fidêncio	fb5482f647	kata-deploy: nydus: never remove the data directory Removing /var/lib/nydus-snapshotter during install or uninstall creates a split-brain state: the nydus backend starts empty while containerd's BoltDB (meta.db) still holds snapshot records from the previous run. Any subsequent image pull then fails with: "unable to prepare extraction snapshot: target snapshot \"sha256:...\": already exists" An earlier attempt cleaned up containerd's BoltDB via `ctr snapshots rm` before wiping the directory, but that cleanup is inherently fragile: - It requires the nydus gRPC service to be reachable at cleanup time. If the service is stopped, crashed, or not yet running, every `ctr` call silently fails and the stale records remain. - Any workload still actively using a snapshot blocks the entire cleanup, making it impossible to guarantee a clean state. The correct invariant is that meta.db and the nydus backend always agree. Preserving the data directory unconditionally guarantees this: - Fresh install: data directory does not exist, nydus starts empty. - Reinstall: existing snapshots and nydus.db are preserved, meta.db and backend remain in sync, new binary starts cleanly. - After uninstall: containerd is reconfigured without the nydus proxy_plugins entry and restarted, so the snapshot records in meta.db are completely dormant — nothing will use them. If nydus is reinstalled later, the data directory is still present and both sides remain in sync, so no split-brain can occur. Any stale snapshots from previous workloads are garbage-collected by containerd once the images referencing them are removed. This also removes the cleanup_containerd_nydus_snapshots, cleanup_nydus_snapshots, and cleanup_nydus_containers helpers that were introduced by the earlier (fragile) attempt. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Made-with: Cursor	2026-03-25 07:06:41 +01:00
Alex Lyn	46aa318b74	Merge pull request #12716 from lifupan/bump_dragonball_kernel kernel: Bump the kernel to v6.18.15 for dragonball	2026-03-25 11:04:44 +08:00
Aurélien Bombo	ec9c57c595	Merge pull request #12467 from ldoktor/gk-output tools.gatekeeper: Improve output	2026-03-24 17:03:55 -05:00
Fabiano Fidêncio	8950f1caeb	Merge pull request #12706 from fidencio/topic/ci-tdx-nydus-snapshotter tests: Use the helm chart to setup nydus for TDX	2026-03-24 22:37:38 +01:00
Fabiano Fidêncio	814ae53d77	tests: Use the helm chart to setup nydus for TDX Now that containerd 2.3.0-beta.0 has been released, it brings fixes for multi-snapshotters that allows us to test the baremetal machines in the same way we test the non-baremetal ones. Let's start doing the switch for TDX as timezone is friendlier with Mikko. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-03-24 19:13:59 +01:00
Fabiano Fidêncio	27dfb0d06f	Merge pull request #12724 from fidencio/topic/kata-deploy-properly-cleanup-nydus-snapshotter-on-uninstall kata-deploy: nydus: clean containerd metadata before cleaning up the backend	2026-03-24 19:13:25 +01:00
Aurélien Bombo	7ae2282a99	build: Don't fail `cargo check` on a dirty tree `cargo check` was introduced in `3f1533a` to check that Cargo.lock is in sync with Cargo.toml. However, if there are uncommitted changes in the working tree, the current invocation will immediately fail because of the `git diff` call, which is frustrating for local development. As it turns out, `cargo clippy` is a superset of `cargo check`, so we can simply pass `--locked` to `cargo clippy` to detect Cargo.lock issues. This is tested with the following change: diff --git a/src/agent/Cargo.lock b/src/agent/Cargo.lock index 96b6c676d..e1963af00 100644 --- a/src/agent/Cargo.lock +++ b/src/agent/Cargo.lock @@ -4305,6 +4305,7 @@ checksum = "8f50febec83f5ee1df3015341d8bd429f2d1cc62bcba7ea2076759d315084683" name = "test-utils" version = "0.1.0" dependencies = [ - "libc", "nix 0.26.4", ] which results in the following output: $ make -C src/agent check make: Entering directory '/kata-containers/src/agent' standard rust check... cargo fmt -- --check cargo clippy --all-targets --all-features --release --locked \ -- \ -D warnings error: the lock file /kata-containers/src/agent/Cargo.lock needs to be updated but --locked was passed to prevent this If you want to try to generate the lock file without accessing the network, remove the --locked flag and use --offline instead. make: *** [../../utils.mk:184: standard_rust_check] Error 101 make: Leaving directory '/kata-containers/src/agent' Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-03-24 11:22:14 -05:00
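
Note on commit 64735222c6 above: its message states that sector-size values must be 0 or a power of 2 in the range [512, 65536]. The following is only a minimal sketch of that rule, not code from the repository. The function name `is_valid_sector_size` is hypothetical, and the real check in the runtime may be structured differently.

```python
def is_valid_sector_size(value: int) -> bool:
    """Return True if value is 0 or a power of two within [512, 65536].

    Hypothetical helper illustrating the rule quoted in the commit message;
    0 stands for "let QEMU decide".
    """
    if value == 0:
        return True
    # A positive integer is a power of two when exactly one bit is set.
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return is_power_of_two and 512 <= value <= 65536


if __name__ == "__main__":
    for candidate in (0, 512, 4096, 65536, 256, 1000, 131072):
        print(candidate, is_valid_sector_size(candidate))
```

Under this sketch, 0, 512, 4096 and 65536 are accepted, while 256 (below the range), 1000 (not a power of 2) and 131072 (above the range) are rejected.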

18312 Commits