kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-15 03:33:05 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	c75a46d17f	tests: Do not enable NFD on s390x As we're failing on the uninstall, which seems related to a bug on NFD itself, but I don't have access to a s390x machine to debug, let's skip the enablement for now and enable it back once we've experimented it better on s390x. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	67e38e0f92	tests: Do not enable NFD on cbl-mariner As we're failing to install NFD on CBL Mariner, let's skip the enablement there, and enable it once we've experimented it better there. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	1bc873397b	tests: Use NFD as part of the tests As we have the ability to deploy NFD as a sub-chart of our chart, let's make sure we test it during our CI. We had to increase the timeout values, where we had timeouts set, to deploy / undeploy kata, as now NFD is also deployed / undeployed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	ebe15d154e	kata-deploy: Add NFD as a dependency Let's ensure that we add NFD as a weak dependency of the kata-deploy helm chart. What we're doing for now is leaving it up to the user / admin to enable it, and if enabled then we do an explicit check for virtualization support (x86_64 only for now). In case NFD is already deployed, we fail the installation (in case it's enabled on the kata-deploy helm chart) with a clear error message to the user. While I know that kata-remote DOES NOT require virtualization, I've left this out (with a comment for when we add a peer-pods dependency on kata-deploy) in order to simplify things for now, as kata-remote is not a deployed shim by default. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	be05e1370c	kata-deploy: Allow setting the default runtime class name As Kata Containers can be consumed by other helm-charts, hard coding the default runtime class name to `kata` is not optimal. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Fabiano Fidêncio	820e6d6351	kata-deploy: Add more per-arch options All the options that take a specific shim as an argument MUST have specific per arch settings, as not all the shims are available for all the arches, leading to issues when setting up multi-arch deployments. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Zvonko Kaiser	94abe4fc00	osbuilder: nvrc: Consume NVRC release instead of building it Let's ensure that we consume NVRC releases straight from GitHub instead of building the binaries ourselves. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 12:10:20 +01:00
Zvonko Kaiser	69c76971f3	gpu: Handle VFIO and IOMMUFD We have here either /dev/vfio/<num> or /dev/vfio/devices/vfio<num>, for IOMMUFD format /dev/vfio/devices/vfio<num>, strip "vfio" prefix /dev/vfio/123 - basename "123" - vfioNum = "123" - cdi.k8s.io/vfio123 /dev/vfio/devices/vfio123 - basename "vfio123" - strip - vfioNum = "123" - cdi.k8s.io/vfio123 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-31 09:46:07 +01:00
Saul Paredes	26396881cf	webhook: allow privileged containers This allows us to test privileged containers when using the webhook. We can do this because kata-deploy sets privileged_without_host_devices = true for kata runtime by default. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-10-30 14:59:26 -07:00
Fabiano Fidêncio	e30e2b5f45	tests: k8s: Remove tests running on GitHub provided runner We have 2 tests running on GitHub provided runners: * devmapper * CRI-O - devmapper situation For devmapper, we're currently testing devmapper with s390x as part of one of its jobs. More than that, this test has been failing here due to a lack of space in the machine for quite some time, and no action was taken to bring it back either via GARM or some other way. With that said, let's rely on the s390x CI to test devmapper and avoid one extra failure on our CI by removing this one. - cri-o situation CRI-O is being tested with a fixed version of kubernetes that's already reached its EOL, and a CRI-O version that matches that k8s version. There have been attempts to raise issues, and also to provide a PR that does at least part of the work ... leaving the debugging part for the maintainers of the CI. However, there was no action on those from the maintainers. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-30 11:46:59 +01:00
Alex Lyn	fa521220a9	Merge pull request #11816 from jiuyi123/rs-vm-template-kata-ctl-merge kata-ctl: add factory subcommands for VM template management	2025-10-30 18:21:12 +08:00
ssc	551caad4b1	docs: add guide on VM templating usage in runtime-rs - Explained the concept and benefits of VM templating - Provided step-by-step instructions for enabling VM templating - Detailed the setup for using snapshotter in place of VirtioFS for template-based VM creation - Added performance test results comparing template-based and direct VM creation Signed-off-by: ssc <741026400@qq.com>	2025-10-30 15:18:31 +08:00
ssc	5a586e13a1	kata-ctl: add factory subcommands for VM template management - init: initialize the VM template factory - status: check the current factory status - destroy: clean up and remove factory resources These commands provide basic lifecycle management for VM templates. Signed-off-by: ssc <741026400@qq.com>	2025-10-30 10:27:17 +08:00
RuoqingHe	8878c46e8f	Merge pull request #11867 from spectator333/update-rust-vmm-deps dragonball: Bump kvm-ioctls to fix security issue	2025-10-30 00:17:29 +08:00
Siyu Tao	dd444d23b3	dragonball: Bump kvm-ioctls to fix security issue Use `ioctl_with_mut_ref` instead of `ioctl_with_ref` in the `create_device` method as it needs to write to the `kvm_create_device` struct passed to it, which was released in v0.12.1. Signed-off-by: Siyu Tao <taosiyu2024@163.com>	2025-10-29 14:03:29 +00:00
Steve Horsman	0e19a2bf91	Merge pull request #11993 from zvonkok/vectorAdd gpu: Add libs for CC	2025-10-29 13:42:34 +00:00
stevenhorsman	555926ea1a	libs: Fix formatting issue Fix the cargo fmt issues and then we can make the libs tests required again to avoid this regression happening again. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-29 13:13:50 +01:00
Steve Horsman	dbdd1009af	Merge pull request #11933 from kata-containers/topic/kata-deploy-nfd-dependency-part-I kata-deploy: Automatically deploy NodeFeatureRules for TEEs	2025-10-29 09:50:38 +00:00
Fabiano Fidêncio	103f80c7f5	readme: install: Drop outdated documentation kata-deploy helm chart is THE way to deploy kata-containers on kubernetes environments, and kubernetes environments is basically the only reliably tested deployment we have. For now, let's just drop documentation that is outdated / incorrect, and in the future let's ensure we update the linked docs, as we work on update / upgrade for the helm chart. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-29 09:41:57 +01:00
Zvonko Kaiser	5ff218823c	gpu: Remove unneeded libraries The libs in question were added when moving to developer.nvidia.com but switching back to ubuntu only based builds they are not needed. Remove them to keep the rootfs as minimal as possible. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-29 08:03:36 +01:00
Zvonko Kaiser	6d9b4059f5	gpu: Add libs for CC In the case of CC we need additional libraries in the rootfs. Add them conditionally if type == confidential. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-29 08:03:36 +01:00
Xuewei Niu	55d181beb1	Merge pull request #11828 from jiuyi123/rs-vm-template-runtime-rs runtime-rs: introduce VM template lifecycle and integration	2025-10-29 14:03:46 +08:00
Xuewei Niu	8aca32dfa9	Merge pull request #11862 from StevenFryto/rootless_clh runtime-rs: supporting the CLH VMM process running in non-root mode	2025-10-29 13:31:53 +08:00
ssc	16e8cf1a09	runtime-rs: boot vm from template Add build_vm_from_template() that flips boot_from_template flag, wires factory.template_path/{memory,state} into the hypervisor config, and returns ready-to-use hypervisor & agent instances. When factory.template is enabled, VirtContainer bypasses normal creation and directly boots the VM by restoring the template through incoming migration, completing the "create → save → clone" loop. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 12:38:28 +08:00
ssc	550615285c	runtime-rs: add factory, template and vm modules for VM template lifecycle Introduced factory::FactoryConfig with init/destroy/status commands to manage template pools. Added template::Template to fetch, create and persist base VMs. Introduced vm::{VM, VMConfig} exposing create, pause, save, resume, stop, disconnect and migration helpers for sandbox integration. Extended QemuInner to execute QMP incoming migration, pause/resume and status tracking. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 12:38:28 +08:00
ssc	135c84b6cb	kata-types: add VM template and factory configuration Added new fields in Hypervisor struct to support VM template creation, template boot, memory and device state paths, shared path, and store paths. Introduced a Factory struct in config to manage template path, cache endpoint, cache number, and template enable flag. Integrated Factory into TomlConfig for runtime configuration parsing. Fixes: #11413 Signed-off-by: ssc <741026400@qq.com>	2025-10-29 11:49:08 +08:00
stevenfryto	2ceadc5fa3	runtime-rs: supporting the CLH VMM process running in non-root mode This change enables running the Cloud Hypervisor VMM as a non-root user when the rootless flag is set to true in the configuration. Fixes: #11414 Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-10-29 01:55:10 +00:00
stevenfryto	2ddbae3aa6	runtime-rs: pass the tuntap fds down to Cloud Hypervisor Pass the file descriptors of the tuntap device to the Cloud Hypervisor VMM process so that the process could open the device without cap_net_admin Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-10-29 01:55:10 +00:00
Fabiano Fidêncio	59883a2d99	actions: Remove unused USING_NFD There's no reason to keep the env var / input as it's never been used and now kata-deploy detects automatically whether NFD is deployed or not. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 21:24:27 +01:00
Fabiano Fidêncio	f9825b4e6e	kata-deploy: Automatically deploy NodeFeatureRules for TEEs When the NodeFeatureRule CRD is detected kata-deploy will: * Create the specific NodeFeatureRules for the x86_64 TEEs * Adapt the TEEs runtime classes to take into account the amount of keys available in the system when spawning the podsandbox. Note, we still do not have NFD as sub-dependency of the helm chart, and I'm not even sure if we will have. However, it's important to integrate better with the scenarios where the NFD is already present. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 21:24:27 +01:00
Manuel Huber	8dc78057d6	ci: Refactor NVIDIA NIM test Change NIM bats file logic to allow skipping test cases which require multiple GPUs. This can be helpful for test clusters where there is only one node with a single GPU, or for local test environments with a single-node cluster with a single GPU. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-28 19:12:16 +01:00
Manuel Huber	be32b77baf	ci: Add NVIDIA CUDA vectoradd test This change adds a CUDA vectoradd test case and makes enabling NVRC tracing optional and idempotent. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-10-28 19:12:16 +01:00
Fabiano Fidêncio	a164693e1a	release: Bump version to 3.22.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> 3.22.0	2025-10-28 16:28:18 +01:00
Steve Horsman	1b46cf43c4	Merge pull request #11989 from Amulyam24/actionpz-ppc64le revert: Enable new ibm runners for ppc64le	2025-10-28 12:09:03 +00:00
Amulyam24	c603094584	revert: Enable new ibm runners for ppc64le Temporarily disables the new runners for building artifacts jobs. Will be re-enabled once they are stable. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-10-28 17:09:26 +05:30
Hyounggyu Choi	7d2fe5e187	revert: Enable new ibm runners for s390x This partially reverts `8dcd91c` for the s390x because the CI jobs are currently blocking the release. The new runners will be re-introduced once they are stable and no longer impact critical paths. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-10-28 11:11:51 +01:00
Fabiano Fidêncio	754e832cfa	kata-deploy: Allow passing shims / defaultShim per arch This allows us to do a full multi-arch deployment, as the user can easily select which shim can be deployed per arch, as some of the VMMs are not supported on all architectures, which would lead to a broken installation. Now, passing shims per arch we can easily have a heterogeneous deployment where, for instance, we can set qemu-se-runtime-rs for s390x, qemu-cca for aarch64, and qemu-snp / qemu-tdx for x86_64 and call all of those a default kata-confidential ... and have everything working with the same deployment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-27 22:42:37 +01:00
Greg Kurz	ffdc80733a	Merge pull request #11966 from zvonkok/gpu-cc-fix gpu: rootfs fixes	2025-10-27 10:18:13 +01:00
Alex Lyn	418d5f724e	Merge pull request #11971 from lifupan/fupan_blk_ratelimit runtime-rs: Support disk rate limiter for dragonball	2025-10-27 17:12:47 +08:00
Alex Lyn	f86ac595a8	Merge pull request #11973 from Apokleos/enhance-oci-spec runtime-rs: Enhancements for items within OCI Spec	2025-10-27 16:15:00 +08:00
Alex Lyn	690dad5528	runtime-rs: Ensure complete cleanup of stale Device Cgroups The previous procedure failed to reliably ensure that all unused Device Cgroups were completely removed, a failure consistently verified by CI tests. This change introduces a more robust and thorough cleanup mechanism. The goal is to prevent previous issues—likely stemming from improper use of Rust mutable references—that caused the modifications to be ineffective or incomplete. This ensures a clean environment and reliable CI test execution. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-10-27 12:47:48 +08:00
Alex Lyn	25ab615da5	Merge pull request #11913 from Apokleos/dedicated-error-rs CI: Add dedicated expected error message for runtime-rs	2025-10-27 10:47:07 +08:00
Zvonko Kaiser	39848e0983	gpu: rootfs fixes Build only from Ubuntu repositories do not mix with developer.nvidia.com Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Update tools/osbuilder/rootfs-builder/nvidia/nvidia_chroot.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-26 19:36:55 +01:00
stevenhorsman	aec0ceb860	gatekeeper: Update mariner tests name In https://github.com/kata-containers/kata-containers/pull/11972 the auto-generate-policy: yes matrix parameter was removed which updates the name of the test, so sync this change in required-tests.yaml Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-10-25 17:51:31 +02:00
Kevin Zhao	e2dbe87a99	tests: Fix cca test failure on arm64 and other architectures Fix the wrong test with appendProtectionDevice on arm64 Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-25 13:54:35 +02:00
dependabot[bot]	99ae3607dc	build(deps): bump astral-tokio-tar in /src/tools/agent-ctl Bumps [astral-tokio-tar](https://github.com/astral-sh/tokio-tar) from 0.5.5 to 0.5.6. - [Release notes](https://github.com/astral-sh/tokio-tar/releases) - [Changelog](https://github.com/astral-sh/tokio-tar/blob/main/CHANGELOG.md) - [Commits](https://github.com/astral-sh/tokio-tar/compare/v0.5.5...v0.5.6) --- updated-dependencies: - dependency-name: astral-tokio-tar dependency-version: 0.5.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-25 13:53:24 +02:00
Dan Mihai	61ee4d7f8b	Merge pull request #11951 from burgerdev/watchable genpolicy: allow non-watchable ConfigMaps	2025-10-24 08:38:55 -07:00
Steve Horsman	ac601ecd45	Merge pull request #11964 from Amulyam24/k8s-ppc64le github: migrate k8s job to a different runner on ppc64le	2025-10-24 15:55:59 +01:00
Dan Mihai	ac3ea973ee	Merge pull request #11958 from microsoft/danmihai1/policy-tests-upstream5 tests: k8s: auto-generate policy for additional tests	2025-10-24 07:18:00 -07:00
Amulyam24	9876cbffd6	github: migrate k8s job to a different runner on ppc64le Migrate the k8s job to a different runner and use a long running cluster instead of creating the cluster on every run. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-10-24 18:20:11 +05:30
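
The path handling described in commit 69c76971f3 (gpu: Handle VFIO and IOMMUFD) can be illustrated with a minimal sketch. This is not the code from the commit: the function name `vfio_cdi_name` and the choice to reject non-numeric results are assumptions made here for illustration. Only the two path formats and the resulting `cdi.k8s.io/vfio<num>` names come from the commit message.

```python
import posixpath


def vfio_cdi_name(dev_path: str) -> str:
    """Map a VFIO device node path to a CDI-style name (hypothetical helper).

    Legacy VFIO:  /dev/vfio/<num>              -> basename is "<num>"
    IOMMUFD:      /dev/vfio/devices/vfio<num>  -> basename is "vfio<num>"
    """
    base = posixpath.basename(dev_path)

    # For the IOMMUFD format, strip the "vfio" prefix to get the number.
    vfio_num = base[len("vfio"):] if base.startswith("vfio") else base

    # Assumption: anything that is not a plain number (for example the
    # /dev/vfio/vfio control node) is rejected rather than mapped.
    if not vfio_num.isdigit():
        raise ValueError(f"not a VFIO device node: {dev_path!r}")

    return f"cdi.k8s.io/vfio{vfio_num}"


if __name__ == "__main__":
    for path in ("/dev/vfio/123", "/dev/vfio/devices/vfio123"):
        print(path, "->", vfio_cdi_name(path))
```

Both path formats resolve to the same name, which matches the two worked examples in the commit message.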


17320 Commits