kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-14 02:53:02 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	75996945aa	kata-deploy: try-kata-values.yaml -> values.yaml This makes the user experience better, as the admin can deploy Kata Containers without having to download / set up any additional file. Of course, if the admin wants something more specific, examples are provided. Tests and documentation are updated to reflect this change. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-17 12:16:17 +01:00
Fabiano Fidêncio	2e000129a9	kata-deploy: tests: Add example values files for easy Kata deployment Add three example values files to make it easier for users to try out different Kata Containers configurations: - try-kata.values.yaml: Enables all available shims - try-kata-tee.values.yaml: Enables only TEE/confidential computing shims - try-kata-nvidia-gpu.values.yaml: Enables only NVIDIA GPU shims These files use the new structured configuration format and serve as ready-to-use examples for common deployment scenarios. Also update the README.md to document these example files and how to use them. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	aa89fda7fc	kata-deploy: Document new structured configuration and deprecation Add comprehensive documentation for the new structured configuration format, including: - Migration guide from legacy env.* format - List of deprecated fields with removal timeline (2 releases) - Examples of the new structured format - Explanation of key benefits - Backward compatibility notes The documentation makes it clear that the legacy format is deprecated but will continue to work during the transition period. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	119893b8e8	kata-deploy: Add backward compatibility for legacy env.* configuration This commit adds backward compatibility support to ensure existing configurations using the legacy env.* format continue to work. The helper functions now check for legacy env.* values first, and only fall back to the new structured format if legacy values are not set. This allows for gradual migration without breaking existing deployments. Backward compatibility is maintained for: - env.shims, env.shims_* (per architecture) - env.defaultShim, env.defaultShim_* (per architecture) - env.allowedHypervisorAnnotations - env.snapshotterHandlerMapping_* (per architecture) - env.pullTypeMapping_* (per architecture) - env.agentHttpsProxy, env.agentNoProxy - env._experimentalSetupSnapshotter - env._experimentalForceGuestPull_* (per architecture) - env.debug Legacy env vars (SHIMS, DEFAULT_SHIM, etc.) are still set in the DaemonSet when using the old format to maintain full compatibility. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	ae3fb45814	kata-deploy: Introduce structured configuration format for shims This commit introduces a new structured configuration format for configuring Kata Containers shims in the Helm chart. The new format provides: - Per-shim configuration with enabled/supportedArches - Per-shim snapshotter, guest pull, and agent proxy settings - Architecture-aware default shim configuration - Root-level debug and snapshotter setup configuration All shims are disabled by default and must be explicitly enabled. This provides better type safety and clearer organization compared to the legacy env.* string-based format. The templates are updated to use the new structure exclusively. Backward compatibility will be added in a follow-up commit. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	e85d584e1c	kata-deploy: script: Fix FOR_ARCH handling As some of the global vars can be empty, we should actually check their _FOR_ARCH version instead. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	397289c67c	kata-deploy: script: Handle {https,no}_proxy per shim As we're making the values.yaml more user friendly, we actually have to handle the https_proxy and no_proxy entries per shim, instead of having this globally available, as this will only affect images being pulled inside the guest (as in, when using TEE variations of the shims). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
Fabiano Fidêncio	f62d9435a2	runtimeclasses: firecracker is not a valid one At least not for now, and it was mistakenly added to the list. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-15 09:36:14 +01:00
nheinemans-asml	3380458269	kata-deploy: Add daemonsets to the RBAC Add missing rules which are necessary for dealing with daemonsets as kata-deploy now checks for the NFD daemonset as part of its script. fixes #12083 Signed-off-by: nheinemans-asml <97238218+nheinemans-asml@users.noreply.github.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-14 17:16:58 +01:00
Simon Kaegi	716c55abdd	kernel: adds nft bridging and filtering support for IPv4 and IPv6 Adds a practical set of kernel config used by docker-in-docker and kind for network bridging and filtering. It also includes the matching IPv6 support to allow tools like kind that require IPv6 network policies to work out of the box. This support includes: - nftables reject and filtering support for inet/ipv4/ipv6 - Bridge filtering for container-to-container traffic - IPv6 NAT, filtering, and packet matching rules for network policies - VXLAN and IPsec crypto support for network tunneling - TMPFS POSIX ACL support for filesystem permissions The configs are organized across fragment files: - common/fs.conf: TMPFS ACL support - common/crypto.conf: IPsec/VXLAN crypto algorithms - common/network.conf: VXLAN, IPsec ESP, nftables bridge/ARP/netdev - common/netfilter.conf: IPv6 netfilter stack and nftables advanced features Fixes: #11886 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2025-11-14 15:57:47 +01:00
stevenhorsman	a1ddd2c3dd	kata-deploy: Add kata-qemu-coco-dev-runtime-rs runtime class Add the runtime class and shim references for the new non-tee runtime-rs class Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-11-13 14:18:43 +00:00
Manuel Huber	caff6df827	deploy: Improve busybox build Parallelize busybox builds to build a bit faster and create the build directory prior to Docker execution, which on my environment, helps with permission issues when building busybox without the kata-containers/build directory existing beforehand. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-07 10:09:57 -08:00
Manuel Huber	25ce0afd52	kata-deploy: Allow the CDI annotation for CC GPU cases For the nvidia-gpu-snp and nvidia-gpu-tdx we must set containerd to allow the CDI annotation to be passed down. This solution may become obsolete soon enough, but the cleanest way to have it properly working is by adding it here (even if we remove it before the next release). Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	c91edf884b	runtimeclasses: nvidia: Bump TEE podOverhead It's been noticed that as more RAM is needed to run the CC tests, we also need to update the podOverhead of the NVIDIA CC runtime classes to avoid getting OOM Killed. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Fabiano Fidêncio	b2ee64a2d6	kata-deploy: scripts: Ensure we don't add duplicated values Let's now make sure that we don't add duplicated values to any of our entries, making the script as sane as possible for sequential runs. Vibed with Cursor's help! Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:48:24 +01:00
Fabiano Fidêncio	78ae79d153	kata-deploy: scripts: Add helper functions to avoid duplicated items Let's add some helper functions, not yet used, to avoid adding duplicated items. This idea is an expansion of Choi's idea to avoid setting duplicated items, and it'll help on making the whole script idempotent on sequential runs. Vibed with Cursor's help! Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:48:24 +01:00
Fabiano Fidêncio	f773368d93	kata-deploy: Add per arch ALLOWED_HYPERVISOR_ANNOTATIONS I know, this is not simplifying things much for now, but it has a good intent in the background and will serve as base for making the kata-deploy helm chart more user friendly. With that said, let's add ALLOWED_HYPERVISOR_ANNOTATIONS per arch, while adding support to set something like "qemu:foo,bar clh:bar foobar barfoo". Why? Because in the future we'll have a better way to set this per shim (and the shim is per arch ...). More details of what we'll do in the future are being discussed here: https://github.com/kata-containers/kata-containers/issues/12024 Anyways, the variables are DELIBERATELY not exposed to the chart for now, as those will be later on when addressing the issue mentioned above. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:45:34 +01:00
Fabiano Fidêncio	66e133e096	kata-deploy: Add missing runtimeClasses When the runtimeClasses were added, as part of `7cfa826804`, the firecracker runtimeClass ended up missing from the dictionary. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-05 19:07:28 +01:00
Fabiano Fidêncio	02f47d3f18	helm: uninstall: Take nodeSelector into consideration As we're already doing for the install part, but this bit was missed during review. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-04 09:29:35 +01:00
Fabiano Fidêncio	12f3b206eb	Revert "kata-deploy: Allow setting the default runtime class name" This reverts commit `be05e1370c`, which is not a problem as we never released such option. Conflicts: tools/packaging/kata-deploy/helm-chart/README.md Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 17:32:18 +01:00
Fabiano Fidêncio	7cfa826804	kata-deploy: Let helm deal with runtimeClass creation We had this logic inside the script when we didn't use the helm chart. However, this only makes the shim script more convoluted for no reason. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-03 17:32:18 +01:00
Fabiano Fidêncio	157b2c32ce	scripts: release: Run helm dependencies update Otherwise we'll face issues like: ``` Error: found in Chart.yaml, but missing in charts/ directory: node-feature-discovery ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-01 17:54:58 +01:00
Fabiano Fidêncio	ebe15d154e	kata-deploy: Add NFD as a dependency Let's ensure that we add NFD as a weak dependency of the kata-deploy helm chart. What we're doing for now is leaving it up to the user / admin to enable it, and if enabled then we do an explicit check for virtualization support (x86_64 only for now). In case NFD is already deployed, we fail the installation (in case it's enabled on the kata-deploy helm chart) with a clear error message to the user. While I know that kata-remote DOES NOT require virtualization, I've left this out (with a comment for when we add a peer-pods dependency on kata-deploy) in order to simplify things for now, as kata-remote is not a deployed shim by default. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:30:13 +01:00
Fabiano Fidêncio	be05e1370c	kata-deploy: Allow setting the default runtime class name As Kata Containers can be consumed by other helm-charts, hard coding the default runtime class name to `kata` is not optimal. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Fabiano Fidêncio	820e6d6351	kata-deploy: Add more per-arch options All the options that take a specific shim as an argument MUST have specific per arch settings, as not all the shims are available for all the arches, leading to issues when setting up multi-arch deployments. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-31 16:14:53 +01:00
Fabiano Fidêncio	f9825b4e6e	kata-deploy: Automatically deploy NodeFeatureRules for TEEs When the NodeFeatureRule CRD is detected kata-deploy will: * Create the specific NodeFeatureRules for the x86_64 TEEs * Adapt the TEEs runtime classes to take into account the amount of keys available in the system when spawning the podsandbox. Note, we still do not have NFD as sub-dependency of the helm chart, and I'm not even sure if we will have. However, it's important to integrate better with the scenarios where the NFD is already present. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 21:24:27 +01:00
Fabiano Fidêncio	a164693e1a	release: Bump version to 3.22.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-28 16:28:18 +01:00
Fabiano Fidêncio	754e832cfa	kata-deploy: Allow passing shims / defaultShim per arch This allows us to do a full multi-arch deployment, as the user can easily select which shim can be deployed per arch, as some of the VMMs are not supported on all architectures, which would lead to a broken installation. Now, by passing shims per arch, we can easily have a heterogeneous deployment where, for instance, we can set qemu-se-runtime-rs for s390x, qemu-cca for aarch64, and qemu-snp / qemu-tdx for x86_64 and call all of those a default kata-confidential ... and have everything working with the same deployment. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-27 22:42:37 +01:00
Zvonko Kaiser	39848e0983	gpu: rootfs fixes Build only from Ubuntu repositories do not mix with developer.nvidia.com Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Update tools/osbuilder/rootfs-builder/nvidia/nvidia_chroot.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-26 19:36:55 +01:00
Fabiano Fidêncio	12a515826d	tools: Install Golang from a reliable mirror (follow-up) Aurélien has moved to a reliable mirror for our tests, but we missed that our tools Dockerfiles could benefit from the same change, which is added now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-23 11:15:13 +02:00
Fabiano Fidêncio	560425f31f	build: kernel: Bump version to trigger signed builds for arm64 GPU Although we saw this happening, we expected it to NOT happen ... As the kernel is not signed, but we expect it to be (the cached version), then we're bailing. :-/ Let's ensure a full rebuild of kernels happen and we'll be good from that point onwards. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-23 11:15:13 +02:00
Fabiano Fidêncio	ba912e6a84	kata-deploy: Adapt nydus installation to MULTI_INSTALL_SUFFIX By doing this we can ensure that more than one instance of nydus-snapshotter can be running inside the cluster, which is super useful for doing A-B "upgrades" (where we install a new version of kata-containers + nydus on B, while A is still running, and then only uninstall A after making sure that B is working as expected). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-22 20:25:03 +02:00
Fabiano Fidêncio	ded336405f	kata-deploy: All qemu variants use .hypervisors.qemu.* We've been wrongly trying to set up the `${shim}` (as the qemu-snp, for instance) as the hypervisor name in the kata-containers configuration file, leading to `tomlq` breaking as all the .hypervisors.qemu* shims are tied to the `qemu` hypervisor, and it happens regardless of the shim having a different name, or the hypervisor being experimental or not. ```sh $ grep "hypervisor.qemu" src/runtime/config/configuration- src/runtime/config/configuration-qemu-cca.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-coco-dev.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu-snp.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu-tdx.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-nvidia-gpu.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-se.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-snp.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu-tdx.toml.in:[hypervisor.qemu] src/runtime/config/configuration-qemu.toml.in:[hypervisor.qemu] $ grep "hypervisor.qemu" src/runtime-rs/config/configuration- src/runtime-rs/config/configuration-qemu-runtime-rs.toml.in:[hypervisor.qemu] src/runtime-rs/config/configuration-qemu-se-runtime-rs.toml.in:[hypervisor.qemu] ``` Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-22 10:23:12 +02:00
Fabiano Fidêncio	552378cf1e	helm: Add missing documentation We've recently added support for: * deploying and setting up a snapshotter, via _experimentalSetupSnapshotter * enabling experimental_force_guest_pull, via _experimentalForceGuestPull However, we never updated the documentation for those, thus let's do it now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-21 16:20:21 +02:00
Kevin Zhao	141070b388	Kata-deploy: Add kata-deploy set up for qemu-cca Support launch qemu-cca in Kata-deploy. Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:52 +08:00
Kevin Zhao	af919686ab	Kata-deploy: Add CCA firmware build support runtime: pass firmware to CCA Realm Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:45 +08:00
Kevin Zhao	16e91bfb21	kata-deploy: Add support for Arm CCA Qemu build The Qemu support is picked up from: https://git.codelinaro.org/linaro/dcap/qemu.git, branch: cca/2025-04-16 More info regarding the CCA software stack dev and test, please refer to link: https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+RME+stack+for+QEMU Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-10-16 17:24:08 +08:00
Seunguk Shin	c7d5f207f1	kata-deploy: support build confidential rootfs and initrd for CCA Also add cca-attester for coco-guest-component Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org> Co-authored-by: Seunguk Shin <seunguk.shin@arm.com>	2025-10-16 17:24:03 +08:00
Seunguk Shin	40dac78412	kata-deploy: support build confidential kernel and shim-v2 for CCA With Arm CCA support, the runtime build relies on the kernel kvm.h headers. The required kernel headers are currently newer than the traditional ones, so we build the kernel headers first and then inject them into the shim-v2 build container. Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org> Co-authored-by: Seunguk Shin <seunguk.shin@arm.com>	2025-10-16 17:23:58 +08:00
Fabiano Fidêncio	2ad81c4797	build: qemu: Fix cache logic We need to ensure that any change on the Dockerfile (and its dir) leads to the build being retriggered, rather than using the cached version. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-14 12:17:43 +02:00
Fabiano Fidêncio	2f73e34e33	builds: qemu: Use a liburing newer than 2.2 Due to a potential regression introduced by: `984a32f17e (565f3835aaed6321caab4f7c4f8560a687f6000b_379_386)` Reported-by: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-14 12:17:28 +02:00
Fabiano Fidêncio	b0b0038689	versions: Bump QEMU to 10.1.1 QEMU 10.1.1 was released on October 8th, 2025, let's bump it on our side. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-13 23:52:01 +02:00
Fabiano Fidêncio	fb43d3419f	build: Fix nvidia kernel breakage On commit `9602ba6ccc`, from February this year, we've introduced a check to ensure that the files needed for signing the kernel build are present. However, we've noticed last week that there were a reasonable amount of wrong assumptions with the workflow. :-) Zvonko fixed the majority of those, but this bit was left and it'd cause breakages when using kernel that was cached ... although passing when building new kernels. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-13 19:28:40 +02:00
Zvonko Kaiser	b00013c717	kernel: Add KBUILD_SIGN_PIN pass through This is needed so the kernel setup picks up the correct config values from our fragments directories. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-10 15:45:34 -04:00
Fabiano Fidêncio	496e255ea2	build: Fix KBUILD_SIGN_PIN usage What was done in the past, trying to set the env var on the same step it'd be used, simply does not work. Instead, we need to properly set it through the `env` set up, as done now. We're also bumping the kata_config_version to ensure we retrigger the kernel builds. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-10 15:25:10 +02:00
Fabiano Fidêncio	dbb1eb959c	kata-deploy: Allow users to set experimental_force_guest_pull For those who are not willing to use the nydus-snapshotter for pulling the image inside the guest, let's allow them to set the experimental_force_guest_pull, introduced by Edgeless, as part of our helm-chart. This option can be set as: _experimentalForceGuestPull: "qemu-tdx,qemu-coco-dev" Which would then ensure that the configuration for `qemu-tdx` and `qemu-coco-dev` would have the option enabled. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 17:43:09 +02:00
Fabiano Fidêncio	8c4bad68a8	kata-deploy: Remove kustomize yamls, rely on helm-chart only As the kata-deploy helm chart has been the only way we've been testing kata-containers deployment as part of our CI, it's time to finally get rid of the kustomize yamls and avoid us having to maintain two different methods (with one of those not being tested). Here I removed: * kata-deploy yamls and kustomize yamls * kata-cleanup yamls and kustomize yamls * kata-rbac yamls and kustomize yamls * README.md for the kustomize yamls Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-08 16:54:19 +02:00
Zvonko Kaiser	59b4e3d3f8	gpu: Add CONFIG_FW_LOADER to the kernel We need it for the newer CC kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-10-08 10:01:27 +02:00
Szymon Klimek	8dc6b24e7d	kata-deploy: accept 25.10 as supported distro for TDX Canonical TDX release is not needed for vanilla Ubuntu 25.10 but GRUB_CMDLINE_LINUX_DEFAULT needs to contain `nohibernate` and `kvm_intel.tdx=1` Signed-off-by: Szymon Klimek <szymon.klimek@intel.com>	2025-10-07 23:41:52 +02:00
Fabiano Fidêncio	000c9cce23	kata-deploy: chart: Add `_experimentalSetupSnapshotter` Let's expose the EXPERIMENTAL_SETUP_SNAPSHOTTER script environment variable to our chart, allowing then users of our helm chart to take advantage of this experimental feature. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-10-07 10:32:46 +02:00
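
Commit 119893b8e8 describes a precedence rule: legacy env.* values are checked first, and the structured format is used only when the legacy values are not set. A minimal Python sketch of that rule follows. It is an illustration only, not the chart's actual Helm helper: the function name `resolve_shims`, the space-separated form of `env.shims`, and the `shims.<name>.enabled` layout are assumptions made for the example.

```python
def resolve_shims(values):
    """Return the enabled shims, preferring the legacy env.shims string.

    `values` stands in for parsed Helm values (hypothetical layout).
    """
    # Legacy format: assumed to be a space-separated string under env.shims.
    legacy = (values.get("env") or {}).get("shims", "")
    if legacy.strip():
        return legacy.split()

    # Structured format: assumed to be one mapping per shim with an
    # `enabled` flag; shims that do not set it count as disabled.
    shims = values.get("shims") or {}
    return [name for name, cfg in shims.items() if (cfg or {}).get("enabled")]
```

With both formats present, the legacy value wins, which matches the gradual-migration intent stated in the commit message.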

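Commits 78ae79d153 and b2ee64a2d6 add helpers that avoid duplicated items so the script stays idempotent across sequential runs. The real helpers live in the kata-deploy shell script. The Python sketch below only illustrates the idea, and the name `append_unique` and the comma separator are assumptions.

```python
def append_unique(current, item, sep=","):
    """Append `item` to a separator-delimited string unless already present."""
    # Drop empty fragments so an empty starting value does not yield ",item".
    entries = [entry for entry in current.split(sep) if entry]
    if item not in entries:
        entries.append(item)
    return sep.join(entries)
```

Calling it twice with the same item leaves the value unchanged after the first call, which is the property a repeated deployment run needs.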

1388 Commits