kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-05-14 11:03:31 +00:00

Author	SHA1	Message	Date
Zvonko Kaiser	76e4e6bc24	Merge pull request #12061 from Apokleos/correct-unexpected-cap tests: Correct unexpected capability for policy failure test	2025-11-11 12:20:33 -05:00
Fabiano Fidêncio	d82eb8d0f1	ci: Drop docker tests We have had those tests broken for months. It's time to get rid of those. NOTE that we could easily revert this commit and re-add those tests as soon as we find someone to maintain and be responsible for such integration. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-11 17:02:02 +01:00
Steve Horsman	4b33000c56	Merge pull request #12067 from Apokleos/fix-guest-emptydir runtime-rs: Fix several incorrect settings with guest empty dir.	2025-11-11 15:21:31 +00:00
dependabot[bot]	281f69a540	build(deps): bump github.com/containerd/containerd in /src/runtime Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.27 to 1.7.29. - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.27...v1.7.29) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-version: 1.7.29 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-11 14:23:47 +01:00
Alex Lyn	79d1a6ed8f	runtime-rs: Correct the mount type for emptydir with local storage The previous Mount.type setting of `bind` is wrong: for local storage, the type of Mount should be `local`. This commit corrects the type to "local". Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 17:09:33 +08:00
Alex Lyn	935ecf2765	runtime-rs: Fix disable_guest_empty_dir parameters order The order of the disable_guest_empty_dir parameter is wrong, which makes the bool value incorrect and produces a wrong result. This commit corrects the parameter order. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 16:59:00 +08:00
Alex Lyn	c225cba0e6	tests: Correct unexpected capability for policy failure test The test case designed to verify policy failures due to an "unexpected capability" was misconfigured. It was using "CAP_SYS_CHROOT" as the unexpected capability to be added. This configuration was flawed for two main reasons: 1. Incorrect Syntax: Kubernetes Pod specs expect capability names without the "CAP_" prefix (e.g., "SYS_CHROOT", not "CAP_SYS_CHROOT"). This made the test case's premise incorrect from a K8s API perspective. 2. Part of Default Set: "SYS_CHROOT" is already included in the `default_caps` list for a standard container. Therefore, adding it would not trigger a policy violation, defeating the purpose of the "unexpected capability" test. Furthermore, a related issue was observed where a malformed capability like "CAP_CAP_SYS_CHROOT" was being generated, causing parsing failures in the `oci-spec-rs` library. This was a symptom of incorrect string manipulation when handling capabilities. This commit corrects the test by selecting "SYS_NICE" as the unexpected capability. "SYS_NICE" is a more suitable choice because: - It is a valid Linux capability. - It is relatively harmless. - It is not part of the default capability set defined in `genpolicy-settings.json`. By using "SYS_NICE", the test now accurately simulates a scenario where a Pod requests a legitimate but non-default capability, which the policy (generated from a baseline Pod without this capability) should correctly reject. This change fixes the test's logic and also resolves the downstream `oci-spec-rs` parsing error by ensuring only valid capability names are processed. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 14:06:30 +08:00
Alex Lyn	9aaf41a71b	Merge pull request #11985 from Apokleos/policy-caps-rs genpolicy: Correct caps matcher for runtime-rs	2025-11-11 11:08:11 +08:00
Alex Lyn	29fe46bc06	genpolicy: Correct caps matcher for runtime-rs Detected a format mismatch in OCI Spec Capabilities fields between `runtime-rs` (no `CAP_` prefix) and `runtime-go` (with `CAP_` prefix). This introduces a normalization of caps in match_caps(p_caps, i_caps). This ensures robust and consistent processing of Capabilities regardless of whether the OCI Spec originates from `runtime-rs` or `runtime-go`. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-11 10:03:54 +08:00
Dan Mihai	f78584e868	Merge pull request #12048 from manuelh-dev/mahuber/bb-build deploy: Improve busybox build	2025-11-10 11:32:07 -08:00
Alex Lyn	7423eb7a30	agent: Support both virtio-blk and virtio-scsi devices for initdata Currently, the initdata module only detects virtio-blk devices (/dev/vd) when searching for the initdata block device. However, when using virtio-scsi, the devices appear as /dev/sd in the guest, causing the initdata detection to fail. This commit extends the device detection logic to support both device types: - virtio-blk devices: /dev/vda, /dev/vdb, etc. - virtio-scsi devices: /dev/sda, /dev/sdb, etc. This commit aims to address the issue of the initdata device not being found when using virtio-scsi. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-10 18:03:23 +01:00
dependabot[bot]	f699f097f3	build(deps): bump github.com/opencontainers/runc in /src/runtime Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.2.6 to 1.2.8. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/v1.2.8/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.2.6...v1.2.8) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-version: 1.2.8 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-10 15:43:48 +01:00
Fabiano Fidêncio	92226d0a19	tests: nvidia: Be prepared for TDX Thankfully there's only one piece that's still SNP specific (for the supported TEEs). Let's adjust it so we can have an easy and smooth execution when adding a TDX CI machine. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	4d314e8676	tests: nvidia: nims: Adjust to CC There are several changes needed in order to get this test working with CC, and yet we still are skipping it. Basically, we need to: * Pull an authenticated image inside the guest, which requires: * Using Trustee to release the credential * We still depend on a PR to be merged on Trustee side * https://github.com/confidential-containers/trustee/pull/1035 * We still depend on a Trustee bump (including the PR above) on our side Apart from those changes, I ended up "duplicating" the tests by adding a "-tee" version of those, which already have: * The proper kbs annotations set up * Dropped host mounts * Increases the memory needed Last but not least, as "bats" probably means "being a terrible script", I had to re-arrange a few things otherwise the tests would not even run due to bats-isms that I am sincerely not able to pin-point. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	8cedd96d54	tests: nvidia: k8s: Enforce experimental_force_guest_pull We added the tests using virtio-9p as we knew it'd require incremental changes to be able to use any kind of guest-pull method. Now, as in the coming commits we'll be actually ensuring that guest-pull works and is in use, we can enforce the experimental_force_guest_pull usage for the nvidia cases. Note: We're using experimental_force_guest_pull instead of nydus-snapshotter due to stability concerns with the snapshotter. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	464764c7e0	tests: nvidia: kbs: Ensure KBS_INGRESS=nodeport I missed doing this during the KBS deployment setup. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Manuel Huber	a5cd7235cb	runtime: Align nvidia TEEs enable_annotations with TEEs It was just missed when adding those configurations. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	e85cf83573	k8s: tests: Fix default for EXPERIMENTAL_FORCE_GUEST_PULL It takes either a shim name or "", but we were treating this (thankfully only in this specific file) as a boolean. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Manuel Huber	8b39468b36	tests: nvidia: Logging for NIM Adjust output to the setup_file and teardown_file behavior. With this, we will be able to observe relevant logging rather than adding to the output variable. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-10 13:01:30 +01:00
Fabiano Fidêncio	812191c1f3	tests: nvidia: Do not deploy NFD on nvidia-gpu cases As it'll come from the GPU Operator for now. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-10 13:01:30 +01:00
Pavel Mores	74f9fdb11f	runtime-rs: remove hardcoding of SEV physical address reduction Previous commit enabled getting the physical address reduction from processor but just stored it for later use. This commit adds handling of the value to ProtectionDevice and enables the QEMU driver to use it. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-10 13:01:03 +01:00
Pavel Mores	6f9178d290	runtime-rs: get SEV params using CPUID and store them in SevSnpDetails An implementation of cbitpos acquisition is supplied that was missing so far. We also get the physical address reduction value from the same source (CPUID Fn8000_001f function). This has been hardcoded at 1 so far, following the Go runtime example, but it's better to get it from the processor. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-10 13:01:03 +01:00
Greg Kurz	5810279edf	Merge pull request #12008 from microsoft/saulparedes/allow_priv webhook: allow privileged containers	2025-11-10 11:13:41 +01:00
Zvonko Kaiser	df58972d41	Merge pull request #12051 from microsoft/danmihai1/agent-version agent: update version.rs when VERSION file changed	2025-11-09 20:34:58 -05:00
Fabiano Fidêncio	37d4eb0b77	ci: nvidia: Ensure K8S_TEST_HOST_TYPE=baremetal So the proper cleanups are performed in case something goes awry in a previous run. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-09 10:51:33 +01:00
Dan Mihai	7b10f4c72a	agent: update version.rs when VERSION file changed - version.rs gets generated from version.rs.in - version.rs.in contains values read from VERSION - so version.rs (and maybe other Agent files too) must be re-generated when the VERSION file changes Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 17:53:09 +00:00
Alex Lyn	83b0a59215	Merge pull request #12046 from Apokleos/disable-guest-emptydir Disable guest emptydir	2025-11-08 11:54:15 +08:00
Dan Mihai	df7ee2dd38	ci: k8s: AUTO_GENERATE_POLICY for cbl-mariner Auto-generate policy on cbl-mariner Hosts if the user didn't specify an AUTO_GENERATE_POLICY value. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	53acb74f26	genpolicy: adapt to new AKS pause container behavior The new image reference has changed to mcr.microsoft.com/oss/v2/kubernetes/pause:3.6 from mcr.microsoft.com/oss/kubernetes/pause:3.6. The new image uses UID=0, GID=0 by default, while the older image had UID=65535, GID=65535. There is a new pause_container_id_policy field in genpolicy-settings.json, informing genpolicy about the way AdditionalGids gets updated - "v1" for the older behavior and "v2" for the newer AKS version: - When using v1, the default value of AdditionalGids is {65535}. - When using v2, the default value of AdditionalGids is {}. UID=65535 and GID=65535 are still hard-coded by default in genpolicy-settings.json. We might be able to remove/ignore these fields in the future, if we'll stop relying on policy::KataSpec::get_process_fields to use these fields. A new CI function adapt_common_policy_settings_for_aks() changes the pause container UID, GID, pause_container_id_policy, and image ref settings values when testing on AKS Hosts - i.e., when testing coco-dev or mariner Hosts. The genpolicy workarounds for the unexpected behavior with guest pull enabled have been improved to use the current container's GID instead of hard-coding GID=0 as the guest pull default. Also, AdditionalGids gets updated when the current container's GID is changing, instead of always changing the AdditionalGids at the very end of policy::AgentPolicy::get_container_process(), when the relevant evolution of the GID value was no longer available. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	1f784bb770	genpolicy: improve policy generation comments Make it easier to understand the source of the UID/GID/AdditionalGids values from the container in the auto-generated policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	969b8e0fb8	genpolicy: more detailed UID/GID debug logs Add more details to code paths handling UID/GID values, for easier debugging. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Dan Mihai	cacd37ee6e	tests: genpolicy: restore test settings for non-Coco configMap These settings got broken recently because the non-CoCo tests were disabled for unrelated reasons. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-11-08 00:00:09 +01:00
Manuel Huber	caff6df827	deploy: Improve busybox build Parallelize busybox builds to build a bit faster and create the build directory prior to Docker execution, which on my environment, helps with permission issues when building busybox without the kata-containers/build directory existing beforehand. Signed-off-by: Manuel Huber <manuelh@nvidia.com>	2025-11-07 10:09:57 -08:00
Alex Lyn	23024876b2	runtime-rs: Use the configurable disable_guest_empty_dir Replace the hardcoded value of disable_guest_empty_dir with the real value that comes from the configuration. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:52:11 +08:00
Alex Lyn	382924bdf3	kata-sys-util: Introduce a sandbox annotation for disable guest emptydir A sandbox annotation that determines if it should create Kubernetes emptyDir mounts on the guest filesystem. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:48:42 +08:00
Alex Lyn	720a229579	kata-types: Introduce disable guest emptydir flag The flag determines whether the runtime should create Kubernetes emptyDir mounts on the guest filesystem. If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will be created on the host and shared via virtio-fs. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-11-07 19:45:55 +08:00
Fabiano Fidêncio	03e06fdf4d	tests: nvidia: Deploy Trustee Let's ensure Trustee is deployed as some of the tests rely on images that live behind authentication. /o\ The approach taken here to deploy Trustee is exactly the same one taken on the other CoCo tests, apart from an env var passed to ensure we're using the NVIDIA remote verifier (which will come in handy very very soon). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-07 12:32:11 +01:00
Pavel Mores	841fee28da	runtime-rs: add a helper to run external command and capture its output This isn't really related to remote hypervisor though it was useful for its debugging. It's a small helper I've been using regularly during development for quite some time that I think might be useful more broadly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	72c704b287	runtime-rs: make error reporting for CreateVM a bit more explicit A naked ttrpc error with no context turns out to be rather hard to understand or even notice in log. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	45d8141edc	runtime-rs: remote hv needs neither image nor initrd specified in config The remote hypervisor launches no VM, it just instructs the Cloud API Adaptor to do so, therefore it has no need for an image or initrd to boot from and should be exempt from the mandate for one or the other to be specified. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Pavel Mores	80ef102a00	runtime-rs: fix scoping of the remote hv Hypervisor service The go runtime's .proto file - which is also used by the Cloud API Adaptor - puts the Hypervisor service into the "hypervisor" package. runtime-rs has to do the same to avoid an "unimplemented" error. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-11-07 10:49:14 +01:00
Alex Lyn	d5e2071869	Merge pull request #11921 from Apokleos/enhance-copyfile2 runtime-rs: Add support LocalStorage for emptyDir within nontee cases	2025-11-07 16:58:39 +08:00
Fabiano Fidêncio	a591cda466	gatekeeper: Adjust the nvidia gpu test name With the change made to the matrix when the CC GPU runner was added, there was a change in the job name (@sprt saw that coming, but I didn't). Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	c6dc176a03	tests: nvidia: cc: Enable NIMs tests Same deal as the previous commit, just enabling the tests here, with the same list of improvements that we will need to go through in order to get it working in a perfect way. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	8ca77f2655	tests: nvidia: cc: Run CUDA vectorAdd tests on CC mode While the primary goal of this change is to detect regressions to the NVIDIA SNP GPU scenario, various improvements to reflect a more realistic CC setting are planned in subsequent changes, such as: * moving away from the overlayfs snapshotter * disabling filesystem sharing * applying a pod security policy * activating the GPUs only after attestation * using a refined approach for GPU cold-plugging without requiring annotations * revisiting pod timeout and overhead parameters (the podOverhead value was increased due to CUDA vectorAdd requiring about 6Gi of podOverhead, as well as the inference and embedqa requiring at least 12Gi, respectively, 14Gi of podOverhead to run without invoking the host's oom-killer. We will revisit this aspect after addressing points 1. and 2.) Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	25ce0afd52	kata-deploy: Allow the CDI annotation for CC GPU cases For the nvidia-gpu-snp and nvidia-gpu-tdx we must set containerd to allow the CDI annotation to be passed down. This solution may become obsolete soon enough, but the cleanest way to have it properly working is by adding it here (even if we remove it before the next release). Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Manuel Huber	c91edf884b	runtimeclasses: nvidia: Bump TEE podOverhead It's been noticed that as more RAM is needed to run the CC tests, we also need to update the podOverhead of the NVIDIA CC runtime classes to avoid getting OOM Killed. Signed-off-by: Manuel Huber <manuelh@nvidia.com> Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2025-11-06 16:28:33 +01:00
Fupan Li	aac2a37ff5	runtime-rs: enable pselect6 syscall for dragonball seccomp Since the nerdctl's network hook would call pselect6 syscall by xtables-nft-multi, thus we'd better add it to the seccomp's whitelist. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-11-06 11:17:57 +01:00
Hyounggyu Choi	ff429072b6	Merge pull request #11924 from BbolroC/fix-static-checks-actionspz ci: Fix failing static checks to enable IBM actionspz - Z specific	2025-11-06 09:04:04 +01:00
Zvonko Kaiser	fce6a75899	Merge pull request #12027 from fidencio/topic/kata-deploy-make-ALLOWED_HYPERVISOR_ANNOTATIONS-per-arch kata-deploy: Add per arch ALLOWED_HYPERVISOR_ANNOTATIONS	2025-11-05 18:20:14 -05:00
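
Commit 29fe46bc06 above describes normalizing capability names so that OCI specs from `runtime-rs` (no `CAP_` prefix) and `runtime-go` (with `CAP_` prefix) compare equal. The following is only a minimal sketch of that idea, not the genpolicy implementation: the helper names `normalize_cap` and `caps_match` are hypothetical, and treating the capability lists as unordered sets is an assumption made for illustration.

```python
def normalize_cap(name: str) -> str:
    """Return the capability name in upper case, carrying a CAP_ prefix.

    Hypothetical helper: names that already start with CAP_ are left as-is,
    names without it get the prefix added.
    """
    upper = name.strip().upper()
    return upper if upper.startswith("CAP_") else f"CAP_{upper}"


def caps_match(policy_caps, input_caps) -> bool:
    """Compare two capability lists after normalization.

    Assumption: order and duplicates are ignored (set comparison).
    """
    return {normalize_cap(c) for c in policy_caps} == {
        normalize_cap(c) for c in input_caps
    }
```

With this sketch, a policy list written as `["CAP_CHOWN", "CAP_SYS_NICE"]` matches an input list written as `["SYS_NICE", "CHOWN"]`, while lists naming different capabilities do not match.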

17215 Commits