kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-07-01 22:50:54 +00:00

Author	SHA1	Message	Date
Fabiano Fidêncio	e122d7ffb0	versions: bump containerd to 2.3 and define minimum/latest test matrix Bump the containerd version used by CI from v1.7.25 to v2.3.0. Rename the version-range fields in versions.yaml and throughout the GitHub Actions workflows from lts/active/version/sandbox_api to minimum/latest to make their meaning self-evident: minimum: "v1.7" # oldest containerd branch under test latest: "v2.3" # newest containerd branch under test Drop the bare version field (superseded by the matrix) and the sandbox_api alias (covered by latest). Update all containerd_version matrix entries in the workflow files accordingly, and update gha-run-k8s-common.sh to resolve the new key names. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Cursor <noreply@cursor.com>	2026-06-08 19:20:14 +02:00
Fabiano Fidêncio	a4138794ea	Merge pull request #13183 from fidencio/topic/kata-deploy-custom-kata-drop-in-for-default-runtimes kata-deploy: support drop-in configs for default runtimes	2026-06-08 18:44:33 +02:00
Fabiano Fidêncio	d6e1b45ce7	Merge pull request #13171 from fidencio/topic/runtime-rs-enforce-sandbox_cgroup_only-and-static_sandbox_resource_mgmt runtime-rs: default static sizing-related config flags to true	2026-06-08 17:43:37 +02:00
Fabiano Fidêncio	b119b051cb	kata-deploy: support drop-in configs for default runtimes Allow operators to provide per-shim drop-in TOML for built-in runtimes and reconcile stale override files so upgrades and migrations remain safe when drop-ins are added or removed. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Codex	2026-06-08 13:31:03 +02:00
Fabiano Fidêncio	4dc288401e	runtime-rs: make sandbox cgroup runtime attach idempotent The dragonball nerdctl CI job can race when creating and attaching the runtime process to the sandbox cgroup, surfacing an os error 17 (AlreadyExists) during shim task creation. Let's retry add_proc once on this pre-existing cgroup condition so startup remains robust. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Codex <codex@openai.com>	2026-06-08 13:11:34 +02:00
Fabiano Fidêncio	4d569c22b4	runtime-rs: enforce a minimum vsock reconnect window Low-CPU sandboxes can take longer than a few seconds to complete guest boot and start the agent. Let's clamp the reconnect timeout to a safe minimum so sandbox startup does not fail early with transient vsock ECONNRESET. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Codex <codex@openai.com>	2026-06-08 13:11:34 +02:00
Fabiano Fidêncio	ed34d7811d	runtime-rs: supplement static sizing from sandbox annotations When static sandbox resource management is enabled, CRI CPU/memory sizing may live only in sandbox annotations and be missing from the OCI spec. Let's fill missing sizing fields from annotations before applying static VM sizing so runtime-rs follows the expected Kubernetes behavior for constrained pods. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Codex <codex@openai.com>	2026-06-08 13:11:34 +02:00
Fabiano Fidêncio	e93558e810	runtime-rs: default static sizing-related config flags to true Add top-level runtime-rs Makefile options `DEFSANDBOXCGROUP_ONLY` and `DEFSTATICRESOURCEMGMT`, both defaulting to true, and use them for the runtime defaults that previously disabled these paths. This aligns runtime-rs defaults with static sandbox resource management, which sizes sandbox memory up front instead of relying on memory hotplug, helping avoid architecture-specific hotplug limitations. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-06-08 12:57:40 +02:00
Fupan Li	024c2531a5	Merge pull request #13029 from fidencio/topic/rfc-composable-vm-images docs: add composable VM images design proposal	2026-06-08 18:40:35 +08:00
Fabiano Fidêncio	9e65f85ccd	Merge pull request #13174 from stevenhorsman/cri-o-cve-false-positive runtime: ignore false positive CRI-O vulnerabilities	2026-06-08 09:13:39 +02:00
Fabiano Fidêncio	5801a87a4b	Merge pull request #13182 from fidencio/topic/tests-enable-more-tests-for-tdx-runtime-rs tests: unskip hard-coded policy tests on qemu-tdx-runtime-rs	2026-06-08 07:24:50 +02:00
Fabiano Fidêncio	2440b5940b	docs: add composable VM images design proposal Add an RFC document describing the composable image architecture that replaces monolithic guest rootfs images with a lean base image plus purpose-specific addon images cold-plugged as virtio-blk devices. The proposal covers the runtime configuration (extra_images), host-side cold-plugging, guest-side mounting via systemd and dm-verity, agent-side dynamic path resolution, the image build pipeline, and the security model. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-07 13:58:17 +02:00
Fabiano Fidêncio	57c61e0c2f	tests: unskip hard-coded policy tests on qemu-tdx-runtime-rs Enable the hard-coded init-data policy test gate for qemu-tdx-runtime-rs so runtime-rs and Go TDX variants exercise the same Kubernetes policy coverage. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-06-06 22:48:20 +02:00
Fabiano Fidêncio	43321c7a78	Merge pull request #12931 from mythi/qemu-tdx-tests tests: fix TDX runtime-rs and initdata tests	2026-06-06 11:42:19 +02:00
Fabiano Fidêncio	1ca7129581	Merge pull request #13176 from Amulyam24/kata-deploy-fix kata-deploy: add the imports directive explicitly if expected but not found	2026-06-05 22:24:16 +02:00
Fabiano Fidêncio	f6ff9578d4	Merge pull request #13161 from kata-containers/sprt/remove-configure-mariner ci: remove Mariner annotations and use new config	2026-06-05 20:22:58 +02:00
Fabiano Fidêncio	e529ca0292	Merge pull request #13170 from fidencio/topic/kata-deploy-custom-runtimes-podOverhead kata-deploy: inherit custom RuntimeClass overhead from baseConfig	2026-06-05 19:46:17 +02:00
Fabiano Fidêncio	e9ee97f751	kata-deploy: inherit custom RuntimeClass overhead from baseConfig Default custom runtime RuntimeClass overhead.podFixed to the selected baseConfig values, so equivalent runtimes behave consistently without repeating boilerplate. In case the user wants to enforce that no overhead is set on the custom RuntimeClass, disable inheritance with inheritBaseOverhead=false. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>	2026-06-05 17:22:25 +02:00
Steve Horsman	2ac6bb173b	Merge pull request #13036 from stevenhorsman/jaeger-to-otlp-tracing-switch trace-forwarder: migrate from Jaeger to OTLP exporter	2026-06-05 14:30:26 +01:00
Amulyam24	b15a5fbe36	kata-deploy: add the imports directive explicitly if expected but not found For containerd v2.2+, the flow assumes that the imports directive would be present. It is better to check it and add if it doesn't exist. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2026-06-05 18:47:07 +05:30
Mikko Ylinen	013e901f1b	tests: re-enable initdata tests for qemu-tdx The coco initdata tests signature verification and authenticated registry never worked on qemu-tdx and so they have been disabled since. Add them back now that all necessary fixes are in place. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-05 16:04:05 +03:00
Mikko Ylinen	9313e336b5	tests: set image.image_pull_proxy for CDH initdata initdata tests set kernel arguments to "" which resets the kernel arguments configured by Helm install. However, TDX runner depends on agent.https_proxy= kernel arguments to pull images. In order for initdata tests to work on TDX, the same needs to be added to CDH configuration via image.image_pull_proxy. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-05 16:04:05 +03:00
Mikko Ylinen	f3a0ef6a7c	tests: use kubectl set to configure KBS env No need to patch yamls locally. Also, set RUST_LOG=debug and enable https_proxy for all TDX targets when the runner has HTTPS_PROXY set. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-05 16:04:05 +03:00
stevenhorsman	7033d56e2c	runtime: ignore false positive CRI-O vulnerabilities Add osv-scanner ignores for GO-2025-3426 (CVE-2025-0750) and GO-2025-3897 (CVE-2025-4437), which are false positives for kata-containers. The vulnerabilities have been open for 10 and 16 months and there is no indication that the cri-o community have any intention of addressing the situation. They also only affect the main CRI-O runtime code (log management and user creation functions), but kata-containers only imports github.com/cri-o/cri-o/pkg/annotations for string constant definitions. The vulnerable code paths are not imported or used, therefore we should just filter these out. GO-2025-3426: Path traversal in UnMountPodLogs/LinkContainerLogs GO-2025-3897: Memory exhaustion when reading /etc/passwd Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-05 10:08:06 +01:00
Steve Horsman	1624ebe362	Merge pull request #13135 from kata-containers/dependabot/cargo/tar-0.4.46 build(deps): bump tar from 0.4.45 to 0.4.46	2026-06-05 09:44:46 +01:00
stevenhorsman	b737ae48bf	trace-forwarder: migrate from Jaeger to OTLP exporter Migrate trace-forwarder from the deprecated opentelemetry-jaeger exporter to the modern opentelemetry-otlp exporter. This change remediates GHSA-2f9f-gq7v-9h6m (CVE-2026-43868), a medium-severity vulnerability in Apache Thrift. The opentelemetry-jaeger crate is no longer maintained and depends on vulnerable thrift versions (0.13.0 and 0.16.0). The opentelemetry-otlp exporter does not use thrift and is actively maintained. Changes: - Replace opentelemetry-jaeger with opentelemetry-otlp in Cargo.toml - Update tracer.rs to use OTLP exporter instead of Jaeger exporter - Replace --jaeger-host/--jaeger-port flags with --otlp-endpoint flag - Update server.rs to use TracerProvider instead of SpanExporter - Update documentation to reflect OTLP migration - Add examples for common OTLP-compatible collectors Breaking change: Users must update their trace-forwarder invocations to use --otlp-endpoint instead of --jaeger-host and --jaeger-port. Default endpoint: http://localhost:4317 (OTLP gRPC) Generated-by: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2026-06-04 19:39:47 +01:00
Dan Mihai	c78ccc2e9f	Merge pull request #13088 from kata-containers/dependabot/cargo/openssl-0.10.80 build(deps): bump openssl from 0.10.79 to 0.10.80	2026-06-04 11:38:08 -07:00
Fabiano Fidêncio	743b0a4839	Merge pull request #13165 from stevenhorsman/bump-go-to-1.25.11 versions: bump golang to 1.25.11	2026-06-04 20:24:57 +02:00
Fabiano Fidêncio	cd21b7b607	Merge pull request #13156 from fidencio/topic/runtime-rs-shim-leftover-on-failure runtime-rs: shut down shim daemon on a failed create	2026-06-04 20:09:28 +02:00
Fabiano Fidêncio	354b85784c	Merge pull request #13166 from stevenhorsman/required-tests/remote-kata-monitor ci: Remove kata-monitor test from required	2026-06-04 20:04:15 +02:00
stevenhorsman	81c7dde0ae	ci: Remove kata-monitor test from required The kata-monitor test is currently failing and is running a very EoL version of cri-o. This area is being actively reworked in #13107, so remove this and then once kata-monitor tests are stable we can re-add the new versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-04 14:40:17 +01:00
Fabiano Fidêncio	80e2473440	runtime-rs: shut down shim daemon on a failed create When CreateContainer fails before the runtime instance is registered (e.g. a hypervisor/cgroup error), no sandbox exists to drive the normal teardown. containerd's follow-up Shutdown RPC then reaches get_runtime_instance(), fails with "runtime not ready", and returns before the service loop is ever told to stop. Because the shim ignores SIGTERM, the containerd-shim-kata-v2 daemon is left running and orphaned. Make the Shutdown RPC force the daemon to exit when there is no runtime instance, emitting the same Action::Shutdown that sandbox.shutdown() sends on the normal path. This guarantees the shim process is reaped after a failed create instead of leaking. Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com> Assisted-by: Cursor <noreply@cursor.com>	2026-06-04 14:12:01 +02:00
Fabiano Fidêncio	2a1ce7b8c4	Merge pull request #12539 from mythi/no-vcpu-hotplug Disable CPU hotplug when confidential guest setting enabled	2026-06-04 10:56:52 +02:00
dependabot[bot]	4ab63d0a5d	build(deps): bump tar from 0.4.45 to 0.4.46 Bumps [tar](https://github.com/composefs/tar-rs) from 0.4.45 to 0.4.46. - [Release notes](https://github.com/composefs/tar-rs/releases) - [Commits](https://github.com/composefs/tar-rs/compare/0.4.45...0.4.46) --- updated-dependencies: - dependency-name: tar dependency-version: 0.4.46 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-06-04 07:52:44 +00:00
dependabot[bot]	d155f1a4ab	build(deps): bump openssl from 0.10.79 to 0.10.80 Bumps [openssl](https://github.com/rust-openssl/rust-openssl) from 0.10.79 to 0.10.80. - [Release notes](https://github.com/rust-openssl/rust-openssl/releases) - [Commits](https://github.com/rust-openssl/rust-openssl/compare/openssl-v0.10.79...openssl-v0.10.80) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.80 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-06-04 07:51:50 +00:00
stevenhorsman	879912be25	versions: bump golang to 1.25.11 Bump the go version to resolve CVEs: - GO-2026-5037 - GO-2026-5038 - GO-2026-5039 Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-04 08:49:17 +01:00
Steve Horsman	53c1a627e4	Merge pull request #13143 from stevenhorsman/x/net-0.55-bump bump golang.org/x/dependencies	2026-06-03 16:46:08 +01:00
Aurélien Bombo	de5333f275	ci: remove Mariner annotations and use new config This is a follow-up to #13126 where we forgot to remove this now-unused code. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2026-06-03 09:25:12 -05:00
Mikko Ylinen	018389cb22	tests: enable k8s-sandbox-vcpus-allocation.bats for tdx and coco-dev k8s-sandbox-vcpus-allocation.bats was disabled for qemu-tdx due to errors when moving to use "upstream" TDX KVM code. The failing test is vcpus-less-than-one-with-no-limits pod which ends up getting x86 default MaxCPU = 240 and erroring: Number of hotpluggable cpus requested (240) exceeds the maximum cpus supported by KVM (224) TDX max vcpus is capped to host's logical CPUs so 240 is too much. With the maxcpus logic fixed (=maxcpus not set at all) for configurations where confidential guest is enabled, qemu-tdx can be enabled for k8s-sandbox-vcpus-allocation.bats again. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-03 15:27:35 +03:00
Mikko Ylinen	e475d870fb	runtime: qemu: don't set maxcpus when confidential guest is enabled QEMU maxcpus enables CPU hotplug capabilities but it's unused when confidential guest is enabled. Change Go runtime code to skip setting maxcpus QEMU cmdline if CPU hotplug is not needed. Commit `07db945b09` built a relationship between kernel's cmdline nr_cpus and the maxcpus config. Now that maxcpus is dropped for confidential guests, drop nr_cpus from kernel commandline too. This hopefully helps with the reference values computation too. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-03 15:27:35 +03:00
Mikko Ylinen	2e625d0bab	runtime-rs: qemu: don't set maxcpus when confidential guest is enabled QEMU maxcpus enables CPU hotplug capabilities but it's unused when confidential guest is enabled. Change runtime-rs code to skip setting maxcpus QEMU cmdline if CPU hotplug is not needed. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2026-06-03 15:27:35 +03:00
stevenhorsman	51eee428f4	testing/webhook: bump golang.org/x dependencies Bump golang.org/x/net from v0.53.0 to v0.55.0 and golang.org/x/sys from v0.43.0 to v0.44.0 to resolve CVEs: - GO-2026-5024 - GO-2026-5025 - GO-2026-5026 - GO-2026-5027 - GO-2026-5028 - GO-2026-5029 - GO-2026-5030 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-03 09:56:54 +01:00
stevenhorsman	144ab161f1	tests: bump golang.org/x/sys dependency Bump golang.org/x/sys from v0.19.0 to v0.44.0 to resolve CVE: - GO-2026-5024 Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-03 09:56:54 +01:00
stevenhorsman	46d704a7ab	log-parser: bump golang.org/x/sys dependency Bump golang.org/x/sys from v0.1.0 to v0.44.0 to resolve CVE: - GO-2026-5024 Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-03 09:56:54 +01:00
stevenhorsman	08ab789d9a	csi-kata-directvolume: bump golang.org/x dependencies Bump golang.org/x/net from v0.53.0 to v0.55.0 and golang.org/x/sys from v0.43.0 to v0.44.0 to resolve CVEs: - GO-2026-5024 - GO-2026-5025 - GO-2026-5026 - GO-2026-5027 - GO-2026-5028 - GO-2026-5029 - GO-2026-5030 Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-03 09:56:54 +01:00
stevenhorsman	c0f549860e	runtime: bump golang.org/x dependencies Bump golang.org/x/net from v0.53.0 to v0.55.0 and golang.org/x/sys from v0.43.0 to v0.44.0 to resolve CVEs: - GO-2026-5024 - GO-2026-5025 - GO-2026-5026 - GO-2026-5027 - GO-2026-5028 - GO-2026-5029 - GO-2026-5030 Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob	2026-06-03 09:56:54 +01:00
Fabiano Fidêncio	a2bb3f64b0	Merge pull request #12436 from mythi/tdx-updates-2026-3 runtime(-rs): tdx: use TDX QGS via unix-domain-socket by default	2026-06-03 08:50:26 +02:00
Fabiano Fidêncio	ecd9344dd1	Merge pull request #13144 from stevenhorsman/bump-rust-to-1.94 Bump rust to 1.94	2026-06-02 09:58:56 +02:00
Fabiano Fidêncio	230e01b04e	Merge pull request #13126 from kata-containers/topic/runtimes-introduce-azure-specific-configs runtime/runtime-rs: introduce Azure specific configs	2026-06-02 09:17:09 +02:00
stevenhorsman	b1928cc22f	runtime-rs: run cargo fmt for Rust 1.94 Run cargo fmt on runtime-rs to ensure consistent formatting with Rust 1.94 toolchain. Signed-off-by: stevenhorsman <steven@uk.ibm.com> Generated-By: IBM Bob Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2026-06-01 17:32:06 +01:00
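
Two of the runtime-rs commits listed above describe small, self-contained behaviors: clamping the vsock reconnect timeout to a safe minimum (4d569c22b4) and retrying the sandbox cgroup attach once on an AlreadyExists error (4dc288401e). The sketch below illustrates both ideas in isolation. It is not the code from those commits: the function names, the `MIN_RECONNECT_TIMEOUT` constant, and its 10-second value are hypothetical, and the attach step is modelled as a plain closure rather than the real `add_proc` call.

```rust
use std::io;
use std::time::Duration;

/// Hypothetical floor for the reconnect window. The log does not state
/// the value that runtime-rs actually uses.
const MIN_RECONNECT_TIMEOUT: Duration = Duration::from_secs(10);

/// Clamp a configured reconnect timeout so that it never drops below
/// the floor. Larger configured values pass through unchanged.
fn effective_reconnect_timeout(configured: Duration) -> Duration {
    configured.max(MIN_RECONNECT_TIMEOUT)
}

/// Run `attach`. If the first attempt fails with `AlreadyExists`
/// (os error 17 on Linux), run it exactly one more time and return that
/// result. Any other outcome of the first attempt is returned as-is.
fn attach_with_single_retry<F>(mut attach: F) -> io::Result<()>
where
    F: FnMut() -> io::Result<()>,
{
    match attach() {
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => attach(),
        other => other,
    }
}
```

With these assumptions, a configured timeout of 3 seconds is raised to the floor while a 30-second timeout is kept, and an attach closure that fails once with `AlreadyExists` is invoked a second time. Retrying only once keeps a persistent failure visible instead of looping on it.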

19263 Commits