kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-18 07:58:36 +00:00

Author	SHA1	Message	Date
Aurélien Bombo	6d96875d04	runtime: virtio-fs: Support "metadata" cache mode The Rust virtiofsd supports a "metadata" cache mode [1] that wasn't present in the C version [2], so this PR adds support for that. [1] https://gitlab.com/virtio-fs/virtiofsd [2] https://qemu.weilnetz.de/doc/5.1/tools/virtiofsd.html#cmdoption-virtiofsd-cache Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-08-07 21:24:40 +08:00
Pavel Mores	69f21692ed	runtime-rs: enable vcpu allocation tests in CI This series should make runtime-rs's vcpu allocation behaviour match the behaviour of runtime-go so we can now enable pertinent tests which were skipped so far due to the difference between both shims. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	00bfa3fa02	runtime-rs: re-adjust config after modifying it with annotations Configuration information is adjusted after loading from file but so far, there has been no similar check for configuration coming from annotations. This commit introduces re-adjusting config after annotations have been processed. A small refactor was necessary as a prerequisite which introduces function TomlConfig::adjust_config() to make it easier to invoke the adjustment for a whole TomlConfig instance. This function is analogous to the existing validate() function. The immediate motivation for this change is to make sure that 0 in "default_vcpus" annotation will be properly adjusted to 1 as is the case if 0 is loaded from a config file. This is required to match the golang runtime behaviour. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	e2156721fd	runtime-rs: add tests to exercise floating-point 'default_vcpus' Also included (as commented out) is a test that does not pass although it should. See source code comment for explanation why fixing this seems beyond the scope of this PR. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	1f95d9401b	runtime-rs: change representation of default_vcpus from i32 to f32 This commit focuses purely on the formal change of type. If any subsequent changes in semantics are needed they are purposely avoided here so that the commit can be reviewed as a 100% formal and 0% semantic change. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
Pavel Mores	cdc0eab8e4	runtime-rs: make sandbox vcpu allocation more accurate This commit addresses a part of the same problem as PR #7623 did for the golang runtime. So far we've been rounding up individual containers' vCPU requests and then summing them up which can lead to allocation of excess vCPUs as described in the mentioned PR's cover letter. We address this by reversing the order of operations, we sum the (possibly fractional) container requests and only then round up the total. We also align runtime-rs's behaviour with runtime-go in that we now include the default vcpu request from the config file ('default_vcpu') in the total. We diverge from PR #7623 in that `default_vcpu` is still treated as an integer (this will be a topic of a separate commit), and that this implementation avoids relying on 32-bit floating point arithmetic as there are some potential problems with using f32. For instance, some numbers commonly used in decimal, notably all of single-decimal-digit numbers 0.1, 0.2 .. 0.9 except 0.5, are periodic in binary and thus fundamentally not representable exactly. Arithmetics performed on such numbers can lead to surprising results, e.g. adding 0.1 ten times gives 1.0000001, not 1, and taking a ceil() results in 2, clearly a wrong answer in vcpu allocation. So instead, we take advantage of the fact that container requests happen to be expressed as a quota/period fraction so we can sum up quotas, fundamentally integral numbers (possibly fractional only due to the need to rewrite them with a common denominator) with much less danger of precision loss. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-08-07 10:32:44 +02:00
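The precision argument in the commit message above can be illustrated with a minimal sketch. This is not the runtime-rs implementation: `CpuRequest`, `sandbox_vcpus` and `naive_f32_vcpus` are hypothetical names, and using the least common multiple of the periods as the common denominator is an assumption made for the example. It sums quotas as integers and rounds up once, then contrasts that with summing `f32` fractions.

```rust
/// Hypothetical CPU request of one container, as a CFS quota/period pair.
struct CpuRequest {
    quota: u64,  // microseconds of CPU time allowed per period
    period: u64, // period length in microseconds, assumed non-zero
}

fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn lcm(a: u64, b: u64) -> u64 {
    a / gcd(a, b) * b
}

/// Rewrite every quota over a common period, sum the quotas as integers,
/// add the configured default, and round up only once at the end.
fn sandbox_vcpus(requests: &[CpuRequest], default_vcpus: u64) -> u64 {
    let common = requests.iter().fold(1, |acc, r| lcm(acc, r.period));
    let quota_sum: u64 = requests
        .iter()
        .map(|r| r.quota * (common / r.period))
        .sum();
    let total = quota_sum + default_vcpus * common;
    (total + common - 1) / common // integer ceiling division
}

/// For contrast only: sum the fractions in f32, then take ceil().
fn naive_f32_vcpus(requests: &[CpuRequest]) -> u64 {
    let sum: f32 = requests
        .iter()
        .map(|r| r.quota as f32 / r.period as f32)
        .sum();
    sum.ceil() as u64
}

fn main() {
    // Ten containers, each requesting 0.1 CPU.
    let requests: Vec<CpuRequest> = (0..10)
        .map(|_| CpuRequest { quota: 10_000, period: 100_000 })
        .collect();
    println!("integer quotas: {}", sandbox_vcpus(&requests, 0));
    println!("f32 fractions:  {}", naive_f32_vcpus(&requests));
}
```

With ten requests of 0.1 CPU each, the integer version yields 1 vCPU, while the `f32` sum lands slightly above 1 and rounds up to 2, which is the surprising result the commit message describes.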
Christophe de Dinechin	ec480dc438	qemu: Respect the JSON schema for hot plug When hot-plugging CPUs on QEMU, we send a QMP command with JSON arguments. QEMU 9.2 recently became more strict[1] enforcing the JSON schema for QMP parameters. As a result, running Kata Containers with QEMU 9.2 results in a message complaining that the core-id parameter is expected to be an integer: ``` qmp hotplug cpu, cpuID=cpu-0 socketID=1, error: QMP command failed: Invalid parameter type for 'core-id', expected: integer ``` Fix that by changing the core-id, socket-id and thread-id to be integer values. [1]: `be93fd5372` Fixes: #11633 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2025-08-07 09:13:57 +02:00
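As a rough illustration of the fix described above, the sketch below builds a CPU hot-plug command whose topology IDs are JSON integers rather than strings. It is not the Kata runtime code, which is written in Go: the function name `cpu_hotplug_command` is hypothetical, the use of the QMP `device_add` command with a `driver` argument is an assumption about the message shape, and the string arguments are assumed to need no JSON escaping.

```rust
/// Build a QMP `device_add` command for hot-plugging one vCPU.
/// `socket-id`, `core-id` and `thread-id` are emitted without quotes,
/// so they are JSON integers, which is what the stricter schema expects.
fn cpu_hotplug_command(
    driver: &str,
    cpu_id: &str,
    socket_id: u32,
    core_id: u32,
    thread_id: u32,
) -> String {
    format!(
        r#"{{"execute":"device_add","arguments":{{"driver":"{driver}","id":"{cpu_id}","socket-id":{socket_id},"core-id":{core_id},"thread-id":{thread_id}}}}}"#
    )
}

fn main() {
    // The driver name here is an illustrative value only.
    let cmd = cpu_hotplug_command("host-x86_64-cpu", "cpu-0", 1, 0, 0);
    println!("{cmd}");
}
```

A string-typed variant would emit `"core-id":"0"` instead, which is the form that triggers the "expected: integer" error quoted in the commit message.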
Alex Lyn	37685c41c7	runtime-rs: Correct the corresponding initdata annotation const Since we have changed the initdata annotation definition, we also need to correct its const definition with KATA_ANNO_CFG_RUNTIME_INIT_DATA accordingly. Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-07 10:45:28 +08:00
Alex Lyn	163f04a918	Merge pull request #11651 from microsoft/danmihai1/debug-kubectl-logs tests: k8s-sandbox-vcpus-allocation debug info	2025-08-07 10:27:29 +08:00
Aurélien Bombo	e3b4d87b6d	ci: static-checks: add SECURITY.md to exclude list This adds SECURITY.md to the list of GH-native files that should be excluded by the reference checker. Today this is useful for downstreams who already have a SECURITY.md file for compliance reasons. When Kata onboards that file, this commit will also be required. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-08-06 11:24:52 -05:00
Markus Rudy	3eb0641431	genpolicy: add rule for AddARPNeighbors When the network interface provisioned by the CNI has static ARP table entries, the runtime calls AddARPNeighbor to propagate these to the agent. As of today, these calls are simply rejected. In order to allow the calls, we do some sanity checks on the arguments: We must ensure that we don't unexpectedly route traffic to the host that was not intended to leave the VM. In a first approximation, this applies to loopback IPs and devices. However, there may be other sensitive ranges (for example, VPNs between VMs), so there should be some flexibility for users to restrict this further. This is why we introduce a setting, similar to UpdateRoutes, that allows restricting the neighbor IPs further. The only valid state of an ARP neighbor entry is NUD_PERMANENT, which has a value of 128 [1]. This is already enforced by the runtime. According to rtnetlink(7), valid flag values are 8 and 128, respectively [2], thus we allow any combination of these. [1]: https://github.com/torvalds/linux/blob/4790580/include/uapi/linux/neighbour.h#L72 [2]: https://github.com/torvalds/linux/blob/4790580/include/uapi/linux/neighbour.h#L49C20-L53 Fixes: #11664 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-08-06 17:24:36 +02:00
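The sanity checks listed in the commit message above can be sketched as a single predicate. This is an illustration only, not the genpolicy rule itself: the `ArpNeighbor` struct, the `neighbor_allowed` function and the `forbidden_ips` list are hypothetical, and the user-configurable restriction is reduced to exact IP matches for brevity. The constants mirror the values 128 and 8 cited from `neighbour.h`.

```rust
use std::net::IpAddr;

const NUD_PERMANENT: u32 = 0x80; // 128, the only accepted neighbor state
const NTF_PROXY: u32 = 0x08; // 8
const NTF_ROUTER: u32 = 0x80; // 128

/// Hypothetical, simplified view of one ARP neighbor entry.
struct ArpNeighbor {
    to_ip: IpAddr,
    device: String,
    state: u32,
    flags: u32,
}

/// Return true if the entry passes the checks described above.
/// `forbidden_ips` stands in for a user-supplied restriction setting.
fn neighbor_allowed(n: &ArpNeighbor, forbidden_ips: &[IpAddr]) -> bool {
    let allowed_flags = NTF_PROXY | NTF_ROUTER;
    n.state == NUD_PERMANENT
        && n.flags & !allowed_flags == 0 // any combination of 8 and 128
        && !n.to_ip.is_loopback()
        && n.device != "lo"
        && !forbidden_ips.contains(&n.to_ip)
}

fn main() {
    let entry = ArpNeighbor {
        to_ip: "10.0.0.2".parse().unwrap(),
        device: "eth0".to_string(),
        state: NUD_PERMANENT,
        flags: NTF_ROUTER,
    };
    println!("allowed: {}", neighbor_allowed(&entry, &[]));
}
```

Masking with the complement of the allowed bits accepts 0, 8, 128 and 136 while rejecting any other flag bit.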
Zvonko Kaiser	1b1b3af9ab	ci: Remove trigger for stable branch We do not support stable branches anymore, remove the trigger for it. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-08-06 09:22:24 +08:00
Hyounggyu Choi	af01434226	Merge pull request #11646 from kata-containers/sprt/param-static-checks ci: static-checks: Auto-detect repo by default	2025-08-05 22:13:20 +02:00
Alex Lyn	ede773db17	kata-types: Align the initdata annotation with kata-runtime's definition To make it work within CI, we do alignment with kata-runtime's definition with "io.katacontainers.config.runtime.cc_init_data". Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2025-08-03 22:51:39 +08:00
Dan Mihai	05eca5ca25	tests: k8s-sandbox-vcpus-allocation debug info Print more details about the behavior of "kubectl logs", trying to understand errors like: https://github.com/kata-containers/kata-containers/actions/runs/16662887973/job/47164791712 not ok 1 Check the number vcpus are correctly allocated to the sandbox (in test file k8s-sandbox-vcpus-allocation.bats, line 37) `[ `kubectl logs ${pods[$i]}` -eq ${expected_vcpus[$i]} ]' failed with status 2 No resources found in kata-containers-k8s-tests namespace. ... k8s-sandbox-vcpus-allocation.bats: line 37: [: -eq: unary operator expected Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-08-01 20:09:17 +00:00
Aurélien Bombo	c47bff6d6a	Merge pull request #11637 from kata-containers/sprt/remove-install-az-cli gha: Remove unnecessary install-azure-cli step	2025-08-01 09:34:46 -05:00
Fabiano Fidêncio	82f141a02e	Merge pull request #11632 from burgerdev/codegen runtime: reproducible generation of Golang proto bindings	2025-07-31 23:49:18 +02:00
Fabiano Fidêncio	7198c8789e	Merge pull request #11639 from zvonkok/gpu_guest_components gpu: guest components	2025-07-31 21:42:31 +02:00
Aurélien Bombo	9585e608e5	ci: static-checks: Auto-detect repo by default This auto-detects the repo by default (instead of having to specify KATA_DEV_MODE=true) so that forked repos can leverage the static-checks.yaml CI check without modification. An alternative would have been to pass the repo in static-checks.yaml. However, because of the matrix, this would've changed the check name, which is a pain to handle in either the gatekeeper/GH UI. Example fork failure: https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142421739#step:8:75 I've tested this change to work in a fork. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-31 14:33:24 -05:00
Zvonko Kaiser	8422411d91	gpu: Add coco guest components The second stage needs to consider the coco guest components Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-31 17:11:21 +00:00
Markus Rudy	3fd354b991	ci: add codegen to static-checks Signed-off-by: Markus Rudy <mr@edgeless.systems> Fixes: #11631 Co-authored-by: Steve Horsman <steven@uk.ibm.com>	2025-07-31 17:58:25 +01:00
Markus Rudy	9e38fd2562	tools: add image for Go proto bindings In order to have a reproducible code generation process, we need to pin the versions of the tools used. This is accomplished easiest by generating inside a container. This commit adds a container image definition with fixed dependencies for Golang proto/ttrpc code generation, and changes the agent Makefile to invoke the update-generated-proto.sh script from within that container. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-31 17:58:25 +01:00
Markus Rudy	f7a36df290	runtime: generate proto files The generated Go bindings for the agent are out of date. This commit was produced by running src/agent/src/libs/protocols/hack/update-generated-proto.sh with protobuf compiler versions matching those of the last run, according to the generated code comments. Since there are new RPC methods, those needed to be added to the HybridVSockTTRPCMockImp. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-31 17:58:25 +01:00
Fabiano Fidêncio	d077ed4c1e	Merge pull request #11645 from kata-containers/topic/fix-kbuild-sign-pin-issue build: nvidia: Fix KBUILD_SIGN_PIN breakage	2025-07-31 18:31:34 +02:00
Fabiano Fidêncio	8d30b84abd	build: nvidia: Fix KBUILD_SIGN_PIN breakage We only need KBUILD_SIGN_PIN exported when building nvidia related artefacts. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-31 16:39:20 +02:00
Fabiano Fidêncio	20bef41347	Merge pull request #11236 from kata-containers/amd64-nvidia-gpu-cicd gpu: AMD64 NVIDIA GPU CI/CD	2025-07-31 14:52:01 +02:00
Aurélien Bombo	96f1d95de5	gha: Remove unnecessary install-azure-cli step az cli is already installed by the azure/login action. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-30 10:42:56 -05:00
Zvonko Kaiser	fbb0e7f2f2	gpu: Add secrets passthrough to the workflow We need to pass-through the secrets in all the needed workflows ci, ci-on-push, ci-nightly, ci-devel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:51:01 +00:00
Zvonko Kaiser	30778594d0	gpu: Add arm64-nvidia-a100 to actionlint.yaml Make zizmor happy about our custom runner label Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	8768e08258	gpu: Add embedding service For a simple RAG pipeline add an embedding service Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	254dbd9b45	gpu: Add Pod spec for NIM llama Pod spec for the NIM inferencing service Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	568b13400a	gpu: Add NIM bats test We're running a simple NIM container to test if the GPUs are working properly Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	6188b7f79f	gpu: Add run_kubernetes_nv_tests.sh Replicate what we have for run_tests and run .bats files Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	9a829107ba	gpu: Add selector for k8s tests We want to reuse the current run_tests with GPUs, introduce a var that will define what to run. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	7669f1fbd1	gpu: Add NVIDIA GPU test block for amd64 Once we have the amd64 artifacts we can run some arm64 k8s tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:59 +00:00
Zvonko Kaiser	97d7575d41	gpu: Disable metrics tests We are not running the metrics tests anyway for now, so let's make room to run the GPU tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-30 13:45:58 +00:00
Anastassios Nanos	00e0db99a3	Merge pull request #11627 from itsmohitnarayan/FirecrackerVersionUpdate	2025-07-30 13:59:55 +03:00
Kumar Mohit	5cccbb9f41	versions: Upgrade Firecracker Version to 1.12.1 Updated versions.yaml to use Firecracker v1.12.1. Replaced firecracker and jailer binaries under /opt/kata/bin. Tested with kata-fc runtime on Kubernetes: - Deployed pods using gitpod/openvscode-server - Verified microVM startup, container access, and Firecracker usage - Confirmed Firecracker and jailer versions via CLI Signed-off-by: Kumar Mohit <68772712+itsmohitnarayan@users.noreply.github.com>	2025-07-30 12:51:08 +05:30
Saul Paredes	1aaaef2134	Merge pull request #11553 from microsoft/danmihai1/genpolicy-cleanup genpolicy: reduce complexity	2025-07-28 14:32:59 -07:00
Dan Mihai	c11c972465	genpolicy: config layer logging clean-up Use a simple debug!() for logging the config_layer string, instead of transcoding, etc. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	30bfa2dfcc	genpolicy: use CoCo settings by default - "confidential_emptyDir" becomes "emptyDir" in the settings file. - "confidential_configMap" becomes "configMap" in settings. - "mount_source_cpath" becomes "cpath". - The new "root_path" gets used instead of the old "cpath" to point to the container root path.. - "confidential_guest" is no longer used. By default it gets replaced by "enable_configmap_secret_storages"=false, because CoCo is using CopyFileRequest instead of the Storage data structures for ConfigMap and/or Secret volume mounts during CreateContainerRequest. - The value of "guest_pull" becomes true by default. - "image_layer_verification" is no longer used - just CoCo's guest pull is supported. - The Request input files from unit tests are changing to reflect the new default settings values described above. - tests/integration/kubernetes/tests_common.sh adjusts the settings for platforms that are not set-up for CoCo during CI (i.e., platforms other than SNP, TDX, and CoCo Dev). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	94995d7102	genpolicy: skip pulling layers for guest-pull Skip pulling container image layers when guest-pull=true. The contents of these layers were ignored due to: - #11162, and - tarfs snapshotter support having been removed from genpolicy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	f6016f4f36	genpolicy: remove tarfs snapshotter support AKS Confidential Containers are using the tarfs snapshotter. CoCo upstream doesn't use this snapshotter, so remove this Policy complexity from upstream. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:10 +00:00
Steve Horsman	077c59dd1f	Merge pull request #11385 from wainersm/ci_make_coco_nontee_required ci/gatekeeper: make run-k8s-tests-coco-nontee job required	2025-07-28 14:16:23 +01:00
Steve Horsman	74fba9c736	Merge pull request #11619 from kata-containers/install-dependencies-gh-cli ci: Try passing api token into github api call	2025-07-28 13:35:12 +01:00
Xuewei Niu	2a3c8b04df	Merge pull request #11613 from RuoqingHe/clippy-fix-for-libs-20250721 mem-agent: Ignore Cargo.lock	2025-07-28 17:45:29 +08:00
RuoqingHe	3f46347dc5	Merge pull request #11618 from RuoqingHe/fix-dragonball-default-build dragonball: Fix warnings in default build	2025-07-28 11:24:46 +08:00
Xuewei Niu	e5d5768c75	Merge pull request #11626 from RuoqingHe/bump-cloud-hypervisor-v47 versions: Upgrade to Cloud Hypervisor v47.0	2025-07-28 10:34:45 +08:00
Ruoqing He	4ca6c2d917	mem-agent: Ignore Cargo.lock `mem-agent` here is now a library and does not contain examples, so ignore Cargo.lock to get rid of untracked file noise produced by `cargo run` or `cargo test`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-28 10:32:46 +08:00
Ruoqing He	3ec10b3721	runtime: clh: Re-generate client code against v47.0 Re-generates the client code against Cloud Hypervisor v47.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 20:44:14 +02:00

16620 Commits