kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-17 15:38:00 +00:00

Author	SHA1	Message	Date
alex.lyn	548f252bc4	runtime-rs: bugfix incorrect use of refcount before vfio attach When a pod has multiple containers, there may be more than 2 attach points. In that case we should not return Err when doing attach ops, but just return Ok. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-01 11:28:57 +08:00
Alex Lyn	aa9cd232cd	Merge pull request #9358 from GabyCT/topic/nerdrandom gha: Update journal log names for nerdctl artifacts	2024-04-01 09:50:16 +08:00
Alex Lyn	dfa8832406	Merge pull request #9345 from c3d/bug/9342-agent-test-errors agent: Fix errors in `make check`	2024-04-01 09:48:44 +08:00
Dan Mihai	3a7dbcfc17	Merge pull request #9367 from microsoft/danmihai1/infinite-io-stream-copy-loop runtime: remove stream copy infinite loop	2024-03-29 09:37:44 -07:00
Dan Mihai	600f9266f3	runtime: remove stream copy infinite loop This reverts commit `1c5693be86`. Avoid apparent infinite loop when ReadStreamRequest is blocked by policy - for some of the pods. When running the k8s-limit-range.bats test with Policy enabled, the Shim + VMM never get terminated on my cluster. Not sure why the sandbox clean-up works better for other tests, but the k8s-limit-range test pod gets stuck in an infinite loop: stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... Fixes: #9380 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-28 22:43:28 +00:00
James O. D. Hunt	13966f4d1d	docs: kata-manager: Add help for permissions issue The 3.3.0 release installs the `kata-manager` script with overly restrictive permissions (see #9373), so add details to help users handle the situation. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	5589e4e291	docs: kata-manager: Update with latest details Now that v3.3.0 has been released, simplify the `kata-manager` documentation. Fixes: #9227. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	52fe60c94b	docs: kata-manager: Fix heading levels Add an extra heading indent so that there is only a single top-level heading. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:21:31 +00:00
Dan Mihai	ebb26edf42	Merge pull request #9347 from microsoft/danmihai1/reduce-exec-test-policy-prints genpolicy: reduce policy debug prints	2024-03-27 15:12:10 -07:00
Gabriela Cervantes	a32418bf32	versions: Remove runc version information This PR removes the runc version information as this is no longer being used in the kata containers scripts. Fixes #9364 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-27 20:32:38 +00:00
Steve Horsman	b3acbe0b7f	Merge pull request #8046 from fitzthum/clean-config runtime: remove unimplemented CoCo configurations	2024-03-27 19:39:48 +00:00
Tobin Feldman-Fitzthum	04d021bd12	packaging: remove SERVICEOFFLOAD option Since we're removing the unused service_offload parameter, don't set it in any of the packaging scripts. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	9856fe5bea	runtime: remove ServiceOffload parameter Since we no longer use the service_offload configuration, remove the ServiceOffload field from the image struct. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	a18c7ca307	runtime: remove unimplemented CoCo configurations These experimental options were added 2 years ago in anticipation of features that would be added in CoCo. These do not match the features that were eventually added and will soon be ported to main. Fixes: #8047 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:06 -05:00
Steve Horsman	53fa1fd82d	Merge pull request #9349 from fidencio/topic/ci-k8s-update-cpuid k8s: confidential: Update cpuid to its latest release	2024-03-27 16:57:36 +00:00
Chengyu Zhu	e66a5cb54d	Merge pull request #9332 from ChengyuZhu6/guest-pull-timeout Support to set timeout to pull large image in guest	2024-03-28 00:34:08 +08:00
Christophe de Dinechin	82c4079fd0	agent: Remove useless loop This is the report from `make check`: ``` error: this loop never actually loops --> src/signal.rs:147:9 \| 147 \| / loop { 148 \| \| select! { 149 \| \| _ = handle => { 150 \| \| println!("INFO: task completed"); ... \| 156 \| \| } 157 \| \| } \| \|_________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop = note: `#[deny(clippy::never_loop)]` on by default ``` There is only one option: you get something or a timeout. You never retry, so the report is correct. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	df5c88cdf0	agent: Remove lint error about `.flatten` running forever The lint report is the following: ``` error: `flatten()` will run forever if the iterator repeatedly produces an `Err` --> src/rpc.rs:1754:10 \| 1754 \| .flatten() \| ^^^^^^^^^ help: replace with: `map_while(Result::ok)` \| note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error --> src/rpc.rs:1752:5 \| 1752 \| / reader 1753 \| \| .lines() \| \|________________^ = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok = note: `-D clippy::lines-filter-map-ok` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]` ``` This commit simply applies the suggestion. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	bfb55312be	agent: Fix `.enumerate` errors during `make check` Running `make check` in the `src/agent` directory gives: ``` error: you seem to use `.enumerate()` and immediately discard the index --> rustjail/src/mount.rs:572:27 \| 572 \| for (_index, line) in reader.lines().enumerate() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index = note: `-D clippy::unused-enumerate-index` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]` help: remove the `.enumerate()` call \| 572 \| for line in reader.lines() { \| ~~~~ ~~~~~~~~~~~~~~ Checking tokio-native-tls v0.3.1 Checking hyper-tls v0.5.0 Checking reqwest v0.11.18 error: could not compile `rustjail` (lib) due to 1 previous error warning: build failed, waiting for other jobs to finish... make: *** [../../utils.mk:177: standard_rust_check] Error 101 ``` Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Greg Kurz	e1068da1a0	Merge pull request #9326 from gkurz/draft-release Only tag and publish the release when it is fully ready	2024-03-27 15:59:59 +01:00
ChengyuZhu6	c50d3ebacc	tests:k8s: Add a test to pull large images in the guest Add a test to pull large images in the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	8551ee9533	how-to: add createcontainer timeout to sandbox config documentation add createcontainer timeout annotation to sandbox config documentation. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	c2dc13ebaa	runtime: support to configure CreateContainer Timeout in configurations support to configure CreateContainerRequestTimeout in the configurations. e.g.: [runtime] ... create_container_timeout = 300 Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:41 +08:00
Chengyu Zhu	87fc17d4d2	Merge pull request #9341 from ChengyuZhu6/guest-pull-doc docs: Add documents for kata guest image management	2024-03-27 21:20:22 +08:00
ChengyuZhu6	95b2f7f129	how-to: Add a document for kata guest image management usage Add a document for kata guest image management usage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 20:09:37 +08:00
Greg Kurz	693c9487d4	docs: Adjust release documentation Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto deprecated by the re-design of the release process described in #9064. Remove this file and all its references in the repo. The `## Versioning` section has some useful information though. It is moved to `docs/Release-Process.md`. The documentation of the `PATCH` field is adapted according to new workflow. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-27 12:41:48 +01:00
Steve Horsman	45aba769c0	Merge pull request #9346 from cmaf/ci-remove-repo-docs Remove additional links to tests directory	2024-03-27 11:13:32 +00:00
Steve Horsman	a1a615a7c8	Merge pull request #9356 from stevenhorsman/agent-opa-ppc64le-s390x workflows: Build agent-opa for more archs	2024-03-27 08:53:28 +00:00
ChengyuZhu6	2224f6d63f	runtime: support to configure CreateContainer timeout in annotation Support to configure CreateContainerRequestTimeout in the annotations. e.g.: annotations: "io.katacontainers.config.runtime.create_container_timeout": "300" Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
ChengyuZhu6	39bd462431	runtime: support to set timeout for CreateContainerRequest In the situation to pull images in the guest #8484, it’s important to account for pulling large images. Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout. However, this duration may prove insufficient for pulling larger images, such as those containing AI models. Consequently, we must devise a method to extend the timeout period for large image pull. Fixes: #8141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
Gabriela Cervantes	a997e282be	gha: Update journal log names for nerdctl artifacts This PR updates the journal log name for nerdctl artifacts to make sure that we have different names in case we add a parallel GHA job. Fixes #9357 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-26 20:03:54 +00:00
GabyCT	c163d9f114	Merge pull request #9329 from GabyCT/topic/seun scripts: Fix unbound variables in k8s setup script	2024-03-26 11:19:33 -06:00
stevenhorsman	9aa675abb9	workflows: Build agent-opa for more archs Since https://github.com/kata-containers/kata-containers/pull/7769, we support building the OPA binary into the ppc64le and s390x arch versions of the rootfs, so build the policy enabled agent to match for those architectures too. Fixes: #9355 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-03-26 17:02:14 +00:00
Lukáš Doktor	a671b3fc6e	tests: Use full svc address to check kbs service the service might not listen on the default port, use the full service address to ensure we are talking to the right resource. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:02 +01:00
Lukáš Doktor	6b0eaca4d4	tests: Add support for nodeport ingress for the kbs setup this can be used on kcli or other systems where cluster nodes are accessible from all places where the tests are running. Fixes: #9272 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:00 +01:00
Greg Kurz	5009fabde4	release: Keep it draft until all artifacts have been published The automated release workflow starts with the creation of the release in GitHub. This is followed by the build and upload of the various artifacts, which can be very long (like hours). During this period, the release appears to be fully available in https://github.com/kata-containers/kata-containers/ even though it lacks all the artifacts. This might be confusing for users or automation consuming the release. Create the release as draft and clear the draft flag when all jobs are done. This ensures that the release will only be tagged and made public when it is fully usable. If some job fails because of network timeout or any other transient error, the correct action is to restart the failed jobs until they eventually all succeed. This is by far the quicker path to complete the release process. If the workflow is canceled for some reason, the draft release is left behind. A new run of the workflow will create a brand new draft release with the same name (not an issue with GitHub). The draft release from the previous run should be manually deleted. This step won't be automated as it looks safer to leave the decision to a human. [1] https://github.com/kata-containers/kata-containers/releases Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-26 14:48:05 +01:00
Pavel Mores	4c72b02e53	runtime-rs: remove the now-unused code of NetDevice The remaining code in network.rs was mostly moved to utils.rs which seems better home for these utility functions anyway (and a closely related function open_named_tuntap() has already lived there). ToString implementation for Address was removed after some consideration. Address should probably ideally implement Display (as per RFC 565) which would also supply a ToString implementation, however it implements Debug instead, probably to enable automatic implementation of Debug for anything that Address is a member of, if for no other reason. Rather than having two identical functions this commit simply switches to using the Debug implementation for printing Address on qemu command line. Fixes #9352 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:52:40 +01:00
Pavel Mores	c94e55d45a	runtime-rs: make QemuCmdLine own vsock file descriptor Make file descriptors to be passed to qemu owned by QemuCmdLine. See commit 52958f17cd for more explanation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0cf0e923fc	runtime-rs: refactor QemuCmdLine::add_network_device() signature add_network_device() doesn't need to be passed NetworkInfo since it already has access to the full HypervisorConfig. Also, one of the goals of QemuCmdLine interface's design is to avoid coupling between QemuCmdLine and the hypervisor crate's device module, if at all possible. That's why add_network_device() shouldn't take device module's NetworkConfig but just parts that are useful in add_network_device()'s implementation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	a4f033f864	runtime-rs: add should_disable_modern() utility function is_running_in_vm() is enough to figure out whether to disable_modern but it's clumsy and verbose to use. should_disable_modern() streamlines the usage by encapsulating the verbosity. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	12e40ede97	runtime-rs: reimplement add_network_device() using Netdev & DeviceVirtioNet This commit replaces the existing NetDevice-based implementation with one using Netdev and DeviceVirtioNet. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0a57e2bb32	runtime-rs: refactor NetDevice in qemu driver In keeping with architecture of QemuCmdLine implementation we split the functionality into two objects: Netdev to represent and generate the -netdev part and DeviceVirtioNet for the -device virtio-net-<transport> part. This change is a pure refactor, existing functionality does not change. However, we do remove some stub generalizations and govmm-isms, notably: - we remove the NetDev enum since the only network interface types that kata seems to use with qemu are tuntap and macvtap, both of which are implemented by the same -netdev tap - enum DeviceDriver is also left out since it doesn't seem reasonable to try to represent VFIO NICs (which are completely different from virtio-net ones) with the same struct as virtio-net - we also remove VirtioTransport because there's no use for it so far, but with the expectation that it will be added soon. We also make struct Netdev the owner of any vhost-net and queue file descriptors so that their lifetime is tied ultimately to the lifetime of QemuCmdLine automatically, instead of returning the fds to the caller and forcing it to achieve the equivalent functionality but manually. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	7f23734172	runtime-rs: reduce generate_netdev_fds() dependencies generate_netdev_fds() takes NetworkConfig from which it however only needs a host-side network device name. This commit makes it take the device name directly, making the function useful to callers who don't have the whole NetworkConfig but do have the requisite device name. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	d4ac45d840	runtime-rs: refactor clear_fd_flags() The idea of this function is to make sure O_CLOEXEC is not set on file descriptors that should be inherited by a child (=hypervisor) process. The approach so far is however rather heavy-handed - clearing all flags is unjustifiably aggressive for a low-level function with no knowledge of context whatsoever. This commit refactors the function so that it only does what's expected and renames it accordingly. It also clarifies some of its call sites. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:14 +01:00
Fabiano Fidêncio	cfe75f9422	k8s: confidential: Update cpuid to its latest release Since v2.2.6 it can detect TDX guests on Azure, so let's bump it even if Azure peer-pods are not currently used as part of our CI. Fixes: #9348 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-26 10:21:12 +01:00
Chengyu Zhu	d16971e37e	Merge pull request #9325 from ChengyuZhu6/image_service agent:image: Refactor code to improve memory efficiency of image service	2024-03-26 10:38:37 +08:00
Dan Mihai	6c72c29535	genpolicy: reduce policy debug prints Kata CI has full debug output enabled for the cbl-mariner k8s tests, and the test AKS node is relatively slow. So debug prints from policy are expensive during CI. Fixes: #9296 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-26 02:21:26 +00:00
Alex Lyn	cec943fc26	Merge pull request #9244 from Apokleos/dgb-gpu runtime-rs/dragonball: add support building kernel with upcall and GPU hotplug	2024-03-26 08:53:54 +08:00
Chelsea Mafrica	4e3deb5a3b	tools: Fix path for installing yq in packaging script The lib.sh script uses the right directory but the wrong path for the script that installs yq; fix it. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Chelsea Mafrica	cfb977625e	docs: Remove links to tests repo Remove links to tests repo and update with corresponding location in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
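
Commit df5c88cdf0 above applies clippy's suggestion to replace `.flatten()` with `map_while(Result::ok)` on the iterator returned by `.lines()`. The following is a minimal sketch of that pattern, not the agent's actual code: the helper name `read_lines_until_error` and the sample input are hypothetical, chosen only for illustration.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical helper (not taken from the kata agent): collects lines
/// from `reader` and stops at the first read error instead of skipping it.
fn read_lines_until_error<R: BufRead>(reader: R) -> Vec<String> {
    reader
        .lines()
        // `flatten()` would silently drop every `Err` and keep polling,
        // which can spin forever if the reader keeps returning errors.
        // `map_while(Result::ok)` ends the iteration at the first `Err`.
        .map_while(Result::ok)
        .collect()
}

fn main() {
    // Illustrative in-memory input; any `BufRead` source works the same way.
    let input = Cursor::new("proc /proc proc rw 0 0\ntmpfs /tmp tmpfs rw 0 0\n");
    for line in read_lines_until_error(input) {
        println!("{line}");
    }
}
```

With this pattern any lines after a failed read are dropped. That is the trade-off the lint accepts in exchange for guaranteed termination.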


13813 Commits