kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-04-11 06:22:55 +00:00

Author	SHA1	Message	Date
Dan Mihai	600f9266f3	runtime: remove stream copy infinite loop This reverts commit `1c5693be86`. Avoid apparent infinite loop when ReadStreamRequest is blocked by policy - for some of the pods. When running the k8s-limit-range.bats test with Policy enabled, the Shim + VMM never get terminated on my cluster. Not sure why the sandbox clean-up works better for other tests, but the k8s-limit-range test pod gets stuck in an infinite loop: stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... Fixes: #9380 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-28 22:43:28 +00:00
Tobin Feldman-Fitzthum	9856fe5bea	runtime: remove ServiceOffload parameter Since we no longer use the service_offload configuration, remove the ServiceOffload field from the image struct. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	a18c7ca307	runtime: remove unimplemented CoCo configurations These experimental options were added 2 years ago in anticipation of features that would be added in CoCo. These do not match the features that were eventually added and will soon be ported to main. Fixes: #8047 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:06 -05:00
Chengyu Zhu	e66a5cb54d	Merge pull request #9332 from ChengyuZhu6/guest-pull-timeout Support to set timeout to pull large image in guest	2024-03-28 00:34:08 +08:00
Greg Kurz	e1068da1a0	Merge pull request #9326 from gkurz/draft-release Only tag and publish the release when it is fully ready	2024-03-27 15:59:59 +01:00
ChengyuZhu6	c2dc13ebaa	runtime: support to configure CreateContainer Timeout in configurations support to configure CreateContainerRequestTimeout in the configurations. e.g.: [runtime] ... create_container_timeout = 300 Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:41 +08:00
Greg Kurz	693c9487d4	docs: Adjust release documentation Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto deprecated by the re-design of the release process described in #9064. Remove this file and all its references in the repo. The `## Versioning` section has some useful information though. It is moved to `docs/Release-Process.md`. The documentation of the `PATCH` field is adapted according to new workflow. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-27 12:41:48 +01:00
ChengyuZhu6	2224f6d63f	runtime: support to configure CreateContainer timeout in annotation Support to configure CreateContainerRequestTimeout in the annotations. e.g.: annotations: "io.katacontainers.config.runtime.create_container_timeout": "300" Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
ChengyuZhu6	39bd462431	runtime: support to set timeout for CreateContainerRequest In the situation to pull images in the guest #8484, it’s important to account for pulling large images. Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout. However, this duration may prove insufficient for pulling larger images, such as those containing AI models. Consequently, we must devise a method to extend the timeout period for large image pull. Fixes: #8141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
Chelsea Mafrica	d69514766e	src: Remove references to files in tests repo Change scripts and source that uses files in the tests repo to use the corresponding file in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
ChengyuZhu6	ba242b0198	runtime: support different cri container type check Support handling the image-guest-pull block volume from different CRIs, including cri-o and containerd. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	965da9bc9b	runtime: support to pass image information to guest by KataVirtualVolume support to pass image information to guest by KataVirtualVolumeImageGuestPullType in KataVirtualVolume, which will be used to pull image on the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 17:22:36 +01:00
Alexandru Matei	617b0114b3	clh: initialize clh pid before using it The PID needs to be initialized before calling isClhRunning. waitVMM() uses isClhRunning and is called by launchClh() just before returning from function. Fixes: #9230 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-03-09 13:53:51 +02:00
Steve Horsman	54e5ce2464	Merge pull request #9154 from chungeun-choi/change-deprecated-package fixed - Change the deprecated module from 'io/util' to util. 'io/util…	2024-03-08 15:05:43 +00:00
Chungeun Choi	bad263f399	runtime: Replace deprecated module "io/ioutil" to "io" This change updates the module import to use 'io' instead of the deprecated 'io/ioutil'. Fixes: #9166 Signed-off-by: Chungeun Choi <ce.choi@okestro.com>	2024-03-07 10:56:06 +00:00
Linda Yu	1c5693be86	stream: repeat copybuffer if it is blocked by policy copyBuffer returns and the streams are closed when an error occurs. If the error contains "blocked by policy", it means the log output is disabled by policy with "ReadStreamRequest" and "WriteStreamRequest" set to false. But at this moment, we want the real stream to keep working (without being visible) because we might want to enable logging for debugging purposes, so we repeat copybuffer in this case to avoid the streams being closed. Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Linda Yu	eda419cb03	kata-runtime: add set policy function to kata-runtime Logging/debugging information will probably be disabled in production due to security considerations, but we should provide a way for customers to get logging information at runtime. This PR implements a setpolicy function in the kata-runtime tools; note that it can set the whole policy, not only the logging rules. setpolicy invokes remote attestation, which means that before setting a policy at runtime, the user has to reconfigure the new policy hash in KBS/AS. usage: kata-runtime policy set policy.rego --sandbox-id XXXXXXXX Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Fupan Li	628f57aca0	Merge pull request #9193 from UiPath/fix/clh-dax clh: Enable DAX for rootfs	2024-03-05 09:39:22 +08:00
Liu Bo	b6f8355ea3	katautils: fix panic on tracing. This fixes a panic on tracing on container exit. The root cause is that global var needs to be set by "=" instead of ":=". Fixes: #9102 Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2024-02-29 18:40:23 -08:00
Alexandru Matei	6856e8f678	clh: Enable DAX for rootfs Fixes: #9192 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-02-29 18:01:47 +02:00
Dan Mihai	352e2af5f0	Merge pull request #9153 from microsoft/danmihai1/clh-bootVM-timeout runtime: clh: minimum 10s timeout for CreateVM + BootVM	2024-02-27 09:58:01 -08:00
Dan Mihai	f4509b806b	runtime: clh: minimum 10s timeout for CreateVM + BootVM Relax the timeout for calling CLH's CreateVM + BootVM APIs. When hitting the older 1s timeout, killing a half-booted Guest and retrying the same boot sequence could have been wasteful and resulted in unstable CI testing on slower Hosts. Fixes: #9152 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-24 19:15:57 +00:00
Julien Ropé	9de65707ca	runtime: stop reporting net dev metrics for the shim As part of the shim network metrics, the shim is reporting network interfaces from the host with no namespace isolation - this gives insight in interfaces not tied to the kata containers, and causes an increase in resource usage for kata metrics. As the shim itself is not using the network (all its communication with other processes is done with local unix sockets), there is no reason to keep gathering and reporting shim-specific network metrics. Actual network usage of the kata containers can be found from the existing hypervisor network metrics (kata_hypervisor_netdev) and from the agent network metrics (kata_guest_netdev_stat). Fixes: #5738 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-22 14:00:00 +01:00
Archana Shinde	6d38fa1530	network: Try removing as many changes as possible during network cleanup In case an error is encountered while removing a network endpoint during network cleanup, we currently return immediately with the error. With this change, in case of error we simply log the error and proceed towards removing the next endpoint. With this, we can clean up the network changes made by the shim as much as possible. This is especially important when multiple interfaces are passed to the network namespace using a network plugin like multus. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:08:05 -08:00
Archana Shinde	b005cda689	network: Move up defer block to cleanup network Move the defer for cleaning up the network before the call to add network. This way, any change made by add network is reverted in case of failure. This is particularly important for physical network interfaces, as with this step we make sure that the driver for the physical interface is reverted back to the original host driver. Without this, the physical network interface will remain bound to vfio. Fixes: #8646 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:06:42 -08:00
ChengyuZhu6	96c297cb37	runtime: fix checksum mismatch error in `make vendor` Fix checksum mismatch error in `make vendor`. Fixes: #9111 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-18 22:22:38 +08:00
Fabiano Fidêncio	eea4277fbf	runtime: Update runc to v1.1.12 Although we don't seem to be affected by https://nvd.nist.gov/vuln/detail/CVE-2024-21626, we vendor and use the runc package in a few different places of our code, and we better update the package to its latest release. Fixes: #9097 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-14 23:13:39 +01:00
Niteesh Dubey	3e383674f8	runtime: fix creation of SEV confidential container on SNP enabled host. This fixes a bug that no longer allows creating an SEV container on an SNP-enabled host. This is a regression that was introduced as part of the following commit: `de39fb7d38` Fixes: #9036 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-02-06 19:01:30 +00:00
Alex Lyn	1ab9a21492	Merge pull request #8552 from deagon/fix/missing-port-type runtime: missing port type in the DeviceInfo	2024-02-06 10:56:46 +08:00
Fabiano Fidêncio	1362918ff0	Merge pull request #9011 from fidencio/topic/switch-to-using-the-confidential-rootfs runtime: Replace TEE specific initrd / image for the confidential one	2024-02-05 10:43:12 +01:00
Guoqiang Ding	6068faf40b	runtime: failed to run in the case of ColdPlugVFIO Add the missing port type in the DeviceInfo. Fixes: #9014 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-05 17:30:11 +08:00
Alex Lyn	cf74166d75	Merge pull request #9015 from Apokleos/bugfix-exec-uds runtime: display accurate error msg to avoid misleading users.	2024-02-05 13:50:43 +08:00
Alex Lyn	c6830ceb89	runtime: display accurate error msg to avoid misleading users. The original handling method does not reach user expectations. When the ClientSocketAddress method stats the corresponding path of runtime-rs and has not found it yet, we should return an error message here that includes the reason for the failure (which should be an error display indicating that both runtime-go and runtime-rs were not found). Instead of simply displaying the corresponding path of runtime-rs as the final error message to users. It is also necessary to return the error promptly to the caller for further error handling. Fixes: #8999 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-02-04 16:45:59 +08:00
Guoqiang Ding	7bf1ebe16d	kata-monitor: fix agentUrl from containerd shim Fix the missing leading slash. Fixes: #9013 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-04 16:24:13 +08:00
Fabiano Fidêncio	e4258d8694	runtime: Use confidential image / initrd instead of TEE specific ones Now that we have a confidential image / initrd being built, instead of a specific one for each TEE, let's use it everywhere possible. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 13:20:14 +01:00
Fabiano Fidêncio	3755c69165	runtime: makefile: remove SNP specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	57b132f94c	runtime: makefile: remove SEV specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	2562d23242	runtime: makefile: remove TDX specific kernel references As this is not used anymore, we can go ahead and just remove it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:43 +01:00
Fabiano Fidêncio	f4e3c936d8	runtime: snp: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Fabiano Fidêncio	8731366d7b	runtime: sev: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Fabiano Fidêncio	6cbdba7268	runtime: tdx: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:06 +01:00
Fabiano Fidêncio	a618461d3a	runtime: Add confidential kernel to the makefile With this we can properly generate and use the `-confidential` kernel, which supports SEV / SNP / TDX, as part of our configuration files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:05 +01:00
Zhigang Wang	9317e23df1	mount: Reduce the mount points with namespace isolation This patch can reduce load on systemd process, and increase the k8s deployment density when using go runtime. Fixes: #8758 Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com> Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2024-02-01 18:34:24 +08:00
Alex Lyn	cf26c16017	Merge pull request #8931 from yaoyinnan/8930/feat/merge-ValidCgroupPath runtime: merged ValidCgroupPath method	2024-02-01 12:53:55 +08:00
yaoyinnan	9aa1ed805a	runtime: add SingleContainer when obtaining OCI Spec When creating a cgroup, add a SingleContainer when obtaining the OCI Spec to apply to ctr, podman, etc. Fixes: #5240 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 15:24:07 +08:00
yaoyinnan	b0b8523cea	runtime: modify ValidCgroupPath unit test Modify ValidCgroupPath unit test. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:17 +08:00
yaoyinnan	feed5c8ff9	runtime: merged ValidCgroupPath method Merged ValidCgroupPath method to handle cgroupv1 and cgroupv2. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:13 +08:00
Kvlil	a4b208a712	runtime: remove SharedVersions field dead code The SharedVersions field adds a versiontable property that isn't supported by upstream QEMU. This is dead code since virtcontainers isn't setting SharedVersions to true. Fixes: #7720 Signed-off-by: Kvlil <kalil.pelissier@gmail.com>	2024-01-22 12:18:42 +00:00
Amulyam24	394777291d	runtime: fix failing unit tests on ppc64le A few CPU related test cases were failing as the version was being verified against Power8 while the CI machine is Power9. Fixes: #5531 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	540a2a7fb1	runtime: Allow no initrd path for IBM Z Secure Execution This is to reintroduce a configuration rule for IBM Z Secure Execution, where no initrd path should be configured. For the TEE of interest, only a kernel image should be specified with `confidential_guest=true`. Fixes: #8692 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-19 11:21:16 +01:00


1795 Commits
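
Commit `b6f8355ea3` in the log above attributes a tracing panic to a global variable being set with `:=` instead of `=`. The Go sketch below illustrates that general pitfall under stated assumptions: the variable `tracer` and both functions are hypothetical names and do not reproduce the katautils code.

```go
package main

import "fmt"

// tracer stands in for a package-level variable that other code
// dereferences later. The name is hypothetical.
var tracer *string

// initShadowed uses ":=", which declares a new local variable named tracer
// and leaves the package-level tracer nil.
func initShadowed() {
	name := "kata"
	tracer := &name
	_ = tracer // the local copy is discarded when the function returns
}

// initGlobal uses "=", which assigns to the package-level tracer.
func initGlobal() {
	name := "kata"
	tracer = &name
}

func main() {
	initShadowed()
	fmt.Println(tracer == nil) // prints "true": the global was never set

	initGlobal()
	fmt.Println(tracer == nil) // prints "false"
}
```

Dereferencing the global after `initShadowed` alone would panic with a nil pointer dereference, which is the class of failure the commit message describes.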