This reverts commit 1c5693be86.
Avoid apparent infinite loop when ReadStreamRequest is blocked by
policy - for some of the pods.
When running the k8s-limit-range.bats test with Policy enabled,
the Shim + VMM never get terminated on my cluster. Not sure why
the sandbox clean-up works better for other tests, but the
k8s-limit-range test pod gets stuck in an infinite loop:
stdout io stream copy error happens: error = %wrpc error: code =
PermissionDenied desc = \"ReadStreamRequest is blocked by policy
...
policy check: ReadStreamRequest
...
stdout io stream copy error happens: error = %wrpc error: code =
PermissionDenied desc = \"ReadStreamRequest is blocked by policy
...
policy check: ReadStreamRequest
...
Fixes: #9380
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Since we no longer use the service_offload configuration,
remove the ServiceOffload field from the image struct.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
These experimental options were added 2 years ago
in anticipation of features that would be added
in CoCo. These do not match the features that were
eventually added and will soon be ported to main.
Fixes: #8047
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
support to configure CreateContainerRequestTimeout in the
configurations.
e.g.:
[runtime]
...
create_container_timeout = 300
Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto
deprecated by the re-design of the release process described in #9064.
Remove this file and all its references in the repo.
The `## Versioning` section has some useful information though. It is
moved to `docs/Release-Process.md`. The documentation of the `PATCH`
field is adapted according to new workflow.
Fixes#9064 - part VI
Signed-off-by: Greg Kurz <groug@kaod.org>
Support to configure CreateContainerRequestTimeout in the annotations.
e.g.:
annotations:
"io.katacontainers.config.runtime.create_container_timeout": "300"
Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
In the situation to pull images in the guest #8484, it’s important to account for pulling large images.
Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout.
However, this duration may prove insufficient for pulling larger images, such as those containing AI models.
Consequently, we must devise a method to extend the timeout period for large image pull.
Fixes: #8141
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Change scripts and source that uses files in the tests repo to use the
corresponding file in the current repo.
Fixes#9165
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
To support handle image-guest-pull block volume from different CRIs, including cri-o and containerd.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
support to pass image information to guest by KataVirtualVolumeImageGuestPullType
in KataVirtualVolume, which will be used to pull image on the guest.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
The PID needs to be initialized before calling isClhRunning.
waitVMM() uses isClhRunning and is called by launchClh() just
before returning from function.
Fixes: #9230
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
This change updates the module import to use 'util' instead of the deprecated 'io/util'
Fixes: #9166
Signed-off-by: Chungeun Choi <ce.choi@okestro.com>
copyBuffer returns and the streams will be closed when error occurs.
If the error contains "blocked by policy" it means the log output is
disabled by policy with "ReadStreamRequest" and "WriteStreamRequest" set
to false. But at this moment, we want the real stream still working (not
be seen) because we might want to enable logging for debugging purpose,
so we repeat copybuffer in this case to avoid streams being closed.
Fixes: #8797
Signed-off-by: Linda Yu <linda.yu@intel.com>
logging/debugging information might probably be disabled in production
due to security consideration, but we'd better provide an approach for
customer to get logging information during runtime, this PR implement
setpolicy function in kata-runtime tools, although it can set whole policy
other than logging.
setpolicy would evokes remote attestation, which means before setting
policy during runtime, user has to reconfigure new policy hash in KBS/AS.
usage: kata-runtime policy set policy.rego --sandbox-id XXXXXXXX
Fixes: #8797
Signed-off-by: Linda Yu <linda.yu@intel.com>
This fixes a panic on tracing on container exit.
The root cause is that global var needs to be set by "=" instead of
":=".
Fixes: #9102
Signed-off-by: Liu Bo <liub.liubo@gmail.com>
Relax the timeout for calling CLH's CreateVM + BootVM APIs. When
hitting the older 1s timeout, killing a half-booted Guest and
retrying the same boot sequence could have been wasteful and resulting
in unstable CI testing on slower Hosts.
Fixes: #9152
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
As part of the shim network metrics, the shim is reporting network interfaces
from the host with no namespace isolation - this gives insight in interfaces
not tied to the kata containers, and causes an increase in resource usage for
kata metrics.
As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
hypervisor network metrics (kata_hypervisor_netdev) and from the agent
network metrics (kata_guest_netdev_stat).
Fixes: #5738
Signed-off-by: Julien Ropé <jrope@redhat.com>
In case an error is encountered while removing a network endpoint during
network cleanup, we cuurently return immediately with the error.
With this change, in case of error we simply log the error and proceed
towards removing the next endpoint. With this, we can cleanup the
network changes made by the shim as much as possible.
This is especially important when multiple interfaces are passed to the
network namespace using a network plugin like multus.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Move the defer for cleaning up network before the call to add network.
This way if any change made by add network is reverted by in case of
failure. This is particulary important for physical network interfaces
as with this step we make sure that driver for the physical interface is
reverted back to the original host driver. Without this the physical
network iterface will remain bound to vfio.
Fixes: #8646
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This is needed to fix the bug which is not allowing to create SEV container
on SNP enabled host anymore. This is a regression that was introduced as
part of the following commit:
de39fb7d38Fixes: #9036
Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>
The original handling method does not reach user expectations.
When the ClientSocketAddress method stats the corresponding
path of runtime-rs and has not found it yet, we should return
an error message here that includes the reason for the failure
(which should be an error display indicating that both runtime-go
and runtime-rs were not found). Instead of simply displaying the
corresponding path of runtime-rs as the final error message to
users.
It is also necessary to return the error promptly to the caller
for further error handling.
Fixes: #8999
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Now that we have a confidential image / initrd being built, instead of a
specific one for each TEE, let's use it everywhere possible.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.
However, for debugability-sake, let's do this change TEE by TEE.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.
However, for debugability-sake, let's do this change TEE by TEE.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're building a single confidential kernel, we should rely on it
rather than keep using the specific ones for TDX / SEV / SNP.
However, for debugability-sake, let's do this change TEE by TEE.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With this we can properly generate and the the `-confidential` kernel,
which supports SEV / SNP / TDX as part of our configuration files.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This patch can reduce load on systemd process, and
increase the k8s deployment density when using go runtime.
Fixes: #8758
Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com>
Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
SharedVersion fiel add a versiontable property that isn't supported by upstream QEMU.
This is dead code since virtcontainers isn't setting SharedVersions to true.
Fixes: #7720
Signed-off-by: Kvlil <kalil.pelissier@gmail.com>
A few CPU related test cases were failing as the version was being verified against Power8 while the CI machine is Power9.
Fixes: #5531
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
This is to reintroduce a configuration rule for IBM Z Secure Execution,
where no initrd path should be configured. For the TEE of interest,
only a kernel image should be specified with `confidential_guest=true`.
Fixes: #8692
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>