Commit Graph

4092 Commits

Author SHA1 Message Date
ChengyuZhu6
c2dc13ebaa runtime: support to configure CreateContainer Timeout in configurations
support to configure CreateContainerRequestTimeout in the
configurations.

e.g.:
[runtime]
...
create_container_timeout = 300

Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 21:58:41 +08:00
ChengyuZhu6
2224f6d63f runtime: support to configure CreateContainer timeout in annotation
Support to configure CreateContainerRequestTimeout in the annotations.

e.g.:
annotations:
      "io.katacontainers.config.runtime.create_container_timeout": "300"

Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
(https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 15:44:29 +08:00
ChengyuZhu6
39bd462431 runtime: support to set timeout for CreateContainerRequest
In the situation to pull images in the guest #8484, it’s important to account for pulling large images.
Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout.
However, this duration may prove insufficient for pulling larger images, such as those containing AI models.
Consequently, we must devise a method to extend the timeout period for large image pull.

Fixes: #8141

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-27 15:44:29 +08:00
Chengyu Zhu
d16971e37e Merge pull request #9325 from ChengyuZhu6/image_service
agent:image: Refactor code to improve memory efficiency of image service
2024-03-26 10:38:37 +08:00
Alex Lyn
cec943fc26 Merge pull request #9244 from Apokleos/dgb-gpu
runtime-rs/dragonball: add support building kernel with upcall and GPU hotplug
2024-03-26 08:53:54 +08:00
Alex Lyn
5c54315a87 dragonball: fix CI failure due to poor UT adaptation.
Fixes: #9144

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-25 20:25:27 +08:00
ChengyuZhu6
f47408fdf4 agent:image: Refactor code to improve memory efficiency of image service
Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`.
Consequently, if kata-agent receives numerous image pulling requests simultaneously,
it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources.

In image.rs, we introduce two public functions:
`merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate
the operations on `IMAGE_SERVICE`, ensuring that its internal details remain
hidden from external modules such as `rpc.rs`.

Fixes: #9225 -- part II

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-25 07:46:50 +08:00
ChengyuZhu6
7a49ec1c80 agent:util: Refactor the unit tests to leverage rstest
Refactor the unit tests in util.rs to leverage rstest for parameterization.

Fixes: #9314

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:53 +08:00
ChengyuZhu6
2df2b4d30d agent:namespace: Refactor unit tests to leverage rstest
Refactor the unit tests in `namespace.rs` to leverage rstest for parameterization.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:48 +08:00
Hyounggyu Choi
d915a79e2d Merge pull request #9280 from BbolroC/enable-qemu-on-s390x
runtime-rs: Enable qemu on s390x
2024-03-22 23:58:42 +01:00
Hyounggyu Choi
81aaa34bd6 runtime-rs: Add DeviceVirtioSerial and DeviceVirtconsole
It is observed that virtiofsd exits immediately on s390x
if there is no attached console devices.
This commit resolves the issue by migrating `appendConsole()`
from runtime and being triggered in `start_vm()`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
2cfe745efb runtime-rs: Enable memory backend option for Machine for s390x
For s390x, it requires an additional option `memory-backend` for `-machine`.
Otherwise, virtiofsd exits with HandleRequest(InvalidParam).

This commit is to add a field `memory_backend` to `struct Machine`
and turn it on for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
9bcfaad625 runtime-rs: Add ccw block device for rootfs
Like nvdimm for x86_64, a block device for s390x should be
treated differently with `virtio-blk-ccw`.
This is to generate a QEMU command line parameter for a block
device by using `-blockdev` and `-device` if the `vm_rootfs_driver`
is set to `virtio-blk-ccw`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
David Esparza
3e40051634 Merge pull request #9255 from dborquez/thread_pid_function
runtime-rs: ch: Implement full thread/tid/pid handling
2024-03-22 10:05:02 -06:00
Chengyu Zhu
9a4cb96262 Merge pull request #9312 from ChengyuZhu6/show-feature
agent: Add guest-pull to the list of agent features in announce()
2024-03-21 23:35:29 +08:00
David Esparza
b498e140a1 runtime-rs: ch: Implement full thread/tid/pid handling
Add in the full details once cloud-hypervisor/cloud-hypervisor#6103
has been implemented, and the feature is available in a Cloud Hypervisor
release.

Fixes: #8799

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-03-21 08:24:53 -06:00
ChengyuZhu6
754399d909 agent: Add guest-pull to the list of agent features in announce()
Add guest-pull to the list of agent features in announce().

Fixes: #9225 -- part IV

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-21 20:01:52 +08:00
Xuewei Niu
9c4f9dcb35 Merge pull request #9311 from studychao/chao/fix_mtrr
Dragonballl: introduce MTRR regs support
2024-03-21 17:24:27 +08:00
Hyounggyu Choi
9b2c08935b runtime-rs: Pass different device argument based on bus type
Currently, `*-pci` is used as an argument for the device config.
It is not true for a case where a different type of bus is used.
s390x uses `ccw`.
This commit is to make it flexible to generate the device argument
based on the bus type. A structure `DeviceVhostUserFsPci` and
`VhostVsockPci` is renamed to `DeviceVhostUserFs` and `VhostVsock`
because the structure name is not bound to a certain bus type any more.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-21 09:25:37 +01:00
Hyounggyu Choi
7b3d1adb8c libs: Bump sysinfo to v0.30.5
It has been observed that the runtime stops running around
`sysinfo::total_memory()` while adjusting a config on s390x.
This is to update the crate to the latest version which happened
to resolve the issue. (No explicit release note for this)

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-20 09:27:13 +01:00
Chao Wu
5a4b858ece Dragonballl: introduce MTRR regs support
MTRR, or Memory-Type Range Registers are a group of x86 MSRs providing a way to control access
 and cache ability of physical memory regions.
During our test in runtime-rs + Dragonball, we found out that this register support is a must
for passthrough GPU running CUDA application, GPU needs that information to properly use GPU memory.

fixes: #9310
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2024-03-20 14:18:16 +08:00
ChengyuZhu6
5bad18f9c9 agent: set https_proxy/no_proxy before initializing agent policy
When the https_proxy/no_proxy settings are configured alongside agent-policy enabled, the process of pulling image in the guest will hang.
This issue could stem from the instantiation of `reqwest`’s HTTP client at the time of agent-policy initialization,
potentially impacting the effectiveness of the proxy settings during image guest pulling.
Given that both functionalities use `reqwest`, it is advisable to set https_proxy/no_proxy prior to the initialization of agent-policy.

Fixes: #9212

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
db9f18029c README: Add https_proxy and no_proxy to agent README
Add agent.https_proxy and agent.no_proxy to the table in the agent README.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
8724d7deeb packaging: Enable to build agent with PULL_TYPE feature
Enable to build kata-agent with PULL_TYPE feature.

We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default.
This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest.
Only the nydus snapshotter with proxy mode can activate this feature.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
ba242b0198 runtime: support different cri container type check
To support handle image-guest-pull block volume from different CRIs, including cri-o and containerd.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
874d83b510 agent/image: Use guest provided pause image
By default the pause image and runtime config will provided
by host side, this may have potential security risks when the
host config a malicious pause image, then we will use the pause
image packaged in the rootfs.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Julien Ropé <jrope@redhat.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
c269b9e8c6 agent: Add guest-pull feature for kata-agent
Add "guest-pull" feature option to determine that the related dependencies
would be compiled if the feature is enabled.

By default, agent would be built with default-pull feature, which would
support all pull types, including sharing images by virtio-fs and
pulling images in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
965da9bc9b runtime: support to pass image information to guest by KataVirtualVolume
support to pass image information to guest by KataVirtualVolumeImageGuestPullType
in KataVirtualVolume, which will be used to pull image on the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
cfd14784a0 agent: Introduce ImagePullHandler to support IMAGE_GUEST_PULL volume
As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume
which storing the image information as an integral part of `CreateContainer`.
Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`.
This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage.
To maintain clarity and avoid any need for modification to the `OverlayfsHandler`,we introduce the `ImagePullHandler`.
This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment.
This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`,
followed by a bind mount to `/run/kata-containers/{container_id}`.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
462051b067 agent/image: merge container spec for images pulled inside guest
When being passed an image name through a container annotation,
merge its corresponding bundle OCI specification and process into the passed container creation one.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
Co-authored-by: jordan9500 <jordan.jackson@ibm.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
cec1916196 agent: Support https_proxy/no_proxy config for image download in guest
Containerd can support set a proxy when downloading images with a environment variable.
For CC stack, image download is offload to the kata agent, we need support similar feature.
Current we add https_proxy and no_proxy, http_proxy is not added since it is insecure.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
9cddd5813c agent/image: Enable image-rs crate to pull image inside guest
With image-rs pull_image API, the downloaded container image layers
will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs
and config.json will be saved under CONTAINER_BASE/cid directory.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
2b3a00f848 agent: export the image service singleton instance
Export the image service singleton instance.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
1f1ca6187d agent: Introduce ImageService
Introduce structure ImageService, which will be used to pull images
inside the guest.

Fixes: #8103

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 17:22:33 +01:00
Chelsea Mafrica
42dfe0e8d1 Merge pull request #9286 from jodh-intel/agent-show-enabled-features
agent: Show features enabled at build time
2024-03-19 08:54:49 -07:00
Alex Lyn
7af2df408e Merge pull request #9295 from likebreath/0318/fix_clh_default_netconfig
runtime-rs: ch: Provide valid default value for NetConfig
2024-03-19 15:17:18 +08:00
Xuewei Niu
99d0e5fff8 Merge pull request #9270 from zvonkok/kata-agent-bind-mount
kata-agent: optional bind flag
2024-03-19 10:39:23 +08:00
Bo Chen
ad4262e86b runtime-rs: ch: Provide valid default value for NetConfig
The current default value of IP `0.0.0.0` with mask `0.0.0.0` will cause
ioctl error when being used to create and configure TAP device, with
newer version of Cloud Hypervisor [1]. This patch replaces them with
valid value that are the same as the Go-lang runtime [2].

[1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/5924
[2] e3f7852738/src/runtime/virtcontainers/pkg/cloud-hypervisor/client/model_net_config.go (L40-L57)

Fixes: #9254

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-18 15:47:58 -07:00
Greg Kurz
1e526a4769 runtime-rs: Consolidate the handling of fds passed to QEMU
File descriptors that are passed to QEMU need some special care.
We want them to be closed when the QEMU process is started. But
at the same time, it is required that the associated rust File
structures, either coming from the` std::fs` or the `tokio::fs`
crates, are still in scope when the QEMU process is forked. This
is currently achieved by keeping File structures in variables
at the outer scope of `start_vm()`. This scheme is currently
duplicated, with similar justifications in the corresponding
comments.

Consolidate all this handling in one place with a more generic
explanation.

Fixes #9281

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-15 16:14:59 +01:00
James O. D. Hunt
9ef59488d9 agent: Show features enabled at build time
The agent now has a number of optional build-time features that can be
enabled.

Add details of these features to the following areas:

- Version output (`kata-agent --version`)
- Announce message (so that the details are always added to the journal
  at agent startup).
- The response message returned by the ttRPC `GetGuestDetails()` API.

Fixes: #9285.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-15 13:29:21 +00:00
Greg Kurz
6a112cc7a5 runtime-rs: Fix missing dependency
Some previous contribution missed to run cargo clippy.
Fix the dependency now so that it doesn't cause noise
in future contributions.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-14 23:19:38 +01:00
Dan Mihai
b3b00e00a6 Merge pull request #9246 from microsoft/danmihai/default-env
genpolicy: default env if image doesn't have env
2024-03-14 11:01:43 -07:00
Zvonko Kaiser
c15e19c806 kata-agent: optional bind flag
Fixes: #9269

From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts
type (string, OPTIONAL) The type of the filesystem to be mounted.
bind may be only specified in the oci spec options -> flags update r#type
The agent will ignore bind mounts if they are only specified in the OCI spec options and not in the flags.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-03-14 14:42:01 +00:00
Hyounggyu Choi
1dac6b1357 runtime-rs: Configure s390x specific flags for Makefile
s390x supports a different machine type `s390-ccw-virtio` and it is
not required to configure cpu features by default for the platform.
A hypervisor `dragonball` is not supported on s390x so that `DBCMD`
is not necessary. `vm-rootfs_driver` should be set to `virtio-blk-ccw`.
This commit is to set the architecture-specific flags for Makefile.

Fixes: #9158

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-14 13:05:35 +01:00
Alex Lyn
410afcc913 Merge pull request #8866 from Apokleos/netdev-qemu-rs
runtime-rs: add netdev params to cmdline for qemu-rs.
2024-03-13 13:07:43 +08:00
Alex Lyn
e2ae8ba79b runtime-rs: add network device into Qemu's cmdline
It will open tuntap device and vhost-net device
and store device files.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:28:54 +08:00
Alex Lyn
d3bca4597e runtime-rs: add open_named_tuntap to open a named tuntap device.
The open_named_tuntap function is designed as a public function to
open a tuntap device with the specified name. However, in order to
reference existing methods in dbs_utils, we still need to keep the
reference "path = "../../../dragonball/src/dbs_utils" in dependencies
and cannot hide it.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:26:32 +08:00
Alex Lyn
005b333976 runtime-rs: add network helpers and impl ToQemuParams
Add network helpers and impl ToQemuParams trait to build
netdev params which are put into cmdline for Qemu VM running.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:25:39 +08:00
Alex Lyn
63786934f4 runtime-rs: set network namespace for qemu process and netdev.
We need ensure the add_network_device happens in netns and
move qemu process into netns which keeps the qemu process
running in this net namespace.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:21:43 +08:00
Alex Lyn
69a5e5b955 runtime-rs: add network device handler in start_vm.
Add network device handler in start_vm, which is sepcially
for Qemu VM running with added net params to command line.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:18:01 +08:00