Linux kernel generates a panic when the init process exits.
The kernel is booted with panic=1, hence this leads to a
vm reboot.
When used as a service the kata-agent service has an ExecStop
option which does a full sync and shuts down the vm.
This patch mimicks this behavior when kata-agent is used as
the init process.
Fixes: #9429
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
This patch introduces a one-time cpath to mitigate the cgroup residuals. It
might break the device cgroup merging rules when the cgroup has children.
Fixes: #9456
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Support different pause images in the guest for guest-pull, such as k8s
pause image (registry.k8s.io/pause) and openshift pause image (quay.io/bpradipt/okd-pause).
Fixes: #9225 -- part III
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Configure the system to mount cgroups-v2 by default during system boot
by the systemd system, We must add systemd.unified_cgroup_hierarchy=1
parameter to kernel cmdline, which will be passed by kernel_params in
configuration.toml.
To enable cgroup-v2, just add systemd.unified_cgroup_hierarchy=true[1]
to kernel_params.
Fixes: #9336
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
This is the report from `make check`:
```
error: this loop never actually loops
--> src/signal.rs:147:9
|
147 | / loop {
148 | | select! {
149 | | _ = handle => {
150 | | println!("INFO: task completed");
... |
156 | | }
157 | | }
| |_________^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop
= note: `#[deny(clippy::never_loop)]` on by default
```
There is only one option: you get something or a timeout. You never retry, so
the report is correct.
Fixes: #9342
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The lint report is the following:
```
error: `flatten()` will run forever if the iterator repeatedly produces an `Err`
--> src/rpc.rs:1754:10
|
1754 | .flatten()
| ^^^^^^^^^ help: replace with: `map_while(Result::ok)`
|
note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error
--> src/rpc.rs:1752:5
|
1752 | / reader
1753 | | .lines()
| |________________^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok
= note: `-D clippy::lines-filter-map-ok` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]`
```
This commit simply applies the suggestion.
Fixes: #9342
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Running `make check` in the `src/agent` directory gives:
```
error: you seem to use `.enumerate()` and immediately discard the index
--> rustjail/src/mount.rs:572:27
|
572 | for (_index, line) in reader.lines().enumerate() {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index
= note: `-D clippy::unused-enumerate-index` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]`
help: remove the `.enumerate()` call
|
572 | for line in reader.lines() {
| ~~~~ ~~~~~~~~~~~~~~
Checking tokio-native-tls v0.3.1
Checking hyper-tls v0.5.0
Checking reqwest v0.11.18
error: could not compile `rustjail` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
make: *** [../../utils.mk:177: standard_rust_check] Error 101
```
Fixes: #9342
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`.
Consequently, if kata-agent receives numerous image pulling requests simultaneously,
it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources.
In image.rs, we introduce two public functions:
`merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate
the operations on `IMAGE_SERVICE`, ensuring that its internal details remain
hidden from external modules such as `rpc.rs`.
Fixes: #9225 -- part II
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
When the https_proxy/no_proxy settings are configured alongside agent-policy enabled, the process of pulling image in the guest will hang.
This issue could stem from the instantiation of `reqwest`’s HTTP client at the time of agent-policy initialization,
potentially impacting the effectiveness of the proxy settings during image guest pulling.
Given that both functionalities use `reqwest`, it is advisable to set https_proxy/no_proxy prior to the initialization of agent-policy.
Fixes: #9212
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Enable to build kata-agent with PULL_TYPE feature.
We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default.
This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest.
Only the nydus snapshotter with proxy mode can activate this feature.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
To support handle image-guest-pull block volume from different CRIs, including cri-o and containerd.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
By default the pause image and runtime config will provided
by host side, this may have potential security risks when the
host config a malicious pause image, then we will use the pause
image packaged in the rootfs.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Julien Ropé <jrope@redhat.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Add "guest-pull" feature option to determine that the related dependencies
would be compiled if the feature is enabled.
By default, agent would be built with default-pull feature, which would
support all pull types, including sharing images by virtio-fs and
pulling images in the guest.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume
which storing the image information as an integral part of `CreateContainer`.
Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`.
This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage.
To maintain clarity and avoid any need for modification to the `OverlayfsHandler`,we introduce the `ImagePullHandler`.
This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment.
This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`,
followed by a bind mount to `/run/kata-containers/{container_id}`.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Containerd can support set a proxy when downloading images with a environment variable.
For CC stack, image download is offload to the kata agent, we need support similar feature.
Current we add https_proxy and no_proxy, http_proxy is not added since it is insecure.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
With image-rs pull_image API, the downloaded container image layers
will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs
and config.json will be saved under CONTAINER_BASE/cid directory.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
The agent now has a number of optional build-time features that can be
enabled.
Add details of these features to the following areas:
- Version output (`kata-agent --version`)
- Announce message (so that the details are always added to the journal
at agent startup).
- The response message returned by the ttRPC `GetGuestDetails()` API.
Fixes: #9285.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Fixes: #9269
From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts
type (string, OPTIONAL) The type of the filesystem to be mounted.
bind may be only specified in the oci spec options -> flags update r#type
The agent will ignore bind mounts if they are only specified in the OCI spec options and not in the flags.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The guest_hook_path item in configuration.toml allows OCI hook scripts
to be executed within Kata's guest environment. Traditionally, these
guest hook programs are pre-built and included in Kata's guest rootfs
image at a fixed location.
While setting guest_hook_path = "/usr/share/oci/hooks" in configuration.toml
works, it lacks flexibility. Not all guest hooks reside in the path
/usr/share/oci/hooks, and users might have custom locations.
To address this, a more flexible and configurable approach is to be proposed
that allows users to specify their desired path. This could include using a
sandbox bind mount path for hooks specific to that particular container.
However, The current implementation of guest hooks and bind mounts in kata-agent
has a reversed order of execution compared to the desired behavior.
To achieve the intended functionality, we simply need to swap the order of their
implementation.
Fixes: #9274
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Add all agent configuration options to README so that users can more easily understand
what these options do and how to configure them at runtime.
Fixes: #9109
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
When cgroup v2 is in use, a container should only see its part of the
unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created
at the OS level. Similarly, `/proc/self/cgroup` inside the container
should display `0::/`, rather than a full path such as :
0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope
What is needed here is isolation from the OS. Do that by running the
container in its own cgroup namespace. This matches what runc and
other non VM based runtimes do.
Fixes#9124
Signed-off-by: Greg Kurz <groug@kaod.org>
There is a race condition in agent HVSOCK_STREAMS hashmap, where a
stream may be taken before it is inserted into the hashmap. This patch
add simple retry logic to the stream consumer to alleviate this issue.
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>,
making some images relying on the special file /dev/stdout(stderr),
/proc/self/fd/1(2) fail to boot in passfd io mode, where the
stdout/stderr of a container process is a vsock socket.
For back compatibility, a pipe is introduced between the process
and the socket, and its read end is set as stdout/stderr of the
container process instead of the socket. The agent will do the
forwarding between the pipe and the socket.
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
In passfd io mode, when not using a terminal, the stdout/stderr vsock
streams are directly used as the stdout/stderr of the child process.
These streams are non-blocking by default.
The stdout/stderr of the process should be blocking, otherwise
the process may encounter EAGAIN error when writing to stdout/stderr.
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
When container exits, the agent should clean up the term master fd,
otherwise the fd will be leaked.
Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
Currently in the kata container, every io read/write operation requires
an RPC request from the runtime to the agent. This process involves
data copying into/from an RPC request/response, which are high overhead.
To solve this issue, this commit utilize the vsock fd passthrough, a
newly introduced feature in the Dragonball hypervisor. This feature
allows other host programs to pass a file descriptor to the Dragonball
process, directly as the backend of an ordinary hybrid vsock connection.
The runtime-rs now utilizes this feature for container process io. It
open the stdin/stdout/stderr fifo from containerd, and pass them to
Dragonball, then don't bother with process io any more, eliminating
the need for an RPC for each io read/write operation.
In passfd io mode, the agent uses the vsock connections as the child
process's stdin/stdout/stderr, eliminating the need for a pipe
to bump data (in non-tty mode).
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
Two toml options, `use_passfd_io` and `passfd_listener_port` are introduced
to enable and configure dragonball's vsock fd passthrough io feature.
This commit is a preparation for vsock fd passthrough io feature.
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
- test_volume_capacity_stats: verify the file block size against the fetched size via statfs()
- test_reseed_rng: Correct the request codes for RNDADDTOENTCNT and RNDRESEEDCRNG when platform is ppc64le
- test list_routes: Add the route only if destination is not empty
- test_new_fs_manager: skip the test if cgroups v2 is used by default
- skip test cases rpc::tests::test_do_write_stream, sandbox::tests::test_find_process, sandbox::t
ests::test_find_container_process and sandbox::tests::add_and_get_container on ppc64le as they are fl
aky
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>