Commit Graph

1102 Commits

Author SHA1 Message Date
Fabiano Fidêncio
b4360e7e37 Merge pull request #9510 from microsoft/danmihai1/regorus-policy2
agent: use regorus instead of opa
2024-04-24 21:40:29 +02:00
Dan Mihai
ff385eac41 agent: remove unnecessary comment
Remove reminder to initialize Policy earlier, because currently there
are no plans to initialize earlier.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-24 14:53:51 +00:00
Fupan Li
469c4e4f44 Merge pull request #9335 from Tim-Zhang/fix-passfd-fifo-open
passfd-io: fix FIFO opening and vsock handling
2024-04-23 09:04:45 +08:00
Dan Mihai
5d31eb4847 agent: use regorus 0.1.4
Use regorus 0.1.4 from crates.io, instead of its source code
repository.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 23:21:17 +00:00
Dan Mihai
df23eb09a6 agent: use regorus instead of opa
Implement Agent Policy using the regorus crate instead of the OPA
daemon.

The OPA daemon will be removed from the Guest rootfs in a future PR.

Fixes: #9388

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:58:30 +00:00
Dan Mihai
b509c1beee agent: lock anyhow version to 1.0.58
Lock anyhow version to 1.0.58 because:

- Versions between 1.0.59 - 1.0.76 have not been tested yet using
  Kata CI. However, those versions pass "make test" for the
  Kata Agent.

- Versions 1.0.77 or newer fail during "make test" - see
  https://github.com/kata-containers/kata-containers/issues/9538.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-22 19:49:15 +00:00
Tim Zhang
aeba483ec8 agent: avoid fd leakage of passfd-io
In do_create_container and do_exec_process, we should create the proc_io first,
in case there's some error occur below, thus we can make sure
the io stream closed when error occur.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-22 17:39:33 +08:00
Tim Zhang
d68eb7f0ad agent: Fix close_stdin for passfd-io
In scenario passfd-io, we should wait for stdin to close itself
instead of manually intervening in it.

Signed-off-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-04-22 17:39:32 +08:00
Dan Mihai
4242801b1c agent: update cargo.lock
Update Kata Agent's Cargo.lock after the recent changes to Cargo.toml.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-04-18 17:12:48 +00:00
Alexandru Matei
9e01732f7a agent: shutdown vm on exit when agent is used as init process
Linux kernel generates a panic when the init process exits.
The kernel is booted with panic=1, hence this leads to a
vm reboot.
When used as a service the kata-agent service has an ExecStop
option which does a full sync and shuts down the vm.
This patch mimicks this behavior when kata-agent is used as
the init process.

Fixes: #9429

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-04-12 11:32:31 +03:00
Xuewei Niu
50f78ec52c agent: Fix the issue with the "test_new_fs_manager" test
This patch introduces a one-time cpath to mitigate the cgroup residuals. It
might break the device cgroup merging rules when the cgroup has children.

Fixes: #9456

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2024-04-11 18:06:05 +08:00
Wainer Moschetta
fba1d394d7 Merge pull request #9369 from ChengyuZhu6/sandbox-image
agent:image: Support different pause image in the guest for guest pull
2024-04-08 11:06:21 -03:00
stevenhorsman
864e9c22ba agent: doc: Add new config doc
Document the new guest_components_rest_api config parameter

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-08 11:38:53 +01:00
Biao Lu
f0edec84f6 agent: Launch api-server-rest
If 'rest_api' is configured, let's start the  api-server-rest after
the attestation-agent and the confidential-data-hub have been started.

Fixes: #7555

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:38:53 +01:00
Biao lu
4d752e6350 agent: Add config for api-server-rest
Add configuration for 'rest api server'.

Optional configurations are
  'agent.rest_api=attestation' will enable attestation api
  'agent.rest_api=resource' will enable resource api
  'agent.rest_api=all' will enable all (attestation and resource) api

Fixes: #7555

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:06:14 +01:00
Biao Lu
f476d671ed agent: Launch the confidential data hub
Let's introduce a new method to start the confidential data hub and the
attestation agent.  The former depends on the later, and it needs to be
started before the RPC server.

Starting the attestation components is based on whether the confidential
containers guest components binaries are found in the rootfs.

Fixes: #7544

Signed-off-by: Biao Lu <biao.lu@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
Co-authored-by: Alex Carter <alex.carter@ibm.com>
Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
2024-04-08 11:06:14 +01:00
ChengyuZhu6
8c897f822c agent:image: Support different pause image in the guest for guest pull
Support different pause images in the guest for guest-pull, such as k8s
pause image (registry.k8s.io/pause) and openshift pause image (quay.io/bpradipt/okd-pause).

Fixes: #9225 -- part III

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-04-07 09:00:10 +08:00
Alex Lyn
fa8049af6c Merge pull request #9383 from Apokleos/unified-cgrp-cmdline
kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy
2024-04-02 09:08:04 +08:00
Alex Lyn
07bfdf4a22 Merge pull request #9275 from Apokleos/swap-hooks-bindmnt
kata-agent: Change order of guest hook and bind mount processing
2024-04-02 07:40:10 +08:00
Alex Lyn
c88014834b kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy
Configure the system to mount cgroups-v2 by default during system boot
by the systemd system, We must add systemd.unified_cgroup_hierarchy=1
parameter to kernel cmdline, which will be passed by kernel_params in
configuration.toml.
To enable cgroup-v2, just add systemd.unified_cgroup_hierarchy=true[1]
to kernel_params.

Fixes: #9336

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-04-01 18:45:12 +08:00
Christophe de Dinechin
82c4079fd0 agent: Remove useless loop
This is the report from `make check`:

```
error: this loop never actually loops
   --> src/signal.rs:147:9
    |
147 | /         loop {
148 | |             select! {
149 | |                 _ = handle => {
150 | |                     println!("INFO: task completed");
...   |
156 | |             }
157 | |         }
    | |_________^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop
    = note: `#[deny(clippy::never_loop)]` on by default
```

There is only one option: you get something or a timeout. You never retry, so
the report is correct.

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Christophe de Dinechin
df5c88cdf0 agent: Remove lint error about .flatten running forever
The lint report is the following:

```
error: `flatten()` will run forever if the iterator repeatedly produces an `Err`
    --> src/rpc.rs:1754:10
     |
1754 |         .flatten()
     |          ^^^^^^^^^ help: replace with: `map_while(Result::ok)`
     |
note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error
    --> src/rpc.rs:1752:5
     |
1752 | /     reader
1753 | |         .lines()
     | |________________^
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok
     = note: `-D clippy::lines-filter-map-ok` implied by `-D warnings`
     = help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]`
```

This commit simply applies the suggestion.

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Christophe de Dinechin
bfb55312be agent: Fix .enumerate errors during make check
Running `make check` in the `src/agent` directory gives:

```
error: you seem to use `.enumerate()` and immediately discard the index
   --> rustjail/src/mount.rs:572:27
    |
572 |     for (_index, line) in reader.lines().enumerate() {
    |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index
    = note: `-D clippy::unused-enumerate-index` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]`
help: remove the `.enumerate()` call
    |
572 |     for line in reader.lines() {
    |         ~~~~    ~~~~~~~~~~~~~~

    Checking tokio-native-tls v0.3.1
    Checking hyper-tls v0.5.0
    Checking reqwest v0.11.18
error: could not compile `rustjail` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
make: *** [../../utils.mk:177: standard_rust_check] Error 101
```

Fixes: #9342

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2024-03-27 17:03:44 +01:00
Chengyu Zhu
d16971e37e Merge pull request #9325 from ChengyuZhu6/image_service
agent:image: Refactor code to improve memory efficiency of image service
2024-03-26 10:38:37 +08:00
ChengyuZhu6
f47408fdf4 agent:image: Refactor code to improve memory efficiency of image service
Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`.
Consequently, if kata-agent receives numerous image pulling requests simultaneously,
it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources.

In image.rs, we introduce two public functions:
`merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate
the operations on `IMAGE_SERVICE`, ensuring that its internal details remain
hidden from external modules such as `rpc.rs`.

Fixes: #9225 -- part II

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-25 07:46:50 +08:00
ChengyuZhu6
7a49ec1c80 agent:util: Refactor the unit tests to leverage rstest
Refactor the unit tests in util.rs to leverage rstest for parameterization.

Fixes: #9314

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:53 +08:00
ChengyuZhu6
2df2b4d30d agent:namespace: Refactor unit tests to leverage rstest
Refactor the unit tests in `namespace.rs` to leverage rstest for parameterization.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-23 10:49:48 +08:00
ChengyuZhu6
754399d909 agent: Add guest-pull to the list of agent features in announce()
Add guest-pull to the list of agent features in announce().

Fixes: #9225 -- part IV

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-21 20:01:52 +08:00
ChengyuZhu6
5bad18f9c9 agent: set https_proxy/no_proxy before initializing agent policy
When the https_proxy/no_proxy settings are configured alongside agent-policy enabled, the process of pulling image in the guest will hang.
This issue could stem from the instantiation of `reqwest`’s HTTP client at the time of agent-policy initialization,
potentially impacting the effectiveness of the proxy settings during image guest pulling.
Given that both functionalities use `reqwest`, it is advisable to set https_proxy/no_proxy prior to the initialization of agent-policy.

Fixes: #9212

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
db9f18029c README: Add https_proxy and no_proxy to agent README
Add agent.https_proxy and agent.no_proxy to the table in the agent README.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
8724d7deeb packaging: Enable to build agent with PULL_TYPE feature
Enable to build kata-agent with PULL_TYPE feature.

We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default.
This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest.
Only the nydus snapshotter with proxy mode can activate this feature.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:06:00 +01:00
ChengyuZhu6
ba242b0198 runtime: support different cri container type check
To support handle image-guest-pull block volume from different CRIs, including cri-o and containerd.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
874d83b510 agent/image: Use guest provided pause image
By default the pause image and runtime config will provided
by host side, this may have potential security risks when the
host config a malicious pause image, then we will use the pause
image packaged in the rootfs.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Julien Ropé <jrope@redhat.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
c269b9e8c6 agent: Add guest-pull feature for kata-agent
Add "guest-pull" feature option to determine that the related dependencies
would be compiled if the feature is enabled.

By default, agent would be built with default-pull feature, which would
support all pull types, including sharing images by virtio-fs and
pulling images in the guest.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 18:05:59 +01:00
ChengyuZhu6
cfd14784a0 agent: Introduce ImagePullHandler to support IMAGE_GUEST_PULL volume
As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume
which storing the image information as an integral part of `CreateContainer`.
Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`.
This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage.
To maintain clarity and avoid any need for modification to the `OverlayfsHandler`,we introduce the `ImagePullHandler`.
This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment.
This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`,
followed by a bind mount to `/run/kata-containers/{container_id}`.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
462051b067 agent/image: merge container spec for images pulled inside guest
When being passed an image name through a container annotation,
merge its corresponding bundle OCI specification and process into the passed container creation one.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
Co-authored-by: jordan9500 <jordan.jackson@ibm.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
cec1916196 agent: Support https_proxy/no_proxy config for image download in guest
Containerd can support set a proxy when downloading images with a environment variable.
For CC stack, image download is offload to the kata agent, we need support similar feature.
Current we add https_proxy and no_proxy, http_proxy is not added since it is insecure.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
9cddd5813c agent/image: Enable image-rs crate to pull image inside guest
With image-rs pull_image API, the downloaded container image layers
will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs
and config.json will be saved under CONTAINER_BASE/cid directory.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Arron Wang <arron.wang@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
2b3a00f848 agent: export the image service singleton instance
Export the image service singleton instance.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
2024-03-19 17:22:36 +01:00
ChengyuZhu6
1f1ca6187d agent: Introduce ImageService
Introduce structure ImageService, which will be used to pull images
inside the guest.

Fixes: #8103

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
co-authored-by: stevenhorsman <steven@uk.ibm.com>
2024-03-19 17:22:33 +01:00
Chelsea Mafrica
42dfe0e8d1 Merge pull request #9286 from jodh-intel/agent-show-enabled-features
agent: Show features enabled at build time
2024-03-19 08:54:49 -07:00
James O. D. Hunt
9ef59488d9 agent: Show features enabled at build time
The agent now has a number of optional build-time features that can be
enabled.

Add details of these features to the following areas:

- Version output (`kata-agent --version`)
- Announce message (so that the details are always added to the journal
  at agent startup).
- The response message returned by the ttRPC `GetGuestDetails()` API.

Fixes: #9285.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-15 13:29:21 +00:00
Zvonko Kaiser
c15e19c806 kata-agent: optional bind flag
Fixes: #9269

From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts
type (string, OPTIONAL) The type of the filesystem to be mounted.
bind may be only specified in the oci spec options -> flags update r#type
The agent will ignore bind mounts if they are only specified in the OCI spec options and not in the flags.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-03-14 14:42:01 +00:00
Alex Lyn
2aa3519520 kata-agent: Change order of guest hook and bind mount processing
The guest_hook_path item in configuration.toml allows OCI hook scripts
to be executed within Kata's guest environment. Traditionally, these
guest hook programs are pre-built and included in Kata's guest rootfs
image at a fixed location.

While setting guest_hook_path = "/usr/share/oci/hooks" in configuration.toml
works, it lacks flexibility. Not all guest hooks reside in the path
/usr/share/oci/hooks, and users might have custom locations.

To address this, a more flexible and configurable approach is to be proposed
that allows users to specify their desired path. This could include using a
sandbox bind mount path for hooks specific to that particular container.

However, The current implementation of guest hooks and bind mounts in kata-agent
has a reversed order of execution compared to the desired behavior.
To achieve the intended functionality, we simply need to swap the order of their
implementation.

Fixes: #9274

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-13 20:30:32 +08:00
Chengyu Zhu
bb4c608b32 Merge pull request #9110 from ChengyuZhu6/agent_option
agent: Add all agent configuration options to README
2024-02-28 18:50:44 +08:00
ChengyuZhu6
731c490ded agent: Add all agent configuration options to README
Add all agent configuration options to README so that users can more easily understand
what these options do and how to configure them at runtime.

Fixes: #9109

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-27 17:35:19 +08:00
Greg Kurz
600b951afd agent: Run container workload in its own cgroup namespace
When cgroup v2 is in use, a container should only see its part of the
unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created
at the OS level. Similarly, `/proc/self/cgroup` inside the container
should display `0::/`, rather than a full path such as :

0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope

What is needed here is isolation from the OS. Do that by running the
container in its own cgroup namespace. This matches what runc and
other non VM based runtimes do.

Fixes #9124

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
Greg Kurz
14886c7b32 agent: lint code
Run cargo-clippy to reduce noise in actual functional changes.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-02-21 13:14:13 +01:00
Zixuan Tan
222de4f684 agent: Fix a race condition in passfd_io.rs
There is a race condition in agent HVSOCK_STREAMS hashmap, where a
stream may be taken before it is inserted into the hashmap. This patch
add simple retry logic to the stream consumer to alleviate this issue.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
6e4d4c329a agent,runtime-rs: Add license header to passfd_io.rs
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00