kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-18 10:44:10 +00:00

Author	SHA1	Message	Date
Chengyu Zhu	bb4c608b32	Merge pull request #9110 from ChengyuZhu6/agent_option agent: Add all agent configuration options to README	2024-02-28 18:50:44 +08:00
ChengyuZhu6	731c490ded	agent: Add all agent configuration options to README Add all agent configuration options to README so that users can more easily understand what these options do and how to configure them at runtime. Fixes: #9109 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-27 17:35:19 +08:00
Greg Kurz	600b951afd	agent: Run container workload in its own cgroup namespace When cgroup v2 is in use, a container should only see its part of the unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created at the OS level. Similarly, `/proc/self/cgroup` inside the container should display `0::/`, rather than a full path such as : 0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope What is needed here is isolation from the OS. Do that by running the container in its own cgroup namespace. This matches what runc and other non VM based runtimes do. Fixes #9124 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
Greg Kurz	14886c7b32	agent: lint code Run cargo-clippy to reduce noise in actual functional changes. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
Zixuan Tan	222de4f684	agent: Fix a race condition in passfd_io.rs There is a race condition in agent HVSOCK_STREAMS hashmap, where a stream may be taken before it is inserted into the hashmap. This patch add simple retry logic to the stream consumer to alleviate this issue. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	6e4d4c329a	agent,runtime-rs: Add license header to passfd_io.rs Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	1206de2c23	agent: Use pipes as stdout/stderr of container process Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>, making some images relying on the special file /dev/stdout(stderr), /proc/self/fd/1(2) fail to boot in passfd io mode, where the stdout/stderr of a container process is a vsock socket. For back compatibility, a pipe is introduced between the process and the socket, and its read end is set as stdout/stderr of the container process instead of the socket. The agent will do the forwarding between the pipe and the socket. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f6710610d1	agent,runtime-rs,runk: fix fmt and clippy warnings Fix rustfmt and clippy warnings detected by CI. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	3eb4bed957	agent: use biased select to avoid data loss This patch uses a biased select to avoid stdin data loss in case of CloseStdinRequest. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	7874ef5fd2	agent: set stdout/err vsock stream as blocking before passing to child In passfd io mode, when not using a terminal, the stdout/stderr vsock streams are directly used as the stdout/stderr of the child process. These streams are non-blocking by default. The stdout/stderr of the process should be blocking, otherwise the process may encounter EAGAIN error when writing to stdout/stderr. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	5536743361	agent,runtime-rs: fix container io detach and attach Partially fix some issues related to container io detach and attach. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f1b33fd2e0	agent: clean up term master fd when container exits When container exits, the agent should clean up the term master fd, otherwise the fd will be leaked. Fixes: kata-containers#6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	442df71fe5	agent,runtime-rs: refactor process io using vsock fd passthrough feature Currently in the kata container, every io read/write operation requires an RPC request from the runtime to the agent. This process involves data copying into/from an RPC request/response, which are high overhead. To solve this issue, this commit utilize the vsock fd passthrough, a newly introduced feature in the Dragonball hypervisor. This feature allows other host programs to pass a file descriptor to the Dragonball process, directly as the backend of an ordinary hybrid vsock connection. The runtime-rs now utilizes this feature for container process io. It open the stdin/stdout/stderr fifo from containerd, and pass them to Dragonball, then don't bother with process io any more, eliminating the need for an RPC for each io read/write operation. In passfd io mode, the agent uses the vsock connections as the child process's stdin/stdout/stderr, eliminating the need for a pipe to bump data (in non-tty mode). Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	eb6bb6fe0d	config: add two options to control vsock passthrough io feature Two toml options, `use_passfd_io` and `passfd_listener_port` are introduced to enable and configure dragonball's vsock fd passthrough io feature. This commit is a preparation for vsock fd passthrough io feature. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Amulyam24	f6fea5f2ca	agent: fix failing unit tests on ppc64le - test_volume_capacity_stats: verify the file block size against the fetched size via statfs() - test_reseed_rng: Correct the request codes for RNDADDTOENTCNT and RNDRESEEDCRNG when platform is ppc64le - test list_routes: Add the route only if destination is not empty - test_new_fs_manager: skip the test if cgroups v2 is used by default - skip test cases rpc::tests::test_do_write_stream, sandbox::tests::test_find_process, sandbox::t ests::test_find_container_process and sandbox::tests::add_and_get_container on ppc64le as they are fl aky Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:32:16 +01:00
Hyounggyu Choi	c0f57c9e0a	Lint: Fix `cargo clippy` errors for s390x Some linting errors were identified during the enablement of `make check`. These have not been found by the Jenkins CI job because `make test` was only triggered. The errors for the `agent` occurs under the s390x specific tests while the other ones for the `kata-ctl` are the architecture-specific code. This commit is to fix those errors. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Greg Kurz	e3611cf27d	Merge pull request #8326 from cheriL/8325/fix_method_param agent: use method params instead of const params in functions	2024-01-09 07:35:19 +01:00
Xuewei Niu	192c6ee9c3	Merge pull request #8773 from justxuewei/dbs-k8s-fragile	2024-01-05 12:54:32 +08:00
Xuewei Niu	0e9d73fe30	agent: Fix an issue reporting OOM events by mistake The agent registers an event fd in `memory.oom_control`. An OOM event is forwarded to containerd when the event is emitted, regardless of the content in that file. I observed content indicating that events should not be forwarded, as shown below. When `oom_kill` is set to 0, it means no OOM has occurred. Therefore, it is important to check the content to avoid mistakenly forwarding OOM events. ``` oom_kill_disable 0 under_oom 0 oom_kill 0 ``` Fixes: #8715 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-05 11:06:37 +08:00
Dan Mihai	7d5336aca3	agent: hold lock while setting new policy Don't release the lock between is_allowed and set_policy calls, because the policy might change in between these calls. Also, move more policy code into policy.rs. Fixes: #8734 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-04 16:45:30 +00:00
soup	7c176a62fe	agent: use method params instead of const params in functions Fixes: #8325 Signed-off-by: soup <lqh348659137@outlook.com>	2024-01-04 09:29:29 +01:00
Xuewei Niu	91360e7ddb	agent: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Chao Wu	71c322c293	runtime-rs: fix ci complains vfio commits introduce quite a lot change in runtime-rs, this commit is for all the changes related to ci, including compilation errors and so on. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 23:34:41 +08:00
Jianyong Wu	58e88d9469	agent: correct CPUShares and CPUWeight value If cgroup driver is systemd, CPUShares, for cgroup v1, should be at least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2]. Fixes: #8340 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> [1] `d19434fbf8/src/basic/cgroup-util.h (L122)` [2] `d19434fbf8/src/basic/cgroup-util.h (L91)`	2023-12-15 02:04:31 +08:00
Sumedh Alok Sharma	4aaf54bdad	runtime: Fix configmap/secrets update propagation with FS sharing disabled This PR fixes k8's configmap/secrets etc update propagation when filesystem sharing is disabled. The commit introduces below changes with some limitations: - creates new timestamped directory in guest - updates the '..data' symlink - creates user visible symlinks to newly created secrets. - Limitation: The older timestamped directory and stale user visible symlinks exist in guest due to missing DELETE api in agent. Fixes: #7398 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2023-11-17 13:01:23 +05:30
gaohuatao	78df1bb851	agent: update AGENT_THREADS metrics value Fixes: #8369 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2023-11-10 10:39:57 +08:00
Xuewei Niu	023d8dc01e	agent: Changes according to Pan's comments - Disable device cgroup restriction while pod cgroup is not available. - Remove balcklist-related names and change whitelist-related names to allowed_all. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:08 +08:00
Xuewei Niu	b5f3a8cb39	agent: Fix container launching failure with systemd cgroup FSManager of systemd cgroup manager is responsible for setting up cgroup path. The container launching will be failed if the FSManager is in read-only mode. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	6477825195	agent: Minor changes according to Zhou's comments The changes include: - Change to debug logging level for resources after processed. - Remove a todo for pod cgroup cleanup. - Add an anyhow context to `get_paths_and_mounts()`. - Remove code which denys access to VMROOTFS since it won't take effect. If blackmode is in use, the VMROOTFS will be denyed as default. Otherwise, device cgroups won't be updated in whitelist mode. - Add a unit test for `default_allowed_devices()`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	cec8044744	agent: Make devcg_info optional for LinuxContainer::new() The runk is a standard OCI runtime that isnt' aware of concept of sandbox. Therefore, the `devcg_info` argument of `LinuxContainer::new()` is unneccessary to be provided. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	ef4c3844a3	agent: Restrict device access at upper node of container's cgroup The target is to guarantee that containers couldn't escape to access extra devices, like vm rootfs, etc. Assume that there is a cgroup, such as `/A/B`. The `B` is container cgroup, and the `A` is what we called pod cgroup. No matter what permissions are set for the container (`B`), the `A`'s permission is always `a : rwm`. It leads that containers could acquire permission to access to other devices in VM that not belongs to themselves. In order to set devices cgroup properly, the order of setting cgroups is that the pod cgroup comes first and the container cgroup comes after. The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To avoid setting container cgroup too early, an initialization should be done carefully. `inited`, one of the states, is a boolean to indicate if the pod cgroup is initialized. If no, the pod cgroup should be created firstly, and set default permissions. After that, the pause container cgroup is created and inherits the permissions from the pod cgroup. If whitelist mode which allows containers to access all devices in VM is enabled, then device resources from OCI spec are ignored. This feature not supports systemd cgroup and cgroup v2, since: - Systemd cgroup implemented on Agent hasn't supported devices subsystem so far, see: https://github.com/kata-containers/kata-containers/issues/7506. - Cgroup v2's device controller depends on eBPF programs, which is out of scope of cgroup. Fixes: #7507 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Beraldo Leal	c5d845b30a	agent: updating Cargo.lock files Probably previous changes missed updating Cargo.lock. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Archana Shinde	148c565b2f	Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x agent: Skip flaky create_tmpfs on s390x	2023-10-30 22:26:28 -07:00
HanZiyao	a3b003c345	agent: support bind mounts between containers This feature supports creating bind mounts directly between containers through annotations. Fixes: #6715 Signed-off-by: HanZiyao <h56983577@126.com>	2023-10-26 16:34:50 +08:00
Hyounggyu Choi	a0746c8d7b	agent: Skip flaky create_tmpfs on s390x This is to skip a flaky test `create_tmpfs()` on s390x until a root cause is identified and fixed. Fixes: #4248 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-10-23 11:22:14 +02:00
Dan Mihai	52aaf10759	agent: no endpoint blocking from agent-config.toml Remove the ability to block access to kata agent endpoints by using agent-config.toml. That functionality is now implemented using the Agent Policy feature (#7573). The CCv0 branch relied on blocking endpoints using agent-config.toml but will set-up an equivalent default policy file instead (#8219). Fixes: #8228 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-20 02:26:54 +00:00
Fabiano Fidêncio	1727487eef	agent: Allow specifying DESTDIR and AGENT_POLICY via env vars This will help to build the agent binary as part of the kata-deploy localbuild, as we need to pass the DESTDIR to where the agent will be installed, and also whether we're building the agent with policy support enabled or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 14:18:45 +02:00
Chao Wu	6f98fbafde	Merge pull request #6706 from guixiongwei/feat/thp feat(runtime-rs): introduce huge page mode to select VM RAM's backend	2023-09-22 15:27:06 +08:00
Fabiano Fidêncio	ec826f328f	agent: Ensure GENERATED_CODE is a dep of `make test` Otherwise `make test` will fail with: ``` error[E0583]: file not found for module `version` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:57 +02:00
stevenhorsman	75cfdd5d59	agent: config: Allow clippy lint - Allow `clippy::redundant-closure-call` in `from_cmdline` which has issues with the guard function passed into the `parse_cmdline_param` macro Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	f3a0fd5907	agent: config: Fix useles-vec warning Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
Yipeng Yin	a16b0962b5	chore(cargo): update cargo lock Update cargo lock for runtime-rs, agent and kata-ctl. Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-09-12 15:27:38 +08:00
Yuan-Zhuo	470d065415	agent: optimize the code of systemd cgroup manager 1. Directly support CgroupManager::freeze through systemd API. 2. Avoid always passing unit_name by storing it into DBusClient. 3. Realize CgroupManager::destroy more accurately by killing systemd unit rather than stop it. 4. Ignore no such unit error when destroying systemd unit. 5. Update zbus version and corresponding interface file. Acknowledgement: error handling for no such systemd unit error refers to Fixes: #7080, #7142, #7143, #7166 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com> Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>	2023-09-09 13:56:43 +08:00
Jiang Liu	57e7bf14a6	agent: refine StorageDeviceGeneric::cleanup() Refine StorageDeviceGeneric::cleanup() to improve safety. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:22:21 +08:00
Jiang Liu	53edb19374	agent: implement StorageDeviceGeneric::cleanup() Refactor cleanup_sandbox_storage as StorageDeviceGeneric::cleanup(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:00:26 +08:00
Jiang Liu	0c63453e28	types: make StorageDevice::cleanup() return possible error code Make StorageDevice::cleanup() return possible error code. Fixes: #7818 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:27:06 +08:00
Jiang Liu	3a3d77b3b5	agent: move StorageDeviceGeneric from kata-types into agent Move StorageDeviceGeneric from kata-types into agent, so we can refactor code later. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:12:17 +08:00
Jiang Liu	9cd706d1c9	agent: avoid possible leakage of storage device When a storage device is used by more than one container, the second and forth instances will cause storage device reference count leakage, thus cause storage device leakage. The reason is: add_storages() will increase reference count of existing storage device, but forget to add the device to the `mount_list` array, thus leak the reference count. Fixes: #7820 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-01 22:52:42 +08:00
Jiang Liu	91db888d83	Merge pull request #7602 from jiangliu/agent-storage Refine storage device management for kata-agent	2023-08-25 22:20:18 +08:00
Jiang Liu	aaa5ab1264	agent: simplify storage device by removing StorageDeviceObject Simplify storage device implementation by removing StorageDeviceObject. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-25 17:23:16 +08:00

1 2 3 4 5 ...

1058 Commits