mirror of
https://github.com/kata-containers/kata-containers.git
synced 2025-04-27 19:35:32 +00:00
1. Implemented a rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd). 2. Add support for optional cgroup configuration through fs and systemd at agent (src/agent/rustjail/src/container.rs). 3. Described the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md). Fixes: #4336 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
85 lines
3.9 KiB
Markdown
# Systemd Cgroup for Agent

There are two ways to interact with cgroups: **`cgroupfs`** and **`systemd`**. The former reads and writes the cgroup files under `/sys/fs/cgroup` directly, while the latter asks systemd to create and configure a transient unit. The Kata agent uses **`cgroupfs`** by default, unless you pass the parameter `--systemd-cgroup`.

## Usage

For systemd, the Kata agent configures cgroups according to the `linux.cgroupsPath` format defined by `runc` (`[slice]:[prefix]:[name]`). If you don't provide a valid `linux.cgroupsPath`, the Kata agent falls back to `"system.slice:kata_agent:<container-id>"`.

> Here `slice` is a systemd slice under which the container is placed. If empty, it defaults to `system.slice`, except when cgroup v2 is used and a rootless container is created, in which case it defaults to `user.slice`.
>
> Note that `slice` can contain dashes to denote a sub-slice (e.g. `user-1000.slice` is a correct notation, meaning a subslice of `user.slice`), but it must not contain slashes (e.g. `user.slice/user-1000.slice` is invalid).
>
> A `slice` of `-` represents a root slice.
>
> Next, `prefix` and `name` are used to compose the unit name, which is `<prefix>-<name>.scope`, unless `name` has a `.slice` suffix, in which case `prefix` is ignored and the `name` is used as is.

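
The naming rules above can be illustrated with a minimal Rust sketch. It is not the agent's actual code: `SystemdCgroupPath`, `parse_cgroups_path` and the two constants are hypothetical names introduced here, and "invalid" is assumed to mean "not exactly three colon-separated fields, an empty name, or a slice containing a slash". Special cases such as an empty prefix, the `-` root slice and the rootless `user.slice` default are left out.

```rust
/// Hypothetical parsed form of a `[slice]:[prefix]:[name]` value.
#[derive(Debug, PartialEq)]
struct SystemdCgroupPath {
    /// Slice under which the container unit is placed.
    slice: String,
    /// Full unit name, either `<prefix>-<name>.scope` or `<name>` itself.
    unit: String,
}

// Fallback values taken from the default path documented above.
const DEFAULT_SLICE: &str = "system.slice";
const DEFAULT_PREFIX: &str = "kata_agent";

/// Split a `linux.cgroupsPath` value and derive the slice and unit name.
/// Invalid input falls back to `system.slice:kata_agent:<container-id>`.
fn parse_cgroups_path(path: &str, container_id: &str) -> SystemdCgroupPath {
    let parts: Vec<&str> = path.split(':').collect();

    let (slice, prefix, name) = match parts.as_slice() {
        // Exactly three fields, a non-empty name and no slash in the slice.
        [s, p, n] if !s.contains('/') && !n.is_empty() => (*s, *p, *n),
        // Assumption: everything else counts as invalid.
        _ => (DEFAULT_SLICE, DEFAULT_PREFIX, container_id),
    };

    // An empty slice defaults to system.slice.
    let slice = if slice.is_empty() { DEFAULT_SLICE } else { slice };

    // A name ending in `.slice` is used as is and the prefix is ignored.
    let unit = if name.ends_with(".slice") {
        name.to_string()
    } else {
        format!("{}-{}.scope", prefix, name)
    };

    SystemdCgroupPath {
        slice: slice.to_string(),
        unit,
    }
}
```

Under these assumptions, `parse_cgroups_path("system.slice:docker:abc", "cid")` yields the unit `docker-abc.scope` in `system.slice`, while an empty path for container `cid` yields `kata_agent-cid.scope`.
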
## Supported Properties

The Kata agent translates the parameters in `linux.resources` of `config.json` into systemd unit properties and sends them to systemd for configuration. Since systemd supports only a limited set of properties, only the following parameters in `linux.resources` are applied. Hybrid mode is treated as legacy (v1) mode.

- CPU

  - v1

    | runtime spec resource | systemd property name |
    | --------------------- | --------------------- |
    | `cpu.shares`          | `CPUShares`           |

  - v2

    | runtime spec resource      | systemd property name                |
    | -------------------------- | ------------------------------------ |
    | `cpu.shares`               | `CPUShares`                          |
    | `cpu.period`               | `CPUQuotaPeriodUSec` (systemd v242+) |
    | `cpu.period` & `cpu.quota` | `CPUQuotaPerSecUSec`                 |

- MEMORY

  - v1

    | runtime spec resource | systemd property name |
    | --------------------- | --------------------- |
    | `memory.limit`        | `MemoryLimit`         |

  - v2

    | runtime spec resource          | systemd property name |
    | ------------------------------ | --------------------- |
    | `memory.low`                   | `MemoryLow`           |
    | `memory.max`                   | `MemoryMax`           |
    | `memory.swap` & `memory.limit` | `MemorySwapMax`       |

- PIDS

  | runtime spec resource | systemd property name |
  | --------------------- | --------------------- |
  | `pids.limit`          | `TasksMax`            |

- CPUSET

  | runtime spec resource | systemd property name                |
  | --------------------- | ------------------------------------ |
  | `cpuset.cpus`         | `AllowedCPUs` (systemd v244+)        |
  | `cpuset.mems`         | `AllowedMemoryNodes` (systemd v244+) |

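
Two rows in the tables above combine a pair of runtime spec values into a single systemd property. The following Rust sketch shows one plausible way to do that. It follows the conventions used by `runc` and is not confirmed to match the agent's implementation. The function names, the default period of 100000 microseconds, the rounding to 10 ms and the use of `u64::MAX` for "unlimited" are all assumptions made for illustration.

```rust
/// Assumed default CFS period in microseconds when `cpu.period` is unset.
const DEFAULT_CPU_PERIOD_US: u64 = 100_000;

/// Combine `cpu.quota` and `cpu.period` into a candidate value for
/// `CPUQuotaPerSecUSec`: microseconds of CPU time per second of wall time.
/// Returns `None` when the quota is not positive (no limit requested).
fn cpu_quota_per_sec_usec(quota: i64, period: u64) -> Option<u64> {
    if quota <= 0 {
        return None;
    }
    let period = if period == 0 { DEFAULT_CPU_PERIOD_US } else { period };

    // Scale the per-period quota to a per-second budget.
    let mut per_sec = (quota as u64).saturating_mul(1_000_000) / period;

    // Assumption (as in runc): round up to a multiple of 10 ms.
    if per_sec % 10_000 != 0 {
        per_sec = (per_sec / 10_000 + 1) * 10_000;
    }
    Some(per_sec)
}

/// Derive a candidate value for `MemorySwapMax` from `memory.swap` and
/// `memory.limit`. In the runtime spec, `memory.swap` is memory plus swap,
/// whereas the cgroup v2 swap limit covers swap only, so the memory limit
/// is subtracted. Returns `None` when no consistent value can be derived.
fn memory_swap_max(swap: i64, limit: i64) -> Option<u64> {
    // Assumption: -1 means unlimited, expressed to systemd as u64::MAX.
    if swap == -1 {
        return Some(u64::MAX);
    }
    if swap <= 0 || limit <= 0 || swap < limit {
        return None;
    }
    Some((swap - limit) as u64)
}
```

For example, a quota of 50000 with a period of 100000 (half a CPU) gives 500000 microseconds per second, and a `memory.swap` of 2 GiB with a `memory.limit` of 1 GiB gives a swap-only limit of 1 GiB.
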
## Systemd Interface

`session.rs` and `system.rs` in `src/agent/rustjail/src/cgroups/systemd/interface` are automatically generated by `zbus-xmlgen`, an accompanying tool provided by `zbus` that generates Rust code from D-Bus XML interface descriptions. The commands used to generate these two files are:

```shell
# system.rs
zbus-xmlgen --system org.freedesktop.systemd1 /org/freedesktop/systemd1
# session.rs
zbus-xmlgen --session org.freedesktop.systemd1 /org/freedesktop/systemd1
```

The current implementation of `cgroups/systemd` uses `system.rs`, while `session.rs` could be used to build rootless containers in the future.

## References

- [runc - systemd cgroup driver](https://github.com/opencontainers/runc/blob/main/docs/systemd.md)
- [systemd.resource-control — Resource control unit settings](https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html)