Mirror of https://github.com/kata-containers/kata-containers.git (synced 2025-06-28 16:27:50 +00:00)
architecture.md: update the architecture of v1.5
- shimv2 has been introduced
- vsock has been supported

Fixes: #322

Signed-off-by: Xu Wang <xu@hyper.sh>
parent 75501df2fa
commit 04498bd4d2
142
architecture.md
@@ -1,39 +1,38 @@
 # Kata Containers Architecture
 
 * [Overview](#overview)
 * [Hypervisor](#hypervisor)
 * [Assets](#assets)
 * [Guest kernel](#guest-kernel)
 * [Root filesystem image](#root-filesystem-image)
 * [Agent](#agent)
 * [Runtime](#runtime)
 * [Configuration](#configuration)
 * [Significant commands](#significant-commands)
 * [create](#create)
 * [start](#start)
 * [exec](#exec)
 * [kill](#kill)
 * [delete](#delete)
 * [Proxy](#proxy)
 * [Shim](#shim)
 * [Networking](#networking)
 * [Storage](#storage)
 * [Kubernetes Support](#kubernetes-support)
 * [Problem Statement](#problem-statement)
-* [CRI-O](#cri-o)
-* [OCI Annotations](#oci-annotations)
-* [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
-* [Containerd CRI Plugin (Formerly CRI-containerd)](#containerd-cri-plugin-formerly-cri-containerd)
-* [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes-1)
+* [Containerd](#containerd)
+* [CRI-O](#cri-o)
+* [OCI Annotations](#oci-annotations)
+* [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
 * [Appendices](#appendices)
 * [DAX](#dax)
 
 ## Overview
 
-This is an architectural overview of Kata Containers, based on the 1.2.0 release.
+This is an architectural overview of Kata Containers, based on the 1.5.0 release.
 
 The two primary deliverables of the Kata Containers project are a container runtime
-and a CRI friendly library API.
+and a CRI friendly shim. There is also a CRI friendly library API behind them.
 
 The [Kata Containers runtime (kata-runtime)](https://github.com/kata-containers/runtime)
 is compatible with the [OCI](https://github.com/opencontainers) [runtime specification](https://github.com/opencontainers/runtime-spec)
@@ -45,29 +44,39 @@ through the [CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and
 select between the [default Docker and CRI shim runtime (runc)](https://github.com/opencontainers/runc)
 and `kata-runtime`.
 
-![Docker and Kata Containers](arch-images/docker-kata.png)
-
 `kata-runtime` creates a QEMU\*/KVM virtual machine for each container or pod,
 the Docker engine or Kubernetes' `kubelet` creates respectively.
 
+![Docker and Kata Containers](arch-images/docker-kata.png)
+
+The [`containerd-shim-kata-v2` (shown as `shimv2` from this point onwards)](https://github.com/kata-containers/runtime/tree/master/containerd-shim-v2)
+is another Kata Containers entrypoint, which
+implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2) for Kata.
+With `shimv2`, kubernetes can launch Pod and OCI compatible containers with one shim (the `shimv2`) per Pod instead
+of `2N+1` shims (a `containerd-shim` and a `kata-shim` for each container and the Pod sandbox itself), and no standalone
+`kata-proxy` process even if no vsock is available.
+
+![Kubernetes integration with shimv2](arch-images/shimv2.svg)
+
 The container process is then spawned by
 [agent](https://github.com/kata-containers/agent), an agent process running
 as a daemon inside the virtual machine. kata-agent runs a gRPC server in
-the guest using a virtio serial interface which QEMU exposes as a serial
-device on the host. kata-runtime uses a gRPC protocol to communicate with
+the guest using a virtio serial or vsock interface which QEMU exposes as a socket
+file on the host. kata-runtime uses a gRPC protocol to communicate with
 the agent. This protocol allows the runtime to send container management
-commands to the agent. The protocol is also used to pass I/O streams (stdout,
-stderr, stdin) between the guest and the Docker Engine.
+commands to the agent. The protocol is also used to carry the I/O streams (stdout,
+stderr, stdin) between the containers and the manage engines (e.g. Docker Engine).
 
 For any given container, both the init process and all potentially executed
 commands within that container, together with their related I/O streams, need
-to go through the virtio serial interface exported by QEMU. A [Kata Containers
+to go through the virtio serial or vsock interface exported by QEMU.
+In the virtio serial case, a [Kata Containers
 proxy (`kata-proxy`)](https://github.com/kata-containers/proxy) instance is
 launched for each virtual machine to handle multiplexing and demultiplexing
 those commands and streams.
 
 On the host, each container process's removal is handled by a reaper in the higher
-layers of the container stack. In the case of Docker it is handled by `containerd-shim`.
+layers of the container stack. In the case of Docker or containerd it is handled by `containerd-shim`.
 In the case of CRI-O it is handled by `conmon`. For clarity, for the remainder
 of this document the term "container process reaper" will be used to refer to
 either reaper. As Kata Containers processes run inside their own virtual machines,
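Review note: the hunk above keeps the statement that, in the virtio serial case, `kata-proxy` multiplexes and demultiplexes commands and I/O streams over the single channel QEMU exposes. The idea can be illustrated with a minimal Python sketch. The frame layout (stream id, payload length, payload) and the function names `mux` and `demux` are hypothetical and chosen only for illustration; this is not the wire format `kata-proxy` actually uses.

```python
import struct

# Hypothetical frame header: 4-byte stream id + 4-byte payload length, big-endian.
HEADER = struct.Struct(">II")


def mux(chunks):
    """Interleave (stream_id, payload) pairs into one byte string."""
    return b"".join(HEADER.pack(sid, len(data)) + data for sid, data in chunks)


def demux(blob):
    """Split a multiplexed byte string back into per-stream payloads."""
    streams = {}
    offset = 0
    while offset < len(blob):
        sid, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        # Append this frame's payload to the buffer of its stream.
        streams.setdefault(sid, bytearray()).extend(blob[offset:offset + length])
        offset += length
    return {sid: bytes(buf) for sid, buf in streams.items()}


if __name__ == "__main__":
    # Two logical streams (say, stdout = 1 and stderr = 2) share one channel.
    wire = mux([(1, b"hello "), (2, b"warn"), (1, b"world")])
    print(demux(wire))
```

With vsock each connection is already a separate stream, which is consistent with the text limiting the proxy to the virtio serial case.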
@@ -81,6 +90,10 @@ reaper. `kata-runtime` creates a `kata-shim` daemon for each container and for e
 OCI command received to run within an already running container (example, `docker
 exec`).
 
+Since Kata Containers version 1.5, the new introduced `shimv2` has integrated the
+functionalities of the reaper, the `kata-runtime`, the `kata-shim`, and the `kata-proxy`.
+As a result, there will not be any of the additional processes previously listed.
+
 The container workload, that is, the actual OCI bundle rootfs, is exported from the
 host to the virtual machine. In the case where a block-based graph driver is
 configured, virtio-scsi will be used. In all other cases a 9pfs virtio mount point
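Review note: read together with the `2N+1` figure added in the previous hunk, the new paragraph implies a simple per-pod count of host-side helper processes. The sketch below only works through that arithmetic. The function names are hypothetical, `n` is used exactly as in the document's `2N+1` expression, and counting one `kata-proxy` per virtual machine when vsock is unavailable is an assumption taken from the virtio serial description, not a measurement.

```python
def legacy_helper_processes(n: int, vsock: bool) -> int:
    """Helper processes per pod before shimv2, following the document's figures."""
    shims = 2 * n + 1            # the `2N+1` shims quoted in the document
    proxy = 0 if vsock else 1    # assumed: one kata-proxy per VM on virtio serial
    return shims + proxy


def shimv2_helper_processes(n: int, vsock: bool) -> int:
    """With shimv2 there is one shim per pod and no standalone kata-proxy."""
    return 1


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, legacy_helper_processes(n, vsock=False), shimv2_helper_processes(n, vsock=False))
```

The count for `shimv2` does not depend on `n` or on vsock availability, which is the point the added paragraph makes.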
@@ -562,6 +575,17 @@ a previously created virtual machine. In both cases it will get called with very
 arguments, so it needs the help of the Kubernetes CRI runtime to be able to distinguish a
 pod creation request from a container one.
 
+### Containerd
+
+As of Kata Containers 1.5, using `shimv2` with containerd 1.2.0 or above is the preferred
+way to run Kata Containers with Kubernetes ([see the howto](https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md#configure-containerd-to-use-kata-containers)).
+The CRI-O will catch up soon ([kubernetes-sigs/cri-o#2024](https://github.com/kubernetes-sigs/cri-o/issues/2024)).
+
+Refer to the following how-to guides:
+
+- [How to use Kata Containers and Containerd](../how-to/containerd-kata.md)
+- [How to use Kata Containers and CRI (containerd plugin) with Kubernetes](../how-to/how-to-use-k8s-with-cri-containerd-and-kata.md)
+
 ### CRI-O
 
 #### OCI annotations
@@ -607,6 +631,9 @@ with a Kubernetes pod:
 
 #### Mixing VM based and namespace based runtimes
 
+> **Note:** Since Kubernetes 1.12, the `[Kubernetes RuntimeClass](how-to/containerd-kata.md#kubernetes-runtimeclass)`
+> has been supported and the user can specify runtime without the non-standardized annotations.
+
 One interesting evolution of the CRI-O support for `kata-runtime` is the ability
 to run virtual machine based pods alongside namespace ones. With CRI-O and Kata
 Containers, one can introduce the concept of workload trust inside a Kubernetes
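Review note: the added note says a runtime can be selected through RuntimeClass instead of annotations. As a rough illustration, the Python sketch below builds the two objects involved and prints them as JSON. The field names follow the `node.k8s.io/v1` RuntimeClass API, which postdates the alpha schema shipped with Kubernetes 1.12; the handler name `kata`, the object names, and the image are hypothetical and must match whatever the CRI runtime on the node is actually configured with.

```python
import json


def runtime_class(name: str, handler: str) -> dict:
    """Cluster-scoped object mapping a class name to a CRI runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # assumed to match a handler configured in the CRI runtime
    }


def pod(name: str, image: str, runtime_class_name: str) -> dict:
    """Pod that asks for a specific runtime via spec.runtimeClassName."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(runtime_class("kata", "kata"), indent=2))
    print(json.dumps(pod("untrusted-app", "busybox", "kata"), indent=2))
```

Pods that omit `runtimeClassName` keep using the default runtime, so VM based and namespace based pods can still be mixed on one node.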
@@ -636,53 +663,6 @@ a pod is **not** `Privileged` the runtime selection is done as follows:
 | Default CRI-O trust level: `untrusted` | Kata Containers | Kata Containers | Kata Containers |
 
 
-### Containerd CRI Plugin (Formerly CRI-containerd)
-
-The general guidelines for the Containerd CRI Plugin support is similar to the CRI-O support. You can run trusted workloads with a runtime like `runc` and then run an untrusted workload with Kata Containers. The parameters that you can modify in the containerd config to run Kata Containers along with another 'trusted' runtime are the following:
-
-
-```
-# /etc/containerd/config.toml
-
-[plugins.cri]
-
-[plugins.cri.containerd]
-
-# "plugins.cri.containerd.default_runtime" is the runtime to use in containerd.
-[plugins.cri.containerd.default_runtime]
-# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
-runtime_type = "io.containerd.runtime.v1.linux"
-
-# runtime_engine is the name of the runtime engine used by containerd.
-runtime_engine = ""
-
-# runtime_root is the directory used by containerd for runtime state.
-runtime_root = ""
-
-# "plugins.cri.containerd.untrusted_workload_runtime" is a runtime to run untrusted workloads on it.
-[plugins.cri.containerd.untrusted_workload_runtime]
-# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
-runtime_type = "io.containerd.runtime.v1.linux"
-
-# runtime_engine is the name of the runtime engine used by containerd.
-runtime_engine = "/usr/bin/kata-runtime"
-
-# runtime_root is the directory used by containerd for runtime state.
-runtime_root = ""
-```
-
-You can find more information on the [Containerd config documentation](https://github.com/containerd/cri/blob/master/docs/config.md)
-
-
-#### Mixing VM based and namespace based runtimes
-
-The CRI Plugin supports the following annotation in a Kubernetes pod to identify as an untrusted workload
-
-```
-annotations:
-io.kubernetes.cri.untrusted-workload: "true"
-```
-
 # Appendices
 
 ## DAX