Compare commits

56 Commits

Author SHA1 Message Date
Fabiano Fidêncio
0e2be438bd Merge pull request #2007 from fidencio/2.1.1-branch-bump
# Kata Containers 2.1.1
2021-06-11 17:09:11 +02:00
Fabiano Fidêncio
55dede1bce release: Kata Containers 2.1.1
- stable-2.1 | week 23: weekly backports
- stable-2.1 | versions: Update kubernetes to 1.21.1
- stable-2.1 | Port fd leak fixes
- [stable-2.1] Weekly backports to stable-2.1 branch, May 31st 2021
- [backport]runtime: and cgroup and SandboxCgroupOnly check for check sub-command
- Weekly stable 2.1 backports may 24th
- [backport-2.1] workflows: release kata 2.x snap through the stable channel
- [2.1] how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
- github: Do not run require porting labels on stable-2.1

492729f4 tools/packaging: clone meson and dependencies before building QEMU
db8d853b runtime: remove covertool from cli test
3fad5277 docs: Fix Release Process document
175970c9 versions: Update kubernetes to 1.21.1
1cc2ad3f agent: Fix fd leak caused by netlink
ac34f6df agent: Upgrade tokio-vsock to fix fd leak of vsock socket
915fea7b cgroup: fix the issue of set mem.limit and mem.swap
a05e1377 agent: re-enable the standard SIGPIPE behavior
8019f732 virtiofsd: Fix file descriptors leak and return correct PID
e48c9d42 runtime: and cgroup and SandboxCgroupOnly check for check sub-command
7874ab33 agent: fix start container failed when dropping all capabilities
536634e9 qemu: align before memory hotplug on arm64
c51891fe sandbox-bindmount: persist mount information
b137c7ac sandbox: Cleanup if failure to setup sandbox-bindmount occurs
68a77a7d workflows: release kata 2.x snap through the stable channel
550269ff how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
1ea0dc98 github: Do not run require porting labels on stable-2.1

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-06-11 09:48:55 +02:00
Fabiano Fidêncio
4d514ba1fb Merge pull request #1978 from fidencio/wip/weekly-backports-week-23
stable-2.1 | week 23: weekly backports
2021-06-10 10:19:52 +02:00
Julio Montes
492729f443 tools/packaging: clone meson and dependencies before building QEMU
In some distros (Ubuntu 18 and 20) it's not possible to clone meson
and QEMU dependencies from https://git.qemu.org due to problems with
its certificates, so let's pull these dependencies from GitHub before
building QEMU.

fixes #1965

Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 9ec9bbbabc)
2021-06-08 10:37:42 +02:00
Fabiano Fidêncio
645e950b8e Merge pull request #1963 from fidencio/wip/stable-2.1-update-kubernetes-1.21.0-to-1.21.1
stable-2.1 | versions: Update kubernetes to 1.21.1
2021-06-08 10:02:21 +02:00
Shengjing Zhu
db8d853b99 runtime: remove covertool from cli test
covertool has had no activity since 2018 and is not compatible with go1.16

  ../vendor/github.com/dlespiau/covertool/pkg/cover/cover.go:76:29: cannot use f (type dummyTestDeps) as type testing.testDeps in argument to testing.MainStart:
  dummyTestDeps does not implement testing.testDeps (missing SetPanicOnExit0 method)

Fixes: #1862

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
(cherry picked from commit 1b60705646)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-06-08 10:00:51 +02:00
Gabriela Cervantes
3fad527734 docs: Fix Release Process document
This PR corrects the URL for GitHub Actions and fixes a
misspelling.

Fixes #1960

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 9158ec68cc)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-06-08 10:00:47 +02:00
Fabiano Fidêncio
175970c93a versions: Update kubernetes to 1.21.1
The reason for doing so is to (try to) avoid random crashes we've been
facing as part of our CI, such as the one reported in
https://github.com/kata-containers/tests/issues/3473

Fixes: #1850

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
(cherry picked from commit dbef2b2931)
2021-06-04 15:43:28 +02:00
Tim Zhang
6cea1b146c Merge pull request #1959 from Tim-Zhang/port-fix-fd-leak-for-stable-2.1
stable-2.1 | Port fd leak fixes
2021-06-03 20:15:23 +08:00
Tim Zhang
1cc2ad3f34 agent: Fix fd leak caused by netlink
See also: little-dude/netlink#165

Fixes: #1952

Because the author of netlink has no time to maintain the crate
(https://github.com/little-dude/netlink/issues/161), we
need to switch the dependency to GitHub temporarily.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-03 17:24:09 +08:00
Tim Zhang
ac34f6dfd9 agent: Upgrade tokio-vsock to fix fd leak of vsock socket
Fixes: #1950

Further information: rust-vsock/vsock-rs#15

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-03 10:29:33 +08:00
Julio Montes
ff206cf6cf Merge pull request #1946 from fidencio/wip/weekly-backports-to-stable-2.1
[stable-2.1] Weekly backports to stable-2.1 branch, May 31st 2021
2021-06-01 16:00:42 -05:00
Bin Liu
57f7ffbe39 Merge pull request #1940 from liubin/backport/1934
[backport]runtime: and cgroup and SandboxCgroupOnly check for check sub-command
2021-06-01 08:32:45 +08:00
fupan.lfp
915fea7b1f cgroup: fix the issue of set mem.limit and mem.swap
When updating the memory limit, we should adapt the write sequence
for memory and swap memory, so that the update won't fail because
the new value and the old value don't pass the kernel's
validation.
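
The ordering rule can be sketched in shell. This is only an illustration, assuming the cgroup v1 constraint that memory.limit_in_bytes may never exceed memory.memsw.limit_in_bytes; the helper name write_order is hypothetical and this is not the agent's actual code.

```shell
#!/bin/sh
# Hypothetical helper: print the order in which the two cgroup v1 limit
# files should be written. Arguments are byte counts:
#   $1 = current memory+swap limit, $2 = new memory limit
write_order() {
    if [ "$2" -gt "$1" ]; then
        # Raising the limit: lift the memory+swap ceiling first.
        echo "memory.memsw.limit_in_bytes memory.limit_in_bytes"
    else
        # Lowering (or keeping) the limit: shrink the memory limit first.
        echo "memory.limit_in_bytes memory.memsw.limit_in_bytes"
    fi
}

write_order 1073741824 2147483648   # raise from 1G to 2G
write_order 2147483648 1073741824   # lower from 2G to 1G
```

Writing the files in the opposite order would momentarily ask the kernel for a memory limit above the memory+swap limit, which it rejects.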

Fixes: #1917

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit 30f4834c5b)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
fupan.lfp
a05e137710 agent: re-enable the standard SIGPIPE behavior
The Rust standard library suppresses the default SIGPIPE
behavior, see https://github.com/rust-lang/rust/pull/13158.
Since the parent's signal handler is inherited by its child
process, we should re-enable the standard SIGPIPE behavior as a
workaround.

Fixes: #1887

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit 0ae364c8eb)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
bin
8019f7322d virtiofsd: Fix file descriptors leak and return correct PID
This commit fixes two problems:
- The virtiofsd process ID returned to the caller is always 0,
  because the pid variable is never assigned a value.
- The socket listen fd may leak if starting the virtiofsd process fails.
  This is a port of be9ca0d58b

Fixes: #1931

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 773deca2f6)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
bin
e48c9d426d runtime: and cgroup and SandboxCgroupOnly check for check sub-command
In the kata-runtime check sub-command, check cgroups and SandboxCgroupOnly
and show a message if SandboxCgroupOnly is not set to true
and cgroup v2 is used.

Fixes: #1927

Signed-off-by: bin <bin@hyper.sh>
2021-05-28 16:36:49 +08:00
Peng Tao
6a7e6a8d0a Merge pull request #1921 from fidencio/wip/weekly-stable-2.1-backports-may-24th
Weekly stable 2.1 backports may 24th
2021-05-25 10:17:18 +08:00
quanweiZhou
7874ab33d4 agent: fix start container failed when dropping all capabilities
When starting a container and dropping all capabilities,
the init child process has no permission to read the exec.fifo
file because the parent set the file mode to 0o622.
So change the exec.fifo file mode to 0o644.
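
The difference between the two modes can be checked with a little octal arithmetic. The sketch below is only an illustration of the permission bits; the helper name others_can_read is hypothetical, and it assumes the reading process is matched by the "other" permission class.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the "other" class has the read bit set.
#   $1 = file mode as octal digits, e.g. 622
others_can_read() {
    # A leading 0 makes the shell parse the value as octal; 04 is the
    # read bit of the "other" class.
    [ $(( 0$1 & 04 )) -ne 0 ]
}

for mode in 622 644; do
    if others_can_read "$mode"; then
        echo "$mode: others can read"
    else
        echo "$mode: others cannot read"
    fi
done
```

Mode 622 grants "other" only the write bit, while 644 grants only the read bit, which is what a reader of the fifo needs.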

fixes #1913

Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
(cherry picked from commit 3e4ebe10ac)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:15 +02:00
Yuanzhe Liu
536634e909 qemu: align before memory hotplug on arm64
When hotplugging memory on arm64 in kata, the kernel complains:

[ 0.396551] Block size [0x40000000] unaligned hotplug range: start 0xc8000000, size 0x40000000
[ 0.396556] acpi PNP0C80:01: add_memory failed
[ 0.396834] acpi PNP0C80:01: acpi_memory_enable_device() error
[ 0.396948] acpi PNP0C80:01: Enumeration failure

This means the kernel checks whether the memory range to be hotplugged
is aligned to 1G before plugging the memory. So we should tweak qemu to
make sure the memory range is aligned to 1G and passes the kernel check.
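
The rounding itself is simple arithmetic. As a sketch (the function name align_up is hypothetical and this is not the qemu.go code), the start address from the log above can be rounded up to the next 1G boundary like this:

```shell
#!/bin/sh
# Hypothetical helper: round an address up to the next multiple of a
# power-of-two block size and print it in hex.
#   $1 = address, $2 = block size (must be a power of two)
align_up() {
    printf '0x%x\n' $(( ($1 + $2 - 1) & ~($2 - 1) ))
}

# Values taken from the kernel log: start 0xc8000000, block size 0x40000000.
align_up 0xc8000000 0x40000000
```

With these inputs the next aligned start address is 0x100000000, so a range starting there would pass the block size check.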

Fixes: #1841

Signed-off-by: Yuanzhe Liu <yuanzheliu09@gmail.com>
(cherry picked from commit bc36b7b49f)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:11 +02:00
Eric Ernst
c51891fee7 sandbox-bindmount: persist mount information
Without this, if the shim dies, we will not have a reliable way to
identify what mounts should be cleaned up if `containerd-shim-kata-v2
cleanup` is called for the sandbox.

Before this, if you `ctr run` with a sandbox bindmount defined and SIGKILL the
containerd-shim-kata-v2, you'll notice the sandbox bindmount left on
host.

With this change, the shim is able to get the sandbox bindmount
information from disk and do the appropriate cleanup.

Fixes #1896

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 7f1030d303)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:08 +02:00
Eric Ernst
b137c7ac33 sandbox: Cleanup if failure to setup sandbox-bindmount occurs
If for any reason there's an error when trying to setup the sandbox
bindmounts, make sure we roll back any mounts already created when
setting up the sandbox.

Without this, we'd leave shared directory mount and potentially
sandbox-bindmounts on the host.

Fixes: #1895

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 089a7484e1)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:04 +02:00
Fabiano Fidêncio
a407e53b94 Merge pull request #1911 from devimc/2021-05-21/stable-2.1/updateChannels
[backport-2.1] workflows: release kata 2.x snap through the stable channel
2021-05-22 09:22:10 +02:00
Julio Montes
68a77a7dec workflows: release kata 2.x snap through the stable channel
kata 1.x has been deprecated, so kata 2.x can now be released through
the stable channel

fixes #1909

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-05-21 15:47:50 -05:00
Bin Liu
169cf133c9 Merge pull request #1873 from teawater/vm_doc2.1
[2.1] how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
2021-05-20 17:36:03 +08:00
Hui Zhu
550269ff89 how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
Update this howto because the virtio-mem support of kata, qemu and Linux
was updated.

Fixes: #1845

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-05-18 14:11:43 +08:00
Chelsea Mafrica
d6d16dc597 Merge pull request #1849 from GabyCT/topic/removeprportinglabel
github: Do not run require porting labels on stable-2.1
2021-05-17 13:19:25 -07:00
Gabriela Cervantes
1ea0dc9804 github: Do not run require porting labels on stable-2.1
When we create a PR in stable-2.1 we do not need to run
the GitHub action for porting labels, as we are doing backports or
new releases in stable-2.1 and it is unnecessary to add labels
like no-backport-needed or no-forwardport-needed, etc.

Fixes #1847

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2021-05-14 16:37:44 -05:00
Fabiano Fidêncio
0f82291926 Merge pull request #1856 from fidencio/2.1.0-branch-bump
# Kata Containers 2.1.0
2021-05-14 19:59:06 +02:00
Fabiano Fidêncio
5d3610e25f release: Kata Containers 2.1.0
- stable-2.1 | The last round of backports before releasing 2.1.0
- back port: image_build: align image size to 128M for arm64
- stable-2.1 | runtime: make dialing timeout configurable
- stable-2.1 | agent: avoid reaping the exit signal of execute_hook in the reaper
- stable-2.1 | Get sandbox metrics cli
- packaging/kata-cleanup: add k3s containerd volume
- stable-2.1: First round of backports
- [backport]runtime: use s.ctx instead ctx for checking cancellation
- [2.1.0] kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
- [2.1.0] Fix issue of virtio-mem

9266c246 rustjail: separated the propagation flags from mount flags
7086f91e runtime: sandbox delete should succeed after verifying sandbox state
0a7befa6 docs: Fix spell-check errors found after new text is discovered
eff70d2e docs: Remove horizontal ruler markers that disable spell checks
260f59df image_build: align image size to 128M for arm64
c0bdba23 runtime: make dialing timeout configurable
1b3cf2fb kata-monitor: export get stats for sandbox
59b9e5d0 kata-runtime: add `metrics` command
828a3048 agent: avoid reaping the exit signal of execute_hook in the reaper
d3690952 runtime: shim: dedup client, socket addr code
7f7c794d runtime: Short the shim-monitor path
3f1b7c91 cli: delete tracing code for kata-runtime binary
68cad377 agent: Set fixed NOFILE limit value for kata-agent
7c9067cc docs: add per-Pod Kata configurations for enable_pprof
dba86ef3 ci/install_yq.sh: install_yq: Check version before return
3883e4e2 kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
79831faf runtime: use s.ctx instead ctx for checking cancellation
3212c7ae packaging/kata-cleanup: add k3s containerd volume
7f7c3fc8 qemu.go: qemu: resizeMemory: Fix virtio-mem resize overflow issue
c9053ea3 qemu.go: qemu: setupVirtioMem: let sizeMB be multiple of 2Mib

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-14 16:05:01 +02:00
Fabiano Fidêncio
ed01ac3e0c Merge pull request #1853 from fidencio/wip/last-round-of-backports-for-2.1.0
stable-2.1 | The last round of backports before releasing 2.1.0
2021-05-14 14:35:25 +02:00
fupan.lfp
9266c2460a rustjail: separated the propagation flags from mount flags
Since the propagation flags cannot be combined with the
standard mount flags, and they should be used with a remount,
it's better to split them from the standard mount flags.

Fixes: #1699

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit e5fe572f51)
2021-05-14 09:42:00 +02:00
Peng Tao
7086f91e1f runtime: sandbox delete should succeed after verifying sandbox state
Otherwise we might block delete and create orphan containers.

Fixes: #1039

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 35151f1786)
2021-05-14 09:41:38 +02:00
Christophe de Dinechin
0a7befa645 docs: Fix spell-check errors found after new text is discovered
The spell-checker script has some bugs that caused large chunks of text to not
be spell checked at all (see #1793). The previous commit worked around this bug,
which exposed another bug:

The following source text:

    are discussions about using VM save and restore to
    give [`criu`](https://github.com/checkpoint-restore/criu)-like
    functionality, which might provide a solution

yields the surprising error below:

    WARNING: Word 'givelike': did you mean one of the following?: give like, give-like, wavelike

Apparently, an extra space is removed, which is another issue with the
spell-checking script. This case is somewhat contrived because of the URL link,
so for now, I decided on a creative rewrite, inserting the word "a" knowing
that "alike" is a valid word ;-)

Fixes: #1793

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
(cherry picked from commit 5fdf617e7f)
2021-05-14 09:41:38 +02:00
Christophe de Dinechin
eff70d2eea docs: Remove horizontal ruler markers that disable spell checks
There is a bug in the CI script checking spelling that causes it
to skip any text that follows a horizontal ruler.
(https://github.com/kata-containers/tests/issues/3448)

Solution: replace one horizontal ruler marker with another that
does not trip the spell-checking script.

Fixes: #1793

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
(cherry picked from commit 42425456e7)
2021-05-14 09:38:50 +02:00
Fabiano Fidêncio
dd26aa5838 Merge pull request #1840 from jongwu/stable-2.1_image_align
back port: image_build: align image size to 128M for arm64
2021-05-13 10:37:50 +02:00
Jianyong Wu
260f59df38 image_build: align image size to 128M for arm64
There is an inconsistency between qemu and the kernel in the memory
alignment check for memory hotplug. Both qemu and the kernel check the
alignment of the start address during memory hotplug, but the required
alignment is 2M in qemu and 128M in the kernel. This causes memory
hotplug to fail.

Currently, the kata image is an nvdimm device, which is plugged into the VM as
a dimm. If another dimm is plugged, it will reside on top of that nvdimm.
So the start address of the second dimm may not pass the alignment
check in the kernel if the nvdimm size is not aligned to 128M.
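
To make the failure mode concrete, the sketch below checks where a second dimm would start when it is placed directly after the nvdimm. The base address and image sizes are made-up values for illustration, and next_dimm_aligned is a hypothetical helper, not part of the image builder.

```shell
#!/bin/sh
M=$((1024 * 1024))

# Hypothetical helper: succeed if a dimm placed right after the nvdimm
# starts on an aligned address.
#   $1 = nvdimm start address, $2 = nvdimm (image) size, $3 = alignment
next_dimm_aligned() {
    [ $(( ($1 + $2) % $3 )) -eq 0 ]
}

base=0x100000000   # assumed nvdimm start address, itself 128M aligned

for size_mb in 129 256; do
    if next_dimm_aligned "$base" $((size_mb * M)) $((128 * M)); then
        echo "${size_mb}M image: next dimm start is 128M aligned"
    else
        echo "${size_mb}M image: next dimm start is NOT 128M aligned"
    fi
done
```

Padding the image to a multiple of 128M keeps the address that follows it aligned.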

I think there are 3 ways to address this issue:
1. Fix the alignment size in the kernel to match qemu. I think people
in the Linux kernel community will not accept it.
2. Do the alignment check in qemu and force the start address of the hotplug
to be aligned to 128M, which means there may be holes between memory blocks.
3. Obey the rule on the user side, which means fixing it in kata.

I think the second one is the best, but I can't do that for some reason.
Thus, the last one is the choice here.

Fixes: #1769
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-05-13 10:09:25 +08:00
Chelsea Mafrica
9a32a3e16d Merge pull request #1835 from snir911/backport_configure_timeout
stable-2.1 | runtime: make dialing timeout configurable
2021-05-12 13:14:37 -07:00
Fabiano Fidêncio
123f7d53cb Merge pull request #1830 from Tim-Zhang/fix-reap-for-stable-2.1
stable-2.1 | agent: avoid reaping the exit signal of execute_hook in the reaper
2021-05-12 20:26:42 +02:00
Fabiano Fidêncio
aa213fdc28 Merge pull request #1833 from fidencio/wip/stable-2.1-backport-of-1816
stable-2.1 | Get sandbox metrics cli
2021-05-12 19:34:20 +02:00
Snir Sheriber
c0bdba2350 runtime: make dialing timeout configurable
Allow setting the dialing timeout in configuration.toml.
The default is 30s.

Fixes: #1789
(cherry-picked 01b56d6cbf)
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2021-05-12 14:17:34 +03:00
Eric Ernst
1b3cf2fb7d kata-monitor: export get stats for sandbox
Gathering stats for a given sandbox is pretty useful; let's export a
function from katamonitor pkg to do this.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 3787306107)
2021-05-12 11:44:58 +02:00
Eric Ernst
59b9e5d0f8 kata-runtime: add metrics command
For easier debugging, let's add a subcommand to kata-runtime for gathering
metrics associated with a given sandbox.

kata-runtime metrics --sandbox-id foobar

Fixes: #1815

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 8068a4692f)
2021-05-12 11:44:53 +02:00
Tim Zhang
828a304883 agent: avoid reaping the exit signal of execute_hook in the reaper
Fixes: #1826

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-05-12 16:33:44 +08:00
Fabiano Fidêncio
70734dfa17 Merge pull request #1803 from nubificus/stable-2.1
packaging/kata-cleanup: add k3s containerd volume
2021-05-11 19:38:57 +02:00
Fabiano Fidêncio
f170df6201 Merge pull request #1821 from fidencio/wip/first-round-of-backports
stable-2.1: First round of backports
2021-05-11 08:52:18 +02:00
Eric Ernst
d3690952e6 runtime: shim: dedup client, socket addr code
(1) Add an accessor function, SocketAddress, to the shim-v2 code for
determining the shim's abstract domain socket address, given the sandbox
ID.

(2) In kata monitor, create a function, BuildShimClient, for obtaining the appropriate
http.Client for communicating with the shim's monitoring endpoint.

(3) Update the kata CLI and kata-monitor code to make use of these.

(4) Migrate some kata monitor methods to be functions, in order to ease
future reuse.

(5) drop unused namespace from functions where it is no longer needed.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 3caed6f88d)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:53 +02:00
Fabiano Fidêncio
7f7c794da4 runtime: Short the shim-monitor path
Instead of having something like
"/containerd-shim/$namespace/$sandboxID/shim-monitor.sock", let's change
the approach to:
* create the file in a more neutral location "/run/vc", instead of
  "/containerd-shim";
* drop the namespace, as the sandboxID should be unique;
* remove ".sock" from the socket name.

This will result in a name that looks like:
"/run/vc/$sandboxID/shim-monitor"

Fixes: #497
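
As an illustration of the new layout only (the helper name shim_monitor_path is hypothetical and this is not the runtime's Go code), the path can be built from the sandbox ID alone:

```shell
#!/bin/sh
# Hypothetical helper: build the shim-monitor path for a sandbox.
#   $1 = sandbox ID
shim_monitor_path() {
    printf '/run/vc/%s/shim-monitor\n' "$1"
}

shim_monitor_path foobar
```

No namespace argument is needed, since the sandbox ID is expected to be unique.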

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
(cherry picked from commit 4bc006c8a4)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:47 +02:00
bin
3f1b7c9127 cli: delete tracing code for kata-runtime binary
There are no pod/container operations in the kata-runtime binary,
so tracing in this package is meaningless.

Fixes: #1748

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 13c23fec11)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:36 +02:00
Snir Sheriber
68cad37720 agent: Set fixed NOFILE limit value for kata-agent
Some applications may fail if the NOFILE limit is set to unlimited.
Although in some environments this value is explicitly overridden,
let's set it to a saner value in case it isn't.

Fixes #1715
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
(cherry picked from commit a188577ebf)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:34:28 +02:00
bin
7c9067cc9d docs: add per-Pod Kata configurations for enable_pprof
Now enabling enable_pprof for individual pods is supported,
but not documented.

This commit will add per-Pod Kata configurations for `enable_pprof`
in file `docs/how-to/how-to-set-sandbox-config-kata.md`

Fixes: #1744

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 95e54e3f48)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:34:05 +02:00
Hui Zhu
dba86ef31a ci/install_yq.sh: install_yq: Check version before return
Check the yq version before return.

Fixes: #1776

Signed-off-by: Hui Zhu <teawater@antfin.com>
(cherry picked from commit d8896157df)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:33:33 +02:00
Tim Zhang
0e2df80bda Merge pull request #1814 from liubin/fix/1804-select-sandbox-ctx
[backport]runtime: use s.ctx instead ctx for checking cancellation
2021-05-07 19:43:14 +08:00
Bin Liu
8c4e187049 Merge pull request #1813 from teawater/open_vm
[2.1.0] kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
2021-05-07 19:31:23 +08:00
Fabiano Fidêncio
3bcdc26008 Merge pull request #1812 from teawater/fix_vm
[2.1.0] Fix issue of virtio-mem
2021-05-07 08:14:19 +02:00
Orestis Lagkas Nikolos
3212c7ae00 packaging/kata-cleanup: add k3s containerd volume
kata-deploy cleanup expects to find the containerd configuration
in /etc/containerd/config.toml. In the case of k3s, mount the k3s
containerd config as a volume.

Original PR #1802

Fixes #1801

Signed-off-by: Orestis Lagkas Nikolos <olagkasn@nubificus.co.uk>
2021-05-06 03:36:38 -05:00
2212 changed files with 108279 additions and 201437 deletions

.github/workflows/gather-artifacts.sh vendored Executable file

@@ -0,0 +1,18 @@
#!/bin/bash
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
set -o errexit
set -o pipefail
pushd kata-artifacts >>/dev/null
for c in ./*.tar.gz
do
    echo "untarring tarball $c"
    tar -xvf $c
done
tar cvfJ ../kata-static.tar.xz ./opt
popd >>/dev/null


@@ -0,0 +1,36 @@
#!/bin/bash
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
set -o errexit
set -o pipefail
main() {
    artifact_stage=${1:-}
    artifact=$(echo ${artifact_stage} | sed -n -e 's/^install_//p' | sed -r 's/_/-/g')
    if [ -z "${artifact}" ]; then
        # Print the error instead of trying to execute the message as a command
        echo "Script needs an artifact name to build" >&2
        exit 1
    fi
    tag=$(echo $GITHUB_REF | cut -d/ -f3-)
    export GOPATH=$HOME/go
    go get github.com/kata-containers/packaging || true
    pushd $GOPATH/src/github.com/kata-containers/packaging/release >>/dev/null
    git checkout $tag
    pushd ../obs-packaging
    ./gen_versions_txt.sh $tag
    popd
    source ./kata-deploy-binaries.sh
    ${artifact_stage} $tag
    popd
    mv $HOME/go/src/github.com/kata-containers/packaging/release/kata-static-${artifact}.tar.gz .
}
main "$@"
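
The name handling in the script above can be tried in isolation. The sketch below wraps the two pipelines in hypothetical helper functions (artifact_name and tag_from_ref are not part of the script) and uses `sed -e` in place of `sed -r` for the second substitution, which behaves the same for this pattern.

```shell
#!/bin/sh
# Hypothetical helper: map a stage name to an artifact name.
#   $1 = stage name such as install_experimental_kernel
# Prints nothing when the name lacks the install_ prefix.
artifact_name() {
    echo "$1" | sed -n -e 's/^install_//p' | sed -e 's/_/-/g'
}

# Hypothetical helper: extract the tag or branch from a Git ref.
#   $1 = ref such as refs/tags/2.1.1
tag_from_ref() {
    echo "$1" | cut -d/ -f3-
}

artifact_name install_experimental_kernel
tag_from_ref refs/tags/2.1.1
```

Because of `sed -n` with the `p` flag, a stage name without the `install_` prefix yields an empty string, which is the case the `-z` check in the script guards against.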


@@ -0,0 +1,34 @@
#!/bin/bash
# Copyright (c) 2019 Intel Corporation
# Copyright (c) 2020 Ant Group
#
# SPDX-License-Identifier: Apache-2.0
#
set -o errexit
set -o pipefail
main() {
    artifact_stage=${1:-}
    artifact=$(echo ${artifact_stage} | sed -n -e 's/^install_//p' | sed -r 's/_/-/g')
    if [ -z "${artifact}" ]; then
        # Print the error instead of trying to execute the message as a command
        echo "Script needs an artifact name to build" >&2
        exit 1
    fi
    tag=$(echo $GITHUB_REF | cut -d/ -f3-)
    pushd $GITHUB_WORKSPACE/tools/packaging
    git checkout $tag
    ./scripts/gen_versions_txt.sh $tag
    popd
    pushd $GITHUB_WORKSPACE/tools/packaging/release
    source ./kata-deploy-binaries.sh
    ${artifact_stage} $tag
    popd
    mv $GITHUB_WORKSPACE/tools/packaging/release/kata-static-${artifact}.tar.gz .
}
main "$@"


@@ -1,58 +0,0 @@
name: kata-deploy-build
on: push
jobs:
  build-asset:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        asset:
          - kernel
          - shim-v2
          - qemu
          - cloud-hypervisor
          - firecracker
          - rootfs-image
          - rootfs-initrd
    steps:
      - uses: actions/checkout@v2
      - name: Install docker
        run: |
          curl -fsSL https://test.docker.com -o test-docker.sh
          sh test-docker.sh
      - name: Build ${{ matrix.asset }}
        run: |
          ./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"
          build_dir=$(readlink -f build)
          # store-artifact does not work with symlink
          sudo cp -r --preserve=all "${build_dir}" "kata-build"
        env:
          KATA_ASSET: ${{ matrix.asset }}
      - name: store-artifact ${{ matrix.asset }}
        uses: actions/upload-artifact@v2
        with:
          name: kata-artifacts
          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
          if-no-files-found: error
  create-kata-tarball:
    runs-on: ubuntu-latest
    needs: build-asset
    steps:
      - uses: actions/checkout@v2
      - name: get-artifacts
        uses: actions/download-artifact@v2
        with:
          name: kata-artifacts
          path: kata-artifacts
      - name: merge-artifacts
        run: |
          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
      - name: store-artifacts
        uses: actions/upload-artifact@v2
        with:
          name: kata-static-tarball
          path: kata-static.tar.xz


@@ -46,11 +46,9 @@ jobs:
VERSION="2.0.0"
ARTIFACT_URL="https://github.com/kata-containers/kata-containers/releases/download/${VERSION}/kata-static-${VERSION}-x86_64.tar.xz"
wget "${ARTIFACT_URL}" -O tools/packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:${PR_SHA} -t quay.io/kata-containers/kata-deploy-ci:${PR_SHA} ./tools/packaging/kata-deploy
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:${PR_SHA} ./tools/packaging/kata-deploy
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
docker push katadocker/kata-deploy-ci:$PR_SHA
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/kata-containers/kata-deploy-ci:$PR_SHA
echo "##[set-output name=pr-sha;]${PR_SHA}"
- name: test-kata-deploy-ci-in-aks


@@ -247,11 +247,9 @@ jobs:
pkg_sha=$(git rev-parse HEAD)
popd
mv release-candidate/kata-static.tar.xz ./packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha ./packaging/kata-deploy
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha ./packaging/kata-deploy
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
docker push katadocker/kata-deploy-ci:$pkg_sha
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha
echo "::set-output name=PKG_SHA::${pkg_sha}"
- name: test-kata-deploy-ci-in-aks
uses: ./packaging/kata-deploy/action


@@ -5,45 +5,213 @@ on:
- '2.*'
jobs:
build-asset:
get-artifact-list:
runs-on: ubuntu-latest
strategy:
matrix:
asset:
- cloud-hypervisor
- firecracker
- kernel
- qemu
- rootfs-image
- rootfs-initrd
- shim-v2
steps:
- uses: actions/checkout@v2
- name: Install docker
- name: get the list
run: |
curl -fsSL https://test.docker.com -o test-docker.sh
sh test-docker.sh
pushd $GITHUB_WORKSPACE
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
git checkout $tag
popd
$GITHUB_WORKSPACE/tools/packaging/artifact-list.sh > artifact-list.txt
- name: save-artifact-list
uses: actions/upload-artifact@v2
with:
name: artifact-list
path: artifact-list.txt
- name: Build ${{ matrix.asset }}
build-kernel:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_kernel"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- run: |
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
- name: build-kernel
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"
build_dir=$(readlink -f build)
# store-artifact does not work with symlink
sudo cp -r "${build_dir}" "kata-build"
env:
KATA_ASSET: ${{ matrix.asset }}
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
- name: store-artifact ${{ matrix.asset }}
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
if-no-files-found: error
path: kata-static-kernel.tar.gz
create-kata-tarball:
runs-on: ubuntu-latest
needs: build-asset
build-experimental-kernel:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_experimental_kernel"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- run: |
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
- name: build-experimental-kernel
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-experimental-kernel.tar.gz
build-qemu:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_qemu"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- name: build-qemu
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-qemu.tar.gz
build-image:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_image"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- name: build-image
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-image.tar.gz
build-firecracker:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_firecracker"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- name: build-firecracker
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-firecracker.tar.gz
build-clh:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_clh"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- name: build-clh
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-clh.tar.gz
build-kata-components:
runs-on: ubuntu-16.04
needs: get-artifact-list
env:
buildstr: "install_kata_components"
steps:
- uses: actions/checkout@v2
- name: get-artifact-list
uses: actions/download-artifact@v2
with:
name: artifact-list
- name: build-kata-components
run: |
if grep -q $buildstr artifact-list.txt; then
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
echo "artifact-built=true" >> $GITHUB_ENV
else
echo "artifact-built=false" >> $GITHUB_ENV
fi
- name: store-artifacts
if: ${{ env.artifact-built }} == 'true'
uses: actions/upload-artifact@v2
with:
name: kata-artifacts
path: kata-static-kata-components.tar.gz
gather-artifacts:
runs-on: ubuntu-16.04
needs: [build-experimental-kernel, build-kernel, build-qemu, build-image, build-firecracker, build-kata-components, build-clh]
steps:
- uses: actions/checkout@v2
- name: get-artifacts
@@ -51,24 +219,24 @@ jobs:
with:
name: kata-artifacts
path: kata-artifacts
- name: merge-artifacts
- name: colate-artifacts
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
$GITHUB_WORKSPACE/.github/workflows/gather-artifacts.sh
- name: store-artifacts
uses: actions/upload-artifact@v2
with:
name: kata-static-tarball
name: release-candidate
path: kata-static.tar.xz
kata-deploy:
needs: create-kata-tarball
needs: gather-artifacts
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: get-kata-tarball
- name: get-artifacts
uses: actions/download-artifact@v2
with:
name: kata-static-tarball
name: release-candidate
- name: build-and-push-kata-deploy-ci
id: build-and-push-kata-deploy-ci
run: |
@@ -78,11 +246,9 @@ jobs:
pkg_sha=$(git rev-parse HEAD)
popd
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
docker push katadocker/kata-deploy-ci:$pkg_sha
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha
mkdir -p packaging/kata-deploy
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
echo "::set-output name=PKG_SHA::${pkg_sha}"
@@ -101,9 +267,7 @@ jobs:
# tag the container image we created and push to DockerHub
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag}
docker tag quay.io/kata-containers/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} quay.io/kata-containers/kata-deploy:${tag}
docker push katadocker/kata-deploy:${tag}
docker push quay.io/kata-containers/kata-deploy:${tag}
upload-static-tarball:
needs: kata-deploy
@@ -113,7 +277,7 @@ jobs:
- name: download-artifacts
uses: actions/download-artifact@v2
with:
name: kata-static-tarball
name: release-candidate
- name: install hub
run: |
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')

@@ -6,15 +6,15 @@
name: Ensure PR has required porting labels
on:
pull_request:
branches:
- main
pull_request_target:
types:
- opened
- reopened
- labeled
- unlabeled
pull_request:
branches:
- main
jobs:
check-pr-porting-labels:

@@ -9,8 +9,6 @@ jobs:
steps:
- name: Check out Git repository
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Install Snapcraft
uses: samuelmeuli/action-snapcraft@v1

@@ -6,8 +6,6 @@ jobs:
steps:
- name: Check out
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Install Snapcraft
uses: samuelmeuli/action-snapcraft@v1

@@ -1,19 +1,10 @@
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
- labeled
- unlabeled
on: ["pull_request"]
name: Static checks
jobs:
test:
strategy:
matrix:
go-version: [1.15.x, 1.16.x]
go-version: [1.13.x, 1.14.x, 1.15.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
env:
@@ -22,65 +13,54 @@ jobs:
TRAVIS_PULL_REQUEST_BRANCH: ${{ github.head_ref }}
TRAVIS_PULL_REQUEST_SHA : ${{ github.event.pull_request.head.sha }}
RUST_BACKTRACE: "1"
target_branch: ${{ github.base_ref }}
target_branch: ${TRAVIS_BRANCH}
steps:
- name: Install Go
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/setup-go@v2
with:
go-version: ${{ matrix.go-version }}
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Setup GOPATH
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "TRAVIS_BRANCH: ${TRAVIS_BRANCH}"
echo "TRAVIS_PULL_REQUEST_BRANCH: ${TRAVIS_PULL_REQUEST_BRANCH}"
echo "TRAVIS_PULL_REQUEST_SHA: ${TRAVIS_PULL_REQUEST_SHA}"
echo "TRAVIS: ${TRAVIS}"
- name: Set env
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Checkout code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup travis references
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "TRAVIS_BRANCH=${TRAVIS_BRANCH:-$(echo $GITHUB_REF | awk 'BEGIN { FS = \"/\" } ; { print $3 }')}"
target_branch=${TRAVIS_BRANCH}
- name: Setup
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Building rust
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh
PATH=$PATH:"$HOME/.cargo/bin"
rustup target add x86_64-unknown-linux-musl
rustup component add rustfmt clippy
# Check whether the vendored code is up-to-date & working as the first thing
- name: Check vendored code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
# Must build before static checks as we depend on some generated code in runtime and agent
- name: Build
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor
cd ${GOPATH}/src/github.com/${{ github.repository }} && make
- name: Static Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/static-checks.sh
- name: Run Compiler Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make check
- name: Run Unit Tests
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make test

@@ -1,94 +0,0 @@
# Glossary
[A](#a), [B](#b), [C](#c), [D](#d), [E](#e), [F](#f), [G](#g), [H](#h), [I](#i), [J](#j), [K](#k), [L](#l), [M](#m), [N](#n), [O](#o), [P](#p), [Q](#q), [R](#r), [S](#s), [T](#t), [U](#u), [V](#v), [W](#w), [X](#x), [Y](#y), [Z](#z)
## A
### Auto Scaling
a method used in cloud computing, whereby the amount of computational resources in a server farm, typically measured in terms of the number of active servers, which vary automatically based on the load on the farm.
## B
## C
### Container Security Solutions
The process of implementing security tools and policies that will give you the assurance that everything in your container is running as intended, and only as intended.
### Container Software
A standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
### Container Runtime Interface
A plugin interface which enables Kubelet to use a wide variety of container runtimes, without the need to recompile.
### Container Virtualization
A container is a virtual runtime environment that runs on top of a single operating system (OS) kernel and emulates an operating system rather than the underlying hardware.
## D
## E
## F
## G
## H
## I
### Infrastructure Architecture
A structured and modern approach for supporting an organization and facilitating innovation within an enterprise.
## J
## K
### Kata Containers
Kata containers is an open source project delivering increased container security and Workload isolation through an implementation of lightweight virtual machines.
## L
## M
## N
## O
## P
### Pod Containers
A Group of one or more containers , with shared storage/network, and a specification for how to run the containers.
### Private Cloud
A computing model that offers a proprietary environment dedicated to a single business entity.
### Public Cloud
Computing services offered by third-party providers over the public Internet, making them available to anyone who wants to use or purchase them.
## Q
## R
## S
### Serverless Containers
An architecture in which code is executed on-demand. Serverless workloads are typically in the cloud, but on-premises serverless platforms exist, too.
## T
## U
## V
### Virtual Machine Monitor
Computer software, firmware or hardware that creates and runs virtual machines.
### Virtual Machine Software
A software program or operating system that not only exhibits the behavior of a separate computer, but is also capable of performing tasks such as running applications and programs like a separate computer.
## W
## X
## Y
## Z

@@ -15,7 +15,7 @@ TOOLS =
TOOLS += agent-ctl
STANDARD_TARGETS = build check clean install test vendor
STANDARD_TARGETS = build check clean install test
include utils.mk
@@ -29,14 +29,4 @@ $(eval $(call create_all_rules,$(COMPONENTS),$(TOOLS),$(STANDARD_TARGETS)))
generate-protocols:
make -C src/agent generate-protocols
# Some static checks rely on generated source files of components.
static-checks: build
bash ci/static-checks.sh
binary-tarball:
make -f ./tools/packaging/kata-deploy/local-build/Makefile
install-binary-tarball:
make -f ./tools/packaging/kata-deploy/local-build/Makefile install
.PHONY: all default static-checks binary-tarball install-binary-tarball
.PHONY: all default

@@ -2,6 +2,22 @@
# Kata Containers
* [Kata Containers](#kata-containers)
* [Introduction](#introduction)
* [Getting started](#getting-started)
* [Documentation](#documentation)
* [Community](#community)
* [Getting help](#getting-help)
* [Raising issues](#raising-issues)
* [Kata Containers 1.x versions](#kata-containers-1x-versions)
* [Developers](#developers)
* [Components](#components)
* [Kata Containers 1.x components](#kata-containers-1x-components)
* [Common repositories](#common-repositories)
* [Packaging and releases](#packaging-and-releases)
---
Welcome to Kata Containers!
This repository is the home of the Kata Containers code for the 2.0 and newer
@@ -10,6 +26,11 @@ releases.
If you want to learn about Kata Containers, visit the main
[Kata Containers website](https://katacontainers.io).
For further details on the older (first generation) Kata Containers 1.x
versions, see the
[Kata Containers 1.x components](#kata-containers-1x-components)
section.
## Introduction
Kata Containers is an open source project and community working to build a
@@ -46,34 +67,69 @@ Please raise an issue
> **Note:**
> If you are reporting a security issue, please follow the [vulnerability reporting process](https://github.com/kata-containers/community#vulnerability-handling)
#### Kata Containers 1.x versions
For older Kata Containers 1.x releases, please raise an issue in the
[Kata Containers 1.x component repository](#kata-containers-1x-components)
that seems most appropriate.
If in doubt, raise an issue
[in the Kata Containers 1.x runtime repository](https://github.com/kata-containers/runtime/issues).
## Developers
### Components
### Main components
The table below lists the core parts of the project:
| Component | Type | Description |
|-|-|-|
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [agent-ctl](tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images for the hypervisor. |
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [trace-forwarder](src/trace-forwarder) | utility | Agent tracing helper. |
### Additional components
#### Kata Containers 1.x components
The table below lists the remaining parts of the project:
For the first generation of Kata Containers (1.x versions), each component was
kept in a separate repository.
For information on the Kata Containers 1.x releases, see the
[Kata Containers 1.x releases page](https://github.com/kata-containers/runtime/releases).
For further information on particular Kata Containers 1.x components, see the
individual component repositories:
| Component | Type | Description |
|-|-|-|
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [`agent-ctl`](tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`trace-forwarder`](src/trace-forwarder) | utility | Agent tracing helper. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
| [agent](https://github.com/kata-containers/agent) | core | See [components](#components). |
| [documentation](https://github.com/kata-containers/documentation) | documentation | |
| [KSM throttler](https://github.com/kata-containers/ksm-throttler) | optional core | Daemon that monitors containers and deduplicates memory to maximize container density on the host. |
| [osbuilder](https://github.com/kata-containers/osbuilder) | infrastructure | See [components](#components). |
| [packaging](https://github.com/kata-containers/packaging) | infrastructure | See [components](#components). |
| [proxy](https://github.com/kata-containers/proxy) | core | Multiplexes communications between the shims, agent and runtime. |
| [runtime](https://github.com/kata-containers/runtime) | core | See [components](#components). |
| [shim](https://github.com/kata-containers/shim) | core | Handles standard I/O and signals on behalf of the container process. |
> **Note:**
>
> - There are more components for the original Kata Containers 1.x implementation.
> - The current implementation simplifies the design significantly:
> compare the [current](docs/design/architecture.md) and
> [previous generation](https://github.com/kata-containers/documentation/blob/master/design/architecture.md)
> designs.
### Common repositories
The following repositories are used by both the current and first generation Kata Containers implementations:
| Component | Description | Current | First generation | Notes |
|-|-|-|-|-|
| CI | Continuous Integration configuration files and scripts. | [Kata 2.x](https://github.com/kata-containers/ci/tree/main) | [Kata 1.x](https://github.com/kata-containers/ci/tree/master) | |
| kernel | The Linux kernel used by the hypervisor to boot the guest image. | [Kata 2.x][kernel] | [Kata 1.x][kernel] | Patches are stored in the packaging component. |
| tests | Test code. | [Kata 2.x](https://github.com/kata-containers/tests/tree/main) | [Kata 1.x](https://github.com/kata-containers/tests/tree/master) | Excludes unit tests which live with the main code. |
| www.katacontainers.io | Contains the source for the [main web site](https://www.katacontainers.io). | [Kata 2.x][github-katacontainers.io] | [Kata 1.x][github-katacontainers.io] | | |
### Packaging and releases
@@ -82,9 +138,6 @@ Kata Containers is now
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
the [components](#components) section for further details.
## Glossary of Terms
See the [glossary of terms](Glossary.md) related to Kata Containers.
---
[kernel]: https://www.kernel.org

@@ -1 +1 @@
2.3.0-alpha0
2.1.1

@@ -15,18 +15,12 @@ die() {
# Install the yq yaml query package from the mikefarah github repo
# Install via binary download, as we may not have golang installed at this point
function install_yq() {
GOPATH=${GOPATH:-${HOME}/go}
local yq_path="${GOPATH}/bin/yq"
local yq_pkg="github.com/mikefarah/yq"
local yq_version=3.4.1
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
if [ "${INSTALL_IN_GOPATH}" == "true" ];then
GOPATH=${GOPATH:-${HOME}/go}
mkdir -p "${GOPATH}/bin"
local yq_path="${GOPATH}/bin/yq"
else
yq_path="/usr/local/bin/yq"
fi
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return
[ -x "${GOPATH}/bin/yq" ] && [ "`${GOPATH}/bin/yq --version`"X == "yq version ${yq_version}"X ] && return
read -r -a sysInfo <<< "$(uname -sm)"
@@ -57,6 +51,7 @@ function install_yq() {
;;
esac
mkdir -p "${GOPATH}/bin"
# Check curl
if ! command -v "curl" >/dev/null; then

@@ -3,11 +3,9 @@
#
# SPDX-License-Identifier: Apache-2.0
set -o nounset
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
export tests_repo_dir="$GOPATH/src/$tests_repo"
export branch="${target_branch:-main}"
export branch="${branch:-main}"
# Clones the tests repository and checkout to the branch pointed out by
# the global $branch variable.
@@ -17,7 +15,7 @@ export branch="${target_branch:-main}"
clone_tests_repo()
{
if [ -d "$tests_repo_dir" ]; then
[ -n "${CI:-}" ] && return
[ -n "$CI" ] && return
pushd "${tests_repo_dir}"
git checkout "${branch}"
git pull

@@ -4,6 +4,6 @@
#
# This is the build root image for Kata Containers on OpenShift CI.
#
FROM registry.centos.org/centos:8
FROM centos:8
RUN yum -y update && yum -y install git sudo wget

@@ -13,6 +13,4 @@ clone_tests_repo
pushd ${tests_repo_dir}
.ci/run.sh
# temporary fix, see https://github.com/kata-containers/tests/issues/3878
[ "$(uname -m)" != "s390x" ] && tracing/test-agent-shutdown.sh
popd

@@ -1,3 +1,55 @@
- [Warning](#warning)
- [Assumptions](#assumptions)
- [Initial setup](#initial-setup)
- [Requirements to build individual components](#requirements-to-build-individual-components)
- [Build and install the Kata Containers runtime](#build-and-install-the-kata-containers-runtime)
- [Check hardware requirements](#check-hardware-requirements)
- [Configure to use initrd or rootfs image](#configure-to-use-initrd-or-rootfs-image)
- [Enable full debug](#enable-full-debug)
- [debug logs and shimv2](#debug-logs-and-shimv2)
- [Enabling full `containerd` debug](#enabling-full-containerd-debug)
- [Enabling just `containerd shim` debug](#enabling-just-containerd-shim-debug)
- [Enabling `CRI-O` and `shimv2` debug](#enabling-cri-o-and-shimv2-debug)
- [journald rate limiting](#journald-rate-limiting)
- [`systemd-journald` suppressing messages](#systemd-journald-suppressing-messages)
- [Disabling `systemd-journald` rate limiting](#disabling-systemd-journald-rate-limiting)
- [Create and install rootfs and initrd image](#create-and-install-rootfs-and-initrd-image)
- [Build a custom Kata agent - OPTIONAL](#build-a-custom-kata-agent---optional)
- [Get the osbuilder](#get-the-osbuilder)
- [Create a rootfs image](#create-a-rootfs-image)
- [Create a local rootfs](#create-a-local-rootfs)
- [Add a custom agent to the image - OPTIONAL](#add-a-custom-agent-to-the-image---optional)
- [Build a rootfs image](#build-a-rootfs-image)
- [Install the rootfs image](#install-the-rootfs-image)
- [Create an initrd image - OPTIONAL](#create-an-initrd-image---optional)
- [Create a local rootfs for initrd image](#create-a-local-rootfs-for-initrd-image)
- [Build an initrd image](#build-an-initrd-image)
- [Install the initrd image](#install-the-initrd-image)
- [Install guest kernel images](#install-guest-kernel-images)
- [Install a hypervisor](#install-a-hypervisor)
- [Build a custom QEMU](#build-a-custom-qemu)
- [Build a custom QEMU for aarch64/arm64 - REQUIRED](#build-a-custom-qemu-for-aarch64arm64---required)
- [Run Kata Containers with Containerd](#run-kata-containers-with-containerd)
- [Run Kata Containers with Kubernetes](#run-kata-containers-with-kubernetes)
- [Troubleshoot Kata Containers](#troubleshoot-kata-containers)
- [Appendices](#appendices)
- [Checking Docker default runtime](#checking-docker-default-runtime)
- [Set up a debug console](#set-up-a-debug-console)
- [Simple debug console setup](#simple-debug-console-setup)
- [Enable agent debug console](#enable-agent-debug-console)
- [Connect to debug console](#connect-to-debug-console)
- [Traditional debug console setup](#traditional-debug-console-setup)
- [Create a custom image containing a shell](#create-a-custom-image-containing-a-shell)
- [Build the debug image](#build-the-debug-image)
- [Configure runtime for custom debug image](#configure-runtime-for-custom-debug-image)
- [Create a container](#create-a-container)
- [Connect to the virtual machine using the debug console](#connect-to-the-virtual-machine-using-the-debug-console)
- [Enabling debug console for QEMU](#enabling-debug-console-for-qemu)
- [Enabling debug console for cloud-hypervisor / firecracker](#enabling-debug-console-for-cloud-hypervisor--firecracker)
- [Connecting to the debug console](#connecting-to-the-debug-console)
- [Obtain details of the image](#obtain-details-of-the-image)
- [Capturing kernel boot logs](#capturing-kernel-boot-logs)
# Warning
This document is written **specifically for developers**: it is not intended for end users.
@@ -252,7 +304,7 @@ You MUST choose one of `alpine`, `centos`, `clearlinux`, `debian`, `euleros`, `f
> - You should only do this step if you are testing with the latest version of the agent.
```
$ sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/usr/bin ../../../src/agent/target/x86_64-unknown-linux-musl/release/kata-agent
$ sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/bin ../../../src/agent/target/x86_64-unknown-linux-musl/release/kata-agent
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-agent.service ${ROOTFS_DIR}/usr/lib/systemd/system/
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-containers.target ${ROOTFS_DIR}/usr/lib/systemd/system/
```
@@ -301,13 +353,12 @@ You MUST choose one of `alpine`, `centos`, `clearlinux`, `euleros`, and `fedora`
>
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
Optionally, add your custom agent binary to the rootfs with the following commands. The default `$LIBC` used
is `musl`, but on ppc64le and s390x, `gnu` should be used. Also, Rust refers to ppc64le as `powerpc64le`:
Optionally, add your custom agent binary to the rootfs with the following, `LIBC` default is `musl`, if `ARCH` is `ppc64le`, should set the `LIBC=gnu` and `ARCH=powerpc64le`:
```
$ export ARCH=$(uname -m)
$ [ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
$ export ARCH=$(shell uname -m)
$ [ ${ARCH} == "ppc64le" ] && export LIBC=gnu || export LIBC=musl
$ [ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
$ sudo install -o root -g root -m 0550 -T ../../../src/agent/target/${ARCH}-unknown-linux-${LIBC}/release/kata-agent ${ROOTFS_DIR}/sbin/init
$ sudo install -o root -g root -m 0550 -T ../../../src/agent/target/$(ARCH)-unknown-linux-$(LIBC)/release/kata-agent ${ROOTFS_DIR}/sbin/init
```
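As a rough sketch of the selection logic in the commands above, assuming the ppc64le and s390x rules stated there, the name of the agent's build target directory can be computed with a small helper. The function name `agent_target_triple` is hypothetical and is not part of the Kata Containers tooling.

```shell
# Hypothetical helper (not part of Kata Containers): map the output of
# `uname -m` to the Rust target directory name used for the agent build.
agent_target_triple() {
    arch="$1"
    libc="musl"                      # default libc
    case "$arch" in
        ppc64le)
            arch="powerpc64le"       # Rust's name for ppc64le
            libc="gnu"
            ;;
        s390x)
            libc="gnu"
            ;;
    esac
    echo "${arch}-unknown-linux-${libc}"
}

# Print the target directory name for the current machine.
agent_target_triple "$(uname -m)"
```

The printed value is the directory expected under `src/agent/target/` when locating the `kata-agent` binary.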
### Build an initrd image
@@ -342,40 +393,14 @@ You may choose to manually build your VMM/hypervisor.
Kata Containers makes use of upstream QEMU branch. The exact version
and repository utilized can be found by looking at the [versions file](../versions.yaml).
Find the correct version of QEMU from the versions file:
```
$ source ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/scripts/lib.sh
$ qemu_version=$(get_from_kata_deps "assets.hypervisor.qemu.version")
$ echo ${qemu_version}
```
Get source from the matching branch of QEMU:
```
$ go get -d github.com/qemu/qemu
$ cd ${GOPATH}/src/github.com/qemu/qemu
$ git checkout ${qemu_version}
$ your_qemu_directory=${GOPATH}/src/github.com/qemu/qemu
```
There are scripts to manage the build and packaging of QEMU. For the examples below, set your
environment as:
```
$ go get -d github.com/kata-containers/kata-containers
$ packaging_dir="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging"
```
Kata often utilizes patches for not-yet-upstream and/or backported fixes for components,
including QEMU. These can be found in the [packaging/QEMU directory](../tools/packaging/qemu/patches),
and it's *recommended* that you apply them. For example, suppose that you are going to build QEMU
version 5.2.0, do:
```
$ cd $your_qemu_directory
$ $packaging_dir/scripts/apply_patches.sh $packaging_dir/qemu/patches/5.2.x/
```
Kata often utilizes patches for not-yet-upstream fixes for components,
including QEMU. These can be found in the [packaging/QEMU directory](../tools/packaging/qemu/patches)
To build utilizing the same options as Kata, you should make use of the `configure-hypervisor.sh` script. For example:
```
$ go get -d github.com/kata-containers/kata-containers/tools/packaging
$ cd $your_qemu_directory
$ $packaging_dir/scripts/configure-hypervisor.sh kata-qemu > kata.cfg
$ ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/scripts/configure-hypervisor.sh kata-qemu > kata.cfg
$ eval ./configure "$(cat kata.cfg)"
$ make -j $(nproc)
$ sudo -E make install
@@ -417,7 +442,7 @@ script and paste its output directly into a
> [runtime](../src/runtime) repository.
To perform analysis on Kata logs, use the
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/master/cmd/log-parser)
tool, which can convert the logs into formats (e.g. JSON, TOML, XML, and YAML).
See [Set up a debug console](#set-up-a-debug-console).
@@ -450,16 +475,6 @@ debug_console_enabled = true
This will pass `agent.debug_console agent.debug_console_vport=1026` to agent as kernel parameters, and sandboxes created using this parameters will start a shell in guest if new connection is accept from VSOCK.
#### Start `kata-monitor` - ONLY NEEDED FOR 2.0.x
For Kata Containers `2.0.x` releases, the `kata-runtime exec` command depends on the`kata-monitor` running, in order to get the sandbox's `vsock` address to connect to. Thus, first start the `kata-monitor` process.
```
$ sudo kata-monitor
```
`kata-monitor` will serve at `localhost:8090` by default.
#### Connect to debug console
Command `kata-runtime exec` is used to connect to the debug console.
@@ -604,7 +619,7 @@ VMM solution.
In case of cloud-hypervisor, connect to the `vsock` as shown:
```
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id}/root/ && socat stdin unix-connect:clh.sock'
$ sudo su -c 'cd /var/run/vc/vm/{sandbox_id}/root/ && socat stdin unix-connect:clh.sock'
CONNECT 1026
```
@@ -612,7 +627,7 @@ CONNECT 1026
For firecracker, connect to the `hvsock` as shown:
```
$ sudo su -c 'cd /var/run/vc/firecracker/${sandbox_id}/root/ && socat stdin unix-connect:kata.hvsock'
$ sudo su -c 'cd /var/run/vc/firecracker/{sandbox_id}/root/ && socat stdin unix-connect:kata.hvsock'
CONNECT 1026
```
@@ -621,7 +636,7 @@ CONNECT 1026
For QEMU, connect to the `vsock` as shown:
```
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id} && socat "stdin,raw,echo=0,escape=0x11" "unix-connect:console.sock"'
$ sudo su -c 'cd /var/run/vc/vm/{sandbox_id} && socat "stdin,raw,echo=0,escape=0x11" "unix-connect:console.sock"
```
To disconnect from the virtual machine, type `CONTROL+q` (hold down the

@@ -1,3 +1,16 @@
* [Introduction](#introduction)
* [General requirements](#general-requirements)
* [Linking advice](#linking-advice)
* [Notes](#notes)
* [Warnings and other admonitions](#warnings-and-other-admonitions)
* [Files and command names](#files-and-command-names)
* [Code blocks](#code-blocks)
* [Images](#images)
* [Spelling](#spelling)
* [Names](#names)
* [Version numbers](#version-numbers)
* [The apostrophe](#the-apostrophe)
# Introduction
This document outlines the requirements for all documentation in the [Kata
@@ -10,6 +23,10 @@ All documents must:
- Be written in simple English.
- Be written in [GitHub Flavored Markdown](https://github.github.com/gfm) format.
- Have a `.md` file extension.
- Include a TOC (table of contents) at the top of the document with links to
all heading sections. We recommend using the
[`kata-check-markdown`](https://github.com/kata-containers/tests/tree/master/cmd/check-markdown)
tool to generate the TOC.
- Be linked to from another document in the same repository.
Although GitHub allows navigation of the entire repository, it should be
@@ -26,10 +43,6 @@ All documents must:
which can then execute the commands specified to ensure the instructions are
correct. This avoids documents becoming out of date over time.
> **Note:**
>
> Do not add a table of contents (TOC) since GitHub will auto-generate one.
# Linking advice
Linking between documents is strongly encouraged to help users and developers
@@ -105,7 +118,7 @@ This section lists requirements for displaying commands and command output.
The requirements must be adhered to since documentation containing code blocks
is validated by the CI system, which executes the command blocks with the help
of the
[doc-to-script](https://github.com/kata-containers/tests/tree/main/.ci/kata-doc-to-script.sh)
[doc-to-script](https://github.com/kata-containers/tests/tree/master/.ci/kata-doc-to-script.sh)
utility.
- If a document includes commands the user should run, they **MUST** be shown
@@ -189,7 +202,7 @@ and compare them with standard tools (e.g. `diff(1)`).
Since this project uses a number of terms not found in conventional
dictionaries, we have a
[spell checking tool](https://github.com/kata-containers/tests/tree/main/cmd/check-spelling)
[spell checking tool](https://github.com/kata-containers/tests/tree/master/cmd/check-spelling)
that checks both dictionary words and the additional terms we use.
Run the spell checking tool on your document before raising a PR to ensure it

@@ -1,5 +1,9 @@
# Licensing strategy
* [Project License](#project-license)
* [License file](#license-file)
* [License for individual files](#license-for-individual-files)
## Project License
The license for the [Kata Containers](https://github.com/kata-containers)

@@ -1,3 +1,35 @@
* [Overview](#overview)
* [Definition of a limitation](#definition-of-a-limitation)
* [Scope](#scope)
* [Contributing](#contributing)
* [Pending items](#pending-items)
* [Runtime commands](#runtime-commands)
* [checkpoint and restore](#checkpoint-and-restore)
* [events command](#events-command)
* [update command](#update-command)
* [Networking](#networking)
* [Docker swarm and compose support](#docker-swarm-and-compose-support)
* [Resource management](#resource-management)
* [docker run and shared memory](#docker-run-and-shared-memory)
* [docker run and sysctl](#docker-run-and-sysctl)
* [Docker daemon features](#docker-daemon-features)
* [SELinux support](#selinux-support)
* [Architectural limitations](#architectural-limitations)
* [Networking limitations](#networking-limitations)
* [Support for joining an existing VM network](#support-for-joining-an-existing-vm-network)
* [docker --net=host](#docker---nethost)
* [docker run --link](#docker-run---link)
* [Storage limitations](#storage-limitations)
* [Kubernetes `volumeMounts.subPaths`](#kubernetes-volumemountssubpaths)
* [Host resource sharing](#host-resource-sharing)
* [docker run --privileged](#docker-run---privileged)
* [Miscellaneous](#miscellaneous)
* [Docker --security-opt option partially supported](#docker---security-opt-option-partially-supported)
* [Appendices](#appendices)
* [The constraints challenge](#the-constraints-challenge)
***
# Overview
A [Kata Container](https://github.com/kata-containers) utilizes a Virtual Machine (VM) to enhance security and

@@ -1,5 +1,16 @@
# Documentation
* [Getting Started](#getting-started)
* [More User Guides](#more-user-guides)
* [Kata Use-Cases](#kata-use-cases)
* [Developer Guide](#developer-guide)
* [Design and Implementations](#design-and-implementations)
* [How to Contribute](#how-to-contribute)
* [Code Licensing](#code-licensing)
* [The Release Process](#the-release-process)
* [Help Improving the Documents](#help-improving-the-documents)
* [Website Changes](#website-changes)
The [Kata Containers](https://github.com/kata-containers)
documentation repository hosts overall system documentation, with information
common to multiple components.

@@ -1,6 +1,20 @@
# How to do a Kata Containers Release
This document lists the tasks required to create a Kata Release.
<!-- TOC START min:1 max:3 link:true asterisk:false update:true -->
- [How to do a Kata Containers Release](#how-to-do-a-kata-containers-release)
- [Requirements](#requirements)
- [Release Process](#release-process)
- [Bump all Kata repositories](#bump-all-kata-repositories)
- [Merge all bump version Pull requests](#merge-all-bump-version-pull-requests)
- [Tag all Kata repositories](#tag-all-kata-repositories)
- [Check Git-hub Actions](#check-git-hub-actions)
- [Create release notes](#create-release-notes)
- [Announce the release](#announce-the-release)
<!-- TOC END -->
## Requirements
- [hub](https://github.com/github/hub)
@@ -15,7 +29,6 @@
## Release Process
### Bump all Kata repositories
Bump the repositories using a script in the Kata packaging repo, where:
@@ -28,23 +41,6 @@
$ ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH"
```
### Point tests repository to stable branch
If you create a new stable branch, i.e. if your release changes a major or minor version number (not a patch release), then
you should modify the `tests` repository to point to that newly created stable branch and not the `main` branch.
The objective is that changes in the CI on the main branch will not impact the stable branch.
In the test directory, change references to the main branch in:
* `README.md`
* `versions.yaml`
* `cmd/github-labels/labels.yaml.in`
* `cmd/pmemctl/pmemctl.sh`
* `.ci/lib.sh`
* `.ci/static-checks.sh`
See the commits in [the corresponding PR for stable-2.1](https://github.com/kata-containers/tests/pull/3504) for an example of the changes.
### Merge all bump version Pull requests
- The above step will create a GitHub pull request in the Kata projects. Trigger the CI using the `/test` command on each bump Pull request.
@@ -54,7 +50,7 @@
### Tag all Kata repositories
Once all the pull requests to bump versions in all Kata repositories are merged,
tag all the repositories as shown below.
tag all the repositories as shown below.
```
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
$ git checkout <kata-branch-to-release>

@@ -32,16 +32,16 @@ provides additional information regarding release `99.123.77` in the previous ex
changing the existing behavior*.
- When `MAJOR` increases, the new release adds **new features, bug fixes, or
both** and **changes the behavior from the previous release** (incompatible with previous releases).
both** and *changes the behavior from the previous release* (incompatible with previous releases).
A major release will also likely require a change of the container manager version used,
for example Containerd or CRI-O. Please refer to the release notes for further details.
for example Docker\*. Please refer to the release notes for further details.
## Release Strategy
Any new features added since the last release will be available in the next minor
release. These will include bug fixes as well. To facilitate a stable user environment,
Kata provides stable branch-based releases and a main branch release.
Kata provides stable branch-based releases and a master branch release.
## Stable branch patch criteria
@@ -49,10 +49,9 @@ No new features should be introduced to stable branches. This is intended to li
providing only bug and security fixes.
## Branch Management
Kata Containers will maintain **one** stable release branch, in addition to the main branch, for
each active major release.
Once a new MAJOR or MINOR release is created from main, a new stable branch is created for
the prior MAJOR or MINOR release and the previous stable branch is no longer maintained. End of
Kata Containers will maintain two stable release branches in addition to the master branch.
Once a new MAJOR or MINOR release is created from master, a new stable branch is created for
the prior MAJOR or MINOR release and the older stable branch is no longer maintained. End of
maintenance for a branch is announced on the Kata Containers mailing list. Users can determine
the version currently installed by running `kata-runtime kata-env`. It is recommended to use the
latest stable branch available.
@@ -62,59 +61,59 @@ A couple of examples follow to help clarify this process.
### New bug fix introduced
A bug fix is submitted against the runtime which does not introduce new inter-component dependencies.
This fix is applied to both the main and stable branches, and there is no need to create a new
This fix is applied to both the master and stable branches, and there is no need to create a new
stable branch.
| Branch | Original version | New version |
|--|--|--|
| `main` | `2.3.0-rc0` | `2.3.0-rc1` |
| `stable-2.2` | `2.2.0` | `2.2.1` |
| `stable-2.1` | (unmaintained) | (unmaintained) |
| `master` | `1.3.0-rc0` | `1.3.0-rc1` |
| `stable-1.2` | `1.2.0` | `1.2.1` |
| `stable-1.1` | `1.1.2` | `1.1.3` |
### New release made feature or change adding new inter-component dependency
A new feature is introduced, which adds a new inter-component dependency. In this case a new stable
branch is created (stable-2.3) starting from main and the previous stable branch (stable-2.2)
branch is created (stable-1.3) starting from master and the older stable branch (stable-1.1)
is dropped from maintenance.
| Branch | Original version | New version |
|--|--|--|
| `main` | `2.3.0-rc1` | `2.3.0` |
| `stable-2.3` | N/A| `2.3.0` |
| `stable-2.2` | `2.2.1` | (unmaintained) |
| `stable-2.1` | (unmaintained) | (unmaintained) |
| `master` | `1.3.0-rc1` | `1.3.0` |
| `stable-1.3` | N/A| `1.3.0` |
| `stable-1.2` | `1.2.1` | `1.2.2` |
| `stable-1.1` | `1.1.3` | (unmaintained) |
Note, the stable-2.2 branch will still exist with tag 2.2.1, but under current plans it is
not maintained further. The next tag applied to main will be 2.4.0-alpha0. We would then
Note, the stable-1.1 branch will still exist with tag 1.1.3, but under current plans it is
not maintained further. The next tag applied to master will be 1.4.0-alpha0. We would then
create a couple of alpha releases gathering features targeted for that particular release (in
this case 2.4.0), followed by a release candidate. The release candidate marks a feature freeze.
this case 1.4.0), followed by a release candidate. The release candidate marks a feature freeze.
A new stable branch is created for the release candidate. Only bug fixes and any security issues
are added to the branch going forward until release 2.4.0 is made.
are added to the branch going forward until release 1.4.0 is made.
## Backporting Process
Development that occurs against the main branch and applicable code commits should also be submitted
Development that occurs against the master branch and applicable code commits should also be submitted
against the stable branches. Some guidelines for this process follow:
1. Only bug and security fixes which do not introduce inter-component dependencies are
candidates for stable branches. These PRs should be marked with "bug" in GitHub.
2. Once a PR is created against main which meets requirement of (1), a comparable one
2. Once a PR is created against master which meets requirement of (1), a comparable one
should also be submitted against the stable branches. It is the responsibility of the submitter
to apply their pull request against stable, and it is the responsibility of the
reviewers to help identify stable-candidate pull requests.
## Continuous Integration Testing
The test repository is forked to create stable branches from main. Full CI
runs on each stable and main PR using its respective tests repository branch.
The test repository is forked to create stable branches from master. Full CI
runs on each stable and master PR using its respective tests repository branch.
### An alternative method for CI testing:
Ideally, the continuous integration infrastructure will run the same test suite on both main
Ideally, the continuous integration infrastructure will run the same test suite on both master
and the stable branches. When tests are modified or new feature tests are introduced, explicit
logic should exist within the testing CI to make sure only applicable tests are executed against
stable and main. While this is not in place currently, it should be considered in the long term.
stable and master. While this is not in place currently, it should be considered in the long term.
## Release Management
@@ -122,7 +121,7 @@ stable and main. While this is not in place currently, it should be considered i
Releases are made every three weeks, which include a GitHub release as
well as binary packages. These patch releases are made for both stable branches, and a "release candidate"
for the next `MAJOR` or `MINOR` is created from main. If there are no changes across all the repositories, no
for the next `MAJOR` or `MINOR` is created from master. If there are no changes across all the repositories, no
release is created and an announcement is made on the developer mailing list to highlight this.
If a release is being made, each repository is tagged for this release, regardless
of whether changes are introduced. The release schedule can be seen on the
@@ -143,10 +142,10 @@ maturity, we have increased the cadence from six weeks to twelve weeks. The rele
### Compatibility
Kata guarantees compatibility between components that are within one minor release of each other.
This is critical for dependencies which cross between host (shimv2 runtime) and
This is critical for dependencies which cross between host (runtime, shim, proxy) and
the guest (hypervisor, rootfs and agent). For example, consider a cluster with a long-running
deployment, workload-never-dies, all on Kata version 2.1.3 components. If the operator updates
the Kata components to the next new minor release (i.e. 2.2.0), we need to guarantee that the 2.2.0
shimv2 runtime still communicates with 2.1.3 agent within workload-never-dies.
deployment, workload-never-dies, all on Kata version 1.1.3 components. If the operator updates
the Kata components to the next new minor release (i.e. 1.2.0), we need to guarantee that the 1.2.0
runtime still communicates with 1.1.3 agent within workload-never-dies.
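The "within one minor release" rule can be illustrated with a small check. This is only a sketch: the function names are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` strings and that compatible versions must share the same `MAJOR` number, which the text does not state explicitly.

```python
def parse_version(version: str) -> tuple:
    """Split a plain 'MAJOR.MINOR.PATCH' string into three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def within_one_minor(a: str, b: str) -> bool:
    """True if a and b share MAJOR and their MINOR numbers differ by at most one.

    Assumption for illustration: versions with a different MAJOR are
    treated as incompatible.
    """
    a_major, a_minor, _ = parse_version(a)
    b_major, b_minor, _ = parse_version(b)
    return a_major == b_major and abs(a_minor - b_minor) <= 1
```

With this sketch, `within_one_minor("2.2.0", "2.1.3")` holds, while `within_one_minor("2.3.0", "2.1.3")` does not. Pre-release tags such as `2.3.0-rc0` are not handled and would raise a `ValueError`.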
Handling live-update is out of the scope of this document. See this [`kata-runtime` issue](https://github.com/kata-containers/runtime/issues/492) for details.

@@ -1,3 +1,16 @@
* [Introduction](#introduction)
* [Maintenance warning](#maintenance-warning)
* [Determine current version](#determine-current-version)
* [Determine latest version](#determine-latest-version)
* [Configuration changes](#configuration-changes)
* [Upgrade Kata Containers](#upgrade-kata-containers)
* [Upgrade native distribution packaged version](#upgrade-native-distribution-packaged-version)
* [Static installation](#static-installation)
* [Determine if you are using a static installation](#determine-if-you-are-using-a-static-installation)
* [Remove a static installation](#remove-a-static-installation)
* [Upgrade a static installation](#upgrade-a-static-installation)
* [Custom assets](#custom-assets)
# Introduction
This document outlines the options for upgrading from a

@@ -8,9 +8,4 @@ Kata Containers design documents:
- [VSocks](VSocks.md)
- [VCPU handling](vcpu-handling.md)
- [Host cgroups](host-cgroups.md)
- [`Inotify` support](inotify.md)
- [Metrics(Kata 2.0)](kata-2-0-metrics.md)
---
- [Design proposals](proposals)

@@ -1,5 +1,12 @@
# Kata Containers and VSOCKs
- [Introduction](#introduction)
- [VSOCK communication diagram](#vsock-communication-diagram)
- [System requirements](#system-requirements)
- [Advantages of using VSOCKs](#advantages-of-using-vsocks)
- [High density](#high-density)
- [Reliability](#reliability)
## Introduction
There are two different ways processes in the virtual machine can communicate

Binary file not shown (previous size: 101 KiB).

@@ -1,5 +1,26 @@
# Kata Containers Architecture
- [Kata Containers Architecture](#kata-containers-architecture)
- [Overview](#overview)
- [Virtualization](#virtualization)
- [Guest assets](#guest-assets)
- [Guest kernel](#guest-kernel)
- [Guest image](#guest-image)
- [Root filesystem image](#root-filesystem-image)
- [Initrd image](#initrd-image)
- [Agent](#agent)
- [Runtime](#runtime)
- [Configuration](#configuration)
- [Networking](#networking)
- [Network Hotplug](#network-hotplug)
- [Storage](#storage)
- [Kubernetes support](#kubernetes-support)
- [OCI annotations](#oci-annotations)
- [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
- [Appendices](#appendices)
- [DAX](#dax)
## Overview
This is an architectural overview of Kata Containers, based on the 2.0 release.

@@ -1,3 +1,4 @@
# Kata Containers E2E Flow
![Kata containers e2e flow](arch-images/katacontainers-e2e-with-bg.jpg)

@@ -1,3 +1,18 @@
- [Host cgroup management](#host-cgroup-management)
- [Introduction](#introduction)
- [`SandboxCgroupOnly` enabled](#sandboxcgrouponly-enabled)
- [What does Kata do in this configuration?](#what-does-kata-do-in-this-configuration)
- [Why create a Kata-cgroup under the parent cgroup?](#why-create-a-kata-cgroup-under-the-parent-cgroup)
- [Improvements](#improvements)
- [`SandboxCgroupOnly` disabled (default, legacy)](#sandboxcgrouponly-disabled-default-legacy)
- [What does this method do?](#what-does-this-method-do)
- [Impact](#impact)
- [Supported cgroups](#supported-cgroups)
- [Cgroups V1](#cgroups-v1)
- [Cgroups V2](#cgroups-v2)
- [Distro Support](#distro-support)
- [Summary](#summary)
# Host cgroup management
## Introduction

@@ -1,30 +0,0 @@
# Kata Containers support for `inotify`
## Background on `inotify` usage
A common pattern in Kubernetes is to watch for changes to files/directories passed in as `ConfigMaps`
or `Secrets`. Sidecars normally use `inotify` to watch for changes and then signal the primary container to reload
the updated configuration. Kata Containers typically will pass these host files into the guest using `virtiofs`, which
does not support `inotify` today. While we work to enable this use case in `virtiofs`, we introduced a workaround in Kata Containers.
This document describes how Kata Containers implements this workaround.
### Detecting a `watchable` mount
Kubernetes creates `secrets` and `ConfigMap` mounts at very specific locations on the host filesystem. For container mounts,
the `Kata Containers` runtime will check the source of the mount to identify these special cases. For these use cases, only a single file
or very few would typically need to be watched. To avoid excessive overheads in making a mount watchable,
we enforce a limit of eight files per mount. If a `secret` or `ConfigMap` mount contains more than 8 files, it will not be
considered watchable. We similarly enforce a limit of 1 MB per mount to be considered watchable. Non-watchable mounts will
continue to propagate changes from the mount on the host to the container workload, but these updates will not trigger an
`inotify` event.
If at any point a mount grows beyond the eight-file or 1 MB limit, it will no longer be `watchable`.
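The two limits described above can be sketched as a single predicate. The names below are hypothetical and are not the runtime's actual identifiers; the sketch also assumes "1 MB" means 1 MiB and covers only the size limits, not the check of the mount source.

```python
from typing import Iterable

# Limits taken from the text above. The constant and function names are
# hypothetical, and "1 MB" is assumed to mean 1 MiB.
MAX_WATCHABLE_FILES = 8
MAX_WATCHABLE_BYTES = 1024 * 1024


def is_watchable(file_sizes: Iterable[int]) -> bool:
    """Return True if a mount with these file sizes (in bytes) is within both limits."""
    sizes = list(file_sizes)
    return len(sizes) <= MAX_WATCHABLE_FILES and sum(sizes) <= MAX_WATCHABLE_BYTES
```

Because the check depends only on the current contents, re-evaluating it on each pass would also cover the case where a mount grows past a limit later on.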
### Presenting a `watchable` mount to the workload
For mounts that are considered `watchable`, inside the guest, the `kata-agent` will poll the mount presented from
the host through `virtiofs` and copy any changed files to a `tmpfs` mount that is presented to the container. In this way,
for `watchable` mounts, Kata will do the polling on behalf of the workload and existing workloads needn't change their usage
of `inotify`.
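One polling pass of this kind might look like the sketch below. It is not the `kata-agent` implementation (the agent is not written in Python); the function name, the `seen` dictionary, and the use of modification time plus size to detect changes are all assumptions made for illustration.

```python
import os
import shutil


def sync_changed_files(src_dir: str, dst_dir: str, seen: dict) -> list:
    """Copy files from src_dir to dst_dir when their (mtime_ns, size) changed.

    `seen` maps file name -> (mtime_ns, size) from the previous pass and is
    updated in place. Returns the names copied in this pass.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # this sketch ignores subdirectories
        st = os.stat(src)
        signature = (st.st_mtime_ns, st.st_size)
        if seen.get(name) != signature:
            shutil.copy2(src, os.path.join(dst_dir, name))
            seen[name] = signature
            copied.append(name)
    return copied
```

Here `src_dir` stands in for the `virtiofs` mount and `dst_dir` for the `tmpfs` mount. Writes into `dst_dir` happen on a local filesystem, so `inotify` watchers in the workload would fire as usual. The sketch does not handle deleted files.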
![drawing](arch-images/inotify-workaround.png)

@@ -1,5 +1,20 @@
# Kata 2.0 Metrics Design
* [Limitations of Kata 1.x and the target of Kata 2.0](#limitations-of-kata-1x-and-the-target-of-kata-20)
* [Metrics architecture](#metrics-architecture)
* [Kata monitor](#kata-monitor)
* [Kata runtime](#kata-runtime)
* [Kata agent](#kata-agent)
* [Performance and overhead](#performance-and-overhead)
* [Metrics list](#metrics-list)
* [Metric types](#metric-types)
* [Kata agent metrics](#kata-agent-metrics)
* [Firecracker metrics](#firecracker-metrics)
* [Kata guest OS metrics](#kata-guest-os-metrics)
* [Hypervisor metrics](#hypervisor-metrics)
* [Kata monitor metrics](#kata-monitor-metrics)
* [Kata containerd shim v2 metrics](#kata-containerd-shim-v2-metrics)
Kata implements the CRI API and supports the [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose container metrics. Users can use these interfaces to get basic metrics about containers.
But unlike `runc`, Kata is a VM-based runtime and has a different architecture.

@@ -1,5 +1,4 @@
# Kata API Design
To fulfill the [Kata design requirements](kata-design-requirements.md), and based on the discussion on [Virtcontainers API extensions](https://docs.google.com/presentation/d/1dbGrD1h9cpuqAPooiEgtiwWDGCYhVPdatq7owsKHDEQ), the Kata runtime library features the following APIs:
- Sandbox based top API
- Storage and network hotplug API

@@ -1,5 +0,0 @@
# Design proposals
Kata Containers design proposal documents:
- [Kata Containers tracing](tracing-proposals.md)

@@ -1,213 +0,0 @@
# Kata Tracing proposals
## Overview
This document summarises a set of proposals triggered by the
[tracing documentation PR][tracing-doc-pr].
## Required context
This section explains some terminology required to understand the proposals.
Further details can be found in the
[tracing documentation PR][tracing-doc-pr].
### Agent trace mode terminology
| Trace mode | Description | Use-case |
|-|-|-|
| Static | Trace agent from startup to shutdown | Entire lifespan |
| Dynamic | Toggle tracing on/off as desired | On-demand "snapshot" |
### Agent trace type terminology
| Trace type | Description | Use-case |
|-|-|-|
| isolated | traces all relate to single component | Observing lifespan |
| collated | traces "grouped" (runtime+agent) | Understanding component interaction |
### Container lifespan
| Lifespan | trace mode | trace type |
|-|-|-|
| short-lived | static | collated if possible, else isolated? |
| long-running | dynamic | collated? (to see interactions) |
## Original plan for agent
- Implement all trace types and trace modes for agent.
- Why?
- Maximum flexibility.
> **Counterargument:**
>
> Due to the intrusive nature of adding tracing, we have
> learnt that landing small incremental changes is simpler and quicker!
- Compatibility with [Kata 1.x tracing][kata-1x-tracing].
> **Counterargument:**
>
> Agent tracing in Kata 1.x was extremely awkward to set up (to the extent
> that it's unclear how many users actually used it!)
>
> This point, coupled with the new architecture for Kata 2.x, suggests
> that we may not need to supply the same set of tracing features (in fact
> they may not make sense).
## Agent tracing proposals
### Agent tracing proposal 1: Don't implement dynamic trace mode
- All tracing will be static.
- Why?
- Because dynamic tracing will always be "partial"
> In fact, not only would it be only a "snapshot" of activity, it may not
> even be possible to create a complete "trace transaction". If this is
> true, the trace output would be partial and would appear "unstructured".
### Agent tracing proposal 2: Simplify handling of trace type
- Agent tracing will be "isolated" by default.
- Agent tracing will be "collated" if runtime tracing is also enabled.
- Why?
- Offers a graceful fallback for agent tracing if runtime tracing is disabled.
- Simpler code!
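Proposal 2 reduces the choice of trace type to one condition. A minimal sketch of that rule, using a hypothetical function name that is not part of the agent's code:

```python
def agent_trace_type(runtime_tracing_enabled: bool) -> str:
    """Pick the agent trace type as proposal 2 describes.

    "isolated" is the default; "collated" is used only when runtime
    tracing is also enabled.
    """
    return "collated" if runtime_tracing_enabled else "isolated"
```

The fallback is the `else` branch: with runtime tracing off, the agent still produces usable "isolated" traces.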
## Questions to ask yourself (part 1)
- Are your containers long-running or short-lived?
- Would you ever need to turn on tracing "briefly"?
- If "yes", is a "partial trace" useful or useless?
> Likely to be considered useless as it is a partial snapshot.
> Alternative tracing methods may be more appropriate than dynamic
> OpenTelemetry tracing.
## Questions to ask yourself (part 2)
- Are you happy to stop a container to enable tracing?
If "no", dynamic tracing may be required.
- Would you ever want to trace the agent and the runtime "in isolation" at the
same time?
- If "yes", we need to fully implement `trace_mode=isolated`
> This seems unlikely though.
## Trace collection
The second set of proposals affect the way traces are collected.
### Motivation
Currently:
- The runtime sends trace spans to Jaeger directly.
- The agent will send trace spans to the [`trace-forwarder`][trace-forwarder] component.
- The trace forwarder will send trace spans to Jaeger.
Kata agent tracing overview:
```
+-------------------------------------------+
| Host |
| |
| +-----------+ |
| | Trace | |
| | Collector | |
| +-----+-----+ |
| ^ +--------------+ |
| | spans | Kata VM | |
| +-----+-----+ | | |
| | Kata | spans | +-----+ | |
| | Trace |<-----------------|Kata | | |
| | Forwarder | VSOCK | |Agent| | |
| +-----------+ Channel | +-----+ | |
| +--------------+ |
+-------------------------------------------+
```
Currently:
- If agent tracing is enabled but the trace forwarder is not running,
the agent will error.
- If the trace forwarder is started but Jaeger is not running,
the trace forwarder will error.
### Goals
- The runtime and agent should:
- Use the same trace collection implementation.
- Share as many configuration items as possible.
- Kata should support more trace collection software or `SaaS`
(for example `Zipkin`, `datadog`).
- Trace collection should not block normal runtime/agent operations
(for example if `vsock-exporter`/Jaeger is not running, Kata Containers should work normally).
### Trace collection proposals
#### Trace collection proposal 1: Send all spans to the trace forwarder as a span proxy
The Kata runtime and agent both send spans to the trace forwarder, and the trace forwarder,
acting as a tracing proxy, sends all spans to a tracing back-end, such as Jaeger or `datadog`.
**Pros:**
- Runtime/agent will be simple.
- Could update trace collection target while Kata Containers are running.
**Cons:**
- Requires the trace forwarder component to be running (which adds an operational burden).
#### Trace collection proposal 2: Send spans to collector directly from runtime/agent
Send spans to the collector directly from the runtime/agent. This proposal requires
the collector to be reachable over the network.
**Pros:**
- No additional trace forwarder component needed.
**Cons:**
- Need more code/configuration to support all trace collectors.
## Future work
- We could add dynamic and fully isolated tracing at a later stage,
if required.
## Further details
- See the new [GitHub project](https://github.com/orgs/kata-containers/projects/28).
- [kata-containers-tracing-status](https://gist.github.com/jodh-intel/0ee54d41d2a803ba761e166136b42277) gist.
- [tracing documentation PR][tracing-doc-pr].
## Summary
### Time line
- 2021-07-01: A summary of the discussion was
[posted to the mail list](http://lists.katacontainers.io/pipermail/kata-dev/2021-July/001996.html).
- 2021-06-22: These proposals were
[discussed in the Kata Architecture Committee meeting](https://etherpad.opendev.org/p/Kata_Containers_2021_Architecture_Committee_Mtgs).
- 2021-06-18: These proposals were
[announced on the mailing list](http://lists.katacontainers.io/pipermail/kata-dev/2021-June/001980.html).
### Outcome
- Nobody opposed the agent proposals, so they are being implemented.
- The trace collection proposals are still being considered.
[kata-1x-tracing]: https://github.com/kata-containers/agent/blob/master/TRACING.md
[trace-forwarder]: /src/trace-forwarder
[tracing-doc-pr]: https://github.com/kata-containers/kata-containers/pull/1937

@@ -1,3 +1,11 @@
- [Virtual machine vCPU sizing in Kata Containers](#virtual-machine-vcpu-sizing-in-kata-containers)
* [Default number of virtual CPUs](#default-number-of-virtual-cpus)
* [Virtual CPUs and Kubernetes pods](#virtual-cpus-and-kubernetes-pods)
* [Container lifecycle](#container-lifecycle)
* [Container without CPU constraint](#container-without-cpu-constraint)
* [Container with CPU constraint](#container-with-cpu-constraint)
* [Do not waste resources](#do-not-waste-resources)
# Virtual machine vCPU sizing in Kata Containers
## Default number of virtual CPUs

@@ -1,5 +1,16 @@
# Virtualization in Kata Containers
- [Virtualization in Kata Containers](#virtualization-in-kata-containers)
- [Mapping container concepts to virtual machine technologies](#mapping-container-concepts-to-virtual-machine-technologies)
- [Kata Containers Hypervisor and VMM support](#kata-containers-hypervisor-and-vmm-support)
- [QEMU/KVM](#qemukvm)
- [Machine accelerators](#machine-accelerators)
- [Hotplug devices](#hotplug-devices)
- [Firecracker/KVM](#firecrackerkvm)
- [Cloud Hypervisor/KVM](#cloud-hypervisorkvm)
- [Summary](#summary)
In Kata Containers, a second layer of isolation is created on top of those provided by traditional namespace-containers. The
hardware virtualization interface is the basis of this additional layer. Kata will launch a lightweight virtual machine,
and use the guest's Linux kernel to create a container workload, or workloads in the case of multi-container pods. In Kubernetes

@@ -1,7 +1,11 @@
# Howto Guides
## Kubernetes Integration
* [Howto Guides](#howto-guides)
* [Kubernetes Integration](#kubernetes-integration)
* [Hypervisors Integration](#hypervisors-integration)
* [Advanced Topics](#advanced-topics)
## Kubernetes Integration
- [Run Kata containers with `crictl`](run-kata-with-crictl.md)
- [Run Kata Containers with Kubernetes](run-kata-with-k8s.md)
- [How to use Kata Containers and Containerd](containerd-kata.md)
@@ -17,13 +21,13 @@
- `firecracker`
- `ACRN`
While `qemu`, `cloud-hypervisor` and `firecracker` work out of the box with installation of Kata,
some additional configuration is needed in case of `ACRN`.
While `qemu` and `cloud-hypervisor` work out of the box with installation of Kata,
some additional configuration is needed in case of `firecracker` and `ACRN`.
Refer to the following guides for additional configuration steps:
- [Kata Containers with Firecracker](https://github.com/kata-containers/documentation/wiki/Initial-release-of-Kata-Containers-with-Firecracker-support)
- [Kata Containers with ACRN Hypervisor](how-to-use-kata-containers-with-acrn.md)
## Advanced Topics
- [How to use Kata Containers with virtio-fs](how-to-use-virtio-fs-with-kata.md)
- [Setting Sysctls with Kata](how-to-use-sysctls-with-kata.md)
- [What Is VMCache and How To Enable It](what-is-vm-cache-and-how-do-I-use-it.md)
@@ -33,4 +37,3 @@
- [How to use Kata Containers with `virtio-mem`](how-to-use-virtio-mem-with-kata.md)
- [How to set sandbox Kata Containers configurations with pod annotations](how-to-set-sandbox-config-kata.md)
- [How to monitor Kata Containers in K8s](how-to-set-prometheus-in-k8s.md)
- [How to use hotplug memory on arm64 in Kata Containers](how-to-hotplug-memory-arm64.md)

@@ -1,5 +1,23 @@
# How to use Kata Containers and Containerd
- [Concepts](#concepts)
- [Kubernetes `RuntimeClass`](#kubernetes-runtimeclass)
- [Containerd Runtime V2 API: Shim V2 API](#containerd-runtime-v2-api-shim-v2-api)
- [Install](#install)
- [Install Kata Containers](#install-kata-containers)
- [Install containerd with CRI plugin](#install-containerd-with-cri-plugin)
- [Install CNI plugins](#install-cni-plugins)
- [Install `cri-tools`](#install-cri-tools)
- [Configuration](#configuration)
- [Configure containerd to use Kata Containers](#configure-containerd-to-use-kata-containers)
- [Kata Containers as a `RuntimeClass`](#kata-containers-as-a-runtimeclass)
- [Kata Containers as the runtime for untrusted workload](#kata-containers-as-the-runtime-for-untrusted-workload)
- [Kata Containers as the default runtime](#kata-containers-as-the-default-runtime)
- [Configuration for `cri-tools`](#configuration-for-cri-tools)
- [Run](#run)
- [Launch containers with `ctr` command line](#launch-containers-with-ctr-command-line)
- [Launch Pods with `crictl` command line](#launch-pods-with-crictl-command-line)
This document covers the installation and configuration of [containerd](https://containerd.io/)
and [Kata Containers](https://katacontainers.io). The containerd provides not only the `ctr`
command line tool, but also the [CRI](https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/)

@@ -26,7 +26,7 @@ spec:
hostNetwork: true
containers:
- name: kata-monitor
image: quay.io/kata-containers/kata-monitor:2.0.0
image: docker.io/katadocker/kata-monitor:2.0.0
args:
- -log-level=debug
ports:

@@ -1,28 +0,0 @@
# How to use memory hotplug feature in Kata Containers on arm64
## Introduction
Memory hotplug is a key feature for containers to allocate memory dynamically in deployment.
Because Kata Containers is VM-based, this feature needs support from both the VMM and the guest kernel. It is fully supported by the current default versions of QEMU and the guest kernel used by Kata on arm64. For other VMMs, e.g. Cloud Hypervisor, the enablement work is in progress. Apart from the VMM and guest kernel, memory hotplug also depends on ACPI, which in turn depends on firmware. On x86, you can boot a VM using QEMU with ACPI enabled directly, because it boots with firmware implicitly. On arm64, however, you need to specify the firmware explicitly. In other words, if you can already run a normal Kata Container on arm64, the only extra step is to install the UEFI ROM before using the memory hotplug feature.
## Install UEFI ROM
We provide a helper script to install the UEFI ROM. If you have installed Kata normally on your host, you just need to run the script as follows:
```bash
$ pushd $GOPATH/src/github.com/kata-containers/tests
$ sudo .ci/aarch64/install_rom_aarch64.sh
$ popd
```
## Run for test
Let's test whether memory hotplug is ready for Kata after installing the UEFI ROM. Make sure containerd is ready to run Kata before testing.
```bash
$ sudo ctr image pull docker.io/library/ubuntu:latest
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --rm docker.io/library/ubuntu:latest hello sh -c "free -h"
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --memory-limit 536870912 --rm docker.io/library/ubuntu:latest hello sh -c "free -h"
```
Compare the results of the two tests. If the latter reports 0.5G more memory than the former, memory hotplug is working. Congratulations!
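The comparison can also be done numerically rather than by eye. The following is a minimal sketch, not part of the Kata tooling: the helper names `mem_total_mib` and `hotplug_delta_mib` are hypothetical, and it assumes you capture `free -m` output (instead of `free -h`) from each of the two containers.

```bash
# mem_total_mib: read `free -m` output on stdin and print the "total"
# column of the "Mem:" row (a value in MiB).
mem_total_mib() {
  awk '$1 == "Mem:" { print $2 }'
}

# hotplug_delta_mib BASE LIMITED: print how many MiB the second total
# exceeds the first by.
hotplug_delta_mib() {
  echo $(( $2 - $1 ))
}

# Hypothetical usage, assuming the two `ctr run` commands above are rerun
# with `free -m` and without `-t`, so that the output can be piped:
#   base=$(sudo ctr run ... hello sh -c "free -m" | mem_total_mib)
#   limited=$(sudo ctr run ... --memory-limit 536870912 ... hello sh -c "free -m" | mem_total_mib)
#   hotplug_delta_mib "$base" "$limited"
```

A `--memory-limit` of 536870912 bytes is 512 MiB, so a difference close to 512 indicates that the extra memory was hotplugged into the guest.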


@@ -1,5 +1,20 @@
# Importing Kata Containers logs with Fluentd
* [Introduction](#introduction)
* [Overview](#overview)
* [Test stack](#test-stack)
* [Importing the logs](#importing-the-logs)
* [Direct import `logfmt` from `systemd`](#direct-import-logfmt-from-systemd)
* [Configuring `minikube`](#configuring-minikube)
* [Pull from `systemd`](#pull-from-systemd)
* [Systemd Summary](#systemd-summary)
* [Directly importing JSON](#directly-importing-json)
* [JSON in files](#json-in-files)
* [Prefixing all keys](#prefixing-all-keys)
* [Kata `shimv2`](#kata-shimv2)
* [Caveats](#caveats)
* [Summary](#summary)
# Introduction
This document describes how to import Kata Containers logs into [Fluentd](https://www.fluentd.org/),
@@ -128,7 +143,7 @@ YAML can be found
tag kata-containers
path /run/log/journal
pos_file /run/log/journal/kata-journald.pos
filters [{"SYSLOG_IDENTIFIER": "kata-runtime"}, {"SYSLOG_IDENTIFIER": "kata-shim"}]
filters [{"SYSLOG_IDENTIFIER": "kata-runtime"}, {"SYSLOG_IDENTIFIER": "kata-proxy"}, {"SYSLOG_IDENTIFIER": "kata-shim"}]
read_from_head true
</source>
```
@@ -146,7 +161,7 @@ generate some Kata specific log entries:
```bash
$ minikube addons open efk
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kata-deploy
$ cd $GOPATH/src/github.com/kata-containers/packaging/kata-deploy
$ kubectl apply -f examples/nginx-deployment-qemu.yaml
```
@@ -163,7 +178,7 @@ sub-filter on, for instance, the `SYSLOG_IDENTIFIER` to differentiate the Kata c
on the `PRIORITY` to filter out critical issues etc.
Kata generates a significant amount of Kata specific information, which can be seen as
[`logfmt`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser#logfile-requirements).
[`logfmt`](https://github.com/kata-containers/tests/tree/master/cmd/log-parser#logfile-requirements).
data contained in the `MESSAGE` field. Imported as-is, there is no easy way to filter on that data
in Kibana:
@@ -257,8 +272,9 @@ go directly to a full Kata specific JSON format logfile test.
Kata runtime has the ability to generate JSON logs directly, rather than its default `logfmt` format. Passing
the `--log-format=json` argument to the Kata runtime enables this. The easiest way to pass in this extra
parameter from a [Kata deploy](https://github.com/kata-containers/kata-containers/tree/main/tools/packaging/kata-deploy) installation
is to edit the `/opt/kata/bin/kata-qemu` shell script.
parameter from a [Kata deploy](https://github.com/kata-containers/packaging/tree/master/kata-deploy) installation
is to edit the `/opt/kata/bin/kata-qemu` shell script (generated by the
[Kata packaging release scripts](https://github.com/kata-containers/packaging/blob/master/release/kata-deploy-binaries.sh)).
At the same time, we will add the `--log=/var/log/kata-runtime.log` argument to store the Kata logs in their
own file (rather than into the system journal).


@@ -2,6 +2,14 @@
This document describes how to run `kata-monitor` in a Kubernetes cluster using Prometheus's service discovery to scrape metrics from `kata-agent`.
- [Introduction](#introduction)
- [Pre-requisites](#pre-requisites)
- [Configure Prometheus](#configure-prometheus)
- [Configure `kata-monitor`](#configure-kata-monitor)
- [Setup Grafana](#setup-grafana)
* [Create `datasource`](#create-datasource)
* [Import dashboard](#import-dashboard)
> **Warning**: This how-to is for evaluation purposes only; you **SHOULD NOT** run it in production using this configuration.
## Introduction


@@ -79,7 +79,7 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.kernel` | string | the kernel used to boot the container VM |
| `io.katacontainers.config.hypervisor.machine_accelerators` | string | machine specific accelerators for the hypervisor |
| `io.katacontainers.config.hypervisor.machine_type` | string | the type of machine being emulated by the hypervisor |
| `io.katacontainers.config.hypervisor.memory_offset` | uint64| the memory space used for `nvdimm` device by the hypervisor |
| `io.katacontainers.config.hypervisor.memory_offset` | uint32| the memory space used for `nvdimm` device by the hypervisor |
| `io.katacontainers.config.hypervisor.memory_slots` | uint32| the memory slots assigned to the VM by the hypervisor |
| `io.katacontainers.config.hypervisor.msize_9p` | uint32 | the `msize` for 9p shares |
| `io.katacontainers.config.hypervisor.path` | string | the hypervisor that will run the container VM |


@@ -1,9 +1,22 @@
# How to use Kata Containers and CRI (containerd plugin) with Kubernetes
* [Requirements](#requirements)
* [Install and configure containerd](#install-and-configure-containerd)
* [Install and configure Kubernetes](#install-and-configure-kubernetes)
* [Install Kubernetes](#install-kubernetes)
* [Configure Kubelet to use containerd](#configure-kubelet-to-use-containerd)
* [Configure HTTP proxy - OPTIONAL](#configure-http-proxy---optional)
* [Start Kubernetes](#start-kubernetes)
* [Configure Pod Network](#configure-pod-network)
* [Allow pods to run in the master node](#allow-pods-to-run-in-the-master-node)
* [Create runtime class for Kata Containers](#create-runtime-class-for-kata-containers)
* [Run pod in Kata Containers](#run-pod-in-kata-containers)
* [Delete created pod](#delete-created-pod)
This document describes how to set up a single-machine Kubernetes (k8s) cluster.
The Kubernetes cluster will use the
[CRI containerd plugin](https://github.com/containerd/containerd/tree/main/pkg/cri) and
[CRI containerd plugin](https://github.com/containerd/cri) and
[Kata Containers](https://katacontainers.io) to launch untrusted workloads.
## Requirements


@@ -2,6 +2,11 @@
This document provides an overview on how to run Kata containers with ACRN hypervisor and device model.
- [Introduction](#introduction)
- [Pre-requisites](#pre-requisites)
- [Configure Docker](#configure-docker)
- [Configure Kata Containers with ACRN](#configure-kata-containers-with-acrn)
## Introduction
ACRN is a flexible, lightweight Type-1 reference hypervisor built with real-time and safety-criticality in mind. ACRN uses an open source platform making it optimized to streamline embedded development.


@@ -1,7 +1,6 @@
# Setting Sysctls with Kata
## Sysctls
In Linux, the sysctl interface allows an administrator to modify kernel
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
process file system.
@@ -17,10 +16,11 @@ To get a complete list of kernel parameters, run:
$ sudo sysctl -a
```
Kubernetes provides mechanisms for setting namespaced sysctls.
Namespaced sysctls can be set per pod in the case of Kubernetes.
Both Docker and Kubernetes provide mechanisms for setting namespaced sysctls.
Namespaced sysctls can be set per pod in the case of Kubernetes or per container
in case of Docker.
The following sysctls are known to be namespaced and can be set with
Kubernetes:
Docker and Kubernetes:
- `kernel.shm*`
- `kernel.msg*`
@@ -30,10 +30,31 @@ Kubernetes:
### Namespaced Sysctls:
Kata Containers supports setting namespaced sysctls with Kubernetes.
Kata Containers supports setting namespaced sysctls with Docker and Kubernetes.
All namespaced sysctls can be set in the same way as regular Linux based
containers, the difference being, in the case of Kata they are set inside the guest.
#### Setting Namespaced Sysctls with Docker:
```
$ sudo docker run --runtime=kata-runtime -it alpine cat /proc/sys/fs/mqueue/queues_max
256
$ sudo docker run --runtime=kata-runtime --sysctl fs.mqueue.queues_max=512 -it alpine cat /proc/sys/fs/mqueue/queues_max
512
```
... and:
```
$ sudo docker run --runtime=kata-runtime -it alpine cat /proc/sys/kernel/shmmax
18446744073692774399
$ sudo docker run --runtime=kata-runtime --sysctl kernel.shmmax=1024 -it alpine cat /proc/sys/kernel/shmmax
1024
```
For additional documentation on setting sysctls with Docker please refer to [Docker-sysctl-doc](https://docs.docker.com/engine/reference/commandline/run/#configure-namespaced-kernel-parameters-sysctls-at-runtime).
#### Setting Namespaced Sysctls with Kubernetes:
Kubernetes considers certain sysctls as safe and others as unsafe. For detailed
@@ -79,7 +100,7 @@ spec:
### Non-Namespaced Sysctls:
Kubernetes disallows sysctls without a namespace.
Docker and Kubernetes disallow sysctls without a namespace.
The recommendation is to set them directly on the host or use a privileged
container in the case of Kubernetes.
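Before passing a sysctl to a container, it can help to check whether it is namespaced at all. The following is a minimal sketch with a hypothetical helper name, `is_namespaced_sysctl`. Only `kernel.shm*` and `kernel.msg*` appear in the excerpt above; the remaining patterns (`kernel.sem`, `fs.mqueue.*`, `net.*`) are taken from the Kubernetes sysctl documentation and should be treated as an assumption to verify against your cluster version.

```bash
# is_namespaced_sysctl NAME: succeed if NAME matches one of the sysctl
# patterns commonly documented as namespaced, fail otherwise.
is_namespaced_sysctl() {
  case "$1" in
    kernel.shm*|kernel.msg*|kernel.sem|fs.mqueue.*|net.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: decide where a sysctl should be set.
for name in kernel.shmmax fs.mqueue.queues_max vm.overcommit_memory; do
  if is_namespaced_sysctl "$name"; then
    echo "$name: namespaced, can be set per pod or per container"
  else
    echo "$name: not namespaced, set it on the host or use a privileged container"
  fi
done
```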


@@ -1,9 +1,12 @@
# Kata Containers with virtio-fs
- [Kata Containers with virtio-fs](#kata-containers-with-virtio-fs)
- [Introduction](#introduction)
## Introduction
Container deployments utilize explicit or implicit file sharing between host filesystem and containers. From a trust perspective, avoiding a shared file-system between the trusted host and untrusted container is recommended. This is not always feasible. In Kata Containers, block-based volumes are preferred as they allow usage of either device pass through or `virtio-blk` for access within the virtual machine.
As of the 2.0 release of Kata Containers, [virtio-fs](https://virtio-fs.gitlab.io/) is the default filesystem sharing mechanism.
virtio-fs support works out of the box for `cloud-hypervisor` and `qemu`, when Kata Containers is deployed using `kata-deploy`. Learn more about `kata-deploy` and how to use `kata-deploy` in Kubernetes [here](https://github.com/kata-containers/kata-containers/tree/main/tools/packaging/kata-deploy#kubernetes-quick-start).
virtio-fs support works out of the box for `cloud-hypervisor` and `qemu`, when Kata Containers is deployed using `kata-deploy`. Learn more about `kata-deploy` and how to use `kata-deploy` in Kubernetes [here](https://github.com/kata-containers/packaging/tree/master/kata-deploy#kubernetes-quick-start).


@@ -1,5 +1,9 @@
# Kata Containers with `virtio-mem`
- [Introduction](#introduction)
- [Requisites](#requisites)
- [Run a Kata Container utilizing `virtio-mem`](#run-a-kata-container-utilizing-virtio-mem)
## Introduction
The basic idea of `virtio-mem` is to provide a flexible, cross-architecture memory hot plug and hot unplug solution that avoids many limitations imposed by existing technologies, architectures, and interfaces.
@@ -37,7 +41,7 @@ $ echo 1 | sudo tee /proc/sys/vm/overcommit_memory
Use the following commands to start a Kata Container.
```
$ pod_yaml=pod.yaml
$ container_yaml=container.yaml
$ container_yaml=${REPORT_DIR}/container.yaml
$ image="quay.io/prometheus/busybox:latest"
$ cat << EOF > "${pod_yaml}"
metadata:


@@ -3,6 +3,11 @@
Kata Containers supports creation of containers that are "privileged" (i.e. have additional capabilities and access
that is not normally granted).
* [Warnings](#warnings)
* [Host Devices](#host-devices)
* [Containerd and CRI](#containerd-and-cri)
* [CRI-O](#cri-o)
## Warnings
**Warning:** Whilst this functionality is supported, it can decrease the security of Kata Containers if not configured


@@ -1,5 +1,16 @@
# Working with `crictl`
* [What's `cri-tools`](#whats-cri-tools)
* [Use `crictl` run Pods in Kata containers](#use-crictl-run-pods-in-kata-containers)
* [Run `busybox` Pod](#run-busybox-pod)
* [Run pod sandbox with config file](#run-pod-sandbox-with-config-file)
* [Create container in the pod sandbox with config file](#create-container-in-the-pod-sandbox-with-config-file)
* [Start container](#start-container)
* [Run `redis` Pod](#run-redis-pod)
* [Create `redis-server` Pod](#create-redis-server-pod)
* [Create `redis-client` Pod](#create-redis-client-pod)
* [Check `redis` server is working](#check-redis-server-is-working)
## What's `cri-tools`
[`cri-tools`](https://github.com/kubernetes-sigs/cri-tools) provides debugging and validation tools for Kubelet Container Runtime Interface (CRI).


@@ -1,5 +1,18 @@
# Run Kata Containers with Kubernetes
* [Run Kata Containers with Kubernetes](#run-kata-containers-with-kubernetes)
* [Prerequisites](#prerequisites)
* [Install a CRI implementation](#install-a-cri-implementation)
* [CRI-O](#cri-o)
* [Kubernetes Runtime Class (CRI-O v1.12 )](#kubernetes-runtime-class-cri-o-v112)
* [Untrusted annotation (until CRI-O v1.12)](#untrusted-annotation-until-cri-o-v112)
* [Network namespace management](#network-namespace-management)
* [containerd with CRI plugin](#containerd-with-cri-plugin)
* [Install Kubernetes](#install-kubernetes)
* [Configure for CRI-O](#configure-for-cri-o)
* [Configure for containerd](#configure-for-containerd)
* [Run a Kubernetes pod with Kata Containers](#run-a-kubernetes-pod-with-kata-containers)
## Prerequisites
This guide requires Kata Containers to be available on your system; you can install it by following [this guide](../install/README.md).
@@ -158,10 +171,10 @@ $ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
# If using CRI-O
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16
$ sudo kubeadm init --skip-preflight-checks --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16
# If using CRI-containerd
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
$ sudo kubeadm init --skip-preflight-checks --cri-socket /run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
$ export KUBECONFIG=/etc/kubernetes/admin.conf
```


@@ -1,5 +1,21 @@
# Kata Containers and service mesh for Kubernetes
* [Assumptions](#assumptions)
* [How they work](#how-they-work)
* [Prerequisites](#prerequisites)
* [Kata and Kubernetes](#kata-and-kubernetes)
* [Restrictions](#restrictions)
* [Install and deploy your service mesh](#install-and-deploy-your-service-mesh)
* [Service Mesh Istio](#service-mesh-istio)
* [Service Mesh Linkerd](#service-mesh-linkerd)
* [Inject your services with sidecars](#inject-your-services-with-sidecars)
* [Sidecar Istio](#sidecar-istio)
* [Sidecar Linkerd](#sidecar-linkerd)
* [Run your services with Kata](#run-your-services-with-kata)
* [Lower privileges](#lower-privileges)
* [Add annotations](#add-annotations)
* [Deploy](#deploy)
A service mesh is a way to monitor and control the traffic between
micro-services running in your Kubernetes cluster. It is a powerful
tool that you might want to use in combination with the security
@@ -60,16 +76,15 @@ is not able to perform a proper setup of the rules.
### Service Mesh Istio
The following is a summary of what you need to install Istio on your system:
As a reference, you can follow Istio [instructions](https://istio.io/docs/setup/kubernetes/quick-start/#download-and-prepare-for-the-installation).
The following is a summary of what you need to install Istio on your system:
```
$ curl -L https://git.io/getLatestIstio | sh -
$ cd istio-*
$ export PATH=$PWD/bin:$PATH
```
See the [Istio documentation](https://istio.io/docs) for further details.
Now deploy Istio in the control plane of your cluster with the following:
```
$ kubectl apply -f install/kubernetes/istio-demo.yaml


@@ -1,5 +1,10 @@
# What Is VMCache and How To Enable It
* [What is VMCache](#what-is-vmcache)
* [How is this different to VM templating](#how-is-this-different-to-vm-templating)
* [How to enable VMCache](#how-to-enable-vmcache)
* [Limitations](#limitations)
### What is VMCache
VMCache is a new feature that creates VMs as caches before they are used.


@@ -1,7 +1,6 @@
# What Is VM Templating and How To Enable It
### What is VM templating
VM templating is a Kata Containers feature that enables new VM
creation using a cloning technique. When enabled, new VMs are created
by cloning from a pre-created template VM, and they will share the
@@ -9,13 +8,11 @@ same initramfs, kernel and agent memory in readonly mode. It is very
much like a process fork done by the kernel but here we *fork* VMs.
### How is this different from VMCache
Both [VMCache](../how-to/what-is-vm-cache-and-how-do-I-use-it.md) and VM templating help speed up new container creation.
When VMCache is enabled, new VMs are created by the VMCache server. It is therefore not vulnerable to shared-memory CVEs, because the VMs do not share memory.
VM templating saves a lot of memory if there are many Kata Containers running on the same host.
### What are the Pros
VM templating helps speed up new container creation and saves a lot
of memory if there are many Kata Containers running on the same host.
If you are running a density workload, or care a lot about container
@@ -32,7 +29,6 @@ showed that VM templating speeds up Kata Containers creation by as much as
38.68%. See [full results here](https://gist.github.com/bergwolf/06974a3c5981494a40e2c408681c085d).
### What are the Cons
One drawback of VM templating is that it cannot prevent cross-VM side-channel
attacks such as [CVE-2015-2877](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-2877),
which originally targeted the Linux KSM feature.
@@ -43,11 +39,10 @@ and can be classified as potentially misunderstood behaviors rather than vulnera
**Warning**: If you care about such attack vectors, do not use VM templating or KSM.
### How to enable VM templating
VM templating can be enabled by changing your Kata Containers config file (`/usr/share/defaults/kata-containers/configuration.toml`,
overridden by `/etc/kata-containers/configuration.toml` if provided) such that:
- `qemu` version `v4.1.0` or above is specified in `hypervisor.qemu`->`path` section
- `qemu-lite` is specified in `hypervisor.qemu`->`path` section
- `enable_template = true`
- `initrd =` is set
- `image =` option is commented out or removed
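The settings listed above can be checked with a small shell function. This is a minimal sketch and not part of Kata Containers: the function name `check_templating_config` is hypothetical, and it only inspects the `enable_template`, `initrd`, and `image` settings. It does not validate the hypervisor `path`.

```bash
# check_templating_config FILE: succeed if FILE appears ready for VM
# templating according to the three settings checked below.
check_templating_config() {
  cfg="$1"
  # enable_template must be set to true on an uncommented line.
  grep -Eq '^[[:space:]]*enable_template[[:space:]]*=[[:space:]]*true' "$cfg" || return 1
  # An uncommented initrd line with a non-empty quoted value must exist.
  grep -Eq '^[[:space:]]*initrd[[:space:]]*=[[:space:]]*"[^"]+"' "$cfg" || return 1
  # No uncommented image line may remain.
  if grep -Eq '^[[:space:]]*image[[:space:]]*=' "$cfg"; then
    return 1
  fi
  return 0
}

# Hypothetical usage against the override path mentioned above:
#   check_templating_config /etc/kata-containers/configuration.toml \
#     && echo "templating settings look consistent"
```

The check is purely textual, so it cannot tell which TOML section a key belongs to; treat a passing result as a hint rather than a guarantee.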


@@ -1,5 +1,11 @@
# Hypervisors
* [Hypervisors](#hypervisors)
* [Introduction](#introduction)
* [Types](#types)
* [Determine currently configured hypervisor](#determine-currently-configured-hypervisor)
* [Choose a Hypervisor](#choose-a-hypervisor)
## Introduction
Kata Containers supports multiple hypervisors. This document provides a very


@@ -1,19 +1,39 @@
# Kata Containers installation guides
# Kata Containers installation user guides
The following is an overview of the different installation methods available.
* [Kata Containers installation user guides](#kata-containers-installation-user-guides)
* [Prerequisites](#prerequisites)
* [Legacy installation](#legacy-installation)
* [Packaged installation methods](#packaged-installation-methods)
* [Official packages](#official-packages)
* [Snap Installation](#snap-installation)
* [Automatic Installation](#automatic-installation)
* [Manual Installation](#manual-installation)
* [Build from source installation](#build-from-source-installation)
* [Installing on a Cloud Service Platform](#installing-on-a-cloud-service-platform)
* [Further information](#further-information)
The following is an overview of the different installation methods available. All of these methods equally result
in a system configured to run Kata Containers.
## Prerequisites
Kata Containers requires nested virtualization or bare metal. Check
[hardware requirements](/src/runtime/README.md#hardware-requirements) to see if your system is capable of running Kata
Containers.
Kata Containers requires nested virtualization or bare metal.
See the
[hardware requirements](/src/runtime/README.md#hardware-requirements)
to see if your system is capable of running Kata Containers.
## Legacy installation
If you wish to install a legacy 1.x version of Kata Containers, see
[the Kata Containers 1.x installation documentation](https://github.com/kata-containers/documentation/tree/master/install/).
## Packaged installation methods
Packaged installation methods use your distribution's native package format (such as RPM or DEB).
*Note:* We encourage installation methods that provide automatic updates, because they ensure security updates and bug fixes are
easily applied.
> **Notes:**
>
> - Packaged installation methods use your distribution's native package format (such as RPM or DEB).
> - You are strongly encouraged to choose an installation method that provides
> automatic updates, to ensure you benefit from security updates and bug fixes.
| Installation method | Description | Automatic updates | Use case |
|------------------------------------------------------|---------------------------------------------------------------------|-------------------|----------------------------------------------------------|
@@ -32,9 +52,16 @@ Kata packages are provided by official distribution repositories for:
| [CentOS](centos-installation-guide.md) | 8 |
| [Fedora](fedora-installation-guide.md) | 34 |
> **Note:**
>
> All users are encouraged to use the official distribution versions of Kata
> Containers unless they understand the implications of alternative methods.
### Snap Installation
The snap installation is available for all distributions which support `snapd`.
> **Note:** The snap installation is available for all distributions which support `snapd`.
[![Get it from the Snap Store](https://snapcraft.io/static/images/badges/en/snap-store-black.svg)](https://snapcraft.io/kata-containers)
[Use snap](snap-installation-guide.md) to install Kata Containers from https://snapcraft.io.
@@ -48,9 +75,11 @@ Follow the [containerd installation guide](container-manager/containerd/containe
## Build from source installation
*Note:* Power users who decide to build from sources should be aware of the
implications of using an unpackaged system which will not be automatically
updated as new [releases](../Stable-Branch-Strategy.md) are made available.
> **Notes:**
>
> - Power users who decide to build from sources should be aware of the
> implications of using an unpackaged system which will not be automatically
> updated as new [releases](../Stable-Branch-Strategy.md) are made available.
[Building from sources](../Developer-Guide.md#initial-setup) allows power users
who are comfortable building software from source to use the latest component
@@ -66,6 +95,6 @@ versions. This is not recommended for normal users.
## Further information
* [upgrading document](../Upgrading.md)
* [developer guide](../Developer-Guide.md)
* [runtime documentation](../../src/runtime/README.md)
* The [upgrading document](../Upgrading.md).
* The [developer guide](../Developer-Guide.md).
* The [runtime documentation](../../src/runtime/README.md).


@@ -1,5 +1,10 @@
# Install Kata Containers on Amazon Web Services
* [Install and Configure AWS CLI](#install-and-configure-aws-cli)
* [Create or Import an EC2 SSH key pair](#create-or-import-an-ec2-ssh-key-pair)
* [Launch i3.metal instance](#launch-i3metal-instance)
* [Install Kata](#install-kata)
Kata Containers on Amazon Web Services (AWS) makes use of [i3.metal](https://aws.amazon.com/ec2/instance-types/i3/) instances. Most of the installation procedure is identical to that for Kata on your preferred distribution, except that you have to run it on bare metal instances since AWS doesn't support nested virtualization yet. This guide walks you through creating an i3.metal instance.
## Install and Configure AWS CLI


@@ -98,12 +98,12 @@
```toml
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "kata"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
[plugins.cri]
[plugins.cri.containerd]
default_runtime_name = "kata"
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
```
> **Note:**


@@ -1,5 +1,11 @@
# Install Kata Containers on Google Compute Engine
* [Create an Image with Nested Virtualization Enabled](#create-an-image-with-nested-virtualization-enabled)
* [Create the Image](#create-the-image)
* [Verify VMX is Available](#verify-vmx-is-available)
* [Install Kata](#install-kata)
* [Create a Kata-enabled Image](#create-a-kata-enabled-image)
Kata Containers on Google Compute Engine (GCE) makes use of [nested virtualization](https://cloud.google.com/compute/docs/instances/enable-nested-virtualization-vm-instances). Most of the installation procedure is identical to that for Kata on your preferred distribution, but enabling nested virtualization currently requires extra steps on GCE. This guide walks you through creating an image and instance with nested virtualization enabled. Note that `kata-runtime check` checks for nested virtualization, but does not fail if support is not found.
As a pre-requisite this guide assumes an installed and configured instance of the [Google Cloud SDK](https://cloud.google.com/sdk/downloads). For a zero-configuration option, all of the commands below have been tested under [Google Cloud Shell](https://cloud.google.com/shell/) (as of Jun 2018). Verify your `gcloud` installation and configuration:


@@ -1,12 +1,24 @@
# Installing Kata Containers in Minikube
* [Installing Kata Containers in Minikube](#installing-kata-containers-in-minikube)
* [Introduction](#introduction)
* [Prerequisites](#prerequisites)
* [Setting up Minikube](#setting-up-minikube)
* [Checking for nested virtualization](#checking-for-nested-virtualization)
* [Check Minikube is running](#check-minikube-is-running)
* [Installing Kata Containers](#installing-kata-containers)
* [Enabling Kata Containers](#enabling-kata-containers)
* [Register the runtime](#register-the-runtime)
* [Testing Kata Containers](#testing-kata-containers)
* [Wrapping up](#wrapping-up)
## Introduction
[Minikube](https://kubernetes.io/docs/setup/minikube/) is an easy way to try out a Kubernetes (k8s)
cluster locally. It creates a single node Kubernetes stack in a local VM.
[Kata Containers](https://github.com/kata-containers) can be installed into a Minikube cluster using
[`kata-deploy`](https://github.com/kata-containers/kata-containers/tree/main/tools/packaging/kata-deploy).
[`kata-deploy`](https://github.com/kata-containers/packaging/tree/master/kata-deploy).
This document details the pre-requisites, installation steps, and how to check
the installation has been successful.
@@ -123,7 +135,7 @@ $ kubectl apply -f kata-deploy/base/kata-deploy.yaml
This installs the Kata Containers components into `/opt/kata` inside the Minikube node. It can take
a few minutes for the operation to complete. You can check the installation has worked by checking
the status of the `kata-deploy` pod, which will be executing
[this script](https://github.com/kata-containers/kata-containers/tree/main/tools/packaging/kata-deploy/scripts/kata-deploy.sh),
[this script](https://github.com/kata-containers/packaging/blob/master/kata-deploy/scripts/kata-deploy.sh),
and will be executing a `sleep infinity` once it has successfully completed its work.
You can accomplish this by running the following:
@@ -154,8 +166,8 @@ $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/node-api/master/
Now register the `kata qemu` runtime with that class. This should result in no errors:
```sh
$ cd kata-containers/tools/packaging/kata-deploy/runtimeclasses
$ kubectl apply -f kata-runtimeClasses.yaml
$ cd kata-containers/tools/packaging/kata-deploy/k8s-1.14
$ kubectl apply -f kata-qemu-runtimeClass.yaml
```
The Kata Containers installation process should be complete and enabled in the Minikube cluster.


@@ -1,5 +1,11 @@
# Kata Containers snap package
* [Install Kata Containers](#install-kata-containers)
* [Configure Kata Containers](#configure-kata-containers)
* [Integration with shim v2 Container Engines](#integration-with-shim-v2-container-engines)
* [Remove Kata Containers snap package](#remove-kata-containers-snap-package)
## Install Kata Containers
Kata Containers can be installed in any Linux distribution that supports


@@ -1,5 +1,13 @@
# Using Intel GPU device with Kata Containers
- [Using Intel GPU device with Kata Containers](#using-intel-gpu-device-with-kata-containers)
- [Hardware Requirements](#hardware-requirements)
- [Host Kernel Requirements](#host-kernel-requirements)
- [Install and configure Kata Containers](#install-and-configure-kata-containers)
- [Build Kata Containers kernel with GPU support](#build-kata-containers-kernel-with-gpu-support)
- [GVT-d with Kata Containers](#gvt-d-with-kata-containers)
- [GVT-g with Kata Containers](#gvt-g-with-kata-containers)
An Intel Graphics device can be passed to a Kata Containers container using GPU
passthrough (Intel GVT-d) as well as GPU mediated passthrough (Intel GVT-g).
@@ -57,8 +65,8 @@ configuration in the Kata `configuration.toml` file as shown below.
$ sudo sed -i -e 's/^# *\(hotplug_vfio_on_root_bus\).*=.*$/\1 = true/g' /usr/share/defaults/kata-containers/configuration.toml
```
Make sure you are using the `q35` machine type by verifying `machine_type = "q35"` is
set in the `configuration.toml`. Make sure `pcie_root_port` is set to a positive value.
Make sure you are using the `pc` machine type by verifying `machine_type = "pc"` is
set in the `configuration.toml`.
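The machine type check can be scripted. The following is a minimal sketch with a hypothetical helper name, `configured_machine_type`; it assumes the value is written as a double-quoted string on an uncommented `machine_type` line, and it prints only the first such line.

```bash
# configured_machine_type FILE: print the value of the first uncommented
# machine_type = "..." line in FILE (prints nothing if there is none).
configured_machine_type() {
  sed -n -E 's/^[[:space:]]*machine_type[[:space:]]*=[[:space:]]*"([^"]*)".*$/\1/p' "$1" | head -n 1
}

# Hypothetical usage against the default configuration path used above:
#   configured_machine_type /usr/share/defaults/kata-containers/configuration.toml
```

Compare the printed value with the machine type this guide asks for before continuing.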
## Build Kata Containers kernel with GPU support


@@ -1,5 +1,17 @@
# Using Nvidia GPU device with Kata Containers
- [Using Nvidia GPU device with Kata Containers](#using-nvidia-gpu-device-with-kata-containers)
- [Hardware Requirements](#hardware-requirements)
- [Host BIOS Requirements](#host-bios-requirements)
- [Host Kernel Requirements](#host-kernel-requirements)
- [Install and configure Kata Containers](#install-and-configure-kata-containers)
- [Build Kata Containers kernel with GPU support](#build-kata-containers-kernel-with-gpu-support)
- [Nvidia GPU pass-through mode with Kata Containers](#nvidia-gpu-pass-through-mode-with-kata-containers)
- [Nvidia vGPU mode with Kata Containers](#nvidia-vgpu-mode-with-kata-containers)
- [Install Nvidia Driver in Kata Containers](#install-nvidia-driver-in-kata-containers)
- [References](#references)
An Nvidia GPU device can be passed to a Kata Containers container using GPU passthrough
(Nvidia GPU pass-through mode) as well as GPU mediated passthrough (Nvidia vGPU mode). 
@@ -63,6 +75,13 @@ To use non-large BARs devices (for example, Nvidia Tesla T4), you need Kata vers
Follow the [Kata Containers setup instructions](../install/README.md)
to install the latest version of Kata.
The following configuration in the Kata `configuration.toml` file can work:
```
machine_type = "pc"
hotplug_vfio_on_root_bus = true
```
To use large BARs devices (for example, Nvidia Tesla P100), you need Kata version 1.11.0 or above.
The following configuration in the Kata `configuration.toml` file can work:
@@ -291,4 +310,4 @@ Tue Mar 3 00:03:49 2020
- [Configuring a VM for GPU Pass-Through by Using the QEMU Command Line](https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#using-gpu-pass-through-red-hat-el-qemu-cli)
- https://gitlab.com/nvidia/container-images/driver/-/tree/master
- https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers
- https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers-(Beta)


@@ -1,5 +1,33 @@
# Table of Contents
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Helpful Links before starting](#helpful-links-before-starting)
- [Steps to enable Intel® QAT in Kata Containers](#steps-to-enable-intel-qat-in-kata-containers)
- [Script variables](#script-variables)
- [Set environment variables (Every Reboot)](#set-environment-variables-every-reboot)
- [Prepare the Ubuntu Host](#prepare-the-ubuntu-host)
- [Identify which PCI Bus the Intel® QAT card is on](#identify-which-pci-bus-the-intel-qat-card-is-on)
- [Install necessary packages for Ubuntu](#install-necessary-packages-for-ubuntu)
- [Download Intel® QAT drivers](#download-intel-qat-drivers)
- [Copy Intel® QAT configuration files and enable virtual functions](#copy-intel-qat-configuration-files-and-enable-virtual-functions)
- [Expose and Bind Intel® QAT virtual functions to VFIO-PCI (Every reboot)](#expose-and-bind-intel-qat-virtual-functions-to-vfio-pci-every-reboot)
- [Check Intel® QAT virtual functions are enabled](#check-intel-qat-virtual-functions-are-enabled)
- [Prepare Kata Containers](#prepare-kata-containers)
- [Download Kata kernel Source](#download-kata-kernel-source)
- [Build Kata kernel](#build-kata-kernel)
- [Copy Kata kernel](#copy-kata-kernel)
- [Prepare Kata root filesystem](#prepare-kata-root-filesystem)
- [Compile Intel® QAT drivers for Kata Containers kernel and add to Kata Containers rootfs](#compile-intel-qat-drivers-for-kata-containers-kernel-and-add-to-kata-containers-rootfs)
- [Copy Kata rootfs](#copy-kata-rootfs)
- [Verify Intel® QAT works in a container](#verify-intel-qat-works-in-a-container)
- [Build OpenSSL Intel® QAT engine container](#build-openssl-intel-qat-engine-container)
- [Test Intel® QAT with the ctr tool](#test-intel-qat-with-the-ctr-tool)
- [Test Intel® QAT in Kubernetes](#test-intel-qat-in-kubernetes)
- [Troubleshooting](#troubleshooting)
- [Optional Scripts](#optional-scripts)
- [Verify Intel® QAT card counters are incremented](#verify-intel-qat-card-counters-are-incremented)
# Introduction
Intel® QuickAssist Technology (QAT) provides hardware acceleration
@@ -46,7 +74,7 @@ Make sure to check [`01.org`](https://01.org/intel-quickassist-technology) for
the latest driver.
```bash
$ export QAT_DRIVER_VER=qat1.7.l.4.14.0-00031.tar.gz
$ export QAT_DRIVER_VER=qat1.7.l.4.12.0-00011.tar.gz
$ export QAT_DRIVER_URL=https://downloadmirror.intel.com/30178/eng/${QAT_DRIVER_VER}
$ export QAT_CONF_LOCATION=~/QAT_conf
$ export QAT_DOCKERFILE=https://raw.githubusercontent.com/intel/intel-device-plugins-for-kubernetes/master/demo/openssl-qat-engine/Dockerfile
@@ -374,7 +402,7 @@ different hypervisor, different install method for Kata, or a different
Intel® QAT chipset then the command will need to be modified.
> **Note: The following was tested with
[containerd v1.4.6](https://github.com/containerd/containerd/releases/tag/v1.4.6).**
[containerd v1.3.9](https://github.com/containerd/containerd/releases/tag/v1.3.9).**
```bash
$ config_file="/opt/kata/share/defaults/kata-containers/configuration-qemu.toml"
@@ -576,4 +604,4 @@ $ for i in 0434 0435 37c8 1f18 1f19; do lspci -d 8086:$i; done
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b1\:00.0/fw_counters
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b3\:00.0/fw_counters
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b5\:00.0/fw_counters
```
```

@@ -1,5 +1,10 @@
# Kata Containers with SGX
- [Check if SGX is enabled](#check-if-sgx-is-enabled)
- [Install Host kernel with SGX support](#install-host-kernel-with-sgx-support)
- [Install Guest kernel with SGX support](#install-guest-kernel-with-sgx-support)
- [Run Kata Containers with SGX enabled](#run-kata-containers-with-sgx-enabled)
Intel® Software Guard Extensions (SGX) is a set of instructions that increases the security
of application code and data, giving them more protection from disclosure or modification.

@@ -1,6 +1,13 @@
# Setup to run SPDK vhost-user devices with Kata Containers and Docker*
> **Note:** This guide only applies to QEMU, since the vhost-user storage
- [SPDK vhost-user target overview](#spdk-vhost-user-target-overview)
- [Install and setup SPDK vhost-user target](#install-and-setup-spdk-vhost-user-target)
- [Get source code and build SPDK](#get-source-code-and-build-spdk)
- [Run SPDK vhost-user target](#run-spdk-vhost-user-target)
- [Host setup for vhost-user devices](#host-setup-for-vhost-user-devices)
- [Launch a Kata container with SPDK vhost-user block device](#launch-a-kata-container-with-spdk-vhost-user-block-device)
> **NOTE:** This guide only applies to QEMU, since the vhost-user storage
> device is only available for QEMU now. The enablement work on other
> hypervisors is still ongoing.

@@ -1,5 +1,13 @@
# Setup to use SR-IOV with Kata Containers and Docker*
- [Install the SR-IOV Docker\* plugin](#install-the-sr-iov-docker-plugin)
- [Host setup for SR-IOV](#host-setup-for-sr-iov)
- [Checking your NIC for SR-IOV](#checking-your-nic-for-sr-iov)
- [IOMMU Groups and PCIe Access Control Services](#iommu-groups-and-pcie-access-control-services)
- [Update the host kernel](#update-the-host-kernel)
- [Set up the SR-IOV Device](#set-up-the-sr-iov-device)
- [Example: Launch a Kata Containers container using SR-IOV](#example-launch-a-kata-containers-container-using-sr-iov)
Single Root I/O Virtualization (SR-IOV) enables splitting a physical device into
virtual functions (VFs). Virtual functions enable direct passthrough to virtual
machines or containers. For Kata Containers, we enabled a Container Network

@@ -12,7 +12,7 @@ For more information about VPP visit their [wiki](https://wiki.fd.io/view/VPP).
## Install and configure Kata Containers
Follow the [Kata Containers setup instructions](../Developer-Guide.md).
Follow the [Kata Containers setup instructions](https://github.com/kata-containers/documentation/wiki/Developer-Guide).
In order to make use of VHOST-USER based interfaces, the container needs to be backed
by huge pages. `HugePages` support is required for the large memory pool allocation used for

@@ -1,5 +1,4 @@
# OpenStack Zun DevStack working with Kata Containers
## Introduction
This guide describes how to get Kata Containers to work with OpenStack Zun

@@ -1,5 +1,13 @@
# Kata Containers snap image
* [Initial setup](#initial-setup)
* [Install snap](#install-snap)
* [Build and install snap image](#build-and-install-snap-image)
* [Configure Kata Containers](#configure-kata-containers)
* [Integration with docker and Kubernetes](#integration-with-docker-and-kubernetes)
* [Remove snap](#remove-snap)
* [Limitations](#limitations)
This directory contains the resources needed to build the Kata Containers
[snap][1] image.

@@ -80,8 +80,6 @@ parts:
- uidmap
- gnupg2
override-build: |
[ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "s390x" ] && sudo apt-get --no-install-recommends install -y protobuf-compiler
yq=${SNAPCRAFT_STAGE}/yq
# set GOPATH
@@ -90,7 +88,6 @@ parts:
export GOROOT=${SNAPCRAFT_STAGE}
export PATH="${GOROOT}/bin:${PATH}"
export GO111MODULE="auto"
http_proxy=${http_proxy:-""}
https_proxy=${https_proxy:-""}
@@ -115,17 +112,14 @@ parts:
cd ${kata_dir}/tools/osbuilder
# build image
export AGENT_VERSION=$(cat ${kata_dir}/VERSION)
export AGENT_INIT=yes
export USE_DOCKER=1
export DEBUG=1
case "$(uname -m)" in
aarch64)
aarch64|ppc64le|s390x)
sudo -E PATH=$PATH make initrd DISTRO=alpine
;;
ppc64le|s390x)
# Cannot use alpine on ppc64le/s390x because it would require a musl agent
sudo -E PATH=$PATH make initrd DISTRO=ubuntu
;;
x86_64)
# In some build systems it's impossible to build a rootfs image, try with the initrd image
sudo -E PATH=$PATH make image DISTRO=clearlinux || sudo -E PATH=$PATH make initrd DISTRO=alpine
@@ -147,7 +141,6 @@ parts:
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GOROOT=${SNAPCRAFT_STAGE}
export PATH="${GOROOT}/bin:${PATH}"
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
cd ${kata_dir}/src/runtime
@@ -169,9 +162,12 @@ parts:
SKIP_GO_VERSION_CHECK=1 \
QEMUCMD=qemu-system-$arch
if [ ! -f ${SNAPCRAFT_PART_INSTALL}/../../image/install/usr/share/kata-containers/kata-containers.img ]; then
sed -i -e "s|^image =.*|initrd = \"/snap/${SNAPCRAFT_PROJECT_NAME}/current/usr/share/kata-containers/kata-containers-initrd.img\"|" \
${SNAPCRAFT_PART_INSTALL}/usr/share/defaults/${SNAPCRAFT_PROJECT_NAME}/configuration.toml
if [ -e ${SNAPCRAFT_PART_INSTALL}/../../image/install/usr/share/kata-containers/kata-containers.img ]; then
# Use rootfs image by default
sed -i -e '/^initrd =/d' ${SNAPCRAFT_PART_INSTALL}/usr/share/defaults/${SNAPCRAFT_PROJECT_NAME}/configuration.toml
else
# Use initrd by default
sed -i -e '/^image =/d' ${SNAPCRAFT_PART_INSTALL}/usr/share/defaults/${SNAPCRAFT_PROJECT_NAME}/configuration.toml
fi
kernel:
@@ -184,29 +180,19 @@ parts:
- bison
- flex
override-build: |
yq=${SNAPCRAFT_STAGE}/yq
export GOPATH=${SNAPCRAFT_STAGE}/gopath
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
versions_file="${kata_dir}/versions.yaml"
kernel_version="$(${yq} r $versions_file assets.kernel.version)"
#Remove extra 'v'
kernel_version=${kernel_version#v}
[ "$(uname -m)" = "s390x" ] && sudo apt-get --no-install-recommends install -y libssl-dev
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
cd ${kata_dir}/tools/packaging/kernel
# Setup and build kernel
./build-kernel.sh -v ${kernel_version} -d setup
./build-kernel.sh -d setup
kernel_dir_prefix="kata-linux-"
cd ${kernel_dir_prefix}*
version=$(basename ${PWD} | sed 's|'"${kernel_dir_prefix}"'||' | cut -d- -f1)
make -j $(($(nproc)-1)) EXTRAVERSION=".container"
kernel_suffix=${kernel_version}.container
kernel_suffix=${version}.container
kata_kernel_dir=${SNAPCRAFT_PART_INSTALL}/usr/share/kata-containers
mkdir -p ${kata_kernel_dir}
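
The shell pipeline above derives the kernel version from the build directory name by removing the `kata-linux-` prefix and keeping everything before the first `-`. A minimal Rust sketch of that string handling follows, for illustration only. The function name `kernel_version_from_dir` and the example directory names are assumptions, not part of the snap recipe.

```rust
/// Mirror `sed 's|kata-linux-||' | cut -d- -f1` on a directory name.
/// Removes the first occurrence of the prefix, then keeps the text
/// before the first '-' (or the whole string if there is none).
fn kernel_version_from_dir(dir_name: &str) -> String {
    let stripped = dir_name.replacen("kata-linux-", "", 1);
    stripped.split('-').next().unwrap_or("").to_string()
}
```

Under these assumptions, a directory named `kata-linux-5.10.25-85` would yield `5.10.25`, which the recipe then suffixes with `.container`.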
@@ -216,10 +202,8 @@ parts:
ln -sf ${vmlinuz_name} ${kata_kernel_dir}/vmlinuz.container
# Install raw kernel
vmlinux_path=vmlinux
[ "$(uname -m)" = "s390x" ] && vmlinux_path=arch/s390/boot/compressed/vmlinux
vmlinux_name=vmlinux-${kernel_suffix}
cp ${vmlinux_path} ${kata_kernel_dir}/${vmlinux_name}
cp vmlinux ${kata_kernel_dir}/${vmlinux_name}
ln -sf ${vmlinux_name} ${kata_kernel_dir}/vmlinux.container
qemu:
@@ -243,24 +227,21 @@ parts:
- libblkid-dev
- libffi-dev
- libmount-dev
- libseccomp-dev
- libselinux1-dev
- ninja-build
override-build: |
yq=${SNAPCRAFT_STAGE}/yq
export GOPATH=${SNAPCRAFT_STAGE}/gopath
export GO111MODULE="auto"
kata_dir=${GOPATH}/src/github.com/${SNAPCRAFT_PROJECT_NAME}/${SNAPCRAFT_PROJECT_NAME}
versions_file="${kata_dir}/versions.yaml"
# arch-specific definition
case "$(uname -m)" in
"aarch64")
branch="$(${yq} r ${versions_file} assets.hypervisor.qemu.architecture.aarch64.version)"
branch="$(${yq} r ${versions_file} assets.hypervisor.qemu.architecture.aarch64.branch)"
url="$(${yq} r ${versions_file} assets.hypervisor.qemu.url)"
commit="$(${yq} r ${versions_file} assets.hypervisor.qemu.architecture.aarch64.commit)"
patches_dir="${kata_dir}/tools/packaging/qemu/patches/$(echo ${branch} | sed -e 's/.[[:digit:]]*$//' -e 's/^v//').x"
patches_version_dir="${kata_dir}/tools/packaging/qemu/patches/tag_patches/${branch}"
patches_dir="${kata_dir}/tools/packaging/obs-packaging/qemu-aarch64/patches/"
;;
*)
@@ -274,7 +255,6 @@ parts:
# download source
qemu_dir=${SNAPCRAFT_STAGE}/qemu
rm -rf "${qemu_dir}"
git clone --branch ${branch} --single-branch ${url} "${qemu_dir}"
cd ${qemu_dir}
[ -z "${commit}" ] || git checkout ${commit}
@@ -283,12 +263,11 @@ parts:
[ -n "$(ls -A capstone)" ] || git clone https://github.com/qemu/capstone capstone
# Apply branch patches
[ -d "${patches_version_dir}" ] || mkdir "${patches_version_dir}"
${kata_dir}/tools/packaging/scripts/apply_patches.sh "${patches_dir}"
${kata_dir}/tools/packaging/scripts/apply_patches.sh "${patches_version_dir}"
# Only x86_64 supports libpmem
[ "$(uname -m)" = "x86_64" ] && sudo apt-get --no-install-recommends install -y apt-utils ca-certificates libpmem-dev
[ "$(uname -m)" = "x86_64" ] && sudo apt-get --no-install-recommends install -y apt-utils ca-certificates libpmem-dev libseccomp-dev
configure_hypervisor=${kata_dir}/tools/packaging/scripts/configure-hypervisor.sh
chmod +x ${configure_hypervisor}
@@ -299,15 +278,7 @@ parts:
| xargs ./configure
# Copy QEMU configurations (Kconfigs)
case "${branch}" in
"v5.1.0")
cp -a ${kata_dir}/tools/packaging/qemu/default-configs/* default-configs
;;
*)
cp -a ${kata_dir}/tools/packaging/qemu/default-configs/* default-configs/devices/
;;
esac
cp -a ${kata_dir}/tools/packaging/qemu/default-configs/* default-configs/devices/
# build and install
make -j $(($(nproc)-1))

@@ -1,2 +1 @@
tarpaulin-report.html
vendor/

822
src/agent/Cargo.lock generated

File diff suppressed because it is too large

@@ -13,26 +13,23 @@ lazy_static = "1.3.0"
ttrpc = { version = "0.5.0", features = ["async", "protobuf-codec"], default-features = false }
protobuf = "=2.14.0"
libc = "0.2.58"
nix = "0.21.0"
capctl = "0.2.0"
nix = "0.17.0"
prctl = "1.0.0"
serde_json = "1.0.39"
scan_fmt = "0.2.3"
scopeguard = "1.0.0"
thiserror = "1.0.26"
regex = "1"
# Async helpers
async-trait = "0.1.42"
async-recursion = "0.3.2"
tokio = { version = "1.2.0", features = ["rt", "rt-multi-thread", "sync", "macros", "io-util", "time", "signal", "io-std", "process", "fs"] }
futures = "0.3.12"
# Async runtime
tokio = { version = "1", features = ["full"] }
netlink-sys = { version = "0.6.0", features = ["tokio_socket",]}
tokio-vsock = "0.3.1"
netlink-sys = { version = "0.7.0", features = ["tokio_socket",]}
rtnetlink = "0.8.0"
netlink-packet-utils = "0.4.1"
# Because the author has no time to maintain the crate, we switch the dependency to github,
# Once the new version released on crates.io, we switch it back.
# https://github.com/little-dude/netlink/issues/161
rtnetlink = { git = "https://github.com/little-dude/netlink", rev = "a9367bc4700496ddebc088110c28f40962923326" }
netlink-packet-utils = "0.4.0"
ipnetwork = "0.17.0"
# slog:
@@ -46,21 +43,13 @@ slog-scope = "4.1.2"
slog-stdlog = "4.0.0"
log = "0.4.11"
# for testing
tempfile = "3.1.0"
prometheus = { version = "0.9.0", features = ["process"] }
procfs = "0.7.9"
anyhow = "1.0.32"
cgroups = { package = "cgroups-rs", version = "0.2.5" }
# Tracing
tracing = "0.1.26"
tracing-subscriber = "0.2.18"
tracing-opentelemetry = "0.13.0"
opentelemetry = { version = "0.14.0", features = ["rt-tokio-current-thread"]}
vsock-exporter = { path = "vsock-exporter" }
[dev-dependencies]
tempfile = "3.1.0"
[workspace]
members = [
"oci",

@@ -27,7 +27,40 @@ COMMIT_MSG = $(if $(COMMIT),$(COMMIT),unknown)
# Exported to allow cargo to see it
export VERSION_COMMIT := $(if $(COMMIT),$(VERSION)-$(COMMIT),$(VERSION))
include ../../utils.mk
##VAR BUILD_TYPE=release|debug type of rust build
BUILD_TYPE = release
##VAR ARCH=arch target to build (format: uname -m)
ARCH = $(shell uname -m)
##VAR LIBC=musl|gnu
LIBC ?= musl
ifneq ($(LIBC),musl)
ifeq ($(LIBC),gnu)
override LIBC = gnu
else
$(error "ERROR: A non supported LIBC value was passed. Supported values are musl and gnu")
endif
endif
ifeq ($(ARCH), ppc64le)
override ARCH = powerpc64le
override LIBC = gnu
$(warning "WARNING: powerpc64le-unknown-linux-musl target is unavailable")
endif
ifeq ($(ARCH), s390x)
override LIBC = gnu
$(warning "WARNING: s390x-unknown-linux-musl target is unavailable")
endif
EXTRA_RUSTFLAGS :=
ifeq ($(ARCH), aarch64)
override EXTRA_RUSTFLAGS = -C link-arg=-lgcc
$(warning "WARNING: aarch64-musl needs extra symbols from libgcc")
endif
TRIPLE = $(ARCH)-unknown-linux-$(LIBC)
TARGET_PATH = target/$(TRIPLE)/$(BUILD_TYPE)/$(TARGET)
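
The Makefile fragment above selects a Rust target triple from `ARCH` and `LIBC`: only `musl` and `gnu` are accepted, `ppc64le` is renamed to `powerpc64le`, and both `ppc64le` and `s390x` are forced to `gnu` because no musl target is available for them. A minimal Rust sketch of the same decision table is shown here for illustration. The function name `target_triple` is hypothetical and does not exist in the Kata sources.

```rust
/// Resolve a Rust target triple the way the Makefile fragment does.
/// `arch` is expected in `uname -m` form; `libc` must be "musl" or "gnu".
fn target_triple(arch: &str, libc: &str) -> Result<String, String> {
    // Reject anything other than the two supported libc flavours.
    if libc != "musl" && libc != "gnu" {
        return Err(format!("unsupported LIBC value: {}", libc));
    }
    // ppc64le is renamed and, like s390x, has no musl target, so gnu is forced.
    let (arch, libc) = match arch {
        "ppc64le" => ("powerpc64le", "gnu"),
        "s390x" => ("s390x", "gnu"),
        other => (other, libc),
    };
    Ok(format!("{}-unknown-linux-{}", arch, libc))
}
```

The sketch leaves out the `EXTRA_RUSTFLAGS` handling for `aarch64`, which affects linker flags rather than the triple itself.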
@@ -121,10 +154,6 @@ clean:
@rm -f $(GENERATED_FILES)
@rm -f tarpaulin-report.html
vendor:
@cargo vendor
#TARGET test: run cargo tests
test:
@cargo test --all --target $(TRIPLE)
@@ -194,8 +223,7 @@ codecov-html: check_tarpaulin
help \
show-header \
show-summary \
optimize \
vendor
optimize
##TARGET generate-protocols: generate/update grpc agent protocols
generate-protocols:

@@ -66,7 +66,6 @@ service AgentService {
rpc SetGuestDateTime(SetGuestDateTimeRequest) returns (google.protobuf.Empty);
rpc CopyFile(CopyFileRequest) returns (google.protobuf.Empty);
rpc GetOOMEvent(GetOOMEventRequest) returns (OOMEvent);
rpc AddSwap(AddSwapRequest) returns (google.protobuf.Empty);
}
message CreateContainerRequest {
@@ -504,10 +503,6 @@ message OOMEvent {
string container_id = 1;
}
message AddSwapRequest {
repeated uint32 PCIPath = 1;
}
message GetMetricsRequest {}
message Metrics {

@@ -11,9 +11,9 @@ serde_derive = "1.0.91"
oci = { path = "../oci" }
protocols = { path ="../protocols" }
caps = "0.5.0"
nix = "0.21.0"
nix = "0.17.0"
scopeguard = "1.0.0"
capctl = "0.2.0"
prctl = "1.0.0"
lazy_static = "1.3.0"
libc = "0.2.58"
protobuf = "=2.14.0"
@@ -24,6 +24,7 @@ regex = "1.1"
path-absolutize = "1.2.0"
anyhow = "1.0.32"
cgroups = { package = "cgroups-rs", version = "0.2.5" }
tempfile = "3.1.0"
rlimit = "0.5.3"
tokio = { version = "1.2.0", features = ["sync", "io-util", "process", "time", "macros"] }
@@ -33,4 +34,3 @@ inotify = "0.9.2"
[dev-dependencies]
serial_test = "0.5.0"
tempfile = "3.1.0"

@@ -232,19 +232,19 @@ fn set_devices_resources(
let mut devices = vec![];
for d in device_resources.iter() {
if let Some(dev) = linux_device_group_to_cgroup_device(d) {
if let Some(dev) = linux_device_group_to_cgroup_device(&d) {
devices.push(dev);
}
}
for d in DEFAULT_DEVICES.iter() {
if let Some(dev) = linux_device_to_cgroup_device(d) {
if let Some(dev) = linux_device_to_cgroup_device(&d) {
devices.push(dev);
}
}
for d in DEFAULT_ALLOWED_DEVICES.iter() {
if let Some(dev) = linux_device_group_to_cgroup_device(d) {
if let Some(dev) = linux_device_group_to_cgroup_device(&d) {
devices.push(dev);
}
}
@@ -828,7 +828,7 @@ fn get_blkio_stats_v2(cg: &cgroups::Cgroup) -> SingularPtrField<BlkioStats> {
fn get_blkio_stats(cg: &cgroups::Cgroup) -> SingularPtrField<BlkioStats> {
if cg.v2() {
return get_blkio_stats_v2(cg);
return get_blkio_stats_v2(&cg);
}
let blkio_controller: &BlkIoController = get_controller_or_return_singular_none!(cg);
@@ -923,12 +923,12 @@ pub fn get_mounts() -> Result<HashMap<String, String>> {
let paths = get_paths()?;
for l in fs::read_to_string(MOUNTS)?.lines() {
let p: Vec<&str> = l.splitn(2, " - ").collect();
let p: Vec<&str> = l.split(" - ").collect();
let pre: Vec<&str> = p[0].split(' ').collect();
let post: Vec<&str> = p[1].split(' ').collect();
if post.len() != 3 {
warn!(sl!(), "can't parse {} line {:?}", MOUNTS, l);
warn!(sl!(), "mountinfo corrupted!");
continue;
}
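
The hunk above differs only in whether the mountinfo line is split with `splitn(2, " - ")` or `split(" - ")`. The sketch below, using only the standard library, illustrates the behavioral difference: `splitn(2, ...)` stops at the first separator, so any later ` - ` stays inside the second piece. The helper name `split_mountinfo` and the sample line are assumptions made for illustration.

```rust
/// Split a mountinfo-style line at the first " - " separator only.
/// Returns None when the separator is missing.
fn split_mountinfo(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.splitn(2, " - ");
    let pre = parts.next()?;
    let post = parts.next()?;
    Some((pre, post))
}
```

With an unbounded `split`, a line containing a second ` - ` yields three or more pieces, so indexing `p[1]` would silently drop the tail. Returning an `Option` also avoids the out-of-bounds panic that `p[1]` would cause on a line with no separator at all.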
@@ -1022,7 +1022,7 @@ impl Manager {
.unwrap()
.trim_start_matches(root_path.to_str().unwrap());
info!(sl!(), "updating cpuset for parent path {:?}", &r_path);
let cg = new_cgroup(cgroups::hierarchies::auto(), r_path);
let cg = new_cgroup(cgroups::hierarchies::auto(), &r_path);
let cpuset_controller: &CpuSetController = cg.controller_of().unwrap();
cpuset_controller.set_cpus(guest_cpuset)?;
}

@@ -139,6 +139,19 @@ async fn notify_on_oom(cid: &str, dir: String) -> Result<Receiver<String>> {
register_memory_event(cid, dir, "memory.oom_control", "").await
}
// level is one of "low", "medium", or "critical"
async fn notify_memory_pressure(cid: &str, dir: String, level: &str) -> Result<Receiver<String>> {
if dir.is_empty() {
return Err(anyhow!("memory controller missing"));
}
if level != "low" && level != "medium" && level != "critical" {
return Err(anyhow!("invalid pressure level {}", level));
}
register_memory_event(cid, dir, "memory.pressure_level", level).await
}
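
The level check in `notify_memory_pressure` accepts exactly three strings. A minimal standalone sketch of that validation, written as a `match` and free of the `anyhow` and async dependencies, might look like the following. The function name `validate_pressure_level` is hypothetical.

```rust
/// Accept only the three pressure levels named in the comment above:
/// "low", "medium", or "critical".
fn validate_pressure_level(level: &str) -> Result<&str, String> {
    match level {
        "low" | "medium" | "critical" => Ok(level),
        other => Err(format!("invalid pressure level {}", other)),
    }
}
```

Expressing the check as a `match` keeps the accepted values in one place, which may be easier to extend than a chain of `!=` comparisons.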
async fn register_memory_event(
cid: &str,
cg_dir: String,

@@ -0,0 +1,56 @@
// Copyright (c) 2019 Ant Financial
//
// SPDX-License-Identifier: Apache-2.0
//
use libc::*;
use serde;
#[macro_use]
use serde_derive;
use serde_json;
#[derive(Serialize, Deserialize, Debug)]
pub struct Device {
#[serde(default)]
r#type: char,
#[serde(default)]
path: String,
#[serde(default)]
major: i64,
#[serde(default)]
minor: i64,
#[serde(default)]
permissions: String,
#[serde(default)]
file_mode: mode_t,
#[serde(default)]
uid: i32,
#[serde(default)]
gid: i32,
#[serde(default)]
allow: bool,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct BlockIODevice {
#[serde(default)]
major: i64,
#[serde(default)]
minor: i64,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct WeightDevice {
block: BlockIODevice,
#[serde(default)]
weight: u16,
#[serde(default, rename = "leafWeight")]
leaf_weight: u16,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct ThrottleDevice {
block: BlockIODevice,
#[serde(default)]
rate: u64,
}

@@ -0,0 +1,372 @@
// Copyright (c) 2019 Ant Financial
//
// SPDX-License-Identifier: Apache-2.0
//
use serde;
#[macro_use]
use serde_derive;
use serde_json;
use protocols::oci::State as OCIState;
use std::collections::HashMap;
use std::fmt;
use std::path::PathBuf;
use std::time::Duration;
use nix::unistd;
use self::device::{Device, ThrottleDevice, WeightDevice};
use self::namespaces::Namespaces;
use crate::specconv::CreateOpts;
pub mod device;
pub mod namespaces;
pub mod validator;
#[derive(Serialize, Deserialize, Debug)]
pub struct Rlimit {
#[serde(default)]
r#type: i32,
#[serde(default)]
hard: i32,
#[serde(default)]
soft: i32,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct IDMap {
#[serde(default)]
container_id: i32,
#[serde(default)]
host_id: i32,
#[serde(default)]
size: i32,
}
type Action = i32;
#[derive(Serialize, Deserialize, Debug)]
pub struct Seccomp {
#[serde(default)]
default_action: Action,
#[serde(default)]
architectures: Vec<String>,
#[serde(default)]
flags: Vec<String>,
#[serde(default)]
syscalls: Vec<Syscall>,
}
type Operator = i32;
#[derive(Serialize, Deserialize, Debug)]
pub struct Arg {
#[serde(default)]
index: u32,
#[serde(default)]
value: u64,
#[serde(default)]
value_two: u64,
#[serde(default)]
op: Operator,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Syscall {
#[serde(default, skip_serializing_if = "String::is_empty")]
names: String,
#[serde(default)]
action: Action,
#[serde(default, rename = "errnoRet")]
errno_ret: u32,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
args: Vec<Arg>,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Config<'a> {
#[serde(default)]
no_pivot_root: bool,
#[serde(default)]
parent_death_signal: i32,
#[serde(default)]
rootfs: String,
#[serde(default)]
readonlyfs: bool,
#[serde(default, rename = "rootPropagation")]
root_propagation: i32,
#[serde(default)]
mounts: Vec<Mount>,
#[serde(default)]
devices: Vec<Device>,
#[serde(default)]
mount_label: String,
#[serde(default)]
hostname: String,
#[serde(default)]
namespaces: Namespaces,
#[serde(default)]
capabilities: Option<Capabilities>,
#[serde(default)]
networks: Vec<Network>,
#[serde(default)]
routes: Vec<Route>,
#[serde(default)]
cgroups: Option<Cgroup<'a>>,
#[serde(default, skip_serializing_if = "String::is_empty")]
apparmor_profile: String,
#[serde(default, skip_serializing_if = "String::is_empty")]
process_label: String,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
rlimits: Vec<Rlimit>,
#[serde(default)]
oom_score_adj: Option<i32>,
#[serde(default)]
uid_mappings: Vec<IDMap>,
#[serde(default)]
gid_mappings: Vec<IDMap>,
#[serde(default)]
mask_paths: Vec<String>,
#[serde(default)]
readonly_paths: Vec<String>,
#[serde(default)]
sysctl: HashMap<String, String>,
#[serde(default)]
seccomp: Option<Seccomp>,
#[serde(default)]
no_new_privileges: bool,
hooks: Option<Hooks>,
#[serde(default)]
version: String,
#[serde(default)]
labels: Vec<String>,
#[serde(default)]
no_new_keyring: bool,
#[serde(default)]
intel_rdt: Option<IntelRdt>,
#[serde(default)]
rootless_euid: bool,
#[serde(default)]
rootless_cgroups: bool,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Hooks {
prestart: Vec<Box<Hook>>,
poststart: Vec<Box<Hook>>,
poststop: Vec<Box<Hook>>,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Capabilities {
bounding: Vec<String>,
effective: Vec<String>,
inheritable: Vec<String>,
permitted: Vec<String>,
ambient: Vec<String>,
}
pub trait Hook {
fn run(&self, state: &OCIState) -> Result<()>;
}
pub struct FuncHook {
// run: fn(&OCIState) -> Result<()>,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Command {
#[serde(default)]
path: String,
#[serde(default)]
args: Vec<String>,
#[serde(default)]
env: Vec<String>,
#[serde(default)]
dir: String,
#[serde(default)]
timeout: Duration,
}
pub struct CommandHook {
command: Command,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Mount {
#[serde(default)]
source: String,
#[serde(default)]
destination: String,
#[serde(default)]
device: String,
#[serde(default)]
flags: i32,
#[serde(default)]
propagation_flags: Vec<i32>,
#[serde(default)]
data: String,
#[serde(default)]
relabel: String,
#[serde(default)]
extensions: i32,
#[serde(default)]
premount_cmds: Vec<Command>,
#[serde(default)]
postmount_cmds: Vec<Command>,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct HugepageLimit {
#[serde(default)]
page_size: String,
#[serde(default)]
limit: u64,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct IntelRdt {
#[serde(default, skip_serializing_if = "String::is_empty")]
l3_cache_schema: String,
#[serde(
default,
rename = "memBwSchema",
skip_serializing_if = "String::is_empty"
)]
mem_bw_schema: String,
}
pub type FreezerState = String;
#[derive(Serialize, Deserialize, Debug)]
pub struct Cgroup<'a> {
#[serde(default, skip_serializing_if = "String::is_empty")]
name: String,
#[serde(default, skip_serializing_if = "String::is_empty")]
parent: String,
#[serde(default)]
path: String,
#[serde(default)]
scope_prefix: String,
paths: HashMap<String, String>,
resource: &'a Resources<'a>,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Resources<'a> {
#[serde(default)]
allow_all_devices: bool,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
allowed_devices: Vec<&'a Device>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
denied_devices: Vec<&'a Device>,
#[serde(default)]
devices: Vec<&'a Device>,
#[serde(default)]
memory: i64,
#[serde(default)]
memory_reservation: i64,
#[serde(default)]
memory_swap: i64,
#[serde(default)]
kernel_memory: i64,
#[serde(default)]
kernel_memory_tcp: i64,
#[serde(default)]
cpu_shares: u64,
#[serde(default)]
cpu_quota: i64,
#[serde(default)]
cpu_period: u64,
#[serde(default)]
cpu_rt_quota: i64,
#[serde(default)]
cpu_rt_period: u64,
#[serde(default)]
cpuset_cpus: String,
#[serde(default)]
cpuset_mems: String,
#[serde(default)]
pids_limit: i64,
#[serde(default)]
blkio_weight: u64,
#[serde(default)]
blkio_leaf_weight: u64,
#[serde(default)]
blkio_weight_device: Vec<&'a WeightDevice>,
#[serde(default)]
blkio_throttle_read_bps_device: Vec<&'a ThrottleDevice>,
#[serde(default)]
blkio_throttle_write_bps_device: Vec<&'a ThrottleDevice>,
#[serde(default)]
blkio_throttle_read_iops_device: Vec<&'a ThrottleDevice>,
#[serde(default)]
blkio_throttle_write_iops_device: Vec<&'a ThrottleDevice>,
#[serde(default)]
freezer: FreezerState,
#[serde(default)]
hugetlb_limit: Vec<&'a HugepageLimit>,
#[serde(default)]
oom_kill_disable: bool,
#[serde(default)]
memory_swapiness: u64,
#[serde(default)]
net_prio_ifpriomap: Vec<&'a IfPrioMap>,
#[serde(default)]
net_cls_classid_u: u32,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Network {
#[serde(default)]
r#type: String,
#[serde(default)]
name: String,
#[serde(default)]
bridge: String,
#[serde(default)]
mac_address: String,
#[serde(default)]
address: String,
#[serde(default)]
gateway: String,
#[serde(default)]
ipv6_address: String,
#[serde(default)]
ipv6_gateway: String,
#[serde(default)]
mtu: i32,
#[serde(default)]
txqueuelen: i32,
#[serde(default)]
host_interface_name: String,
#[serde(default)]
hairpin_mode: bool,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Route {
#[serde(default)]
destination: String,
#[serde(default)]
source: String,
#[serde(default)]
gateway: String,
#[serde(default)]
interface_name: String,
}
#[derive(Serialize, Deserialize, Debug)]
pub struct IfPrioMap {
#[serde(default)]
interface: String,
#[serde(default)]
priority: i32,
}
impl IfPrioMap {
fn cgroup_string(&self) -> String {
format!("{} {}", self.interface, self.priority)
}
}
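
`IfPrioMap::cgroup_string` renders one entry as `<interface> <priority>`, which is the line format the `net_prio.ifpriomap` cgroup file expects. The sketch below redefines a stripped-down `IfPrioMap` (without the serde derives) so that it compiles on its own, and adds a hypothetical helper, `ifpriomap_payload`, that joins several entries. The helper is an assumption and is not part of the file shown above.

```rust
/// Stripped-down copy of the struct above, without serde attributes.
struct IfPrioMap {
    interface: String,
    priority: i32,
}

impl IfPrioMap {
    /// One "<interface> <priority>" line.
    fn cgroup_string(&self) -> String {
        format!("{} {}", self.interface, self.priority)
    }
}

/// Hypothetical helper: one line per entry, newline separated.
fn ifpriomap_payload(maps: &[IfPrioMap]) -> String {
    maps.iter()
        .map(|m| m.cgroup_string())
        .collect::<Vec<_>>()
        .join("\n")
}
```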

@@ -0,0 +1,46 @@
// Copyright (c) 2019 Ant Financial
//
// SPDX-License-Identifier: Apache-2.0
//
use serde;
#[macro_use]
use serde_derive;
use serde_json;
use std::collections::HashMap;
#[macro_use]
use lazy_static;
pub type NamespaceType = String;
pub type Namespaces = Vec<Namespace>;
#[derive(Serialize, Deserialize, Debug)]
pub struct Namespace {
#[serde(default)]
r#type: NamespaceType,
#[serde(default)]
path: String,
}
pub const NEWNET: &'static str = "NEWNET";
pub const NEWPID: &'static str = "NEWPID";
pub const NEWNS: &'static str = "NEWNS";
pub const NEWUTS: &'static str = "NEWUTS";
pub const NEWUSER: &'static str = "NEWUSER";
pub const NEWCGROUP: &'static str = "NEWCGROUP";
pub const NEWIPC: &'static str = "NEWIPC";
lazy_static! {
static ref TYPETONAME: HashMap<&'static str, &'static str> = {
let mut m = HashMap::new();
m.insert("pid", "pid");
m.insert("network", "net");
m.insert("mount", "mnt");
m.insert("user", "user");
m.insert("uts", "uts");
m.insert("ipc", "ipc");
m.insert("cgroup", "cgroup");
m
};
}
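
`TYPETONAME` maps an OCI namespace type to the short name the kernel uses. The file above only defines the map; one plausible use, sketched here as an assumption, is building a `/proc/<pid>/ns/<name>` path. The sketch uses a plain `match` instead of `lazy_static` so that it needs only the standard library, and the names `ns_proc_name` and `ns_path` are hypothetical.

```rust
/// Same mapping as TYPETONAME, expressed as a match.
fn ns_proc_name(ns_type: &str) -> Option<&'static str> {
    match ns_type {
        "pid" => Some("pid"),
        "network" => Some("net"),
        "mount" => Some("mnt"),
        "user" => Some("user"),
        "uts" => Some("uts"),
        "ipc" => Some("ipc"),
        "cgroup" => Some("cgroup"),
        _ => None,
    }
}

/// Hypothetical helper: path of a process's namespace file under /proc.
fn ns_path(pid: u32, ns_type: &str) -> Option<String> {
    ns_proc_name(ns_type).map(|name| format!("/proc/{}/ns/{}", pid, name))
}
```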

@@ -0,0 +1,23 @@
// Copyright (c) 2019 Ant Financial
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::configs::Config;
use std::io::Result;
pub trait Validator {
fn validate(&self, config: &Config) -> Result<()> {
Ok(())
}
}
pub struct ConfigValidator {}
impl Validator for ConfigValidator {}
impl ConfigValidator {
fn new() -> Self {
ConfigValidator {}
}
}

@@ -8,7 +8,7 @@ use libc::pid_t;
use oci::{ContainerState, LinuxDevice, LinuxIdMapping};
use oci::{Hook, Linux, LinuxNamespace, LinuxResources, Spec};
use std::clone::Clone;
use std::ffi::CString;
use std::ffi::{CStr, CString};
use std::fmt::Display;
use std::fs;
use std::os::unix::io::RawFd;
@@ -62,7 +62,10 @@ use tokio::sync::Mutex;
use crate::utils;
const STATE_FILENAME: &str = "state.json";
const EXEC_FIFO_FILENAME: &str = "exec.fifo";
const VER_MARKER: &str = "1.2.5";
const PID_NS_PATH: &str = "/proc/self/ns/pid";
const INIT: &str = "INIT";
const NO_PIVOT: &str = "NO_PIVOT";
@@ -91,6 +94,10 @@ impl ContainerStatus {
self.cur_status
}
fn pre_status(&self) -> ContainerState {
self.pre_status
}
fn transition(&mut self, to: ContainerState) {
self.pre_status = self.status();
self.cur_status = to;
@@ -339,7 +346,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
Err(_e) => sched::unshare(CloneFlags::CLONE_NEWPID)?,
}
match unsafe { fork() } {
match fork() {
Ok(ForkResult::Parent { child, .. }) => {
log_child!(
cfd_log,
@@ -390,7 +397,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let linux = spec.linux.as_ref().unwrap();
// get namespace vector to join/new
let nses = get_namespaces(linux);
let nses = get_namespaces(&linux);
let mut userns = false;
let mut to_new = CloneFlags::empty();
@@ -462,7 +469,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
// Ref: https://github.com/opencontainers/runc/commit/50a19c6ff828c58e5dab13830bd3dacde268afe5
//
if !nses.is_empty() {
capctl::prctl::set_dumpable(false)
prctl::set_dumpable(false)
.map_err(|e| anyhow!(e).context("set process non-dumpable failed"))?;
}
@@ -538,7 +545,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
// notify parent to run prestart hooks
write_sync(cwfd, SYNC_SUCCESS, "")?;
// wait parent run prestart hooks
read_sync(crfd)?;
let _ = read_sync(crfd)?;
}
if mount_fd != -1 {
@@ -561,7 +568,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
}
if to_new.contains(CloneFlags::CLONE_NEWNS) {
mount::finish_rootfs(cfd_log, &spec, &oci_process)?;
mount::finish_rootfs(cfd_log, &spec)?;
}
if !oci_process.cwd.is_empty() {
@@ -595,7 +602,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
// NoNewPrivileges, drop capabilities
if oci_process.no_new_privileges {
capctl::prctl::set_no_new_privs().map_err(|_| anyhow!("cannot set no new privileges"))?;
prctl::set_no_new_privileges(true).map_err(|_| anyhow!("cannot set no new privileges"))?;
}
if oci_process.capabilities.is_some() {
@@ -605,6 +612,8 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
if init {
// notify parent to run poststart hooks
// cfd is closed when return from join_namespaces
// should return cfile instead of cfd?
write_sync(cwfd, SYNC_SUCCESS, "")?;
}
@@ -939,7 +948,7 @@ impl BaseContainer for LinuxContainer {
join_namespaces(
&logger,
spec,
&spec,
&p,
self.cgroup_manager.as_ref().unwrap(),
&st,
@@ -1031,7 +1040,7 @@ impl BaseContainer for LinuxContainer {
let fifo = format!("{}/{}", &self.root, EXEC_FIFO_FILENAME);
let fd = fcntl::open(fifo.as_str(), OFlag::O_WRONLY, Mode::from_bits_truncate(0))?;
let data: &[u8] = &[0];
unistd::write(fd, data)?;
unistd::write(fd, &data)?;
info!(self.logger, "container started");
self.init_process_start_time = SystemTime::now()
.duration_since(SystemTime::UNIX_EPOCH)
@@ -1072,8 +1081,9 @@ fn do_exec(args: &[String]) -> ! {
.iter()
.map(|s| CString::new(s.to_string()).unwrap_or_default())
.collect();
let a: Vec<&CStr> = sa.iter().map(|s| s.as_c_str()).collect();
let _ = unistd::execvp(p.as_c_str(), &sa).map_err(|e| match e {
let _ = unistd::execvp(p.as_c_str(), a.as_slice()).map_err(|e| match e {
nix::Error::Sys(errno) => {
std::process::exit(errno as i32);
}
@@ -1246,7 +1256,7 @@ async fn join_namespaces(
if p.init {
info!(logger, "notify child parent ready to run prestart hook!");
read_async(pipe_r).await?;
let _ = read_async(pipe_r).await?;
info!(logger, "get ready to run prestart hook!");
@@ -1306,7 +1316,7 @@ fn write_mappings(logger: &Logger, path: &str, maps: &[LinuxIdMapping]) -> Resul
fn setid(uid: Uid, gid: Gid) -> Result<()> {
// set uid/gid
capctl::prctl::set_keepcaps(true)
prctl::set_keep_capabilities(true)
.map_err(|e| anyhow!(e).context("set keep capabilities returned"))?;
{
@@ -1320,7 +1330,7 @@ fn setid(uid: Uid, gid: Gid) -> Result<()> {
capabilities::reset_effective()?;
}
capctl::prctl::set_keepcaps(false)
prctl::set_keep_capabilities(false)
.map_err(|e| anyhow!(e).context("set keep capabilities returned"))?;
Ok(())
@@ -1394,8 +1404,18 @@ impl LinuxContainer {
logger: logger.new(o!("module" => "rustjail", "subsystem" => "container", "cid" => id)),
})
}
fn load<T: Into<String>>(_id: T, _base: T) -> Result<Self> {
Err(anyhow!("not supported"))
}
}
// Handle the differing rlimit types for different targets
#[cfg(target_env = "musl")]
type RlimitsType = libc::c_int;
#[cfg(target_env = "gnu")]
type RlimitsType = libc::__rlimit_resource_t;
fn setgroups(grps: &[libc::gid_t]) -> Result<()> {
let ret = unsafe { libc::setgroups(grps.len(), grps.as_ptr() as *const libc::gid_t) };
Errno::result(ret).map(drop)?;
@@ -1537,7 +1557,6 @@ mod tests {
use std::os::unix::fs::MetadataExt;
use std::os::unix::io::AsRawFd;
use tempfile::tempdir;
use tokio::process::Command;
macro_rules! sl {
() => {
@@ -1545,27 +1564,12 @@ mod tests {
};
}
async fn which(cmd: &str) -> String {
let output: std::process::Output = Command::new("which")
.arg(cmd)
.output()
.await
.expect("which command failed to run");
match String::from_utf8(output.stdout) {
Ok(v) => v.trim_end_matches('\n').to_string(),
Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
}
}
#[tokio::test]
async fn test_execute_hook() {
let xargs = which("xargs").await;
execute_hook(
&slog_scope::logger(),
&Hook {
path: xargs,
path: "/usr/bin/xargs".to_string(),
args: vec![],
env: vec![],
timeout: None,
@@ -1585,12 +1589,10 @@ mod tests {
#[tokio::test]
async fn test_execute_hook_with_timeout() {
let sleep = which("sleep").await;
let res = execute_hook(
&slog_scope::logger(),
&Hook {
path: sleep,
path: "/usr/bin/sleep".to_string(),
args: vec!["2".to_string()],
env: vec![],
timeout: Some(1),
@@ -1627,7 +1629,7 @@ mod tests {
let pre_status = status.status();
status.transition(*s);
assert_eq!(pre_status, status.pre_status);
assert_eq!(pre_status, status.pre_status());
}
}
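
The `transition` and `pre_status` changes above amount to a two-slot state record: each transition saves the current state as the previous one before overwriting it. A minimal standalone sketch of that pattern follows, assuming a simplified `ContainerState` enum; the variant names and the `new` constructor here are illustrative and not taken from the diff.

```rust
// Hypothetical, simplified stand-in for the real container state type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ContainerState {
    Created,
    Running,
    Stopped,
}

// Tracks the current state and the state held before the last transition.
struct ContainerStatus {
    pre_status: ContainerState,
    cur_status: ContainerState,
}

impl ContainerStatus {
    // Assumed constructor: both slots start out as `Created`.
    fn new() -> Self {
        ContainerStatus {
            pre_status: ContainerState::Created,
            cur_status: ContainerState::Created,
        }
    }

    fn status(&self) -> ContainerState {
        self.cur_status
    }

    fn pre_status(&self) -> ContainerState {
        self.pre_status
    }

    // Remember the current state, then move to the new one.
    fn transition(&mut self, to: ContainerState) {
        self.pre_status = self.status();
        self.cur_status = to;
    }
}
```

Reading the previous state through an accessor, as the updated test does, keeps the field private to the type.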

View File

@@ -3,7 +3,15 @@
// SPDX-License-Identifier: Apache-2.0
//
// #![allow(unused_attributes)]
// #![allow(unused_imports)]
// #![allow(unused_variables)]
// #![allow(unused_mut)]
#![allow(dead_code)]
// #![allow(deprecated)]
// #![allow(unused_must_use)]
#![allow(non_upper_case_globals)]
// #![allow(unused_comparisons)]
#[macro_use]
#[cfg(test)]
extern crate serial_test;
@@ -15,7 +23,7 @@ extern crate caps;
extern crate protocols;
#[macro_use]
extern crate scopeguard;
extern crate capctl;
extern crate prctl;
#[macro_use]
extern crate lazy_static;
extern crate libc;
@@ -39,6 +47,16 @@ pub mod sync;
pub mod sync_with_async;
pub mod utils;
pub mod validator;
// pub mod factory;
//pub mod configs;
// pub mod devices;
// pub mod init;
// pub mod rootfs;
// pub mod capabilities;
// pub mod console;
// pub mod stats;
// pub mod user;
//pub mod intelrdt;
use std::collections::HashMap;
@@ -456,6 +474,10 @@ fn linux_grpc_to_oci(l: &grpc::Linux) -> oci::Linux {
}
}
fn linux_oci_to_grpc(_l: &oci::Linux) -> grpc::Linux {
grpc::Linux::default()
}
pub fn grpc_to_oci(grpc: &grpc::Spec) -> oci::Spec {
// process
let process = if grpc.Process.is_some() {
@@ -511,6 +533,7 @@ pub fn grpc_to_oci(grpc: &grpc::Spec) -> oci::Spec {
#[cfg(test)]
mod tests {
#[allow(unused_macros)]
#[macro_export]
macro_rules! skip_if_not_root {
() => {

View File

@@ -13,7 +13,7 @@ use nix::mount::{MntFlags, MsFlags};
use nix::sys::stat::{self, Mode, SFlag};
use nix::unistd::{self, Gid, Uid};
use nix::NixPath;
use oci::{LinuxDevice, Mount, Process, Spec};
use oci::{LinuxDevice, Mount, Spec};
use std::collections::{HashMap, HashSet};
use std::fs::{self, OpenOptions};
use std::mem::MaybeUninit;
@@ -62,56 +62,49 @@ const PROC_SUPER_MAGIC: libc::c_uint = 0x00009fa0;
lazy_static! {
static ref PROPAGATION: HashMap<&'static str, MsFlags> = {
let mut m = HashMap::new();
m.insert("shared", MsFlags::MS_SHARED);
m.insert("rshared", MsFlags::MS_SHARED | MsFlags::MS_REC);
m.insert("private", MsFlags::MS_PRIVATE);
m.insert("rprivate", MsFlags::MS_PRIVATE | MsFlags::MS_REC);
m.insert("rshared", MsFlags::MS_SHARED | MsFlags::MS_REC);
m.insert("rslave", MsFlags::MS_SLAVE | MsFlags::MS_REC);
m.insert("runbindable", MsFlags::MS_UNBINDABLE | MsFlags::MS_REC);
m.insert("shared", MsFlags::MS_SHARED);
m.insert("slave", MsFlags::MS_SLAVE);
m.insert("rslave", MsFlags::MS_SLAVE | MsFlags::MS_REC);
m.insert("unbindable", MsFlags::MS_UNBINDABLE);
m.insert("runbindable", MsFlags::MS_UNBINDABLE | MsFlags::MS_REC);
m
};
static ref OPTIONS: HashMap<&'static str, (bool, MsFlags)> = {
let mut m = HashMap::new();
m.insert("acl", (false, MsFlags::MS_POSIXACL));
m.insert("async", (true, MsFlags::MS_SYNCHRONOUS));
m.insert("atime", (true, MsFlags::MS_NOATIME));
m.insert("bind", (false, MsFlags::MS_BIND));
m.insert("defaults", (false, MsFlags::empty()));
m.insert("dev", (true, MsFlags::MS_NODEV));
m.insert("diratime", (true, MsFlags::MS_NODIRATIME));
m.insert("dirsync", (false, MsFlags::MS_DIRSYNC));
m.insert("exec", (true, MsFlags::MS_NOEXEC));
m.insert("iversion", (false, MsFlags::MS_I_VERSION));
m.insert("lazytime", (false, MsFlags::MS_LAZYTIME));
m.insert("loud", (true, MsFlags::MS_SILENT));
m.insert("mand", (false, MsFlags::MS_MANDLOCK));
m.insert("noacl", (true, MsFlags::MS_POSIXACL));
m.insert("noatime", (false, MsFlags::MS_NOATIME));
m.insert("nodev", (false, MsFlags::MS_NODEV));
m.insert("nodiratime", (false, MsFlags::MS_NODIRATIME));
m.insert("noexec", (false, MsFlags::MS_NOEXEC));
m.insert("noiversion", (true, MsFlags::MS_I_VERSION));
m.insert("nolazytime", (true, MsFlags::MS_LAZYTIME));
m.insert("nomand", (true, MsFlags::MS_MANDLOCK));
m.insert("norelatime", (true, MsFlags::MS_RELATIME));
m.insert("nostrictatime", (true, MsFlags::MS_STRICTATIME));
m.insert("nosuid", (false, MsFlags::MS_NOSUID));
m.insert("rbind", (false, MsFlags::MS_BIND | MsFlags::MS_REC));
m.insert("relatime", (false, MsFlags::MS_RELATIME));
m.insert("remount", (false, MsFlags::MS_REMOUNT));
m.insert("ro", (false, MsFlags::MS_RDONLY));
m.insert("rw", (true, MsFlags::MS_RDONLY));
m.insert("silent", (false, MsFlags::MS_SILENT));
m.insert("strictatime", (false, MsFlags::MS_STRICTATIME));
m.insert("suid", (true, MsFlags::MS_NOSUID));
m.insert("nosuid", (false, MsFlags::MS_NOSUID));
m.insert("dev", (true, MsFlags::MS_NODEV));
m.insert("nodev", (false, MsFlags::MS_NODEV));
m.insert("exec", (true, MsFlags::MS_NOEXEC));
m.insert("noexec", (false, MsFlags::MS_NOEXEC));
m.insert("sync", (false, MsFlags::MS_SYNCHRONOUS));
m.insert("async", (true, MsFlags::MS_SYNCHRONOUS));
m.insert("dirsync", (false, MsFlags::MS_DIRSYNC));
m.insert("remount", (false, MsFlags::MS_REMOUNT));
m.insert("mand", (false, MsFlags::MS_MANDLOCK));
m.insert("nomand", (true, MsFlags::MS_MANDLOCK));
m.insert("atime", (true, MsFlags::MS_NOATIME));
m.insert("noatime", (false, MsFlags::MS_NOATIME));
m.insert("diratime", (true, MsFlags::MS_NODIRATIME));
m.insert("nodiratime", (false, MsFlags::MS_NODIRATIME));
m.insert("bind", (false, MsFlags::MS_BIND));
m.insert("rbind", (false, MsFlags::MS_BIND | MsFlags::MS_REC));
m.insert("relatime", (false, MsFlags::MS_RELATIME));
m.insert("norelatime", (true, MsFlags::MS_RELATIME));
m.insert("strictatime", (false, MsFlags::MS_STRICTATIME));
m.insert("nostrictatime", (true, MsFlags::MS_STRICTATIME));
m
};
}
#[inline(always)]
#[allow(unused_variables)]
pub fn mount<
P1: ?Sized + NixPath,
P2: ?Sized + NixPath,
@@ -131,6 +124,7 @@ pub fn mount<
}
#[inline(always)]
#[allow(unused_variables)]
pub fn umount2<P: ?Sized + NixPath>(
target: &P,
flags: MntFlags,
@@ -189,7 +183,7 @@ pub fn init_rootfs(
let mut bind_mount_dev = false;
for m in &spec.mounts {
let (mut flags, pgflags, data) = parse_mount(m);
let (mut flags, pgflags, data) = parse_mount(&m);
if !m.destination.starts_with('/') || m.destination.contains("..") {
return Err(anyhow!(
"the mount destination {} is invalid",
@@ -198,7 +192,7 @@ pub fn init_rootfs(
}
if m.r#type == "cgroup" {
mount_cgroups(cfd_log, m, rootfs, flags, &data, cpath, mounts)?;
mount_cgroups(cfd_log, &m, rootfs, flags, &data, cpath, mounts)?;
} else {
if m.destination == "/dev" {
if m.r#type == "bind" {
@@ -226,7 +220,7 @@ pub fn init_rootfs(
}
}
mount_from(cfd_log, m, rootfs, flags, &data, "")?;
mount_from(cfd_log, &m, &rootfs, flags, &data, "")?;
// bind mount won't change mount options, we need remount to make mount options
// effective.
// first check that we have non-default options required before attempting a
@@ -356,7 +350,7 @@ fn mount_cgroups(
mounts: &HashMap<String, String>,
) -> Result<()> {
if cgroups::hierarchies::is_cgroup2_unified_mode() {
return mount_cgroups_v2(cfd_log, m, rootfs, flags);
return mount_cgroups_v2(cfd_log, &m, rootfs, flags);
}
// mount tmpfs
let ctm = Mount {
@@ -450,6 +444,7 @@ fn mount_cgroups(
Ok(())
}
#[allow(unused_variables)]
fn pivot_root<P1: ?Sized + NixPath, P2: ?Sized + NixPath>(
new_root: &P1,
put_old: &P2,
@@ -582,6 +577,7 @@ fn parse_mount_table() -> Result<Vec<Info>> {
}
#[inline(always)]
#[allow(unused_variables)]
fn chroot<P: ?Sized + NixPath>(path: &P) -> Result<(), nix::Error> {
#[cfg(not(test))]
return unistd::chroot(path);
@@ -902,21 +898,10 @@ fn bind_dev(dev: &LinuxDevice) -> Result<()> {
Ok(())
}
pub fn finish_rootfs(cfd_log: RawFd, spec: &Spec, process: &Process) -> Result<()> {
pub fn finish_rootfs(cfd_log: RawFd, spec: &Spec) -> Result<()> {
let olddir = unistd::getcwd()?;
log_child!(cfd_log, "old cwd: {}", olddir.to_str().unwrap());
unistd::chdir("/")?;
if !process.cwd.is_empty() {
// Although the process.cwd string can be unclean/malicious (../../dev, etc),
// we are running on our own mount namespace and we just chrooted into the
// container's root. It's safe to create CWD from there.
log_child!(cfd_log, "Creating CWD {}", process.cwd.as_str());
// Unconditionally try to create CWD, create_dir_all will not fail if
// it already exists.
fs::create_dir_all(process.cwd.as_str())?;
}
if spec.linux.is_some() {
let linux = spec.linux.as_ref().unwrap();
@@ -1222,7 +1207,7 @@ mod tests {
options: vec!["ro".to_string(), "shared".to_string()],
}];
let ret = finish_rootfs(stdout_fd, &spec, &oci::Process::default());
let ret = finish_rootfs(stdout_fd, &spec);
assert!(ret.is_ok(), "Should pass. Got: {:?}", ret);
}
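
The `OPTIONS` table above pairs each mount option name with a `(clear, flag)` tuple: when `clear` is `false` the flag is set, and when it is `true` the flag is removed. The sketch below shows one way such a table could be applied to an option list. It assumes plain `u64` bit masks in place of `nix::mount::MsFlags` so that it stays self-contained; `option_table` and `parse_options` are hypothetical names, not functions from the diff.

```rust
use std::collections::HashMap;

// Linux mount flag values, redefined locally so the sketch needs no crates.
const MS_RDONLY: u64 = 1;
const MS_NOSUID: u64 = 2;
const MS_NODEV: u64 = 4;
const MS_NOEXEC: u64 = 8;

// A small subset of the option table: name -> (clear, flag).
fn option_table() -> HashMap<&'static str, (bool, u64)> {
    let mut m = HashMap::new();
    m.insert("ro", (false, MS_RDONLY));
    m.insert("rw", (true, MS_RDONLY));
    m.insert("suid", (true, MS_NOSUID));
    m.insert("nosuid", (false, MS_NOSUID));
    m.insert("dev", (true, MS_NODEV));
    m.insert("nodev", (false, MS_NODEV));
    m.insert("exec", (true, MS_NOEXEC));
    m.insert("noexec", (false, MS_NOEXEC));
    m
}

// Fold an option list into a flag mask; unknown options are collected
// separately (assumed here to be filesystem-specific data strings).
fn parse_options(options: &[&str]) -> (u64, Vec<String>) {
    let table = option_table();
    let mut flags = 0u64;
    let mut data = Vec::new();
    for opt in options {
        match table.get(*opt) {
            Some(&(clear, flag)) => {
                if clear {
                    flags &= !flag;
                } else {
                    flags |= flag;
                }
            }
            None => data.push(opt.to_string()),
        }
    }
    (flags, data)
}
```

Options are applied in order, so a later `suid` cancels an earlier `nosuid`.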

View File

@@ -28,6 +28,16 @@ fn contain_namespace(nses: &[LinuxNamespace], key: &str) -> bool {
false
}
fn get_namespace_path(nses: &[LinuxNamespace], key: &str) -> Result<String> {
for ns in nses {
if ns.r#type.as_str() == key {
return Ok(ns.path.clone());
}
}
Err(einval())
}
fn rootfs(root: &str) -> Result<()> {
let path = PathBuf::from(root);
// not absolute path or not exists
@@ -156,6 +166,31 @@ lazy_static! {
};
}
fn check_host_ns(path: &str) -> Result<()> {
let cpath = PathBuf::from(path);
let hpath = PathBuf::from("/proc/self/ns/net");
let real_hpath = hpath
.read_link()
.context(format!("read link {:?}", hpath))?;
let meta = cpath
.symlink_metadata()
.context(format!("symlink metadata {:?}", cpath))?;
let file_type = meta.file_type();
if !file_type.is_symlink() {
return Ok(());
}
let real_cpath = cpath
.read_link()
.context(format!("read link {:?}", cpath))?;
if real_cpath == real_hpath {
return Err(einval());
}
Ok(())
}
fn sysctl(oci: &Spec) -> Result<()> {
let linux = get_linux(oci)?;
@@ -266,7 +301,7 @@ pub fn validate(conf: &Config) -> Result<()> {
security(oci).context("security")?;
usernamespace(oci).context("usernamespace")?;
cgroupnamespace(oci).context("cgroupnamespace")?;
sysctl(oci).context("sysctl")?;
sysctl(&oci).context("sysctl")?;
if conf.rootless_euid {
rootless_euid(oci).context("rootless euid")?;
@@ -299,6 +334,19 @@ mod tests {
assert_eq!(contain_namespace(&namespaces, ""), false);
assert_eq!(contain_namespace(&namespaces, "Net"), false);
assert_eq!(contain_namespace(&namespaces, "ipc"), false);
assert_eq!(
get_namespace_path(&namespaces, "net").unwrap(),
"/sys/cgroups/net"
);
assert_eq!(
get_namespace_path(&namespaces, "uts").unwrap(),
"/sys/cgroups/uts"
);
get_namespace_path(&namespaces, "").unwrap_err();
get_namespace_path(&namespaces, "Uts").unwrap_err();
get_namespace_path(&namespaces, "ipc").unwrap_err();
}
#[test]
@@ -480,6 +528,12 @@ mod tests {
rootless_euid(&spec).unwrap();
}
#[test]
fn test_check_host_ns() {
check_host_ns("/proc/self/ns/net").unwrap_err();
check_host_ns("/proc/sys/net/ipv4/tcp_sack").unwrap();
}
#[test]
fn test_sysctl() {
let mut spec = Spec::default();

View File

@@ -1,140 +0,0 @@
// Copyright (c) IBM Corp. 2021
//
// SPDX-License-Identifier: Apache-2.0
//
use std::fmt;
use std::str::FromStr;
use anyhow::anyhow;
// CCW bus IDs follow the format <xx>.<d>.<xxxx> [1, p. 11], where
// - <xx> is the channel subsystem ID, which is always 0 from the guest side, but different from
// the host side, e.g. 0xfe for virtio-*-ccw [1, p. 435],
// - <d> is the subchannel set ID, which ranges from 0-3 [2], and
// - <xxxx> is the device number (0000-ffff; leading zeroes can be omitted,
// e.g. 3 instead of 0003).
// [1] https://www.ibm.com/docs/en/linuxonibm/pdf/lku4dd04.pdf
// [2] https://qemu.readthedocs.io/en/latest/system/s390x/css.html
// Maximum subchannel set ID
const SUBCHANNEL_SET_MAX: u8 = 3;
// CCW device. From the guest side, the first field is always 0 and can therefore be omitted.
#[derive(Copy, Clone, Debug)]
pub struct Device {
subchannel_set_id: u8,
device_number: u16,
}
impl Device {
pub fn new(subchannel_set_id: u8, device_number: u16) -> anyhow::Result<Self> {
if subchannel_set_id > SUBCHANNEL_SET_MAX {
return Err(anyhow!(
"Subchannel set ID {:?} should be in range [0..{}]",
subchannel_set_id,
SUBCHANNEL_SET_MAX
));
}
Ok(Device {
subchannel_set_id,
device_number,
})
}
}
impl FromStr for Device {
type Err = anyhow::Error;
fn from_str(s: &str) -> anyhow::Result<Self> {
let split: Vec<&str> = s.split('.').collect();
if split.len() != 3 {
return Err(anyhow!(
"Wrong bus format. It needs to be in the form 0.<d>.<xxxx>, got {:?}",
s
));
}
if split[0] != "0" {
return Err(anyhow!(
"Wrong bus format. First digit needs to be 0, but is {:?}",
split[0]
));
}
let subchannel_set_id = match split[1].parse::<u8>() {
Ok(id) => id,
Err(_) => {
return Err(anyhow!(
"Wrong bus format. Second digit needs to be 0-3, but is {:?}",
split[1]
))
}
};
let device_number = match u16::from_str_radix(split[2], 16) {
Ok(id) => id,
Err(_) => {
return Err(anyhow!(
"Wrong bus format. Third digit needs to be 0-ffff, but is {:?}",
split[2]
))
}
};
Device::new(subchannel_set_id, device_number)
}
}
impl fmt::Display for Device {
fn fmt(&self, f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
write!(f, "0.{}.{:04x}", self.subchannel_set_id, self.device_number)
}
}
#[cfg(test)]
mod tests {
use crate::ccw::Device;
use std::str::FromStr;
#[test]
fn test_new_device() {
// Valid devices
let device = Device::new(0, 0).unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
let device = Device::new(3, 0xffff).unwrap();
assert_eq!(format!("{}", device), "0.3.ffff");
// Invalid device
let device = Device::new(4, 0);
assert!(device.is_err());
}
#[test]
fn test_device_from_str() {
// Valid devices
let device = Device::from_str("0.0.0").unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
let device = Device::from_str("0.0.0000").unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
let device = Device::from_str("0.3.ffff").unwrap();
assert_eq!(format!("{}", device), "0.3.ffff");
// Invalid devices
let device = Device::from_str("0.0");
assert!(device.is_err());
let device = Device::from_str("1.0.0");
assert!(device.is_err());
let device = Device::from_str("0.not_a_subchannel_set_id.0");
assert!(device.is_err());
let device = Device::from_str("0.0.not_a_device_number");
assert!(device.is_err());
}
}

File diff suppressed because it is too large

View File

@@ -145,10 +145,9 @@ fn run_in_child(slave_fd: libc::c_int, shell: String) -> Result<()> {
}
let cmd = CString::new(shell).unwrap();
let args: Vec<CString> = Vec::new();
// run shell
let _ = unistd::execvp(cmd.as_c_str(), &args).map_err(|e| match e {
let _ = unistd::execvp(cmd.as_c_str(), &[]).map_err(|e| match e {
nix::Error::Sys(errno) => {
std::process::exit(errno as i32);
}
@@ -206,7 +205,7 @@ async fn run_debug_console_vsock<T: AsyncRead + AsyncWrite>(
let slave_fd = pseudo.slave;
match unsafe { fork() } {
match fork() {
Ok(ForkResult::Child) => run_in_child(slave_fd, shell),
Ok(ForkResult::Parent { child: child_pid }) => {
run_in_parent(logger.clone(), stream, pseudo, child_pid).await

View File

@@ -14,20 +14,14 @@ use std::str::FromStr;
use std::sync::Arc;
use tokio::sync::Mutex;
#[cfg(target_arch = "s390x")]
use crate::ccw;
use crate::linux_abi::*;
use crate::mount::{
DRIVER_BLK_CCW_TYPE, DRIVER_BLK_TYPE, DRIVER_MMIO_BLK_TYPE, DRIVER_NVDIMM_TYPE,
DRIVER_SCSI_TYPE,
};
use crate::mount::{DRIVER_BLK_TYPE, DRIVER_MMIO_BLK_TYPE, DRIVER_NVDIMM_TYPE, DRIVER_SCSI_TYPE};
use crate::pci;
use crate::sandbox::Sandbox;
use crate::uevent::{wait_for_uevent, Uevent, UeventMatcher};
use anyhow::{anyhow, Result};
use oci::{LinuxDeviceCgroup, LinuxResources, Spec};
use protocols::agent::Device;
use tracing::instrument;
// Convenience macro to obtain the scope logger
macro_rules! sl {
@@ -38,21 +32,17 @@ macro_rules! sl {
const VM_ROOTFS: &str = "/";
#[derive(Debug)]
struct DevIndexEntry {
idx: usize,
residx: Vec<usize>,
}
#[derive(Debug)]
struct DevIndex(HashMap<String, DevIndexEntry>);
#[instrument]
pub fn rescan_pci_bus() -> Result<()> {
online_device(SYSFS_PCI_BUS_RESCAN_FILE)
}
#[instrument]
pub fn online_device(path: &str) -> Result<()> {
fs::write(path, "1")?;
Ok(())
@@ -61,8 +51,7 @@ pub fn online_device(path: &str) -> Result<()> {
// pcipath_to_sysfs fetches the sysfs path for a PCI path, relative to
// the sysfs path for the PCI host bridge, based on the PCI path
// provided.
#[instrument]
pub fn pcipath_to_sysfs(root_bus_sysfs: &str, pcipath: &pci::Path) -> Result<String> {
fn pcipath_to_sysfs(root_bus_sysfs: &str, pcipath: &pci::Path) -> Result<String> {
let mut bus = "0000:00".to_string();
let mut relpath = String::new();
@@ -120,7 +109,6 @@ impl UeventMatcher for ScsiBlockMatcher {
}
}
#[instrument]
pub async fn get_scsi_device_name(
sandbox: &Arc<Mutex<Sandbox>>,
scsi_addr: &str,
@@ -153,7 +141,6 @@ impl UeventMatcher for VirtioBlkPciMatcher {
}
}
#[instrument]
pub async fn get_virtio_blk_pci_device_name(
sandbox: &Arc<Mutex<Sandbox>>,
pcipath: &pci::Path,
@@ -168,47 +155,6 @@ pub async fn get_virtio_blk_pci_device_name(
Ok(format!("{}/{}", SYSTEM_DEV_PATH, &uev.devname))
}
#[cfg(target_arch = "s390x")]
#[derive(Debug)]
struct VirtioBlkCCWMatcher {
rex: Regex,
}
#[cfg(target_arch = "s390x")]
impl VirtioBlkCCWMatcher {
fn new(root_bus_path: &str, device: &ccw::Device) -> Self {
let re = format!(
r"^{}/0\.[0-3]\.[0-9a-f]{{1,4}}/{}/virtio[0-9]+/block/",
root_bus_path, device
);
VirtioBlkCCWMatcher {
rex: Regex::new(&re).unwrap(),
}
}
}
#[cfg(target_arch = "s390x")]
impl UeventMatcher for VirtioBlkCCWMatcher {
fn is_match(&self, uev: &Uevent) -> bool {
uev.action == "add" && self.rex.is_match(&uev.devpath) && !uev.devname.is_empty()
}
}
#[cfg(target_arch = "s390x")]
#[instrument]
pub async fn get_virtio_blk_ccw_device_name(
sandbox: &Arc<Mutex<Sandbox>>,
device: &ccw::Device,
) -> Result<String> {
let matcher = VirtioBlkCCWMatcher::new(&create_ccw_root_bus_path(), device);
let uev = wait_for_uevent(sandbox, matcher).await?;
let devname = uev.devname;
return match Path::new(SYSTEM_DEV_PATH).join(&devname).to_str() {
Some(path) => Ok(String::from(path)),
None => Err(anyhow!("CCW device name {} is not valid UTF-8", &devname)),
};
}
#[derive(Debug)]
struct PmemBlockMatcher {
suffix: String,
@@ -231,7 +177,6 @@ impl UeventMatcher for PmemBlockMatcher {
}
}
#[instrument]
pub async fn wait_for_pmem_device(sandbox: &Arc<Mutex<Sandbox>>, devpath: &str) -> Result<()> {
let devname = match devpath.strip_prefix("/dev/") {
Some(dev) => dev,
@@ -256,7 +201,6 @@ pub async fn wait_for_pmem_device(sandbox: &Arc<Mutex<Sandbox>>, devpath: &str)
}
/// Scan the SCSI bus for the given SCSI address (SCSI-Id and LUN)
#[instrument]
fn scan_scsi_bus(scsi_addr: &str) -> Result<()> {
let tokens: Vec<&str> = scsi_addr.split(':').collect();
if tokens.len() != 2 {
@@ -291,7 +235,6 @@ fn scan_scsi_bus(scsi_addr: &str) -> Result<()> {
// the same device in the list of devices provided through the OCI spec.
// This is needed to update information about minor/major numbers that cannot
// be predicted from the caller.
#[instrument]
fn update_spec_device_list(device: &Device, spec: &mut Spec, devidx: &DevIndex) -> Result<()> {
let major_id: c_uint;
let minor_id: c_uint;
@@ -368,7 +311,6 @@ fn update_spec_device_list(device: &Device, spec: &mut Spec, devidx: &DevIndex)
// device.Id should be the predicted device name (vda, vdb, ...)
// device.VmPath already provides a way to send it in
#[instrument]
async fn virtiommio_blk_device_handler(
device: &Device,
spec: &mut Spec,
@@ -383,7 +325,6 @@ async fn virtiommio_blk_device_handler(
}
// device.Id should be a PCI path string
#[instrument]
async fn virtio_blk_device_handler(
device: &Device,
spec: &mut Spec,
@@ -398,34 +339,7 @@ async fn virtio_blk_device_handler(
update_spec_device_list(&dev, spec, devidx)
}
// device.id should be a CCW path string
#[cfg(target_arch = "s390x")]
#[instrument]
async fn virtio_blk_ccw_device_handler(
device: &Device,
spec: &mut Spec,
sandbox: &Arc<Mutex<Sandbox>>,
devidx: &DevIndex,
) -> Result<()> {
let mut dev = device.clone();
let ccw_device = ccw::Device::from_str(&device.id)?;
dev.vm_path = get_virtio_blk_ccw_device_name(sandbox, &ccw_device).await?;
update_spec_device_list(&dev, spec, devidx)
}
#[cfg(not(target_arch = "s390x"))]
#[instrument]
async fn virtio_blk_ccw_device_handler(
_: &Device,
_: &mut Spec,
_: &Arc<Mutex<Sandbox>>,
_: &DevIndex,
) -> Result<()> {
Err(anyhow!("CCW is only supported on s390x"))
}
// device.Id should be the SCSI address of the disk in the format "scsiID:lunID"
#[instrument]
async fn virtio_scsi_device_handler(
device: &Device,
spec: &mut Spec,
@@ -437,7 +351,6 @@ async fn virtio_scsi_device_handler(
update_spec_device_list(&dev, spec, devidx)
}
#[instrument]
async fn virtio_nvdimm_device_handler(
device: &Device,
spec: &mut Spec,
@@ -476,7 +389,6 @@ impl DevIndex {
}
}
#[instrument]
pub async fn add_devices(
devices: &[Device],
spec: &mut Spec,
@@ -491,7 +403,6 @@ pub async fn add_devices(
Ok(())
}
#[instrument]
async fn add_device(
device: &Device,
spec: &mut Spec,
@@ -516,7 +427,6 @@ async fn add_device(
match device.field_type.as_str() {
DRIVER_BLK_TYPE => virtio_blk_device_handler(device, spec, sandbox, devidx).await,
DRIVER_BLK_CCW_TYPE => virtio_blk_ccw_device_handler(device, spec, sandbox, devidx).await,
DRIVER_MMIO_BLK_TYPE => virtiommio_blk_device_handler(device, spec, sandbox, devidx).await,
DRIVER_NVDIMM_TYPE => virtio_nvdimm_device_handler(device, spec, sandbox, devidx).await,
DRIVER_SCSI_TYPE => virtio_scsi_device_handler(device, spec, sandbox, devidx).await,
@@ -527,7 +437,6 @@ async fn add_device(
// update_device_cgroup updates the device cgroup for the container
// to deny access to the guest root partition. This prevents
// the container from being able to access the VM rootfs.
#[instrument]
pub fn update_device_cgroup(spec: &mut Spec) -> Result<()> {
let meta = fs::metadata(VM_ROOTFS)?;
let rdev = meta.dev();
@@ -966,12 +875,12 @@ mod tests {
uev_a.subsystem = "block".to_string();
uev_a.devname = devname.to_string();
uev_a.devpath = format!("{}{}/virtio4/block/{}", root_bus, relpath_a, devname);
let matcher_a = VirtioBlkPciMatcher::new(relpath_a);
let matcher_a = VirtioBlkPciMatcher::new(&relpath_a);
let mut uev_b = uev_a.clone();
let relpath_b = "/0000:00:0a.0/0000:00:0b.0";
uev_b.devpath = format!("{}{}/virtio0/block/{}", root_bus, relpath_b, devname);
let matcher_b = VirtioBlkPciMatcher::new(relpath_b);
let matcher_b = VirtioBlkPciMatcher::new(&relpath_b);
assert!(matcher_a.is_match(&uev_a));
assert!(matcher_b.is_match(&uev_b));
@@ -979,66 +888,6 @@ mod tests {
assert!(!matcher_a.is_match(&uev_b));
}
#[cfg(target_arch = "s390x")]
#[tokio::test]
async fn test_virtio_blk_ccw_matcher() {
let root_bus = create_ccw_root_bus_path();
let subsystem = "block";
let devname = "vda";
let relpath = "0.0.0002";
let mut uev = crate::uevent::Uevent::default();
uev.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev.subsystem = subsystem.to_string();
uev.devname = devname.to_string();
uev.devpath = format!(
"{}/0.0.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
// Valid path
let device = ccw::Device::from_str(relpath).unwrap();
let matcher = VirtioBlkCCWMatcher::new(&root_bus, &device);
assert!(matcher.is_match(&uev));
// Invalid paths
uev.devpath = format!(
"{}/0.0.0001/0.0.0003/virtio1/{}/{}",
root_bus, subsystem, devname
);
assert!(!matcher.is_match(&uev));
uev.devpath = format!("0.0.0001/{}/virtio1/{}/{}", relpath, subsystem, devname);
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.0.0001/{}/virtio/{}/{}",
root_bus, relpath, subsystem, devname
);
assert!(!matcher.is_match(&uev));
uev.devpath = format!("{}/0.0.0001/{}/virtio1", root_bus, relpath);
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/1.0.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.4.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.0.10000/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
assert!(!matcher.is_match(&uev));
}
#[tokio::test]
async fn test_scsi_block_matcher() {
let root_bus = create_pci_root_bus_path();
@@ -1053,7 +902,7 @@ mod tests {
"{}/0000:00:00.0/virtio0/host0/target0:0:0/0:0:{}/block/sda",
root_bus, addr_a
);
let matcher_a = ScsiBlockMatcher::new(addr_a);
let matcher_a = ScsiBlockMatcher::new(&addr_a);
let mut uev_b = uev_a.clone();
let addr_b = "2:0";
@@ -1061,7 +910,7 @@ mod tests {
"{}/0000:00:00.0/virtio0/host0/target0:0:2/0:0:{}/block/sdb",
root_bus, addr_b
);
let matcher_b = ScsiBlockMatcher::new(addr_b);
let matcher_b = ScsiBlockMatcher::new(&addr_b);
assert!(matcher_a.is_match(&uev_a));
assert!(matcher_b.is_match(&uev_b));

View File

@@ -65,10 +65,6 @@ pub fn create_pci_root_bus_path() -> String {
ret
}
#[cfg(target_arch = "s390x")]
pub fn create_ccw_root_bus_path() -> String {
String::from("/devices/css0")
}
// From https://www.kernel.org/doc/Documentation/acpi/namespace.txt
// The Linux kernel's core ACPI subsystem creates struct acpi_device
// objects for ACPI namespace objects representing devices, power resources

View File

@@ -5,8 +5,8 @@
#[macro_use]
extern crate lazy_static;
extern crate capctl;
extern crate oci;
extern crate prctl;
extern crate prometheus;
extern crate protocols;
extern crate regex;
@@ -32,10 +32,7 @@ use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::process::exit;
use std::sync::Arc;
use tracing::{instrument, span};
#[cfg(target_arch = "s390x")]
mod ccw;
mod config;
mod console;
mod device;
@@ -54,12 +51,11 @@ mod test_utils;
mod uevent;
mod util;
mod version;
mod watcher;
use mount::{cgroups_mount, general_mount};
use sandbox::Sandbox;
use signal::setup_signal_handler;
use slog::{error, info, o, warn, Logger};
use slog::Logger;
use uevent::watch_uevents;
use futures::future::join_all;
@@ -74,7 +70,6 @@ use tokio::{
};
mod rpc;
mod tracer;
const NAME: &str = "kata-agent";
const KERNEL_CMDLINE_FILE: &str = "/proc/cmdline";
@@ -84,7 +79,6 @@ lazy_static! {
Arc::new(RwLock::new(config::AgentConfig::new()));
}
#[instrument]
fn announce(logger: &Logger, config: &AgentConfig) {
info!(logger, "announce";
"agent-commit" => version::VERSION_COMMIT,
@@ -205,17 +199,6 @@ async fn real_main() -> std::result::Result<(), Box<dyn std::error::Error>> {
ttrpc_log_guard = Ok(slog_stdlog::init().map_err(|e| e)?);
}
if config.tracing != tracer::TraceType::Disabled {
let _ = tracer::setup_tracing(NAME, &logger, &config)?;
}
let root = span!(tracing::Level::TRACE, "root-span", work_units = 2);
// XXX: Start the root trace transaction.
//
// XXX: Note that *ALL* spans need to start after this point!!
let _enter = root.enter();
// Start the sandbox and wait for its ttRPC server to end
start_sandbox(&logger, &config, init_mode, &mut tasks, shutdown_rx.clone()).await?;
@@ -244,10 +227,6 @@ async fn real_main() -> std::result::Result<(), Box<dyn std::error::Error>> {
}
}
if config.tracing != tracer::TraceType::Disabled {
tracer::end_tracing();
}
eprintln!("{} shutdown complete", NAME);
Ok(())
@@ -281,7 +260,6 @@ fn main() -> std::result::Result<(), Box<dyn std::error::Error>> {
rt.block_on(real_main())
}
#[instrument]
async fn start_sandbox(
logger: &Logger,
config: &AgentConfig,
@@ -302,7 +280,7 @@ async fn start_sandbox(
}
// Initialize unique sandbox structure.
let s = Sandbox::new(logger).context("Failed to create sandbox")?;
let s = Sandbox::new(&logger).context("Failed to create sandbox")?;
if init_mode {
s.rtnl.handle_localhost().await?;
}
@@ -328,7 +306,7 @@ async fn start_sandbox(
let mut server = rpc::start(sandbox.clone(), config.server_addr.as_str());
server.start().await?;
rx.await?;
let _ = rx.await?;
server.shutdown().await?;
Ok(())
@@ -368,7 +346,6 @@ fn init_agent_as_init(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result
Ok(())
}
#[instrument]
fn sethostname(hostname: &OsStr) -> Result<()> {
let size = hostname.len() as usize;

View File

@@ -8,7 +8,6 @@ extern crate procfs;
use prometheus::{Encoder, Gauge, GaugeVec, IntCounter, TextEncoder};
use anyhow::Result;
use tracing::instrument;
const NAMESPACE_KATA_AGENT: &str = "kata_agent";
const NAMESPACE_KATA_GUEST: &str = "kata_guest";
@@ -69,7 +68,6 @@ lazy_static! {
prometheus::register_gauge_vec!(format!("{}_{}",NAMESPACE_KATA_GUEST,"meminfo").as_ref() , "Statistics about memory usage in the system.", &["item"]).unwrap();
}
#[instrument]
pub fn get_metrics(_: &protocols::agent::GetMetricsRequest) -> Result<String> {
AGENT_SCRAPE_COUNT.inc();
@@ -89,7 +87,6 @@ pub fn get_metrics(_: &protocols::agent::GetMetricsRequest) -> Result<String> {
Ok(String::from_utf8(buffer).unwrap())
}
#[instrument]
fn update_agent_metrics() {
let me = procfs::process::Process::myself();
@@ -139,7 +136,6 @@ fn update_agent_metrics() {
}
}
#[instrument]
fn update_guest_metrics() {
// try get load and task info
match procfs::LoadAverage::new() {
@@ -193,7 +189,7 @@ fn update_guest_metrics() {
Ok(kernel_stats) => {
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, "total", &kernel_stats.total);
for (i, cpu_time) in kernel_stats.cpu_time.iter().enumerate() {
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, format!("{}", i).as_str(), cpu_time);
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, format!("{}", i).as_str(), &cpu_time);
}
}
}
@@ -222,7 +218,6 @@ fn update_guest_metrics() {
}
}
#[instrument]
fn set_gauge_vec_meminfo(gv: &prometheus::GaugeVec, meminfo: &procfs::Meminfo) {
gv.with_label_values(&["mem_total"])
.set(meminfo.mem_total as f64);
@@ -337,7 +332,6 @@ fn set_gauge_vec_meminfo(gv: &prometheus::GaugeVec, meminfo: &procfs::Meminfo) {
.set(meminfo.k_reclaimable.unwrap_or(0) as f64);
}
#[instrument]
fn set_gauge_vec_cpu_time(gv: &prometheus::GaugeVec, cpu: &str, cpu_time: &procfs::CpuTime) {
gv.with_label_values(&[cpu, "user"])
.set(cpu_time.user as f64);
@@ -361,7 +355,6 @@ fn set_gauge_vec_cpu_time(gv: &prometheus::GaugeVec, cpu: &str, cpu_time: &procf
.set(cpu_time.guest_nice.unwrap_or(0.0) as f64);
}
#[instrument]
fn set_gauge_vec_diskstat(gv: &prometheus::GaugeVec, diskstat: &procfs::DiskStat) {
gv.with_label_values(&[diskstat.name.as_str(), "reads"])
.set(diskstat.reads as f64);
@@ -400,7 +393,6 @@ fn set_gauge_vec_diskstat(gv: &prometheus::GaugeVec, diskstat: &procfs::DiskStat
}
// set_gauge_vec_netdev set gauge for NetDevLine
#[instrument]
fn set_gauge_vec_netdev(gv: &prometheus::GaugeVec, status: &procfs::net::DeviceStatus) {
gv.with_label_values(&[status.name.as_str(), "recv_bytes"])
.set(status.recv_bytes as f64);
@@ -437,7 +429,6 @@ fn set_gauge_vec_netdev(gv: &prometheus::GaugeVec, status: &procfs::net::DeviceS
}
// set_gauge_vec_proc_status set gauge for ProcStatus
#[instrument]
fn set_gauge_vec_proc_status(gv: &prometheus::GaugeVec, status: &procfs::process::Status) {
gv.with_label_values(&["vmpeak"])
.set(status.vmpeak.unwrap_or(0) as f64);
@@ -478,7 +469,6 @@ fn set_gauge_vec_proc_status(gv: &prometheus::GaugeVec, status: &procfs::process
}
// set_gauge_vec_proc_io set gauge for ProcIO
#[instrument]
fn set_gauge_vec_proc_io(gv: &prometheus::GaugeVec, io_stat: &procfs::process::Io) {
gv.with_label_values(&["rchar"]).set(io_stat.rchar as f64);
gv.with_label_values(&["wchar"]).set(io_stat.wchar as f64);
@@ -493,7 +483,6 @@ fn set_gauge_vec_proc_io(gv: &prometheus::GaugeVec, io_stat: &procfs::process::I
}
// set_gauge_vec_proc_stat set gauge for ProcStat
#[instrument]
fn set_gauge_vec_proc_stat(gv: &prometheus::GaugeVec, stat: &procfs::process::Stat) {
gv.with_label_values(&["utime"]).set(stat.utime as f64);
gv.with_label_values(&["stime"]).set(stat.stime as f64);


@@ -6,16 +6,13 @@
use std::collections::HashMap;
use std::ffi::CString;
use std::fs;
use std::fs::File;
use std::io;
use std::io::{BufRead, BufReader};
use std::iter;
use std::os::unix::fs::{MetadataExt, PermissionsExt};
use std::path::Path;
use std::ptr::null;
use std::str::FromStr;
use std::sync::Arc;
use tokio::sync::Mutex;
use libc::{c_void, mount};
@@ -23,6 +20,8 @@ use nix::mount::{self, MsFlags};
use nix::unistd::Gid;
use regex::Regex;
use std::fs::File;
use std::io::{BufRead, BufReader};
use crate::device::{
get_scsi_device_name, get_virtio_blk_pci_device_name, online_device, wait_for_pmem_device,
@@ -31,22 +30,17 @@ use crate::linux_abi::*;
use crate::pci;
use crate::protocols::agent::Storage;
use crate::Sandbox;
#[cfg(target_arch = "s390x")]
use crate::{ccw, device::get_virtio_blk_ccw_device_name};
use anyhow::{anyhow, Context, Result};
use slog::Logger;
use tracing::instrument;
pub const DRIVER_9P_TYPE: &str = "9p";
pub const DRIVER_VIRTIOFS_TYPE: &str = "virtio-fs";
pub const DRIVER_BLK_TYPE: &str = "blk";
pub const DRIVER_BLK_CCW_TYPE: &str = "blk-ccw";
pub const DRIVER_MMIO_BLK_TYPE: &str = "mmioblk";
pub const DRIVER_SCSI_TYPE: &str = "scsi";
pub const DRIVER_NVDIMM_TYPE: &str = "nvdimm";
pub const DRIVER_EPHEMERAL_TYPE: &str = "ephemeral";
pub const DRIVER_LOCAL_TYPE: &str = "local";
pub const DRIVER_WATCHABLE_BIND_TYPE: &str = "watchable-bind";
pub const TYPE_ROOTFS: &str = "rootfs";
@@ -137,7 +131,7 @@ lazy_static! {
];
}
pub const STORAGE_HANDLER_LIST: &[&str] = &[
pub const STORAGE_HANDLER_LIST: [&str; 8] = [
DRIVER_BLK_TYPE,
DRIVER_9P_TYPE,
DRIVER_VIRTIOFS_TYPE,
@@ -146,7 +140,6 @@ pub const STORAGE_HANDLER_LIST: &[&str] = &[
DRIVER_LOCAL_TYPE,
DRIVER_SCSI_TYPE,
DRIVER_NVDIMM_TYPE,
DRIVER_WATCHABLE_BIND_TYPE,
];
#[derive(Debug, Clone)]
@@ -163,7 +156,6 @@ pub struct BareMount<'a> {
// * evaluate all symlinks
// * ensure the source exists
impl<'a> BareMount<'a> {
#[instrument]
pub fn new(
s: &'a str,
d: &'a str,
@@ -182,7 +174,6 @@ impl<'a> BareMount<'a> {
}
}
#[instrument]
pub fn mount(&self) -> Result<()> {
let source;
let dest;
@@ -241,7 +232,6 @@ impl<'a> BareMount<'a> {
}
}
#[instrument]
async fn ephemeral_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -282,13 +272,12 @@ async fn ephemeral_storage_handler(
fs::set_permissions(&storage.mount_point, permission)?;
}
} else {
common_storage_handler(logger, storage)?;
common_storage_handler(logger, &storage)?;
}
Ok("".to_string())
}
#[instrument]
async fn local_storage_handler(
_logger: &Logger,
storage: &Storage,
@@ -335,7 +324,6 @@ async fn local_storage_handler(
Ok("".to_string())
}
#[instrument]
async fn virtio9p_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -345,7 +333,6 @@ async fn virtio9p_storage_handler(
}
// virtiommio_blk_storage_handler handles the storage for mmio blk driver.
#[instrument]
async fn virtiommio_blk_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -356,7 +343,6 @@ async fn virtiommio_blk_storage_handler(
}
// virtiofs_storage_handler handles the storage for virtio-fs.
#[instrument]
async fn virtiofs_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -366,7 +352,6 @@ async fn virtiofs_storage_handler(
}
// virtio_blk_storage_handler handles the storage for blk driver.
#[instrument]
async fn virtio_blk_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -392,33 +377,7 @@ async fn virtio_blk_storage_handler(
common_storage_handler(logger, &storage)
}
// virtio_blk_ccw_storage_handler handles storage for the blk-ccw driver (s390x)
#[cfg(target_arch = "s390x")]
#[instrument]
async fn virtio_blk_ccw_storage_handler(
logger: &Logger,
storage: &Storage,
sandbox: Arc<Mutex<Sandbox>>,
) -> Result<String> {
let mut storage = storage.clone();
let ccw_device = ccw::Device::from_str(&storage.source)?;
let dev_path = get_virtio_blk_ccw_device_name(&sandbox, &ccw_device).await?;
storage.source = dev_path;
common_storage_handler(logger, &storage)
}
#[cfg(not(target_arch = "s390x"))]
#[instrument]
async fn virtio_blk_ccw_storage_handler(
_: &Logger,
_: &Storage,
_: Arc<Mutex<Sandbox>>,
) -> Result<String> {
Err(anyhow!("CCW is only supported on s390x"))
}
// virtio_scsi_storage_handler handles the storage for scsi driver.
#[instrument]
// virtio_scsi_storage_handler handles the storage for scsi driver.
async fn virtio_scsi_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -433,7 +392,6 @@ async fn virtio_scsi_storage_handler(
common_storage_handler(logger, &storage)
}
#[instrument]
fn common_storage_handler(logger: &Logger, storage: &Storage) -> Result<String> {
// Mount the storage device.
let mount_point = storage.mount_point.to_string();
@@ -442,7 +400,6 @@ fn common_storage_handler(logger: &Logger, storage: &Storage) -> Result<String>
}
// nvdimm_storage_handler handles the storage for NVDIMM driver.
#[instrument]
async fn nvdimm_storage_handler(
logger: &Logger,
storage: &Storage,
@@ -456,22 +413,7 @@ async fn nvdimm_storage_handler(
common_storage_handler(logger, &storage)
}
async fn bind_watcher_storage_handler(
logger: &Logger,
storage: &Storage,
sandbox: Arc<Mutex<Sandbox>>,
) -> Result<()> {
let mut locked = sandbox.lock().await;
let container_id = locked.id.clone();
locked
.bind_watcher
.add_container(container_id, iter::once(storage.clone()), logger)
.await
}
// mount_storage performs the mount described by the storage structure.
#[instrument]
fn mount_storage(logger: &Logger, storage: &Storage) -> Result<()> {
let logger = logger.new(o!("subsystem" => "mount"));
@@ -522,8 +464,7 @@ fn mount_storage(logger: &Logger, storage: &Storage) -> Result<()> {
}
/// Looks for `mount_point` entry in the /proc/mounts.
#[instrument]
pub fn is_mounted(mount_point: &str) -> Result<bool> {
fn is_mounted(mount_point: &str) -> Result<bool> {
let mount_point = mount_point.trim_end_matches('/');
let found = fs::metadata(mount_point).is_ok()
// Looks through /proc/mounts and checks if the mount exists
@@ -540,7 +481,6 @@ pub fn is_mounted(mount_point: &str) -> Result<bool> {
Ok(found)
}
#[instrument]
fn parse_mount_flags_and_options(options_vec: Vec<&str>) -> (MsFlags, String) {
let mut flags = MsFlags::empty();
let mut options: String = "".to_string();
@@ -549,12 +489,8 @@ fn parse_mount_flags_and_options(options_vec: Vec<&str>) -> (MsFlags, String) {
if !opt.is_empty() {
match FLAGS.get(opt) {
Some(x) => {
let (clear, f) = *x;
if clear {
flags &= !f;
} else {
flags |= f;
}
let (_, f) = *x;
flags |= f;
}
None => {
if !options.is_empty() {
@@ -573,7 +509,6 @@ fn parse_mount_flags_and_options(options_vec: Vec<&str>) -> (MsFlags, String) {
// associated operations such as waiting for the device to show up, and mount
// it to a specific location, according to the type of handler chosen, and for
// each storage.
#[instrument]
pub async fn add_storages(
logger: Logger,
storages: Vec<Storage>,
@@ -589,9 +524,6 @@ pub async fn add_storages(
let res = match handler_name.as_str() {
DRIVER_BLK_TYPE => virtio_blk_storage_handler(&logger, &storage, sandbox.clone()).await,
DRIVER_BLK_CCW_TYPE => {
virtio_blk_ccw_storage_handler(&logger, &storage, sandbox.clone()).await
}
DRIVER_9P_TYPE => virtio9p_storage_handler(&logger, &storage, sandbox.clone()).await,
DRIVER_VIRTIOFS_TYPE => {
virtiofs_storage_handler(&logger, &storage, sandbox.clone()).await
@@ -607,11 +539,6 @@ pub async fn add_storages(
virtio_scsi_storage_handler(&logger, &storage, sandbox.clone()).await
}
DRIVER_NVDIMM_TYPE => nvdimm_storage_handler(&logger, &storage, sandbox.clone()).await,
DRIVER_WATCHABLE_BIND_TYPE => {
bind_watcher_storage_handler(&logger, &storage, sandbox.clone()).await?;
// Don't register watch mounts, they're handled separately by the watcher.
Ok(String::new())
}
_ => {
return Err(anyhow!(
"Failed to find the storage handler {}",
@@ -631,7 +558,6 @@ pub async fn add_storages(
Ok(mount_list)
}
#[instrument]
fn mount_to_rootfs(logger: &Logger, m: &InitMount) -> Result<()> {
let options_vec: Vec<&str> = m.options.clone();
@@ -657,7 +583,6 @@ fn mount_to_rootfs(logger: &Logger, m: &InitMount) -> Result<()> {
Ok(())
}
#[instrument]
pub fn general_mount(logger: &Logger) -> Result<()> {
let logger = logger.new(o!("subsystem" => "mount"));
@@ -675,7 +600,6 @@ pub fn get_mount_fs_type(mount_point: &str) -> Result<String> {
// get_mount_fs_type_from_file returns the FS type corresponding to the passed mount point and
// any error encountered.
#[instrument]
pub fn get_mount_fs_type_from_file(mount_file: &str, mount_point: &str) -> Result<String> {
if mount_point.is_empty() {
return Err(anyhow!("Invalid mount point {}", mount_point));
@@ -706,7 +630,6 @@ pub fn get_mount_fs_type_from_file(mount_file: &str, mount_point: &str) -> Resul
))
}
#[instrument]
pub fn get_cgroup_mounts(
logger: &Logger,
cg_path: &str,
@@ -797,7 +720,6 @@ pub fn get_cgroup_mounts(
Ok(cg_mounts)
}
#[instrument]
pub fn cgroups_mount(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result<()> {
let logger = logger.new(o!("subsystem" => "mount"));
@@ -813,7 +735,6 @@ pub fn cgroups_mount(logger: &Logger, unified_cgroup_hierarchy: bool) -> Result<
Ok(())
}
#[instrument]
pub fn remove_mounts(mounts: &[String]) -> Result<()> {
for m in mounts.iter() {
mount::umount(m.as_str()).context(format!("failed to umount {:?}", m))?;
@@ -823,30 +744,26 @@ pub fn remove_mounts(mounts: &[String]) -> Result<()> {
// ensure_destination_exists will recursively create a given mountpoint. If directories
// are created, their permissions are initialized to mountPerm(0755)
#[instrument]
fn ensure_destination_exists(destination: &str, fs_type: &str) -> Result<()> {
let d = Path::new(destination);
if d.exists() {
return Ok(());
}
let dir = d
.parent()
.ok_or_else(|| anyhow!("mount destination {} doesn't exist", destination))?;
if !dir.exists() {
fs::create_dir_all(dir).context(format!("create dir all {:?}", dir))?;
if !d.exists() {
let dir = d
.parent()
.ok_or_else(|| anyhow!("mount destination {} doesn't exist", destination))?;
if !dir.exists() {
fs::create_dir_all(dir).context(format!("create dir all failed on {:?}", dir))?;
}
}
if fs_type != "bind" || d.is_dir() {
fs::create_dir_all(d).context(format!("create dir all {:?}", d))?;
fs::create_dir_all(d).context(format!("create dir all failed on {:?}", d))?;
} else {
fs::File::create(d).context(format!("create file {:?}", d))?;
fs::OpenOptions::new().create(true).open(d)?;
}
Ok(())
}
#[instrument]
fn parse_options(option_list: Vec<String>) -> HashMap<String, String> {
let mut options = HashMap::new();
for opt in option_list.iter() {
@@ -866,7 +783,6 @@ mod tests {
use super::*;
use crate::{skip_if_not_root, skip_loop_if_not_root, skip_loop_if_root};
use libc::umount;
use std::fs::metadata;
use std::fs::File;
use std::fs::OpenOptions;
use std::io::Write;
@@ -1104,8 +1020,8 @@ mod tests {
// Create an actual mount
let bare_mount = BareMount::new(
mnt_src_filename,
mnt_dest_filename,
&mnt_src_filename,
&mnt_dest_filename,
"bind",
MsFlags::MS_BIND,
"",
@@ -1274,7 +1190,7 @@ mod tests {
let logger = slog::Logger::root(drain, o!());
let result = get_cgroup_mounts(&logger, "", true);
assert!(result.is_ok());
assert_eq!(true, result.is_ok());
let result = result.unwrap();
assert_eq!(1, result.len());
assert_eq!(result[0].fstype, "cgroup2");
@@ -1442,39 +1358,4 @@ mod tests {
assert!(mounts[1].eq(&cg_devices_mount), "{}", msg);
}
}
#[test]
fn test_ensure_destination_exists() {
let dir = tempdir().expect("failed to create tmpdir");
let mut testfile = dir.into_path();
testfile.push("testfile");
let result = ensure_destination_exists(testfile.to_str().unwrap(), "bind");
assert!(result.is_ok());
assert!(testfile.exists());
let result = ensure_destination_exists(testfile.to_str().unwrap(), "bind");
assert!(result.is_ok());
let meta = metadata(testfile).unwrap();
assert!(meta.is_file());
let dir = tempdir().expect("failed to create tmpdir");
let mut testdir = dir.into_path();
testdir.push("testdir");
let result = ensure_destination_exists(testdir.to_str().unwrap(), "ext4");
assert!(result.is_ok());
assert!(testdir.exists());
let result = ensure_destination_exists(testdir.to_str().unwrap(), "ext4");
assert!(result.is_ok());
//let meta = metadata(testdir.to_str().unwrap()).unwrap();
let meta = metadata(testdir).unwrap();
assert!(meta.is_dir());
}
}
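
Review note on the `ensure_destination_exists` hunk above: the two variants differ in how they create the placeholder file for a bind mount. `fs::File::create` opens with write access, while `fs::OpenOptions::new().create(true).open(..)` requests creation without write or append access, which the standard library rejects with an error. The following is a minimal std-only sketch of the early-return variant, not the agent's actual code: it takes a `&Path` and returns `io::Result` instead of `anyhow::Result`, and it drops the `d.is_dir()` check because the path is known not to exist at that point.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Creates `destination` if it is missing: a regular file for a "bind"
/// mount, a directory for every other filesystem type.
/// Simplified sketch; the signature is an assumption for illustration.
fn ensure_destination_exists(destination: &Path, fs_type: &str) -> io::Result<()> {
    // Nothing to do when the mount point is already there.
    if destination.exists() {
        return Ok(());
    }

    // A path without a parent cannot be created.
    let parent = destination.parent().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "mount destination has no parent")
    })?;
    fs::create_dir_all(parent)?;

    if fs_type != "bind" {
        fs::create_dir_all(destination)
    } else {
        // File::create opens with write + create + truncate, so the
        // placeholder file is actually created.
        fs::File::create(destination).map(|_| ())
    }
}
```

Calling the function twice on the same path is a no-op the second time, which is what the `test_ensure_destination_exists` test in the hunk exercises for both a file and a directory.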


@@ -11,7 +11,6 @@ use std::fmt;
use std::fs;
use std::fs::File;
use std::path::{Path, PathBuf};
use tracing::instrument;
use crate::mount::{BareMount, FLAGS};
use slog::Logger;
@@ -21,7 +20,6 @@ pub const NSTYPEIPC: &str = "ipc";
pub const NSTYPEUTS: &str = "uts";
pub const NSTYPEPID: &str = "pid";
#[instrument]
pub fn get_current_thread_ns_path(ns_type: &str) -> String {
format!(
"/proc/{}/task/{}/ns/{}",
@@ -42,7 +40,6 @@ pub struct Namespace {
}
impl Namespace {
#[instrument]
pub fn new(logger: &Logger) -> Self {
Namespace {
logger: logger.clone(),
@@ -53,13 +50,11 @@ impl Namespace {
}
}
#[instrument]
pub fn get_ipc(mut self) -> Self {
self.ns_type = NamespaceType::Ipc;
self
}
#[instrument]
pub fn get_uts(mut self, hostname: &str) -> Self {
self.ns_type = NamespaceType::Uts;
if !hostname.is_empty() {
@@ -68,7 +63,6 @@ impl Namespace {
self
}
#[instrument]
pub fn get_pid(mut self) -> Self {
self.ns_type = NamespaceType::Pid;
self
@@ -82,7 +76,6 @@ impl Namespace {
// setup creates persistent namespace without switching to it.
// Note, pid namespaces cannot be persisted.
#[instrument]
pub async fn setup(mut self) -> Result<Self> {
fs::create_dir_all(&self.persistent_ns_dir)?;
@@ -102,7 +95,7 @@ impl Namespace {
let new_thread = tokio::spawn(async move {
if let Err(err) = || -> Result<()> {
let origin_ns_path = get_current_thread_ns_path(ns_type.get());
let origin_ns_path = get_current_thread_ns_path(&ns_type.get());
File::open(Path::new(&origin_ns_path))?;
@@ -121,12 +114,8 @@ impl Namespace {
let mut flags = MsFlags::empty();
if let Some(x) = FLAGS.get("rbind") {
let (clear, f) = *x;
if clear {
flags &= !f;
} else {
flags |= f;
}
let (_, f) = *x;
flags |= f;
};
let bare_mount = BareMount::new(source, destination, "none", flags, "", &logger);
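
Review note on the `(clear, f)` hunks in this file and in `parse_mount_flags_and_options`: the variant that honours the boolean lets an option such as `rw` remove a bit that an earlier `ro` set, while the variant that ignores it (`let (_, f) = *x; flags |= f;`) can only ever add bits. The sketch below illustrates the first behaviour under stated assumptions: the flag constants are plain `u64` stand-ins rather than `nix::mount::MsFlags`, and `flag_table` is a hypothetical, much smaller table than the agent's `FLAGS`.

```rust
use std::collections::HashMap;

// Stand-in bit values for illustration only; the agent uses MsFlags.
const MS_RDONLY: u64 = 1;
const MS_NOSUID: u64 = 2;
const MS_NODEV: u64 = 4;

/// Hypothetical option table: name -> (clear, flag).
/// `clear == true` means the option removes the flag instead of setting it.
fn flag_table() -> HashMap<&'static str, (bool, u64)> {
    let mut m = HashMap::new();
    m.insert("ro", (false, MS_RDONLY));
    m.insert("rw", (true, MS_RDONLY));
    m.insert("nosuid", (false, MS_NOSUID));
    m.insert("suid", (true, MS_NOSUID));
    m.insert("nodev", (false, MS_NODEV));
    m.insert("dev", (true, MS_NODEV));
    m
}

/// Splits mount options into a flag bitmask and a comma-separated string
/// of the options that are not known flags.
fn parse_flags_and_options(opts: &[&str]) -> (u64, String) {
    let table = flag_table();
    let mut flags = 0u64;
    let mut options = String::new();

    for opt in opts.iter().filter(|o| !o.is_empty()) {
        match table.get(*opt) {
            Some(&(clear, f)) => {
                if clear {
                    flags &= !f; // later option cancels an earlier one
                } else {
                    flags |= f;
                }
            }
            None => {
                if !options.is_empty() {
                    options.push(',');
                }
                options.push_str(opt);
            }
        }
    }
    (flags, options)
}
```

With this table, `["ro", "nosuid", "rw"]` ends with only the no-suid bit set; a version that ignored `clear` would leave the read-only bit set as well.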

Some files were not shown because too many files have changed in this diff.