Commit Graph

6312 Commits

Author SHA1 Message Date
Gabriela Cervantes
3fad527734 docs: Fix Release Process document
This PR updates the correct url for github actions as well as it
corrects a misspelling.

Fixes #1960

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 9158ec68cc)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-06-08 10:00:47 +02:00
Tim Zhang
6cea1b146c
Merge pull request #1959 from Tim-Zhang/port-fix-fd-leak-for-stable-2.1
stable-2.1 | Port fd leak fixes
2021-06-03 20:15:23 +08:00
Tim Zhang
1cc2ad3f34 agent: Fix fd leak caused by netlink
See also: little-dude/netlink#165

Fixes: #1952

Because the author of netlink has no time to maintain the crate
(https://github.com/little-dude/netlink/issues/161), so we
need to switch the dependency to github temporarily.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-03 17:24:09 +08:00
Tim Zhang
ac34f6dfd9 agent: Upgrade tokio-vsock to fix fd leak of vsock socket
Fixes: #1950

The further information: rust-vsock/vsock-rs#15

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-03 10:29:33 +08:00
Julio Montes
ff206cf6cf
Merge pull request #1946 from fidencio/wip/weekly-backports-to-stable-2.1
[stable-2.1] Weekly backports to stable-2.1 branch, May 31st 2021
2021-06-01 16:00:42 -05:00
Bin Liu
57f7ffbe39
Merge pull request #1940 from liubin/backport/1934
[backport]runtime: and cgroup and SandboxCgroupOnly check for check sub-command
2021-06-01 08:32:45 +08:00
fupan.lfp
915fea7b1f cgroup: fix the issue of set mem.limit and mem.swap
When update memory limit, we should adapt the write sequence
for memory and swap memory, so it won't fail because
the new value and the old value don't fit kernel's
validation.

Fixes: #1917

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit 30f4834c5b)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
fupan.lfp
a05e137710 agent: re-enable the standard SIGPIPE behavior
The Rust standard library had suppressed the default SIGPIPE
behavior, see https://github.com/rust-lang/rust/pull/13158.
Since the parent's signal handler would be inherited by it's child
process, thus we should re-enable the standard SIGPIPE behavior as a
workaround.

Fixes: #1887

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit 0ae364c8eb)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
bin
8019f7322d virtiofsd: Fix file descriptors leak and return correct PID
This commit will fix two problems:
- Virtiofsd process ID returned to the caller will always be 0,
   the pid var is never being assigned a value.
- Socket listen fd may leak in case of failure of starting virtiofsd process.
  This is a port of be9ca0d58b

Fixes: #1931

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 773deca2f6)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-31 16:56:15 +02:00
bin
e48c9d426d runtime: and cgroup and SandboxCgroupOnly check for check sub-command
In kata-runtime check sub-command, checks cgroups and SandboxCgroupOnly
to show message if the SandboxCgroupOnly is not set to true
and cgroup v2 is used.

Fixes: #1927

Signed-off-by: bin <bin@hyper.sh>
2021-05-28 16:36:49 +08:00
Peng Tao
6a7e6a8d0a
Merge pull request #1921 from fidencio/wip/weekly-stable-2.1-backports-may-24th
Weekly stable 2.1 backports may 24th
2021-05-25 10:17:18 +08:00
quanweiZhou
7874ab33d4 agent: fix start container failed when dropping all capabilities
When starting a container and dropping all capabilities,
the init child process has no permission to read the exec.fifo
file because the parent set the file mode 0o622. So change the exec.fifo file mode to 0o644.

fixes #1913

Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
(cherry picked from commit 3e4ebe10ac)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:15 +02:00
Yuanzhe Liu
536634e909 qemu: align before memory hotplug on arm64
When hotplug memory on arm64 in kata, kernel will shout:

[ 0.396551] Block size [0x40000000] unaligned hotplug range: start 0xc8000000, size 0x40000000
[ 0.396556] acpi PNP0C80:01: add_memory failed
[ 0.396834] acpi PNP0C80:01: acpi_memory_enable_device() error
[ 0.396948] acpi PNP0C80:01: Enumeration failure

It means that kernel will check if the memory range to be hotplugged
align with 1G before plug the memory. So we should twist the qemu to
make sure the memory range align with 1G to pass the kernel check.

Fixes: #1841

Signed-off-by: Yuanzhe Liu <yuanzheliu09@gmail.com>
(cherry picked from commit bc36b7b49f)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:11 +02:00
Eric Ernst
c51891fee7 sandbox-bindmount: persist mount information
Without this, if the shim dies, we will not have a reliable way to
identify what mounts should be cleaned up if `containerd-shim-kata-v2
cleanup` is called for the sandbox.

Before this, if you `ctr run` with a sandbox bindmount defined and SIGKILL the
containerd-shim-kata-v2, you'll notice the sandbox bindmount left on
host.

With this change, the shim is able to get the sandbox bindmount
information from disk and do the appropriate cleanup.

Fixes #1896

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 7f1030d303)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:08 +02:00
Eric Ernst
b137c7ac33 sandbox: Cleanup if failure to setup sandbox-bindmount occurs
If for any reason there's an error when trying to setup the sandbox
bindmounts, make sure we roll back any mounts already created when
setting up the sandbox.

Without this, we'd leave shared directory mount and potentially
sandbox-bindmounts on the host.

Fixes: #1895

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 089a7484e1)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-24 15:08:04 +02:00
Fabiano Fidêncio
a407e53b94
Merge pull request #1911 from devimc/2021-05-21/stable-2.1/updateChannels
[backport-2.1] workflows: release kata 2.x snap through the stable channel
2021-05-22 09:22:10 +02:00
Julio Montes
68a77a7dec workflows: release kata 2.x snap through the stable channel
kata 1.x has been deprecated, now kata 2.x can be released through
the stable channel

fixes #1909

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-05-21 15:47:50 -05:00
Bin Liu
169cf133c9
Merge pull request #1873 from teawater/vm_doc2.1
[2.1] how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
2021-05-20 17:36:03 +08:00
Hui Zhu
550269ff89 how-to-use-virtio-mem-with-kata.md: Update doc to make it clear
Update this howto because the virtio-mem support of kata, qemu and Linux
was updated.

Fixes: #1845

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-05-18 14:11:43 +08:00
Chelsea Mafrica
d6d16dc597
Merge pull request #1849 from GabyCT/topic/removeprportinglabel
github: Do not run require porting labels on stable-2.1
2021-05-17 13:19:25 -07:00
Gabriela Cervantes
1ea0dc9804 github: Do not run require porting labels on stable-2.1
When we are creating a PR in stable-2.1 we do not need to run
the github action of porting labels as we are doing backports or
new releases in stable-2.1 and we it is unnecessary to put labels
like no-backport-needed or no-forwardport-needed, etc.

Fixes #1847

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2021-05-14 16:37:44 -05:00
Fabiano Fidêncio
0f82291926
Merge pull request #1856 from fidencio/2.1.0-branch-bump
# Kata Containers 2.1.0
2021-05-14 19:59:06 +02:00
Fabiano Fidêncio
5d3610e25f release: Kata Containers 2.1.0
- stable-2.1 | The last round of backports before releasing 2.1.0
- back port: image_build: align image size to 128M for arm64
- stable-2.1 | runtime: make dialing timeout configurable
- stable-2.1 | agent: avoid reaping the exit signal of execute_hook in the reaper
- stable-2.1 | Get sandbox metrics cli
- packaging/kata-cleanup: add k3s containerd volume
- stable-2.1: First round of backports
- [backport]runtime: use s.ctx instead ctx for checking cancellation
- [2.1.0] kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
- [2.1.0] Fix issue of virtio-mem

9266c246 rustjail: separated the propagation flags from mount flags
7086f91e runtime: sandbox delete should succeed after verifying sandbox state
0a7befa6 docs: Fix spell-check errors found after new text is discovered
eff70d2e docs: Remove horizontal ruler markers that disable spell checks
260f59df image_build: align image size to 128M for arm64
c0bdba23 runtime: make dialing timeout configurable
1b3cf2fb kata-monitor: export get stats for sandbox
59b9e5d0 kata-runtime: add `metrics` command
828a3048 agent: avoid reaping the exit signal of execute_hook in the reaper
d3690952 runtime: shim: dedup client, socket addr code
7f7c794d runtime: Short the shim-monitor path
3f1b7c91 cli: delete tracing code for kata-runtime binary
68cad377 agent: Set fixed NOFILE limit value for kata-agent
7c9067cc docs: add per-Pod Kata configurations for enable_pprof
dba86ef3 ci/install_yq.sh: install_yq: Check version before return
3883e4e2 kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
79831faf runtime: use s.ctx instead ctx for checking cancellation
3212c7ae packaging/kata-cleanup: add k3s containerd volume
7f7c3fc8 qemu.go: qemu: resizeMemory: Fix virtio-mem resize overflow issue
c9053ea3 qemu.go: qemu: setupVirtioMem: let sizeMB be multiple of 2Mib

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-14 16:05:01 +02:00
Fabiano Fidêncio
ed01ac3e0c
Merge pull request #1853 from fidencio/wip/last-round-of-backports-for-2.1.0
stable-2.1 | The last round of backports before releasing 2.1.0
2021-05-14 14:35:25 +02:00
fupan.lfp
9266c2460a rustjail: separated the propagation flags from mount flags
Since the propagation flags couldn't be combinted with the
standard mount flags, and they should be used with the remount,
thus it's better to split them from the standard mount flags.

Fixes: #1699

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
(cherry picked from commit e5fe572f51)
2021-05-14 09:42:00 +02:00
Peng Tao
7086f91e1f runtime: sandbox delete should succeed after verifying sandbox state
Otherwise we might block delete and create orphan containers.

Fixes: #1039

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 35151f1786)
2021-05-14 09:41:38 +02:00
Christophe de Dinechin
0a7befa645 docs: Fix spell-check errors found after new text is discovered
The spell-checker scripts has some bugs that caused large chunks of texts to not
be spell checked at all (see #1793). The previous commit worked around this bug,
which exposed another bug:

The following source text:

    are discussions about using VM save and restore to
    give [`criu`](https://github.com/checkpoint-restore/criu)-like
    functionality, which might provide a solution

yields the surprising error below:

    WARNING: Word 'givelike': did you mean one of the following?: give like, give-like, wavelike

Apparently, an extra space is removed, which is another issue with the
spell-checking script. This case is somewhat contrived because of the URL link,
so for now, I decided for a creative rewriting, inserting the word "a" knowing
that "alike" is a valid word ;-)

Fixes: #1793

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
(cherry picked from commit 5fdf617e7f)
2021-05-14 09:41:38 +02:00
Christophe de Dinechin
eff70d2eea docs: Remove horizontal ruler markers that disable spell checks
There is a bug in the CI script checking spelling that causes it
to skip any text that follows a horizontal ruler.
(https://github.com/kata-containers/tests/issues/3448)

Solution: replace one horizontal ruler marker with another that
does not trip the spell-checking script.

Fixes: #1793

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
(cherry picked from commit 42425456e7)
2021-05-14 09:38:50 +02:00
Fabiano Fidêncio
dd26aa5838
Merge pull request #1840 from jongwu/stable-2.1_image_align
back port: image_build: align image size to 128M for arm64
2021-05-13 10:37:50 +02:00
Jianyong Wu
260f59df38 image_build: align image size to 128M for arm64
There is an inconformity between qemu and kernel of memory alignment
check in memory hotplug. Both of qemu and kernel will do the start
address alignment check in memory hotplug. But it's 2M in qemu
while 128M in kernel. It leads to an issue when memory hotplug.

Currently, the kata image is a nvdimm device, which will plug into the VM as
a dimm. If another dimm is pluged, it will reside on top of that nvdimm.
So, the start address of the second dimm may not pass the alginment
check in kernel if the nvdimm size doesn't align with 128M.

There are 3 ways to address this issue I think:
1. fix the alignment size in kernel according to qemu. I think people
in linux kernel community will not accept it.
2. do alignment check in qemu and force the start address of hotplug
in alignment with 128M, which means there maybe holes between memory blocks.
3. obey the rule in user end, which means fix it in kata.

I think the second one is the best, but I can't do that for some reason.
Thus, the last one is the choice here.

Fixes: #1769
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-05-13 10:09:25 +08:00
Chelsea Mafrica
9a32a3e16d
Merge pull request #1835 from snir911/backport_configure_timeout
stable-2.1 | runtime: make dialing timeout configurable
2021-05-12 13:14:37 -07:00
Fabiano Fidêncio
123f7d53cb
Merge pull request #1830 from Tim-Zhang/fix-reap-for-stable-2.1
stable-2.1 | agent: avoid reaping the exit signal of execute_hook in the reaper
2021-05-12 20:26:42 +02:00
Fabiano Fidêncio
aa213fdc28
Merge pull request #1833 from fidencio/wip/stable-2.1-backport-of-1816
stable-2.1 | Get sandbox metrics cli
2021-05-12 19:34:20 +02:00
Snir Sheriber
c0bdba2350 runtime: make dialing timeout configurable
allow to set dialing timeout in configuration.toml
default is 30s

Fixes: #1789
(cherry-picked 01b56d6cbf)
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2021-05-12 14:17:34 +03:00
Eric Ernst
1b3cf2fb7d kata-monitor: export get stats for sandbox
Gathering stats for a given sandbox is pretty useful; let's export a
function from katamonitor pkg to do this.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 3787306107)
2021-05-12 11:44:58 +02:00
Eric Ernst
59b9e5d0f8 kata-runtime: add metrics command
For easier debug, let's add subcommand to kata-runtime for gathering
metrics associated with a given sandbox.

kata-runtime metrics --sandbox-id foobar

Fixes: #1815

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 8068a4692f)
2021-05-12 11:44:53 +02:00
Tim Zhang
828a304883 agent: avoid reaping the exit signal of execute_hook in the reaper
Fixes: #1826

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-05-12 16:33:44 +08:00
Fabiano Fidêncio
70734dfa17
Merge pull request #1803 from nubificus/stable-2.1
packaging/kata-cleanup: add k3s containerd volume
2021-05-11 19:38:57 +02:00
Fabiano Fidêncio
f170df6201
Merge pull request #1821 from fidencio/wip/first-round-of-backports
stable-2.1: First round of backports
2021-05-11 08:52:18 +02:00
Eric Ernst
d3690952e6 runtime: shim: dedup client, socket addr code
(1) Add an accessor function, SocketAddress, to the shim-v2 code for
determining the shim's abstract domain socket address, given the sandbox
ID.

(2) In kata monitor, create a function, BuildShimClient, for obtaining the appropriate
http.Client for communicating with the shim's monitoring endpoint.

(3) Update the kata CLI and kata-monitor code to make use of these.

(4) Migrate some kata monitor methods to be functions, in order to ease
future reuse.

(5) drop unused namespace from functions where it is no longer needed.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
(cherry picked from commit 3caed6f88d)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:53 +02:00
Fabiano Fidêncio
7f7c794da4 runtime: Short the shim-monitor path
Instead of having something like
"/containerd-shim/$namespace/$sandboxID/shim-monitor.sock", let's change
the approach to:
* create the file in a more neutral location "/run/vc", instead of
  "/containerd-shim";
* drop the namespace, as the sandboxID should be unique;
* remove ".sock" from the socket name.

This will result on a name that looks like:
"/run/vc/$sandboxID/shim-monitor"

Fixes: #497

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
(cherry picked from commit 4bc006c8a4)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:47 +02:00
bin
3f1b7c9127 cli: delete tracing code for kata-runtime binary
There are no pod/container operations in kata-runtime binary,
tracing in this package is meaningless.

Fixes: #1748

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 13c23fec11)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:35:36 +02:00
Snir Sheriber
68cad37720 agent: Set fixed NOFILE limit value for kata-agent
Some applications may fail if NOFILE limit is set to unlimited.
Although in some environments this value is explicitly overridden,
lets set it to a more sane value in case it doesn't.

Fixes #1715
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
(cherry picked from commit a188577ebf)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:34:28 +02:00
bin
7c9067cc9d docs: add per-Pod Kata configurations for enable_pprof
Now enabling enable_pprof for individual pods is supported,
but not documented.

This commit will add per-Pod Kata configurations for `enable_pprof`
in file `docs/how-to/how-to-set-sandbox-config-kata.md`

Fixes: #1744

Signed-off-by: bin <bin@hyper.sh>
(cherry picked from commit 95e54e3f48)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:34:05 +02:00
Hui Zhu
dba86ef31a ci/install_yq.sh: install_yq: Check version before return
Check the yq version before return.

Fixes: #1776

Signed-off-by: Hui Zhu <teawater@antfin.com>
(cherry picked from commit d8896157df)
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-05-10 15:33:33 +02:00
Tim Zhang
0e2df80bda
Merge pull request #1814 from liubin/fix/1804-select-sandbox-ctx
[backport]runtime: use s.ctx instead ctx for checking cancellation
2021-05-07 19:43:14 +08:00
Bin Liu
8c4e187049
Merge pull request #1813 from teawater/open_vm
[2.1.0] kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
2021-05-07 19:31:23 +08:00
Hui Zhu
3883e4e290 kernel: configs: Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel
Open CONFIG_VIRTIO_MEM in x86_64 Linux kernel.

Fixes: #1798

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-05-07 15:19:06 +08:00
Fabiano Fidêncio
3bcdc26008
Merge pull request #1812 from teawater/fix_vm
[2.1.0] Fix issue of virtio-mem
2021-05-07 08:14:19 +02:00
bin
79831fafaf runtime: use s.ctx instead ctx for checking cancellation
s.ctx should be used for checking cancellation, and the
local ctx is used for tracing.

Fixes: #1804

Signed-off-by: bin <bin@hyper.sh>
2021-05-06 17:22:53 +08:00