Commit Graph

9729 Commits

Author SHA1 Message Date
Greg Kurz
a5319c6be6 runtime: Start QEMU undaemonized
QEMU has always been started daemonized since the beginning. I
could not find any justification for that though, but it certainly
introduces a problem : QEMU stops logging errors when started this
way, which isn't accaptable from a support standpoint. The QEMU
community discourages the use of -daemonize ; mostly because
libvirt, QEMU's primary consummer, doesn't use this option and
prefers getting errors from QEMU's stderr through a pipe in order
to enforce rollover.

Now that virtcontainers knows how to start QEMU with a pre-
established QMP connection, let's start QEMU without -daemonize.
This requires to handle the reaping of QEMU when it terminates.
Since cmd.Wait() is blocking, call it from a goroutine.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
Greg Kurz
bf4e3a618f runtime: Launch QEMU with cmd.Start()
LaunchCustomQemu() currently starts QEMU with cmd.Run() which is
supposed to block until the child process terminates. This assumes
that QEMU daemonizes itself, otherwise LaunchCustomQemu() would
block forever. The virtcontainers package indeed enables the
Daemonize knob in the configuration but having such an implicit
dependency on a supposedly configurable setting is ugly and fragile.

cmd.Run() is :

func (c *Cmd) Run() error {
	if err := c.Start(); err != nil {
		return err
	}
	return c.Wait()
}

Let's open-code this : govmm calls cmd.Start() and returns the
cmd to virtcontainers which calls cmd.Wait().

If QEMU doesn't start, e.g. missing binary, there won't be any
errors to collect from QEMU output. Just drop these lines in govmm.
Similarily there won't be any log file to read from in virtcontainers.
Drop that as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
Greg Kurz
8a1723a5cb runtime: Pre-establish the QMP connection
Running QEMU daemonized ensures that the QMP socket is ready to
accept connections when LaunchQemu() returns. In order to be
able to run QEMU undaemonized, let's handle that part upfront.
Create a listener socket and connect to it. Pass the listener
to QEMU and pass the connected socket to QMP : this ensures
that we cannot fail to establish QMP connection and that we
can detect if QEMU exits before accepting the connection.
This is basically what libvirt does.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:09:11 +01:00
Greg Kurz
8a4f08cb0f govmm: Optionally pass QMP listener to QEMU
QEMU's -qmp option can be passed the file descriptor of a socket that
is already in listening mode. This is done with by passing `fd=XXX`
to `-qmp` instead of a path. Note that these two options are mutually
exclusive : QEMU errors out if both are passed, so we check that as
well in the validation function.

While here add the `path=` stanza in the path based case for clarity.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 23:08:48 +01:00
Greg Kurz
219bb8e7d0 govmm: Optionally start QMP with a pre-configured connection
When QEMU is launched daemonized, we have the guarantee that the
QMP socket is available. In order to launch a non-daemonized QEMU,
the QMP connection should be created before QEMU is started in order
to avoid a race. Introduce a variant of QMPStart() that can use such
an existing connection.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-01-24 19:16:47 +01:00
Peng Tao
856d4b7361
Merge pull request #5798 from pmores/qemu-support
basic framework for QEMU support in runtime-rs
2022-12-14 15:05:33 +08:00
GabyCT
b637d12d19
Merge pull request #5884 from GabyCT/topic/fixbuildscript
tools: Fix indentation on build kernel script
2022-12-13 15:28:24 -06:00
Chao Wu
bb4be2a666
Merge pull request #5690 from yipengyin/fix-virtiofsd
runtime-rs: fix standalone share fs
2022-12-14 00:16:10 +08:00
Pavel Mores
1f28ff6838 runtime-rs: add binary to exercise shim proper w/o containerd dependencies
After building the binary as usual with `cargo build` run it as follows.

It needs a configuration.toml in which only qemu keys `path`, `kernel`
and `initrd` will initially need to be set.  Point them to respective
files e.g. from a kata distribution tarball.

It also needs to be launched from an exported container bundle
directory.  One can be created by running

mkdir rootfs
podman export $(podman create busybox) | tar -C ./rootfs -xvf -
runc spec -b .

in a suitable directory.

Then launch the program like this:

KATA_CONF_FILE=/path/to/configuration-qemu.toml /path/to/shim-ctl

Fixes: #5817

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:55:21 +01:00
Pavel Mores
eb8c9d38ff runtime-rs: add launch of a simple qemu process to start_vm()
The point here is just to get a simplest Kata VM running.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
2f6d0d408b runtime-rs: support qemu in VirtContainer
Added registration of qemu config plugin and support for creating Qemu
Hypervisor instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
1413dfe91c runtime-rs: add basic empty boilerplate for qemu driver
This does almost literally nothing so far apart from getting and setting
HypervisorConfig.  It's mostly copied from/inspired by dragonball.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:53:45 +01:00
Bin Liu
3952fedcd0
Merge pull request #5882 from bergwolf/github/oci-namespaces
runtime-rs: fix sandbox_pidns calculation and oci spec amending
2022-12-13 18:32:02 +08:00
Fabiano Fidêncio
f1381eb361
Merge pull request #4813 from ManaSugi/fix/add-selinux-agent
runtime,agent: Add SELinux support for containers inside the guest
2022-12-13 11:24:53 +01:00
Fupan Li
015674df16
Merge pull request #5873 from justxuewei/fix/umount2
kata-sys-util: fix issues where umount2 couldn't get the correct path
2022-12-13 15:52:32 +08:00
Bin Liu
03b6124fc6
Merge pull request #5848 from Yuan-Zhuo/drop-cgmr-option
agent: Drop the Option for LinuxContainer.cgroup_manager
2022-12-13 12:09:39 +08:00
Bin Liu
add2486259
Merge pull request #5853 from jongwu/test_kata3.0_arm
dragonball: enable kata3.0/dragonball CI on Arm
2022-12-13 11:05:17 +08:00
Gabriela Cervantes
a577df8b71 tools: Fix indentation on build kernel script
This PR fixes the indentation on the build kernel script.

Fixes #5883

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-12 16:37:47 +00:00
Fabiano Fidêncio
740387b569
Merge pull request #5829 from singhwang/main
fix kata deploy error after node reboot.
2022-12-12 14:20:14 +01:00
singhwang
b087667ac5 kata-deploy: Fix the pod of kata deploy starts to occur an error
If a pod of kata is deployed on a machine, after the machine restarts, the pod status of kata-deploy will be CrashLoopBackOff.

Fixes: #5868
Signed-off-by: SinghWang <wangxin_0611@126.com>
2022-12-12 19:11:38 +08:00
Peng Tao
79cf38e6ea runtime-rs: clear OCI spec namespace path
None of the host namespace paths make sense in the guest. Let's clear
them all before sending the spec to the agent.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 11:07:14 +00:00
Peng Tao
62f4603e81 runtime-rs: reset rdma cgroup
We don't support rdma cgroups yet. Let's make sure it is reset to empty.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
5b6596f54e runtime-rs: CreateContainerRequest has Default
We can just use it to initialize the default fields.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
e9e82ce28b runtime-rs: fix is_pid_namespace_enabled check
We should test is_pid_namespace_enabled before amending the container
spec, where the pid namespace path is cleared and resulting
sandbox_pidns to always being false.

Fixes: #5881
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:54:48 +00:00
Xuewei Niu
8079a9732d kata-sys-util: fix issues where umount2 couldn't get the correct path
Strings in Rust don't have \0 at the end, but C does, which leads to `umount2`
in the libc can't get the correct path. Besides, calling `nix::mount::umount2`
to avoid using an unsafe block is a robust solution.

Fixes: #5871

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2022-12-12 11:50:32 +08:00
Yipeng Yin
4661ea8d3b runtime-rs: fix standalone share fs
Standalone share fs should add virtiofs device in setup_device_before_start_vm
and return the storages to mount the directory in guest. And it uses
hypervisor's jailer root directly instead of jail config.

Besides, we tweaked the parameter, so it adapts to rust version virtiofsd
now. And its cache policy which forbids caching is "never" now,  instead of
"none". Hence, we change the default cache mode.

Fixes: #5655

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2022-12-12 10:58:09 +08:00
GabyCT
67e82804c5
Merge pull request #5865 from GabyCT/topic/fixspacesovmfscript
tools: Fix indentation for ovmf script
2022-12-09 15:33:49 -06:00
Jianyong Wu
c5abc5ed4d config: speed up rng init when kernel boot for arm64
For now, rng init is too slow for kata3.0/dragonball. Enable
random_trust_cpu can speed up rng init when kernel boot.

Fixes: #5870
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-12-09 14:20:18 +08:00
Gabriela Cervantes
3e6114b2ef tools: Fix indentation for ovmf script
This PR fixes the indentation for the ovmf script for packaging.

Fixes #5864

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-08 16:12:20 +00:00
Greg Kurz
5ef7ed72ae
Merge pull request #5610 from UiPath/fix-process-wait
runtime: prevent waiting 50 ms minimum for a process exit
2022-12-08 11:02:39 +01:00
Peng Tao
0a1d1ec2fa
Merge pull request #5830 from openanolis/fix-high-cpu
runtime-rs: fix high cpu
2022-12-08 12:16:06 +08:00
Steve Horsman
39394fa2a8
Merge pull request #5844 from jtumber-ibm/patch-1
agent: remove `sysinfo` dependency
2022-12-07 16:35:05 +00:00
Fupan Li
cce316b5e9
Merge pull request #5607 from justxuewei/feat/sandbox-level-volume
runtime-rs: bind mount volumes in sandbox level
2022-12-07 19:23:38 +08:00
Chelsea Mafrica
1ff4185111
Merge pull request #5842 from cyyzero/update_install_guide
docs: Update the rust version in the installation documentation
2022-12-06 23:40:35 -08:00
Yuan-Zhuo
7fdbbcda82 agent: Drop the Option for LinuxContainer.cgroup_manager
Cgroup manager for a container will always be created.
Thus, dropping the option for LinuxContainer.cgroup_manager
is feasible and could simplify the code.

Fixes: #5778

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-12-07 13:40:38 +08:00
Alexandru Matei
d04d45ea05 runtime: use pidfd to wait for processes on Linux
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic

Fixes: #5617

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:31:05 +02:00
Alexandru Matei
e9ba0c11d0 runtime: use exponential backoff for process wait
Initial wait period between checks is 1ms, and the
next ones are min(wait_period*5, 50ms)

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:30:58 +02:00
James Tumber
748f22e7d0 agent: remove sysinfo dependency
Removes the redundant dependency `sysinfo`.

Fixes: #5843

Signed-off-by: James Tumber <james.tumber@ibm.com>
2022-12-06 10:18:53 +00:00
Quanwei Zhou
0019d653d6 runtime-rs: fix high cpu
Fixed the issue when using nonblocking, the `tokio::io::copy()` needing
to handle EAGAIN, resulting in high CPU usage.

Fixes: #5740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-12-06 14:25:33 +08:00
Chao Wu
326d589ff5
Merge pull request #5822 from liubin/fix/5820-var-name-and-typo
runtime-rs: fix some variable names and typos
2022-12-06 14:24:11 +08:00
Zhongtao Hu
c12bb5008d
Merge pull request #5769 from jongwu/check_host_arm
kata-ctl: add host check for aarch64
2022-12-06 14:05:52 +08:00
Chen Yiyang
46b38458af
docs: Update the rust version in the installation documentation
Rust version in the installation documentation does not match the
requirements. Just fix it.

Fixes: #5841

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-12-06 12:50:32 +08:00
Chao Wu
538bddf4ee
Merge pull request #5811 from tzY15368/fix-katactl-conflict-dependency
kata-ctl: fix dependency version conflict
2022-12-06 10:44:48 +08:00
Alexandru Matei
71491a69c3 runtime: move process wait logic to another function
extract process wait logic to another function

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:32:04 +02:00
Alexandru Matei
92ebe61fea runtime: reap force killed processes
reap child processes after sending SIGKILL

Fixes #5739

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:31:58 +02:00
Xuewei Niu
fdf0a7bb14 runtime-rs: fix the issues mentioned in the code review
Removed the `Debug` trait for the `ShareFs` and etc. Renamed
`ShareFsMount::upgrade()` and `ShareFsMount::downgrade()` to
`upgrade_to_rw()` and `downgrade_to_ro()`. Protected `mounted_info_set`
with a mutex to avoid race conditions.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 11:18:26 +08:00
Xuewei Niu
1d823c4f65 runtime-rs: umount and permission controls in sandbox level
This commit implemented umonut controls and permission controls. When a volume
is no longer referenced, it will be umounted immediately. When a volume mounted
with readonly permission and a new coming container needs readwrite permission,
the volume should be upgraded to readwrite permission. On the contrary, if a
volume with readwrite permission and no container needs readwrite, then the
volume should be downgraded.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
Xuewei Niu
527b871414 runtime-rs: bind mount volumes in sandbox level
Implemented bind mount related managment on the sandbox side, involving bind
mount a volume if it's not mounted before, upgrade permission to readwrite if
there is a new container needs.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
Bin Liu
9ccf2ebe8a agent: add signal value to log
For signal_process call, log the signal value in logs.

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:53:58 +08:00
Bin Liu
fb2c142f18 runtime-rs: fix some variable names and typos
Fix some not perfect variable names, and some typos in logs.

Fixes: #5820

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:52:34 +08:00