Commit Graph

557 Commits

Author SHA1 Message Date
Tobin Feldman-Fitzthum
a18c7ca307 runtime: remove unimplemented CoCo configurations
These experimental options were added 2 years ago
in anticipation of features that would be added
in CoCo. These do not match the features that were
eventually added and will soon be ported to main.

Fixes: #8047

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2024-03-27 12:21:06 -05:00
Hyounggyu Choi
d915a79e2d Merge pull request #9280 from BbolroC/enable-qemu-on-s390x
runtime-rs: Enable qemu on s390x
2024-03-22 23:58:42 +01:00
Hyounggyu Choi
81aaa34bd6 runtime-rs: Add DeviceVirtioSerial and DeviceVirtconsole
It is observed that virtiofsd exits immediately on s390x
if there is no attached console devices.
This commit resolves the issue by migrating `appendConsole()`
from runtime and being triggered in `start_vm()`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
2cfe745efb runtime-rs: Enable memory backend option for Machine for s390x
For s390x, it requires an additional option `memory-backend` for `-machine`.
Otherwise, virtiofsd exits with HandleRequest(InvalidParam).

This commit is to add a field `memory_backend` to `struct Machine`
and turn it on for s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
Hyounggyu Choi
9bcfaad625 runtime-rs: Add ccw block device for rootfs
Like nvdimm for x86_64, a block device for s390x should be
treated differently with `virtio-blk-ccw`.
This is to generate a QEMU command line parameter for a block
device by using `-blockdev` and `-device` if the `vm_rootfs_driver`
is set to `virtio-blk-ccw`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-22 19:27:13 +01:00
David Esparza
b498e140a1 runtime-rs: ch: Implement full thread/tid/pid handling
Add in the full details once cloud-hypervisor/cloud-hypervisor#6103
has been implemented, and the feature is available in a Cloud Hypervisor
release.

Fixes: #8799

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-03-21 08:24:53 -06:00
Hyounggyu Choi
9b2c08935b runtime-rs: Pass different device argument based on bus type
Currently, `*-pci` is used as an argument for the device config.
It is not true for a case where a different type of bus is used.
s390x uses `ccw`.
This commit is to make it flexible to generate the device argument
based on the bus type. A structure `DeviceVhostUserFsPci` and
`VhostVsockPci` is renamed to `DeviceVhostUserFs` and `VhostVsock`
because the structure name is not bound to a certain bus type any more.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-21 09:25:37 +01:00
Hyounggyu Choi
7b3d1adb8c libs: Bump sysinfo to v0.30.5
It has been observed that the runtime stops running around
`sysinfo::total_memory()` while adjusting a config on s390x.
This is to update the crate to the latest version which happened
to resolve the issue. (No explicit release note for this)

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-20 09:27:13 +01:00
Chelsea Mafrica
42dfe0e8d1 Merge pull request #9286 from jodh-intel/agent-show-enabled-features
agent: Show features enabled at build time
2024-03-19 08:54:49 -07:00
Bo Chen
ad4262e86b runtime-rs: ch: Provide valid default value for NetConfig
The current default value of IP `0.0.0.0` with mask `0.0.0.0` will cause
ioctl error when being used to create and configure TAP device, with
newer version of Cloud Hypervisor [1]. This patch replaces them with
valid value that are the same as the Go-lang runtime [2].

[1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/5924
[2] e3f7852738/src/runtime/virtcontainers/pkg/cloud-hypervisor/client/model_net_config.go (L40-L57)

Fixes: #9254

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-18 15:47:58 -07:00
Greg Kurz
1e526a4769 runtime-rs: Consolidate the handling of fds passed to QEMU
File descriptors that are passed to QEMU need some special care.
We want them to be closed when the QEMU process is started. But
at the same time, it is required that the associated rust File
structures, either coming from the` std::fs` or the `tokio::fs`
crates, are still in scope when the QEMU process is forked. This
is currently achieved by keeping File structures in variables
at the outer scope of `start_vm()`. This scheme is currently
duplicated, with similar justifications in the corresponding
comments.

Consolidate all this handling in one place with a more generic
explanation.

Fixes #9281

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-15 16:14:59 +01:00
James O. D. Hunt
9ef59488d9 agent: Show features enabled at build time
The agent now has a number of optional build-time features that can be
enabled.

Add details of these features to the following areas:

- Version output (`kata-agent --version`)
- Announce message (so that the details are always added to the journal
  at agent startup).
- The response message returned by the ttRPC `GetGuestDetails()` API.

Fixes: #9285.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2024-03-15 13:29:21 +00:00
Greg Kurz
6a112cc7a5 runtime-rs: Fix missing dependency
Some previous contribution missed to run cargo clippy.
Fix the dependency now so that it doesn't cause noise
in future contributions.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-03-14 23:19:38 +01:00
Hyounggyu Choi
1dac6b1357 runtime-rs: Configure s390x specific flags for Makefile
s390x supports a different machine type `s390-ccw-virtio` and it is
not required to configure cpu features by default for the platform.
A hypervisor `dragonball` is not supported on s390x so that `DBCMD`
is not necessary. `vm-rootfs_driver` should be set to `virtio-blk-ccw`.
This commit is to set the architecture-specific flags for Makefile.

Fixes: #9158

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-03-14 13:05:35 +01:00
Alex Lyn
e2ae8ba79b runtime-rs: add network device into Qemu's cmdline
It will open tuntap device and vhost-net device
and store device files.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:28:54 +08:00
Alex Lyn
d3bca4597e runtime-rs: add open_named_tuntap to open a named tuntap device.
The open_named_tuntap function is designed as a public function to
open a tuntap device with the specified name. However, in order to
reference existing methods in dbs_utils, we still need to keep the
reference "path = "../../../dragonball/src/dbs_utils" in dependencies
and cannot hide it.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:26:32 +08:00
Alex Lyn
005b333976 runtime-rs: add network helpers and impl ToQemuParams
Add network helpers and impl ToQemuParams trait to build
netdev params which are put into cmdline for Qemu VM running.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:25:39 +08:00
Alex Lyn
63786934f4 runtime-rs: set network namespace for qemu process and netdev.
We need ensure the add_network_device happens in netns and
move qemu process into netns which keeps the qemu process
running in this net namespace.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:21:43 +08:00
Alex Lyn
69a5e5b955 runtime-rs: add network device handler in start_vm.
Add network device handler in start_vm, which is sepcially
for Qemu VM running with added net params to command line.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-12 22:18:01 +08:00
Alex Lyn
9f6003adde runtime-rs: add a new netns field in struct QemuInner.
We need add a new netns field in struct QemuInner, and
initialize it with argument passed down in prepare_vm().

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 16:02:39 +08:00
Alex Lyn
f571ec84d2 runtime-rs: add a public method to support process entering netns.
The enter_netns function is designed as a public method to help
VMMs running as a independent process enter a network namespace,
reducing duplicate code.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 15:55:52 +08:00
Alex Lyn
4176fcc3c6 runtime-rs: make the code for cleanup fd flags as public method.
It just move the related code to a public file(utils.rs) and make
it a common method for both vsock and network, or some others.

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-03-11 15:52:20 +08:00
Alex Lyn
b1038704e0 runtime-rs: make NetnsGuard common for hypervisor and resource.
In order to better support non-builtin vmm usage of NetnsGuard and
reduce code duplication, we need to move it to a common path that
can be referenced by both hypervisor and resource manager.

In this patch, it just do moving code from network/utils/netns.rs
to kata-sys-utils/src/netns.rs

Fixes: #8865

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-11 15:38:42 +08:00
Alex Lyn
7145243bd3 kata-ctl: Support using container short ID to enter guest.
Fixes: #9189

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-03-07 15:44:47 +08:00
Julien Ropé
1c306fe4a6 runtime-rs: stop reporting net dev metrics for the shim
For consistency with the go runtime.
As the shim itself is not using the network (all its communication with
other processes is done with local unix sockets), there is no reason to
keep gathering and reporting shim-specific network metrics.
Actual network usage of the kata containers can be found from the existing
agent network metrics (kata_guest_netdev_stat).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-02-22 14:00:00 +01:00
Alex Lyn
014e0f4e46 runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.
We need initailize the pci_hotplug_enabled with true before we do GPU
passthrough with runtime-rs/dragonball. Otherwise it fails with error
`InvalidOperation`.

Fixes: #9129

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2024-02-22 10:22:32 +08:00
Greg Kurz
d7afd31fd4 Merge pull request #8455 from BbolroC/runtime-rs-qemu-config
runtime-rs: Add a new config option for QEMU
2024-02-10 08:48:23 +01:00
Hyounggyu Choi
05c4c8055c runtime-rs: Configure argument replacement for QEMU in Makefile
Last but not least, all placeholders for argument replacement
should be configured to generate a configuration file when `QEMUCMD`
is defined. This enriches those variables.

Additionally, this involves creating a symbolic link to `configuration-qemu.toml`
if QEMU is defined as the default hypervisor.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-09 19:31:20 +01:00
Hyounggyu Choi
27cb30d8ce runtime-rs: Adjust configuration template for runtime-rs
There are some variables newly introduced to runtime-rs, such as:

- runtime.name
- runtime.hypervisor_name
- runtime.agent_name
- vm_rootfs_driver

Additionally some of the placeholders for argument replacement are
made hypervisor-specific based on the changes made for dragonball.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-09 16:26:59 +01:00
Greg Kurz
6ead48ec06 Merge pull request #8986 from pmores/drop-shim-v2-address-value-validation
runtime-rs: fix interoperability issues between runtime-rs and cri-o
2024-02-08 09:44:12 +01:00
Pavel Mores
6346e04cf7 runtime-rs: fix handling of TTRCP_ADDRESS
Since cri-o doesn't seem to use address for event publishing as mentioned
in the previous commit it will not send it.  However, the exact way of
not sending it is unfortunately different from what is assumed by
runtime-rs.  Due to an implementation detail of cri-o which uses containerd
libraries for some low-level tasks, TTRPC_ADDRESS will not be missing from
environment as assumed, instead it will be present with an empty value.

This commit contains a small adjustment to account for that and use
LogForwarder even if TTRPC_ADDRESS is present, but with an empty value.

Fixes #8985

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-02-07 17:01:04 +01:00
ChengyuZhu6
34c47e08b2 runtime-rs: fix assert error in test in make check
Fix assert error:
error: used `assert_eq!` with a literal bool
   --> crates/hypervisor/src/ch/inner.rs:218:9
    |
218 |         assert_eq!(state.jailed, false);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#bool_assert_comparison
    = note: `-D clippy::bool-assert-comparison` implied by `-D warnings`

Fixes: #9042

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-02-07 19:31:10 +08:00
Hyounggyu Choi
462afcf829 runtime-rs: Copy configuration for QEMU from runtime
It makes sense to reuse a configuration template for runtime-golang
as a base. This is simply to copy it into the config directory.

Fixes: #8441

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-06 19:35:44 +01:00
Pavel Mores
f0256fded5 runtime-rs: remove validation of shim v2 -address value
It appears that under the shim v2 protocol, a shim has no use of its own
for the -address value, it just passes it back to container runtime's
(mostly containerd or cri-o) event-publishing binary.  Since the -address
value only flows through the shim, being passed to the shim by a container
runtime and then essentially passed back by shim to the container runtime,
it seems inappropriate for a shim to validate the value that is fully
owned and only used by the container runtime.

This commit removes such validation from runtime-rs.  Doing so, it solves
(part of) an interoperability problem between runtime-rs and cri-o.  cri-o
seems to intentionally choose not to implement the event-publishing part
of the shim v2 protocol and thus it has no value it could pass to
runtime-rs for -address.  As a result, it sends an empty string which has
been failing the excessive validation performed by runtime-rs so far.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-02-06 13:43:09 +01:00
Archana Shinde
b3c74411f6 runtime-rs: Add tests for persist api for clh
Add tests to check clh struct is saved/restored correctly.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:57 -08:00
Archana Shinde
0b78296dca runtime-rs: Store additional field for hypervisor state
Implementing Persist API for cloud-hypervisor was done partially with
initial support for cloud-hypervisor. Store and retrieve additional
fields to/from the hypervisor state.

Fixes: #6202

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:57 -08:00
Archana Shinde
a5f0b92bca runtime-rs: Add guest protection to hypervisor state
Store guest-protection used while storing the state of the hypervisor.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-02-04 22:03:54 -08:00
Greg Kurz
d1a26ead94 Merge pull request #8454 from BbolroC/compile-with-qemu-s390x
runtime-rs: make compilation for QEMU on s390x
2024-02-02 09:29:32 +01:00
Hyounggyu Choi
bb6f5073aa runtime-rs: Allow compilation for s390x
Until now, runtime-rs couldn't be compiled on s390x.
We need to lift those restrictions in Makefile first.

Fixes: #8446

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 23:48:15 +01:00
Hyounggyu Choi
8fcee6e6ec runtime-rs: Use Persist::restore() of QEMU for VirtSandbox
It fails to compile virt_container because Dragonball is only
used in the implementation of the trait method Persist::restore().
As the hypervisor is not compiled on s390x and QEMU implements
the trait method, this commit is to let the method use QEMUi's.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 18:02:10 +01:00
Hyounggyu Choi
56aef3741d runtime-rs: Exclude hypervisors plugins except QEMU for s390x
Dragonball and cloud-hypervisor are not supported on s390x.
We need to exclude the plugins for these hypervisors from compilation.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-02-01 18:02:10 +01:00
Xuewei Niu
2332552c8f Merge pull request #7483 from frezcirno/passfd_io_feature
runtime-rs: improving io performance using dragonball's vsock fd passthrough
2024-02-01 14:53:53 +08:00
Greg Kurz
8b1dc06971 Merge pull request #8938 from pmores/log-qemus-stderr-in-shim-log
runtime-rs: Log qemu's stderr in shim log
2024-01-31 18:04:28 +01:00
Zixuan Tan
6e4d4c329a agent,runtime-rs: Add license header to passfd_io.rs
Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
f6710610d1 agent,runtime-rs,runk: fix fmt and clippy warnings
Fix rustfmt and clippy warnings detected by CI.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
89be42a177 runtime-rs: open stdout and stderr fifos NONBLOCK
This patch adds O_NONBLOCK flag when open stdout and stderr FIFOs
to avoid blocking.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Fupan Li
cfb262d02f container: keep the io connection when pass fd to hybrid vsock
We want the io connection keep connected when the containerd closed
the io pipe, thus it can be attached on the io stream.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
5536743361 agent,runtime-rs: fix container io detach and attach
Partially fix some issues related to container io detach and attach.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
657b17a86f runtime-rs: open stdin fifo with RDWR|NONBLOCK when pass vsock streams
In linux, when a FIFO is opened and there are no writers, the reader
will continuously receive the HUP event. This can be problematic
when creating containers in detached mode, as the stdin FIFO writer
is closed after the container is created, resulting in this situation.

In passfd io mode, open stdin fifo with O_RDWR|O_NONBLOCK to avoid the
HUP event.

Fixes: #6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00
Zixuan Tan
442df71fe5 agent,runtime-rs: refactor process io using vsock fd passthrough feature
Currently in the kata container, every io read/write operation requires
an RPC request from the runtime to the agent. This process involves
data copying into/from an RPC request/response, which are high overhead.

To solve this issue, this commit utilize the vsock fd passthrough, a
newly introduced feature in the Dragonball hypervisor. This feature
allows other host programs to pass a file descriptor to the Dragonball
process, directly as the backend of an ordinary hybrid vsock connection.

The runtime-rs now utilizes this feature for container process io. It
open the stdin/stdout/stderr fifo from containerd, and pass them to
Dragonball, then don't bother with process io any more, eliminating
the need for an RPC for each io read/write operation.

In passfd io mode, the agent uses the vsock connections as the child
process's stdin/stdout/stderr, eliminating the need for a pipe
to bump data (in non-tty mode).

Fixes: #6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2024-01-31 21:07:48 +08:00