Commit Graph

397 Commits

Author SHA1 Message Date
James O. D. Hunt
bdb83f8282 runtime-rs: ch: Remove unused function
Remove the redundant `parse_mac()` function: this was never used and we
already have an implementation in `crates/resource/src/network/utils/mod.rs`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
Beraldo Leal
c5d845b30a agent: updating Cargo.lock files
Probably previous changes missed updating Cargo.lock.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Archana Shinde
036b7787dd runtime-rs: Use PCI path from hypervisor for vfio devices
Remove earlier functionality that tries to assign PCI path to vfio
devices from the host assuming pci slots to start from 1.
Get this from the hypervisor instead.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
c3ce6a1d15 runtime-rs: Provide PCI path to the agent for virtio-block
If PCI path for block device is not empty for a block device, use
that as identifier for agent instead of virt path which is valid only
for mmio devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
a2bbbad711 runtime-rs: change hypervisor add_device trait to return device copy
Block(virtio-blk) and vfio devices are currently not handled correctly
by the agent as the agent is not provided with correct PCI paths for
these devices.

The PCI paths for these devices can be inferred from the PCI information
provided by the hypervisor when the device is added.
Hence changing the add_device trait function to return a device copy
with PCI info potentially provided by the hypervisor. This can then be
provided to the agent to correctly detect devices within the VM.

This commit includes implementation for PCI info update for
cloud-hupervisor for virtio-blk devices with stubs provided for other
hypervisors.

Removing Vsock from the DeviceType enum as Vsock currently does not
implement the Device Trait, it has no attach and detach trait functions
among others. Part of the reason is because these functions require Vsock
to implement Clone trait as these functions need cloned copies to be
passed down the hypervisor.

The change introduced for returning a device copy from the add_device
hypervisor trait explicitly requires a device to implement
Copy trait. Hence removing Vsock from the DeviceType enum for now, as
its implementation is incomplete and not currently used.

Note, one of the blockers for adding the Clone trait to Vsock is that it
currently includes a file handle which cannot be cloned. For Clone and
Device Traits to be implemented for Vsock, it requires an implementation
change in the future for it to be cloneable.

Fixes: #8283

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Ruoqing He
4ad2cfe0c2 runtime-rs: Log system enhancement
By modifying RuntimeLevelFilter drain to improve logging control,
enabling isolation of change effect of the loggers between components,
tuning clh logs to be logged according to their log levels
given by cloud-hypervisor.

Fixes: #8310

Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>
2023-10-31 04:57:46 +00:00
Peng Tao
52a014d9cd Merge pull request #8033 from h56983577/6715/shared-mount
agent: use open_tree()/move_mount() to set up bind mounts between containers directly.
2023-10-28 10:57:34 +08:00
Archana Shinde
f5c17f89a3 Merge pull request #8250 from amshinde/runtime-rs-clh-config
runtime-rs: Add default configuration file for cloud-hypervisor
2023-10-26 14:54:47 -07:00
HanZiyao
a3b003c345 agent: support bind mounts between containers
This feature supports creating bind mounts directly between containers through annotations.

Fixes: #6715

Signed-off-by: HanZiyao <h56983577@126.com>
2023-10-26 16:34:50 +08:00
Archana Shinde
f99de4d5a1 runtime-rs: Make default kernel params as empty
The default kernel params passed to any hypervisor except dragonball is
empty.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:50:12 -07:00
Archana Shinde
a813012785 runtime-rs: Add default configuration file for clouf-hypervisor
The config template file for clh is in the new format for runtime-rs.
It is a result of merging the new format file and options supportted by
cloud-hypervisor.

Some config options from the golang runtime are missing as they may not
be currently supported by the rust runtime. An example of this is the
selinux options, rate limiting options as these are not currently
supported or verified with the rust runtime.

Fixes: #8249

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:17:24 -07:00
Zizheng Bian
7d7c25c1d6 runtime-rs: fix a typo in device manager
Fixes: #8293
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2023-10-23 20:33:47 +08:00
James O. D. Hunt
9336e2e492 Merge pull request #8155 from jodh-intel/runtime-rs-check-ch-tdx-build-feature
runtime-rs: ch: Add TDX CH features check
2023-10-19 14:13:08 +01:00
James O. D. Hunt
0e0867f15d runtime-rs: ch: Add TDX CH features check
If you attempt to create a container (a TD) on a TDX system using a
custom build of Cloud Hypervisor (CH) that was not built with the `tdx`
CH feature, Kata will report the following, somewhat cryptic, CH error:

```
ApiError(VmBoot(InvalidPayload))
```

Newer versions of CH now report their build-time features in the ping
API response message so we now use that, if available, to detect this
scenario and generate a user-friendly error message instead.

This changes improves the readability of `handle_guest_protection()` and
adds a couple of additional tests for that method.

Fixes: #8152.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:07:39 +01:00
James O. D. Hunt
409eadddb2 runtime-rs: ch: Improve readability of guest protection checks
Improve the way `handle_guest_protection()` is structured by inverting
the logic and checking the value of the `confidential_guest` setting
before checking the guest protection. This makes the code easier to
understand.

> **Notes:**
>
> - This change also unconditionally saves the available guest protection
>   (where previously it was only saved when `confidential_guest=true`).
>   This explains the minor unit test fix.
>
> - This changes also errors if the CH driver finds an unexpected
>   protection (since only Intel TDX is currently tested).

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-18 18:06:02 +01:00
Chao Wu
408b59c02c runtime-rs: fix bugs to support Nydus v5
1. enable virtio-fs-pro in Dragonball to have the ability to process nydus backend registry
2. change passthrough for rw layer's readonly config to false to have the accurate read write ability.

Fixes:#8013

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-10-16 10:22:21 +08:00
James O. D. Hunt
45d28998d9 Merge pull request #8149 from jodh-intel/runtime-rs-ch-detect-tdx-version
runtime-rs: ch: Detect Intel TDX version
2023-10-12 10:09:42 +01:00
James O. D. Hunt
87b760f569 runtime-rs: ch: Detect Intel TDX version
Improve the `GuestProtection` handling to detect the version of
Intel TDX available.

The TDX version is now logged by the Cloud Hypervisor driver.

Fixes: #8147.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-11 09:38:00 +01:00
Archana Shinde
8d6f7b9096 runtime-rs: Add support for handling vfio device for cloud-hypervisor
This change adds support for adding and removing vfio devices for
 cloud-hypervisor.

Fixes: #6691

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-10 12:25:44 -07:00
James O. D. Hunt
b8a46a4b85 runtime-rs: ch: Enable feature
Enable the Cloud Hypervisor driver (the `cloud-hypervisor` build feature) for the rust runtime.

Fixes: #6264.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-05 17:58:39 +01:00
James O. D. Hunt
b0a3293d53 runtime-rs: ch: Enable Intel TDX
Allow Cloud Hypervisor to create a confidential guest (a TD or
"Trust Domain") rather than a VM (Virtual Machine) on Intel systems
that provide TDX functionality.

> **Notes:**
>
> - At least currently, when built with the `tdx` feature, Cloud Hypervisor
>   cannot create a standard VM on a TDX capable system: it can only create
>   a TD. This implies that on TDX capable systems, the Kata Configuration
>   option `confidential_guest=` must be set to `true`. If it is not, Kata
>   will detect this and display the following error:
>
>   ```
>   TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true')
>   ```
>
> - This change expands the scope of the protection code, changing
>   Intel TDX specific booleans to more generic "available guest protection"
>   code that could be "none" or "TDX", or some other form of guest
>   protection.

Fixes: #6448.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 10:55:25 +01:00
James O. D. Hunt
523399c329 runtime-rs: ch: Add more consts
Introduce a few new constants (for PCI segment count and FS queues) and
move the disk queue constants to `convert.rs` to allow them to be used
there too.

> **Note:**
>
> This change gives the `ShareFs` code it's own set of values rather
> than relying on the disk queue constants.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
dea8065811 runtime-rs: ch: Remove unused function
Delete the `handle_pending_devices_after_boot()` function which is no
longer required.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
James O. D. Hunt
995f2c015f runtime-rs: ch: Only handle particular pending device types
Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and
`Network` devices to the list of pending devices since only these two
device types need to be cached before VM startup. Full details in the
comments.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-26 08:41:32 +01:00
Archana Shinde
9bb9a3e7a4 Merge pull request #7966 from amshinde/runtime-rs-network-clh
runtime-rs: Add network support for cloud-hypervisor
2023-09-22 13:08:09 -07:00
Chao Wu
6f98fbafde Merge pull request #6706 from guixiongwei/feat/thp
feat(runtime-rs): introduce huge page mode to select VM RAM's backend
2023-09-22 15:27:06 +08:00
Fabiano Fidêncio
08f2e5ae0b runtime-rs: Ensure static-checks-build is a dep of make test
Otherwise `make test` will simply fail with:
```
error[E0583]: file not found for module `config`
```

Fixes: #7974 -- part 0

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 12:53:13 +02:00
Archana Shinde
9c233bb9e0 test: Add test to verify try_from for clh Netconfig
Add tests to verify conversion from runtime NetworkConfig
to clh specific config.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-16 00:24:14 -07:00
Archana Shinde
9049d311df runtime-rs: Add network support for cloud-hypervisor
This PR adds support for adding a network device before starting the
cloud-hypervisor VM.

Support for adding and removing network devices is not really added to
the resource manager, so supporting this for cloud-hypervisor is not
scoped in this PR.

This also changes "pending_devices" for clh implementation from an
Option of vector to simply a vector. This simplifies the structure a bit
as we can simple iterate over the pending devices instead of having to
check for a "Some" value as this is not really required.

Fixes: #6333

Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-09-15 23:25:20 -07:00
James O. D. Hunt
7feb8de9dc Merge pull request #7887 from jodh-intel/hypervisor-remove-debug-kernel-options
runtime-rs: hypervisor: Remove debug kernel options
2023-09-12 16:31:48 +01:00
stevenhorsman
1d8b78959d runtime-rs: Fix useless-vec warning
Fix clippy::useless-vec warning

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
stevenhorsman
99f3d69e94 runtime-rs: Remove mut
Fix `error: variable does not need to be mutable`

Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-09-12 11:31:49 +01:00
Yipeng Yin
a16b0962b5 chore(cargo): update cargo lock
Update cargo lock for runtime-rs, agent and kata-ctl.

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 15:27:38 +08:00
Guixiong Wei
202049f35e feat(runtime-rs): introduce huge page type to select VM RAM's backend
This commit allows us to specify the huge page backend when enabling huge
page. Currently, we support two backends: thp and hugetlbfs, the default
is hugetlbfs.

To ensure backward compatibility, we introduce another configuration item
"hugepage_type" to select the memory backend, which is available only when
"enable_hugepages" is true. Besides, we add an annotation
"io.katacontainers.config.hypervisor.hugepage_type" to configure huge page
type per pod.

Fixes: #6703

Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 11:28:27 +08:00
Zhongtao Hu
e1f54f96d0 Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs
runtime-rs: bring hybrid vsock devices in manager.
2023-09-12 09:27:34 +08:00
James O. D. Hunt
c0f697fcc5 runtime: Allow kernel_params annotation
To support the removal of the `initcall_debug` and `earlyprintk=`
options from the default guest kernel cmdline, add `kernel_params` to the list
of enabled annotations to allow those kernel options (or others) to be
set using `kata-deploy` for either runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 12:12:12 +01:00
James O. D. Hunt
976d10150c runtime-rs: hypervisor: Remove debug kernel options
Removed the following kernel command line options:

- `earlyprintk=ttyS0`
- `initcall_debug`

Both these options are only useful when debugging a guest kernel failure
which is not a common occurrence.

Further, the `earlyprintk=` option can have a large negative performance
impact (it can increase the VM boot time significantly).

If the user wishes to use either of these options, they can add them to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Fixes: #7886.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:43:39 +01:00
Zhongtao Hu
aa85e0b3ec Merge pull request #7714 from justxuewei/volumes-cleanup
runtime-rs: Fix volumes and rootfs cleanup issues
2023-09-06 10:13:55 +08:00
alex.lyn
7870b33a2d runtime-rs: bring hybridVsock devices in manager.
Currently, virtio_vsock are still outside of the device
manager. This causes some management issues,such as the
inability to unify PCI address management.

Just do some work for hybrid vsock.

Fixes: #7655

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-09-05 08:46:56 +08:00
Zixuan Tan
dffc16e5b3 runtime-rs: check peer close in log_forwarder
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop during the period of “kata vm exit”, when the peer
closed, and “ShutdownContainer RPC received” that aborts the log forwarder.

This patch fixes the problem.

Fixes: #7741

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2023-08-25 19:00:07 +08:00
Xuewei Niu
268e846558 runtime-rs: Fix volumes and rootfs cleanup issues
There are several processes for container exit:

- Non-detach mode: `Wait` request is sent by containerd, then
  `wait_process()` will be called eventually.
- Detach mode: `Wait` request is not sent, the `wait_process()` won’t be
  called.
    - Killed by ctr: For example, a container runs `tail -f /dev/null`, and
      is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is
      sent, then `kill_process()` will be called. User executes `sudo ctr c
      rm <CID>`, `Delete` request is sent, then `delete_process()` will be
      called.
    - Exited on its own: For example, a container runs `sleep 1s`. The
      container’s state goes to `Stopped` after 1 second. User executes
      the delete command as below.

Where do we do container cleanup things?

- `wait_process()`: No, because it won’t be called in detach mode.
- `delete_process()`: No, because it depends on when the user executes the
  delete command.
- `run_io_wait()`: Yes. A container is considered exited once its IO ended.
  And this always be called once a container is launched.

Fixes: #7713

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-24 13:23:47 +08:00
Zhongtao Hu
d90f7ac689 runtime-rs: add unit test for block driver
add unit test for block driver

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:45:27 +08:00
Zhongtao Hu
e44919f0da runtime-rs: add load_test_config for unit test
add load_test_config for unit test

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:56 +08:00
Zhongtao Hu
7f48a69379 runtime-rs: add driver option
add driver option when handle linux devices

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:49 +08:00
Chao Wu
b098960442 Merge pull request #7581 from justxuewei/bump-versions
deps: Bump dependent crate versions
2023-08-08 15:16:57 +08:00
Chao Wu
24bf637835 Merge pull request #7500 from pmores/fix-queue-num-in-dragonball-share-fs
fix number of queues handling in dragonball share fs device
2023-08-08 12:07:25 +08:00
Xuewei Niu
b23c5ed155 deps: Bump dependent crate versions
This pull request is mainly for updating vm-memory and vmm-sys-util.

The affacted crates include:

- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4

Fixes: #0000

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-08 11:54:09 +08:00
Xuewei Niu
3958a39d07 runtime-rs: Introduce directly attachable network
Kata containers as VM-based containers are allowed to run in the host
netns. That is, the network is able to isolate in the L2. The network
performance will benefit from this architecture, which eliminates as many
hops as possible. We called it a Directly Attachable Network (DAN for
short).

The network devices are placed at the host netns by the CNI plugins. The
configs are saved at {dan_conf}/{sandbox_id}.json in the format of JSON,
including device name, type, and network info. At the very beginning stage,
the DAN only supports host tap devices. More devices, like the DPDK, will
be supported in later versions.

The format of file looks like as below:

```json
{
	"netns": "/path/to/netns",
	"devices": [{
		"name": "eth0",
		"guest_mac": "xx:xx:xx:xx:xx",
		"device": {
			"type": "vhost-user",
			"path": "/tmp/test",
			"queue_num": 1,
			"queue_size": 1
		},
		"network_info": {
			"interface": {
				"ip_addresses": ["192.168.0.1/24"],
				"mtu": 1500,
				"ntype": "tuntap",
				"flags": 0
			},
			"routes": [{
				"dest": "172.18.0.0/16",
				"source": "172.18.0.1",
				"gateway": "172.18.31.1",
				"scope": 0,
				"flags": 0
			}],
			"neighbors": [{
				"ip_address": "192.168.0.3/16",
				"device": "",
				"state": 0,
				"flags": 0,
				"hardware_addr": "xx:xx:xx:xx:xx"
			}]
		}
	}]
}
```

Fixes: #1922

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-03 15:33:34 +08:00
Chelsea Mafrica
a81ad3b587 runtime-rs: Add block device handling in cloud hypervisor
Add functions for adding a block device to a container for CH.

Fixes #6690

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-08-02 09:18:48 -07:00
Fupan Li
1a6b27bf6a Merge pull request #5797 from Yuan-Zhuo/add-metrics-for-runtime-rs
runtime-rs: add support for gather metrics in runtime-rs
2023-08-02 13:40:22 +08:00