Commit Graph

943 Commits

Author SHA1 Message Date
Hyounggyu Choi
150c90e32a Merge pull request #11728 from BbolroC/fix-sealed-secret-volume
runtime-rs: Adjust path for sealed secret mount check
2025-09-02 16:57:33 +02:00
Xuewei Niu
f6ff9cf717 Merge pull request #11689 from Caspian443/fix-devmapper-selinux-mount-issue
runtime-rs: Empty block-rootfs Storage.options and align with Go runtime
2025-08-29 15:29:46 +08:00
Hyounggyu Choi
65fdb18c96 runtime-rs: Adjust path for sealed secret mount check
Mount validation for sealed secret requires the base path to start with
`/run/kata-containers/shared/containers`. Previously, it used
`/run/kata-containers/sandbox/passthrough`, which caused test
failures where volume mounts are used.

This commit renames the path to satisfy the validation check.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-08-28 15:38:07 +02:00
Caspian443
617af4cb3b runtime-rs: Empty block-rootfs Storage.options and align with Go runtime
- Set guest Storage.options for block rootfs to empty (do not propagate host mount options).
- Align behavior with Go runtime: only add xfs nouuid when needed.

Signed-off-by: Caspian443 <scrisis843@gmail.com>
2025-08-26 01:27:21 +00:00
Caspian443
9a7aadaaca libs: Introduce rootfs fs types
- Add new kata-types::fs module with:
  - VM_ROOTFS_FILESYSTEM_EXT4
  - VM_ROOTFS_FILESYSTEM_XFS
  - VM_ROOTFS_FILESYSTEM_EROFS
- Export fs module in src/libs/kata-types/src/lib.rs
- Remove duplicated filesystem constants from src/runtime-rs/crates/hypervisor/src/lib.rs
- Update src/runtime-rs/crates/hypervisor/src/kernel_param.rs (and tests) to import from kata_types::fs

Signed-off-by: Caspian443 <scrisis843@gmail.com>
2025-08-26 01:26:53 +00:00
Alex Lyn
91913f9e82 Merge pull request #11711 from stevenhorsman/remote-allow-cc_init_data-annotation
runtime: Enable init_data annotation
2025-08-22 14:41:53 +08:00
Fupan Li
1a0fbbfa32 Merge pull request #11699 from Apokleos/support-nonprotection
runtime-rs: Support initdata within NonProtection scenarios
2025-08-22 10:24:47 +08:00
stevenhorsman
081823b388 runtime: Enable init_data annotation
In #11693 the cc_init_data annotation was changes to be hypervisor
scoped, so each hypervisor needs to explicitly allow it in order to
use it now, so add this to both the go and rust runtime's remote
configurations

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-08-21 19:26:10 +01:00
Alex Lyn
e430727cb6 runtime-rs: Change the initdata device driver with block_device_driver
Currently, we change vm_rootfs_driver as the initdata device driver
with block_device_driver.

Fixes #11697

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-21 18:56:26 +08:00
Alex Lyn
5cc028a8b1 runtime-rs: Support initdata within NonProtection scenarios
we also need support initdat within nonprotection even though the
platform is detected as NonProtection or usually is called nontee
host. Within these cases, there's no need to validate the item of
`confidential_guest=true`, we believe the result of the method
`available_guest_protection()?`.

Fixes #11697

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-21 18:56:23 +08:00
Hyounggyu Choi
faf5aed965 runtime-rs: Adjust VSOCK timeouts for IBM SEL
The default `reconnect_timeout` (3 seconds) was found to be insufficient for
IBM SEL when using VSOCK. This commit updates the timeouts as follows:

- `dial_timeout_ms`: Set to 90ms to match the value used in go-runtime for IBM SEL
- `reconnect_timeout_ms`: Increased to 5000ms based on empirical testing

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-08-21 12:35:44 +02:00
Hyounggyu Choi
2ec70bc8e2 runtime-rs: Enable initdata spec for IBM SEL
Add support for the `InitData` resource config on IBM SEL,
so that a corresponding block device is created and the
initdata is passed to the guest through this device.

Note that we skip passing the initdata hash via QEMU’s
object, since the hypervisor does not yet support this
mechanism for IBM SEL. It will be introduced separately
once QEMU adds the feature.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-08-20 18:57:50 +02:00
Alex Lyn
014ab2fce6 Merge pull request #11693 from BbolroC/revert-initdata-annotation
runtime-rs: Fix issues for initdata
2025-08-20 21:17:52 +08:00
Hyounggyu Choi
0daafecef2 Revert "runtime-rs: Correct the coresponding initdata annotation const"
This reverts commit 37685c41c7.

This renames the relevant constant for initdata.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-08-20 10:15:23 +02:00
Hyounggyu Choi
208cec429a runtime-rs: Introduce CoCo-specific enable_annotations
We need to include `cc_init_data` in the enable_annotations
array to pass the data. Since initdata is a CoCo-specific
feature, this commit introduces a new array,
`DEFENABLEANNOTATIONS_COCO`, which contains the required
string and applies it to the relevant CoCo configuration.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-08-20 10:15:23 +02:00
Alex Lyn
903e608c23 runtime-rs: Add only static ARP entries with handle_neighours
To make it aligned with runtime-go, we need add only static ARP
entries into the targets.

Fixes #11697

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-19 20:09:20 +08:00
Zvonko Kaiser
dcb62a7f91 Merge pull request #11525 from was-saw/qemu-seccomp
runtime-rs: add seccomp support for qemu
2025-08-14 12:35:32 -04:00
wangxinge
d147e2491b runtime-rs: add seccomp support for qemu
This commit support the seccomp_sandbox option from the configuration.toml file
and add the logic for appending command-line arguments based on this new configuration parameter.

Fixes: #11524

Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>
2025-08-14 18:45:03 +08:00
wangxinge
f3a669ee2d runtime-rs: add seccomp support for cloud hypervisor and firecracker
The seccomp feature for Cloud Hypervisor and Firecracker is enabled by default.
This commit introduces an option to disable seccomp for both and updates the built-in configuration.toml file accordingly.

Fixes: #11535

Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>
2025-08-11 17:59:30 +08:00
Alex Lyn
196d7d674d runtime-rs: Label system journal log with kata
Route kata-shim logs directly to systemd-journald under 'kata' identifier.

This refactoring enables `kata-shim` logs to be properly attributed to
'kata' in systemd-journald, instead of inheriting the 'containerd'
identifier.

Previously, `kata-shim` logs were challenging to filter and debug as
they
appeared under the `containerd.service` unit.

This commit resolves this by:
1.  Introducing a `LogDestination` enum to explicitly define logging
targets (File or Journal).
2.  Modifying logger creation to set `SYSLOG_IDENTIFIER=kata` when
logging
to Journald.
3.  Ensuring type safety and correct ownership handling for different
logging backends.

This significantly enhances the observability and debuggability of Kata
Containers, making it easier to monitor and troubleshoot Kata-specific
events.

Fixes: #11590

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-10 16:00:36 +08:00
Alex Lyn
9816ffdac7 Merge pull request #11653 from Apokleos/align-initdata-annoation
Align initdata annoation with kata-runtime
2025-08-08 16:24:09 +08:00
Pavel Mores
e2156721fd runtime-rs: add tests to exercise floating-point 'default_vcpus'
Also included (as commented out) is a test that does not pass although
it should.  See source code comment for explanation why fixing this seems
beyond the scope of this PR.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
1f95d9401b runtime-rs: change representation of default_vcpus from i32 to f32
This commit focuses purely on the formal change of type.  If any subsequent
changes in semantics are needed they are purposely avoided here so that the
commit can be reviewed as a 100% formal and 0% semantic change.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Pavel Mores
cdc0eab8e4 runtime-rs: make sandbox vcpu allocation more accurate
This commit addresses a part of the same problem as PR #7623 did for the
golang runtime.  So far we've been rounding up individual containers'
vCPU requests and then summing them up which can lead to allocation of
excess vCPUs as described in the mentioned PR's cover letter.  We address
this by reversing the order of operations, we sum the (possibly fractional)
container requests and only then round up the total.

We also align runtime-rs's behaviour with runtime-go in that we now
include the default vcpu request from the config file ('default_vcpu')
in the total.

We diverge from PR #7623 in that `default_vcpu` is still treated as an
integer (this will be a topic of a separate commit), and that this
implementation avoids relying on 32-bit floating point arithmetic as there
are some potential problems with using f32.  For instance, some numbers
commonly used in decimal, notably all of single-decimal-digit numbers
0.1, 0.2 .. 0.9 except 0.5, are periodic in binary and thus fundamentally
not representable exactly.  Arithmetics performed on such numbers can lead
to surprising results, e.g. adding 0.1 ten times gives 1.0000001, not 1,
and taking a ceil() results in 2, clearly a wrong answer in vcpu
allocation.

So instead, we take advantage of the fact that container requests happen
to be expressed as a quota/period fraction so we can sum up quotas,
fundamentally integral numbers (possibly fractional only due to the need
to rewrite them with a common denominator) with much less danger of
precision loss.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2025-08-07 10:32:44 +02:00
Alex Lyn
37685c41c7 runtime-rs: Correct the coresponding initdata annotation const
As we have changed the initdata annotation definition, Accordingly, we also
need correct its const definition with KATA_ANNO_CFG_RUNTIME_INIT_DATA.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-08-07 10:45:28 +08:00
Xuewei Niu
6f6d64604f Merge pull request #11598 from justxuewei/cgroups 2025-07-25 17:53:03 +08:00
Xuewei Niu
60e3679eb7 runtime-rs: Add full cgroups support on host
Add full cgroups support on host. Cgroups are managed by `FsManager` and
`SystemdManager`. As the names impies, the `FsManager` manages cgroups
through cgroupfs, while the `SystemdManager` manages cgroups through
systemd. The two manages support cgroup v1 and cgroup v2.

Two types of cgroups path are supported:

1. For colon paths, for example "foo.slice:bar:baz", the runtime manages
cgroups by `SystemdManager`;
2. For relative/absolute paths, the runtime manages cgroups by
`FsManager`.

vCPU threads are added into the sandbox cgroups in cgroup v1 + cgroupfs,
others, cgroup v1 + systemd, cgroup v2 + cgroupfs, cgroup v2 + systemd, VMM
process is added into the cgroups.

The systemd doesn't provide a way to add thread to a unit. `add_thread()`
in `SystemdManager` is equivalent to `add_process()`.

Cgroup v2 supports threaded mode. However, we should enable threaded mode
from leaf node to the root node (`/`) iteratively [1]. This means the
runtime needs to modify the cgroups created by container runtime (e.g.
containerd). Considering cgroupfs + cgroup v2 is not a common combination,
its behavior is aligned with systemd + cgroup v2, which is not allowed to
manage process at the thread level.

1: https://www.kernel.org/doc/html/v4.18/admin-guide/cgroup-v2.html#threads

Fixes: #11356

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-25 14:52:55 +08:00
alex.lyn
613dba6f1f runtime-rs: Some extra work to enhance copyfile with sharedfs disabled
As some reasons, it first should make it align with runtime-go, this
commit  will do this work.

Fixes #11543

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-25 11:39:20 +08:00
Steve Horsman
c762a3dd4f Merge pull request #11372 from kata-containers/dependabot/cargo/src/dragonball/openssl-af8515b6e0
build(deps): bump the openssl group across 4 directories with 1 update
2025-07-24 13:27:24 +01:00
Xuewei Niu
635272f3e8 runtime-rs: Ignore SIGTERM signal in shim
When enabling systemd cgroup driver and sandbox cgroup only, the shim is
under a systemd unit. When the unit is stopping, systemd sends SIGTERM to
the shim. The shim can't exit immediately, as there are some cleanups to
do. Therefore, ignoring SIGTERM is required here. The shim should complete
the work within a period (Kata sets it to 300s by default). Once a timeout
occurs, systemd will send SIGKILL.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-24 17:15:15 +08:00
Xuewei Niu
79f29bc523 runtime-rs: QEMU get_thread_ids() returns real vCPU's tids
The information is obtained through QMP query_cpus_fast.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-24 17:15:15 +08:00
alex.lyn
b40d65bc1b runtime-rs: support block device driver virtio-scsi within qemu-rs
It is important that we continue to support VirtIO-SCSI. While
VirtIO-BLK is a common choice, virtio-scsi offers significant
performance advantages in specific scenarios, particularly when
utilizing iothreads and with NVMe Fabrics.

Maintaining Flexibility and Choice by supporting both virtio-blk and
virtio-scsi, we provide greater flexibility for users to choose the
optimal storage(virtio-blk, virtio-scsi) interface based on their
specific workload requirements and hardware configurations.

As virtio-scsi controller has been created when qemu vm starts with
block device driver is set to `virtio-scsi`. This commit is for blockdev_add
the backend block device and device_add frondend virtio-scsi device via qmp.

Fixes #11516

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 14:00:02 +08:00
alex.lyn
e683a7fd37 runtime-rs: Change the device_id with block device index
As block device index is an very important unique id of a block device
and can indicate a block device which is equivalent to device_id.
In case of index is required in calculating scsi LUN and reduce
useless arguments within reusing `hotplug_block_device`, we'd better
change the device_id with block device index.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
4521cae0c0 runtime-rs: Support AIO for hotplugging block device within qemu
In this commit, block device aio are introduced within hotplug_block_device
within qemu via qmp and the "iouring" is set the default.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
b4d276bc2b runtime-rs: Handle virtio-scsi within device manager
It should be correctly handled within the device manager when do
create_block_device if the driver_option is virtio-scsi.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
fbd84fd3f4 runtime-rs: Support virtio-scsi device within handle_block_volume
It supports handling scsi device when block device driver is `scsi`.
And it will ensure a correct storage source with LUN.

Fixes #11516

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
57645c0786 runtime-rs: Add support for block device AIO
In this commit, three block device aio modes are introduced and the
"iouring" is set the default.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
alex.lyn
40e6aacc34 runtime-rs: Introduce scsi_addr within BlockConfig for SCSI devices
It's used to help discover scsi devices inside guest and also add a
new const value `KATA_SCSI_DEV_TYPE` to help pass information.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-24 11:57:00 +08:00
dependabot[bot]
ef9d960763 build(deps): bump the openssl group across 4 directories with 1 update
Bumps the openssl group with 1 update in the /src/dragonball directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/runtime-rs directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/tools/genpolicy directory: [openssl](https://github.com/sfackler/rust-openssl).
Bumps the openssl group with 1 update in the /src/tools/kata-ctl directory: [openssl](https://github.com/sfackler/rust-openssl).


Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

Updates `openssl` from 0.10.72 to 0.10.73
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73)

---
updated-dependencies:
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: openssl
- dependency-name: openssl
  dependency-version: 0.10.73
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: openssl
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-23 15:17:12 +00:00
alex.lyn
a12ae58431 runtime-rs: Support hotplugging host block devices within qemu-rs
Although Previous implementation of hotplugging block device via QMP
can successfully hot-plug the regular file based block device, but it
fails when the backend is /dev/xxx(e.g. /dev/loop0). With analysis about
it, we can know that it lacks the ablility to hotplug host block devices.

This commit will fill the gap, and make it work well for host block
devices.

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-22 15:40:03 +08:00
stevenhorsman
e66aa1ef8c runtime: Bump promethus and ttrpc-codegen
Bump these crates to remove the old version of protobuf
and remediate RUSTSEC-2024-0437

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:29:39 +01:00
Fabiano Fidêncio
7f5f032aca runtime-rs: Update containerd-shim / containerd-shim-protos
Let's bump those to their 0.10.0 releases, which contain fixes for the
CVE-2025-53605.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-19 00:18:01 +02:00
Xuewei Niu
629c942d4b runtime-rs: Fix the issue of blocking socket with Tokio
According to the issue [1], Tokio will panic when we are giving a blocking
socket to Tokio's `from_std()` method, the information is as follows:

```
A panic occurred at crates/agent/src/sock/vsock.rs:59: Registering a
blocking socket with the tokio runtime is unsupported. If you wish to do
anyways, please add `--cfg tokio_allow_from_blocking_fd` to your RUSTFLAGS.
```

A workaround is to set the socket to non-blocking.

1: https://github.com/tokio-rs/tokio/issues/7172

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-18 10:55:48 +08:00
Xuewei Niu
5a4050660a runtime-rs: Bump Tokio to v1.46.1
Tokio now has a newer version, let us bump it.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2025-07-18 10:55:48 +08:00
Fabiano Fidêncio
eb2bfbf7ac Merge pull request #11572 from stevenhorsman/RUSTSEC-2024-0384-remediate
More crate bumps for security remediations
2025-07-17 22:35:05 +02:00
stevenhorsman
e56f493191 deps: Bump zbus, serial_test & async-std
Bump these crates across various components to remove the
dependency on unmaintained instant crate and remediate
RUSTSEC-2024-0384

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 18:23:19 +01:00
stevenhorsman
9185ef1a67 runtime-rs: Bump nix to matching version
runtime-rs needs the same version as libs,
so sync this up as well.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-17 15:08:46 +01:00
alex.lyn
56c0c172fa runtime-rs: Fix initdata length field missing when create block
The init data could not be read properly within kata-agent because the
data length field was omitted, a consequence of a mismatch in the data
write format.

Fixes #11556

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-07-15 19:22:17 +08:00
stevenhorsman
661d88b11f versions: Bump oci-spec
Try bumping oci-spec to 0.8.1 as it included fixes for vulnerabilities
including RUSTSEC-2024-0370

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-14 16:54:30 +01:00
Fabiano Fidêncio
579d373623 Merge pull request #11521 from stevenhorsman/idna-1.0.4-bump
versions: Bump idna crate to >= 1.0.3
2025-07-14 17:39:30 +02:00