Commit Graph

5085 Commits

Author SHA1 Message Date
Paul Meyer
8de8b8185e genpolicy: rename svc_name to svc_name_downward_env
Just to be more explicit what this matches.

Signed-off-by: Paul Meyer <katexochen0@gmail.com>
2025-05-27 10:13:43 +02:00
Paul Meyer
78eb65bb0b genpolicy: fix svc_name regex
The service name is specified as an RFC 1035 label name [1]. The svc_name
regex in the genpolicy settings is applied to the downward API env
variables created based on the service name, so it has to match
RFC 1035 labels after they are transformed to downward API variable
names [2]. That is, the set of lower case alphanumerics and dashes is
transformed to upper case alphanumerics and underscores.
The previous regex wrongly permitted use of numbers, but did allow
dot and dash, which shouldn't be allowed (dot because it doesn't
conform to RFC 1035, dash because it is transformed to underscore).

We have to take care not to also try to use the regex in places where
we actually want to check for RFC 1035 label instead of the downward
API transformed version of it.

Further, we should consider using a format like JSON5/JSONC for the
policy settings, as these are far from trivial and would highly benefit
from proper documentation through comments.

[1]: https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service
[2]: b2dfba4151/pkg/kubelet/envvars/envvars.go (L29-L70)

Signed-off-by: Paul Meyer <katexochen0@gmail.com>
2025-05-27 08:43:25 +02:00
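As a rough illustration of the svc_name fix in 78eb65bb0b, the sketch below mirrors the label-to-variable-name transformation the message describes and then checks the transformed name. Both function names are hypothetical, and the check is a hand-written stand-in for the regex in the genpolicy settings (an assumption, not the actual pattern). The 63-character limit is the RFC 1035 label limit.

```rust
/// Transformation described in the commit message: lower case
/// alphanumerics and dashes become upper case alphanumerics and underscores.
/// (Hypothetical helper name.)
fn to_downward_env_name(label: &str) -> String {
    label.to_ascii_uppercase().replace('-', "_")
}

/// Assumed rules for a transformed RFC 1035 label: 1 to 63 characters,
/// starts with a letter, ends with a letter or digit, and contains only
/// upper case letters, digits and underscores.
fn is_valid_downward_env_name(name: &str) -> bool {
    let bytes = name.as_bytes();
    if bytes.is_empty() || bytes.len() > 63 {
        return false;
    }
    let first_ok = bytes[0].is_ascii_uppercase();
    let last = bytes[bytes.len() - 1];
    let last_ok = last.is_ascii_uppercase() || last.is_ascii_digit();
    let body_ok = bytes
        .iter()
        .all(|b| b.is_ascii_uppercase() || b.is_ascii_digit() || *b == b'_');
    first_ok && last_ok && body_ok
}
```

Under these assumptions, a service named `my-svc2` maps to `MY_SVC2`, which passes the check, while names that still contain a dot or a dash are rejected.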
Xingru Li
71b6acfd7e dragonball: vsock: support single descriptor
Since kernel v6.3 the vsock packet is not split over two descriptors and
is instead included in a single one.

Therefore, we now choose how to obtain the BufWrapper based on the
length of the descriptor.

Refer:
a2752fe04f
https://git.kernel.org/torvalds/c/71dc9ec9ac7d

Signed-off-by: Xingru Li <lixingru.lxr@linux.alibaba.com>
[ Gao Xiang: port this patch from the internal branch to address Linux 6.1.63+. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-05-26 15:48:19 +08:00
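The length-based decision described in 71b6acfd7e can be sketched as follows. This is not the dragonball code: the enum and function names are hypothetical, and the only fixed fact used is that the virtio vsock packet header is 44 bytes long.

```rust
/// Size in bytes of the virtio vsock packet header.
const VSOCK_HDR_SIZE: usize = 44;

/// Where the packet payload lives relative to the head descriptor.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum PayloadLocation {
    /// Newer layout: header and payload share one descriptor.
    SameDescriptor { offset: usize, len: usize },
    /// Older layout: the head descriptor holds only the header.
    NextDescriptor,
    /// The head descriptor is too short to hold a header.
    Invalid,
}

/// Decide where to look for the payload from the head descriptor length.
fn locate_payload(head_len: usize) -> PayloadLocation {
    if head_len < VSOCK_HDR_SIZE {
        PayloadLocation::Invalid
    } else if head_len == VSOCK_HDR_SIZE {
        PayloadLocation::NextDescriptor
    } else {
        PayloadLocation::SameDescriptor {
            offset: VSOCK_HDR_SIZE,
            len: head_len - VSOCK_HDR_SIZE,
        }
    }
}
```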
Fupan Li
e9b45126fc Merge pull request #11254 from sampleyang/main
runtime-rs: fix vfio pci address domain 0001 problem
2025-05-23 18:13:10 +08:00
yangsong
06c7c5bccb runtime-rs: fix vfio pci address domain 0001 problem
Some NVIDIA GPUs use PCI address domain 0001, while the runtime
currently assumes 0000:bdf by default. This causes address errors
during device initialization and address conflicts during device
registration.

Fixes #11252

Signed-off-by: yangsong <yunya.ys@antgroup.com>
2025-05-23 14:33:06 +08:00
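To make the domain issue in 06c7c5bccb concrete, here is a minimal sketch of parsing a PCI address while keeping an explicit domain and falling back to 0000 only when none is given. The type and function names are hypothetical and do not come from runtime-rs.

```rust
/// A PCI address in domain:bus:device.function form.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct PciAddress {
    domain: u16,
    bus: u8,
    device: u8,
    function: u8,
}

/// Parse "dddd:bb:dd.f" or the short "bb:dd.f" form (domain 0000 implied).
fn parse_pci_address(s: &str) -> Option<PciAddress> {
    let parts: Vec<&str> = s.split(':').collect();
    let (domain, bus, devfn) = match parts.as_slice() {
        [bus, devfn] => ("0000", *bus, *devfn),
        [domain, bus, devfn] => (*domain, *bus, *devfn),
        _ => return None,
    };
    let (dev, func) = devfn.split_once('.')?;
    let addr = PciAddress {
        domain: u16::from_str_radix(domain, 16).ok()?,
        bus: u8::from_str_radix(bus, 16).ok()?,
        device: u8::from_str_radix(dev, 16).ok()?,
        function: u8::from_str_radix(func, 16).ok()?,
    };
    // A PCI bus has 32 device slots, each with up to 8 functions.
    if addr.device > 0x1f || addr.function > 7 {
        return None;
    }
    Some(addr)
}
```

With this shape, `0001:3b:00.0` and `0000:3b:00.0` stay distinct addresses instead of collapsing onto the same bus/device/function triple.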
Fabiano Fidêncio
5378e581d8 Merge pull request #11144 from Apokleos/hotplug-block-qemu-rs
Support hot-plug block device in qemu-rs with QMP
2025-05-21 11:31:48 +02:00
Fabiano Fidêncio
6c9b199ef1 Merge pull request #11289 from BbolroC/fix-vfio-coldplug
runtime: Preserve hotplug devices for vfio-coldplug mode
2025-05-21 09:48:25 +02:00
Steve Horsman
f8c5aa6df6 Merge pull request #11259 from fitzthum/bump-gc-0140
Update Trustee and Guest Components for CoCo v0.14.0
2025-05-20 18:05:17 +01:00
Sumedh Alok Sharma
9a4432d197 Merge pull request #11233 from Ankita13-code/ankitapareek/execprocess-additional-input-validation
genpolicy: validate input process fields for ExecProcessRequest
2025-05-20 20:11:41 +05:30
Fabiano Fidêncio
29099d139b Merge pull request #11280 from kata-containers/dependabot/cargo/src/tools/kata-ctl/ring-0.17.14
build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl
2025-05-20 13:47:22 +02:00
Ankita Pareek
ad75595dc8 genpolicy: Add tests for various input validations for ExecProcessRequest
These additional tests cover edge cases specific to:
- Terminal validation
- Capabilities validation
- Working directory (Cwd) validation
- NoNewPrivileges validation
- User validation
- Environment variables validation

Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>
2025-05-20 11:19:55 +00:00
Saul Paredes
1e466bf39c genpolicy: fix validation of env variables sourced from metadata.namespace
Use $(sandbox-namespace) wildcard in case none is specified in yaml. If wildcard is present, compare
input against annotation value.

Fixes regression introduced in https://github.com/microsoft/kata-containers/pull/273
where samples that use metadata.namespace env var were no longer working.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2025-05-20 11:19:46 +00:00
Dan Mihai
a113b9eefd genpolicy: validate probe process fields
Validate more process fields for k8s probe commands - e.g.,
livenessProbe, readinessProbe, etc.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-05-20 11:15:30 +00:00
Dan Mihai
c0b8c6ed5e genpolicy: validate process for commands from settings
Validate more process fields for commands enabled using the
ExecProcessRequest "commands" and/or "regex" fields from the
settings file.
Add a function to get the container from state based on container_id
matching instead of matching it against every policy container data.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>
2025-05-20 11:15:30 +00:00
Dan Mihai
6f78aaa411 genpolicy: use process inputs for allow_process()
Using process data inputs for allow_process() is easier to
read/understand compared with the older OCI data inputs.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-05-20 11:15:30 +00:00
Steve Horsman
2871c31162 Merge pull request #11273 from mythi/tdx-qemu-params
config: update QEMU TDX configuration
2025-05-20 10:22:59 +01:00
alex.lyn
378d04bdf0 runtime-rs: Add hotplug block device type with QMP
There are several cases where block devices play a very important role:

1. Direct Volume:
In Kata cases, to achieve high-performance I/O, raw files on the host
are typically passed directly to the Guest via virtio-blk, and then
bond/mounted within the Guest for container usage.

2. Trusted Storage
In CoCo scenarios, particularly in Guest image pull mode, images are
typically pulled directly from the registry within the Guest. However,
due to constrained memory resources (prioritized for containers), CoCo
leverages externally attached encrypted storage to store images,
requiring hot-plug capability for block devices.

As other VMMs, such as dragonball and cloud-hypervisor in runtime-rs or
qemu in kata-runtime, already support such capabilities, we need to
support hot-plugging block devices (via QMP) in qemu-rs.

Fixes #11143

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-20 16:46:54 +08:00
alex.lyn
2405301e2e runtime-rs: Support hotplugging block device via QMP
This commit introduces block device hotplugging capability using
QMP commands.
The implementation enables attaching raw block devices to a running
VM through the following steps:

1. Block Device Configuration
Uses `blockdev-add` QMP command to define a raw block backend with
(1) Direct I/O mode
(2) Configurable read-only flag
(3) Host file/block device path (`/path/to/block`)

2. PCI Device Attachment
Attaches the block device via the `device_add` QMP command as a
`virtio-blk-pci` device:
(1) Dynamically allocates PCI slots using `find_free_slot()`
(2) Binds to user-specified PCIe bus (e.g., `pcie.1`)
(3) Returns PCI path for further management

Fixes #11143

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-20 16:46:54 +08:00
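The two QMP steps listed in 2405301e2e can be sketched as the JSON messages below. `blockdev-add` and `device_add` are real QMP commands, but the exact argument set used by runtime-rs is an assumption here, as are the helper names, the `file` protocol driver and the ids. The strings are built by hand to keep the sketch dependency-free, so values are inserted verbatim and must not contain characters that need JSON escaping.

```rust
/// Step 1: define a raw block backend with direct I/O and a
/// configurable read-only flag. (Hypothetical helper.)
fn blockdev_add_cmd(node_name: &str, path: &str, read_only: bool) -> String {
    format!(
        r#"{{"execute":"blockdev-add","arguments":{{"driver":"raw","node-name":"{node}","read-only":{ro},"cache":{{"direct":true}},"file":{{"driver":"file","filename":"{path}"}}}}}}"#,
        node = node_name,
        ro = read_only,
        path = path
    )
}

/// Step 2: attach the backend as a virtio-blk-pci device on the given
/// PCIe bus and slot. (Hypothetical helper.)
fn device_add_cmd(id: &str, node_name: &str, bus: &str, slot: u8) -> String {
    format!(
        r#"{{"execute":"device_add","arguments":{{"driver":"virtio-blk-pci","id":"{id}","drive":"{node}","bus":"{bus}","addr":"{slot:#04x}"}}}}"#,
        id = id,
        node = node_name,
        bus = bus,
        slot = slot
    )
}
```

In this sketch the slot would come from something like `find_free_slot()`, and the PCI path returned to the caller would be derived from the bus and slot actually used.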
alex.lyn
80bd71bfcc runtime-rs: Iterates through PCI devices to find a match with qdev_id
The get_pci_path_by_qdev_id function is designed to search for a PCI
device within a given list of devices based on a specified qdev_id.
It tracks the device's path in the PCI topology by recording the slot
values of the devices traversed during the search. If the device is
located behind a PCI bridge, the function recursively explores the
bridge's device list to find the target device. The function returns
the matching device along with its updated path if found, otherwise,
it returns None.

Fixes #11143

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-20 16:46:54 +08:00
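The recursive search described in 80bd71bfcc can be sketched as below. The struct is a simplified, hypothetical stand-in for the PCI device information reported over QMP, and this version returns only the slot path rather than the matching device plus its path.

```rust
/// Simplified PCI device entry. (Hypothetical type, for illustration only.)
struct PciDevice {
    qdev_id: String,
    slot: u32,
    /// Devices behind this device if it is a PCI bridge.
    bridge_devices: Option<Vec<PciDevice>>,
}

/// Depth-first search that records the slot of every device on the way
/// to the target and drops it again when a branch does not match.
fn find_pci_path(devices: &[PciDevice], qdev_id: &str, path: &mut Vec<u32>) -> bool {
    for dev in devices {
        path.push(dev.slot);
        if dev.qdev_id == qdev_id {
            return true;
        }
        if let Some(children) = &dev.bridge_devices {
            if find_pci_path(children, qdev_id, path) {
                return true;
            }
        }
        path.pop();
    }
    false
}

/// Return the slot path to the device with `qdev_id`, or None if absent.
fn get_pci_path(devices: &[PciDevice], qdev_id: &str) -> Option<Vec<u32>> {
    let mut path = Vec::new();
    if find_pci_path(devices, qdev_id, &mut path) {
        Some(path)
    } else {
        None
    }
}
```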
Fupan Li
9a03815f18 Merge pull request #11095 from lifupan/ephemeral_volume
runtime-rs: add the ephemeral memory based volume support
2025-05-20 09:18:34 +08:00
Steve Horsman
cfdccaacb3 Merge pull request #11283 from Rtoax/p002-fix-typo
config: Fix typos
2025-05-19 14:59:37 +01:00
Hyounggyu Choi
2fd2cd4a9b runtime: Preserve hotplug devices for vfio-coldplug mode
Fixes: #11288

This commit appends hotplug devices (e.g., persistent volumes)
to deviceInfos when `vfio_mode` is `vfio` and `cold_plug_vfio`
is set to any value other than `no-port`. For details, please see the issue.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-05-19 13:46:49 +02:00
Pradipta Banerjee
9f9841492e runtime: Fix logging for remote hypervisor
Need to use hvLogger

Fixes: #11286

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2025-05-19 07:01:59 -04:00
Rong Tao
914730d948 config: Fix typos
devie should be device

Signed-off-by: Rong Tao <rongtao@cestc.cn>
2025-05-19 14:19:22 +08:00
Alex Lyn
305a5f5e41 Merge pull request #10578 from Apokleos/pcie-port-devices
runtime-rs: Introduce PCIe Port devices in runtime-rs for qemu-rs
2025-05-18 21:10:25 +08:00
Dan Mihai
b9651eadab Merge pull request #11214 from microsoft/cameronbaird/address-gid-mismatch-additionalgids
genpolicy: Enable AdditionalGids checks in rules.rego
2025-05-16 10:15:53 -07:00
dependabot[bot]
a2c7e48e0e build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl
Bumps [ring](https://github.com/briansmith/ring) from 0.17.5 to 0.17.14.
- [Changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md)
- [Commits](https://github.com/briansmith/ring/commits)

---
updated-dependencies:
- dependency-name: ring
  dependency-version: 0.17.14
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-16 14:51:20 +00:00
Fabiano Fidêncio
219d6e8ea6 Merge pull request #11257 from mythi/coco-guest-hardening
confidential guest kernel hardening changes
2025-05-16 08:52:36 +02:00
Fabiano Fidêncio
02ce395a69 Merge pull request #11272 from seungukshin/enable-edk2-for-arm64
Enable edk2 for arm64
2025-05-15 20:59:56 +02:00
Cameron Baird
7bba7374ec genpolicy: Add retries to policy generation
As the genpolicy from_files call makes network requests to container
registries, it has a chance to fail.

Harden the genpolicy tests against network flakes by introducing a 6x
retry loop.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-05-15 18:12:50 +00:00
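A retry loop of the kind described in 7bba7374ec might look like the following generic helper. The name and signature are hypothetical, and the actual tests may add delays or logging between attempts.

```rust
/// Run `op` until it succeeds or `max_attempts` tries have been made,
/// returning the last error on failure. `op` receives the 1-based attempt
/// number and is always called at least once. (Hypothetical helper.)
fn retry<T, E>(max_attempts: u32, mut op: impl FnMut(u32) -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}
```

A test would wrap the flaky call, for example `retry(6, |_| generate_policy())`, where `generate_policy` stands for whatever function performs the registry requests.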
Steve Horsman
d21d2a0657 Merge pull request #11265 from chathuryaadapa/bumpalo-crate-bump
Bump: libz-sys crate to address CVE
2025-05-15 16:18:00 +01:00
Mikko Ylinen
ff851202e6 config: update QEMU TDX configuration
Drop '-vmx-rdseed-exit' from '-cpu host' QEMU options. The history
of it is unknown but it's likely related to early TDX enablement.

TD pods start up fine without it (tested by manually editing the
configuration file) and it's also not used elsewhere.

Keep TDXCPUFEATURES for now in case a need for it shows up later.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2025-05-15 15:43:24 +03:00
alex.lyn
07533522b8 runtime-rs: Handle PortDevice devices when invoke start_vm with Qemu
Extract PortDevice relevant information, and then invoke different
processing methods based on the device type.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
c109328097 runtime-rs: Introduce pcie root port and switch port in qemu-rs cmdline.
Some data structures and methods are introduced to help handle vfio devices.
The methods add_pcie_root_ports and add_pcie_switch_ports follow the
runtime's related implementations for vfio devices.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
47c7ba8672 runtime-rs: Prepare pcie port devices before start sandbox
Prepare pcie port devices before starting VM with the help of
device manager and PCIe Topology.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
d435712ccb runtime-rs: Introduce PortDevice in resource manager in sandbox
A new resource type `PortDevice` is introduced which is dedicated
to handling root ports/switch ports during sandbox (VM) creation.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
1d670bb46c runtime-rs: handle useless Device match arms in dragonball vmm case
Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
f08fdd25d8 runtime-rs: Introduce device type of PortDevice in device manager
PortDevice is for handling root ports or switch ports in the PCIe
topology. It makes it easy to pass root port/switch port information
when creating a VM that requires PCIe devices.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
694a849eaa runtime-rs: Add PCIe topology mgmt for Root Port and Switch Port
This commit introduces an implementation for managing PCIe topologies,
focusing on the relationship between Root Ports and Switch Ports. The
design supports two strategies for generating Switch Ports.

Taking a requirement of 4 switch ports as an example, the two
possible solutions are:
(1) Single Root Port + Single PCIe Switch: Uses 1 Root Port and 1 Switch
with 4 Downstream Ports.
(2) Multiple Root Ports + Multiple PCIe Switches: Uses 2 Root Ports and
2 Switches, each with 2 Downstream Ports.

The recommended strategy is Option 1 due to its simplicity, efficiency,
and scalability. The implementation includes data structures
(PcieTopology, RootPort, PcieSwitch, SwitchPort) and operations
(add_pcie_root_port, add_switch_to_root_port, add_switch_port_to_switch)
to manage the topology effectively.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
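The two layouts discussed in 694a849eaa differ only in how the requested switch ports are spread over root ports. The sketch below computes that split under the assumption of one PCIe switch per root port. The function name is hypothetical and the real PcieTopology type carries far more state.

```rust
/// Number of downstream ports to create on each switch, assuming one
/// switch per root port. Returns None when the request cannot be laid
/// out (no root ports, or fewer ports than switches).
/// (Hypothetical helper, for illustration only.)
fn plan_switch_ports(total_ports: u32, root_ports: u32) -> Option<Vec<u32>> {
    if root_ports == 0 || total_ports < root_ports {
        return None;
    }
    let base = total_ports / root_ports;
    let extra = total_ports % root_ports;
    // The first `extra` switches take one additional port each.
    Some((0..root_ports).map(|i| base + u32::from(i < extra)).collect())
}
```

For the 4-port example, one root port gives a single switch with 4 downstream ports (option 1), and two root ports give two switches with 2 downstream ports each (option 2).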
alex.lyn
2f5ee0ec6d kata-types: Support switch port config via annotation and configuration
Support setting switch ports via an annotation or configuration.toml.

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
alex.lyn
a42d16a6a4 kata-types: Introduce pcie_switch_port in configuration
(1) Introduce a new field `pcie_switch_port` for switch ports.
(2) Add related checking logic in VMMs (dragonball, qemu)

Fixes #10361

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2025-05-15 20:10:49 +08:00
Seunguk Shin
560e718979 runtime: Add edk2 to configuration-qemu.toml for arm64
edk2 is required for memory hotplug with QEMU on arm64.
This adds edk2 to configuration-qemu.toml for arm64.

Signed-off-by: Seunguk Shin <seunguk.shin@arm.com>
Reviewed-by: Nick Connolly <nick.connolly@arm.com>
2025-05-15 10:12:31 +01:00
RuoqingHe
393cc61153 Merge pull request #11241 from kata-containers/dependabot/cargo/src/tools/agent-ctl/ring-0.17.14
build(deps): bump ring from 0.17.8 to 0.17.14 in /src/tools/agent-ctl
2025-05-14 16:20:33 +02:00
Adapa Chathurya
3d284d3b4e versions: Bump libz-sys version
Bump libz-sys version to update and remediate CVE-2025-1744.

Signed-off-by: Adapa Chathurya <adapa.chathurya1@ibm.com>
2025-05-14 19:48:10 +05:30
Steve Horsman
711fcd8f51 Merge pull request #11251 from stevenhorsman/rust-vulns-9th-may-2025
Rust vulns 9th may 2025
2025-05-14 09:58:12 +01:00
Cameron Baird
090497f520 genpolicy: Add test cases for fsGroup and supplementalGroup fields
Fix up genpolicy test inputs to include required additionalGids

Include a test for the pod_container container in security_context tests
as these containers follow slightly different paths in containerd.

Introduce a test for fsGroup/supplementalGroups fields in the security
context.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-05-13 21:48:58 +00:00
Cameron Baird
d3cd1af593 genpolicy: Enable AdditionalGids checks in rules.rego
With added support for parsing these fields in genpolicy, we can now
enable policy verification of AdditionalGids.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-05-13 21:48:58 +00:00
Cameron Baird
29ee46c186 genpolicy: Handle PodSecurityContext.fsGroup|supplementalGroups
Policy enforcement for additionalGids, a list of groups applied to the first process run in each container.

This manifests in the OCI struct as additionalGids, which consists of the container's GID, fsGroup, and supplementalGroups.

https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#PodSecurityContext-v1-core

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-05-13 21:44:51 +00:00
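The composition of additionalGids described in 29ee46c186 can be sketched as a small set union. The function name and the use of a sorted, de-duplicated set are assumptions. The ordering and duplicate handling in genpolicy and containerd may differ.

```rust
use std::collections::BTreeSet;

/// Combine the container's GID, the pod's fsGroup (if set) and its
/// supplementalGroups into one set of additional GIDs.
/// (Hypothetical helper, for illustration only.)
fn additional_gids(
    container_gid: u32,
    fs_group: Option<u32>,
    supplemental_groups: &[u32],
) -> BTreeSet<u32> {
    let mut gids = BTreeSet::new();
    gids.insert(container_gid);
    gids.extend(fs_group);
    gids.extend(supplemental_groups.iter().copied());
    gids
}
```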
RuoqingHe
cd4c3e89e1 Merge pull request #11243 from kata-containers/dependabot/go_modules/src/runtime/github.com/opencontainers/runc-1.2.0
build(deps): bump github.com/opencontainers/runc from 1.1.12 to 1.2.0 in /src/runtime
2025-05-13 17:02:35 +02:00
stevenhorsman
b3825829d8 versions: Bump golang.org/x/oauth2
Update module to remediate
[CVE-2025-22868](https://www.cve.org/CVERecord?id=CVE-2025-22868)

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-05-13 11:00:35 +01:00