Compare commits

259 Commits

Author SHA1 Message Date
Fabiano Fidêncio
fe07edbb26 do-not-merge: main, with no cache
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-06 19:28:14 +01:00
Manuel Huber
d9d1073cf1 gpu: Install packages for devkit
Introduce a new function to install additional packages into the
devkit flavor. With modprobe, we avoid errors on pod startup
related to loading nvidia kernel modules in the NVRC phase.
Note, the production flavor gets modprobe from busybox, see its
configuration file containing CONFIG_MODPROBE=y.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-06 09:58:32 +01:00
Manuel Huber
a786582d0b rootfs: deprecate initramfs dm-verity mode
Remove the initramfs folder, its build steps, and use the kernel
based dm-verity enforcement for the handlers which used the
initramfs mode. Also, remove the initramfs verity mode
capability from the shims and their configs.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
cf7f340b39 tests: Read and overwrite kernel_verity_parameters
Read the kernel_verity_parameters from the shim config and adjust
the root hash for the negative test.
Further, improve some of the test logic by using shared
functions. This especially ensures we don't read the full
journalctl logs on a node but only the portion of the logs we are
actually supposed to look at.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
7958be8634 runtime: Make kernel_verity_params overwritable
Similar to the kernel_params annotation, add a
kernel_verity_params annotation and add logic to make these
parameters overwritable. For instance, this can be used in test
logic to provide bogus dm-verity hashes for negative tests.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
7700095ea8 runtime-rs: Make kernel_verity_params overwritable
Similar to the kernel_params annotation, add a
kernel_verity_params annotation and add logic to make these
parameters overwritable. For instance, this can be used in test
logic to provide bogus dm-verity hashes for negative tests.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
472b50fa42 runtime-rs: Enable kernelinit dm-verity variant
This change introduces the kernel_verity_parameters knob to the
rust based shim, picking up dm-verity information in a new config
field (the corresponding build variable is already produced by
the shim build). The change extends the shim to parse dm-verity
information from this parameter and to construct the kernel command
line appropriately, based on the indicated initramfs or kernelinit
build variant.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
f639c3fa17 runtime: Enable kernelinit dm-verity variant
This change introduces the kernel_verity_parameters knob to the
Go based shim, picking up dm-verity information in a new config
field (the corresponding build variable is already produced by
the shim build). The change extends the shim to parse dm-verity
information from this parameter and to construct the kernel command
line appropriately, based on the indicated initramfs or kernelinit
build variant.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
e120dd4cc6 tests: cc: Remove quotes from kernel command line
With dm-mod.create parameters using quotes, we remove the
backslashes used to escape these quotes from the output we
retrieve. This will enable attestation tests to work with the
kernelinit dm-verity mode.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
976df22119 rootfs: Change condition for cryptsetup-bin
Measured rootfs mode and CDH secure storage feature require the
cryptsetup-bin and e2fsprogs components in the guest.
This change makes this more explicit - confidential guests are
users of the CDH secure container image layer storage feature.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
a3c4e0b64f rootfs: Introduce kernelinit dm-verity mode
This change introduces the kernelinit dm-verity mode, allowing
initramfs-less dm-verity enforcement against the rootfs image.
For this, the change introduces a new variable with dm-verity
information. This variable will be picked up by shim
configurations in subsequent commits.
This will allow the shims to build the kernel command line
with dm-verity information based on the existing
kernel_parameters configuration knob and a new
kernel_verity_params configuration knob. The latter
specifically provides the relevant dm-verity information.
This new configuration knob avoids merging the verity
parameters into the kernel_params field. As a result, no
cumbersome escape logic is required: we do not need to pass the
dm-mod.create="..." parameter directly in the kernel_parameters,
but only the relevant dm-verity parameters in a semi-structured manner
(see above). The only place where the final command line is
assembled is in the shims. Further, this is a line easy to comment
out for developers to disable dm-verity enforcement (or for CI
tasks).

This change produces the new kernelinit dm-verity parameters for
the NVIDIA runtime handlers, and modifies the format of how
these parameters are prepared for all handlers. With this, the
parameters are currently no longer provided to the
kernel_params configuration knob for any runtime handler.
This change alone should thus not be used as dm-verity
information will no longer be picked up by the shims.

systemd-analyze on the coco-dev handler shows that using the
kernelinit mode on a local machine, less time is spent in the
kernel phase, slightly speeding up pod start-up. On that machine,
the average of 172.5ms was reduced to 141ms (4 measurements, each
with a basic pod manifest), i.e., the kernel phase duration is
improved by about 18 percent.
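
The quoted figure can be checked with a quick calculation. This is only a sketch: the two averages are the ones reported above, and the variable names are illustrative.

```python
# Average kernel-phase durations reported above, in milliseconds.
before_ms = 172.5  # average before switching to the kernelinit mode
after_ms = 141.0   # average with the kernelinit mode

saved_ms = before_ms - after_ms
relative_improvement = saved_ms / before_ms

print(f"saved {saved_ms:.1f} ms ({relative_improvement:.1%})")
```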

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
83a0bd1360 gpu: use dm-verity for the non-TEE GPU handler
Use a dm-verity protected rootfs image for the non-TEE NVIDIA
GPU handler as well.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
02ed4c99bc rootfs: Use maxdepth=1 to search for kata tarballs
These tarballs are in the top layer of the build directory,
no need to traverse all sub-directories.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
d37db5f068 rootfs: Restore "gpu: Handle root_hash.txt ..."
This reverts commit 923f97bc66 in
order to re-instantiate the logic from commit
e4a13b9a4a.

The latter commit was previously reverted due to the NVIDIA GPU TEE
handler using an initrd, not an image.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
f1ca547d66 initramfs: introduce log function
Log to /dev/kmsg, this way logs will show up and not get lost.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
6d0bb49716 runtime: nvidia: Use img and sanitize whitespaces
Shift NVIDIA shim configurations to use an image instead of an initrd,
and remove trailing whitespaces from the configs.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
282014000f tests: cc: support initrd, image for attestation
Allow using an image instead of an initrd. For confidential
guests using images, the assumption is that the guest kernel uses
dm-verity protection, implicitly measuring the rootfs image via
the kernel command line's dm-verity information.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Greg Kurz
e430b2641c Merge pull request #12435 from bpradipt/crio-annotation
shim: Add CRI-O annotation support for device cold plug
2026-02-05 09:29:19 +01:00
Alex Lyn
e257430976 Merge pull request #12433 from manuelh-dev/mahuber/cfg-sanitize-whitespaces
runtimes: Sanitize trailing whitespaces
2026-02-05 09:31:21 +08:00
Fabiano Fidêncio
dda1b30c34 tests: nvidia-nim: Use sealed secrets for NGC_API_KEY
Convert the NGC_API_KEY from a regular Kubernetes secret to a sealed
secret for the CC GPU tests. This ensures the API key is only accessible
within the confidential enclave after successful attestation.

The sealed secret uses the "vault" type which points to a resource stored
in the Key Broker Service (KBS). The Confidential Data Hub (CDH) inside
the guest will unseal this secret by fetching it from KBS after
attestation.

The initdata file is created AFTER create_tmp_policy_settings_dir()
copies the empty default file, and BEFORE auto_generate_policy() runs.
This allows genpolicy to add the generated policy.rego to our custom
CDH configuration.

The sealed secret format follows the CoCo specification:
sealed.<JWS header>.<JWS payload>.<signature>

Where the payload contains:
- version: "0.1.0"
- type: "vault" (pointer to KBS resource)
- provider: "kbs"
- resource_uri: KBS path to the actual secret
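
As a rough illustration of the layout above, the following sketch assembles a sealed secret string from the four listed payload fields. The header and signature placeholders, the helper names, and the example resource URI are assumptions made for illustration; the real specification may require additional payload fields.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as is usual for JWS segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_sealed_secret(resource_uri: str) -> str:
    """Return 'sealed.<header>.<payload>.<signature>' for a vault-type secret."""
    payload = {
        "version": "0.1.0",
        "type": "vault",       # pointer to a KBS resource
        "provider": "kbs",
        "resource_uri": resource_uri,
    }
    header = "fakejwsheader"     # placeholder, not a real JWS header
    signature = "fakesignature"  # placeholder, not a real signature
    body = b64url(json.dumps(payload).encode("utf-8"))
    return f"sealed.{header}.{body}.{signature}"


# Hypothetical KBS path, for illustration only.
secret = build_sealed_secret("kbs:///default/ngc-api-key/instance")
print(secret.split(".")[0])
```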

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-04 12:34:44 +01:00
Fabiano Fidêncio
c9061f9e36 tests: kata-deploy: Increase post-deployment wait time
Increase the sleep time after kata-deploy deployment from 10s to 60s
to give more time for runtimes to be configured. This helps avoid
race conditions on slower K8s distributions like k3s where the
RuntimeClass may not be immediately available after the DaemonSet
rollout completes.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-04 12:13:53 +01:00
Fabiano Fidêncio
0fb2c500fd tests: kata-deploy: Merge E2E tests to avoid timing issues
Merge the two E2E tests ("Custom RuntimeClass exists with correct
properties" and "Custom runtime can run a pod") into a single test, as
those 2 are very much dependent on each other.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-04 12:13:53 +01:00
Fabiano Fidêncio
fef93f1e08 tests: kata-deploy: Use die() instead of fail() for error handling
Replace fail() calls with die() which is already provided by
common.bash. The fail() function doesn't exist in the test
infrastructure, causing "command not found" errors when tests fail.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-04 12:13:53 +01:00
Fabiano Fidêncio
f90c12d4df kata-deploy: Avoid text file busy error with nydus-snapshotter
We cannot overwrite a binary that's currently in use, and that's the
reason that elsewhere we remove / unlink the binary (the running process
keeps its file descriptor, so we're good doing that) and only then we
copy the binary.  However, we missed doing this for the
nydus-snapshotter deployment.
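
The unlink-then-copy pattern described above can be sketched as follows. The function name is hypothetical and this is not the kata-deploy code itself, only a minimal illustration of the technique.

```python
import os
import shutil


def replace_binary(src: str, dst: str) -> None:
    """Install src at dst even if the old dst is still being executed."""
    try:
        # Removing the directory entry is allowed while the file is in use;
        # the running process keeps its reference to the old inode.
        os.unlink(dst)
    except FileNotFoundError:
        pass  # nothing installed yet
    # The copy now creates a fresh file instead of writing into the busy one.
    shutil.copy2(src, dst)
```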

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-04 10:24:49 +01:00
Manuel Huber
30c7325e75 runtimes: Sanitize trailing whitespaces
Clean up trailing whitespaces, making life easier for those who
have configured their IDE to clean these up.
Suggest to not add new code with trailing whitespaces etc.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-03 11:46:30 -08:00
Steve Horsman
30494abe48 Merge pull request #12426 from kata-containers/dependabot/github_actions/zizmorcore/zizmor-action-0.4.1
build(deps): bump zizmorcore/zizmor-action from 0.2.0 to 0.4.1
2026-02-03 14:38:54 +00:00
Pradipta Banerjee
8a449d358f shim: Add CRI-O annotation support for device cold plug
Add support for CRI-O annotations when fetching pod identifiers for
device cold plug. The code now checks containerd CRI annotations first,
then falls back to CRI-O annotations if they are empty.

This enables device cold plug to work with both containerd and CRI-O
container runtimes.

Annotations supported:
- containerd: io.kubernetes.cri.sandbox-name, io.kubernetes.cri.sandbox-namespace
- CRI-O: io.kubernetes.cri-o.KubeName, io.kubernetes.cri-o.Namespace
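
A minimal sketch of the lookup order described above. The annotation keys are the ones listed; the function name and the per-field fallback are assumptions, and the actual shim code may fall back differently.

```python
from typing import Mapping, Tuple

CONTAINERD_NAME = "io.kubernetes.cri.sandbox-name"
CONTAINERD_NAMESPACE = "io.kubernetes.cri.sandbox-namespace"
CRIO_NAME = "io.kubernetes.cri-o.KubeName"
CRIO_NAMESPACE = "io.kubernetes.cri-o.Namespace"


def pod_identifiers(annotations: Mapping[str, str]) -> Tuple[str, str]:
    """Return (pod name, namespace), preferring containerd CRI annotations."""
    # An empty or missing containerd value falls through to the CRI-O key.
    name = annotations.get(CONTAINERD_NAME, "") or annotations.get(CRIO_NAME, "")
    namespace = (
        annotations.get(CONTAINERD_NAMESPACE, "")
        or annotations.get(CRIO_NAMESPACE, "")
    )
    return name, namespace
```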

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2026-02-03 04:51:15 +00:00
Steve Horsman
6bb77a2f13 Merge pull request #12390 from mythi/tdx-updates-2026-2
runtime: tdx QEMU configuration changes
2026-02-02 16:58:44 +00:00
Zvonko Kaiser
6702b48858 Merge pull request #12428 from fidencio/topic/nydus-snapshotter-start-from-a-clean-state
kata-deploy: nydus: Always start from a clean state
2026-02-02 11:21:26 -05:00
Steve Horsman
0530a3494f Merge pull request #12415 from nlle/make-helm-updatestrategy-configurable
kata-deploy: Make update strategy configurable for kata-deploy DaemonSet
2026-02-02 10:29:01 +00:00
Steve Horsman
93dcaee965 Merge pull request #12423 from manuelh-dev/mahuber/pause-build-fix
packaging: Delete pause_bundle dir before unpack
2026-02-02 10:26:30 +00:00
Fabiano Fidêncio
62ad0814c5 kata-deploy: nydus: Always start from a clean state
Clean up existing nydus-snapshotter state to ensure fresh start with new
version.

This is safe across all K8s distributions (k3s, rke2, k0s, microk8s,
etc.) because we only touch the nydus data directory, not containerd's
internals.

When containerd tries to use non-existent snapshots, it will
re-pull/re-unpack.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-02 11:06:37 +01:00
Mikko Ylinen
870630c421 kata-deploy: drop custom TDX installation steps
As we have moved to use QEMU (and OVMF already earlier) from
kata-deploy, the custom tdx configurations and distro checks
are no longer needed.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-02-02 11:11:26 +02:00
Mikko Ylinen
927be7b8ad runtime: tdx: move to use QEMU from kata-deploy
Currently, a working TDX setup expects users to install special
TDX support builds from Canonical/CentOS virt-sig for TDX to
work. kata-deploy configured TDX runtime handler to use QEMU
from the distro's paths.

With TDX support now being available in upstream Linux and
Ubuntu 24.04 having an install candidate (linux-image-generic-6.17)
for a new enough kernel, move TDX configuration to use QEMU from
kata-deploy.

While this is the new default, going back to the original
setup is possible by making manual changes to TDX runtime handlers.

Note: runtime-rs is already using QEMUPATH for TDX.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-02-02 11:10:52 +02:00
Nikolaj Lindberg Lerche
6e98df2bac kata-deploy: Make update strategy configurable for kata-deploy DaemonSet
This allows the updateStrategy to be configured for the kata-deploy Helm
chart, enabling administrators to control the aggressiveness of
updates. For a less aggressive approach, the strategy can be set to
`OnDelete`. Alternatively, the update process can be made more
aggressive by adjusting the `maxUnavailable` parameter.

Signed-off-by: Nikolaj Lindberg Lerche <nlle@ambu.com>
2026-02-01 20:14:29 +01:00
Dan Mihai
d7ff54769c tests: policy: remove the need for using sudo
Modify the copy of root user's settings file, instead of modifying the
original file.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-02-01 20:09:50 +01:00
Dan Mihai
4d860dcaf5 tests: policy: avoid redundant debug output
Avoid redundant and confusing teardown_common() debug output for
k8s-policy-pod.bats and k8s-policy-pvc.bats.

The Policy tests skip the Message field when printing information about
their pods, because unfortunately that field might contain a truncated
Policy log - for the test cases that intentionally cause Policy
failures. The non-truncated Policy log is already available from other
"kubectl describe" fields.

So, avoid the redundant pod information from teardown_common(), that
also included the confusing Message field.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-02-01 20:09:50 +01:00
dependabot[bot]
dc8d9e056d build(deps): bump zizmorcore/zizmor-action from 0.2.0 to 0.4.1
Bumps [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) from 0.2.0 to 0.4.1.
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](e673c3917a...135698455d)

---
updated-dependencies:
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-01 15:08:10 +00:00
Manuel Huber
8b0c199f43 packaging: Delete pause_bundle dir before unpack
Delete the pause_bundle directory before running the umoci unpack
operation. This will make builds idempotent and not fail with
errors like "create runtime bundle: config.json already exists in
.../build/pause-image/destdir/pause_bundle". This will make life
better when building locally.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-31 19:43:11 +01:00
Steve Horsman
4d1095e653 Merge pull request #12350 from manuelh-dev/mahuber/term-grace-period
tests: Remove terminationGracePeriod in manifests
2026-01-29 15:17:17 +00:00
Fabiano Fidêncio
b85393e70b release: Bump version to 3.26.0
Bump VERSION and helm-charts versions.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-29 00:23:26 +01:00
Fabiano Fidêncio
500146bfee versions: Bump Go to 1.24.12
Update Go from 1.24.11 to 1.24.12 to address security vulnerabilities
in the standard library:

- GO-2026-4342: Excessive CPU consumption in archive/zip
- GO-2026-4341: Memory exhaustion in net/url query parsing
- GO-2026-4340: TLS handshake encryption level issue in crypto/tls

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-29 00:23:26 +01:00
Dan Mihai
20ca4d2d79 runtime: DEFDISABLEBLOCK := true
1. Add disable_block_device_use to CLH settings file, for parity with
   the already existing QEMU settings.

2. Set DEFDISABLEBLOCK := true by default for both QEMU and CLH. After
   this change, Kata Guests will use by default virtio-fs to access
   container rootfs directories from their Hosts. Hosts that were
   designed to use Host block devices attached to the Guests can
   re-enable these rootfs block devices by changing the value of
   disable_block_device_use back to false in their settings files.

3. Add test using container image without any rootfs layers. Depending
   on the container runtime and image snapshotter being used, the empty
   container rootfs image might get stored on a host block device that
   cannot be safely hotplugged to a guest VM, because the host is using
   the same block device.

4. Add block device hotplug safety warning into the Kata Shim
   configuration files.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Cameron McDermott <cameron@northflank.com>
2026-01-28 19:47:49 +01:00
Manuel Huber
5e60d384a2 kata-deploy: Update for mariner in all target
Remove the initrd function and add the image function to align
with the actually existing functions in this file.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-28 08:58:45 -08:00
Greg Kurz
ea627166b9 Merge pull request #12389 from ldoktor/ci-helm
ci.ocp: Use 0.0.0-dev tagged helm chart
2026-01-28 17:20:07 +01:00
Manuel Huber
0d8fbdef07 kernel: Readjust kernel version after decrement
Readjust the kata_config_version counter after it was
accidentally decremented in commit c7f5ff4.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-28 10:48:12 +01:00
Joji Mekkattuparamban
1440dd7468 shim: enforce iommufd for confidential guest vfio
Confidential guests cannot use traditional IOMMU Group based VFIO.
Instead, they need to use IOMMUFD. This is mainly because the group
abstraction is incompatible with a confidential device model.
If traditional VFIO is specified for a confidential guest, detect
the error and bail out early.
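
An early check of this kind could look like the sketch below. It assumes the usual Linux layout, where group-based VFIO devices appear as /dev/vfio/<group> and IOMMUFD-backed device nodes as /dev/vfio/devices/vfioN; the function name and error text are hypothetical, and the shim may detect the device type differently.

```python
import re

# IOMMUFD-backed VFIO character devices, e.g. /dev/vfio/devices/vfio0.
_IOMMUFD_CDEV = re.compile(r"^/dev/vfio/devices/vfio\d+$")


def check_vfio_device(path: str, confidential_guest: bool) -> None:
    """Reject group-based VFIO paths when the guest is confidential."""
    if confidential_guest and not _IOMMUFD_CDEV.match(path):
        raise ValueError(
            f"{path}: confidential guests require an IOMMUFD VFIO device"
        )
```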

Fixes #12393

Signed-off-by: Joji Mekkattuparamban <jojim@nvidia.com>
2026-01-28 00:11:38 +01:00
stevenhorsman
c7bc428e59 versions: Bump guest-components
Bump guest-components to 9aae2eae
to pick up the latest security fixes and toolchain bump

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-28 00:05:58 +01:00
Aurélien Bombo
932920cb86 Merge pull request #11959 from houstar/main
agent: remove redundant func comment
2026-01-27 12:01:04 -06:00
Lukáš Doktor
5250d4bacd ci.ocp: Use 0.0.0-dev tagged helm chart
In CI we are testing the latest kata-deploy, which requires the latest
helm chart. The previous query doesn't work anymore, but these days we
should be able to rely on the "0.0.0-dev" tag and on helm to print the
to-be-installed version into console.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-01-27 14:58:46 +01:00
Steve Horsman
eb3d204ff3 Merge pull request #12274 from ldoktor/pp-images
ci.ocp: Two little fixes regarding the openshift-ci
2026-01-27 11:31:51 +00:00
Lukáš Doktor
971b096a1f ci.ocp: Update cleanup.sh to cope with helm deployment
Replace the old kata-deploy cleanup and use "helm uninstall" instead.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-01-27 07:59:13 +01:00
Lukáš Doktor
272ff9c568 ci.ocp: Add notes about where to get other podvm images
I keep struggling to find the debug images, let's include them in the
peer-pods-azure.sh script so people can find them more easily.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-01-27 07:59:12 +01:00
Qingyuan Hou
ca43a8cbb8 agent: remove redundant func comment
This comment was first introduced in e111093 with secure_join()
but then we forgot to remove it when we switched to the safe-path
lib in c0ceaf6

Signed-off-by: Qingyuan Hou <lenohou@gmail.com>
2026-01-27 03:07:57 +00:00
Alex Lyn
6c0ae4eb04 Merge pull request #11585 from Apokleos/enhance-qmp
runtime-rs: Make QMP init robust by retrying handshake with deadline
2026-01-27 09:11:19 +08:00
Zvonko Kaiser
a59f791bf5 gpu: Move CUDA repo selection to versions.yaml
We want to enable local and remote CUDA repository builds.
Move the cuda and tools repo to versions.yaml, with a
unified build for both types.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-26 22:19:40 +01:00
Fabiano Fidêncio
d0fe60e784 tests: Fix empty string handling for helm
Fix empty string handling in format conversion

When HELM_ALLOWED_HYPERVISOR_ANNOTATIONS, HELM_AGENT_HTTPS_PROXY, or
HELM_AGENT_NO_PROXY are empty, the pattern matching condition
`!= *:*` or `!= *=*` evaluates to true, causing the conversion loop
to create invalid entries like "qemu-tdx: qemu-snp:".

Add -n checks to ensure conversion only runs when variables are
non-empty.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
4b2d4e96ae tests: Add qemu-{tdx,snp}-runtime-rs to the list of tee shims
We missed doing this as part of
b5a986eacf.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
26c534d610 tests: Use shims.disableAll in test helpers
Update the CI and functional test helpers to use the new
shims.disableAll option instead of iterating over every shim
to disable them individually.

Also adds helm repo for node-feature-discovery before building
dependencies to fix CI failures on some distributions.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
04f45a379c kata-deploy: docs: Document shims.disableAll option
Update the Helm chart README to document the new shims.disableAll
option and simplify the examples that previously required listing
every shim to disable.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
c9e9a682ab kata-deploy: Use disableAll in example values files
Simplify the example values files by using the new shims.disableAll
option instead of listing every shim to disable.

Before (try-kata-nvidia-gpu.values.yaml):
  shims:
    clh:
      enabled: false
    cloud-hypervisor:
      enabled: false
    # ... 15 more lines ...

After:
  shims:
    disableAll: true

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
cfe9bcbaf1 kata-deploy: Add shims.disableAll option to Helm chart
Add a new `shims.disableAll` option that disables all standard shims
at once. This is useful when:
- Enabling only specific shims without listing every other shim
- Using custom runtimes only mode (no standard Kata shims)

Usage:
  shims:
    disableAll: true
    qemu:
      enabled: true  # Only qemu is enabled

All helper templates are updated to check for this flag before
iterating over shims.

One thing that's super important to note here is that helm recursively
merges user values with chart defaults, making a simple
`disableAll` flag problematic: if defaults have `enabled: true`, user's
`disableAll: true` gets merged with those defaults, resulting in all
shims still being enabled.

The workaround found is to use null (`~`) as the default for `enabled`
field. The template logic interprets null differently based on
disableAll:

| enabled value | disableAll: false | disableAll: true |
|---------------|-------------------|------------------|
| ~ (null)      | Enabled           | Disabled         |
| true          | Enabled           | Enabled          |
| false         | Disabled          | Disabled         |

This is backward compatible:
- Default behavior unchanged: all shims enabled when disableAll: false
- Users can set `disableAll: true` to disable all, then explicitly
  enable specific shims with `enabled: true`
- Explicit `enabled: false` always disables, regardless of disableAll
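
The truth table above boils down to a small tri-state rule. The sketch below expresses it in Python rather than Helm template syntax; the function name is illustrative and is not taken from the chart.

```python
from typing import Optional


def shim_enabled(enabled: Optional[bool], disable_all: bool) -> bool:
    """Resolve a shim's effective state from `enabled` and `disableAll`."""
    if enabled is None:
        # Null default: follow disableAll (enabled unless everything is off).
        return not disable_all
    # An explicit true or false always wins over disableAll.
    return enabled


for value in (None, True, False):
    print(value, shim_enabled(value, False), shim_enabled(value, True))
```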

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
d8a3272f85 kata-deploy: Add tests for custom runtimes Helm templates
Add Bats tests to verify the custom runtimes Helm template rendering,
and that we can start a pod with the custom runtime.

Tests were written with Cursor's help.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
3be57bb501 kata-deploy: Add Helm chart support for custom runtimes
Add Helm chart configuration for defining custom RuntimeClasses with
base configuration and drop-in overrides.

Usage:
  helm install kata-deploy ./kata-deploy \
    -f custom-runtimes.values.yaml

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
a76cdb5814 kata-deploy: Add custom runtime config installation/removal
Add functions to install and remove custom runtime configuration files.
Each custom runtime gets an isolated directory structure:

  custom-runtimes/{handler}/
    configuration-{baseConfig}.toml  # Copied from base config
    config.d/
      50-overrides.toml              # User's drop-in overrides

The base config is copied AFTER kata-deploy has applied its modifications
(debug settings, proxy configuration, annotations), so custom runtimes
inherit these settings.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
4c3989c3e4 kata-deploy: Add custom runtime configuration for containerd/CRI-O
Add functions to configure custom runtimes in containerd and CRI-O.
Custom runtimes use an isolated config directory under:
  custom-runtimes/{handler}/

Custom runtimes automatically derive the shim binary path from the
baseConfig field using the existing is_rust_shim() logic.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
678b560e6d kata-deploy: Add CustomRuntime struct and parsing
Add support for parsing custom runtime configurations from a mounted
ConfigMap. This allows users to define their own RuntimeClasses with
custom Kata configurations.

The ConfigMap format uses a custom-runtimes.list file with entries:
  handler:baseConfig:containerd_snapshotter:crio_pulltype

Drop-in files are read from dropin-{handler}.toml, if present.
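
To make the entry format concrete, here is a hedged parsing sketch. It assumes one entry per line, skips blank lines, and requires exactly four colon-separated fields; the class, function, and example values are hypothetical and do not mirror the actual kata-deploy implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomRuntime:
    handler: str
    base_config: str
    containerd_snapshotter: str
    crio_pulltype: str


def parse_custom_runtimes(text: str) -> List[CustomRuntime]:
    """Parse handler:baseConfig:containerd_snapshotter:crio_pulltype lines."""
    runtimes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        parts = line.split(":")
        if len(parts) != 4:
            raise ValueError(f"malformed custom runtime entry: {line!r}")
        runtimes.append(CustomRuntime(*parts))
    return runtimes


# Hypothetical entry; the last two fields are left empty.
print(parse_custom_runtimes("my-runtime:qemu::\n"))
```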

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Fabiano Fidêncio
609a25e643 kata-deploy: Refactor runtime configuration with helper functions
Let's extract the common logic from configure_containerd_runtime and
configure_crio_runtime into reusable helper functions. This reduces
code duplication and prepares for adding custom runtime support.

For containerd:
- Add ContainerdRuntimeParams struct to encapsulate common parameters
- Add get_containerd_pluginid() to extract version detection logic
- Add get_containerd_output_path() to extract file path resolution
- Add write_containerd_runtime_config() to write common TOML values

For CRI-O:
- Add CrioRuntimeParams struct to encapsulate common parameters
- Add write_crio_runtime_config() to write common configuration

While here, let's also simplify pod_annotations to always use
"[\"io.katacontainers.*\"]" for all runtimes, as the NVIDIA specific
case has been removed from the shell script, but we forgot to do so
here.

No functional changes intended.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-26 20:50:01 +01:00
Steve Horsman
aa94038355 Merge pull request #12388 from Apokleos/fix-shimio
runtime-rs: Use File instead of UnixStream for FIFO to fix ENOTSOCK
2026-01-26 13:22:57 +00:00
tak-ka3
5471fa133c runtime-rs: Add -info flag support for containerd v2.0+
Add -info flag handling to containerd-shim-kata-v2 (Rust version).
This outputs RuntimeInfo protobuf (name, version, revision) to stdout,
providing compatibility with containerd v2.0+ which queries runtime
information via this flag.

This is the runtime-rs counterpart to the Go implementation.

Fixes #12133

Signed-off-by: tak-ka3 <takumi.hiraoka@acompany-ac.com>
2026-01-26 13:38:07 +01:00
Alex Lyn
68d671af0f runtime-rs: Make QMP init robust by retrying handshake with deadline
It aims to make QMP initialization robust by retrying the QMP handshake with
a global deadline to handle slow QEMU bring-up.

Qmp::new() used DEFAULT_QMP_READ_TIMEOUT as the effective deadline
for the QMP handshake read. When QEMU initialization is slow (e.g.
heavy host load, large memory/device init, slow storage, confidential
guests, etc.), the QMP greeting may not become readable within a small
per-read timeout (e.g. 250ms).  This caused QMP init to fail with
"Resource temporarily unavailable (os error 11)" and spam
"couldn't initialise QMP", while subsequent retries might eventually
succeed once QEMU became ready.

To address this issue, keep a short per-read timeout to avoid
indefinite blocking, but add a global "wait for QMP ready" deadline
that retries the handshake with a small backoff. This improves startup
reliability under load and avoids unnecessary reconnect failures.
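
The retry scheme described above can be sketched generically as follows. This is Python rather than the Rust used in runtime-rs, and the function name, parameters, and default backoff are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until_deadline(
    attempt: Callable[[], T],
    deadline_s: float,
    backoff_s: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call attempt() until it succeeds or the global deadline expires.

    attempt() is expected to use its own short per-read timeout and to raise
    BlockingIOError or TimeoutError when the peer is not ready yet.
    """
    end = clock() + deadline_s
    while True:
        try:
            return attempt()
        except (BlockingIOError, TimeoutError):
            if clock() >= end:
                raise  # global deadline exceeded, report the last error
            sleep(backoff_s)  # small pause before the next handshake attempt
```

Keeping the per-read timeout short bounds each individual attempt, while the global deadline bounds the total wait.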

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-26 16:47:32 +08:00
Bo Liu
c7f5ff45a2 arm64: Update ptp.conf to correct time sync
Given the patch has been merged in linux upstream, it's safe to enable
these two options.

Signed-off-by: Bo Liu <152475812+liubocflt@users.noreply.github.com>
2026-01-24 21:08:21 +01:00
Hui Zhu
37a0c81b6a libs: Change kv of get_agent_kernel_params to BTreeMap
HashMap does not guarantee iteration order, so the generated command line
could change from one run to the next. This commit changes kv of
get_agent_kernel_params to a BTreeMap so that the command line stays stable.
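
The idea of a key-ordered map can be illustrated outside Rust by sorting the keys before joining. The sketch below is Python, and the function name and parameter names are illustrative only.

```python
from typing import Mapping


def build_kernel_params(params: Mapping[str, str]) -> str:
    """Join key=value pairs in sorted key order for a stable command line."""
    return " ".join(f"{key}={value}" for key, value in sorted(params.items()))


# Two maps with the same content but different insertion order.
a = {"agent.log": "debug", "agent.debug_console": "true"}
b = {"agent.debug_console": "true", "agent.log": "debug"}
print(build_kernel_params(a) == build_kernel_params(b))
```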

Fixes: #10977

Signed-off-by: Hui Zhu <teawater@antgroup.com>
2026-01-24 21:07:41 +01:00
Alex Lyn
e7b8b302ac runtime-rs: Use File instead of UnixStream for FIFO to fix ENOTSOCK
It aims to address the issue:
"run_io_copy[Stdout]: failed to copy stream: Not a socket (os error 88)"

The `Not a socket (os error 88)` error was caused by incorrectly wrapping
a FIFO file descriptor in a `UnixStream`. The following changes:
(1) Refactor `open_fifo_write` to return `tokio::fs::File` (or a generic
  async reader/writer) instead of `AsyncUnixStream`.
(2) Ensure IO copying logic treats stdout/stderr streams as file-like
  objects rather than sockets.

This fix eliminates the "failed to copy stream" errors in the IO loop
and ensures reliable log forwarding for legacy-io.

Fixes: #12387

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-24 10:41:27 +00:00
Alex Lyn
8a0fad4b95 runtime-rs: Move the set_flag_with_blocking out as a public method
Move the private closure out and make it a public method which is
responsible for clearing O_NONBLOCK on an fd and switching it to blocking
mode.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-24 10:41:27 +00:00
Manuel Huber
6438fe7f2d tests: Remove terminationGracePeriod in manifests
Do not kill containers immediately, instead use Kubernetes'
default termination grace period.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-23 16:18:44 -08:00
Manuel Huber
0d35b36652 Revert "ci: Ensure the KBS resources are created"
This reverts commit c0d7222194.

Soon, guest components will switch to using a DB instead of
storing resources in the filesystem. Further, I don't see any
more indicators why kbs-client would struggle to set simple
resources.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-23 16:18:10 -08:00
Fabiano Fidêncio
5b82b160e2 runtime-rs: Add arm64 QEMU support
Add the necessary configuration and code changes to support QEMU
on arm64 architecture in runtime-rs.

Changes:
- Set MACHINETYPE to "virt" for arm64
- Add machine accelerators "usb=off,gic-version=host" required for
  proper arm64 virtualization
- Add arm64-specific kernel parameter "iommu.passthrough=0"
- Guard vIOMMU (Intel IOMMU) to skip on arm64 since it's not supported

These changes align runtime-rs with the Go runtime's arm64 QEMU support.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
2026-01-23 19:48:31 +01:00
tak-ka3
29e7dd27f1 runtime: Add -info flag support for containerd v2.0+
Add support for the -info flag that containerd v2.0+ passes to shims.
The flag outputs RuntimeInfo protobuf to stdout containing the shim
name and version information.

Fixes #12133

Signed-off-by: tak-ka3 <takumi.hiraoka@acompany-ac.com>
2026-01-22 19:26:44 +01:00
Steve Horsman
d0bfb27857 Merge pull request #12384 from Apokleos/fix-full-debug
doc: update enabling full debug method
2026-01-22 14:25:11 +00:00
Fabiano Fidêncio
ac8436e326 kata-deploy: Update debian in the container image to 13 (trixie)
Just a bump to the latest version, as requested by Mikko.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-22 12:32:59 +01:00
Steve Horsman
2cd76796bd Merge pull request #12305 from stevenhorsman/fix-stalebot-permissions
ci: Fix stalebot permissions
2026-01-22 10:02:43 +00:00
Alex Lyn
fb7390ce3c doc: update enabling full debug method
The enable_debug parameter was explicitly set to false rather than
being commented out (e.g., # enable_debug = true). As the previous
enabling method failed to account for this explicit setting, it was
rendered invalid. This commit updates the matching logic to correctly
handle and toggle the explicit false value.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-22 17:44:57 +08:00
Hyounggyu Choi
bc131a84b9 GHA: Set timeout for kata-deploy and kbs cleanup
It was observed that some kata-deploy cleanup steps could hang,
causing the workflow to never finish properly. In these cases,
a QEMU process was not cleaned up and kept printing debug logs
to the journal. Over time, this maxed out the runner’s disk
usage and caused the runner service to stop.

Set timeouts for the relevant cleanup steps to avoid this.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-01-22 10:32:24 +01:00
Fabiano Fidêncio
dacb14619d kata-deploy: Make verification ConfigMap a regular resource
The verification job mounts a ConfigMap containing the pod spec for
the Kata runtime test. Previously, both the ConfigMap and the Job were
Helm hooks with different weights (-5 and 0 respectively).

On k3s, a race condition was observed where the Job pod would be
scheduled before the kubelet's informer cache had registered the
ConfigMap, causing a FailedMount error:

  MountVolume.SetUp failed for volume "pod-spec": object
  "kube-system"/"kata-deploy-verification-spec" not registered

This happened because k3s's lightweight architecture schedules pods
very quickly, and the hook weight difference only controls Helm's
ordering, not actual timing between resource creation and cache sync.

By making the ConfigMap a regular chart resource (removing hook
annotations), it is created during the main chart installation phase,
well before any post-install hooks run. This guarantees the ConfigMap
is fully propagated to all kubelets before the verification Job starts.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
89e287c3b2 kata-deploy: Add more permissions to verification job's RBAC
The verification job needs to list nodes to check for the
katacontainers.io/kata-runtime label and list events to detect
FailedCreatePodSandBox errors during pod creation.

This was discovered when testing with k0s, where the service account
lacked the required cluster-scope permissions to list nodes.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
869dd5ac65 kata-deploy: Enable dynamic drop-in support for k0s
Remove k0s-worker and k0s-controller from
RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT and always return true for
k0s in is_containerd_capable_of_using_drop_in_files since k0s auto-loads
from containerd.d/ directory regardless of containerd version.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
d4ea02e339 kata-deploy: Add microk8s support with dynamic version detection
Add microk8s case to get_containerd_paths() method and remove microk8s
from RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT to enable dynamic
containerd version checking.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
69dd9679c2 kata-deploy: Centralize containerd path management
Introduce ContainerdPaths struct and get_containerd_paths() method to
centralize the complex logic for determining containerd configuration
file paths across different Kubernetes distributions.

The new ContainerdPaths struct includes:
- config_file: File to read containerd version from and write to
- backup_file: Backup file path before modification
- imports_file: File to add/remove drop-in imports from (Option<String>)
- drop_in_file: Path to the drop-in configuration file
- use_drop_in: Whether drop-in files can be used
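
The fields listed above could be laid out roughly like this. Only the
`Option<String>` type of imports_file is stated in this message; the
other field types and the `file_to_edit` helper are assumptions made
for illustration.

```rust
/// Sketch of the struct described above. Field types other than
/// `imports_file` are assumed.
#[derive(Debug, Clone, PartialEq)]
struct ContainerdPaths {
    /// File to read the containerd version from and write to.
    config_file: String,
    /// Backup file path before modification.
    backup_file: String,
    /// File to add/remove drop-in imports from, if any.
    imports_file: Option<String>,
    /// Path to the drop-in configuration file.
    drop_in_file: String,
    /// Whether drop-in files can be used.
    use_drop_in: bool,
}

impl ContainerdPaths {
    /// Hypothetical helper: pick the file that would receive the
    /// runtime configuration, depending on drop-in support.
    fn file_to_edit(&self) -> &str {
        if self.use_drop_in {
            &self.drop_in_file
        } else {
            &self.config_file
        }
    }
}
```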

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
606c12df6d kata-deploy: fix JSONPath parsing for labels with dots
The JSONPath parser was incorrectly splitting on escaped dots (\.)
causing microk8s detection to fail. Labels like "microk8s.io/cluster"
were being split into ["microk8s\", "io/cluster"] instead of being
treated as a single key.

This adds a split_jsonpath() helper that properly handles escaped dots,
allowing the automatic microk8s detection via the node label to work
correctly.
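
One way such a helper might work is sketched below. The name
split_jsonpath comes from this commit, but the body is an assumption
written for illustration and may differ from the actual implementation,
for example in whether the escape is removed from the returned key.

```rust
/// Split a JSONPath-like string on dots, treating `\.` as a literal dot.
/// (Illustrative body; the real helper may behave differently.)
fn split_jsonpath(path: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut chars = path.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // An escaped dot belongs to the current key.
            '\\' if chars.peek() == Some(&'.') => {
                chars.next();
                current.push('.');
            }
            // An unescaped dot ends the current segment.
            '.' => parts.push(std::mem::take(&mut current)),
            other => current.push(other),
        }
    }
    parts.push(current);
    parts
}
```

With this sketch, a label key such as "microk8s\.io/cluster" stays in a
single segment instead of being cut at the escaped dot.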

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
ec18dd79ba tests: Simplify kata-deploy test to use helm directly
The kata-deploy test was using helm_helper which made it hard to debug
failures (die() calls would cause "Executed 0 tests" errors) and added
unnecessary complexity.

The test now calls helm directly like a user would, making it simpler
and more representative of real-world usage. The verification job status
is explicitly checked with proper failure detection instead of relying
on helm --wait.

Timeouts are configurable via environment variables to account for
different network speeds and image sizes:
- KATA_DEPLOY_TIMEOUT (default: 600s)
- KATA_DEPLOY_DAEMONSET_TIMEOUT (default: 300s)
- KATA_DEPLOY_VERIFICATION_TIMEOUT (default: 120s)

Documentation has been added to explain what each timeout controls and
how to customize them.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
86e0b08b13 kata-deploy: Improve verification job timing and failure detection
The verification job now supports configurable timeouts to accommodate
different environments and network conditions. The daemonset timeout
defaults to 1200 seconds (20 minutes) to allow for large image downloads,
while the verification pod timeout defaults to 180 seconds.

The job now waits for the DaemonSet to exist, pods to be scheduled,
rollout to complete, and nodes to be labeled before creating the
verification pod. A 15-second delay is added after node labeling to
allow kubelet time to refresh runtime information.

Retry logic with 3 attempts and a 10-second delay handles transient
FailedCreatePodSandBox errors that can occur during runtime
initialization. The job only fails on pod errors after a 30-second
grace period to avoid false positives from timing issues.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
2369cf585d tests: Fix retry loop bugs in helm_helper
The retry loop in helm_helper had two bugs:
1. Counter initialized to 10 instead of 0, causing immediate failure
2. Exit condition used -eq instead of -ge, incorrect for loop logic

These bugs would cause helm_helper to fail immediately on the first
retry attempt instead of properly retrying up to max_tries times.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
stevenhorsman
19efeae12e workflow: Fix stalebot permissions
When looking into stale bot more for issues, I realised that our existing
stale job would need permissions to work. Unfortunately the behaviour
of the actions without these permissions is to log, but still finish as successful.
This means it was hard to spot we had an issue.

Add the required permissions to get this working again and improve the message.
Also add a concurrency rule to make zizmor happy.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 17:28:59 +00:00
Steve Horsman
70f6543333 Merge pull request #12371 from stevenhorsman/cargo-check
build: Add cargo check
2026-01-21 14:50:07 +00:00
Steve Horsman
4eb50d7b59 Merge pull request #12334 from stevenhorsman/rust-linting-improvements
Rust linting improvements
2026-01-21 14:01:37 +00:00
Steve Horsman
ba47bb6583 Merge pull request #11421 from kata-containers/dependabot/go_modules/src/runtime/github.com/urfave/cli-1.22.17
build(deps): bump github.com/urfave/cli from 1.22.14 to 1.22.17 in /src/runtime
2026-01-21 11:46:02 +00:00
stevenhorsman
62847e1efb kata-ctl: Remove unnecessary unwrap
Switch `is_err()` and then `unwrap_err()` for `if let` which is
"more idiomatic"
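
A minimal before/after sketch of this pattern follows. The function
names and the `Result<u32, String>` type are made up for illustration
and are not taken from kata-ctl.

```rust
/// Before: check `is_err()` and then call `unwrap_err()`.
fn describe_before(result: Result<u32, String>) -> String {
    if result.is_err() {
        return format!("failed: {}", result.unwrap_err());
    }
    "ok".to_string()
}

/// After: bind the error directly with `if let`, with no unwrap.
fn describe_after(result: Result<u32, String>) -> String {
    if let Err(err) = result {
        return format!("failed: {}", err);
    }
    "ok".to_string()
}
```

Both functions return the same strings; the second one cannot panic if
the check and the unwrap ever drift apart.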

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:53:40 +00:00
stevenhorsman
78824e0181 agent: Remove unnecessary unwrap
Switch `is_some()` and then `unwrap()` for `if let` which is
"more idiomatic"

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:53:40 +00:00
stevenhorsman
d135a186e1 libs: Remove unnecessary unwrap
Switch `is_err()` and then `unwrap_err()` for `if let` which is
"more idiomatic"

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
949e0c2ca0 libs: Remove unused imports
Tidy up the imports

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
83b0c44986 dragonball: Remove unused imports
Clean up the imports

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
7a02c54b6c kata-ctl: Allow unused assigned in clap parsing
command isn't ever read, but leave it in for now, so we don't disrupt
the parsing option

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
bf1539b802 libs: Replace manual default
HugePageType has a manual default that can be derived
more concisely
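
For reference, deriving a default on an enum looks roughly like this.
The enum name and its variants are placeholders and are not claimed to
match HugePageType.

```rust
/// Placeholder enum standing in for the real type; variant names are
/// assumed for illustration.
#[derive(Debug, Default, PartialEq)]
enum PageKind {
    /// The `#[default]` attribute marks the variant that
    /// `PageKind::default()` returns.
    #[default]
    Hugetlbfs,
    Thp,
}
```

This replaces a hand-written `impl Default` block whose only job is to
return one fixed variant.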

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:47 +00:00
stevenhorsman
0fd9eebf0f kata-ctl: Update Cargo.lock
The cargo check identified that the lock file is out of date,
so bump this to fix the issue

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-20 16:07:34 +00:00
stevenhorsman
3f1533ae8a build: Add cargo check
We've had a couple of occasions that Cargo.lock has been out of sync
with Cargo.toml, so try and extend our rust check to pick this up in the CI.

There is probably a more elegant way than doing `cargo check` and
checking for changes, but I'll start with this approach

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-20 16:07:34 +00:00
Greg Kurz
cf3441bd2c agent: Refresh Cargo.lock
Downstream builders at Red Hat complain that `Cargo.lock` doesn't match
`Cargo.toml`.

Run `cargo check` to refresh `Cargo.lock`.

`git bisect` shows that 7cfb97d41b is the first commit where
`cargo check` has an effect in `src/agent`.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-01-20 14:44:47 +01:00
Fabiano Fidêncio
e0158869b1 tests: Add common bats test runner function
Add run_bats_tests() function to common.bash that provides consistent
test execution and reporting across all test suites (k8s, nvidia,
kata-deploy).

This removes duplicated test runner code from run_kubernetes_tests.sh,
run_kubernetes_nv_tests.sh, and run-kata-deploy-tests.sh.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-20 12:31:55 +01:00
Fabiano Fidêncio
5aff81198f helm-chart: Fix warnings on README
nydus -> `nydus`
erofs -> `erofs`

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 22:41:50 +01:00
Fabiano Fidêncio
b5a986eacf kata-deploy: Add runtime-rs TDX / SNP runtimeclasses
https://github.com/kata-containers/kata-containers/pull/11534 has been
merged and it added all the needed bits to deploy the QEMU SNP / TDX
runtime-rs variants, apart from the kata-deploy additions, which are
done by this PR.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 22:41:50 +01:00
Fabiano Fidêncio
c7570427d2 tests: Add report generation to NVIDIA tests
The NVIDIA GPU test runner script was not generating test reports,
causing the report_tests() function in gha-run.sh to have nothing
to display. This aligns the script with run_kubernetes_tests.sh by:

- Adding set -o pipefail for proper pipeline error handling
- Creating a reports directory with timestamped subdirectory
- Capturing test output to files with ok-/not_ok- prefixes
- Adding --timing flag to bats for timing information

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 18:21:43 +01:00
Fabiano Fidêncio
c1216598e8 static-checks: Fix kata-deploy reference
Let's just point to the official documentation rather than explaining
exactly how to deploy (and the current text was very outdated).

Removing fluentd / minikube examples is out of context of this commit.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 15:09:20 +01:00
Fabiano Fidêncio
96e1fb4ca6 tools: Remove runk
The runk tool hasn't been supported for a few years, with no maintainers
since ManaSugi stopped being involved in the project and the CI was
disabled in 2024.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:43:53 +01:00
Fabiano Fidêncio
f68c25de6a kata-deploy: Switch to the rust version
Let's remove the script and rely only on the rust version from now on.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:07:49 +01:00
Fabiano Fidêncio
d7aa793dde Revert "ci: Run a nightly job using the kata-deploy rust"
This reverts commit 6130d7330f, as we're
officially switching to the rust version of kata-deploy.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:07:49 +01:00
Fabiano Fidêncio
17472f3f10 release: scripts: Accept KATA_TOOLS_STATIC_TARBALL env var
a2534e7bc8 introduced the logic to also
release a kata-tools tarball, but it missed allowing
KATA_TOOLS_STATIC_TARBALL env var to be passed to the release script,
leading to the following error during the release process:
```
ERROR: Invalid environment variable "KATA_TOOLS_STATIC_TARBALL"
```

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 13:03:23 +01:00
Fabiano Fidêncio
882862d711 release: Bump version to 3.25.0
Bump VERSION and helm-charts versions.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 11:33:45 +01:00
XanderC
93beb58c5d runtime: fix network initialization for non-hotplug VMMs
In startVM(), for VMMs without hotplug support (e.g., Firecracker or
QEMU microvm), the runtime runs prestart hooks but misses rescanning
the network namespace. This causes VMs to boot with uninitialized
network configs, as updates from CNI plugins are not captured.

This patch adds a network rescan via AddEndpoints after prestart hooks
for the non-hotplug path, ensuring correct network info is passed to
the VMM configuration before the VM starts.

Fixes #11500

Signed-off-by: XanderC <xanderc@qq.com>
2026-01-17 23:56:59 +01:00
Zvonko Kaiser
428cc5d586 gpu: Chroot Cleanup
With the newest NVRC we do not need the supported GPUs
anymore.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-17 19:27:24 +01:00
Fabiano Fidêncio
1c154b4c15 kernel: Add DAX fix for arm64
The patch has been provided upstream by Seunguk Shin and is already
approved.

We'll drop it once it becomes available in the LTS tree.

Reference:
https://lore.kernel.org/all/18af3213-6c46-4611-ba75-da5be5a1c9b0@arm.coum

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-17 19:15:53 +01:00
Fabiano Fidêncio
33b1f0786e Revert "arm64: Do not use DAX with the rootfs image"
This reverts commit 2acb94ef2d, as we have
a kernel patch approved fixing the issue.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-17 19:15:53 +01:00
Alex Lyn
fe15f2fa47 runtime-rs: Remove deprecated virtio-9p
virtio-9p has not been supported for a long time, and there is no plan
to support it in runtime-rs. Remove the related items.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
b7cfc6fd72 runtime-rs: Remove mem-agent section from TDX/SNP configurations
The Memory Agent feature is not used in CoCo (TDX/SNP) scenarios, so
remove the related sections.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
634ec2b56d runtime-rs: Add configurable SNP items in Makefile when make build
Introduce the Makefile items needed to enable AMD SEV-SNP settings in
the configuration during make build, and make it possible to generate
the rendered qemu-snp-runtime-rs configuration from the *.in template.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
0abdb8e016 runtime-rs: Introduce a qemu-runtime-rs/SEV-SNP dedicated configuration
To make it work well on the SEV-SNP platforms for qemu-runtime-rs with
coco, a dedicated SEV-SNP configuration should be introduced to help
prepare related CVM resources.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
b0a82f7bb8 runtime-rs: Enable measured rootfs within configuration when make build
Enable measured rootfs in the configuration generated by make build,
and add some other items that the configuration needs to work well.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
3799855040 runtime-rs: Add configurable TDX items in Makefile when make build
Introduce the Makefile items needed to enable Intel TDX settings in
the configuration during make build, and make it possible to generate
the rendered qemu-tdx-runtime-rs configuration from the *.in template.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
4d55e2c8c8 runtime-rs: Introduce a dedicated configuration for qemu-runtime-rs/TDX
To make it work well on the TDX platforms for qemu-runtime-rs with
coco, a dedicated TDX configuration should be introduced to help
prepare related CVM resources.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Manuel Huber
956f43c6c6 runtime: skip MoveTo for systemd cgroups
Systemd-managed cgroups use the slice:prefix:name format, which is
not a filesystem path. Calling MoveTo() on such paths fails with
"invalid group path" and can abort cleanup before Delete() runs.
In some cases, this causes pod teardown delays.
Skip MoveTo for systemd-formatted sandbox/overhead cgroup paths when
sandbox_cgroup_only is true; systemd moves tasks on unit deletion.
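
The decision described above can be sketched as follows. The runtime
itself is written in Go; this Rust version only illustrates the logic,
and both function names and the exact format check are assumptions.

```rust
/// Heuristic check for systemd's `slice:prefix:name` form: exactly three
/// non-empty, colon-separated fields. (Assumed check, not the runtime's.)
fn is_systemd_cgroup_path(path: &str) -> bool {
    let fields: Vec<&str> = path.split(':').collect();
    fields.len() == 3 && fields.iter().all(|f| !f.is_empty())
}

/// Decide whether the MoveTo step should be skipped during cleanup.
fn should_skip_move_to(path: &str, sandbox_cgroup_only: bool) -> bool {
    sandbox_cgroup_only && is_systemd_cgroup_path(path)
}
```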

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-16 16:41:38 +01:00
Manuel Huber
6b70923e55 docs: Update NVIDIA GPU passthrough QEMU scenario
With the update of NVRC to v0.1.1, cold-plug becomes by design the only
supported mode, so clean up the references to hot-plug.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-16 13:50:10 +01:00
Steve Horsman
610a8bdfd5 Merge pull request #12346 from Amulyam24/ppc64le-payload
ci: move the job publish kata payload after push to an alternate runner for ppc64le
2026-01-16 11:41:53 +00:00
Fabiano Fidêncio
ea18f543b4 tests: kata-deploy: Enable verification during helm install
Enable post-install verification in kata-deploy CI tests. When
HELM_VERIFY_DEPLOYMENT is set, a simple verification pod is created
that runs with the Kata runtime to confirm deployment succeeded.

The verification pod prints kernel info and exits - success indicates
the Kata runtime is properly configured and functional.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-16 10:52:43 +01:00
Fabiano Fidêncio
a188f04d75 kata-deploy: helm: Add optional post-install verification
Add optional verification that runs after kata-deploy installation.
When a pod spec is provided via --set-file verification.pod=<file>,
a verification job runs after install/upgrade to validate deployment.

The user is fully responsible for the verification pod content:
- Pod name, runtimeClassName, annotations, and verification logic
- Pod must exit 0 on success, non-zero on failure

The verification job simply:
1. Waits for kata-deploy DaemonSet to be ready
2. Applies the user-provided pod spec
3. Waits for the pod to complete
4. Shows logs and cleans up

Usage:
  helm install kata-deploy ... \
    --set-file verification.pod=/path/to/your-pod.yaml

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-16 10:52:43 +01:00
Amulyam24
859313d904 ci: move the job payload after push to an alternate runner for ppc64le
To unlock the release, move the job to publish kata payload after push to an alternate runner(IBM owned) for ppc64le.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2026-01-16 11:14:42 +05:30
Alex Lyn
c0cca81993 runtime-rs: Set default_bridges with 0 for dragonball vmm
Dragonball VMM does not support PCI hotplug options, so default_bridges
should be set to 0.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
1a76d44e16 kata-types: Change the default bridges to 1
This aligns the default with the Makefile and the configuration
settings.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
6375b3881d runtime-rs: Set the default bridges to 1
As runtime-go uses 1 as the default number of bridges, keep it at 1
here as well so that the two runtimes stay aligned.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
8728b262fb Merge pull request #12338 from zvonkok/nvrc-update
gpu: Bump NVRC Version
2026-01-15 19:36:07 +08:00
Zvonko Kaiser
adce41c432 gpu: Bump NVRC Version
The new NVRC version works for CC and non-CC use cases,
no --feature confidential needed anymore.

Bump versions.yaml and adjust deployment instructions.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-15 01:51:10 +00:00
Manuel Huber
6753c3ac08 runtime: nvidia: Disable NVDIMM
Disable NVDIMM. When using GPU passthrough, using NVDIMM would create
a r/o file-backed memory region. When using a GPU, QEMU tries to DMA-
map guest memory for the device, resulting in a mapping error:
memory listener initialization failed: Region mem0:
vfio_container_dma_map ... -22 (Invalid argument).
For the CC configs, NVDIMM is disabled by default in qemu_amd64.go
with a warning, but we also explicitly disable the setting in the
shim configuration file.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-14 22:51:07 +01:00
Fabiano Fidêncio
a9dda0e52b versions: nvidia: Bump kernel to the latest LTS
Now that the rootfs and kernel builds are decoupled, doing the bump
becomes trivial.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 20:45:54 +01:00
Fabiano Fidêncio
4e99860fd2 workflows: nvidia: Adjust to kernel / rootfs build decouple
We don't need to store the kernel headers anymore. We do need to store
the kernel modules, instead.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
02d2b6bdf2 kernel: bump kata_config_version
We have kernel build changes, so bump the config version.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
a075c3740a gpu: build_image.sh use versions.yaml
We've done some bad file based driver determination,
now with versions.yaml there is a single source of truth.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
ffc8725164 gpu: rootfs update decoupling
Remove all the driver build instructions,
since those are now done in the kernel target.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
cca973772d gpu: deploy modules for kernel build
We need to package the built modules so that the rootfs
is able to consume them. We package the whole
/lib/modules/$(uname -r) directory with strip=2.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
13ed3cdff9 gpu: Add NVIDIA modules to build-kernel.sh
Checkout and build the kernel modules along
with the kernel to avoid the kernel rootfs dependency.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
2a11910acb gpu: Remove building of Headers
Since we build along the kernel we do not need to
carry over the headers to the rootfs build.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
b1870fef07 gpu: versions.yaml nvidia driver pinning
We want to have deterministic behaviour and only
one valid driver version acceptable via versions.yaml

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
229481b348 kernel: bugfix install yq
We actually never installed yq in the kernel build.
There are some paths that use yq but were never hit;
for the GPU use-case we need to read values from versions.yaml.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Steve Horsman
6db3a4cf8d Merge pull request #12333 from fitzthum/bump-v0180
Update Trustee and guest-components for upcoming releases
2026-01-14 19:44:55 +00:00
Tobin Feldman-Fitzthum
ca29e68acb agent-ctl: bump image-rs version
In preparation for coco v0.18.0, bump the version of image-rs we use in
agent-ctl to match what we have in versions.yaml.

Drop the snapshotter-overlayfs feature. This was dropped from image-rs
when we removed enclave-cc support.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-01-14 06:54:29 -08:00
Tobin Feldman-Fitzthum
25a08ef739 versions: bump Trustee and guest-components
Before cutting the Kata release that will be used with CoCo v0.18.0,
let's bump the versions of Trustee and guest-components to latest.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-01-14 06:43:30 -08:00
Steve Horsman
0f5f914a04 Merge pull request #12330 from LandonTClipp/docs_improvement
docs: Navigation improvements and bug fixes to Pages
2026-01-14 14:13:29 +00:00
stevenhorsman
70e3e2b0c9 genpolicy: Bump openssl-src
This is a vulnerability (CVE-2025-9230) in openssl, so move
to 3.5.4 which has a fix for this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-14 14:05:48 +01:00
stevenhorsman
aace7a7336 versions: Bump openssl-src
This is a vulnerability (CVE-2025-9230) in openssl, so move
to 3.5.4 which has a fix for this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-14 14:05:48 +01:00
Fabiano Fidêncio
2acb94ef2d arm64: Do not use DAX with the rootfs image
Kernel 6.18.x has an issue with DAX, which is not yet fixed upstream:
```
[    0.737679] EXT4-fs (pmem0p1): mounted filesystem 79676804-7c8b-491a-b2a6-9bae3c72af70 ro with ordered data mode. Quota mode: disabled.
[    0.737891] VFS: Mounted root (ext4 filesystem) readonly on device 259:1.
[    0.739119] devtmpfs: mounted
[    0.739476] Freeing unused kernel memory: 1920K
[    0.740156] Run /sbin/init as init process
[    0.740229]   with arguments:
[    0.740286]     /sbin/init
[    0.740321]   with environment:
[    0.740369]     HOME=/
[    0.740400]     TERM=linux
[    0.743162] Unable to handle kernel paging request at virtual address fffffdffbf000008
[    0.743285] Mem abort info:
[    0.743316]   ESR = 0x0000000096000006
[    0.743371]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.743444]   SET = 0, FnV = 0
[    0.743489]   EA = 0, S1PTW = 0
[    0.743545]   FSC = 0x06: level 2 translation fault
[    0.743610] Data abort info:
[    0.743656]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[    0.743720]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.743785]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.743848] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b9d17000
[    0.743931] [fffffdffbf000008] pgd=10000000bfa3d403, p4d=10000000bfa3d403, pud=1000000040bfe403, pmd=0000000000000000
[    0.744070] Internal error: Oops: 0000000096000006 [#1]  SMP
[    0.748888] CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.18.4 #1 NONE
[    0.749421] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.749969] pc : dax_disassociate_entry.constprop.0+0x20/0x50
[    0.750444] lr : dax_insert_entry+0xcc/0x408
[    0.750802] sp : ffff80008000b9e0
[    0.751083] x29: ffff80008000b9e0 x28: 0000000000000000 x27: 0000000000000000
[    0.751682] x26: 0000000001963d01 x25: ffff0000004f7d90 x24: 0000000000000000
[    0.752264] x23: 0000000000000000 x22: ffff80008000bcc8 x21: 0000000000000011
[    0.752836] x20: ffff80008000ba90 x19: 0000000001963d01 x18: 0000000000000000
[    0.753407] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[    0.753970] x14: ffffbf3154b9ae70 x13: 0000000000000000 x12: ffffbf3154b9ae70
[    0.754548] x11: ffffffffffffffff x10: 0000000000000000 x9 : 0000000000000000
[    0.755122] x8 : 000000000000000d x7 : 000000000000001f x6 : 0000000000000000
[    0.755707] x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffdffc0000000
[    0.756287] x2 : 0000000000000008 x1 : 0000000040000000 x0 : fffffdffbf000000
[    0.756871] Call trace:
[    0.757107]  dax_disassociate_entry.constprop.0+0x20/0x50 (P)
[    0.757592]  dax_iomap_pte_fault+0x4fc/0x808
[    0.757951]  dax_iomap_fault+0x28/0x30
[    0.758258]  ext4_dax_huge_fault+0x80/0x2dc
[    0.758594]  ext4_dax_fault+0x10/0x3c
[    0.758892]  __do_fault+0x38/0x12c
[    0.759175]  __handle_mm_fault+0x530/0xcf0
[    0.759518]  handle_mm_fault+0xe4/0x230
[    0.759833]  do_page_fault+0x17c/0x4dc
[    0.760144]  do_translation_fault+0x30/0x38
[    0.760483]  do_mem_abort+0x40/0x8c
[    0.760771]  el0_ia+0x4c/0x170
[    0.761032]  el0t_64_sync_handler+0xd8/0xdc
[    0.761371]  el0t_64_sync+0x168/0x16c
[    0.761677] Code: f9453021 f2dfbfe3 cb813080 8b001860 (f9400401)
[    0.762168] ---[ end trace 0000000000000000 ]---
[    0.762550] note: init[1] exited with irqs disabled
[    0.762631] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
```

For now, the rootfs that we ship for ARM64 does not use DAX. We'll
re-enable it as soon as the patch lands in the mainline kernel.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 11:46:40 +01:00
Fabiano Fidêncio
3ef99f4ee3 versions: Add specific nvidia kernel version
This is needed as the 580 driver doesn't build against 6.18.x, and the
590 driver is not yet fully working for our case, thus we stick to the
previous version that worked before.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 11:46:40 +01:00
Fabiano Fidêncio
cce5d4abf6 kernel: bump to v6.18.x (LTS)
Bump both the kernel and kernel-confidential versions from v6.12.x and
v6.16.x to v6.18.4, aligning with the new LTS release.

Kernel 6.18 introduced several configuration changes that required
updates to our kernel config fragments:

* CRYPTO_FIPS dependencies changed:
  - In 6.12: depended on !CRYPTO_MANAGER_DISABLE_TESTS
  - In 6.18: now depends on CRYPTO_SELFTESTS (which requires EXPERT)
  Added CONFIG_EXPERT=y and CONFIG_CRYPTO_SELFTESTS=y to crypto.conf
  to satisfy the new dependency chain.
  * CONFIG_EXPERT is a naughty one, as it disables / enables a bunch
    of things behind one's back, probably just to prove a point that
    it is for experts ;-) ... regardless, a reasonable amount of
    options had to be re-added in order to make sure nothing ends
    up broken.

* Legacy iptables support:
  Kernel 6.18 requires explicit legacy xtables/iptables configs for
  IP_NF_* options. Added CONFIG_NETFILTER_XTABLES_LEGACY,
  CONFIG_IP_NF_IPTABLES_LEGACY, and CONFIG_IP6_NF_IPTABLES_LEGACY
  to netfilter.conf.

* Module signing dependencies:
  Added CONFIG_MODULES=y and other required dependencies to
  module_signing.conf to ensure MODULE_SIG can be properly enabled.

* Whitelist updates:
  - Added CONFIG_NF_CT_PROTO_DCCP (removed in 6.18+)
  - Added CONFIG_CRYPTO_SELFTESTS, CONFIG_NETFILTER_XTABLES_LEGACY,
    CONFIG_IP_NF_IPTABLES_LEGACY, CONFIG_IP6_NF_IPTABLES_LEGACY
    (added in 6.18+, not present in older kernels like 6.12)

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 11:46:40 +01:00
LandonTClipp
197231456f docs: Navigation improvements and bug fixes to Pages
A few minor changes to the Zensical config that make navigation easier. Also
fixed a couple of bugs with local serving and added some quality of life
features to Zensical.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-01-13 11:17:58 -06:00
LandonTClipp
94fde1356c docs: Add Zensical Doc Site Generation
This commit adds a GitHub workflow for building a GitHub Pages site for the markdown
files in the docs/ directory. Zensical is a new markdown-based static site generation
framework built by the creators of Material for MkDocs. https://zensical.org/

This commit does not clean the doc structure, so site navigation is initially going to
be messy.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-01-13 12:42:02 +01:00
dependabot[bot]
2edb161c53 build(deps): bump github.com/urfave/cli in /src/runtime
Bumps [github.com/urfave/cli](https://github.com/urfave/cli) from 1.22.14 to 1.22.17.
- [Release notes](https://github.com/urfave/cli/releases)
- [Changelog](https://github.com/urfave/cli/blob/main/docs/CHANGELOG.md)
- [Commits](https://github.com/urfave/cli/compare/v1.22.14...v1.22.17)

---
updated-dependencies:
- dependency-name: github.com/urfave/cli
  dependency-version: 1.22.17
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-13 09:04:41 +00:00
dependabot[bot]
3377d729ea build(deps): bump rsa from 0.9.6 to 0.9.9 in /src/tools/agent-ctl
Bumps [rsa](https://github.com/RustCrypto/RSA) from 0.9.6 to 0.9.9.
- [Changelog](https://github.com/RustCrypto/RSA/blob/v0.9.9/CHANGELOG.md)
- [Commits](https://github.com/RustCrypto/RSA/compare/v0.9.6...v0.9.9)

---
updated-dependencies:
- dependency-name: rsa
  dependency-version: 0.9.9
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-13 04:08:40 +01:00
Fupan Li
1f1a000608 Merge pull request #12291 from Apokleos/bump-qapi
runtime-rs: Bump qapi-rs from 0.14 to 0.15
2026-01-13 10:39:41 +08:00
Manuel Huber
9e30283952 runtime: nvidia: change kernel parameters
Remove the agent hotplug timeout parameter from the kernel
command line. Having shifted to VFIO cold-plug, this parameter is
no longer needed.
Remove the no longer required parameter for TDX and thus align the
SNP and TDX configurations.
Add a parameter to prevent the kernel from mounting the /dev tmpfs. NVRC
and later on kata-agent attempt this. While kata-agent does not
panic when mounting /dev fails, NVRC makes mounting /dev a hard
requirement.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-12 16:11:28 -08:00
dependabot[bot]
bcadb9b231 build(deps): bump sequoia-openpgp in /src/tools/agent-ctl
Bumps [sequoia-openpgp](https://gitlab.com/sequoia-pgp/sequoia) from 2.0.0 to 2.1.0.
- [Commits](https://gitlab.com/sequoia-pgp/sequoia/compare/openpgp/v2.0.0...openpgp/v2.1.0)

---
updated-dependencies:
- dependency-name: sequoia-openpgp
  dependency-version: 2.1.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-12 22:16:51 +01:00
Alex Lyn
fba92880c9 tests: make set_container_command idempotent and add debug output
set_container_command() previously appended command arguments
one-by-one with
'.command += [...]'. This makes the helper non-idempotent and can
lead to unexpected command arrays when invoked multiple times.

Update the helper to set the full command array in a single yq v4
expression and print the target YAML path plus the command being
applied to simplify debugging when tests fail.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-12 17:56:28 +01:00
Alex Lyn
38296a41b2 tests: Generate pod config with stable .yaml suffix
The pod config file created by new_pod_config() was generated via
mktemp using the template "pod-config.yaml.in.XXX", which produces
filenames that do not end with ".yaml" (e.g. pod-config.yaml.in.ABC).

If the random suffix happens to look like another file extension (e.g.
".Csv" or ".Xml"), the subsequent operations with yq will fail.

Some helpers and tooling assume the config path ends with ".yaml".
Switch the mktemp template to place the random suffix before the
extension so the returned path always ends with ".yaml".

Fixes: #12268, #12319

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-12 17:56:28 +01:00
Fabiano Fidêncio
9fec31f400 tools: kubectl: Add kubectl version as a tag
This is a suggestion from Choi, so we can easily test with a specific
kubectl version and also easily understand which kubectl version is
being used in case of failure.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-12 15:48:44 +01:00
Fabiano Fidêncio
26dfcb627b tools: Build kubectl image
This image will be used by our helm charts to verify that a
kata-containers deployment is correct.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-12 15:48:44 +01:00
Alex Lyn
d03eccf567 runtime-rs: Improve wait_for_migration to avoid fixed sleep
Enhance the wait_for_migration implementation to reliably wait for
QEMU migration completion and avoid the previous `sleep(280ms)`
delay.
(1) Add an initial fast-path query to return immediately if
migration is already completed/failed/cancelled.
(2) Use a hard deadline to enforce timeouts deterministically.
(3) Implement adaptive polling with backoff and a maximum interval
to reduce QMP load while keeping responsiveness.
(4) Unify migration status handling and return clear errors on
failed/cancelled states.
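
The four points above can be sketched roughly as follows. This is a minimal, std-only illustration and not the runtime-rs code: `MigrationState`, the `query` closure, and the interval values are hypothetical stand-ins for the real QMP `query-migrate` call and its status values.

```rust
use std::time::{Duration, Instant};

/// Hypothetical migration states, standing in for the QMP status values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MigrationState {
    Active,
    Completed,
    Failed,
    Cancelled,
}

/// Poll `query` until a terminal state is seen or `timeout` elapses.
fn wait_for_migration<F>(mut query: F, timeout: Duration) -> Result<(), String>
where
    F: FnMut() -> MigrationState,
{
    // Hard deadline, computed once up front.
    let deadline = Instant::now() + timeout;
    // Assumed values: start at 1ms and cap the backoff at 50ms.
    let mut interval = Duration::from_millis(1);
    let max_interval = Duration::from_millis(50);

    loop {
        // The first iteration doubles as the fast path: no sleep before it.
        match query() {
            MigrationState::Completed => return Ok(()),
            MigrationState::Failed => return Err("migration failed".to_string()),
            MigrationState::Cancelled => return Err("migration cancelled".to_string()),
            MigrationState::Active => {}
        }

        let now = Instant::now();
        if now >= deadline {
            return Err("migration timed out".to_string());
        }

        // Never sleep past the deadline, then double the interval up to the cap.
        std::thread::sleep(interval.min(deadline - now));
        interval = (interval * 2).min(max_interval);
    }
}

fn main() {
    let mut polls = 0;
    let result = wait_for_migration(
        || {
            polls += 1;
            if polls < 3 {
                MigrationState::Active
            } else {
                MigrationState::Completed
            }
        },
        Duration::from_secs(1),
    );
    println!("{result:?} after {polls} polls");
}
```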

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-12 20:06:55 +08:00
Alex Lyn
5026b33455 runtime-rs: Introduce a method to detect current migrate info
Return information about the current migration process. The input and
output are as below:
{ 'command': 'query-migrate', 'returns': 'MigrationInfo' }

Note that this QEMU API is only available with qapi-rs v0.15+.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-12 20:06:55 +08:00
Alex Lyn
c472b5db54 runtime-rs: Bump qapi-rs from 0.14 to 0.15
The updated versions are as below:
```
qapi = { version = "0.15", features = ["qmp", "async-tokio-all"] }
qapi-spec = "0.3.2"
qapi-qmp = "0.15.0"
```
and it will correct some corresponding structures.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-12 20:06:55 +08:00
Manuel Huber
183507beeb agent: change secure_storage_integrity default
Change the secure_storage_integrity option's default value to true.
With this, integrity protection for encrypted block device contents
will be requested from the confidential data hub by default, see the
agent's cdh_handler_trusted_storage function in rpc.rs.
This behavior can be disabled by explicitly setting the
agent.secure_storage_integrity parameter to 0 or false via kernel
command line parameters.

This will affect the trusted storage implementation for the guest-pull
mechanism, and it will affect future implementations using this code
path, such as implementations for ephemeral secure storage.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-10 16:54:03 +01:00
stevenhorsman
a0d96256f5 packaging: Fix tools permissions issue
In some builds we are seeing:
```
error: could not create temp file /opt/rustup/tmp/r2xu46kwuyc7k2kr_file: Permission denied (os error 13)
```
in the agent-ctl build, so try and port a fix from #12313 to the tools build
to try and resolve this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-09 21:45:26 +01:00
Federico A. Corazza
787768fe9b kata-deploy: Fix extraction of the containerd major version
Fixes deploying kata-containers using k3s. The deploy script fails with /opt/kata-artifacts/scripts/kata-deploy.sh: line 397: [: too many arguments

Signed-off-by: Federico A. Corazza <git@facorazza.com>
2026-01-09 19:52:18 +01:00
stevenhorsman
5067ed7d9a versions.yaml: Fix formatting errors
yamllint complains that there is only one space before the comment,
so add a second to prevent this annoying message showing up.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-09 19:36:31 +01:00
stevenhorsman
a850f66fc4 versions: Bump rust to 1.89
Following the agreed toolchain policy - bump rust to the current (1.91)-2
releases.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-09 19:36:31 +01:00
Manuel Huber
df2896c298 docs: Create NVIDIA GPU passthrough QEMU scenario
Create a new page for a reference implementation for Kubernetes
using QEMU, the go shim and an NVIDIA rootfs. The new page
contains information on:
- components involved in the NVIDIA (TEE) GPU scenario
- orchestration flow for GPU passthrough scenarios
- deployment guidance

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-09 19:02:56 +01:00
Manuel Huber
43627805f4 docs: Improve structure and flow of NVIDIA guide
- Apply a few structural/grouping changes and improve flow
- Group build sections together
- Move usage examples to last section

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-09 19:02:56 +01:00
Steve Horsman
489deaad17 Merge pull request #12297 from manuelh-dev/mahuber/fix-doc
docs: Fix trusted-image-storage reference
2026-01-09 15:22:25 +00:00
Hyounggyu Choi
2962e14c10 virtiofsd: fix RUSTUP_HOME and CARGO_HOME permissions for non-root builds
The following error was observed during virtiofsd static build:

```
error: could not create temp file /opt/rustup/tmp/p44enysfaxwdbvw4_file:
Permission denied (os error 13)
```

This occurs because RUSTUP_HOME and CARGO_HOME were initialized by the
root user during `docker build`, but `cargo build` is executed as a
non-root user via 'docker run --user'.

Ensure these directories are writable by adjusting the permission after
the toolchain installation is complete.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-01-09 14:01:20 +01:00
Manuel Huber
65aa99f291 docs: Fix trusted-image-storage reference
The sample uses a volume device name which does not exist,
hence fix.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-09 11:41:18 +00:00
Saul Paredes
02979a13e3 Merge pull request #12208 from romoh/patch-1
ci: Update AKS setup post Pod Sandboxing GA
2026-01-08 11:02:05 -08:00
Fabiano Fidêncio
f8318c0542 kata-deploy: Remove unused dependency
We're depending solely on toml_edit, thus we can safely remove the toml
dependency.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-08 18:58:11 +01:00
Fupan Li
b3546f3a68 Merge pull request #12282 from kata-containers/set-required-ci
Set several tests as required ci
2026-01-08 20:34:39 +08:00
Mikko Ylinen
cc6277b735 Revert "tdx: Update GPU config for the latest TDX stack"
Prefer the "full feature TDVF" instead of the generic OVMF build. See
Option-B in
https://github.com/tianocore/edk2/tree/master/OvmfPkg/IntelTdx#configurations-and-features
for the extra hardening supported.

FIRMWAREPATH_NV also seems to be TDX specific unlike the Makefile
suggests. Therefore, it can be dropped completely.

This reverts commit 66ccc25724.
2026-01-08 10:21:47 +01:00
Mikko Ylinen
e02e226431 packaging: build OVMF for Intel TDX again
OVMF build for Intel TDX (aka "TDVF") was disabled in favor of Ubuntu/
CentOS pre-upstream releases of Intel TDX.

See 4292c4c3b1.

It's time to re-enable the build and move runtime configurations to
use it (the latter will be done in a later commit).

This is a partial revert of 4292c4c3b with the following changes:
- Stop calling OVMF for Intel TDX "TDVF" and follow the naming distros
use for TDX enabled build: OVMF.inteltdx.fd.
- Single binary OVMF.inteltdx.fd is supported using -bios QEMU param.
- Secure Boot infrastructure is disabled since Kata does not support it.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-01-08 10:21:47 +01:00
Alex Lyn
f3d92a8b4a dragonball: Fix UT failed in test_fs_manipulate_backend_fs
Improve the checking logic for source path existing.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 12:42:00 +08:00
Alex Lyn
7de968b416 dragonball: Fix warning of unused method
This method is in fact called, so just add the `#[allow(dead_code)]`
attribute to let the UT pass. The warning looks like:
warning: method `send_message_with_payload` is never used
    |
224 | impl<R: Req> Endpoint<R> {
    | ------------------------ method in this implementation
...
522 |     pub fn send_message_with_payload<T: Sized, P: Sized>(
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 11:01:34 +08:00
Alex Lyn
36d3d7c3bf dragonball: Fix warnings of result to be handled
warning: unused `std::result::Result` that must be used
   -->
src/dragonball/dbs_virtio_devices/src/vhost/vhost_user/net.rs:679:9
    |
679 | /         VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync,
GuestRegionMmap>::write_config(
680 | |             &mut dev, 0, &config,
681 | |         );
    | |_________^
    |
    = note: this `Result` may be an `Err` variant, which should be
handled
    = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
    |
679 |         let _ = VirtioDevice::<Arc<GuestMemoryMmap<()>>,
QueueSync, GuestRegionMmap>::write_config(
    |         +++++++

warning: unused `std::result::Result` that must be used
   -->
src/dragonball/dbs_virtio_devices/src/vhost/vhost_user/net.rs:683:9
    |
683 | /         VirtioDevice::<Arc<GuestMemoryMmap<()>>, QueueSync,
GuestRegionMmap>::read_config(
684 | |             &mut dev, 0, &mut data,
685 | |         );
    | |_________^
    |
    = note: this `Result` may be an `Err` variant, which should be
handled
help: use `let _ = ...` to ignore the resulting value
    |
683 |         let _ = VirtioDevice::<Arc<GuestMemoryMmap<()>>,
QueueSync, GuestRegionMmap>::read_config(
    |         +++++++

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 10:52:19 +08:00
Alex Lyn
6a1b25a4b0 dragonball: Fix warning of variable does not need to be mutable
The warning looks like:
...
warning: variable does not need to be mutable
   --> src/dragonball/dbs_virtio_devices/src/vsock/csm/txbuf.rs:217:13
    |
217 |         let mut tmp: Vec<u8> = vec![0; TxBuf::SIZE - 2];
    |             ----^^^
    |             |
    |             help: remove this `mut`
    |
    = note: `#[warn(unused_mut)]` on by default
...

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 10:44:25 +08:00
Alex Lyn
064271b9cb dragonball: Fix unexpected cfg condition of test-resources
Fix the warnings about the unexpected cfg value test-resources. The
detailed warning message is shown below:

...
warning: unexpected `cfg` condition value: `test-resources`
   --> src/dragonball/dbs_virtio_devices/src/fs/device.rs:973:11
    |
973 |     #[cfg(feature = "test-resources")]
    |           ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: expected values for `feature` are: `fuse-backend-rs`,
`vhost`, `vhost-net`, `vhost-rs`, `vhost-user`, `vhost-user-blk`,
`vhost-user-fs`, `vhost-user-net`, `virtio-balloon`, `virtio-blk`,
`virtio-fs`, `virtio-fs-pro`, `virtio-mem`, `virtio-mmio`, `virtio-net`,
and `virtio-vsock`
    = help: consider adding `test-resources` as a feature in
`Cargo.toml`
...

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 10:39:33 +08:00
Alex Lyn
ef36c47ca4 runtime-rs: Fix deprecated method in UT
Remove into_path() and replace it with keep().

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 10:32:31 +08:00
Alex Lyn
e4451baa84 tests: Set run-nerdctl-tests with qemu-runtime-rs required
run-nerdctl-tests (qemu-runtime-rs)

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 09:56:50 +08:00
Alex Lyn
56a21c33a3 tests: Set stability tests with qemu-runtime-rs required
run-containerd-stability (active, qemu-runtime-rs)
run-containerd-stability (lts, qemu-runtime-rs)

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 09:56:50 +08:00
Alex Lyn
679e31d884 tests: Set run-nydus CIs as required
run-basic-amd64-tests / run-nydus

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-08 09:56:50 +08:00
Fabiano Fidêncio
6b3953dd51 tests: k8s: liveness-probes: Adjust events grep
Till k8s 1.34 we could grep by "Started containerd". From k8s 1.35
onwards the event message changed and we should, instead, grep by
"Container started".

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-07 23:01:59 +01:00
Fabiano Fidêncio
c4194538e2 versions: Bump QEMU to v10.2.0
QEMU v10.2.0 was released on December 24th, 2025.

The experimental GPU SNP / TDX are also pointing to v10.2.0 release with
their gpu-{snp,tdx}-20260107 branch.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-07 12:30:55 +01:00
Steve Horsman
93ad6fde75 Merge pull request #12294 from stevenhorsman/remediate-RUSTSEC-2021-0064
versions: Bump sha2 crate version
2026-01-07 09:53:26 +00:00
stevenhorsman
c456b84537 versions: Bump sha2 crate version
sha2 0.9.3 includes the use of cpuid-bool, which was renamed to cpufeatures
around 5 years ago. Try moving to a workspace dependency of sha2
and bumping to the latest version to remediate RUSTSEC-2021-0064

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-06 15:41:34 +00:00
Roaa Sakr
44c79cf14a ci: Update AKS setup post Pod Sandboxing GA
Update workload-runtime value to align with current AKS Pod Sandboxing documentation post GA.

Signed-off-by: Roaa Sakr <romoh@microsoft.com>
2026-01-05 13:47:33 -08:00
Steve Horsman
9463dd970e Merge pull request #12287 from mythi/drop-qat
use-cases: drop Intel QuickAssist instructions
2026-01-05 13:28:16 +00:00
Mikko Ylinen
99bc0f49cc use-cases: drop Intel QuickAssist instructions
While the use-case of Intel QuickAssist (QAT) accelerated crypto
and/or compression with k8s and Kata Containers is still valid,
the setup instructions are outdated:

Starting with Intel Xeon Gen4 (Sapphire Rapids), QAT driver
stack moved to in-tree drivers without a separate SR-IOV VF
driver.

Drop all the setup instructions but keep the use-cases doc
for reference. Users wanting to enable the use-case, should consult
with Intel QAT Device plugins or Intel QAT DRA driver authors.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-01-02 12:14:04 +02:00
Fupan Li
b27a80b800 Merge pull request #12156 from Apokleos/required-coco-dev-rs
tests: Make the tests coco-dev job with coco-dev-runtime-rs required
2025-12-25 17:30:40 +08:00
Steve Horsman
bdc5f7d4be Merge pull request #12271 from stevenhorsman/bump-rust-to-1.88
Bump rust to 1.88
2025-12-23 21:38:42 +00:00
Alex Lyn
0b1a5c6e93 tests: Make the tests coco-dev job with coco-dev-runtime-rs required
The nontee job (run-k8s-tests-coco-nontee) for qemu-coco-dev-runtime-rs
is running well and it's time to make it required when the CI runs.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-12-23 09:54:52 +08:00
stevenhorsman
b6108a7c4a dragonball: Fix manual implementation of .is_multiple_of
Use this new method to avoid the clippy warning and increase
readability

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
55be31ef0f runtime-rs: Fix manual implementation of .is_multiple_of
Use this new method to avoid the clippy warning and increase
readability

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
1d139a7c92 versions: Bump rust to 1.88
In prep for the bump to rust 1.90, try bumping
to 1.88 first to see if the CI is successful here

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
c6053e976f dragonball: Improve vector initialisation
Directly initialise  a zero-filled vector, rather than resizing later

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
18a51dad98 dragonball: Fix manual slice size calculation
Using the built in size_of_val is easier to read and less error-prone
than doing this calculation manually
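
As a small illustration of this kind of change (the function names are hypothetical, not taken from dragonball), the manual calculation and `size_of_val` give the same byte count for a slice:

```rust
use std::mem::{size_of, size_of_val};

/// Manual version: element count times element size.
fn bytes_manual(entries: &[u64]) -> usize {
    entries.len() * size_of::<u64>()
}

/// Built-in version: the size of the whole slice in bytes.
fn bytes_builtin(entries: &[u64]) -> usize {
    size_of_val(entries)
}

fn main() {
    let entries = [0u64; 4];
    println!("{} {}", bytes_manual(&entries), bytes_builtin(&entries));
}
```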

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
188c9e6eb7 dragonball: Prefer from over into
From gives Into for free, so prefer this method

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
c7daa12fe6 dragonball: Remove unnecessary cast
Don't cast usize to usize

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
6c19bd01c8 dragonball: Fix redundant pattern matching
Convert `matches!(desc, None)` to desc.is_none() which is simpler

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
15c6ef5988 dragonball: Fix deprecated cargo-clippy cfg
#[cfg(feature = "cargo-clippy")] has been deprecated for years,
so should be replaced with `#[cfg(clippy)]`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
e0d09dd787 dragonball: Fix useless use of vec!
`vec![...]` is the same as `[...]`, so remove it to clean up code

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
4fb90d61aa dragonball: Temporarily skip kvm bindgen tests
There are many, many null pointer dereferences in the bindgen code
when moving between rust 1.85.1 and 1.86 and no docs of the source
that it was generated from, so try and skip
these tests until an SME can look at them @lifupan

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:19 +00:00
stevenhorsman
04306c162b genpolicy: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings
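
For reference, the shape of the change this lint asks for looks roughly like the sketch below. The function and strings are made up for illustration and are not from genpolicy.

```rust
/// Build the same message twice: once with positional arguments and once
/// with the identifiers captured inline in the format string.
fn describe(name: &str, count: usize) -> (String, String) {
    // Before: the form flagged by clippy::uninlined_format_args.
    let before = format!("{} has {} containers", name, count);
    // After: variables named directly inside the braces.
    let after = format!("{name} has {count} containers");
    (before, after)
}

fn main() {
    let (before, after) = describe("pod-a", 2);
    assert_eq!(before, after);
    println!("{after}");
}
```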

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:11 +00:00
stevenhorsman
b9ce0bbdf8 trace-forwarder: Fix uninlined_format_args in examples
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:11 +00:00
stevenhorsman
c5f0acef23 kata-ctl: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:02 +00:00
stevenhorsman
aff3524420 kata-ctl: Refresh runtime-rs crates
runtime-rs crates are pulled into kata-ctl and some of these have
bumped recently, so update these in kata-ctl as well

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:50:01 +00:00
stevenhorsman
2caa62f753 agent-ctl: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:52 +00:00
stevenhorsman
6006b8350d libs: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:45 +00:00
stevenhorsman
2fde31547a runtime-rs: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:36 +00:00
stevenhorsman
a299338b6c dragonball: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:27 +00:00
stevenhorsman
e44c4d901f doc: Fix uninlined_format_args in examples
Clippy is recommending that format args are inlined for
better clarity, so ensure our docs include this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:27 +00:00
stevenhorsman
b07899f8dc agent: Fix uninlined_format_args
Clippy is recommending that format args are inlined for
better clarity, so update our code to remove these warnings

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-22 19:49:17 +00:00
stevenhorsman
2af88dbb48 agent: bump cdi-rs
In #12151 the version was bumped in cargo.toml, but the update not
done, so run `cargo update -p container-device-interface` to apply it

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-20 10:08:45 +00:00
Steve Horsman
97603608ac Merge pull request #12259 from RuoqingHe/filter-tests-requires-kvm
dragonball: Skip tests require kvm while kvm is absent
2025-12-19 16:05:33 +00:00
Steve Horsman
81d74346f3 Merge pull request #12255 from stevenhorsman/bump-to-rust-1.90-prep
Preparations for the rust 1.90 bump
2025-12-19 14:41:32 +00:00
Steve Horsman
b75cc16bad Merge pull request #12272 from shwetha-s-poojary/revert_cleanup
workflows: payload: do not remove AGENT_TOOLSDIRECTORY
2025-12-19 14:22:36 +00:00
shwetha-s-poojary
1929ca8879 workflows: payload: do not remove AGENT_TOOLSDIRECTORY
Remove line that deletes $AGENT_TOOLSDIRECTORY

Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>
2025-12-19 05:24:36 -08:00
Ruoqing He
5fa663b1e3 dragonball: Skip tests requiring KVM when KVM is absent
KVM is not available in our ARM runners, so let's skip those tests
accordingly, while the remaining test cases stay tested on machines
with KVM present and access to the KVM device.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-12-18 14:17:46 +00:00
Ruoqing He
7cfb97d41b libs: Introduce skip_if_kvm_unaccessable macro
There are test cases that require interaction with the KVM device;
introduce the skip_if_kvm_unaccessable macro to skip them.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-12-18 12:43:20 +00:00
stevenhorsman
e5568e65a1 lib: Fix missing copyright and license
Add the copyright date from when the file was first submitted to github

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
175c2c70b1 dragonball: Fix pointer equality check
Use `ptr::eq` to compare references by address rather than the
values that they point to
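
A minimal sketch of the distinction, using a hypothetical `Region` type rather than the dragonball code: `==` compares the values behind two references, while `ptr::eq` asks whether both references point at the same object.

```rust
use std::ptr;

/// Hypothetical type used only for this illustration.
#[derive(PartialEq)]
struct Region {
    base: u64,
}

/// True only when both references point at the same object in memory.
fn same_object(a: &Region, b: &Region) -> bool {
    ptr::eq(a, b)
}

fn main() {
    let r1 = Region { base: 0x1000 };
    let r2 = Region { base: 0x1000 };
    // Equal by value, but two distinct objects.
    println!("{} {}", r1 == r2, same_object(&r1, &r2));
}
```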

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
a221eaa81d dragonball: Fix length comparison to zero
Replace .len() == 0 with .is_empty() for more clarity

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
e73a7c3717 dragonball: Replace manual div_ceil
Use the more clear built-in method
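
As an illustration (the page-count helpers are hypothetical, not the dragonball code), the manual rounding-up idiom and the built-in `div_ceil` agree:

```rust
/// Manual ceiling division: add (divisor - 1) before dividing.
fn pages_manual(bytes: u64, page_size: u64) -> u64 {
    (bytes + page_size - 1) / page_size
}

/// Built-in ceiling division on unsigned integers.
fn pages_builtin(bytes: u64, page_size: u64) -> u64 {
    bytes.div_ceil(page_size)
}

fn main() {
    println!("{} {}", pages_manual(4097, 4096), pages_builtin(4097, 4096));
}
```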

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
048000654c runtime-rs: Prevent doc test issue
cargo test was trying to evaluate the documentation comment and failing,
so try and make the comment explicitly text to avoid this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
4384b6ad9f dragonball: Avoid manual implementation of ok
Refactor to use `.ok()` rather than implementing it ourselves
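
A sketch of the pattern, with hypothetical function names: a `match` that maps `Ok` to `Some` and `Err` to `None` is exactly what `Result::ok` already does.

```rust
/// Manual conversion from Result to Option.
fn parse_manual(s: &str) -> Option<u32> {
    match s.parse::<u32>() {
        Ok(v) => Some(v),
        Err(_) => None,
    }
}

/// The same conversion using the built-in method.
fn parse_ok(s: &str) -> Option<u32> {
    s.parse::<u32>().ok()
}

fn main() {
    println!("{:?} {:?}", parse_manual("42"), parse_ok("not a number"));
}
```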

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
f4dd69a835 dragonball: Remove unnecessary unwrap
Given that we call `is_some` earlier, we don't then need to unwrap,
so refactor to avoid this
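
One common way to do such a refactor is `if let`, sketched below with made-up function names; the actual dragonball change may be shaped differently.

```rust
/// Before: check with is_some(), then unwrap the same value.
fn label_before(tag: Option<&str>) -> String {
    if tag.is_some() {
        format!("tag={}", tag.unwrap())
    } else {
        "untagged".to_string()
    }
}

/// After: bind the inner value directly, so no unwrap is needed.
fn label_after(tag: Option<&str>) -> String {
    if let Some(t) = tag {
        format!("tag={t}")
    } else {
        "untagged".to_string()
    }
}

fn main() {
    println!("{} {}", label_before(Some("v1")), label_after(None));
}
```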

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
20192f819f agent-ctl: Remove unnecessary unwrap
Given that we call `is_some` earlier, we don't then need to unwrap,
so refactor to avoid this

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
9bf5f113f9 genpolicy: Allow dead_code
A few structs in genpolicy are never constructed, so add
`#[allow(dead_code)]` to prevent this clippy warning

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
ca1c0c853f libs: Remove doc overindentation
The doc comment had one space too many in its list, so the format was wrong

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
501b41cf8f dragonball: Remove doc overindentation
The doc comment had one space too many in its list, so the format was wrong

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
6a45ee0874 runtime-rs: Improve map iteration
The key was never used, just the value, so just iterate over `.values()`
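
A minimal sketch with a hypothetical map of sizes: when the key is ignored, iterating over `.values()` states the intent directly.

```rust
use std::collections::HashMap;

/// Before: iterate over (key, value) pairs and discard the key.
fn total_before(sizes: &HashMap<String, u64>) -> u64 {
    let mut total = 0;
    for (_name, size) in sizes {
        total += size;
    }
    total
}

/// After: iterate over the values only.
fn total_after(sizes: &HashMap<String, u64>) -> u64 {
    sizes.values().sum()
}

fn main() {
    let mut sizes = HashMap::new();
    sizes.insert("a".to_string(), 1);
    sizes.insert("b".to_string(), 2);
    println!("{} {}", total_before(&sizes), total_after(&sizes));
}
```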

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
2f49dffcd7 runtime-rs: Remove dead code
`VmmPingResponse` and `NetInterworkingModel` are
never constructed, so remove them

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
35557745b1 runtime-rs: Fix char_indices_as_byte_indices
In Unicode you can have multi-byte characters, so it's better to
use char_indices than enumerate the bytes
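
The difference only shows up with non-ASCII input. The sketch below uses hypothetical helper names, not the runtime-rs code: it finds a `:` after a two-byte character and shows that only the `char_indices` position is safe to slice with.

```rust
/// Position of ':' counted in characters (not valid as a byte offset).
fn colon_char_position(s: &str) -> Option<usize> {
    s.chars().enumerate().find(|(_, c)| *c == ':').map(|(i, _)| i)
}

/// Position of ':' as a byte offset, which is what slicing expects.
fn colon_byte_position(s: &str) -> Option<usize> {
    s.char_indices().find(|(_, c)| *c == ':').map(|(i, _)| i)
}

fn main() {
    // U+00E9 takes two bytes in UTF-8, so the two positions differ.
    let s = "\u{e9}:x";
    let idx = colon_byte_position(s).unwrap();
    println!("{:?} {}", colon_char_position(s), &s[..idx]);
}
```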

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
69ca6c0de0 runtime-rs: Fix manual_contains
Use contains to be more concise and efficient rather than manually
implementing this check
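
For illustration, with hypothetical helper names: a hand-written `any` comparison on a slice can be replaced by `contains`.

```rust
/// Manual membership check with an iterator.
fn has_manual(ports: &[u32], port: u32) -> bool {
    ports.iter().any(|p| *p == port)
}

/// The same check using the slice method.
fn has_contains(ports: &[u32], port: u32) -> bool {
    ports.contains(&port)
}

fn main() {
    let ports = [80, 443];
    println!("{} {}", has_manual(&ports, 443), has_contains(&ports, 8080));
}
```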

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
0027f6cae0 agent: Fix dead_code warning
VirtioBlkCcwDeviceHandler and VirtioBlkCcwHandler
are only constructed on s390x, so add #[cfg(target_arch = "s390x")]
to all the code

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:27 +00:00
stevenhorsman
3b2c83f9d2 trace-forwarder: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)`
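
A small sketch of the two forms (the wrapper functions are hypothetical): both produce an `io::Error` of kind `Other` carrying the same message.

```rust
use std::io::{Error, ErrorKind};

/// Before: spell out the kind explicitly.
fn io_error_before(msg: &str) -> Error {
    Error::new(ErrorKind::Other, msg.to_string())
}

/// After: the dedicated constructor for ErrorKind::Other.
fn io_error_after(msg: &str) -> Error {
    Error::other(msg.to_string())
}

fn main() {
    let e = io_error_after("connection lost");
    println!("{:?}: {e}", e.kind());
}
```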

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
b1cfa98524 runtime-rs: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
dc8f628dd1 libs: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)` and drop our own macro that did this mapping

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
5f1d3481af dragonball: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
9ec7109712 agent: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
34d299ae44 vsock-exporter: Fix clippy::io_other_error issue
We can use the new `Error::other` constructor rather than
`Error::new(ErrorKind::Other, ...)`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
b2f9f23504 dragonball: Fix mismatched_lifetime_syntaxes issue
Fix `warning: hiding a lifetime that's elided elsewhere is confusing`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
stevenhorsman
8bbbc3a58b lib: Fix mismatched_lifetime_syntaxes issue
Fix the following warning:
```
warning: hiding a lifetime that's elided elsewhere is confusing
  --> /root/go/src/github.com/kata-containers/kata-containers/src/libs/kata-types/src/utils/u32_set.rs:50:17
   |
50 |     pub fn iter(&self) -> Iter<u32> {
   |                 ^^^^^     --------- the same lifetime is hidden here
   |                 |
   |                 the lifetime is elided here
   |
   = help: the same lifetime is referred to in inconsistent ways, making the signature confusing
   = note: `#[warn(mismatched_lifetime_syntaxes)]` on by default
help: use `'_` for type paths
   |
50 |     pub fn iter(&self) -> Iter<'_, u32> {
   |                                +++
   ```

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-12-18 07:45:26 +00:00
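A self-contained sketch of the fix suggested by the compiler output above. The `U32Set` type here is a simplified, hypothetical stand-in and not the actual definition in `kata-types`; only the `Iter<'_, u32>` return type mirrors the suggested change.

```rust
use std::slice::Iter;

/// Simplified, hypothetical stand-in for a set backed by a `Vec<u32>`.
struct U32Set {
    data: Vec<u32>,
}

impl U32Set {
    fn new(data: Vec<u32>) -> Self {
        U32Set { data }
    }

    /// `'_` makes it visible that the iterator borrows from `&self`.
    fn iter(&self) -> Iter<'_, u32> {
        self.data.iter()
    }
}
```

Writing `Iter<u32>` compiles to the same thing; the explicit `'_` only makes the borrow visible in the signature.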
544 changed files with 10033 additions and 14836 deletions

View File

@@ -12,7 +12,6 @@ updates:
- "/src/tools/agent-ctl"
- "/src/tools/genpolicy"
- "/src/tools/kata-ctl"
- "/src/tools/runk"
- "/src/tools/trace-forwarder"
schedule:
interval: "daily"

View File

@@ -163,42 +163,6 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
name: run-runk
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
timeout-minutes: 10
run: bash tests/integration/runk/gha-run.sh run
run-tracing:
name: run-tracing
strategy:

View File

@@ -54,6 +54,7 @@ jobs:
- nydus
- ovmf
- ovmf-sev
- ovmf-tdx
- pause-image
- qemu
- qemu-snp-experimental
@@ -147,8 +148,8 @@ jobs:
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: kata-artifacts-amd64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.zst
name: kata-artifacts-amd64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst
retention-days: 15
if-no-files-found: error
@@ -236,8 +237,8 @@ jobs:
asset:
- busybox
- coco-guest-components
- kernel-nvidia-gpu-headers
- kernel-nvidia-gpu-confidential-headers
- kernel-nvidia-gpu-modules
- kernel-nvidia-gpu-confidential-modules
- pause-image
steps:
- uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

View File

@@ -134,8 +134,8 @@ jobs:
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: kata-artifacts-arm64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.zst
name: kata-artifacts-arm64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst
retention-days: 15
if-no-files-found: error
@@ -216,7 +216,7 @@ jobs:
matrix:
asset:
- busybox
- kernel-nvidia-gpu-headers
- kernel-nvidia-gpu-modules
steps:
- uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0
with:

View File

@@ -0,0 +1,75 @@
name: Build kubectl multi-arch image
on:
schedule:
# Run every Sunday at 00:00 UTC
- cron: '0 0 * * 0'
workflow_dispatch:
# Allow manual triggering
push:
branches:
- main
paths:
- 'tools/packaging/kubectl/Dockerfile'
- '.github/workflows/build-kubectl-image.yaml'
permissions: {}
env:
REGISTRY: quay.io
IMAGE_NAME: kata-containers/kubectl
jobs:
build-and-push:
name: Build and push multi-arch image
runs-on: ubuntu-24.04
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
persist-credentials: false
- name: Set up QEMU
uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0
- name: Login to Quay.io
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
with:
registry: ${{ env.REGISTRY }}
username: ${{ vars.QUAY_DEPLOYER_USERNAME }}
password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
- name: Get kubectl version
id: kubectl-version
run: |
KUBECTL_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)
echo "version=${KUBECTL_VERSION}" >> "$GITHUB_OUTPUT"
- name: Generate image metadata
id: meta
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=raw,value=latest
type=raw,value={{date 'YYYYMMDD'}}
type=raw,value=${{ steps.kubectl-version.outputs.version }}
type=sha,prefix=
- name: Build and push multi-arch image
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
with:
context: tools/packaging/kubectl/
file: tools/packaging/kubectl/Dockerfile
platforms: linux/amd64,linux/arm64,linux/s390x,linux/ppc64le
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max

View File

@@ -1,36 +0,0 @@
name: Kata Containers Nightly CI (Rust)
on:
schedule:
- cron: '0 1 * * *' # Run at 1 AM UTC (1 hour after script-based nightly)
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
permissions: {}
jobs:
kata-containers-ci-on-push-rust:
permissions:
contents: read
packages: write
id-token: write
attestations: write
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "nightly-rust"
tag: ${{ github.sha }}-nightly-rust
target-branch: ${{ github.ref_name }}
build-type: "rust" # Use Rust-based build
secrets:
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
ITA_KEY: ${{ secrets.ITA_KEY }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

View File

@@ -19,11 +19,6 @@ on:
required: false
type: string
default: no
build-type:
description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).
required: false
type: string
default: ""
secrets:
AUTHENTICATED_IMAGE_PASSWORD:
required: true
@@ -77,7 +72,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-22.04
arch: amd64
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -110,7 +104,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-arm
arch: arm64
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -156,7 +149,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-s390x
arch: s390x
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -175,7 +167,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-ppc64le
arch: ppc64le
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -297,7 +288,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -313,7 +304,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-arm64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-arm64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -326,7 +317,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -348,7 +339,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -366,7 +357,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-s390x${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-s390x
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -380,7 +371,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-ppc64le${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-ppc64le
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -392,7 +383,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}

32
.github/workflows/docs.yaml vendored Normal file
View File

@@ -0,0 +1,32 @@
name: Documentation
on:
push:
branches:
- main
permissions: {}
jobs:
deploy-docs:
name: deploy-docs
permissions:
contents: read
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
steps:
- uses: actions/configure-pages@v5
- uses: actions/checkout@v5
with:
persist-credentials: false
- uses: actions/setup-python@v5
with:
python-version: 3.x
- run: pip install zensical
- run: zensical build --clean
- uses: actions/upload-pages-artifact@v4
with:
path: site
- uses: actions/deploy-pages@v4
id: deployment

View File

@@ -82,7 +82,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: ubuntu-22.04
arch: amd64
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -100,7 +99,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: ubuntu-24.04-arm
arch: arm64
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -118,7 +116,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: s390x
arch: s390x
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -134,9 +131,8 @@ jobs:
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-ppc64le
target-branch: ${{ github.ref_name }}
runner: ppc64le-small
runner: ubuntu-24.04-ppc64le
arch: ppc64le
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

View File

@@ -30,11 +30,6 @@ on:
description: The arch of the tarball.
required: true
type: string
build-type:
description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).
required: false
type: string
default: ""
secrets:
QUAY_DEPLOYER_PASSWORD:
required: true
@@ -63,7 +58,6 @@ jobs:
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf /usr/local/share/boost
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
sudo rm -rf /usr/lib/jvm
sudo rm -rf /usr/share/swift
sudo rm -rf /usr/local/share/powershell
@@ -107,10 +101,8 @@ jobs:
REGISTRY: ${{ inputs.registry }}
REPO: ${{ inputs.repo }}
TAG: ${{ inputs.tag }}
BUILD_TYPE: ${{ inputs.build-type }}
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)/kata-static.tar.zst" \
"${REGISTRY}/${REPO}" \
"${TAG}" \
"${BUILD_TYPE}"
"${TAG}"

View File

@@ -32,6 +32,7 @@ jobs:
matrix:
vmm:
- qemu
- qemu-runtime-rs
k8s:
- kubeadm
runs-on: arm64-k8s

View File

@@ -126,5 +126,6 @@ jobs:
- name: Delete CoCo KBS
if: always() && matrix.environment.name != 'nvidia-gpu'
timeout-minutes: 10
run: |
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

View File

@@ -137,10 +137,12 @@ jobs:
- name: Delete kata-deploy
if: always()
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi
- name: Delete CoCo KBS
if: always()
timeout-minutes: 10
run: |
if [ "${KBS}" == "true" ]; then
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

View File

@@ -120,10 +120,12 @@ jobs:
- name: Delete kata-deploy
if: always()
timeout-minutes: 15
run: bash tests/integration/kubernetes/gha-run.sh cleanup
- name: Delete CoCo KBS
if: always()
timeout-minutes: 10
run: |
[[ "${KATA_HYPERVISOR}" == "qemu-tdx" ]] && echo "ITA_KEY=${GH_ITA_KEY}" >> "${GITHUB_ENV}"
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs
@@ -327,7 +329,6 @@ jobs:
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf /usr/local/share/boost
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
sudo rm -rf /usr/lib/jvm
sudo rm -rf /usr/share/swift
sudo rm -rf /usr/local/share/powershell

View File

@@ -66,7 +66,6 @@ jobs:
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf /usr/local/share/boost
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
sudo rm -rf /usr/lib/jvm
sudo rm -rf /usr/share/swift
sudo rm -rf /usr/local/share/powershell
@@ -88,4 +87,4 @@ jobs:
- name: Report tests
if: always()
run: bash tests/integration/kubernetes/gha-run.sh report-tests
run: bash tests/functional/kata-deploy/gha-run.sh report-tests

View File

@@ -1,54 +0,0 @@
name: CI | Run runk tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
permissions: {}
jobs:
run-runk:
name: run-runk
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
run: bash tests/integration/runk/gha-run.sh run

View File

@@ -6,14 +6,21 @@ on:
permissions: {}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
stale:
name: stale
runs-on: ubuntu-22.04
permissions:
actions: write # Needed to manage caches for state persistence across runs
pull-requests: write # Needed to add/remove labels, post comments, or close PRs
steps:
- uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 # v9.1.0
with:
stale-pr-message: 'This PR has been opened without with no activity for 180 days. Comment on the issue otherwise it will be closed in 7 days'
stale-pr-message: 'This PR has been opened without activity for 180 days. Please comment on the issue or it will be closed in 7 days.'
days-before-pr-stale: 180
days-before-pr-close: 7
days-before-issue-stale: -1

View File

@@ -21,7 +21,7 @@ jobs:
persist-credentials: false
- name: Run zizmor
uses: zizmorcore/zizmor-action@e673c3917a1aef3c65c972347ed84ccd013ecda4 # v0.2.0
uses: zizmorcore/zizmor-action@135698455da5c3b3e55f73f4419e481ab68cdd95 # v0.4.1
with:
advanced-security: false
annotations: true

1
.gitignore vendored
View File

@@ -19,3 +19,4 @@ tools/packaging/static-build/agent/install_libseccomp.sh
.envrc
.direnv
**/.DS_Store
site/

30
Cargo.lock generated
View File

@@ -770,12 +770,6 @@ dependencies = [
"libc",
]
[[package]]
name = "cpuid-bool"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8aebca1129a03dc6dc2b127edd729435bbc4a37e1d5f4d7513165089ceb02634"
[[package]]
name = "crc32fast"
version = "1.3.2"
@@ -2378,7 +2372,7 @@ dependencies = [
"nix 0.23.2",
"once_cell",
"serde",
"sha2 0.9.3",
"sha2 0.9.9",
"thiserror 1.0.48",
"uuid 0.8.2",
]
@@ -2999,9 +2993,9 @@ checksum = "ff011a302c396a5197692431fc1948019154afc178baf7d8e37367442a4601cf"
[[package]]
name = "openssl-src"
version = "300.5.0+3.5.0"
version = "300.5.4+3.5.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8ce546f549326b0e6052b649198487d91320875da901e7bd11a06d1ee3f9c2f"
checksum = "a507b3792995dae9b0df8a1c1e3771e8418b7c2d9f0baeba32e6fe8b06c7cb72"
dependencies = [
"cc",
]
@@ -3627,9 +3621,9 @@ dependencies = [
[[package]]
name = "qapi"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c6412bdd014ebee03ddbbe79ac03a0b622cce4d80ba45254f6357c847f06fa38"
checksum = "7b047adab56acc4948d4b9b58693c1f33fd13efef2d6bb5f0f66a47436ceada8"
dependencies = [
"bytes",
"futures 0.3.28",
@@ -3664,9 +3658,9 @@ dependencies = [
[[package]]
name = "qapi-qmp"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8b944db7e544d2fa97595e9a000a6ba5c62c426fa185e7e00aabe4b5640b538"
checksum = "45303cac879d89361cad0287ae15f9ae1e7799b904b474152414aeece39b9875"
dependencies = [
"qapi-codegen",
"qapi-spec",
@@ -4011,6 +4005,7 @@ version = "0.1.0"
dependencies = [
"anyhow",
"common",
"containerd-shim-protos",
"go-flag",
"logging",
"nix 0.26.4",
@@ -4430,13 +4425,13 @@ dependencies = [
[[package]]
name = "sha2"
version = "0.9.3"
version = "0.9.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fa827a14b29ab7f44778d14a88d3cb76e949c45083f7dbfa507d0cb699dc12de"
checksum = "4d58a1e1bf39749807d89cf2d98ac2dfa0ff1cb3faa38fbb64dd88ac8013d800"
dependencies = [
"block-buffer 0.9.0",
"cfg-if 1.0.0",
"cpuid-bool",
"cpufeatures",
"digest 0.9.0",
"opaque-debug",
]
@@ -4482,7 +4477,7 @@ dependencies = [
"runtimes",
"serial_test 0.10.0",
"service",
"sha2 0.9.3",
"sha2 0.10.9",
"slog",
"slog-async",
"slog-scope",
@@ -4866,6 +4861,7 @@ checksum = "8f50febec83f5ee1df3015341d8bd429f2d1cc62bcba7ea2076759d315084683"
name = "test-utils"
version = "0.1.0"
dependencies = [
"libc",
"nix 0.26.4",
]

View File

@@ -2,7 +2,7 @@
authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]
edition = "2018"
license = "Apache-2.0"
rust-version = "1.85.1"
rust-version = "1.88"
[workspace]
members = [
@@ -127,6 +127,7 @@ protobuf = "3.7.2"
rand = "0.8.4"
serde = { version = "1.0.145", features = ["derive"] }
serde_json = "1.0.91"
sha2 = "0.10.9"
slog = "2.5.2"
slog-scope = "4.4.0"
strum = { version = "0.24.0", features = ["derive"] }

View File

@@ -18,7 +18,6 @@ TOOLS =
TOOLS += agent-ctl
TOOLS += kata-ctl
TOOLS += log-parser
TOOLS += runk
TOOLS += trace-forwarder
STANDARD_TARGETS = build check clean install static-checks-build test vendor
@@ -50,10 +49,14 @@ docs-url-alive-check:
build-and-publish-kata-debug:
bash tools/packaging/kata-debug/kata-debug-build-and-upload-payload.sh ${KATA_DEBUG_REGISTRY} ${KATA_DEBUG_TAG}
docs-serve:
docker run --rm -p 8000:8000 -v ./docs:/docs:ro -v ${PWD}/zensical.toml:/zensical.toml:ro zensical/zensical serve --config-file /zensical.toml -a 0.0.0.0:8000
.PHONY: \
all \
kata-tarball \
install-tarball \
default \
static-checks \
docs-url-alive-check
docs-url-alive-check \
docs-serve

View File

@@ -139,7 +139,6 @@ The table below lists the remaining parts of the project:
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |
| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |

View File

@@ -1 +1 @@
3.24.0
3.26.0

View File

@@ -46,16 +46,12 @@ fi
[[ ${SELINUX_PERMISSIVE} == "yes" ]] && oc delete -f "${deployments_dir}/machineconfig_selinux.yaml.in"
# Delete kata-containers
pushd "${katacontainers_repo_dir}/tools/packaging/kata-deploy" || { echo "Failed to push to ${katacontainers_repo_dir}/tools/packaging/kata-deploy"; exit 125; }
oc delete -f kata-deploy/base/kata-deploy.yaml
helm uninstall kata-deploy --wait --namespace kube-system
oc -n kube-system wait --timeout=10m --for=delete -l name=kata-deploy pod
oc apply -f kata-cleanup/base/kata-cleanup.yaml
echo "Wait for all related pods to be gone"
( repeats=1; for _ in $(seq 1 600); do
oc get pods -l name="kubelet-kata-cleanup" --no-headers=true -n kube-system 2>&1 | grep "No resources found" -q && ((repeats++)) || repeats=1
[[ "${repeats}" -gt 5 ]] && echo kata-cleanup finished && break
sleep 1
done) || { echo "There are still some kata-cleanup related pods after 600 iterations"; oc get all -n kube-system; exit 1; }
oc delete -f kata-cleanup/base/kata-cleanup.yaml
oc delete -f kata-rbac/base/kata-rbac.yaml
oc delete -f runtimeclasses/kata-runtimeClasses.yaml

View File

@@ -51,13 +51,13 @@ apply_kata_deploy() {
oc label --overwrite ns kube-system pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline
local version chart
version=$(curl -sSL https://api.github.com/repos/kata-containers/kata-containers/releases/latest | jq .tag_name | tr -d '"')
version='0.0.0-dev'
chart="oci://ghcr.io/kata-containers/kata-deploy-charts/kata-deploy"
# Ensure any potential leftover is cleaned up ... and this secret usually is not in case of previous failures
oc delete secret sh.helm.release.v1.kata-deploy.v1 -n kube-system || true
echo "Installing kata using helm ${chart} ${version}"
echo "Installing kata using helm ${chart} ${version} (sha printed in helm output)"
helm install kata-deploy --wait --namespace kube-system --set "image.reference=${KATA_DEPLOY_IMAGE%%:*},image.tag=${KATA_DEPLOY_IMAGE##*:}" "${chart}" --version "${version}"
}

View File

@@ -157,6 +157,16 @@ if [[ -z "${CAA_IMAGE}" ]]; then
fi
# Get latest PP image
#
# You can list the CI images by:
# az sig image-version list-community --location "eastus" --public-gallery-name "cocopodvm-d0e4f35f-5530-4b9c-8596-112487cdea85" --gallery-image-definition "podvm_image0" --output table
# or the release images by:
# az sig image-version list-community --location "eastus" --public-gallery-name "cococommunity-42d8482d-92cd-415b-b332-7648bd978eff" --gallery-image-definition "peerpod-podvm-fedora" --output table
# or the release debug images by:
# az sig image-version list-community --location "eastus" --public-gallery-name "cococommunity-42d8482d-92cd-415b-b332-7648bd978eff" --gallery-image-definition "peerpod-podvm-fedora-debug" --output table
#
# Note there are other flavours of the released images, you can list them by:
# az sig image-definition list-community --location "eastus" --public-gallery-name "cococommunity-42d8482d-92cd-415b-b332-7648bd978eff" --output table
if [[ -z "${PP_IMAGE_ID}" ]]; then
SUCCESS_TIME=$(curl -s \
-H "Accept: application/vnd.github+json" \

View File

@@ -125,7 +125,7 @@ If you want to enable SELinux in Permissive mode, add `enforcing=0` to the kerne
Enable full debug as follows:
```bash
$ sudo sed -i -e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' /etc/kata-containers/configuration.toml
$ sudo sed -i -E 's/^(\s*enable_debug\s*=\s*)false/\1true/' /etc/kata-containers/configuration.toml
$ sudo sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.log=debug initcall_debug"/g' /etc/kata-containers/configuration.toml
```

View File

@@ -198,7 +198,7 @@ fn join_params_with_dash(str: &str, num: i32) -> Result<String> {
return Err("number must be positive");
}
let result = format!("{}-{}", str, num);
let result = format!("{str}-{num}");
Ok(result)
}
@@ -253,13 +253,13 @@ mod tests {
// Run the tests
for (i, d) in tests.iter().enumerate() {
// Create a string containing details of the test
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
// Call the function under test
let result = join_params_with_dash(d.str, d.num);
// Update the test details string with the results of the call
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
// Perform the checks
if d.result.is_ok() {
@@ -267,8 +267,8 @@ mod tests {
continue;
}
let expected_error = format!("{}", d.result.as_ref().unwrap_err());
let actual_error = format!("{}", result.unwrap_err());
let expected_error = format!("{d.result.as_ref().unwrap_err()}");
let actual_error = format!("{result.unwrap_err()}");
assert!(actual_error == expected_error, msg);
}
}
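
One caveat on the `expected_error` and `actual_error` replacements in this hunk, offered as a hedged sketch: inline format arguments capture only plain identifiers, so an expression such as `result.unwrap_err()` inside the braces is not accepted by `format!`. Either bind the value to a local first or keep it as a positional argument. The helper names and the `Result<u32, String>` type below are assumptions for illustration, not the types used in the documented test.

```rust
/// Hypothetical helper: binds the error to a local so it can be captured
/// by name inside the format string.
fn error_text_inline(result: &Result<u32, String>) -> String {
    let err = result.as_ref().unwrap_err();
    format!("{err}")
}

/// Hypothetical helper: keeps the expression as a positional argument,
/// which is the form the pre-change lines used.
fn error_text_positional(result: &Result<u32, String>) -> String {
    format!("{}", result.as_ref().unwrap_err())
}
```

Both helpers panic on an `Ok` value, just as `unwrap_err` does in the test loop, so they are only meant for the error branch.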

9
docs/assets/favicon.svg Normal file
View File

@@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
<!-- Dark background matching the site -->
<rect width="32" height="32" rx="4" fill="#1a1a2e"/>
<!-- Kata logo scaled and centered -->
<g transform="translate(-27, -2) scale(0.75)">
<path d="M70.925 25.22L58.572 37.523 46.27 25.22l2.192-2.192 10.11 10.11 10.11-10.11zm-6.575-.2l-3.188-3.188 3.188-3.188 3.188 3.188zm-4.93-2.54l3.736 3.736-3.736 3.736zm-1.694 7.422l-8.07-8.07 8.07-8.07zm1.694-16.14l3.686 3.686-3.686 3.686zm-13.15 4.682L58.572 6.143l12.353 12.303-2.192 2.192-10.16-10.11-10.11 10.11zm26.997 0L58.572 3.752 43.878 18.446l3.387 3.387-3.387 3.387 14.694 14.694L73.266 25.22l-3.337-3.387z" fill="#f15b3e"/>
</g>
</svg>


View File

@@ -51,6 +51,7 @@ containers started after the VM has been launched.
Users can check to see if the container uses the `devicemapper` block
device as its rootfs by calling `mount(8)` within the container. If
the `devicemapper` block device is used, the root filesystem (`/`)
will be mounted from `/dev/vda`. Users can disable direct mounting of
the underlying block device through the runtime
[configuration](README.md#configuration).
will be mounted from `/dev/vda`. Users can enable direct mounting of
the underlying block device by setting the runtime
[configuration](README.md#configuration) flag `disable_block_device_use` to
`false`.

View File

@@ -256,7 +256,7 @@ spec:
values:
- NODE_NAME
volumes:
- name: trusted-storage
- name: trusted-image-storage
persistentVolumeClaim:
claimName: trusted-pvc
containers:

View File

@@ -50,7 +50,7 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.default_max_vcpus` | uint32| the maximum number of vCPUs allocated for the VM by the hypervisor |
| `io.katacontainers.config.hypervisor.default_memory` | uint32| the memory assigned for a VM by the hypervisor in `MiB` |
| `io.katacontainers.config.hypervisor.default_vcpus` | float32| the default vCPUs assigned for a VM by the hypervisor |
| `io.katacontainers.config.hypervisor.disable_block_device_use` | `boolean` | disallow a block device from being used |
| `io.katacontainers.config.hypervisor.disable_block_device_use` | `boolean` | disable hotplugging host block devices to guest VMs for container rootfs |
| `io.katacontainers.config.hypervisor.disable_image_nvdimm` | `boolean` | specify if a `nvdimm` device should be used as rootfs for the guest (QEMU) |
| `io.katacontainers.config.hypervisor.disable_vhost_net` | `boolean` | specify if `vhost-net` is not available on the host |
| `io.katacontainers.config.hypervisor.enable_hugepages` | `boolean` | if the memory should be `pre-allocated` from huge pages |

View File

@@ -103,48 +103,8 @@ $ minikube ssh "grep -c -E 'vmx|svm' /proc/cpuinfo"
## Installing Kata Containers
You can now install the Kata Containers runtime components. You will need a local copy of some Kata
Containers components to help with this, and then use `kubectl` on the host (that Minikube has already
configured for you) to deploy them:
```sh
$ git clone https://github.com/kata-containers/kata-containers.git
$ cd kata-containers/tools/packaging/kata-deploy
$ kubectl apply -f kata-rbac/base/kata-rbac.yaml
$ kubectl apply -f kata-deploy/base/kata-deploy.yaml
```
This installs the Kata Containers components into `/opt/kata` inside the Minikube node. It can take
a few minutes for the operation to complete. You can check the installation has worked by checking
the status of the `kata-deploy` pod, which will be executing
[this script](../../tools/packaging/kata-deploy/scripts/kata-deploy.sh),
and will be executing a `sleep infinity` once it has successfully completed its work.
You can accomplish this by running the following:
```sh
$ podname=$(kubectl -n kube-system get pods -o=name | grep -F kata-deploy | sed 's?pod/??')
$ kubectl -n kube-system exec ${podname} -- ps -ef | grep -F infinity
```
> *NOTE:* This check only works for single node clusters, which is the default for Minikube.
> For multi-node clusters, the check would need to be adapted to check `kata-deploy` had
> completed on all nodes.
## Enabling Kata Containers
Now you have installed the Kata Containers components in the Minikube node. Next, you need to configure
Kubernetes `RuntimeClass` to know when to use Kata Containers to run a pod.
### Register the runtime
Now register the `kata qemu` runtime with that class. This should result in no errors:
```sh
$ cd kata-containers/tools/packaging/kata-deploy/runtimeclasses
$ kubectl apply -f kata-runtimeClasses.yaml
```
The Kata Containers installation process should be complete and enabled in the Minikube cluster.
You can now install the Kata Containers runtime components
[following the official instructions](../../tools/packaging/kata-deploy/helm-chart).
## Testing Kata Containers

View File

@@ -48,7 +48,7 @@ $ make test
- Run a test in the current package in verbose mode:
```bash
# Example
# Example
$ test="config::tests::test_get_log_level"
$ cargo test "$test" -vv -- --exact --nocapture
@@ -223,7 +223,7 @@ What's wrong with this function?
```rust
fn foo(config: &Config, path_prefix: String, container_id: String, pid: String) -> Result<()> {
let mut full_path = format!("{}/{}", path_prefix, container_id);
let mut full_path = format!("{path_prefix}/{container_id}");
let _ = remove_recursively(&mut full_path);

@@ -3,4 +3,4 @@
Kata Containers supports passing certain GPUs from the host into the container. Select the GPU vendor for detailed information:
- [Intel Discrete GPUs](Intel-Discrete-GPU-passthrough-and-Kata.md)/[Intel Integrated GPUs](Intel-GPU-passthrough-and-Kata.md)
- [NVIDIA](NVIDIA-GPU-passthrough-and-Kata.md)
- [NVIDIA GPUs](NVIDIA-GPU-passthrough-and-Kata.md) and [Enabling NVIDIA GPU workloads using GPU passthrough with Kata Containers](NVIDIA-GPU-passthrough-and-Kata-QEMU.md)

@@ -0,0 +1,569 @@
# Enabling NVIDIA GPU workloads using GPU passthrough with Kata Containers
This page provides:
1. A description of the components involved when running GPU workloads with
Kata Containers using the NVIDIA TEE and non-TEE GPU runtime classes.
1. An explanation of the orchestration flow on a Kubernetes node for this
scenario.
1. A deployment guide that enables you to use these runtime classes.
The goal is to educate readers familiar with Kubernetes and Kata Containers
on NVIDIA's reference implementation which is reflected in Kata CI's build
and test framework. With this, we aim to enable readers to leverage this
stack, or to use the principles behind this stack in order to run GPU
workloads on their variant of the Kata Containers stack.
We assume the reader is familiar with Kubernetes, Kata Containers, and
Confidential Containers.
> **Note:**
>
> The currently supported mode for enabling GPU workloads in the TEE scenario
> is single GPU passthrough (one GPU per pod) on AMD64 platforms. AMD SEV-SNP
> is the only supported TEE so far; support for Intel TDX is on the way.
## Component Overview
Before providing deployment guidance, we describe the components involved in
running GPU workloads. We proceed from top to bottom: from the NVIDIA GPU
operator, via the Kata runtime, to the components within the NVIDIA GPU
Utility Virtual Machine (UVM) root filesystem.
### NVIDIA GPU Operator
A central component is the
[NVIDIA GPU operator](https://github.com/NVIDIA/gpu-operator) which can be
deployed onto your cluster as a helm chart. Installing the GPU operator
delivers various operands on your nodes in the form of Kubernetes DaemonSets.
These operands are vital to support the flow of orchestrating pod manifests
using NVIDIA GPU runtime classes with GPU passthrough on your nodes. Without
getting into the details, the most important operands and their
responsibilities are:
- **nvidia-vfio-manager:** Binding discovered NVIDIA GPUs to the `vfio-pci`
driver for VFIO passthrough.
- **nvidia-cc-manager:** Transitioning GPUs into confidential computing (CC)
and non-CC mode (see the
[NVIDIA/k8s-cc-manager](https://github.com/NVIDIA/k8s-cc-manager)
repository).
- **nvidia-kata-manager:** Creating host-side CDI specifications for GPU
passthrough, resulting in the file `/var/run/cdi/nvidia.yaml`, containing
`kind: nvidia.com/pgpu` (see the
[NVIDIA/k8s-kata-manager](https://github.com/NVIDIA/k8s-kata-manager)
repository).
- **nvidia-sandbox-device-plugin** (see the
  [NVIDIA/sandbox-device-plugin](https://github.com/NVIDIA/sandbox-device-plugin)
  repository):
  - Allocating GPUs during pod deployment.
  - Discovering NVIDIA GPUs and their capabilities, and advertising these to
    the Kubernetes control plane (allocatable resources of type
    `nvidia.com/pgpu` will appear for the node and GPU device IDs
    will be registered with Kubelet). These GPUs can thus be allocated as
    container resources in your pod manifests. See the GPU operator
    deployment instructions below for how the `pgpu` key is set through the
    `P_GPU_ALIAS` environment variable.
To summarize, the GPU operator manages the GPUs on each node, allowing for
simple orchestration of pod manifests using Kata Containers. Once the cluster
with GPU operator and Kata bits is up and running, the end user can schedule
Kata NVIDIA GPU workloads, using resource limits and the
`kata-qemu-nvidia-gpu` or `kata-qemu-nvidia-gpu-snp` runtime classes, for
example:
```yaml
apiVersion: v1
kind: Pod
...
spec:
  ...
  runtimeClassName: kata-qemu-nvidia-gpu-snp
  ...
      resources:
        limits:
          "nvidia.com/pgpu": 1
  ...
```
When this happens, the Kubelet calls into the sandbox device plugin to
allocate a GPU. The sandbox device plugin returns `DeviceSpec` entries to the
Kubelet for the allocated GPU. The Kubelet uses internal device IDs for
tracking of allocated GPUs and includes the device specifications in the CRI
request when scheduling the pod through containerd. Containerd processes the
device specifications and includes the device configuration in the OCI
runtime spec used to invoke the Kata runtime during the create container
request.
### Kata runtime
The Kata runtime for the NVIDIA GPU handlers is configured to cold-plug VFIO
devices (`cold_plug_vfio` is set to `root-port` while
`hot_plug_vfio` is set to `no-port`). Cold-plug is by design the only
supported mode for NVIDIA GPU passthrough of the NVIDIA reference stack.
With cold-plug, the Kata runtime attaches the GPU at VM launch time, when
creating the pod sandbox. This happens *before* the create container request,
i.e., before the Kata runtime receives the OCI spec including device
configurations from containerd. Thus, a mechanism to acquire the device
information is required. This is done by the runtime calling the
`coldPlugDevices()` function during sandbox creation. In this function,
the runtime queries Kubelet's Pod Resources API to discover allocated GPU
device IDs (e.g., `nvidia.com/pgpu = [vfio0]`). The runtime formats these as
CDI device identifiers and injects them into the OCI spec using
`config.InjectCDIDevices()`. The runtime then consults the host CDI
specifications and determines the device path the GPU is backed by
(e.g., `/dev/vfio/devices/vfio0`). Finally, the runtime resolves the device's
PCI BDF (e.g., `0000:21:00.0`) and cold-plugs the GPU by launching QEMU with
relevant parameters for device passthrough (e.g.,
`-device vfio-pci,host=0000:21:00.0,x-pci-vendor-id=0x10de,x-pci-device-id=0x2321,bus=rp0,iommufd=iommufdvfio-faf829f2ea7aec330`).
The runtime also creates *inner runtime* CDI annotations
which map host VFIO devices to guest GPU devices. These are annotations
intended for the kata-agent, here referred to as the inner runtime (inside the
UVM), to properly handle GPU passthrough into containers. These annotations
serve as metadata providing the kata-agent with the information needed to
attach the passthrough devices to the correct container.
The annotations are key-value pairs consisting of `cdi.k8s.io/vfio<num>` keys
(derived from the host VFIO device path, e.g., `/dev/vfio/devices/vfio1`) and
`nvidia.com/gpu=<index>` values (referencing the corresponding device in the
guest CDI spec). These annotations are injected by the runtime during container
creation via the `annotateContainerWithVFIOMetadata` function (see
`container.go`).
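As an illustration of the naming scheme described above, the following shell sketch builds a CDI device identifier from a resource name and a device ID, and derives an inner runtime annotation from a host VFIO device path. The function names `cdi_device_name` and `inner_cdi_annotation` are hypothetical and exist only for this example; the actual logic is implemented in the Go runtime and may differ in detail.

```bash
#!/usr/bin/env bash
# Illustrative helpers only (hypothetical names); not part of the Kata runtime.

# Build a CDI device identifier of the form <vendor>/<class>=<device>.
cdi_device_name() {
    local resource="$1" device_id="$2"
    printf '%s=%s\n' "${resource}" "${device_id}"
}

# Derive an inner runtime annotation from a host VFIO device path
# and the index of the corresponding GPU in the guest CDI spec.
inner_cdi_annotation() {
    local vfio_path="$1" gpu_index="$2"
    # Keep only the last path component, e.g. "vfio1".
    local vfio_name="${vfio_path##*/}"
    printf 'cdi.k8s.io/%s: nvidia.com/gpu=%s\n' "${vfio_name}" "${gpu_index}"
}

cdi_device_name "nvidia.com/pgpu" "vfio0"
inner_cdi_annotation "/dev/vfio/devices/vfio1" 0
```

With the example values used in this section, the first call yields the host-side identifier `nvidia.com/pgpu=vfio0`, and the second yields the key-value pair that the kata-agent later resolves against the guest CDI specification.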
We continue describing the orchestration flow inside the UVM in the next
section.
### Kata NVIDIA GPU UVM
#### UVM composition
To better understand the orchestration flow inside the NVIDIA GPU UVM, we
first look at the components its root filesystem contains. Should you decide
to use your own root filesystem to enable NVIDIA GPU scenarios, this should
give you a good idea of which components you need.
From a file system perspective, the UVM is composed of two files: a standard
Kata kernel image and the NVIDIA GPU rootfs in initrd or disk image format.
These two files are passed to the QEMU launch command when the UVM
is created.
The two most important pieces in Kata Containers' build recipes for the
NVIDIA GPU root filesystem are the `nvidia_chroot.sh` and `nvidia_rootfs.sh`
files. The build follows a two-stage process. In the first stage, a
full-fledged Ubuntu-based root filesystem is composed within a chroot
environment. In this stage, NVIDIA kernel modules are built and signed
against the current Kata kernel and relevant NVIDIA packages are installed.
In the second stage, a chiseled build is performed: Only relevant contents
from the first stage are copied and compressed into a new distro-less root
filesystem folder. Kata's build infrastructure then turns this root
filesystem into the NVIDIA initrd and image files.
The resulting root filesystem contains the following software components:
- NVRC - the
[NVIDIA Runtime Container init system](https://github.com/NVIDIA/nvrc/tree/main)
- NVIDIA drivers (kernel modules)
- NVIDIA user space driver libraries
- NVIDIA user space tools
- kata-agent
- confidential computing guest components: the attestation agent,
confidential data hub and api-server-rest binaries
- CRI-O pause container (for the guest image-pull method)
- BusyBox utilities (provides a base set of libraries and binaries, and a
linker)
- some supporting files, such as a file containing a list of supported GPU
device IDs, which NVRC reads
#### UVM orchestration flow
When the Kata runtime asks QEMU to launch the VM, the UVM's Linux kernel
boots and mounts the root filesystem. After this, NVRC starts as the initial
process.
NVRC scans for NVIDIA GPUs on the PCI bus, loads the
NVIDIA kernel modules, waits for driver initialization, creates the device nodes,
and initializes the GPU hardware (using the `nvidia-smi` binary). NVRC also
creates the guest-side CDI specification file (using the
`nvidia-ctk cdi generate` command). This file specifies devices of
`kind: nvidia.com/gpu`, i.e., GPUs appearing to be physical GPUs on regular
bare metal systems. The guest CDI specification also contains `containerEdits`
for each device, specifying device nodes (e.g., `/dev/nvidia0`,
`/dev/nvidiactl`), library mounts, and environment variables to be mounted
into the container which receives the passthrough GPU.
Then, NVRC forks the Kata agent while continuing to run as the
init system. This allows NVRC to handle ongoing GPU management tasks
while kata-agent focuses on container lifecycle management. See the
[NVRC sources](https://github.com/NVIDIA/nvrc/blob/main/src/main.rs) for an
overview of the steps carried out by NVRC.
When the Kata runtime sends the create container request, the Kata agent
parses the inner runtime CDI annotation. For example, for the inner runtime
annotation `"cdi.k8s.io/vfio1": "nvidia.com/gpu=0"`, the agent looks up device
`0` in the guest CDI specification with `kind: nvidia.com/gpu`.
The Kata agent also reads the guest CDI specification's `containerEdits`
section and injects relevant contents into the OCI spec of the respective
container. The kata agent then creates and starts a `rustjail` container
based on the final OCI spec. The container now has relevant device nodes,
binaries and low-level libraries available, and can start a user application
linked against the CUDA runtime API (e.g., `libcudart.so` and other
libraries). When used, the CUDA runtime API in turn calls the CUDA driver
API and kernel drivers, interacting with the pass-through GPU device.
An additional step is exercised in our CI samples: when using images from an
authenticated registry, the guest-pull mechanism triggers attestation using
Trustee's Key Broker Service (KBS) for secure release of the NGC API
authentication key used to access the NVCR container registry. As part of
this, the attestation agent exercises composite attestation and transitions
the GPU into the `Ready` state. Without attestation, the GPU has to be
transitioned into the `Ready` state explicitly by passing the
`nvrc.smi.srs=1` kernel parameter via the shim config, which causes NVRC to
perform the transition.
## Deployment Guidance
This guidance assumes you use bare-metal machines with proper support for
Kata's non-TEE and TEE GPU workload deployment scenarios for your Kubernetes
nodes. We provide guidance based on the upstream Kata CI procedures for the
NVIDIA GPU CI validation jobs. Note that this setup:
- uses the guest image pull method to pull container image layers
- uses the genpolicy tool to attach Kata agent security policies to the pod
manifest
- has dedicated (composite) attestation tests, a CUDA vectorAdd test, and a
NIM/RAG test sample with secure API key release
A similar deployment guide and scenario description can be found in NVIDIA resources
under
[Early Access: NVIDIA GPU Operator with Confidential Containers based on Kata](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/confidential-containers.html).
### Requirements
The requirements for the TEE scenario are:
- Ubuntu 25.10 as host OS
- CPU with AMD SEV-SNP support with proper BIOS/UEFI version and settings
- CC-capable Hopper/Blackwell GPU with proper VBIOS version.
BIOS and VBIOS configuration is out of scope for this guide. Other resources,
such as the documentation found on the
[NVIDIA Trusted Computing Solutions](https://docs.nvidia.com/nvtrust/index.html)
page and the above linked NVIDIA documentation, provide guidance on
selecting proper hardware and on properly configuring its firmware and OS.
### Installation
#### Containerd and Kubernetes
First, set up your Kubernetes cluster. For instance, in Kata CI, our NVIDIA
jobs use a single-node vanilla Kubernetes cluster with a 2.x containerd
version and Kata's currently supported Kubernetes version. We set this cluster
up using the `deploy_k8s` function, which the snippet below sources from
`tests/gha-run-k8s-common.sh` (the same function is used by
`tests/integration/kubernetes/gha-run.sh`), as follows:
```bash
$ export KUBERNETES="vanilla"
$ export CONTAINER_ENGINE="containerd"
$ export CONTAINER_ENGINE_VERSION="v2.1"
$ source tests/gha-run-k8s-common.sh
$ deploy_k8s
```
> **Note:**
>
> We recommend configuring your Kubelet with a higher
> `runtimeRequestTimeout` value than the two-minute default.
> Using the guest-pull mechanism, pulling large images may take a significant
> amount of time and may delay container start, possibly leading your Kubelet
> to de-allocate your pod before it transitions from the *container created*
> to the *container running* state.
> **Note:**
>
> The NVIDIA GPU runtime classes use VFIO cold-plug which, as
> described above, requires the Kata runtime to query Kubelet's Pod Resources
> API to discover allocated GPU devices during sandbox creation. For
> Kubernetes versions **older than 1.34**, you must explicitly enable the
> `KubeletPodResourcesGet` feature gate in your Kubelet configuration. For
> Kubernetes 1.34 and later, this feature is enabled by default.
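
The version rule from the note above can be sketched as a small shell check. This is a minimal illustration, not part of the Kata tooling: the function name `needs_pod_resources_get_gate` is hypothetical, and it only compares the major and minor components of a version string such as `v1.33.2`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) if the given Kubernetes
# version is older than 1.34, i.e. the KubeletPodResourcesGet feature gate
# has to be enabled explicitly. Returns 2 if the version cannot be parsed.
needs_pod_resources_get_gate() {
    local version="${1#v}"          # drop an optional leading "v"
    local major="${version%%.*}"    # text before the first dot
    local rest="${version#*.}"      # text after the first dot
    local minor="${rest%%.*}"       # text before the next dot
    minor="${minor%%[!0-9]*}"       # drop suffixes such as "+"

    case "${major}" in ''|*[!0-9]*) echo "cannot parse version: $1" >&2; return 2 ;; esac
    case "${minor}" in ''|*[!0-9]*) echo "cannot parse version: $1" >&2; return 2 ;; esac

    if [ "${major}" -lt 1 ]; then
        return 0
    fi
    if [ "${major}" -eq 1 ] && [ "${minor}" -lt 34 ]; then
        return 0
    fi
    return 1
}

# Example with a hard-coded version string.
if needs_pod_resources_get_gate "v1.33.2"; then
    echo "enable the KubeletPodResourcesGet feature gate"
fi
```

On a live cluster, the version string could be taken from `kubectl version -o json | jq -r .serverVersion.gitVersion`. If the check succeeds, set `KubeletPodResourcesGet: true` under `featureGates` in your Kubelet configuration.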
#### GPU Operator
Assuming you have the helm tools installed, deploy the latest version of the
GPU Operator as a helm chart (minimum version: `v25.10.0`):
```bash
$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set sandboxWorkloads.enabled=true \
--set sandboxWorkloads.defaultWorkload=vm-passthrough \
--set kataManager.enabled=true \
--set kataManager.config.runtimeClasses=null \
--set kataManager.repository=nvcr.io/nvidia/cloud-native \
--set kataManager.image=k8s-kata-manager \
--set kataManager.version=v0.2.4 \
--set ccManager.enabled=true \
--set ccManager.defaultMode=on \
--set ccManager.repository=nvcr.io/nvidia/cloud-native \
--set ccManager.image=k8s-cc-manager \
--set ccManager.version=v0.2.0 \
--set sandboxDevicePlugin.repository=nvcr.io/nvidia/cloud-native \
--set sandboxDevicePlugin.image=nvidia-sandbox-device-plugin \
--set sandboxDevicePlugin.version=v0.0.1 \
--set 'sandboxDevicePlugin.env[0].name=P_GPU_ALIAS' \
--set 'sandboxDevicePlugin.env[0].value=pgpu' \
--set nfd.enabled=true \
--set nfd.nodefeaturerules=true
```
> **Note:**
>
> For heterogeneous clusters with different GPU types, you can omit
> the `P_GPU_ALIAS` environment variable lines. This will cause the sandbox
> device plugin to create GPU model-specific resource types (e.g.,
> `nvidia.com/GH100_H100L_94GB`) instead of the generic `nvidia.com/pgpu`,
> which in turn can be used by pods through respective resource limits.
> For simplicity, this guide uses the generic alias.
> **Note:**
>
> Using `--set sandboxWorkloads.defaultWorkload=vm-passthrough` causes all
> your nodes to be labeled for GPU VM passthrough. Remove this parameter if
> you intend to only use selected nodes for this scenario, and label these
> nodes by hand, using:
> `kubectl label node <node-name> nvidia.com/gpu.workload.config=vm-passthrough`.
#### Kata Containers
Install the latest Kata Containers helm chart, similar to
[existing documentation](https://github.com/kata-containers/kata-containers/blob/main/tools/packaging/kata-deploy/helm-chart/README.md)
(minimum version: `3.24.0`).
```bash
$ export VERSION=$(curl -sSL https://api.github.com/repos/kata-containers/kata-containers/releases/latest | jq .tag_name | tr -d '"')
$ export CHART="oci://ghcr.io/kata-containers/kata-deploy-charts/kata-deploy"
$ helm install kata-deploy \
--namespace kata-system \
--create-namespace \
-f "https://raw.githubusercontent.com/kata-containers/kata-containers/refs/tags/${VERSION}/tools/packaging/kata-deploy/helm-chart/kata-deploy/try-kata-nvidia-gpu.values.yaml" \
--set nfd.enabled=false \
--set shims.qemu-nvidia-gpu-tdx.enabled=false \
--wait --timeout 10m --atomic \
"${CHART}" --version "${VERSION}"
```
#### Trustee's KBS for remote attestation
For our Kata CI runners we use Trustee's KBS for composite attestation for
secure key release, for instance, for test scenarios which use authenticated
container images. In such scenarios, the credentials to access the
authenticated container registry are only released to the confidential guest
after successful attestation. Please see the section below for more
information about this.
```bash
$ export NVIDIA_VERIFIER_MODE="remote"
$ export KBS_INGRESS="nodeport"
$ bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs
$ bash tests/integration/kubernetes/gha-run.sh install-kbs-client
```
Note that Trustee can also be deployed via any other upstream
mechanism as documented by the
[confidential-containers repository](https://github.com/confidential-containers/trustee).
For our architecture it is important to set up KBS in the remote verifier
mode which requires entering a licensing agreement with NVIDIA, see the
[notes in confidential-containers repository](https://github.com/confidential-containers/trustee/blob/main/deps/verifier/src/nvidia/README.md).
### Cluster validation and preparation
If you did not use the `sandboxWorkloads.defaultWorkload=vm-passthrough`
parameter during GPU operator deployment, label your nodes for GPU VM
passthrough. For example, to use all nodes for GPU passthrough, run:
```bash
$ kubectl label nodes --all nvidia.com/gpu.workload.config=vm-passthrough --overwrite
```
If you intend to run GPU TEE scenarios, check whether the `nvidia-cc-manager`
pod is running. If it is not, label the node as CC capable manually, since
the current GPU Operator node feature rules do not yet recognize all
CC-capable GPU PCI IDs:
```bash
$ kubectl label nodes --all nvidia.com/cc.capable=true
```
After this, ensure the `nvidia-cc-manager` pod is running. With the suggested
parameters for GPU Operator deployment, the `nvidia-cc-manager` will
automatically transition the GPU into CC mode.
After deployment, you can transition your node(s) to the desired CC state,
using either the `on` or `off` value, depending on your scenario. For the
non-CC scenario, transition to the `off` state via:
`kubectl label nodes --all nvidia.com/cc.mode=off` and wait until all pods
are back running. When an actual change is exercised, various GPU operator
operands will be restarted.
Ensure all pods are running:
```bash
$ kubectl get pods -A
```
On your node(s), verify correct driver binding. Your GPU device should be
bound to the VFIO driver, i.e., showing `Kernel driver in use: vfio-pci`
when running:
```bash
$ lspci -nnk -d 10de:
```
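If `lspci` is not available on the node, a similar check can be sketched directly against sysfs. The following is a minimal example under the assumption that the standard `/sys/bus/pci/devices` layout is present; the function name `list_nvidia_pci_drivers` is hypothetical. It prints every PCI function with the NVIDIA vendor ID `0x10de` together with the driver it is bound to.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list NVIDIA PCI functions (vendor ID 0x10de) and the
# kernel driver bound to each. The optional argument overrides the sysfs
# PCI device directory, which is mainly useful for testing.
list_nvidia_pci_drivers() {
    local root="${1:-/sys/bus/pci/devices}"
    local dev vendor driver
    for dev in "${root}"/*; do
        [ -r "${dev}/vendor" ] || continue
        vendor="$(cat "${dev}/vendor")"
        [ "${vendor}" = "0x10de" ] || continue
        if [ -e "${dev}/driver" ]; then
            # The "driver" entry is a symlink to the bound driver's directory.
            driver="$(basename "$(readlink -f "${dev}/driver")")"
        else
            driver="none"
        fi
        printf '%s %s\n' "$(basename "${dev}")" "${driver}"
    done
}

list_nvidia_pci_drivers
```

For GPU passthrough, each GPU line should end in `vfio-pci`. A GPU reported with `none` or a different driver has not been bound by the `nvidia-vfio-manager` operand.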
### Run the CUDA vectorAdd sample
Create the following file, named `cuda-vectoradd-kata.yaml.in`:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd-kata
  namespace: default
  annotations:
    io.katacontainers.config.hypervisor.kernel_params: "nvrc.smi.srs=1"
spec:
  runtimeClassName: ${GPU_RUNTIME_CLASS_NAME}
  restartPolicy: Never
  containers:
    - name: cuda-vectoradd
      image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
      resources:
        limits:
          nvidia.com/pgpu: "1"
          memory: 16Gi
```
Depending on your scenario and on the CC state, export your desired runtime
class name through the `GPU_RUNTIME_CLASS_NAME` environment variable:
```bash
$ export GPU_RUNTIME_CLASS_NAME="kata-qemu-nvidia-gpu-snp"
```
Then, deploy the sample Kubernetes pod manifest and observe the pod logs:
```bash
$ envsubst < ./cuda-vectoradd-kata.yaml.in | kubectl apply -f -
$ kubectl wait --for=condition=Ready pod/cuda-vectoradd-kata --timeout=60s
$ kubectl logs -n default cuda-vectoradd-kata
```
Expect the following output:
```
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
```
To stop the pod, run: `kubectl delete pod cuda-vectoradd-kata`.
### Next steps
#### Transition between CC and non-CC mode
Use the previously described node labeling approach to transition between
the CC and non-CC mode. In case of the non-CC mode, you can use the
`kata-qemu-nvidia-gpu` value for the `GPU_RUNTIME_CLASS_NAME` runtime class
variable in the above CUDA vectorAdd sample. The `kata-qemu-nvidia-gpu-snp`
runtime class will **NOT** work in this mode - and vice versa.
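The pairing between CC mode and runtime class can be captured in a small helper so that the two do not drift apart. This is a sketch only: the function name `runtime_class_for_cc_mode` is hypothetical, and it covers just the two runtime classes used in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the node's CC mode ("on" or "off") to the runtime
# class that works in that mode, as described in this guide.
runtime_class_for_cc_mode() {
    case "$1" in
        on)  echo "kata-qemu-nvidia-gpu-snp" ;;
        off) echo "kata-qemu-nvidia-gpu" ;;
        *)   echo "unsupported CC mode: $1" >&2; return 1 ;;
    esac
}

# Select the runtime class for a node that was transitioned to CC mode "off".
GPU_RUNTIME_CLASS_NAME="$(runtime_class_for_cc_mode off)"
export GPU_RUNTIME_CLASS_NAME
echo "${GPU_RUNTIME_CLASS_NAME}"
```

The exported variable can then be consumed by the `envsubst` step of the CUDA vectorAdd sample.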
#### Run Kata CI tests locally
Upstream Kata CI runs the CUDA vectorAdd test, a composite attestation test,
and a basic NIM/RAG deployment. Running CI tests for the TEE GPU scenario
requires KBS to be deployed (except for the CUDA vectorAdd test). The best
place to get started running these tests locally is to look into our
[NVIDIA CI workflow manifest](https://github.com/kata-containers/kata-containers/blob/main/.github/workflows/run-k8s-tests-on-nvidia-gpu.yaml)
and into the underlying
[run_kubernetes_nv_tests.sh](https://github.com/kata-containers/kata-containers/blob/main/tests/integration/kubernetes/run_kubernetes_nv_tests.sh)
script. For example, to run the CUDA vectorAdd scenario against the TEE GPU
runtime class use the following commands:
```bash
# create the kata runtime class the test framework uses
$ export KATA_HYPERVISOR=qemu-nvidia-gpu-snp
$ kubectl delete runtimeclass kata --ignore-not-found
$ kubectl get runtimeclass "kata-${KATA_HYPERVISOR}" -o json | \
jq '.metadata.name = "kata" | del(.metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp)' | \
kubectl apply -f -
$ cd tests/integration/kubernetes
$ K8S_TEST_NV="k8s-nvidia-cuda.bats" ./gha-run.sh run-nv-tests
```
> **Note:**
>
> The other scenarios require an NGC API key to run, i.e., to export the
> `NGC_API_KEY` variable with a valid NGC API key.
#### Deploy pods using attestation
Attestation is a fundamental piece of the confidential containers solution.
Our upstream CI exercises attestation in two ways: by pulling container
images that reside in the authenticated NVCR registry
(`k8s-nvidia-nim.bats`), and by
requesting secrets from KBS (`k8s-confidential-attestation.bats`). KBS will
release the image pull secret to a confidential guest. To get the
authentication credentials from inside the guest, KBS must already be
deployed and configured. In our CI samples, we configure KBS with the guest
image pull secret, a resource policy, and launch the pod with certain kernel
command line parameters:
`"agent.image_registry_auth=kbs:///default/credentials/nvcr agent.aa_kbc_params=cc_kbc::${CC_KBS_ADDR}"`.
The `agent.aa_kbc_params` option is a general configuration for attestation.
For your use case, you need to set the IP address and port under which KBS
is reachable through the `CC_KBS_ADDR` variable (see our CI sample). This
tells the guest how to reach KBS. Something like this must be set whenever
attestation is used, but on its own this parameter does not trigger
attestation. The `agent.image_registry_auth` option tells the guest to ask
for a resource from KBS and use it as the authentication configuration. When
this is set, the guest will request this resource at boot (and trigger
attestation) regardless of which image is being pulled.
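To make the structure of these parameters concrete, the following shell sketch assembles the kernel command line string from a KBS address. The function name `attestation_kernel_params` and the address used below are placeholders chosen for this example; take the exact address format expected by your deployment from the CI sample.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compose the kernel parameters described above from
# the address under which KBS is reachable.
attestation_kernel_params() {
    local kbs_addr="$1"
    if [ -z "${kbs_addr}" ]; then
        echo "KBS address must not be empty" >&2
        return 1
    fi
    printf 'agent.image_registry_auth=kbs:///default/credentials/nvcr agent.aa_kbc_params=cc_kbc::%s\n' \
        "${kbs_addr}"
}

# Placeholder address (documentation IP range); replace with your KBS endpoint.
CC_KBS_ADDR="http://192.0.2.10:8080"
attestation_kernel_params "${CC_KBS_ADDR}"
```

The resulting string is what a pod would carry as the value of the `io.katacontainers.config.hypervisor.kernel_params` annotation, in the same way the CUDA vectorAdd manifest passes `nvrc.smi.srs=1`.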
To deploy your own pods using authenticated container images, or secure key
release for attestation, follow steps similar to our mentioned CI samples.
#### Deploy pods with Kata agent security policies
With GPU passthrough being supported by the
[genpolicy tool](https://github.com/kata-containers/kata-containers/tree/main/src/tools/genpolicy),
you can use the tool to create a Kata agent security policy. Our CI deploys
all sample pod manifests with a Kata agent security policy.
#### Deploy pods using your own containers and manifests
You can author pod manifests leveraging your own containers, for instance,
containers built using the CUDA container toolkit. We recommend starting
with a CUDA base container.
The GPU is transitioned into the `Ready` state via attestation, for instance,
when pulling authenticated images. If your deployment scenario does not use
attestation, please refer back to the CUDA vectorAdd pod manifest. In this
manifest, we ensure that NVRC sets the GPU to `Ready` state by adding the
following annotation in the manifest:
`io.katacontainers.config.hypervisor.kernel_params: "nvrc.smi.srs=1"`
> **Notes:**
>
> - musl-based container images (e.g., using Alpine), or distro-less
> containers are not supported.
> - for the TEE scenario, only single-GPU passthrough per pod is supported,
> so your pod resource limit must be: `nvidia.com/pgpu: "1"` (on a system
> with multiple GPUs, you can thus pass through one GPU per pod).

@@ -1,10 +1,25 @@
# Using NVIDIA GPU device with Kata Containers
This page gives an overview of the different modes in which GPUs can be passed
to a Kata Containers container, provides host system requirements, explains how
Kata Containers guest components can be built to support the NVIDIA GPU
scenario, and gives practical usage examples using `ctr`.
Please see the guide
[Enabling NVIDIA GPU workloads using GPU passthrough with Kata Containers](NVIDIA-GPU-passthrough-and-Kata-QEMU.md)
for documentation of an end-to-end reference implementation of a Kata
Containers stack for GPU passthrough using QEMU, the go-based Kata Runtime,
and an NVIDIA-specific root filesystem. This reference implementation is built
and validated in Kata's CI, and it can be used to test GPU workloads with Kata
components and Kubernetes out of the box.
## Comparison between Passthrough and vGPU Modes
An NVIDIA GPU device can be passed to a Kata Containers container using GPU
passthrough (NVIDIA GPU pass-through mode) as well as GPU mediated passthrough
passthrough (NVIDIA GPU passthrough mode) as well as GPU mediated passthrough
(NVIDIA `vGPU` mode).
NVIDIA GPU pass-through mode, an entire physical GPU is directly assigned to one
NVIDIA GPU passthrough mode, an entire physical GPU is directly assigned to one
VM, bypassing the NVIDIA Virtual GPU Manager. In this mode of operation, the GPU
is accessed exclusively by the NVIDIA driver running in the VM to which it is
assigned. The GPU is not shared among VMs.
@@ -20,18 +35,20 @@ with [MIG-slices](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
| Technology | Description | Behavior | Detail |
| --- | --- | --- | --- |
| NVIDIA GPU pass-through mode | GPU passthrough | Physical GPU assigned to a single VM | Direct GPU assignment to VM without limitation |
| NVIDIA GPU passthrough mode | GPU passthrough | Physical GPU assigned to a single VM | Direct GPU assignment to VM without limitation |
| NVIDIA vGPU time-sliced | GPU time-sliced | Physical GPU time-sliced for multiple VMs | Mediated passthrough |
| NVIDIA vGPU MIG-backed | GPU with MIG-slices | Physical GPU MIG-sliced for multiple VMs | Mediated passthrough |
## Hardware Requirements
## Host Requirements
NVIDIA GPUs Recommended for Virtualization:
### Hardware
NVIDIA GPUs recommended for virtualization:
- NVIDIA Tesla (T4, M10, P6, V100 or newer)
- NVIDIA Quadro RTX 6000/8000
## Host BIOS Requirements
### Firmware
Some hardware requires a larger PCI BARs window, for example, NVIDIA Tesla P100,
K40m
@@ -55,9 +72,7 @@ Some hardware vendors use a different name in BIOS, such as:
If one is using a GPU based on the Ampere architecture and later additionally
SR-IOV needs to be enabled for the `vGPU` use-case.
The following steps outline the workflow for using an NVIDIA GPU with Kata.
## Host Kernel Requirements
### Kernel
The following configurations need to be enabled on your host kernel:
@@ -70,7 +85,13 @@ The following configurations need to be enabled on your host kernel:
Your host kernel needs to be booted with `intel_iommu=on` on the kernel command
line.
## Install and configure Kata Containers
## Build the Kata Components
This section explains how to build an environment with Kata Containers bits
supporting the GPU scenario. We first deploy and configure the regular Kata
components, then describe how to build the guest kernel and root filesystem.
### Install and configure Kata Containers
To use non-large BARs devices (for example, NVIDIA Tesla T4), you need Kata
version 1.3.0 or above. Follow the [Kata Containers setup
@@ -101,7 +122,7 @@ hotplug_vfio_on_root_bus = true
pcie_root_port = 1
```
## Build Kata Containers kernel with GPU support
### Build guest kernel with GPU support
The default guest kernel installed with Kata Containers does not provide GPU
support. To use an NVIDIA GPU with Kata Containers, you need to build a kernel
@@ -160,11 +181,11 @@ code, using `Dragonball VMM` for NVIDIA GPU `hot-plug/hot-unplug` requires apply
addition to the above kernel configuration items. Follow these steps to build for NVIDIA GPU `hot-[un]plug`
for `Dragonball`:
```sh
# Prepare .config to support both upcall and nvidia gpu
```sh
# Prepare .config to support both upcall and nvidia gpu
$ ./build-kernel.sh -v 5.10.25 -e -t dragonball -g nvidia -f setup
# Build guest kernel to support both upcall and nvidia gpu
# Build guest kernel to support both upcall and nvidia gpu
$ ./build-kernel.sh -v 5.10.25 -e -t dragonball -g nvidia build
# Install guest kernel to support both upcall and nvidia gpu
@@ -196,303 +217,7 @@ Before using the new guest kernel, please update the `kernel` parameters in
kernel = "/usr/share/kata-containers/vmlinuz-nvidia-gpu.container"
```
## NVIDIA GPU pass-through mode with Kata Containers
Use the following steps to pass an NVIDIA GPU device in pass-through mode with Kata:
1. Find the Bus-Device-Function (BDF) for the GPU device on the host:
```sh
$ sudo lspci -nn -D | grep -i nvidia
0000:d0:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b9] (rev a1)
```
> PCI address `0000:d0:00.0` is assigned to the hardware GPU device.
> `10de:20b9` is the device ID of the hardware GPU device.
2. Find the IOMMU group for the GPU device:
```sh
$ BDF="0000:d0:00.0"
$ readlink -e /sys/bus/pci/devices/$BDF/iommu_group
```
The previous output shows that the GPU belongs to IOMMU group 192. The next
step is to bind the GPU to the VFIO-PCI driver.
```sh
$ BDF="0000:d0:00.0"
$ DEV="/sys/bus/pci/devices/$BDF"
$ echo "vfio-pci" > $DEV/driver_override
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
# To return the device to the standard driver, we simply clear the
# driver_override and reprobe the device, ex:
$ echo > $DEV/driver_override
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
```
3. Check the IOMMU group number under `/dev/vfio`:
```sh
$ ls -l /dev/vfio
total 0
crw------- 1 zvonkok zvonkok 243, 0 Mar 18 03:06 192
crw-rw-rw- 1 root root 10, 196 Mar 18 02:27 vfio
```
4. Start a Kata container with the GPU device:
```sh
# You may need to `modprobe vhost-vsock` if you get
# host system doesn't support vsock: stat /dev/vhost-vsock
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch uname -r
```
5. Run `lspci` within the container to verify that the GPU device appears in the list
of PCI devices. Note the vendor-device ID of the GPU (`10de:20b9`) in the `lspci` output.
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -nn | grep '10de:20b9'"
```
6. Additionally, you can check the PCI BAR space of the NVIDIA GPU device in the container:
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -s 02:00.0 -vv | grep Region"
```
> **Note**: If the command prints one or more `Region` lines, the BAR space of the NVIDIA
> GPU has been successfully allocated.
## NVIDIA vGPU mode with Kata Containers
NVIDIA vGPU is a licensed product on all supported GPU boards. A software license
is required to enable all vGPU features within the guest VM. NVIDIA vGPU manager
needs to be installed on the host to configure GPUs in vGPU mode. See [NVIDIA Virtual GPU Software Documentation v14.0 through 14.1](https://docs.nvidia.com/grid/14.0/) for more details.
### NVIDIA vGPU time-sliced
In time-sliced mode, the GPU is not partitioned: the workload uses the
whole GPU and shares access to the GPU engines, and processes are scheduled in
series. The best-effort scheduler is the default; the documentation linked above
describes how to select a different scheduling policy.
If `MIG` was previously enabled on the GPU, disable it before using
`time-sliced` `vGPU`.
```sh
$ sudo nvidia-smi -mig 0
```
Enable the virtual functions for the physical GPU in the `sysfs` file system.
```sh
$ sudo /usr/lib/nvidia/sriov-manage -e 0000:41:00.0
```
Get the `BDF` of the available virtual functions on the GPU, and choose one for the
following steps.
```sh
$ cd /sys/bus/pci/devices/0000:41:00.0/
$ ls -l | grep virtfn
```
#### List all available vGPU instances
The following shell snippet walks `sysfs` and prints only the vGPU types
that still have available instances, i.e. that can be created.
```sh
# The 00.0 is often the PF of the device. The VFs will have the function in the
# BDF incremented by some values so e.g. the very first VF is 0000:41:00.4
cd /sys/bus/pci/devices/0000:41:00.0/
for vf in $(ls -d virtfn*)
do
    BDF=$(basename $(readlink -f $vf))
    for md in $(ls -d $vf/mdev_supported_types/*)
    do
        AVAIL=$(cat $md/available_instances)
        NAME=$(cat $md/name)
        DIR=$(basename $md)
        if [ $AVAIL -gt 0 ]; then
            echo "| BDF | INSTANCES | NAME | DIR |"
            echo "+--------------+-----------+----------------+------------+"
            printf "| %12s |%10d |%15s | %10s |\n\n" "$BDF" "$AVAIL" "$NAME" "$DIR"
        fi
    done
done
```
If instances are available, the output looks similar to the following (shown for
the first VF). The output depends heavily on the GPU model. If there is no
output, check again that `MIG` is really disabled.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-4C | nvidia-692 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-8C | nvidia-693 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-10C | nvidia-694 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-16C | nvidia-695 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-20C | nvidia-696 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-40C | nvidia-697 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-80C | nvidia-698 |
```
Change to the `mdev_supported_types` directory for the virtual function on which
you want to create the `vGPU`. Taking the first output as an example:
```sh
$ cd virtfn0/mdev_supported_types/nvidia-692
$ UUIDGEN=$(uuidgen)
$ sudo bash -c "echo $UUIDGEN > create"
```
Confirm that the `vGPU` was created. You should see the `UUID` pointing to a
subdirectory of the `sysfs` space.
```sh
$ ls -l /sys/bus/mdev/devices/
```
Get the `IOMMU` group number and verify there is a `VFIO` device created to use
with Kata.
```sh
$ ls -l /sys/bus/mdev/devices/*/
$ ls -l /dev/vfio
```
Use the `VFIO` device created in the same way as in the pass-through use case.
Note that the guest needs the NVIDIA guest drivers, so a new guest `OS` image
must be built.
### NVIDIA vGPU MIG-backed
This section does not describe `MIG` in detail. Briefly, it is a technology that
partitions the hardware into independent instances with guaranteed quality of
service. For more details see [NVIDIA Multi-Instance GPU User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
First enable `MIG` mode for the GPU. Depending on the platform, a reboot may be
necessary; some platforms support a GPU reset instead.
```sh
$ sudo nvidia-smi -mig 1
```
If the platform supports a GPU reset, run the following command; otherwise you
will get a warning to reboot the server.
```sh
$ sudo nvidia-smi --gpu-reset
```
By default, the driver provides a number of profiles that users can opt into when
configuring the MIG feature.
```sh
$ sudo nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 7/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
...
```
Create the GPU instances that correspond to the `vGPU` types of the `MIG-backed`
`vGPUs` that you will create; see [NVIDIA A100 PCIe 80GB Virtual GPU Types](https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html#vgpu-types-nvidia-a100-pcie-80gb).
```sh
# MIG 1g.10gb --> vGPU A100D-1-10C
$ sudo nvidia-smi mig -cgi 19
```
List the GPU instances and get the GPU instance ID to create the compute
instance.
```sh
$ sudo nvidia-smi mig -lgi # list the created GPU instances
$ sudo nvidia-smi mig -cci -gi 9 # each GPU instance can have several compute
# instances. Instance -> Workload
```
Verify that the compute instances were created within the GPU instance:
```sh
$ nvidia-smi
... snip ...
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 9 0 0 | 0MiB / 9728MiB | 14 0 | 1 0 0 0 0 |
| | 0MiB / 4095MiB | | |
+------------------+----------------------+-----------+-----------------------+
... snip ...
```
We can use the [snippet](#list-all-available-vgpu-instances) from before to list
the available `vGPU` instances, this time `MIG-backed`.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.5 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:01.6 | 1 |GRID A100D-1-10C | nvidia-699 |
... snip ...
```
Repeat the steps after the [snippet](#list-all-available-vgpu-instances) listing
to create the corresponding `mdev` device and use the guest `OS` created in the
previous section with `time-sliced` `vGPUs`.
## Install NVIDIA Driver + Toolkit in Kata Containers Guest OS
### Build Guest OS with NVIDIA Driver and Toolkit
Consult the [Developer-Guide](https://github.com/kata-containers/kata-containers/blob/main/docs/Developer-Guide.md#create-a-rootfs-image) on how to create a
rootfs base image for a distribution of your choice. This is going to be used as
@@ -583,9 +308,12 @@ Enable the `guest_hook_path` in Kata's `configuration.toml`
guest_hook_path = "/usr/share/oci/hooks"
```
As the last step one can remove the additional packages and files that were added
to the `$ROOTFS_DIR` to keep it as small as possible.
With the NVIDIA rootfs and kernel built, any GPU container can now run
without installing the drivers into the container. Check the NVIDIA device status
with `nvidia-smi`:
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/nvidia/cuda:11.6.0-base-ubuntu20.04" cuda nvidia-smi
@@ -611,8 +339,309 @@ Fri Mar 18 10:36:59 2022
+-----------------------------------------------------------------------------+
```
As the last step one can remove the additional packages and files that were added
to the `$ROOTFS_DIR` to keep it as small as possible.
## Usage Examples with Kata Containers
The following sections give usage examples for the different modes.
### NVIDIA GPU passthrough mode
Use the following steps to pass an NVIDIA GPU device in passthrough mode with Kata:
1. Find the Bus-Device-Function (BDF) for the GPU device on the host:
```sh
$ sudo lspci -nn -D | grep -i nvidia
0000:d0:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b9] (rev a1)
```
> PCI address `0000:d0:00.0` is assigned to the hardware GPU device.
> `10de:20b9` is the device ID of the hardware GPU device.
2. Find the IOMMU group for the GPU device:
```sh
$ BDF="0000:d0:00.0"
$ readlink -e /sys/bus/pci/devices/$BDF/iommu_group
```
The last component of the printed path is the IOMMU group number; in this example
the GPU belongs to IOMMU group 192. The next step is to bind the GPU to the VFIO-PCI driver.
```sh
$ BDF="0000:d0:00.0"
$ DEV="/sys/bus/pci/devices/$BDF"
$ echo "vfio-pci" > $DEV/driver_override
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
# To return the device to its standard driver, clear
# driver_override and reprobe the device, e.g.:
$ echo > $DEV/driver_override
$ echo $BDF > $DEV/driver/unbind
$ echo $BDF > /sys/bus/pci/drivers_probe
```
3. Check the IOMMU group number under `/dev/vfio`:
```sh
$ ls -l /dev/vfio
total 0
crw------- 1 zvonkok zvonkok 243, 0 Mar 18 03:06 192
crw-rw-rw- 1 root root 10, 196 Mar 18 02:27 vfio
```
4. Start a Kata container with the GPU device:
```sh
# You may need to `modprobe vhost-vsock` if you get
# host system doesn't support vsock: stat /dev/vhost-vsock
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch uname -r
```
5. Run `lspci` within the container to verify that the GPU device appears in the list
of PCI devices. Note the vendor-device ID of the GPU (`10de:20b9`) in the `lspci` output.
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -nn | grep '10de:20b9'"
```
6. Additionally, you can check the PCI BAR space of the NVIDIA GPU device in the container:
```sh
$ sudo ctr --debug run --runtime "io.containerd.kata.v2" --device /dev/vfio/192 --rm -t "docker.io/library/archlinux:latest" arch sh -c "lspci -s 02:00.0 -vv | grep Region"
```
> **Note**: If the command prints one or more `Region` lines, the BAR space of the NVIDIA
> GPU has been successfully allocated.
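
Before starting the container, it can help to confirm that the bind in step 2 actually took effect. The following is a minimal sketch, not part of the upstream tooling: `pci_driver_of` is a hypothetical helper name, and the optional second argument (an alternative sysfs root) is an assumption added only so the function can be exercised without real hardware. It relies on the `driver` symlink that sysfs exposes for a bound PCI device.

```sh
# Hypothetical helper: print the kernel driver a PCI device is bound to.
# $1: BDF of the device, e.g. 0000:d0:00.0
# $2: sysfs root (assumption for testing; defaults to /sys)
pci_driver_of() {
    bdf="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/bus/pci/devices/$bdf/driver"
    # A device that is not bound to any driver has no `driver` symlink.
    [ -e "$link" ] || { echo "none"; return 1; }
    basename "$(readlink -f "$link")"
}
```

For the example device, `pci_driver_of 0000:d0:00.0` is expected to print `vfio-pci` once the bind has succeeded.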
### NVIDIA vGPU mode
NVIDIA vGPU is a licensed product on all supported GPU boards. A software license
is required to enable all vGPU features within the guest VM. NVIDIA vGPU manager
needs to be installed on the host to configure GPUs in vGPU mode. See
[NVIDIA Virtual GPU Software Documentation v14.0 through 14.1](https://docs.nvidia.com/grid/14.0/)
for more details.
#### NVIDIA vGPU time-sliced
In time-sliced mode, the GPU is not partitioned: the workload uses the
whole GPU and shares access to the GPU engines, and processes are scheduled in
series. The best-effort scheduler is the default; the documentation linked above
describes how to select a different scheduling policy.
If `MIG` was previously enabled on the GPU, disable it before using
`time-sliced` `vGPU`.
```sh
$ sudo nvidia-smi -mig 0
```
Enable the virtual functions for the physical GPU in the `sysfs` file system.
```sh
$ sudo /usr/lib/nvidia/sriov-manage -e 0000:41:00.0
```
Get the `BDF` of the available virtual functions on the GPU, and choose one for the
following steps.
```sh
$ cd /sys/bus/pci/devices/0000:41:00.0/
$ ls -l | grep virtfn
```
##### List all available vGPU instances
The following shell snippet walks `sysfs` and prints only the vGPU types
that still have available instances, i.e. that can be created.
```sh
# The 00.0 is often the PF of the device. The VFs will have the function in the
# BDF incremented by some values so e.g. the very first VF is 0000:41:00.4
cd /sys/bus/pci/devices/0000:41:00.0/
for vf in $(ls -d virtfn*)
do
    BDF=$(basename $(readlink -f $vf))
    for md in $(ls -d $vf/mdev_supported_types/*)
    do
        AVAIL=$(cat $md/available_instances)
        NAME=$(cat $md/name)
        DIR=$(basename $md)
        if [ $AVAIL -gt 0 ]; then
            echo "| BDF | INSTANCES | NAME | DIR |"
            echo "+--------------+-----------+----------------+------------+"
            printf "| %12s |%10d |%15s | %10s |\n\n" "$BDF" "$AVAIL" "$NAME" "$DIR"
        fi
    done
done
```
If instances are available, the output looks similar to the following (shown for
the first VF). The output depends heavily on the GPU model. If there is no
output, check again that `MIG` is really disabled.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-4C | nvidia-692 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-8C | nvidia-693 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-10C | nvidia-694 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-16C | nvidia-695 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-20C | nvidia-696 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-40C | nvidia-697 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 | GRID A100D-80C | nvidia-698 |
```
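For scripting, a variant of the listing snippet that avoids parsing `ls` output and prints one plain line per available type may be easier to consume. This is only a sketch under stated assumptions: the function name `list_available_vgpu_types` is hypothetical, the PF directory argument defaults to the example device used in this guide, and the output format (BDF, free instances, type directory, type name) is an illustrative choice rather than anything the driver defines.

```sh
# Hypothetical helper: list mdev types with free instances for every VF of a PF.
# $1: sysfs directory of the physical function (assumed default: the example PF)
list_available_vgpu_types() {
    pf_dir="${1:-/sys/bus/pci/devices/0000:41:00.0}"
    for vf in "$pf_dir"/virtfn*; do
        # The glob stays unexpanded when no VFs are enabled; skip that case.
        [ -e "$vf" ] || continue
        bdf=$(basename "$(readlink -f "$vf")")
        for md in "$vf"/mdev_supported_types/*; do
            [ -r "$md/available_instances" ] || continue
            avail=$(cat "$md/available_instances")
            # Only report types that can still be created.
            [ "$avail" -gt 0 ] || continue
            printf '%s %s %s %s\n' "$bdf" "$avail" "$(basename "$md")" "$(cat "$md/name")"
        done
    done
}
```

Because each record is a single space-separated line with the type name last, the output can be filtered with standard tools such as `grep` or `awk`.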
Change to the `mdev_supported_types` directory for the virtual function on which
you want to create the `vGPU`. Taking the first output as an example:
```sh
$ cd virtfn0/mdev_supported_types/nvidia-692
$ UUIDGEN=$(uuidgen)
$ sudo bash -c "echo $UUIDGEN > create"
```
Confirm that the `vGPU` was created. You should see the `UUID` pointing to a
subdirectory of the `sysfs` space.
```sh
$ ls -l /sys/bus/mdev/devices/
```
Get the `IOMMU` group number and verify there is a `VFIO` device created to use
with Kata.
```sh
$ ls -l /sys/bus/mdev/devices/*/
$ ls -l /dev/vfio
```
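The two `ls` commands above can be combined into a small lookup that resolves the IOMMU group of a given `mdev` device directly. The sketch below assumes the usual sysfs layout in which each `mdev` device exposes an `iommu_group` symlink; `mdev_iommu_group` is a hypothetical name, and the optional sysfs root argument exists only to make the function testable without a GPU.

```sh
# Hypothetical helper: print the IOMMU group number of an mdev device.
# $1: UUID used when the mdev device was created
# $2: sysfs root (assumption for testing; defaults to /sys)
mdev_iommu_group() {
    uuid="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/bus/mdev/devices/$uuid/iommu_group"
    # Fail if the mdev device does not exist or has no IOMMU group.
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}
```

With the `UUIDGEN` variable from the creation step still set, `ls -l "/dev/vfio/$(mdev_iommu_group "$UUIDGEN")"` would show the `VFIO` device to pass to Kata.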
Use the `VFIO` device created in the same way as in the passthrough use case.
Note that the guest needs the NVIDIA guest drivers, so a new guest `OS` image
must be built.
#### NVIDIA vGPU MIG-backed
This section does not describe `MIG` in detail. Briefly, it is a technology that
partitions the hardware into independent instances with guaranteed quality of
service. For more details see
[NVIDIA Multi-Instance GPU User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
First enable `MIG` mode for the GPU. Depending on the platform, a reboot may be
necessary; some platforms support a GPU reset instead.
```sh
$ sudo nvidia-smi -mig 1
```
If the platform supports a GPU reset, run the following command; otherwise you
will get a warning to reboot the server.
```sh
$ sudo nvidia-smi --gpu-reset
```
By default, the driver provides a number of profiles that users can opt into when
configuring the MIG feature.
```sh
$ sudo nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 7/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
...
```
Create the GPU instances that correspond to the `vGPU` types of the `MIG-backed`
`vGPUs` that you will create; see
[NVIDIA A100 PCIe 80GB Virtual GPU Types](https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html#vgpu-types-nvidia-a100-pcie-80gb).
```sh
# MIG 1g.10gb --> vGPU A100D-1-10C
$ sudo nvidia-smi mig -cgi 19
```
List the GPU instances and get the GPU instance ID to create the compute
instance.
```sh
$ sudo nvidia-smi mig -lgi # list the created GPU instances
$ sudo nvidia-smi mig -cci -gi 9 # each GPU instance can have several compute
# instances. Instance -> Workload
```
Verify that the compute instances were created within the GPU instance:
```sh
$ nvidia-smi
... snip ...
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 9 0 0 | 0MiB / 9728MiB | 14 0 | 1 0 0 0 0 |
| | 0MiB / 4095MiB | | |
+------------------+----------------------+-----------+-----------------------+
... snip ...
```
We can use the [snippet](#list-all-available-vgpu-instances) from before to list
the available `vGPU` instances, this time `MIG-backed`.
```sh
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.4 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:00.5 | 1 |GRID A100D-1-10C | nvidia-699 |
| BDF | INSTANCES | NAME | DIR |
+--------------+-----------+----------------+------------+
| 0000:41:01.6 | 1 |GRID A100D-1-10C | nvidia-699 |
... snip ...
```
Repeat the steps after the [snippet](#list-all-available-vgpu-instances) listing
to create the corresponding `mdev` device and use the guest `OS` created in the
previous section with `time-sliced` `vGPUs`.
## References


@@ -1,24 +1,20 @@
# Table of Contents
**Note:** This guide used to contain an end-to-end flow to build a
custom Kata Containers root filesystem with the QAT out-of-tree SR-IOV virtual
function driver and run QAT-enabled containers. That flow is no longer necessary,
so the instructions have been dropped. If the use case is still of interest, please file
an issue in either of the QAT Kubernetes-specific repos linked below.
# Introduction
Intel® QuickAssist Technology (QAT) provides hardware acceleration
for security (cryptography) and compression. These instructions cover the
steps for the latest [Ubuntu LTS release](https://ubuntu.com/download/desktop)
which already includes the QAT host driver. These instructions can be adapted to
any Linux distribution. These instructions guide the user on how to download
the kernel sources, compile kernel driver modules against those sources, and
load them onto the host as well as preparing a specially built Kata Containers
kernel and custom Kata Containers rootfs.
for security (cryptography) and compression. Kata Containers can enable
these acceleration functions for containers using QAT SR-IOV with the
support from [Intel QAT Device Plugin for Kubernetes](https://github.com/intel/intel-device-plugins-for-kubernetes)
or [Intel QAT DRA Resource Driver for Kubernetes](https://github.com/intel/intel-resource-drivers-for-kubernetes).
* Download kernel sources
* Compile Kata kernel
* Compile kernel driver modules against those sources
* Download rootfs
* Add driver modules to rootfs
* Build rootfs image
## Helpful Links before starting
## More Information
[Intel® QuickAssist Technology at `01.org`](https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html)
@@ -26,554 +22,6 @@ kernel and custom Kata Containers rootfs.
[Intel Device Plugin for Kubernetes](https://github.com/intel/intel-device-plugins-for-kubernetes)
[Intel DRA Resource Driver for Kubernetes](https://github.com/intel/intel-resource-drivers-for-kubernetes)
[Intel® QuickAssist Technology for Crypto Poll Mode Driver](https://dpdk-docs.readthedocs.io/en/latest/cryptodevs/qat.html)
## Steps to enable Intel® QAT in Kata Containers
There are some steps to complete only once, some steps to complete with every
reboot, and some steps to complete when the host kernel changes.
## Script variables
The following list of variables must be set before running through the
scripts. These variables refer to locations to store modules and configuration
files on the host and links to the drivers to use. Modify these as
needed to point to updated drivers or different install locations.
### Set environment variables (Every Reboot)
Make sure to check [`01.org`](https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html) for
the latest driver.
```bash
$ export QAT_DRIVER_VER=qat1.7.l.4.14.0-00031.tar.gz
$ export QAT_DRIVER_URL=https://downloadmirror.intel.com/30178/eng/${QAT_DRIVER_VER}
$ export QAT_CONF_LOCATION=~/QAT_conf
$ export QAT_DOCKERFILE=https://raw.githubusercontent.com/intel/intel-device-plugins-for-kubernetes/main/demo/openssl-qat-engine/Dockerfile
$ export QAT_SRC=~/src/QAT
$ export GOPATH=~/src/go
$ export KATA_KERNEL_LOCATION=~/kata
$ export KATA_ROOTFS_LOCATION=~/kata
```
## Prepare the Ubuntu Host
The host could be a bare metal instance or a virtual machine. If using a
virtual machine, make sure that KVM nesting is enabled. The following
instructions reference an Intel® C62X chipset. Some of the instructions must be
modified if using a different Intel® QAT device. The Intel® QAT chipset can be
identified by executing the following.
### Identify which PCI Bus the Intel® QAT card is on
```bash
$ for i in 0434 0435 37c8 1f18 1f19; do lspci -d 8086:$i; done
```
### Install necessary packages for Ubuntu
These packages are necessary to compile the Kata kernel, Intel® QAT driver, and to
prepare the rootfs for Kata. [Docker](https://docs.docker.com/engine/install/ubuntu/)
also needs to be installed to be able to build the rootfs. To test that
everything works a Kubernetes pod is started requesting Intel® QAT resources. For the
passthrough of the virtual functions, the kernel boot parameters must include
`intel_iommu=on`.
```bash
$ sudo apt update
$ sudo apt install -y golang-go build-essential python pkg-config zlib1g-dev libudev-dev bison libelf-dev flex libtool automake autotools-dev autoconf bc libpixman-1-dev coreutils libssl-dev
$ sudo sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT=""/GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"/' /etc/default/grub
$ sudo update-grub
$ sudo reboot
```
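After the reboot, it is worth confirming that the parameter actually reached the running kernel, since the `sed` command above only matches an empty `GRUB_CMDLINE_LINUX_DEFAULT`. The following is a minimal sketch; `has_kernel_param` is a hypothetical helper, and the optional file argument is an assumption that allows checking a saved command line instead of `/proc/cmdline`.

```sh
# Hypothetical helper: succeed if a parameter is present on the kernel command line.
# $1: exact parameter to look for, e.g. intel_iommu=on
# $2: file holding the command line (assumption for testing; defaults to /proc/cmdline)
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Split the command line into one word per line and match the whole word literally.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param intel_iommu=on || echo "IOMMU parameter missing"` prints a warning when the GRUB change did not take effect.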
### Download Intel® QAT drivers
This will download the [Intel® QAT drivers](https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html).
Make sure to check the website for the latest version.
```bash
$ mkdir -p $QAT_SRC
$ cd $QAT_SRC
$ curl -L $QAT_DRIVER_URL | tar zx
```
### Copy Intel® QAT configuration files and enable virtual functions
Modify the instructions below as necessary if using a different Intel® QAT hardware
platform. You can learn more about customizing configuration files at the
[Intel® QAT Engine repository](https://github.com/intel/QAT_Engine/#copy-the-correct-intel-quickassist-technology-driver-config-files).
This section starts from a base config file and changes the `SSL` section to
`SHIM` to support the OpenSSL engine. There are more tweaks that you can make
depending on the use case and how many Intel® QAT engines should be run. You
can find more information about how to customize in the
[Intel® QuickAssist Technology Software for Linux* - Programmer's Guide.](https://www.intel.com/content/www/us/en/content-details/709196/intel-quickassist-technology-api-programmer-s-guide.html)
> **Note: This section assumes that an Intel® QAT `c6xx` platform is used.**
```bash
$ mkdir -p $QAT_CONF_LOCATION
$ cp $QAT_SRC/quickassist/utilities/adf_ctl/conf_files/c6xxvf_dev0.conf.vm $QAT_CONF_LOCATION/c6xxvf_dev0.conf
$ sed -i 's/\[SSL\]/\[SHIM\]/g' $QAT_CONF_LOCATION/c6xxvf_dev0.conf
```
### Expose and Bind Intel® QAT virtual functions to VFIO-PCI (Every reboot)
To enable virtual functions, the host OS should have IOMMU groups enabled. In
the UEFI Firmware Intel® Virtualization Technology for Directed I/O
(Intel® VT-d) must be enabled. Also, the kernel boot parameter should be
`intel_iommu=on` or `intel_iommu=igfx_off`. This should have been set from
the instructions above. Check the output of `/proc/cmdline` to confirm. The
following commands assume you installed an Intel® QAT card, IOMMU is on, and
VT-d is enabled. The vendor and device ID are added to the `VFIO-PCI` driver so that
each exposed virtual function can be bound to the `VFIO-PCI` driver. Once
complete, each virtual function can be passed into a Kata Containers container using
the PCIe device passthrough feature. For Kubernetes, the
[Intel device plugin](https://github.com/intel/intel-device-plugins-for-kubernetes)
for Kubernetes handles the binding of the driver, but the VFs still must be
enabled.
```bash
$ sudo modprobe vfio-pci
$ QAT_PCI_BUS_PF_NUMBERS=$((lspci -d :435 && lspci -d :37c8 && lspci -d :19e2 && lspci -d :6f54) | cut -d ' ' -f 1)
$ QAT_PCI_BUS_PF_1=$(echo $QAT_PCI_BUS_PF_NUMBERS | cut -d ' ' -f 1)
$ echo 16 | sudo tee /sys/bus/pci/devices/0000:$QAT_PCI_BUS_PF_1/sriov_numvfs
$ QAT_PCI_ID_VF=$(cat /sys/bus/pci/devices/0000:${QAT_PCI_BUS_PF_1}/virtfn0/uevent | grep PCI_ID)
$ QAT_VENDOR_AND_ID_VF=$(echo ${QAT_PCI_ID_VF/PCI_ID=} | sed 's/:/ /')
$ echo $QAT_VENDOR_AND_ID_VF | sudo tee --append /sys/bus/pci/drivers/vfio-pci/new_id
```
Loop through all the virtual functions and bind them to the VFIO driver:
```bash
$ for f in /sys/bus/pci/devices/0000:$QAT_PCI_BUS_PF_1/virtfn*
do QAT_PCI_BUS_VF=$(basename $(readlink $f))
echo $QAT_PCI_BUS_VF | sudo tee --append /sys/bus/pci/drivers/c6xxvf/unbind
echo $QAT_PCI_BUS_VF | sudo tee --append /sys/bus/pci/drivers/vfio-pci/bind
done
```
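The `new_id` interface expects the vendor and device ID separated by a space, while the `uevent` file stores them as `PCI_ID=<vendor>:<device>`. The conversion done above with `grep`, parameter expansion, and `sed` can be sketched as a single step. `vf_vendor_device` is a hypothetical helper name introduced here for illustration; it takes the path of a `uevent` file as its only argument.

```sh
# Hypothetical helper: convert the PCI_ID line of a uevent file into the
# "<vendor> <device>" form that /sys/bus/pci/drivers/vfio-pci/new_id accepts.
# $1: path to a uevent file, e.g. .../virtfn0/uevent
vf_vendor_device() {
    uevent="$1"
    # Match only the PCI_ID line (not PCI_SUBSYS_ID) and replace ':' with a space.
    sed -n 's/^PCI_ID=\([0-9A-Fa-f]*\):\([0-9A-Fa-f]*\)$/\1 \2/p' "$uevent"
}
```

Its output can be piped to `sudo tee --append /sys/bus/pci/drivers/vfio-pci/new_id` in the same way as `$QAT_VENDOR_AND_ID_VF` above.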
### Check Intel® QAT virtual functions are enabled
If the following command returns nothing, then the virtual functions are not
properly enabled. This command checks the enumerated device IDs for just the
virtual functions. Using the Intel® C62X chipset as an example, the physical device ID
is `37c8` and the virtual function device ID is `37c9`. The command checks
whether VFs are enabled for any of the currently known Intel® QAT device IDs. The
`ls` command further below should show the 16 VFs bound to `VFIO-PCI`.
```bash
$ for i in 0442 0443 37c9 19e3; do lspci -d 8086:$i; done
```
Another way to check is to see which PCI devices `VFIO-PCI` is bound to.
They should match the device IDs of the VFs.
```bash
$ ls -la /sys/bus/pci/drivers/vfio-pci
```
## Prepare Kata Containers
### Download Kata kernel Source
This example automatically uses the latest Kata kernel supported by Kata. It
follows the instructions from the
[packaging kernel repository](../../tools/packaging/kernel)
and uses the latest Kata kernel
[config](../../tools/packaging/kernel/configs).
There are some patches that must be installed as well, which the
`build-kernel.sh` script should automatically apply. If you are using a
different kernel version, then you might need to manually apply them. Since
the Kata Containers kernel has a minimal set of kernel flags set, you must
create an Intel® QAT kernel fragment with the necessary `CONFIG_CRYPTO_*` options set.
Update the config to set some of the `CRYPTO` flags to enabled. This might
change with different kernel versions. The following instructions were tested
with kernel `v5.4.0-64-generic`.
```bash
$ mkdir -p $GOPATH
$ cd $GOPATH
$ go get -v github.com/kata-containers/kata-containers
$ cat << EOF > $GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/configs/fragments/common/qat.conf
CONFIG_PCIEAER=y
CONFIG_UIO=y
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_QAT_C62XVF=m
CONFIG_CRYPTO_CBC=y
CONFIG_MODULES=y
CONFIG_MODULE_SIG=y
CONFIG_CRYPTO_AUTHENC=y
CONFIG_CRYPTO_DH=y
EOF
$ $GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/build-kernel.sh setup
```
### Build Kata kernel
```bash
$ cd $GOPATH
$ export LINUX_VER=$(ls -d kata-linux-*)
$ sed -i 's/EXTRAVERSION =/EXTRAVERSION = .qat.container/' $LINUX_VER/Makefile
$ $GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kernel/build-kernel.sh build
```
### Copy Kata kernel
```bash
$ export KATA_KERNEL_NAME=vmlinux-${LINUX_VER}_qat
$ mkdir -p $KATA_KERNEL_LOCATION
$ cp ${GOPATH}/${LINUX_VER}/vmlinux ${KATA_KERNEL_LOCATION}/${KATA_KERNEL_NAME}
```
### Prepare Kata root filesystem
These instructions build upon the OS builder instructions located in the
[Developer Guide](../Developer-Guide.md). At this point it is recommended that
[Docker](https://docs.docker.com/engine/install/ubuntu/) is installed first, and
then [Kata-deploy](../../tools/packaging/kata-deploy)
is used to install Kata. This will make sure that the correct `agent` version
is installed into the rootfs in the steps below.
The following instructions use Ubuntu as the root filesystem with systemd as
the init and will add in the `kmod` binary, which is not a standard binary in
a Kata rootfs image. The `kmod` binary is necessary to load the Intel® QAT
kernel modules when the virtual machine rootfs boots.
```bash
$ export OSBUILDER=$GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder
$ export ROOTFS_DIR=${OSBUILDER}/rootfs-builder/rootfs
$ export EXTRA_PKGS='kmod'
```
Make sure that the `kata-agent` version matches the installed `kata-runtime`
version. Also make sure the `kata-runtime` install location is in your `PATH`
variable. The following `AGENT_VERSION` can be set manually to match
the `kata-runtime` version if the following commands don't work.
```bash
$ export PATH=$PATH:/opt/kata/bin
$ cd $GOPATH
$ export AGENT_VERSION=$(kata-runtime version | head -n 1 | grep -o "[0-9.]\+")
$ cd ${OSBUILDER}/rootfs-builder
$ sudo rm -rf ${ROOTFS_DIR}
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true SECCOMP=no ./rootfs.sh ubuntu'
```
### Compile Intel® QAT drivers for Kata Containers kernel and add to Kata Containers rootfs
After the Kata Containers kernel builds with the proper configuration flags,
you must build the Intel® QAT drivers against that Kata Containers kernel
version in a similar way to how they were previously built for the host OS. You must
set the `KERNEL_SOURCE_ROOT` variable to the Kata Containers kernel source
directory and build the Intel® QAT drivers again. The `make` command will
install the Intel® QAT modules into the Kata rootfs.
```bash
$ cd $GOPATH
$ export LINUX_VER=$(ls -d kata*)
$ export KERNEL_MAJOR_VERSION=$(awk '/^VERSION =/{print $3}' $GOPATH/$LINUX_VER/Makefile)
$ export KERNEL_PATHLEVEL=$(awk '/^PATCHLEVEL =/{print $3}' $GOPATH/$LINUX_VER/Makefile)
$ export KERNEL_SUBLEVEL=$(awk '/^SUBLEVEL =/{print $3}' $GOPATH/$LINUX_VER/Makefile)
$ export KERNEL_EXTRAVERSION=$(awk '/^EXTRAVERSION =/{print $3}' $GOPATH/$LINUX_VER/Makefile)
$ export KERNEL_ROOTFS_DIR=${KERNEL_MAJOR_VERSION}.${KERNEL_PATHLEVEL}.${KERNEL_SUBLEVEL}${KERNEL_EXTRAVERSION}
$ cd $QAT_SRC
$ KERNEL_SOURCE_ROOT=$GOPATH/$LINUX_VER ./configure --enable-icp-sriov=guest
$ sudo -E make all -j $(nproc)
$ sudo -E make INSTALL_MOD_PATH=$ROOTFS_DIR qat-driver-install -j $(nproc)
```
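The kernel release string used for the modules directory is assembled from four fields of the kernel `Makefile`. As a sketch, the lookups can be folded into one `awk` pass. The function name `kernel_release_from_makefile` is hypothetical. It reads the third whitespace-separated field, so an empty `EXTRAVERSION =` line contributes nothing instead of a stray `=`.

```bash
# Hypothetical helper: print VERSION.PATCHLEVEL.SUBLEVEL followed by
# EXTRAVERSION for the kernel Makefile given as $1.
kernel_release_from_makefile() {
    local makefile="$1"
    awk '
        /^VERSION =/      { version = $3 }
        /^PATCHLEVEL =/   { patchlevel = $3 }
        /^SUBLEVEL =/     { sublevel = $3 }
        /^EXTRAVERSION =/ { extra = $3 }   # empty when nothing follows "="
        END { printf "%s.%s.%s%s\n", version, patchlevel, sublevel, extra }
    ' "${makefile}"
}
```

For example, `export KERNEL_ROOTFS_DIR=$(kernel_release_from_makefile $GOPATH/$LINUX_VER/Makefile)` would replace the five separate exports.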
The `usdm_drv` module also needs to be copied into the rootfs modules path and
`depmod` should be run.
```bash
$ sudo cp $QAT_SRC/build/usdm_drv.ko $ROOTFS_DIR/lib/modules/${KERNEL_ROOTFS_DIR}/updates/drivers
$ sudo depmod -a -b ${ROOTFS_DIR} ${KERNEL_ROOTFS_DIR}
$ cd ${OSBUILDER}/image-builder
$ script -fec 'sudo -E USE_DOCKER=true ./image_builder.sh ${ROOTFS_DIR}'
```
> **Note: Ignore any errors on modules.builtin and modules.order when running
> `depmod`.**
### Copy Kata rootfs
```bash
$ mkdir -p $KATA_ROOTFS_LOCATION
$ cp ${OSBUILDER}/image-builder/kata-containers.img $KATA_ROOTFS_LOCATION
```
## Verify Intel® QAT works in a container
The following instructions use an OpenSSL Dockerfile that builds the
Intel® QAT engine to allow OpenSSL to offload crypto functions. It is a
convenient way to test that VFIO device passthrough for the Intel® QAT VFs is
working properly with the Kata Containers VM.
### Build OpenSSL Intel® QAT engine container
Use the OpenSSL Intel® QAT [Dockerfile](https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/demo/openssl-qat-engine)
to build a container image with an optimized OpenSSL engine for
Intel® QAT. Using `docker build` with the Kata Containers runtime can sometimes
have issues. Therefore, make sure that `runc` is the default Docker container
runtime.
```bash
$ cd $QAT_SRC
$ curl -O $QAT_DOCKERFILE
$ sudo docker build -t openssl-qat-engine .
```
> **Note: The Intel® QAT driver version in this container might not match the
> Intel® QAT driver compiled and loaded on the host when compiling.**
### Test Intel® QAT with the ctr tool
The `ctr` tool can be used to interact with the containerd daemon. It may be
more convenient to use this tool to verify the kernel and image instead of
setting up a Kubernetes cluster. The correct Kata runtimes need to be added
to the containerd `config.toml`. Below is a sample snippet that can be added
to allow QEMU and Cloud Hypervisor (CLH) to work with `ctr`.
```
[plugins.cri.containerd.runtimes.kata-qemu]
runtime_type = "io.containerd.kata-qemu.v2"
privileged_without_host_devices = true
pod_annotations = ["io.katacontainers.*"]
[plugins.cri.containerd.runtimes.kata-qemu.options]
ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration-qemu.toml"
[plugins.cri.containerd.runtimes.kata-clh]
runtime_type = "io.containerd.kata-clh.v2"
privileged_without_host_devices = true
pod_annotations = ["io.katacontainers.*"]
[plugins.cri.containerd.runtimes.kata-clh.options]
ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration-clh.toml"
```
In addition, containerd expects the shim binary to be in `/usr/local/bin`, so
add these small wrapper scripts there to redirect to the Kata shim with either
the QEMU or the Cloud Hypervisor configuration.
```bash
$ echo '#!/usr/bin/env bash' | sudo tee /usr/local/bin/containerd-shim-kata-qemu-v2
$ echo 'KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-qemu.toml /opt/kata/bin/containerd-shim-kata-v2 $@' | sudo tee -a /usr/local/bin/containerd-shim-kata-qemu-v2
$ sudo chmod +x /usr/local/bin/containerd-shim-kata-qemu-v2
$ echo '#!/usr/bin/env bash' | sudo tee /usr/local/bin/containerd-shim-kata-clh-v2
$ echo 'KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-clh.toml /opt/kata/bin/containerd-shim-kata-v2 $@' | sudo tee -a /usr/local/bin/containerd-shim-kata-clh-v2
$ sudo chmod +x /usr/local/bin/containerd-shim-kata-clh-v2
```
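The two wrappers above differ only in the hypervisor name, so they can also be generated by one function. This is a sketch under assumptions: the function name `write_shim_wrapper` and its destination-directory argument are hypothetical, and the paths mirror the `kata-deploy` layout used in this guide. Unlike the commands above, the generated script quotes `"$@"` and uses `exec`, a deliberate choice so that arguments containing spaces survive and no extra shell process lingers.

```bash
# Hypothetical helper: write containerd-shim-kata-<hypervisor>-v2 into
# <dest_dir>, pointing KATA_CONF_FILE at the matching configuration file.
write_shim_wrapper() {
    local hypervisor="$1" dest_dir="$2"
    local wrapper="${dest_dir}/containerd-shim-kata-${hypervisor}-v2"
    # Unquoted heredoc: ${hypervisor} expands now, \$@ is written literally.
    cat > "${wrapper}" <<EOF
#!/usr/bin/env bash
export KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-${hypervisor}.toml
exec /opt/kata/bin/containerd-shim-kata-v2 "\$@"
EOF
    chmod +x "${wrapper}"
}
```

Run as root, `write_shim_wrapper qemu /usr/local/bin` and `write_shim_wrapper clh /usr/local/bin` would produce both wrappers.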
After the OpenSSL image is built and imported into containerd, an Intel® QAT
virtual function exposed in the step above can be added to the `ctr` command.
Make sure to change the `/dev/vfio` number to one that actually exists on the
host system. When using the `ctr` tool, the `configuration.toml` for Kata needs
to point to the custom Kata kernel and rootfs built above, and the Intel® QAT
modules in the Kata rootfs need to load at boot. The following steps assume that
`kata-deploy` was used to install Kata and QEMU is being tested. If using a
different hypervisor, a different install method for Kata, or a different
Intel® QAT chipset, then the command will need to be modified.
> **Note: The following was tested with
[containerd v1.4.6](https://github.com/containerd/containerd/releases/tag/v1.4.6).**
```bash
$ config_file="/opt/kata/share/defaults/kata-containers/configuration-qemu.toml"
$ sudo sed -i "/kernel =/c kernel = "\"${KATA_KERNEL_LOCATION}/${KATA_KERNEL_NAME}\""" $config_file
$ sudo sed -i "/image =/c image = "\"${KATA_ROOTFS_LOCATION}/kata-containers.img\""" $config_file
$ sudo sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 modules-load=usdm_drv,qat_c62xvf"/g' $config_file
$ sudo docker save -o openssl-qat-engine.tar openssl-qat-engine:latest
$ sudo ctr images import openssl-qat-engine.tar
$ sudo ctr run --runtime io.containerd.run.kata-qemu.v2 --privileged -t --rm --device=/dev/vfio/180 --mount type=bind,src=/dev,dst=/dev,options=rbind:rw --mount type=bind,src=${QAT_CONF_LOCATION}/c6xxvf_dev0.conf,dst=/etc/c6xxvf_dev0.conf,options=rbind:rw docker.io/library/openssl-qat-engine:latest bash
```
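The `/dev/vfio/180` group in the `ctr run` command is host-specific. As a sketch, a group number can be picked from the directory listing instead of being typed by hand. The function name `first_vfio_group` is hypothetical. It simply returns the lowest-numbered group, which is only a QAT VF if nothing else on the host is bound to `VFIO-PCI`.

```bash
# Hypothetical helper: print the lowest numeric entry in the VFIO directory
# (default /dev/vfio), skipping the non-numeric "vfio" control node.
first_vfio_group() {
    local vfio_dir="${1:-/dev/vfio}"
    ls "${vfio_dir}" | grep -E '^[0-9]+$' | sort -n | head -n 1
}
```

The device flag could then be written as `--device=/dev/vfio/$(first_vfio_group)`.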
Below are some commands to run in the container image to verify Intel® QAT is
working:
```sh
root@67561dc2757a/ # cat /proc/modules
qat_c62xvf 16384 - - Live 0xffffffffc00d9000 (OE)
usdm_drv 86016 - - Live 0xffffffffc00e8000 (OE)
intel_qat 249856 - - Live 0xffffffffc009b000 (OE)
root@67561dc2757a/ # adf_ctl restart
Restarting all devices.
Processing /etc/c6xxvf_dev0.conf
root@67561dc2757a/ # adf_ctl status
Checking status of all devices.
There is 1 QAT acceleration device(s) in the system:
qat_dev0 - type: c6xxvf, inst_id: 0, node_id: 0, bsf: 0000:01:01.0, #accel: 1 #engines: 1 state: up
root@67561dc2757a/ # openssl engine -c -t qat-hw
(qat-hw) Reference implementation of QAT crypto engine v0.6.1
[RSA, DSA, DH, AES-128-CBC-HMAC-SHA1, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA1, AES-256-CBC-HMAC-SHA256, TLS1-PRF, HKDF, X25519, X448]
[ available ]
```
### Test Intel® QAT in Kubernetes
Start a Kubernetes cluster with containerd as the CRI. The host should
already be set up with 16 virtual functions of the Intel® QAT card bound to
`VFIO-PCI`. Verify this by looking in `/dev/vfio` for a listing of devices.
You might need to disable Docker before initializing Kubernetes. Be aware
that the OpenSSL container image built above will need to be exported from
Docker and imported into containerd.
If Kata is installed through [`kata-deploy`](../../tools/packaging/kata-deploy/helm-chart/README.md)
there will be multiple `configuration.toml` files associated with different
hypervisors. Rather than add in the custom Kata kernel, Kata rootfs, and
kernel modules to each `configuration.toml` as the default, instead use
[annotations](../how-to/how-to-load-kernel-modules-with-kata.md)
in the Kubernetes YAML file to tell Kata which kernel and rootfs to use. The
easy way to do this is to use `kata-deploy` which will install the Kata binaries
to `/opt` and properly configure the `/etc/containerd/config.toml` with annotation
support. However, the `configuration.toml` needs to enable support for
annotations as well. The following configures both the QEMU and Cloud Hypervisor
`configuration.toml` files that are currently available with Kata Containers
versions 2.0 and higher.
```bash
$ sudo sed -i 's/enable_annotations\s=\s\[\]/enable_annotations = [".*"]/' /opt/kata/share/defaults/kata-containers/configuration-qemu.toml
$ sudo sed -i 's/enable_annotations\s=\s\[\]/enable_annotations = [".*"]/' /opt/kata/share/defaults/kata-containers/configuration-clh.toml
```
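The `sed` expressions above only match an empty `enable_annotations = []` list, so a configuration file that already lists specific annotations is left unchanged. A small sketch for checking the result follows. The function name `annotations_enabled` is hypothetical.

```bash
# Hypothetical helper: succeed only if the file has an uncommented
# enable_annotations line that allows every annotation.
annotations_enabled() {
    local config_file="$1"
    grep -v '^[[:space:]]*#' "${config_file}" \
        | grep -qF 'enable_annotations = [".*"]'
}
```

For example, `annotations_enabled /opt/kata/share/defaults/kata-containers/configuration-qemu.toml || echo "annotations not enabled"` reports a file the substitution did not touch.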
Export the OpenSSL image from Docker and import it into containerd.
```bash
$ sudo docker save -o openssl-qat-engine.tar openssl-qat-engine:latest
$ sudo ctr -n=k8s.io images import openssl-qat-engine.tar
```
The [Intel® QAT Plugin](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/qat_plugin/README.md)
needs to be started so that the virtual functions can be discovered and
used by Kubernetes.
The following YAML file can be used to start a Kata container with Intel® QAT
support. If Kata is installed with `kata-deploy`, then the containerd
`configuration.toml` should have all of the Kata runtime classes already
populated and annotations supported. To use an Intel® QAT virtual function, the
Intel® QAT plugin needs to be started after the VFs are bound to `VFIO-PCI` as
described [above](#expose-and-bind-intel-qat-virtual-functions-to-vfio-pci-every-reboot).
Edit the following to point to the correct Kata kernel and rootfs location
built with Intel® QAT support.
```bash
$ cat << EOF > kata-openssl-qat.yaml
apiVersion: v1
kind: Pod
metadata:
name: kata-openssl-qat
labels:
app: kata-openssl-qat
annotations:
io.katacontainers.config.hypervisor.kernel: "$KATA_KERNEL_LOCATION/$KATA_KERNEL_NAME"
io.katacontainers.config.hypervisor.image: "$KATA_ROOTFS_LOCATION/kata-containers.img"
io.katacontainers.config.hypervisor.kernel_params: "modules-load=usdm_drv,qat_c62xvf"
spec:
runtimeClassName: kata-qemu
containers:
- name: kata-openssl-qat
image: docker.io/library/openssl-qat-engine:latest
imagePullPolicy: IfNotPresent
resources:
limits:
qat.intel.com/generic: 1
cpu: 1
securityContext:
capabilities:
add: ["IPC_LOCK", "SYS_ADMIN"]
volumeMounts:
- mountPath: /etc/c6xxvf_dev0.conf
name: etc-mount
- mountPath: /dev
name: dev-mount
volumes:
- name: dev-mount
hostPath:
path: /dev
- name: etc-mount
hostPath:
path: $QAT_CONF_LOCATION/c6xxvf_dev0.conf
EOF
```
Use `kubectl` to start the pod. Verify that Intel® QAT card acceleration is
working with the Intel® QAT engine.
```bash
$ kubectl apply -f kata-openssl-qat.yaml
```
```sh
$ kubectl exec -it kata-openssl-qat -- adf_ctl restart
Restarting all devices.
Processing /etc/c6xxvf_dev0.conf
$ kubectl exec -it kata-openssl-qat -- adf_ctl status
Checking status of all devices.
There is 1 QAT acceleration device(s) in the system:
qat_dev0 - type: c6xxvf, inst_id: 0, node_id: 0, bsf: 0000:01:01.0, #accel: 1 #engines: 1 state: up
$ kubectl exec -it kata-openssl-qat -- openssl engine -c -t qat-hw
(qat-hw) Reference implementation of QAT crypto engine v0.6.1
[RSA, DSA, DH, AES-128-CBC-HMAC-SHA1, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA1, AES-256-CBC-HMAC-SHA256, TLS1-PRF, HKDF, X25519, X448]
[ available ]
```
### Troubleshooting
* Check that `/dev/vfio` has VFs enabled.
```sh
$ ls /dev/vfio
57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 vfio
```
* Check that the modules load when inside the Kata Container.
```sh
bash-5.0# grep -E "qat|usdm_drv" /proc/modules
qat_c62xvf 16384 - - Live 0x0000000000000000 (O)
usdm_drv 86016 - - Live 0x0000000000000000 (O)
intel_qat 184320 - - Live 0x0000000000000000 (O)
```
* Verify that at least the first `c6xxvf_dev0.conf` file mounts inside the
container image in `/etc`. You will need one configuration file for each VF
passed into the container.
```sh
bash-5.0# ls /etc
c6xxvf_dev0.conf c6xxvf_dev11.conf c6xxvf_dev14.conf c6xxvf_dev3.conf c6xxvf_dev6.conf c6xxvf_dev9.conf resolv.conf
c6xxvf_dev1.conf c6xxvf_dev12.conf c6xxvf_dev15.conf c6xxvf_dev4.conf c6xxvf_dev7.conf hostname
c6xxvf_dev10.conf c6xxvf_dev13.conf c6xxvf_dev2.conf c6xxvf_dev5.conf c6xxvf_dev8.conf hosts
```
* Check `dmesg` inside the container to see if there are any issues with the
Intel® QAT driver.
* If there are issues building the OpenSSL Intel® QAT container image, then
check to make sure that runc is the default runtime for building containers.
```sh
$ cat /etc/systemd/system/docker.service.d/50-runtime.conf
[Service]
Environment="DOCKER_DEFAULT_RUNTIME=--default-runtime runc"
```
## Optional Scripts
### Verify Intel® QAT card counters are incremented
To check the built-in firmware counters, the Intel® QAT driver has to be compiled
and installed on the host; the built-in host driver cannot be relied on. The
counters will increase when the accelerator is actively being used. To verify
Intel® QAT is actively accelerating the containerized application, use the
following instructions to check if any of the counters increment. Make
sure to change the PCI Device ID to match what is in the system.
```bash
$ for i in 0434 0435 37c8 1f18 1f19; do lspci -d 8086:$i; done
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b1\:00.0/fw_counters
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b3\:00.0/fw_counters
$ sudo watch cat /sys/kernel/debug/qat_c6xx_0000\:b5\:00.0/fw_counters
```
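The PCI addresses in the `watch` commands above are specific to one host. As a sketch, the counter files can be discovered with a glob instead. The function name `list_fw_counter_files` is hypothetical, and it assumes the `qat_*` directory naming shown above under `/sys/kernel/debug`.

```bash
# Hypothetical helper: print every fw_counters file found under the QAT
# debugfs directories (default base: /sys/kernel/debug).
list_fw_counter_files() {
    local debug_dir="${1:-/sys/kernel/debug}"
    local f
    for f in "${debug_dir}"/qat_*/fw_counters; do
        # An unmatched glob stays literal, so test before printing.
        [ -e "${f}" ] && echo "${f}"
    done
    return 0
}
```

Run as root, `list_fw_counter_files | xargs -r cat` would print the counters for every QAT device found.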


@@ -1,3 +1,3 @@
[toolchain]
# Keep in sync with versions.yaml
channel = "1.85.1"
channel = "1.89"

src/agent/Cargo.lock generated

@@ -780,9 +780,9 @@ dependencies = [
[[package]]
name = "container-device-interface"
version = "0.1.0"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "653849f0c250f73d9afab4b2a9a6b07adaee1f34c44ffa6f2d2c3f9392002c1a"
checksum = "2605001b0e8214dae8af146a43ccaa965d960403e330f174c21327154530df8b"
dependencies = [
"anyhow",
"clap",
@@ -1207,9 +1207,9 @@ dependencies = [
[[package]]
name = "fancy-regex"
version = "0.14.0"
version = "0.16.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6e24cb5a94bcae1e5408b0effca5cd7172ea3c5755049c5f3af4cd283a165298"
checksum = "998b056554fbe42e03ae0e152895cd1a7e1002aec800fdc6635d20270260c46f"
dependencies = [
"bit-set",
"regex-automata 0.4.9",
@@ -2007,9 +2007,9 @@ dependencies = [
[[package]]
name = "jsonschema"
version = "0.30.0"
version = "0.33.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f1b46a0365a611fbf1d2143104dcf910aada96fafd295bab16c60b802bf6fa1d"
checksum = "d46662859bc5f60a145b75f4632fbadc84e829e45df6c5de74cfc8e05acb96b5"
dependencies = [
"ahash 0.8.12",
"base64 0.22.1",
@@ -3405,9 +3405,9 @@ dependencies = [
[[package]]
name = "referencing"
version = "0.30.0"
version = "0.33.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8eff4fa778b5c2a57e85c5f2fe3a709c52f0e60d23146e2151cbef5893f420e"
checksum = "9e9c261f7ce75418b3beadfb3f0eb1299fe8eb9640deba45ffa2cb783098697d"
dependencies = [
"ahash 0.8.12",
"fluent-uri 0.3.2",
@@ -4305,6 +4305,7 @@ checksum = "8f50febec83f5ee1df3015341d8bd429f2d1cc62bcba7ea2076759d315084683"
name = "test-utils"
version = "0.1.0"
dependencies = [
"libc",
"nix 0.26.4",
]


@@ -21,7 +21,7 @@ fn to_capshashset(cfd_log: RawFd, capabilities: &Option<HashSet<LinuxCapability>
let binding: HashSet<LinuxCapability> = HashSet::new();
let caps = capabilities.as_ref().unwrap_or(&binding);
for cap in caps.iter() {
match Capability::from_str(&format!("CAP_{}", cap)) {
match Capability::from_str(&format!("CAP_{cap}")) {
Err(_) => {
log_child!(cfd_log, "{} is not a cap", &cap.to_string());
continue;


@@ -1097,7 +1097,7 @@ impl Manager {
devices_group_info
);
Self::setup_allowed_all_mode(pod_cg).with_context(|| {
format!("Setup allowed all devices mode for {}", pod_cpath)
format!("Setup allowed all devices mode for {pod_cpath}")
})?;
devices_group_info.allowed_all = true;
}
@@ -1109,11 +1109,11 @@ impl Manager {
if !is_allowded_all {
Self::setup_devcg_whitelist(pod_cg).with_context(|| {
format!("Setup device cgroup whitelist for {}", pod_cpath)
format!("Setup device cgroup whitelist for {pod_cpath}")
})?;
} else {
Self::setup_allowed_all_mode(pod_cg)
.with_context(|| format!("Setup allowed all mode for {}", pod_cpath))?;
.with_context(|| format!("Setup allowed all mode for {pod_cpath}"))?;
devices_group_info.allowed_all = true;
}
@@ -1132,7 +1132,7 @@ impl Manager {
if let Some(devices_group_info) = devices_group_info.as_ref() {
if !devices_group_info.allowed_all {
Self::setup_devcg_whitelist(&cg)
.with_context(|| format!("Setup device cgroup whitelist for {}", cpath))?;
.with_context(|| format!("Setup device cgroup whitelist for {cpath}"))?;
}
}


@@ -57,8 +57,8 @@ fn parse_parent(slice: String) -> Result<String> {
if subslice.is_empty() {
return Err(anyhow!("invalid slice name: {}", slice));
}
slice_path = format!("{}/{}{}{}", slice_path, prefix, subslice, SLICE_SUFFIX);
prefix = format!("{}{}-", prefix, subslice);
slice_path = format!("{slice_path}/{prefix}{subslice}{SLICE_SUFFIX}");
prefix = format!("{prefix}{subslice}-");
}
slice_path.remove(0);
Ok(slice_path)
@@ -68,9 +68,9 @@ fn get_unit_name(prefix: String, name: String) -> String {
if name.ends_with(SLICE_SUFFIX) {
name
} else if prefix.is_empty() {
format!("{}{}", name, SCOPE_SUFFIX)
format!("{name}{SCOPE_SUFFIX}")
} else {
format!("{}-{}{}", prefix, name, SCOPE_SUFFIX)
format!("{prefix}-{name}{SCOPE_SUFFIX}")
}
}


@@ -346,7 +346,7 @@ pub fn init_child() {
Ok(_) => log_child!(cfd_log, "temporary parent process exit successfully"),
Err(e) => {
log_child!(cfd_log, "temporary parent process exit:child exit: {:?}", e);
let _ = write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
let _ = write_sync(cwfd, SYNC_FAILED, format!("{e:?}").as_str());
}
}
}
@@ -544,13 +544,13 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
sched::setns(fd, s).or_else(|e| {
if s == CloneFlags::CLONE_NEWUSER {
if e != Errno::EINVAL {
let _ = write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
let _ = write_sync(cwfd, SYNC_FAILED, format!("{e:?}").as_str());
return Err(e);
}
Ok(())
} else {
let _ = write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
let _ = write_sync(cwfd, SYNC_FAILED, format!("{e:?}").as_str());
Err(e)
}
})?;
@@ -685,7 +685,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let _ = write_sync(
cwfd,
SYNC_FAILED,
format!("setgroups failed: {:?}", e).as_str(),
format!("setgroups failed: {e:?}").as_str(),
);
})?;
}
@@ -808,7 +808,7 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
if init {
let fd = fcntl::open(
format!("/proc/self/fd/{}", fifofd).as_str(),
format!("/proc/self/fd/{fifofd}").as_str(),
OFlag::O_RDONLY | OFlag::O_CLOEXEC,
Mode::from_bits_truncate(0),
)?;
@@ -1171,14 +1171,14 @@ impl BaseContainer for LinuxContainer {
.stderr(child_stderr)
.env(INIT, format!("{}", p.init))
.env(NO_PIVOT, format!("{}", self.config.no_pivot_root))
.env(CRFD_FD, format!("{}", crfd))
.env(CWFD_FD, format!("{}", cwfd))
.env(CLOG_FD, format!("{}", cfd_log))
.env(CRFD_FD, format!("{crfd}"))
.env(CWFD_FD, format!("{cwfd}"))
.env(CLOG_FD, format!("{cfd_log}"))
.env(CONSOLE_SOCKET_FD, console_name)
.env(PIDNS_ENABLED, format!("{}", pidns.enabled));
if p.init {
child = child.env(FIFO_FD, format!("{}", fifofd));
child = child.env(FIFO_FD, format!("{fifofd}"));
}
if pidns.fd.is_some() {
@@ -1588,9 +1588,11 @@ async fn join_namespaces(
cm.apply(p.pid)?;
}
if p.init && res.is_some() {
info!(logger, "set properties to cgroups!");
cm.set(res.unwrap(), false)?;
if p.init {
if let Some(resource) = res {
info!(logger, "set properties to cgroups!");
cm.set(resource, false)?;
}
}
info!(logger, "notify child to continue");
@@ -1687,7 +1689,7 @@ impl LinuxContainer {
return anyhow!(e).context(format!("container {} already exists", id.as_str()));
}
anyhow!(e).context(format!("fail to create container directory {}", root))
anyhow!(e).context(format!("fail to create container directory {root}"))
})?;
unistd::chown(
@@ -1695,7 +1697,7 @@ impl LinuxContainer {
Some(unistd::getuid()),
Some(unistd::getgid()),
)
.context(format!("Cannot change owner of container {} root", id))?;
.context(format!("Cannot change owner of container {id} root"))?;
let spec = config.spec.as_ref().unwrap();
let linux_cgroups_path = spec


@@ -528,7 +528,7 @@ pub fn pivot_rootfs<P: ?Sized + NixPath + std::fmt::Debug>(path: &P) -> Result<(
// Change to the new root so that the pivot_root actually acts on it.
unistd::fchdir(newroot)?;
pivot_root(".", ".").context(format!("failed to pivot_root on {:?}", path))?;
pivot_root(".", ".").context(format!("failed to pivot_root on {path:?}"))?;
// Currently our "." is oldroot (according to the current kernel code).
// However, purely for safety, we will fchdir(oldroot) since there isn't
@@ -752,15 +752,6 @@ fn parse_mount(m: &Mount) -> (MsFlags, MsFlags, String) {
(flags, pgflags, data.join(","))
}
// This function constructs a canonicalized path by combining the `rootfs` and `unsafe_path` elements.
// The resulting path is guaranteed to be ("below" / "in a directory under") the `rootfs` directory.
//
// Parameters:
//
// - `rootfs` is the absolute path to the root of the containers root filesystem directory.
// - `unsafe_path` is path inside a container. It is unsafe since it may try to "escape" from the containers
// rootfs by using one or more "../" path elements or is its a symlink to path.
fn mount_from(
cfd_log: RawFd,
m: &Mount,
@@ -929,7 +920,7 @@ fn create_devices(devices: &[LinuxDevice], bind: bool) -> Result<()> {
for dev in DEFAULT_DEVICES.iter() {
let dev_path = dev.path().display().to_string();
let path = Path::new(&dev_path[1..]);
op(dev, path).context(format!("Creating container device {:?}", dev))?;
op(dev, path).context(format!("Creating container device {dev:?}"))?;
}
for dev in devices {
let dev_path = &dev.path();
@@ -941,9 +932,9 @@ fn create_devices(devices: &[LinuxDevice], bind: bool) -> Result<()> {
anyhow!(msg)
})?;
if let Some(dir) = path.parent() {
fs::create_dir_all(dir).context(format!("Creating container device {:?}", dev))?;
fs::create_dir_all(dir).context(format!("Creating container device {dev:?}"))?;
}
op(dev, path).context(format!("Creating container device {:?}", dev))?;
op(dev, path).context(format!("Creating container device {dev:?}"))?;
}
stat::umask(old);
Ok(())


@@ -18,10 +18,10 @@ pub fn is_enabled() -> Result<bool> {
pub fn add_mount_label(data: &mut String, label: &str) {
if data.is_empty() {
let context = format!("context=\"{}\"", label);
let context = format!("context=\"{label}\"");
data.push_str(&context);
} else {
let context = format!(",context=\"{}\"", label);
let context = format!(",context=\"{label}\"");
data.push_str(&context);
}
}


@@ -68,7 +68,7 @@ mod tests {
#[test]
fn test_from_str() {
let device = Address::from_str("a.1").unwrap();
assert_eq!(format!("{}", device), "0a.0001");
assert_eq!(format!("{device}"), "0a.0001");
assert!(Address::from_str("").is_err());
assert!(Address::from_str(".").is_err());


@@ -102,10 +102,10 @@ mod tests {
fn test_new_device() {
// Valid devices
let device = Device::new(0, 0).unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
assert_eq!(format!("{device}"), "0.0.0000");
let device = Device::new(3, 0xffff).unwrap();
assert_eq!(format!("{}", device), "0.3.ffff");
assert_eq!(format!("{device}"), "0.3.ffff");
// Invalid device
let device = Device::new(4, 0);
@@ -116,13 +116,13 @@ mod tests {
fn test_device_from_str() {
// Valid devices
let device = Device::from_str("0.0.0").unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
assert_eq!(format!("{device}"), "0.0.0000");
let device = Device::from_str("0.0.0000").unwrap();
assert_eq!(format!("{}", device), "0.0.0000");
assert_eq!(format!("{device}"), "0.0.0000");
let device = Device::from_str("0.3.ffff").unwrap();
assert_eq!(format!("{}", device), "0.3.ffff");
assert_eq!(format!("{device}"), "0.3.ffff");
// Invalid devices
let device = Device::from_str("0.0");


@@ -110,7 +110,7 @@ impl CDHClient {
pub async fn get_resource(&self, resource_path: &str) -> Result<Vec<u8>> {
let req = GetResourceRequest {
ResourcePath: format!("kbs://{}", resource_path),
ResourcePath: format!("kbs://{resource_path}"),
..Default::default()
};
let res = self


@@ -260,7 +260,7 @@ impl Default for AgentConfig {
debug_console_vport: 0,
log_vport: 0,
container_pipe_size: DEFAULT_CONTAINER_PIPE_SIZE,
server_addr: format!("{}:{}", VSOCK_ADDR, DEFAULT_AGENT_VSOCK_PORT),
server_addr: format!("{VSOCK_ADDR}:{DEFAULT_AGENT_VSOCK_PORT}"),
passfd_listener_port: 0,
cgroup_no_v1: String::from(""),
unified_cgroup_hierarchy: false,
@@ -269,7 +269,7 @@ impl Default for AgentConfig {
no_proxy: String::from(""),
guest_components_rest_api: GuestComponentsFeatures::default(),
guest_components_procs: GuestComponentsProcs::default(),
secure_storage_integrity: false,
secure_storage_integrity: true,
#[cfg(feature = "agent-policy")]
policy_file: String::from(""),
mem_agent: None,
@@ -417,7 +417,7 @@ impl AgentConfig {
// generate our config from it.
// The agent will fail to start if the configuration file is not present,
// or if it can't be parsed properly.
if param.starts_with(format!("{}=", CONFIG_FILE).as_str()) {
if param.starts_with(format!("{CONFIG_FILE}=").as_str()) {
let config_file = get_string_value(param)?;
return AgentConfig::from_config_file(&config_file)
.context("AgentConfig from kernel cmdline");
@@ -651,7 +651,7 @@ impl AgentConfig {
#[instrument]
pub fn from_config_file(file: &str) -> Result<AgentConfig> {
let config = fs::read_to_string(file)
.with_context(|| format!("Failed to read config file {}", file))?;
.with_context(|| format!("Failed to read config file {file}"))?;
AgentConfig::from_str(&config)
}
@@ -668,7 +668,7 @@ impl AgentConfig {
}
if let Ok(value) = env::var(TRACING_ENV_VAR) {
let name_value = format!("{}={}", TRACING_ENV_VAR, value);
let name_value = format!("{TRACING_ENV_VAR}={value}");
self.tracing = get_bool_value(&name_value).unwrap_or(false);
}
@@ -911,7 +911,7 @@ mod tests {
no_proxy: "",
guest_components_rest_api: GuestComponentsFeatures::default(),
guest_components_procs: GuestComponentsProcs::default(),
secure_storage_integrity: false,
secure_storage_integrity: true,
#[cfg(feature = "agent-policy")]
policy_file: "",
mem_agent: None,
@@ -1364,7 +1364,7 @@ mod tests {
},
TestData {
contents: "",
secure_storage_integrity: false,
secure_storage_integrity: true,
..Default::default()
},
TestData {
@@ -1442,7 +1442,7 @@ mod tests {
// Now, test various combinations of file contents and environment
// variables.
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let file_path = dir.path().join("cmdline");
@@ -1470,40 +1470,36 @@ mod tests {
let config =
AgentConfig::from_cmdline(filename, vec![]).expect("Failed to parse command line");
assert_eq!(d.debug_console, config.debug_console, "{}", msg);
assert_eq!(d.dev_mode, config.dev_mode, "{}", msg);
assert_eq!(d.cgroup_no_v1, config.cgroup_no_v1, "{}", msg);
assert_eq!(d.debug_console, config.debug_console, "{msg}");
assert_eq!(d.dev_mode, config.dev_mode, "{msg}");
assert_eq!(d.cgroup_no_v1, config.cgroup_no_v1, "{msg}");
assert_eq!(
d.unified_cgroup_hierarchy, config.unified_cgroup_hierarchy,
"{}",
msg
"{msg}"
);
assert_eq!(d.log_level, config.log_level, "{}", msg);
assert_eq!(d.hotplug_timeout, config.hotplug_timeout, "{}", msg);
assert_eq!(d.container_pipe_size, config.container_pipe_size, "{}", msg);
assert_eq!(d.server_addr, config.server_addr, "{}", msg);
assert_eq!(d.tracing, config.tracing, "{}", msg);
assert_eq!(d.https_proxy, config.https_proxy, "{}", msg);
assert_eq!(d.no_proxy, config.no_proxy, "{}", msg);
assert_eq!(d.log_level, config.log_level, "{msg}");
assert_eq!(d.hotplug_timeout, config.hotplug_timeout, "{msg}");
assert_eq!(d.container_pipe_size, config.container_pipe_size, "{msg}");
assert_eq!(d.server_addr, config.server_addr, "{msg}");
assert_eq!(d.tracing, config.tracing, "{msg}");
assert_eq!(d.https_proxy, config.https_proxy, "{msg}");
assert_eq!(d.no_proxy, config.no_proxy, "{msg}");
assert_eq!(
d.guest_components_rest_api, config.guest_components_rest_api,
"{}",
msg
"{msg}"
);
assert_eq!(
d.guest_components_procs, config.guest_components_procs,
"{}",
msg
"{msg}"
);
assert_eq!(
d.secure_storage_integrity, config.secure_storage_integrity,
"{}",
msg
"{msg}"
);
#[cfg(feature = "agent-policy")]
assert_eq!(d.policy_file, config.policy_file, "{}", msg);
assert_eq!(d.policy_file, config.policy_file, "{msg}");
assert_eq!(d.mem_agent, config.mem_agent, "{}", msg);
assert_eq!(d.mem_agent, config.mem_agent, "{msg}");
for v in vars_to_unset {
env::remove_var(v);
@@ -1568,7 +1564,7 @@ mod tests {
#[case("panic", Ok(slog::Level::Critical))]
fn test_logrus_to_slog_level(#[case] input: &str, #[case] expected: Result<slog::Level>) {
let result = logrus_to_slog_level(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1593,7 +1589,7 @@ mod tests {
#[case("agent.log=panic", Ok(slog::Level::Critical))]
fn test_get_log_level(#[case] input: &str, #[case] expected: Result<slog::Level>) {
let result = get_log_level(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1636,7 +1632,7 @@ Caused by:
#[case("agent.cdi_timeout=320", Ok(time::Duration::from_secs(320)))]
fn test_timeout(#[case] param: &str, #[case] expected: Result<time::Duration>) {
let result = get_timeout(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1676,7 +1672,7 @@ Caused by:
)))]
fn test_get_container_pipe_size(#[case] param: &str, #[case] expected: Result<i32>) {
let result = get_container_pipe_size(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1697,7 +1693,7 @@ Caused by:
#[case("x= = ", Ok(" = ".into()))]
fn test_get_string_value(#[case] param: &str, #[case] expected: Result<String>) {
let result = get_string_value(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1716,7 +1712,7 @@ Caused by:
#[case] expected: Result<GuestComponentsFeatures>,
) {
let result = get_guest_components_features_value(input);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}
@@ -1739,7 +1735,7 @@ Caused by:
#[case] expected: Result<GuestComponentsProcs>,
) {
let result = get_guest_components_procs_value(param);
let msg = format!("expected: {:?}, result: {:?}", expected, result);
let msg = format!("expected: {expected:?}, result: {result:?}");
assert_result!(expected, result, msg);
}


@@ -16,7 +16,10 @@ use crate::pci;
use crate::sandbox::Sandbox;
use crate::uevent::{wait_for_uevent, Uevent, UeventMatcher};
use anyhow::{anyhow, Context, Result};
use kata_types::device::{DRIVER_BLK_CCW_TYPE, DRIVER_BLK_MMIO_TYPE, DRIVER_BLK_PCI_TYPE};
#[cfg(target_arch = "s390x")]
use kata_types::device::DRIVER_BLK_CCW_TYPE;
use kata_types::device::{DRIVER_BLK_MMIO_TYPE, DRIVER_BLK_PCI_TYPE};
use protocols::agent::Device;
use regex::Regex;
use std::path::Path;
@@ -28,6 +31,7 @@ use tracing::instrument;
#[derive(Debug)]
pub struct VirtioBlkPciDeviceHandler {}
#[cfg(target_arch = "s390x")]
#[derive(Debug)]
pub struct VirtioBlkCcwDeviceHandler {}
@@ -52,6 +56,7 @@ impl DeviceHandler for VirtioBlkPciDeviceHandler {
}
}
#[cfg(target_arch = "s390x")]
#[async_trait::async_trait]
impl DeviceHandler for VirtioBlkCcwDeviceHandler {
#[instrument]
@@ -164,7 +169,7 @@ pub struct VirtioBlkPciMatcher {
impl VirtioBlkPciMatcher {
pub fn new(relpath: &str) -> VirtioBlkPciMatcher {
let root_bus = create_pci_root_bus_path();
let re = format!(r"^{}{}/virtio[0-9]+/block/", root_bus, relpath);
let re = format!(r"^{root_bus}{relpath}/virtio[0-9]+/block/");
VirtioBlkPciMatcher {
rex: Regex::new(&re).expect("BUG: failed to compile VirtioBlkPciMatcher regex"),
@@ -186,7 +191,7 @@ pub struct VirtioBlkMmioMatcher {
impl VirtioBlkMmioMatcher {
pub fn new(devname: &str) -> VirtioBlkMmioMatcher {
VirtioBlkMmioMatcher {
suffix: format!(r"/block/{}", devname),
suffix: format!(r"/block/{devname}"),
}
}
}
@@ -206,10 +211,8 @@ pub struct VirtioBlkCCWMatcher {
#[cfg(target_arch = "s390x")]
impl VirtioBlkCCWMatcher {
pub fn new(root_bus_path: &str, device: &ccw::Device) -> Self {
let re = format!(
r"^{}/0\.[0-3]\.[0-9a-f]{{1,4}}/{}/virtio[0-9]+/block/",
root_bus_path, device
);
let re =
format!(r"^{root_bus_path}/0\.[0-3]\.[0-9a-f]{{1,4}}/{device}/virtio[0-9]+/block/");
VirtioBlkCCWMatcher {
rex: Regex::new(&re).expect("BUG: failed to compile VirtioBlkCCWMatcher regex"),
}
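
Review note: the CCW matcher pattern mixes inline captures with a regex repetition count, so the doubled braces `{{1,4}}` must stay doubled after the rewrite. A minimal sketch of the resulting pattern string follows, assuming plain `&str` arguments for illustration (the diff passes a `ccw::Device`); `ccw_block_pattern` is a hypothetical helper name.

```rust
/// Builds the pattern string the same way the added line does.
/// `{{` and `}}` are format-string escapes for literal braces, so the
/// regex quantifier `{1,4}` survives while `{root_bus_path}` and
/// `{device}` are substituted.
fn ccw_block_pattern(root_bus_path: &str, device: &str) -> String {
    format!(r"^{root_bus_path}/0\.[0-3]\.[0-9a-f]{{1,4}}/{device}/virtio[0-9]+/block/")
}
```

Note that the device string is interpolated verbatim, so any `.` in it acts as a regex wildcard in both the old and the new spelling.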
@@ -238,12 +241,12 @@ mod tests {
uev_a.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev_a.subsystem = BLOCK.to_string();
uev_a.devname = devname.to_string();
uev_a.devpath = format!("{}{}/virtio4/block/{}", root_bus, relpath_a, devname);
uev_a.devpath = format!("{root_bus}{relpath_a}/virtio4/block/{devname}");
let matcher_a = VirtioBlkPciMatcher::new(relpath_a);
let mut uev_b = uev_a.clone();
let relpath_b = "/0000:00:0a.0/0000:00:0b.0";
uev_b.devpath = format!("{}{}/virtio0/block/{}", root_bus, relpath_b, devname);
uev_b.devpath = format!("{root_bus}{relpath_b}/virtio0/block/{devname}");
let matcher_b = VirtioBlkPciMatcher::new(relpath_b);
assert!(matcher_a.is_match(&uev_a));
@@ -264,10 +267,7 @@ mod tests {
uev.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev.subsystem = subsystem.to_string();
uev.devname = devname.to_string();
uev.devpath = format!(
"{}/0.0.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
uev.devpath = format!("{root_bus}/0.0.0001/{relpath}/virtio1/{subsystem}/{devname}");
// Valid path
let device = ccw::Device::from_str(relpath).unwrap();
@@ -275,40 +275,25 @@ mod tests {
assert!(matcher.is_match(&uev));
// Invalid paths
uev.devpath = format!(
"{}/0.0.0001/0.0.0003/virtio1/{}/{}",
root_bus, subsystem, devname
);
uev.devpath = format!("{root_bus}/0.0.0001/0.0.0003/virtio1/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
uev.devpath = format!("0.0.0001/{}/virtio1/{}/{}", relpath, subsystem, devname);
uev.devpath = format!("0.0.0001/{relpath}/virtio1/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.0.0001/{}/virtio/{}/{}",
root_bus, relpath, subsystem, devname
);
uev.devpath = format!("{root_bus}/0.0.0001/{relpath}/virtio/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
uev.devpath = format!("{}/0.0.0001/{}/virtio1", root_bus, relpath);
uev.devpath = format!("{root_bus}/0.0.0001/{relpath}/virtio1");
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/1.0.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
uev.devpath = format!("{root_bus}/1.0.0001/{relpath}/virtio1/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.4.0001/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
uev.devpath = format!("{root_bus}/0.4.0001/{relpath}/virtio1/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
uev.devpath = format!(
"{}/0.0.10000/{}/virtio1/{}/{}",
root_bus, relpath, subsystem, devname
);
uev.devpath = format!("{root_bus}/0.0.10000/{relpath}/virtio1/{subsystem}/{devname}");
assert!(!matcher.is_match(&uev));
}
@@ -321,17 +306,13 @@ mod tests {
uev_a.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev_a.subsystem = BLOCK.to_string();
uev_a.devname = devname_a.to_string();
uev_a.devpath = format!(
"/sys/devices/virtio-mmio-cmdline/virtio-mmio.0/virtio0/block/{}",
devname_a
);
uev_a.devpath =
format!("/sys/devices/virtio-mmio-cmdline/virtio-mmio.0/virtio0/block/{devname_a}");
let matcher_a = VirtioBlkMmioMatcher::new(devname_a);
let mut uev_b = uev_a.clone();
uev_b.devpath = format!(
"/sys/devices/virtio-mmio-cmdline/virtio-mmio.4/virtio4/block/{}",
devname_b
);
uev_b.devpath =
format!("/sys/devices/virtio-mmio-cmdline/virtio-mmio.4/virtio4/block/{devname_b}");
let matcher_b = VirtioBlkMmioMatcher::new(devname_b);
assert!(matcher_a.is_match(&uev_a));
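
Review note: the hunks for this file also move the CCW import, struct and `DeviceHandler` impl behind `#[cfg(target_arch = "s390x")]`, presumably so that builds for other architectures do not compile unused CCW items. The sketch below shows the general gating pattern only. The constant names and values are placeholders invented for illustration, not the ones defined in `kata_types`, and `supported_drivers` is a hypothetical function.

```rust
// Placeholder driver names for illustration only.
const DEMO_BLK_PCI: &str = "demo-blk-pci";
const DEMO_BLK_MMIO: &str = "demo-blk-mmio";

// This item exists only when compiling for s390x.
#[cfg(target_arch = "s390x")]
const DEMO_BLK_CCW: &str = "demo-blk-ccw";

/// Returns the driver names available on the current target.
fn supported_drivers() -> Vec<&'static str> {
    // `mut` is unused on targets where the gated push is compiled out.
    #[allow(unused_mut)]
    let mut drivers = vec![DEMO_BLK_PCI, DEMO_BLK_MMIO];

    // The statement is removed entirely on non-s390x targets, so it may
    // reference the gated constant without breaking other builds.
    #[cfg(target_arch = "s390x")]
    drivers.push(DEMO_BLK_CCW);

    drivers
}
```

Every use of a gated item has to sit behind the same `cfg`, which is why the diff annotates the import, the struct and the impl together.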


@@ -425,7 +425,7 @@ pub fn update_env_pci(
let mut guest_addrs = Vec::<String>::new();
for host_addr_str in val.split(',') {
let host_addr = pci::Address::from_str(host_addr_str)
.with_context(|| format!("Can't parse {} environment variable", name))?;
.with_context(|| format!("Can't parse {name} environment variable"))?;
let host_guest = pcimap
.get(cid)
.ok_or_else(|| anyhow!("No PCI mapping found for container {}", cid))?;
@@ -433,11 +433,11 @@ pub fn update_env_pci(
.get(&host_addr)
.ok_or_else(|| anyhow!("Unable to translate host PCI address {}", host_addr))?;
guest_addrs.push(format!("{}", guest_addr));
addr_map.insert(host_addr_str.to_string(), format!("{}", guest_addr));
guest_addrs.push(format!("{guest_addr}"));
addr_map.insert(host_addr_str.to_string(), format!("{guest_addr}"));
}
pci_dev_map.insert(format!("{}_INFO", name), addr_map);
pci_dev_map.insert(format!("{name}_INFO"), addr_map);
envvar.replace_range(eqpos + 1.., guest_addrs.join(",").as_str());
}
@@ -526,7 +526,7 @@ fn update_spec_devices(
"Missing devices in OCI spec: {:?}",
updates
.keys()
.map(|d| format!("{:?}", d))
.map(|d| format!("{d:?}"))
.collect::<Vec<_>>()
.join(" ")
));
@@ -572,7 +572,7 @@ pub fn pcipath_to_sysfs(root_bus_sysfs: &str, pcipath: &pci::Path) -> Result<Str
for i in 0..pcipath.len() {
let bdf = format!("{}:{}", bus, pcipath[i]);
relpath = format!("{}/{}", relpath, bdf);
relpath = format!("{relpath}/{bdf}");
if i == pcipath.len() - 1 {
// Final device need not be a bridge
@@ -580,7 +580,7 @@ pub fn pcipath_to_sysfs(root_bus_sysfs: &str, pcipath: &pci::Path) -> Result<Str
}
// Find out the bus exposed by bridge
let bridgebuspath = format!("{}{}/pci_bus", root_bus_sysfs, relpath);
let bridgebuspath = format!("{root_bus_sysfs}{relpath}/pci_bus");
let mut files: Vec<_> = fs::read_dir(&bridgebuspath)?.collect();
match files.pop() {
@@ -1120,7 +1120,7 @@ mod tests {
// Create mock sysfs files to indicate that 0000:00:02.0 is a bridge to bus 01
let bridge2bus = "0000:01";
let bus2path = format!("{}/pci_bus/{}", bridge2path, bridge2bus);
let bus2path = format!("{bridge2path}/pci_bus/{bridge2bus}");
fs::create_dir_all(bus2path).unwrap();
@@ -1134,9 +1134,9 @@ mod tests {
assert!(relpath.is_err());
// Create mock sysfs files for a bridge at 0000:01:03.0 to bus 02
let bridge3path = format!("{}/0000:01:03.0", bridge2path);
let bridge3path = format!("{bridge2path}/0000:01:03.0");
let bridge3bus = "0000:02";
let bus3path = format!("{}/pci_bus/{}", bridge3path, bridge3bus);
let bus3path = format!("{bridge3path}/pci_bus/{bridge3bus}");
fs::create_dir_all(bus3path).unwrap();
@@ -1169,7 +1169,7 @@ mod tests {
let devname = "vda";
let root_bus = create_pci_root_bus_path();
let relpath = "/0000:00:0a.0/0000:03:0b.0";
let devpath = format!("{}{}/virtio4/block/{}", root_bus, relpath, devname);
let devpath = format!("{root_bus}{relpath}/virtio4/block/{devname}");
let mut uev = crate::uevent::Uevent::default();
uev.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
@@ -1272,7 +1272,7 @@ mod tests {
cdi_timeout,
)
.await;
println!("modfied spec {:?}", spec);
println!("modfied spec {spec:?}");
assert!(res.is_ok(), "{}", res.err().unwrap());
let linux = spec.linux().as_ref().unwrap();


@@ -69,7 +69,7 @@ impl NetPciMatcher {
let root_bus = create_pci_root_bus_path();
NetPciMatcher {
devpath: format!("{}{}", root_bus, relpath),
devpath: format!("{root_bus}{relpath}"),
}
}
}
@@ -106,10 +106,7 @@ struct NetCcwMatcher {
#[cfg(target_arch = "s390x")]
impl NetCcwMatcher {
pub fn new(root_bus_path: &str, device: &ccw::Device) -> Self {
let re = format!(
r"{}/0\.[0-3]\.[0-9a-f]{{1,4}}/{}/virtio[0-9]+/net/",
root_bus_path, device
);
let re = format!(r"{root_bus_path}/0\.[0-3]\.[0-9a-f]{{1,4}}/{device}/virtio[0-9]+/net/");
NetCcwMatcher {
re: Regex::new(&re).expect("BUG: failed to compile NetCCWMatcher regex"),
}
@@ -139,7 +136,7 @@ mod tests {
let mut uev_a = crate::uevent::Uevent::default();
uev_a.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev_a.devpath = format!("{}{}", root_bus, relpath_a);
uev_a.devpath = format!("{root_bus}{relpath_a}");
uev_a.subsystem = String::from("net");
uev_a.interface = String::from("eth0");
let matcher_a = NetPciMatcher::new(relpath_a);
@@ -147,7 +144,7 @@ mod tests {
let relpath_b = "/0000:00:02.0/0000:01:02.0";
let mut uev_b = uev_a.clone();
uev_b.devpath = format!("{}{}", root_bus, relpath_b);
uev_b.devpath = format!("{root_bus}{relpath_b}");
let matcher_b = NetPciMatcher::new(relpath_b);
assert!(matcher_a.is_match(&uev_a));
@@ -158,7 +155,7 @@ mod tests {
let relpath_c = "/0000:00:02.0/0000:01:03.0";
let net_substr = "/net/eth0";
let mut uev_c = uev_a.clone();
uev_c.devpath = format!("{}{}{}", root_bus, relpath_c, net_substr);
uev_c.devpath = format!("{root_bus}{relpath_c}{net_substr}");
let matcher_c = NetPciMatcher::new(relpath_c);
assert!(matcher_c.is_match(&uev_c));


@@ -67,7 +67,7 @@ pub struct PmemBlockMatcher {
impl PmemBlockMatcher {
pub fn new(devname: &str) -> PmemBlockMatcher {
let suffix = format!(r"/block/{}", devname);
let suffix = format!(r"/block/{devname}");
PmemBlockMatcher { suffix }
}


@@ -58,7 +58,7 @@ pub struct ScsiBlockMatcher {
impl ScsiBlockMatcher {
pub fn new(scsi_addr: &str) -> ScsiBlockMatcher {
let search = format!(r"/0:0:{}/block/", scsi_addr);
let search = format!(r"/0:0:{scsi_addr}/block/");
ScsiBlockMatcher { search }
}
@@ -118,18 +118,14 @@ mod tests {
uev_a.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev_a.subsystem = BLOCK.to_string();
uev_a.devname = devname.to_string();
uev_a.devpath = format!(
"{}/0000:00:00.0/virtio0/host0/target0:0:0/0:0:{}/block/sda",
root_bus, addr_a
);
uev_a.devpath =
format!("{root_bus}/0000:00:00.0/virtio0/host0/target0:0:0/0:0:{addr_a}/block/sda");
let matcher_a = ScsiBlockMatcher::new(addr_a);
let mut uev_b = uev_a.clone();
let addr_b = "2:0";
uev_b.devpath = format!(
"{}/0000:00:00.0/virtio0/host0/target0:0:2/0:0:{}/block/sdb",
root_bus, addr_b
);
uev_b.devpath =
format!("{root_bus}/0000:00:00.0/virtio0/host0/target0:0:2/0:0:{addr_b}/block/sdb");
let matcher_b = ScsiBlockMatcher::new(addr_b);
assert!(matcher_a.is_match(&uev_a));


@@ -170,7 +170,7 @@ pub struct VfioMatcher {
impl VfioMatcher {
pub fn new(grp: IommuGroup) -> VfioMatcher {
VfioMatcher {
syspath: format!("/devices/virtual/vfio/{}", grp),
syspath: format!("/devices/virtual/vfio/{grp}"),
}
}
}
@@ -215,7 +215,7 @@ impl PciMatcher {
pub fn new(relpath: &str) -> Result<PciMatcher> {
let root_bus = create_pci_root_bus_path();
Ok(PciMatcher {
devpath: format!("{}{}", root_bus, relpath),
devpath: format!("{root_bus}{relpath}"),
})
}
}
@@ -425,12 +425,12 @@ mod tests {
let mut uev_a = crate::uevent::Uevent::default();
uev_a.action = crate::linux_abi::U_EVENT_ACTION_ADD.to_string();
uev_a.devname = format!("vfio/{}", grpa);
uev_a.devpath = format!("/devices/virtual/vfio/{}", grpa);
uev_a.devname = format!("vfio/{grpa}");
uev_a.devpath = format!("/devices/virtual/vfio/{grpa}");
let matcher_a = VfioMatcher::new(grpa);
let mut uev_b = uev_a.clone();
uev_b.devpath = format!("/devices/virtual/vfio/{}", grpb);
uev_b.devpath = format!("/devices/virtual/vfio/{grpb}");
let matcher_b = VfioMatcher::new(grpb);
assert!(matcher_a.is_match(&uev_a));
@@ -531,12 +531,12 @@ mod tests {
async fn test_vfio_ap_matcher() {
let subsystem = "ap";
let card = "0a";
let relpath = format!("{}.0001", card);
let relpath = format!("{card}.0001");
let mut uev = Uevent::default();
uev.action = U_EVENT_ACTION_ADD.to_string();
uev.subsystem = subsystem.to_string();
uev.devpath = format!("{}/card{}/{}", AP_ROOT_BUS_PATH, card, relpath);
uev.devpath = format!("{AP_ROOT_BUS_PATH}/card{card}/{relpath}");
let ap_address = ap::Address::from_str(&relpath).unwrap();
let matcher = ApMatcher::new(ap_address);
@@ -548,7 +548,7 @@ mod tests {
assert!(!matcher.is_match(&uev_remove));
let mut uev_other_device = uev.clone();
uev_other_device.devpath = format!("{}/card{}/{}.0002", AP_ROOT_BUS_PATH, card, card);
uev_other_device.devpath = format!("{AP_ROOT_BUS_PATH}/card{card}/{card}.0002");
assert!(!matcher.is_match(&uev_other_device));
}
}


@@ -301,12 +301,12 @@ async fn real_main(init_mode: bool) -> std::result::Result<(), Box<dyn std::erro
tracer::end_tracing();
}
eprintln!("{} shutdown complete", NAME);
eprintln!("{NAME} shutdown complete");
let mut wait_errors: Vec<tokio::task::JoinError> = vec![];
for result in results {
if let Err(e) = result {
eprintln!("wait task error: {:#?}", e);
eprintln!("wait task error: {e:#?}");
wait_errors.push(e);
}
}
@@ -746,7 +746,7 @@ mod tests {
skip_if_root!();
}
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let (rfd, wfd) = unistd::pipe2(OFlag::O_CLOEXEC).unwrap();
defer!({
// XXX: Never try to close rfd, because it will be closed by PipeStream in
@@ -759,7 +759,7 @@ mod tests {
shutdown_tx.send(true).unwrap();
let result = create_logger_task(rfd, d.vsock_port, shutdown_rx).await;
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
assert_result!(d.result, result, msg);
}
}


@@ -239,7 +239,7 @@ fn update_guest_metrics() {
Ok(kernel_stats) => {
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, "total", &kernel_stats.total);
for (i, cpu_time) in kernel_stats.cpu_time.iter().enumerate() {
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, format!("{}", i).as_str(), cpu_time);
set_gauge_vec_cpu_time(&GUEST_CPU_TIME, format!("{i}").as_str(), cpu_time);
}
}
}


@@ -177,7 +177,7 @@ pub fn get_mount_fs_type_from_file(mount_file: &str, mount_point: &str) -> Resul
let content = fs::read_to_string(mount_file)
.map_err(|e| anyhow!("read mount file {}: {}", mount_file, e))?;
let re = Regex::new(format!("device .+ mounted on {} with fstype (.+)", mount_point).as_str())?;
let re = Regex::new(format!("device .+ mounted on {mount_point} with fstype (.+)").as_str())?;
// Read the file line by line using the lines() iterator from std::io::BufRead.
for line in content.lines() {
@@ -355,8 +355,7 @@ mod tests {
let flags = MsFlags::MS_RDONLY;
let options = "mode=755";
println!(
"testing if already mounted baremount({:?} {:?} {:?})",
source, destination, fs_type
"testing if already mounted baremount({source:?} {destination:?} {fs_type:?})"
);
assert!(baremount(source, destination, fs_type, flags, options, &logger).is_ok());
}
@@ -461,7 +460,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
skip_loop_by_user!(msg, d.test_user);
@@ -505,7 +504,7 @@ mod tests {
let result = baremount(src, dest, d.fs_type, d.flags, d.options, &logger);
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
@@ -517,7 +516,7 @@ mod tests {
}
let err = result.unwrap_err();
let error_msg = format!("{}", err);
let error_msg = format!("{err}");
assert!(
error_msg.contains(d.error_contains),
@@ -619,11 +618,11 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let result = remove_mounts(&d.mounts);
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
@@ -699,15 +698,15 @@ mod tests {
.iter()
.enumerate()
{
let msg = format!("missing mount file test[{}] with mountpoint: {}", i, mp);
let msg = format!("missing mount file test[{i}] with mountpoint: {mp}");
let result = get_mount_fs_type_from_file("", mp);
let err = result.unwrap_err();
let msg = format!("{}: error: {}", msg, err);
let msg = format!("{msg}: error: {err}");
assert!(
format!("{}", err).contains("No such file or directory"),
format!("{err}").contains("No such file or directory"),
"{}",
msg
);
@@ -715,7 +714,7 @@ mod tests {
// Now, test various combinations of file contents
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let file_path = dir.path().join("mount_stats");
@@ -732,7 +731,7 @@ mod tests {
let result = get_mount_fs_type_from_file(filename, d.mount_point);
// add more details if an assertion fails
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if d.error_contains.is_empty() {
let fs_type = result.unwrap();
@@ -875,7 +874,7 @@ mod tests {
);
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let file_path = dir.path().join("cgroups");
let filename = file_path
@@ -889,7 +888,7 @@ mod tests {
.unwrap_or_else(|_| panic!("{}: failed to write file contents", msg));
let result = get_cgroup_mounts(&logger, filename, false);
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if !d.error_contains.is_empty() {
assert!(result.is_err(), "{}", msg);
@@ -974,7 +973,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
skip_loop_by_user!(msg, d.test_user);
let drain = slog::Discard;
@@ -1011,7 +1010,7 @@ mod tests {
nix::mount::umount(&dest).unwrap();
}
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
} else {
@@ -1062,14 +1061,14 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let result = parse_mount_options(&d.options_vec).unwrap();
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
let expected_result = (d.result.0, d.result.1.to_owned());
assert_eq!(expected_result, result, "{}", msg);
assert_eq!(expected_result, result, "{msg}");
}
}
}


@@ -41,9 +41,9 @@ pub enum LinkFilter<'a> {
impl fmt::Display for LinkFilter<'_> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
LinkFilter::Name(name) => write!(f, "Name: {}", name),
LinkFilter::Index(idx) => write!(f, "Index: {}", idx),
LinkFilter::Address(addr) => write!(f, "Address: {}", addr),
LinkFilter::Name(name) => write!(f, "Name: {name}"),
LinkFilter::Index(idx) => write!(f, "Index: {idx}"),
LinkFilter::Address(addr) => write!(f, "Address: {addr}"),
}
}
}
@@ -272,7 +272,7 @@ impl Handle {
use LinkAttribute as Nla;
let mac_addr = parse_mac_address(addr)
.with_context(|| format!("Failed to parse MAC address: {}", addr))?;
.with_context(|| format!("Failed to parse MAC address: {addr}"))?;
// Hardware filter might not be supported by netlink,
// we may have to dump link list and then find the target link.
@@ -924,7 +924,7 @@ mod tests {
/// Helper function to check if the result is a netlink EACCES error
fn is_netlink_permission_error<T>(result: &Result<T>) -> bool {
if let Err(e) = result {
let error_string = format!("{:?}", e);
let error_string = format!("{e:?}");
if error_string.contains("code: Some(-13)") {
println!("INFO: skipping test - netlink operations are restricted in this environment (EACCES)");
return true;
@@ -1000,7 +1000,7 @@ mod tests {
.unwrap()
.list_routes()
.await
.context(format!("available devices: {:?}", devices))
.context(format!("available devices: {devices:?}"))
.expect("Failed to list routes");
assert_ne!(all.len(), 0);
@@ -1186,7 +1186,7 @@ mod tests {
);
assert_eq!(
stdout.trim(),
format!("{} lladdr {} PERMANENT", TEST_ARP_IP, mac)
format!("{TEST_ARP_IP} lladdr {mac} PERMANENT")
);
clean_env_for_test_add_one_arp_neighbor(TEST_DUMMY_INTERFACE, TEST_ARP_IP);


@@ -191,13 +191,13 @@ mod tests {
fn test_slotfn() {
// Valid slots
let sf = SlotFn::new(0x00, 0x0).unwrap();
assert_eq!(format!("{}", sf), "00.0");
assert_eq!(format!("{sf}"), "00.0");
let sf = SlotFn::from_str("00.0").unwrap();
assert_eq!(format!("{}", sf), "00.0");
assert_eq!(format!("{sf}"), "00.0");
let sf = SlotFn::from_str("00").unwrap();
assert_eq!(format!("{}", sf), "00.0");
assert_eq!(format!("{sf}"), "00.0");
let sf = SlotFn::new(31, 7).unwrap();
let sf2 = SlotFn::from_str("1f.7").unwrap();
@@ -256,12 +256,12 @@ mod tests {
let sf1f_7 = SlotFn::new(0x1f, 7).unwrap();
let addr = Address::new(0, 0, sf0_0);
assert_eq!(format!("{}", addr), "0000:00:00.0");
assert_eq!(format!("{addr}"), "0000:00:00.0");
let addr2 = Address::from_str("0000:00:00.0").unwrap();
assert_eq!(addr, addr2);
let addr = Address::new(0xffff, 0xff, sf1f_7);
assert_eq!(format!("{}", addr), "ffff:ff:1f.7");
assert_eq!(format!("{addr}"), "ffff:ff:1f.7");
let addr2 = Address::from_str("ffff:ff:1f.7").unwrap();
assert_eq!(addr, addr2);
@@ -299,7 +299,7 @@ mod tests {
// Valid paths
let pcipath = Path::new(vec![sf3_0]).unwrap();
assert_eq!(format!("{}", pcipath), "03.0");
assert_eq!(format!("{pcipath}"), "03.0");
let pcipath2 = Path::from_str("03.0").unwrap();
assert_eq!(pcipath, pcipath2);
let pcipath2 = Path::from_str("03").unwrap();
@@ -308,7 +308,7 @@ mod tests {
assert_eq!(pcipath[0], sf3_0);
let pcipath = Path::new(vec![sf3_0, sf4_0]).unwrap();
assert_eq!(format!("{}", pcipath), "03.0/04.0");
assert_eq!(format!("{pcipath}"), "03.0/04.0");
let pcipath2 = Path::from_str("03.0/04.0").unwrap();
assert_eq!(pcipath, pcipath2);
let pcipath2 = Path::from_str("03/04").unwrap();
@@ -318,7 +318,7 @@ mod tests {
assert_eq!(pcipath[1], sf4_0);
let pcipath = Path::new(vec![sf3_0, sf4_0, sf5_0]).unwrap();
assert_eq!(format!("{}", pcipath), "03.0/04.0/05.0");
assert_eq!(format!("{pcipath}"), "03.0/04.0/05.0");
let pcipath2 = Path::from_str("03.0/04.0/05.0").unwrap();
assert_eq!(pcipath, pcipath2);
let pcipath2 = Path::from_str("03/04/05").unwrap();
@@ -329,7 +329,7 @@ mod tests {
assert_eq!(pcipath[2], sf5_0);
let pcipath = Path::new(vec![sfa_5, sfb_6, sfc_7]).unwrap();
assert_eq!(format!("{}", pcipath), "0a.5/0b.6/0c.7");
assert_eq!(format!("{pcipath}"), "0a.5/0b.6/0c.7");
let pcipath2 = Path::from_str("0a.5/0b.6/0c.7").unwrap();
assert_eq!(pcipath, pcipath2);
assert_eq!(pcipath.len(), 3);


@@ -138,7 +138,7 @@ fn sl() -> slog::Logger {
// Convenience function to wrap an error and response to ttrpc client
pub fn ttrpc_error(code: ttrpc::Code, err: impl Debug) -> ttrpc::Error {
get_rpc_status(code, format!("{:?}", err))
get_rpc_status(code, format!("{err:?}"))
}
/// Convert SandboxError to ttrpc error with appropriate code.
@@ -996,7 +996,7 @@ impl agent_ttrpc::AgentService for AgentService {
let err = unsafe { libc::ioctl(fd, TIOCSWINSZ, &win) };
Errno::result(err)
.map(drop)
.map_ttrpc_err(|e| format!("ioctl error: {:?}", e))?;
.map_ttrpc_err(|e| format!("ioctl error: {e:?}"))?;
Ok(Empty::new())
}
@@ -1020,20 +1020,20 @@ impl agent_ttrpc::AgentService for AgentService {
#[cfg(not(target_arch = "s390x"))]
{
let pcipath = pci::Path::from_str(&interface.devicePath).map_ttrpc_err(|e| {
format!("Unexpected pci-path for network interface: {:?}", e)
format!("Unexpected pci-path for network interface: {e:?}")
})?;
wait_for_pci_net_interface(&self.sandbox, &pcipath)
.await
.map_ttrpc_err(|e| format!("interface not available: {:?}", e))?;
.map_ttrpc_err(|e| format!("interface not available: {e:?}"))?;
}
#[cfg(target_arch = "s390x")]
{
let ccw_dev = ccw::Device::from_str(&interface.devicePath).map_ttrpc_err(|e| {
format!("Unexpected CCW path for network interface: {:?}", e)
format!("Unexpected CCW path for network interface: {e:?}")
})?;
wait_for_ccw_net_interface(&self.sandbox, &ccw_dev)
.await
.map_ttrpc_err(|e| format!("interface not available: {:?}", e))?;
.map_ttrpc_err(|e| format!("interface not available: {e:?}"))?;
}
}
@@ -1043,7 +1043,7 @@ impl agent_ttrpc::AgentService for AgentService {
.rtnl
.update_interface(&interface)
.await
.map_ttrpc_err(|e| format!("update interface: {:?}", e))?;
.map_ttrpc_err(|e| format!("update interface: {e:?}"))?;
Ok(interface)
}
@@ -1068,13 +1068,13 @@ impl agent_ttrpc::AgentService for AgentService {
.rtnl
.update_routes(new_routes)
.await
.map_ttrpc_err(|e| format!("Failed to update routes: {:?}", e))?;
.map_ttrpc_err(|e| format!("Failed to update routes: {e:?}"))?;
let list = sandbox
.rtnl
.list_routes()
.await
.map_ttrpc_err(|e| format!("Failed to list routes after update: {:?}", e))?;
.map_ttrpc_err(|e| format!("Failed to list routes after update: {e:?}"))?;
Ok(protocols::agent::Routes {
Routes: list,
@@ -1092,7 +1092,7 @@ impl agent_ttrpc::AgentService for AgentService {
update_ephemeral_mounts(sl(), &req.storages, &self.sandbox)
.await
.map_ttrpc_err(|e| format!("Failed to update mounts: {:?}", e))?;
.map_ttrpc_err(|e| format!("Failed to update mounts: {e:?}"))?;
Ok(Empty::new())
}
@@ -1243,7 +1243,7 @@ impl agent_ttrpc::AgentService for AgentService {
.rtnl
.list_interfaces()
.await
.map_ttrpc_err(|e| format!("Failed to list interfaces: {:?}", e))?;
.map_ttrpc_err(|e| format!("Failed to list interfaces: {e:?}"))?;
Ok(protocols::agent::Interfaces {
Interfaces: list,
@@ -1266,7 +1266,7 @@ impl agent_ttrpc::AgentService for AgentService {
.rtnl
.list_routes()
.await
.map_ttrpc_err(|e| format!("list routes: {:?}", e))?;
.map_ttrpc_err(|e| format!("list routes: {e:?}"))?;
Ok(protocols::agent::Routes {
Routes: list,
@@ -1383,7 +1383,7 @@ impl agent_ttrpc::AgentService for AgentService {
.rtnl
.add_arp_neighbors(neighs)
.await
.map_ttrpc_err(|e| format!("Failed to add ARP neighbours: {:?}", e))?;
.map_ttrpc_err(|e| format!("Failed to add ARP neighbours: {e:?}"))?;
Ok(Empty::new())
}
@@ -1603,7 +1603,7 @@ impl agent_ttrpc::AgentService for AgentService {
ma.memcg_set_config_async(mem_agent_memcgconfig_to_memcg_optionconfig(&config))
.await
.map_err(|e| {
let estr = format!("ma.memcg_set_config_async fail: {}", e);
let estr = format!("ma.memcg_set_config_async fail: {e}");
error!(sl(), "{}", estr);
ttrpc::Error::RpcStatus(ttrpc::get_status(ttrpc::Code::INTERNAL, estr))
})?;
@@ -1627,7 +1627,7 @@ impl agent_ttrpc::AgentService for AgentService {
ma.compact_set_config_async(mem_agent_compactconfig_to_compact_optionconfig(&config))
.await
.map_err(|e| {
let estr = format!("ma.compact_set_config_async fail: {}", e);
let estr = format!("ma.compact_set_config_async fail: {e}");
error!(sl(), "{}", estr);
ttrpc::Error::RpcStatus(ttrpc::get_status(ttrpc::Code::INTERNAL, estr))
})?;
@@ -2239,10 +2239,8 @@ fn load_kernel_module(module: &protocols::agent::KernelModule) -> Result<()> {
Some(code) => {
let std_out = String::from_utf8_lossy(&output.stdout);
let std_err = String::from_utf8_lossy(&output.stderr);
let msg = format!(
"load_kernel_module return code: {} stdout:{} stderr:{}",
code, std_out, std_err
);
let msg =
format!("load_kernel_module return code: {code} stdout:{std_out} stderr:{std_err}");
Err(anyhow!(msg))
}
None => Err(anyhow!("Process terminated by signal")),
@@ -2491,9 +2489,9 @@ mod tests {
// Skip test if loading kernel modules is not permitted
// or kernel module is not found
if let Err(e) = &result {
let error_string = format!("{:?}", e);
let error_string = format!("{e:?}");
// Let's print out the error message first
println!("DEBUG: error: {}", error_string);
println!("DEBUG: error: {error_string}");
if error_string.contains("Operation not permitted")
|| error_string.contains("EPERM")
|| error_string.contains("Permission denied")
@@ -2654,7 +2652,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let logger = slog::Logger::root(slog::Discard, o!());
let mut sandbox = Sandbox::new(&logger).unwrap();
@@ -2725,7 +2723,7 @@ mod tests {
// the fd will be closed on Process's dropping.
// unistd::close(wfd).unwrap();
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
assert_result!(d.result, result, msg);
}
}
@@ -2829,7 +2827,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let logger = slog::Logger::root(slog::Discard, o!());
let mut sandbox = Sandbox::new(&logger).unwrap();
@@ -2849,15 +2847,14 @@ mod tests {
let result = update_container_namespaces(&sandbox, &mut oci, d.use_sandbox_pidns);
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
assert_result!(d.result, result, msg);
if let Some(linux) = oci.linux() {
assert_eq!(
d.expected_namespaces,
linux.namespaces().clone().unwrap(),
"{}",
msg
"{msg}"
);
}
}
@@ -2950,7 +2947,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let dir = tempdir().expect("failed to make tempdir");
let block_size_path = dir.path().join("block_size_bytes");
@@ -2970,7 +2967,7 @@ mod tests {
hotplug_probe_path.to_str().unwrap(),
);
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
assert_result!(d.result, result, msg);
}
@@ -3080,7 +3077,7 @@ OtherField:other
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let dir = tempdir().expect("failed to make tempdir");
let proc_status_file_path = dir.path().join("status");
@@ -3091,9 +3088,9 @@ OtherField:other
let result = is_signal_handled(proc_status_file_path.to_str().unwrap(), d.signum);
let msg = format!("{}, result: {:?}", msg, result);
let msg = format!("{msg}, result: {result:?}");
assert_eq!(d.result, result, "{}", msg);
assert_eq!(d.result, result, "{msg}");
}
}
@@ -3392,9 +3389,9 @@ COMMIT
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let result = is_sealed_secret_path(d.source_path);
assert_eq!(d.result, result, "{}", msg);
assert_eq!(d.result, result, "{msg}");
}
}
}


@@ -258,7 +258,7 @@ impl Sandbox {
}
let mut pid_ns = Namespace::new(&self.logger).get_pid();
pid_ns.path = format!("/proc/{}/ns/pid", init_pid);
pid_ns.path = format!("/proc/{init_pid}/ns/pid");
self.sandbox_pidns = Some(pid_ns);
}
@@ -726,8 +726,7 @@ mod tests {
let ref_count = new_storage.ref_count().await;
assert_eq!(
ref_count, 1,
"Invalid refcount, got {} expected 1.",
ref_count
"Invalid refcount, got {ref_count} expected 1."
);
// Use the existing sandbox storage
@@ -738,8 +737,7 @@ mod tests {
let ref_count = new_storage.ref_count().await;
assert_eq!(
ref_count, 2,
"Invalid refcount, got {} expected 2.",
ref_count
"Invalid refcount, got {ref_count} expected 2."
);
}
@@ -826,11 +824,7 @@ mod tests {
// Reference counter should decrement to 1.
let storage = &s.storages[storage_path];
let refcount = storage.ref_count().await;
assert_eq!(
refcount, 1,
"Invalid refcount, got {} expected 1.",
refcount
);
assert_eq!(refcount, 1, "Invalid refcount, got {refcount} expected 1.");
assert!(
s.remove_sandbox_storage(storage_path).await.unwrap(),
@@ -974,7 +968,7 @@ mod tests {
assert!(s.sandbox_pidns.is_some());
let ns_path = format!("/proc/{}/ns/pid", test_pid);
let ns_path = format!("/proc/{test_pid}/ns/pid");
assert_eq!(s.sandbox_pidns.unwrap().path, ns_path);
}
@@ -1286,7 +1280,7 @@ mod tests {
let tmpdir_path = tmpdir.path().to_str().unwrap();
for (i, d) in tests.iter().enumerate() {
let current_test_dir_path = format!("{}/test_{}", tmpdir_path, i);
let current_test_dir_path = format!("{tmpdir_path}/test_{i}");
fs::create_dir(&current_test_dir_path).unwrap();
// create numbered directories and fill using root name
@@ -1295,7 +1289,7 @@ mod tests {
"{}/{}{}",
current_test_dir_path, d.directory_autogen_name, j
);
let subfile_path = format!("{}/{}", subdir_path, SYSFS_ONLINE_FILE);
let subfile_path = format!("{subdir_path}/{SYSFS_ONLINE_FILE}");
fs::create_dir(&subdir_path).unwrap();
let mut subfile = File::create(subfile_path).unwrap();
subfile.write_all(b"0").unwrap();
@@ -1322,18 +1316,15 @@ mod tests {
result.is_ok()
);
assert_eq!(result.is_ok(), d.result.is_ok(), "{}", msg);
assert_eq!(result.is_ok(), d.result.is_ok(), "{msg}");
if d.result.is_ok() {
let test_result_val = *d.result.as_ref().ok().unwrap();
let result_val = result.ok().unwrap();
msg = format!(
"test[{}]: {:?}, expected {}, actual {}",
i, d, test_result_val, result_val
);
msg = format!("test[{i}]: {d:?}, expected {test_result_val}, actual {result_val}");
assert_eq!(test_result_val, result_val, "{}", msg);
assert_eq!(test_result_val, result_val, "{msg}");
}
}
}

@@ -44,7 +44,7 @@ async fn handle_sigchild(logger: Logger, sandbox: Arc<Mutex<Sandbox>>) -> Result
if let Some(pid) = wait_status.pid() {
let raw_pid = pid.as_raw();
let child_pid = format!("{}", raw_pid);
let child_pid = format!("{raw_pid}");
let logger = logger.new(o!("child-pid" => child_pid));

@@ -11,9 +11,10 @@ use std::str::FromStr;
use std::sync::Arc;
use anyhow::{anyhow, Context, Result};
#[cfg(target_arch = "s390x")]
use kata_types::device::DRIVER_BLK_CCW_TYPE;
use kata_types::device::{
DRIVER_BLK_CCW_TYPE, DRIVER_BLK_MMIO_TYPE, DRIVER_BLK_PCI_TYPE, DRIVER_NVDIMM_TYPE,
DRIVER_SCSI_TYPE,
DRIVER_BLK_MMIO_TYPE, DRIVER_BLK_PCI_TYPE, DRIVER_NVDIMM_TYPE, DRIVER_SCSI_TYPE,
};
use kata_types::mount::StorageDevice;
use protocols::agent::Storage;
@@ -93,9 +94,11 @@ impl StorageHandler for VirtioBlkPciHandler {
}
}
#[cfg(target_arch = "s390x")]
#[derive(Debug)]
pub struct VirtioBlkCcwHandler {}
#[cfg(target_arch = "s390x")]
#[async_trait::async_trait]
impl StorageHandler for VirtioBlkCcwHandler {
#[instrument]

@@ -467,7 +467,7 @@ mod tests {
];
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
skip_loop_by_user!(msg, d.test_user);
@@ -515,7 +515,7 @@ mod tests {
nix::mount::umount(&mount_point).unwrap();
}
let msg = format!("{}: result: {:?}", msg, result);
let msg = format!("{msg}: result: {result:?}");
if d.error_contains.is_empty() {
assert!(result.is_ok(), "{}", msg);
} else {
@@ -576,7 +576,7 @@ mod tests {
let tempdir = tempdir().expect("failed to create tmpdir");
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let mount_dir = tempdir.path().join(d.mount_path);
fs::create_dir(&mount_dir)
@@ -663,7 +663,7 @@ mod tests {
let tempdir = tempdir().expect("failed to create tmpdir");
for (i, d) in tests.iter().enumerate() {
let msg = format!("test[{}]: {:?}", i, d);
let msg = format!("test[{i}]: {d:?}");
let mount_dir = tempdir.path().join(d.path);
fs::create_dir(&mount_dir)
@@ -674,12 +674,12 @@ mod tests {
// create testing directories and files
for n in 1..COUNT {
let nest_dir = mount_dir.join(format!("nested{}", n));
let nest_dir = mount_dir.join(format!("nested{n}"));
fs::create_dir(&nest_dir)
.unwrap_or_else(|_| panic!("{}: failed to create nest directory", msg));
for f in 1..COUNT {
let filename = nest_dir.join(format!("file{}", f));
let filename = nest_dir.join(format!("file{f}"));
File::create(&filename)
.unwrap_or_else(|_| panic!("{}: failed to create file", msg));
file_mode = filename.as_path().metadata().unwrap().permissions().mode();
@@ -707,9 +707,9 @@ mod tests {
);
for n in 1..COUNT {
let nest_dir = mount_dir.join(format!("nested{}", n));
let nest_dir = mount_dir.join(format!("nested{n}"));
for f in 1..COUNT {
let filename = nest_dir.join(format!("file{}", f));
let filename = nest_dir.join(format!("file{f}"));
let file = Path::new(&filename);
assert_eq!(file.metadata().unwrap().gid(), d.gid);

@@ -147,7 +147,7 @@ mod tests {
let v = vec_locked
.as_deref_mut()
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e.to_string()))?;
.map_err(|e| std::io::Error::other(e.to_string()))?;
std::io::Write::flush(v)
}

@@ -556,9 +556,9 @@ mod tests {
use test_utils::skip_if_not_root;
async fn create_test_storage(dir: &Path, id: &str) -> Result<(protos::Storage, PathBuf)> {
let src_path = dir.join(format!("src{}", id));
let src_path = dir.join(format!("src{id}"));
let src_filename = src_path.to_str().expect("failed to create src filename");
let dest_path = dir.join(format!("dest{}", id));
let dest_path = dir.join(format!("dest{id}"));
let dest_filename = dest_path.to_str().expect("failed to create dest filename");
std::fs::create_dir_all(src_filename).expect("failed to create path");
@@ -682,7 +682,7 @@ mod tests {
// setup storage0: too many files
for i in 1..21 {
fs::write(src0_path.join(format!("{}.txt", i)), "original").unwrap();
fs::write(src0_path.join(format!("{i}.txt")), "original").unwrap();
}
// setup storage1: two small files
@@ -700,7 +700,7 @@ mod tests {
// setup storage3: many files, but still watchable
for i in 1..MAX_ENTRIES_PER_STORAGE {
fs::write(src3_path.join(format!("{}.txt", i)), "original").unwrap();
fs::write(src3_path.join(format!("{i}.txt")), "original").unwrap();
}
let logger = slog::Logger::root(slog::Discard, o!());
@@ -919,7 +919,7 @@ mod tests {
// Up to 15 files should be okay (can watch 15 files + 1 directory)
for i in 1..MAX_ENTRIES_PER_STORAGE {
fs::write(source_dir.path().join(format!("{}.txt", i)), "original").unwrap();
fs::write(source_dir.path().join(format!("{i}.txt")), "original").unwrap();
}
assert_eq!(
@@ -1387,7 +1387,7 @@ mod tests {
assert!(!dest_dir.path().exists());
for i in 1..21 {
fs::write(source_dir.path().join(format!("{}.txt", i)), "fluff").unwrap();
fs::write(source_dir.path().join(format!("{i}.txt")), "fluff").unwrap();
}
// verify non-watched storage is cleaned up correctly

@@ -70,7 +70,7 @@ impl ExportError for Error {
}
fn make_io_error(desc: String) -> std::io::Error {
std::io::Error::new(ErrorKind::Other, desc)
std::io::Error::other(desc)
}
// Send a trace span to the forwarder running on the host.
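Note on `std::io::Error::other`: this constructor (stable since Rust 1.74) is shorthand for `Error::new(ErrorKind::Other, ...)`. The sketch below compares the two forms; the function names and the message text are illustrative and not taken from the repository.

```rust
use std::io::{Error, ErrorKind};

// Old spelling: name the kind explicitly.
fn make_io_error_old(desc: String) -> Error {
    Error::new(ErrorKind::Other, desc)
}

// New spelling: `Error::other` fills in `ErrorKind::Other` itself.
fn make_io_error_new(desc: String) -> Error {
    Error::other(desc)
}

fn main() {
    let old = make_io_error_old("span export failed".to_string());
    let new = make_io_error_new("span export failed".to_string());
    // Same kind and same rendered message either way.
    assert_eq!(old.kind(), new.kind());
    assert_eq!(old.to_string(), new.to_string());
}
```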

@@ -44,7 +44,7 @@ test:
ifdef SUPPORT_VIRTUALIZATION
@set -e; \
for dir in $(PROJECT_DIRS); do \
bash -c "pushd $${dir} && RUST_BACKTRACE=1 cargo test --all-features --target $(TRIPLE) -- --nocapture --test-threads=1 && popd"; \
bash -c "pushd $${dir} && RUST_BACKTRACE=1 cargo test --all-features --target $(TRIPLE) --no-fail-fast -- --nocapture --test-threads=1 --skip bindgen && popd"; \
done
else
@echo "INFO: skip testing dragonball, it need virtualization support."

@@ -207,7 +207,7 @@ impl<B: Bitmap> GuestMemoryRegion for GuestRegionHybrid<B> {
&self,
offset: MemoryRegionAddress,
count: usize,
) -> guest_memory::Result<VolatileSlice<BS<B>>> {
) -> guest_memory::Result<VolatileSlice<'_, BS<'_, B>>> {
match self {
GuestRegionHybrid::Mmap(region) => region.get_slice(offset, count),
GuestRegionHybrid::Raw(region) => region.get_slice(offset, count),
@@ -246,8 +246,8 @@ impl<B: Bitmap> GuestMemoryHybrid<B> {
/// # Arguments
///
/// * `regions` - The vector of regions.
/// The regions shouldn't overlap and they should be sorted
/// by the starting address.
/// The regions shouldn't overlap and they should be sorted
/// by the starting address.
pub fn from_regions(mut regions: Vec<GuestRegionHybrid<B>>) -> Result<Self, Error> {
Self::from_arc_regions(regions.drain(..).map(Arc::new).collect())
}
@@ -262,8 +262,8 @@ impl<B: Bitmap> GuestMemoryHybrid<B> {
/// # Arguments
///
/// * `regions` - The vector of `Arc` regions.
/// The regions shouldn't overlap and they should be sorted
/// by the starting address.
/// The regions shouldn't overlap and they should be sorted
/// by the starting address.
pub fn from_arc_regions(regions: Vec<Arc<GuestRegionHybrid<B>>>) -> Result<Self, Error> {
if regions.is_empty() {
return Err(Error::NoMemoryRegion);
@@ -359,7 +359,7 @@ impl<B: Bitmap + 'static> GuestMemory for GuestMemoryHybrid<B> {
index.map(|x| self.regions[x].as_ref())
}
fn iter(&self) -> Iter<B> {
fn iter(&self) -> Iter<'_, B> {
Iter(self.regions.iter())
}
}
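Note on the `'_` additions: the return types in these hunks gain an explicit anonymous lifetime (`Iter<'_, B>`, `VolatileSlice<'_, BS<'_, B>>`), presumably to satisfy a compiler or Clippy lint about hidden lifetime parameters; the diff does not say which one. The signature's meaning should be unchanged, because `'_` only makes visible that the returned value borrows from `&self`. A minimal sketch with hypothetical stand-in types:

```rust
// Hypothetical stand-ins for a region list and its iterator wrapper.
struct Regions {
    items: Vec<u64>,
}

struct Iter<'a>(std::slice::Iter<'a, u64>);

impl<'a> Iterator for Iter<'a> {
    type Item = &'a u64;

    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }
}

impl Regions {
    // `Iter<'_>` spells out that the result borrows from `&self`.
    // Writing plain `Iter` here would elide to the same lifetime.
    fn iter(&self) -> Iter<'_> {
        Iter(self.items.iter())
    }
}

fn main() {
    let regions = Regions { items: vec![0x1000, 0x2000] };
    let total: u64 = regions.iter().sum();
    println!("{total:#x}");
}
```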

@@ -202,7 +202,7 @@ impl<B: Bitmap> GuestMemoryRegion for GuestRegionRaw<B> {
&self,
offset: MemoryRegionAddress,
count: usize,
) -> guest_memory::Result<VolatileSlice<BS<B>>> {
) -> guest_memory::Result<VolatileSlice<'_, BS<'_, B>>> {
let offset = offset.raw_value() as usize;
let end = compute_offset(offset, count)?;
if end > self.size {

@@ -210,7 +210,7 @@ mod tests {
#[test]
fn test_create_gic() {
test_utils::skip_if_not_root!();
test_utils::skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
assert!(create_gic(&vm, 1).is_ok());
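Note on the skip macro: from here on, the KVM-dependent tests replace `skip_if_not_root!` with `skip_if_kvm_unaccessable!`. The macro's definition is not part of this diff, so the following is only a guess at the general shape of such a guard. It assumes "accessible" means `/dev/kvm` can be opened read-write, and every name in it is hypothetical:

```rust
use std::fs::OpenOptions;

// Assumption: KVM counts as accessible if /dev/kvm opens read-write.
fn kvm_accessible() -> bool {
    OpenOptions::new()
        .read(true)
        .write(true)
        .open("/dev/kvm")
        .is_ok()
}

// Hypothetical macro, NOT the real `test_utils::skip_if_kvm_unaccessable!`.
// It returns early from the enclosing function when KVM is unavailable.
macro_rules! skip_if_kvm_inaccessible_sketch {
    () => {
        if !kvm_accessible() {
            println!("INFO: skipping, /dev/kvm is not accessible");
            return;
        }
    };
}

// Stand-in for a test body that needs KVM.
fn run_kvm_dependent_check() {
    skip_if_kvm_inaccessible_sketch!();
    println!("KVM is accessible, the real test body would run here");
}

fn main() {
    run_kvm_dependent_check();
}
```

Under that assumption, a guard of this kind keys the skip on the capability the test needs and not on the user ID, so a non-root user with access to `/dev/kvm` would still run the tests.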

@@ -150,7 +150,7 @@ mod tests {
#[test]
fn test_create_pmu() {
test_utils::skip_if_not_root!();
test_utils::skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();

@@ -166,11 +166,11 @@ pub fn read_mpidr(vcpu: &VcpuFd) -> Result<u64> {
mod tests {
use super::*;
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
#[test]
fn test_setup_regs() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();
@@ -187,7 +187,7 @@ mod tests {
#[test]
fn test_read_mpidr() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();

@@ -78,7 +78,7 @@ pub fn set_lint(vcpu: &VcpuFd) -> Result<()> {
mod tests {
use super::*;
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
const KVM_APIC_REG_SIZE: usize = 0x400;
@@ -101,7 +101,7 @@ mod tests {
#[test]
fn test_setlint() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
assert!(kvm.check_extension(kvm_ioctls::Cap::Irqchip));
let vm = kvm.create_vm().unwrap();
@@ -128,7 +128,7 @@ mod tests {
#[test]
fn test_setlint_fails() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();

@@ -271,7 +271,7 @@ mod tests {
use super::*;
use crate::x86_64::gdt::gdt_entry;
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use vm_memory::{Bytes, GuestAddress, GuestMemoryMmap};
const BOOT_GDT_OFFSET: u64 = 0x500;
@@ -335,7 +335,7 @@ mod tests {
#[test]
fn test_setup_fpu() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();
@@ -358,7 +358,7 @@ mod tests {
#[test]
#[allow(clippy::cast_ptr_alignment)]
fn test_setup_msrs() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();
@@ -387,7 +387,7 @@ mod tests {
#[test]
fn test_setup_regs() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();

@@ -399,7 +399,7 @@ mod tests {
use device_tree::DeviceTree;
use kvm_bindings::{kvm_vcpu_init, KVM_ARM_VCPU_PMU_V3, KVM_ARM_VCPU_PSCI_0_2};
use kvm_ioctls::{Kvm, VcpuFd, VmFd};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use vm_memory::GuestMemoryMmap;
use super::super::tests::MMIODeviceInfo;
@@ -461,7 +461,7 @@ mod tests {
#[test]
fn test_create_fdt_with_devices() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let regions = arch_memory_regions(FDT_MAX_SIZE + 0x1000);
let mem = GuestMemoryMmap::<()>::from_ranges(&regions).expect("Cannot initialize memory");
let dev_info: HashMap<(DeviceType, String), MMIODeviceInfo> = [
@@ -500,7 +500,7 @@ mod tests {
#[test]
fn test_create_fdt() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let regions = arch_memory_regions(FDT_MAX_SIZE + 0x1000);
let mem = GuestMemoryMmap::<()>::from_ranges(&regions).expect("Cannot initialize memory");
let kvm = Kvm::new().unwrap();
@@ -535,7 +535,7 @@ mod tests {
#[test]
fn test_create_fdt_with_initrd() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let regions = arch_memory_regions(FDT_MAX_SIZE + 0x1000);
let mem = GuestMemoryMmap::<()>::from_ranges(&regions).expect("Cannot initialize memory");
let kvm = Kvm::new().unwrap();
@@ -574,7 +574,7 @@ mod tests {
#[test]
fn test_create_fdt_with_pmu() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let regions = arch_memory_regions(FDT_MAX_SIZE + 0x1000);
let mem = GuestMemoryMmap::<()>::from_ranges(&regions).expect("Cannot initialize memory");
let kvm = Kvm::new().unwrap();

@@ -304,7 +304,7 @@ mod tests {
#[test]
fn test_fdtutils_fdt_device_info() {
test_utils::skip_if_not_root!();
test_utils::skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let gic = create_gic(&vm, 0).unwrap();

@@ -258,7 +258,7 @@ mod tests {
#[test]
fn test_setup_page_tables() {
test_utils::skip_if_not_root!();
test_utils::skip_if_kvm_unaccessable!();
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vcpu = vm.create_vcpu(0).unwrap();

@@ -182,8 +182,8 @@ impl IoManager {
///
/// * `device`: device object to handle trapped IO access requests
/// * `resources`: resources representing trapped MMIO/PIO address ranges. Only MMIO/PIO address
/// ranges will be handled, and other types of resource will be ignored. So the caller does
/// not need to filter out non-MMIO/PIO resources.
/// ranges will be handled, and other types of resource will be ignored. So the caller does
/// not need to filter out non-MMIO/PIO resources.
pub fn register_device_io(
&mut self,
device: Arc<dyn DeviceIo>,
@@ -364,8 +364,8 @@ pub trait IoManagerContext {
/// * `ctx`: context object returned by begin_tx().
/// * `device`: device instance object to be registered
/// * `resources`: resources representing trapped MMIO/PIO address ranges. Only MMIO/PIO address
/// ranges will be handled, and other types of resource will be ignored. So the caller does
/// not need to filter out non-MMIO/PIO resources.
/// ranges will be handled, and other types of resource will be ignored. So the caller does
/// not need to filter out non-MMIO/PIO resources.
fn register_device_io(
&self,
ctx: &mut Self::Context,

@@ -220,7 +220,7 @@ impl InterruptSourceGroup for LegacyIrq {
mod test {
use super::*;
use crate::manager::tests::create_vm_fd;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
const MASTER_PIC: usize = 7;
const SLAVE_PIC: usize = 8;
@@ -229,7 +229,7 @@ mod test {
#[test]
#[allow(unreachable_patterns)]
fn test_legacy_interrupt_group() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
let rounting = Arc::new(KvmIrqRouting::new(vmfd.clone()));
let base = 0;
@@ -265,7 +265,7 @@ mod test {
#[test]
fn test_irq_routing_initialize_legacy() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
let routing = KvmIrqRouting::new(vmfd.clone());
@@ -281,7 +281,7 @@ mod test {
#[test]
fn test_routing_opt() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
let routing = KvmIrqRouting::new(vmfd.clone());
@@ -313,7 +313,7 @@ mod test {
#[test]
fn test_routing_set_routing() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
let routing = KvmIrqRouting::new(vmfd.clone());

@@ -271,7 +271,7 @@ pub fn from_sys_util_errno(e: vmm_sys_util::errno::Error) -> std::io::Error {
pub(crate) mod tests {
use super::*;
use crate::manager::tests::create_vm_fd;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
fn create_irq_group(
manager: Arc<KvmIrqManager>,
@@ -307,13 +307,13 @@ pub(crate) mod tests {
#[test]
fn test_create_kvm_irq_manager() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let _ = create_kvm_irq_manager();
}
#[test]
fn test_kvm_irq_manager_opt() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
vmfd.create_irq_chip().unwrap();
let manager = Arc::new(KvmIrqManager::new(vmfd.clone()));

@@ -202,12 +202,12 @@ impl InterruptSourceGroup for MsiIrq {
mod test {
use super::*;
use crate::manager::tests::create_vm_fd;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
#[test]
#[allow(unreachable_patterns)]
fn test_msi_interrupt_group() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let vmfd = Arc::new(create_vm_fd());
vmfd.create_irq_chip().unwrap();

@@ -451,7 +451,7 @@ pub(crate) mod tests {
use dbs_device::resources::{DeviceResources, MsiIrqType, Resource};
use kvm_ioctls::{Kvm, VmFd};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use super::*;
use crate::KvmIrqManager;
@@ -503,7 +503,7 @@ pub(crate) mod tests {
#[test]
fn test_create_device_interrupt_manager() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut mgr = create_interrupt_manager();
assert_eq!(mgr.mode, DeviceInterruptMode::Disabled);
@@ -539,7 +539,7 @@ pub(crate) mod tests {
#[test]
fn test_device_interrupt_manager_switch_mode() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut mgr = create_interrupt_manager();
// Can't switch working mode in enabled state.
@@ -624,7 +624,7 @@ pub(crate) mod tests {
#[test]
fn test_msi_config() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
assert!(interrupt_manager.set_msi_data(512, 0).is_err());
@@ -642,7 +642,7 @@ pub(crate) mod tests {
#[test]
fn test_set_working_mode_after_activated() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.activated = true;
assert!(interrupt_manager
@@ -664,7 +664,7 @@ pub(crate) mod tests {
#[test]
fn test_disable2legacy() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.activated = false;
interrupt_manager.mode = DeviceInterruptMode::Disabled;
@@ -675,7 +675,7 @@ pub(crate) mod tests {
#[test]
fn test_disable2nonlegacy() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.activated = false;
interrupt_manager.mode = DeviceInterruptMode::Disabled;
@@ -686,7 +686,7 @@ pub(crate) mod tests {
#[test]
fn test_legacy2nonlegacy() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.activated = false;
interrupt_manager.mode = DeviceInterruptMode::Disabled;
@@ -700,7 +700,7 @@ pub(crate) mod tests {
#[test]
fn test_nonlegacy2legacy() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.activated = false;
interrupt_manager.mode = DeviceInterruptMode::Disabled;
@@ -714,7 +714,7 @@ pub(crate) mod tests {
#[test]
fn test_update() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager
.set_working_mode(DeviceInterruptMode::GenericMsiIrq)
@@ -731,7 +731,7 @@ pub(crate) mod tests {
#[test]
fn test_get_configs() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
// legacy irq config
{
let interrupt_manager = create_interrupt_manager();
@@ -773,7 +773,7 @@ pub(crate) mod tests {
#[test]
fn test_reset_configs() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut interrupt_manager = create_interrupt_manager();
interrupt_manager.reset_configs(DeviceInterruptMode::LegacyIrq);

@@ -235,7 +235,7 @@ mod tests {
use super::*;
use crate::{InterruptManager, InterruptSourceType};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
const VIRTIO_INTR_VRING: u32 = 0x01;
const VIRTIO_INTR_CONFIG: u32 = 0x02;
@@ -251,7 +251,7 @@ mod tests {
#[cfg(feature = "kvm-legacy-irq")]
#[test]
fn test_create_legacy_notifier() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let (_vmfd, irq_manager) = crate::kvm::tests::create_kvm_irq_manager();
let group = irq_manager
.create_group(InterruptSourceType::LegacyIrq, 0, 1)
@@ -282,7 +282,7 @@ mod tests {
#[cfg(feature = "kvm-msi-irq")]
#[test]
fn test_virtio_msi_notifier() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let (_vmfd, irq_manager) = crate::kvm::tests::create_kvm_irq_manager();
let group = irq_manager
.create_group(InterruptSourceType::MsiIrq, 0, 3)

@@ -64,7 +64,7 @@ impl DeviceIoMut for I8042Wrapper<EventFdTrigger> {
}
if let Err(e) = self.device.write(offset.raw_value() as u8, data[0]) {
self.metrics.error_count.inc();
error!("Failed to trigger i8042 reset event: {:?}", e);
error!("Failed to trigger i8042 reset event: {e:?}");
} else {
self.metrics.write_count.inc();
}

@@ -70,10 +70,7 @@ impl SerialEvents for SerialEventsWrapper {
.map_or(Ok(()), |buf_ready| buf_ready.write(1))
{
Ok(_) => (),
Err(err) => error!(
"Could not signal that serial device buffer is ready: {:?}",
err
),
Err(err) => error!("Could not signal that serial device buffer is ready: {err:?}"),
}
}
}
@@ -111,7 +108,7 @@ impl ConsoleHandler for SerialWrapper<EventFdTrigger, SerialEventsWrapper> {
fn raw_input(&mut self, data: &[u8]) -> std::io::Result<usize> {
self.serial
.enqueue_raw_bytes(data)
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, format!("{e:?}")))
.map_err(|e| std::io::Error::other(format!("{e:?}")))
}
fn set_output_stream(&mut self, out: Option<Box<dyn Write + Send>>) {
@@ -133,7 +130,7 @@ impl DeviceIoMut for SerialWrapper<EventFdTrigger, SerialEventsWrapper> {
return;
}
if let Err(e) = self.serial.write(offset.raw_value() as u8, data[0]) {
error!("Failed the pio write to serial: {:?}", e);
error!("Failed the pio write to serial: {e:?}");
self.serial.events().metrics.error_count.inc();
}
}
@@ -151,7 +148,7 @@ impl DeviceIoMut for SerialWrapper<EventFdTrigger, SerialEventsWrapper> {
return;
}
if let Err(e) = self.serial.write(offset.raw_value() as u8, data[0]) {
error!("Failed the write to serial: {:?}", e);
error!("Failed the write to serial: {e:?}");
self.serial.events().metrics.error_count.inc();
}
}

@@ -72,7 +72,7 @@ impl PciBusContent {
.ok_or_else(|| Error::InvalidResource(res.clone()))?;
self.iomem_resources.insert(Range::new(*base, end), None);
}
_ => debug!("unknown resource assigned to PCI bus {}", id),
_ => debug!("unknown resource assigned to PCI bus {id}"),
}
}
@@ -158,7 +158,7 @@ impl PciBus {
let device_id = device.id();
let mut devices = self.devices.write().expect("poisoned lock for PCI bus");
debug!("add device id {} to bus", device_id);
debug!("add device id {device_id} to bus");
let old = devices.update(&Range::new(device_id, device_id), device.clone());
assert!(old.is_none());

@@ -703,7 +703,7 @@ impl PciConfiguration {
pub fn read_u32(&self, offset: usize) -> (bool, u32) {
let reg_idx = offset >> 2;
if (offset & 0x3) != 0 || offset >= 256 {
warn!("configuration read_u32 offset invalid: 0x{:x}", offset);
warn!("configuration read_u32 offset invalid: 0x{offset:x}");
return (false, 0xffff_ffff);
}
@@ -731,7 +731,7 @@ impl PciConfiguration {
/// been handled by the framework.
pub fn read_u16(&self, offset: usize) -> (bool, u16) {
if (offset & 0x1) != 0 || offset >= 256 {
warn!("configuration read_u16 offset invalid: 0x{:x}", offset);
warn!("configuration read_u16 offset invalid: 0x{offset:x}");
return (false, 0xffff);
}
@@ -750,7 +750,7 @@ impl PciConfiguration {
/// been handled by the framework.
pub fn read_u8(&self, offset: usize) -> (bool, u8) {
if offset >= 256 {
warn!("configuration read_8 offset invalid: 0x{:x}", offset);
warn!("configuration read_8 offset invalid: 0x{offset:x}");
return (false, 0xff);
}
@@ -771,7 +771,7 @@ impl PciConfiguration {
let reg_idx = offset >> 2;
let mask = self.writable_bits[reg_idx];
if (offset & 0x3) != 0 || offset >= 256 {
warn!("configuration write_u32 offset invalid: 0x{:x}", offset);
warn!("configuration write_u32 offset invalid: 0x{offset:x}");
return false;
}
@@ -816,7 +816,7 @@ impl PciConfiguration {
/// been handled by the framework.
pub fn write_u16(&mut self, offset: usize, value: u16) -> bool {
if (offset & 0x1) != 0 || offset >= 256 {
warn!("configuration write_u16 offset invalid: 0x{:x}", offset);
warn!("configuration write_u16 offset invalid: 0x{offset:x}");
return false;
}
@@ -1281,7 +1281,7 @@ impl PciConfiguration {
};
let constraints = vec![constraint];
if let Err(e) = self.bus.upgrade().unwrap().allocate_resources(&constraints) {
debug!("failed to allocate resource for PCI BAR: {:?}", e);
debug!("failed to allocate resource for PCI BAR: {e:?}");
} else {
self.set_bar_allocated(param.bar_idx, true);
}

@@ -406,7 +406,7 @@ impl PciCapability for MsiCap {
let mut buf = [0u8; 1];
if let Err(e) = self.read(offset as u64, &mut buf) {
error!("failed to read PCI MSI capability structure, {}", e);
error!("failed to read PCI MSI capability structure, {e}");
fill_config_data(&mut buf);
}
@@ -417,7 +417,7 @@ impl PciCapability for MsiCap {
let mut buf = [0u8; 2];
if let Err(e) = self.read(offset as u64, &mut buf) {
error!("failed to read PCI MSI capability structure, {}", e);
error!("failed to read PCI MSI capability structure, {e}");
fill_config_data(&mut buf);
}
@@ -428,7 +428,7 @@ impl PciCapability for MsiCap {
let mut buf = [0u8; 4];
if let Err(e) = self.read(offset as u64, &mut buf) {
error!("failed to read PCI MSI capability structure, {}", e);
error!("failed to read PCI MSI capability structure, {e}");
fill_config_data(&mut buf);
}
@@ -437,7 +437,7 @@ impl PciCapability for MsiCap {
fn write_u8(&mut self, offset: usize, value: u8) {
if let Err(e) = self.write(offset as u64, &[value]) {
error!("failed to write PCI MSI capability structure, {}", e);
error!("failed to write PCI MSI capability structure, {e}");
}
}
@@ -445,7 +445,7 @@ impl PciCapability for MsiCap {
let mut buf = [0u8; 2];
LittleEndian::write_u16(&mut buf, value);
if let Err(e) = self.write(offset as u64, &buf) {
error!("failed to write PCI MSI capability structure, {}", e);
error!("failed to write PCI MSI capability structure, {e}");
}
}
@@ -453,7 +453,7 @@ impl PciCapability for MsiCap {
let mut buf = [0u8; 4];
LittleEndian::write_u32(&mut buf, value);
if let Err(e) = self.write(offset as u64, &buf) {
error!("failed to write PCI MSI capability structure, {}", e);
error!("failed to write PCI MSI capability structure, {e}");
}
}
@@ -654,7 +654,7 @@ mod tests {
use dbs_device::resources::{DeviceResources, MsiIrqType, Resource};
use dbs_interrupt::KvmIrqManager;
use kvm_ioctls::{Kvm, VmFd};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use super::*;
@@ -736,7 +736,7 @@ mod tests {
#[test]
fn test_msi_state_struct() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let flags = MSI_CTL_ENABLE | MSI_CTL_64_BITS | MSI_CTL_PER_VECTOR | 0x6 | 0x20;
let mut cap = MsiCap::new(0xa5, flags);

@@ -361,7 +361,7 @@ mod tests {
use dbs_device::resources::{DeviceResources, MsiIrqType, Resource};
use dbs_interrupt::KvmIrqManager;
use kvm_ioctls::{Kvm, VmFd};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use super::*;
@@ -423,7 +423,7 @@ mod tests {
#[test]
fn test_set_msg_ctl() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut config = MsixState::new(0x10);
let mut intr_mgr = create_interrupt_manager();
@@ -454,7 +454,7 @@ mod tests {
#[test]
fn test_read_write_table() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut intr_mgr = create_interrupt_manager();
let mut config = MsixState::new(0x10);

@@ -146,28 +146,24 @@ impl VirtioPciCommonConfig {
}
fn read_common_config_byte(&self, offset: u64) -> u8 {
trace!("read_common_config_byte: offset 0x{:x}", offset);
trace!("read_common_config_byte: offset 0x{offset:x}");
// The driver is only allowed to do aligned, properly sized access.
match offset {
0x14 => self.driver_status,
0x15 => self.config_generation,
_ => {
warn!("invalid virtio config byte read: 0x{:x}", offset);
warn!("invalid virtio config byte read: 0x{offset:x}");
0
}
}
}
fn write_common_config_byte(&mut self, offset: u64, value: u8) {
trace!(
"write_common_config_byte: offset 0x{:x} value 0x{:x}",
offset,
value
);
trace!("write_common_config_byte: offset 0x{offset:x} value 0x{value:x}");
match offset {
0x14 => self.driver_status = value,
_ => {
warn!("invalid virtio config byte write: 0x{:x}", offset);
warn!("invalid virtio config byte write: 0x{offset:x}");
}
}
}
@@ -177,7 +173,7 @@ impl VirtioPciCommonConfig {
offset: u64,
queues: &[VirtioQueueConfig<Q>],
) -> u16 {
trace!("read_common_config_word: offset 0x{:x}", offset);
trace!("read_common_config_word: offset 0x{offset:x}");
match offset {
0x10 => self.msix_config.load(Ordering::Acquire),
0x12 => queues.len() as u16, // num_queues
@@ -187,7 +183,7 @@ impl VirtioPciCommonConfig {
0x1c => u16::from(self.with_queue(queues, |q| q.ready()).unwrap_or(false)),
0x1e => self.queue_select, // notify_off
_ => {
warn!("invalid virtio register word read: 0x{:x}", offset);
warn!("invalid virtio register word read: 0x{offset:x}");
0
}
}
@@ -199,11 +195,7 @@ impl VirtioPciCommonConfig {
value: u16,
queues: &mut [VirtioQueueConfig<Q>],
) {
trace!(
"write_common_config_word: offset 0x{:x} value 0x{:x}",
offset,
value
);
trace!("write_common_config_word: offset 0x{offset:x} value 0x{value:x}");
match offset {
0x10 => self.msix_config.store(value, Ordering::Release),
0x16 => self.queue_select = value,
@@ -214,7 +206,7 @@ impl VirtioPciCommonConfig {
q.set_ready(ready);
}),
_ => {
warn!("invalid virtio register word write: 0x{:x}", offset);
warn!("invalid virtio register word write: 0x{offset:x}");
}
}
}
@@ -228,7 +220,7 @@ impl VirtioPciCommonConfig {
offset: u64,
device: ArcMutexBoxDynVirtioDevice<AS, Q, R>,
) -> u32 {
trace!("read_common_config_dword: offset 0x{:x}", offset);
trace!("read_common_config_dword: offset 0x{offset:x}");
match offset {
0x00 => self.device_feature_select,
0x04 => {
@@ -243,7 +235,7 @@ impl VirtioPciCommonConfig {
}
0x08 => self.driver_feature_select,
_ => {
warn!("invalid virtio register dword read: 0x{:x}", offset);
warn!("invalid virtio register dword read: 0x{offset:x}");
0
}
}
@@ -260,11 +252,7 @@ impl VirtioPciCommonConfig {
queues: &mut [VirtioQueueConfig<Q>],
device: ArcMutexBoxDynVirtioDevice<AS, Q, R>,
) {
trace!(
"write_common_config_dword: offset 0x{:x} value 0x{:x}",
offset,
value
);
trace!("write_common_config_dword: offset 0x{offset:x} value 0x{value:x}");
match offset {
0x00 => self.device_feature_select = value,
@@ -287,13 +275,13 @@ impl VirtioPciCommonConfig {
0x30 => self.with_queue_mut(queues, |q| q.set_used_ring_address(Some(value), None)),
0x34 => self.with_queue_mut(queues, |q| q.set_used_ring_address(None, Some(value))),
_ => {
warn!("invalid virtio register dword write: 0x{:x}", offset);
warn!("invalid virtio register dword write: 0x{offset:x}");
}
}
}
fn read_common_config_qword(&self, _offset: u64) -> u64 {
trace!("read_common_config_qword: offset 0x{:x}", _offset);
trace!("read_common_config_qword: offset 0x{_offset:x}");
0 // Assume the guest has no reason to read write-only registers.
}
@@ -303,11 +291,7 @@ impl VirtioPciCommonConfig {
value: u64,
queues: &mut [VirtioQueueConfig<Q>],
) {
trace!(
"write_common_config_qword: offset 0x{:x}, value 0x{:x}",
offset,
value
);
trace!("write_common_config_qword: offset 0x{offset:x}, value 0x{value:x}");
let low = Some((value & 0xffff_ffff) as u32);
let high = Some((value >> 32) as u32);
@@ -317,7 +301,7 @@ impl VirtioPciCommonConfig {
0x28 => self.with_queue_mut(queues, |q| q.set_avail_ring_address(low, high)),
0x30 => self.with_queue_mut(queues, |q| q.set_used_ring_address(low, high)),
_ => {
warn!("invalid virtio register qword write: 0x{:x}", offset);
warn!("invalid virtio register qword write: 0x{offset:x}");
}
}
}
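
Review note: the logging changes in this file swap positional format arguments for identifiers captured inline in the format string. The sketch below suggests the two forms render the same text. It uses `format!` in place of the `trace!`/`warn!` macros, and the helper names `positional` and `inline` are hypothetical, chosen only for illustration.

```rust
/// Old style: the value is passed as a positional argument.
/// (Hypothetical helper, not part of the diff.)
fn positional(offset: u64) -> String {
    format!("invalid virtio register dword read: 0x{:x}", offset)
}

/// New style: the identifier is captured directly inside the braces,
/// with the same `:x` (lowercase hex) format spec.
/// (Hypothetical helper, not part of the diff.)
fn inline(offset: u64) -> String {
    format!("invalid virtio register dword read: 0x{offset:x}")
}
```

Calling `positional(0x3c)` and `inline(0x3c)` should yield identical strings. Only plain identifiers can be captured this way; an expression such as a field access still has to be passed as an argument.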

View File

@@ -242,7 +242,7 @@ impl DeviceIo for PciRootDevice {
} else {
data[0] = io_addr as u8;
}
debug!("=>read offset {}, and io_addr: 0x{:x}", offset, io_addr);
debug!("=>read offset {offset}, and io_addr: 0x{io_addr:x}");
return;
}
// Configuration data register

View File

@@ -268,7 +268,7 @@ impl Interrupt {
if let Some(msix) = &self.msix {
if let Some(offset) = msix.with_in_range(offset) {
if let Err(e) = self.update_msix_capability(offset, data) {
error!("Could not update MSI-X capability: {}", e);
error!("Could not update MSI-X capability: {e}");
}
return true;
}
@@ -277,7 +277,7 @@ impl Interrupt {
if let Some(msi) = self.msi.as_mut() {
if let Some(offset) = msi.with_in_range(offset) {
if let Err(e) = self.update_msi_capability(offset, data) {
error!("Could not update MSI capability: {}", e);
error!("Could not update MSI capability: {e}");
}
}
}
@@ -491,7 +491,7 @@ impl Interrupt {
if let Some(fd) = group.notifier(idx) {
irqfds.push(fd)
} else {
warn!("pci_vfio: failed to get irqfd 0x{:x} for vfio device", idx);
warn!("pci_vfio: failed to get irqfd 0x{idx:x} for vfio device");
return Err(VfioPciError::InternalError);
}
}
@@ -529,7 +529,7 @@ impl Interrupt {
let intr_mgr = self.irq_manager.as_mut().unwrap();
let offset = offset - u64::from(msix.cap.table_offset());
if let Err(e) = msix.state.write_table(offset, data, intr_mgr) {
debug!("failed to update PCI MSI-x table entry, {}", e);
debug!("failed to update PCI MSI-x table entry, {e}");
}
}
}
@@ -685,7 +685,7 @@ impl Region {
match self.set_user_memory_region(j, false, vm) {
Ok(_) => {}
Err(err) => {
error!("Could not delete kvm memory slot, error:{}", err);
error!("Could not delete kvm memory slot, error:{err}");
}
}
@@ -711,10 +711,7 @@ impl Region {
self.mmaps[i].mmap_size,
host_addr as u64,
) {
error!(
"vfio dma map failed, pci p2p dma may not work, due to {:?}",
e
);
error!("vfio dma map failed, pci p2p dma may not work, due to {e:?}");
}
}
@@ -749,10 +746,7 @@ impl Region {
self.start.raw_value() + self.mmaps[i].mmap_offset,
self.mmaps[i].mmap_size,
) {
error!(
"vfio dma unmap failed, pci p2p dma may not work, due to {:?}",
e
);
error!("vfio dma unmap failed, pci p2p dma may not work, due to {e:?}");
}
}
@@ -779,10 +773,7 @@ impl Region {
self.start.raw_value() + self.mmaps[i].mmap_offset,
self.mmaps[i].mmap_size,
) {
error!(
"vfio dma unmap failed, pci p2p dma may not work, due to {:?}",
e
);
error!("vfio dma unmap failed, pci p2p dma may not work, due to {e:?}");
}
self.start = GuestAddress(params.new_base);
self.set_user_memory_region(i, true, vm)?;
@@ -793,10 +784,7 @@ impl Region {
self.mmaps[i].mmap_size,
self.mmaps[i].mmap_host_addr,
) {
error!(
"vfio dma map failed, pci p2p dma may not work, due to {:?}",
e
);
error!("vfio dma map failed, pci p2p dma may not work, due to {e:?}");
}
}
}
@@ -1159,7 +1147,7 @@ impl<C: PciSystemContext> VfioPciDeviceState<C> {
mappable
);
log::info!("mmap_size {}, mmap_offset {} ", mmap_size, mmap_offset);
log::info!("mmap_size {mmap_size}, mmap_offset {mmap_offset} ");
self.regions.push(Region {
bar_index: bar_id,
@@ -1479,7 +1467,7 @@ impl<C: PciSystemContext> VfioPciDeviceState<C> {
let table_size: u64 = u64::from(msix_cap.table_size()) * (MSIX_TABLE_ENTRY_SIZE as u64);
let pba_bir: u32 = msix_cap.pba_bir();
let pba_offset: u64 = u64::from(msix_cap.pba_offset());
let pba_size: u64 = (u64::from(msix_cap.table_size()) + 7) / 8;
let pba_size: u64 = u64::from(msix_cap.table_size()).div_ceil(8);
self.interrupt.msix = Some(VfioMsix {
state: msix_config,
@@ -1595,7 +1583,7 @@ impl<C: PciSystemContext> VfioPciDevice<C> {
pub fn activate(&self, device: Weak<dyn DeviceIo>, resources: DeviceResources) -> Result<()> {
let mut state = self.state();
if resources.len() == 0 {
if resources.is_empty() {
return Err(VfioPciError::InvalidResources);
}
@@ -1640,7 +1628,7 @@ impl<C: PciSystemContext> VfioPciDevice<C> {
);
let _ = state.unregister_regions(&self.vm_fd).map_err(|e| {
// If unregistering regions goes wrong, the memory region in Dragonball will be in a mess,
// so we panic here to avoid more serious problem.
// so we panic here to avoid more serious problem.
panic!("failed to rollback changes of VfioPciDevice::register_regions() because error {:?}", e);
});
}
@@ -1656,7 +1644,7 @@ impl<C: PciSystemContext> VfioPciDevice<C> {
Ok(())
}
pub fn state(&self) -> MutexGuard<VfioPciDeviceState<C>> {
pub fn state(&self) -> MutexGuard<'_, VfioPciDeviceState<C>> {
// Don't expect poisoned lock
self.state
.lock()
@@ -1818,7 +1806,7 @@ impl<C: 'static + PciSystemContext> PciDevice for VfioPciDevice<C> {
state.configuration.write_config(offset as usize, data);
if let Some(params) = state.configuration.get_bar_programming_params() {
if let Err(e) = state.program_bar(reg_idx, params, &self.vm_fd) {
debug!("failed to program VFIO PCI BAR, {}", e);
debug!("failed to program VFIO PCI BAR, {e}");
}
}
// For device like nvidia vGPU the config space must also be updated.
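
Review note: the `pba_size` change in this file replaces the manual round-up `(n + 7) / 8` with `div_ceil(8)`. A minimal sketch of why the two should agree, assuming the table size fits in a `u16` (an assumption for illustration; the diff only shows that it converts to `u64` via `u64::from`). The function names are hypothetical.

```rust
/// Manual ceiling division, as in the removed line.
/// (Hypothetical helper; `table_size: u16` is an assumed type.)
fn pba_size_manual(table_size: u16) -> u64 {
    (u64::from(table_size) + 7) / 8
}

/// Ceiling division via the standard library, as in the added line.
/// One pending bit per table entry, rounded up to whole bytes.
fn pba_size_div_ceil(table_size: u16) -> u64 {
    u64::from(table_size).div_ceil(8)
}
```

For small inputs the two forms cannot diverge, since `n + 7` does not overflow a `u64` when `n` starts as a `u16`. The `div_ceil` form also avoids the addition entirely, so it stays correct near the top of the integer range.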

View File

@@ -410,7 +410,7 @@ where
let msix_res = device_resource
.get_pci_msix_irqs()
.ok_or(VirtioPciDeviceError::InvalidMsixResource)?;
info!("{:?}: virtio pci device msix_res: {:?}", dev_id, msix_res);
info!("{dev_id:?}: virtio pci device msix_res: {msix_res:?}");
let msix_state = MsixState::new(msix_res.1 as u16);
@@ -553,7 +553,7 @@ where
let addr =
IoEventAddress::Mmio(notify_base + i as u64 * u64::from(NOTIFY_OFF_MULTIPLIER));
if let Err(e) = self.vm_fd.register_ioevent(&q.eventfd, &addr, NoDatamatch) {
error!("failed to register ioevent: {:?}", e);
error!("failed to register ioevent: {e:?}");
return Err(std::io::Error::from_raw_os_error(e.errno()));
}
}
@@ -669,7 +669,7 @@ where
Ok(())
}
pub fn device(&self) -> MutexGuard<Box<dyn VirtioDevice<AS, Q, R>>> {
pub fn device(&self) -> MutexGuard<'_, Box<dyn VirtioDevice<AS, Q, R>>> {
self.device.lock().expect("Poisoned lock of device")
}
@@ -677,21 +677,21 @@ where
self.device.clone()
}
pub fn common_config(&self) -> MutexGuard<VirtioPciCommonConfig> {
pub fn common_config(&self) -> MutexGuard<'_, VirtioPciCommonConfig> {
self.common_config
.lock()
.expect("Poisoned lock of common_config")
}
pub fn state(&self) -> MutexGuard<VirtioPciDeviceState<AS, Q>> {
pub fn state(&self) -> MutexGuard<'_, VirtioPciDeviceState<AS, Q>> {
self.state.lock().expect("Poisoned lock of state")
}
pub fn msix_state(&self) -> MutexGuard<MsixState> {
pub fn msix_state(&self) -> MutexGuard<'_, MsixState> {
self.msix_state.lock().expect("Poisoned lock of msix_state")
}
pub fn intr_mgr(&self) -> MutexGuard<DeviceInterruptManager<Arc<KvmIrqManager>>> {
pub fn intr_mgr(&self) -> MutexGuard<'_, DeviceInterruptManager<Arc<KvmIrqManager>>> {
// Safe to unwrap() because we don't expect poisoned lock here.
self.intr_mgr.lock().expect("Poisoned lock of intr_mgr")
}
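
Review note: the accessor changes above add an explicit `'_` to each returned `MutexGuard`, which makes it visible in the signature that the guard borrows from `&self`. A minimal sketch of the pattern, using a hypothetical `Device` struct with a plain `u32` as its state (neither is taken from the diff):

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical stand-in for a device wrapper holding locked state.
struct Device {
    state: Mutex<u32>,
}

impl Device {
    /// The `'_` marks that the guard's lifetime is tied to `&self`;
    /// behavior is the same as with the lifetime left out.
    fn state(&self) -> MutexGuard<'_, u32> {
        self.state.lock().expect("Poisoned lock of state")
    }
}
```

With `let d = Device { state: Mutex::new(1) };`, a statement such as `*d.state() += 1;` takes the lock, updates the value, and releases the guard at the end of the statement.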
@@ -806,7 +806,7 @@ impl<
.activate(device_config)
.map(|_| self.device_activated.store(true, Ordering::SeqCst))
.map_err(|e| {
error!("device activate error: {:?}", e);
error!("device activate error: {e:?}");
e
})?;
@@ -885,7 +885,7 @@ impl<
.device()
.set_resource(self.vm_fd.clone(), self.device_resource.clone())
.map_err(|e| {
error!("Failed to assign device resource to virtio device: {}", e);
error!("Failed to assign device resource to virtio device: {e}");
VirtioPciDeviceError::SetResource(e)
})?;
@@ -954,7 +954,7 @@ impl<
{
let mut device = self.device();
if let Err(e) = device.read_config(o - DEVICE_CONFIG_BAR_OFFSET, data) {
warn!("device read config err: {}", e);
warn!("device read config err: {e}");
}
}
o if (NOTIFICATION_BAR_OFFSET..NOTIFICATION_BAR_OFFSET + NOTIFICATION_SIZE)
@@ -1004,7 +1004,7 @@ impl<
{
let mut device = self.device();
if let Err(e) = device.write_config(o - DEVICE_CONFIG_BAR_OFFSET, data) {
warn!("pci device write config err: {}", e);
warn!("pci device write config err: {e}");
}
}
o if (NOTIFICATION_BAR_OFFSET..NOTIFICATION_BAR_OFFSET + NOTIFICATION_SIZE)
@@ -1051,7 +1051,7 @@ impl<
if self.device_activated.load(Ordering::SeqCst) && self.is_driver_init() {
let mut device = self.device();
if let Err(e) = device.reset() {
error!("Attempt to reset device when not implemented or reset error in underlying device, err: {:?}", e);
error!("Attempt to reset device when not implemented or reset error in underlying device, err: {e:?}");
let mut config = self.common_config();
config.driver_status = DEVICE_FAILED as u8;
} else {
@@ -1097,13 +1097,13 @@ impl<
if reg_in_offset == 2 && data.len() == 2 {
if let Err(e) = msix_state.set_msg_ctl(LittleEndian::read_u16(data), &mut intr_mgr)
{
error!("Failed to set MSI-X message control, err: {:?}", e);
error!("Failed to set MSI-X message control, err: {e:?}");
}
} else if reg_in_offset == 0 && data.len() == 4 {
if let Err(e) = msix_state
.set_msg_ctl((LittleEndian::read_u32(data) >> 16) as u16, &mut intr_mgr)
{
error!("Failed to set MSI-X message control, err: {:?}", e);
error!("Failed to set MSI-X message control, err: {e:?}");
}
}
}
@@ -1164,7 +1164,7 @@ pub(crate) mod tests {
use dbs_interrupt::kvm::KvmIrqManager;
use dbs_utils::epoll_manager::EpollManager;
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use virtio_queue::QueueSync;
use vm_memory::{GuestMemoryMmap, GuestRegionMmap, GuestUsize, MmapRegion};
@@ -1497,7 +1497,7 @@ pub(crate) mod tests {
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[test]
fn test_virtio_pci_device_activate() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut d: VirtioPciDevice<_, _, _> = get_pci_device();
assert_eq!(d.state().queues.len(), 2);
assert!(!d.state().check_queues_valid());
@@ -1556,7 +1556,7 @@ pub(crate) mod tests {
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[test]
fn test_bus_device_reset() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let mut d: VirtioPciDevice<_, _, _> = get_pci_device();
assert_eq!(d.state().queues.len(), 2);
@@ -1581,7 +1581,7 @@ pub(crate) mod tests {
#[test]
fn test_virtio_pci_device_resources() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let d: VirtioPciDevice<_, _, _> = get_pci_device();
let resources = d.get_assigned_resources();
@@ -1599,7 +1599,7 @@ pub(crate) mod tests {
#[test]
fn test_virtio_pci_register_ioevent() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let d: VirtioPciDevice<_, _, _> = get_pci_device();
d.register_ioevent().unwrap();
assert!(d.ioevent_registered.load(Ordering::SeqCst));
@@ -1621,7 +1621,7 @@ pub(crate) mod tests {
#[test]
fn test_read_bar() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let d: VirtioPciDevice<_, _, _> = get_pci_device();
let origin_data = vec![1u8];
// driver status

View File

@@ -321,20 +321,20 @@ impl<S: UpcallClientService + Send> UpcallEpollHandler<S> {
match info.state {
UpcallClientState::WaitingServer => {
if let Err(e) = info.server_connection_check() {
debug!("upcall connect server check failed, {}", e);
debug!("upcall connect server check failed, {e}");
info.set_state(UpcallClientState::WaitingServer);
if let Err(e) = self.set_reconnect() {
error!("set reconnect error: {}", e);
error!("set reconnect error: {e}");
info.set_state(UpcallClientState::ReconnectError);
}
} else {
info!("upcall connect server success");
// It's time to connect to service when server is connected.
if let Err(e) = info.service_connection_start() {
warn!("upcall connect service start failed {}", e);
warn!("upcall connect service start failed {e}");
info.set_state(UpcallClientState::WaitingServer);
if let Err(e) = self.set_reconnect() {
error!("set reconnect error: {}", e);
error!("set reconnect error: {e}");
info.set_state(UpcallClientState::ReconnectError);
}
} else {
@@ -345,10 +345,10 @@ impl<S: UpcallClientService + Send> UpcallEpollHandler<S> {
}
UpcallClientState::WaitingService => {
if let Err(e) = info.service_connection_check() {
warn!("upcall connect service check failed, {}", e);
warn!("upcall connect service check failed, {e}");
info.set_state(UpcallClientState::WaitingServer);
if let Err(e) = self.set_reconnect() {
error!("set reconnect error: {}", e);
error!("set reconnect error: {e}");
info.set_state(UpcallClientState::ReconnectError);
}
} else {
@@ -363,10 +363,10 @@ impl<S: UpcallClientService + Send> UpcallEpollHandler<S> {
info.consume_callback(response);
}
Err(e) => {
warn!("upcall response failed {}", e);
warn!("upcall response failed {e}");
info.set_state(UpcallClientState::WaitingServer);
if let Err(e) = self.set_reconnect() {
error!("set reconnect error: {}", e);
error!("set reconnect error: {e}");
info.set_state(UpcallClientState::ReconnectError);
}
}
@@ -399,9 +399,9 @@ impl<S: UpcallClientService + Send> UpcallEpollHandler<S> {
let mut info = info.lock().unwrap();
// reconnect to server
if let Err(e) = info.server_connection_start() {
warn!("upcall reconnect server /failed: {}", e);
warn!("upcall reconnect server /failed: {e}");
if let Err(e) = self.set_reconnect() {
error!("set reconnect error: {}", e);
error!("set reconnect error: {e}");
}
}
debug!("upcall reconnect server...");

View File

@@ -286,7 +286,7 @@ mod tests {
use std::net::Ipv4Addr;
use std::str;
use std::sync::atomic::{AtomicUsize, Ordering};
use test_utils::skip_if_not_root;
use test_utils::skip_if_kvm_unaccessable;
use super::*;
@@ -390,7 +390,7 @@ mod tests {
#[test]
fn test_tap_name() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
// Sanity check that the assumed max iface name length is correct.
assert_eq!(
IFACE_NAME_MAX_LEN,
@@ -417,13 +417,13 @@ mod tests {
#[test]
fn test_tap_partial_eq() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
assert_ne!(Tap::new().unwrap(), Tap::new().unwrap());
}
#[test]
fn test_tap_configure() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
// `fetch_add` adds to the current value, returning the previous value.
let next_ip = NEXT_IP.fetch_add(1, Ordering::SeqCst);
@@ -456,7 +456,7 @@ mod tests {
#[test]
fn test_tap_enable() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let tap = Tap::new().unwrap();
let ret = tap.enable();
assert!(ret.is_ok());
@@ -464,7 +464,7 @@ mod tests {
#[test]
fn test_tap_get_ifreq() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let tap = Tap::new().unwrap();
let ret = tap.get_ifreq();
assert_eq!(
@@ -475,7 +475,7 @@ mod tests {
#[test]
fn test_raw_fd() {
skip_if_not_root!();
skip_if_kvm_unaccessable!();
let tap = Tap::new().unwrap();
assert_eq!(tap.as_raw_fd(), tap.tap_file.as_raw_fd());
}

View File

@@ -53,6 +53,7 @@ vm-memory = { workspace = true, features = [
test-utils = { workspace = true }
[features]
test-resources = []
virtio-mmio = []
virtio-vsock = ["virtio-mmio"]
virtio-net = ["virtio-mmio"]

Some files were not shown because too many files have changed in this diff.