Compare commits

...

75 Commits

Author SHA1 Message Date
stevenhorsman
aa11441c1a workflows: Create workflow to stale issues based on date
The standard stale action is intended to be run regularly with
a date offset, but we want one we can run against a specific
date, in order to run the stale bot against issues created since a particular
release milestone, so calculate the offset in one step and use it in the next.

At the moment we want to run this to stale issues from before 9th October 2022,
when Kata 3.0 was released, so default to this date.

Note that the stale action only processes a few issues at a time to avoid rate
limiting, which is why we want a cron job: so it can get through the backlog,
but also stale/unstale issues that are commented on.
2026-01-22 11:32:01 +00:00
Steve Horsman
2cd76796bd Merge pull request #12305 from stevenhorsman/fix-stalebot-permissions
ci: Fix stalebot permissions
2026-01-22 10:02:43 +00:00
Hyounggyu Choi
bc131a84b9 GHA: Set timeout for kata-deploy and kbs cleanup
It was observed that some kata-deploy cleanup steps could hang,
causing the workflow to never finish properly. In these cases,
a QEMU process was not cleaned up and kept printing debug logs
to the journal. Over time, this maxed out the runner’s disk
usage and caused the runner service to stop.

Set timeouts for the relevant cleanup steps to avoid this.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-01-22 10:32:24 +01:00
Fabiano Fidêncio
dacb14619d kata-deploy: Make verification ConfigMap a regular resource
The verification job mounts a ConfigMap containing the pod spec for
the Kata runtime test. Previously, both the ConfigMap and the Job were
Helm hooks with different weights (-5 and 0 respectively).

On k3s, a race condition was observed where the Job pod would be
scheduled before the kubelet's informer cache had registered the
ConfigMap, causing a FailedMount error:

  MountVolume.SetUp failed for volume "pod-spec": object
  "kube-system"/"kata-deploy-verification-spec" not registered

This happened because k3s's lightweight architecture schedules pods
very quickly, and the hook weight difference only controls Helm's
ordering, not actual timing between resource creation and cache sync.

By making the ConfigMap a regular chart resource (removing hook
annotations), it is created during the main chart installation phase,
well before any post-install hooks run. This guarantees the ConfigMap
is fully propagated to all kubelets before the verification Job starts.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
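
For reference, the Helm hook annotations being dropped from the ConfigMap look
roughly like this (the hook name is illustrative; the weight is the one quoted
above):

```bash
# Previously in the ConfigMap template metadata (now removed), leaving the
# Job as the only hooked resource:
#   annotations:
#     "helm.sh/hook": post-install
#     "helm.sh/hook-weight": "-5"
```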
Fabiano Fidêncio
89e287c3b2 kata-deploy: Add more permissions to verification job's RBAC
The verification job needs to list nodes to check for the
katacontainers.io/kata-runtime label and list events to detect
FailedCreatePodSandBox errors during pod creation.

This was discovered when testing with k0s, where the service account
lacked the required cluster-scope permissions to list nodes.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
869dd5ac65 kata-deploy: Enable dynamic drop-in support for k0s
Remove k0s-worker and k0s-controller from
RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT and always return true for
k0s in is_containerd_capable_of_using_drop_in_files since k0s auto-loads
from containerd.d/ directory regardless of containerd version.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
d4ea02e339 kata-deploy: Add microk8s support with dynamic version detection
Add microk8s case to get_containerd_paths() method and remove microk8s
from RUNTIMES_WITHOUT_CONTAINERD_DROP_IN_SUPPORT to enable dynamic
containerd version checking.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
69dd9679c2 kata-deploy: Centralize containerd path management
Introduce ContainerdPaths struct and get_containerd_paths() method to
centralize the complex logic for determining containerd configuration
file paths across different Kubernetes distributions.

The new ContainerdPaths struct includes:
- config_file: File to read containerd version from and write to
- backup_file: Backup file path before modification
- imports_file: File to add/remove drop-in imports from (Option<String>)
- drop_in_file: Path to the drop-in configuration file
- use_drop_in: Whether drop-in files can be used

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
Fabiano Fidêncio
606c12df6d kata-deploy: fix JSONPath parsing for labels with dots
The JSONPath parser was incorrectly splitting on escaped dots (\.)
causing microk8s detection to fail. Labels like "microk8s.io/cluster"
were being split into ["microk8s\", "io/cluster"] instead of being
treated as a single key.

This adds a split_jsonpath() helper that properly handles escaped dots,
allowing the automatic microk8s detection via the node label to work
correctly.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
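
For reference, this is the kind of escaped-dot JSONPath lookup the parser now
handles correctly; the kubectl equivalent is shown here for illustration (the
in-tree parser itself is Rust):

```bash
# The label key contains dots, so each dot inside the key must be escaped,
# otherwise it is treated as a path separator:
kubectl get node "${NODE}" \
    -o jsonpath='{.metadata.labels.microk8s\.io/cluster}'
```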
Fabiano Fidêncio
ec18dd79ba tests: Simplify kata-deploy test to use helm directly
The kata-deploy test was using helm_helper, which made it hard to debug
failures (die() calls would cause "Executed 0 tests" errors) and added
unnecessary complexity.

The test now calls helm directly like a user would, making it simpler
and more representative of real-world usage. The verification job status
is explicitly checked with proper failure detection instead of relying
on helm --wait.

Timeouts are configurable via environment variables to account for
different network speeds and image sizes:
- KATA_DEPLOY_TIMEOUT (default: 600s)
- KATA_DEPLOY_DAEMONSET_TIMEOUT (default: 300s)
- KATA_DEPLOY_VERIFICATION_TIMEOUT (default: 120s)

Documentation has been added to explain what each timeout controls and
how to customize them.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
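
As a usage sketch, the timeouts above can be overridden before invoking the
test runner; the run-tests sub-command name is an assumption, while the script
path is taken from the workflow change further down:

```bash
# Generous timeouts for a slow network; values are illustrative.
export KATA_DEPLOY_TIMEOUT=900
export KATA_DEPLOY_DAEMONSET_TIMEOUT=600
export KATA_DEPLOY_VERIFICATION_TIMEOUT=240
bash tests/functional/kata-deploy/gha-run.sh run-tests
```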
Fabiano Fidêncio
86e0b08b13 kata-deploy: Improve verification job timing and failure detection
The verification job now supports configurable timeouts to accommodate
different environments and network conditions. The daemonset timeout
defaults to 1200 seconds (20 minutes) to allow for large image downloads,
while the verification pod timeout defaults to 180 seconds.

The job now waits for the DaemonSet to exist, pods to be scheduled,
rollout to complete, and nodes to be labeled before creating the
verification pod. A 15-second delay is added after node labeling to
allow kubelet time to refresh runtime information.

Retry logic with 3 attempts and a 10-second delay handles transient
FailedCreatePodSandBox errors that can occur during runtime
initialization. The job only fails on pod errors after a 30-second
grace period to avoid false positives from timing issues.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
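
A rough shell sketch of the ordering described above; resource names,
namespaces, and flags are assumptions rather than the chart's actual code:

```bash
# Wait for the DaemonSet rollout, give kubelet time to settle, then retry
# the verification pod to ride out transient FailedCreatePodSandBox errors.
kubectl -n kube-system rollout status ds/kata-deploy \
    --timeout="${DAEMONSET_TIMEOUT:-1200}s"
sleep 15  # let kubelet refresh runtime info after node labeling
for attempt in 1 2 3; do
    kubectl apply -f /verification/pod.yaml
    if kubectl wait --for=jsonpath='{.status.phase}'=Succeeded \
        pod/kata-deploy-verifier --timeout="${POD_TIMEOUT:-180}s"; then
        break
    fi
    kubectl delete pod/kata-deploy-verifier --ignore-not-found
    sleep 10
done
```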
Fabiano Fidêncio
2369cf585d tests: Fix retry loop bugs in helm_helper
The retry loop in helm_helper had two bugs:
1. Counter initialized to 10 instead of 0, causing immediate failure
2. Exit condition used -eq instead of -ge, incorrect for loop logic

These bugs would cause helm_helper to fail immediately on the first
retry attempt instead of properly retrying up to max_tries times.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-21 20:14:33 +01:00
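
For illustration, a corrected loop of the shape the commit describes (variable
names are ours, not helm_helper's):

```bash
max_tries=10
tries=0                                          # bug 1: was initialised to 10
while ! helm install "$@"; do
    tries=$((tries + 1))
    if [ "${tries}" -ge "${max_tries}" ]; then   # bug 2: was -eq
        echo "giving up after ${max_tries} tries" >&2
        exit 1
    fi
    sleep 10
done
```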
stevenhorsman
19efeae12e workflow: Fix stalebot permissions
When looking into the stale bot more for issues, I realised that our existing
stale job would need permissions to work. Unfortunately, the behaviour
of the actions without these permissions is to log, but still finish successfully.
This meant it was hard to spot that we had an issue.

Add the required permissions to get this working again and improve the message.
Also add a concurrency rule to make zizmor happy.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 17:28:59 +00:00
Steve Horsman
70f6543333 Merge pull request #12371 from stevenhorsman/cargo-check
build: Add cargo check
2026-01-21 14:50:07 +00:00
Steve Horsman
4eb50d7b59 Merge pull request #12334 from stevenhorsman/rust-linting-improvements
Rust linting improvements
2026-01-21 14:01:37 +00:00
Steve Horsman
ba47bb6583 Merge pull request #11421 from kata-containers/dependabot/go_modules/src/runtime/github.com/urfave/cli-1.22.17
build(deps): bump github.com/urfave/cli from 1.22.14 to 1.22.17 in /src/runtime
2026-01-21 11:46:02 +00:00
stevenhorsman
62847e1efb kata-ctl: Remove unnecessary unwrap
Switch `is_err()` and then `unwrap_err()` for `if let` which is
"more idiomatic"

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:53:40 +00:00
stevenhorsman
78824e0181 agent: Remove unnecessary unwrap
Switch `is_some()` and then `unwrap()` for `if let` which is
"more idiomatic"

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:53:40 +00:00
stevenhorsman
d135a186e1 libs: Remove unnecessary unwrap
Switch `is_err()` and then `unwrap_err()` for `if let` which is
"more idiomatic"

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
949e0c2ca0 libs: Remove unused imports
Tidy up the imports

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
83b0c44986 dragonball: Remove unused imports
Clean up the imports

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
7a02c54b6c kata-ctl: Allow unused assignment in clap parsing
`command` isn't ever read, but leave it in for now so we don't disrupt
the option parsing.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:48 +00:00
stevenhorsman
bf1539b802 libs: Replace manual default
HugePageType has a manual default that can be derived
more concisely

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 08:52:47 +00:00
stevenhorsman
0fd9eebf0f kata-ctl: Update Cargo.lock
The cargo check identified that the lock file is out of date,
so bump this to fix the issue

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-20 16:07:34 +00:00
stevenhorsman
3f1533ae8a build: Add cargo check
We've had a couple of occasions where Cargo.lock has been out of sync
with Cargo.toml, so try to extend our rust check to pick this up in the CI.

There is probably a more elegant way than doing `cargo check` and
checking for changes, but I'll start with this approach.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-20 16:07:34 +00:00
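
The check can be reproduced locally along the lines the message suggests (the
exact CI wiring may differ):

```bash
cargo check                         # refreshes Cargo.lock if it is stale
git diff --exit-code -- Cargo.lock  # non-zero exit means it was out of sync
```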
Greg Kurz
cf3441bd2c agent: Refresh Cargo.lock
Downstream builders at Red Hat complain that `Cargo.lock` doesn't match
`Cargo.toml`.

Run `cargo check` to refresh `Cargo.lock`.

`git bisect` shows that 7cfb97d41b is the first commit where
`cargo check` has an effect in `src/agent`.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-01-20 14:44:47 +01:00
Fabiano Fidêncio
e0158869b1 tests: Add common bats test runner function
Add run_bats_tests() function to common.bash that provides consistent
test execution and reporting across all test suites (k8s, nvidia,
kata-deploy).

This removes duplicated test runner code from run_kubernetes_tests.sh,
run_kubernetes_nv_tests.sh, and run-kata-deploy-tests.sh.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-20 12:31:55 +01:00
Fabiano Fidêncio
5aff81198f helm-chart: Fix warnings on README
nydus -> `nydus`
erofs -> `erofs`

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 22:41:50 +01:00
Fabiano Fidêncio
b5a986eacf kata-deploy: Add runtime-rs TDX / SNP runtimeclasses
https://github.com/kata-containers/kata-containers/pull/11534 has been
merged and it added all the needed bits to deploy the QEMU SNP / TDX
runtime-rs variants, apart from the kata-deploy additions, which are done
by this PR.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 22:41:50 +01:00
Fabiano Fidêncio
c7570427d2 tests: Add report generation to NVIDIA tests
The NVIDIA GPU test runner script was not generating test reports,
causing the report_tests() function in gha-run.sh to have nothing
to display. This aligns the script with run_kubernetes_tests.sh by:

- Adding set -o pipefail for proper pipeline error handling
- Creating a reports directory with timestamped subdirectory
- Capturing test output to files with ok-/not_ok- prefixes
- Adding --timing flag to bats for timing information

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 18:21:43 +01:00
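
A minimal sketch of the reporting pattern listed above, assuming bats and a
tests array; the file naming mirrors the ok-/not_ok- convention:

```bash
set -o pipefail
report_dir="reports/$(date +%Y-%m-%d-%H-%M-%S)"
mkdir -p "${report_dir}"
for t in "${tests[@]}"; do
    if bats --timing "${t}" | tee "${report_dir}/tmp.log"; then
        mv "${report_dir}/tmp.log" "${report_dir}/ok-$(basename "${t}").log"
    else
        mv "${report_dir}/tmp.log" "${report_dir}/not_ok-$(basename "${t}").log"
    fi
done
```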
Fabiano Fidêncio
c1216598e8 static-checks: Fix kata-deploy reference
Let's just point to the official documentation rather than explaining
exactly how to deploy (and the current text was very outdated).

Removing the fluentd / minikube examples is out of the scope of this commit.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 15:09:20 +01:00
Fabiano Fidêncio
96e1fb4ca6 tools: Remove runk
The runk tool hasn't been supported for a few years, with no maintainers
since ManaSugi stopped being involved in the project and the CI was
disabled in 2024.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:43:53 +01:00
Fabiano Fidêncio
f68c25de6a kata-deploy: Switch to the rust version
Let's remove the script and rely only on the rust version from now on.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:07:49 +01:00
Fabiano Fidêncio
d7aa793dde Revert "ci: Run a nightly job using the kata-deploy rust"
This reverts commit 6130d7330f, as we're
officially switching to the rust version of kata-deploy.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:07:49 +01:00
Fabiano Fidêncio
17472f3f10 release: scripts: Accept KATA_TOOLS_STATIC_TARBALL env var
a2534e7bc8 introduced the logic to also
release a kata-tools tarball, but it missed allowing
KATA_TOOLS_STATIC_TARBALL env var to be passed to the release script,
leading to the following error during the release process:
```
ERROR: Invalid environment variable "KATA_TOOLS_STATIC_TARBALL"
```

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 13:03:23 +01:00
Fabiano Fidêncio
882862d711 release: Bump version to 3.25.0
Bump VERSION and helm-charts versions.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 11:33:45 +01:00
XanderC
93beb58c5d runtime: fix network initialization for non-hotplug VMMs
In startVM(), for VMMs without hotplug support (e.g., Firecracker or
QEMU microvm), the runtime runs prestart hooks but misses rescanning
the network namespace. This causes VMs to boot with uninitialized
network configs, as updates from CNI plugins are not captured.

This patch adds a network rescan via AddEndpoints after prestart hooks
for the non-hotplug path, ensuring correct network info is passed to
the VMM configuration before the VM starts.

Fixes #11500

Signed-off-by: XanderC <xanderc@qq.com>
2026-01-17 23:56:59 +01:00
Zvonko Kaiser
428cc5d586 gpu: Chroot Cleanup
With the newest NVRC we do not need the supported GPUs
anymore.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-17 19:27:24 +01:00
Fabiano Fidêncio
1c154b4c15 kernel: Add DAX fix for arm64
The patch has been provided upstream by Seunguk Shin and is already
approved.

We'll drop it once it becomes available in the LTS tree.

Reference:
https://lore.kernel.org/all/18af3213-6c46-4611-ba75-da5be5a1c9b0@arm.com

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-17 19:15:53 +01:00
Fabiano Fidêncio
33b1f0786e Revert "arm64: Do not use DAX with the rootfs image"
This reverts commit 2acb94ef2d, as we have
a kernel patch approved fixing the issue.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-17 19:15:53 +01:00
Alex Lyn
fe15f2fa47 runtime-rs: Remove deprecated virtio-9p
virtio-9p has not been supported for a long time, especially within
runtime-rs, and we have no plan to support it. Removing the
related items is reasonable.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
b7cfc6fd72 runtime-rs: Remove mem-agent section from TDX/SNP configurations
As the Memory Agent feature is not used within CoCo (TDX/SNP) scenarios,
it's better to just remove the related sections.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
634ec2b56d runtime-rs: Add configurable SNP items in Makefile when make build
This introduces some related items within the Makefile to enable
AMD SEV-SNP settings in the configuration when running make build, and
makes it possible to generate the rendered qemu-snp-runtime-rs
configuration based on the *.in template.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
0abdb8e016 runtime-rs: Introduce a qemu-runtime-rs/SEV-SNP dedicated configuration
To make qemu-runtime-rs with CoCo work well on SEV-SNP platforms,
a dedicated SEV-SNP configuration should be introduced to help
prepare the related CVM resources.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
b0a82f7bb8 runtime-rs: Enable measured rootfs within configuration when make build
Enable measured rootfs within the configuration when running make build,
and add some other important items to make the configuration work well.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
3799855040 runtime-rs: Add configurable TDX items in Makefile when make build
This introduces some related items within the Makefile to enable
Intel TDX settings in the configuration when running make build, and
makes it possible to generate the rendered qemu-tdx-runtime-rs
configuration based on the *.in template.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Alex Lyn
4d55e2c8c8 runtime-rs: Introduce a dedicated configuration for qemu-runtime-rs/TDX
To make qemu-runtime-rs with CoCo work well on TDX platforms,
a dedicated TDX configuration should be introduced to help
prepare the related CVM resources.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-17 18:52:57 +01:00
Manuel Huber
956f43c6c6 runtime: skip MoveTo for systemd cgroups
Systemd-managed cgroups use the slice:prefix:name format, which is
not a filesystem path. Calling MoveTo() on such paths fails with
"invalid group path" and can abort cleanup before Delete() runs.
In some cases, this causes pod teardown delays.
Skip MoveTo for systemd-formatted sandbox/overhead cgroup paths when
sandbox_cgroup_only is true; systemd moves tasks on unit deletion.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-16 16:41:38 +01:00
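
For contrast, the two cgroup path shapes involved (values illustrative):

```bash
# systemd-managed cgroup: slice:prefix:name format, not a filesystem path,
# so MoveTo() cannot operate on it:
#   kata.slice:kata:<sandbox-id>
# cgroupfs-managed cgroup: a real filesystem path that MoveTo() can handle:
#   /sys/fs/cgroup/kata_<sandbox-id>
```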
Manuel Huber
6b70923e55 docs: Update NVIDIA GPU passthrough QEMU scenario
With cold-plug becoming, by design, the only supported mode with the
update of NVRC to v0.1.1, resolve the remaining references to hot-plug.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-16 13:50:10 +01:00
Steve Horsman
610a8bdfd5 Merge pull request #12346 from Amulyam24/ppc64le-payload
ci: move the job to publish kata payload after push to an alternate runner for ppc64le
2026-01-16 11:41:53 +00:00
Fabiano Fidêncio
ea18f543b4 tests: kata-deploy: Enable verification during helm install
Enable post-install verification in kata-deploy CI tests. When
HELM_VERIFY_DEPLOYMENT is set, a simple verification pod is created
that runs with the Kata runtime to confirm deployment succeeded.

The verification pod prints kernel info and exits - success indicates
the Kata runtime is properly configured and functional.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-16 10:52:43 +01:00
Fabiano Fidêncio
a188f04d75 kata-deploy: helm: Add optional post-install verification
Add optional verification that runs after kata-deploy installation.
When a pod spec is provided via --set-file verification.pod=<file>,
a verification job runs after install/upgrade to validate deployment.

The user is fully responsible for the verification pod content:
- Pod name, runtimeClassName, annotations, and verification logic
- Pod must exit 0 on success, non-zero on failure

The verification job simply:
1. Waits for kata-deploy DaemonSet to be ready
2. Applies the user-provided pod spec
3. Waits for the pod to complete
4. Shows logs and cleans up

Usage:
  helm install kata-deploy ... \
    --set-file verification.pod=/path/to/your-pod.yaml

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-16 10:52:43 +01:00
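
As an illustration of a user-supplied pod spec (the content is entirely
hypothetical; the runtime class name and image depend on your deployment):

```bash
cat > /tmp/verify-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kata-deploy-verify
spec:
  runtimeClassName: kata-qemu
  restartPolicy: Never
  containers:
  - name: verify
    image: busybox
    command: ["uname", "-a"]   # print kernel info; exit 0 signals success
EOF
helm install kata-deploy ... \
    --set-file verification.pod=/tmp/verify-pod.yaml
```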
Amulyam24
859313d904 ci: move the job to publish kata payload after push to an alternate runner for ppc64le
To unlock the release, move the job that publishes the kata payload after push to an alternate runner (IBM-owned) for ppc64le.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2026-01-16 11:14:42 +05:30
Alex Lyn
c0cca81993 runtime-rs: Set default_bridges to 0 for dragonball vmm
As the Dragonball VMM does not support PCI hotplug options, it should
be set to 0.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
1a76d44e16 kata-types: Change the default bridges to 1
This aligns it with the Makefile and configuration's
setting.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
6375b3881d runtime-rs: Set the default bridges to a default of 1
As runtime-go uses a default of 1 bridge, it should be
kept as 1 to avoid alignment issues.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-01-15 20:32:15 +01:00
Alex Lyn
8728b262fb Merge pull request #12338 from zvonkok/nvrc-update
gpu: Bump NVRC Version
2026-01-15 19:36:07 +08:00
Zvonko Kaiser
adce41c432 gpu: Bump NVRC Version
The new NVRC version works for CC and non-CC use cases;
no --feature confidential is needed anymore.

Bump versions.yaml and adjust deployment instructions.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-15 01:51:10 +00:00
Manuel Huber
6753c3ac08 runtime: nvidia: Disable NVDIMM
Disable NVDIMM. When using GPU passthrough, NVDIMM would create
an r/o file-backed memory region. When using a GPU, QEMU tries to
DMA-map guest memory for the device, resulting in a mapping error:

  memory listener initialization failed: Region mem0:
  vfio_container_dma_map ... -22 (Invalid argument)

For the CC configs, NVDIMM is disabled by default in qemu_amd64.go
with a warning, but we also explicitly disable the setting in the
shim configuration file.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-01-14 22:51:07 +01:00
Fabiano Fidêncio
a9dda0e52b versions: nvidia: Bump kernel to the latest LTS
Now that we have decoupled the rootfs / kernel builds, doing the bump
becomes trivial.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 20:45:54 +01:00
Fabiano Fidêncio
4e99860fd2 workflows: nvidia: Adjust to kernel / rootfs build decoupling
We don't need to store the kernel headers anymore. We do need to store
the kernel modules instead.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
02d2b6bdf2 kernel: bump kata_config_version
We have kernel build changes, so bump the config version.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
a075c3740a gpu: build_image.sh use versions.yaml
We've been doing some fragile file-based driver determination;
now, with versions.yaml, there is a single source of truth.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
ffc8725164 gpu: rootfs update decoupling
Remove all the driver build instructions,
since those are now done in the kernel target.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
cca973772d gpu: deploy modules for kernel build
We need to package the built modules so that the rootfs build
is able to consume them. We package the whole
/lib/modules/$(uname -r) directory with strip=2.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
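
A sketch of what the packaging and consumption could look like, assuming tar
with zstd; the real build scripts may well differ:

```bash
# Package the entire modules tree for the running kernel...
tar --zstd -cf kata-static-kernel-nvidia-gpu-modules.tar.zst \
    -C / "lib/modules/$(uname -r)"
# ...and unpack it in the rootfs build with two leading path components
# stripped, matching the strip=2 mentioned above.
tar --zstd -xf kata-static-kernel-nvidia-gpu-modules.tar.zst \
    -C "${ROOTFS_DIR}/lib/modules" --strip-components=2
```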
Zvonko Kaiser
13ed3cdff9 gpu: Add NVIDIA modules to build-kernel.sh
Check out and build the kernel modules along
with the kernel to avoid the kernel-to-rootfs dependency.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
2a11910acb gpu: Remove building of Headers
Since we build the modules alongside the kernel, we do not need to
carry the headers over to the rootfs build.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
b1870fef07 gpu: versions.yaml nvidia driver pinning
We want to have deterministic behaviour, with only
one valid driver version accepted via versions.yaml.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Zvonko Kaiser
229481b348 kernel: bugfix install yq
We actually never installed yq for the kernel build; there are
some paths that use yq but were never hit. For the GPU use case
we need to read values from versions.yaml.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-01-14 20:45:54 +01:00
Steve Horsman
6db3a4cf8d Merge pull request #12333 from fitzthum/bump-v0180
Update Trustee and guest-components for upcoming releases
2026-01-14 19:44:55 +00:00
Tobin Feldman-Fitzthum
ca29e68acb agent-ctl: bump image-rs version
In preparation for coco v0.18.0, bump the version of image-rs we use in
agent-ctl to match what we have in versions.yaml.

Drop the snapshotter-overlayfs feature. This was dropped from image-rs
when we removed enclave-cc support.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-01-14 06:54:29 -08:00
Tobin Feldman-Fitzthum
25a08ef739 versions: bump Trustee and guest-components
Before cutting the Kata release that will be used with CoCo v0.18.0,
let's bump the versions of Trustee and guest-components to latest.

Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
2026-01-14 06:43:30 -08:00
Steve Horsman
0f5f914a04 Merge pull request #12330 from LandonTClipp/docs_improvement
docs: Navigation improvements and bug fixes to Pages
2026-01-14 14:13:29 +00:00
LandonTClipp
197231456f docs: Navigation improvements and bug fixes to Pages
A few minor changes to the Zensical config that make navigation easier. Also
fixed a couple of bugs with local serving and added some quality-of-life
features to Zensical.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-01-13 11:17:58 -06:00
dependabot[bot]
2edb161c53 build(deps): bump github.com/urfave/cli in /src/runtime
Bumps [github.com/urfave/cli](https://github.com/urfave/cli) from 1.22.14 to 1.22.17.
- [Release notes](https://github.com/urfave/cli/releases)
- [Changelog](https://github.com/urfave/cli/blob/main/docs/CHANGELOG.md)
- [Commits](https://github.com/urfave/cli/compare/v1.22.14...v1.22.17)

---
updated-dependencies:
- dependency-name: github.com/urfave/cli
  dependency-version: 1.22.17
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-13 09:04:41 +00:00
127 changed files with 3628 additions and 10289 deletions

View File

@@ -12,7 +12,6 @@ updates:
- "/src/tools/agent-ctl"
- "/src/tools/genpolicy"
- "/src/tools/kata-ctl"
- "/src/tools/runk"
- "/src/tools/trace-forwarder"
schedule:
interval: "daily"

View File

@@ -163,42 +163,6 @@ jobs:
timeout-minutes: 10
run: bash tests/integration/nydus/gha-run.sh run
run-runk:
name: run-runk
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
timeout-minutes: 10
run: bash tests/integration/runk/gha-run.sh run
run-tracing:
name: run-tracing
strategy:

View File

@@ -148,8 +148,8 @@ jobs:
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: kata-artifacts-amd64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.zst
name: kata-artifacts-amd64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst
retention-days: 15
if-no-files-found: error
@@ -237,8 +237,8 @@ jobs:
asset:
- busybox
- coco-guest-components
- kernel-nvidia-gpu-headers
- kernel-nvidia-gpu-confidential-headers
- kernel-nvidia-gpu-modules
- kernel-nvidia-gpu-confidential-modules
- pause-image
steps:
- uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

View File

@@ -134,8 +134,8 @@ jobs:
if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: kata-artifacts-arm64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.zst
name: kata-artifacts-arm64-${{ matrix.asset }}-modules${{ inputs.tarball-suffix }}
path: kata-build/kata-static-${{ matrix.asset }}-modules.tar.zst
retention-days: 15
if-no-files-found: error
@@ -216,7 +216,7 @@ jobs:
matrix:
asset:
- busybox
- kernel-nvidia-gpu-headers
- kernel-nvidia-gpu-modules
steps:
- uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0
with:

View File

@@ -1,36 +0,0 @@
name: Kata Containers Nightly CI (Rust)
on:
schedule:
- cron: '0 1 * * *' # Run at 1 AM UTC (1 hour after script-based nightly)
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
permissions: {}
jobs:
kata-containers-ci-on-push-rust:
permissions:
contents: read
packages: write
id-token: write
attestations: write
uses: ./.github/workflows/ci.yaml
with:
commit-hash: ${{ github.sha }}
pr-number: "nightly-rust"
tag: ${{ github.sha }}-nightly-rust
target-branch: ${{ github.ref_name }}
build-type: "rust" # Use Rust-based build
secrets:
AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}
AZ_APPID: ${{ secrets.AZ_APPID }}
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}
ITA_KEY: ${{ secrets.ITA_KEY }}
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
KBUILD_SIGN_PIN: ${{ secrets.KBUILD_SIGN_PIN }}

View File

@@ -19,11 +19,6 @@ on:
required: false
type: string
default: no
build-type:
description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).
required: false
type: string
default: ""
secrets:
AUTHENTICATED_IMAGE_PASSWORD:
required: true
@@ -77,7 +72,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-22.04
arch: amd64
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -110,7 +104,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-arm
arch: arm64
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -156,7 +149,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-s390x
arch: s390x
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -175,7 +167,6 @@ jobs:
target-branch: ${{ inputs.target-branch }}
runner: ubuntu-24.04-ppc64le
arch: ppc64le
build-type: ${{ inputs.build-type }}
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -297,7 +288,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -313,7 +304,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-arm64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-arm64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -326,7 +317,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -348,7 +339,7 @@ jobs:
tarball-suffix: -${{ inputs.tag }}
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -366,7 +357,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-s390x${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-s390x
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -380,7 +371,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-ppc64le${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-ppc64le
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}
@@ -392,7 +383,7 @@ jobs:
with:
registry: ghcr.io
repo: ${{ github.repository_owner }}/kata-deploy-ci
tag: ${{ inputs.tag }}-amd64${{ inputs.build-type == 'rust' && '-rust' || '' }}
tag: ${{ inputs.tag }}-amd64
commit-hash: ${{ inputs.commit-hash }}
pr-number: ${{ inputs.pr-number }}
target-branch: ${{ inputs.target-branch }}

View File

@@ -82,7 +82,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: ubuntu-22.04
arch: amd64
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -100,7 +99,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: ubuntu-24.04-arm
arch: arm64
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -118,7 +116,6 @@ jobs:
target-branch: ${{ github.ref_name }}
runner: s390x
arch: s390x
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}
@@ -134,9 +131,8 @@ jobs:
repo: kata-containers/kata-deploy-ci
tag: kata-containers-latest-ppc64le
target-branch: ${{ github.ref_name }}
runner: ppc64le-small
runner: ubuntu-24.04-ppc64le
arch: ppc64le
build-type: "" # Use script-based build (default)
secrets:
QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

View File

@@ -30,11 +30,6 @@ on:
description: The arch of the tarball.
required: true
type: string
build-type:
description: The build type for kata-deploy. Use 'rust' for Rust-based build, empty or omit for script-based (default).
required: false
type: string
default: ""
secrets:
QUAY_DEPLOYER_PASSWORD:
required: true
@@ -106,10 +101,8 @@ jobs:
REGISTRY: ${{ inputs.registry }}
REPO: ${{ inputs.repo }}
TAG: ${{ inputs.tag }}
BUILD_TYPE: ${{ inputs.build-type }}
run: |
./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \
"$(pwd)/kata-static.tar.zst" \
"${REGISTRY}/${REPO}" \
"${TAG}" \
"${BUILD_TYPE}"
"${TAG}"

View File

@@ -126,5 +126,6 @@ jobs:
- name: Delete CoCo KBS
if: always() && matrix.environment.name != 'nvidia-gpu'
timeout-minutes: 10
run: |
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

View File

@@ -137,10 +137,12 @@ jobs:
- name: Delete kata-deploy
if: always()
timeout-minutes: 10
run: bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi
- name: Delete CoCo KBS
if: always()
timeout-minutes: 10
run: |
if [ "${KBS}" == "true" ]; then
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

View File

@@ -120,10 +120,12 @@ jobs:
- name: Delete kata-deploy
if: always()
timeout-minutes: 15
run: bash tests/integration/kubernetes/gha-run.sh cleanup
- name: Delete CoCo KBS
if: always()
timeout-minutes: 10
run: |
[[ "${KATA_HYPERVISOR}" == "qemu-tdx" ]] && echo "ITA_KEY=${GH_ITA_KEY}" >> "${GITHUB_ENV}"
bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

View File

@@ -87,4 +87,4 @@ jobs:
- name: Report tests
if: always()
run: bash tests/integration/kubernetes/gha-run.sh report-tests
run: bash tests/functional/kata-deploy/gha-run.sh report-tests

View File

@@ -1,54 +0,0 @@
name: CI | Run runk tests
on:
workflow_call:
inputs:
tarball-suffix:
required: false
type: string
commit-hash:
required: false
type: string
target-branch:
required: false
type: string
default: ""
permissions: {}
jobs:
run-runk:
name: run-runk
# Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether
if: false
runs-on: ubuntu-22.04
env:
CONTAINERD_VERSION: lts
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: ${{ inputs.commit-hash }}
fetch-depth: 0
persist-credentials: false
- name: Rebase atop of the latest target branch
run: |
./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"
env:
TARGET_BRANCH: ${{ inputs.target-branch }}
- name: Install dependencies
run: bash tests/integration/runk/gha-run.sh install-dependencies
env:
GH_TOKEN: ${{ github.token }}
- name: get-kata-tarball
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
with:
name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}
path: kata-artifacts
- name: Install kata
run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts
- name: Run runk tests
run: bash tests/integration/runk/gha-run.sh run

View File

@@ -6,14 +6,21 @@ on:
permissions: {}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
stale:
name: stale
runs-on: ubuntu-22.04
permissions:
actions: write # Needed to manage caches for state persistence across runs
pull-requests: write # Needed to add/remove labels, post comments, or close PRs
steps:
- uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 # v9.1.0
with:
stale-pr-message: 'This PR has been opened without with no activity for 180 days. Comment on the issue otherwise it will be closed in 7 days'
stale-pr-message: 'This PR has been opened without activity for 180 days. Please comment on the issue or it will be closed in 7 days.'
days-before-pr-stale: 180
days-before-pr-close: 7
days-before-issue-stale: -1

41
.github/workflows/stale_issues.yaml vendored Normal file
View File

@@ -0,0 +1,41 @@
name: 'Stale issues with activity before a fixed date'
on:
schedule:
- cron: '0 0 * * *'
workflow_dispatch:
inputs:
date:
description: "Date of stale cut-off. All issues not updated since this date will be marked as stale. Format: YYYY-MM-DD e.g. 2022-10-09"
default: "2022-10-09"
required: false
type: string
permissions: {}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
stale:
name: stale
runs-on: ubuntu-24.04
permissions:
actions: write # Needed to manage caches for state persistence across runs
issues: write # Needed to add/remove labels, post comments, or close issues
steps:
- name: Calculate the age to stale
run: |
echo AGE=$(( ( $(date +%s) - $(date -d "${DATE:-2022-10-09}" +%s) ) / 86400 )) >> "$GITHUB_ENV"
env:
DATE: ${{ inputs.date }}
- name: Run the stale action
uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 # v9.1.0
with:
stale-pr-message: 'This issue has had no activity for at least ${AGE} days. Please comment on the issue, or it will be closed in 30 days'
days-before-pr-stale: -1
days-before-pr-close: -1
days-before-issue-stale: ${AGE}
days-before-issue-close: 30

View File

@@ -18,7 +18,6 @@ TOOLS =
TOOLS += agent-ctl
TOOLS += kata-ctl
TOOLS += log-parser
TOOLS += runk
TOOLS += trace-forwarder
STANDARD_TARGETS = build check clean install static-checks-build test vendor
@@ -51,7 +50,7 @@ build-and-publish-kata-debug:
bash tools/packaging/kata-debug/kata-debug-build-and-upload-payload.sh ${KATA_DEBUG_REGISTRY} ${KATA_DEBUG_TAG}
docs-serve:
docker run --rm -p 8000:8000 -v ./docs:/docs/docs -v ${PWD}/zensical.toml:/zensical.toml:ro zensical/zensical serve --config-file /zensical.toml -a 0.0.0.0:8000
docker run --rm -p 8000:8000 -v ./docs:/docs:ro -v ${PWD}/zensical.toml:/zensical.toml:ro zensical/zensical serve --config-file /zensical.toml -a 0.0.0.0:8000
.PHONY: \
all \

View File

@@ -139,7 +139,6 @@ The table below lists the remaining parts of the project:
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |
| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |

View File

@@ -1 +1 @@
3.24.0
3.25.0

9
docs/assets/favicon.svg Normal file
View File

@@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
<!-- Dark background matching the site -->
<rect width="32" height="32" rx="4" fill="#1a1a2e"/>
<!-- Kata logo scaled and centered -->
<g transform="translate(-27, -2) scale(0.75)">
<path d="M70.925 25.22L58.572 37.523 46.27 25.22l2.192-2.192 10.11 10.11 10.11-10.11zm-6.575-.2l-3.188-3.188 3.188-3.188 3.188 3.188zm-4.93-2.54l3.736 3.736-3.736 3.736zm-1.694 7.422l-8.07-8.07 8.07-8.07zm1.694-16.14l3.686 3.686-3.686 3.686zm-13.15 4.682L58.572 6.143l12.353 12.303-2.192 2.192-10.16-10.11-10.11 10.11zm26.997 0L58.572 3.752 43.878 18.446l3.387 3.387-3.387 3.387 14.694 14.694L73.266 25.22l-3.337-3.387z" fill="#f15b3e"/>
</g>
</svg>


View File

@@ -103,48 +103,8 @@ $ minikube ssh "grep -c -E 'vmx|svm' /proc/cpuinfo"
## Installing Kata Containers
You can now install the Kata Containers runtime components. You will need a local copy of some Kata
Containers components to help with this, and then use `kubectl` on the host (that Minikube has already
configured for you) to deploy them:
```sh
$ git clone https://github.com/kata-containers/kata-containers.git
$ cd kata-containers/tools/packaging/kata-deploy
$ kubectl apply -f kata-rbac/base/kata-rbac.yaml
$ kubectl apply -f kata-deploy/base/kata-deploy.yaml
```
This installs the Kata Containers components into `/opt/kata` inside the Minikube node. It can take
a few minutes for the operation to complete. You can check the installation has worked by checking
the status of the `kata-deploy` pod, which will be executing
[this script](../../tools/packaging/kata-deploy/scripts/kata-deploy.sh),
and will be executing a `sleep infinity` once it has successfully completed its work.
You can accomplish this by running the following:
```sh
$ podname=$(kubectl -n kube-system get pods -o=name | grep -F kata-deploy | sed 's?pod/??')
$ kubectl -n kube-system exec ${podname} -- ps -ef | grep -F infinity
```
> *NOTE:* This check only works for single node clusters, which is the default for Minikube.
> For multi-node clusters, the check would need to be adapted to check `kata-deploy` had
> completed on all nodes.
## Enabling Kata Containers
Now you have installed the Kata Containers components in the Minikube node. Next, you need to configure
Kubernetes `RuntimeClass` to know when to use Kata Containers to run a pod.
### Register the runtime
Now register the `kata qemu` runtime with that class. This should result in no errors:
```sh
$ cd kata-containers/tools/packaging/kata-deploy/runtimeclasses
$ kubectl apply -f kata-runtimeClasses.yaml
```
The Kata Containers installation process should be complete and enabled in the Minikube cluster.
You can now install the Kata Containers runtime components
[following the official instructions](../../tools/packaging/kata-deploy/helm-chart).
## Testing Kata Containers

View File

@@ -96,18 +96,16 @@ request.
### Kata runtime
Depending on the Kata runtime's configuration, the orchestration flow then
differs between VFIO cold-plug and hot-plug. This behavior can be
controlled via the `hot_plug_vfio` and `cold_plug_vfio` configuration
settings:
The Kata runtime for the NVIDIA GPU handlers is configured to cold-plug VFIO
devices (`cold_plug_vfio` is set to `root-port` while
`hot_plug_vfio` is set to `no-port`). Cold-plug is by design the only
supported mode for NVIDIA GPU passthrough of the NVIDIA reference stack.
- **Cold-plug scenario:**
In this scenario, the Kata runtime attaches the GPU at VM launch time, when
With cold-plug, the Kata runtime attaches the GPU at VM launch time, when
creating the pod sandbox. This happens *before* the create container request,
i.e., before the Kata runtime receives the OCI spec including device
configurations from containerd. Thus, a mechanism to acquire the device
information is required:
When the `cold_plug_vfio` configuration is enabled, the runtime calls the
information is required. This is done by the runtime calling the
`coldPlugDevices()` function during sandbox creation. In this function,
the runtime queries Kubelet's Pod Resources API to discover allocated GPU
device IDs (e.g., `nvidia.com/pgpu = [vfio0]`). The runtime formats these as
@@ -118,23 +116,8 @@ specifications and determines the device path the GPU is backed by
PCI BDF (e.g., `0000:21:00`) and cold-plugs the GPU by launching QEMU with
relevant parameters for device passthrough (e.g.,
`-device vfio-pci,host=0000:21:00.0,x-pci-vendor-id=0x10de,x-pci-device-id=0x2321,bus=rp0,iommufd=iommufdvfio-faf829f2ea7aec330`).
Cold-plug is the default setting used in the NVIDIA GPU TEE and non-TEE
shim configuration, with `cold_plug_vfio` set to `root-port` and
`hot_plug_vfio` set to `no-port`.
- **Hot-plug scenario:**
In this scenario, the Kata runtime skips the `coldPlugDevices` function, and
thus, querying the Kubelet's Pod Resources API, during sandbox creation.
Instead, when the runtime receives a create container request with the device
information contained in the OCI spec, the runtime attaches the GPU to the
running pod VM using QEMU's QMP `device_add` command. Since the Kubelet has
passed the device information via the OCI spec as part of the create container
request, querying the Pod Resources API is not necessary. The runtime then
provides the kata-agent with relevant device information - most importantly,
the device PCI BDF - indicating which devices it will need to expected to be
hot-plugged.
In both scenarios, the runtime also creates *inner runtime* CDI annotations
The runtime also creates *inner runtime* CDI annotations
which map host VFIO devices to guest GPU devices. These are annotations
intended for the kata-agent, here referred to as the inner runtime (inside the
UVM), to properly handle GPU passthrough into containers. These annotations
@@ -144,8 +127,8 @@ The annotations are key-value pairs consisting of `cdi.k8s.io/vfio<num>` keys
(derived from the host VFIO device path, e.g., `/dev/vfio/devices/vfio1`) and
`nvidia.com/gpu=<index>` values (referencing the corresponding device in the
guest CDI spec). These annotations are injected by the runtime during container
creation for both cold-plug and hot-plug scenarios via the
`annotateContainerWithVFIOMetadata` function (see `container.go`).
creation via the `annotateContainerWithVFIOMetadata` function (see
`container.go`).
We continue describing the orchestration flow inside the UVM in the next
section.
@@ -196,9 +179,8 @@ The resulting root filesystem contains the following software components:
When the Kata runtime asks QEMU to launch the VM, the UVM's Linux kernel
boots and mounts the root filesystem. After this, NVRC starts as the initial
process.
The behavior then differs between cold-plug and hot-plug scenarios:
- **Cold-plug scenario:** NVRC scans for NVIDIA GPUs on the PCI bus, loads the
NVRC scans for NVIDIA GPUs on the PCI bus, loads the
NVIDIA kernel modules, waits for driver initialization, creates the device nodes,
and initializes the GPU hardware (using the `nvidia-smi` binary). NVRC also
creates the guest-side CDI specification file (using the
@@ -209,19 +191,9 @@ for each device, specifying device nodes (e.g., `/dev/nvidia0`,
`/dev/nvidiactl`), library mounts, and environment variables to be mounted
into the container which receives the passthrough GPU.
- **Hot-plug scenario:** NVRC performs initial system setup but no GPUs are
present at VM boot time. Instead, both NVRC and kata-agent monitor for PCI
uevents to detect GPUs that are hot-plugged later during container creation.
When a GPU hot-plug event occurs, NVRC detects the uevent, identifies the GPU,
loads the appropriate drivers, and generates the CDI specifications for the
newly added GPU. Meanwhile, kata-agent uses a `PciMatcher` to wait for the
device to appear under `/sys/devices/`, ensuring the GPU is ready for container
integration.
In both scenarios, NVRC forks the Kata agent while continuing to run as the
Then, NVRC forks the Kata agent while continuing to run as the
init system. This allows NVRC to handle ongoing GPU management tasks
(including hot-plug scenarios) while kata-agent focuses on container lifecycle
management. See the
while kata-agent focuses on container lifecycle management. See the
[NVRC sources](https://github.com/NVIDIA/nvrc/blob/main/src/main.rs) for an
overview on the steps carried out by NVRC.
@@ -309,7 +281,7 @@ $ deploy_k8s
> **Note:**
>
> The NVIDIA GPU runtime classes use VFIO cold-plug by default which, as
> The NVIDIA GPU runtime classes use VFIO cold-plug which, as
> described above, requires the Kata runtime to query Kubelet's Pod Resources
> API to discover allocated GPU devices during sandbox creation. For
> Kubernetes versions **older than 1.34**, you must explicitly enable the

1
src/agent/Cargo.lock generated
View File

@@ -4305,6 +4305,7 @@ checksum = "8f50febec83f5ee1df3015341d8bd429f2d1cc62bcba7ea2076759d315084683"
name = "test-utils"
version = "0.1.0"
dependencies = [
"libc",
"nix 0.26.4",
]

View File

@@ -1588,9 +1588,11 @@ async fn join_namespaces(
cm.apply(p.pid)?;
}
if p.init && res.is_some() {
info!(logger, "set properties to cgroups!");
cm.set(res.unwrap(), false)?;
if p.init {
if let Some(resource) = res {
info!(logger, "set properties to cgroups!");
cm.set(resource, false)?;
}
}
info!(logger, "notify child to continue");

View File

@@ -10,7 +10,7 @@ use std::fs::File;
use std::sync::{Arc, Mutex};
use crossbeam_channel::{Receiver, Sender, TryRecvError};
use log::{debug, error, info, warn};
use log::{debug, info, warn};
use std::sync::mpsc;
use tracing::instrument;

View File

@@ -24,7 +24,6 @@ use dbs_legacy_devices::ConsoleHandler;
use dbs_pci::CAPABILITY_BAR_SIZE;
use dbs_utils::epoll_manager::EpollManager;
use kvm_ioctls::VmFd;
use log::error;
use virtio_queue::QueueSync;
#[cfg(feature = "dbs-virtio-devices")]

View File

@@ -75,7 +75,7 @@ pub const DEFAULT_QEMU_GUEST_KERNEL_PARAMS: &str = "";
pub const DEFAULT_QEMU_FIRMWARE_PATH: &str = "";
pub const DEFAULT_QEMU_MEMORY_SIZE_MB: u32 = 128;
pub const DEFAULT_QEMU_MEMORY_SLOTS: u32 = 128;
pub const DEFAULT_QEMU_PCI_BRIDGES: u32 = 2;
pub const DEFAULT_QEMU_PCI_BRIDGES: u32 = 1;
pub const MAX_QEMU_PCI_BRIDGES: u32 = 5;
pub const MAX_QEMU_VCPUS: u32 = 256;
pub const MIN_QEMU_MEMORY_SIZE_MB: u32 = 64;

View File

@@ -770,10 +770,11 @@ impl MachineInfo {
}
/// Huge page type for VM RAM backend
#[derive(Clone, Debug, Deserialize_enum_str, Serialize_enum_str, PartialEq, Eq)]
#[derive(Clone, Debug, Deserialize_enum_str, Serialize_enum_str, PartialEq, Eq, Default)]
pub enum HugePageType {
/// Memory allocated using hugetlbfs backend
#[serde(rename = "hugetlbfs")]
#[default]
Hugetlbfs,
/// Memory allocated using transparent huge pages
@@ -781,12 +782,6 @@ pub enum HugePageType {
THP,
}
impl Default for HugePageType {
fn default() -> Self {
Self::Hugetlbfs
}
}
/// Virtual machine memory configuration information.
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
pub struct MemoryInfo {

View File

@@ -366,8 +366,8 @@ key = "value"
let result = add_hypervisor_initdata_overrides(&encoded);
// This might fail depending on whether algorithm is required
if result.is_err() {
assert!(result.unwrap_err().to_string().contains("parse initdata"));
if let Err(error) = result {
assert!(error.to_string().contains("parse initdata"));
}
}
@@ -386,8 +386,8 @@ key = "value"
let result = add_hypervisor_initdata_overrides(&encoded);
// This might fail depending on whether version is required
if result.is_err() {
assert!(result.unwrap_err().to_string().contains("parse initdata"));
if let Err(error) = result {
assert!(error.to_string().contains("parse initdata"));
}
}
@@ -488,7 +488,7 @@ key = "value"
let valid_toml = r#"
version = "0.1.0"
algorithm = "sha384"
[data]
valid_key = "valid_value"
"#;
@@ -497,7 +497,7 @@ key = "value"
// Invalid TOML (missing version)
let invalid_toml = r#"
algorithm = "sha256"
[data]
key = "value"
"#;

View File

@@ -136,8 +136,6 @@ macro_rules! skip_loop_by_user {
#[cfg(test)]
mod tests {
use super::{skip_if_kvm_unaccessable, skip_if_not_root, skip_if_root};
#[test]
fn test_skip_if_not_root() {
skip_if_not_root!();

View File

@@ -133,6 +133,17 @@ PKGLIBEXECDIR := $(LIBEXECDIR)/$(PROJECT_DIR)
FIRMWAREPATH :=
FIRMWAREVOLUMEPATH :=
ROOTMEASURECONFIG ?= ""
KERNELTDXPARAMS += $(ROOTMEASURECONFIG)
# TDX
DEFSHAREDFS_QEMU_TDX_VIRTIOFS := none
FIRMWARETDXPATH := $(PREFIXDEPS)/share/ovmf/OVMF.inteltdx.fd
# SEV-SNP
FIRMWARE_SNP_PATH := $(PREFIXDEPS)/share/ovmf/AMDSEV.fd
FIRMWARE_VOLUME_SNP_PATH :=
##VAR DEFVCPUS=<number> Default number of vCPUs
DEFVCPUS := 1
##VAR DEFMAXVCPUS=<number> Default maximum number of vCPUs
@@ -149,7 +160,7 @@ DEFMEMSLOTS := 10
# Default maximum memory in MiB
DEFMAXMEMSZ := 0
##VAR DEFBRIDGES=<number> Default number of bridges
DEFBRIDGES := 0
DEFBRIDGES := 1
DEFENABLEANNOTATIONS := [\"enable_iommu\", \"virtio_fs_extra_args\", \"kernel_params\", \"default_vcpus\", \"default_memory\"]
DEFENABLEANNOTATIONS_COCO := [\"enable_iommu\", \"virtio_fs_extra_args\", \"kernel_params\", \"default_vcpus\", \"default_memory\", \"cc_init_data\"]
DEFDISABLEGUESTSECCOMP := true
@@ -176,6 +187,7 @@ DEFVIRTIOFSQUEUESIZE ?= 1024
# Make sure you quote args.
DEFVIRTIOFSEXTRAARGS ?= [\"--thread-pool-size=1\", \"-o\", \"announce_submounts\"]
DEFENABLEIOTHREADS := false
DEFINDEPIOTHREADS := 0
DEFENABLEVHOSTUSERSTORE := false
DEFVHOSTUSERSTOREPATH := $(PKGRUNDIR)/vhost-user
DEFVALIDVHOSTUSERSTOREPATHS := [\"$(DEFVHOSTUSERSTOREPATH)\"]
@@ -192,6 +204,8 @@ QEMUTDXQUOTEGENERATIONSERVICESOCKETPORT := 4050
DEFCREATECONTAINERTIMEOUT ?= 30
DEFCREATECONTAINERTIMEOUT_COCO ?= 60
DEFSTATICRESOURCEMGMT_COCO = true
DEFDISABLEIMAGENVDIMM ?= false
DEFPODRESOURCEAPISOCK := ""
SED = sed
CLI_DIR = cmd
@@ -244,6 +258,7 @@ ifneq (,$(DBCMD))
RUNTIMENAME := virt_container
PIPESIZE := 1
DBSHAREDFS := inline-virtio-fs
DEF_DGB_BRIDGES := 0
endif
ifneq (,$(CLHCMD))
@@ -291,6 +306,30 @@ ifneq (,$(QEMUCMD))
CONFIGS += $(CONFIG_QEMU)
CONFIG_FILE_QEMU_TDX = configuration-qemu-tdx-runtime-rs.toml
CONFIG_QEMU_TDX = config/$(CONFIG_FILE_QEMU_TDX)
CONFIG_QEMU_TDX_IN = $(CONFIG_QEMU_TDX).in
CONFIG_PATH_QEMU_TDX = $(abspath $(CONFDIR)/$(CONFIG_FILE_QEMU_TDX))
CONFIG_PATHS += $(CONFIG_PATH_QEMU_TDX)
SYSCONFIG_QEMU_TDX = $(abspath $(SYSCONFDIR)/$(CONFIG_FILE_QEMU_TDX))
SYSCONFIG_PATHS += $(SYSCONFIG_QEMU_TDX)
CONFIGS += $(CONFIG_QEMU_TDX)
CONFIG_FILE_QEMU_SNP = configuration-qemu-snp-runtime-rs.toml
CONFIG_QEMU_SNP = config/$(CONFIG_FILE_QEMU_SNP)
CONFIG_QEMU_SNP_IN = $(CONFIG_QEMU_SNP).in
CONFIG_PATH_QEMU_SNP = $(abspath $(CONFDIR)/$(CONFIG_FILE_QEMU_SNP))
CONFIG_PATHS += $(CONFIG_PATH_QEMU_SNP)
SYSCONFIG_QEMU_SNP = $(abspath $(SYSCONFDIR)/$(CONFIG_FILE_QEMU_SNP))
SYSCONFIG_PATHS += $(SYSCONFIG_QEMU_SNP)
CONFIGS += $(CONFIG_QEMU_SNP)
CONFIG_FILE_QEMU_SE = configuration-qemu-se-runtime-rs.toml
CONFIG_QEMU_SE = config/$(CONFIG_FILE_QEMU_SE)
CONFIG_QEMU_SE_IN = $(CONFIG_QEMU_SE).in
@@ -521,6 +560,7 @@ USER_VARS += DEFVIRTIOFSEXTRAARGS
USER_VARS += DEFENABLEANNOTATIONS
USER_VARS += DEFENABLEANNOTATIONS_COCO
USER_VARS += DEFENABLEIOTHREADS
USER_VARS += DEFINDEPIOTHREADS
USER_VARS += DEFSECCOMPSANDBOXPARAM
USER_VARS += DEFGUESTSELINUXLABEL
USER_VARS += DEFENABLEVHOSTUSERSTORE
@@ -541,6 +581,7 @@ USER_VARS += DEFSTATICRESOURCEMGMT_FC
USER_VARS += DEFSTATICRESOURCEMGMT_CLH
USER_VARS += DEFSTATICRESOURCEMGMT_QEMU
USER_VARS += DEFSTATICRESOURCEMGMT_COCO
USER_VARS += DEFDISABLEIMAGENVDIMM
USER_VARS += DEFBINDMOUNTS
USER_VARS += DEFVFIOMODE
USER_VARS += DEFVFIOMODE_SE
@@ -552,6 +593,7 @@ USER_VARS += HYPERVISOR_QEMU
USER_VARS += HYPERVISOR_FC
USER_VARS += PIPESIZE
USER_VARS += DBSHAREDFS
USER_VARS += DEF_DGB_BRIDGES
USER_VARS += KATA_INSTALL_GROUP
USER_VARS += KATA_INSTALL_OWNER
USER_VARS += KATA_INSTALL_CFG_PERMS
@@ -560,6 +602,13 @@ USER_VARS += DEFFORCEGUESTPULL
USER_VARS += QEMUTDXQUOTEGENERATIONSERVICESOCKETPORT
USER_VARS += DEFCREATECONTAINERTIMEOUT
USER_VARS += DEFCREATECONTAINERTIMEOUT_COCO
USER_VARS += QEMUTDXEXPERIMENTALCMD
USER_VARS += FIRMWARE_SNP_PATH
USER_VARS += FIRMWARE_VOLUME_SNP_PATH
USER_VARS += KERNELTDXPARAMS
USER_VARS += DEFSHAREDFS_QEMU_TDX_VIRTIOFS
USER_VARS += FIRMWARETDXPATH
USER_VARS += DEFPODRESOURCEAPISOCK
SOURCES := \
$(shell find . 2>&1 | grep -E '.*\.rs$$') \
@@ -597,6 +646,8 @@ GENERATED_VARS = \
VERSION \
CONFIG_DB_IN \
CONFIG_FC_IN \
CONFIG_QEMU_TDX_IN \
CONFIG_QEMU_SNP_IN \
$(USER_VARS)

View File

@@ -92,10 +92,11 @@ default_maxvcpus = @DEFMAXVCPUS_DB@
# * Up to 5 PCI bridges can be cold plugged per VM.
# This limitation could be a bug in the kernel
# Default number of bridges per SB/VM:
-# unspecified or 0 --> will be set to @DEFBRIDGES@
+# unspecified or 0 --> will be set to @DEF_DGB_BRIDGES@
# > 1 <= 5 --> will be set to the specified number
# > 5 --> will be set to 5
-default_bridges = @DEFBRIDGES@
+# As the Dragonball VMM does not support PCI hotplug, this should be set to 0.
+default_bridges = @DEF_DGB_BRIDGES@
# Reclaim guest freed memory.
# Enabling this will result in the VM balloon device having f_reporting=on set.

View File

@@ -0,0 +1,770 @@
# Copyright (c) 2017-2019 Intel Corporation
# Copyright (c) 2021 Adobe Inc.
# Copyright (c) 2024 IBM Corp.
# Copyright (c) 2025-2026 Ant Group
#
# SPDX-License-Identifier: Apache-2.0
#
# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "@CONFIG_QEMU_IN@"
# XXX: Project:
# XXX: Name: @PROJECT_NAME@
# XXX: Type: @PROJECT_TYPE@
[hypervisor.qemu]
path = "@QEMUPATH@"
kernel = "@KERNELPATH_COCO@"
initrd = "@INITRDCONFIDENTIALPATH@"
# image = "@IMAGECONFIDENTIALPATH@"
machine_type = "@MACHINETYPE@"
# Enable confidential guest support.
# Toggling that setting may trigger different hardware features, ranging
# from memory encryption to both memory and CPU-state encryption and integrity.
# The Kata Containers runtime dynamically detects the available feature set and
# aims at enabling the largest possible one, returning an error if none is
# available, or none is supported by the hypervisor.
#
# Known limitations:
# * Does not work by design:
# - CPU Hotplug
# - Memory Hotplug
# - NVDIMM devices
#
# Default false
confidential_guest = true
# Enable AMD SEV-SNP confidential guests
# When using confidential guests on AMD hardware that supports SEV-SNP,
# the following enables SEV-SNP guests. Default true
sev_snp_guest = true
# SNP 'ID Block' and 'ID Authentication Information Structure'.
# If one of snp_id_block or snp_id_auth is specified, the other must be specified, too.
# Notice that the default SNP policy of QEMU (0x30000) is used by Kata if not explicitly
# set via the 'snp_guest_policy' option. The ID Block contains the guest policy as a field, and
# it must match the value from 'snp_guest_policy' or, if unset, the QEMU default policy.
#
# 96-byte, base64-encoded blob to provide the ID Block structure for the
# SNP_LAUNCH_FINISH command defined in the SEV-SNP firmware ABI (QEMU default: all-zero)
snp_id_block = ""
# 4096-byte, base64-encoded blob to provide the ID Authentication Information Structure
# for the SNP_LAUNCH_FINISH command defined in the SEV-SNP firmware ABI (QEMU default: all-zero)
snp_id_auth = ""
# SNP Guest Policy, the POLICY parameter to the SNP_LAUNCH_START command.
# If unset, the QEMU default policy (0x30000) will be used.
# Notice that the guest policy is enforced at VM launch, and your pod VMs
# won't start at all if the policy denies it. This will be indicated by a
# 'SNP_LAUNCH_START' error.
snp_guest_policy = 196608
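#
# Illustrative note (not part of the generated defaults): 196608 is the
# decimal form of 0x30000. Per the SEV-SNP firmware ABI, this sets bit 17
# (reserved, must be one) and bit 16 (SMT allowed) of the guest policy.
# A policy that forbids SMT would, for example, keep only bit 17 set:
# snp_guest_policy = 131072   # 0x20000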
# rootfs filesystem type:
# - ext4 (default)
# - xfs
# - erofs
rootfs_type = @DEFROOTFSTYPE@
# Block storage driver to be used when the VM rootfs is backed
# by a block device. This is virtio-blk-pci, virtio-blk-mmio or nvdimm
vm_rootfs_driver = "virtio-blk-pci"
# Enable running QEMU VMM as a non-root user.
# By default the QEMU VMM runs as root. When this is set to true, the QEMU VMM process runs as
# a non-root random user. See documentation for the limitations of this mode.
rootless = false
# List of valid annotation names for the hypervisor
# Each member of the list is a regular expression, which is the base name
# of the annotation, e.g. "path" for "io.katacontainers.config.hypervisor.path"
enable_annotations = @DEFENABLEANNOTATIONS_COCO@
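#
# Illustrative example (assuming "default_memory" is part of the list above):
# a pod annotation such as
#   io.katacontainers.config.hypervisor.default_memory: "4096"
# is accepted because its base name, "default_memory", matches one of the
# regular expressions in enable_annotations.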
# List of valid annotations values for the hypervisor
# Each member of the list is a path pattern as described by glob(3).
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @QEMUVALIDHYPERVISORPATHS@
valid_hypervisor_paths = @QEMUVALIDHYPERVISORPATHS@
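#
# Illustrative example (hypothetical paths): entries are glob(3) patterns, so
# valid_hypervisor_paths = ["/usr/bin/qemu-system-x86_64", "/opt/kata/bin/qemu-*"]
# would accept a hypervisor path annotation pointing at either location.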
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = "@KERNELPARAMS@"
# Path to the firmware.
# If you want qemu to use the default firmware, leave this option empty
firmware = "@FIRMWARE_SNP_PATH@"
# Path to the firmware volume.
# firmware TDVF or OVMF can be split into FIRMWARE_VARS.fd (UEFI variables
# as configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI variables
# can be customized for each user while the UEFI code is kept the same.
firmware_volume = "@FIRMWARE_VOLUME_SNP_PATH@"
# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators = "@MACHINEACCELERATORS@"
# Qemu seccomp sandbox feature
# comma-separated list of seccomp sandbox features to control the syscall access.
# For example, `seccompsandbox= "on,obsolete=deny,spawn=deny,resourcecontrol=deny"`
# Note: "elevateprivileges=deny" doesn't work with daemonize option, so it's removed from the seccomp sandbox
# Another note: enabling this feature may reduce performance, you may enable
# /proc/sys/net/core/bpf_jit_enable to reduce the impact. see https://man7.org/linux/man-pages/man8/bpfc.8.html
# Recommended value when enabling: "on,obsolete=deny,spawn=deny,resourcecontrol=deny"
seccompsandbox = "@DEFSECCOMPSANDBOXPARAM@"
# CPU features
# comma-separated list of cpu features to pass to the cpu
# For example, `cpu_features = "pmu=off,vmx=off"`
cpu_features = "@CPUFEATURES@"
# Default number of vCPUs per SB/VM:
# unspecified or 0 --> will be set to @DEFVCPUS@
# < 0 --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores
default_vcpus = @DEFVCPUS_QEMU@
# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0 --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending on the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# hotplug functionality. For example, `default_maxvcpus = 240` specifies that up to 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example: with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what you are doing.
# NOTICE: on arm platform with gicv2 interrupt controller, set it to 8.
default_maxvcpus = @DEFMAXVCPUS_QEMU@
# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Up to 30 devices per bridge can be hot plugged.
# * Up to 5 PCI bridges can be cold plugged per VM.
# This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0 --> will be set to @DEFBRIDGES@
# > 1 <= 5 --> will be set to the specified number
# > 5 --> will be set to 5
default_bridges = @DEFBRIDGES@
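#
# Illustrative example: with the clamping rules above, default_bridges = 3
# is used as-is, while default_bridges = 7 would be clamped down to 5.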
# Default memory size in MiB for SB/VM.
# If unspecified then it will be set to @DEFMEMSZ@ MiB.
default_memory = @DEFMEMSZ@
#
# Default memory slots per SB/VM.
# If unspecified then it will be set to @DEFMEMSLOTS@.
# This determines how many times memory can be hot-added to the sandbox/VM.
memory_slots = @DEFMEMSLOTS@
# Default maximum memory in MiB per SB / VM
# unspecified or == 0 --> will be set to the actual amount of physical RAM
# > 0 <= amount of physical RAM --> will be set to the specified number
# > amount of physical RAM --> will be set to the actual amount of physical RAM
default_maxmemory = @DEFMAXMEMSZ@
# This size in MiB will be added to the hypervisor's maximum memory.
# It is the memory address space for the NVDIMM device.
# If the block storage driver (block_device_driver) is set to "nvdimm",
# memory_offset should be set to the size of the block device.
# Default 0
memory_offset = 0
# Specifies whether virtio-mem will be enabled.
# Please note that this option should be used with the command
# "echo 1 > /proc/sys/vm/overcommit_memory".
# Default false
enable_virtio_mem = false
# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons.
# This flag prevents the block device from being passed to the hypervisor,
# virtio-fs is used instead to pass the rootfs.
disable_block_device_use = @DEFDISABLEBLOCK@
# Shared file system type:
# - virtio-fs (default)
# - virtio-fs-nydus
# - none
shared_fs = "none"
# Path to vhost-user-fs daemon.
virtio_fs_daemon = "@DEFVIRTIOFSDAEMON@"
# List of valid annotations values for the virtiofs daemon
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDVIRTIOFSDAEMONPATHS@
valid_virtio_fs_daemon_paths = @DEFVALIDVIRTIOFSDAEMONPATHS@
# Default size of DAX cache in MiB
virtio_fs_cache_size = @DEFVIRTIOFSCACHESIZE@
# Default size of virtqueues
virtio_fs_queue_size = @DEFVIRTIOFSQUEUESIZE@
# Extra args for virtiofsd daemon
#
# Format example:
# ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
# Examples:
# Set virtiofsd log level to debug : ["-o", "log_level=debug"] or ["-d"]
#
# see `virtiofsd -h` for possible options.
virtio_fs_extra_args = @DEFVIRTIOFSEXTRAARGS@
# Cache mode:
#
# - never
# Metadata, data, and pathname lookup are not cached in guest. They are
# always fetched from host and any changes are immediately pushed to host.
#
# - metadata
# Metadata and pathname lookup are cached in guest and never expire.
# Data is never cached in guest.
#
# - auto
# Metadata and pathname lookup cache expires after a configured amount of
# time (default is 1 second). Data is cached while the file is open (close
# to open consistency).
#
# - always
# Metadata, data, and pathname lookup are cached in guest and never expire.
virtio_fs_cache = "@DEFVIRTIOFSCACHE@"
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is virtio-scsi, virtio-blk
# or nvdimm.
block_device_driver = "@DEFBLOCKSTORAGEDRIVER_QEMU@"
# aio is the I/O mechanism used by qemu
# Options:
#
# - threads
# Pthread based disk I/O.
#
# - native
# Native Linux I/O.
#
# - io_uring
# Linux io_uring API. This provides the fastest I/O operations on Linux, requires kernel>5.1 and
# qemu >=5.0.
block_device_aio = "@DEFBLOCKDEVICEAIO_QEMU@"
# Specifies whether cache-related options will be set for block devices.
# Default false
block_device_cache_set = false
# Specifies cache-related options for block devices.
# Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
# Default false
block_device_cache_direct = false
# Specifies cache-related options for block devices.
# Denotes whether flush requests for the device are ignored.
# Default false
block_device_cache_noflush = false
# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = @DEFENABLEIOTHREADS@
# Independent IOThreads enable IO to be processed in a separate thread. This is
# used when hotplugging a QEMU device, such as virtio-blk, that attaches to an iothread.
indep_iothreads = @DEFINDEPIOTHREADS@
# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
enable_mem_prealloc = false
# Reclaim guest freed memory.
# Enabling this will result in the VM balloon device having f_reporting=on set.
# Then the hypervisor will use it to reclaim guest freed memory.
# This is useful for reducing the amount of memory used by a VM.
# Enabling this feature may sometimes reduce the speed of memory access in
# the VM.
#
# Default false
reclaim_guest_freed_memory = false
# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre-allocation
enable_hugepages = false
# Enable vhost-user storage device, default false
# Enabling this will result in some Linux reserved block type
# major range 240-254 being chosen to represent vhost-user devices.
enable_vhost_user_store = @DEFENABLEVHOSTUSERSTORE@
# The base directory specifically used for vhost-user devices.
# Its sub-path "block" is used for block devices; "block/sockets" is
# where we expect vhost-user sockets to live; "block/devices" is where
# simulated block device nodes for vhost-user devices live.
vhost_user_store_path = "@DEFVHOSTUSERSTOREPATH@"
# Enable vIOMMU, default false
# Enabling this will result in the VM having a vIOMMU device
# This will also add the following options to the kernel's
# command line: intel_iommu=on,iommu=pt
enable_iommu = false
# Enable IOMMU_PLATFORM, default false
# Enabling this will result in the VM device having iommu_platform=on set
enable_iommu_platform = false
# List of valid annotations values for the vhost user store path
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDVHOSTUSERSTOREPATHS@
valid_vhost_user_store_paths = @DEFVALIDVHOSTUSERSTOREPATHS@
# The timeout for reconnecting on non-server spdk sockets when the remote end goes away.
# qemu will delay this many seconds and then attempt to reconnect.
# Zero disables reconnecting, and the default is zero.
vhost_user_reconnect_timeout_sec = 0
# Enable file based guest memory support. The default is an empty string which
# will disable this feature. In the case of virtio-fs, this is enabled
# automatically and '/dev/shm' is used as the backing folder.
# This option will be ignored if VM templating is enabled.
file_mem_backend = ""
# List of valid annotations values for the file_mem_backend annotation
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDFILEMEMBACKENDS@
valid_file_mem_backends = @DEFVALIDFILEMEMBACKENDS@
# -pflash can add an image file to the VM. Its arguments should be in the format
# of ["/path/to/flash0.img", "/path/to/flash1.img"]
pflashes = []
# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. Debug also enables the HMP socket.
#
# Default false
enable_debug = false
# Disable the customizations done in the runtime when it detects
# that it is running on top of a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
disable_nesting_checks = true
# If false and nvdimm is supported, an nvdimm device is used to plug the guest image.
# Otherwise a virtio-block device is used.
#
# nvdimm is not supported when `confidential_guest = true`.
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM@
# Before hot plugging a PCIe device, you need to add a pcie_root_port device.
# Use this parameter when using large PCI BAR devices, such as an Nvidia GPU.
# The value is the number of pcie_root_port devices.
# Default 0
pcie_root_port = 0
# If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
# security (vhost-net runs in ring 0) for network I/O performance.
disable_vhost_net = false
# This option allows to add an extra HMP or QMP socket when `enable_debug = true`
#
# WARNING: Anyone with access to the extra socket can take full control of
# Qemu. This is for debugging purpose only and must *NEVER* be used in
# production.
#
# Valid values are:
# - "hmp"
# - "qmp"
# - "qmp-pretty" (same as "qmp" with pretty json formatting)
#
# If set to the empty string "", no extra monitor socket is added. This is
# the default.
#extra_monitor_socket = "hmp"
#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy. If the host
# runs out of entropy, the VM's boot time will increase, possibly leading to
# startup timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
entropy_source = "@DEFENTROPYSOURCE@"
# List of valid annotations values for entropy_source
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDENTROPYSOURCES@
valid_entropy_sources = @DEFVALIDENTROPYSOURCES@
# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/kata-containers/tree/main/tools/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,poststart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered while scanning for hooks,
# but it will not abort container execution.
# Recommended value when enabling: "/usr/share/oci/hooks"
guest_hook_path = ""
#
# Use the rx Rate Limiter to control network I/O inbound bandwidth (size in bits/sec for SB/VM).
# In Qemu, we use classful qdiscs HTB(Hierarchy Token Bucket) to discipline traffic.
# The default value of 0 means the rate is unlimited.
rx_rate_limiter_max_rate = 0
# Use the tx Rate Limiter to control network I/O outbound bandwidth (size in bits/sec for SB/VM).
# In Qemu, we use classful qdiscs HTB(Hierarchy Token Bucket) and ifb(Intermediate Functional Block)
# to discipline traffic.
# The default value of 0 means the rate is unlimited.
tx_rate_limiter_max_rate = 0
# Set where to save the guest memory dump file.
# If set, when the GUEST_PANICKED event occurs,
# guest memory will be dumped to the host filesystem under guest_memory_dump_path.
# This directory will be created automatically if it does not exist.
#
# The dumped file (also called vmcore) can be processed with crash or gdb.
#
# WARNING:
# Dumping the guest's memory can take a long time depending on the amount of
# guest memory, and it can use a lot of disk space.
# Recommended value when enabling: "/var/crash/kata"
guest_memory_dump_path = ""
# Whether to enable paging.
# Basically, if you want to use "gdb" rather than "crash",
# or need the guest-virtual addresses in the ELF vmcore,
# then you should enable paging.
#
# See: https://www.qemu.org/docs/master/qemu-qmp-ref.html#Dump-guest-memory for details
guest_memory_dump_paging = false
# Enable swap in the guest. Default false.
# When enable_guest_swap is enabled, insert a raw file to the guest as the swap device
# if the swappiness of a container (set by annotation "io.katacontainers.container.resource.swappiness")
# is bigger than 0.
# The size of the swap device should be
# swap_in_bytes (set by annotation "io.katacontainers.container.resource.swap_in_bytes") - memory_limit_in_bytes.
# If swap_in_bytes is not set, the size should be memory_limit_in_bytes.
# If swap_in_bytes and memory_limit_in_bytes are not set, the size should
# be default_memory.
enable_guest_swap = false
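#
# Illustrative sizing example (hypothetical values): with
# swap_in_bytes = 2 GiB and memory_limit_in_bytes = 1 GiB, the swap device
# would be 2 GiB - 1 GiB = 1 GiB; with only memory_limit_in_bytes = 1 GiB
# set, the swap device would be 1 GiB.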
# use legacy serial for guest console if available and implemented for architecture. Default false
use_legacy_serial = false
# disable applying SELinux on the VMM process (default false)
disable_selinux = @DEFDISABLESELINUX@
# disable applying SELinux on the container process
# If set to false, the type `container_t` is applied to the container process by default.
# Note: To enable guest SELinux, the guest rootfs must be a CentOS rootfs created and built
# with `SELINUX=yes`.
# (default: true)
disable_guest_selinux = @DEFDISABLEGUESTSELINUX@
[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speed up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Note: Requires "initrd=" to be set ("image=" is not supported).
#
# Default false
enable_template = false
# Specifies the path of template.
#
# Default "/run/vc/vm/template"
template_path = "/run/vc/vm/template"
# The number of caches of VMCache:
# unspecified or == 0 --> VMCache is disabled
# > 0 --> will be set to the specified number
#
# VMCache is a function that creates VMs as caches before they are used.
# It helps speed up new container creation.
# The function consists of a server and some clients communicating
# through Unix socket. The protocol is gRPC in protocols/cache/cache.proto.
# The VMCache server will create some VMs and cache them by factory cache.
# It will convert the VM to gRPC format and transport it when it gets
# a request from a client.
# Factory grpccache is the VMCache client. It will request gRPC format
# VM and convert it back to a VM. If VMCache function is enabled,
# kata-runtime will request VM from factory grpccache when it creates
# a new sandbox.
#
# Default 0
vm_cache_number = 0
# Specify the address of the Unix socket that is used by VMCache.
#
# Default /var/run/kata-containers/cache.sock
vm_cache_endpoint = "/var/run/kata-containers/cache.sock"
[agent.@PROJECT_TYPE@]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
enable_debug = false
# Enable agent tracing.
#
# If enabled, the agent will generate OpenTelemetry trace spans.
#
# Notes:
#
# - If the runtime also has tracing enabled, the agent spans will be
# associated with the appropriate runtime parent span.
# - If enabled, the runtime will wait for the container to shutdown,
# increasing the container shutdown time slightly.
#
# (default: disabled)
enable_tracing = false
# Comma separated list of kernel modules and their parameters.
# These modules will be loaded in the guest kernel using modprobe(8).
# The following example can be used to load two kernel modules with parameters
# - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
# The first word is considered as the module name and the rest as its parameters.
# Container will not be started when:
# * A kernel module is specified and the modprobe command is not installed in the guest
# or it fails to load the module.
# * The module is not available in the guest or it doesn't meet the guest kernel
# requirements, like architecture and version.
#
kernel_modules = []
# Enable debug console.
# If enabled, the user can connect to the guest OS running inside the hypervisor
# through the "kata-runtime exec <sandbox-id>" command
debug_console_enabled = false
# Agent dial timeout in milliseconds.
# (default: 10)
dial_timeout_ms = 10
# Agent reconnect timeout in milliseconds.
# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
# If you find the pod cannot connect to the agent when starting, please
# consider increasing this value to increase the retry count.
# Do not change the value of dial_timeout_ms unless you know
# what you are doing.
# (default: 3000)
reconnect_timeout_ms = 3000
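#
# Illustrative example: with the defaults above, the agent connection is
# retried reconnect_timeout_ms / dial_timeout_ms = 3000 / 10 = 300 times.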
# Create Container Request Timeout
# This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest.
# It's also used to ensure that workloads, especially those involving large image pulls within the guest,
# have sufficient time to complete.
#
# Effective Timeout Determination:
# The effective timeout for a CreateContainerRequest is determined by taking the minimum of the following two values:
# - create_container_timeout: The timeout value configured for creating containers (default: 30,000 milliseconds).
# - runtime-request-timeout: The timeout value specified in the Kubelet configuration described as the link below:
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout)
# Defaults to @DEFCREATECONTAINERTIMEOUT_COCO@ second(s)
create_container_timeout = @DEFCREATECONTAINERTIMEOUT_COCO@
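#
# Illustrative example (assuming the kubelet default runtime-request-timeout
# of 2m): with create_container_timeout = 60, the effective timeout is
# min(60s, 120s) = 60s.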
[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
enable_debug = false
#
# Internetworking model
# Determines how the VM should be connected to
# the container network interface
# Options:
#
# - macvtap
# Used when the Container network interface can be bridged using
# macvtap.
#
# - none
# Used with a customized network. Only creates a tap device. No veth pair.
#
# - tcfilter
# Uses tc filter rules to redirect traffic from the network interface
# provided by plugin to a tap interface connected to the VM.
#
internetworking_model="@DEFNETWORKMODEL_QEMU@"
name="@RUNTIMENAME@"
hypervisor_name="@HYPERVISOR_QEMU@"
agent_name="@PROJECT_TYPE@"
# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp = @DEFDISABLEGUESTSECCOMP@
# vCPUs pinning settings
# if enabled, each vCPU thread will be scheduled to a fixed CPU
# qualified condition: num(vCPU threads) == num(CPUs in sandbox's CPUSet)
enable_vcpus_pinning = false
# Apply a custom SELinux security policy to the container process inside the VM.
# This is used when you want to apply a type other than the default `container_t`,
# so general users should not uncomment and apply it.
# (format: "user:role:type")
# Note: You cannot specify MCS policy with the label because the sensitivity levels and
# categories are determined automatically by high-level container runtimes such as containerd.
guest_selinux_label = "@DEFGUESTSELINUXLABEL@"
# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
enable_tracing = false
# Set the full url to the Jaeger HTTP Thrift collector.
# The default if not set will be "http://localhost:14268/api/traces"
jaeger_endpoint = ""
# Sets the username to be used if basic auth is required for Jaeger.
jaeger_user = ""
# Sets the password to be used if basic auth is required for Jaeger.
jaeger_password = ""
# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have potential impacts on your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# (default: false)
disable_new_netns = false
# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# The sandbox cgroup is constrained if there is no container type annotation.
# See: https://pkg.go.dev/github.com/kata-containers/kata-containers/src/runtime/virtcontainers#ContainerType
sandbox_cgroup_only = @DEFSANDBOXCGROUPONLY_QEMU@
# If enabled, the runtime will attempt to determine appropriate sandbox size (memory, CPU) before booting the virtual machine. In
# this case, the runtime will not dynamically update the amount of memory and CPU in the virtual machine. This is generally helpful
# when a hardware architecture or hypervisor solution is used that does not support CPU and/or memory hotplug.
# Compatibility for determining appropriate sandbox (VM) size:
# - When running with pods, sandbox sizing information will only be available if using Kubernetes >= 1.23 and containerd >= 1.6. CRI-O
# does not yet support sandbox sizing annotations.
# - When running single containers using a tool like ctr, container sizing information will be available.
static_sandbox_resource_mgmt = @DEFSTATICRESOURCEMGMT_COCO@
# If specified, sandbox_bind_mounts identifies host paths to be mounted (ro) into the sandbox's shared path.
# This is only valid if filesystem sharing is utilized. The provided path(s) will be bind-mounted into the shared fs directory.
# If defaults are utilized, these mounts should be available in the guest at `/run/kata-containers/shared/containers/sandbox-mounts`
# These will not be exposed to the container workloads, and are only provided for potential guest services.
sandbox_bind_mounts = @DEFBINDMOUNTS@
# VFIO Mode
# Determines how VFIO devices should be presented to the container.
# Options:
#
# - vfio
# Matches behaviour of OCI runtimes (e.g. runc) as much as
# possible. VFIO devices will appear in the container as VFIO
# character devices under /dev/vfio. The exact names may differ
# from the host (they need to match the VM's IOMMU group numbers
# rather than the host's)
#
# - guest-kernel
# This is a Kata-specific behaviour that's useful in certain cases.
# The VFIO device is managed by whatever driver in the VM kernel
# claims it. This means it will appear as one or more device nodes
# or network interfaces depending on the nature of the device.
# Using this mode requires specially built workloads that know how
# to locate the relevant device interfaces within the VM.
#
vfio_mode = "@DEFVFIOMODE@"
# If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will
# be created on the host and shared via virtio-fs. This is potentially slower, but allows sharing of files from host to guest.
disable_guest_empty_dir = @DEFDISABLEGUESTEMPTYDIR@
# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,
# they may break compatibility, and are prepared for a big version bump.
# Supported experimental features:
# for example:
# experimental=["force_guest_pull"]
# which enables force_guest_pull mode in CoCo scenarios.
# (default: [])
experimental = @DEFAULTEXPFEATURES@
# If enabled, user can run pprof tools with shim v2 process through kata-monitor.
# (default: false)
enable_pprof = false
# Base directory of directly attachable network config.
# Network devices for VM-based containers are allowed to be placed in the
# host netns to eliminate as many hops as possible, which is what we
# called a "Directly Attachable Network". The config, set by special CNI
# plugins, is used to tell the Kata containers what devices are attached
# to the hypervisor.
# (default: /run/kata-containers/dans)
dan_conf = "@DEFDANCONF@"
# pod_resource_api_sock specifies the unix socket for the Kubelet's
# PodResource API endpoint. If empty, kubernetes based cold plug
# will not be attempted. In order for this feature to work, the
# KubeletPodResourcesGet featureGate must be enabled in Kubelet,
# if using Kubelet older than 1.34.
#
# The pod resource API's socket is relative to the Kubelet's root-dir,
# which is defined by the cluster admin, and its location is:
# ${KubeletRootDir}/pod-resources/kubelet.sock
#
# cold_plug_vfio(see hypervisor config) acts as a feature gate:
# cold_plug_vfio = no_port (default) => no cold plug
# cold_plug_vfio != no_port AND pod_resource_api_sock = "" => need
# explicit CDI annotation for cold plug (applies mainly
# to non-k8s cases)
# cold_plug_vfio != no_port AND pod_resource_api_sock != "" => kubelet
# based cold plug.
pod_resource_api_sock = "@DEFPODRESOURCEAPISOCK@"
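#
# Illustrative example (assuming the common kubelet root-dir /var/lib/kubelet):
# pod_resource_api_sock = "/var/lib/kubelet/pod-resources/kubelet.sock"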

View File

@@ -0,0 +1,746 @@
# Copyright (c) 2017-2019 Intel Corporation
# Copyright (c) 2021 Adobe Inc.
# Copyright (c) 2025-2026 Ant Group
#
# SPDX-License-Identifier: Apache-2.0
#
# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "@CONFIG_QEMU_IN@"
# XXX: Project:
# XXX: Name: @PROJECT_NAME@
# XXX: Type: @PROJECT_TYPE@
[hypervisor.qemu]
path = "@QEMUPATH@"
kernel = "@KERNELPATH_COCO@"
image = "@IMAGECONFIDENTIALPATH@"
# initrd = "@INITRDPATH@"
machine_type = "@MACHINETYPE@"
tdx_quote_generation_service_socket_port = @QEMUTDXQUOTEGENERATIONSERVICESOCKETPORT@
# rootfs filesystem type:
# - ext4 (default)
# - xfs
# - erofs
rootfs_type = @DEFROOTFSTYPE@
# Block storage driver to be used when the VM rootfs is backed
# by a block device. This is virtio-blk-pci, virtio-blk-mmio or nvdimm
vm_rootfs_driver = "virtio-blk-pci"
# Enable confidential guest support.
# Toggling that setting may trigger different hardware features, ranging
# from memory encryption to both memory and CPU-state encryption and integrity.
# The Kata Containers runtime dynamically detects the available feature set and
# aims at enabling the largest possible one, returning an error if none is
# available, or none is supported by the hypervisor.
#
# Known limitations:
# * Does not work by design:
# - CPU Hotplug
# - Memory Hotplug
# - NVDIMM devices
#
# Default false
confidential_guest = true
# Enable running QEMU VMM as a non-root user.
# By default the QEMU VMM runs as root. When this is set to true, the QEMU VMM process runs as
# a non-root random user. See documentation for the limitations of this mode.
rootless = false
# List of valid annotation names for the hypervisor
# Each member of the list is a regular expression, which is the base name
# of the annotation, e.g. "path" for "io.katacontainers.config.hypervisor.path"
enable_annotations = @DEFENABLEANNOTATIONS_COCO@
# List of valid annotations values for the hypervisor
# Each member of the list is a path pattern as described by glob(3).
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @QEMUVALIDHYPERVISORPATHS@
valid_hypervisor_paths = @QEMUVALIDHYPERVISORPATHS@
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = "@KERNELTDXPARAMS@"
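#
# Illustrative example (hypothetical extra parameter): appending to the
# generated defaults, e.g.
# kernel_params = "@KERNELTDXPARAMS@ agent.log=debug"
# would raise the guest agent log level while keeping the defaults intact.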
# Path to the firmware.
# If you want qemu to use the default firmware, leave this option empty
firmware = "@FIRMWARETDXPATH@"
# Path to the firmware volume.
# firmware TDVF or OVMF can be split into FIRMWARE_VARS.fd (UEFI variables
# as configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI variables
# can be customized for each user while the UEFI code is kept the same.
firmware_volume = "@FIRMWAREVOLUMEPATH@"
# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators = "@MACHINEACCELERATORS@"
# Qemu seccomp sandbox feature
# comma-separated list of seccomp sandbox features to control the syscall access.
# For example, `seccompsandbox= "on,obsolete=deny,spawn=deny,resourcecontrol=deny"`
# Note: "elevateprivileges=deny" doesn't work with daemonize option, so it's removed from the seccomp sandbox
# Another note: enabling this feature may reduce performance, you may enable
# /proc/sys/net/core/bpf_jit_enable to reduce the impact. see https://man7.org/linux/man-pages/man8/bpfc.8.html
# Recommended value when enabling: "on,obsolete=deny,spawn=deny,resourcecontrol=deny"
seccompsandbox = "@DEFSECCOMPSANDBOXPARAM@"
# CPU features
# comma-separated list of cpu features to pass to the cpu
# For example, `cpu_features = "pmu=off,vmx=off"`
cpu_features = "@CPUFEATURES@"
# Default number of vCPUs per SB/VM:
# unspecified or 0 --> will be set to @DEFVCPUS@
# < 0 --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores
default_vcpus = 1
# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0 --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores --> will be set to the actual number of physical cores or to the maximum number
# of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending on the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# hotplug functionality. For example, `default_maxvcpus = 240` specifies that up to 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example: with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what you are doing.
# NOTICE: on arm platform with gicv2 interrupt controller, set it to 8.
default_maxvcpus = @DEFMAXVCPUS@
# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Up to 30 devices per bridge can be hot plugged.
# * Up to 5 PCI bridges can be cold plugged per VM.
# This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0 --> will be set to @DEFBRIDGES@
# > 1 <= 5 --> will be set to the specified number
# > 5 --> will be set to 5
default_bridges = @DEFBRIDGES@
# Default memory size in MiB for SB/VM.
# If unspecified then it will be set to @DEFMEMSZ@ MiB.
default_memory = @DEFMEMSZ@
#
# Default memory slots per SB/VM.
# If unspecified then it will be set to @DEFMEMSLOTS@.
# This determines how many times memory can be hot-added to the sandbox/VM.
memory_slots = @DEFMEMSLOTS@
# Default maximum memory in MiB per SB / VM
# unspecified or == 0 --> will be set to the actual amount of physical RAM
# > 0 <= amount of physical RAM --> will be set to the specified number
# > amount of physical RAM --> will be set to the actual amount of physical RAM
default_maxmemory = @DEFMAXMEMSZ@
# This size in MiB will be added to the hypervisor's maximum memory.
# It is the memory address space for the NVDIMM device.
# If the block storage driver (block_device_driver) is set to "nvdimm",
# memory_offset should be set to the size of the block device.
# Default 0
memory_offset = 0
# Specifies whether virtio-mem will be enabled.
# Please note that this option should be used with the command
# "echo 1 > /proc/sys/vm/overcommit_memory".
# Default false
enable_virtio_mem = false
# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons.
# This flag prevents the block device from being passed to the hypervisor,
# virtio-fs is used instead to pass the rootfs.
disable_block_device_use = @DEFDISABLEBLOCK@
# Shared file system type:
# - virtio-fs (default)
# - virtio-fs-nydus
# - none
shared_fs = "@DEFSHAREDFS_QEMU_TDX_VIRTIOFS@"
# Path to vhost-user-fs daemon.
virtio_fs_daemon = "@DEFVIRTIOFSDAEMON@"
# List of valid annotations values for the virtiofs daemon
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDVIRTIOFSDAEMONPATHS@
valid_virtio_fs_daemon_paths = @DEFVALIDVIRTIOFSDAEMONPATHS@
# Default size of DAX cache in MiB
virtio_fs_cache_size = @DEFVIRTIOFSCACHESIZE@
# Default size of virtqueues
virtio_fs_queue_size = @DEFVIRTIOFSQUEUESIZE@
# Extra args for virtiofsd daemon
#
# Format example:
# ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
# Examples:
# Set virtiofsd log level to debug : ["-o", "log_level=debug"] or ["-d"]
#
# see `virtiofsd -h` for possible options.
virtio_fs_extra_args = @DEFVIRTIOFSEXTRAARGS@
# Cache mode:
#
# - never
# Metadata, data, and pathname lookup are not cached in guest. They are
# always fetched from host and any changes are immediately pushed to host.
#
# - metadata
# Metadata and pathname lookup are cached in guest and never expire.
# Data is never cached in guest.
#
# - auto
# Metadata and pathname lookup cache expires after a configured amount of
# time (default is 1 second). Data is cached while the file is open (close
# to open consistency).
#
# - always
# Metadata, data, and pathname lookup are cached in guest and never expire.
virtio_fs_cache = "@DEFVIRTIOFSCACHE@"
# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is virtio-scsi, virtio-blk
# or nvdimm.
block_device_driver = "@DEFBLOCKSTORAGEDRIVER_QEMU@"
# aio is the I/O mechanism used by qemu
# Options:
#
# - threads
# Pthread based disk I/O.
#
# - native
# Native Linux I/O.
#
# - io_uring
# Linux io_uring API. This provides the fastest I/O operations on Linux, requires kernel>5.1 and
# qemu >=5.0.
block_device_aio = "@DEFBLOCKDEVICEAIO_QEMU@"
# Specifies whether cache-related options will be set for block devices.
# Default false
block_device_cache_set = false
# Specifies cache-related options for block devices.
# Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
# Default false
block_device_cache_direct = false
# Specifies cache-related options for block devices.
# Denotes whether flush requests for the device are ignored.
# Default false
block_device_cache_noflush = false
# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently implemented
# for virtio-scsi and virtio-blk.
#
enable_iothreads = @DEFENABLEIOTHREADS@
# Independent IOThreads enable IO to be processed in a separate thread. This is
# used when hotplugging a QEMU device, such as virtio-blk, that attaches to an iothread.
indep_iothreads = @DEFINDEPIOTHREADS@
# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
enable_mem_prealloc = false
# Reclaim guest freed memory.
# Enabling this will result in the VM balloon device having f_reporting=on set.
# Then the hypervisor will use it to reclaim guest freed memory.
# This is useful for reducing the amount of memory used by a VM.
# Enabling this feature may sometimes reduce the speed of memory access in
# the VM.
#
# Default false
reclaim_guest_freed_memory = false
# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre-allocation
enable_hugepages = false
# Enable vhost-user storage device, default false
# Enabling this will result in some Linux reserved block type
# major range 240-254 being chosen to represent vhost-user devices.
enable_vhost_user_store = @DEFENABLEVHOSTUSERSTORE@
# The base directory specifically used for vhost-user devices.
# Its sub-path "block" is used for block devices; "block/sockets" is
# where we expect vhost-user sockets to live; "block/devices" is where
# simulated block device nodes for vhost-user devices live.
vhost_user_store_path = "@DEFVHOSTUSERSTOREPATH@"
# Enable vIOMMU, default false
# Enabling this will result in the VM having a vIOMMU device
# This will also add the following options to the kernel's
# command line: intel_iommu=on,iommu=pt
enable_iommu = false
# Enable IOMMU_PLATFORM, default false
# Enabling this will result in the VM device having iommu_platform=on set
enable_iommu_platform = false
# List of valid annotations values for the vhost user store path
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDVHOSTUSERSTOREPATHS@
valid_vhost_user_store_paths = @DEFVALIDVHOSTUSERSTOREPATHS@
# The timeout for reconnecting on non-server spdk sockets when the remote end goes away.
# qemu will delay this many seconds and then attempt to reconnect.
# Zero disables reconnecting, and the default is zero.
vhost_user_reconnect_timeout_sec = 0
# Enable file based guest memory support. The default is an empty string which
# will disable this feature. In the case of virtio-fs, this is enabled
# automatically and '/dev/shm' is used as the backing folder.
# This option will be ignored if VM templating is enabled.
file_mem_backend = "@DEFFILEMEMBACKEND@"
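#
# Illustrative example: file_mem_backend = "/dev/shm" would back guest RAM
# with files under /dev/shm, mirroring what virtio-fs sets up automatically.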
# List of valid annotations values for the file_mem_backend annotation
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDFILEMEMBACKENDS@
valid_file_mem_backends = @DEFVALIDFILEMEMBACKENDS@
# -pflash can add an image file to the VM. Its arguments should be in the format
# of ["/path/to/flash0.img", "/path/to/flash1.img"]
pflashes = []
# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. Debug also enables the HMP socket.
#
# Default false
enable_debug = false
# This option allows to add an extra HMP or QMP socket when `enable_debug = true`
#
# WARNING: Anyone with access to the extra socket can take full control of
# Qemu. This is for debugging purpose only and must *NEVER* be used in
# production.
#
# Valid values are:
# - "hmp"
# - "qmp"
# - "qmp-pretty" (same as "qmp" with pretty json formatting)
#
# If set to the empty string "", no extra monitor socket is added. This is
# the default.
extra_monitor_socket = ""
# Disable the customizations done in the runtime when it detects
# that it is running on top of a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
disable_nesting_checks = false
# If false and nvdimm is supported, an nvdimm device is used to plug the guest image.
# Otherwise a virtio-block device is used.
#
# nvdimm is not supported when `confidential_guest = true`.
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM@
# Before hot plugging a PCIe device, you need to add a pcie_root_port device.
# Use this parameter when using large PCI BAR devices, such as an Nvidia GPU.
# The value is the number of pcie_root_port devices.
# Default 0
pcie_root_port = 0
# If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
# security (vhost-net runs in ring 0) for network I/O performance.
disable_vhost_net = false
#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy. If the host
# runs out of entropy, the VM's boot time will increase, possibly leading to
# startup timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
entropy_source = "@DEFENTROPYSOURCE@"
# List of valid annotations values for entropy_source
# The default if not set is empty (all annotations rejected.)
# Your distribution recommends: @DEFVALIDENTROPYSOURCES@
valid_entropy_sources = @DEFVALIDENTROPYSOURCES@
# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/kata-containers/tree/main/tools/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,poststart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered while scanning for hooks,
# but it will not abort container execution.
# Recommended value when enabling: "/usr/share/oci/hooks"
guest_hook_path = ""
#
# Use the rx Rate Limiter to control network I/O inbound bandwidth (size in bits/sec for SB/VM).
# In Qemu, we use classful qdiscs HTB(Hierarchy Token Bucket) to discipline traffic.
# The default value of 0 means the rate is unlimited.
rx_rate_limiter_max_rate = 0
# Use the tx Rate Limiter to control network I/O outbound bandwidth (size in bits/sec for SB/VM).
# In Qemu, we use classful qdiscs HTB(Hierarchy Token Bucket) and ifb(Intermediate Functional Block)
# to discipline traffic.
# The default value of 0 means the rate is unlimited.
tx_rate_limiter_max_rate = 0
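#
# Illustrative example (rates are in bits/sec): limiting the VM to roughly
# 10 Mbit/s in each direction would be
# rx_rate_limiter_max_rate = 10000000
# tx_rate_limiter_max_rate = 10000000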
# Set where to save the guest memory dump file.
# If set, when the GUEST_PANICKED event occurs,
# guest memory will be dumped to the host filesystem under guest_memory_dump_path.
# This directory will be created automatically if it does not exist.
#
# The dumped file (also called vmcore) can be processed with crash or gdb.
#
# WARNING:
# Dumping the guest's memory can take a long time depending on the amount of
# guest memory, and it can use a lot of disk space.
# Recommended value when enabling: "/var/crash/kata"
guest_memory_dump_path = ""
# Whether to enable paging.
# Basically, if you want to use "gdb" rather than "crash",
# or need the guest-virtual addresses in the ELF vmcore,
# then you should enable paging.
#
# See: https://www.qemu.org/docs/master/qemu-qmp-ref.html#Dump-guest-memory for details
guest_memory_dump_paging = false
# Enable swap in the guest. Default false.
# When enable_guest_swap is enabled, insert a raw file to the guest as the swap device
# if the swappiness of a container (set by annotation "io.katacontainers.container.resource.swappiness")
# is bigger than 0.
# The size of the swap device should be
# swap_in_bytes (set by annotation "io.katacontainers.container.resource.swap_in_bytes") - memory_limit_in_bytes.
# If swap_in_bytes is not set, the size should be memory_limit_in_bytes.
# If swap_in_bytes and memory_limit_in_bytes are not set, the size should
# be default_memory.
enable_guest_swap = false
# use legacy serial for guest console if available and implemented for architecture. Default false
use_legacy_serial = false
# disable applying SELinux on the VMM process (default false)
disable_selinux = @DEFDISABLESELINUX@
# disable applying SELinux on the container process
# If set to false, the type `container_t` is applied to the container process by default.
# Note: To enable guest SELinux, the guest rootfs must be a CentOS rootfs created and built
# with `SELINUX=yes`.
# (default: true)
disable_guest_selinux = @DEFDISABLEGUESTSELINUX@
[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speed up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Note: Requires "initrd=" to be set ("image=" is not supported).
#
# Default false
enable_template = false
# Specifies the path of template.
#
# Default "/run/vc/vm/template"
template_path = "/run/vc/vm/template"
# The number of caches of VMCache:
# unspecified or == 0 --> VMCache is disabled
# > 0 --> will be set to the specified number
#
# VMCache is a function that creates VMs as caches before they are used.
# It helps speed up new container creation.
# The function consists of a server and some clients communicating
# through Unix socket. The protocol is gRPC in protocols/cache/cache.proto.
# The VMCache server will create some VMs and cache them by factory cache.
# It will convert the VM to gRPC format and transport it when it gets
# a request from a client.
# Factory grpccache is the VMCache client. It will request gRPC format
# VM and convert it back to a VM. If VMCache function is enabled,
# kata-runtime will request VM from factory grpccache when it creates
# a new sandbox.
#
# Default 0
vm_cache_number = 0
# Specify the address of the Unix socket that is used by VMCache.
#
# Default /var/run/kata-containers/cache.sock
vm_cache_endpoint = "/var/run/kata-containers/cache.sock"
[agent.@PROJECT_TYPE@]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
enable_debug = false
# Enable agent tracing.
#
# If enabled, the agent will generate OpenTelemetry trace spans.
#
# Notes:
#
# - If the runtime also has tracing enabled, the agent spans will be
# associated with the appropriate runtime parent span.
# - If enabled, the runtime will wait for the container to shutdown,
# increasing the container shutdown time slightly.
#
# (default: disabled)
enable_tracing = false
# Comma separated list of kernel modules and their parameters.
# These modules will be loaded in the guest kernel using modprobe(8).
# The following example can be used to load two kernel modules with parameters
# - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
# The first word is considered as the module name and the rest as its parameters.
# Container will not be started when:
# * A kernel module is specified and the modprobe command is not installed in the guest
# or it fails to load the module.
# * The module is not available in the guest or it doesn't meet the guest kernel
# requirements, like architecture and version.
#
kernel_modules = []
# Enable debug console.
# If enabled, the user can connect to the guest OS running inside the hypervisor
# through the "kata-runtime exec <sandbox-id>" command
debug_console_enabled = false
# Agent dial timeout in milliseconds.
# (default: 10)
dial_timeout_ms = 10
# Agent reconnect timeout in milliseconds.
# Retry times = reconnect_timeout_ms / dial_timeout_ms (default: 300)
# If you find the pod cannot connect to the agent when starting, consider
# increasing this value to increase the number of retries.
# Avoid changing the value of dial_timeout_ms unless you know what you
# are doing.
# (default: 3000)
reconnect_timeout_ms = 3000
# Create Container Request Timeout
# This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest.
# It's also used to ensure that workloads, especially those involving large image pulls within the guest,
# have sufficient time to complete.
#
# Effective Timeout Determination:
# The effective timeout for a CreateContainerRequest is determined by taking the minimum of the following two values:
# - create_container_timeout: The timeout value configured for creating containers (default: 30,000 milliseconds).
# - runtime-request-timeout: The timeout value specified in the Kubelet configuration, described at the link below:
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout)
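# For example (illustrative): with create_container_timeout = 30000 (30,000 ms,
# per the default above) and a Kubelet runtime-request-timeout of 2m, the
# effective timeout is 30,000 ms, the smaller of the two.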
# Defaults to @DEFCREATECONTAINERTIMEOUT_COCO@ second(s)
create_container_timeout = @DEFCREATECONTAINERTIMEOUT_COCO@
[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
enable_debug = false
#
# Internetworking model
# Determines how the VM should be connected to the
# container network interface
# Options:
#
# - macvtap
# Used when the Container network interface can be bridged using
# macvtap.
#
# - none
# Used for custom networking. Only creates a tap device; no veth pair.
#
# - tcfilter
# Uses tc filter rules to redirect traffic from the network interface
# provided by plugin to a tap interface connected to the VM.
#
internetworking_model = "@DEFNETWORKMODEL_QEMU@"
name="@RUNTIMENAME@"
hypervisor_name="@HYPERVISOR_QEMU@"
agent_name="@PROJECT_TYPE@"
# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp = @DEFDISABLEGUESTSECCOMP@
# vCPU pinning settings
# If enabled, each vCPU thread will be scheduled to a fixed CPU.
# Qualifying condition: num(vCPU threads) == num(CPUs in sandbox's CPUSet)
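# For example (illustrative): a sandbox with 4 vCPU threads and a CPUSet of
# "0-3" (4 CPUs) meets the condition, so each vCPU thread is pinned to one
# of CPUs 0-3.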
enable_vcpus_pinning = false
# Apply a custom SELinux security policy to the container process inside the VM.
# This is used when you want to apply a type other than the default `container_t`,
# so general users should not need to uncomment and apply it.
# (format: "user:role:type")
# Note: You cannot specify MCS policy with the label because the sensitivity levels and
# categories are determined automatically by high-level container runtimes such as containerd.
# Example value when enabling: "system_u:system_r:container_t"
guest_selinux_label = "@DEFGUESTSELINUXLABEL@"
# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
enable_tracing = false
# Set the full url to the Jaeger HTTP Thrift collector.
# The default if not set will be "http://localhost:14268/api/traces"
jaeger_endpoint = ""
# Sets the username to be used if basic auth is required for Jaeger.
jaeger_user = ""
# Sets the password to be used if basic auth is required for Jaeger.
jaeger_password = ""
# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# (default: false)
disable_new_netns = false
# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# The sandbox cgroup is constrained if there is no container type annotation.
# See: https://pkg.go.dev/github.com/kata-containers/kata-containers/src/runtime/virtcontainers#ContainerType
sandbox_cgroup_only = @DEFSANDBOXCGROUPONLY_QEMU@
# If enabled, the runtime will attempt to determine appropriate sandbox size (memory, CPU) before booting the virtual machine. In
# this case, the runtime will not dynamically update the amount of memory and CPU in the virtual machine. This is generally helpful
# when a hardware architecture or hypervisor solution is utilized which does not support CPU and/or memory hotplug.
# Compatibility for determining appropriate sandbox (VM) size:
# - When running with pods, sandbox sizing information will only be available if using Kubernetes >= 1.23 and containerd >= 1.6. CRI-O
# does not yet support sandbox sizing annotations.
# - When running single containers using a tool like ctr, container sizing information will be available.
static_sandbox_resource_mgmt = @DEFSTATICRESOURCEMGMT_COCO@
# If specified, sandbox_bind_mounts identifies host paths to be mounted (ro) into the sandbox's shared path.
# This is only valid if filesystem sharing is utilized. The provided path(s) will be bind-mounted into the shared fs directory.
# If defaults are utilized, these mounts should be available in the guest at `/run/kata-containers/shared/containers/sandbox-mounts`.
# These will not be exposed to the container workloads, and are only provided for potential guest services.
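# For example (illustrative, with a hypothetical host path):
#   sandbox_bind_mounts = ["/usr/share/my-guest-data"]
# would make that path available read-only in the guest under the
# sandbox-mounts directory noted above.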
sandbox_bind_mounts = @DEFBINDMOUNTS@
# VFIO Mode
# Determines how VFIO devices should be presented to the container.
# Options:
#
# - vfio
# Matches behaviour of OCI runtimes (e.g. runc) as much as
# possible. VFIO devices will appear in the container as VFIO
# character devices under /dev/vfio. The exact names may differ
# from the host (they need to match the VM's IOMMU group numbers
# rather than the host's)
#
# - guest-kernel
# This is a Kata-specific behaviour that's useful in certain cases.
# The VFIO device is managed by whatever driver in the VM kernel
# claims it. This means it will appear as one or more device nodes
# or network interfaces depending on the nature of the device.
# Using this mode requires specially built workloads that know how
# to locate the relevant device interfaces within the VM.
#
vfio_mode = "@DEFVFIOMODE@"
# If enabled, the runtime will not create Kubernetes emptyDir mounts on the guest filesystem. Instead, emptyDir mounts will
# be created on the host and shared via virtio-fs. This is potentially slower, but allows sharing of files from host to guest.
disable_guest_empty_dir = @DEFDISABLEGUESTEMPTYDIR@
# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features that are not stable enough for production;
# they may break compatibility and are prepared for a big version bump.
# Supported experimental features, for example:
#   experimental=["force_guest_pull"]
# which enables force_guest_pull mode in CoCo scenarios.
# (default: [])
experimental = @DEFAULTEXPFEATURES@
# If enabled, the user can run pprof tools against the shim v2 process through kata-monitor.
# (default: false)
enable_pprof = false
# Base directory of directly attachable network config.
# Network devices for VM-based containers are allowed to be placed in the
# host netns to eliminate as many hops as possible, which is what we
# call a "Directly Attachable Network". The config, set by special CNI
# plugins, is used to tell Kata Containers what devices are attached
# to the hypervisor.
# (default: /run/kata-containers/dans)
dan_conf = "@DEFDANCONF@"
# pod_resource_api_sock specifies the unix socket for the Kubelet's
# PodResource API endpoint. If empty, Kubernetes-based cold plug
# will not be attempted. In order for this feature to work, the
# KubeletPodResourcesGet feature gate must be enabled in the Kubelet
# if using a Kubelet older than 1.34.
#
# The pod resource API's socket is relative to the Kubelet's root-dir,
# which is defined by the cluster admin, and its location is:
# ${KubeletRootDir}/pod-resources/kubelet.sock
#
# cold_plug_vfio(see hypervisor config) acts as a feature gate:
# cold_plug_vfio = no_port (default) => no cold plug
# cold_plug_vfio != no_port AND pod_resource_api_sock = "" => need
# explicit CDI annotation for cold plug (applies mainly
# to non-k8s cases)
# cold_plug_vfio != no_port AND pod_resource_api_sock != "" => kubelet
# based cold plug.
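# For example (illustrative): with cold_plug_vfio = "root-port" in the
# hypervisor section and pod_resource_api_sock set to
# "/var/lib/kubelet/pod-resources/kubelet.sock" (assuming the common default
# Kubelet root-dir of /var/lib/kubelet), the runtime performs kubelet-based
# cold plug.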
pod_resource_api_sock = "@DEFPODRESOURCEAPISOCK@"

View File

@@ -74,43 +74,21 @@ impl KernelParams {
pub(crate) fn new_rootfs_kernel_params(rootfs_driver: &str, rootfs_type: &str) -> Result<Self> {
let mut params = vec![];
// DAX is disabled on aarch64 due to kernel panic in dax_disassociate_entry
// with virtio-pmem on kernel 6.18.x
#[cfg(target_arch = "aarch64")]
let use_dax = false;
#[cfg(not(target_arch = "aarch64"))]
let use_dax = true;
match rootfs_driver {
VM_ROOTFS_DRIVER_PMEM => {
params.push(Param::new("root", VM_ROOTFS_ROOT_PMEM));
match rootfs_type {
VM_ROOTFS_FILESYSTEM_EXT4 => {
if use_dax {
params.push(Param::new(
"rootflags",
"dax,data=ordered,errors=remount-ro ro",
));
} else {
params.push(Param::new(
"rootflags",
"data=ordered,errors=remount-ro ro",
));
}
params.push(Param::new(
"rootflags",
"dax,data=ordered,errors=remount-ro ro",
));
}
VM_ROOTFS_FILESYSTEM_XFS => {
if use_dax {
params.push(Param::new("rootflags", "dax ro"));
} else {
params.push(Param::new("rootflags", "ro"));
}
params.push(Param::new("rootflags", "dax ro"));
}
VM_ROOTFS_FILESYSTEM_EROFS => {
if use_dax {
params.push(Param::new("rootflags", "dax ro"));
} else {
params.push(Param::new("rootflags", "ro"));
}
params.push(Param::new("rootflags", "dax ro"));
}
_ => {
return Err(anyhow!("Unsupported rootfs type {}", rootfs_type));
@@ -255,22 +233,6 @@ mod tests {
#[test]
fn test_rootfs_kernel_params() {
// DAX is disabled on aarch64
#[cfg(target_arch = "aarch64")]
let ext4_pmem_rootflags = "data=ordered,errors=remount-ro ro";
#[cfg(not(target_arch = "aarch64"))]
let ext4_pmem_rootflags = "dax,data=ordered,errors=remount-ro ro";
#[cfg(target_arch = "aarch64")]
let xfs_pmem_rootflags = "ro";
#[cfg(not(target_arch = "aarch64"))]
let xfs_pmem_rootflags = "dax ro";
#[cfg(target_arch = "aarch64")]
let erofs_pmem_rootflags = "ro";
#[cfg(not(target_arch = "aarch64"))]
let erofs_pmem_rootflags = "dax ro";
let tests = &[
// EXT4
TestData {
@@ -279,7 +241,7 @@ mod tests {
expect_params: KernelParams {
params: [
Param::new("root", VM_ROOTFS_ROOT_PMEM),
Param::new("rootflags", ext4_pmem_rootflags),
Param::new("rootflags", "dax,data=ordered,errors=remount-ro ro"),
Param::new("rootfstype", VM_ROOTFS_FILESYSTEM_EXT4),
]
.to_vec(),
@@ -306,7 +268,7 @@ mod tests {
expect_params: KernelParams {
params: [
Param::new("root", VM_ROOTFS_ROOT_PMEM),
Param::new("rootflags", xfs_pmem_rootflags),
Param::new("rootflags", "dax ro"),
Param::new("rootfstype", VM_ROOTFS_FILESYSTEM_XFS),
]
.to_vec(),
@@ -333,7 +295,7 @@ mod tests {
expect_params: KernelParams {
params: [
Param::new("root", VM_ROOTFS_ROOT_PMEM),
Param::new("rootflags", erofs_pmem_rootflags),
Param::new("rootflags", "dax ro"),
Param::new("rootfstype", VM_ROOTFS_FILESYSTEM_EROFS),
]
.to_vec(),

View File

@@ -234,7 +234,7 @@ DEFDISABLESELINUX := false
DEFDISABLEGUESTSELINUX := true
# Default is empty string "" to match the default golang (when commented out in config).
# Most users will want to set this to "system_u:system_r:container_t" for SELinux support.
DEFGUESTSELINUXLABEL :=
DEFGUESTSELINUXLABEL :=
#Default SeccomSandbox param
#The same default policy is used by libvirt
@@ -291,6 +291,7 @@ DEFSTATICRESOURCEMGMT_TEE = true
DEFSTATICRESOURCEMGMT_NV = true
DEFDISABLEIMAGENVDIMM ?= false
DEFDISABLEIMAGENVDIMM_NV = true
DEFBINDMOUNTS := []
@@ -784,6 +785,7 @@ USER_VARS += DEFVFIOMODE
USER_VARS += DEFVFIOMODE_SE
USER_VARS += BUILDFLAGS
USER_VARS += DEFDISABLEIMAGENVDIMM
USER_VARS += DEFDISABLEIMAGENVDIMM_NV
USER_VARS += DEFCCAMEASUREMENTALGO
USER_VARS += DEFSHAREDFS_QEMU_CCA_VIRTIOFS
USER_VARS += DEFPODRESOURCEAPISOCK

View File

@@ -379,7 +379,7 @@ msize_9p = @DEFMSIZE9P@
# Otherwise virtio-block device is used.
#
# nvdimm is not supported when `confidential_guest = true`.
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM@
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM_NV@
# Before hot plugging a PCIe device, you need to add a pcie_root_port device.
# Use this parameter when using some large PCI bar devices, such as Nvidia GPU

View File

@@ -356,7 +356,7 @@ msize_9p = @DEFMSIZE9P@
# Otherwise virtio-block device is used.
#
# nvdimm is not supported when `confidential_guest = true`.
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM@
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM_NV@
# Before hot plugging a PCIe device, you need to add a pcie_root_port device.
# Use this parameter when using some large PCI bar devices, such as Nvidia GPU

View File

@@ -353,7 +353,7 @@ msize_9p = @DEFMSIZE9P@
# Otherwise virtio-block device is used.
#
# nvdimm is not supported when `confidential_guest = true`.
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM@
disable_image_nvdimm = @DEFDISABLEIMAGENVDIMM_NV@
# Enable hot-plugging of VFIO devices to a bridge-port,
# root-port or switch-port.

View File

@@ -49,7 +49,7 @@ require (
github.com/safchain/ethtool v0.6.2
github.com/sirupsen/logrus v1.9.3
github.com/stretchr/testify v1.11.1
github.com/urfave/cli v1.22.15
github.com/urfave/cli v1.22.17
github.com/vishvananda/netlink v1.3.1
github.com/vishvananda/netns v0.0.5
gitlab.com/nvidia/cloud-native/go-nvlib v0.0.0-20220601114329-47893b162965
@@ -85,7 +85,7 @@ require (
github.com/containerd/log v0.1.0 // indirect
github.com/containerd/platforms v0.2.1 // indirect
github.com/containernetworking/cni v1.3.0 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.6 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.7 // indirect
github.com/cyphar/filepath-securejoin v0.6.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/distribution/reference v0.6.0 // indirect

View File

@@ -8,7 +8,6 @@ github.com/AdaLogics/go-fuzz-headers v0.0.0-20230811130428-ced1acdcaa24/go.mod h
github.com/AdamKorcz/go-118-fuzz-build v0.0.0-20230306123547-8075edf89bb0 h1:59MxjQVfjXsBpLy+dbd2/ELV5ofnUkUZBvWSC85sheA=
github.com/AdamKorcz/go-118-fuzz-build v0.0.0-20230306123547-8075edf89bb0/go.mod h1:OahwfttHWG6eJ0clwcfBAHoDI6X/LV/15hx/wlMZSrU=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/BurntSushi/toml v1.3.2/go.mod h1:CxXYINrC8qIiEnFrOxCa7Jy5BFHlXnUU2pbicEuybxQ=
github.com/BurntSushi/toml v1.5.0 h1:W5quZX/G/csjUnuI8SUYlsHs9M38FC7znL0lIO+DvMg=
github.com/BurntSushi/toml v1.5.0/go.mod h1:ukJfTF/6rtPPRCnwkur4qwRxa8vTRFBF0uk2lLoLwho=
github.com/Masterminds/semver/v3 v3.4.0 h1:Zog+i5UMtVoCU8oKka5P7i9q9HgrJeGzI9SA1Xbatp0=
@@ -70,9 +69,8 @@ github.com/containernetworking/plugins v1.9.0 h1:Mg3SXBdRGkdXyFC4lcwr6u2ZB2SDeL6
github.com/containernetworking/plugins v1.9.0/go.mod h1:JG3BxoJifxxHBhG3hFyxyhid7JgRVBu/wtooGEvWf1c=
github.com/coreos/go-systemd/v22 v22.6.0 h1:aGVa/v8B7hpb0TKl0MWoAavPDmHvobFe5R5zn0bCJWo=
github.com/coreos/go-systemd/v22 v22.6.0/go.mod h1:iG+pp635Fo7ZmV/j14KUcmEyWF+0X7Lua8rrTWzYgWU=
github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/cpuguy83/go-md2man/v2 v2.0.6 h1:XJtiaUW6dEEqVuZiMTn1ldk455QWwEIsMIJlo5vtkx0=
github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/cpuguy83/go-md2man/v2 v2.0.7 h1:zbFlGlXEAKlwXpmvle3d8Oe3YnkKIK4xSRTd3sHPnBo=
github.com/cpuguy83/go-md2man/v2 v2.0.7/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/cri-o/cri-o v1.34.0 h1:ux2URwAyENy5e5hD9Z95tshdfy98eqatZk0fxx3rhuk=
github.com/cri-o/cri-o v1.34.0/go.mod h1:kP40HG+1EW5CDNHjqQBFhb6dehT5dCBKcmtO5RZAm6k=
github.com/cyphar/filepath-securejoin v0.6.0 h1:BtGB77njd6SVO6VztOHfPxKitJvd/VPT+OFBFMOi1Is=
@@ -289,13 +287,13 @@ github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635 h1:kdXcSzyDtseVEc4yCz2qF8ZrQvIDBJLl4S1c3GCXmoI=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635/go.mod h1:hkRG7XYTFWNJGYcbNJQlaLq0fg1yr4J4t/NcTQtrfww=
github.com/urfave/cli v1.22.15 h1:nuqt+pdC/KqswQKhETJjo7pvn/k4xMUxgW6liI7XpnM=
github.com/urfave/cli v1.22.15/go.mod h1:wSan1hmo5zeyLGBjRJbzRTNk8gwoYa2B9n4q9dmRIc0=
github.com/urfave/cli v1.22.17 h1:SYzXoiPfQjHBbkYxbew5prZHS1TOLT3ierW8SYLqtVQ=
github.com/urfave/cli v1.22.17/go.mod h1:b0ht0aqgH/6pBYzzxURyrM4xXNgsoT/n2ZzwQiEhNVo=
github.com/vishvananda/netlink v1.3.1 h1:3AEMt62VKqz90r0tmNhog0r/PpWKmrEShJU0wJW6bV0=
github.com/vishvananda/netlink v1.3.1/go.mod h1:ARtKouGSTGchR8aMwmkzC0qiNPrrWO5JS/XMVl45+b4=
github.com/vishvananda/netns v0.0.5 h1:DfiHV+j8bA32MFM7bfEunvT8IAqQ/NzSJHtcmW5zdEY=

View File

@@ -1,3 +1,4 @@
// Package md2man aims in converting markdown into roff (man pages).
package md2man
import (

View File

@@ -47,13 +47,13 @@ const (
tableStart = "\n.TS\nallbox;\n"
tableEnd = ".TE\n"
tableCellStart = "T{\n"
tableCellEnd = "\nT}\n"
tableCellEnd = "\nT}"
tablePreprocessor = `'\" t`
)
// NewRoffRenderer creates a new blackfriday Renderer for generating roff documents
// from markdown
func NewRoffRenderer() *roffRenderer { // nolint: golint
func NewRoffRenderer() *roffRenderer {
return &roffRenderer{}
}
@@ -316,9 +316,8 @@ func (r *roffRenderer) handleTableCell(w io.Writer, node *blackfriday.Node, ente
} else if nodeLiteralSize(node) > 30 {
end = tableCellEnd
}
if node.Next == nil && end != tableCellEnd {
// Last cell: need to carriage return if we are at the end of the
// header row and content isn't wrapped in a "tablecell"
if node.Next == nil {
// Last cell: need to carriage return if we are at the end of the header row.
end += crTag
}
out(w, end)
@@ -356,7 +355,7 @@ func countColumns(node *blackfriday.Node) int {
}
func out(w io.Writer, output string) {
io.WriteString(w, output) // nolint: errcheck
io.WriteString(w, output) //nolint:errcheck
}
func escapeSpecialChars(w io.Writer, text []byte) {
@@ -395,7 +394,7 @@ func escapeSpecialCharsLine(w io.Writer, text []byte) {
i++
}
if i > org {
w.Write(text[org:i]) // nolint: errcheck
w.Write(text[org:i]) //nolint:errcheck
}
// escape a character
@@ -403,7 +402,7 @@ func escapeSpecialCharsLine(w io.Writer, text []byte) {
break
}
w.Write([]byte{'\\', text[i]}) // nolint: errcheck
w.Write([]byte{'\\', text[i]}) //nolint:errcheck
}
}

View File

@@ -257,7 +257,7 @@ github.com/containernetworking/plugins/pkg/testutils
# github.com/coreos/go-systemd/v22 v22.6.0
## explicit; go 1.23
github.com/coreos/go-systemd/v22/dbus
# github.com/cpuguy83/go-md2man/v2 v2.0.6
# github.com/cpuguy83/go-md2man/v2 v2.0.7
## explicit; go 1.12
github.com/cpuguy83/go-md2man/v2/md2man
# github.com/cri-o/cri-o v1.34.0
@@ -526,7 +526,7 @@ github.com/stretchr/testify/assert/yaml
# github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635
## explicit
github.com/syndtr/gocapability/capability
# github.com/urfave/cli v1.22.15
# github.com/urfave/cli v1.22.17
## explicit; go 1.11
github.com/urfave/cli
# github.com/vishvananda/netlink v1.3.1

View File

@@ -585,9 +585,7 @@ func (clh *cloudHypervisor) CreateVM(ctx context.Context, id string, network Net
clh.vmconfig.Cpus = chclient.NewCpusConfig(int32(clh.config.NumVCPUs()), int32(clh.config.DefaultMaxVCPUs))
disableNvdimm := (clh.config.DisableImageNvdimm || clh.config.ConfidentialGuest)
// DAX is disabled on aarch64 due to kernel panic in dax_disassociate_entry
// with virtio-pmem on kernel 6.18.x
enableDax := !disableNvdimm && runtime.GOARCH != "arm64"
enableDax := !disableNvdimm
params, err := getNonUserDefinedKernelParams(hypervisorConfig.RootfsType, disableNvdimm, enableDax, clh.config.Debug, clh.config.ConfidentialGuest, clh.config.IOMMU)
if err != nil {

View File

@@ -69,11 +69,9 @@ func newQemuArch(config HypervisorConfig) (qemuArch, error) {
kernelParamsDebug: kernelParamsDebug,
kernelParams: kernelParams,
disableNvdimm: config.DisableImageNvdimm,
// DAX is disabled on aarch64 due to kernel panic in dax_disassociate_entry
// with virtio-pmem on kernel 6.18.x
dax: false,
protection: noneProtection,
legacySerial: config.LegacySerial,
dax: true,
protection: noneProtection,
legacySerial: config.LegacySerial,
},
measurementAlgo: config.MeasurementAlgo,
}

View File

@@ -1415,6 +1415,13 @@ func (s *Sandbox) startVM(ctx context.Context, prestartHookFunc func(context.Con
if err != nil {
return err
}
// If we want the network, scan the netns again to update the network
// configuration after the prestart hooks have run.
if !s.config.NetworkConfig.DisableNewNetwork {
if _, err := s.network.AddEndpoints(ctx, s, nil, false); err != nil {
return err
}
}
}
if err := s.network.Run(ctx, func() error {
@@ -2545,9 +2552,18 @@ func (s *Sandbox) resourceControllerDelete() error {
return err
}
resCtrlParent := sandboxController.Parent()
if err := sandboxController.MoveTo(resCtrlParent); err != nil {
return err
// When sandbox_cgroup_only is enabled, all Kata threads live in the
// sandbox controller and systemd can move tasks as part of unit deletion.
// In that mode, a systemd-formatted cgroup path is not a filesystem path,
// so MoveTo would fail with "invalid group path".
// Keep MoveTo for the case of using cgroupfs paths and for the
// non-sandbox_cgroup_only mode. In that mode, Kata may use an overhead
// cgroup in which case an explicit MoveTo is used to drain tasks.
if !(resCtrl.IsSystemdCgroup(s.state.SandboxCgroupPath) && s.config.SandboxCgroupOnly) {
resCtrlParent := sandboxController.Parent()
if err := sandboxController.MoveTo(resCtrlParent); err != nil {
return err
}
}
if err := sandboxController.Delete(); err != nil {
@@ -2560,9 +2576,12 @@ func (s *Sandbox) resourceControllerDelete() error {
return err
}
resCtrlParent := overheadController.Parent()
if err := s.overheadController.MoveTo(resCtrlParent); err != nil {
return err
// See comment at above MoveTo: Avoid this action as systemd moves tasks on unit deletion.
if !(resCtrl.IsSystemdCgroup(s.state.OverheadCgroupPath) && s.config.SandboxCgroupOnly) {
resCtrlParent := overheadController.Parent()
if err := s.overheadController.MoveTo(resCtrlParent); err != nil {
return err
}
}
if err := overheadController.Delete(); err != nil {

View File

@@ -54,7 +54,7 @@ version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b169f7a6d4742236a0a00c541b845991d0ac43e546831af1249753ab4c3aa3a0"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cipher",
"cpufeatures",
"zeroize",
@@ -246,13 +246,12 @@ dependencies = [
[[package]]
name = "async-compression"
version = "0.4.33"
version = "0.4.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "93c1f86859c1af3d514fa19e8323147ff10ea98684e6c7b307912509f50e67b2"
checksum = "d10e4f991a553474232bc0a31799f6d24b034a84c0971d80d2e2f78b2e576e40"
dependencies = [
"compression-codecs",
"compression-core",
"futures-core",
"futures-io",
"pin-project-lite",
"tokio",
@@ -292,7 +291,7 @@ checksum = "0fc5b45d93ef0529756f812ca52e44c221b35341892d3dcc34132ac02f3dd2af"
dependencies = [
"async-lock 2.8.0",
"autocfg",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"concurrent-queue",
"futures-lite 1.13.0",
"log",
@@ -311,7 +310,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1237c0ae75a0f3765f58910ff9cdd0a12eeb39ab2f4c7de23262f337f0aacbb3"
dependencies = [
"async-lock 3.4.0",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"concurrent-queue",
"futures-io",
"futures-lite 2.0.0",
@@ -353,7 +352,7 @@ dependencies = [
"async-lock 2.8.0",
"async-signal",
"blocking",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"event-listener 3.1.0",
"futures-lite 1.13.0",
"rustix 0.38.34",
@@ -380,7 +379,7 @@ dependencies = [
"async-io 2.4.1",
"async-lock 3.4.0",
"atomic-waker",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"futures-core",
"futures-io",
"rustix 1.0.7",
@@ -389,28 +388,6 @@ dependencies = [
"windows-sys 0.59.0",
]
[[package]]
name = "async-stream"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b5a71a6f37880a80d1d7f19efd781e4b5de42c88f0722cc13bcb6cc2cfe8476"
dependencies = [
"async-stream-impl",
"futures-core",
"pin-project-lite",
]
[[package]]
name = "async-stream-impl"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c7c24de15d275a1ecfd47a380fb4d5ec9bfe0933f309ed5e705b775596a3574d"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.87",
]
[[package]]
name = "async-task"
version = "4.7.1"
@@ -419,9 +396,9 @@ checksum = "8b75356056920673b02621b35afd0f7dda9306d03c79a30f5c56c44cf256e3de"
[[package]]
name = "async-trait"
version = "0.1.88"
version = "0.1.89"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e539d3fca749fcee5236ab05e93a52867dd549cc157c8cb7f99595f3cedffdb5"
checksum = "9035ad2d096bed7955a320ee7e2230574d28fd3c3a0f186cbea1ff3c7eed5dbb"
dependencies = [
"proc-macro2",
"quote",
@@ -487,11 +464,10 @@ dependencies = [
[[package]]
name = "axum"
version = "0.7.9"
version = "0.8.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edca88bc138befd0323b20752846e6587272d3b03b0343c8ea28a6f819e6e71f"
checksum = "8b52af3cb4058c895d37317bb27508dccc8e5f2d39454016b297bf4a400597b8"
dependencies = [
"async-trait",
"axum-core",
"bytes 1.7.2",
"futures-util",
@@ -504,29 +480,26 @@ dependencies = [
"mime",
"percent-encoding",
"pin-project-lite",
"rustversion",
"serde",
"serde_core",
"sync_wrapper",
"tower 0.5.2",
"tower",
"tower-layer",
"tower-service",
]
[[package]]
name = "axum-core"
version = "0.4.5"
version = "0.5.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09f2bd6146b97ae3359fa0cc6d6b376d9539582c7b4220f041a33ec24c226199"
checksum = "08c78f31d7b1291f7ee735c1c6780ccde7785daae9a9206026862dab7d8792d1"
dependencies = [
"async-trait",
"bytes 1.7.2",
"futures-util",
"futures-core",
"http 1.1.0",
"http-body 1.0.1",
"http-body-util",
"mime",
"pin-project-lite",
"rustversion",
"sync_wrapper",
"tower-layer",
"tower-service",
@@ -539,7 +512,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bb531853791a215d7c62a30daf0dde835f381ab5de4589cfe7c649d2cbe92bd6"
dependencies = [
"addr2line",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"miniz_oxide",
"object",
@@ -606,7 +579,7 @@ dependencies = [
"bitflags 2.6.0",
"cexpr",
"clang-sys",
"itertools 0.10.5",
"itertools 0.11.0",
"lazy_static",
"lazycell",
"log",
@@ -865,7 +838,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fbdc32a78afc325d71a48d13084f1c3ddf67cc5dc06c6e5439a8630b14612cad"
dependencies = [
"bitflags 1.3.2",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
]
@@ -934,9 +907,9 @@ checksum = "4785bdd1c96b2a846b2bd7cc02e86b6b3dbf14e7e53446c4f54c92a361040822"
[[package]]
name = "cfg-if"
version = "1.0.1"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9555578bc9e57714c812a1f84e4fc5b4d21fcb063490c624de019f7464c91268"
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
[[package]]
name = "cfg_aliases"
@@ -1082,9 +1055,9 @@ checksum = "2382f75942f4b3be3690fe4f86365e9c853c1587d6ee58212cebf6e2a9ccd101"
[[package]]
name = "compression-codecs"
version = "0.4.32"
version = "0.4.36"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "680dc087785c5230f8e8843e2e57ac7c1c90488b6a91b88caa265410568f441b"
checksum = "00828ba6fd27b45a448e57dbfe84f1029d4c9f26b368157e9a448a5f49a2ec2a"
dependencies = [
"compression-core",
"flate2",
@@ -1095,9 +1068,9 @@ dependencies = [
[[package]]
name = "compression-core"
version = "0.4.30"
version = "0.4.31"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3a9b614a5787ef0c8802a55766480563cb3a93b435898c422ed2a359cf811582"
checksum = "75984efb6ed102a0d42db99afb6c1948f0380d1d91808d5529916e6c08b49d8d"
[[package]]
name = "concurrent-queue"
@@ -1165,7 +1138,7 @@ version = "1.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a97769d94ddab943e4510d138150169a2758b5ef3eb191a9ee688de3e23ef7b3"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
]
[[package]]
@@ -1174,7 +1147,7 @@ version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6eb9105919ca8e40d437fc9cbb8f1975d916f1bd28afe795a48aae32a2cc8920"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"crossbeam-channel",
"crossbeam-deque",
"crossbeam-epoch",
@@ -1197,7 +1170,7 @@ version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fca89a0e215bab21874660c67903c5f143333cab1da83d041c7ded6053774751"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"crossbeam-epoch",
"crossbeam-utils",
]
@@ -1209,7 +1182,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e3681d554572a651dda4186cd47240627c3d0114d45a95f6ad27f2f22e7548d"
dependencies = [
"autocfg",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"crossbeam-utils",
]
@@ -1219,7 +1192,7 @@ version = "0.3.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "adc6598521bb5a83d491e8c1fe51db7296019d2ca3cb93cc6c2a20369a4d78a2"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"crossbeam-utils",
]
@@ -1229,7 +1202,7 @@ version = "0.8.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3a430a770ebd84726f584a90ee7f020d28db52c6d02138900f22341f866d39c"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
]
[[package]]
@@ -1301,7 +1274,7 @@ version = "4.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97fb8b7c4503de7d6ae7b42ab72a5a59857b4c937ec27a3d4539dba95b5ab2be"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cpufeatures",
"curve25519-dalek-derive",
"digest 0.10.7",
@@ -1537,7 +1510,7 @@ version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b98cf8ebf19c3d1b223e151f99a4f9f0690dca41414773390fc824184ac833e1"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"dirs-sys-next",
]
@@ -1745,7 +1718,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "778e2ac28f6c47af28e4907f13ffd1e1ddbd400980a9abd7c8df189bf578a5ad"
dependencies = [
"libc",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -1854,7 +1827,7 @@ version = "0.2.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "35c0522e981e68cbfa8c3f978441a5f34b30b96e146b33cd3359176b50fe8586"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"libredox",
"windows-sys 0.59.0",
@@ -1880,9 +1853,9 @@ checksum = "b3ea1ec5f8307826a5b71094dd91fc04d4ae75d5709b20ad351c7fb4815c86ec"
[[package]]
name = "flate2"
version = "1.1.2"
version = "1.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4a3d7db9596fecd151c5f638c0ee5d5bd487b6e0ea232e5dc96d5250f6f94b1d"
checksum = "b375d6465b98090a5f25b1c7703f3859783755aa9a80433b36e0379a3ec2f369"
dependencies = [
"crc32fast",
"miniz_oxide",
@@ -2057,7 +2030,7 @@ version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4567c8db10ae91089c99af84c68c38da3ec2f087c3f82960bcdbf3656b6f4d7"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"js-sys",
"libc",
"wasi",
@@ -2130,7 +2103,7 @@ dependencies = [
"futures-core",
"futures-sink",
"http 1.1.0",
"indexmap 2.6.0",
"indexmap 2.13.0",
"slab",
"tokio",
"tokio-util",
@@ -2148,9 +2121,9 @@ dependencies = [
[[package]]
name = "hashbrown"
version = "0.15.2"
version = "0.16.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bf151400ff0baff5465007dd2f3e717f3fe502074ca563069ce3a6629d07b289"
checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100"
[[package]]
name = "heck"
@@ -2618,14 +2591,14 @@ dependencies = [
[[package]]
name = "image-rs"
version = "0.1.0"
source = "git+https://github.com/confidential-containers/guest-components?rev=048ddaec4ecd6ee45c845d69bc39416908764560#048ddaec4ecd6ee45c845d69bc39416908764560"
source = "git+https://github.com/confidential-containers/guest-components?rev=026694d44d4ec483465d2fa5f80a0376166b174d#026694d44d4ec483465d2fa5f80a0376166b174d"
dependencies = [
"anyhow",
"astral-tokio-tar",
"async-compression",
"async-trait",
"base64 0.22.1",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"filetime",
"flate2",
"futures",
@@ -2650,7 +2623,7 @@ dependencies = [
"thiserror 2.0.12",
"tokio",
"tokio-util",
"toml 0.8.23",
"toml 0.9.11+spec-1.1.0",
"tonic",
"url",
"walkdir",
@@ -2671,13 +2644,14 @@ dependencies = [
[[package]]
name = "indexmap"
version = "2.6.0"
version = "2.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "707907fe3c25f5424cce2cb7e1cbcafee6bdbe735ca90ef77c29e84591e5b9da"
checksum = "7714e70437a7dc3ac8eb7e6f8df75fd8eb422675fc7678aff7364301092b1017"
dependencies = [
"equivalent",
"hashbrown 0.15.2",
"hashbrown 0.16.1",
"serde",
"serde_core",
]
[[package]]
@@ -2718,7 +2692,7 @@ version = "0.1.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a5bbe824c507c5da5956355e86a746d82e0e1464f65d862cc5e71da70e94b2c"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
]
[[package]]
@@ -2739,7 +2713,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b86e202f00093dcba4275d4636b93ef9dd75d025ae560d2521b45ea28ab49013"
dependencies = [
"bitflags 2.6.0",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
]
@@ -2766,7 +2740,7 @@ checksum = "e04d7f318608d35d4b61ddd75cbdaee86b023ebe2bd5a66ee0915f0bf93095a9"
dependencies = [
"hermit-abi 0.5.2",
"libc",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -2986,7 +2960,7 @@ version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4979f22fdb869068da03c9f7528f8297c6fd2606bc3a4affe42e6a823fdb8da4"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"windows-targets 0.52.6",
]
@@ -3060,9 +3034,9 @@ dependencies = [
[[package]]
name = "log"
version = "0.4.28"
version = "0.4.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "34080505efa8e45a4b816c349525ebe327ceaa8559756f0356cba97ef3bf7432"
checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897"
[[package]]
name = "logging"
@@ -3091,9 +3065,9 @@ dependencies = [
[[package]]
name = "matchit"
version = "0.7.3"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e7465ac9959cc2b1404e8e2367b43684a6d13790fe23056cc8c6c5a6b7bcb94"
checksum = "47e1ffaa40ddd1f3ed91f717a33c8c0ee23fff369e3aa8772b9605cc1d22f4c3"
[[package]]
name = "md-5"
@@ -3101,7 +3075,7 @@ version = "0.10.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d89e7ee0cfbedfc4da3340218492196241d89eefb6dab27de5df917a6d2e78cf"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"digest 0.10.7",
]
@@ -3173,6 +3147,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fa76a2c86f704bdb222d66965fb3d63269ce38518b83cb0575fca855ebb6316"
dependencies = [
"adler2",
"simd-adler32",
]
[[package]]
@@ -3193,7 +3168,7 @@ version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "39a6bfcc6c8c7eed5ee98b9c3e33adc726054389233e201c95dab2d41a3839d2"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"downcast",
"fragile",
"mockall_derive",
@@ -3207,7 +3182,7 @@ version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "25ca3004c2efe9011bd4e461bd8256445052b9615405b4f7ea43fc8ca5c20898"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"proc-macro2",
"quote",
"syn 2.0.87",
@@ -3233,7 +3208,7 @@ checksum = "8f3790c00a0150112de0f4cd161e3d7fc4b2d8a5542ffc35f099a2562aecb35c"
dependencies = [
"bitflags 1.3.2",
"cc",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"memoffset 0.6.5",
]
@@ -3245,7 +3220,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fa52e972a9a719cecb6864fb88568781eb706bac2cd1d4f04a648542dbf78069"
dependencies = [
"bitflags 1.3.2",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"memoffset 0.6.5",
]
@@ -3258,7 +3233,7 @@ checksum = "f346ff70e7dbfd675fe90590b92d59ef2de15a8779ae305ebcbfd3f0caf59be4"
dependencies = [
"autocfg",
"bitflags 1.3.2",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
]
@@ -3269,7 +3244,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "598beaf3cc6fdd9a5dfb1630c2800c7acd31df7aaf0f565796fba2b53ca1af1b"
dependencies = [
"bitflags 1.3.2",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"memoffset 0.7.1",
"pin-utils",
@@ -3282,7 +3257,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "74523f3a35e05aba87a1d978330aef40f67b0304ac79c1c00b294c9830543db6"
dependencies = [
"bitflags 2.6.0",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cfg_aliases",
"libc",
]
@@ -3375,7 +3350,7 @@ version = "5.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "51e219e79014df21a225b1860a479e2dcd7cbd9130f4defd4bd0e191ea31d67d"
dependencies = [
"base64 0.21.7",
"base64 0.22.1",
"chrono",
"getrandom",
"http 1.1.0",
@@ -3447,9 +3422,9 @@ dependencies = [
[[package]]
name = "oci-spec"
version = "0.8.3"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2eb4684653aeaba48dea019caa17b2773e1212e281d50b6fa759f36fe032239d"
checksum = "fc3da52b83ce3258fbf29f66ac784b279453c2ac3c22c5805371b921ede0d308"
dependencies = [
"const_format",
"derive_builder",
@@ -3465,11 +3440,11 @@ dependencies = [
[[package]]
name = "ocicrypt-rs"
version = "0.1.0"
source = "git+https://github.com/confidential-containers/guest-components?rev=048ddaec4ecd6ee45c845d69bc39416908764560#048ddaec4ecd6ee45c845d69bc39416908764560"
source = "git+https://github.com/confidential-containers/guest-components?rev=026694d44d4ec483465d2fa5f80a0376166b174d#026694d44d4ec483465d2fa5f80a0376166b174d"
dependencies = [
"anyhow",
"base64 0.22.1",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"protos",
"serde",
"serde_json",
@@ -3632,7 +3607,7 @@ version = "0.9.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e401f977ab385c9e4e3ab30627d6f26d00e2c73eef317493c4ec6d468726cf8"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"libc",
"redox_syscall 0.5.7",
"smallvec",
@@ -3761,7 +3736,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4c5cc86750666a3ed20bdaf5ca2a0344f9c67674cae0515bec2da16fbaa47db"
dependencies = [
"fixedbitset 0.4.2",
"indexmap 2.6.0",
"indexmap 2.13.0",
]
[[package]]
@@ -3897,7 +3872,7 @@ checksum = "4b2d323e8ca7996b3e23126511a523f7e62924d93ecd5ae73b333815b0eb3dce"
dependencies = [
"autocfg",
"bitflags 1.3.2",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"concurrent-queue",
"libc",
"log",
@@ -3911,7 +3886,7 @@ version = "3.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b53a684391ad002dd6a596ceb6c74fd004fdce75f4be2e3f615068abbea5fd50"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"concurrent-queue",
"hermit-abi 0.5.2",
"pin-project-lite",
@@ -3937,7 +3912,7 @@ version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d1fe60d06143b2430aa532c94cfe9e29783047f06c0d7fd359a9a51b729fa25"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cpufeatures",
"opaque-debug",
"universal-hash",
@@ -4083,12 +4058,12 @@ dependencies = [
[[package]]
name = "prost"
version = "0.13.5"
version = "0.14.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2796faa41db3ec313a31f7624d9286acf277b52de526150b7e69f3debf891ee5"
checksum = "d2ea70524a2f82d518bce41317d0fae74151505651af45faf1ffbd6fd33f0568"
dependencies = [
"bytes 1.7.2",
"prost-derive 0.13.5",
"prost-derive 0.14.3",
]
[[package]]
@@ -4124,12 +4099,12 @@ dependencies = [
[[package]]
name = "prost-derive"
version = "0.13.5"
version = "0.14.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a56d757972c98b346a9b766e3f02746cde6dd1cd1d1d563472929fdd74bec4d"
checksum = "27c6023962132f4b30eb4c172c91ce92d933da334c59c23cddee82358ddafb0b"
dependencies = [
"anyhow",
"itertools 0.10.5",
"itertools 0.11.0",
"proc-macro2",
"quote",
"syn 2.0.87",
@@ -4178,7 +4153,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4aeaa1f2460f1d348eeaeed86aea999ce98c1bded6f089ff8514c9d9dbdc973"
dependencies = [
"anyhow",
"indexmap 2.6.0",
"indexmap 2.13.0",
"log",
"protobuf",
"protobuf-support",
@@ -4212,10 +4187,11 @@ dependencies = [
[[package]]
name = "protos"
version = "0.1.0"
source = "git+https://github.com/confidential-containers/guest-components?rev=048ddaec4ecd6ee45c845d69bc39416908764560#048ddaec4ecd6ee45c845d69bc39416908764560"
source = "git+https://github.com/confidential-containers/guest-components?rev=026694d44d4ec483465d2fa5f80a0376166b174d#026694d44d4ec483465d2fa5f80a0376166b174d"
dependencies = [
"prost 0.13.5",
"prost 0.14.3",
"tonic",
"tonic-prost",
]
[[package]]
@@ -4240,9 +4216,9 @@ dependencies = [
[[package]]
name = "qapi"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c6412bdd014ebee03ddbbe79ac03a0b622cce4d80ba45254f6357c847f06fa38"
checksum = "7b047adab56acc4948d4b9b58693c1f33fd13efef2d6bb5f0f66a47436ceada8"
dependencies = [
"bytes 1.7.2",
"futures",
@@ -4277,9 +4253,9 @@ dependencies = [
[[package]]
name = "qapi-qmp"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8b944db7e544d2fa97595e9a000a6ba5c62c426fa185e7e00aabe4b5640b538"
checksum = "45303cac879d89361cad0287ae15f9ae1e7799b904b474152414aeece39b9875"
dependencies = [
"qapi-codegen",
"qapi-spec",
@@ -4550,7 +4526,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a4689e6c2294d81e88dc6261c768b63bc4fcdb852be6d1352498b114f61383b7"
dependencies = [
"cc",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"getrandom",
"libc",
"untrusted 0.9.0",
@@ -4640,7 +4616,7 @@ version = "0.18.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f6d5f2436026b4f6e79dc829837d467cc7e9a55ee40e750d716713540715a2df"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"ordered-multimap",
]
@@ -4724,7 +4700,7 @@ dependencies = [
"errno 0.3.13",
"libc",
"linux-raw-sys 0.9.4",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -4737,7 +4713,7 @@ dependencies = [
"bit-vec 0.8.0",
"capctl",
"caps",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cgroups-rs",
"futures",
"inotify",
@@ -4983,7 +4959,7 @@ dependencies = [
"aes-gcm",
"anyhow",
"argon2",
"base64 0.21.7",
"base64 0.22.1",
"block-padding",
"blowfish",
"buffered-reader",
@@ -5036,10 +5012,11 @@ dependencies = [
[[package]]
name = "serde"
version = "1.0.217"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "02fc4265df13d6fa1d00ecff087228cc0a2b5f3c0e87e258d8b94a156e984c70"
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
dependencies = [
"serde_core",
"serde_derive",
]
@@ -5084,10 +5061,19 @@ dependencies = [
]
[[package]]
name = "serde_derive"
version = "1.0.217"
name = "serde_core"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a9bf7cf98d04a2b28aead066b7496853d4779c9cc183c440dbac457641e19a0"
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
dependencies = [
"proc-macro2",
"quote",
@@ -5138,11 +5124,11 @@ dependencies = [
[[package]]
name = "serde_spanned"
version = "0.6.9"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bf41e0cfaf7226dca15e8197172c295a782857fcb97fad1808a166870dee75a3"
checksum = "f8bbf91e5a4d6315eee45e704372590b30e260ee83af6639d64557f51b067776"
dependencies = [
"serde",
"serde_core",
]
[[package]]
@@ -5167,7 +5153,7 @@ dependencies = [
"chrono",
"hex",
"indexmap 1.9.3",
"indexmap 2.6.0",
"indexmap 2.13.0",
"schemars",
"serde",
"serde_derive",
@@ -5194,7 +5180,7 @@ version = "0.9.34+deprecated"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47"
dependencies = [
"indexmap 2.6.0",
"indexmap 2.13.0",
"itoa",
"ryu",
"serde",
@@ -5207,7 +5193,7 @@ version = "0.10.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3bf829a2d51ab4a5ddf1352d8470c140cadc8301b2ae1789db023f01cedd6ba"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cpufeatures",
"digest 0.10.7",
]
@@ -5230,7 +5216,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4d58a1e1bf39749807d89cf2d98ac2dfa0ff1cb3faa38fbb64dd88ac8013d800"
dependencies = [
"block-buffer 0.9.0",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cpufeatures",
"digest 0.9.0",
"opaque-debug",
@@ -5242,7 +5228,7 @@ version = "0.10.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"cpufeatures",
"digest 0.10.7",
]
@@ -5297,14 +5283,14 @@ dependencies = [
[[package]]
name = "sigstore"
version = "0.12.1"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "43427f0d642cfed11bd596608148ee4476dd75f938888aa13a9c4e176fe14225"
checksum = "52bba786054331bdc89e90f74373b68a6c3b63c9754cf20e3a4a629d0165fe38"
dependencies = [
"async-trait",
"aws-lc-rs",
"base64 0.22.1",
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"chrono",
"const-oid",
"crypto_secretbox",
@@ -5342,6 +5328,12 @@ dependencies = [
"zeroize",
]
[[package]]
name = "simd-adler32"
version = "0.3.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e320a6c5ad31d271ad523dcf3ad13e2767ad8b1cb8f047f75a8aeaf8da139da2"
[[package]]
name = "simdutf8"
version = "0.1.4"
@@ -5574,9 +5566,9 @@ dependencies = [
[[package]]
name = "sync_wrapper"
version = "1.0.1"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7065abeca94b6a8a577f9bd45aa0867a2238b74e8eb67cf10d492bc39351394"
checksum = "0bf256ce5efdfa370213c1dabab5935a12e49f2c58d15e9eac2870d3b4f27263"
dependencies = [
"futures-core",
]
@@ -5623,7 +5615,7 @@ version = "3.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b9fbec84f381d5795b08656e4912bec604d162bff9291d6189a78f4c8ab87998"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"fastrand 1.9.0",
"redox_syscall 0.3.5",
"rustix 0.37.28",
@@ -5702,7 +5694,7 @@ version = "1.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fdd6f064ccff2d6567adcb3873ca630700f00b5ad3f060c25b5dcfd9a4ce152"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"once_cell",
]
@@ -5904,14 +5896,17 @@ dependencies = [
[[package]]
name = "toml"
version = "0.8.23"
version = "0.9.11+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc1beb996b9d83529a9e75c17a1686767d148d70663143c7854d8b4a09ced362"
checksum = "f3afc9a848309fe1aaffaed6e1546a7a14de1f935dc9d89d32afd9a44bab7c46"
dependencies = [
"serde",
"indexmap 2.13.0",
"serde_core",
"serde_spanned",
"toml_datetime",
"toml_edit 0.22.27",
"toml_datetime 0.7.5+spec-1.1.0",
"toml_parser",
"toml_writer",
"winnow 0.7.14",
]
[[package]]
@@ -5919,8 +5914,14 @@ name = "toml_datetime"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22cddaf88f4fbc13c51aebbf5f8eceb5c7c5a9da2ac40a13519eb5b0a0e8f11c"
[[package]]
name = "toml_datetime"
version = "0.7.5+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "92e1cfed4a3038bc5a127e35a2d360f145e1f4b971b551a2ba5fd7aedf7e1347"
dependencies = [
"serde",
"serde_core",
]
[[package]]
@@ -5929,8 +5930,8 @@ version = "0.19.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b5bb770da30e5cbfde35a2d7b9b8a2c4b8ef89548a7a6aeab5c9a576e3e7421"
dependencies = [
"indexmap 2.6.0",
"toml_datetime",
"indexmap 2.13.0",
"toml_datetime 0.6.11",
"winnow 0.5.40",
]
@@ -5940,27 +5941,32 @@ version = "0.22.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41fe8c660ae4257887cf66394862d21dbca4a6ddd26f04a3560410406a2f819a"
dependencies = [
"indexmap 2.6.0",
"serde",
"serde_spanned",
"toml_datetime",
"toml_write",
"winnow 0.7.11",
"indexmap 2.13.0",
"toml_datetime 0.6.11",
"winnow 0.7.14",
]
[[package]]
name = "toml_write"
version = "0.1.2"
name = "toml_parser"
version = "1.0.6+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d99f8c9a7727884afe522e9bd5edbfc91a3312b36a77b5fb8926e4c31a41801"
checksum = "a3198b4b0a8e11f09dd03e133c0280504d0801269e9afa46362ffde1cbeebf44"
dependencies = [
"winnow 0.7.14",
]
[[package]]
name = "toml_writer"
version = "1.0.6+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ab16f14aed21ee8bfd8ec22513f7287cd4a91aa92e44edfe2c17ddd004e92607"
[[package]]
name = "tonic"
version = "0.12.3"
version = "0.14.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "877c5b330756d856ffcc4553ab34a5684481ade925ecc54bcd1bf02b1d0d4d52"
checksum = "eb7613188ce9f7df5bfe185db26c5814347d110db17920415cf2fbcad85e7203"
dependencies = [
"async-stream",
"async-trait",
"axum",
"base64 0.22.1",
@@ -5974,34 +5980,25 @@ dependencies = [
"hyper-util",
"percent-encoding",
"pin-project",
"prost 0.13.5",
"socket2 0.5.10",
"socket2 0.6.0",
"sync_wrapper",
"tokio",
"tokio-stream",
"tower 0.4.13",
"tower",
"tower-layer",
"tower-service",
"tracing",
]
[[package]]
name = "tower"
version = "0.4.13"
name = "tonic-prost"
version = "0.14.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8fa9be0de6cf49e536ce1851f987bd21a43b771b09473c3549a6c853db37c1c"
checksum = "66bd50ad6ce1252d87ef024b3d64fe4c3cf54a86fb9ef4c631fdd0ded7aeaa67"
dependencies = [
"futures-core",
"futures-util",
"indexmap 1.9.3",
"pin-project",
"pin-project-lite",
"rand",
"slab",
"tokio",
"tokio-util",
"tower-layer",
"tower-service",
"tracing",
"bytes 1.7.2",
"prost 0.14.3",
"tonic",
]
[[package]]
@@ -6012,10 +6009,15 @@ checksum = "d039ad9159c98b70ecfd540b2573b97f7f52c3e8d9f8ad57a24b916a536975f9"
dependencies = [
"futures-core",
"futures-util",
"indexmap 2.13.0",
"pin-project-lite",
"slab",
"sync_wrapper",
"tokio",
"tokio-util",
"tower-layer",
"tower-service",
"tracing",
]
[[package]]
@@ -6345,7 +6347,7 @@ version = "0.2.93"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a82edfc16a6c469f5f44dc7b571814045d60404b55a0ee849f9bcfa2e63dd9b5"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"once_cell",
"wasm-bindgen-macro",
]
@@ -6371,7 +6373,7 @@ version = "0.4.43"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "61e9300f63a621e96ed275155c108eb6f843b6a26d053f122ab69724559dc8ed"
dependencies = [
"cfg-if 1.0.1",
"cfg-if 1.0.4",
"js-sys",
"wasm-bindgen",
"web-sys",
@@ -6471,7 +6473,7 @@ version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cf221c93e13a30d793f7645a0e7762c55d169dbb0a49671918a2319d289b10bb"
dependencies = [
"windows-sys 0.48.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -6803,9 +6805,9 @@ dependencies = [
[[package]]
name = "winnow"
version = "0.7.11"
version = "0.7.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "74c7b26e3480b707944fc872477815d29a8e429d2f93a1ce000f5fa84a15cbcd"
checksum = "5a5364e9d77fcdeeaa6062ced926ee3381faa2ee02d3eb83a5c27a8825540829"
dependencies = [
"memchr",
]

View File

@@ -43,8 +43,7 @@ serde = { version = "1.0.131", features = ["derive"] }
serde_json = "1.0.73"
# Image pull/unpack
image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "048ddaec4ecd6ee45c845d69bc39416908764560", features = [
"snapshot-overlayfs",
image-rs = { git = "https://github.com/confidential-containers/guest-components", rev = "026694d44d4ec483465d2fa5f80a0376166b174d", features = [
"oci-client-rustls",
"signature-cosign-rustls",
] }

View File

@@ -3024,9 +3024,9 @@ dependencies = [
[[package]]
name = "qapi"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c6412bdd014ebee03ddbbe79ac03a0b622cce4d80ba45254f6357c847f06fa38"
checksum = "7b047adab56acc4948d4b9b58693c1f33fd13efef2d6bb5f0f66a47436ceada8"
dependencies = [
"bytes",
"futures",
@@ -3061,9 +3061,9 @@ dependencies = [
[[package]]
name = "qapi-qmp"
version = "0.14.0"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8b944db7e544d2fa97595e9a000a6ba5c62c426fa185e7e00aabe4b5640b538"
checksum = "45303cac879d89361cad0287ae15f9ae1e7799b904b474152414aeece39b9875"
dependencies = [
"qapi-codegen",
"qapi-spec",

View File

@@ -81,6 +81,7 @@ pub enum Commands {
#[error("Argument is not valid")]
pub struct CheckArgument {
#[clap(subcommand)]
#[allow(unused_assignments)]
pub command: CheckSubCommand,
}

View File

@@ -486,11 +486,11 @@ mod tests {
let releases = get_kata_all_releases_by_url(KATA_GITHUB_RELEASE_URL);
// sometime in GitHub action accessing to github.com API may fail
// we can skip this test to prevent the whole test fail.
if releases.is_err() {
if let Err(error) = releases {
warn!(
sl!(),
"get kata version failed({:?}), this maybe a temporary error, just skip the test.",
releases.unwrap_err()
error
);
return;
}

View File

@@ -1 +0,0 @@
/vendor/

src/tools/runk/Cargo.lock (generated, 3943 lines)

File diff suppressed because it is too large

View File

@@ -1,38 +0,0 @@
[package]
name = "runk"
version = "0.0.1"
authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]
description = "runk: Kata OCI container runtime based on Kata agent"
license = "Apache-2.0"
edition = "2018"
[dependencies]
libcontainer = { path = "./libcontainer" }
rustjail = { path = "../../agent/rustjail", features = [
"standard-oci-runtime",
] }
runtime-spec = { path = "../../libs/runtime-spec" }
oci-spec = { version = "0.8.1", features = ["runtime"] }
logging = { path = "../../libs/logging" }
liboci-cli = "0.5.3"
clap = { version = "4.5.40", features = ["derive", "cargo"] }
libc = "0.2.108"
nix = "0.23.0"
anyhow = "1.0.52"
slog = "2.7.0"
chrono = { version = "0.4.19", features = ["serde"] }
slog-async = "2.7.0"
tokio = { version = "1.44.2", features = ["full"] }
serde = { version = "1.0.133", features = ["derive"] }
serde_json = "1.0.74"
uzers = "0.12.1"
tabwriter = "1.2.1"
[features]
seccomp = ["rustjail/seccomp"]
[dev-dependencies]
tempfile = "3.19.1"
[workspace]
members = ["libcontainer"]

View File

@@ -1,67 +0,0 @@
# Copyright 2021-2022 Sony Group Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# LIBC=musl|gnu (default: gnu)
LIBC ?= gnu
include ../../../utils.mk
TARGET = runk
TARGET_PATH = target/$(TRIPLE)/$(BUILD_TYPE)/$(TARGET)
AGENT_SOURCE_PATH = ../../agent
EXTRA_RUSTFEATURES :=
# Define if runk enables seccomp support (default: yes)
SECCOMP := yes
# BINDIR is a directory for installing executable programs
BINDIR := /usr/local/bin
ifeq ($(SECCOMP),yes)
override EXTRA_RUSTFEATURES += seccomp
endif
ifneq ($(EXTRA_RUSTFEATURES),)
override EXTRA_RUSTFEATURES := --features "$(EXTRA_RUSTFEATURES)"
endif
.DEFAULT_GOAL := default
default: build
build:
@RUSTFLAGS="$(EXTRA_RUSTFLAGS) --deny warnings" cargo build --target $(TRIPLE) --$(BUILD_TYPE) $(EXTRA_RUSTFEATURES)
static-checks-build:
@echo "INFO: static-checks-build do nothing.."
install:
install -D $(TARGET_PATH) $(BINDIR)/$(TARGET)
clean:
cargo clean
vendor:
cargo vendor
test: test-runk test-agent
test-runk:
cargo test --all --target $(TRIPLE) $(EXTRA_RUSTFEATURES) -- --nocapture
test-agent:
make test -C $(AGENT_SOURCE_PATH) STANDARD_OCI_RUNTIME=yes
check: standard_rust_check
.PHONY: \
build \
install \
clean \
clippy \
format \
vendor \
test \
check \


@@ -1,352 +0,0 @@
# runk
## Overview
> **Warnings:**
> `runk` is currently an experimental tool.
> Only continue if you are using a non-critical system.
`runk` is a standard OCI container runtime written in Rust based on a modified version of
the [Kata Container agent](https://github.com/kata-containers/kata-containers/tree/main/src/agent), `kata-agent`.
`runk` conforms to the [OCI Container Runtime specifications](https://github.com/opencontainers/runtime-spec).
Unlike the [Kata Container runtime](https://github.com/kata-containers/kata-containers/tree/main/src/agent#features),
`kata-runtime`, `runk` spawns and runs containers on the host machine directly.
You can run `runk` in the same way as existing container runtimes such as `runc`,
the most widely used implementation of the OCI runtime specs.
## Why does `runk` exist?
The `kata-agent` is a process running inside a virtual machine (VM) as a supervisor for managing containers
and processes running within those containers.
In other words, the `kata-agent` is a kind of "low-level" container runtime inside the VM, because the agent
spawns and runs containers according to the OCI runtime specs.
However, the `kata-agent` does not have the OCI Command-Line Interface (CLI) that is defined in the
[runtime spec](https://github.com/opencontainers/runtime-spec/blob/main/runtime.md).
The `kata-runtime` provides the CLI part of the Kata Containers runtime component,
but the `kata-runtime` is a container runtime for creating hardware-virtualized containers running on the host.
`runk` is a Rust-based standard OCI container runtime that manages normal containers,
not hardware-virtualized containers.
`runk` aims to become an alternative to existing OCI-compliant container runtimes.
The `kata-agent` has most of the [features](https://github.com/kata-containers/kata-containers/tree/main/src/agent#features)
needed for a container runtime and delivers high performance with a low memory footprint, owing to its
implementation in Rust.
Therefore, `runk` leverages the mechanism of the `kata-agent` to avoid reinventing the wheel.
## Performance
`runk` is faster than `runc` and has a lower memory footprint.
The table below shows the average elapsed time and memory footprint (maximum resident set size)
for sequentially running 100 containers. Each container runs `/bin/true` via the `run` command in
[detached mode](https://github.com/opencontainers/runc/blob/main/docs/terminals.md#detached)
on 12 CPU cores (`3.8 GHz AMD Ryzen 9 3900X`) with 32 GiB of RAM.
Note that `runk` currently always runs containers in detached mode.
Evaluation Results:
| | `runk` (v0.0.1) | `runc` (v1.0.3) | `crun` (v1.4.2) |
|-----------------------|---------------|---------------|---------------|
| time [ms] | 39.83 | 50.39 | 38.41 |
| memory footprint [MB] | 4.013 | 10.78 | 1.738 |
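One plausible way to reproduce this kind of measurement is sketched below. This is not the script used to produce the table above; it assumes GNU `time` and a prepared OCI bundle in the current directory whose process runs `/bin/true`:

```bash
# Sketch: time 100 sequential detached runs and record maximum RSS.
for i in $(seq 1 100); do
  /usr/bin/time -f "%e %M" -a -o bench.log sudo runk run "bench-$i"
  sudo runk delete --force "bench-$i"
done
# Average elapsed seconds and max RSS (KiB) over all runs.
awk '{ t += $1; m += $2 } END { print t/NR " s", m/NR " KiB" }' bench.log
```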
## Status of `runk`
We drafted the initial code here, and any contributions to `runk` and [`kata-agent`](https://github.com/kata-containers/kata-containers/tree/main/src/agent)
are welcome.
Regarding features compared to `runc`, see the `Status of runk` section in the [issue](https://github.com/kata-containers/kata-containers/issues/2784).
## Building
In order to enable seccomp support, you need to install the `libseccomp` library on
your platform.
> e.g. `libseccomp-dev` for Ubuntu, or `libseccomp-devel` for CentOS
You can build `runk`:
```bash
$ cd runk
$ make
```
If you want to build a statically linked binary of `runk`, set the environment
variables for the [`libseccomp` crate](https://github.com/libseccomp-rs/libseccomp-rs) and
set `LIBC` to `musl`:
```bash
$ export LIBSECCOMP_LINK_TYPE=static
$ export LIBSECCOMP_LIB_PATH="the path of the directory containing libseccomp.a"
$ export LIBC=musl
$ make
```
> **Note**:
>
> - If the compilation fails when `runk` tries to link the `libseccomp` library statically
> against `musl`, you will need to build the `libseccomp` manually with `-U_FORTIFY_SOURCE`.
> For the details, see [our script](https://github.com/kata-containers/kata-containers/blob/main/ci/install_libseccomp.sh)
> to install the `libseccomp` for the agent.
> - On `ppc64le` and `s390x`, `glibc` should be used even if `LIBC=musl` is specified.
> - If you do not want to enable seccomp support, run `make SECCOMP=no`.
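For reference, a manual `libseccomp` build along the lines of the first note could look like the sketch below; the version number and install prefix are illustrative, and the linked script remains the authoritative recipe:

```bash
# Illustrative libseccomp build with -U_FORTIFY_SOURCE; adjust version/prefix.
curl -sSLO "https://github.com/seccomp/libseccomp/releases/download/v2.5.4/libseccomp-2.5.4.tar.gz"
tar -xf libseccomp-2.5.4.tar.gz && cd libseccomp-2.5.4
./configure CFLAGS="-U_FORTIFY_SOURCE -O2" --prefix="${PWD}/install" --enable-static
make && make install
# Point the libseccomp crate at the static library just built.
export LIBSECCOMP_LINK_TYPE=static
export LIBSECCOMP_LIB_PATH="${PWD}/install/lib"
```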
To install `runk` into the default directory for executable programs (`/usr/local/bin`):
```bash
$ sudo -E make install
```
## Using `runk` directly
Please note that `runk` is a low-level tool not developed with end users in mind.
It is mostly employed by higher-level container software such as `containerd`.
If you still want to use `runk` directly, here's how.
### Prerequisites
It is necessary to create an OCI bundle to use the tool. The simplest method is:
``` bash
$ bundle_dir="bundle"
$ rootfs_dir="$bundle_dir/rootfs"
$ image="busybox"
$ mkdir -p "$rootfs_dir" && (cd "$bundle_dir" && runk spec)
$ sudo docker export $(sudo docker create "$image") | tar -C "$rootfs_dir" -xf -
```
> **Note:**
> If you use the unmodified `runk spec` template, this should give a `sh` session inside the container.
> However, if you use `runk` directly and run a container with the unmodified template,
> `runk` cannot launch the `sh` session because `runk` does not support terminal handling yet.
> You need to edit the process field in `config.json` so that it looks like the example below,
> with `"terminal": false` and `"args": ["sleep", "10"]`.
```json
"process": {
"terminal": false,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sleep",
"10"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
[...]
}
```
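If `jq` is available, the same edit can be scripted rather than made by hand (a convenience sketch; the field names come straight from the OCI runtime spec, and `$bundle_dir` is the variable set in the prerequisites above):

```bash
$ jq '.process.terminal = false | .process.args = ["sleep", "10"]' \
    "$bundle_dir/config.json" > /tmp/config.json \
  && mv /tmp/config.json "$bundle_dir/config.json"
```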
If you want to launch the `sh` session inside the container, you need to run `runk` from `containerd`.
Please refer to the [Using `runk` from containerd](#using-runk-from-containerd) section.
### Running a container
Now you can go through the [lifecycle operations](https://github.com/opencontainers/runtime-spec/blob/main/runtime.md)
in your shell.
You need to run `runk` as `root` because `runk` does not support rootless operation,
that is, running containers without root privileges.
```bash
$ cd $bundle_dir
# Create a container
$ sudo runk create test
# Verify that the container has been created and is in the "created" state
$ sudo runk state test
# Start the process inside the container
$ sudo runk start test
# After 10 seconds, verify that the container has exited and is now in the "stopped" state
$ sudo runk state test
# Now delete the container
$ sudo runk delete test
```
## Using `runk` from `Docker`
`runk` can run containers using [`Docker`](https://github.com/docker).
First, install `Docker` by following the
[`Docker` installation instructions](https://docs.docker.com/engine/install/).
### Running a container with `Docker` command line
Start the docker daemon:
```bash
$ sudo dockerd --experimental --add-runtime="runk=/usr/local/bin/runk"
```
> **Note:**
> Before starting `dockerd` this way, you need to stop the normal Docker daemon
> running on your system (i.e., `systemctl stop docker`).
Launch a container in a different terminal:
```bash
$ sudo docker run -it --rm --runtime runk busybox sh
/ #
```
## Using `runk` from `Podman`
`runk` can run containers using [`Podman`](https://github.com/containers/podman).
First, install `Podman` by following the
[`Podman` installation instructions](https://podman.io/getting-started/installation).
### Running a container with `Podman` command line
```bash
$ sudo podman --runtime /usr/local/bin/runk run -it --rm busybox sh
/ #
```
> **Note:**
> `runk` does not yet support commands beyond the
> [OCI standard operations](https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#operations),
> so such commands do not work in `Docker`/`Podman`. For the commands currently
> implemented in `runk`, see the [Status of `runk`](#status-of-runk) section.
## Using `runk` from `containerd`
`runk` can run containers via the runtime handler support in `containerd`.
### Prerequisites for `runk` with containerd
* `containerd` v1.2.4 or above
* `cri-tools`
> **Note:**
> [`cri-tools`](https://github.com/kubernetes-sigs/cri-tools) is a set of tools for CRI
> used for development and testing.
Install `cri-tools` from source code:
```bash
$ go get github.com/kubernetes-sigs/cri-tools
$ pushd $GOPATH/src/github.com/kubernetes-sigs/cri-tools
$ make
$ sudo -E make install
$ popd
```
Write the `crictl` configuration file:
``` bash
$ cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
EOF
```
### Configure `containerd` to use `runk`
Update `/etc/containerd/config.toml`:
```bash
$ cat <<EOF | sudo tee /etc/containerd/config.toml
version = 2
[plugins."io.containerd.runtime.v1.linux"]
shim_debug = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runk]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runk.options]
BinaryName = "/usr/local/bin/runk"
EOF
```
Restart `containerd`:
```bash
$ sudo systemctl restart containerd
```
### Running a container with `crictl` command line
You can run containers with `runk` via containerd's CRI.
Pull the `busybox` image:
``` bash
$ sudo crictl pull busybox
```
Create the sandbox configuration:
``` bash
$ cat <<EOF | tee sandbox.json
{
"metadata": {
"name": "busybox-sandbox",
"namespace": "default",
"attempt": 1,
"uid": "hdishd83djaidwnduwk28bcsb"
},
"log_directory": "/tmp",
"linux": {
}
}
EOF
```
Create the container configuration:
``` bash
$ cat <<EOF | tee container.json
{
"metadata": {
"name": "busybox"
},
"image": {
"image": "docker.io/busybox"
},
"command": [
"sh"
],
"envs": [
{
"key": "PATH",
"value": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
},
{
"key": "TERM",
"value": "xterm"
}
],
"log_path": "busybox.0.log",
"stdin": true,
"stdin_once": true,
"tty": true
}
EOF
```
With the `crictl` command line from `cri-tools`, you can specify the runtime class with the `-r` or `--runtime` flag.
Launch a sandbox and container using `crictl`:
```bash
# Run a container inside a sandbox
$ sudo crictl run -r runk container.json sandbox.json
f492eee753887ba3dfbba9022028975380739aba1269df431d097b73b23c3871
# Attach to the running container
$ sudo crictl attach --stdin --tty f492eee753887ba3dfbba9022028975380739aba1269df431d097b73b23c3871
/ #
```
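When you are done, the sandbox and its container can be removed through CRI as well. `crictl stopp` and `crictl rmp` operate at the pod (sandbox) level; the command substitution below simply looks up the ID of the sandbox created above:

```bash
# Stop and remove the pod sandbox (and its container).
$ sudo crictl stopp "$(sudo crictl pods --name busybox-sandbox -q)"
$ sudo crictl rmp "$(sudo crictl pods --name busybox-sandbox -q)"
```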


@@ -1,32 +0,0 @@
[package]
name = "libcontainer"
version = "0.0.1"
authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]
description = "Library for runk container"
license = "Apache-2.0"
edition = "2018"
[dependencies]
rustjail = { path = "../../../agent/rustjail", features = [
"standard-oci-runtime",
] }
runtime-spec = { path = "../../../libs/runtime-spec" }
oci-spec = { version = "0.8.1", features = ["runtime"] }
kata-sys-util = { path = "../../../libs/kata-sys-util" }
logging = { path = "../../../libs/logging" }
derive_builder = "0.10.2"
libc = "0.2.108"
nix = "0.23.0"
anyhow = "1.0.52"
slog = "2.7.0"
chrono = { version = "0.4.19", features = ["serde"] }
serde = { version = "1.0.133", features = ["derive"] }
serde_json = "1.0.74"
scopeguard = "1.1.0"
cgroups = { package = "cgroups-rs", git = "https://github.com/kata-containers/cgroups-rs", rev = "v0.3.5" }
procfs = "0.14.0"
[dev-dependencies]
tempfile = "3.19.1"
test-utils = { path = "../../../libs/test-utils" }
protocols = { path = "../../../libs/protocols" }


@@ -1,336 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::container::{load_linux_container, Container, ContainerLauncher};
use crate::status::Status;
use crate::utils::validate_spec;
use anyhow::{anyhow, Result};
use derive_builder::Builder;
use oci::{Process as OCIProcess, Spec};
use oci_spec::runtime as oci;
use runtime_spec::ContainerState;
use rustjail::container::update_namespaces;
use slog::{debug, Logger};
use std::fs::File;
use std::path::{Path, PathBuf};
/// Used for exec command. It will prepare the options for joining an existing container.
#[derive(Default, Builder, Debug, Clone)]
#[builder(build_fn(validate = "Self::validate"))]
pub struct ActivatedContainer {
pub id: String,
pub root: PathBuf,
pub console_socket: Option<PathBuf>,
pub pid_file: Option<PathBuf>,
pub tty: bool,
pub cwd: Option<PathBuf>,
pub env: Vec<(String, String)>,
pub no_new_privs: bool,
pub args: Vec<String>,
pub process: Option<PathBuf>,
}
impl ActivatedContainerBuilder {
/// pre-validate before building ActivatedContainer
fn validate(&self) -> Result<(), String> {
// ensure container exists
let id = self.id.as_ref().unwrap();
let root = self.root.as_ref().unwrap();
let status_path = Status::get_dir_path(root, id);
if !status_path.exists() {
return Err(format!(
"container {} does not exist at path {:?}",
id, root
));
}
// ensure argv will not be empty in process exec phase later
let process = self.process.as_ref().unwrap();
let args = self.args.as_ref().unwrap();
if process.is_none() && args.is_empty() {
return Err("process and args cannot be all empty".to_string());
}
Ok(())
}
}
impl ActivatedContainer {
/// Create ContainerLauncher that can be used to spawn a process in an existing container.
/// This reads the spec from the status file of an existing container and adapts it with the given process file
/// or other options such as args and env. It also changes the namespaces in the spec to join the container.
pub fn create_launcher(self, logger: &Logger) -> Result<ContainerLauncher> {
debug!(
logger,
"enter ActivatedContainer::create_launcher {:?}", self
);
let mut container = Container::load(&self.root, &self.id)?;
// If state is Created or Running, we can execute the process.
if container.state != ContainerState::Created && container.state != ContainerState::Running
{
return Err(anyhow!(
"cannot exec in a stopped or paused container, state: {:?}",
container.state
));
}
let spec = container
.status
.config
.spec
.as_mut()
.ok_or_else(|| anyhow!("spec config was not present"))?;
self.adapt_exec_spec(spec, container.status.pid, logger)?;
debug!(logger, "adapted spec: {:?}", spec);
validate_spec(spec, &self.console_socket)?;
debug!(
logger,
"load LinuxContainer with config: {:?}", &container.status.config
);
let runner = load_linux_container(&container.status, self.console_socket, logger)?;
Ok(ContainerLauncher::new(
&self.id,
&container.status.bundle,
&self.root,
false,
runner,
self.pid_file,
))
}
/// Adapt spec to execute a new process which will join the container.
fn adapt_exec_spec(&self, spec: &mut Spec, pid: i32, logger: &Logger) -> Result<()> {
// If with --process, load process from file.
// Otherwise, update process with args and other options.
if let Some(process_path) = self.process.as_ref() {
spec.set_process(Some(Self::get_process(process_path)?));
} else if let Some(process) = spec.process_mut().as_mut() {
self.update_process(process)?;
} else {
return Err(anyhow!("process is empty in spec"));
};
// Exec process will join the container's namespaces
update_namespaces(logger, spec, pid)?;
Ok(())
}
/// Update process with args and other options.
fn update_process(&self, process: &mut OCIProcess) -> Result<()> {
process.set_args(Some(self.args.clone()));
process.set_no_new_privileges(Some(self.no_new_privs));
process.set_terminal(Some(self.tty));
if let Some(cwd) = self.cwd.as_ref() {
process.set_cwd(cwd.as_path().to_path_buf());
}
if let Some(process_env) = process.env_mut() {
process_env.extend(self.env.iter().map(|kv| format!("{}={}", kv.0, kv.1)));
}
Ok(())
}
/// Read and parse OCI Process from path
fn get_process(process_path: &Path) -> Result<OCIProcess> {
let f = File::open(process_path)?;
Ok(serde_json::from_reader(f)?)
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::status::Status;
use crate::utils::test_utils::*;
use nix::unistd::getpid;
use oci_spec::runtime::{LinuxBuilder, LinuxNamespaceBuilder, ProcessBuilder, User};
use rustjail::container::TYPETONAME;
use scopeguard::defer;
use slog::o;
use std::{
fs::{create_dir_all, File},
path::PathBuf,
};
use tempfile::tempdir;
use test_utils::skip_if_not_root;
fn create_activated_dirs(root: &Path, id: &str, bundle: &Path) {
Status::create_dir(root, id).unwrap();
create_dir_all(bundle.join(TEST_ROOTFS_PATH)).unwrap();
}
#[test]
fn test_activated_container_validate() {
let root = tempdir().unwrap();
let id = TEST_CONTAINER_ID.to_string();
Status::create_dir(root.path(), &id).unwrap();
let result = ActivatedContainerBuilder::default()
.id(id)
.root(root.into_path())
.console_socket(None)
.pid_file(None)
.tty(false)
.cwd(None)
.env(Vec::new())
.no_new_privs(false)
.process(None)
.args(vec!["sleep".to_string(), "10".to_string()])
.build();
assert!(result.is_ok());
}
#[test]
fn test_activated_container_create() {
// create cgroup directory needs root permission
skip_if_not_root!();
let logger = slog::Logger::root(slog::Discard, o!());
let bundle_dir = tempdir().unwrap();
let root = tempdir().unwrap();
// Since tests are executed concurrently, container_id must be unique in tests using cgroups.
// Otherwise, the cgroup directory may be removed by another test prematurely.
let id = "test_activated_container_create".to_string();
create_activated_dirs(root.path(), &id, bundle_dir.path());
let pid = getpid().as_raw();
let mut spec = create_dummy_spec();
spec.root_mut()
.as_mut()
.unwrap()
.set_path(bundle_dir.path().join(TEST_ROOTFS_PATH));
let status = create_custom_dummy_status(&id, pid, root.path(), &spec);
status.save().unwrap();
// create an empty cgroup directory to avoid is_paused failing
let cgroup = create_dummy_cgroup(Path::new(id.as_str()));
defer!(cgroup.delete().unwrap());
let result = ActivatedContainerBuilder::default()
.id(id)
.root(root.into_path())
.console_socket(Some(PathBuf::from(TEST_CONSOLE_SOCKET_PATH)))
.pid_file(Some(PathBuf::from(TEST_PID_FILE_PATH)))
.tty(true)
.cwd(Some(PathBuf::from(TEST_BUNDLE_PATH)))
.env(vec![
("K1".to_string(), "V1".to_string()),
("K2".to_string(), "V2".to_string()),
])
.no_new_privs(true)
.process(None)
.args(vec!["sleep".to_string(), "10".to_string()])
.build()
.unwrap();
let linux = LinuxBuilder::default()
.namespaces(
TYPETONAME
.iter()
.filter(|&(_, &name)| name != "user")
.map(|ns| {
LinuxNamespaceBuilder::default()
.typ(ns.0.clone())
.path(PathBuf::from(&format!("/proc/{}/ns/{}", pid, ns.1)))
.build()
.unwrap()
})
.collect::<Vec<_>>(),
)
.build()
.unwrap();
spec.set_linux(Some(linux));
let process = ProcessBuilder::default()
.terminal(result.tty)
.user(User::default())
.args(result.args.clone())
.cwd(result.cwd.clone().unwrap().to_string_lossy().to_string())
.env(vec![
"PATH=/bin:/usr/bin".to_string(),
"K1=V1".to_string(),
"K2=V2".to_string(),
])
.no_new_privileges(result.no_new_privs)
.build()
.unwrap();
spec.set_process(Some(process));
let launcher = result.clone().create_launcher(&logger).unwrap();
assert!(!launcher.init);
assert_eq!(launcher.runner.config.spec.unwrap(), spec);
assert_eq!(
launcher.runner.console_socket,
result.console_socket.unwrap()
);
assert_eq!(launcher.pid_file, result.pid_file);
}
#[test]
fn test_activated_container_create_with_process() {
// create cgroup directory needs root permission
skip_if_not_root!();
let bundle_dir = tempdir().unwrap();
let process_file = bundle_dir.path().join(TEST_PROCESS_FILE_NAME);
let mut process_template = OCIProcess::default();
process_template.set_args(Some(vec!["sleep".to_string(), "10".to_string()]));
process_template.set_cwd(PathBuf::from("/"));
let file = File::create(process_file.clone()).unwrap();
serde_json::to_writer(&file, &process_template).unwrap();
let logger = slog::Logger::root(slog::Discard, o!());
let root = tempdir().unwrap();
// Since tests are executed concurrently, container_id must be unique in tests using cgroups.
// Otherwise, the cgroup directory may be removed by another test prematurely.
let id = "test_activated_container_create_with_process".to_string();
let pid = getpid().as_raw();
let mut spec = create_dummy_spec();
spec.root_mut()
.as_mut()
.unwrap()
.set_path(bundle_dir.path().join(TEST_ROOTFS_PATH));
create_activated_dirs(root.path(), &id, bundle_dir.path());
let status = create_custom_dummy_status(&id, pid, root.path(), &spec);
status.save().unwrap();
// create an empty cgroup directory to avoid is_paused failing
let cgroup = create_dummy_cgroup(Path::new(id.as_str()));
defer!(cgroup.delete().unwrap());
let launcher = ActivatedContainerBuilder::default()
.id(id)
.root(root.into_path())
.console_socket(Some(PathBuf::from(TEST_CONSOLE_SOCKET_PATH)))
.pid_file(None)
.tty(true)
.cwd(Some(PathBuf::from(TEST_BUNDLE_PATH)))
.env(vec![
("K1".to_string(), "V1".to_string()),
("K2".to_string(), "V2".to_string()),
])
.no_new_privs(true)
.process(Some(process_file))
.args(vec!["sleep".to_string(), "10".to_string()])
.build()
.unwrap()
.create_launcher(&logger)
.unwrap();
assert!(!launcher.init);
assert_eq!(
launcher
.runner
.config
.spec
.unwrap()
.process()
.clone()
.unwrap(),
process_template
);
}
}


@@ -1,77 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::anyhow;
use anyhow::Result;
use cgroups;
use cgroups::freezer::{FreezerController, FreezerState};
use std::{thread, time};
// Try to remove the provided cgroup path, retrying five times with an increasing delay between tries.
// If the cgroup still cannot be removed after all retries, an appropriate error is returned.
pub fn remove_cgroup_dir(cgroup: &cgroups::Cgroup) -> Result<()> {
let mut retries = 5;
let mut delay = time::Duration::from_millis(10);
while retries != 0 {
if retries != 5 {
delay *= 2;
thread::sleep(delay);
}
if cgroup.delete().is_ok() {
return Ok(());
}
retries -= 1;
}
Err(anyhow!("failed to remove cgroups paths"))
}
// Make sure we get a stable freezer state, so retry if the cgroup is still undergoing freezing.
pub fn get_freezer_state(freezer: &FreezerController) -> Result<FreezerState> {
let mut retries = 10;
while retries != 0 {
let state = freezer.state()?;
match state {
FreezerState::Thawed => return Ok(FreezerState::Thawed),
FreezerState::Frozen => return Ok(FreezerState::Frozen),
FreezerState::Freezing => {
// sleep for 10 ms, wait for the cgroup to finish freezing
thread::sleep(time::Duration::from_millis(10));
retries -= 1;
}
}
}
Ok(FreezerState::Freezing)
}
// check whether freezer state is frozen
pub fn is_paused(cgroup: &cgroups::Cgroup) -> Result<bool> {
let freezer_controller: &FreezerController = cgroup
.controller_of()
.ok_or_else(|| anyhow!("failed to get freezer controller"))?;
let freezer_state = get_freezer_state(freezer_controller)?;
match freezer_state {
FreezerState::Frozen => Ok(true),
_ => Ok(false),
}
}
pub fn freeze(cgroup: &cgroups::Cgroup, state: FreezerState) -> Result<()> {
let freezer_controller: &FreezerController = cgroup
.controller_of()
.ok_or_else(|| anyhow!("failed to get freezer controller"))?;
match state {
FreezerState::Frozen => {
freezer_controller.freeze()?;
}
FreezerState::Thawed => {
freezer_controller.thaw()?;
}
_ => return Err(anyhow!("invalid freezer state")),
}
Ok(())
}


@@ -1,437 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::cgroup::{freeze, remove_cgroup_dir};
use crate::status::{self, get_current_container_state, Status};
use anyhow::{anyhow, Result};
use cgroups;
use cgroups::freezer::FreezerState;
use cgroups::hierarchies::is_cgroup2_unified_mode;
use nix::sys::signal::kill;
use nix::{
sys::signal::Signal,
sys::signal::SIGKILL,
unistd::{chdir, unlink, Pid},
};
use procfs;
use runtime_spec::{ContainerState, State as OCIState};
use rustjail::cgroups::fs::Manager as CgroupManager;
use rustjail::{
container::{BaseContainer, LinuxContainer, EXEC_FIFO_FILENAME},
process::{Process, ProcessOperations},
specconv::CreateOpts,
};
use scopeguard::defer;
use slog::{debug, info, Logger};
use std::{
env::current_dir,
fs,
path::{Path, PathBuf},
};
use kata_sys_util::hooks::HookStates;
pub const CONFIG_FILE_NAME: &str = "config.json";
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum ContainerAction {
Create,
Start,
Run,
}
#[derive(Debug)]
pub struct Container {
pub status: Status,
pub state: ContainerState,
pub cgroup: cgroups::Cgroup,
}
// Container represents a container that is created by the container runtime.
impl Container {
pub fn load(state_root: &Path, id: &str) -> Result<Self> {
let status = Status::load(state_root, id)?;
let spec = status
.config
.spec
.as_ref()
.ok_or_else(|| anyhow!("spec config was not present"))?;
let linux = spec
.linux()
.as_ref()
.ok_or_else(|| anyhow!("linux config was not present"))?;
let cpath = if linux.cgroups_path().is_none() {
id.to_string()
} else {
linux
.cgroups_path()
.clone()
.unwrap_or_default()
.display()
.to_string()
.trim_start_matches('/')
.to_string()
};
let cgroup = cgroups::Cgroup::load(cgroups::hierarchies::auto(), cpath);
let state = get_current_container_state(&status, &cgroup)?;
Ok(Self {
status,
state,
cgroup,
})
}
pub fn processes(&self) -> Result<Vec<Pid>> {
let pids = self.cgroup.tasks();
let result = pids.iter().map(|x| Pid::from_raw(x.pid as i32)).collect();
Ok(result)
}
pub fn kill(&self, signal: Signal, all: bool) -> Result<()> {
if all {
let pids = self.processes()?;
for pid in pids {
if !status::is_process_running(pid)? {
continue;
}
kill(pid, signal)?;
}
} else {
// If the --all option is not specified and the container is stopped,
// the kill operation returns an error, in accordance with the OCI runtime spec.
if self.state == ContainerState::Stopped {
return Err(anyhow!(
"container {} can't be killed because it is {:?}",
self.status.id,
self.state
)
// This error message mustn't be changed because the containerd integration tests
// expect OCI container runtimes to return exactly this message.
// Ref. https://github.com/containerd/containerd/blob/release/1.7/pkg/process/utils.go#L135
.context("container not running"));
}
let pid = Pid::from_raw(self.status.pid);
if status::is_process_running(pid)? {
kill(pid, signal)?;
}
}
// For cgroup v1, killing a process in a frozen cgroup does nothing until it's thawed.
// Only thaw the cgroup for SIGKILL.
// Ref: https://github.com/opencontainers/runc/pull/3217
if !is_cgroup2_unified_mode() && self.state == ContainerState::Paused && signal == SIGKILL {
freeze(&self.cgroup, FreezerState::Thawed)?;
}
Ok(())
}
pub async fn delete(&self, force: bool, logger: &Logger) -> Result<()> {
let status = &self.status;
let spec = status
.config
.spec
.as_ref()
.ok_or_else(|| anyhow!("spec config was not present in the status"))?;
let oci_state = OCIState {
version: status.oci_version.clone(),
id: status.id.clone(),
status: self.state,
pid: status.pid,
bundle: status
.bundle
.to_str()
.ok_or_else(|| anyhow!("invalid bundle path"))?
.to_string(),
annotations: spec.annotations().clone().unwrap_or_default(),
};
if let Some(hooks) = spec.hooks().as_ref() {
info!(&logger, "Poststop Hooks");
let mut poststop_hookstates = HookStates::new();
poststop_hookstates.execute_hooks(
&hooks.poststop().clone().unwrap_or_default(),
Some(oci_state.clone()),
)?;
}
match oci_state.status {
ContainerState::Stopped => {
self.destroy()?;
}
ContainerState::Created => {
// Kill an init process
self.kill(SIGKILL, false)?;
self.destroy()?;
}
_ => {
if force {
self.kill(SIGKILL, true)?;
self.destroy()?;
} else {
return Err(anyhow!(
"cannot delete container {} that is not stopped",
&status.id
));
}
}
}
Ok(())
}
pub fn pause(&self) -> Result<()> {
if self.state != ContainerState::Running && self.state != ContainerState::Created {
return Err(anyhow!(
"failed to pause container: current status is: {:?}",
self.state
));
}
freeze(&self.cgroup, FreezerState::Frozen)?;
Ok(())
}
pub fn resume(&self) -> Result<()> {
if self.state != ContainerState::Paused {
return Err(anyhow!(
"failed to resume container: current status is: {:?}",
self.state
));
}
freeze(&self.cgroup, FreezerState::Thawed)?;
Ok(())
}
pub fn destroy(&self) -> Result<()> {
remove_cgroup_dir(&self.cgroup)?;
self.status.remove_dir()
}
}
/// Used to run a process. If init is set, it will create a container and run the process in it.
/// If init is not set, it will run the process in an existing container.
#[derive(Debug)]
pub struct ContainerLauncher {
pub id: String,
pub bundle: PathBuf,
pub state_root: PathBuf,
pub init: bool,
pub runner: LinuxContainer,
pub pid_file: Option<PathBuf>,
}
impl ContainerLauncher {
pub fn new(
id: &str,
bundle: &Path,
state_root: &Path,
init: bool,
runner: LinuxContainer,
pid_file: Option<PathBuf>,
) -> Self {
ContainerLauncher {
id: id.to_string(),
bundle: bundle.to_path_buf(),
state_root: state_root.to_path_buf(),
init,
runner,
pid_file,
}
}
/// Launch a process. For init containers, we will create a container. For non-init, it will join an existing container.
pub async fn launch(&mut self, action: ContainerAction, logger: &Logger) -> Result<()> {
if self.init {
self.spawn_container(action, logger).await?;
} else {
if action == ContainerAction::Create {
return Err(anyhow!(
"ContainerAction::Create is used for init-container only"
));
}
self.spawn_process(action, logger).await?;
}
if let Some(pid_file) = self.pid_file.as_ref() {
fs::write(
pid_file,
format!("{}", self.runner.get_process(self.id.as_str())?.pid()),
)?;
}
Ok(())
}
/// Create the container by invoking runner to spawn the first process and save status.
async fn spawn_container(&mut self, action: ContainerAction, logger: &Logger) -> Result<()> {
// State root path root/id has been created in LinuxContainer::new(),
// so we don't have to create it again.
// Spawn a new process in the container by reusing the agent's code.
self.spawn_process(action, logger).await?;
let status = self.get_status()?;
status.save()?;
debug!(logger, "saved status is {:?}", status);
// Clean up the fifo file created by LinuxContainer, which is used to block the created process.
if action == ContainerAction::Run || action == ContainerAction::Start {
let fifo_path = get_fifo_path(&status);
if fifo_path.exists() {
unlink(&fifo_path)?;
}
}
Ok(())
}
/// Generate rustjail::Process from OCI::Process
fn get_process(&self, logger: &Logger) -> Result<Process> {
let spec = self.runner.config.spec.as_ref().unwrap();
if spec.process().is_some() {
Ok(Process::new(
logger,
spec.process().as_ref().unwrap(),
// rustjail::LinuxContainer uses the exec_id to identify processes in a container,
// so we can get the spawned process via ctr.get_process(exec_id) later.
// Since a LinuxContainer is created temporarily to spawn one process per runk invocation,
// we can use an arbitrary string as the exec_id; here we choose the container id.
&self.id,
self.init,
0,
None,
)?)
} else {
Err(anyhow!("no process configuration"))
}
}
/// Spawn a new process in the container by invoking runner.
async fn spawn_process(&mut self, action: ContainerAction, logger: &Logger) -> Result<()> {
// The agent will chdir to the bundle path before creating LinuxContainer. Do the same as the agent here.
let current_dir = current_dir()?;
chdir(&self.bundle)?;
defer! {
chdir(&current_dir).unwrap();
}
let process = self.get_process(logger)?;
match action {
ContainerAction::Create => {
self.runner.start(process).await?;
}
ContainerAction::Start => {
self.runner.exec().await?;
}
ContainerAction::Run => {
self.runner.run(process).await?;
}
}
Ok(())
}
/// Generate runk specified Status
fn get_status(&self) -> Result<Status> {
let oci_state = self.runner.oci_state()?;
// read start time from /proc/<pid>/stat
let proc = procfs::process::Process::new(self.runner.init_process_pid)?;
let process_start_time = proc.stat()?.starttime;
Status::new(
&self.state_root,
&self.bundle,
oci_state,
process_start_time,
self.runner.created,
self.runner
.cgroup_manager
.as_ref()
.as_any()?
.downcast_ref::<CgroupManager>()
.unwrap()
.clone(),
self.runner.config.clone(),
)
}
}
pub fn create_linux_container(
id: &str,
root: &Path,
config: CreateOpts,
console_socket: Option<PathBuf>,
logger: &Logger,
) -> Result<LinuxContainer> {
let mut container = LinuxContainer::new(
id,
root.to_str()
.map(|s| s.to_string())
.ok_or_else(|| anyhow!("failed to convert bundle path"))?
.as_str(),
None,
config,
logger,
)?;
if let Some(socket_path) = console_socket.as_ref() {
container.set_console_socket(socket_path)?;
}
Ok(container)
}
// Load rustjail's Linux container.
// "uid_map_path" and "gid_map_path" are always empty, so they are not set.
pub fn load_linux_container(
status: &Status,
console_socket: Option<PathBuf>,
logger: &Logger,
) -> Result<LinuxContainer> {
let mut container = LinuxContainer::new(
&status.id,
&status
.root
.to_str()
.map(|s| s.to_string())
.ok_or_else(|| anyhow!("failed to convert a root path"))?,
None,
status.config.clone(),
logger,
)?;
if let Some(socket_path) = console_socket.as_ref() {
container.set_console_socket(socket_path)?;
}
container.init_process_pid = status.pid;
container.init_process_start_time = status.process_start_time;
container.created = status.created.into();
Ok(container)
}
pub fn get_config_path<P: AsRef<Path>>(bundle: P) -> PathBuf {
bundle.as_ref().join(CONFIG_FILE_NAME)
}
pub fn get_fifo_path(status: &Status) -> PathBuf {
status.root.join(&status.id).join(EXEC_FIFO_FILENAME)
}
#[cfg(test)]
mod tests {
use super::*;
use crate::utils::test_utils::*;
use rustjail::container::EXEC_FIFO_FILENAME;
use std::path::PathBuf;
#[test]
fn test_get_config_path() {
let test_data = PathBuf::from(TEST_BUNDLE_PATH).join(CONFIG_FILE_NAME);
assert_eq!(get_config_path(TEST_BUNDLE_PATH), test_data);
}
#[test]
fn test_get_fifo_path() {
let test_data = PathBuf::from(TEST_STATE_ROOT_PATH)
.join(TEST_CONTAINER_ID)
.join(EXEC_FIFO_FILENAME);
let status = create_dummy_status();
assert_eq!(get_fifo_path(&status), test_data);
}
}


@@ -1,140 +0,0 @@
// Copyright 2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::container::{load_linux_container, Container, ContainerLauncher};
use anyhow::{anyhow, Result};
use derive_builder::Builder;
use runtime_spec::ContainerState;
use slog::{debug, Logger};
use std::path::PathBuf;
/// Used for the start command. It prepares the options for starting a container that is in the created state.
#[derive(Default, Builder, Debug, Clone)]
#[builder(build_fn(validate = "Self::validate"))]
pub struct CreatedContainer {
id: String,
root: PathBuf,
}
impl CreatedContainerBuilder {
/// pre-validate before building CreatedContainer
fn validate(&self) -> Result<(), String> {
// ensure container exists
let id = self.id.as_ref().unwrap();
let root = self.root.as_ref().unwrap();
let path = root.join(id);
if !path.as_path().exists() {
return Err(format!("container {} does not exist", id));
}
Ok(())
}
}
impl CreatedContainer {
/// Create ContainerLauncher that can be used to start a process from an existing init container.
/// It reads the spec from the status file of the init container.
pub fn create_launcher(self, logger: &Logger) -> Result<ContainerLauncher> {
debug!(logger, "enter CreatedContainer::create_launcher {:?}", self);
let container = Container::load(&self.root, &self.id)?;
if container.state != ContainerState::Created {
return Err(anyhow!(
"cannot start a container in the {:?} state",
container.state
));
}
let config = container.status.config.clone();
debug!(
logger,
"Prepare LinuxContainer for starting with config: {:?}", config
);
let runner = load_linux_container(&container.status, None, logger)?;
Ok(ContainerLauncher::new(
&self.id,
&container.status.bundle,
&self.root,
true,
runner,
None,
))
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::status::Status;
use crate::utils::test_utils::*;
use nix::sys::stat::Mode;
use nix::unistd::{self, getpid};
use rustjail::container::EXEC_FIFO_FILENAME;
use scopeguard::defer;
use slog::o;
use std::fs::create_dir_all;
use std::path::Path;
use tempfile::tempdir;
use test_utils::skip_if_not_root;
fn create_created_container_dirs(root: &Path, id: &str, bundle: &Path) {
Status::create_dir(root, id).unwrap();
let fifo = root.join(id).join(EXEC_FIFO_FILENAME);
unistd::mkfifo(&fifo, Mode::from_bits(0o644).unwrap()).unwrap();
create_dir_all(bundle.join(TEST_ROOTFS_PATH)).unwrap();
}
#[test]
fn test_created_container_validate() {
let root = tempdir().unwrap();
let id = TEST_CONTAINER_ID.to_string();
let result = CreatedContainerBuilder::default()
.id(id)
.root(root.path().to_path_buf())
.build();
assert!(result.is_err());
}
#[test]
fn test_created_container_create_launcher() {
// create cgroup directory needs root permission
skip_if_not_root!();
let logger = slog::Logger::root(slog::Discard, o!());
let bundle_dir = tempdir().unwrap();
let root = tempdir().unwrap();
// Since tests are executed concurrently, container_id must be unique in tests using cgroups.
// Otherwise, the cgroup directory may be removed by another test prematurely.
let id = "test_created_container_create".to_string();
create_created_container_dirs(root.path(), &id, bundle_dir.path());
let pid = getpid().as_raw();
let mut spec = create_dummy_spec();
spec.root_mut()
.as_mut()
.unwrap()
.set_path(bundle_dir.path().join(TEST_ROOTFS_PATH));
let status = create_custom_dummy_status(&id, pid, root.path(), &spec);
status.save().unwrap();
// create an empty cgroup directory to avoid is_paused failing
let cgroup = create_dummy_cgroup(Path::new(id.as_str()));
defer!(cgroup.delete().unwrap());
let launcher = CreatedContainerBuilder::default()
.id(id.clone())
.root(root.into_path())
.build()
.unwrap()
.create_launcher(&logger)
.unwrap();
assert!(launcher.init);
assert_eq!(launcher.runner.config.spec.unwrap(), spec);
assert_eq!(launcher.runner.id, id);
}
}


@@ -1,215 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::container::{create_linux_container, get_config_path, ContainerLauncher};
use crate::status::Status;
use crate::utils::{canonicalize_spec_root, validate_spec};
use anyhow::{anyhow, Result};
use derive_builder::Builder;
use oci_spec::runtime::Spec;
use rustjail::specconv::CreateOpts;
use slog::{debug, Logger};
use std::path::PathBuf;
/// Used for create and run commands. It will prepare the options used for creating a new container.
#[derive(Default, Builder, Debug, Clone)]
#[builder(build_fn(validate = "Self::validate"))]
pub struct InitContainer {
id: String,
bundle: PathBuf,
root: PathBuf,
console_socket: Option<PathBuf>,
pid_file: Option<PathBuf>,
}
impl InitContainerBuilder {
/// pre-validate before building InitContainer
fn validate(&self) -> Result<(), String> {
// ensure container hasn't already been created
let id = self.id.as_ref().unwrap();
let root = self.root.as_ref().unwrap();
let status_path = Status::get_dir_path(root, id);
if status_path.exists() {
return Err(format!(
"container {} already exists at path {:?}",
id, root
));
}
Ok(())
}
}
impl InitContainer {
/// Create ContainerLauncher that can be used to launch a new container.
/// It will read the spec under the bundle path.
pub fn create_launcher(self, logger: &Logger) -> Result<ContainerLauncher> {
debug!(logger, "enter InitContainer::create_launcher {:?}", self);
let bundle_canon = self.bundle.canonicalize()?;
let config_path = get_config_path(&bundle_canon);
let mut spec = Spec::load(
config_path
.to_str()
.ok_or_else(|| anyhow!("invalid config path"))?,
)?;
// Only absolute rootfs path is valid when creating LinuxContainer later.
canonicalize_spec_root(&mut spec, &bundle_canon)?;
debug!(logger, "load spec from config file: {:?}", spec);
validate_spec(&spec, &self.console_socket)?;
let config = CreateOpts {
cgroup_name: "".to_string(),
use_systemd_cgroup: false,
// TODO: liboci-cli does not support the --no-pivot option for the create and run commands.
// Once liboci-cli supports the option, we will update the following code.
// no_pivot_root: self.no_pivot,
no_pivot_root: false,
no_new_keyring: false,
spec: Some(spec),
rootless_euid: false,
rootless_cgroup: false,
container_name: "".to_string(),
};
debug!(logger, "create LinuxContainer with config: {:?}", config);
let container =
create_linux_container(&self.id, &self.root, config, self.console_socket, logger)?;
Ok(ContainerLauncher::new(
&self.id,
&bundle_canon,
&self.root,
true,
container,
self.pid_file,
))
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::container::CONFIG_FILE_NAME;
use crate::utils::test_utils::*;
use oci_spec::runtime::Process;
use slog::o;
use std::fs::{create_dir, File};
use std::path::Path;
use tempfile::tempdir;
#[test]
fn test_init_container_validate() {
let root = tempdir().unwrap();
let id = TEST_CONTAINER_ID.to_string();
Status::create_dir(root.path(), id.as_str()).unwrap();
let result = InitContainerBuilder::default()
.id(id)
.root(root.path().to_path_buf())
.bundle(PathBuf::from(TEST_BUNDLE_PATH))
.pid_file(None)
.console_socket(None)
.build();
assert!(result.is_err());
}
#[test]
fn test_init_container_create_launcher() {
#[cfg(all(target_arch = "powerpc64", target_endian = "little"))]
skip_if_not_root!();
let logger = slog::Logger::root(slog::Discard, o!());
let root_dir = tempdir().unwrap();
let bundle_dir = tempdir().unwrap();
// create dummy rootfs
create_dir(bundle_dir.path().join(TEST_ROOTFS_PATH)).unwrap();
let config_file = bundle_dir.path().join(CONFIG_FILE_NAME);
let mut spec = create_dummy_spec();
let file = File::create(config_file).unwrap();
serde_json::to_writer(&file, &spec).unwrap();
spec.root_mut()
.as_mut()
.unwrap()
.set_path(bundle_dir.path().join(TEST_ROOTFS_PATH));
let test_data = TestContainerData {
// Since tests are executed concurrently, container_id must be unique in tests using cgroups.
// Otherwise, the cgroup directory may be removed by another test prematurely.
id: String::from("test_init_container_create_launcher"),
bundle: bundle_dir.path().to_path_buf(),
root: root_dir.into_path(),
console_socket: Some(PathBuf::from(TEST_CONSOLE_SOCKET_PATH)),
config: CreateOpts {
spec: Some(spec),
..Default::default()
},
pid_file: Some(PathBuf::from(TEST_PID_FILE_PATH)),
};
let launcher = InitContainerBuilder::default()
.id(test_data.id.clone())
.bundle(test_data.bundle.clone())
.root(test_data.root.clone())
.console_socket(test_data.console_socket.clone())
.pid_file(test_data.pid_file.clone())
.build()
.unwrap()
.create_launcher(&logger)
.unwrap();
// LinuxContainer doesn't impl PartialEq, so we need to compare the fields manually.
assert!(launcher.init);
assert_eq!(launcher.bundle, test_data.bundle);
assert_eq!(launcher.state_root, test_data.root);
assert_eq!(launcher.pid_file, test_data.pid_file);
assert_eq!(launcher.runner.id, test_data.id);
assert_eq!(launcher.runner.config.spec, test_data.config.spec);
assert_eq!(
Some(launcher.runner.console_socket),
test_data.console_socket
);
// If it is run by root, create_launcher will create cgroup dirs successfully. So we need to do some cleanup stuff.
if nix::unistd::Uid::effective().is_root() {
clean_up_cgroup(Path::new(&test_data.id));
}
}
#[test]
fn test_init_container_tty_err() {
let logger = slog::Logger::root(slog::Discard, o!());
let bundle_dir = tempdir().unwrap();
let config_file = bundle_dir.path().join(CONFIG_FILE_NAME);
let mut spec = Spec::default();
spec.set_process(Some(Process::default()));
spec.process_mut()
.as_mut()
.unwrap()
.set_terminal(Some(true));
let file = File::create(config_file).unwrap();
serde_json::to_writer(&file, &spec).unwrap();
let test_data = TestContainerData {
id: String::from(TEST_CONTAINER_ID),
bundle: bundle_dir.into_path(),
root: tempdir().unwrap().into_path(),
console_socket: None,
config: CreateOpts {
spec: Some(spec),
..Default::default()
},
pid_file: None,
};
let result = InitContainerBuilder::default()
.id(test_data.id.clone())
.bundle(test_data.bundle.clone())
.root(test_data.root.clone())
.console_socket(test_data.console_socket.clone())
.pid_file(test_data.pid_file)
.build()
.unwrap()
.create_launcher(&logger);
assert!(result.is_err());
}
}


@@ -1,12 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
pub mod activated_builder;
pub mod cgroup;
pub mod container;
pub mod created_builder;
pub mod init_builder;
pub mod status;
pub mod utils;


@@ -1,236 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::cgroup::is_paused;
use crate::container::get_fifo_path;
use crate::utils::*;
use anyhow::{anyhow, Result};
use chrono::{DateTime, Utc};
use libc::pid_t;
use nix::{
errno::Errno,
sys::{signal::kill, stat::Mode},
unistd::Pid,
};
use procfs::process::ProcState;
use runtime_spec::{ContainerState, State as OCIState};
use rustjail::{cgroups::fs::Manager as CgroupManager, specconv::CreateOpts};
use serde::{Deserialize, Serialize};
use std::{
fs::{self, File, OpenOptions},
path::{Path, PathBuf},
time::SystemTime,
};
const STATUS_FILE: &str = "status.json";
#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Status {
pub oci_version: String,
pub id: String,
pub pid: pid_t,
pub root: PathBuf,
pub bundle: PathBuf,
pub rootfs: String,
pub process_start_time: u64,
pub created: DateTime<Utc>,
// Methods of the Manager trait in rustjail are not visible here, and CgroupManager.cgroup can't be serialized,
// so it is cumbersome to manage cgroups through this field. Instead, we use cgroups-rs::Cgroup directly in Container to manage cgroups.
// Another solution would be to make some rustjail methods public and to add a getter/setter for CgroupManager.cgroup.
// Temporarily keep this field for compatibility.
pub cgroup_manager: CgroupManager,
pub config: CreateOpts,
}
impl Status {
pub fn new(
root: &Path,
bundle: &Path,
oci_state: OCIState,
process_start_time: u64,
created_time: SystemTime,
cgroup_mg: CgroupManager,
config: CreateOpts,
) -> Result<Self> {
let created = DateTime::from(created_time);
let rootfs = config
.clone()
.spec
.ok_or_else(|| anyhow!("spec config was not present"))?
.root()
.as_ref()
.ok_or_else(|| anyhow!("root config was not present in the spec"))?
.path()
.clone();
Ok(Self {
oci_version: oci_state.version,
id: oci_state.id,
pid: oci_state.pid,
root: root.to_path_buf(),
bundle: bundle.to_path_buf(),
rootfs: rootfs.display().to_string(),
process_start_time,
created,
cgroup_manager: cgroup_mg,
config,
})
}
pub fn save(&self) -> Result<()> {
let state_file_path = Self::get_file_path(&self.root, &self.id);
if !self.root.exists() {
create_dir_with_mode(&self.root, Mode::S_IRWXU, true)?;
}
let file = OpenOptions::new()
.write(true)
.create(true)
.truncate(true)
.open(state_file_path)?;
serde_json::to_writer(&file, self)?;
Ok(())
}
pub fn load(state_root: &Path, id: &str) -> Result<Self> {
let state_file_path = Self::get_file_path(state_root, id);
if !state_file_path.exists() {
return Err(anyhow!("container \"{}\" does not exist", id));
}
let file = File::open(&state_file_path)?;
let state: Self = serde_json::from_reader(&file)?;
Ok(state)
}
pub fn create_dir(state_root: &Path, id: &str) -> Result<()> {
let state_dir_path = Self::get_dir_path(state_root, id);
if !state_dir_path.exists() {
create_dir_with_mode(state_dir_path, Mode::S_IRWXU, true)?;
} else {
return Err(anyhow!("container with id exists: \"{}\"", id));
}
Ok(())
}
pub fn remove_dir(&self) -> Result<()> {
let state_dir_path = Self::get_dir_path(&self.root, &self.id);
fs::remove_dir_all(state_dir_path)?;
Ok(())
}
pub fn get_dir_path(state_root: &Path, id: &str) -> PathBuf {
state_root.join(id)
}
pub fn get_file_path(state_root: &Path, id: &str) -> PathBuf {
state_root.join(id).join(STATUS_FILE)
}
}
pub fn is_process_running(pid: Pid) -> Result<bool> {
match kill(pid, None) {
Err(errno) => {
if errno != Errno::ESRCH {
return Err(anyhow!("failed to kill process {}: {:?}", pid, errno));
}
Ok(false)
}
Ok(()) => Ok(true),
}
}
// Returns the current state of a container. It will read cgroupfs and procfs to determine the state.
// https://github.com/opencontainers/runc/blob/86d6898f3052acba1ebcf83aa2eae3f6cc5fb471/libcontainer/container_linux.go#L1953
pub fn get_current_container_state(
status: &Status,
cgroup: &cgroups::Cgroup,
) -> Result<ContainerState> {
if is_paused(cgroup)? {
return Ok(ContainerState::Paused);
}
let proc = procfs::process::Process::new(status.pid);
// if reading /proc/<pid> fails, the process is not running
if proc.is_err() {
return Ok(ContainerState::Stopped);
}
let proc_stat = proc.unwrap().stat()?;
// if the start times differ, the pid has been reused and the process is not running
if proc_stat.starttime != status.process_start_time {
return Ok(ContainerState::Stopped);
}
match proc_stat.state()? {
ProcState::Zombie | ProcState::Dead => Ok(ContainerState::Stopped),
_ => {
let fifo = get_fifo_path(status);
if fifo.exists() {
return Ok(ContainerState::Created);
}
Ok(ContainerState::Running)
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::utils::test_utils::*;
use ::test_utils::skip_if_not_root;
use chrono::{DateTime, Utc};
use nix::unistd::getpid;
use runtime_spec::ContainerState;
use rustjail::cgroups::fs::Manager as CgroupManager;
use scopeguard::defer;
use std::path::Path;
use std::time::SystemTime;
#[test]
fn test_status() {
let cgm: CgroupManager = serde_json::from_str(TEST_CGM_DATA).unwrap();
let oci_state = create_dummy_oci_state();
let created = SystemTime::now();
let status = Status::new(
Path::new(TEST_STATE_ROOT_PATH),
Path::new(TEST_BUNDLE_PATH),
oci_state.clone(),
1,
created,
cgm,
create_dummy_opts(),
)
.unwrap();
assert_eq!(status.id, oci_state.id);
assert_eq!(status.pid, oci_state.pid);
assert_eq!(status.process_start_time, 1);
assert_eq!(status.created, DateTime::<Utc>::from(created));
}
#[test]
fn test_is_process_running() {
let pid = getpid();
let ret = is_process_running(pid).unwrap();
assert!(ret);
}
#[test]
fn test_get_current_container_state() {
skip_if_not_root!();
let mut status = create_dummy_status();
status.id = "test_get_current_container_state".to_string();
// create a dummy cgroup to make sure is_paused doesn't return an error
let cgroup = create_dummy_cgroup(Path::new(&status.id));
defer!(cgroup.delete().unwrap());
let state = get_current_container_state(&status, &cgroup).unwrap();
assert_eq!(state, ContainerState::Running);
}
}


@@ -1,294 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Result};
use nix::sys::stat::Mode;
use oci_spec::runtime::{Process, Spec};
use std::{
fs::{DirBuilder, File},
io::{prelude::*, BufReader},
os::unix::fs::DirBuilderExt,
path::{Path, PathBuf},
};
pub fn lines_from_file<P: AsRef<Path>>(path: P) -> Result<Vec<String>> {
let file = File::open(&path)?;
let buf = BufReader::new(file);
Ok(buf
.lines()
.map(|v| v.expect("could not parse line"))
.collect())
}
pub fn create_dir_with_mode<P: AsRef<Path>>(path: P, mode: Mode, recursive: bool) -> Result<()> {
let path = path.as_ref();
if path.exists() {
return Err(anyhow!("{} already exists", path.display()));
}
Ok(DirBuilder::new()
.recursive(recursive)
.mode(mode.bits())
.create(path)?)
}
/// If root in spec is a relative path, make it absolute.
pub fn canonicalize_spec_root(spec: &mut Spec, bundle_canon: &Path) -> Result<()> {
let spec_root = spec
.root_mut()
.as_mut()
.ok_or_else(|| anyhow!("root config was not present in the spec file"))?;
let rootfs_path = &spec_root.path();
if !rootfs_path.is_absolute() {
let bundle_canon_path = bundle_canon.join(rootfs_path).canonicalize()?;
spec_root.set_path(bundle_canon_path);
}
Ok(())
}
/// Check whether the spec is valid. Currently, runk only supports detached mode.
pub fn validate_spec(spec: &Spec, console_socket: &Option<PathBuf>) -> Result<()> {
validate_process_spec(spec.process())?;
if let Some(process) = spec.process().as_ref() {
// runk always launches containers in detached mode, so users have to
// provide a console socket with the run or create operation when a terminal is used.
if process.terminal().is_some() && console_socket.is_none() {
return Err(anyhow!(
"cannot allocate a pseudo-TTY without setting a console socket"
));
}
}
Ok(())
}
// Validate process just like runc, https://github.com/opencontainers/runc/pull/623
pub fn validate_process_spec(process: &Option<Process>) -> Result<()> {
let process = process
.as_ref()
.ok_or_else(|| anyhow!("process property must not be empty"))?;
if process.cwd().as_os_str().is_empty() {
return Err(anyhow!("cwd property must not be empty"));
}
let cwd = process.cwd();
if !cwd.is_absolute() {
return Err(anyhow!("cwd must be an absolute path"));
}
if process.args().is_none() {
return Err(anyhow!("args must not be empty"));
}
Ok(())
}
#[cfg(test)]
pub(crate) mod test_utils {
use super::*;
use crate::status::Status;
use chrono::DateTime;
use nix::unistd::getpid;
use oci::{LinuxBuilder, LinuxNamespaceBuilder, Process, Root, Spec};
use oci_spec::runtime as oci;
use runtime_spec::{ContainerState, State as OCIState};
use rustjail::{
cgroups::fs::Manager as CgroupManager, container::TYPETONAME, specconv::CreateOpts,
};
use std::{fs::create_dir_all, path::Path, time::SystemTime};
use tempfile::tempdir;
pub const TEST_CONTAINER_ID: &str = "test";
pub const TEST_STATE_ROOT_PATH: &str = "/state";
pub const TEST_BUNDLE_PATH: &str = "/bundle";
pub const TEST_ROOTFS_PATH: &str = "rootfs";
pub const TEST_ANNOTATION: &str = "test-annotation";
pub const TEST_CONSOLE_SOCKET_PATH: &str = "/test-console-sock";
pub const TEST_PROCESS_FILE_NAME: &str = "process.json";
pub const TEST_PID_FILE_PATH: &str = "/test-pid";
pub const TEST_HOST_NAME: &str = "test-host";
pub const TEST_OCI_SPEC_VERSION: &str = "1.0.2";
pub const TEST_CGM_DATA: &str = r#"{
"paths": {
"devices": "/sys/fs/cgroup/devices"
},
"mounts": {
"devices": "/sys/fs/cgroup/devices"
},
"cpath": "test"
}"#;
#[derive(Debug)]
pub struct TestContainerData {
pub id: String,
pub bundle: PathBuf,
pub root: PathBuf,
pub console_socket: Option<PathBuf>,
pub pid_file: Option<PathBuf>,
pub config: CreateOpts,
}
pub fn create_dummy_spec() -> Spec {
let linux = LinuxBuilder::default()
.namespaces(
TYPETONAME
.iter()
.filter(|&(_, &name)| name != "user")
.map(|ns| {
LinuxNamespaceBuilder::default()
.typ(ns.0.clone())
.path(PathBuf::from(""))
.build()
.unwrap()
})
.collect::<Vec<_>>(),
)
.build()
.unwrap();
let mut process = Process::default();
process.set_args(Some(vec!["sleep".to_string(), "10".to_string()]));
process.set_env(Some(vec!["PATH=/bin:/usr/bin".to_string()]));
process.set_cwd(PathBuf::from("/"));
let mut root = Root::default();
root.set_path(PathBuf::from(TEST_ROOTFS_PATH));
root.set_readonly(Some(false));
let mut spec = Spec::default();
spec.set_version(TEST_OCI_SPEC_VERSION.to_string());
spec.set_process(Some(process));
spec.set_hostname(Some(TEST_HOST_NAME.to_string()));
spec.set_root(Some(root));
spec.set_linux(Some(linux));
spec
}
pub fn create_dummy_opts() -> CreateOpts {
let mut spec = Spec::default();
spec.set_root(Some(Root::default()));
CreateOpts {
cgroup_name: "".to_string(),
use_systemd_cgroup: false,
no_pivot_root: false,
no_new_keyring: false,
spec: Some(spec),
rootless_euid: false,
rootless_cgroup: false,
container_name: "".to_string(),
}
}
pub fn create_dummy_oci_state() -> OCIState {
OCIState {
version: TEST_OCI_SPEC_VERSION.to_string(),
id: TEST_CONTAINER_ID.to_string(),
status: ContainerState::Running,
pid: getpid().as_raw(),
bundle: TEST_BUNDLE_PATH.to_string(),
annotations: [(TEST_ANNOTATION.to_string(), TEST_ANNOTATION.to_string())]
.iter()
.cloned()
.collect(),
}
}
pub fn create_dummy_status() -> Status {
let cgm: CgroupManager = serde_json::from_str(TEST_CGM_DATA).unwrap();
let oci_state = create_dummy_oci_state();
let created = SystemTime::now();
let start_time = procfs::process::Process::new(oci_state.pid)
.unwrap()
.stat()
.unwrap()
.starttime;
let status = Status::new(
Path::new(TEST_STATE_ROOT_PATH),
Path::new(TEST_BUNDLE_PATH),
oci_state,
start_time,
created,
cgm,
create_dummy_opts(),
)
.unwrap();
status
}
pub fn create_custom_dummy_status(id: &str, pid: i32, root: &Path, spec: &Spec) -> Status {
let start_time = procfs::process::Process::new(pid)
.unwrap()
.stat()
.unwrap()
.starttime;
Status {
oci_version: spec.version().clone(),
id: id.to_string(),
pid,
root: root.to_path_buf(),
bundle: PathBuf::from(TEST_BUNDLE_PATH),
rootfs: TEST_ROOTFS_PATH.to_string(),
process_start_time: start_time,
created: DateTime::from(SystemTime::now()),
cgroup_manager: serde_json::from_str(TEST_CGM_DATA).unwrap(),
config: CreateOpts {
spec: Some(spec.clone()),
..Default::default()
},
}
}
pub fn create_dummy_cgroup(cpath: &Path) -> cgroups::Cgroup {
cgroups::Cgroup::new(cgroups::hierarchies::auto(), cpath).unwrap()
}
pub fn clean_up_cgroup(cpath: &Path) {
let cgroup = cgroups::Cgroup::load(cgroups::hierarchies::auto(), cpath);
cgroup.delete().unwrap();
}
#[test]
fn test_canonicalize_spec_root() {
let gen_spec = |p: &str| -> Spec {
let mut root = Root::default();
root.set_path(PathBuf::from(p));
root.set_readonly(Some(false));
let mut spec = Spec::default();
spec.set_root(Some(root));
spec
};
let rootfs_name = TEST_ROOTFS_PATH;
let temp_dir = tempdir().unwrap();
let bundle_dir = temp_dir.path();
let abs_root = bundle_dir.join(rootfs_name);
create_dir_all(abs_root.clone()).unwrap();
let mut spec = gen_spec(abs_root.to_str().unwrap());
assert!(canonicalize_spec_root(&mut spec, bundle_dir).is_ok());
assert_eq!(spec.root_mut().clone().unwrap().path(), &abs_root);
let mut spec = gen_spec(rootfs_name);
assert!(canonicalize_spec_root(&mut spec, bundle_dir).is_ok());
assert_eq!(spec.root().clone().unwrap().path(), &abs_root);
}
#[test]
pub fn test_validate_process_spec() {
let mut valid_process = Process::default();
valid_process.set_args(Some(vec!["test".to_string()]));
valid_process.set_cwd(PathBuf::from("/"));
assert!(validate_process_spec(&None).is_err());
assert!(validate_process_spec(&Some(valid_process.clone())).is_ok());
let mut invalid_process = valid_process.clone();
invalid_process.set_args(None);
assert!(validate_process_spec(&Some(invalid_process)).is_err());
let mut invalid_process = valid_process.clone();
invalid_process.set_cwd(PathBuf::from(""));
assert!(validate_process_spec(&Some(invalid_process)).is_err());
let mut invalid_process = valid_process;
invalid_process.set_cwd(PathBuf::from("test/"));
assert!(validate_process_spec(&Some(invalid_process)).is_err());
}
}

View File

@@ -1,28 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::{container::ContainerAction, init_builder::InitContainerBuilder};
use liboci_cli::Create;
use slog::{info, Logger};
use std::path::Path;
pub async fn run(opts: Create, root: &Path, logger: &Logger) -> Result<()> {
let mut launcher = InitContainerBuilder::default()
.id(opts.container_id)
.bundle(opts.bundle)
.root(root.to_path_buf())
.console_socket(opts.console_socket)
.pid_file(opts.pid_file)
.build()?
.create_launcher(logger)?;
launcher.launch(ContainerAction::Create, logger).await?;
info!(&logger, "create command finished successfully");
Ok(())
}

View File

@@ -1,30 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Result};
use libcontainer::{container::Container, status::Status};
use liboci_cli::Delete;
use slog::{info, Logger};
use std::{fs, path::Path};
pub async fn run(opts: Delete, root: &Path, logger: &Logger) -> Result<()> {
let container_id = &opts.container_id;
let status_dir = Status::get_dir_path(root, container_id);
if !status_dir.exists() {
return Err(anyhow!("container {} does not exist", container_id));
}
let container = if let Ok(value) = Container::load(root, container_id) {
value
} else {
fs::remove_dir_all(status_dir)?;
return Ok(());
};
container.delete(opts.force, logger).await?;
info!(&logger, "delete command finished successfully");
Ok(())
}

View File

@@ -1,32 +0,0 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::activated_builder::ActivatedContainerBuilder;
use libcontainer::container::ContainerAction;
use liboci_cli::Exec;
use slog::{info, Logger};
use std::path::Path;
pub async fn run(opts: Exec, root: &Path, logger: &Logger) -> Result<()> {
let mut launcher = ActivatedContainerBuilder::default()
.id(opts.container_id)
.root(root.to_path_buf())
.console_socket(opts.console_socket)
.pid_file(opts.pid_file)
.tty(opts.tty)
.cwd(opts.cwd)
.env(opts.env)
.no_new_privs(opts.no_new_privs)
.process(opts.process)
.args(opts.command)
.build()?
.create_launcher(logger)?;
launcher.launch(ContainerAction::Run, logger).await?;
info!(&logger, "exec command finished successfully");
Ok(())
}

View File

@@ -1,58 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::container::Container;
use liboci_cli::Kill;
use nix::sys::signal::Signal;
use slog::{info, Logger};
use std::{convert::TryFrom, path::Path, str::FromStr};
pub fn run(opts: Kill, state_root: &Path, logger: &Logger) -> Result<()> {
let container_id = &opts.container_id;
let container = Container::load(state_root, container_id)?;
let sig = parse_signal(&opts.signal)?;
let all = opts.all;
container.kill(sig, all)?;
info!(&logger, "kill command finished successfully");
Ok(())
}
fn parse_signal(signal: &str) -> Result<Signal> {
if let Ok(num) = signal.parse::<i32>() {
return Ok(Signal::try_from(num)?);
}
let mut signal_upper = signal.to_uppercase();
if !signal_upper.starts_with("SIG") {
signal_upper = "SIG".to_string() + &signal_upper;
}
Ok(Signal::from_str(&signal_upper)?)
}
#[cfg(test)]
mod tests {
use super::*;
use nix::sys::signal::Signal;
#[test]
fn test_parse_signal() {
assert_eq!(Signal::SIGHUP, parse_signal("1").unwrap());
assert_eq!(Signal::SIGHUP, parse_signal("sighup").unwrap());
assert_eq!(Signal::SIGHUP, parse_signal("hup").unwrap());
assert_eq!(Signal::SIGHUP, parse_signal("SIGHUP").unwrap());
assert_eq!(Signal::SIGHUP, parse_signal("HUP").unwrap());
assert_eq!(Signal::SIGKILL, parse_signal("9").unwrap());
assert_eq!(Signal::SIGKILL, parse_signal("sigkill").unwrap());
assert_eq!(Signal::SIGKILL, parse_signal("kill").unwrap());
assert_eq!(Signal::SIGKILL, parse_signal("SIGKILL").unwrap());
assert_eq!(Signal::SIGKILL, parse_signal("KILL").unwrap());
}
}

View File

@@ -1,68 +0,0 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::state::get_container_state_name;
use anyhow::Result;
use libcontainer::container::Container;
use liboci_cli::List;
use runtime_spec::ContainerState;
use slog::{info, Logger};
use std::fmt::Write as _;
use std::{fs, os::unix::prelude::MetadataExt, path::Path};
use std::{io, io::Write};
use tabwriter::TabWriter;
use uzers::get_user_by_uid;
pub fn run(_: List, root: &Path, logger: &Logger) -> Result<()> {
let mut content = String::new();
for entry in fs::read_dir(root)? {
let entry = entry?;
// Possible race with other runk commands, so continue the loop if any error occurs below
let metadata = match entry.metadata() {
Ok(metadata) => metadata,
Err(_) => continue,
};
if !metadata.is_dir() {
continue;
}
let container_id = match entry.file_name().into_string() {
Ok(id) => id,
Err(_) => continue,
};
let container = match Container::load(root, &container_id) {
Ok(container) => container,
Err(_) => continue,
};
let state = container.state;
// Just like runc, the pid of a stopped container is 0
let pid = match state {
ContainerState::Stopped => 0,
_ => container.status.pid,
};
// May replace get_user_by_uid with getpwuid(3)
let owner = match get_user_by_uid(metadata.uid()) {
Some(user) => String::from(user.name().to_string_lossy()),
None => format!("#{}", metadata.uid()),
};
let _ = writeln!(
content,
"{}\t{}\t{}\t{}\t{}\t{}",
container_id,
pid,
get_container_state_name(state),
container.status.bundle.display(),
container.status.created,
owner
);
}
let mut tab_writer = TabWriter::new(io::stdout());
writeln!(&mut tab_writer, "ID\tPID\tSTATUS\tBUNDLE\tCREATED\tOWNER")?;
write!(&mut tab_writer, "{}", content)?;
tab_writer.flush()?;
info!(&logger, "list command finished successfully");
Ok(())
}

View File

@@ -1,17 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
pub mod create;
pub mod delete;
pub mod exec;
pub mod kill;
pub mod list;
pub mod pause;
pub mod ps;
pub mod resume;
pub mod run;
pub mod spec;
pub mod start;
pub mod state;

View File

@@ -1,18 +0,0 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::container::Container;
use liboci_cli::Pause;
use slog::{info, Logger};
use std::path::Path;
pub fn run(opts: Pause, root: &Path, logger: &Logger) -> Result<()> {
let container = Container::load(root, &opts.container_id)?;
container.pause()?;
info!(&logger, "pause command finished successfully");
Ok(())
}

View File

@@ -1,63 +0,0 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::anyhow;
use anyhow::Result;
use libcontainer::container::Container;
use liboci_cli::Ps;
use slog::{info, Logger};
use std::path::Path;
use std::process::Command;
use std::str;
pub fn run(opts: Ps, root: &Path, logger: &Logger) -> Result<()> {
let container = Container::load(root, opts.container_id.as_str())?;
let pids = container
.processes()?
.iter()
.map(|pid| pid.as_raw())
.collect::<Vec<_>>();
match opts.format.as_str() {
"json" => println!("{}", serde_json::to_string(&pids)?),
"table" => {
let ps_options = if opts.ps_options.is_empty() {
vec!["-ef".to_string()]
} else {
opts.ps_options
};
let output = Command::new("ps").args(ps_options).output()?;
if !output.status.success() {
return Err(anyhow!("{}", std::str::from_utf8(&output.stderr)?));
}
let lines = str::from_utf8(&output.stdout)?.lines().collect::<Vec<_>>();
if lines.is_empty() {
return Err(anyhow!("no processes found"));
}
let pid_index = lines[0]
.split_whitespace()
.position(|field| field == "PID")
.ok_or_else(|| anyhow!("couldn't find PID field in ps output"))?;
println!("{}", lines[0]);
for &line in &lines[1..] {
if line.is_empty() {
continue;
}
let fields = line.split_whitespace().collect::<Vec<_>>();
if pid_index >= fields.len() {
continue;
}
let pid: i32 = fields[pid_index].parse()?;
if pids.contains(&pid) {
println!("{}", line);
}
}
}
_ => return Err(anyhow!("unknown format: {}", opts.format)),
}
info!(&logger, "ps command finished successfully");
Ok(())
}

View File

@@ -1,18 +0,0 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::container::Container;
use liboci_cli::Resume;
use slog::{info, Logger};
use std::path::Path;
pub fn run(opts: Resume, root: &Path, logger: &Logger) -> Result<()> {
let container = Container::load(root, &opts.container_id)?;
container.resume()?;
info!(&logger, "resume command finished successfully");
Ok(())
}

View File

@@ -1,27 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::{container::ContainerAction, init_builder::InitContainerBuilder};
use liboci_cli::Run;
use slog::{info, Logger};
use std::path::Path;
pub async fn run(opts: Run, root: &Path, logger: &Logger) -> Result<()> {
let mut launcher = InitContainerBuilder::default()
.id(opts.container_id)
.bundle(opts.bundle)
.root(root.to_path_buf())
.console_socket(opts.console_socket)
.pid_file(opts.pid_file)
.build()?
.create_launcher(logger)?;
launcher.launch(ContainerAction::Run, logger).await?;
info!(&logger, "run command finished successfully");
Ok(())
}

View File

@@ -1,207 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
//use crate::container::get_config_path;
use anyhow::Result;
use libcontainer::container::CONFIG_FILE_NAME;
use liboci_cli::Spec;
use slog::{info, Logger};
use std::{fs::File, io::Write, path::Path};
pub const DEFAULT_SPEC: &str = r#"{
"ociVersion": "1.0.2-dev",
"process": {
"terminal": true,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sh"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"inheritable": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"ambient": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
]
},
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
],
"noNewPrivileges": true
},
"root": {
"path": "rootfs",
"readonly": true
},
"hostname": "runk",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
}
]
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
}
],
"maskedPaths": [
"/proc/acpi",
"/proc/asound",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/sys/firmware",
"/proc/scsi"
],
"readonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}"#;
pub fn run(_opts: Spec, logger: &Logger) -> Result<()> {
// TODO: liboci-cli does not support --bundle option for spec command.
// After liboci-cli supports the option, we will change the following code.
// let config_path = get_config_path(&opts.bundle);
let config_path = Path::new(".").join(CONFIG_FILE_NAME);
let config_data = DEFAULT_SPEC;
let mut file = File::create(config_path)?;
file.write_all(config_data.as_bytes())?;
info!(&logger, "spec command finished successfully");
Ok(())
}

View File

@@ -1,24 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use libcontainer::{container::ContainerAction, created_builder::CreatedContainerBuilder};
use liboci_cli::Start;
use slog::{info, Logger};
use std::path::Path;
pub async fn run(opts: Start, root: &Path, logger: &Logger) -> Result<()> {
let mut launcher = CreatedContainerBuilder::default()
.id(opts.container_id)
.root(root.to_path_buf())
.build()?
.create_launcher(logger)?;
launcher.launch(ContainerAction::Start, logger).await?;
info!(&logger, "start command finished successfully");
Ok(())
}

View File

@@ -1,78 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::Result;
use chrono::{DateTime, Utc};
use libcontainer::{container::Container, status::Status};
use liboci_cli::State;
use runtime_spec::ContainerState;
use serde::{Deserialize, Serialize};
use slog::{info, Logger};
use std::path::{Path, PathBuf};
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct RuntimeState {
pub oci_version: String,
pub id: String,
pub pid: i32,
pub status: String,
pub bundle: PathBuf,
pub created: DateTime<Utc>,
}
impl RuntimeState {
pub fn new(status: Status, state: ContainerState) -> Self {
Self {
oci_version: status.oci_version,
id: status.id,
pid: status.pid,
status: get_container_state_name(state),
bundle: status.bundle,
created: status.created,
}
}
}
pub fn run(opts: State, state_root: &Path, logger: &Logger) -> Result<()> {
let container = Container::load(state_root, &opts.container_id)?;
let oci_state = RuntimeState::new(container.status, container.state);
let json_state = &serde_json::to_string_pretty(&oci_state)?;
println!("{}", json_state);
info!(&logger, "state command finished successfully");
Ok(())
}
pub fn get_container_state_name(state: ContainerState) -> String {
match state {
ContainerState::Creating => "creating",
ContainerState::Created => "created",
ContainerState::Running => "running",
ContainerState::Stopped => "stopped",
ContainerState::Paused => "paused",
}
.into()
}
#[cfg(test)]
mod tests {
use super::*;
use runtime_spec::ContainerState;
#[test]
fn test_get_container_state_name() {
assert_eq!(
"creating",
get_container_state_name(ContainerState::Creating)
);
assert_eq!("created", get_container_state_name(ContainerState::Created));
assert_eq!("running", get_container_state_name(ContainerState::Running));
assert_eq!("stopped", get_container_state_name(ContainerState::Stopped));
assert_eq!("paused", get_container_state_name(ContainerState::Paused));
}
}

View File

@@ -1,133 +0,0 @@
// Copyright 2021-2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Result};
use clap::{crate_description, crate_name, Parser};
use liboci_cli::{CommonCmd, GlobalOpts};
use liboci_cli::{Create, Delete, Kill, Start, State};
use slog::{o, Logger};
use slog_async::AsyncGuard;
use std::{
fs::OpenOptions,
path::{Path, PathBuf},
process::exit,
};
const DEFAULT_ROOT_DIR: &str = "/run/runk";
const DEFAULT_LOG_LEVEL: slog::Level = slog::Level::Info;
mod commands;
#[derive(Parser, Debug)]
enum SubCommand {
#[clap(flatten)]
Standard(StandardCmd),
#[clap(flatten)]
Common(CommonCmd),
/// Launch an init process (do not call it outside of runk)
Init {},
}
// Copied from https://github.com/containers/youki/blob/v0.0.3/crates/liboci-cli/src/lib.rs#L38-L44
#[derive(Parser, Debug)]
pub enum StandardCmd {
Create(Create),
Start(Start),
State(State),
Delete(Delete),
Kill(Kill),
}
#[derive(Parser, Debug)]
#[clap(version, author, about = crate_description!())]
struct Cli {
#[clap(flatten)]
global: GlobalOpts,
#[clap(subcommand)]
subcmd: SubCommand,
}
async fn cmd_run(subcmd: SubCommand, root_path: &Path, logger: &Logger) -> Result<()> {
match subcmd {
SubCommand::Standard(cmd) => match cmd {
StandardCmd::Create(create) => commands::create::run(create, root_path, logger).await,
StandardCmd::Start(start) => commands::start::run(start, root_path, logger).await,
StandardCmd::Delete(delete) => commands::delete::run(delete, root_path, logger).await,
StandardCmd::State(state) => commands::state::run(state, root_path, logger),
StandardCmd::Kill(kill) => commands::kill::run(kill, root_path, logger),
},
SubCommand::Common(cmd) => match cmd {
CommonCmd::Run(run) => commands::run::run(run, root_path, logger).await,
CommonCmd::Spec(spec) => commands::spec::run(spec, logger),
CommonCmd::List(list) => commands::list::run(list, root_path, logger),
CommonCmd::Exec(exec) => commands::exec::run(exec, root_path, logger).await,
CommonCmd::Ps(ps) => commands::ps::run(ps, root_path, logger),
CommonCmd::Pause(pause) => commands::pause::run(pause, root_path, logger),
CommonCmd::Resume(resume) => commands::resume::run(resume, root_path, logger),
_ => Err(anyhow!("command is not implemented yet")),
},
_ => unreachable!(),
}
}
fn setup_logger(
log_file: Option<PathBuf>,
log_level: slog::Level,
) -> Result<(Logger, Option<AsyncGuard>)> {
if let Some(ref file) = log_file {
let log_writer = OpenOptions::new()
.write(true)
.read(true)
.create(true)
.truncate(true)
.open(file)?;
// TODO: Support 'text' log format.
let (logger_local, logger_async_guard_local) =
logging::create_logger(crate_name!(), crate_name!(), log_level, log_writer);
Ok((logger_local, Some(logger_async_guard_local)))
} else {
let logger = slog::Logger::root(slog::Discard, o!());
Ok((logger, None))
}
}
async fn real_main() -> Result<()> {
let cli = Cli::parse();
if let SubCommand::Init {} = cli.subcmd {
rustjail::container::init_child();
exit(0);
}
let root_path = if let Some(path) = cli.global.root {
path
} else {
PathBuf::from(DEFAULT_ROOT_DIR)
};
let log_level = if cli.global.debug {
slog::Level::Debug
} else {
DEFAULT_LOG_LEVEL
};
let (logger, _async_guard) = setup_logger(cli.global.log, log_level)?;
cmd_run(cli.subcmd, &root_path, &logger).await?;
Ok(())
}
#[tokio::main]
async fn main() {
if let Err(e) = real_main().await {
eprintln!("ERROR: {}", e);
exit(1);
}
exit(0);
}

View File

@@ -14,7 +14,6 @@ and with different container managers.
- [Docker](https://github.com/kata-containers/kata-containers/tree/main/tests/integration/docker)
- [`Nerdctl`](https://github.com/kata-containers/kata-containers/tree/main/tests/integration/nerdctl)
- [`Nydus`](https://github.com/kata-containers/kata-containers/tree/main/tests/integration/nydus)
- [`Runk`](https://github.com/kata-containers/kata-containers/tree/main/tests/integration/runk)
2. [Stability tests](https://github.com/kata-containers/kata-containers/tree/main/tests/stability)
3. [Metrics](https://github.com/kata-containers/kata-containers/tree/main/tests/metrics)
4. [Functional](https://github.com/kata-containers/kata-containers/tree/main/tests/functional)

View File

@@ -1022,3 +1022,113 @@ function version_greater_than_equal() {
return 1
fi
}
# Run bats tests with proper reporting
#
# This function provides consistent test execution and reporting across
# all test suites (k8s, nvidia, kata-deploy, etc.)
#
# Parameters:
# $1 - Test directory (where tests are located and reports will be saved)
# $2 - Array name containing test files (passed by reference)
#
# Environment variables:
# BATS_TEST_FAIL_FAST - Set to "yes" to stop at first failure (default: "no")
#
# Example usage:
# tests=("test1.bats" "test2.bats")
# run_bats_tests "/path/to/tests" tests
#
function run_bats_tests() {
local test_dir="$1"
local -n test_array=$2
local fail_fast="${BATS_TEST_FAIL_FAST:-no}"
local report_dir="${test_dir}/reports/$(date +'%F-%T')"
mkdir -p "${report_dir}"
info "Running tests with bats version: $(bats --version). Save outputs to ${report_dir}"
local tests_fail=()
for test_entry in "${test_array[@]}"; do
test_entry=$(echo "${test_entry}" | tr -d '[:space:][:cntrl:]')
[ -z "${test_entry}" ] && continue
info "Executing ${test_entry}"
# Output file will be prefixed with "ok" or "not_ok" based on the result
local out_file="${report_dir}/${test_entry}.out"
pushd "${test_dir}" > /dev/null
if ! bats --timing --show-output-of-passing-tests "${test_entry}" | tee "${out_file}"; then
tests_fail+=("${test_entry}")
mv "${out_file}" "$(dirname "${out_file}")/not_ok-$(basename "${out_file}")"
[[ "${fail_fast}" == "yes" ]] && break
else
mv "${out_file}" "$(dirname "${out_file}")/ok-$(basename "${out_file}")"
fi
popd > /dev/null
done
if [[ ${#tests_fail[@]} -ne 0 ]]; then
die "Tests FAILED from suites: ${tests_fail[*]}"
fi
info "All tests SUCCEEDED"
}
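# Illustrative layout after a run where test1.bats passed and test2.bats
# failed (timestamped directory name is hypothetical):
#   <test_dir>/reports/2026-01-22-10:00:00/ok-test1.bats.out
#   <test_dir>/reports/2026-01-22-10:00:00/not_ok-test2.bats.out
# report_bats_tests() below consumes exactly this layout.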
# Report bats test results from the reports directory
#
# This function displays a summary of test results and outputs from
# the reports directory created by run_bats_tests().
#
# Parameters:
# $1 - Test directory (where reports subdirectory is located)
#
# Example usage:
# report_bats_tests "/path/to/tests"
#
function report_bats_tests() {
local test_dir="$1"
local reports_dir="${test_dir}/reports"
if [[ ! -d "${reports_dir}" ]]; then
warn "No reports directory found: ${reports_dir}"
return 1
fi
for report_dir in "${reports_dir}"/*; do
[[ ! -d "${report_dir}" ]] && continue
local ok=()
local not_ok=()
mapfile -t ok < <(find "${report_dir}" -name "ok-*.out" 2>/dev/null)
mapfile -t not_ok < <(find "${report_dir}" -name "not_ok-*.out" 2>/dev/null)
cat <<-EOF
SUMMARY ($(basename "${report_dir}")):
Pass: ${#ok[*]}
Fail: ${#not_ok[*]}
EOF
echo -e "\nSTATUSES:"
for out in "${not_ok[@]}" "${ok[@]}"; do
[[ -z "${out}" ]] && continue
local status
local bats
status=$(basename "${out}" | cut -d '-' -f1)
bats=$(basename "${out}" | cut -d '-' -f2- | sed 's/.out$//')
echo " ${status} ${bats}"
done
echo -e "\nOUTPUTS:"
for out in "${not_ok[@]}" "${ok[@]}"; do
[[ -z "${out}" ]] && continue
local bats
bats=$(basename "${out}" | cut -d '-' -f2- | sed 's/.out$//')
echo "::group::${bats}"
cat "${out}"
echo "::endgroup::"
done
done
}
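# Output sketch for a hypothetical run with one pass and one failure:
#   SUMMARY (2026-01-22-10:00:00):
#   Pass: 1
#   Fail: 1
#
#   STATUSES:
#    not_ok test2.bats
#    ok test1.bats
# The full per-suite outputs then follow, each wrapped in
# ::group::/::endgroup:: markers.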

View File

@@ -20,6 +20,10 @@ function run_tests() {
popd
}
function report_tests() {
report_bats_tests "${kata_deploy_dir}"
}
function cleanup_runtimeclasses() {
# Cleanup any runtime class that was left behind in the cluster, in
# case of a test failure, apart from the default one that comes from
@@ -59,6 +63,7 @@ function main() {
install-kubectl) install_kubectl ;;
get-cluster-credentials) get_cluster_credentials "kata-deploy" ;;
run-tests) run_tests ;;
report-tests) report_tests ;;
delete-cluster) cleanup "aks" "kata-deploy" ;;
*) >&2 echo "Invalid argument"; exit 2 ;;
esac

View File

@@ -4,15 +4,35 @@
#
# SPDX-License-Identifier: Apache-2.0
#
# Kata Deploy Functional Tests
#
# This test validates that kata-deploy successfully installs and configures
# Kata Containers on a Kubernetes cluster using Helm.
#
# Required environment variables:
# DOCKER_REGISTRY - Container registry for kata-deploy image
# DOCKER_REPO - Repository name for kata-deploy image
# DOCKER_TAG - Image tag to test
# KATA_HYPERVISOR - Hypervisor to test (qemu, clh, etc.)
# KUBERNETES - K8s distribution (microk8s, k3s, rke2, etc.)
#
# Optional timeout configuration (increase for slow networks or large images):
# KATA_DEPLOY_TIMEOUT - Overall helm timeout (default: 600s = 10m)
# KATA_DEPLOY_DAEMONSET_TIMEOUT - DaemonSet rollout timeout in seconds (default: 300 = 5m)
# Includes time to pull the kata-deploy image
# KATA_DEPLOY_VERIFICATION_TIMEOUT - Verification pod timeout in seconds (default: 120 = 2m)
# Time for the verification pod to run
#
# Example with custom timeouts for slow network:
# KATA_DEPLOY_DAEMONSET_TIMEOUT=3600 bats kata-deploy.bats
#
load "${BATS_TEST_DIRNAME}/../../common.bash"
repo_root_dir="${BATS_TEST_DIRNAME}/../../../"
load "${repo_root_dir}/tests/gha-run-k8s-common.sh"
setup() {
ensure_yq
pushd "${repo_root_dir}"
ensure_helm
# We expect 2 runtime classes because:
# * `kata` is the default runtimeclass created by Helm, basically an alias for `kata-${KATA_HYPERVISOR}`.
@@ -26,46 +46,113 @@ setup() {
"kata\s+kata-${KATA_HYPERVISOR}" \
"kata-${KATA_HYPERVISOR}\s+kata-${KATA_HYPERVISOR}" \
)
# Set the latest image, the one generated as part of the PR, to be used as part of the tests
export HELM_IMAGE_REFERENCE="${DOCKER_REGISTRY}/${DOCKER_REPO}"
export HELM_IMAGE_TAG="${DOCKER_TAG}"
# Enable debug for Kata Containers
export HELM_DEBUG="true"
# Create the runtime class only for the shim that's being tested
export HELM_SHIMS="${KATA_HYPERVISOR}"
# Set the tested hypervisor as the default `kata` shim
export HELM_DEFAULT_SHIM="${KATA_HYPERVISOR}"
# Let the Helm chart create the default `kata` runtime class
export HELM_CREATE_DEFAULT_RUNTIME_CLASS="true"
HOST_OS=""
if [[ "${KATA_HOST_OS}" = "cbl-mariner" ]]; then
HOST_OS="${KATA_HOST_OS}"
fi
export HELM_HOST_OS="${HOST_OS}"
export HELM_K8S_DISTRIBUTION="${KUBERNETES}"
helm_helper
echo "::group::kata-deploy logs"
kubectl -n kube-system logs --tail=100 -l name=kata-deploy
echo "::endgroup::"
echo "::group::Runtime classes"
kubectl get runtimeclass
echo "::endgroup::"
popd
}
@test "Test runtimeclasses are being properly created and container runtime is not broken" {
pushd "${repo_root_dir}"
# Create verification pod spec
local verification_yaml
verification_yaml=$(mktemp)
cat > "${verification_yaml}" << EOF
apiVersion: v1
kind: Pod
metadata:
name: kata-deploy-verify
spec:
runtimeClassName: kata-${KATA_HYPERVISOR}
restartPolicy: Never
nodeSelector:
katacontainers.io/kata-runtime: "true"
containers:
- name: verify
image: quay.io/kata-containers/alpine-bash-curl:latest
imagePullPolicy: Always
command:
- sh
- -c
- |
echo "=== Kata Verification ==="
echo "Kernel: \$(uname -r)"
echo "SUCCESS: Pod running with Kata runtime"
EOF
# Install kata-deploy via Helm
echo "Installing kata-deploy with Helm..."
local helm_chart_dir="tools/packaging/kata-deploy/helm-chart/kata-deploy"
# Timeouts can be customized via environment variables:
# - KATA_DEPLOY_TIMEOUT: Overall helm timeout (includes all hooks)
# Default: 600s (10 minutes)
# - KATA_DEPLOY_DAEMONSET_TIMEOUT: Time to wait for kata-deploy DaemonSet rollout (image pull + pod start)
# Default: 300s (5 minutes) - accounts for large image downloads
# - KATA_DEPLOY_VERIFICATION_TIMEOUT: Time to wait for verification pod to complete
# Default: 120s (2 minutes) - verification pod execution time
local helm_timeout="${KATA_DEPLOY_TIMEOUT:-600s}"
local daemonset_timeout="${KATA_DEPLOY_DAEMONSET_TIMEOUT:-300}"
local verification_timeout="${KATA_DEPLOY_VERIFICATION_TIMEOUT:-120}"
echo "Timeout configuration:"
echo " Helm overall: ${helm_timeout}"
echo " DaemonSet rollout: ${daemonset_timeout}s (includes image pull)"
echo " Verification pod: ${verification_timeout}s (pod execution)"
helm dependency build "${helm_chart_dir}"
# Disable all shims except the one being tested
helm upgrade --install kata-deploy "${helm_chart_dir}" \
--set image.reference="${DOCKER_REGISTRY}/${DOCKER_REPO}" \
--set image.tag="${DOCKER_TAG}" \
--set debug=true \
--set k8sDistribution="${KUBERNETES}" \
--set shims.clh.enabled=false \
--set shims.cloud-hypervisor.enabled=false \
--set shims.dragonball.enabled=false \
--set shims.fc.enabled=false \
--set shims.qemu.enabled=false \
--set shims.qemu-runtime-rs.enabled=false \
--set shims.qemu-cca.enabled=false \
--set shims.qemu-se.enabled=false \
--set shims.qemu-se-runtime-rs.enabled=false \
--set shims.qemu-nvidia-gpu.enabled=false \
--set shims.qemu-nvidia-gpu-snp.enabled=false \
--set shims.qemu-nvidia-gpu-tdx.enabled=false \
--set shims.qemu-sev.enabled=false \
--set shims.qemu-snp.enabled=false \
--set shims.qemu-snp-runtime-rs.enabled=false \
--set shims.qemu-tdx.enabled=false \
--set shims.qemu-tdx-runtime-rs.enabled=false \
--set shims.qemu-coco-dev.enabled=false \
--set shims.qemu-coco-dev-runtime-rs.enabled=false \
--set "shims.${KATA_HYPERVISOR}.enabled=true" \
--set "defaultShim.amd64=${KATA_HYPERVISOR}" \
--set "defaultShim.arm64=${KATA_HYPERVISOR}" \
--set runtimeClasses.enabled=true \
--set runtimeClasses.createDefault=true \
--set-file verification.pod="${verification_yaml}" \
--set verification.timeout="${verification_timeout}" \
--set verification.daemonsetTimeout="${daemonset_timeout}" \
--namespace kube-system \
--wait --timeout "${helm_timeout}"
rm -f "${verification_yaml}"
echo ""
echo "::group::kata-deploy logs"
kubectl -n kube-system logs --tail=200 -l name=kata-deploy
echo "::endgroup::"
echo ""
echo "::group::Runtime classes"
kubectl get runtimeclass
echo "::endgroup::"
# helm --wait already waits for post-install hooks to complete
# If helm returns successfully, the verification job passed
# The job is deleted after success (hook-delete-policy: hook-succeeded)
echo ""
echo "Helm install completed successfully - verification passed"
# We filter out `kata-mshv-vm-isolation` as it's present on AKS clusters but doesn't come from kata-deploy
current_runtime_classes=$(kubectl get runtimeclasses | grep -v "kata-mshv-vm-isolation" | grep "kata" | wc -l)
[[ ${current_runtime_classes} -eq ${expected_runtime_classes} ]]
@@ -87,6 +174,8 @@ setup() {
# Check that the container runtime version doesn't contain "Unknown", which happens when containerd can't start properly
container_runtime_version=$(kubectl get nodes --no-headers -o custom-columns=CONTAINER_RUNTIME:.status.nodeInfo.containerRuntimeVersion)
[[ ${container_runtime_version} != *"containerd://Unknown"* ]]
popd
}
teardown() {

View File

@@ -6,10 +6,14 @@
#
set -e
set -o pipefail
kata_deploy_dir=$(dirname "$(readlink -f "$0")")
source "${kata_deploy_dir}/../../common.bash"
# Setting to "yes" enables fail fast, stopping execution at the first failed test.
export BATS_TEST_FAIL_FAST="${BATS_TEST_FAIL_FAST:-no}"
if [[ -n "${KATA_DEPLOY_TEST_UNION:-}" ]]; then
KATA_DEPLOY_TEST_UNION=("${KATA_DEPLOY_TEST_UNION}")
else
@@ -18,8 +22,4 @@ else
)
fi
info "Run tests"
for KATA_DEPLOY_TEST_ENTRY in "${KATA_DEPLOY_TEST_UNION[@]}"
do
bats --show-output-of-passing-tests "${KATA_DEPLOY_TEST_ENTRY}"
done
run_bats_tests "${kata_deploy_dir}" KATA_DEPLOY_TEST_UNION

View File

@@ -28,6 +28,7 @@ HELM_SHIMS="${HELM_SHIMS:-}"
HELM_SNAPSHOTTER_HANDLER_MAPPING="${HELM_SNAPSHOTTER_HANDLER_MAPPING:-}"
HELM_EXPERIMENTAL_SETUP_SNAPSHOTTER="${HELM_EXPERIMENTAL_SETUP_SNAPSHOTTER:-}"
HELM_EXPERIMENTAL_FORCE_GUEST_PULL="${HELM_EXPERIMENTAL_FORCE_GUEST_PULL:-}"
HELM_VERIFY_DEPLOYMENT="${HELM_VERIFY_DEPLOYMENT:-false}"
KATA_DEPLOY_WAIT_TIMEOUT="${KATA_DEPLOY_WAIT_TIMEOUT:-600}"
KATA_HOST_OS="${KATA_HOST_OS:-}"
KUBERNETES="${KUBERNETES:-}"
@@ -599,7 +600,7 @@ function helm_helper() {
yq -i ".shims.${shim}.enabled = true" "${values_yaml}"
yq -i ".shims.${shim}.supportedArches = [\"arm64\"]" "${values_yaml}"
;;
qemu-snp|qemu-tdx|qemu-nvidia-gpu-snp|qemu-nvidia-gpu-tdx)
qemu-snp|qemu-snp-runtime-rs|qemu-tdx|qemu-tdx-runtime-rs|qemu-nvidia-gpu-snp|qemu-nvidia-gpu-tdx)
yq -i ".shims.${shim}.enabled = true" "${values_yaml}"
yq -i ".shims.${shim}.supportedArches = [\"amd64\"]" "${values_yaml}"
;;
@@ -819,10 +820,53 @@ function helm_helper() {
[[ -n "${HELM_HOST_OS}" ]] && yq -i ".env.hostOS=\"${HELM_HOST_OS}\"" "${values_yaml}"
fi
# Enable verification during deployment if HELM_VERIFY_DEPLOYMENT is set
# Creates a simple verification pod that runs with the Kata runtime
local helm_set_file_args=""
if [[ "${HELM_VERIFY_DEPLOYMENT}" == "true" ]]; then
# Determine runtime class from HELM_DEFAULT_SHIM or default to kata-qemu
local runtime_class="kata-qemu"
if [[ -n "${HELM_DEFAULT_SHIM}" ]]; then
runtime_class="kata-${HELM_DEFAULT_SHIM}"
fi
local verification_yaml
verification_yaml=$(mktemp)
cat > "${verification_yaml}" << 'VERIFICATION_POD_EOF'
apiVersion: v1
kind: Pod
metadata:
name: kata-deploy-verify
spec:
runtimeClassName: RUNTIME_CLASS_PLACEHOLDER
restartPolicy: Never
nodeSelector:
katacontainers.io/kata-runtime: "true"
containers:
- name: verify
image: quay.io/kata-containers/alpine-bash-curl:latest
imagePullPolicy: Always
command:
- sh
- -c
- |
echo "=== Kata Verification ==="
echo "Kernel: $(uname -r)"
echo "SUCCESS: Pod running with Kata runtime"
VERIFICATION_POD_EOF
# Replace runtime class placeholder
sed -i "s|RUNTIME_CLASS_PLACEHOLDER|${runtime_class}|g" "${verification_yaml}"
echo "Enabling deployment verification with runtimeClass: ${runtime_class}"
helm_set_file_args="--set-file verification.pod=${verification_yaml}"
# Clean up temp file on exit
trap "rm -f ${verification_yaml}" EXIT
fi
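# Usage sketch (hypothetical caller): running
#   HELM_VERIFY_DEPLOYMENT=true HELM_DEFAULT_SHIM=qemu helm_helper
# renders the pod above with runtimeClassName: kata-qemu.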
echo "::group::Final kata-deploy manifests used in the test"
cat "${values_yaml}"
echo ""
helm template "${helm_chart_dir}" --values "${values_yaml}" --namespace kube-system
# ${helm_set_file_args} is intentionally left unquoted
helm template "${helm_chart_dir}" --values "${values_yaml}" ${helm_set_file_args} --namespace kube-system
[[ "$(yq .image.reference "${values_yaml}")" = "${HELM_IMAGE_REFERENCE}" ]] || die "Failed to set image reference"
[[ "$(yq .image.tag "${values_yaml}")" = "${HELM_IMAGE_TAG}" ]] || die "Failed to set image tag"
echo "::endgroup::"
@@ -832,12 +876,13 @@ function helm_helper() {
max_tries=3
interval=10
i=10
i=0
# Retry loop for helm install to tolerate transient failures due to a temporarily unreachable cluster
set +e # Disable immediate exit on failure
while true; do
helm upgrade --install kata-deploy "${helm_chart_dir}" --values "${values_yaml}" --namespace kube-system --debug
# ${helm_set_file_args} is intentionally left unquoted
helm upgrade --install kata-deploy "${helm_chart_dir}" --values "${values_yaml}" ${helm_set_file_args} --namespace kube-system --debug
ret=${?}
if [[ ${ret} -eq 0 ]]; then
echo "Helm install succeeded!"
@@ -845,15 +890,16 @@ function helm_helper() {
fi
i=$((i+1))
if [[ ${i} -lt ${max_tries} ]]; then
echo "Retrying after ${interval} seconds (Attempt ${i} of $((max_tries - 1)))"
echo "Retrying after ${interval} seconds (Attempt ${i} of ${max_tries})"
else
break
fi
sleep "${interval}"
done
set -e # Re-enable immediate exit on failure
if [[ ${i} -eq ${max_tries} ]]; then
die "Failed to deploy kata-deploy after ${max_tries} tries"
if [[ ${i} -ge ${max_tries} ]]; then
echo "ERROR: Failed to deploy kata-deploy after ${max_tries} tries"
return 1
fi
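# Retry accounting (illustrative): with max_tries=3 the loop above makes at
# most 3 helm attempts (i=0,1,2); the error return above fires only after
# all of them have failed.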
# `helm install --wait` does not take effect on DaemonSets with a single replica and maxUnavailable=1

View File

@@ -326,41 +326,7 @@ function run_tests() {
# directory.
#
function report_tests() {
local reports_dir="${kubernetes_dir}/reports"
local ok
local not_ok
local status
if [[ ! -d "${reports_dir}" ]]; then
info "no reports directory found: ${reports_dir}"
return
fi
for report_dir in "${reports_dir}"/*; do
mapfile -t ok < <(find "${report_dir}" -name "ok-*.out")
mapfile -t not_ok < <(find "${report_dir}" -name "not_ok-*.out")
cat <<-EOF
SUMMARY ($(basename "${report_dir}")):
Pass: ${#ok[*]}
Fail: ${#not_ok[*]}
EOF
echo -e "\nSTATUSES:"
for out in "${not_ok[@]}" "${ok[@]}"; do
status=$(basename "${out}" | cut -d '-' -f1)
bats=$(basename "${out}" | cut -d '-' -f2- | sed 's/.out$//')
echo " ${status} ${bats}"
done
echo -e "\nOUTPUTS:"
for out in "${not_ok[@]}" "${ok[@]}"; do
bats=$(basename "${out}" | cut -d '-' -f2- | sed 's/.out$//')
echo "::group::${bats}"
cat "${out}"
echo "::endgroup::"
done
done
report_bats_tests "${kubernetes_dir}"
}
function collect_artifacts() {

View File

@@ -6,6 +6,7 @@
#
set -e
set -o pipefail
kubernetes_dir=$(dirname "$(readlink -f "$0")")
# shellcheck disable=SC1091 # import based on variable
@@ -53,20 +54,6 @@ if [[ "${ENABLE_NVRC_TRACE:-true}" == "true" ]]; then
enable_nvrc_trace
fi
info "Running tests with bats version: $(bats --version)"
tests_fail=()
for K8S_TEST_ENTRY in "${K8S_TEST_NV[@]}"
do
K8S_TEST_ENTRY=$(echo "${K8S_TEST_ENTRY}" | tr -d '[:space:][:cntrl:]')
info "$(kubectl get pods --all-namespaces 2>&1)"
info "Executing ${K8S_TEST_ENTRY}"
if ! bats --show-output-of-passing-tests "${K8S_TEST_ENTRY}"; then
tests_fail+=("${K8S_TEST_ENTRY}")
[[ "${K8S_TEST_FAIL_FAST}" = "yes" ]] && break
fi
done
[[ ${#tests_fail[@]} -ne 0 ]] && die "Tests FAILED from suites: ${tests_fail[*]}"
info "All tests SUCCEEDED"
# Use common bats test runner with proper reporting
export BATS_TEST_FAIL_FAST="${K8S_TEST_FAIL_FAST}"
run_bats_tests "${kubernetes_dir}" K8S_TEST_NV

View File

@@ -135,28 +135,6 @@ fi
ensure_yq
report_dir="${kubernetes_dir}/reports/$(date +'%F-%T')"
mkdir -p "${report_dir}"
info "Running tests with bats version: $(bats --version). Save outputs to ${report_dir}"
tests_fail=()
for K8S_TEST_ENTRY in "${K8S_TEST_UNION[@]}"
do
K8S_TEST_ENTRY=$(echo "$K8S_TEST_ENTRY" | tr -d '[:space:][:cntrl:]')
time info "$(kubectl get pods --all-namespaces 2>&1)"
info "Executing ${K8S_TEST_ENTRY}"
# Output file will be prefixed with "ok" or "not_ok" based on the result
out_file="${report_dir}/${K8S_TEST_ENTRY}.out"
if ! bats --timing --show-output-of-passing-tests "${K8S_TEST_ENTRY}" | tee "${out_file}"; then
tests_fail+=("${K8S_TEST_ENTRY}")
mv "${out_file}" "$(dirname "${out_file}")/not_ok-$(basename "${out_file}")"
[ "${K8S_TEST_FAIL_FAST}" = "yes" ] && break
else
mv "${out_file}" "$(dirname "${out_file}")/ok-$(basename "${out_file}")"
fi
done
[ ${#tests_fail[@]} -ne 0 ] && die "Tests FAILED from suites: ${tests_fail[*]}"
info "All tests SUCCEEDED"
# Use common bats test runner with proper reporting
export BATS_TEST_FAIL_FAST="${K8S_TEST_FAIL_FAST}"
run_bats_tests "${kubernetes_dir}" K8S_TEST_UNION

View File

@@ -1,65 +0,0 @@
#!/bin/bash
#
# Copyright (c) 2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
set -o errexit
set -o nounset
set -o pipefail
kata_tarball_dir="${2:-kata-artifacts}"
runk_dir="$(dirname "$(readlink -f "$0")")"
source "${runk_dir}/../../common.bash"
source "${runk_dir}/../../gha-run-k8s-common.sh"
function install_dependencies() {
info "Installing the dependencies needed for running the runk tests"
# Dependency list of projects that we can rely on the system packages
# - jq
declare -a system_deps=(
jq
)
sudo apt-get update
sudo apt-get -y install "${system_deps[@]}"
ensure_yq
# Dependency list of projects that we can install them
# directly from their releases on GitHub:
# - containerd
# - cri-containerd-cni release tarball already includes CNI plugins
declare -a github_deps
github_deps[0]="cri_containerd:$(get_from_kata_deps ".externals.containerd.${CONTAINERD_VERSION}")"
github_deps[1]="runc:$(get_from_kata_deps ".externals.runc.latest")"
github_deps[2]="cni_plugins:$(get_from_kata_deps ".externals.cni-plugins.version")"
for github_dep in "${github_deps[@]}"; do
IFS=":" read -r -a dep <<< "${github_dep}"
install_${dep[0]} "${dep[1]}"
done
# Requires bats to run the tests
install_bats
}
function run() {
info "Running runk tests using"
bats "${runk_dir}/runk-tests.bats"
}
function main() {
action="${1:-}"
case "${action}" in
install-dependencies) install_dependencies ;;
install-kata) install_kata ;;
run) run ;;
*) >&2 die "Invalid argument" ;;
esac
}
main "$@"

View File

@@ -1,127 +0,0 @@
#!/usr/bin/env bats
#
# Copyright (c) 2023,2024 Kata Contributors
#
# SPDX-License-Identifier: Apache-2.0
#
# This test will validate runk with containerd
load "${BATS_TEST_DIRNAME}/../../common.bash"
load "${BATS_TEST_DIRNAME}/../../metrics/lib/common.bash"
setup_file() {
export RUNK_BIN_PATH="/usr/local/bin/runk"
export TEST_IMAGE="quay.io/prometheus/busybox:latest"
export CONTAINER_ID="id1"
export PID_FILE="${CONTAINER_ID}.pid"
export WORK_DIR="${BATS_FILE_TMPDIR}"
echo "pull container image"
check_images ${TEST_IMAGE}
}
setup() {
# Bind mount ${WORK_DIR}:/tmp. Tests below will store files in this dir and check them when the container is frozen.
sudo ctr run --pid-file ${PID_FILE} -d \
--mount type=bind,src=${WORK_DIR},dst=/tmp,options=rbind:rw \
--runc-binary ${RUNK_BIN_PATH} \
${TEST_IMAGE} \
${CONTAINER_ID}
read CID PID STATUS <<< $(sudo ctr t ls | grep ${CONTAINER_ID})
# Check the pid is consistent
[ "${PID}" == "$(cat "${PID_FILE}")" ]
# Check the container status is RUNNING
[ "${STATUS}" == "RUNNING" ]
}
teardown() {
echo "delete the container"
if sudo ctr t list -q | grep -q "${CONTAINER_ID}"; then
stop_container
fi
sudo ctr c rm "${CONTAINER_ID}"
sudo rm -f "${PID_FILE}"
}
stop_container() {
local cmd
sudo ctr t kill --signal SIGKILL --all "${CONTAINER_ID}"
# poll for a while until the task receives the signal and exits
cmd='[ "STOPPED" == "$(sudo ctr t ls | grep ${CONTAINER_ID} | awk "{print \$3}")" ]'
waitForProcess 10 1 "${cmd}"
echo "check the container is stopped"
# only the title line of the ps output should remain
[ "1" == "$(sudo ctr t ps ${CONTAINER_ID} | wc -l)" ]
}
@test "start container with runk" {
}
@test "exec process in a container" {
sudo ctr t exec --exec-id id1 "${CONTAINER_ID}" sh -c "echo hello > /tmp/foo"
# Check exec succeeded
[ "hello" == "$(sudo ctr t exec --exec-id id1 "${CONTAINER_ID}" cat /tmp/foo)" ]
}
@test "run ps command" {
sudo ctr t exec --detach --exec-id id1 "${CONTAINER_ID}" sh
return_code=$?
echo "ctr t exec sh return: ${return_code}"
# Give some time for the sh process to start within the container.
sleep 5
ps_out="$(sudo ctr t ps ${CONTAINER_ID})" || die "ps command failed"
printf "ps output:\n%s\n" "${ps_out}"
lines_no="$(printf "%s\n" "${ps_out}" | wc -l)"
echo "ps output lines: ${lines_no}"
# one line is the title, and the other 2 lines are process info
[ "3" == "${lines_no}" ]
}
@test "pause and resume the container" {
# The process outputs lines into /tmp/{CONTAINER_ID}, which can be read on the host when it's frozen.
sudo ctr t exec --detach --exec-id id2 ${CONTAINER_ID} \
sh -c "while true; do echo hello >> /tmp/${CONTAINER_ID}; sleep 0.1; done"
# sleep for 1s to make sure the process outputs some lines
sleep 1
sudo ctr t pause "${CONTAINER_ID}"
# Check the status is PAUSED
[ "PAUSED" == "$(sudo ctr t ls | grep ${CONTAINER_ID} | grep -o PAUSED)" ]
echo "container is paused"
local TMP_FILE="${WORK_DIR}/${CONTAINER_ID}"
local lines1=$(cat ${TMP_FILE} | wc -l)
# sleep for a while and check the lines are not changed.
sleep 1
local lines2=$(cat ${TMP_FILE} | wc -l)
# Check the paused container is not running the process (paused indeed)
[ ${lines1} == ${lines2} ]
sudo ctr t resume ${CONTAINER_ID}
# Check the resumed container has status of RUNNING
[ "RUNNING" == "$(sudo ctr t ls | grep ${CONTAINER_ID} | grep -o RUNNING)" ]
echo "container is resumed"
# sleep for a while and check the lines are changed.
sleep 1
local lines3=$(cat ${TMP_FILE} | wc -l)
# Check the process is running again
[ ${lines2} -lt ${lines3} ]
}
@test "kill the container and poll until it is stopped" {
stop_container
}
@test "kill --all is allowed regardless of the container state" {
# High-level container runtimes such as containerd call the kill command with
# --all option in order to terminate all processes inside the container
# even if the container already is stopped. Hence, a low-level runtime
# should allow kill --all regardless of the container state like runc.
echo "test kill --all is allowed regardless of the container state"
# Check kill should fail because the container is stopped
stop_container
run sudo ctr t kill --signal SIGKILL ${CONTAINER_ID}
[ $status -eq 1 ]
# Check kill --all should not fail
sudo ctr t kill --signal SIGKILL --all "${CONTAINER_ID}"
}

View File

@@ -74,8 +74,6 @@ Extra environment variables:
AGENT_BIN: Use it to change the expected agent binary name
AGENT_INIT: Use kata agent as init process
BLOCK_SIZE: Use to specify the size of blocks in bytes. DEFAULT: 4096
DAX_DISABLE: If set to "yes", skip DAX metadata header (for kernels without FS_DAX support).
DEFAULT: not set
IMAGE_REGISTRY: Hostname for the image registry used to pull down the rootfs build image.
NSDAX_BIN: Use to specify path to pre-compiled 'nsdax' tool.
USE_DOCKER: If set will build the image in a Docker container (requires docker)
@@ -171,7 +169,6 @@ build_with_container() {
--env BLOCK_SIZE="${block_size}" \
--env ROOT_FREE_SPACE="${root_free_space}" \
--env NSDAX_BIN="${nsdax_bin}" \
--env DAX_DISABLE="${DAX_DISABLE:-no}" \
--env MEASURED_ROOTFS="${MEASURED_ROOTFS}" \
--env SELINUX="${SELINUX}" \
--env DEBUG="${DEBUG}" \
@@ -307,12 +304,8 @@ calculate_img_size() {
local fs_type="$3"
local block_size="$4"
# rootfs start + DAX header size (if enabled) + rootfs end
local dax_sz=0
if [ "${DAX_DISABLE:-no}" != "yes" ]; then
dax_sz="${dax_header_sz}"
fi
local reserved_size_mb=$((rootfs_start + dax_sz + rootfs_end))
# rootfs start + DAX header size + rootfs end
local reserved_size_mb=$((rootfs_start + dax_header_sz + rootfs_end))
disk_size="$(calculate_required_disk_size "${rootfs}" "${fs_type}" "${block_size}")"
@@ -631,35 +624,25 @@ main() {
die "Invalid rootfs"
fi
# Determine DAX header size based on DAX_DISABLE setting
local dax_sz=0
if [ "${DAX_DISABLE:-no}" != "yes" ]; then
dax_sz="${dax_header_sz}"
fi
if [ "${fs_type}" == 'erofs' ]; then
# mkfs.erofs accepts an src root dir directory as an input
# rather than some device, so no need to guess the device dest size first.
create_erofs_rootfs_image "${rootfs}" "${image}" \
"${block_size}" "${agent_bin}"
rootfs_img_size=$?
img_size=$((rootfs_img_size + dax_sz))
img_size=$((rootfs_img_size + dax_header_sz))
else
img_size=$(calculate_img_size "${rootfs}" "${root_free_space}" \
"${fs_type}" "${block_size}")
# the first 2M are for the first MBR + NVDIMM metadata and were already
# consider in calculate_img_size (if DAX is enabled)
rootfs_img_size=$((img_size - dax_sz))
# considered in calculate_img_size
rootfs_img_size=$((img_size - dax_header_sz))
create_rootfs_image "${rootfs}" "${image}" "${rootfs_img_size}" \
"${fs_type}" "${block_size}" "${agent_bin}"
fi
# insert at the beginning of the image the MBR + DAX header
if [ "${DAX_DISABLE:-no}" != "yes" ]; then
set_dax_header "${image}" "${img_size}" "${fs_type}" "${nsdax_bin}"
else
info "Skipping DAX header (DAX_DISABLE=yes)"
fi
set_dax_header "${image}" "${img_size}" "${fs_type}" "${nsdax_bin}"
chown "${USER}:${GROUP}" "${image}"
}

View File

@@ -18,62 +18,21 @@ die() {
exit 1
}
run_file_name=$2
run_fm_file_name=$3
arch_target=$4
nvidia_gpu_stack="$5"
driver_version=""
driver_type="-open"
supported_gpu_devids="/supported-gpu.devids"
base_os="noble"
arch_target=$1
nvidia_gpu_stack="$2"
base_os="$3"
APT_INSTALL="apt -o Dpkg::Options::='--force-confdef' -o Dpkg::Options::='--force-confold' -yqq --no-install-recommends install"
export KBUILD_SIGN_PIN="${6:-}"
export DEBIAN_FRONTEND=noninteractive
is_feature_enabled() {
local feature="$1"
# Check if feature is in the comma-separated list
if [[ ",${nvidia_gpu_stack}," == *",${feature},"* ]]; then
return 0
else
return 1
fi
}
set_driver_version_type() {
echo "chroot: Setting the correct driver version"
if [[ ",${nvidia_gpu_stack}," == *",latest,"* ]]; then
driver_version="latest"
elif [[ ",${nvidia_gpu_stack}," == *",lts,"* ]]; then
driver_version="lts"
elif [[ "${nvidia_gpu_stack}" =~ version=([^,]+) ]]; then
driver_version="${BASH_REMATCH[1]}"
else
echo "No known driver spec found. Please specify \"latest\", \"lts\", or \"version=<VERSION>\"."
exit 1
fi
echo "chroot: driver_version: ${driver_version}"
echo "chroot: Setting the correct driver type"
# driver -> enable open or closed drivers
if [[ "${nvidia_gpu_stack}" =~ (^|,)driver=open($|,) ]]; then
driver_type="-open"
elif [[ "${nvidia_gpu_stack}" =~ (^|,)driver=closed($|,) ]]; then
driver_type=""
fi
echo "chroot: driver_type: ${driver_type}"
[[ ",${nvidia_gpu_stack}," == *",${feature},"* ]]
}
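# Usage sketch (values hypothetical):
#   nvidia_gpu_stack="compute,dcgm,driver=570"
#   is_feature_enabled "compute" && echo "compute stack requested"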
install_nvidia_ctk() {
echo "chroot: Installing NVIDIA GPU container runtime"
apt list nvidia-container-toolkit-base -a
# The base package provides nvidia-ctk and the nvidia-container-runtime
eval "${APT_INSTALL}" nvidia-container-toolkit-base=1.17.6-1
}
@@ -83,222 +42,61 @@ install_nvidia_fabricmanager() {
echo "chroot: Skipping NVIDIA fabricmanager installation"
return
}
# if run_fm_file_name exists run it
if [[ -f /"${run_fm_file_name}" ]]; then
install_nvidia_fabricmanager_from_run_file
else
install_nvidia_fabricmanager_from_distribution
fi
}
install_nvidia_fabricmanager_from_run_file() {
echo "chroot: Install NVIDIA fabricmanager from run file"
pushd / >> /dev/null
chmod +x "${run_fm_file_name}"
./"${run_fm_file_name}" --nox11
popd >> /dev/null
}
install_nvidia_fabricmanager_from_distribution() {
echo "chroot: Install NVIDIA fabricmanager from distribution"
eval "${APT_INSTALL}" nvidia-fabricmanager-"${driver_version}" libnvidia-nscq-"${driver_version}"
apt-mark hold nvidia-fabricmanager-"${driver_version}" libnvidia-nscq-"${driver_version}"
}
check_kernel_sig_config() {
[[ -n ${kernel_version} ]] || die "kernel_version is not set"
[[ -e /lib/modules/"${kernel_version}"/build/scripts/config ]] || die "Cannot find /lib/modules/${kernel_version}/build/scripts/config"
# make sure the used kernel has the proper CONFIG(s) set
readonly scripts_config=/lib/modules/"${kernel_version}"/build/scripts/config
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_MODULE_SIG)" == "y" ]] || die "Kernel config CONFIG_MODULE_SIG must be =Y"
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_MODULE_SIG_FORCE)" == "y" ]] || die "Kernel config CONFIG_MODULE_SIG_FORCE must be =Y"
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_MODULE_SIG_ALL)" == "y" ]] || die "Kernel config CONFIG_MODULE_SIG_ALL must be =Y"
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_MODULE_SIG_SHA512)" == "y" ]] || die "Kernel config CONFIG_MODULE_SIG_SHA512 must be =Y"
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_SYSTEM_TRUSTED_KEYS)" == "" ]] || die "Kernel config CONFIG_SYSTEM_TRUSTED_KEYS must be =\"\""
[[ "$("${scripts_config}" --file "/boot/config-${kernel_version}" --state CONFIG_SYSTEM_TRUSTED_KEYRING)" == "y" ]] || die "Kernel config CONFIG_SYSTEM_TRUSTED_KEYRING must be =Y"
}
build_nvidia_drivers() {
is_feature_enabled "compute" || {
echo "chroot: Skipping NVIDIA drivers build"
return
}
echo "chroot: Build NVIDIA drivers"
pushd "${driver_source_files}" >> /dev/null
local certs_dir
local kernel_version
local ARCH
for version in /lib/modules/*; do
kernel_version=$(basename "${version}")
certs_dir=/lib/modules/"${kernel_version}"/build/certs
signing_key=${certs_dir}/signing_key.pem
echo "chroot: Building GPU modules for: ${kernel_version}"
cp /boot/System.map-"${kernel_version}" /lib/modules/"${kernel_version}"/build/System.map
if [[ "${arch_target}" == "aarch64" ]]; then
ln -sf /lib/modules/"${kernel_version}"/build/arch/arm64 /lib/modules/"${kernel_version}"/build/arch/aarch64
ARCH=arm64
fi
if [[ "${arch_target}" == "x86_64" ]]; then
ln -sf /lib/modules/"${kernel_version}"/build/arch/x86 /lib/modules/"${kernel_version}"/build/arch/amd64
ARCH=x86_64
fi
echo "chroot: Building GPU modules for: ${kernel_version} ${ARCH}"
make -j "$(nproc)" CC=gcc SYSSRC=/lib/modules/"${kernel_version}"/build > /dev/null
if [[ -n "${KBUILD_SIGN_PIN}" ]]; then
mkdir -p "${certs_dir}" && mv /signing_key.* "${certs_dir}"/.
check_kernel_sig_config
fi
make INSTALL_MOD_STRIP=1 -j "$(nproc)" CC=gcc SYSSRC=/lib/modules/"${kernel_version}"/build modules_install
make -j "$(nproc)" CC=gcc SYSSRC=/lib/modules/"${kernel_version}"/build clean > /dev/null
# The make clean above should also clear the certs directory, but in case something
# went wrong make sure the signing_key.pem is removed
[[ -e "${signing_key}" ]] && rm -f "${signing_key}"
done
# Save the modules for later so that a linux-image purge does not remove them
tar cvfa /lib/modules.save_from_purge.tar.zst /lib/modules
popd >> /dev/null
echo "chroot: Install NVIDIA fabricmanager"
eval "${APT_INSTALL}" nvidia-fabricmanager libnvidia-nscq
apt-mark hold nvidia-fabricmanager libnvidia-nscq
}
install_userspace_components() {
if [[ ! -f /"${run_file_name}" ]]; then
echo "chroot: Skipping NVIDIA userspace runfile components installation"
return
# Extract the driver=XXX part first, then get the value
if [[ "${nvidia_gpu_stack}" =~ driver=([^,]+) ]]; then
driver_version="${BASH_REMATCH[1]}"
fi
echo "chroot: driver_version: ${driver_version}"
pushd /NVIDIA-* >> /dev/null
# if aarch64 we need to remove --no-install-compat32-libs
if [[ "${arch_target}" == "aarch64" ]]; then
./nvidia-installer --no-kernel-modules --no-systemd --no-nvidia-modprobe -s --x-prefix=/root
else
./nvidia-installer --no-kernel-modules --no-systemd --no-nvidia-modprobe -s --x-prefix=/root --no-install-compat32-libs
fi
popd >> /dev/null
eval "${APT_INSTALL}" nvidia-driver-pinning-"${driver_version}"
eval "${APT_INSTALL}" nvidia-imex nvidia-firmware \
libnvidia-cfg1 libnvidia-gl libnvidia-extra \
libnvidia-decode libnvidia-fbc1 libnvidia-encode \
libnvidia-nscq
}
prepare_run_file_drivers() {
if [[ "${driver_version}" == "latest" ]]; then
driver_version=""
echo "chroot: Resetting driver version not supported with run-file"
elif [[ "${driver_version}" == "lts" ]]; then
driver_version=""
echo "chroot: Resetting driver version not supported with run-file"
fi
echo "chroot: Prepare NVIDIA run file drivers"
pushd / >> /dev/null
chmod +x "${run_file_name}"
./"${run_file_name}" -x
mkdir -p /usr/share/nvidia/rim/
# Sooner or later RIM files will be only available remotely
RIMFILE=$(ls NVIDIA-*/RIM_GH100PROD.swidtag)
if [[ -e "${RIMFILE}" ]]; then
cp NVIDIA-*/RIM_GH100PROD.swidtag /usr/share/nvidia/rim/.
fi
popd >> /dev/null
}
prepare_distribution_drivers() {
if [[ "${driver_version}" == "latest" ]]; then
driver_version=$(apt-cache search --names-only 'nvidia-headless-no-dkms-.?.?.?-server-open' | sort | awk '{ print $1 }' | tail -n 1 | cut -d'-' -f5)
elif [[ "${driver_version}" == "lts" ]]; then
driver_version="580"
fi
echo "chroot: Prepare NVIDIA distribution drivers"
eval "${APT_INSTALL}" nvidia-headless-no-dkms-"${driver_version}-server${driver_type}" \
nvidia-kernel-common-"${driver_version}"-server \
nvidia-imex-"${driver_version}" \
nvidia-utils-"${driver_version}"-server \
libnvidia-cfg1-"${driver_version}"-server \
libnvidia-gl-"${driver_version}"-server \
libnvidia-extra-"${driver_version}"-server \
libnvidia-decode-"${driver_version}"-server \
libnvidia-fbc1-"${driver_version}"-server \
libnvidia-encode-"${driver_version}"-server \
libnvidia-nscq-"${driver_version}"
}
prepare_nvidia_drivers() {
local driver_source_dir=""
if [[ -f /"${run_file_name}" ]]; then
prepare_run_file_drivers
for source_dir in /NVIDIA-*; do
if [[ -d "${source_dir}" ]]; then
driver_source_files="${source_dir}"/kernel${driver_type}
driver_source_dir="${source_dir}"
break
fi
done
get_supported_gpus_from_run_file "${driver_source_dir}"
else
prepare_distribution_drivers
for source_dir in /usr/src/nvidia*; do
if [[ -d "${source_dir}" ]]; then
driver_source_files="${source_dir}"
driver_source_dir="${source_dir}"
break
fi
done
get_supported_gpus_from_distro_drivers "${driver_source_dir}"
fi
}
install_build_dependencies() {
echo "chroot: Install NVIDIA drivers build dependencies"
eval "${APT_INSTALL}" make gcc gawk kmod libvulkan1 pciutils jq zstd linuxptp xz-utils
apt-mark hold nvidia-imex nvidia-firmware \
libnvidia-cfg1 libnvidia-gl libnvidia-extra \
libnvidia-decode libnvidia-fbc1 libnvidia-encode \
libnvidia-nscq
}
setup_apt_repositories() {
echo "chroot: Setup APT repositories"
mkdir -p /var/cache/apt/archives/partial
mkdir -p /var/log/apt
mkdir -p /var/lib/dpkg/info
mkdir -p /var/lib/dpkg/updates
mkdir -p /var/lib/dpkg/alternatives
mkdir -p /var/lib/dpkg/triggers
mkdir -p /var/lib/dpkg/parts
# Architecture to mirror mapping
declare -A arch_to_mirror=(
["x86_64"]="us.archive.ubuntu.com/ubuntu"
["aarch64"]="ports.ubuntu.com/ubuntu-ports"
)
local mirror="${arch_to_mirror[${arch_target}]}"
[[ -z "${mirror}" ]] && die "Unknown arch_target: ${arch_target}"
local deb_arch="amd64"
[[ "${arch_target}" == "aarch64" ]] && deb_arch="arm64"
mkdir -p /var/cache/apt/archives/partial /var/log/apt \
/var/lib/dpkg/{info,updates,alternatives,triggers,parts}
touch /var/lib/dpkg/status
rm -f /etc/apt/sources.list.d/*
if [[ "${arch_target}" == "x86_64" ]]; then
cat <<-CHROOT_EOF > /etc/apt/sources.list.d/"${base_os}".list
deb [arch=amd64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://us.archive.ubuntu.com/ubuntu ${base_os} main restricted universe multiverse
deb [arch=amd64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://us.archive.ubuntu.com/ubuntu ${base_os}-updates main restricted universe multiverse
deb [arch=amd64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://us.archive.ubuntu.com/ubuntu ${base_os}-security main restricted universe multiverse
deb [arch=amd64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://us.archive.ubuntu.com/ubuntu ${base_os}-backports main restricted universe multiverse
CHROOT_EOF
fi
key="/usr/share/keyrings/ubuntu-archive-keyring.gpg"
comp="main restricted universe multiverse"
if [[ "${arch_target}" == "aarch64" ]]; then
cat <<-CHROOT_EOF > /etc/apt/sources.list.d/"${base_os}".list
deb [arch=arm64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://ports.ubuntu.com/ubuntu-ports ${base_os} main restricted universe multiverse
deb [arch=arm64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://ports.ubuntu.com/ubuntu-ports ${base_os}-updates main restricted universe multiverse
deb [arch=arm64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://ports.ubuntu.com/ubuntu-ports ${base_os}-security main restricted universe multiverse
deb [arch=arm64 signed-by=/usr/share/keyrings/ubuntu-archive-keyring.gpg] http://ports.ubuntu.com/ubuntu-ports ${base_os}-backports main restricted universe multiverse
CHROOT_EOF
fi
cat <<-CHROOT_EOF > /etc/apt/sources.list.d/"${base_os}".list
deb [arch=${deb_arch} signed-by=${key}] http://${mirror} ${base_os} ${comp}
deb [arch=${deb_arch} signed-by=${key}] http://${mirror} ${base_os}-updates ${comp}
deb [arch=${deb_arch} signed-by=${key}] http://${mirror} ${base_os}-security ${comp}
deb [arch=${deb_arch} signed-by=${key}] http://${mirror} ${base_os}-backports ${comp}
CHROOT_EOF
local arch="${arch_target}"
[[ ${arch_target} == "aarch64" ]] && arch="sbsa"
@@ -310,60 +108,24 @@ setup_apt_repositories() {
curl -fsSL -O "https://developer.download.nvidia.com/compute/cuda/repos/${osver}/${arch}/${keyring}"
dpkg -i "${keyring}" && rm -f "${keyring}"
# Set priorities: Ubuntu repos highest, NVIDIA Container Toolkit next, CUDA repo blocked for driver packages
# Set priorities: CUDA repos highest, Ubuntu non-driver next, Ubuntu blocked for driver packages
cat <<-CHROOT_EOF > /etc/apt/preferences.d/nvidia-priority
# Prioritize Ubuntu repositories (highest priority)
Package: *
Pin: origin us.archive.ubuntu.com
Pin-Priority: 1000
Pin: $(dirname "${mirror}")
Pin-Priority: 400
Package: *
Pin: origin ports.ubuntu.com
Pin-Priority: 1000
# NVIDIA Container Toolkit (medium priority for toolkit only)
Package: nvidia-container-toolkit* libnvidia-container*
Pin: origin nvidia.github.io
Pin-Priority: 500
# Block all nvidia and libnvidia packages from CUDA repository
Package: nvidia-* libnvidia-*
Pin: origin developer.download.nvidia.com
Pin: $(dirname "${mirror}")
Pin-Priority: -1
# Allow non-driver CUDA packages from CUDA repository (low priority)
Package: *
Pin: origin developer.download.nvidia.com
Pin-Priority: 100
Pin-Priority: 800
CHROOT_EOF
apt update
}
install_kernel_dependencies() {
dpkg -i /linux-*deb
}
get_supported_gpus_from_run_file() {
local source_dir="$1"
local supported_gpus_json="${source_dir}"/supported-gpus/supported-gpus.json
jq . < "${supported_gpus_json}" | grep '"devid"' | awk '{ print $2 }' | tr -d ',"' > "${supported_gpu_devids}"
}
get_supported_gpus_from_distro_drivers() {
local supported_gpus_json="./usr/share/doc/nvidia-kernel-common-${driver_version}-server/supported-gpus.json"
jq . < "${supported_gpus_json}" | grep '"devid"' | awk '{ print $2 }' | tr -d ',"' > "${supported_gpu_devids}"
}
export_driver_version() {
for modules_version in /lib/modules/*; do
modinfo "${modules_version}"/kernel/drivers/video/nvidia.ko | grep ^version | awk '{ print $2 }' > /nvidia_driver_version
break
done
}
install_nvidia_dcgm() {
is_feature_enabled "dcgm" || {
echo "chroot: Skipping NVIDIA DCGM installation"
@@ -379,49 +141,12 @@ install_nvidia_dcgm() {
cleanup_rootfs() {
echo "chroot: Cleanup NVIDIA GPU rootfs"
apt-mark hold libstdc++6 libzstd1 libgnutls30t64 pciutils
if [[ -n "${driver_version}" ]]; then
apt-mark hold libnvidia-cfg1-"${driver_version}"-server \
nvidia-utils-"${driver_version}"-server \
nvidia-kernel-common-"${driver_version}"-server \
nvidia-imex-"${driver_version}" \
nvidia-compute-utils-"${driver_version}"-server \
libnvidia-compute-"${driver_version}"-server \
libnvidia-gl-"${driver_version}"-server \
libnvidia-extra-"${driver_version}"-server \
libnvidia-decode-"${driver_version}"-server \
libnvidia-fbc1-"${driver_version}"-server \
libnvidia-encode-"${driver_version}"-server \
libnvidia-nscq-"${driver_version}" \
linuxptp libnftnl11
fi
kernel_headers=$(dpkg --get-selections | cut -f1 | grep linux-headers)
linux_images=$(dpkg --get-selections | cut -f1 | grep linux-image)
for i in ${kernel_headers} ${linux_images}; do
apt purge -yqq "${i}"
done
apt purge -yqq jq make gcc xz-utils linux-libc-dev
if [[ -n "${driver_version}" ]]; then
apt purge -yqq nvidia-headless-no-dkms-"${driver_version}"-server"${driver_type}" \
nvidia-kernel-source-"${driver_version}"-server"${driver_type}"
fi
apt-mark hold libstdc++6 libzstd1 libgnutls30t64 pciutils linuxptp libnftnl11
apt autoremove -yqq
apt clean
apt autoclean
for modules_version in /lib/modules/*; do
ln -sf "${modules_version}" /lib/modules/"$(uname -r)"
touch "${modules_version}"/modules.order
touch "${modules_version}"/modules.builtin
depmod -a
done
rm -rf /var/lib/apt/lists/* /var/cache/apt/* /var/log/apt /var/cache/debconf
rm -f /etc/apt/sources.list
rm -f /usr/bin/nvidia-ngx-updater /usr/bin/nvidia-container-runtime
@@ -430,23 +155,14 @@ cleanup_rootfs() {
# Clear and regenerate the ld cache
rm -f /etc/ld.so.cache
ldconfig
tar xvf /lib/modules.save_from_purge.tar.zst -C /
rm -f /lib/modules.save_from_purge.tar.zst
}
# Start of script
echo "chroot: Setup NVIDIA GPU rootfs stage one"
set_driver_version_type
setup_apt_repositories
install_kernel_dependencies
install_build_dependencies
prepare_nvidia_drivers
build_nvidia_drivers
install_userspace_components
install_nvidia_fabricmanager
install_nvidia_ctk
export_driver_version
install_nvidia_dcgm
cleanup_rootfs


@@ -41,29 +41,27 @@ fi
readonly stage_one="${BUILD_DIR:?}/rootfs-${VARIANT:?}-stage-one"
setup_nvidia-nvrc() {
local rootfs_type=${1:-""}
local url ver
local nvrc=NVRC-${machine_arch}-unknown-linux-musl
url=$(get_package_version_from_kata_yaml "externals.nvrc.url")
ver=$(get_package_version_from_kata_yaml "externals.nvrc.version")
BIN="NVRC${rootfs_type:+"-${rootfs_type}"}"
TARGET=${machine_arch}-unknown-linux-musl
URL=$(get_package_version_from_kata_yaml "externals.nvrc.url")
VER=$(get_package_version_from_kata_yaml "externals.nvrc.version")
local dl="${url}/${ver}"
curl -fsSL -o "${BUILD_DIR}/${nvrc}.tar.xz" "${dl}/${nvrc}.tar.xz"
curl -fsSL -o "${BUILD_DIR}/${nvrc}.tar.xz.sig" "${dl}/${nvrc}.tar.xz.sig"
curl -fsSL -o "${BUILD_DIR}/${nvrc}.tar.xz.cert" "${dl}/${nvrc}.tar.xz.cert"
local DL="${URL}/${VER}"
curl -fsSL -o "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz" "${DL}/${BIN}-${TARGET}.tar.xz"
curl -fsSL -o "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz.sig" "${DL}/${BIN}-${TARGET}.tar.xz.sig"
curl -fsSL -o "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz.cert" "${DL}/${BIN}-${TARGET}.tar.xz.cert"
ID="^https://github.com/NVIDIA/nvrc/.github/workflows/.+@refs/heads/main$"
OIDC="https://token.actions.githubusercontent.com"
local id="^https://github.com/NVIDIA/nvrc/.github/workflows/.+@refs/heads/main$"
local oidc="https://token.actions.githubusercontent.com"
# Only allow releases built by GitHub Actions from the NVIDIA/nvrc main branch
cosign verify-blob \
--rekor-url https://rekor.sigstore.dev \
--certificate "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz.cert" \
--signature "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz.sig" \
--certificate-identity-regexp "${ID}" \
--certificate-oidc-issuer "${OIDC}" \
"${BUILD_DIR}/${BIN}-${TARGET}.tar.xz"
cosign verify-blob \
--rekor-url https://rekor.sigstore.dev \
--certificate "${BUILD_DIR}/${nvrc}.tar.xz.cert" \
--signature "${BUILD_DIR}/${nvrc}.tar.xz.sig" \
--certificate-identity-regexp "${id}" \
--certificate-oidc-issuer "${oidc}" \
"${BUILD_DIR}/${nvrc}.tar.xz"
}
setup_nvidia_gpu_rootfs_stage_one() {
@@ -81,47 +79,31 @@ setup_nvidia_gpu_rootfs_stage_one() {
chmod +x ./nvidia_chroot.sh
local BIN="NVRC${rootfs_type:+"-${rootfs_type}"}"
local TARGET=${machine_arch}-unknown-linux-musl
if [[ ! -e "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz" ]]; then
setup_nvidia-nvrc "${rootfs_type}"
local nvrc=NVRC-${machine_arch}-unknown-linux-musl
if [[ ! -e "${BUILD_DIR}/${nvrc}.tar.xz" ]]; then
setup_nvidia-nvrc
fi
tar -xvf "${BUILD_DIR}/${BIN}-${TARGET}.tar.xz" -C ./bin/
tar -xvf "${BUILD_DIR}/${nvrc}.tar.xz" -C ./bin/
local appendix="${rootfs_type:+"-${rootfs_type}"}"
if echo "${NVIDIA_GPU_STACK}" | grep -q '\<dragonball\>'; then
appendix="-dragonball-experimental"
fi
# We need the kernel packages for building the drivers; they will be
# uninstalled and removed from the rootfs once the build finishes.
tar --zstd -xvf "${BUILD_DIR}"/kata-static-kernel-nvidia-gpu"${appendix}"-headers.tar.zst -C .
# If we find a locally downloaded run file, build the kernel modules
# with it; otherwise use the distribution packages. Run files may have
# more recent drivers available than the distribution packages.
local run_file_name="nvidia-driver.run"
if [[ -f ${BUILD_DIR}/${run_file_name} ]]; then
cp -L "${BUILD_DIR}"/"${run_file_name}" ./"${run_file_name}"
fi
local run_fm_file_name="nvidia-fabricmanager.run"
if [[ -f ${BUILD_DIR}/${run_fm_file_name} ]]; then
cp -L "${BUILD_DIR}"/"${run_fm_file_name}" ./"${run_fm_file_name}"
fi
# Install the precompiled kernel modules shipped with the kernel
mkdir -p ./lib/modules/
tar --zstd -xvf "${BUILD_DIR}"/kata-static-kernel-nvidia-gpu"${appendix}"-modules.tar.zst -C ./lib/modules/
mount --rbind /dev ./dev
mount --make-rslave ./dev
mount -t proc /proc ./proc
chroot . /bin/bash -c "/nvidia_chroot.sh $(uname -r) ${run_file_name} \
${run_fm_file_name} ${machine_arch} ${NVIDIA_GPU_STACK} ${KBUILD_SIGN_PIN}"
chroot . /bin/bash -c "/nvidia_chroot.sh ${machine_arch} ${NVIDIA_GPU_STACK} noble"
umount -R ./dev
umount ./proc
rm ./nvidia_chroot.sh
rm ./*.deb
tar cfa "${stage_one}.tar.zst" --remove-files -- *
@@ -183,7 +165,6 @@ chisseled_dcgm() {
chisseled_compute() {
echo "nvidia: chisseling GPU"
cp -a "${stage_one}"/nvidia_driver_version .
cp -a "${stage_one}"/lib/modules/* lib/modules/.
libdir="lib/${machine_arch}-linux-gnu"
@@ -194,6 +175,15 @@ chisseled_compute() {
cp -a "${stage_one}/${libdir}"/libc.so.6* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libm.so.6* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/librt.so.1* "${libdir}"/.
# nvidia-persistenced dependencies for CUDA repo and >= 590
cp -a "${stage_one}/${libdir}"/libtirpc.so.3* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libgssapi_krb5.so.2* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libkrb5.so.3* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libkrb5support.so.0* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libk5crypto.so.3* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libcom_err.so.2* "${libdir}"/.
cp -a "${stage_one}/${libdir}"/libkeyutils.so.1* "${libdir}"/.
cp -a "${stage_one}/etc/netconfig" etc/.
[[ "${type}" == "confidential" ]] && cp -a "${stage_one}/${libdir}"/libnvidia-pkcs11* "${libdir}"/.
@@ -222,19 +212,13 @@ chisseled_gpudirect() {
}
setup_nvrc_init_symlinks() {
local rootfs_type=${1:-""}
local bin="NVRC${rootfs_type:+"-${rootfs_type}"}"
local target=${machine_arch}-unknown-linux-musl
local nvrc="NVRC-${machine_arch}-unknown-linux-musl"
# make sure NVRC is the init process for the initrd and image case
ln -sf /bin/"${bin}-${target}" init
ln -sf /bin/"${bin}-${target}" sbin/init
ln -sf /bin/"${nvrc}" init
ln -sf /bin/"${nvrc}" sbin/init
}
chisseled_init() {
local rootfs_type=${1:-""}
echo "nvidia: chisseling init"
tar --zstd -xvf "${BUILD_DIR}"/kata-static-busybox.tar.zst -C .
@@ -248,21 +232,19 @@ chisseled_init() {
libdir=lib/"${machine_arch}"-linux-gnu
cp -a "${stage_one}"/"${libdir}"/libgcc_s.so.1* "${libdir}"/.
bin="NVRC${rootfs_type:+"-${rootfs_type}"}"
target=${machine_arch}-unknown-linux-musl
local nvrc="NVRC-${machine_arch}-unknown-linux-musl"
cp -a "${stage_one}/bin/${bin}-${target}" bin/.
cp -a "${stage_one}/bin/${bin}-${target}".cert bin/.
cp -a "${stage_one}/bin/${bin}-${target}".sig bin/.
cp -a "${stage_one}/bin/${nvrc}" bin/.
cp -a "${stage_one}/bin/${nvrc}".cert bin/.
cp -a "${stage_one}/bin/${nvrc}".sig bin/.
setup_nvrc_init_symlinks "${rootfs_type}"
setup_nvrc_init_symlinks
cp -a "${stage_one}"/usr/bin/kata-agent usr/bin/.
if [[ "${AGENT_POLICY}" == "yes" ]]; then
cp -a "${stage_one}"/etc/kata-opa etc/.
fi
cp -a "${stage_one}"/etc/resolv.conf etc/.
cp -a "${stage_one}"/supported-gpu.devids .
cp -a "${stage_one}"/lib/firmware/nvidia lib/firmware/.
cp -a "${stage_one}"/sbin/ldconfig.real sbin/ldconfig
@@ -350,7 +332,7 @@ setup_nvidia_gpu_rootfs_stage_two() {
pushd "${stage_two}" >> /dev/null
# Only step needed from stage_two (see chisseled_init)
setup_nvrc_init_symlinks "${type}"
setup_nvrc_init_symlinks
else
echo "nvidia: chisseling the following stack components: ${stack}"
@@ -361,7 +343,7 @@ setup_nvidia_gpu_rootfs_stage_two() {
pushd "${stage_two}" >> /dev/null
chisseled_init "${type}"
chisseled_init
chisseled_iptables
IFS=',' read -r -a stack_components <<< "${NVIDIA_GPU_STACK}"


@@ -52,7 +52,7 @@ build_initrd() {
GUEST_HOOKS_TARBALL="${GUEST_HOOKS_TARBALL}"
if [[ "${image_initrd_suffix}" == "nvidia-gpu"* ]]; then
nvidia_driver_version=$(cat "${builddir}"/initrd-image/*/nvidia_driver_version)
nvidia_driver_version=$(get_from_kata_deps .externals.nvidia.driver.version)
artifact_name=${artifact_name/.initrd/"-${nvidia_driver_version}".initrd}
fi
@@ -81,7 +81,7 @@ build_image() {
GUEST_HOOKS_TARBALL="${GUEST_HOOKS_TARBALL}"
if [[ "${image_initrd_suffix}" == "nvidia-gpu"* ]]; then
nvidia_driver_version=$(cat "${builddir}"/rootfs-image/*/nvidia_driver_version)
nvidia_driver_version=$(get_from_kata_deps .externals.nvidia.driver.version)
artifact_name=${artifact_name/.image/"-${nvidia_driver_version}".image}
fi


@@ -1,11 +1,8 @@
# Copyright Intel Corporation, 2022 IBM Corp.
# Copyright (c) 2025 NVIDIA Corporation
#
# SPDX-License-Identifier: Apache-2.0
ARG BASE_IMAGE_NAME=alpine
ARG BASE_IMAGE_TAG=3.22
FROM ${BASE_IMAGE_NAME}:${BASE_IMAGE_TAG} AS base
#### Nydus snapshotter & nydus image
FROM golang:1.24-alpine AS nydus-binary-downloader
@@ -17,53 +14,219 @@ ARG NYDUS_SNAPSHOTTER_REPO=https://github.com/containerd/nydus-snapshotter
RUN \
mkdir -p /opt/nydus-snapshotter && \
ARCH=$(uname -m) && \
if [[ "${ARCH}" == "x86_64" ]]; then ARCH=amd64 ; fi && \
if [[ "${ARCH}" == "aarch64" ]]; then ARCH=arm64; fi && \
ARCH="$(uname -m)" && \
if [ "${ARCH}" = "x86_64" ]; then ARCH=amd64 ; fi && \
if [ "${ARCH}" = "aarch64" ]; then ARCH=arm64; fi && \
apk add --no-cache curl && \
curl -fOL --progress-bar ${NYDUS_SNAPSHOTTER_REPO}/releases/download/${NYDUS_SNAPSHOTTER_VERSION}/nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz && \
tar xvzpf nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz -C /opt/nydus-snapshotter && \
rm nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz
curl -fOL --progress-bar "${NYDUS_SNAPSHOTTER_REPO}/releases/download/${NYDUS_SNAPSHOTTER_VERSION}/nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz" && \
tar xvzpf "nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz" -C /opt/nydus-snapshotter && \
rm "nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz"
#### Build binary package
FROM ubuntu:22.04 AS rust-builder
#### kata-deploy main image
# Default to Rust 1.90.0
ARG RUST_TOOLCHAIN=1.90.0
ENV DEBIAN_FRONTEND=noninteractive
ENV RUSTUP_HOME="/opt/rustup"
ENV CARGO_HOME="/opt/cargo"
ENV PATH="/opt/cargo/bin/:${PATH}"
# kata-deploy args
FROM base
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
ARG KATA_ARTIFACTS=./kata-static.tar.zst
RUN \
mkdir ${RUSTUP_HOME} ${CARGO_HOME} && \
chmod -R a+rwX ${RUSTUP_HOME} ${CARGO_HOME}
RUN \
apt-get update && \
apt-get --no-install-recommends -y install \
ca-certificates \
curl \
gcc \
libc6-dev \
musl-tools && \
apt-get clean && rm -rf /var/lib/apt/lists/ && \
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain ${RUST_TOOLCHAIN}
WORKDIR /kata-deploy
# Copy standalone binary project
COPY binary /kata-deploy
# Install target and run tests based on architecture
# - AMD64/arm64: use musl for fully static binaries
# - PPC64le/s390x: use glibc (musl has issues on these platforms)
RUN \
HOST_ARCH="$(uname -m)"; \
rust_arch=""; \
rust_target=""; \
case "${HOST_ARCH}" in \
"x86_64") \
rust_arch="x86_64"; \
rust_target="${rust_arch}-unknown-linux-musl"; \
echo "Installing musl target for ${rust_target}"; \
rustup target add "${rust_target}"; \
;; \
"aarch64") \
rust_arch="aarch64"; \
rust_target="${rust_arch}-unknown-linux-musl"; \
echo "Installing musl target for ${rust_target}"; \
rustup target add "${rust_target}"; \
;; \
"ppc64le") \
rust_arch="powerpc64le"; \
rust_target="${rust_arch}-unknown-linux-gnu"; \
echo "Using glibc target for ${rust_target} (musl is not well supported on ppc64le)"; \
;; \
"s390x") \
rust_arch="s390x"; \
rust_target="${rust_arch}-unknown-linux-gnu"; \
echo "Using glibc target for ${rust_target} (musl is not well supported on s390x)"; \
;; \
*) echo "Unsupported architecture: ${HOST_ARCH}" && exit 1 ;; \
esac; \
echo "${rust_target}" > /tmp/rust_target
# Run tests with --test-threads=1 to prevent environment variable pollution between tests;
# this is fine since we never run multiple binaries at the same time.
RUN \
rust_target="$(cat /tmp/rust_target)"; \
echo "Running binary tests with target ${rust_target}..." && \
RUSTFLAGS="-D warnings" cargo test --target "${rust_target}" -- --test-threads=1 && \
echo "All tests passed!"
RUN \
rust_target="$(cat /tmp/rust_target)"; \
echo "Building kata-deploy binary for ${rust_target}..." && \
RUSTFLAGS="-D warnings" cargo build --release --target "${rust_target}" && \
mkdir -p /kata-deploy/bin && \
cp "/kata-deploy/target/${rust_target}/release/kata-deploy" /kata-deploy/bin/kata-deploy && \
echo "Cleaning up build artifacts to save disk space..." && \
rm -rf /kata-deploy/target && \
cargo clean
#### Extract kata artifacts
FROM alpine:3.22 AS artifact-extractor
ARG KATA_ARTIFACTS=kata-static.tar.zst
ARG DESTINATION=/opt/kata-artifacts
COPY ${KATA_ARTIFACTS} /
# I understand that in order to be on the safer side, it'd
# be good to have the alpine packages pointing to a very
# specific version, but this may break anyone else trying
# to use a different version of alpine for one reason or
# another. With this in mind, let's ignore DL3018.
# SC2086 is about using double quotes to prevent globbing and
# word splitting, which can also be ignored for now.
# hadolint ignore=DL3018,SC2086
COPY ${KATA_ARTIFACTS} /tmp/
RUN \
apk --no-cache add bash curl tar zstd && \
ARCH=$(uname -m) && \
if [ "${ARCH}" = "x86_64" ]; then ARCH=amd64; fi && \
if [ "${ARCH}" = "aarch64" ]; then ARCH=arm64; fi && \
DEBIAN_ARCH=${ARCH} && \
if [ "${DEBIAN_ARCH}" = "ppc64le" ]; then DEBIAN_ARCH=ppc64el; fi && \
curl -fL --progress-bar -o /usr/bin/kubectl https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/${ARCH}/kubectl && \
chmod +x /usr/bin/kubectl && \
curl -fL --progress-bar -o /usr/bin/jq https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-${DEBIAN_ARCH} && \
chmod +x /usr/bin/jq && \
mkdir -p ${DESTINATION} && \
tar --zstd -xvf ${WORKDIR}/${KATA_ARTIFACTS} -C ${DESTINATION} && \
rm -f ${WORKDIR}/${KATA_ARTIFACTS} && \
apk del curl tar zstd && \
apk --no-cache add py3-pip && \
pip install --no-cache-dir yq==3.2.3 --break-system-packages
apk add --no-cache tar zstd util-linux-misc && \
mkdir -p "${DESTINATION}" && \
tar --zstd -xf "/tmp/$(basename "${KATA_ARTIFACTS}")" -C "${DESTINATION}" && \
rm -f "/tmp/$(basename "${KATA_ARTIFACTS}")"
COPY scripts ${DESTINATION}/scripts
#### Prepare runtime dependencies (nsenter and required libraries)
# This stage assembles all runtime dependencies based on architecture
# using ldd to find exact library dependencies
FROM debian:bookworm-slim AS runtime-assembler
ARG DESTINATION=/opt/kata-artifacts
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN \
apt-get update && \
apt-get --no-install-recommends -y install \
util-linux && \
apt-get clean && rm -rf /var/lib/apt/lists/
# Copy the built binary to analyze its dependencies
COPY --from=rust-builder /kata-deploy/bin/kata-deploy /tmp/kata-deploy
# Create output directories
RUN mkdir -p /output/lib /output/lib64 /output/usr/bin
# Use ldd to find and copy all required libraries for the kata-deploy binary and nsenter
RUN \
HOST_ARCH="$(uname -m)"; \
echo "Preparing runtime dependencies for ${HOST_ARCH}"; \
case "${HOST_ARCH}" in \
"ppc64le"|"s390x") \
echo "Using glibc - copying libraries based on ldd output"; \
\
# Copy nsenter \
cp /usr/bin/nsenter /output/usr/bin/nsenter; \
\
# Show what the binaries need \
echo "Libraries needed by kata-deploy:"; \
ldd /tmp/kata-deploy || echo "ldd failed"; \
echo "Libraries needed by nsenter:"; \
ldd /usr/bin/nsenter || echo "ldd failed"; \
\
# Extract and copy all library paths from both binaries \
for binary in /tmp/kata-deploy /usr/bin/nsenter; do \
echo "Processing ${binary}..."; \
# Get libraries with "=>" (shared libs) \
ldd "${binary}" 2>/dev/null | grep "=>" | awk '{print $3}' | sort -u | while read -r lib; do \
if [ -n "${lib}" ] && [ -f "${lib}" ]; then \
dest_dir="/output$(dirname "${lib}")"; \
mkdir -p "${dest_dir}"; \
cp -Ln "${lib}" "${dest_dir}/" 2>/dev/null || true; \
echo " Copied lib: ${lib}"; \
fi; \
done; \
done; \
\
# Copy the dynamic linker - it's at /lib/ld64.so.1 (not /lib64/) \
echo "Copying dynamic linker:"; \
mkdir -p /output/lib; \
cp -Ln /lib/ld64.so* /output/lib/ 2>/dev/null || true; \
cp -Ln /lib64/ld64.so* /output/lib64/ 2>/dev/null || true; \
\
echo "glibc" > /output/.libc-type; \
;; \
*) \
echo "amd64/arm64: will use musl-based static binaries"; \
echo "musl" > /output/.libc-type; \
# Create placeholder so COPY doesn't fail \
touch /output/lib/.placeholder; \
touch /output/lib64/.placeholder; \
touch /output/usr/bin/.placeholder; \
;; \
esac
# Copy musl nsenter from alpine for amd64/arm64
COPY --from=artifact-extractor /usr/bin/nsenter /output/usr/bin/nsenter-musl
COPY --from=artifact-extractor /lib/ld-musl-*.so.1 /output/lib/
# For amd64/arm64, use the musl nsenter; for ppc64le/s390x, keep the glibc one
RUN \
HOST_ARCH="$(uname -m)"; \
case "${HOST_ARCH}" in \
"x86_64"|"aarch64") \
mv /output/usr/bin/nsenter-musl /output/usr/bin/nsenter; \
;; \
*) \
rm -f /output/usr/bin/nsenter-musl; \
;; \
esac
#### kata-deploy main image
FROM gcr.io/distroless/static-debian12@sha256:87bce11be0af225e4ca761c40babb06d6d559f5767fbf7dc3c47f0f1a466b92c
ARG DESTINATION=/opt/kata-artifacts
# Copy extracted kata artifacts
COPY --from=artifact-extractor ${DESTINATION} ${DESTINATION}
# Copy Rust binary
COPY --from=rust-builder /kata-deploy/bin/kata-deploy /usr/bin/kata-deploy
# Copy nsenter and required libraries (assembled based on architecture)
COPY --from=runtime-assembler /output/usr/bin/nsenter /usr/bin/nsenter
COPY --from=runtime-assembler /output/lib/ /lib/
COPY --from=runtime-assembler /output/lib64/ /lib64/
# Copy nydus snapshotter
COPY nydus-snapshotter ${DESTINATION}/nydus-snapshotter
COPY --from=nydus-binary-downloader /opt/nydus-snapshotter/bin/containerd-nydus-grpc ${DESTINATION}/nydus-snapshotter/
COPY --from=nydus-binary-downloader /opt/nydus-snapshotter/bin/nydus-overlayfs ${DESTINATION}/nydus-snapshotter/
# Copy runtimeclasses and node-feature-rules
COPY node-feature-rules ${DESTINATION}/node-feature-rules
ENTRYPOINT ["/usr/bin/kata-deploy"]


@@ -1,232 +0,0 @@
# Copyright Intel Corporation, 2022 IBM Corp.
# Copyright (c) 2025 NVIDIA Corporation
#
# SPDX-License-Identifier: Apache-2.0
#### Nydus snapshotter & nydus image
FROM golang:1.24-alpine AS nydus-binary-downloader
# Keep the version here aligned with "nydus-snapshotter.version"
# in versions.yaml
ARG NYDUS_SNAPSHOTTER_VERSION=v0.15.10
ARG NYDUS_SNAPSHOTTER_REPO=https://github.com/containerd/nydus-snapshotter
RUN \
mkdir -p /opt/nydus-snapshotter && \
ARCH="$(uname -m)" && \
if [ "${ARCH}" = "x86_64" ]; then ARCH=amd64 ; fi && \
if [ "${ARCH}" = "aarch64" ]; then ARCH=arm64; fi && \
apk add --no-cache curl && \
curl -fOL --progress-bar "${NYDUS_SNAPSHOTTER_REPO}/releases/download/${NYDUS_SNAPSHOTTER_VERSION}/nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz" && \
tar xvzpf "nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz" -C /opt/nydus-snapshotter && \
rm "nydus-snapshotter-${NYDUS_SNAPSHOTTER_VERSION}-linux-${ARCH}.tar.gz"
#### Build binary package
FROM ubuntu:22.04 AS rust-builder
# Default to Rust 1.90.0
ARG RUST_TOOLCHAIN=1.90.0
ENV DEBIAN_FRONTEND=noninteractive
ENV RUSTUP_HOME="/opt/rustup"
ENV CARGO_HOME="/opt/cargo"
ENV PATH="/opt/cargo/bin/:${PATH}"
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN \
mkdir ${RUSTUP_HOME} ${CARGO_HOME} && \
chmod -R a+rwX ${RUSTUP_HOME} ${CARGO_HOME}
RUN \
apt-get update && \
apt-get --no-install-recommends -y install \
ca-certificates \
curl \
gcc \
libc6-dev \
musl-tools && \
apt-get clean && rm -rf /var/lib/apt/lists/ && \
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain ${RUST_TOOLCHAIN}
WORKDIR /kata-deploy
# Copy standalone binary project
COPY binary /kata-deploy
# Install target and run tests based on architecture
# - AMD64/arm64: use musl for fully static binaries
# - PPC64le/s390x: use glibc (musl has issues on these platforms)
RUN \
HOST_ARCH="$(uname -m)"; \
rust_arch=""; \
rust_target=""; \
case "${HOST_ARCH}" in \
"x86_64") \
rust_arch="x86_64"; \
rust_target="${rust_arch}-unknown-linux-musl"; \
echo "Installing musl target for ${rust_target}"; \
rustup target add "${rust_target}"; \
;; \
"aarch64") \
rust_arch="aarch64"; \
rust_target="${rust_arch}-unknown-linux-musl"; \
echo "Installing musl target for ${rust_target}"; \
rustup target add "${rust_target}"; \
;; \
"ppc64le") \
rust_arch="powerpc64le"; \
rust_target="${rust_arch}-unknown-linux-gnu"; \
echo "Using glibc target for ${rust_target} (musl is not well supported on ppc64le)"; \
;; \
"s390x") \
rust_arch="s390x"; \
rust_target="${rust_arch}-unknown-linux-gnu"; \
echo "Using glibc target for ${rust_target} (musl is not well supported on s390x)"; \
;; \
*) echo "Unsupported architecture: ${HOST_ARCH}" && exit 1 ;; \
esac; \
echo "${rust_target}" > /tmp/rust_target
# Run tests with --test-threads=1 to prevent environment variable pollution between tests;
# this is fine since we never run multiple binaries at the same time.
RUN \
rust_target="$(cat /tmp/rust_target)"; \
echo "Running binary tests with target ${rust_target}..." && \
RUSTFLAGS="-D warnings" cargo test --target "${rust_target}" -- --test-threads=1 && \
echo "All tests passed!"
RUN \
rust_target="$(cat /tmp/rust_target)"; \
echo "Building kata-deploy binary for ${rust_target}..." && \
RUSTFLAGS="-D warnings" cargo build --release --target "${rust_target}" && \
mkdir -p /kata-deploy/bin && \
cp "/kata-deploy/target/${rust_target}/release/kata-deploy" /kata-deploy/bin/kata-deploy && \
echo "Cleaning up build artifacts to save disk space..." && \
rm -rf /kata-deploy/target && \
cargo clean
#### Extract kata artifacts
FROM alpine:3.22 AS artifact-extractor
ARG KATA_ARTIFACTS=kata-static.tar.zst
ARG DESTINATION=/opt/kata-artifacts
COPY ${KATA_ARTIFACTS} /tmp/
RUN \
apk add --no-cache tar zstd util-linux-misc && \
mkdir -p "${DESTINATION}" && \
tar --zstd -xf "/tmp/$(basename "${KATA_ARTIFACTS}")" -C "${DESTINATION}" && \
rm -f "/tmp/$(basename "${KATA_ARTIFACTS}")"
#### Prepare runtime dependencies (nsenter and required libraries)
# This stage assembles all runtime dependencies based on architecture
# using ldd to find exact library dependencies
FROM debian:bookworm-slim AS runtime-assembler
ARG DESTINATION=/opt/kata-artifacts
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN \
apt-get update && \
apt-get --no-install-recommends -y install \
util-linux && \
apt-get clean && rm -rf /var/lib/apt/lists/
# Copy the built binary to analyze its dependencies
COPY --from=rust-builder /kata-deploy/bin/kata-deploy /tmp/kata-deploy
# Create output directories
RUN mkdir -p /output/lib /output/lib64 /output/usr/bin
# Use ldd to find and copy all required libraries for the kata-deploy binary and nsenter
RUN \
HOST_ARCH="$(uname -m)"; \
echo "Preparing runtime dependencies for ${HOST_ARCH}"; \
case "${HOST_ARCH}" in \
"ppc64le"|"s390x") \
echo "Using glibc - copying libraries based on ldd output"; \
\
# Copy nsenter \
cp /usr/bin/nsenter /output/usr/bin/nsenter; \
\
# Show what the binaries need \
echo "Libraries needed by kata-deploy:"; \
ldd /tmp/kata-deploy || echo "ldd failed"; \
echo "Libraries needed by nsenter:"; \
ldd /usr/bin/nsenter || echo "ldd failed"; \
\
# Extract and copy all library paths from both binaries \
for binary in /tmp/kata-deploy /usr/bin/nsenter; do \
echo "Processing ${binary}..."; \
# Get libraries with "=>" (shared libs) \
ldd "${binary}" 2>/dev/null | grep "=>" | awk '{print $3}' | sort -u | while read -r lib; do \
if [ -n "${lib}" ] && [ -f "${lib}" ]; then \
dest_dir="/output$(dirname "${lib}")"; \
mkdir -p "${dest_dir}"; \
cp -Ln "${lib}" "${dest_dir}/" 2>/dev/null || true; \
echo " Copied lib: ${lib}"; \
fi; \
done; \
done; \
\
# Copy the dynamic linker - it's at /lib/ld64.so.1 (not /lib64/) \
echo "Copying dynamic linker:"; \
mkdir -p /output/lib; \
cp -Ln /lib/ld64.so* /output/lib/ 2>/dev/null || true; \
cp -Ln /lib64/ld64.so* /output/lib64/ 2>/dev/null || true; \
\
echo "glibc" > /output/.libc-type; \
;; \
*) \
echo "amd64/arm64: will use musl-based static binaries"; \
echo "musl" > /output/.libc-type; \
# Create placeholder so COPY doesn't fail \
touch /output/lib/.placeholder; \
touch /output/lib64/.placeholder; \
touch /output/usr/bin/.placeholder; \
;; \
esac
# Copy musl nsenter from alpine for amd64/arm64
COPY --from=artifact-extractor /usr/bin/nsenter /output/usr/bin/nsenter-musl
COPY --from=artifact-extractor /lib/ld-musl-*.so.1 /output/lib/
# For amd64/arm64, use the musl nsenter; for ppc64le/s390x, keep the glibc one
RUN \
HOST_ARCH="$(uname -m)"; \
case "${HOST_ARCH}" in \
"x86_64"|"aarch64") \
mv /output/usr/bin/nsenter-musl /output/usr/bin/nsenter; \
;; \
*) \
rm -f /output/usr/bin/nsenter-musl; \
;; \
esac
#### kata-deploy main image
FROM gcr.io/distroless/static-debian12@sha256:87bce11be0af225e4ca761c40babb06d6d559f5767fbf7dc3c47f0f1a466b92c
ARG DESTINATION=/opt/kata-artifacts
# Copy extracted kata artifacts
COPY --from=artifact-extractor ${DESTINATION} ${DESTINATION}
# Copy Rust binary
COPY --from=rust-builder /kata-deploy/bin/kata-deploy /usr/bin/kata-deploy
# Copy nsenter and required libraries (assembled based on architecture)
COPY --from=runtime-assembler /output/usr/bin/nsenter /usr/bin/nsenter
COPY --from=runtime-assembler /output/lib/ /lib/
COPY --from=runtime-assembler /output/lib64/ /lib64/
# Copy nydus snapshotter
COPY nydus-snapshotter ${DESTINATION}/nydus-snapshotter
COPY --from=nydus-binary-downloader /opt/nydus-snapshotter/bin/containerd-nydus-grpc ${DESTINATION}/nydus-snapshotter/
COPY --from=nydus-binary-downloader /opt/nydus-snapshotter/bin/nydus-overlayfs ${DESTINATION}/nydus-snapshotter/
# Copy runtimeclasses and node-feature-rules
COPY node-feature-rules ${DESTINATION}/node-feature-rules
ENTRYPOINT ["/usr/bin/kata-deploy"]


@@ -36,7 +36,9 @@ const ALL_SHIMS: &[&str] = &[
"qemu-se",
"qemu-se-runtime-rs",
"qemu-snp",
"qemu-snp-runtime-rs",
"qemu-tdx",
"qemu-tdx-runtime-rs",
];
/// Check if a shim is a QEMU-based shim (all QEMU shims start with "qemu")
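The helper body itself is elided by this hunk; going by the doc comment, it presumably reduces to a prefix check. A hedged sketch, not the PR's code:

// Hypothetical sketch of the predicate described above; the real helper
// is not shown in this diff.
fn is_qemu_shim(shim: &str) -> bool {
    shim.starts_with("qemu")
}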
@@ -824,6 +826,8 @@ VERSION_ID="24.04"
"qemu"
);
assert_eq!(get_hypervisor_name("qemu-se-runtime-rs").unwrap(), "qemu");
assert_eq!(get_hypervisor_name("qemu-snp-runtime-rs").unwrap(), "qemu");
assert_eq!(get_hypervisor_name("qemu-tdx-runtime-rs").unwrap(), "qemu");
}
#[test]


@@ -85,7 +85,11 @@ pub async fn configure_snapshotter(
runtime: &str,
config: &Config,
) -> Result<()> {
let pluginid = if fs::read_to_string(&config.containerd_conf_file)
// Get all paths and drop-in capability in one call
let paths = config.get_containerd_paths(runtime).await?;
// Read containerd version from config_file to determine pluginid
let pluginid = if fs::read_to_string(&paths.config_file)
.unwrap_or_default()
.contains("version = 3")
{
@@ -94,28 +98,21 @@ pub async fn configure_snapshotter(
"\"io.containerd.grpc.v1.cri\".containerd"
};
let use_drop_in =
crate::runtime::is_containerd_capable_of_using_drop_in_files(config, runtime).await?;
let configuration_file: std::path::PathBuf = if use_drop_in {
// Ensure we have the absolute path with /host prefix
let base_path = if config.containerd_drop_in_conf_file.starts_with("/host") {
// Already has /host prefix
Path::new(&config.containerd_drop_in_conf_file).to_path_buf()
let configuration_file: std::path::PathBuf = if paths.use_drop_in {
// Only add /host prefix if path is not in /etc/containerd (which is mounted from host)
let base_path = if paths.drop_in_file.starts_with("/etc/containerd/") {
Path::new(&paths.drop_in_file).to_path_buf()
} else {
// Need to add /host prefix
let drop_in_path = config.containerd_drop_in_conf_file.trim_start_matches('/');
// Need to add /host prefix for paths outside /etc/containerd
let drop_in_path = paths.drop_in_file.trim_start_matches('/');
Path::new("/host").join(drop_in_path)
};
log::debug!("Snapshotter using drop-in config file: {:?}", base_path);
base_path
} else {
log::debug!(
"Snapshotter using main config file: {}",
config.containerd_conf_file
);
Path::new(&config.containerd_conf_file).to_path_buf()
log::debug!("Snapshotter using main config file: {}", paths.config_file);
Path::new(&paths.config_file).to_path_buf()
};
match snapshotter {


@@ -7,6 +7,22 @@ use anyhow::{Context, Result};
use log::info;
use std::env;
/// Containerd configuration paths and capabilities for a specific runtime
#[derive(Debug, Clone)]
pub struct ContainerdPaths {
/// File to read containerd version from and write to (non-drop-in mode)
pub config_file: String,
/// Backup file path before modification
pub backup_file: String,
/// File to add/remove drop-in imports from (drop-in mode)
/// None if imports are not needed (e.g., k0s auto-loads from containerd.d/)
pub imports_file: Option<String>,
/// Path to the drop-in configuration file
pub drop_in_file: String,
/// Whether drop-in files can be used (based on containerd version)
pub use_drop_in: bool,
}
#[derive(Debug, Clone)]
pub struct Config {
pub node_name: String,
@@ -359,6 +375,50 @@ impl Config {
self.experimental_force_guest_pull_for_arch.join(",")
);
}
/// Get containerd configuration file paths based on runtime type and containerd version
pub async fn get_containerd_paths(&self, runtime: &str) -> Result<ContainerdPaths> {
use crate::runtime::manager;
// Check if drop-in files can be used based on containerd version
let use_drop_in = manager::is_containerd_capable_of_using_drop_in_files(self, runtime).await?;
let paths = match runtime {
"k0s-worker" | "k0s-controller" => ContainerdPaths {
config_file: "/etc/containerd/containerd.toml".to_string(),
backup_file: "/etc/containerd/containerd.toml.bak".to_string(), // Never used, but needed for consistency
imports_file: None, // k0s auto-loads from containerd.d/, imports not needed
drop_in_file: "/etc/containerd/containerd.d/kata-deploy.toml".to_string(),
use_drop_in,
},
"microk8s" => ContainerdPaths {
// microk8s uses containerd-template.toml instead of config.toml
config_file: "/etc/containerd/containerd-template.toml".to_string(),
backup_file: "/etc/containerd/containerd-template.toml.bak".to_string(),
imports_file: Some("/etc/containerd/containerd-template.toml".to_string()),
drop_in_file: self.containerd_drop_in_conf_file.clone(),
use_drop_in,
},
"k3s" | "k3s-agent" | "rke2-agent" | "rke2-server" => ContainerdPaths {
// k3s/rke2 generates config.toml from config.toml.tmpl on each restart
// We must modify the template file so our changes persist
config_file: "/etc/containerd/config.toml.tmpl".to_string(),
backup_file: "/etc/containerd/config.toml.tmpl.bak".to_string(),
imports_file: Some("/etc/containerd/config.toml.tmpl".to_string()),
drop_in_file: self.containerd_drop_in_conf_file.clone(),
use_drop_in,
},
_ => ContainerdPaths {
config_file: self.containerd_conf_file.clone(),
backup_file: self.containerd_conf_file_backup.clone(),
imports_file: Some(self.containerd_conf_file.clone()),
drop_in_file: self.containerd_drop_in_conf_file.clone(),
use_drop_in,
},
};
Ok(paths)
}
}
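A hedged usage sketch of get_containerd_paths (pick_target_file is a hypothetical helper, not part of this diff): callers resolve the paths once and then choose between the drop-in file and the main config file, mirroring what configure_snapshotter does above.

// Hedged sketch; pick_target_file is a made-up name, not PR code.
async fn pick_target_file(config: &Config, runtime: &str) -> anyhow::Result<String> {
    let paths = config.get_containerd_paths(runtime).await?;
    // Prefer the drop-in file when this containerd version supports drop-ins.
    Ok(if paths.use_drop_in {
        paths.drop_in_file
    } else {
        paths.config_file
    })
}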
fn get_arch() -> Result<String> {
@@ -379,7 +439,7 @@ fn get_arch() -> Result<String> {
/// Returns only shims that are supported for that architecture
fn get_default_shims_for_arch(arch: &str) -> &'static str {
match arch {
"x86_64" => "clh cloud-hypervisor dragonball fc qemu qemu-coco-dev qemu-coco-dev-runtime-rs qemu-runtime-rs qemu-nvidia-gpu qemu-nvidia-gpu-snp qemu-nvidia-gpu-tdx qemu-snp qemu-tdx",
"x86_64" => "clh cloud-hypervisor dragonball fc qemu qemu-coco-dev qemu-coco-dev-runtime-rs qemu-runtime-rs qemu-nvidia-gpu qemu-nvidia-gpu-snp qemu-nvidia-gpu-tdx qemu-snp qemu-snp-runtime-rs qemu-tdx qemu-tdx-runtime-rs",
"aarch64" => "clh cloud-hypervisor dragonball fc qemu qemu-nvidia-gpu qemu-cca",
"s390x" => "qemu qemu-runtime-rs qemu-se qemu-se-runtime-rs qemu-coco-dev qemu-coco-dev-runtime-rs",
"ppc64le" => "qemu",


@@ -459,13 +459,48 @@ impl K8sClient {
}
}
/// Split a JSONPath string by dots, but respect escaped dots (\.)
/// Example: "metadata.labels.microk8s\.io/cluster" -> ["metadata", "labels", "microk8s.io/cluster"]
fn split_jsonpath(path: &str) -> Vec<String> {
let mut parts = Vec::new();
let mut current = String::new();
let mut chars = path.chars().peekable();
while let Some(c) = chars.next() {
if c == '\\' {
// Check if next char is a dot (escaped dot)
if chars.peek() == Some(&'.') {
current.push(chars.next().unwrap()); // Add the dot literally
} else {
current.push(c); // Keep the backslash
}
} else if c == '.' {
if !current.is_empty() {
parts.push(current);
current = String::new();
}
} else {
current.push(c);
}
}
if !current.is_empty() {
parts.push(current);
}
parts
}
/// Get value from JSON using JSONPath-like syntax (simplified)
fn get_jsonpath_value(obj: &serde_json::Value, jsonpath: &str) -> Result<String> {
// Simple JSONPath implementation for common cases
// Supports: .field, .field.subfield, [index]
// Supports: .field, .field.subfield, [index], escaped dots (\.)
let mut current = serde_json::to_value(obj)?;
for part in jsonpath.trim_start_matches('.').split('.') {
// Split by unescaped dots only
let parts = split_jsonpath(jsonpath.trim_start_matches('.'));
for part in parts {
if part.is_empty() {
continue;
}
@@ -489,7 +524,7 @@ fn get_jsonpath_value(obj: &serde_json::Value, jsonpath: &str) -> Result<String>
.clone();
} else {
current = current
.get(part)
.get(&part)
.ok_or_else(|| anyhow::anyhow!("Field '{part}' not found"))?
.clone();
}
@@ -580,6 +615,25 @@ pub async fn update_runtimeclass(
mod tests {
use super::*;
#[test]
fn test_split_jsonpath_simple() {
let parts = split_jsonpath("metadata.labels.foo");
assert_eq!(parts, vec!["metadata", "labels", "foo"]);
}
#[test]
fn test_split_jsonpath_escaped_dot() {
// microk8s\.io/cluster should become a single key: microk8s.io/cluster
let parts = split_jsonpath(r"metadata.labels.microk8s\.io/cluster");
assert_eq!(parts, vec!["metadata", "labels", "microk8s.io/cluster"]);
}
#[test]
fn test_split_jsonpath_multiple_escaped_dots() {
let parts = split_jsonpath(r"a\.b\.c.d");
assert_eq!(parts, vec!["a.b.c", "d"]);
}
#[test]
fn test_get_jsonpath_value() {
let json = json!({
@@ -593,4 +647,18 @@ mod tests {
let result = get_jsonpath_value(&json, ".status.nodeInfo.containerRuntimeVersion").unwrap();
assert_eq!(result, "containerd://1.7.0");
}
#[test]
fn test_get_jsonpath_value_with_escaped_dot() {
let json = json!({
"metadata": {
"labels": {
"microk8s.io/cluster": "true"
}
}
});
let result = get_jsonpath_value(&json, r".metadata.labels.microk8s\.io/cluster").unwrap();
assert_eq!(result, "true");
}
}

Some files were not shown because too many files have changed in this diff Show More