Joji's added the labels for the default values.yaml, but we missed
adding those to the nvidia specific values.yaml file.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When do cargo fmt --all, some files changes as unformatted with
`cargo fmt`. This commit is just to address it.
Just use this as an example:
```
// Generate the common drop-in files (shared with standard
// runtimes)
- write_common_drop_ins(config, &runtime.base_config,
&config_d_dir, container_runtime)?;
+ write_common_drop_ins(
+ config,
+ &runtime.base_config,
+ &config_d_dir,
+ container_runtime,
+ )?;
```
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Without those, we'd end up pulling the same / old rootfs that's cached
without re-building it in case of a bump in any of those components.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
- Updated image-rs from rev 026694d4 to tag v0.18.0
- This update brings rsa 0.9.10 which fixes CVE-2026-21895
- Resolves vulnerability in indirect dependencies
Signed-off-by: pavithiran34 <pavithiran.p@ibm.com>
The attestation-agent no longer sets nvidia devices to ready
automatically. Instead, we should use nvrc for this. Since this is
required for all nvidia workloads, add it to the default nv kernel
params.
With bounce buffers, the timing of attesting a device versus setting it
to ready is not so important.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
The create_container_timeout key was placed after the
[agent.@PROJECT_TYPE@.mem_agent] TOML section header, which meant
TOML parsed it as a field of mem_agent rather than of the parent
agent table. This was silently ignored before, but now that
MemAgent has #[serde(deny_unknown_fields)] it causes a parse error.
Move the key above the [mem_agent] section so it belongs to the
correct [agent.@PROJECT_TYPE@] table.
Also fix configuration-qemu-coco-dev which had a duplicate entry:
keep only the correctly placed one with the COCO timeout value.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
..where possible. Failing on unknown fields makes migration easier,
as we do not silently ignore configuration options that previously
worked in runtime-go. However, serde can't deny unknown fields
where flatten is used, so this can't be used everywhere sadly.
There were also errors in test fixtures that were unnoticed.
These are fixed here, too.
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We recently moved the default policy in the Trustee repo. Now it's in
the same place as all the other policies. Update the test code to match.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
Pick up the new version of guest-components which uses NVAT bindings
instead of NVML bindings. This will allow us to attests guests with
nvswitches.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
Resolve externals.nydus-snapshotter version and url in the Docker image build
with yq from the repo-root versions.yaml instead of Dockerfile ARG defaults.
Drop the redundant workflow that only enforced parity between those two sources.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add tools/packaging/kata-deploy/binary as a workspace member, inherit shared
dependency versions from the root manifest, and refresh Cargo.lock.
Build the kata-deploy image from the repository root: copy the workspace
layout into the rust-builder stage, run cargo test/build with -p kata-deploy,
and adjust artifact and static asset COPY paths. Update the payload build
script to invoke docker buildx with -f .../Dockerfile from the repo root.
Add a repo-root .dockerignore to keep the Docker build context smaller.
Document running unit tests with cargo test -p kata-deploy from the root.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The busybox-pod.yaml test fixture sets tty: true on the second
container. When a container has a TTY, kubectl exec may return
\r\n line endings. The invisible \r causes string comparisons
to fail:
container_name=$(kubectl exec ... -- env | grep CONTAINER_NAME)
[ "$container_name" == "CONTAINER_NAME=second-test-container" ]
This comparison fails because $container_name contains a trailing
\r character.
Fix by piping through tr -d '\r' after grep. This is harmless
when \r is absent and fixes the mismatch when present.
Fixes: #9136
Signed-off-by: Rophy Tsai <rophy@users.noreply.github.com>
Trustee is compatible with old guest components (using NVML bindings) or
new guest components (using NVAT). If we have the new version of gc, we
can attest PPCIE guests, which we need the new version of Trustee to
verify.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
Update NVIDIA rootfs builder to include runtime dependencies for NVAT
Rust bindings.
The nvattest package does not include the .so file, so we need to build
from source.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
The attestation agent will soon rely on the NVAT rust bindings, which
have some built-time dependencies.
There is currently no nvattest-dev package, so we need to build from
source to get the headers and .so file.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
k3s and rke2 ship containerd 2.2.2, which requires the OCI 1.3.0
drop-in overlay. Move them from the separate OCI 1.2.1 branch into
the OCI 1.3.0 condition alongside nvidia-gpu, qemu-snp, qemu-tdx,
and custom container engine versions.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
--all option would trigger building and testing for everything within
our root workspace, which is not desired here. Let's specify the crates
of libs explicitly in our Makefile.
Signed-off-by: Ruoqing He <ruoqing.he@lingcage.com>
Remove libs from exclude list, and move them explicitly into root
workspace to make sure our core components are in a consistent state.
This is a follow up of #12413.
Signed-off-by: Ruoqing He <ruoqing.he@lingcage.com>
2ba0cb0d4a7 did the ground work for using OVMF even for the
qemu-nvidia-gpu, but missed actually setting the OVMF path to be used,
which we'e fixing now.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When TDX confidential guest support is enabled, set `kernel_irqchip=split`
for TDX CVM:
...
-machine \
q35,accel=kvm,kernel_irqchip=split,confidential-guest-support=tdx \
...
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
There's a typo in the error message which gets prompted when an
unsupported share_fs was configured. Fixed shred -> shared.
Signed-off-by: Yuting Nie <yuting.nie@spacemit.com>
Docker 26+ configures container networking (veth pair, IP addresses,
routes) after task creation rather than before. Kata's endpoint scan
runs during CreateSandbox, before the interfaces exist, resulting in
VMs starting without network connectivity (no -netdev passed to QEMU).
Add RescanNetwork() which runs asynchronously after the Start RPC.
It polls the network namespace until Docker's interfaces appear, then
hotplugs them to QEMU and informs the guest agent to configure them
inside the VM.
Additional fixes:
- mountinfo parser: find fs type dynamically instead of hardcoded
field index, fixing parsing with optional mount tags (shared:,
master:)
- IsDockerContainer: check CreateRuntime hooks for Docker 26+
- DockerNetnsPath: extract netns path from libnetwork-setkey hook
args with path traversal protection
- detectHypervisorNetns: verify PID ownership via /proc/pid/cmdline
to guard against PID recycling
- startVM guard: rescan when len(endpoints)==0 after VM start
Fixes: #9340
Signed-off-by: llink5 <llink5@users.noreply.github.com>
Onboard a test case for deploying a NIM service using the NIM
operator. We install the operator helm chart on the fly as this is
a fast operation, spinning up a single operand. Once a NIM service
is scheduled, the operator creates a deployment with a single pod.
For now, the TEE-based flow uses an allow-all policy. In future
work, we strive to support generating pod security policies for the
scenario where NIM services are deployed and the pod manifest is
being generated on the fly.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Do not run the NIM containers with elevated privileges. Note that,
using hostPath requires proper host folder permissions, and that
using emptyDir requires a proper fsGroup ID.
Once issue 11162 is resolved, we can further refine the securityContext
fields for the TEE manifests.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>