Downstream builders at Red Hat complain that `Cargo.lock` doesn't match
`Cargo.toml`.
Run `cargo check` to refresh `Cargo.lock`.
`git bisect` shows that 7cfb97d41b is the first commit where
`cargo check` has an effect in `src/agent`.
Signed-off-by: Greg Kurz <groug@kaod.org>
Add run_bats_tests() function to common.bash that provides consistent
test execution and reporting across all test suites (k8s, nvidia,
kata-deploy).
This removes duplicated test runner code from run_kubernetes_tests.sh,
run_kubernetes_nv_tests.sh, and run-kata-deploy-tests.sh.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The NVIDIA GPU test runner script was not generating test reports,
causing the report_tests() function in gha-run.sh to have nothing
to display. This aligns the script with run_kubernetes_tests.sh by:
- Adding set -o pipefail for proper pipeline error handling
- Creating a reports directory with timestamped subdirectory
- Capturing test output to files with ok-/not_ok- prefixes
- Adding --timing flag to bats for timing information
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's just point to the official documentation rather than explaining
exactly how to deploy (and the current text was very outdated).
Removing fluentd / minikube examples is out of context of this commit.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The runk tool hasn't been supported for a few years, with no maintainers
since ManaSugi stopped being involved in the project and the CI was
disabled in 2024.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This reverts commit 6130d7330f, as we're
officially swithcing to the rust version of kata-deploy.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
a2534e7bc8 introduced the logic to also
release a kata-tools tarball, but it missed allowing
KATA_TOOLS_STATIC_TARBALL env var to be passed to the release script,
leading to the following error during the release process:
```
ERROR: Invalid environment variable "KATA_TOOLS_STATIC_TARBALL"
```
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
In startVM(), for VMMs without hotplug support (e.g., Firecracker or
QEMU microvm), the runtime runs prestart hooks but misses rescanning
the network namespace. This causes VMs to boot with uninitialized
network configs, as updates from CNI plugins are not captured.
This patch adds a network rescan via AddEndpoints after prestart hooks
for the non-hotplug path, ensuring correct network info is passed to
the VMM configuration before the VM starts.
Fixes#11500
Signed-off-by: XanderC <xanderc@qq.com>
The virtio-9p is not supported for a long time, specially within
the runtime-rs, we have no such plan to support it. Removal of the
related items is reasonable.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
As Memory Agent feature is not used within CoCo(TDX/SNP) scenarios,
with this fact, it's better to just remove the related sections.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
It aims to introduce some related items within Makefile to enable
Intel SNP settings in configuration when do make build. And make it
possible to generate the rendered qemu-snp-runtime-rs configuration
based on the *.in template.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
To make it work well on the SEV-SNP platforms for qemu-runtime-rs with
coco, a dedicated SEV-SNP configuration should be introduced to help
prepare related CVM resources.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Enable measured rootfs within configuration when make build. And add
some other important items to make the configuration work well.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
It aims to introduce some related items within Makefile to enable
Intel TDX settings in configuration when do make build. And make it
possible to generate the rendered qemu-tdx-runtime-rs configuration
based on the *.in template.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
To make it work well on the TDX platforms for qemu-runtime-rs with
coco, a dedicated TDX configuration should be introduced to help
prepare related CVM resources.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Systemd-managed cgroups use the slice:prefix:name format, which is
not a filesystem path. Calling MoveTo() on such paths fails with
"invalid group path" and can abort cleanup before Delete() runs.
In some cases, this causes pod teardown delays.
Skip MoveTo for systemd-formatted sandbox/overhead cgroup paths when
sandbox_cgroup_only is true; systemd moves tasks on unit deletion.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
With cold-plug becoming by design the only supported mode with the
update of NVRC to v0.1.1, resolving references to hot-plug.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Enable post-install verification in kata-deploy CI tests. When
HELM_VERIFY_DEPLOYMENT is set, a simple verification pod is created
that runs with the Kata runtime to confirm deployment succeeded.
The verification pod prints kernel info and exits - success indicates
the Kata runtime is properly configured and functional.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add optional verification that runs after kata-deploy installation.
When a pod spec is provided via --set-file verification.pod=<file>,
a verification job runs after install/upgrade to validate deployment.
The user is fully responsible for the verification pod content:
- Pod name, runtimeClassName, annotations, and verification logic
- Pod must exit 0 on success, non-zero on failure
The verification job simply:
1. Waits for kata-deploy DaemonSet to be ready
2. Applies the user-provided pod spec
3. Waits for the pod to complete
4. Shows logs and cleans up
Usage:
helm install kata-deploy ... \
--set-file verification.pod=/path/to/your-pod.yaml
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
To unlock the release, move the job to publish kata payload after push to an alternate runner(IBM owned) for ppc64le.
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
The new NVRC version works for CC and non-CC use cases,
no --feature confidential needed anymore.
Bump versions.yaml and adjust deployment instructions.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Disable NVDIMM. When using GPU passthrough, using NVDIMM would create
a r/o file-backed memory region. When using a GPU, QEMU tries to DMA-
map guest memory for the device, resulting in a mapping error:
memory listener initialization failed: Region mem0:
vfio_container_dma_map ... -22 (Invalid argument).
For the CC configs, NVDIMM is disabled by default in qemu_amd64.go
with a warning, but we also explicitly disable the setting in the
shim configuration file.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
We don't need to store the kernel headers anymore. We do need to store
the kernel modules, instead.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We've done some bad file based driver determination,
now with versions.yaml there is a single source of truth.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We need to package the build modules for the rootfs
to be able to consume it. We package the whole
/lib/modules/$(uname -r) directory strip=2.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We want to have deterministic behaviour and only
one valid driver version acceptable via versions.yaml
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We actually never installed yq to the kernel build,
there are some path that use yq but were never hit,
for the GPU use-case we need to read values from versions.yaml
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
In preparation for coco v0.18.0, bump the version of image-rs we use in
agent-ctl to match what we have in versions.yaml.
Drop the snapshotter-overlayfs feature. This was dropped from image-rs
when we removed enclave-cc support.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>
Before cutting the Kata release that will be used with CoCo v0.18.0,
let's bump the versions of Trustee and guest-components to latest.
Signed-off-by: Tobin Feldman-Fitzthum <tfeldmanfitz@nvidia.com>