Adapt create_containerd_config to work with containerd 2.x while
keeping compatibility with v1.x for completeness:
- Drop the direct config.toml patching in favour of conf.d fragments:
use containerd_render_config_default_with_imports to generate the
base config, then write separate drop-ins for API socket overrides,
debug settings, and the Kata runtime.
- Use CONTAINERD_SYSTEM_FRAGMENT_PREFIX directly (no PREFIX= indirection).
- Detect cfg_schema via _containerd_blob_schema_version to select the
right plugin table:
schema >= 3 -> io.containerd.cri.v1.runtime
schema 2 -> io.containerd.grpc.v1.cri
and to emit the sandboxer field only on schema >= 3.
- Pass GOTOOLCHAIN via "sudo -E make clean" so the environment variable
set by export_go_toolchain_for_containerd_source_builds is preserved
during the containerd source build.
The require_containerd_binary_default_schema_v3_plus call is kept: the
test explicitly clones and builds containerd 2.x from source, so a
schema v2 binary should never appear here.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <noreply@cursor.com>
Configure containerd for nydus differently depending on the active
config schema, because conf.d drop-in fragments are only honoured the
same way by containerd 2.x.
config_containerd now delegates to _containerd_resolved_schema_version
(from common.bash) to detect the active schema and passes it to
config_containerd_core, which emits schema-appropriate config:
schema >= 3 (containerd v2.x):
Keep the base config and add a conf.d drop-in fragment using the
io.containerd.cri.v1.runtime plugin (sandboxer = 'podsandbox') and
io.containerd.cri.v1.images to select nydus as the snapshotter.
schema 2 (containerd v1.x):
conf.d is not honoured the same way, so replace config.toml
wholesale with a complete, self-contained file using the
io.containerd.grpc.v1.cri plugin with nydus as the snapshotter and
no sandboxer field.
The [proxy_plugins] block is written in both cases as it is
schema-version agnostic.
Teardown restores the whole config.toml (schema v2 path) or removes the
drop-in fragment (schema v3+ path) as appropriate.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <noreply@cursor.com>
Enable the hard-coded init-data policy test gate for qemu-tdx-runtime-rs
so runtime-rs and Go TDX variants exercise the same Kubernetes policy
coverage.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The coco initdata tests signature verification and authenticated registry
never worked on qemu-tdx and so they have been disabled since.
Add them back now that all necessary fixes are in place.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
initdata tests set kernel arguments to "" which resets the
kernel arguments configured by Helm install. However, TDX
runner depends on agent.https_proxy= kernel arguments to pull
images.
In order for initdata tests to work on TDX, the same needs to
be added to CDH configuration via image.image_pull_proxy.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
No need to patch yamls locally. Also, set RUST_LOG=debug
and enable https_proxy for all TDX targets when the runner
has HTTPS_PROXY is set.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
k8s-sandbox-vcpus-allocation.bats was disabled for qemu-tdx due to
errors when moving to use "upstream" TDX KVM code. The failing test
is vcpus-less-than-one-with-no-limits pod which ends up getting
x86 default MaxCPU = 240 and erroring:
Number of hotpluggable cpus requested (240) exceeds the maximum cpus supported by KVM (224)
TDX max vcpus is capped to host's logical CPUs so 240 is too much.
With the maxcpus logic fixed (=maxcpus not set at all) for configurations
where confidential guest is enabled, qemu-tdx can be enabled for
k8s-sandox-vcpus-allocation.bats again.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Switch AKS Mariner matrix entries to clh-azure handlers and remove the
temporary host-OS based helm value overrides.
Update integration test wiring and required test labels so CI tracks the
new runtime names.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The empty-image test expects pod creation to fail. With an EROFS
snapshot that has a disk-backed rwlayer, runtime-rs can still reject
that pod with the existing unsupported mount-count error.
With default_size=0, there is no rwlayer mount. The same negative test
can instead reach the bind rootfs shape produced for the empty active
snapshot, which runtime-rs rejects as an unsupported rootfs mount.
Accept both messages so the test covers the expected failure for both
EROFS rwlayer modes.
Assisted-by: OpenAI Codex <codex@openai.com>
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
The current if condition causes agent security policies to be
attached to the non-TEE NVIDIA runtime-rs runtime class. While
this is good to see that it works, this is not intended. Thus,
replacting the condition with is_confidential_gpu_hypervisor.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Add k8s-nvidia-numa.bats with five tests that validate NUMA behaviour
on hosts where NUMA is configured by default (qemu-nvidia-gpu,
qemu-nvidia-gpu-snp, qemu-nvidia-gpu-tdx):
1. Multi-node sandbox (large workload spanning all host NUMA nodes):
- Guest NUMA node count matches host
- Guest vCPU distribution is balanced across nodes (max-min <= 1)
- Guest memory is distributed across NUMA nodes
- Host-side vCPU pinning is balanced across NUMA nodes
2. Right-sized single-node sandbox (small workload fitting one node):
- Guest collapses to a single NUMA node
- All host vCPU threads pinned to that one NUMA node
3. GPU passthrough with VFIO, multi-node:
- Guest NUMA topology is balanced (same as test 1)
- Guest GPU's NUMA node matches the host GPU's NUMA node
(resolved via the vfio-pci,host=<BDF> from the QEMU command
line and /sys/bus/pci/devices/<BDF>/numa_node)
- QEMU command line contains pxb-pcie and policy=bind
- Host vCPU pinning is balanced
4. GPU passthrough with VFIO, right-sized single-node: small workload
plus GPU that fits in a single host NUMA node:
- Guest collapses to a single NUMA node
- The chosen node is the GPU's host NUMA node, not just any node
that fits — verified by matching host-nodes= in the memory
backend and pxb-pcie numa_node= against the GPU's host node
- Guest GPU reports the same NUMA node as the host GPU
5. Explicit numa_mapping in the runtime TOML (QEMU-only):
- Drops a config.d/ fragment that sets numa_mapping = ["1"], so the
auto-derive + right-sizing path is bypassed entirely
- Guest sees exactly 1 NUMA node
- QEMU memory backend is bound to host node 1 (host-nodes=1,
policy=bind), not host node 0
- Host-side vCPU threads land on host node 1
- Drop-in is removed on teardown so subsequent tests are unaffected
Guest-side checks use a dedicated container image
(quay.io/kata-containers/numa) that reads sysfs and prints results to
stdout — no kubectl exec or CoCo policy overrides needed.
Host-side checks (crictl, pgrep, taskset) run directly on the host
via sudo; a standalone numa-pinning-check.sh script handles the vCPU
thread affinity inspection. The config.d/ helpers used by test 5 are
runtime-agnostic (probe Go vs runtime-rs layout on disk) but the test
is gated to qemu-* shims since runtime-rs does not yet implement
NUMA.
Skips cleanly on single-NUMA hosts, unsupported hypervisors, or when
no nvidia.com/pgpu resources are available (GPU tests only).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
Temporarily skip the `TestContainerMemoryUpdate` test case
for sandbox api.
This test case is currently skipped in other VMMs (e.g.,
QEMU, Cloud-Hypervisor) due to known issues and environmental
stability concerns.
To maintain consistency across the project, we are skipping it
for sandbox as well.
A follow-up PR will be dedicated to addressing these issues and
properly enabling/refining this test case for all VMMs.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
containerd 2.3 requires Go 1.26.3, but Kata still pins Go 1.25.10.
Use Go 1.26.3 for the sandbox-api job so that make cri-integration
can build containerd from source.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Creating a new container in the same sandbox VM after the previous
container has exited and been removed has never been supported by
kata-containers (neither with the go-based nor the rust-based runtime).
When the last container is removed the kata VM shuts down, so any
attempt to start a new container in the same sandbox fails.
This test exercises a use-case kata does not currently support, and it
has never been part of the passing list for good reason. Mark it
explicitly excluded with a comment so it is clear this is a deliberate
omission rather than an oversight.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
The check_daemon_setup function verifies that containerd + runc are
functional before the real kata tests run. Using the shim sandboxer
for this runc check hits a known containerd bug where the OCI spec
is not populated before NewBundle is called, so config.json is never
written and containerd-shim-runc-v2 fails at startup.
See containerd/containerd#11640
The sandboxer choice is irrelevant for this sanity check, so use
podsandbox which works correctly with runc.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Since gc and trustee were bumped (#13046), the test
"Cannot get CDH resource when affirming policy is set without reference values"
has started failing for IBM SEL.
The attestation policy for IBM SEL returns an "affirming"
result whenever the claim can be parsed successfully,
meaning the evidence verification succeeds. As a result,
the negative test above always produces a positive result.
Skip this negative test for IBM SEL environments
(e.g. qemu-se*).
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Exit early with an error message instead of starting kata-deploy if
the value of KATA_HYPERVISOR is not expected during CI.
For example: "cloud-hypervisor" was renamed recently to
"clh-runtime-rs" and user scripts depending on the old name were
getting tangled in kata-deploy instead of just rejecting the old
value quickly.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Now that runtime-rs supports block-encrypted emptyDir volumes, remove
the no-trusted-storage workaround templates and the is_runtime_rs
branching in the NIM test. Runtime-rs now uses the same TEE templates
as the Go runtime with emptyDir + PVC at 48Gi memory, instead of the
128Gi workaround that compensated for lacking trusted storage.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
Remove the runtime-rs skip from the trusted ephemeral data storage
test now that runtime-rs implements block-encrypted emptyDir volumes.
Also remove the genpolicy drop-in that disabled encrypted_emptydir
for runtime-rs and the corresponding copy logic in tests_common.sh.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
Update CDH to a newer version and:
- adjust the NVIDIA root filesystem build to reflect the change from
using libcryptsetup to using the cryptsetup binary.
- adjust image-pull test cases to conduct parallel write operations
on the /dev/trusted_store backed guest image pull location since
issue #12721 has been solved on CDH side.
Fixes#12721
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
k8s-measured-rootfs only runs on confidential runtime,
so we should move it into the subset on tests that run on TEEs
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Replace guest-pull image allow-all placeholders with explicit
auto-generated policies for each generated pod manifest.
Generate policy after the final YAML edits so initdata and image
pull secrets are represented in the policy inputs.
Assisted-by: OpenAI Codex <codex@openai.com>
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Teach auto_generate_policy to reuse a cc_init_data annotation by
decoding it into the temporary default-initdata.toml file.
This lets tests preserve CDH initdata while genpolicy appends the
generated agent security policy for the workload.
Assisted-by: OpenAI Codex <codex@openai.com>
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Move the Docker auth setup into common.bash so tests beyond the
NVIDIA runner can provide credentials for genpolicy image pulls.
Make the registry, username, password and output directory explicit
while preserving the nvcr.io setup used by the NIM tests.
Assisted-by: OpenAI Codex <codex@openai.com>
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
BATS_TEST_COMPLETED is per-test and remains empty in teardown_file.
Track file-level state so successful NIM runs skip the journal dump
while setup or test failures still include node diagnostics.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Place the NIM service into our test namespace. We are still observing
various situations where for some reasons, the NIM service appears in
the default namespace in our CI.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Wait for the NIM operator pod to run before deploying NIM services.
Add a temporary debug function to print resource placement into the
different namespaces. Remove this function again when the NIM tests
are stabilized.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Add basic genpolicy support for container environment variables sourced
from metadata.labels.
In this implementation, the relevant labels must be available as input
to the policy tool. This is slightly different from the way variables
sourced from metadata.annotations are treated by the tool: when the
relevant annotation is not available as input, the generated Policy
allows any value. Depending on metadata.labels use cases that we might
encounter maybe the labels will be handled the same way as the
annotations in the future.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
In cleanup_kata_deploy, bail out early when no kata-deploy Helm release
exists so baremetal-* pre-deploy cleanup on fresh clusters does not
block on helm uninstall --wait (up to 10m).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
Add qemu-nvidia-gpu-runtime-rs and qemu-nvidia-gpu-snp-runtime-rs to
the NVIDIA GPU test matrix so CI covers the new runtime-rs shims.
Introduce a `coco` boolean field in each matrix entry and use it for
all CoCo-related conditionals (KBS, snapshotter, KBS deploy/cleanup
steps). This replaces fragile name-string comparisons that were already
broken for the runtime-rs variants: `nvidia-gpu (runtime-rs)` was
incorrectly getting KBS steps, and `nvidia-gpu-snp (runtime-rs)` was
not getting the right env vars.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
1. Ignore PodAffinity's preferredDuringSchedulingIgnoredDuringExecution.
2. Ignore additional PodAffinityTerm fields.
3. Add basic tests for the new fields.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
The cron-job test workload was missing `runtimeClassName: kata`, which
meant the cron job was not actually being executed under the Kata
runtime, defeating the purpose of the test.
Set it explicitly, consistent with the sibling `job.yaml` workload.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The ITA_KEY secret was conditionally passed to TDX jobs for Intel
Trust Authority attestation, but it is no longer needed. Remove it
from all workflow files and the test helper export.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>