During the ${wait_time} for an expected condition, if
CreateContainerRequest was NOT expected to fail: detect possible
CreateContainerRequest failures early and abort the wait.
For example, before this change:
not ok 1 Successful job with auto-generated policy in 107111ms
ok 2 Policy failure: unexpected environment variable in 7920ms
ok 3 Policy failure: unexpected command line argument in 7874ms
ok 4 Policy failure: unexpected emptyDir volume in 7823ms
ok 5 Policy failure: unexpected projected volume in 7812ms
ok 6 Policy failure: unexpected readOnlyRootFilesystem in 7903ms
ok 7 Policy failure: unexpected UID = 222 in 7720ms
After this change:
not ok 1 Successful job with auto-generated policy in 10271ms
ok 2 Policy failure: unexpected environment variable in 8018ms
ok 3 Policy failure: unexpected command line argument in 7886ms
ok 4 Policy failure: unexpected emptyDir volume in 7621ms
ok 5 Policy failure: unexpected projected volume in 7843ms
ok 6 Policy failure: unexpected readOnlyRootFilesystem in 7632ms
ok 7 Policy failure: unexpected UID = 222 in 7619ms
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
During the ${wait_time} for an expected condition, if
CreateContainerRequest was NOT expected to fail: detect possible
CreateContainerRequest failures early and abort the wait.
For example, before this change:
ok 1 Successful sc deployment with auto-generated policy and container image volumes in 14769ms
ok 2 Successful sc with fsGroup/supplementalGroup deployment with auto-generated policy and container image volumes in 8384ms
not ok 3 Successful sc deployment with security context choosing another valid user in 136149ms
ok 4 Successful layered sc deployment with auto-generated policy and container image volumes in 8862ms
ok 5 Policy failure: unexpected GID = 0 for layered securityContext deployment in 7941ms
ok 6 Policy failure: malicious root group added via supplementalGroups deployment in 11612ms
After:
ok 1 Successful sc deployment with auto-generated policy and container image volumes in 15230ms
ok 2 Successful sc with fsGroup/supplementalGroup deployment with auto-generated policy and container image volumes in 9364ms
not ok 3 Successful sc deployment with security context choosing another valid user in 11060ms
ok 4 Successful layered sc deployment with auto-generated policy and container image volumes in 9124ms
ok 5 Policy failure: unexpected GID = 0 for layered securityContext deployment in 7919ms
ok 6 Policy failure: malicious root group added via supplementalGroups deployment in 11666ms
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
During the ${wait_time} for an expected condition, if
CreateContainerRequest was NOT expected to fail: detect possible
CreateContainerRequest failures early and abort the wait.
For example, before this change:
not ok 1 Successful pod with auto-generated policy in 110801ms
not ok 2 Able to read env variables sourced from configmap using envFrom in 94104ms
not ok 3 Successful pod with auto-generated policy and runtimeClassName filter in 95838ms
not ok 4 Successful pod with auto-generated policy and custom layers cache path in 110712ms
ok 5 Policy failure: unexpected container image in 8113ms
ok 6 Policy failure: unexpected privileged security context in 7943ms
ok 7 Policy failure: unexpected terminationMessagePath in 11530ms
ok 8 Policy failure: unexpected hostPath volume mount in 7970ms
ok 9 Policy failure: unexpected config map in 7933ms
not ok 10 Policy failure: unexpected lifecycle.postStart.exec.command in 112677ms
ok 11 RuntimeClassName filter: no policy in 2302ms
not ok 12 ExecProcessRequest tests in 93946ms
not ok 13 Successful pod: runAsUser having the same value as the UID from the container image in 94003ms
ok 14 Policy failure: unexpected UID = 0 in 8016ms
ok 15 Policy failure: unexpected UID = 1234 in 7850ms
After:
not ok 1 Successful pod with auto-generated policy in 12182ms
not ok 2 Able to read env variables sourced from configmap using envFrom in 10121ms
not ok 3 Successful pod with auto-generated policy and runtimeClassName filter in 11738ms
not ok 4 Successful pod with auto-generated policy and custom layers cache path in 26592ms
ok 5 Policy failure: unexpected container image in 7742ms
ok 6 Policy failure: unexpected privileged security context in 7949ms
ok 7 Policy failure: unexpected terminationMessagePath in 7789ms
ok 8 Policy failure: unexpected hostPath volume mount in 7887ms
ok 9 Policy failure: unexpected config map in 7818ms
not ok 10 Policy failure: unexpected lifecycle.postStart.exec.command in 9120ms
ok 11 RuntimeClassName filter: no policy in 2081ms
not ok 12 ExecProcessRequest tests in 9883ms
not ok 13 Successful pod: runAsUser having the same value as the UID from the container image in 9870ms
ok 14 Policy failure: unexpected UID = 0 in 11161ms
ok 15 Policy failure: unexpected UID = 1234 in 7814ms
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
We've seen a few cases where we fail the test due to timeout and when we
print the pods we just see that they've been created.
With that in mind, let's just increase the timeout a little bit.
Example:
```
not ok 1 Parallel jobs in 6250ms
(in test file k8s-parallel.bats, line 41)
`kubectl wait --for=condition=Ready --timeout=$timeout pod -l jobgroup=${job_name}' failed
No resources found in kata-containers-k8s-tests namespace.
[bats-exec-test:71] INFO: k8s configured to use runtimeclass
job.batch/process-item-test1 created
job.batch/process-item-test2 created
job.batch/process-item-test3 created
NAME STATUS COMPLETIONS DURATION AGE
process-item-test1 Running 0/1 0s
process-item-test2 Running 0/1 0s
process-item-test3 Running 0/1 0s
error: no matching resources found
No resources found in kata-containers-k8s-tests namespace.
No resources found in kata-containers-k8s-tests namespace.
DEBUG: system logs of node 'aks-nodepool1-25989463-vmss000000' since test start time (2025-11-01 16:39:03)
-- No entries --
job.batch "process-item-test1" deleted
job.batch "process-item-test2" deleted
job.batch "process-item-test3" deleted
```
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Otherwise we'll face issues like:
```
Error: found in Chart.yaml, but missing in charts/ directory: node-feature-discovery
```
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we're failing on the uninstall, which seems related to a bug on NFD
itself, but I don't have access to a s390x machine to debug, let's skip
the enablement for now and enable it back once we've experimented it
better on s390x.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we're failing to install NFD on CBL Mariner, let's skip the
enablement there, and enable it once we've experimented it better there.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As we have the ability to deploy NFD as a sub-chart of our chart, let's
make sure we test it during our CI.
We had to increase the timeout values, where we had timeouts set, to
deploy / undeploy kata, as now NFD is also deployed / undeployed.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's ensure that we add NFD as a weak dependency of the kata-deploy
helm chart.
What we're doing for now is leaving it up to the user / admin to enable
it, and if enabled then we do a explicit check for virtualization
support (x86_64 only for now).
In case NFD is already deployed, we fail the installation (in case it's
enabled on the kata-deploy helm chart) with a clear error message to the
user.
While I know that kata-remote **DOES NOT** require virtualization, I've
left this out (with a comment for when we add a peer-pods dependency on
kata-deploy) in order to simplify things for now, as kata-remote is not
a deployed shim by default.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
As Kata Containers can be consumed by other helm-charts, hard coding the
default runtime class name to `kata` is not optimal.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
All the options that take a specific shim as an argument MUST have
specific per arch settings, as not all the shims are available for all
the arches, leading to issues when setting up multi-arch deployments.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's ensure that we consume NVRC releases straight from GitHub instead
of building the binaries ourselves.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We have here either /dev/vfio/<num> or /dev/vfio/devices/vfio<num>,
for IOMMUFD format /dev/vfio/devices/vfio<num>, strip "vfio" prefix
/dev/vfio/123 - basename "123" - vfioNum = "123" - cdi.k8s.io/vfio123
/dev/vfio/devices/vfio123 - basename "vfio123" - strip - vfioNum = "123" - cdi.k8s.io/vfio123
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We have 2 tests running on GitHub provided runners:
* devmapper
* CRI-O
- devmapper situation
For devmapper, we're currently testing devmapper with s390x as part of
one of its jobs.
More than that, this test has been failing here due to a lack of space
in the machine for quite some time, and no-action was taken to bring it
back either via GARM or some other way.
With that said, let's rely on the s390x CI to test devmapper and avoid
one extra failure on our CI by removing this one.
- cri-o situation
CRI-O is being tested with a fixed version of kubernetes that's already
reached its EOL, and a CRI-O version that matches that k8s version.
There has been attempts to raise issues, and also to provide a PR that
does at least part of the work ... leaving the debugging part for the
maintainers of the CI. However, there was no action on those from the
maintainers.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
- Explained the concept and benefits of VM templating
- Provided step-by-step instructions for enabling VM templating
- Detailed the setup for using snapshotter in place of VirtioFS for template-based VM creation
- Added performance test results comparing template-based and direct VM creation
Signed-off-by: ssc <741026400@qq.com>
- init: initialize the VM template factory
- status: check the current factory status
- destroy: clean up and remove factory resources
These commands provide basic lifecycle management for VM templates.
Signed-off-by: ssc <741026400@qq.com>
Use `ioctl_with_mut_ref` instead of `ioctl_with_ref` in the
`create_device` method as it needs to write to the `kvm_create_device`
struct passed to it, which was released in v0.12.1.
Signed-off-by: Siyu Tao <taosiyu2024@163.com>
Fix the cargo fmt issues and then we can make the libs tests required
again to avoid this regression happening again.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
kata-deploy helm chart is *THE* way to deploy kata-containers on
kubernetes environments, and kubernetes environments is basically the
only reliably tested deployment we have.
For now, let's just drop documentation that is outdated / incorrect, and
in the future let's ensure we update the linked docs, as we work on
update / upgrade for the helm chart.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The libs in question were added when moving to developer.nvidia.com
but switching back to ubuntu only based builds they are not needed.
Remove them to keep the rootfs as minimal as possible.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
In the case of CC we need additional libraries in the rootfs.
Add them conditionally if type == confidential.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Add build_vm_from_template() that flips boot_from_template flag,
wires factory.template_path/{memory,state} into the hypervisor config,
and returns ready-to-use hypervisor & agent instances.
When factory.template is enabled, VirtContainer bypasses normal creation
and directly boots the VM by restoring the template through incoming migration,
completing the "create → save → clone" loop.
Fixes: #11413
Signed-off-by: ssc <741026400@qq.com>
Introduced factory::FactoryConfig with init/destroy/status commands to manage template pools.
Added template::Template to fetch, create and persist base VMs.
Introduced vm::{VM, VMConfig} exposing create, pause, save, resume, stop,
disconnect and migration helpers for sandbox integration.
Extended QemuInner to executes QMP incoming migration, pause/resume and status tracking.
Fixes: #11413
Signed-off-by: ssc <741026400@qq.com>
Added new fields in Hypervisor struct to support VM template creation,
template boot, memory and device state paths, shared path, and store
paths. Introduced a Factory struct in config to manage template path,
cache endpoint, cache number, and template enable flag. Integrated
Factory into TomlConfig for runtime configuration parsing.
Fixes: #11413
Signed-off-by: ssc <741026400@qq.com>
This change enables to run the Cloud Hypervisor VMM using a non-root user
when rootless flag is set true in configuration.
Fixes: #11414
Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>
Pass the file descriptors of the tuntap device to the Cloud Hypervisor VMM process
so that the process could open the device without cap_net_admin
Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>
There's no reason to keep the env var / input as it's never been used
and now kata-deploy detects automatically whether NFD is deployed or
not.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When the NodeFeatureRule CRD is detected kata-deploy will:
* Create the specific NodeFeatureRules for the x86_64 TEEs
* Adapt the TEEs runtime classes to take into account the amount of keys
available in the system when spawning the podsandbox.
Note, we still do not have NFD as sub-dependency of the helm chart, and
I'm not even sure if we will have. However, it's important to integrate
better with the scenarios where the NFD is already present.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Change NIM bats file logic to allow skipping test cases which
require multiple GPUs. This can be helpful for test clusters where
there is only one node with a single GPU, or for local test
environments with a single-node cluster with a single GPU.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Temporarily disables the new runners for building artifacts jobs. Will be re-enabled once they are stable.
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
This partially reverts 8dcd91c for the s390x because the
CI jobs are currently blocking the release. The new runners
will be re-introduced once they are stable and no longer
impact critical paths.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This allows us to do a full multi-arch deployment, as the user can
easily select which shim can be deployed per arch, as some of the VMMs
are not supported on all architectures, which would lead to a broken
installation.
Now, passing shims per arch we can easily have an heterogenous
deployment where, for instance, we can set qemu-se-runtime-rs for s390x,
qemu-cca for aarch64, and qemu-snp / qemu-tdx for x86_64 and call all of
those a default kata-confidential ... and have everything working with
the same deployment.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The previous procedure failed to reliably ensure that all unused Device
Cgroups were completely removed, a failure consistently verified by CI
tests.
This change introduces a more robust and thorough cleanup mechanism. The
goal is to prevent previous issues—likely stemming from improper use of
Rust mutable references—that caused the modifications to be ineffective
or incomplete.
This ensures a clean environment and reliable CI test execution.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Build only from Ubuntu repositories do not mix with developer.nvidia.com
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Update tools/osbuilder/rootfs-builder/nvidia/nvidia_chroot.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>