Commit Graph

16958 Commits

Author SHA1 Message Date
Fabiano Fidêncio
06a3bbdd44 ci: k8s: coco: Add "Report tests" step
For some reason we didn't have the "Report tests" step as part of the
TEE jobs. This step immensely helps to check which tests are failing and
why, so let's add it while touching the workflow.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-10 09:51:59 +02:00
Fabiano Fidêncio
a1f90fe350 tests: k8s: Unify k8s TEE tests
There's no reason to have the code duplication between the SNP / TDX
tests for CoCo, as those are basically using the same configuration
nowadays.

Note that for the TEEs case, as the nydus-snapshotter is deployed by the
admin, once, instead of deploying it on every run ... I'm actually
removing the nydus-snapshotter steps so we make it clear that those
steps are not performed by the CI.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-10 09:51:59 +02:00
Zvonko Kaiser
afbec780a9 Merge pull request #11903 from zvonkok/ppcie
gpu: PPCIE support DGX like systems
2025-10-09 21:06:41 -04:00
Aurélien Bombo
a3a45429f6 Merge pull request #11865 from microsoft/danmihai1/nested-configmap-secret
tests: k8s-nested-configmap-secret policy
2025-10-09 11:33:50 -05:00
Alex Lyn
b42ef09ffb Merge pull request #11888 from spuzirev/main
runtime: fix "num-queues expects uint64" error with virtio-blk
2025-10-09 20:21:32 +08:00
Xuewei Niu
2a43bf37ed Merge pull request #11894 from M-Phansa/main
runtime: fix device typo
2025-10-09 16:53:40 +08:00
Xuewei Niu
5208ee4ec0 Merge pull request #11674 from was-saw/dragonball_seccomp
runtime-rs: add seccomp support for dragonball
2025-10-09 15:01:15 +08:00
wangxinge
8e1b33cc14 docs: add document for seccomp
This commit adds a document to use
seccomp in runtime-rs

Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>
2025-10-09 13:25:17 +08:00
wangxinge
2abf6965ff dragonball: add seccomp support for dragonball
This commit modifies seccomp framework to
support different restrictions for different threads.

Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>
2025-10-09 13:25:17 +08:00
wangxinge
bb6fb8ff39 runtime-rs: add seccomp support for dragonball
The implementation of the seccomp feature in Dragonball currently has a basic framework.
But the actual restriction rules are empty.

This pull request includes the following changes:
- Modifiy configuration files to relevant configuration files.
- Modifiy seccomp framework to support different restrictions for different threads.
- Add new seccomp rules for the modified framework.

This commit primarily implements the changes 1 and 3 for runtime-rs.

Fixes: #11673

Signed-off-by: wangxinge <wangxinge@bupt.edu.cn>
2025-10-09 13:25:17 +08:00
Zvonko Kaiser
91739d4425 gpu: PPCIE support DGX like systems
For DGX like systems we need additional binaries and libraries,
enable the Kata AND CoCo use-case.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

Update tools/osbuilder/rootfs-builder/nvidia/nvidia_rootfs.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-09 00:00:12 +00:00
Dan Mihai
364d3cded0 tests: k8s-nested-configmap-secret policy
Add auto-generated agent policy in k8s-nested-configmap-secret.bats.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-10-08 23:37:54 +00:00
Sergei Puzyrev
62b12953c7 runtime: fix "num-queues expects uint64" error with virtio-blk
Unneeded type-conversion was removed.

Fixes #11887

Signed-off-by: Sergei Puzyrev <spuzirev@gmail.com>
2025-10-08 17:09:22 -05:00
Adeet Phanse
4e4f9c44ae runtime: fix device typo
Fix device typo in dragonball / runtime-rs / runtime.

Signed-off-by: Adeet Phanse <adeet.phanse@mongodb.com>
2025-10-08 17:08:27 -05:00
Aurélien Bombo
d954932876 Merge pull request #11883 from kata-containers/sprt/zizmor-fixes3
ci: zizmor: Address all issues
2025-10-08 17:01:48 -05:00
Aurélien Bombo
07645cf58b ci: actionlint: Address issues and set as required
Address issues just introduced and set actionlint as a required by removing
the path filter.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 16:55:27 -05:00
Aurélien Bombo
b3a551d438 ci: zizmor: Reestablish as required test
We can re-require this now that we've addressed all the issues.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 16:55:27 -05:00
Aurélien Bombo
5a4ddb8c71 ci: zizmor: Fix all template-injection alerts
Fix all instances of template injection by using environment variables as
recommended by Zizmor, instead of directly injecting values into the
commands.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 16:55:26 -05:00
Aurélien Bombo
7b203d1b43 ci: zizmor: Ignore dangerous-triggers audit for known safe usage
The two ignored cases are strictly necessary for the CI to work today, and we
have various security mitigations in place.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 16:55:08 -05:00
Aurélien Bombo
7afdfc7388 ci: zizmor: Disable undocumented-permissions audit
There are 62 such warnings and addressing them would take quite a bit of
time so just disable them for now.

help[undocumented-permissions]: permissions without explanatory comments
  --> ./.github/workflows/release.yaml:71:7
   |
71 |       packages: write
   |       ^^^^^^^^^^^^^^^ needs an explanatory comment
72 |       id-token: write
   |       ^^^^^^^^^^^^^^^ needs an explanatory comment
73 |       attestations: write
   |       ^^^^^^^^^^^^^^^^^^^ needs an explanatory comment
   |
   = note: audit confidence → High

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 16:55:08 -05:00
Aurélien Bombo
889ba0d5db Merge pull request #11901 from kata-containers/sprt/remove-docs-url-check
gha: Fix `docs-url-alive-check` workflow
2025-10-08 14:42:58 -05:00
Aurélien Bombo
ec81ea95df gha: Add workflow_dispatch trigger to docs-url-alive-check
We can't test this PR because the workflow needs this trigger, so adding
this will allow testing future PRs.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 14:39:34 -05:00
Aurélien Bombo
4d760e64ae gha: Fix docs-url-alive-check workflow
The Go installation step was broken because the checkout action was
checking out the code in a subdirectory:

https://github.com/kata-containers/kata-containers/actions/runs/18265538456/job/51999316919

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-08 14:39:34 -05:00
Aurélien Bombo
476c827fca Merge pull request #11878 from kata-containers/sprt/privileged-docs
docs: Document `privileged_without_host_devices=false` as unsupported
2025-10-08 11:12:45 -05:00
Fabiano Fidêncio
dbb1eb959c kata-deploy: Allow users to set experimental_force_guest_pull
For those who are not willing to use the nydus-snapshotter for pulling
the image inside the guest, let's allow them setting the
experimetal_force_guest_pull, introduced by Edgeless, as part of our
helm-chart.

This option can be set as:
_experimentalForceGuestPull: "qemu-tdx,qemu-coco-dev"

Which would them ensure that the configuration for `qemu-tdx` and
`qemu-coco-dev` would have the option enabled.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 17:43:09 +02:00
Fabiano Fidêncio
8c4bad68a8 kata-deploy: Remove kustomize yamls, rely on helm-chart only
As the kata-deploy helm chart has been the only way we've been testing
kata-containers deployment as part of our CI, it's time to finally get
rid of the kustomize yamls and avoid us having to maintain two different
methods (with one of those not being tested).

Here I removed:
* kata-deploy yamls and kustomize yamls
* kata-cleanup yamls and kustomize yamls
* kata-rbac yals and kustomize yamls
* README.md for the kustomize yamls was removed

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 16:54:19 +02:00
Fabiano Fidêncio
3418cedacc ci: Add tests for erofs-snapshotter (for coco-qemu-dev)
erofs-snapshotter can be used to leverage sharing the image from the
host to the guest without the need of a shared filesystem (such as
virtio-fs or virtio-9p).

This case is ideal for Confidential Computing enabled on Kata
Containers, and we can immensely benefit from this snapshotter, thus
let's test it as soon as possible so we can find issues, report bugs,
and ask for enhancement requests.

There are at least a few things that we know for sure to be problematic
now:
* Policy has to be adjusted to the erofs-snapshotter
* There is no support for signed nor encrypted images
* Tests that use the KBS are disabled for now

Even with the limitations, I do believe we should be testing the
snapshoitter, so we can team up and get those limitations addressed.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 10:34:09 +02:00
Fabiano Fidêncio
544f688104 tests: Add ability to deploy vanilla k8s with erofs
As done in the previous commit, let's expand the vanilla k8s deployment
to also allow the erofs host side configuration.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 10:34:09 +02:00
Fabiano Fidêncio
3ac6579ca6 tests: Add support for deploying vanilla k8s
We already have support for deploying a few flavours of k8s that are
required for different tests we perform.

Let's also add the ability to deploy vanilla k8s, as that will be very
useful in the next commits in this series.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 10:34:09 +02:00
Fabiano Fidêncio
aa9e3fc3d5 versions: Update containerd active / latest versions
The active version is 2.1.x, and the latest is 2.2.0-beta.0.

The latest is what we'll be using to test if the "to be released"
version of containerd works well for our use-cases.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 10:34:09 +02:00
Fabiano Fidêncio
287db1865f tests: Relax regex used to install containerd
Let's make sure that we can get non-official releases as well, otherwise
we won't be able to test a coming release of containerd, to know whether
it solves issues that we face or not, before it's actually released.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-08 10:34:09 +02:00
Zvonko Kaiser
59b4e3d3f8 gpu: Add CONFIG_FW_LOADER to the kernel
We need it for the newer CC kernel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-10-08 10:01:27 +02:00
Zvonko Kaiser
7061f64db5 gpu: Fix confidential build
NVRC introduced the confidential feature flag and we
haven't updated the rootfs build to accomodate.
If rootfs_type==confidential user --feature=confidential

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-10-08 10:01:27 +02:00
Zvonko Kaiser
2260f66339 gpu: Some fixes regarding the rootfs v580
With the 580 driver version we need new dependencies
in the rootfs.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-10-08 10:01:27 +02:00
Dan Mihai
08272ab673 Merge pull request #11884 from kata-containers/sprt/priv-test
tests/k8s: Add test for privileged containers
2025-10-07 19:18:06 -07:00
Szymon Klimek
8dc6b24e7d kata-deploy: accept 25.10 as supported distro for TDX
Canonical TDX release is not needed for vanilla Ubuntu 25.10 but
GRUB_CMDLINE_LINUX_DEFAULT needs to contain `nohibernate` and
`kvm_intel.tdx=1`

Signed-off-by: Szymon Klimek <szymon.klimek@intel.com>
2025-10-07 23:41:52 +02:00
Dan Mihai
650863039b tests: k8s-volume: auto-generate policy
Auto-generate the agent policy, instead of using the insecure
"allow all" policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-10-07 23:35:06 +02:00
Dan Mihai
5ed76b3c91 tests: k8s-volume: retry failed exec
Use grep_pod_exec_output to retry possible failing "kubectl exec"
commands. Other tests have been hitting such errors during CI in
the past.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-10-07 23:35:06 +02:00
Dan Mihai
6ab59453ff genpolicy: better parsing of mount path
Mount paths ending in '/' were not parsed correctly.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-10-07 23:35:06 +02:00
Dan Mihai
ba792945ef genpolicy: additional mount_source_allows logging
Make debugging policy errors related to storage mount sources easier to
debug.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-10-07 23:35:06 +02:00
Aurélien Bombo
6e451e3da0 tests/k8s: Add test for privileged containers
This adds an integration test to verify that privileged containers work
properly when deploying Kata with kata-deploy.

This is a follow-up to #11878.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-07 09:59:05 -05:00
Fabiano Fidêncio
f994bacf6c tests: coco: Use the new way to set up nydus snapshotter
Let's rely on kata-deploy setting up the nydus snapshotter for us,
instead of doing this with external code.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
6f17125ea4 tests: Allow using the new way to deploy nydus-snapshotter
This allows us to stop setting up the snapshotter ourselves, and just
rely con kata-deploy to do so.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
000c9cce23 kata-deploy: chart: Add _experimentalSetupSnapshotter
Let's expose the EXPERIMENTAL_SETUP_SNAPSHOTTER script environment
variable to our chart, allowing then users of our helm chart to take
advantage of this experimental feature.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
d6a1881b8b kata-deploy: scripts: Allow setting up multiple snapshotters
We may deploy in scenarios where we want to have both snapshotters set
up, sometimes even for simple test on which one behaves better.

With this in mind, let's allow EXTERNAL_SETUP_SNAPSHOTTER to receive a
comma separated list of snapshotters, such as:
```
EXPERIMENTAL_SETUP_SNAPSHOTTER="erofs,nydus"
```

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
445af6c09b kata-deploy: scripts: Allow deploying erofs-snapshotters
Similarly to what's been done for the nydus-snapshotter, let's allow
users to have erofs-snapshotter set up by simply passing:
```
EXPERIMENTAL_SETUP_SNAPSHOTTER="erofs".
```

Mind that erofs, although a built-in containerd snapshotter, has system
depdencies that we will *NOT* install and it's up to the admin to do so.
These dependencies are:
* erofs-utils
* fsverity
* erofs module loaded

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
4359c7b15d tests: Ensure the nydus-snapshotter versions are aligned
In the previous commit we added the assumption that the
nydus-snapshotter version should be the same in two different places.

Now, with this test, we ensure those will always be in sync.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
2e0ce2f39f kata-deploy: scripts: Allow deploying nydus-snapshotter
Let's introduce a new EXPERIMENTAL_SETUP_SNAPSHOTTER environemnt
variable that, when set, allows kata-deploy to put the nydus snapshotter
in the correct place, and configure containerd accordingly.

Mind, this is a stop gap till the nydus-snapshotter helm chart is ready
to be used and behaving well enough to become a weak dependency of our
helm chart.  When that happens this code can be deleted entirely.

Users can have nydus-snapshotter deployed and configured for the
guest-pull use case by simply passing:
```
EXPERIMENTAL_SETUP_SNAPSHOTTER="nydus"
```

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
1e2c86c068 kata-deploy: scripts: Only add conf file to the imports once
Otherwise we'd end up adding a the file several times, which could lead
to problems when removing the entry, leading to containerd not being
able to start due to an import file not being present.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00
Fabiano Fidêncio
e1269afe8a tests: Only use Authorization when GH_TOKEN is available
The code, how it was, would lead to the following broke command:
`--header "Authorization: Bearer: "`

Let's only expand that part of the command if ${GH_TOKEN} is passed,
otherwise we don't even bother adding it.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-10-07 10:32:46 +02:00