Commit Graph

1013 Commits

Author SHA1 Message Date
Fabiano Fidêncio
1eabd6c729 tests: k8s: coco: Use a var for AUTHENTICATED_IMAGE_PASSWORD
It's a bot password that only has read permissions for that image, no
need to use a secret here.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-15 23:12:02 +01:00
Fabiano Fidêncio
b1ec7d0c02 build: ci: remove KBUILD_SIGN_PIN entirely
Drop kernel build signing (KBUILD_SIGN_PIN) from CI and from all
scripts that referenced it.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-15 18:52:18 +01:00
Fabiano Fidêncio
83dce477d0 build: ci: remove CI_HKD_PATH and s390x boot-image-se build
Drop the CI_HKD_PATH secret and the build-asset-boot-image-se job from
the s390x tarball workflow; the artefact that depended on the host key
was never ever released anyways.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-15 16:24:30 +01:00
Fabiano Fidêncio
f4dcb66a3c ci: add workflow to push ORAS tarball cache
Add push-oras-tarball-cache workflow that runs on push to main when
versions.yaml changes (and on workflow_dispatch). It populates the
ghcr.io ORAS cache with gperf and busybox tarballs from versions.yaml.

Remove the push_to_cache call from download-with-oras-cache.sh since
it was never triggered in CI. Cache population is now done solely by the
new workflow and by populate-oras-tarball-cache.sh when run manually.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-13 12:57:48 +01:00
Fabiano Fidêncio
5c0269881e tests: Make editorconfig-checker happy
- Trim trailing whitespace and ensure final newline in non-vendor files
- Add .editorconfig-checker.json excluding vendor dirs, *.patch, *.img,
  *.dtb, *.drawio, *.svg, and pkg/cloud-hypervisor/client so CI only
  checks project code
- Leave generated and binary assets unchanged (excluded from checker)

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 21:58:28 +01:00
Manuel Huber
a6ca5c6628 ci: add editorconfig checker
This adds a basic configuration for editorconfig checker. The
supplied configuration checks against trailing whitespaces and
issues with newlines.
Example:
| tools/packaging/kernel/configs/fragments/x86_64/numa.conf:
|       Wrong line endings or no final newline
| tools/packaging/release/generate_vendor.sh:
|       44: Trailing whitespace

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-09 15:03:26 -08:00
Fabiano Fidêncio
ab515712d4 kernel: Unify kernel and kernel-confidential
Build a single kernel for both kernel and kernel-confidential on x86_64
and s390x. The kernel is built with TEE support (-x) on those arches only.

This helps to simplilfy and to maintain the code, and having a single
kernel was the original plan since forever.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-09 18:28:23 +01:00
Fabiano Fidêncio
c5b5433866 kernel: Unify nvidia-gpu and nvidia-gpu-confidential
Build a single kernel for both nvidia-gpu and nvidia-gpu-confidential,
simplifying and reducing code maintenance.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-09 18:28:23 +01:00
Manuel Huber
a786582d0b rootfs: deprecate initramfs dm-verity mode
Remove the initramfs folder, its build steps, and use the kernel
based dm-verity enforcement for the handlers which used the
initramfs mode. Also, remove the initramfs verity mode
capability from the shims and their configs.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
a3c4e0b64f rootfs: Introduce kernelinit dm-verity mode
This change introduces the kernelinit dm-verity mode, allowing
initramfs-less dm-verity enforcement against the rootfs image.
For this, the change introduces a new variable with dm-verity
information. This variable will be picked up by shim
configurations in subsequent commits.
This will allow the shims to build the kernel command line
with dm-verity information based on the existing
kernel_parameters configuration knob and a new
kernel_verity_params configuration knob. The latter
specifically provides the relevant dm-verity information.
This new configuration knob avoids merging the verity
parameters into the kernel_params field. Avoiding this, no
cumbersome escape logic is required as we do not need to pass the
dm-mod.create="..." parameter directly in the kernel_parameters,
but only relevant dm-verity parameters in semi-structured manner
(see above). The only place where the final command line is
assembled is in the shims. Further, this is a line easy to comment
out for developers to disable dm-verity enforcement (or for CI
tasks).

This change produces the new kernelinit dm-verity parameters for
the NVIDIA runtime handlers, and modifies the format of how
these parameters are prepared for all handlers. With this, the
parameters are currently no longer provided to the
kernel_params configuration knob for any runtime handler.
This change alone should thus not be used as dm-verity
information will no longer be picked up by the shims.

systemd-analyze on the coco-dev handler shows that using the
kernelinit mode on a local machine, less time is spent in the
kernel phase, slightly speeding up pod start-up. On that machine,
the average of 172.5ms was reduced to 141ms (4 measurements, each
with a basic pod manifest), i.e., the kernel phase duration is
improved by about 18 percent.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
dependabot[bot]
dc8d9e056d build(deps): bump zizmorcore/zizmor-action from 0.2.0 to 0.4.1
Bumps [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) from 0.2.0 to 0.4.1.
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](e673c3917a...135698455d)

---
updated-dependencies:
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-01 15:08:10 +00:00
Fabiano Fidêncio
5b82b160e2 runtime-rs: Add arm64 QEMU support
Add the necessary configuration and code changes to support QEMU
on arm64 architecture in runtime-rs.

Changes:
- Set MACHINETYPE to "virt" for arm64
- Add machine accelerators "usb=off,gic-version=host" required for
  proper arm64 virtualization
- Add arm64-specific kernel parameter "iommu.passthrough=0"
- Guard vIOMMU (Intel IOMMU) to skip on arm64 since it's not supported

These changes align runtime-rs with the Go runtime's arm64 QEMU support.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
2026-01-23 19:48:31 +01:00
Steve Horsman
2cd76796bd Merge pull request #12305 from stevenhorsman/fix-stalebot-permissions
ci: Fix stalebot permissions
2026-01-22 10:02:43 +00:00
Hyounggyu Choi
bc131a84b9 GHA: Set timeout for kata-deploy and kbs cleanup
It was observed that some kata-deploy cleanup steps could hang,
causing the workflow to never finish properly. In these cases,
a QEMU process was not cleaned up and kept printing debug logs
to the journal. Over time, this maxed out the runner’s disk
usage and caused the runner service to stop.

Set timeouts for the relevant cleanup steps to avoid this.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2026-01-22 10:32:24 +01:00
stevenhorsman
19efeae12e workflow: Fix stalebot permissions
When looking into stale bot more for issues, I realised that our existing
stale job would need permissions to work. Unfortunately the behaviour
of the actions without these permissions is to log, but still finish as successful.
This means it was hard to spot we had an issue.

Add the required permissions to get this working again and improve the message
Also add concurrency rule to make zizmor happy

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-01-21 17:28:59 +00:00
Fabiano Fidêncio
e0158869b1 tests: Add common bats test runner function
Add run_bats_tests() function to common.bash that provides consistent
test execution and reporting across all test suites (k8s, nvidia,
kata-deploy).

This removes duplicated test runner code from run_kubernetes_tests.sh,
run_kubernetes_nv_tests.sh, and run-kata-deploy-tests.sh.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-20 12:31:55 +01:00
Fabiano Fidêncio
96e1fb4ca6 tools: Remove runk
The runk tool hasn't been supported for a few years, with no maintainers
since ManaSugi stopped being involved in the project and the CI was
disabled in 2024.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:43:53 +01:00
Fabiano Fidêncio
d7aa793dde Revert "ci: Run a nightly job using the kata-deploy rust"
This reverts commit 6130d7330f, as we're
officially swithcing to the rust version of kata-deploy.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-19 14:07:49 +01:00
Amulyam24
859313d904 ci: move the job payload after push to an alternate runner for ppc64le
To unlock the release, move the job to publish kata payload after push to an alternate runner(IBM owned) for ppc64le.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2026-01-16 11:14:42 +05:30
Fabiano Fidêncio
4e99860fd2 workflows: nvidia: Adjust to kernel / roots build decouple
We don't need to store the kernel headers anymore. We do need to store
the kernel modules, instead.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 20:45:54 +01:00
LandonTClipp
94fde1356c docs: Add Zensical Doc Site Generation
This commit adds a Github workflow for building a Github Pages site for the markdown
files in the docs/ directory. Zensical is a new markdown-based static site generation
framework built by the creators of Material for Mkdocs. https://zensical.org/

This commit does not clean the doc structure, so site navigation is initially going to
be messy.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-01-13 12:42:02 +01:00
Fabiano Fidêncio
9fec31f400 tools: kubectl: Add kubectl version as a tag
This is a suggestion from Choi, so we can easily test with a specific
kubectl version and also easily understand which kubectl version is
being used in case of failure.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-12 15:48:44 +01:00
Fabiano Fidêncio
26dfcb627b tools: Build kubectl image
This image will be used by our helm charts to verify that a
kata-containers deployment is correct.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-12 15:48:44 +01:00
Mikko Ylinen
e02e226431 packaging: build OVMF for Intel TDX again
OVMF build for Intel TDX (aka "TDVF") was disabled in favor of Ubuntu/
CentOS pre-upstream releases of Intel TDX.

See 4292c4c3b1.

It's time to re-enable the build and move runtime configurations to
use it (the latter will be done in a later commit).

This is a partial revert of 4292c4c3b with the following changes:
- Stop calling OVMF for Intel TDX "TDVF" and follow the naming distros
use for TDX enabled build: OVMF.inteltdx.fd.
- Single binary OVMF.inteltdx.fd is supported using -bios QEMU param.
- Secure Boot infrastructure is disabled since Kata does not support it.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-01-08 10:21:47 +01:00
shwetha-s-poojary
1929ca8879 workflows: payload: do not remove AGENT_TOOLSDIRECTORY
Remove line that deletes $AGENT_TOOLSDIRECTORY

Signed-off-by: shwetha-s-poojary <shwetha.s-poojary@ibm.com>
2025-12-19 05:24:36 -08:00
Fabiano Fidêncio
5415cf4e0f workflows: payload: Remove unneeded stuff from the runner
Otherwise we may hit a `no space left on device` when building the rust
kata-deploy binary.

This happens mostly because of the muli-staging build used to generate a
distroless final container.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-17 09:57:02 +01:00
Fabiano Fidêncio
6130d7330f ci: Run a nightly job using the kata-deploy rust
Let's shamelessly duplicate the nightly job to have at least nightly
runs using the rust implementation of kata-deploy.

The reason for doing that is to be pragmatic, as pragmatic as possible,
and avoid switching away of the scripts before 3.24.0 release, while
still testing both ways till the switch happens.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-17 09:57:02 +01:00
Fabiano Fidêncio
830d15d4c8 tests: Adapt to using kata-tools
Instead of relying and the fully bloated kata tarball.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-16 12:55:07 +01:00
Fabiano Fidêncio
a2534e7bc8 kata-tools: Release as its own tarball
We're only releasing those for amd64 as that's the only architecture
we've been building the packages for.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-16 12:55:07 +01:00
Fabiano Fidêncio
6d2f393be4 build: Split tools build from the other artefacts build
Let's ensure we can create a specific "tools" tarball, which will help
those who only need to pull those either for testing or production
usage.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-16 12:55:07 +01:00
Fabiano Fidêncio
ded6d1636f kata-deploy: Remove deprecated features from 3.23.0
Let's remove the deprecated features that were marked for removal
after Kata Containers 3.23.0:

kata-deploy.sh:
- Remove non-arch-specific variable fallbacks (SHIMS, DEFAULT_SHIM,
  SNAPSHOTTER_HANDLER_MAPPING, ALLOWED_HYPERVISOR_ANNOTATIONS,
  PULL_TYPE_MAPPING, EXPERIMENTAL_FORCE_GUEST_PULL). Each arch now
  has its own default value.
- Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS
  variables and associated functions (create_runtimeclasses,
  delete_runtimeclasses, adjust_shim_for_nfd). RuntimeClasses are
  now managed by Helm chart, not the daemonset script.
- Unsupported architectures now fail with an error instead of
  falling back to non-arch-specific defaults.

Helm chart:
- Remove all deprecated env values (createRuntimeClasses,
  createDefaultRuntimeClass, debug, shims, shims_*, defaultShim,
  defaultShim_*, allowedHypervisorAnnotations, snapshotterHandlerMapping,
  snapshotterHandlerMapping_*, agentHttpsProxy, agentNoProxy,
  pullTypeMapping, pullTypeMapping_*, _experimentalSetupSnapshotter,
  _experimentalForceGuestPull, _experimentalForceGuestPull_*).
- Remove backward compatibility code from _helpers.tpl that checked
  for legacy env values.
- Remove legacy env.shims check from runtimeclasses.yaml.
- Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS env
  vars from kata-deploy.yaml and post-delete-job.yaml.
- Update RBAC to only include runtimeclasses get/patch permissions
  (needed for NFD patching), removing create/delete/list/update/watch.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-13 16:32:00 +01:00
Fabiano Fidêncio
46c7d6c9f8 ci: arm64-non-k8s: temporarily skip the tests
The runner is down for a few weeks. I may end up bringing in my personal
runner, but I'm not confident I can easily do this before the holidays,
thus I'm skipping the tests for now.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-11 12:14:32 +01:00
Fabiano Fidêncio
3db7b88eff tests: remove containerd guest pull stability tests
Remove the existing containerd guest pull stability tests workflow
as we're going to rebuild all the VMs used for testing and introduce
new, more focused stability tests for nydus-snapshotter.

The new tests will be added soon, as part of another PR.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-08 16:29:11 +01:00
Manuel Huber
34efa83afc tests: nvidia: cc: Add attestation test
Add the attestation bats test case to the NVIDIA CI and provide a
second pod manifest for the attestation test with a GPU. This will
enable composite attestation in a subsequent step.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2025-12-05 11:48:55 +01:00
Manuel Huber
3427b5c00e ci: nvidia: Install kata-artifacts
In preparation for Kata agent security policy testing, installing
Kata tools to provide genpolicy.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2025-12-01 17:59:19 +00:00
Steve Horsman
8534afb9e8 Merge pull request #12150 from stevenhorsman/add-gatekeeper-triggers
ci: Add two extra gatekeeper triggers
2025-11-28 09:34:41 +00:00
Fabiano Fidêncio
776e08dbba build: Add nvidia image rootfs builds
So far we've only been building the initrd for the nvidia rootfs.
However, we're also interested on having the image beind used for a few
use-cases.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-11-27 22:46:07 +01:00
stevenhorsman
531311090c ci: Add two extra gatekeeper triggers
We hit a case that gatekeeper was failing due to thinking the WIP check
had failed, but since it ran the PR had been edited to remove that from
the title. We should listen to edits and unlabels of the PR to ensure that
gatekeeper doesn't get outdated in situations like this.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-11-27 16:45:04 +00:00
Steve Horsman
c5ae8c4ba0 Merge pull request #12144 from BbolroC/use-runs-on-to-choose-runners
GHA: Use `runs-on` only for choosing proper runners
2025-11-27 09:54:39 +00:00
Alex Lyn
df8315c865 Merge pull request #12130 from Apokleos/stability-rs
tests: Enable stability tests for runtime-rs
2025-11-27 14:27:58 +08:00
Steve Horsman
aae483bf1d Merge pull request #12096 from Amulyam24/enable-ibm-runners
ci: re-enable IBM runners for ppc64le and s390x
2025-11-26 13:51:21 +00:00
Amulyam24
43a004444a ci: re-enable IBM runners for ppc64le and s390x
This PR re-enables the IBM runners for ppc64le/s390x build jobs and s390x static checks.

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2025-11-26 16:20:01 +05:30
Hyounggyu Choi
6f761149a7 GHA: Use runs-on only for choosing proper runners
Fixes: #12123

`include` in #12069, introduced to choose a different runner
based on component, leads to another set of redundant jobs
where `matrix.command` is empty.
This commit gets back to the `runs-on` solution, but makes
the condition human-readable.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-11-26 11:35:30 +01:00
stevenhorsman
4c59cf1a5d workflows: Add Report tests to all workflows
In the CoCo tests jobs @wainersm create a report tests step
that summarises the jobs, so they are easier to understand and
get results for. This is very useful, so let's roll it out to all the bats
tests.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-11-26 09:28:36 +00:00
Alex Lyn
7f4d856e38 tests: Enable nydus tests for qemu-runtime-rs
We need enable nydus tests for qemu-runtime-rs, and this commit
aims to do it.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-11-25 17:45:57 +08:00
Alex Lyn
23393d47f6 tests: Enable stability tests for qemu-runtime-rs on nontee
Enable the stability tests for qemu-runtime-rs CoCo on non-TEE
environments

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-11-25 16:18:37 +08:00
Alex Lyn
f1d971040d tests: Enable run-nerdctl-tests for qemu-runtime-rs
Enable nerdctl tests for qemu-runtime-rs

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-11-25 16:14:50 +08:00
Alex Lyn
c7842aed16 tests: Enable stability tests for runtime-rs
As previous set without qemu-runtime-rs, we enable it in this commit.

Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2025-11-25 16:12:12 +08:00
Manuel Huber
6c6fc50aa5 tests: nvidia: cc: allow-all policy and init-data
Add an allow-all policy for the CC GPU tests and ensure the init-data
device is being created (hypervisor annotations).

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2025-11-21 09:24:15 +01:00
Steve Horsman
87b180383e Merge pull request #11802 from kata-containers/dependabot/github_actions/oras-project/setup-oras-1.2.4
build(deps): bump oras-project/setup-oras from 1.2.2 to 1.2.4
2025-11-19 09:58:37 +00:00