142 Commits

Author SHA1 Message Date
stevenhorsman
36b674ed51 workflow: wire agent policy coverage check into static-checks
Add a check-agent-policy-coverage job to static-checks.yaml
that runs ci/check_agent_policy_coverage.sh on every pull request.

Generated-By: IBM Bob
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-06-29 10:49:26 +01:00
Alex Lyn
e77795f573 ci: Update libs required-test names for libdevmapper dependency
Update the two affected entries in required-tests.yaml accordingly
so the gatekeeper keeps matching them instead of blocking subsequent
PRs after this one merges.

Co-authored-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-06-26 09:52:47 +08:00
LandonTClipp
85e828cc9b docs: Add AI agent skill for doc contributions
This skill will inform AI agents how to properly write and format
docs in the new docs system. There is nothing too fancy, just reminding
agents to use mkdocs-materialx features instead of treating the
markdown like the legacy Github-based format.

Signed-off-by: LandonTClipp <lclipp@coreweave.com>
2026-06-23 08:57:37 +01:00
stevenhorsman
1d854ad7af ci: Update required tests
publish-kata-deploy-payload got renamed in #13107, which broke the CI.

Now, instead of tracking all those intermediate steps, let's make sure
we only track the tests themselves.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-06-11 19:02:23 +02:00
Fabiano Fidêncio
fc08218f55 gatekeeper: rename required tests to minimum/latest
The containerd_version matrix values were renamed from lts/active to
minimum/latest, which changes the generated CI job names.  Update the
required-tests list so the gatekeeper waits on the checks that are
actually produced.

The amd64 run-containerd-stability, run-nydus, run-cri-containerd and
free-runner run-k8s-tests jobs map lts -> minimum and active -> latest.
The s390x cri-containerd job maps active -> latest, matching its
updated matrix.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <noreply@cursor.com>
2026-06-08 19:20:14 +02:00
Fabiano Fidêncio
743b0a4839 Merge pull request #13165 from stevenhorsman/bump-go-to-1.25.11
versions: bump golang to 1.25.11
2026-06-04 20:24:57 +02:00
stevenhorsman
81c7dde0ae ci: Remove kata-monitor test from required
The kata-monitor test is currently failing and is running a very EoL
version of cri-o. This area is being actively reworked in #13107,
so remove this and then once kata-monitor tests are stable we
can re-add the new versions

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-06-04 14:40:17 +01:00
stevenhorsman
879912be25 versions: bump golang to 1.25.11
Bump the go version to resolve CVEs:
- GO-2026-5037
- GO-2026-5038
- GO-2026-5039

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Generated-By: IBM Bob
2026-06-04 08:49:17 +01:00
stevenhorsman
51eee428f4 testing/webhook: bump golang.org/x dependencies
Bump golang.org/x/net from v0.53.0 to v0.55.0 and golang.org/x/sys
from v0.43.0 to v0.44.0 to resolve CVEs:
- GO-2026-5024
- GO-2026-5025
- GO-2026-5026
 - GO-2026-5027
- GO-2026-5028
- GO-2026-5029
- GO-2026-5030

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-06-03 09:56:54 +01:00
Fabiano Fidêncio
81ce51a9aa ci: target Azure CLH runtimes directly in AKS tests
Switch AKS Mariner matrix entries to clh-azure handlers and remove the
temporary host-OS based helm value overrides.

Update integration test wiring and required test labels so CI tracks the
new runtime names.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-28 23:32:37 +02:00
Manuel Huber
275a63b266 Revert "gatekeeper: Unrequire NVIDIA GPU test"
This reverts commit edfb6f5716.

The NVIDIA non-TEE CI job has passed again over the last 5 nightly
runs after merging PRs #13007 and #13020.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-15 15:20:12 -07:00
dependabot[bot]
18a13773da build(deps): bump github.com/sirupsen/logrus
Bumps [github.com/sirupsen/logrus](https://github.com/sirupsen/logrus) from 1.9.3 to 1.9.4.
- [Release notes](https://github.com/sirupsen/logrus/releases)
- [Changelog](https://github.com/sirupsen/logrus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sirupsen/logrus/compare/v1.9.3...v1.9.4)

---
updated-dependencies:
- dependency-name: github.com/sirupsen/logrus
  dependency-version: 1.9.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-13 06:11:16 +00:00
stevenhorsman
7cc72b933d versions: bump golang.org/x/net to v0.53.0
Bump golang.org/x/net to resolve CVE:
- GO-2026-4918

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:26 +01:00
stevenhorsman
4a65aca9cf versions: bump golang to 1.25.10
Bump the go version to resolve CVEs:
- GO-2026-4918
- GO-2026-4971
- GO-2026-4976
- GO-2026-4977
- GO-2026-4980
- GO-2026-4981
- GO-2026-4982
- GO-2026-4986

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Assisted-by: IBM Bob
2026-05-12 11:56:13 +01:00
Manuel Huber
edfb6f5716 gatekeeper: Unrequire NVIDIA GPU test (temporary)
Temporarily unrequire the NVIDIA GPU test. We are experiencing
situations in which two NIM service instances get deployed almost
at the same time into the kata-containers-k8s-tests namespace
(expected current context) and into the default namespace. This
causes the NIM operator to create two deployments in the two
namespaces and to then schedule two pods at the same time. This
usually causes the NIM pod in the default namespace to fail and to
linger.
We can't explain yet why this does not happen in the TEE CI path
and why this is happening at all.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-07 14:39:24 +02:00
Greg Kurz
c18932b5ab build-checks: Remove make vendor
The `generate_vendor.sh` script already knows how to create a tarball
with all the rust and go vendored code within the repo. It is used by
the release workflow to provide vendored code to downstream consummers
that might need it.

There isn't any vendored code in the repo anymore.

It thus doesn't seem quite useful to run `make vendor` in CI.

Stop doing it.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-05-06 09:49:50 +02:00
Greg Kurz
6de1c00b77 webhook: Fix go.sum file
Run `go mod tidy`.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-05-06 09:31:55 +02:00
Markus Rudy
22598a34b2 gatekeeper: require codegen
The codegen check ensures that generated files are up-to-date and
correspond to the tool versions used in CI. Requiring this check
prevents us from accidentally merging, e.g., proto changes without the
corresponding Rust/Go updates.

Signed-off-by: Markus Rudy <mr@edgeless.systems>
2026-05-02 12:28:58 +02:00
Fabiano Fidêncio
1a22c3adec Merge pull request #12942 from stevenhorsman/fix-cri-containerd-test-names
ci: Fix cri-containerd-test names
2026-04-29 09:56:43 +02:00
Steve Horsman
2435970fe8 Merge pull request #12933 from fidencio/topic/runtime-rs-decouple-dragonball-from-non-x86-checks
runtime-rs: drop misleading unsupported arches gating
2026-04-28 18:36:16 +01:00
stevenhorsman
4d4dee3af2 ci: Fix cri-containerd-test names
During the zizmor refactoring I changed the name of two jobs
to make all the architectures match. I forgot to update required_tests
and as a workflow only change the PR didn't check this, so update
them now.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-28 18:30:53 +01:00
Aurélien Bombo
e4fbddb91a ci: rename cloud-hypervisor to clh-runtime-rs
This aligns on qemu-runtime-rs and makes more sense.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Fabiano Fidêncio
8ab97a60f3 ci: install protobuf-compiler for runtime-rs build-checks
The `runtime-rs` component of `build-checks.yaml` declared `rust`
as its only dependency, but the runtime-rs build pulls in
`prost-build v0.8.0` (via `ttrpc-codegen` -> `containerd-shim-protos`,
and via the in-tree `hypervisor` crate), and `prost-build`'s build
script needs a `protoc` binary at compile time.

This worked on x86_64 and aarch64 only because `prost-build v0.8.0`
ships bundled `protoc` binaries for those targets. On s390x (and
ppc64le, when the matrix gets there) there is no bundled binary,
so the build fails with:

  Failed to find the protoc binary. The PROTOC environment variable
  is not set, there is no bundled protoc for this platform, and
  protoc is not in the PATH

The reason this didn't show up in CI before is that `make test`
and `make check` for runtime-rs were wrapped in arch-specific
`ifeq` blocks in `src/runtime-rs/Makefile` that turned them into
no-ops on s390x/ppc64le/riscv64gc. The previous commit dropped
those gates so `make {test,check}` now actually run on every arch,
which exposes this latent CI gap.

Match what `agent`, `libs`, `agent-ctl`, `kata-ctl` and `genpolicy`
already declare and add `protobuf-compiler` to runtime-rs's needs.
The existing `Install protobuf-compiler` step in this workflow
already runs `sudo apt-get -y install protobuf-compiler`, which
the s390x/ppc64le runners support (those other components have
been using it on s390x for some time).

Made-with: Cursor
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-28 16:25:31 +02:00
Fabiano Fidêncio
877f6b2129 tools: Fix shellcheck issues in common.bash
Address shellcheck warnings including proper variable quoting,
use of [[ ]] over [ ], declaring and assigning variables separately,
and adding appropriate shellcheck disable directives where needed.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
fedc1003b0 tools: Fix shellcheck issues in webhook-check.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
ca180a0e58 tools: Fix shellcheck issues in create-certs.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
0959f02b76 gatekeeper: Make arm64 CI unrequired
We have only one machine up and running the CIs, thus no capacity to
keep it as required for now.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-23 16:12:06 +02:00
Saul Paredes
baf0f16804 ci: k8s-tests: test mariner and runtime-rs
Disable policy tests when using mariner and runtime-rs. These are not supported yet.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-04-21 14:08:21 -07:00
LandonTClipp
fd896e4e76 ci: Add kata-dictionary.txt to required_tests.yaml
This makes it so that changes to the kata-dictionary.txt file only trigger the
static checks to run.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-04-15 14:48:01 +01:00
stevenhorsman
8be3a24112 ci: Update cargo-deny in gatekeeper
Update the name and move it to the static checks as we don't
need to ensure it's running for none code changes.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-11 08:46:32 +01:00
stevenhorsman
31f9a5461b versions: bump golang to 1.25.9
Bump the go version to resolve CVEs:
- GO-2026-4947
- GO-2026-4946
- GO-2026-4870
- GO-2026-4869
- GO-2026-4865
- GO-2026-4864

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-04-09 08:59:40 +01:00
Harshitha Gowda
bb1165b23f tests: Set sev-snp, qemu-snp CIs as required
run-k8s-tests-on-tee (sev-snp, qemu-snp)

Signed-off-by: Harshitha Gowda <hgowda@amd.com>
2026-04-08 22:36:58 +02:00
Fabiano Fidêncio
3a1683ccdc gatekeeper: unrequire kata-deploy k3s tests
Those are breaking, and I need time to investigate why.
For now, unrequire those tests.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-31 18:32:17 +02:00
Fabiano Fidêncio
bcfb2354e0 gatekeeper: Unrequire NVIDIA GPU SNP tests till auth is fixed
SSIA, the NIM tests are breaking due to authentication issues, and those
issues are blocking other PRs.

Let's unrequire the test for now, and mark it as required again once we
fixed the auth issues.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-03-25 10:23:53 +01:00
Aurélien Bombo
ec9c57c595 Merge pull request #12467 from ldoktor/gk-output
tools.gatekeeper: Improve output
2026-03-24 17:03:55 -05:00
stevenhorsman
44ec815f77 gatekeeper: Add check-spelling to required
Add the new job to the list of static checks that runs on doc updates.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-19 10:22:54 +00:00
Aurélien Bombo
352b4cdad2 Merge pull request #12660 from LandonTClipp/ci_docs
ci: Don't run CI builds on doc PRs
2026-03-17 12:19:11 -05:00
LandonTClipp
d5d741f4e3 ci: Don't run CI builds on doc PRs
We disable the Kata artifact builds and testing if the PR is only
related to documentation. Regular static checks will remain.

Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
2026-03-12 15:48:05 -05:00
Aurélien Bombo
32444737b5 gatekeeper: Remove csi-kata-directvolume build from required tests
Since we don't build that anymore.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-11 12:55:23 -05:00
stevenhorsman
8ae0e36737 versions: bump golang to 1.25.8
Bump the builder image and versions to resolve CVEs:
- GO-2026-4601
- GO-2026-4602
- GO-2026-4603

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-09 09:10:01 +00:00
Lukáš Doktor
ce65d17276 tools.gatekeeper: Add support for GITHUB_STEP_SUMMARY
this should produce a table of failed/running jobs as a table along with
links to them. On pass it should only produce simple line with how many
jobs passed.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-03-06 12:19:26 -03:00
Lukáš Doktor
27bebfb438 tools.gatekeeper: Print link to the results in status output
to simplify analyzing failures let's print the link to the job result
next to the status.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2026-03-06 12:19:26 -03:00
Steve Horsman
94f850979f Merge pull request #12613 from stevenhorsman/tooling-bump-x/net-to-v0.51.0
Tooling bump x/net to v0.51.0
2026-03-04 13:44:22 +00:00
stevenhorsman
8640f27516 ci: Remove SNP tests from required
The SNP tests have been unstable on nightlies, but even when these
it seems to be manually cleaned up or something as PR tests are consistently
failing, so we should skip this from the required list until it is reliable.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-04 14:41:09 +01:00
Aurélien Bombo
911742e26e gatekeeper: Add EditorConfig checker to required tests
Now that it's stable and fully configured.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-03-03 11:34:06 -06:00
stevenhorsman
24fe232e56 ci/openshift-ci: Bump x/net to v0.51
Remediate CVE GO-2026-4559

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-02 16:40:03 +00:00
stevenhorsman
993a4846c8 versions: Bump go to 1.25.7
Now that go 1.26 is out, 1.24 is not supported, so bump to
1.25 as per our policy.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-03-02 16:33:47 +01:00
Fabiano Fidêncio
9634dfa859 gatekeeper: Update tests name
We need to do so after moving some of the tests to the free runners.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-02-21 08:44:47 +01:00
Harshitha Gowda
728d8656ee tests: Set sev-snp, qemu-snp CIs as required
run-k8s-tests-on-tee (sev-snp, qemu-snp)

Signed-off-by: Harshitha Gowda <hgowda@amd.com>
2026-02-19 16:41:29 +01:00
stevenhorsman
74d4469dab ci/openshift-ci: Bump x/net to v0.50
Remediates CVEs:
- GO-2026-4440
- GO-2026-4441

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2026-02-13 17:55:23 +01:00