Commit Graph

2085 Commits

Author SHA1 Message Date
Manuel Huber
1c081ff434 tests: nvidia: place NIM service into namespace
Place the NIM service into our test namespace. We are still observing
various situations where, for some reason, the NIM service appears in
the default namespace in our CI.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-10 07:36:23 +00:00
Fabiano Fidêncio
f7be57efe2 Merge pull request #13007 from manuelh-dev/mahuber/dbg-nim-svc
tests: nvidia: Wait for NIM operator pod and print
2026-05-08 20:58:51 +02:00
Manuel Huber
714adec3f8 tests: nvidia: Wait for NIM operator pod and print
Wait for the NIM operator pod to run before deploying NIM services.
Add a temporary debug function to print resource placement into the
different namespaces. Remove this function again when the NIM tests
are stabilized.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-05-08 06:27:48 +00:00
Fabiano Fidêncio
8dde5f39b7 tests: dump kata-deploy pod describe+logs on install timeout
When kubectl wait times out the pod never reached Ready, so the
existing log collection (which runs after wait succeeds) produces
"-- No entries --" with zero useful information.

Capture kubectl describe and kubectl logs (including previous
container) immediately on timeout so the next CI run shows exactly
why the pod is stuck (ImagePullBackOff, OOMKilled, probe failures,
containerd restart hang, etc.).

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-07 13:40:55 +02:00
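
The timeout handling described in the commit above could be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the function name `wait_or_dump` and the default namespace, label selector, and timeout are all hypothetical.

```shell
# Hypothetical sketch: wait for pods to become Ready and, if the wait
# times out, dump describe output and logs before returning failure.
# The namespace, selector and timeout defaults are assumptions.
wait_or_dump() {
	local namespace="${1:-kube-system}"
	local selector="${2:-name=kata-deploy}"
	local timeout="${3:-600s}"

	if kubectl wait --for=condition=Ready pod -l "${selector}" \
		-n "${namespace}" --timeout="${timeout}"; then
		return 0
	fi

	echo "Timed out waiting for pods matching ${selector}" >&2
	# Best-effort diagnostics; they must never mask the original failure.
	kubectl describe pod -l "${selector}" -n "${namespace}" || true
	kubectl logs -l "${selector}" -n "${namespace}" --all-containers || true
	# Logs of the previous container instance, useful after a crash loop.
	kubectl logs -l "${selector}" -n "${namespace}" --all-containers --previous || true
	return 1
}
```

Collecting the diagnostics inside the failure branch means they are captured while the pod is still stuck, rather than after a later cleanup step has removed it.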
Fabiano Fidêncio
0f3160276b ci: k8s: skip no-op Helm uninstall on free runners
In cleanup_kata_deploy, bail out early when no kata-deploy Helm release
exists so baremetal-* pre-deploy cleanup on fresh clusters does not
block on helm uninstall --wait (up to 10m).

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-07 13:40:55 +02:00
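
One way to express the early bail-out described in the commit above is to probe for the release first. The sketch below is an assumption about the general shape, not the actual `cleanup_kata_deploy` code: the function name `cleanup_release` and its arguments are hypothetical, and it relies on `helm status` exiting non-zero when the release does not exist.

```shell
# Hypothetical sketch: skip "helm uninstall --wait" when there is no
# release to remove, so a fresh cluster does not block on the wait.
cleanup_release() {
	local release="$1"
	local namespace="$2"

	# "helm status" fails when the release is not found.
	if ! helm status "${release}" -n "${namespace}" >/dev/null 2>&1; then
		echo "No Helm release ${release} in ${namespace}; nothing to uninstall"
		return 0
	fi

	helm uninstall "${release}" -n "${namespace}" --wait --timeout 10m
}
```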
Fabiano Fidêncio
19c194aa94 ci: Add runtime-rs GPU shims to NVIDIA GPU CI workflow
Add qemu-nvidia-gpu-runtime-rs and qemu-nvidia-gpu-snp-runtime-rs to
the NVIDIA GPU test matrix so CI covers the new runtime-rs shims.

Introduce a `coco` boolean field in each matrix entry and use it for
all CoCo-related conditionals (KBS, snapshotter, KBS deploy/cleanup
steps). This replaces fragile name-string comparisons that were already
broken for the runtime-rs variants: `nvidia-gpu (runtime-rs)` was
incorrectly getting KBS steps, and `nvidia-gpu-snp (runtime-rs)` was
not getting the right env vars.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
2026-05-07 10:33:26 +02:00
Fabiano Fidêncio
acfb9f9762 Merge pull request #12954 from zvonkok/modular-makefile
build: remove gha-adjust-to-use-prebuilt-components.sh
2026-05-07 10:32:32 +02:00
manuelh-dev
8473144ee5 Merge pull request #12989 from microsoft/danmihai1/ignore-unnecessary-fields
genpolicy: ignore additional irrelevant fields
2026-05-06 23:54:39 -07:00
Greg Kurz
16bc6db59e static-checks: Drop vendor checks
The repo doesn't track vendor code anymore. Also, I could not find any
evidence that this code is actually called. The reference to URL

```
https://github.com/kata-containers/community/blob/main/VENDORING.md
```

that was recently removed by

https://github.com/kata-containers/community/pull/442

is another indication that this flow is outdated.

Drop it.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-05-06 09:49:53 +02:00
Greg Kurz
af54cd8a27 tests: Remove vendor directory
Now shipped in the vendored code tarball.

Signed-off-by: Greg Kurz <groug@kaod.org>
2026-05-06 09:32:05 +02:00
Dan Mihai
fcee4864e7 genpolicy: ignore additional PodAffinity fields
1. Ignore PodAffinity's preferredDuringSchedulingIgnoredDuringExecution.
2. Ignore additional PodAffinityTerm fields.
3. Add basic tests for the new fields.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-05-06 01:38:02 +00:00
Dan Mihai
b6349f50ab genpolicy: ignore preemptionPolicy
Ignore the pod preemptionPolicy field from input YAML - irrelevant
for building the Policy.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-05-06 00:35:27 +00:00
Dan Mihai
9f4a7a9d55 Merge pull request #12978 from microsoft/danmihai1/empty-env-var
genpolicy: support empty environment variables
2026-05-05 14:10:35 -07:00
Dan Mihai
99dd897814 genpolicy: support empty environment variables
K8s supports them, so genpolicy should support them too.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-05-05 18:53:25 +00:00
Fabiano Fidêncio
29e63c21a1 tests: k8s-cron-job: set runtimeClassName to kata
The cron-job test workload was missing `runtimeClassName: kata`, which
meant the cron job was not actually being executed under the Kata
runtime, defeating the purpose of the test.

Set it explicitly, consistent with the sibling `job.yaml` workload.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-05 11:21:05 +02:00
Fabiano Fidêncio
27c3dfbb8c Merge pull request #12943 from fidencio/topic/kata-deploy-add-http-health-probes
kata-deploy: add HTTP health probes (healthz/readyz)
2026-05-05 09:30:17 +02:00
Dan Mihai
0a6dc2fae0 ci: mariner: use OCI version 1.2.1
Mariner moved from version 1.2.0 to version 1.2.1.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2026-05-05 02:23:30 +00:00
Fabiano Fidêncio
9e3bd6b576 tests: fix kata-deploy lifecycle test reliability
Fix two issues in kata-deploy-lifecycle.bats that caused failures on
k3s, k0s and rke2:

  run_on_host():
  - `kubectl run --rm -i` causes k3s/rke2 to inject session-recording
    banners into stdout, polluting command output and breaking string
    assertions. Replace with a create/wait/logs/delete sequence so only
    the container's actual stdout is captured.

  "Artifacts are fully cleaned up after uninstall":
  - After a CRI restart the kubelet may briefly report "Unknown" for the
    container runtime version. Retry for up to 60s before asserting.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-03 22:09:08 +02:00
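
The "retry for up to 60s before asserting" approach in the commit above can be illustrated with a small generic helper. This is only a sketch under stated assumptions: `retry_until` and its argument order are hypothetical and are not taken from kata-deploy-lifecycle.bats.

```shell
# Hypothetical helper: re-run a command until it succeeds or a deadline
# (in seconds) has passed. Elapsed time is approximated by summing the
# sleep intervals, so the command's own run time is not counted.
retry_until() {
	local timeout_s="$1"
	local interval_s="$2"
	shift 2
	local waited=0

	while ! "$@"; do
		if [ "${waited}" -ge "${timeout_s}" ]; then
			return 1
		fi
		sleep "${interval_s}"
		waited=$((waited + interval_s))
	done
	return 0
}
```

A test could then call something like `retry_until 60 5 runtime_version_is_known`, where `runtime_version_is_known` is a hypothetical function that fails while the kubelet still reports "Unknown" for the container runtime version.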
Fabiano Fidêncio
ed4f6ebc9e tests: use readiness probes to wait for kata-deploy install
Now that kata-deploy has a proper readiness probe (/readyz returns 200
only after install completes), replace the ad-hoc wait strategies with
kubectl wait --for=condition=Ready on the kata-deploy pods.

Note: helm --wait is ineffective for single-node clusters with
maxUnavailable=1 (the DaemonSet is considered ready with 0 ready pods),
so the CI uses kubectl wait on the pod readiness condition directly.

  gha-run-k8s-common.sh:
  - Drop the waitForProcess polling loop for Running pods
  - Drop the `sleep 60s` with its FIXME comment
  - Add kubectl wait --for=condition=Ready instead

  helm-deploy.bash:
  - Drop the extra `kubectl rollout status` after helm
  - Drop the `sleep 60`
  - The existing --wait on the helm command now suffices

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-03 22:09:08 +02:00
Fabiano Fidêncio
8c3c7aa871 ci: Drop ITA_KEY usage from CI workflows
The ITA_KEY secret was conditionally passed to TDX jobs for Intel
Trust Authority attestation, but it is no longer needed. Remove it
from all workflow files and the test helper export.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-03 18:05:51 +02:00
Zvonko Kaiser
a4129e41f3 build: remove gha-adjust-to-use-prebuilt-components.sh
No longer used; its two responsibilities are now expressed directly
in the workflows and the Makefile.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2026-04-30 00:49:13 +00:00
Aurélien Bombo
f3dc71a770 Revert "tests: k8s: policy: improve settings selection for runtime-rs hypervisors"
This reverts commit cafdd278ba.
2026-04-28 10:58:01 -05:00
Aurélien Bombo
cf6a91a104 runtime-rs/config: rename cloud-hypervisor to clh
This aligns with the previous commit and runtime-go.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Aurélien Bombo
e4fbddb91a ci: rename cloud-hypervisor to clh-runtime-rs
This aligns with qemu-runtime-rs and makes more sense.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2026-04-28 10:58:01 -05:00
Saul Paredes
7c8df3b9e6 Revert "test: temp skip failing tests on AKS"
This reverts commit 90e94ab305.
2026-04-27 09:36:51 -07:00
Saul Paredes
3273c4e1cc Revert "ci: Skip tests not working with k8s 1.36.0"
This reverts commit df68536cd6.
2026-04-27 08:08:27 -07:00
Saul Paredes
51f234cb56 tests: describe pods deployment when testing deployment output
For k8s 1.36.0, the events of a pod are no longer included in the "kubectl describe pod"
output when describing a deployment. Describe using the "app" label instead.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2026-04-27 08:07:58 -07:00
Mikko Ylinen
9cccfb5cb5 tests: align qemu-tdx kbs tests to use Trustee AS
There is no need to deviate from how other CoCo targets use Trustee, and
aligning enables us to add more tests (e.g., RVPS) that the ITA Trustee
implementation does not support.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2026-04-25 22:53:15 +02:00
Fabiano Fidêncio
df68536cd6 ci: Skip tests not working with k8s 1.36.0
At first we thought this only happened with AKS, but it seems this is a
change in k8s 1.36.0 as the tests now started failing outside of AKS as
well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2026-04-25 08:56:42 +02:00
Fabiano Fidêncio
e6c6aad7af ci: k8s: temporarily remove smb tests
All the CIs are failing on the tests and in order to avoid blocking
upstream while allowing enough time for the developers to properly fix
it, let's just not execute the test.

This commit should be reverted once a fix is proposed.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 21:13:23 +02:00
Aurélien Bombo
15296fc9fe Merge pull request #12374 from microsoft/cameronbaird/add-cifs
kernel: add required configs for CIFS support
2026-04-24 10:42:09 -05:00
Fabiano Fidêncio
b7eb3ae402 tests: Fix shellcheck issues in helm-deploy.bash
Address shellcheck warnings including proper variable quoting,
use of [[ ]] over [ ], declaring and assigning variables separately,
and adding appropriate shellcheck disable directives where needed.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
9730aca676 tests: Fix shellcheck issues in common.bash
Address shellcheck warnings including proper variable quoting,
use of [[ ]] over [ ], declaring and assigning variables separately,
and adding appropriate shellcheck disable directives where needed.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Made-with: Cursor
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
5b737c01c3 tests: Fix shellcheck issues in stressng.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
2831481b31 tests: Fix shellcheck issues in cassandra_stress.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
544d11dba2 tests: Fix shellcheck issues in gha-adjust-to-use-prebuilt-components.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
f36c38803f tests: Fix shellcheck issues in run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
595296636f tests: Fix shellcheck issues in kata-monitor-tests.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
ce9c2671d4 tests: Fix shellcheck issues in gha-run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
143f9a7882 tests: Fix shellcheck issues in run-kata-deploy-tests.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
140e08044f tests: Fix shellcheck issues in gha-run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
dba477eeb0 tests: Fix shellcheck issues in setup_common.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
9ee4ef9ac0 tests: Fix shellcheck issues in run-agent-api-tests.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
63ebcc327a tests: Fix shellcheck issues in gha-run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
47f66299c3 tests: Fix shellcheck issues in soak_parallel_rm.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
98ab2228f8 tests: Fix shellcheck issues in scability_test.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
be0c348b80 tests: Fix shellcheck issues in gha-run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
7618715c24 tests: Fix shellcheck issues in agent_stability_test.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
011e0178e1 tests: Fix shellcheck issues in nydus_tests.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Fabiano Fidêncio
1161249197 tests: Fix shellcheck issues in gha-run.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00