Commit Graph

900 Commits

Author SHA1 Message Date
Ruoqing He
186c88b1d5 ci: Move musl-tools installation into Setup rust
`musl-tools` is only needed when a component needs `rust`, and the
`instance` running is of `x86_64` or `aarch64`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-03-05 09:43:19 +08:00
Zvonko Kaiser
4bb0eb4590
Merge pull request #10954 from kata-containers/topic/metrics-kata-deploy
Rework and fix metrics issues
2025-03-04 20:22:53 -05:00
stevenhorsman
fb1d4b571f workflows: Add required shellcheck workflow
Start with a required smaller set of shellchecks
to try and prevent regressions whilst we fix
the current problems

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-04 09:35:46 +00:00
stevenhorsman
b3972df3ca workflows: Shellcheck - ignore vendor
Ignore the vendor directories in our shellcheck
workflow as we can't fix them. If there is a way to
set this in shellcheckrc that would be better, but
it doesn't seem to be implemented yet.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-04 09:35:46 +00:00
stevenhorsman
3fab7944a3 workflows: Improve metrics jobs
- As the metrics tests are largely independent
then allow subsequent tests to run even if previous
ones failed. The results might not be perfect if
clean-up is required, but we can work on that later.
- Move the test results check out of the latency
test that seems arbitrary and into it's own job step
- Add timeouts to steps that might fail/hang if there
are containerd/K8s issues

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-01 17:50:05 +00:00
stevenhorsman
6f918d71f5 workflows: Update metrics jobs
Currently the run-metrics job runs a manual install
and does this in a separate job before the metrics
tests run. This doesn't make sense as if we have multiple
CI runs in parallel (like we often do), there is a high chance
that the setup for another PR runs between the metrics
setup and the runs, meaning it's not testing the correct
version of code. We want to remove this from happening,
so install (and delete to cleanup) kata as part of the metrics
test jobs.

Also switch to kata-deploy rather than manual install for
simplicity and in order to test what we recommend to users.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-03-01 17:50:05 +00:00
Stephane Talbot
f80e7370d5 test: Verify deployement of kata-deploy on microk8s
Enable fonctional test to verify deployment of kata-deploy on a Microk8s cluster

Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>
2025-02-28 10:10:29 +01:00
Ruoqing He
09030ee96e ci: Refactor build-checks workflow
Refator matrix setup and according dependencies installation logic in
`build-checks.yaml` and `build-checks-preview-riscv64.yaml` to provide
better readability and maintainability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-28 09:47:25 +08:00
Ruoqing He
eb94700590 ci: Drop install-libseccomp matrix variant
`install-libseccomp` is applied only for `agent` component, and we are
already combining matrix with `if`s in steps, drop `install-libseccomp`
in matrix to reduce complexity.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-28 09:44:53 +08:00
Fabiano Fidêncio
96ed706d20
Merge pull request #10950 from fidencio/topic/skip-arm-check-tests-that-depend-on-virt
ci: arm64: Skip tests that depend on virt on non-virt capable runners
2025-02-27 18:26:32 +01:00
Fabiano Fidêncio
e18e1ec3a8 ci: arm64: Skip tests that depend on virt on non-virt capable runners
The GitHub hosted runners for ARM64 do not provide virtualisation
support, thus we're just skipping the tests as those would check whether
or not the system is "VMContainerCapable".

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-27 14:43:21 +01:00
Steve Horsman
f3c22411fc
Merge pull request #10930 from stevenhorsman/codeql-config
workflows: Add codeql config
2025-02-27 12:43:41 +00:00
Ruoqing He
ec020399b9 ci: Enable partial components build-check on riscv
Since we have RISC-V builders available now, let's start with
`agent-ctl`, `trace-forwarder` and `genpolicy` components to run
build-checks on these `riscv-builder`s, and gradually add the rest
components when they are ready, to catch up with other architectures
eventually.

This workflow could be mannually triggered, `riscv-builder` will be the
default instance when that is the case.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-26 15:38:39 +08:00
stevenhorsman
c97e9e1592 workflows: Add codeql config
I noticed that CodeQl using the default config hasn't
scanned since May 2024, so figured it would be worth
trying an explicit configuration to see if that gets better results.
It's mostly the template, but updated to be more relevant:
- Only scan PRs and pushes to the `main` branch
- Set a pinned runner version rather than latest (with mac support)
- Edit the list of languages to be scanned to be more relevant
for kata-containers

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-25 15:05:43 +00:00
stevenhorsman
5000fca664 workflows: Add build-checks to manual CI
Currently the ci-on-push workflow that runs on PRs runs
two jobs: gatekeeper-skipper.yaml and ci.yaml. In order
to test things like for the error
```
too many workflows are referenced, total: 21, limit: 20
```
on topic branches, we need ci-devel.yaml to have an
extra workflow to match ci-on-push, so add the build-checks
as this is helpful to run on topic branches anyway.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-25 11:38:49 +00:00
stevenhorsman
23434791f2 workflows: Refactor publish workflows
Replace the four different publish workflows with
a single one that take input parameters of the arch
and runner, so reduce the amount of duplicated code
and try and avoid the
```
too many workflows are referenced, total: 21, limit: 20
```
error
2025-02-25 10:49:09 +00:00
Fabiano Fidêncio
7bd444fa52 ci: Run k8s tests on arm64
Let's take advantege of the current arm64 runners, and make sure we have
those tests running there as well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
2025-02-24 18:43:20 +01:00
Aurélien Bombo
adca339c3c ci: Fix GH throttling in run-nerdctl-tests
Specify a GH API token to avoid the below throttling error:

  https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585810679?pr=10911#step:4:96

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-02-21 17:52:17 -06:00
Hyounggyu Choi
d973d41efb GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x
This is the first attempt to remove the following code:

```
if [ "${ARCH}" == "s390x" ]; then
    export MEASURED_ROOTFS=no
fi
```

from install_shimv2() in kata-deploy-binaries.sh.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2025-02-19 18:19:19 +01:00
Zvonko Kaiser
ca4d227562 gpu: Add qemu-tdx-experimental build
We need to introduce again the qemu-tdx build for the GPU

Depends-on: #10867

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-19 14:48:56 +00:00
Zvonko Kaiser
1d9915147d release: Remove artifacts for release
We need to make sure the release does not have any residual binaries
left for the release payload

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-17 20:16:48 +00:00
Adithya Krishnan Kannan
6cc5b79507 CI: Deprecate SEV
Phase 1 of Issue #10840
AMD has deprecated SEV support on
Kata Containers, and going forward,
SNP will be the only AMD feature
supported. As a first step in this
deprecation process, we are removing
the SEV CI workflow from the test suite
to unblock the CI.

Will be adding future commits to
remove redundant SEV code paths.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2025-02-13 12:20:21 -06:00
Zvonko Kaiser
fbc8454d3d
Merge pull request #10866 from zvonkok/enable-cc-gpu-build
gpu: enable confidential initrd build
2025-02-12 09:26:08 -05:00
Zvonko Kaiser
5431841a80
Merge pull request #10814 from kata-containers/shellcheck-gha
gha: Add shellcheck
2025-02-11 18:30:41 -05:00
Zvonko Kaiser
b231a795d7 gha: Add shellcheck
We need to start to fix our scripts. Lets run shellcheck
and see what needs to be reworked.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-11 16:00:34 +00:00
Zvonko Kaiser
befb2a7c33 gpu: Confidential Initrd
Start building the confidential initrd

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-11 15:41:36 +00:00
Fupan Li
a3fd3d90bc ci: Add the sandbox api testcases
A test case is added based on the intergrated cri-containerd case.
The difference between cri containerd integrated testcase and sandbox
api testcase is the "sandboxer" setting in the sandbox runtime handler.

If the "sandboxer" is set to "" or "podsandbox", then containerd will
use the legacy shimv2 api, and if the "sandboxer" is set to "shim", then
it will use the sandbox api to launch the pod.

In addition, add a containerd v2.0.0 version. Because containerd officially
supports the sandbox api from version 2.0.0.

Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
2025-02-11 15:21:53 +01:00
Fabiano Fidêncio
c9f5966f56
Merge pull request #10860 from kata-containers/topic/debug-ci
workflows: build: Do not store unnecessary content on the tarball
2025-02-10 20:01:37 +01:00
Fabiano Fidêncio
ec290853e9 workflows: build: Do not store unnecessary content on the tarball
Otherwise we may end up simply unpacking kata-containers specific
binaries into the same location that system ones are needed, leading to
a broken system (most likely what happened with the metrics CI, and also
what's happening with the GHA runners).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-10 18:57:29 +01:00
Fabiano Fidêncio
23cb5bb6c2 ci: Only use the Ubuntu TDX machine in the CI
We've been hitting issues with the CentOS 9 Stream machine, which Intel
doesn't have cycles to debug.

After raising this up in the Confidential Containers community meeting
we got the green light from Red Hat (Ariel Adam) to just disable the CI
based on CentOS 9 Stream for now.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-02-10 12:50:16 +01:00
Zvonko Kaiser
45bd451fa0 ci: add arm64 attestation
Do the very same thing that we do on amd64 and add attestation

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
9a7dff9c40 gpu: Add arm64 targets
We want to make sure we deliver arm64 GPU targets as well

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
968318180d ci: Add extratarballs steps
We introduced extratarballs with a make target. The CI
currently only uploads tarballs that are listed in the matrix.
The NV kernel builds a headers package which needs to be uploaded
as well.

The get-artifacts has a glob to download all artifacts hence we
should be good.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Zvonko Kaiser
b04bdf54a5 gpu: Add rootfs target amd64/arm64
Adding the initrd build first to get the rootfs on amd64.
With that we can start to add tests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2025-02-05 16:30:20 +00:00
Steve Horsman
9060904c4f
Merge pull request #10826 from kata-containers/topic/crio-test-timeouts
workflows: Add delete kata-deploy timeouts for crio tests
2025-02-04 13:09:49 +00:00
stevenhorsman
d9eb1b0e06 versions: Bump golang version
Bump golang versions so we are more up-to-date and
have the extra security fixes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-03 15:28:53 +00:00
stevenhorsman
5203158195 workflows: Add delete kata-deploy timeouts for crio tests
I've also seen cases (the qemu, crio, k0s tests) where Delete kata-deploy is still
running for this test after 2 hours, and had to be manually
cancelled, so let's try adding a 5m timeout to the kata-deploy delete to stop CI jobs hanging.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-03 11:45:43 +00:00
stevenhorsman
d625f20d18 workflows: Move arm static checks runner
Now we have the build-assets running on the gh-hosted
runners, try the same approach for the static-checks

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-23 14:23:09 +00:00
stevenhorsman
ab27e11d31 workflows: Switch to github-hosted arm runner
Now that gituhb have hosted arm runners
https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/
we should try and switch our arm64 builder jobs to
run on these.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-22 16:27:17 +00:00
Ruoqing He
373a388844 ci: Retry on failure of Create AKS cluster
The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely
to fail fail since we are trying to issue `PUT` to `aks` in a relatively
high frequency, while the `aks` end has it's limit on `bucket-size` and
`refill-rate`, documented here [1].

Use `nick-fields/retry@v3` to retry in 10 seconds after request fail,
based on observations that AKS were request 7, or 8 second delays
before retry as part of their 429 response

[1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis

Fixes: #10772

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-01-22 13:24:51 +00:00
Aurélien Bombo
0d70dc31c1 ci: Unify on $GH_PR_NUMBER environment variable
While working on #10559, I realized that some parts of the codebase use
$GH_PR_NUMBER, while other parts use $PR_NUMBER.

Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests
without realizing that TEE tests use $PR_NUMBER, the tests on that PR
fail on TEEs:

https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45

  ...
  44      error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context
  ...
  135               image: ghcr.io/kata-containers/csi-kata-directvolume:
  ...

So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the
future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER.

Note that since some test scripts also refer to that variable, the CI
for this PR will fail (would have also happened with the converse
substitution), hence I'm not adding the ok-to-test label and we should
force-merge this after review.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-01-17 10:53:08 -06:00
stevenhorsman
9b6fce9e96 workflows: Add more ppc64le timeouts
Unsurprisingly now we've got passed the containerd test
hangs on the ppc64le, we are hitting others  in the "Prepare the
self-hosted runner" stage, so add timeouts to all of them
to avoid CI blockages.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 17:31:24 +00:00
stevenhorsman
d9d8d53bea workflows: Add timeout to some ppc64le steps
In some runs e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588
and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842
we've seen the Prepare the self-hosted runner
and Install dependencies steps get stuck for 5hours+.
If they are working then it should take a few minutes,
so let's add timeouts and not hold up whole the CI if they are stuck

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-20 16:37:36 +00:00
stevenhorsman
cf8b82794a workflows: Only remove artifacts in release builds
Due to the agent-api tests requiring the agent to be deployed in the
CI by the tarball, so in the short-term lets only do this on the release
stage, so that both kata-manager works with the release and the
agent-api tests work with the other CI builds.

In the longer term we need to re-evaluate what is in our tarballs
(issue #10619), but want to unblock the tests in the short-term.

Fixes: #10630
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 17:38:27 +00:00
stevenhorsman
e1f6aca9de workflows: Remove potential timing issues with artifacts
With the code I originally did I think there is potentially
a case where we can get a failure due to timing of steps.
Before this change the `build-asset-shim-v2`
job could start the `get-artifacts` step and concurrently
`remove-rootfs-binary-artifacts` could run and delete the artifact
during the download and result in the error. In this commit, I
try to resolve this by making sure that the shim build waits
for the artifact deletes to complete before starting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-12 16:52:54 +00:00
stevenhorsman
b4b3471bcb workflows: linting: Fix shellcheck SC1001
> This \/ will be a regular '/' in this context

Remove ignored escape

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
491210ed22 workflows: linting: Fix shellcheck SC2006
> Use $(...) notation instead of legacy backticks `...`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
5d7c5bdfa4 workflows: linting: Fix shellcheck SC2015
> A && B || C is not if-then-else. C may run when A is true

Refactor the echo so that we can't get into a situation where
the retry of workspace delete happens if the original one was
successful

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
c2ba15c111 workflows: linting: Fix shellcheck SC2206
>  Quote to prevent word splitting/globbing

Double quote variables expanded in an array

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
007514154c workflows: linting: Fix shellcheck SC2068
> Double quote array expansions to avoid re-splitting elements

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
4ef05c6176 workflows: linting: Fix shellcheck SC2116
> Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo'

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
f02d540799 workflows: Bump outdated action versions
Bump some actions that are significantly out-of-date
and out of sync with the versions used in other workflows

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
935327b5aa workflows: linting: Fix shellcheck SC2046
> Quote this to prevent word splitting.

Quote around subshell

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
d4bd314d52 workflows: linting: Fix incorrect properties
These properties are currently invalid, so either
fix, or remove them

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
9113606d45 workflows: linting: Fix shellcheck SC2086
> Double quote to prevent globbing and word splitting.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 13:50:12 +00:00
stevenhorsman
42cd2ce6e4 workflows: Add actionlint workflows
On PRs that update anything in the workflows directory,
add an actionlint run to validate our workflow files for errors
and hopefully catch issues earlier.

Fixes: #9646

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-06 11:36:08 +00:00
Fabiano Fidêncio
300a827d03 release: helm: Add the chart as part of the release
So users can simply download the chart and use it accordingly without
the need to download the full repo.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-12-06 11:19:34 +01:00
stevenhorsman
14a3adf4d6 workflows: Fix remove artifact name filter
- Fix copy-paste errors in artifact filters for arm64 and ppc64le
- Remove the trailing wildcard filter that falsely ends up removing agent-ctl
and replace with the tarball-suffix, which should exactly match the artifacts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-12-04 13:34:42 +00:00
Aurélien Bombo
4aa7d4e358 ci: Require CSI driver for CoCo tests
With the building/publishing step for the CSI driver validated, we can
set that as a requirement for the CoCo tests.

Depends on: #10561

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 14:43:36 -06:00
Aurélien Bombo
a23ceac913 ci: Fix Docker publishing for CSI driver, 2nd try
Follow-up to #10609 as it seems GHA doesn't allow hard links:

https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8

Note that I also updated the `needs` directive as we don't need the Kata
payload container, just the tarball artifact.

Part of: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-03 13:04:46 -06:00
Aurélien Bombo
85d3bcd713 ci: Fix Docker publishing for CSI driver
The compilation succeeds, however Docker can't find the binary because
we specify an absolute path. In Docker world, an absolute path is
absolute to the Docker build context (here:
src/tools/csi-kata-directvolume).

To fix this, we link the binary into the build context, where the
Dockerfile expects it.

Failure mode:
https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-12-02 15:50:01 -06:00
Fabiano Fidêncio
92b8091f62
Revert "ci: unbreak: Reallow no-op builds"
This reverts commit 559018554b.

As we've noticed that this is causing issues with initrd builds in the
CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-28 12:02:40 +01:00
Aurélien Bombo
559018554b ci: unbreak: Reallow no-op builds
#9838 previously modified the static build so as not to repeatedly
copy the same assets on each matrix iteration:

https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202

However, that implementation breaks specifiying no-op/WIP build targets
such as done in e43c59a. Such no-op builds have been a historical of the
project requirement because of a GHA limitation. The breakage is due to
no-op builds not generating a tar file corresponding to the asset:

https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592

To address this breakage, we revert to the `cp -r` implementation and
add the `--no-clobber` flag to still preserve the current behavior. Note
that `-r` will also create the destination directory if it doesn't
exist.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 18:40:29 -06:00
Aurélien Bombo
7f659f3d63 gha: Unbreak CI and work around workflow limit
#10561 inadvertently broke the CI by going over the limit of
20 reusable workflows:

https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow

This commit fixes that by inlining the job.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-27 12:23:15 -06:00
Aurélien Bombo
16a91fccbe
Merge pull request #10561 from sprt/csi-driver-ci
coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]
2024-11-27 10:26:45 -06:00
stevenhorsman
3e5d360185 workflows: Remove rootfs binary artifacts
We need the publish certain artefacts for the rootfs,
like the agent, guest-components, pause bundle etc
as they are consumed in the `build-asset-rootfs` step.
However after this point they aren't needed and probably
shouldn't be included in the overall kata tarball, so delete
them once they aren't needed any more to avoid them
being included.

Fixes: #10575
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-26 15:24:20 +00:00
Aurélien Bombo
5e4990bcf5 coco: ci: Add no-op steps to deploy CSI driver
This adds no-op steps that'll be used to deploy and clean up the CSI driver
used for testing.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:08:06 -06:00
Aurélien Bombo
893f6a4ca0 ci: Introduce job to publish CSI driver image
This adds a new job to build and publish the CSI driver Docker image.

Of course this job will fail after we merge this PR because the CSI driver
compilation job hasn't been implemented yet. However that will be implemented
directly after in #10561.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:07:59 -06:00
Aurélien Bombo
e43c59a2c6 ci: Add no-op step to compile CSI driver
This adds a no-op build step to compile the CSI driver. The actual compilation
will be implemented in an ulterior PR, so as to ensure we don't break the CI.

Addresses: #10560

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-21 16:06:55 -06:00
Steve Horsman
358ebf5134
Merge pull request #10558 from AdithyaKrishnan/main
ci: Re-enable SNP CI
2024-11-20 10:27:41 +00:00
stevenhorsman
da5f6b77c7 workflows: Remove skipping of artifact uploads
Now we are downloading artifacts to create the rootfs
we need to ensure they are uploaded always,
even on releases

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-19 13:28:02 +00:00
Adithya Krishnan Kannan
ef367d81f2 ci: Re-enable SNP CI
We've debugged the SNP Node and we
wish to test the fixes on GHA.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-18 11:11:27 -06:00
Fabiano Fidêncio
6a9266124b
Merge pull request #10501 from kata-containers/topic/ci-split-tests
ci: tdx: Split jobs to run in 2 different machines
2024-11-14 17:24:50 +01:00
Fabiano Fidêncio
9b3fe0c747 ci: tdx: Adjust workflows to use different machines
This will be helpful in order to increase the OS coverage (we'll be
using both Ubuntu 24.04 and CentOS 9 Stream), while also reducing the
amount spent on the tests (as one machine will only run attestation
related tests, and the other the tests that do *not* require
attestation).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-14 15:52:00 +01:00
Adithya Krishnan Kannan
439a1336b5 ci: Temporarily skip SNP CI
As discussed in the CI working group,
we are temporarily skipping the SNP CI
to unblock the remaining workflow.
Will revert after fixing the SNP runner.

Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>
2024-11-13 11:44:16 -06:00
GabyCT
06fe459e52
Merge pull request #10508 from GabyCT/topic/installartsta
gha: Get artifacts when installing kata tools in stability workflow
2024-11-11 15:59:06 -06:00
Aurélien Bombo
19e972151f gha: Hardcode ubuntu-22.04 instead of latest
GHA is migrating ubuntu-latest to Ubuntu 24 so
let's hardcode the current 22.04 LTS.

https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-11-08 11:00:15 -06:00
Steve Horsman
6112bf85c3
Merge pull request #10506 from stevenhorsman/skip-runk-ci
workflow: Remove/skip runk CI
2024-11-08 09:54:06 +00:00
Gabriela Cervantes
4274198664 gha: Get artifacts when installing kata tools in stability workflow
This PR adds the get artifacts which are needed when installing kata
tools in stability workflow to avoid failures saying that artifacts
are missing.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-07 16:20:41 +00:00
stevenhorsman
a5f1a5a0ee workflow: Remove/skip runk CI
As discussed in the AC meeting, we don't have a maintainer,
(or users?) of runk, and the CI is unstable, so giving we can't
support it, we shouldn't waste CI cycles on it.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:16:30 +00:00
stevenhorsman
0efe9f4e76 metrics: Skip metrics on stratovirt
As discussed on the AC call, we are lacking maintainers for the
metrics tests. As a starting point for potentially phasing them
out, we discussed starting with removing the test for stratovirt
as a non-core hypervisor and a job that is problematic in leaving
behind resources that need cleaning up.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-07 14:06:57 +00:00
GabyCT
47cea6f3c6
Merge pull request #10493 from GabyCT/topic/katatoolsta
gha: Add install kata tools as part of the stability workflow
2024-11-06 14:16:48 -06:00
Gabriela Cervantes
13e27331ef gha: Add install kata tools as part of the stability workflow
This PR adds the install kata tools step as part of the k8s stability workflow.
To avoid the failures saying that certain kata components are not installed it.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-11-06 20:07:06 +00:00
Fabiano Fidêncio
71c4c2a514
Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev
workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
2024-11-06 21:04:45 +01:00
Julien Ropé
6d0cb1e9a8 ci: export CONTAINER_RUNTIME to the test scripts
This variable will allow tests to adapt their behaviour to the runtime (containerd/crio).

Signed-off-by: Julien Ropé <jrope@redhat.com>
2024-11-06 11:29:11 +01:00
Fabiano Fidêncio
72979d7f30 workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev
By the moment we're testing it also with qemu-coco-dev, it becomes
easier for a developer without access to TEE to also test it locally.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-11-06 10:47:08 +01:00
Hyounggyu Choi
aeef28eec2 gha: Switch to kubeadm for run-k8s-tests-on-zvsi
Last November, SUSE discontinued support for s390x, leaving k3s
on this platform stuck at k8s version 1.28, while upstream k8s
has since reached 1.31. Fortunately, kubeadm allows us to create
a 1.30 Kubernetes cluster on s390x.
This commit switches the KUBERNETES option from k3s to kubeadm
for s390x and removes a dedicated cluster creation step.
Now, cluster setup and teardown occur in ACTIONS_RUNNER_HOOK_JOB_{STARTED,COMPLETED}.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-10-29 14:27:32 +01:00
Fabiano Fidêncio
d537932e66
build: shim-v2: Ensure MEASURED_ROOTFS is exported
The approach taken for now is to export MEASURED_ROOTFS=yes on the
workflow files for the architectures using confidential stuff, and leave
the "normal" build without having it set (to avoid any change of
expectation on the current bevahiour).

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
a65946bcb0
workflows: build: Ensure rootfs is present for shim-v2 build
Let's ensure that we get the already built rootfs tarball from previous
steps of the action at the time we're building the shim-v2.

The reason we do that is because the rootfs binary tarballs has a
root_hash.txt file that contains the information needed the shim-v2
build scripts to add the measured rootfs arguments to the shim-v2
configuration files.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
6ea0369878
workflows: build: Ensure rootfs is built before shim-v2
As the rootfs will have what we need to add as part of the shim-v2
configuration files for measured rootfs, we **must** ensure this is
built **before** shim-v2.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
13ea082531
workflows: Build rootfs after its deps are built
By doing this we can just re-use the dependencies already built, saving
us a reasonable amount of time.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:53 +01:00
Fabiano Fidêncio
c2b18f9660
workflows: Store rootfs dependencies
So far we haven't been storing the rootfs dependencies as part of our
workflows, but we better do it to re-use them as part of the rootfs
build.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 12:43:52 +01:00
Fabiano Fidêncio
a8fad6893a workflows: Possibly fix the release workflow
The only reason we had this one passing for amd64 is because the check
was done using the wrong variable (`matrix.stage`, while in the other
workflows the variable used is `inputs.stage`).

The commit that broke the release process is 67a8665f51, which
blindly copy & pasted the logic from the matrix assets.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-28 11:15:53 +01:00
Fabiano Fidêncio
475ad3e06b
workflows: devel: Allow running more than one at once
More than one developer can and should be able to run this workflow at
the same time, without cancelling the job started by another developer.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:35 +02:00
Fabiano Fidêncio
8f634ceb6b
workflows: devel: Adjust the pr-number
Let's use "dev" instead of "manually-triggered" as it avoids the name
being too long, which results in failures to create AKS clusters.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-24 15:38:31 +02:00
Fabiano Fidêncio
829415dfda workflows: Remove the possibility to manually trigger the nightly CI
As a new workflow was added for the cases where developers want to test
their changes in the workflow itself, let's make sure we stop allowing
manual triggers on this workflow, which can lead to a polluted /
misleading weather of the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:19:45 +02:00
Fabiano Fidêncio
cc093cdfdb workflows: Add a manually trigger "devel" workflow for the CI
This workflow is intended to replace the `workflow_dispatch` trigger
currently present as part of the `ci-nightly.yaml`.

The reasoning behind having this done in this way is because of our good
and old GHA behaviour for `pull_request_target`, which requires a PR to
be merged in order to check the changes in the workflow itself, which
leads to:
* when a change in a workflow is done, developers (should) do:
  * push their branch to the kata-containers repo
  * manually trigger the "nightly" CI in order to ensure the changes
    don't break anything
    * this can result in the "nightly" CI weather being polluted
      * we don't have the guarantee / assurance about the last n nightly
	runs anymore

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-23 13:14:50 +02:00
Fabiano Fidêncio
67a8665f51 workflows: Ensure shim-v2 is built as the last asset
By doing this we can ensure that whenever the rootfs changes, we'll be
able to get the new root_hash.txt and use it.

This is the very first step to bring the measured rootfs tests back.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-22 14:56:37 +02:00
Magnus Kulke
e27d70d47e ci: don't parse oci image for cached artifacts
Moved the parsing of the oci image marker into its own step, since we
only need to perform that for attestation purposes and some cached
images might not have that file in the tarball.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-21 14:50:00 +02:00
Magnus Kulke
9a33a3413b
Merge pull request #10433 from mkulke/mkulke/add-provenance-attestation-for-agent-builds
ci: add provenance attestation for agent artifact
2024-10-18 15:00:18 +02:00
Magnus Kulke
b93f5390ce ci: add provenance attestation for agent artifact
This adds provenance attestation logic for agent binaries that are
published to an oci registry via ORAS.

As a downstream consumer of the kata-agent binary the Peerpod project
needs to verify that the artifact has been built on kata's CI.

To create an attestation we need to know the exact digest of the oci
artifact, at the point when the artifact was pushed.

Therefore we record the full oci image as returned by oras push.

The pushing and tagging logic has been slightly reworked to make this
task less repetetive.

The oras cli accepts multiple tags separated by comma on pushes, so a
push can be performed atomically instead of iterating through tags and
pushing each individually. This removes the risk of partially successful
push operations (think: rate limits on the oci registry).

So far the provenance creation has been only enabled for agent builds on
amd64 and xs390x.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
2024-10-18 10:24:00 +02:00
Sumedh Alok Sharma
bc195d758a ci: Install build dependencies for building agent-ctl with image pull.
Adds dependencies of 'clang' & 'protobuf' to be installed in runners
when building agent-ctl sources having image pull support.

Fixes #10400

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-10-14 10:36:04 +05:30
Fabiano Fidêncio
01a957f7e1
ci: mariner: Stop building mariner initrd
As the mariner image is already in place, and the tests were modified to
use them (as part of this series), let's just stop building it as part
of the CI.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-10-09 18:23:35 +02:00
Dan Mihai
63c1f81c23 local-build: add rootfs-image-mariner
Kata CI will start testing the new rootfs-image-mariner instead of the
older rootfs-initrd-mariner image.

The "official" AKS images are moving from a rootfs-initrd-mariner
format to the rootfs-image-mariner format. Making the same change in
Kata CI is useful to keep this testing in sync with the AKS settings.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-08 17:15:56 +00:00
Dan Mihai
5aaef8e6eb
Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci
gha: enable AUTO_GENERATE_POLICY where needed
2024-10-04 10:52:31 -07:00
Lukáš Doktor
2440a39c50
ci: Check required lables before checking tests in gatekeeper
some tests require certain labels before they are executed. When our PR
is not labeled appropriately the gatekeeper detects skipped required
tests and reports a failure. With this change we add "required-labeles"
to the tests mapping and check the expected labels first informing the
user about the missing labeles before even checking the test statuses.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
fdcfac0641
workflows/gatekeeper: export COMMIT_HASH variable
The Github SHA of triggering PR should be exported in the environment
so that gatekeeper can fetch the right workflows/jobs.

Note: by default github will export GITHUB_SHA in the job's environment
but that value cannot be used if the gatekeeper was triggered from a
pull_request_target event, because the SHA correspond to the push
branch.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta
4abfc11b4f
workflows/gatekeeper: configure concurrency properly
This will allow to cancel-in-progress the gatekeeper jobs.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:35 +02:00
Lukáš Doktor
5c1cea1601
ci: Select jobs by touched code
to allow selective testing as well as selective list of required tests
let's add a mapping of required jobs/tests in "skips.py" and a
"gatekeaper" workflow that will ensure the expected required jobs were
successful. Then we can only mark the "gatekeaper" as the required job
and modify the logic to suit our needs.

Fixes: #9237

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-10-03 09:08:33 +02:00
Dan Mihai
1a4928e710 gha: enable AUTO_GENERATE_POLICY where needed
The behavior of Kata CI doesn't change.

For local testing using kubernetes/gha-run.sh:

1. Before these changes:
- AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP,
  TDX, or KATA_HOST_OS=cbl-mariner.

2. After these changes:
- Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify
  AUTO_GENERATE_POLICY=yes if they want to auto-generate policy.
- These users have the option to test just using hard-coded policies
  (e.g., using the default policy built into the Guest rootfs) by
  using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default
  value of this env variable.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-10-02 23:20:33 +00:00
Gabriela Cervantes
d91979d7fa
gha: Add ita_key as a github secret
This PR adds ita_key as a github secret at the kata coco tests yaml workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-27 17:15:22 +02:00
Sicheng Liu
e4733748aa ci: Enable basic docker tests for runtime-rs
This commit enables basic amd64 tests of docker for runtime-rs by adding
vmm types "dragonball" and "cloud-hypervisor".

Signed-off-by: Sicheng Liu <lsc2001@outlook.com>
2024-09-26 06:27:05 +00:00
Gabriela Cervantes
49b3a0faa3 gha: Increase timeout to run k8s tests on TDX
This PR increases the timeout to run k8s tests for Kata CoCo TDX
to avoid the random failures of timeout.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-19 17:15:47 +00:00
Fabiano Fidêncio
35c7f8d1ba ci: Bump ubuntu 20.04 runners to 22.04
Azure internal mirrors for Ubuntu 20.04 have gone awry, leading to a
situation where dependencies cannot be installed (such as
libdevmapper-dev), blocking then our CI.

Let's bump the runners to 22.04 regardless, even knowing it'll cause an
issue with the runk tests, as the agent check tests are considered more
crucial to the project at this point.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-19 12:29:20 +02:00
Hyounggyu Choi
72df3004e8 gha: Rebase build-secure-image-se atop of latest target branch
This commit adds a step called `Rebase atop of the latest target branch`
to the job named `build-asset-boot-image-se` which can test the PR properly.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-17 09:54:51 +02:00
Sumedh Alok Sharma
e1ac2f4416 ci: Enable kata agent api tests
This commit enables running tests for kata agent apis.
The 'api-tests' directory will contain bats test files for
individual APIs.

Fixes #10269

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-06 00:02:55 +05:30
Aurélien Bombo
cc9aeee81a
Merge pull request #10263 from Sumynwa/sumsharma/add_ci_workflow
ci: Add workflow to run kata-agent api tests using kata-agent-ctl
2024-09-05 09:32:34 -07:00
GabyCT
deb6d12ff6
Merge pull request #10237 from GabyCT/topic/k8soakcoco
tests: Enable k8s soak stability test for Kata CoCo CI
2024-09-05 09:56:48 -06:00
Sumedh Alok Sharma
ad66f4dfc9 ci: Add workflow to run kata-agent api tests using kata-agent-ctl
enable CI to add test cases for testing kata-agent APIs. This commit
introduces:
- a workflow to run tests
- setup scripts to prepare the test environment

Fixes #10262

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2024-09-05 14:38:29 +05:30
Hyounggyu Choi
1cefa48047 gha: Add necessary steps for KBS enablement
The following steps are required for enabling KBS:

- Set environment variables `KBS` and `KBS_INGRESS`
- Uninstall and install `kbs-client`
- Deploy KBS

This commit adds the above stpes to the existing workflow
for `qemu-coco-dev`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-09-03 16:26:12 +02:00
Fabiano Fidêncio
e8657c502d Revert "CI: Add tests for stdio"
This reverts commit 704da86e9b, as the
tests never became stable to run.

This was discussed and agreed with the maintainer.

 Conflicts:
	.github/workflows/basic-ci-amd64.yaml
	tests/integration/stdio/gha-run.sh

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-09-03 11:52:30 +02:00
Gabriela Cervantes
c2aa288498 gha: Increase time to run Kata CoCo stability tests
This PR increases the time to run the Kata CoCo stability tests as
this tests are design to run for more than 2 hours.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-09-02 16:40:47 +00:00
Fabiano Fidêncio
ef5a5ea26e
Merge pull request #10038 from sprt/move-free-runner-iii
ci: Transition GARM tests to free runners, pt. III
2024-08-31 01:29:08 +02:00
Aurélien Bombo
de98e467b4 ci: Use ubuntu-22.04 instead of ubuntu-latest
22.04 is the default today:
23da668261/README.md

Being more specific will avoid unexpected errors when Github updates the
default.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:44:39 +00:00
Aurélien Bombo
ceab66b1ce ci: Run build-checks-depending-on-kvm for free
Also keeps the Rust installation step even though it's preinstalled, so that we
use the version specified in versions.yaml.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:59 +00:00
Aurélien Bombo
b4ce84b9d2 ci: Move run-runk to free runner
No change other than switching the runner - no dependency issue
expected.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:33 +00:00
Aurélien Bombo
645aaa6f7f ci: Move run-monitor to free runner
No change other than switching the runner - no
dependency issue expected.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-27 16:43:33 +00:00
Gabriela Cervantes
3a14b04621 gha: Fix entry for ci coco stability yaml
This PR fixes the entry or use of the ci weekly GHA workflow
to run properly the weekly k8s tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-26 17:14:35 +00:00
Gabriela Cervantes
95f6246858 gha: Add GHA workflow to run Kata CoCo stability tests
This PR adds a GHA workflow to run Kata CoCo weekly stablity tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-26 17:05:21 +00:00
Fabiano Fidêncio
3fd021a9b3 ci: commit-message-check: Take re-revert into consideration
`Reapply "` should be taken into sonsideration as well.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2024-08-21 14:19:16 +02:00
GabyCT
e0bff7ed14
Merge pull request #10177 from GabyCT/topic/cocoghas
gha: Add k8s stability Kata CoCo GHA workflow
2024-08-20 15:12:29 -06:00
Gabriela Cervantes
ca3d778479 gha: Add Kata CoCo Stability workflow
This PR adds the Kata CoCo Stability workflow that will setup the
environment to run the k8s tests on a non-tee environment.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-20 16:34:33 +00:00
Gabriela Cervantes
3ebaa5d215 gha: Add Kata CoCo stability weekly yaml
This PR adds the Kata CoCo stability weekly yaml that will trigger
weekly the k8s stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-20 16:32:03 +00:00
Fabiano Fidêncio
e75c149dec
ci: stdio: Properly start running the test
"gha-run.sh" requires a `run` argument in order to run the tests, which
seems to be forgotten when the test was added.

This PR needs to get merged before the test can successfully run.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-17 14:41:44 +02:00
Gabriela Cervantes
6ea34f13e1 gha: Add k8s stability Kata CoCo GHA workflow
This PR adds the k8s stability Kata CoCo GHA workflow to run weekly
the k8s stability tests.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-08-16 16:14:15 +00:00
Steve Horsman
91084058ae
Merge pull request #10007 from wainersm/run_k8s_on_free_runners
ci: Transition GARM tests to free runners, pt. II
2024-08-12 18:12:18 +01:00
Wainer dos Santos Moschetta
374405aed1 workflows/run-k8s-tests-on-amd64: remove 'instance' from matrix
The jobs are all executed on ubuntu-22.04 so it's invariant and
can be removed from the matrix (this will shrink the jobs names).

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 16:00:39 -03:00
Wainer dos Santos Moschetta
d11ce129ac workflows: merge run-k8s-tests-on-garm and run-k8s-tests-with-crio-on-garm
Created the run-k8s-tests-on-amd64.yaml which is a merge of
run-k8s-tests-on-garm.yaml and run-k8s-tests-with-crio-on-garm.yaml

ps: renamed the job from 'run-k8s-tests' to 'run-k8s-tests-on-amd64' to
it is easier to find on Github UI and be distinguished from s390x,
ppc64le, etc...

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:50:43 -03:00
Wainer dos Santos Moschetta
ed0732c75d workflows: migrate run-k8s-tests-with-crio-on-garm to free runners
Switch to Github managed runners just like the run-k8s-tests-on-garm
workflow.

See: #9940
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta
3d053a70ab workflows: migrate run-k8s-tests-on-garm to free runners
Switched to Github managed runners. The instance_type parameter was
removed and K8S_TEST_HOST_TYPE is set to "all" which combine the
tests of "small" and "normal". This way it will reduze to half of
the jobs.

See: #9940
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-08-07 15:20:42 -03:00
Archana Shinde
ba884aac13 ci: Enable nerdctl tests for clh
A recent fix should resolve some the issues seen earlier with clh
with the go runtime. Enabling this test to check if the issue is still
seen.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-08-06 10:41:42 -07:00
Fabiano Fidêncio
62a086937e
ci: Remove jobs that are not running
When re-enabling those we'll need a smart way to do so, as this limit of
20 workflows referenced is just ... weird.

However, for now, it's more important to add the jobs related to the new
platforms than keep the ones that are actively disabled.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-03 09:24:05 +02:00
Fabiano Fidêncio
ed57ef0297
ci; aarch64: Enable builders as part of the CI
As we have new runners added, let's enable the builders so we can
prevent build failures happening after something gets merged.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-02 14:13:53 +02:00
Fabiano Fidêncio
388b5b0e58
Revert "ci: Temporarily remove arm64 builds"
This reverts commit e9710332e7, as there
are now 2 arm64-builders (to be expanded to 4 really soon).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-02 13:53:50 +02:00
Fabiano Fidêncio
e9710332e7 ci: Temporarily remove arm64 builds
It's been a reasonable time that we're not able to even build arm64
artefacts.

For now I am removing the builds as it doesn't make sense to keep
running failing builds, and those can be re-enabled once we have arm64
machines plugged in that can be used for building the stuff, and
maintainers for those machines.

The `arm-jetson-xavier-nx-01` is also being removed from the runners.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-08-01 13:30:47 +02:00
Fupan Li
230aefc0da
Merge pull request #10070 from BbolroC/qemu-runtime-rs-k8s-s390x
GHA: Run k8s e2e tests for qemu-runtime-rs on s390x
2024-07-31 18:41:11 +08:00
Hyounggyu Choi
e135d536c5 gha: Restore cleanup-zvsi for s390x
In #10096, a cleanup step for kata-deploy is removed by mistake.
This leads to a cleanup error in the following `Complete job` step.

This commit restores the removed step to resolve the current CI failure on s390x.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-31 06:42:16 +02:00
Hyounggyu Choi
8d529b960a gha: Eradicate {pre,post}-action steps for s390x runners
As suggested in #9934, the following hooks have been introduced for s390x runners:

- ACTIONS_RUNNER_HOOK_JOB_STARTED
- ACTIONS_RUNNER_HOOK_JOB_COMPLETED

These hooks will perfectly replace the existing {pre,post}-action scripts.
This commit wipes out all GHA steps for s390x where the actions are triggered.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-30 17:10:19 +02:00
Hyounggyu Choi
d8cac9f60b GHA: Run k8s e2e tests for qemu-runtime-rs on s390x
This commit adds a new CI job for qemu-runtime-rs to the existing
zvsi Kubernetes test matrix.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-07-25 08:11:49 +02:00
ChengyuZhu6
2b44e9427c gha: Increase timeout to run CoCo tests
This PR increases the timeout for running the CoCo tests to avoid random failures.
These failures occur when the action `Run tests` times out after 30 minutes, causing the CI to fail.

Fixes: #10062

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-24 12:31:38 +08:00