Commit Graph

627 Commits

Author SHA1 Message Date
Steve Horsman
c9df743dab Merge pull request #9898 from sprt/gha-cleanup-job
ci: Add scheduled job to cleanup resources, pt. I
2024-06-25 19:11:30 +01:00
Aurélien Bombo
d60b548d61 ci: Add scheduled job to cleanup resources
This is the first part of adding a job to clean up potentially dangling
Azure resources. This will be based on Jeremi's tool from
https://github.com/jepio/kata-azure-automation.

At first, we'll only clean up AKS clusters, as this is what has been
causing us problems lately, but this could very well be extended to
cleaning up entire resource groups, which is why I left the different
names pretty generic (i.e. "resources" instead of "clusters").

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-06-25 16:33:03 +00:00
stevenhorsman
d8961cbd4a workflow: coco: Add auth registry secret
- Add the `AUTHENTICATED_IMAGE_USER` and
`AUTHENTICATED_IMAGE_PASSWORD` repository secrets as env vars
to the coco tests, so we can use them to pull an images from
and authenticated registry for testing

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-06-25 11:11:02 +01:00
Wainer Moschetta
bcdc4fde10 Merge pull request #9857 from wainersm/disable_failing_jobs-part2
CI: disable jobs that failed >= 50% on nightly CI recently - part 2
2024-06-24 10:11:05 -03:00
Fabiano Fidêncio
3adf9e250f Merge pull request #9875 from zvonkok/gha-no-sudo-arm64
ci: gha no sudo arm64
2024-06-21 15:28:54 +02:00
Fabiano Fidêncio
2d552800f2 Merge pull request #9876 from zvonkok/gha-no-sudo-s390x
ci: remove sudo from s390x build
2024-06-21 15:00:31 +02:00
GabyCT
9320c2e484 Merge pull request #9845 from GabyCT/topic/fixartifacts
gha: Do not fail when collecting artifacts
2024-06-20 10:15:53 -06:00
Fabiano Fidêncio
2bab0f31d7 ci: tdx: Re-enable TDX CI
Now, using vanilla kubernetes, let's re-enable the TDX CI and hope it
becomes more stable than it used to be.

The cleanup-snapshotter is now taking ~4 minutes, and that matches with
the other platforms, mainly considering there's a sum of 210 seconds
sleep in the process.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 20:08:28 +02:00
Fabiano Fidêncio
7127178acc ci: tdx: Use vanilla k8s instead of k3s
We've noticed a bunch of issues related to deploying and deleting the
nydus-snapshotter.  As we don't see the same issues on other machines
using vanilla kubernetes, let's avoid using k3s for now follow the flow.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-19 16:56:15 +02:00
Zvonko Kaiser
d783ddaf03 ci: Remove not needed chown for ppc64
Now that all artifacts are owned by $USER no extra step needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:56:45 +00:00
Zvonko Kaiser
5bc37e39d5 ci: remove sudo from ppc64 build
We can now do the same for ppc64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:55:45 +00:00
Zvonko Kaiser
c341234c0b ci: remove sudo from s390x build
We can now do the same for s390x that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:53:33 +00:00
Zvonko Kaiser
3beb460a97 ci: Remove not needed chown for arm64
Now that all artifacts are owned by $USER no extra step needed
to adjust ownership

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:48:00 +00:00
Zvonko Kaiser
445b389b16 ci: remove sudo from arm64 build
We can now do the same for arm64 that we did for amd64 and remove
the sudo cp.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-19 07:46:51 +00:00
Gabriela Cervantes
eeb467bdc2 gha: Do not fail when collecting artifacts
This PR will avoid the failures when collecting artifacts for the gha.
This will ensure that we collect and archive system's data for the
purpose of debugging.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-06-18 16:05:23 +00:00
Fabiano Fidêncio
587f4d45de ci: tdx: Disable TDX CI
TDX CI has been having some issues with the Nydus snapshotter cleanup,
which has been stuck for hours depending every now and then.

With this in mind, let's disable the TDX CI, so we avoid it blocking the
progress of Kata Containers project, and we re-enable it as soon as we
have it solved on Intel's side.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-06-18 10:30:40 +02:00
Wainer Moschetta
7df221a8f9 Merge pull request #9833 from wainersm/qemu-rs_tests
tests/k8s: run for qemu-runtime-rs on AKS
2024-06-17 16:59:46 -03:00
Wainer dos Santos Moschetta
08eaa60b59 CI: disable all run-kata-deploy-tests-on-garm jobs
The following jobs have failed more than 50% on nightly CI.

run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, k0s)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, rke2)
run-kata-deploy-tests-on-garm / run-kata-deploy-tests (qemu, k0s)

Instead of removing only those jobs, let's skip the kata-deploy-tests
on GARM completely so we can try to fix all the issues (or maybe
drop the jobs altogether).

Issue: #9854
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-17 14:39:38 -03:00
Zvonko Kaiser
5c2f3f34a8 CI: remove sudo from GHA
Now that all artifacts are owned by $USER we can start
to remove sudo from our GHA

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-17 11:06:56 +00:00
Wainer dos Santos Moschetta
d4f664b73b CI: disable run-kata-monitor-tests / run-monitor (containerd, lts) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: #9853
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:27:04 -03:00
Wainer dos Santos Moschetta
cbf0b7ca7b CI: disable run-basic-amd64-tests / run-nerdctl-tests (clh) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: #9852
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:17:26 -03:00
Wainer dos Santos Moschetta
562820449e CI: disable run-basic-amd64-tests / run-vfio (qemu) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

The clh variation was disabled on commit 5f5274e699 so this change will
actually result on all the VFIO jobs disabled. Instead of delete the entire
entry from this workflow yaml (or comment the entry), I preferred to use
`if: false` which will make the jobs appear on the UI as skipped.

Issue: 9851
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-14 16:09:59 -03:00
Wainer dos Santos Moschetta
73ab5942fb tests/k8s: run for qemu-runtime-rs on AKS
The following tests are disabled because they fail (alike with dragonball):

- k8s-cpu-ns.bats
- k8s-number-cpus.bats
- k8s-sandbox-vcpus-allocation.bats

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-13 16:20:59 -03:00
Wainer dos Santos Moschetta
be9990144a workflow: run kata-deploy tests to qemu-runtime-rs on AKS
Start testing the ability of kata-deploy to install and configure
the qemu-runtime-rs runtimeClass.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-06-11 12:58:47 -03:00
Hyounggyu Choi
869f89c338 Merge pull request #9773 from BbolroC/use-qemu-coco-dev-s390x
GHA: Use qemu-coco-dev for k8s nydus test on s390x
2024-06-04 17:49:38 +02:00
Wainer Moschetta
2b8cdd9ff2 Merge pull request #9765 from wainersm/disable_failing_jobs
CI: disable jobs that failed > 50% on nightly CI recently - part 1
2024-06-04 12:05:36 -03:00
Hyounggyu Choi
246ee83768 GHA: Use qemu-coco-dev for k8s nydus test on s390x
In line with the changes for x86_64, the k8s nydus test for s390x should
also use `qemu-coco-dev` for `KATA_HYPERVISOR`.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-06-04 15:49:23 +02:00
Wainer dos Santos Moschetta
5f5274e699 CI: disable run-basic-amd64-tests / run-vfio (clh) job
The job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: 9764
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:34:45 -03:00
Wainer dos Santos Moschetta
9154ce9051 CI: disable run-basic-amd64-tests / run-tracing jobs
These jobs have failed more than 50% on nightly CI. Remove them from the list of
execution until we don't have a fix.

Issue: 9763
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:26:58 -03:00
Wainer dos Santos Moschetta
ac4d48ad17 CI: disable run-kata-monitor-tests / run-monitor (qemu, containerd) job
This job has failed more than 50% on nightly CI. Remove it from the list of
execution until we don't have a fix.

Issue: 9761
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-05-31 19:21:21 -03:00
Beraldo Leal
4f6732595d ci: skip go version check
golang.mk is not ready to deal with non GOPATH installs. This is
breaking test on s390x.

Since previous steps here are installing go and yq our way, we could
skip this aditional check. A full refactor to golang.mk would be needed
to work with different paths.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Fabiano Fidêncio
8879e3bc45 Merge pull request #9452 from GabyCT/topic/tdxcoco
gha: Add support to install KBS to k8s TDX GHA workflow
2024-05-20 23:28:52 +02:00
stevenhorsman
f271983aeb gha: release: Set inherit secrets on tarball builds
Now we have updated the release builds to push
artefacts to
our registry for the release, so we can cache the images, we need to
set `secrets: inherit` for all architecture's tarball builds
so that we can log into quay.io and ghcr in those steps

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-20 14:19:17 +01:00
stevenhorsman
f7fd2f9a5d workflow: Fix problems with build-asset workflows
- It appears like the `if` isn't required when setting env as a
conditional
- `inputs.stage` over input.stage
- Swap matrix.component to matrix.asset

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 11:51:46 +01:00
stevenhorsman
9999971656 release: Move component's don't ship logic
- We don't want to ship certain components (agent, coco-guest-components)
as part of the release, but for other consumers it's useful to be able to pull in the components
from oras, so rather than not building them, just don't upload it as part of the release.
- Also make the archs all consistent on not shipping the agent

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
stevenhorsman
040e6cdf12 gha: release: Set RELEASE env
- Set RELEASE env to 'yes', or 'no', based on if the stage
passed in was 'release', so we can use it in the build scripts

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
stevenhorsman
d93156d84d gha: release: Push artifacts to registry on release
For other projects (e.g. CoCo projects) being able to
access the released versions of components is helpful,
so push these during the release process

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-15 16:55:55 +01:00
Fabiano Fidêncio
12dc9f83df ci: Stop building TDX specific QEMU and OVMF
This is the first step of the work to start relying on the artefacts
coming from the distros (CentOS 9 Stream, and Ubuntu) themselves.

Let's have this first one merged, as this will not run the CI due to the
changes being on the yaml itself, and then follow-up with the changes
needed on other parts of the project (kata-deploy, runtime, etc).

Fixes: #9590 -- part I

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-08 11:39:32 +02:00
Gabriela Cervantes
b54dc26073 gha: Enable uninstall kbs client function for coco gha workflow
This PR enables the uninstall kbs client function for coco gha tdx
workflow.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:55:24 +00:00
Gabriela Cervantes
aaf9b54d97 gha: Add support to install KBS to k8s TDX GHA workflow
This PR adds support to install KBS to k8s TDX GHA workflow in
order to run confidential attestation tests.

Fixes #9451

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-06 15:42:17 +00:00
Fabiano Fidêncio
f04a7a55ed Merge pull request #9563 from fidencio/topic/agent-use-policy-by-default
build: Build the shipped agent with policy enabled
2024-05-01 12:22:05 +02:00
Wainer Moschetta
eae429a39b Merge pull request #9552 from wainersm/kata_cc_dev
runtime: new qemu-coco-dev configuration
2024-04-30 05:21:49 -03:00
stevenhorsman
0bec8721cc workflow: Skip commit checks for dependabout
Dependabot doesn't follow all our commit format guidelines,
so add a check and skip these if the author is `dependabot[bot]`

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-04-29 13:45:51 +01:00
Wainer dos Santos Moschetta
631f6f6ed6 gha: switch CoCo tests on non-TEE to use qemu-coco-dev
With the addition of the 'qemu-coco-dev' runtimeClass we no longer need
to run CoCo tests on non-TEE environments with 'qemu'. As a result the
tests also no longer need to set the "io.katacontainers.config.hypervisor.image"
annotation to pods.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-29 05:45:11 -03:00
Fabiano Fidêncio
d3b300ff95 build: tests: Remove agent-opa
Now that the `kata-agent` is being built with policy support, let's stop
building the `kata-opa-agent`, reducing the amount of things we need to
test and maintain.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-04-28 12:52:54 +02:00
Hyounggyu Choi
80cb4a6c18 build: Update golang version to 1.22.2
As we have an issue with a golang version for `run-cri-containerd`,
it is required to bump the language.

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-26 15:50:29 +02:00
Hyounggyu Choi
608df9b7df Merge pull request #9494 from BbolroC/guest-pull-gha-s390x
CC: Enable guest-pull tests on non-TEE for s390x
2024-04-23 21:22:37 +02:00
Hyounggyu Choi
f10744df99 CC: Enable guest-pull tests on non-TEE for s390x
This commit is to add a new CI job to run-k8s-tests-on-zvsi.yaml.
Why the job is not configured in run-kata-coco-tests.yaml by having it
integrated with `run-k8s-tests-coco-nontee` is:

- It uses k3s instead of AKS
- It runs on a self-hosted runner

These differences make the integrated job not easy to read and maintain
when it comes to incorporating other platforms in the near future.

Fixes: #9467

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-04-22 17:15:20 +02:00
Wainer dos Santos Moschetta
1e35291fd5 gha: move attestation tests to run-k8s-tests-coco-nontee
The new run-k8s-tests-coco-nontee job should be the home of attestation
tests.

Changed run-k8s-tests-coco-nontee to get KBS installed and by the time the
KBS variable is exported in the environment then the attestation tests
will kick in (likewise they will skip in run-k8s-tests-on-aks).

Fixes #9455
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-19 14:51:30 -03:00
Amulya Meka
12964256a4 Merge pull request #9521 from Amulyam24/gha
gha: tag k8s tests on ppc64le to ppc64le-runner-01
2024-04-19 15:08:08 +05:30