Commit Graph

54 Commits

Author SHA1 Message Date
Aurélien Bombo
1de466fe84 temp: ci: Fix AKS cluster creation
The AKS CLI recently introduced a regression that prevents using
aks-preview extensions (Azure/azure-cli#31345), and hence create
CI clusters.

To address this, we temporarily hardcode the last known good version of
aks-preview.

Note that I removed the comment about this being a Mariner requirement,
as aks-preview is also a requirement of AKS App Routing, which will
be introduced soon in #11164.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-04-24 15:06:14 -05:00
Fabiano Fidêncio
f7976a40e4 tests: Create a helm_helper() common function
Let's use what we have in the k8s functional tests to create a common
function to deploy kata containers using our helm charts.  This will
help us immensely in the kata-deploy testing side in the near future.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-26 13:30:11 +01:00
Dan Mihai
e8405590c1 ci: temporarily avoid using the Mariner Host image
Disable the Mariner host during CI, while investigating test failures
with new Cloud Hypervisor v43.0.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-03-10 20:15:09 +00:00
Fabiano Fidêncio
50f765b19c shellcheck: tests: Fix gha-run-k8s-common.sh warnings
Let's fix all the warnings caught in this file, as we're already
touching it.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-05 19:44:27 +01:00
Fabiano Fidêncio
219db60071 tests: kata-deploy: microk8s: Re-work installation
So we can ensure that the user has enough permissions to access
microk8s.

Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
2025-03-05 19:44:27 +01:00
Stephane Talbot
f80e7370d5 test: Verify deployement of kata-deploy on microk8s
Enable fonctional test to verify deployment of kata-deploy on a Microk8s cluster

Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>
2025-02-28 10:10:29 +01:00
stevenhorsman
d08787774f ci: k8s: Use pinned k0s version
Update the code to install the version of k0s
that we have in our versions.yaml, rather than
just installing the latest, to help our CI being
less stable and prone to breaking due to things
we don't control.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-27 11:33:23 +00:00
stevenhorsman
fd7bcd88d0 ci: k8s: Bump kcli image version
When trying to deploy nydus on kcli locally we get the
following failure:
```
root@sh-kata-ci1:~# kubectl get pods -n nydus-system
NAMESPACE                   NAME                                          READY   STATUS              RESTARTS      AGE
nydus-system                nydus-snapshotter-5kdqs                       0/1     CrashLoopBackOff    4 (84s ago)   7m29s
```
Digging into this I found that the nydus-snapshotter service
is failing with:
```
ubuntu@kata-k8s-worker-0:~$ journalctl -u nydus-snapshotter.service
-- Logs begin at Wed 2025-02-12 15:06:08 UTC, end at Wed 2025-02-12 15:20:27 UTC. --
Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: Started nydus snapshotter.
Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required b>
Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc:
/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required b>
Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: nydus-snapshotter.service: Main process exited, code=exited, status=1/FAILURE
```
I think this is because 20.04 has version:
```
ubuntu@kata-k8s-worker-0:~$ ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.16) 2.31
```
so it's too old for the nydus snapshotter.
Also 20.04 is EoL soon, so bumping is better.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-02-12 15:38:18 +00:00
Wainer dos Santos Moschetta
badc208e9a tests/gha-run-k8s-common: shorten AKS cluster name
Because az client restricts the name to be less than 64 characters. In
some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name
will exceed the limit. This changed the function to shorten the name:

* SHA1 is computed from metadata then compound the cluster's name
* metadata as plain-text are passed as --tags

Fixes: #9850
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2025-01-08 16:39:07 -03:00
stevenhorsman
3f5bf9828b tests: k8s: Update bats
We've seen some issues with tests not being run in
some of the Coco CI jobs (Issue #10451) and in the
envrionments that are more stable we noticed that
they had a newer version of bats installed.

Try updating the version to 1.10+ and print out
the version for debug purposes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-11-04 09:59:33 +00:00
Gabriela Cervantes
f0e0c74fd4 gha: Use a arch_to_golang variable to have uniformity
This PR replaces the arch uname -m to use the arch_to_golang
variable in the script to have a better uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-15 20:03:09 +00:00
stevenhorsman
c0d35a66aa ci: kata-deploy: Update kubectil install URL
The `deploy_k0s` and `deploy_k3s` kubectl installs aren't failing
yet, but let get ahead of this and bump them as well

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-09-17 15:35:42 +01:00
Aurélien Bombo
a3dba3e82b ci: reinstate Mariner host
GH-9592 addressed a bug in a previous version of the AKS Mariner host
kernel that blocked the CH v39 upgrade. This bug has now been fixed so
we undo that PR.

Note we also specify a different OCI version for Mariner as it differs
from Ubuntu's.

Fixes: #9594

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-08-26 21:07:25 +00:00
ChengyuZhu6
489afffd8c tests:gha: delete namespace before resetting namespace
Delete the kata-containers-k8s-tests namespace before resetting the namespace
to ensure that no deployments or services are restarting and creating pods in the default namespace.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Wang, Arron <arron.wang@intel.com>
2024-07-10 12:08:28 +08:00
ChengyuZhu6
e874c8fa2e tests: Delete test scripts forcely
Delete test scripts forcely in `Delete kata-deploy` step before
deleting all kata pods.

Fixes: #9980

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2024-07-10 12:08:28 +08:00
Beraldo Leal
c99ba42d62 deps: bumping yq to v4.40.7
Since yq frequently updates, let's upgrade to a version from February to
bypass potential issues with versions 4.41-4.43 for now. We can always
upgrade to the newest version if necessary.

Fixes #9354
Depends-on:github.com/kata-containers/tests#5818

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2024-05-31 13:28:34 -04:00
Fabiano Fidêncio
25c9cf32ff
Revert "ci: azure: Workaround azure cli installation script"
This reverts commit 5ff53e4d1c, as the
script was fixed by MSFT, at least according to:
https://github.com/Azure/azure-cli/issues/28984

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-20 14:38:46 +02:00
Fabiano Fidêncio
5ff53e4d1c
ci: azure: Workaround azure cli installation script
This is done in order to work around
https://github.com/Azure/azure-cli/issues/28984, following a suggestion
on the very same issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 20:28:24 +02:00
Gabriela Cervantes
f20a44bba3 gha: Fix indentation in gha run k8s common
This PR fixes the indentation in gha run k8s common script
to have uniformity across the script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-13 20:07:47 +00:00
Fabiano Fidêncio
9713558477
k0s: Use a different port for kube-route's metrics
kube-router decided to use :8080 for its metrics, and this seems to be a
change that affected k0s 1.30.0+, leading to kube-router pod crashing
all the time and anything can actually be started after that.

Due to this issue, let's simply use a different port (:9999) and move on
with our tests.

Fixes: #9623

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-11 23:18:20 +02:00
Aurélien Bombo
0cc2b07a8c tests: adapt Mariner CI to unblock CH v39 upgrade
The CH v39 upgrade in #9575 is currently blocked because of a bug in the
Mariner host kernel. To address this, we temporarily tweak the Mariner
CI to use an Ubuntu host and the Kata guest kernel, while retaining the
Mariner initrd. This is tracked in #9594.

Importantly, this allows us to preserve CI for genpolicy. We had to
tweak the default rules.rego however, as the OCI version is now
different in the Ubuntu host. This is tracked in #9593.

This change has been tested together with CH v39 in #9588.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-05-03 16:29:12 +00:00
Wainer dos Santos Moschetta
4f74617897 tests: pass --overwrite-existing to aks get-credentials
By passing --overwrite-existing to `aks get-credentials` it will stop
asking if I want to overwrite the existing credentials. This is handy
for running the scripts locally.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-11 15:31:40 -03:00
Gabriela Cervantes
6d85025e59 test/k8s: Add basic attestation test
- Add basic test case to check that a ruuning
pod can use the api-server-rest (and attestation-agent
and confidential-data-hub indirectly) to get a resource
from a remote KBS

Fixes #9057

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Co-authored-by: Linda Yu <linda.yu@intel.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-04-08 11:38:53 +01:00
Saul Paredes
f20caac1c0 gha: ensure unique resource group name
There's an rg name duplication situation that got introduced by #9385
where 2 different test runs might have same rg name.

Add back uniqueness by including the first letter of GENPOLICY_PULL_METHOD to
cluster name.

Fixes: #9412

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2024-04-04 13:13:32 -07:00
Gabriela Cervantes
73f27e28d1 gha: Define GH_PR_NUMBER variable in gha run k8s common script
This PR defines the GH_PR_NUMBER variable in gha run k8s common
script to avoid failures like unbound variable when running
locally the scripts just like the GHA CI.

Fixes #9408

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-04-03 18:25:00 +00:00
Wainer dos Santos Moschetta
2c24977cb1 tests/k8s: allow to overwrite the cluster name
_print_cluster_name() create a string based information like the
pull request number and commit SHA. However, when you are developing the
scripts you might want to use an arbitrary name, so it was introduced
the $AKS_NAME variable that once exported it will overwrite the
generated name.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
5e4b7bbd04 tests/k8s: expose KBS service externally
Until this point the deployed KBS service is only reachable from within
the cluster. This introduces a generic mechanism to apply an Ingress
configuration to expose the service externally.

The first implemened ingress is for AKS. In case the HTTP application
routing isn't enabled in the cluster (this is required for ingress), an
add-on is applied.

It was added the get_cluster_specific_dns_zone() and
enable_cluster_http_application_routing() helper functions
to gha-run-k8s-common.sh.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta
6a28c94d99 tests/k8s: add a kustomize installer
Kustomize has been used on some of our internal components (e.g.
kata-deploy) to manage k8s deployments. On CI it has been used
the `sed` tool to edit kustomization.yaml files, but `kustomize` is
more suitable for that purpose. So in order to use that tool on CI
scripts in the future, this commit introduces the `install_kustomize()`
function that is going to download and install the binary in
/usr/local/bin in case it's found on $PATH.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-03-02 12:39:26 -03:00
Dan Mihai
1b4ef672ef tests: k8s: reduce namespace name duplication
1. Avoid repeating "kata-containers-k8s-tests".
2. Allow users to specify a different test namespace.
3. Introduce the TEST_CLUSTER_NAMESPACE variable, that will also be
   useful when auto-generating the Agent Policy for these tests.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2024-02-07 02:23:38 +00:00
Xuewei Niu
fa01a86334
Merge pull request #9007 from wainersm/aks_delete_rg
gha: delete azure RG only if it exists
2024-02-04 16:34:17 +08:00
Wainer dos Santos Moschetta
a04b215bcc gha: delete azure RG only if it exists
delete_cluster() has tried to delete the az resources group regardless
if it exists. In some cases the result of that operation is ignored,
i.e., fail to resource group not found, but the log messages get a
little dirty. Let's delete the RG only if it exists then.

Fixes #8989
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2024-02-02 16:57:20 -03:00
Aurélien Bombo
0ace31f041 ci: aks: switch from eastus2 to eastus region
This addresses an internal AKS issue that intermittently prevents
clusters from getting created. The fix has been rolled out to eastus but
not yet eastus2, so we unblock the CI by switching. No downsides in
general.

This supersedes #8990.

Fixes: #8989

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2024-02-01 19:22:42 +00:00
Fabiano Fidêncio
448c0aaecb
gha: azure: Set the correct subscription to the account
Due to the changes done in the CI, we need to set the correct
subscription to be used with the account from now on, otherwise we'd end
up using CoCo subscription.

Fixes: #8946

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-01-29 15:00:38 +01:00
Hyounggyu Choi
40f0c8fbb7 GHA: Use --client=true for k3s kubectl version
This is to fix a broken usage for `k3s kubectl version` by switching
an option `--short` to `--client=true`.

Fixes: #8621

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-12-11 08:26:39 +01:00
Fabiano Fidêncio
455b7bf776 gha: k3s: Avoid unnecessary escape
There's no reason to escape the first + on the +k3s[0-9]\+ regex, as
shown here:
```sh
ubuntu@k3s:~$ /usr/local/bin/k3s kubectl version --short 2>/dev/null | \
	grep "Client Version" | \
	sed \
		-e 's/Client Version: //' \
		-e 's/+k3s[0-9]\+//'
v1.27.7
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 08:42:25 +01:00
Fabiano Fidêncio
e7890ee8f6 gha: Fix regex used to get kubectl version from the k3s version
It seems that with the new k3s release, they've bumped their kubectl
version from x.y.z+k3s1 to x.y.z+k3s2.

Let's ensure our regexp is more generic and future proof for such
changes.

Fixes: #8410

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-11-09 07:08:02 +01:00
Wainer dos Santos Moschetta
89bef7d036 ci: k8s: create k8s clusters with kcli
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.

Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2023-10-09 11:05:40 -03:00
Fabiano Fidêncio
70e7ec3e23 gha: Fix k0s deployment
The tests are failing when setting up k0s, and that happens because we
download a kubectl binary matching the kubernetes version k0s is using,
and we do that by:
```
sudo k0s kubectl version --short 2>/dev/null | ...
```

With kubectl 1.28, which is now the default on k0s, `kubectl version
--short` has been removed, leading us to an empty stringm causing then
the error in the CI.

Fixes: #8105

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-10-03 17:21:40 +02:00
Fabiano Fidêncio
de1eeee334 ci: Create a generic install_crio function
This will serve us quite will in the upcoming tests addition, which will
also have to be executed using CRi-O.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-27 11:26:13 +02:00
Fabiano Fidêncio
74c12b2927 ci: crio: Enable default capabilities
We need the default capabilities to be enabled, especially `SYS_CHROOT`,
in order to have tests accessing the host to pass.

A huge thanks to Greg Kurz for spotting this and suggesting the fix.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
ebaa4fa4c1 ci: crio: Pass -y to apt
That was something overlooked during my tests. :-/

Fixes: #8005

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-25 14:56:15 +02:00
Fabiano Fidêncio
07a6e63a6b ci: k8s: rke2: Use sudo to call systemd
Otherwise we'll face the following error:
```
Failed to enable unit: Interactive authentication required.
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 08:48:29 +02:00
Fabiano Fidêncio
d7105cf7a4 ci: k8s: Add a method to install CRI-O
This is based on official CRI-O documentations[0] and right now we're
making this specific to Ubuntu as that's what we have as runners.

We may want to expand this in the future, but we're good for now.

[0]:
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
54c0a471b1 ci: k8s: k0s: Allow passing parameters to the k0s installer
We'll need this in order to setup k0s with a different container engine.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-20 00:59:09 +02:00
Fabiano Fidêncio
5560e72024
Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support
ci: kata-deploy: Enable all k8s flavours that we support
2023-09-19 17:44:34 +02:00
Fabiano Fidêncio
2c908b598c ci: kata-deploy: Add the ability to deploy rke2
This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
eaf6164916 ci: kata-deploy: Add the ability to deploy k0s
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 13:38:10 +02:00
Fabiano Fidêncio
09cc0ed438 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Aurélien Bombo
d9ef1352af ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum amount of characteres allowed for the cluster
name.  With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:58 +02:00
Aurélien Bombo
68267a3996 ci: Create clusters in individual resource groups
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.

This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.

The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.

Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.

Fixes: #7982

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2023-09-19 10:23:55 +02:00