Commit Graph

99 Commits

Author SHA1 Message Date
Greg Kurz
2cda69b284 release: Adapt kata-deploy for 3.2.0
kata-deploy files must be adapted to a new release.  The cases where it
happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-20 18:12:29 +02:00
Greg Kurz
93c7d165dc ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat
Fixes #8259

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 36109da93f)
Signed-off-by: Greg Kurz <groug@kaod.org>
2023-10-20 00:44:12 +02:00
Wainer dos Santos Moschetta
a195539307 ci: k8s: set KUBERNETES default value
The KUBERNETES variable is mostly used by kata-deploy whether to apply
k3s specific deployments or not. It is used to select the type of
kubernetes to be installed (k3s, k0s, rancher...etc) and it is always
set on CI. Running the script locally we want to set a value by default
to avoid `KUBERNETES: unbound variable` errors.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit e669282c25)
2023-10-11 10:36:03 +02:00
Wainer dos Santos Moschetta
c4456c21d9 tests: run k8s-volume on a given node
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit c30c3ff185)
2023-10-11 10:35:58 +02:00
Wainer dos Santos Moschetta
58ad833300 tests: run k8s-file-volume on a given node
This test can give false-positive on a multi-node cluster. Changed it to
use the new get_one_kata_node() and the modified exec_host() to run the
setup commands on a given node (that has kata installed) and ensure the
test pod is scheduled at that same node.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 666993da8d)
2023-10-11 10:35:52 +02:00
Wainer dos Santos Moschetta
a54bdd00d5 tests: exec_host() now gets the node name
The exec_host() simply fails on cluster with multi-nodes because
`kubectl get node -o name" will return a list o names. Moreover, it will
return control nodes names which usually don't have kata installed.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 3a00fc9101)
2023-10-11 10:35:43 +02:00
Wainer dos Santos Moschetta
0eaf81c1a2 tests: add get_one_kata_node() to tests_common.sh
The introduced get_one_kata_node() returns the first node that
has the kata-runtime=true label, i.e., supposedly a node with
kata installed.

This is useful for tests that should run on a determined worker
node on a multi-nodes cluster.

Fixes #7619
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 61c9c17bff)
2023-10-11 10:35:34 +02:00
Wainer dos Santos Moschetta
5f2c7c78ff ci: k8s: set KATA_HYPERVISOR default value
Let KATA_HYPERVISOR be qemu by default in gh-run.sh as this variable
is required to tweak some configurations of kata-deploy.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 68f083c4d0)
2023-10-11 10:35:28 +02:00
Wainer dos Santos Moschetta
7fceb21362 ci: k8s: configurable deploy kata timeout
The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata
deploy installation finish. This allow users of the script to overwrite
that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment
variable.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 6677a61fe4)
2023-10-11 10:35:20 +02:00
Wainer dos Santos Moschetta
c4b0f1f31b ci: k8s: shellcheck fixes to gha-run.sh
Fixed a couple of warns shellcheck emitted and disabled others:
 * SC2154 (var is referenced but not assigned)
 * SC2086 (Double quote to prevent globbing and word splitting)

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 200e542921)
2023-10-11 10:35:10 +02:00
Wainer dos Santos Moschetta
5cd2e947dc ci: k8s: run_tests() for kcli
The only difference to the other platforms is that it needs to
export KUBECONFIG.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit d54e6d9cda)
2023-10-11 10:34:53 +02:00
Wainer dos Santos Moschetta
56cebfb485 ci: k8s: add deploy-kata-kcli() to gh-run.sh
The cleanup-kcli() behaves like other deploy kata for
bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG
should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit c2ef1f0fb0)
2023-10-11 10:34:46 +02:00
Wainer dos Santos Moschetta
6b76d21568 ci: k8s: add cleanup-kcli() to gha-run.sh
The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev,
tdx...etc) except that KUBECONFIG should be exported.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit d2be8eef1a)
2023-10-11 10:34:40 +02:00
Wainer dos Santos Moschetta
308ce26438 ci: k8s: set default image for deploy_kata()
On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and
DOCKER_TAG are exported to match the built image. However, when running
the script outside of CI context, a developer might just use the latest
image which in this case will be
`quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit cbb9aa15b6)
2023-10-11 10:34:33 +02:00
Wainer dos Santos Moschetta
c3b91ed394 ci: k8s: create k8s clusters with kcli
Adapted the gha-run.sh script to create a Kubernetes cluster locally
using the kcli tool.

Use `./gha-run.sh create-cluster-kcli` to create it, and
`./gha-run.sh delete-cluster-kcli` to delete.

Fixes #7620
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
(cherry picked from commit 89bef7d036)
2023-10-11 10:34:27 +02:00
Fabiano Fidêncio
48a9b4ab13 ci: crio: Trail '\r' from exec_host() output
We've faced this as part of the CI, only happening with the CRI-O tests:
```
 not ok 1 Test readonly volume for pods
 # (from function `exec_host' in file tests_common.sh, line 51,
 #  in test file k8s-file-volume.bats, line 25)
 #   `exec_host "echo "$file_body" > $tmp_file"' failed with status 127
 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass
 # bash: line 1: $'\r': command not found
 #
 # Error from server (NotFound): pods "test-file-volume" not found
```

I must say I didn't dig into figuring out why this is happening, but we
may be safe enough to just trail the '\r', as long as all the tests keep
passing on containerd.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ef63d67c41)
2023-10-06 15:14:58 +02:00
Fabiano Fidêncio
b41fa6d946 ci: k8s: Add a method to install CRI-O
This is based on official CRI-O documentations[0] and right now we're
making this specific to Ubuntu as that's what we have as runners.

We may want to expand this in the future, but we're good for now.

[0]:
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d7105cf7a4)
2023-09-21 14:17:13 +02:00
Fabiano Fidêncio
abe9dc9904 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 09cc0ed438)
2023-09-21 14:16:12 +02:00
Fabiano Fidêncio
ea6489653e ci: Properly set K8S_TEST_UNION
Otherwise only the first test will be executed

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 486fe14c99)
2023-09-21 14:16:03 +02:00
Fabiano Fidêncio
7c4a0f7fac ci: Use variable size of VMs depending on the tests running
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.

Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that required the normal Azure instance (D4s_v5)

With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.

Fixes: #7972

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c69a1e33bd)
2023-09-21 14:12:22 +02:00
Fabiano Fidêncio
2f280659b1 ci: k8s: Temporarily disable tests that require a bigger VM instance
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI

We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 094b6b2cf8)
2023-09-21 14:11:48 +02:00
Fabiano Fidêncio
fa9dd46041 ci: k8s: Don't set cpu limit request for k8s-inotofy test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 92fff129fd)
2023-09-21 14:10:57 +02:00
Fabiano Fidêncio
bb5dbfbbce k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9d74b7ccc9)

 Conflicts:
	tests/integration/kubernetes/k8s-pod-quota.bats
2023-09-21 13:43:16 +02:00
Fabiano Fidêncio
263ed4afd1 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f6cd3930c5)
2023-09-21 13:42:24 +02:00
Fabiano Fidêncio
7e135294a7 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3cc20b47a6)
2023-09-21 13:42:17 +02:00
Fabiano Fidêncio
8892d9a7b2 ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b5bad3cb0f)
2023-09-21 13:42:06 +02:00
Fabiano Fidêncio
aee6f36c86 ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 27fa7d828d)
2023-09-21 13:41:52 +02:00
Fabiano Fidêncio
5bb77b628d ci: k8s: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fa62a4c01b)
2023-09-21 13:41:45 +02:00
Fabiano Fidêncio
9fb291d88a ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3de23034f8)
2023-09-21 13:41:31 +02:00
Fabiano Fidêncio
89345b6731 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--apend`) aptton, the
containerd config would be overwritten, leading to a NotReady state of
the Node.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2df183fd99)
2023-09-21 13:41:01 +02:00
Fabiano Fidêncio
bb675f8101 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 369a8af8f7)
2023-09-21 13:40:56 +02:00
Fabiano Fidêncio
695c7162ef ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ada65b988a)
2023-09-21 13:40:51 +02:00
Fabiano Fidêncio
7f865be398 ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.

As --write-config-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ad45ab5d33)
2023-09-21 13:40:45 +02:00
Fabiano Fidêncio
7a96d0a589 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not a number of seconds.  Not sure
how I got that wrong in the first place, but it's what it's.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 028a97e0d5)
2023-09-21 13:40:38 +02:00
Fabiano Fidêncio
a41a56e326 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit b28b54df04)
2023-09-21 13:39:05 +02:00
Fabiano Fidêncio
315288a000 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 54f7117212)
2023-09-21 13:38:56 +02:00
Fabiano Fidêncio
2684b267f7 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e286e842c1)
2023-09-21 13:27:24 +02:00
Unmesh Deodhar
4976629aee tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
(cherry picked from commit e31f099be1)
2023-09-21 13:27:18 +02:00
Unmesh Deodhar
019849071e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c3b9d4945e)
2023-09-21 13:27:11 +02:00
Dan Mihai
17d22cae34 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 183f51d6f6)
2023-09-21 13:26:18 +02:00
Dan Mihai
e8c24fa0b9 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 6a974679f2)
2023-09-21 13:26:06 +02:00
Dan Mihai
508f1bba15 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 400eb88743)
2023-09-21 13:23:48 +02:00
Fabiano Fidêncio
b38624e2b3 tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 285e616b5e)
2023-09-21 13:22:40 +02:00
Fabiano Fidêncio
d7130f48b0 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and avoid
more kata-deploy tests to land here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit cfc29c11a3)
2023-09-21 13:22:21 +02:00
Aurélien Bombo
810507e8a3 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit f4dd152863)
2023-09-21 13:22:10 +02:00
Aurélien Bombo
915bace795 kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit 339569b69c)
2023-09-21 13:22:00 +02:00
Fabiano Fidêncio
c19cebfa80 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kuberentes/gha-run.sh`
out, and put them in a place where they can be used to the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit af1b46bbf2)
2023-09-21 13:18:07 +02:00
Fabiano Fidêncio
dcc35781f7 ci: unencrypted-image: Don't fail to build on s390x
Let's make sure that we don't fail in case we're building non x86_64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit eb463b38ec)
2023-09-21 13:15:08 +02:00
Fabiano Fidêncio
218d83bd3f tests: k8s: Ensure the runtime classes are properly created
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.

Fixes: #7491

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 034d7aab87)
2023-09-21 13:13:34 +02:00
Fabiano Fidêncio
6ae591c618 ci: k8s: Add the image used for unencrypted confidential tests
Let's add here the image we'll be using for unencrypted confidential
tests.  Later on, we'll make sure to build and use this image as part of
our CI.

The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ab5f603ffa)
2023-09-21 13:13:16 +02:00