Commit Graph

150 Commits

Author SHA1 Message Date
Fabiano Fidêncio
5560e72024 Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support
ci: kata-deploy: Enable all k8s flavours that we support
2023-09-19 17:44:34 +02:00
Fabiano Fidêncio
09cc0ed438 ci: Move deploy_k8s() to gha-run-k8s-common.sh
This will allow us to re-use the function in the kata-deploy tests,
which will come soon.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 12:37:56 +02:00
Fabiano Fidêncio
486fe14c99 ci: Properly set K8S_TEST_UNION
Otherwise only the first test will be executed

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-19 10:23:58 +02:00
Fabiano Fidêncio
aba36ab188 nydus: Temporarily skip tests on dragonball
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.

The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
   /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
   error: Failed to Mount backend
...

Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b8a8dfcd15 nydus: Use kata-${KATA_HYPERVISOR} instead of kata
This will ensure we're testing with the correct runtime, instead of
using the `default` one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
ChengyuZhu6
2f9c9e2e63 tests: nydus: Update nydus tests
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b73bde320d gha: nydus: Populate run()
And with this we finally enable the nydus tests to run as part of our
GHA CI.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
b3904a1a30 gha: nydus: Populate install_dependencies()
Let's have all the dependencies needed for running the nydus tests
installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
d2b3b67f5d gha: nydus: Actually install kata when install-kata is called
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.

Now, let's install kata, as expected. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
0ec00ad42e gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.

Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
568439c77b tests: nydus: Add timeout to the crictl calls
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
5ac3b76eb1 tests: nydus: Add uid / namespace to the nydus container / sandbox
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"

getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
376574a16c tests: nydus: Decorate some calls with sudo
Otherwise we canoot properly start the nydus snapshotter, nor properly
kill it after it's been started.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
4290fd4b67 tests: nydus: Adapt "source ..." to GHA
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
a84efa3e87 tests: nydus: Adapt check to "clh" instead "cloud-hypervisor"
As that's what we've been using as part of the GHA.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-18 17:40:06 +02:00
Fabiano Fidêncio
c69a1e33bd ci: Use variable size of VMs depending on the tests running
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.

Now, about the commit itself, as we're on the run to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
in 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that required the normal Azure instance (D4s_v5)

With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.

Fixes: #7972

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-16 09:13:54 +02:00
Fabiano Fidêncio
094b6b2cf8 ci: k8s: Temporarily disable tests that require a bigger VM instance
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI

We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-15 01:33:19 +02:00
Fabiano Fidêncio
92fff129fd ci: k8s: Don't set cpu limit request for k8s-inotofy test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-14 22:03:16 +02:00
Fabiano Fidêncio
a1e3fa7ac4 Merge pull request #7905 from microsoft/danmihai1/mariner-annotations
tests: fix kernel and initrd annotations
2023-09-14 10:37:42 +02:00
Fabiano Fidêncio
813bfdec01 ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
This will ensure that we're calling the correct binary for the
hypervisor.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:10:14 +02:00
Fabiano Fidêncio
46bc0b1c01 ci: nerdctl: Create the containerd config
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.

This is mostly needed because the nerdctl-full tarball doesn't provide a
contaienrd configuration, just the binary, as contaienrd does not
actually require a configuration file to run with the default config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
13968aa7f6 ci: nerdctl: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Fabiano Fidêncio
e0c811678b ci: docker: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-13 13:00:57 +02:00
Dan Mihai
c0ad914766 tests: fix kernel and initrd annotations
Fix kernel and initrd annotations in the k8s tests on Mariner. These
annotations must be applied to the spec.template for Deployment, Job
and ReplicationController resources.

Fixes: #7764

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-12 20:15:25 +00:00
Fabiano Fidêncio
f536ef5ce1 ci: docker: Also run the smoke test with runc
This will help us to make sure that the failure is actually related to
Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:54:02 +02:00
Fabiano Fidêncio
12d833d07d ci: Add a very basic nerdctl sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using nerdctl + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7911

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 16:52:55 +02:00
Fabiano Fidêncio
348b8644d6 ci: Add a very basic docker sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using docker + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-12 15:15:26 +02:00
Fabiano Fidêncio
9d74b7ccc9 k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 15:51:46 +02:00
Fabiano Fidêncio
f6cd3930c5 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:29 +02:00
Fabiano Fidêncio
3cc20b47a6 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:25:24 +02:00
Fabiano Fidêncio
b5bad3cb0f ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 14:04:50 +02:00
Fabiano Fidêncio
27fa7d828d ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
fa62a4c01b ci: k8s: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-08 10:09:04 +02:00
Fabiano Fidêncio
3de23034f8 ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:46:58 +02:00
Fabiano Fidêncio
2df183fd99 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--apend`) aptton, the
containerd config would be overwritten, leading to a NotReady state of
the Node.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
369a8af8f7 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ada65b988a ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
ad45ab5d33 ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.

As --write-config-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
028a97e0d5 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not a number of seconds.  Not sure
how I got that wrong in the first place, but it's what it's.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-07 23:12:55 +02:00
Fabiano Fidêncio
b28b54df04 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-09-06 23:08:17 +02:00
Fabiano Fidêncio
54f7117212 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-06 23:07:41 +02:00
Dan Mihai
bf21411e90 tests: add policy to k8s tests
Use AGENT_POLICY=yes when building the Guest images, and add a
permissive test policy to the k8s tests for:
- CBL-Mariner
- SEV
- SNP
- TDX

Also, add an example of policy rejecting ExecProcessRequest.

Fixes: #7667

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
e286e842c1 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
e31f099be1 tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
c3b9d4945e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:34 +02:00
Fabiano Fidêncio
02a08c956b Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment
tests: delete k8s deployment at the test's end
2023-08-27 17:52:00 +02:00
Dan Mihai
183f51d6f6 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:41:06 +00:00
Dan Mihai
6a974679f2 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-25 03:34:37 +00:00
Dan Mihai
400eb88743 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2023-08-21 15:58:57 +00:00
Fabiano Fidêncio
285e616b5e tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-17 14:22:16 +02:00