Commit Graph

14008 Commits

Author SHA1 Message Date
Fabiano Fidêncio
a2c70222a8
tests: k8s: tdx: Skip initContainerd shared vol test
This is another one that is related to initContainers not being properly
handled with the guest image pulling.

Let's skip it for now as we have
https://github.com/kata-containers/kata-containers/issues/9668 to track
it down.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 20:58:45 +02:00
Fabiano Fidêncio
9d56145499
tests: k8s: tdx: Skip volume related tests
Similarly to firecracker, which doesn't have support for virtio-fs /
virtio-9p, TDX used with `shared_fs=none` will face the very same
limitations.

The tests affected are:
* k8s-credentials-secrets.bats
* k8s-file-volume.bats
* k8s-inotify.bats
* k8s-nested-configmap-secret.bats
* k8s-projected-volume.bats
* k8s-volume.bats

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:49 +02:00
Fabiano Fidêncio
606a62a0a7
tests: k8s: tdx: Skip "Setting sysctl" test
This test fails when using `shared_fs=none` with the nydus-snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9666

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 19:38:38 +02:00
Fabiano Fidêncio
937b2d5806
tests: k8s: tdx: Skip "Kill all processes in container" test
This test fails when using `shared_fs=none` with the nydus snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9664

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
03ce41b743
tests: k8s: tdx: Skip "Check custom dns" test
The test has been failing on TDX for a while, and an issue has been
created to track it down, see:
https://github.com/kata-containers/kata-containers/issues/9663

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
1a8a4d046d
tests: k8s: setup: Improve / Fix logs
Let's make sure the logs will print the correct annotation and its
value, instead of always mentioning "kernel" and "initrd".

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:14 +02:00
Fabiano Fidêncio
3f38309c39
tests: k8s: tdx: Stop running k8s-guest-pull-image.bats
We're doing that as all tests are going to be running with
`shared_fs=none`, meaning that we don't need any specific test for this
case anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:51:00 +02:00
Fabiano Fidêncio
e84619d54b
tests: k8s: tdx: Add add_runtime_handler_annotations function
This function will set the needed annotation for enforcing that the
image pull will be handled by the snapshotter set for the runtime
handler, instead of using the default one.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:49:07 +02:00
Fabiano Fidêncio
f2de259387
runtime: tdx: Use shared_fs=none
We shouldn't be using 9p, at all, with TEEs, as off right now we have no
way to ensure the channels are encrypted.  The way to work this around
for now is using guest pull, either with containerd + nydus snapshotter
or with CRI-O; or even tardev snapshotter for pulling on the host (which
is the approach used by MSFT).

This is only done for TDX for now, leaving the generic, AMD, and IBM
related stuff for the folks working on those to switch and debug
possible issues on their environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-19 18:47:09 +02:00
Fabiano Fidêncio
5b257685d9
Merge pull request #9662 from dborquez/fix_launchtimes_timestamp_generation
Fix launch times timestamp generation.
2024-05-18 21:11:09 +02:00
Fabiano Fidêncio
94786dc939
Merge pull request #9659 from stevenhorsman/remove-non-printable-tag-characters
ci: cache: Filter out non-printable characters from tag
2024-05-18 14:47:07 +02:00
Fabiano Fidêncio
874cda0e51
Merge pull request #9655 from BbolroC/add-arch-to-initramfs
CI: Append arch type to initramfs-cryptsetup image
2024-05-18 14:31:57 +02:00
Malte Poll
babdab9078 genpolicy: detect empty string in ns as default
In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace:

- ` ` (namespace field missing)
- `namespace: ""` (namespace field is the empty string)
- `namespace: "default"`(namespace field has the explicit value `default`)

Genpolicy currently does not handle the empty string case correctly.

Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
2024-05-18 12:44:59 +02:00
Fabiano Fidêncio
cbfdc70a55
Merge pull request #9613 from fidencio/topic/skip-pull-image-tests-on-tees-part-II
tests: pull-image: Only skip tests for TEEs
2024-05-18 03:31:38 +02:00
Archana Shinde
0e28e904e0 kata-manager: Install cni for containerd
When just containerd is installed without installing nerdctl,
cni plugins are missing from the installation.
containerd tarball does not include cni plugin files.
Hence install cni plugins separately for containerd.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:19:57 +00:00
Archana Shinde
d23d58a484 kata-manager: Copy cni files under /opt/cni
nerdctl requires cni plugins to be installed in /opt/cni/bin
Without bridge plugin installed, it is not possible to run a
container with nerdctl.
The downloaded nerdctl tarball contains cni plugin files, but are
extracted under /usr/local/libexec.
Copy extracted tarball cni files under /usr/local/libexec
to /opt/cni/bin

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2024-05-18 00:16:48 +00:00
David Esparza
938d3dc430
metrics: fix timestamps generation from launch times test.
Use `eval` to process the `date` command along with its parameters,
thus avoiding misinterpreting the parameters as commands.

Fixes: #9661

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:44:41 -06:00
David Esparza
bae377b42a
metrics: determine the realpath of kata-shim component.
Determine the realpath of kata-shim avoiding the check fails
in case the kata-shim is not a symlink, as was happening prior
to this commit.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2024-05-17 14:40:02 -06:00
Fabiano Fidêncio
5ff53e4d1c
ci: azure: Workaround azure cli installation script
This is done in order to work around
https://github.com/Azure/azure-cli/issues/28984, following a suggestion
on the very same issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 20:28:24 +02:00
stevenhorsman
42fddb5530 ci: cache: Filter out non-printable characters from tag
- The tags have a trailing non-printable character, which results
in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_`
For ease of use of these cached components, we should strip off the trailing underscore.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 14:16:40 +01:00
Hyounggyu Choi
961735a181 CI: Migrate vfio-ap test files from tests repo
An e2e test for `vfio-ap` has been conducted internally in IBM
due to the lack of publicly available test machines equipped
with a required crypto device.
The test is performed by the `tests` repository:
(i.e. 772105b560/Makefile (L144))

The community is working to integrate all tests into the `kata-containers`
repository, so the `vfio-ap` test should be part of that effort.

This commit moves a test script and Dockerfile for a test image from
the `tests` repository. We do not rename the script to `gha-run.sh`
because it is not executed by Github Actions' workflow.

You can check the test results from the s390x nightly test with the migrated files here:
https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 14:59:16 +02:00
stevenhorsman
a92defdffe
tests: pull-image: Remove skips
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.3.13
we can try removing the skips to test this out

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:57 +02:00
stevenhorsman
7ac302e2d8
tests: Slacken guest pull rootfs count assert
- We previously have an expectation for the pause rootfs
to be pull on the host when we did a guest pull. We weren't
really clear why, but it is plausible related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this with setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
67ff58251d
tests: confidential_common: Remove unneeded ensure_yq call
This test is called from `tests/integration/run_kuberentes_tests.sh`,
which already ensures that yq is installed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
cc874ad5e1
tests: confidential: Ensure those only run on TEEs
Running those with the non-TEE runtime classes will simply fail.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
2bc5b1bba2
tests: pull-image: Only skip tests for TEEs
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.

This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/

Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
d875f89fa2
tests: Add is_confidential_hardware()
This function is a helper to check whether the KATA_HYPERVISOR being
used is a confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Fabiano Fidêncio
4a04a1f2ae
tests: Re-work confidential_setup()
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.

The new name provides a better description, leading to a better
understanding of what the function really does.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2024-05-17 12:39:56 +02:00
Pavel Mores
b9febc4458 runtime-rs: document architecture & implementation conventions in qemu-rs
Implementation of QemuCmdLine has a fairly uniform and repetitive structure
that's guided by a set of conventions.  These conventions have however been
mostly implicit so far, leading to a superfluous and annoying
request/force-push churn during qemu-rs PR reviews.

This commit aims to make things explicit so that contributors can take them
into account before an initial PR submission.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2024-05-17 12:21:44 +02:00
Hyounggyu Choi
3917930a76 CI: Append arch type to initramfs-cryptsetup image
This commit is to append an arch type to the initramfs-cryptsetup image
to prevent a wrong arch image from being pulled on a different arch host.

Fixes: #9654

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2024-05-17 11:42:49 +02:00
Steve Horsman
9a6d8d8330
Merge pull request #9650 from stevenhorsman/caching-tagging-update-partIII
Caching tagging update part iii
2024-05-17 09:09:15 +01:00
stevenhorsman
ce24e98358 ci: cache: Add tag character filtering
- Container image tags can only contain alphanumeric, period,
hyphen and underscore characters, so convert characters outside
of these to be underscores, to avoid having invalid tag failures

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
stevenhorsman
a98b1e3afb ci: cache: Integrate tagging updates with recent changes
Recently the extra gpu caching was added, unfortunately when I
rebased I ended up with both the new tagging logic and old logic.
Let's try and integrate them properly to avoid doing the push twice.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 21:38:07 +01:00
Lukáš Doktor
f994f79078
ci.ocp: Add steps to reproduce/bisect CI runs
in case the upstream CI fails it's useful to pin-point the PR that
caused the regression. Currently openshift-ci does not allow doing that
from their setup but we can mimic the setup on our infrastructure and
use the available kata-deploy-ci images to find the first failing one.
To help with that add a few helper scripts and a howto.

Fixes: #9228

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:20:05 +02:00
Lukáš Doktor
a556ad7e01
ci.ocp: Document how to run openshift-tests with kata
document the ocp pipeline.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:32 +02:00
Lukáš Doktor
ea081bd882
ci.ocp: Add webhook cleanup
cleanup the webhook resources as well.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
2024-05-16 20:15:31 +02:00
David Esparza
029a6de52b
Merge pull request #9615 from GabyCT/topic/fixlaunchtime
metrics: Update launch times script
2024-05-16 11:28:44 -06:00
Steve Horsman
33e6b241ba
Merge pull request #9647 from stevenhorsman/fix-artefact-tags-unbound-variable
ci: cache: Fix unbound variable
2024-05-16 16:22:47 +01:00
stevenhorsman
9d9487b17f ci: cache: Fix unbound variable
Now we have the workflow updated and can test the changes in caching
we've hit an error:
```
line 1180: artefact_tag: unbound variable
```
so we need to fix that up. Sorry for missing this before.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 14:30:32 +01:00
Steve Horsman
03c08583c3
Merge pull request #9644 from stevenhorsman/fix-broken-workflow
workflow: Remove if from env conditional
2024-05-16 14:13:25 +01:00
stevenhorsman
f7fd2f9a5d workflow: Fix problems with build-asset workflows
- It appears like the `if` isn't required when setting env as a
conditional
- `inputs.stage` over input.stage
- Swap matrix.component to matrix.asset

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2024-05-16 11:51:46 +01:00
Steve Horsman
d8468cb178
Merge pull request #9550 from stevenhorsman/tag-component-caches
Tag component caches
2024-05-16 11:05:18 +01:00
Steve Horsman
b31ff09b8d
Merge pull request #9617 from zvonkok/artefact-repository
deploy: Add artefact repository
2024-05-16 10:41:23 +01:00
Fabiano Fidêncio
4d073c837d
Merge pull request #9636 from ChengyuZhu6/snapshotter
version: Bump nydus snapshotter to v0.13.13
2024-05-16 02:54:53 +02:00
GabyCT
05cc8fae5e
Merge pull request #9610 from GabyCT/topic/fixrwfio
metrics: Fix random write value for FIO
2024-05-15 17:44:41 -06:00
Gabriela Cervantes
793a02600a metrics: Fix random write value for clh for FIO
This PR decreases the random write value for clh for FIO.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-05-15 22:13:10 +00:00
Chelsea Mafrica
5d2af555da runtime: Add missing check in ResizeMemory for CH
ResizeMemory for Cloud Hypervisor is missing a check for the new
requested memory being greater than the max hotplug size after
alignment. Add the check, and since an earlier check for this
setsrequested memory to the max hotplug size, do the same in the
post-alignment check.

Fixes #9640

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2024-05-15 11:29:18 -07:00
GabyCT
d752f0aa4f
Merge pull request #9627 from GabyCT/topic/ghacomk8s
gha: Fix indentation in gha run k8s common
2024-05-15 11:55:14 -06:00
Greg Kurz
bd6420e0cc runtime-rs: Drop some useless QEMU arguments
All these settings are hardcoded as `false` and result in
no extra options on the QEMU command line, like the go
runtime does. There actually not needed :
- we're never going to ask QEMU to survive a guest shutdown
- we're never going to run QEMU daemonized since it prevents
  log collection
- we're never going to ask QEMU to start with the guest stopped

No need to keep this code around then.

Signed-off-by: Greg Kurz <groug@kaod.org>
2024-05-15 18:33:43 +02:00
stevenhorsman
7f41329010 ci: cache: Optional tag components with tags
- CoCo wants to use the agent and coco-guest-components cached artifacts
so tag them with a helpful version, so make these easier to get

Signed-off-by: stevenhorsman <steven@uk.ibm.com>

 No commands remaining.
2024-05-15 16:56:40 +01:00