Now we have updated the release builds to push
artefacts to
our registry for the release, so we can cache the images, we need to
set `secrets: inherit` for all architecture's tarball builds
so that we can log into quay.io and ghcr in those steps
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
VERSION_ID is not guaranteed to be specified in os-release, this
makes kaka-deploy breaks in rolling distros like arch linux and void
linux.
Note that operating system vendors may choose not to provide
version information, for example to accommodate for rolling releases.
In this case, VERSION and VERSION_ID may be unset.
Applications should not rely on these fields to be set.
Signed-off-by: vac <dot.fun@protonmail.com>
This test doesn't fail with the guest image pulling, but it for sure
should. :-)
We can see in the bats logs, something like:
```
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31s default-scheduler Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com
Normal Pulled 23s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting)
Normal Started 21s kubelet Started container liveness
Warning Unhealthy 7s (x3 over 13s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 7s kubelet Container liveness failed liveness probe, will be restarted
Normal Pulled 7s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting)
Warning Failed 5s kubelet Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown
Normal Pulling 5s (x3 over 23s) kubelet Pulling image "quay.io/prometheus/busybox:latest"
Normal Pulled 4s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting)
Normal Created 4s (x3 over 23s) kubelet Created container liveness
Warning Failed 3s kubelet Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown
Warning BackOff 1s (x3 over 3s) kubelet Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff)
```
Let's skip it for now as we have an issue opened to track it down:
https://github.com/kata-containers/kata-containers/issues/9665
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is another one that is related to initContainers not being properly
handled with the guest image pulling.
Let's skip it for now as we have
https://github.com/kata-containers/kata-containers/issues/9668 to track
it down.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to firecracker, which doesn't have support for virtio-fs /
virtio-9p, TDX used with `shared_fs=none` will face the very same
limitations.
The tests affected are:
* k8s-credentials-secrets.bats
* k8s-file-volume.bats
* k8s-inotify.bats
* k8s-nested-configmap-secret.bats
* k8s-projected-volume.bats
* k8s-volume.bats
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This test fails when using `shared_fs=none` with the nydus snapshotter,
and we're tracking the issue here:
https://github.com/kata-containers/kata-containers/issues/9664
For now, let's have it skipped.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The test has been failing on TDX for a while, and an issue has been
created to track it down, see:
https://github.com/kata-containers/kata-containers/issues/9663
For now, let's have it skipped.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure the logs will print the correct annotation and its
value, instead of always mentioning "kernel" and "initrd".
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're doing that as all tests are going to be running with
`shared_fs=none`, meaning that we don't need any specific test for this
case anymore.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will set the needed annotation for enforcing that the
image pull will be handled by the snapshotter set for the runtime
handler, instead of using the default one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We shouldn't be using 9p, at all, with TEEs, as off right now we have no
way to ensure the channels are encrypted. The way to work this around
for now is using guest pull, either with containerd + nydus snapshotter
or with CRI-O; or even tardev snapshotter for pulling on the host (which
is the approach used by MSFT).
This is only done for TDX for now, leaving the generic, AMD, and IBM
related stuff for the folks working on those to switch and debug
possible issues on their environment.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace:
- ` ` (namespace field missing)
- `namespace: ""` (namespace field is the empty string)
- `namespace: "default"`(namespace field has the explicit value `default`)
Genpolicy currently does not handle the empty string case correctly.
Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>
When just containerd is installed without installing nerdctl,
cni plugins are missing from the installation.
containerd tarball does not include cni plugin files.
Hence install cni plugins separately for containerd.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
nerdctl requires cni plugins to be installed in /opt/cni/bin
Without bridge plugin installed, it is not possible to run a
container with nerdctl.
The downloaded nerdctl tarball contains cni plugin files, but are
extracted under /usr/local/libexec.
Copy extracted tarball cni files under /usr/local/libexec
to /opt/cni/bin
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Use `eval` to process the `date` command along with its parameters,
thus avoiding misinterpreting the parameters as commands.
Fixes: #9661
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Determine the realpath of kata-shim avoiding the check fails
in case the kata-shim is not a symlink, as was happening prior
to this commit.
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This is done in order to work around
https://github.com/Azure/azure-cli/issues/28984, following a suggestion
on the very same issue.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- The tags have a trailing non-printable character, which results
in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_`
For ease of use of these cached components, we should strip off the trailing underscore.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
An e2e test for `vfio-ap` has been conducted internally in IBM
due to the lack of publicly available test machines equipped
with a required crypto device.
The test is performed by the `tests` repository:
(i.e. 772105b560/Makefile (L144))
The community is working to integrate all tests into the `kata-containers`
repository, so the `vfio-ap` test should be part of that effort.
This commit moves a test script and Dockerfile for a test image from
the `tests` repository. We do not rename the script to `gha-run.sh`
because it is not executed by Github Actions' workflow.
You can check the test results from the s390x nightly test with the migrated files here:
https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Given that we think the containerd -> snapshotter image cache
problems have been resolved by bumping to nydus-snapshotter v0.3.13
we can try removing the skips to test this out
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- We previously have an expectation for the pause rootfs
to be pull on the host when we did a guest pull. We weren't
really clear why, but it is plausible related to the issues we had
with containerd and nydus caching. Now that is fixed we can begin
to address this with setting shared_fs=none, but let's start with
updating the rootfs host check to be not higher than expected
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This test is called from `tests/integration/run_kuberentes_tests.sh`,
which already ensures that yq is installed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On 1423420, I've mistakenly disabled the tests entirely, for both
non-TEEs and TEEs.
This happened as I didn't realise that `confidential_setup` would take
non-TEEs into consideration. :-/
Now, let me follow-up on that and make sure that the tests will be
running on non-TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function is a helper to check whether the KATA_HYPERVISOR being
used is a confidential hardware (TEE) or not, and we can use it to
skip or only run tests on those platforms when needed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's rename it to `is_confidential_runtime_class`, and adapt all the
places where it's called.
The new name provides a better description, leading to a better
understanding of what the function really does.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Implementation of QemuCmdLine has a fairly uniform and repetitive structure
that's guided by a set of conventions. These conventions have however been
mostly implicit so far, leading to a superfluous and annoying
request/force-push churn during qemu-rs PR reviews.
This commit aims to make things explicit so that contributors can take them
into account before an initial PR submission.
Signed-off-by: Pavel Mores <pmores@redhat.com>
This commit is to append an arch type to the initramfs-cryptsetup image
to prevent a wrong arch image from being pulled on a different arch host.
Fixes: #9654
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
- Container image tags can only contain alphanumeric, period,
hyphen and underscore characters, so convert characters outside
of these to be underscores, to avoid having invalid tag failures
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Recently the extra gpu caching was added, unfortunately when I
rebased I ended up with both the new tagging logic and old logic.
Let's try and integrate them properly to avoid doing the push twice.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
in case the upstream CI fails it's useful to pin-point the PR that
caused the regression. Currently openshift-ci does not allow doing that
from their setup but we can mimic the setup on our infrastructure and
use the available kata-deploy-ci images to find the first failing one.
To help with that add a few helper scripts and a howto.
Fixes: #9228
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Now we have the workflow updated and can test the changes in caching
we've hit an error:
```
line 1180: artefact_tag: unbound variable
```
so we need to fix that up. Sorry for missing this before.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- It appears like the `if` isn't required when setting env as a
conditional
- `inputs.stage` over input.stage
- Swap matrix.component to matrix.asset
Signed-off-by: stevenhorsman <steven@uk.ibm.com>