This allows adding different runners in case the powerful one goes down
for one reason or another.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
To unlock the release, move the job that publishes the kata payload after push to an alternate (IBM-owned) runner for ppc64le.
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
We don't need to store the kernel headers anymore. We do need to store
the kernel modules, instead.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This commit adds a GitHub workflow for building a GitHub Pages site for the markdown
files in the docs/ directory. Zensical is a new markdown-based static site generation
framework built by the creators of Material for MkDocs. https://zensical.org/
This commit does not clean the doc structure, so site navigation is initially going to
be messy.
Signed-off-by: LandonTClipp <11232769+LandonTClipp@users.noreply.github.com>
This is a suggestion from Choi, so we can easily test with a specific
kubectl version and also easily understand which kubectl version is
being used in case of failure.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
This image will be used by our helm charts to verify that a
kata-containers deployment is correct.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
OVMF build for Intel TDX (aka "TDVF") was disabled in favor of Ubuntu/
CentOS pre-upstream releases of Intel TDX.
See 4292c4c3b1.
It's time to re-enable the build and move runtime configurations to
use it (the latter will be done in a later commit).
This is a partial revert of 4292c4c3b with the following changes:
- Stop calling OVMF for Intel TDX "TDVF" and follow the naming distros
use for the TDX-enabled build: OVMF.inteltdx.fd.
- Single binary OVMF.inteltdx.fd is supported using -bios QEMU param.
- Secure Boot infrastructure is disabled since Kata does not support it.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Otherwise we may hit a `no space left on device` when building the rust
kata-deploy binary.
This happens mostly because of the multi-stage build used to generate a
distroless final container.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's shamelessly duplicate the nightly job to have at least nightly
runs using the rust implementation of kata-deploy.
The reason for doing that is to be pragmatic, as pragmatic as possible,
and avoid switching away from the scripts before the 3.24.0 release, while
still testing both ways till the switch happens.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We're only releasing those for amd64 as that's the only architecture
we've been building the packages for.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's ensure we can create a specific "tools" tarball, which will help
those who only need to pull those either for testing or production
usage.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's remove the deprecated features that were marked for removal
after Kata Containers 3.23.0:
kata-deploy.sh:
- Remove non-arch-specific variable fallbacks (SHIMS, DEFAULT_SHIM,
SNAPSHOTTER_HANDLER_MAPPING, ALLOWED_HYPERVISOR_ANNOTATIONS,
PULL_TYPE_MAPPING, EXPERIMENTAL_FORCE_GUEST_PULL). Each arch now
has its own default value.
- Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS
variables and associated functions (create_runtimeclasses,
delete_runtimeclasses, adjust_shim_for_nfd). RuntimeClasses are
now managed by Helm chart, not the daemonset script.
- Unsupported architectures now fail with an error instead of
falling back to non-arch-specific defaults.
Helm chart:
- Remove all deprecated env values (createRuntimeClasses,
createDefaultRuntimeClass, debug, shims, shims_*, defaultShim,
defaultShim_*, allowedHypervisorAnnotations, snapshotterHandlerMapping,
snapshotterHandlerMapping_*, agentHttpsProxy, agentNoProxy,
pullTypeMapping, pullTypeMapping_*, _experimentalSetupSnapshotter,
_experimentalForceGuestPull, _experimentalForceGuestPull_*).
- Remove backward compatibility code from _helpers.tpl that checked
for legacy env values.
- Remove legacy env.shims check from runtimeclasses.yaml.
- Remove CREATE_RUNTIMECLASSES and CREATE_DEFAULT_RUNTIMECLASS env
vars from kata-deploy.yaml and post-delete-job.yaml.
- Update RBAC to only include runtimeclasses get/patch permissions
(needed for NFD patching), removing create/delete/list/update/watch.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
The runner is down for a few weeks. I may end up bringing in my personal
runner, but I'm not confident I can easily do this before the holidays,
thus I'm skipping the tests for now.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Remove the existing containerd guest pull stability tests workflow
as we're going to rebuild all the VMs used for testing and introduce
new, more focused stability tests for nydus-snapshotter.
The new tests will be added soon, as part of another PR.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Add the attestation bats test case to the NVIDIA CI and provide a
second pod manifest for the attestation test with a GPU. This will
enable composite attestation in a subsequent step.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
So far we've only been building the initrd for the nvidia rootfs.
However, we're also interested in having the image being used for a few
use-cases.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We hit a case where gatekeeper was failing because it thought the WIP check
had failed, but after the check ran the PR had been edited to remove WIP from
the title. We should listen to edits and unlabels of the PR to ensure that
gatekeeper doesn't get outdated in situations like this.
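A minimal sketch of such a trigger, assuming the gatekeeper workflow is driven
by `pull_request_target` (the actual event and the full list of types in the
workflow may differ):

```yaml
# Hypothetical trigger block; only `edited` and `unlabeled` are the point here.
on:
  pull_request_target:
    types:
      - opened
      - synchronize
      - reopened
      - edited      # re-run when the PR title or body changes
      - labeled
      - unlabeled   # re-run when a label is removed
```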
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Fixes: #12123
`include` in #12069, introduced to choose a different runner
based on component, leads to another set of redundant jobs
where `matrix.command` is empty.
This commit gets back to the `runs-on` solution, but makes
the condition human-readable.
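As an illustration only, a `runs-on` expression of this kind could look as
follows; the job name, component names and runner labels are hypothetical, and
the actual condition in the workflow may be written differently:

```yaml
jobs:
  check:
    strategy:
      matrix:
        component: [agent, runtime]   # hypothetical components
    # Pick the runner per component without adding `include` entries,
    # so no extra matrix combinations are generated.
    runs-on: ${{ matrix.component == 'runtime' && 'self-hosted-runner' || 'ubuntu-24.04' }}
    steps:
      - run: echo "checking ${{ matrix.component }}"
```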
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
In the CoCo test jobs, @wainersm created a "report tests" step
that summarises the jobs, so they are easier to understand and
get results for. This is very useful, so let's roll it out to all the bats
tests.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add an allow-all policy for the CC GPU tests and ensure the init-data
device is being created (hypervisor annotations).
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Right now we have only been passing the env var to the deployment
script, but we really need to pass it to the tests script as well.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When testing this branch, on several occasions the Delete
AKS cluster step has hung for multiple hours, so add a timeout
to prevent this.
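A minimal sketch, assuming the step calls a hypothetical cleanup script and
that 15 minutes is an acceptable bound (both are illustrative, not the values
used in the workflow):

```yaml
- name: Delete AKS cluster
  if: always()
  timeout-minutes: 15        # assumed value; fail the step instead of hanging for hours
  run: ./delete-cluster.sh   # hypothetical command
```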
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The new environment of Power runners for agent checks is causing two test case failures
related to SELinux and inodes. These need further investigation and are most likely caused
by the environment change rather than by the agent.
Fall back to running agent checks on original ppc64le self hosted runners.
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
As the arm 22.04 runner isn't working at the moment, let's test the
24.04 version to see if that is better.
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The fact that we were not explicitly setting the VMM was leading to us
testing with the default runtime class (qemu). :-/
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
By doing this, the ones interested in RISC-V support can still have
good visibility of its state, without the extra noise in our CI.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
We have had those tests broken for months. It's time to get rid of
those.
NOTE that we could easily revert this commit and re-add those tests as
soon as we find someone to maintain and be responsible for such
integration.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's ensure Trustee is deployed as some of the tests rely on images that
live behind authentication. /o\
The approach taken here to deploy Trustee is exactly the same one taken
on the other CoCo tests, apart from an env var passed to ensure we're
using the NVIDIA remote verifier (which will come in handy very very
soon).
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Let's add a new NVIDIA machine, which later on will be used for
CC-related tests.
For now the current tests are skipped on the CC-capable machine.
Signed-off-by: Manuel Huber <manuelh@nvidia.com>
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
When adding this, I mistakenly used the wrong test-type name, which is now
fixed and should be enough to trigger the tests correctly.
Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
On IBM actionspz P/Z runners, the following error was observed during
runtime tests:
```
host system doesn't support vsock: stat /dev/vhost-vsock: no such file or directory
```
Since loading the vsock module on the fly is not permitted, this commit
moves the runtime tests back to self-hosted runners for P/Z.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>