This will be very useful in the near future, when we start testing
kata-deploy with rke2 as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will be very useful in the near future, when we start testing
kata-deploy with k0s as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We'll be using exactly the same code used for the k8s tests, which are
already deploying k3s on GARM.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We just need to make sure the correct overlay is applied, following what
we already have been doing for k3s.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
So that we have better control over which flavour of Kubernetes
kata-deploy is expected to target.
This was also done as part of fa62a4c01b,
for the k8s tests.
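
As a rough sketch of how a flavour setting could drive the overlay choice:
the function name, the `kata-deploy` root directory and the `base` /
`overlays/<flavour>` layout below are assumptions made for illustration,
not the paths used by the actual scripts. The command is only printed,
not executed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a Kubernetes flavour to a kustomize directory.
# The directory layout is an assumption, not the project's real tree.
kata_deploy_kustomize_dir() {
    local flavour="${1:-k8s}"
    local root="kata-deploy"   # placeholder root directory

    case "${flavour}" in
        k0s|k3s|rke2) echo "${root}/overlays/${flavour}" ;;
        *)            echo "${root}/base" ;;
    esac
}

# Dry run: print the kubectl invocation instead of running it.
echo kubectl apply -k "$(kata_deploy_kustomize_dir k3s)"
```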
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but
that exceeds the maximum number of characters allowed for the cluster
name. With this in mind, let's use the first letter of
K8S_TEST_HOST_TYPE instead.
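
A minimal sketch of the abbreviation, using bash substring expansion. The
helper name and the `kata-ci-` prefix are placeholders for illustration;
only K8S_TEST_HOST_TYPE comes from the change itself.

```shell
#!/usr/bin/env bash

# Hypothetical helper: shorten the host type to its first letter so the
# cluster name stays within the length limit.
short_host_type() {
    local host_type="${1:-${K8S_TEST_HOST_TYPE:-}}"
    # ${var:0:1} takes one character starting at offset 0.
    printf '%s' "${host_type:0:1}"
}

# Example only: the prefix is a placeholder, not the real naming scheme.
K8S_TEST_HOST_TYPE="small"
cluster_name="kata-ci-$(short_host_type)"
echo "${cluster_name}"
```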
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This makes it so that each AKS cluster is created in its own individual
resource group, rather than using the "kataCI" resource group for all
test clusters.
This is to accommodate a tool that we recently introduced in our Azure
subscription which automatically deletes resource groups after a set
amount of time, in order to keep spending under control.
The tool will automatically delete any resource group, unless it has a
tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the
resource group will be retained until the specified date.
Note that I tagged all current resource groups in our subscription with
SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing
resources.
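
For reference, a sketch of how the tag could be passed when creating a
resource group. The helper name, the resource group name and the location
are placeholders; the command is printed as a dry run rather than
executed, and the date check only validates the YYYY-MM-DD shape, not the
calendar.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build the retention tag, rejecting anything that
# does not look like YYYY-MM-DD.
skip_auto_delete_tag() {
    local until_date="$1"

    if [[ ! "${until_date}" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
        echo "expected YYYY-MM-DD, got '${until_date}'" >&2
        return 1
    fi
    printf 'SkipAutoDeleteTill=%s' "${until_date}"
}

# Dry run: "my-test-rg" and "eastus" are placeholders.
echo az group create \
    --name "my-test-rg" \
    --location "eastus" \
    --tags "$(skip_auto_delete_tag 2043-01-01)"
```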
Fixes: #7982
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
We're hitting a specific issue after updating, which will require some
work on dragonball before it can be re-added here.
The issue:
```
...
3: failed to do rafs mount\\n
4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n
5: add share fs mount\\n
6: Mount rafs at
/rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower
error: Failed to Mount backend
...
Caused by:
vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a
backend filesystem failed:: missing field `version` at line 1 column
489\\\"))\"): unknown"
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will ensure we're testing with the correct runtime, instead of
using the `default` one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
To support the v0.12.0 nydus-snapshotter, we need to update the config
files and the commandline to start nydus-snapshotter.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
And with this we finally enable the nydus tests to run as part of our
GHA CI.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've been simply doing nothing whenever `install-kata` was called, and
that was the intent when we added the placeholder calls.
Now, let's install kata, as expected. :-)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we've added install_nydus() and install_nydus_snapshotter(), which do
conform with the pattern we're following on GHA, let's rely on them
rather than relying on the bits coming from nydus_test.sh.
Later on we'll have install_nydus() and install_nydus_snapshotter() as
part of the dependencies install in our `gha-run.sh`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to what's been done for the cri-containerd tests, as part of
84dd02e0f9, we need to add the timeout
here for the crictl calls.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we may face errors like:
```
getting sandbox status of pod "d3af2db414ce8": metadata.Name,
metadata.Namespace or metadata.Uid is not in metadata
"&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}"
getting sandbox status of pod "-A": rpc error: code = NotFound desc = an
error occurred when try to find sandbox: not found
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we cannot properly start the nydus snapshotter, nor properly
kill it after it's been started.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The "source ..." we've been doing was not changed since those tests were
part of the Jenkins tests, and we need to adapt them, either setting the
correct path or entirely removing the ones that are not relevant to us
anymore.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will be used to download and install the
nydus-snapshotter, and it follows the same pattern we have already
introduced for downloading and installing other dependencies from
GitHub.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will be used to download and install nydus, and it follows
the same pattern we have already introduced for downloading and
installing other dependencies from GitHub.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
clippy is used as part of our tests, so it's useful to have it installed
while we're already installing rust.
Developers had better be using it as well. :-)
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We'll use it as part of the refactoring we're doing in the static check
tests.
I can see a lot of other uses of this, but changing all of them to this
one is out of the scope for this PR.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We can use these a lot as part of our CI, but right now I'm just moving
them here with the intent to use them later on in this series.
Fixes: #7974 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let me start with a fair warning that this commit is hard to split into
different parts that could be easily tested (or not tested, just
ignored) without breaking pieces.
Now, about the commit itself: as we're working to reduce costs
related to our sponsorship on Azure, we can split the k8s tests we run
into 2 simple groups:
* Tests that can be run in the smaller Azure instance (D2s_v5)
* Tests that require the normal Azure instance (D4s_v5)
With this in mind, we're now passing to the tests which type of host
we're using, which allows us to select to run either one of the two
types of tests, or even both in case of running the tests on a baremetal
system.
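
The selection logic can be sketched roughly as below. The function name,
the accepted host type values ("small", "normal", "all", "baremetal") and
the test file names are assumptions made for illustration, not the actual
lists used by the tests.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: pick the test group(s) matching the host type.
select_tests() {
    local host_type="$1"
    # Placeholder file names, one group per instance size.
    local small_tests=("k8s-example-small.bats")
    local normal_tests=("k8s-example-normal.bats")

    case "${host_type}" in
        small)
            printf '%s\n' "${small_tests[@]}" ;;
        normal)
            printf '%s\n' "${normal_tests[@]}" ;;
        all|baremetal)
            # Baremetal systems can run both groups.
            printf '%s\n' "${small_tests[@]}" "${normal_tests[@]}" ;;
        *)
            echo "unknown host type: ${host_type}" >&2
            return 1 ;;
    esac
}

select_tests small
```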
Fixes: #7972
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The list of tests which require a bigger VM instance is:
* k8s-number-cpus.bats -- failing on all CIs
* k8s-parallel.bats -- only failing on the cbl-mariner CI
* k8s-scale-nginx.bats -- only failing on the cbl-mariner CI
We'll keep those disabled while we re-work the logic to **only run
those** in a bigger (and more expensive) VM instance.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Cloud Hypervisor does not emulate PCIe switches or PCI bridges, so we need to
accept a lonely device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly badly when running nested on Hyper-V. Reload the module at the start
of the test and disable the tdp_mmu param.
Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses GitHub Actions group syntax so that each section can be
collapsed.
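
A minimal sketch of the grouping, based on the `::group::` and
`::endgroup::` workflow commands. The wrapper name and the example command
are illustrative, not the ones used in the test script.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run a command inside a collapsible log section.
run_in_group() {
    local title="$1"
    shift

    echo "::group::${title}"
    # A failing debug command must not abort the rest of the collection.
    "$@" || true
    echo "::endgroup::"
}

# Example only: any diagnostic command can be wrapped this way.
run_in_group "kernel version" uname -r
```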
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This imports the vfio test scripts from github.com/kata-containers/tests. The test
case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh
Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>