We don't need to run on a D4s_v5. as those tests are not CPU / memory
intense. With this is mind, let's use a smaller version of the
instance, the D2s_v5 one.
Fixes: #7958
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 32841827b8)
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 92fff129fd)
Instead of having some of them only being considered if explicitly
passed to the script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit adc18ecdb1)
As the environment variables are now being passed down from the GitHub
Actions, let's make sure they're exposed to the container used to build
the kata-deploy binaries, and during the build process we'll be able to
use those to log in and push the artefacts to the OCI registry, using
ORAS.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c7a851efd7)
We do the build of our artefacts inside a container image, and we need
to expose some env vars to the container so ORAS can be used there to
push the artefacts we want to cache to ghcr.io.
The env vars we're exposing are:
* ARTEFACT_REGISTRY: The registry where we're going to save the
artefacts.
* ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as
ORAS does not use the same json file used by docker.
* ARTEFACT_REGISTRY_PASSWORD: The pasword to log in to the the registry,
as the ORAS does not use the same json file used by docker.
* TARGET_BRANCH: The target branch, which will be part of the tag of the
artefact, as we may end up caching the artefacts for both main and
stable branches.
Fixes: #7834 -- part 0
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6bd15a85d5)
This PR adds the iperf cpu utilization limit for qemu for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit cd4fd1292a)
This PR adds the iperf value for cpu utilization for kata metrics.
Fixes#7936
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit df5cd10ea0)
We need a very recent L2 guest kernel to fix all the bugs that occur in nested
virtualization.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 9d93036783)
cloud hypervisor does not emulate pcie switches or pci bridges, so we need to
accept a lonely device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit faee59b520)
It is fine to start a VM with the disk image without syncing it as we now run
the test in an ephemeral Azure instance.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit df3dc1105c)
Sometimes the test gets stuck running commands in the container - need to
investigate why later.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 7211c3dccc)
Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is
enabled. We need to add it to the whitelist because dragonball uses kernel
v5.10 which restricted VIRTIO_IOMMU to ARM64 only.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 1b02f89e4f)
Conflicts:
tools/packaging/kernel/kata_config_version
by enabling IOMMU on the default PCI segment. For hotplug to work we need a
virtualized iommu and clh exposes one if there is some device or PCI segment
that requests it. I would have preferred to add a separate PCI segment for
hotplugging vfio devices but unfortunately kata assumes there is only one
segment all over the place. See create_pci_root_bus_path(),
split_vfio_pci_option() and grep for '0000'.
Enabling the IOMMU on the default PCI segment requires passing enabling IOMMU on
every device that is attached to it, which is why it is sprinkled all over the
place.
CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for
that device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 3a1db7a86b)
This is a to catch the case of the guest getting stuck.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 9f1a42c6cc)
This shouldn't be hiding behind only a qemu check, we need this for clh as
well.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit b46b0ecf8b)
There is no way for this branch to be hit, as port is only set when it is
different than config.NoPort.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit bfc93927fb)
These test cases shows which options are valid for CLH/Qemu, and test that we
correctly catch unsupported combinations.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 7c4e73b609)
The only supported options are hot_plug_vfio=root-port or no-port.
cold_plug_vfio not supported yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit fc51e4b9eb)
hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to
CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have
not implemented support for cold_plug_vfio in CLH yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 509771e6f5)
tdp_mmu had some issues up until around Linux v6.3 that make it work
particularly bad when running nested on Hyper-V. Reload the module at the start
of the test and disable the tdp_mmu param.
Gather debug info at the end of the test to make it easier to figure out what
went wrong. This uses github actions group syntax so that each section can be
collapsed.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 5f6475a28a)
For debugging (though this doesn't get exposed yet).
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 8fffdc81c5)
- reduce memory and cpu usage to fit in a D4s_v5
- source correct lib
- mount workspace from 9p
- disable cpu mitigations for speed
- drop unused commands and variables
- install containerd
- install kata from built artifacts
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit df815087e7)
To match the flow of other github actions workflows.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit a92ddeea15)
This imports the vfio test scripts github.com/kata-containers/tests. The test
case doesn't work yet but doing the changes in a separate commit will make it
easier to track the changes. The only change in this commit is renaming
vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh
Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 5a551a85b1)
This PR increases the jitter value for qemu for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 49e2fa189c)
This PR increases the value limit for jitter in clh.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 49234433a7)
This will ensure that we're calling the correct binary for the
hypervisor.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 813bfdec01)
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.
This is mostly needed because the nerdctl-full tarball doesn't provide a
contaienrd configuration, just the binary, as contaienrd does not
actually require a configuration file to run with the default config.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 46bc0b1c01)
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.
This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity
For your own sanity, do not read the comments, after all this is
internet. :-)
Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping. With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 13968aa7f6)
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.
This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity
For your own sanity, do not read the comments, after all this is
internet. :-)
Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping. With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e0c811678b)
This PR adds the iperf bandwidth value for qemu for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 0aa073967d)
This PR adds the iperf bandwidth value for kata metrics.
Fixes#7924
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 615c1cbf19)
This PR ensures that docker is running as part of the init_env function
in kata metrics to avoid failures like docker is not running and making
the kata metrics CI to fail.
Fixes#7898
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit d53eb73eec)
This PR adds the Cassandra Metrics documentation for kata metrics.
Fixes#7922
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit ad08321b83)
FIO test is showing ongoing issues when running in k8s.
Working on running FIO on the ctr client which has been
shown to be stable.
Fixes: #7920
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
(cherry picked from commit a58ea66592)
This will help us to make sure that the failure is actually related to
Kata Containers.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f536ef5ce1)
There's no reason to wait till the payload is created to run the tests,
as we rely on the tarball, not on the kata-deploy payload.
That was a mistake on my side, and that's already fixed for the nerdctl
tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c83f167c59)
Let's add a very basic sanity test to check that we can spawn a
containers using nerdctl + Kata Containers.
This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.
In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.
Fixes: #7911
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 12d833d07d)
Let's add a very basic sanity test to check that we can spawn a
containers using docker + Kata Containers.
This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.
For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912
In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.
Fixes: #7910
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 348b8644d6)
As, regardless of what's mentioned in the documentation, it seems that
$GITHUB_REF_NAME is passed down as a literal string.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f811b064ca)
The ones for the payload-after-push.yamland ci-nightly.yaml are not that
much important right now, but they're needed for when we start running
those on stable branches as well.
The other ones were missed during
bd24afcf73.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6d795c089e)
We missed those one as part of bd24afcf73.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8509c31870)
Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation.
Fixes#7894
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit 060499dcae)
There's no need to keep curl there after the kubectl binary has already
been downloaded.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 8b4a0b368f)
Similarly to what's been done for x86_64 -> amd64, we need to do a
aarch64 -> arm64 change in order to be able to download the kubectl
binary.
Fixes: #7861
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 139c7f03ab)
We're changing what's been done as part of ac939c458c, as we've
notcied issues using `github.event.pull_request.merge_commit_sha`.
Basically, whenever a force-push would happen, the reference of
merge_commit_sha wouldn't be updated, leading us to test PRs with the
old code. :-/
In order to get the rebase properly working, we need to ensure we pull
the hash of the commit as part of checkout action, and ensure
fetch-depth is set to 0.
Fixes: #7414
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit bd24afcf73)
This will make our image smaller, and still ensure it's multi-arch
support.
Fixes: #7861
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 670a8e9c73)
The test is failing, and an issue has been opened to track it.
For now, let's skip it.
Issue:
https://github.com/kata-containers/kata-containers/issues/7873
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9d74b7ccc9)
Conflicts:
tests/integration/kubernetes/k8s-pod-quota.bats