Last but not least, add the containerd shim configuration
pointing to the correct configuration-<shim>.toml.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We need to set hotplug on the PCI root port and enable at least one
root port. Also set the guest-hooks-dir to the correct path.
Fixes: #6675
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
On some systems a GPU is in an IOMMU group with a PCI Bridge and
PCI Host Bridge. By default no PCI Bridge needs to be passed through.
When scanning the IOMMU group, ignore devices with a 0x60 class ID prefix.
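A minimal sketch of the filter described above, written in Python rather than the runtime's own language. It assumes each device's PCI class is available as a hex string like the one sysfs exposes in the device's `class` file. The function name and the sample addresses are hypothetical. The text above names the prefix as 0x60; sysfs prints bridges with base class 06 (for example 0x060400 for a PCI-to-PCI bridge), so the sketch matches on "0x06". Treat the exact prefix as an assumption.

```python
# Illustrative only: names and sample data are hypothetical.
# Assumption: bridges carry PCI base class 06h, printed by sysfs as "0x06....".
BRIDGE_CLASS_PREFIX = "0x06"


def passthrough_candidates(group_devices):
    """Return the devices of an IOMMU group that are not bridges.

    group_devices maps a PCI address to its class string, e.g. "0x030200".
    """
    return [
        bdf
        for bdf, pci_class in group_devices.items()
        if not pci_class.lower().startswith(BRIDGE_CLASS_PREFIX)
    ]


group = {
    "0000:00:01.0": "0x060400",  # PCI-to-PCI bridge, skipped
    "0000:01:00.0": "0x030200",  # GPU (3D controller), kept
}
print(passthrough_candidates(group))
```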
Fixes: #6663
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
With each release, make sure we ship a GPU and TEE enabled kernel.
This adds tdx-experimental kernel support.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The beauty of GHA not allowing us to easily test changes in the yaml
files as part of the PR has hit us again. :-/
The correct path for the k3s deployment is
tools/packaging/kata-deploy/kata-deploy/overlays/k3s instead of
tools/packaging/kata-deploy/kata-deploy/overlay/k3s.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDVF caching is not working as the tarball name is incorrect. The result
expected is kata-static-tdvf.tar.xz, but it's looking for
kata-static-tdx.tar.xz.
This happens because logic to convert tdx -> tdvf was added to the
build scripts, but I missed doing the same in the caching
scripts.
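The mismatch can be illustrated with a small sketch. The `kata-static-<name>.tar.xz` pattern comes from the tarball names above; the function name, and the idea of one helper shared by both sides, are assumptions and not the actual caching script.

```python
# Illustrative only: cached_tarball_name is a hypothetical helper.
def cached_tarball_name(component):
    # The build side publishes the TDX firmware under the name "tdvf",
    # so the caching side has to apply the same conversion before lookup.
    name = "tdvf" if component == "tdx" else component
    return f"kata-static-{name}.tar.xz"


print(cached_tarball_name("tdx"))
```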
Fixes: #6669
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDX QEMU caching is not working as expected, as we're checking for its
version looking at "assets.hypervisor.${QEMU_FLAVOUR}.version", which is
correct for standard QEMU. However, for TDX QEMU we should be checking
for "assets.hypervisor.${QEMU_FLAVOUR}.tag".
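A rough sketch of the key selection, assuming the caller already knows whether the flavour is the TDX one. The function name and the flavour strings in the usage lines are hypothetical; only the two key suffixes come from the description above.

```python
# Illustrative only: qemu_version_key and the flavour names are hypothetical.
def qemu_version_key(qemu_flavour, is_tdx):
    # Standard QEMU is pinned by ".version"; TDX QEMU is pinned by ".tag".
    field = "tag" if is_tdx else "version"
    return f"assets.hypervisor.{qemu_flavour}.{field}"


print(qemu_version_key("qemu", is_tdx=False))
print(qemu_version_key("qemu-tdx", is_tdx=True))
```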
Fixes: #6668
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When testing on AKS, we've been hitting the dial_timeout every now and
then. Let's increase it to 45 seconds (instead of 30) for all the VMMs,
and to 60 seconds in case of TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The agent now offloads cgroup configuration to systemd when
possible. This requires enabling D-Bus in order to communicate
with systemd.
Fixes: #6657
Signed-off-by: Greg Kurz <groug@kaod.org>
Booting up TDX takes more time than booting up a normal VM. Those
values are already being used as part of the CCv0 branch, and we're just
bringing them to the `main` branch as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure the node is ready after the CRI Engine restart, otherwise
we may proceed and scripts may simply fail if they try to deploy a pod
while the CRI Engine is not yet restarted (and, consequently, the node
is not Ready).
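The wait can be sketched as a generic polling loop. This is not the actual script: the function name, the default timeout and interval, and the injected `is_ready` check are all assumptions made for illustration.

```python
import time


# Illustrative only: wait_until_ready and its defaults are hypothetical.
def wait_until_ready(is_ready, timeout=300.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            # Give up so the caller can fail early instead of deploying
            # a pod onto a node that is not Ready.
            return False
        sleep(interval)
```

In a real script, `is_ready` would wrap whatever check reports the node's Ready condition after the CRI Engine restart; the sketch leaves that part to the caller.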
Related: #6649
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
readinessProbe will help us to only have the kata-deploy pod marked as
Ready when it finishes all the needed configurations in the node.
Related: #6649
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As TEEs cannot hotplug memory / CPU, we *must* consider the default
values for those as part of the podOverhead.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the TDX machine is using k3s, let's make sure we're deploying
kata-deploy using the k3s overlay.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We must ensure that no kata-deploy is left behind after the tests
finish, otherwise it may interfere with the next run.
Fixes: #6647
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The socket file for shim management is created in /run/kata
and isn't deleted after the container is stopped. After
running and stopping thousands of containers, the /run folder
will run out of space.
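The leak and its cleanup can be sketched in Python, assuming a Unix domain socket bound to a path. Closing the socket does not remove the file, so the path has to be unlinked explicitly. The function name and the file name are hypothetical, and a temporary directory stands in for /run/kata.

```python
import os
import socket
import tempfile


# Illustrative only: stop_management_server is a hypothetical helper.
def stop_management_server(server, socket_path):
    server.close()
    # close() leaves the socket file on disk, so remove it explicitly.
    try:
        os.unlink(socket_path)
    except FileNotFoundError:
        pass  # already gone, nothing to clean up


with tempfile.TemporaryDirectory() as run_dir:
    path = os.path.join(run_dir, "shim.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # creates the socket file
    stop_management_server(server, path)
    print(os.path.exists(path))
```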
Fixes: #6622
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Co-authored-by: Greg Kurz <groug@kaod.org>
If conf_guest is set, we need to update CONFIG_LOCALVERSION
to match the suffix created in install_kata,
-nvidia-gpu-{snp|tdx}. The Linux headers will be named the very
same when built with make deb-pkg for TDX or SNP.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
1. When nerdctl sets up the network for kata, nerdctl does not create a
netns, so kata needs to create the netns on its own.
2. After the VM starts, nerdctl calls the CNI plugin via an OCI hook, so we
need to rescan the netns after the interfaces have been created and hotplug
the network devices into the VM.
Fixes: #4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Now that we've added a TDX capable external runner, let's make sure we
also run the basic tests using TDX.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we configure containerd for the kata-qemu-tdx handler
and ship the kata-qemu-tdx runtime class for kubernetes.
Fixes: #6537
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the QEMU configuration for TDX differs quite a lot from the normal
QEMU configuration, let's add a new configuration file for the QEMU TDX.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Since TDX doesn't support read-only memslots, TDVF cannot be mapped as a
pflash device; it actually works as RAM. The "-bios" option is therefore
used to load TDVF.
OVMF is the open source firmware that implements TDVF support. Thus
the command line to specify and load TDVF is ``-bios OVMF.fd``.
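A small sketch of how the firmware arguments might be chosen. The `-bios` form comes from the description above. The pflash branch is an assumption about how a non-TDX guest commonly loads OVMF and is not taken from this change. The function name is hypothetical.

```python
# Illustrative only: firmware_args is a hypothetical helper.
def firmware_args(firmware_path, tdx):
    if tdx:
        # TDX has no read-only memslots, so the firmware is loaded as RAM.
        return ["-bios", firmware_path]
    # Assumption: a non-TDX guest maps the firmware as read-only pflash.
    return ["-drive", f"if=pflash,format=raw,readonly=on,file={firmware_path}"]


print(firmware_args("OVMF.fd", tdx=True))
```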
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>