Commit Graph

10557 Commits

Author SHA1 Message Date
Tim Zhang
0a582f7815 trace-forwarder: remove unused crate protobuf
Remove unused crate protobuf.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
73253850e6 kata-ctl: remove unused crate ttrpc
Remove unused crate ttrpc.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
76d2e30547 agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
eb3d20dccb protocols: Add ut for Serde
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
59568c79dd protocols: add support for Serde
rust-protobuf@3 does not support Serde natively anymore.
So we need to do it by ourselves.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:21 +08:00
Tim Zhang
a6b4d92c84 runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 19:49:20 +08:00
Zvonko Kaiser
ac7c63bc66 gpu: Add containerd shim for qemu-gpu
Last but not least add the continerd shim configuration
pointing to the correct configuration-<shim>.toml

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:45:04 +00:00
Zvonko Kaiser
a0cc8a75f2 gpu: Add a kube runtime class
With the added configuration add the corresponding kube
runtime class.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:42:04 +00:00
Zvonko Kaiser
a81fff706f gpu: Adding a GPU enabled configuration
We need to set hotplug on pci root port and enable at least one
root port. Also set the guest-hooks-dir to the correct path

Fixes: #6675

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:40:09 +00:00
Tim Zhang
8af6fc77cd agent: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:41 +08:00
Tim Zhang
009b42dbff protocols: Fix unit test
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:41 +08:00
Tim Zhang
392732e213 protocols: Bump ttrpc from 0.6.0 to 0.7.1
Fixes: #6646

Signed-off-by: Tim Zhang <tim@hyper.sh>
2023-04-17 18:31:35 +08:00
Zvonko Kaiser
f4f958d53c gpu: Do not pass-through PCI (Host) Bridges
On some systems a GPU is in a IOMMU group with a PCI Bridge and
PCI Host Bridge. Per default no PCI Bridge needs to be passed-through.
When scanning the IOMMU group, ignore devices with a 0x60 class ID prefix.

Fixes: #6663

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 10:08:23 +00:00
Zvonko Kaiser
825e769483 gpu: Add GPU support to default kernel without any TEE
With each release make sure we ship a GPU enabled kernel

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 09:58:58 +00:00
Zvonko Kaiser
e4ee07f7d4 gpu: Add GPU TDX experimental kernel
With each release make sure we ship a GPU and TEE enabled kernel
This adds tdx-experimental kernel support

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-17 09:58:52 +00:00
Fabiano Fidêncio
243cb2e3af
Merge pull request #6670 from fidencio/topic/fix-caching-of-tdvf-and-tdx-qemu
cache-components: Fix caching of TDVF and QEMU for TDX
2023-04-16 09:04:04 +02:00
Fabiano Fidêncio
a1272bcf1d gha: tdx: Fix typo overlay -> overlays
The beauty of GHA not allowing us to easily test changes in the yaml
files as part of the PR has hit us again. :-/

The correct path for the k3s deployment is
tools/packaging/kata-deploy/kata-deploy/overlays/k3s instead of
tools/packaging/kata-deploy/kata-deploy/overlay/k3s.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 15:00:06 +02:00
Fabiano Fidêncio
3fa0890e5e cache-components: Fix TDVF caching
TDVF caching is not working as the tarball name is incorrect. The result
expected is kata-static-tdvf.tar.xz, but it's looking for
kata-static-tdx.tar.xz.

This happens as a logic to convert tdx -> tdvf has been added as part of
the building scripts, but I missed doing this as part of the caching
scripts.

Fixes: #6669

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 14:12:29 +02:00
Fabiano Fidêncio
80e3a2d408 cache-components: Fix TDX QEMU caching
TDX QEMU caching is not working as expected, as we're checking for its
version looking at "assets.hypervisor.${QEMU_FLAVOUR}.version", which is
correct for standard QEMU. However, for TDX QEMU we should be checking
for "assets.hypervisor.${QEMU_FLAVOUR}.tag"

Fixes: #6668

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-15 14:12:26 +02:00
Fabiano Fidêncio
fffe2c6082
Merge pull request #6648 from fidencio/topic/gha-tdx-improvements-and-fixes
gha: tdx: Ensure kata-deploy is removed after the tests run
2023-04-15 00:21:31 +02:00
Bo Chen
a819ce145f
Merge pull request #6633 from likebreath/0406/clh_v31.0
versions: Upgrade to Cloud Hypervisor v31.0
2023-04-14 13:52:19 -07:00
Zvonko Kaiser
87ea43cd4e gpu: Add configuration fragment
Adding configuration fragment for the kernel,
depending on the TEE kernel update the LOCALVERSION

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-14 07:52:51 +00:00
Zvonko Kaiser
aca6ff7289 gpu: Build and Ship an GPU enabled Kernel
With each release make sure we ship a GPU and TEE enabled kernel

Fixes: #6553

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-14 07:52:42 +00:00
Fabiano Fidêncio
dc662333df runtime: Increase the dial_timeout
When testing on AKS, we've been hitting the dial_timeout every now and
then.  Let's increase it to 45 seconds (instead of 30) for all the VMMs,
and to 60 seconfs in case of TEEs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 22:42:52 +02:00
Greg Kurz
897c0bc67e
Merge pull request #6658 from gkurz/osbuilder-dracut-dbus
osbuilder: Enable dbus in the dracut case
2023-04-13 19:03:15 +02:00
Greg Kurz
eb1762e813 osbuilder: Enable dbus in the dracut case
The agent now offloads cgroup configuration to systemd when
possible. This requires to enable D-Bus in order to communicate
with systemd.

Fixes #6657

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-04-13 14:16:50 +02:00
Greg Kurz
f9a94f8fc5
Merge pull request #6623 from UiPath/fix-no-space-device
runtime: Don't create socket file in /run/kata
2023-04-13 10:36:20 +02:00
Fabiano Fidêncio
f478b9115e clh: tdx: Update timeouts for confidential guest
Booting up TDX takes more time than booting up a normal VM.  Those
values are being already used as part of the CCv0 branch, and we're just
bringing them to the `main` branch as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
3b76abb366 kata-deploy: Ensure node is ready after CRI Engine restart
Let's ensure the node is ready after the CRI Engine restart, otherwise
we may proceed and scripts may simply fail if they try to deploy a pod
while the CRI Engine is not yet restarted (and, consequently, the node
is not Ready).

Related: #6649

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
5ec9ae0f04 kata-deploy: Use readinessProbe to ensure everything is ready
readinessProbe will help us to only have the kata-deploy pod marked as
Ready when it finishes all the needed configurations in the node.

Related: #6649

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
ea386700fe kata-deploy: Update podOverhead for TDX
As TEEs cannot hotplug memory / CPU, we *must* consider the default
values for those as part of the podOverhead.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
e31efc861c gha: tdx: Use the k3s overlay
As the TDX machine is using k3s, let's make sure we're deploying
kat-deploy using the k3s overlay.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
542bb0f3f3 gha: tdx: Set KUBECONFIG env at the job level
By doing this we avoid having to set it up on every step.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
d7fdf19e9b gha: tdx: Delete kata-deploy after the tests finish
We must ensure that no kata-deploy is left behind after the tests
finish, otherwise it may interfere with the next run.

Fixes: #6647

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
da35241a91 tests: k8s: Skip k8s-cpu-ns when testing TDX
TEEs do not support CPU / memory hotplug, thus this test must be
skipped.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Alexandru Matei
db2cac34d8 runtime: Don't create socket file in /run/kata
The socket file for shim management is created in /run/kata
and it isn't deleted after the container is stopped. After
running and stopping thousands of containers /run folder
will run out of space.

Fixes #6622
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Co-authored-by: Greg Kurz <groug@kaod.org>
2023-04-13 10:21:29 +03:00
Jianyong Wu
6d315719f0 snap: fix docker start fail issue
In Arm baseline CI, docker starts fail with error: "no sockets found via
socket activation: make sure the service was started by systemd". I find
a solusion in [1] to fix it.

[1] https://forums.docker.com/t/failed-to-load-listeners-no-sockets-found-via-socket-activation-make-sure-the-service-was-started-by-systemd/62505

Fixes: #6619
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-04-13 09:35:40 +08:00
Zhongtao Hu
328793bb27
Merge pull request #6585 from Apokleos/nydus_prefetch_files
nydus_rootfs/prefetch_files: add prefetch_files for RAFS
2023-04-12 19:58:36 +08:00
Zvonko Kaiser
e4b3b08871 gpu: Add proper CONFIG_LOCALVERSION depending on TEE
If conf_guest is set we need to update the CONFIG_LOCALVERSION
to match the suffix created in install_kata
-nvidia-gpu-{snp|tdx}, the linux headers will be named the very
same if build with make deb-pkg for TDX or SNP.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-12 11:30:59 +00:00
Zhongtao Hu
fef531f565
Merge pull request #6618 from Apokleos/virtiofs_extra_cache_mode
runtime-rs/virtio-fs: add support extra handler for cache mode.
2023-04-12 14:40:05 +08:00
Bin Liu
9327bb0912
Merge pull request #6639 from openanolis/nerdctl
runtime-rs: enable nerdctl to setup cni plugin
2023-04-12 12:04:37 +08:00
Zhongtao Hu
69ba2098f8 runtime-rs: remove network entities and netns
remove network entities and netns

Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-04-12 10:21:06 +08:00
Zhongtao Hu
b31f103d12 runtime-rs: enable nerdctl cni plugin
1. when we use nerdctl to setup network for kata, no netns is created by
nerdctl, kata need to create netns by its own

2. after start VM, nerdctl will call cni plugin via oci hook, we need to
rescan the netns after the interfaces have been created, and hotplug
the network device into the VM

Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-04-12 10:21:04 +08:00
Fabiano Fidêncio
3b3656d96d
Merge pull request #6522 from fidencio/topic/add-tdx-artefacts-from-2023ww01-to-main
tdx: Add artefacts from the latest TDX tools release into main
2023-04-11 20:43:02 +02:00
Fabiano Fidêncio
50ce33b02d
Merge pull request #6205 from fengwang666/non-root-clh
runtime: support non-root for clh
2023-04-11 19:34:00 +02:00
Fabiano Fidêncio
4751adbea1
Merge pull request #6610 from fidencio/topic/gha-run-dragonball-k8s-tests
gha: ci-on-push: Run k8s tests with dragonball
2023-04-11 18:16:14 +02:00
Fabiano Fidêncio
69d7a959c8 gha: ci-on-push: Run tests on TDX
Now that we've added a TDX capable external runner, let's make sure we
also run the basic tests using TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
5a0727ecb4 kata-deploy: Ship kata-qemu-tdx runtimeClass
Let's make sure we configure containerd for the kata-qemu-tdx handler
and ship the kata-qemu-tdx runtime class for kubernetes.

Fixes: #6537

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
98682805be config: Add configuration for QEMU TDX
As the QEMU configuration for TDX differs quite a lot from the normal
QEMU configuration, let's add a new configuration file for the QEMU TDX.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 16:10:35 +02:00
Fabiano Fidêncio
3e15800199 govmm: Directly pass the firmware using -bios with TDX
Since TDX doesn't support readonly memslot, TDVF cannot be mapped as
pflash device and it actually works as RAM. "-bios" option is chosen to
load TDVF.

OVMF is the opensource firmware that implements the TDVF support. Thus
the command line to specify and load TDVF is ``-bios OVMF.fd``

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00