Commit Graph

32 Commits

Author SHA1 Message Date
Peng Tao
32fd013716 runtime: run prestart hooks before starting VM for FC
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
to work, we need to run the prestart hooks before starting new VMs.

Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2023-08-30 02:52:01 +00:00
Peng Tao
581be92b25
Merge pull request #4492 from zvonkok/pcie-topology
runtime: fix PCIe topology for GPUDirect use-case
2023-07-03 09:17:12 +08:00
Fabiano Fidêncio
6a21e20c63 runtime: Add "none" as a shared_fs option
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.

This *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.

However, there are still use-cases users can live with just copying
those files into the guest at the pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.

For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.

Fixes: #7207

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-30 20:45:00 +02:00
Zvonko Kaiser
de39fb7d38 runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology
Fixes: #4491

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Jakob Naucke
b546eca26f runtime: Generalize VFIO devices
Generalize VFIO devices to allow for adding AP in the next patch.
The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed.

The rationale for the revomal is:

- VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI
- Logic of checking a subtype of mediated device is included in GetVFIODeviceType()
- VFIOPciDeviceMediatedType() can simply fulfill the device addition based
on a type categorized by GetVFIODeviceType()

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 10:06:37 +09:00
Bin Liu
1dfd845f51 runtime: go fix code for 1.19
We have starting to use golang 1.19, some features are
not supported later, so run `go fix` to fix them.

Fixes: #5750

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-25 11:29:18 +08:00
Eric Ernst
f9e96c6506 runtime: device: move to top level package
Let's move device package to runtime/pkg instead of being buried under
virtcontainers.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-06-26 21:31:29 -07:00
Snir Sheriber
c67b9d2975 qemu: allow using legacy serial device for the console
This allows to get guest early boot logs which are usually
missed when virtconsole is used.
- It utilizes previous work on the govmm side:
https://github.com/kata-containers/govmm/pull/203
- unit test added

Fixes: #4237
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-17 12:06:11 +03:00
Snir Sheriber
44814dce19 qemu: treat console kernel params within appendConsole
as it is tightly coupled with the appended console device
additionally have it tested

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-05-17 12:05:31 +03:00
Jakob Naucke
eda8ea154a
runtime: Gofmt fixes
- Mostly blank lines after `+build` -- see
  https://pkg.go.dev/go/build@go1.14.15 -- this is, to date, enforced by
  `gofmt`.
- 1.17-style go:build directives are also added.
- Spaces in govmm/vmm_s390x.go

Fixes: #3769
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-02-28 17:24:47 +01:00
Samuel Ortiz
fa0e9dc6b1 virtcontainers: Make all Linux VMMs only build on Linux
Some of them (e.g. QEMU) can run on other OSes (e.g. Darwin) but the
current virtcontainers implementation is Linux specific.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-16 19:07:34 +01:00
Samuel Ortiz
b28d0274ff virtcontainers: Make max vCPU config less QEMU specific
Even though it's still actually defined as the QEMU upper bound,
it's now abstracted away through govmm.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-16 19:06:32 +01:00
Fabiano Fidêncio
ec6655af87 govmm: Use govmm from our own pkg
Let's stop using govmm from kata-containers/govmm and let's start using
it from our own repo.

Fixes: #3495

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-19 18:02:46 +01:00
bin
03546f75a6 runtime: change io/ioutil to io/os packages
Change io/ioutil to io/os packages because io/ioutil package
is deprecated from 1.16:

Discard => io.Discard
NopCloser => io.NopCloser
ReadAll => io.ReadAll
ReadDir => os.ReadDir
ReadFile => os.ReadFile
TempDir => os.MkdirTemp
TempFile => os.CreateTemp
WriteFile => os.WriteFile

Details: https://go.dev/doc/go1.16#ioutil

Fixes: #3265

Signed-off-by: bin <bin@hyper.sh>
2021-12-15 07:31:48 +08:00
Eric Ernst
1e7cb4bc3a macvlan: drop bridged part of name
The fact that we need to "bridge" the endpoint is a bit irrelevant. To
be consistent with the rest of the endpoints, let's just call this
"macvlan"

Fixes: #3050

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-16 16:44:29 -08:00
David Gibson
8bbcb06af5 qemu: Disable SHPC hotplug
Under certain circumstances[0] Kata will attempt to use SHPC hotplug
for PCI devices on the guest.  In fact we explicitly enable SHPC on
our PCI to PCI bridges, regardless of the qemu default.

SHPC was designed a long, long time ago for physical hotplugging and
works very poorly for a virtual environment. In particular it has a
mandatory 5s delay to allow a (real, human) operator to back out the
operation if they press a button by mistake. This alone makes it
unusable for a fast start up application like Kata.

Worse, the agent forces a PCI rescan during startup.  That will race
with the SHPC hotplug operation causing the device to go into a bad
state where config space can't be accessed from the guest at all.

The only reason we've sort of gotten away with this is that our
default guest kernel configuration triggers what's arguably a kernel
bug effectively disabling SHPC.  That makes the agent rescan the only
reason we see the new device.

Now that we require a qemu >=6.1, which includes ACPI PCI hotplug on
the q35 machine, we can explicitly disable SHPC in all cases.  It's
nothing but trouble.

fixes #2174

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-23 10:27:26 +10:00
Peng Tao
a9de761d71 runtime: drop qemu-lite support
As the project is not maintained and we have not been testing against it
for a long time.

Fixes: #2529
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-08-30 16:58:12 +08:00
Marcel Apfelbaum
789a59549e virtcontainers: Remove the pc machine
Keeping around two different x86 machines has no added value
and require more tests and maintenance. Prefer the q35 machine
since it has more features and drop the pc machine.

Fixes #1953
Depends-on: github.com/kata-containers/tests#3586
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2021-06-22 11:54:07 +03:00
Jakob Naucke
3ee61776d6
virtcontainers: Enable virtio-fs on s390x
Allow and configure vhost-user-fs devices (virtio-fs) on s390x. As a
consequence, appendVhostUserDevice now takes a context, which affects
its signature for other architectures.

Fixes: #1753

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-04-29 09:54:08 +02:00
Jakob Naucke
adba4532a4
virtcontainers: Revert "virtcontainers: Allow s390x appendVhostUserDevice"
This reverts commit 7f60911333.
Patch allowed other vhost user devices besides FS not supported on s390x
and failed to attach a CCW device number, which results in the
inavailability to use more devices after vhost-user-fs-ccw.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-04-29 09:43:33 +02:00
Jakob Naucke
7f60911333
virtcontainers: Allow s390x appendVhostUserDevice
Remove the prohibition of vhost-user devices on s390x, which are by now
supported (e.g. vhost-user-fs-ccw). As a consequence,
appendVhostUserDevice no longer needs an error in its signature.
This enables virtio-fs support on s390x.

Fixes: #1469

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-04-20 12:20:32 +02:00
Jakob Naucke
31ced01eba
virtcontainers: Fix missing contexts in s390x
#1389 has added a context for many signatures to improve trace spans.
Functions specific to s390x lack this. Add context where required. This
affects some common code signatures, since some functions that do not
require context on other architectures do require it on s390x.
Also remove an unnecessary import in test_qemu_s390x.go.

Fixes: #1562

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-03-29 17:49:27 +02:00
Chelsea Mafrica
4bf84b4b2f runtime: Add contexts to calls in unit tests
Modify calls in unit tests to use context since many functions were
updated to accept local context to fix trace span ordering.

Fixes #1355

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2021-03-16 17:39:28 -07:00
bin liu
5b065eb599 runtime: change govmm package
Change govmm package name from github.com/intel/govmm
to github.com/kata-containers/govmm

Fixes: #859

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-22 21:27:49 +08:00
Jia He
da79b4be67 virtcontainers: Append max_ports to virtio-serial device
Allow API consumers to change the maximum number of ports in the
virtio-serial devices, setting a lower number of ports can improve the
boot time and reduce the attack surface.

Before this patch on arm64:
[    0.028664] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.055031] printk: console [hvc0] enabled

After this patch on arm64:
[    0.028484] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.031370] printk: console [hvc0] enabled

Fixes: #2676
Signed-off-by: Jia He <justin.he@arm.com>
2020-10-16 23:40:54 +08:00
Julio Montes
3c415d93fe virtcontainers: 9p: shares multiple devices with only one export
Use 'remap' behaviour to deal with multiple devices being shared with
a 9p export.

Fixes the following warning:

```
9p: Multiple devices detected in same VirtFS export, which might lead to file
ID collisions and severe misbehaviours on guest!
You should either use a separate export for each device shared from host or
use virtfs option 'multidevs=remap'!
```

fixes #378

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-07-27 10:18:18 -05:00
Penny Zheng
1099a28830 kata 2.0: delete use_vsock option and proxy abstraction
With kata containers moving to 2.0, (hybrid-)vsock will be the only
way to directly communicate between host and agent.
And kata-proxy as additional component to handle the multiplexing on
serial port is also no longer needed.
Cleaning up related unit tests, and also add another mock socket type
`MockHybridVSock` to deal with ttrpc-based hybrid-vsock mock server.

Fixes: #389

Signed-off-by: Penny Zheng penny.zheng@arm.com
2020-07-16 04:20:02 +00:00
David Gibson
ea1d799f79 qemu: Only one element of qemuPaths map is relevant
The qemuPaths field in qemuArchBase maps from machine type to the default
qemu path.  But, by the time we construct it, we already know the machine
type, so that entry ends up being the only one we care about.

So, collapse the map into a single path.  As a bonus, the qemuPath()
method can no longer fail.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:26:43 +10:00
David Gibson
5dffffd432 qemu: Remove useless table from qemuArchBase
The supportedQemuMachines array in qemuArchBase has a list of all the
qemu machine types supported for the architecture, with the options
for each.  But, the machineType field already tells us which of the
machine types we're actually using, and that's the only entry we
actually care about.

So, drop the table, and just have a single value with the machine type
we're actually using.  As a bonus that means the machine() method can
no longer fail, so no longer needs an error return.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:26:38 +10:00
Adrian Moreno
7faaa06a52 qemu: support appending a vIOMMU device
Add a new function appendIOMMU() to the qemuArch interface
and provide an implementation on amd64 architecture.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-06-22 16:37:20 +02:00
Peng Tao
6de95bf36c gomod: update runtime import path
To use the kata-containers repo path.

Most of the change is generated by script:
find . -type f -name "*.go" |xargs sed -i -e \
's|github.com/kata-containers/runtime|github.com/kata-containers/kata-containers/src/runtime|g'

Fixes: #201
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-04-29 18:39:03 -07:00
Peng Tao
a02a8bda66 runtime: move all code to src/runtime
To prepare for merging into kata-containers repository.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-04-27 19:39:25 -07:00