Commit Graph

45 Commits

Author SHA1 Message Date
Gabi Beyer
5f0799f1b7 vc: add rootless dir to path variables
Modify some path variables to be functions that return the path
with the rootless directory prefix if running rootlessly.

Fixes: #1827

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-09-26 16:17:16 +02:00
Julio Montes
f2f09230ee virtcontainers: rename kataVSOCK type and move it into the types package
Rename kataVSOCK to VSock and move it into the types package, this way it can
be accessible by other subpackages. This change is required because in next
commits the socket address and type (socket, vsock, hybrid vsock) will be
hypervisor specific.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-09-19 11:25:11 -05:00
Alice Frosi
23e607314e virtcontainers: Move bridge var in qemu type
In this way it is possible to set bridge variable for each arch when
instantiating the hypervisor.

Fixes: #1153

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Co-authored-by: Jan Schintag <jan.schintag@de.ibm.com>
2019-09-02 14:32:03 +02:00
Peng Tao
0075bf85ba hypervisor: allow to return a slice of pids
so that for qemu, we can save and export virtiofsd pid,
and put it to the same cgroup as the qemu process.

Fixes: #1972
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-21 11:37:01 +08:00
Peng Tao
aebc49692b qemu: fix memory prealloc option handling
Memory preallocation is just a property that hugepage, file backed
memory and memory-backend-ram can each choose to configure.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-16 12:58:25 +00:00
Julio Montes
f2423e7d7c virtcontainers: convert virtcontainers tests to testify/assert
Convert virtcontainers tests to testify/assert to make the virtcontainers
tests more readable.

fixes #156

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-07-19 15:28:45 +00:00
Peng Tao
d14968b66a qemu: use x-ignore-shared to implement vm template
qemu upstream has x-ignore-shared that works similar
to our private bypass-shared-memory. We can use it to
implement the vm template feature.

Fixes: #1798
Depends-on: github.com/kata-containers/packaging#641
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-17 05:37:22 -07:00
Manohar Castelino
78ea50c36c virtcontainers: Jailer: Add jailer support for firecracker
Firecracker provides a jailer to constrain the VMM. Use this
jailer to launch the firecracker VMM instead of launching it
directly from the kata-runtime.

The jailer will ensure that the firecracker VMM will run
in its own network and mount namespace. All assets required
by the VMM have to be present within these namespaces.
The assets need to be copied or bind mounted into the chroot
location setup by jailer in order for firecracker to access
these resouces. This includes files, device nodes and all
other assets.

Jailer automatically sets up the jail to have access to
kvm and vhost-vsock.

If a jailer is not available (i.e. not setup in the toml)
for a given hypervisor the runtime will act as the jailer.

Also enhance the hypervisor interface and unit tests to
include the network namespace. This allows the hypervisor
to choose how and where to lauch the VMM process, vs
virtcontainers directly launching the VMM process.

Fixes: #1129

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2019-07-11 21:32:36 +00:00
James O. D. Hunt
fcf9f9f6dd test: Fix fd leak causing test error
Update the `TestQemuAddDeviceKataVSOCK` test so that it:

- Doesn't hard-code the file descriptor number.
- Cleans up after itself.

The latter issue was causing an odd error similar to the following in
the test output:

```
Unable to launch /tmp/vc-tmp-526112270/hypervisor: fork/exec /tmp/vc-tmp-526112270/hypervisor: permission denied
```

Partially fixes: #1835.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2019-07-05 11:11:52 +01:00
Ganesh Maharaj Mahalingam
a41894da18 runtime: Enable file based backend
A file based memory backend mapped to the host, fot eg: '/dev/shm' will
be used by virtio-fs for performance reasons. This change is a generic
implementation of that for kata. This will be enabled default for
virtio-fs negating the need to enable hugepages in that scenario. This
option can be used without virtio-fs by setting 'file_mem_backend' to
the location in the configuration file. Default value is an empty
string.

Fixes: #1656
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
2019-05-23 20:47:42 -07:00
Penny Zheng
47670fcf73 memoryDevice: reconstruct memoryDevice
If kata-runtime supports memory hotplug via probe interface, we need to reconstruct
memoryDevice to store relevant status, which are addr and probe. addr specifies the
physical address of the memory device, and probe determines it is hotplugged via
acpi-driven or probe interface.

Fixes: #1149

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2019-04-04 17:03:20 +08:00
Archana Shinde
57b103a81b vsock: Pass info about vsock being used or not to the agent.
Instead of the agent trying to determine if a serial
or vsock channel is used, pass this information explicitly
as a kernel command line option.

Fixes #1457

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-04-02 09:48:10 -07:00
Ace-Tang
502fdab75e test: add test for addDeviceToBridge
add test for addDeviceToBridge in three case
1. addDeviceToBridge successful
2. fail cause no more available bridge slot
3. fail cause state.bridge == 0

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-03-12 09:18:03 +08:00
Hui Zhu
90704c8bb6 VMCache: the core and the client
VMCache is a new function that creates VMs as caches before using it.
It helps speed up new container creation.
The function consists of a server and some clients communicating
through Unix socket.  The protocol is gRPC in protocols/cache/cache.proto.
The VMCache server will create some VMs and cache them by factory cache.
It will convert the VM to gRPC format and transport it when gets
requestion from clients.
Factory grpccache is the VMCache client.  It will request gRPC format
VM and convert it back to a VM.  If VMCache function is enabled,
kata-runtime will request VM from factory grpccache when it creates
a new sandbox.

VMCache has two options.
vm_cache_number specifies the number of caches of VMCache:
unspecified or == 0   --> VMCache is disabled
> 0                   --> will be set to the specified number
vm_cache_endpoint specifies the address of the Unix socket.

This commit just includes the core and the client of VMCache.

Currently, VM cache still cannot work with VM templating and vsock.
And just support qemu.

Fixes: #52

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2019-03-08 10:05:59 +08:00
Samuel Ortiz
fad23ea54e virtcontainers: Conversion to Stores
We convert the whole virtcontainers code to use the store package
instead of the resource_storage one. The resource_storage removal will
happen in a separate change for a more logical split.

This change is fairly big but mostly does not change the code logic.
What really changes is when we create a store for a container or a
sandbox. We now need to explictly do so instead of just assigning a
filesystem{} instance. Other than that, the logic is kept intact.

Fixes: #1099

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-02-07 00:59:29 +01:00
Samuel Ortiz
799ac6edf6 virtcontainers: Reduce qemu test noise
We only need to set the context before calling into qemu's API.

Fixes: #1211

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-02-05 16:33:06 +01:00
Samuel Ortiz
b25f43e865 virtcontainers: Add Capabilities to the types package
In order to move the hypervisor implementations into their own package,
we need to put the capabilities type into the types package.

Fixes: #1119

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-14 20:30:06 +01:00
Samuel Ortiz
763bf18daa virtcontainers: Remove the hypervisor init method
We always combine the hypervisor init and createSandbox, because what
we're trying to do is simply that: Set the hypervisor and have it create
a sandbox.

Instead of keeping a method with vague semantics, remove init and
integrate the actual hypervisor setup phase into the createSandbox one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 19:37:20 +01:00
Samuel Ortiz
b05dbe3886 runtime: Convert to the new internal types package
We can now remove all the sandbox shared types and convert the rest of
the code to using the new internal types package.

This commit includes virtcontainers, cli and containerd-shim changes in
one atomic change in order to not break bisect'ibility.

Fixes: #1095

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 14:43:33 +01:00
Peng Tao
bf1a5ce000 sandbox: cleanup sandbox if creation failed
This includes cleaning up the sandbox on disk resources,
and closing open fds when preparing the hypervisor.

Fixes: #1057

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-12-21 13:46:16 +08:00
Jose Carlos Venegas Munoz
d4586d4bcc test: remove TestHotplugRemoveMemory
HotplugRemoveMemory require to do a qmp call, but
unit test does not start a Qemu instance.

Depends-on: github.com/kata-containers/tests#1007

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-12-13 16:33:35 -06:00
Alice Frosi
deb6f16d82 virtcontainers: update context id of vsock to uint64
The CID of VSock needs to be change to uint64. Otherwise that leads to
an endianess issue. For more details see
https://github.com/kata-containers/runtime/issues/947

Remove the uint64 introduced by #984

Fixes: #958

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-12-06 10:13:30 +00:00
Sebastien Boeuf
018c8c1468 vendor: Update govmm vendoring
Shortlog:

f9b31c0 qemu: Allow disable-modern option from QMP
d617307 Run tests for the s390x build
b36b5a8 Contributors: Add Clare Chen to CONTRIBUTORS.md
b41939c Contributors: Add my name
dab4cf1 qmp: Add tests
5ea6da1 Verify govmm builds on s390x
ee75813 contributors: add my name
c80fc3b qemu: Add s390x support
ca477a1 Update source file headers
e68e005 Update the CONTRIBUTING.md
2b7db54 Add the CONTRIBUTORS.md file
b3b765c qemu: test Valid for Vsock for Context ID
3becff5 qemu: change of ContextID from uint32 to uint64
f30fd13 qmp: Output error detail when execute QMP command failed
7da6a4c qmp: fix mem-path properties for hotplug memory.
e4892e3 qemu/qmp: preparation for s390x support
110d2fa qemu/qmp: add new function ExecuteBlockdevAddWithCache
a0b0c86 qmp_test: Change QMP version from 2.6 to 2.9
10c36a1 qemu: add support for pidfile option

Fixes #983

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-06 00:13:15 -08:00
Alice Frosi
d73f27c612 test: set arch for test TestHotplugRemoveMemory
The arch field needs to be set

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-11-19 11:21:14 +00:00
Jose Carlos Venegas Munoz
1f5792ecbb test: fix unit test nil pointer.
Add filesystem to qemu object.
Fix mock_hypervisor

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-10-02 15:58:08 -05:00
Jose Carlos Venegas Munoz
19801bf784 config: Add Memory slots configuration.
Add configuration to decide the amount of slots that will be used in a VM

- This will limit the amount of times that memory can be hotplugged.
- Use memory slots provided by user.
- tests: aling struct

cli: kata-env: Add memory slots info.

- Show the slots to be added to the VM.

```diff
[Hypervisor]
  MachineType = "pc"
  Version = "QEMU ..."
  Path = "/opt/kata/bin/qemu-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  Msize9p = 8192
+  MemorySlots = 10
  Debug = false
  UseVSock = false
```

Fixes: #751

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-09-21 10:57:00 -05:00
Peng Tao
a1537a5271 hypervisor: rename DefaultVCPUs and DefaultMemSz
Now that we only use hypervisor config to set them, they
are not overridden by other configs. So drop the default prefix.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-09-06 21:04:56 +08:00
Peng Tao
ce288652d5 virtcontainers: remove sandboxConfig.VMConfig
We can just use hyprvisor config to specify the memory size
of a guest. There is no need to maintain the extra place just
for memory size.

Fixes: #692

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-09-06 14:15:56 +08:00
James O. D. Hunt
d0679a6fd1 tracing: Add tracing support to virtcontainers
Add additional `context.Context` parameters and `struct` fields to allow
trace spans to be created by the `virtcontainers` internal functions,
objects and sub-packages.

Note that not every function is traced; we can add more traces as
desired.

Fixes #566.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-22 08:24:58 +01:00
James O. D. Hunt
fc0142ec8e Merge pull request #527 from jodh-intel/remove-initcall-debug-kernel-option
kernel: Remove initcall_debug boot option
2018-08-01 12:50:52 +01:00
James O. D. Hunt
a8f5e2becf kernel: Remove initcall_debug boot option
Remove the `initcall_debug` boot option from the kernel command-line as
we don't need it any more and it generates a ton of boot messages that
may well be impacting performance.

Fixes #526.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-01 09:52:13 +01:00
James O. D. Hunt
487f9efa57 Merge pull request #536 from bergwolf/qmp_clear
qemu: clear qmp state before wait for qemu process
2018-08-01 09:51:43 +01:00
Julio Montes
052769196d virtcontainers: implement function to cold plug vsocks
`appendVSockPCI` function can be used to cold plug vocks, vhost file descriptor
holds the context ID and it's inherit by QEMU process, ID must be unique and
disable-modern prevents qemu from relying on fast MMIO.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-07-31 13:52:44 -05:00
Peng Tao
44a3a441aa qemu: wait on disconnected channel in qmp shutdown
That is how govmm ensures us that the qmp channel has been cleaned
up entirely.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-31 18:34:37 +08:00
Peng Tao
28b6104710 qemu: prepare for vm templating support
1. support qemu migration save operation
2. setup vm templating parameters per hypervisor config
3. create vm storage path when it does not exist. This can happen when
an empty guest is created without a sandbox.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-19 12:44:58 +08:00
Peng Tao
18e6a6effc hypervisor: decouple hypervisor from sandbox
A hypervisor implementation does not need to depend on a sandbox
structure. Decouple them in preparation for vm factory.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-19 10:49:25 +08:00
Peng Tao
66a3e812f2 hypervisor/qemu: add memory hotplug support
So that we can add more memory to an existing guest.

Fixes: #469

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-09 15:29:50 +08:00
Julio Montes
07db945b09 virtcontainers/qemu: reduce memory footprint
There is a relation between the maximum number of vCPUs and the
memory footprint, if QEMU maxcpus option and kernel nr_cpus
cmdline argument are big, then memory footprint is big, this
issue only occurs if CPU hotplug support is enabled in the kernel,
might be because of kernel needs to allocate resources to watch all
sockets waiting for a CPU to be connected (ACPI event).

For example

```
+---------------+-------------------------+
|               | Memory Footprint (KB)   |
+---------------+-------------------------+
| NR_CPUS=240   | 186501                  |
+---------------+-------------------------+
| NR_CPUS=8     | 110684                  |
+---------------+-------------------------+
```

In order to do not affect CPU hotplug and allow to users to have containers
with the same number of physical CPUs, this patch tries to mitigate the
big memory footprint by using the actual number of physical CPUs as the
maximum number of vCPUs for each container if `default_maxvcpus` is <= 0 in
the runtime configuration file,  otherwise `default_maxvcpus` is used as the
maximum number of vCPUs.

Before this patch a container with 256MB of RAM

```
              total        used        free      shared  buff/cache   available
Mem:           195M         40M        113M         26M         41M        112M
Swap:            0B          0B          0B
```

With this patch

```
              total        used        free      shared  buff/cache   available
Mem:           236M         11M        188M         26M         36M        186M
Swap:            0B          0B          0B
```

fixes #295

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-05-14 17:33:31 -05:00
James O. D. Hunt
bce9edd277 socket: Enforce socket length
A Unix domain socket is limited to 107 usable bytes on Linux. However,
not all code creating socket paths was checking for this limits.

Created a new `utils.BuildSocketPath()` function (with tests) to
encapsulate the logic and updated all code creating sockets to use it.

Fixes #268.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-05-09 11:36:24 +01:00
Sebastien Boeuf
ea789dbab9 Merge pull request #207 from amshinde/msize-9p
Add configuration for 9p msize
2018-04-18 11:20:44 -07:00
Graham whaley
d6c3ec864b license: SPDX: update all vc files to use SPDX style
When imported, the vc files carried in the 'full style' apache
license text, but the standard for kata is to use SPDX style.
Update the relevant files to SPDX.

Fixes: #227

Signed-off-by: Graham whaley <graham.whaley@intel.com>
2018-04-18 13:43:15 +01:00
Archana Shinde
3187a98188 9p: Add hypervisor configuration for 9p msize
This allows msize option for 9p to be configured and tuned.

Fixes #206

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2018-04-17 12:15:23 -07:00
Peng Tao
6107694930 runtime: rename pod to sandbox
As agreed in [the kata containers API
design](https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md),
we need to rename pod notion to sandbox. The patch is a bit big but the
actual change is done through the script:
```
sed -i -e 's/pod/sandbox/g' -e 's/Pod/Sandbox/g' -e 's/POD/SB/g'
```

The only expections are `pod_sandbox` and `pod_container` annotations,
since we already pushed them to cri shims, we have to use them unchanged.

Fixes: #199

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-04-13 09:32:51 +08:00
Peng Tao
4f57b65147 hypervisor: add initrd image support
If an initrd image is configured in HypervisorConfig or passed in by
annotations, append it to qemu command line arguments.

Fixes: #97

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-03-27 15:58:41 +08:00
Samuel Ortiz
24eff72d82 virtcontainers: Initial import
This is a virtcontainers 1.0.8 import into Kata Containers runtime.

virtcontainers is a Go library designed to manage hardware virtualized
pods and containers. It is the core Clear Containers framework and will
become the core Kata Containers framework, as discussed at
https://github.com/kata-containers/runtime/issues/33

Some more more pointers:

virtcontainers README, including some design and architecure notes:
https://github.com/containers/virtcontainers/blob/master/README.md

virtcontainers 1.0 API:
https://github.com/containers/virtcontainers/blob/master/documentation/api/1.0/api.md

Fixes #40

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2018-03-13 00:49:46 +01:00