Commit Graph

175 Commits

Author SHA1 Message Date
Julio Montes
34e2064b39 Merge pull request #1152 from Pennyzct/memory_hotplug
runtime: support memory hotplug via probe interface on aarch64
2019-04-08 08:43:03 -05:00
Julio Montes
c884f65a26 Merge pull request #1449 from alicefr/thread_id
s390x: not set socketID and threadID
2019-04-08 08:01:40 -05:00
Penny Zheng
47670fcf73 memoryDevice: reconstruct memoryDevice
If kata-runtime supports memory hotplug via probe interface, we need to reconstruct
memoryDevice to store relevant status, which are addr and probe. addr specifies the
physical address of the memory device, and probe determines it is hotplugged via
acpi-driven or probe interface.

Fixes: #1149

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2019-04-04 17:03:20 +08:00
Eric Ernst
5a41e5f240 Merge pull request #1458 from amshinde/pass-vsock-as-kernel-option
vsock: Pass info about vsock being used or not to the agent.
2019-04-02 16:18:41 -07:00
Archana Shinde
57b103a81b vsock: Pass info about vsock being used or not to the agent.
Instead of the agent trying to determine if a serial
or vsock channel is used, pass this information explicitly
as a kernel command line option.

Fixes #1457

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-04-02 09:48:10 -07:00
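The idea in commit 57b103a81b can be sketched as follows: the runtime appends an explicit key=value pair to the guest kernel command line, and the agent reads it back instead of probing for the channel type. This is only an illustration; the parameter name `agent.use_vsock` and both helper functions are assumptions, not taken from the commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vsockParam is an assumed parameter name used for illustration only.
const vsockParam = "agent.use_vsock"

// vsockKernelParam builds the key=value pair the runtime would append
// to the guest kernel command line.
func vsockKernelParam(useVsock bool) string {
	return vsockParam + "=" + strconv.FormatBool(useVsock)
}

// parseUseVsock scans a kernel command line for the parameter.
// The second return value reports whether a valid value was found.
func parseUseVsock(cmdline string) (bool, bool) {
	for _, field := range strings.Fields(cmdline) {
		key, value, found := strings.Cut(field, "=")
		if !found || key != vsockParam {
			continue
		}
		v, err := strconv.ParseBool(value)
		if err != nil {
			return false, false
		}
		return v, true
	}
	return false, false
}

func main() {
	cmdline := "console=hvc0 quiet " + vsockKernelParam(true)
	useVsock, ok := parseUseVsock(cmdline)
	fmt.Println(cmdline, useVsock, ok)
}
```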
Peng Tao
6fda03ec92 hypervisor: make getThreadIDs return vcpu to threadid mapping
We need such mapping information to put vcpus in container cpuset properly.

Fixes: #1435

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-02 15:51:27 +08:00
Alice Frosi
49be8ee21c s390x: not set socketID and threadID
For cpu hotplug, the options socketID and threadID are not used.

Fixes: #1448

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2019-04-01 17:29:24 +02:00
Penny Zheng
2e5194e279 linter: remove deadcode linter check for generic item
After we switched the golang linter to golangci-lint, we have an extra 'deadcode'
linter check, and we need to remove this linter check for all
generic items.

Fixes: #1432

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2019-03-28 14:05:38 +08:00
Ganesh Maharaj Mahalingam
f4428761cb lint: Update go linter from gometalinter to golangci-lint.
gometalinter is deprecated and will be archived April '19. The
suggestion is to switch to golangci-lint which is apparently 5x faster
than gometalinter.

Partially Fixes: #1377

Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
2019-03-25 08:48:13 -07:00
Archana Shinde
9f96da2014 Merge pull request #1006 from Ace-Tang/throw_error
qemu: throw error when fail to get addr from bridges
2019-03-13 14:34:24 -07:00
Li Yuxuan
4e81522571 vc:qemu: Fix id calculation of memory hotplug
QMP doesn't guarantee the order of the array that is returned by
`query-memory-devices` command. So we had better search the whole
array to find out the current max slot, rather than simply use the last
element's slot.

Fixes: #1362

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2019-03-13 16:39:31 +08:00
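A minimal sketch of the approach described in commit 4e81522571: scan every returned entry for the highest slot instead of trusting the last element. The `memoryDevice` type here is a hypothetical, trimmed-down stand-in for one entry of the `query-memory-devices` result, not the actual structure used by the runtime.

```go
package main

import "fmt"

// memoryDevice is a hypothetical stand-in for one entry of the array
// returned by the QMP `query-memory-devices` command.
type memoryDevice struct {
	ID   string
	Slot int
}

// maxSlot walks the whole slice because the order of the entries is not
// guaranteed. It returns -1 when no device is present.
func maxSlot(devs []memoryDevice) int {
	highest := -1
	for _, d := range devs {
		if d.Slot > highest {
			highest = d.Slot
		}
	}
	return highest
}

func main() {
	// The last element does not hold the highest slot here.
	devs := []memoryDevice{{"dimm2", 2}, {"dimm0", 0}, {"dimm1", 1}}
	fmt.Println("next free slot:", maxSlot(devs)+1)
}
```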
Ace-Tang
502fdab75e test: add test for addDeviceToBridge
Add tests for addDeviceToBridge covering three cases:
1. addDeviceToBridge succeeds
2. it fails because no more bridge slots are available
3. it fails because state.bridge == 0

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-03-12 09:18:03 +08:00
Ace-Tang
c7ace4b4bc qemu: throw error when fail to get addr from bridges
Return an error early when addDeviceToBridge() cannot get an empty address
from the bridges; otherwise the error is thrown by qemu, where the cause is not obvious.

Fixes: #1005

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-03-11 18:03:46 +08:00
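The early-return behaviour from commit c7ace4b4bc, together with the three test cases listed in 502fdab75e, can be sketched as below. The `bridge` type, the error values, and the function signature are simplified assumptions for illustration; the real function in the runtime differs.

```go
package main

import (
	"errors"
	"fmt"
)

// bridge is a hypothetical simplification: a named bridge with a fixed
// number of slots, where true means the slot is in use.
type bridge struct {
	id    string
	slots []bool
}

var (
	errNoBridges  = errors.New("no bridges configured")
	errNoFreeSlot = errors.New("no free slot available on any bridge")
)

// addDeviceToBridge reserves the first free slot and reports a clear error
// up front instead of letting the hypervisor fail later.
func addDeviceToBridge(bridges []*bridge) (string, int, error) {
	if len(bridges) == 0 {
		return "", -1, errNoBridges
	}
	for _, b := range bridges {
		for i, used := range b.slots {
			if !used {
				b.slots[i] = true
				return b.id, i, nil
			}
		}
	}
	return "", -1, errNoFreeSlot
}

func main() {
	bridges := []*bridge{{id: "pci-bridge-0", slots: make([]bool, 1)}}
	fmt.Println(addDeviceToBridge(bridges)) // slot reserved
	fmt.Println(addDeviceToBridge(bridges)) // bridge is full
	fmt.Println(addDeviceToBridge(nil))     // no bridges at all
}
```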
Hui Zhu
90704c8bb6 VMCache: the core and the client
VMCache is a new function that creates VMs as caches before they are used.
It helps speed up new container creation.
The function consists of a server and some clients communicating
through a Unix socket. The protocol is gRPC, defined in protocols/cache/cache.proto.
The VMCache server will create some VMs and cache them by factory cache.
It will convert a VM to gRPC format and transport it when it gets
a request from a client.
Factory grpccache is the VMCache client. It will request a gRPC format
VM and convert it back to a VM. If the VMCache function is enabled,
kata-runtime will request a VM from factory grpccache when it creates
a new sandbox.

VMCache has two options.
vm_cache_number specifies the number of caches of VMCache:
unspecified or == 0   --> VMCache is disabled
> 0                   --> will be set to the specified number
vm_cache_endpoint specifies the address of the Unix socket.

This commit just includes the core and the client of VMCache.

Currently, VM cache still cannot work with VM templating and vsock,
and it only supports qemu.

Fixes: #52

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2019-03-08 10:05:59 +08:00
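The semantics of the two VMCache options in commit 90704c8bb6 can be illustrated with a small sketch. The struct, its field names, and the socket path are hypothetical; only the option names `vm_cache_number` and `vm_cache_endpoint` and the "0 or unspecified means disabled" rule come from the commit message.

```go
package main

import "fmt"

// vmCacheConfig is a hypothetical holder for the two options.
type vmCacheConfig struct {
	Number   uint   // vm_cache_number: how many VMs to keep cached
	Endpoint string // vm_cache_endpoint: address of the Unix socket
}

// enabled reports whether VMCache is active: an unspecified (zero) or
// explicit 0 value disables it, any positive value enables it.
func (c vmCacheConfig) enabled() bool {
	return c.Number > 0
}

func main() {
	configs := []vmCacheConfig{
		{}, // nothing specified
		{Number: 3, Endpoint: "/tmp/example-vmcache.sock"}, // illustrative path
	}
	for _, c := range configs {
		fmt.Printf("number=%d enabled=%v endpoint=%q\n", c.Number, c.enabled(), c.Endpoint)
	}
}
```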
xueshaojia 00464843
03dd780ddd qemu: fix devID value error
Reason: when ExecuteNetCCWDeviceAdd is executed, the DevID is always "virtio-".
If add-iface is run multiple times, qemu may report "duplicated id:virtio-".

Fixes: #1305

Signed-off-by: xueshaojia <xueshaojia@huawei.com>
2019-03-04 09:01:38 +08:00
Peng Tao
1d79338a1a Merge pull request #1247 from nitkon/leakyPods
qemu: Cleanup Vm paths irrespective of Sandbox stop pass/fail
2019-02-21 11:56:57 +08:00
GabyCT
60f7c4f401 Merge pull request #1189 from devimc/topic/fixCpuCgroup
virtcontainers: reimplement sandbox cgroup
2019-02-20 10:18:56 -06:00
Nitesh Konkar
6daefdb177 qemu: Cleanup Vm paths irrespective of Sandbox stop pass/fail
Sometimes qemu/qmp commands error out and VM files
get left behind on the host filesystem. Clean them up
irrespective of whether `stopSandbox` succeeds or fails.

Fixes: #1246

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
2019-02-20 16:02:48 +05:30
Penny Zheng
1b967a4a6a unit-test: add nolint comment to avoid unused warning
Since all generic* items could trigger unused linter warnings, which lead to
a CI failure, we add nolint comments to avoid them.

Fixes: #1200

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2019-02-14 14:56:42 +08:00
Julio Montes
a1c85902f6 virtcontainers: add method to get hypervisor PID
hypervisor PID can be used to move the whole process and its
threads into a new cgroup.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-02-13 18:01:14 -06:00
Samuel Ortiz
fad23ea54e virtcontainers: Conversion to Stores
We convert the whole virtcontainers code to use the store package
instead of the resource_storage one. The resource_storage removal will
happen in a separate change for a more logical split.

This change is fairly big but mostly does not change the code logic.
What really changes is when we create a store for a container or a
sandbox. We now need to explicitly do so instead of just assigning a
filesystem{} instance. Other than that, the logic is kept intact.

Fixes: #1099

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-02-07 00:59:29 +01:00
Nitesh Konkar
b0986a5f7f ppc64le: Fix vCPU hotplug issue
ppc64le qemu does not need threadID and
socketID parameters when hotplugging.

Fixes: #1155

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
2019-01-28 23:42:20 +05:30
Xu Wang
3b0b0147bd Merge pull request #1139 from bergwolf/delete
clean up container dir
2019-01-22 10:16:34 +08:00
Peng Tao
e8788bebd5 Merge pull request #1121 from jcvenegas/fix-memory-max-message
vc: qemu: fix error message on hotplug.
2019-01-21 14:16:41 +08:00
Peng Tao
36762c7cad qemu: cleanup vm template path properly
VM templates create a symlink from `/run/vc/vm/sbid` to
`/run/vc/vm/vmid`. We need to clean up both of them.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2019-01-21 14:10:51 +08:00
Samuel Ortiz
2e1ddbc725 virtcontainers: Add Bridge to the types package
Bridge represents a PCI/E bridge, so we're moving the bridge*.go
to types/pci*.go.

Fixes: #1119

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-16 15:45:08 +01:00
Samuel Ortiz
b25f43e865 virtcontainers: Add Capabilities to the types package
In order to move the hypervisor implementations into their own package,
we need to put the capabilities type into the types package.

Fixes: #1119

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-14 20:30:06 +01:00
Jose Carlos Venegas Munoz
a5a74f6d20 vc: qemu: fix error message on hotplug.
The error message does not provide the max memory that is exceeded.

Fix it for better error information.

Fixes: #1120

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-01-11 13:34:32 -06:00
Jose Carlos Venegas Munoz
d4dd5f1508 qemu: fix gofmt import order.
Using gofmt changes the import order.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-01-11 13:33:01 -06:00
Samuel Ortiz
cf22f402d8 virtcontainers: Remove the hypervisor waitSandbox method
We always call waitSandbox after we start the VM (startSandbox), so
let's simplify the hypervisor interface and integrate waiting for the VM
into startSandbox.
This makes startSandbox a blocking call, but that is practically the
case today.

Fixes: #1009

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 19:38:33 +01:00
Samuel Ortiz
763bf18daa virtcontainers: Remove the hypervisor init method
We always combine the hypervisor init and createSandbox, because what
we're trying to do is simply that: Set the hypervisor and have it create
a sandbox.

Instead of keeping a method with vague semantics, remove init and
integrate the actual hypervisor setup phase into the createSandbox one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 19:37:20 +01:00
Samuel Ortiz
b05dbe3886 runtime: Convert to the new internal types package
We can now remove all the sandbox shared types and convert the rest of
the code to using the new internal types package.

This commit includes virtcontainers, cli and containerd-shim changes in
one atomic change in order to not break bisect'ibility.

Fixes: #1095

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 14:43:33 +01:00
Hui Zhu
dd28ff5986 memory: Add new option memory_offset
This value will be added to the max memory of the hypervisor.
It is the memory address space for the NVDIMM device.
If the block storage driver (block_device_driver) is set to "nvdimm",
memory_offset should be set to the size of the block device.

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2018-12-24 15:36:25 +08:00
Hui Zhu
ef75c3d19e block: Add new block storage driver "nvdimm"
Setting block_device_driver to "nvdimm" will make the hypervisor use
the block device as an NVDIMM disk.

Fixes: #1032

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2018-12-24 15:32:33 +08:00
Peng Tao
bf1a5ce000 sandbox: cleanup sandbox if creation failed
This includes cleaning up the sandbox on disk resources,
and closing open fds when preparing the hypervisor.

Fixes: #1057

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-12-21 13:46:16 +08:00
Sebastien Boeuf
e14071f2bd Merge pull request #1045 from mcastelino/topic/firecracker-virtio-mmio
Firecracker: virtio mmio support
2018-12-20 19:47:01 -08:00
Manohar Castelino
0d84d799ea virtio-mmio: Add support for virtio-mmio
Start adding support for virtio-mmio devices starting with block.
The devices show up within the VM as vda, vdb,... based on order of
insertion and, as such, resemble virtio-blk devices within the VM.

They need to be explicitly differentiated to ensure that the
agent logic within the VM can discover and mount them appropriately.
The agent uses PCI location to discover them for virtio-blk.
For virtio-mmio we need to use the predicted device name for now.

Note: Kata used a disk for the VM rootfs in the case of Firecracker.
(Instead of initrd or virtual-nvdimm). The Kata code today does not
handle this case properly.

For now, as Firecracker is the only hypervisor in Kata that
uses virtio-mmio, directly offset the drive index to account for
this.

Longer term we should track if the rootfs is setup as a block
device explicitly.

Fixes: #1046

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 15:08:51 -08:00
Manohar Castelino
e65bafa793 virtcontainers: Add firecracker as a supported hypervisor
Add firecracker as a supported hypervisor. This connects the
newly defined firecracker implementation as a supported
hypervisor.

Move operation definition to the common hypervisor code.

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 11:54:59 -08:00
Jose Carlos Venegas Munoz
618cfbf1db vc: sandbox: Let sandbox manage VM resources.
- The container is only responsible for namespaces and cgroups
inside the VM.

- Sandbox will manage VM resources.

The resources have to be re-calculated and updated:

- Create new Container: If a new container is created, the cpus and memory
may be updated.

- Container update: The update call will change the cgroups of a container.
The sandbox would need to resize the cpus and VM depending on the update.

To manage the resources from the sandbox, the hypervisor interface adds two methods.

- resizeMemory().

This function will be used by the sandbox to request
an increase or decrease of the VM memory.

- resizeCPUs()

vcpus are requested to the hypervisor based
on the sum of all the containers in the sandbox.

The CPUs calculations use the container cgroup information all the time.

This should allow better calculations.

For example.

2 containers in a pod.

container 1 cpus = .5
container 2 cpus = .5

Now:
Sandbox requested vcpus 1

Before:
Sandbox requested vcpus 2

When an update request is made, only some attributes have
information. If cpu and quota are nil or 0, we don't update them.

If we updated them, the sandbox calculations would remove already
removed vcpus.

This commit also moves the sandbox resource update call at container.update()
just before the container cgroups information is updated.

Fixes: #833

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-12-13 16:33:14 -06:00
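The vCPU example in commit 618cfbf1db (two containers at .5 CPU each now request 1 vCPU instead of 2) can be reproduced with a small worked sketch. It assumes that the CPU share of a container is its cgroup quota divided by its period, that the new calculation rounds up once on the sandbox-wide sum, and that the old behaviour rounded up per container; the types and function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// containerCPU holds the cgroup CPU constraint of one container.
type containerCPU struct {
	Quota  int64  // CFS quota in microseconds; <= 0 means unconstrained
	Period uint64 // CFS period in microseconds
}

// share returns quota/period, or 0 when the container is unconstrained.
func share(c containerCPU) float64 {
	if c.Quota <= 0 || c.Period == 0 {
		return 0
	}
	return float64(c.Quota) / float64(c.Period)
}

// sandboxVCPUs sums the shares of all containers and rounds up once.
func sandboxVCPUs(containers []containerCPU) uint32 {
	var total float64
	for _, c := range containers {
		total += share(c)
	}
	return uint32(math.Ceil(total))
}

// perContainerVCPUs rounds up each container separately (assumed old behaviour).
func perContainerVCPUs(containers []containerCPU) uint32 {
	var total uint32
	for _, c := range containers {
		total += uint32(math.Ceil(share(c)))
	}
	return total
}

func main() {
	pod := []containerCPU{
		{Quota: 50000, Period: 100000}, // container 1: .5 CPU
		{Quota: 50000, Period: 100000}, // container 2: .5 CPU
	}
	fmt.Println("now:", sandboxVCPUs(pod), "before:", perContainerVCPUs(pod))
}
```

Containers with a zero or negative quota contribute nothing here, which mirrors the rule in the commit message that nil or 0 values are not used for updates.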
Julio Montes
976f5b2a6e Merge pull request #990 from alicefr/s390x
s390x: add support for s390x
2018-12-11 10:57:27 -06:00
Alice Frosi
6f83061139 s390x: add support for s390x
The PR adds the support for s390x.

In the case of CCW devices, the vhost-user devices are not supported.
See #659. An error message is thrown if an attempt is made to use them.

Memory hotplug is not supported on s390 yet and an error message is thrown.

The VirtioNetPCI has been changed to VirtioNet. The generalization
allows to set the VirtioNet to the correct CCW device for s390x.

Fixes: #666

Co-authored-by: Yash D Jain <ydjainopensource@gmail.com>
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-12-11 12:32:17 +01:00
Hui Zhu
f6511471d4 block: Add cache-related options for block devices
Add block_device_cache_set, block_device_cache_direct and
block_device_cache_noflush.
They are cache-related options for block devices that are described in
https://github.com/qemu/qemu/blob/master/qapi/block-core.json.
block_device_cache_direct denotes whether use of O_DIRECT (bypass the host
page cache) is enabled.  block_device_cache_noflush denotes whether flush
requests for the device are ignored.
The JSON says they are supported since 2.9.
So add block_device_cache_set to control whether the cache options are set on block
devices. It will help to support older versions of qemu.

Fixes: #956

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2018-12-06 18:07:44 +08:00
Sebastien Boeuf
018c8c1468 vendor: Update govmm vendoring
Shortlog:

f9b31c0 qemu: Allow disable-modern option from QMP
d617307 Run tests for the s390x build
b36b5a8 Contributors: Add Clare Chen to CONTRIBUTORS.md
b41939c Contributors: Add my name
dab4cf1 qmp: Add tests
5ea6da1 Verify govmm builds on s390x
ee75813 contributors: add my name
c80fc3b qemu: Add s390x support
ca477a1 Update source file headers
e68e005 Update the CONTRIBUTING.md
2b7db54 Add the CONTRIBUTORS.md file
b3b765c qemu: test Valid for Vsock for Context ID
3becff5 qemu: change of ContextID from uint32 to uint64
f30fd13 qmp: Output error detail when execute QMP command failed
7da6a4c qmp: fix mem-path properties for hotplug memory.
e4892e3 qemu/qmp: preparation for s390x support
110d2fa qemu/qmp: add new function ExecuteBlockdevAddWithCache
a0b0c86 qmp_test: Change QMP version from 2.6 to 2.9
10c36a1 qemu: add support for pidfile option

Fixes #983

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-06 00:13:15 -08:00
Alice Frosi
0796f2e5a0 virtcontainers: Add function supportGuestMemoryHotplug
This PR defines a new function supportGuestMemoryHotplug that
clearly defines if the architecture supports memory hotplug. The function
can be reimplemented in virtcontainers/qemu_$arch.go file for each
architecture.

Fixes: #910

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-11-19 11:22:22 +00:00
Peng Tao
381ea37d86 Merge pull request #745 from bergwolf/query-migrate
qemu: query migrate status
2018-10-30 08:50:21 +08:00
Wei Zhang
34fe3b9d6d cgroups: add host cgroup support
Fixes #344

Add host cgroup support for kata.

This commit only adds cpu.cfs_period and cpu.cfs_quota support.

It will create a 3-level hierarchy; taking the "cpu" cgroup as an example:

```
/sys/fs/cgroup
|---cpu
   |---kata
      |---<sandbox-id>
         |--vcpu
      |---<sandbox-id>
```

* `vc` cgroup is the common parent for all kata-container sandboxes; it won't be removed
after a sandbox is removed. This cgroup has no limitation.
* `<sandbox-id>` cgroup is the layer for each sandbox; it contains all qemu
threads except for vcpu threads. In the future, we can consider putting all shim
processes and the proxy process here. This cgroup has no limitation yet.
* `vcpu` cgroup contains the vcpu threads from qemu. Currently the cpu quota and period
constraints apply to this cgroup.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Signed-off-by: Jingxiao Lu <lujingxiao@huawei.com>
2018-10-27 09:41:35 +08:00
Ruidong Cao
6935279beb network: add new NetInterworkingModel "none" and endpoint type TapEndpoint
This model is for not creating a new net ns for VM and directly
creating taps in the host net ns.

Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
2018-10-22 21:06:58 +08:00
Ruidong Cao
f8f29622a4 virtcontainers: refactor hotplug qmp functions
Refactor these functions so different types of endpoints can use a unified
function to hotplug nics.

Fixes #731

Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
2018-10-22 21:06:56 +08:00
Sebastien Boeuf
0ae5b142a6 qemu: Disable the default romfile used by virtio-pci
As we try to make sure we don't pull unneeded dependencies when using
QEMU or NEMU as the hypervisor, and because SeaBIOS and OVMF firmware
already handle what's done by the default efi-virtio.rom binary, this
commit gets rid of this dependency by providing a default empty one.

Fixes #812

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-10-16 18:29:49 -07:00
Archana Shinde
3c590b0e2c network: Rename VirtualEndpoint to VethEndpoint
As this really represents a veth pair rather than a generic
virtual interface, rename VirtualEndpoint to VethEndpoint.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2018-10-11 14:45:57 -07:00