Commit Graph

184 Commits

Author SHA1 Message Date
Gabi Beyer
e93bf967d2 network: Add tuntap device
The tuntap network device is for tuntap interfaces to connect
to the container. A specific use case is the slirp4netns tap
interface for rootless kata-runtime.

Fixes: #1878

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-09-26 16:17:16 +02:00
Gabi Beyer
41407cfbed vc: make cgroup usage configurable if rootless
Rootless execution does not yet support cgroups, so when running
rootless, skip cgroup creation and deletion.

Fixes: #1877

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-09-26 16:17:16 +02:00
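
A minimal sketch of the behaviour described in 41407cfbed, assuming a hypothetical `sandbox` type with a `rootless` flag. The type, its fields, and both method names are illustrative and not taken from the runtime; the point is only that cgroup creation and deletion become no-ops when running rootless.

```go
package main

import "fmt"

// sandbox is a hypothetical stand-in for the runtime's sandbox type.
type sandbox struct {
	rootless      bool
	cgroupCreated bool
}

// setupCgroups skips cgroup creation entirely when running rootless.
func (s *sandbox) setupCgroups() error {
	if s.rootless {
		return nil
	}
	s.cgroupCreated = true // placeholder for the real cgroup creation
	return nil
}

// teardownCgroups mirrors setupCgroups: nothing to delete when rootless.
func (s *sandbox) teardownCgroups() error {
	if s.rootless {
		return nil
	}
	s.cgroupCreated = false // placeholder for the real cgroup deletion
	return nil
}

func main() {
	for _, rootless := range []bool{false, true} {
		s := &sandbox{rootless: rootless}
		if err := s.setupCgroups(); err != nil {
			fmt.Println("setup failed:", err)
			continue
		}
		fmt.Printf("rootless=%v cgroupCreated=%v\n", rootless, s.cgroupCreated)
	}
}
```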
Wang Liang
c81db9c3da sandbox: The unit of newMemory is MB
Change bytes to MB in the log.

Fixes: #2068

Signed-off-by: Wang Liang <wangliangzz@inspur.com>
2019-09-18 05:10:34 -04:00
Wei Zhang
2ed94cbd9d Config: Remove ConfigJSONKey from annotations
Fixes: #2023

We can get OCI spec config from bundle instead of annotations, so this
field isn't necessary.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-09-17 11:47:06 +08:00
Eric Ernst
282d85899e Merge pull request #1880 from jcvenegas/pod-cgroup-only
cgroups: Use only pod cgroup
2019-09-09 07:00:54 -07:00
Eric Ernst
b62814a6f0 sandbox: combine sandbox cgroup functions
Simplify the tests and the code by combining the create and join
functions into a single function.

Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-09-05 13:49:13 -07:00
Li Yuxuan
a5f1744132 vc: Delete store when new/create container is failed
The container store should be deleted when new/create fails, if the
store was newly created.

Fixes: #2013
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2019-08-30 18:05:59 +08:00
Jose Carlos Venegas Munoz
9fc7246e8a sandbox: delete cgroup for SandboxOnly option
Use all subsystems for SandboxOnly option to make sure
all cgroups are deleted.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:08:04 -05:00
Jose Carlos Venegas Munoz
3fc6f4bc55 sandbox: add containers, do not get cgroup path
Adding containers does not need to check the cgroup path;
this is done in a different function.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:08:04 -05:00
Jose Carlos Venegas Munoz
074418f56b sandbox: Join cgroup sandbox on create.
When a new sandbox is created, join its cgroup path so that
the proxy, shim, etc. are all created in the sandbox cgroup.

Fixes: #1879

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:08:04 -05:00
Jose Carlos Venegas Munoz
b65063248f config: add option SandboxCgroupOnly
Add an option to enable only the pod cgroup (SandboxCgroupOnly).

Depends-on: github.com/kata-containers/tests#1824

Fixes: #1879
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:08:04 -05:00
Jose Carlos Venegas Munoz
f45b2d9cc6 cgroups: quote some paths on errors.
Some errors are propagated with a message that includes a cgroup path.
If the path is empty for some reason, that is difficult to tell from
the logs.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:01:35 -05:00
Jose Carlos Venegas Munoz
6fdbef4ff5 sandbox: Rename constrainHypervisor
constrainHypervisor -> constrainHypervisorVCPUs

Document and rename function.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:01:35 -05:00
Jose Carlos Venegas Munoz
caac68c09f sandbox: cgroup: prefix cgroup related methods
Rename to allow grouping in auto-generated docs.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:01:35 -05:00
Jose Carlos Venegas Munoz
529ec25fb7 sandbox: cgroups: move methods to sandbox file
Move sandbox-related methods to their own file.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:01:35 -05:00
lifupan
c91556aa41 api: add a CleanupContainer api for VC
When shimv2 is killed by accident, containerd tries to launch
a new shimv2 binary to clean up the container. To avoid a race
condition, the cleanup should be serialized within a sandbox.
Thus, add a new API that does this by locking the sandbox.

Fixes: #1832

Signed-off-by: lifupan <lifupan@gmail.com>
2019-08-24 08:16:02 +08:00
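
A rough sketch of the serialization idea in c91556aa41. It uses an in-process `sync.Mutex` purely for illustration; how the runtime actually locks a sandbox (for example across separate shim processes) is not described in this log, and the `sandbox` type and `cleanupContainer` method are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// sandbox is a hypothetical type; mu stands in for the sandbox lock.
type sandbox struct {
	mu         sync.Mutex
	containers map[string]bool
}

// cleanupContainer holds the sandbox lock for the whole cleanup, so that
// concurrent callers run one after another instead of racing.
func (s *sandbox) cleanupContainer(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.containers, id)
}

func main() {
	s := &sandbox{containers: map[string]bool{"a": true, "b": true, "c": true}}
	var wg sync.WaitGroup
	// Clean up concurrently, including a duplicate request for "a".
	for _, id := range []string{"a", "b", "c", "a"} {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			s.cleanupContainer(id)
		}(id)
	}
	wg.Wait()
	fmt.Println("containers left:", len(s.containers))
}
```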
lifupan
52e68f5fce virtcontainers: cleanup the container config once failed
When creating a container fails, the container config should be
deleted from the sandbox. Otherwise, the resource calculation for the
next container created would be wrong, because it would still include
the failed container's resources, such as memory and CPU.

Fixes: #1997

Signed-off-by: lifupan <lifupan@gmail.com>
2019-08-22 17:43:04 +08:00
lifupan
5b749a56d8 virtcontainers: remove the redundant sandbox config store
The following storeSandbox() call stores the sandbox config
data, so there is no need to store it separately before
running storeSandbox().

Signed-off-by: lifupan <lifupan@gmail.com>
2019-08-22 12:48:14 +08:00
Peng Tao
d90eba8593 network: always cold unplug network devices
We don't really need to unplug it from guest because we have
already stopped it. Just detach it and clean it up.

Fixes: #1968
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-15 00:02:52 -07:00
Peng Tao
d26ff71201 Revert: "sandbox: remove network before stopping vm"
This reverts commit 794e08e243.

It breaks vfio device passthru as we need to bind the device
back to host when removing the endpoint. And that is not possible
when qemu is still running (thus holding reference to the device).

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-15 00:02:44 -07:00
Peng Tao
794e08e243 sandbox: remove network before stopping vm
We might need to call hypervisor hotunplug to really remove
a network device. We cannot do it after stopping the VM.

Fixes: #1956
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-13 01:04:07 -07:00
Wei Zhang
3bfbbd666d persist: merge "network.json"
Merge "network.json" into "persist.json" so that new store can manage
network part.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:10:00 +08:00
Wei Zhang
7d5e48f1b5 persist: manage "hypervisor.json" with new store
Fixes #803

Merge "hypervisor.json" into "persist.json", so the new store can take
care of hypervisor data now.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:09:11 +08:00
Peng Tao
d5d7d82eeb vc: move container mount cleanup to container.go
For one thing, it is a container-specific resource, so it should not
be cleaned up by the agent. For another, container stop can then
force the cleanup of these host mountpoints regardless of hypervisor
and agent liveness.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
835b6e9e1b sandbox: do not fail SIGKILL
Once we have found the container, we should never fail SIGKILL.
It is possible to fail to send SIGKILL because hypervisor might
be gone already. If we fail SIGKILL, upper layer cannot really
proceed to clean things up.

Also there is no need to save sandbox here as we did not change
any state.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
bc4460e12f sandbox: support force stop
When force is true, ignore any guest related errors. This can
be used to stop a sandbox when hypervisor process is dead.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Manohar Castelino
78ea50c36c virtcontainers: Jailer: Add jailer support for firecracker
Firecracker provides a jailer to constrain the VMM. Use this
jailer to launch the firecracker VMM instead of launching it
directly from the kata-runtime.

The jailer will ensure that the firecracker VMM will run
in its own network and mount namespace. All assets required
by the VMM have to be present within these namespaces.
The assets need to be copied or bind mounted into the chroot
location set up by jailer in order for firecracker to access
these resources. This includes files, device nodes and all
other assets.

Jailer automatically sets up the jail to have access to
kvm and vhost-vsock.

If a jailer is not available (i.e. not set up in the toml)
for a given hypervisor, the runtime will act as the jailer.

Also enhance the hypervisor interface and unit tests to
include the network namespace. This allows the hypervisor
to choose how and where to launch the VMM process, instead of
virtcontainers launching the VMM process directly.

Fixes: #1129

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2019-07-11 21:32:36 +00:00
Vijay Dhanraj
f246a799aa virtcontainers: Add support for updating virtio-blk based container rootfs
This patch adds the following:
1. ACRN only supports virtio-blk, so the rootfs for the VM
   sits at /dev/vda. To get the container rootfs, increment the
   globalIndex by 1.
2. ACRN doesn't hot-plug the container rootfs (it uses blkrescan
   instead) to update it. So the agent can be given the virtpath
   rather than the PCIaddr, avoiding unnecessary rescanning to find
   the virtpath.

v1->v2:
Removed the workaround of incrementing the index for the
virtio-blk device and addressed it in acrn.

Fixes: #1778

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
2019-07-10 10:49:24 -07:00
Julio Montes
890a3d5960 Merge pull request #1637 from marcov/kill-hyp
virtcontainers: kill hypervisor if startSandbox fails
2019-05-23 15:11:54 -05:00
Fupan Li
100db8abdc Merge pull request #1670 from xs3c/fix-vfio-hang
shim v2: Close vhostfd after vm get vhostfd
2019-05-21 14:53:26 +08:00
Marco Vedovati
f89834a276 virtcontainers: avoid unnecessary error checking in startVM
Remove redundant error checking in startVM.

Signed-off-by: Marco Vedovati <mvedovati@suse.com>
2019-05-16 12:31:51 +02:00
Yang, Wei
071030b784 shimv2: Close vhostfd after vm get vhostfd
If Kata Containers is using vfio and vhost net, unbinding the
vfio device would hang. In this scenario, the vhost net kernel thread
takes a reference to qemu's mm, and that reference also covers
the mmap regions on the vfio device file. So the vhost kernel thread
is not released when qemu is killed, because the vhost file
descriptor is still held open by the shim v2 process, and the vfio
device is not released because there is still a reference to the mmap.

Fixes: #1669

Signed-off-by: Yang, Wei <w90p710@gmail.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-05-16 13:31:11 +08:00
Manohar Castelino
66b93c7ca0 Networking: Ensure that network namespace is propagated
Network namespace needs to be propagated if available at
createSandbox()

Fixes: #1664

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2019-05-10 18:00:30 -07:00
Hui Zhu
5ba09817d8 Merge pull request #1575 from WeiZhang555/simplify-persist-api
newstore:  removing deprecated files when use new store driver
2019-05-10 15:33:22 +08:00
Wei Zhang
4c192139cf newstore: remove file "devices.json"
When using the experimental feature "newstore", we save and load device
information from `persist.json` instead of `devices.json`. In that case
the file `devices.json` isn't needed anymore, so remove it.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-05-06 14:40:08 +08:00
Stefan Hajnoczi
9480978364 qemu: add vhost-user-fs-pci device instead of 9p
When enable_virtio_fs is true, add a vhost-user-fs-pci device for the
kataShared volume instead of 9p.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-05-05 11:32:34 -06:00
Wei Zhang
341a988e06 persist: simplify persist api
Fixes #803

Simplify new store API to make the code easier to understand and use.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-30 11:54:42 +08:00
Archana Shinde
b5aa8d4f67 Merge pull request #1577 from chavafg/topic/revert-mount-pr
Revert "vc: change container rootfs to be a mount"
2019-04-25 09:41:15 -07:00
James O. D. Hunt
ed64240df2 agent: Support Kata agent tracing
Add configuration options to support the various Kata agent tracing
modes and types. See the comments in the built configuration files for
details:

- `cli/config/configuration-fc.toml`
- `cli/config/configuration-qemu.toml`

Fixes #1369.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2019-04-25 09:41:13 +01:00
James O. D. Hunt
e803a7f870 agent: Return an error, not just an interface
Make `newAgentConfig()` return an explicit error rather than handling
the error scenario by simply returning the `error` object in the
`interface{}` return type. The old behaviour was confusing and
inconsistent with the other functions creating a new config type (shim,
proxy, etc).

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2019-04-24 17:14:01 +01:00
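
The change in e803a7f870 can be pictured with the two styles side by side. Both functions below are simplified, hypothetical versions (the real `newAgentConfig()` signature is not shown in this log): the old style hides the error inside the `interface{}` result, the new style returns it explicitly.

```go
package main

import "fmt"

// kataAgentConfig is a hypothetical config type used only in this sketch.
type kataAgentConfig struct{}

// newAgentConfigOld mimics the old behaviour: on failure the error object
// is returned inside the interface{} value.
func newAgentConfigOld(agentType string) interface{} {
	if agentType != "kata" {
		return fmt.Errorf("unknown agent type %q", agentType)
	}
	return kataAgentConfig{}
}

// newAgentConfigNew mimics the new behaviour: the error is a separate,
// explicit return value.
func newAgentConfigNew(agentType string) (interface{}, error) {
	if agentType != "kata" {
		return nil, fmt.Errorf("unknown agent type %q", agentType)
	}
	return kataAgentConfig{}, nil
}

func main() {
	// Old style: the caller must inspect the dynamic type to spot a failure.
	if err, failed := newAgentConfigOld("bogus").(error); failed {
		fmt.Println("old style:", err)
	}
	// New style: the usual Go error check applies.
	if _, err := newAgentConfigNew("bogus"); err != nil {
		fmt.Println("new style:", err)
	}
}
```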
Salvador Fuentes
bc9b9e2af6 vc: Revert "vc: change container rootfs to be a mount"
This reverts commit 196661bc0d.

Reverting because cri-o with devicemapper started
to fail after this commit was merged.

Fixes: #1574.

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
2019-04-23 08:56:36 -05:00
Peng Tao
196661bc0d vc: change container rootfs to be a mount
We can use the same data structure to describe both of them,
so that we can handle them similarly.

Fixes: #1566

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-20 00:42:25 -07:00
Wei Zhang
e40dcb9376 storage: set new storage driver as "experimental"
Set new persist storage driver "virtcontainers/persist/" as "experimental"
feature.
One day when this can fully work and we're ready to move to 2.0, we'll move
it from "experimental" feature to formal feature.
At that time, the "virtcontainers/filesystem_resource_storage.go" can be removed
completely.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-19 15:35:33 +08:00
Wei Zhang
504c706bea storage: address comments
Address some comments:
* fix persist driver func names for better understanding
* modify some logic, add some returned error etc

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-19 15:33:53 +08:00
Wei Zhang
039ed4eeb8 persist: persist device data
Persist device information to relative file

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-19 15:33:53 +08:00
Wei Zhang
b42fde69c0 persist: demo code for persist api
Demonstrate how to make use of `virtcontainer/persist/api` data structure
package.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-19 15:33:53 +08:00
Peng Tao
f5125421d0 sandbox: return ErrNoSuchContainer when failing to find a container
So that the caller can tell that it is an ENOENT-like error.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-12 03:57:07 -07:00
Peng Tao
cf90751638 vc: export vc error types
So that shimv2 can convert it into grpc errors.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-12 02:01:02 -07:00
Peng Tao
c414599635 types: remove pid from sandbox state
No longer needed.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-09 18:59:56 -07:00
Peng Tao
616f26cfe5 types: split sandbox and container state
Since they do not really share many of the fields.

Fixes: #1434

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-09 18:59:56 -07:00