Commit Graph

78 Commits

Author SHA1 Message Date
Gabi Beyer
5f0799f1b7 vc: add rootless dir to path variables
Modify some path variables to be functions that return the path
with the rootless directory prefix if running rootlessly.

Fixes: #1827

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-09-26 16:17:16 +02:00
Wei Zhang
2ed94cbd9d Config: Remove ConfigJSONKey from annotations
Fixes: #2023

We can get OCI spec config from bundle instead of annotations, so this
field isn't necessary.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-09-17 11:47:06 +08:00
Eric Ernst
b62814a6f0 sandbox: combine sandbox cgroup functions
Simplify the tests and the code by combining the create and join
functions into a single function.

Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-09-05 13:49:13 -07:00
Jose Carlos Venegas Munoz
074418f56b sandbox: Join cgroup sandbox on create.
When a new sandbox is created, join to its cgroup path
this will create all proxy, shim, etc in the sandbox cgroup.

Fixes: #1879

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2019-08-29 14:08:04 -05:00
lifupan
c91556aa41 api: add a CleanupContainer api for VC
When shimv2 was killed by accident, containerd would try to
launch a new shimv2 binarry to cleanup the container. In order
to avoid race condition, the cleanup should be done serialized
in a sandbox. Thus adding a new api to do this by locking the
sandbox.

Fixes:#1832

Signed-off-by: lifupan <lifupan@gmail.com>
2019-08-24 08:16:02 +08:00
Julio Montes
14474a49a2 Merge pull request #1921 from Ace-Tang/fix-remove-network
network: fix failed to remove network
2019-08-07 14:06:52 -05:00
Ace-Tang
50c3e56aeb network: fix failed to remove network
in create sandbox, if process error, should remove network without judge
NetNsCreated is true, since network is created by kata and should be
removed by kata, and network.Remove has judged if need to delete netns
depend on NetNsCreated

Fixes: #1920

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-07-30 20:17:09 +08:00
Peng Tao
bc4460e12f sandbox: support force stop
When force is true, ignore any guest related errors. This can
be used to stop a sandbox when hypervisor process is dead.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Yang, Wei
071030b784 shimv2: Close vhostfd after vm get vhostfd
If kata containers is using vfio and vhost net,the unbinding
of vfio would be hang. In the scenario, vhost net kernel thread
takes a reference to the qemu's mm, and the reference also includes
the mmap regions on the vfio device file. so vhost kernel thread
would be not released when qemu is killed as the vhost file
descriptor still is opened by shim v2 process, and the vfio device
is not released because there's still a reference to the mmap.

Fixes: #1669

Signed-off-by: Yang, Wei <w90p710@gmail.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-05-16 13:31:11 +08:00
Wei Zhang
297097779e persist: save/load GuestMemoryHotplugProbe
Support saving/loading `GuestMemoryHotplugProbe` from sandbox state.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-05-09 14:39:04 +08:00
Wei Zhang
4c192139cf newstore: remove file "devices.json"
When using experimental feature "newstore", we save and load devices
information from `persist.json` instead of `devices.json`, in such case,
file `devices.json` isn't needed anymore, so remove it.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-05-06 14:40:08 +08:00
Salvador Fuentes
bc9b9e2af6 vc: Revert "vc: change container rootfs to be a mount"
This reverts commit 196661bc0d.

Reverting because cri-o with devicemapper started
to fail after this commit was merged.

Fixes: #1574.

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
2019-04-23 08:56:36 -05:00
Peng Tao
196661bc0d vc: change container rootfs to be a mount
We can use the same data structure to describe both of them.
So that we can handle them similarly.

Fixes: #1566

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-20 00:42:25 -07:00
Wei Zhang
b42fde69c0 persist: demo code for persist api
Demonstrate how to make use of `virtcontainer/persist/api` data structure
package.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2019-04-19 15:33:53 +08:00
Peng Tao
cf90751638 vc: export vc error types
So that shimv2 can convert it into grpc errors.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-04-12 02:01:02 -07:00
Fupan Li
c9a3b933f8 Merge pull request #1427 from Ace-Tang/fix-qemu-leak
qemu: fix qemu leak when failed to start container
2019-04-02 23:32:11 +08:00
lifupan
628ea46c58 virtcontainers: change container's rootfs from string to mount alike struct
container's rootfs is a string type, which cannot represent a
block storage backed rootfs which hasn't been mounted.
Change it to a mount alike struct as below:
    RootFs struct {
            // Source specify the BlockDevice path
            Source string
            // Target specify where the rootfs is mounted if it has been mounted
            Target string
            // Type specifies the type of filesystem to mount.
            Type string
            // Options specifies zero or more fstab style mount options.
            Options []string
            // Mounted specifies whether the rootfs has be mounted or not
            Mounted bool
     }

If the container's rootfs has been mounted as before, then this struct can be
initialized as: RootFs{Target: <rootfs>, Mounted: true} to be compatible with
previous case.

Fixes:#1158

Signed-off-by: lifupan <lifupan@gmail.com>
2019-04-02 10:54:05 +08:00
Ace-Tang
096fa046f8 qemu: fix qemu leak when failed to start container
do cleanup inside startVM() if start vm get error

Fixes: #1426

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-03-28 19:38:56 +08:00
James O. D. Hunt
c759cf5f37 tracing: Fix tracing
The store refactor (#1066) inadvertently broke runtime tracing as it
created new contexts containing trace spans.

Reworking the store changes to re-use the existing context resolves the
problem since runtime tracing assumes a single context.

Fixes #1277.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2019-03-04 11:02:31 +00:00
James O. D. Hunt
f540a80354 store: Add SetLogger API
Add a `store.SetLogger()` API to allow the store package to log with the
standard set of fields (as expected by the log parser [1].

Fixes #1297.

---

[1] - https://github.com/kata-containers/tests/tree/master/cmd/log-parser#logfile-requirements

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2019-02-28 10:56:26 +00:00
Julio Montes
5201860bb0 virtcontainers: reimplement sandbox cgroup
All containers run in different cgroups even the sandbox, with this new
implementation the sandbox cpu cgroup wil be equal to the sum of all its
containers and the hypervisor process will be placed there impacting to the
containers running in the sandbox (VM). The default number of vcpus is
used when the sandbox has no constraints. For example, if default_vcpus
is 2, then quota will be 200000 and period 100000.

**c-ray test**
http://www.futuretech.blinkenlights.nl/c-ray.html

```
+=============================================+
|         | 6 threads 6cpus | 1 thread 1 cpu  |
+=============================================+
| current |   40 seconds    |   122 seconds   |
+==============================================
|   new   |   37 seconds    |   124 seconds   |
+==============================================
```

current = current cgroups implementation
new = new cgroups implementation

**workload**

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: c-ray
  annotations:
    io.kubernetes.cri.untrusted-workload: "true"
spec:
  restartPolicy: Never
  containers:
  - name: c-ray-1
    image: docker.io/devimc/c-ray:latest
    imagePullPolicy: IfNotPresent
    args: ["-t", "6", "-s", "1600x1200", "-r", "8", "-i",
          "/c-ray-1.1/sphfract", "-o", "/tmp/output.ppm"]
    resources:
      limits:
        cpu: 6
  - name: c-ray-2
    image: docker.io/devimc/c-ray:latest
    imagePullPolicy: IfNotPresent
    args: ["-t", "1", "-s", "1600x1200", "-r", "8", "-i",
          "/c-ray-1.1/sphfract", "-o", "/tmp/output.ppm"]
    resources:
      limits:
        cpu: 1
```

fixes #1153

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-02-19 13:13:44 -06:00
Samuel Ortiz
fad23ea54e virtcontainers: Conversion to Stores
We convert the whole virtcontainers code to use the store package
instead of the resource_storage one. The resource_storage removal will
happen in a separate change for a more logical split.

This change is fairly big but mostly does not change the code logic.
What really changes is when we create a store for a container or a
sandbox. We now need to explictly do so instead of just assigning a
filesystem{} instance. Other than that, the logic is kept intact.

Fixes: #1099

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-02-07 00:59:29 +01:00
Samuel Ortiz
b05dbe3886 runtime: Convert to the new internal types package
We can now remove all the sandbox shared types and convert the rest of
the code to using the new internal types package.

This commit includes virtcontainers, cli and containerd-shim changes in
one atomic change in order to not break bisect'ibility.

Fixes: #1095

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 14:43:33 +01:00
Samuel Ortiz
3ab7d077d1 virtcontainers: Alias for pkg/types
Since we're going to have both external and internal types packages, we
alias the external one as vcTypes. And the internal one will be usable
through the types namespace.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-08 14:24:06 +01:00
Samuel Ortiz
acf833cb4a virtcontainers: Call agent startSandbox from startVM
We always ask the agent to start the sandbox when we start the VM, so we
should simply call agent.startSandbox from startVM instead of open
coding those.
This slightly simplifies the complex createSandboxFromConfig routine.

Fixes: #1011

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-01-07 09:56:04 -08:00
Samuel Ortiz
ebf8547c38 virtcontainers: Remove useless startSandbox wrapper
startSandbox() wraps a single operation (sandbox.Start()), so we can
remove it and make the code easier to read/follow.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-01-07 09:48:22 -08:00
Peng Tao
bf1a5ce000 sandbox: cleanup sandbox if creation failed
This includes cleaning up the sandbox on disk resources,
and closing open fds when preparing the hypervisor.

Fixes: #1057

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-12-21 13:46:16 +08:00
Sebastien Boeuf
fa9b15dafe virtcontainers: Return the appropriate container status
When our runtime is asked for the container status, we also handle
the scenario where the container is stopped if the shim process for
that container on the host has terminated.

In the current implementation, we retrieve the container status
before stopping the container, causing a wrong status to be returned.
The wait for the original go-routine's completion was done in a defer
within the caller of statusContainers(), resulting in the
statusContainer()'s values to return the pre-stopped value.

This bug is first observed when updating to docker v18.09/containerd
v1.2.0. With the current implementation, containerd-shim receives the
TaskExit when it detects kata-shim is terminating. When checking the
container state, however, it does not get the expected "stopped" value.

The following commit resolves the described issue by simplifying the
locking used around the status container calls. Originally
StatusContainer would request a read lock. If we needed to update the
container status in statusContainer, we'd start a go-routine which
would request a read-write lock, waiting for the original read lock to
be released.  Can't imagine a bug could linger in this logic. We now
just request a read-write lock in the caller (StatusContainer),
skipping the need for a separate go-routine and defer. This greatly
simplifies the logic, and removes the original bug.

Fixes #926

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-28 20:10:34 -08:00
Sebastien Boeuf
982381bff0 api: Cleanup StartContainer()
Simple patch reducing the complexity of StartContainer().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:56 -08:00
Sebastien Boeuf
57773816b3 sandbox: Create and export Pause/ResumeContainer() to the API level
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless functions
PauseContainer() and ResumeContainer(), which would recreate a
new sandbox pointer and the corresponding ones for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:50 -08:00
Sebastien Boeuf
b298ec4228 sandbox: Create and export ProcessListContainer() to the API level
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless function
ProcessListContainer(), which would recreate a new sandbox pointer
and the corresponding ones for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:44 -08:00
Sebastien Boeuf
3add296f78 sandbox: Create and export KillContainer() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function KillContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:37 -08:00
Sebastien Boeuf
76537265cb sandbox: Create and export StopContainer() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:31 -08:00
Sebastien Boeuf
109e12aa56 sandbox: Export Stop() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:24 -08:00
Sebastien Boeuf
6c3e266eb9 sandbox: Export Start() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StartSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:04 -08:00
Sebastien Boeuf
7bf84d05ad types: Replace agent/pkg/types with virtcontainers/pkg/types
This commit replaces every place where the "types" package from the
Kata agent was used, with the new "types" package from virtcontainers.

In order to do so, it introduces a few translation functions between
the agent and virtcontainers types, since this is needed by the kata
agent implementation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-02 08:46:11 -07:00
Wei Zhang
34fe3b9d6d cgroups: add host cgroup support
Fixes #344

Add host cgroup support for kata.

This commits only adds cpu.cfs_period and cpu.cfs_quota support.

It will create 3-level hierarchy, take "cpu" cgroup as an example:

```
/sys/fs/cgroup
|---cpu
   |---kata
      |---<sandbox-id>
         |--vcpu
      |---<sandbox-id>
```

* `vc` cgroup is common parent for all kata-container sandbox, it won't be removed
after sandbox removed. This cgroup has no limitation.
* `<sandbox-id>` cgroup is the layer for each sandbox, it contains all other qemu
threads except for vcpu threads. In future, we can consider putting all shim
processes and proxy process here. This cgroup has no limitation yet.
* `vcpu` cgroup contains vcpu threads from qemu. Currently cpu quota and period
constraint applies to this cgroup.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Signed-off-by: Jingxiao Lu <lujingxiao@huawei.com>
2018-10-27 09:41:35 +08:00
Sebastien Boeuf
309dcf9977 vendor: Update the agent vendoring based on pkg/types
Some agent types definition that were generic enough to be reused
everywhere, have been split from the initial grpc package.

This prevents from importing the entire protobuf package through
the grpc one, and prevents binaries such as kata-netmon to stay
in sync with the types definitions.

Fixes #856

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-10-26 09:35:59 -07:00
Frank Cao
633f4567f3 Merge pull request #825 from jodh-intel/add-trace-to-remaining-api-funcs
virtcontainers: Add missing API trace calls
2018-10-18 16:53:30 +08:00
Graham Whaley
0a652a1ab8 Merge pull request #786 from linzichang/master
sandbox/virtcontainers: memory resource hotplug when create container.
2018-10-18 09:43:24 +01:00
Ruidong Cao
3f39d6e807 virtcontainers: Add missing API release calls
Add missing release sandbox calls to network related functions in
virtcontainers API.

Fixes #732.

Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
2018-10-18 06:58:04 +08:00
James O. D. Hunt
ee9275fedb virtcontainers: Add missing API trace calls
Add missing trace calls to remaining public virtcontainers API
functions.

Fixes #824.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-10-17 11:34:43 +01:00
Zichang Lin
8e2ee686bd sandbox/virtcontainers: memory resource hotplug when create container.
When create sandbox, we setup a sandbox of 2048M base memory, and
then hotplug memory that is needed for every new container. And
we change the unit of c.config.Resources.Mem from MiB to Byte in
order to prevent the 4095B < memory < 1MiB from being lost.

Depends-on:github.com/kata-containers/tests#813

Fixes #400

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
2018-10-15 10:37:29 +08:00
Clare Chen
12a0354084 sandbox: get and store guest details.
Get and store guest details after sandbox is completely created.
And get memory block size from sandbox state file when check
hotplug memory valid.

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
2018-09-17 07:00:46 -04:00
Sebastien Boeuf
9c6ed93f80 hook: Move OCI hooks handling to the CLI
The CLI being the implementation of the OCI specification, and the
hooks being OCI specific, it makes sense to move the handling of any
OCI hooks to the CLI level. This changes allows the Kata API to
become OCI agnostic.

Fixes #599

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-08-24 15:07:27 -07:00
James O. D. Hunt
d0679a6fd1 tracing: Add tracing support to virtcontainers
Add additional `context.Context` parameters and `struct` fields to allow
trace spans to be created by the `virtcontainers` internal functions,
objects and sub-packages.

Note that not every function is traced; we can add more traces as
desired.

Fixes #566.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-22 08:24:58 +01:00
James O. D. Hunt
90970d94c0 tracing: Add trace spans to virtcontainers APIs
Create spans for all `virtcontainers` API functions.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-22 08:24:58 +01:00
James O. D. Hunt
c200b28dc7 tracing: Add context to virtcontainers API
Add a `context.Context` parameter to all the virtcontainers API's to
support tracing.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-22 08:24:58 +01:00
Ruidong Cao
1a17200cc8 virtcontainers: add sandbox hotplug network API
Add sandbox hotplug network API to meet design

Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
2018-08-16 16:10:10 +08:00
Wei Zhang
6e6be98b15 devices: add interface "sandbox.AddDevice"
Fixes #50 .

Add new interface sandbox.AddDevice, then for Frakti use case, a device
can be attached to sandbox before container is created.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-08-15 15:24:12 +08:00