Commit Graph

1241 Commits

Author SHA1 Message Date
Bo Chen
359ab16a8f Merge pull request #1090 from likebreath/1106/clh_upgrade_v0.11.0
versions: Update cloud-hypervisor to release v0.11.0
2020-11-09 15:51:09 -08:00
bin liu
b8414045bf runtime: remove nsenter
remove code for nsenter

Fixes: #1081

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-09 11:42:51 +08:00
bin liu
e3510be867 runtime: use one line if statement to check if err is nil for qemu.go
Use `if err := q.qmpSetup(); err != nil` to reduce code and make it easy
to read. And remove checking err if last function call also return an error,
return the function call directly.

Fixes: #1081

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-09 11:42:45 +08:00
Fupan Li
d22c7cf00b Merge pull request #1013 from liubin/feature/1012-dump-guest-memroy-on-panic
Dump guest memory when kernel panic for QEMU
2020-11-09 09:46:28 +08:00
Bo Chen
92c1c4c690 versions: Update cloud-hypervisor to release v0.11.0
The release v0.11.0 of cloud-hypervisor features the following changes:
1) Improved Linux Boot Time, 2) `SIGTERM/SIGINT` Interrupt Signal,
Handling 3) Default Log Level Changed, 4) `io_uring` support by default
for `virtio-block` (on host kernel version 5.8+), 5) Windows Guest
Support, 6) New `--balloon` Parameter Added, 7) Experimental
`virtio-watchdog` Support, 8) Bug fixes.

Fixes: #1089

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-11-06 16:19:31 -08:00
Archana Shinde
6160043c01 Merge pull request #1077 from likebreath/1103/clh_refactor_device_unplug
clh: Consolidate the code path for device unplug
2020-11-06 16:00:56 -08:00
Peng Tao
c7a2b12fab Merge pull request #1086 from jodh-intel/2.0-dev-fix-annotations
annotations: Improve asset annotation handling
2020-11-06 10:29:22 +08:00
James O. D. Hunt
5ced96e96d hypervisor: Remove unused methods
Deleted `HypervisorConfig`'s unused  `CustomFirmwareAsset()` and
`JailerAssetPath()` methods.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-11-05 12:15:47 +00:00
James O. D. Hunt
e82c9daec3 annotations: Improve asset annotation handling
Make `asset.go` the arbiter of asset annotations by removing all asset
annotations lists from other parts of the codebase.

This makes the code simpler, easier to maintain, and more robust.

Specifically, the previous behaviour was inconsistent as the following
ways:

- `createAssets()` in `sandbox.go` was not handling the following asset
  annotations:

    - firmware:
      - `io.katacontainers.config.hypervisor.firmware`
      - `io.katacontainers.config.hypervisor.firmware_hash`

    - hypervisor:
      - `io.katacontainers.config.hypervisor.path`
      - `io.katacontainers.config.hypervisor.hypervisor_hash`

    - hypervisor control binary:
      - `io.katacontainers.config.hypervisor.ctlpath`
      - `io.katacontainers.config.hypervisor.hypervisorctl_hash`

    - jailer:
      - `io.katacontainers.config.hypervisor.jailer_path`
      - `io.katacontainers.config.hypervisor.jailer_hash`

- `addAssetAnnotations()` in the `oci` package was not handling the
  following asset annotations:

    - hypervisor:
      - `io.katacontainers.config.hypervisor.path`
      - `io.katacontainers.config.hypervisor.hypervisor_hash`

    - hypervisor control binary:
      - `io.katacontainers.config.hypervisor.ctlpath`
      - `io.katacontainers.config.hypervisor.hypervisorctl_hash`

    - jailer:
      - `io.katacontainers.config.hypervisor.jailer_path`
      - `io.katacontainers.config.hypervisor.jailer_hash`

This change fixes the bug where specifying a custom hypervisor path via an
asset annotation was having no effect.

Fixes: #1085.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-11-05 12:15:42 +00:00
James O. D. Hunt
0f26f1cd6f annotations: Add missing hypervisor control annotation
Add missing annotation definitions for a hypervisor control binary:

- `io.katacontainers.config.hypervisor.ctlpath`
- `io.katacontainers.config.hypervisor.hypervisorctl_hash`

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-11-05 12:12:58 +00:00
James O. D. Hunt
76064e3e2d asset: Formatting, grammar and whitespace
Improve formatting, grammar and whitespace.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-11-05 12:12:51 +00:00
bin liu
40418f6d88 runtime: add geust memory dump
When guest panic, dump guest kernel memory to host filesystem.
And also includes:
- hypervisor config
- hypervisor version
- and state of sandbox

Fixes: #1012

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-05 16:04:21 +08:00
Peng Tao
a958eaa8d3 runtime: mount shared mountpoint readonly
bindmount remount events are not propagated through mount subtrees,
so we have to remount the shared dir mountpoint directly.

E.g.,
```
mkdir -p source dest foo source/foo

mount -o bind --make-shared source dest

mount -o bind foo source/foo
echo bind mount rw
mount | grep foo
echo remount ro
mount -o remount,bind,ro source/foo
mount | grep foo
```
would result in:
```
bind mount rw
/dev/xvda1 on /home/ubuntu/source/foo type ext4 (rw,relatime,discard,data=ordered)
/dev/xvda1 on /home/ubuntu/dest/foo type ext4 (rw,relatime,discard,data=ordered)
remount ro
/dev/xvda1 on /home/ubuntu/source/foo type ext4 (ro,relatime,discard,data=ordered)
/dev/xvda1 on /home/ubuntu/dest/foo type ext4 (rw,relatime,discard,data=ordered)
```

The reason is that bind mount creats new mount structs and attaches them to different mount subtrees.
However, MS_REMOUNT only looks for existing mount structs to modify and does not try to propagate the
change to mount structs in other subtrees.

Fixes: #1061
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-11-04 17:51:49 +08:00
Peng Tao
125e21cea3 runtime: readonly mounts should be readonly bindmount on the host
So that we get protected at the VM boundary not just the guest kernel.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-11-04 17:51:49 +08:00
Bo Chen
93d7962510 clh: Consolidate the code path for device unplug
In cloud-hypervisor, it provides a single unified way of unplugging
devices, e.g. the `/vm.RemoveDevice` HTTP API. Taking advantage of this
API, we can simplify our implementation of `hotplugRemoveDevice` in
`clh.go`, where we can consolidate similar code paths for different
device unplug (e.g. no need to implement `hotplugRemoveBlockDevice` and
`hotplugRemoveVfioDevice` separately). We will only need to retrieve the
right `deviceID` based on the type of devices, and use the single
unified HTTP API for device unplug.

Fixes: #1076

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-11-03 15:46:38 -08:00
Julio Montes
77b50969ea runtime: cloud-hypervisor: reduce memory footprint
Cloud-hypervisor supports DAX, let's enable it to reduce its memory
footprint.

Before this patch:

**19.96M**

```
20448kB -- [/usr/share/kata-containers/kata.img]
```

With this patch:

**10.83M**

```
11100kB -- [/usr/share/kata-containers/kata.img]
```

fixes #1056

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-29 14:21:57 -06:00
Peng Tao
b316661818 runtime: remove the unused proto files
These are moved to the agent and no longer needed.

Fixes: #1028
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-25 10:57:38 +08:00
Peng Tao
54e23c8302 agent: move gogo.proto out of the github.com namespance
To follow the same namespace scope as other proto files.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-25 10:44:53 +08:00
Peng Tao
583e6ed3e5 agent: types.pb.go is not regenerated
When types.proto was relocated, types.pb.go is not regenerated and still
references the old location.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-25 10:35:35 +08:00
Archana Shinde
5f0b83cc54 Merge pull request #1000 from jongwu/pci
arm64: correct bridge type for QEMUVIRT
2020-10-22 13:53:27 -07:00
bin liu
5b065eb599 runtime: change govmm package
Change govmm package name from github.com/intel/govmm
to github.com/kata-containers/govmm

Fixes: #859

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-22 21:27:49 +08:00
Jianyong Wu
9eab301526 arm64: correct bridge type for QEMUVIRT
port forward PR https://github.com/kata-containers/runtime/pull/3017

Fixes: #3016
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2020-10-20 14:09:03 +08:00
Julio Montes
f162e7e960 Merge pull request #948 from justin-he/max_ports
virtcontainers: Append max_ports to virtio-serial device
2020-10-19 08:55:06 -05:00
Jia He
da79b4be67 virtcontainers: Append max_ports to virtio-serial device
Allow API consumers to change the maximum number of ports in the
virtio-serial devices, setting a lower number of ports can improve the
boot time and reduce the attack surface.

Before this patch on arm64:
[    0.028664] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.055031] printk: console [hvc0] enabled

After this patch on arm64:
[    0.028484] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.031370] printk: console [hvc0] enabled

Fixes: #2676
Signed-off-by: Jia He <justin.he@arm.com>
2020-10-16 23:40:54 +08:00
Peng Tao
0d5d69e8cd Merge pull request #902 from c3d/bug-v2/launchpad-1878234-access
Validate runtime annotations
2020-10-16 15:47:45 +08:00
Christophe de Dinechin
c5771be2de annotations: Correct unit tests to validate new protections
Add the verification of some basic protections, namely that:
- EnableAnnotations is honored
- Dangerous paths cannot be modified if no match
- Errors are returned when expected

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
398d79184c annotations: Split addHypervisorOverrides to reduce complexity
Warning from gocyclo during make check:
 virtcontainers/pkg/oci/utils.go:404:1: cyclomatic complexity 37 of func `addHypervisorConfigOverrides` is high (> 30) (gocyclo)
 func addHypervisorConfigOverrides(ocispec specs.Spec, config *vc.SandboxConfig, runtime RuntimeConfig) error {
^

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b2b3bc7ad8 annotations: Add unit test for checkPathIsInGlobs
There are a few interesting corner cases to consider for this
function.

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
6f52179ce4 annotations: Add unit test for regexpContains function
James O.D Hunt: "But also, regexpContains() and
checkPathIsInGlobList() seem like good candidates for some unit
tests. The "look" obvious, but a few boundary condition tests would be
useful I think (filenames with spaces, backslashes, special
characters, and relative & absolute paths are also an interesting
thought here)."

There aren't that many boundary conditions on a list with regexps,
if you assume the regexp match function itself works. However, the
tests is useful in documenting expectations.

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b119427405 annotations: Give better names to local variabes in search functions
Use more meaningful variable names for clarity.

Fixes: #901

Suggested-by: James O.D. Hunt james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b5db114aad annotations: Rename checkPathIsInGlobList with checkPathIsInGlobs
The name is shorter and more specific

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
7c6aede5d4 config: Whitelist hypervisor annotations by name
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."

For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":

  enable_annotations = [ "virtio.*", "initrd", "_path" ]

The default is an empty list of enabled annotations, which disables
annotations entirely.

If an anontation is rejected, the message is something like:

  annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled

Fixes: #901

Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
f047fced0b config: Use glob instead of regexp to match paths in annotations
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
11b9c90cd8 annotations: Fix typo in comment
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
4e89b885d2 config: Protect file_mem_backend against annotation attacks
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
aae9656d8b config: Protect vhost_user_store_path against annotation attacks
This path could be used to overwrite data on the host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b21a829c61 config: Protect ctlpath from annotation attack
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
27b6620b23 config: Protect jailer_path annotation
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2d431c61c6 annotations: Simplify negative logic
Replace strange negative logic  (!ok -> continue) with positive
logic (ok -> do it)

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2ca9ca892d config: Add hypervisor path override through annotations
The annotation is provided, so it should be respected.
Furthermore, it is important to implement it with the appropriate
protetions similar to what was done for virtiofsd.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2e093dfd8b config: Fix typo in function name
There was an extra 'p' in addHypervisorVirtioFsOverrides.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
bf13ff0a3a config: Protect virtio_fs_daemon annotation
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Eric Ernst
d8a8fe47fb cpuset: don't set cpuset.mems in the guest
Kata doesn't map any numa topologies in the guest. Let's make sure we
clear the Cpuset fields before passing container updates to the
guest.

Note, in the future we may want to have a vCPU to guest CPU mapping and
still include the cpuset.Cpus. Until we have this support, clear this as
well.

Fixes: #932

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
88cd712876 sandbox: consider cpusets if quota is not enforced
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
77a463e57a cpuset: support setting mems for sandbox
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
2d690536b8 cpuset: add cpuset pkg
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
12cc0ee168 sandbox: don't constrain cpus, mem only cpuset, devices
Allow for constraining the cpuset as well as the devices-whitelist . Revert
sandbox constraints for cpu/memory, as they break the K8S use case. Can
re-add behind a non-default flag in the future.

The sandbox CPUSet should be updated every time a container is created,
updated, or removed.

To facilitate this without rewriting the 'non constrained cgroup'
handling, let's add to the Sandbox's cgroupsUpdate function.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Eric Ernst
b6cf68a985 cgroups: add ability to update CPUSet
Add function for applying a cpuset change to a cgroup

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Eric Ernst
b812d4f7fa virtcontainers: add method for calculating cpuset for sandbox
Calculate sandbox's CPUSet as the union of each of the container's
CPUSets.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Bin Liu
43da14e7b3 Merge pull request #752 from YchauWang/clear-moke-code01
runtime: Clear the VCMock 1.x API Methods from 2.0
2020-10-09 17:41:21 +08:00