Commit Graph

5564 Commits

Author SHA1 Message Date
Christophe de Dinechin
997f7c4433 annotations: Rename checkPathIsInGlobList with checkPathIsInGlobs
The name is shorter and more specific

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:43:15 +08:00
Christophe de Dinechin
74d4065197 config: Add better comments in the template files
When there is a default value from the code (usually empty) that
differs from a possible suggested value from the distro, then the
wording "default: empty" is confusing.

Fixes: #901

Suggested-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:43:15 +08:00
Christophe de Dinechin
73bb3fdbee config: Whitelist hypervisor annotations by name
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."

For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":

  enable_annotations = [ "virtio.*", "initrd", "_path" ]

The default is an empty list of enabled annotations, which disables
annotations entirely.

If an anontation is rejected, the message is something like:

  annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled

Fixes: #901

Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:43:10 +08:00
Christophe de Dinechin
5a587ba506 config: Use glob instead of regexp to match paths in annotations
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
29f5dec38f annotations: Fix typo in comment
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
d71f9e1155 config: Add makefile variables for path lists
Add variables to override defaults at build time for the various lists
used to control path annotations.

Fixes: #901

Suggested-by: Fabiano Fidencio <fidencio@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
28c386c51f config: Protect file_mem_backend against annotation attacks
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
c2a186b18c config: Protect vhost_user_store_path against annotation attacks
This path could be used to overwrite data on the host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
8cd094cf06 config: Add security warning on configuration examples
Add the following text explaining the risk of using regular
expressions in path lists:

Each member of the list can be a regular expression, but prefer names.
Otherwise, please read and understand the following carefully.
SECURITY WARNING: If you use regular expressions, be mindful that
an attacker could craft an annotation that uses .. to escape the paths
you gave. For example, if your regexp is /bin/qemu.* then if there is
a directory named /bin/qemu.d/, then an attacker can pass an annotation
containing /bin/qemu.d/../put-any-binary-name-here and attack your host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
b5f2a1e8c4 config: Protect ctlpath from annotation attack
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
2d65b3bfd8 config: Protect jailer_path annotation
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
fe5e1cf2e1 config: Add examples for path_list configuration
The path_list configuration gives a series of regular expressions that
limit which values are acceptable through annotations in order to
avoid kata launching arbitrary binaries on the host when receiving an
annotation.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
3f7bcf54f0 annotations: Simplify negative logic
Replace strange negative logic  (!ok -> continue) with positive
logic (ok -> do it)

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
80144fc415 config: Add hypervisor path override through annotations
The annotation is provided, so it should be respected.
Furthermore, it is important to implement it with the appropriate
protetions similar to what was done for virtiofsd.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
2f5f35608a config: Fix typo in function name
There was an extra 'p' in addHypervisorVirtioFsOverrides.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
2faafbdd3a config: Protect virtio_fs_daemon annotation
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Christophe de Dinechin
9e5ed41511 config: Add 'List' alternates for hypervisor configuration paths
Paths mentioned in the hypervisor configuration can be overriden
using annotations, which is potentially dangerous. For each path,
add a 'List' variant that specifies the list of acceptable values
from annotations.

Bug: https://bugs.launchpad.net/katacontainers.io/+bug/1878234

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-18 00:40:16 +08:00
Peng Tao
b33d4fe708 agent: fix panic on malformed device resource in container update
Somehow containerd is sending a malformed device in update API. While it
should not happen, we should not panic either.

Fixes: #946
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-18 00:40:16 +08:00
Eric Ernst
183823398d cpuset: add cpuset pkg
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
bfbbe8ba6b cpuset: don't set cpuset.mems in the guest
Kata doesn't map any numa topologies in the guest. Let's make sure we
clear the Cpuset fields before passing container updates to the
guest.

Note, in the future we may want to have a vCPU to guest CPU mapping and
still include the cpuset.Cpus. Until we have this support, clear this as
well.

Fixes: #932

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
5c21ec278c sandbox: consider cpusets if quota is not enforced
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
9bb0d48d56 cpuset: support setting mems for sandbox
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
64a2ef62e0 virtcontainers: add method for calculating cpuset for sandbox
Calculate sandbox's CPUSet as the union of each of the container's
CPUSets.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
a441f21c40 cpuset: add cpuset pkg
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
James O. D. Hunt
ce54090f25 docs: Update upgrading guide
Update the upgrading guide for 2.0.

Fixes: #928.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-10-18 00:40:16 +08:00
Ychau Wang
e884fef483 docs: update the build kata containers kernel document
Update the build kata containers kernel document for 2.0 release. Fixed
the 1.x release project paths and urls, using the kata-containers
project file paths and urls.

Fixes: #929

Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
2020-10-18 00:40:16 +08:00
David Gibson
9c16643c12 agent/device: Check type as well as major:minor when looking up devices
To update device resource entries from host to guest, we search for
the right entry by host major:minor numbers, then later update it.
However block and character devices exist in separate major:minor
namespaces so we could have one block and one character device with
matching major:minor and thus incorrectly update both with the details
for whichever device is processed second.

Add a check on device type to prevent this.

Port from the Kata 1 Go agent
https://github.com/kata-containers/agent/commit/27ebdc9d2761

Fixes: #703

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-18 00:40:16 +08:00
David Gibson
4978c9092c agent/device: Index all devices in spec before updating them
The agent needs to update device entries in the OCI spec so that it
has the correct major:minor numbers for the guest, which may differ
from the host.

Entries in the main device list are looked up by device path, but
entries in the device resources list are looked up by (host)
major:minor.  This is done one device at a time, updating as we go in
update_spec_device_list().

But since the host and guest have different namespaces, one device
might have the same major:minor as a different device on the host.  In
that case we could update one resource entry to the correct guest
values, then mistakenly update it again because it now matches a
different host device.

To avoid this, rather than looking up and updating one by one, we make
all the lookups in advance, creating a map from (host) device path to
the indices in the spec where the device and resource entries can be
found.

Port from the Go agent in Kata 1,
https://github.com/kata-containers/agent/commit/d88d46849130

Fixes: #703

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-18 00:40:16 +08:00
David Gibson
a7ba362f92 agent/device: Forward port update_spec_device_list() unit test
The Kata 1 Go agent included a unit test for updateSpecDeviceList, but no
such unit test exists for the Rust agent's equivalent
update_spec_device_list().  Port the Kata1 test to Rust.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-18 00:40:16 +08:00
David Gibson
230a9833f8 agent/device: update_spec_device_list() should error if dev not found
If update_spec_device_list() is given a device that can't be found in the
OCI spec, it currently does nothing, and returns Ok(()).  That doesn't
seem like what we'd expect and is not what the Go agent in Kata 1 does.

Change it to return an error in that case, like Kata 1.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-18 00:40:16 +08:00
Eric Ernst
a6d9fd4118 sandbox: don't constrain cpus, mem only cpuset, devices
Allow for constraining the cpuset as well as the devices-whitelist . Revert
sandbox constraints for cpu/memory, as they break the K8S use case. Can
re-add behind a non-default flag in the future.

The sandbox CPUSet should be updated every time a container is created,
updated, or removed.

To facilitate this without rewriting the 'non constrained cgroup'
handling, let's add to the Sandbox's cgroupsUpdate function.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
8f0cb2f1ea cgroups: add ability to update CPUSet
Add function for applying a cpuset change to a cgroup

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
Eric Ernst
cbdae44992 agent: fix errorneous parsing for guest block size
We were assuming base 10 string before, when the block size from sysfs
is actually a hex string. Let's fix that.

Fixes: #908

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-18 00:40:16 +08:00
James O. D. Hunt
97acaa8124 docs: Add containerd install guide
Create a containerd installation guide and a new `kata-manager` script
for 2.0 that automated the steps outlined in the guide.

Also cleaned up and improved the installation documentation in various
ways, the most significant being:

- Added legacy install link for 1.x installs.
- Official packages section:
  - Removed "Contact" column (since it was empty!)
  - Reworded "Versions" column to clarify the versions are a minimum
    (to reduce maintenance burden).
  - Add a column to show which installation methods receive automatic updates.
  - Modified order of installation options in table and document to
    de-emphasise automatic installation and promote official packages
    and snap more.
- Removed sections no longer relevant for 2.0.

Fixes: #738.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-10-18 00:40:16 +08:00
bin liu
23246662b2 agent: use ok_or/map_err instead of match
Sometimes `Option.or_or` and `Result.map_err` may be simpler
than match statement. Especially in rpc.rs, there are
many `ctr.get_process` and `sandbox.get_container` which
are using `match`.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
ebe5ad1386 rustjail: use Iterator to manipulate vector elements
Use Iterator can save codes, and make code more readable

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
c9497c88e4 rustjail: delete codes commented out
There are some uses/codes/struct fields are commented out, and
may not turn into  un-comment these codes, so delete these comments.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
d5d9928f97 rustjail: delete unused test code
The auto generated test code is no meanings, delete it.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
f70892a5bb agent: use chain of Result to avoid early return
Use rust `Result`'s `or_else`/`and_then` can write clean codes.
And can avoid early return by check wether the `Result`
is `Ok` or `Err`.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
ab64780a0b agent: update not accurate comments
This commit includes:
- update comments that not matched the function name
- file path with doubled slash

Fixes: #922

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
9e064ba192 agent: use macro to simplify parse_cmdline function in config.rs
In function parse_cmdline there are some similar codes, if we want
to add more commandline arguments, the code will grow too long.
Use macro can reduce some codes with the same logic/processing.

Fixes: #914

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
42c48f54ed agent: add blank lines between methods
In rpc.rs, there are no blank lines between methods, this commit
add blank lines for these methods.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
d3a36fa06f agent: delete unused field in agentService
The code is for test, and not needed now.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
fa546600ff agent: use no-named closure to reduce codes
For simple closures, inline closures can save codes.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
efddcb4ab8 agent: use a local fn to reduce duplicated codes
The same codes used twices, aggregated into a function can
reduce codes.

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00
Peng Tao
7bb3e562bc packaging: fix cloud-hypervisor binary path
1. ensure build-static-clh.sh puts cloud-hypervisor under ./cloud-hypervisor directory
2. install cloud-hypervisor/cloud-hypervisor binary

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-18 00:40:16 +08:00
Peng Tao
7b53041bad packaging: fix missing cloud_hypervisor_repo
It is needed in order to build from source.

Fixes: #916
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-18 00:40:16 +08:00
Peng Tao
38212ba6d8 packaging: apply qemu v5.1 stable fixes
Qemu v5.1 was released with an affending commit 9b3a35ec82
(virtio: verify that legacy support is not accidentally on).
As a result, it breaks commandline compatiblilities for old qemu
users. Upstream qemu has fixed it but no release has been put out yet.
Let's apply these fixes by hand for now.

Refs: https://www.mail-archive.com/qemu-devel@nongnu.org/msg729556.html

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-18 00:40:16 +08:00
Jianyong Wu
fb7e9b4f32 agent: fix aarch64 build
aarch64 needs libgcc to resolve some non-builtin symbols.

Fixes: #909
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-18 00:40:16 +08:00
bin liu
0cfcbf79b8 docs: add namespace key to pod/container config files
If no namespace field in config files, CRI-O will failed:
 setting pod sandbox name and id: cannot generate pod name without namespace

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-18 00:40:16 +08:00