- backport 2.0-dev commits to stable-2.0.0
dbfe85e snap: install libseccomp-dev
0c3b6a9 package: drop qemu-virtiofs shim
f751c98 packaging: install virtiofsd for normal qemu build as well
08361c5 runtime: enable virtiofs by default
da9bfb2 runtime: Pass `--thread-pool-size=1` to virtiofsd
7347d43 packaging: Apply virtiofs performance related fixes to 5.x
c7bb1e2 tools: Improve agent-ctl README
e6f7ddd tools: Make agent-ctl support more APIs
46cfed5 tools: Remove commented out code in agent-ctl
81fb2c9 tools: Log request in agent-ctl tool if debug enabled
0c43215 tools: Rename agent-ctl command to GetGuestDetails
6511ffe tools: Fix comment in agent-ctl
ee59378 kernel: update to 5.4.71
ef11213 config: make virtio-fs part of standard kernel
1fb6730 agent: remove `unwrap()` for `e.as_errno()`
05e9fe0 agent: Use `?` instead of `match` when the error returns directly
d658129 kata-monitor: use regexp to check if runtime is kata containers
ae2d89e agent: use anyhow `context` to attach context to `Error` instead of `match`
095d4ad agent: remove useless match
bd816df agent: Use `ok_or_else` instead of match for Option -> Result
d413bf7 agent: Fix crasher if AddARPNeighbors request empty
76408c0 agent: Fix crasher if UpdateRoutes request empty
6e4da19 agent: Fix crasher if UpdateInterface request empty
8f8061d agent: replace `match Result` with `or_else`
64e4b2f agent: replace unnecessary `match Result` with `map_err`
7c0d68f agent: replace check! with map_err for readability
82ed34a agent: remove `check!` in child process because we cant' see logs.
9def624 agent: replace `if let Err` with `or_else`
6926914 agent: refactor namespace::setup to optimize error handling
e733c13 agent: replace `if let Err` with `map_err`
ba069f9 rustjail: add length check for uid_mappings in rootless euid mapping
cc8ec7b versions: Update Kubernetes, containerd, cri-o and cri-tools
8a364d2 annotations: Correct unit tests to validate new protections
0cc6297 annotations: Split addHypervisorOverrides to reduce complexity
b6059f3 annotations: Add unit test for checkPathIsInGlobs
c6afad2 annotations: Add unit test for regexpContains function
451608f makefile: Add missing generated vars to `USER_VARS`
8328136 makefile: Improve names of config entries for annotation checks
a92a630 annotations: Give better names to local variabes in search functions
997f7c4 annotations: Rename checkPathIsInGlobList with checkPathIsInGlobs
74d4065 config: Add better comments in the template files
73bb3fd config: Whitelist hypervisor annotations by name
5a587ba config: Use glob instead of regexp to match paths in annotations
29f5dec annotations: Fix typo in comment
d71f9e1 config: Add makefile variables for path lists
28c386c config: Protect file_mem_backend against annotation attacks
c2a186b config: Protect vhost_user_store_path against annotation attacks
8cd094c config: Add security warning on configuration examples
b5f2a1e config: Protect ctlpath from annotation attack
2d65b3b config: Protect jailer_path annotation
fe5e1cf config: Add examples for path_list configuration
3f7bcf5 annotations: Simplify negative logic
80144fc config: Add hypervisor path override through annotations
2f5f356 config: Fix typo in function name
2faafbd config: Protect virtio_fs_daemon annotation
9e5ed41 config: Add 'List' alternates for hypervisor configuration paths
b33d4fe agent: fix panic on malformed device resource in container update
1838233 cpuset: add cpuset pkg
bfbbe8b cpuset: don't set cpuset.mems in the guest
5c21ec2 sandbox: consider cpusets if quota is not enforced
9bb0d48 cpuset: support setting mems for sandbox
64a2ef6 virtcontainers: add method for calculating cpuset for sandbox
a441f21 cpuset: add cpuset pkg
ce54090 docs: Update upgrading guide
e884fef docs: update the build kata containers kernel document
9c16643 agent/device: Check type as well as major:minor when looking up devices
4978c90 agent/device: Index all devices in spec before updating them
a7ba362 agent/device: Forward port update_spec_device_list() unit test
230a983 agent/device: update_spec_device_list() should error if dev not found
a6d9fd4 sandbox: don't constrain cpus, mem only cpuset, devices
8f0cb2f cgroups: add ability to update CPUSet
cbdae44 agent: fix errorneous parsing for guest block size
97acaa8 docs: Add containerd install guide
2324666 agent: use ok_or/map_err instead of match
ebe5ad1 rustjail: use Iterator to manipulate vector elements
c9497c8 rustjail: delete codes commented out
d5d9928 rustjail: delete unused test code
f70892a agent: use chain of Result to avoid early return
ab64780 agent: update not accurate comments
9e064ba agent: use macro to simplify parse_cmdline function in config.rs
42c48f5 agent: add blank lines between methods
d3a36fa agent: delete unused field in agentService
fa54660 agent: use no-named closure to reduce codes
efddcb4 agent: use a local fn to reduce duplicated codes
7bb3e56 packaging: fix cloud-hypervisor binary path
7b53041 packaging: fix missing cloud_hypervisor_repo
38212ba packaging: apply qemu v5.1 stable fixes
fb7e9b4 agent: fix aarch64 build
0cfcbf7 docs: add namespace key to pod/container config files
997f1f6 docs: Add crictl example json files
f60f43a runtime: Clear the VCMock 1.x API Methods from 2.0
1789527 ci: snap: add event filtering
999f67d agent: do not follow link when mounting container proc and sysfs
cb2255f agent: set init process non-dumpable
2a6c9ee agent-ctl: include cargo lock updates
eaff5de versions: add plugins section
4f1d23b virtiofs: Disable DAX
6d80df9 snap: specify python version
a116ce0 osbuilder: Create target directory for agent
4dc3bc0 rust-agent: Treat warnings as error
8f7a484 rust-agent: Identify unused results in tests
ce54e5d rust-agent: Log returned errors rather than ignore them
9adb7b7 rust-agent: Remove unused imports
73ab9b1 rust-agent: Report errors to caller if possible
4db3f9e rust-agent: Ignore write errors while writing to the logs
19cb657 rust-agent: Remove unused code that has undefined behavior
86bc151 rust-agent: Remove 'mut' where not needed
8d8adb6 rust-agent: Remove uses of deprecated functions
76298c1 rust-agent: Remove or rename unused parameters
7d303ec rust-agent: Remove or rename unused variables
e0b79eb rust-agent: Remove unused functions
8ed61b1 rust-agent: Remove useless braces
cc4f02e rust-agent: Remove unused macros
ace6f1e clh: Support VFIO device unplug
47cfeaa clh: Remove unnecessary VmmPing
63c4757 versions: cloud-hypervisor: Bump to version 6d30fe05
059b89c docs: Change kata_tap0 to tap0_kata
4ff3ed5 docs: update networking description
de8dcb1 dev-guide: update kata-agent install details
c488cc4 docs: Update docs for enabling agent debug console
e5acb12 docs: update dev guide for agent build
1bddde7 ci: add github action to test the snap
9517b0a versions: cloud-hypervisor: bump version
f5a7175 runtime: cloud-hypervisor: tag openapi-generator-cli container
Signed-off-by: Ubuntu <ubuntu@ip-172-31-19-197.ap-southeast-1.compute.internal>
For experimental-virtiofs, we use it to test virtiofs with DAX. Let's
rename its virtiofsd to virtiofsd-dax.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We've been shipping it for a long time. It's time to make it default
replacing the old obsolet 9pfs.
Fixes: #935
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Dave Gilbert brough up that passing --thread-pool-size=1 to virtiofsd
may result in a performance improvement especially when using
`cache=none`. While our current default is `cache=auto`, Dave mentioned
that he seems no harm in having it set and he also mentiond that it may
use a lot less stack space on aarch/arm.
Fixes: #943
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Vivek Goyal found out that using "shared" thread pool, instead of
"exclusive" results in better performance.
Knowning that and with the plan to have virtio-fs as the default fs for
the 2.0, let's bring this patch in for both 5.0 and 5.1.
Fixes: #944
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Added new `agent-ctl` commands to allow the following agent API calls to
be made:
- `AddARPNeighborsRequest`
- `CloseStdinRequest`
- `CopyFileRequest`
- `GetMetricsRequest`
- `GetOOMEventRequest`
- `MemHotplugByProbeRequest`
- `OnlineCPUMemRequest`
- `ReadStreamRequest`
- `ReseedRandomDevRequest`
- `SetGuestDateTimeRequest`
- `TtyWinResizeRequest`
- `UpdateContainerRequest`
- `WriteStreamRequest`
Fixes: #969.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Basic virtio-fs support has made it upstream in the Linux kernel, as
well as in QEMU and Cloud Hypervisor. Let's go ahead and add it to the
standard configuration.
Since the device driver / DAX handling is still in progress for
upstream, we will want to still build a seperate experimental kernel for
those who are comfortable trading off bleeding edge stability/kernel
updates for improved FIO numbers.
Fixes: #963
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Use `{:?}` to print `e.as_errno()` instead of using `{}`
to print `e.as_errno().unwrap().desc()`.
Avoid panic only caused by error's content.
Signed-off-by: Tim Zhang <tim@hyper.sh>
To support a few common configurations for Kata, including:
- `io.containerd.kata.v2`
- `io.containerd.kata-qemu.v2`
- `io.containerd.kata-clh.v2`
`kata-monintor` changes to use regexp instead of direct string comparison.
Fixes: #957
Signed-off-by: bin liu <bin@hyper.sh>
Check if the ARP neighbours specified in the `AddARPNeighbors` API is
set before using it to avoid crashing the agent.
Fixes: #955.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Check if the routes specified in the `UpdateRoutes` API is set before
using it to avoid crashing the agent.
Fixes: #949.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Check if the interface specified in the `UpdateInterface` API is set
before using it to avoid crashing the agent.
Fixes: #950.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Kubernetes: from 1.17.3 to 1.18.9
CRI-O: from 0eec454168e381e460b3d6de07bf50bfd9b0d082 (1.17) to 1.18.3
Containerd: from 3a4acfbc99aa976849f51a8edd4af20ead51d8d7 (1.3.3) to 1.3.7
cri-tools: from 1.17.0 to 1.18.0
Fixes: #960.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add the verification of some basic protections, namely that:
- EnableAnnotations is honored
- Dangerous paths cannot be modified if no match
- Errors are returned when expected
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Warning from gocyclo during make check:
virtcontainers/pkg/oci/utils.go:404:1: cyclomatic complexity 37 of func `addHypervisorConfigOverrides` is high (> 30) (gocyclo)
func addHypervisorConfigOverrides(ocispec specs.Spec, config *vc.SandboxConfig, runtime RuntimeConfig) error {
^
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
There are a few interesting corner cases to consider for this
function.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
James O.D Hunt: "But also, regexpContains() and
checkPathIsInGlobList() seem like good candidates for some unit
tests. The "look" obvious, but a few boundary condition tests would be
useful I think (filenames with spaces, backslashes, special
characters, and relative & absolute paths are also an interesting
thought here)."
There aren't that many boundary conditions on a list with regexps,
if you assume the regexp match function itself works. However, the
tests is useful in documenting expectations.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This was discovered while checking a massive change in variables.
The root cause for the error is a very long list of manual
replacements, that is best replaced with a $(foreach).
All individual variables in the output configuration files were
checked against the old build using diff.
This is a forward port of a makefile fix included in
PR https://github.com/kata-containers/runtime/issues/3004
for issue https://github.com/kata-containers/runtime/issues/2943Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The entries used to be things like PATH_LIST, which are too generic.
Replace them with more precise name with a distinguishing keyword,
namely VALID. For example valid_hypervisor_paths.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When there is a default value from the code (usually empty) that
differs from a possible suggested value from the distro, then the
wording "default: empty" is confusing.
Fixes: #901
Suggested-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."
For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":
enable_annotations = [ "virtio.*", "initrd", "_path" ]
The default is an empty list of enabled annotations, which disables
annotations entirely.
If an anontation is rejected, the message is something like:
annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled
Fixes: #901
Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add variables to override defaults at build time for the various lists
used to control path annotations.
Fixes: #901
Suggested-by: Fabiano Fidencio <fidencio@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add the following text explaining the risk of using regular
expressions in path lists:
Each member of the list can be a regular expression, but prefer names.
Otherwise, please read and understand the following carefully.
SECURITY WARNING: If you use regular expressions, be mindful that
an attacker could craft an annotation that uses .. to escape the paths
you gave. For example, if your regexp is /bin/qemu.* then if there is
a directory named /bin/qemu.d/, then an attacker can pass an annotation
containing /bin/qemu.d/../put-any-binary-name-here and attack your host.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The path_list configuration gives a series of regular expressions that
limit which values are acceptable through annotations in order to
avoid kata launching arbitrary binaries on the host when receiving an
annotation.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The annotation is provided, so it should be respected.
Furthermore, it is important to implement it with the appropriate
protetions similar to what was done for virtiofsd.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Paths mentioned in the hypervisor configuration can be overriden
using annotations, which is potentially dangerous. For each path,
add a 'List' variant that specifies the list of acceptable values
from annotations.
Bug: https://bugs.launchpad.net/katacontainers.io/+bug/1878234Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Somehow containerd is sending a malformed device in update API. While it
should not happen, we should not panic either.
Fixes: #946
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Kata doesn't map any numa topologies in the guest. Let's make sure we
clear the Cpuset fields before passing container updates to the
guest.
Note, in the future we may want to have a vCPU to guest CPU mapping and
still include the cpuset.Cpus. Until we have this support, clear this as
well.
Fixes: #932
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Update the build kata containers kernel document for 2.0 release. Fixed
the 1.x release project paths and urls, using the kata-containers
project file paths and urls.
Fixes: #929
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
To update device resource entries from host to guest, we search for
the right entry by host major:minor numbers, then later update it.
However block and character devices exist in separate major:minor
namespaces so we could have one block and one character device with
matching major:minor and thus incorrectly update both with the details
for whichever device is processed second.
Add a check on device type to prevent this.
Port from the Kata 1 Go agent
https://github.com/kata-containers/agent/commit/27ebdc9d2761Fixes: #703
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The agent needs to update device entries in the OCI spec so that it
has the correct major:minor numbers for the guest, which may differ
from the host.
Entries in the main device list are looked up by device path, but
entries in the device resources list are looked up by (host)
major:minor. This is done one device at a time, updating as we go in
update_spec_device_list().
But since the host and guest have different namespaces, one device
might have the same major:minor as a different device on the host. In
that case we could update one resource entry to the correct guest
values, then mistakenly update it again because it now matches a
different host device.
To avoid this, rather than looking up and updating one by one, we make
all the lookups in advance, creating a map from (host) device path to
the indices in the spec where the device and resource entries can be
found.
Port from the Go agent in Kata 1,
https://github.com/kata-containers/agent/commit/d88d46849130Fixes: #703
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Kata 1 Go agent included a unit test for updateSpecDeviceList, but no
such unit test exists for the Rust agent's equivalent
update_spec_device_list(). Port the Kata1 test to Rust.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If update_spec_device_list() is given a device that can't be found in the
OCI spec, it currently does nothing, and returns Ok(()). That doesn't
seem like what we'd expect and is not what the Go agent in Kata 1 does.
Change it to return an error in that case, like Kata 1.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Allow for constraining the cpuset as well as the devices-whitelist . Revert
sandbox constraints for cpu/memory, as they break the K8S use case. Can
re-add behind a non-default flag in the future.
The sandbox CPUSet should be updated every time a container is created,
updated, or removed.
To facilitate this without rewriting the 'non constrained cgroup'
handling, let's add to the Sandbox's cgroupsUpdate function.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
We were assuming base 10 string before, when the block size from sysfs
is actually a hex string. Let's fix that.
Fixes: #908
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Create a containerd installation guide and a new `kata-manager` script
for 2.0 that automated the steps outlined in the guide.
Also cleaned up and improved the installation documentation in various
ways, the most significant being:
- Added legacy install link for 1.x installs.
- Official packages section:
- Removed "Contact" column (since it was empty!)
- Reworded "Versions" column to clarify the versions are a minimum
(to reduce maintenance burden).
- Add a column to show which installation methods receive automatic updates.
- Modified order of installation options in table and document to
de-emphasise automatic installation and promote official packages
and snap more.
- Removed sections no longer relevant for 2.0.
Fixes: #738.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Sometimes `Option.or_or` and `Result.map_err` may be simpler
than match statement. Especially in rpc.rs, there are
many `ctr.get_process` and `sandbox.get_container` which
are using `match`.
Signed-off-by: bin liu <bin@hyper.sh>
There are some uses/codes/struct fields are commented out, and
may not turn into un-comment these codes, so delete these comments.
Signed-off-by: bin liu <bin@hyper.sh>
Use rust `Result`'s `or_else`/`and_then` can write clean codes.
And can avoid early return by check wether the `Result`
is `Ok` or `Err`.
Signed-off-by: bin liu <bin@hyper.sh>
This commit includes:
- update comments that not matched the function name
- file path with doubled slash
Fixes: #922
Signed-off-by: bin liu <bin@hyper.sh>
In function parse_cmdline there are some similar codes, if we want
to add more commandline arguments, the code will grow too long.
Use macro can reduce some codes with the same logic/processing.
Fixes: #914
Signed-off-by: bin liu <bin@hyper.sh>
Qemu v5.1 was released with an affending commit 9b3a35ec82
(virtio: verify that legacy support is not accidentally on).
As a result, it breaks commandline compatiblilities for old qemu
users. Upstream qemu has fixed it but no release has been put out yet.
Let's apply these fixes by hand for now.
Refs: https://www.mail-archive.com/qemu-devel@nongnu.org/msg729556.html
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
If no namespace field in config files, CRI-O will failed:
setting pod sandbox name and id: cannot generate pod name without namespace
Signed-off-by: bin liu <bin@hyper.sh>
Clear the 1.x branch api methods in the 2.0. Keep the same methods to
the VC interface, like the VCImpl struct.
Fixes: #751
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
Run the snap CI on every PR is not needed. Don't run the snap CI
on PRs that don't change the source code (*.go/*.rs), a configuration
file or Makefile.
fixes#896
Signed-off-by: Julio Montes <julio.montes@intel.com>
Attackers might use it to explore other containers in the same pod.
While it is still safe to allow it, we can just close the race window
like runc does.
Fixes: #885
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
On old kernels (like v4.9), kernel applies CLOECEC in wrong order w.r.t.
dumpable task flags. As a result, we might leak guest file descriptor to
containers. This is a former runc CVE-2016-9962 and still applies to
kata agent. Although Kata container is still valid at protecting the
host, we should not leak extra resources to user containers.
This sets the init processes that join and setup the container's
namespaces as non-dumpable before they setns to the container's pid (or
any other ) namespace.
This settings is automatically reset to the default after the Exec in
the container so that it does not change functionality for the
applications that are running inside, just our init processes.
This prevents parent processes, the pid 1 of the container, to ptrace
the init process before it drops caps and other sets LSMs.
The order during the exec syscall is that the process is set back to
dumpable before O_CLOEXEC are processed.
Refs:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=613cc2b6f272c1a8ad33aefa21cad77af23139f7https://github.com/torvalds/linux/blob/v4.9/fs/exec.c#L1290-L1318opencontainers/runc@50a19c6https://nvd.nist.gov/vuln/detail/CVE-2016-9962Fixes: #890
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Simply running `make` would generate some cargo lock updates for
agent-ctl. Let's include them so that we have fixed dependencies.
Fixes: #883
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
plugins sections contains the details of plugins required for
the components or testing.
Add sriov-network-device-plugin url and version that are consumed
by the VFIO test in the tests repository.
fixes#879
Signed-off-by: Julio Montes <julio.montes@intel.com>
virtiofs DAX support is not stable today, there are
a few corner cases to make it default.
Fixes: #862Fixes: #875
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to avoid `unmet dependencies` error in the CI,
the python version must be specified in the yaml.
fixes#877
Signed-off-by: Julio Montes <julio.montes@intel.com>
When building with AGENT_SOURCE_BIN pointing to an already built
kata-agent binary, the target directory needs to be created in the
rootfs tree.
Fixes#873
Signed-off-by: Ralf Haferkamp <rhafer@suse.com>
Assign unused results to _ in order to silence warnings.
This addresses the following warnings:
warning: unused `std::result::Result` that must be used
--> rustjail/src/mount.rs:1182:16
|
1182 | defer!(unistd::chdir(&olddir););
| ^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/mount.rs:1183:9
|
1183 | unistd::chdir(tempdir.path());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
While in regular code, we want to log possible errors, in test code
it's OK to simply ignore the returned value.
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
In a number of cases, we have functions that return a Result<...>
and where the possible error case is simply ignored. This is a bit
unhealthy.
Add a `check!` macro that allows us to not ignore error values
that we want to log, while not interrupting the flow by returning
them. This is useful for low-level functions such as `signal::kill` or
`unistd::close` where an error is probably significant, but should not
necessarily interrupt the flow of the program (i.e. using `call()?` is
not the right answer.
The check! macro is then used on low-level calls. This addresses the
following warnings from #750:
This addresses the following warning:
warning: unused `std::result::Result` that must be used
--> /home/ddd/go/src/github.com/kata-containers-2.0/src/agent/rustjail/src/container.rs:903:17
|
903 | signal::kill(Pid::from_raw(p.pid), Some(Signal::SIGKILL));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> /home/ddd/go/src/github.com/kata-containers-2.0/src/agent/rustjail/src/container.rs:916:17
|
916 | signal::kill(Pid::from_raw(child.id() as i32), Some(Signal::SIGKILL));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:340:13
|
340 | write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:554:13
|
554 | / write_sync(
555 | | cwfd,
556 | | SYNC_FAILED,
557 | | format!("setgroups failed: {:?}", e).as_str(),
558 | | );
| |______________^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:340:13
|
340 | write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:340:13
|
340 | write_sync(cwfd, SYNC_FAILED, format!("{:?}", e).as_str());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:554:13
|
554 | / write_sync(
555 | | cwfd,
556 | | SYNC_FAILED,
557 | | format!("setgroups failed: {:?}", e).as_str(),
558 | | );
| |______________^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:626:5
|
626 | unistd::close(cfd_log);
| ^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:627:5
|
627 | unistd::close(crfd);
| ^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:628:5
|
628 | unistd::close(cwfd);
| ^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:770:9
|
770 | fcntl::fcntl(pfd_log, FcntlArg::F_SETFD(FdFlag::FD_CLOEXEC));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:799:9
|
799 | fcntl::fcntl(prfd, FcntlArg::F_SETFD(FdFlag::FD_CLOEXEC));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:800:9
|
800 | fcntl::fcntl(pwfd, FcntlArg::F_SETFD(FdFlag::FD_CLOEXEC));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:803:13
|
803 | unistd::close(prfd);
| ^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:930:9
|
930 | log_handler.join();
| ^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:803:13
|
803 | unistd::close(prfd);
| ^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:804:13
|
804 | unistd::close(pwfd);
| ^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:842:13
|
842 | sched::setns(old_pid_ns, CloneFlags::CLONE_NEWPID);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/container.rs:843:13
|
843 | unistd::close(old_pid_ns);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
Fixes: #844Fixes: #750
Suggested-by: Tim Zhang <tim@hyper.sh>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Various recently added error-causing calls
This addresses the following warning:
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:93:9
|
93 | cg.add_task(CgroupPid::from(pid as u64));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:196:17
|
196 | freezer_controller.thaw();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:199:17
|
199 | freezer_controller.freeze();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:365:9
|
365 | cpuset_controller.set_cpus(&cpu.cpus);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:369:9
|
369 | cpuset_controller.set_mems(&cpu.mems);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:381:13
|
381 | cpu_controller.set_shares(shares);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:385:5
|
385 | cpu_controller.set_cfs_quota_and_period(cpu.quota, cpu.period);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
warning: unused `std::result::Result` that must be used
--> rustjail/src/cgroups/fs/mod.rs:1061:13
|
1061 | cpuset_controller.set_cpus(cpuset_cpus);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
The specific case of cpu_controller.set_cfs_quota_and_period is
addressed in a way that changes the logic following a suggestion by
Liu Bin, who had just added the code.
Fixes: #750
Suggested-by: Liu Bin <bin@hyper.sh>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When we are writing to the logs and there is an error doing so, there
is not much we can do. Chances are that a panic would make things
worse. So let it go through.
warning: unused `std::result::Result` that must be used
--> rustjail/src/sync.rs:26:9
|
26 | write_count(lfd, log_str.as_bytes(), log_str.len());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
::: rustjail/src/container.rs:339:13
|
339 | log_child!(cfd_log, "child exit: {:?}", e);
| ------------------------------------------- in this macro invocation
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: this warning originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Some functions have undefined behavior and are not actually used.
This addresses the following warning:
warning: the type `oci::User` does not permit zero-initialization
--> rustjail/src/lib.rs:99:18
|
99 | unsafe { MaybeUninit::zeroed().assume_init() }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| this code causes undefined behavior when executed
| help: use `MaybeUninit<T>` instead, and only call `assume_init` after initialization is done
|
= note: `#[warn(invalid_value)]` on by default
note: `std::ptr::Unique<u32>` must be non-null (in this struct field)
warning: the type `protocols::oci::Process` does not permit zero-initialization
--> rustjail/src/lib.rs:146:14
|
146 | unsafe { MaybeUninit::zeroed().assume_init() }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| this code causes undefined behavior when executed
| help: use `MaybeUninit<T>` instead, and only call `assume_init` after initialization is done
|
note: `std::ptr::Unique<std::string::String>` must be non-null (in this struct field)
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Addresses the following warning (and a few similar ones):
warning: variable does not need to be mutable
--> rustjail/src/container.rs:369:9
|
369 | let mut oci_process: oci::Process = serde_json::from_str(process_str)?;
| ----^^^^^^^^^^^
| |
| help: remove this `mut`
|
= note: `#[warn(unused_mut)]` on by default
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This addresses the following:
warning: use of deprecated item 'std::error::Error::description': use the Display impl or to_string()
--> rustjail/src/container.rs:1598:31
|
1598 | ... e.description(),
| ^^^^^^^^^^^
|
= note: `#[warn(deprecated)]` on by default
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Parameters that are never used were removed.
Parameters that are unused, but necessary because of some common
interface were renamed with a _ prefix.
In one case, consume the parameter by adding an info! call, and fix a
minor typo in a message in the same function.
This addresses the following warning:
warning: unused variable: `child`
--> rustjail/src/container.rs:1128:5
|
1128 | child: &mut Child,
| ^^^^^ help: if this is intentional, prefix it with an underscore: `_child`
warning: unused variable: `logger`
--> rustjail/src/container.rs:1049:22
|
1049 | fn update_namespaces(logger: &Logger, spec: &mut Spec, init_pid: RawFd) -> Result<()> {
| ^^^^^^ help: if this is intentional, prefix it with an underscore: `_logger`
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Remove variables that are simply not used.
Rename as _ variables where only initialization matters.
This addresses the following warnings:
warning: unused variable: `writer`
--> src/main.rs:130:9
|
130 | let writer = unsafe { File::from_raw_fd(wfd) };
| ^^^^^^ help: if this is intentional, prefix it with an underscore: `_writer`
|
= note: `#[warn(unused_variables)]` on by default
warning: unused variable: `ctx`
--> src/rpc.rs:782:9
|
782 | ctx: &ttrpc::TtrpcContext,
| ^^^ help: if this is intentional, prefix it with an underscore: `_ctx`
warning: unused variable: `ctx`
--> src/rpc.rs:808:9
|
808 | ctx: &ttrpc::TtrpcContext,
| ^^^ help: if this is intentional, prefix it with an underscore: `_ctx`
warning: unused variable: `dns_list`
--> src/rpc.rs:1152:16
|
1152 | Ok(dns_list) => {
| ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_dns_list`
warning: value assigned to `child_stdin` is never read
--> rustjail/src/container.rs:807:13
|
807 | let mut child_stdin = std::process::Stdio::null();
| ^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_assignments)]` on by default
= help: maybe it is overwritten before being read?
warning: value assigned to `child_stdout` is never read
--> rustjail/src/container.rs:808:13
|
808 | let mut child_stdout = std::process::Stdio::null();
| ^^^^^^^^^^^^^^^^
|
= help: maybe it is overwritten before being read?
warning: value assigned to `child_stderr` is never read
--> rustjail/src/container.rs:809:13
|
809 | let mut child_stderr = std::process::Stdio::null();
| ^^^^^^^^^^^^^^^^
|
= help: maybe it is overwritten before being read?
warning: value assigned to `stdin` is never read
--> rustjail/src/container.rs:810:13
|
810 | let mut stdin = -1;
| ^^^^^^^^^
|
= help: maybe it is overwritten before being read?
warning: value assigned to `stdout` is never read
--> rustjail/src/container.rs:811:13
|
811 | let mut stdout = -1;
| ^^^^^^^^^^
|
= help: maybe it is overwritten before being read?
warning: value assigned to `stderr` is never read
--> rustjail/src/container.rs:812:13
|
812 | let mut stderr = -1;
| ^^^^^^^^^^
|
= help: maybe it is overwritten before being read?
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This addresses the following warning:
warning: unnecessary braces around assigned value
--> src/rpc.rs:1411:26
|
1411 | detail.init_daemon = { unistd::getpid() == Pid::from_raw(1) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
|
= note: `#[warn(unused_braces)]` on by default
Fixes: #750
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
We can rely on the error handling of the actual HTTP API calls to catch
errors, and don't need to call VmmPing explicitly in advance.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The cloud-hypervisor commit `6d30fe05` introduced a fix on its API for
VFIO device hotplug (`VmAddDevice`), which is required for supporting
VFIO unplug through openAPI calls in kata.
Signed-off-by: Bo Chen <chen.bo@intel.com>
First, most people don't care about CNM. Move that out of main doc.
Second, tc-filter is the default. Let's add a bit more background on
our usage of tc-filter (and clarify why we use this instead of macvtap).
Fixes#797
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
The systemd method of adding a debug console is not really
user friendly. Since we have added a much more straightforward
method to enable agent debug console, update developer guide to
reflect this.
Fixes#834
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add github action to test that the snap package was generated
correctly, this CI don't test the snap, it just build it.
fixes#838
Signed-off-by: Julio Montes <julio.montes@intel.com>
Tag openapi-generator-cli container to v4.3.1 that is the latest
stable, this way we can have reproducible builds and the same
generated code in all the systems
Signed-off-by: Julio Montes <julio.montes@intel.com>
ae6ccbe8 rust-agent: Update README
3faef791 docs: drop docker installation guide
f3466b87 docs: fix static check errors in docs/install/README.md
89ec614d docs: update architecture.md
1ed73179 qemu: upgrade qemu version to 5.1.0 for arm64.
cb79dddf agent: Fix OCI Windows network shared container name typo
c50aee9d github: Remove issue template and use central one
2a4c3e6a docs: fix broken links
9e2a314e Packaging: release notes script using error kernel path urls
aed20f43 rust-agent: Replaces improper use of match for non-constant patterns
868d0248 devices: fix go test warning in manager_test.go
14164392 action: Allow long lines if non-alphabetic
2ece152c agent: remove unreachable code
033925f9 agent: Change do_exec return type to ! because it will never return
c90fff82 agent: propagate the internal detail errors to users
c0ea9102 packaging: Stop providing OBS packages
ca54edef install: Add contacts to the distribution packages
b5ece037 install: Update information about Community Packages
378e429d install: Update SUSE information
567f8587 install: Update openSUSE information
18f32d13 install: Update RHEL information
8280523c install: Update Fedora information
578db2fc install: Update CentOS information
781d6eca ci: fix clone_tests_repo function
c18c5e2c agent: Set LIBC=gnu for ppc64le arch by default
a378ba53 fc: integrate Firecracker's metrics
9991f4b5 static-build/qemu-virtiofs: Refactor apply virtiofs patches
4a0fd6c2 packaging/qemu: Add common code to apply patches
37acc030 static-build/qemu-virtiofs: Fix to apply QEMU patches
6c275c92 runtime: fix TestNewConsole UT failure
0479a4cb travis: skip static checker for ppc64
b3e52844 runtime: fix golint errors
d36d3486 agent: fix cargo fmt
e1094d7f ci: always checkout 2.0-dev of test repository
c8ba30f9 docs: fix static check errors
eaa5c433 runtime: fix make check
07caa2f2 gitignore: ignore agent service file
f34e2e66 agent: fix UT failures due to chdir
442e5906 agent: Only allow proc mount if it is procfs
f2850668 rustjail: make the mount error info much more clear
73414554 runtime: add enable_debug_console configuration item for agent
0b62f5a9 runtime: add debug console service
c23a401e runtime: Call s.newStore.Destroy if globalSandboxList.addSandbox
80879197 shimv2: add a comment in checkAndMount()
b6066cbc osbuilder: specify default toolchain verion in rust-init.
1290d007 runtime: Update cloud-hypervisor client pkg to version v0.10.0
afeece42 agent/oci: Don't use deprecated Error::description() method
a4075f0f runtime: Fix linter errors in release files
01df3c1d packaging: Build from source if the clh release binary is missing
bacd41bb runtime: add podman configuration to data collection script
d9746f31 ci: use export command to export envs instead of env config item
ca2a1176 ci: use Travis cache to reduce build time
67af593a agent: update cgroups crate
cabc60f3 docs: Update the reference path of kata-deploy in the packaging
a5859197 runtime: make kata-check check for newer release
08d194b8 how-to: add privileged_without_host_devices to containerd guide
89ade8f3 travis: enable RUST_BACKTRACE
4b30001d agent/rustjail: add more unit tests
232c8213 agent/rustjail: remove makedev function
74bcd510 agent/rustjail: add unit tests for ms_move_rootfs and mask_path
a36f93c9 agent/rustjail: implement functions to chroot
fe0f2198 agent/rustjail: add unit test for pivot_rootfs
5770c2a2 agent/rustjail: implement functions to pivot_root
838b1794 agent/rustjail: add unit test for mount_cgroups
1a60c1de agent/rustjail: add unit test for init_rootfs
77ecfed2 agent/rustjail/mount: don't use unwrap
fa7079bc agent/rustjail: add tempfile crate as depedency
c23bac5c rustjail: implement functions to mount and umount files
e99f3e79 docs: Fix the kata-pkgsync tool's docs script path
d05a7cda docs: fix k8s containerd howto links
f6877fa4 docs: fix up developer guide for 2.0
6d326f21 gitignore: ignore agent version.rs
407cb9a3 agent: fix agent panic running as init
38eb1df4 packaging: use local version file for kata 2.0 in Makefile
313dfee3 docs: fix release process doc
0c4e7b21 packaging: fix release notes
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Pull over kata-deploy-test from the 1.x packaging repository. This is
intended to be used for testing any changes to the kata-deploy
scripting, and does not exercise any new source code changes.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
This reverts commit c0ea910273.
Two scripts are still required for release and testing, which should
have never been under obs-packaging dir in the first place. Let's
revert, move the scripts / update references to it, and then we can
remove the remaining obs-packaging/ tooling.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
rust agent does not use grpc as submodule for a while, update README
to reflect the change.
Fixes: #196
Signed-off-by: Yang Bo <bo@hyper.sh>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We have removed cli support and that means dockder support is dropped
for now. Also it doesn't make sense to have so many duplications on each
distribution as we can simply refer to the official docker guide on how
to install docker.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Now, the qemu version used in arm is so old. As some new features have merged
in current qemu, so it's time to upgrade it. As obs-packaging has been removed,
I put the qemu patch under qemu/patch/5.1.x.
As vxfs has been Deprecated in qemu-5.1, it will be no longer exist in
configuration-hyperversior.sh when qemu version larger than 5.0.
Fixes: #816
Signed-off-by: Edmond AK Dantes <edmond.dantes.ak47@outlook.com>
Some sections and files were removed in a previous commit,
remove all reference to such sections and files to fix the
check-markdown test.
fixes#826
Signed-off-by: Julio Montes <julio.montes@intel.com>
2.0 Packaging runtime-release-notes.sh script is using 1.x Packaging
kernel urls. Fix these urls to 2.0 branch Packaging urls.
Fixes: #829
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
The code used `match` as a switch with variable patterns `ev_fd` and
`cf_fd`, but the way Rust interprets the code is that the first
pattern matches all values. The code does not perform as expected.
This addresses the following warning:
warning: unreachable pattern
--> rustjail/src/cgroups/notifier.rs:114:21
|
107 | ev_fd => {
| ----- matches any value
...
114 | cg_fd => {
| ^^^^^ unreachable pattern
|
= note: `#[warn(unreachable_patterns)]` on by default
Fixes: #750Fixes: #793
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Create "class" and "config" file in temporary device BDF dir,
and remove dir created by ioutil.TempDir() when test finished.
fixes: #746
Signed-off-by: zhanghj <zhanghj.lc@inspur.com>
Overly long commit lines are annoying. But sometimes,
we need to be able to force the use of long lines
(for example to reference a URL).
Ironically, I can't refer to the URL that explains this
because of ... the long line check! Hence:
```sh
$ cat <<EOT | tr -d '\n'; echo
See: https://github.com/kata-containers/tests/tree/master/
cmd/checkcommits#handling-long-lines
EOT
```
Maximum body length updated to 150 bytes for parity with:
https://github.com/kata-containers/tests/pull/2848Fixes: #687.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The code in the end of init_child is unreachable and need to be removed.
The code after do_exec is unreachable and need to be removed.
Signed-off-by: Tim Zhang <tim@hyper.sh>
Let's add a new column to the Official packages table, and let the
maintainers of the official distro packages to jump in and add their
names there.
This will help us to ping & redirect to the right people possible issues
that are reported against the official packages.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Kata Containers will stop distributing the community packages in favour
of kata-deploy.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Following up a conversation with Ralf Haferkamp, we can safely drop the
instructions for using Kata Containers on SLES 12 SP3 in favour of using
the official builds provided for SLE 15 SP1, and SLE 15 SP2.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's update the openSUSE Installation Guide to reflect the current
information on how to install kata packages provided by the distro
itself.
The official packages are present on Leap 15.2 and Tumbleweed, and can
be just installed. Leap 15.1 is slightly different, as the .repo file
has to be added before the packages can be installed.
Leap 15.0 has been removed as it already reached its EOL.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Although the community packages are present for RHEL, everything about
them is extremely unsupported on the Red Hat side.
Knowing this, we'd be better to simply not mentioned those and, if users
really want to try kata-containers on RHEL, they can simply follow the
CentOS installation guide.
In the future, if the Fedora packages make their way to RHEL, we can add
the information here. However, if we're recommending something
unsupported we'd be better recommending kata-deploy instead.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's update the Fedora Installation Guide to reflect the current
information on how to install kata packages provided by the distro
itself.
These are official packages and we, as Fedora members, recommend using
kata-containers on Fedora 32 and onwards, as from this version
everything works out-of-the-box. Also, Fedora 31 will reach its EOL as
soon as Fedora 33 is out, which should happen on October.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's update the CentOS Installation Guide to reflect the current
information on how to install kata packages provided by the
Virtualiation Special Interest Group.
These are not official CentOS packages, as those are not coming from Red
Hat Enterprise Linux. These are the same packages we have on Fedora and
we have decided to keep them up-to-date and sync'ed on both Fedora and
CentOS, so people can give Kata Containers a try also on CentOS.
The nature of these packages makes me think that those are "as official
as they can be", so that's the reason I've decided to add the
instructions to the "official" table.
Together with the change in the Installation Guide, let's also update
the README and reflect the fact we **strongly recommend** using CentOS
8, with the packages provided by the Virtualization Special Interest
Group, instead of using the CentOS 7 with packages built on OBS.
Fixes: #623
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
We should not checkout to 2.0-dev branch in the clone_tests_repo
function when running in Jenkins CI as it discards changes from
tests repo.
Fixes: #818.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Firecracker expose metrics through fifo file
and using a JSON format. This PR will parse the
Firecracker's metrics and convert to Prometheus metrics.
Fixes: #472
Signed-off-by: bin liu <bin@hyper.sh>
In static-build/qemu-virtiofs/Dockerfile the code which
applies the virtiofs specific patches is spread in several
RUN instructions. Refactor this code so that it runs in a
single RUN and produce a single overlay image.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The qemu and qemu-virtiofs Dockerfile files repeat the code to apply
patches based on QEMU stable branch being built. Instead, this adds
a common script (qemu/apply_patches.sh) and make it called by the
respective Dockerfile files.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Fix a bug on qemu-virtiofs Dockerfile which end up not applying
the QEMU patches.
Fixes#786
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Somehow we are not running static checks for a long time.
And that ended up with a lot for errors.
* Ensure debug options are valid is dropped
* fix snap links
* drop extra CONTRIBUTING.md
* reference kata-pkgsync
* move CODEOWNERS to proper place
* remove extra CODE_OF_CONDUCT.md.
* fix spell checker error on Developer-Guide.md
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Current working directory is a process level resource. We cannot call
chdir in parallel from multiple threads, which would cause cwd confusion
and result in UT failures.
The agent code itself is correct that chdir is only called from spawned
child init process. Well, there is one exception that it is also called
in do_create_container() but it is safe to assume that containers are
never created in parallel (at least for now).
Fixes: #782
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
This only allows some whitelists files bind mounted under proc
and prevent other malicious mount to procfs.
Fixes: #807
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
Set enable_debug_console=true in Kata's congiguration file,
runtime will pass `agent.debug_console`
and `agent.debug_console_vport=1026` to agent.
Fixes: #245
Signed-off-by: bin liu <bin@hyper.sh>
In checkAndMount(), it is not clear why we check IsBlockDevice() and if
DisableBlockDeviceUse == false and then only return "false, nil" instead
of "false, err". Adding a comment to make it a bit more readable.
Fixes: #732
Signed-off-by: Qian Cai <cai@redhat.com>
The latest release of cloud-hypervisor v0.10.0 contains the following
updates: 1) `virtio-block` Support for Multiple Descriptors; 2) Memory
Zones; 3) `Seccomp` Sandbox Improvements; 4) Preliminary KVM HyperV
Emulation Control; 5) various bug fixes and refactoring.
Note that this patch updates the client code of clh's HTTP API in kata,
while the 'versions.yaml' file was updated in an earlier PR.
Fixes: #789
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch add fall-back code path that builds cloud-hypervisor static
binary from source, when the downloading of cloud-hypervisor binary is
failing. This is useful when we experience network issues, and also
useful for upgrading clh to non-released version.
Together with the changes in the tests repo
(https://github.com/kata-containers/tests/pull/2862), the Jenkins config
file is also updated with new Execute shell script for the clh CI in the
kata-containers repo. Those two changes fix the regression on clh CI
here. Please check details in the issue below.
Fixes: #781
Fixes: https://github.com/kata-containers/tests/issues/2858
Signed-off-by: Bo Chen <chen.bo@intel.com>
Be more verbose about podman configuration in the output of the data
collection script: get the system configuration as seen by podman and
dump the configuration files when present.
Fixes: #243
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
This PR includes these changes:
- use Rust installed by Travis
- install x86_64-unknown-linux-musl
- install rustfmt
- use Travis cache
- delete ci/install_vc.sh
Fixes: #748
Signed-off-by: bin liu <bin@hyper.sh>
Use the relative path of kata-deploy to replace the 1.x packaging url in
the kata-deploy/README.md file. Fixed the path issue, producted by
creating new branch.
Fixes: #777
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
Update `kata-check` to see if there is a newer version available for
download. Useful for users installing static packages (without a package
manager).
Fixes: #734.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add unit tests for finish_root, read_only_path and mknod_dev
increasing code coverage of mount.rs
fixes#284
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use conditional compilation (#[cfg]) to change chroot behaviour
at compilation time. For example, such function will just return
`Ok(())` when the unit tests are being compiled, otherwise real
chroot operation is performed.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use conditional compilation (#[cfg]) to change pivot_root behaviour
at compilation time. For example, such function will just return
`Ok(())` when the unit tests are being compiled, otherwise real
pivot_root operation is performed.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Don't use unwrap in `init_rootfs` instead return an Error, this way
we can write unit tests that don't panic.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add tempfile crate as depedency, it will be used in the following
commits to create temporary directories for unit testing.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use conditional compilation (#[cfg]) to change mount and umount
behaviours at compilation time. For example, such functions will just
return `Ok(())` when the unit tests are being compiled, otherwise real
mount and umount operations are performed.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Fix the kata-pkgsync tool's docs, change the download path of the
packaging tool in 2.0 release.
Fixes: #773
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
1. Until we restore docker/moby support, we should use crictl as
developer example.
2. Most of the hyperlinks should point to kata-containers repository.
3. There is no more standalone mode.
Fixes: #767
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We should just download the official static build binary instead of
trying to build on our own.
Fixes: #760
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
- agent: add cgroup v2 support
- runtime: Don't use hard-coded crio config
- Generate version file with more information in it.
- ci: replace spaces by tabs as indent
- fix issues with short life time container/exec processes
- action: Add issue to project and move to "In progress" on linked PR
- virtiofsd: fix typo in test code
- agent: setup DNS for guest
- ci: run agent test under root user
- docs: update sandbox apis doc for kata 2.0-dev
- rustjail: fix the issue of invalid cgroup_parent path
- osbuilder: update usage of RUST_AGENT variable
- agent: add retry between doing CPU hotplug and make it online.
- kernel: update to the latest LTS kernel 5.4.60
- osbuilder: fix rootfs build on ppc64le
- kernel: Enabling PTP clock support in kernel
- rootfs-builder: fix unbootable dracut-based initramfs on Fedora
- [fwport-2.0] osbuilder/image-builder: disable reflink
- virtcontainers: Add unit test for utils/compare.go
- reimplement error handling: use anyhow
- docs: update yaml file link for prometheus deployment
- docs: Update the doc for minikube installing kata
- trivial: Fix spelling of "privilege"
- [port] image-builder: disable reflink
- runtime: qemu: reduce boot time and memory footprint
- snap for kata 2.0
- runtime: Fix typo in hotplugVFIODevice()
- drivers: Correct isPCIeDevice logic
- docs: Add documentation for VFIO-AP passthrough
- [fwport-2.0] qemu: update build dependencies to support QEMU 5
- kata-deploy: add ACRN runtime to Docker configuration
- runtime: Add support for VFIO-AP pass-through
- agent: update Cargo files authors
- packaging: adjustment for 2.0 branch
- Fix epthemeral mount issue
- clh: Disable the 'seccomp' option temporarily
- Subject: [PATCH] qemu: add annotations for iommu_platform for s390x v…
- Foward-port :virtiofs: Update virtiofs docs
- Forward-port kata deploy conf
- initrd: Increase Alpine Version to 3.12
- [forward port]: osbuilder: Update yq
- tools: Add Unix socket support to agentl-ctl
- agent: Add target optimize for Makefile
- server: Allow address to be specified
- rustjail: default permission of device node should be 666
- packaging: Add VFIO-AP fragment for s390x
- console: Fix crash if debug console disabled
- agent: support guest hooks
- virtcontainers: Add to utils unit tests
- sandbox: Disconnect from agent after VM shutdown
- runtime: Re-vendor GoVMM for hotplugging IBM Adjunct Processor (AP) devices over VFIO
- clh: Port cloud-hypervisor related changes from kata-runtime
- docs: remove outdated dependencies from agent docs
- [forward-port] packaging: s390x kernel config fragments
- action: Fix subsystem check
- osbuilder : ppc64le support for rust agent based rootfs/initrd image
- packaging: add usage instructions for -a (arch_target) option
- rustjail: add the "HOME" env for process
- rustjail: fix the issue of missing set propagation for bind mount
- agent: add unit tests for rustjail/process.rs
- ci: Update experimental kernel tag to enable CLH CI
- virtcontainers: fix outdated example code in api document
- agent: setup the "lo" interface run agent as init
- Fix commit-message-check and do some updates about github actions
- virtcontainers: cleanup codes, delete not used APIs
- Use github action to do Fixes/Length/Subsystem check for commit message
- docs: Remove installation of proxy
- virtcontainers: Add unit test for types/container.go
- shimv2: fix the issue of close IO stream
- docs: Update contributions section in limitations document kata 2.0
- Fix fd leakage in execute_hook
- Kata 2.0-dev port of #2867 (NoReboot Knob)
- qemu: remove multidev in fsdev parameter on arm64
- Makefile: add CLHCMD in arm64-options.mk
- runtime: change un-structured log to structured log
- virtcontainers: Add function to capabilities test
- virtcontainers: Expand unit test coverage for asset
615ffb93 agent: Generate version file with more adequate information in it.
f13ca94e agent: Fix setting of version
c823b4cd agent: Make build remove generated files on clean
357d7885 ci: replace spaces by tabs as indent
22876b2d agent: allow multiple wait on the same process
295f5100 runtime: Don't use hard-coded crio config
6487044f shimv2: trust cached status when deleting containers
325a4f86 shimv2: do not kill a stopped exec process
d7c77b69 runtime: write oom file to notify CRI-O tha OOM occurred
15065e44 agent: add cgroup v2 support
2ce97ec6 virtiofsd: fix typo in test code
b081f26a action: Add issue to project and move to "In progress" on linked PR
6520320f agent: setup DNS for guest
90e0dc88 ci: run agent test under root user
c133a456 rustjail: fix the issue of invalid cgroup_parent path
20a084ae docs: update sandbox apis doc for kata 2.0-dev
d86e7467 agent: add retry between doing CPU hotplug and make it online.
ebd3f316 osbuilder: fix rootfs build on ppc64le
2dfb8bc5 rootfs-builder: fix unbootable dracut-based initramfs on Fedora
2019f00e docs: update yaml file link for prometheus deployment
0be02a8f runtime: qemu: reduce boot time and memory footprint
8b07bc2c agent: fix unit tests - remove rustjail::errors
6c96d666 agent: update Cargo toml and lock
46d7b9b8 agent/rustjail: remove rustjail::errors
fbb79739 agent: Use anyhow for error handling
33759af5 agent: Add anyhow dependency
c192446a agent/rustjail: Use anyhow for error handling
2e3e2ce1 agent/rustjail/capabilities: Use anyhow for error handling
6a4c9b14 agent/rustjail/cgroups: Use anyhow for error handling
359286a8 agent/rustjail: Add anyhow dependency
dd60e56f trivial: Fix spelling of "privilege"
cb999375 runtime: Fix typo in hotplugVFIODevice()
0d198f93 virtcontainers: Add unit test for utils/compare.go
1de9bc0f snap: reimplement snapcraft.yaml to support kata 2.0
85642c32 snap: move snapcraft.yaml to the right place
92dfa463 drivers: Correct isPCIeDevice logic
b4748280 kernel: Remove arm patches for ptp
82efd2f2 kernel: Enabling PTP clock support in kernel
8666e01e qemu/default-configs: update default-config for QEMU 5
2d12da8e qemu: update default-configs
cf3ac9f7 docs: Add documentation for VFIO-AP passthrough
11e8a494 docs: update the docs for minikube installing kata
517dda02 kernel: update to the latest LTS kernel 5.4.60
ae98ea45 obs-packaging: fix wait for obs
f5b71d34 qemu: update build dependencies to support QEMU 5
fcd29a28 osbuilder/image-builder: disable reflink
dae6c7d9 osbuilder: update usage of RUST_AGENT variable
1236e224 runtime: Add support for VFIO-AP pass-through
65970d38 osbuilder: install-yq should not print on success
c624fa74 osbuilder: install musl for aarch64
b24f2cb9 gitignore: ignore vscode directory
cf1b72d6 osbuilder: install rust before sourcing cargo env
7b5ab586 packaging: fix kata-deploy yaml path
76c18aa3 osbuilder: fix alpine agent build
5216815d packaging: make build-kernel.sh work for 2.0
aa3fb4db packaging: make kata-deploy work for 2.0
86a6e0b3 packaging: fix build image scripts
ceebd06b release: add 2.0 release actions
dadab1fe osbuilder: build rust agent by default
1bd58259 packaging: tag releases on kata-containers repo
f56f68bf obs-packaging: adjust for building on kata-containers repo
60245a83 agent: update Cargo files authors
544219d9 mount: fix the issue of epthemeral storage handler
fd8f3ee9 mount: add much more error info using chain_err
10b1deb2 tools: Add Unix socket support to agentl-ctl
f5598a1b Subject: [PATCH] qemu: add annotations for iommu_platform
f879acd6 scripts: Foward port osbuilder scripts to update yq
7be95b15 tools: Simplify error handling in agent-ctl
5b0e6f37 kata-deploy: add ACRN runtime to Docker configuration
adf9ecc5 initrd: Increase Alpine Version to 3.12
32b86a8d agent: Add target optimize for Makefile
26506d83 virtiofs: Update virtiofs docs
bee17d1c kata-deploy: Add containerd configuration to support kata annotations.
219f93ff kata-deploy: Add default privileged_without_host_devices
4b62fc16 clh: Disable the 'seccomp' option temporarily
f7ff6d32 image-builder: disable reflink
0a9b8e0a rustjail: default permission of device node should be 666
81644003 server: Allow address to be specified
bb30759e agent: add guest hooks UT
095ebb8c agent: fix OCI hook handling
03a4d107 agent: support guest hooks
e7bfeb41 agent: construct container bundle in tmpfs location
2ee40027 packaging: Add VFIO-AP fragment for s390x
4c30b255 runtime: Re-vendor GoVMM for VFIO-AP support
282bff9f sandbox: Disconnect from agent after VM shutdown
9f1a3d15 kernel: add s390x fragment
f1350616 kernel: config CONFIG_GENERIC_MSI_IRQ_DOMAIN
b67325c3 kernel: add missing configs
454dd854 kernel: config CONFIG_ PARAVIRT
62b45064 kernel: config CONFIG_NO_HZ_FULL
6dca74ba kernel: moved acpi hotplug config
7c85decc kernel: config CONFIG_PCI_MSI_IRQ_DOMAIN
efe51b29 kernel: fragment for pmem
08d046d9 kernel: config CONFIG_HAVE_NET_DSA
7b49fa12 kernel: fragments not supported on s390x
ccfb73cb agent/agent-ctl: update Cargo.lock
fd13c93c virtcontainers: Add msg to existing utils unit tests
c3fc09b9 virtcontainers: Add to utils unit tests
96582556 docs: remove outdated dependencies from agent docs
d12f920b console: Fix crash if debug console disabled
572de288 sandbox: Remove unnecessary thread
d5fbba3b main: Remove commented out and redundant code
1b2fe4a5 agent: Refactor main function
bac79eee main: Display config in announce
e2952b53 main: Simplify version handling
cfa35a90 action: Fix subsystem check
39b53f44 clh: enable build using Podman
04b156f6 qemu-virtiofs: Update to qemu 5.0 + virtiofs + dax
3ec05a9f clh: Add support to unplug block devices
45e32e1b clh: Set 'Id' explicitly while hotplugging block device
895959d0 clh: Provide cpu topology to API
31594387 clh: opeanapi: update api for cloud hypervisor
89836cd3 versions: cloud-hypervisor 0.9.0
8d5a60ac versions: Update qemu-virtiofs to 5.0
76a64667 clh: Remove the use of deprecated '--memory file=' parameter
bfd78104 packaging: add usage instructions for -a (arch_target) option
ecaa1f9e clh: Enable versions and kernel tag to enable CLH CI for kata 2.0
64b06944 ppc64le: Support for rust agent based rootfs
2511cabb virtcontainers: fix outdated example code in api document
5c7f0016 rustjail: add the "HOME" env for process
58dfd503 rustjail: fix the issue of missing set propagation for bind mount
e79c5727 agent: setup the "lo" interface run agent as init
d0a45637 agent: add unit tests for rustjail/process.rs
2889af77 actions: Run subject-line-length check even if the previous checks failed
9f0fef5a actions: Add commit-body-missing check
d81af48a actions: Do not limit the length of single word in commit body
8c46a41b actions: Fix subsystem checking in github-action
2466ac73 actions: Fix 'Fixes checking' problem by update dependent action
e7d3ba12 virtcontainers: cleanup codes, delete not used APIs
998a6343 docs: Remove installation of proxy
c305911d actions: Use github action to do Fixes/Length/Subsystem check
bd78ccaf shimv2: fix the issue of close IO stream
06834931 agent: Fix fd leaks in execute_hook
b03cd1bf docs: Update contributions section in limitations document kata 2.0
c15ef219 qemu: Set govmmQemu NoReboot config Knob
57269262 qemu: Add test for qemuConfig Knobs
5010e3a3 vendor: update govmm
61d133f9 runtime: change un-structured log to structured log
f24ad25d virtcontainers: Add unit test for types/container.go
1637e9d3 qemu: remove multidev in qemu/fsdev parameter on arm64
b61c9ca2 Makefile: add CLHCMD in arm64-options.mk
e1a79e69 virtcontainers: Add function to capabilities test
d1d5c69b virtcontainers: Expand unit test coverage for asset
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
The version.rs file is now generated to contain up-to-date information
from the makefile, including git commit and the full binary path.
The makefile has also been modified to make it easier to add changes
in generated files based on makefile variables.
Fixes: #740
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Fix the bug where the version string generated by the `Makefile` was not
being passed to the agent, resulting in a "unknown" version.
Fixes: #725.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Until a container is deleted, agent should allow runtime to wait for
a process in parallel, as being supported by the go agent.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Same as containers, it is possible for an exec process to stop so
quickly that containerd may send a parallel Kill request. We should
just return success in such case.
Fixes: #716
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
A PR now needs *two* labels to be applied before it can be merged.
One label must be a backport label from the list below and the other
a forward port label:
- backport labels:
`needs-backport`, `no-backport-needed`, `backport`.
- forward-port labels:
`needs-forward-port`, `no-forward-port-needed`, `forward-port`.
This is to make the maintainer think carefully before merging a PR
and hopefully maximise efficient porting.
Related: https://github.com/kata-containers/kata-containers/issues/634Fixes: #639.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The cgroup_parent path is expected to be absolute path,
add an '/' prefix to the passed cgroup_parent path to make
sure it's an absolute path.
Fixes: #336
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
Sync the api from the runtime codes to the documentation. Remove and add
some apis in the kata-api-design.md doc. And new table for Sandbox
Monitor APIs.
Fixes: #701
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
Sometimes runtime will fail in onlining CPU process,
because when the runtime calls to QMP
`device_add`, QEMU doesn't allocate all vCPUs inmediatelly.
Fixes: #665
Signed-off-by: bin liu <bin@hyper.sh>
The linux kernel feature RANDOMIZE_BASE improved the security and at
the same time increased the memory footprint of a kata container,
this feature was enabled in kata-containers/packaging#1006.
In order to mitigate this increase in memory consumption, we can
boot container using the uncompressed kernel.
Reduce boot time by ~5%
Reduce KSM memory footprint by ~14%
Reduce noKSM memory footprint by ~27%
fixes#669
Signed-off-by: Julio Montes <julio.montes@intel.com>
`rustjail::erros` was removed in a previous commit, hence some external crates
like `error_chain` are no longger required, update Cargo.toml and Cargo.lock
to reflect these changes.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Don't use `rustjail::errors` for error handling, since it's not
thread safe and there are better alternatives like `anyhow`.
`anyhow` attaches context to help the person troubleshooting
the error understand where things went wrong, for example:
Current error messages:
```
No such file or directory (os error 2)
```
With `anyhow`:
```
Error: Failed to read config.json
Caused by:
No such file or directory (os error 2)
```
fixes#641
Signed-off-by: Julio Montes <julio.montes@intel.com>
anyhow provides `anyhow::Error`, a trait object based error type for
easy idiomatic error handling in Rust applications
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use `.to_string` to wrap up `caps::errors::Error`s since they are not
thread safe, otherwise `cargo build` will fail with the following error:
```
doesn't satisfy `caps::errors::Error: std::marker::Sync`
```
Signed-off-by: Julio Montes <julio.montes@intel.com>
Return `anyhow::Result` from all the functions in this directory.
Add function `io_error_kind_eq` to compare an `anyhow::Error` with an
`io::Error`, this function downcast the `anyhow::Error`.
Signed-off-by: Julio Montes <julio.montes@intel.com>
anyhow provides `anyhow::Error`, a trait object based error type for
easy idiomatic error handling in Rust applications.
Signed-off-by: Julio Montes <julio.montes@intel.com>
I noticed the spelling mistake while reviewing another change and
doing a "grep" for "privilege" that turned up nothing.
Fixes: #671
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
In order to use a build systemd like launchpad, the snapcraft.yaml file
must be in the root directory of the project or under the `snap`
directory, that way launchpad detects that this project can be build
using the `snapcraft` command
Signed-off-by: Julio Montes <julio.montes@intel.com>
Currently, isPCIeDevice() attempts to determine if a (host) device is
PCI-Express capable by looking up its link speed via the PCI slots
information in sysfs. This is a) complicated and b) wrong. PCI-e
devices don't have to have slots information, so this frequently fails.
Instead determine if devices are PCI-e by checking for the presence of
PCIe extended configuration space by looking at the size of the "config"
file in sysfs.
Forward ported from 6bf93b23 in the Kata 1.x runtime repository.
Fixes: #611
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These patches are causing compilation issues while building on x86.
Remove these while we fix the issue.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Disable the following IPMI configs, since they are not needed
for kata containers and fixes the snap job in launchpad
CONFIG_PCI_IPMI_KCS
CONFIG_PCI_IPMI_BT
CONFIG_IPMI_SSIF
fixes#581
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add guide on how to pass a VFIO-AP device, such as Crypto Express cards
on IBM Z mainframes, to a Kata container. Like the documentation for
VFIO-PCI, this was put in the virtcontainers README.
Fixes: #658
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
The command for intalling kata in minikube still keeping the old path of
the packaging project from the 1.x branch. This commit changed the path
of the packaging's files to 2.0-dev branch.
Fixes: #619
Signed-off-by: Ychau Wang <wangyongchao.bj@inspur.com>
Reimplement the loop that waits for OBS. Look for the packages
that are still building, not for the repos.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add the following packages as build dependencies to build QEMU
5 in OBS and launchpad (snap)
* libselinux1
* libffi
* libmount
* libblkid
* python3
fixes#1075
Signed-off-by: Julio Montes <julio.montes@intel.com>
Disable reflink when using DAX. Reflink is a xfs feature that cannot be
used together with DAX.
fixes#577
Signed-off-by: Julio Montes <julio.montes@intel.com>
Recognise when a device to be hot-plugged is an IBM Adjunct Processor
(AP) device and execute VFIO AP hot-plug accordingly. Includes unittest
for recognising and uses CCW for addDeviceToBridge in hotplugVFIODevice
if appropriate.
Fixes: #491
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Co-authored-by: Julio Montes <julio.montes@intel.com>
Reviewed-by: Alice Frosi <afrosi@redhat.com>
Since we always build musl kata-agent, there is no need to build
it inside a musl container. We can just build on the host and then
copy the binary to the target rootfs.
There are still a lot to clean up and it should be made so for ALL
target distros instead of just alpine. But this is at least working
for alpine first.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We do not need to clone packaging repository, nor apply
virtio_vsock as virtio-fs-dev has already included that fix.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Anyone can collaborate in the Kata Containers project, so instead of
adding her/his name and email to the Cargo.toml files, use
`The Kata Containers community` as name and
`kata-dev@lists.katacontainers.io` as email.
fixes#643
Signed-off-by: Julio Montes <julio.montes@intel.com>
For ephemeral storage handler, it should return an
empty string instead of the mount destination.
Fixes: #635
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
Rather than specifying the VSOCK address as two CLI options
(`--vsock-cid` and `--vsock-port`), allow the agent's ttRPC server
address to be specified to the `agent-ctl` tool using a single URI
`--server-address` CLI option. Since the ttrpc crate supports VSOCK and
UNIX schemes, this allows the tool to be run inside the VM by specifying
a UNIX address.
Fixes: #549.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
for s390x virtio devices
Add iommu_platform annotations for qemu for ccw,
other supported devices can also make use of that.
Fixes#603
Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>
Don't format the error string before passing to the `anyhow!()` macro
since it can format strings itself.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add an ACRN runtime ('kata-acrn') to the Docker configuration
('/etc/docker/daemon.json').
Fixes: #579
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
Update this document to get rid of any nemu mentions.
Added comment to mention that number of containers that can be
launched may be limited by the size of `/dev/shm`.
Fixes#572
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
In case of containerd, not all annotations are passed down to the OCI
layer. We need to configure "pod_annotations" field for a runtime class.
This field is a list of annotations that can be passed by Kata as OCI
annotations. Add this as default configuration with kata-deploy.
Fixes: #594
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
For privieleged containers, all host devices are passed to
container. We have done work in crio and containerd to define a
scope of privileged in Kata to prevent this from happening.
Add this as the default as this falls under a best practice to follow
with Kata.
Note that if this flag has been already defined, then this change
does not override it.
Fixes#582
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
We kept observing instabilities from CLH CI jobs periodically (kata
1.x). To separate the random failures caused by `seccomp` from other
failures, this patch disables the 'seccomp' option from clh in kata for
now. We will bring this option back after completing the 'seccomp'
filter lists based on Kata's CI workload. Details are tracked in the
following two issues:
https://github.com/kata-containers/runtime/issues/2899 and
https://github.com/kata-containers/runtime/issues/2901
We are facing the similar challenge to stabilize CI jobs related to
cloud-hypervisor in Kata 2.0. We are disabling the `seccomp` option here
for the same reason. Related issue:
https://github.com/kata-containers/tests/issues/2813Fixes: #614
Signed-off-by: Bo Chen <chen.bo@intel.com>
Disable reflink when using DAX. Reflink is a xfs feature that cannot be
used together with DAX.
fixes kata-containers/osbuilder#456
fixes#577
Signed-off-by: Julio Montes <julio.montes@intel.com>
Allow the default (VSOCK) ttRPC server address to be changed using a new
`KATA_AGENT_SERVER_ADDR` environment variable (for testing and
debugging).
Fixes: #552.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Scan guest hooks upon creating new sandbox and append
them to guest OCI spec before running containers.
Fixes: #485
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Add vfio-ap.conf to the s390 kernel config fragments, which includes
the necessary flags for passing an IBM Adjunct Processor (AP) device
over VFIO.
Fixes: #567
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Reviewed-by: alicefr <afrosi@redhat.com>
This is a re-vendor of intel/govmm, with support for hot-plugging IBM
Adjunct Processor (AP) devices over VFIO. This is necessary for
enabling AP device pass-through in Kata (see #491).
39c372a Add support for hot-plugging IBM VFIO-AP devices
f5bdd53 travis: disable amd64 jobs
1af1c0d github: enable github actions
4831c6e travis: Run coveralls after success
cf0f05d qemu: add iommu_platform knob for qemuParams
175ac49 typo fix
Fixes: #565
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
When a one-shot pod dies in CRI-O, the shimv2 process isn't killed until
the pod is actually deleted, even though the VM is shut down. In this
case, the shim appears to busyloop when attempting to talk to the (now
dead) agent via VSOCK. To address this, we disconnect from the agent
after the VM is shut down.
This is especially catastrophic for one-shot pods that may persist for
hours or days, but it also applies to any shimv2 pod where Kata is
configured to use VSOCK for communication.
See github.com/kata-containers/runtime#2719 for details.
Fixes#2719
Signed-off-by: Evan Foster <efoster@adobe.com>
Moved CONFIG_GENERIC_MSI_IRQ_DOMAIN in arch base.conf.
The config is not selected for s390x
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Some kernel configs need additional dependencies:
- CONFIG_NO_HZ depends on
CONFIG_GENERIC_CLOCKEVENTS
- CONFIG_CGROUP_PERF depends on
CONFIG_PERF_EVENTS
CONFIG_HAVE_PERF_EVENTS
- CONFIG_BLK_DEV_LOOP depends on
CONFIG_BLK_DEV
CONFIG_BLOCK
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Moved CONFIG_ PARAVIRT to each arch base.conf.
CONFIG_ PARAVIRT only defined in x86, arm64, arm in arch/$arch/Kconfig.
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Moved CONFIG_NO_HZ_FULL config to each arch base.conf.
The config CONFIG_NO_HZ_FULL depends on CONFIG_HAVE_CONTEXT_TRACKING.
See https://github.com/torvalds/linux/blob/
a811c1fa0a02c062555b54651065899437bacdbe/kernel/time/Kconfig#L96
The context tracking is not supported on s390x yet.
See https://github.com/torvalds/linux/blob/
a811c1fa0a02c062555b54651065899437bacdbe/Documentation/features/time/
context-tracking/arch-support.txt#L27
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Moved:
---
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_PNPACPI=y
---
from hotplug to acpi.
In this way, it is possible to skip these config if the acpi feature is
not supported.
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
The option CONFIG_VIRTIO_PMEM is not supported on s390x.
It requires nvdimm support.
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Add !s390x tag to skip these group of fragments for s390x.
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Signed-off-by: Jakob-Naucke <jakob.naucke@ibm.com>
Expand unit tests for virtcontainers/utils/utils.go to include testing
CleanupFds, CPU calculations, ID string creation, and memory alignment
functions.
Fixes#490
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
The logic for the debug console meant that if the debug console was
_disabled_, the agent was guaranteed to crash on function exit due to
the unsafe code block. Fixed by simplifying the code to use the standard
`Option` idiom for optional values.
Fixes: #554.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Don't create a thread to wait for the ttRPC server to end - it isn't
required as the operation should be blocked on.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Print a simple version string rather than delaying the output
to display a structured version string. The structured output
is potentially more useful but:
- This output is not consistent with other components.
- Delaying the output makes `--version` unusable in some
environments (since a lot of setup is called before the
version string can be output).
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
\h is not a valid metacharacter in javascript which is used in
github-action.
Use \s\t to replace it.
Fixes: #551
Signed-off-by: Tim Zhang <tim@hyper.sh>
[ Port from packaging commit 4e1b5729f47d5f67902e1344521bc5b121673046 ]
Build clh with Podman, allow build the vmm in the Podman CI
Virtiofs qemu has to be build as this is requried by clh.
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from packaging commit cbe53bdb14e303830fa9f2d5a7f3c9161a32f033 ]
Update build scripts for qemu-virtiofs.
- virtiofs-0.3 patches are not needed
- Sync build on how vanilla qemu is built
- Apply patches for virtiofsd if any (none today)
- Apply patches that are used for the qemu vanilla
- Apply patches in order
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 44b58e4151d1fc7debed41274b65c37233a437e3 ]
This patch enables kata+clh to unplug block devices, which is required
to pass cri-o integration tests.
Fixes: #461
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 03fb9c50c180d3359178c30e06f1122df312ae76 ]
To support unplug block device, we need to set the 'Id' explicitly while
hotplugging devices with cloud-hypervisor HTTP API.
Fixes: #461
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 39897867bc89667daaafdd141367ec4a5fdc9247 ]
API now requires cpu topology.
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 40f49312a4881c904a1cbdace04c4c697bd2d429 ]
Update api geneated by openapi.
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 0dcbbd8dc113878c2aa8c78b5300e4853a7e64be ]
Highlights for cloud-hypervisor version 0.9.0 include:
virtiofs updates to new dax implementation based in qemu 5.0
Fixed random issues caused due to seccomp filters
io_uring Based Block Device Support
If the io_uring feature is enabled and the host kernel supports it then io_uring will be used for block devices. This results a very significant performance improvement.
Block and Network Device Statistics
Statistics for activity of the virtio network and block devices is now exposed through a new vm.counters HTTP API entry point. These take the form of simple counters which can be used to observe the activity of the VM.
HTTP API Responses
The HTTP API for adding devices now responds with the name that was assigned to the device as well the PCI BDF.
CPU Topology
A topology parameter has been added to --cpus which allows the configuration of the guest CPU topology allowing the user to specify the numbers of sockets, packages per socket, cores per package and threads per core.
Release Build Optimization
Our release build is now built with LTO (Link Time Optimization) which results in a ~20% reduction in the binary size.
Hypervisor Abstraction
A new abstraction has been introduced, in the form of a hypervisor crate so as to enable the support of additional hypervisors beyond KVM.
Snapshot/Restore Improvements
Multiple improvements have been made to the VM snapshot/restore support that was added in the last release. This includes persisting more vCPU state and in particular preserving the guest paravirtualized clock in order to avoid vCPU hangs inside the guest when running with multiple vCPUs.
Virtio Memory Ballooning Support
A virtio-balloon device has been added, controlled through the resize control, which allows the reclamation of host memory by resizing a memory balloon inside the guest.
Enhancements to ARM64 Support
The ARM64 support introduced in the last release has been further enhanced with support for using PCI for exposing devices into the guest as well as multiple bug fixes. It also now supports using an initramfs when booting.
Intel SGX Support
The guest can now use Intel SGX if the host supports it. Details can be found in the dedicated SGX documentation.
Seccomp Sandbox Improvements
The most frequently used virtio devices are now isolated with their own seccomp filters. It is also now possible to pass --seccomp=log which result in the logging of requests that would have otherwise been denied to further aid development.
Notable Bug Fixes
Our virtio-vsock implementation has been resynced with the implementation from Firecracker and includes multiple bug fixes.
CPU hotplug has been fixed so that it is now possible to add, remove, and re-add vCPUs (#1338)
A workaround is now in place for when KVM reports MSRs available MSRs that are in fact unreadable preventing snapshot/restore from working correctly (#1543).
virtio-mmio based devices are now more widely tested (#275).
Multiple issues have been fixed with virtio device configuration (#1217)
Console input was wrongly consumed by both virtio-console and the serial. (#1521)
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit d803f077c6fd26e4d020643eda415ea315f47e0c ]
Update to qemu 5.0.x with support for virtiofs + dax.
Fixes: #461
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
[ Port from runtime commit 30b40f5505fd46d23b89eb5fb38301d2f7454f35 ]
Along with the release of cloud-hypervisor v0.8.0, this option has been
deprecated. clh now enforces to use the alternative controls,
e.g. "shared" and "hugepages", which can infer the backing file
paths. Also, we don't use "hugepages" in kata, so we are fine now as the
"shared" control is already enabled.
Fixes: #461
Signed-off-by: Bo Chen <chen.bo@intel.com>
Add usage instructions for -a option in script and README,
currently supported architecture are aarch64/ppc64le/s390x/x86_64.
Fixes: #534
Signed-off-by: zhanghj <zhanghj.lc@inspur.com>
This PR updates the versions for the virtiofs kernel branch and
as there is a tag based in kernel 5.6 move patches to uses the tag name.
This PR is needed to enable CLH CI for kata 2.0. This PR is backporting
kata-containers/runtime#2843 and kata-containers/packaging#1098.
Fixes#532
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
For building rust agent on ppc64le, the rust toolchain is built using
the LIBC implementation - gnu instead of musl.
Fixes: #481
Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
Some type declarations were changed. The example code here is outdated
according to the example_pod_run_test.go under virtcontainers directory.
And add the imports to make where the types from clear.
Fixes: #507
Signed-off-by: Li Ning <lining_yewu@cmss.chinamobile.com>
When creating a container process/exec process, it should set the
"HOME" env for this process by getting from /etc/passwd.
Fixes: #498
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
When do bind mount for container's volumes, the propagation
flags should be mount/set after bind mount.
Fixes: #530
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
If the line comprises of only a single word,
it may be something like a URL (it's certainly very unlikely to be a
normal word if the default lengths are being used), so length
checks won't be applied to it.
Signed-off-by: Tim Zhang <tim@hyper.sh>
The Fixes checking should pass as long as one of the commits of
pull-request pass the check.
update depdent github-action commit-message-checker-with-regex to v0.3.1
shortlog:
d6d9770 commit-message-checker-with-regex: Add input one_pass_all_pass
Fixes: #519
Signed-off-by: Tim Zhang <tim@hyper.sh>
This PR removes the installation of proxy in the Developer Guide as it
does not exist on kata 2.0
Fixes#502
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The commit checks does not need to wait for CI dependencies to be
installed, It's a waste of time. we need show errors ASAP.
And we should display as many problems as possible at once
Fixes: #487
Signed-off-by: Tim Zhang <tim@hyper.sh>
It should wait until the stdin io copy
termianted to close the process's io stream,
otherwise, it would miss forwarding some contents
to process stdin.
Fixes: #439
Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
This PR updates the contributions sections for the limitations document
for kata 2.0 that instead using the previous runtime repository as example,
it will use the new one.
Fixes#476
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The Kata architecture does not support rebooting VMs (the lifecycle
being start/exec/kill) and if a VM is killed (e.g. using sysrq-trigger),
the VM does not exit fully and other layers do not notice the state change.
Set the NoReboot config Knob so that govmmQemu.LaunchQemu() runs QEMU
with the --no-reboot command-line option.
Fixes: #2866
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Add unit tests for types/container.go. Tests were adapted from
sandbox_test.go since ContainerState is a sandbox state structure and
the transition tests are the same.
Fixes#451
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
As the current qemu of arm64 is so old, the new multidev parameter
in 9pfsdev is not supported on arm64, so disabled it temporarily.
Fixes:#466
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The new alpha release brings in following changes:
1f8e4f67 docs: Update travis and go report card url
db93a163 runtime: remove mock shim
e5910c9b sandbox: Stop and clean up containers that fail to create
1283febd ci: checkout TRAVIS_BRANCH
d7f75dce docs: remove shim/proxy topics and fix docs links
0b3cbee8 virtcontainers: Add additional unit tests for sandbox
c0720179 package: enable cloud-hypervisor for arm64
07a307b4 virtcontainers: Remove duplicate unit tests
d914f018 virtcontainers: Move unit tests for types/sandbox.go
33b1865e actions: Pin to a particular sha for actions
8564c99e actions: Add github actions to perform DCO check
c5081624 actions: Add action to perform WIP check for pull requests
7bbb9e81 rootfs-builder: Don't modify /sbin/init on the build host
3d467505 device: Ease device access for rootfs device to allow node creation
f554cdec virtcontainers: Add to bridges unit test
1d7d944f fc: refactor --daemonize option
7f3e8959 console-watcher: use console watcher to monitor guest console outputs
1099a288 kata 2.0: delete use_vsock option and proxy abstraction
73bf9329 cgroup: fix the issue of crashed when meet unsupported cgroup
ab7afae6 docs: Clarifying minimum version of containerd for annotations
5b15e9ef runtime: consolidate types definition
c6e4d092 agent: sandbox shared pid namespace support
afcf269c rustjail: fix the issue of missing join pid namespace
f3da6900 docs: add link to CRI Configuration for pods
4291eb17 runtime: add monitor_address to .gitignore
1c56abb7 runtime: virtcontainers: vhost-user-blk/scsi are block device nodes
bbf85170 runtime: add pprof interface for shim
0790ca49 runtime: add pod overhead metrics
ae83c96d Modifie to proper CPU architecture name for ppc64le.
f404f4d9 Modified Makefile to pick up correct architecture name for ppc64le.
cdbba6ac agent: Make LIBC configurable
2afbfcab virtcontainers: print a warning when the device to append is not supported
919fc4cd virtcontainer/cgroup: create cgroup manager after creating the network
a134c2e0 virtcontainers/network: Change signature of Enpoint Attach method
9a9721c2 drivers: change BindDevicetoVFIO signature
66219d16 device: support vfio cold plug
3eb694c5 device: add ColdPlug flag
3cf8b470 runtime: delete Stateful from SandboxConfig
069505e2 runtime: delete unused sub-commands.
a0a96db2 runtime: handle unimplemented RPC call by NotFound status code
bd8f03a5 runtime: remove agent abstraction
41c04648 runtime: fix wrong issue links
83b23665 config: there is no need to check vhost-vosck for FC
d96b3063 docs: add metrics design documents for Kata 2.0
b28b850a versions: Revert "versions: update QEMU to 5.0.0"
5ff53037 tools: fix branch and runime repo
24ea3f01 virtcontainers: GetOOMEvent should have no timeout
1b75daa0 runtime: add new command to collect metrics from Kata containers
5200ac06 runtime: remove old store
186fed2a runtime: add implementation of GetMetrics
0c4c69de agent: add GetMetrics implementation
9fd3e48c agent: add new pb message GetMetrics
9c501f3d agent: device: Allow "VmPath" to be used when adding block devices
15af20b6 versions: update QEMU to 5.0.0
a06d01e1 versions: specify rust version
7ae4376b clh: vsock: Use the updated VsockConfig
d8a333b9 versions: Move to cloud-hypervisor v0.8.0
9177d3a3 virtiofsd: Use cache=auto
d66f2192 cli: Fix kata-env output on Power
94fdec4e clh: Allow add virtiofs args and cache options from config
653df674 kata_agent: Add unit tests
6da49a04 clh: Clear the "PCIAddr" field while blk device hotplug
2d6c0731 kata_agent: Pass "VirtPath" with "PCIAddr" of blk devices to agent
56ae2099 kata_agent: Allow to use "VirtPath" as volume source for blk devices
bdd386ba qemu: Fix rtc parameter is not set to qemu
51a6d60a qemu: Remove PMU feature for Power (ppc64le) platform
3ece4130 runtime: clean up shim abstraction
3a17e7aa qemu: Remove pmu limitation in nested virtualization of amd/ppc64le
06571f03 build: Add "pmu=off" to default cpu_features option
115dfa19 annotations: add cpu_features
fa9d619e qemu: add cpu_features option
520295b9 network: Detect and add static ARP entries
117ce4ac clh: remove slow boot debug flags from kernel cmdline
70137962 clh: Remove vsock log port in kernel cmdline
fd5d1394 clh: Improve hypervisor logging
21f83348 clh: Set 'virtio-blk' as the default block device driver
8b5eed70 clh: Enable disk block device hotplug support
883af9c7 agent: set hostname when running as init
899b75f2 agent: fix the issue of missing found right shell
2a8650ba agent-ctl: add Cargo.lock
a8430b37 gitignore: ignore more files
be9ca0d5 qemu: Don't leak file descriptors in case of error
60606647 virtiofsd: Improve logging
7e250f29 shim: exit out of oom polling if unimplemented
9f8d1baa virtcontainers: tests fix, nit fix
d3b3e8be virtcontainers: x86: Support microvm machine type
19833936 virtcontainers: add support for getOOMEvent agent endpoint to sandbox
7c205be2 virtcontainers: add support for getOOMEvent agent endpoint to sandbox
380f07ec proto: update agent protocol
dbc1c30d versions: Remove golangci-lint and gometalinter entries
6e7dd435 qemu: arm64: Set defaultGICVersion to 3 to limit the max vCPU number
93d1f7b4 versions: Misc changes to descriptions
17b3021b qemu: arm64: Don't detect gic version by /proc/interrupts
4cda90ab dax: enable dax on arm64
7a440254 Makefile: add trace-forwarder/agent-ctl missing targets
61e011e8 vc: Version support check is ineffective in createSandbox
ebfbca03 osbuilder: use newest golang
0fd1eb59 Makefile: add default rule
3f8d4b68 trace-forwarder: add Cargo.lock
b68d4e45 shimv2: Removing function as no longer used
f570a2cd shimv2 : Remove workaround for sharedPidNs
b2cc403e build: Improve top-level Makefile
f2a19966 agent: Rename check rule to test
ea1d799f qemu: Only one element of qemuPaths map is relevant
5dffffd4 qemu: Remove useless table from qemuArchBase
97a02131 qemu: Detect and fail a bad machine type earlier
d6e7a58a qemu: Clarify test with bad machine type
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Add additional test cases that cover more asset types and functions to
increase unit test coverage.
Fixes#424
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
This PR fixes travis and go report carl url for the runtime README for kata
2.0
Fixes#432
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Use 'remap' behaviour to deal with multiple devices being shared with
a 9p export.
Fixes the following warning:
```
9p: Multiple devices detected in same VirtFS export, which might lead to file
ID collisions and severe misbehaviours on guest!
You should either use a separate export for each device shared from host or
use virtfs option 'multidevs=remap'!
```
fixes#378
Signed-off-by: Julio Montes <julio.montes@intel.com>
New features that can improve/impact in kata containers:
x86:
VMX features can be enabled/disabled via the "-cpu" flag.
When nested virtualization is enabled with an option like
"-cpu Haswell,+vmx", the set of VMX features will also be constrained to
what was available on the corresponding CPU model.
New "microvm" machine type that has virtio-mmio instead of PCI, and no ACPI
support (so no hotplug too). The new machine type is meant as a baseline
for performance optimizations of QEMU, firmware and guests. While inspired
by Firecracker it is not entirely compatible with it (for example it does
not have Firecracker's userspace IP stack and MicroVM Metadata Service).
Reduce memory footprint when booting uncompressed kernels.
ARM:
We now correctly support more than 256 CPUs when using KVM
The virt board now supports memory hotplugging, when used with a UEFI
guest BIOS and ACPI.
virtio-iommu is now supported with machvirt.
The Cortex-M7 CPU is now supported.
s390:
Using KVM now explicitly requires a host kernel version of at least 3.15
(which includes the 'flic' KVM device). This had been broken since QEMU
2.10 already.
ppc64le:
pseries machine type, now consumes less host resources when running a KVM
guest with XIVE (with a recent enough host kernel). This allows running
more concurrent guests with KVM accelerated XIVE.
NVDIMMs with file backend is now supported and SLOF updated to work with
iommu_platform=on for virtio devices.
Signed-off-by: Julio Montes <julio.montes@intel.com>
A container that is created and added to a sandbox can still fail
the final creation steps. In this case, the container must be stopped
and have its resources cleaned up to prevent leaking sandbox mounts.
Forward port of https://github.com/kata-containers/runtime/pull/2826Fixes#2816
Signed-off-by: Evan Foster <efoster@adobe.com>
Add tests for state change, empty string failures for Volumes and
Sockets. Change two function names to accurately reflect tests.
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Now, cloud-hypervisor is capable to work on arm64. it's time to
enable it in kata for arm64.
as cloud-hypervisor can only use virtio-fs, a new patch should be
applied to kernel for virtiofs and some config should be removed
temporarily.
Fixes: #446
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Remove tests from virtcontainers/sandbox_test.go which were moved to
virtcontainers/types/sandbox_test.go.
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Move unit tests that were in virtcontainers/sandbox_test.go relating
to Socket, Volume, and SandboxState to types/sandbox_test.go.
Change testSandboxStateTransition function to use SandboxState only
instead of Sandbox from virtcontainers/sandbox.go.
Fixes#435
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Since actions can access the github token, lets use a
particular version of sha rather than using master.
Fixes: #437
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
(cherry picked from commit 57b64f35e0)
Action performs a check to verify PR raised has commits
that are signed-off.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
(cherry picked from commit 1b157e5015)
Use github actions for performing WIP checks on PRs.
The action checks for keywords in subject line
as well labels.
Fixes: #437
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
(cherry picked from commit 0d96145c29)
Don't modify /sbin/init on the build host when using command `AGENT_INIT="yes" ./rootfs.sh centos` to build rootfs.
Fixes: #430
Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn>
For docker in docker scenario, the nested container created
has entry "b *:* m" in the list of devices it is allowed to access
under /sys/fs/cgroup/devices/docker/{ctrid}/devices.list.
This entry was causing issues while starting a nested container
as we were denying "m" access to the rootfs block devices.
With this change we add back "m" access, the container would be
allowed to create a device node for the rootfs device but will
not have read-write access to the created device node.
This fixes the docker in docker use case while still making sure
the container is not allowed read/write access to the rootfs.
Note, this could also be fixed by simply skipping {"Type : "b"}
while creating the device cgroup with libcontainer.
But this seems to be undocumented behaviour at this point,
hence refrained from taking this approach.
Fixes#426
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add function that creates new bridges to increase unit test coverage
for virtcontainers/types/bridges. Also adds test for address formats.
Fixes#422
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Import new console watcher to monitor guest console outputs, and will be
only effective when we turn on enable_debug option.
Guest console outputs may include guest kernel debug info, agent debug info,
etc.
Fixes: #389
Signed-off-by: Penny Zheng penny.zheng@arm.com
With kata containers moving to 2.0, (hybrid-)vsock will be the only
way to directly communicate between host and agent.
And kata-proxy as additional component to handle the multiplexing on
serial port is also no longer needed.
Cleaning up related unit tests, and also add another mock socket type
`MockHybridVSock` to deal with ttrpc-based hybrid-vsock mock server.
Fixes: #389
Signed-off-by: Penny Zheng penny.zheng@arm.com
Using pod annotations requires a minimum version of v1.3.0 of containerd
to pass annotations down to kata. This is already somewhat mentioned in
the corresponding how-to, however, it can be mis-read as the minimum
version of kata-containers instead of containerd. This can cause
extended and futile troubleshooting on older distributions such as
Ubuntu 16.04 which ship a version of 1.2.x of containerd. This patch
attempts to clarify this.
Fixes: #690
Signed-off-by: Georg Kunz <georg.kunz@est.tech>
We do not need the vc types translation for network data structures.
Just use the protocol buffer definitions.
Fixes: #415
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Add support shareProcessNamespace.
BTW, this commit only support shared pid namespace by
sharing the infrastructure pause container's pid namespace
with other containers, instead of creating a new pid
namespace different from pause container.
Fixes: #342
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
When checking if a device is an emulated vhost-user-blk or
vhost-user-scsi one, we should not only check for their major number but
also their device node type. They must be block devices.
Fixes: #401
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Makefile is determining the architecture by running uname command
which gives ppc64le as output. But rust toolchain target is available
with the name powerpc64le for ppc64le arch. So this change took care of that.
Signed-off-by: Abhishek Dasgupta <abdasgupta@in.ibm.com>
Currently the default LIBC used to build the agent is "musl". However,
"musl" is not preset in a big portion of the distros *and* "gnu" libc
just works as expected.
Knowing that, let's add the option to the one building the project to
simply do `make LIBC=gnu` instead of expected the person to go through
the Makefile and replace musl by gnu there.
Fixes: #369
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Print a warning message when the device to append to a QEMU VM is not
supported. This change is just to improve debuggability.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Create the cgroup manager once the network has been created, this way the
list of device will include the network VFIO devices attached to the sandbox,
when the physical enpoint is the network driver.
fixes#2774
Signed-off-by: Julio Montes <julio.montes@intel.com>
In order to use the device manager and receiver from the network enpoints,
the signature of the Attach method must change to revice a Sandbox instead of
a Hypervisor, this way devices can be added through the device manager API.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Depending on ColdPlug flag, cold or hot plug vfio devices. The VFIO device
won't be hot removed when such flag is false
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add ColdPlug flag to DeviceInfo and DeviceState to identify whether a device
must be or was cold plugged
Signed-off-by: Julio Montes <julio.montes@intel.com>
For now, agent return status of NotFound when calling getOOMEvents, runtime should handle it correctly.
Fixes: #393
Signed-off-by: bin liu <bin@hyper.sh>
Since the FC used the hybrid vsock, there's no need
to check whether the vhost vsock suported by host.
Fixes: #387
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
This reverts commit 15af20b6da.
kubernetes test are failing randomly with QEMU 5.0.0, let's go back to
QEMU 4.1.1 and debug the failures with QEMU 5
Depends-on: github.com/kata-containers/tests#2701
fixes#379
Signed-off-by: Julio Montes <julio.montes@intel.com>
Kata 2.0 lives in `github.com/kata-containers/kata-containers`, so all scripts
should point to it.
Currently the branch for Kata 2.0 is 2.0-dev not master, then the branch envar
must be used instead of hardcoding `master` as default branch.
Signed-off-by: Julio Montes <julio.montes@intel.com>
When the "PCIAddr" (BDF information) is available, we allow to use the
predicted "VmPath" (from kata-runtime) to locate the block device in the
agent. This is a special code path for supporting block-device/volume
passthrough w/ cloud-hypervisor when the BDF information is not
available (as of clh v0.8.0).
This is mainly porting the changes from kata-agent PR https://github.com/kata-containers/agent/pull/790,
as the related changes from kata-runtime is ported to kata 2.0 earlier
this week (https://github.com/kata-containers/kata-containers/pull/362).
Note that the upstream clh recently added the support of returning BDF
information for hotplugged devices. We will consolidate/remove this
special code path for the next upgrade of clh version in kata.
Fixes: #248
Signed-off-by: Bo Chen <chen.bo@intel.com>
New features that can improve/impact in kata containers:
x86:
VMX features can be enabled/disabled via the "-cpu" flag.
When nested virtualization is enabled with an option like
"-cpu Haswell,+vmx", the set of VMX features will also be constrained to
what was available on the corresponding CPU model.
New "microvm" machine type that has virtio-mmio instead of PCI, and no ACPI
support (so no hotplug too). The new machine type is meant as a baseline
for performance optimizations of QEMU, firmware and guests. While inspired
by Firecracker it is not entirely compatible with it (for example it does
not have Firecracker's userspace IP stack and MicroVM Metadata Service).
Reduce memory footprint when booting uncompressed kernels.
ARM:
We now correctly support more than 256 CPUs when using KVM
The virt board now supports memory hotplugging, when used with a UEFI
guest BIOS and ACPI.
virtio-iommu is now supported with machvirt.
The Cortex-M7 CPU is now supported.
s390:
Using KVM now explicitly requires a host kernel version of at least 3.15
(which includes the 'flic' KVM device). This had been broken since QEMU
2.10 already.
ppc64le:
pseries machine type, now consumes less host resources when running a KVM
guest with XIVE (with a recent enough host kernel). This allows running
more concurrent guests with KVM accelerated XIVE.
NVDIMMs with file backend is now supported and SLOF updated to work with
iommu_platform=on for virtio devices.
Depends-on: github.com/kata-containers/tests#2694
fixes#372
Signed-off-by: Julio Montes <julio.montes@intel.com>
[ port runtime commit 364435a6a18bfbb1277512431040bf085554ffdf ]
The new release of clh v0.8.0 updated the 'VsockConfig' of its HTTP API,
which requires changes on our clh driver.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 17d265af6fc1f0913545bfa64e3e1a497f3e44c0 ]
Major new functionalities added in clh v0.8.0 include Experimental
Snapshot and Restore Support, Experimental ARM64 Support, 5-level guest
paging support, etc. Also, there are quite some bug fixings and CLI/API
changes for cleanup. More details can be found in the release note:
https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v0.8.0.
Changes:
52b83969 build, release-notes: Document 0.8.0 release
776f8fc5 build: Update Cargo.lock
3f18f93f docs: Add a guide for testing on AArch64
97a1e5e1 vmm: Exit VMM event loop after guest shutdown for AArch64
5cd1730b vmm: Configure VM on AArch64
917219fa vmm: Enable VCPU for AArch64
b5f1c912 vmm: Enable memory manager for AArch64
eeeb45bb vmm: Enable device manager for AArch64
e9488846 vm-allocator: Enable vm-allocator for AArch64
5343b0ac net_util: Fix usage of deprecated mac_address method
bf37ebdc arch: x86_64: Add 5th level of paging when needed
abd6204d source: Fix file permissions
02ac1820 scripts: Ensure musl-gcc is used by musl build
cc85d896 tests: Extend test_*_reboot with checks on fd leaking
2ae547cf build(deps): bump vmm-sys-util from 0.6.0 to 0.6.1
f3556279 build(deps): bump serde_json from 1.0.54 to 1.0.55
dc034eb3 scripts: Only use musl for the Rust components
176d6716 build: Run musl builds in parallel to glibc builds
083189e5 build(deps): bump vcpkg from 0.2.9 to 0.2.10
2334b521 build(deps): bump syn from 1.0.30 to 1.0.31
99c99c24 build(deps): bump serde_json from 1.0.53 to 1.0.54
96a5e22b resources: kernel: Enable 5 levels of page table
653087d7 vmm: Reduce MMIO address space by 4KiB
5f0b6201 arch: x86_64: Enable CR4 LA57 feature
09fd3259 build: Use fork of vm-memory with less performance impact
5f9e079a device: Add AArch64 RTC PL031 implementation
625bab69 vmm: api: Allow to delete non-booted VMs
313883f6 remove duplicated structure InitrdConfig
afe60808 build(deps): bump synstructure from 0.12.3 to 0.12.4
aa79a92c tests: Add integration test for unprivileged network
9b71ba20 vmm, vm-virtio: Stop always autogenerating a host MAC address
1f8b6fa9 net_util: Allow retrieving the MAC address from the TAP device
929d70bc net_util: Only try and enable the TAP device if it not already enabled
eda9bfc7 vhost_user_fs: Replace the '--sock' parameter with '--socket'
a8cdf2f0 tests,vm-virtio,vmm: Use 'socket' for all CLI/API parameters
90e7accf ch-remote: Show response body from error
e436bbf3 build: Install libfdt in github cross-build workflow
2d13751d aarch64: Porting fdt related files from Firecracker
5a18dd36 aarch64: Porting AArch64 register implementation from Firecracker
d605fda3 aarch64: Porting GIC source files from Firecracker
ce624a6d aarch64: Add memory layout for AArch64
c7d44b88 build(deps): bump quote from 1.0.6 to 1.0.7
7c91dfae build(deps): bump proc-macro-nested from 0.1.4 to 0.1.5
17c16e5c build(deps): bump pin-project from 0.4.19 to 0.4.20
a2398742 build(deps): bump arc-swap from 0.4.6 to 0.4.7
b31fe72e build(deps): bump openssl-sys from 0.9.57 to 0.9.58
96497004 build(deps): bump dirs-sys from 0.3.4 to 0.3.5
eabf43fb Revert "tests: Extend test_*_reboot with checks on fd leaking"
7dc4e913 tests: Extend test_*_reboot with checks on fd leaking
601d898f build(deps): bump pin-project from 0.4.17 to 0.4.19
6ff107af vm-device: Switch to use get_host_address_range in vfio-ioctls
3336e801 vfio: Switch to the vfio-ioctls crate ch branch
d24aa72d vfio: Rename to vfio-ioctls
53ce5298 vfio: Move the PCI implementation to the PCI crate
8f7dc735 vmm: Move Vcpu::configure() to arch crate
969e5e0b vmm: Split configure_system() from load_kernel() for x86_64
20cf21cd vmm: Change booting process to cover AArch64 requirements
61aa4615 vhost_user_net: Implement VIRTIO_RING_F_EVENT_IDX
a4d377a0 vm-virtio: net: Implement VIRTIO_RING_F_EVENT_IDX
f0697073 vm-virtio: net: Handle lost interrupts on restore
a5596020 vm-virtio: Add some info! level debugging interrupt generation
cc51fdb8 vhost_user_net: Use NetQueuePair from vm-virtio
fcc62efc vm-virtio: net: Prepare NetQueuePair for use in vhost-user-net
2dbd1186 vm-virtio: net: Split network handling
237cb184 vm-virtio: net: Add further missing error reporting
36d072e6 vm-virtio: Add error propagation for TAP listener (un)registration
3151b5d8 vm-virtio: net: Refactor to support code reuse
22be88d3 build(deps): bump vfio-bindings from `887b3cf` to `f08cbcb`
6121f462 build(deps): bump vfio-bindings from `46ef9d4` to `887b3cf`
b731e63a build(deps): bump ryu from 1.0.4 to 1.0.5
d2d5ccb1 build(deps): bump proc-macro2 from 1.0.17 to 1.0.18
a1b9131b build(deps): bump syn from 1.0.29 to 1.0.30
2571b279 build(deps): bump vcpkg from 0.2.8 to 0.2.9
57f477ef build(deps): bump syn from 1.0.28 to 1.0.29
8a08ea46 build(deps): bump serde_derive from 1.0.110 to 1.0.111
b8ae30d4 build(deps): bump serde from 1.0.110 to 1.0.111
0a0fb246 build(deps): bump syn from 1.0.27 to 1.0.28
bc2921b2 build(deps): bump regex from 1.3.8 to 1.3.9
917ad530 build(deps): bump regex from 1.3.7 to 1.3.8
aac87196 build(deps): bump vm-memory from 0.2.0 to 0.2.1
4c2e6054 build: Update to latest version of container
c471ae94 Dockerfile: Update to latest Rust toolchain: 1.43.0
c31ad72e build: Address issues found by 1.43.0 clippy
fbd1a6c5 vmm: api: Return complete error responses in handle_http_request()
0728bece vmm: seccomp: Ensure that umask() can be reprogrammed
3497eeff main: Set the umask to 0077
c1d15de7 build(deps): bump syn from 1.0.25 to 1.0.27
a4bb96d4 build(deps): bump libc from 0.2.70 to 0.2.71
bfd52ad8 build(deps): bump linux-loader from `bd01b6d` to `1af92d2`
8f1f9d9e devices: Implement InterruptController on AArch64
b32d3025 devices: Refactor IOAPIC to cover other architectures
d5884180 build(deps): bump syn from 1.0.24 to 1.0.25
83c18de5 build(deps): bump proc-macro-hack from 0.5.15 to 0.5.16
7708b95e build(deps): bump syn from 1.0.23 to 1.0.24
749f2f03 build(deps): bump proc-macro2 from 1.0.15 to 1.0.17
c98d6fd0 build(deps): bump openssl-sys from 0.9.56 to 0.9.57
a9ca493b build(deps): bump proc-macro2 from 1.0.14 to 1.0.15
974c7138 build(deps): bump thiserror from 1.0.18 to 1.0.19
321c479b build(deps): bump proc-macro2 from 1.0.13 to 1.0.14
4f5c8be3 build: Added a workflow to cross-build targetting AArch64
1befae87 build: Fixed build errors and warnings on AArch64
0090ec2d build: Updated development utilities for AArch64
af8292b6 vmm, config, vhost_user_blk: remove "wce" parameter
9101bdd7 vm-virtio: block: Ensure backing file consistency
dc66eee8 vhost_user_block: Ensure backing file consistency
10db2131 vm-virtio: block: Add "writeback" control to Request
b94d9a30 vhost_user_backend: Allow backends to know features that can be used
9d88ba7a vhost_user_block: Use VirtioBlockConfig from vm-virtio
1fac2632 vm-virtio: Use config name as per spec
077a5c36 build(deps): bump syn from 1.0.22 to 1.0.23
a813b57f vm-virtio, vhost_user_{fs,block,backend}: Move EVENT_IDX handling
8ae7a38d build: Use same virtio-bindings version
3947809c vm-virtio: block: Ensure that VIRTIO_BLK_T_FLUSH requests actually sync
ca6edafb build(deps): bump cc from 1.0.53 to 1.0.54
a7f236b8 ci: Extend snapshot/restore to validate virtio-vsock
f442c62b vm-virtio: Implement Snapshottable trait for Vsock
f9759988 ci: Extend snapshot/restore test with virtio-iommu
646d33fe vm-virtio: Set queue fields explicitely during restore
02cbea54 vm-virtio: Implement Snapshottable trait for Iommu
4f89cb05 build(deps): bump linux-loader from `43d1c51` to `bd01b6d`
14db7b0a build(deps): bump addr2line from 0.12.0 to 0.12.1
9f2eddd9 ci: Fix test_serial_off
7c3e19c6 vhost_user_backend, vmm: Close leaked file descriptors
35782bd9 vm-virtio: Close file descriptors created by epoll::create()
039accc1 vhost_user_net, vm-virtio: Interrupt guest when TX queue is updated
c8a081e4 build(deps): bump pin-project from 0.4.16 to 0.4.17
b80a7d01 build(deps): bump vmm-sys-util from 0.5.0 to 0.6.0
e6fd6d63 vhost_user_block: Implement VIRTIO_BLK_F_FLUSH
95e3edda build(deps): bump quote from 1.0.5 to 1.0.6
d760010c build(deps): bump ppv-lite86 from 0.2.6 to 0.2.8
0cde08a7 build(deps): bump hermit-abi from 0.1.12 to 0.1.13
3adfe3fb build(deps): bump syn from 1.0.21 to 1.0.22
85aadd15 build(deps): bump proc-macro2 from 1.0.12 to 1.0.13
c764c212 build(deps): bump thiserror from 1.0.17 to 1.0.18
4366dd92 vm-virtio: block: Add support for VIRTIO_RING_F_EVENT_IDX
5a55fc07 vhost_user_fs: Fix seccomp filter for musl
391508f0 tests: Add tests checking for host MAC address setting
1b8b5ac1 vhost-user_net, vm-virtio, vmm: Permit host MAC address setting
11049401 vmm: seccomp: Add ioctl() commands interface hardware address
59e1361f net_util: tap: Add support for setting tap MAC address
68fc4329 vmm: Update seccomp filters with clock_nanosleep
badf8261 build(deps): bump anyhow from 1.0.30 to 1.0.31
7b10f732 build(deps): bump cc from 1.0.52 to 1.0.53
4120a7de vhost_user_fs: Add seccomp
6aa29bdb vmm: api: Use a common handler for data actions too
0fe223f0 vmm: api: Extend VmAction to reduce code duplication
6ec605a7 vmm: api: Refactor generic action handler
c652625b vmm: api: Add a default implementation for simple PUT requests
a3e8bea0 vmm: api: Move HttpError enum to http module
6aab0a54 vhost_user_fs: Implement support for optional sandboxing
c4bf383f vhost_user_*: Create a vhost::Listener in advance
fa844865 vhost_user_fs: Allow callers to provide a fd for /proc/self/fd
831cff3f vhost_user_fs: Use a fd for /proc/self/fd instead of /proc
ba4ec7fc ci: Extend snapshot_restore_test with hotplug
9e165c2c ci: Enable snapshot/restore integration test
c566f1f0 build(deps): bump once_cell from 1.3.1 to 1.4.0
7ffde295 build(deps): bump backtrace from 0.3.47 to 0.3.48
e9c2dbc8 build(deps): bump anyhow from 1.0.29 to 1.0.30
9ccc7daa build, vmm: Update to latest kvm-ioctls
80aa0a75 tests: Test unplugging virtio-fs
88ec93d0 vmm: config: Add missing "id" from FsConfig parsing
0f89f5ec build(deps): bump anyhow from 1.0.28 to 1.0.29
ab3d374a build(deps): bump syn from 1.0.20 to 1.0.21
35b8992e build(deps): bump thiserror from 1.0.16 to 1.0.17
3415b11d build(deps): bump quote from 1.0.4 to 1.0.5
6989bf05 build(deps): bump backtrace from 0.3.46 to 0.3.47
2991fd2a build(deps): bump libc from 0.2.69 to 0.2.70
c37da600 vmm: Update DeviceTree upon PCI BAR reprogramming
d0ae9d7c vmm: Share the DeviceTree across threads
5e9d2545 vmm: Store and restore virtio-pci BAR resources
02bd50f6 vm-virtio: Add helper to set the configuration BAR value
8a826ae2 vmm: Store and restore virtio-pci device on right PCI slot
98dac352 vmm: Add optional PCI b/d/f to each DeviceNode
1e0ebb76 pci: Allow specific PCI b/d/f to be reserved
e577b64a build(deps): bump syn from 1.0.19 to 1.0.20
36bffff2 tests: Expand the test_large_memory() test to cover lots of vCPUs
b9ba81c3 arch, vmm: Don't build mptable when using ACPI
16ac24d8 tests: Only test "noacpi" test when we don't build with ACPI
bb8d19bb arch: Check RSDP address does not go past memory
1c44e917 build(deps): bump clap from 2.33.0 to 2.33.1
4cd2eccf build(deps): bump signal-hook from 0.1.14 to 0.1.15
308b790c vm-virtio: Implement Snapshottable trait for VirtioPciDevice
6d594286 vm-virtio: Implement Snapshottable trait for VirtioPciCommonConfig
e1701f11 pci: Implement Snapshottable trait for PciConfiguration
376db311 pci: Implement Snapshottable trait for MsixConfig
52ac3779 tests: Remove network interface from test_memory_overhead
b57eeb96 vhost_user_block: Add "queue_size" to --block-backend
5016fcf8 vhost_user_block: Use config::OptionParser to simplify block backend parsing
592de97f vhost_user_net: Use config::OptionParser to simplify net backend parsing
f3f398eb vhost_user_block: Consolidate the vhost-user-block backend syntax
3220292d vhost_user_net: Consolidate the vhost-user-net backend syntax
0d2be3b6 build(deps): bump serde from 1.0.107 to 1.0.110
9d8754c6 build(deps): bump pin-project from 0.4.13 to 0.4.16
9bac13de build(deps): bump serde_json from 1.0.52 to 1.0.53
e8d4a13e build(deps): bump serde_derive from 1.0.107 to 1.0.110
d8f181c5 build(deps): bump futures from 0.3.4 to 0.3.5
1e44ac51 build(deps): bump serde_derive from 1.0.106 to 1.0.107
c197bd6f build(deps): bump serde from 1.0.106 to 1.0.107
475040b2 vm-virtio: Correctly reset the virtqueues
d809f2fe vm-virtio: Add virtio reset() support to MmioDevice
0d720cc3 bin: ch-remote: Ensure ch-remote supports syntax it advertises
74d88c4c build(deps): bump openssl-sys from 0.9.55 to 0.9.56
9adc32a0 tests: Print out details for smaps in test_memory_overhead
250f825f tests: Check that requesting tap name for virtio-net succeeds
006da040 tests: Check tap name provided is used for vhost_user_net tests
54b3329f tests: Add tests that use (non-existing) named tap
6fde2d18 build: Strip the binaries before using/releasing them
a4d23c3c build(deps): bump syn from 1.0.18 to 1.0.19
12e00c0f vmm: cpu: Retry sending signals if necessary
31bde4f5 vmm: Unpark the DeviceManager threads in shutdown
801e72ac vmm: cpu: Unpause vCPU threads
91a4a258 vmm: cpu: When coming out of the pause event check for a kill signal
cd60de8f Revert "vmm: vm: Unpark the threads before shutdown when the current state is paused"
797cd13d build(deps): bump vec_map from 0.8.1 to 0.8.2
f6a71bec vmm: Add unit tests for DeviceTree
64e01684 vmm: Create new module device_tree
3b77be90 vmm: Add device_node!() macro to improve code readability
83ec716e vmm: Create breadth-first search iterator for the DeviceTree
b91ab1e3 vmm: Remove the list of migratable devices
1be70372 vmm: Don't use migratable_devices for restore
bc608439 vmm: Add migratable field to the DeviceNode
7fec020f vmm: Create a dedicated DeviceTree structure
14b379de vmm: Add an identifier field to DeviceNode structure
0805d458 vmm: Add support for multiple children per DeviceNode
daaeba51 vmm: Change Node into DeviceNode
5c7df03e vmm: Store and restore virtio-pmem resources
2e6895d9 vmm: Store and restore virtio-fs resources
987f8215 vmm: Store and restore virtio-mmio resources
9cb1e1cc vmm: Perform MMIO allocation from virtio-mmio device creation
adf29706 vmm: Create devices in different path if restoring the VM
d39f91de vmm: Reorganize DeviceManager creation
89c2a586 vmm: Restore devices following the device tree
52c80cfc vmm: Snapshot and restore DeviceManager state
5b408eec vmm: Create a device tree
a6fde0bb vm-device: Define a Resource
b8841d7a tests: Validate vsock functionality works across a reboot
fec97e05 vm-virtio, vmm: Delete unix socket on shutdown
5109f914 vmm: config: Reject attempts to use VFIO or IOMMU without PCI
cb220ae1 tests: Add some debugging to test_memory_overhead
eb3d9d15 build(deps): bump ssh2 from 0.8.0 to 0.8.1
59b73034 build(deps): bump failure from 0.1.7 to 0.1.8
dd0791d7 build(deps): bump pnet from 0.25.0 to 0.26.0
7660a104 build(deps): bump failure_derive from 0.1.7 to 0.1.8
327d67fa virtio-mem: Return reize error in MemEpollHandler.run
bc318b64 build(deps): bump proc-macro2 from 1.0.10 to 1.0.12
5571c6af build(deps): bump signal-hook from 0.1.13 to 0.1.14
af3d0802 build(deps): bump pnet_macros from 0.25.0 to 0.26.0
678855e8 build(deps): bump term_size from 0.3.1 to 0.3.2
2a16ce7e build(deps): bump quote from 1.0.3 to 1.0.4
99e3a150 build(deps): bump backtrace-sys from 0.1.36 to 0.1.37
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 4645d3e6ef2e99dae1f2b3a7bfded6fc304d3023 ]
Today for virtiofsd kata sets by default `cache=always`. This option is
useful for performance but if the shared files are modified from the
host changes are not updated in the guest as virtiofsd uses cached value
all time.
This patch changes to `cache=auto` to fix consistency issues. The option
can still be set to always if it is wanted by the user.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 9ac39116b08148de8e66abfca2e5407bc153af87 ]
kata-env output always shows "VMContainerCapable=false" on Power.
This patch fixes the same.
Signed-off-by: bpradipt@in.ibm.com
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit e5a3211c74e20e9878fd0f5d1c80a3c4354eabd1 ]
Today some options for virtiofsd could improve compatibility
for example xattrs for dnf or cache=auto for file consistency
for changes in the host. Allow users can enabled as requiered.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 6be76fcd07a3d74ca5521af2feaf966dd6f2c344 ]
This patch adds the unit test for 'handleDeviceBlockVolume()'.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 5b96e01f1ba3b0458539c1c920d0c1aab7d5968e ]
We explicitly set "PCIAddr" to NULL, so that the "VirtPath" field can be
used by the agent to create the container.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 50c1dce137bb3d608daa931c01e4941ed5fdb6cc ]
In case the "PCIAddr" of block devices is not available (e.g.
cloud-hypervisor), we also pass the "VirtPath" to the agent for adding
block devices to the container.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit aea29b64b66f75049cb045f9e41dff2becdbebdc ]
When the "PCIAddr" of block device is not available (e.g. cloud-hypervisor), we
allow to use the "VirtPath" as the volume source for creating containers.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 379f19f7ccd71ebe938d9d6fe3cfe5f05f4f02bf ]
Add default value for Clock, otherwise rtc parameter will be dropped
by Valid function. "host" is the default value in qemu for rtc clock.
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 6b32472c2138536ea7e859360498f175601d9ec9 ]
The bug got introduced in 06571f0
Signed-off-by: bpradipt@in.ibm.com
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 18662e16687453185ff4cf99b495a34e3ea9935f ]
It's up to the user enable/disable pmu. After previous commit, the default
pmu option has been set to off.
This patch removes the hard limitation and unit test codes.
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 41a06d4961f51af4ec4799aaee202c744584f31e ]
The user sometimes doesn't care about pmu usage(e.g. perf tool profiling).
But pmu will cost significant overhead on boot time and virtualization
context switch. E.g. on arm64, if guest pmu is enabled, kvm should save
and restore all PMU registers when guest/host switching.
for dmesg comparision:
Before:
[ 0.007620] bus: 'platform': driver_probe_device: matched device pmu with driver armv8-pmu
[ 0.007622] bus: 'platform': really_probe: probing driver armv8-pmu with device pmu
[ 0.036282] hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
[ 0.036285] driver: 'armv8-pmu': driver_bound: bound to device 'pmu'
[ 0.036295] bus: 'platform': really_probe: bound device pmu to driver armv8-pmu
After:
[ 0.007935] bus: 'platform': driver_probe_device: matched device alarmtimer with driver alarmtimer
[ 0.007937] bus: 'platform': really_probe: probing driver alarmtimer with device alarmtimer
[ 0.007940] driver: 'alarmtimer': driver_bound: bound to device 'alarmtimer'
[ 0.007944] bus: 'platform': really_probe: bound device alarmtimer to driver alarmtimer
Because s390 doest support "pmu=off", keep the default CPUFEATURES to be ""
instead of "pmu=off".
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit f03c17d107999fd68da87d98ab3e242ac7843051 ]
So that users can use annotations to set it.
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 0100af18a2afdd6dfcc95129ec6237ba4915b3e5 ]
To control whether guest can enable/disable some CPU features. E.g. pmu=off,
vmx=off. As discussed in the thread [1], the best approach is to let users
specify them. How about adding a new option in the configuration file.
Currently this patch only supports this option in qemu,no other vmm.
[1] https://github.com/kata-containers/runtime/pull/2559#issuecomment-603998256
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 67d3e2c5c5d11738c0c0ff46b1228909a6c81ab0 ]
Some network plugins add static arp entries in the network namespace.
Scan namespace for static entries and pass these on to the
agent to be added within the guest.
If the grpc api is not implemented by the agent due to a older running
agent, check for this and do not error out to maintain
backward compatibility.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 6c517548429da06d33172c8e135dc9b9a297175d ]
The systemd debug and kernel init call debug flags make slow the boot.
The flags are not really related with the hypervisor and
can be added if needed using extra kernel command line options.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 160e3a7c98043a52032b15cc8f6e32a91b032258 ]
Cloud hypervisor logs console via stdout. Using console logs help
to get not only agent logs but early boot kernel logs.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit e1ee00d16ed621594a92ce0456eb048362962ff0 ]
Use systemd-cat to collect hypervisor output. The `systemd-cat` program
will open a journal fd and call `cat(1)` to redirect all the output to
the fd. This requires an extra binary to read from hypervisor stdout
(that has combined stdin, stderr and serial terminal). But because it is
cat the overhead is minimal and only is started on Kata debug mode.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 5e5527204c03036f1d1a6b3122c1e0c3e1d1ba94 ]
The block device driver defaults to 'virtio-scsi' when it is not set in
the hypervisor configuration file, while cloud-hypervisor supports only
'virtio-blk' for its block devices.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit c5f97b24d7a1eaac216f144b2c5429feb3451553 ]
With this patch, the container image can be shared from host with guest
as a block device when the 'devicemapper' is used as the storage driver
for docker. Note: The 'block_device_driver="virtio-blk"' entry is
required in the hypervisor config file to work properly.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
It should iter the shells to find the existing shell
command instead of return an error directly when it
meet an absent shell command.
Fixes: #354
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
[ port from runtime commit 7b269ff7aa2d62fe12593ff7040798e6c9bd5d65 ]
If we take one of the error paths from setupVirtiofsd() after
opening the fd variable, the fd.Close() function is not called.
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 882a82393305a4b11a77744b5fc77b98e42d15b9 ]
Send virtiofsd logs to syslog in the same way that qemu implementation
does. This requires not to wait for messages from virtiofsd stdout. This
takes the qemu implementation approach. Give the socket fd to virtiofsd.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 86f581068eb9dc4b6862c7415cdc912e111177dd ]
This exits out of polling for OOM events if the getOOMEvent
method is unimplemented.
Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit b4833a48c81132e5a6b1c25a764cd0ebbdc6afff ]
fix tests and nit
Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 6aff077901021d9a0075c446dfe281b2487e1487 ]
With the addition of support to govmm for multiple transports (intel/govmm#111)
and microvm (intel/govmm#121) we can now enable support for the 'microvm'
machine type in kata-runtime.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 86686b56a2bf7f6dd62f620278ae289564da51d0 ]
This adds support for the getOOMEvent agent endpoint to retrieve OOM
events from the agent.
Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit 86686b56a2bf7f6dd62f620278ae289564da51d0 ]
This adds support for the getOOMEvent agent endpoint to retrieve OOM
events from the agent.
Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit ee985a608015d81772901c1d9999190495fc9a0a ]
After removing dectect of host gic version, we need to limit the max vCPU
in different cases.
Given that in most cases, Kata is running on gicv3 host, set it as default
value. If the user really want to run Kata on gicv2 host, he/she need to
set default_maxvcpus in toml file to 8 instead of 0.
In summary, If the user uses host gicv3 gicv4, everything is fine
If the user uses host gicv2, set default_maxvcpus=8
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime commit c4b5922df2 ]
Most of the description fields have capitalized text,
some of those that don't are then converted on this
change.
Fixed spelling of 'required'.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime repository commit 4d4a153af5cb145215cb6e6e386eac2bcb8c3e32 ]
Commit b4385901da ("qemu/arm64: Detect host GIC version to configure guest
GIC") reads /proc/interrupts to detect the host gic version.
But on a ThunderX2 host with 224 cpus, the /proc/interrupts is ~762K bytes.
Hence it will costs ~900K bytes memory overhead.
From the go tool pprof results:
flat flat% sum% cum cum%
976.89kB 100% 100% 976.89kB 100% github.com/kata-containers/runtime/virtcontainers.getHostGICVersion
Although the allocated memory will be freed, seems it worthy removing that
for speed up the runtime.
As per [1], there is no perfect way to detect the gic version on host.
At qemu side, if we use "gic-version=host", qemu will automatically detect
the verion by kvm ioctl. So we'd better let qemu determine the gic version.
If the user really want to start vm with gic-verion=2, he/she can set it
in machine_accelerators option.
[1]https://lists.cs.columbia.edu/pipermail/kvmarm/2014-October/011690.html
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime repository commit e36389e25e ]
After backporting patch series of enabling memory hot remove on aarch64
to v5.4.x, we finally could enable nvdimm/dax on aarch64.
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ port from runtime repository commit 7e47046111 ]
If major version matches max supported major, we continue comparing the minor version.
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Removing code that existed as a workaround for a bug in
how shared process namespaces were handled in the agent.
That has been long fixed in the agent.
With this, sharedPidNs will now work with shimv2.
Fixes#337
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Define a set of functions that support the standard rules (build,
install, test, *etc*). Then simply add new components and tools to the
appropriate variable to support all the standard build semantics.
Fixes#331.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Changed the name of the rule that runs the tests to "test" for
consistency, but retained `check` for backwards compatibility
for now.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The qemuPaths field in qemuArchBase maps from machine type to the default
qemu path. But, by the time we construct it, we already know the machine
type, so that entry ends up being the only one we care about.
So, collapse the map into a single path. As a bonus, the qemuPath()
method can no longer fail.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The supportedQemuMachines array in qemuArchBase has a list of all the
qemu machine types supported for the architecture, with the options
for each. But, the machineType field already tells us which of the
machine types we're actually using, and that's the only entry we
actually care about.
So, drop the table, and just have a single value with the machine type
we're actually using. As a bonus that means the machine() method can
no longer fail, so no longer needs an error return.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently, newQemuArch() doesn't return an error. So, if passed an invalid
machine type, it will return a technically valid, but unusable qemuArch
object, which will probably fail with other errors shortly down the track.
Change this, to more cleanly fail the newQemuArch itself, letting us
detect a bad machine type earlier.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The last stanza of TestQemuAmd64Bridges is rather odd. It tries to create
a qemu instance with a machine type of (QemuQ35 + QemuPC), or in other
words "q35pc", which isn't a thing.
What it's asserting about this is that the returned bridges list is empty
despite asking for bridges, so it looks like what this is really trying to
test is for sane behaviour when given a bad machine type.
So, split this out into a separate test, and make it explicit for clarity.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement tc-based tx rate limiter to control network I/O outbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
We take different actions, based on various inter-networking models.
For tcfilters as inter-networking model, we simply apply htb
qdisc discipline on the virtual netpair.
For other inter-networking models, such as macvtap, we resort to ifb,
by redirecting interface ingress traffic to ifb egress, and then apply htb
to ifb egress.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Ingress traffic shaping is very limited, and the htb
qdisc discipline couldn't be applied to interface ingress traffic.
Here, we import a new pseudo network interface, Intermediate Functional Block (ifb).
It is an alternative to tc filters for handling ingress traffic, by
redirecting interface ingress traffic to ifb and treat it as egress traffic there.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
As for hypervisors that support built-in rate limiter, like firecracker,
we use this built-in characteristics to implement rate limiter in kata.
kata-defined rate is in bits with scaling factors of 1000, otherwise fc-defined
rate is in bytes with scaling factors of 1024, so need reversion.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Implement tc-based rx rate limiter to control network I/O inbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
In some detail, we use HTB(Hierarchical Token Bucket) qdisc shaping schemes
to control host interface egress traffic.
HTB shapes traffic based on the Token Bucket Filter algorithm, and one
fundamental part of the HTB qdisc is the borrowing mechanism.
Children classes borrow tokens from their parents once they have exceeded rate,
it will continue to attempt to borrow until it reaches ceil. See more details in
https://tldp.org/HOWTO/Traffic-Control-HOWTO/classful-qdiscs.htmlFixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
We use tc-based or built-in rate limiter to shape network I/O traffic
and they all must be tied to one specific interface/endpoint.
In order to tell whether we've ever added rate limiter to this interface/endpoint,
we create get/set func to reveal/store such info.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
We have defined specific config file configuration-fc.toml for firecracker,
including specific features and requirements, but the related unit test
TestNewFirecrackerHypervisor is missing.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
As for some hypervisors, like firecracker, they support built-in rate limiter
to control network I/O bandwidth on VMM level. And for some hypervisors, like qemu,
they don't.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Add configuration/annotation about network I/O throttling on VM level.
rx_rate_limiter_max_rate is dedicated to control network inbound
bandwidth per pod.
tx_rate_limiter_max_rate is dedicated to control network outbound
bandwidth per pod.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
The virtiofs daemon may run into errors other than the file
not existing, e.g. the file may not be executable.
Fixes: #2682
Message is now:
virtiofs daemon /usr/local/bin/hello returned with error:
fork/exec /usr/local/bin/virtiofsd: permission denied
instead of
panic: runtime error: invalid memory address or nil
Fixes: #2582
Message is now:
virtiofs daemon /usr/local/bin/hello-not-found returned with error:
fork/exec /usr/local/bin/hello-not-found: no such file or directory
instead of:
virtiofsd path (/usr/local/bin/hello-no-found) does not exist
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The current path is hardcoded as follows:
virtio_fs_daemon = "/path/to/virtiofsd"
Switch to using the value of config.VirtioFSDaemon instead.
Fixes: #2686
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
In service#StartShim, there is no applicable error variable which is checked by deferred func because the err variable is redefined.
This PR fixes the error variable.
Fixes#2727
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Call the `pkg/cgroups` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Fixes: #2782
Signed-off-by: Julio Montes <julio.montes@intel.com>
[cherry picked from runtime commit 3c4fe035e8041b44e1f3e06d5247938be9a1db15]
Check if shm mount is backed by empty-dir memory based volume.
If so let the logic to handle epehemeral volumes take care of this
mount, so that shm mount within the container is backed by tmpfs mount
within the the container in the VM.
Fixes: #323
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[cherry picked from runtime commit d0dbd0485d2f4ec3760f6fa1252ded86a7709042]
Call the `device/config` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ cherry-picked from runtime commit 13887bf89da9d2d7c215d77ca63129e1813e4c4a ]
Call the `store` packages `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We need to make sure containers cannot modify host path unless it is explicitly shared to it. Right now we expose an additional top level shared directory to the guest and allow it to be modified. This is less ideal and can be enhanced by following method:
1. create two directories for each sandbox:
-. /run/kata-containers/shared/sandboxes/$sbx_id/mounts/, a directory to hold all host/guest shared mounts
-. /run/kata-containers/shared/sandboxes/$sbx_id/shared/, a host/guest shared directory (9pfs/virtiofs source dir)
2. /run/kata-containers/shared/sandboxes/$sbx_id/mounts/ is bind mounted readonly to /run/kata-containers/shared/sandboxes/$sbx_id/shared/, so guest cannot modify it
3. host-guest shared files/directories are mounted one-level under /run/kata-containers/shared/sandboxes/$sbx_id/mounts/ and thus present to guest at one level under /run/kata-containers/shared/sandboxes/$sbx_id/shared/
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When an x86 sandbox has a vIOMMU (needed for VFIO), it needs the
'kernel_irqchip=split' option or it can't start. fdcd1f3a2 attempts to set
that, but ends up just writing it to a temporary (looks like Go for range
loops pass by value).
Fixes: #2694
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Add a configuration option and a Pod Annotation
If activated:
- Add kernel parameters to load iommu
- Add irqchip=split in the kvm options
- Add a vIOMMU to the VM
Fixes#2694
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Add a new function appendIOMMU() to the qemuArch interface
and provide an implementation on amd64 architecture.
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The ppc64 specific qemu setup code adds a "pmu=off" parameter to the cpu
model if the nestedRun option is set. But, not only does availability of
the pmu have nothing to do with nesting on POWER, there is no "pmu=" cpu
opton for ppc64 at all.
So, simply remove it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Hard-coded Qemu machine options create challenges when running Kata
with latest Qemu (v5.0) or with latest processor version.
This patch makes it configurable by leveraging the existing machine_accelerators
option in configuration.toml.
This patch fixes#2657 for ppc64le
Signed-off-by: bpradipt@in.ibm.com
The default ppc64le Qemu binary path was specific for Ubuntu.
This patch fixes the default binary path for both Fedora and Ubuntu
Fixes: #2738
Signed-off-by: bpradipt@in.ibm.com
qemu_ppc64le.go applies the "tsc=reliable", "no_timer_check" and
"noreplace-smp" kernel parameters, despite those being x86 specific. So,
just remove them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Let's add information on how to debug shimv2 when using cri-o, similarly
to what already is present with containerd.
Fixes: #672
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Dup a new file descriptor for temporary logger writer,
since this logger would be dropped and it's writer would
be closed out of if definition scope, which would cause
the logger process thread terminated if it used the original
pipe write fd.
Fixes: #318
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
With this change, a container is not longer given access to
the underlying root partition.
This is done by explicitly adding the root partition
to the device cgroup of the container.
Fixes: #317
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
The Qemu version check in unit test case is no longer needed for
Power since we don't support Kata with Qemu version < 4.x.
Fixes: #315
Signed-off-by: bpradipt@in.ibm.com
Improve the output of the data collection script to use lots more folds.
This makes it easier to review the information when viewing the pasted
output in a GitHub issue.
Fixes: #313.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add `start_section()` and `end_section()` functions to the data
collection script to allow new unfoldable sections to be created.
Redefine `show_header()` and `show_footer()` to use the new functions.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Now that the Kata containerd shim v2 can display a version string,
add those details to the data collection script.
Fixes: #309.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
All components should support a `--version` option to allow clear
identification of the version of the component being used.
Note that the build changes are required to allow the shim binary to
access the golang code generated by the build (such as the `version`
variable).
Fixes: #307.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add a new system component, used only when tracing is enabled. The
component listens to the agent over VSOCK, forwarding trace spans
created by the agent in the virtual machine onwards to an OpenTelemetry
collector (such as Jaeger) running on the host.
Fixes: #224.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Fix a long-standing bug where the KSM throttler logs would not be
collected by removing the last (unused) parameter to the
`find_system_journal_problems()` function.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The function that checks for legacy packages in the collect script was
missing pipes denoting regex alternation.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Added new functions to convert to/from a log level name (like `debug`)
to/from the equivalent `slog::Level::Debug`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The agent logger is useful and generic enough that it can be used by
other components, so move the agent logging package to below a top level
`pkg` to encourage re-use.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add the following patches for QEMU 5:
* memory-backend-file/nvdimm: support read-only files as memory-backends
* 9p: removing coroutines of 9p to increase the I/O performance
fixes#1064
Signed-off-by: Julio Montes <julio.montes@intel.com>
Clean up all clippy warning.
Also fix a bug in dealing with IFLA_IFNAME attribute.
nlh.addattr_var(IFLA_IFNAME, name.as_ptr() as *const u8, name.len() + 1);
The `name` is a rust String, which doesn't including the trailing '\0',
so name.len() + 1 may cause invalid memory access.
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Implment `TryFrom<IPAddress> for RtIPAddr` instead of From<IPAddress>,
so error code could be returned instead of unwrap().
Do the same for `TryFrom<Route> for RtRoute`.
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
There are too much unsafe code in the netlink crate, we need to reduce
unsafe code as much as possible. To achieve this, methods are classified
as public interfaces and internal methods.
All public interface of RtnlHandle has been reimplemented as safe code,
only some public helper functions to manipulater Netlink message data
structures are implemented as unsafe code.
The code to parse IPv4/IPv6/MAC addresses has been moved to a dedicated
file named parser.rs.
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
The scan_fmt crate has dependency on other four crates, and it's trivial
to use std library to implement the same logic. Get rid of scan_fmt to
reduce the dependency chain.
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Use features to enable/disable slog and agent handler on demand.
This helps to reduce dependency chains if slog/agent handler is unused.
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
The netlink crate is a library to communicate with Linux kenrel by using
the netlink socket. It's generic enough to be reused by other clients.
So get rid of dependency on the rustjail crate by:
1) normalize all pub interfaces to return Result<T, nix::Error>,
2) add helpers to reduce duplicated code,
3) move parse_mac() into lib.rs,
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
qemu contains all device support for all the board qemu supported
on arm. But we use virt machine in most cases, so there are lots
of code in no relationship with virt then never used.
Here, we add a customized config, named arm-softmmu.mak.virt for
virt board. There is around 5M decrease of qemu binary using this
customized config compared with the common config.
arm-softmmu.mak includes and customizes the pci.mak and usb.mak to let
the change in aarch64-softmmu take effect. also arm-softmmu.mak.virt
is base on arm-softmmu.mak.
comparison of qemu binary between using common config and virt config
-rwxr-xr-x 1 root root 64190080 May 28 12:49 qemu-system-aarch64*
-rwxr-xr-x 1 root root 59061584 May 27 18:14 qemu-system-aarch64.virt*
Fixes: #1062
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Update ubuntu version to 20.04, due to the following linkage errors
is not possible to build QEMU 5 on ubuntu 18.04.
```
libmount.a(libmount_la-fs.o): In function `__mnt_fs_set_source_ptr':
(.text+0x9b1): undefined reference to `blkid_parse_tag_string'
libmount.a(libmount_la-tab.o): In function `mnt_table_find_source':
(.text+0x1dbf): undefined reference to `blkid_parse_tag_string'
libmount.a(libmount_la-utils.o): In function `mnt_tag_is_valid':
(.text+0x618): undefined reference to `blkid_parse_tag_string'
libmount.a(libmount_la-cache.o): In function `mnt_free_cache':
(.text+0x834): undefined reference to `blkid_put_cache'
libmount.a(libmount_la-cache.o): In function `mnt_cache_read_tags':
(.text+0xa24): undefined reference to `blkid_new_probe_from_filename'
(.text+0xa3d): undefined reference to `blkid_probe_enable_superblocks'
(.text+0xa4a): undefined reference to `blkid_probe_set_superblocks_flags'
(.text+0xa57): undefined reference to `blkid_probe_enable_partitions'
(.text+0xa64): undefined reference to `blkid_probe_set_partitions_flags'
(.text+0xa6c): undefined reference to `blkid_do_safeprobe'
(.text+0xb32): undefined reference to `blkid_free_probe'
(.text+0xb7c): undefined reference to `blkid_free_probe'
(.text+0xba0): undefined reference to `blkid_probe_lookup_value'
libmount.a(libmount_la-cache.o): In function `mnt_get_fstype':
(.text+0xef0): undefined reference to `blkid_new_probe_from_filename'
(.text+0xf09): undefined reference to `blkid_probe_enable_superblocks'
(.text+0xf16): undefined reference to `blkid_probe_set_superblocks_flags'
(.text+0xf1e): undefined reference to `blkid_do_safeprobe'
(.text+0xf4a): undefined reference to `blkid_free_probe'
(.text+0xf68): undefined reference to `blkid_probe_lookup_value'
libmount.a(libmount_la-cache.o): In function `mnt_resolve_tag':
(.text+0x130b): undefined reference to `blkid_evaluate_tag'
```
fixes#1060
Signed-off-by: Julio Montes <julio.montes@intel.com>
We only documented how to launch minikube/kata with CRI-O. It is
trivial to flip this to containerd, and that also works with kata-deploy,
so document it.
Fixes: #660
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Not all the fragments in common are needed by all the arch. The fragment
can be skipped if the have the tag !arch. For example:
# !s390x
Fixes: #1010
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Cloud hypervisor uses vsock, without this patch CI
for cloud hypervisor is not stable.
Patch information:
```
There was a race condition between bind() and listen() that was hit very
rarely when using Kata Containers and Cloud-Hypervisor. It's been
identified the problem is really coming from the virtio-vsock driver,
which is fixed by those new kernel patches uploaded for each version of
the kernels used by Kata Containers.
```
Update:
Fixed to make it build with kernel 5.6
Fixes#932
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
CRI has a v2 schema that seems to be the default in a lot of
containerd installations. It uses a "long" form for the plugin
id in the TOML config file.
Fixes#881
Signed-off-by: Dave Syer <dsyer@pivotal.io>
Debian 10 has been broken for a while but CI started
to detected recently.
Remove package until find a way to build it.
Fixes: #1052
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This remains the original aarch64-softmmu *explicit* default config. And
borrows the tuned configuration from i386-softmmu except the board specific
ones.
Fixes: #1044
Signed-off-by: Jia He <justin.he@arm.com>
Capstone is a disassembly framework which is not required for Kata.
Disabling it in configure can reduce ~6M bytes on arm64.
-rwxr-xr-x 71977368 May 8 09:32 qemu-system-aarch64.with.capstone*
-rwxr-xr-x 65676640 May 8 09:39 qemu-system-aarch64.without.capstone*
Fixes: #1044
Signed-off-by: Jia He <justin.he@arm.com>
This PR adds the documentation repository for the update-repository-version
verification.
Fixes#1027
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Enable the Travis CI configuration to perform static CI checks
on PRs to this repo.
Fixes: #1031
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Whilst enabling the static CI checks for this repo, it picked up
a spelling mistake. We'll need to fix that before we can enable
the CI.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
QEMU >= 4.0 is able to boot into the uncompressed kernel using the PVH
entry point, but to get this `CONFIG_PVH` must be enabled in the guest
kernel and `pvh.bin` installed in the host.
Booting uncompressed kernels in QEMU 5.0 can reduce the memory footprint,
~17% for KSM and ~15% nonKSM.
fixes#1029
Signed-off-by: Julio Montes <julio.montes@intel.com>
Support for passing sandbox annotations to the OCI layer was added
in containerd 1.3.0. Add this to the docs along with configuration
changes needed.
Fixes#653
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This PR updates the current version of the SLES obs packages that are
being generated.
Fixes#651
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The config file created by kernel fragments scheme is quite different
with the old arm64_kata_kvm_5.4.x.
So I will update arm64_kata_kvm_5.4.x for consistency.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Backport Anshuman Khandual's patch series of Enabling memory hot
remove on aarch64(https://patchwork.kernel.org/cover/11419305/)
to v5.4.x.
XONE_DEVICE is dependent on the implementation of memory hot remove.
This patch series has already been merged, and queued for 5.7.
After backporting this series, we could finally enable nvdimm/dax
on arm64.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Add a few arm64-specific configs and classify them into seven new categories
, that is,
1. base architecture-dependent options(base.conf)
It also includes varient-specific features, like CONFIG_ARM64_PMEM is
one ARMv8.2 arichitectural features.
2. crypto-related options(crypto.conf)
ARMv8 adds cryptographic instructions that could significantly improve
performance on tasks such as AES encryption and SHA1 and SHA256 hashing.
3. device tree related options(dt.conf)
The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
structure and language for describing hardware, which is commonly
used in arm architecture.
4. ARM errata workarounds options(errata.conf)
There are many Kconfig entires under "Kernel Features" ->
"ARM errata workarounds via the alternatives framework", which provides
software workarounds to mitigate systems affected by those erratum.
Vendor-specific option will be left to users to decide.
5. pci related options(pci.conf)
a simplified pci host controller for mach-virt.
6. serial devices options(serial.conf)
CONFIG_SERIAL_OF_PLATFORM is used for all 8250 compatible serial ports
that are probed through device tree.
7. rtc related options(rtc.conf)
we don't have KVM’s paravirtualized clock and ptp implementation is
still under experimental mode, so we need rtc on aarch64.
QEMU provides an emulated ARM AMBA PrimeCell PL031 RTC.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Compaction is the only memory management component to form high order
(larger physically contiguous) memory blocks reliably.
The page allocator relies on compaction heavily and the lack of the feature
can lead to unexpected OOM killer invocations for high order memory requests.
We shouldn't disable this option unless there really is a strong reason.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
mmio devices are required in firecracker, and for now, x86_64 and
aarch64 are all supporting kata containers with firecracker.
So, we need to move mmio-related configs to common dir.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
There exists a few security-related configs, which are x86-64 specific.
CONFIG_LEGACY_VSYSCALL_NONE=y
CONFIG_RETPOLINE=y
CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE are kinds of tangled on
aarch64, if CONFIG_RANDOMIZE_BASE=y, then CONFIG_RELOCATABLE will be
selected automatically.
CONFIG_RANDOMIZE_BASE will randomize the virtual address at which the
kernel image is loaded, which as a security feature could deter exploit
attempts relying on knowledge of the location of kernel internals.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
There exists a few configs about linux guest support or optimization
that are not supported on aarch64.
CONFIG_HYPERVISOR_GUEST is only defined under arch/x86/Kconfig and
unfortunately, CONFIG_KVM_GUEST is not supported on aarch64 for now.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
For now, a few configs as follows in common acpi dir are truly x86-spcecific
or disable by default on arm64.
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP
CONFIG_ACPI_LPIT=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
CONFIG_HAVE_ACPI_APEI_NMI=y
And I also add a few configs which are aarch64-specific.
Like CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y, since ARM64 can run properly
in ACPI hardware reduced mode.
Fixes: #1004
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
- release: Tag and fork documentation repo as part of release
- obs: let patch set in order before apply them
- scripts: Disable pie for qemu when static building
- kernel: Enable CONFIG_VIRTIO_PMEM for booting from pmem
- kernel: Fix patch ordering
- tests: Remove performing updates in Fedora dockerfile
- kata-deploy: fix k3s containerd check
- scripts: update configuration script to support QEMU 5.0
- obs: Update SLES version for packaging
- config: enable printk-time for kernel-5.4 for arm64
- actions: change trigger phrase for kata-deploy action
- kernel: enable virtio-fs for arm64.
- add kernel config for gpu
- Optimize the kata qemu binary size
- obs: Remove OpenSUSE Leap 15.0 from obs generation
- pod : optimization Some debian package manager tweaks
d271ee7 obs: let patch set in order before apply them
fbad186 kernel: Enable CONFIG_VIRTIO_PMEM for booting from pmem
652d1fd release: Tag and fork documentation repo as part of release
7e22144 scripts: Disable pie for qemu when static building
93da145 kernel: Fix patch ordering
59f7678 tests: Remove performing updates in Fedora dockerfiles
96f3b99 kata-deploy: fix k3s containerd check
fb42e38 scripts: update configuration script to support QEMU 5.0
9bdc51c obs: Update SLES version for packaging
32986db config: enable printk-time for kernel-5.4 for arm64
9b8f20c kernel: enable virtio-fs for arm64.
12d351d kernel: add usage in readme
1389500 kernel: support force setup
7a17b50 kernel: support bash debug
d248e41 kernel: support build guest kernel for gpu
cbfc7a1 obs: Remove OpenSUSE Leap 15.0 from obs generation
9a6bd12 debian: Install missing ca-certificates package
d527c4f debian: Don't install recommended software
3670074 scripts: Disable a few options to reduce qemu binary size on generic architectures
711eae6 scripts: Set --enable-pie on aarch64 arch
7cdf113 scripts: Relax the version limitation for qemu
0871391 scripts: Remove obsoleted --disable-uuid
878a223 scripts: Disable xen when builing qemu on generic architectures
e92f3db actions: change trigger phrase for kata-deploy action
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
The enablement of ptp_kvm for arm is under review, see [1].
So we have to apply private patch to enable it in 5.4 kernel.
ptp_kvm can offer the capability of time sync in kata even there
is no network available and higher precision than time sync
service depend on network.
note:
If you want to use this feature on your arm machine, the host kernel
also need apply this patch. we recommend that your host kernel version
is the 5.4, then you can apply this patch smoothly.
[1] https://patchwork.kernel.org/cover/11372743/Fixes: #997
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
obs ci for linuxcontainer will fail when apply patch set which have
dependency within. so patch set should be made in order before feed
to apply.
Fixes: #1015
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
To support booting from pmem with cloud-hypervisor, we need to enable
the virtio-pmem in our kernel.
Fixes: #1013
Signed-off-by: Bo Chen <chen.bo@intel.com>
Since we want to start tagging and branching this repo,
create a VERSION file starting with the last version released.
Fixes#246
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
We should start maintaining stable branches for the documentation
repo similar to other repos.
Fixes#1007
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
--enable-pie is not compatible with --static option for qemu building.
Without this patch, it will report a configure error during static building:
ERROR: static and pie are mutually incompatible
Fixes: #982
Signed-off-by: Jia He <justin.he@arm.com>
Add containerd and crio versions that support
`privileged_without_host_devices` behaviour.
Fixes#638
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Fix the `build-kernel.sh` script to sort patches correctly. Previously,
it relied on `find(1)` for the ordering. However, `find(1)` does not
guarantee any ordering of files within a directory. Since the ordering
could therefore be "random", it was quite possible for patches to be
applied in the wrong order, resulting in conflicts.
Fixes: #1003.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The default k3s containerRuntimeVersion takes the form of:
containerd://1.3.3-k3s2
The awk was stripping away the k3s portion before checking if it was a
k3s containerd.
fixes#996
Signed-off-by: Brandon Wilson <brandon@coil.com>
By default, SPDK's setup.sh will bind PCI devices to
userspace from kernel. This may confuse beginners.
So add PCI_WHITELIST="none" to blacklist all PCI devices.
Fixes: #626
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Configure parameter "enable_vhost_user_store" is
added as an indicator to enable vhost-user storage
device assignment.
Also notice user hugepage should be enabled for
SPDK vhost target currently.
Fixes: #626
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Sentences for how to do host setup for vhost-user devices
were not clear, so re-edit them.
Fixes: #626
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Currently for our CI, we have SLES 15 SP1, this PR updates the current obs
version to match with our current testing.
Fixes#983
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We need to update the SLES installation guide, as we have obs packages
for SLES 12 SP4 and not for SLES 12 SP3.
Fixes#620
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
CRI-O config option manage_network_ns_lifecycle is replaced with
manage_ns_lifecycle in 1.17, which determines whether we pin and remove
namespaces and manage their lifecycle. Update docs to reflect both.
Fixes#617
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
This patch add patch file for virtio-fs-v0.3 kernel to enable memory hot
remove to let virtio-fs available on arm64. Also, kernel config file for
virtio-fs-v0.3x for arm64 is offered.
Fixes: #973
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Add option '-f' in build-kernel.sh to force the generation of .config
Signed-off-by: Jimmy Xu <junming.xjm@antfin.com>
n 请为您的变更输入提交说明。以 '#' 开始的行将被忽略,而一个空的提交
- Index the use-case of Intel and Nvidia GPUs
- Fix link to vfio-mediated-device in Intel GPU use-cases
Fixes#616
Signed-off-by: Jimmy Xu <junming.xjm@antfin.com>
This document decsribes how an Nvidia GPU can be used with Kata Containers in Nvidia GPU pass-through mode.
Fixes#616
Signed-off-by: Jimmy Xu <junming.xjm@antfin.com>
Because CI build is
1. Slow and in log it is showing because "apt-utils" not installed
2. to avoid CI build to exits with error without having certificate
Fixes: #970
Signed-off-by: Pratik Raj <rajpratik71@gmail.com>
By default, Ubuntu or Debian based "apt" or "apt-get" system installs recommended but not suggested packages .
By passing "--no-install-recommends" option, the user lets apt-get know not to consider recommended packages as a dependency to install.
This results in smaller downloads and installation of packages .
Refer to blog at [Ubuntu Blog](https://ubuntu.com/blog/we-reduced-our-docker-images-by-60-with-no-install-recommends) .
Fixes: #970
Signed-off-by: Pratik Raj <rajpratik71@gmail.com>
- ci: Provide source directory path for script execution
- kernel: Install uncompressed kernel by Image instead of vmlinux on arm64
- ACPI: Always build evged in for experimental kernel
- obs: Update obs packages for ppc64le
- scripts: enable libpmem only for x86_64
- scripts/qemu: enable libpmem
- release: Remove release docs
- test: Test for kata-containers packages on Fedora 31
- obs: Remove obs packages and testing for ubuntu 19.04 and fedora 29
- kernel: enable BPF to support libcontainer's cgroups V2 implementation
- kata-deploy: improve logic for crio.conf runtime additions
- yq: Use install_yq.sh script from tests repository
f599c8e kernel: Install uncompressed kernel by Image instead of vmlinux on arm64
c3949fd ACPI: Always build evged in for experimental kernel
83a69de scripts: enable libpmem only for x86_64
aad1e0e obs: Update obs packages for ppc64le
c0d45d8 scripts/qemu: enable libpmem
acf5b91 release: Remove release docs
3418d40 build: Enclose source dir for script execution
ac0d569 kernel: enable BPF to support libcontainer's cgroups V2 implementation
d7c2a38 obs: Remove obs packages and testing for ubuntu 19.04 and fedora 29
c8c3e46 test: Test for kata-containers packages on Fedora 31
43ab57f yq: Use install_yq.sh script from tests repository
cd6d364 kata-deploy: improve logic for crio.conf runtime additions
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
It explains the details of current supported annotations.
Fixes: #486Fixes: #294
Depends-on: github.com/kata-containers/tests#2240
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
The QAT instructions was broken after moving to a newer 4.19 kernel. Now
that the new 5.4 kernel is out, these instructions fix that.
Fixes#612
Signed-off-by: eric.adams@intel.com
shimv2/containerd logs are placed and formatted differently than for
kata CRI-O. Add some details to the Fluentd parsing document to aid
in parsing those.
Fixes: #610
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
PIE (position-independent executables) does good to security.
For some historical reason(compliation failure), it was disabled. But it
can be supported now on aarch64.
Fixes#926
Signed-off-by: Jia He <justin.he@arm.com>
Currently arm64 kata uses 3.0 qemu version. Hence aarch64 can't use some
--disable configure options between [3.1, 4.0].
Besides, due to upstream qemu bug about --disable-replication, still
enable the replication on aarch64 for qemu 3.0. Please refer to the
commit 3ebb9c4f52 ("migration/colo.c: Fix compilation issue when disable
replication")
Fixes#926
Signed-off-by: Jia He <justin.he@arm.com>
Qemu commit 315d318 uses built-in UUID implementation, hence we can't
disable uuid. This option is for generic arch, not only for aarch64.
Otherwise there is a warning during configure:
configure: --disable-uuid is obsolete, UUID support is always built
Fixes#926
Signed-off-by: Jia He <justin.he@arm.com>
Previously, it misses to add the --disable-xen for reducing qemu size
on aarch64. This patch add disable-xen on all arches, hence the case
switch is removed.
Fixes#926
Signed-off-by: Jia He <justin.he@arm.com>
vmlinux on arm64
arm64 does not use vmlinux to boot, Image is used instead.
Otherwise, kata can't boot from vmlinux.container
Besides, given that firecracker only supports booting from Image,
don't set vmlinux for firecracker target
Fixes#930
Signed-off-by: Jia He <justin.he@arm.com>
Let's change the kata-deploy github action trigger from:
'/test kata-deploy'
to
'/test-kata-deploy'
which will hopefully reduce the number of false triggers caused when
we issue the 'normal' CI runs that are triggered by other
'/test xxxx' phrases.
Fixes: #971
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
There are two 'debug' settings in the containerd config file that
affect the shimv2 runtime log output. Add the other method to the
existing documentation, and also note that enabling full containerd
debug also affects all of containerd.
The commit also re-generates the TOC, which seems to correct a
few anomolies there.
Fixes: #596
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Fedora versions 28 and 29 has come EOL, we should update the generation
of obs packages but now for Fedora 30.
Fixes#963
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Enable libpmem to support PMEM when running under Kubernetes.
see https://github.com/kata-containers/runtime/issues/2262
According to QEMU's nvdimm documentation: When 'pmem' is 'on' and QEMU is
built with libpmem support, QEMU will take necessary operations to guarantee
the persistence of its own writes to the vNVDIMM backend.
fixes#958
Signed-off-by: Julio Montes <julio.montes@intel.com>
Much of the information is from the release docs from packaging repo.
Plan is to maintain all the release information in this repo.
Fixes#600
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Some of the information in this doc has gone stale.
Move the relevant information over to Stable-Branch-Strategy.md.
It is a good idea to not have information dispersed accross
too many docs.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
1. For the git clone operation, 'sh' step in a single line would suffice.
2. Provide directory context using 'dir', this avoids having to provide the
path to the scripts twice, while executing each and every script in that folder.
Signed-off-by: Ramanathan Muthaiah <rus.cahimb@gmail.com>
Document examples of how to import Kata logs with `fluentd`.
Show examples both from the systemd/logfmt method and the
file/JSON method.
Fixes: #601
Depends-on:github.com/kata-containers/tests/pull/2334
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
libcontainer's cgroups V2 implementation requires BPF to run a BPF
program in the container
fixes#955
Signed-off-by: Julio Montes <julio.montes@intel.com>
Now that ubuntu 19.04 and fedora 29 has come EOL, we should remove the generation of
the obs generation and testing for ubuntu 19.04.
Fixes#953
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This will test the kata-containers packages that are available on
Fedora 31 to see that they are working properly.
Fixes#951
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
document what cgroups are supported and what changes are needed in the
configuration file to support them.
fixes#603
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add a section detailing the minimum debug you need to configure in
order to capture the kernel boot messages in the system journal.
Fixes: #593
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
This repo triggers the github action to create release tarballs.
It looks for release tags in other repos. So tag this repo
last to make sure tags have been created on other repos.
Fixes#947
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Removes two (similar) functions that install `yq`. Instead of
having different functions, use the one that we have in the
tests repository.
In addition, removes the `.ci/lib.sh` which only had an additional
`clone_tests_repo` function which was not being used.
Fixes: #939.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
We now make alpha releases before making a release candidate release.
Mention this in the docs.
Fixes#598
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Update release doc to mention that patch releases are not made
every 3 weeks, while minor releases are made every 12 weeks now.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
The containerd's debug option will determine whether
the kata's log forared to containerd's log pipe or
not.
Fixes:#596
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
Do not check for 'not in final' in spec creation, the logic
to fully validate is longer that just one grep.
Next should:
Use the same script build-kernel.sh to generate spec and validate it.
For now is still safe as CI will run all the build-kernels.sh to verify
the resulting config.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Kernel build for packages got broken after upgrade, this add needed
changes to build again.
Fixes#924
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
- Config changes for 5.4 kernel
- kernel: Enable new LTS 5.4.x on ppc64le arch
- lib: yq: explode anchors to get real value of image values
- kernel: use the maximum number of CPUs supported by KVM
- release: use absolute path for kubeconfig
- network: Enable ipv6 config CONFIG_IPV6_MULTIPLE_TABLES
- actions: check for packaging before clone
- release: bump kata-containers repository
- kernel/configs: enable CONFIG_X86_MPPARSE
- obs: Add ubuntu 19.04 testing
- release: tag and branch kata-containers repository
- add workflow for testing kata-deploy
- fixes for qemu 4.2.0
- config: enable printk_time for arm64.
- kernel: Enable new LTS 5.4.3 on AArch64
- FC: ELF format kernel image unsupported with firecracker on AArch64
- kata-static: Add sudo while building cloud hypervisor docker image
- obs: Remove fedora 28 obs packages
- snap: fix how latest stable version is obtained
- qemu: Patch qemu to support image without write access.
- snap: fix snap in launchpad
- kata-deploy: action: take updated yaml paths into account
04386a6 kernel: Enable new LTS 5.4.x on ppc64le arch
ea8b775 lib: yq: explode anchors to get real value of image values
b66fb43 kernel: Remove CONFIG_INET6 options from fragments
17d86c3 kernel: Always apply whitelist
ba68012 kernel: use the maximum number of CPUs supported by KVM
e0a57b6 network: Enable ipv6 config CONFIG_IPV6_MULTIPLE_TABLES
0751072 release: use absolute path for kubeconfig
32f2ff1 actions: check for packaging before clone
0ff7072 release: bump kata-containers repository
a95b359 kernel/configs: enable CONFIG_X86_MPPARSE
b023d8d kata-deploy: use clh instead of cloud-hypervisor
59a34bb static-build: drop NEMU, add CLH
6c9db9b kata-deploy-action: test CLH
f184afc testing: add workflows for testing kata-deploy
c14ded3 obs: Add ubuntu 19.04 testing
3ce2d36 release: tag and branch kata-containers repository
2ef9bbc FC: ELF format kernel image unsupported with firecracker on AArch64
ca6df85 kata-static: Add sudo while building cloud hypervisor docker image
59dc61d kernel: Enable new LTS 5.4.3 on AArch64
34d2c81 obs: Remove fedora 28 obs packages
ce2accc qemu/patches: add patches for qemu 4.2.0
7c13dc3 static-build: update blacklist for qemu 4.2.0
a407c92 config: enable printk_time for arm64.
5877ab7 snap: fix how latest stable version is obtained
43a6e67 snap: overwrite Makefile variables
bfe65e0 kernel: make get_config_version quiet
076cfa9 qemu: Patch qemu to support image without write access.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
There was a race condition between bind() and listen() that was hit very
rarely when using Kata Containers and Cloud-Hypervisor. It's been
identified the problem is really coming from the virtio-vsock driver,
which is fixed by those new kernel patches uploaded for each version of
the kernels used by Kata Containers.
Fixes#932
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Linux has embraced another LTS kernel version v5.4.x.
Update the kernel config for Power as well.
Fixes: #936
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
yq is not exploding anchors anymore and requiere an extra flag.
Add flag to fix CI.
Fixes: #934
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Now crio.conf has some kata entries in by default, but commented
out and without the runtime_path elements to them, our deploy
script gets a little confused and fails to add the kata-qemu
elements to the config.
This is because the grep spots the commented out lines, and tries
to, unsuccessfully, update the matching runtime_path elements, that
don't actually exist.
Improve this by matching only uncommented config lines, so now the
script sees that the runtime is not really configured already, and
instead of trying to edit/update it, will place a entry at the
end of the file.
Fixes: #928
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Dont think these are options are required at all.
Remove them from fragments and whitelist.
Fixes#924
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
The whitelist contains options that we dont really care.
Always apply it, irrespective of if we are using an
experimental kernel.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Since we don't know how many CPUs can have the host, we should
use the maximum number of CPUs supported by KVM (240).
255 is the maximum number of CPUs supported in the kernel, but the
maximmum number of CPUs recommended by KVM is 240, if more than 240
CPUs are used, next error will be returned by QEMU
```
Number of hotpluggable cpus requested (255) exceeds the
recommended cpus supported by KVM (240)
```
fixes#922fixeskata-containers/runtime#2413
Signed-off-by: Julio Montes <julio.montes@intel.com>
Although CONFIG_IPV6 is enabled, this additional config is
needed so that multiple route tables are used for ipv6.
Without this, the kernel adds routes for "fe80::/64"
with proto kernel in the main table instead of the
local routing table.
This makes the behaviour similar to regular containers.
Fixes#920
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
These instructions cover how to install and setup SPDK
vhost-user target, construct a vhost-user-blk device based
memory, configure the vhost-user-blk device to be available
for kata container, and run kata container with SPDK
vhost-user-blk device via docker.
Fixes: #586
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
If already exit do not clone it, but fetch.
Fetch will keep repository is up-to-date before checkout.
Fixes: #911
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
kata-containers is now part of the release processs.
Lets update the version for that repository.
Fixes: #905
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Firecracker needs CONFIG_X86_MPPARSE to support `vcpu_count`, otherwise the
amount of cpus wil always be 1.
fixes#901
Signed-off-by: Julio Montes <julio.montes@intel.com>
Update start-up guide on setting up kata containers with ACRN hypervisor.
The udpated guide is tested using KBL-NUC and addresses 2 parts,
1. Fixes broken links.
2. Adds a pre-requisite to enable MACVTAP for networking
in the Service OS.
Fixes: #580
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
We still were adding NEMU binaries - remove, and make sure we create a
kata-clh file for kata-deploy binaries.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
1. AKS based action updated to be run from either packaging or remote
repository. We will only clone kata-deploy for yaml/scripts/tests if we
are running the action outside of the packaging repo. If in packaging,
the bits are already included. Misc. cleanup as well.
2. Workflow introduced which leverages the updated AKS action. This will
allow testing of packaging changes to kata-deploy.
The workflow itself uses the following github action: xt0rted/slash-command-action
The workflow will create a kata-deploy container image based off of the latest
release, utilizing the latest released Kata artifacts off of master. It
will then use the AKS kata-deploy GitHub action.
Users with admin access on the repo can trigger this test by:
/test kata-deploy
Fixes: #845
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Now that we have obs packages for ubuntu 19.04, we should add it in the
testing script.
Fixes#884
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Now CI depends on this repository, needed to make work stable
branches starting stable-1.10
Fixes: #894
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The bootloader in firecracker on ARM platform only supports kernel
in Portable Executable(PE) format.
So we need `build-kernel.sh` to provide correct kernel image format
when parameter `hypervisor_target`, `-t`, defined with firecracker.
Fixes: #886
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
If we want to run the build.sh by using a user this is failing by saying
that `failed to dial gRPC: cannot connect to the Docker daemon...
/var/run/docker.sock: connect: permission denied`. This PR fixes that issue.
Fixes#889
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Linux has embraced another LTS kernel version v5.4.x.
If we, AArch64, update stable guest kernel version
to v5.4.x, we could get rid of huge chunkes of backport
patches under patches/4.19.x/.
Except following configs are penny-defined turned on/off,
all the other are sort of `built-in` defined or inherited
from v4.19.x.
1. CONFIG_IO_URING = y
This option enables support for the io_uring interface.
2. CONFIG_RODATA_FULL_DEFAULT_ENABLED = n
Apply read-only attributes of VM areas to the linear
alias of the backing pages as well.
3. CONFIG_ARM64_TAGGED_ADDR_ABI = n
When this option is enabled, user applications can opt in to
a relaxed ABI allow virtual tagged addresses to be passed to
system calls as pointer arguments.
4. CONFIG_ARM64_PTR_AUTH = n
Pointer authentication provides instructions for signing and
authenticating pointers against secret keys, which can be used to
mitigate Return Oriented Programming (ROP) and other attacks.
Fixes: #882
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Fedora 28 has come to end of life status which makes not possible to
retrieve the repositories while performing an update. This PR removes
this distro with this version so we not longer create and test obs packages
for fedora 28.
Fixes#879
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
cri-o now supports running privilged containers without passing devices
from the host to the container.
Fixes#529
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
As no printk time enabled for arm64, printk and dmesg will show
without timestamp.
This patch enables printk_time in kernel for arm64.
Fixes: #875
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Use `sort -V -r` to sort versions and use a regexp to
make sure the tag has the right format, since not all
tags follow Semantic Versioning 2.0.0.
fixes#872
Signed-off-by: Julio Montes <julio.montes@intel.com>
Overwrite Makefile variable `DISTRO` in order to
build rootfs and initrd images with the right distro.
fixes#868
Signed-off-by: Julio Montes <julio.montes@intel.com>
`get_config_version` should not log anything because it's used
by functions that print a string as return value, hence its return value
can be tainted, i.e `get_config_version`.
fixes#867
Signed-off-by: Julio Montes <julio.montes@intel.com>
Modify existing patch to include EACCES condition to account for files
that do not have write access to be used as a memory backend.
With this not-only files on a read-only filesystem, but files without
write access on a read-write filesystem can be used as a memory
backend in qemu.
This will alow the image to be used read-only by a rootless user as
well.
Fixes#870
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Since kustomize was introduced, we need to take into account the new
paths for our kata-deploy yamls.
Fixes: #865
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
This test is not executed at all and it is problematic when
tags are not updated.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
NEW_VERSION may be unbound whereas kata_version should be defined
following manual release process docs and while using github actions.
Use kata_version instead to checkout correct version of patches.
Check if kata_version is not empty before doing so,
as the release may be triggered for master as well.
Fixes#857
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
- release: Fix bug in how version is determined for actions
- kata-deploy: improve debug message, longer cleanup timeout
- v4.19.86: patch update for v4.19.86 on AArch64
- kata-deploy: add k3s support
- ci: Add obs testing for packaging
- kernel: Fix that the help is not printed twice
- obs: Check for broken packages
- kata-deploy: Increase the wait timeout for control plane to come up
- obs: Failed when we have unresolvable packages
- obs: Add fakeroot dependency for ubuntu 19.04
ff20f20 release: Checkout right version of kernel patches
9377c5d release: Fix bug in how version is determined for actions
168709c v4.19.86: patch update for v4.19.86 on AArch64
bbcffc3 kata-deploy: improve debug message, longer cleanup timeout
34ce361 ci: Add obs testing for packaging
0d84085 kernel: Fix that the help is not printed twice
e9bb8e5 kata-deploy: Increase the wait timeout for control plane to come up
37bce87 obs: Check for broken packages
9e716ae kata-deploy: add k3s support
380bd92 kata-deploy: reorganize files to support kustomize
0b9b722 obs: Add fakeroot dependency for ubuntu 19.04
5956065 obs: Failed when we have unresolvable packages
Signed-off-by: katacontainersbot <katacontainersbot@gmail.com>
Improve our virtualization documentation, as well as introduce
the Cloud Hypervisor VMM. This creates a virtualization specific
document, and references this from the primary architecture document.
We are still limited on ACRN documentation: this should be augmented
in a follow on PR.
The PNGs included were grabbed from https://docs.google.com/presentation/d/1ZJg3w3O6F_j3ucQhdbBdj2hZUwg7L7qF347xC07L2_wFixes: #567
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Checkout tag for packaging repo based on env variable NEW_VERSION
or kata_version with kata_version taking precedence.
With this, we checkout to the right version of packaging repo before
applying kernel patches.
Fixes#849
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Althought, we changed the script "gen_versions_txt.sh" to accept a tag
rather than a branch, this change is not sufficient.
This script generates the right version file based on a tag, but
function `get_from_kata_deps` does not use this, and ends up using the
master branch instead. This is because this function looks at an env
variable called $BRANCH and ends up using master branch if the variable
is not defined.
Pass the tag/new version to the build scripts, so that this tag is
passed along to `get_from_kata_dep`.
With this change, the correct version information is consumed by the
build scripts for the various hypervisors and kernel.
Fixes#831
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
we need to do patch update for kernel bump to v4.19.86.
Fixes: #806
Depends-on: github.com/kata-containers/runtime#2185
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
I am seeing tests fail at times waiting for label cleanup. Let's improve
the error message when this fails, and give the control plane a bit more
time, to improve stability of this test.
Fixes: #846
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
This will test that is possible to install the obs packages in different
distributions.
Fixes#621
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
While running the build-kernel.sh script with no arguments, the help is
printed twice. This PR will fix that.
Fixes#433
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Recent runs of setting up aks with github workflows shows that a timeout
of 5m is not always sufficent fot aks control plane to come up.
Increase this from 5m to 10m.
Fixes#839
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
By default, k3s uses an embedded containerd. Reconfiguring this
containerd requires modifying a template config file and restarting the
k3s (master node) or k3s-agent (worker node) systemd service.
Signed-off-by: Brandon Wilson <brandon@coil.com>
It seems that to build ksm-throttler, proxy, runtime and shim OBS packages
for ubuntu 19.04, we need fakeroot in order to have unresolvable OBS packages. This adds that dependency so we can build the packages.
Fixes#776
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We need to fail when we have unresolvable packages as they are not build
correctly.
Fixes#820
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
- kata-static: Add cloud-hypervisor to tarball
- obs: Do not wait on excluded packages
- kata-deploy: add or overwrite runtimes in containerd config
- kata-deploy: add support for Cloud Hypervisor and remove nemu
- qemu-virtiofs: Add one patch to fix libvhost-user
- release: Rename generated artifacts to a particular format
- scripts: Fix static build docker config script
- obs: Disable repo-publishing for CI builds
- release: Fix bug in evaluation kata_version.
- obs: Add ubuntu 19.04
- CI: Fix bump test
- kata-deploy: don't remove pre-existing containerd
- kernel: Enable configuration for fips mode.
- kata-deploy: action: reference kata-containers instead of fork
- snap: make launchpad happy again
dabef60 kata-static: Add cloud-hypervisor to tarball
233dfb6 static: fix qemu-virtiofs build
e4a8c6b obs: Do not wait on excluded packages
c745308 kata-deploy: add or overwrite runtimes in containerd config
c78f10f kata-deploy: remove nemu
5431096 kata-deploy: add support for Cloud Hypervisor
5d8f405 qemu-virtiofs: Add one patch to fix libvhost-user
c6f4313 release: Remove all traces of qemu-lite from packaging
e6c2a53 release: Pass the qemu tarball name as a docker build arg
7895958 release: Rename generated artifacts to a particular format
14558de scripts: Fix static build docker config script
627445e obs: Add ubuntu 19.04
4abfa70 obs: Disable repo-publishing for CI builds
c12c533 kata-deploy: don't remove pre-existing containerd
05a8d4b CI: Fix bump test
853a99c release: Fix bug in evaluation kata_version.
4d129fd kata-deploy: action: reference kata-containers instead of fork
ec95961 kernel: Enable configuration for fips mode.
27c7773 snap: reimplement image part
43a5d14 snap: use adopt-info to set grade and version
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
In case a package in obs is excluded ie no longer being built,
do not wait for it to be built. Wait as long as there are packages
being built or blocked on others to be built.
Fixes#815
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
In order to get both QEMU and Cloud-Hypervisor working with virtio-fs, a
patch needs to be applied in order to fix a libvhost-user bug.
Fixes#810
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Rather than hardcoding the tarball name to be generated in the
Dockerfile, pass this as an argument.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Rename artifacts to format kata-static-{artifact-name}.tar.gz.
These predictable names are intended to be consumed by github
actions in our release process.
Fixes#803
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
The existing document hasn't been updated since ~1.4. Updated to remove
references to qemu-lite, added details on Firecracker.
We still need details on ACRN added here as well.
Fixes: #570
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Fix bug in the `kata-configure-docker.sh` script which assumed
`/etc/docker/` existed by default.
Fixes: #800
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The repos of the CI builds are not used anywhere so let's be friendly to
the OBS infrastructure and do not publish them.
Signed-off-by: Ralf Haferkamp <rhafer@suse.com>
Bump test fails because Kata version at this moment is alfa and
only bumps from alpha to rc0 are allowed. Just use rc0 as use-case
for all, there is not any other constrain at the moment.
Fixes: #795
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This will allow us to run a VM in fips mode.
The intention is to check if the host is running in fips mode
and then start a container in fips mode as well.
Fixes#787
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
in order to make launchpad happy again, next changes are required:
* Install podman and cni plugings
* Use podman to build the rootfs or initrd image
* Depending on the architecture, build rootfs or initrd image
fixes#678
Signed-off-by: Julio Montes <julio.montes@intel.com>
- release: Fix typos and organization issues
- kata-deploy: fix qemu-virtiofs entry on crio configuration
- Add actions release automation
- tags: Tag all repos with the same kata VERSION
- kata-deploy: Add qemu-virtiofs to containerd configuration
- release: Fixing message information
- kata-deploy: Add qemu-virtiofs wrapper
- doc: Fixes for release.md
- deploy: Skip installing nemu
84e004e kata-deploy: fix qemu-virtiofs entry on crio configuration
d56dec0 release: Fix typos and organization issues
9a7d692 kata-deploy: Add a simple GitHub Action
4eb376b artifact-list: provide script to get items to build
4f89e97 kata-deploy: look for kata artifacts locally
dc8fe05 release: Allow functions to take release versions
6c8df7f release: Call kata-deploy-binaries.sh main only if it not sourced
5307b03 release: Define a default value for destdir
7a932cf release: Create tarballs after every stage
420eb6e qemu-virtiofs: Fix tar naming for qemu with virtiofs support
f2ef841 release: don't checkout packaging from packaging
643ddf9 release: Add option to generate versions based on tag
b8dcb1c tags: Tag all repos with the same kata VERSION
eea8cea kata-deploy: Fix indentation issues.
8234f9a kata-deploy: Add qemu-virtiofs to containerd configuration
aafd329 release: Fixing message information
dab8087 kata-deploy: Add qemu-virtiofs wrapper
7c26509 doc: Fixes for release.md
8eb5cf3 deploy: Skip installing nemu
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
adopt-info is a snapcraft feature that allows us to specify the snap grade and
version at runtime. Depending on the environment the master or the latest
stable branch can be used to build the Kata Containers snap, for the kata
CI and launchpad snap-master branch, the master branch are used, otherwise
the latest stable branch.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use correct key for the kata-qemu-virtiofs runtime class definition
in the crio configuration file.
Fixes: #771.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
For better reading, re-orginize the `release/README.md`
and fix a typo in `runtime-release-notes.sh`.
Fixes: #769.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Create a container based action which will test a Kata artifact tarball
in the kata-deploy daemonset on AKS. This AZ credentials are available
from the callers environment.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
artifact-list.sh is created so a builder can quickly determine which
artifacts may be built within this repository.
I user may get this list, which indicates exactly which functions are
available within ./release/kata-deploy-binaries.sh for building.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
When building the kata-deploy images before, we would look to pull the
latest artifacts from the release URL.
It would be better to allow the user to pull from this URL, or to create
the artifacts locally, and pass the location of this tar.xz to the build
process.
Instead of providing KATA_VER, builders should provide KATA_ARTIFACTS,
which is the filename that is assumed to be located within the docker
build path.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
We want to isupport calling individual functions from the script,
independendent of the actual script being called.
Define a default value for $destdir.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
We ran into issues in the past since we didn't create stable branches
for the packaging repository. We will maintain this appropriately going
forward, so let's go ahead and remove the notion of local versus remote.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Now that all files have a VERSION we
can check if there is a bump for the file.
We can now tag all repos with the same kata version.
As all of them are branched and have a VERSION file.
Fixes: #748
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We need an entry of `kata-qemu-virtiofs` on the
containerd configuration file.
In addition we need to add `kata-qemu-virtiofs` to the
shim list, so that the wrapper is created for shimv2.
Fixes: #760.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add requirement for the user to be added to the docker group.
Observed firecracker failing due to this.
Add fixes for typos and missing spaces.
Fixes#754
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Instead of have special cases, just
bump all, bumps are check and more if are automated.
CI probbly not, but we can skip if necesary.
Fixes: #744
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This is a experimental feature for arm64 as linux kernel has
not enable kvm ptp for arm64.
ptp_kvm need co-work from host and guest, so you need add this
patch both to your guest and host. Host kernel version is better
lower than 5.0 and higher than 4.19.
another version of this patch base on kernel v5.3 is under review in kernel upstream, refer to [1]
to see the full info.
[1] https://lkml.org/lkml/2019/8/29/80Fixes: #692
Signed-off-by: Jianyong Wu jianyong.wu@arm.com
Apply qemu/patches/virtiofsd/0001-add-time-to-seccomp.patch
to be able to build virtiofsd statically.
Fixes: #742.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
time syscall on seccomp is needed to be able to build
virtiofsd successfully.
This patch is currently not availabe upstream, so lets
add it until it becomes available.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Removes `--vsock` flag when building Firecracker since
the flag was removed as vsock is enabled by default.
Also update the path where the binaries are placed.
Fixes: #739.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Since kernel version updated to v4.19.73, kernel config file should
also been updated accorindly.
Fixes: #736
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Running for the first time the kata-deploy script can fail if hub
is not installed it, this will avoid this issue.
Fixes#728
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This will allow to test local changes to the kernel as well it will allow
us to have vmlinuz or vmlinux with virtiofs.
Depends-on: github.com/kata-containers/runtime#2078
Fixes#717
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
x86 has "releases" as the release branch. For
ppc64le, we have "alpha" branch. Update the scripts
for the same.
Fixes: #704
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
Overlay and veth support wasn't included when migrating to fragment
based configs. Re-add to fix DinD use case.
Fixes: #715
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
As per the comments in security.conf, the intention was to
enable STACKPROTECTOR and STACKPROTECTOR_STRONG.
The current config leaves them unset in the final .config
and also prevents other fragments from overriding the setting.
Set both to =y as indicated in the comments.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
This adds the kata deploy for QEMU and kernel with virtio-fs 3.0
Depends-on: github.com/kata-containers/runtime#2052
Fixes#709
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
When we try to run the kata-deploy-binaries.sh script, we have a failure on
the pkglib.sh script that we can not source the versions.txt. In order to
avoid these kind of failures, we introduce to detect if this file exists and
in case that it does not, we fail the script.
Fixes#712
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
kata deploy script setup a new GOPATH to pull
a fresh environment to install kata. This script
was using the local kernel install script and not the
one in the new environment
Fixes: #706
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Document how cgroups are done today and what is expected
for the upcoming SandboxCgroupOnly option.
Prior cgroup documentation are no longer accurate. Removing the cgroup
discussion from the cpu sizing discussion. Updating the
cpu-constraints.md file name to reflect this.
Fixes: #542
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
common/DAX:
- ARCH_ENABLE_MEMORY_HOTPLUG: not needed (auto-selected)
- ARCH_HAS_ZONE_DEVICE: already automatically selected. This is
also removed in future kernels, so let's go ahead and drop.
- RADIX_TREE_MULTIORDER: already autoselected, and dropped in future
kernels
common/net:
- NF_NAT_NEEDED, NF_NAT_PROTO_*: these don't exist in newer kernels, as
they are refactored and unecessary in the upstream kernel. Keep them for
now, but consider dropping if we move to newer LTS. These are part of
whitelist of options we expect to be dropped with newer kernels in our
fragment building.
- NF_NAT_MASQUERADE_IPV4: this is a select, not a tristate. Also, in
the future much of the ipv4/ipv6 nat code is combined, so this config
will not exist in newer kernels. Dropped.
- INET6_XFRM_MODE_* are not needed on newer kernels. While I'm not
confident they are needed today for Kata, we will just note them and add
to whitelist for options we expect to be dropped with newer kernels in
our fragment building.
- MAY_USE_DEVLINK: removed in future kernels, and should not be needed
anyway. Dropped.
x86_64/DAX:
- ARCH_HAS_HMM: should not be needed, and is dropped in future kernels.
Dropped
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
These will be handled directly from virtio-fs gitlab, which is utilized
when experimental support is requested in kernel build.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Experimental kernel is much newer, and many configuration options have
dropped since 4.19. Let's use a whitelist to itemize what we expect to
be dropped in the final config if experimental kernel us utilized.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
This isn't available in the baseline kernel, necessarily. Only
add these config options if an experimental kernel is being used.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Adding option `-e` to support experimental flag. When selected, the
kernel for virtio-fs is utilized instead of standard kernel.org.
This is a bit more hack-ish than I'd prefer, sorry.
Fixes: #700
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Improved the Docker installation instructions by making it clear *which*
of the multiple ways of configuration Docker for Kata is the default,
and that it is not necessary to do anything further if users select the
automatic installation method.
Fixes: #551.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The documentation says hugepages are required for virtio-fs. This
limitation was removed in Kata 1.8 in kata-runtime commit
a41894da18 ("runtime: Enable file based
backend").
Fixes: #544
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Create a dedicate how-to guide for running Kata with k8s, and link to it
from the original guide location inside the Developer Guide.
Fixes: #333
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
This commit adds documentation for privileged containers and the mounting of host devices
when privileged is used. It has instructions for disabling this functionality when using
Containerd and CRI.
Fixes#529
Signed-off-by: Alex Price <aprice@atlassian.com>
Create symlink to patches directory, the list of patches will be
included in the spec and rules files.
Signed-off-by: Julio Montes <julio.montes@intel.com>
- Run depends-on for packaging CI.
- Change were yq is installed
Depends-on: github.com/kata-containers/runtime#1996
Fixes: #683
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to trim the list of devices, default-configs/i386-softmmu.mak must
be copied after having configured QEMU. This change helps to reduce the
attack surface and the QEMU binary size.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Running the container with `ctr` when the image is not present
on the system gives an error.
Fixes#536
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Refresh installation guide README with a clearer structure, and provide
a list of distribution with official Kata packages. This also updates
the openSUSE Leap versions supported to 15 and 15.1.
Fixes: #533
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Upgrade the container before building qemu and nemu in order to install
the latest fixes for the CVEs.
fixes#676
Signed-off-by: Julio Montes <julio.montes@intel.com>
Do not use cache to build the docker images that build static qemu and nemu.
The latest version of the packages must be installed, since they may include
the fixes for theirs CVEs.
Signed-off-by: Julio Montes <julio.montes@intel.com>
In theory the latest ubuntu long term may have less CVE than previous versions,
so let's use it to build the static QEMU.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use the rootfs image by defult since performance is better,
smaller memory footprint and boot time.
fixes#667
Signed-off-by: Julio Montes <julio.montes@intel.com>
Upgrade openSUSE Leap version from 42.3 to the latest 15.1, since 42.3
version is now discontinued.
Fixes: #637
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Fix `arch` assignment and define `Dockerfile` variable to avoid
usage of unbound variables.
fixes#671
Signed-off-by: Julio Montes <julio.montes@intel.com>
The job to wait for packages are built is failing randomly.
Seems that sometimes the command is not returning and expected
out out and may be mask by the
`while osc pr | grep; done`
This probably can fail at osc pr but because it failed at
osc and not grep we consider is working.
- We check for more states that we consider not ready,
like excluded or blocked.
First query the result, if fail the script will stop,
if not then try to find the string `state=building`.
Additionally, check for failed jobs in the same query to
stop the job earlier.
Fixes: #665
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
OBS fails because of a syntax error in debian.rules
```
/bin/sh: 1: Syntax error: end of file unexpected (expecting "fi")
```
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use master branch to test the snap in order to detect errors earlier
before releasing the next snap
fixes#663
Signed-off-by: Julio Montes <julio.montes@intel.com>
`kata-fc` does not presently function under `minikube` due to
lack of block based storage. Make that clear in the installation
documents, to help prevent users going through the whole install
process, only to be disappointed when they find ti does not work.
Fixes: #526
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
In order to improve the security of Kata, nothing should be able to modify
the images. It would be really bad if a malicious container or process
modified them.
fixes#631
Signed-off-by: Julio Montes <julio.montes@intel.com>
use `merge_config.sh` script to generate the final `.config` file if the
`${arch}_kata_kvm_*` file doesn't exist.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Now we are using the fragments, drop the x86_64 4.19 config file
so we default to fragment mode.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Add the framework to build kernel config files from trees
of kernel fragments.
If no fragment directory is found for the requested kernel
version and architecture then revert to looking for a whole
prebuilt kernel config file instead.
Fixes: #234
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Create a script that will be added to the
`kata-static-${version}-${arch}.tar.xz` file and which can be either run
directly by the user to configure Docker, or can be run indirectly by
the `kata-manager` script.
Fixes: #648.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The networking part of the doc talks about the
two ends of the veth pair. One end is in the container
networking namespace and the other one should
be in the host networking namespace. Fix this info.
Fixes: #518
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Add Minikube to the list of 'cloud' providers installation instructions.
Whilst there, order the list alphabetically.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
In the workaround solution of ConfigPath, there is a '$@'
missing in the script, so add it.
Fixes: #515
Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn>
These instructions cover how to install the out of tree
QAT drivers to the host, build a custom kata kernel and
rootfs, and build a QAT accelerated OpenSSL container
image.
Fixes: #509
Signed-off-by: <eric.adams@intel.com>
Then we can use x-ignore-shared to do migration and drop the
extra patch once we move to qemu 4.1.0 or later.
Fixes: #640
Depends-on: github.com/kata-containers/runtime#1799
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
To avoid conflics between kata companents and their versions, all
components should be built using their tagged version.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Setup the kernel by hand is prone to errors.
Use `build-kernel.sh setup` to pull and setup the kata kernel.
fixes#438
Signed-off-by: Julio Montes <julio.montes@intel.com>
osbuilder shares the yq binary with the container that generates the image,
unfortunately the snap version of yq is not a static binary hence it's not
compatible with the alpine container.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add missing kernel configs to avoid `make oldconfig` asks or
takes the default value for the missing configs.
fixes#623
Signed-off-by: Julio Montes <julio.montes@intel.com>
We want to use the same script for both
PRs and new package CD. Depending if CI
is set a release push will be done or
a ci.
Fixes: #617
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
qemu static is using all the patches that we have for qemu,
we only want to apply depending the version.
Fixes: #619
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The image tag opensuse:leap not longer exist
use the the new image format.
Fixes: #615
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The main purpose is that this script will be used to verify
that VERSION among the components are equal before merging the runtime.
Fixes#613
Depends-on: github.com/kata-containers/runtime#1858
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
After a rc0 is created the path to have an stable release starts, after that
any rc0 is to improve stabability and not more features are added. When it is
the projects is considered stable no more rc* are done.
Fixes: #611
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Sometimes get logs could fail, for example
when a tag does not exit, instead of fail
just log the error in the PR.
Fixes: #609
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Packeges uses fuzz 0, lets have the same behavior
in scripts and packages.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
- OBS packages are build based on kata head
- The OBS kata branch is created on demand
- TODO: Delete branch when is not needed anymore
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add simple yaml definition to run job in azure pipelines.
- The pipeline should be triggerd with comments when is a PR
Fixes: #480
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We are looking to deprecate qemu-lite. As a first step,
let's go ahead and make qemu-vanilla (4.0) the default VMM.
We should probably rename qemu-vanilla to just qemu in a follow on
PR.
Fixes: #601
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Latest firecracker has moved the generated binaries to a new
location. Update the scripts to use the new location.
Fixes: #599
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
we need to do patch and config update for v4.19.52 on AArch64.
The config file adds a few configs involved with memory hot-plug
support.
Fixes: #591
Depends-on: github.com/kata-containers/runtime#1817
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Add jailer binary to kata-deploy. It allows us to enable jailer
with firecracker.
Fixes: #593
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
Use the tag of qemu from `versions.yaml` instead of the
version number if the version does not exist in references
of the repository.
Fixes: #587.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add further advice to the documentation requirements document to make it
easier for the spell checker to accept a document.
Fixes: #501.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Make the CI setup script call the "master" `setup.sh` script (in the
`tests` repo) and update the Travis config accordingly to ensure that
both setup and static checks are run.
Also updated Travis to use Ubuntu 16.04 LTS (Xenial).
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Instead of always having `prefix` hardcoded to `/opt/kata`,
change the script to be able to take the value from an
enviroment variable.
Fixes: #589.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Update the developer guide with instructions to
attach to the debug console of a initrd rootfs based
VM.
Fixes: #502
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Add `--head` option to use the head of the branch instead of the kata
version to generated the hashes for the packages. With this new option
kata packages can be generated using the latest commit on master.
fixes#566
Signed-off-by: Julio Montes <julio.montes@intel.com>
runtime's `versions.yaml` was updated to support QEMU 4. Update
`gen_versions_txt.sh` to support the latest `versions.yaml`.
Signed-off-by: Julio Montes <julio.montes@intel.com>
The formula is not updated according on
how is done in kata-runtime.
Fixes: #489
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
For RPM packages (but not for DEBs), OBS does not use the release number
provided in the spec file, hence, when specifying a
`Requires: package = version-release` dependency, it's not possible to know
in advance the correct release number until that reuired package
is built.
Note that omitting the release number works for RPM packages but not for DEB.
This fixes/complements e6dac82Fixes: #563
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
This package is not installed with systemd in Clear. Add this
as an additional package requirement for debug console to make it
possible to debug.
This package contains utilties like `cat`, `ls`, `echo` etc required
for a useful debug.
Fixes#492
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Recent change to always build tools from the local repository if the
script is run in a CI environment fails during a release build as the
variable ${CI} is not initialized. This fix addresses that issue.
Fixes: #537
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
update yaml, and update README to describe creation of the CRD in
Kubernetes versions < 1.14.
Fixes: #560
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
The runtime package is faling to build due to
compatiblity issues with gcc + golang because
the redhat version provided in OBS old.
Disable temporarily to allow release CI work.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Disable PAM authentication for QEMU 4+: it's a feature used together with VNC
access that's not used in Kata.
See QEMU commit 8953caf for more details on PAM auth.
Fixes: #550
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Update pkgcloud pacakge to add Fedora 30 to the list of distros
supported by to Packagecloud.
Shortlog since last vendoring of github.com/mlafeldt/pkgcloud:
926cf4b Update list of distros (Add Fedora 30)
Fixes: #546
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
When specifying a "Depends: (= VERSION" match in deb packages, the full
"VERSION" needs to be specified, including the trailing release number.
This fixes a regression introduced in: 63413814Fixes: #531
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Correct typos and resolve formatting issues including incorrect heading
levels and missing TOC entries.
Fixes: #541.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Editing systemd unit files to customize Docker for Kata may generate conflicts
with what's specified in /etc/sysconfig/docker, so use that file directly.
Also, libcgroup1 dependency is wrong for newer distros, and should be
pulled automatically for older ones.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
normal users might not have the correct permissions to run
docker without sudo.
In addition, as docker will run with sudo, fix permissions
on the qemu and nemu files.
Fixes: #544.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Creating Kata packages fails
due to "Makefile:58: *** target pattern
contains no '%'. Stop" error. Fix it.
Fixes: #539
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
NEMU releases are build dynamically on a xenial platform and there
aren't any plans on providing packaging for various versions and distros
today. NEMU needs to be built statically as the current default release
to be consumable by Kata. Given we are doing that, it would be nice to
test it in our CI also the same way. This change is to aid with that.
Fixes: #521
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
the versions.yaml file in runtime carries the information on all the
components we use and ship with kata. It would be nice to have the CI
test the newer versions when the file is changed and CI is triggered.
The current code always fetches from the master tree from github and
that does not help to validate version changes before it lands in the
tree.
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
hub tool from github lets you show a particular tag that might exist in
the repostiory. Switching our tag checking logic to use that instead of
listing all tags and grepping for the one we want. For some reason the
existing grep logic always fails to return the right code and always
lands on the portion of the code to generate a new tag.
Fixes: #519
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Created a skeletal set of README for the packaging areas that didn't
have them:
- Jenkins
- OBS
- QEMU
- static build tooling
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
GFM doesn't require corresponding heading hashes at the end of line -
start of lines hashes are adequate.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add missing heading names and a table of contents.
Also, folded the long lines to make them easier to edit and diff.
Fixes#501.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
If a doc contains commands, they should be non-interactive where
possible to allow for the possibility of automating the testing of the
document in the CI.
Fixes#477.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
New documentation is great, but finding it should be easy. Require that
all new docs are referenced by an existing document in the repo.
Fixes#475.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
- Created a how-to README.
- Moved howto links in top-level README to the how-to README.
- Moved svc-mesh how to into the how-to directory.
Fixes#473.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Removed two list entries in the design README that don't have a
corresponding document to link to.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Fixed the Debian install guide which was pointing to the Ubuntu Docker
install guide by mistake.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
- Add more motivation, background on filesystem sharing
- simplify configuration, installation by utilizing kata deploy
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
virtio-fs has landed as an experimental feature in kata. This patch
enable the basic how-to for this feature.
Fixes: #468
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Strict dependencies guarantees that an older version of the runtime will
not be installed together with a more recent version of the other kata
packages.
This complements commit e73473f.
Fixes: #508
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
As shpchp used for pci hotplug on arm64 initialized
its bottom half work as a delay work for 5 seconds, pci bus
rescan triggered between up half and bottom half of shpc interrupt
handling will fail. so disable shpc and let bus rescan
to do the device hotplug on arm64.
Fixes: #498
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
We modified the kernel subdir, even thought it was only a doc
change, so we need to bump the config ver to reflect that.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
We don't append fullstops to section titles, and they mess the
ToC up (looks wise). Nuke the one we had in this file.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Note that to use the build script you need to have some prereqs,
including a new enough golang.
Fixes: #478
Reported-by: Rory Savage <rsavage@dispersivegroup.com>
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Change the kata-deploy doc to get rid of code-snippets
and instead include instructions to apply the provided
RuntimeClass yaml according to the k8s version being used.
Fixes#457
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
It's a little complicated to understand the note of the section
Install containerd with cri plugin, that says Just check if the cri plugin has
been disabled in the containerd configuration file but if it's disabled
containerd + the runtime class won't work.
fixes#462
Signed-off-by: Julio Montes <julio.montes@intel.com>
update-repository-version script no longer expects the repository name,
but just the version and the target branch. Modify associated Makefile
and jenkins pipeline files to adapt to that change.
Fixes: #443
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Prior to this, some of the binaries installed by kata were not owned by
root. Any user can write/replace these binaries.
This was happening as tar perserves ownership while creating the
archive.
Change the ownership of all binaries to root.
Fixes#489
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Simplify the bullet list of general requirements in the documentation
requirements document at the same time as making the wording
unambiguous.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The image now is generated using versions file. It is not generated
it will fail.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If the container has had to restart, lack of overwrite here causes a benign error message to appear since the nodes already have `katacontainers.io/kata-runtime=true` label. Having a overwrite here means that we don't get the following error message:
error: 'katacontainers.io/kata-runtime' already has a value (true), and --overwrite is false
Signed-off-by: Bharat Kunwar <b.kunwar@gmail.com>
This patches adds virtio-fs capability to the kata kernel along with
config changes to enable the same on kata by default. The system will
only be exercised when `shared_fs` is set to `virtio-fs` in the kata
configuration file. the default still remains to be 9p
Fixes: #387
Depends-on: github.com/kata-containers/runtime#1016
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Previous tarball uploaded to github has a different hash length
this commit make the regex more flexible in case the commit
length is bigger.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add script to wait until obs finish the process to build.
- check if process failed
Useful for CI job.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
SUSE does not support CONFIG_RETPOLINE.
This has being failing for a while in order
to allow the pipeline pass all the builds
must be successful.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If branch is provided not use master.
When buiding packages the master repository is used
this is bad for stable releases. Use the BRANCH variable
exported in releases.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Simplify the pipeline code by doing all the bumps.
- Instead of get the repo to bump, make the script bump them all
- Do not bump osbuilder and ksm on stable branches.
- Simplify usage for automation.
Fixes: #443
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When CI (re)create repos, this does not provide ubuntu
updates.
- Ubuntu 16.04 requiere enable more repositories to
get latest gcc and allow build with golang.
- Add support to define multiple repositories
Repositories are comma separated in distros file.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If image is already uploaded to github dont build it again.
This Reduce pipeline time.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Some CI system timeout after some time of not output.
- Remove unused build variable
- remove quiet from image build, to know what is doing.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Reduce pipeline time by not installing golang.
golang is not needed to use osc, it makes slower the image creation.
- remove go dependency from pacakge lib
Remove calls to golang, this will be not not installed in
the docker image.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The conduit project seems to have renamed itself to linkerd so update
the service mesh document to reflect that.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
change short commit length to be the same to OBS.
Now that runtime has a strict depedency version from packages.
Like qemu-lite:
Now:
qemu-lite = 2.11.0+git.87517af
Before:
qemu-lite >= 2.11.0+git.87517af
The rpm fails because the real package version of qemu lite is
2.11.0+git.87517afd72
The commit length comes from the format of OBS '%h'
This change the shortcommit length to be the same to OBS
and runtime dependencies and packages that include git commit
as version use the same format.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Image tarball size is the same as the one defined
in lib, factor out to avoid future errors if is modified.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When a repository holds more than one kata version
it is difficult to ask dnf or zypper for a kata version
because the version includes a git commit.
This commit removes the sha from the package version.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The runtime requiere the componets that were
build in are release. If other versions is used
it may fail.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If we apply patches directly to qemu package
this will fail unless we use qemu 4.0 (not yet today).
This patch organize qemu patches per version. For following
PRs we should make scripts aware of this and apply the right
set of patches.
Fixes: #475
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
On ppc64le, qemu installed inside a snap image
is qemu-system-ppc64, but referred in config.toml
as qemu-system-ppc64le.
Fixes#467
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Simplify qemu rpm list files using wildcard
this will help to build different qemu versions
without change all the list of files.
- Exclude not needed binaries.
Kata does not use helper binaries, and
4.0 build has a missing qemu-ga by default,
excluding files does not fail if the file exist or not.
Fixes: #464
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Change weird condition to check qemu >=3.1
Add comment explaining the malloc-trim option.
fixes#462
Signed-off-by: Julio Montes <julio.montes@intel.com>
Fix installing docker on Debian by changing the docker install guide to
ensure that only the `kata-containers.conf` systemd service snippet is
created. Previously, both the snippet and the `daemon.json` Docker
config files were being updated because the latter also specified a bash
code block.
Note that the `daemon.json` section is now consistent with the other
install guides - it just displays the JSON code to add rather than
trying to set it.
Also, added missing shell prompts, changed code blocks into shell (but
not bash) code blocks and fixed a few minor grammar and whitespace
issues.
For further details, see:
- https://github.com/kata-containers/documentation/blob/master/Documentation-Requirements.md
- https://github.com/kata-containers/tests/tree/master/cmd/kata-managerFixes#442.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
modify configure-hypervisor.sh to support Qemu 4 and enable `malloc-trim`
for memory optimization.
fixes#459
Signed-off-by: Julio Montes <julio.montes@intel.com>
- Add information about package testing pipelines
- Fix release notes command
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Using only one directory to hold kernel patches lead to
difficult maintenance. Instead use a list of patches per
kernel version.
If patches for a kernel version does not exist, dont fail.
Fixes: #308
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add the yaml for kata RuntimeClasses. It is useful to
include this explicitly, rather than just having it in the docs.
Also, this feature has transitioned from alpha to beta from k8s 1.13
to 1.14. Hence maintain separate yamls for these versions.
Fixes#444
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
All instances of the deprecated `arch` command are now replaced with `uname -m`.
Bumps kernel/kata_config_version to 34.
Fixes: #423
Signed-off-by: Rasmus Moorats <me@neonsea.uk>
Fixes: #429
Add zun-ui plugin for devstack when intergrated with zun.
Depends-on:github.com/kata-containers/tests#1484
Signed-off-by: Lei Xu <xulei@cmss.chinamobile.com>
build, install and test kata containers snap package.
Depends-on: github.com/kata-containers/tests#1454
fixes#428
Signed-off-by: Julio Montes <julio.montes@intel.com>
This patch is update version for [1] as kernel
upgrad to v4.19.
It derives from [2] which has accept by kernel
community after v4.20. Modifacation has been done
to make it be able to enable memory hotplug using
probe method as it originally aims to using acpi.
Also some corresponding configurations in kernel
config are opened.
[1] https://github.com/kata-containers/packaging/
commit/e654dbd8367371c1b34776445a402d3c90f0dc66
[2] https://git.kernel.org/pub/scm/linux/kernel/
git/torvalds/linux.git/commit/
?id=4ab215061554ae2a4b78744a5dd3b3c6639f16a7
Change-Id: I305435f1d7e38d5cfcee22799792d1f4b0f015f8
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Jira: ENTOS-899
Kata OBS repositories provide multiple branch support.
Let define a variable to allow users define kata branch to use.
Fixes: #423
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
To assist in tracking older issues/PRs, let's add a tool for marking
issues and pull requests as being stale after 60 days of inactivity. A
stale issue/PR will be closed after 7 days of being marked stale.
Fixes: #366
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Unless we run kata VM as a hypervisor, we may need
CONFIG_S390_HYPFS_FS and CONFIG_SYS_HYPERVISOR.
CONFIG_S390_VMUR is for z/VM hypvervisor.
Remove CONFIG_ZSWAP and its dependencies to match other arches.
Fixes: #421
Signed-off-by: Tuan Hoang <tmhoang@linux.ibm.com>
api.proto moved, resulting in a broken link. The original link wasn't
very useful in the first place, so simply remove.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Let's keep all design documents in the same logical location. Updating
the file to be called 'cpu-constraints', though we may want to expand to
resource constraints going forward.
Fixes: #417
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
When writing our runtime configs to crio.conf, let's add some
whitespace and comments to make it clearer, and fit in with the
rest of the crio.conf file.
Fixes: #412
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
We no longer use the TrustedSandbox style annotations now we
have moved to the RuntimeClass method of choosing a runtime.
Drop the remaining Trusted items from the examples.
Fixes: #403
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Add a note to the developer guide explaining that the debug console
requires systemd support (hence nominally you cannot use alpine linux
for example as that doesn't use systemd).
Fixes#412.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Kata does support privileged flag but within the guest,
so explain how this works in the Limitations docs.
Fixes#362
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Change the debug console systemd job to specify the path to bash as
`/bin/bash`, *not* `/usr/bin/bash`. This unbreaks the debug console for
Ubuntu and Debian and also works for all other distros.
Fixes#410.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Don't force Docker to be kept at version 18.06 (to ensure devicemapper
is available). This feature won't be re-added by Docker and remaining on
an old version of Docker is not good from a security perspective.
Replace the pinning with a note pointing users at an issue which
provides details of alternatives to devicemapper.
Fixes#407.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Former snap configuration snapcraft.toml install qemu-lite for all
platforms, which isn't applicable on aarch64. We need qemu-aarch64
of specific version and extra patches.
Fixes: #399
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Kata supports multiple configuration file locations, so update the dev
guide to tweak config settings in
`/etc/kata-containers/configuration.toml` rather than the pristine
`/usr/share/defaults/kata-containers/configuration.toml` file. The
former is read first meaning the system can be reset to a vanilla Kata
configuration by simply deleting
`/etc/kata-containers/configuration.toml`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The debug console systemd job needs to specify `PrivateDevices=no` to
ensure the job can access the *real* console. Without this, connecting
to the socket does not provide access to the main guest root context.
Fixes#403.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
We use a packagecloud OSS account for package hosting.
As part of the arrangement with packagecloud we need to
credit them and add a link back to https://packagecloud.io
on our website and project README.
This was added to the kata-containers repository's README,
but it is also probably appropriate to add it to the packaging
README as well.
Signed-off-by: Thierry Carrez <thierry@openstack.org>
Update the how-to containerd-kata doc to support runtime option, by which
we can specify kata configure file for different kata runtime.
Fixes:#390
Signed-off-by: fupan <lifupan@gmail.com>
Rather than add the config for kata-qemu and kata-fc unconditionally,
the script now checks if the runtime config exists.
If it exists, then do not chnage the path for the runtime.
The user may have configured this to a specific path for testing
local chnages.
Fixes#374
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
to make kata-linux-container package builds reproducible.
See https://reproducible-builds.org/ for why this is good.
Uses -u to be independent of timezone.
Uses LANG=C to not have Day-of-Week and Month names vary.
Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
new kernel, new dependencies. Add bison, build-essential and flex as
kernel dependencies
fixes#395
Signed-off-by: Julio Montes <julio.montes@intel.com>
let's open nvdimm-related kernel config parameters on arm64, such as
CONFIG_ACPI_NFIT, etc. and we also need to backport patch
'kvm:arm64:Dynamic IPA and 52bit IPA'(https://patchwork.kernel.org/cover/10616271/)
and related dependency into v4.19.X to fully support nvdimm from guest kernel.
Former patch has already been merged into v4.20.X.
Fixes: #376
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Unfortunately, at present we have no way of testing Kata packages for
Red Hat Enterprise Linux (RHEL) or SUSE Linux Enterprise (SLES).
Add warnings to the RHEL and SLES install guides explaining this and
advising users to exercise caution. Hopefully, we will be able to drop
this warning soon (either when we have the ability to test on RHEL/SLES
or when Kata packages are available in RHEL/SLES).
Fixes#396.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
enable ZONE_DEVICE config to support map pages, pmem_should_map_pages()
function fails if this config is not enabled.
fixes#378
Signed-off-by: Julio Montes <julio.montes@intel.com
Eventually containerd will allow us to provide an argument for a given
runtime handler, but in the meantime, let's use bash to provide
indirection to specify the appropriate configuration file.
Only QEMU is handled until we have a block based snapshotter available.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Add support for the v2-shim integration with containerd. This registers
a runtimeClass named 'kata', utilizing the containerd-shim-kata-v2
binary.
This change adds volume mounts (hopefully temporarily) for
/usr/local/bin, as containerd requires the shim binary be within the
existing path.
Fixes: #323
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Add a CODEOWNERS file so we get auto-review requests from github
for any .md file changes.
Fixes: #394
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
`build-kernel_test.sh` builds the kernel when there are changes
of it in a PR and then runs the whole CI tests. Now we are running
all CI tests on all changes[1] (not only when there is a kernel change).
This is making the CI to run all tests twice when there is a change
in the kernel, so we need to remove the script.
[1] https://github.com/kata-containers/packaging/pull/348Fixes: #380.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
This will add missing config option (DRM_FBDEV_LEAK_PHYS_SMEM) that are
being asked while running the installation script for kata kernel. Also,
this jumps to the current kernel version that is being used at the runtime.
Fixes#372
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Add documentation on how the kernel is tested and how changes could be
introduced.
Fixes: #344
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We do releases based on kata branches lets get a fresh
versions file as the one in the host may be not updated.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add CONFIG_CFS_BANDWIDTH so CPU hotplug feature works on s390x. Note
that CPU hot-unplug does not work yet due to limitations in qemu s390x.
Fixes#360
Signed-off-by: Tuan Hoang <tmhoang@linux.vnet.ibm.com>
run all CI test to increase testing coverage on kernel config changes.
Fixes: #346
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We want to make sure Kata runs on latest stable kernels so that it
benefits from the latest features.
Fixes: #358
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
snap-build scripts were used to cross-build snap images in local environments.
Currently we are using launchpad to build and release the snaps, hence those
old scripts are no more needed.
fixes#350
Signed-off-by: Julio Montes <julio.montes@intel.com>
we add the rough kernel config v4.19.23 for arm64, here we let
'make oldconfig'(setting default) to do the transformation from
v4.14.X to v4.19.X.
Fixes: #337
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
kata-deploy inserts 'manage_network_ns_lifecycle' into crio.conf without any
prior checks and if there is a previous entry in the file, this becomes a
duplicate causing crio service restart issues. This patch addresses that
particular scenario.
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
Docker 18.09 removed devicemapper support but did not provide an
alternative. This can cause problems for users so update the install
docs to install Docker at version 18.06 (the last version that supports
devicemapper).
This is a temporary solution until either docker provide an alternative
or we find a way to work around the Docker feature being removed.
Note the extra logic required for Fedora since 18.06 is not available
for that release.
Fixes#373.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Document flow to create a release based in the tools
from this repository.
Fixes: #207
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
With the 1.5 release, we made several changes:
-simplification of daemonsets
-introduction of runtimeClass
Update documentation to take this into account.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Before the kata-deploy container image was intended to be
used with only Kubernetes. This commit adds a script for configuring
Kata to run with Docker.
This assumes > release 1.5 of Kata, as Firecracker is being configured
as well as QEMU based Kata. Note, in order for this to work, Docker must
be configured to use a block-based storage driver.
To succeed, it the following directories must be mounted:
- /opt/kata - this is the location that the kata artifacts are stored
- /run/systemd - for reloading the docker service
- /var/run/dbus - for reloading the docker service
- /etc/docker - for updating the docker configuration (daemon.json)
usage: kata-deploy-kata [install | remove]
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Simplify the yaml and combine the prior scripts. The resulting script,
kata-deploy.sh, is used for install and configuration and
removal for CRI-O and containerd. While this could be used standalone
outside of daemonsets, today it will sleep infinity after processing the
request, since it is assumed to be called by a daemon.
By checking the CRI runtime within the script itself, we no longer need
to support many daemonsets for deploy - just a single. Still requires a
seperate cleanup daemonset (for restarting the CRI runtime), and an
RBAC.
Verified with CRI-O -- containerd testing WIP
Throwing this up now for feedback since I do not bash good.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Use a new GOPATH to build image in order to avoid clashes with user's GOPATH,
otherwise user's kata agent will be used causing problem if that repository is
not up to date.
Signed-off-by: Julio Montes <julio.montes@intel.com>
We have some initial Firecracker/Kata documentaiton, but for now
it lives in the wiki. Link off to it from the top level docs
README to make it more obvious and easier to find.
Fixes: #367
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
New hypervisor configs could be added in the future, add
any possible new config file.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Now there are 2 config paths lets update both to not use
initrd by default.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Use a new GOPATH to build image in order to avoid clashes with user's GOPATH,
otherwise user's kata agent will be used causing problem if that repository is
not up to date.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Although the installation instructions specify `apt-get -y ...`, the
installation blocks when trying to install the Kata pages with a message
like this:
```
...
Restart services during package upgrades without asking?
<Yes> <No>
```
Setting `DEBIAN_FRONTEND=noninteractive` avoids this.
Fixes#363.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The snap install doc only told you how to install the kata snap,
and did not then go further to describe how to configure and
intergrate it. Those details are available already over in the
packaging repo, so let's link out to them.
Fixes: #360
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
- update cri-containerd to containerd with cri plugin
- suggest the shimv2 to be the preferred kubernetes integration way.
Signed-off-by: Xu Wang <xu@hyper.sh>
- mentioned shimv2 in the configuration part of manual installation
- reference the link of shim v2 api and the k8s containerd howto
Signed-off-by: Xu Wang <xu@hyper.sh>
The OpenSUSE and SLES install guide for Docker used the --containerd
option. When this option is used on OpenSUSE Leap 15 or SLES 15, the
following error occurs when starting Docker:
Failed to connect to containerd: failed to dial
"/run/containerd/containerd.sock": context deadline exceeded
Removing the --containerd option from the configuration file allows the
Docker daemon to start successfully and a Kata container to be created.
Fixes: #350
Signed-off-by: John L. Jolly <jjolly@suse.com>
As memory hotplug for arm64 by acpi is not ready on qemu, we choose
"probe" instead. You can refer to [1] to get more infomation about
"probe". The process of memory hotplug by "probe" in kata lies below:
firstly, add memory in qemu qmp; secondly, echo the start phyical address
of that memory to /sys/devices/system/memory/probe, which will be done
through kata-agent; thirdly, excute online op, then this newly added
memory is capable to be used.
All functions in this patch will be called after "echo" op. It can be
divided into two parts:
1. create page table for that memory;
2. add that memory to memblock.
In this patch, NUMA must be turned off for not all arm64 machine supports
NUMA.
As the newly added memory should be placed from 2T to 6T which is decided
in qemu and phyical address and virtual address will be one-one mapping
when create pgd for that memory, we must config ARM64_VA_BITS as 48.
Also some configs should be turned on, especially "ARCH_MEMORY_PROBE".
We have tested this patch integrated with another patch which performed
that echo op. It works well when using "-m" in command line when start a
kata-container on aarch64 machine.
This patch derived from Maciej Bielski. You can refer to [2] to get full
infomation about it.
[1] https://www.kernel.org/doc/Documentation/memory-hotplug.txt
[2] https://lkml.org/lkml/2017/11/23/183Fixes: #309
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
In some build systems like launchpad is not possible to run neither
custom commands or hooks, hence build a snap image with `make snap` is
not feasible, to deal with this limitation, the final snapcraft.yaml
is part of the repository and all packages versions are read from versions.yaml
in the runtime repository.
fixes#305
Signed-off-by: Julio Montes <julio.montes@intel.com>
Remove the `network connect` limitation from `Limitations.md` as the limitation has been removed.
Fixes#287.
Signed-off-by: Ayoub Bousselmi <abousselmi@users.noreply.github.com>
Remove the `ps` limitation from `Limitations.md` as the limitation has been removed.
Fixes#342.
Signed-off-by: Ayoub Bousselmi <abousselmi@users.noreply.github.com>
If the runtime repository is already cloned get version from it,
else keep getting from github.
Fixes: #299
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Update the developer guide to include documentation
for the inclusion of seccomp packages in initrd/rootfs
images.
Fixes: #339
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Add a reference to the release rotation wiki, and clarify that
the current stable release schedule is every-other-week.
Fixes: #337
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Fixes#310.
These install instructions are based on the katacontainers repository for Debian. For installation, a newer version of `librbd1` is required. This is available from the `unstable` repo.
Tested only on Debian 9 - Stretch (x86_64).
- tested with `docker-ce=17.12.0~ce-0~debian`
Signed-off-by: zeigerpuppy <zeigerpuppy@users.noreply.github.com>
1.5.0-rc2 packages for linux-container fail for Ubuntu. Let's use 1.4-stable instead of master for now.
Fixes#325
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Since aarch64 need custom QEMU binary and doesn't support OBS
packaging for now, we add this section to lead developers to build
required qemu-system-aarch64 binary.
Fixes: #320
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
We want to make sure Kata runs on latest stable kernels so that it
benefits from the latest features.
For instance, in case of Kata relying on NEMU hypervisor, the recent
kernel patches reworking the way timer calibration is handled are
solving some boot latency issues.
Fixes#287
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Fix version compare when specifying a pre-release version in
versions.txt. This is needed because kata on git uses strict semver,
while kata RPM packages uses ~ in place of - for PATCH version, to
allow RPM version comparison to work properly.
Fixes: #285
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
For supporting nvdimm, we need to update kernel on aarch64 to the
stable version 4.19.8 and backport Suzuki K Poulose's latest
Dynamic IPA and 52bit IPA support patch series
(https://patchwork.kernel.org/cover/10616271/)which has been included
in 4.20-rc3+ to the v4.19.8.
Fixes: #268
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Wei Chen <Wei.Chen@arm.com>
Only create a new docker unit file if no other existing unit files
are detected. Creating a new docker file when not necessary may mask out
existing docker daemon configurations.
Fixes: #300
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
We want to make sure Kata runs on latest stable kernels so that it
benefits from the latest features.
For instance, in case of Kata relying on NEMU hypervisor, the recent
kernel patches reworking the way timer calibration is handled are
solving some boot latency issues.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Remove configs s390_kata_kvm_4.14.x
The patch 0003-serial-forbid-8250-on-s390 is no longer necessary as it
has been upstreamed since version 4.16
The kernel configs have been generated as described in https://github.com/kata-containers/packaging/issues/246
plus the vsock options have been manually enabled:
CONFIG_VSOCKETS=y
CONFIG_VIRTIO_VSOCKETS=y
CONFIG_VIRTIO_VSOCKETS_COMMON=y
Fixes: #280
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Remove modules from default kernel config.
Modules are not used in default kata images.
Lets remove them.
Fixes: #276
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Right golang version is installed before building kata-containers, skip go
version check to avoid including extra build dependencies.
fixes#265
Signed-off-by: Julio Montes <julio.montes@intel.com>
Update the kernel package to allow building for multiple architectures with
a single set of sources.
Changes:
- Add kernel configs for all architectures
- Detect at runtime the correct target architecture and kernel
compressed image location. This is done with the script kata-multiarch.sh
Note that debian control files still need to be updated to handle Multi-Arch,
so that they are not tied to the architecture on which
`linux-container/update.sh` is run.
Fixes: #262
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Detect misalignments of versions between the content of versions.txt
file and the version found in the VERSION file in the git branch being
released on OBS.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Introduce the --compare option to compare the content of the local
versions.txt file with the one found at the specified git branch.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
This improves the guest kernel build section of Developer-Guide
on arm64.
This also improve the description of sriov use-case.
Fixes: #299
Signed-off-by: Jia He <justin.he@arm.com>
Remove the redundant dh-modaliases package as a build requirement
for deb packages. This allows to build packages for the Debian distro.
Fixes: #249
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
The s390_kata_kvm_4.14.x configs have been obtained by applying the patch
serial-forbid-8250-on-s390 and the combination:
make defconfig kvmconfig localyesconfig
Fixes: #246
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
The patch 0003-serial-forbid-8250-on-s390.patch fixes a conflict between
the ttysclp0 and serial 8250 console. The patch is already upstream and
it has been introduce in version v4.16-rc1.
However, it is not backported. See https://lore.kernel.org/patchwork/patch/861679/
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
Remove the `!` from the `echo` in the code example in the doc
requirements doc.
The current code is in fact invalid as the shell will try to interpret
the exclamation mark as it is a reserved word. Rather than escaping it
in the example, just remove it.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Skip the golang version check when building the runtime, passing to make
`SKIP_GO_VERSION_CHECK=1`. This check requires yq, that's not packaged
for most distributions and it can't be downloaded at build time on OBS.
It is the responsibility of the package maintainer to verify that the
correct golang version is used.
Fixes: #242
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Remove hardcoded golang version, as it overwrites the value previously
fetched from the runtime/versions.yaml.
NOTE: this has as consequence replacing the golang compiler version from
1.10.2 to 1.11.1 (that is currently the "newest-version" specified on
master).
Fixes: #242
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Usage:
- add more information about what the script does
- support for -h / --help flags
- tagging of error messages with `ERROR: ` prefix
Fixes: #244
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Add the ability to specify a list of projects to process, instead of
processing all projects (default behaviour).
Fixes: #244
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Use /snap/kata-containers/current as prefix on building but /usr on install.
This changes are needed to include all new kata components like netmon in the
final snap.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Without Real time clock the date could not work properly for Arm64.
fixes: #238
Change-Id: I5834a5e90dc648cc9599c50f259d5ae273052a39
Signed-off-by: Wei Chen <wei.chen@arm.com>
Instead of specifying a version for OVMF binary, this patch uses
a tiny script to retrieve the proper URL to download from.
Fixes#289
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Include in the release notes the kubernetes version that
has been tested with the release.
Fixes: #235.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
The Intel GPU support has been enabled in kata runtime, but the
guest kernel of kata container lacks the support of Intel GPU,
so this commit enables it as default in guest kernel.
CONFIG_DRM, CONFIG_DRM_I915 and CONFIG_DRM_I915_USERPTR are necessary.
Others are obtained by running command "make menuconfig" and selecting
the following options.
Device Drivers
---> Graphics support
---> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)
Device Drivers
---> Graphics support
---> Intel 8xx/9xx/G3x/G4x/HD Graphics
Fixes#232
Signed-off-by: Zhao Xinda <xinda.zhao@intel.com>
Add new CI tests to ensure that the following installation methods are
also tested:
- "Automatic" method ([`kata-manager`](https://github.com/kata-containers/tests/tree/master/cmd/kata-manager))
- "Scripted" method ([`kata-doc-to-script`](https://github.com/kata-containers/tests/blob/master/.ci/kata-doc-to-script.sh))
**Note:** the "Automatic" method is **not** the same as the existing
`kata-manager` test: the existing test executes the "Manual"
installation method (which runs `kata-manager` to execute the
appropriate distro-specific install guide). However, this new test
executes the `install/installing-with-kata-manager.md` document, which
subsequently calls the `kata-manager` script.
Since the "Automatic" and "Scripted" installation methods are designed
to run "standalone" (without requiring any local git repo clones), the
script which runs these new tests has to take care to ensure the
environment they run in is clean. It does this by using the following
approach:
- Removes any local Kata github repos from the standard `GOPATH`
locations (to ensure the scripts do not inadvertently access local
files) [1].
- Creates a temporary directory containing:
- A copy of *itself*.
- The scripts it generated from the "Automatic" and "Scripted" installation documents.
- Re-exec's itself to run the version in the temporary directory,
passing an option that tells itself to simply execute the scripts in
the specified directory.
- It then runs the scripts in the directory specified.
---
[1] - Since the recursive delete of all local Kata github repos is
potentially dangerous, the test will immediately fail if the standard
`KATA_DEV_MODE` variable is set (since this denotes a developer system)
and will also fail unless the standard `CI` variable is set (denoting
the script is running in a Continuous Integration environment, such as
JenkinsCI.
Fixes#278.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Wrap the function calls in the doc test script in a `main()` function to
simplify future changes.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Split out a function to create a container from
`test_distro_install_guide() in the script used to test install docs.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The script used to test the install docs does not actually use the
golang binary (it only uses the `GOPATH` variables) so remove the
unnecessary call to `go`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Rework the logic in `check_install_docs()` to make the intention
clearer and support adding additional tests.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Rename the `check_install_guides()` function to `check_install_docs()`
and clean up:
- Improve messages.
- Add more braces around variables.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Update the `kata-doc-to-script` install document to actually execute the
generated scripts, allowing the entire installation to be tested by the CI.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Converted the plain code blocks in
`install/installing-with-kata-doc-to-script.md` to bash code blocks so
that they are executable by... `kata-doc-to-script.sh`.
Also, removed the backslashes to let github render scroll bars for
consistency with other docs.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Use a one-line code block for the installation command, and document the
dry run option.
Fixes: #275
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
Improvements to the table in the installation README:
- Fix the invalid link for the "Build from sources" option.
- Add column for "Packaged install" to make it clearer which
options result in a distro-packed install.
- Tweaked the "Suggested for" column to make the use-case options
clearer.
- Added detail for each use-case in a "Description" column.
Fixes#276.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
As discussed in issue #171 IPv6-in-IPv4 tunnel is useless in guest. So we
decide to disable the CONFIG_IPV6_SIT by default for Arm64.
Fixed#230
Signed-off-by: Wei Chen <wei.chen@arm.com>
As x86_64 has updated the guest kernel to enable EFI support for NEMU,
because OVMF that is used by NEMU is an EFI firmware. Although the
NEMU is not ready for Arm64, we'd better to enable EFI support in
kernel to keep sync with x86_64.
Fixes#228
Signed-off-by: Wei Chen <wei.chen@arm.com>
Linux-container OBS packaging for ppc64le
fails as the spec file is x86 specific for
kernel build and install process.
Fixes: #224
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
evged is required to make Kata work with NEMU.
Apply the kernel patch when building kernel.
Fixes: #268
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Refactor installation instruction to minimize duplicate content,
to document package source verification process, and to remove
some of the typos.
Fixes: #263
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
This commit bumps the default config from 4.14.49 to 4.14.67 first,
and then enables the support for EFI firmware as OVMF used by NEMU
is an EFI firmware.
Fixes#220
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This documentation is a short document explaining how to make Kata
Containers running with the NEMU hypervisor.
Fixes#267
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This document decsribes how an Intel GPU can be used with
Kata Containers in GVT-g and GVT-d mode.
An example of an actual workload will be added in the future.
Fixes#260
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
The spec-template file looks for x86 specific
files irrespective of the arch on which
packaging is done for.
Fixes: #216
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Add patch to enable evged, the config option cannot be set normally since it
breaks current kata supported machine types.
Fixes: #214
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Update variables needed by osbuilder.
Also fix query to get the initrd base OS.
Fixes: #210
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
make -f .obs-packaging/Makefile clean fails with
"No such file or directory" even after deleting the
files returned by find. Fix it by using -prune.
Fixes: #203
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
We do not currently support enablement of `selinux` in the
dockerd config. Document that.
Fixes: #252
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
The repository URLs in the installation guides needs to point to the
latest release version.
This impact tests execution too (kata-manager uses this guides as
installation recipes).
Fixes: #255
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
This PR is based on #124 but has been reworked and updated to take into
account review feedback and extra cleanups to bring this howto in line
with the latest documentation requirements.
Fixes#127.
Signed-off-by: T. Nichole Williams <tribecca@tribecc.us>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
It is the only repo that requires LIBEXECDIR. Do not set it
for other repos, otherwise the runtime repo will mistakenly install
kata-netmon in a wrong path.
Signed-off-by: Peng Tao <bergwolf@gmail.com>
It fails with:
step 13/16 : RUN make clean
---> Running in 5385ba104ad8
find: '/root/qemu/tests/docker/dockerfiles/debian-alpha-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-hppa-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-m68k-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-mips64-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-powerpc-user-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-riscv64-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-sh4-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-sid.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-sparc64-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/debian-tricore-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/dockerfiles/fedora-i386-cross.docker': No such file or directory
find: '/root/qemu/tests/docker/test-debug': No such file or directory
find: '/root/qemu/tests/docker/test-unit': No such file or directory
Signed-off-by: Peng Tao <bergwolf@gmail.com>
We should make sure ${tag} fully matches otherwise we cannot
differentiate `1.3.0` vs. `1.3.0-rc1`, nor `1.3.0` vs. `11.3.0`.
Fixes: #196
Signed-off-by: Peng Tao <bergwolf@gmail.com>
The kata-runtime spec file, when specifying a `Requires:` version for
qemu-lite and qemu-vanilla, does not include the "+git.<commit hash>"
part.
As a result of this, versions of kata-runtime and qemu installed on a
system using RPM package management may be inconsistent.
Fixes: #193
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
unable to prepare context, unable to evaluate symlinks
in context path when building target test-packaging-tools
on ppc64le.
Fixes: #189
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
"make clean" errors out if snap/snapcraft.yaml file
does not exsist and the recipe for target 'clean'
fails. Avoid this my adding a "-f" option to rm to
have a clean state.
Fixes: #187
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
It is expected that this document will change over time. This
represents an initial starting point as we create and release
our stable branches.
Fixes: #237
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
We find for the tarball name with the package name.
If this is a `-rc` tha package versoin will have `~rc`,
lets replace `~` for `-` before get the tarball name.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Replace non-valid character from kata version.
This will make the version compatible with rpmbuild.
Fixes: #179
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Make sure we checkout the new_version tag before grabbing version
information from the runtime repository.
Fixes: #174.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Add script to generate release notes template
for runtime.
Will get the range of changes from two tags.
Get assets information from versions.yaml file.
Fixes: #169
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When tries to get release number from a new repo, the
specfile wont exist. Dont do grep in this case.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Clean target tries to remove files multiples times.
Limit find max depth to not try to remove files more than once.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Fix image generation.
Instead of use agent code from the host checkout to the
agent source code in a clean GOPATH env.
Make sure that the agent `commit id` is the correct before
push to github or OBS.
Fixes: #166
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We need to update the way CNI is handled which is
mostly how CNM is taken care of. Start of by removing
the incorrect steps documented for CNI.
Fixes#236
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Checkout to a target branch before create a tag.
We dont checkout for repos that does not have stable branches.
We want to do is just push the tags to master branch
since we don't maintain a seperate one.
The repos are:
osbuilder
packaging
ksm-throttler
Fixes: #163
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Lets point to qemu repository instead of kata fork
for qemu-vanilla.
Fixes: #161
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The osbuilder version file wont be the same if
we tag a stable branch. But we still want to tag
the HEAD of osbuilder to do reproducible builds of
a Kata branch.
Fixes: #158
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When we get changes from one version to a newer this
is empty because we dont get the current version.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We need to create the osc file before enter the container.
If build_all.sh is executed without a container and osc
is intalled osc will ask for setup but in the container
fails do to a missing tty.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When we genete packages file we want to see
the resulting files. This changes to now
create repos in a tmpdir.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If want to create pkgs based in new branch, this script will create
all the empty repositories in OBS for each kata package.
Then we can point use the rest of scripts to push changes to this new repo.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
pkglib.sh uses BUILD_ARCH and DEB_ARCH which are
by default not set and hence take the value of
x86_64 and amd64 respectively. Make this
architecture specific.
Fixes: #154
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
qemu-lite is required to be packaged only
for amd64 arch. Skip it for all other
architectures.
Fixes: #152
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Currently, since GOARCH is not passed as build-arg
to a Dockerfile, it by default always pick's up amd64
when building it. Also pass it as --env when running it.
Fixes#148
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Fixes#144
Current kata containers can't run with kernel 4.1 built from current x86 config,
it will report error:
```
$ docker run -ti --runtime kata busybox sh
docker: Error response from daemon: oci runtime error: rpc error: code = Internal
desc = Could not run process: container_linux.go:348: starting container process
caused "process_linux.go:402: container init caused \"open /dev/ptmx: no such
file or directory\"".
```
This is caused by bogus devpts mount options. When run container with docker,
docker will assign a default devpts mount for every container which equals to
command below:
```
$ mount -t devpts -o nosuid,noexec,newinstance,ptmxmode=0666,mode=0620,gid=5 \
devpts /dev/pts
```
This requires kernel config `CONFIG_DEVPTS_MULTIPLE_INSTANCES=y` to work properly
under kernel-4.1, but this option is already removed from latest kernel.
It's better to add it back for support older kernel than current 4.14.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
When pcre-tools is installed before build/obs-service-tar_scm
then "build-mkbaselibs-20180629-289.1.noarch.rpm" is installed
as dependency but OBS repo does not have that rpm. So install
"pcre-tools" at the end and dockerfile builds fine on ppc64le.
Fixes: #139
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
The minimum golang version should be defined *once* - in [1]. Since the
developer guide already provides a link to that human-readable file,
remove the hard-coded golang version number to avoid having to maintain
that part of the devguide.
Fixes#232.
[1] - https://github.com/kata-containers/runtime/blob/master/versions.yaml
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
1.2.0 release changed the tarball file layout for the
Kata artifacts. Adjust scripts accordingly.
Fixes: #142
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
create a use-cases subdirectory and add an initial use case,
booting a kata container which makes use of vpp vhost-user interface.
Fixes: #209
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
The tag_repos.sh script always check master. Now when we want
to know the version of kata we may want to choose a branch to check.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Kata is staring maintain mutiples braches. When we want to
update the project version now we need to have a target branch.
Add argument to choose kata branch we will use to create the PR.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Update install documentation guide for fedora to include the
support for fedora 28.
Fixes#218
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Now that this issue has solved kata-containers/packaging#39,
we can remove the workaround for the proxy and the shim.
Fixes#216
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
kata-deploy container image changed format slightly as we've changed
the release tarball. Update to 1.2.0 and make adjustments accordingly.
Fixes: #135
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
We want all the static qemu is intalled in /opt/kata
use PREFIX variable to notify to configure script.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We were using an static prefix let allow the user choose where will be installed.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If the file is not found fail. We use this file
to identify what config we use to build the kernel.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Based ina a branch will query the current
kata version and needed hashes.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Accelerate the build process by not creating image again.
Add DEBUG flag to docker run.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
update scripts are relative to this script go to
it and then try to update.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add script to generate a tarball with kata binaries install kata
whitout pkgs.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
For 1.2.0 we are changing the format of the kata-deploy container image.
To avoid issues, pull an explicit version in our daemonset.
In a follow on PR we'll update the yaml/scripts to 1.2.0 format
Fixes: #135
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
We want to create pacakges based in different branches modify
function to get the yaml version needed to to that.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add long name bash options.
Make xtrace optional when DEBUG variable is set.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If we can not find a release number in a file this means
it is an new repository. This could happend when upload changes
for a new brach.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Since there have not been any major architecture changes
in Kata Containers since 1.0 release, bump version to latest
1.2.0 release. Also, add another supported machine type
"pseries" for IBM Power Systems. A typo is also fixed in this
commit.
Fixes#210
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
The #usage part of README talks about
cross building snap images for all "supported
architectures" not "supported images".
Also fold the "Usage" part into "Cross-build
snap images" section.
Fixes: #131
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Improve README by pointing to a specific sub-section
in runtime repo that actually talks about the possibility
of having multiple configurations files.
Fixes: #129
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Kernel building fails as part of "make snap" as
the kernel config file is renamed from ppc64le_kata_kvm_4.14.x
to powerpc_kata_kvm_4.14.x
Fixes: #127
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
install_yq() has no arch entry for
ppc64le and hence installing yq
fails on Power systems.
Fixes: #124
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Add scripts to cross-build snap images for all supported
architectures using virtual machines
fixes#98
Signed-off-by: Julio Montes <julio.montes@intel.com>
The kernel config file name prefix changed from
ppc64le to powerpc. This change broke the kernel
build on ppc64le. Fix the kernel build steps
accordingly.
Fixes: #207
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
libcontainer limits the memory+swap usage by writing the limit at
/sys/fs/cgroup/memory/docker/$CONTID/memory.memsw.limit_in_bytes, this path
doesn't exist if CONFIG_MEMCG_SWAP and CONFIG_MEMCG_SWAP_ENABLED are not
enabled.
fixes#103
Signed-off-by: Julio Montes <julio.montes@intel.com>
Post Fix#111, the kernel config name is
expected is to be prefixed with powerpc instead
of ppc64le. Just rename the file to suit the scripts.
Fixes: #113
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Run new script in arm server to build kernel, then find a minor
typo. An extra "/" in the end of default_kernel_config_dir will
cause error:
ERROR: failed to find default config
../src/github.com/kata-containers/packaging/kernel/configs//aarch64_kata_kvm_4.14.x
Signed-off-by: Wei Chen <wei.chen@arm.com>
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
https://github.com/kata-containers/runtime/pull/527 Removed the
hard-coded `initcall_debug` kernel option (as it generates a lot of
kernel output at boot).
Add the `initcall_debug` option to the "Enable full debug" section to
allow users to enable these potentially useful messages when debugging.
Fixes#204.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
shift $((OPTIND-1)) can be unsafe.To prevent unwanted
word-splitting all parameter expansions should be
double-quoted. Use the safe form for the command:
shift "$((OPTIND-1))"
Fixes: #109
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
The debug console socket path looks to have moved from
the `sbs` dir to the `vm` dir. Update the docs to reflect
this.
Fixes: #202
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
After upgrading the kernel to latest 4.14.x kernel, there are new
kconfig options that need users to select in install-kata-kernel.sh.
the prompt will block the script. We update this config file to give
user a good defined default config.
The new kconfig options are about, “Meltdown” and “Spectre”. So I
selected them to yes by default in this config file:
CONFIG_ARM64_ERRATUM_1024718=y
CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
CONFIG_UNMAP_KERNEL_AT_EL0=y
CONFIG_HARDEN_BRANCH_PREDICTOR=y
CONFIG_ARM64_SSBD=y
Fixed#106
Signed-off-by: Wei Chen <wei.chen@arm.com>
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Build image with agent and osbuilder with master by default.
If want to build a release tag just use -v <version> and
will use that osbuilder and agent tag.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
a few qemu options generated by configure-hypervisor.sh were only
suitable for amd64, leading compilation err in aarch64.
Fixes: #92
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Wei Chen <Wei.Chen@arm.com>
Make sure kernel config version is validated on test.
Also, increse Kata Kernel config version.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Today we have instructions to build the kernel
but there are a lot of manual steps to get one kernel.
This tries to automate the process to setup a kernel
for kata.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Now that initial files for kata-deploy have merged, we
have an initial image on dockerhub. s/egernst/katadocker
Fixes: #100
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
A Dockerfile is created and reference daemonsets are also
provided for deploying Kata Containers onto a running Kubernetes
cluster. A few daemonsets are introduced:
1) runtime-labeler: This daemonset will create a label on each node in
the cluster identifying the CRI shim in use. For example,
container-runtime=crio or container-runtime=containerd.
2) crio and containerd kata installer: Assuming either CRIO or
containerd is the CRI runtime on the node (determined based on label
from (1),, either the crio or containerd variant will execute. These daemonsets
will install the VM artifacts and host binaries required for using
Kata Containers. Once installed, it will add a node label kata-runtime=true
and reconfigure either crio or containerd to make use of Kata for untrusted workloads.
As a final step it will restart the CRI shim and kubelet. Upon deletion,
the daemonset will remove the kata binaries and VM artifacts and update
the label to kata-runtime=cleanup.
3) crio and containerd cleanup: Either of these two daemonsets will run,
pending the container-runtime label value and if the node has label
kata-runtime=cleanup. This daemonset simply restarts crio/containerd as
well as kubelet. This was not feasible in a preStepHook, hence the
seperate cleanup step.
An RBAC is created to allow the daemonsets to modify labels on the node.
To deploy kata:
kubectl apply -f kata-rbac.yaml
kubectl apply -f kata-deploy.yaml
To remove kata:
kubectl delete -f kata-deploy.yaml
kubectl apply -f kata-cleanup.yaml
kubectl delete -f kata-cleanup.yaml
kubectl delete -f kata-rbac.yaml
This initial commit is based on contributions by a few folks on
github.com/egernst/kata-deploy
Also-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Signed-off-by: Ricardo Aravena <raravena@branch.io>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Add script that will help to bump versions for all the projects.
Fixes: #49
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Journald, by default on some systems, will rate limit log messages,
and in the case of our 'enable full debug', will likely drop some
of our debug.
Document how to identify if this is happening, and how to configure
`systemd-journald` appropriately.
Fixes: #181
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
the yaml file is the recipe to build a snap image
with all Kata Containers components inside.
fixes#81
Signed-off-by: Julio Montes <julio.montes@intel.com>
We populate all the conent of a OBS project.
Lets remove after we checkout to the OBS project.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Packaging scripts search for patches in a directory
called 'patches'. We store the kernel patches in a diferent place
to make easy to mantain them.
Lets do a symlink to allow the automation find the patches.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We dont have commit when we build the pkg.
Lets define the COMMIT variable to kwnow the commit from each project.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
the installation takes a long time without print anything
add verbose to know is doing something.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
On each release we want to know the kernel config what was used.
Lets create a tag ${kata_verson}-kernel-config.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Make sure the osbuilder VERSION file is updated before tag
Also, sort repos alphabetically.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This commit introduces the instructions to be able to run trendy
service mesh Istio and Conduit with Kata Containers. It provides
a bit of feedback on how they actually work to give the reader a
quick overview. After this introduction, it provides restrictions
and instructions to enable them with Kata Containers.
Fixes#171
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Excised an extraneous definite article in the install README.
How did we miss this in the review phase I wonder?
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add the new Google Compute Engine installation guide to the
installation README, reworking this doc to add in a table of contents
and a new "Cloud services" section.
Fixes#173.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Simplify the installation README by using relative URLs - let github
expand them automatically for readers.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The docker install guides end with a call to `docker run`. However, they
all specify `-ti` which is causing our CI to fail.
Remove the `-ti` so that the command works both under the CI and as
expected for the user.
Fixes#175.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Simplify the CI check that looks for modified install guides to catch
any modified document below `install/`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Build qemu if there is any change in static-build.
Do the same with the rest of projects in this repositoy.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This covers the GCP portion of #130.
Introduces a guide to configuring a VM image with nested virtualization. The
primary focus of the guide is the set of commands required for creating and
managing nested VMX GCE images. For Kata installation itself the guide defers
to the distribution-specific Kata documentation for actually installing Kata.
The upside is that it needn't be updated every time the instructions for a
given distribution change. The downside is that it is not a standalone
artifact.
Fixes: #155.
Signed-off-by: Jon Olson <jonolson@google.com>
Add functions to be used across the repository.
- get kata version deps
- die
- info
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
This repository is growing is due to different projects are living
here, kernel config and patches, obs scripts, kata-deploy, release tools.
Lets move the obs scripts to its own directory.
Fixes: #75
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
As the Developer-Guide[1] mentioned, the progress of installing
guest kernel images need a default kernel config file. But for
Arm64 architecture, this config file is missing.
In this patch, we provide a default Arm64 kernel config file for
Linux kernel 4.14.x.
Notes:
[1] https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md
Signed-off-by: Wei Chen <Wei.Chen@arm.com>
kernel_arch was being set to amd64 instead of x86_64
on intel. The kernel config file name starts with
x86_64 and hence this needs to be fixed.
Fixes: #158
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
This will add the RHEL installation guide for the OBS packages for
kata runtime.
Fixes#86
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The install guides assumed that users wanted to install Docker. Since
there are other container managers, split the Docker-specific
instructions into separate documents (with backlinks) and allow the
user to choose between Docker or Kubernetes from the install guides.
Fixes#144.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Ensure the first mention of all the distro names in the install guides:
- Have the required asterisk after the name.
- Have a link to the website.
Also folded the overly long lines.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
We have new CPU constraints docs, now in this repo. Update the
Limitations document to reflect that.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Move all the Zun docs into their own subdir, and add a
Zun reference and link to the top level README index.
Fixes: #131
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Puts the nested virt/bare metal requirement in the top line
of the Install Guide and references the Kata hardware
check.
Signed-off-by: Anne Bertucio <anne@openstack.org>
The design subdir README index was a little slim and
potentially out of date, and was missing hotlinks to
some documents that did exist.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
The top level README.md index for the repository was a bit
slim on entries.
Add all the other docs in this repo to the index, and sort them
alphabetically by symbolic name (which may be slightly different
from the filename itself).
Fixes: #146
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
This commit updates all 3 installation instructions related to
Ubuntu, Fedora and Centos, providing a disclaimer about the k8s
installation. Particularly, it says that those docs are only
explaining how to run Kata Containers with Docker, and that the
user should refer to the developer documentation to read how
to install Kata for k8s.
Fixes#134
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This documentation update purpose is to propose an alternative to the
default Docker usage that was described. The developer wanting to
interact with Kubernetes will have the proper information to start.
Fixes#134
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
A few assumption were made, making the steps not working directly on
a clean system.
Fixes#134
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Moved static tests to CI setup script and added a new CI test to
execute all install guides if any one changes.
Fixes#132.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Updated the `Upgrading.md` document to ensure users remove the Clear
Containers throttler package to avoid conflicts with the Kata Container
equivalent.
Fixes#138.
Signed-off-by: Liu Changcheng <changcheng.liu@intel.com>
Add the version of config and patches we are using in a package.
Kernel version before:
4.14.22-128
Now:
4.14.22.1-128
Fixes: #45
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to track the changes that we add to the kernel, lets
add a kata_config_version file that should be bumped whenever
a change is added to the kernel directory
Fixes#43.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add details to documentation requirements document explaining how we use
and format notes.
Fixes#125.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Update the install README and the install guides to point to the
upgrading document.
Fixes#119.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add document to describe to to setup kubernetes and "cri" containerd
Fixes: #87
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Now that we have a README for the installation guides, update the
developer guide to refer to that page, to avoid hard-coding links to
(some of) the installation guides.
Fixes#117.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The CentOS install guide was referencing an invalid package
(`dnf-plugins-core`) so update for the yum equivalent.
Fixes#329.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Remove backslash from centos url that was
preventing the $VERSION_ID to take its correct value.
Fixes: #112.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Remove the `bash` tag from the last command in the install guides where
we show the user how to create a container with a busybox shell. This
doesn't change the content of the document but it ensures that all bash
blocks can be run non-interactively (by the `kata-doc-to-script.sh`
script in the tests repo).
Fixes#109.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
cli events is now partly supported.It returns
the stats of a certain container.
OOM notification and Intel RDT stats are not supproted
as what runc does
Fixes: #103
Signed-off-by: Haomin <caihaomin@huawei.com>
Don't append to Kata-specific apt sources file to avoid apt warnings
and make the install idempotent.
Fixes#107.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Change the installation guides so that all commands the user must type
use an explicit bash code block rather than a standard code block.
This adds meaning to the documents and will then allow us to extract
the commands and run them for testing purposes.
Fixes#92.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Provide a pointer to the versions database to allow developers
to see the range of golang versions known to work.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add details of how to install the packaged components to start off with
a working system. This involved splitting out part of the "Assumptions"
section into a new "Initial setup" section.
Fixes#80.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add an appendix explaining how to setup a debug console to login to the
virtual machine for debugging.
Fixes#72.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
You cannot remove an existing rootfs directory without being `root`,
so use `sudo(8)` in the developer guide.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add a document explaining how to upgrade a system:
- Running Clear Containers.
- Already installed with Kata Containers binaries.
Also, include details of assets and how and when the are updated.
Fixes#69, fixes#78.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Allow kernel builds in fedora 28.
Ignore new warnings from gcc 8.
Fixes: #30
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We can only set disable fno-semantic-interposition if the
gcc used to build qemu is 5.3 or newer.
CentOS provides an older gcc, then we need to not enable this
option if it is the case.
Fixes#32.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add details to the developer guide of how to obtain a backtrace by
sending a `SIGUSR1` signal to the component.
Fixes#70.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add brief details to the developer guide explaining that the
log parser can convert the format of the logs.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
This commit adds the necessary spec files and scripts in order to be able to create
packages in OBS (Open Build System) and locally.
Fixes#15
Signed-off-by: Erick Cardona <erick.cardona.ruiz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The link to the release checklist in the Releases document was linking
to the parent document, not the separate checklist document.
Fixes#73.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Remove more references to Clear in the images as well as text.
Reworked some sections for grammar/flow.
Immediate next steps:
1. At least the delete/kill command section needs to be cleaned
up/clarified
2. Move CRI-O UML flow example to its own section, or subection of CRI-O
3. Carve up UML diagram for basic docker example case.
4. Add section describing initrd configuration
5. Add section detailing the gRPC protocol
6. Agent section needs cleaning around gRPC description.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
There is still a lot in progress, but sharing a first pass.
To do includes:
-need updated pngs (s/cc/kata)
-'signifcant' cmd section could use (simple) UML
-Need better location, and possible split up the CRI UML example
-need description of CRI-containerd
-Missing API extensions and description
Should likely carve this up into smaller .mds, as no one should read
that much text, and I don't want to get more than 200 review comments.
Contributed to by: Julio Montes, Archana Shinde, Sebastien Boeuf, and
the original CC-3.0 doc.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
If ssh keys are not set ssh clone wont work.
Clone using https and push using ssh.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
- Add tests for release tool tag_repos.sh
- Toplevel makefile
- Add make test target for CI
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add a document providing an overview of releases along with the
all-important release checklist.
Fixes#32.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add helper script to tag repos.
- Check all repos are in the same version
- Create annotated tags
- Push tags to the repos
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
If kata-runtime is already added as a runtime to
kata-containers.conf then you need not add it again.
Fixes: #49
Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
Set the qemu major and minor version variables in the hypervisor
configuration script.
Partially fixes#13.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Qemu 2.11 does not support --disable-static flag and
--enable-strip flag, this patch adds a condition
to only use it for qemu 2.7 or older.
Fixes: #11.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
Add details on how to disable the `initrd` config option to ensure the
base system as documented is functional.
Fixes#42.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
This script will ensure we use a single source of qemu build
options for the Kata Containers project.
Fixes: #7.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
The table of contents showed an incorrect link for building and
installing the runtime.
Fixes#40.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Change the command to enable agent debug slightly so that even if the
config file specifies kernel parameters, the command will successfully
enable the agent debug.
Fixes#38.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
To fulfill the kata design requirements, and based on the disscusion on
Virtcontainers API extentions, runtime API early sketch and runtime API
comparison, this commit added the high level design of the kata runtime
library API.
fixes: #26
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Adding initial kata requirements list, based off of discussion from
kata-containers/runtime issue #31.
Fixes: #18
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Require an additional approval from a `documentation` team member for
PRs containing documentation changes.
Fixes#4.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Require two approvals from documentation team members before a PR
can land.
The configuration file is the same as those used for the other repos,
except for the approval team name.
Fixes#2.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
You MUST choose one of `alpine`, `centos`, `clearlinux`, `debian`, `euleros`, `fedora`, `suse`, and `ubuntu` for `${distro}`. By default `seccomp` packages are not included in the rootfs image. Set `SECCOMP` to `yes` to include them.
> **Note:**
>
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
> - You must ensure that the *default Docker runtime* is `runc` to make use of
> the `USE_DOCKER` variable. If that is not the case, remove the variable
> from the previous command. See [Checking Docker default runtime](#checking-docker-default-runtime).
### Add a custom agent to the image - OPTIONAL
> **Note:**
>
> - You should only do this step if you are testing with the latest version of the agent.
You can build and install the guest kernel image as shown [here](../tools/packaging/kernel/README.md#build-kata-containers-kernel).
# Install a hypervisor
When setting up Kata using a [packaged installation method](install/README.md#installing-on-a-linux-system), the `qemu-lite` hypervisor is installed automatically. For other installation methods, you will need to manually install a suitable hypervisor.
## Build a custom QEMU
Your QEMU directory need to be prepared with source code. Alternatively, you can use the [Kata containers QEMU](https://github.com/kata-containers/qemu/tree/master) and checkout the recommended branch:
Kata containers provides two ways to connect to the guest. One is using traditional login service, which needs additional works. In contrast the simple debug console is easy to setup.
### Simple debug console setup
Kata Containers 2.0 supports a shell simulated *console* for quick debug purpose. This approach uses VSOCK to
connect to the shell running inside the guest which the agent starts. This method only requires the guest image to
contain either `/bin/sh` or `/bin/bash`.
#### Enable agent debug console
Enable debug_console_enabled in the `configuration.toml` configuration file:
```
[agent.kata]
debug_console_enabled = true
```
This will pass `agent.debug_console agent.debug_console_vport=1026` to agent as kernel parameters, and sandboxes created using this parameters will start a shell in guest if new connection is accept from VSOCK.
#### Start `kata-monitor`
The `kata-runtime exec` command needs `kata-monitor` to get the sandbox's `vsock` address to connect to, first start `kata-monitor`.
```
$ sudo kata-monitor
```
`kata-monitor` will serve at `localhost:8090` by default.
#### Connect to debug console
Command `kata-runtime exec` is used to connect to the debug console.
If you want to access guest OS through a traditional way, see [Traditional debug console setup)](#traditional-debug-console-setup).
### Traditional debug console setup
By default you cannot login to a virtual machine, since this can be sensitive
from a security perspective. Also, allowing logins would require additional
packages in the rootfs, which would increase the size of the image used to
boot the virtual machine.
If you want to login to a virtual machine that hosts your containers, complete
the following steps (using rootfs or initrd image).
> **Note:** The following debug console instructions assume a systemd-based guest
> O/S image. This means you must create a rootfs for a distro that supports systemd.
> Currently, all distros supported by [osbuilder](../tools/osbuilder) support systemd
> except for Alpine Linux.
>
> Look for `INIT_PROCESS=systemd` in the `config.sh` osbuilder rootfs config file
> to verify an osbuilder distro supports systemd for the distro you want to build rootfs for.
> For an example, see the [Clear Linux config.sh file](../tools/osbuilder/rootfs-builder/clearlinux/config.sh).
>
> For a non-systemd-based distro, create an equivalent system
> service using that distro’s init system syntax. Alternatively, you can build a distro
> that contains a shell (e.g. `bash(1)`). In this circumstance it is likely you need to install
> additional packages in the rootfs and add “agent.debug_console” to kernel parameters in the runtime
> config file. This tells the Kata agent to launch the console directly.
>
> Once these steps are taken you can connect to the virtual machine using the [debug console](Developer-Guide.md#connect-to-the-virtual-machine-using-the-debug-console).
#### Create a custom image containing a shell
To login to a virtual machine, you must
[create a custom rootfs](#create-a-rootfs-image) or [custom initrd](#create-an-initrd-image---optional)
containing a shell such as `bash(1)`. For Clear Linux, you will need
an additional `coreutils` package.
For example using CentOS:
```
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
If you wish to raise an issue for a new limitation, either
[raise an issue directly on the runtime](https://github.com/kata-containers/kata-containers/issues/new)
or see the
[project table of contents](https://github.com/kata-containers/kata-containers)
for advice on which repository to raise the issue against.
# Pending items
This section lists items that might be possible to fix.
## Runtime commands
### checkpoint and restore
The runtime does not provide `checkpoint` and `restore` commands. There
are discussions about using VM save and restore to give [`criu`](https://github.com/checkpoint-restore/criu)-like functionality, which might provide a solution.
Note that the OCI standard does not specify `checkpoint` and `restore`
commands.
See issue https://github.com/kata-containers/runtime/issues/184 for more information.
### events command
The runtime does not fully implement the `events` command. `OOM` notifications and `Intel RDT` stats are not fully supported.
Note that the OCI standard does not specify an `events` command.
See issue https://github.com/kata-containers/runtime/issues/308 and https://github.com/kata-containers/runtime/issues/309 for more information.
### update command
Currently, only block I/O weight is not supported.
All other configurations are supported and are working properly.
## Networking
### Docker swarm and compose support
The newest version of Docker supported is specified by the
See issue https://github.com/kata-containers/runtime/issues/175 for more information.
Docker compose normally uses custom networks, so also has the same limitations.
## Resource management
Due to the way VMs differ in their CPU and memory allocation, and sharing
across the host system, the implementation of an equivalent method for
these commands is potentially challenging.
See issue https://github.com/clearcontainers/runtime/issues/341 and [the constraints challenge](#the-constraints-challenge) for more information.
For CPUs resource management see
[CPU constraints](design/vcpu-handling.md).
### docker run and shared memory
The runtime does not implement the `docker run --shm-size` command to
set the size of the `/dev/shm tmpfs` within the container. It is possible to pass this configuration value into the VM container so the appropriate mount command happens at launch time.
See issue https://github.com/kata-containers/kata-containers/issues/21 for more information.
### docker run and sysctl
The `docker run --sysctl` feature is not implemented. At the runtime
level, this equates to the `linux.sysctl` OCI configuration. Docker
allows configuring the sysctl settings that support namespacing. From a security and isolation point of view, it might make sense to set them in the VM, which isolates sysctl settings. Also, given that each Kata Container has its own kernel, we can support setting of sysctl settings that are not namespaced. In some cases, we might need to support configuring some of the settings on both the host side Kata Container namespace and the Kata Containers kernel.
See issue https://github.com/kata-containers/runtime/issues/185 for more information.
## Docker daemon features
Some features enabled or implemented via the
[`dockerd` daemon](https://docs.docker.com/config/daemon/) configuration are not yet
implemented.
### SELinux support
The `dockerd` configuration option `"selinux-enabled": true` is not presently implemented
in Kata Containers. Enabling this option causes an OCI runtime error.
See issue https://github.com/kata-containers/runtime/issues/784 for more information.
The consequence of this is that the [Docker --security-opt is only partially supported](#docker---security-opt-option-partially-supported).
Kubernetes [SELinux labels](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#assign-selinux-labels-to-a-container) will also not be applied.
# Architectural limitations
This section lists items that might not be fixed due to fundamental
architectural differences between "soft containers" (i.e. traditional Linux*
containers) and those based on VMs.
## Networking limitations
### Support for joining an existing VM network
Docker supports the ability for containers to join another containers
namespace with the `docker run --net=containers` syntax. This allows
multiple containers to share a common network namespace and the network
interfaces placed in the network namespace. Kata Containers does not
support network namespace sharing. If a Kata Container is setup to
share the network namespace of a `runc` container, the runtime
effectively takes over all the network interfaces assigned to the
namespace and binds them to the VM. Consequently, the `runc` container loses
its network connectivity.
### docker --net=host
Docker host network support (`docker --net=host run`) is not supported.
It is not possible to directly access the host networking configuration
from within the VM.
The `--net=host` option can still be used with `runc` containers and
inter-mixed with running Kata Containers, thus enabling use of `--net=host`
when necessary.
It should be noted, currently passing the `--net=host` option into a
Kata Container may result in the Kata Container networking setup
modifying, re-configuring and therefore possibly breaking the host
networking setup. Do not use `--net=host` with Kata Containers.
### docker run --link
The runtime does not support the `docker run --link` command. This
command is now deprecated by docker and we have no intention of adding support.
Equivalent functionality can be achieved with the newer docker networking commands.
The `--security-opt=` option used by Docker is partially supported.
We only support `--security-opt=no-new-privileges` and `--security-opt seccomp=/path/to/seccomp/profile.json`
option as of today.
Note: The `--security-opt apparmor=your_profile` is not yet supported. See https://github.com/kata-containers/runtime/issues/707.
# Appendices
## The constraints challenge
Applying resource constraints such as cgroup, CPU, memory, and storage to a workload is not always straightforward with a VM based system. A Kata Container runs in an isolated environment inside a virtual machine. This, coupled with the architecture of Kata Containers, offers many more possibilities than are available to traditional Linux containers due to the various layers and contexts.
In some cases it might be necessary to apply the constraints to multiple levels. In other cases, the hardware isolated VM provides equivalent functionality to the the requested constraint.
The following examples outline some of the various areas constraints can be applied:
- Inside the VM
Constrain the guest kernel. This can be achieved by passing particular values through the kernel command line used to boot the guest kernel. Alternatively, sysctl values can be applied at early boot.
- Inside the container
Constrain the container created inside the VM.
- Outside the VM:
- Constrain the hypervisor process by applying host-level constraints.
- Constrain all processes running inside the hypervisor.
This can be achieved by specifying particular hypervisor configuration options.
Note that in some circumstances it might be necessary to apply particular constraints
to more than one of the previous areas to achieve the desired level of isolation and resource control.
* [Installation guides](./install/README.md): Install and run Kata Containers with Docker or Kubernetes
## More User Guides
* [Upgrading](Upgrading.md): how to upgrade from [Clear Containers](https://github.com/clearcontainers) and [runV](https://github.com/hyperhq/runv) to [Kata Containers](https://github.com/kata-containers) and how to upgrade an existing Kata Containers system to the latest version.
* [Limitations](Limitations.md): differences and limitations compared with the default [Docker](https://www.docker.com/) runtime,
[`runc`](https://github.com/opencontainers/runc).
### Howto guides
See the [howto documentation](how-to).
## Kata Use-Cases
* [GPU Passthrough with Kata](./use-cases/GPU-passthrough-and-Kata.md)
* [OpenStack Zun with Kata Containers](./use-cases/zun_kata.md)
* [SR-IOV with Kata](./use-cases/using-SRIOV-and-kata.md)
* [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md)
* [VPP with Kata](./use-cases/using-vpp-and-kata.md)
* [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md)
## Developer Guide
Documents that help to understand and contribute to Kata Containers.
### Design and Implementations
* [Kata Containers Architecture](design/architecture.md): Architectural overview of Kata Containers
* [Kata Containers design](./design/README.md): More Kata Containers design documents
### How to Contribute
* [Developer Guide](Developer-Guide.md): Setup the Kata Containers developing environments
* [How to contribute to Kata Containers](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md)
* [Code of Conduct](../CODE_OF_CONDUCT.md)
### Code Licensing
* [Licensing](Licensing-strategy.md): About the licensing strategy of Kata Containers.
- [How to do a Kata Containers Release](#how-to-do-a-kata-containers-release)
- [Requirements](#requirements)
- [Release Process](#release-process)
- [Bump all Kata repositories](#bump-all-kata-repositories)
- [Merge all bump version Pull requests](#merge-all-bump-version-pull-requests)
- [Tag all Kata repositories](#tag-all-kata-repositories)
- [Check Git-hub Actions](#check-git-hub-actions)
- [Create release notes](#create-release-notes)
- [Announce the release](#announce-the-release)
<!-- TOC END -->
## Requirements
- [hub](https://github.com/github/hub)
- OBS account with permissions on [`/home:katacontainers`](https://build.opensuse.org/project/subprojects/home:katacontainers)
- GitHub permissions to push tags and create releases in Kata repositories.
- GPG configured to sign git tags. https://help.github.com/articles/generating-a-new-gpg-key/
- You should configure your GitHub to use your ssh keys (to push to branches). See https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/.
* As an alternative, configure hub to push and fork with HTTPS, `git config --global hub.protocol https` (Not tested yet) *
## Release Process
### Bump all Kata repositories
- We have set up a Jenkins job to bump the version in the `VERSION` file in all Kata repositories. Go to the [Jenkins bump-job page](http://jenkins.katacontainers.io/job/release/build) to trigger a new job.
- Start a new job with variables for the job passed as:
-`BRANCH=<the-branch-you-want-to-bump>`
-`NEW_VERSION=<the-new-kata-version>`
For example, in the case where you want to make a patch release `1.10.2`, the variable `NEW_VERSION` should be `1.10.2` and `BRANCH` should point to `stable-1.10`. In case of an alpha or release candidate release, `BRANCH` should point to `master` branch.
Alternatively, you can also bump the repositories using a script in the Kata packaging repo
```
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
- The above step will create a GitHub pull request in the Kata projects. Trigger the CI using `/test` command on each bump Pull request.
- Check any failures and fix if needed.
- Work with the Kata approvers to verify that the CI works and the pull requests are merged.
### Tag all Kata repositories
Once all the pull requests to bump versions in all Kata repositories are merged,
tag all the repositories as shown below.
```
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
$ git checkout <kata-branch-to-release>
$ git pull
$ ./tag_repos.sh -p -b "$BRANCH" tag
```
### Check Git-hub Actions
We make use of [GitHub actions](https://github.com/features/actions) in this [file](https://github.com/kata-containers/kata-containers/blob/master/.github/workflows/main.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-conatiners` repository.
Check the [actions status page](https://github.com/kata-containers/kata-containers/actions) to verify all steps in the actions workflow have completed successfully. On success, a static tarball containing Kata release artifacts will be uploaded to the [Release page](https://github.com/kata-containers/kata-containers/releases).
### Create release notes
We have a script in place in the packaging repository to create release notes that include a short-log of the commits across Kata components.
Run the script as shown below:
```
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
# Note: OLD_VERSION is where the script should start to get changes.
Branch and release maintenance for the Kata Containers project.
## Introduction
This document provides details about Kata Containers releases.
## Versioning
The Kata Containers project uses [semantic versioning](http://semver.org/) for all releases.
Semantic versions are comprised of three fields in the form:
```
MAJOR.MINOR.PATCH
```
For examples: `1.0.0`, `1.0.0-rc.5`, and `99.123.77+foo.bar.baz.5`.
Semantic versioning is used since the version number is able to convey clear
information about how a new version relates to the previous version.
For example, semantic versioning can also provide assurances to allow users to know
when they must upgrade compared with when they might want to upgrade:
- When `PATCH` increases, the new release contains important **security fixes**
and an upgrade is recommended.
The patch field can contain extra details after the number.
Dashes denote pre-release versions. `1.0.0-rc.5` in the example denotes the fifth release
candidate for release `1.0.0`. Plus signs denote other details. In our example, `+foo.bar.baz.5`
provides additional information regarding release `99.123.77` in the previous example.
- When `MINOR` increases, the new release adds **new features** but *without
changing the existing behavior*.
- When `MAJOR` increases, the new release adds **new features, bug fixes, or
both** and which *changes the behavior from the previous release* (incompatible with previous releases).
A major release will also likely require a change of the container manager version used,
for example Docker\*. Please refer to the release notes for further details.
## Release Strategy
Any new features added since the last release will be available in the next minor
release. These will include bug fixes as well. To facilitate a stable user environment,
Kata provides stable branch-based releases and a master branch release.
## Stable branch patch criteria
No new features should be introduced to stable branches. This is intended to limit risk to users,
providing only bug and security fixes.
## Branch Management
Kata Containers will maintain two stable release branches in addition to the master branch.
Once a new MAJOR or MINOR release is created from master, a new stable branch is created for
the prior MAJOR or MINOR release and the older stable branch is no longer maintained. End of
maintenance for a branch is announced on the Kata Containers mailing list. Users can determine
the version currently installed by running `kata-runtime kata-env`. It is recommended to use the
latest stable branch available.
A couple of examples follow to help clarify this process.
### New bug fix introduced
A bug fix is submitted against the runtime which does not introduce new inter-component dependencies.
This fix is applied to both the master and stable branches, and there is no need to create a new
stable branch.
| Branch | Original version | New version |
|--|--|--|
| `master` | `1.3.0-rc0` | `1.3.0-rc1` |
| `stable-1.2` | `1.2.0` | `1.2.1` |
| `stable-1.1` | `1.1.2` | `1.1.3` |
### New release made feature or change adding new inter-component dependency
A new feature is introduced, which adds a new inter-component dependency. In this case a new stable
branch is created (stable-1.3) starting from master and the older stable branch (stable-1.1)
is dropped from maintenance.
| Branch | Original version | New version |
|--|--|--|
| `master` | `1.3.0-rc1` | `1.3.0` |
| `stable-1.3` | N/A| `1.3.0` |
| `stable-1.2` | `1.2.1` | `1.2.2` |
| `stable-1.1` | `1.1.3` | (unmaintained) |
Note, the stable-1.1 branch will still exist with tag 1.1.3, but under current plans it is
not maintained further. The next tag applied to master will be 1.4.0-alpha0. We would then
create a couple of alpha releases gathering features targeted for that particular release (in
this case 1.4.0), followed by a release candidate. The release candidate marks a feature freeze.
A new stable branch is created for the release candidate. Only bug fixes and any security issues
are added to the branch going forward until release 1.4.0 is made.
## Backporting Process
Development that occurs against the master branch and applicable code commits should also be submitted
against the stable branches. Some guidelines for this process follow::
1. Only bug and security fixes which do not introduce inter-component dependencies are
candidates for stable branches. These PRs should be marked with "bug" in GitHub.
2. Once a PR is created against master which meets requirement of (1), a comparable one
should also be submitted against the stable branches. It is the responsibility of the submitter
to apply their pull request against stable, and it is the responsibility of the
reviewers to help identify stable-candidate pull requests.
## Continuous Integration Testing
The test repository is forked to create stable branches from master. Full CI
runs on each stable and master PR using its respective tests repository branch.
### An alternative method for CI testing:
Ideally, the continuous integration infrastructure will run the same test suite on both master
and the stable branches. When tests are modified or new feature tests are introduced, explicit
logic should exist within the testing CI to make sure only applicable tests are executed against
stable and master. While this is not in place currently, it should be considered in the long term.
## Release Management
### Patch releases
Releases are made every three weeks, which include a GitHub release as
well as binary packages. These patch releases are made for both stable branches, and a "release candidate"
for the next `MAJOR` or `MINOR` is created from master. If there are no changes across all the repositories, no
release is created and an announcement is made on the developer mailing list to highlight this.
If a release is being made, each repository is tagged for this release, regardless
of whether changes are introduced. The release schedule can be seen on the
[release rotation wiki page](https://github.com/kata-containers/community/wiki/Release-Team-Rota).
If there is urgent need for a fix, a patch release will be made outside of the planned schedule.
The process followed for making a release can be found at [Release Process](Release-Process.md).
## Minor releases
### Frequency
Minor releases are less frequent in order to provide a more stable baseline for users. They are currently
running on a twelve week cadence. As the Kata Containers code base has reached a certain level of
maturity, we have increased the cadence from six weeks to twelve weeks. The release schedule can be seen on the
[release rotation wiki page](https://github.com/kata-containers/community/wiki/Release-Team-Rota).
### Compatibility
Kata guarantees compatibility between components that are within one minor release of each other.
This is critical for dependencies which cross between host (runtime, shim, proxy) and
the guest (hypervisor, rootfs and agent). For example, consider a cluster with a long-running
deployment, workload-never-dies, all on Kata version 1.1.3 components. If the operator updates
the Kata components to the next new minor release (i.e. 1.2.0), we need to guarantee that the 1.2.0
runtime still communicates with 1.1.3 agent within workload-never-dies.
Handling live-update is out of the scope of this document. See this [`kata-runtime` issue](https://github.com/kata-containers/runtime/issues/492) for details.
- [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
- [Appendices](#appendices)
- [DAX](#dax)
## Overview
This is an architectural overview of Kata Containers, based on the 2.0 release.
The primary deliverable of the Kata Containers project is a CRI friendly shim. There is also a CRI friendly library API behind them.
The [Kata Containers runtime](../../src/runtime)
is compatible with the [OCI](https://github.com/opencontainers) [runtime specification](https://github.com/opencontainers/runtime-spec)
and therefore works seamlessly with the [Kubernetes\* Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/container-runtime-interface.md)
through the [CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and
Kata Containers creates a QEMU\*/KVM virtual machine for pod that `kubelet` (Kubernetes) creates respectively.
The [`containerd-shim-kata-v2` (shown as `shimv2` from this point onwards)](../../src/runtime/containerd-shim-v2)
is the Kata Containers entrypoint, which
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2) for Kata.
Before `shimv2` (as done in [Kata Containers 1.x releases](https://github.com/kata-containers/runtime/releases)), we need to create a `containerd-shim` and a [`kata-shim`](https://github.com/kata-containers/shim) for each container and the Pod sandbox itself, plus an optional [`kata-proxy`](https://github.com/kata-containers/proxy) when VSOCK is not available. With `shimv2`, Kubernetes can launch Pod and OCI compatible containers with one shim (the `shimv2`) per Pod instead of `2N+1` shims, and no standalone `kata-proxy` process even if no VSOCK is available.

The container process is then spawned by
[`kata-agent`](../../src/agent), an agent process running
as a daemon inside the virtual machine. `kata-agent` runs a [`ttRPC`](https://github.com/containerd/ttrpc-rust) server in
the guest using a VIRTIO serial or VSOCK interface which QEMU exposes as a socket
file on the host. `shimv2` uses a `ttRPC` protocol to communicate with
the agent. This protocol allows the runtime to send container management
commands to the agent. The protocol is also used to carry the I/O streams (stdout,
stderr, stdin) between the containers and the manage engines (e.g. CRI-O or containerd).
For any given container, both the init process and all potentially executed
commands within that container, together with their related I/O streams, need
to go through the VSOCK interface exported by QEMU.
The container workload, that is, the actual OCI bundle rootfs, is exported from the
host to the virtual machine. In the case where a block-based graph driver is
configured, `virtio-scsi` will be used. In all other cases a 9pfs VIRTIO mount point
will be used. `kata-agent` uses this mount point as the root filesystem for the
container processes.
## Virtualization
How Kata Containers maps container concepts to virtual machine technologies, and how this is realized in the multiple
hypervisors and VMMs that Kata supports is described within the [virtualization documentation](./virtualization.md)
## Guest assets
The hypervisor will launch a virtual machine which includes a minimal guest kernel
and a guest image.
### Guest kernel
The guest kernel is passed to the hypervisor and used to boot the virtual
machine. The default kernel provided in Kata Containers is highly optimized for
kernel boot time and minimal memory footprint, providing only those services
required by a container workload. This is based on a very current upstream Linux
kernel.
### Guest image
Kata Containers supports both an `initrd` and `rootfs` based minimal guest image.
#### Root filesystem image
The default packaged root filesystem image, sometimes referred to as the "mini O/S", is a
highly optimized container bootstrap system based on [Clear Linux](https://clearlinux.org/). It provides an extremely minimal environment and
has a highly optimized boot path.
The only services running in the context of the mini O/S are the init daemon
(`systemd`) and the [Agent](#agent). The real workload the user wishes to run
is created using libcontainer, creating a container in the same manner that is done
by `runc`.
For example, when `ctr run -ti ubuntu date` is run:
- The hypervisor will boot the mini-OS image using the guest kernel.
-`systemd`, running inside the mini-OS context, will launch the `kata-agent` in
the same context.
- The agent will create a new confined context to run the specified command in
(`date` in this example).
- The agent will then execute the command (`date` in this example) inside this
new context, first setting the root filesystem to the expected Ubuntu\* root
filesystem.
#### Initrd image
A compressed `cpio(1)` archive, created from a rootfs which is loaded into memory and used as part of the Linux startup process. During startup, the kernel unpacks it into a special instance of a `tmpfs` that becomes the initial root filesystem.
The only service running in the context of the initrd is the [Agent](#agent) as the init daemon. The real workload the user wishes to run is created using libcontainer, creating a container in the same manner that is done by `runc`.
## Agent
[`kata-agent`](../../src/agent) is a process running in the guest as a supervisor for managing containers and processes running within those containers.
For the 2.0 release, the `kata-agent` is rewritten in the [RUST programming language](https://www.rust-lang.org/) so that we can minimize its memory footprint while keeping the memory safety of the original GO version of [`kata-agent` used in Kata Container 1.x](https://github.com/kata-containers/agent). This memory footprint reduction is pretty impressive, from tens of megabytes down to less than 100 kilobytes, enabling Kata Containers in more use cases like functional computing and edge computing.
The `kata-agent` execution unit is the sandbox. A `kata-agent` sandbox is a container sandbox defined by a set of namespaces (NS, UTS, IPC and PID). `shimv2` can
run several containers per VM to support container engines that require multiple
containers running inside a pod.
`kata-agent` communicates with the other Kata components over `ttRPC`.
## Runtime
`containerd-shim-kata-v2` is a [containerd runtime shimv2](https://github.com/containerd/containerd/blob/v1.4.1/runtime/v2/README.md) implementation and is responsible for handling the `runtime v2 shim APIs`, which is similar to [the OCI runtime specification](https://github.com/opencontainers/runtime-spec) but simplifies the architecture by loading the runtime once and making RPC calls to handle the various container lifecycle commands. This refinement is an improvement on the OCI specification which requires the container manager call the runtime binary multiple times, at least once for each lifecycle command.
`containerd-shim-kata-v2` heavily utilizes the
[virtcontainers package](../../src/runtime/virtcontainers/), which provides a generic, runtime-specification agnostic, hardware-virtualized containers library.
### Configuration
The runtime uses a TOML format configuration file called `configuration.toml`. By default this file is installed in the `/usr/share/defaults/kata-containers` directory and contains various settings such as the paths to the hypervisor, the guest kernel and the mini-OS image.
The actual configuration file paths can be determined by running:
```
$ kata-runtime --kata-show-default-config-paths
```
Most users will not need to modify the configuration file.
The file is well commented and provides a few "knobs" that can be used to modify the behavior of the runtime and your chosen hypervisor.
The configuration file is also used to enable runtime [debug output](../Developer-Guide.md#enable-full-debug).
## Networking
Containers will typically live in their own, possibly shared, networking namespace.
At some point in a container lifecycle, container engines will set up that namespace
to add the container to a network which is isolated from the host network, but
which is shared between containers
In order to do so, container engines will usually add one end of a virtual
ethernet (`veth`) pair into the container networking namespace. The other end of
the `veth` pair is added to the host networking namespace.
This is a very namespace-centric approach as many hypervisors/VMMs cannot handle `veth`
interfaces. Typically, `TAP` interfaces are created for VM connectivity.
To overcome incompatibility between typical container engines expectations
and virtual machines, Kata Containers networking transparently connects `veth`
Container workloads are shared with the virtualized environment through [virtio-fs](https://virtio-fs.gitlab.io/).
The [devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/master/snapshots/devmapper) is a special case. The `snapshotter` uses dedicated block devices rather than formatted filesystems, and operates at the block level rather than the file level. This knowledge is used to directly use the underlying block device instead of the overlay file system for the container root file system. The block device maps to the top read-write layer for the overlay. This approach gives much better I/O performance compared to using `virtio-fs` to share the container file system.
Kata Containers has the ability to hotplug and remove block devices, which makes it possible to use block devices for containers started after the VM has been launched.
Users can check to see if the container uses the devicemapper block device as its rootfs by calling `mount(8)` within the container. If the devicemapper block device
is used, `/` will be mounted on `/dev/vda`. Users can disable direct mounting of the underlying block device through the runtime configuration.
## Kubernetes support
[Kubernetes\*](https://github.com/kubernetes/kubernetes/) is a popular open source
container orchestration engine. In Kubernetes, a set of containers sharing resources
such as networking, storage, mount, PID, etc. is called a
A Kubernetes cluster runs a control plane where a scheduler (typically running on a
dedicated master node) calls into a compute Kubelet. This Kubelet instance is
responsible for managing the lifecycle of pods within the nodes and eventually relies
on a container runtime to handle execution. The Kubelet architecture decouples
lifecycle management from container execution through the dedicated
`gRPC` based [Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/container-runtime-interface-v1.md).
In other words, a Kubelet is a CRI client and expects a CRI implementation to
handle the server side of the interface.
[CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and [Containerd\*](https://github.com/containerd/containerd/) are CRI implementations that rely on [OCI](https://github.com/opencontainers/runtime-spec)
compatible runtimes for managing container instances.
Kata Containers is an officially supported CRI-O and Containerd runtime. Refer to the following guides on how to set up Kata Containers with Kubernetes:
- [How to use Kata Containers and Containerd](../how-to/containerd-kata.md)
- [Run Kata Containers with Kubernetes](../how-to/run-kata-with-k8s.md)
#### OCI annotations
In order for the Kata Containers runtime (or any virtual machine based OCI compatible
runtime) to be able to understand if it needs to create a full virtual machine or if it
has to create a new container inside an existing pod's virtual machine, CRI-O adds
specific annotations to the OCI configuration file (`config.json`) which is passed to
the OCI compatible runtime.
Before calling its runtime, CRI-O will always add a `io.kubernetes.cri-o.ContainerType`
annotation to the `config.json` configuration file it produces from the Kubelet CRI
request. The `io.kubernetes.cri-o.ContainerType` annotation can either be set to `sandbox`
or `container`. Kata Containers will then use this annotation to decide if it needs to
respectively create a virtual machine or a container inside a virtual machine associated
> **Note:** Since Kubernetes 1.12, the [`Kubernetes RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/)
> has been supported and the user can specify runtime without the non-standardized annotations.
With `RuntimeClass`, users can define Kata Containers as a `RuntimeClass` and then explicitly specify that a pod being created as a Kata Containers pod. For details, please refer to [How to use Kata Containers and Containerd](../../docs/how-to/containerd-kata.md).
# Appendices
## DAX
Kata Containers utilizes the Linux kernel DAX [(Direct Access filesystem)](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.txt)
feature to efficiently map some host-side files into the guest VM space.
In particular, Kata Containers uses the QEMU NVDIMM feature to provide a
memory-mapped virtual device that can be used to DAX map the virtual machine's
root filesystem into the guest memory address space.
Mapping files using DAX provides a number of benefits over more traditional VM
file and device mapping mechanisms:
- Mapping as a direct access devices allows the guest to directly access
the host memory pages (such as via Execute In Place (XIP)), bypassing the guest
page cache. This provides both time and space optimizations.
- Mapping as a direct access device inside the VM allows pages from the
host to be demand loaded using page faults, rather than having to make requests
via a virtualized device (causing expensive VM exits/hypercalls), thus providing
a speed optimization.
- Utilizing `MAP_SHARED` shared memory on the host allows the host to efficiently
share pages.
Kata Containers uses the following steps to set up the DAX mappings:
1. QEMU is configured with an NVDIMM memory device, with a memory file
backend to map in the host-side file into the virtual NVDIMM space.
2. The guest kernel command line mounts this NVDIMM device with the DAX
feature enabled, allowing direct page mapping and access, thus bypassing the
guest page cache.

Information on the use of NVDIMM via QEMU is available in the [QEMU source code](http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/nvdimm.txt;hb=HEAD)
| `SandboxCgroupOnly=false` | yes | legacy | Easiest to make Kata work | Unaccounted for memory and resource utilization | v1
| `SandboxCgroupOnly=true` | no | recommended | Complete tracking of Kata memory and CPU utilization. In Kubernetes, the Kubelet can fully constrain Kata via the pod cgroup | Requires upper layer orchestrator which sizes sandbox cgroup appropriately | v1, v2
Kata implement CRI's API and support [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose containers metrics. User can use these interface to get basic metrics about container.
But unlike `runc`, Kata is a VM-based runtime and has a different architecture.
## Limitations of Kata 1.x and the target of Kata 2.0
Kata 1.x has a number of limitations related to observability that may be obstacles to running Kata Containers at scale.
In Kata 2.0, the following components will be able to provide more details about the system.
- containerd shim v2 (effectively `kata-runtime`)
- Hypervisor statistics
- Agent process
- Guest OS statistics
> **Note**: In Kata 1.x, the main user-facing component was the runtime (`kata-runtime`). From 1.5, Kata then introduced the Kata containerd shim v2 (`containerd-shim-kata-v2`) which is essentially a modified runtime that is loaded by containerd to simplify and improve the way VM-based containers are created and managed.
>
> For Kata 2.0, the main component is the Kata containerd shim v2, although the deprecated `kata-runtime` binary will be maintained for a period of time.
>
> Any mention of the "Kata runtime" in this document should be taken to refer to the Kata containerd shim v2 unless explicitly noted otherwise (for example by referring to it explicitly as the `kata-runtime` binary).
## Metrics architecture
Kata 2.0 metrics strongly depend on [Prometheus](https://prometheus.io/), a graduated project from CNCF.
Kata Containers 2.0 introduces a new Kata component called `kata-monitor` which is used to monitor the other Kata components on the host. It's the monitor interface with Kata runtime, and we can do something like these:
- Get metrics
- Get events
In this document we will cover metrics only. And until now it only supports metrics function.
This is the architecture overview metrics in Kata Containers 2.0.
For a quick evaluation, you can check out [this how to](../how-to/how-to-set-prometheus-in-k8s.md).
### Kata monitor
`kata-monitor` is a management agent on one node, where many Kata containers are running. `kata-monitor`'s work include:
> **Note**: node is a single host system or a node in K8s clusters.
- Aggregate sandbox metrics running on this node, and add `sandbox_id` label
- As a Prometheus target, all metrics from Kata shim on this node will be collected by Prometheus indirectly. This can easy the targets count in Prometheus, and also need not to expose shim's metrics by `ip:port`
Only one `kata-monitor` process are running on one node.
`kata-monitor` is using a different communication channel other than that `conatinerd` communicating with Kata shim, and Kata shim listen on a new socket address for communicating with `kata-monitor`.
The way `kata-monitor` get shim's metrics socket file(`monitor_address`) like that `containerd` get shim address. The socket is an abstract socket and saved as file `abstract` with the same directory of `address` for `containerd`.
> **Note**: If there is no Prometheus server is configured, i.e., there is no scrape operations, `kata-monitor` will do nothing initiative.
### Kata runtime
Runtime is responsible for:
- Gather metrics about shim process
- Gather metrics about hypervisor process
- Gather metrics about running sandbox
- Get metrics from Kata agent(through `ttrpc`)
### Kata agent
Agent is responsible for:
- Gather agent process metrics
- Gather guest OS metrics
And in Kata 2.0, agent will add a new interface:
```protobuf
rpcGetMetrics(GetMetricsRequest)returns(Metrics);
messageGetMetricsRequest{}
messageMetrics{
stringmetrics=1;
}
```
The `metrics` field is Prometheus encoded content. This can avoid defining a fixed structure in protocol buffers.
### Performance and overhead
Metrics should not become the bottleneck of system, downgrade the performance, and run with minimal overhead.
Requirements:
* Metrics **MUST** be quick to collect
* Metrics **MUST** be small.
* Metrics **MUST** be generated only if there are subscribers to the Kata metrics service
* Metrics **MUST** be stateless
In Kata 2.0, metrics are collected mainly from `/proc` filesystem, and consumed by Prometheus, based on a pull mode, that is mean if there is no Prometheus collector is running, so there will be zero overhead if nobody cares the metrics.
Metrics service also doesn't hold any metrics in memory.
|\*|No Sandbox | 1 Sandbox | 2 Sandboxes |
|---|---|---|---|
|Metrics count| 39 | 106 | 173 |
|Metrics size(bytes)| 9K | 144K | 283K |
|Metrics size(`gzipped`, bytes)| 2K | 10K | 17K |
*Metrics size*: Response size of one Prometheus scrape request.
It's easy to estimated that if there are 10 sandboxes running in the host, the size of one metrics fetch request issued by Prometheus will be about to 9 + (144 - 9) * 10 = 1.35M (not `gzipped`) or 2 + (10 - 2) * 10 = 82K (`gzipped`). Of course Prometheus support `gzip` compression, that can reduce the response size of every request.
And here is some test data:
- End-to-end (from Prometheus server to `kata-monitor` and `kata-monitor` write response back): 20ms(avg)
- Agent(RPC all from shim to agent): 3ms(avg)
Test infrastructure:
- OS: Ubuntu 20.04
- Hardware: Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz, 6 Cores, and 16GB memory.
**Scrape interval**
Prometheus default `scrape_interval` is 1 minute, and usually it is set to 15s. Small `scrape_interval` will cause more overhead, so user should set it on monitor demand.
## Metrics list
Here listed is all supported metrics by Kata 2.0. Some metrics is dependent on guest kernels in the VM, so there may be some different by your environment.
Metrics is categorized by component where metrics are collected from and for.
> * Labels here are not include `instance` and `job` labels that added by Prometheus.
> * Notes about metrics unit
> * `Kibibytes`, abbreviated `KiB`. 1 `KiB` equals 1024 B.
> * For some metrics (like network devices statistics from file `/proc/net/dev`), unit is depend on label( for example `recv_bytes` and `recv_packets` are having different units).
> * Most of these metrics is collected from `/proc` filesystem, so the unit of metrics are keeping the same unit as `/proc`. See the `proc(5)` manual page for further details.
### Metric types
Prometheus offer four core metric types.
- Counter: A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase.
- Gauge: A gauge metric represents a single numerical value that can go up and down, typically used for measured values like current memory usage.
- Histogram: A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
- Summary: A summary samples observations like histogram, it can calculate configurable quantiles over a sliding time window.
See [Prometheus metric types](https://prometheus.io/docs/concepts/metric_types/) for detailed explanations about these metric types.
### Kata agent metrics
Agent's metrics contains metrics about agent process.
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_agent_io_stat`: <br> Agent process IO stat. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelled_write_byte`</li><li>`rchar`</li><li>`read_bytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`write_bytes`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_proc_stat`: <br> Agent process stat. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_proc_status`: <br> Agent process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_total_rss`: <br> Agent process total `rss` size | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_total_time`: <br> Agent process total time | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_agent_total_vm`: <br> Agent process total `vm` size | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
### Firecracker metrics
Metrics for Firecracker vmm.
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_firecracker_api_server`: <br> Metrics related to the internal API server. | `GAUGE` | | <ul><li>`item`<ul><li>`process_startup_time_cpu_us`</li><li>`process_startup_time_us`</li><li>`sync_response_fails`</li><li>`sync_vmm_send_timeout_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_get_api_requests`: <br> Metrics specific to GET API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`instance_info_count`</li><li>`instance_info_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_i8042`: <br> Metrics specific to the i8042 device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`reset_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_latencies_us`: <br> Performance metrics related for the moment only to snapshots. | `GAUGE` | | <ul><li>`item`<ul><li>`diff_create_snapshot`</li><li>`full_create_snapshot`</li><li>`load_snapshot`</li><li>`pause_vm`</li><li>`resume_vm`</li><li>`vmm_diff_create_snapshot`</li><li>`vmm_full_create_snapshot`</li><li>`vmm_load_snapshot`</li><li>`vmm_pause_vm`</li><li>`vmm_resume_vm`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_logger`: <br> Metrics for the logging subsystem. | `GAUGE` | | <ul><li>`item`<ul><li>`log_fails`</li><li>`metrics_fails`</li><li>`missed_log_count`</li><li>`missed_metrics_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_mmds`: <br> Metrics for the MMDS functionality. | `GAUGE` | | <ul><li>`item`<ul><li>`connections_created`</li><li>`connections_destroyed`</li><li>`rx_accepted`</li><li>`rx_accepted_err`</li><li>`rx_accepted_unusual`</li><li>`rx_bad_eth`</li><li>`rx_count`</li><li>`tx_bytes`</li><li>`tx_count`</li><li>`tx_errors`</li><li>`tx_frames`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_patch_api_requests`: <br> Metrics specific to PATCH API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`drive_count`</li><li>`drive_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_put_api_requests`: <br> Metrics specific to PUT API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`actions_count`</li><li>`actions_fails`</li><li>`boot_source_count`</li><li>`boot_source_fails`</li><li>`drive_count`</li><li>`drive_fails`</li><li>`logger_count`</li><li>`logger_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`metrics_count`</li><li>`metrics_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_rtc`: <br> Metrics specific to the RTC device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_seccomp`: <br> Metrics for the seccomp filtering. | `GAUGE` | | <ul><li>`item`<ul><li>`num_faults`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_signals`: <br> Metrics related to signals. | `GAUGE` | | <ul><li>`item`<ul><li>`sigbus`</li><li>`sigsegv`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_uart`: <br> Metrics specific to the UART device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`flush_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vcpu`: <br> Metrics specific to VCPUs' mode of functioning. | `GAUGE` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vmm`: <br> Metrics specific to the machine manager as a whole. | `GAUGE` | | <ul><li>`item`<ul><li>`device_events`</li><li>`panic_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_guest_cpu_time`: <br> Guest CPU stat. | `GAUGE` | | <ul><li>`cpu` (CPU no. and total for all CPUs)<ul><li>`0` (CPU 0)</li><li>`1` (CPU 1)</li><li>`total` (for all CPUs)</li></ul></li><li>`item` (Kernel/system statistics, from `/proc/stat`)<ul><li>`guest`</li><li>`guest_nice`</li><li>`idle`</li><li>`iowait`</li><li>`irq`</li><li>`nice`</li><li>`softirq`</li><li>`steal`</li><li>`system`</li><li>`user`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_guest_diskstat`: <br> Disks stat in system. | `GAUGE` | | <ul><li>`disk` (disk name)</li><li>`item` (see `/proc/diskstats`)<ul><li>`discards`</li><li>`discards_merged`</li><li>`flushes`</li><li>`in_progress`</li><li>`merged`</li><li>`reads`</li><li>`sectors_discarded`</li><li>`sectors_read`</li><li>`sectors_written`</li><li>`time_discarding`</li><li>`time_flushing`</li><li>`time_in_progress`</li><li>`time_reading`</li><li>`time_writing`</li><li>`weighted_time_in_progress`</li><li>`writes`</li><li>`writes_merged`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_guest_meminfo`: <br> Statistics about memory usage on the system. | `GAUGE` | | <ul><li>`item` (see `/proc/meminfo`)<ul><li>`active`</li><li>`active_anon`</li><li>`active_file`</li><li>`anon_hugepages`</li><li>`anon_pages`</li><li>`bounce`</li><li>`buffers`</li><li>`cached`</li><li>`cma_free`</li><li>`cma_total`</li><li>`commit_limit`</li><li>`committed_as`</li><li>`direct_map_1G`</li><li>`direct_map_2M`</li><li>`direct_map_4M`</li><li>`direct_map_4k`</li><li>`dirty`</li><li>`hardware_corrupted`</li><li>`high_free`</li><li>`high_total`</li><li>`hugepages_free`</li><li>`hugepages_rsvd`</li><li>`hugepages_surp`</li><li>`hugepages_total`</li><li>`hugepagesize`</li><li>`hugetlb`</li><li>`inactive`</li><li>`inactive_anon`</li><li>`inactive_file`</li><li>`k_reclaimable`</li><li>`kernel_stack`</li><li>`low_free`</li><li>`low_total`</li><li>`mapped`</li><li>`mem_available`</li><li>`mem_free`</li><li>`mem_total`</li><li>`mlocked`</li><li>`mmap_copy`</li><li>`nfs_unstable`</li><li>`page_tables`</li><li>`per_cpu`</li><li>`quicklists`</li><li>`s_reclaimable`</li><li>`s_unreclaim`</li><li>`shmem`</li><li>`shmem_hugepages`</li><li>`shmem_pmd_mapped`</li><li>`slab`</li><li>`swap_cached`</li><li>`swap_free`</li><li>`swap_total`</li><li>`unevictable`</li><li>`vmalloc_chunk`</li><li>`vmalloc_total`</li><li>`vmalloc_used`</li><li>`writeback`</li><li>`writeback_tmp`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_guest_netdev_stat`: <br> Guest net devices stats. | `GAUGE` | | <ul><li>`interface` (network device name)</li><li>`item` (see `/proc/net/dev`)<ul><li>`recv_bytes`</li><li>`recv_compressed`</li><li>`recv_drop`</li><li>`recv_errs`</li><li>`recv_fifo`</li><li>`recv_frame`</li><li>`recv_multicast`</li><li>`recv_packets`</li><li>`sent_bytes`</li><li>`sent_carrier`</li><li>`sent_colls`</li><li>`sent_compressed`</li><li>`sent_drop`</li><li>`sent_errs`</li><li>`sent_fifo`</li><li>`sent_packets`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_monitor_go_gc_duration_seconds`: <br> A summary of the pause duration of garbage collection cycles. | `SUMMARY` | `seconds` | | 2.0.0 |
| `kata_monitor_go_goroutines`: <br> Number of goroutines that currently exist. | `GAUGE` | | | 2.0.0 |
| `kata_monitor_go_info`: <br> Information about the Go environment. | `GAUGE` | | <ul><li>`version` (golang version)<ul><li>`go1.13.9` (environment dependent variable)</li></ul></li></ul> | 2.0.0 |
| `kata_monitor_go_memstats_alloc_bytes`: <br> Number of bytes allocated and still in use. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_alloc_bytes_total`: <br> Total number of bytes allocated, even if freed. | `COUNTER` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_buck_hash_sys_bytes`: <br> Number of bytes used by the profiling bucket hash table. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_frees_total`: <br> Total number of frees. | `COUNTER` | | | 2.0.0 |
| `kata_monitor_go_memstats_gc_cpu_fraction`: <br> The fraction of this program's available CPU time used by the GC since the program started. | `GAUGE` | | | 2.0.0 |
| `kata_monitor_go_memstats_gc_sys_bytes`: <br> Number of bytes used for garbage collection system metadata. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_heap_alloc_bytes`: <br> Number of heap bytes allocated and still in use. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_heap_idle_bytes`: <br> Number of heap bytes waiting to be used. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_heap_inuse_bytes`: <br> Number of heap bytes that are in use. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_heap_objects`: <br> Number of allocated objects. | `GAUGE` | | | 2.0.0 |
| `kata_monitor_go_memstats_heap_released_bytes`: <br> Number of heap bytes released to OS. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_heap_sys_bytes`: <br> Number of heap bytes obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_last_gc_time_seconds`: <br> Number of seconds since 1970 of last garbage collection. | `GAUGE` | `seconds` | | 2.0.0 |
| `kata_monitor_go_memstats_lookups_total`: <br> Total number of pointer lookups. | `COUNTER` | | | 2.0.0 |
| `kata_monitor_go_memstats_mallocs_total`: <br> Total number of `mallocs`. | `COUNTER` | | | 2.0.0 |
| `kata_monitor_go_memstats_mcache_inuse_bytes`: <br> Number of bytes in use by `mcache` structures. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_mcache_sys_bytes`: <br> Number of bytes used for `mcache` structures obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_mspan_inuse_bytes`: <br> Number of bytes in use by `mspan` structures. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_mspan_sys_bytes`: <br> Number of bytes used for `mspan` structures obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_next_gc_bytes`: <br> Number of heap bytes when next garbage collection will take place. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_other_sys_bytes`: <br> Number of bytes used for other system allocations. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_stack_inuse_bytes`: <br> Number of bytes in use by the stack allocator. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_stack_sys_bytes`: <br> Number of bytes obtained from system for stack allocator. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_memstats_sys_bytes`: <br> Number of bytes obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
| `kata_monitor_go_threads`: <br> Number of OS threads created. | `GAUGE` | | | 2.0.0 |
| `kata_monitor_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | | 2.0.0 |
| `kata_monitor_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | | 2.0.0 |
| `kata_monitor_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | | 2.0.0 |
| Metric name | Type | Units | Labels | Introduced in Kata version |
|---|---|---|---|---|
| `kata_shim_agent_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (RPC actions of Kata agent)<ul><li>`grpc.CheckRequest`</li><li>`grpc.CloseStdinRequest`</li><li>`grpc.CopyFileRequest`</li><li>`grpc.CreateContainerRequest`</li><li>`grpc.CreateSandboxRequest`</li><li>`grpc.DestroySandboxRequest`</li><li>`grpc.ExecProcessRequest`</li><li>`grpc.GetMetricsRequest`</li><li>`grpc.GuestDetailsRequest`</li><li>`grpc.ListInterfacesRequest`</li><li>`grpc.ListProcessesRequest`</li><li>`grpc.ListRoutesRequest`</li><li>`grpc.MemHotplugByProbeRequest`</li><li>`grpc.OnlineCPUMemRequest`</li><li>`grpc.PauseContainerRequest`</li><li>`grpc.RemoveContainerRequest`</li><li>`grpc.ReseedRandomDevRequest`</li><li>`grpc.ResumeContainerRequest`</li><li>`grpc.SetGuestDateTimeRequest`</li><li>`grpc.SignalProcessRequest`</li><li>`grpc.StartContainerRequest`</li><li>`grpc.StartTracingRequest`</li><li>`grpc.StatsContainerRequest`</li><li>`grpc.StopTracingRequest`</li><li>`grpc.TtyWinResizeRequest`</li><li>`grpc.UpdateContainerRequest`</li><li>`grpc.UpdateInterfaceRequest`</li><li>`grpc.UpdateRoutesRequest`</li><li>`grpc.WaitProcessRequest`</li><li>`grpc.WriteStreamRequest`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_fds`: <br> Kata containerd shim v2 open FDs. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_gc_duration_seconds`: <br> A summary of the pause duration of garbage collection cycles. | `SUMMARY` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_goroutines`: <br> Number of goroutines that currently exist. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_info`: <br> Information about the Go environment. | `GAUGE` | | <ul><li>`sandbox_id`</li><li>`version` (golang version)<ul><li>`go1.13.9` (environment dependent variable)</li></ul></li></ul> | 2.0.0 |
| `kata_shim_go_memstats_alloc_bytes`: <br> Number of bytes allocated and still in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_alloc_bytes_total`: <br> Total number of bytes allocated, even if freed. | `COUNTER` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_buck_hash_sys_bytes`: <br> Number of bytes used by the profiling bucket hash table. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_frees_total`: <br> Total number of frees. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_gc_cpu_fraction`: <br> The fraction of this program's available CPU time used by the GC since the program started. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_gc_sys_bytes`: <br> Number of bytes used for garbage collection system metadata. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_alloc_bytes`: <br> Number of heap bytes allocated and still in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_idle_bytes`: <br> Number of heap bytes waiting to be used. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_inuse_bytes`: <br> Number of heap bytes that are in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_objects`: <br> Number of allocated objects. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_released_bytes`: <br> Number of heap bytes released to OS. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_heap_sys_bytes`: <br> Number of heap bytes obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_last_gc_time_seconds`: <br> Number of seconds since 1970 of last garbage collection. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_lookups_total`: <br> Total number of pointer lookups. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_mallocs_total`: <br> Total number of `mallocs`. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_mcache_inuse_bytes`: <br> Number of bytes in use by `mcache` structures. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_mcache_sys_bytes`: <br> Number of bytes used for `mcache` structures obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_mspan_inuse_bytes`: <br> Number of bytes in use by `mspan` structures. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_mspan_sys_bytes`: <br> Number of bytes used for `mspan` structures obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_next_gc_bytes`: <br> Number of heap bytes when next garbage collection will take place. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_other_sys_bytes`: <br> Number of bytes used for other system allocations. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_stack_inuse_bytes`: <br> Number of bytes in use by the stack allocator. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_stack_sys_bytes`: <br> Number of bytes obtained from system for stack allocator. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_memstats_sys_bytes`: <br> Number of bytes obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_go_threads`: <br> Number of OS threads created. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_io_stat`: <br> Kata containerd shim v2 process IO statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelledwritebytes`</li><li>`rchar`</li><li>`readbytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`writebytes`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_pod_overhead_cpu`: <br> Kata Pod overhead for CPU resources(percent). | `GAUGE` | percent | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_pod_overhead_memory_in_bytes`: <br> Kata Pod overhead for memory resources(bytes). | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_proc_stat`: <br> Kata containerd shim v2 process statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_proc_status`: <br> Kata containerd shim v2 process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpmd`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_shim_process_virtual_memory_max_bytes`: <br> Maximum amount of virtual memory available in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
To fulfill the [Kata design requirements](kata-design-requirements.md), and based on the discussion on [Virtcontainers API extensions](https://docs.google.com/presentation/d/1dbGrD1h9cpuqAPooiEgtiwWDGCYhVPdatq7owsKHDEQ), the Kata runtime library features the following APIs:
- Sandbox based top API
- Storage and network hotplug API
- Plugin frameworks for external proprietary Kata runtime extensions
- Built-in shim and proxy types and capabilities
## Sandbox Based API
### Sandbox Management API
|Name|Description|
|---|---|
|`CreateSandbox(SandboxConfig, Factory)`| Create a sandbox and its containers, base on `SandboxConfig` and `Factory`. Return the `Sandbox` structure, but do not start them.|
### Sandbox Operation API
|Name|Description|
|---|---|
|`sandbox.Delete()`| Shut down the VM in which the sandbox, and destroy the sandbox and remove all persistent metadata.|
|`sandbox.Monitor()`| Return a context handler for caller to monitor sandbox callbacks such as error termination.|
|`sandbox.Release()`| Release a sandbox data structure, close connections to the agent, and quit any goroutines associated with the Sandbox. Mostly used for daemon restart.|
|`sandbox.Start()`| Start a sandbox and the containers making the sandbox.|
|`sandbox.Stats()`| Get the stats of a running sandbox, return a `SandboxStats` structure.|
|`sandbox.Status()`| Get the status of the sandbox and containers, return a `SandboxStatus` structure.|
|`sandbox.Stop(force)`| Stop a sandbox and Destroy the containers in the sandbox. When force is true, ignore guest related stop failures.|
|`sandbox.CreateContainer(contConfig)`| Create new container in the sandbox with the `ContainerConfig` param. It will add new container config to `sandbox.config.Containers`.|
|`sandbox.DeleteContainer(containerID)`| Delete a container from the sandbox by containerID, return a `Container` structure.|
|`sandbox.EnterContainer(containerID, cmd)`| Run a new process in a container, executing customer's `types.Cmd` command.|
|`sandbox.KillContainer(containerID, signal, all)`| Signal a container in the sandbox by the containerID.|
|`sandbox.PauseContainer(containerID)`| Pause a running container in the sandbox by the containerID.|
|`sandbox.ProcessListContainer(containerID, options)`| List every process running inside a specific container in the sandbox, return a `ProcessList` structure.|
|`sandbox.ResumeContainer(containerID)`| Resume a paused container in the sandbox by the containerID.|
|`sandbox.StartContainer(containerID)`| Start a container in the sandbox by the containerID.|
|`sandbox.StatsContainer(containerID)`| Get the stats of a running container, return a `ContainerStats` structure.|
|`sandbox.StatusContainer(containerID)`| Get the status of a container in the sandbox, return a `ContainerStatus` structure.|
|`sandbox.StopContainer(containerID, force)`| Stop a container in the sandbox by the containerID.|
|`sandbox.UpdateContainer(containerID, resources)`| Update a running container in the sandbox.|
|`sandbox.WaitProcess(containerID, processID)`| Wait on a process to terminate.|
### Sandbox Hotplug API
|Name|Description|
|---|---|
|`sandbox.AddDevice(info)`| Add new storage device `DeviceInfo` to the sandbox, return a `Device` structure.|
|`sandbox.AddInterface(inf)`| Add new NIC to the sandbox.|
|`sandbox.RemoveInterface(inf)`| Remove a NIC from the sandbox.|
|`sandbox.ListInterfaces()`| List all NICs and their configurations in the sandbox, return a `pbTypes.Interface` list.|
|`sandbox.UpdateRoutes(routes)`| Update the sandbox route table (e.g. for portmapping support), return a `pbTypes.Route` list.|
|`sandbox.ListRoutes()`| List the sandbox route table, return a `pbTypes.Route` list.|
### Sandbox Relay API
|Name|Description|
|---|---|
|`sandbox.WinsizeProcess(containerID, processID, Height, Width)`| Relay TTY resize request to a process.|
|`sandbox.SignalProcess(containerID, processID, signalID, signalALL)`| Relay a signal to a process or all processes in a container.|
|`sandbox.IOStream(containerID, processID)`| Relay a process stdio. Return stdin/stdout/stderr pipes to the process stdin/stdout/stderr streams.|
### Sandbox Monitor API
|Name|Description|
|---|---|
|`sandbox.GetOOMEvent()`| Monitor the OOM events that occur in the sandbox..|
|`sandbox.UpdateRuntimeMetrics()`| Update the shim/hypervisor's metrics of the running sandbox.|
|`sandbox.GetAgentMetrics()`| Get metrics of the agent and the guest in the running sandbox.|
## Plugin framework for external proprietary Kata runtime extensions
### Hypervisor plugin
TBD.
### Metadata storage plugin
The metadata storage plugin controls where sandbox metadata is saved.
All metadata storage plugins must implement the following API:
|Name|Description|
|---|---|
|`storage.Save(key, value)`| Save a record.|
|`storage.Load(key)`| Load a record.|
|`storage.Delete(key)`| Delete a record.|
Built-in implementations include:
- Filesystem storage
- LevelDB storage
### VM Factory plugin
The VM factory plugin controls how a sandbox factory creates new VMs.
All VM factory plugins must implement following API:
|Name|Description|
|---|---|
|`VMFactory.NewVM(HypervisorConfig)`|Create a new VM based on `HypervisorConfig`.|
Built-in implementations include:
|Name|Description|
|---|---|
|`CreateNew()`| Create brand new VM based on `HypervisorConfig`.|
|`CreateFromTemplate()`| Create new VM from template.|
|`CreateFromCache()`| Create new VM from VM caches.|
In theory, being OCI compatible should be enough. In practice, the Kata Containers runtime
should comply with the latest *stable*`runc` CLI. In particular, it **MUST** implement the
following `runc` commands:
*`create`
*`delete`
*`exec`
*`kill`
*`list`
*`pause`
*`ps`
*`start`
*`state`
*`version`
The Kata Containers runtime **MUST** implement the following command line options:
*`--console-socket`
*`--pid-file`
### [CRI](http://blog.kubernetes.io/2016/12/container-runtime-interface-cri-in-kubernetes.html) and [Kubernetes](https://kubernetes.io) support
The Kata Containers project **MUST** provide two interfaces for CRI shims to manage hardware
virtualization based Kubernetes pods and containers:
- An OCI and `runc` compatible command line interface, as described in the previous section.
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`cri-containerd`](https://github.com/containerd/cri-containerd), for example.
- A hardware virtualization runtime library API for CRI shims to consume and provide a more
CRI native implementation. The [`frakti`](https://github.com/kubernetes/frakti) CRI shim is an example of such a consumer.
### Multiple hardware architectures support
The Kata Containers runtime **MUST NOT** be architecture-specific. It should be able to support
multiple hardware architectures and provide a modular and flexible design for adding support
for additional ones.
### Multiple hypervisor support
The Kata Containers runtime **MUST NOT** be tied to any specific hardware virtualization technology,
hypervisor, or virtual machine monitor implementation.
It should support multiple hypervisors and provide a pluggable and flexible design to add support
for additional ones.
#### Nesting
The Kata Containers runtime **MUST** support nested virtualization environments.
### Networking
* The Kata Containers runtime **MUST** support CNI plugin.
* The Kata Containers runtime **MUST** support both legacy and IPv6 networks.
### I/O
#### Devices direct assignment
In order for containers to directly consume host hardware resources, the Kata Containers runtime
**MUST** provide containers with secure pass through for generic devices such as GPUs, SRIOV,
RDMA, QAT, by leveraging I/O virtualization technologies (IOMMU, interrupt remapping).
#### Acceleration
The Kata Containers runtime **MUST** support accelerated and user-space-based I/O operations
for networking (e.g. DPDK) as well as storage through `vhost-user` sockets.
#### Scalability
The Kata Containers runtime **MUST** support scalable I/O through the SRIOV technology.
### Virtualization overhead reduction
A compelling aspect of containers is their minimal overhead compared to bare metal applications.
A container runtime should keep the overhead to a minimum in order to provide the expected user
experience.
The Kata Containers runtime implementation **SHOULD** be optimized for:
* Minimal workload boot and shutdown times
* Minimal workload memory footprint
* Maximal networking throughput
* Minimal networking latency
### Testing and debugging
#### Continuous Integration
Each Kata Containers runtime pull request **MUST** pass at least the following set of container-related
tests:
* Unit tests: runtime unit tests coverage >75%
* Functional tests: the entire runtime CLI and APIs
* Integration tests: Docker and Kubernetes
#### Debugging
The Kata Containers runtime implementation **MUST** use structured logging in order to namespace
- [Virtualization in Kata Containers](#virtualization-in-kata-containers)
- [Mapping container concepts to virtual machine technologies](#mapping-container-concepts-to-virtual-machine-technologies)
- [Kata Containers Hypervisor and VMM support](#kata-containers-hypervisor-and-vmm-support)
- [QEMU/KVM](#qemukvm)
- [Machine accelerators](#machine-accelerators)
- [Hotplug devices](#hotplug-devices)
- [Firecracker/KVM](#firecrackerkvm)
- [Cloud Hypervisor/KVM](#cloud-hypervisorkvm)
- [Summary](#summary)
Kata Containers, a second layer of isolation is created on top of those provided by traditional namespace-containers. The
hardware virtualization interface is the basis of this additional layer. Kata will launch a lightweight virtual machine,
and use the guest’s Linux kernel to create a container workload, or workloads in the case of multi-container pods. In Kubernetes
and in the Kata implementation, the sandbox is carried out at the pod level. In Kata, this sandbox is created using a virtual machine.
This document describes how Kata Containers maps container technologies to virtual machines technologies, and how this is realized in
the multiple hypervisors and virtual machine monitors that Kata supports.
## Mapping container concepts to virtual machine technologies
A typical deployment of Kata Containers will be in Kubernetes by way of a Container Runtime Interface (CRI) implementation. On every node,
Kubelet will interact with a CRI implementor (such as containerd or CRI-O), which will in turn interface with Kata Containers (an OCI based runtime).
The CRI API, as defined at the [Kubernetes CRI-API repo](https://github.com/kubernetes/cri-api/), implies a few constructs being supported by the
CRI implementation, and ultimately in Kata Containers. In order to support the full [API](https://github.com/kubernetes/cri-api/blob/a6f63f369f6d50e9d0886f2eda63d585fbd1ab6a/pkg/apis/runtime/v1alpha2/api.proto#L34-L110) with the CRI-implementor, Kata must provide the following constructs:

These constructs can then be further mapped to what devices are necessary for interfacing with the virtual machine:

Ultimately, these concepts map to specific para-virtualized devices or virtualization technologies.

Each hypervisor or VMM varies on how or if it handles each of these.
## Kata Containers Hypervisor and VMM support
Kata Containers is designed to support multiple virtual machine monitors (VMMs) and hypervisors.
Which configuration to use will depend on the end user's requirements. Details of each solution and a summary are provided below.
### QEMU/KVM
Kata Containers with QEMU has complete compatibility with Kubernetes.
Depending on the host architecture, Kata Containers supports various machine types,
for example `pc` and `q35` on x86 systems, `virt` on ARM systems and `pseries` on IBM Power systems. The default Kata Containers
machine type is `pc`. The machine type and its [`Machine accelerators`](#machine-accelerators) can
be changed by editing the runtime [`configuration`](./architecture.md/#configuration) file.
Devices and features used:
- virtio VSOCK or virtio serial
- virtio block or virtio SCSI
- virtio net
- virtio fs or virtio 9p (recommend: virtio fs)
- VFIO
- hotplug
- machine accelerators
Machine accelerators and hotplug are used in Kata Containers to manage resource constraints, improve boot time and reduce memory footprint. These are documented below.
#### Machine accelerators
Machine accelerators are architecture specific and can be used to improve the performance
and enable specific features of the machine types. The following machine accelerators
are used in Kata Containers:
- NVDIMM: This machine accelerator is x86 specific and only supported by `pc` and
`q35` machine types. `nvdimm` is used to provide the root filesystem as a persistent
memory device to the Virtual Machine.
#### Hotplug devices
The Kata Containers VM starts with a minimum amount of resources, allowing for faster boot time and a reduction in memory footprint. As the container launch progresses,
devices are hotplugged to the VM. For example, when a CPU constraint is specified which includes additional CPUs, they can be hot added. Kata Containers has support
for hot-adding the following devices:
- Virtio block
- Virtio SCSI
- VFIO
- CPU
### Firecracker/KVM
Firecracker, built on many rust crates that are within [rust-VMM](https://github.com/rust-vmm), has a very limited device model, providing a lighter
footprint and attack surface, focusing on function-as-a-service like use cases. As a result, Kata Containers with Firecracker VMM supports a subset of the CRI API.
Firecracker does not support file-system sharing, and as a result only block-based storage drivers are supported. Firecracker does not support device
hotplug nor does it support VFIO. As a result, Kata Containers with Firecracker VMM does not support updating container resources after boot, nor
does it support device passthrough.
Devices used:
- virtio VSOCK
- virtio block
- virtio net
### Cloud Hypervisor/KVM
Cloud Hypervisor, based on [rust-VMM](https://github.com/rust-vmm), is designed to have a lighter footprint and attack surface. For Kata Containers,
relative to Firecracker, the Cloud Hypervisor configuration provides better compatibility at the expense of exposing additional devices: file system
sharing and direct device assignment. As of the 1.10 release of Kata Containers, Cloud Hypervisor does not support device hotplug, and as a result
does not support updating container resources after boot, or utilizing block based volumes. While Cloud Hypervisor does support VFIO, Kata is still adding
this support. As of 1.10, Kata does not support block based volumes or direct device assignment. See [Cloud Hypervisor device support documentation](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/master/docs/device_model.md)
for more details on Cloud Hypervisor.
Devices used:
- virtio VSOCK
- virtio block
- virtio net
- virtio fs
### Summary
| Solution | release introduced | brief summary |
|-|-|-|
| QEMU | 1.0 | upstream QEMU, with support for hotplug and filesystem sharing |
| NEMU | 1.4 | Deprecated, removed as of 1.10 release. Slimmed down fork of QEMU, with experimental support of virtio-fs |
| Firecracker | 1.5 | upstream Firecracker, rust-VMM based, no VFIO, no FS sharing, no memory/CPU hotplug |
| QEMU-virtio-fs | 1.7 | upstream QEMU with support for virtio-fs. Will be removed once virtio-fs lands in upstream QEMU |
| Cloud Hypervisor | 1.10 | rust-VMM based, includes VFIO and FS sharing through virtio-fs, no hotplug |
- [Run Kata containers with `crictl`](run-kata-with-crictl.md)
- [Run Kata Containers with Kubernetes](run-kata-with-k8s.md)
- [How to use Kata Containers and Containerd](containerd-kata.md)
- [How to use Kata Containers and CRI (containerd plugin) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
- [Kata Containers and service mesh for Kubernetes](service-mesh.md)
- [How to import Kata Containers logs into Fluentd](how-to-import-kata-logs-with-fluentd.md)
## Hypervisors Integration
- [Kata Containers with Firecracker](https://github.com/kata-containers/documentation/wiki/Initial-release-of-Kata-Containers-with-Firecracker-support)
- [Kata Containers with NEMU](how-to-use-kata-containers-with-nemu.md)
- [Kata Containers with ACRN Hypervisor](how-to-use-kata-containers-with-acrn.md)
## Advanced Topics
- [How to use Kata Containers with virtio-fs](how-to-use-virtio-fs-with-kata.md)
- [Setting Sysctls with Kata](how-to-use-sysctls-with-kata.md)
- [What Is VMCache and How To Enable It](what-is-vm-cache-and-how-do-I-use-it.md)
- [What Is VM Templating and How To Enable It](what-is-vm-templating-and-how-do-I-use-it.md)
- [Privileged Kata Containers](privileged.md)
- [How to load kernel modules in Kata Containers](how-to-load-kernel-modules-with-kata.md)
- [How to use Kata Containers with `virtio-mem`](how-to-use-virtio-mem-with-kata.md)
- [How to set sandbox Kata Containers configurations with pod annotations](how-to-set-sandbox-config-kata.md)
- [How to monitor Kata Containers in K8s](how-to-set-prometheus-in-k8s.md)
By default, the configuration of containerd is located at `/etc/containerd/config.toml`, and the
`cri` plugins are placed in the following section:
```toml
[plugins]
[plugins.cri]
[plugins.cri.containerd]
[plugins.cri.containerd.default_runtime]
#runtime_type = "io.containerd.runtime.v1.linux"
[plugins.cri.cni]
# conf_dir is the directory in which the admin places a CNI conf.
conf_dir = "/etc/cni/net.d"
```
The following sections outline how to add Kata Containers to the configurations.
#### Kata Containers as a `RuntimeClass`
For
- Kata Containers v1.5.0 or above (including `1.5.0-rc`)
- Containerd v1.2.0 or above
- Kubernetes v1.12.0 or above
The `RuntimeClass` is suggested.
The following configuration includes three runtime classes:
- `plugins.cri.containerd.runtimes.runc`: the runc, and it is the default runtime.
- `plugins.cri.containerd.runtimes.kata`: The function in containerd (reference [the document here](https://github.com/containerd/containerd/tree/master/runtime/v2#binary-naming))
where the dot-connected string `io.containerd.kata.v2` is translated to `containerd-shim-kata-v2` (i.e. the
binary name of the Kata implementation of [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2)).
- `plugins.cri.containerd.runtimes.katacli`: the `containerd-shim-runc-v1` calls `kata-runtime`, which is the legacy process.
```toml
[plugins.cri.containerd]
no_pivot = false
[plugins.cri.containerd.runtimes]
[plugins.cri.containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v1"
[plugins.cri.containerd.runtimes.runc.options]
NoPivotRoot = false
NoNewKeyring = false
ShimCgroup = ""
IoUid = 0
IoGid = 0
BinaryName = "runc"
Root = ""
CriuPath = ""
SystemdCgroup = false
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
[plugins.cri.containerd.runtimes.katacli]
runtime_type = "io.containerd.runc.v1"
[plugins.cri.containerd.runtimes.katacli.options]
NoPivotRoot = false
NoNewKeyring = false
ShimCgroup = ""
IoUid = 0
IoGid = 0
BinaryName = "/usr/bin/kata-runtime"
Root = ""
CriuPath = ""
SystemdCgroup = false
```
From Containerd v1.2.4 and Kata v1.6.0, there is a new runtime option supported, which allows you to specify a specific Kata configuration file as follows:
```toml
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
[plugins.cri.containerd.runtimes.kata.options]
ConfigPath = "/etc/kata-containers/config.toml"
```
`privileged_without_host_devices` tells containerd that a privileged Kata container should not have direct access to all host devices. If unset, containerd will pass all host devices to Kata container, which may cause security issues.
This `ConfigPath` option is optional. If you do not specify it, shimv2 first tries to get the configuration file from the environment variable `KATA_CONF_FILE`. If neither are set, shimv2 will use the default Kata configuration file paths (`/etc/kata-containers/configuration.toml` and `/usr/share/defaults/kata-containers/configuration.toml`).
If you use Containerd older than v1.2.4 or a version of Kata older than v1.6.0 and also want to specify a configuration file, you can use the following workaround, since the shimv2 accepts an environment variable, `KATA_CONF_FILE` for the configuration file path. Then, you can create a
# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
runtime_type = "io.containerd.kata.v2"
```
For the earlier versions of Kata Containers and containerd that do not support Runtime V2 (Shim API), you can use the following alternative configuration:
```toml
[plugins.cri.containerd]
# "plugins.cri.containerd.default_runtime" is the runtime to use in containerd.
[plugins.cri.containerd.default_runtime]
# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
runtime_type = "io.containerd.runtime.v1.linux"
# "plugins.cri.containerd.untrusted_workload_runtime" is a runtime to run untrusted workloads on it.
# runtime_type is the runtime type to use in containerd e.g. io.containerd.runtime.v1.linux
runtime_type = "io.containerd.runtime.v1.linux"
# runtime_engine is the name of the runtime engine used by containerd.
runtime_engine = "/usr/bin/kata-runtime"
```
You can find more information on the [Containerd config documentation](https://github.com/containerd/cri/blob/master/docs/config.md)
#### Kata Containers as the default runtime
If you want to set Kata Containers as the only runtime in the deployment, you can simply configure as follows:
```toml
[plugins.cri.containerd]
[plugins.cri.containerd.default_runtime]
runtime_type = "io.containerd.kata.v2"
```
Alternatively, for the earlier versions of Kata Containers and containerd that do not support Runtime V2 (Shim API), you can use the following alternative configuration:
```toml
[plugins.cri.containerd]
[plugins.cri.containerd.default_runtime]
runtime_type = "io.containerd.runtime.v1.linux"
runtime_engine = "/usr/bin/kata-runtime"
```
### Configuration for `cri-tools`
> **Note:** If you skipped the [Install `cri-tools`](#install-cri-tools) section, you can skip this section too.
First, add the CNI configuration in the containerd configuration.
The following is the configuration if you installed CNI as the *[Install CNI plugins](#install-cni-plugins)* section outlined.
Put the CNI configuration as `/etc/cni/net.d/10-mynet.conf`:
```json
{
"cniVersion": "0.2.0",
"name": "mynet",
"type": "bridge",
"bridge": "cni0",
"isGateway": true,
"ipMasq": true,
"ipam": {
"type": "host-local",
"subnet": "172.19.0.0/24",
"routes": [
{ "dst": "0.0.0.0/0" }
]
}
}
```
Next, reference the configuration directory through containerd `config.toml`:
```toml
[plugins.cri.cni]
# conf_dir is the directory in which the admin places a CNI conf.
conf_dir = "/etc/cni/net.d"
```
The configuration file of `crictl` command line tool in `cri-tools` locates at `/etc/crictl.yaml`:
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --rm docker.io/library/busybox:latest hello sh
```
This launches a BusyBox container named `hello`, and it will be removed by `--rm` after it quits.
### Launch Pods with `crictl` command line
With the `crictl` command line of `cri-tools`, you can specify runtime class with `-r` or `--runtime` flag.
Use the following to launch Pod with `kata` runtime class with the pod in [the example](https://github.com/kubernetes-sigs/cri-tools/tree/master/docs/examples)
> **Warning**: This how-to is only for evaluation purpose, you **SHOULD NOT** running it in production using this configurations.
## Introduction
If you are running Kata containers in a Kubernetes cluster, the best way to run `kata-monitor` is using Kubernetes native `DaemonSet`, `kata-monitor` will run on desired Kubernetes nodes without other operations when new nodes joined the cluster.
Prometheus also support a Kubernetes service discovery that can find scrape targets dynamically without explicitly setting `kata-monitor`'s metric endpoints.
## Pre-requisites
You must have a running Kubernetes cluster first. If not, [install a Kubernetes cluster](https://kubernetes.io/docs/setup/) first.
Also you should ensure that `kubectl` working correctly.
> **Note**: More information about Kubernetes integrations:
> - [Run Kata Containers with Kubernetes](run-kata-with-k8s.md)
> - [How to use Kata Containers and Containerd](containerd-kata.md)
> - [How to use Kata Containers and CRI (containerd plugin) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
## Configure Prometheus
Start Prometheus by utilizing our sample manifest:
This will create a new namespace, `prometheus`, and create the following resources:
*`ClusterRole`, `ServiceAccount`, `ClusterRoleBinding` to let Prometheus to access Kubernetes API server.
*`ConfigMap` that contains minimum configurations to let Prometheus run Kubernetes service discovery.
*`Deployment` that run Prometheus in `Pod`.
*`Service` with `type` of `NodePort`(`30909` in this how to), that we can access Prometheus through `<hostIP>:30909`. In production environment, this `type` may be `LoadBalancer` or `Ingress` resource.
After the Prometheus server is running, run `curl -s http://hostIP:NodePort:30909/metrics`, if Prometheus is working correctly, you will get response like these:
```
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
This will create a new namespace `kata-system` and a `daemonset` in it.
Once the `daemonset` is running, Prometheus should discover `kata-monitor` as a target. You can open `http://<hostIP>:30909/service-discovery` and find `kubernetes-pods` under the `Service Discovery` list
This will create deployment and service for Grafana under namespace `prometheus`.
After the Grafana deployment is ready, you can open `http://hostIP:NodePort:30000/` to access Grafana server. For Grafana 7.0.5, the default user/password is `admin/admin`. You can modify the default account and adjust other security settings by editing the [Grafana configuration](https://grafana.com/docs/grafana/latest/installation/configuration/#security).
To use Grafana show data from Prometheus, you must create a Prometheus `datasource` and dashboard.
### Create `datasource`
Open `http://hostIP:NodePort:30000/datasources/new` in your browser, select Prometheus from time series databases list.
Normally you only need to set `URL` to `http://hostIP:NodePort:30909` to let it work, and leave the name as `Prometheus` as default.
### Import dashboard
A [sample dashboard](data/dashboard.json) for Kata Containers metrics is provided which can be imported to Grafana for evaluation.
You can import this dashboard using Grafana UI, or using `curl` command in console.
| `io.katacontainers.pkg.oci.container_type`| string | OCI container type. Only accepts `pod_container` and `pod_sandbox` |
## Runtime Options
| Key | Value Type | Comments |
|-------| ----- | ----- |
| `io.katacontainers.config.runtime.experimental` | `boolean` | determines if experimental features enabled |
| `io.katacontainers.config.runtime.disable_guest_seccomp`| `boolean` | determines if `seccomp` should be applied inside guest |
| `io.katacontainers.config.runtime.disable_new_netns` | `boolean` | determines if a new netns is created for the hypervisor process |
| `io.katacontainers.config.runtime.internetworking_model` | string| determines how the VM should be connected to the container network interface. Valid values are `macvtap`, `tcfilter` and `none` |
| `io.katacontainers.config.runtime.sandbox_cgroup_only`| `boolean` | determines if Kata processes are managed only in sandbox cgroup |
## Agent Options
| Key | Value Type | Comments |
|-------| ----- | ----- |
| `io.katacontainers.config.agent.enable_tracing` | `boolean` | enable tracing for the agent |
| `io.katacontainers.config.agent.kernel_modules` | string | the list of kernel modules and their parameters that will be loaded in the guest kernel. Semicolon separated list of kernel modules and their parameters. These modules will be loaded in the guest kernel using `modprobe`(8). E.g., `e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0` |
| `io.katacontainers.config.agent.trace_mode` | string | the trace mode for the agent |
| `io.katacontainers.config.agent.trace_type` | string | the trace type for the agent |
## Hypervisor Options
| Key | Value Type | Comments |
|-------| ----- | ----- |
| `io.katacontainers.config.hypervisor.asset_hash_type` | string | the hash type used for assets verification, default is `sha512` |
| `io.katacontainers.config.hypervisor.block_device_cache_direct` | `boolean` | Denotes whether use of `O_DIRECT` (bypass the host page cache) is enabled |
| `io.katacontainers.config.hypervisor.block_device_cache_noflush` | `boolean` | Denotes whether flush requests for the device are ignored |
| `io.katacontainers.config.hypervisor.block_device_cache_set` | `boolean` | cache-related options will be set to block devices or not |
| `io.katacontainers.config.hypervisor.block_device_driver` | string | the driver to be used for block device, valid values are `virtio-blk`, `virtio-scsi`, `nvdimm`|
| `io.katacontainers.config.hypervisor.default_max_vcpus` | uint32| the maximum number of vCPUs allocated for the VM by the hypervisor |
| `io.katacontainers.config.hypervisor.default_memory` | uint32| the memory assigned for a VM by the hypervisor in `MiB` |
| `io.katacontainers.config.hypervisor.default_vcpus` | uint32| the default vCPUs assigned for a VM by the hypervisor |
| `io.katacontainers.config.hypervisor.disable_block_device_use` | `boolean` | disallow a block device from being used |
| `io.katacontainers.config.hypervisor.disable_vhost_net` | `boolean` | specify if `vhost-net` is not available on the host |
| `io.katacontainers.config.hypervisor.enable_hugepages` | `boolean` | if the memory should be `pre-allocated` from huge pages |
| `io.katacontainers.config.hypervisor.enable_iothreads` | `boolean`| enable IO to be processed in a separate thread. Supported currently for virtio-`scsi` driver |
| `io.katacontainers.config.hypervisor.enable_mem_prealloc` | `boolean` | the memory space used for `nvdimm` device by the hypervisor |
| `io.katacontainers.config.hypervisor.enable_swap` | `boolean` | enable swap of VM memory |
| `io.katacontainers.config.hypervisor.entropy_source` | string| the path to a host source of entropy (`/dev/random`, `/dev/urandom` or real hardware RNG device) |
| `io.katacontainers.config.hypervisor.firmware` | string | the guest firmware that will run the container VM |
| `io.katacontainers.config.hypervisor.guest_hook_path` | string | the path within the VM that will be used for drop in hooks |
| `io.katacontainers.config.hypervisor.hotplug_vfio_on_root_bus` | `boolean` | indicate if devices need to be hotplugged on the root bus instead of a bridge|
This document provides an overview on how to run Kata containers with ACRN hypervisor and device model.
- [Introduction](#introduction)
- [Pre-requisites](#pre-requisites)
- [Configure Docker](#configure-docker)
- [Configure Kata Containers with ACRN](#configure-kata-containers-with-acrn)
## Introduction
ACRN is a flexible, lightweight Type-1 reference hypervisor built with real-time and safety-criticality in mind. ACRN uses an open source platform making it optimized to streamline embedded development.
Some of the key features being:
- Small footprint - Approx. 25K lines of code (LOC).
- Real Time - Low latency, faster boot time, improves overall responsiveness with hardware.
- Adaptability - Multi-OS support for guest operating systems like Linux, Android, RTOSes.
- Rich I/O mediators - Allows sharing of various I/O devices across VMs.
- Optimized for a variety of IoT (Internet of Things) and embedded device solutions.
Please refer to ACRN [documentation](https://projectacrn.github.io/latest/index.html) for more details on ACRN hypervisor and device model.
## Pre-requisites
This document requires the presence of the ACRN hypervisor and Kata Containers on your system. Install using the instructions available through the following links:
- For networking, ACRN supports either MACVTAP or TAP. If MACVTAP is not enabled in the Service OS, please follow the below steps to update the kernel:
$ sudo sed -i "s/$kernel_img/bzImage/g" /mnt/loader/entries/$conf_file
$ sync && sudo umount /mnt && sudo reboot
```
- Kata Containers installation: Automated installation does not seem to be supported for Clear Linux, so please use [manual installation](../Developer-Guide.md) steps.
> **Note:** Create rootfs image and not initrd image.
In order to run Kata with ACRN, your container stack must provide block-based storage, such as device-mapper.
> **Note:** Currently, by design you can only launch one VM from Kata Containers using ACRN hypervisor (SDC scenario). Based on feedback from community we can increase number of VMs.
## Configure Docker
To configure Docker for device-mapper and Kata,
1. Stop Docker daemon if it is already running.
```bash
$ sudo systemctl stop docker
```
2. Set `/etc/docker/daemon.json` with the following contents.
```
{
"storage-driver": "devicemapper"
}
```
3. Restart docker.
```bash
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
```
4. Configure [Docker](../Developer-Guide.md#update-the-docker-systemd-unit-file) to use `kata-runtime`.
## Configure Kata Containers with ACRN
To configure Kata Containers with ACRN, copy the generated `configuration-acrn.toml` file when building the `kata-runtime` to either `/etc/kata-containers/configuration.toml` or `/usr/share/defaults/kata-containers/configuration.toml`.
The following command shows full paths to the `configuration.toml` files that the runtime loads. It will use the first path that exists. (Please make sure the kernel and image paths are set correctly in the `configuration.toml` file)
>**Warning:** Please offline CPUs using [this](offline_cpu.sh) script, else VM launches will fail.
```bash
$ sudo ./offline_cpu.sh
```
Start an ACRN based Kata Container,
```bash
$ sudo docker run -ti --runtime=kata-runtime busybox sh
```
You will see ACRN(`acrn-dm`) is now running on your system, as well as a `kata-shim`, `kata-proxy`. You should obtain an interactive shell prompt. Verify that all the Kata processes terminate once you exit the container.
```bash
$ ps -ef | grep -E "kata|acrn"
```
Validate ACRN hypervisor by using `kata-runtime kata-env`,
* [Configure Kata Containers](#configure-kata-containers)
Kata Containers relies by default on the QEMU hypervisor in order to spawn the virtual machines running containers. [NEMU](https://github.com/intel/nemu) is a fork of QEMU that:
- Reduces the number of lines of code.
- Removes all legacy devices.
- Reduces the emulation as far as possible.
## Introduction
This document describes how to run Kata Containers with NEMU, first by explaining how to download, build and install it. Then it walks through the steps needed to update your Kata Containers configuration in order to run with NEMU.
## Pre-requisites
This document requires Kata Containers to be [installed](../install/README.md) on your system.
Also, it's worth noting that NEMU only supports `x86_64` and `aarch64` architecture.
> **Note:** The branch `experiment/automatic-removal` is a branch published by Jenkins after it has applied the automatic removal script to the `topic/virt-x86` branch. The purpose of this code removal being to reduce the source tree by removing files not being used by NEMU.
After those commands have successfully returned, you will find the NEMU binary at `$HOME/build-x86_64_virt/x86_64_virt-softmmu/qemu-system-x86_64_virt` (__x86__), or `$HOME/build-aarch64/aarch64-softmmu/qemu-system-aarch64` (__ARM__).
You also need the `OVMF` firmware in order to boot the virtual machine's kernel. It can currently be found at this [location](https://github.com/intel/ovmf-virt/releases).
> **Note:** The OVMF firmware will be located at this temporary location until the changes can be pushed upstream.
## Configure Kata Containers
All you need from this section is to modify the configuration file `/usr/share/defaults/kata-containers/configuration.toml` to specify the options related to the hypervisor.
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
@@ -31,7 +31,7 @@
# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
-firmware = ""
+firmware = "/usr/share/nemu/OVMF.fd"
# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
```
As you can see from this snippet above, all you need to change is:
- The path to the hypervisor binary, `/home/foo/build-x86_64_virt/x86_64_virt-softmmu/qemu-system-x86_64_virt` in this example.
- The machine type from `pc` to `virt`.
- The path to the firmware binary, `/usr/share/nemu/OVMF.fd` in this example.
Once you have saved those modifications, you can start a new container:
```bash
$ docker run --runtime=kata-runtime -it busybox
```
And you will be able to verify this new container is running with the NEMU hypervisor by looking for the hypervisor path and the machine type from the `qemu` process running on your system:
For additional documentation on setting sysctls with Docker please refer to [Docker-sysctl-doc](https://docs.docker.com/engine/reference/commandline/run/#configure-namespaced-kernel-parameters-sysctls-at-runtime).
#### Setting Namespaced Sysctls with Kubernetes:
Kubernetes considers certain sysctls as safe and others as unsafe. For detailed
information about what sysctls are considered unsafe, please refer to the [Kubernetes sysctl docs](https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/).
For using unsafe sysctls, the cluster admin would need to allow these as:
- [Install Kata Containers with virtio-fs support](#install-kata-containers-with-virtio-fs-support)
- [Run a Kata Container utilizing virtio-fs](#run-a-kata-container-utilizing-virtio-fs)
## Introduction
Container deployments utilize explicit or implicit file sharing between host filesystem and containers. From a trust perspective, avoiding a shared file-system between the trusted host and untrusted container is recommended. This is not always feasible. In Kata Containers, block-based volumes are preferred as they allow usage of either device pass through or `virtio-blk` for access within the virtual machine.
As of the 1.7 release of Kata Containers, [9pfs](https://www.kernel.org/doc/Documentation/filesystems/9p.txt) is the default filesystem sharing mechanism. While this does allow for workload compatibility, it does so with degraded performance and potential for POSIX compliance limitations.
To help address these limitations, [virtio-fs](https://virtio-fs.gitlab.io/) has been developed. virtio-fs is a shared file system that lets virtual machines access a directory tree on the host. In Kata Containers, virtio-fs can be used to share container volumes, secrets, config-maps, configuration files (hostname, hosts, `resolv.conf`) and the container rootfs on the host with the guest. virtio-fs provides significant performance and POSIX compliance improvements compared to 9pfs.
Enabling of virtio-fs requires changes in the guest kernel as well as the VMM. For Kata Containers, experimental virtio-fs support is enabled through `qemu` and `cloud-hypervisor` VMMs.
**Note: virtio-fs support is experimental in the 1.7 release of Kata Containers. Work is underway to improve stability, performance and upstream integration. This is available for early preview - use at your own risk**
This document describes how to get Kata Containers to work with virtio-fs.
## Pre-requisites
Before Kata 1.8 this feature required the host to have hugepages support enabled. Enable this with the `sysctl vm.nr_hugepages=1024` command on the host.In later versions of Kata, virtio-fs leverages `/dev/shm` as the shared memory backend. The default size of `/dev/shm` on a system is typically half of the total system memory. This can pose a physical limit to the maximum number of pods that can be launched with virtio-fs. This can be overcome by increasing the size of `/dev/shm` as shown below:
```bash
$ mount -o remount,size=${desired_shm_size} /dev/shm
```
## Install Kata Containers with virtio-fs support
The Kata Containers `qemu` configuration with virtio-fs and the `virtiofs` daemon are available in the [Kata Container release](https://github.com/kata-containers/runtime/releases) artifacts starting with the 1.9 release. Installation is available through [distribution packages](https://github.com/kata-containers/documentation/blob/master/install/README.md#supported-distributions) as well through [`kata-deploy`](https://github.com/kata-containers/packaging/tree/master/kata-deploy).
**Note: Support for virtio-fs was first introduced in `NEMU` hypervisor in Kata 1.8 release. This hypervisor has been deprecated.**
Install the latest release of Kata with `kata-deploy` as follows:
This will place the Kata release artifacts in `/opt/kata`, and update Docker's configuration to include a runtime target, `kata-qemu-virtiofs`. Learn more about `kata-deploy` and how to use `kata-deploy` in Kubernetes [here](https://github.com/kata-containers/packaging/tree/master/kata-deploy#kubernetes-quick-start).
## Run a Kata Container utilizing virtio-fs
Once installed, start a new container, utilizing `qemu` + `virtiofs`:
```bash
$ docker run --runtime=kata-qemu-virtiofs -it busybox
```
Verify the new container is running with the `qemu` hypervisor as well as using `virtiofsd`. To do this look for the hypervisor path and the `virtiofs` daemon process on the host:
- [Run a Kata Container utilizing `virtio-mem`](#run-a-kata-container-utilizing-virtio-mem)
## Introduction
The basic idea of `virtio-mem` is to provide a flexible, cross-architecture memory hot plug and hot unplug solution that avoids many limitations imposed by existing technologies, architectures, and interfaces.
More details can be found in https://lkml.org/lkml/2019/12/12/681.
Kata Containers with `virtio-mem` supports memory resize.
## Requisites
Kata Containers with `virtio-mem` requires Linux and the QEMU that support `virtio-mem`.
The Linux kernel and QEMU upstream version still not support `virtio-mem`. @davidhildenbrand is working on them.
Please use following unofficial version of the Linux kernel and QEMU that support `virtio-mem` with Kata Containers.
The Linux kernel is at https://github.com/davidhildenbrand/linux/tree/virtio-mem-rfc-v4.
The Linux kernel config that can work with Kata Containers is at https://gist.github.com/teawater/016194ee84748c768745a163d08b0fb9.
The QEMU is at https://github.com/teawater/qemu/tree/kata-virtio-mem. (The original source is at https://github.com/davidhildenbrand/qemu/tree/virtio-mem. Its base version of QEMU cannot work with Kata Containers. So merge the commit of `virtio-mem` to upstream QEMU.)
Set Linux and the QEMU that support `virtio-mem` with following line in the Kata Containers QEMU configuration `configuration-qemu.toml`:
```toml
[hypervisor.qemu]
path="qemu-dir"
kernel="vmlinux-dir"
```
Enable `virtio-mem` with following line in the Kata Containers configuration:
```toml
enable_virtio_mem=true
```
## Run a Kata Container utilizing `virtio-mem`
Use following command to enable memory overcommitment of a Linux kernel. Because QEMU `virtio-mem` device need to allocate a lot of memory.
```
$ echo 1 | sudo tee /proc/sys/vm/overcommit_memory
```
Use following command start a Kata Container.
```
$ docker run --rm -it --runtime=kata --name test busybox
```
Use following command set the memory size of test to default_memory + 512m.
* [Check `redis` server is working](#check-redis-server-is-working)
## What's `cri-tools`
[`cri-tools`](https://github.com/kubernetes-sigs/cri-tools) provides debugging and validation tools for Kubelet Container Runtime Interface (CRI).
`cri-tools` includes two tools: `crictl` and `critest`. `crictl` is the CLI for Kubelet CRI, in this document, we will show how to use `crictl` to run Pods in Kata containers.
> **Note:** `cri-tools` is only used for debugging and validation purpose, and don't use it to run production workloads.
> **Note:** For how to install and configure `cri-tools` with CRI runtimes like `containerd` or CRI-O, please also refer to other [howtos](./README.md).
## Use `crictl` run Pods in Kata containers
Sample config files in this document can be found [here](./data/crictl/).
After choosing one CRI implementation, you must make the appropriate configuration
to ensure it integrates with Kata Containers.
Kata Containers 1.5 introduced the `shimv2` for containerd 1.2.0, reducing the components
required to spawn pods and containers, and this is the preferred way to run Kata Containers with Kubernetes ([as documented here](../how-to/how-to-use-k8s-with-cri-containerd-and-kata.md#configure-containerd-to-use-kata-containers)).
An equivalent shim implementation for CRI-O is planned.
### CRI-O
For CRI-O installation instructions, refer to the [CRI-O Tutorial](https://github.com/kubernetes-incubator/cri-o/blob/master/tutorial.md) page.
The following sections show how to set up the CRI-O configuration file (default path: `/etc/crio/crio.conf`) for Kata.
Unless otherwise stated, all the following settings are specific to the `crio.runtime` table:
```toml
# The "crio.runtime" table contains settings pertaining to the OCI
# runtime used and options for how to set up and manage the OCI runtime.
[crio.runtime]
```
A comprehensive documentation of the configuration file can be found [here](https://github.com/cri-o/cri-o/blob/master/docs/crio.conf.5.md).
> **Note**: After any change to this file, the CRI-O daemon have to be restarted with:
>````
>$ sudo systemctl restart crio
>````
#### Kubernetes Runtime Class (CRI-O v1.12+)
The [Kubernetes Runtime Class](https://kubernetes.io/docs/concepts/containers/runtime-class/)
is the preferred way of specifying the container runtime configuration to run a Pod's containers.
To use this feature, Kata must added as a runtime handler with:
```toml
[crio.runtime.runtimes.kata-runtime]
runtime_path = "/usr/bin/kata-runtime"
runtime_type = "oci"
```
You can also add multiple entries to specify alternatives hypervisors, e.g.:
```toml
[crio.runtime.runtimes.kata-qemu]
runtime_path = "/usr/bin/kata-runtime"
runtime_type = "oci"
[crio.runtime.runtimes.kata-fc]
runtime_path = "/usr/bin/kata-runtime"
runtime_type = "oci"
```
#### Untrusted annotation (until CRI-O v1.12)
The untrusted annotation is used to specify a runtime for __untrusted__ workloads, i.e.
a runtime to be used when the workload cannot be trusted and a higher level of security
is required. An additional flag can be used to let CRI-O know if a workload
should be considered _trusted_ or _untrusted_ by default.
* [How is this different to VM templating](#how-is-this-different-to-vm-templating)
* [How to enable VMCache](#how-to-enable-vmcache)
* [Limitations](#limitations)
### What is VMCache
VMCache is a new function that creates VMs as caches before using it.
It helps speed up new container creation.
The function consists of a server and some clients communicating
through Unix socket. The protocol is gRPC in [`protocols/cache/cache.proto`](../../src/runtime/protocols/cache/cache.proto).
The VMCache server will create some VMs and cache them by factory cache.
It will convert the VM to gRPC format and transport it when gets
requested from clients.
Factory `grpccache` is the VMCache client. It will request gRPC format
VM and convert it back to a VM. If VMCache function is enabled,
`kata-runtime` will request VM from factory `grpccache` when it creates
a new sandbox.
### How is this different to VM templating
Both [VM templating](../how-to/what-is-vm-templating-and-how-do-I-use-it.md) and VMCache help speed up new container creation.
When VM templating enabled, new VMs are created by cloning from a pre-created template VM, and they will share the same initramfs, kernel and agent memory in readonly mode. So it saves a lot of memory if there are many Kata Containers running on the same host.
VMCache is not vulnerable to [share memory CVE](../how-to/what-is-vm-templating-and-how-do-I-use-it.md#what-are-the-cons) because each VM doesn't share the memory.
### How to enable VMCache
VMCache can be enabled by changing your Kata Containers config file (`/usr/share/defaults/kata-containers/configuration.toml`,
overridden by `/etc/kata-containers/configuration.toml` if provided) such that:
*`vm_cache_number` specifies the number of caches of VMCache:
* unspecified or == 0
VMCache is disabled
*`> 0`
will be set to the specified number
*`vm_cache_endpoint` specifies the address of the Unix socket.
Then you can create a VM templating for later usage by calling:
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. |
| [Using snap](#snap-installation) | Easy to install | yes | Good alternative to official distro packages. |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. |
| [Manual](#manual-installation) | Follow a guide step-by-step to install a working system | **No!** | For those who want the latest release with more control. |
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. |
### Official packages
Kata packages are provided by official distribution repositories for:
| Distribution (link to installation guide) | Minimum versions |
Kata Containers on Amazon Web Services (AWS) makes use of [i3.metal](https://aws.amazon.com/ec2/instance-types/i3/) instances. Most of the installation procedure is identical to that for Kata on your preferred distribution, except that you have to run it on bare metal instances since AWS doesn't support nested virtualization yet. This guide walks you through creating an i3.metal instance.
region = <your-aws-region-for-your-i3-metal-instance>
EOF
```
For more information on how to get AWS credentials please refer to [this guide](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html). Alternatively, you can ask the administrator of your AWS account to issue one with the AWS CLI:
```sh
$ aws_username="myusername"
$ aws iam create-access-key --user-name="$aws_username"
```
More general AWS CLI guidelines can be found [here](https://docs.aws.amazon.com/cli/latest/userguide/installing.html).
Refer to [this guide](https://docs.aws.amazon.com/cli/latest/userguide/cli-ec2-launch.html) for more details on how to launch instances with the AWS CLI.
SSH into the machine
```bash
$ ssh -i MyKeyPair.pen ubuntu@${IP}
```
Go onto the next step.
## Install Kata
The process for installing Kata itself on bare metal is identical to that of a virtualization-enabled VM.
For detailed information to install Kata on your distribution of choice, see the [Kata Containers installation user guides](../install/README.md).
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.