Commit Graph

5462 Commits

Author SHA1 Message Date
Christophe de Dinechin
be9ca0d58b qemu: Don't leak file descriptors in case of error
[ port from runtime commit 7b269ff7aa2d62fe12593ff7040798e6c9bd5d65 ]

If we take one of the error paths from setupVirtiofsd() after
opening the fd variable, the fd.Close() function is not called.

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 01:19:18 -07:00
Jose Carlos Venegas Munoz
60606647de virtiofsd: Improve logging
[ port from runtime commit 882a82393305a4b11a77744b5fc77b98e42d15b9 ]

Send virtiofsd logs to syslog in the same way that qemu implementation
does. This requires not to wait for messages from virtiofsd stdout. This
takes the qemu implementation approach. Give the socket fd to virtiofsd.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 01:16:08 -07:00
Alex Price
7e250f29e9 shim: exit out of oom polling if unimplemented
[ port from runtime commit 86f581068eb9dc4b6862c7415cdc912e111177dd ]

This exits out of polling for OOM events if the getOOMEvent
method is unimplemented.

Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 01:11:54 -07:00
Alex Price
9f8d1baa57 virtcontainers: tests fix, nit fix
[ port from runtime commit b4833a48c81132e5a6b1c25a764cd0ebbdc6afff ]

fix tests and nit

Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 01:08:54 -07:00
Liam Merwick
d3b3e8bee6 virtcontainers: x86: Support microvm machine type
[ port from runtime commit 6aff077901021d9a0075c446dfe281b2487e1487 ]

With the addition of support to govmm for multiple transports (intel/govmm#111)
and microvm (intel/govmm#121) we can now enable support for the 'microvm'
machine type in kata-runtime.

Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 01:06:30 -07:00
Alex Price
198339367b virtcontainers: add support for getOOMEvent agent endpoint to sandbox
[ port from runtime commit 86686b56a2bf7f6dd62f620278ae289564da51d0 ]

This adds support for the getOOMEvent agent endpoint to retrieve OOM
events from the agent.

Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 00:51:23 -07:00
Alex Price
7c205be27d virtcontainers: add support for getOOMEvent agent endpoint to sandbox
[ port from runtime commit 86686b56a2bf7f6dd62f620278ae289564da51d0 ]

This adds support for the getOOMEvent agent endpoint to retrieve OOM
events from the agent.

Signed-off-by: Alex Price <aprice@atlassian.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 00:42:08 -07:00
Peng Tao
380f07ec4b proto: update agent protocol
To add GetOOMEvent API.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 00:34:38 -07:00
Jia He
6e7dd435a2 qemu: arm64: Set defaultGICVersion to 3 to limit the max vCPU number
[ port from runtime commit ee985a608015d81772901c1d9999190495fc9a0a ]

After removing dectect of host gic version, we need to limit the max vCPU
in different cases.

Given that in most cases, Kata is running on gicv3 host, set it as default
value. If the user really want to run Kata on gicv2 host, he/she need to
set default_maxvcpus in toml file to 8 instead of 0.

In summary, If the user uses host gicv3 gicv4, everything is fine
            If the user uses host gicv2, set default_maxvcpus=8

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-28 20:48:48 -07:00
Jia He
17b3021b54 Subject: [PATCH] qemu: arm64: Don't detect gic version by /proc/interrupts
[ port from runtime repository commit 4d4a153af5cb145215cb6e6e386eac2bcb8c3e32 ]

Commit b4385901da ("qemu/arm64: Detect host GIC version to configure guest
GIC") reads /proc/interrupts to detect the host gic version.

But on a ThunderX2 host with 224 cpus, the /proc/interrupts is ~762K bytes.
Hence it will costs ~900K bytes memory overhead.
From the go tool pprof results:
      flat  flat%   sum%        cum   cum%
  976.89kB   100%   100%   976.89kB   100%  github.com/kata-containers/runtime/virtcontainers.getHostGICVersion
Although the allocated memory will be freed, seems it worthy removing that
for speed up the runtime.

As per [1], there is no perfect way to detect the gic version on host.
At qemu side, if we use "gic-version=host", qemu will automatically detect
the verion by kvm ioctl. So we'd better let qemu determine the gic version.

If the user really want to start vm with gic-verion=2, he/she can set it
in machine_accelerators option.

[1]https://lists.cs.columbia.edu/pipermail/kvmarm/2014-October/011690.html

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-28 20:43:16 -07:00
Penny Zheng
4cda90abcb dax: enable dax on arm64
[ port from runtime repository commit e36389e25e ]

After backporting patch series of enabling memory hot remove on aarch64
to v5.4.x, we finally could enable nvdimm/dax on aarch64.

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-28 20:40:41 -07:00
Peng Tao
7a44025464 Makefile: add trace-forwarder/agent-ctl missing targets
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-28 20:36:33 -07:00
Ted Yu
61e011e86b vc: Version support check is ineffective in createSandbox
[ port from runtime repository commit 7e47046111 ]

If major version matches max supported major, we continue comparing the minor version.

Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-28 20:32:55 -07:00
Peng Tao
3f8d4b6822 trace-forwarder: add Cargo.lock
And rely on protobuf 2.14.0. Otherwise build fails as protobuf 2.15.0
requires unstable cargo.

error[E0658]: non-builtin inner attributes are unstable

Fixes: #343
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-27 20:16:40 -07:00
Peng Tao
e587abe6b4 Merge pull request #333 from jodh-intel/improve-toplevel-makefile
build: Improve top-level Makefile
2020-06-26 16:20:01 +08:00
Peng Tao
a3d77bc0d1 Merge pull request #338 from amshinde/remove-workaround-sharedpid
shimv2 : Remove workaround for sharedPidNs
2020-06-26 16:18:48 +08:00
Peng Tao
9d90906546 Merge pull request #320 from dgibson/cleanups
Clean up some unnecessary data structures
2020-06-26 16:18:16 +08:00
Archana Shinde
b68d4e45ee shimv2: Removing function as no longer used
Function removeNamespace is no longer used. Get rid of
it.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2020-06-25 16:50:56 -07:00
Archana Shinde
f570a2cd40 shimv2 : Remove workaround for sharedPidNs
Removing code that existed as a workaround for a bug in
how shared process namespaces were handled in the agent.
That has been long fixed in the agent.
With this, sharedPidNs will now work with shimv2.

Fixes #337

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2020-06-25 16:50:39 -07:00
James O. D. Hunt
f2a19966b2 agent: Rename check rule to test
Changed the name of the rule that runs the tests to "test" for
consistency, but retained `check` for backwards compatibility
for now.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-06-25 11:18:23 +01:00
Peng Tao
a1ef594d2a cleanup: remove redundant files
And use top level VERSION for all components.

Fixes: #334
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-24 15:57:23 -07:00
Peng Tao
3bbb97add3 Merge pull request #312 from Pennyzct/network_throttle_on_qemu
rate-limiter: network I/O throttling on VM level
2020-06-25 04:59:44 +08:00
Peng Tao
bee02d47ed Merge pull request #310 from fidencio/wip/forward_port_c3d_and_ted_yu_patches
[forward port] Bring to the development branch fixes provided by Christophe De Dinechin and Ted Yu.
2020-06-25 04:57:48 +08:00
David Gibson
ea1d799f79 qemu: Only one element of qemuPaths map is relevant
The qemuPaths field in qemuArchBase maps from machine type to the default
qemu path.  But, by the time we construct it, we already know the machine
type, so that entry ends up being the only one we care about.

So, collapse the map into a single path.  As a bonus, the qemuPath()
method can no longer fail.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:26:43 +10:00
David Gibson
5dffffd432 qemu: Remove useless table from qemuArchBase
The supportedQemuMachines array in qemuArchBase has a list of all the
qemu machine types supported for the architecture, with the options
for each.  But, the machineType field already tells us which of the
machine types we're actually using, and that's the only entry we
actually care about.

So, drop the table, and just have a single value with the machine type
we're actually using.  As a bonus that means the machine() method can
no longer fail, so no longer needs an error return.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:26:38 +10:00
David Gibson
97a02131c6 qemu: Detect and fail a bad machine type earlier
Currently, newQemuArch() doesn't return an error.  So, if passed an invalid
machine type, it will return a technically valid, but unusable qemuArch
object, which will probably fail with other errors shortly down the track.

Change this, to more cleanly fail the newQemuArch itself, letting us
detect a bad machine type earlier.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:07:33 +10:00
David Gibson
d6e7a58ac9 qemu: Clarify test with bad machine type
The last stanza of TestQemuAmd64Bridges is rather odd.  It tries to create
a qemu instance with a machine type of (QemuQ35 + QemuPC), or in other
words "q35pc", which isn't a thing.

What it's asserting about this is that the returned bridges list is empty
despite asking for bridges, so it looks like what this is really trying to
test is for sane behaviour when given a bad machine type.

So, split this out into a separate test, and make it explicit for clarity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-24 21:02:17 +10:00
Penny Zheng
541fd58791 rate-limiter: add rate limiter unit test
add TestRxRateLimiter and TestTxRateLimiter unit tests

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:17:07 +00:00
Penny Zheng
d3098c56f6 rate-limiter: remove tc-based rate limiter
Removing tc-based rate limiter includes removing htb qdiscs, ifb
interfaces if created, etc.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:17:07 +00:00
Penny Zheng
08551287b1 rate-limiter: add tc-based tx rate limiter
Implement tc-based tx rate limiter to control network I/O outbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
We take different actions, based on various inter-networking models.
For tcfilters as inter-networking model, we simply apply htb
qdisc discipline on the virtual netpair.
For other inter-networking models, such as macvtap, we resort to ifb,
by redirecting interface ingress traffic to ifb egress, and then apply htb
to ifb egress.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:17:07 +00:00
Penny Zheng
65a37b7d9c rate-limiter: add ifb interface
Ingress traffic shaping is very limited, and the htb
qdisc discipline couldn't be applied to interface ingress traffic.
Here, we import a new pseudo network interface, Intermediate Functional Block (ifb).
It is an alternative to tc filters for handling ingress traffic, by
redirecting interface ingress traffic to ifb and treat it as egress traffic there.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:17:07 +00:00
Penny Zheng
cfeb966763 rate-limiter: implement hypervisor-built-in rate limiter
As for hypervisors that support built-in rate limiter, like firecracker,
we use this built-in characteristics to implement rate limiter in kata.
kata-defined rate is in bits with scaling factors of 1000, otherwise fc-defined
rate is in bytes with scaling factors of 1024, so need reversion.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:16:58 +00:00
Penny Zheng
676ad989d7 rate-limiter: implement tc-based rx rate limiter
Implement tc-based rx rate limiter to control network I/O inbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
In some detail, we use HTB(Hierarchical Token Bucket) qdisc shaping schemes
to control host interface egress traffic.
HTB shapes traffic based on the Token Bucket Filter algorithm, and one
fundamental part of the HTB qdisc is the borrowing mechanism.
Children classes borrow tokens from their parents once they have exceeded rate,
it will continue to attempt to borrow until it reaches ceil. See more details in
https://tldp.org/HOWTO/Traffic-Control-HOWTO/classful-qdiscs.html

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:59 +00:00
Penny Zheng
5a58ed29f1 rate-limiter: add getRateLimiter/setRateLimiter in endpoint
We use tc-based or built-in rate limiter to shape network I/O traffic
and they all must be tied to one specific interface/endpoint.
In order to tell whether we've ever added rate limiter to this interface/endpoint,
we create get/set func to reveal/store such info.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:51 +00:00
Penny Zheng
527c3f4634 test: Add unit test TestNewFirecrackerHypervisor
We have defined specific config file configuration-fc.toml for firecracker,
including specific features and requirements, but the related unit test
TestNewFirecrackerHypervisor is missing.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:42 +00:00
Penny Zheng
bd8658e362 rate-limiter: check if hypervisor supports built-in rate limiter
As for some hypervisors, like firecracker, they support built-in rate limiter
to control network I/O bandwidth on VMM level. And for some hypervisors, like qemu,
they don't.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:34 +00:00
Penny Zheng
c2645f5d5a rate-limiter: add rate limiter configuration/annotation on VM level
Add configuration/annotation about network I/O throttling on VM level.
rx_rate_limiter_max_rate is dedicated to control network inbound
bandwidth per pod.
tx_rate_limiter_max_rate is dedicated to control network outbound
bandwidth per pod.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:04 +00:00
Peng Tao
66d385d7ed runtime: remove unneeded tests files
These are moved to the top directory.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 21:06:26 -07:00
Peng Tao
84b8260cfe runtime: fix vendor go.mod inconsistency
As reported by golang 1.14.3.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 21:01:11 -07:00
Peng Tao
ec146a1b39 Merge pull request #321 from dgibson/ppc64le
Don't use some x86 specific kernel and qemu options
2020-06-24 10:28:07 +08:00
Christophe de Dinechin
487520ff74 qemu: Report all errors on virtiofsd execution
The virtiofs daemon may run into errors other than the file
not existing, e.g. the file may not be executable.

Fixes: #2682

Message is now:
  virtiofs daemon /usr/local/bin/hello returned with error:
  fork/exec /usr/local/bin/virtiofsd: permission denied

instead of
  panic: runtime error: invalid memory address or nil

Fixes: #2582

Message is now:
  virtiofs daemon /usr/local/bin/hello-not-found returned with error:
  fork/exec /usr/local/bin/hello-not-found: no such file or directory

instead of:
  virtiofsd path (/usr/local/bin/hello-no-found) does not exist

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-06-23 22:10:44 +02:00
Christophe de Dinechin
042426d73a katatestutils: Use the configured virtiofs daemon path
The current path is hardcoded as follows:
  virtio_fs_daemon = "/path/to/virtiofsd"

Switch to using the value of config.VirtioFSDaemon instead.

Fixes: #2686

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-06-23 22:10:44 +02:00
Ted Yu
342bf3e949 virtcontainers: drop deferred func for GetAndSetSandboxBlockIndex
Fixes #2726

Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-06-23 22:10:44 +02:00
Ted Yu
8e3bd358e5 shimv2: check correct error variable for deferred func in service#StartShim
In service#StartShim, there is no applicable error variable which is checked by deferred func because the err variable is redefined.
This PR fixes the error variable.

Fixes #2727

Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-06-23 22:10:44 +02:00
Julio Montes
ac9cc96a6f Merge pull request #304 from fidencio/wip/forward_port_2703
[foward port] Add vIOMMU support to qemu q35
2020-06-23 12:20:52 -05:00
Julio Montes
0ca5983fdf virtcontainers: Fix structured logging in cgroups package
Call the `pkg/cgroups` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.

Fixes: #2782

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-06-23 07:41:12 -05:00
Archana Shinde
a976548fb2 shm: handle shm mount backed by empty-dir memory volumes
[cherry picked from runtime commit 3c4fe035e8041b44e1f3e06d5247938be9a1db15]

Check if shm mount is backed by empty-dir memory based volume.
If so let the logic to handle epehemeral volumes take care of this
mount, so that shm mount within the container is backed by tmpfs mount
within the the container in the VM.

Fixes: #323
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 03:10:45 -07:00
Julio Montes
eed66021da virtcontainers: Fix structured logging in device/config package
[cherry picked from runtime commit d0dbd0485d2f4ec3760f6fa1252ded86a7709042]

Call the `device/config` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.

Signed-off-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 00:53:05 -07:00
Peng Tao
422768082d agent: update Cargo lock
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 00:52:46 -07:00
James O. D. Hunt
72283b86dd logging: Fix structured logging in store package
[ cherry-picked from runtime commit 13887bf89da9d2d7c215d77ca63129e1813e4c4a ]

Call the `store` packages `SetLogger()` function to ensure all its log
records contain all required structured logging fields.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-23 00:52:39 -07:00