Implement tc-based tx rate limiter to control network I/O outbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
We take different actions, based on various inter-networking models.
For tcfilters as inter-networking model, we simply apply htb
qdisc discipline on the virtual netpair.
For other inter-networking models, such as macvtap, we resort to ifb,
by redirecting interface ingress traffic to ifb egress, and then apply htb
to ifb egress.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Ingress traffic shaping is very limited, and the htb
qdisc discipline couldn't be applied to interface ingress traffic.
Here, we import a new pseudo network interface, Intermediate Functional Block (ifb).
It is an alternative to tc filters for handling ingress traffic, by
redirecting interface ingress traffic to ifb and treat it as egress traffic there.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
As for hypervisors that support built-in rate limiter, like firecracker,
we use this built-in characteristics to implement rate limiter in kata.
kata-defined rate is in bits with scaling factors of 1000, otherwise fc-defined
rate is in bytes with scaling factors of 1024, so need reversion.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Implement tc-based rx rate limiter to control network I/O inbound traffic
on VM level for hypervisors which don't support built-in rate limiter.
In some detail, we use HTB(Hierarchical Token Bucket) qdisc shaping schemes
to control host interface egress traffic.
HTB shapes traffic based on the Token Bucket Filter algorithm, and one
fundamental part of the HTB qdisc is the borrowing mechanism.
Children classes borrow tokens from their parents once they have exceeded rate,
it will continue to attempt to borrow until it reaches ceil. See more details in
https://tldp.org/HOWTO/Traffic-Control-HOWTO/classful-qdiscs.htmlFixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
We use tc-based or built-in rate limiter to shape network I/O traffic
and they all must be tied to one specific interface/endpoint.
In order to tell whether we've ever added rate limiter to this interface/endpoint,
we create get/set func to reveal/store such info.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
We have defined specific config file configuration-fc.toml for firecracker,
including specific features and requirements, but the related unit test
TestNewFirecrackerHypervisor is missing.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
As for some hypervisors, like firecracker, they support built-in rate limiter
to control network I/O bandwidth on VMM level. And for some hypervisors, like qemu,
they don't.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Add configuration/annotation about network I/O throttling on VM level.
rx_rate_limiter_max_rate is dedicated to control network inbound
bandwidth per pod.
tx_rate_limiter_max_rate is dedicated to control network outbound
bandwidth per pod.
Fixes: #250
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
The virtiofs daemon may run into errors other than the file
not existing, e.g. the file may not be executable.
Fixes: #2682
Message is now:
virtiofs daemon /usr/local/bin/hello returned with error:
fork/exec /usr/local/bin/virtiofsd: permission denied
instead of
panic: runtime error: invalid memory address or nil
Fixes: #2582
Message is now:
virtiofs daemon /usr/local/bin/hello-not-found returned with error:
fork/exec /usr/local/bin/hello-not-found: no such file or directory
instead of:
virtiofsd path (/usr/local/bin/hello-no-found) does not exist
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The current path is hardcoded as follows:
virtio_fs_daemon = "/path/to/virtiofsd"
Switch to using the value of config.VirtioFSDaemon instead.
Fixes: #2686
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
In service#StartShim, there is no applicable error variable which is checked by deferred func because the err variable is redefined.
This PR fixes the error variable.
Fixes#2727
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Call the `pkg/cgroups` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Fixes: #2782
Signed-off-by: Julio Montes <julio.montes@intel.com>
[cherry picked from runtime commit 3c4fe035e8041b44e1f3e06d5247938be9a1db15]
Check if shm mount is backed by empty-dir memory based volume.
If so let the logic to handle epehemeral volumes take care of this
mount, so that shm mount within the container is backed by tmpfs mount
within the the container in the VM.
Fixes: #323
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[cherry picked from runtime commit d0dbd0485d2f4ec3760f6fa1252ded86a7709042]
Call the `device/config` package `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
[ cherry-picked from runtime commit 13887bf89da9d2d7c215d77ca63129e1813e4c4a ]
Call the `store` packages `SetLogger()` function to ensure all its log
records contain all required structured logging fields.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We need to make sure containers cannot modify host path unless it is explicitly shared to it. Right now we expose an additional top level shared directory to the guest and allow it to be modified. This is less ideal and can be enhanced by following method:
1. create two directories for each sandbox:
-. /run/kata-containers/shared/sandboxes/$sbx_id/mounts/, a directory to hold all host/guest shared mounts
-. /run/kata-containers/shared/sandboxes/$sbx_id/shared/, a host/guest shared directory (9pfs/virtiofs source dir)
2. /run/kata-containers/shared/sandboxes/$sbx_id/mounts/ is bind mounted readonly to /run/kata-containers/shared/sandboxes/$sbx_id/shared/, so guest cannot modify it
3. host-guest shared files/directories are mounted one-level under /run/kata-containers/shared/sandboxes/$sbx_id/mounts/ and thus present to guest at one level under /run/kata-containers/shared/sandboxes/$sbx_id/shared/
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When an x86 sandbox has a vIOMMU (needed for VFIO), it needs the
'kernel_irqchip=split' option or it can't start. fdcd1f3a2 attempts to set
that, but ends up just writing it to a temporary (looks like Go for range
loops pass by value).
Fixes: #2694
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Add a configuration option and a Pod Annotation
If activated:
- Add kernel parameters to load iommu
- Add irqchip=split in the kvm options
- Add a vIOMMU to the VM
Fixes#2694
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Add a new function appendIOMMU() to the qemuArch interface
and provide an implementation on amd64 architecture.
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The ppc64 specific qemu setup code adds a "pmu=off" parameter to the cpu
model if the nestedRun option is set. But, not only does availability of
the pmu have nothing to do with nesting on POWER, there is no "pmu=" cpu
opton for ppc64 at all.
So, simply remove it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>