The tuntap network device is for tuntap interfaces to connect
to the container. A specific use case is the slirp4netns tap
interface for rootless kata-runtime.
Fixes: #1878
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
Rootless execution does not yet support cgroups, so skip cgroup
creation and deletion when running rootless.
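
A minimal Go sketch of the check described above. The names
(`cgroupManager`, `recordingManager`, `setupCgroup`, `teardownCgroup`,
`isRootless`) are hypothetical and not taken from the runtime's source,
and detecting rootless mode through the effective UID is an assumption:

```go
package main

import "os"

// cgroupManager is a hypothetical stand-in for the runtime's cgroup handling.
type cgroupManager interface {
	Create(path string) error
	Delete(path string) error
}

// recordingManager is a fake that records the calls it receives.
type recordingManager struct {
	calls []string
}

func (r *recordingManager) Create(path string) error {
	r.calls = append(r.calls, "create "+path)
	return nil
}

func (r *recordingManager) Delete(path string) error {
	r.calls = append(r.calls, "delete "+path)
	return nil
}

// isRootless reports whether the process runs without root privileges.
// Assumption: an effective UID check is enough for this illustration.
func isRootless() bool {
	return os.Geteuid() != 0
}

// setupCgroup creates the cgroup unless running rootless.
func setupCgroup(m cgroupManager, path string, rootless bool) error {
	if rootless {
		return nil // cgroups are not supported yet in rootless mode
	}
	return m.Create(path)
}

// teardownCgroup deletes the cgroup unless running rootless.
func teardownCgroup(m cgroupManager, path string, rootless bool) error {
	if rootless {
		return nil // nothing was created, so there is nothing to delete
	}
	return m.Delete(path)
}
```

A caller would pass `isRootless()` as the last argument; it is a parameter
here only so that both branches are easy to exercise.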
Fixes: #1877
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
Fixes: #2023
We can get OCI spec config from bundle instead of annotations, so this
field isn't necessary.
Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
The container store should be deleted when new/create fails, if the
store was newly created.
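
The cleanup rule can be sketched in Go with a deferred rollback. All
names here (`store`, `openStore`, `newContainer`, `errCreateFailed`) are
hypothetical, and the in-memory map merely stands in for the on-disk
store:

```go
package main

import "errors"

// errCreateFailed is a hypothetical error used to simulate a failed create.
var errCreateFailed = errors.New("create failed")

// store is a hypothetical stand-in for a container store.
type store struct {
	id string
}

// openStore returns the existing store for id, or creates one.
// The second result reports whether the store was newly created.
func openStore(stores map[string]*store, id string) (*store, bool) {
	if s, ok := stores[id]; ok {
		return s, false
	}
	s := &store{id: id}
	stores[id] = s
	return s, true
}

// newContainer opens the store and runs create. If create fails and the
// store was created by this call, the store is removed again. A store
// that already existed is left untouched.
func newContainer(stores map[string]*store, id string, create func() error) (err error) {
	s, created := openStore(stores, id)
	defer func() {
		if err != nil && created {
			delete(stores, s.id)
		}
	}()
	return create()
}
```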
Fixes: #2013
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
Use all subsystems for SandboxOnly option to make sure
all cgroups are deleted.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Adding a container does not need to check the cgroup path;
this is done in a different function.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When a new sandbox is created, join its cgroup path, so that
the proxy, shim, etc. are all created in the sandbox cgroup.
Fixes: #1879
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add an option to enable only the pod cgroup (SandboxCgroupOnly).
Depends-on: github.com/kata-containers/tests#1824
Fixes: #1879
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Some propagated errors print a cgroup path. If for some reason
the path is empty, that is difficult to tell by looking at the
logs.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
constrainHypervisor -> constrainHypervisorVCPUs
Document and rename function.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When shimv2 is killed by accident, containerd tries to launch
a new shimv2 binary to clean up the container. To avoid a race
condition, the cleanup must be serialized within a sandbox, so
add a new API that does this by locking the sandbox.
Fixes: #1832
Signed-off-by: lifupan <lifupan@gmail.com>
When creating a container fails, its config should be deleted
from the sandbox. Otherwise, the next container creation would
calculate resources incorrectly, because the total would still include
the failed container's resources such as memory and CPU.
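
A small Go sketch of the rollback and of why it matters for resource
accounting. The types and names (`containerConfig`, `sandbox`,
`createContainer`, `totalMemoryMB`, `errStartFailed`) are hypothetical
simplifications:

```go
package main

import "errors"

// errStartFailed is a hypothetical error used to simulate a failed create.
var errStartFailed = errors.New("start failed")

// containerConfig is a hypothetical, reduced container config.
type containerConfig struct {
	ID       string
	MemoryMB int
}

// sandbox keeps the configs of its containers, keyed by container ID.
type sandbox struct {
	configs map[string]containerConfig
}

func newSandbox() *sandbox {
	return &sandbox{configs: make(map[string]containerConfig)}
}

// totalMemoryMB sums the memory of every config known to the sandbox.
func (s *sandbox) totalMemoryMB() int {
	total := 0
	for _, c := range s.configs {
		total += c.MemoryMB
	}
	return total
}

// createContainer registers cfg, then runs start. On failure the config
// is removed again so it no longer counts toward sandbox resources.
func (s *sandbox) createContainer(cfg containerConfig, start func(containerConfig) error) error {
	s.configs[cfg.ID] = cfg
	if err := start(cfg); err != nil {
		delete(s.configs, cfg.ID)
		return err
	}
	return nil
}
```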
Fixes: #1997
Signed-off-by: lifupan <lifupan@gmail.com>
The subsequent storeSandbox() call stores the sandbox config
data, so there is no need to store it separately before
running storeSandbox().
Signed-off-by: lifupan <lifupan@gmail.com>
We don't really need to unplug it from guest because we have
already stopped it. Just detach it and clean it up.
Fixes: #1968
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
This reverts commit 794e08e243.
It breaks vfio device passthru as we need to bind the device
back to host when removing the endpoint. And that is not possible
when qemu is still running (thus holding reference to the device).
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We might need to call hypervisor hotunplug to really remove
a network device. We cannot do it after stopping the VM.
Fixes: #1956
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Fixes: #803
Merge "hypervisor.json" into "persist.json", so the new store can take
care of hypervisor data now.
Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
For one thing, it is a container-specific resource, so it should not
be cleaned up by the agent. For another, we can make container stop
force the cleanup of these host mountpoints regardless of hypervisor
and agent liveness.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Once we have found the container, we should never fail SIGKILL.
It is possible to fail to send SIGKILL because hypervisor might
be gone already. If we fail SIGKILL, upper layer cannot really
proceed to clean things up.
Also there is no need to save sandbox here as we did not change
any state.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When force is true, ignore any guest related errors. This can
be used to stop a sandbox when hypervisor process is dead.
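
A hedged Go sketch of the `force` semantics. The function `stopSandbox`,
its parameters, and the split into guest steps and host cleanup are
assumptions made for illustration:

```go
package main

import (
	"errors"
	"log"
)

// errGuestGone is a hypothetical error simulating a dead hypervisor.
var errGuestGone = errors.New("guest unreachable")

// stopSandbox runs the guest-side steps, then the host-side cleanup.
// With force set, guest errors are logged and ignored so that host
// cleanup still happens. Host cleanup errors are always returned.
func stopSandbox(force bool, guestSteps []func() error, hostCleanup func() error) error {
	for _, step := range guestSteps {
		if err := step(); err != nil {
			if !force {
				return err
			}
			log.Printf("forced stop: ignoring guest error: %v", err)
		}
	}
	return hostCleanup()
}
```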
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Firecracker provides a jailer to constrain the VMM. Use this
jailer to launch the firecracker VMM instead of launching it
directly from the kata-runtime.
The jailer will ensure that the firecracker VMM will run
in its own network and mount namespace. All assets required
by the VMM have to be present within these namespaces.
The assets need to be copied or bind mounted into the chroot
location set up by the jailer in order for firecracker to access
these resources. This includes files, device nodes and all
other assets.
Jailer automatically sets up the jail to have access to
kvm and vhost-vsock.
If a jailer is not available (i.e. not set up in the toml)
for a given hypervisor, the runtime will act as the jailer.
Also enhance the hypervisor interface and unit tests to
include the network namespace. This allows the hypervisor
to choose how and where to launch the VMM process, instead of
virtcontainers directly launching the VMM process.
Fixes: #1129
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
This patch adds the following:
1. ACRN only supports virtio-blk and so the rootfs for the VM
sits at /dev/vda. So to get the container rootfs increment the
globalIndex by 1.
2. ACRN doesn't hot-plug the container rootfs; it uses blkrescan to
update the container rootfs instead. So the agent can be given the
virtpath rather than the PCIaddr, avoiding unnecessary rescanning to
find the virtpath.
v1->v2:
Removed the workaround of incrementing the index for the
virtio-blk device and addressed it in acrn.
Fixes: #1778
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
If Kata Containers is using vfio and vhost net, unbinding the
vfio device hangs. In this scenario, the vhost net kernel thread
takes a reference to qemu's mm, and that reference also covers
the mmap regions on the vfio device file. So the vhost kernel thread
is not released when qemu is killed, because the vhost file
descriptor is still held open by the shim v2 process, and the vfio
device is not released because there is still a reference to the mmap.
Fixes: #1669
Signed-off-by: Yang, Wei <w90p710@gmail.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
When using the experimental feature "newstore", we save and load device
information from `persist.json` instead of `devices.json`. In that case
the file `devices.json` isn't needed anymore, so remove it.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Add configuration options to support the various Kata agent tracing
modes and types. See the comments in the built configuration files for
details:
- `cli/config/configuration-fc.toml`
- `cli/config/configuration-qemu.toml`
Fixes: #1369.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Make `newAgentConfig()` return an explicit error rather than handling
the error scenario by simply returning the `error` object in the
`interface{}` return type. The old behaviour was confusing and
inconsistent with the other functions creating a new config type (shim,
proxy, etc).
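
The change in calling convention can be illustrated with a simplified Go
sketch. The agent type strings and the `kataAgentConfig` struct are
hypothetical, and the real function may differ in its parameter types:

```go
package main

import "fmt"

// kataAgentConfig is a hypothetical, reduced agent config.
type kataAgentConfig struct {
	Trace bool
}

// newAgentConfig returns the typed config for agentType, or an explicit
// error. The error is no longer smuggled through the interface{} result,
// so callers check err in the usual way.
func newAgentConfig(agentType string, raw interface{}) (interface{}, error) {
	switch agentType {
	case "noop":
		return nil, nil
	case "kata":
		cfg, ok := raw.(kataAgentConfig)
		if !ok {
			return nil, fmt.Errorf("invalid config type %T for agent %q", raw, agentType)
		}
		return cfg, nil
	default:
		return nil, fmt.Errorf("unknown agent type %q", agentType)
	}
}
```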
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
This reverts commit 196661bc0d.
Reverting because cri-o with devicemapper started
to fail after this commit was merged.
Fixes: #1574.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
We can use the same data structure to describe both of them,
so that we can handle them similarly.
Fixes: #1566
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Set the new persist storage driver "virtcontainers/persist/" as an
"experimental" feature.
Once it fully works and we're ready to move to 2.0, we'll promote
it from an "experimental" feature to a formal feature.
At that time, "virtcontainers/filesystem_resource_storage.go" can be removed
completely.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Address some comments:
* fix persist driver func names for better understanding
* modify some logic, add some returned error etc
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>