A wrong path was being used for container directory when
virtiofs is utilized. This resulted in a warning message in
logs when a container is killed, or completes:
level=warning msg="Could not remove container share dir"
Without proper removal, they'd later be cleaned up when the shared
path is removed as part of stopping the sandbox.
Fixes: #1559
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
#1389 has added a context for many signatures to improve trace spans.
Functions specific to s390x lack this. Add context where required. This
affects some common code signatures, since some functions that do not
require context on other architectures do require it on s390x.
Also remove an unnecessary import in test_qemu_s390x.go.
Fixes: #1562
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
`make test` depends mock hook in virtcontainers directory,
before test, install it first.
And also run test as normal user and root in GitHub actions.
Fixes: #1554
Signed-off-by: bin <bin@hyper.sh>
Right now we rely heavily on mount propagation to share host
files/directories to the guest. However, because virtiofsd
pivots and moves itself to a separate mount namespace, the remount
mount is not present in virtiofsd's mount. And it causes guest to be
able to write to the host RO volume.
To fix it, create a private RO mount and then move it to the host mounts
dir so that it will be present readonly in the host-guest shared dir.
Fixes: #1552
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
It turns out we have managed to break the static checker in many
difference places with the absence of static checker in github action.
Let's fix them while enabling static checker in github actions...
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
github.com/kata-containers/kata-containers/src/runtime/virtcontainers/pkg/vcmock
virtcontainers/pkg/vcmock/container.go:19:10: cannot use c.MockSandbox
(type *Sandbox) as type virtcontainers.VCSandbox in return argument:
*Sandbox does not implement virtcontainers.VCSandbox (missing
GetHypervisorPid method)
github.com/kata-containers/kata-containers/src/runtime/pkg/katautils
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Modify calls in unit tests to use context since many functions were
updated to accept local context to fix trace span ordering.
Fixes#1355
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
A significant number of trace calls did not use a parent context that
would create proper span ordering in trace output. Add local context to
functions for use in trace calls to facilitate proper span ordering.
Additionally, change whether trace function returns context in some
functions in virtcontainers and use existing context rather than
background context in bindMount() so that span exists as a child of a
parent span.
Fixes#1355
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
If specified, sandbox_bind_mounts identifies host paths to be
mounted (ro) into the sandboxes shared path. This is only valid
if filesystem sharing is utilized.
The provided path(s) will be bindmounted (ro) into the shared fs directory on
the host, and thus mapped into the guest. If defaults are utilized,
these mounts should be available in the guest at
`/var/run/kata-containers/shared/containers/sandbox-mounts`
These will not be exposed to the container workloads, and are only
added for potential guest-services to consume (example: expose certs
into the guest that are available on the host).
Fixes: #1464
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Since the kata's hypervisor process is in the network namespace,
which is close to container's process, and some host metrics
such as cadvisor can use this pid to access the network namespace
to get some network metrics. Thus this commit replace the shim's
pid with the hypervisor's pid.
Fixes: #1451
Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
VhostUserDeviceAttrs::PCIAddr didn't actually store a PCI address
(DDDD:BB:DD.F), but rather a PCI path. Use the PciPath type and
rename things to make that clearer.
TestHandleBlockVolume previously used the bizarre value "0001:01"
which is neither a PCI address nor a PCI path for this value. Change
it to a valid PCI path - it appears the actual value didn't matter for
that test, as long as it was consistent.
Forward port of
3596058c67fixes#1040
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BlockDrive::PCIAddr doesn't actually store a PCI address
(DDDD:BB:DD.F) but a PCI path. Use the PciPath type and rename things
to make that clearer.
TestHandleBlockVolume() previously used a bizarre value "0002:01" for
the "PCI address" which was neither an actual PCI address, nor a PCI
path. Update it to use a PCI path - the actual value appears not to
matter in this test, as long as its consistent throughout.
Forward port of
64751f377bfixes#1040
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "PCI address" returned by Endpoint::PciPath() isn't actually a PCI
address (DDDD:BB:DD.F), but rather a PCI path. Rename and use the
PciPath type to clean this up and the various parts of the network
code connected to it.
Forward port of
3e589713cffixes#1040
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that we have types to represent PCI paths on both the agent and
runtime sides, we can update the protocol definitionto use clearer
terminology.
Note that this doesn't actually change the agent protocol, because it just
renames a field without changing its field ID or type.
While we're there fix a trivial rustfmt error in
src/agent/protocols/build.rs
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is a dedicated data type for representing PCI paths, that is, PCI
devices described by the slot numbers of the bridges we need to reach
them.
There are a number of places that uses strings with that structure for
things. The plan is to use this data type to consolidate their
handling. These are essentially Go equivalents of the pci::Slot and
pci::Path types introduced in the Rust agent.
Forward port of
185b3ab044
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today we only clear out the cpuset details when doing an update call on
existing container/pods. This works in the case of Kubernetes, but not
in the case where we are explicitly setting the cpuset details at boot
time. For example, if you are running a single container via docker ala:
docker run --cpuset-cpus 0-3 -it alpine sh
What would happen is the cpuset info would be passed in with the
container spec for create container request to the agent. At that point
in time, there'd only be the defualt number of CPUs available in the
guest (1), so you'd be left with cpusets set to 0. Next, we'd hotplug
the vCPUs, providing 0-4 CPUs in the guest, but the cpuset would never
be updated, leaving the application tied to CPU 0.
Ouch.
Until the day we support cpusets in the guest, let's make sure that we
start off clearing the cpuset fields.
Fixes: #1405
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
This commit includes two changes:
- migrate from opentracing to opentelemetry
- add jaeger configuration items
Fixes: #1351
Signed-off-by: bin <bin@hyper.sh>
acpi is enabled for kata 1.x, port and rebase code for 2.x
including:
runtime: enable pflash;
agent: add acpi support for pci bus path;
packaging: enable CONFIG_RTC_DRV_EFI;
Fixes: #1317
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Highlights for cloud-hypervisor version v0.12.0 include: removal of
`vhost-user-net` and `vhost-user-block` self spawning, migration of
`vhost-user-fs` backend, ARM64 enhancements with full support of
`--watchdog` for rebooting, and enhanced `info` HTTP API to include the
details of devices used by the VM including VFIO devices.
Fixes: #1315
Signed-off-by: Bo Chen <chen.bo@intel.com>
On pod delete, we were looking to read files that we had just deleted. In particular,
stopSandbox for QEMU was called (we cleanup up vmpath), and then QEMU's
save function was called, which immediately checks for the PID file.
Let's only update the persist store for QEMU if QEMU is actually
running. This'll avoid Error messages being displayed when we are
stopping and deleting a sandbox:
```
level=error msg="Could not read qemu pid file"
```
I reviewed CLH, and it looks like it is already taking appropriate
action, so no changes needed.
Ideally we won't spend much time saving state to persist.json unless
there's an actual error during stop/delete/shutdown path, as the persist will
also be removed after the pod is removed. We may want to optimize this,
as currently we are doing a persist store when deleting each container
(after the sandbox is stopped, VM is killed), and when we stop the sandbox.
This'll require more rework... tracked in:
https://github.com/kata-containers/kata-containers/issues/1181Fixes: #1179
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
If the upcast from resultingRoutes to *grpc.IRoutes fails, we return
(nil, err), but previous code ensures that err is nil at that point, so we
return no error.
fixes#1206
Forward port of
0ffaeeb5d8
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the upcast from resultingInterfaces to *grpc.Interfaces fails, we
return (nil, err), but previous code ensures that err is nil at that
point, so we return no error.
Forward port of
b86e904c2dfixes#1206
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
oci.proto imports "google/protobuf/wrappers.proto", but doesn't appear to
use it, which causes a warning from protoc when we compile it. Remove the
import to fix the warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We should always cleanup the vm directory when doing `stopSandbox`,
while we are skipping the cleanup process on some error code paths when
using cloud-hypervisor driver.
Fixes: #1098
Signed-off-by: Bo Chen <chen.bo@intel.com>
In PR 1079, CleanupContainer's parameter of sandboxID is changed to VCSandbox, but at cleanup,
there is no VCSandbox is constructed, we should load it from disk by loadSandboxConfig() in
persist.go. This commit reverts parts of #1079Fixes: #1119
Signed-off-by: bin liu <bin@hyper.sh>
Remove global sandbox variable, and save *Sandbox to hypervisor struct.
For some needs, hypervisor may need to use methods from Sandbox.
Signed-off-by: bin liu <bin@hyper.sh>
Update cloud-hypervisor to commit 2706319.
Fixes a limitation in OpenAPITools/openapi-generator tool,
it's impossible to send go zero types, like false and 0 to
cloud-hypervisor because `omitempty` is added if a field is not
required.
See cloud-hypervisor/cloud-hypervisor#1961 for more information
Signed-off-by: Julio Montes <julio.montes@intel.com>
Guest consumes 120Mb more of memory when DAX is enabled and the default
FS cache size (8G) is used. Disable dax when it is not required
reducing guest's memory footprint.
Without this patch:
```
7fdea4000000-7fdee4000000 rw-s 18850589 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 187876 kB
```
With this patch:
```
7fa970000000-7fa9b0000000 rw-s 612001 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 57308 kB
Pss: 56722 kB
```
fixes#1100
Signed-off-by: Julio Montes <julio.montes@intel.com>