This is a bug release from Cloud Hypervisor addressing the following
issues: 1) Don't error out when setting up the SIGWINCH handler (for
console resize) when this fails due to older kernel; 2) Seccomp rules
were refined to remove syscalls that are now unused; 3) Fix reboot on
older host kernels when SIGWINCH handler was not initialised; 4) Fix
virtio-vsock blocking issue.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v20.2
Fixes: #3383
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 1f581a0405)
(added dependency at backport)
The kata-monitor's Dockerfile was added by Eric Ernst in commit 2f1cb7995f,
but for some reason the static checker did not catch that the file was missing the copyright statement
at the time it was added. It is now complaining about it, so this assigns the copyright to
him to make the static checker happy.
Fixes: #3329
github.com/kata-containers/tests#4310
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Removed all errors/warnings pointed out by hadolint version 2.7.0, except for the following
ignored rules:
- "DL3008 warning: Pin versions in apt get install"
- "DL3041 warning: Specify version with `dnf install -y <package>-<version>`"
- "DL3033 warning: Specify version with `yum install -y <package>-<version>`"
- "DL3048 style: Invalid label key"
- "DL3003 warning: Use WORKDIR to switch to a directory"
- "DL3018 warning: Pin versions in apk add. Instead of apk add <package> use apk add <package>=<version>"
- "DL3037 warning: Specify version with zypper install -y <package>[=]<version>"
Fixes: #3107
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Currently QEMU's submodules are git cloned but there is the scripts/git-submodule.sh
which is meant for that. Let's use that script.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The static build of QEMU spends a good amount of time cloning the
source tree because we do a full git clone. In order to speed up that
operation, this changes the Dockerfile so that it carries out a
shallow clone by using the --depth=1 argument.
Fixes: #3291
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
It seems the lack of protoc in the alpine containers is causing issues
with some of our CIs, such as the VFIO one.
Fixes: #3323
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently the versions.txt in the rootfs-builder dir is already removed,
so avoid copying it in the list of helper files.
Fixes: #3267
Signed-off-by: zhanghj <zhanghj.lc@inspur.com>
In `utils/kata-manager.sh`, we download the first asset listed for the
release, which used to be the static x86_64 tarball. If that happened to
not match the system architecture, we would abort. Besides that logic
being invalid for !x86_64 (despite not distributing other tarballs at
the moment), the first asset listed is also not the static tarball any
more, it is the vendored source tarball. Retrieve all _static_ tarballs
and select the appropriate one depending on architecture.
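The selection step described above can be sketched as follows. This is only an illustration of the logic in Rust, not the shell code in `utils/kata-manager.sh`; the function name `select_static_tarball` and the asset naming pattern (`...static...-<arch>.tar.xz`) are assumptions made for the example.

```rust
/// Pick the static tarball for `arch` from a list of release asset names.
/// Hypothetical helper: the name filter below assumes assets that contain
/// "static" and end in "-<arch>.tar.xz".
fn select_static_tarball<'a>(assets: &[&'a str], arch: &str) -> Option<&'a str> {
    let suffix = format!("-{}.tar.xz", arch);
    assets
        .iter()
        .copied()
        // Look at every asset instead of assuming the first one is the
        // static tarball, and keep only the one built for this architecture.
        .find(|name| name.contains("static") && name.ends_with(suffix.as_str()))
}
```

Returning `None` when no tarball matches lets the caller report a missing build for that architecture rather than downloading the wrong asset.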
Fixes: #3254
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
As containerd can properly run without an existing
`/etc/containerd/config.toml` file (it'd run using the default
configuration), let's explicitly create the file in those cases.
This will avoid issues when amending runtime classes to a non-existent
file.
Fixes: #3229
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Tested-by: Jakob Naucke <jakob.naucke@ibm.com>
Use the same runtime used for podman run also for the podman build cmd.
Additionally, remove "docker" from the docker_run_args variable.
Fixes: #3239
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
The links are either pointing to the not-used-anymore `master` branch,
or to the kubernetes-incubator page.
Let's always point to the CRI-O github page, using the `main` branch.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
#3244 moved directories that were referred to with links to `main`,
which affects stable.
Fixes: #3334
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
This is a bug release from Cloud Hypervisor addressing the following
issues: 1) Networking performance regression with virtio-net; 2) Limit
file descriptors sent in vfio-user support; 3) Fully advertise PCI MMIO
config regions in ACPI tables; 4) Set the TSS and KVM identity maps so
they don't overlap with firmware RAM; 5) Correctly update the DeviceTree
on restore.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v20.1
Fixes: #3262
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit bbfb10e169)
- Set Alpine guest rootfs to 3.13 on all instances.
- Specify a minor version rather than patch level as the Alpine
repositories use that.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
#2399 partially reverted #418, but missed returning to bootstrapping a
rootfs with `apk.static` instead of copying the entire root, which can
result in drastically larger (more than 10x) images. Revert this as well
(requires some updates to URL building).
Fixes: #3216
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
ppc64le & s390x have no (well supported) musl target for Rust,
therefore, the agent must use glibc and cannot use Alpine. Specify
Ubuntu as the distribution to be used for initrd.
Fixes: #3212
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
vhost-net is disabled in the rootless kata runtime feature, which has been abandoned since kata 2.0.
I reused the rootless flag for nonroot hypervisor and would like to enable vhost-net.
Fixes: #3182
Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit b3bcb7b251)
In function `update_target`, if the updated source is a directory,
we should create the corresponding directory.
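A minimal sketch of that behaviour, assuming a simplified helper rather than the agent's actual `update_target` (the name `mirror_entry` and its signature are hypothetical):

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Mirror one updated source entry at `dst`.
/// Hypothetical helper, not the agent's `update_target` itself.
fn mirror_entry(src: &Path, dst: &Path) -> io::Result<()> {
    if fs::metadata(src)?.is_dir() {
        // A directory source gets a matching directory at the target;
        // create_dir_all also succeeds if the directory already exists.
        fs::create_dir_all(dst)
    } else {
        // Make sure the parent exists before copying a regular file.
        if let Some(parent) = dst.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::copy(src, dst).map(|_| ())
    }
}
```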
Fixes: #3140
Signed-off-by: bin <bin@hyper.sh>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy / kata-cleanup: change from "latest" to "rc0"
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
So that the debug console is more useful. In the meantime, remove
iptables as it is not used by kata-agent any more.
Fixes: #3138
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
A kernel compiled in Fedora 35 (latest) does not work; the following error
is reported:
```
qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF
Note
```
Build the QAT kernel in a Fedora 34 container to fix it.
Fixes: #3135
Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 857501d8dd)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's take advantage of the "is-organization-member" action and only
allow members who are part of the `kata-containers` organization to
trigger `/test_kata_deploy`.
One caveat with this approach is that for the user to be considered as
part of an organization, they **must** have their "Organization
Visibility" configured as Public (and I think the default is Private).
This was found out and suggested by @jcvenegas!
Fixes: #3130
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5e7c1a290f)
Commit 3c9ae7f made /test_kata_deploy run
against HEAD, but it also mistakenly removed all the checks that ensure
/test_kata_deploy only runs when explicitly called.
Mea culpa on this, and let's add the tests back.
Fixes: #3101
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit a7c08aa4b6)
In the past few releases we ended up hitting issues that could be easily
avoided if `/test_kata_deploy` would use HEAD instead of a specific
tarball.
By the end of the day, we want to ensure kata-deploy works, but before
we cut a release we also want to ensure that the binaries used in that
release are in a good shape. If we don't do that we end up either
having to roll a release back, or to cut a second release in a really
short time (and that's time consuming).
Note: there's code duplication here that could and should be avoided,
but I sincerely would prefer treating it in a different PR.
Fixes: #3001
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3c9ae7fb4b)
We noticed s390x test failures on several of the watcher unit tests.
We discovered that on s390x in particular, if we update a file in quick
succession, the timestamp on the file would not be unique between the
writes. Through testing, we observed that a 20 millisecond delay is very
reliable for being able to observe the timestamp update. Let's ensure we
have this delay between writes for our tests so our tests are more
reliable.
In "the real world" we'll be polling for changes every 2 seconds, and the
frequency of filesystem updates will be on the order of minutes and days,
rather than microseconds.
Fixes: #2946
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
- Even a directory could be a symlink - check for this. This is very
common when using configmaps/secrets
- Add unit test to better mimic a configmap, configmap update
- We would never remove directories before. Let's ensure that these are
added to the watched_list, and verify in unit tests
- Update unit tests which exercise maximum number of files per entry. There's a change
in behavior now that we consider directories/symlinks watchable as well.
For these tests, it means we support one less file in a watchable mount.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
The current implementation just copies the file, dereferencing any
symlinks in the process. This results in symlinks not being preserved,
and a change in layout relative to the mount that we are making
watchable.
What we want is something like "cp -d".
This isn't available in a crate, so let's go ahead and introduce a copy
function which will create a symlink with same relative path if the
source file is a symlink. Regular files are handled with the standard
fs::copy.
Introduce a unit test to verify symlinks are now handled appropriately.
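The idea can be sketched roughly as below. This is an illustrative, Unix-only example and not the agent's actual implementation; the name `copy_preserving_symlink` is hypothetical.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Copy `src` to `dst` roughly like `cp -d`: a symlink is recreated with the
/// same link target, anything else is copied with `fs::copy`.
/// Hypothetical helper name, used only for this sketch.
fn copy_preserving_symlink(src: &Path, dst: &Path) -> io::Result<()> {
    // symlink_metadata inspects the link itself instead of following it.
    if fs::symlink_metadata(src)?.file_type().is_symlink() {
        // read_link returns the stored target verbatim, so a relative
        // target stays relative in the copy.
        let target = fs::read_link(src)?;
        symlink(target, dst)
    } else {
        fs::copy(src, dst).map(|_| ())
    }
}
```

Because the link target is copied verbatim, a relative link keeps pointing at the same sibling path inside the copied tree, which is what preserves the layout of the watched mount.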
Fixes: #2950
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
While building snap, static qemu is considered. Disable libudev
as it doesn't have static libraries on most of the distros of all
archs.
Backport-from: #3003
Fixes: #3002
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
(cherry picked from commit 112ea25859)
Signed-off-by: Greg Kurz <groug@kaod.org>
If a file/directory doesn't exist, os.Stat() returns an
error. Assert the returned value with os.IsNotExist() to
prevent it from failing.
Backport-from: #2921
Fixes: #2920
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
(cherry picked from commit d5a18173b9)
Signed-off-by: Greg Kurz <groug@kaod.org>
The main.yaml workflow was created and used only on 1.x. We inherited
it, but we didn't remove it after deprecating the 1.x repos.
While here, let's also update the reference to the `main.yaml` file,
and point to `release.yaml` (the file that's actually used for 2.x).
Fixes: #3033
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As github.com/containerd/cgroups doesn't support scope
units, which are essential in some cases, let's create
the cgroups manually and load them through the cgroups
API.
This is currently done only when there's a single sandbox
cgroup (sandbox_cgroup_only=true); otherwise we set it
as a static cgroup path as it used to be (until a proper
solution for the overhead cgroup under systemd is
suggested).
Backport-from: #2959
Fixes: #2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
According to https://endoflife.date/go golang 1.15 is not supported
anymore. Let's remove it from our tests, add 1.17.x, and bump the
newest version known to work when building kata to 1.17.3.
Fixes: #3016
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 395638c4bc)
We need to explicitly call `${GOPATH}/bin/yq` that is installed by
`ci/install_yq.sh`.
Fixes: #3014
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
(cherry picked from commit 3430723594)
This reverts commit 76f16fd1a7 to bring
back cri-containerd crioptions parsing so that kata works with older
containerd versions like v1.3.9 and v1.4.6.
Fixes: #2999
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
(cherry picked from commit eacfcdec19)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- runtime: make sure the "Shutdown" trace span have a correct end
- tracing: Accept multiple dynamic tags
- logging: Enable agent debug output for release builds
- agent: "Revert agent: Disable seccomp feature on aarch64 temporarily"
- runtime: Enhancement for Makefile
- osbuilder: build image-builder image from Fedora 34
- agent: refactor process IO processing
- agent-ctl: Update for Hybrid VSOCK
- docs: Fix outdated links
- ci/install_libseccomp: Fix libseccomp build and misc improvement
- virtcontainers: simplify read-only mount handling
- runtime: add fast-test to let test exit on error
- test: Fix random failure for TestIoCopy
- cli: Show available guest protection in env output
- Update k8s, critools, and CRI-O to their 1.22 release
- package: assign proper value to redefined_string in build-kernel.sh
- agent: Make wording of error message match CRI-O test suite
- docs: Moving from EOT to EOF
- virtcontainers: api: update the functions in the api.md docs
- release: Upload libseccomp sources with notice to release page
- virtcontainers: check that both initrd and image are not set
- agent: Fix the configuration sample file
- runtime: set tags for trace span
- agent-ctl: Implement Linux OCI spec handling
- runtime: Remove comments about unsupported features in config for clh
- tools/packaging: Add options for VFIO to guest kernel
- agent/runtime: Add seccomp feature
- ci: test-kata-deploy: Get rid of slash-command-action action
- This is to bump the OOT QAT 1.7 driver version to the latest version.…
- forwarder: Drop privileges when using hybrid VSOCK
- packaging/static-build: s390x fixes
- agent-ctl: improve the oci_to_grpc code
- agent: do not return error but print it if task wait failed
- virtcontainers: delete duplicated notify in watchHypervisor function
- agent: Handle uevent remove actions
- enable unit test on arm
- rustjail: Consistent coding style of LinuxDevice type
- cli: Fix outdated kata-runtime bash completion
- Allow VFIO devices to be used as VFIO devices in the container
- Expose top level hypervisor methods -
- Upgrade to Cloud Hypervisor v19.0
- docs: use-cases: Update Intel SGX use case
- virtcontainers: clh: Enable the `seccomp` feature
- runtime: delete cri containerd plugin from versions.yaml
- docs: Write tracing documentation
- runtime: delete useless src/runtime/cli/exit.go
- snap: add cloud-hypervisor and experimental kernel
- osbuilder: Call detect_rust_version() right before install_rust.sh
- docs: Updating Developer Guide re qemu-img
- versions: Add libseccomp and gperf version
- Enable agent tracing for hybrid VSOCK hypervisors
- runtime: optimize test code
- runtime: use containerd package instead of cri-containerd
- runtime: update sandbox root dir cleanup behavior in rootless hypervisor
- utils: kata-manager: Update kata-manager.sh for new containerd config
- osbuilder: Re-enable building the agent in Docker
- agent: Do not fail when trying to adding existing routes
- tracing: Fix typo in "package" tag name
- kata-deploy: add .dockerignore file
- runtime: change name in config settings back to "kata"
- tracing: Remove trace mode and trace type
09d5d88 runtime: tracing: Change method for adding tags
bcf3e82 logging: Enable agent debug output for release builds
a239a38 osbuilder: build image-builder image from Fedora 34
375ad2b runtime: Enhancement for Makefile
b468dc5 agent: Use dup3 system call in unit tests of seccomp
1aaa059 agent: "Revert agent: Disable seccomp feature on aarch64 temporarily"
1e331f7 agent: refactor process IO processing
9d3ec58 runtime: make sure the "Shutdown" trace span have a correct end
3f21af9 runtime: add fast-test to let test exit on error
9b270d7 ci/install_libseccomp: use a temporary work directory
98b4406 ci/install_libseccomp: Fix fail when DESTDIR is set
338ac87 virtcontainers: api: update the functions in the api.md docs
23496f9 release: Upload libseccomp sources with notice to release page
e610fc8 runtime: Remove comments about unsupported features in config for clh
7e40195 agent-ctl: Add stub for AddSwap API
82de838 agent-ctl: Update for Hybrid VSOCK
d1bcf10 forwarder: Remove quotes from socket path in doc
e66d047 virtcontainers: simplify read-only mount handling
bdf4824 tools/packaging: Add options for VFIO to guest kernel
c509a20 agent-ctl: Implement Linux OCI spec handling
42add7f agent: Disable seccomp feature on aarch64 temporarily
5dfedc2 docs: Add explanation about seccomp
45e7c2c static-checks: Add step for installing libseccomp
a3647e3 osbuilder: Set up libseccomp library
3be50ad agent: Add support for Seccomp
4280415 agent: Fix the configuration sample file
b0bc71f ci: test-kata-deploy: Get rid of slash-command-action action
309dae6 virtcontainers: check that both initrd and image are not set
a10cfff forwarder: Fix changing log level
6abccb9 forwarder: Drop privileges when using hybrid VSOCK
bf00b8d agent-ctl: improve the oci_to_grpc code
b67fa9e forwarder: Make explicit root check
e377578 forwarder: Fix docs socket path
5f30633 virtcontainers: delete duplicated notify in watchHypervisor function
5f5eca6 agent: do not return error but print it if task wait failed
d2a7b6f packaging/static-build: s390x fixes
6cc8000 cli: Show available guest protection in env output
2063b13 virtcontainers: Add func AvailableGuestProtections
a13e2f7 agent: Handle uevent remove actions
34273da runtime/device: Allow VFIO devices to be presented to guest as VFIO devices
68696e0 runtime: Add parameter to constrainGRPCSpec to control VFIO handling
d9e2e9e runtime: Rename constraintGRPCSpec to improve grammar
57ab408 runtime: Introduce "vfio_mode" config variable and annotation
730b9c4 agent/device: Create device nodes for VFIO devices
175f9b0 rustjail: Allow container devices in subdirectories
9891efc rustjail: Correct sanity checks on device path
d6b62c0 rustjail: Change mknod_dev() and bind_dev() to take relative device path
2680c0b rustjail: Provide useful context on device node creation errors
42b92b2 agent/device: Allow container devname to differ from the host
827a41f agent/device: Refactor update_spec_device_list()
8ceadcc agent/device: Sanity check guest IOMMU groups
ff59db7 agent/device: Add function to get IOMMU group for a PCI device
13b06a3 agent/device: Rebind VFIO devices to VFIO driver inside guest
e22bd78 agent/device: Add helper function for binding a guest device to a driver
b40eedc rustjail: Consistent coding style of LinuxDevice type
57c0f93 agent: fix race condition when test watcher
1a96b8b template: disable template unit test on arm
43b13a4 runtime: DefaultMaxVCPUs should not greater than defaultMaxQemuVCPUs
c59c367 runtime: current vcpu number should be limited
fa92251 runtime: kernel version with '+' as suffix panic in parse
52268d0 hypervisor: Expose the hypervisor itself
a72bed5 hypervisor: update tests based on createSandbox->CreateVM change
f434bcb hypervisor: createSandbox is CreateVM
76f1ce9 hypervisor: startSandbox is StartVM
fd24a69 hypervisor: waitSandbox is waitVM
a6385c8 hypervisor: stopSandbox is StopVM
f989078 hypervisor: resumeSandbox is ResumeVM
73b4f27 hypervisor: saveSandbox is SaveVM
7308610 hypervisor: pauseSandbox is nothing but PauseVM
8f78e1c hypervisor: The SandboxConsole is the VM's console
4d47aee hypervisor: Export generic interface methods
6baf258 hypervisor: Minimal exports of generic hypervisor internal fields
37fa453 osbuilder: Update QAT driver in Dockerfile
8030b6c virtcontainers: clh: Re-generate the client code
8296754 versions: Upgrade to Cloud Hypervisor v19.0
2b13944 docs: Fix outdated links
4f75ccb docs: use-cases: Update Intel SGX use case
4f018b5 runtime: delete useless src/runtime/cli/exit.go
7a80aeb docs: Moving from EOT to EOF
09a5e03 docs: Write tracing documentation
b625f62 runtime: delete cri containerd plugin from versions.yaml
24fff57 snap: make curl commands consistent
2b9f79c snap: add cloud-hypervisor and experimental kernel
273a1a9 runtime: optimize test code
76f16fd runtime: use containerd package instead of cri-containerd
6d55b1b docs: use containerd to replace cri-containerd
ed02bc9 packaging: add containerd to versions.yaml
50da26d osbuilder: Call detect_rust_version() right before install_rust.sh
b4fadc9 docs: Updating Developer Guide re qemu-img
b8e69ce versions: Add libseccomp and gperf version
17a8c5c runtime: Fix random failure for TestIoCopy
f34f67d osbuilder: Specify version when installing Rust
135a080 osbuilder: Pass CI env to container agent build
eb5dd76 osbuilder: Re-enable building the agent in Docker
bcffa26 tracing: Fix typo in "package" tag name
e61f5e2 runtime: Show socket path in kata-env output
5b3a349 trace-forwarder: Support Hybrid VSOCK
e42bc05 kata-deploy: add .dockerignore file
321be0f tracing: Remove trace mode and trace type
7d0b616 agent: Do not fail when trying to adding existing routes
3f95469 runtime: logging: Add variable for syslog tag
adc9e0b runtime: fix two bugs in rootless hypervisor
51cbe14 runtime: Add option "disable_seccomp" to config hypervisor.clh
98b7350 virtcontainers: clh: Enable the `seccomp` feature
46720c6 runtime: set tags for trace span
d789b42 package: assign proper value to redefined_string
4d7ddff utils: kata-manager: Update kata-manager.sh for new containerd config
f5172d1 cli: Fix outdated kata-runtime bash completion
d45c86d versions: Update CRI-O to its 1.22 release
c4a6426 versions: Update k8s & critools to v1.22
881b996 agent: Make wording of error message match CRI-O test suite
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Upgrade from v0.20.0 to v1.0.0, first stable release.
Git log
4bfa0034 Release prep v1.0.0-RC3 (2218)
c7ae470a Refactor SDK span creation and implementation (2213)
db317fce Verify and update OTLP trace exporter documentation (2053)
04de34a2 Update the website getting started docs (2203)
a7b9d021 Rename metric instruments to match feature-freeze API specification (2202)
1f527a52 Update trace API config creation functions (2212)
361a2096 Fix RC2 header in changelog (2215)
e209ee75 chore(exporter/zipkin): improves logging on invalid collector. (2191)
c0c5ef65 Fix typos in resource.go. (2201)
abf6afe0 Update otel example guide (2210)
3b05ba02 Bump actions/setup-go from 2.1.3 to 2.1.4 (2206)
bcd7ff7b Bump codecov/codecov-action from 2.0.2 to 2.0.3 (2205)
c912b179 Print JSON objects to stdout without a wrapping array (2196)
add511c1 Make WithoutTimestamps work (2195)
85c27e01 Bump github.com/golangci/golangci-lint from 1.41.1 to 1.42.0 in /internal/tools (2199)
bf6500b3 Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /exporters/otlp/otlptrace (2184)
9392af96 Bump google.golang.org/grpc in /exporters/otlp/otlptrace/otlptracegrpc (2185)
c95694dc Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /example/otel-collector (2183)
0528fa66 Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /exporters/otlp/otlpmetric (2186)
3a26ed21 Deprecate the oteltest package (2188)
c885435f Website: support GH page links to canonical src (2189)
6da20a27 Add cross-module test coverage (2182)
dfc866bd Support capturing stack trace (2163)
41588fea Deprecate the attribute.Any function (2181)
4e8d667f Support a single Resource per MeterProvider in the SDK (2120)
a8bb0bf8 Make the tracetest.SpanRecorder concurrent safe (2178)
87d09df3 Deprecate Array attribute in favor of *Slice types (2162)
df384a9a Move InstrumentKind into the new metric/sdkapi package (2091)
1cb5cdca Unify the OTLP attribute transform (2170)
a882ee37 Clarify the attribute package documentation and order/grouping (2168)
5d25c4d2 Add support for int32 in attribute.Any (2169)
2b0e139e Refactor attributes benchmark tests (2167)
4c7470d9 Bump google.golang.org/grpc from 1.39.0 to 1.39.1 in /exporters/otlp/otlptrace (2176)
990c534a Bump google.golang.org/grpc in /example/otel-collector (2172)
b45c9d31 Bump google.golang.org/grpc from 1.39.0 to 1.39.1 in /exporters/otlp/otlpmetric (2174)
a3d4ff5c Deprecated the bridge/opencensus/utils package (2166)
b1d1d529 Move OC bridge integration tests to own mod (2165)
89a9489c Add OC bridge internal unit tests (2164)
56c743ba Allow global ErrorHandler to be set multiple times (2160)
d18c135f Add OpenCensus bridge internal package (2146)
fcf945a4 Just a little typo fix in code documentation. (2159)
59a82eba Update version.go (2157)
21d4686f Add ErrorHandlerFunc to simplify creating ErrorHandlers (2149)
23cb9396 Remove `internal/semconv-gen` (2155)
39acab32 Fix code sample in otel.GetTraceProvider (2147)
2b1bb29e Update OpenCensus bridge docs with limitations (2145)
fd7c327b Fix Jaeger exporter agent port default value and docs (2131)
b8561785 fix(2138): add guard to constructOTResources to return an empty resource (2139)
11f62640 Add a SpanRecorder to the sdk/trace/tracetest (2132)
fd9de7ec rename assertsocketbuffersize.go to *_test (2136)
a6b4d90c nit doc fix (2135)
79398418 pre-release v1.0.0-RC2 (2133)
2501e0fd Use semconv.SchemaURL in STDOUT exporter example (2134)
ef03dbc9 Bump codecov/codecov-action from 1 to 2.0.2 (2129)
bbe6ca40 Deprecate oteltest.Harness for removal (2123)
7a624ac2 Deprecated the oteltest.TraceStateFromKeyValues function (2122)
ece1879f Removed dropped link's attributes field from API package (2118)
03902d98 Rename sdk/trace/tracetest test.go -> exporter.go (2128)
cb607b0a Unify OTLP exporter retry logic (2095)
abe22437 API: create new linked span from current context (2115)
db81d4aa Update internal/global/trace testing (2111)
7f10ef72 Remove propagation testing types from oteltest (2116)
25d739b0 Remove resource.WithBuiltinDetectors() which has not been maintained (2097)
d57c5a56 Remove several metrics test helpers (2105)
49359495 Simplify trace_context tests (2108)
56d42011 Simplify trace context benchmark test (2109)
63dfe64a Correct status transform in OTLP exporter (2102)
9b1a5f70 Performance improvement: avoid creating multiple same read-only objects (2104)
ab78dbd0 Update release URL (2106)
647af3a0 Pre release experimental metrics v0.22.0 (2101)
0a562337 Fixed OS type value for DragonFly BSD (2092)
62c21ffb Bump golang.org/x/tools from 0.1.4 to 0.1.5 in /internal/tools (2096)
4a3da55a Ensure sample code in website_docs getting started page works (2094)
d3063a3d Update otel.Meter to global.Meter in Getting Started Document.(2087) (2093)
00a1ec5f Add documentation guidelines and improve Jaeger exporter readme (2082)
12f737c7 oteltest: ensure valid SpanContext created for span started WithNewRoot (2073)
484258eb OS description attribute detector (1840)
d8c9a955 Bump google.golang.org/grpc from 1.38.0 to 1.39.0 in /example/otel-collector (2054)
4ffdf034 Add @pellard as an Approver (2047)
1a74b399 Bump google.golang.org/protobuf from 1.26.0 to 1.27.0 in /exporters/otlp/otlpmetric (2040)
57c2e8fb Bump golang.org/x/tools from 0.1.3 to 0.1.4 in /internal/tools (2036)
7cff31a9 Bump google.golang.org/protobuf from 1.26.0 to 1.27.0 in /exporters/otlp/otlptrace (2035)
9e8f523d when using WithNewRoot, don't use the parent context for sampling (2032)
62af6c70 semconv-gen: fix capitalization at word boundaries, add stability/deprecation indicators (2033)
0bceed7e Fix docs on otel-collector example (2034)
6428cd69 Update doc.go (2030)
311a6396 fix documentation for trace.Status (2029)
16f83ce6 export ToZipkinSpanModels for use outside this library (2027)
d5d4c87f Add HTTP metrics exporter for OTLP (2022)
d6e8f60f Bump github.com/golangci/golangci-lint from 1.40.1 to 1.41.1 in /internal/tools (2023)
51dbe3cb Remove deprecated exporters (2020)
257ef7fc Update project status in README (2017)
ced177b7 Pre-release 1.0.0-RC1 (2013)
694c9a41 Interface stability documentation (2012)
39fe8092 Add span.TracerProvider() (2009)
d020e1a2 Add more tests for go.opentelemetry.io/otel/trace package. (2004)
6d4a38f1 replace WithSyncer with WithBatcher in opencensus example (2007)
c30cd1d0 Split stdout exporter into stdouttrace and stdoutmetric (2005)
80ca2b1e otlp: mark unix endpoints to work without transport security (2001)
65140985 Update codecov ignore (2006)
3be9813d Deprecate the exporters in the "trace" and "metric" sub-directories (1993)
377f7ce4 remove WithTrace* options from otlptrace exporters (1997)
b33edaa5 OTLP metrics gRPC exporter (1991)
64b640cc Remove old OTLP exporter (1990)
7728a521 Remove dependency on metrics packages (1988)
135ac4b6 Moved internal/tools duplicated findRepoRoot function to common package (1978)
cdf67ddf Update semantic conventions to v1.4.0, move to versioned package (1987)
4883cb11 Refactor exporter creation functions (1985)
87cc1e1f Test BatchSpanProcessor export timeout directly (1982)
7ffe2845 Added inputPath validation to semconv-gen (1986)
a113856a Add caveat about installing opencensus bridge (1983)
741cb9a3 Fix generator.go call typo in RELEASING.md (1977)
7a0cee7b Replaces golint by revive and fix newly reported linter issues (1946)
46d9687a Add Schema URL support to Resource (1938)
0827aa62 Use mock server as jaeger agent listener. (1930)
20886012 Bugfix jaeger exporter test panic (1973)
4bf6150f Add baggage implementation based on the W3C and OpenTelemetry specification (1967)
bbe2b8a3 Bump github.com/itchyny/gojq from 0.12.3 to 0.12.4 in /internal/tools (1971)
4949bf05 Bump github.com/cenkalti/backoff/v4 from 4.1.0 to 4.1.1 in /exporters/otlp/otlptrace (1972)
015b4c17 Bump github.com/cenkalti/backoff/v4 from 4.1.0 to 4.1.1 in /exporters/otlp (1970)
13eb12ac Bump github.com/prometheus/client_golang from 1.10.0 to 1.11.0 in /exporters/metric/prometheus (1974)
2371bb0a add otlp trace http exporter (1963)
a75ade4e sdk/resource: honor OTEL_SERVICE_NAME in fromEnv resource detector (1969)
aed45802 Bump go.opentelemetry.io/proto/otlp from 0.8.0 to 0.9.0 in /exporters/otlp/otlptrace (1959)
c4ebae6a Bump go.opentelemetry.io/proto/otlp (1960)
b1d2be3b Bump google.golang.org/grpc from 1.37.1 to 1.38.0 in /exporters/otlp/otlptrace (1958)
f6daea5e Generate semantic conventions according to specification latest tagged version (1933)
435a63b3 Bump github.com/google/go-cmp from 0.5.5 to 0.5.6 (1954)
6c46af66 Bump github.com/google/go-cmp from 0.5.5 to 0.5.6 in /exporters/trace/jaeger (1953)
4d294853 Bump actions/cache from 2.1.5 to 2.1.6 (1952)
dfe2b6f1 OTLP trace gRPC exporter (1922)
5a8f7ff7 Bump go.opentelemetry.io/proto/otlp from 0.8.0 to 0.9.0 in /exporters/otlp (1943)
bd935866 Add schema URL support to Tracer (1889)
c1f460e0 Update API configs. (1921)
270cc603 Small fixes on some Span method's documentation headers (1950)
8603b902 Fix typo in doc (1949)
acbb1882 Bump google.golang.org/grpc from 1.37.1 to 1.38.0 in /exporters/otlp (1942)
b1621501 Add codecov badge (1940)
ea1434c3 Fix some golint issues (1947)
0eeb8f87 Refactor Tracestate (1931)
d3b12808 Add Passthrough example (1912)
f06cace6 Add @MadVikingGod as a project Approver (1923)
ab5facb3 Bump github.com/golangci/golangci-lint in /internal/tools (1925)
d23cc61b Refactor configs (1882)
6324adaa Add tracer option argument to global Tracer function (1902)
035fc650 Do not include authentication information in the http.url attribute (1919)
d8ac212c Fix sporadic test failure in otlp exporter http driver (1906)
a3df00f4 Create .gitattributes (1920)
fb88e926 Bump google.golang.org/grpc from 1.37.0 to 1.37.1 in /exporters/otlp (1914)
1982dc46 Bump google.golang.org/grpc in /example/prom-collector (1915)
1759c630 Bump github.com/golangci/golangci-lint in /internal/tools (1916)
7342aa47 Bump google.golang.org/grpc in /example/otel-collector (1913)
21c16418 Add support for scheme in OTEL_EXPORTER_OTLP_ENDPOINT (1886)
5cb62636 Semantic Convention generation tooling (1891)
6219221f Move the unit package to the metric module (1903)
63e0ecfc Implement global default non-recording span (1901)
b6d5442f Remove the Tracer method from the Span API (1900)
ae85fab3 Document functional options (1899)
cabf0c07 Fix default Jaeger collector endpoint (1898)
1e3fa3a3 Bump go.opentelemetry.io/proto/otlp from 0.7.0 to 0.8.0 in /exporters/otlp (1872)
696af787 Bump github.com/benbjohnson/clock from 1.0.3 to 1.1.0 in /sdk/metric (1532)
97eea6c3 Fix some golint issues (1894)
79d9852e fix container port mismatch issue (1895)
d20e7228 CI builds validate against last two versions of Go, dropping 1.14 and adding 1.16 (1865)
cbcd4b1a Redefine ExportSpans of SpanExporter with ReadOnlySpan (1873)
c99d5e99 Split large jaeger span batch to admire the udp packet size limit (1853)
42a84509 Unembed SpanContext (1877)
b7d02db1 Add Status type to SDK (1874)
f90d0d93 Update README (1876)
a1349944 Update resource.go (1871)
f40cad5e Add markdown link check configuration and action (1869)
9bc28f6b Fix existing markdown lint issues (1866)
08f4c270 Add documentation for tracer.Start() (1864)
2bd4840c remove Set.Encoded(Encoder) enconding cache (1855)
7674eebf Removed different types of Detectors for Resources. (1810)
f92a6d83 Implement retry policy for the OTLP/gRPC exporter (1832)
ec75390f Fix BSP context done tests (1863)
8e55f10a Move the Event type from the API to the SDK (1846)
e399d355 drop failed to exporter batches and return error when forcing flush a span processor (1860)
f6a9279a Honor context deadline or cancellation in SimpleSpanProcessor.Shutdown (1856)
aeef8e00 Add markdown lint GitHub action (1849)
d4c8ffad Replace spaces to tabs in Go code snippets (1854)
cb097250 fixed typo (1857)
392a44fa Refine configuration design docs (1841)
62cd933d Handle Resource env error when non-nil (1851)
24a91628 Document the SSP is not for production use (1844)
ec26ac23 Update RELEASING.md (1843)
8eb0bb99 Fix golint issue caused by typo (1847)
ca130e54 Markdownlint (1842)
1144a83d Small typo fixes to existing CHANGELOG entries (1839)
e6086958 Update website_docs to v0.20.0 (1838)
0f4e454c Change NewSplitDriver paramater and initialization (1798)
92551d39 Prerelease v1.0.0 (2250)
61839133 zipkin: remove no-op WithSDKOptions (2248)
568e7556 Set Schema URL when exporting traces to OTLP (2242)
ec26b556 Fix RC tags in docs (2239)
767ce26c Bump github.com/itchyny/gojq from 0.12.4 to 0.12.5 in /internal/tools (2216)
fe7058da adding NewNoopMeterProvider to follow trace api (2237)
c338a5ef Bump github.com/golangci/golangci-lint from 1.42.0 to 1.42.1 in /internal/tools (2236)
ef126f5c Remove deprecated Array from attribute package (2235)
360d1302 Add tests for nil *Resource (2227)
9e7812d1 Remove the deprecated oteltest package (2234)
486afd34 Remove the deprecated bridge/opencensus/utils pkg (2233)
eaacfaa8 Fix slice-valued attributes when used as map keys (2223)
df2bdbba Fix the import comments of otelpconfig (2224)
7aae2a02 otlptrace: Document supported environment variables (2222)
Fixes #2591
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Update OpenTelemetry from v0.15.0 to v0.20.0.
Git log
02d8bdd5 Release v0.20.0 (1837)
aa66fe75 OS and Process resource detectors (1788)
7374d679 Fix Links documents (1835)
856f5b84 Add feature request issue template (1831)
0fdc3d78 Remove bundler from Jaeger exporter (1830)
738ef11e Fix flaky global ErrorHandler delegation test (1829)
e43d9c00 Update Default Value for Jaeger Exporter Endpoint (1824)
0032bd64 Fix default merging of resource attributes from environment variable (1785)
96c5e4ba Add SpanProcessor example for Span annotation on start (1733)
543c8144 Remove the WithSDKOptions from the Jaeger exporter (1825)
66389ad6 Update function docs in sdk.go (1826)
70bc9eb3 Adds support for timeout on the otlp/gRPC exporter (1821)
081cc61d Update Jaeger exporter convenience functions (1822)
1b9f16d3 Remove the WithDisabled option from Jaeger exporter (1806)
6867faa0 Bump actions/cache from v2.1.4 to v2.1.5 (1818)
a2bf04dc Build context pipeline in Jaeger upload process (1809)
2de86f23 Remove locking from Jaeger exporter shutdown/export (1807)
4f9fec29 Add ExportSpans benchmark to Jaeger exporter (1805)
d9566abe Fix OTLP testing flake: signal connection from mock collector (1816)
a2cecb6e add support for env var configuration to otlp/gRPC (1811)
d616df61 Fix flaky OTLP exporter reconnect test (1814)
b09df84a Changes stdout to expose the `*sdktrace.TracerProvider` (1800)
04890608 Remove options field from Jaeger exporter (1808)
6db20e00 Remove the abandoned Process struct in Jaeger exporter (1804)
086abf34 docs: use test example to document prometheus.InstallNewPipeline (1796)
d0cea04b Bump google.golang.org/api from 0.43.0 to 0.44.0 in /exporters/trace/jaeger (1792)
99c477fe Fixed typo for default service name in Jaeger Exporter (1797)
95fd8f50 Bump google.golang.org/grpc from 1.36.1 to 1.37.0 in /exporters/otlp (1791)
9b251644 Zipkin Exporter: Use default resouce's serviceName as default serivce name (1777) (1786)
4d141e47 Add k8s.node.name and k8s.node.uid to semconv (1789)
5c99a34c Fix golint issue caused by incorrect comment (1795)
c5d006c0 Update Jaeger environment variables (1752)
58432808 add NewExportPipeline and InstallNewPipeline for otlp (1373)
7d8e6bd7 Zipkin Exporter: Adjust span transformation to comply with the spec (1688)
2817c091 Merge sdk/export/trace into sdk/trace (1778)
c61e654c Refactor prometheus exporter tests to match file headers as well (1470)
23422c56 Remove process config for Jaeger exporter (1776)
0d49b592 Add test to check bsp ignores `OnEnd` and `ForceFlush` post Shutdown` (1772)
e9aaa04b Record links/events attribute drops independently (1771)
5bbfc22c Make ExportSpans for Jaeger Exporter honor deadline (1773)
0786fe32 Add Bug report issue templates (1775)
3c7facee Add `ExportTimeout` option to batch span processor (1755)
c6b92d5b Make TraceFlags spec-compliant (1770)
ee687ca5 Bump github.com/itchyny/gojq from 0.12.2 to 0.12.3 in /internal/tools (1774)
52a24774 add support for configuring tls certs via env var to otlp/HTTP (1769)
35cfbc7e Update precedence of event name in Jaeger exporter (1768)
33699d24 Adds semantic conventions for exceptions (1492)
928e3c38 Modify ForceFlush to abort after timeout/cancellation (1757)
3947cab4 Fix testCollectorEndpoint typo and add tag assertions in jaeger_test (1753)
ecc635dc add website docs (1747)
07a8d195 Fix Jaeger span status reporting and unify tag keys (1761)
4fa35c90 add partial support for env var config to otlp/HTTP (1758)
bf180d0f improve OTLP/gRPC connection errors (1737)
d575865b Fix span IsRecording when not sampling (1750)
20c93b01 Update SamplingParameters (1749)
97501a3f Update SpanSnapshot to use parent SpanContext (1748)
604b05cb Store current Span instead of local and remote SpanContext in context.Context (1731)
c61f4b6d Set @lizthegrey to emeritus status (1745)
b1342fec Bump github.com/golangci/golangci-lint in /internal/tools (1743)
54e1bd19 Bump google.golang.org/api from 0.41.0 to 0.43.0 in /exporters/trace/jaeger (1741)
4d25b6a2 Bump github.com/prometheus/client_golang from 1.9.0 to 1.10.0 in /exporters/metric/prometheus (1740)
0a47b66f Bump google.golang.org/grpc from 1.36.0 to 1.36.1 in /exporters/otlp (1739)
26f006b8 Reinstate @paivagustavo as an Approver (1734)
382c7ced Remove hasRemoteParent field from SDK span (1728)
862a5a68 Remove setting error status while recording error with Span from oteltest package (1729)
6defcfdf Remove links on NewRoot spans (1726)
a9b2f851 upgrade thrift to v0.14.1 in jaeger exporter (1712)
5a6a854d Bump google.golang.org/protobuf from 1.25.0 to 1.26.0 in /exporters/otlp (1724)
23486213 Migrate to using go.opentelemetry.io/proto/otlp (1713)
5d559b40 Remove makeSamplingDecision func (1711)
e24702da Update the TraceContext.Extract docs (1720)
9d4eb1f6 Update dates in CHANGELOG.md for 2021 releases (1723)
2b4fa968 Release v0.19.0 (1710)
4beb7041 sdk/trace: removing ApplyConfig and Config (1693)
1d42be16 Rename WithDefaultSampler TracerProvider option to WithSampler and update docs (1702)
860d5d86 Add flag to determine whether SpanContext is remote (1701)
0fe65e6b Comply with OpenTelemetry attributes specification (1703)
88884351 Bump google.golang.org/api from 0.40.0 to 0.41.0 in /exporters/trace/jaeger (1700)
345f264a breaking(zipkin): removes servicName from zipkin exporter. (1697)
62cbf0f2 Populate Jaeger's Span.Process from Resource (1673)
28eaaa9a Add a test to prove the Tracer is safe for concurrent calls (1665)
8b1be11a Rename resource pkg label vars and methods (1692)
a1539d44 OpenCensus metric exporter bridge (1444)
77aa218d Fix issue #1490, apply same logic as in the SDK (1687)
9d3416cc Fix synchronization issues in global trace delegate implementation (1686)
58f69f09 Span status from HTTP code: Do not set status message if it can be inferred (1681)
9c305bde Flush metric events prior to shutdown in OTLP example (1678)
66b1135a Fix CHANGELOG (1680)
90bd4ab5 Update employer information for maintainers (1683)
36841913 Remove WithRecord() option from trace.SpanOption when starting a span (1660)
65c7de20 Remove trace prefix from NoOp src files. (1679)
e88a091a Make SpanContext Immutable (1573)
d75e2680 Avoid overriding configuration of tracer provider (1633)
2b4d5ac3 Bump github.com/golangci/golangci-lint in /internal/tools (1671)
150b868d Bump github.com/google/go-cmp from 0.5.4 to 0.5.5 (1667)
76aa924e Fix the examples target info messaging (1676)
a3aa9fda Bump github.com/itchyny/gojq from 0.12.1 to 0.12.2 in /internal/tools (1672)
a5edd79e Removed setting error status while recording err as span event (1663)
e9814758 chore(zipkin): improves zipkin example to not to depend on timeouts. (1566)
3dc91f2d Add ForceFlush method to TracerProvider (1608)
bd0bba43 exporter: swap pusher for exporter (1656)
56904859 Update the SimpleSpanProcessor (1612)
a7f7abac SpanStatus description set only when status code is set to Error (1662)
05252f40 Jaeger Exporter: Fix minor mapping discrepancies (1626)
238e7c61 Add non-empty string check for attribute keys (1659)
e9b9aca8 Add tests for propagation of Sampler Tracestate changes (1655)
875a2583 Add docs on when reviews should be cleared (1556)
7153ef2d Add HTTP/JSON to the otlp exporter (1586)
62e2a0f7 Unexport the simple and batch SpanProcessors (1638)
992837f1 Add TracerProvider tests to oteltest harness (1607)
bb4c297e Pre release v0.18.0 (1635)
712c3dcc Fix makefile ci target and coverage test packages (1634)
841d2a58 Rename local var new to not collide with builtin (1610)
13938ab5 Update SpanProcessor docs (1611)
e25503a0 Add compatibility tests to CI (1567)
1519d959 Use reasonable interval in sdktrace.WithBatchTimeout (1621)
7d4496e0 Pass metric labels when transforming to gaugeArray (1570)
6d4a5e0d Bump google.golang.org/grpc from 1.35.0 to 1.36.0 in /exporters/otlp (1619)
a93393a0 Bump google.golang.org/grpc in /example/prom-collector (1620)
e499ca86 Fix validation for tracestate with vendor and add tests (1581)
43886e52 Make timestamps sequential in lastvalue agg check (1579)
37688ef6 revent end-users from implementing some interfaces (1575)
85e696d2 Updating documentation with an working example for creating NewExporter (1513)
562eb28b Unify the Added sections of the unreleased changes (1580)
c4cf1aff Fix Windows build of Jaeger tests (1577)
4a163bea Fix stdout TestStdoutTimestamp failure with sleep (1572)
bd4701eb Stagger timestamps in exact aggregator tests (1569)
b94cd4b2 add code attributes to semconv package (1558)
78c06cef Update docs from gitter to slack for communication (1554)
1307c911 Remove vendor exclude from license-check (1552)
5d2636e5 Bump github.com/golangci/golangci-lint in /internal/tools (1565)
d7aff473 Vendor Thrift dependency (1551)
298c5a14 Update span limits to conform with OpenTelemetry specification (1535)
ecf65d79 Rename otel/label -> otel/attribute (1541)
1b5b6621 Remove resampling on span.SetName (1545)
8da52996 fix: grpc reconnection (1521)
3bce9c97 Add Keys() method to propagation.TextMapCarrier (1544)
0b1a1c72 Make oteltest.SpanRecorder into a concrete type (1542)
7d0e3e52 SDK span no modification after ended (1543)
7de3b58c Remove extra labels types (1314)
73194e44 Bump google.golang.org/api from 0.39.0 to 0.40.0 in /exporters/trace/jaeger (1536)
8fae0a64 Create resource.Default() with required attributes/default values (1507)
76f93422 Release v0.17.0 (1534)
9b242bc4 Organize API into Go modules based on stability and dependencies (1528)
e50a1c8c Bump actions/cache from v2 to v2.1.4 (1518)
a6aa7f00 Bump google.golang.org/api from 0.38.0 to 0.39.0 in /exporters/trace/jaeger (1517)
38efc875 Code Improvement - Error strings should not be capitalized (1488)
6b340501 Update default branch name (1505)
b39fd052 nit: Fix comment to be up-to-date (1510)
186c2953 Fix golint error of package comment form (1487)
9308d662 Bump google.golang.org/api from 0.37.0 to 0.38.0 in /exporters/trace/jaeger (1506)
1952d7b6 Reverse order of attribute precedence when merging two Resources (1501)
ad7b4715 Remove build flags for runtime/trace support (1498)
4bf4b690 Remove inaccurate and unnecessary import comment (1481)
7e19eb6a Bump google.golang.org/api from 0.36.0 to 0.37.0 in /exporters/trace/jaeger (1504)
c6a4406a Bump github.com/golangci/golangci-lint in /internal/tools (1503)
9524ac09 Update workflows to include main branch as trigger (1497)
c066f15e Bump github.com/gogo/protobuf from 1.3.1 to 1.3.2 in /internal/tools (1478)
894e0240 Bump github.com/golangci/golangci-lint in /internal/tools (1477)
71ffba39 Bump google.golang.org/grpc from 1.34.0 to 1.35.0 in /exporters/otlp (1471)
515809a8 Bump github.com/itchyny/gojq from 0.12.0 to 0.12.1 in /internal/tools (1472)
3e96ad1e gitignore: remove unused example path (1474)
c5622777 Histogram aggregator functional options (1434)
0df8cd62 Rename Makefile.proto to avoid interpretation as proto file (1468)
979ff51f Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 (1453)
1df8b3b8 Bump github.com/gogo/protobuf from 1.3.1 to 1.3.2 in /exporters/otlp (1456)
4c30a90a Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /sdk (1455)
5a9f8f6e Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/stdout (1454)
7786f34c Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/trace/zipkin (1457)
4352a7a6 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/otlp (1460)
6990b3b3 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/metric/prometheus (1461)
7af40d22 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/trace/jaeger (1463)
f16f1892 Bump google.golang.org/grpc in /example/otel-collector (1465)
fe363be3 Move Span Event to API (1452)
43922240 Bump google.golang.org/grpc in /example/prom-collector (1466)
0aadfb27 Prepare release v0.16.0 (1464)
207587b6 Metric histogram aggregator: Swap in SynchronizedMove to avoid allocations (1435)
c29c6fd1 Shutdown underlying span exporter while shutting down BatchSpanProcessor (1443)
dfece3d2 Combine the Push and Pull metric controllers (1378)
74deeddd Handle tracestate in TraceContext propagator (1447)
49f699d6 Remove Quantile aggregation, DDSketch aggregator; add Exact timestamps (1412)
9c949411 Rename internal/testing to internal/internaltest (1449)
8d809814 Move gRPC driver to a subpackage and add an HTTP driver (1420)
9332af1b Bump github.com/golangci/golangci-lint in /internal/tools (1445)
5ed96e92 Update exporters/otlp Readme.md (1441)
bc9cb5e3 Switch CircleCI badge to GitHub Actions (1440)
716ad082 Remove CircleCI config (1439)
0682db1e Adding Security Workflows to GitHub Actions (2/2): gosec workflow (1429)
11f732b8 Adding Security Workflows to GitHub Actions (1/2): codeql workflow (1428)
40f1c003 Add Tracestate into the SamplingResult struct (1432)
db06c8d1 Flush metric events before shutdown in collector example (1438)
f6f458e1 Fix golint issue caused by typo in trace.go (1436)
fe9d1f7e Use uint64 Count consistently in metric aggregation (1430)
3a337d0b Bump github.com/golangci/golangci-lint in /internal/tools (1433)
1e4c8321 cleanup: drop the removed examples in gitignore (1427)
5c9221cf Unify endpoint API that related to OTel exporter (1401)
045c3ffe Build scripts: Replace mapfile with read loop for old bash versions (1425)
2def8c3d Add Versioning Documentation (1388)
6bcd1085 Bump github.com/itchyny/gojq from 0.11.2 to 0.12.0 in /internal/tools (1424)
38e76efe Add a split protocol driver for otlp exporter (1418)
439cd313 Add TraceState to SpanContext in API (1340)
35215264 Split connection management away from exporter (1369)
add9d933 Bump github.com/prometheus/client_golang from 1.8.0 to 1.9.0 in /exporters/metric/prometheus (1414)
93d426a1 Add @dashpole as a project Approver (1410)
6fe20ef3 Fix small typo (1409)
b22d0d70 Mention the getting started guide (1406)
3fb80fb2 Fix duplicate checkout action in GitHub workflow (1407)
2051927b Correct CI workflow syntax (1403)
f11a86f7 Fix typo in comment (1402)
bdf87a78 Migrate CircleCI ci.yml workflow to GitHub Actions (1382)
4e59dd1f Bump google.golang.org/grpc from 1.32.0 to 1.34.0 in /example/otel-collector (1400)
83513f70 Bump google.golang.org/api from 0.32.0 to 0.36.0 in /exporters/trace/jaeger (1398)
a354fc41 Bump github.com/prometheus/client_golang from 1.7.1 to 1.8.0 in /exporters/metric/prometheus (1397)
3528e42c Bump google.golang.org/grpc from 1.32.0 to 1.34.0 in /exporters/otlp (1396)
af114baf Call otel.Handle with non-nil errors (1384)
c3c4273e Add RO/RW span interfaces (1360)
Fixes #2591
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
In later versions of OpenTelemetry label.Any() is deprecated. Create
addTag() to handle type assertions of values. Change AddTag() to a
variadic function that accepts multiple keys and values.
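A minimal sketch of the variadic key/value pattern described above, written
in Python for brevity (the runtime itself is Go). The names `add_tag` and
`add_tags`, and the dict standing in for a span's attributes, are
illustrative assumptions and not the Kata or OpenTelemetry API.

```python
from typing import Any, Dict


def add_tag(span_attrs: Dict[str, Any], key: str, value: Any) -> None:
    """Store one value, dispatching on its runtime type (hypothetical helper)."""
    if isinstance(value, (bool, int, float, str)):
        # Types that can be stored directly.
        span_attrs[key] = value
    else:
        # Anything else falls back to its string representation.
        span_attrs[key] = str(value)


def add_tags(span_attrs: Dict[str, Any], *keys_and_values: Any) -> None:
    """Variadic wrapper: accepts key1, value1, key2, value2, ..."""
    if len(keys_and_values) % 2 != 0:
        raise ValueError("expected an even number of arguments (key/value pairs)")
    for i in range(0, len(keys_and_values), 2):
        add_tag(span_attrs, str(keys_and_values[i]), keys_and_values[i + 1])
```

A call such as `add_tags(attrs, "id", 7, "name", "sandbox")` then records
both pairs in one go.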
Fixes #2547
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Raise the `slog` maximum log level feature for release code from `info`
to `debug` by changing the `slog` maximum level features in the shared
`logging` crate. This allows the consumers of the `logging` crate (the
agent, the `trace-forwarder` and the `agent-ctl` tool) to produce debug
output when their debug options are enabled. Currently, those options
will essentially be a NOP (unless using a debug version of the code).
Testing showed that setting the `slog` maximum level features in the
rust manifest files for the consumers of the `logging` crate has no
impact: those values are ignored, so they have been removed and replaced
with a comment stating the levels are set in the `logging` crate.
Fixes: #2966.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Currently the image-builder image is built from `fedora:latest` and
this is error-prone as any update of the base image can lead to
breakage. Instead let's create the image from Fedora 34, which is the
last known version to build fine.
Fixes #2960
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
There are some issues with the Makefile for the runtime:
- the default target can't be used as a dependency of other targets.
- the `check` target is empty.
This also adds two targets for local development/tests:
- lint: run golangci-lint
- pre-commit: run lint and test
Fixes: #2942
Signed-off-by: bin <bin@hyper.sh>
Use `dup3` system call instead of `dup2` in unit tests of seccomp
because `dup2` is obsolete on aarch64.
Fixes: #2939
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
span.End() was only called on the main code path of the shim2 Shutdown
method, so the "Shutdown" span was kept alive when the number of
containers was not 0. This PR makes sure the "Shutdown" trace span is
always ended correctly.
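The general idea can be sketched as follows, in Python for brevity (the shim
is Go, where `defer span.End()` plays the same role). The `Span` class and
the `shutdown` function are hypothetical stand-ins, and the early return for
a non-zero container count is an assumption about the control flow.

```python
class Span:
    """Tiny stand-in for a trace span (not the OpenTelemetry API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ended = False

    def end(self) -> None:
        self.ended = True


def shutdown(container_count: int, span: Span) -> bool:
    """Return True if the shim would exit, False if containers remain."""
    try:
        if container_count != 0:
            # Early return: containers are still running.
            return False
        return True
    finally:
        # Runs on every return path, so the span is never left open.
        span.end()
```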
Fixes: #2930
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Add the -failfast option so that the tests exit on the first error.
Because -failfast does not apply across packages, a for loop is used to
test each package in src/runtime in turn, with the parallel number set
to 1. This may make the tests slower.
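A rough sketch of the per-package loop, in Python for brevity (the actual
change lives in the runtime's build scripts). The function name and the
injected `run` callable are assumptions for illustration. The option used
to set the parallel number is not named above, so it is left out here.

```python
from typing import Callable, Iterable, List


def run_tests_failfast(packages: Iterable[str],
                       run: Callable[[List[str]], int]) -> int:
    """Test packages one at a time and stop at the first failing package."""
    for pkg in packages:
        # -failfast only stops within a single package, hence the outer loop.
        rc = run(["go", "test", "-failfast", pkg])
        if rc != 0:
            return rc
    return 0
```

Passing `subprocess.call` as `run` would execute the commands for real.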
Fixes: #1997
Signed-off-by: bin <bin@hyper.sh>
It is safer to download the tarballs and work in a temporary directory
which can be properly cleaned up when the script finishes.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
If DESTDIR is set in the environment then gperf will be installed
in an unexpected directory, resulting in libseccomp's configure
not being able to find it. To avoid that issue, this changes
ci/install_libseccomp.sh so that PREFIX and DESTDIR are unset
inside the script.
Fixes #2932
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The functions in the virtcontainers API document were out of sync with the Sandbox and VCImpl code.
There are also two functions named `CreateSandbox` that differ by only one parameter,
which is very confusing. This PR syncs the API document with the code.
Fixes: #2928
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
The `kata-agent` binaries inside the Kata Containers images provided
with the release are statically linked with the GNU LGPL-2.1 licensed
libseccomp library by default.
Therefore, we attach the complete source code of libseccomp
to the release page in order to comply with the LGPL-2.1 (6(a)).
In addition, we add a description of the libseccomp license
to the release page.
Fixes: #2922
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Cloud Hypervisor only supports virtio-blk. This PR removes comments
that wrongly refer to other features that are not supported
by clh.
Fixes #2924
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Allow the `agent-ctl` tool to connect to a Hybrid VSOCK hypervisor such
as Cloud Hypervisor or Firecracker.
Fixes: #2914.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Update the trace forwarder README to remove the quotes around the socket
path, which makes manipulating that path easier.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Current handling of read-only mounts is a little tricky.
However, a clearer solution can be used here:
1. make a private ro bind mount at privateDest to the mount source
2. make a bind mount at mountDest to the mount created in step 1
3. umount the private bind mount created in step 1
One important aspect is that the mount in step 2 is duplicated from
the one we created in step 1. So the MS_RDONLY flag is properly
preserved in all mounts created in the propagation.
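The three steps can be sketched as a sequence of mount(8)/umount(8)
invocations. This is an illustration in Python, not the runtime's Go code:
the function name is hypothetical, and breaking step 1 into bind, read-only
remount and make-private is an assumption about how a "private ro bind
mount" can be set up.

```python
from typing import List


def readonly_bind_mount_commands(source: str, private_dest: str,
                                 mount_dest: str) -> List[List[str]]:
    """Return the commands for the three steps, in order (nothing is run)."""
    return [
        # Step 1: private read-only bind mount of the source at private_dest.
        ["mount", "--bind", source, private_dest],
        ["mount", "-o", "remount,bind,ro", private_dest],
        ["mount", "--make-private", private_dest],
        # Step 2: bind mount_dest to the mount created in step 1.
        ["mount", "--bind", private_dest, mount_dest],
        # Step 3: remove the temporary private mount.
        ["umount", private_dest],
    ]
```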
Fixes: #2205
Depends-on: github.com/kata-containers/tests#4106
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
Pull #2795 recently added support for a closer-to-OCI behaviour for
VFIO devices, in which they appear to the container as VFIO devices,
rather than being interpreted by the guest kernel. However, in order
to use this, the Kata guest kernel needs to include the VFIO PCI
driver, along with dependencies like the Intel IOMMU driver.
The kernel as built by the scripts within Kata doesn't currently include
those, so this patch adds them.
fixes #2913
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
- convert linux field from oci spec to grpc spec
- include all the fields below linux oci spec
Fixes: #2715
Signed-off-by: Da Li Liu <liudali@cn.ibm.com>
In order to pass CI test of aarch64, it is necessary to run
`ci/install_libseccomp.sh` before running unit tests in
`jenkins_job_build.sh`.
However, `ci/install_libseccomp.sh` is not available
until PR #1788 including this commit is merged in the mainline.
Therefore, we disable seccomp feature on aarch64 temporarily.
After #1788 lands and CI is fixed, this commit will be reverted.
Fixes: #1476
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
This adds explanation about how to enable seccomp in the kata-runtime and
build the kata-agent with seccomp capability.
Fixes: #1476
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
This adds a step for installing libseccomp because the kata-agent
supports seccomp feature.
Fixes: #1476
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
The osbuilder needs to set up libseccomp library to build the kata-agent
because the kata-agent supports seccomp currently.
The library is built from the sources to create a static library for musl libc.
In addition, environment variables for the libseccomp crate are set to
link the library statically.
Fixes: #1476
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
The kata-agent supports seccomp feature based on the OCI runtime specification.
This seccomp capability in the kata-agent is enabled by default.
However, it is not enforced by default: users need to enable that by setting
`disable_guest_seccomp` to `false` in the main configuration file.
Fixes: #1476
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
All endpoint names share the `Request` suffix.
Also, the current list is based on functions, not requests.
Fixes #2916
Reported-by: Jakob Naucke <jakob.naucke@ibm.com>
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
There is a problem with slash-command-action: in the absence of a slash command
the job fails (instead of simply being ignored, i.e., skipped). This is documented in
https://github.com/xt0rted/slash-command-action/issues/124. There is a workaround
also documented in that issue, but here instead let's get rid of the action.
In this new implementation all comments sent to the pull request are parsed; if any
starts with "/test_kata-deploy" then the job is triggered.
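The matching rule amounts to a prefix check on each comment body. A minimal
sketch in Python (the actual implementation is part of the GitHub workflow;
the function name here is hypothetical):

```python
from typing import Iterable

TRIGGER = "/test_kata-deploy"


def should_run_job(comments: Iterable[str]) -> bool:
    """Return True if any comment body starts with the trigger command."""
    return any(body.startswith(TRIGGER) for body in comments)
```

Unlike the previous action, comments without the command simply yield
`False`, so the job is skipped rather than failed.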
Fixes #2836
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
This changed valid() in hypervisor to check the case where both
initrd and image path are set; in this case it returns an error.
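A minimal sketch of that check, in Python for brevity (the runtime is Go).
The class name, field names and error message are illustrative assumptions.
Only the "both set" case described above is modelled.

```python
from dataclasses import dataclass


@dataclass
class HypervisorConfig:
    """Hypothetical stand-in for the hypervisor configuration."""
    image_path: str = ""
    initrd_path: str = ""

    def valid(self) -> None:
        # An image and an initrd are mutually exclusive.
        if self.image_path and self.initrd_path:
            raise ValueError("image and initrd path cannot both be set")
```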
Fixes #1868
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Fix `-l <log-level>` for the trace forwarder which didn't work
previously as it lacked the magic Cargo configuration.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Hybrid VSOCK requires `root` privileges to access the sandbox-specific
host-side AF_UNIX socket created by the hypervisor (CLH or FC). However,
once the socket has been bound, privileges can be dropped, allowing the
forwarder to run as user `nobody`.
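The bind-then-drop ordering can be sketched as below, in Python for brevity
(the forwarder itself is written in Rust). The function name and parameters
are assumptions. The caller is expected to look up the numeric IDs of the
unprivileged user beforehand.

```python
import os
import socket


def bind_then_drop_privileges(sock: socket.socket, path: str,
                              uid: int, gid: int) -> None:
    """Bind while still privileged, then switch to an unprivileged user."""
    # Binding needs the original (root) privileges.
    sock.bind(path)
    # Drop supplementary groups and the group ID first: once the user ID
    # has been changed, the process may no longer change them.
    os.setgroups([])
    os.setgid(gid)
    os.setuid(uid)
```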
Fixes: #2905.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The oci_to_grpc function handles only some of the OCI fields;
others are not copied from the OCI spec to the gRPC spec,
such as process.env, process.capabilities, mounts and so on.
Implement more handling to convert those fields.
Fixes #2686
Signed-off-by: Lei Li <cdlleili@cn.ibm.com>
Rather than generating a potentially misleading error message if the
socket bind fails, perform an explicit check for `root` for Hybrid
VSOCK.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Updated the trace forwarder README to ensure the real socket path is
created, not the template socket path returned by `kata-runtime env`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
- Install OpenSSL for key generation in kernel build
- Do not install libpmem
- Do not exclude `*/share/*/*.img` files in QEMU tarball since among
them are boot loader files critical for IPLing.
Fixes: #2895
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Show available guest protections in the
`kata-runtime env` output. Also bump the formatVersion.
Fixes: #1982
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
Add functions to return guestProtection as a string slice, which
can then be used in `kata-runtime env` output.
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
uevents with action=remove were ignored, causing the agent to reuse stale
data in the device map. This patch adds handling of such uevents.
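The effect on the device map can be sketched as follows, in Python for
brevity (the agent is written in Rust). The function name and the use of a
plain dict keyed by device path are assumptions for illustration.

```python
from typing import Dict


def handle_uevent(device_map: Dict[str, str], action: str,
                  devpath: str, devname: str = "") -> None:
    """Keep the device map in step with add and remove uevents."""
    if action == "add":
        device_map[devpath] = devname
    elif action == "remove":
        # Without this branch the old entry would linger and be reused.
        device_map.pop(devpath, None)
```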
Fixes #2405
Signed-off-by: Haitao Li <lihaitao@gmail.com>
On a conventional (e.g. runc) container, passing in a VFIO group device,
/dev/vfio/NN, will result in the same VFIO group device being available
within the container.
With Kata, however, the VFIO device will be bound to the guest kernel's
driver (if it has one), possibly appearing as some other device (or a
network interface) within the guest.
This adds a new `vfio_mode` option to alter this. If set to "vfio" it will
instruct the agent to remap VFIO devices to the VFIO driver within the
guest as well, meaning they will appear as VFIO devices within the
container.
Unlike a runc container, the VFIO devices will have different names to the
host, since the names correspond to the IOMMU groups of the guest and those
can't be remapped with namespaces.
For now we keep 'guest-kernel' as the value in the default configuration
files, to maintain current Kata behaviour. In future we should change this
to 'vfio' as the default. That will make Kata's default behaviour more
closely resemble OCI specified behaviour.
Fixes: #693
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently constrainGRPCSpec always removes VFIO devices from the OCI
container spec which will be used for the inner container. For
upcoming support for VFIO devices in DPDK usecases we'll need to not
do that.
As a preliminary to that, add an extra parameter to the function to
control whether or not it will remove the VFIO devices from the spec.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
"constraint" is a noun, "constrain" is the associated verb, which makes
more sense in this context.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In order to support DPDK workloads, we need to change the way VFIO devices
will be handled in Kata containers. However, the current method, although
it is not remotely OCI compliant, has real uses. Therefore, introduce a new
runtime configuration field "vfio_mode" to control how VFIO devices will be
presented to the container.
We also add a new sandbox annotation -
io.katacontainers.config.runtime.vfio_mode - to override this on a
per-sandbox basis.
For now, the only allowed value is "guest-kernel" which refers to the
current behaviour where VFIO devices added to the container will be bound
to whatever driver in the VM kernel claims them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add and adjust the vfio devices in the inner container spec so that
rustjail will create device nodes for them.
In order to do that, we also need to make sure the VFIO device node is
ready within the guest VM first. That may take (slightly) longer than
just the underlying PCI device(s) being ready, because vfio-pci needs
to initialize. So, add a helper function that will wait for a
specific VFIO device node to be ready, using the existing uevent
listening mechanism. It also returns the device node name for the
device (though in practice it will always be /dev/vfio/NN where NN is the
group number).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Many device nodes go directly under /dev, however some are conventionally
placed in subdirectories under /dev. For example /dev/vfio/vfio or
/dev/pts/ptmx.
Currently, attempting to pass such a device into a Kata container will fail
because mknod() will get an ENOENT because the parent directory is missing
(or an equivalent error for bind_dev()).
Correct that by making subdirectories as necessary in create_devices().
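A minimal sketch of the idea, assuming the device path has already been made relative to the container rootfs. The helper name `ensure_parent_dir` and its signature are hypothetical and not taken from rustjail.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: create the parent directories of a device node
/// (e.g. "dev/vfio" for "dev/vfio/vfio") under `rootfs` before the node
/// itself is created, so that the later mknod() does not hit ENOENT.
fn ensure_parent_dir(rootfs: &Path, rel_dev_path: &Path) -> io::Result<()> {
    if let Some(parent) = rel_dev_path.parent() {
        // create_dir_all() succeeds if the directory already exists.
        fs::create_dir_all(rootfs.join(parent))?;
    }
    Ok(())
}
```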
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For each user supplied device, create_devices() checks that the given path
actually is in /dev, by checking that its path starts with /dev and does
not contain "..".
However, this has subtle errors because it's interpreting the path as a raw
string without considering separators. It will accept the path /devfoo
which it should not, while it will not accept the valid (though weird)
paths /dev/... and /dev/a..b.
Correct this by using std::path::Path methods designed for the purpose.
Having done this, it's trivial to also generate the relative path that
mknod_dev() or bind_dev() will need, so do that at the same time.
We also move this logic into a helper function so that we can add some unit
tests for it.
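The check described above can be sketched with `std::path::Path` as follows. The function name and exact shape are assumptions made for illustration. The point is that `starts_with()` and `components()` compare whole path components rather than raw string prefixes.

```rust
use std::path::{Component, Path, PathBuf};

/// Illustrative helper: if `path` names something strictly below /dev and
/// has no ".." component, return it relative to "/", otherwise None.
fn dev_rel_path(path: &str) -> Option<PathBuf> {
    let path = Path::new(path);
    if !path.starts_with("/dev")
        || path == Path::new("/dev")
        || path.components().any(|c| c == Component::ParentDir)
    {
        return None;
    }
    // Stripping the root component yields the relative path needed later.
    path.strip_prefix("/").ok().map(Path::to_path_buf)
}
```

With this approach /devfoo is rejected because "devfoo" is a different component from "dev", while /dev/... and /dev/a..b are accepted because neither contains a component that is exactly "..".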
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Both these functions take the absolute path from LinuxDevice and drop the
leading '/' to make a relative path. They do that with a simple
&dev.path[1..]. That can be technically incorrect in some edge cases such
as a path with redundant /s like "//dev//sda".
To handle cases like that, have the explicit relative path passed into
these functions. For now we calculate it in the same buggy way, but we'll
fix that shortly.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
create_devices() within the rustjail module is responsible for creating
device nodes within the (inner) containers. Errors that occur here will
be propagated up, but are likely to be low level failures of mknod() - e.g.
ENOENT or EACCES - which won't be very useful when reported all the way up
to the runtime without the context of what we were trying to do.
Add some anyhow context information giving the details of the device we
were trying to create when it failed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently, update_spec_device() assumes that the proper device path in the
(inner) container is the same as the device path specified in the outer OCI
spec on the host.
Usually that's correct. However for VFIO group devices we actually need
the container to see the VM's device path, since it's normal to correlate
that with IOMMU group information from sysfs which will be different in the
guest and which we can't namespace away.
So, add an extra "final_path" parameter to update_spec_device() to allow
callers to choose the device path that should be used for the inner
container. All current callers pass the same thing as container_path, but
that will change in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
update_spec_device_list() is used to update the container configuration to
change device major/minor numbers configured by the Kata client based on
host details to values suitable for the sandbox VM, which may differ. It
takes a 'device' object, but the only things it actually uses from there
are container_path and vm_path.
Refactor this as update_spec_device(), taking the host and guest paths to
the device as explicit parameters. This makes the function more
self-contained and will enable some future extensions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Each VFIO device passed into the guest could represent a whole IOMMU group
of devices on the host. Since these devices aren't DMA isolated from each
other, they must appear as the same IOMMU group in the guest as well.
The VMM should enforce that for us, but double check it, since things can't
work otherwise. This also means we determine the guest IOMMU group for the
VFIO device, which we'll be needing later.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For upcoming VFIO extensions we'll need to work with the IOMMU groups of
VFIO devices. This helps us towards that by adding pci_iommu_group() to
retrieve the IOMMU group (if any) of a given PCI device.
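One way such a lookup can work is by reading the `iommu_group` symlink that Linux exposes in a PCI device's sysfs directory. The sketch below assumes that layout. The function name `iommu_group_of` and its signature are hypothetical and chosen so as not to be mistaken for the agent's actual `pci_iommu_group()`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: given a device's sysfs directory (for example
/// /sys/bus/pci/devices/<address>), return its IOMMU group number, or
/// None if the device has no `iommu_group` link.
fn iommu_group_of(dev_sysfs_dir: &Path) -> io::Result<Option<u32>> {
    let target = match fs::read_link(dev_sysfs_dir.join("iommu_group")) {
        Ok(t) => t,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    // The last component of the link target is the group number.
    let name = target
        .file_name()
        .and_then(|n| n.to_str())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "bad iommu_group link"))?;
    name.parse::<u32>()
        .map(Some)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```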
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
VFIO devices can be added to a Kata container and they will be passed
through to the sandbox guest. However, inside the guest those devices
will bind to a native guest driver, so they will no longer appear as VFIO
devices within the guest. This behaviour differs from runc or other
conventional container runtimes.
This code allows the agent to match the behaviour of other runtimes,
if instructed to by kata-runtime. VFIO devices it's informed about
with the "vfio" type instead of the existing "vfio-gk" type will be
rebound to the vfio-pci driver within the guest.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For better VFIO support, we're going to need to take control of which guest
driver controls specific guest devices. To assist with that, add the
pci_driver_override() function to force a specific guest device to be
bound to a specific guest driver.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use `"c".to_string()` in the device type of `dev/full`
in order to be consistent with the coding style of other devices.
Fixes: #2890
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
create_tmpfs fails because of a race condition in the watcher umount. Quoting
James's words here:
1. Rust runs all tests in parallel.
2. Mounts are a process-wide, not a per-thread resource.
The only test that calls watcher.mount() is create_tmpfs().
However, other tests create BindWatcher objects.
3. BindWatcher's drop() implementation calls self.cleanup(),
which calls unmount for the mountpoint create_tmpfs() asserts.
4. The other tests are calling unmount whenever a BindWatcher goes
out of scope.
To avoid that issue, let the tests using BindWatcher in watcher and
sandbox.rs run sequentially.
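One standard-library way to force such tests to run one at a time is a process-wide lock that each affected test takes first. This is a sketch of the general technique only. The actual change may rely on a different mechanism (for example a test-serialization crate), and the names below are hypothetical.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical process-wide lock shared by every test that creates a
// BindWatcher or touches mounts.
static MOUNT_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Take the lock at the top of a test. The guard is released when it goes
/// out of scope, i.e. after the test's own objects have been dropped if the
/// guard is the first local declared.
fn serialize_mount_tests() -> MutexGuard<'static, ()> {
    // A panicking test poisons the mutex; the lock itself is still usable.
    MOUNT_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

A test would start with `let _guard = serialize_mount_tests();` so that no other test holding the same lock can unmount underneath it.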
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
DefaultMaxVCPUs may be larger than defaultMaxQemuVCPUs; that case should
be checked for and avoided.
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The physical current vcpu number should not be used directly as the
largest vcpu number is limited to defaultMaxQemuVCPUs.
Here, a new helper is introduced in pkg/katautils/config.go to get
current vcpu number.
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The current kernel version parsing library can't process a '+' suffix.
Because a modified kernel appends '+' to its version, a panic will occur.
For example, if the current kernel version is "5.14.0-rc4+", test
TestHostNetworkingRequested will panic:
--- FAIL: TestHostNetworkingRequested (0.00s)
panic: &{DistroName:ubuntu DistroVersion:18.04
KernelVersion:5.11.0-rc3+ Issue: Passed:[] Failed:[] Debug:true
ActualEUID:0}: failed to check test constraints: error: Build meta data
is empty
Here, remove the suffix '+' in kernel version fix helper.
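The normalisation step amounts to dropping the trailing '+' before the version string reaches the parser. The real helper lives in the Go test utilities. The Rust function below is a hypothetical illustration of the same logic, not a translation of that code.

```rust
/// Hypothetical helper: drop a single trailing '+' from a kernel version
/// string so that a strict version parser does not see empty build metadata.
fn strip_plus_suffix(version: &str) -> &str {
    let v = version.trim();
    v.strip_suffix('+').unwrap_or(v)
}
```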
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Last of a series of commits to export the top level
hypervisor generic methods.
s/createSandbox/CreateVM
Fixes: #2880
Signed-off-by: Manohar Castelino <mcastelino@apple.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Export commonly used hypervisor fields and utility functions.
These need to be exposed to allow the hypervisor to be consumed
externally.
Note: This does not change the hypervisor interface definition.
Those changes will be separate commits.
Signed-off-by: Manohar Castelino <mcastelino@apple.com>
This is to bump the OOT QAT 1.7 driver version to the
latest version. I did a test on my QAT enabled system and
everything functioned as expected.
Fixes: #2877
Signed-off-by: Eric Adams <eric.adams@intel.com>
Highlights from the Cloud Hypervisor release v19.0: 1) Improved PTY
handling for serial and virtio-console; 2) PCI boot time optimisations;
3) Improved TDX support; 4) Live migration enhancements (support with
virtio-mem and virtio-balloon); 5) virtio-mem support with vfio-user; 6)
AArch64 for virtio-iommu; 7) Various bug fixes for live-migration and
VFIO passthrough.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v19.0
Fixes: #2871
Signed-off-by: Bo Chen <chen.bo@intel.com>
The upstream kernel SGX support has changed drastically since
the initial version of the Intel SGX use case doc was written.
The updated use case documents how to easily setup SGX with
Kata Containers running in a Kubernetes cluster.
Fixes: #2811
Depends-on: github.com/kata-containers/tests#4079
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Co-authored-by: James O. D. Hunt <james.o.hunt@intel.com>
This PR includes these optimizations:
- Remove the dependency on the container engine.
The old code uses runc to generate config.json and
Docker to export the rootfs, which is heavy and needs
additional dependencies.
Using a fixed config for the busybox image avoids
the heavy processing above.
- Moved duplicate code to pkg/katatestutils package
Fixes: #2752
Signed-off-by: bin <bin@hyper.sh>
cri-containerd project has been merged into containerd repo, and
we should not reference it any more in code and docs.
This commit will use containerd package instead of cri-containerd
package.
Fixes: #2791
Signed-off-by: bin <bin@hyper.sh>
This commit will add containerd to versions.yaml.
Note that for now both containerd and cri-containerd are
in versions.yaml.
After the kata-containers/tests repo is updated, cri-containerd
should be removed.
Fixes: #2791
Signed-off-by: bin <bin@hyper.sh>
When building with the dracut method, build_rootfs_distro() is not called and, in turn,
detect_rust_version() isn't either, so the install_rust.sh script is given a null
Rust version. This changes the script to call detect_rust_version() right before
install_rust.sh.
Related to commit: f34f67d610
Fixes: #2862
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Add `libseccomp` and `gperf` version information to support
the seccomp feature in the Kata agent: #1788.
Fixes: #2858
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
When running the TestIoCopy test, on some occasions, the test
runs too quickly and closes the stdin pipe before the ioCopy()
routine starts to read from it. This causes a SIGSEGV error.
To fix this issue, I am adding additional read/write tests before
closing the pipes. As the read operation waits for the writer to
be done, this actually synchronizes the threads and makes sure
the final tests (with closed pipes) work as expected.
Fixes: #2042
Signed-off-by: Julien Ropé <jrope@redhat.com>
and update the script in `ci/` accordingly.
When only parts of the Kata Containers repositories are checked out
(e.g. when building with Snap) and no Rust version is provided in
calling `install_rust.sh`, the scripts will attempt to clone the
appropriate repos to read the version, which will fail because the
directories already exist. Since we have read the version already, we
can just specify it.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
The agent build inside a Docker or Podman container has been re-enabled,
but we have since introduced the `$CI` environment variable. Pass it to
avoid checking out the tests repo to main when there is a dependency.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
or Podman. This is a partial revert of
76c18aa345. The rationale behind that
commit was the fact that the agent could not be built on Alpine, and
then this capability was removed altogether. The issue in Alpine has
since been resolved (see
https://github.com/kata-containers/osbuilder/issues/386). At the same
time, this ensures being able to run a glibc agent on hosts with distros
more recent than the osbuilder distro used (i.e. as of now, when you
build the agent on the host, and its glibc is newer than the one used in
the guest, the agent may encounter unresolved symbols).
Fixes: #2398
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
The tracing tags for api.go contain `"packages"` as a tag name,
whereas all other tags contain `"package"`.
Fixes: #2847
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Display a pseudo path to the sandbox socket in the output of
`kata-runtime env` for those hypervisors that use Hybrid VSOCK.
The path is not a real path since the command does not create a sandbox.
The output includes a `{ID}` tag which would be replaced with the real
sandbox ID (name) when the sandbox was created.
This feature is only useful for agent tracing with the trace forwarder
where the configured hypervisor uses Hybrid VSOCK.
Note that the feature required a new `setConfig()` method to be added
to the `hypervisor` interface. This isn't normally needed as the
specified hypervisor configuration passed to `setConfig()` is also
passed to `createSandbox()`. However the new call is required by
`kata-runtime env` to display the correct socket path for Firecracker.
The new method isn't wholly redundant for the main code path though as
it's now used by each hypervisor's `createSandbox()` call.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add support for Hybrid VSOCK. Unlike standard vsock (`vsock(7)`), under
hybrid VSOCK, the hypervisor creates a "master" *UNIX* socket on the
host. For guest-initiated VSOCK connections (such as the Kata agent uses
for agent tracing), the hypervisor will then attempt to open a VSOCK
port-specific variant of the socket which it expects a server to be
listening on. Running the trace forwarder with the new `--socket-path`
option and passing it the Hypervisor specific master UNIX socket path,
the trace forwarder will listen on the VSOCK port-specific socket path
to handle Kata agent traces.
For further details and examples, see the README or run the
trace forwarder with `--help`.
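To make the "port-specific variant" concrete, the sketch below derives a per-port socket path from the master socket path by appending `_<port>`. That naming is an assumption, modelled on the convention Firecracker documents for hybrid VSOCK. The function name, example path and port number are hypothetical and are not taken from the trace forwarder.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: build the UNIX socket path a server would listen on
/// for guest-initiated connections to `port`, assuming the hypervisor looks
/// for "<master socket path>_<port>".
fn port_socket_path(master: &Path, port: u32) -> PathBuf {
    let mut s = master.as_os_str().to_os_string();
    s.push(format!("_{}", port));
    PathBuf::from(s)
}
```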
Fixes: #2786.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
.dockerignore file is similar to .gitignore and serves the purpose to
simply ignore paths in the build context.
For now, let me just use it to fix the following problem:
```
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz .
error checking context: 'no permission to read from
'(...)/local-build/build/firecracker/builddir/firecracker/(...)/crc64-1.0.0/.gitignore''.
```
Fixes: #2845
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Adding a route that already exists should not be a reason for the agent to fail
booting, thus preventing the sandbox from starting.
Fixes: #2712
Signed-off-by: zhaojizhuang <571130360@qq.com>
- kata-monitor: add index page
- clh: Refine the usage of guest console and kernel parameters with Cloud Hypervisor
- agent: exec should inherit container process capabilities
- GitHubActions: fix invalid format of require-pr-porting-labels.yaml
- agent: flush root span before process finish
- Extend PCI submodules to represent non-zero functions and addresses
- packaging/kernel: Add CONFIG_PCI_MMCONFIG to x86 guest kernel configuration
- runtime: don't start shim management server in tests
- qemu: use GitLab repos instead of qemu.org
- runtime: optimize code for managing temp users for rootless mode
- Agent configuration file and API restriction
- Delete file virtcontainers-setup.sh
- vendor: Update containerd to v1.5.7
- runtime: Optimize func noNeedForOutput and add test cases
- runtime: Fix !x86 static checks
- #2676: fixing centos gpg key url for ppc64le
- Pass the host route IP family to the guest
- cmd: get return value for setCPUtype
- packaging: Configure QEMU with --enable-pie
- clh: Enable guest userland output
- cmd: Fix mismatched types in testModuleData
- runtime: update .gitignore to ignore monitor_address file
- runtime: fix the make check-go-static command error
- virtcontainers: clean up useless code
- Remove forced PCI rescans from agent
- kernel: Enable SGX in experimental kernel.
- runtime: fix nil reference in cleanup rootless user
- qemu: prepare to upgrade qemu version to 6.1.0 for arm
- kata-monitor (minor) improvements
- virtcontainers: Fix incorrect scripts path
- runtime: clear virtcontainers cgroup duplicated function
- Kata monitor: cache improvements
- virtiofs: fix error report in TestVirtiofsdStart when go test running
176dee6f agent: exec should inherit container process capabilities
7b2bfd4e virtcontainers: clh: Use 'quiet' as the default kernel parameter
3e24e46c virtcontainers: clh: Turn-off serial and virtio-console by default
2d7b65e8 agent: flush root span before process finish
5c77cc2c runtime: don't start shim management server in tests
72044180 agent/device: Return PCI address from wait_for_pci_device()
e50b05d9 agent/pci: Add type to represent PCI addresses
8528157b agent/pci: Extend Slot type to represent PCI function as well
bf8f582c runtime: optimize code for managing temp users for rootless mode
a9c2a4ba GitHubActions: fix invalid format of require-pr-porting-labels.yaml
c4236cb2 packaging/kernel: Add CONFIG_PCI_MMCONFIG to x86 guest kernel configuration
08360c98 agent: Add an agent configutation file example
8a4e69d2 agent: rpc: Return UNIMPLEMENTED for not allowed endpoints
0ea2e3af agent: config: Allow for building the configuration from a file
63539dc9 agent: config: Add allowed endpoints
a953fea3 agent: config: Simplify configuration creation
b888edc2 agent: config: Implement Default
7eac2ec7 protection: add confidential compute frame for arm
8acfc154 check: fix typecheck failure in qemu_arm64_test.go
5b02d54e virtcontainers: fix lint failure on ppc64le
ff9728f0 virtcontainers: nolint guestProtection
5c138c8f runtime: Fix field alignment on s390x
191d0016 vendor: Update containerd to v1.5.7
f7f6bd01 kata-monitor: add index page
a44cde7e agent: netlink: Use the grpc IP family field when updating the route
71ce6cfe runtime: Pass the route IP family to the agent
99450bd1 agent: protos: Add a Family field to the Route payload
f85fe702 runtime: vendor: Bump the netlink package dependency
e439cec7 cmd: fix field alignment on ppc64le
e5159ea7 cmd: get return value for setCPUtype
2ce8d426 clh: Suppress hypervisor output to make guest output visible
cd1064b1 packaging: Configure QEMU with --enable-pie
762922a5 runtime: delete func ConstraintsToVCPUs
4f485430 runtime: delete virtcontainers-setup.sh
80f6b977 osbuilder: fixing centos gpg key url for ppc64le
bb99bfb4 runtime: fix the make check-go-static command error
870771d7 runtime: update .gitignore to ignore monitor_address file
18bff584 runtime: Optimize func noNeedForOutput and add test cases
e5fe53f0 runtime: fix nil reference in cleanup rootless user
2304a596 runtime: set the sandbox storage path static
315295e0 runtime: rename GetSanboxesStoragePath() --> GetSandboxesStoragePath()
13e65f2e cmd: Fix mismatched types in testModuleData
da42cbc0 actions: Build experimental kernel on kata-deploy push action
dffc5092 kernel: Enable SGX in experimental kernel.
ff6a677d kernel-build: Enable multiple config types.
90046964 experimental-kernel: bump 5.13.10
1fbb7304 build: kata-deploy kernel experimental
907459c1 agent/device: Don't force PCI rescans
75f426dd agent: Simplify do_add_swap()
aad1a873 runtime/device: Give the agent information about VFIO devices
ebd7b618 runtime: Don't repeat GetDeviceByID between appendDevices() and append*()
ad45c52f runtime/device: Record guest PCI path for VFIO devices
5c2af3e3 runtime/device: Refactor hotplugVFIODevice() to have common exit path
8bc71105 agent/device: Add device type for VFIO devices
f7a27075 agent: Move driver type constants into device.rs
5b1eb08b agent/uevent: Improve logging of wait_for_uevent()
cf36fd87 runtime: Fix some leftover go fmt errors
6d94957a kernel: reduce alignment size of memory hotplug to 128M
48090f62 qemu: disable plug on arm64 when pie is added
57e3712d virtiofs: fix error report in TestVirtiofsdStart when go test running
8b0bc1f4 kata-monitor: bump version to 0.2.0
bfb556d5 kata-monitor: refresh kata sandbox list on fs events
0e854f3b kata-monitor: improve detection of kata workloads
80463b44 qemu: use GitLab repos instead of qemu.org
3b0c4bf9 runtime: clear virtcontainers cgroup duplicated function
afad910d kata-monitor: add getSandboxFS()
e38686f7 runtime: add GetSandboxesStoragePath()
245a12bb kata-monitor: improve sandbox caching
fc067d61 kata-monitor: warn when unable to retrive the lower level runtime
53ec4df9 kata-monitor: minor fixes
47516988 virtcontainers: Fix incorrect scripts path
814cea96 virtcontainers: clean up useless code
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The 'quiet' kernel parameter can avoid guest kernel logs while booting,
which can reduce boot time.
Fixes: #2820
Signed-off-by: Bo Chen <chen.bo@intel.com>
We will need to have console output from the guest only for debugging
purposes. As a result, we can turn-off both the serial and
virtio-console devices by default for better boot time.
Fixes: #2820
Signed-off-by: Bo Chen <chen.bo@intel.com>
Variables in rust will be dropped at the end of the function.
In function real_main the trace will be shut down by `tracer::end_tracing()`,
but at this time the root span is in an active state, so this root span
will not be sent to the trace collector.
This can be fixed by dropping the root span manually.
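The ordering problem can be reproduced with a toy model in which a span reports itself when dropped and the tracer stops accepting spans once shut down. Everything below (`Span`, `end_tracing`, `run`) is a stand-in written for this sketch and is not the agent's tracing code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Toy span: records an event when it is dropped (i.e. when it finishes).
struct Span {
    name: &'static str,
    log: Log,
}

impl Drop for Span {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("span {} finished", self.name));
    }
}

/// Toy stand-in for shutting the tracer down.
fn end_tracing(log: &Log) {
    log.borrow_mut().push("tracer shut down".to_string());
}

/// Returns the order of events with or without the explicit drop.
fn run(drop_root_first: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let root = Span { name: "root", log: Rc::clone(&log) };
        if drop_root_first {
            drop(root); // finish the root span while the tracer is still up
            end_tracing(&log);
        } else {
            end_tracing(&log);
            // `root` is only dropped at the end of this scope, too late.
        }
    }
    let events = log.borrow().clone();
    events
}
```

Without the explicit `drop`, the span finishes after the shutdown event, which corresponds to the root span never reaching the collector.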
Fixes: #2812
Signed-off-by: bin <bin@hyper.sh>
The variable for 'name' in config-settings.go.in was previously
hardcoded as "kata". In e7c42fb it was changed to the runtime name,
which is "kata-runtime". Add a variable to specify a syslog identifier
for consistency for tests and documentation that use it.
Fixes: #2806
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Update the sandbox dir clean up logic to be more appropriate.
Add different seeds for the randInt() method.
Fixes: #2770
Signed-off-by: Feng Wang <feng.wang@databricks.com>
This patch adds an option "disable_seccomp" to the config
hypervisor.clh, from which users can disable the `seccomp`
feature from Cloud Hypervisor when needed (for debugging purposes).
Fixes: #2782
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch enables the `seccomp` feature from Cloud Hypervisor which
provides fine-grained allowed syscalls for each of its worker
threads. It brings important security benefits, but would increase the
memory footprint.
Fixes: #2782
Signed-off-by: Bo Chen <chen.bo@intel.com>
The shim management server runs in a goroutine. In test mode,
this causes the directory containing the listen socket
file (/run/vc/sbs/777-77-77777777/shim-monitor.sock) to be leaked
after the tests finish.
Fixes: #2805
Signed-off-by: bin <bin@hyper.sh>
wait_for_pci_device() waits for the PCI device at the given path to become
ready, but it doesn't currently give you any meaningful handle on that
device.
Change the signature, so that it returns the PCI address of the device.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a new pci::Address type which represents a guest PCI address in
DDDD:BB:SS.F form.
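A type for the DDDD:BB:SS.F form might look like the sketch below, with hexadecimal fields for domain, bus, slot and function. The struct name, field names and error type are assumptions of this example and do not reproduce the agent's actual `pci::Address`.

```rust
use std::fmt;
use std::str::FromStr;

/// Illustrative PCI address: domain (16 bit), bus (8 bit), slot (5 bit),
/// function (3 bit), all written in hexadecimal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PciAddress {
    domain: u16,
    bus: u8,
    slot: u8,
    function: u8,
}

impl fmt::Display for PciAddress {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{:04x}:{:02x}:{:02x}.{:x}",
            self.domain, self.bus, self.slot, self.function
        )
    }
}

fn bad(s: &str) -> String {
    format!("invalid PCI address: {}", s)
}

impl FromStr for PciAddress {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (domain, rest) = s.split_once(':').ok_or_else(|| bad(s))?;
        let (bus, rest) = rest.split_once(':').ok_or_else(|| bad(s))?;
        let (slot, function) = rest.split_once('.').ok_or_else(|| bad(s))?;
        let addr = PciAddress {
            domain: u16::from_str_radix(domain, 16).map_err(|_| bad(s))?,
            bus: u8::from_str_radix(bus, 16).map_err(|_| bad(s))?,
            slot: u8::from_str_radix(slot, 16).map_err(|_| bad(s))?,
            function: u8::from_str_radix(function, 16).map_err(|_| bad(s))?,
        };
        // A PCI bus has 32 slots, each with up to 8 functions.
        if addr.slot >= 32 || addr.function >= 8 {
            return Err(bad(s));
        }
        Ok(addr)
    }
}
```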
Fixes: #2745
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
pci::Slot represents a PCI slot. However, in all cases where we use it, we
actually care about addressing a specific PCI function. So, at the moment
we can only refer to function 0 in each slot.
Replace pci::Slot with pci::SlotFn to represent both the slot and function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This commit makes two changes:
- move code for managing temp users to rootless.go.
- use a common function in qemu.go when shutting down the VM.
Fixes: #2759
Signed-off-by: bin <bin@hyper.sh>
The yaml file has an indent issue from line 15.
And the branches filter should be under pull_request_target but
not the pull_request trigger.
Also actions/checkout@v2 does not need the token parameter.
Fixes: #2798
Signed-off-by: bin <bin@hyper.sh>
The guest kernel configuration suggested for Kata, and which is used by the
CI didn't include CONFIG_PCI_MMCONFIG. That's kind of weird, MMCONFIG is
the modern normal way of handling configuration cycles.
In addition, due to a complex set of interactions through the ACPI code,
disabling MMCONFIG means that SHPC hotplug doesn't work: the driver is
included in the guest kernel, but will fail to probe on PCI to PCI bridges,
meaning it won't actually be activated.
Enable MMCONFIG so that we suggest and test a more typical guest kernel
configuration.
Fixes: #2288
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
From the endpoints string described through the configuration file, we
build a hash set of allowed endpoints. If a configuration file does not
include an endpoints section, we assume no endpoints are allowed.
If there is no configuration file, then all endpoints are allowed.
Then for every ttrpc request, we check if the name of the endpoint is
part of the hash set. If it is not, then we return ttrpc::UNIMPLEMENTED.
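The three cases above (no configuration file, a file without an endpoints section, a file with one) can be sketched as follows. The `EndpointPolicy` type, the whitespace-separated string format and the endpoint names used in the usage example are assumptions of this sketch, not the agent's actual types.

```rust
use std::collections::HashSet;

/// Hypothetical policy object: `None` means no configuration file was
/// given (everything allowed); `Some(set)` restricts requests to the set.
struct EndpointPolicy {
    allowed: Option<HashSet<String>>,
}

impl EndpointPolicy {
    /// No configuration file: all endpoints are allowed.
    fn without_config_file() -> Self {
        Self { allowed: None }
    }

    /// Configuration file present; `endpoints` is the optional endpoints
    /// string from it (assumed here to be whitespace-separated names).
    fn from_config_file(endpoints: Option<&str>) -> Self {
        let set = endpoints
            .unwrap_or("")
            .split_whitespace()
            .map(String::from)
            .collect();
        Self { allowed: Some(set) }
    }

    /// Checked for every request; a `false` result would be turned into an
    /// UNIMPLEMENTED error by the caller.
    fn is_allowed(&self, endpoint: &str) -> bool {
        match &self.allowed {
            None => true,
            Some(set) => set.contains(endpoint),
        }
    }
}
```

For example, `EndpointPolicy::from_config_file(Some("CreateContainer StartContainer"))` would accept those two hypothetical endpoint names and reject any other.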
Fixes: #1837
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
When the kernel command line includes an agent.config_file=<path> entry,
then we will try to override the default configuration values with the
ones we parse from a TOML file at <path>.
As the configuration file overrides the default values, we need to go
through a simplified builder that converts a set of Option<> fields into
the actual AgentConfig structure.
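The builder pattern described here can be sketched without any TOML parsing: every field of the builder is an `Option`, and `build()` falls back to the default for anything the file did not set. The field names and default values below are invented for the example and are not the real `AgentConfig` fields.

```rust
/// Illustrative configuration; the fields and defaults are hypothetical.
#[derive(Debug, PartialEq)]
struct AgentConfig {
    debug_console: bool,
    log_level: String,
    hotplug_timeout_secs: u64,
}

impl Default for AgentConfig {
    fn default() -> Self {
        AgentConfig {
            debug_console: false,
            log_level: "info".to_string(),
            hotplug_timeout_secs: 3,
        }
    }
}

/// What a configuration file may provide: every field is optional.
#[derive(Default)]
struct AgentConfigBuilder {
    debug_console: Option<bool>,
    log_level: Option<String>,
    hotplug_timeout_secs: Option<u64>,
}

impl AgentConfigBuilder {
    /// Values present in the file override the defaults; the rest are kept.
    fn build(self) -> AgentConfig {
        let d = AgentConfig::default();
        AgentConfig {
            debug_console: self.debug_console.unwrap_or(d.debug_console),
            log_level: self.log_level.unwrap_or(d.log_level),
            hotplug_timeout_secs: self
                .hotplug_timeout_secs
                .unwrap_or(d.hotplug_timeout_secs),
        }
    }
}
```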
Fixes: #1837
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
They will define the list of endpoints that an agent supports.
They're empty and non actionable for now.
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
A single constructor setting default value is a typical pattern for a
Default implementation.
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
Even though CCA, the confidential compute architecture, is not
ready yet, add an empty implementation to avoid a static check error.
Fixes: #2789
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Suggested-by: Fabiano Fidêncio <fidencio@redhat.com>
Exclude from lint checking for it is ultimately only used in
architecture-specific code.
Fixes: #2273
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Bump containerd to v1.5.7 in order to bring in a fix for CVE-2021-41103,
"insufficiently restricted permissions on plugin directories
(https://github.com/advisories/GHSA-c2h3-6mxw-7mvq)".
dependabot found a potential security vulnerability and raised a PR to
fix it. However, dependabot does not properly follow or understand
the needs of our CIs (mainly related to formatting the PR and whatnot),
thus I'm re-raising it.
Fixes: #2796
Supersedes: #2787
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Not all routes have either a gateway or a destination IP.
Interface routes, where the source, destination and gateway are undefined,
will default to IP v4 with the current is_ipv6() check even when they
are v6 routes.
We use the provided gRPC Route.Family field instead. This field is built
from the host netlink messages, and is a reliable way of finding out
a route's IP family.
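The difference between guessing the family from addresses and using an explicit family field shows up for interface routes. The types below are a simplified, hypothetical model of a route, not the agent's gRPC `Route` message.

```rust
use std::net::IpAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IpFamily {
    V4,
    V6,
}

/// Simplified route model (hypothetical fields).
struct Route {
    dest: Option<IpAddr>,
    gateway: Option<IpAddr>,
    /// Family as reported by the host's netlink messages.
    family: IpFamily,
}

/// Old-style heuristic: infer the family from whichever address exists,
/// falling back to IPv4 when there is none.
fn guessed_family(route: &Route) -> IpFamily {
    match route.dest.or(route.gateway) {
        Some(IpAddr::V6(_)) => IpFamily::V6,
        _ => IpFamily::V4,
    }
}

/// New-style lookup: trust the family carried with the route.
fn reported_family(route: &Route) -> IpFamily {
    route.family
}
```

For an IPv6 interface route with neither destination nor gateway, `guessed_family` answers V4 while `reported_family` answers V6.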
Fixes: #2768
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Our check for the IP family is working as long as we have either a
gateway or a destination IP. Some routes are missing both.
The RT netlink messages provide the IP family information for each
route, so we can carry that piece of information up to the guest. That
will allow for a more reliable route IP family determination.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
We need to be able to get the IP family from the netlink route messages,
and the Route.Family field only got recently added to the netlink
package.
The update generates static check warnings about the call for
nethandler.Delete() being deprecated in favor of a Close() call instead.
So we include the s/Delete()/Close()/ change as part of this PR.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Reduce the cloud-hypervisor log level from `Debug` to `Info` when hypervisor
debug is enabled. This is required since `Debug` level:
- Is overkill for debugging hypervisor failures.
- Effectively hides the output from the guest kernel and userland: CLH
generates so much output that the output from the guest gets "lost in
the noise" (experiments show that for each full CLH debug message, at most
1 _byte_ of guest output is displayed).
Fixes: #2726.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
We explicitly set the Position Independent Executable (PIE) options
in the extra CFLAGS and LDFLAGS that are passed to the QEMU configure
script for all archs. This means that these options are used pretty
much everywhere, including when building the sample plugins under the
test directory. These cannot be linked with -pie and break the build,
as experienced recently on ARM (see PR #2732).
This only broke on ARM because other archs are configured with
--disable-tcg : this disables plugins which are built by default
otherwise.
The --enable-pie option is all that is needed. The QEMU build system
knows which binaries should be created as PIE, e.g. the important
bits like QEMU and virtiofsd, and which ones should not, e.g. the
sample plugins that aren't used in production.
Rely on --enable-pie only, for all archs. This allows to drop the
workaround that was put in place in PR #2732.
Fixes: #2757
Signed-off-by: Greg Kurz <groug@kaod.org>
The centos ppc64le gpg key at mirror.centos.org doesn't exist (link rot?).
Replacing it with url from CentOS/sig-core-AltArch on github.
Fixes: #2676
Signed-off-by: Aaron Simmons <paleozogt@gmail.com>
Modify the make script of check-go-static, changing the `./cli` path to `./cmd/kata-runtime`.
Fixes: #2765
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Running tests sometimes generates pkg/containerd-shim-v2/monitor_address,
and `git status` will treat it as a new file.
Package containerd-shim-v2 has moved to pkg/containerd-shim-v2, so
the monitor_address entry in .gitignore should be updated too.
Fixes: #2762
Signed-off-by: bin <bin@hyper.sh>
It seems the client (crio) can send multiple requests to stop the Kata VM,
resulting in a nil reference if the uid has already been cleaned up by a different thread.
Fixes: #2743
Signed-off-by: Feng Wang <feng.wang@databricks.com>
Since we now have "unix://" kind of socket returned by the
SocketAddress() function, there is no more need to build the sandbox
storage path dynamically to keep OS compatibility.
Fixes: #2738
Suggested-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
Rectify the values of testModuleData with the correct
types in TestCCCheckCLiFunction in kata-check_(!x86)_test.go
Fixes: #2735
Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
Optional build types are common for early adoption.
Let's add a flag to build an optional config.
e.g.
kernel-build.sh -b experimental
In the future, instead of adding more flags, just add a new build type.
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
The agent initiates a PCI rescan from two places. One is triggered
for each virtio-blk PCI device, and one is triggered unconditionally
when we start a new container.
The PCI bus rescan code was added a long time ago in Clear Containers due to
lack of ACPI support in QEMU 2.9 + q35. Since Kata routinely plugs devices
under a PCIe-to-PCI bridge, that left SHPC as the only available hotplug
mechanism.
However, while Kata was using SHPC on the qemu side, it wasn't actually
using it on the guest side. Due to a quirk of our guest kernel
configuration, the SHPC driver never bound to the bridge, and *no* hotplug
was working at all. To work around that, Kata was forcing the rescan
manually, which would discover the new device. That was very fragile (we
were arguably relying on a kernel bug). Even if we were using SHPC
properly, it includes a mandatory 5s delay during plug operations
(designed for physical cards and human operators), which makes it
unsuitable for quick start up.
Worse, the forced PCI rescans could race with either SHPC or PCIe native
hotplug sequences, causing several problems. In some cases this could put
the device into an entirely broken state where it wouldn't respond to
config space accesses at all.
Since pull request #2323 was merged, we have instead used ACPI hotplug
which is both fast, and more solid in terms of semantics and races. So,
the forced PCI rescans are no longer necessary. Remove them all.
fixes #683
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
do_add_swap() has some mildly complex code to translate the PCI path of
a virtio-blk device (where the swap will reside) into a /dev path. However,
the device module already has get_virtio_blk_pci_device_name() which does
exactly that. The existing code has some further advantages: it uses
more precise matching of the sysfs paths, and if necessary it will wait for
the device to be added to the guest.
While we're there, remove an unnecessary 'as u8' from the PCI path
construction: pci::Path::new() already accepts anything which implements
TryInto<u8>, which u32 certainly does.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We send information about several kinds of devices to the agent so
that it can apply specific handling. We don't currently do this with
VFIO devices. However we need to do that so that the agent can
properly wait for VFIO devices to be ready (previously it did that
using a PCI rescan which may not be reliable and has some very bad
side effects).
This patch collates and sends the relevant information.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Both appendBlockDevice and appendVhostUserBlkDevice start by using
GetDeviceByID to lookup the api.Device object corresponding to their
ContainerDevice object. However their common caller, appendDevices() has
already done this.
This changes it so the looked up api.Device is passed to the individual
append*Device() functions. This slightly reduces duplicated work, but more
importantly it makes it clearer that append*Device() don't need to check
for a nil result from GetDeviceByID, since the caller has already done
that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For several device types which correspond to a PCI device in the guest
we record the device's PCI path in the guest. We don't currently do
that for VFIO devices, but we're going to need to for better handling
of SR-IOV devices.
To accomplish this, we have to determine the guest PCI path from the
information the VMM gives us:
For qemu, we query the slot of the device and its bridge from QMP.
For cloud-hypervisor, the device add interface gives us a guest PCI
address. In fact this represents a design error in the clh API -
there's no way it can really know the guest PCI address in general.
It works in this case, because clh doesn't use PCI bridges, so the
device will always be on the root bus. Based on that, the PCI path is
simply the device's slot number.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
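
The cloud-hypervisor case described above (no PCI bridges, so the device always sits on the root bus and the path is just the slot number) can be illustrated with a minimal Python sketch. This is not the runtime's Go code: the helper name `root_bus_pci_path`, the `dddd:bb:ss.f` address layout and the two-hex-digit output format are assumptions made for illustration only.

```python
import re

# Assumed textual PCI address layout: dddd:bb:ss.f (domain:bus:slot.function).
_PCI_ADDRESS = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def root_bus_pci_path(address: str) -> str:
    """Return a one-element PCI path (the slot) for a device on the root bus.

    Hypothetical helper: the output format is an assumption of this sketch.
    """
    match = _PCI_ADDRESS.match(address)
    if match is None:
        raise ValueError(f"malformed PCI address: {address!r}")
    domain, bus, slot, _function = (int(group, 16) for group in match.groups())
    if domain != 0 or bus != 0:
        # With bridges involved, the address alone cannot give the path.
        raise ValueError(f"device {address!r} is not on the root bus")
    if slot > 0x1F:
        raise ValueError(f"slot out of range in {address!r}")
    return f"{slot:02x}"
```

The explicit root-bus check mirrors the caveat in the message: the shortcut only holds while the VMM places every device directly on bus 0.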
hotplugVFIODevice() has several different paths depending if we're
plugging into a root port or a PCIE<->PCI bridge and if we're using a
regular or mediated VFIO device.
We're going to want some common code on the successful exit path here,
so refactor the function to allow that without duplication.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently, VFIO devices attached to a Kata container aren't described to
the agent at all. We essentially just hope they're ready by the time
we've entered the container proper, which is usually the case because of
the PCI rescan - but that causes other problems.
This adds a new device type to the agent representing VFIO devices. The
agent will use its existing uevent watching mechanisms to wait for the
associated guest PCI device to appear before proceeding.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently the constants giving the names for each device/driver type in
the protocol are in mount.rs, and used in device.rs. Since these constants
are inherently related to, well, devices, it makes more sense to put them
in device.rs and use them from mount.rs.
This will become even more so with planned extensions which will add some
device types that will not be used in mount.rs at all.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A few "go fmt" errors appear to have crept in. Clean them up with
"go fmt ./..." in the src/runtime directory.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
After 5.11-rc4, the memory hotplug alignment size is reduced to 128M for 4K
pages.
This works better for memory hotplug and nvdimm plug in Kata on arm.
Without this patch, memory hotplug fails because the current memory
hotplug alignment is 1G while the nvdimm size is aligned to 128M in Kata.
Porting the change here lets us avoid a fix on the qemu side.
Note: if you change the page size to something other than 4K, memory hotplug
will have no effect.
Fixes: #2707
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
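
The alignment mismatch described above comes down to simple arithmetic. A minimal Python sketch follows; the 384 MiB region size is a made-up example value, not a figure taken from Kata.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def is_aligned(size_bytes: int, alignment_bytes: int) -> bool:
    """True when size_bytes is a whole multiple of alignment_bytes."""
    return size_bytes % alignment_bytes == 0


# Hypothetical nvdimm size that is a multiple of 128M, as the message describes.
example_nvdimm_size = 384 * MIB

fits_128m_alignment = is_aligned(example_nvdimm_size, 128 * MIB)
fits_1g_alignment = is_aligned(example_nvdimm_size, GIB)
```

A size that is only guaranteed to be a multiple of 128M will generally not satisfy a 1G alignment requirement, which is the failure the patch avoids.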
For the qemu 6.1.0 build on arm64, a compile error occurs when "-pie" is added
to the ldflags.
tests/plugins/empty.c can't be linked because a symbol is missing.
I think there may be a bug.
Until that is figured out, we should disable plugins for qemu 6.1.0 on arm64.
Fixes: #2707
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
There's a typo in the name of the file that should receive the output of
`cargo vendor`. We should forward the output to `.cargo/config` instead of
`.cargo/vendor`.
This was introduced by 21c8511630.
Fixes: #2729
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
While releasing kata-containers 2.3.0-alpha1 we've hit some issues as
the tag assignment is done incorrectly. We want an array of tags to
iterate over, but the current code just gets lost in the parentheses.
This issue was introduced in a156288c1f.
Fixes: #2725
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
- virtiofs: Create shared directory with 0700 mode, not 0750
- watcher: ensure we create target mount point for storage
- packaging: fix qemu build on ppc64le
- runtime: tracing: Use root context to stop tracing
- Replace SHPC with ACPI PCI hotplug for Kata guests
- kata-deploy: Also provide "stable" & "latest" tags
- runtime: tracing: Fix logger passed in newContainer
- virtcontainers: update VC SandboxConfig API add SandboxBindMounts field
- sandbox: Allow the device to be accessed,such as /dev/null and /dev/u…
- qemu: add v5.1.0 dir under tag_patches
- threat-model: Add missing threat-model document
- docs: documentation for running non-root VMM
- workflows,release: Upload the vendored cargo code
- runtime: run the QEMU VMM process with a non-root user
- runtime: update .gitignore file cleare the vc shim config
- runtime: fix empty cgroup path validation error
- ci: Call agent shutdown test only in the correspondent CI_JOB
- runtime: Remove outdated TestStoreContainer
- runtime: refactor commandline code directory
- virtcontainers: update VC HypervisorConfig API add three lost fields
- virtcontainers: add unit tests for container.go
- runtime: clh: Enable hugepages support
- agent: Simplify mount point creation
- versions: Allow newer Rust versions
- runtime/qemu: Move from query-cpus to query-cpus-fast
- Update Kata to use qemu-6.1
- Host cgroups improvements and simplifications
- Add doc for guest swap
- versions: Upgrade to Cloud Hypervisor v18.0
- runtime: Fix README link
- qemu: remove default config for arm64.
- sandbox: Add device permissions such as /dev/null to cgroup
- virtcontainers: fc: parse vcpuID correctly
- kata-tarball: Build and test fixes
- test: enable running tests under root user
- osbuilder: Change to "=" operator to make script more portable
- makefile: Fix error exit status code
- osbuilder: fix inconsistent calculation of fs size
- virtcontainers: Remove NewStoreFeature
- snap: Test variable instead of executing "branch"
- license: drop redundent license files
- Fix swap fail insert fail issue
272771dc watcher: ensure we create target mount point for storage
439e5ac3 packaging: fix qemu build on ppc64le
8bbcb06a qemu: Disable SHPC hotplug
cc4983ee runtime: Remove unused qemuArchBase.appendBridges definition
e248de46 vendor: Update govmm
0ca8c272 qemu: add v5.1.0 dir under tag_patches
3bdcfaa6 kata-deploy: Add more info about the stable tag
41c590fa kata-deploy: Improve README
debf3c9f kata-deploy: Remove qemu-virtiofs runtime class
43a72d76 release: update the kata-deploy yaml files accordingly
ea9b2f9c kata-deploy: Add "stable" info to the README
e5411056 kata-deploy: Update the README
9acf4e5d kata-deploy: Add `stable` yaml files
a86babe0 kata-deploy: Point to the `latest` release
a156288c workflows: Add "stable" & "latest" tags to kata-deploy
305afc8b docs: documentation for running non-root VMM
1fe080fd threat-model: Add missing threat-model document
21c85116 workflows,release: Upload the vendored cargo code
9a6d56f1 runtime: fix empty cgroup path validation error
90e63887 ci: Call agent shutdown test only in the correspondent CI_JOB
48fb1d92 virtiofs: Create shared directory with 0700 mode, not 0750
077b77c1 runtime: tracing: Fix logger passed in newContainer
39cd05e0 runtime: tracing: Use root context to stop tracing
1cfe5930 runtime: Run QEMU using a non-root user/group
fd983738 runtime: update .gitignore file cleare the vc shim config
067c44d0 runtime: fix UT build failure
9353cd77 runtime: Remove outdated TestStoreContainer
9a311a2b docs: fix invalid kernel dax doc url
e7c42fbc runtime: unify generated config
4f7cc186 runtime: refactor commandline code directory
9d3cd984 agent/mount: Remove unused ensure_destination_exists()
64aa5623 agent: Correct mount point creation
08d7aebc agent/mount: Split out regular file case from ensure_destination_exists()
9fa3beff agent: Remove unnecessary BareMount structure
49282854 agent: Simplify BareMount::mount by using nix::mount::mount
d00decc9 runtime: clh: Enable hugepages support
64bb803f runtime/qemu: Move from query-cpus to query-cpus-fast
25ac3524 versions: Allow newer Rust versions
851d5f86 tests: Correct heading in static checks test
4b7e4a4c runtime: Vendoring update
8d9d6e6a docs: Host cgroups documentation update
9bed2ade virtcontainers: Convert to the new cgroups package API
b42ed393 virtcontainers: cgroups: Add a containerd API based cgroups package
f17752b0 virtcontainers: container: Do not create and manage container host cgroups
dc7e9bce virtcontainers: sandbox: Host cgroups partitioning
f811026c virtcontainers: Unconditionally create the sandbox cgroup manager
a6066404 virtcontainers: update VC HypervisorConfig API add three lost fields
bb18cd47 virtcontainers: update VC SandboxConfig API add SandboxBindMounts field
58e77a3c sandbox: Allow the device to be accessed,such as /dev/null and /dev/urandom
d67a414b src/runtime/README.md: Fix URL of Licence
13b8bb0c runtime: Fix README link
25670d30 packaging/qemu: Update qemu-exerimental version to v6.1.0
041a513f versions: Update qemu to v6.1.0
62baa48e virtcontainers: fc: parse vcpuID correctly
81de2d47 packaging: Correct error message in apply_patches.sh
f785ff0b virtcontainers: clh: Revert the workaround incorrect default values
0e0e59dc virtcontainers: clh: Re-generate the client code
f0b53314 versions: Upgrade to Cloud Hypervisor v18.0
11652136 actions: test make kata-tarball
626d659f actions: kata-deploy on PRs and use makefile
78d99f51 kata-deploy: Make verbose single builds
59486b85 kata-deploy: Add tarball suffix to makefile targets
96e1246b makefile: Include kata-deploy targets
74d645cd how-to: Add how-to-setup-swap-devices-in-guest-kernel.md
d865c809 virtcontainers: add unit tests for container.go
71f915c6 sandbox: Add device permissions such as /dev/null to cgroup
2174fee4 docs: Add swap annotations introduction
2abc450a test: enable running tests under root user
924a68d0 osbuilder: Change to "=" operator to make script more portable
1fff9be7 qemu: remove default config for arm64.
e2a9e78c virtcontainers: Remove NewStoreFeature
bfcee911 osbuilder: fix inconsistent calculation of fs size
4996f9b7 snap: Test variable instead of executing "branch"
256c3b27 license: drop redundent license files
bcc9fa3b hotplugAddBlockDevice: Use ExecuteBlockdevAddWithDriverCache with swap
bd85da04 vendor: Update vendor/github.com/kata-containers/govmm
d422789f makefile: Fix error exit status code
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
We would only create the target when updating files. We need to make
sure that we create the target if the source is a directory. Without
this, we'll fail to start a container that utilizes an empty configmap,
for example.
Add unit tests for this.
Fixes: #2638
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
We now support any CRI-compliant container engine. Let's bump the
kata-monitor version to 0.2.0.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
This commit stops the container engine polling in favor of
the kata sandbox storage path monitoring.
The pod cache list is now refreshed based on fs events and synced with
the container engine only when needed.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
When the container engine is different than containerd or CRI-O we
lack proper detection of kata workloads and consider all the pods as
kata ones.
Instead of querying the container engine for the lower level runtime
used in each pod, check if a directory matching the pod exists in
the virtualcontainers sandboxes storage path.
This provides a container engine independent way to check for kata pods.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
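
The engine-independent check described above can be sketched in a few lines of Python. kata-monitor itself is written in Go; the function name and the idea that the directory is named exactly after the pod ID are assumptions for illustration.

```python
import os


def is_kata_pod(sandbox_storage_path: str, pod_id: str) -> bool:
    """Treat a pod as a Kata pod when a directory named after it exists
    under the sandbox storage path (hypothetical helper)."""
    # Reject empty IDs and anything that is not a plain directory name,
    # otherwise the join below could resolve outside the storage path.
    if not pod_id or os.path.basename(pod_id) != pod_id or pod_id in (".", ".."):
        return False
    return os.path.isdir(os.path.join(sandbox_storage_path, pod_id))
```

Because the check only touches the local filesystem, it gives the same answer whichever container engine created the pod.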
Since the qemu upgrade to v6.1.0, the build fails
with a linking issue. Adding --disable-tcg to fix
it.
Fixes: #2710
Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
Under certain circumstances[0] Kata will attempt to use SHPC hotplug
for PCI devices on the guest. In fact we explicitly enable SHPC on
our PCI to PCI bridges, regardless of the qemu default.
SHPC was designed a long, long time ago for physical hotplugging and
works very poorly for a virtual environment. In particular it has a
mandatory 5s delay to allow a (real, human) operator to back out the
operation if they press a button by mistake. This alone makes it
unusable for a fast start up application like Kata.
Worse, the agent forces a PCI rescan during startup. That will race
with the SHPC hotplug operation causing the device to go into a bad
state where config space can't be accessed from the guest at all.
The only reason we've sort of gotten away with this is that our
default guest kernel configuration triggers what's arguably a kernel
bug effectively disabling SHPC. That makes the agent rescan the only
reason we see the new device.
Now that we require a qemu >=6.1, which includes ACPI PCI hotplug on
the q35 machine, we can explicitly disable SHPC in all cases. It's
nothing but trouble.
fixes #2174
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemuArchBase.appendBridges is never actually used, because the bare
qemuArchBase type is itself never used (outside of unit tests). Instead
*all* the subclasses of qemuArchBase override appendBridges() to call
the very similar, but not identical genericAppendBridges. So, we can
remove the qemuArchBase.appendBridges implementation.
Furthermore, all those subclasses override appendBridges() in exactly
the same way, and so we can remove *those* definitions and replace the
base class qemuArchBase appendBridges() with that version, calling
genericAppendBridges().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A matching dir is needed when applying qemu patches using the script. As qemu 5.1
is used for arm, a dir named "v5.1.0" is needed under tag_patches.
Fixes: #2696
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
There are two functions, `DeviceToDeviceCgroup` and `deviceToDeviceCgroup`,
that create a `specs.LinuxDeviceCgroup` object. Remove the newer function, `deviceToDeviceCgroup`.
Fixes: #2694
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Let's make it as clear as possible for the user that if they go for a
tagged version of kata-deploy, eg, 2.2.1, they'll have the kata runtime
2.2.1 deployed on their cluster.
Suggested-by: Eric Adams <eric.adams@intel.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's add more instructions in the README in order to make clear to the
reader what they can do to check whether kata-deploy is ready, or
whether they have to wait till proceeding with the next instruction.
Suggested-by: Eric Adams <eric.adams@intel.com>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
There's only one QEMU runtime class deployed as part of kata-deploy, and
that includes virtiofs support (which is the default for quite some time
already). Knowing this, let's just remove the `qemu-virtiofs` runtime
class definition.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's teach our `update-repository-version.sh` script to properly update
the kata-deploy tags on both kata-deploy and kata-cleanup yaml files.
The 3 scenarios that we're dealing with, based on which branch we're
targeting, are:
```
1) [main] ------> [main]         NO-OP
   "alpha0"       "alpha1"
                   +----------------+----------------+
                   |      from      |       to       |
  -----------------+----------------+----------------+
  kata-deploy      | "latest"       | "latest"       |
  -----------------+----------------+----------------+
  kata-deploy-base | "stable"       | "stable"       |
  -----------------+----------------+----------------+
2) [main] ------> [stable]       Update kata-deploy and
   "alpha2"       "rc0"          get rid of kata-deploy-base
                   +----------------+----------------+
                   |      from      |       to       |
  -----------------+----------------+----------------+
  kata-deploy      | "latest"       | "rc0"          |
  -----------------+----------------+----------------+
  kata-deploy-base | "stable"       | REMOVED        |
  -----------------+----------------+----------------+
3) [stable] ------> [stable]     Update kata-deploy
   "x.y.z"          "x.y.(z+1)"
                   +----------------+----------------+
                   |      from      |       to       |
  -----------------+----------------+----------------+
  kata-deploy      | "x.y.z"        | "x.y.(z+1)"    |
  -----------------+----------------+----------------+
  kata-deploy-base | NON-EXISTENT   | NON-EXISTENT   |
  -----------------+----------------+----------------+
```
And we can easily cover those 3 cases only with the information about
the "${target_branch}" and the "${new_version}", where:
* case 1) if "${target_branch}" is "main" *and* "${new_version}"
contains "alpha", do nothing
* case 2) if "${target_branch}" is "main" *and* "${new_version}"
contains "rc":
* change the kata-deploy & kata-cleanup tags from "latest" to
"${new_version}".
* delete the kata-deploy-stable & kata-cleanup-stable files.
* case 3) if the "${target_branch}" contains "stable":
* change the kata-deploy & kata-cleanup tags from "${current_version}"
to "${new_version}".
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
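
The three cases listed above can be sketched as a small decision function. The real logic lives in the `update-repository-version.sh` shell script; this Python version only mirrors the rules as stated in the message, and the function name and return shape are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class TagUpdate(NamedTuple):
    from_tag: Optional[str]      # tag to replace; None means nothing to do
    to_tag: Optional[str]        # new tag; None means nothing to do
    remove_stable_files: bool    # delete the *-stable yaml files


def plan_kata_deploy_update(target_branch: str, new_version: str,
                            current_version: str) -> TagUpdate:
    # Case 3: stable branch, bump from the current to the new version.
    if "stable" in target_branch:
        return TagUpdate(current_version, new_version, False)
    if target_branch == "main":
        # Case 1: alpha release on main, no-op.
        if "alpha" in new_version:
            return TagUpdate(None, None, False)
        # Case 2: first rc, "latest" becomes the new version and the
        # stable yaml files are dropped.
        if "rc" in new_version:
            return TagUpdate("latest", new_version, True)
    raise ValueError(
        f"unhandled combination: {target_branch!r} / {new_version!r}"
    )
```

For instance, `plan_kata_deploy_update("main", "2.3.0-rc0", "2.3.0-alpha2")` falls into case 2, while any branch whose name contains "stable" falls into case 3 regardless of the version string.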
Similar to the instructions we have for the "latest" images, let's also
add instructions about the "stable" images.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's just point to our repo URLs rather than assume users using
kata-deploy will have our repo cloned.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
This is **not** the nicest patch of my career, and I know it adds code
duplication. However, I've decided to take this approach in order to
have easier / better instructions for users who're consuming
kata-deploy.
Having both stable & latest yaml on `main` will let us point to just one
place, without having to update the instructions.
I know, would be better to have those generated from a .in file,
wouldn't it? For sure, but then we'd lose the ability to just point to
those files from kata-deploy pages (either on dockerhub or quay.io).
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Instead of pointing to a specific release number, let's point to the
`latest` tag on the main branch.
There's still some work needed in order to point to the `stable` tag on
the stable-x.y branches, as this is something that should be done
automagically as part of the release process.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
When releasing a tarball, let's *also* add the "stable" & "latest" tags
to the kata-deploy image.
The "stable" tag refers to any official release, while the "latest" tag
refers to any pre-release / release candidate.
Fixes: #2302
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
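
The tagging rule above ("stable" for official releases, "latest" for pre-releases and release candidates) can be sketched as follows. The actual workflow is a GitHub Actions definition; detecting a pre-release by looking for "alpha" or "rc" in the version string is an assumption of this sketch.

```python
def kata_deploy_tags(version: str) -> list:
    """Return the list of image tags to push for a release (hypothetical helper)."""
    # Assumption: pre-releases carry "alpha" or "rc" in their version string.
    is_prerelease = any(marker in version for marker in ("alpha", "rc"))
    extra_tag = "latest" if is_prerelease else "stable"
    # The version tag itself is always kept; the extra tag is added on top.
    return [version, extra_tag]
```

Returning a list keeps the caller's loop over tags straightforward, for example `for tag in kata_deploy_tags("2.3.0-rc0"): ...`.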
This was added in the 1.x repo and is missing in the 2.x repo.
Copying over the document from 1.x.
This is a starting point and focuses on the devices / interfaces
with the virtual machine, and ultimately to the container itself.
We then discuss how these devices/interfaces vary by VMM/hypervisor.
The threat model drawing is created via gdocs, located here:
https://docs.google.com/drawings/d/1dPi9DG9bcCUXlayxrR2OUa1miEZXewtW7YCt4r_VDmA/edit?usp=sharing
For Kata 2.x, the block named as `kata-runtime` has been changed to
`kata-shim`.
Fixes: #2340
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
As part of the release, let's also upload a tarball with the vendored
cargo code. By doing this we allow distros, which usually don't have
access to the internet while performing the builds, to just add the
vendored code as a second source, making the life of the downstream
maintainers slightly easier*.
Fixes: #1203
*: The current workflow requires the downstream maintainer to download
the tarball, unpack it, run `cargo vendor`, create the tarball, etc.
Although this doesn't look like a ridiculous amount of work, it's better
if we can have it in an automated fashion.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The agent shutdown test should only run on the CI JOB of CRI_CONTAINERD_K8S_MINIMAL
which is the only one where testing tracing is being enabled, however, this
test is being triggered in multiple CI jobs where it should not run. This PR
fixes that issue.
Fixes #2683
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
A discussion on the Linux kernel mailing list [1] exposed that virtiofsd makes a
core assumption that the file systems being shared are not accessible by any
non-privileged user. We currently create the `shared` directory in the sandbox
with the default `0750` permissions, which gives read and directory traversal
access to the group. There is no real good reason for a non-root user to access
the shared directory, and this is potentially dangerous.
Fixes: #2589
[1]: https://lore.kernel.org/linux-fsdevel/YTI+k29AoeGdX13Q@redhat.com/
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
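
To illustrate the permission change above, here is a minimal Python sketch that creates a directory accessible only to its owner. The runtime does this in Go; the function name is hypothetical, and the explicit `chmod` is a choice made here so the result does not depend on the process umask.

```python
import os
import stat


def create_shared_dir(path: str) -> int:
    """Create `path` with mode 0700 and return the resulting permission bits."""
    os.makedirs(path, mode=0o700, exist_ok=True)
    # Enforce the mode explicitly: this covers a restrictive or unusual umask
    # and also tightens a directory that already existed with 0750.
    os.chmod(path, 0o700)
    return stat.S_IMODE(os.stat(path).st_mode)
```

With 0700 neither the group nor other users get read or traversal access, which matches the assumption virtiofsd makes about the shared file system.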
Retrieve the absolute sandbox storage path. We will soon need this to
monitor the creation/deletion of new kata sandboxes.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
The storage path we use to collect the sandbox files is defined in the
virtcontainers/persist/fs package.
We create the runtime socket in that storage path, by hardcoding the
full path in the SocketAddress() function in the runtime package.
This commit splits the hardcoded path by the socket address path so that
the runtime package will be able to provide the storage path to all the
components that may need it.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
In order to retrieve the list of sandboxes, we poll the container engine
every 15 seconds via the CRI. Once we have the list we have to inspect
each pod to find out the kata ones.
This commit extends the sandbox cache to keep track of all the pods,
marking the kata ones, so that during the next polling only the new
sandboxes should be inspected to figure out which ones are using the
kata runtime.
Fixes: #2563
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
This is an unexpected event (likely a change in how containerd/cri-o
record the lower level runtime in the pod) and should be more visible:
raise the log level to "warning".
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
Change logger in Trace call in newContainer from sandbox.Logger() to
nil. Passing nil will cause an error to be logged by kataTraceLogger
instead of the sandbox logger, which will avoid having the log message
report it as part of the sandbox subsystem when it is part of the
container subsystem.
The kataTraceLogger will not log it as related to the container
subsystem, but since the container logger has not been created at this
point, and we already use the kataTraceLogger in other instances where a
subsystem's logger has not been created yet, this PR makes the call
consistent with other code.
Fixes #2665
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Call StopTracing with s.rootCtx, which is the root context for tracing,
instead of s.ctx, which is parent to a subset of trace spans.
Fixes #2661
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
A randomly generated user/group is used to start the QEMU VMM process.
The /dev/kvm group owner is also added to the QEMU process to grant it access.
Fixes #2444
Signed-off-by: Feng Wang <feng.wang@databricks.com>
Due to #2332 being merged after running tests for #2604, and the latter
being merged now, a test for the now removed `storeContainer` was added.
Remove it.
Fixes: #2652
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
And use a released version instead of the master branch so that it no
longer gets invalidated.
Depends-on: github.com/kata-containers/kata-containers#2645
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
The only remaining callers of ensure_destination_exists() are in its own
unit tests. So, just remove it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
mount_storage() first makes sure the mount point for the storage volume
exists. It uses fs::create_dir_all() in the case of 9p or virtiofs volumes
otherwise ensure_destination_exists(). But.. ensure_destination_exists()
boils down to an fs::create_dir_all() in most cases anyway. The only case
it doesn't is for a bind fstype, where it creates a file instead of a
directory. But, that's not correct anyway because we need to create either
a file or a directory depending on the source of the bind mount, which
ensure_destination_exists() doesn't know.
The 9p/virtiofs paths also check if the mountpoint exists before calling
fs::create_dir_all(), which is unnecessary (fs::create_dir_all already
handles that case).
mount_storage() does have the information to know what we need to create,
so have it explicitly call ensure_destination_file_exists() for the bind
mount to a non-directory case, and fs::create_dir_all() in all other cases.
fixes #2390
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
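
The rule described above (a regular file only for a bind mount whose source is not a directory, a directory in every other case) can be sketched in Python. The agent implements this in Rust; `ensure_mount_point` and its return value are hypothetical and exist only for this illustration.

```python
import os


def ensure_mount_point(source: str, destination: str, fstype: str) -> str:
    """Create the mount point and report whether a "file" or "dir" was made."""
    if fstype == "bind" and not os.path.isdir(source):
        # Bind mount of a non-directory: the target must be a file.
        parent = os.path.dirname(destination)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # Append mode creates the file if missing without truncating it.
        with open(destination, "a"):
            pass
        return "file"
    # Every other case needs a directory; makedirs already copes with
    # a destination that exists, so no prior existence check is needed.
    os.makedirs(destination, exist_ok=True)
    return "dir"
```

The caller knows the source, so it can pick the right kind of target, which is the information the old `ensure_destination_exists()` lacked.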
ensure_destination_exists() can create either a directory or a regular file
depending on the arguments. This patch extracts the regular file specific
option into its own helper: ensure_destination_file_exists(). This:
- Avoids doing some steps in the directory case (they're already handled
by create_dir_all())
- Enables some further future cleanups
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
struct BareMount contains the information necessary to make a new mount.
As a datastructure, however, it's pointless, since every user just
constructs it, immediately calls the BareMount::mount() method then
discards the structure.
Simplify the code by making this a direct function call baremount().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BareMount::mount does some complicated marshalling and uses unsafe code to
call into the mount(2) system call. However, we're already using the nix
crate which provides a more Rust-like wrapper for mount(2). We're even
already using nix::mount::umount and nix::mount::MsFlags from the same
module.
In the same way, we can replace the direct usage of libc::umount() with
nix::mount::umount() in one of the tests.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch adds the configuration option that allows using hugepages
with Cloud Hypervisor guests.
Fixes: #2648
Signed-off-by: Bo Chen <chen.bo@intel.com>
We recently updated to using qemu-6.1 (from qemu 5.2). Unfortunately one
breaking change in qemu 6.0 wasn't caught by the CI.
The query-cpus QMP command has been removed, replaced by query-cpus-fast
(which has been available since qemu 2.12). govmm already had support for
query-cpus-fast, we just weren't using it, so the change is quite easy.
fixes #2643
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
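
For readers unfamiliar with QMP, the command switch above amounts to sending a different `execute` name and reading the per-vCPU entries from the reply. The Python sketch below builds the request and extracts thread IDs; Kata does this through govmm in Go, and the sample reply in the usage note is illustrative rather than captured from a real VM.

```python
import json


def build_query_cpus_fast() -> str:
    """Serialize the QMP request for query-cpus-fast."""
    return json.dumps({"execute": "query-cpus-fast"})


def vcpu_thread_ids(reply_json: str) -> list:
    """Extract the host thread ID of each vCPU from a QMP reply."""
    reply = json.loads(reply_json)
    if "error" in reply:
        raise RuntimeError(f"QMP error: {reply['error']}")
    # Each entry in "return" describes one vCPU.
    return [cpu["thread-id"] for cpu in reply["return"]]
```

With a hand-written reply such as `{"return": [{"cpu-index": 0, "thread-id": 1234}, {"cpu-index": 1, "thread-id": 1235}]}`, `vcpu_thread_ids` yields the two thread IDs in order.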
Rust 1.47.0 which is the latest we note as tested in versions.yaml is now
getting fairly old - many current distros have newer versions (e.g.
Rust 1.54.0 in Fedora 34). Bring this more up to date.
Note that this is only updating the 'newest-version', not the minimum
required version.
The new version changes the name of the 'clippy::unknown_clippy_lints'
option to simply 'unknown_lints' so we need to change that as well to avoid
warnings.
fixes #2633
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The github static checks action has a section heading called "Building
rust". It doesn't actually build rust, though, just installs it with
rustup. Correct the misleading message.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The new API is based on containerd's cgroups package.
With that conversion we can simplify the virtcontainers sandbox code and
also unify our cgroups external API dependency. We now only depend
on containerd/cgroups for everything cgroups related.
Depends-on: github.com/kata-containers/tests#3805
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Eventually, we will convert the virtcontainers and the whole Kata
runtime code base to only rely on that package.
This will make Kata depend only on the simpler containerd cgroups API.
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
The only process we are adding there is the container host one, and
there is no such thing anymore.
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
This is a simplification of the host cgroup handling by partitioning the
host cgroups into 2: A sandbox cgroup and an overhead cgroup.
The sandbox cgroup is always created and initialized. The overhead
cgroup is only available when sandbox_cgroup_only is unset, and is
unconstrained on all controllers. The goal of having an overhead cgroup
is to be more flexible on how we manage a pod overhead. Having such
cgroup will allow for setting a fixed overhead per pod, for a subset of
controllers, while at the same time not having the pod being accounted
for those resources.
When sandbox_cgroup_only is not set, we move all non vCPU threads
to the overhead cgroup and let them run unconstrained. When it is set,
all pod related processes and threads will run in the sandbox cgroup.
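The placement rule above can be sketched in Go as follows. This is only an
illustration under stated assumptions: the `thread` type, the `cgroupFor`
helper, and the cgroup names are hypothetical and are not the virtcontainers
API.

```go
package main

import "fmt"

// thread is a hypothetical description of a pod-related thread.
type thread struct {
	name   string
	isVCPU bool
}

// Hypothetical cgroup names, used only for this illustration.
const (
	sandboxCgroup  = "sandbox"
	overheadCgroup = "overhead"
)

// cgroupFor returns the cgroup a thread would be placed in:
// with sandbox_cgroup_only set, everything runs in the sandbox cgroup;
// otherwise only vCPU threads stay there and the rest go to the
// unconstrained overhead cgroup.
func cgroupFor(t thread, sandboxCgroupOnly bool) string {
	if sandboxCgroupOnly || t.isVCPU {
		return sandboxCgroup
	}
	return overheadCgroup
}

func main() {
	threads := []thread{
		{name: "vcpu0", isVCPU: true},
		{name: "io-helper", isVCPU: false},
	}
	for _, only := range []bool{false, true} {
		for _, t := range threads {
			fmt.Printf("sandbox_cgroup_only=%v %s -> %s\n", only, t.name, cgroupFor(t, only))
		}
	}
}
```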
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
Regardless of the sandbox_cgroup_only setting, we create the sandbox
cgroup manager and set the sandbox cgroup path at the same time.
Without doing this, the hypervisor constraint routine is mostly a NOP as
the sandbox state cgroup path is not initialized.
Fixes #2184
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
Sync the virtcontainers api.md document: add the three fields
`ConfidentialGuest`, `EntropySourceList`, and `GuestSwap` to the
HypervisorConfig API.
Fixes #2625
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Sync the virtcontainers api.md document: add the SandboxBindMounts field to
the SandboxConfig API and update the order of the SandboxConfig API fields.
Fixes #2621
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
If a device such as /dev/null or /dev/urandom has no permission set,
it needs to be added to the cgroup.
Fixes: #2615
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
This brings it back into line with the normal qemu version. We refer to
v6.1.0 by full SHA in versions.yaml, rather than the tag, so that
apply_patches.sh sees it as different and applies the virtiofs DAX patches
which is what the experimental version is actually about having.
The virtiofs DAX patches themselves are updated to the version from
https://gitlab.com/virtio-fs/qemu, virtio-fs-dev branch as of commit
3620cb0a.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We need qemu-6.1 for ACPI PCI hotplug support for the q35 machine. At the
moment qemu will use SHPC hotplug under the PCIe to PCI bridge on q35.
SHPC is too slow to use for our purposes (it requires a 5s delay).
Update the qemu version to v6.1.0. This leaves the experimental version
*older* than the normal version, but we'll fix that up later.
We also need to tweak the snapcraft.yaml, since the location for configs
has changed in the new qemu version.
fixes #1691
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In getThreadIDs(), the cpuID variable is derived from a string that
already contains a whitespace. As a result, strings.SplitAfter returns
the cpuID with a leading space. This makes any Go string-to-int conversion
fail (strconv.ParseInt() in our case). This patch makes sure that the
leading space character is removed so the string passed to
strconv.ParseInt() is "CPUID" and not " CPUID".
This has been caused by a change in the naming scheme of vcpu threads
for Firecracker after v0.19.1.
Fixes: #2592
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
If the script doesn't find a patches directory it expects, it gives an
error saying to create a dummy 'no_patches' file if you really don't want
any patches applied for that version.
But actual practice in the tree is to call the dummy file 'no_patches.txt'
rather than simply 'no_patches'. Correct the message to match existing
practice.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Highlights from the Cloud Hypervisor release v18.0: 1) Experimental User
Device (vfio-user) support; 2) Migration support for vhost-user devices;
3) VHDX disk image support; 4) Device pass through on MSHV hypervisor;
5) AArch64 support for virtio-mem; 6) Live migration on MSHV hypervisor;
7) AArch64 CPU topology support; 8) Power button support on AArch64; 9)
Various bug fixes on PTY, TTY, signal handling, and live-migration on
AArch64.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v18.0
Fixes: #2543
Signed-off-by: Bo Chen <chen.bo@intel.com>
make kata-tarball is the main way to
build Kata on a single host. Let's
test it to make sure it works on every PR.
Fixes: #2416
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
- Run kata-deploy tarball generation action on every PR.
- Use kata-deploy makefile targets.
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
If a binary tarball for a single component is built,
the logs will be shown in stdout.
e.g.
make kernel-tarball
Building everything at the same time still stores logs in files.
make kata-tarball
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
Now that the local-build kata-deploy makefile is included in the toplevel
makefile, let's use the suffix `-tarball` to avoid name collisions
and identify the tarball related targets.
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
Use kata-deploy targets from toplevel.
This helps if you want to build and
reinstall just a single Kata component.
Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
Add how-to-setup-swap-devices-in-guest-kernel.md to how-to to explain
how to set up a swap device in the guest kernel.
Fixes: #2326
Signed-off-by: Hui Zhu <teawater@antfin.com>
Add the default Unix devices, such as /dev/null and /dev/urandom, to
the container's resource cgroup spec.
Fixes: #2539
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
zsh doesn't support "==" as an equality comparison operator, so
replace "==" with "=" to make the script more portable.
Fixes: #2584
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
The current default qemu config for arm64 doesn't suit qemu
version 5.1+, so remove it here.
Fixes: #2595
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
This patch fixes inconsistent calculations of the rootfs size.
For `du` and `df`, `-B 1MB` is different from `-BM`. The
former uses powers of 1000, and the latter uses powers of
1024, so comparing them doesn't make sense. The bug may result
in a larger image than needed.
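The mismatch can be illustrated with a short Go sketch. The 1500000000-byte
size and the `ceilDiv` helper are made up for the example, and rounding up to
whole units is an assumption made here for illustration:

```go
package main

import "fmt"

const (
	mb  = 1000 * 1000 // unit selected by "-B 1MB" (powers of 1000)
	mib = 1024 * 1024 // unit selected by "-BM" (powers of 1024)
)

// ceilDiv returns n divided by unit, rounded up to a whole number of units.
func ceilDiv(n, unit int64) int64 {
	return (n + unit - 1) / unit
}

func main() {
	const size int64 = 1500000000 // hypothetical rootfs size in bytes

	// The same byte count yields different numbers in the two units,
	// so the two values cannot be compared directly.
	fmt.Println("units of 1MB:", ceilDiv(size, mb))
	fmt.Println("units of 1M: ", ceilDiv(size, mib))
}
```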
Fixes: #2560
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
In snapcraft.yaml we have a case statement on $(branch) - that is on the
output of executing a command "branch". From the selections it appears
that what it actually wants is to simply select on the contents of the
$branch variable, which should be ${branch} instead.
fixes #2558
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There is no need to keep multiple copies of the license file in
different directories. We can just use the top level one for the project.
Fixes: #2553
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Use ExecuteBlockdevAddWithDriverCache for swap in
hotplugAddBlockDevice, to work around the issue that a swap file does
not work with ExecuteBlockdevAddWithCache.
Fixes: #2548
Signed-off-by: Hui Zhu <teawater@antfin.com>
- tracing: Change runtime tracing tags to vars
- shimv2: add logging to shimv2 api calls
- drop qemu-lite support
- runtime: delete types or const that no longer needed
- runtime: Optimize the way slice created
- virtcontainers: simplify tests
- virtcontainers: clh: Upgrade to the openapi-generator v5.2.1
- build_image: Fix error soft link about initrd.img
- ci: Temporarily skip agent shutdown test on s390x
- Fix version parsing for firecracker version 0.25 and over
- Osbuilder fixes
- docs: update the GoDoc url from runtime project to kata-containers/sr…
- docs: update `how-to` README file for Firecracker config
- ci/openshift-ci: Pull centos from registry.centos.org
- docs: update containerd CRI plugin url
2250360b docs: remove mentioning of qemu-lite
a9de761d runtime: drop qemu-lite support
8ae3edbc runtime: fix default hypervisor path
0c7789fa runtime: Add container field to logs
72e3538e shimv2: add information to method comment
8dadca9c shimv2: add logging to shimv2 api calls
a99fcc3a virtcontainers: simplify tests
39ffd8ee runtime: delete types or const that no longer needed
ff37f5c7 runtime: Optimize the way slice created
8f0f949a tracing: Move dynamically added attributes to Trace()
932ee41b virtcontainers: clh: Workaround incorrect default values
bff38e4f virtcontainers: clh: Fix the unit test
d967d3cb virtcontainers: clh: Use constructors to ensure proper default value
87de26bd tracing: Modify Trace() to accept multiple tag maps
8058e972 tracing: Change runtime tracing tags to vars
a6a2e525 virtcontainers: clh: Migrate to use the updated client APIs
9de1129b osbuilder: Fix rootfs-builder when running in VMs
65a1e131 osbuilder: Allow running the tool several times
a4214738 osbuilder: Fix Makefile
b8717f35 ci: Temporarily skip agent shutdown test on s390x
938981be build_image: Fix error soft link about initrd.img
2304f935 docs: update the GoDoc url from kata 1.x to 2.x
2a614577 docs: update `how-to` README file for Firecracker config
486baba7 docs: update containerd CRI plugin url
46eb07e1 virtcontainers: clh: Re-generate the client code
80fba4d6 virtcontainers: clh: Upgrade to the openapi-generator v5.2.1
8594f80c ci/openshift-ci: Pull centos from registry.centos.org
87bbae1b fc: fix version parsing for fc >= 0.25
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Where possible, move attributes added with AddTag() to Trace() call to
reduce the amount of code used for tracing.
Fixes #2512
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Two default values defined in 'cloud-hypervisor.yaml' have typos. This
patch manually overwrites them with the correct values as a workaround
until the corresponding fix lands in Cloud Hypervisor upstream.
Signed-off-by: Bo Chen <chen.bo@intel.com>
With the updated openapi-generator, the client code now handles optional
attributes correctly and assigns the right default values through its
constructors. This patch uses those constructors to make sure the
proper default values are used.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The general Trace() function accepts one map as a set of tags. Modify it
to accept multiple sets of tags so that additional ones can be added at
Trace() and not as a subsequent call.
Additionally, we should not iterate over the maps unless tracing
is enabled.
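As a rough sketch of the idea (not the runtime's actual Trace()
implementation): a variadic parameter accepts any number of tag maps, and the
maps are only walked when tracing is on. The `tracingEnabled` switch, the
`mergeTags` helper, and the tag names are hypothetical.

```go
package main

import "fmt"

// tracingEnabled is a hypothetical switch standing in for the
// runtime's tracing configuration.
var tracingEnabled = false

// mergeTags flattens any number of tag maps into one.
// When tracing is disabled it returns early without iterating.
func mergeTags(tags ...map[string]string) map[string]string {
	if !tracingEnabled {
		return nil
	}
	merged := make(map[string]string)
	for _, set := range tags {
		for k, v := range set {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	base := map[string]string{"source": "runtime", "package": "virtcontainers"}
	extra := map[string]string{"sandbox_id": "abc"}

	fmt.Println(len(mergeTags(base, extra))) // tracing disabled

	tracingEnabled = true
	fmt.Println(len(mergeTags(base, extra))) // tags from both maps
}
```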
Fixes #2512
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Tracing tags are stored inconsistently throughout the runtime. Change
all instances of tracing tags to variables.
Fixes #2512
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
The client code (and APIs) for Cloud Hypervisor has been changed
dramatically due to the upgrade to `openapi-generator` v5.2.1. This
patch migrates the Cloud Hypervisor driver in the kata-runtime to use
those updated APIs.
The main change from the client code is that it now uses "pointer" type
to represent "optional" attributes from the input openapi specification
file.
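The pointer-for-optional pattern looks roughly like the sketch below.
`ExampleConfig`, `NewExampleConfig`, and `boolPtr` are hypothetical names
chosen for illustration; they are not taken from the generated Cloud
Hypervisor client.

```go
package main

import "fmt"

// ExampleConfig is a hypothetical model in the style of generated client code.
type ExampleConfig struct {
	Path     string // required attribute
	Readonly *bool  // optional attribute: nil means "not set"
}

// boolPtr returns a pointer to a copy of v.
func boolPtr(v bool) *bool { return &v }

// NewExampleConfig is a hypothetical constructor that assigns the
// default value of the optional attribute.
func NewExampleConfig(path string) *ExampleConfig {
	return &ExampleConfig{Path: path, Readonly: boolPtr(false)}
}

func main() {
	// A bare struct literal leaves the optional attribute unset.
	bare := ExampleConfig{Path: "/tmp/disk.img"}
	fmt.Println("bare literal sets Readonly:", bare.Readonly != nil)

	// The constructor guarantees a default is present.
	cfg := NewExampleConfig("/tmp/disk.img")
	fmt.Println("constructor default:", *cfg.Readonly)
}
```

Going through a constructor means callers never have to guess whether a nil
pointer stands for "unset" or for the default.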
Signed-off-by: Bo Chen <chen.bo@intel.com>
The script runs apt sync at some point which scans all possible fds
in order to close them. The operation is incredibly slow on VMs
and may lead to build timeouts.
Fix it by limiting the container runtime fds to a sane limit.
Fixes: #2510
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Once the ${ROOTFS_DIR} is created, the tool can't run the second
time since the directory is populated and the debootstrap tool
will fail.
Fix by deleting the contents of ${ROOTFS_DIR} if the directory exists.
Note that running make clean also allows the re-run; this
is only an optimization for cases where the build fails midway.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Allow the DISTRO variable to be set from outside,
allowing "sudo -E DISTRO=<ANY> make clean" to delete the correct files.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
The katatestutils GoDoc URL was still using the kata 1.x branch URL. This PR changes the
url from kata-containers/runtime/pkg/katatestutils to
kata-containers/kata-containers/src/runtime/pkg/katatestutils
Fixes: #2500
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Remove the `Kata Containers with Firecracker` additional configuration steps.
From kata 2.x, the config of `firecracker` is the same as for `qemu` and `cloud-hypervisor`.
Fixes: #2492
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Update the cri plugin source path to the containerd pkg directory in the
how-to-use-k8s-with-cri-containerd-and-kata.md file. The cri project was moved to the containerd project pkg directory.
Fixes: #2490
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
To improve the quality and correctness of the auto-generated code, this
patch upgrades the `openapi-generator` to its latest stable release
v5.2.1.
Fixes: #2487
Signed-off-by: Bo Chen <chen.bo@intel.com>
In order to avoid hitting the pull request limit of docker.io, this changes the
openshift-ci/images/Dockerfile.buildroot dockerfile to pull the centos image
from registry.centos.org.
Fixes #1636
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Generate the `config-generated.go` file under src/runtime/cli/containerd-shim-kata-v2 before executing test or coverage.
Fixes #2479
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
- use CRI in kata-monitor
- config: Enable jailer by default when using firecracker
- workflows: Actually push the release to quay.io
- docs: update general wording for installation documentation
- Cleanup kernel packaging
- tracing: Return context in runHooks() span creation
- osbuilder: Document no Alpine support on s390x
- osbuilder: Upgrade Ubuntu guest to 20.04
- agent: watcher / inotify stability fixes
- enable snap build for arm64
- agent: Fix cargo 1.54 clippy warning
- osbuilder: Drop Go agent support
- kernel: PTP_KVM support for arm/arm64 in Kata
- docs: update the docs project url from kata 1.x to 2.x
- clh: correct cloud-hypervisor installation on non-x86
- virtcontainers: fc: properly remove jailed block device
- CI: Call agent shutdown test
- kata deploy: always update the base image
- docs: Remove kata-proxy and invalid script reference
- workflows: Actually login to quay.io
- kata-deploy: Update our content to use / point to quay.io/kata-containers rather than katadocker
- agent: Create the process CWD when it does not exist
- Update Kata to allow it to use Qemu 6.1
- osbuilder/dracut: Add missing libraries
- osbuilder: pass env OS_VERSION
- tools: shorten directory path
- virtcontainers: clh: Do not use the default HTTP client
- docs: update kata deploy README doc to add cloud-hypervisor test command
- Container: Add initConfigResourcesMemory and call it in newContainer
- qemu/arm: remove nvdimm/"ReadOnly" option on arm64
- Fix issue container start fail if io.katacontainers.container.resource.swap_in_bytes and memory_limit_in_bytes are not set
- docs: Add tracing proposals doc
- docs: Remove table of contents
- static-checks: Check for the `force-skip-ci` label on each step
- docs: update the kata release url in the kata deploy document
- kata-deploy: Allow build kata-deploy tarball from HEAD
- mod: unify runc and containerd dependencies
- how-to-use-virtio-mem-with-kata.md: Remove undefined ${REPORT_DIR}
- ci: Run static checks when PRs are updated
- docs: update url for log parser in how-to-import-kata-logs-with-fluen…
- versions: Upgrade to Cloud Hypervisor v17.0
- snap: Substitute image configuration with initrd
- docs: Update url for log parser in Developer guide
- mount: fix the issue of missing check file exists
- build(deps): bump github.com/containerd/containerd from 1.5.2 to 1.5.4 in /src/runtime
- docs: Update experimental documentation
- snap: do not export agent version
- Upgrade runc to 1.0.1
- runtime: read-only NVDIMM
- osbuilder/scripts: add support to yq version 4 and above
- osbuilder: update centos arm rootfs image config 'GPG_KEY_ARCH_URL'
- monitor: mv the monitor socket into sbs directory
- fix govet fieldalignment
- docs: added a glossary to support SEO tactics
- ci: expand $CI to nothing
- Add swap support
- snap: fixed snap aarch64 qemu patches dir in snapcraft.yaml file
- agent: clear MsFlags if the option has clear flag set
- snap: Remove QEMU before clone
- docs: fix minikube installation guide runtimeclasses error
- docs: fixed kata-deploy path for kata logs with fluentd doc
- agent/agent-ctl: update tokio to 1.8.1
- ci: set -o nounset
- static-checks: Add a make target to run static-checks locally
- virtiofsd: fix the issue of missing stop virtiofsd
- docs: Update containerd configuration format
- osbuilder: Skip installing golang for building rootfs
- agent-ctl: Use a common Makefile style like other components
- vsock-exporter: switch to tokio runtime
- config: Fix description for OCI hooks
- shimv2: fix the issue of kata-runtime exec failed
7a5ffd4a config: Enable jailer by default when using firecracker
2cb7b513 docs: update general wording for installation documentation
76f4588f workflows: Actually push the release to quay.io
b980c62f packaging/kernel: Update kernel build doc
99e9a6ad packaging/kernel: Update versions.yaml kernel urls
c23ffef4 packaging/kernel: Remove old Jenkins pipeline
9586d482 tracing: Return context in runHooks() span creation
6a6dee7c osbuilder: Document no Alpine support on s390x
71f304ce agent: watcher: cleanup mount if needed when container is removed
f1a505db agent: Temporarily allow unknown linters
961aaff0 agent: watcher: fixes to make more robust
7effbdeb osbuilder: Upgrade Ubuntu guest to 20.04
99ab91df docs: update the docs project url from kata 1.x to 2.x
4fe23b19 kernel: PTP_KVM support for arm/arm64 in Kata
f981fc64 clh: correct cloud-hypervisor installation
f87cee9d kata-deploy: Rely directly on a centos:7 image
6871aeaa snap: enable snap build for arm64
15e0a3c8 kata-deploy: Remove unneeded yum cached files
d01aebeb kata-deploy: Ensure the system is up-to-date
77160e59 workflows: Actually login to quay.io
b9e03a1c docs: update the image repository to quay.io
f47cad3d tools: Update the image repository to quay.io
9fa1febf workflows: Also push the image to quay.io
233b53c0 agent: Fix cargo 1.54 clippy warning
2d8386ea kata-monitor: add few unit tests
8714a350 kata-monitor: make code to identify kata pods simpler
68a6f011 kata-monitor: drop the runtime info from the sandbox cache
97dcc5f7 kata-monitor: drop getMonitorAddress()
0b03d97d vendor: update vendors for kata-monitor
c2f03e89 kata-monitor: talk to the container engine via the CRI
c867d1e0 osbuilder: Drop Go agent support
1d25d7d4 docs: Remove kata-proxy and binaries reference
64dd35ba virtcontainers: fc: properly remove jailed block device
b8133a18 osbuilder/dracut: Add missing libraries
831c2fee packaging: Remove reference to sheepdog driver
2e28b714 packaging: Drop support for qemu < 5.0
d5f85698 vendor: Update govmm
31650956 runtime/qemu: Use explicit "on" for kernel_irqchip parameter
a72b0811 osbuilder: pass env OS_VERSION
d007bb85 kata-deploy: shorten directory path
e6408fe6 Container: Add initConfigResourcesMemory and call it in newContainer
49083bfa agent: Create the process CWD when it does not exist
ee90affc newContainer: Initialize c.config.Resources.Memory if it is nil
767a41ce updateResources: Log result after calculateSandboxMemory
760ec4e5 virtcontainers: clh: Do not use the default HTTP client
3fe6695b static-checks: Check for the `force-skip-ci` label on each step
7df56301 CI: Call agent shutdown test
57b696a5 docs: Removed mention of 1.x
4f0726bc docs: Remove table of contents
f186c5e2 docs: Fix invalid URLs
7c610a6f docs: Fix shell code
80afba15 docs: update kata deploy README doc to add cloud-hypervisor test command
5a0d3c4f docs: update the kata release url in the kata deploy document
9514dda5 mod: unity containerd dependency
6ffe37b9 mod: unify runc dependency
5b514177 docs: Add tracing proposals doc
b53e8405 how-to-use-virtio-mem-with-kata.md: Remove undefined ${REPORT_DIR}
5957bc7d ci: Run static checks when PRs are updated
81e6bf6f kata-deploy: Split shimv2 build in a separate container.
d46ae324 kernel: build: Add container build
b789a935 actions: release: Use new kata-deploy scripts.
85987c6d kata-deploy: Add Makefile
b9d2eea3 kata-deploy: Add script to merge kata tarballs.
4895747f Rootfs: Add curl to alpine rootfs builder.
fc90bb53 Actions: Add new workflow to create static tarballs
bbb06c49 actions: Remove scripts from actions directory.
2f9859ab build: Reuse firecracker directory on builds.
3533a5b6 Packaging: stop using GOPATH for yq.
0c5ded4b kata-deploy: build kata only with docker in host
2ec31093 docs: update url for log parser in how-to-import-kata-logs-with-fluentd.md
cc0bb9ae versions: Upgrade to Cloud Hypervisor v17.0
8e9ffe6f snap: Substitute image configuration with initrd
8b15eafa docs: Update url for log parser in Developer guide
77604de8 qemu/arm: remove nvdimm/"ReadOnly" option on arm64
4fbae549 docs: Update experimental documentation
07f7ad9d build(deps): bump github.com/containerd/containerd in /src/runtime
9c0b8a7f snap: do not export agent version
3727caf7 versions: Update runc to 1.0.1
116c29c8 cgroups: manager's Set() now takes Resources as its parameter
c0f801c0 rootless: RunningInUserNS() is now part of userns namespace
b5293c52 runtime: update runc dependency to 1.0.1
2859600a runtime: virtcontainers: make rootfs image read-only
8befb1f3 kata-deploy: Refactor builder options.
7125f5d8 image-builder: Allow build image and initrd independently.
0f8c0dbc osbuilder/scripts: add support to yq version 4 and above
070590fb vendor: update govmm
b4c45df8 runtime: tools/packaging/cmd/kata-pkgsync: fix govet fieldalignment
aec53090 runtime: virtcontainers/utils: fix govet fieldalignment
1e4f7faa runtime: virtcontainers/types: fix govet fieldalignment
bb9495c0 runtime: virtcontainers/pkg: fix govet fieldalignment
80ab91ac runtime: virtcontainers/persist: fix govet fieldalignment
54bdd018 runtime: virtcontainers/factory: fix govet fieldalignment
dd58de36 runtime: virtcontainers/device: fix govet fieldalignment
47d95dc1 runtime: virtcontainers: fix govet fieldalignment
8ca7a7c5 runtime: netmon: fix govet fieldalignment
31de8eb7 runtime: pkg: fix govet fieldalignment
2b80091e runtime: containerd-shim-v2: fix govet fieldalignment
0dc59df6 runtime: cli: fix govet fieldalignment
c1042523 ci: expand $CI to nothing
add480ed monitor: mv the monitor socket into sbs directory
f7c6f170 docs: added a glossary to support SEO tactics
a8649acf snap: fixed snap aarch64 qemu patches dir in snapcraft.yaml file
38826194 osbuilder: update centos arm rootfs image config 'GPG_KEY_ARCH_URL'
c5fdc0db docs: fix minikube installation guide runtimeclasses error
f2ef25c6 docs: fixed kata-deploy path for kata logs with fluentd doc
cb6b7667 runtime: Add option "enable_guest_swap" to config hypervisor.qemu
a733f537 runtime: newContainer: Handle the annotations of SWAP
2c835b60 ContainerConfig: Set ocispec.Annotations to containerConfig.Annotations
243d4b86 runtime: Sandbox: Add addSwap and removeSwap
e1b91986 runtime: Update golang proto code for AddSwap
4f066db8 agent: agent.proto: Add AddSwap
4f23b8cd ci: set -o nounset
35cbc93d agent: clear MsFlags if the option has clear flag set
ff87da72 config: Fix description for OCI hooks
8e0daf67 shimv2: fix the issue of kata-runtime exec failed
b12b21f3 osbuilder: Skip installing golang for building rootfs
558f1be6 snap: Remove QEMU before clone
5371b921 mount: fix the issue of missing check file exists
27b299b2 agent-ctl: Use a common Makefile style like other components
05084699 agent-ctl: bump to latest tokio
acf69328 agent: update tokio to 1.8.1
dcd29867 static-checks: Call the static-checks make target
afd97850 makefile: Add static-checks target
34828df9 virtiofsd: fix the issue of missing stop virtiofsd
73d3798c vsock-exporter: switch to tokio runtime
7960689e tracing: replace SimpleSpanProcessor with BatchSpanProcessor
e887b39e docs: Update containerd configuration format
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Now that we have enabled CI tests for jailed firecracker and we have
fixed the issue with removing the block storage device #2387, we
should leverage the full power of firecracker and enable jailer by
default.
Fixes: #2455
Signed-off-by: Jack Rieck <jack.rieck@sendgrid.com>
Remove duplicated information, reduce text separation, and rewrite notes
to be more clear and concise.
Fixes: #2449
Signed-off-by: Joao Vanzuita <joaovanzuita@me.com>
As quay.io is becoming our de-facto image registry, let's actually push
the kata-deploy release to it. This commit should've been part of
9fa1febfd9 but ended up slipping out.
Fixes: #2306
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The call to Trace() in runHooks() should return a context so that
subsequent calls to runHook() produce properly ordered trace spans.
Fixes #2423
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
As kubernetes version has been bumped to 1.22, let's bump the CRI-O
version accordingly.
Related: #2434
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Let's test our `main` branch against the latest version of k8s. In
order to do the bump, let's also update critools version accordingly.
Depends-on: github.com/kata-containers/tests#3818
Fixes: #2433
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Alpine used to work as guest under 1.x, but because there is no musl
target for Rust on s390x, Alpine will not work for 2.x. Document this.
Fixes: #2436
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
inotify/watchable-mount changes...
- Allow up to 16 files. It isn't that uncommon to have 3 files in a secret.
In Kubernetes, this results in 9 files in the mount (the presented files,
which are symlinks to the latest files, which are symlinks to actual files
which are in a separate hidden directory on the mount). Bumping from 8 to 16 will
help ensure we can support "most" secret/tokens, and is still a pretty
small number to scan...
- Now we will only replace the watched storage with a bindmount if we observe
that there are too many files or if it's too large. Since the scanning/updating is racy,
we should expect that we'll occasionally run into errors (i.e., a file
deleted between scan / update). Rather than stopping and making a bind
mount, continue updating, as the changes will be updated the next time
check is called for that entry (every 2 seconds today).
To facilitate the 'oversized' handling, we create specific errors for too large
or too many files, and handle these specific errors when scanning the storage entry.
- When handling an oversized mount, do not remove the prior files -- we'll just
overwrite them with the bindmount. This'll help avoid the files
disappearing from the user, avoid racy cleanup and simplifies the flow.
Similarly, only mark it as a non-watched storage device after the
bindmount is created successfully.
- When creating bind mount, make sure destination exists. If we hadn't
had a successful scan before, this wouldn't exist and the mount would
fail. Update logic and unit test to cover this.
- In several spots, we were returning when there was an error (both in
scan and update). For the update case, let's just log a warning and continue;
since the scan/update is racy, we should expect that we'll have
transient errors which should resolve the next time the watcher runs.
Fixes: #2402
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
- no need to create `/usr/lib/systemd/systemd` link any more
- install `chrony` as extra package and install extra packages in chroot
rather than `debootstrap`, because `chrony` provides `time-daemon`,
which under 20.04 is provided by `systemd-timesyncd`, which is
required by `systemd`, and `debootstrap`'s conflict resolution can't
handle this, but `apt`'s can.
Fixes: #2147
Depends-on: github.com/kata-containers/tests#3636
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Change the document project URL in the using-vpp-and-kata.md and
runtime experimental README.md files.
Fixes: #2418
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
This work patches the 4.19, 5.4 and 5.10 kernels, so that ptp_kvm now works
correctly when the host and guest use different kernel versions.
Fixes: #2123
Signed-off-by: Damon Kwok <damon-kwok@outlook.com>
Currently, a cloud hypervisor binary is released only for x86, so
we must build from source when installing cloud hypervisor on arm64.
Fixes: #2410
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Instead of relying on a centos/docker image, present only on dockerhub,
let's rely on the centos:7 image from the centos registry, and apply
the same modifications applied when generating the centos/systemd image.
The main reason for doing this is to avoid updating an image from 3
years ago, making the delta of the updated packages smaller.
If you're curious why we keep using CentOS 7 though, the reason is
because CentOS 8, and UBI images have a different systemd configuration
that works quite well when mounting the image using podman, but systemd
can't connect dbus when running on environments like AKS or even
minikube. So, in order to be as compatible as possible, let's keep
using the CentOS 7 image for now, at least till we find a suitable
substitute for that.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The snap build for arm64 has been failing for a long time; enable it here.
The changes:
1. correct the variable of "branch"
2. add v5.1.0 under tag_patchs
Fixes: #2194
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Let's just remove the cached files as those are not needed for anything
we do when using this image.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
In order to avoid providing an image with security issues, let's ensure
we run `yum update` as part of our image build process. This is needed
as even with the latest CentOS images there may be CVE fixes
that are already part of the updates but not yet part of the image.
In our case, it's even more needed as the `centos/systemd` image has not
been updated for 3 years or so and those are the vulnerabilities found
in the current images:
https://quay.io/repository/kata-containers/kata-deploy?tab=tags
Fixes: #2303
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
9fa1febfd9 added the support to also push
the image to quay.io. However, we didn't explicitly pass quay.io as
the registry server, causing the login to fail.
Fixes: #2306
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Just search for the "kata" substring in the runtime value and log at
info level when the runtime name/type is not found.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
We keep the container engine info in the sandbox cache map, as the value
associated to the pod id (the key). Since we used that in
getMonitorAddress() only (which is gone) we can avoid storing that
information. Let's drop it.
Keep the map structure and the [put,delete]IfExists functions as we may
want to move to an event based cache update process sooner or later, and
we will need those.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
Since the shim socket path is statically defined in the containerd-shimv2
code, we don't need to retrieve the socket name from the filesystem:
construct the socket name using the containerd-shimv2 code.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
kata-monitor switched from containerd client to CRI. Update the
dependencies and vendored code.
go mod tidy
go mod vendor
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
kata-monitor uses containerd client to retrieve information from the
container engine. This makes kata-monitor work with the containerd
container engine only.
Bin Liu (bin <bin@hyper.sh>) worked on a kata-monitor version able
to talk to any container engine leveraging the standard CRI[1].
Here, the original work of Bin Liu has been adapted to the current
kata-monitor to make it container engine independent.
[1] https://github.com/liubin/kata-containers/tree/fix/1030-use-cri-in-kata-monitor
Fixes: #1030
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
With Kata 1.x EOL, the Go agent is no more. So, remove support for it from
the osbuilder scripts. This removes the RUST_AGENT variable, treating it
as always true.
fixes #2396
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Kata-proxy is no longer used in Kata 2.x. This PR removes the reference
to it, as well as the reference to a script that no longer exists.
Fixes #2391
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
When running a firecracker instance jailed, block devices
are not removed correctly, as the jailerRoot path is not
stripped from the PATCH command sent to the FC API.
This patch differentiates the jailed case from the non-jailed
one and allows the firecracker instance to be properly
terminated.
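As a rough illustration of the jailed versus non-jailed distinction (a shell sketch, not the runtime's actual Go code), the path sent to the API could be derived as below. The function name `fc_drive_path` and the example paths are hypothetical.

```sh
# Hypothetical helper: print the path to send to the Firecracker API.
# $1 = host path of the block device backing file
# $2 = jailer root directory (empty when Firecracker is not jailed)
fc_drive_path() {
    host_path=$1
    jailer_root=$2

    if [ -n "$jailer_root" ]; then
        # Jailed: the API server sees paths relative to the jail root,
        # so drop that prefix when it is present.
        printf '%s\n' "${host_path#"$jailer_root"}"
    else
        # Not jailed: the host path is used unchanged.
        printf '%s\n' "$host_path"
    fi
}

fc_drive_path /srv/jail/root/drive1.img /srv/jail/root
fc_drive_path /run/kata/drive1.img ""
```

The prefix removal only applies when a jail root is given, which mirrors the two cases the patch separates.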
Fixes #2387
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
When the guest is built using dracut and the agent uses glibc (esp.
ppc64le/s390x), libraries might be missing. In my case, it was
`libutil.so`, but more can be added easily. Add a script to configure
`install_items` for dracut w.r.t. `ldd` of the agent.
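A minimal sketch of the idea, assuming `ldd` prints resolved libraries as `name => /path (address)`: the helper below turns that output into a dracut `install_items+=` line. The function name and the sample library paths are illustrative only, and the real script may differ.

```sh
# Read `ldd` output on stdin and print a dracut configuration line that
# adds every resolved library path to install_items.
ldd_to_install_items() {
    # Keep only lines of the form "libfoo.so => /path/libfoo.so (0x...)".
    libs=$(awk '$2 == "=>" && $3 ~ /^\// { printf "%s ", $3 }')
    printf 'install_items+=" %s"\n' "$libs"
}

# Sample input standing in for the output of `ldd <agent binary>`.
ldd_to_install_items <<'EOF'
	linux-vdso.so.1 (0x00007ffc0a5f2000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007f1c2a400000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f1c2a000000)
EOF
```

In practice the function would be fed the real `ldd` output of the agent binary and its result written to a dracut configuration file.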
Fixes: #2384
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
The CRI-O integration test suite has two tests that fail because they search for
"not found" in the error message, but we emit "is not exist".
Change the error message to match the expectations of the test suite.
Fixes: #2036
Reported-by: Julien Ropé <jrope@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The QEMU sheepdog driver was deprecated in 5.2.0 and removed entirely in
6.1. Explicitly disabling it is therefore unnecessary from 5.2.0 and will
give an error from 6.1.
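For build scripts that still have to support several QEMU versions, one possible approach is to gate the flag on the version, as sketched below. This is an assumption about how such gating could look, not what this commit does (it may simply drop the option); `sheepdog_configure_flag` is a hypothetical name.

```sh
# Print "--disable-sheepdog" only for QEMU versions older than 5.2.
# $1 = QEMU version as MAJOR.MINOR[.PATCH]
sheepdog_configure_flag() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}

    if [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 2 ]; }; then
        echo "--disable-sheepdog"
    fi
}

sheepdog_configure_flag 5.1.0
sheepdog_configure_flag 6.1.0
```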
fixes #2337
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We only test qemu 5.2 in the CI (5.1 for ARM), and I believe we already
have some subtle dependencies that will stop things working on older qemu
versions.
We just updated govmm to a version that explicitly only works with qemu 5.0
and later, so we can drop stale checks for older qemu versions. More
specifically that means we can drop patches for older qemu versions, and
remove checks for older qemu versions from configure-hypervisor.sh.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Update to commit 3c64244cbb, in particular to get these fixes which
are needed to work with qemu-6.0 and later:
https://github.com/kata-containers/govmm/pull/192
https://github.com/kata-containers/govmm/pull/194
Git log
d27256f (qmp: Don't use deprecated 'props' field for object-add, 2021-08-03)
d8cdf9a (qemu: Drop support for versions older than 5.0, 2021-08-03)
1b02192 (Use 'host_device' driver for blockdev backends, 2021-07-29)
9518675 (add support for "sandbox" feature to qemu, 2021-07-20)
335fa81 (qemu: fix golangci-lint errors, 2021-07-21)
61b6378 (.github/workflows: reimplement github actions CI, 2021-07-21)
9d6e797 (go: support go modules, 2021-07-21)
0d21263 (qemu: support read-only nvdimm, 2021-07-21)
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Kata uses the 'kernel_irqchip' machine option to qemu. By default it
uses it in what qemu calls the "short-form boolean" with no parameter.
That style was deprecated by qemu between 5.2 and 6.0 (commit
ccd3b3b8112b) and effectively removed entirely between 6.0 and 6.1
(commit d8fb7d0969d5).
Update ourselves for newer qemus by using an explicit
"kernel_irqchip=on".
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
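To make the kernel_irqchip change above concrete, the following shell sketch rewrites the short-form boolean in a `-machine` option string into the explicit form. It only illustrates the string transformation; the real change lives in the runtime's Go code, and `explicit_irqchip` is a hypothetical helper.

```sh
# Rewrite a bare "kernel_irqchip" entry in a -machine option string
# into the explicit "kernel_irqchip=on" form. Entries that already carry
# a value (for example "kernel_irqchip=split") are left untouched.
explicit_irqchip() {
    printf '%s\n' "$1" |
        sed -E 's/(^|,)kernel_irqchip(,|$)/\1kernel_irqchip=on\2/'
}

explicit_irqchip "q35,accel=kvm,kernel_irqchip"
explicit_irqchip "virt,kernel_irqchip=split"
```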
With lines like
0a2e2c6038/tools/osbuilder/rootfs-builder/fedora/config.sh (L8)
we imply that one can set another OS_VERSION and it will get picked up.
This is not the case when building inside Docker/Podman because the
variable is not passed to the container, which can lead to confusion.
Forward this env.
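A minimal sketch of such forwarding, assuming the container engine accepts `--env NAME=VALUE` (both Docker and Podman do): the helper prints the extra arguments only when `OS_VERSION` is set on the host. The function name `container_env_args` is hypothetical.

```sh
# Print the extra container-engine arguments needed to pass OS_VERSION
# through to the build container; print an empty line when it is unset.
container_env_args() {
    args=""
    if [ -n "${OS_VERSION:-}" ]; then
        args="--env OS_VERSION=${OS_VERSION}"
    fi
    printf '%s\n' "$args"
}

OS_VERSION=33
container_env_args
# Possible use (not executed here):
#   docker run $(container_env_args) <image> <build command>
```

The unquoted command substitution in the usage comment relies on the version string containing no whitespace.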
Fixes: #2378
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Long file paths are difficult to read. This change adds a new readonly variable to shorten the full file path of the static build folder files.
Fixes: #2354
Signed-off-by: Joao Vanzuita <joaovanzuita@me.com>
The swappiness is wrong when only the annotation
io.katacontainers.container.resource.swappiness is set:
$ pod_yaml=pod.yaml
$ container_yaml=container.yaml
$ image="quay.io/prometheus/busybox:latest"
$ cat << EOF > "${pod_yaml}"
metadata:
  name: busybox-sandbox1
EOF
$ cat << EOF > "${container_yaml}"
metadata:
  name: busybox-killed-vmm
annotations:
  io.katacontainers.container.resource.swappiness: "100"
image:
  image: "$image"
command:
- top
EOF
$ sudo crictl pull $image
$ podid=$(sudo crictl runp $pod_yaml)
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
$ sudo crictl start $cid
$ crictl exec $cid cat /sys/fs/cgroup/memory/memory.swappiness
60
The cause of this issue is that two elements store the resources
information: c.config.Resources for calculateSandboxMemory and
c.GetPatchedOCISpec() for the agent.
This adds initConfigResourcesMemory to Container and calls it in
newContainer to handle the issue.
Fixes: #2372
Signed-off-by: Hui Zhu <teawater@antfin.com>
Although the OCI specification does not explicitly require it, we
should create the process CWD if it does not exist, before chdir'ing
to it. Without that fix, the kata-agent fails to create a container
and returns a grpc error when it tries to change the container
working directory to a non-existent folder.
runc, the OCI runtime reference implementation, also creates the process
CWD when it's not part of the container rootfs.
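The behaviour can be pictured with a small shell sketch (the agent itself is written in Rust, so this is only an analogy; `enter_cwd` and the directory names are hypothetical):

```sh
# Enter the requested working directory, creating it first if needed.
enter_cwd() {
    mkdir -p -- "$1" && cd -- "$1"
}

# Demonstration in a throwaway directory: the nested path does not exist
# yet, so a plain `cd` would fail here.
base=$(mktemp -d)
enter_cwd "$base/app/workdir"
pwd
```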
Fixes #2374
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
When enabling tracing with Cloud Hypervisor, we end up establishing 2
connections to 2 different HTTP servers: The Cloud Hypervisor API one
that runs over a UNIX socket and the Jaeger endpoint running over UDP.
Both connections use the default HTTP golang client instance, and thus
share the same transport layer. As the Cloud Hypervisor implementation
sets it up to be over a Unix socket, the jaeger uploader ends up going
through that transport as well, and sending its spans to the Cloud
Hypervisor API server.
We fix that by giving the Cloud Hypervisor implementation its own HTTP
client instance and we avoid sharing it with anything else in the shim.
Fixes #2364
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
This is not the most beautiful solution, but when we do the check on
every single step we ensure the tests at least start, and consequently
succeed.
Without this the tests wouldn't even start, making any PR using the
`force-skip-ci` label not mergeable.
Fixes: #2362
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Run the agent shutdown test as part of CI testing code in this repo.
Fixes: #1808.
Depends-on: github.com/kata-containers/tests#3495
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
All users should be running 2.x releases so remove the legacy details
since it's arguably confusing to have two sets of details.
Reworked the components listed in the main README so that rather than
being sorted alphabetically, they are now sorted in semi-order of
importance and split into two tables to make the point more clearly.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Removed all TOCs now that GitHub auto-generates them.
Also updated the documentation requirements doc removing the requirement
to add a TOC.
Fixes: #2022.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
The kata-deploy README document only covers Firecracker and QEMU. This PR adds
the cloud-hypervisor test command to the README.md file.
Fixes: #2357
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
The old ones are carrying CVEs, do not use them.
PS: In order to update the modules, we're running `make handle_vendor`
target from the runtime's Makefile. This is now part of the CI and
ensures that the vendored code is up-to-date. It's important to note
that older versions of golang may generate different results for those,
but those versions are not supported anymore, so we're good to go with
what we have in the CI (1.15 and 1.16).
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
The old ones are carrying CVEs, so do not use them.
PS: In order to update the modules, we're running `make handle_vendor`
target from the runtime's Makefile. This is now part of the CI and
ensures that the vendored code is up-to-date. It's important to note
that older versions of golang may generate different results for those,
but those versions are not supported anymore, so we're good to go with
what we have in the CI (1.15 and 1.16).
Fixes: #2338
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Create a document summarising the tracing design proposals
from PR #1937.
Fixes: #2061.
Signed-off-by: bin <bin@hyper.sh>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Looking at the changes that could cause the static-checks not to run
when a PR is updated I think 7db8a85a1f
could be the one that introduced such a regression.
Let's (try to) fix this by enforcing the workflow to run also when the
PR has been "edited" and "synchronized".
Fixes: #2343
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Add makefile to document possible options to run.
e.g
Default: Create a kata tarball, it will build assets concurrently.
```
$ make
```
Create a tarball build for cloud-hypervisor.
```
$ make cloud-hypervisor
```
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
If alpine image is created inside a container,
it does not get any golang version data. It will try
to get it by installing yq. To install yq curl is used.
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Tarballs are generated on push and merge events.
push: Allows getting a tarball from the PR to use locally.
merge: After a PR is merged, we have a quick way to get the latest
kata-tarball.
The tarball can only be downloaded from the GitHub page.
Fixes: #1710
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
kata-deploy builder now reuses the build directory, which
makes rebuilds faster. Update the firecracker builder to
not fail if it is called twice.
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Use the yq installed in the env. This is needed
to build kata from docker. The container builder
does not have an initial Go env.
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
Add a script to build kata using docker.
Allow building the kata-deploy binaries using docker.
kata-deploy-binaries-in-docker.sh is a wrapper around
kata-deploy-binaries.sh: it calls kata-deploy-binaries.sh in a
container with all the dependencies installed.
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
rather than removing the other line, because the configuration only contains
the image line now, and this is how we already do it in tests.
Fixes: #2330
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
This PR updates the Developer Guide document with the proper URL for
the log parser for Kata 2.x.
Fixes #2328
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
There is a new "ReadOnly" option added to the nvdimm device in qemu,
and it is now added to kata. However, the qemu used for arm64 is a little
old and does not have this feature. Here we remove this feature for arm.
Fixes: #2320
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
This PR updates the experimental documentation with the proper reference
to kata 2.x
Fixes #2317
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This causes the repository to be checked out to a version tag, which is
inconsistent with how we build runtime, and reverts us to a buggy
`snap/snapcraft.yaml`.
Fixes: #2313
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Let's ensure the runc version installed and used for running our tests
matches the vendored version.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Prior to our bump to runc 1.0.1 the manager's Set() would take a Config as
its parameter. Now it takes the Resources directly.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Previously part of the "system" namespace, the RunningInUserNS() has
been moved to the "userns" namespace.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Dependabot brought to our attention that we were still vendoring the runc
code which was affected by CVE-2021-30465.
Although the vulnerability doesn't seem to affect kata-containers, we
had better keep our dependencies up-to-date anyway. With this in mind,
let's bump our runc dependency to the latest release.
Fixes: #2309
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Improve security by making the rootfs image read-only, so that nobody
will be able to modify it from the guest.
fixes #1916
Signed-off-by: Julio Montes <julio.montes@intel.com>
Update kata-deploy-binaries.sh cli options.
Add options to allow building a tarball for a specific asset.
This will help developers build a specific component and update
a kata-deploy installation. Also, building each asset independently
can help to create cache tarballs per asset in the future.
e.g. Build a tarball with shimv2.
```
./kata-deploy-binaries.sh --build=shim-v2
```
Additionally, the script is moved to a new directory,
as it will not only work for releases.
Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com>
yq changed syntax in an incompatible way starting from version 4 and
above. Deal with that.
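As a sketch of the incompatibility: yq v3 reads a key with `yq read <file> <path>`, while v4 uses `yq eval '.<path>' <file>`. The helper below prints the matching command line for a given major version. The function name, the file name and the key are illustrative, and how the major version is detected is left out on purpose.

```sh
# Print the yq command line that reads a dotted key from a YAML file.
# $1 = yq major version, $2 = YAML file, $3 = key path without leading dot
yq_read_cmd() {
    if [ "$1" -ge 4 ]; then
        # v4 and later: expression first, with a leading dot.
        printf "yq eval '.%s' %s\n" "$3" "$2"
    else
        # v3 and earlier: "read" subcommand, file first, no leading dot.
        printf "yq read %s '%s'\n" "$2" "$3"
    fi
}

yq_read_cmd 3 versions.yaml languages.golang.version
yq_read_cmd 4 versions.yaml languages.golang.version
```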
Fixes: #2297
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
Bring read-only nvdimm support
Shortlog:
335fa81 qemu: fix golangci-lint errors
61b6378 .github/workflows: reimplement github actions CI
9d6e797 go: support go modules
0d21263 qemu: support read-only nvdimm
ff34d28 qemu: Consistent parameter building
Signed-off-by: Julio Montes <julio.montes@intel.com>
PR #2252 put `set -o nounset` in `ci/lib.sh`. It turns out that this
won't work when `$CI` is unset (it is always set in CI). Expand `$CI` to
nothing.
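A minimal sketch of the pattern: with `nounset` active, a bare `$CI` aborts the script when the variable is unset, whereas `${CI:-}` expands to an empty string. The comparison against `true` and the function name `run_mode` are assumptions made for the example.

```sh
set -o nounset

# "${CI:-}" expands to the empty string when CI is unset, so this test
# does not trip nounset when the script runs outside of CI.
run_mode() {
    if [ "${CI:-}" = "true" ]; then
        echo "ci"
    else
        echo "local"
    fi
}

run_mode
```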
Fixes: #2283
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
The monitor socket uses a unix socket path file, which
needs to be cleaned up after the pod terminates. Thus,
put it into the sandbox data directory, so that it
is cleaned up once the sandbox terminates.
Fixes: #2269
Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
This commit is a result of Assisted PR Process for PR #1515. It
deviates from it in that the original commits were not retained as the
original commit structure was unnecessarily complex - the same commit
was added to two parallel branches which were then merged, producing the
same result in the end as any of the original two non-merge commits.
Also, a squash was requested by an original PR review.
Other changes to the original PR were changing capitalisation of the word
"Kubelet" in Glossary.md to placate spell checker and fixing link names and
syntax.
The original commit message follows:
The terms added are: Kata Containers, container software, container
runtime interface, virtual machine software, container virtualization,
container security solutions, serverless containers, pod containers,
virtual machine monitor, private cloud, infrastructure architecture,
public cloud, and auto scaling.
Fixes: #1509
Signed-off-by: Helena Spease <helena@openstack.org>
Signed-off-by: Pavel Mores <pmores@redhat.com>
fixed arm qemu patches dir in snap part. Clear the old `packaging/obs-packaging` path.
Fixes: #2279
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
The kata-deploy project scripts were changed, but the minikube installation guide doc still uses the old yaml script.
Fix the guide doc to use the new yaml script of runtimeClasses.
Fixes: #2276
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
The kata-deploy project path has changed since kata v2. Fix the kata-deploy path in the document how-to-import-kata-logs-with-fluentd.md.
The correct path is `$GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kata-deploy`
Fixes: #2273
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
This commit adds the option "enable_guest_swap" to the config hypervisor.qemu.
It will enable swap in the guest. Default false.
When enable_guest_swap is enabled, insert a raw file to the guest as the
swap device if the swappiness of a container (set by annotation
"io.katacontainers.container.resource.swappiness") is bigger than 0.
The size of the swap device should be
swap_in_bytes (set by annotation
"io.katacontainers.container.resource.swap_in_bytes") - memory_limit_in_bytes.
If swap_in_bytes is not set, the size should be memory_limit_in_bytes.
If neither swap_in_bytes nor memory_limit_in_bytes is set, the size should be
default_memory.
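The sizing rules above can be written down as a small shell sketch. It assumes enable_guest_swap is already on, uses 0 to mean "not set", and takes every value in bytes. Those conventions and the name `guest_swap_size` are assumptions for illustration, and the case where only swap_in_bytes is set is not described in this message, so the sketch simply subtracts zero there.

```sh
# Print the swap device size in bytes, or 0 when no swap device is needed.
# $1 = swappiness
# $2 = swap_in_bytes (0 = not set)
# $3 = memory_limit_in_bytes (0 = not set)
# $4 = default memory, in bytes
guest_swap_size() {
    swappiness=$1
    swap=$2
    limit=$3
    default_mem=$4

    if [ "$swappiness" -le 0 ]; then
        echo 0                      # swap is only added when swappiness > 0
    elif [ "$swap" -gt 0 ]; then
        echo $((swap - limit))      # swap_in_bytes - memory_limit_in_bytes
    elif [ "$limit" -gt 0 ]; then
        echo "$limit"               # swap_in_bytes not set
    else
        echo "$default_mem"         # neither value set
    fi
}

# 3 GiB swap_in_bytes with a 1 GiB memory limit.
guest_swap_size 60 3221225472 1073741824 2147483648
```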
Fixes: #2201
Signed-off-by: Hui Zhu <teawater@antfin.com>
This commit adds code to handle the annotations
"io.katacontainers.container.resource.swappiness" and
"io.katacontainers.container.resource.swap_in_bytes".
It will set the value of "io.katacontainers.container.resource.swappiness" to
c.config.Resources.Memory.Swappiness and set the value of
"io.katacontainers.container.resource.swap_in_bytes" to
c.config.Resources.Memory.Swap.
Fixes: #2201
Signed-off-by: Hui Zhu <teawater@antfin.com>
ocispec.Annotations is dropped in ContainerConfig.
This commit stores it in containerConfig.Annotations in
ContainerConfig.
Fixes: #2201
Signed-off-by: Hui Zhu <teawater@antfin.com>
addSwap will create a swap file, hotplug it to the hypervisor as a special
block device and let the agent set it up in the guest kernel.
removeSwap will remove the swap file.
Only QEMU supports addSwap.
Fixes: #2201
Signed-off-by: Hui Zhu <teawater@antfin.com>
Add new function AddSwap. When the agent gets AddSwap, it will get the device
name from PCIPath and set the device as the swap device.
Fixes: #2201
Signed-off-by: Hui Zhu <teawater@antfin.com>
The 'FLAGS' hash map has a bool to indicate if the flag should be cleared or
not. But in parse_mount_flags_and_options() we set the flag even if 'clear'
is true. This results in a 'rw' mount being mounted as 'MS_RDONLY'.
Fixes: #2262
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Commit 32c9ae1388 upgraded the
containerd vendor, which uses the socket path to replace
the abstract socket address for socket listen and dial, and
there's a bug in containerd's abstract socket dialing.
Thus we should replace our monitor and exec socket server
with the socket path to fix this issue.
Fixes: #2238
Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
Building the rootfs does not depend on golang, so dropping the golang
installation may save build time.
And there is only the rust agent now, so the code for the golang agent should
be deleted too.
Fixes: #2170
Signed-off-by: bin <bin@hyper.sh>
If you snap in an environment where you previously snapped,
`git clone`ing QEMU will fail. Remove the checkout directory.
Fixes: #2249
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
It's better to check whether the destination file exists
before creating it; if it already exists, return
directly.
Fixes: #2247
Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
Update to latest tokio to address RUSTSEC-2021-0072:
Task dropped in wrong thread when aborting `LocalSet` task
Update the toml to specify just 1.x for the tokio version.
Fixes: #2165
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Instead of calling the ci/static-checks.sh script directly, the workflow was
changed to call `make static-checks`. And because the `static-checks` target
depends on build, the build step in the workflow is no longer needed.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Added the 'static-checks' make target to allow developers to easily run
the static checks locally.
Fixes #2206
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Make the vsock-exporter fully async using the tokio runtime.
And delay the timing of the connection to trace-forwarder so that
it is easy to reconnect when the connection is broken.
Fixes: #2234
Signed-off-by: Tim Zhang <tim@hyper.sh>
`containerd` has adopted a new configuration style. Update the example configuration to reflect the change.
Fixes: #2180
Signed-off-by: Yujia Qiao <qiaoyujia@bytedance.com>
2021-07-12 10:25:21 +00:00
1826 changed files with 113247 additions and 159118 deletions
A method used in cloud computing whereby the amount of computational resources in a server farm, typically measured in terms of the number of active servers, varies automatically based on the load on the farm.
## B
## C
### Container Security Solutions
The process of implementing security tools and policies that will give you the assurance that everything in your container is running as intended, and only as intended.
### Container Software
A standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
### Container Runtime Interface
A plugin interface which enables Kubelet to use a wide variety of container runtimes, without the need to recompile.
### Container Virtualization
A container is a virtual runtime environment that runs on top of a single operating system (OS) kernel and emulates an operating system rather than the underlying hardware.
## D
## E
## F
## G
## H
## I
### Infrastructure Architecture
A structured and modern approach for supporting an organization and facilitating innovation within an enterprise.
## J
## K
### Kata Containers
Kata Containers is an open source project delivering increased container security and workload isolation through an implementation of lightweight virtual machines.
## L
## M
## N
## O
## P
### Pod Containers
A group of one or more containers, with shared storage/network, and a specification for how to run the containers.
### Private Cloud
A computing model that offers a proprietary environment dedicated to a single business entity.
### Public Cloud
Computing services offered by third-party providers over the public Internet, making them available to anyone who wants to use or purchase them.
## Q
## R
## S
### Serverless Containers
An architecture in which code is executed on-demand. Serverless workloads are typically in the cloud, but on-premises serverless platforms exist, too.
## T
## U
## V
### Virtual Machine Monitor
Computer software, firmware or hardware that creates and runs virtual machines.
### Virtual Machine Software
A software program or operating system that not only exhibits the behavior of a separate computer, but is also capable of performing tasks such as running applications and programs like a separate computer.
Kata Containers is an open source project and community working to build a
@@ -67,69 +46,34 @@ Please raise an issue
> **Note:**
> If you are reporting a security issue, please follow the [vulnerability reporting process](https://github.com/kata-containers/community#vulnerability-handling)
#### Kata Containers 1.x versions
For older Kata Containers 1.x releases, please raise an issue in the
| [KSM throttler](https://github.com/kata-containers/ksm-throttler) | optional core | Daemon that monitors containers and deduplicates memory to maximize container density on the host. |
| [osbuilder](https://github.com/kata-containers/osbuilder) | infrastructure | See [components](#components). |
| [packaging](https://github.com/kata-containers/packaging) | infrastructure | See [components](#components). |
| [proxy](https://github.com/kata-containers/proxy) | core | Multiplexes communications between the shims, agent and runtime. |
| [runtime](https://github.com/kata-containers/runtime) | core | See [components](#components). |
| [shim](https://github.com/kata-containers/shim) | core | Handles standard I/O and signals on behalf of the container process. |
> **Note:**
>
> - There are more components for the original Kata Containers 1.x implementation.
> - The current implementation simplifies the design significantly:
> compare the [current](docs/design/architecture.md) and
The following repositories are used by both the current and first generation Kata Containers implementations:
| Component | Description | Current | First generation | Notes |
|-|-|-|-|-|
| CI | Continuous Integration configuration files and scripts. | [Kata 2.x](https://github.com/kata-containers/ci/tree/main) | [Kata 1.x](https://github.com/kata-containers/ci/tree/master) | |
| kernel | The Linux kernel used by the hypervisor to boot the guest image. | [Kata 2.x][kernel] | [Kata 1.x][kernel] | Patches are stored in the packaging component. |
| tests | Test code. | [Kata 2.x](https://github.com/kata-containers/tests/tree/main) | [Kata 1.x](https://github.com/kata-containers/tests/tree/master) | Excludes unit tests which live with the main code. |
| www.katacontainers.io | Contains the source for the [main web site](https://www.katacontainers.io). | [Kata 2.x][github-katacontainers.io] | [Kata 1.x][github-katacontainers.io] | | |
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [`agent-ctl`](tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
### Packaging and releases
@@ -138,6 +82,9 @@ Kata Containers is now
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
the [components](#components) section for further details.
## Glossary of Terms
See the [glossary of terms](Glossary.md) related to Kata Containers.
You MUST choose one of `alpine`, `centos`, `clearlinux`, `debian`, `euleros`, `fedora`, `suse`, and `ubuntu` for `${distro}`. By default `seccomp` packages are not included in the rootfs image. Set `SECCOMP` to `yes` to include them.
> - If you do *not* wish to build under Docker, remove the `USE_DOCKER`
> variable in the previous command and ensure the `qemu-img` command is
> available on your system.
> - If `qemu-img` is not installed, you will likely see errors such as `ERROR: File /dev/loop19p1 is not a block device` and `losetup: /tmp/tmp.bHz11oY851: Warning: file is smaller than 512 bytes; the loop device may be useless or invisible for system tools`. These can be mitigated by installing the `qemu-img` command (available in the `qemu-img` package on Fedora or the `qemu-utils` package on Debian).
* [Design and Implementations](#design-and-implementations)
* [How to Contribute](#how-to-contribute)
* [Code Licensing](#code-licensing)
* [The Release Process](#the-release-process)
* [Help Improving the Documents](#help-improving-the-documents)
* [Website Changes](#website-changes)
The [Kata Containers](https://github.com/kata-containers)
documentation repository hosts overall system documentation, with information
common to multiple components.
@@ -22,6 +11,10 @@ For details of the other Kata Containers repositories, see the
* [Installation guides](./install/README.md): Install and run Kata Containers with Docker or Kubernetes
## Tracing
See the [tracing documentation](tracing.md).
## More User Guides
* [Upgrading](Upgrading.md): how to upgrade from [Clear Containers](https://github.com/clearcontainers) and [runV](https://github.com/hyperhq/runv) to [Kata Containers](https://github.com/kata-containers) and how to upgrade an existing Kata Containers system to the latest version.
@@ -51,6 +44,7 @@ Documents that help to understand and contribute to Kata Containers.
* [Kata Containers Architecture](design/architecture.md): Architectural overview of Kata Containers
* [Kata Containers E2E Flow](design/end-to-end-flow.md): The entire end-to-end flow of Kata Containers
* [Kata Containers design](./design/README.md): More Kata Containers design documents
* [Kata Containers threat model](./threat-model/threat-model.md): Kata Containers threat model
- [How to do a Kata Containers Release](#how-to-do-a-kata-containers-release)
- [Requirements](#requirements)
- [Release Process](#release-process)
- [Bump all Kata repositories](#bump-all-kata-repositories)
- [Merge all bump version Pull requests](#merge-all-bump-version-pull-requests)
- [Tag all Kata repositories](#tag-all-kata-repositories)
- [Check Git-hub Actions](#check-git-hub-actions)
- [Create release notes](#create-release-notes)
- [Announce the release](#announce-the-release)
<!-- TOC END -->
## Requirements
- [hub](https://github.com/github/hub)
@@ -78,7 +64,7 @@
### Check Git-hub Actions
We make use of [GitHub actions](https://github.com/features/actions) in this [file](https://github.com/kata-containers/kata-containers/blob/main/.github/workflows/main.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-containers` repository.
We make use of [GitHub actions](https://github.com/features/actions) in this [file](https://github.com/kata-containers/kata-containers/blob/main/.github/workflows/release.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-containers` repository.
Check the [actions status page](https://github.com/kata-containers/kata-containers/actions) to verify all steps in the actions workflow have completed successfully. On success, a static tarball containing Kata release artifacts will be uploaded to the [Release page](https://github.com/kata-containers/kata-containers/releases).
- [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
- [Appendices](#appendices)
- [DAX](#dax)
## Overview
This is an architectural overview of Kata Containers, based on the 2.0 release.
@@ -35,7 +14,7 @@ through the [CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and
Kata Containers creates a QEMU\*/KVM virtual machine for pod that `kubelet` (Kubernetes) creates respectively.
The [`containerd-shim-kata-v2` (shown as `shimv2` from this point onwards)](../../src/runtime/containerd-shim-v2)
The [`containerd-shim-kata-v2` (shown as `shimv2` from this point onwards)](../../src/runtime/cmd/containerd-shim-kata-v2/)
is the Kata Containers entrypoint, which
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2) for Kata.
@@ -280,7 +259,7 @@ With `RuntimeClass`, users can define Kata Containers as a `RuntimeClass` and th
## DAX
Kata Containers utilizes the Linux kernel DAX [(Direct Access filesystem)](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.txt)
Kata Containers utilizes the Linux kernel DAX [(Direct Access filesystem)](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.rst?h=v5.14)
feature to efficiently map some host-side files into the guest VM space.
In particular, Kata Containers uses the QEMU NVDIMM feature to provide a
memory-mapped virtual device that can be used to DAX map the virtual machine's
| `SandboxCgroupOnly=false` | yes | legacy | Easiest to make Kata work | Unaccounted for memory and resource utilization | v1
| `SandboxCgroupOnly=true` | no | recommended | Complete tracking of Kata memory and CPU utilization. In Kubernetes, the Kubelet can fully constrain Kata via the pod cgroup | Requires upper layer orchestrator which sizes sandbox cgroup appropriately | v1, v2
Kata implements CRI's API and supports the [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose container metrics. Users can use these interfaces to get basic metrics about containers.
But unlike `runc`, Kata is a VM-based runtime and has a different architecture.
@@ -222,7 +207,7 @@ Metrics for Firecracker vmm.
| `kata_firecracker_uart`: <br> Metrics specific to the UART device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`flush_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vcpu`: <br> Metrics specific to VCPUs' mode of functioning. | `GAUGE` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
| `kata_firecracker_vmm`: <br> Metrics specific to the machine manager as a whole. | `GAUGE` | | <ul><li>`item`<ul><li>`device_events`</li><li>`panic_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
To fulfill the [Kata design requirements](kata-design-requirements.md), and based on the discussion on [Virtcontainers API extensions](https://docs.google.com/presentation/d/1dbGrD1h9cpuqAPooiEgtiwWDGCYhVPdatq7owsKHDEQ), the Kata runtime library features the following APIs:
@@ -30,7 +30,7 @@ The Kata Containers runtime **MUST** implement the following command line option
The Kata Containers project **MUST** provide two interfaces for CRI shims to manage hardware
virtualization based Kubernetes pods and containers:
- An OCI and `runc` compatible command line interface, as described in the previous section.
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`cri-containerd`](https://github.com/containerd/cri-containerd), for example.
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`containerd`](https://github.com/containerd/containerd), for example.
- A hardware virtualization runtime library API for CRI shims to consume and provide a more
CRI native implementation. The [`frakti`](https://github.com/kubernetes/frakti) CRI shim is an example of such a consumer.
- [Run Kata containers with `crictl`](run-kata-with-crictl.md)
- [Run Kata Containers with Kubernetes](run-kata-with-k8s.md)
- [How to use Kata Containers and Containerd](containerd-kata.md)
- [How to use Kata Containers and CRI (containerd plugin) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
- [How to use Kata Containers and CRI (containerd) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
- [Kata Containers and service mesh for Kubernetes](service-mesh.md)
- [How to import Kata Containers logs into Fluentd](how-to-import-kata-logs-with-fluentd.md)
@@ -21,13 +17,13 @@
- `firecracker`
- `ACRN`
While `qemu`and`cloud-hypervisor` work out of the box with installation of Kata,
some additional configuration is needed in case of `firecracker` and `ACRN`.
While `qemu`,`cloud-hypervisor` and `firecracker` work out of the box with installation of Kata,
some additional configuration is needed in case of `ACRN`.
Refer to the following guides for additional configuration steps:
- [Kata Containers with Firecracker](https://github.com/kata-containers/documentation/wiki/Initial-release-of-Kata-Containers-with-Firecracker-support)
- [Kata Containers with ACRN Hypervisor](how-to-use-kata-containers-with-acrn.md)
## Advanced Topics
- [How to use Kata Containers with virtio-fs](how-to-use-virtio-fs-with-kata.md)
- [Setting Sysctls with Kata](how-to-use-sysctls-with-kata.md)
- [What Is VMCache and How To Enable It](what-is-vm-cache-and-how-do-I-use-it.md)
@@ -38,3 +34,5 @@
- [How to set sandbox Kata Containers configurations with pod annotations](how-to-set-sandbox-config-kata.md)
- [How to monitor Kata Containers in K8s](how-to-set-prometheus-in-k8s.md)
- [How to use hotplug memory on arm64 in Kata Containers](how-to-hotplug-memory-arm64.md)
- [How to setup swap devices in guest kernel](how-to-setup-swap-devices-in-guest-kernel.md)
- [How to run rootless vmm](how-to-run-rootless-vmm.md)
To improve security, Kata Containers supports running the VMM process (currently only QEMU) as a non-`root` user.
This document describes how to enable the rootless VMM mode and its limitations.
## Pre-requisites
The permissions and ownership of the `kvm` device node (`/dev/kvm`) need to be configured as follows:
```
crw-rw---- 1 root kvm
```
To do so, use the following commands:
```
$ sudo groupadd kvm -r
$ sudo chown root:kvm /dev/kvm
$ sudo chmod 660 /dev/kvm
```
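As a rough way to confirm the result, the sketch below renders a mode the same way `ls -l` does and compares it with the expected `crw-rw----` string shown above. The helper names `kvm_mode_ok` and `describe_kvm` are hypothetical and are not part of Kata Containers; the check only covers the mode bits, not the `root:kvm` ownership.

```python
import os
import stat


def kvm_mode_ok(st_mode: int) -> bool:
    """Return True if the mode renders as crw-rw---- (character device, 0660)."""
    return stat.filemode(st_mode) == "crw-rw----"


def describe_kvm(path: str = "/dev/kvm") -> str:
    """Report the mode, owner uid and group gid of the device node, if present."""
    if not os.path.exists(path):
        return f"{path}: not present on this host"
    st = os.stat(path)
    return f"{path}: {stat.filemode(st.st_mode)} uid={st.st_uid} gid={st.st_gid}"


if __name__ == "__main__":
    # Prints the current state of /dev/kvm on the host running the script.
    print(describe_kvm())
```

The uid and gid are printed numerically so that they can be compared against the `root` user and the `kvm` group created above.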
## Configure rootless VMM
By default, the VMM process still runs as the root user. There are two ways to enable rootless VMM:
1. Set the `rootless` flag to `true` in the hypervisor section of `configuration.toml`.
2. Set the Kubernetes annotation `io.katacontainers.hypervisor.rootless` to `true`.
## Implementation details
When the `rootless` flag is enabled, upon a request to create a Pod, the Kata Containers runtime creates a random user and group (e.g. `kata-123`), and uses them to start the hypervisor process.
The `kvm` group is also given to the hypervisor process as a supplemental group to give the hypervisor process access to the `/dev/kvm` device.
Another necessary change is to move the hypervisor runtime files (e.g. `vhost-fs.sock`, `qmp.sock`) to a directory (under `/run/user/[uid]/`) that only the non-root hypervisor can access.
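The pieces described above can be summarized in a small sketch. It is only an illustration of the data involved, not the runtime's actual code: the function name `plan_rootless_vmm`, the dictionary layout, and the way the numeric suffix is chosen are all assumptions.

```python
import random
from typing import Optional


def plan_rootless_vmm(uid: int, kvm_gid: int, suffix: Optional[int] = None) -> dict:
    """Describe the identity and runtime directory for a non-root hypervisor.

    uid and kvm_gid are supplied by the caller; how the real runtime
    allocates them is not covered here (assumption).
    """
    if suffix is None:
        # Hypothetical choice of a random suffix, giving names like kata-123.
        suffix = random.randint(0, 999)
    name = f"kata-{suffix}"
    return {
        "user": name,
        "group": name,
        # The kvm group is added as a supplemental group for /dev/kvm access.
        "supplementary_gids": [kvm_gid],
        # Runtime files such as qmp.sock live in a directory under this path.
        "runtime_root": f"/run/user/{uid}",
    }


if __name__ == "__main__":
    print(plan_rootless_vmm(uid=1234, kvm_gid=36, suffix=123))
```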
## Limitations
1. Only the VMM process runs as a non-root user. Other processes such as the Kata Containers shimv2 and `virtiofsd` still run as the root user.
2. Currently, this feature is only supported with QEMU. Support for Firecracker and Cloud Hypervisor is still pending (see https://github.com/kata-containers/kata-containers/issues/2567).
3. Certain features will not work when rootless VMM is enabled, including:
   1. Passing devices to the guest (`virtio-blk`, `virtio-scsi`) will not work if the non-privileged user does not have permission to access them (leading to a permission denied error). A more permissive permission (e.g. 666) may overcome this issue. However, you need to be aware of the potential security implications of reducing the security on such devices.
   2. `vfio` devices will also not work, because of a permission denied error.
@@ -34,8 +34,6 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.agent.enable_tracing` | `boolean` | enable tracing for the agent |
| `io.katacontainers.config.agent.container_pipe_size` | uint32 | specify the size of the std(in/out) pipes created for containers |
| `io.katacontainers.config.agent.kernel_modules` | string | the list of kernel modules and their parameters that will be loaded in the guest kernel. Semicolon separated list of kernel modules and their parameters. These modules will be loaded in the guest kernel using `modprobe`(8). E.g., `e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0` |
| `io.katacontainers.config.agent.trace_mode` | string | the trace mode for the agent |
| `io.katacontainers.config.agent.trace_type` | string | the trace type for the agent |
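To make the `kernel_modules` value format concrete, here is a minimal sketch that splits such a string into module names and their parameters. The function name `parse_kernel_modules` is hypothetical; it only illustrates the semicolon-separated format described in the table and is not the parser used by Kata.

```python
from typing import List, Tuple


def parse_kernel_modules(value: str) -> List[Tuple[str, List[str]]]:
    """Split a semicolon-separated module list into (name, parameters) pairs."""
    modules = []
    for entry in value.split(";"):
        fields = entry.split()
        if not fields:
            # Skip empty entries, e.g. from a trailing semicolon.
            continue
        modules.append((fields[0], fields[1:]))
    return modules


if __name__ == "__main__":
    example = "e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0"
    for name, params in parse_kernel_modules(example):
        print(name, params)
```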
## Hypervisor Options
| Key | Value Type | Comments |
@@ -91,6 +89,13 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.virtio_fs_cache` | string | the cache mode for virtio-fs, valid values are `always`, `auto` and `none` |
Setting up a swap device in the guest kernel can increase memory capacity, help with some memory issues, and sometimes increase file access speed.
Kata Containers can insert a raw file into the guest as the swap device.
## Requisites
The swap configuration of the containers is set through [annotations](how-to-set-sandbox-config-kata.md#container-options), so [extra configuration is needed for containerd](how-to-set-sandbox-config-kata.md#containerd-configuration).
Kata Containers supports setting up a swap device in the guest kernel only with QEMU.
Install and set up Kata Containers as shown [here](../install/README.md).
Enable the guest swap device setup as follows:
```
$ sudo sed -i -e 's/^#enable_guest_swap.*$/enable_guest_swap = true/g' /etc/kata-containers/configuration.toml
```
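For readers who prefer to script the change, the substitution performed by the `sed` command can be sketched in Python as below. The function name `enable_guest_swap` is hypothetical, and the sample configuration text in the usage example is made up for illustration; only the commented-out `#enable_guest_swap` line is rewritten.

```python
import re


def enable_guest_swap(config_text: str) -> str:
    """Uncomment and enable enable_guest_swap, mirroring the sed expression."""
    return re.sub(
        r"^#enable_guest_swap.*$",
        "enable_guest_swap = true",
        config_text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    # Made-up sample text; a real configuration.toml contains many more keys.
    sample = "[hypervisor.qemu]\n#enable_guest_swap = true\n"
    print(enable_guest_swap(sample))
```

Like the `sed` command, this leaves the text unchanged when the option is already uncommented.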
## Run a Kata Container utilizing swap device
Use the following command to start a Kata Container with swappiness 60 and a 1GB swap device (`swap_in_bytes` - `memory_limit_in_bytes`).
This document provides an overview on how to run Kata containers with ACRN hypervisor and device model.
- [Introduction](#introduction)
- [Pre-requisites](#pre-requisites)
- [Configure Docker](#configure-docker)
- [Configure Kata Containers with ACRN](#configure-kata-containers-with-acrn)
## Introduction
ACRN is a flexible, lightweight Type-1 reference hypervisor built with real-time and safety-criticality in mind. ACRN uses an open source platform making it optimized to streamline embedded development.
@@ -27,7 +22,7 @@ This document requires the presence of the ACRN hypervisor and Kata Containers o
- For networking, ACRN supports either MACVTAP or TAP. If MACVTAP is not enabled in the Service OS, please follow the below steps to update the kernel:
- [Kata Containers with virtio-fs](#kata-containers-with-virtio-fs)
- [Introduction](#introduction)
## Introduction
Container deployments utilize explicit or implicit file sharing between host filesystem and containers. From a trust perspective, avoiding a shared file-system between the trusted host and untrusted container is recommended. This is not always feasible. In Kata Containers, block-based volumes are preferred as they allow usage of either device pass through or `virtio-blk` for access within the virtual machine.
- [Run a Kata Container utilizing `virtio-mem`](#run-a-kata-container-utilizing-virtio-mem)
## Introduction
The basic idea of `virtio-mem` is to provide a flexible, cross-architecture memory hot plug and hot unplug solution that avoids many limitations imposed by existing technologies, architectures, and interfaces.
Kata Containers on Amazon Web Services (AWS) makes use of [i3.metal](https://aws.amazon.com/ec2/instance-types/i3/) instances. Most of the installation procedure is identical to that for Kata on your preferred distribution, except that you have to run it on bare metal instances since AWS doesn't support nested virtualization yet. This guide walks you through creating an i3.metal instance.
# Install Kata Containers on Google Compute Engine
* [Create an Image with Nested Virtualization Enabled](#create-an-image-with-nested-virtualization-enabled)
* [Create the Image](#create-the-image)
* [Verify VMX is Available](#verify-vmx-is-available)
* [Install Kata](#install-kata)
* [Create a Kata-enabled Image](#create-a-kata-enabled-image)
Kata Containers on Google Compute Engine (GCE) makes use of [nested virtualization](https://cloud.google.com/compute/docs/instances/enable-nested-virtualization-vm-instances). Most of the installation procedure is identical to that for Kata on your preferred distribution, but enabling nested virtualization currently requires extra steps on GCE. This guide walks you through creating an image and instance with nested virtualization enabled. Note that `kata-runtime check` checks for nested virtualization, but does not fail if support is not found.
As a pre-requisite, this guide assumes an installed and configured instance of the [Google Cloud SDK](https://cloud.google.com/sdk/downloads). For a zero-configuration option, all of the commands below were tested under [Google Cloud Shell](https://cloud.google.com/shell/) (as of Jun 2018). Verify your `gcloud` installation and configuration:
This document discusses threat models associated with the Kata Containers project.
Kata was designed to provide additional isolation of container workloads, protecting
the host infrastructure from potentially malicious container users or workloads. Since
Kata Containers adds a level of isolation on top of traditional containers, the focus
is on the additional layer provided, not on traditional container security.
This document provides a brief background on containers and layered security, describes
the interface to Kata from CRI runtimes, reviews the virtual machine interfaces used, and then
reviews the threats.
## Kata security objective
Kata seeks to prevent an untrusted container workload, or a user of that container workload, from gaining
control of, obtaining information from, or tampering with the host infrastructure.
In our scenario, an asset is anything on the host system, or elsewhere in the cluster
infrastructure. The attacker is assumed to be either a malicious user or the workload itself
running within the container. The goal of Kata is to prevent attacks which would allow
any access to the defined assets.
## Background on containers, layered security
Traditional containers leverage several key Linux kernel features to provide isolation and
a view that the container workload is the only entity running on the host. Key features include
`Namespaces`, `cgroups`, `capabilities`, `SELinux` and `seccomp`. The canonical runtime for creating such
a container is `runc`. In the remainder of the document, the term `traditional-container` will be used
to describe a container workload created by `runc`.
Kata Containers provides a second layer of isolation on top of those provided by traditional-containers.
The hardware virtualization interface is the basis of this additional layer. Kata launches a lightweight
virtual machine, and uses the guest’s Linux kernel to create a container workload, or workloads in the case
of multi-container pods. In Kubernetes and in the Kata implementation, the sandbox is carried out at the
pod level. In Kata, this sandbox is created using a virtual machine.
## Interface to Kata Containers: CRI, v2-shim, OCI
A typical Kata Containers deployment uses Kubernetes with a CRI implementation.
On every node, Kubelet will interact with a CRI implementor, which will in turn interface with
an OCI based runtime, such as Kata Containers. Typical CRI implementors are `cri-o` and `containerd`.
The CRI API, as defined at the Kubernetes [CRI-API repo](https://github.com/kubernetes/cri-api/),
results in a few constructs being supported by the CRI implementation, and ultimately in the OCI
runtime creating the workloads.
In order to run a container inside of the Kata sandbox, several virtual machine devices and interfaces
are required. Kata translates sandbox and container definitions to underlying virtualization technologies provided
by a set of virtual machine monitors (VMMs) and hypervisors. These devices and their underlying
implementations are discussed in detail in the following section.
## Interface to the Kata sandbox/virtual machine
Today, the devices that Kata needs in the guest are:
- Storage: In the current design of Kata Containers, we are reliant on the CRI implementor to
assist in image handling and volume management on the host. As a result, we need to support a way of passing to the sandbox the container rootfs, volumes requested
by the workload, and any other volumes created to facilitate sharing of secrets and `configmaps` with the containers. Depending on how these are managed, a block based device or file-system
sharing is required. Kata Containers does this by way of `virtio-blk` and/or `virtio-fs`.
- Networking: A method for enabling network connectivity with the workload is required. Typically this will be done by providing a `TAP` device
to the VMM, and this will be exposed to the guest as a `virtio-net` device. It is feasible to pass in a NIC device directly, in which case `VFIO` is leveraged
and the device itself will be exposed to the guest.
- Control: In order to interact with the guest agent and retrieve `STDIO` from containers, a medium of communication is required.
This is available via `virtio-vsock`.
- Devices: `VFIO` is utilized when devices are passed directly to the virtual machine and exposed to the container.
- Dynamic Resource Management: `ACPI` is utilized to allow for dynamic VM resource management (for example: CPU, memory, device hotplug). This is required when containers are resized,
or more generally when containers are added to a pod.
How these devices are utilized varies depending on the VMM utilized. We clarify the default settings provided when integrating Kata
with the QEMU, Firecracker and Cloud Hypervisor VMMs in the following sections.
### Devices
Each virtio device is implemented by a backend, which may execute within userspace on the host (vhost-user), the VMM itself, or within the host kernel (vhost). While they may provide enhanced performance,
vhost devices are often seen as higher risk since an exploit would already be running within the kernel space. While VMM and vhost-user are both in userspace on the host, `vhost-user` generally allows the back-end process to require fewer system calls and capabilities compared to a full VMM.
#### `virtio-blk` and `virtio-scsi`
The backends for `virtio-blk` and `virtio-scsi` are based in the VMM itself (ring3 in the context of x86) by default for Cloud Hypervisor, Firecracker and QEMU.
While `vhost` based back-ends are available for QEMU, they are not recommended. `vhost-user` back-ends are being added for Cloud Hypervisor, but they are not utilized in Kata today.
#### `virtio-fs`
`virtio-fs` is supported in Cloud Hypervisor and QEMU. `virtio-fs`'s interaction with the host filesystem is done through a vhost-user daemon, `virtiofsd`.
The `virtio-fs` client, running in the guest, will generate requests to access files. `virtiofsd` will receive requests, open the file, and request the VMM
to `mmap` it into the guest. When DAX is utilized, the guest will access the host's page cache, avoiding the need for copy and duplication. DAX is still an experimental feature,
and is not enabled by default.
From the `virtiofsd` [documentation](https://qemu-project.gitlab.io/qemu/tools/virtiofsd.html):
```This program must be run as the root user. Upon startup the program will switch into a new file system namespace with the shared directory tree as its root. This prevents “file system escapes” due to symlinks and other file system objects that might lead to files outside the shared directory. The program also sandboxes itself using seccomp(2) to prevent ptrace(2) and other vectors that could allow an attacker to compromise the system after gaining control of the virtiofsd process.```
DAX-less support for `virtio-fs` is available as of the 5.4 Linux kernel. QEMU VMM supports virtio-fs as of v4.2. Cloud Hypervisor
supports `virtio-fs`.
#### `virtio-net`
`virtio-net` has many options, depending on the VMM and Kata configurations.
##### QEMU networking
While QEMU has options for `vhost`, `virtio-net` and `vhost-user`, the `virtio-net` backend
for Kata defaults to `vhost-net` for performance reasons. The default configuration is being
reevaluated.
##### Firecracker networking
For Firecracker, the `virtio-net` backend is within Firecracker's VMM.
##### Cloud Hypervisor networking
For Cloud Hypervisor, the current backend default is within the VMM. `vhost-user-net` support
is being added (written in Rust, Cloud Hypervisor specific).
#### virtio-vsock
##### QEMU vsock
In QEMU, vsock is backed by `vhost_vsock`, which runs within the kernel itself.
##### Firecracker and Cloud Hypervisor
In Firecracker and Cloud Hypervisor, vsock is backed by a Unix domain socket in the host's userspace.
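The default backends described so far can be collected into a small lookup table, which makes it easy to see which devices end up with a host-kernel backend for a given VMM. This is a sketch that only restates the text above; the dictionary layout, the labels (`vmm`, `vhost-user`, `kernel`, `userspace-socket`) and the function name `kernel_backed_devices` are assumptions made for illustration.

```python
# Default backend location per device and VMM, as described in this section.
# A VMM missing from an entry means the device is not listed as supported for it.
DEFAULT_BACKENDS = {
    "virtio-blk/virtio-scsi": {
        "qemu": "vmm",
        "firecracker": "vmm",
        "cloud-hypervisor": "vmm",
    },
    "virtio-fs": {
        "qemu": "vhost-user",  # virtiofsd daemon
        "cloud-hypervisor": "vhost-user",
    },
    "virtio-net": {
        "qemu": "kernel",  # vhost-net
        "firecracker": "vmm",
        "cloud-hypervisor": "vmm",
    },
    "virtio-vsock": {
        "qemu": "kernel",  # vhost_vsock
        "firecracker": "userspace-socket",
        "cloud-hypervisor": "userspace-socket",
    },
}


def kernel_backed_devices(vmm: str) -> list:
    """Return the devices whose default backend runs in the host kernel."""
    return sorted(
        device
        for device, per_vmm in DEFAULT_BACKENDS.items()
        if per_vmm.get(vmm) == "kernel"
    )


if __name__ == "__main__":
    for vmm in ("qemu", "firecracker", "cloud-hypervisor"):
        print(vmm, kernel_backed_devices(vmm))
```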
#### VFIO
Utilizing VFIO, devices can be passed through to the virtual machine. We will assess this separately. Exposure to
the host is limited to gaps in device pass-through handling. This is supported in QEMU and Cloud Hypervisor, but not
Firecracker.
#### ACPI
ACPI is necessary for hotplug of CPU, memory and devices. ACPI is available in QEMU and Cloud Hypervisor. Device, CPU and memory hotplug
- [Install and configure Kata Containers](#install-and-configure-kata-containers)
- [Build Kata Containers kernel with GPU support](#build-kata-containers-kernel-with-gpu-support)
- [Nvidia GPU pass-through mode with Kata Containers](#nvidia-gpu-pass-through-mode-with-kata-containers)
- [Nvidia vGPU mode with Kata Containers](#nvidia-vgpu-mode-with-kata-containers)
- [Install Nvidia Driver in Kata Containers](#install-nvidia-driver-in-kata-containers)
- [References](#references)
An Nvidia GPU device can be passed to a Kata Containers container using GPU passthrough
(Nvidia GPU pass-through mode) as well as GPU mediated passthrough (Nvidia vGPU mode).
@@ -79,7 +67,7 @@ To use large BARs devices (for example, Nvidia Tesla P100), you need Kata versio
The following configuration in the Kata `configuration.toml` file as shown below can work:
Hotplug for PCI devices by `shpchp` (Linux's SHPC PCI Hotplug driver):
Hotplug for PCI devices by `acpi_pcihp` (Linux's ACPI PCI Hotplug driver):
```
machine_type = "q35"
@@ -103,7 +91,6 @@ The following kernel config options need to be enabled:
```
# Support PCI/PCIe device hotplug (Required for large BARs device)
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_HOTPLUG_PCI_SHPC=y
# Support for loading modules (Required for load Nvidia drivers)
CONFIG_MODULES=y
@@ -303,4 +290,4 @@ Tue Mar 3 00:03:49 2020
- [Configuring a VM for GPU Pass-Through by Using the QEMU Command Line](https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#using-gpu-pass-through-red-hat-el-qemu-cli)
- [Copy Intel® QAT configuration files and enable virtual functions](#copy-intel-qat-configuration-files-and-enable-virtual-functions)
- [Expose and Bind Intel® QAT virtual functions to VFIO-PCI (Every reboot)](#expose-and-bind-intel-qat-virtual-functions-to-vfio-pci-every-reboot)
- [Check Intel® QAT virtual functions are enabled](#check-intel-qat-virtual-functions-are-enabled)
- [Prepare Kata Containers](#prepare-kata-containers)
- [Download Kata kernel Source](#download-kata-kernel-source)
- [Build Kata kernel](#build-kata-kernel)
- [Copy Kata kernel](#copy-kata-kernel)
- [Prepare Kata root filesystem](#prepare-kata-root-filesystem)
- [Compile Intel® QAT drivers for Kata Containers kernel and add to Kata Containers rootfs](#compile-intel-qat-drivers-for-kata-containers-kernel-and-add-to-kata-containers-rootfs)
- [Copy Kata rootfs](#copy-kata-rootfs)
- [Verify Intel® QAT works in a container](#verify-intel-qat-works-in-a-container)
> Note: Kata Containers supports creating VM sandboxes with Intel® SGX enabled
> using [cloud-hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor/) VMM only. QEMU support is waiting to get the
> Intel SGX enabled QEMU upstream release.
The following commands were tested on Fedora 32; they might work on other distros too.
## Installation
### Kata Containers Guest Kernel
Follow the instructions to [setup](../../tools/packaging/kernel/README.md#setup-kernel-source-code) and [build](../../tools/packaging/kernel/README.md#build-the-kernel) the experimental guest kernel. Then, install as:
* The Kata VM's SGX Encrypted Page Cache (EPC) memory size is based on the sum of `sgx.intel.com/epc`
resource requests within the pod.
* `init-sgx` can be removed from the YAML configuration file if the Kata rootfs is modified with the
necessary udev rules.
See the [note on SGX backwards compatibility](https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/cmd/sgx_plugin#backwards-compatibility-note).
* Intel SGX DCAP attestation is known to work from Kata sandboxes but it comes with one limitation: If
the Intel SGX `aesm` daemon runs on the bare metal node and DCAP `out-of-proc` attestation is used,
containers within the Kata sandbox cannot access the host's `/var/run/aesmd/aesm.sock`
because socket passthrough is not supported. An alternative is to deploy the `aesm` daemon as a side-car
container.
* Projects like [Gramine Shielded Containers (GSC)](https://gramine-gsc.readthedocs.io/en/latest/) are
also known to work. For GSC specifically, the Kata guest kernel needs to have `CONFIG_NUMA=y`
enabled and at least one CPU online when running the GSC container.
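The first note above says that the VM's EPC size is the sum of the `sgx.intel.com/epc` resource requests within the pod. A minimal sketch of that arithmetic follows. It assumes the pod is available as a plain dictionary shaped like a Kubernetes pod manifest and that quantities are either plain byte counts or use the `Ki`, `Mi` or `Gi` suffixes; the helper names are hypothetical and this is not how the Kata runtime itself computes the value.

```python
EPC_RESOURCE = "sgx.intel.com/epc"

# Only binary suffixes are handled in this sketch (assumption).
_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '64Ki' or '4096' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def pod_epc_bytes(pod: dict) -> int:
    """Sum the EPC resource requests of all containers in the pod."""
    total = 0
    for container in pod.get("spec", {}).get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        if EPC_RESOURCE in requests:
            total += quantity_to_bytes(str(requests[EPC_RESOURCE]))
    return total


if __name__ == "__main__":
    pod = {
        "spec": {
            "containers": [
                {"name": "a", "resources": {"requests": {EPC_RESOURCE: "64Ki"}}},
                {"name": "b", "resources": {"requests": {EPC_RESOURCE: "1Mi"}}},
                {"name": "c"},
            ]
        }
    }
    print(pod_epc_bytes(pod))
```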
Some files were not shown because too many files have changed in this diff