kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-08-28 04:21:03 +00:00

Author	SHA1	Message	Date
Greg Kurz	219bb8e7d0	govmm: Optionally start QMP with a pre-configured connection When QEMU is launched daemonized, we have the guarantee that the QMP socket is available. In order to launch a non-daemonized QEMU, the QMP connection should be created before QEMU is started in order to avoid a race. Introduce a variant of QMPStart() that can use such an existing connection. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-01-24 19:16:47 +01:00
GabyCT	421a33f846	Merge pull request #6096 from dcantah/kataruntime-use_hyp_consts runtime: Use consts in `kata-runtime check`	2023-01-18 10:54:42 -06:00
Peng Tao	7d1a604bad	Merge pull request #6060 from ls-ggg/6055/service.mu-deadlock runtime:all APIs are hang in the service.mu	2023-01-18 10:50:00 +08:00
Danny Canter	ba87e0afea	runtime: Use consts in `kata-runtime check` Fixes: #6095 We're already importing the virtcontainers package so might as well use the constants for the hypervisor types we're checking against instead of typing the names out in the switch cases. Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-17 06:55:36 -08:00
Bin Liu	790f45190b	Merge pull request #6074 from zhaojizhuang/enablevhostuserstore runtime: paas enablevhostuserstore annotation to hypervisor config	2023-01-17 11:43:43 +08:00
Tim Zhang	20196048bf	Merge pull request #6030 from liubin/fix/6029-use-system-hugepagesize runtime: use system pagesize for hugepage test	2023-01-16 16:57:55 +08:00
ls	69fc8de712	runtime:all APIs are hang in the service.mu When the vmm process exits abnormally, a goroutine sets s.monitor to nil in the 'watchSandbox' function without holding service.mu. This causes another goroutine, which holds service.mu, to block when sending a message to s.monitor, leading to a deadlock. For example, the wait function in the file .../pkg/containerd-shim-v2/wait.go sends a message to s.monitor after acquiring service.mu, but s.monitor may be nil at this time. Fixes: #6059 Signed-off-by: ls <335814617@qq.com>	2023-01-16 14:45:37 +08:00
Eric Ernst	807eeaafd0	Merge pull request #6047 from egernst/build-kata-monitor-on-darwin runtime: Use git rev-parse for the kata-monitor tag	2023-01-13 15:29:00 -08:00
Eric Ernst	3d573ba579	Merge pull request #6050 from egernst/goos-the-vc virtcontainers: split out linux-specific bits for mount, factory	2023-01-13 15:28:42 -08:00
Eric Ernst	458fe865ea	Merge pull request #6052 from egernst/add-darwin-skeletons Add darwin skeletons	2023-01-13 13:14:16 -08:00
Eric Ernst	923cd3fda1	virtcontainers: split out Linux parts from mount Mount handling is often unique in Linux. Let's ensure that the common parts remain in mount.go, while Linux specific parts are within a linux file. Fixes: #6049 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-13 11:14:56 -08:00
Eric Ernst	54f2b296e3	Merge pull request #6048 from egernst/revendor-netlink vendor: revendor netlink to get latest	2023-01-13 11:08:47 -08:00
Eric Ernst	f82918f872	Merge pull request #6045 from egernst/fix-6044 Address issues with the initial vCPU pinning functionality	2023-01-13 11:06:42 -08:00
GabyCT	9c6e90fd55	Merge pull request #6043 from GabyCT/topic/fixerrormsg virtcontainers: Fix misspelling in error message	2023-01-13 09:16:34 -06:00
zhaojizhuang	cf1bae3521	runtime: paas enablevhostuserstore annotation to hypervisor config Fixes: #6073 Signed-off-by: zhaojizhuang <571130360@qq.com>	2023-01-13 17:07:38 +08:00
Eric Ernst	60ff230d80	virtcontainers: Split the factory package into Linux and Darwin bits - split template - split factory - add stubs for darwin Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 16:51:28 -08:00
Samuel Ortiz	76437a9721	runtime: Use git rev-parse for the kata-monitor tag The .git-commit can be a multiple line file, potentially confusing the Darwin linker for example. Fixes: #6046 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 16:01:58 -08:00
Samuel Ortiz	a9626682af	virtcontainers: resourcecontrol: Add skeleton for Darwin Cgroups do not exist on Darwin, so use an empty implementation for resourcecontrol for the time being. In the process, ensure that the utilized cgroup handling (ie, isSystemdCgroup) is kept in general file, since we use this to help assess/constrain the container spec we pass to the guest. Fixes: #6051 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:53:28 -08:00
Samuel Ortiz	ea06fe3afc	virtcontainers: Add a Network API skeleton for Darwin Empty for now. Fixes: #6051 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:53:28 -08:00
Eric Ernst	6ee550e9a5	runtime: vCPUs pinning is sandbox specific, not hypervisor While at it, make sure we persist this and fix a misc typo. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-12 15:44:25 -08:00
Peng Tao	2b4b825228	Merge pull request #6032 from liubin/fix/6031-add-test-file-to-gitignore runtime: add test generated file to .gitignore	2023-01-12 15:38:46 +08:00
Peng Tao	4a4232b851	Merge pull request #6037 from bergwolf/github/no-netns runtime: fix up disable_netns handling	2023-01-12 09:58:24 +08:00
Eric Ernst	e3d3b72fa2	virtcontainers: use resource control for setting CPU affinity Let's abstract the CPU affinity, instead of calling linux only code from sandbox. Fixes: #6044 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:55:53 -08:00
Eric Ernst	f137048be3	resource-control: add helper function for setting CPU affinity Let's abstract the CPU affinity Fixes: #6044 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:55:53 -08:00
Eric Ernst	73216a8104	vendor: revendor netlink to get latest This'll address issue where netlink couldn't build on Darwin hosts. Fixes: #6026 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2023-01-11 17:23:15 -08:00
Gabriela Cervantes	fc17d7cc41	virtcontainers: Fix misspelling in error message This PR fixes a misspelling in the error message when it tries to run a system without Confidential computing support. Fixes #6042 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-01-11 21:58:07 +00:00
Peng Tao	12fd6ffc1f	runtime: fix up disable_netns handling With `disable_netns=true`, we should never scan the sandbox netns which is the host netns in such case. Fixes: #6021 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-11 12:25:24 +00:00
Bin Liu	7eb43cec15	runtime: add test generated file to .gitignore Add test generated file to .gitignore to avoid making the working directory dirty. Fixes: #6031 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-11 17:16:06 +08:00
Bin Liu	8551853cfe	runtime: use system pagesize for hugepage test TestHandleHugepages performs mount operations with different page sizes, but some systems only support the 2M page size, so the test for a 1G page size fails. This commit fixes that by only mounting the page sizes listed under `/sys/kernel/mm/hugepages`, which the OS supports. Fixes: #6029 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-11 17:02:58 +08:00
Eric Ernst	07e77f5be7	Merge pull request #5994 from dcantah/virtcontainers_tests_darwin virtcontainers: tests: Ensure Linux specific tests are just run on Linux	2023-01-10 17:13:28 -08:00
Fabiano Fidêncio	147c56bb8d	Merge pull request #6019 from liubin/fix/6018-virtiofsd-cache-mod Change cache mode from none to never	2023-01-10 23:12:13 +01:00
Bin Liu	8225d8044e	Merge pull request #6003 from dcantah/fs-skeleton virtcontainers: fs_share: Add Darwin skeleton	2023-01-10 17:48:45 +08:00
Bin Liu	86a82cace9	runtime: change cache mode from none to never New Rust virtiofsd's `cache` mode doesn't support `none` mode, we should use `never` to replace it. Fixes: #6018 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-01-10 17:29:48 +08:00
Eric Ernst	4d53303a7d	Merge pull request #6005 from dcantah/vfw-skeleton virtcontainers: Add a Virtualization.framework skeleton	2023-01-09 15:50:04 -08:00
Bin Liu	1bae41a4d4	Merge pull request #5996 from dcantah/vfw-initial virtcontainers: Introduce hypervisor_darwin	2023-01-09 11:37:02 +08:00
Samuel Ortiz	fa9ae9362c	virtcontainers: Add a Virtualization.framework skeleton Fixes: #6004 A Virtualization.framework based Hypervisor implementation. This is just stubs for now to eventually get this building. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-08 07:40:21 -08:00
Eric Ernst	d48b22bb13	virtcontainers: fs_share: add Darwin skeleton Fixes: #6002 As a first pass for testing, let's add a skeleton for filesystem sharing support on Darwin. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-07 19:56:47 -08:00
Bin Liu	2c10b37172	Merge pull request #5991 from dcantah/darwin-sigs runtime: Define Darwin handled signals list	2023-01-07 11:19:48 +08:00
Bin Liu	bc8a6423e0	Merge pull request #5986 from dcantah/nydus-nonetns nydus: net-ns handling needs to be only executed on Linux hosts	2023-01-07 11:19:07 +08:00
Eric Ernst	fafc7a8b1a	virtcontainers: tests: Ensure Linux specific tests are just run on Linux Fixes: #5993 Several tests utilize linux'isms like Mounts, bindmounts, vsock etc. Let's ensure that these are still tested on Linux, but that we also skip these tests when on other operating systems (Darwin). This commit just moves tests; there shouldn't be any functional test changes. While the tests still won't be runnable on Darwin/other hosts yet, this is a necessary step forward. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-06 11:09:11 -08:00
Fabiano Fidêncio	efa4fc0b25	clh: Add hotplug support for network devices This is needed in order to have Moby / Docker working properly with Cloud Hypervisor, as Moby / Docker relies on hotplugging a network device to the VM as a preStartHook. Fixes: #5997 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-06 18:59:47 +01:00
Fabiano Fidêncio	1074d2c1d3	clh: Make vmAddNetPutRequest capable of doing hotplugs The only bit needed for having the vmAddNetPutRequest() capable of dealing with hotplugs, instead of only coldplugs, is making sure it doesn't error out in case a `200` response is returned. The 200 response means: """ The new device was successfully added to the VM instance. """ Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-01-06 18:55:55 +01:00
Fabiano Fidêncio	175794458f	Merge pull request #5972 from bergwolf/github/hook fix moby prestart hook handling	2023-01-06 14:54:39 +01:00
Eric Ernst	9ec8a13985	virtcontainers: introduce hypervisor_darwin Fixes: #5995 Placeholder skeleton at this point - implementation will be added after basic build refactoring lands. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-06 02:03:34 -08:00
Peng Tao	8bb68a9f28	vc/network: skip existing endpoints when scanning for new ones So that addAllEndpoints() becomes re-entrant and we can use it to scan netns changes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-06 10:01:19 +00:00
Samuel Ortiz	3b4420eb8e	runtime: Define Darwin handled signals list Fixes: #5990 Some signals may not be defined on non Linux host OSes, like SIGSTKFLT for example. It's also not defined on certain architectures, but irrelevant for this. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 17:50:47 -08:00
Danny Canter	24b05a99b6	schedcore: Make buildable on !linux Fixes: #5983 sched-core only makes sense on Linux hosts. Let's add stub/error for other platforms. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 11:51:04 -08:00
Danny Canter	3886aad199	nydus: net-ns handling needs to be only executed on Linux hosts Fixes: #5985 With nydus not being its own pkg, it is challenging to implement cleanly in a virtcontainers package that isn't necessarily Linux-only. The existing code utilizes network namespace code in order to ensure nydus is launched in the host netns. This is very Linux specific - so let's make sure we only carry this out in a linux specific file. In the Darwin case, to allow for compilation at least, let's add a stub for doNetNS. Ideally the nydus and vc code can be refactored / decoupled. Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-05 11:48:43 -08:00
Bin Liu	4ab9364aa6	Merge pull request #5946 from dcantah/clarify-var Runtime: Clarify mutability of global var	2023-01-05 13:08:45 +08:00
Bin Liu	649d2d4b8d	Merge pull request #5964 from openanolis/kata-runtime kata-runtime: add rust runtime path for kata-runtime exec	2023-01-05 09:35:21 +08:00
Peng Tao	d085389127	vc: fix up UT for CreateSandbox API change Need to adapt the UT as well. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 22:30:42 +08:00
Peng Tao	578a9c25f0	vc: rescan network endpoints after running prestart hooks Moby relies on the prestart hooks to configure network endpoints. We should rescan the netns after running them so that the newly added endpoints can be found and plugged to the guest. Fixes: #5941 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 22:30:41 +08:00
Peng Tao	cb84b0fb02	katautils: run prestart hooks after starting VM So that we can pass the hypervisor pid to the hook instead of the runtime process's. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-01-03 10:52:32 +00:00
Danny Canter	56e7b5d0fd	runtime/Makefile: Get some bits happy on darwin Substitution in the yq install script doesn't like zsh, and additionally the version of yq we're using doesn't have a darwin/arm64 build so grab the amd64 version and let rosetta work its magic. Additionally swap to abspath from readlink -m for the printing of what binaries to install, as the -m flag doesn't exist on the BSD variant, and this should be the same behavior. Fixes: #5970 Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-02 04:19:58 -08:00
Danny Canter	86ee24b33c	Runtime: Clarify mutability of global var Was about to change `urandomdev` to a constant when I realized it's intentionally mutable so it can be mocked in tests. There's other comments to the same effect so clarify here as well. Fixes: #5965 Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-01-02 01:13:34 -08:00
Zhongtao Hu	dae6670628	kata-runtime: add rust runtime path for kata-runtime exec add rust runtime path for kata-runtime exec Fixes:#5963 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-12-30 13:34:34 +08:00
Binbin Zhang	99485d871c	shim: return hypervisor's pid not shim's pid update outdated code comments Fixes: #3234 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-12-14 11:16:11 +08:00
Fabiano Fidêncio	f1381eb361	Merge pull request #4813 from ManaSugi/fix/add-selinux-agent runtime,agent: Add SELinux support for containers inside the guest	2022-12-13 11:24:53 +01:00
Alexandru Matei	d04d45ea05	runtime: use pidfd to wait for processes on Linux Use pidfd_open and poll on newer versions of Linux to wait for the process to exit. For older versions use existing wait logic Fixes: #5617 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-06 16:31:05 +02:00
Alexandru Matei	e9ba0c11d0	runtime: use exponential backoff for process wait Initial wait period between checks is 1ms, and the next ones are min(wait_period*5, 50ms) Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-06 16:30:58 +02:00
Alexandru Matei	71491a69c3	runtime: move process wait logic to another function extract process wait logic to another function Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-05 13:32:04 +02:00
Alexandru Matei	92ebe61fea	runtime: reap force killed processes reap child processes after sending SIGKILL Fixes #5739 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-12-05 13:31:58 +02:00
Bin Liu	d4321ab489	runtime: Add identification in version for runtime-rs Now we are supporting two runtime/shim, the go version, and the rust version, for debug purposes, we can add an identification in the version info to tell us which runtime/shim is used. Fixes: #5806 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-12-01 15:14:08 +08:00
Manabu Sugimoto	c617bbe70d	runtime: Pass SELinux policy for containers to the agent Pass SELinux policy for containers to the agent if `disable_guest_selinux` is set to `false` in the runtime configuration. The `container_t` type is applied to the container process inside the guest by default. Users can also set a custom SELinux policy to the container process using `guest_selinux_label` in the runtime configuration. This will be an alternative configuration of Kubernetes' security context for SELinux because users cannot specify the policy in Kata through Kubernetes's security context. To apply SELinux policy to the container, the guest rootfs must be CentOS that is created and built with `SELINUX=yes`. Fixes: #4812 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-11-29 19:07:56 +09:00
GabyCT	013752667b	Merge pull request #5776 from liubin/tmp/debug-static-check ci: let static checks don't depend on build	2022-11-28 07:51:42 -06:00
Bin Liu	6af037d379	Merge pull request #5154 from Yuan-Zhuo/main agent: support systemd cgroup for kata agent.	2022-11-28 18:40:10 +08:00
Bin Liu	e723bad0af	ci: let static checks don't depend on build Building is a time-consuming operation; skip the build to let CI run faster. Fixes: #5777 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-28 15:26:04 +08:00
Bin Liu	a55eb78c32	Merge pull request #5752 from liubin/fix/5750-go-fix-1.19 runtime: go fix code for 1.19	2022-11-26 02:09:02 +08:00
Peng Tao	e32c023d96	Merge pull request #5714 from UiPath/fix-mkdir runtime: don't fail mkdir if the folder is already created by another process	2022-11-25 17:52:56 +08:00
Bin Liu	1dfd845f51	runtime: go fix code for 1.19 We have started to use golang 1.19, some features are not supported later, so run `go fix` to fix them. Fixes: #5750 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-11-25 11:29:18 +08:00
Alexandru Matei	4b45e13869	runtime: don't fail mkdir if the folder is already created Use MkdirAll instead of Mkdir so it doesn't generate an error when the folder is created by another process Fixes #5713 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-24 11:20:56 +02:00
Bin Liu	06a604b753	Merge pull request #5720 from YchauWang/wyc-docs-test-22 runtime: add log record to the qemu config method `appendDevices` for…	2022-11-24 13:15:06 +08:00
Peng Tao	b4d0a39f6d	Merge pull request #5723 from fidencio/topic/runtime-bump-containerd-to-v1.6.8 runtime: Use containerd v1.6.8	2022-11-24 11:28:58 +08:00
wangyongchao.bj	30a7ebf430	runtime: Log invalid devices in QEMU config When the user tried to add new devices to the VM, there is no error info for the invalid device. This PR adds a log record to the `appendDevices` for the invalid device of the qemu config. Fixes: #5719 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2022-11-23 09:09:45 +08:00
Fabiano Fidêncio	df3d9878d5	Merge pull request #5695 from darfux/virtiofs-queue-size runtime: Support virtiofs queue size for qemu and make it configurable	2022-11-22 20:04:30 +01:00
Fabiano Fidêncio	2539f31862	runtime: Use containerd v1.6.8 Let's follow the binary bump used in the CI and also bump the vendored version of containerd to v1.6.8. Fixes: #5722 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-22 18:28:30 +01:00
Peng Tao	a636d426d9	versions: update nydusd version To the latest stable v2.1.1. Depends-on: github.com/kata-containers/tests#5246 Fixes: #5635 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-11-19 16:33:29 +00:00
liyuxuan.darfux	3bb145c63a	runtime: Support virtiofs queue size for qemu and make it configurable The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024 by default which is same as clh. Also make this value configurable. Fixes: #5694 Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>	2022-11-19 15:38:11 +08:00
Bo Chen	36545aa81a	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v28.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #5683 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-11-17 09:45:27 -08:00
Fabiano Fidêncio	d94718fb30	runtime: Fix gofmt issues It seems that, after bumping the versions of golang and golangci-lint, new format changes are required. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-17 14:16:12 +01:00
Fabiano Fidêncio	16b8375095	golang: Stop using io/ioutils The package has been deprecated as part of 1.16 and the same functionality is now provided by either the io or the os package. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-11-17 13:43:25 +01:00
Peng Tao	eab8d6be13	build: update golang version to 1.19.2 So that we get the latest language fixes. There is little use to maintain compiler backward compatibility. Let's just set the default golang version to the latest 1.19.2. Fixes: #5494 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-11-16 19:02:39 +01:00
Alexandru Matei	a04afab74d	qemu: early exit from Check if the process was stopped Fixes: #5625 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	7e481f2179	qemu: set stopped only if StopVM is successful Fixes: #5624 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	0e3ac66e76	clh: return faster with dead clh process from isClhRunning Through proactively checking if Cloud Hypervisor process is dead, this patch provides a faster path for isClhRunning Fixes: #5623 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	9ef68e0c7a	clh: fast exit from isClhRunning if the process was stopped Use atomic operations instead of acquiring a mutex in isClhRunning. This stops isClhRunning from generating a deadlock by trying to reacquire an already-acquired lock when called via StopVM->terminate. Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Alexandru Matei	2631b08ff1	clh: don't try to stop clh multiple times Avoid executing StopVM concurrently when virtiofs dies as a result of clh being stopped in StopVM. Fixes: #5622 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-11-10 22:43:32 +02:00
Fabiano Fidêncio	7250be3601	Merge pull request #5584 from fengyehong/clh-thread cloud-hypervisor: Fix GetThreadIDs function	2022-11-07 08:22:40 +01:00
Guanglu Guo	daeee26a1e	cloud-hypervisor: Fix GetThreadIDs function Get vcpu thread-ids by reading cloud-hypervisor process tasks information. Fixes: #5568 Signed-off-by: Guanglu Guo <guoguanglu@qiyi.com>	2022-11-05 17:23:19 +08:00
LitFlwr0	2508d39b7c	runtime: added vcpus pinning logics Core VCPU threads pinning logics for issue 4476. Also provided docs. Fixes:#4476 Signed-off-by: LitFlwr0 <861690705@qq.com>	2022-11-04 17:52:42 +08:00
snir911	288e337a6f	Merge pull request #5434 from Rouzip/remove-doNetNS add EnterNetNS in virtcontainers	2022-10-30 11:19:07 +02:00
Yuan-Zhuo	d7bb4b5512	agent: support systemd cgroup for kata agent 1. Implemented a rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd). 2. Add support for optional cgroup configuration through fs and systemd at agent (src/agent/rustjail/src/container.rs). 3. Described the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md). Fixes: #4336 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>	2022-10-25 13:57:09 +08:00
Bo Chen	a151d8ee50	Merge pull request #5493 from fidencio/topic/update-clh versions: Update Cloud Hypervisor to b4e39427080	2022-10-24 07:54:02 -07:00
Fabiano Fidêncio	190e623c40	Merge pull request #5317 from Champ-Goblem/fix-containerd-stats shim: Ensure pagesize is set when reporting hugetlb stats	2022-10-24 10:24:49 +02:00
Fabiano Fidêncio	9d286af7b4	versions: Update Cloud Hypervisor to b4e39427080 An API change, done a long time ago, has been exposed on Cloud Hypervisor and we should update it on the Kata Containers side to ensure it doesn't affect Cloud Hypervisor CI and because the change is needed for an upcoming work to get QAT working with Cloud Hypervisor. Fixes: #5492 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-21 20:52:54 +02:00
Rouzip	39363ffbfb	runtime: remove same function Add EnterNetNS in virtcontainers to remove same function. FIXes #5394 Signed-off-by: Rouzip <1226015390@qq.com>	2022-10-17 10:59:13 +08:00
Fupan Li	2c88e1cd80	Merge pull request #5302 from liubin/fix/5285-SetFsSharingSupport-comment runtime: fix incorrect comment for SetFsSharingSupport function	2022-10-09 09:40:31 +08:00
Bin Liu	b556c9b986	Merge pull request #5235 from YchauWang/wyc-qmp-log virtcontainers: add warn log record for qmp hotplug cpu error	2022-10-09 08:29:09 +08:00
Vijay Dhanraj	435c8f181a	acrn: Enable ACRN hypervisor support for Kata 2.x release Currently ACRN hypervisor support in Kata2.x releases is broken. This commit re-enables ACRN hypervisor support and also refactors the code so as to remove dependency on Sandbox. Fixes #3027 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>	2022-10-07 07:40:32 -07:00
Archana Shinde	6e2d39c588	Merge pull request #5311 from likebreath/0930/clh_v27.0 Upgrade to Cloud Hypervisor v27.0	2022-10-04 10:56:00 -07:00
Champ-Goblem	89e62d4edf	shim: Ensure pagesize is set when reporting hugetlb stats The containerd stats method and metrics API are broken with Kata 2.5.x, the stats fail to load and the metrics API responds with status code 500. This seems to be down to the conversion from the stats reported by the agent RPC `StatsContainer` where the field `Pagesize` is not completed by the `setHugetlbStats` method. In the case where multiple sized tables stats are reported, this causes containerd to register two metrics with the same label set, rather than each being partitioned by the `page` label. Fixes: #5316 Signed-off-by: Champ-Goblem <cameron@northflank.com>	2022-10-04 09:16:30 +01:00
Bo Chen	067e2b1e33	runtime: clh: Use the new API to boot with TDX firmware (td-shim) The new way to boot from TDX firmware (e.g. td-shim) is using the combination of '--platform tdx=on' with '--firmware tdshim'. Fixes: #5309 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-03 10:30:54 -07:00
Bo Chen	5d63fcf344	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v27.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Fixes: #5309 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-03 10:30:42 -07:00
norbjd	17de94e118	microvm: Remove kernel_irqchip=on option `kernel_irqchip` option doesn't seem to bring any benefits and, on the contrary, its usage causes issues when using the microvm machine type. With this in mind, let's remove it. Fixes: #1984, #4386 Signed-off-by: norbjd <norbjd@users.noreply.github.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-10-03 11:48:05 +02:00
Bin Liu	68e8a86aec	runtime: fix incorrect comment for SetFsSharingSupport function The comment for SetFsSharingSupport is not suitable, correct the function name. Fixes: #5285 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-09-30 15:44:44 +08:00
Peng Tao	8a2df6b31c	Merge pull request #4931 from jpecholt/snp-support Added SNP-Support for Kata-Containers	2022-09-27 14:17:54 +08:00
Bin Liu	407e46b1b7	Merge pull request #5218 from bergwolf/github/deps runtime/runtime-rs: update dependency	2022-09-27 11:02:46 +08:00
wangyongchao.bj	04bbce8dc3	virtcontainers: add warn log record for qmp hotplug cpu error The error from the failed QMP CPU hotplug command was hidden, which made it hard for users to trace CPU hotplug errors. This PR improves the CPU hotplug error log by adding the real QEMU command error to the `failed to hot add vCPUs` log. From the error message, we can get the reason the QMP CPU hotplug command failed. Fixes: #5234 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2022-09-23 08:22:30 +08:00
Peng Tao	9628c7df0c	runtime: update runc dependency To bring fix to CVE-2022-29162. Fixes: #5217 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-09-21 17:21:37 +08:00
Joana Pecholt	ded60173d4	runtime: Enable choice between AMD SEV and SNP This is based on a patch from @niteeshkd that adds a config parameter to choose between AMD SEV and SEV-SNP VMs as the confidential guest type in case both types are supported. SEV is the default. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	22bda0838c	runtime: Support for AMD SEV-SNP VMs This commit adds AMD SEV-SNP as a confidential guest option to the runtime. Information on required components such as OVMF, QEMU and a kernel supporting SEV-SNP are defined in the versions file and corresponding configs are added. Note: The CPU model 'host' provided by the current SNP-QEMU does not support all SNP capabilities yet, which is why this option is changed to EPYC-v4. Note: The guest's physical address space reduction specified with ReducedPhysBits is 1. Details can be found in Section 15.34.6 here https://www.amd.com/system/files/TechDocs/24593.pdf Fixes #4437 Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Joana Pecholt	105eda5b9a	runtime: Initrd path option added to config Adds initrd configuration option to the configuration.toml that is generated for the setup using QEMU. Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>	2022-09-16 17:51:41 +02:00
Feng Wang	f914319874	runtime: store the user name in hypervisor config The user name will be used to delete the user instead of relying on uid lookup because uid can be reused. Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-13 10:32:55 -07:00
Feng Wang	5cafe21770	runtime: make StopVM thread-safe StopVM can be invoked by multiple threads and needs to be thread-safe Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-12 21:56:15 -07:00
Feng Wang	c3015927a3	runtime: add more debug logs for non-root user operation Previously the logging was insufficient and made debugging difficult Fixes: #5155 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-09-12 21:38:57 -07:00
Eric Ernst	9997ab064a	sandbox_test: Add test to verify memory hotplug behavior Augment the mock hypervisor so that we can validate that ACPI memory hotplug is carried out as expected. We'll augment the number of memory slots in the hypervisor config each time the memory of the hypervisor is changed. In this way we can ensure that large memory hotplugs are broken up into appropriately sized pieces in the unit test. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-31 10:32:30 -07:00
Eric Ernst	f390c122f0	sandbox: don't hotplug too much memory at once If we're using ACPI hotplug for memory, there's a limitation on the amount of memory which can be hotplugged at a single time. During hotplug, we'll allocate memory for the memmap for each page, resulting in a 64 byte per 4KiB page allocation. As an example, hotplugging 12GiB of memory requires ~192 MiB of free memory, which is about the limit we should expect for an idle 256 MiB guest (conservative heuristic of 75% of provided memory). From experimentation, at pod creation time we can reliably add 48 times what is provided to the guest. (a factor of 48 results in using 75% of provided memory for hotplug). Using prior example of a guest with 256Mi RAM, 256 Mi * 48 = 12 Gi; 12GiB is upper end of what we should expect can be hotplugged successfully into the guest. Note: It isn't expected that we'll need to hotplug large amounts of RAM after workloads have already started -- container additions are expected to occur first in pod lifecycle. Based on this, we expect that provided memory should be freely available for hotplug. If virtio-mem is being utilized, there isn't such a limitation - we can hotplug the max allowed memory at a single time. Fixes: #4847 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-31 10:32:30 -07:00
Eric Ernst	e0142db24f	hypervisor: Add GetTotalMemoryMB to interface It'll be useful to get the total memory provided to the guest (hotplugged + coldplugged). We'll use this information when calculating how much memory we can add at a time when utilizing ACPI hotplug. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-08-30 16:37:47 -07:00
Archana Shinde	7d52934ec1	Merge pull request #4798 from amshinde/use-iouring-qemu Use iouring for qemu block devices	2022-08-26 04:00:24 +05:30
Fabiano Fidêncio	ddc94e00b0	Merge pull request #4982 from fidencio/topic/improve-cloud-hypervisor-plus-tdx-support TDX: Get TDX working again with Cloud Hypervisor + a minor change on QEMU's code	2022-08-25 08:53:10 +02:00
Fabiano Fidêncio	dc90eae17b	qemu: Drop unnecessary `tdx_guest` kernel parameter With the current TDX kernel used with Kata Containers, `tdx_guest` is not needed, as TDX_GUEST is now a kernel configuration. With this in mind, let's just drop the kernel parameter. Fixes: #4981 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:43 +02:00
Fabiano Fidêncio	d4b67613f0	clh: Use HVC console with TDX As right now the TDX guest kernel doesn't support "serial" console, let's switch to using HVC in this case. Fixes: #4980 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:40 +02:00
Fabiano Fidêncio	c0cb3cd4d8	clh: Avoid crashing when memory hotplug is not allowed The runtime will crash when trying to resize memory when memory hotplug is not allowed. This happens because we cannot simply set the hotplug amount to zero, leading us to not set memory hotplug at all, and then later trying to access the value of a nil pointer. Fixes: #4979 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:22 +02:00
Fabiano Fidêncio	9f0a57c0eb	clh: Increase API and SandboxStop timeouts for TDX While doing tests using `ctr`, I've noticed that I've been hitting those timeouts more frequently than expected. Till we find the root cause of the issue (which is not in the Kata Containers), let's increase the timeouts when dealing with a Confidential Guest. Fixes: #4978 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 20:02:12 +02:00
Fabiano Fidêncio	c142fa2541	clh: Lift the sharedFS restriction used with TDX When booting the TDX kernel with `tdx_disable_filter`, as it's been done for QEMU, VirtioFS can work without any issues. Whether this will be part of the upstream kernel or not is a different story, but it easily could make it there as Cloud Hypervisor relies on the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the DMA API, making these devices compatible with TDX. See Sebastien Boeuf's explanation of this in the 3c973fa7ce208e7113f69424b7574b83f584885d commit: """ By using DMA API, the guest triggers the TDX codepath to share some of the guest memory, in particular the virtqueues and associated buffers so that the VMM and vhost-user backends/processes can access this memory. """ Fixes: #4977 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-08-24 17:14:05 +02:00
Peng Tao	a06d819b24	runtime: cri-o annotations have been moved to podman Let's switch to depending on podman, which also simplifies the indirect dependency on kubernetes components. And it helps to avoid cri-o security issues like CVE-2022-1708 as well. Fixes: #4972 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-24 18:11:37 +08:00
Bin Liu	6551d4f25a	Merge pull request #4051 from bergwolf/github/vmx-vm-factory enable vmx for vm factory	2022-08-24 16:22:37 +08:00
Fabiano Fidêncio	9806ce8615	Merge pull request #4937 from chenhengqi/fix-error-msg network: Fix error message for setting hardware address on TAP interface	2022-08-19 17:54:58 +02:00
Fabiano Fidêncio	828383bc39	Merge pull request #4933 from likebreath/0816/prepare_clh_v26.0 Upgrade to Cloud Hypervisor v26.0	2022-08-18 18:36:53 +02:00
Peng Tao	f508c2909a	runtime: constify splitIrqChipMachineOptions A simple cleanup. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:09:20 +08:00
Peng Tao	2b0587db95	runtime: VMX is migratable in vm factory case We are not spinning up any L2 guests in vm factory, so the L1 guest migration is expected to work even with VMX. See https://www.linux-kvm.org/page/Nested_Guests Fixes: #4050 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:08:43 +08:00
Peng Tao	fa09f0ec84	runtime: remove qemuPaths It is broken that it doesn't list QemuVirt machine type. In fact we don't need it at all. Just drop it. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-08-18 10:06:10 +08:00
Bo Chen	3a597c2742	runtime: clh: Use the new 'payload' interface The new 'payload' interface now contains the 'kernel' and 'initramfs' config. Fixes: #4952 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-17 12:23:43 -07:00
Bo Chen	16baecc5b1	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v26.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Fixes: #4952 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-17 12:23:12 -07:00
Hengqi Chen	8ff5c10ac4	network: Fix error message for setting hardware address on TAP interface Error out with the correct interface name and hardware address instead. Fixes: #4944 Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-08-17 16:42:07 +08:00
Chelsea Mafrica	fcc1e0c617	runtime: tracing: End root span at end of trace The root span should exist the duration of the trace. Defer ending span until the end of the trace instead of end of function. Add the span to the service struct to do so. Fixes #4902 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-08-12 13:15:39 -07:00
Bin Liu	cb7f9524be	Merge pull request #4804 from openanolis/anolis/merge_runtime_rs_to_main runtime-rs:merge runtime rs to main	2022-08-11 08:40:41 +08:00
Tim Zhang	4813a3cef9	Merge pull request #4711 from liubin/fix/4710-wait-nydusd-api-server-ready nydus: wait nydusd API server ready before mounting share fs	2022-08-10 17:20:17 +08:00
liubin	2ae807fd29	nydus: wait nydusd API server ready before mounting share fs If the API server is not ready, the mount call will fail, so before mounting the share fs, we should wait until nydusd is started and the API server is ready. Fixes: #4710 Signed-off-by: liubin <liubin0329@gmail.com> Signed-off-by: Bin Liu <bin@hyper.sh>	2022-08-08 16:18:38 +08:00
Tim Zhang	8d4d98587f	Merge pull request #4746 from liubin/fix/4745-add-log-field runtime: explicitly mark the source of the log is from qemu.log	2022-08-08 15:21:01 +08:00
Archana Shinde	c1e3b8f40f	govmm: Refactor qmp functions for adding block device Instead of passing a bunch of arguments to qmp functions for adding block devices, use govmm BlockDevice structure to reduce these. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	598884f374	govmm: Refactor code to get rid of redundant code Get rid of redundant return values from function. args and blockdevArgs used to return different values to maintain compatibility between qemu versions. These are exactly the same now. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	00860a7e43	qmp: Pass aio backend while adding block device Allow govmm to pass aio backend while adding block device. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	e1b49d7586	config: Add block aio as a supported annotation Allow Block AIO to be passed as a per pod annotation. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
Archana Shinde	ed0f1d0b32	config: Add "block_device_aio" as a config option for qemu This configuration will allow users to choose between different I/O backends for qemu, with the default being io_uring. This will allow users to fall back to a different I/O mechanism while running on kernels older than 5.1. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-05 13:16:34 -07:00
chmod100	d8ad16a34e	runtime: add unlock before return in sendReq The lock must be released before returning, so add the missing unlock. Fixes: #4827 Signed-off-by: chmod100 <letfu@outlook.com>	2022-08-05 13:30:12 +00:00
Archana Shinde	b6cd2348f5	govmm: Add io_uring as AIO type io_uring was introduced as a new kernel IO interface in kernel 5.1. It is designed for higher performance than the older Linux AIO API. This feature was added in qemu 5.0. Fixes #4645 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-03 10:43:12 -07:00
Archana Shinde	81cdaf0771	govmm: Correct documentation for Linux aio. The comments for "native" aio are incorrect. Correct these. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-08-03 10:41:50 -07:00
Zhongtao Hu	adfad44efe	Merge remote-tracking branch 'origin/main' into runtime-rs-merge-tmp To keep runtime-rs up to date, we will merge main into runtime-rs every week. Fixes:#4776 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2022-08-01 11:12:48 +08:00
yaoyinnan	5c3155f7e2	runtime: Support for host cgroup v2 Support cgroup v2 on the host. Update vendor containerd/cgroups to add cgroup v2. Fixes: #3073 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-07-28 10:30:45 +08:00
Bin Liu	85f4e7caf6	runtime: explicitly mark the source of the log is from qemu.log In qemu.StopVM(), if debug is enabled, the shim will dump logs from qemu.log, but users don't know which logs are from qemu.log and shim itself. Adding some additional messages will help users to distinguish these logs. Fixes: #4745 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-07-26 16:08:59 +08:00
gntouts	56d49b5073	versions: Update Firecracker version to v1.1.0 This patch upgrades Firecracker version from v0.23.4 to v1.1.0 * Generate swagger models for v1.1.0 (from firecracker.yaml) * Replace ht_enabled param to smt (API change) * Remove NUMA-related jailer param --node 0 Fixes: #4673 Depends-on: github.com/kata-containers/tests#4968 Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk> Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2022-07-26 07:01:26 +00:00
Ji-Xinyou	62182db645	runtime-rs: add unit test for ipvlan endpoint Add unit test to check the integrity of IPVlanEndpoint::new(...) Fixes: #4655 Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>	2022-07-18 15:56:06 +08:00
wllenyj	274598ae56	kata-runtime: add dragonball config check support. add dragonball config check support. Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2022-07-14 10:43:50 +08:00
Fabiano Fidêncio	be31207f6e	clh: Don't crash if no network device is set by the upper layer `ctr` doesn't set a network device when creating the sandbox, which leads to Cloud Hypervisor's driver crashing, see the log below:
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x55641c23b248]

goroutine 32 [running]:
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.glob..func1(0xc000397900)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:163 +0x128
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).vmAddNetPut(...)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1348
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).bootVM(0xc000397900, {0x55641c76dfc0, 0xc000454ae0})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:1378 +0x5a2
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(cloudHypervisor).StartVM(0xc000397900, {0x55641c76dff8, 0xc00044c240}, 0x55641b8016fd)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/clh.go:659 +0x7ee
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(Sandbox).startVM.func2()
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1219 +0x190
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(LinuxNetwork).Run.func1({0xc0004a8910, 0x3b})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:319 +0x1b
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.doNetNS({0xc000048440, 0xc00044c240}, 0xc0005d5b38)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:1045 +0x163
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(LinuxNetwork).Run(0xc000150c80, {0x55641c76dff8, 0xc00044c240}, 0xc00014e4e0)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/network_linux.go:318 +0x105
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(Sandbox).startVM(0xc000107d40, {0x55641c76dff8, 0xc0005529f0})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/sandbox.go:1205 +0x65f
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.createSandboxFromConfig({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, 0x0, ...}, ...}, ...)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:91 +0x346
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.CreateSandbox({_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, 0x0, ...}, ...}, ...)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/api.go:51 +0x150
github.com/kata-containers/kata-containers/src/runtime/virtcontainers.(VCImpl).CreateSandbox(_, {_, _}, {{0x0, 0x0, 0x0}, {0xc000385a00, 0x1, 0x1}, {0x55641d033260, ...}, ...})
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/virtcontainers/implementation.go:35 +0x74
github.com/kata-containers/kata-containers/src/runtime/pkg/katautils.CreateSandbox({_, _}, {_, _}, {{0xc0004806c0, 0x9}, 0xc000140110, 0xc00000f7a0, {0x0, 0x0}, ...}, ...)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/katautils/create.go:175 +0x8b6
github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.create({0x55641c76dff8, 0xc0004129f0}, 0xc00034a000, 0xc00036a000)
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/create.go:147 +0xdea
github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(service).Create.func2()
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:401 +0x32
created by github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2.(service).Create
	/home/ubuntu/go/src/github.com/kata-containers/kata-containers/src/runtime/pkg/containerd-shim-v2/service.go:400 +0x534
```
This bug has been introduced as part of the https://github.com/kata-containers/kata-containers/pull/4312 PR, which changed how we add the network device. In order to avoid the crash, let's simply check whether we have a device to be added before iterating the list of network devices. Fixes: #4618 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-13 10:40:21 +02:00
Fabiano Fidêncio	dc3b6f6592	versions: Update Cloud Hypervisor to v25.0 Cloud Hypervisor v25.0 has been released on July 7th, 2022, and brings the following changes: ch-remote Improvements The ch-remote command has gained support for creating the VM from a JSON config and support for booting and deleting the VM from the VMM. VM "Coredump" Support Under the guest_debug feature flag it is now possible to extract the memory of the guest for use in debugging with e.g. the crash utility. (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4012) Notable Bug Fixes * Always restore console mode on exit (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4249, https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4248) * Restore vCPUs in numerical order which fixes aarch64 snapshot/restore (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4244) * Don't try and configure IFF_RUNNING on TAP devices (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4279) * Propagate configured queue size through to vhost-user backend (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4286) * Always Program vCPU CPUID before running the vCPU to fix running on Linux 5.16 (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/4156) * Enable ACPI MADT "Online Capable" flag for hotpluggable vCPUs to fix newer Linux guest Removals The following functionality has been removed: * The mergeable option from the virtio-pmem support has been removed (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3968) * The dax option from the virtio-fs support has been removed (https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3889) Fixes: #4641 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-07-12 14:47:58 +00:00
Manabu Sugimoto	4d89476c91	runtime: Fix DisableSelinux config Enable the Kata runtime to handle the `disable_selinux` flag properly, so that the runtime configuration controls whether the runtime applies the SELinux label to the VMM process. Fixes: #4599 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2022-07-06 15:50:28 +09:00
Fabiano Fidêncio	071dd4c790	Merge pull request #4109 from pmores/drop-in-cfg-files-support Drop in cfg files support	2022-07-05 22:21:24 +02:00
Peng Tao	a1de394e51	Merge pull request #4550 from liubin/fix/4548-overwrite-mount-type-for-bind-mount runtime: overwrite mount type to bind for bind mounts	2022-07-04 19:56:26 +08:00
liubin	1f363a386c	runtime: overwrite mount type to bind for bind mounts Some clients like nerdctl may pass a mount type of none for volumes/bind mounts, which causes the container start to fail. Following runc, overwrite the mount type to bind and ignore the input value. Fixes: #4548 Signed-off-by: liubin <liubin0329@gmail.com>	2022-07-01 12:13:01 +08:00
GabyCT	02a51e75a7	Merge pull request #4554 from liubin/fix/delete-not-used-console-from-container-config runtime: delete Console from Cmd type	2022-06-30 11:40:07 -05:00
Fabiano Fidêncio	aa561b49f5	Merge pull request #4540 from fidencio/topic/default_maxmemory Add `default_maxmemory` config option	2022-06-30 12:08:15 +02:00
GabyCT	2a94261df5	Merge pull request #4549 from liubin/fix/4419-set-status-if-wait-process-failed shim: set a non-zero return code if the wait process call failed.	2022-06-29 17:04:53 -05:00
Fabiano Fidêncio	1e12d56512	Merge pull request #4469 from egernst/config-validation-refactor Refactor how hypervisor config validation is handled	2022-06-29 14:42:11 +02:00
liubin	a5a25ed13d	runtime: delete Console from Cmd type There is much code related to this property, but it is not used anymore. Fixes: #4553 Signed-off-by: liubin <liubin0329@gmail.com>	2022-06-29 17:36:32 +08:00
Pavel Mores	96553e8bd2	runtime: Add documentation of drop-in config file fragments Added user manual for the drop-in config file fragments feature. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 10:56:53 +02:00
Pavel Mores	c656457e90	runtime: Add tests of drop-in config file decoding The tests ensure that interactions between drop-ins and the base configuration.toml and among drop-ins themselves work as intended, basically that files are evaluated in the correct order (base file first, then drop-ins in alphabetical order) and the last one to set a specific key wins. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:39 +02:00
Pavel Mores	99f5ca80fc	runtime: Plug drop-in decoding into decodeConfig() Fixes #4108 Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	0f9856c465	runtime: Scan drop-in directory, read files and decode them updateFromDropIn() uses the infrastructure built by previous commits to ensure no contents of 'tomlConfig' are lost during decoding. To do this, we preserve the current contents of our tomlConfig in a clone and decode a drop-in into the original. At this point, the original instance is updated but its Agent and/or Hypervisor fields are potentially damaged. To merge, we update the clone's Agent/Hypervisor from the original instance. Now the clone has the desired Agent/Hypervisor and the original instance has the rest, so to finish, we just need to move the clone's Agent/Hypervisor to the original. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	2c1efcc697	runtime: Add helpers to copy fields between tomlConfig instances These functions take a TOML key - an array of individual components, e.g. ["agent" "kata" "enable_tracing"], as returned by BurntSushi - and two 'tomlConfig' instances. They copy the value of the struct field identified by the key from the source instance to the target one if necessary. This is only done if the TOML key points to structures stored in maps by 'tomlConfig', i.e. 'hypervisor' and 'agent'. Nothing needs to be done in other cases. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
Pavel Mores	20f11877be	runtime: Add framework to manipulate config structs via reflection For 'tomlConfig' substructures stored in Golang maps - 'hypervisor' and 'agent' - BurntSushi doesn't preserve their previous contents as it does for substructures stored directly (e.g. 'runtime'). We use reflection to work around this. This commit adds three primitive operations to work with struct fields identified by their `toml:"..."` tags - one to get a field value, one to set a field value and one to assign a source struct field value to the corresponding field of a target. Signed-off-by: Pavel Mores <pmores@redhat.com>	2022-06-29 09:54:38 +02:00
liubin	ab5f1c9564	shim: set a non-zero return code if the wait process call failed. The return code is an int32, so if an error occurs, its default value may be zero, and this value will be treated as a normal exit code. Setting the return code to 255 lets the caller (for example Kubernetes) know that there are some problems with the pod/container. Fixes: #4419 Signed-off-by: liubin <liubin0329@gmail.com>	2022-06-29 12:33:32 +08:00
Eric Ernst	e5be5cb086	runtime: device: cleanup outdated comments Prior device config move didn't update the comments. Let's address this, and make sure comments match the new path... Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-28 18:22:28 -07:00
Eric Ernst	5f936f268f	virtcontainers: config validation is host specific Ideally this config validation would be in a separate package (katautils?), but that would introduce a circular dependency since we'd call it from vc, and it depends on vc types (which shouldn't be vc, but probably a hypervisor package instead). Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-28 18:22:28 -07:00
Fabiano Fidêncio	323271403e	virtcontainers: Remove unused function While working on the previous commits, some of the functions became unused. Let's simply remove them. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Fabiano Fidêncio	0939f5181b	config: Expose default_maxmemory Expose the newly added `default_maxmemory` to the project's Makefile and to the configuration files. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Fabiano Fidêncio	58ff2bd5c9	clh,qemu: Adapt to using default_maxmemory Let's adapt Cloud Hypervisor's and QEMU's code to properly handle the newly added `default_maxmemory` config. While implementing this, a change of behaviour (or a bug fix, depending on how you see it) has been introduced: if a pod requests more memory than the amount available in the host, instead of failing to start the pod, we simply hotplug the maximum amount of memory available, better mimicking the runc behaviour. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 21:19:24 +02:00
Tim Zhang	916ffb75d7	Merge pull request #4432 from liubin/fix/4420-binary-log shim: support shim v2 logging plugin	2022-06-28 16:29:07 +08:00
Fabiano Fidêncio	afdc960424	hypervisor: Add default_maxmemory configuration Let's add a `default_maxmemory` configuration, which allows the admins to set the maximum amount of memory to be used by a VM, considering the initial amount + whatever ends up being hotplugged via the pod limits. By default this value is 0 (zero), and it means that the whole physical RAM is the limit. Fixes: #4516 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-28 08:32:15 +02:00
Bin Liu	4e30e11b31	shim: support shim v2 logging plugin Currently the kata shim only supports fifo stdout/stderr from containerd/CRI-O, but shim v2 supports logging plugins, and nerdctl uses the binary scheme for logs by default. This commit adds the other types of log plugins: - file - binary In the binary case, the kata shim will receive a stdout/stderr like: binary:///nerdctl?_NERDCTL_INTERNAL_LOGGING=/var/lib/nerdctl/1935db59 That means the nerdctl process will handle the logs (stdout/stderr). Fixes: #4420 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-28 13:54:22 +08:00
Eric Ernst	bdf5e5229b	virtcontainers: validate hypervisor config outside of hypervisor itself Depending on the user of it, the hypervisor from hypervisor interface could have differing view on what is valid or not. To help decouple, let's instead check the hypervisor config validity as part of the sandbox creation, rather than as part of the CreateVM call within the hypervisor interface implementation. Fixes: #4251 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-27 11:53:41 -07:00
Eric Ernst	469e098543	katautils: don't do validation when loading hypervisor config Policy for what's valid/invalid within the config varies by VMM, host, and by silicon architecture. Let's keep katautils simple for just translating a toml to the hypervisor config structure, and leave validation to virtcontainers. Without this change, we're doing duplicate validation. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-27 10:13:26 -07:00
Bin Liu	27b1bb5ed9	Merge pull request #4467 from egernst/device-pkg device package cleanup/refactor	2022-06-27 14:40:53 +08:00
Eric Ernst	e32bf53318	device: deduplicate state structures Before, we maintained almost identical structures between our persist API and what we keep for our devices, with the persist API being a slight subset of device structures. Let's deduplicate this, now that persist is importing device package. Json unmarshal of prior persist structure will work fine, since it was an exact subset of fields. Fixes: #4468 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Eric Ernst	f97d9b45c8	runtime: device/persist: drop persist dependency from device pkgs Rather than have the device package depend on persist, let's define the (almost duplicate) structures within device itself, and have the Kata Containers persist pkg import these. This'll help avoid unnecessary dependencies within our core packages. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Eric Ernst	f9e96c6506	runtime: device: move to top level package Let's move device package to runtime/pkg instead of being buried under virtcontainers. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-06-26 21:31:29 -07:00
Fabiano Fidêncio	133528dd14	Merge pull request #4503 from amshinde/multi-queue-block block: Leverage multiqueue for virtio-block	2022-06-23 12:17:11 +02:00
Fabiano Fidêncio	78e27de6c3	Merge pull request #4358 from zvonkok/memreserve runtime: Add heuristic to get the right value(s) for mem-reserve	2022-06-22 13:41:23 +02:00
Archana Shinde	e227b4c404	block: Leverage multiqueue for virtio-block Similar to network, we can use multiple queues for virtio-block devices. This can help improve storage performance. This commit changes the number of queues for block devices to the number of cpus for cloud-hypervisor and qemu. Today the default number of cpus a VM starts with is 1. Hence the queues used will be 1. This change will help improve performance when the default cold-plugged cpus is greater than one by changing this in the config file. This may also help when we use the sandboxing feature with k8s that passes down the sum of the resources required down to Kata. Fixes #4502 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2022-06-21 12:38:53 -07:00
Eric Ernst	72049350ae	Merge pull request #4288 from fengwang666/enable-qemu-sandbox runtime: enable sandbox feature on qemu	2022-06-21 09:22:26 -07:00
Zvonko Kaiser	e7e7dc9dfe	runtime: Add heuristic to get the right value(s) for mem-reserve Fixes: #2938 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-06-21 03:44:28 -07:00
Liang Zhou	ef925d40ce	runtime: enable sandbox feature on qemu Enabling "-sandbox on" in qemu introduces another protection layer on the host, making the secure container more secure. The option is disabled by default because this feature may introduce some performance cost, even though users can enable /proc/sys/net/core/bpf_jit_enable to reduce the impact. Fixes: #2266 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-06-17 15:30:46 -07:00
Chelsea Mafrica	28995301b3	tracing: Remove whitespace from root span Remove space from root span name to follow camel casing of other tracing span names in the runtime and to make parsing easier in testing. Fixes #4483 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2022-06-17 12:07:37 -07:00
Fabiano Fidêncio	f30fe86dc1	Merge pull request #4456 from Bevisy/fixIssue4454 docs: Update outdated URLs and keep them available	2022-06-16 10:26:24 +02:00
Bin Liu	553ec46115	Merge pull request #4436 from alex-matei/fix/sandbox-mem-overflow runtime: fix error when trying to parse sandbox sizing annotations	2022-06-16 11:18:24 +08:00
James O. D. Hunt	9766a285a4	Merge pull request #4422 from snir911/dependabot_bumps deps: Resolve dependabot bumps of containerd, crossbeam-utils, regex	2022-06-15 15:57:53 +01:00
Binbin Zhang	a305bafeef	docs: Update outdated URLs and keep them available By comparing the content of the old url and the new url, ensure that their content is consistent and does not contain ambiguities Fixes: #4454 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2022-06-15 16:34:28 +08:00
Fabiano Fidêncio	ac5dbd8598	clh: Improve logging related to the net dev addition Let's improve the log so we make it clear that we're only actually adding the net device to the Cloud Hypervisor configuration when calling our own version of VmAddNetPut(). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	0b75522e1f	network: Set queues to 1 to ensure we get the network fds We want to have the file descriptors of the opened tuntap device to pass them down to the VMMs, so the VMMs don't have to explicitly open a new tuntap device themselves, as the `container_kvm_t` label does not allow such a thing. With this change we ensure that what's currently done when using QEMU as the hypervisor, can be easily replicated with other VMMs, even if they don't support multiqueue. As a side effect of this, we need to close the received file descriptors in the code of the VMMs which are not going to use them. Fixes: #3533 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	93b61e0f07	network: Add IFF_NO_PI to the netlink flags Adding IFF_NO_PI to the netlink flags causes no harm to the supported and tested hypervisors, as when opening the device by its name Cloud Hypervisor[0], Firecracker[1], and QEMU[2] do set the flag already. However, when receiving the file descriptor of an opened tuntap device Cloud Hypervisor is not able to set the flag, leaving the guest without connectivity. To avoid such an issue, let's simply add the IFF_NO_PI flag to the netlink flags and ensure, from our side, that the VMMs don't have to set it on their side when dealing with an already opened tuntap device. Note that there's a PR opened[3] just for testing that this change doesn't cause any breakage. [0]: `e52175c2ab/net_util/src/tap.rs (L129)` [1]: `b6d6f71213/src/devices/src/virtio/net/tap.rs (L126)` [2]: `3757b0d08b/net/tap-linux.c (L54)` [3]: https://github.com/kata-containers/kata-containers/pull/4292 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	bf3ddc125d	clh: Pass the tuntap fds down to Cloud Hypervisor This is basically a no-op right now, as: * netPair.TapInterface.VMFds is nil * the tap name is still passed to Cloud Hypervisor, which is the Cloud Hypervisor's first choice when opening a tap device. In the very near future we'll stop passing the tap name to Cloud Hypervisor, and start passing the file descriptors of the opened tap instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	55ed32e924	clh: Take care of the VmAddNetPut request ourselves Knowing that VmAddNetPut works as expected, let's switch to manually building the request and writing it to the appropriate socket. Doing this gives us more flexibility to, later on, pass the file descriptor of the tuntap device to Cloud Hypervisor, as openAPI doesn't support such an operation (it has no notion of SCM Rights). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:53:09 +00:00
Fabiano Fidêncio	01fe09a4ee	clh: Hotplug the network devices Instead of creating the VM with the network device already plugged in, let's actually add the network device after the VM is created, but before the Vm is actually booted. Although it looks like it doesn't make any functional difference between what's done in the past and what this commit introduces, this will be used to workaround a limitation on OpenAPI when it comes to passing down the network device's file descriptor to Cloud Hypervisor, so Cloud Hypervisor can use it instead of opening the device by its name on the VMM side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:51:02 +00:00
Fabiano Fidêncio	2e07538334	clh: Expose VmAddNetPut VmAddNetPut is the API provided by the Cloud Hypervisor client (auto generated) code to hotplug a new network device to the VM. Let's expose it now as it'll be used as part of this series, mostly to guide the reviewer through the process of what we have to do, as later on, spoiler alert, it'll end up being removed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-14 10:27:30 +00:00
Fabiano Fidêncio	a80eb33cd6	Merge pull request #4308 from fidencio/topic/virtiofsd-switch-to-using-the-rust-version-on-all-arches runtime: Switch to using the rust version of virtiofsd (all arches but powerpc)	2022-06-13 13:45:51 +02:00
Bin Liu	81acfc1286	Merge pull request #4425 from liubin/fix/4376-change-log-level-of-getoomevent shim: change the log level for GetOOMEvent call failures	2022-06-13 17:53:11 +08:00
James O. D. Hunt	9b93db0220	Merge pull request #4417 from jodh-intel/docs-monitor-considerations docs: Add more kata monitor details	2022-06-13 10:51:52 +01:00
Fabiano Fidêncio	1ef0b7ded0	runtime: Switch to using the rust version of virtiofsd (all but power) So far this has been done for x86_64. Now that the support for building and testing has been added for all arches, let's do the second part of the switch. We're still not done yet for powerpc, as a virtiofsd crash on the rust version has been found by the maintainer. Fixes: #4258, #4260 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-06-13 10:41:26 +02:00
Alexandru Matei	721ca72a64	runtime: fix error when trying to parse sandbox sizing annotations Changed bitsize for parsing functions to 64-bit in order to avoid parsing errors. Fixes #4435 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2022-06-11 18:51:10 +03:00
Archana Shinde	aefe11b9ba	Merge pull request #4331 from dgibson/config-enable-iommu-annotation Allow io.katacontainers.config.hypervisor.enable_iommu annotation by …	2022-06-10 17:43:27 -07:00
James O. D. Hunt	412441308b	docs: Add more kata monitor details Add more detail to the `kata-monitor` doc to allow an admin to make a more informed decision about where and how to run the daemon. Fixes: #4416. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-06-09 09:20:11 +01:00
Eric Ernst	4ebf9d38b9	Merge pull request #4310 from egernst/core-sched shim: add support for core scheduling	2022-06-08 17:42:45 +02:00
Bin Liu	eff4e1017d	shim: change the log level for GetOOMEvent call failures GetOOMEvent is a blocking call that will fail if the container exits; in this case, it's not an error or warning. Changing the log level for logs emitted when the GetOOMEvent call fails will reduce log noise in a large cluster that has pods being created and deleted frequently. Fixes: #4376 Signed-off-by: Bin Liu <bin@hyper.sh>	2022-06-08 22:17:24 +08:00
dependabot[bot]	5d7fb7b7b0	build(deps): bump github.com/containerd/containerd in /src/runtime Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.1 to 1.6.6. - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.6.1...v1.6.6) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production ... Fixes: #4421 Signed-off-by: dependabot[bot] <support@github.com>	2022-06-08 10:54:46 +03:00
David Gibson	8f10e13e07	config: Allow enable_iommu pod annotation by default Since #902 the `io.katacontainers.config.hypervisor` pod annotations have only been permitted if explicitly allowed in the global configuration. The default global configuration allows no such annotations. That's important because several of those annotations would cause Kata to execute arbitrary binaries, and so were wildly unsafe. However, this is inconvenient for the `io.katacontainers.config.hypervisor.enable_iommu` annotation specifically, which controls whether the sandbox VM includes a vIOMMU. A guest side vIOMMU is necessary to implement VFIO passthrough devices with `vfio_mode = vfio`, so enabling that mode of operation currently requires a global configuration change, and can't just be enabled per-pod. Unlike some of the other hypervisor annotations, the `enable_iommu` annotation is quite safe. By default the vIOMMU is not present, so allowing a user to override it for a pod only improves their facilities for isolation. Even if the global default were changed to enable the vIOMMU, that doesn't compel the guest kernel to use it, so allowing a user to disable the vIOMMU doesn't materially affect isolation either. Therefore, allow the io.katacontainers.config.hypervisor.enable_iommu annotation to work in the default configurations. fixes #4330 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-06-04 13:02:05 +10:00
Eric Ernst	430da47215	Merge pull request #4360 from fengwang666/shim-leak runtime: ignore ESRCH error from stop container	2022-06-02 12:42:19 -07:00
Feng Wang	9726f56fdc	runtime: force stop container after the container process exits Set the stop container force flag to true so that the container state is always set to “StateStopped” after the container wait goroutine is finished. This is necessary for the following delete container step to succeed. Fixes: #4359 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-06-02 08:17:08 -07:00
Eric Ernst	d2df1209a5	docs: describe kata handling for core-scheduling Add initial documentation for core-scheduling. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 16:17:00 -07:00
Michael Crosby	22b6a94a84	shim: add support for core scheduling In linux 5.14 and hopefully some backports, core scheduling allows processes to be co-scheduled within the same domain on SMT enabled systems. Containerd impl sets the core sched domain when launching a shim. This allows a clean way for each shim (container/pod) to be in its own domain, and for any additional containers (v2 pods) to be launched with the same domain, as well as any exec'd process added to the container. kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html For Kata specifically, we will look for SCHED_CORE environment variable to be set to indicate we should create a new schedule core domain. This is equivalent to the containerd shim's PR: `e48bbe8394` Fixes: #4309 Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Michael Crosby <michael@thepasture.io>	2022-05-31 10:10:40 -07:00
Eric Ernst	65f0cef16c	kata-runtime: add iptables CLI to test http endpoint While end users can connect directly to the shim, let's provide a way to easily get/set iptables from kata-runtime itself. Fixes: #4080 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	3201ad0830	shim-client: ensure we check resp status for Put/Post Without this, potential errors are silently dropped. Let's ensure we return the error code as well as potential data from the response. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	0706fb28ac	kata-runtime: shmgmt: make url usage consistent Before, we had a mix of slash, etc. Unfortunately, when cleaning URL paths, serve mux seems to mangle the request method, resulting in each request being a GET (instead of PUT or POST). Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	2a09378dd9	shim-client: add support for DoPut While at it, make sure we check for nil in DoPost Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	640173cfc2	shim-mgmt: Add endpoint handler for interacting with iptables Add two endpoints: ip6tables, iptables. Each url handler supports GET and PUT operations. PUT expects the requests' data to be []bytes, and to contain iptable information in format to be consumed by iptables-restore. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	0136be22ca	virtcontainers: plumb iptable set/get from sandbox to agent Introduce get/set iptable handling. We add a sandbox API for getting and setting the IPTables within the guest. This routes it from sandbox interface, through kata-agent, ultimately making requests to the guest agent. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 09:27:58 -07:00
Eric Ernst	03176a9e09	proto: update generated code based on proto update Update the generated agent.pb.go code based on proto update. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-05-31 08:45:59 -07:00
Fabiano Fidêncio	fff832874e	clh: Update to v24.0 This release has been tracked through the v24.0 project. virtio-iommu specification describes how a device can be attached by default to a bypass domain. This feature is particularly helpful for booting a VM with guest software which doesn't support virtio-iommu but still need to access the device. Now that Cloud Hypervisor supports this feature, it can boot a VM with Rust Hypervisor Firmware or OVMF even if the virtio-block device exposing the disk image is placed behind a virtual IOMMU. Multiple checks have been added to the code to prevent devices with identical identifiers from being created, and therefore avoid unexpected behaviors at boot or whenever a device was hot plugged into the VM. Sparse mmap support has been added to both VFIO and vfio-user devices. This allows the device regions that are not fully mappable to be partially mapped. And the more a device region can be mapped into the guest address space, the fewer VM exits will be generated when this device is accessed. This directly impacts the performance related to this device. A new serial_number option has been added to --platform, allowing a user to set a specific serial number for the platform. This number is exposed to the guest through the SMBIOS. 
* Fix loading RAW firmware (#4072) * Reject compressed QCOW images (#4055) * Reject virtio-mem resize if device is not activated (#4003) * Fix potential mmap leaks from VFIO/vfio-user MMIO regions (#4069) * Fix algorithm finding HOB memory resources (#3983) * Refactor interrupt handling (#4083) * Load kernel asynchronously (#4022) * Only create ACPI memory manager DSDT when resizable (#4013) Deprecated features will be removed in a subsequent release and users should plan to use alternatives * The mergeable option from the virtio-pmem support has been deprecated (#3968) * The dax option from the virtio-fs support has been deprecated (#3889) Fixes: #4317 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-26 08:51:18 +00:00
Eric Ernst	6d00701ec9	Merge pull request #4298 from yibozhuang/fix-direct-volume Fix issues with direct-volume stats feature	2022-05-23 15:23:51 -07:00
Yibo Zhuang	4428ceae16	runtime: direct-volume stats use correct name Today the shim does a translation when doing direct-volume stats where it takes the source and returns the mount path within the guest. The source for a direct-assigned volume is actually the device path on the host and not the publish volume path. This change will perform a lookup of the mount info during direct-volume stats to ensure that the device path is provided to the shim for querying the volume stats. Fixes: #4297 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 18:42:47 -07:00
Yibo Zhuang	ffdc065b4c	runtime: direct-volume stats update to use GET parameter The go default http mux AFAIK doesn’t support pattern routing so right now client is padding the url for direct-volume stats with a subpath of the volume path and this will always result in 404 not found returned by the shim. This change will update the shim to take the volume path as a GET query parameter instead of a subpath. If the parameter is missing or empty, then return 400 BadRequest to the client. Fixes: #4297 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 18:41:51 -07:00
Yibo Zhuang	f295953183	runtime: fix incorrect Action function for direct-volume stats The action function expects a function that returns error but the current direct-volume stats Action returns (string, error) which is invalid. This change fixes the format and print out the stats from the command instead. Fixes: #4293 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-05-20 14:55:00 -07:00
Peng Tao	2c238c8504	Merge pull request #4213 from zvonkok/vfio runtime: Adding the correct detection of mediated PCIe devices	2022-05-20 15:00:23 +08:00
Fabiano Fidêncio	811ac6a8ce	Merge pull request #4282 from r4f4/runtime-dedup-types-import runtime: remove duplicate 'types' import	2022-05-19 22:15:36 +02:00
Chelsea Mafrica	d8be0f8e9f	Merge pull request #4281 from r4f4/runtime-qemu-comments runtime: sync docstrings with function names	2022-05-19 09:17:38 -07:00
Rafael Fonseca	7a5ccd1264	runtime: sync docstrings with function names The functions were renamed but their docstrings were not. Fixes #4006 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-19 14:31:47 +02:00
Greg Kurz	fa61bd43ee	Merge pull request #4238 from snir911/wip/legacy_console qemu: allow using legacy serial device for the console	2022-05-19 14:30:59 +02:00
Rafael Fonseca	ce2e521a0f	runtime: remove duplicate 'types' import Fallout of `09f7962ff` Fixes #4285 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-19 13:49:47 +02:00
Snir Sheriber	f4994e486b	runtime: allow annotation configuration to use_legacy_serial and update the docs and test Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-18 18:58:21 +03:00
Fabiano Fidêncio	c88a48be21	Merge pull request #4271 from r4f4/runtime-err-check-fix runtime: do not check for EOF error in console watcher	2022-05-18 09:49:48 +02:00
GabyCT	12f0ab120a	Merge pull request #4191 from dgibson/go-test-script Improve Go unit test script	2022-05-17 10:27:04 -05:00
Rafael Fonseca	8052fe62fa	runtime: do not check for EOF error in console watcher The documentation of the bufio package explicitly says "Err returns the first non-EOF error that was encountered by the Scanner." When io.EOF happens, `Err()` will return `nil` and `Scan()` will return `false`. Fixes #4079 Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>	2022-05-17 15:14:33 +02:00
Snir Sheriber	c67b9d2975	qemu: allow using legacy serial device for the console This allows to get guest early boot logs which are usually missed when virtconsole is used. - It utilizes previous work on the govmm side: https://github.com/kata-containers/govmm/pull/203 - unit test added Fixes: #4237 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-17 12:06:11 +03:00
Snir Sheriber	44814dce19	qemu: treat console kernel params within appendConsole as it is tightly coupled with the appended console device additionally have it tested Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2022-05-17 12:05:31 +03:00
Fabiano Fidêncio	c39852e83f	runtime: Use ${LIBEXEC}/virtiofsd as the default virtiofsd path As now we build and ship the rust version of virtiofsd, which is not tied to QEMU, we need to update its default location to match with where we're installing this binary. Fixes: #4249 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-16 09:30:24 +02:00
David Gibson	e73b70baff	runtime: Don't run unit tests verbose by default go-test.sh by default adds the -v option to 'go test' meaning that output will be printed from all the passing tests as well as any failing ones. This results in a lot of output in which it's often difficult to locate the failing tests you're interested in. So, remove -v from the default flags. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:22:31 +10:00
David Gibson	f24a6e761f	runtime: Consolidate flags setting in unit tests script One of the responsibilities of the go-test.sh script is setting up the default flags for 'go test'. This is constructed across several different places in the script using several unneeded intermediate variables though. Consolidate all the flag construction into one place. fixes #4190 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:22:29 +10:00
David Gibson	cf465feb02	runtime: Don't change test behaviour based on $CI or $KATA_DEV_MODE go-test.sh changes behaviour based on both the $CI and $KATA_DEV_MODE variables, but not in a way that makes a lot of sense. If either one is set it uses the test_coverage path, instead of the test_local path. That collects coverage information, as the name suggests, but it also means it runs the tests twice as root and non-root, which is very non-obvious. It's not clear what use case the test_local path is for at all. Developer local builds will typically have $KATA_DEV_MODE set and CI builds will have $CI set. There's essentially no downside to running coverage all the time - it has little impact on the test runtime. In addition, if both $CI and $KATA_DEV_MODE are set, the script refuses to run things as root, considering it "unsafe". While having both set might be unwise in a general sense, there's not really any way running sudo can be any more unsafe than it is with either one set. So, simplify everything by just always running the test_coverage path. This leaves the test_local path unused, so we can remove it entirely. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	34c4ac599c	runtime: Remove redundant subcommands from go-test.sh go-test.sh accepts subcommands, however invoking it in the usual way via the Makefile doesn't use them. In fact the only remaining subcommand is "help" and we already have another way of getting the usage information (-h or --help). We don't need a second way, so just drop subcommand handling. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	0aff5aaa39	runtime: Simplify package listing in go-test.sh go-test.sh defaults to testing all the packages listed by go list, except for a number filtered out. It turns out that none of those filters are necessary any more: * We've long required a Go newer than 1.9 which means the vendor filter isn't needed * The agent filter doesn't do anything now that we've moved to the Kata 2.x unified repo * The tests filters don't hit anything on the list of modules in src/runtime (which is the only user of the script) But since we don't need to filter anything out any more, we don't even need to iterate through a list ourselves. We can simply pass "./..." directly to go test and it will iterate through all the sub-packages itself. Interestingly this more than doubles the speed of "make test" for me - I suspect because go test's internal parallelism works better over a larger pool of tests. This also lets us remove handling of non-existent coverage files from test_go_package(), since with default options we will no longer test packages without tests by default. If the user explicitly requests testing of a package with no tests, then failing makes sense. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	557c4cfd00	runtime: Don't chmod coverage files in Go tests The go-test.sh script has an explicit chmod command, run as root, to set the mode of the temporary coverage files to 0644. AFAICT the point of this is specifically the 004 bit allowing world read access, so that we can then merge the temporary coverage file into the main coverage file. That's a convoluted way of doing things. Instead we can just run the tail command which reads the temporary file as the same user that generated it. In addition, go-test.sh became root to remove that temporary coverage file. This is not necessary, since deleting a regular file just requires write access to the directory, not the file itself. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	04c8b52e04	runtime: Remove HTML coverage option from go-test.sh The html-coverage option to this script doesn't really alter behaviour it just does the same thing as normal coverage, then converts the report to HTML. That conversion is a single command, plus a chmod to make the final output mode 0644. That overrides any umask the user has set, which doesn't seem like a policy decision this script should be making. Nothing in the kata-containers or tests repository uses this, so it doesn't really make sense to keep this logic inside this script. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	7f76914422	runtime: Add coverage.txt.tmp to gitignore In addition to coverage.txt, the go-test.sh script creates coverage.txt.tmp files while running. These are temporary and certainly shouldn't be committed, so add them to the gitignore file. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
David Gibson	13c2577004	runtime: Move go testing script locally The go unit tests for the runtime are invoked by the helper script ci/go-test.sh. Which calls the run_go_test() function in ci/lib.sh. Which calls into .ci/go-test.sh from the tests repository. But.. the runtime is the only user of this script, and generally stuff for unit tests (rather than functional or integration tests) lives in the main repository, not the tests repository. So, just move the actual script into src/runtime. A change to remove it from the tests repo will follow. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-05-13 13:14:37 +10:00
Zvonko Kaiser	2a1d394147	runtime: Adding the correct detection of mediated PCIe devices Fixes #4212 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2022-05-09 00:57:06 -07:00
Fabiano Fidêncio	33a8b70558	clh: Rely on Cloud Hypervisor for generating the device ID We're currently hitting a race condition on the Cloud Hypervisor's driver code when quickly removing and adding a block device. This happens because the device removal is an asynchronous operation, and we currently do not monitor events coming from Cloud Hypervisor to know when the device was actually removed. Together with this, the sandbox code doesn't know about that and when a new device is attached it'll quickly assign what may be the very same ID to the new device, leading to the Cloud Hypervisor's driver trying to hotplug a device with the very same ID of the device that was not yet removed. This is, in a nutshell, why the tests with Cloud Hypervisor and devmapper have been failing every now and then. The workaround taken to solve the issue is basically not passing down the device ID to Cloud Hypervisor and simply letting Cloud Hypervisor itself generate those, as Cloud Hypervisor does it in a manner that avoids such conflicts. With this addition we have then to keep a map of the device ID and the Cloud Hypervisor's generated ID, so we can properly remove the device. This workaround will probably stay for a while, at least till someone has enough cycles to implement a way to watch the device removal event and then properly act on that. Spoiler alert, this will be a complex change that may not even be worth it considering the race can be avoided with this commit. Fixes: #4176 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-05-04 09:04:03 +02:00
Jianyong Wu	982c32358a	Merge pull request #4031 from Jaylyn-Ren/kata-spdk Virtcontainers: Enable hot plugging vhost-user-blk device on ARM	2022-04-29 12:16:38 +08:00
Fabiano Fidêncio	b6467ddd73	clh: Expose disk rate limiter config With everything implemented, let's now expose the disk rate limiter configuration options in the Cloud Hypervisor configuration file. Fixes: #4139 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:28:29 +02:00
Fabiano Fidêncio	7580bb5a78	clh: Expose net rate limiter config With everything implemented, let's now expose the net rate limiter configuration options in the Cloud Hypervisor configuration file. Fixes: #4017 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:28:13 +02:00
Fabiano Fidêncio	a88adabaae	clh: Cloud Hypervisor has a built-in Rate Limiter The notion of "built-in rate limiter" was added as part of `bd8658e362`, and that commit considered that only Firecracker had a built-in rate limiter, which I think was the case when that was introduced (mid 2020). Nowadays, however, Cloud Hypervisor takes advantage of the very same crate used by Firecracker to do I/O throttling. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:56 +02:00
Fabiano Fidêncio	63c4da03a9	clh: Implement the Disk RateLimiter logic Let's take advantage of the newly added DiskRateLimiter* options and apply those to the disk device configuration. The logic here is identical to the one already present in the Network part of Cloud Hypervisor's driver. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:53 +02:00
Fabiano Fidêncio	511f7f822d	config: Add DiskRateLimiter* to Cloud Hypervisor Let's add the newly added disk rate limiter configurations to the Cloud Hypervisor's hypervisor configuration. Right now those are not used anywhere, and there's absolutely no way the users can set those up. That's coming later in this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:15 +02:00
Fabiano Fidêncio	5b18575dfe	hypervisor: Add disk bandwidth and operations rate limiters This is the disk counterpart of the what was introduced for the network as part of the previous commits in this series. The newly added fields are: * DiskRateLimiterBwMaxRate, defined in bits per second, which is used to control the network I/O bandwidth at the VM level. * DiskRateLimiterBwOneTimeBurst, also defined in bits per second, which is used to define an initial max rate, which doesn't replenish. * DiskRateLimiterOpsMaxRate, the operations per second equivalent of the DiskRateLimiterBwMaxRate. * DiskRateLimiterOpsOneTimeBurst, the operations per second equivalent of the DiskRateLimiterBwOneTimeBurst. For now those extra fields have only been added to the hypervisor's configuration and they'll be used in the coming patches of this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:27:11 +02:00
Fabiano Fidêncio	1cf9469297	clh: Implement the Network RateLimiter logic Let's take advantage of the newly added NetRateLimiter* options and apply those to the network device configuration. The logic here is quite similar to the one already present in the Firecracker's driver, with the main difference being the single Inbound / Outbound MaxRate and the presence of both Bandwidth and Operations rate limiter. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:26:38 +02:00
Fabiano Fidêncio	00a5b1bda9	utils: Define DefaultRateLimiterRefillTimeMilliSecs Firecracker's driver doesn't expose the RefillTime option of the rate limiter to the user. Instead, it uses a constant value of 1000 milliseconds (1 second). As we're following Firecracker's driver implementation, let's create a new constant, use it as part of the Firecracker's driver, and later on re-use it as part of the Cloud Hypervisor's driver. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	be1bb7e39f	utils: Move FC's function to revert bytes to utils Firecracker's revertBytes function, now called "RevertBytes", can be exposed as part of the virtcontainers' utils file, as this function will be reused by Cloud Hypervisor, when adding the rate limiter logic there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	c9f6496d6d	config: Add NetRateLimiter* to Cloud Hypervisor Let's add the newly added network rate limiter configurations to the Cloud Hypervisor's hypervisor configuration. Right now those are not used anywhere, and there's absolutely no way the users can set those up. That's coming later in this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
Fabiano Fidêncio	2d35e6066d	hypervisor: Add network bandwidth and operations rate limiters In a similar way to what's already exposed as RxRateLimiterMaxRate and TxRateLimiterMaxRate, let's add four new fields to the Hypervisor's configuration. The values added are related to bandwidth and operations rate limiters, which have to be added so we can expose I/O throttling configurations to users using Cloud Hypervisor as their preferred VMM. The reason we cannot simply re-use {Rx,Tx}RateLimiterMaxRate is because Cloud Hypervisor exposes a single MaxRate to be used for both inbound and outbound queues. The newly added fields are: * NetRateLimiterBwMaxRate, defined in bits per second, which is used to control the network I/O bandwidth at the VM level. * NetRateLimiterBwOneTimeBurst, also defined in bits per second, which is used to define an initial max rate, which doesn't replenish. * NetRateLimiterOpsMaxRate, the operations per second equivalent of the NetRateLimiterBwMaxRate. * NetRateLimiterOpsOneTimeBurst, the operations per second equivalent of the NetRateLimiterBwOneTimeBurst. For now those extra fields have only been added to the hypervisor's configuration and they'll be used in the coming patches of this very same series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 10:22:42 +02:00
David Gibson	1b931f4203	runtime: Allow mockfs storage to be placed in any directory Currently EnableMockTesting() takes no arguments and will always place the mock storage in the fixed location /tmp/vc/mockfs. This means that one test run can interfere with the next one if anything isn't cleaned up (and there are other bugs which mean that happens). Even if those were fixed, this would allow developers testing on the same machine to interfere with each other. So, allow the mockfs to be placed at an arbitrary place given as a parameter to EnableMockTesting(). In TestMain() we place it under our existing temporary directory, so we don't need any additional cleanup just for the mockfs. fixes #4140 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:47:59 +10:00
David Gibson	ef6d54a781	runtime: Let MockFSInit create a mock fs driver at any path Currently MockFSInit always creates the mockfs at the fixed path /tmp/vc/mockfs. This change allows it to be initialized at any path given as a parameter. This allows the tests in fs_test.go to be simplified, because by using a temporary directory from t.TempDir(), which is automatically cleaned up, we don't need to manually trigger initTestDir() (which is misnamed, it's actually a cleanup function). For now we still use the fixed path when auto-creating the mockfs in MockAutoInit(), but we'll change that later. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	5d8438e939	runtime: Move mockfs control global into mockfs.go virtcontainers/persist/fs/mockfs.go defines a mock filesystem type for testing. A global variable in virtcontainers/persist/manager.go is used to force use of the mock fs rather than a normal one. This patch moves the global, and the EnableMockTesting() function which sets it into mockfs.go. This is slightly cleaner to begin with, and will allow some further enhancements. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	963d03ea8a	runtime: Export StoragePathSuffix storagePathSuffix defines the file path suffix - "vc" - used for Kata's persistent storage information, as a private constant. We duplicate this information in fc.go which also needs it. Export it from fs.go instead, so it can be used in fc.go. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	1719a8b491	runtime: Don't abuse MockStorageRootPath() for factory tests A number of unit tests under virtcontainers/factory use MockStorageRootPath() as a general purpose temporary directory. This doesn't make sense: the mockfs driver isn't even in use here since we only call EnableMockTesting for the base virtcontainers package, not the subpackages. Instead use t.TempDir() which is for exactly this purpose. As a bonus it also handles the cleanup, so we don't need MockStorageDestroy any more. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:23:36 +10:00
David Gibson	bec59f9e39	runtime: Make bind mount tests better clean up after themselves There are several tests in mount_test.go which perform a sample bind mount. These need a corresponding unmount to clean up afterwards or attempting to delete the temporary files will fail due to the existing mountpoint. Most of them had such an unmount, but TestBindMountInvalidPgtypes was missing one. In addition, the existing unmounts were done inconsistently - one was simply inline (so wouldn't be executed if the test fails too early) and one is a defer. Change them all to use the t.Cleanup mechanism. For the dummy mountpoint files, rather than cleaning them up after the test, the tests were removing them at the beginning of the test. That stops the test being messed up by a previous run, but messily. Since these are created in a private temporary directory anyway, if there's something already there, that indicates a problem we shouldn't ignore. In fact we don't need to explicitly remove these at all - they'll be removed along with the rest of the private temporary directory. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:20:35 +10:00
David Gibson	f7ba21c86f	runtime: Clean up mock hook logs in tests The tests in hook_test.go run a mock hook binary, which does some debug logging to /tmp/mock_hook.log. Currently we don't clean up those logs when the tests are done. Use a test cleanup function to do this. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
David Gibson	90b2f5b776	runtime: Make SetupOCIConfigFile clean up after itself SetupOCIConfigFile creates a temporary directory with os.MkdirTemp(). This means the callers need to register a deferred function to remove it again. At least one of them was commented out meaning that a /temp/katatest- directory was leftover after the unit tests ran. Change to using t.TempDir() which as well as better matching other parts of the tests means the testing framework will handle cleaning it up. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
David Gibson	2eeb5dc223	runtime: Don't use fixed /tmp/mountPoint path Several tests in kata_agent_test.go create /tmp/mountPoint as a dummy directory to mount. This is not cleaned up after the test. Although it is in /tmp, that's still a little messy and can be confusing to a user. In addition, because it uses the same name every time, it allows for one run of the test to interfere with the next. Use the built in t.TempDir() to use an automatically named and deleted temporary directory instead. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-04-22 14:14:52 +10:00
Archana Shinde	33e244f284	Merge pull request #4102 from likebreath/0414/clh_v23.0 Upgrade to Cloud Hypervisor v23.0	2022-04-19 06:01:04 -07:00
Chelsea Mafrica	0af13b469d	Merge pull request #4086 from BbolroC/s390x-fix test: Fix golangci-lint error for s390x	2022-04-15 21:07:09 -07:00
Bin Liu	b19bfac7cd	Merge pull request #4042 from yibozhuang/direct-assign-fsgroup fsGroup support for direct-assigned volume	2022-04-16 10:23:15 +08:00
Bin Liu	4ec1967542	Merge pull request #4094 from fgiudici/kata-monitor_readme kata-monitor: add the README file	2022-04-16 08:27:22 +08:00
Bin Liu	362201605e	Merge pull request #4055 from fgiudici/kata-monitor_pprof kata-monitor: update the hrefs in the debug/pprof index page	2022-04-16 08:12:18 +08:00
Francesco Giudici	7b2ff02647	kata-monitor: add a README file Fixes: #3704 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-04-15 18:03:23 +02:00
Bo Chen	29e569aa92	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v23.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-04-14 12:56:01 -07:00
Chelsea Mafrica	32f92e75cc	Merge pull request #4021 from fengwang666/direct-volume-bug runtime: Base64 encode the direct volume mountInfo path	2022-04-13 13:15:38 -07:00
Greg Kurz	4443bb68a4	Merge pull request #4064 from tiezhuoyu/4063/no-need-to-write-error-of-virtiofsd-to-kata-log runtime: no need to write virtiofsd error to log	2022-04-13 11:59:19 +02:00
Hyounggyu Choi	d136c9c240	test: Fix golangci-lint error for s390x This is to fix a test failure for the kata-containers-2.0-ubuntu-20.04-s390x-main-baseline jenkins job Fixes: #4088 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2022-04-13 09:20:51 +02:00
Fupan Li	66aa07649b	Merge pull request #4062 from liubin/fix/4061-add-links-for-kata-monitor kata-monitor: add some links when generating pages for browsers	2022-04-13 11:30:21 +08:00
Francesco Giudici	86977ff780	kata-monitor: update the hrefs in the debug/pprof index page kata-monitor allows to get data profiles from the kata shim instances running on the same node by acting as a proxy (e.g., http://$NODE_ADDRESS:8090/debug/pprof/?sandbox=$MYSANDBOXID). In order to proxy the requests and the responses to the right shim, kata-monitor requires to pass the sandbox id via a query string in the url. The profiling index page proxied by kata-monitor contains the link to all the data profiles available. All the links anyway do not contain the sandbox id included in the request: the links result then broken when accessed through kata-monitor. This happens because the profiling index page comes from the kata shim, which will not include the query string provided in the http request. Let's add on-the-fly the sandbox id in each href tag returned by the kata shim index page before providing the proxied page. Fixes: #4054 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-04-12 15:53:59 +02:00
Zhuoyu Tie	6e79042aa0	runtime: no need to write virtiofsd error to log The scanner reads nothing from virtiofsd stderr pipe, because param '--syslog' redirects stderr to syslog. So there is no need to write scanner.Text() to kata log Fixes: #4063 Signed-off-by: Zhuoyu Tie <tiezhuoyu@outlook.com>	2022-04-12 15:59:57 +08:00
Yibo Zhuang	532d53977e	runtime: fsGroup support for direct-assigned volume The fsGroup will be specified by the fsGroup key in the direct-assign mountinfo metadata field. This will be set when invoking the kata-runtime binary and providing the key, value pair in the metadata field. Similarly, the fsGroupChangePolicy will also be provided in the mountinfo metadata field. Adding extra fields FsGroup and FSGroupChangePolicy in the Mount construct for container mount which will be populated when creating block devices by parsing out the mountInfo.json. And in handleDeviceBlockVolume of the kata-agent client, it checks if the mount FSGroup is not nil, which indicates that fsGroup change is required in the guest, and will provide the FSGroup field in the protobuf to pass the value to the agent. Fixes #4018 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-04-11 08:41:13 -07:00
Yibo Zhuang	6a47b82c81	proto: fsGroup support for direct-assigned volume This change adds two fields to the Storage pb FSGroup which is a group id that the runtime specifies to indicate to the agent to perform a chown of the mounted volume to the specified group id after mounting is complete in the guest. FSGroupChangePolicy which is a policy to indicate whether to always perform the group id ownership change or only if the root directory group id does not match with the desired group id. These two fields will allow CSI plugins to indicate to Kata that after the block device is mounted in the guest, group id ownership change should be performed on that volume. Fixes #4018 Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>	2022-04-11 08:41:13 -07:00
bin	f8cc5d1ad8	kata-monitor: add some links when generating pages for browsers Add some links to rendered webpages for better user experience, so that users can jump to pages just by clicking links in browsers. Fixes: #4061 Signed-off-by: bin <bin@hyper.sh>	2022-04-11 09:29:56 +08:00
bin	9d5b03a1b7	runtime: delete debug option in virtiofsd virtiofsd's debug will be enabled if hypervisor's debug has been enabled, this will generate too many noisy logs from virtiofsd. Unbind the relationship of log level between virtiofsd and hypervisor, if users want to see debug log of virtiofsd, can set it by: virtio_fs_extra_args = ["-o", "log_level=debug"] Fixes: #3303 Signed-off-by: bin <bin@hyper.sh>	2022-04-07 19:55:22 +08:00
Greg Kurz	d0d3787233	Merge pull request #3696 from shippomx/main kata-runtime enable hugepage support	2022-04-06 16:47:04 +02:00
Jaylyn Ren	b975f2e8d2	Virtcontainers: Enable hot plugging vhost-user-blk device on ARM The vhost-user-blk can be hotplugged on the PCI bridge successfully on X86, but failed on Arm. However, hotplugging it on Root Port as a PCIe device can work well on ARM. Enabling "pcie_root_port" in configuration.toml is required. Fixes: #4019 Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>	2022-04-06 17:37:51 +08:00
Fabiano Fidêncio	b39caf43f1	Merge pull request #3923 from Jakob-Naucke/no-initrd-se runtime: Allow and require no initrd for SE	2022-04-05 09:26:07 +02:00
Feng Wang	354cd3b9b6	runtime: Base64 encode the direct volume mountInfo path This is to avoid accidentally deleting multiple volumes. Fixes #4020 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-04-04 19:56:46 -07:00
Archana Shinde	e62bc8e7f3	Merge pull request #3915 from Juneezee/test/t.TempDir test: use `T.TempDir` to create temporary test directory	2022-04-04 01:34:46 -07:00
Fabiano Fidêncio	98750d792b	clh: Expose service offload configuration This configuration option is valid for all the hypervisors that are going to be used with the confidential containers effort, thus exposing the configuration option for Cloud Hypervisor as well. Fixes: #4022 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-01 11:15:55 +02:00
Eng Zer Jun	59c7165ee1	test: use `T.TempDir` to create temporary test directory The directory created by `T.TempDir` is automatically removed when the test and all its subtests complete. This commit also updates the unit test advice to use `T.TempDir` to create temporary directory in tests. Fixes: #3924 Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-03-31 09:31:36 +08:00
snir911	18dc578134	Merge pull request #3999 from fgiudici/kata-monitor_fix_help kata-monitor: fix duplicated output when printing usage	2022-03-30 18:56:59 +03:00
Francesco Giudici	a63bbf9793	kata-monitor: fix duplicated output when printing usage (default: "/run/containerd/containerd.sock") is duplicated when printing kata-monitor usage: [root@kubernetes ~]# kata-monitor --help Usage of kata-monitor: -listen-address string The address to listen on for HTTP requests. (default ":8090") -log-level string Log level of logrus(trace/debug/info/warn/error/fatal/panic). (default "info") -runtime-endpoint string Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock") (default "/run/containerd/containerd.sock") the golang flag package takes care of adding the defaults when printing usage. Remove the explicit print of the value so that it would not be printed on screen twice. Fixes: #3998 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-03-30 11:58:53 +02:00
bin	5e1c30d484	runtime: add logs around sandbox monitor For debugging purposes, add some logs. Fixes: #3815 Signed-off-by: bin <bin@hyper.sh>	2022-03-29 16:59:12 +08:00
bin	fb8be96194	runtime: stop getting OOM events when ttrpc: closed error getOOMEvents is a long-waiting call, it will retry when failed. For cases of agent shutdown, the retry should stop. When the agent hasn't detected agent has died, we can also check whether the error is "ttrpc: closed". Fixes: #3815 Signed-off-by: bin <bin@hyper.sh>	2022-03-29 16:39:01 +08:00
Bin Liu	9495316145	Merge pull request #3962 from yaoyinnan/fix/3750-VirtioMem runtime: Remove the explicit VirtioMem set and fix the comment	2022-03-29 10:20:05 +08:00
yaoyinnan	66f05c5bcb	runtime: Remove the explicit VirtioMem set and fix the comment Modify the 2Mib in the comment to 4Mib. VirtioMem is set by configuration file or annotation. And setupVirtioMem is called only when VirtioMem is true. Fixes: #3750 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-03-28 21:21:38 +08:00
Feng Wang	0928eb9f4e	agent: Kill the all the container processes of the same cgroup Otherwise the container process might leak and cause an unclean exit Fixes: #3913 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-27 10:06:58 -07:00
Jakob Naucke	ff17c756d2	runtime: Allow and require no initrd for SE Previously, it was not permitted to have neither an initrd nor an image. However, this is the exact config to use for Secure Execution, where the initrd is part of the image to be specified as `-kernel`. Require the configuration of no initrd for Secure Execution. Also - remove redundant code for image/initrd checking -- no need to check in `newQemuHypervisorConfig` (calling) when it is also checked in `getInitrdAndImage` (called) - use `QemuCCWVirtio` constant when possible Fixes: #3922 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-03-25 18:36:12 +01:00
Feng Wang	19f372b5f5	runtime: Add more debug logs for container io stream copy This can help debugging container lifecycle issues Fixes: #3913 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-24 21:35:16 -07:00
David Gibson	c77e34de33	runtime: Move mock hook source src/runtime/virtcontainers/hook/mock contains a simple example hook in Go. The only thing this is used for is for some tests in src/runtime/pkg/katautils/hook_test.go. It doesn't really have anything to do with the rest of the virtcontainers package. So, move it next to the test code that uses it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:37:35 +11:00
David Gibson	86723b51ae	virtcontainers: Remove unused install/uninstall targets We've now removed the need to install the mock hook binary for unit tests. However, it turns out that managing that was the only thing that the install and uninstall targets in the virtcontainers Makefile handled. So, remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:37:18 +11:00
David Gibson	0e83c95fac	virtcontainers: Run mock hook from build tree rather than system bin dir Running unit tests should generally have minimal dependencies on things outside the build tree. It definitely shouldn't modify system wide things outside the build tree. Currently the runtime "make test" target does so, though. Several of the tests in src/runtime/pkg/katautils/hook_test.go require a sample hook binary. They expect this hook in /usr/bin/virtcontainers/bin/test/hook, so the makefile, as root, installs the test binary to that location. Go tests automatically run within the package's directory though, so there's no need to use a system wide path. We can use a relative path to the binary built within the tree just as easily. fixes #3941 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-23 19:34:50 +11:00
David Gibson	e65db838ff	virtcontainers: Remove VC_BIN_DIR The VC_BIN_DIR variable in the virtcontainers Makefile is almost unused. It's used to generate TEST_BIN_DIR, and it's created in the install target. However, we also create TEST_BIN_DIR, which is a subdirectory of VC_BIN_DIR with mkdir -p, so it will necessarily create VC_BIN_DIR along the way. So we can drop the unnecessary mkdir and expand the definition of VC_BIN_DIR in the definition of TEST_BIN_DIR. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:53:59 +11:00
David Gibson	c20ad2836c	virtcontainers: Remove unused Makefile defines The INSTALL_EXEC and UNINSTALL_EXEC definitions from the virtcontainers Makefile (unlike those from the runtime Makefile in the parent directory) are entirely unused. Remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:40:57 +11:00
David Gibson	c776bdf4a8	virtcontainers: Remove unused parameter from go-test.sh The check-go-test target passes the path to the mock hook test binary to go-test.sh when it invokes it. But go-test.sh just calls run_go_test from ci/lib.sh, which invokes a script from the tests repo without any parameters. That is, this parameter is ignored anyway, so remove it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-03-22 16:39:22 +11:00
James O. D. Hunt	f8fb0d3bb6	Merge pull request #3322 from Kvasscn/kata_dev_block_driver_option device: using const strings for block-driver option instead of hard coding	2022-03-21 10:56:25 +00:00
Miao Xia	a2f5c1768e	runtime/virtcontainers: Pass the hugepages resources to agent The hugepages resources claimed by containers should be limited by cgroup in the guest OS. Fixes: #3695 Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>	2022-03-15 18:46:08 +08:00
Feng Wang	aa5ae6b17c	runtime: Properly handle ESRCH error when signaling container Currently kata shim v2 doesn't translate the ESRCH error, causing the container to fail to stop and the shim to leak. Fixes: #3874 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-14 11:03:05 -07:00
zhanghj	efa19c41eb	device: use const strings for block-driver option instead of hard coding Currently, the block driver option is specified by hard coding, maybe it is better to use const string variables instead of hard coded strings. Another modification is to remove duplicate consts for virtio driver in manager.go. Fixes: #3321 Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>	2022-03-14 09:20:43 +08:00
Gabriela Cervantes	ffdf961ae9	docs: Update contact link in runtime README This PR updates the contact link in the runtime README document. Fixes #3854 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-03-08 16:27:34 +00:00
Bin Liu	deb8ce97a8	Merge pull request #3836 from liubin/fix/minor-fix Enhancement: fix comments/logs and delete not used function	2022-03-07 17:26:30 +08:00
bin	1b34494b2f	runtime: fix invalid comments for pkg/resourcecontrol Some comments are copied and not adjusted to the pkg/resourcecontrol package. Fixes: #3835 Signed-off-by: bin <bin@hyper.sh>	2022-03-05 10:32:31 +08:00
Evan Foster	afc567a9ae	storage: make k8s emptyDir creation configurable This change introduces the `disable_guest_empty_dir` config option, which allows the user to change whether a Kubernetes emptyDir volume is created on the guest (the default, for performance reasons), or the host (necessary if you want to pass data from the host to a guest via an emptyDir). Fixes #2053 Signed-off-by: Evan Foster <efoster@adobe.com>	2022-03-04 12:02:42 -08:00
Eric Ernst	1e301482e7	Merge pull request #3406 from fengwang666/direct-blk-assignment Implement direct-assigned volume	2022-03-04 11:58:37 -08:00
Feng Wang	e76519af83	runtime: small refactor to improve readability Remove some confusing/duplicate code so it's more readable Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-04 10:00:52 -08:00
Fabiano Fidêncio	7e5f11a52b	vendor: Update containerd to 1.6.1 Let's bring in the latest release of Containerd, 1.6.1, released on March 2nd, 2022. With this, we take the opportunity to remove containerd/api reference as we shouldn't need a separate module only for the API. Here's the list of changes needed in the code due to the bump: * stop using `grpc.WithInsecure()` as it's been deprecated - use `grpc.WithTransportCredentials(insecure.NewCredentials())` instead Fixes: #3820 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-04 10:28:40 +01:00
Fabiano Fidêncio	2af91b23e1	Merge pull request #3281 from jongwu/vcpu_hotplug_arm64 experimentally enable vcpu hotplug and virtio-mem on arm64 in kernel part	2022-03-04 09:14:31 +01:00
Jianyong Wu	42771fa726	runtime: don't set socket and thread for arm/virt As this is just an initial vcpu hotplug support, thread and socket have not been supported. So, don't set socket and thread when hotadd cpu for arm/virt. Fixes: #3280 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-03-04 11:22:18 +08:00
Feng Wang	f905161bbb	runtime: mount direct-assigned block device fs only once Mount the direct-assigned block device fs only once and keep a refcount in the guest. Also use the ro flag inside the options field to determine whether the block device and filesystem should be mounted as ro Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	ea51ef1c40	runtime: forward the stat and resize requests from shimv2 to kata agent Translate the volume path from host-known path to guest-known path and forward the request to kata agent. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	c39281ad65	runtime: update container creation to work with direct assigned volumes During the container creation, it will parse the mount info file of the direct assigned volumes and update the in memory mount object. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	4e00c2377c	agent: add grpc interface for stat and resize operations Add GetVolumeStats and ResizeVolume APIs for the runtime to query stat and resize fs in the guest. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:57:02 -08:00
Feng Wang	e9b5a25502	runtime: add stat and resize APIs to containerd-shim-v2 To query fs stats and resize fs, the requests need to be passed to kata agent through containerd-shim-v2. So we're adding two REST APIs on the shim management endpoint. Also refactor shim management client to its own go file. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 18:56:53 -08:00
Feng Wang	6e0090abb5	runtime: persist direct volume mount info In the direct assigned volume scenario, Kata Containers persists the information required for managing the volume inside the guest on host filesystem. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 15:32:12 -08:00
Feng Wang	fa326b4e0f	runtime: augment kata-runtime CLI to support direct-assigned volume Add commands to add, remove, resize and get stats of a direct-assigned volume. These commands are expected to be consumed by CSI. Fixes: #3454 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-03-03 15:32:03 -08:00
Fabiano Fidêncio	a2422cf2a1	Merge pull request #3389 from zhsj/rm-distro-test katatestutils: remove distro constraints	2022-03-03 23:26:58 +01:00
Fabiano Fidêncio	12af632952	Merge pull request #3814 from fidencio/wip/disable-block-device-use-minor-fixes Minor fixes for the `disable_block_device_use` comments	2022-03-03 23:26:05 +01:00
Fabiano Fidêncio	af80473496	clh: stop virtiofsd if clh fails to boot up the vm If, for some reason, we're able to launch cloud hypervisor but not able to boot the VM up, the virtiofsd process would be left behind. Let's ensure, via defer, that we stop virtiofsd in case of errors. Fixes: #3819 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 19:10:37 +01:00
Fabiano Fidêncio	c54bc8e657	Merge pull request #3811 from fidencio/wip/clh-tdx-round-2 clh: tdx: Don't use sharedFS with Confidential Guests	2022-03-03 19:03:28 +01:00
Fabiano Fidêncio	97951a2d12	clh: Don't use SharedFS with Confidential Guests kata-containers/pulls#3771 added TDX support for Cloud Hypervisor, but two big things got overlooked while doing that. 1. virtio-fs, as of now, cannot be part of the trust boundary, so the Confidential Guest will not be using it. 2. virtio-block hotplug should be enabled in order to use virtio-block for the rootfs (used with the devmapper plugin). When trying to use cloud-hypervisor with TDX using virtio-fs, we're facing the following error on the guest kernel: ``` virtiofs virtio2: device must provide VIRTIO_F_ACCESS_PLATFORM ``` After checking and double-checking with virtiofs and cloud-hypervisor developers, it happens as confidential containers might put some limitations on the device, so it can't access all of the guests' memory and that's where this restriction seems to be coming from. Vivek mentioned that virtiofsd does not support VIRTIO_F_ACCESS_PLATFORM (aka VIRTIO_F_IOMMU_PLATFORM) yet, and that for encrypted guests virtiofs may not be the best solution at the moment. @sboeuf put this in a very nice way: "if the virtio-fs driver doesn't support VIRTIO_F_ACCESS_PLATFORM, then the pages corresponding to the virtqueues and the buffers won't be marked as SHARED, meaning the VMM won't have access to it". Interestingly enough, it works with QEMU, and it may be due to some change done on the patched QEMU that @devimc is packaging, but we won't take the path to figure out what was the change and patch cloud-hypervisor on the same way, because of 1. Fixes: #3810 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:40 +01:00
Fabiano Fidêncio	c30b3a9ff1	clh: Adding a volume is not supported without SharedFS As mounting volumes into the guest requires SharedFS setup, let's ensure we error out if trying to do so in a situation where SharedFS is not supported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:30 +01:00
Fabiano Fidêncio	f889f1f957	clh: introduce supportsSharedFS() supportsSharedFS() is a new method to be used to ensure that no SharedFS specifics are called when, for a reason or another, Cloud Hypervisor is in a mode where SharedFSs are not supported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:49:28 +01:00
Fabiano Fidêncio	54d27ed721	clh: introduce loadVirtiofsDaemon() Similarly to the `createVirtiofsDaemon` and `stopVirtiofsDaemon` methods, let's introduce and use loadVirtiofsDaemon, as it'll also be handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:38 +01:00
Fabiano Fidêncio	ae2221ea68	clh: introduce stopVirtiofsDaemon() Similarly to the `createVirtiofsDaemon` method, let's introduce and use its counterpart, as it'll also be handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:26 +01:00
Fabiano Fidêncio	e8bc26f90d	clh: introduce setupVirtiofsDaemon() Similarly to what's been done with the `createVirtiofsDaemon`, let's create a `setupVirtiofsDaemon` one. It will also become handy later in this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:14 +01:00
Fabiano Fidêncio	413b3b477a	clh: introduce createVirtiofsDaemon() Let's introduce and use a new `createVirtiofsDaemon` method. Its name says it all, and it'll be handy later in this series when, spoiler alert, SharedFS cannot be used (in such cases as in Confidential Guests). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 12:48:02 +01:00
James O. D. Hunt	55cd0c89d8	runtime: Build golang components with extra security options Enable stack protector and fortify source for golang builds. Fixes: #3817. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-03 10:41:26 +00:00
Fabiano Fidêncio	76e4f6a2a3	Revert "hypervisors: Confidential Guests do not support Device hotplug" This reverts commit `df8ffecde0`, as device hotplug is supported and, more than that, is very much needed when using virtio-blk instead of virtio-fs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-03 09:59:55 +01:00
Fabiano Fidêncio	fa8b93927c	config: qemu: Fix disable_block_device_use comments virtio-fs, instead of virtio-9p, is the default shared file system type in case virtio-blk is not used. Fixes: #3813 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-02 20:43:36 +01:00
Fabiano Fidêncio	9615c8bc9c	config: fc: Don't expose disable_block_device_use Relying on virtio-block is the only way to use Firecracker with Kata Containers, as shared FS (virtio-{fs,fs-nydus,9p}) is not supported by Firecracker. As configuration doesn't make sense to be exposed, we hardcode the `false` value in the Firecracker configuration structure. Fixes: #3813 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-03-02 20:43:28 +01:00
Bin Liu	2ae8bd696a	Merge pull request #3367 from wfly1998/main build: always reset ARCH after getting it	2022-03-02 14:42:45 +08:00
Bin Liu	75877f8793	Merge pull request #3187 from Kvasscn/kata_dev_remove_temp_vsock_dir virtcontainers: remove temp dir created for vsock in test code	2022-03-02 11:05:47 +08:00
Francesco Giudici	7f638dd049	Merge pull request #3764 from Jakob-Naucke/hugepages-test-s390x virtcontainers: Use available s390x hugepages	2022-03-01 14:33:59 +01:00
Fabiano Fidêncio	4ab35b0899	Merge pull request #3796 from jodh-intel/fix-monitor-listen-address Fix monitor listen address	2022-03-01 13:51:01 +01:00
Fabiano Fidêncio	97c17085b0	Merge pull request #3770 from Jakob-Naucke/gofmt-vmm-s390x runtime: Gofmt fixes	2022-03-01 11:34:15 +01:00
James O. D. Hunt	e64c54a2ad	monitor: Listen to localhost only by default Change `kata-monitor` to listen to port `8090` on the local interface only by default. > Note: > > This is a breaking change as previously it listened on all interfaces. Fixes: #3795. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-01 10:00:43 +00:00
James O. D. Hunt	e6350d3d45	monitor: Fix build options Removed redundant and duplicated build options to build `kata-monitor` the same way as the other components: - `CGO_ENABLED=0` is not necessary. - `-buildmode=exe` is not necessary since `BUILDFLAGS` already sets the build mode. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-03-01 10:00:43 +00:00
GabyCT	ccb063b848	Merge pull request #3788 from fidencio/wip/update-clh-confidential-guest-comments Update `confidential_guest` comments	2022-02-28 15:11:01 -06:00
GabyCT	bc1733bb0e	Merge pull request #3774 from egernst/delinux-runtime cleanup runtime pkgs for Darwin build, add basic Darwin build/unit test	2022-02-28 15:08:09 -06:00
Jakob Naucke	eda8ea154a	runtime: Gofmt fixes - Mostly blank lines after `+build` -- see https://pkg.go.dev/go/build@go1.14.15 -- this is, to date, enforced by `gofmt`. - 1.17-style go:build directives are also added. - Spaces in govmm/vmm_s390x.go Fixes: #3769 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-28 17:24:47 +01:00
Eric Ernst	e355a71860	container: file is not linux specific This should not be linux specific -- drop restriction. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	b31876eefb	device-manager: move linux-only test to a linux-only file We can't Mkdev on Darwin - let's make sure the vfio test is in a linux-only file. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	6a5c634490	resourcecontrol: SystemdCgroup check is not necessarily linux specific This utility function is also used to check the spec that will run in the guest - no need for this to be linux specific. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	cc58cf6993	resourcecontrol: convert stats dev_t to uint64 types Their types may differ on various host OSes, but unix.Major\|Minor always takes a uint64. Depends-on: github.com/kata-containers/tests#4516 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Eric Ernst	5be188cc29	utils: Add darwin stub Add a stub for utils_darwin to facilitate building this package on Darwin. We can probably drop this empty stub if we have better abstraction for the various parts of virtcontainers that call it today... Fixes: #3777 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	ad0449195d	virtcontainers: Convert stats dev_t to uint64 We need to convert them to uint64 as their types may differ on various host OSes, but unix.Major\|Minor takes a uint64 regardless. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	56751089c0	katautils: Use a syscall wrapper for the hook JSON state There is no real equivalent of a thread ID on Darwin. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	7d64ae7a41	runtime: Add a syscall wrapper package It allows to support syscall variations between host OSes. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Samuel Ortiz	abc681ca5f	katautils: Add Darwin stub for the netNS API And move the current implementation into a Linux only file. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-28 08:01:53 -08:00
Fabiano Fidêncio	de57466212	config: Expand confidential_guest comments Let's clarify that an error will be reported in case confidential_guest is enabled, but the hardware where Kata Containers is running doesn't provide the required feature set. Fixes: #3787 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 11:57:42 +01:00
Fabiano Fidêncio	641d475fa6	config: clh: Use "Intel TDX" instead of just "TDX" Let's use "Intel TDX" rather than just "TDX", as it can ease the understanding of the terminology. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 10:27:21 +01:00
Fabiano Fidêncio	0bafa2def9	config: clh: Mention supported TEEs Let's mention the supported TEEs to be used with confidential guests. Right now, Cloud Hypervisor supports only Intel TDX, used together with TD Shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-28 10:24:33 +01:00
bin	81ed269ed2	runtime: use Cmd.StdoutPipe instead of self-created pipe Nydusd uses a bufio.Scanner to check if nydusd process has existed, but the stderr/stdout passed to Cmd is a self-created pipe, which will not be closed if the process fails to start. Using the standard Cmd.StdoutPipe closes the stdout, so the kata shim will detect the existence of the nydusd process and then call cmd.Wait to reap the process' resources. Fixes: #3783 Signed-off-by: bin <bin@hyper.sh>	2022-02-28 16:52:49 +08:00
Eric Ernst	3997c962c2	Merge pull request #3767 from tanweernoor/02242022-kata-containers-issue-3631 runtime, config: make selinux configurable	2022-02-26 08:44:29 -08:00
Fabiano Fidêncio	a9ba7c132b	clh: Fix typo on HotplugRemoveDevice A copy and paste mistake was made and the error on HotplugRemoveDevice() should be about removal and not about addition. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 22:35:32 +01:00
Tanweer Noor	082d538cb4	runtime: make selinux configurable removes --tags selinux handling in the makefile (part of it introduced here: `d78ffd6`) and makes selinux configurable via configuration.toml Fixes: #3631 Signed-off-by: Tanweer Noor <tnoor@apple.com>	2022-02-25 10:33:46 -08:00
Fabiano Fidêncio	ea1876f057	Merge pull request #3771 from fidencio/wip/clh-tdx clh: Add TDX support	2022-02-25 18:45:31 +01:00
Samuel Ortiz	1103f5a4d4	virtcontainers: Use FilesystemSharer for sharing the containers files Switching to the generic FilesystemSharer brings 2 major improvements: 1. Remove container and sandbox specific code from kata_agent.go 2. Allow for non Linux implementations to provide ways to share container files and root filesystems with the Kata Linux guest. Fixes #3622 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	533c1c0e86	virtcontainers: Keep all filesystem sharing prep code to sandbox.go With the Linux implementation of the FilesystemSharer interface, we can now remove all host filesystem sharing code from kata_agent and keep it where it belongs: sandbox.go. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	61590bbddc	virtcontainers: Add a Linux implementation for the FilesystemSharer This gathers the current kata agent and container filesystem sharing code into a FilesystemSharer implementation. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Samuel Ortiz	03fc1cbd7e	virtcontainers: Add a filesystem sharing interface Filesystem sharing here means the ability to share some parts of the host filesystem with the guest. It's mostly about sharing files and container bundle root filesystems. In order to allow for different file and rootfs sharing implementations, we define a FilesystemSharer interface. This interface provides a preparation step, where concrete implementations will be able to e.g. prepare the host filesystem. Then it provides 2 methods, one for sharing any file (regular file or a directory) and another one for sharing a container root filesystem. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-25 17:22:27 +01:00
Fabiano Fidêncio	72434333aa	clh: Add TDX support Let's enable TDX support for Cloud Hypervisor, using td-shim as its desired firmware. Fixes: #3632 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	a13b4d5ad8	clh: Add firmware to the config file "firmware" option was already present for a while, but it's never been exposed to the configuration file before. Let's do it now as it can be used, in combination with the newly added confidential_guest option, to boot a guest VM using the so called `td-shim`[0] with Cloud Hypervisor. [0]: https://github.com/confidential-containers/td-shim Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	a8827e0c78	hypervisors: Confidential Guests do not support NVDIMM NVDIMM is also not supported with Confidential Guests and Virtio Block devices should be used instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	f50ff9f798	hypervisors: Confidential Guests do not support Memory hotplug Similarly to VCPUs and Device hotplug, Confidential Guests also do not support Memory hotplug. Let's make it clear in the documentation and guard the code on both QEMU and Cloud Hypervisor side to ensure we don't advertise Memory hotplug as being supported when running Confidential Guests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	df8ffecde0	hypervisors: Confidential Guests do not support Device hotplug Similarly to VCPUs hotplug, Confidential Guests also do not support Device hotplug. Let's make it clear in the documentation and guard the code on both QEMU and Cloud Hypervisor side to ensure we don't advertise Device hotplug as being supported when running Confidential Guests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	28c4c044e6	hypervisors: Confidential Guests do not support VCPUs hotplug As confidential guests do not support VCPUs hotplug, let's set the "DefaultMaxVCPUs" value to "NumVCPUs". The reason to do this is to ensure that guests will be started with the correct amount of VCPUs, without giving the guest all the possible VCPUs the host could provide. One clear side effect of this limitation is that workloads that would require more VCPUs in their yaml definition will not run in this scenario. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	29ee870d20	clh: Add confidential_guest to the config file ConfidentialGuest is an option already present and exposed for QEMU, which is used for using Kata Containers together with different sorts of Guest Protections, such as TDX and SEV for x86_64, PEF for ppc64le, and SE for s390x. Right now we error out in case confidential_guest is enabled, as we will be implementing the needed blocks for this as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	9621c59691	clh: refactor image / initrd configuration set This is a small code refactor removing dead code based on the checks already done in the generic hypervisor abstraction. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	dcdc412e25	clh: use common kernel params from the hypervisor code The hypervisor code already defines 3 common kernel root params for the following cases: * NVDIMM * NVDIMM without DAX support * Virtio Block As parameters used for cloud-hypervisor have an overlap with the ones provided by the NVDIMM case, let's take advantage of that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:21 +01:00
Fabiano Fidêncio	4c164afbac	versions: Update Cloud Hypervisor to 5343e09e7b8db Let's bump the Cloud Hypervisor version to 5343e09e7b8db, as that brings a few fixes we're interested in, such as: * hypervisor, vmm: Handle TDX hypercalls with INVALID_OPERAND - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3723 - This is needed for the TDX support on the cloud hypervisor driver, which is part of this very same series. * openapi: Update the PciBdf types - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3748 - This is needed due to a change in a DeviceNode field, which would cause a marshalling / demarshalling error when running with a version of cloud-hypervisor that includes the TDX fixes mentioned above. * scripts: dev_cli: Don't quote $features_build * scripts: dev_cli: Add --features option - https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3773 - This is needed due to changes in the scripts used to build Cloud Hypervisor, which are used as part of Kata Containers CIs and github actions. Due to this change, we're also adapting the build scripts as part of this very same commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-25 16:49:16 +01:00
Jakob Naucke	bbfe7d6591	Merge pull request #3599 from Jakob-Naucke/no-virtio-rng-ccw virtcontainers: Do not add a virtio-rng-ccw device	2022-02-25 15:27:02 +01:00
Francesco Giudici	3da6006de4	Merge pull request #3751 from fgiudici/kata-monitor_issue3705 kata-monitor: fix collecting metrics for sandboxes not started through CRI	2022-02-25 14:53:12 +01:00
Jakob Naucke	b2a65f9031	virtcontainers: Use available s390x hugepages in TestHandleHugepages. On s390x, hugepage sizes must be set at boot, so test with any that are present (default is 1M). Depends-on: github.com/kata-containers/kata-containers#3770 Fixes: #3763 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-25 13:11:00 +01:00
Amulyam24	cb4230e60e	runtime: fix package declaration for ppc64le Incorrect package name causes build to fail. Fix it in vm_ppc64le.go Fixes: #3761 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2022-02-24 15:31:48 +05:30
Eric Ernst	c6cc038364	Merge pull request #3615 from sameo/topic/hypervisor Make the hypervisor framework not Linux specific	2022-02-23 16:02:00 -08:00
Francesco Giudici	fec26f8e51	kata-monitor: trivial: rename symbols & labels We introduced collection of sandboxes metadata from the CRI that will be attached to the sandbox metrics: this will allow to immediately match sandboxes metrics with CRI workloads. Rename the symbols from Kube to CRI as the metadata will be there every time pods are created through CRI, also if kubernetes is not installed (e.g., 'crictl runp'). Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 18:34:32 +01:00
Samuel Ortiz	9fd4e5514f	runtime: Move the resourcecontrol package one layer up And try to reduce the number of virtcontainers packages, step by step. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	823faee83a	virtcontainers: Rename the cgroups package To resourcecontrol, and make it consistent with the fact that cgroups are a Linux implementation of the ResourceController interface. Fixes: #3601 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	0d1a7da682	virtcontainers: Rename and clean the cgroup interface We call it a ResourceController, and we make it not so Linux specific. Now the Linux implementation is the cgroups one. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	ad10e201e1	virtcontainers: cgroups: Move non Linux routine to utils.go Have an OS agnostic file for sharing routines. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Samuel Ortiz	d49d0b6f39	virtcontainers: cgroups: Define a cgroup interface And move the current, Linux-specific implementation into cgroups_linux.go Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-23 15:48:40 +01:00
Francesco Giudici	3ac52e8193	kata-monitor: fix updating sandbox cache at startup We now rely on fs events only to update the sandbox cache. This is not true anyway for sandboxes already present at kata-monitor startup: we just retrieve the list and add them in the cache only when we get their CRI metadata. If CRI metadata is not available we will never add them to the sandbox cache. Fix this by immediately adding the sandboxes we find at startup time to the sandbox cache. Fixes: #3705 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 11:21:06 +01:00
Francesco Giudici	160bb62138	kata-monitor: bump version to 0.3.0 Since kata-monitor now: - relies on fs events only to update the sandbox cache - adds CRI meta-data as labels (CRI pod name, namespace and uid) it deserves a version bump. Note that while we could let kata-monitor match the runtime version, kata-monitor will usually work flawlessly with different kata shim releases: so it makes sense to keep kata-monitor version separated. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-02-23 11:17:02 +01:00
Fabiano Fidêncio	6a9e5f90f7	Merge pull request #3670 from sameo/topic/nerdctl Support nerdctl OCI hooks	2022-02-22 23:03:33 +01:00
Fabiano Fidêncio	4729fd0fc2	Merge pull request #3736 from liubin/fix/3733-log-events-for-crio shim: log events for CRI-O	2022-02-22 09:19:37 +01:00
bin	f6fc1621f7	shim: log events for CRI-O CRI-O starts the shim process without setting TTRPC_ADDRESS, so the event-forwarding goroutine gets errors. For the CRI-O runtime, we can log the events to the log file. Fixes: #3733 Signed-off-by: bin <bin@hyper.sh>	2022-02-22 11:02:50 +08:00
Fabiano Fidêncio	1e9f3c856d	Merge pull request #3553 from fgiudici/kata-monitor_cachefix kata-monitor: simplify sandbox cache management and attach kubernetes POD metadata to metrics	2022-02-21 13:17:22 +01:00
Peng Tao	031da99914	Merge pull request #3687 from luodw/nydus-clh nydus: add lazyload support for kata with clh	2022-02-21 19:31:45 +08:00
luodaowen.backend	3175aad5ba	virtiofs-nydus: add lazyload support for kata with clh As kata with qemu already supports lazyload, this PR aims to bring lazyload ability to kata with clh. Fixes #3654 Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>	2022-02-19 21:55:31 +08:00
zhanghj	94b831ebf8	virtcontainers: remove temp dir created for vsock in test code remove temp dir generated by mock.GenerateKataMockHybridVSock(). Fixes: #3186 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-02-19 16:59:15 +08:00
Archana Shinde	7db9bef72c	Merge pull request #3718 from Kvasscn/kata_dev_fix_utils_assert_msg virtcontainers: Remove duplicated assert messages in utils test code	2022-02-18 06:07:16 -08:00
Samuel Ortiz	27de212fe1	runtime: Always add network endpoints from the pod netns As the container runtime, we're never inspecting, adding or configuring host networking endpoints. Make sure we always do that by wrapping addSingleEndpoint calls into the pod network namespace. Fixes #3661 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-18 10:37:07 +01:00
zhanghj	1cee0a9452	virtcontainers: Remove duplicated assert messages in utils test code Remove duplicated strings in assert.Errorf() and assert.NoErrorf(). Fixes: #3714 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2022-02-18 16:45:05 +08:00
Samuel Ortiz	77c29bfd3b	container: Remove VFIO lazy attach handling With the recently added VFIO fixes and support, we should not need that anymore. Fixes #3108 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-17 08:39:44 +01:00
GabyCT	ced5e910d5	Merge pull request #3558 from jodh-intel/docs-rework-readme docs: Improve top-level README	2022-02-16 16:28:14 -06:00
Fabiano Fidêncio	6f9685fbf5	Merge pull request #3624 from mdlayher/mdl-vsock runtime: use github.com/mdlayher/vsock@v1.1.0	2022-02-16 23:11:47 +01:00
Samuel Ortiz	26b3f0017c	virtcontainers: Split hypervisor into Linux and OS agnostic bits Keep all the OS agnostic bits in the hypervisor.go and hypervisor_ARCH.go files. Fixes #3614 Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:15:31 +01:00
Samuel Ortiz	fa0e9dc6b1	virtcontainers: Make all Linux VMMs only build on Linux Some of them (e.g. QEMU) can run on other OSes (e.g. Darwin) but the current virtcontainers implementation is Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:34 +01:00
Samuel Ortiz	c91035d0e1	virtcontainers: Move non QEMU specific constants to hypervisor.go Hotplugging errors and 9pfs size are not particularly QEMU specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:34 +01:00
Samuel Ortiz	10ae05914c	virtcontainers: Move guest protection definitions to hypervisor.go They're not QEMU specific, other VMMs may implement support for it. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:07:31 +01:00
Samuel Ortiz	b28d0274ff	virtcontainers: Make max vCPU config less QEMU specific Even though it's still actually defined as the QEMU upper bound, it's now abstracted away through govmm. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:06:32 +01:00
Samuel Ortiz	a5f6df6a49	govmm: Define the number of supported vCPUs per architecture Based on what QEMU supports on those architectures. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-16 19:06:32 +01:00
Fabiano Fidêncio	be2e90469a	Merge pull request #3669 from fidencio/wip/virtiofsd-use-announce-submounts virtiofsd: Use "-o announce_submounts"	2022-02-16 16:43:18 +01:00
James O. D. Hunt	9818cf7196	docs: Improve top-level and runtime README Various improvements to the top-level README file: - Moved the following sections from the runtime's README to the top-level README: - License - Platform support / Hardware requirements - Added the following sections to the top-level README: - Configuration - Hypervisors - Improved formatting of the Documentation section in the top-level README. - Removed some unused named links from the top-level README. Also improvements to the runtime README: - Removed confusing mention of the old 1.x runtime name. - Clarify the binary name for the 2.x runtime and the utility program. > Note: > > We cannot currently link to the AMD website as that site's > configuration causes the CI static checks to fail. See > https://github.com/kata-containers/tests/issues/4401 Fixes: #3557. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-02-16 09:52:48 +00:00
bin	81a8baa5e5	runtime: add hugepages support Add hugepages support, port from: `b486387cba` Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: bin <bin@hyper.sh>	2022-02-16 15:14:53 +08:00
bin	7df677c01e	runtime: Update calculateSandboxMemory to include Hugepages Limit Support hugepages and port from: `96dbb2e8f0` Fixes: #3342 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: bin <bin@hyper.sh>	2022-02-16 15:14:37 +08:00
Samuel Ortiz	4f96e3eae3	katautils: Pass the nerdctl netns annotation to the OCI hooks We need to let nerdctl know which namespace to use when calling the selected CNI plugin. See https://github.com/containerd/nerdctl/issues/787 Fixes: #1935 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 18:11:23 +01:00
Samuel Ortiz	a871a33b65	katautils: Run the createRuntime hooks The preStart hooks are being deprecated over the createRuntime ones. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
Samuel Ortiz	d9dfce1453	katautils: Run the preStart hook in the host namespace The OCI spec is very specific about it: "The prestart hooks MUST be executed in the runtime namespace." Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
Samuel Ortiz	6be6d0a3b3	katautils: Pass the OCI annotations back to the called OCI hooks That allows us to amend those annotations with information that could be used when running those hooks. For example nerdctl will use those annotations to resolve the networking namespace path in which to run the CNI plugin, i.e. the created pod networking namespace. Fixes #3629 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-15 17:31:56 +01:00
Fabiano Fidêncio	4bd945b67b	virtiofsd: Use "-o announce_submounts" German Maglione, one of the current virtio-fs developers, has brought to our attention that using "announce-submounts" could help us to prevent inode number collisions. This feature was introduced a year ago or so by Hanna Reitz as part of the 08dce386e77eb9ab044cb118e5391dc9ae11c5a8, and as we already mandate QEMU >= 6.1.0, let's take advantage of that. Fixes: #3507 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-15 08:52:03 +01:00
Yu Li	37df1678ae	build: always reset ARCH after getting it When building with `ARCH=x86_64`, the previous `Makefile` will use it without checking and cause: Makefile:319: *** "ERROR: No hypervisors known for architecture x86_64 (looked for: acrn firecracker qemu cloud-hypervisor)". Stop. This commit fixes the above issue by checking `ARCH` no matter where it is assigned. Fixes: #3444 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2022-02-15 14:26:34 +08:00
Shengjing Zhu	3a641b56f6	katatestutils: remove distro constraints The distro constraint parses os release files, which may not contain distro version(VERSION_ID field), for example rolling release distributions like Debian testing, archlinux. These distro constraints are not used anyway, so removing them instead of fixing the complex version detection. Fixes: #1864 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2022-02-15 02:11:52 +08:00
Fabiano Fidêncio	90fd625d0c	versions: Update Cloud Hypervisor to 55479a64d237 Let's update cloud-hypervisor to a version that exposes the TDX support via the OpenAPI's auto-generated code. Fixes: #3663 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-02-14 17:32:30 +01:00
James O. D. Hunt	8f80dffead	Merge pull request #3648 from yaoyinnan/index-in-for runtime: The index variable is initialized multiple times in for	2022-02-14 12:36:46 +00:00
Bin Liu	cf53ec2c71	Merge pull request #2977 from luodw/support_nydus feature(nydusd): add nydusd support to introduce lazyload ability	2022-02-14 13:08:50 +08:00
Matt Layher	c1ce67d905	runtime: use github.com/mdlayher/vsock@v1.1.0 Fixes #3625 Signed-off-by: Matt Layher <mdlayher@gmail.com>	2022-02-12 19:57:15 -05:00
yaoyinnan	42a878e6c1	runtime: The index variable is initialized multiple times in for Change the variables `mountTypeFieldIdx := 8`, `mntDestIdx := 4` and `netNsMountType := "nsfs"` to const. And unify the variable naming style, modify `mntDestIdx` to `mountDestIdx`. Fixes: #3646 Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2022-02-12 11:10:10 +08:00
luodaowen.backend	2d9f89aec7	feature(nydusd): add nydusd support to introduce lazyload ability Pulling the image is the most time-consuming step in the container lifecycle. This PR introduces nydus to kata containers, which can lazily pull the image when the container starts. So it can speed up kata container create and start. Fixes #2724 Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>	2022-02-11 21:41:17 +08:00
Daniel Höxtermann	b19b6938a8	docs: Fix relative links in Markdown Relative links within this repository allow for easier navigation to the corresponding file / directory in the current commit / for the selected version. Link text was slightly changed / fixed in - docs/Unit-Test-Advice.md - docs/how-to/how-to-run-docker-with-kata.md Fixes #3045 Signed-off-by: Daniel Höxtermann <daniel@hxtm.dev>	2022-02-11 13:49:42 +01:00
Julio Montes	982f14fa66	runtime: support QEMU SGX Enable SGX in QEMU when `sgx.intel.com/epc` annotation is defined fixes #3436 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-10 09:45:48 -06:00
Samuel Ortiz	07b9d93f5f	virtcontainer: Simplify the sandbox network creation flow We don't need to call NewNetwork() twice, and we can have the VM factory case return immediately. That makes the code more readable. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	2c7087ff42	virtcontainers: Make all endpoints Linux only All of the networking endpoints are Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	49d2cde1e2	virtcontainers: Split network tests into generic and OS specific parts Some unit tests are generic while others, mostly because they depend on netlink, are Linux specific. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	0269077ebf	virtcontainers: Remove the netlink package dependency from network.go Move the netlink dependent code into network_linux.go. Other OSes will have to provide the same functions. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	7fca5792f7	virtcontainers: Unify Network endpoints management interface And only have AddEndpoints/RemoveEndpoints for all cases (single endpoint vs all of them, hotplug or not). Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	c67109a251	virtcontainers: Remove the Network PostAdd method It's used once by the sandbox code and can be implemented directly there. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	e0b264430d	virtcontainers: Define a Network interface And move the Linux implementation into a GOOS specific file. Fixes #3005 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	5e119e90e8	virtcontainers: Rename the Network structure fields and methods We are converting the Network structure into an interface, so that different host OSes can have different networking implementations for Kata. One step into that direction is to rename all the Network structure fields and methods to something that is less Linux networking namespace specific. This will make the Network interface naming consistent. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	b858d0dedf	virtcontainers: Make all Network fields private Prepare for making it a real interface. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	49eee79f5f	virtcontainers: Remove the NetworkNamespace structure It is now replaced with a single Network structure Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	844eb61992	virtcontainers: Have CreateVM use a Network reference We are replacing the NetworkingNamespace structure with the Network one, so we should have the hypervisor interface switching to it as well. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	d7b67a7d1a	virtcontainers: Network API cleanups and simplifications Remove unused parameters. Reduce the number of parameters by deriving some of them (e.g. a networking config) from their outer structure (e.g. a Sandbox reference). Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	2edea88369	virtcontainers: Make the Network structure manage endpoints Endpoint creation, attachment and hotplug are bound to the networking namespace described through the Network structure. Making them Network methods is natural and simplifies the code. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Samuel Ortiz	8f48e28325	virtcontainers: Expand the Network structure For simplicity's sake, there should only be one networking structure per sandbox, as opposed to two (Network and NetworkingNamespace) currently. This commit starts expanding the Network structure in order to eventually make it the single representation of a virtcontainers sandbox networking. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2022-02-08 22:27:53 +01:00
Pierre Kohler	5ef522f7c3	runtime: check kvm module `sev` correctly Runtime now accepts both `1` and `Y` as valid values for kvm_amd module parameter kvm_amd.sev. Fixes #3273 Signed-off-by: Pierre Kohler <pierre.kohler@cysec.systems>	2022-02-07 23:48:47 +01:00
Eric Ernst	e8eb5e8295	Merge pull request #3609 from egernst/rootless-linux virtcontainers: Split the rootless package into OS specific parts	2022-02-03 12:19:31 -08:00
Jakob Naucke	7ffe9e5198	virtcontainers: Do not add a virtio-rng-ccw device On s390x, skip adding a virtio-rng device. The on-chip CPACF provides entropy instead. For Confidential Containers, when using Secure Execution, entropy attacks on virtio-rng are mitigated. Fixes: #3598 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-02-02 17:06:20 +01:00
Julio Montes	1f29478b09	runtime: support split firmware Firmware can be split into FIRMWARE_VARS.fd (UEFI variables as configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI variables can be customized per user while UEFI code is kept the same. fixes #3583 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-02-01 13:40:19 -06:00
Samuel Ortiz	14e7f52a91	virtcontainers: Split the rootless package into OS specific parts Move the netns specific bits into a Linux specific file. Fixes: #3607 Signed-off-by: Samuel Ortiz <s.ortiz@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-28 16:20:28 -08:00
James O. D. Hunt	7c956e0d27	virtcontainers: Enable initrd for Cloud Hypervisor Since CH has supported booting with an initramfs since version 0.7.0 [1], allow an `initrd=` to be specified. Fixes: #3566. [1] - https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v0.7.0 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2022-01-28 10:49:10 +00:00
Eric Ernst	a5ebeb96c1	Merge pull request #2941 from egernst/sandbox-sizing-feature Sandbox sizing feature	2022-01-27 09:37:57 -08:00
Eric Ernst	8cde54131a	runtime: introduce static sandbox resource management There are software and hardware architectures which do not support dynamically adjusting the CPU and memory resources associated with a sandbox. For these, today, they rely on "default CPU" and "default memory" configuration options for the runtime, either set by annotation or by the configuration toml on disk. In the case of a single container (launched by ctr, or something like "docker run"), we could allow for sizing the VM correctly, since all of the information is already available to us at creation time. In the sandbox / pod container case, it is possible for the upper layer container runtime (i.e., containerd or crio) to send a specific annotation indicating the total workload resource requirements associated with the sandbox creation request. In the case of sizing information not being provided, we will follow the same behavior as today: start the VM with (just) the default CPU/memory. If this information is provided, we'll track this as Workload specific resources, and track default sizing information as Base resources. We will update the hypervisor configuration to utilize Base+Workload resources, thus starting the VM with the appropriate amount of CPU and memory. In this scenario (we start the VM with the "right" amount of CPU/Memory), we do not want to update the VM resources when containers are added, or adjusted in size. This functionality is introduced behind a configuration flag, `static_sandbox_resource_mgmt`. This is defaulted to false for all configurations except Firecracker, which is set to true. This'll greatly improve UX for folks who are utilizing Kata with a VMM or hardware architecture that doesn't support hotplug. Note, users will still be unable to do in place vertical pod autoscaling or other dynamic container/pod sizing with this enabled. Fixes: #3264 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-26 09:04:38 -08:00
Eric Ernst	c3e97a0a22	config: updates to configuration clh, fc toml template There's some cruft -- let's update to reflect reality, and ensure that we match what is expected. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-26 09:45:50 -08:00
Francesco Giudici	ab447285ba	kata-monitor: add kubernetes pod metadata labels to metrics Add the POD metadata we get from the container manager to the metrics by adding more labels. Fixes: #3551 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	834e199eee	kata-monitor: drop unused functions Drop the functions we are not using anymore. Update the tests too. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	7516a8c51b	kata-monitor: rework the sandbox cache sync with the container manager Kata-monitor detects started and terminated kata pods by monitoring the vc/sbs fs (this makes sense since we will have to access that path to access the sockets there to get the metrics from the shim). While kata-monitor updates its sandbox cache based on the sbs fs events, it will also schedule a sync with the container manager via the CRI in order to sync the list of sandboxes there. The container manager will be the ultimate source of truth, so we will stick with the response from the container manager, removing the sandboxes not reported from the container manager. It may happen anyway that when we check the container manager, the new kata pod is not reported yet, and we will remove it from the kata-monitor pod cache. If we don't get any new kata pod added or removed, we will not check with the container manager again, missing reporting metrics about that kata pod. Let's stick with the sbs fs as the source of truth: we will update the cache just following what happens on the sbs fs. At this point we may have also decided to drop the container manager connection... better instead to keep it in order to get the kube pod metadata from it, i.e., the kube UID, Name and Namespace associated with the sandbox. Every time we get a new sandbox from the sbs fs we will try to retrieve the pod metadata associated with it. Right now we just attach the container manager sandbox id as a label to the exposed metrics, making it hard to link the metrics to the running pod in the kubernetes cluster. With kubernetes pod metadata we will be able to add them as labels to map explicitly the metrics to the kubernetes workloads. Fixes: #3550 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	e78d80ea0d	kata-monitor: silently ignore CHMOD events on the sandboxes fs We currently WARN about unexpected fs events, which includes CHMOD operations (which should be actually expected...). Just ignore all the fs events we don't care about without any warn. We dump all the events with debug log in any case. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Francesco Giudici	e9eb34cea8	kata-monitor: improve debug logging Improve debug log formatting of the sandbox cache update process. Move raw and tracing logs from the DEBUG to the TRACE log level. Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2022-01-26 13:48:45 +01:00
Fabiano Fidêncio	f7c7dc8d33	Merge pull request #3504 from Jakob-Naucke/s390x-govmm-tests Fix and re-enable s390x GoVMM tests	2022-01-26 12:57:38 +01:00
Archana Shinde	081a235efe	Merge pull request #3540 from bradenrayhorn/fix-negative-memory-limit runtime: fix handling container spec's memory limit	2022-01-25 05:17:05 -08:00
Braden Rayhorn	fc0e095180	runtime: fix handling container spec's memory limit The OCI container spec specifies a limit of -1 signifies unlimited memory. Update the sandbox memory calculator to reflect this part of the spec. Fixes: #3512 Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>	2022-01-24 13:30:32 -06:00
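A minimal Go sketch of the idea in the commit above: a negative limit means "unlimited" and so should not be added to the sandbox total. The function name `sandboxMemoryBytes` is hypothetical. Treating an unlimited container as contributing nothing extra to the sum is an assumption of this example, not a statement about the runtime's actual calculator.

```go
package main

import "fmt"

// sandboxMemoryBytes sums the memory limits of all containers.
// A nil limit means no limit was set. A negative limit (such as -1)
// means unlimited memory per the OCI spec. Both are skipped here,
// which is an assumption of this sketch.
func sandboxMemoryBytes(limits []*int64) int64 {
	var total int64
	for _, l := range limits {
		if l == nil || *l < 0 {
			continue
		}
		total += *l
	}
	return total
}

func main() {
	a, b, unlimited := int64(256<<20), int64(512<<20), int64(-1)
	fmt.Println(sandboxMemoryBytes([]*int64{&a, &unlimited, nil, &b}))
}
```

Without the negative check, the -1 would be added to the total and silently shrink the computed sandbox size by one byte, or wrap around if the sum were later converted to an unsigned type.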
Jakob Naucke	016569fd8e	Merge pull request #3476 from bergwolf/runtime-dep runtime: update runc and image-spec dependencies	2022-01-24 15:53:43 +01:00
Peng Tao	5643c6dcae	runtime: update runc and image-spec dependencies To address two depbot security warnings. Fixes: #3475 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2022-01-24 11:49:05 +08:00
Bo Chen	94b343492d	Merge pull request #3520 from likebreath/0120/clh_v21.0 Upgrade to Cloud Hypervisor v21.0	2022-01-21 08:08:13 -08:00
Jakob Naucke	2f37165f46	govmm: Unite VirtioNet tests no explicit PCI test, just switch path depending on architecture (CCW for s390x, PCI for others). Also fixes an unknown variable error. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	4a428fd1c5	govmm: readonly=on in s390x blkdev test Forgotten in `b17f07395c`, also fixes a test. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	79ecebb280	govmm: TestAppendPCIBridgeDevice et al. on !s390x s390x uses CCW, also fixes a lint failure about undeclared variables on s390x. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	dc285ab1d7	govmm: Remove unnecessary comma in iommu_platform in FSDevice.QemuParams for VirtioCCW. Forgotten in `ff34d283db`, also fixes a test. Fixes: #3500 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Jakob Naucke	d23f2eb0f0	govmm: Revert "govmm: s390x: Skip broken tests" This reverts commit `5ce9011a36`. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-21 13:00:05 +01:00
Amulya Meka	f52ce302bc	runtime: rectify passing empty options to -ldflags When no options are passed to -ldflags, it passes incorrect values(in this case, $BUILDFLAGS) to it. Fix passing empty values by passing $KATA_LDFLAGS in quotes. Fixes: #3521 Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>	2022-01-21 06:57:52 +00:00
Bo Chen	2d799cbfa3	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v21.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-01-20 17:48:10 -08:00
Fabiano Fidêncio	5ce9011a36	govmm: s390x: Skip broken tests For now a bunch of tests are simply not working. Let's skip them all, and re-enable them once kata-containers/kata-containers/issues/3500 gets fixed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-20 01:04:35 +01:00
Fabiano Fidêncio	8bcaed0b4f	govmm: Adapt license headers to kata-containers Both projects follow the same license, Apache-2.0, but the header saying that comes from govmm is different from the one expected for the tests present on the kata-containers repo. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	6dd6577986	govmm: Ignore govet checks, at least for now govet checks have been ignored on govmm repo, but those are enabled on kata-containers one. So, in order to avoid failing our CIs let's just keep ignoring the checks for the govmm structs and have an issue opened for fixing it whenever someone has cycles to do it. The important bit here is, we're not making anything worse than it already is. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	de678a3aaa	govmm: Remove non-relevant top files govmm, from now on, should follow the same guidelines from contributing, copying, and etc as kata-containers does. The go.mod is not needed anymore as the project lives inside the runtime. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	ec6655af87	govmm: Use govmm from our own pkg Let's stop using govmm from kata-containers/govmm and let's start using it from our own repo. Fixes: #3495 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-01-19 18:02:46 +01:00
Fabiano Fidêncio	fb7f98bd2e	Merge govmm into kata-containers	2022-01-19 09:40:15 +01:00
Julio Montes	c0e28b54a1	Merge pull request #3460 from devimc/2021-01-17/vendorGovmm vendor: update govmm	2022-01-18 15:54:11 -06:00
Julio Montes	49223e67af	runtime: remove enable_swap option `enable_swap` option was added a long time ago to add `-realtime mlock=off` to the QEMU's command line. Kata now supports QEMU 6, `-realtime` option has been deprecated and `mlock=on` is causing unexpected behaviors in kata. This patch removes support for `enable_swap`, `-realtime` and `mlock=` since they are causing bugs in kata. Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-01-18 11:12:29 -06:00
Jakob Naucke	5285ac2b57	runtime: -Wl,--s390-pgste for s390x for linking. Required for basic KVM checks on some kernels (e.g. the one RHEL is currently shipping), cf. `6621441db5/target/s390x/kvm/meson.build (L15-L16)`. Fixes: #3469 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2022-01-18 11:32:03 +01:00
Julio Montes	41e0c414a4	vendor: update govmm bring SGX support and other fixes shortlog: `8939b0f` qemu: add support for SGX `b17f073` qemu: update readonly flag for block devices `f971801` qemu: only set wait parameter for server mode socket based char device `82cc01d` qemu: Fix 32 bit int overflow in test file `1d1a231` qemu: Add support for legacy serial device `9a2bbed` qemu: Remove -realtime in favor of -overcommit `fe83c20` qemu: Add support for --no-shutdown Knob `1ed5271` qmp: wait for POWERDOWN event in ExecuteSystemPowerdown() fixes #3080 Signed-off-by: Julio Montes <julio.montes@intel.com>	2022-01-17 09:20:47 -06:00
Sebastian Hasler	adffd3f8b6	scripts: Use shebang /usr/bin/env bash Not all distros have `/bin/bash`, e.g. NixOS. Fixes: #3450 Signed-off-by: Sebastian Hasler <sebastian.hasler@stuvus.uni-stuttgart.de>	2022-01-13 22:53:28 +01:00
liangxianlong	878ab93c15	runtime: Provide protection for shared data The k.reqHandlers should be protected by locks when used Fixes #3440 Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn>	2022-01-13 14:48:10 +08:00
James O. D. Hunt	ef835b5948	Merge pull request #3418 from yangfeiyu20102011/main runtime: it should rollback when failed in Sandbox AddInterface	2022-01-12 10:22:36 +00:00
bin	85f5ae190e	runtime: close span before return from function in case of error Return before closing span will cause invalid spans, so span should be closed before function return. Fixes: #3424 Signed-off-by: bin <bin@hyper.sh>	2022-01-11 19:45:41 +08:00
yangfeiyu	b133a2368a	runtime: it should rollback when failed in Sandbox AddInterface When Sandbox AddInterface() is called, it may fail after endpoint.HotAttach, we'd better rollback and call save() in the end. Fixes: #3419 Signed-off-by: yangfeiyu <yangfeiyu20102011@163.com>	2022-01-11 18:43:43 +08:00
Feng Wang	c486c2ca18	agent: fix the broken protobuf generation code After the protocols are moved to upper libs (PR3355), the runtime protocol generation is broken. This fixes it. Fixes: #3414 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2022-01-10 15:37:00 -08:00
Gabriela Cervantes	ad16d75c07	runtime: Remove docker comments for kata 2.0 configuration.tomls This PR removes the reference of how to use disable_new_netns configuration with docker as for kata 2.0 we are not supporting docker and this information was used for kata 1.x Fixes #3400 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2022-01-06 16:08:10 +00:00
Eric Ernst	e073c0936b	Merge pull request #3279 from egernst/containerd-vendor-bump vendor: update to containerd v1.6.0-beta.4	2022-01-05 11:13:05 -08:00
Bin Liu	94f14cf6f7	Merge pull request #3363 from zhsj/remove-binary vc: remove swagger binary	2022-01-05 20:40:33 +08:00
Bin Liu	b2166560fa	Merge pull request #3375 from zhaojizhuang/debianrootfs osbuilder: Restore Debian as a rootfs	2022-01-05 10:27:47 +08:00
Eric Ernst	7b03d78f15	vendor: update to containerd v1.6.0-beta.4 Update our containerd vendoring. In particular, we're interested in grabbing the updated annotation definitions for defining sandbox sizing. - go get github.com/containerd/containerd@v1.6.0-beta.4 - edit go.mod to remove containerd v1.5.8 replacement directive - go mod vendor - go mod tidy Fixes: #3276 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2022-01-04 17:15:17 -08:00
zhaojizhuang	3093f93a6f	osbuilder: Restore Debian as a rootfs Restore Debian as a rootfs. 1. revert of #3154, but some change 2. update debian version to 10.11 3. update `libstdc++-6-dev` to `libstdc++-8-dev` 4. changes discarded in QAT are not restored Fixes: #3372 Signed-off-by: zhaojizhuang <571130360@qq.com>	2022-01-04 11:54:34 +08:00
zhanghj	2254fa8657	runtime: fix a typo in kata-collect-data.sh Fix a typo in the check for whether the mountpoint exists. Fixes: #3365 Signed-off-by: zhanghj <zhanghj.lc@inspur.com>	2021-12-28 10:03:18 +08:00
Shengjing Zhu	2d0f9d2d06	vc: remove swagger binary Fixes: #3362 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2021-12-25 22:41:29 +08:00
Fupan Li	0fe20854e7	Merge pull request #2481 from Bevisy/main-1494 Makefile: update `make go-test` call	2021-12-24 09:57:06 +08:00
Jakob Naucke	137e217b85	docs: Fix outdated k8s link in virtcontainers readme Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-12-22 19:40:25 +01:00
James O. D. Hunt	2ebae2d279	Merge pull request #3287 from jodh-intel/docs-split-arch-doc Split architecture doc into separate files	2021-12-20 10:11:30 +00:00
Chelsea Mafrica	1653dd4a30	tracing: Add span name to logging error Add span name to logging error to help with debugging when the context is not set before the span is created. Fixes #3289 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-12-16 12:44:42 -08:00
James O. D. Hunt	6f9efb4043	docs: Move arch doc to separate directory Move the architecture document into a new `docs/design/architecture/` directory in preparation for splitting it into more manageable pieces. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-12-16 12:26:17 +00:00
Eric Ernst	3865a1bcf6	Merge pull request #2918 from egernst/update-container-type-handling update container type handling	2021-12-15 10:41:23 -08:00
Eric Ernst	7a989a8333	runtime: api-test: fixup not clear why this was commented out before -- ensure that we set appropriate annotation on the sandbox container's annotations to indicate this is a sandbox. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-14 18:55:18 -08:00
Eric Ernst	52f79aef91	utils: update container type handling Today we assume that if the CRI/upper layer doesn't provide a container type annotation, it should be treated as a sandbox. Up to this point, a sandbox with a pause container in CRI context and a single container (ala ctr run) are treated the same. For VM sizing and container constraining, it'll be useful to know if this is a sandbox or if this is a single container. In updating this, we cleanup the type handling tests and we update the containerd annotations vendoring. Fixes: #2926 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-12-14 17:59:19 -08:00
bin	03546f75a6	runtime: change io/ioutil to io/os packages Change io/ioutil to io/os packages because io/ioutil package is deprecated from 1.16: Discard => io.Discard NopCloser => io.NopCloser ReadAll => io.ReadAll ReadDir => os.ReadDir ReadFile => os.ReadFile TempDir => os.MkdirTemp TempFile => os.CreateTemp WriteFile => os.WriteFile Details: https://go.dev/doc/go1.16#ioutil Fixes: #3265 Signed-off-by: bin <bin@hyper.sh>	2021-12-15 07:31:48 +08:00
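Several of the replacements listed in the commit above can be seen together in a short Go sketch. The `roundTrip` helper is a hypothetical name used only for this illustration. It requires Go 1.16 or later, where these `io` and `os` functions were introduced.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back,
// using only the Go 1.16+ replacements for io/ioutil.
func roundTrip(content string) (string, error) {
	// ioutil.TempDir => os.MkdirTemp
	dir, err := os.MkdirTemp("", "ioutil-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")

	// ioutil.WriteFile => os.WriteFile
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		return "", err
	}

	// ioutil.ReadFile => os.ReadFile
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}

	// ioutil.Discard => io.Discard
	if _, err := io.Copy(io.Discard, strings.NewReader(content)); err != nil {
		return "", err
	}

	// ioutil.ReadAll => io.ReadAll
	all, err := io.ReadAll(strings.NewReader(string(data)))
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	got, err := roundTrip("hello kata")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```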
Fabiano Fidêncio	602d87295b	Merge pull request #3226 from liubin/fix/3193-fill-hypervisorconfig runtime/template: Handling new attributes for hypervisor config	2021-12-09 13:29:23 +01:00
Chelsea Mafrica	7522109abc	Merge pull request #3218 from liubin/fix/3217-fix-span-name runtime: correct span name for stopSandbox function	2021-12-07 16:36:14 -08:00
bin	b92babf91b	runtime/template: Handling new attributes for hypervisor config Some new attributes are added to hypervisor config: - VMStorePath - RunStorePath - SharedPath These attributes should be handled in two places: - reset when check the new hypervisor's config is suitable to the base config. - copy from new hypervisor's config when create new VM Fixes: #3193 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 19:31:03 +08:00
bin	40bd34caaf	runtime: only call stopVirtiofsd when shared_fs is virtio-fs If shared_fs is set to virtio-9p, the virtiofsd is not started, so there is no need to stop it. Fixes: #3219 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 16:06:26 +08:00
bin	33f343ee08	runtime: correct span name for stopSandbox function Normally the span name should be the same as function name, so change `StopVM` to `stopSandbox`. Fixes: #3217 Signed-off-by: bin <bin@hyper.sh>	2021-12-07 15:59:18 +08:00
Bo Chen	995300260e	virtcontainers: clh: Upgrade to openapi-generator v5.3.0 The latest release of openapi-generator v5.3.0 contains the fix for `dropping err` bug [1]. This patch also re-generated the client code of Cloud Hypervisor to have the bug fixed. [1] https://github.com/OpenAPITools/openapi-generator/pull/10275 Fixes: #3201 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-03 08:55:38 -08:00
Fabiano Fidêncio	3fdc97e110	Merge pull request #3183 from fengwang666/nonroot-vhost-bug-fix runtime: enable vhost-net for rootless hypervisor	2021-12-03 10:42:50 +01:00
Feng Wang	b3bcb7b251	runtime: enable vhost-net for rootless hypervisor vhost-net is disabled in the rootless kata runtime feature, which has been abandoned since kata 2.0. I reused the rootless flag for nonroot hypervisor and would like to enable vhost-net. Fixes #3182 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2021-12-02 21:55:31 -08:00
Bo Chen	4756a04b2d	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v19.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-12-02 12:09:12 -08:00
Gabriela Cervantes	591d4af1ea	runtime: Update comments for virtcontainers to use kata 2.0 This PR updates the comments in the configuration.toml to point to the current kata containers repository instead of the kata 1.x. Fixes #3163 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-12-01 16:16:46 +00:00
Gabriela Cervantes	923e098db6	osbuilder: Remove debian as a rootfs Currently we do not have debian as part of the kata CI as we do not have a maintainer, this PR removes debian as a supported rootfs in order to have only the distros that we are supporting and maintaining. Fixes #3153 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-11-30 19:31:33 +00:00
Peng Tao	01b6ffc0a4	Merge pull request #3028 from egernst/hypervisor-hacking Hypervisor cleanup, refactoring	2021-11-26 10:21:49 +08:00
Chelsea Mafrica	ed7eb26bff	Merge pull request #3113 from liubin/fix/3112-delete-netmon runtime: delete netmon	2021-11-24 17:58:13 -08:00
Binbin Zhang	75bb340137	shimv2/service: fix defer functions never run with os.Exit() os.Exit() will terminate program immediately, the defer functions won't be executed, so we add defer functions again before os.Exit(). Refer to https://pkg.go.dev/os#Exit Fixes: #3059 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-11-24 15:59:59 +01:00
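The problem described in the commit above is that `os.Exit` terminates the process without running deferred functions. The commit fixes it by invoking the cleanup again before `os.Exit()`. The sketch below shows a different, commonly used idiom instead: keep all defers inside a helper that returns the exit code, and call `os.Exit` only after the helper has returned. The names `run` and `cleaned` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// cleaned records which cleanup steps ran, for demonstration only.
var cleaned []string

// run holds the real work. Its deferred functions execute when it
// returns, which happens before main calls os.Exit.
func run() (code int) {
	defer func() { cleaned = append(cleaned, "resources released") }()

	// ... the actual work would go here ...
	return 0
}

func main() {
	code := run()
	fmt.Println("cleanup steps run:", cleaned)
	// os.Exit skips deferred functions of the calling goroutine, so
	// nothing that needs cleanup should still be pending here.
	os.Exit(code)
}
```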
bin	ddc68131df	runtime: delete netmon Netmon is not used anymore. Fixes: #3112 Signed-off-by: bin <bin@hyper.sh>	2021-11-24 15:08:18 +08:00
Binbin Zhang	7304e52a59	Makefile: update `make go-test` call 1. use ci/go-test.sh to replace the direct call to go test 2. fix data race test 3. install hook whether it is root or not Fixes #1494 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-11-22 13:59:22 +08:00
Eric Ernst	ce92cadc7d	vc: hypervisor: remove setSandbox The hypervisor interface implementation should not know a thing about sandboxes. Fixes: #2882 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	2227c46c25	vc: hypervisor: use our own logger This'll end up moving to hypervisors pkg, but let's stop using virtLog, instead introduce hvLogger. Fixes: #2884 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	4c2883f7e2	vc: hypervisor: remove dependency on persist API Today the hypervisor code in vc relies on persist pkg for two things: 1. To get the VM/run store path on the host filesystem, 2. For type definition of the Load/Save functions of the hypervisor interface. For (1), we can simply remove the store interface from the hypervisor config and replace it with just the path, since this is all we really need. When we create a NewHypervisor structure, outside of the hypervisor, we can populate this path. For (2), rather than have the persist pkg define the structure, let's let the hypervisor code (soon to be pkg) define the structure. persist API already needs to call into hypervisor anyway; let's allow us to define the structure. We'll probably want to look at following similar pattern for other parts of vc that we want to make independent of the persist API. In doing this, we started an initial hypervisors pkg, to hold these types (avoid a circular dependency between virtcontainers and persist pkg). Next step will be to remove all other dependencies and move the hypervisor specific code into this pkg, and out of virtcontainers. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	34f23de512	vc: hypervisor: Remove need to get shared address from sandbox Add shared path as part of the hypervisor config Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	c28e5a7807	acrn: remove dependency on sandbox, persistapi datatypes Today, acrn relies on sandbox level information, as well as a store provided by common parts of the hypervisor. As we cleanup the abstractions within our runtime, we need to ensure that there aren't cross dependencies between the sandbox, the persistence logic and the hypervisor. Ensure that ACRN still compiles, but remove the setSandbox usage as well as persist driver setup. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Eric Ernst	a0e0e18639	hypervisors: introduce pkg to unbreak vc/persist dependency Initial hypervisors pkg, with just basic state types defined. Fixes: #2883 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-19 12:20:41 -08:00
Christophe de Dinechin	0380b9bda7	runtime: Update containerd to 1.5.8 Release 1.5.8 of containerd contains fixes for two low-severity advisories: [GHSA-5j5w-g665-5m35](https://github.com/opencontainers/distribution-spec/security/advisories/GHSA-mc8v-mgrf-8f4m) [GHSA-77vh-xpmg-72qh](https://github.com/opencontainers/image-spec/security/advisories/GHSA-77vh-xpmg-72qh) Fixes: #3074 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2021-11-18 18:38:27 +01:00
Greg Kurz	f80ca66300	Merge pull request #2921 from Amulyam24/template_test virtcontainers: fix failing template test on ppc64le	2021-11-18 17:32:18 +01:00
Amulyam24	d5a18173b9	virtcontainers: fix failing template test on ppc64le If a file/directory doesn't exist, os.Stat() returns an error. Assert the returned value with os.IsNotExist() to prevent it from failing. Fixes: #2920 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2021-11-18 15:37:40 +05:30
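The check described in the commit above, treating a "does not exist" error from `os.Stat()` as the expected outcome rather than a failure, can be sketched in Go as follows. The helper name `pathMissing` is hypothetical and is not taken from the test that the commit fixes.

```go
package main

import (
	"fmt"
	"os"
)

// pathMissing reports whether path does not exist. Any other error
// from os.Stat (for example a permission problem) is returned to the
// caller instead of being mistaken for "missing".
func pathMissing(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return false, nil
	}
	if os.IsNotExist(err) {
		return true, nil
	}
	return false, err
}

func main() {
	// The first path is assumed not to exist on the machine running
	// this sketch.
	for _, p := range []string{"/no/such/kata-demo-path", "."} {
		missing, err := pathMissing(p)
		fmt.Println(p, missing, err)
	}
}
```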
Eric Ernst	7e6f2b8d64	vc-utils: don't export unused function Many of these functions are just used in one place throughout the rest of the code base. If we create hypervisor package, network package, etc, we may want to parse this out. Fixes: #3049 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	860f30882a	virtcontainers: move oci, uuid packages top level This will be useful at runtime level; no need for oci or uuid to be subpkg of virtcontainers. While at it, ensure we run gofmt on the changed files. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	8acb3a32b6	virtcontainers: remove unused package nsenter Package is not utilized. Remove. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	4788cb8263	vc-network: remove unused functions Unused functions -- let's clean up! Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	b6ebddd7ef	oci: remove unused function GetContainerType This is unused - we utilize ContainerType directly. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-17 14:12:57 -08:00
Eric Ernst	1e7cb4bc3a	macvlan: drop bridged part of name The fact that we need to "bridge" the endpoint is a bit irrelevant. To be consistent with the rest of the endpoints, let's just call this "macvlan" Fixes: #3050 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-11-16 16:44:29 -08:00
Carlos Venegas	15b5d22e81	Merge pull request #2778 from jcvenegas/clh-race-condition-check clh: Fix race condition that prevent start pods	2021-11-16 14:15:06 -06:00
Carlos Venegas	55412044df	monitor: Fix monitor race condition doing hypervisor.check() The thread monitor will check if the agent and the VMM are alive every second in a blocking thread. The Cloud hypervisor API server is single-threaded, if the monitor does a `check()`, while a slow request is still in progress, the monitor check() method will timeout. The monitor thread will stop all the shim-v2 execution. This commit modifies the monitor thread to make it check the status of the hypervisor after 5 seconds. Additionally, the `check()` method from cloud-hypervisor will use the method `clh.isClhRunning(timeout)` with a 10 seconds timeout. The monitor function does no timeout, so even if `hypervisor.check()` takes more than 10 seconds, the isClhRunning method handles errors doing a VmmPing and retries in case of errors until the timeout is reached. Checking only every 5 seconds should not affect any functionality, but it will reduce the overhead of polling the hypervisor. Fixes: #2777 Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>	2021-11-16 18:28:29 +00:00
snir911	b046c1ef6b	Merge pull request #2959 from snir911/wip/cgroups-systemd-fix cgroups: Fix systemd cgroup support	2021-11-15 10:44:45 +02:00
Eric Ernst	e89c06e68b	Merge pull request #3032 from liubin/fix/3031-merge-two-types-packages runtime: merge virtcontainers/pkg/types into virtcontainers/types	2021-11-12 14:23:21 -08:00
bin	09f7962ff1	runtime: merge virtcontainers/pkg/types into virtcontainers/types There are two types packages under virtcontainers, and virtcontainers/pkg/types contains only a little code; merging them into one makes the types package easier to understand and use. Fixes: #3031 Signed-off-by: bin <bin@hyper.sh>	2021-11-12 15:06:39 +08:00
bin	6acedc2531	runtime: delete not used codes Functions EnvVars and GetOCIConfig in runtime/virtcontainers/pkg/oci/utils.go are not used anymore. Fixes: #3029 Signed-off-by: bin <bin@hyper.sh>	2021-11-12 11:35:31 +08:00
Snir Sheriber	bcf181b7ee	cgroups: Fix systemd cgroup support As github.com/containerd/cgroups doesn't support scope units which are essential in some cases let's create the cgroups manually and load it through the cgroups api This is currently done only when there's single sandbox cgroup (sandbox_cgroup_only=true), otherwise we set it as static cgroup path as it used to be (until a proper solution for overhead cgroup under systemd will be suggested) Fixes: #2868 Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-11 08:51:45 +02:00
Bin Liu	04185bd068	Merge pull request #2997 from Jakob-Naucke/lint-protection virtcontainers: Lint protection types	2021-11-11 08:34:48 +08:00
Fabiano Fidêncio	653976c0fd	Merge pull request #3000 from bergwolf/crioptions runtime: Revert "runtime: use containerd package instead of cri-containerd"	2021-11-10 13:41:24 +01:00
Peng Tao	eacfcdec19	runtime: Revert "runtime: use containerd package instead of cri-containerd" This reverts commit `76f16fd1a7` to bring back cri-containerd crioptions parsing so that kata works with older containerd versions like v1.3.9 and v1.4.6. Fixes: #2999 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2021-11-10 16:06:42 +08:00
Jakob Naucke	b7b89905d4	virtcontainers: Lint protection types Protection types like tdxProtection or seProtection were marked nolint, remove this. As a side effect, ARM needs dummy tests for these. Fixes: #2801 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-11-09 18:36:32 +01:00
James O. D. Hunt	87f676062c	agent: Remove dynamic tracing APIs Remove the `StartTracing` and `StopTracing` agent APIs that toggle dynamic tracing. This is not supported in Kata 2.x, as documented in the [tracing proposals document](https://github.com/kata-containers/kata-containers/pull/2062). Fixes: #2985. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-11-09 08:39:06 +00:00
Chelsea Mafrica	d17100aee6	vendor: update OpenTelemetry to v1.0.0 Upgrade from v0.20.0 to v1.0.0, first stable release. Git log 4bfa0034 Release prep v1.0.0-RC3 (2218) c7ae470a Refactor SDK span creation and implementation (2213) db317fce Verify and update OTLP trace exporter documentation (2053) 04de34a2 Update the website getting started docs (2203) a7b9d021 Rename metric instruments to match feature-freeze API specification (2202) 1f527a52 Update trace API config creation functions (2212) 361a2096 Fix RC2 header in changelog (2215) e209ee75 chore(exporter/zipkin): improves logging on invalid collector. (2191) c0c5ef65 Fix typos in resource.go. (2201) abf6afe0 Update otel example guide (2210) 3b05ba02 Bump actions/setup-go from 2.1.3 to 2.1.4 (2206) bcd7ff7b Bump codecov/codecov-action from 2.0.2 to 2.0.3 (2205) c912b179 Print JSON objects to stdout without a wrapping array (2196) add511c1 Make WithoutTimestamps work (2195) 85c27e01 Bump github.com/golangci/golangci-lint from 1.41.1 to 1.42.0 in /internal/tools (2199) bf6500b3 Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /exporters/otlp/otlptrace (2184) 9392af96 Bump google.golang.org/grpc in /exporters/otlp/otlptrace/otlptracegrpc (2185) c95694dc Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /example/otel-collector (2183) 0528fa66 Bump google.golang.org/grpc from 1.39.1 to 1.40.0 in /exporters/otlp/otlpmetric (2186) 3a26ed21 Deprecate the oteltest package (2188) c885435f Website: support GH page links to canonical src (2189) 6da20a27 Add cross-module test coverage (2182) dfc866bd Support capturing stack trace (2163) 41588fea Deprecate the attribute.Any function (2181) 4e8d667f Support a single Resource per MeterProvider in the SDK (2120) a8bb0bf8 Make the tracetest.SpanRecorder concurrent safe (2178) 87d09df3 Deprecate Array attribute in favor of Slice types (2162) df384a9a Move InstrumentKind into the new metric/sdkapi package (2091) 1cb5cdca Unify the OTLP attribute transform (2170) a882ee37 
Clarify the attribute package documentation and order/grouping (2168) 5d25c4d2 Add support for int32 in attribute.Any (2169) 2b0e139e Refactor attributes benchmark tests (2167) 4c7470d9 Bump google.golang.org/grpc from 1.39.0 to 1.39.1 in /exporters/otlp/otlptrace (2176) 990c534a Bump google.golang.org/grpc in /example/otel-collector (2172) b45c9d31 Bump google.golang.org/grpc from 1.39.0 to 1.39.1 in /exporters/otlp/otlpmetric (2174) a3d4ff5c Deprecated the bridge/opencensus/utils package (2166) b1d1d529 Move OC bridge integration tests to own mod (2165) 89a9489c Add OC bridge internal unit tests (2164) 56c743ba Allow global ErrorHandler to be set multiple times (2160) d18c135f Add OpenCensus bridge internal package (2146) fcf945a4 Just a little typo fix in code documentation. (2159) 59a82eba Update version.go (2157) 21d4686f Add ErrorHandlerFunc to simplify creating ErrorHandlers (2149) 23cb9396 Remove `internal/semconv-gen` (2155) 39acab32 Fix code sample in otel.GetTraceProvider (2147) 2b1bb29e Update OpenCensus bridge docs with limitations (2145) fd7c327b Fix Jaeger exporter agent port default value and docs (2131) b8561785 fix(2138): add guard to constructOTResources to return an empty resource (2139) 11f62640 Add a SpanRecorder to the sdk/trace/tracetest (2132) fd9de7ec rename assertsocketbuffersize.go to _test (2136) a6b4d90c nit doc fix (2135) 79398418 pre-release v1.0.0-RC2 (2133) 2501e0fd Use semconv.SchemaURL in STDOUT exporter example (2134) ef03dbc9 Bump codecov/codecov-action from 1 to 2.0.2 (2129) bbe6ca40 Deprecate oteltest.Harness for removal (2123) 7a624ac2 Deprecated the oteltest.TraceStateFromKeyValues function (2122) ece1879f Removed dropped link's attributes field from API package (2118) 03902d98 Rename sdk/trace/tracetest test.go -> exporter.go (2128) cb607b0a Unify OTLP exporter retry logic (2095) abe22437 API: create new linked span from current context (2115) db81d4aa Update internal/global/trace testing (2111) 7f10ef72 Remove propagation 
testing types from oteltest (2116) 25d739b0 Remove resource.WithBuiltinDetectors() which has not been maintained (2097) d57c5a56 Remove several metrics test helpers (2105) 49359495 Simplify trace_context tests (2108) 56d42011 Simplify trace context benchmark test (2109) 63dfe64a Correct status transform in OTLP exporter (2102) 9b1a5f70 Performance improvement: avoid creating multiple same read-only objects (2104) ab78dbd0 Update release URL (2106) 647af3a0 Pre release experimental metrics v0.22.0 (2101) 0a562337 Fixed OS type value for DragonFly BSD (2092) 62c21ffb Bump golang.org/x/tools from 0.1.4 to 0.1.5 in /internal/tools (2096) 4a3da55a Ensure sample code in website_docs getting started page works (2094) d3063a3d Update otel.Meter to global.Meter in Getting Started Document.(2087) (2093) 00a1ec5f Add documentation guidelines and improve Jaeger exporter readme (2082) 12f737c7 oteltest: ensure valid SpanContext created for span started WithNewRoot (2073) 484258eb OS description attribute detector (1840) d8c9a955 Bump google.golang.org/grpc from 1.38.0 to 1.39.0 in /example/otel-collector (2054) 4ffdf034 Add @pellard as an Approver (2047) 1a74b399 Bump google.golang.org/protobuf from 1.26.0 to 1.27.0 in /exporters/otlp/otlpmetric (2040) 57c2e8fb Bump golang.org/x/tools from 0.1.3 to 0.1.4 in /internal/tools (2036) 7cff31a9 Bump google.golang.org/protobuf from 1.26.0 to 1.27.0 in /exporters/otlp/otlptrace (2035) 9e8f523d when using WithNewRoot, don't use the parent context for sampling (2032) 62af6c70 semconv-gen: fix capitalization at word boundaries, add stability/deprecation indicators (2033) 0bceed7e Fix docs on otel-collector example (2034) 6428cd69 Update doc.go (2030) 311a6396 fix documentation for trace.Status (2029) 16f83ce6 export ToZipkinSpanModels for use outside this library (2027) d5d4c87f Add HTTP metrics exporter for OTLP (2022) d6e8f60f Bump github.com/golangci/golangci-lint from 1.40.1 to 1.41.1 in /internal/tools (2023) 51dbe3cb Remove 
deprecated exporters (2020) 257ef7fc Update project status in README (2017) ced177b7 Pre-release 1.0.0-RC1 (2013) 694c9a41 Interface stability documentation (2012) 39fe8092 Add span.TracerProvider() (2009) d020e1a2 Add more tests for go.opentelemetry.io/otel/trace package. (2004) 6d4a38f1 replace WithSyncer with WithBatcher in opencensus example (2007) c30cd1d0 Split stdout exporter into stdouttrace and stdoutmetric (2005) 80ca2b1e otlp: mark unix endpoints to work without transport security (2001) 65140985 Update codecov ignore (2006) 3be9813d Deprecate the exporters in the "trace" and "metric" sub-directories (1993) 377f7ce4 remove WithTrace* options from otlptrace exporters (1997) b33edaa5 OTLP metrics gRPC exporter (1991) 64b640cc Remove old OTLP exporter (1990) 7728a521 Remove dependency on metrics packages (1988) 135ac4b6 Moved internal/tools duplicated findRepoRoot function to common package (1978) cdf67ddf Update semantic conventions to v1.4.0, move to versioned package (1987) 4883cb11 Refactor exporter creation functions (1985) 87cc1e1f Test BatchSpanProcessor export timeout directly (1982) 7ffe2845 Added inputPath validation to semconv-gen (1986) a113856a Add caveat about installing opencensus bridge (1983) 741cb9a3 Fix generator.go call typo in RELEASING.md (1977) 7a0cee7b Replaces golint by revive and fix newly reported linter issues (1946) 46d9687a Add Schema URL support to Resource (1938) 0827aa62 Use mock server as jaeger agent listener. 
(1930) 20886012 Bugfix jaeger exporter test panic (1973) 4bf6150f Add baggage implementation based on the W3C and OpenTelemetry specification (1967) bbe2b8a3 Bump github.com/itchyny/gojq from 0.12.3 to 0.12.4 in /internal/tools (1971) 4949bf05 Bump github.com/cenkalti/backoff/v4 from 4.1.0 to 4.1.1 in /exporters/otlp/otlptrace (1972) 015b4c17 Bump github.com/cenkalti/backoff/v4 from 4.1.0 to 4.1.1 in /exporters/otlp (1970) 13eb12ac Bump github.com/prometheus/client_golang from 1.10.0 to 1.11.0 in /exporters/metric/prometheus (1974) 2371bb0a add otlp trace http exporter (1963) a75ade4e sdk/resource: honor OTEL_SERVICE_NAME in fromEnv resource detector (1969) aed45802 Bump go.opentelemetry.io/proto/otlp from 0.8.0 to 0.9.0 in /exporters/otlp/otlptrace (1959) c4ebae6a Bump go.opentelemetry.io/proto/otlp (1960) b1d2be3b Bump google.golang.org/grpc from 1.37.1 to 1.38.0 in /exporters/otlp/otlptrace (1958) f6daea5e Generate semantic conventions according to specification latest tagged version (1933) 435a63b3 Bump github.com/google/go-cmp from 0.5.5 to 0.5.6 (1954) 6c46af66 Bump github.com/google/go-cmp from 0.5.5 to 0.5.6 in /exporters/trace/jaeger (1953) 4d294853 Bump actions/cache from 2.1.5 to 2.1.6 (1952) dfe2b6f1 OTLP trace gRPC exporter (1922) 5a8f7ff7 Bump go.opentelemetry.io/proto/otlp from 0.8.0 to 0.9.0 in /exporters/otlp (1943) bd935866 Add schema URL support to Tracer (1889) c1f460e0 Update API configs. 
(1921) 270cc603 Small fixes on some Span method's documentation headers (1950) 8603b902 Fix typo in doc (1949) acbb1882 Bump google.golang.org/grpc from 1.37.1 to 1.38.0 in /exporters/otlp (1942) b1621501 Add codecov badge (1940) ea1434c3 Fix some golint issues (1947) 0eeb8f87 Refactor Tracestate (1931) d3b12808 Add Passthrough example (1912) f06cace6 Add @MadVikingGod as a project Approver (1923) ab5facb3 Bump github.com/golangci/golangci-lint in /internal/tools (1925) d23cc61b Refactor configs (1882) 6324adaa Add tracer option argument to global Tracer function (1902) 035fc650 Do not include authentication information in the http.url attribute (1919) d8ac212c Fix sporadic test failure in otlp exporter http driver (1906) a3df00f4 Create .gitattributes (1920) fb88e926 Bump google.golang.org/grpc from 1.37.0 to 1.37.1 in /exporters/otlp (1914) 1982dc46 Bump google.golang.org/grpc in /example/prom-collector (1915) 1759c630 Bump github.com/golangci/golangci-lint in /internal/tools (1916) 7342aa47 Bump google.golang.org/grpc in /example/otel-collector (1913) 21c16418 Add support for scheme in OTEL_EXPORTER_OTLP_ENDPOINT (1886) 5cb62636 Semantic Convention generation tooling (1891) 6219221f Move the unit package to the metric module (1903) 63e0ecfc Implement global default non-recording span (1901) b6d5442f Remove the Tracer method from the Span API (1900) ae85fab3 Document functional options (1899) cabf0c07 Fix default Jaeger collector endpoint (1898) 1e3fa3a3 Bump go.opentelemetry.io/proto/otlp from 0.7.0 to 0.8.0 in /exporters/otlp (1872) 696af787 Bump github.com/benbjohnson/clock from 1.0.3 to 1.1.0 in /sdk/metric (1532) 97eea6c3 Fix some golint issues (1894) 79d9852e fix container port mismatch issue (1895) d20e7228 CI builds validate against last two versions of Go, dropping 1.14 and adding 1.16 (1865) cbcd4b1a Redefine ExportSpans of SpanExporter with ReadOnlySpan (1873) c99d5e99 Split large jaeger span batch to admire the udp packet size limit (1853) 42a84509 
Unembed SpanContext (1877) b7d02db1 Add Status type to SDK (1874) f90d0d93 Update README (1876) a1349944 Update resource.go (1871) f40cad5e Add markdown link check configuration and action (1869) 9bc28f6b Fix existing markdown lint issues (1866) 08f4c270 Add documentation for tracer.Start() (1864) 2bd4840c remove Set.Encoded(Encoder) enconding cache (1855) 7674eebf Removed different types of Detectors for Resources. (1810) f92a6d83 Implement retry policy for the OTLP/gRPC exporter (1832) ec75390f Fix BSP context done tests (1863) 8e55f10a Move the Event type from the API to the SDK (1846) e399d355 drop failed to exporter batches and return error when forcing flush a span processor (1860) f6a9279a Honor context deadline or cancellation in SimpleSpanProcessor.Shutdown (1856) aeef8e00 Add markdown lint GitHub action (1849) d4c8ffad Replace spaces to tabs in Go code snippets (1854) cb097250 fixed typo (1857) 392a44fa Refine configuration design docs (1841) 62cd933d Handle Resource env error when non-nil (1851) 24a91628 Document the SSP is not for production use (1844) ec26ac23 Update RELEASING.md (1843) 8eb0bb99 Fix golint issue caused by typo (1847) ca130e54 Markdownlint (1842) 1144a83d Small typo fixes to existing CHANGELOG entries (1839) e6086958 Update website_docs to v0.20.0 (1838) 0f4e454c Change NewSplitDriver paramater and initialization (1798) 92551d39 Prerelease v1.0.0 (2250) 61839133 zipkin: remove no-op WithSDKOptions (2248) 568e7556 Set Schema URL when exporting traces to OTLP (2242) ec26b556 Fix RC tags in docs (2239) 767ce26c Bump github.com/itchyny/gojq from 0.12.4 to 0.12.5 in /internal/tools (2216) fe7058da adding NewNoopMeterProvider to follow trace api (2237) c338a5ef Bump github.com/golangci/golangci-lint from 1.42.0 to 1.42.1 in /internal/tools (2236) ef126f5c Remove deprecated Array from attribute package (2235) 360d1302 Add tests for nil *Resource (2227) 9e7812d1 Remove the deprecated oteltest package (2234) 486afd34 Remove the deprecated 
bridge/opencensus/utils pkg (2233) eaacfaa8 Fix slice-valued attributes when used as map keys (2223) df2bdbba Fix the import comments of otelpconfig (2224) 7aae2a02 otlptrace: Document supported environment variables (2222) Fixes #2591 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-11-04 12:39:00 -07:00
Chelsea Mafrica	84ccdd8ef2	vendor: update OpenTelemetry to v0.20.0 Update OpenTelemetry from v0.15.0 to v0.20.0. Git log 02d8bdd5 Release v0.20.0 (1837) aa66fe75 OS and Process resource detectors (1788) 7374d679 Fix Links documents (1835) 856f5b84 Add feature request issue template (1831) 0fdc3d78 Remove bundler from Jaeger exporter (1830) 738ef11e Fix flaky global ErrorHandler delegation test (1829) e43d9c00 Update Default Value for Jaeger Exporter Endpoint (1824) 0032bd64 Fix default merging of resource attributes from environment variable (1785) 96c5e4ba Add SpanProcessor example for Span annotation on start (1733) 543c8144 Remove the WithSDKOptions from the Jaeger exporter (1825) 66389ad6 Update function docs in sdk.go (1826) 70bc9eb3 Adds support for timeout on the otlp/gRPC exporter (1821) 081cc61d Update Jaeger exporter convenience functions (1822) 1b9f16d3 Remove the WithDisabled option from Jaeger exporter (1806) 6867faa0 Bump actions/cache from v2.1.4 to v2.1.5 (1818) a2bf04dc Build context pipeline in Jaeger upload process (1809) 2de86f23 Remove locking from Jaeger exporter shutdown/export (1807) 4f9fec29 Add ExportSpans benchmark to Jaeger exporter (1805) d9566abe Fix OTLP testing flake: signal connection from mock collector (1816) a2cecb6e add support for env var configuration to otlp/gRPC (1811) d616df61 Fix flaky OTLP exporter reconnect test (1814) b09df84a Changes stdout to expose the `*sdktrace.TracerProvider` (1800) 04890608 Remove options field from Jaeger exporter (1808) 6db20e00 Remove the abandoned Process struct in Jaeger exporter (1804) 086abf34 docs: use test example to document prometheus.InstallNewPipeline (1796) d0cea04b Bump google.golang.org/api from 0.43.0 to 0.44.0 in /exporters/trace/jaeger (1792) 99c477fe Fixed typo for default service name in Jaeger Exporter (1797) 95fd8f50 Bump google.golang.org/grpc from 1.36.1 to 1.37.0 in /exporters/otlp (1791) 9b251644 Zipkin Exporter: Use default resouce's serviceName as default serivce name 
(1777) (1786) 4d141e47 Add k8s.node.name and k8s.node.uid to semconv (1789) 5c99a34c Fix golint issue caused by incorrect comment (1795) c5d006c0 Update Jaeger environment variables (1752) 58432808 add NewExportPipeline and InstallNewPipeline for otlp (1373) 7d8e6bd7 Zipkin Exporter: Adjust span transformation to comply with the spec (1688) 2817c091 Merge sdk/export/trace into sdk/trace (1778) c61e654c Refactor prometheus exporter tests to match file headers as well (1470) 23422c56 Remove process config for Jaeger exporter (1776) 0d49b592 Add test to check bsp ignores `OnEnd` and `ForceFlush` post Shutdown` (1772) e9aaa04b Record links/events attribute drops independently (1771) 5bbfc22c Make ExportSpans for Jaeger Exporter honor deadline (1773) 0786fe32 Add Bug report issue templates (1775) 3c7facee Add `ExportTimeout` option to batch span processor (1755) c6b92d5b Make TraceFlags spec-compliant (1770) ee687ca5 Bump github.com/itchyny/gojq from 0.12.2 to 0.12.3 in /internal/tools (1774) 52a24774 add support for configuring tls certs via env var to otlp/HTTP (1769) 35cfbc7e Update precedence of event name in Jaeger exporter (1768) 33699d24 Adds semantic conventions for exceptions (1492) 928e3c38 Modify ForceFlush to abort after timeout/cancellation (1757) 3947cab4 Fix testCollectorEndpoint typo and add tag assertions in jaeger_test (1753) ecc635dc add website docs (1747) 07a8d195 Fix Jaeger span status reporting and unify tag keys (1761) 4fa35c90 add partial support for env var config to otlp/HTTP (1758) bf180d0f improve OTLP/gRPC connection errors (1737) d575865b Fix span IsRecording when not sampling (1750) 20c93b01 Update SamplingParameters (1749) 97501a3f Update SpanSnapshot to use parent SpanContext (1748) 604b05cb Store current Span instead of local and remote SpanContext in context.Context (1731) c61f4b6d Set @lizthegrey to emeritus status (1745) b1342fec Bump github.com/golangci/golangci-lint in /internal/tools (1743) 54e1bd19 Bump google.golang.org/api 
from 0.41.0 to 0.43.0 in /exporters/trace/jaeger (1741) 4d25b6a2 Bump github.com/prometheus/client_golang from 1.9.0 to 1.10.0 in /exporters/metric/prometheus (1740) 0a47b66f Bump google.golang.org/grpc from 1.36.0 to 1.36.1 in /exporters/otlp (1739) 26f006b8 Reinstate @paivagustavo as an Approver (1734) 382c7ced Remove hasRemoteParent field from SDK span (1728) 862a5a68 Remove setting error status while recording error with Span from oteltest package (1729) 6defcfdf Remove links on NewRoot spans (1726) a9b2f851 upgrade thrift to v0.14.1 in jaeger exporter (1712) 5a6a854d Bump google.golang.org/protobuf from 1.25.0 to 1.26.0 in /exporters/otlp (1724) 23486213 Migrate to using go.opentelemetry.io/proto/otlp (1713) 5d559b40 Remove makeSamplingDecision func (1711) e24702da Update the TraceContext.Extract docs (1720) 9d4eb1f6 Update dates in CHANGELOG.md for 2021 releases (1723) 2b4fa968 Release v0.19.0 (1710) 4beb7041 sdk/trace: removing ApplyConfig and Config (1693) 1d42be16 Rename WithDefaultSampler TracerProvider option to WithSampler and update docs (1702) 860d5d86 Add flag to determine whether SpanContext is remote (1701) 0fe65e6b Comply with OpenTelemetry attributes specification (1703) 88884351 Bump google.golang.org/api from 0.40.0 to 0.41.0 in /exporters/trace/jaeger (1700) 345f264a breaking(zipkin): removes servicName from zipkin exporter. 
(1697) 62cbf0f2 Populate Jaeger's Span.Process from Resource (1673) 28eaaa9a Add a test to prove the Tracer is safe for concurrent calls (1665) 8b1be11a Rename resource pkg label vars and methods (1692) a1539d44 OpenCensus metric exporter bridge (1444) 77aa218d Fix issue #1490, apply same logic as in the SDK (1687) 9d3416cc Fix synchronization issues in global trace delegate implementation (1686) 58f69f09 Span status from HTTP code: Do not set status message if it can be inferred (1681) 9c305bde Flush metric events prior to shutdown in OTLP example (1678) 66b1135a Fix CHANGELOG (1680) 90bd4ab5 Update employer information for maintainers (1683) 36841913 Remove WithRecord() option from trace.SpanOption when starting a span (1660) 65c7de20 Remove trace prefix from NoOp src files. (1679) e88a091a Make SpanContext Immutable (1573) d75e2680 Avoid overriding configuration of tracer provider (1633) 2b4d5ac3 Bump github.com/golangci/golangci-lint in /internal/tools (1671) 150b868d Bump github.com/google/go-cmp from 0.5.4 to 0.5.5 (1667) 76aa924e Fix the examples target info messaging (1676) a3aa9fda Bump github.com/itchyny/gojq from 0.12.1 to 0.12.2 in /internal/tools (1672) a5edd79e Removed setting error status while recording err as span event (1663) e9814758 chore(zipkin): improves zipkin example to not to depend on timeouts. 
(1566) 3dc91f2d Add ForceFlush method to TracerProvider (1608) bd0bba43 exporter: swap pusher for exporter (1656) 56904859 Update the SimpleSpanProcessor (1612) a7f7abac SpanStatus description set only when status code is set to Error (1662) 05252f40 Jaeger Exporter: Fix minor mapping discrepancies (1626) 238e7c61 Add non-empty string check for attribute keys (1659) e9b9aca8 Add tests for propagation of Sampler Tracestate changes (1655) 875a2583 Add docs on when reviews should be cleared (1556) 7153ef2d Add HTTP/JSON to the otlp exporter (1586) 62e2a0f7 Unexport the simple and batch SpanProcessors (1638) 992837f1 Add TracerProvider tests to oteltest harness (1607) bb4c297e Pre release v0.18.0 (1635) 712c3dcc Fix makefile ci target and coverage test packages (1634) 841d2a58 Rename local var new to not collide with builtin (1610) 13938ab5 Update SpanProcessor docs (1611) e25503a0 Add compatibility tests to CI (1567) 1519d959 Use reasonable interval in sdktrace.WithBatchTimeout (1621) 7d4496e0 Pass metric labels when transforming to gaugeArray (1570) 6d4a5e0d Bump google.golang.org/grpc from 1.35.0 to 1.36.0 in /exporters/otlp (1619) a93393a0 Bump google.golang.org/grpc in /example/prom-collector (1620) e499ca86 Fix validation for tracestate with vendor and add tests (1581) 43886e52 Make timestamps sequential in lastvalue agg check (1579) 37688ef6 revent end-users from implementing some interfaces (1575) 85e696d2 Updating documentation with an working example for creating NewExporter (1513) 562eb28b Unify the Added sections of the unreleased changes (1580) c4cf1aff Fix Windows build of Jaeger tests (1577) 4a163bea Fix stdout TestStdoutTimestamp failure with sleep (1572) bd4701eb Stagger timestamps in exact aggregator tests (1569) b94cd4b2 add code attributes to semconv package (1558) 78c06cef Update docs from gitter to slack for communication (1554) 1307c911 Remove vendor exclude from license-check (1552) 5d2636e5 Bump github.com/golangci/golangci-lint in 
/internal/tools (1565) d7aff473 Vendor Thrift dependency (1551) 298c5a14 Update span limits to conform with OpenTelemetry specification (1535) ecf65d79 Rename otel/label -> otel/attribute (1541) 1b5b6621 Remove resampling on span.SetName (1545) 8da52996 fix: grpc reconnection (1521) 3bce9c97 Add Keys() method to propagation.TextMapCarrier (1544) 0b1a1c72 Make oteltest.SpanRecorder into a concrete type (1542) 7d0e3e52 SDK span no modification after ended (1543) 7de3b58c Remove extra labels types (1314) 73194e44 Bump google.golang.org/api from 0.39.0 to 0.40.0 in /exporters/trace/jaeger (1536) 8fae0a64 Create resource.Default() with required attributes/default values (1507) 76f93422 Release v0.17.0 (1534) 9b242bc4 Organize API into Go modules based on stability and dependencies (1528) e50a1c8c Bump actions/cache from v2 to v2.1.4 (1518) a6aa7f00 Bump google.golang.org/api from 0.38.0 to 0.39.0 in /exporters/trace/jaeger (1517) 38efc875 Code Improvement - Error strings should not be capitalized (1488) 6b340501 Update default branch name (1505) b39fd052 nit: Fix comment to be up-to-date (1510) 186c2953 Fix golint error of package comment form (1487) 9308d662 Bump google.golang.org/api from 0.37.0 to 0.38.0 in /exporters/trace/jaeger (1506) 1952d7b6 Reverse order of attribute precedence when merging two Resources (1501) ad7b4715 Remove build flags for runtime/trace support (1498) 4bf4b690 Remove inaccurate and unnecessary import comment (1481) 7e19eb6a Bump google.golang.org/api from 0.36.0 to 0.37.0 in /exporters/trace/jaeger (1504) c6a4406a Bump github.com/golangci/golangci-lint in /internal/tools (1503) 9524ac09 Update workflows to include main branch as trigger (1497) c066f15e Bump github.com/gogo/protobuf from 1.3.1 to 1.3.2 in /internal/tools (1478) 894e0240 Bump github.com/golangci/golangci-lint in /internal/tools (1477) 71ffba39 Bump google.golang.org/grpc from 1.34.0 to 1.35.0 in /exporters/otlp (1471) 515809a8 Bump github.com/itchyny/gojq from 0.12.0 to 0.12.1 
in /internal/tools (1472) 3e96ad1e gitignore: remove unused example path (1474) c5622777 Histogram aggregator functional options (1434) 0df8cd62 Rename Makefile.proto to avoid interpretation as proto file (1468) 979ff51f Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 (1453) 1df8b3b8 Bump github.com/gogo/protobuf from 1.3.1 to 1.3.2 in /exporters/otlp (1456) 4c30a90a Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /sdk (1455) 5a9f8f6e Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/stdout (1454) 7786f34c Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/trace/zipkin (1457) 4352a7a6 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/otlp (1460) 6990b3b3 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/metric/prometheus (1461) 7af40d22 Bump github.com/stretchr/testify from 1.6.1 to 1.7.0 in /exporters/trace/jaeger (1463) f16f1892 Bump google.golang.org/grpc in /example/otel-collector (1465) fe363be3 Move Span Event to API (1452) 43922240 Bump google.golang.org/grpc in /example/prom-collector (1466) 0aadfb27 Prepare release v0.16.0 (1464) 207587b6 Metric histogram aggregator: Swap in SynchronizedMove to avoid allocations (1435) c29c6fd1 Shutdown underlying span exporter while shutting down BatchSpanProcessor (1443) dfece3d2 Combine the Push and Pull metric controllers (1378) 74deeddd Handle tracestate in TraceContext propagator (1447) 49f699d6 Remove Quantile aggregation, DDSketch aggregator; add Exact timestamps (1412) 9c949411 Rename internal/testing to internal/internaltest (1449) 8d809814 Move gRPC driver to a subpackage and add an HTTP driver (1420) 9332af1b Bump github.com/golangci/golangci-lint in /internal/tools (1445) 5ed96e92 Update exporters/otlp Readme.md (1441) bc9cb5e3 Switch CircleCI badge to GitHub Actions (1440) 716ad082 Remove CircleCI config (1439) 0682db1e Adding Security Workflows to GitHub Actions (2/2): gosec workflow (1429) 11f732b8 Adding Security Workflows to 
GitHub Actions (1/2): codeql workflow (1428) 40f1c003 Add Tracestate into the SamplingResult struct (1432) db06c8d1 Flush metric events before shutdown in collector example (1438) f6f458e1 Fix golint issue caused by typo in trace.go (1436) fe9d1f7e Use uint64 Count consistently in metric aggregation (1430) 3a337d0b Bump github.com/golangci/golangci-lint in /internal/tools (1433) 1e4c8321 cleanup: drop the removed examples in gitignore (1427) 5c9221cf Unify endpoint API that related to OTel exporter (1401) 045c3ffe Build scripts: Replace mapfile with read loop for old bash versions (1425) 2def8c3d Add Versioning Documentation (1388) 6bcd1085 Bump github.com/itchyny/gojq from 0.11.2 to 0.12.0 in /internal/tools (1424) 38e76efe Add a split protocol driver for otlp exporter (1418) 439cd313 Add TraceState to SpanContext in API (1340) 35215264 Split connection management away from exporter (1369) add9d933 Bump github.com/prometheus/client_golang from 1.8.0 to 1.9.0 in /exporters/metric/prometheus (1414) 93d426a1 Add @dashpole as a project Approver (1410) 6fe20ef3 Fix small typo (1409) b22d0d70 Mention the getting started guide (1406) 3fb80fb2 Fix duplicate checkout action in GitHub workflow (1407) 2051927b Correct CI workflow syntax (1403) f11a86f7 Fix typo in comment (1402) bdf87a78 Migrate CircleCI ci.yml workflow to GitHub Actions (1382) 4e59dd1f Bump google.golang.org/grpc from 1.32.0 to 1.34.0 in /example/otel-collector (1400) 83513f70 Bump google.golang.org/api from 0.32.0 to 0.36.0 in /exporters/trace/jaeger (1398) a354fc41 Bump github.com/prometheus/client_golang from 1.7.1 to 1.8.0 in /exporters/metric/prometheus (1397) 3528e42c Bump google.golang.org/grpc from 1.32.0 to 1.34.0 in /exporters/otlp (1396) af114baf Call otel.Handle with non-nil errors (1384) c3c4273e Add RO/RW span interfaces (1360) Fixes #2591 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-11-04 12:30:45 -07:00
Chelsea Mafrica	b5cfb73466	Merge pull request #2931 from YchauWang/wyc-runtime-shim2 runtime# make sure the "Shutdown" trace span have a correct end	2021-11-04 11:33:22 -07:00
Chelsea Mafrica	09d5d8836b	runtime: tracing: Change method for adding tags In later versions of OpenTelemetry label.Any() is deprecated. Create addTag() to handle type assertions of values. Change AddTag() to variadic function that accepts multiple keys and values. Fixes #2547 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-11-04 10:19:05 -07:00
Snir Sheriber	b34ed403c5	cgroups: pass vhost-vsock device to cgroup for the sandbox cgroup Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-04 10:59:10 +02:00
Snir Sheriber	7362e1e8a9	runtime: remove prefix when cgroups are managed by systemd as done previously in `9949daf4dc` Signed-off-by: Snir Sheriber <ssheribe@redhat.com>	2021-11-04 10:13:22 +02:00
bin	375ad2b2b6	runtime: Enhancement for Makefile There are some issues with the Makefile for runtime: - the default target can't be used as a dependency of other targets. - the `check` target is empty. Also add two targets for local development/tests: - lint: run golangci-lint - pre-commit: run lint and test Fixes: #2942 Signed-off-by: bin <bin@hyper.sh>	2021-11-03 17:36:55 +08:00
wangyongchao.bj	9d3ec58370	runtime: make sure the "Shutdown" trace span have a correct end We only added span.End() in the main process of the shim2 Shutdown method, so the "Shutdown" span stayed alive when the number of containers was not 0. This PR makes sure the "Shutdown" trace span has a correct end. Fixes: #2930 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2021-11-02 14:24:31 +08:00
Jianyong Wu	e15c8460db	Merge pull request #2265 from rapiz1/simple-ro-mount virtcontainers: simplify read-only mount handling	2021-11-01 10:43:16 +08:00
Bin Liu	51e9038ad5	Merge pull request #1998 from liubin/1997/add-fastfail-test runtime: add fast-test to let test exit on error	2021-10-30 15:38:27 +08:00
bin	3f21af9c5c	runtime: add fast-test to let test exit on error Add the -failfast option to let tests exit on the first error. Because -failfast does not work across packages, a for loop runs the tests for each package in src/runtime with the parallel number set to 1, which may make the tests slow. Fixes: #1997 Signed-off-by: bin <bin@hyper.sh>	2021-10-30 11:09:54 +08:00
GabyCT	c8553ea427	Merge pull request #2046 from littlejawa/issue_2042 test: Fix random failure for TestIoCopy	2021-10-29 17:29:31 -05:00
GabyCT	969b78b01f	Merge pull request #2496 from rapiz1/show-guest-protection cli: Show available guest protection in env output	2021-10-29 17:28:47 -05:00
James O. D. Hunt	2551179e43	Merge pull request #2929 from YchauWang/vc-docs-api virtcontainers: api: update the functions in the api.md docs	2021-10-29 16:01:31 +01:00
James O. D. Hunt	4e2dd41eb6	Merge pull request #1791 from wainersm/virtcontainers-1 virtcontainers: check that both initrd and image are not set	2021-10-29 14:51:07 +01:00
wangyongchao.bj	338ac87516	virtcontainers: api: update the functions in the api.md docs The functions in the virtcontainers API document were out of sync with the Sandbox and VCImpl code. The document also listed two functions named `CreateSandbox` that differ by one parameter, which is confusing. This PR syncs the API document with the code. Fixes: #2928 Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>	2021-10-29 15:36:53 +08:00
Bin Liu	eb248b0c66	Merge pull request #2750 from liubin/fix/2749-remove-fixme runtime: set tags for trace span	2021-10-29 11:42:49 +08:00
Gabriela Cervantes	e610fc82ff	runtime: Remove comments about unsupported features in config for clh Cloud Hypervisor only supports virtio-blk; this PR removes comments that wrongly refer to other features that are not supported by clh. Fixes #2924 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2021-10-28 15:14:49 +00:00
Yujia Qiao	e66d0473be	virtcontainers: simplify read-only mount handling Current handling of read-only mounts is a little tricky. However, a clearer solution can be used here: 1. make a private ro bind mount at privateDest to the mount source 2. make a bind mount at mountDest to the mount created in step 1 3. umount the private bind mount created in step 1 One important aspect is that the mount in step 2 is duplicated from the one we created in step 1. So the MS_RDONLY flag is properly preserved in all mounts created in the propagation. Fixes: #2205 Depends-on: github.com/kata-containers/tests#4106 Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>	2021-10-28 15:48:41 +08:00
Manabu Sugimoto	3be50adab9	agent: Add support for Seccomp The kata-agent supports seccomp feature based on the OCI runtime specification. This seccomp capability in the kata-agent is enabled by default. However, it is not enforced by default: users need to enable that by setting `disable_guest_seccomp` to `false` in the main configuration file. Fixes: #1476 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2021-10-27 19:06:13 +09:00
Wainer dos Santos Moschetta	309dae631a	virtcontainers: check that both initrd and image are not set This changed valid() in hypervisor to check the case where both initrd and image path are set; in this case it returns an error. Fixes #1868 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2021-10-26 10:44:23 -04:00
bin	5f306330f4	virtcontainers: delete duplicated notify in watchHypervisor function When hypervisor check failed, the notify function is called twice. Fixes: #2901 Signed-off-by: bin <bin@hyper.sh>	2021-10-26 11:58:26 +08:00
Yujia Qiao	6cc8000cae	cli: Show available guest protection in env output Show available guest protections in the `kata-runtime env` output. Also bump the formatVersion. Fixes: #1982 Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>	2021-10-25 21:44:56 +08:00
Yujia Qiao	2063b13805	virtcontainers: Add func AvailableGuestProtections Add functions to return guestProtection as a string slice, which can be then used in `kata-runtime env` output. Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>	2021-10-25 21:44:01 +08:00
James O. D. Hunt	ec3aa1694b	Merge pull request #2844 from jongwu/unit_test enable unit test on arm	2021-10-25 10:58:21 +01:00
Bin Liu	ded864f862	Merge pull request #2568 from Bevisy/main-2254 cli: Fix outdated kata-runtime bash completion	2021-10-25 14:02:13 +08:00
David Gibson	a0825badf6	Merge pull request #2795 from dgibson/vfio-as-vfio Allow VFIO devices to be used as VFIO devices in the container	2021-10-25 14:25:26 +11:00
David Gibson	34273da98f	runtime/device: Allow VFIO devices to be presented to guest as VFIO devices On a conventional (e.g. runc) container, passing in a VFIO group device, /dev/vfio/NN, will result in the same VFIO group device being available within the container. With Kata, however, the VFIO device will be bound to the guest kernel's driver (if it has one), possibly appearing as some other device (or a network interface) within the guest. This adds a new `vfio_mode` option to alter this. If set to "vfio" it will instruct the agent to remap VFIO devices to the VFIO driver within the guest as well, meaning they will appear as VFIO devices within the container. Unlike a runc container, the VFIO devices will have different names to the host, since the names correspond to the IOMMU groups of the guest and those can't be remapped with namespaces. For now we keep 'guest-kernel' as the value in the default configuration files, to maintain current Kata behaviour. In future we should change this to 'vfio' as the default. That will make Kata's default behaviour more closely resemble OCI specified behaviour. fixes #693 Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-10-25 12:29:31 +11:00
David Gibson	68696e051d	runtime: Add parameter to constrainGRPCSpec to control VFIO handling Currently constrainGRPCSpec always removes VFIO devices from the OCI container spec which will be used for the inner container. For upcoming support for VFIO devices in DPDK usecases we'll need to not do that. As a preliminary to that, add an extra parameter to the function to control whether or not it will remove the VFIO devices from the spec. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-10-25 12:29:31 +11:00
David Gibson	d9e2e9edb2	runtime: Rename constraintGRPCSpec to improve grammar "constraint" is a noun, "constrain" is the associated verb, which makes more sense in this context. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-10-25 12:29:31 +11:00
David Gibson	57ab408576	runtime: Introduce "vfio_mode" config variable and annotation In order to support DPDK workloads, we need to change the way VFIO devices will be handled in Kata containers. However, the current method, although it is not remotely OCI compliant, has real uses. Therefore, introduce a new runtime configuration field "vfio_mode" to control how VFIO devices will be presented to the container. We also add a new sandbox annotation - io.katacontainers.config.runtime.vfio_mode - to override this on a per-sandbox basis. For now, the only allowed value is "guest-kernel" which refers to the current behaviour where VFIO devices added to the container will be bound to whatever driver in the VM kernel claims them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-10-25 12:29:29 +11:00
Jianyong Wu	1a96b8ba35	template: disable template unit test on arm Template is broken on arm. here we disable the template unit test temporarily. Fixes: #2809 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2021-10-23 15:07:25 +08:00
Jianyong Wu	43b13a4a6d	runtime: DefaultMaxVCPUs should not be greater than defaultMaxQemuVCPUs DefaultMaxVCPUs may be larger than defaultMaxQemuVCPUs; this should be checked and avoided. Fixes: #2809 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2021-10-23 15:07:25 +08:00
Jianyong Wu	c59c36732b	runtime: current vcpu number should be limited The physical current vcpu number should not be used directly as the largest vcpu number is limited to defaultMaxQemuVCPUs. Here, a new helper is introduced in pkg/katautils/config.go to get current vcpu number. Fixes: #2809 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2021-10-23 15:07:25 +08:00
Jianyong Wu	fa922517d9	runtime: kernel version with '+' as suffix panic in parse The current kernel version parse lib can't process suffix '+', as the modified kernel version will add '+' as suffix, thus panic will occur. For example, if the current kernel version is "5.14.0-rc4+", test TestHostNetworkingRequested will panic: --- FAIL: TestHostNetworkingRequested (0.00s) panic: &{DistroName:ubuntu DistroVersion:18.04 KernelVersion:5.11.0-rc3+ Issue: Passed:[] Failed:[] Debug:true ActualEUID:0}: failed to check test constraints: error: Build meta data is empty Here, remove the suffix '+' in kernel version fix helper. Fixes: #2809 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2021-10-23 15:07:25 +08:00
Manohar Castelino	52268d0ece	hypervisor: Expose the hypervisor itself Export the top level hypervisor type s/hypervisor/Hypervisor Fixes: #2880 Signed-off-by: Manohar Castelino <mcastelino@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-22 16:46:02 -07:00
Eric Ernst	a72bed5b34	hypervisor: update tests based on createSandbox->CreateVM change Fixup a couple of broken tests. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	f434bcbf6c	hypervisor: createSandbox is CreateVM Last of a series of commits to export the top level hypervisor generic methods. s/createSandbox/CreateVM Fixes #2880 Signed-off-by: Manohar Castelino <mcastelino@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	76f1ce9e30	hypervisor: startSandbox is StartVM s/startSandbox/StartVM Signed-off-by: Manohar Castelino <mcastelino@apple.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	fd24a695bf	hypervisor: waitSandbox is waitVM renaming... Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	a6385c8fde	hypervisor: stopSandbox is StopVM Renaming. There is no Sandbox specific logic except tracing. Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	f989078cd2	hypervisor: resumeSandbox is ResumeVM renaming... Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	73b4f27c46	hypervisor: saveSandbox is SaveVM rename Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	7308610c41	hypervisor: pauseSandbox is nothing but PauseVM renaming Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	8f78e1cc19	hypervisor: The SandboxConsole is the VM's console update naming Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	4d47aeef2e	hypervisor: Export generic interface methods This is in preparation for creating a separate hypervisor package. Non functional change. Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
Manohar Castelino	6baf2586ee	hypervisor: Minimal exports of generic hypervisor internal fields Export commonly used hypervisor fields and utility functions. These need to be exposed to allow the hypervisor to be consumed externally. Note: This does not change the hypervisor interface definition. Those changes will be separate commits. Signed-off-by: Manohar Castelino <mcastelino@apple.com>	2021-10-22 16:45:35 -07:00
GabyCT	03877f3479	Merge pull request #2872 from likebreath/1020/clh_v19.0 Upgrade to Cloud Hypervisor v19.0	2021-10-21 10:26:55 -05:00
James O. D. Hunt	09741272bc	Merge pull request #2783 from likebreath/1001/clh_enable_seccomp virtcontainers: clh: Enable the `seccomp` feature	2021-10-21 09:21:33 +01:00
Bo Chen	8030b6caf0	virtcontainers: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v19.0. Note: The client code of cloud-hypervisor's (CLH) OpenAPI is automatically generated by openapi-generator [1-2]. [1] https://github.com/OpenAPITools/openapi-generator [2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-10-20 15:48:55 -07:00
Binbin Zhang	4f018b5287	runtime: delete useless src/runtime/cli/exit.go simply use os.Exit() replace exit() delete useless ci/go-no-os-exit.sh; Fixes: #2295 Signed-off-by: Binbin Zhang <binbin36520@gmail.com>	2021-10-20 11:42:37 +08:00
Chelsea Mafrica	4ce2b14e60	Merge pull request #2817 from jodh-intel/clh+fc-agent-tracing Enable agent tracing for hybrid VSOCK hypervisors	2021-10-18 22:01:52 -07:00
Bin Liu	72d1a04cf1	Merge pull request #2761 from liubin/fix/2752-optimize-test-code runtime: optimize test code	2021-10-19 12:21:04 +08:00
bin	273a1a9ac6	runtime: optimize test code This PR includes these optimize changes: - Remove the dependency on the container engine. The old code uses runc to generate config.json and Docker to export rootfs, that will be heavy and need additional dependency. Using a fixed config for busybox image can avoid the heavy processing above. - Moved duplicate code to pkg/katatestutils package Fixes: #2752 Signed-off-by: bin <bin@hyper.sh>	2021-10-19 09:54:49 +08:00
bin	76f16fd1a7	runtime: use containerd package instead of cri-containerd cri-containerd project has been merged into containerd repo, and we should not reference it any more in code and docs. This commit will use containerd package instead of cri-containerd package. Fixes: #2791 Signed-off-by: bin <bin@hyper.sh>	2021-10-19 09:40:20 +08:00
James O. D. Hunt	41c49a7bf5	Merge pull request #2771 from fengwang666/debug-pid runtime: update sandbox root dir cleanup behavior in rootless hypervisor	2021-10-18 17:47:47 +01:00
Julien Ropé	17a8c5c685	runtime: Fix random failure for TestIoCopy When running the TestIoCopy test, on some occasions, the test runs too quickly, and closes the stdin pipe before the ioCopy() routine starts to read from it. This causes a SIGSEGV error. To fix this issue, I am adding additional read/write tests before closing the pipes. As the read operation waits for the writer to be done, this actually synchronizes the threads and makes sure the final tests (with closed pipes) work as expected. Fixes: #2042 Signed-off-by: Julien Ropé <jrope@redhat.com>	2021-10-18 15:25:57 +02:00
Bin Liu	c2be2dfb61	Merge pull request #2848 from c3d/bug/2847-tag-typo tracing: Fix typo in "package" tag name	2021-10-18 14:50:47 +08:00
Chelsea Mafrica	6ffe9e5afe	Merge pull request #2816 from cmaf/add-var-name-kata runtime: change name in config settings back to "kata"	2021-10-15 14:09:41 -07:00
Christophe de Dinechin	bcffa26305	tracing: Fix typo in "package" tag name The tracing tags for api.go contain `"packages"` as a tag name, whereas all other tags contain `"package"`. Fixes: #2847 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2021-10-15 14:48:00 +02:00
James O. D. Hunt	e61f5e2931	runtime: Show socket path in kata-env output Display a pseudo path to the sandbox socket in the output of `kata-runtime env` for those hypervisors that use Hybrid VSOCK. The path is not a real path since the command does not create a sandbox. The output includes a `{ID}` tag which would be replaced with the real sandbox ID (name) when the sandbox was created. This feature is only useful for agent tracing with the trace forwarder where the configured hypervisor uses Hybrid VSOCK. Note that the features required a new `setConfig()` method to be added to the `hypervisor` interface. This isn't normally needed as the specified hypervisor configuration passed to `setConfig()` is also passed to `createSandbox()`. However the new call is required by `kata-runtime env` to display the correct socket path for Firecracker. The new method isn't wholly redundant for the main code path though as it's now used by each hypervisor's `createSandbox()` call. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-10-15 11:45:29 +01:00
James O. D. Hunt	321be0f794	tracing: Remove trace mode and trace type Remove the `trace_mode` and `trace_type` agent tracing options as decided in the Architecture Committee meeting. See: - https://github.com/kata-containers/kata-containers/pull/2062 Fixes: #2352. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-10-15 10:09:38 +01:00
Bin Liu	8be85fda4f	Merge pull request #2775 from fgiudici/kata-monitor_issue2292 kata-monitor: add index page	2021-10-14 09:12:57 +08:00
Bo Chen	7b2bfd4eca	virtcontainers: clh: Use 'quiet' as the default kernel parameter The 'quiet' kernel parameter can avoid guest kernel logs while booting, which can reduce boot time. Fix: #2820 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-10-11 22:06:27 -07:00
Bo Chen	3e24e46c70	virtcontainers: clh: Turn-off serial and virtio-console by default We will need to have console output from the guest only for debugging purposes. As a result, we can turn-off both the serial and virtio-console devices by default for better boot time. Fixes: #2820 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-10-11 22:06:23 -07:00
Chelsea Mafrica	3f95469a78	runtime: logging: Add variable for syslog tag The variable for 'name' in config-settings.go.in was previously hardcoded as "kata". In `e7c42fb` it was changed to the runtime name, which is "kata-runtime". Add a variable to specify a syslog identifier for consistency for tests and documentation that use it. Fixes #2806 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2021-10-11 02:12:13 -07:00
Feng Wang	adc9e0baaf	runtime: fix two bugs in rootless hypervisor Update the sandbox dir clean up logic to be more appropriate Add different seeds for randInt() method Fixes #2770 Signed-off-by: Feng Wang <feng.wang@databricks.com>	2021-10-08 15:52:42 -07:00
Bo Chen	51cbe14584	runtime: Add option "disable_seccomp" to config hypervisor.clh This patch adds an option "disable_seccomp" to the config hypervisor.clh, from which users can disable the `seccomp` feature from Cloud Hypervisor when needed (for debugging purposes). Fixes: #2782 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-10-08 15:10:30 -07:00
Bo Chen	98b7350a1b	virtcontainers: clh: Enable the `seccomp` feature This patch enables the `seccomp` feature from Cloud Hypervisor which provides fine-grained allowed syscalls for each of its worker threads. It brings important security benefits, though it would increase the memory footprint. Fixes: #2782 Signed-off-by: Bo Chen <chen.bo@intel.com>	2021-10-08 15:07:43 -07:00
bin	5c77cc2c49	runtime: don't start shim management server in tests The shim management server runs in a goroutine; in test mode this causes the directory containing the listen socket file (/run/vc/sbs/777-77-77777777/shim-monitor.sock) to leak after the tests finish. Fixes: #2805 Signed-off-by: bin <bin@hyper.sh>	2021-10-08 18:41:53 +08:00
Fupan Li	988eb95621	Merge pull request #2760 from liubin/fix/2759-optimize-code-for-managing-temp-users runtime: optimize code for managing temp users for rootless mode	2021-10-08 13:49:14 +08:00
bin	bf8f582c1d	runtime: optimize code for managing temp users for rootless mode This commit makes two changes: - move code for managing temp users to rootless.go. - use common function in qemu.go when shutting down the VM. Fixes: #2759 Signed-off-by: bin <bin@hyper.sh>	2021-10-08 11:04:21 +08:00
Bin Liu	10ec4b133c	Merge pull request #2742 from liubin/fix/2741-delete-file-code Delete file virtcontainers-setup.sh	2021-10-07 11:54:47 +08:00
Fabiano Fidêncio	4cde619c68	Merge pull request #2797 from fidencio/wip/upgrade-vendored-containerd vendor: Update containerd to v1.5.7	2021-10-06 21:05:44 +02:00
Chelsea Mafrica	6e3fcce2a2	Merge pull request #2748 from liubin/fix/2747-add-test runtime: Optimize func noNeedForOutput and add test cases	2021-10-06 11:24:57 -07:00
Jianyong Wu	7eac2ec786	protection: add confidential compute frame for arm Even though CCA, the confidential compute architecture, is not ready yet, add an empty implementation to avoid a static check error. Fixes: #2789 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Suggested-by: Fabiano Fidêncio <fidencio@redhat.com>	2021-10-06 15:53:36 +02:00
Jianyong Wu	8acfc154de	check: fix typecheck failure in qemu_arm64_test.go fix typecheck failure in qemu_arm64_test.go Fixes: #2789 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2021-10-06 15:53:35 +02:00
Amulya Meka	5b02d54e23	virtcontainers: fix lint failure on ppc64le Add nolint for arch specific code to exclude from lint check. Fixes: #2773 Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>	2021-10-06 15:53:35 +02:00
Jakob Naucke	ff9728f032	virtcontainers: nolint guestProtection Exclude from lint checking for it is ultimately only used in architecture-specific code. Fixes: #2273 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-10-06 15:53:35 +02:00
Jakob Naucke	5c138c8f12	runtime: Fix field alignment on s390x Follow-up of #2237 for s390x -- field alignment isn't always minimal Fixes: #2773 Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2021-10-06 15:53:35 +02:00
Fabiano Fidêncio	191d001610	vendor: Update containerd to v1.5.7 Bump containerd to v1.5.7 in order to bring in a fix for CVE-2021-41103, "insufficiently restricted permissions on plugin directories (https://github.com/advisories/GHSA-c2h3-6mxw-7mvq)". dependabot found a potential security vulnerability and raised a PR to fix it. However, dependabot does not properly follow nor understand the needs of our CIs (mainly related to formatting the PR and whatnot), thus I'm re-raising it. Fixes: #2796 Supersedes: #2787 Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>	2021-10-06 10:40:43 +02:00
Eric Ernst	2bc7561561	Merge pull request #2769 from sameo/topic/agent-route Pass the host route IP family to the guest	2021-10-05 07:20:33 -07:00
Bin Liu	f7f6bd0142	kata-monitor: add index page Add an index page to the kata-monitor endpoint. Porting of https://github.com/liubin/kata-containers/commit/a45aa0696d55 Fixes: #2292 Signed-off-by: Francesco Giudici <fgiudici@redhat.com>	2021-10-04 18:13:56 +02:00
Samuel Ortiz	71ce6cfe9e	runtime: Pass the route IP family to the agent When updating the guest routing table, we should forward the IP family information up to the guest. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2021-10-01 14:35:17 +02:00
Samuel Ortiz	99450bd1f7	agent: protos: Add a Family field to the Route payload Our check for the IP family is working as long as we have either a gateway or a destination IP. Some routes are missing both. The RT netlink messages provide the IP family information for each route, so we can carry that piece of information up to the guest. That will allow for a more reliable route IP family determination. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2021-10-01 14:35:17 +02:00
Samuel Ortiz	f85fe70231	runtime: vendor: Bump the netlink package dependency We need to be able to get the IP family from the netlink route messages, and the Route.Family field only got recently added to the netlink package. The update generates static check warnings about the call for nethandler.Delete() being deprecated in favor of a Close() call instead. So we include the s/Delete()/Close()/ change as part of this PR. Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>	2021-10-01 14:35:01 +02:00
Amulya Meka	e439cec7c5	cmd: fix field alignment on ppc64le Optimising structure field alignment. Fixes: #2779 Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>	2021-10-01 11:45:27 +00:00
Amulya Meka	e5159ea755	cmd: get return value for setCPUtype Accept and assert the return value in testSetCPUTypeGeneric. Fixes: #2779 Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>	2021-10-01 11:44:14 +00:00
James O. D. Hunt	2ce8d4263c	clh: Suppress hypervisor output to make guest output visible Reduce the cloud-hypervisor log level from `Debug` to `Info` when hypervisor debug is enabled. This is required since `Debug` level: - Is overkill for debugging hypervisor failures. - Effectively hides the output from the guest kernel and userland: CLH generates so much output that the output from the guest gets "lost in the noise" (experiments show that for each full CLH debug message, at most 1 _byte_ of guest output is displayed). Fixes: #2726. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2021-09-30 14:22:09 +01:00
Jakob Naucke	8739a73dd3	Merge pull request #2736 from Amulyam24/kata-check-test cmd: Fix mismatched types in testModuleData	2021-09-30 10:20:19 +02:00
bin	762922a521	runtime: delete func ConstraintsToVCPUs ConstraintsToVCPUs is not used any more. Fixes: #2741 Signed-off-by: bin <bin@hyper.sh>	2021-09-30 14:44:41 +08:00

2076 Commits