Compare commits

..

304 Commits

Author SHA1 Message Date
Fabiano Fidêncio
2e54c8e887 Merge pull request #5921 from fidencio/3.1.0-alpha1-branch-bump
# Kata Containers 3.1.0-alpha1
2022-12-19 15:45:53 +01:00
Bin Liu
6039516802 Merge pull request #5925 from xinydev/fix-docs
docs: Remove duplicate sentences
2022-12-19 17:12:15 +08:00
Peng Tao
473f5ff7da Merge pull request #5861 from mflagey/Docs_Change_build_virtiofsd_in_developer_guide_#5860
docs: Update virtiofsd build script in the developer guide
2022-12-19 17:02:35 +08:00
Bin Liu
0cf443a612 Merge pull request #5915 from openanolis/legacy_device
dragonball: refactor legacy device initialization
2022-12-19 13:31:45 +08:00
Xin Yang
74fa10a235 docs: remove duplicate sentences
remove duplicate sentences in spdk docs
Fixes: #5926

Signed-off-by: Xin Yang <xinydev@gmail.com>
2022-12-17 11:26:36 +00:00
Bin Liu
e4645642d0 Merge pull request #5877 from openanolis/fix_start_bundle
runtime-rs: enable start container from bundle
2022-12-17 08:10:08 +08:00
Wainer Moschetta
339ef99669 Merge pull request #5867 from Alex-Carter01/sev_module_unload
kernel building: Add module unload to SEV kernel config
2022-12-16 17:17:53 -03:00
Alex Carter
9f465a58af kernel: Add "unload" module to SEV config
Fixes: #5866
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
2022-12-16 16:56:56 +00:00
Fabiano Fidêncio
b0896126cf release: Kata Containers 3.1.0-alpha1
- tools: Add some new gitignore items
- shim: return hypervisor's pid not shim's pid
- Dragonball: introduce upcall
- refactor(shim-mgmt): move client side to libs
- kata-ctl: Add --list option
- kata-ctl: check: only-list-releases and include-all-releases options
- basic framework for QEMU support in runtime-rs
- tools: Fix indentation on build kernel script
- runtime-rs: fix standalone share fs
- runtime-rs: fix sandbox_pidns calculation and oci spec amending
- runtime,agent: Add SELinux support for containers inside the guest
- kata-sys-util: fix issues where umount2 couldn't get the correct path
- agent: Drop the Option for LinuxContainer.cgroup_manager
- dragonball: enable kata3.0/dragonball CI on Arm
- fix kata deploy error after node reboot.
- tools: Fix indentation for ovmf script
- runtime: prevent waiting 50 ms minimum for a process exit
- runtime-rs: fix high cpu
- agent: remove `sysinfo` dependency
- runtime-rs: bind mount volumes in sandbox level
- docs: Update the rust version in the installation documentation
- runtime-rs: fix some variable names and typos
- kata-ctl: add host check for aarch64
- kata-ctl: fix dependency version conflict
- workflow: fix cargo-deny-runner.yaml syntax error
- runtime: Add identification in version for runtime-rs
- workflow: call cargo in user's $PATH
- runtime-rs: remove the version number from the commit display message
- runk: Re-implement start operation using the agent codes
- build: update golang version to 1.19.3
- snap: Fix snapcraft setup (unbreak snap releases)
- fix(agent): fix iptables binary path in guest
- runtime-rs: moving only vCPU threads into sandbox controller
- tools: Remove extra tab spaces from kata deploy binaries script
- ci: let static checks don't depend on build
- actions: use matrix to refactor static checks
- agent: support systemd cgroup for kata agent.
- actions: skip some jobs using "paths-ignore" filter
- runtime: go fix code for 1.19
- doc: update runtime-rs "Build and Install"
- runtime: don't fail mkdir if the folder is already created by another process
- kernel: add CONFIG_X86_SGX into whitelist
- runtime-rs: block on the current thread when setup the network to avoid be take over by other task
- Refactor(runtime-rs): add conditional compile for virt-sandbox persist
- runtime: add log record to the qemu config method `appendDevices` for…
- runtime: Use containerd v1.6.8
- tools: Fix indentation of build static firecracker script
- package: add nydus to release artifacts
- agent: check if command exist before do ip_tables test
- runtime: Support virtiofs queue size for qemu and make it configurable
- docs: change mount-info.json to mountInfo.json
- docs: update doc "NVIDIA GPU passthrough"
- runtime-rs: support vhost-vsock
- utils: Add utility function to fetch the kernel version.
- versions: update nydusd version
- runtime-rs: support nydus v5 and v6 rootfs
- Upgrade to Cloud Hypervisor v28.0
- docs: update doc "Setup swap device in guest kernel"
- Rust fixes + Golang bump
- clh: avoid race condition when stopping clh
- tools: Fix indentation of build static virtiofsd script
- docs: Fix configuration path
- runtime-rs : fix the shim source in the documentation test is ambiguous
- versions: update vmm-sys-util and related crates to v0.11.0
- runtime-rs: delete all cargo patches
- feat(shim-mgmt): iptables handler
- tools: Remove empty spaces from build kernel script
- Built-in Sandbox: add more unit tests for dragonball. Part 3
- Dragonball: enable mem_file_path config into hugetlbfs process
- runtime-rs:add hypervisor interface capabilities
- cloud-hypervisor: Fix GetThreadIDs function
- github: Parallelise static checks
- runtime-rs: blanks filled & fixes made to virtiofsd launch
- vCPUs pinning support for Kata Containers
- runtime-rs: fix shared volume permission issue
- runk: Ignore an error when calling kill cmd with --all option
- runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
- snap: Unbreak docker install
- add EnterNetNS in virtcontainers
- tools: Fix indentation of build static clh script
- virtiofsd: Not use "link-self-contained=yes" on s390x
- Kata ctl drop privs
- versions: bump golangci-lint version
- runtime-rs: generate config files with the default target
- docs: Fix volumeMounts in SGX usage example
- versions: Update Cloud Hypervisor to b4e39427080
- docs: update rust runtime installation guide
- rustjail: Upgrade libseccomp crate to v0.3.0
- makefile: remove sudo when create symbolic link
- agent: remove redundant checks
- shim: Ensure pagesize is set when reporting hugetlb stats
- kata-ctl: Re-enable network tests on s390x (fixes 5438)
- agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
- fix readme content error at doc directory
- agent: validate hugepage size is supported
- Makefile: fix an typo in runtime-rs makefile
- qemu: Re-work static-build Dockerfile
- Modify agent-url return value in runtime-rs
- runtime-rs: regulate the comment in runtime-rs makefile
- doc: Update how-to-run-kata-containers-with-SNP-VMs.md
- kata-ctl: Disable network check on s390x
- virtiofsd: Build inside a container
- Dragonball: remove redundant comments in event manager
- versions: Update TDX QEMU
- runtime-rs: fix typo get_contaier_type to get_container_type
- kata-ctl: improve command descriptions for consistency
- runtime-rs: force shutdown shim process in it can't exit
- versions: Update TDX kernel
- ci: skip s390x for dragonball.
- Dragonball: delete redundant comments in blk_dev_mgr
- kata-ctl: Move development to main branch
- runtime-rs: support ephemeral storage for emptydir
- docs: fix a typo in rust-runtime-installation-guide
- Built-in Sandbox: add more unit tests for dragonball
- readme: remove libraries mentioning

b5cfd0958 kata-ctl: Fixed format for check release options
fbf294da3 refactor(shim-mgmt): move client side to libs
ae0dcacd4 tools: Add some new gitignore items
99485d871 shim: return hypervisor's pid not shim's pid
1f28ff683 runtime-rs: add binary to exercise shim proper w/o containerd dependencies
eb8c9d38f runtime-rs: add launch of a simple qemu process to start_vm()
2f6d0d408 runtime-rs: support qemu in VirtContainer
1413dfe91 runtime-rs: add basic empty boilerplate for qemu driver
a81ced0e3 upcall: add upcall into kernel build script
f5c34ed08 Dragonball: introduce upcall
8dbfc3dc8 kata-ctl: Fixed format for check release options
f3091a9da kata-ctl: Add kata-ctl check release options
a577df8b7 tools: Fix indentation on build kernel script
b087667ac kata-deploy: Fix the pod of kata deploy starts to occur an error
79cf38e6e runtime-rs: clear OCI spec namespace path
62f4603e8 runtime-rs: reset rdma cgroup
5b6596f54 runtime-rs: CreateContainerRequest has Default
e9e82ce28 runtime-rs: fix is_pid_namespace_enabled check
8079a9732 kata-sys-util: fix issues where umount2 couldn't get the correct path
4661ea8d3 runtime-rs: fix standalone share fs
c5abc5ed4 config: speed up rng init when kernel boot for arm64
3e6114b2e tools: Fix indentation for ovmf script
7fdbbcda8 agent: Drop the Option for LinuxContainer.cgroup_manager
d04d45ea0 runtime: use pidfd to wait for processes on Linux
e9ba0c11d runtime: use exponential backoff for process wait
748f22e7d agent: remove sysinfo dependency
0019d653d runtime-rs: fix high cpu
46b38458a docs: Update the rust version in the installation documentation
71491a69c runtime: move process wait logic to another function
92ebe61fe runtime: reap force killed processes
fdf0a7bb1 runtime-rs: fix the issues mentioned in the code review
1d823c4f6 runtime-rs: umount and permission controls in sandbox level
527b87141 runtime-rs: bind mount volumes in sandbox level
9ccf2ebe8 agent: add signal value to log
fb2c142f1 runtime-rs: fix some variable names and typos
737420469 kata-ctl: fix dependency version conflict
89574f03f workflow: call cargo in user's $PATH
d4321ab48 runtime: Add identification in version for runtime-rs
f7fc436be workflow: fix cargo-deny-runner.yaml syntax error
78532154d docs: Add description for guest SELinux support
c617bbe70 runtime: Pass SELinux policy for containers to the agent
935476928 agent: Add SELinux support for containers
a75f99d20 osbuilder: Create guest image for SELinux
a9c746f28 kernel: Add kernel configs for SELinux
86cb05883 snap: Fix snapcraft setup (unbreak snap releases)
f443b7853 build: update golang version to 1.19.3
e12db92e4 runk: Re-implement start operation using the agent codes
e723bad0a ci: let static checks don't depend on build
69aae0227 actions: use matrix to refactor static checks
a5e4cad4b kata-ctl: add host check for aarch64
2edbe389d runtime-rs: moving only vCPU threads into sandbox controller
340e24f17 actions: skip some job using "paths-ignore" filter
2426ea9bd doc: update runtime-rs "Build and Install"
67fe703ff runtime-rs: remove the version number from the commit display message
1d93a9346 fix(agent): fix iptables binary path in guest
1dfd845f5 runtime: go fix code for 1.19
cd85a44a0 tools: Remove extra tab spaces from kata deploy binaries script
cb199e0ec kernel: add CONFIG_X86_SGX into whitelist
4b45e1386 runtime: don't fail mkdir if the folder is already created
b987bbc57 runtime-rs: block on the current thread when setup the network
abb9ebeec package: add nydus to release artifacts
30a7ebf43 runtime: Log invalid devices in QEMU config
2539f3186 runtime: Use containerd v1.6.8
993d05a42 docs: change mount-info.json to mountInfo.json
d808adef9 runtime-rs: support vhost-vsock
6b2ef66f0 runtime-rs: add conditional compile for virt-sandbox persist
6c1e153a6 docs: update doc "NVIDIA GPU passthrough"
b53171b60 agent: check command before do test_ip_tables
a636d426d versions: update nydusd version
3bb145c63 runtime: Support virtiofs queue size for qemu and make it configurable
e80a9f09f utils: Add utility function to fetch the kernel version.
36545aa81 runtime: clh: Re-generate the client code
f4b02c224 versions: Upgrade to Cloud Hypervisor v28.0
e4a6fbadf docs: update doc "Setup swap device in guest kernel"
2f5f575a4 log-parser: Simplify check
d94718fb3 runtime: Fix gofmt issues
16b837509 golang: Stop using io/ioutils
66aa330d0 versions: Update golangci-lint
b3a4a1629 versions: bump containerd version
eab8d6be1 build: update golang version to 1.19.2
e80dbc15d runtime-rs: workaround Dragonball compilation problem
c3f1922df fix(fmt): fix cargo fmt to pass static check
a4099dab8 tools: Fix indentation of build static firecracker script
c46814b26 runtime-rs:support nydus v5 and v6
a04afab74 qemu: early exit from Check if the process was stopped
7e481f217 qemu: set stopped only if StopVM is successful
0e3ac66e7 clh: return faster with dead clh process from isClhRunning
9ef68e0c7 clh: fast exit from isClhRunning if the process was stopped
2631b08ff clh: don't try to stop clh multiple times
f45fe4f90 versions: update vmm-sys-util and related crates to v0.11.0
8be081730 tools: Fix indentation of build static virtiofsd script
f8f97c1e2 feat(shim-mgmt): iptables handler
29c75cf12 runtime-rs: delete all cargo patches
9f70a6949 tools: Remove empty spaces from build kernel script
57336835d dragonball: add more unit test for device manager
233370023 dragonball: add test utils.
3e9c3f12c docs: Fix configuration path
2adb1c182 Dragonball: enable mem_file_path config into hugetlbfs process
daeee26a1 cloud-hypervisor: Fix GetThreadIDs function
40d514aa2 github: Parallelise static checks
2508d39b7 runtime: added vcpus pinning logics Core VCPU threads pinning logics for issue 4476. Also provided docs.
fef8e92af runtime-rs:add hypervisor interface capabilities
27b191358 runtime-rs: blanks filled & fixes made to virtiofsd launch
990e6359b snap: Unbreak docker install
ca69a9ad6 snap: Use metadata for dependencies
df092185e runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
16dca4ecd runk: Ignore an error when calling kill cmd with --all option
b74c18024 runtime-rs: fix shared volume permission issue
936fe35ac runtime-rs : fix shim source is ambiguous
0ed7da30d tools: Fix indentation of build static clh script
43fcb8fd0 virtiofsd: Not use "link-self-contained=yes" on s390x The compile option link-self-contained=yes asks rustc to use C library startup object files that come with the compiler, which are not available on the target s390x-unknown-linux-gnu. A build does not contain any startup files leading to a broken executable entry point (causing segmentation fault).
219919e9f docs: Fix volumeMounts in SGX usage example
c0f5bc81b cargo: Add Cargo.lock to version control
474927ec9 gitignore: Add gitignore file
699f821e1 utils: Add function to drop priveleges
a6fb4e2a6 versions: bump golangci-lint version
b015f34af runtime-rs: generate config files with the default target
d7bb4b551 agent: support systemd cgroup for kata agent
144efd1a7 docs: update rust runtime installation guide
abf4f9b29 docs: kata 3.0 Architecture fix readme content error
44d8de892 agent: remove redundant checks
9d286af7b versions: Update Cloud Hypervisor to b4e39427080
081ee4871 agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
e95089b71 kata-ctl: add basic cpu check for s390x
871d2cf2c kata-ctl: Limit running tests to x86 and use native-tls on s390x
cbd84c3f5 rustjail: Upgrade libseccomp crate to v0.3.0
748be0fe3 makefile: remove sudo when create symbolic link
227e717d2 qemu: Re-work static-build Dockerfile
72738dc11 agent: validate hugepage size is supported
f74e328ff Makefile: fix an typo in runtime-rs makefile
f205472b0 Makefile: regulate the comment style for the runtime-rs comments
9f2c7e47c Revert "kata-ctl: Disable network check on s390x"
ac403cfa5 doc: Update how-to-run-kata-containers-with-SNP-VMs.md
00981b3c0 kata-ctl: Disable network check on s390x
39363ffbf runtime: remove same function
c322d1d12 kata-ctl: arch: Improve check call
0bc5baafb snap: Build virtiofsd using the kata-deploy scripts
cb4ef4734 snap: Create a task for installing docker
7e5941c57 virtiofsd: Build inside a container
35d52d30f versions: Update TDX QEMU
4d9dd8790 runtime-rs: fix typo get_contaier_type to get_container_type
70676d4a9 kata-ctl: improve command descriptions for consistency
9eb73d543 versions: Update TDX kernel
00a42f69c kata-ctl: cargo: 2021 -> 2018
fb6327474 kata-ctl: rustfmt + clippy fixes
1f1901e05 dragonball: fix clippy warning for aarch64
a343c570e dragonball: enhance dragonball ci
6a64fb0eb ci: skip s390x for dragonball.
a743e37da Dragonball: delete redundant comments in blk_dev_mgr
2b345ba29 build: Add kata-ctl to tools list
f7010b806 kata-ctl: docs: Write basic documentation
862eaef86 docs: fix a typo in rust-runtime-installation-guide
26c043dee ci: Add dragonball test
781e604c3 docs: Reference kata-ctl README
15c343cbf kata-ctl: Don't rely on system ssl libs
c23584994 kata-ctl: clippy: Resolve warnings and reformat
133690434 kata-ctl: implement CLI argument --check-version-only
eb5423cb7 kata-ctl: switch to use clap derive for CLI handling
018aa899c kata-ctl: Add cpu check
7c9f9a5a1 kata-ctl: Make arch test run at compile time
b63ba66dc kata-ctl: Formatting tweaks
cca7e32b5 kata-ctl: Lint fixes to allow the branch to be built
8e7bb8521 kata-ctl: add code for framework for arch
303fc8b11 kata-ctl: Add unit tests cases
d0b33e9a3 versions: Add kata-ctl version entry
002b18054 kata-ctl: Add initial rust code for kata-ctl
b62b18bf1 dragonball: fix clippy warning
2ddc948d3 Makefile: add dragonball components.
3fe81fe4a dragonball-ut: use skip_if_not_root to skip root case
72259f101 dragonball: add more unit test for vmm actions
9717dc3f7 Dragonball: remove redundant comments in event manager
9c1ac3d45 runtime-rs: return port on agent-url req
89e62d4ed shim: Ensure pagesize is set when reporting hugetbl stats
8d4ced3c8 runtime-rs: support ephemeral storage for emptydir
046ddc646 readme: remove libraries mentioning
86ad832e3 runtime-rs: force shutdown shim process in it can't exit

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-12-16 09:12:07 +01:00
Zhongtao Hu
21ec766d29 docs: add documents for using bundle to start container
add document for using bundle to start container

Fixes:#5872
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-16 11:13:25 +08:00
Yushuo
d14c3af35c dragonball: refactor legacy device initialization
If the serial path is given, legacy_manager should create socket console
based on that path. Or the console should be created based on stdio.

Fixes: #5914

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2022-12-15 20:55:01 +08:00
Fabiano Fidêncio
1d266352ea Merge pull request #5902 from Bevisy/fix-too-many-git-file
tools: Add some new gitignore items
2022-12-15 11:29:32 +01:00
Zhongtao Hu
ca39a07a14 runtime-rs: enable start container from bundle
enable start container from bundle in this way

$ ls ./bundle
config.json  rootfs
$ sudo ctr run -d --runtime io.containerd.kata.v2 --config bundle/config.json test_kata

Fixes:#5872
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-12-15 17:28:13 +08:00
Peng Tao
ebb73df6bc Merge pull request #5899 from Bevisy/fix-outdated-comments
shim: return hypervisor's pid not shim's pid
2022-12-15 14:55:54 +08:00
Peng Tao
7210905deb Merge pull request #5712 from openanolis/chao/upcall
Dragonball: introduce upcall
2022-12-15 14:44:56 +08:00
Chao Wu
fad229b853 Merge pull request #5875 from Ji-Xinyou/xyji/refactor-shim-mgmt
refactor(shim-mgmt): move client side to libs
2022-12-15 10:59:45 +08:00
David Esparza
1dbd6c8057 Merge pull request #5735 from dborquez/kata-ctl-cli-list
kata-ctl: Add --list option
2022-12-14 15:03:21 -06:00
Alex
b5cfd09583 kata-ctl: Fixed format for check release options
Fixed formatting for check release options

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2022-12-14 09:42:57 -06:00
James O. D. Hunt
2e15af777c Merge pull request #5786 from alexlee-23/main
kata-ctl: check: only-list-releases and include-all-releases options
2022-12-14 11:25:36 +00:00
Ji-Xinyou
fbf294da3f refactor(shim-mgmt): move client side to libs
The client side is moved to libs. This is to solve the problem
that including clients will bring about messy dependencies.

Fixes: #5874
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-12-14 17:42:25 +08:00
Peng Tao
856d4b7361 Merge pull request #5798 from pmores/qemu-support
basic framework for QEMU support in runtime-rs
2022-12-14 15:05:33 +08:00
Binbin Zhang
ae0dcacd4a tools: Add some new gitignore items
Add some new ignore items to avoid local builds that cause git to track a lot of files

Fixes: #5900

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2022-12-14 11:38:23 +08:00
Binbin Zhang
99485d871c shim: return hypervisor's pid not shim's pid
update outdated code comments

Fixes: #3234

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2022-12-14 11:16:11 +08:00
GabyCT
b637d12d19 Merge pull request #5884 from GabyCT/topic/fixbuildscript
tools: Fix indentation on build kernel script
2022-12-13 15:28:24 -06:00
Chao Wu
bb4be2a666 Merge pull request #5690 from yipengyin/fix-virtiofsd
runtime-rs: fix standalone share fs
2022-12-14 00:16:10 +08:00
Pavel Mores
1f28ff6838 runtime-rs: add binary to exercise shim proper w/o containerd dependencies
After building the binary as usual with `cargo build` run it as follows.

It needs a configuration.toml in which only qemu keys `path`, `kernel`
and `initrd` will initially need to be set.  Point them to respective
files e.g. from a kata distribution tarball.

It also needs to be launched from an exported container bundle
directory.  One can be created by running

mkdir rootfs
podman export $(podman create busybox) | tar -C ./rootfs -xvf -
runc spec -b .

in a suitable directory.

Then launch the program like this:

KATA_CONF_FILE=/path/to/configuration-qemu.toml /path/to/shim-ctl

Fixes: #5817

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:55:21 +01:00
Pavel Mores
eb8c9d38ff runtime-rs: add launch of a simple qemu process to start_vm()
The point here is just to get a simplest Kata VM running.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
2f6d0d408b runtime-rs: support qemu in VirtContainer
Added registration of qemu config plugin and support for creating Qemu
Hypervisor instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:54:26 +01:00
Pavel Mores
1413dfe91c runtime-rs: add basic empty boilerplate for qemu driver
This does almost literally nothing so far apart from getting and setting
HypervisorConfig.  It's mostly copied from/inspired by dragonball.

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-12-13 14:53:45 +01:00
Bin Liu
3952fedcd0 Merge pull request #5882 from bergwolf/github/oci-namespaces
runtime-rs: fix sandbox_pidns calculation and oci spec amending
2022-12-13 18:32:02 +08:00
Fabiano Fidêncio
f1381eb361 Merge pull request #4813 from ManaSugi/fix/add-selinux-agent
runtime,agent: Add SELinux support for containers inside the guest
2022-12-13 11:24:53 +01:00
Fupan Li
015674df16 Merge pull request #5873 from justxuewei/fix/umount2
kata-sys-util: fix issues where umount2 couldn't get the correct path
2022-12-13 15:52:32 +08:00
Chao Wu
a81ced0e3f upcall: add upcall into kernel build script
In order to let upcall being used by Kata Container, we need to add
those patches into kernel build script.

Currently, only when experimental (-e) and hypervisor type dragonball
(-t dragonball) are both enabled, that the upcall patches will be
applied to build a 5.10 guest kernel.

example commands: sh ./build-kernel.sh -e -t dragonball -d setup

fixes: #5642

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-12-13 15:44:55 +08:00
Chao Wu
f5c34ed088 Dragonball: introduce upcall
Upcall is a direct communication tool between VMM and guest developed
upon vsock. The server side of the upcall is a driver in guest kernel
(kernel patches are needed for this feature) and it'll start to serve
the requests after the kernel starts. And the client side is in
Dragonball VMM , it'll be a thread that communicates with vsock through
uds.

We want to keep the lightweight of the VM through the implementation of
the upcall, through which we could achieve vCPU hotplug, virtio-mmio
hotplug without implementing complex and heavy virtualization features
such as ACPI virtualization.

fixes: #5642

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-12-13 15:44:47 +08:00
Bin Liu
03b6124fc6 Merge pull request #5848 from Yuan-Zhuo/drop-cgmr-option
agent: Drop the Option for LinuxContainer.cgroup_manager
2022-12-13 12:09:39 +08:00
Alex
8dbfc3dc82 kata-ctl: Fixed format for check release options
Fixed formatting for check release options

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
2022-12-13 03:10:19 +00:00
Bin Liu
add2486259 Merge pull request #5853 from jongwu/test_kata3.0_arm
dragonball: enable kata3.0/dragonball CI on Arm
2022-12-13 11:05:17 +08:00
Alex
f3091a9da4 kata-ctl: Add kata-ctl check release options
This pull request adds kata-ctl check only-list-releases and include-all-releases

Fixes: #5345

Signed-off-by: Alex <alee23@bu.edu>
2022-12-13 03:04:30 +00:00
Gabriela Cervantes
a577df8b71 tools: Fix indentation on build kernel script
This PR fixes the indentation on the build kernel script.

Fixes #5883

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-12 16:37:47 +00:00
Fabiano Fidêncio
740387b569 Merge pull request #5829 from singhwang/main
fix kata deploy error after node reboot.
2022-12-12 14:20:14 +01:00
singhwang
b087667ac5 kata-deploy: Fix the pod of kata deploy starts to occur an error
If a pod of kata is deployed on a machine, after the machine restarts, the pod status of kata-deploy will be CrashLoopBackOff.

Fixes: #5868
Signed-off-by: SinghWang <wangxin_0611@126.com>
2022-12-12 19:11:38 +08:00
Peng Tao
79cf38e6ea runtime-rs: clear OCI spec namespace path
None of the host namespace paths make sense in the guest. Let's clear
them all before sending the spec to the agent.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 11:07:14 +00:00
Peng Tao
62f4603e81 runtime-rs: reset rdma cgroup
We don't support rdma cgroups yet. Let's make sure it is reset to empty.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
5b6596f54e runtime-rs: CreateContainerRequest has Default
We can just use it to initialize the default fields.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:57:24 +00:00
Peng Tao
e9e82ce28b runtime-rs: fix is_pid_namespace_enabled check
We should test is_pid_namespace_enabled before amending the container
spec, where the pid namespace path is cleared and resulting
sandbox_pidns to always being false.

Fixes: #5881
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-12-12 09:54:48 +00:00
Xuewei Niu
8079a9732d kata-sys-util: fix issues where umount2 couldn't get the correct path
Strings in Rust don't have \0 at the end, but C does, which leads to `umount2`
in the libc can't get the correct path. Besides, calling `nix::mount::umount2`
to avoid using an unsafe block is a robust solution.

Fixes: #5871

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2022-12-12 11:50:32 +08:00
Yipeng Yin
4661ea8d3b runtime-rs: fix standalone share fs
Standalone share fs should add virtiofs device in setup_device_before_start_vm
and return the storages to mount the directory in guest. And it uses
hypervisor's jailer root directly instead of jail config.

Besides, we tweaked the parameter, so it adapts to rust version virtiofsd
now. And its cache policy which forbids caching is "never" now,  instead of
"none". Hence, we change the default cache mode.

Fixes: #5655

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2022-12-12 10:58:09 +08:00
GabyCT
67e82804c5 Merge pull request #5865 from GabyCT/topic/fixspacesovmfscript
tools: Fix indentation for ovmf script
2022-12-09 15:33:49 -06:00
Jianyong Wu
c5abc5ed4d config: speed up rng init when kernel boot for arm64
For now, rng init is too slow for kata3.0/dragonball. Enable
random_trust_cpu can speed up rng init when kernel boot.

Fixes: #5870
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-12-09 14:20:18 +08:00
Gabriela Cervantes
3e6114b2ef tools: Fix indentation for ovmf script
This PR fixes the indentation for the ovmf script for packaging.

Fixes #5864

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-12-08 16:12:20 +00:00
Greg Kurz
5ef7ed72ae Merge pull request #5610 from UiPath/fix-process-wait
runtime: prevent waiting 50 ms minimum for a process exit
2022-12-08 11:02:39 +01:00
Mathias Flagey
ebe5c5adf9 docs: Update virtiofsd build script in the developer guide
Script to execute to build virtiofsd has been changed in #5426 but not in the doc. This commit update the developer guide.

Fixes: #5860

Signed-off-by: Mathias Flagey <mathiasflagey1201@gmail.com>
2022-12-08 09:29:10 +01:00
Peng Tao
0a1d1ec2fa Merge pull request #5830 from openanolis/fix-high-cpu
runtime-rs: fix high cpu
2022-12-08 12:16:06 +08:00
Steve Horsman
39394fa2a8 Merge pull request #5844 from jtumber-ibm/patch-1
agent: remove `sysinfo` dependency
2022-12-07 16:35:05 +00:00
Fupan Li
cce316b5e9 Merge pull request #5607 from justxuewei/feat/sandbox-level-volume
runtime-rs: bind mount volumes in sandbox level
2022-12-07 19:23:38 +08:00
Chelsea Mafrica
1ff4185111 Merge pull request #5842 from cyyzero/update_install_guide
docs: Update the rust version in the installation documentation
2022-12-06 23:40:35 -08:00
Yuan-Zhuo
7fdbbcda82 agent: Drop the Option for LinuxContainer.cgroup_manager
Cgroup manager for a container will always be created.
Thus, dropping the option for LinuxContainer.cgroup_manager
is feasible and could simplify the code.

Fixes: #5778

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-12-07 13:40:38 +08:00
Alexandru Matei
d04d45ea05 runtime: use pidfd to wait for processes on Linux
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic

Fixes: #5617

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:31:05 +02:00
Alexandru Matei
e9ba0c11d0 runtime: use exponential backoff for process wait
Initial wait period between checks is 1ms, and the
next ones are min(wait_period*5, 50ms)

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-06 16:30:58 +02:00
James Tumber
748f22e7d0 agent: remove sysinfo dependency
Removes the redundant dependency `sysinfo`.

Fixes: #5843

Signed-off-by: James Tumber <james.tumber@ibm.com>
2022-12-06 10:18:53 +00:00
Quanwei Zhou
0019d653d6 runtime-rs: fix high cpu
Fixed the issue when using nonblocking, the `tokio::io::copy()` needing
to handle EAGAIN, resulting in high CPU usage.

Fixes: #5740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
2022-12-06 14:25:33 +08:00
Chao Wu
326d589ff5 Merge pull request #5822 from liubin/fix/5820-var-name-and-typo
runtime-rs: fix some variable names and typos
2022-12-06 14:24:11 +08:00
Zhongtao Hu
c12bb5008d Merge pull request #5769 from jongwu/check_host_arm
kata-ctl: add host check for aarch64
2022-12-06 14:05:52 +08:00
Chen Yiyang
46b38458af docs: Update the rust version in the installation documentation
Rust version in the installation documentation does not match the
requirements. Just fix it.

Fixes: #5841

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2022-12-06 12:50:32 +08:00
Chao Wu
538bddf4ee Merge pull request #5811 from tzY15368/fix-katactl-conflict-dependency
kata-ctl: fix dependency version conflict
2022-12-06 10:44:48 +08:00
Alexandru Matei
71491a69c3 runtime: move process wait logic to another function
extract process wait logic to another function

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:32:04 +02:00
Alexandru Matei
92ebe61fea runtime: reap force killed processes
reap child processes after sending SIGKILL

Fixes #5739

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-12-05 13:31:58 +02:00
Xuewei Niu
fdf0a7bb14 runtime-rs: fix the issues mentioned in the code review
Removed the `Debug` trait for the `ShareFs` and etc. Renamed
`ShareFsMount::upgrade()` and `ShareFsMount::downgrade()` to
`upgrade_to_rw()` and `downgrade_to_ro()`. Protected `mounted_info_set`
with a mutex to avoid race conditions.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 11:18:26 +08:00
Xuewei Niu
1d823c4f65 runtime-rs: umount and permission controls in sandbox level
This commit implemented umonut controls and permission controls. When a volume
is no longer referenced, it will be umounted immediately. When a volume mounted
with readonly permission and a new coming container needs readwrite permission,
the volume should be upgraded to readwrite permission. On the contrary, if a
volume with readwrite permission and no container needs readwrite, then the
volume should be downgraded.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
Xuewei Niu
527b871414 runtime-rs: bind mount volumes in sandbox level
Implemented bind mount related managment on the sandbox side, involving bind
mount a volume if it's not mounted before, upgrade permission to readwrite if
there is a new container needs.

Fixes: #5588

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-12-05 10:58:13 +08:00
Bin Liu
9ccf2ebe8a agent: add signal value to log
For signal_process call, log the signal value in logs.

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:53:58 +08:00
Bin Liu
fb2c142f18 runtime-rs: fix some variable names and typos
Fix some not perfect variable names, and some typos in logs.

Fixes: #5820

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-02 14:52:34 +08:00
Bin Liu
8246de821f Merge pull request #5809 from liubin/fix/cargo-deny-workflow-error
workflow: fix cargo-deny-runner.yaml syntax error
2022-12-02 12:19:44 +08:00
Bin Liu
514b7778a2 Merge pull request #5807 from liubin/fix/5806-add-shim-lanuage
runtime: Add identification in version for runtime-rs
2022-12-02 11:36:55 +08:00
Bin Liu
c1f5a93b66 Merge pull request #5814 from liubin/fix/5813-test-dragonball-error
workflow: call cargo in user's $PATH
2022-12-02 11:36:19 +08:00
Tingzhou Yuan
737420469a kata-ctl: fix dependency version conflict
Also added crate `runtime-rs/crates/runtimes` as dependency as it's
immediately depended upon by the `direct-volume` feature, see issue
5341 and PR 5467.

Fixes #5810

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2022-12-01 17:53:21 +00:00
Bin Liu
89574f03f8 workflow: call cargo in user's $PATH
Call cargo in root's HOME may lead to permission error, should
call cargo installed in user's HOME/PATH.

Fixes: #5813

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 15:37:16 +08:00
Bin Liu
d4321ab489 runtime: Add identification in version for runtime-rs
Now we are supporting two runtime/shim, the go version,
and the rust version, for debug purposes, we can
add an identification in the version info
to tell us which runtime/shim is used.

Fixes: #5806

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 15:14:08 +08:00
Bin Liu
7fabfb2cf0 Merge pull request #5756 from chentt10/remove-version-number-from-commit-message
runtime-rs: remove the version number from the commit display message
2022-12-01 13:11:47 +08:00
Bin Liu
f7fc436bed workflow: fix cargo-deny-runner.yaml syntax error
There is a syntax error in .github/workflows/cargo-deny-runner.yaml

Fixes: #5808

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-12-01 12:32:00 +08:00
Fabiano Fidêncio
212325a9db Merge pull request #5649 from ManaSugi/runk/refactor-start-using-agent-code
runk: Re-implement start operation using the agent codes
2022-11-29 20:45:16 +01:00
Fabiano Fidêncio
ac1b2d2a18 Merge pull request #5774 from UiPath/fix-go-panic
build: update golang version to 1.19.3
2022-11-29 13:17:53 +01:00
Fabiano Fidêncio
d8d9aae123 Merge pull request #5781 from jodh-intel/snap-fix-release
snap: Fix snapcraft setup (unbreak snap releases)
2022-11-29 13:11:34 +01:00
Manabu Sugimoto
78532154d9 docs: Add description for guest SELinux support
Add the description about how to enable SELinux for containers
running inside the guest.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Manabu Sugimoto
c617bbe70d runtime: Pass SELinux policy for containers to the agent
Pass SELinux policy for containers to the agent if `disable_guest_selinux`
is set to `false` in the runtime configuration. The `container_t` type
is applied to the container process inside the guest by default.
Users can also set a custom SELinux policy to the container process using
`guest_selinux_label` in the runtime configuration. This will be an
alternative configuration of Kubernetes' security context for SELinux
because users cannot specify the policy in Kata through Kubernetes's security
context. To apply SELinux policy to the container, the guest rootfs must
be CentOS that is created and built with `SELINUX=yes`.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Manabu Sugimoto
9354769286 agent: Add SELinux support for containers
The kata-agent supports SELinux for containers inside the guest
to comply with the OCI runtime specification.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Bin Liu
588f81a23c Merge pull request #5612 from openanolis/fix-iptables
fix(agent): fix iptables binary path in guest
2022-11-29 16:57:06 +08:00
Bin Liu
1da2d0603c Merge pull request #5761 from gaohuatao-1/ght_overhead
runtime-rs: moving only vCPU threads into sandbox controller
2022-11-29 13:53:01 +08:00
Manabu Sugimoto
a75f99d20d osbuilder: Create guest image for SELinux
Create a guest image to support SELinux for containers inside the guest
if `SELINUX=yes` is specified. This works only if the guest rootfs is
CentOS and the init service is systemd, not the agent init. To enable
labeling the guest image on the host, selinuxfs must be mounted on the
host. The kata-agent will be labeled as `container_runtime_exec_t` type.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 13:32:26 +09:00
Manabu Sugimoto
a9c746f284 kernel: Add kernel configs for SELinux
Add kernel configs related to SELinux in order to add the
support for containers running inside the guest.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 13:32:26 +09:00
GabyCT
681d946644 Merge pull request #5748 from GabyCT/topic/removeextratabspacesdocker
tools: Remove extra tab spaces from kata deploy binaries script
2022-11-28 15:34:12 -06:00
James O. D. Hunt
86cb058833 snap: Fix snapcraft setup (unbreak snap releases)
Setup the snapcraft environment manually as the action we had been using
for this does not appear to be actively maintained currently.

Related to this, switch to specifying the snapcraft store credentials
using the `SNAPCRAFT_STORE_CREDENTIALS` secret. This unbreaks
`snapcraft upload`, which Canonical appear to have broken by removing
the previous facility.

Fixes: #5772.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-28 15:51:47 +00:00
Alexandru Matei
f443b78537 build: update golang version to 1.19.3
This Go release fixes golang/go#56309

Fixes #5773
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-28 17:03:29 +02:00
GabyCT
013752667b Merge pull request #5776 from liubin/tmp/debug-static-check
ci: let static checks don't depend on build
2022-11-28 07:51:42 -06:00
Fabiano Fidêncio
527e6c99e9 Merge pull request #5766 from liubin/fix/5763-use-composite-action-refactor-static-checks
actions: use matrix to refactor static checks
2022-11-28 14:12:27 +01:00
Bin Liu
6af037d379 Merge pull request #5154 from Yuan-Zhuo/main
agent: support systemd cgroup for kata agent.
2022-11-28 18:40:10 +08:00
Manabu Sugimoto
e12db92e4d runk: Re-implement start operation using the agent codes
This commit re-implements `start` operation by leveraging the agent codes.
Currently, `runk` has own `start` mechanism even if the agent already
has the feature to handle starting a container. This worsen the maintainability
and `runk` cannot keep up with the changes on the agent side easily.
Hence, `runk` replaces own implementations with agent's ones.

Fixes: #5648

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-28 19:11:21 +09:00
Fabiano Fidêncio
74531114c3 Merge pull request #5762 from liubin/fix/5759-skip-action-by-path
actions: skip some jobs using "paths-ignore" filter
2022-11-28 11:04:34 +01:00
Bin Liu
e723bad0af ci: let static checks don't depend on build
Build is a time consumable operation, skip build while let
ci run faster.

Fixes: #5777

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-28 15:26:04 +08:00
Bin Liu
a55eb78c32 Merge pull request #5752 from liubin/fix/5750-go-fix-1.19
runtime: go fix code for 1.19
2022-11-26 02:09:02 +08:00
Bin Liu
57c80ad65c Merge pull request #5758 from chentt10/update-runtime-rs-build-and-install
doc: update runtime-rs "Build and Install"
2022-11-26 02:08:48 +08:00
Bin Liu
69aae02276 actions: use matrix to refactor static checks
Using matrix to reduce the duplication that of similar code.

Fixes: #5763

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-26 00:32:15 +08:00
Jianyong Wu
a5e4cad4b6 kata-ctl: add host check for aarch64
For now, we can check if host support running kata by check if "/dev/kvm"
exist on aarch64.

Fixes: #5768
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-11-25 18:55:32 +08:00
gaohuatao
2edbe389d8 runtime-rs: moving only vCPU threads into sandbox controller
when overhead controller exists, just contrain vCPU threads
in sandbox controller

Fixes:#5760

Signed-off-by: gaohuatao <gaohuatao@bytedance.com>
2022-11-25 17:53:21 +08:00
Peng Tao
e32c023d96 Merge pull request #5714 from UiPath/fix-mkdir
runtime: don't fail mkdir if the folder is already created by another process
2022-11-25 17:52:56 +08:00
Bin Liu
ae1001a9d1 Merge pull request #5742 from openanolis/chao/SGX_whitelist
kernel: add CONFIG_X86_SGX into whitelist
2022-11-25 17:36:26 +08:00
Bin Liu
340e24f175 actions: skip some job using "paths-ignore" filter
If only docs/images are changed, some jobs should not run.

Fixes: #5759

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-25 15:33:32 +08:00
Chen Taotao
2426ea9bdc doc: update runtime-rs "Build and Install"
When using source code to compile runtime-rs,make the
documentation point out the detailed environment build
and compilation methods to avoid errors caused by related
dependent packages.

Fixes:#5757

Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
2022-11-25 13:13:00 +08:00
Chen Taotao
67fe703ff5 runtime-rs: remove the version number from the commit display message
The displayed commit message and version message are partially duplicated.
Remove the version number from the commit display message.

Fixes:#5735

Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
2022-11-25 13:00:01 +08:00
Ji-Xinyou
1d93a93468 fix(agent): fix iptables binary path in guest
Some rootfs put iptables-save and iptables-restore
under /usr/sbin instead of /sbin. This pr checks both
and returns the one exist.

Fixes: #5608
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-25 11:57:34 +08:00
Bin Liu
1dfd845f51 runtime: go fix code for 1.19
We have starting to use golang 1.19, some features are
not supported later, so run `go fix` to fix them.

Fixes: #5750

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-25 11:29:18 +08:00
Zhongtao Hu
f02bb1a9cb Merge pull request #5729 from openanolis/netnsref
runtime-rs: block on the current thread when setup the network to avoid be take over by other task
2022-11-25 08:09:10 +08:00
Gabriela Cervantes
cd85a44a04 tools: Remove extra tab spaces from kata deploy binaries script
This PR removes extra tab spaces from the kata deploy binaries
script.

Fixes #5747

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-24 17:57:36 +00:00
Chao Wu
cb199e0ecf kernel: add CONFIG_X86_SGX into whitelist
CONFIG_X86_SGX is introduced after kernel 5.11, and that config is a
default x86_64 config for Kata build-kernel.sh script.
But if we use -v to specify any kernel version below 5.11 will cause an
inevitable error because CONFIG_X86_SGX is not supported in older
kernels and that may cause problem for the situation if we need kernel
version below 5.11.

So I propose to put CONFIG_X86_SGX into whitelist.conf to avoid break
building guest kernel below 5.11.

fixes: #5741

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-24 20:43:58 +08:00
Alexandru Matei
4b45e13869 runtime: don't fail mkdir if the folder is already created
Use MkdirAll instead of Mkdir so it doesn't generate an
error when the folder is created by another process

Fixes #5713

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-24 11:20:56 +02:00
Chao Wu
9bde32daa1 Merge pull request #5707 from openanolis/ref
Refactor(runtime-rs): add conditional compile for virt-sandbox persist
2022-11-24 15:24:06 +08:00
Zhongtao Hu
b987bbc576 runtime-rs: block on the current thread when setup the network
As the increase of the I/O intensive tasks, two issues could be caused:

 1. When the future is blocked, the current thread (which is in the network namespace)
    might be take over by other tasks. After the future is finished, the thread take over
    the current task might not be in the pod network namespace
 2. When finish setting up the network, the current thread will be set back to the host namsapce.
    But the task which be taken over would still stay in the pod network namespace

 To avoid that, we need to block the future on the current thread.

Fixes:#5728
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-24 13:48:05 +08:00
Bin Liu
06a604b753 Merge pull request #5720 from YchauWang/wyc-docs-test-22
runtime: add log record to the qemu config method `appendDevices` for…
2022-11-24 13:15:06 +08:00
Peng Tao
b4d0a39f6d Merge pull request #5723 from fidencio/topic/runtime-bump-containerd-to-v1.6.8
runtime: Use containerd v1.6.8
2022-11-24 11:28:58 +08:00
GabyCT
6d1b5d47fb Merge pull request #5664 from GabyCT/topic/fixfirecrackerscript
tools: Fix indentation of build static firecracker script
2022-11-23 15:00:07 -06:00
Fabiano Fidêncio
82aa876903 Merge pull request #5727 from liubin/feat/add-nydus-to-release
package: add nydus to release artifacts
2022-11-23 14:39:26 +01:00
Bin Liu
abb9ebeece package: add nydus to release artifacts
Install nydus related binaries under /opt/kata/libexec/

Fixes: #5726

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-23 15:17:58 +08:00
Fabiano Fidêncio
5cbf879659 Merge pull request #5693 from jongwu/test_ip_table
agent: check if command exist before do ip_tables test
2022-11-23 08:15:08 +01:00
wangyongchao.bj
30a7ebf430 runtime: Log invalid devices in QEMU config
When the user tried to add new devices to the VM, there is no error info for the invalid
 device. This PR adds a log record to the `appendDevices` for the invalid device of the
 qemu config.

Fixes: #5719

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2022-11-23 09:09:45 +08:00
Fabiano Fidêncio
df3d9878d5 Merge pull request #5695 from darfux/virtiofs-queue-size
runtime: Support virtiofs queue size for qemu and make it configurable
2022-11-22 20:04:30 +01:00
Archana Shinde
e7f8d21bb7 Merge pull request #5717 from Kvasscn/fix_direct_blk_mount_info
docs: change mount-info.json to mountInfo.json
2022-11-22 10:19:02 -08:00
Fabiano Fidêncio
2539f31862 runtime: Use containerd v1.6.8
Let's follow the binary bump used in the CI and also bump the vendored
version of containerd to v1.6.8.

Fixes: #5722

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-22 18:28:30 +01:00
Fabiano Fidêncio
732123b9ab Merge pull request #5709 from kinderyj/main
docs: update doc "NVIDIA GPU passthrough"
2022-11-22 16:53:51 +01:00
Chao Wu
8b04ba95cb Merge pull request #5691 from yipengyin/support-vhost-vsock
runtime-rs: support vhost-vsock
2022-11-22 14:59:55 +08:00
Jason Zhang
993d05a42e docs: change mount-info.json to mountInfo.json
mount-info.json should be mountInfo.json according to the description in the doc.

Fixes: #5716

Signed-off-by: Jason Zhang <zhanghj.lc@inspur.com>
2022-11-22 14:25:57 +08:00
Yipeng Yin
d808adef95 runtime-rs: support vhost-vsock
Rename old VsockConfig to HybridVsockConfig. And add VsockConfig to
support vhost-vsock. We follow kata's old way to try random vhost fd
for 50 times to generate uniqe fd.

Fixes: #5654

Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2022-11-22 10:03:52 +08:00
Zhongtao Hu
6b2ef66f0f runtime-rs: add conditional compile for virt-sandbox persist
code refactoring, add conditional compile for virt-sandbox persist

Fixes: #5706
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-21 19:51:43 +08:00
Matt Wang
6c1e153a6f docs: update doc "NVIDIA GPU passthrough"
We should make sure the hook shell
`nvidia-container-toolkit.sh` is executable.

Fixes: #5594

Signed-off-by: Matt Wang <kinder_yj@hotmail.com>
2022-11-21 17:31:20 +08:00
Jianyong Wu
b53171b605 agent: check command before do test_ip_tables
test_ip_tables test depends on iptables tools. But we can't
ensure these tools are exist. it's better to skip the test
if there is no such tools.

Fixes: #5697
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-11-21 14:56:51 +08:00
Bin Liu
7c8d474959 Merge pull request #5689 from kata-containers/kata-ctl-util
utils: Add utility function to fetch the kernel version.
2022-11-21 14:44:05 +08:00
Peng Tao
be31a0fb41 Merge pull request #5638 from bergwolf/github/nydusd
versions: update nydusd version
2022-11-21 09:53:11 +08:00
Peng Tao
a636d426d9 versions: update nydusd version
To the latest stable v2.1.1.

Depends-on: github.com/kata-containers/tests#5246
Fixes: #5635
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-19 16:33:29 +00:00
liyuxuan.darfux
3bb145c63a runtime: Support virtiofs queue size for qemu and make it configurable
The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024
by default which is same as clh. Also make this value configurable.

Fixes: #5694

Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>
2022-11-19 15:38:11 +08:00
Archana Shinde
e80a9f09fa utils: Add utility function to fetch the kernel version.
Add functionality to get kernel version and related unit tests.
This is intended to be used in the kata-env command going forward.

Fixes: #5688

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-11-18 15:39:57 -08:00
Bin Liu
7506237420 Merge pull request #5144 from openanolis/nydus-dev
runtime-rs: support nydus v5 and v6 rootfs
2022-11-18 14:05:04 +08:00
Bo Chen
65686dbbdc Merge pull request #5684 from likebreath/1117/clh_v28.0
Upgrade to Cloud Hypervisor v28.0
2022-11-17 15:18:51 -08:00
Chelsea Mafrica
85f818743b Merge pull request #5679 from liubin/fix/5678-update-swap-doc
docs: update doc "Setup swap device in guest kernel"
2022-11-17 13:23:57 -08:00
Bo Chen
36545aa81a runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v28.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #5683

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-17 09:45:27 -08:00
Bo Chen
f4b02c2244 versions: Upgrade to Cloud Hypervisor v28.0
Details of this release can be found in our new roadmap project as
iteration v28.0: https://github.com/orgs/cloud-hypervisor/projects/6.

Fixes: #5683

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-17 09:44:49 -08:00
Fabiano Fidêncio
81c0945afa Merge pull request #5669 from fidencio/topic/rust-fixes-plus-golang-bump
Rust fixes + Golang bump
2022-11-17 16:02:17 +01:00
Bin Liu
e4a6fbadf8 docs: update doc "Setup swap device in guest kernel"
`crictl runp` command needs `--runtime kata` option
to start a Kata Containers pod.

Fixes: #5678

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-11-17 22:57:22 +08:00
Fabiano Fidêncio
2f5f575a43 log-parser: Simplify check
```
14:13:15 parse.go:306:5: S1009: should omit nil check; len() for github.com/kata-containers/kata-containers/src/tools/log-parser.kvPairs is defined as zero (gosimple)
14:13:15 	if pairs == nil || len(pairs) == 0 {
14:13:15 	   ^
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 14:17:29 +01:00
Fabiano Fidêncio
d94718fb30 runtime: Fix gofmt issues
It seems that bumping the version of golang and golangci-lint new format
changes are required.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 14:16:12 +01:00
Fabiano Fidêncio
16b8375095 golang: Stop using io/ioutils
The package has been deprecated as part of 1.16 and the same
functionality is now provided by either the io or the os package.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-17 13:43:25 +01:00
Fabiano Fidêncio
66aa330d0d versions: Update golangci-lint
Let's bump the golangci-lint in order to fix issues that popped up after
updating Golang to its 1.19.2 version.

Depends-on: github.com/kata-containers/tests#5257

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-16 19:03:02 +01:00
Peng Tao
b3a4a16294 versions: bump containerd version
v1.5.2 cannot be built from source by newer golang. Let's bump
containerd version to 1.6.8. The GO runtime dependency has
been moved to v1.6.6 for some time already.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-16 19:02:41 +01:00
Peng Tao
eab8d6be13 build: update golang version to 1.19.2
So that we get the latest language fixes.

There is little use to maitain compiler backward compatibility.
Let's just set the default golang version to the latest 1.19.2.

Fixes: #5494
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-11-16 19:02:39 +01:00
Chao Wu
e80dbc15d8 runtime-rs: workaround Dragonball compilation problem
Since the upstream rust-vmm is changing its dependency style towards
caret requirements in these days (more information:
rust-vmm/vm-memory#199) and it breaks Dragonball compilation frequently.

rust-vmm is expected to finish the changes this week and in order to not
break Kata CI due to Dragonball's compilation error, we will add
Cargo.lock file into /src/dragonball first and remove it later when
rust-vmm is stable.

fixes: #5657
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-16 12:44:41 +01:00
Ji-Xinyou
c3f1922df6 fix(fmt): fix cargo fmt to pass static check
Fix cargo fmt

Fixes: #5639
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-16 12:44:38 +01:00
Greg Kurz
1bbcb413c9 Merge pull request #5597 from UiPath/fix-clh-wait
clh: avoid race condition when stopping clh
2022-11-16 07:39:27 +01:00
Gabriela Cervantes
a4099dab8f tools: Fix indentation of build static firecracker script
This PR fixes the indentation of the build static firecracker script.

Fixes #5663

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-15 16:01:36 +00:00
Bin Liu
b8dbb35bb7 Merge pull request #5631 from GabyCT/topic/fixvirtiofsdscript
tools: Fix indentation of build static virtiofsd script
2022-11-11 14:31:26 +08:00
Bin Liu
dff78593c0 Merge pull request #5505 from Joffref/patch-1
docs: Fix configuration path
2022-11-11 14:26:40 +08:00
Zhongtao Hu
7d91150185 Merge pull request #5536 from chentt10/fix-name-shim-source-ambiguous
runtime-rs : fix the shim source in the documentation test is ambiguous
2022-11-11 14:07:05 +08:00
Zhongtao Hu
c46814b26a runtime-rs:support nydus v5 and v6
add nydus v5 snd v6 upport for container rootfs

Fixes:#5142
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-11 10:15:35 +08:00
Alexandru Matei
a04afab74d qemu: early exit from Check if the process was stopped
Fixes: #5625

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
7e481f2179 qemu: set stopped only if StopVM is successful
Fixes: #5624

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
0e3ac66e76 clh: return faster with dead clh process from isClhRunning
Through proactively checking if Cloud Hypervisor process is dead,
this patch provides a faster path for isClhRunning

Fixes: #5623

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
9ef68e0c7a clh: fast exit from isClhRunning if the process was stopped
Use atomic operations instead of acquiring a mutex in isClhRunning.
This stops isClhRunning from generating a deadlock by trying to
reacquire an already-acquired lock when called via StopVM->terminate.

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
Alexandru Matei
2631b08ff1 clh: don't try to stop clh multiple times
Avoid executing StopVM concurrently when virtiofs dies as a result of clh
being stopped in StopVM.

Fixes: #5622

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2022-11-10 22:43:32 +02:00
James O. D. Hunt
56641bc230 Merge pull request #5637 from openanolis/chao/update_cargo_lock
versions: update vmm-sys-util and related crates to v0.11.0
2022-11-10 13:49:24 +00:00
Chao Wu
f45fe4f90d versions: update vmm-sys-util and related crates to v0.11.0
Since the upstream of vmm-sys-utils upgraded to 0.11.0, some crates
automatically upgrade to v0.11.0, and some stay at v0.10.0 ( depending
on how they write version dependency in Cargo toml` which causes the
compile error in runtime-rs.

In order to fix this problem, we need to upgrade all vmm-sys-util
dependencies in runtime-rs to v0.11.0.

fixes: #5636

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-10 19:13:23 +08:00
quanweiZhou
bbc93260c9 Merge pull request #5615 from openanolis/chao/delete_cargo_patch
runtime-rs: delete all cargo patches
2022-11-10 10:18:19 +08:00
Gabriela Cervantes
8be0817305 tools: Fix indentation of build static virtiofsd script
This Pr removes single spaces and fix the indentation of the script.

Fixes #5630

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-09 17:09:13 +00:00
Zhongtao Hu
071ac4693a Merge pull request #5613 from openanolis/iptables
feat(shim-mgmt): iptables handler
2022-11-09 17:21:45 +08:00
Bin Liu
1d59137c6f Merge pull request #5620 from GabyCT/topic/removeemptysspaces
tools: Remove empty spaces from build kernel script
2022-11-09 17:02:29 +08:00
Ji-Xinyou
f8f97c1e22 feat(shim-mgmt): iptables handler
Support the handlers in runtime, which are used by kata-ctl iptables series of commands in runtime.

Fixes: #5370
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-11-09 10:39:50 +08:00
Chao Wu
29c75cf12b runtime-rs: delete all cargo patches
The cargo patch in the cargo.toml seems to cause the whole runtime-rs
building time longer and also makes it harder to build runtime-rs in an
environment without the network

We should delete all patches from the cargo.toml file and publish all
the crates that was once patched.

fixes: #5614 #5527 #5526 #5449

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-09 10:02:58 +08:00
Gabriela Cervantes
9f70a6949b tools: Remove empty spaces from build kernel script
This PR removes some extra empty spaces at the build kernel script.

Fixes #5619

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-11-08 17:49:57 +00:00
Chao Wu
f5f25d9379 Merge pull request #5431 from wllenyj/dragonball-ut-3
Built-in Sandbox: add more unit tests for dragonball. Part 3
2022-11-08 15:48:16 +08:00
Zhongtao Hu
351bdbfacd Merge pull request #5567 from openanolis/chao/fix_mem_file_path_error
Dragonball: enable mem_file_path config into hugetlbfs process
2022-11-08 09:00:13 +08:00
wllenyj
57336835da dragonball: add more unit test for device manager
Added more unit tests for device manager.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-11-08 00:45:17 +08:00
wllenyj
2333700237 dragonball: add test utils.
Added some tools for dragonball unit testing.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-11-08 00:45:17 +08:00
Bin Liu
bfe9157abc Merge pull request #5570 from openanolis/capability
runtime-rs:add hypervisor interface capabilities
2022-11-07 23:04:55 +08:00
Mathis Joffre
3e9c3f12ce docs: Fix configuration path
On install you generate a configuration-fc.toml
file when building the kata-runtime and
copy it to either /etc/kata-containers/configuration-fc.toml
or /usr/share/defaults/kata-containers/configuration-fc.toml.
To reflect that the path must be one of the above,
we can fix the path in doc.

Fixes: #5589

Signed-off-by: Mathis Joffre <mariusjoffre@gmail.com>
2022-11-07 10:19:47 +01:00
Chao Wu
2adb1c1823 Dragonball: enable mem_file_path config into hugetlbfs process
In the current Dragonball code, mem_file_path config is not used when
hugetlbfs is enabled.
In this commit we add mem_file_path into hugetlbfs enable process.

fixes: #5566
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-11-07 16:07:57 +08:00
Fabiano Fidêncio
7250be3601 Merge pull request #5584 from fengyehong/clh-thread
cloud-hypervisor: Fix GetThreadIDs function
2022-11-07 08:22:40 +01:00
Fabiano Fidêncio
3b1750e8e8 Merge pull request #5586 from fidencio/topic/paralelise-static-checks
github: Parallelise static checks
2022-11-07 07:54:48 +01:00
Bin Liu
824ea83c3c Merge pull request #5573 from pmores/fill-in-virtiofsd-standalone-impl
runtime-rs: blanks filled & fixes made to virtiofsd launch
2022-11-07 14:19:45 +08:00
Bin Liu
83d052f82b Merge pull request #4476 from LitFlwr0/vcpu-pinning-frq
vCPUs pinning support for Kata Containers
2022-11-07 10:37:22 +08:00
Guanglu Guo
daeee26a1e cloud-hypervisor: Fix GetThreadIDs function
Get vcpu thread-ids by reading cloud-hypervisor process tasks information.

Fixes: #5568

Signed-off-by: Guanglu Guo <guoguanglu@qiyi.com>
2022-11-05 17:23:19 +08:00
Bin Liu
427b01e298 Merge pull request #5548 from justxuewei/fix/share-fs-permission
runtime-rs: fix shared volume permission issue
2022-11-04 21:21:50 +08:00
Fabiano Fidêncio
40d514aa2c github: Parallelise static checks
Although introducing an awful amount of code duplication, let's
parallelise the static checks in order to reduce its time and the space
used in the VMs running those.

While I understand there may be ways to make the whole setup less
repetitive and error prone, I'm taking the approach of:
* Make it work
* Make it right
* Make it fast

So, it's clear that I'm only attempting to make it work, and I'd
appreciate community help in order to improve the situation here.  But,
for now, this is a stopgap solution.

JFYI, the time needed for run the tests on the `main` branch went down
from ~110 minutes to ~60 minutes.  Plus, we're not running those on a
single VM anymore, which decreases the change to hit the space limit.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/3393468605/jobs/5640842041

Ideally, each one of the following tests should be also split into
smaller tests, each test for one component, for instance.
* static-checks
* compiler-checks
* unit-tests
* unit-tests-as-root

Fixes: #5585

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-11-04 13:41:16 +01:00
LitFlwr0
2508d39b7c runtime: added vcpus pinning logics
Core VCPU threads pinning logics for issue 4476. Also provided docs.

Fixes:#4476
Signed-off-by: LitFlwr0 <861690705@qq.com>
2022-11-04 17:52:42 +08:00
Zhongtao Hu
fef8e92af1 runtime-rs:add hypervisor interface capabilities
1. be able to check does hypervisor support use block device, block
device hotplug, multi-queue, and share file

2. be able to set the hypervisor capability of using block device, block
device hotplug, multi-queue, and share file

Fixes: #5569
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-11-04 09:24:36 +08:00
Bin Liu
b0c7bcce7c Merge pull request #5556 from ManaSugi/runk/fix-kill-behavior
runk: Ignore an error when calling kill cmd with --all option
2022-11-04 08:42:27 +08:00
Bin Liu
02fa6b8dad Merge pull request #5557 from ManaSugi/runk/update-cargolock-libseccomp
runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
2022-11-04 08:41:45 +08:00
Fabiano Fidêncio
bb38901550 Merge pull request #5571 from jodh-intel/snap-unbreak-docker
snap: Unbreak docker install
2022-11-03 23:47:07 +01:00
Pavel Mores
27b1913584 runtime-rs: blanks filled & fixes made to virtiofsd launch
The 'config' argument to ShareVirtioFsStandalone::new() is now actually
used, taking care of an explicit TODO.

If a shared path doesn't exist in ShareVirtioFsStandalone::virtiofsd_args()
it is now created instead of returning an error, thus following
ShareVirtioFsInline's suit.

The '-o vhost_user_socket=...' command line argument doesn't seem to be
supported by newer versions of virtiofsd so we replace it with
'--socket-path' which should be functionally equivalent according to docs.

Fixes #5572

Signed-off-by: Pavel Mores <pmores@redhat.com>
2022-11-03 08:38:59 +01:00
James O. D. Hunt
990e6359b7 snap: Unbreak docker install
It appears that _either_ the GitHub workflow runners have changed their
environment, or the Ubuntu archive has changed package dependencies,
resulting in the following error when building the snap:

```
Installing build dependencies: bc bison build-essential cpio curl docker.io ...

    :

The following packages have unmet dependencies:
docker.io : Depends: containerd (>= 1.2.6-0ubuntu1~)
E: Unable to correct problems, you have held broken packages.
```

This PR uses the simplest solution: install the `containerd` and `runc`
packages. However, we might want to investigate alternative solutions in
the future given that the docker and containerd packages seem to have
gone wild in the Ubuntu GitHub workflow runner environment. If you
include the official docker repo (which the snap uses), a _subset_ of
the related packages is now:

- `containerd`
- `containerd.io`
- `docker-ce`
- `docker.io`
- `moby-containerd`
- `moby-engine`
- `moby-runc`
- `runc`

Fixes: #5545.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-02 10:09:03 +00:00
James O. D. Hunt
ca69a9ad6d snap: Use metadata for dependencies
Rather than hard-coding the package manager into the docker part,
use the `build-packages` section to specify the parts package
dependencies in a distro agnostic manner.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-11-02 09:50:29 +00:00
Manabu Sugimoto
df092185ee runk: Upgrade libseccomp crate to v0.3.0 in Cargo.lock
The libseccomp crate was upgraded to v0.3.0 by 4696ead,
but `Cargo.lock` of runk wasn't updated by mistake.
So, this commit updates `Cargo.lock` of runk to the latest dependencies.

Fixes: #5487

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-01 20:26:33 +09:00
Manabu Sugimoto
16dca4ecd4 runk: Ignore an error when calling kill cmd with --all option
Ignore an error handling that is triggered when the kill command is called
with `--all option` to the stopped container.

High-level container runtimes such as containerd call the kill command with
`--all` option in order to terminate all processes inside the container
even if the container already is stopped. Hence, a low-level runtime
should allow `kill --all` regardless of the container state like runc.

This commit reverts to the previous behavior.

Fixes: #5555

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-01 20:24:29 +09:00
Xuewei Niu
b74c18024a runtime-rs: fix shared volume permission issue
Fix the issue where share volumes always have readwrite permission even if
readonly permission is enough.

Fixes: #5549

Signed-off-by: Xuewei Niu <justxuewei@apache.org>
2022-11-01 18:42:19 +08:00
Chen TaoTao
936fe35acb runtime-rs : fix shim source is ambiguous
In the documentation test, the name shim has multiple potential
sources of import, now give it a clear source.

Fixes: #5535

Signed-off-by: Chen TaoTao <chentt10@chinatelecom.cn>
2022-10-31 19:54:22 -07:00
snir911
288e337a6f Merge pull request #5434 from Rouzip/remove-doNetNS
add EnterNetNS in virtcontainers
2022-10-30 11:19:07 +02:00
GabyCT
e04ad49c1b Merge pull request #5530 from GabyCT/topic/fixclhscript
tools: Fix indentation of build static clh script
2022-10-28 11:52:56 -05:00
Gabriela Cervantes
0ed7da30d7 tools: Fix indentation of build static clh script
This Pr removes single spaces and fix the indentation of the script.

Fixes #5528

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2022-10-27 21:09:34 +00:00
Bin Liu
0bb005093e Merge pull request #5523 from BbolroC/s390x-virtiofsd
virtiofsd: Not use "link-self-contained=yes" on s390x
2022-10-27 20:42:57 +08:00
Hyounggyu Choi
43fcb8fd09 virtiofsd: Not use "link-self-contained=yes" on s390x
The compile option link-self-contained=yes asks rustc to use
C library startup object files that come with the compiler,
which are not available on the target s390x-unknown-linux-gnu.
A build does not contain any startup files leading to a
broken executable entry point (causing segmentation fault).

Fixes: #5522

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2022-10-26 23:43:22 +02:00
David Esparza
37f0cd1c8f Merge pull request #5436 from amshinde/kata-ctl-drop-privs
Kata ctl drop privs
2022-10-26 11:37:27 -05:00
David Esparza
8b0c830a23 Merge pull request #5513 from bergwolf/github/golang-ci-lint
versions: bump golangci-lint version
2022-10-26 07:36:45 -05:00
Bin Liu
059b09b0a8 Merge pull request #5510 from bergwolf/github/runtime-rs-makefile
runtime-rs: generate config files with the default target
2022-10-26 20:29:17 +08:00
David Esparza
4d6c3bd0fa Merge pull request #5515 from cmaf/docs-fix-sgx-k8s-volumemount
docs: Fix volumeMounts in SGX usage example
2022-10-26 07:24:31 -05:00
Chelsea Mafrica
219919e9f7 docs: Fix volumeMounts in SGX usage example
The /dev/sgx is not mounted and the enclave is not available,
causing the demo job to report an error in the logs. Add volumeMounts to
container in order to have the device available in the container.

Fixes: #5514

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-25 23:20:49 -07:00
Archana Shinde
c0f5bc81b7 cargo: Add Cargo.lock to version control
Add Cargo.lock to capture state of build.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Archana Shinde
474927ec90 gitignore: Add gitignore file
Ignore autogeneraated version.rs

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Archana Shinde
699f821e12 utils: Add function to drop priveleges
This function is meant to be used before operations
such as accessing network to make sure those operations
are not performed as a privilged user.

Fixes: #5331

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2022-10-25 20:34:40 -07:00
Peng Tao
a6fb4e2a68 versions: bump golangci-lint version
There is little point to maintain backward compatiblity for
golangci-lint. Let's just use a unified version of it.

Fixes: #5512
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-26 10:41:24 +08:00
Peng Tao
b015f34aff runtime-rs: generate config files with the default target
Right now it is not generated with a simple `make`.

Fixes: #5509
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-10-26 10:25:29 +08:00
Yuan-Zhuo
d7bb4b5512 agent: support systemd cgroup for kata agent
1. Implemented a rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd).
2. Add support for optional cgroup configuration through fs and systemd at agent (src/agent/rustjail/src/container.rs).
3. Described the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md).

Fixes: #4336

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2022-10-25 13:57:09 +08:00
Bo Chen
a151d8ee50 Merge pull request #5493 from fidencio/topic/update-clh
versions: Update Cloud Hypervisor to b4e39427080
2022-10-24 07:54:02 -07:00
Bin Liu
0f7088a4b1 Merge pull request #5501 from openanolis/update_install_guide
docs: update rust runtime installation guide
2022-10-24 17:49:34 +08:00
Bin Liu
4696eadfeb Merge pull request #5488 from ManaSugi/fix/update-libseccomp-crate
rustjail: Upgrade libseccomp crate to v0.3.0
2022-10-24 17:03:30 +08:00
Bin Liu
badb2600b3 Merge pull request #5474 from openanolis/makefile
makefile: remove sudo when create symbolic link
2022-10-24 17:03:20 +08:00
Bin Liu
ab5f97759d Merge pull request #5497 from Rouzip/remove-redundant
agent: remove redundant checks
2022-10-24 16:41:49 +08:00
Fabiano Fidêncio
190e623c40 Merge pull request #5317 from Champ-Goblem/fix-containerd-stats
shim: Ensure pagesize is set when reporting hugetlb stats
2022-10-24 10:24:49 +02:00
Fabiano Fidêncio
7248cf51c5 Merge pull request #5447 from hbrueckner/fix-5438
kata-ctl: Re-enable network tests on s390x (fixes 5438)
2022-10-24 10:23:35 +02:00
Zhongtao Hu
144efd1a7a docs: update rust runtime installation guide
As kata-deploy support rust runtime, we need to update the installation docs

Fixes:#5500
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-10-24 15:55:30 +08:00
James O. D. Hunt
65ef2a0a0b Merge pull request #5089 from liubin/fix/4895-ignore-exit-error
agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
2022-10-24 08:46:54 +01:00
Zhongtao Hu
164ecca3f0 Merge pull request #5499 from zhaoxuat/main
fix readme content error at doc directory
2022-10-24 14:15:52 +08:00
zhaoxu
abf4f9b299 docs: kata 3.0 Architecture
fix readme content error

Fixes: #5498
Signed-off-by: zhaoxu <zhaoxu@megvii.com>
2022-10-24 11:07:34 +08:00
snir911
ee189d2ebe Merge pull request #5455 from kata-containers/main-validate-hp-size
agent: validate hugepage size is supported
2022-10-23 08:15:05 +03:00
Rouzip
44d8de8923 agent: remove redundant checks
Remove redundant checks for executable files.

FIXes: #3730

Signed-off-by: Rouzip <1226015390@qq.com>
2022-10-22 23:31:18 +08:00
Fabiano Fidêncio
9d286af7b4 versions: Update Cloud Hypervisor to b4e39427080
An API change, done a long time ago, has been exposed on Cloud
Hypervisor and we should update it on the Kata Containers side to ensure
it doesn't affect Cloud Hypervisor CI and because the change is needed
for an upcoming work to get QAT working with Cloud Hypervisor.

Fixes: #5492

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-21 20:52:54 +02:00
Bin Liu
081ee48713 agent: use NLM_F_REPLACE replace NLM_F_EXCL in rtnetlink
Sometimes we will face EEXIST error when adding arp neighbour.
Using NLM_F_REPLACE replace NLM_F_EXCL will avoid fail if the
entry exists.

See https://man7.org/linux/man-pages/man7/netlink.7.html

Fixes: #4895

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-10-21 21:19:14 +08:00
Hendrik Brueckner
e95089b716 kata-ctl: add basic cpu check for s390x
Add a basic s390x cpu check for the "sie" feature to be present.
Also re-enable cpu check testing.

Fixes: #5438

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-21 12:04:28 +00:00
Hendrik Brueckner
871d2cf2c0 kata-ctl: Limit running tests to x86 and use native-tls on s390x
For s390x, use native-tls for reqwest because the rustls-tls/ring
dependency is not available for s390x.

Also exclude s390x, powerpc64le, and aarch64 from running the cpu
check due to the lack of the arch-specific implementation. In this
case, rust complains about unused functions in src/check.rs (both
normal and test context).

Fixes: #5438

Co-authored-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-21 11:54:26 +00:00
Manabu Sugimoto
cbd84c3f5a rustjail: Upgrade libseccomp crate to v0.3.0
The libseccomp crate v0.3.0 has been released, so use it in the agent.

Fixes: #5487

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-10-21 15:40:05 +09:00
Bin Liu
1bf64c9a11 Merge pull request #5453 from openanolis/chao/fix_comment_typo
Makefile: fix an typo in runtime-rs makefile
2022-10-21 14:36:39 +08:00
David Esparza
1c159d83ea Merge pull request #5465 from fidencio/topic/re-work-QEMU-dockerfile
qemu: Re-work static-build Dockerfile
2022-10-20 13:32:03 -05:00
Zhongtao Hu
748be0fe3d makefile: remove sudo when create symbolic link
when using mock to package rpm, we cannot have sudo permission

Fixes: #5473
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2022-10-20 22:13:21 +08:00
Bin Liu
cd27ad144e Merge pull request #5219 from openanolis/krt-modify
Modify agent-url return value in runtime-rs
2022-10-20 11:17:29 +08:00
Fabiano Fidêncio
227e717d27 qemu: Re-work static-build Dockerfile
Differently than every single other bit that's part of our repo, QEMU
has been using a single Dockerfile that prepares an environment where
the project can be built, but *also* building the project as part of
that very same Dockerfile.

This is a problem, for several different reasons, including:
* It's very hard to have a reproducible build if you don't have an
  archived image of the builder
* One cannot cache / ipload the image of the builder, as that contains
  already a specific version of QEMU
* Every single CI run we end up building the builder image, which
  includes building dependencies (such as liburing)

Let's split the logic into a new build script, and pass the build script
to be executed inside the builder image, which will be only responsible
for providing an environment where QEMU can be built.

Fixes: #5464

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-19 21:34:36 +02:00
Bin Liu
faf363db75 Merge pull request #5414 from openanolis/chao/regulate_runtime_rs_makefile_comments
runtime-rs: regulate the comment in runtime-rs makefile
2022-10-19 15:36:00 +08:00
Snir Sheriber
72738dc11f agent: validate hugepage size is supported
before setting a limit, otherwise paths may not be found.
guest supporting different hugepage size is more likely with peer-pods where
podvm may use different flavor.

Fixes: #5191
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2022-10-19 09:55:33 +03:00
Chao Wu
f74e328fff Makefile: fix an typo in runtime-rs makefile
There is a typo in runtime-rs makefile.
_dragonball should be _DB

fixes: #5452

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-19 14:12:48 +08:00
Chao Wu
f205472b01 Makefile: regulate the comment style for the runtime-rs comments
In runtime-rs makefile, we use
```
```
to let make help print out help information for variables and targets,
but later commits forgot this rule.
So we need to follow the previous rule and change the current comments.

fixes: #5413
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-19 12:12:50 +08:00
Fabiano Fidêncio
c97b7b18e7 Merge pull request #5416 from zvonkok/patch-1
doc: Update how-to-run-kata-containers-with-SNP-VMs.md
2022-10-18 22:45:05 +02:00
Hendrik Brueckner
9f2c7e47c9 Revert "kata-ctl: Disable network check on s390x"
This reverts commit 00981b3c0a.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
2022-10-18 11:12:18 +00:00
James O. D. Hunt
dd60a0298d Merge pull request #5439 from jodh-intel/kata-ctl-s390x-disable-tls
kata-ctl: Disable network check on s390x
2022-10-18 09:58:09 +01:00
Zvonko Kaiser
ac403cfa5a doc: Update how-to-run-kata-containers-with-SNP-VMs.md
If the needed libraries (for virtfs) are installed on the host,
 QEMU will pick it up and enable it. If not installed and you
do not enable the flag, QEMU will just ignore it, and you end
up without 9p support. Enabling it explicitly will fail if the
needed libs are not installed so this way we can be sure that
it gets build.

Fixes: #5418

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2022-10-17 05:56:19 -07:00
James O. D. Hunt
00981b3c0a kata-ctl: Disable network check on s390x
s390x apparently does not support rust-tls, which is required by the
network check (due to the `reqwest` crate dependency).

Disable the network check on s390x until we can find a solution to the
problem.

> **Note:**
>
> This fix is assumed to be a temporary one until we find a solution.
> Hence, I have not moved the network check code (which should be entirely
> generic) into an architecture specific module.

Fixes: #5435.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-17 10:24:06 +01:00
Rouzip
39363ffbfb runtime: remove same function
Add EnterNetNS in virtcontainers to remove same function.

FIXes #5394

Signed-off-by: Rouzip <1226015390@qq.com>
2022-10-17 10:59:13 +08:00
James O. D. Hunt
c322d1d12a kata-ctl: arch: Improve check call
Rework the architecture-specific `check()` call by moving all the
conditional logic out of the function.

Fixes: #5402.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-15 11:41:53 +01:00
Fabiano Fidêncio
ff8bfdfe3b Merge pull request #5426 from fidencio/topic/build-virtiofsd-in-a-2nd-layer-container
virtiofsd: Build inside a container
2022-10-15 00:26:56 +02:00
Fabiano Fidêncio
0bc5baafb9 snap: Build virtiofsd using the kata-deploy scripts
Let's build virtiofsd using the kata-deploy build scripts, which
simplifies and unifies the way we build our components.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 13:44:03 +02:00
Fabiano Fidêncio
cb4ef4734f snap: Create a task for installing docker
Let's have the docker installation / configuration as part of its own
task, which can be set as a dependency of other tasks whcih may or may
not depend on docker.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 12:41:21 +02:00
Fabiano Fidêncio
7e5941c578 virtiofsd: Build inside a container
When moving to building the CI artefacts using the kata-deploy scripts,
we've noticed that the build would fail on any machine where the tarball
wasn't officially provided.

This happens as rust is missing from the 1st layer container.  However,
it's a very common practice to leave the 1st layer container with the
minimum possible dependencies and install whatever is needed for
building a specific component in a 2nd layer container, which virtiofsd
never had.

In this commit we introduce the second layer containers (yes,
comtainers), one for building virtiofsd using musl, and one for building
virtiofsd using glibc.  The reason for taking this approach was to
actually simplify the scripts and avoid building the dependencies
(libseccomp, libcap-ng) using musl libc.

Fixes: #5425

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-14 12:41:21 +02:00
Zhongtao Hu
5d17cbeef7 Merge pull request #5383 from openanolis/chao/update_comments_in_event_manager
Dragonball: remove redundant comments in event manager
2022-10-14 15:50:37 +08:00
Fabiano Fidêncio
c745d6648d Merge pull request #5420 from fidencio/topic/update-tdx-qemu-repo
versions: Update TDX QEMU
2022-10-13 20:57:37 +02:00
Bin Liu
b23a24ab2f Merge pull request #5417 from liubin/fix/typo-get_contaier_type
runtime-rs: fix typo get_contaier_type to get_container_type
2022-10-13 22:35:23 +08:00
Bin Liu
c7b38532f0 Merge pull request #5412 from tzY15368/improve-cmd-descriptions
kata-ctl: improve command descriptions for consistency
2022-10-13 19:17:42 +08:00
Fabiano Fidêncio
35d52d30fd versions: Update TDX QEMU
The previously used repo will be removed by Intel, as done with the one
used for TDX kernel.  The TDX team has already worked on providing the
patches that were hosted atop of the QEMU commit with the following hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0 as a tarball in the
https://github.com/intel/tdx-tools repo, see
https://github.com/intel/tdx-tools/pull/162.

On the Kata Containers side, in order to simplify the process and to
avoid adding hundreds of patches to our repo, we've revived the
https://github.com/kata-containers/qemu repo, and created a branch and a
tag with those hundreds of patches atop of the QEMU commit hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0.  The branch is called
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0-plus-TDX-v3.1 and the tag is
called TDX-v3.1.

Knowing the whole background, let's switch the repo we're getting the
TDX QEMU from.

Fixes: #5419

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-13 11:53:29 +02:00
Bin Liu
4d9dd8790d runtime-rs: fix typo get_contaier_type to get_container_type
Change get_contaier_type to get_container_type

Fixes: #5415

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-10-13 17:12:43 +08:00
Bin Liu
2de29b6f69 Merge pull request #5088 from liubin/fix/5087-force-shutdown-shim
runtime-rs: force shutdown shim process in it can't exit
2022-10-13 16:55:05 +08:00
Fabiano Fidêncio
d934d87482 Merge pull request #5404 from fidencio/topic/update-tdx-kernel-repo
versions: Update TDX kernel
2022-10-13 09:14:44 +02:00
Tingzhou Yuan
70676d4a99 kata-ctl: improve command descriptions for consistency
This change improves the command descriptions for kata-ctl and can avoid certain confusions in command functionality.

Fixes #5411

Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
2022-10-13 04:10:23 +00:00
Bin Liu
3b70c72436 Merge pull request #5395 from wllenyj/dragonball-s390
ci: skip s390x for dragonball.
2022-10-13 09:03:08 +08:00
Bin Liu
157d3cdcb1 Merge pull request #5397 from openanolis/chao/delete_redundant_dragonball_comment
Dragonball: delete redundant comments in blk_dev_mgr
2022-10-13 09:01:59 +08:00
Fabiano Fidêncio
9eb73d543a versions: Update TDX kernel
The previously used repo has been removed by Intel.  As this happened,
the TDX team worked on providing the patches that were hosted atop of
the v5.15 kernel as a tarball present in the
https://github.com/intel/tdx-tools repos, see
https://github.com/intel/tdx-tools/pull/161.

On the Kata Containers side, in order to simplify the process and to
avoid adding ~1400 kernel patches to our repo, we've revived the
https://github.com/kata-containers/linux repo, and created a branch and
a tag with those ~1400 patches atop of the v5.15.  The branch is called
v5.15-plus-TDX, and the tag is called 5.15-plus-TDX (in order to avoid
having to change how the kernel builder script deals with versioning).

Knowing the whole background, let's switch the repo we're getting the
TDX kernel from.

Fixes: #5326

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-10-12 16:54:43 +02:00
James O. D. Hunt
d3ee8d9f1b Merge pull request #5388 from jodh-intel/kata-ctl
kata-ctl: Move development to main branch
2022-10-12 14:29:35 +01:00
James O. D. Hunt
00a42f69c0 kata-ctl: cargo: 2021 -> 2018
Revert to the 2018 edition of rust for consistency with other rust
components.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-12 11:46:51 +01:00
James O. D. Hunt
fb63274747 kata-ctl: rustfmt + clippy fixes
Make this file conform to the standard rust layout conventions and
simplify the code as recommended by `clippy`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-12 11:46:48 +01:00
wllenyj
1f1901e059 dragonball: fix clippy warning for aarch64
Added aarch64 check.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 18:29:00 +08:00
wllenyj
a343c570e4 dragonball: enhance dragonball ci
Unified use of Makefile instead of calling `cargo test` directly.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 17:53:01 +08:00
wllenyj
6a64fb0eb3 ci: skip s390x for dragonball.
Currently, Dragonball only supports x86_64 and aarch64 platforms.

Fixes: #4381

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-12 15:27:45 +08:00
Bin Liu
7aacba0abc Merge pull request #5282 from liubin/fix/4730-rs-emptydir
runtime-rs: support ephemeral storage for emptydir
2022-10-12 09:53:59 +08:00
Chao Wu
a743e37daf Dragonball: delete redundant comments in blk_dev_mgr
delete redundent derive part for BlockDeviceMgr.

fixes: #5396

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-11 19:41:47 +08:00
Chao Wu
d2bf2f5dd0 Merge pull request #5393 from LetFu/5392/fixInstallKata30RustRuntimeShimGuideTypo
docs: fix a typo in rust-runtime-installation-guide
2022-10-11 19:27:31 +08:00
James O. D. Hunt
2b345ba29d build: Add kata-ctl to tools list
Update the top-level Makefile to build the `kata-ctl` tool by default.

Fixes: #4499, #5334.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-11 10:05:16 +01:00
James O. D. Hunt
f7010b8061 kata-ctl: docs: Write basic documentation
Provide a basic document explaining a little about the `kata-ctl`
command.

Fixes: #5351.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-11 10:04:48 +01:00
Bin Liu
ffdd7e1ad8 Merge pull request #4961 from wllenyj/dragonball-ut-2
Built-in Sandbox: add more unit tests for dragonball
2022-10-11 14:12:25 +08:00
Bin Liu
39702c19d5 Merge pull request #5276 from bergwolf/github/readme
readme: remove libraries mentioning
2022-10-11 13:19:18 +08:00
chmod100
862eaef863 docs: fix a typo in rust-runtime-installation-guide
Fixes: #5392

Signed-off-by: chmod100 <letfu@outlook.com>
2022-10-11 02:31:29 +00:00
wllenyj
26c043dee7 ci: Add dragonball test
Enhanced Static-Check of CI to support nested virtualization.

Fixes: #5378

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-11 00:36:20 +08:00
James O. D. Hunt
781e604c39 docs: Reference kata-ctl README
Add a link to the `kata-ctl` tool's README.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 16:49:53 +01:00
James O. D. Hunt
15c343cbf2 kata-ctl: Don't rely on system ssl libs
Build using the rust TLS implementation rather than the system ones.
This resolves the `reqwest` crate build failure: it doesn't appear to
build against the native libssl libraries due to Kata defaulting to
using the musl libc.

Fixes: #5387.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
James O. D. Hunt
c23584994a kata-ctl: clippy: Resolve warnings and reformat
Resolved a couple of clippy warnings and applied standard `rustfmt`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
David Esparza
133690434c kata-ctl: implement CLI argument --check-version-only
This kata-ctl argument returns the latest stable Kata
release by hitting github.com.
Adds check-version unit tests.

Fixes: #11

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2022-10-10 13:42:51 +01:00
David Esparza
eb5423cb7f kata-ctl: switch to use clap derive for CLI handling
Switch from the functional version of `clap` to the declarative
methodology.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:51 +01:00
Chelsea Mafrica
018aa899cb kata-ctl: Add cpu check
Add architecture-specific code for x86_64 and generic calls handling
checks for CPU flags and attributes.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
7c9f9a5a1d kata-ctl: Make arch test run at compile time
Changed the `panic!()` call to a `compile_error!()` one to ensure it
fires at compile time rather than runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
b63ba66dc3 kata-ctl: Formatting tweaks
Automatic format updates.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
James O. D. Hunt
cca7e32b54 kata-ctl: Lint fixes to allow the branch to be built
Remove return value for branches that call `unimplemented!()`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:50 +01:00
Chelsea Mafrica
8e7bb8521c kata-ctl: add code for framework for arch
Add framework for different architectures for check. In the existing
kata-runtime check, the network checks do not appear to be
architecture-specific while the kernel module, cpu, and kvm checks do
have separate implementations for different architectures.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 13:42:50 +01:00
David Esparza
303fc8b118 kata-ctl: Add unit tests cases
Add more unit tests cases to --version argument.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:43 +01:00
David Esparza
d0b33e9a32 versions: Add kata-ctl version entry
As we're switching to using the rust version of the kata-ctl, lets
provide with its own entry in the kata-ctl command line.

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Commit-edited-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-10-10 13:42:35 +01:00
Chelsea Mafrica
002b18054d kata-ctl: Add initial rust code for kata-ctl
Use agent-ctl tool rust code as an example for a skeleton for the new
kata-ctl tool.

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2022-10-10 10:10:37 +01:00
wllenyj
b62b18bf1c dragonball: fix clippy warning
Fixed:
- unnecessary_lazy_evaluations
- derive_partial_eq_without_eq
- redundant_closure
- single_match
- question_mark
- unused-must-use
- redundant_clone
- needless_return

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
2ddc948d30 Makefile: add dragonball components.
Enable ci to run dragonball unit tests.

Fixes: #4899

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
3fe81fe4ab dragonball-ut: use skip_if_not_root to skip root case
Use skip_if_not_root to skip when unit test requires privileges.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:40 +08:00
wllenyj
72259f101a dragonball: add more unit test for vmm actions
Added more unit tests for vmm actions.

Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
2022-10-10 16:41:39 +08:00
Chao Wu
9717dc3f75 Dragonball: remove redundant comments in event manager
handle_events for EventManager doesn't take max_events as arguments, so
we need to update the comments for it.

p.s. max_events is defined when initializing the EventManager.

fixes: #5382

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2022-10-09 14:38:12 +08:00
Ji-Xinyou
9c1ac3d457 runtime-rs: return port on agent-url req
Add the server vport (1024) when requesting agent-url

Fixes: #5213
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
2022-10-08 16:14:21 +08:00
Champ-Goblem
89e62d4edf shim: Ensure pagesize is set when reporting hugetbl stats
The containerd stats method and metrics API are broken with Kata 2.5.x, the stats fail to load and the metrics API responds with status code 500

This seems to be down to the conversion from the stats reported by the agent RPC `StatsContainer` where the field `Pagesize` is not
completed by the `setHugetlbStats` method. In the case where multiple sized tables stats are reported, this causes containerd to register two metrics
with the same label set, rather than each being partitioned by the `page` label.

Fixes: #5316
Signed-off-by: Champ-Goblem <cameron@northflank.com>
2022-10-04 09:16:30 +01:00
Bin Liu
8d4ced3c86 runtime-rs: support ephemeral storage for emptydir
Add support for ephemeral storage and k8s emptydir.

Depends-on:github.com/kata-containers/tests#5161

Fixes: #4730

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-30 09:10:20 +08:00
Peng Tao
046ddc6463 readme: remove libraries mentioning
There are two duplicated mentioning of the rust libraries in README.md.
Let's just remove them all as the section is intended to list out core
Kata components rather than general libraries.

Fixes: #5275
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-09-29 12:10:50 +08:00
Bin Liu
86ad832e37 runtime-rs: force shutdown shim process in it can't exit
In some case the call of cleanup from shim to service manager will fail,
and the shim process will continue to running, that will make process leak.

This commit will force shutdown the shim process in case of any errors in
service crate.

Fixes: #5087

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-09-02 19:43:50 +08:00
461 changed files with 19042 additions and 5944 deletions

View File

@@ -1,5 +1,12 @@
name: Cargo Crates Check Runner
on: [pull_request]
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]
jobs:
cargo-deny-runner:
runs-on: ubuntu-latest

View File

@@ -5,20 +5,16 @@ on:
- edited
- reopened
- synchronize
paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]
name: Darwin tests
jobs:
test:
strategy:
matrix:
go-version: [1.16.x, 1.17.x]
os: [macos-latest]
runs-on: ${{ matrix.os }}
runs-on: macos-latest
steps:
- name: Install Go
uses: actions/setup-go@v2
with:
go-version: ${{ matrix.go-version }}
go-version: 1.19.3
- name: Checkout code
uses: actions/checkout@v2
- name: Build utils

View File

@@ -5,11 +5,7 @@ on:
name: Docs URL Alive Check
jobs:
test:
strategy:
matrix:
go-version: [1.17.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
runs-on: ubuntu-20.04
# don't run this action on forks
if: github.repository_owner == 'kata-containers'
env:
@@ -18,7 +14,7 @@ jobs:
- name: Install Go
uses: actions/setup-go@v2
with:
go-version: ${{ matrix.go-version }}
go-version: 1.19.3
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Set env

View File

@@ -25,6 +25,7 @@ jobs:
- rootfs-image
- rootfs-initrd
- virtiofsd
- nydus
steps:
- uses: actions/checkout@v2
- name: Install docker

View File

@@ -50,6 +50,7 @@ jobs:
- cloud-hypervisor
- firecracker
- kernel
- nydus
- qemu
- rootfs-image
- rootfs-initrd

View File

@@ -13,6 +13,7 @@ jobs:
- cloud-hypervisor
- firecracker
- kernel
- nydus
- qemu
- rootfs-image
- rootfs-initrd

View File

@@ -4,6 +4,9 @@ on:
tags:
- '[0-9]+.[0-9]+.[0-9]+*'
env:
SNAPCRAFT_STORE_CREDENTIALS: ${{ secrets.snapcraft_token }}
jobs:
release-snap:
runs-on: ubuntu-20.04
@@ -14,9 +17,16 @@ jobs:
fetch-depth: 0
- name: Install Snapcraft
uses: samuelmeuli/action-snapcraft@v1
with:
snapcraft_token: ${{ secrets.snapcraft_token }}
run: |
# Required to avoid snapcraft install failure
sudo chown root:root /
# "--classic" is needed for the GitHub action runner
# environment.
sudo snap install snapcraft --classic
# Allow other parts to access snap binaries
echo /snap/bin >> "$GITHUB_PATH"
- name: Build snap
run: |

View File

@@ -6,6 +6,7 @@ on:
- synchronize
- reopened
- edited
paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]
jobs:
test:
@@ -19,7 +20,16 @@ jobs:
- name: Install Snapcraft
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: samuelmeuli/action-snapcraft@v1
run: |
# Required to avoid snapcraft install failure
sudo chown root:root /
# "--classic" is needed for the GitHub action runner
# environment.
sudo snap install snapcraft --classic
# Allow other parts to access snap binaries
echo /snap/bin >> "$GITHUB_PATH"
- name: Build snap
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

View File

@@ -0,0 +1,33 @@
on:
pull_request:
types:
- opened
- edited
- reopened
- synchronize
paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]
name: Static checks dragonball
jobs:
test-dragonball:
runs-on: self-hosted
env:
RUST_BACKTRACE: "1"
steps:
- uses: actions/checkout@v3
- name: Set env
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
- name: Install Rust
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
./ci/install_rust.sh
PATH=$PATH:"$HOME/.cargo/bin"
- name: Run Unit Test
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd src/dragonball
cargo version
rustc --version
sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test

View File

@@ -8,12 +8,16 @@ on:
name: Static checks
jobs:
test:
static-checks:
runs-on: ubuntu-20.04
strategy:
matrix:
go-version: [1.16.x, 1.17.x]
os: [ubuntu-20.04]
runs-on: ${{ matrix.os }}
cmd:
- "make vendor"
- "make static-checks"
- "make check"
- "make test"
- "sudo -E PATH=\"$PATH\" make test"
env:
TRAVIS: "true"
TRAVIS_BRANCH: ${{ github.base_ref }}
@@ -22,11 +26,16 @@ jobs:
RUST_BACKTRACE: "1"
target_branch: ${{ github.base_ref }}
steps:
- name: Install Go
- name: Checkout code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/setup-go@v2
uses: actions/checkout@v3
with:
go-version: ${{ matrix.go-version }}
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Install Go
uses: actions/setup-go@v3
with:
go-version: 1.19.3
env:
GOPATH: ${{ runner.workspace }}/kata-containers
- name: Setup GOPATH
@@ -41,12 +50,6 @@ jobs:
run: |
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
- name: Checkout code
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
uses: actions/checkout@v2
with:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- name: Setup travis references
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
@@ -66,6 +69,7 @@ jobs:
rustup target add x86_64-unknown-linux-musl
rustup component add rustfmt clippy
- name: Setup seccomp
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
@@ -73,24 +77,7 @@ jobs:
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
# Check whether the vendored code is up-to-date & working as the first thing
- name: Check vendored code
- name: Run check
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor
- name: Static Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks
- name: Run Compiler Checks
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make check
- name: Run Unit Tests
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && make test
- name: Run Unit Tests As Root User
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
run: |
cd ${GOPATH}/src/github.com/${{ github.repository }} && sudo -E PATH="$PATH" make test
cd ${GOPATH}/src/github.com/${{ github.repository }} && ${{ matrix.cmd }}

3
.gitignore vendored
View File

@@ -4,6 +4,8 @@
**/*.rej
**/target
**/.vscode
**/.idea
**/.fleet
pkg/logging/Cargo.lock
src/agent/src/version.rs
src/agent/kata-agent.service
@@ -11,4 +13,3 @@ src/agent/protocols/src/*.rs
!src/agent/protocols/src/lib.rs
build
src/tools/log-parser/kata-log-parser

View File

@@ -8,6 +8,7 @@ COMPONENTS =
COMPONENTS += libs
COMPONENTS += agent
COMPONENTS += dragonball
COMPONENTS += runtime
COMPONENTS += runtime-rs
@@ -15,11 +16,12 @@ COMPONENTS += runtime-rs
TOOLS =
TOOLS += agent-ctl
TOOLS += trace-forwarder
TOOLS += runk
TOOLS += kata-ctl
TOOLS += log-parser
TOOLS += runk
TOOLS += trace-forwarder
STANDARD_TARGETS = build check clean install test vendor
STANDARD_TARGETS = build check clean install static-checks-build test vendor
default: all
@@ -35,7 +37,7 @@ generate-protocols:
make -C src/agent generate-protocols
# Some static checks rely on generated source files of components.
static-checks: build
static-checks: static-checks-build
bash ci/static-checks.sh
docs-url-alive-check:

View File

@@ -119,10 +119,8 @@ The table below lists the core parts of the project:
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [runtime-rs](src/runtime-rs) | core | The Rust version runtime. |
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
| [libraries](src/libs) | core | Library crates shared by multiple Kata Container components or published to [`crates.io`](https://crates.io/index.html) |
| [`dragonball`](src/dragonball) | core | An optional built-in VMM brings out-of-the-box Kata Containers experience with optimizations on container workloads |
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
| [libraries](src/libs) | core | Library crates shared by multiple Kata Container components or published to [`crates.io`](https://crates.io/index.html) |
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
### Additional components
@@ -135,6 +133,7 @@ The table below lists the remaining parts of the project:
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |

View File

@@ -1 +1 @@
3.1.0-alpha0
3.1.0-alpha1

View File

@@ -86,6 +86,27 @@ $ sudo sed -i '/^disable_guest_seccomp/ s/true/false/' /etc/kata-containers/conf
This will pass container seccomp profiles to the kata agent.
## Enable SELinux on the guest
> **Note:**
>
> - To enable SELinux on the guest, SELinux MUST be also enabled on the host.
> - You MUST create and build a rootfs image for SELinux in advance.
> See [Create a rootfs image](#create-a-rootfs-image) and [Build a rootfs image](#build-a-rootfs-image).
> - SELinux on the guest is supported in only a rootfs image currently, so
> you cannot enable SELinux with the agent init (`AGENT_INIT=yes`) yet.
Enable guest SELinux in Enforcing mode as follows:
```
$ sudo sed -i '/^disable_guest_selinux/ s/true/false/g' /etc/kata-containers/configuration.toml
```
The runtime automatically will set `selinux=1` to the kernel parameters and `xattr` option to
`virtiofsd` when `disable_guest_selinux` is set to `false`.
If you want to enable SELinux in Permissive mode, add `enforcing=0` to the kernel parameters.
## Enable full debug
Enable full debug as follows:
@@ -256,6 +277,12 @@ If you want to build the agent without seccomp capability, you need to run the `
$ script -fec 'sudo -E AGENT_INIT=yes USE_DOCKER=true SECCOMP=no ./rootfs.sh "${distro}"'
```
If you want to enable SELinux on the guest, you MUST choose `centos` and run the `rootfs.sh` script with `SELINUX=yes` as follows.
```
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true SELINUX=yes ./rootfs.sh centos'
```
> **Note:**
>
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
@@ -283,6 +310,19 @@ $ script -fec 'sudo -E USE_DOCKER=true ./image_builder.sh "${ROOTFS_DIR}"'
$ popd
```
If you want to enable SELinux on the guest, you MUST run the `image_builder.sh` script with `SELINUX=yes`
to label the guest image as follows.
To label the image on the host, you need to make sure that SELinux is enabled (`selinuxfs` is mounted) on the host
and the rootfs MUST be created by running the `rootfs.sh` with `SELINUX=yes`.
```
$ script -fec 'sudo -E USE_DOCKER=true SELINUX=yes ./image_builder.sh ${ROOTFS_DIR}'
```
Currently, the `image_builder.sh` uses `chcon` as an interim solution in order to apply `container_runtime_exec_t`
to the `kata-agent`. Hence, if you run `restorecon` to the guest image after running the `image_builder.sh`,
the `kata-agent` needs to be labeled `container_runtime_exec_t` again by yourself.
> **Notes:**
>
> - You must ensure that the *default Docker runtime* is `runc` to make use of
@@ -441,7 +481,7 @@ When using the file system type virtio-fs (default), `virtiofsd` is required
```bash
$ pushd kata-containers/tools/packaging/static-build/virtiofsd
$ ./build-static-virtiofsd.sh
$ ./build.sh
$ popd
```

View File

@@ -7,7 +7,9 @@ Kata Containers design documents:
- [Design requirements for Kata Containers](kata-design-requirements.md)
- [VSocks](VSocks.md)
- [VCPU handling](vcpu-handling.md)
- [VCPU threads pinning](vcpu-threads-pinning.md)
- [Host cgroups](host-cgroups.md)
- [Agent systemd cgroup](agent-systemd-cgroup.md)
- [`Inotify` support](inotify.md)
- [Metrics(Kata 2.0)](kata-2-0-metrics.md)
- [Design for Kata Containers `Lazyload` ability with `nydus`](kata-nydus-design.md)

View File

@@ -0,0 +1,84 @@
# Systemd Cgroup for Agent
As we know, we can interact with cgroups in two ways, **`cgroupfs`** and **`systemd`**. The former is achieved by reading and writing cgroup `tmpfs` files under `/sys/fs/cgroup` while the latter is done by configuring a transient unit by requesting systemd. Kata agent uses **`cgroupfs`** by default, unless you pass the parameter `--systemd-cgroup`.
## usage
For systemd, kata agent configures cgroups according to the following `linux.cgroupsPath` format standard provided by `runc` (`[slice]:[prefix]:[name]`). If you don't provide a valid `linux.cgroupsPath`, kata agent will treat it as `"system.slice:kata_agent:<container-id>"`.
> Here slice is a systemd slice under which the container is placed. If empty, it defaults to system.slice, except when cgroup v2 is used and rootless container is created, in which case it defaults to user.slice.
>
> Note that slice can contain dashes to denote a sub-slice (e.g. user-1000.slice is a correct notation, meaning a `subslice` of user.slice), but it must not contain slashes (e.g. user.slice/user-1000.slice is invalid).
>
> A slice of `-` represents a root slice.
>
> Next, prefix and name are used to compose the unit name, which is `<prefix>-<name>.scope`, unless name has `.slice` suffix, in which case prefix is ignored and the name is used as is.
## supported properties
The kata agent will translate the parameters in the `linux.resources` of `config.json` into systemd unit properties, and send it to systemd for configuration. Since systemd supports limited properties, only the following parameters in `linux.resources` will be applied. We will simply treat hybrid mode as legacy mode by the way.
- CPU
- v1
| runtime spec resource | systemd property name |
| --------------------- | --------------------- |
| `cpu.shares` | `CPUShares` |
- v2
| runtime spec resource | systemd property name |
| -------------------------- | -------------------------- |
| `cpu.shares` | `CPUShares` |
| `cpu.period` | `CPUQuotaPeriodUSec`(v242) |
| `cpu.period` & `cpu.quota` | `CPUQuotaPerSecUSec` |
- MEMORY
- v1
| runtime spec resource | systemd property name |
| --------------------- | --------------------- |
| `memory.limit` | `MemoryLimit` |
- v2
| runtime spec resource | systemd property name |
| ------------------------------ | --------------------- |
| `memory.low` | `MemoryLow` |
| `memory.max` | `MemoryMax` |
| `memory.swap` & `memory.limit` | `MemorySwapMax` |
- PIDS
| runtime spec resource | systemd property name |
| --------------------- | --------------------- |
| `pids.limit ` | `TasksMax` |
- CPUSET
| runtime spec resource | systemd property name |
| --------------------- | -------------------------- |
| `cpuset.cpus` | `AllowedCPUs`(v244) |
| `cpuset.mems` | `AllowedMemoryNodes`(v244) |
## Systemd Interface
`session.rs` and `system.rs` in `src/agent/rustjail/src/cgroups/systemd/interface` are automatically generated by `zbus-xmlgen`, which is is an accompanying tool provided by `zbus` to generate Rust code from `D-Bus XML interface descriptions`. The specific commands to generate these two files are as follows:
```shell
// system.rs
zbus-xmlgen --system org.freedesktop.systemd1 /org/freedesktop/systemd1
// session.rs
zbus-xmlgen --session org.freedesktop.systemd1 /org/freedesktop/systemd1
```
The current implementation of `cgroups/systemd` uses `system.rs` while `session.rs` could be used to build rootless containers in the future.
## references
- [runc - systemd cgroup driver](https://github.com/opencontainers/runc/blob/main/docs/systemd.md)
- [systemd.resource-control — Resource control unit settings](https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html)

Binary file not shown.

After

Width:  |  Height:  |  Size: 193 KiB

View File

@@ -64,8 +64,8 @@ The kata-runtime is controlled by TOKIO_RUNTIME_WORKER_THREADS to run the OS thr
├─ TTRPC listener thread(M * tokio task)
├─ TTRPC client handler thread(7 * M * tokio task)
├─ container stdin io thread(M * tokio task)
├─ container stdin io thread(M * tokio task)
└─ container stdin io thread(M * tokio task)
├─ container stdout io thread(M * tokio task)
└─ container stderr io thread(M * tokio task)
```
### Extensible Framework
The Kata 3.x runtime is designed with the extension of service, runtime, and hypervisor, combined with configuration to meet the needs of different scenarios. At present, the service provides a register mechanism to support multiple services. Services could interact with runtime through messages. In addition, the runtime handler handles messages from services. To meet the needs of a binary that supports multiple runtimes and hypervisors, the startup must obtain the runtime handler type and hypervisor type through configuration.

View File

@@ -81,7 +81,7 @@ Notes: given that the `mountInfo` is persisted to the disk by the Kata runtime,
Instead of the CSI node driver writing the mount info into a `csiPlugin.json` file under the volume root,
as described in the original proposal, here we propose that the CSI node driver passes the mount information to
the Kata Containers runtime through a new `kata-runtime` commandline command. The `kata-runtime` then writes the mount
information to a `mount-info.json` file in a predefined location (`/run/kata-containers/shared/direct-volumes/[volume_path]/`).
information to a `mountInfo.json` file in a predefined location (`/run/kata-containers/shared/direct-volumes/[volume_path]/`).
When the Kata Containers runtime starts a container, it verifies whether a volume mount is a direct-assigned volume by checking
whether there is a `mountInfo` file under the computed Kata `direct-volumes` directory. If it is, the runtime parses the `mountInfo` file,

View File

@@ -0,0 +1,37 @@
# Design Doc for Kata Containers' VCPUs Pinning Feature
## Background
By now, vCPU threads of Kata Containers are scheduled randomly to CPUs. And each pod would request a specific set of CPUs which we call it CPU set (just the CPU set meaning in Linux cgroups).
If the number of vCPU threads are equal to that of CPUs claimed in CPU set, we can then pin each vCPU thread to one specified CPU, to reduce the cost of random scheduling.
## Detailed Design
### Passing Config Parameters
Two ways are provided to use this vCPU thread pinning feature: through `QEMU` configuration file and through annotations. Finally the pinning parameter is passed to `HypervisorConfig`.
### Related Linux Thread Scheduling API
| API Info | Value |
|-------------------|-----------------------------------------------------------|
| Package | `golang.org/x/sys/unix` |
| Method | `unix.SchedSetaffinity(thread_id, &unixCPUSet)` |
| Official Doc Page | https://pkg.go.dev/golang.org/x/sys/unix#SchedSetaffinity |
### When is VCPUs Pinning Checked?
As shown in Section 1, when `num(vCPU threads) == num(CPUs in CPU set)`, we shall pin each vCPU thread to a specified CPU. And when this condition is broken, we should restore to the original random scheduling pattern.
So when may `num(CPUs in CPU set)` change? There are 5 possible scenes:
| Possible scenes | Related Code |
|-----------------------------------|--------------------------------------------|
| when creating a container | File Sandbox.go, in method `CreateContainer` |
| when starting a container | File Sandbox.go, in method `StartContainer` |
| when deleting a container | File Sandbox.go, in method `DeleteContainer` |
| when updating a container | File Sandbox.go, in method `UpdateContainer` |
| when creating multiple containers | File Sandbox.go, in method `createContainers` |
### Core Pinning Logics
We can split the whole process into the following steps. Related methods are `checkVCPUsPinning` and `resetVCPUsPinning`, in file Sandbox.go.
![](arch-images/vcpus-pinning-process.png)

View File

@@ -257,6 +257,48 @@ This launches a BusyBox container named `hello`, and it will be removed by `--rm
The `--cni` flag enables CNI networking for the container. Without this flag, a container with just a
loopback interface is created.
### Launch containers using `ctr` command line with rootfs bundle
#### Get rootfs
Use the script to create rootfs
```bash
ctr i pull quay.io/prometheus/busybox:latest
ctr i export rootfs.tar quay.io/prometheus/busybox:latest
rootfs_tar=rootfs.tar
bundle_dir="./bundle"
mkdir -p "${bundle_dir}"
# extract busybox rootfs
rootfs_dir="${bundle_dir}/rootfs"
mkdir -p "${rootfs_dir}"
layers_dir="$(mktemp -d)"
tar -C "${layers_dir}" -pxf "${rootfs_tar}"
for ((i=0;i<$(cat ${layers_dir}/manifest.json | jq -r ".[].Layers | length");i++)); do
tar -C ${rootfs_dir} -xf ${layers_dir}/$(cat ${layers_dir}/manifest.json | jq -r ".[].Layers[${i}]")
done
```
#### Get `config.json`
Use runc spec to generate `config.json`
```bash
cd ./bundle/rootfs
runc spec
mv config.json ../
```
Change the root `path` in `config.json` to the absolute path of rootfs
```JSON
"root":{
"path":"/root/test/bundle/rootfs",
"readonly": false
},
```
#### Run container
```bash
sudo ctr run -d --runtime io.containerd.run.kata.v2 --config bundle/config.json hello
sudo ctr t exec --exec-id ${ID} -t hello sh
```
### Launch Pods with `crictl` command line
With the `crictl` command line of `cri-tools`, you can specify runtime class with `-r` or `--runtime` flag.

View File

@@ -50,7 +50,7 @@ $ qemu_commit="$(get_from_kata_deps "assets.hypervisor.qemu.snp.commit")"
$ git clone -b "${qemu_branch}" "${qemu_url}"
$ pushd qemu
$ git checkout "${qemu_commit}"
$ ./configure --target-list=x86_64-softmmu --enable-debug
$ ./configure --enable-virtfs --target-list=x86_64-softmmu --enable-debug
$ make -j "$(nproc)"
$ popd
```

View File

@@ -17,9 +17,9 @@ Enable setup swap device in guest kernel as follows:
$ sudo sed -i -e 's/^#enable_guest_swap.*$/enable_guest_swap = true/g' /etc/kata-containers/configuration.toml
```
## Run a Kata Container utilizing swap device
## Run a Kata Containers utilizing swap device
Use following command to start a Kata Container with swappiness 60 and 1GB swap device (swap_in_bytes - memory_limit_in_bytes).
Use following command to start a Kata Containers with swappiness 60 and 1GB swap device (swap_in_bytes - memory_limit_in_bytes).
```
$ pod_yaml=pod.yaml
$ container_yaml=container.yaml
@@ -43,12 +43,12 @@ command:
- top
EOF
$ sudo crictl pull $image
$ podid=$(sudo crictl runp $pod_yaml)
$ podid=$(sudo crictl runp --runtime kata $pod_yaml)
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
$ sudo crictl start $cid
```
Kata Container setups swap device for this container only when `io.katacontainers.container.resource.swappiness` is set.
Kata Containers setups swap device for this container only when `io.katacontainers.container.resource.swappiness` is set.
The following table shows the swap size how to decide if `io.katacontainers.container.resource.swappiness` is set.
|`io.katacontainers.container.resource.swap_in_bytes`|`memory_limit_in_bytes`|swap size|

View File

@@ -212,7 +212,7 @@ Next, we need to configure containerd. Add a file in your path (e.g. `/usr/local
```
#!/bin/bash
KATA_CONF_FILE=/etc/containers/configuration-fc.toml /usr/local/bin/containerd-shim-kata-v2 $@
KATA_CONF_FILE=/etc/kata-containers/configuration-fc.toml /usr/local/bin/containerd-shim-kata-v2 $@
```
> **Note:** You may need to edit the paths of the configuration file and the `containerd-shim-kata-v2` to correspond to your setup.

View File

@@ -24,7 +24,7 @@ architectures:
| Installation method | Description | Automatic updates | Use case | Availability
|------------------------------------------------------|----------------------------------------------------------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------|----------- |
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. | No |
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. | Yes |
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. | No |
| [Using snap](#snap-installation) | Easy to install | yes | Good alternative to official distro packages. | No |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. | No |
@@ -32,7 +32,8 @@ architectures:
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. | Yes |
### Kata Deploy Installation
`ToDo`
Follow the [`kata-deploy`](../../tools/packaging/kata-deploy/README.md).
### Official packages
`ToDo`
### Snap Installation
@@ -48,14 +49,14 @@ architectures:
* Download `Rustup` and install `Rust`
> **Notes:**
> Rust version 1.58 is needed
> Rust version 1.62.0 is needed
Example for `x86_64`
```
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$ source $HOME/.cargo/env
$ rustup install 1.58
$ rustup default 1.58-x86_64-unknown-linux-gnu
$ rustup install 1.62.0
$ rustup default 1.62.0-x86_64-unknown-linux-gnu
```
* Musl support for fully static binary
@@ -83,7 +84,7 @@ $ git clone https://github.com/kata-containers/kata-containers.git
$ cd kata-containers/src/runtime-rs
$ make && sudo make install
```
After running the command above, the default config file `configuration.toml` will be installed under `/usr/share/defaults/kata-containers/`, the binary file `containerd-shim-kata-v2` will be installed under `/user/local/bin` .
After running the command above, the default config file `configuration.toml` will be installed under `/usr/share/defaults/kata-containers/`, the binary file `containerd-shim-kata-v2` will be installed under `/usr/local/bin/` .
### Build Kata Containers Kernel
Follow the [Kernel installation guide](/tools/packaging/kernel/README.md).

View File

@@ -545,6 +545,12 @@ Create the hook execution file for Kata:
/usr/bin/nvidia-container-toolkit -debug $@
```
Make sure the hook shell is executable:
```sh
chmod +x $ROOTFS_DIR/usr/share/oci/hooks/prestart/nvidia-container-toolkit.sh
```
As the last step one can do some cleanup of files or package caches. Build the
rootfs and configure it for use with Kata according to the development guide.

View File

@@ -61,6 +61,9 @@ spec:
name: eosgx-demo-job-1
image: oeciteam/oe-helloworld:latest
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /dev
name: dev-mount
securityContext:
readOnlyRootFilesystem: true
capabilities:

View File

@@ -197,11 +197,6 @@ vhost_user_store_path = "<Path of the base directory for vhost-user device>"
> under `[hypervisor.qemu]` section.
For the subdirectories of `vhost_user_store_path`: `block` is used for block
device; `block/sockets` is where we expect UNIX domain sockets for vhost-user
block devices to live; `block/devices` is where simulated block device nodes
for vhost-user block devices are created.
For the subdirectories of `vhost_user_store_path`:
- `block` is used for block device;
- `block/sockets` is where we expect UNIX domain sockets for vhost-user

View File

@@ -82,8 +82,39 @@ parts:
fi
rustup component add rustfmt
docker:
after: [metadata]
plugin: nil
prime:
- -*
build-packages:
- ca-certificates
- containerd
- curl
- gnupg
- lsb-release
- runc
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
curl -fsSL https://download.docker.com/linux/ubuntu/gpg |\
sudo gpg --batch --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
distro_codename=$(lsb_release -cs)
echo "deb [arch=${dpkg_arch} signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu ${distro_codename} stable" |\
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get -y update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io
echo "Unmasking docker service"
sudo -E systemctl unmask docker.service || true
sudo -E systemctl unmask docker.socket || true
echo "Adding $USER into docker group"
sudo -E gpasswd -a $USER docker
echo "Starting docker"
sudo -E systemctl start docker || true
image:
after: [godeps, qemu, kernel]
after: [godeps, docker, qemu, kernel]
plugin: nil
build-packages:
- docker.io
@@ -107,14 +138,6 @@ parts:
# Copy yq binary. It's used in the container
cp -a "${yq}" "${GOPATH}/bin/"
echo "Unmasking docker service"
sudo -E systemctl unmask docker.service || true
sudo -E systemctl unmask docker.socket || true
echo "Adding $USER into docker group"
sudo -E gpasswd -a $USER docker
echo "Starting docker"
sudo -E systemctl start docker || true
cd "${kata_dir}/tools/osbuilder"
# build image
@@ -301,54 +324,31 @@ parts:
virtiofsd:
plugin: nil
after: [godeps, rustdeps]
after: [godeps, rustdeps, docker]
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
# Currently, powerpc makes use of the QEMU's C implementation.
# The other platforms make use of the new rust virtiofsd.
#
# See "tools/packaging/scripts/configure-hypervisor.sh".
if [ "${arch}" == "ppc64le" ]
then
echo "INFO: Building QEMU's C version of virtiofsd"
# Handled by the 'qemu' part, so nothing more to do here.
exit 0
else
echo "INFO: Building rust version of virtiofsd"
fi
echo "INFO: Building rust version of virtiofsd"
cd "${kata_dir}"
cd "${SNAPCRAFT_PROJECT_DIR}"
# Clean-up build dir in case it already exists
sudo -E NO_TTY=true make virtiofsd-tarball
export PATH=${PATH}:${HOME}/.cargo/bin
# Download the rust implementation of virtiofsd
tools/packaging/static-build/virtiofsd/build-static-virtiofsd.sh
sudo install \
--owner='root' \
--group='root' \
--mode=0755 \
-D \
--target-directory="${SNAPCRAFT_PART_INSTALL}/usr/libexec/" \
virtiofsd/virtiofsd
build/virtiofsd/builddir/virtiofsd/virtiofsd
cloud-hypervisor:
plugin: nil
after: [godeps]
after: [godeps, docker]
override-build: |
source "${SNAPCRAFT_PROJECT_DIR}/snap/local/snap-common.sh"
if [ "${arch}" == "aarch64" ] || [ "${arch}" == "x86_64" ]; then
sudo apt-get -y update
sudo apt-get -y install ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg |\
sudo gpg --batch --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
distro_codename=$(lsb_release -cs)
echo "deb [arch=${dpkg_arch} signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu ${distro_codename} stable" |\
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get -y update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io
sudo systemctl start docker.socket
cd "${SNAPCRAFT_PROJECT_DIR}"
sudo -E NO_TTY=true make cloud-hypervisor-tarball

487
src/agent/Cargo.lock generated
View File

@@ -47,6 +47,71 @@ version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c5d78ce20460b82d3fa150275ed9d55e21064fc7951177baacf86a145c4a4b1f"
[[package]]
name = "async-broadcast"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d26004fe83b2d1cd3a97609b21e39f9a31535822210fe83205d2ce48866ea61"
dependencies = [
"event-listener",
"futures-core",
"parking_lot 0.12.1",
]
[[package]]
name = "async-channel"
version = "1.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e14485364214912d3b19cc3435dde4df66065127f05fa0d75c712f36f12c2f28"
dependencies = [
"concurrent-queue",
"event-listener",
"futures-core",
]
[[package]]
name = "async-executor"
version = "1.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "871f9bb5e0a22eeb7e8cf16641feb87c9dc67032ccf8ff49e772eb9941d3a965"
dependencies = [
"async-task",
"concurrent-queue",
"fastrand",
"futures-lite",
"once_cell",
"slab",
]
[[package]]
name = "async-io"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "83e21f3a490c72b3b0cf44962180e60045de2925d8dff97918f7ee43c8f637c7"
dependencies = [
"autocfg",
"concurrent-queue",
"futures-lite",
"libc",
"log",
"once_cell",
"parking",
"polling",
"slab",
"socket2",
"waker-fn",
"winapi",
]
[[package]]
name = "async-lock"
version = "2.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e97a171d191782fba31bb902b14ad94e24a68145032b7eedf871ab0bc0d077b6"
dependencies = [
"event-listener",
]
[[package]]
name = "async-recursion"
version = "0.3.2"
@@ -58,6 +123,12 @@ dependencies = [
"syn",
]
[[package]]
name = "async-task"
version = "4.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a40729d2133846d9ed0ea60a8b9541bccddab49cd30f0715a1da672fe9a2524"
[[package]]
name = "async-trait"
version = "0.1.56"
@@ -86,6 +157,12 @@ version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "base64"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "904dfeac50f3cdaba28fc6f57fdcddb75f49ed61346676a78c4ffe55877802fd"
[[package]]
name = "bincode"
version = "1.3.3"
@@ -95,12 +172,28 @@ dependencies = [
"serde",
]
[[package]]
name = "bit-vec"
version = "0.6.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "349f9b6a179ed607305526ca489b34ad0a41aed5f7980fa90eb03160b69598fb"
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "bitmask-enum"
version = "2.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd9e32d7420c85055e8107e5b2463c4eeefeaac18b52359fe9f9c08a18f342b2"
dependencies = [
"quote",
"syn",
]
[[package]]
name = "bumpalo"
version = "3.10.0"
@@ -135,6 +228,12 @@ version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4872d67bab6358e59559027aa3b9157c53d9358c51423c17554809a8858e0f8"
[[package]]
name = "cache-padded"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c1db59621ec70f09c5e9b597b220c7a2b43611f4710dc03ceb8748637775692c"
[[package]]
name = "capctl"
version = "0.2.1"
@@ -247,6 +346,15 @@ version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2382f75942f4b3be3690fe4f86365e9c853c1587d6ee58212cebf6e2a9ccd101"
[[package]]
name = "concurrent-queue"
version = "1.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "af4780a44ab5696ea9e28294517f1fffb421a83a25af521333c838635509db9c"
dependencies = [
"cache-padded",
]
[[package]]
name = "core-foundation-sys"
version = "0.8.3"
@@ -272,31 +380,6 @@ dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-deque"
version = "0.8.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6455c0ca19f0d2fbf751b908d5c55c1f5cbc65e03c4225427254b46890bdde1e"
dependencies = [
"cfg-if 1.0.0",
"crossbeam-epoch",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-epoch"
version = "0.9.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "07db9d94cbd326813772c968ccd25999e5f8ae22f4f8d1b11effa37ef6ce281d"
dependencies = [
"autocfg",
"cfg-if 1.0.0",
"crossbeam-utils",
"memoffset",
"once_cell",
"scopeguard",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.10"
@@ -307,6 +390,17 @@ dependencies = [
"once_cell",
]
[[package]]
name = "derivative"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fcc3dd5e9e9c0b295d6e1e4d811fb6f157d5ffd784b8d202fc62eac8035a770b"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "derive-new"
version = "0.5.9"
@@ -318,12 +412,53 @@ dependencies = [
"syn",
]
[[package]]
name = "dirs"
version = "4.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ca3aa72a6f96ea37bbc5aa912f6788242832f75369bdfdadcb0e38423f100059"
dependencies = [
"dirs-sys",
]
[[package]]
name = "dirs-sys"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b1d1d91c932ef41c0f2663aa8b0ca0342d444d842c06914aa0a7e352d0bada6"
dependencies = [
"libc",
"redox_users",
"winapi",
]
[[package]]
name = "either"
version = "1.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e78d4f1cc4ae33bbfc157ed5d5a5ef3bc29227303d595861deb238fcec4e9457"
[[package]]
name = "enumflags2"
version = "0.7.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e75d4cd21b95383444831539909fbb14b9dc3fdceb2a6f5d36577329a1f55ccb"
dependencies = [
"enumflags2_derive",
"serde",
]
[[package]]
name = "enumflags2_derive"
version = "0.7.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f58dc3c5e468259f19f2d46304a6b28f1c3d034442e14b322d2b850e36f6d5ae"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "errno"
version = "0.2.8"
@@ -345,6 +480,12 @@ dependencies = [
"libc",
]
[[package]]
name = "event-listener"
version = "2.5.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0206175f82b8d6bf6652ff7d71a1e27fd2e4efde587fd368662814d6ec1d9ce0"
[[package]]
name = "fail"
version = "0.5.0"
@@ -435,6 +576,21 @@ version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fc4045962a5a5e935ee2fdedaa4e08284547402885ab326734432bed5d12966b"
[[package]]
name = "futures-lite"
version = "1.12.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7694489acd39452c77daa48516b894c153f192c3578d5a839b62c58099fcbf48"
dependencies = [
"fastrand",
"futures-core",
"futures-io",
"memchr",
"parking",
"pin-project-lite",
"waker-fn",
]
[[package]]
name = "futures-macro"
version = "0.3.21"
@@ -674,7 +830,6 @@ dependencies = [
"slog",
"slog-scope",
"slog-stdlog",
"sysinfo",
"tempfile",
"test-utils",
"thiserror",
@@ -686,6 +841,7 @@ dependencies = [
"tracing-subscriber",
"ttrpc",
"vsock-exporter",
"which",
]
[[package]]
@@ -715,6 +871,9 @@ dependencies = [
name = "kata-types"
version = "0.1.0"
dependencies = [
"anyhow",
"base64",
"bitmask-enum",
"byte-unit",
"glob",
"lazy_static",
@@ -743,10 +902,11 @@ checksum = "349d5a591cd28b49e1d1037471617a32ddcda5731b99419008085f72d5a53836"
[[package]]
name = "libseccomp"
version = "0.2.3"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "49bda1fbf25c42ac8942ff7df1eb6172a3bc36299e84be0dba8c888a7db68c80"
checksum = "21c57fd8981a80019807b7b68118618d29a87177c63d704fc96e6ecd003ae5b3"
dependencies = [
"bitflags",
"libc",
"libseccomp-sys",
"pkg-config",
@@ -942,15 +1102,6 @@ dependencies = [
"memoffset",
]
[[package]]
name = "ntapi"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c28774a7fd2fbb4f0babd8237ce554b73af68021b5f695a3cebd6c59bac0980f"
dependencies = [
"winapi",
]
[[package]]
name = "num-integer"
version = "0.1.45"
@@ -1001,9 +1152,9 @@ dependencies = [
[[package]]
name = "once_cell"
version = "1.12.0"
version = "1.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7709cef83f0c1f58f666e746a08b21e0085f7440fa6a29cc194d68aac97a4225"
checksum = "e82dad04139b71a90c080c8463fe0dc7902db5192d939bd0950f074d014339e1"
[[package]]
name = "opentelemetry"
@@ -1025,12 +1176,28 @@ dependencies = [
"tokio-stream",
]
[[package]]
name = "ordered-stream"
version = "0.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "44630c059eacfd6e08bdaa51b1db2ce33119caa4ddc1235e923109aa5f25ccb1"
dependencies = [
"futures-core",
"pin-project-lite",
]
[[package]]
name = "os_str_bytes"
version = "6.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "21326818e99cfe6ce1e524c2a805c189a99b5ae555a35d19f9a284b427d86afa"
[[package]]
name = "parking"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "427c3892f9e783d91cc128285287e70a59e206ca452770ece88a76f7a3eddd72"
[[package]]
name = "parking_lot"
version = "0.11.2"
@@ -1158,12 +1325,37 @@ version = "0.3.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1df8c4ec4b0627e53bdf214615ad287367e482558cf84b109250b37464dc03ae"
[[package]]
name = "polling"
version = "2.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ab4609a838d88b73d8238967b60dd115cc08d38e2bbaf51ee1e4b695f89122e2"
dependencies = [
"autocfg",
"cfg-if 1.0.0",
"libc",
"log",
"wepoll-ffi",
"winapi",
]
[[package]]
name = "ppv-lite86"
version = "0.2.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eb9f9e6e233e5c4a35559a617bf40a4ec447db2e84c20b55a6f83167b7e57872"
[[package]]
name = "proc-macro-crate"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eda0fc3b0fb7c975631757e14d9049da17374063edb6ebbcbc54d880d4fe94e9"
dependencies = [
"once_cell",
"thiserror",
"toml",
]
[[package]]
name = "proc-macro-error"
version = "1.0.4"
@@ -1400,30 +1592,6 @@ dependencies = [
"rand_core 0.5.1",
]
[[package]]
name = "rayon"
version = "1.5.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bd99e5772ead8baa5215278c9b15bf92087709e9c1b2d1f97cdb5a183c933a7d"
dependencies = [
"autocfg",
"crossbeam-deque",
"either",
"rayon-core",
]
[[package]]
name = "rayon-core"
version = "1.9.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "258bcdb5ac6dad48491bb2992db6b7cf74878b0384908af124823d118c99683f"
dependencies = [
"crossbeam-channel",
"crossbeam-deque",
"crossbeam-utils",
"num_cpus",
]
[[package]]
name = "redox_syscall"
version = "0.2.13"
@@ -1433,6 +1601,17 @@ dependencies = [
"bitflags",
]
[[package]]
name = "redox_users"
version = "0.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b033d837a7cf162d7993aded9304e30a83213c648b6e389db233191f891e5c2b"
dependencies = [
"getrandom 0.2.7",
"redox_syscall",
"thiserror",
]
[[package]]
name = "regex"
version = "1.5.6"
@@ -1498,6 +1677,7 @@ version = "0.1.0"
dependencies = [
"anyhow",
"async-trait",
"bit-vec",
"capctl",
"caps",
"cfg-if 0.1.10",
@@ -1525,6 +1705,8 @@ dependencies = [
"tempfile",
"test-utils",
"tokio",
"xattr",
"zbus",
]
[[package]]
@@ -1579,6 +1761,17 @@ dependencies = [
"serde",
]
[[package]]
name = "serde_repr"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fe39d9fbb0ebf5eb2c7cb7e2a47e4f462fad1379f1166b8ae49ad9eae89a7ca"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serial_test"
version = "0.5.1"
@@ -1601,6 +1794,21 @@ dependencies = [
"syn",
]
[[package]]
name = "sha1"
version = "0.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c1da05c97445caa12d05e848c4a4fcbbea29e748ac28f7e80e9b010392063770"
dependencies = [
"sha1_smol",
]
[[package]]
name = "sha1_smol"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ae1a47186c03a32177042e55dbc5fd5aee900b8e0069a8d70fba96a9375cd012"
[[package]]
name = "sharded-slab"
version = "0.1.4"
@@ -1699,6 +1907,12 @@ dependencies = [
"winapi",
]
[[package]]
name = "static_assertions"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f"
[[package]]
name = "strsim"
version = "0.10.0"
@@ -1726,21 +1940,6 @@ dependencies = [
"unicode-ident",
]
[[package]]
name = "sysinfo"
version = "0.23.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3977ec2e0520829be45c8a2df70db2bf364714d8a748316a10c3c35d4d2b01c9"
dependencies = [
"cfg-if 1.0.0",
"core-foundation-sys",
"libc",
"ntapi",
"once_cell",
"rayon",
"winapi",
]
[[package]]
name = "take_mut"
version = "0.2.2"
@@ -2047,6 +2246,16 @@ dependencies = [
"tempfile",
]
[[package]]
name = "uds_windows"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ce65604324d3cce9b966701489fbd0cf318cb1f7bd9dd07ac9a4ee6fb791930d"
dependencies = [
"tempfile",
"winapi",
]
[[package]]
name = "unicode-ident"
version = "1.0.1"
@@ -2098,6 +2307,12 @@ dependencies = [
"tokio-vsock",
]
[[package]]
name = "waker-fn"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d5b2c62b4012a3e1eca5a7e077d13b3bf498c4073e33ccd58626607748ceeca"
[[package]]
name = "wasi"
version = "0.9.0+wasi-snapshot-preview1"
@@ -2171,14 +2386,23 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6a89911bd99e5f3659ec4acf9c4d93b0a90fe4a2a11f15328472058edc5261be"
[[package]]
name = "which"
version = "4.2.5"
name = "wepoll-ffi"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c4fb54e6113b6a8772ee41c3404fb0301ac79604489467e0a9ce1f3e97c24ae"
checksum = "d743fdedc5c64377b5fc2bc036b01c7fd642205a0d96356034ae3404d49eb7fb"
dependencies = [
"cc",
]
[[package]]
name = "which"
version = "4.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1c831fbbee9e129a8cf93e7747a82da9d95ba8e16621cae60ec2cdc849bacb7b"
dependencies = [
"either",
"lazy_static",
"libc",
"once_cell",
]
[[package]]
@@ -2254,3 +2478,102 @@ name = "windows_x86_64_msvc"
version = "0.36.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c811ca4a8c853ef420abd8592ba53ddbbac90410fab6903b3e79972a631f7680"
[[package]]
name = "xattr"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d1526bbe5aaeb5eb06885f4d987bcdfa5e23187055de9b83fe00156a821fabc"
dependencies = [
"libc",
]
[[package]]
name = "zbus"
version = "2.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2d8f1a037b2c4a67d9654dc7bdfa8ff2e80555bbefdd3c1833c1d1b27c963a6b"
dependencies = [
"async-broadcast",
"async-channel",
"async-executor",
"async-io",
"async-lock",
"async-recursion",
"async-task",
"async-trait",
"byteorder",
"derivative",
"dirs",
"enumflags2",
"event-listener",
"futures-core",
"futures-sink",
"futures-util",
"hex",
"lazy_static",
"nix 0.23.1",
"once_cell",
"ordered-stream",
"rand 0.8.5",
"serde",
"serde_repr",
"sha1",
"static_assertions",
"tracing",
"uds_windows",
"winapi",
"zbus_macros",
"zbus_names",
"zvariant",
]
[[package]]
name = "zbus_macros"
version = "2.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1f8fb5186d1c87ae88cf234974c240671238b4a679158ad3b94ec465237349a6"
dependencies = [
"proc-macro-crate",
"proc-macro2",
"quote",
"regex",
"syn",
]
[[package]]
name = "zbus_names"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41a408fd8a352695690f53906dc7fd036be924ec51ea5e05666ff42685ed0af5"
dependencies = [
"serde",
"static_assertions",
"zvariant",
]
[[package]]
name = "zvariant"
version = "3.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b794fb7f59af4105697b0449ba31731ee5dbb3e773a17dbdf3d36206ea1b1644"
dependencies = [
"byteorder",
"enumflags2",
"libc",
"serde",
"static_assertions",
"zvariant_derive",
]
[[package]]
name = "zvariant_derive"
version = "3.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dd58d4b6c8e26d3dd2149c8c40c6613ef6451b9885ff1296d1ac86c388351a54"
dependencies = [
"proc-macro-crate",
"proc-macro2",
"quote",
"syn",
]

View File

@@ -23,7 +23,6 @@ regex = "1.5.6"
serial_test = "0.5.1"
kata-sys-util = { path = "../libs/kata-sys-util" }
kata-types = { path = "../libs/kata-types" }
sysinfo = "0.23.0"
# Async helpers
async-trait = "0.1.42"
@@ -69,6 +68,7 @@ clap = { version = "3.0.1", features = ["derive"] }
[dev-dependencies]
tempfile = "3.1.0"
test-utils = { path = "../libs/test-utils" }
which = "4.3.0"
[workspace]
members = [

View File

@@ -107,6 +107,8 @@ endef
##TARGET default: build code
default: $(TARGET) show-header
static-checks-build: $(GENERATED_CODE)
$(TARGET): $(GENERATED_CODE) $(TARGET_PATH)
$(TARGET_PATH): show-summary

View File

@@ -32,7 +32,10 @@ tokio = { version = "1.2.0", features = ["sync", "io-util", "process", "time", "
futures = "0.3.17"
async-trait = "0.1.31"
inotify = "0.9.2"
libseccomp = { version = "0.2.3", optional = true }
libseccomp = { version = "0.3.0", optional = true }
zbus = "2.3.0"
bit-vec= "0.6.3"
xattr = "0.2.3"
[dev-dependencies]
serial_test = "0.5.0"

View File

@@ -32,6 +32,7 @@ use protocols::agent::{
BlkioStats, BlkioStatsEntry, CgroupStats, CpuStats, CpuUsage, HugetlbStats, MemoryData,
MemoryStats, PidsStats, ThrottlingData,
};
use std::any::Any;
use std::collections::HashMap;
use std::fs;
use std::path::Path;
@@ -193,6 +194,79 @@ impl CgroupManager for Manager {
Ok(result)
}
fn update_cpuset_path(&self, guest_cpuset: &str, container_cpuset: &str) -> Result<()> {
if guest_cpuset.is_empty() {
return Ok(());
}
info!(sl!(), "update_cpuset_path to: {}", guest_cpuset);
let h = cgroups::hierarchies::auto();
let root_cg = h.root_control_group();
let root_cpuset_controller: &CpuSetController = root_cg.controller_of().unwrap();
let path = root_cpuset_controller.path();
let root_path = Path::new(path);
info!(sl!(), "root cpuset path: {:?}", &path);
let container_cpuset_controller: &CpuSetController = self.cgroup.controller_of().unwrap();
let path = container_cpuset_controller.path();
let container_path = Path::new(path);
info!(sl!(), "container cpuset path: {:?}", &path);
let mut paths = vec![];
for ancestor in container_path.ancestors() {
if ancestor == root_path {
break;
}
paths.push(ancestor);
}
info!(sl!(), "parent paths to update cpuset: {:?}", &paths);
let mut i = paths.len();
loop {
if i == 0 {
break;
}
i -= 1;
// remove cgroup root from path
let r_path = &paths[i]
.to_str()
.unwrap()
.trim_start_matches(root_path.to_str().unwrap());
info!(sl!(), "updating cpuset for parent path {:?}", &r_path);
let cg = new_cgroup(cgroups::hierarchies::auto(), r_path);
let cpuset_controller: &CpuSetController = cg.controller_of().unwrap();
cpuset_controller.set_cpus(guest_cpuset)?;
}
if !container_cpuset.is_empty() {
info!(
sl!(),
"updating cpuset for container path: {:?} cpuset: {}",
&container_path,
container_cpuset
);
container_cpuset_controller.set_cpus(container_cpuset)?;
}
Ok(())
}
fn get_cgroup_path(&self, cg: &str) -> Result<String> {
if cgroups::hierarchies::is_cgroup2_unified_mode() {
let cg_path = format!("/sys/fs/cgroup/{}", self.cpath);
return Ok(cg_path);
}
// for cgroup v1
Ok(self.paths.get(cg).map(|s| s.to_string()).unwrap())
}
fn as_any(&self) -> Result<&dyn Any> {
Ok(self)
}
}
fn set_network_resources(
@@ -252,19 +326,28 @@ fn set_devices_resources(
}
fn set_hugepages_resources(
_cg: &cgroups::Cgroup,
cg: &cgroups::Cgroup,
hugepage_limits: &[LinuxHugepageLimit],
res: &mut cgroups::Resources,
) {
info!(sl!(), "cgroup manager set hugepage");
let mut limits = vec![];
let hugetlb_controller = cg.controller_of::<HugeTlbController>();
for l in hugepage_limits.iter() {
let hr = HugePageResource {
size: l.page_size.clone(),
limit: l.limit,
};
limits.push(hr);
if hugetlb_controller.is_some() && hugetlb_controller.unwrap().size_supported(&l.page_size)
{
let hr = HugePageResource {
size: l.page_size.clone(),
limit: l.limit,
};
limits.push(hr);
} else {
warn!(
sl!(),
"{} page size support cannot be verified, dropping requested limit", l.page_size
);
}
}
res.hugepages.limits = limits;
}
@@ -972,75 +1055,6 @@ impl Manager {
cgroup: new_cgroup(cgroups::hierarchies::auto(), cpath),
})
}
pub fn update_cpuset_path(&self, guest_cpuset: &str, container_cpuset: &str) -> Result<()> {
if guest_cpuset.is_empty() {
return Ok(());
}
info!(sl!(), "update_cpuset_path to: {}", guest_cpuset);
let h = cgroups::hierarchies::auto();
let root_cg = h.root_control_group();
let root_cpuset_controller: &CpuSetController = root_cg.controller_of().unwrap();
let path = root_cpuset_controller.path();
let root_path = Path::new(path);
info!(sl!(), "root cpuset path: {:?}", &path);
let container_cpuset_controller: &CpuSetController = self.cgroup.controller_of().unwrap();
let path = container_cpuset_controller.path();
let container_path = Path::new(path);
info!(sl!(), "container cpuset path: {:?}", &path);
let mut paths = vec![];
for ancestor in container_path.ancestors() {
if ancestor == root_path {
break;
}
paths.push(ancestor);
}
info!(sl!(), "parent paths to update cpuset: {:?}", &paths);
let mut i = paths.len();
loop {
if i == 0 {
break;
}
i -= 1;
// remove cgroup root from path
let r_path = &paths[i]
.to_str()
.unwrap()
.trim_start_matches(root_path.to_str().unwrap());
info!(sl!(), "updating cpuset for parent path {:?}", &r_path);
let cg = new_cgroup(cgroups::hierarchies::auto(), r_path);
let cpuset_controller: &CpuSetController = cg.controller_of().unwrap();
cpuset_controller.set_cpus(guest_cpuset)?;
}
if !container_cpuset.is_empty() {
info!(
sl!(),
"updating cpuset for container path: {:?} cpuset: {}",
&container_path,
container_cpuset
);
container_cpuset_controller.set_cpus(container_cpuset)?;
}
Ok(())
}
pub fn get_cg_path(&self, cg: &str) -> Option<String> {
if cgroups::hierarchies::is_cgroup2_unified_mode() {
let cg_path = format!("/sys/fs/cgroup/{}", self.cpath);
return Some(cg_path);
}
// for cgroup v1
self.paths.get(cg).map(|s| s.to_string())
}
}
// get the guest's online cpus.

View File

@@ -11,6 +11,7 @@ use anyhow::Result;
use cgroups::freezer::FreezerState;
use libc::{self, pid_t};
use oci::LinuxResources;
use std::any::Any;
use std::collections::HashMap;
use std::string::String;
@@ -53,6 +54,18 @@ impl CgroupManager for Manager {
fn get_pids(&self) -> Result<Vec<pid_t>> {
Ok(Vec::new())
}
fn update_cpuset_path(&self, _: &str, _: &str) -> Result<()> {
Ok(())
}
fn get_cgroup_path(&self, _: &str) -> Result<String> {
Ok("".to_string())
}
fn as_any(&self) -> Result<&dyn Any> {
Ok(self)
}
}
impl Manager {
@@ -63,12 +76,4 @@ impl Manager {
cpath: cpath.to_string(),
})
}
pub fn update_cpuset_path(&self, _: &str, _: &str) -> Result<()> {
Ok(())
}
pub fn get_cg_path(&self, _: &str) -> Option<String> {
Some("".to_string())
}
}

View File

@@ -4,8 +4,10 @@
//
use anyhow::{anyhow, Result};
use core::fmt::Debug;
use oci::LinuxResources;
use protocols::agent::CgroupStats;
use std::any::Any;
use cgroups::freezer::FreezerState;
@@ -38,4 +40,22 @@ pub trait Manager {
fn set(&self, _container: &LinuxResources, _update: bool) -> Result<()> {
Err(anyhow!("not supported!"))
}
fn update_cpuset_path(&self, _: &str, _: &str) -> Result<()> {
Err(anyhow!("not supported!"))
}
fn get_cgroup_path(&self, _: &str) -> Result<String> {
Err(anyhow!("not supported!"))
}
fn as_any(&self) -> Result<&dyn Any> {
Err(anyhow!("not supported!"))
}
}
impl Debug for dyn Manager + Send + Sync {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
write!(f, "CgroupManager")
}
}

View File

@@ -1,10 +0,0 @@
// Copyright (c) 2019 Ant Financial
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::cgroups::Manager as CgroupManager;
pub struct Manager {}
impl CgroupManager for Manager {}

View File

@@ -0,0 +1,95 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Result};
use super::common::{DEFAULT_SLICE, SCOPE_SUFFIX, SLICE_SUFFIX};
use std::string::String;
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct CgroupsPath {
pub slice: String,
pub prefix: String,
pub name: String,
}
impl CgroupsPath {
pub fn new(cgroups_path_str: &str) -> Result<Self> {
let path_vec: Vec<&str> = cgroups_path_str.split(':').collect();
if path_vec.len() != 3 {
return Err(anyhow!("invalid cpath: {:?}", cgroups_path_str));
}
Ok(CgroupsPath {
slice: if path_vec[0].is_empty() {
DEFAULT_SLICE.to_string()
} else {
path_vec[0].to_owned()
},
prefix: path_vec[1].to_owned(),
name: path_vec[2].to_owned(),
})
}
// ref: https://github.com/opencontainers/runc/blob/main/docs/systemd.md
// return: (parent_slice, unit_name)
pub fn parse(&self) -> Result<(String, String)> {
Ok((
parse_parent(self.slice.to_owned())?,
get_unit_name(self.prefix.to_owned(), self.name.to_owned()),
))
}
}
fn parse_parent(slice: String) -> Result<String> {
if !slice.ends_with(SLICE_SUFFIX) || slice.contains('/') {
return Err(anyhow!("invalid slice name: {}", slice));
} else if slice == "-.slice" {
return Ok(String::new());
}
let mut slice_path = String::new();
let mut prefix = String::new();
for subslice in slice.trim_end_matches(SLICE_SUFFIX).split('-') {
if subslice.is_empty() {
return Err(anyhow!("invalid slice name: {}", slice));
}
slice_path = format!("{}/{}{}{}", slice_path, prefix, subslice, SLICE_SUFFIX);
prefix = format!("{}{}-", prefix, subslice);
}
slice_path.remove(0);
Ok(slice_path)
}
fn get_unit_name(prefix: String, name: String) -> String {
if name.ends_with(SLICE_SUFFIX) {
name
} else if prefix.is_empty() {
format!("{}{}", name, SCOPE_SUFFIX)
} else {
format!("{}-{}{}", prefix, name, SCOPE_SUFFIX)
}
}
#[cfg(test)]
mod tests {
use super::CgroupsPath;
#[test]
fn test_cgroup_path_parse() {
let slice = "system.slice";
let prefix = "kata_agent";
let name = "123";
let cgroups_path =
CgroupsPath::new(format!("{}:{}:{}", slice, prefix, name).as_str()).unwrap();
assert_eq!(slice, cgroups_path.slice.as_str());
assert_eq!(prefix, cgroups_path.prefix.as_str());
assert_eq!(name, cgroups_path.name.as_str());
let (parent_slice, unit_name) = cgroups_path.parse().unwrap();
assert_eq!(format!("{}", slice), parent_slice);
assert_eq!(format!("{}-{}.scope", prefix, name), unit_name);
}
}

View File

@@ -0,0 +1,17 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
pub const DEFAULT_SLICE: &str = "system.slice";
pub const SLICE_SUFFIX: &str = ".slice";
pub const SCOPE_SUFFIX: &str = ".scope";
pub const UNIT_MODE: &str = "replace";
pub type Properties<'a> = Vec<(&'a str, zbus::zvariant::Value<'a>)>;
#[derive(Serialize, Deserialize, Debug, Clone)]
pub enum CgroupHierarchy {
Legacy,
Unified,
}

View File

@@ -0,0 +1,126 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use std::vec;
use super::common::CgroupHierarchy;
use super::common::{Properties, SLICE_SUFFIX, UNIT_MODE};
use super::interface::system::ManagerProxyBlocking as SystemManager;
use anyhow::{Context, Result};
use zbus::zvariant::Value;
pub trait SystemdInterface {
fn start_unit(
&self,
pid: i32,
parent: &str,
unit_name: &str,
cg_hierarchy: &CgroupHierarchy,
) -> Result<()>;
fn set_properties(&self, unit_name: &str, properties: &Properties) -> Result<()>;
fn stop_unit(&self, unit_name: &str) -> Result<()>;
fn get_version(&self) -> Result<String>;
fn unit_exist(&self, unit_name: &str) -> Result<bool>;
fn add_process(&self, pid: i32, unit_name: &str) -> Result<()>;
}
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct DBusClient {}
impl DBusClient {
fn build_proxy(&self) -> Result<SystemManager<'static>> {
let connection = zbus::blocking::Connection::system()?;
let proxy = SystemManager::new(&connection)?;
Ok(proxy)
}
}
impl SystemdInterface for DBusClient {
fn start_unit(
&self,
pid: i32,
parent: &str,
unit_name: &str,
cg_hierarchy: &CgroupHierarchy,
) -> Result<()> {
let proxy = self.build_proxy()?;
// enable CPUAccounting & MemoryAccounting & (Block)IOAccounting by default
let mut properties: Properties = vec![
("CPUAccounting", Value::Bool(true)),
("DefaultDependencies", Value::Bool(false)),
("MemoryAccounting", Value::Bool(true)),
("TasksAccounting", Value::Bool(true)),
("Description", Value::Str("kata-agent container".into())),
("PIDs", Value::Array(vec![pid as u32].into())),
];
match *cg_hierarchy {
CgroupHierarchy::Legacy => properties.push(("IOAccounting", Value::Bool(true))),
CgroupHierarchy::Unified => properties.push(("BlockIOAccounting", Value::Bool(true))),
}
if unit_name.ends_with(SLICE_SUFFIX) {
properties.push(("Wants", Value::Str(parent.into())));
} else {
properties.push(("Slice", Value::Str(parent.into())));
properties.push(("Delegate", Value::Bool(true)));
}
proxy
.start_transient_unit(unit_name, UNIT_MODE, &properties, &[])
.with_context(|| format!("failed to start transient unit {}", unit_name))?;
Ok(())
}
fn set_properties(&self, unit_name: &str, properties: &Properties) -> Result<()> {
let proxy = self.build_proxy()?;
proxy
.set_unit_properties(unit_name, true, properties)
.with_context(|| format!("failed to set unit properties {}", unit_name))?;
Ok(())
}
fn stop_unit(&self, unit_name: &str) -> Result<()> {
let proxy = self.build_proxy()?;
proxy
.stop_unit(unit_name, UNIT_MODE)
.with_context(|| format!("failed to stop unit {}", unit_name))?;
Ok(())
}
fn get_version(&self) -> Result<String> {
let proxy = self.build_proxy()?;
let systemd_version = proxy
.version()
.with_context(|| "failed to get systemd version".to_string())?;
Ok(systemd_version)
}
fn unit_exist(&self, unit_name: &str) -> Result<bool> {
let proxy = self.build_proxy()?;
Ok(proxy.get_unit(unit_name).is_ok())
}
fn add_process(&self, pid: i32, unit_name: &str) -> Result<()> {
let proxy = self.build_proxy()?;
proxy
.attach_processes_to_unit(unit_name, "/", &[pid as u32])
.with_context(|| format!("failed to add process {}", unit_name))?;
Ok(())
}
}

View File

@@ -0,0 +1,7 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
pub(crate) mod session;
pub(crate) mod system;

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,129 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use crate::cgroups::Manager as CgroupManager;
use crate::protocols::agent::CgroupStats;
use anyhow::Result;
use cgroups::freezer::FreezerState;
use libc::{self, pid_t};
use oci::LinuxResources;
use std::any::Any;
use std::collections::HashMap;
use std::convert::TryInto;
use std::string::String;
use std::vec;
use super::super::fs::Manager as FsManager;
use super::cgroups_path::CgroupsPath;
use super::common::{CgroupHierarchy, Properties};
use super::dbus_client::{DBusClient, SystemdInterface};
use super::subsystem::transformer::Transformer;
use super::subsystem::{cpu::Cpu, cpuset::CpuSet, memory::Memory, pids::Pids};
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Manager {
pub paths: HashMap<String, String>,
pub mounts: HashMap<String, String>,
pub cgroups_path: CgroupsPath,
pub cpath: String,
pub unit_name: String,
// dbus client for set properties
dbus_client: DBusClient,
// fs manager for get properties
fs_manager: FsManager,
// cgroup version for different dbus properties
cg_hierarchy: CgroupHierarchy,
}
impl CgroupManager for Manager {
fn apply(&self, pid: pid_t) -> Result<()> {
let unit_name = self.unit_name.as_str();
if self.dbus_client.unit_exist(unit_name).unwrap() {
self.dbus_client.add_process(pid, self.unit_name.as_str())?;
} else {
self.dbus_client.start_unit(
(pid as u32).try_into().unwrap(),
self.cgroups_path.slice.as_str(),
self.unit_name.as_str(),
&self.cg_hierarchy,
)?;
}
Ok(())
}
fn set(&self, r: &LinuxResources, _: bool) -> Result<()> {
let mut properties: Properties = vec![];
let systemd_version = self.dbus_client.get_version()?;
let systemd_version_str = systemd_version.as_str();
Cpu::apply(r, &mut properties, &self.cg_hierarchy, systemd_version_str)?;
Memory::apply(r, &mut properties, &self.cg_hierarchy, systemd_version_str)?;
Pids::apply(r, &mut properties, &self.cg_hierarchy, systemd_version_str)?;
CpuSet::apply(r, &mut properties, &self.cg_hierarchy, systemd_version_str)?;
self.dbus_client
.set_properties(self.unit_name.as_str(), &properties)?;
Ok(())
}
fn get_stats(&self) -> Result<CgroupStats> {
self.fs_manager.get_stats()
}
fn freeze(&self, state: FreezerState) -> Result<()> {
self.fs_manager.freeze(state)
}
fn destroy(&mut self) -> Result<()> {
self.dbus_client.stop_unit(self.unit_name.as_str())?;
self.fs_manager.destroy()
}
fn get_pids(&self) -> Result<Vec<pid_t>> {
self.fs_manager.get_pids()
}
fn update_cpuset_path(&self, guest_cpuset: &str, container_cpuset: &str) -> Result<()> {
self.fs_manager
.update_cpuset_path(guest_cpuset, container_cpuset)
}
fn get_cgroup_path(&self, cg: &str) -> Result<String> {
self.fs_manager.get_cgroup_path(cg)
}
fn as_any(&self) -> Result<&dyn Any> {
Ok(self)
}
}
impl Manager {
pub fn new(cgroups_path_str: &str) -> Result<Self> {
let cgroups_path = CgroupsPath::new(cgroups_path_str)?;
let (parent_slice, unit_name) = cgroups_path.parse()?;
let cpath = parent_slice + "/" + &unit_name;
let fs_manager = FsManager::new(cpath.as_str())?;
Ok(Manager {
paths: fs_manager.paths.clone(),
mounts: fs_manager.mounts.clone(),
cgroups_path,
cpath,
unit_name,
dbus_client: DBusClient {},
fs_manager,
cg_hierarchy: if cgroups::hierarchies::is_cgroup2_unified_mode() {
CgroupHierarchy::Unified
} else {
CgroupHierarchy::Legacy
},
})
}
}

View File

@@ -0,0 +1,12 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
pub mod manager;
mod cgroups_path;
mod common;
mod dbus_client;
mod interface;
mod subsystem;

View File

@@ -0,0 +1,139 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::super::common::{CgroupHierarchy, Properties};
use super::transformer::Transformer;
use anyhow::Result;
use oci::{LinuxCpu, LinuxResources};
use zbus::zvariant::Value;
const BASIC_SYSTEMD_VERSION: &str = "242";
const DEFAULT_CPUQUOTAPERIOD: u64 = 100 * 1000;
const SEC2MICROSEC: u64 = 1000 * 1000;
const BASIC_INTERVAL: u64 = 10 * 1000;
pub struct Cpu {}
impl Transformer for Cpu {
fn apply(
r: &LinuxResources,
properties: &mut Properties,
cgroup_hierarchy: &CgroupHierarchy,
systemd_version: &str,
) -> Result<()> {
if let Some(cpu_resources) = &r.cpu {
match cgroup_hierarchy {
CgroupHierarchy::Legacy => {
Self::legacy_apply(cpu_resources, properties, systemd_version)?
}
CgroupHierarchy::Unified => {
Self::unified_apply(cpu_resources, properties, systemd_version)?
}
}
}
Ok(())
}
}
impl Cpu {
// v1:
// cpu.shares <-> CPUShares
// cpu.period <-> CPUQuotaPeriodUSec
// cpu.period & cpu.quota <-> CPUQuotaPerSecUSec
fn legacy_apply(
cpu_resources: &LinuxCpu,
properties: &mut Properties,
systemd_version: &str,
) -> Result<()> {
if let Some(shares) = cpu_resources.shares {
properties.push(("CPUShares", Value::U64(shares)));
}
if let Some(period) = cpu_resources.period {
if period != 0 && systemd_version >= BASIC_SYSTEMD_VERSION {
properties.push(("CPUQuotaPeriodUSec", Value::U64(period)));
}
}
if let Some(quota) = cpu_resources.quota {
let period = cpu_resources.period.unwrap_or(DEFAULT_CPUQUOTAPERIOD);
if period != 0 {
let cpu_quota_per_sec_usec = resolve_cpuquota(quota, period);
properties.push(("CPUQuotaPerSecUSec", Value::U64(cpu_quota_per_sec_usec)));
}
}
Ok(())
}
// v2:
// cpu.shares <-> CPUShares
// cpu.period <-> CPUQuotaPeriodUSec
// cpu.period & cpu.quota <-> CPUQuotaPerSecUSec
fn unified_apply(
cpu_resources: &LinuxCpu,
properties: &mut Properties,
systemd_version: &str,
) -> Result<()> {
if let Some(shares) = cpu_resources.shares {
let unified_shares = get_unified_cpushares(shares);
properties.push(("CPUShares", Value::U64(unified_shares)));
}
if let Some(period) = cpu_resources.period {
if period != 0 && systemd_version >= BASIC_SYSTEMD_VERSION {
properties.push(("CPUQuotaPeriodUSec", Value::U64(period)));
}
}
if let Some(quota) = cpu_resources.quota {
let period = cpu_resources.period.unwrap_or(DEFAULT_CPUQUOTAPERIOD);
if period != 0 {
let cpu_quota_per_sec_usec = resolve_cpuquota(quota, period);
properties.push(("CPUQuotaPerSecUSec", Value::U64(cpu_quota_per_sec_usec)));
}
}
Ok(())
}
}
// ref: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2
// [2-262144] to [1-10000]
fn get_unified_cpushares(shares: u64) -> u64 {
if shares == 0 {
return 100;
}
1 + ((shares - 2) * 9999) / 262142
}
fn resolve_cpuquota(quota: i64, period: u64) -> u64 {
let mut cpu_quota_per_sec_usec = u64::MAX;
if quota > 0 {
cpu_quota_per_sec_usec = (quota as u64) * SEC2MICROSEC / period;
if cpu_quota_per_sec_usec % BASIC_INTERVAL != 0 {
cpu_quota_per_sec_usec =
((cpu_quota_per_sec_usec / BASIC_INTERVAL) + 1) * BASIC_INTERVAL;
}
}
cpu_quota_per_sec_usec
}
#[cfg(test)]
mod tests {
use crate::cgroups::systemd::subsystem::cpu::resolve_cpuquota;
#[test]
fn test_unified_cpuquota() {
let quota: i64 = 1000000;
let period: u64 = 500000;
let cpu_quota_per_sec_usec = resolve_cpuquota(quota, period);
assert_eq!(2000000, cpu_quota_per_sec_usec);
}
}

View File

@@ -0,0 +1,124 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::super::common::{CgroupHierarchy, Properties};
use super::transformer::Transformer;
use anyhow::{bail, Result};
use bit_vec::BitVec;
use oci::{LinuxCpu, LinuxResources};
use std::convert::{TryFrom, TryInto};
use zbus::zvariant::Value;
const BASIC_SYSTEMD_VERSION: &str = "244";
pub struct CpuSet {}
impl Transformer for CpuSet {
fn apply(
r: &LinuxResources,
properties: &mut Properties,
_: &CgroupHierarchy,
systemd_version: &str,
) -> Result<()> {
if let Some(cpuset_resources) = &r.cpu {
Self::apply(cpuset_resources, properties, systemd_version)?;
}
Ok(())
}
}
// v1 & v2:
// cpuset.cpus <-> AllowedCPUs (v244)
// cpuset.mems <-> AllowedMemoryNodes (v244)
impl CpuSet {
fn apply(
cpuset_resources: &LinuxCpu,
properties: &mut Properties,
systemd_version: &str,
) -> Result<()> {
if systemd_version < BASIC_SYSTEMD_VERSION {
return Ok(());
}
let cpus = cpuset_resources.cpus.as_str();
if !cpus.is_empty() {
let cpus_vec: BitMask = cpus.try_into()?;
properties.push(("AllowedCPUs", Value::Array(cpus_vec.0.into())));
}
let mems = cpuset_resources.mems.as_str();
if !mems.is_empty() {
let mems_vec: BitMask = mems.try_into()?;
properties.push(("AllowedMemoryNodes", Value::Array(mems_vec.0.into())));
}
Ok(())
}
}
struct BitMask(Vec<u8>);
impl TryFrom<&str> for BitMask {
type Error = anyhow::Error;
fn try_from(bitmask_str: &str) -> Result<Self, Self::Error> {
let mut bitmask_vec = BitVec::from_elem(8, false);
let bitmask_str_vec: Vec<&str> = bitmask_str.split(',').collect();
for bitmask in bitmask_str_vec.iter() {
let range: Vec<&str> = bitmask.split('-').collect();
match range.len() {
1 => {
let idx: usize = range[0].parse()?;
while idx >= bitmask_vec.len() {
bitmask_vec.grow(8, false);
}
bitmask_vec.set(adjust_index(idx), true);
}
2 => {
let left_index = range[0].parse()?;
let right_index = range[1].parse()?;
while right_index >= bitmask_vec.len() {
bitmask_vec.grow(8, false);
}
for idx in left_index..=right_index {
bitmask_vec.set(adjust_index(idx), true);
}
}
_ => bail!("invalid bitmask str {}", bitmask_str),
}
}
let mut result_vec = bitmask_vec.to_bytes();
result_vec.reverse();
Ok(BitMask(result_vec))
}
}
#[inline(always)]
fn adjust_index(idx: usize) -> usize {
idx / 8 * 8 + 7 - idx % 8
}
#[cfg(test)]
mod tests {
use std::convert::TryInto;
use crate::cgroups::systemd::subsystem::cpuset::BitMask;
#[test]
fn test_bitmask_conversion() {
let cpus_vec: BitMask = "2-4".try_into().unwrap();
assert_eq!(vec![0b11100 as u8], cpus_vec.0);
let cpus_vec: BitMask = "1,7".try_into().unwrap();
assert_eq!(vec![0b10000010 as u8], cpus_vec.0);
let cpus_vec: BitMask = "0,2-3,7".try_into().unwrap();
assert_eq!(vec![0b10001101 as u8], cpus_vec.0);
}
}

View File

@@ -0,0 +1,117 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::super::common::{CgroupHierarchy, Properties};
use super::transformer::Transformer;
use anyhow::{bail, Result};
use oci::{LinuxMemory, LinuxResources};
use zbus::zvariant::Value;
pub struct Memory {}
impl Transformer for Memory {
fn apply(
r: &LinuxResources,
properties: &mut Properties,
cgroup_hierarchy: &CgroupHierarchy,
_: &str,
) -> Result<()> {
if let Some(memory_resources) = &r.memory {
match cgroup_hierarchy {
CgroupHierarchy::Legacy => Self::legacy_apply(memory_resources, properties)?,
CgroupHierarchy::Unified => Self::unified_apply(memory_resources, properties)?,
}
}
Ok(())
}
}
impl Memory {
// v1:
// memory.limit <-> MemoryLimit
fn legacy_apply(memory_resources: &LinuxMemory, properties: &mut Properties) -> Result<()> {
if let Some(limit) = memory_resources.limit {
let limit = match limit {
1..=i64::MAX => limit as u64,
0 => u64::MAX,
_ => bail!("invalid memory.limit"),
};
properties.push(("MemoryLimit", Value::U64(limit)));
}
Ok(())
}
// v2:
// memory.low <-> MemoryLow
// memory.max <-> MemoryMax
// memory.swap & memory.limit <-> MemorySwapMax
fn unified_apply(memory_resources: &LinuxMemory, properties: &mut Properties) -> Result<()> {
if let Some(limit) = memory_resources.limit {
let limit = match limit {
1..=i64::MAX => limit as u64,
0 => u64::MAX,
_ => bail!("invalid memory.limit: {}", limit),
};
properties.push(("MemoryMax", Value::U64(limit)));
}
if let Some(reservation) = memory_resources.reservation {
let reservation = match reservation {
1..=i64::MAX => reservation as u64,
0 => u64::MAX,
_ => bail!("invalid memory.reservation: {}", reservation),
};
properties.push(("MemoryLow", Value::U64(reservation)));
}
let swap = match memory_resources.swap {
Some(0) => u64::MAX,
Some(1..=i64::MAX) => match memory_resources.limit {
Some(1..=i64::MAX) => {
(memory_resources.limit.unwrap() - memory_resources.swap.unwrap()) as u64
}
_ => bail!("invalid memory.limit when memory.swap specified"),
},
None => u64::MAX,
_ => bail!("invalid memory.swap"),
};
properties.push(("MemorySwapMax", Value::U64(swap)));
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::Memory;
use super::Properties;
use super::Value;
#[test]
fn test_unified_memory() {
let memory_resources = oci::LinuxMemory {
limit: Some(736870912),
reservation: Some(536870912),
swap: Some(536870912),
kernel: Some(0),
kernel_tcp: Some(0),
swappiness: Some(0),
disable_oom_killer: Some(false),
};
let mut properties: Properties = vec![];
assert_eq!(
true,
Memory::unified_apply(&memory_resources, &mut properties).is_ok()
);
assert_eq!(Value::U64(200000000), properties[2].1);
}
}

View File

@@ -0,0 +1,10 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
pub mod cpu;
pub mod cpuset;
pub mod memory;
pub mod pids;
pub mod transformer;

View File

@@ -0,0 +1,60 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::super::common::{CgroupHierarchy, Properties};
use super::transformer::Transformer;
use anyhow::Result;
use oci::{LinuxPids, LinuxResources};
use zbus::zvariant::Value;
pub struct Pids {}
impl Transformer for Pids {
fn apply(
r: &LinuxResources,
properties: &mut Properties,
_: &CgroupHierarchy,
_: &str,
) -> Result<()> {
if let Some(pids_resources) = &r.pids {
Self::apply(pids_resources, properties)?;
}
Ok(())
}
}
// pids.limit <-> TasksMax
impl Pids {
fn apply(pids_resources: &LinuxPids, properties: &mut Properties) -> Result<()> {
let limit = if pids_resources.limit > 0 {
pids_resources.limit as u64
} else {
u64::MAX
};
properties.push(("TasksMax", Value::U64(limit)));
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::Pids;
use super::Properties;
use super::Value;
#[test]
fn test_subsystem_workflow() {
let pids_resources = oci::LinuxPids { limit: 0 };
let mut properties: Properties = vec![];
assert_eq!(true, Pids::apply(&pids_resources, &mut properties).is_ok());
assert_eq!(Value::U64(u64::MAX), properties[0].1);
}
}

View File

@@ -0,0 +1,17 @@
// Copyright 2021-2022 Kata Contributors
//
// SPDX-License-Identifier: Apache-2.0
//
use super::super::common::{CgroupHierarchy, Properties};
use anyhow::Result;
use oci::LinuxResources;
pub trait Transformer {
fn apply(
r: &LinuxResources,
properties: &mut Properties,
cgroup_hierarchy: &CgroupHierarchy,
systemd_version: &str,
) -> Result<()>;
}

View File

@@ -22,6 +22,7 @@ use crate::capabilities;
use crate::cgroups::fs::Manager as FsManager;
#[cfg(test)]
use crate::cgroups::mock::Manager as FsManager;
use crate::cgroups::systemd::manager::Manager as SystemdManager;
use crate::cgroups::Manager;
#[cfg(feature = "standard-oci-runtime")]
use crate::console;
@@ -29,6 +30,7 @@ use crate::log_child;
use crate::process::Process;
#[cfg(feature = "seccomp")]
use crate::seccomp;
use crate::selinux;
use crate::specconv::CreateOpts;
use crate::{mount, validator};
@@ -49,6 +51,7 @@ use std::os::unix::io::AsRawFd;
use protobuf::SingularPtrField;
use oci::State as OCIState;
use regex::Regex;
use std::collections::HashMap;
use std::os::unix::io::FromRawFd;
use std::str::FromStr;
@@ -107,7 +110,6 @@ impl Default for ContainerStatus {
}
// We might want to change this to thiserror in the future
const MissingCGroupManager: &str = "failed to get container's cgroup Manager";
const MissingLinux: &str = "no linux config";
const InvalidNamespace: &str = "invalid namespace type";
@@ -201,6 +203,8 @@ lazy_static! {
},
]
};
pub static ref SYSTEMD_CGROUP_PATH_FORMAT:Regex = Regex::new(r"^[\w\-.]*:[\w\-.]*:[\w\-.]*$").unwrap();
}
#[derive(Serialize, Deserialize, Debug)]
@@ -239,7 +243,7 @@ pub struct LinuxContainer {
pub id: String,
pub root: String,
pub config: Config,
pub cgroup_manager: Option<FsManager>,
pub cgroup_manager: Box<dyn Manager + Send + Sync>,
pub init_process_pid: pid_t,
pub init_process_start_time: u64,
pub uid_map_path: String,
@@ -288,16 +292,11 @@ impl Container for LinuxContainer {
));
}
if self.cgroup_manager.is_some() {
self.cgroup_manager
.as_ref()
.unwrap()
.freeze(FreezerState::Frozen)?;
self.cgroup_manager.as_ref().freeze(FreezerState::Frozen)?;
self.status.transition(ContainerState::Paused);
return Ok(());
}
Err(anyhow!(MissingCGroupManager))
self.status.transition(ContainerState::Paused);
Ok(())
}
fn resume(&mut self) -> Result<()> {
@@ -306,16 +305,11 @@ impl Container for LinuxContainer {
return Err(anyhow!("container status is: {:?}, not paused", status));
}
if self.cgroup_manager.is_some() {
self.cgroup_manager
.as_ref()
.unwrap()
.freeze(FreezerState::Thawed)?;
self.cgroup_manager.as_ref().freeze(FreezerState::Thawed)?;
self.status.transition(ContainerState::Running);
return Ok(());
}
Err(anyhow!(MissingCGroupManager))
self.status.transition(ContainerState::Running);
Ok(())
}
}
@@ -390,7 +384,9 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
let buf = read_sync(crfd)?;
let cm_str = std::str::from_utf8(&buf)?;
let cm: FsManager = serde_json::from_str(cm_str)?;
// deserialize cm_str into FsManager and SystemdManager separately
let fs_cm: Result<FsManager, serde_json::Error> = serde_json::from_str(cm_str);
let systemd_cm: Result<SystemdManager, serde_json::Error> = serde_json::from_str(cm_str);
#[cfg(feature = "standard-oci-runtime")]
let csocket_fd = console::setup_console_socket(&std::env::var(CONSOLE_SOCKET_FD)?)?;
@@ -531,6 +527,8 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
}
}
let selinux_enabled = selinux::is_enabled()?;
sched::unshare(to_new & !CloneFlags::CLONE_NEWUSER)?;
if userns {
@@ -548,7 +546,18 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
if to_new.contains(CloneFlags::CLONE_NEWNS) {
// setup rootfs
mount::init_rootfs(cfd_log, &spec, &cm.paths, &cm.mounts, bind_device)?;
if let Ok(systemd_cm) = systemd_cm {
mount::init_rootfs(
cfd_log,
&spec,
&systemd_cm.paths,
&systemd_cm.mounts,
bind_device,
)?;
} else {
let fs_cm = fs_cm.unwrap();
mount::init_rootfs(cfd_log, &spec, &fs_cm.paths, &fs_cm.mounts, bind_device)?;
}
}
if init {
@@ -621,6 +630,18 @@ fn do_init_child(cwfd: RawFd) -> Result<()> {
capctl::prctl::set_no_new_privs().map_err(|_| anyhow!("cannot set no new privileges"))?;
}
// Set SELinux label
if !oci_process.selinux_label.is_empty() {
if !selinux_enabled {
return Err(anyhow!(
"SELinux label for the process is provided but SELinux is not enabled on the running kernel"
));
}
log_child!(cfd_log, "Set SELinux label to the container process");
selinux::set_exec_label(&oci_process.selinux_label)?;
}
// Log unknown seccomp system calls in advance before the log file descriptor closes.
#[cfg(feature = "seccomp")]
if let Some(ref scmp) = linux.seccomp {
@@ -830,22 +851,17 @@ impl BaseContainer for LinuxContainer {
}
fn stats(&self) -> Result<StatsContainerResponse> {
let mut r = StatsContainerResponse::default();
if self.cgroup_manager.is_some() {
r.cgroup_stats =
SingularPtrField::some(self.cgroup_manager.as_ref().unwrap().get_stats()?);
}
// what about network interface stats?
Ok(r)
Ok(StatsContainerResponse {
cgroup_stats: SingularPtrField::some(self.cgroup_manager.as_ref().get_stats()?),
..Default::default()
})
}
fn set(&mut self, r: LinuxResources) -> Result<()> {
if self.cgroup_manager.is_some() {
self.cgroup_manager.as_ref().unwrap().set(&r, true)?;
}
self.cgroup_manager.as_ref().set(&r, true)?;
self.config
.spec
.as_mut()
@@ -1018,7 +1034,8 @@ impl BaseContainer for LinuxContainer {
&logger,
spec,
&p,
self.cgroup_manager.as_ref().unwrap(),
self.cgroup_manager.as_ref(),
self.config.use_systemd_cgroup,
&st,
&mut pipe_w,
&mut pipe_r,
@@ -1096,19 +1113,19 @@ impl BaseContainer for LinuxContainer {
)?;
fs::remove_dir_all(&self.root)?;
if let Some(cgm) = self.cgroup_manager.as_mut() {
// Kill all of the processes created in this container to prevent
// the leak of some daemon process when this container shared pidns
// with the sandbox.
let pids = cgm.get_pids().context("get cgroup pids")?;
for i in pids {
if let Err(e) = signal::kill(Pid::from_raw(i), Signal::SIGKILL) {
warn!(self.logger, "kill the process {} error: {:?}", i, e);
}
let cgm = self.cgroup_manager.as_mut();
// Kill all of the processes created in this container to prevent
// the leak of some daemon process when this container shared pidns
// with the sandbox.
let pids = cgm.get_pids().context("get cgroup pids")?;
for i in pids {
if let Err(e) = signal::kill(Pid::from_raw(i), Signal::SIGKILL) {
warn!(self.logger, "kill the process {} error: {:?}", i, e);
}
cgm.destroy().context("destroy cgroups")?;
}
cgm.destroy().context("destroy cgroups")?;
Ok(())
}
@@ -1280,11 +1297,13 @@ pub fn setup_child_logger(fd: RawFd, child_logger: Logger) -> tokio::task::JoinH
})
}
#[allow(clippy::too_many_arguments)]
async fn join_namespaces(
logger: &Logger,
spec: &Spec,
p: &Process,
cm: &FsManager,
cm: &(dyn Manager + Send + Sync),
use_systemd_cgroup: bool,
st: &OCIState,
pipe_w: &mut PipeStream,
pipe_r: &mut PipeStream,
@@ -1311,7 +1330,11 @@ async fn join_namespaces(
info!(logger, "wait child received oci process");
read_async(pipe_r).await?;
let cm_str = serde_json::to_string(cm)?;
let cm_str = if use_systemd_cgroup {
serde_json::to_string(cm.as_any()?.downcast_ref::<SystemdManager>().unwrap())
} else {
serde_json::to_string(cm.as_any()?.downcast_ref::<FsManager>().unwrap())
}?;
write_async(pipe_w, SYNC_DATA, cm_str.as_str()).await?;
// wait child setup user namespace
@@ -1334,13 +1357,16 @@ async fn join_namespaces(
}
// apply cgroups
if p.init && res.is_some() {
info!(logger, "apply cgroups!");
cm.set(res.unwrap(), false)?;
// For FsManger, it's no matter about the order of apply and set.
// For SystemdManger, apply must be precede set because we can only create a systemd unit with specific processes(pids).
if res.is_some() {
info!(logger, "apply processes to cgroups!");
cm.apply(p.pid)?;
}
if res.is_some() {
cm.apply(p.pid)?;
if p.init && res.is_some() {
info!(logger, "set properties to cgroups!");
cm.set(res.unwrap(), false)?;
}
info!(logger, "notify child to continue");
@@ -1419,7 +1445,7 @@ impl LinuxContainer {
pub fn new<T: Into<String> + Display + Clone>(
id: T,
base: T,
config: Config,
mut config: Config,
logger: &Logger,
) -> Result<Self> {
let base = base.into();
@@ -1448,24 +1474,46 @@ impl LinuxContainer {
let linux = spec.linux.as_ref().unwrap();
let cpath = if linux.cgroups_path.is_empty() {
format!("/{}", id.as_str())
// determine which cgroup driver to take and then assign to config.use_systemd_cgroup
// systemd: "[slice]:[prefix]:[name]"
// fs: "/path_a/path_b"
let cpath = if SYSTEMD_CGROUP_PATH_FORMAT.is_match(linux.cgroups_path.as_str()) {
config.use_systemd_cgroup = true;
if linux.cgroups_path.len() == 2 {
format!("system.slice:kata_agent:{}", id.as_str())
} else {
linux.cgroups_path.clone()
}
} else {
linux.cgroups_path.clone()
config.use_systemd_cgroup = false;
if linux.cgroups_path.is_empty() {
format!("/{}", id.as_str())
} else {
linux.cgroups_path.clone()
}
};
let cgroup_manager = FsManager::new(cpath.as_str()).map_err(|e| {
anyhow!(format!(
"fail to create cgroup manager with path {}: {:}",
cpath, e
))
})?;
let cgroup_manager: Box<dyn Manager + Send + Sync> = if config.use_systemd_cgroup {
Box::new(SystemdManager::new(cpath.as_str()).map_err(|e| {
anyhow!(format!(
"fail to create cgroup manager with path {}: {:}",
cpath, e
))
})?)
} else {
Box::new(FsManager::new(cpath.as_str()).map_err(|e| {
anyhow!(format!(
"fail to create cgroup manager with path {}: {:}",
cpath, e
))
})?)
};
info!(logger, "new cgroup_manager {:?}", &cgroup_manager);
Ok(LinuxContainer {
id: id.clone(),
root,
cgroup_manager: Some(cgroup_manager),
cgroup_manager,
status: ContainerStatus::new(),
uid_map_path: String::from(""),
gid_map_path: "".to_string(),
@@ -1931,20 +1979,12 @@ mod tests {
assert!(format!("{:?}", ret).contains("failed to pause container"))
}
#[test]
fn test_linuxcontainer_pause_cgroupmgr_is_none() {
let ret = new_linux_container_and_then(|mut c: LinuxContainer| {
c.cgroup_manager = None;
c.pause().map_err(|e| anyhow!(e))
});
assert!(ret.is_err(), "Expecting error, Got {:?}", ret);
}
#[test]
fn test_linuxcontainer_pause() {
let ret = new_linux_container_and_then(|mut c: LinuxContainer| {
c.cgroup_manager = FsManager::new("").ok();
c.cgroup_manager = Box::new(FsManager::new("").map_err(|e| {
anyhow!(format!("fail to create cgroup manager with path: {:}", e))
})?);
c.pause().map_err(|e| anyhow!(e))
});
@@ -1963,21 +2003,12 @@ mod tests {
assert!(format!("{:?}", ret).contains("not paused"))
}
#[test]
fn test_linuxcontainer_resume_cgroupmgr_is_none() {
let ret = new_linux_container_and_then(|mut c: LinuxContainer| {
c.status.transition(ContainerState::Paused);
c.cgroup_manager = None;
c.resume().map_err(|e| anyhow!(e))
});
assert!(ret.is_err(), "Expecting error, Got {:?}", ret);
}
#[test]
fn test_linuxcontainer_resume() {
let ret = new_linux_container_and_then(|mut c: LinuxContainer| {
c.cgroup_manager = FsManager::new("").ok();
c.cgroup_manager = Box::new(FsManager::new("").map_err(|e| {
anyhow!(format!("fail to create cgroup manager with path: {:}", e))
})?);
// Change status to paused, this way we can resume it
c.status.transition(ContainerState::Paused);
c.resume().map_err(|e| anyhow!(e))

View File

@@ -38,6 +38,7 @@ pub mod pipestream;
pub mod process;
#[cfg(feature = "seccomp")]
pub mod seccomp;
pub mod selinux;
pub mod specconv;
pub mod sync;
pub mod sync_with_async;

View File

@@ -25,6 +25,7 @@ use std::fs::File;
use std::io::{BufRead, BufReader};
use crate::container::DEFAULT_DEVICES;
use crate::selinux;
use crate::sync::write_count;
use std::string::ToString;
@@ -181,6 +182,8 @@ pub fn init_rootfs(
None => flags |= MsFlags::MS_SLAVE,
}
let label = &linux.mount_label;
let root = spec
.root
.as_ref()
@@ -244,7 +247,7 @@ pub fn init_rootfs(
}
}
mount_from(cfd_log, m, rootfs, flags, &data, "")?;
mount_from(cfd_log, m, rootfs, flags, &data, label)?;
// bind mount won't change mount options, we need remount to make mount options
// effective.
// first check that we have non-default options required before attempting a
@@ -524,7 +527,6 @@ pub fn pivot_rootfs<P: ?Sized + NixPath + std::fmt::Debug>(path: &P) -> Result<(
fn rootfs_parent_mount_private(path: &str) -> Result<()> {
let mount_infos = parse_mount_table(MOUNTINFO_PATH)?;
let mut max_len = 0;
let mut mount_point = String::from("");
let mut options = String::from("");
@@ -767,9 +769,9 @@ fn mount_from(
rootfs: &str,
flags: MsFlags,
data: &str,
_label: &str,
label: &str,
) -> Result<()> {
let d = String::from(data);
let mut d = String::from(data);
let dest = secure_join(rootfs, &m.destination);
let src = if m.r#type.as_str() == "bind" {
@@ -822,6 +824,37 @@ fn mount_from(
e
})?;
// Set the SELinux context for the mounts
let mut use_xattr = false;
if !label.is_empty() {
if selinux::is_enabled()? {
let device = Path::new(&m.source)
.file_name()
.ok_or_else(|| anyhow!("invalid device source path: {}", &m.source))?
.to_str()
.ok_or_else(|| anyhow!("failed to convert device source path: {}", &m.source))?;
match device {
// SELinux does not support labeling of /proc or /sys
"proc" | "sysfs" => (),
// SELinux does not support mount labeling against /dev/mqueue,
// so we use setxattr instead
"mqueue" => {
use_xattr = true;
}
_ => {
log_child!(cfd_log, "add SELinux mount label to {}", dest.as_str());
selinux::add_mount_label(&mut d, label);
}
}
} else {
log_child!(
cfd_log,
"SELinux label for the mount is provided but SELinux is not enabled on the running kernel"
);
}
}
mount(
Some(src.as_str()),
dest.as_str(),
@@ -834,6 +867,10 @@ fn mount_from(
e
})?;
if !label.is_empty() && selinux::is_enabled()? && use_xattr {
xattr::set(dest.as_str(), "security.selinux", label.as_bytes())?;
}
if flags.contains(MsFlags::MS_BIND)
&& flags.intersects(
!(MsFlags::MS_REC

View File

@@ -0,0 +1,80 @@
// Copyright 2022 Sony Group Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{Context, Result};
use nix::unistd::gettid;
use std::fs::{self, OpenOptions};
use std::io::prelude::*;
use std::path::Path;
pub fn is_enabled() -> Result<bool> {
let buf = fs::read_to_string("/proc/mounts")?;
let enabled = buf.contains("selinuxfs");
Ok(enabled)
}
pub fn add_mount_label(data: &mut String, label: &str) {
if data.is_empty() {
let context = format!("context=\"{}\"", label);
data.push_str(&context);
} else {
let context = format!(",context=\"{}\"", label);
data.push_str(&context);
}
}
pub fn set_exec_label(label: &str) -> Result<()> {
let mut attr_path = Path::new("/proc/thread-self/attr/exec").to_path_buf();
if !attr_path.exists() {
// Fall back to the old convention
attr_path = Path::new("/proc/self/task")
.join(gettid().to_string())
.join("attr/exec")
}
let mut file = OpenOptions::new()
.write(true)
.truncate(true)
.open(attr_path)?;
file.write_all(label.as_bytes())
.with_context(|| "failed to apply SELinux label")?;
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
const TEST_LABEL: &str = "system_u:system_r:unconfined_t:s0";
#[test]
fn test_is_enabled() {
let ret = is_enabled();
assert!(ret.is_ok(), "Expecting Ok, Got {:?}", ret);
}
#[test]
fn test_add_mount_label() {
let mut data = String::new();
add_mount_label(&mut data, TEST_LABEL);
assert_eq!(data, format!("context=\"{}\"", TEST_LABEL));
let mut data = String::from("defaults");
add_mount_label(&mut data, TEST_LABEL);
assert_eq!(data, format!("defaults,context=\"{}\"", TEST_LABEL));
}
#[test]
fn test_set_exec_label() {
let ret = set_exec_label(TEST_LABEL);
if is_enabled().unwrap() {
assert!(ret.is_ok(), "Expecting Ok, Got {:?}", ret);
} else {
assert!(ret.is_err(), "Expecting error, Got {:?}", ret);
}
}
}

View File

@@ -6,6 +6,7 @@
use crate::container::Config;
use anyhow::{anyhow, Context, Result};
use oci::{Linux, LinuxIdMapping, LinuxNamespace, Spec};
use regex::Regex;
use std::collections::HashMap;
use std::path::{Component, PathBuf};
@@ -86,6 +87,23 @@ fn hostname(oci: &Spec) -> Result<()> {
fn security(oci: &Spec) -> Result<()> {
let linux = get_linux(oci)?;
let label_pattern = r".*_u:.*_r:.*_t:s[0-9]|1[0-5].*";
let label_regex = Regex::new(label_pattern)?;
if let Some(ref process) = oci.process {
if !process.selinux_label.is_empty() && !label_regex.is_match(&process.selinux_label) {
return Err(anyhow!(
"SELinux label for the process is invalid format: {}",
&process.selinux_label
));
}
}
if !linux.mount_label.is_empty() && !label_regex.is_match(&linux.mount_label) {
return Err(anyhow!(
"SELinux label for the mount is invalid format: {}",
&linux.mount_label
));
}
if linux.masked_paths.is_empty() && linux.readonly_paths.is_empty() {
return Ok(());
@@ -95,8 +113,6 @@ fn security(oci: &Spec) -> Result<()> {
return Err(anyhow!("Linux namespace does not contain mount"));
}
// don't care about selinux at present
Ok(())
}
@@ -285,7 +301,7 @@ pub fn validate(conf: &Config) -> Result<()> {
#[cfg(test)]
mod tests {
use super::*;
use oci::Mount;
use oci::{Mount, Process};
#[test]
fn test_namespace() {
@@ -388,6 +404,29 @@ mod tests {
];
spec.linux = Some(linux);
security(&spec).unwrap();
// SELinux
let valid_label = "system_u:system_r:container_t:s0:c123,c456";
let mut process = Process::default();
process.selinux_label = valid_label.to_string();
spec.process = Some(process);
security(&spec).unwrap();
let mut linux = Linux::default();
linux.mount_label = valid_label.to_string();
spec.linux = Some(linux);
security(&spec).unwrap();
let invalid_label = "system_u:system_r:container_t";
let mut process = Process::default();
process.selinux_label = invalid_label.to_string();
spec.process = Some(process);
security(&spec).unwrap_err();
let mut linux = Linux::default();
linux.mount_label = invalid_label.to_string();
spec.linux = Some(linux);
security(&spec).unwrap_err();
}
#[test]

View File

@@ -529,7 +529,9 @@ impl Handle {
.map_err(|e| anyhow!("Failed to parse IP {}: {:?}", ip_address, e))?;
// Import rtnetlink objects that make sense only for this function
use packet::constants::{NDA_UNSPEC, NLM_F_ACK, NLM_F_CREATE, NLM_F_EXCL, NLM_F_REQUEST};
use packet::constants::{
NDA_UNSPEC, NLM_F_ACK, NLM_F_CREATE, NLM_F_REPLACE, NLM_F_REQUEST,
};
use packet::neighbour::{NeighbourHeader, NeighbourMessage};
use packet::nlas::neighbour::Nla;
use packet::{NetlinkMessage, NetlinkPayload, RtnlMessage};
@@ -572,7 +574,7 @@ impl Handle {
// Send request and ACK
let mut req = NetlinkMessage::from(RtnlMessage::NewNeighbour(message));
req.header.flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_EXCL | NLM_F_CREATE;
req.header.flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_REPLACE;
let mut response = self.handle.request(req)?;
while let Some(message) = response.next().await {

View File

@@ -44,7 +44,6 @@ use nix::errno::Errno;
use nix::mount::MsFlags;
use nix::sys::{stat, statfs};
use nix::unistd::{self, Pid};
use rustjail::cgroups::Manager;
use rustjail::process::ProcessOperations;
use crate::device::{
@@ -84,9 +83,15 @@ use std::path::PathBuf;
const CONTAINER_BASE: &str = "/run/kata-containers";
const MODPROBE_PATH: &str = "/sbin/modprobe";
/// the iptables seriers binaries could appear either in /sbin
/// or /usr/sbin, we need to check both of them
const USR_IPTABLES_SAVE: &str = "/usr/sbin/iptables-save";
const IPTABLES_SAVE: &str = "/sbin/iptables-save";
const USR_IPTABLES_RESTORE: &str = "/usr/sbin/iptables-store";
const IPTABLES_RESTORE: &str = "/sbin/iptables-restore";
const USR_IP6TABLES_SAVE: &str = "/usr/sbin/ip6tables-save";
const IP6TABLES_SAVE: &str = "/sbin/ip6tables-save";
const USR_IP6TABLES_RESTORE: &str = "/usr/sbin/ip6tables-save";
const IP6TABLES_RESTORE: &str = "/sbin/ip6tables-restore";
const ERR_CANNOT_GET_WRITER: &str = "Cannot get writer";
@@ -266,14 +271,13 @@ impl AgentService {
}
// start oom event loop
if let Some(ref ctr) = ctr.cgroup_manager {
let cg_path = ctr.get_cg_path("memory");
if let Some(cg_path) = cg_path {
let rx = notifier::notify_oom(cid.as_str(), cg_path.to_string()).await?;
let cg_path = ctr.cgroup_manager.as_ref().get_cgroup_path("memory");
s.run_oom_event_monitor(rx, cid.clone()).await;
}
if let Ok(cg_path) = cg_path {
let rx = notifier::notify_oom(cid.as_str(), cg_path.to_string()).await?;
s.run_oom_event_monitor(rx, cid.clone()).await;
}
Ok(())
@@ -377,6 +381,7 @@ impl AgentService {
"signal process";
"container-id" => cid.clone(),
"exec-id" => eid.clone(),
"signal" => req.signal,
);
let mut sig: libc::c_int = req.signal as libc::c_int;
@@ -459,11 +464,7 @@ impl AgentService {
let ctr = sandbox
.get_container(cid)
.ok_or_else(|| anyhow!("Invalid container id {}", cid))?;
let cm = ctr
.cgroup_manager
.as_ref()
.ok_or_else(|| anyhow!("cgroup manager not exist"))?;
cm.freeze(state)?;
ctr.cgroup_manager.as_ref().freeze(state)?;
Ok(())
}
@@ -473,11 +474,7 @@ impl AgentService {
let ctr = sandbox
.get_container(cid)
.ok_or_else(|| anyhow!("Invalid container id {}", cid))?;
let cm = ctr
.cgroup_manager
.as_ref()
.ok_or_else(|| anyhow!("cgroup manager not exist"))?;
let pids = cm.get_pids()?;
let pids = ctr.cgroup_manager.as_ref().get_pids()?;
Ok(pids)
}
@@ -998,8 +995,18 @@ impl agent_ttrpc::AgentService for AgentService {
info!(sl!(), "get_ip_tables: request received");
// the binary could exists in either /usr/sbin or /sbin
// here check both of the places and return the one exists
// if none exists, return the /sbin one, and the rpc will
// returns an internal error
let cmd = if req.is_ipv6 {
IP6TABLES_SAVE
if Path::new(USR_IP6TABLES_SAVE).exists() {
USR_IP6TABLES_SAVE
} else {
IP6TABLES_SAVE
}
} else if Path::new(USR_IPTABLES_SAVE).exists() {
USR_IPTABLES_SAVE
} else {
IPTABLES_SAVE
}
@@ -1027,8 +1034,18 @@ impl agent_ttrpc::AgentService for AgentService {
info!(sl!(), "set_ip_tables request received");
// the binary could exists in both /usr/sbin and /sbin
// here check both of the places and return the one exists
// if none exists, return the /sbin one, and the rpc will
// returns an internal error
let cmd = if req.is_ipv6 {
IP6TABLES_RESTORE
if Path::new(USR_IP6TABLES_RESTORE).exists() {
USR_IP6TABLES_RESTORE
} else {
IP6TABLES_RESTORE
}
} else if Path::new(USR_IPTABLES_RESTORE).exists() {
USR_IPTABLES_RESTORE
} else {
IPTABLES_RESTORE
}
@@ -2032,6 +2049,11 @@ mod tests {
use tempfile::{tempdir, TempDir};
use test_utils::{assert_result, skip_if_not_root};
use ttrpc::{r#async::TtrpcContext, MessageHeader};
use which::which;
fn check_command(cmd: &str) -> bool {
which(cmd).is_ok()
}
fn mk_ttrpc_context() -> TtrpcContext {
TtrpcContext {
@@ -2751,6 +2773,27 @@ OtherField:other
async fn test_ip_tables() {
skip_if_not_root!();
let iptables_cmd_list = [
USR_IPTABLES_SAVE,
USR_IP6TABLES_SAVE,
USR_IPTABLES_RESTORE,
USR_IP6TABLES_RESTORE,
IPTABLES_SAVE,
IP6TABLES_SAVE,
IPTABLES_RESTORE,
IP6TABLES_RESTORE,
];
for cmd in iptables_cmd_list {
if !check_command(cmd) {
warn!(
sl!(),
"one or more commands for ip tables test are missing, skip it"
);
return;
}
}
let logger = slog::Logger::root(slog::Discard, o!());
let sandbox = Sandbox::new(&logger).unwrap();
let agent_service = Box::new(AgentService {

View File

@@ -296,7 +296,6 @@ impl Sandbox {
info!(self.logger, "updating {}", ctr.id.as_str());
ctr.cgroup_manager
.as_ref()
.unwrap()
.update_cpuset_path(guest_cpuset.as_str(), container_cpust)?;
}
@@ -327,7 +326,7 @@ impl Sandbox {
// Reject non-file, symlinks and non-executable files
if !entry.file_type()?.is_file()
|| entry.file_type()?.is_symlink()
|| entry.metadata()?.permissions().mode() & 0o777 & 0o111 == 0
|| entry.metadata()?.permissions().mode() & 0o111 == 0
{
continue;
}

1850
src/dragonball/Cargo.lock generated Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -12,15 +12,15 @@ edition = "2018"
[dependencies]
arc-swap = "1.5.0"
bytes = "1.1.0"
dbs-address-space = "0.1.0"
dbs-address-space = "0.2.0"
dbs-allocator = "0.1.0"
dbs-arch = "0.1.0"
dbs-boot = "0.2.0"
dbs-device = "0.1.0"
dbs-interrupt = { version = "0.1.0", features = ["kvm-irq"] }
dbs-arch = "0.2.0"
dbs-boot = "0.3.0"
dbs-device = "0.2.0"
dbs-interrupt = { version = "0.2.0", features = ["kvm-irq"] }
dbs-legacy-devices = "0.1.0"
dbs-upcall = { version = "0.1.0", optional = true }
dbs-utils = "0.1.0"
dbs-utils = "0.2.0"
dbs-virtio-devices = { version = "0.1.0", optional = true, features = ["virtio-mmio"] }
kvm-bindings = "0.5.0"
kvm-ioctls = "0.11.0"
@@ -36,13 +36,14 @@ serde_json = "1.0.9"
slog = "2.5.2"
slog-scope = "4.4.0"
thiserror = "1"
vmm-sys-util = "0.10.0"
vmm-sys-util = "0.11.0"
virtio-queue = { version = "0.4.0", optional = true }
vm-memory = { version = "0.9.0", features = ["backend-mmap"] }
[dev-dependencies]
slog-term = "2.9.0"
slog-async = "2.7.0"
test-utils = { path = "../libs/test-utils" }
[features]
acpi = []
@@ -53,14 +54,3 @@ virtio-blk = ["dbs-virtio-devices/virtio-blk", "virtio-queue"]
virtio-net = ["dbs-virtio-devices/virtio-net", "virtio-queue"]
# virtio-fs only work on atomic-guest-memory
virtio-fs = ["dbs-virtio-devices/virtio-fs", "virtio-queue", "atomic-guest-memory"]
[patch.'crates-io']
dbs-device = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-interrupt = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-legacy-devices = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-upcall = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-utils = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-virtio-devices = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-boot = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-arch = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }
dbs-address-space = { git = "https://github.com/openanolis/dragonball-sandbox.git", rev = "c3d7831aee7c3962b8a90f0afbfd0fb7e4d30323" }

View File

@@ -2,10 +2,22 @@
# Copyright (c) 2019-2022 Ant Group. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
include ../../utils.mk
ifeq ($(ARCH), s390x)
default build check test clippy:
@echo "s390x not support currently"
exit 0
else
default: build
build:
cargo build --all-features
@echo "INFO: cargo build..."
cargo build --all-features --target $(TRIPLE)
static-checks-build:
@echo "INFO: static-checks-build do nothing.."
check: clippy format
@@ -15,6 +27,9 @@ clippy:
-- \
-D warnings
vendor:
@echo "INFO: vendor do nothing.."
format:
@echo "INFO: cargo fmt..."
cargo fmt -- --check
@@ -23,5 +38,13 @@ clean:
cargo clean
test:
@echo "INFO: testing dragonball for development build"
cargo test --all-features -- --nocapture
ifdef SUPPORT_VIRTUALIZATION
cargo test --all-features --target $(TRIPLE) -- --nocapture
else
@echo "INFO: skip testing dragonball, it need virtualization support."
exit 0
endif
endif # ifeq ($(ARCH), s390x)
.DEFAULT_GOAL := default

View File

@@ -35,8 +35,8 @@ use nix::unistd::dup;
#[cfg(feature = "atomic-guest-memory")]
use vm_memory::GuestMemoryAtomic;
use vm_memory::{
address::Address, FileOffset, GuestAddress, GuestAddressSpace, GuestMemoryMmap, GuestMemoryRegion,
GuestRegionMmap, GuestUsize, MemoryRegionAddress, MmapRegion,
address::Address, FileOffset, GuestAddress, GuestAddressSpace, GuestMemoryMmap,
GuestMemoryRegion, GuestRegionMmap, GuestUsize, MemoryRegionAddress, MmapRegion,
};
use crate::resource_manager::ResourceManager;
@@ -270,7 +270,7 @@ impl AddressSpaceMgr {
let size = info
.size
.checked_shl(20)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
// Guest memory does not intersect with the MMIO hole.
// TODO: make it work for ARM (issue #4307)
@@ -281,13 +281,13 @@ impl AddressSpaceMgr {
regions.push(region);
start_addr = start_addr
.checked_add(size)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
} else {
// Add guest memory below the MMIO hole, avoid splitting the memory region
// if the available address region is small than MINIMAL_SPLIT_SPACE MiB.
let mut below_size = dbs_boot::layout::MMIO_LOW_START
.checked_sub(start_addr)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
if below_size < (MINIMAL_SPLIT_SPACE) {
below_size = 0;
} else {
@@ -299,12 +299,12 @@ impl AddressSpaceMgr {
let above_start = dbs_boot::layout::MMIO_LOW_END + 1;
let above_size = size
.checked_sub(below_size)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
let region = self.create_region(above_start, above_size, info, &mut param)?;
regions.push(region);
start_addr = above_start
.checked_add(above_size)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
}
}
@@ -502,7 +502,7 @@ impl AddressSpaceMgr {
fn configure_numa(&self, mmap_reg: &MmapRegion, node_id: u32) -> Result<()> {
let nodemask = 1_u64
.checked_shl(node_id)
.ok_or_else(|| AddressManagerError::InvalidOperation)?;
.ok_or(AddressManagerError::InvalidOperation)?;
let res = unsafe {
libc::syscall(
libc::SYS_mbind,

View File

@@ -18,7 +18,7 @@ pub const DEFAULT_KERNEL_CMDLINE: &str = "reboot=k panic=1 pci=off nomodules 825
i8042.noaux i8042.nomux i8042.nopnp i8042.dumbkbd";
/// Strongly typed data structure used to configure the boot source of the microvm.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize, Default)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize, Default)]
#[serde(deny_unknown_fields)]
pub struct BootSourceConfig {
/// Path of the kernel image.

View File

@@ -10,7 +10,7 @@ use serde_derive::{Deserialize, Serialize};
/// When Dragonball starts, the instance state is Uninitialized. Once start_microvm method is
/// called, the state goes from Uninitialized to Starting. The state is changed to Running until
/// the start_microvm method ends. Halting and Halted are currently unsupported.
#[derive(Copy, Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Copy, Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub enum InstanceState {
/// Microvm is not initialized.
Uninitialized,
@@ -29,7 +29,7 @@ pub enum InstanceState {
}
/// The state of async actions
#[derive(Debug, Deserialize, Serialize, Clone, PartialEq)]
#[derive(Debug, Deserialize, Serialize, Clone, PartialEq, Eq)]
pub enum AsyncState {
/// Uninitialized
Uninitialized,

View File

@@ -10,7 +10,7 @@ pub const MAX_SUPPORTED_VCPUS: u8 = 254;
pub const MEMORY_HOTPLUG_ALIGHMENT: u8 = 64;
/// Errors associated with configuring the microVM.
#[derive(Debug, PartialEq, thiserror::Error)]
#[derive(Debug, PartialEq, Eq, thiserror::Error)]
pub enum VmConfigError {
/// Cannot update the configuration of the microvm post boot.
#[error("update operation is not allowed after boot")]

View File

@@ -83,13 +83,13 @@ pub enum VmmActionError {
#[cfg(feature = "virtio-fs")]
/// The action `InsertFsDevice` failed either because of bad user input or an internal error.
#[error("virtio-fs device: {0}")]
#[error("virtio-fs device error: {0}")]
FsDevice(#[source] FsDeviceError),
}
/// This enum represents the public interface of the VMM. Each action contains various
/// bits of information (ids, paths, etc.).
#[derive(Clone, Debug, PartialEq)]
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum VmmAction {
/// Configure the boot source of the microVM using `BootSourceConfig`.
/// This action can only be called before the microVM has booted.
@@ -298,7 +298,6 @@ impl VmmService {
let mut cmdline = linux_loader::cmdline::Cmdline::new(dbs_boot::layout::CMDLINE_MAX_SIZE);
let boot_args = boot_source_config
.boot_args
.clone()
.unwrap_or_else(|| String::from(DEFAULT_KERNEL_CMDLINE));
cmdline
.insert_str(boot_args)
@@ -407,19 +406,10 @@ impl VmmService {
}
config.vpmu_feature = machine_config.vpmu_feature;
let vm_id = vm.shared_info().read().unwrap().id.clone();
let serial_path = match machine_config.serial_path {
Some(value) => value,
None => {
if config.serial_path.is_none() {
String::from("/run/dragonball/") + &vm_id + "_com1"
} else {
// Safe to unwrap() because we have checked it has a value.
config.serial_path.as_ref().unwrap().clone()
}
}
};
config.serial_path = Some(serial_path);
// If serial_path is:
// - None, legacy_manager will create_stdio_console.
// - Some(path), legacy_manager will create_socket_console on that path.
config.serial_path = machine_config.serial_path;
vm.set_vm_config(config.clone());
self.machine_config = config;
@@ -634,3 +624,783 @@ fn handle_cpu_topology(
Ok(cpu_topology)
}
#[cfg(test)]
mod tests {
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};
use dbs_utils::epoll_manager::EpollManager;
use test_utils::skip_if_not_root;
use vmm_sys_util::tempfile::TempFile;
use super::*;
use crate::vmm::tests::create_vmm_instance;
struct TestData<'a> {
req: Option<VmmAction>,
vm_state: InstanceState,
f: &'a dyn Fn(VmmRequestResult),
}
impl<'a> TestData<'a> {
fn new(req: VmmAction, vm_state: InstanceState, f: &'a dyn Fn(VmmRequestResult)) -> Self {
Self {
req: Some(req),
vm_state,
f,
}
}
fn check_request(&mut self) {
let (to_vmm, from_api) = channel();
let (to_api, from_vmm) = channel();
let vmm = Arc::new(Mutex::new(create_vmm_instance()));
let mut vservice = VmmService::new(from_api, to_api);
let epoll_mgr = EpollManager::default();
let mut event_mgr = EventManager::new(&vmm, epoll_mgr).unwrap();
let mut v = vmm.lock().unwrap();
let vm = v.get_vm_mut().unwrap();
vm.set_instance_state(self.vm_state);
to_vmm.send(Box::new(self.req.take().unwrap())).unwrap();
assert!(vservice.run_vmm_action(&mut v, &mut event_mgr).is_ok());
let response = from_vmm.try_recv();
assert!(response.is_ok());
(self.f)(*response.unwrap());
}
}
#[test]
fn test_vmm_action_receive_unknown() {
skip_if_not_root!();
let (_to_vmm, from_api) = channel();
let (to_api, _from_vmm) = channel();
let vmm = Arc::new(Mutex::new(create_vmm_instance()));
let mut vservice = VmmService::new(from_api, to_api);
let epoll_mgr = EpollManager::default();
let mut event_mgr = EventManager::new(&vmm, epoll_mgr).unwrap();
let mut v = vmm.lock().unwrap();
assert!(vservice.run_vmm_action(&mut v, &mut event_mgr).is_ok());
}
#[should_panic]
#[test]
fn test_vmm_action_disconnected() {
let (to_vmm, from_api) = channel();
let (to_api, _from_vmm) = channel();
let vmm = Arc::new(Mutex::new(create_vmm_instance()));
let mut vservice = VmmService::new(from_api, to_api);
let epoll_mgr = EpollManager::default();
let mut event_mgr = EventManager::new(&vmm, epoll_mgr).unwrap();
let mut v = vmm.lock().unwrap();
drop(to_vmm);
vservice.run_vmm_action(&mut v, &mut event_mgr).unwrap();
}
#[test]
fn test_vmm_action_config_boot_source() {
skip_if_not_root!();
let kernel_file = TempFile::new().unwrap();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::ConfigureBootSource(BootSourceConfig::default()),
InstanceState::Running,
&|result| {
if let Err(VmmActionError::BootSource(
BootSourceConfigError::UpdateNotAllowedPostBoot,
)) = result
{
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to configure boot source for VM: \
the update operation is not allowed after boot",
);
assert_eq!(err_string, expected_err);
} else {
panic!();
}
},
),
// invalid kernel file path
TestData::new(
VmmAction::ConfigureBootSource(BootSourceConfig::default()),
InstanceState::Uninitialized,
&|result| {
if let Err(VmmActionError::BootSource(
BootSourceConfigError::InvalidKernelPath(_),
)) = result
{
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to configure boot source for VM: \
the kernel file cannot be opened due to invalid kernel path or invalid permissions: \
No such file or directory (os error 2)");
assert_eq!(err_string, expected_err);
} else {
panic!();
}
},
),
//success
TestData::new(
VmmAction::ConfigureBootSource(BootSourceConfig {
kernel_path: kernel_file.as_path().to_str().unwrap().to_string(),
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[test]
fn test_vmm_action_set_vm_configuration() {
skip_if_not_root!();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::UpdateNotAllowedPostBoot
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
update operation is not allowed after boot",
);
assert_eq!(err_string, expected_err);
},
),
// invalid cpu count (0)
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
vcpu_count: 0,
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::InvalidVcpuCount(0)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the vCPU number '0' can only be 1 or an even number when hyperthreading is enabled");
assert_eq!(err_string, expected_err);
},
),
// invalid max cpu count (too small)
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
vcpu_count: 4,
max_vcpu_count: 2,
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::InvalidMaxVcpuCount(2)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the max vCPU number '2' shouldn't less than vCPU count and can only be 1 or an even number when hyperthreading is enabled");
assert_eq!(err_string, expected_err);
},
),
// invalid cpu topology (larger than 254)
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
vcpu_count: 254,
cpu_topology: CpuTopology {
threads_per_core: 2,
cores_per_die: 128,
dies_per_socket: 1,
sockets: 1,
},
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::VcpuCountExceedsMaximum
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the vCPU number shouldn't large than 254",
);
assert_eq!(err_string, expected_err)
},
),
// cpu topology and max_vcpu_count are not matched - success
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
vcpu_count: 16,
max_vcpu_count: 32,
cpu_topology: CpuTopology {
threads_per_core: 1,
cores_per_die: 128,
dies_per_socket: 1,
sockets: 1,
},
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
result.unwrap();
},
),
// invalid threads_per_core
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
vcpu_count: 4,
max_vcpu_count: 4,
cpu_topology: CpuTopology {
threads_per_core: 4,
cores_per_die: 1,
dies_per_socket: 1,
sockets: 1,
},
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::InvalidThreadsPerCore(4)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the threads_per_core number '4' can only be 1 or 2",
);
assert_eq!(err_string, expected_err)
},
),
// invalid mem size
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
mem_size_mib: 3,
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::InvalidMemorySize(3)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the memory size 0x3MiB is invalid",
);
assert_eq!(err_string, expected_err);
},
),
// invalid mem path
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo {
mem_type: String::from("hugetlbfs"),
mem_file_path: String::from(""),
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::MachineConfig(
VmConfigError::InvalidMemFilePath(_)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to set configuration for the VM: \
the memory file path is invalid",
);
assert_eq!(err_string, expected_err);
},
),
// success
TestData::new(
VmmAction::SetVmConfiguration(VmConfigInfo::default()),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[test]
fn test_vmm_action_start_microvm() {
skip_if_not_root!();
let tests = &mut [
// invalid state (running)
TestData::new(VmmAction::StartMicroVm, InstanceState::Running, &|result| {
assert!(matches!(
result,
Err(VmmActionError::StartMicroVm(
StartMicroVmError::MicroVMAlreadyRunning
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to boot the VM: \
the virtual machine is already running",
);
assert_eq!(err_string, expected_err);
}),
// no kernel configuration
TestData::new(
VmmAction::StartMicroVm,
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::StartMicroVm(
StartMicroVmError::MissingKernelConfig
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to boot the VM: \
cannot start the virtual machine without kernel configuration",
);
assert_eq!(err_string, expected_err);
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[test]
fn test_vmm_action_shutdown_microvm() {
skip_if_not_root!();
let tests = &mut [
// success
TestData::new(
VmmAction::ShutdownMicroVm,
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-blk")]
#[test]
fn test_vmm_action_insert_block_device() {
skip_if_not_root!();
let dummy_file = TempFile::new().unwrap();
let dummy_path = dummy_file.as_path().to_owned();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::InsertBlockDevice(BlockDeviceConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Block(
BlockDeviceError::UpdateNotAllowedPostBoot
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-blk device error: \
block device does not support runtime update",
);
assert_eq!(err_string, expected_err);
},
),
// success
TestData::new(
VmmAction::InsertBlockDevice(BlockDeviceConfigInfo {
path_on_host: dummy_path,
device_type: crate::device_manager::blk_dev_mgr::BlockDeviceType::RawBlock,
is_root_device: true,
part_uuid: None,
is_read_only: false,
is_direct: false,
no_drop: false,
drive_id: String::from("1"),
rate_limiter: None,
num_queues: BlockDeviceConfigInfo::default_num_queues(),
queue_size: 256,
use_shared_irq: None,
use_generic_irq: None,
}),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-blk")]
#[test]
fn test_vmm_action_update_block_device() {
skip_if_not_root!();
let tests = &mut [
// invalid id
TestData::new(
VmmAction::UpdateBlockDevice(BlockDeviceConfigUpdateInfo {
drive_id: String::from("1"),
rate_limiter: None,
}),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Block(BlockDeviceError::InvalidDeviceId(_)))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-blk device error: \
invalid block device id '1'",
);
assert_eq!(err_string, expected_err);
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-blk")]
#[test]
fn test_vmm_action_remove_block_device() {
skip_if_not_root!();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::RemoveBlockDevice(String::from("1")),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Block(
BlockDeviceError::UpdateNotAllowedPostBoot
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-blk device error: \
block device does not support runtime update",
);
assert_eq!(err_string, expected_err);
},
),
// invalid id
TestData::new(
VmmAction::RemoveBlockDevice(String::from("1")),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Block(BlockDeviceError::InvalidDeviceId(_)))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-blk device error: \
invalid block device id '1'",
);
assert_eq!(err_string, expected_err);
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-fs")]
#[test]
fn test_vmm_action_insert_fs_device() {
skip_if_not_root!();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::InsertFsDevice(FsDeviceConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::FsDevice(
FsDeviceError::UpdateNotAllowedPostBoot
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-fs device error: \
update operation is not allowed after boot",
);
assert_eq!(err_string, expected_err);
},
),
// success
TestData::new(
VmmAction::InsertFsDevice(FsDeviceConfigInfo::default()),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-fs")]
#[test]
fn test_vmm_action_manipulate_fs_device() {
skip_if_not_root!();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::ManipulateFsBackendFs(FsMountConfigInfo::default()),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::FsDevice(FsDeviceError::MicroVMNotRunning))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-fs device error: \
vm is not running when attaching a backend fs",
);
assert_eq!(err_string, expected_err);
},
),
// invalid backend
TestData::new(
VmmAction::ManipulateFsBackendFs(FsMountConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::FsDevice(
FsDeviceError::AttachBackendFailed(_)
))
));
let err_string = format!("{}", result.unwrap_err());
println!("{}", err_string);
let expected_err = String::from(
"virtio-fs device error: \
Fs device attach a backend fs failed",
);
assert_eq!(err_string, expected_err);
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-net")]
#[test]
fn test_vmm_action_insert_network_device() {
skip_if_not_root!();
let tests = &mut [
// hotplug unready
TestData::new(
VmmAction::InsertNetworkDevice(VirtioNetDeviceConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::StartMicroVm(
StartMicroVmError::UpcallMissVsock
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to boot the VM: \
the upcall client needs a virtio-vsock device for communication",
);
assert_eq!(err_string, expected_err);
},
),
// success
TestData::new(
VmmAction::InsertNetworkDevice(VirtioNetDeviceConfigInfo::default()),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-net")]
#[test]
fn test_vmm_action_update_network_interface() {
skip_if_not_root!();
let tests = &mut [
// invalid id
TestData::new(
VmmAction::UpdateNetworkInterface(VirtioNetDeviceConfigUpdateInfo {
iface_id: String::from("1"),
rx_rate_limiter: None,
tx_rate_limiter: None,
}),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::VirtioNet(
VirtioNetDeviceError::InvalidIfaceId(_)
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"virtio-net device error: \
invalid virtio-net iface id '1'",
);
assert_eq!(err_string, expected_err);
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
#[cfg(feature = "virtio-vsock")]
#[test]
fn test_vmm_action_insert_vsock_device() {
skip_if_not_root!();
let tests = &mut [
// invalid state
TestData::new(
VmmAction::InsertVsockDevice(VsockDeviceConfigInfo::default()),
InstanceState::Running,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Vsock(
VsockDeviceError::UpdateNotAllowedPostBoot
))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to add virtio-vsock device: \
update operation is not allowed after boot",
);
assert_eq!(err_string, expected_err);
},
),
// invalid guest_cid
TestData::new(
VmmAction::InsertVsockDevice(VsockDeviceConfigInfo::default()),
InstanceState::Uninitialized,
&|result| {
assert!(matches!(
result,
Err(VmmActionError::Vsock(VsockDeviceError::GuestCIDInvalid(0)))
));
let err_string = format!("{}", result.unwrap_err());
let expected_err = String::from(
"failed to add virtio-vsock device: \
the guest CID 0 is invalid",
);
assert_eq!(err_string, expected_err);
},
),
// success
TestData::new(
VmmAction::InsertVsockDevice(VsockDeviceConfigInfo {
guest_cid: 3,
..Default::default()
}),
InstanceState::Uninitialized,
&|result| {
assert!(result.is_ok());
},
),
];
for t in tests.iter_mut() {
t.check_request();
}
}
}

View File

@@ -46,7 +46,7 @@ pub trait ConfigItem {
}
/// Struct to manage a group of configuration items.
#[derive(Debug, Default, Deserialize, PartialEq, Serialize)]
#[derive(Debug, Default, Deserialize, PartialEq, Eq, Serialize)]
pub struct ConfigInfos<T>
where
T: ConfigItem + Clone,
@@ -316,7 +316,7 @@ where
}
/// Configuration information for RateLimiter token bucket.
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Eq, Serialize)]
pub struct TokenBucketConfigInfo {
/// The size for the token bucket. A TokenBucket of `size` total capacity will take `refill_time`
/// milliseconds to go from zero tokens to total capacity.
@@ -349,7 +349,7 @@ impl From<&TokenBucketConfigInfo> for TokenBucket {
}
/// Configuration information for RateLimiter objects.
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Default, Deserialize, PartialEq, Eq, Serialize)]
pub struct RateLimiterConfigInfo {
/// Data used to initialize the RateLimiter::bandwidth bucket.
pub bandwidth: TokenBucketConfigInfo,

View File

@@ -106,7 +106,7 @@ pub enum BlockDeviceError {
}
/// Type of low level storage device/protocol for virtio-blk devices.
#[derive(Clone, Copy, Debug, PartialEq, Serialize, Deserialize)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
pub enum BlockDeviceType {
/// Unknown low level device type.
Unknown,
@@ -131,7 +131,7 @@ impl BlockDeviceType {
}
/// Configuration information for a block device.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct BlockDeviceConfigUpdateInfo {
/// Unique identifier of the drive.
pub drive_id: String,
@@ -151,7 +151,7 @@ impl BlockDeviceConfigUpdateInfo {
}
/// Configuration information for a block device.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct BlockDeviceConfigInfo {
/// Unique identifier of the drive.
pub drive_id: String,
@@ -285,7 +285,6 @@ impl std::fmt::Debug for BlockDeviceInfo {
pub type BlockDeviceInfo = DeviceConfigInfo<BlockDeviceConfigInfo>;
/// Wrapper for the collection that holds all the Block Devices Configs
//#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone)]
pub struct BlockDeviceMgr {
/// A list of `BlockDeviceInfo` objects.
@@ -625,7 +624,7 @@ impl BlockDeviceMgr {
// we need to satisfy the condition by which a VMM can only have on root device
if block_device_config.is_root_device {
if self.has_root_block {
return Err(BlockDeviceError::RootBlockDeviceAlreadyAdded);
Err(BlockDeviceError::RootBlockDeviceAlreadyAdded)
} else {
self.has_root_block = true;
self.read_only_root = block_device_config.is_read_only;

View File

@@ -89,7 +89,7 @@ pub enum FsDeviceError {
}
/// Configuration information for a vhost-user-fs device.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct FsDeviceConfigInfo {
/// vhost-user socket path.
pub sock_path: String,
@@ -201,7 +201,7 @@ impl FsDeviceConfigInfo {
}
/// Configuration information for virtio-fs.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct FsDeviceConfigUpdateInfo {
/// virtiofs mount tag name used inside the guest.
/// used as the device name during mount.
@@ -242,7 +242,7 @@ impl ConfigItem for FsDeviceConfigInfo {
}
/// Configuration information of manipulating backend fs for a virtiofs device.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize, Default)]
pub struct FsMountConfigInfo {
/// Mount operations, mount, update, umount
pub ops: String,

View File

@@ -147,7 +147,11 @@ pub type Result<T> = ::std::result::Result<T, DeviceMgrError>;
/// Type of the dragonball virtio devices.
#[cfg(feature = "dbs-virtio-devices")]
pub type DbsVirtioDevice = Box<
dyn VirtioDevice<GuestAddressSpaceImpl, virtio_queue::QueueStateSync, vm_memory::GuestRegionMmap>,
dyn VirtioDevice<
GuestAddressSpaceImpl,
virtio_queue::QueueStateSync,
vm_memory::GuestRegionMmap,
>,
>;
/// Type of the dragonball virtio mmio devices.
@@ -591,23 +595,17 @@ impl DeviceManager {
.map_err(|_| StartMicroVmError::EventFd)?;
info!(self.logger, "init console path: {:?}", com1_sock_path);
if let Some(legacy_manager) = self.legacy_manager.as_ref() {
if let Some(path) = com1_sock_path {
// Currently, the `com1_sock_path` "stdio" is only reserved for creating the stdio console
if path != "stdio" {
let com1 = legacy_manager.get_com1_serial();
self.con_manager
.create_socket_console(com1, path)
.map_err(StartMicroVmError::DeviceManager)?;
return Ok(());
}
}
let com1 = legacy_manager.get_com1_serial();
self.con_manager
.create_stdio_console(com1)
.map_err(StartMicroVmError::DeviceManager)?;
if let Some(path) = com1_sock_path {
self.con_manager
.create_socket_console(com1, path)
.map_err(StartMicroVmError::DeviceManager)?;
} else {
self.con_manager
.create_stdio_console(com1)
.map_err(StartMicroVmError::DeviceManager)?;
}
}
Ok(())
@@ -791,13 +789,14 @@ impl DeviceManager {
fn allocate_mmio_device_resource(
&self,
) -> std::result::Result<DeviceResources, StartMicroVmError> {
let mut requests = Vec::new();
requests.push(ResourceConstraint::MmioAddress {
range: None,
align: MMIO_DEFAULT_CFG_SIZE,
size: MMIO_DEFAULT_CFG_SIZE,
});
requests.push(ResourceConstraint::LegacyIrq { irq: None });
let requests = vec![
ResourceConstraint::MmioAddress {
range: None,
align: MMIO_DEFAULT_CFG_SIZE,
size: MMIO_DEFAULT_CFG_SIZE,
},
ResourceConstraint::LegacyIrq { irq: None },
];
self.res_manager
.allocate_device_resources(&requests, false)
@@ -997,7 +996,7 @@ impl DeviceManager {
{
self.vsock_manager
.get_default_connector()
.map(|d| Some(d))
.map(Some)
.unwrap_or(None)
}
#[cfg(not(feature = "virtio-vsock"))]
@@ -1006,3 +1005,168 @@ impl DeviceManager {
}
}
}
#[cfg(test)]
mod tests {
use std::sync::{Arc, Mutex};
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use vm_memory::{GuestAddress, MmapRegion};
use super::*;
use crate::vm::CpuTopology;
impl DeviceManager {
pub fn new_test_mgr() -> Self {
let kvm = Kvm::new().unwrap();
let vm = kvm.create_vm().unwrap();
let vm_fd = Arc::new(vm);
let epoll_manager = EpollManager::default();
let res_manager = Arc::new(ResourceManager::new(None));
let logger = slog_scope::logger().new(slog::o!());
DeviceManager {
vm_fd: Arc::clone(&vm_fd),
con_manager: ConsoleManager::new(epoll_manager, &logger),
io_manager: Arc::new(ArcSwap::new(Arc::new(IoManager::new()))),
io_lock: Arc::new(Mutex::new(())),
irq_manager: Arc::new(KvmIrqManager::new(vm_fd.clone())),
res_manager,
legacy_manager: None,
#[cfg(feature = "virtio-blk")]
block_manager: BlockDeviceMgr::default(),
#[cfg(feature = "virtio-fs")]
fs_manager: Arc::new(Mutex::new(FsDeviceMgr::default())),
#[cfg(feature = "virtio-net")]
virtio_net_manager: VirtioNetDeviceMgr::default(),
#[cfg(feature = "virtio-vsock")]
vsock_manager: VsockDeviceMgr::default(),
#[cfg(target_arch = "aarch64")]
mmio_device_info: HashMap::new(),
logger,
}
}
}
#[test]
fn test_create_device_manager() {
skip_if_not_root!();
let mgr = DeviceManager::new_test_mgr();
let _ = mgr.io_manager();
}
#[cfg(target_arch = "x86_64")]
#[test]
fn test_create_devices() {
skip_if_not_root!();
use crate::vm::VmConfigInfo;
let epoll_manager = EpollManager::default();
let vmm = Arc::new(Mutex::new(crate::vmm::tests::create_vmm_instance()));
let event_mgr = crate::event_manager::EventManager::new(&vmm, epoll_manager).unwrap();
let mut vm = crate::vm::tests::create_vm_instance();
let vm_config = VmConfigInfo {
vcpu_count: 1,
max_vcpu_count: 1,
cpu_pm: "off".to_string(),
mem_type: "shmem".to_string(),
mem_file_path: "".to_string(),
mem_size_mib: 16,
serial_path: None,
cpu_topology: CpuTopology {
threads_per_core: 1,
cores_per_die: 1,
dies_per_socket: 1,
sockets: 1,
},
vpmu_feature: 0,
};
vm.set_vm_config(vm_config);
vm.init_guest_memory().unwrap();
vm.setup_interrupt_controller().unwrap();
let vm_as = vm.vm_as().cloned().unwrap();
let kernel_temp_file = vmm_sys_util::tempfile::TempFile::new().unwrap();
let kernel_file = kernel_temp_file.into_file();
let mut cmdline = crate::vm::KernelConfigInfo::new(
kernel_file,
None,
linux_loader::cmdline::Cmdline::new(0x1000),
);
let address_space = vm.vm_address_space().cloned();
let mgr = vm.device_manager_mut();
let guard = mgr.io_manager.load();
let mut lcr = [0u8];
// 0x3f8 is the adddress of serial device
guard.pio_read(0x3f8 + 3, &mut lcr).unwrap_err();
assert_eq!(lcr[0], 0x0);
mgr.create_interrupt_manager().unwrap();
mgr.create_devices(
vm_as,
event_mgr.epoll_manager(),
&mut cmdline,
None,
None,
address_space.as_ref(),
)
.unwrap();
let guard = mgr.io_manager.load();
guard.pio_read(0x3f8 + 3, &mut lcr).unwrap();
assert_eq!(lcr[0], 0x3);
}
#[cfg(feature = "virtio-fs")]
#[test]
fn test_handler_insert_region() {
skip_if_not_root!();
use dbs_virtio_devices::VirtioRegionHandler;
use lazy_static::__Deref;
use vm_memory::{GuestAddressSpace, GuestMemory, GuestMemoryRegion};
let vm = crate::test_utils::tests::create_vm_for_test();
let ctx = DeviceOpContext::new(
Some(vm.epoll_manager().clone()),
vm.device_manager(),
Some(vm.vm_as().unwrap().clone()),
vm.vm_address_space().cloned(),
true,
);
let guest_addr = GuestAddress(0x200000000000);
let cache_len = 1024 * 1024 * 1024;
let mmap_region = MmapRegion::build(
None,
cache_len as usize,
libc::PROT_NONE,
libc::MAP_ANONYMOUS | libc::MAP_NORESERVE | libc::MAP_PRIVATE,
)
.unwrap();
let guest_mmap_region =
Arc::new(vm_memory::GuestRegionMmap::new(mmap_region, guest_addr).unwrap());
let mut handler = DeviceVirtioRegionHandler {
vm_as: ctx.get_vm_as().unwrap(),
address_space: ctx.address_space.as_ref().unwrap().clone(),
};
handler.insert_region(guest_mmap_region).unwrap();
let mut find_region = false;
let find_region_ptr = &mut find_region;
let guard = vm.vm_as().unwrap().clone().memory();
let mem = guard.deref();
for region in mem.iter() {
if region.start_addr() == guest_addr && region.len() == cache_len {
*find_region_ptr = true;
}
}
assert!(find_region);
}
}

View File

@@ -93,7 +93,7 @@ pub enum VirtioNetDeviceError {
}
/// Configuration information for virtio net devices.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct VirtioNetDeviceConfigUpdateInfo {
/// ID of the guest network interface.
pub iface_id: String,
@@ -123,7 +123,7 @@ impl VirtioNetDeviceConfigUpdateInfo {
}
/// Configuration information for virtio net devices.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize, Default)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize, Default)]
pub struct VirtioNetDeviceConfigInfo {
/// ID of the guest network interface.
pub iface_id: String,
@@ -264,7 +264,7 @@ impl VirtioNetDeviceMgr {
config.use_generic_irq.unwrap_or(USE_GENERIC_IRQ),
)
.map_err(VirtioNetDeviceError::DeviceManager)?;
ctx.insert_hotplug_mmio_device(&dev.clone(), None)
ctx.insert_hotplug_mmio_device(&dev, None)
.map_err(VirtioNetDeviceError::DeviceManager)?;
// live-upgrade need save/restore device from info.device.
mgr.info_list[device_index].set_device(dev);

View File

@@ -70,7 +70,7 @@ pub enum VsockDeviceError {
}
/// Configuration information for a vsock device.
#[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Serialize)]
pub struct VsockDeviceConfigInfo {
/// ID of the vsock device.
pub id: String,

View File

@@ -101,7 +101,6 @@ impl EventManager {
/// Poll pending events and invoke registered event handler.
///
/// # Arguments:
/// * max_events: maximum number of pending events to handle
/// * timeout: maximum time in milliseconds to wait
pub fn handle_events(&self, timeout: i32) -> std::result::Result<usize, EpollError> {
self.epoll_mgr

View File

@@ -210,14 +210,19 @@ mod x86_64 {
#[cfg(test)]
mod tests {
use super::*;
use kvm_ioctls::Kvm;
use std::fs::File;
use std::os::unix::fs::MetadataExt;
use std::os::unix::io::{AsRawFd, FromRawFd};
use kvm_ioctls::Kvm;
use test_utils::skip_if_not_root;
use super::*;
#[test]
fn test_create_kvm_context() {
skip_if_not_root!();
let c = KvmContext::new(None).unwrap();
assert!(c.max_memslots >= 32);
@@ -234,6 +239,8 @@ mod tests {
#[cfg(target_arch = "x86_64")]
#[test]
fn test_get_supported_cpu_id() {
skip_if_not_root!();
let c = KvmContext::new(None).unwrap();
let _ = c
@@ -244,6 +251,8 @@ mod tests {
#[test]
fn test_create_vm() {
skip_if_not_root!();
let c = KvmContext::new(None).unwrap();
let _ = c.create_vm().unwrap();

View File

@@ -34,6 +34,9 @@ pub mod vm;
mod event_manager;
mod io_manager;
mod test_utils;
mod vmm;
pub use self::error::StartMicroVmError;

View File

@@ -36,7 +36,7 @@ const PIO_MAX: u16 = 0xFFFF;
const MMIO_SPACE_RESERVED: u64 = 0x400_0000;
/// Errors associated with resource management operations
#[derive(Debug, PartialEq, thiserror::Error)]
#[derive(Debug, PartialEq, Eq, thiserror::Error)]
pub enum ResourceError {
/// Unknown/unsupported resource type.
#[error("unsupported resource type")]
@@ -569,9 +569,7 @@ impl ResourceManager {
Resource::KvmMemSlot(slot) => self.free_kvm_mem_slot(*slot),
Resource::MacAddresss(_) => Ok(()),
};
if result.is_err() {
return result;
}
result?;
}
Ok(())
}
@@ -588,9 +586,9 @@ mod tests {
// Allocate/free shared IRQs multiple times.
assert_eq!(mgr.allocate_legacy_irq(true, None).unwrap(), SHARED_IRQ);
assert_eq!(mgr.allocate_legacy_irq(true, None).unwrap(), SHARED_IRQ);
mgr.free_legacy_irq(SHARED_IRQ);
mgr.free_legacy_irq(SHARED_IRQ);
mgr.free_legacy_irq(SHARED_IRQ);
mgr.free_legacy_irq(SHARED_IRQ).unwrap();
mgr.free_legacy_irq(SHARED_IRQ).unwrap();
mgr.free_legacy_irq(SHARED_IRQ).unwrap();
// Allocate specified IRQs.
assert_eq!(
@@ -598,7 +596,7 @@ mod tests {
.unwrap(),
LEGACY_IRQ_BASE + 10
);
mgr.free_legacy_irq(LEGACY_IRQ_BASE + 10);
mgr.free_legacy_irq(LEGACY_IRQ_BASE + 10).unwrap();
assert_eq!(
mgr.allocate_legacy_irq(false, Some(LEGACY_IRQ_BASE + 10))
.unwrap(),
@@ -635,19 +633,19 @@ mod tests {
let mgr = ResourceManager::new(None);
let msi = mgr.allocate_msi_irq(3).unwrap();
mgr.free_msi_irq(msi, 3);
mgr.free_msi_irq(msi, 3).unwrap();
let msi = mgr.allocate_msi_irq(3).unwrap();
mgr.free_msi_irq(msi, 3);
mgr.free_msi_irq(msi, 3).unwrap();
let irq = mgr.allocate_msi_irq_aligned(8).unwrap();
assert_eq!(irq & 0x7, 0);
mgr.free_msi_irq(msi, 8);
mgr.free_msi_irq(msi, 8).unwrap();
let irq = mgr.allocate_msi_irq_aligned(8).unwrap();
assert_eq!(irq & 0x7, 0);
let irq = mgr.allocate_msi_irq_aligned(512).unwrap();
assert_eq!(irq, 512);
mgr.free_msi_irq(irq, 512);
mgr.free_msi_irq(irq, 512).unwrap();
let irq = mgr.allocate_msi_irq_aligned(512).unwrap();
assert_eq!(irq, 512);
@@ -690,9 +688,9 @@ mod tests {
},
];
let resources = mgr.allocate_device_resources(&requests, false).unwrap();
mgr.free_device_resources(&resources);
mgr.free_device_resources(&resources).unwrap();
let resources = mgr.allocate_device_resources(&requests, false).unwrap();
mgr.free_device_resources(&resources);
mgr.free_device_resources(&resources).unwrap();
requests.push(ResourceConstraint::PioAddress {
range: Some((0xc000, 0xc000)),
align: 0x1000,
@@ -702,7 +700,7 @@ mod tests {
let resources = mgr
.allocate_device_resources(&requests[0..requests.len() - 1], false)
.unwrap();
mgr.free_device_resources(&resources);
mgr.free_device_resources(&resources).unwrap();
}
#[test]
@@ -721,7 +719,7 @@ mod tests {
let mgr = ResourceManager::new(None);
assert_eq!(mgr.allocate_kvm_mem_slot(1, None).unwrap(), 0);
assert_eq!(mgr.allocate_kvm_mem_slot(1, Some(200)).unwrap(), 200);
mgr.free_kvm_mem_slot(200);
mgr.free_kvm_mem_slot(200).unwrap();
assert_eq!(mgr.allocate_kvm_mem_slot(1, Some(200)).unwrap(), 200);
assert_eq!(
mgr.allocate_kvm_mem_slot(1, Some(KVM_USER_MEM_SLOTS))

View File

@@ -0,0 +1,47 @@
// Copyright (C) 2022 Alibaba Cloud. All rights reserved.
//
// SPDX-License-Identifier: Apache-2.0
#[cfg(test)]
pub mod tests {
use crate::api::v1::InstanceInfo;
use crate::vm::{CpuTopology, KernelConfigInfo, Vm, VmConfigInfo};
use dbs_utils::epoll_manager::EpollManager;
use linux_loader::cmdline::Cmdline;
use std::sync::{Arc, RwLock};
use vmm_sys_util::tempfile::TempFile;
pub fn create_vm_for_test() -> Vm {
// Call for kvm too frequently would cause error in some host kernel.
let instance_info = Arc::new(RwLock::new(InstanceInfo::default()));
let epoll_manager = EpollManager::default();
let mut vm = Vm::new(None, instance_info, epoll_manager).unwrap();
let kernel_file = TempFile::new().unwrap();
let cmd_line = Cmdline::new(64);
vm.set_kernel_config(KernelConfigInfo::new(
kernel_file.into_file(),
None,
cmd_line,
));
let vm_config = VmConfigInfo {
vcpu_count: 1,
max_vcpu_count: 1,
cpu_pm: "off".to_string(),
mem_type: "shmem".to_string(),
mem_file_path: "".to_string(),
mem_size_mib: 1,
serial_path: None,
cpu_topology: CpuTopology {
threads_per_core: 1,
cores_per_die: 1,
dies_per_socket: 1,
sockets: 1,
},
vpmu_feature: 0,
};
vm.set_vm_config(vm_config);
vm.init_guest_memory().unwrap();
vm
}
}

View File

@@ -39,6 +39,7 @@ impl Vcpu {
/// vcpu thread to vmm thread.
/// * `create_ts` - A timestamp used by the vcpu to calculate its lifetime.
/// * `support_immediate_exit` - whether kvm uses supports immediate_exit flag.
#[allow(clippy::too_many_arguments)]
pub fn new_aarch64(
id: u8,
vcpu_fd: Arc<VcpuFd>,

View File

@@ -533,16 +533,11 @@ impl Vcpu {
fn check_io_port_info(&self, addr: u16, data: &[u8]) -> Result<bool> {
let mut checked = false;
match addr {
// debug info signal
MAGIC_IOPORT_DEBUG_INFO => {
if data.len() == 4 {
let data = unsafe { std::ptr::read(data.as_ptr() as *const u32) };
log::warn!("KDBG: guest kernel debug info: 0x{:x}", data);
checked = true;
}
}
_ => {}
// debug info signal
if addr == MAGIC_IOPORT_DEBUG_INFO && data.len() == 4 {
let data = unsafe { std::ptr::read(data.as_ptr() as *const u32) };
log::warn!("KDBG: guest kernel debug info: 0x{:x}", data);
checked = true;
};
Ok(checked)
@@ -771,6 +766,7 @@ pub mod tests {
use dbs_device::device_manager::IoManager;
use kvm_ioctls::Kvm;
use lazy_static::lazy_static;
use test_utils::skip_if_not_root;
use super::*;
use crate::kvm_context::KvmContext;
@@ -855,7 +851,7 @@ pub mod tests {
let kvm = Kvm::new().unwrap();
let vm = Arc::new(kvm.create_vm().unwrap());
let kvm_context = KvmContext::new(Some(kvm.as_raw_fd())).unwrap();
let _kvm_context = KvmContext::new(Some(kvm.as_raw_fd())).unwrap();
let vcpu_fd = Arc::new(vm.create_vcpu(0).unwrap());
let io_manager = IoManagerCached::new(Arc::new(ArcSwap::new(Arc::new(IoManager::new()))));
let reset_event_fd = EventFd::new(libc::EFD_NONBLOCK).unwrap();
@@ -880,6 +876,8 @@ pub mod tests {
#[test]
fn test_vcpu_run_emulation() {
skip_if_not_root!();
let (mut vcpu, _) = create_vcpu();
#[cfg(target_arch = "x86_64")]
@@ -964,6 +962,8 @@ pub mod tests {
#[cfg(target_arch = "x86_64")]
#[test]
fn test_vcpu_check_io_port_info() {
skip_if_not_root!();
let (vcpu, _receiver) = create_vcpu();
// debug info signal

View File

@@ -774,7 +774,7 @@ impl VcpuManager {
self.reset_event_fd.as_ref().unwrap().try_clone().unwrap(),
self.vcpu_state_event.try_clone().unwrap(),
self.vcpu_state_sender.clone(),
request_ts.clone(),
request_ts,
self.support_immediate_exit,
)
.map_err(VcpuManagerError::Vcpu)

View File

@@ -35,6 +35,7 @@ use crate::event_manager::EventManager;
/// * `device_info` - A hashmap containing the attached devices for building FDT device nodes.
/// * `gic_device` - The GIC device.
/// * `initrd` - Information about an optional initrd.
#[allow(clippy::borrowed_box)]
fn configure_system<T: DeviceInfoForFDT + Clone + Debug, M: GuestMemory>(
guest_mem: &M,
cmdline: &str,
@@ -58,8 +59,9 @@ fn configure_system<T: DeviceInfoForFDT + Clone + Debug, M: GuestMemory>(
#[cfg(target_arch = "aarch64")]
impl Vm {
/// Gets a reference to the irqchip of the VM
#[allow(clippy::borrowed_box)]
pub fn get_irqchip(&self) -> &Box<dyn GICDevice> {
&self.irqchip_handle.as_ref().unwrap()
self.irqchip_handle.as_ref().unwrap()
}
/// Creates the irq chip in-kernel device model.

View File

@@ -67,7 +67,7 @@ pub enum VmError {
}
/// Configuration information for user defined NUMA nodes.
#[derive(Clone, Debug, Default, Serialize, Deserialize, PartialEq)]
#[derive(Clone, Debug, Default, Serialize, Deserialize, PartialEq, Eq)]
pub struct NumaRegionInfo {
/// memory size for this region (unit: MiB)
pub size: u64,
@@ -80,7 +80,7 @@ pub struct NumaRegionInfo {
}
/// Information for cpu topology to guide guest init
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CpuTopology {
/// threads per core to indicate hyperthreading is enabled or not
pub threads_per_core: u8,
@@ -104,7 +104,7 @@ impl Default for CpuTopology {
}
/// Configuration information for virtual machine instance.
#[derive(Clone, Debug, PartialEq)]
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VmConfigInfo {
/// Number of vcpu to start.
pub vcpu_count: u8,
@@ -520,6 +520,7 @@ impl Vm {
let mem_type = self.vm_config.mem_type.clone();
let mut mem_file_path = String::from("");
if mem_type == "hugetlbfs" {
mem_file_path = self.vm_config.mem_file_path.clone();
let shared_info = self.shared_info.read()
.expect("Failed to determine if instance is initialized because shared info couldn't be read due to poisoned lock");
mem_file_path.push_str("/dragonball/");
@@ -814,3 +815,23 @@ impl Vm {
Err(StartMicroVmError::MicroVMAlreadyRunning)
}
}
#[cfg(test)]
pub mod tests {
use super::*;
impl Vm {
pub fn set_instance_state(&mut self, mstate: InstanceState) {
self.shared_info
.write()
.expect("Failed to start microVM because shared info couldn't be written due to poisoned lock")
.state = mstate;
}
}
pub fn create_vm_instance() -> Vm {
let instance_info = Arc::new(RwLock::new(InstanceInfo::default()));
let epoll_manager = EpollManager::default();
Vm::new(None, instance_info, epoll_manager).unwrap()
}
}

View File

@@ -189,6 +189,8 @@ impl Vmm {
#[cfg(test)]
pub(crate) mod tests {
use test_utils::skip_if_not_root;
use super::*;
pub fn create_vmm_instance() -> Vmm {
@@ -210,6 +212,8 @@ pub(crate) mod tests {
#[test]
fn test_create_vmm_instance() {
skip_if_not_root!();
create_vmm_instance();
}
}

178
src/libs/Cargo.lock generated
View File

@@ -40,12 +40,28 @@ version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "base64"
version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9e1b586273c5702936fe7b7d6896644d8be71e6314cfe09d3167c95f712589e8"
[[package]]
name = "bitflags"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cf1de2fe8c75bc145a2f577add951f8134889b4795d47466a54a5c846d691693"
[[package]]
name = "bitmask-enum"
version = "2.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd9e32d7420c85055e8107e5b2463c4eeefeaac18b52359fe9f9c08a18f342b2"
dependencies = [
"quote",
"syn",
]
[[package]]
name = "bumpalo"
version = "3.11.0"
@@ -187,6 +203,12 @@ version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37ab347416e802de484e4d03c7316c48f1ecb56574dfd4a46a80f173ce1de04d"
[[package]]
name = "fnv"
version = "1.0.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3f9eec918d3f24069decb9af1554cad7c880e2da24a9afd88aca000531ab82c1"
[[package]]
name = "futures"
version = "0.3.21"
@@ -328,6 +350,82 @@ dependencies = [
"libc",
]
[[package]]
name = "hex"
version = "0.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f24254aa9a54b5c858eaee2f5bccdb46aaf0e486a595ed5fd8f86ba55232a70"
[[package]]
name = "http"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75f43d41e26995c17e71ee126451dd3941010b0514a81a9d11f3b341debc2399"
dependencies = [
"bytes 1.1.0",
"fnv",
"itoa",
]
[[package]]
name = "http-body"
version = "0.4.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d5f38f16d184e36f2408a55281cd658ecbd3ca05cce6d6510a176eca393e26d1"
dependencies = [
"bytes 1.1.0",
"http",
"pin-project-lite",
]
[[package]]
name = "httparse"
version = "1.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d897f394bad6a705d5f4104762e116a75639e470d80901eed05a860a95cb1904"
[[package]]
name = "httpdate"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4a1e36c821dbe04574f602848a19f742f4fb3c98d40449f11bcad18d6b17421"
[[package]]
name = "hyper"
version = "0.14.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "034711faac9d2166cb1baf1a2fb0b60b1f277f8492fd72176c17f3515e1abd3c"
dependencies = [
"bytes 1.1.0",
"futures-channel",
"futures-core",
"futures-util",
"http",
"http-body",
"httparse",
"httpdate",
"itoa",
"pin-project-lite",
"socket2",
"tokio",
"tower-service",
"tracing",
"want",
]
[[package]]
name = "hyperlocal"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0fafdf7b2b2de7c9784f76e02c0935e65a8117ec3b768644379983ab333ac98c"
dependencies = [
"futures-util",
"hex",
"hyper",
"pin-project",
"tokio",
]
[[package]]
name = "indexmap"
version = "1.8.1"
@@ -410,6 +508,9 @@ dependencies = [
name = "kata-types"
version = "0.1.0"
dependencies = [
"anyhow",
"base64",
"bitmask-enum",
"byte-unit",
"glob",
"lazy_static",
@@ -628,6 +729,26 @@ dependencies = [
"indexmap",
]
[[package]]
name = "pin-project"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ad29a609b6bcd67fee905812e544992d216af9d755757c05ed2d0e15a74c6ecc"
dependencies = [
"pin-project-internal",
]
[[package]]
name = "pin-project-internal"
version = "1.0.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "069bdb1e05adc7a8990dce9cc75370895fbe4e3d58b9b73bf1aee56359344a55"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "pin-project-lite"
version = "0.2.8"
@@ -936,6 +1057,16 @@ dependencies = [
"syn",
]
[[package]]
name = "shim-interface"
version = "0.1.0"
dependencies = [
"anyhow",
"hyper",
"hyperlocal",
"tokio",
]
[[package]]
name = "slab"
version = "0.4.6"
@@ -991,9 +1122,9 @@ checksum = "f2dd574626839106c320a323308629dcb1acfc96e32a8cba364ddc61ac23ee83"
[[package]]
name = "socket2"
version = "0.4.4"
version = "0.4.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "66d72b759436ae32898a2af0a14218dbf55efde3feeb170eb623637db85ee1e0"
checksum = "02e2d2db9033d13a1567121ddd7a095ee144db4e1ca1b1bda3419bc0da294ebd"
dependencies = [
"libc",
"winapi",
@@ -1096,6 +1227,7 @@ dependencies = [
"libc",
"memchr",
"mio",
"num_cpus",
"pin-project-lite",
"socket2",
"tokio-macros",
@@ -1135,6 +1267,38 @@ dependencies = [
"serde",
]
[[package]]
name = "tower-service"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6bc1c9ce2b5135ac7f93c72918fc37feb872bdc6a5533a8b85eb4b86bfdae52"
[[package]]
name = "tracing"
version = "0.1.34"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d0ecdcb44a79f0fe9844f0c4f33a342cbcbb5117de8001e6ba0dc2351327d09"
dependencies = [
"cfg-if",
"pin-project-lite",
"tracing-core",
]
[[package]]
name = "tracing-core"
version = "0.1.26"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f54c8ca710e81886d498c2fd3331b56c93aa248d49de2222ad2742247c60072f"
dependencies = [
"lazy_static",
]
[[package]]
name = "try-lock"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "59547bce71d9c38b83d9c0e92b6066c4253371f15005def0c30d9657f50c7642"
[[package]]
name = "ttrpc"
version = "0.6.1"
@@ -1203,6 +1367,16 @@ dependencies = [
"nix 0.23.1",
]
[[package]]
name = "want"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1ce8a968cb1cd110d136ff8b819a556d6fb6d919363c61534f6860c7eb172ba0"
dependencies = [
"log",
"try-lock",
]
[[package]]
name = "wasi"
version = "0.9.0+wasi-snapshot-preview1"

View File

@@ -6,6 +6,7 @@ members = [
"oci",
"protocols",
"safe-path",
"shim-interface",
"test-utils",
]
resolver = "2"

View File

@@ -16,6 +16,9 @@ default: build
build:
cargo build --all-features
static-checks-build:
@echo "INFO: static-checks-build do nothing.."
check: clippy format
clippy:

View File

@@ -49,7 +49,7 @@ pub fn is_host_empty_dir(path: &str) -> bool {
false
}
// set_ephemeral_storage_type sets the mount type to 'ephemeral'
// update_ephemeral_storage_type sets the mount type to 'ephemeral'
// if the mount source path is provisioned by k8s for ephemeral storage.
// For the given pod ephemeral volume is created only once
// backed by tmpfs inside the VM. For successive containers
@@ -63,6 +63,8 @@ pub fn update_ephemeral_storage_type(oci_spec: &mut Spec) {
if is_ephemeral_volume(&m.source) {
m.r#type = String::from(mount::KATA_EPHEMERAL_VOLUME_TYPE);
} else if is_host_empty_dir(&m.source) {
// FIXME support disable_guest_empty_dir
// https://github.com/kata-containers/kata-containers/blob/02a51e75a7e0c6fce5e8abe3b991eeac87e09645/src/runtime/pkg/katautils/create.go#L105
m.r#type = String::from(mount::KATA_HOST_DIR_VOLUME_TYPE);
}
}

View File

@@ -43,8 +43,6 @@
use std::fmt::Debug;
use std::fs;
use std::io::{self, BufRead};
use std::os::raw::c_char;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::fs::{DirBuilderExt, OpenOptionsExt};
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
@@ -213,11 +211,11 @@ pub fn create_mount_destination<S: AsRef<Path>, D: AsRef<Path>, R: AsRef<Path>>(
}
}
/// Remount a bind mount into readonly mode.
/// Remount a bind mount
///
/// # Safety
/// Caller needs to ensure safety of the `dst` to avoid possible file path based attacks.
pub fn bind_remount_read_only<P: AsRef<Path>>(dst: P) -> Result<()> {
pub fn bind_remount<P: AsRef<Path>>(dst: P, readonly: bool) -> Result<()> {
let dst = dst.as_ref();
if dst.is_empty() {
return Err(Error::NullMountPointPath);
@@ -226,7 +224,7 @@ pub fn bind_remount_read_only<P: AsRef<Path>>(dst: P) -> Result<()> {
.canonicalize()
.map_err(|_e| Error::InvalidPath(dst.to_path_buf()))?;
do_rebind_mount_read_only(dst, MsFlags::empty())
do_rebind_mount(dst, readonly, MsFlags::empty())
}
/// Bind mount `src` to `dst` in slave mode, optionally in readonly mode if `readonly` is true.
@@ -239,7 +237,7 @@ pub fn bind_remount_read_only<P: AsRef<Path>>(dst: P) -> Result<()> {
pub fn bind_mount_unchecked<S: AsRef<Path>, D: AsRef<Path>>(
src: S,
dst: D,
read_only: bool,
readonly: bool,
) -> Result<()> {
fail::fail_point!("bind_mount", |_| {
Err(Error::FailureInject(
@@ -275,8 +273,8 @@ pub fn bind_mount_unchecked<S: AsRef<Path>, D: AsRef<Path>>(
.map_err(|e| Error::Mount(PathBuf::new(), dst.to_path_buf(), e))?;
// Optionally rebind into readonly mode.
if read_only {
do_rebind_mount_read_only(dst, MsFlags::empty())?;
if readonly {
do_rebind_mount(dst, readonly, MsFlags::empty())?;
}
Ok(())
@@ -356,7 +354,7 @@ impl Mounter for kata_types::mount::Mount {
// Bind mount readonly.
let bro_flag = MsFlags::MS_BIND | MsFlags::MS_RDONLY;
if (o_flag & bro_flag) == bro_flag {
do_rebind_mount_read_only(target, o_flag)?;
do_rebind_mount(target, true, o_flag)?;
}
Ok(())
@@ -364,12 +362,16 @@ impl Mounter for kata_types::mount::Mount {
}
#[inline]
fn do_rebind_mount_read_only<P: AsRef<Path>>(path: P, flags: MsFlags) -> Result<()> {
fn do_rebind_mount<P: AsRef<Path>>(path: P, readonly: bool, flags: MsFlags) -> Result<()> {
mount(
Some(""),
path.as_ref(),
Some(""),
flags | MsFlags::MS_BIND | MsFlags::MS_REMOUNT | MsFlags::MS_RDONLY,
if readonly {
flags | MsFlags::MS_BIND | MsFlags::MS_REMOUNT | MsFlags::MS_RDONLY
} else {
flags | MsFlags::MS_BIND | MsFlags::MS_REMOUNT
},
Some(""),
)
.map_err(|e| Error::Remount(path.as_ref().to_path_buf(), e))
@@ -756,18 +758,11 @@ pub fn umount_all<P: AsRef<Path>>(mountpoint: P, lazy_umount: bool) -> Result<()
// Counterpart of nix::umount2, with support of `UMOUNT_FOLLOW`.
fn umount2<P: AsRef<Path>>(path: P, lazy_umount: bool) -> std::io::Result<()> {
let path_ptr = path.as_ref().as_os_str().as_bytes().as_ptr() as *const c_char;
let mut flags = MntFlags::UMOUNT_NOFOLLOW.bits();
let mut flags = MntFlags::UMOUNT_NOFOLLOW;
if lazy_umount {
flags |= MntFlags::MNT_DETACH.bits();
}
// Safe because parameter is valid and we have checked the reuslt.
if unsafe { libc::umount2(path_ptr, flags) } < 0 {
Err(io::Error::last_os_error())
} else {
Ok(())
flags |= MntFlags::MNT_DETACH;
}
nix::mount::umount2(path.as_ref(), flags).map_err(io::Error::from)
}
#[cfg(test)]
@@ -820,21 +815,21 @@ mod tests {
#[test]
#[ignore]
fn test_bind_remount_read_only() {
fn test_bind_remount() {
let tmpdir = tempfile::tempdir().unwrap();
let tmpdir2 = tempfile::tempdir().unwrap();
assert!(matches!(
bind_remount_read_only(&PathBuf::from("")),
bind_remount(&PathBuf::from(""), true),
Err(Error::NullMountPointPath)
));
assert!(matches!(
bind_remount_read_only(&PathBuf::from("../______doesn't____exist____nnn")),
bind_remount(&PathBuf::from("../______doesn't____exist____nnn"), true),
Err(Error::InvalidPath(_))
));
bind_mount_unchecked(tmpdir2.path(), tmpdir.path(), true).unwrap();
bind_remount_read_only(tmpdir.path()).unwrap();
bind_remount(tmpdir.path(), true).unwrap();
umount_timeout(tmpdir.path().to_str().unwrap(), 0).unwrap();
}

View File

@@ -49,7 +49,7 @@ pub enum ShimIdInfo {
}
/// get container type
pub fn get_contaier_type(spec: &oci::Spec) -> Result<ContainerType, Error> {
pub fn get_container_type(spec: &oci::Spec) -> Result<ContainerType, Error> {
for k in CRI_CONTAINER_TYPE_KEY_LIST.iter() {
if let Some(type_value) = spec.annotations.get(*k) {
match type_value.as_str() {
@@ -67,7 +67,7 @@ pub fn get_contaier_type(spec: &oci::Spec) -> Result<ContainerType, Error> {
/// get shim id info
pub fn get_shim_id_info() -> Result<ShimIdInfo, Error> {
let spec = load_oci_spec()?;
match get_contaier_type(&spec)? {
match get_container_type(&spec)? {
ContainerType::PodSandbox => Ok(ShimIdInfo::Sandbox),
ContainerType::PodContainer => {
for k in CRI_SANDBOX_ID_KEY_LIST {

View File

@@ -11,6 +11,9 @@ license = "Apache-2.0"
edition = "2018"
[dependencies]
bitmask-enum = "2.1.0"
anyhow = "1.0"
base64 = "0.13.0"
byte-unit = "3.1.4"
glob = "0.3.0"
lazy_static = "1.4.0"

View File

@@ -0,0 +1,107 @@
// Copyright (c) 2019-2022 Alibaba Cloud
// Copyright (c) 2019-2022 Ant Group
//
// SPDX-License-Identifier: Apache-2.0
//
use bitmask_enum::bitmask;
/// CapabilityBits
#[bitmask(u8)]
pub enum CapabilityBits {
/// hypervisor supports use block device
BlockDeviceSupport,
/// hypervisor supports block device hotplug
BlockDeviceHotplugSupport,
/// hypervisor supports multi queue
MultiQueueSupport,
/// hypervisor supports filesystem share
FsSharingSupport,
}
/// Capabilities describe a virtcontainers hypervisor capabilities through a bit mask.
#[derive(Debug, Clone)]
pub struct Capabilities {
/// Capability flags
flags: CapabilityBits,
}
impl Default for Capabilities {
fn default() -> Self {
Self::new()
}
}
impl Capabilities {
/// new Capabilities struct
pub fn new() -> Self {
Capabilities {
flags: CapabilityBits { bits: 0 },
}
}
/// set CapabilityBits
pub fn set(&mut self, flags: CapabilityBits) {
self.flags = flags;
}
/// is_block_device_supported tells if an hypervisor supports block devices.
pub fn is_block_device_supported(&self) -> bool {
self.flags.and(CapabilityBits::BlockDeviceSupport) != 0
}
/// is_block_device_hotplug_supported tells if an hypervisor supports block devices.
pub fn is_block_device_hotplug_supported(&self) -> bool {
self.flags.and(CapabilityBits::BlockDeviceHotplugSupport) != 0
}
/// is_multi_queue_supported tells if an hypervisor supports device multi queue support.
pub fn is_multi_queue_supported(&self) -> bool {
self.flags.and(CapabilityBits::MultiQueueSupport) != 0
}
/// is_fs_sharing_supported tells if an hypervisor supports host filesystem sharing.
pub fn is_fs_sharing_supported(&self) -> bool {
self.flags.and(CapabilityBits::FsSharingSupport) != 0
}
}
#[cfg(test)]
mod tests {
use crate::capabilities::CapabilityBits;
use super::Capabilities;
#[test]
fn test_set_hypervisor_capabilities() {
let mut cap = Capabilities::new();
assert!(!cap.is_block_device_supported());
// test set block device support
cap.set(CapabilityBits::BlockDeviceSupport);
assert!(cap.is_block_device_supported());
assert!(!cap.is_block_device_hotplug_supported());
// test set block device hotplug support
cap.set(CapabilityBits::BlockDeviceSupport | CapabilityBits::BlockDeviceHotplugSupport);
assert!(cap.is_block_device_hotplug_supported());
assert!(!cap.is_multi_queue_supported());
// test set multi queue support
cap.set(
CapabilityBits::BlockDeviceSupport
| CapabilityBits::BlockDeviceHotplugSupport
| CapabilityBits::MultiQueueSupport,
);
assert!(cap.is_multi_queue_supported());
// test set host filesystem sharing support
cap.set(
CapabilityBits::BlockDeviceSupport
| CapabilityBits::BlockDeviceHotplugSupport
| CapabilityBits::MultiQueueSupport
| CapabilityBits::FsSharingSupport,
);
assert!(cap.is_fs_sharing_supported())
}
}

View File

@@ -10,6 +10,7 @@ use crate::config::{ConfigOps, TomlConfig};
pub use vendor::AgentVendor;
use super::default::{DEFAULT_AGENT_LOG_PORT, DEFAULT_AGENT_VSOCK_PORT};
use crate::eother;
/// agent name of Kata agent.
pub const AGENT_NAME_KATA: &str = "kata";
@@ -108,6 +109,16 @@ fn default_health_check_timeout() -> u32 {
90_000
}
impl Agent {
fn validate(&self) -> Result<()> {
if self.dial_timeout_ms == 0 {
return Err(eother!("dial_timeout_ms couldn't be 0."));
}
Ok(())
}
}
impl ConfigOps for Agent {
fn adjust_config(conf: &mut TomlConfig) -> Result<()> {
AgentVendor::adjust_config(conf)?;
@@ -116,6 +127,9 @@ impl ConfigOps for Agent {
fn validate(conf: &TomlConfig) -> Result<()> {
AgentVendor::validate(conf)?;
for (_, agent_config) in conf.agent.iter() {
agent_config.validate()?;
}
Ok(())
}
}

View File

@@ -35,7 +35,7 @@ pub const DEFAULT_VHOST_USER_STORE_PATH: &str = "/var/run/vhost-user";
pub const DEFAULT_BLOCK_NVDIMM_MEM_OFFSET: u64 = 0;
pub const DEFAULT_SHARED_FS_TYPE: &str = "virtio-fs";
pub const DEFAULT_VIRTIO_FS_CACHE_MODE: &str = "none";
pub const DEFAULT_VIRTIO_FS_CACHE_MODE: &str = "never";
pub const DEFAULT_VIRTIO_FS_DAX_SIZE_MB: u32 = 1024;
pub const DEFAULT_SHARED_9PFS_SIZE_MB: u32 = 128 * 1024;
pub const MIN_SHARED_9PFS_SIZE_MB: u32 = 4 * 1024;

View File

@@ -864,7 +864,7 @@ impl SharedFsInfo {
)?;
}
let l = ["none", "auto", "always"];
let l = ["never", "auto", "always"];
if !l.contains(&self.virtio_fs_cache.as_str()) {
return Err(eother!(

View File

@@ -31,6 +31,9 @@ pub mod mount;
pub(crate) mod utils;
/// hypervisor capabilities
pub mod capabilities;
/// Common error codes.
#[derive(thiserror::Error, Debug)]
pub enum Error {

View File

@@ -4,6 +4,7 @@
// SPDX-License-Identifier: Apache-2.0
//
use anyhow::{anyhow, Context, Result};
use std::path::PathBuf;
/// Prefix to mark a volume as Kata special.
@@ -13,7 +14,7 @@ pub const KATA_VOLUME_TYPE_PREFIX: &str = "kata:";
pub const KATA_GUEST_MOUNT_PREFIX: &str = "kata:guest-mount:";
/// KATA_EPHEMERAL_DEV_TYPE creates a tmpfs backed volume for sharing files between containers.
pub const KATA_EPHEMERAL_VOLUME_TYPE: &str = "kata:ephemeral";
pub const KATA_EPHEMERAL_VOLUME_TYPE: &str = "ephemeral";
/// KATA_HOST_DIR_TYPE use for host empty dir
pub const KATA_HOST_DIR_VOLUME_TYPE: &str = "kata:hostdir";
@@ -68,10 +69,46 @@ pub fn is_kata_host_dir_volume(ty: &str) -> bool {
ty == KATA_HOST_DIR_VOLUME_TYPE
}
/// Nydus extra options
#[derive(Debug, serde::Deserialize)]
pub struct NydusExtraOptions {
/// source path
pub source: String,
/// nydus config
pub config: String,
/// snapshotter directory
#[serde(rename(deserialize = "snapshotdir"))]
pub snapshot_dir: String,
/// fs version
pub fs_version: String,
}
impl NydusExtraOptions {
/// Create Nydus extra options
pub fn new(mount: &Mount) -> Result<Self> {
let options: Vec<&str> = mount
.options
.iter()
.filter(|x| x.starts_with("extraoption="))
.map(|x| x.as_ref())
.collect();
if options.len() != 1 {
return Err(anyhow!(
"get_nydus_extra_options: Invalid nydus options: {:?}",
&mount.options
));
}
let config_raw_data = options[0].trim_start_matches("extraoption=");
let extra_options_buf =
base64::decode(config_raw_data).context("decode the nydus's base64 extraoption")?;
serde_json::from_slice(&extra_options_buf).context("deserialize nydus's extraoption")
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_is_kata_special_volume() {
assert!(is_kata_special_volume("kata:guest-mount:nfs"));
@@ -85,4 +122,38 @@ mod tests {
assert!(!is_kata_guest_mount_volume("kata:guest-moun"));
assert!(!is_kata_guest_mount_volume("Kata:guest-mount:nfs"));
}
#[test]
fn test_get_nydus_extra_options_v5() {
let mut mount_info = Mount {
..Default::default()
};
mount_info.options = vec!["extraoption=eyJzb3VyY2UiOiIvdmFyL2xpYi9jb250YWluZXJkL2lvLmNvbnRhaW5lcmQuc25hcHNob3R0ZXIudjEubnlkdXMvc25hcHNob3RzLzkvZnMvaW1hZ2UvaW1hZ2UuYm9vdCIsImNvbmZpZyI6IntcImRldmljZVwiOntcImJhY2tlbmRcIjp7XCJ0eXBlXCI6XCJyZWdpc3RyeVwiLFwiY29uZmlnXCI6e1wicmVhZGFoZWFkXCI6ZmFsc2UsXCJob3N0XCI6XCJsb2NhbGhvc3Q6NTAwMFwiLFwicmVwb1wiOlwidWJ1bnR1LW55ZHVzXCIsXCJzY2hlbWVcIjpcImh0dHBcIixcInNraXBfdmVyaWZ5XCI6dHJ1ZSxcInByb3h5XCI6e1wiZmFsbGJhY2tcIjpmYWxzZX0sXCJ0aW1lb3V0XCI6NSxcImNvbm5lY3RfdGltZW91dFwiOjUsXCJyZXRyeV9saW1pdFwiOjJ9fSxcImNhY2hlXCI6e1widHlwZVwiOlwiYmxvYmNhY2hlXCIsXCJjb25maWdcIjp7XCJ3b3JrX2RpclwiOlwiL3Zhci9saWIvbnlkdXMvY2FjaGVcIixcImRpc2FibGVfaW5kZXhlZF9tYXBcIjpmYWxzZX19fSxcIm1vZGVcIjpcImRpcmVjdFwiLFwiZGlnZXN0X3ZhbGlkYXRlXCI6ZmFsc2UsXCJlbmFibGVfeGF0dHJcIjp0cnVlLFwiZnNfcHJlZmV0Y2hcIjp7XCJlbmFibGVcIjp0cnVlLFwicHJlZmV0Y2hfYWxsXCI6ZmFsc2UsXCJ0aHJlYWRzX2NvdW50XCI6NCxcIm1lcmdpbmdfc2l6ZVwiOjAsXCJiYW5kd2lkdGhfcmF0ZVwiOjB9LFwidHlwZVwiOlwiXCIsXCJpZFwiOlwiXCIsXCJkb21haW5faWRcIjpcIlwiLFwiY29uZmlnXCI6e1wiaWRcIjpcIlwiLFwiYmFja2VuZF90eXBlXCI6XCJcIixcImJhY2tlbmRfY29uZmlnXCI6e1wicmVhZGFoZWFkXCI6ZmFsc2UsXCJwcm94eVwiOntcImZhbGxiYWNrXCI6ZmFsc2V9fSxcImNhY2hlX3R5cGVcIjpcIlwiLFwiY2FjaGVfY29uZmlnXCI6e1wid29ya19kaXJcIjpcIlwifSxcIm1ldGFkYXRhX3BhdGhcIjpcIlwifX0iLCJzbmFwc2hvdGRpciI6Ii92YXIvbGliL2NvbnRhaW5lcmQvaW8uY29udGFpbmVyZC5zbmFwc2hvdHRlci52MS5ueWR1cy9zbmFwc2hvdHMvMjU3IiwiZnNfdmVyc2lvbiI6InY1In0=".to_string()];
let extra_option_result = NydusExtraOptions::new(&mount_info);
assert!(extra_option_result.is_ok());
let extra_option = extra_option_result.unwrap();
assert_eq!(extra_option.source,"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/9/fs/image/image.boot");
assert_eq!(
extra_option.snapshot_dir,
"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/257"
);
assert_eq!(extra_option.fs_version, "v5");
}
#[test]
fn test_get_nydus_extra_options_v6() {
let mut mount_info = Mount {
..Default::default()
};
mount_info.options = vec!["extraoption=eyJzb3VyY2UiOiIvdmFyL2xpYi9jb250YWluZXJkL2lvLmNvbnRhaW5lcmQuc25hcHNob3R0ZXIudjEubnlkdXMvc25hcHNob3RzLzIwMS9mcy9pbWFnZS9pbWFnZS5ib290IiwiY29uZmlnIjoie1wiZGV2aWNlXCI6e1wiYmFja2VuZFwiOntcInR5cGVcIjpcInJlZ2lzdHJ5XCIsXCJjb25maWdcIjp7XCJyZWFkYWhlYWRcIjpmYWxzZSxcImhvc3RcIjpcImxvY2FsaG9zdDo1MDAwXCIsXCJyZXBvXCI6XCJ1YnVudHUtbnlkdXMtdjZcIixcInNjaGVtZVwiOlwiaHR0cFwiLFwic2tpcF92ZXJpZnlcIjp0cnVlLFwicHJveHlcIjp7XCJmYWxsYmFja1wiOmZhbHNlfSxcInRpbWVvdXRcIjo1LFwiY29ubmVjdF90aW1lb3V0XCI6NSxcInJldHJ5X2xpbWl0XCI6Mn19LFwiY2FjaGVcIjp7XCJ0eXBlXCI6XCJibG9iY2FjaGVcIixcImNvbmZpZ1wiOntcIndvcmtfZGlyXCI6XCIvdmFyL2xpYi9ueWR1cy9jYWNoZVwiLFwiZGlzYWJsZV9pbmRleGVkX21hcFwiOmZhbHNlfX19LFwibW9kZVwiOlwiZGlyZWN0XCIsXCJkaWdlc3RfdmFsaWRhdGVcIjpmYWxzZSxcImVuYWJsZV94YXR0clwiOnRydWUsXCJmc19wcmVmZXRjaFwiOntcImVuYWJsZVwiOnRydWUsXCJwcmVmZXRjaF9hbGxcIjpmYWxzZSxcInRocmVhZHNfY291bnRcIjo0LFwibWVyZ2luZ19zaXplXCI6MCxcImJhbmR3aWR0aF9yYXRlXCI6MH0sXCJ0eXBlXCI6XCJcIixcImlkXCI6XCJcIixcImRvbWFpbl9pZFwiOlwiXCIsXCJjb25maWdcIjp7XCJpZFwiOlwiXCIsXCJiYWNrZW5kX3R5cGVcIjpcIlwiLFwiYmFja2VuZF9jb25maWdcIjp7XCJyZWFkYWhlYWRcIjpmYWxzZSxcInByb3h5XCI6e1wiZmFsbGJhY2tcIjpmYWxzZX19LFwiY2FjaGVfdHlwZVwiOlwiXCIsXCJjYWNoZV9jb25maWdcIjp7XCJ3b3JrX2RpclwiOlwiXCJ9LFwibWV0YWRhdGFfcGF0aFwiOlwiXCJ9fSIsInNuYXBzaG90ZGlyIjoiL3Zhci9saWIvY29udGFpbmVyZC9pby5jb250YWluZXJkLnNuYXBzaG90dGVyLnYxLm55ZHVzL3NuYXBzaG90cy8yNjEiLCJmc192ZXJzaW9uIjoidjYifQ==".to_string()];
let extra_option_result = NydusExtraOptions::new(&mount_info);
assert!(extra_option_result.is_ok());
let extra_option = extra_option_result.unwrap();
assert_eq!(extra_option.source,"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/201/fs/image/image.boot");
assert_eq!(
extra_option.snapshot_dir,
"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/261"
);
assert_eq!(extra_option.fs_version, "v6");
}
}

View File

@@ -0,0 +1,18 @@
[package]
name = "shim-interface"
version = "0.1.0"
description = "A library to provide service interface of Kata Containers"
keywords = ["kata", "container", "http"]
categories = ["services"]
authors = ["The Kata Containers community <kata-dev@lists.katacontainers.io>"]
repository = "https://github.com/kata-containers/kata-containers.git"
homepage = "https://katacontainers.io/"
readme = "README.md"
license = "Apache-2.0"
edition = "2018"
[dependencies]
anyhow = "^1.0"
tokio = { version = "1.8.0", features = ["rt-multi-thread"] }
hyper = { version = "0.14.20", features = ["stream", "server", "http1"] }
hyperlocal = "0.8"

View File

@@ -0,0 +1,66 @@
// Copyright (c) 2022 Alibaba Cloud
//
// SPDX-License-Identifier: Apache-2.0
//
//! shim-interface is a common library for different components of Kata Containers
//! to make function call through services inside the runtime(runtime-rs runtime).
//!
//! Shim management:
//! Currently, inside the shim, there is a shim management server running as the shim
//! starts, working as a RESTful server. To make function call in runtime from another
//! binary, using the utilities provided in this library is one of the methods.
//!
//! You may construct clients by construct a MgmtClient and let is make specific
//! HTTP request to the server. The server inside shim will multiplex the request
//! to its corresponding handler and run certain methods.
use std::path::Path;
use anyhow::{anyhow, Result};
pub mod shim_mgmt;
pub const KATA_PATH: &str = "/run/kata";
pub const SHIM_MGMT_SOCK_NAME: &str = "shim-monitor.sock";
// return sandbox's storage path
pub fn sb_storage_path() -> String {
String::from(KATA_PATH)
}
// returns the address of the unix domain socket(UDS) for communication with shim
// management service using http
// normally returns "unix:///run/kata/{sid}/shim_monitor.sock"
pub fn mgmt_socket_addr(sid: &str) -> Result<String> {
if sid.is_empty() {
return Err(anyhow!(
"Empty sandbox id for acquiring socket address for shim_mgmt"
));
}
let p = Path::new(&sb_storage_path())
.join(sid)
.join(SHIM_MGMT_SOCK_NAME);
if let Some(p) = p.to_str() {
Ok(format!("unix://{}", p))
} else {
Err(anyhow!("Bad socket path"))
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_mgmt_socket_addr() {
let sid = "414123";
let addr = mgmt_socket_addr(sid).unwrap();
assert_eq!(addr, "unix:///run/kata/414123/shim-monitor.sock");
let sid = "";
assert!(mgmt_socket_addr(sid).is_err());
}
}

Some files were not shown because too many files have changed in this diff Show More