Compare commits
385 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
ef11ce13ea | ||
|
|
ea3f9b22a2 | ||
|
|
86ad7e486c | ||
|
|
624ff41318 | ||
|
|
6bb3f44100 | ||
|
|
4d4aba2e64 | ||
|
|
5f4f8ff337 | ||
|
|
f0d6316004 | ||
|
|
4e868ad981 | ||
|
|
a24ff2b51c | ||
|
|
1c70ef544f | ||
|
|
e5df408f64 | ||
|
|
985b9fa479 | ||
|
|
6d5e47bab1 | ||
|
|
514af3624b | ||
|
|
a6e3fb6514 | ||
|
|
55bdd1fcf4 | ||
|
|
6586f3b725 | ||
|
|
f5adc4c114 | ||
|
|
a67bdc369a | ||
|
|
67be55834d | ||
|
|
abfff68de6 | ||
|
|
0466ee04da | ||
|
|
6b223194a9 | ||
|
|
fb01d51573 | ||
|
|
144be14547 | ||
|
|
017c7cf249 | ||
|
|
52c6b0737c | ||
|
|
e7bdeb49b9 | ||
|
|
c0ca9f9a90 | ||
|
|
81f389903a | ||
|
|
179a98d678 | ||
|
|
e3efcfd40f | ||
|
|
5a92333f4b | ||
|
|
ec0424e153 | ||
|
|
b26e94ffba | ||
|
|
f6f4023508 | ||
|
|
814e7d7285 | ||
|
|
92d1197f10 | ||
|
|
a2484d0088 | ||
|
|
9e2cbe8ea1 | ||
|
|
fc676f76de | ||
|
|
ac9f838e33 | ||
|
|
9ea851ee53 | ||
|
|
2c1b957642 | ||
|
|
dfe5ef36b4 | ||
|
|
8a374af6b7 | ||
|
|
50aa89fa05 | ||
|
|
57aa746d0d | ||
|
|
ce2798b688 | ||
|
|
b7208b3c6c | ||
|
|
7e4dc08b0e | ||
|
|
a649d33a45 | ||
|
|
c628ecf298 | ||
|
|
d87076eea5 | ||
|
|
2dd859bfce | ||
|
|
4c9af982e6 | ||
|
|
06f964843a | ||
|
|
c27c3c40dd | ||
|
|
476467115f | ||
|
|
73645d1742 | ||
|
|
c7db337f10 | ||
|
|
72af86f686 | ||
|
|
95b2cad095 | ||
|
|
506f4f2adc | ||
|
|
a3e35e7e92 | ||
|
|
fdf69ab84c | ||
|
|
56b94e200c | ||
|
|
0533bee222 | ||
|
|
2114576be5 | ||
|
|
bcd8fd538d | ||
|
|
6fe3f331c9 | ||
|
|
3f3a2533a3 | ||
|
|
fc72d392b7 | ||
|
|
ef4ebfba48 | ||
|
|
336b80626c | ||
|
|
dd3c5fc617 | ||
|
|
93bd2e4716 | ||
|
|
7eb882a797 | ||
|
|
a60cf37879 | ||
|
|
ca6438728d | ||
|
|
32feb10331 | ||
|
|
3c618a61d6 | ||
|
|
7c888b34be | ||
|
|
234d53b6df | ||
|
|
cf81d400d8 | ||
|
|
79ed33adb5 | ||
|
|
f1cea9a022 | ||
|
|
4f802cc993 | ||
|
|
dda4279a2b | ||
|
|
5888971e18 | ||
|
|
ca28ca422c | ||
|
|
50ad323a21 | ||
|
|
f8314bedb0 | ||
|
|
99d9a24a51 | ||
|
|
0091b89184 | ||
|
|
9da2707202 | ||
|
|
2a0ff0bec3 | ||
|
|
fa581d334f | ||
|
|
a3967e9a59 | ||
|
|
272d39bc87 | ||
|
|
7a86c2eedd | ||
|
|
5096bd6a11 | ||
|
|
3fe59a99ff | ||
|
|
61fa4a3c75 | ||
|
|
856af1a886 | ||
|
|
74b587431f | ||
|
|
3df65f4f3a | ||
|
|
c5a6354718 | ||
|
|
867d8bc9b4 | ||
|
|
cfe9470ff1 | ||
|
|
9820459a0f | ||
|
|
4e141a96ed | ||
|
|
c8028da3c6 | ||
|
|
0aa68ccfef | ||
|
|
e4cea92ad3 | ||
|
|
0590fedd98 | ||
|
|
6b6668998f | ||
|
|
4f7f25d1a1 | ||
|
|
216eb29e04 | ||
|
|
65ae12710d | ||
|
|
9bc6fe6c83 | ||
|
|
349d496f7f | ||
|
|
6005026416 | ||
|
|
91b43a9964 | ||
|
|
2478b8f400 | ||
|
|
499aa24d38 | ||
|
|
1edb7fe7da | ||
|
|
607a892f2e | ||
|
|
26f176e2d9 | ||
|
|
3306195f66 | ||
|
|
a7568b520c | ||
|
|
e6d68349fa | ||
|
|
1f943bd6bf | ||
|
|
9a41d09f39 | ||
|
|
8fdb85e062 | ||
|
|
49516ef6f2 | ||
|
|
21fad464e8 | ||
|
|
b745e5ff02 | ||
|
|
40316f688a | ||
|
|
35b619ff58 | ||
|
|
662e8db5dd | ||
|
|
9117dd409e | ||
|
|
fce14f3697 | ||
|
|
0fd70f7ec3 | ||
|
|
4727a9c3e4 | ||
|
|
7ab8f62d43 | ||
|
|
7e92833bd4 | ||
|
|
14b18b55be | ||
|
|
1dde0de1d7 | ||
|
|
d4c1b768a6 | ||
|
|
3c36ce8139 | ||
|
|
c9d4e2c4b0 | ||
|
|
5fadc5fcb4 | ||
|
|
7cc7fd6888 | ||
|
|
5f8875064b | ||
|
|
3b925d6ad1 | ||
|
|
7526ee9350 | ||
|
|
c46a6244ba | ||
|
|
21ed9dc23f | ||
|
|
5f1520bdee | ||
|
|
e30bd6733b | ||
|
|
78df4a0c3f | ||
|
|
7daf9cffb1 | ||
|
|
293be9d0ad | ||
|
|
84e1a34f8f | ||
|
|
cf56307edb | ||
|
|
359f76d209 | ||
|
|
ca8f1399ca | ||
|
|
0bb559a438 | ||
|
|
4ca4412f64 | ||
|
|
e2424b9eb1 | ||
|
|
3d80c84869 | ||
|
|
f0fdc8e17c | ||
|
|
e53645ec85 | ||
|
|
aa295c91f2 | ||
|
|
6648c8c7fc | ||
|
|
49776f76bf | ||
|
|
dbfe85e705 | ||
|
|
0c3b6a94b3 | ||
|
|
f751c98da3 | ||
|
|
08361c5948 | ||
|
|
da9bfb27ed | ||
|
|
7347d43cf9 | ||
|
|
c7bb1e2790 | ||
|
|
e6f7ddd9a2 | ||
|
|
46cfed5025 | ||
|
|
81fb2c9980 | ||
|
|
0c432153df | ||
|
|
6511ffe89d | ||
|
|
ee59378232 | ||
|
|
ef11213a4e | ||
|
|
1fb6730984 | ||
|
|
05e9fe0591 | ||
|
|
d658129695 | ||
|
|
ae2d89e95e | ||
|
|
095d4ad08d | ||
|
|
bd816dfcec | ||
|
|
d413bf7d44 | ||
|
|
76408c0f13 | ||
|
|
6e4da19fa5 | ||
|
|
8f8061da08 | ||
|
|
64e4b2fa83 | ||
|
|
7c0d68f7f7 | ||
|
|
82ed34aee1 | ||
|
|
9def624c05 | ||
|
|
6926914683 | ||
|
|
e733c13cf7 | ||
|
|
ba069f9baa | ||
|
|
cc8ec7b0e9 | ||
|
|
8a364d2145 | ||
|
|
0cc6297716 | ||
|
|
b6059f3566 | ||
|
|
c6afad2a06 | ||
|
|
451608fb28 | ||
|
|
8328136575 | ||
|
|
a92a63031d | ||
|
|
997f7c4433 | ||
|
|
74d4065197 | ||
|
|
73bb3fdbee | ||
|
|
5a587ba506 | ||
|
|
29f5dec38f | ||
|
|
d71f9e1155 | ||
|
|
28c386c51f | ||
|
|
c2a186b18c | ||
|
|
8cd094cf06 | ||
|
|
b5f2a1e8c4 | ||
|
|
2d65b3bfd8 | ||
|
|
fe5e1cf2e1 | ||
|
|
3f7bcf54f0 | ||
|
|
80144fc415 | ||
|
|
2f5f35608a | ||
|
|
2faafbdd3a | ||
|
|
9e5ed41511 | ||
|
|
b33d4fe708 | ||
|
|
183823398d | ||
|
|
bfbbe8ba6b | ||
|
|
5c21ec278c | ||
|
|
9bb0d48d56 | ||
|
|
64a2ef62e0 | ||
|
|
a441f21c40 | ||
|
|
ce54090f25 | ||
|
|
e884fef483 | ||
|
|
9c16643c12 | ||
|
|
4978c9092c | ||
|
|
a7ba362f92 | ||
|
|
230a9833f8 | ||
|
|
a6d9fd4118 | ||
|
|
8f0cb2f1ea | ||
|
|
cbdae44992 | ||
|
|
97acaa8124 | ||
|
|
23246662b2 | ||
|
|
ebe5ad1386 | ||
|
|
c9497c88e4 | ||
|
|
d5d9928f97 | ||
|
|
f70892a5bb | ||
|
|
ab64780a0b | ||
|
|
9e064ba192 | ||
|
|
42c48f54ed | ||
|
|
d3a36fa06f | ||
|
|
fa546600ff | ||
|
|
efddcb4ab8 | ||
|
|
7bb3e562bc | ||
|
|
7b53041bad | ||
|
|
38212ba6d8 | ||
|
|
fb7e9b4f32 | ||
|
|
0cfcbf79b8 | ||
|
|
997f1f6cd0 | ||
|
|
f60f43af6b | ||
|
|
1789527d61 | ||
|
|
999f67d573 | ||
|
|
cb2255f199 | ||
|
|
2a6c9eec74 | ||
|
|
eaff5de37a | ||
|
|
4f1d23b651 | ||
|
|
6d80df9831 | ||
|
|
a116ce0b75 | ||
|
|
4dc3bc0020 | ||
|
|
8f7a4842c2 | ||
|
|
ce54e5dd57 | ||
|
|
9adb7b7c28 | ||
|
|
73ab9b1d6d | ||
|
|
4db3f9e226 | ||
|
|
19cb657299 | ||
|
|
86bc151787 | ||
|
|
8d8adb6887 | ||
|
|
76298c12b7 | ||
|
|
7d303ec2d0 | ||
|
|
e0b79eb57f | ||
|
|
8ed61b1bb9 | ||
|
|
cc4f02e2b6 | ||
|
|
ace6f1e66e | ||
|
|
47cfeaaf18 | ||
|
|
63c475786f | ||
|
|
059b89cd03 | ||
|
|
4ff3ed5101 | ||
|
|
de8dcb1549 | ||
|
|
c488cc48a2 | ||
|
|
e5acb1257f | ||
|
|
1bddde729b | ||
|
|
9517b0a933 | ||
|
|
f5a7175f92 | ||
|
|
9b969bb7da | ||
|
|
fb2f3cfce2 | ||
|
|
f32a741c76 | ||
|
|
512e79f61a | ||
|
|
aa70080423 | ||
|
|
34015bae12 | ||
|
|
93b60a8327 | ||
|
|
aa9951f2cd | ||
|
|
9d8c72998b | ||
|
|
033ed13202 | ||
|
|
c058d04b94 | ||
|
|
9d2bb0c452 | ||
|
|
627d062fb2 | ||
|
|
96afe62576 | ||
|
|
d946016eb7 | ||
|
|
37f1a77a6a | ||
|
|
450a81cc54 | ||
|
|
c09f02e6f6 | ||
|
|
58c7469110 | ||
|
|
c36ea0968d | ||
|
|
ba197302e2 | ||
|
|
725ad067c1 | ||
|
|
9858c23c59 | ||
|
|
fc8f1ff03c | ||
|
|
f7b4f76082 | ||
|
|
4fd66fa689 | ||
|
|
e6ff42b8ad | ||
|
|
6710d87c6a | ||
|
|
178b79f122 | ||
|
|
bc545c6549 | ||
|
|
585481990a | ||
|
|
0057f86cfa | ||
|
|
fa0401793f | ||
|
|
60b7265961 | ||
|
|
57b53dbae8 | ||
|
|
ddf1a545d1 | ||
|
|
cbdf6400ae | ||
|
|
ceeecf9c66 | ||
|
|
7c53baea8a | ||
|
|
b549d354bf | ||
|
|
9f3113e1f6 | ||
|
|
ef94742320 | ||
|
|
d71764985d | ||
|
|
0fc04a269d | ||
|
|
8d7ac5f01c | ||
|
|
612acbe319 | ||
|
|
f3a487cd41 | ||
|
|
3a559521d1 | ||
|
|
567daf5a42 | ||
|
|
c7d913f436 | ||
|
|
7bd410c725 | ||
|
|
7fbc789855 | ||
|
|
7fc41a771a | ||
|
|
a31d82fec2 | ||
|
|
9ef4c80340 | ||
|
|
6a4e413758 | ||
|
|
678d4d189d | ||
|
|
718f718764 | ||
|
|
d860ded3f0 | ||
|
|
a141da8a20 | ||
|
|
aaaaee7a4b | ||
|
|
21efaf1fca | ||
|
|
2056623e13 | ||
|
|
34126ee704 | ||
|
|
980a338454 | ||
|
|
e14f766895 | ||
|
|
2e0731f479 | ||
|
|
addf62087c | ||
|
|
c24b68dc4f | ||
|
|
24677d7484 | ||
|
|
9e74c28158 | ||
|
|
b7aae33cc1 | ||
|
|
6d9d58278e | ||
|
|
1bc6fbda8c | ||
|
|
d39f5a85e6 | ||
|
|
d90a0eefbe | ||
|
|
2618c014a0 | ||
|
|
5c4878f37e | ||
|
|
bd6b169e98 | ||
|
|
5770336572 | ||
|
|
45daec7b37 | ||
|
|
ed5a7dc022 | ||
|
|
6fc7c77721 |
1
.github/workflows/PR-wip-checks.yaml
vendored
@@ -15,7 +15,6 @@ jobs:
|
||||
name: WIP Check
|
||||
steps:
|
||||
- name: WIP Check
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755
|
||||
with:
|
||||
labels: '["do-not-merge", "wip", "rfc"]'
|
||||
|
||||
22
.github/workflows/commit-message-check.yaml
vendored
@@ -10,7 +10,7 @@ env:
|
||||
error_msg: |+
|
||||
See the document below for help on formatting commits for the project.
|
||||
|
||||
https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format
|
||||
https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-forma
|
||||
|
||||
jobs:
|
||||
commit-message-check:
|
||||
@@ -18,32 +18,24 @@ jobs:
|
||||
name: Commit Message Check
|
||||
steps:
|
||||
- name: Get PR Commits
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
id: 'get-pr-commits'
|
||||
uses: tim-actions/get-pr-commits@v1.2.0
|
||||
uses: tim-actions/get-pr-commits@v1.0.0
|
||||
with:
|
||||
token: ${{ secrets.GITHUB_TOKEN }}
|
||||
# Filter out revert commits
|
||||
# The format of a revert commit is as follows:
|
||||
#
|
||||
# Revert "<original-subject-line>"
|
||||
#
|
||||
filter_out_pattern: '^Revert "'
|
||||
|
||||
- name: DCO Check
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
|
||||
- name: Commit Body Missing Check
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
if: ${{ success() || failure() }}
|
||||
uses: tim-actions/commit-body-check@v1.0.2
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
|
||||
- name: Check Subject Line Length
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
if: ${{ success() || failure() }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
@@ -52,7 +44,7 @@ jobs:
|
||||
post_error: ${{ env.error_msg }}
|
||||
|
||||
- name: Check Body Line Length
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
if: ${{ success() || failure() }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
@@ -79,7 +71,7 @@ jobs:
|
||||
post_error: ${{ env.error_msg }}
|
||||
|
||||
- name: Check Fixes
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
if: ${{ success() || failure() }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
@@ -90,7 +82,7 @@ jobs:
|
||||
one_pass_all_pass: 'true'
|
||||
|
||||
- name: Check Subsystem
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
if: ${{ success() || failure() }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
|
||||
25
.github/workflows/darwin-tests.yaml
vendored
@@ -1,25 +0,0 @@
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
|
||||
name: Darwin tests
|
||||
jobs:
|
||||
test:
|
||||
strategy:
|
||||
matrix:
|
||||
go-version: [1.16.x, 1.17.x]
|
||||
os: [macos-latest]
|
||||
runs-on: ${{ matrix.os }}
|
||||
steps:
|
||||
- name: Install Go
|
||||
uses: actions/setup-go@v2
|
||||
with:
|
||||
go-version: ${{ matrix.go-version }}
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v2
|
||||
- name: Build utils
|
||||
run: ./ci/darwin-test.sh
|
||||
18
.github/workflows/gather-artifacts.sh
vendored
Executable file
@@ -0,0 +1,18 @@
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -o errexit
|
||||
set -o pipefail
|
||||
|
||||
pushd kata-artifacts >>/dev/null
|
||||
for c in ./*.tar.gz
|
||||
do
|
||||
echo "untarring tarball $c"
|
||||
tar -xvf $c
|
||||
done
|
||||
|
||||
tar cvfJ ../kata-static.tar.xz ./opt
|
||||
popd >>/dev/null
|
||||
36
.github/workflows/generate-artifact-tarball.sh
vendored
Executable file
@@ -0,0 +1,36 @@
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -o errexit
|
||||
set -o pipefail
|
||||
|
||||
|
||||
main() {
|
||||
artifact_stage=${1:-}
|
||||
artifact=$(echo ${artifact_stage} | sed -n -e 's/^install_//p' | sed -r 's/_/-/g')
|
||||
if [ -z "${artifact}" ]; then
|
||||
"Scripts needs artifact name to build"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
export GOPATH=$HOME/go
|
||||
|
||||
go get github.com/kata-containers/packaging || true
|
||||
pushd $GOPATH/src/github.com/kata-containers/packaging/release >>/dev/null
|
||||
git checkout $tag
|
||||
pushd ../obs-packaging
|
||||
./gen_versions_txt.sh $tag
|
||||
popd
|
||||
|
||||
source ./kata-deploy-binaries.sh
|
||||
${artifact_stage} $tag
|
||||
popd
|
||||
|
||||
mv $HOME/go/src/github.com/kata-containers/packaging/release/kata-static-${artifact}.tar.gz .
|
||||
}
|
||||
|
||||
main $@
|
||||
34
.github/workflows/generate-local-artifact-tarball.sh
vendored
Executable file
@@ -0,0 +1,34 @@
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
# Copyright (c) 2020 Ant Group
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -o errexit
|
||||
set -o pipefail
|
||||
|
||||
|
||||
main() {
|
||||
artifact_stage=${1:-}
|
||||
artifact=$(echo ${artifact_stage} | sed -n -e 's/^install_//p' | sed -r 's/_/-/g')
|
||||
if [ -z "${artifact}" ]; then
|
||||
"Scripts needs artifact name to build"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
pushd $GITHUB_WORKSPACE/tools/packaging
|
||||
git checkout $tag
|
||||
./scripts/gen_versions_txt.sh $tag
|
||||
popd
|
||||
|
||||
pushd $GITHUB_WORKSPACE/tools/packaging/release
|
||||
source ./kata-deploy-binaries.sh
|
||||
${artifact_stage} $tag
|
||||
popd
|
||||
|
||||
mv $GITHUB_WORKSPACE/tools/packaging/release/kata-static-${artifact}.tar.gz .
|
||||
}
|
||||
|
||||
main $@
|
||||
83
.github/workflows/kata-deploy-push.yaml
vendored
@@ -1,83 +0,0 @@
|
||||
name: kata deploy build
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
paths:
|
||||
- tools/**
|
||||
- versions.yaml
|
||||
|
||||
jobs:
|
||||
build-asset:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- kernel
|
||||
- shim-v2
|
||||
- qemu
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: Install docker
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make "${KATA_ASSET}-tarball"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r --preserve=all "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: build
|
||||
- name: merge-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make merge-builds
|
||||
- name: store-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
path: kata-static.tar.xz
|
||||
|
||||
make-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: make kata-tarball
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make kata-tarball
|
||||
sudo make install-tarball
|
||||
153
.github/workflows/kata-deploy-test.yaml
vendored
@@ -1,115 +1,30 @@
|
||||
on:
|
||||
workflow_dispatch: # this is used to trigger the workflow on non-main branches
|
||||
issue_comment:
|
||||
types: [created, edited]
|
||||
|
||||
name: test-kata-deploy
|
||||
|
||||
jobs:
|
||||
check-comment-and-membership:
|
||||
check_comments:
|
||||
if: ${{ github.event.issue.pull_request }}
|
||||
runs-on: ubuntu-latest
|
||||
if: |
|
||||
github.event.issue.pull_request
|
||||
&& github.event_name == 'issue_comment'
|
||||
&& github.event.action == 'created'
|
||||
&& startsWith(github.event.comment.body, '/test_kata_deploy')
|
||||
steps:
|
||||
- name: Check membership
|
||||
uses: kata-containers/is-organization-member@1.0.1
|
||||
id: is_organization_member
|
||||
- name: Check for Command
|
||||
id: command
|
||||
uses: kata-containers/slash-command-action@v1
|
||||
with:
|
||||
organization: kata-containers
|
||||
username: ${{ github.event.comment.user.login }}
|
||||
token: ${{ secrets.GITHUB_TOKEN }}
|
||||
- name: Fail if not member
|
||||
repo-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
command: "test_kata_deploy"
|
||||
reaction: "true"
|
||||
reaction-type: "eyes"
|
||||
allow-edits: "false"
|
||||
permission-level: admin
|
||||
- name: verify command arg is kata-deploy
|
||||
run: |
|
||||
result=${{ steps.is_organization_member.outputs.result }}
|
||||
if [ $result == false ]; then
|
||||
user=${{ github.event.comment.user.login }}
|
||||
echo Either ${user} is not part of the kata-containers organization
|
||||
echo or ${user} has its Organization Visibility set to Private at
|
||||
echo https://github.com/orgs/kata-containers/people?query=${user}
|
||||
echo
|
||||
echo Ensure you change your Organization Visibility to Public and
|
||||
echo trigger the test again.
|
||||
exit 1
|
||||
fi
|
||||
echo "The command was '${{ steps.command.outputs.command-name }}' with arguments '${{ steps.command.outputs.command-arguments }}'"
|
||||
|
||||
build-asset:
|
||||
runs-on: ubuntu-latest
|
||||
needs: check-comment-and-membership
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- kernel
|
||||
- qemu
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
- shim-v2
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
id: get-PR-ref
|
||||
run: |
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
|
||||
- name: Install docker
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
run: |
|
||||
make "${KATA_ASSET}-tarball"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
id: get-PR-ref
|
||||
run: |
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-artifacts
|
||||
- name: merge-artifacts
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: create-kata-tarball
|
||||
create-and-test-container:
|
||||
needs: check_comments
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
@@ -118,30 +33,30 @@ jobs:
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
|
||||
- name: check out
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
- name: get-kata-tarball
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
|
||||
- name: build-container-image
|
||||
id: build-container-image
|
||||
run: |
|
||||
PR_SHA=$(git log --format=format:%H -n1)
|
||||
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t quay.io/kata-containers/kata-deploy-ci:$PR_SHA $GITHUB_WORKSPACE/tools/packaging/kata-deploy
|
||||
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
|
||||
docker push quay.io/kata-containers/kata-deploy-ci:$PR_SHA
|
||||
mkdir -p packaging/kata-deploy
|
||||
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
|
||||
echo "::set-output name=PKG_SHA::${PR_SHA}"
|
||||
PR_SHA=$(git log --format=format:%H -n1)
|
||||
VERSION="2.0.0"
|
||||
ARTIFACT_URL="https://github.com/kata-containers/kata-containers/releases/download/${VERSION}/kata-static-${VERSION}-x86_64.tar.xz"
|
||||
wget "${ARTIFACT_URL}" -O tools/packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:${PR_SHA} ./tools/packaging/kata-deploy
|
||||
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
|
||||
docker push katadocker/kata-deploy-ci:$PR_SHA
|
||||
echo "##[set-output name=pr-sha;]${PR_SHA}"
|
||||
|
||||
- name: test-kata-deploy-ci-in-aks
|
||||
uses: ./packaging/kata-deploy/action
|
||||
uses: ./tools/packaging/kata-deploy/action
|
||||
with:
|
||||
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
packaging-sha: ${{ steps.build-container-image.outputs.pr-sha }}
|
||||
env:
|
||||
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
PKG_SHA: ${{ steps.build-container-image.outputs.pr-sha }}
|
||||
AZ_APPID: ${{ secrets.AZ_APPID }}
|
||||
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
|
||||
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
|
||||
|
||||
293
.github/workflows/main.yaml
vendored
Normal file
@@ -0,0 +1,293 @@
|
||||
name: Publish release tarball
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- '1.*'
|
||||
|
||||
jobs:
|
||||
get-artifact-list:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get the list
|
||||
run: |
|
||||
pushd $GITHUB_WORKSPACE
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
git checkout $tag
|
||||
popd
|
||||
$GITHUB_WORKSPACE/tools/packaging/artifact-list.sh > artifact-list.txt
|
||||
- name: save-artifact-list
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
path: artifact-list.txt
|
||||
|
||||
build-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-kernel
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-kernel.tar.gz
|
||||
|
||||
build-experimental-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_experimental_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-experimental-kernel
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-experimental-kernel.tar.gz
|
||||
|
||||
build-qemu:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_qemu"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-qemu
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-qemu.tar.gz
|
||||
|
||||
# Job for building the image
|
||||
build-image:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_image"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-image
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-image.tar.gz
|
||||
|
||||
# Job for building firecracker hypervisor
|
||||
build-firecracker:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_firecracker"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-firecracker
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-firecracker.tar.gz
|
||||
|
||||
# Job for building cloud-hypervisor
|
||||
build-clh:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_clh"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-clh
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-clh.tar.gz
|
||||
|
||||
# Job for building kata components
|
||||
build-kata-components:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kata_components"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-kata-components
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-kata-components.tar.gz
|
||||
|
||||
gather-artifacts:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: [build-experimental-kernel, build-kernel, build-qemu, build-image, build-firecracker, build-kata-components, build-clh]
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: kata-artifacts
|
||||
- name: colate-artifacts
|
||||
run: |
|
||||
$GITHUB_WORKSPACE/.github/workflows/gather-artifacts.sh
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@master
|
||||
with:
|
||||
name: release-candidate
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: gather-artifacts
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: release-candidate
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
git clone https://github.com/kata-containers/packaging
|
||||
pushd packaging
|
||||
git checkout $tag
|
||||
pkg_sha=$(git rev-parse HEAD)
|
||||
popd
|
||||
mv release-candidate/kata-static.tar.xz ./packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha ./packaging/kata-deploy
|
||||
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
|
||||
docker push katadocker/kata-deploy-ci:$pkg_sha
|
||||
echo "::set-output name=PKG_SHA::${pkg_sha}"
|
||||
- name: test-kata-deploy-ci-in-aks
|
||||
uses: ./packaging/kata-deploy/action
|
||||
with:
|
||||
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
env:
|
||||
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
AZ_APPID: ${{ secrets.AZ_APPID }}
|
||||
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
|
||||
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
|
||||
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
|
||||
- name: push-tarball
|
||||
run: |
|
||||
# tag the container image we created and push to DockerHub
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag}
|
||||
docker push katadocker/kata-deploy:${tag}
|
||||
|
||||
upload-static-tarball:
|
||||
needs: kata-deploy
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: download-artifacts
|
||||
uses: actions/download-artifact@master
|
||||
with:
|
||||
name: release-candidate
|
||||
- name: install hub
|
||||
run: |
|
||||
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')
|
||||
wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub
|
||||
- name: push static tarball to github
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tarball="kata-static-$tag-x86_64.tar.xz"
|
||||
repo="https://github.com/kata-containers/runtime.git"
|
||||
mv release-candidate/kata-static.tar.xz "release-candidate/${tarball}"
|
||||
git clone "${repo}"
|
||||
cd runtime
|
||||
echo "uploading asset '${tarball}' to '${repo}' tag: ${tag}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "../release-candidate/${tarball}" "${tag}"
|
||||
@@ -16,7 +16,6 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Install hub
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
HUB_ARCH="amd64"
|
||||
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
|
||||
@@ -27,7 +26,6 @@ jobs:
|
||||
sudo install hub /usr/local/bin
|
||||
|
||||
- name: Install hub extension script
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
# Clone into a temporary directory to avoid overwriting
|
||||
# any existing github directory.
|
||||
@@ -37,11 +35,9 @@ jobs:
|
||||
popd &>/dev/null
|
||||
|
||||
- name: Checkout code to allow hub to communicate with the project
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
|
||||
- name: Move issue to "In progress"
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
run: |
|
||||
|
||||
298
.github/workflows/release.yaml
vendored
@@ -5,46 +5,213 @@ on:
|
||||
- '2.*'
|
||||
|
||||
jobs:
|
||||
build-asset:
|
||||
get-artifact-list:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- kernel
|
||||
- qemu
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
- shim-v2
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: Install docker
|
||||
- name: get the list
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
pushd $GITHUB_WORKSPACE
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
git checkout $tag
|
||||
popd
|
||||
$GITHUB_WORKSPACE/tools/packaging/artifact-list.sh > artifact-list.txt
|
||||
- name: save-artifact-list
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
path: artifact-list.txt
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
build-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-kernel
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-copy-yq-installer.sh
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
path: kata-static-kernel.tar.gz
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
build-experimental-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_experimental_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-experimental-kernel
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-experimental-kernel.tar.gz
|
||||
|
||||
build-qemu:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_qemu"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-qemu
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-qemu.tar.gz
|
||||
|
||||
build-image:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_image"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-image
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-image.tar.gz
|
||||
|
||||
build-firecracker:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_firecracker"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-firecracker
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-firecracker.tar.gz
|
||||
|
||||
|
||||
build-clh:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_clh"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-clh
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-clh.tar.gz
|
||||
|
||||
build-kata-components:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kata_components"
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-kata-components
|
||||
run: |
|
||||
if grep -q $buildstr artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-local-artifact-tarball.sh $buildstr
|
||||
echo "artifact-built=true" >> $GITHUB_ENV
|
||||
else
|
||||
echo "artifact-built=false" >> $GITHUB_ENV
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-kata-components.tar.gz
|
||||
|
||||
gather-artifacts:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: [build-experimental-kernel, build-kernel, build-qemu, build-image, build-firecracker, build-kata-components, build-clh]
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifacts
|
||||
@@ -52,24 +219,24 @@ jobs:
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-artifacts
|
||||
- name: merge-artifacts
|
||||
- name: colate-artifacts
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
|
||||
$GITHUB_WORKSPACE/.github/workflows/gather-artifacts.sh
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
name: release-candidate
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: create-kata-tarball
|
||||
needs: gather-artifacts
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-kata-tarball
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
name: release-candidate
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
run: |
|
||||
@@ -79,11 +246,9 @@ jobs:
|
||||
pkg_sha=$(git rev-parse HEAD)
|
||||
popd
|
||||
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
|
||||
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
|
||||
docker push katadocker/kata-deploy-ci:$pkg_sha
|
||||
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
|
||||
docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha
|
||||
mkdir -p packaging/kata-deploy
|
||||
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
|
||||
echo "::set-output name=PKG_SHA::${pkg_sha}"
|
||||
@@ -101,14 +266,8 @@ jobs:
|
||||
run: |
|
||||
# tag the container image we created and push to DockerHub
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tags=($tag)
|
||||
tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))
|
||||
for tag in ${tags[@]}; do \
|
||||
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag} && \
|
||||
docker tag quay.io/kata-containers/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} quay.io/kata-containers/kata-deploy:${tag} && \
|
||||
docker push katadocker/kata-deploy:${tag} && \
|
||||
docker push quay.io/kata-containers/kata-deploy:${tag}; \
|
||||
done
|
||||
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag}
|
||||
docker push katadocker/kata-deploy:${tag}
|
||||
|
||||
upload-static-tarball:
|
||||
needs: kata-deploy
|
||||
@@ -118,7 +277,7 @@ jobs:
|
||||
- name: download-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
name: release-candidate
|
||||
- name: install hub
|
||||
run: |
|
||||
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')
|
||||
@@ -132,46 +291,3 @@ jobs:
|
||||
pushd $GITHUB_WORKSPACE
|
||||
echo "uploading asset '${tarball}' for tag: ${tag}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
popd
|
||||
|
||||
upload-cargo-vendored-tarball:
|
||||
needs: upload-static-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: generate-and-upload-tarball
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tarball="kata-containers-$tag-vendor.tar.gz"
|
||||
pushd $GITHUB_WORKSPACE
|
||||
bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
popd
|
||||
|
||||
upload-libseccomp-tarball:
|
||||
needs: upload-cargo-vendored-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: download-and-upload-tarball
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}
|
||||
GOPATH: ${HOME}/go
|
||||
run: |
|
||||
pushd $GITHUB_WORKSPACE
|
||||
./ci/install_yq.sh
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
versions_yaml="versions.yaml"
|
||||
version=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.version")
|
||||
repo_url=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.url")
|
||||
download_url="${repo_url}/releases/download/v${version}"
|
||||
tarball="libseccomp-${version}.tar.gz"
|
||||
asc="${tarball}.asc"
|
||||
curl -sSLO "${download_url}/${tarball}"
|
||||
curl -sSLO "${download_url}/${asc}"
|
||||
# "-m" option should be empty to re-use the existing release title
|
||||
# without opening a text editor.
|
||||
# For the details, check https://hub.github.com/hub-release.1.html.
|
||||
hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
hub release edit -m "" -a "${asc}" "${tag}"
|
||||
popd
|
||||
|
||||
@@ -12,15 +12,12 @@ on:
|
||||
- reopened
|
||||
- labeled
|
||||
- unlabeled
|
||||
branches:
|
||||
- main
|
||||
|
||||
jobs:
|
||||
check-pr-porting-labels:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Install hub
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
HUB_ARCH="amd64"
|
||||
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
|
||||
@@ -31,8 +28,9 @@ jobs:
|
||||
sudo install hub /usr/local/bin
|
||||
|
||||
- name: Checkout code to allow hub to communicate with the project
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
token: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
|
||||
- name: Install porting checker script
|
||||
run: |
|
||||
@@ -44,7 +42,6 @@ jobs:
|
||||
popd &>/dev/null
|
||||
|
||||
- name: Stop PR being merged unless it has a correct set of porting labels
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
run: |
|
||||
|
||||
6
.github/workflows/snap-release.yaml
vendored
@@ -9,8 +9,6 @@ jobs:
|
||||
steps:
|
||||
- name: Check out Git repository
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
|
||||
- name: Install Snapcraft
|
||||
uses: samuelmeuli/action-snapcraft@v1
|
||||
@@ -21,7 +19,7 @@ jobs:
|
||||
run: |
|
||||
sudo apt-get install -y git git-extras
|
||||
kata_url="https://github.com/kata-containers/kata-containers"
|
||||
latest_version=$(git ls-remote --tags ${kata_url} | egrep -o "refs.*" | egrep -v "\-alpha|\-rc|{}" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r | head -1)
|
||||
latest_version=$(git ls-remote --tags ${kata_url} | egrep -o "refs.*" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r -u | head -1)
|
||||
current_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"
|
||||
# Check semantic versioning format (x.y.z) and if the current tag is the latest tag
|
||||
if echo "${current_version}" | grep -q "^[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+$" && echo -e "$latest_version\n$current_version" | sort -C -V; then
|
||||
@@ -35,5 +33,5 @@ jobs:
|
||||
snap_file="kata-containers_${snap_version}_amd64.snap"
|
||||
# Upload the snap if it exists
|
||||
if [ -f ${snap_file} ]; then
|
||||
snapcraft upload --release=stable ${snap_file}
|
||||
snapcraft upload --release=candidate ${snap_file}
|
||||
fi
|
||||
|
||||
14
.github/workflows/snap.yaml
vendored
@@ -1,27 +1,15 @@
|
||||
name: snap CI
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- synchronize
|
||||
- reopened
|
||||
- edited
|
||||
|
||||
on: ["pull_request"]
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-20.04
|
||||
steps:
|
||||
- name: Check out
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
|
||||
- name: Install Snapcraft
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: samuelmeuli/action-snapcraft@v1
|
||||
|
||||
- name: Build snap
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
snapcraft -d snap --destructive-mode
|
||||
|
||||
96
.github/workflows/static-checks.yaml
vendored
@@ -1,96 +0,0 @@
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
|
||||
name: Static checks
|
||||
jobs:
|
||||
test:
|
||||
strategy:
|
||||
matrix:
|
||||
go-version: [1.16.x, 1.17.x]
|
||||
os: [ubuntu-20.04]
|
||||
runs-on: ${{ matrix.os }}
|
||||
env:
|
||||
TRAVIS: "true"
|
||||
TRAVIS_BRANCH: ${{ github.base_ref }}
|
||||
TRAVIS_PULL_REQUEST_BRANCH: ${{ github.head_ref }}
|
||||
TRAVIS_PULL_REQUEST_SHA : ${{ github.event.pull_request.head.sha }}
|
||||
RUST_BACKTRACE: "1"
|
||||
target_branch: ${{ github.base_ref }}
|
||||
steps:
|
||||
- name: Install Go
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/setup-go@v2
|
||||
with:
|
||||
go-version: ${{ matrix.go-version }}
|
||||
env:
|
||||
GOPATH: ${{ runner.workspace }}/kata-containers
|
||||
- name: Setup GOPATH
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "TRAVIS_BRANCH: ${TRAVIS_BRANCH}"
|
||||
echo "TRAVIS_PULL_REQUEST_BRANCH: ${TRAVIS_PULL_REQUEST_BRANCH}"
|
||||
echo "TRAVIS_PULL_REQUEST_SHA: ${TRAVIS_PULL_REQUEST_SHA}"
|
||||
echo "TRAVIS: ${TRAVIS}"
|
||||
- name: Set env
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
|
||||
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
|
||||
- name: Checkout code
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
path: ./src/github.com/${{ github.repository }}
|
||||
- name: Setup travis references
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "TRAVIS_BRANCH=${TRAVIS_BRANCH:-$(echo $GITHUB_REF | awk 'BEGIN { FS = \"/\" } ; { print $3 }')}"
|
||||
target_branch=${TRAVIS_BRANCH}
|
||||
- name: Setup
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
|
||||
env:
|
||||
GOPATH: ${{ runner.workspace }}/kata-containers
|
||||
- name: Installing rust
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh
|
||||
PATH=$PATH:"$HOME/.cargo/bin"
|
||||
rustup target add x86_64-unknown-linux-musl
|
||||
rustup component add rustfmt clippy
|
||||
- name: Setup seccomp
|
||||
run: |
|
||||
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
|
||||
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
|
||||
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
|
||||
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
|
||||
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
|
||||
# Check whether the vendored code is up-to-date & working as the first thing
|
||||
- name: Check vendored code
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor
|
||||
- name: Static Checks
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks
|
||||
- name: Run Compiler Checks
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make check
|
||||
- name: Run Unit Tests
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make test
|
||||
- name: Run Unit Tests As Root User
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && sudo -E PATH="$PATH" make test
|
||||
3
.gitignore
vendored
@@ -1,13 +1,10 @@
|
||||
**/*.bk
|
||||
**/*~
|
||||
**/*.orig
|
||||
**/*.rej
|
||||
**/target
|
||||
**/.vscode
|
||||
pkg/logging/Cargo.lock
|
||||
src/agent/src/version.rs
|
||||
src/agent/kata-agent.service
|
||||
src/agent/protocols/src/*.rs
|
||||
!src/agent/protocols/src/lib.rs
|
||||
build
|
||||
|
||||
|
||||
62
.travis.yml
Normal file
@@ -0,0 +1,62 @@
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
dist: bionic
|
||||
os: linux
|
||||
|
||||
# set cache directories manually, because
|
||||
# we are using a non-standard directory struct
|
||||
# cargo root is in srs/agent
|
||||
#
|
||||
# If needed, caches can be cleared
|
||||
# by ways documented in
|
||||
# https://docs.travis-ci.com/user/caching#clearing-caches
|
||||
language: rust
|
||||
rust:
|
||||
- 1.44.1
|
||||
cache:
|
||||
cargo: true
|
||||
directories:
|
||||
- src/agent/target
|
||||
|
||||
before_install:
|
||||
- git remote set-branches --add origin "${TRAVIS_BRANCH}"
|
||||
- git fetch
|
||||
- export RUST_BACKTRACE=1
|
||||
- export target_branch=$TRAVIS_BRANCH
|
||||
- "ci/setup.sh"
|
||||
|
||||
# we use install to run check agent
|
||||
# so that it is easy to skip for non-amd64 platform
|
||||
install:
|
||||
- export PATH=$PATH:"$HOME/.cargo/bin"
|
||||
- export RUST_AGENT=yes
|
||||
- rustup target add x86_64-unknown-linux-musl
|
||||
- sudo ln -sf /usr/bin/g++ /bin/musl-g++
|
||||
- rustup component add rustfmt
|
||||
- make -C ${TRAVIS_BUILD_DIR}/src/agent
|
||||
- make -C ${TRAVIS_BUILD_DIR}/src/agent check
|
||||
- sudo -E PATH=$PATH make -C ${TRAVIS_BUILD_DIR}/src/agent check
|
||||
|
||||
before_script:
|
||||
- "ci/install_go.sh"
|
||||
- make -C ${TRAVIS_BUILD_DIR}/src/runtime
|
||||
- make -C ${TRAVIS_BUILD_DIR}/src/runtime test
|
||||
- sudo -E PATH=$PATH GOPATH=$GOPATH make -C ${TRAVIS_BUILD_DIR}/src/runtime test
|
||||
|
||||
script:
|
||||
- "ci/static-checks.sh"
|
||||
|
||||
jobs:
|
||||
include:
|
||||
- name: x86_64 test
|
||||
os: linux
|
||||
- name: ppc64le test
|
||||
os: linux-ppc64le
|
||||
install: skip
|
||||
script: skip
|
||||
allow_failures:
|
||||
- name: ppc64le test
|
||||
fast_finish: true
|
||||
@@ -2,4 +2,4 @@
|
||||
|
||||
## This repo is part of [Kata Containers](https://katacontainers.io)
|
||||
|
||||
For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).
|
||||
For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md).
|
||||
@@ -1,3 +0,0 @@
|
||||
# Glossary
|
||||
|
||||
See the [project glossary hosted in the wiki](https://github.com/kata-containers/kata-containers/wiki/Glossary).
|
||||
26
Makefile
@@ -8,24 +8,18 @@ COMPONENTS =
|
||||
|
||||
COMPONENTS += agent
|
||||
COMPONENTS += runtime
|
||||
COMPONENTS += trace-forwarder
|
||||
|
||||
# List of available tools
|
||||
TOOLS =
|
||||
|
||||
TOOLS += agent-ctl
|
||||
TOOLS += trace-forwarder
|
||||
|
||||
STANDARD_TARGETS = build check clean install test vendor
|
||||
|
||||
default: all
|
||||
|
||||
all: logging-crate-tests build
|
||||
|
||||
logging-crate-tests:
|
||||
make -C src/libs/logging
|
||||
STANDARD_TARGETS = build check clean install test
|
||||
|
||||
include utils.mk
|
||||
include ./tools/packaging/kata-deploy/local-build/Makefile
|
||||
|
||||
all: build
|
||||
|
||||
# Create the rules
|
||||
$(eval $(call create_all_rules,$(COMPONENTS),$(TOOLS),$(STANDARD_TARGETS)))
|
||||
@@ -35,14 +29,4 @@ $(eval $(call create_all_rules,$(COMPONENTS),$(TOOLS),$(STANDARD_TARGETS)))
|
||||
generate-protocols:
|
||||
make -C src/agent generate-protocols
|
||||
|
||||
# Some static checks rely on generated source files of components.
|
||||
static-checks: build
|
||||
bash ci/static-checks.sh
|
||||
|
||||
.PHONY: \
|
||||
all \
|
||||
binary-tarball \
|
||||
default \
|
||||
install-binary-tarball \
|
||||
logging-crate-tests \
|
||||
static-checks
|
||||
.PHONY: all default
|
||||
|
||||
223
README.md
@@ -2,145 +2,130 @@
|
||||
|
||||
# Kata Containers
|
||||
|
||||
* [Raising issues](#raising-issues)
|
||||
* [Kata Containers repositories](#kata-containers-repositories)
|
||||
* [Code Repositories](#code-repositories)
|
||||
* [Kata Containers-developed components](#kata-containers-developed-components)
|
||||
* [Agent](#agent)
|
||||
* [KSM throttler](#ksm-throttler)
|
||||
* [Runtime](#runtime)
|
||||
* [Trace forwarder](#trace-forwarder)
|
||||
* [Additional](#additional)
|
||||
* [Kernel](#kernel)
|
||||
* [CI](#ci)
|
||||
* [Community](#community)
|
||||
* [Documentation](#documentation)
|
||||
* [Packaging](#packaging)
|
||||
* [Test code](#test-code)
|
||||
* [Utilities](#utilities)
|
||||
* [OS builder](#os-builder)
|
||||
* [Web content](#web-content)
|
||||
|
||||
---
|
||||
|
||||
Welcome to Kata Containers!
|
||||
|
||||
This repository is the home of the Kata Containers code for the 2.0 and newer
|
||||
releases.
|
||||
The purpose of this repository is to act as a "top level" site for the project. Specifically it is used:
|
||||
|
||||
If you want to learn about Kata Containers, visit the main
|
||||
[Kata Containers website](https://katacontainers.io).
|
||||
- To provide a list of the various *other* [Kata Containers repositories](#kata-containers-repositories),
|
||||
along with a brief explanation of their purpose.
|
||||
|
||||
## Introduction
|
||||
- To provide a general area for [Raising Issues](#raising-issues).
|
||||
|
||||
Kata Containers is an open source project and community working to build a
|
||||
standard implementation of lightweight Virtual Machines (VMs) that feel and
|
||||
perform like containers, but provide the workload isolation and security
|
||||
advantages of VMs.
|
||||
## Raising issues
|
||||
|
||||
## License
|
||||
This repository is used for [raising
|
||||
issues](https://github.com/kata-containers/kata-containers/issues/new):
|
||||
|
||||
The code is licensed under the Apache 2.0 license.
|
||||
See [the license file](LICENSE) for further details.
|
||||
- That might affect multiple code repositories.
|
||||
|
||||
## Platform support
|
||||
|
||||
Kata Containers currently runs on 64-bit systems supporting the following
|
||||
technologies:
|
||||
|
||||
| Architecture | Virtualization technology |
|
||||
|-|-|
|
||||
| `x86_64`, `amd64` | [Intel](https://www.intel.com) VT-x, AMD SVM |
|
||||
| `aarch64` ("`arm64`")| [ARM](https://www.arm.com) Hyp |
|
||||
| `ppc64le` | [IBM](https://www.ibm.com) Power |
|
||||
| `s390x` | [IBM](https://www.ibm.com) Z & LinuxONE SIE |
|
||||
|
||||
### Hardware requirements
|
||||
|
||||
The [Kata Containers runtime](src/runtime) provides a command to
|
||||
determine if your host system is capable of running and creating a
|
||||
Kata Container:
|
||||
|
||||
```bash
|
||||
$ kata-runtime check
|
||||
```
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - This command runs a number of checks including connecting to the
|
||||
> network to determine if a newer release of Kata Containers is
|
||||
> available on GitHub. If you do not wish this to check to run, add
|
||||
> the `--no-network-checks` option.
|
||||
>
|
||||
> - By default, only a brief success / failure message is printed.
|
||||
> If more details are needed, the `--verbose` flag can be used to display the
|
||||
> list of all the checks performed.
|
||||
>
|
||||
> - If the command is run as the `root` user additional checks are
|
||||
> run (including checking if another incompatible hypervisor is running).
|
||||
> When running as `root`, network checks are automatically disabled.
|
||||
|
||||
## Getting started
|
||||
|
||||
See the [installation documentation](docs/install).
|
||||
|
||||
## Documentation
|
||||
|
||||
See the [official documentation](docs) including:
|
||||
|
||||
- [Installation guides](docs/install)
|
||||
- [Developer guide](docs/Developer-Guide.md)
|
||||
- [Design documents](docs/design)
|
||||
- [Architecture overview](docs/design/architecture)
|
||||
|
||||
## Configuration
|
||||
|
||||
Kata Containers uses a single
|
||||
[configuration file](src/runtime/README.md#configuration)
|
||||
which contains a number of sections for various parts of the Kata
|
||||
Containers system including the [runtime](src/runtime), the
|
||||
[agent](src/agent) and the [hypervisor](#hypervisors).
|
||||
|
||||
## Hypervisors
|
||||
|
||||
See the [hypervisors document](docs/hypervisors.md) and the
|
||||
[Hypervisor specific configuration details](src/runtime/README.md#hypervisor-specific-configuration).
|
||||
|
||||
## Community
|
||||
|
||||
To learn more about the project, its community and governance, see the
|
||||
[community repository](https://github.com/kata-containers/community). This is
|
||||
the first place to go if you wish to contribute to the project.
|
||||
|
||||
## Getting help
|
||||
|
||||
See the [community](#community) section for ways to contact us.
|
||||
|
||||
### Raising issues
|
||||
|
||||
Please raise an issue
|
||||
[in this repository](https://github.com/kata-containers/kata-containers/issues).
|
||||
- Where the raiser is unsure which repositories are affected.
|
||||
|
||||
> **Note:**
|
||||
> If you are reporting a security issue, please follow the [vulnerability reporting process](https://github.com/kata-containers/community#vulnerability-handling)
|
||||
>
|
||||
> - If an issue affects only a single component, it should be raised in that
|
||||
> components repository.
|
||||
|
||||
## Developers
|
||||
## Kata Containers repositories
|
||||
|
||||
See the [developer guide](docs/Developer-Guide.md).
|
||||
### CI
|
||||
|
||||
### Components
|
||||
The [CI](https://github.com/kata-containers/ci) repository stores the Continuous
|
||||
Integration (CI) system configuration information.
|
||||
|
||||
### Main components
|
||||
### Community
|
||||
|
||||
The table below lists the core parts of the project:
|
||||
The [Community](https://github.com/kata-containers/community) repository is
|
||||
the first place to go if you want to use or contribute to the project.
|
||||
|
||||
| Component | Type | Description |
|
||||
|-|-|-|
|
||||
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
|
||||
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
|
||||
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
|
||||
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
|
||||
### Code Repositories
|
||||
|
||||
### Additional components
|
||||
#### Kata Containers-developed components
|
||||
|
||||
The table below lists the remaining parts of the project:
|
||||
##### Agent
|
||||
|
||||
| Component | Type | Description |
|
||||
|-|-|-|
|
||||
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
|
||||
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
|
||||
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
|
||||
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
|
||||
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
|
||||
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
|
||||
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
|
||||
The [`kata-agent`](src/agent/README.md) runs inside the
|
||||
virtual machine and sets up the container environment.
|
||||
|
||||
### Packaging and releases
|
||||
##### KSM throttler
|
||||
|
||||
Kata Containers is now
|
||||
[available natively for most distributions](docs/install/README.md#packaged-installation-methods).
|
||||
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
|
||||
the [components](#components) section for further details.
|
||||
The [`kata-ksm-throttler`](https://github.com/kata-containers/ksm-throttler)
|
||||
is an optional utility that monitors containers and deduplicates memory to
|
||||
maximize container density on a host.
|
||||
|
||||
## Glossary of Terms
|
||||
##### Runtime
|
||||
|
||||
See the [glossary of terms](https://github.com/kata-containers/kata-containers/wiki/Glossary) related to Kata Containers.
|
||||
The [`kata-runtime`](src/runtime/README.md) is usually
|
||||
invoked by a container manager and provides high-level verbs to manage
|
||||
containers.
|
||||
|
||||
##### Trace forwarder
|
||||
|
||||
The [`kata-trace-forwarder`](src/trace-forwarder) is a component only used
|
||||
when tracing the [agent](#agent) process.
|
||||
|
||||
#### Additional
|
||||
|
||||
##### Kernel
|
||||
|
||||
The hypervisor uses a [Linux\* kernel](https://github.com/kata-containers/linux) to boot the guest image.
|
||||
|
||||
### Documentation
|
||||
|
||||
The [docs](docs/README.md) directory holds documentation common to all code components.
|
||||
|
||||
### Packaging
|
||||
|
||||
We use the [packaging](tools/packaging/README.md) to create packages for the [system
|
||||
components](#kata-containers-developed-components) including
|
||||
[rootfs](#os-builder) and [kernel](#kernel) images.
|
||||
|
||||
### Test code
|
||||
|
||||
The [tests](https://github.com/kata-containers/tests) repository hosts all
|
||||
test code except the unit testing code (which is kept in the same repository
|
||||
as the component it tests).
|
||||
|
||||
### Utilities
|
||||
|
||||
#### OS builder
|
||||
|
||||
The [osbuilder](tools/osbuilder/README.md) tool can create
|
||||
a rootfs and a "mini O/S" image. This image is used by the hypervisor to setup
|
||||
the environment before switching to the workload.
|
||||
|
||||
#### `kata-agent-ctl`
|
||||
|
||||
[`kata-agent-ctl`](tools/agent-ctl) is a low-level test tool for
|
||||
interacting with the agent.
|
||||
|
||||
### Web content
|
||||
|
||||
The
|
||||
[www.katacontainers.io](https://github.com/kata-containers/www.katacontainers.io)
|
||||
repository contains all sources for the https://www.katacontainers.io site.
|
||||
|
||||
## Credits
|
||||
|
||||
Kata Containers uses [packagecloud](https://packagecloud.io) for package
|
||||
hosting.
|
||||
|
||||
@@ -1,42 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2022 Apple Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
cidir=$(dirname "$0")
|
||||
runtimedir=$cidir/../src/runtime
|
||||
|
||||
build_working_packages() {
|
||||
# working packages:
|
||||
device_api=$runtimedir/virtcontainers/device/api
|
||||
device_config=$runtimedir/virtcontainers/device/config
|
||||
device_drivers=$runtimedir/virtcontainers/device/drivers
|
||||
device_manager=$runtimedir/virtcontainers/device/manager
|
||||
rc_pkg_dir=$runtimedir/pkg/resourcecontrol/
|
||||
utils_pkg_dir=$runtimedir/virtcontainers/utils
|
||||
|
||||
# broken packages :( :
|
||||
#katautils=$runtimedir/pkg/katautils
|
||||
#oci=$runtimedir/pkg/oci
|
||||
#vc=$runtimedir/virtcontainers
|
||||
|
||||
pkgs=(
|
||||
"$device_api"
|
||||
"$device_config"
|
||||
"$device_drivers"
|
||||
"$device_manager"
|
||||
"$utils_pkg_dir"
|
||||
"$rc_pkg_dir")
|
||||
for pkg in "${pkgs[@]}"; do
|
||||
echo building "$pkg"
|
||||
pushd "$pkg" &>/dev/null
|
||||
go build
|
||||
go test
|
||||
popd &>/dev/null
|
||||
done
|
||||
}
|
||||
|
||||
build_working_packages
|
||||
30
ci/go-no-os-exit.sh
Executable file
@@ -0,0 +1,30 @@
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
# Check there are no os.Exit() calls creeping into the code
|
||||
# We don't use that exit path in the Kata codebase.
|
||||
|
||||
# Allow the path to check to be over-ridden.
|
||||
# Default to the current directory.
|
||||
go_packages=${1:-.}
|
||||
|
||||
echo "Checking for no os.Exit() calls for package [${go_packages}]"
|
||||
|
||||
candidates=`go list -f '{{.Dir}}/*.go' $go_packages`
|
||||
for f in $candidates; do
|
||||
filename=`basename $f`
|
||||
# skip all go test files
|
||||
[[ $filename == *_test.go ]] && continue
|
||||
# skip exit.go where, the only file we should call os.Exit() from.
|
||||
[[ $filename == "exit.go" ]] && continue
|
||||
files="$f $files"
|
||||
done
|
||||
|
||||
[ -z "$files" ] && echo "No files to check, skipping" && exit 0
|
||||
|
||||
if egrep -n '\<os\.Exit\>' $files; then
|
||||
echo "Direct calls to os.Exit() are forbidden, please use exit() so atexit() works"
|
||||
exit 1
|
||||
fi
|
||||
@@ -1,4 +1,3 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2020 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,109 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright 2021 Sony Group Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -o errexit
|
||||
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
|
||||
clone_tests_repo
|
||||
|
||||
source "${tests_repo_dir}/.ci/lib.sh"
|
||||
|
||||
# The following variables if set on the environment will change the behavior
|
||||
# of gperf and libseccomp configure scripts, that may lead this script to
|
||||
# fail. So let's ensure they are unset here.
|
||||
unset PREFIX DESTDIR
|
||||
|
||||
arch=$(uname -m)
|
||||
workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
|
||||
|
||||
# Variables for libseccomp
|
||||
# Currently, specify the libseccomp version directly without using `versions.yaml`
|
||||
# because the current Snap workflow is incomplete.
|
||||
# After solving the issue, replace this code by using the `versions.yaml`.
|
||||
# libseccomp_version=$(get_version "externals.libseccomp.version")
|
||||
# libseccomp_url=$(get_version "externals.libseccomp.url")
|
||||
libseccomp_version="2.5.1"
|
||||
libseccomp_url="https://github.com/seccomp/libseccomp"
|
||||
libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"
|
||||
libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"
|
||||
cflags="-O2"
|
||||
|
||||
# Variables for gperf
|
||||
# Currently, specify the gperf version directly without using `versions.yaml`
|
||||
# because the current Snap workflow is incomplete.
|
||||
# After solving the issue, replace this code by using the `versions.yaml`.
|
||||
# gperf_version=$(get_version "externals.gperf.version")
|
||||
# gperf_url=$(get_version "externals.gperf.url")
|
||||
gperf_version="3.1"
|
||||
gperf_url="https://ftp.gnu.org/gnu/gperf"
|
||||
gperf_tarball="gperf-${gperf_version}.tar.gz"
|
||||
gperf_tarball_url="${gperf_url}/${gperf_tarball}"
|
||||
|
||||
# We need to build the libseccomp library from sources to create a static library for the musl libc.
|
||||
# However, ppc64le and s390x have no musl targets in Rust. Hence, we do not set cflags for the musl libc.
|
||||
if ([ "${arch}" != "ppc64le" ] && [ "${arch}" != "s390x" ]); then
|
||||
# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2
|
||||
cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"
|
||||
fi
|
||||
|
||||
die() {
|
||||
msg="$*"
|
||||
echo "[Error] ${msg}" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
finish() {
|
||||
rm -rf "${workdir}"
|
||||
}
|
||||
|
||||
trap finish EXIT
|
||||
|
||||
build_and_install_gperf() {
|
||||
echo "Build and install gperf version ${gperf_version}"
|
||||
mkdir -p "${gperf_install_dir}"
|
||||
curl -sLO "${gperf_tarball_url}"
|
||||
tar -xf "${gperf_tarball}"
|
||||
pushd "gperf-${gperf_version}"
|
||||
./configure --prefix="${gperf_install_dir}"
|
||||
make
|
||||
make install
|
||||
export PATH=$PATH:"${gperf_install_dir}"/bin
|
||||
popd
|
||||
echo "Gperf installed successfully"
|
||||
}
|
||||
|
||||
build_and_install_libseccomp() {
|
||||
echo "Build and install libseccomp version ${libseccomp_version}"
|
||||
mkdir -p "${libseccomp_install_dir}"
|
||||
curl -sLO "${libseccomp_tarball_url}"
|
||||
tar -xf "${libseccomp_tarball}"
|
||||
pushd "libseccomp-${libseccomp_version}"
|
||||
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static
|
||||
make
|
||||
make install
|
||||
popd
|
||||
echo "Libseccomp installed successfully"
|
||||
}
|
||||
|
||||
main() {
|
||||
local libseccomp_install_dir="${1:-}"
|
||||
local gperf_install_dir="${2:-}"
|
||||
|
||||
if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then
|
||||
die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"
|
||||
fi
|
||||
|
||||
pushd "$workdir"
|
||||
# gperf is required for building the libseccomp.
|
||||
build_and_install_gperf
|
||||
build_and_install_libseccomp
|
||||
popd
|
||||
}
|
||||
|
||||
main "$@"
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2020 Ant Group
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
@@ -12,11 +12,10 @@ install_aarch64_musl() {
|
||||
local musl_tar="${arch}-linux-musl-native.tgz"
|
||||
local musl_dir="${arch}-linux-musl-native"
|
||||
pushd /tmp
|
||||
if curl -sLO --fail https://musl.cc/${musl_tar}; then
|
||||
tar -zxf ${musl_tar}
|
||||
mkdir -p /usr/local/musl/
|
||||
cp -r ${musl_dir}/* /usr/local/musl/
|
||||
fi
|
||||
curl -sLO https://musl.cc/${musl_tar}
|
||||
tar -zxf ${musl_tar}
|
||||
mkdir -p /usr/local/musl/
|
||||
cp -r ${musl_dir}/* /usr/local/musl/
|
||||
popd
|
||||
fi
|
||||
}
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
@@ -12,5 +12,5 @@ source "${cidir}/lib.sh"
|
||||
clone_tests_repo
|
||||
|
||||
pushd ${tests_repo_dir}
|
||||
.ci/install_rust.sh ${1:-}
|
||||
.ci/install_rust.sh
|
||||
popd
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -15,18 +15,10 @@ die() {
|
||||
# Install the yq yaml query package from the mikefarah github repo
|
||||
# Install via binary download, as we may not have golang installed at this point
|
||||
function install_yq() {
|
||||
GOPATH=${GOPATH:-${HOME}/go}
|
||||
local yq_path="${GOPATH}/bin/yq"
|
||||
local yq_pkg="github.com/mikefarah/yq"
|
||||
local yq_version=3.4.1
|
||||
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
|
||||
|
||||
if [ "${INSTALL_IN_GOPATH}" == "true" ];then
|
||||
GOPATH=${GOPATH:-${HOME}/go}
|
||||
mkdir -p "${GOPATH}/bin"
|
||||
local yq_path="${GOPATH}/bin/yq"
|
||||
else
|
||||
yq_path="/usr/local/bin/yq"
|
||||
fi
|
||||
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return
|
||||
[ -x "${GOPATH}/bin/yq" ] && return
|
||||
|
||||
read -r -a sysInfo <<< "$(uname -sm)"
|
||||
|
||||
@@ -57,12 +49,15 @@ function install_yq() {
|
||||
;;
|
||||
esac
|
||||
|
||||
mkdir -p "${GOPATH}/bin"
|
||||
|
||||
# Check curl
|
||||
if ! command -v "curl" >/dev/null; then
|
||||
die "Please install curl"
|
||||
fi
|
||||
|
||||
local yq_version=3.1.0
|
||||
|
||||
## NOTE: ${var,,} => gives lowercase value of var
|
||||
local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos,,}_${goarch}"
|
||||
curl -o "${yq_path}" -LSsf "${yq_url}"
|
||||
|
||||
29
ci/lib.sh
@@ -3,31 +3,20 @@
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -o nounset
|
||||
|
||||
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
|
||||
export tests_repo_dir="$GOPATH/src/$tests_repo"
|
||||
export branch="${target_branch:-main}"
|
||||
export branch="${branch:-$TRAVIS_BRANCH}"
|
||||
|
||||
# Clones the tests repository and checkout to the branch pointed out by
|
||||
# the global $branch variable.
|
||||
# If the clone exists and `CI` is exported then it does nothing. Otherwise
|
||||
# it will clone the repository or `git pull` the latest code.
|
||||
#
|
||||
clone_tests_repo()
|
||||
{
|
||||
if [ -d "$tests_repo_dir" ]; then
|
||||
[ -n "${CI:-}" ] && return
|
||||
pushd "${tests_repo_dir}"
|
||||
git checkout "${branch}"
|
||||
git pull
|
||||
popd
|
||||
else
|
||||
git clone -q "https://${tests_repo}" "$tests_repo_dir"
|
||||
pushd "${tests_repo_dir}"
|
||||
git checkout "${branch}"
|
||||
popd
|
||||
if [ -d "$tests_repo_dir" -a -n "$CI" ]
|
||||
then
|
||||
return
|
||||
fi
|
||||
|
||||
go get -d -u "$tests_repo" || true
|
||||
|
||||
pushd "${tests_repo_dir}" && git checkout "${branch}" && popd
|
||||
}
|
||||
|
||||
run_static_checks()
|
||||
@@ -36,7 +25,7 @@ run_static_checks()
|
||||
# Make sure we have the targeting branch
|
||||
git remote set-branches --add origin "${branch}"
|
||||
git fetch -a
|
||||
bash "$tests_repo_dir/.ci/static-checks.sh" "$@"
|
||||
bash "$tests_repo_dir/.ci/static-checks.sh" "github.com/kata-containers/kata-containers"
|
||||
}
|
||||
|
||||
run_go_test()
|
||||
|
||||
@@ -1,14 +0,0 @@
|
||||
# Copyright (c) 2021 Red Hat, Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
# This is the build root image for Kata Containers on OpenShift CI.
|
||||
#
|
||||
FROM quay.io/centos/centos:stream8
|
||||
|
||||
RUN yum -y update && \
|
||||
yum -y install \
|
||||
git \
|
||||
sudo \
|
||||
wget && \
|
||||
yum clean all
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
@@ -8,14 +8,9 @@
|
||||
set -e
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
export CI_JOB="${CI_JOB:-}"
|
||||
|
||||
clone_tests_repo
|
||||
|
||||
pushd ${tests_repo_dir}
|
||||
.ci/run.sh
|
||||
# temporary fix, see https://github.com/kata-containers/tests/issues/3878
|
||||
if [ "$(uname -m)" != "s390x" ] && [ "$CI_JOB" == "CRI_CONTAINERD_K8S_MINIMAL" ]; then
|
||||
tracing/test-agent-shutdown.sh
|
||||
fi
|
||||
popd
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2017-2018 Intel Corporation
|
||||
#
|
||||
@@ -9,4 +9,4 @@ set -e
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
|
||||
run_static_checks "${@:-github.com/kata-containers/kata-containers}"
|
||||
run_static_checks
|
||||
|
||||
@@ -1,3 +1,55 @@
|
||||
- [Warning](#warning)
|
||||
- [Assumptions](#assumptions)
|
||||
- [Initial setup](#initial-setup)
|
||||
- [Requirements to build individual components](#requirements-to-build-individual-components)
|
||||
- [Build and install the Kata Containers runtime](#build-and-install-the-kata-containers-runtime)
|
||||
- [Check hardware requirements](#check-hardware-requirements)
|
||||
- [Configure to use initrd or rootfs image](#configure-to-use-initrd-or-rootfs-image)
|
||||
- [Enable full debug](#enable-full-debug)
|
||||
- [debug logs and shimv2](#debug-logs-and-shimv2)
|
||||
- [Enabling full `containerd` debug](#enabling-full-containerd-debug)
|
||||
- [Enabling just `containerd shim` debug](#enabling-just-containerd-shim-debug)
|
||||
- [Enabling `CRI-O` and `shimv2` debug](#enabling-cri-o-and-shimv2-debug)
|
||||
- [journald rate limiting](#journald-rate-limiting)
|
||||
- [`systemd-journald` suppressing messages](#systemd-journald-suppressing-messages)
|
||||
- [Disabling `systemd-journald` rate limiting](#disabling-systemd-journald-rate-limiting)
|
||||
- [Create and install rootfs and initrd image](#create-and-install-rootfs-and-initrd-image)
|
||||
- [Build a custom Kata agent - OPTIONAL](#build-a-custom-kata-agent---optional)
|
||||
- [Get the osbuilder](#get-the-osbuilder)
|
||||
- [Create a rootfs image](#create-a-rootfs-image)
|
||||
- [Create a local rootfs](#create-a-local-rootfs)
|
||||
- [Add a custom agent to the image - OPTIONAL](#add-a-custom-agent-to-the-image---optional)
|
||||
- [Build a rootfs image](#build-a-rootfs-image)
|
||||
- [Install the rootfs image](#install-the-rootfs-image)
|
||||
- [Create an initrd image - OPTIONAL](#create-an-initrd-image---optional)
|
||||
- [Create a local rootfs for initrd image](#create-a-local-rootfs-for-initrd-image)
|
||||
- [Build an initrd image](#build-an-initrd-image)
|
||||
- [Install the initrd image](#install-the-initrd-image)
|
||||
- [Install guest kernel images](#install-guest-kernel-images)
|
||||
- [Install a hypervisor](#install-a-hypervisor)
|
||||
- [Build a custom QEMU](#build-a-custom-qemu)
|
||||
- [Build a custom QEMU for aarch64/arm64 - REQUIRED](#build-a-custom-qemu-for-aarch64arm64---required)
|
||||
- [Run Kata Containers with Containerd](#run-kata-containers-with-containerd)
|
||||
- [Run Kata Containers with Kubernetes](#run-kata-containers-with-kubernetes)
|
||||
- [Troubleshoot Kata Containers](#troubleshoot-kata-containers)
|
||||
- [Appendices](#appendices)
|
||||
- [Checking Docker default runtime](#checking-docker-default-runtime)
|
||||
- [Set up a debug console](#set-up-a-debug-console)
|
||||
- [Simple debug console setup](#simple-debug-console-setup)
|
||||
- [Enable agent debug console](#enable-agent-debug-console)
|
||||
- [Connect to debug console](#connect-to-debug-console)
|
||||
- [Traditional debug console setup](#traditional-debug-console-setup)
|
||||
- [Create a custom image containing a shell](#create-a-custom-image-containing-a-shell)
|
||||
- [Build the debug image](#build-the-debug-image)
|
||||
- [Configure runtime for custom debug image](#configure-runtime-for-custom-debug-image)
|
||||
- [Create a container](#create-a-container)
|
||||
- [Connect to the virtual machine using the debug console](#connect-to-the-virtual-machine-using-the-debug-console)
|
||||
- [Enabling debug console for QEMU](#enabling-debug-console-for-qemu)
|
||||
- [Enabling debug console for cloud-hypervisor / firecracker](#enabling-debug-console-for-cloud-hypervisor--firecracker)
|
||||
- [Connecting to the debug console](#connecting-to-the-debug-console)
|
||||
- [Obtain details of the image](#obtain-details-of-the-image)
|
||||
- [Capturing kernel boot logs](#capturing-kernel-boot-logs)
|
||||
|
||||
# Warning
|
||||
|
||||
This document is written **specifically for developers**: it is not intended for end users.
|
||||
@@ -51,7 +103,7 @@ The build will create the following:
|
||||
You can check if your system is capable of creating a Kata Container by running the following:
|
||||
|
||||
```
|
||||
$ sudo kata-runtime check
|
||||
$ sudo kata-runtime kata-check
|
||||
```
|
||||
|
||||
If your system is *not* able to run Kata Containers, the previous command will error out and explain why.
|
||||
@@ -86,16 +138,6 @@ One of the `initrd` and `image` options in Kata runtime config file **MUST** be
|
||||
The main difference between the options is that the size of `initrd`(10MB+) is significantly smaller than
|
||||
rootfs `image`(100MB+).
|
||||
|
||||
## Enable seccomp
|
||||
|
||||
Enable seccomp as follows:
|
||||
|
||||
```
|
||||
$ sudo sed -i '/^disable_guest_seccomp/ s/true/false/' /etc/kata-containers/configuration.toml
|
||||
```
|
||||
|
||||
This will pass container seccomp profiles to the kata agent.
|
||||
|
||||
## Enable full debug
|
||||
|
||||
Enable full debug as follows:
|
||||
@@ -212,13 +254,11 @@ $ sudo systemctl restart systemd-journald
|
||||
>
|
||||
> - You should only do this step if you are testing with the latest version of the agent.
|
||||
|
||||
The agent is built with a statically linked `musl.` The default `libc` used is `musl`, but on `ppc64le` and `s390x`, `gnu` should be used. To configure this:
|
||||
The rust-agent is built with a static linked `musl.` To configure this:
|
||||
|
||||
```
|
||||
$ export ARCH=$(uname -m)
|
||||
$ if [ "$ARCH" = "ppc64le" -o "$ARCH" = "s390x" ]; then export LIBC=gnu; else export LIBC=musl; fi
|
||||
$ [ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
|
||||
$ rustup target add ${ARCH}-unknown-linux-${LIBC}
|
||||
rustup target add x86_64-unknown-linux-musl
|
||||
sudo ln -s /usr/bin/g++ /bin/musl-g++
|
||||
```
|
||||
|
||||
To build the agent:
|
||||
@@ -228,18 +268,6 @@ $ go get -d -u github.com/kata-containers/kata-containers
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/src/agent && make
|
||||
```
|
||||
|
||||
The agent is built with seccomp capability by default.
|
||||
If you want to build the agent without the seccomp capability, you need to run `make` with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ make -C $GOPATH/src/github.com/kata-containers/kata-containers/src/agent SECCOMP=no
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - If you enable seccomp in the main configuration file but build the agent without seccomp capability,
|
||||
> the runtime exits conservatively with an error message.
|
||||
|
||||
## Get the osbuilder
|
||||
|
||||
```
|
||||
@@ -258,21 +286,9 @@ the following example.
|
||||
$ export ROOTFS_DIR=${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs
|
||||
$ sudo rm -rf ${ROOTFS_DIR}
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true ./rootfs.sh ${distro}'
|
||||
```
|
||||
|
||||
You MUST choose a distribution (e.g., `ubuntu`) for `${distro}`.
|
||||
You can get a supported distributions list in the Kata Containers by running the following.
|
||||
|
||||
```
|
||||
$ ./rootfs.sh -l
|
||||
```
|
||||
|
||||
If you want to build the agent without seccomp capability, you need to run the `rootfs.sh` script with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true SECCOMP=no ./rootfs.sh ${distro}'
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true SECCOMP=no ./rootfs.sh ${distro}'
|
||||
```
|
||||
You MUST choose one of `alpine`, `centos`, `clearlinux`, `debian`, `euleros`, `fedora`, `suse`, and `ubuntu` for `${distro}`. By default `seccomp` packages are not included in the rootfs image. Set `SECCOMP` to `yes` to include them.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
@@ -288,7 +304,7 @@ $ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true SECCOMP=no
|
||||
> - You should only do this step if you are testing with the latest version of the agent.
|
||||
|
||||
```
|
||||
$ sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/usr/bin ../../../src/agent/target/x86_64-unknown-linux-musl/release/kata-agent
|
||||
$ sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/bin ../../../src/agent/target/x86_64-unknown-linux-musl/release/kata-agent
|
||||
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-agent.service ${ROOTFS_DIR}/usr/lib/systemd/system/
|
||||
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-containers.target ${ROOTFS_DIR}/usr/lib/systemd/system/
|
||||
```
|
||||
@@ -308,7 +324,6 @@ $ script -fec 'sudo -E USE_DOCKER=true ./image_builder.sh ${ROOTFS_DIR}'
|
||||
> - If you do *not* wish to build under Docker, remove the `USE_DOCKER`
|
||||
> variable in the previous command and ensure the `qemu-img` command is
|
||||
> available on your system.
|
||||
> - If `qemu-img` is not installed, you will likely see errors such as `ERROR: File /dev/loop19p1 is not a block device` and `losetup: /tmp/tmp.bHz11oY851: Warning: file is smaller than 512 bytes; the loop device may be useless or invisible for system tools`. These can be mitigated by installing the `qemu-img` command (available in the `qemu-img` package on Fedora or the `qemu-utils` package on Debian).
|
||||
|
||||
|
||||
### Install the rootfs image
|
||||
@@ -327,35 +342,20 @@ $ (cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers.img)
|
||||
$ export ROOTFS_DIR="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs"
|
||||
$ sudo rm -rf ${ROOTFS_DIR}
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true ./rootfs.sh ${distro}'
|
||||
```
|
||||
`AGENT_INIT` controls if the guest image uses the Kata agent as the guest `init` process. When you create an initrd image,
|
||||
always set `AGENT_INIT` to `yes`.
|
||||
|
||||
You MUST choose a distribution (e.g., `ubuntu`) for `${distro}`.
|
||||
You can get a supported distributions list in the Kata Containers by running the following.
|
||||
|
||||
```
|
||||
$ ./rootfs.sh -l
|
||||
```
|
||||
|
||||
If you want to build the agent without seccomp capability, you need to run the `rootfs.sh` script with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true SECCOMP=no ./rootfs.sh ${distro}'
|
||||
```
|
||||
`AGENT_INIT` controls if the guest image uses the Kata agent as the guest `init` process. When you create an initrd image,
|
||||
always set `AGENT_INIT` to `yes`. By default `seccomp` packages are not included in the initrd image. Set `SECCOMP` to `yes` to include them.
|
||||
|
||||
You MUST choose one of `alpine`, `centos`, `clearlinux`, `euleros`, and `fedora` for `${distro}`.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
|
||||
|
||||
Optionally, add your custom agent binary to the rootfs with the following commands. The default `$LIBC` used
|
||||
is `musl`, but on ppc64le and s390x, `gnu` should be used. Also, Rust refers to ppc64le as `powerpc64le`:
|
||||
Optionally, add your custom agent binary to the rootfs with the following:
|
||||
```
|
||||
$ export ARCH=$(uname -m)
|
||||
$ [ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
|
||||
$ [ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
|
||||
$ sudo install -o root -g root -m 0550 -T ../../../src/agent/target/${ARCH}-unknown-linux-${LIBC}/release/kata-agent ${ROOTFS_DIR}/sbin/init
|
||||
$ sudo install -o root -g root -m 0550 -T ../../agent/kata-agent ${ROOTFS_DIR}/sbin/init
|
||||
```
|
||||
|
||||
### Build an initrd image
|
||||
@@ -390,40 +390,14 @@ You may choose to manually build your VMM/hypervisor.
|
||||
Kata Containers makes use of upstream QEMU branch. The exact version
|
||||
and repository utilized can be found by looking at the [versions file](../versions.yaml).
|
||||
|
||||
Find the correct version of QEMU from the versions file:
|
||||
```
|
||||
$ source ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/scripts/lib.sh
|
||||
$ qemu_version=$(get_from_kata_deps "assets.hypervisor.qemu.version")
|
||||
$ echo ${qemu_version}
|
||||
```
|
||||
Get source from the matching branch of QEMU:
|
||||
```
|
||||
$ go get -d github.com/qemu/qemu
|
||||
$ cd ${GOPATH}/src/github.com/qemu/qemu
|
||||
$ git checkout ${qemu_version}
|
||||
$ your_qemu_directory=${GOPATH}/src/github.com/qemu/qemu
|
||||
```
|
||||
|
||||
There are scripts to manage the build and packaging of QEMU. For the examples below, set your
|
||||
environment as:
|
||||
```
|
||||
$ go get -d github.com/kata-containers/kata-containers
|
||||
$ packaging_dir="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging"
|
||||
```
|
||||
|
||||
Kata often utilizes patches for not-yet-upstream and/or backported fixes for components,
|
||||
including QEMU. These can be found in the [packaging/QEMU directory](../tools/packaging/qemu/patches),
|
||||
and it's *recommended* that you apply them. For example, suppose that you are going to build QEMU
|
||||
version 5.2.0, do:
|
||||
```
|
||||
$ cd $your_qemu_directory
|
||||
$ $packaging_dir/scripts/apply_patches.sh $packaging_dir/qemu/patches/5.2.x/
|
||||
```
|
||||
Kata often utilizes patches for not-yet-upstream fixes for components,
|
||||
including QEMU. These can be found in the [packaging/QEMU directory](../tools/packaging/qemu/patches)
|
||||
|
||||
To build utilizing the same options as Kata, you should make use of the `configure-hypervisor.sh` script. For example:
|
||||
```
|
||||
$ go get -d github.com/kata-containers/kata-containers/tools/packaging
|
||||
$ cd $your_qemu_directory
|
||||
$ $packaging_dir/scripts/configure-hypervisor.sh kata-qemu > kata.cfg
|
||||
$ ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/scripts/configure-hypervisor.sh qemu > kata.cfg
|
||||
$ eval ./configure "$(cat kata.cfg)"
|
||||
$ make -j $(nproc)
|
||||
$ sudo -E make install
|
||||
@@ -465,7 +439,7 @@ script and paste its output directly into a
|
||||
> [runtime](../src/runtime) repository.
|
||||
|
||||
To perform analysis on Kata logs, use the
|
||||
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
|
||||
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/master/cmd/log-parser)
|
||||
tool, which can convert the logs into formats (e.g. JSON, TOML, XML, and YAML).
|
||||
|
||||
See [Set up a debug console](#set-up-a-debug-console).
|
||||
@@ -498,9 +472,9 @@ debug_console_enabled = true
|
||||
|
||||
This will pass `agent.debug_console agent.debug_console_vport=1026` to agent as kernel parameters, and sandboxes created using this parameters will start a shell in guest if new connection is accept from VSOCK.
|
||||
|
||||
#### Start `kata-monitor` - ONLY NEEDED FOR 2.0.x
|
||||
#### Start `kata-monitor`
|
||||
|
||||
For Kata Containers `2.0.x` releases, the `kata-runtime exec` command depends on the`kata-monitor` running, in order to get the sandbox's `vsock` address to connect to. Thus, first start the `kata-monitor` process.
|
||||
The `kata-runtime exec` command needs `kata-monitor` to get the sandbox's `vsock` address to connect to, first start `kata-monitor`.
|
||||
|
||||
```
|
||||
$ sudo kata-monitor
|
||||
@@ -508,6 +482,7 @@ $ sudo kata-monitor
|
||||
|
||||
`kata-monitor` will serve at `localhost:8090` by default.
|
||||
|
||||
|
||||
#### Connect to debug console
|
||||
|
||||
Command `kata-runtime exec` is used to connect to the debug console.
|
||||
@@ -522,10 +497,6 @@ bash-4.2# exit
|
||||
exit
|
||||
```
|
||||
|
||||
`kata-runtime exec` has a command-line option `runtime-namespace`, which is used to specify under which [runtime namespace](https://github.com/containerd/containerd/blob/master/docs/namespaces.md) the particular pod was created. By default, it is set to `k8s.io` and works for containerd when configured
|
||||
with Kubernetes. For CRI-O, the namespace should set to `default` explicitly. This should not be confused with [Kubernetes namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).
|
||||
For other CRI-runtimes and configurations, you may need to set the namespace utilizing the `runtime-namespace` option.
|
||||
|
||||
If you want to access guest OS through a traditional way, see [Traditional debug console setup)](#traditional-debug-console-setup).
|
||||
|
||||
### Traditional debug console setup
|
||||
@@ -652,7 +623,7 @@ VMM solution.
|
||||
|
||||
In case of cloud-hypervisor, connect to the `vsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id}/root/ && socat stdin unix-connect:clh.sock'
|
||||
$ sudo su -c 'cd /var/run/vc/vm/{sandbox_id}/root/ && socat stdin unix-connect:clh.sock'
|
||||
CONNECT 1026
|
||||
```
|
||||
|
||||
@@ -660,7 +631,7 @@ CONNECT 1026
|
||||
|
||||
For firecracker, connect to the `hvsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/firecracker/${sandbox_id}/root/ && socat stdin unix-connect:kata.hvsock'
|
||||
$ sudo su -c 'cd /var/run/vc/firecracker/{sandbox_id}/root/ && socat stdin unix-connect:kata.hvsock'
|
||||
CONNECT 1026
|
||||
```
|
||||
|
||||
@@ -669,7 +640,7 @@ CONNECT 1026
|
||||
|
||||
For QEMU, connect to the `vsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id} && socat "stdin,raw,echo=0,escape=0x11" "unix-connect:console.sock"'
|
||||
$ sudo su -c 'cd /var/run/vc/vm/{sandbox_id} && socat "stdin,raw,echo=0,escape=0x11" "unix-connect:console.sock"
|
||||
```
|
||||
|
||||
To disconnect from the virtual machine, type `CONTROL+q` (hold down the
|
||||
|
||||
@@ -1,3 +1,16 @@
|
||||
* [Introduction](#introduction)
|
||||
* [General requirements](#general-requirements)
|
||||
* [Linking advice](#linking-advice)
|
||||
* [Notes](#notes)
|
||||
* [Warnings and other admonitions](#warnings-and-other-admonitions)
|
||||
* [Files and command names](#files-and-command-names)
|
||||
* [Code blocks](#code-blocks)
|
||||
* [Images](#images)
|
||||
* [Spelling](#spelling)
|
||||
* [Names](#names)
|
||||
* [Version numbers](#version-numbers)
|
||||
* [The apostrophe](#the-apostrophe)
|
||||
|
||||
# Introduction
|
||||
|
||||
This document outlines the requirements for all documentation in the [Kata
|
||||
@@ -10,6 +23,10 @@ All documents must:
|
||||
- Be written in simple English.
|
||||
- Be written in [GitHub Flavored Markdown](https://github.github.com/gfm) format.
|
||||
- Have a `.md` file extension.
|
||||
- Include a TOC (table of contents) at the top of the document with links to
|
||||
all heading sections. We recommend using the
|
||||
[`kata-check-markdown`](https://github.com/kata-containers/tests/tree/master/cmd/check-markdown)
|
||||
tool to generate the TOC.
|
||||
- Be linked to from another document in the same repository.
|
||||
|
||||
Although GitHub allows navigation of the entire repository, it should be
|
||||
@@ -26,10 +43,6 @@ All documents must:
|
||||
which can then execute the commands specified to ensure the instructions are
|
||||
correct. This avoids documents becoming out of date over time.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> Do not add a table of contents (TOC) since GitHub will auto-generate one.
|
||||
|
||||
# Linking advice
|
||||
|
||||
Linking between documents is strongly encouraged to help users and developers
|
||||
@@ -105,7 +118,7 @@ This section lists requirements for displaying commands and command output.
|
||||
The requirements must be adhered to since documentation containing code blocks
|
||||
is validated by the CI system, which executes the command blocks with the help
|
||||
of the
|
||||
[doc-to-script](https://github.com/kata-containers/tests/tree/main/.ci/kata-doc-to-script.sh)
|
||||
[doc-to-script](https://github.com/kata-containers/tests/tree/master/.ci/kata-doc-to-script.sh)
|
||||
utility.
|
||||
|
||||
- If a document includes commands the user should run, they **MUST** be shown
|
||||
@@ -189,7 +202,7 @@ and compare them with standard tools (e.g. `diff(1)`).
|
||||
|
||||
Since this project uses a number of terms not found in conventional
|
||||
dictionaries, we have a
|
||||
[spell checking tool](https://github.com/kata-containers/tests/tree/main/cmd/check-spelling)
|
||||
[spell checking tool](https://github.com/kata-containers/tests/tree/master/cmd/check-spelling)
|
||||
that checks both dictionary words and the additional terms we use.
|
||||
|
||||
Run the spell checking tool on your document before raising a PR to ensure it
|
||||
|
||||
@@ -1,5 +1,9 @@
|
||||
# Licensing strategy
|
||||
|
||||
* [Project License](#project-license)
|
||||
* [License file](#license-file)
|
||||
* [License for individual files](#license-for-individual-files)
|
||||
|
||||
## Project License
|
||||
|
||||
The license for the [Kata Containers](https://github.com/kata-containers)
|
||||
@@ -18,4 +22,4 @@ licensing and allows automated tooling to check the license of individual
|
||||
files.
|
||||
|
||||
This SPDX licence identifier requirement is enforced by the
|
||||
[CI (Continuous Integration) system](https://github.com/kata-containers/tests/blob/main/.ci/static-checks.sh).
|
||||
[CI (Continuous Integration) system](https://github.com/kata-containers/tests/blob/master/.ci/static-checks.sh).
|
||||
|
||||
@@ -1,3 +1,33 @@
|
||||
* [Overview](#overview)
|
||||
* [Definition of a limitation](#definition-of-a-limitation)
|
||||
* [Scope](#scope)
|
||||
* [Contributing](#contributing)
|
||||
* [Pending items](#pending-items)
|
||||
* [Runtime commands](#runtime-commands)
|
||||
* [checkpoint and restore](#checkpoint-and-restore)
|
||||
* [events command](#events-command)
|
||||
* [update command](#update-command)
|
||||
* [Networking](#networking)
|
||||
* [Docker swarm and compose support](#docker-swarm-and-compose-support)
|
||||
* [Resource management](#resource-management)
|
||||
* [docker run and shared memory](#docker-run-and-shared-memory)
|
||||
* [docker run and sysctl](#docker-run-and-sysctl)
|
||||
* [Docker daemon features](#docker-daemon-features)
|
||||
* [SELinux support](#selinux-support)
|
||||
* [Architectural limitations](#architectural-limitations)
|
||||
* [Networking limitations](#networking-limitations)
|
||||
* [Support for joining an existing VM network](#support-for-joining-an-existing-vm-network)
|
||||
* [docker --net=host](#docker---nethost)
|
||||
* [docker run --link](#docker-run---link)
|
||||
* [Host resource sharing](#host-resource-sharing)
|
||||
* [docker run --privileged](#docker-run---privileged)
|
||||
* [Miscellaneous](#miscellaneous)
|
||||
* [Docker --security-opt option partially supported](#docker---security-opt-option-partially-supported)
|
||||
* [Appendices](#appendices)
|
||||
* [The constraints challenge](#the-constraints-challenge)
|
||||
|
||||
---
|
||||
|
||||
# Overview
|
||||
|
||||
A [Kata Container](https://github.com/kata-containers) utilizes a Virtual Machine (VM) to enhance security and
|
||||
@@ -57,21 +87,12 @@ for advice on which repository to raise the issue against.
|
||||
|
||||
This section lists items that might be possible to fix.
|
||||
|
||||
## OCI CLI commands
|
||||
|
||||
### Docker and Podman support
|
||||
Currently Kata Containers does not support Docker or Podman.
|
||||
|
||||
See issue https://github.com/kata-containers/kata-containers/issues/722 for more information.
|
||||
|
||||
## Runtime commands
|
||||
|
||||
### checkpoint and restore
|
||||
|
||||
The runtime does not provide `checkpoint` and `restore` commands. There
|
||||
are discussions about using VM save and restore to give us a
|
||||
`[criu](https://github.com/checkpoint-restore/criu)`-like functionality,
|
||||
which might provide a solution.
|
||||
are discussions about using VM save and restore to give [`criu`](https://github.com/checkpoint-restore/criu)-like functionality, which might provide a solution.
|
||||
|
||||
Note that the OCI standard does not specify `checkpoint` and `restore`
|
||||
commands.
|
||||
@@ -93,6 +114,21 @@ All other configurations are supported and are working properly.
|
||||
|
||||
## Networking
|
||||
|
||||
### Docker swarm and compose support
|
||||
|
||||
The newest version of Docker supported is specified by the
|
||||
`externals.docker.version` variable in the
|
||||
[versions database](https://github.com/kata-containers/runtime/blob/master/versions.yaml).
|
||||
|
||||
Basic Docker swarm support works. However, if you want to use custom networks
|
||||
with Docker's swarm, an older version of Docker is required. This is specified
|
||||
by the `externals.docker.meta.swarm-version` variable in the
|
||||
[versions database](https://github.com/kata-containers/runtime/blob/master/versions.yaml).
|
||||
|
||||
See issue https://github.com/kata-containers/runtime/issues/175 for more information.
|
||||
|
||||
Docker compose normally uses custom networks, so also has the same limitations.
|
||||
|
||||
## Resource management
|
||||
|
||||
Due to the way VMs differ in their CPU and memory allocation, and sharing
|
||||
@@ -104,28 +140,91 @@ See issue https://github.com/clearcontainers/runtime/issues/341 and [the constra
|
||||
For CPUs resource management see
|
||||
[CPU constraints](design/vcpu-handling.md).
|
||||
|
||||
### docker run and shared memory
|
||||
|
||||
The runtime does not implement the `docker run --shm-size` command to
|
||||
set the size of the `/dev/shm tmpfs` within the container. It is possible to pass this configuration value into the VM container so the appropriate mount command happens at launch time.
|
||||
|
||||
See issue https://github.com/kata-containers/kata-containers/issues/21 for more information.
|
||||
|
||||
### docker run and sysctl
|
||||
|
||||
The `docker run --sysctl` feature is not implemented. At the runtime
|
||||
level, this equates to the `linux.sysctl` OCI configuration. Docker
|
||||
allows configuring the sysctl settings that support namespacing. From a security and isolation point of view, it might make sense to set them in the VM, which isolates sysctl settings. Also, given that each Kata Container has its own kernel, we can support setting of sysctl settings that are not namespaced. In some cases, we might need to support configuring some of the settings on both the host side Kata Container namespace and the Kata Containers kernel.
|
||||
|
||||
See issue https://github.com/kata-containers/runtime/issues/185 for more information.
|
||||
|
||||
## Docker daemon features
|
||||
|
||||
Some features enabled or implemented via the
|
||||
[`dockerd` daemon](https://docs.docker.com/config/daemon/) configuration are not yet
|
||||
implemented.
|
||||
|
||||
### SELinux support
|
||||
|
||||
The `dockerd` configuration option `"selinux-enabled": true` is not presently implemented
|
||||
in Kata Containers. Enabling this option causes an OCI runtime error.
|
||||
|
||||
See issue https://github.com/kata-containers/runtime/issues/784 for more information.
|
||||
|
||||
The consequence of this is that the [Docker --security-opt is only partially supported](#docker---security-opt-option-partially-supported).
|
||||
|
||||
Kubernetes [SELinux labels](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#assign-selinux-labels-to-a-container) will also not be applied.
|
||||
|
||||
# Architectural limitations
|
||||
|
||||
This section lists items that might not be fixed due to fundamental
|
||||
architectural differences between "soft containers" (i.e. traditional Linux*
|
||||
containers) and those based on VMs.
|
||||
|
||||
## Storage limitations
|
||||
## Networking limitations
|
||||
|
||||
### Kubernetes `volumeMounts.subPaths`
|
||||
### Support for joining an existing VM network
|
||||
|
||||
Kubernetes `volumeMount.subPath` is not supported by Kata Containers at the
|
||||
moment.
|
||||
Docker supports the ability for containers to join another containers
|
||||
namespace with the `docker run --net=containers` syntax. This allows
|
||||
multiple containers to share a common network namespace and the network
|
||||
interfaces placed in the network namespace. Kata Containers does not
|
||||
support network namespace sharing. If a Kata Container is setup to
|
||||
share the network namespace of a `runc` container, the runtime
|
||||
effectively takes over all the network interfaces assigned to the
|
||||
namespace and binds them to the VM. Consequently, the `runc` container loses
|
||||
its network connectivity.
|
||||
|
||||
See [this issue](https://github.com/kata-containers/runtime/issues/2812) for more details.
|
||||
[Another issue](https://github.com/kata-containers/kata-containers/issues/1728) focuses on the case of `emptyDir`.
|
||||
### docker --net=host
|
||||
|
||||
Docker host network support (`docker --net=host run`) is not supported.
|
||||
It is not possible to directly access the host networking configuration
|
||||
from within the VM.
|
||||
|
||||
The `--net=host` option can still be used with `runc` containers and
|
||||
inter-mixed with running Kata Containers, thus enabling use of `--net=host`
|
||||
when necessary.
|
||||
|
||||
It should be noted, currently passing the `--net=host` option into a
|
||||
Kata Container may result in the Kata Container networking setup
|
||||
modifying, re-configuring and therefore possibly breaking the host
|
||||
networking setup. Do not use `--net=host` with Kata Containers.
|
||||
|
||||
### docker run --link
|
||||
|
||||
The runtime does not support the `docker run --link` command. This
|
||||
command is now deprecated by docker and we have no intention of adding support.
|
||||
Equivalent functionality can be achieved with the newer docker networking commands.
|
||||
|
||||
See more documentation at
|
||||
[docs.docker.com](https://docs.docker.com/engine/userguide/networking/default_network/dockerlinks/).
|
||||
|
||||
## Host resource sharing
|
||||
|
||||
### Privileged containers
|
||||
### docker run --privileged
|
||||
|
||||
Privileged support in Kata is essentially different from `runc` containers.
|
||||
The container runs with elevated capabilities within the guest and is granted
|
||||
Kata does support `docker run --privileged` command, but in this case full access
|
||||
to the guest VM is provided in addition to some host access.
|
||||
|
||||
The container runs with elevated capabilities within the guest and is granted
|
||||
access to guest devices instead of the host devices.
|
||||
This is also true with using `securityContext privileged=true` with Kubernetes.
|
||||
|
||||
@@ -134,6 +233,17 @@ The container may also be granted full access to a subset of host devices
|
||||
|
||||
See [Privileged Kata Containers](how-to/privileged.md) for how to configure some of this behavior.
|
||||
|
||||
# Miscellaneous
|
||||
|
||||
This section lists limitations where the possible solutions are uncertain.
|
||||
|
||||
## Docker --security-opt option partially supported
|
||||
|
||||
The `--security-opt=` option used by Docker is partially supported.
|
||||
We only support `--security-opt=no-new-privileges` and `--security-opt seccomp=/path/to/seccomp/profile.json`
|
||||
option as of today.
|
||||
|
||||
Note: The `--security-opt apparmor=your_profile` is not yet supported. See https://github.com/kata-containers/runtime/issues/707.
|
||||
# Appendices
|
||||
|
||||
## The constraints challenge
|
||||
|
||||
@@ -1,5 +1,16 @@
|
||||
# Documentation
|
||||
|
||||
* [Getting Started](#getting-started)
|
||||
* [More User Guides](#more-user-guides)
|
||||
* [Kata Use-Cases](#kata-use-cases)
|
||||
* [Developer Guide](#developer-guide)
|
||||
* [Design and Implementations](#design-and-implementations)
|
||||
* [How to Contribute](#how-to-contribute)
|
||||
* [Code Licensing](#code-licensing)
|
||||
* [The Release Process](#the-release-process)
|
||||
* [Help Improving the Documents](#help-improving-the-documents)
|
||||
* [Website Changes](#website-changes)
|
||||
|
||||
The [Kata Containers](https://github.com/kata-containers)
|
||||
documentation repository hosts overall system documentation, with information
|
||||
common to multiple components.
|
||||
@@ -11,28 +22,24 @@ For details of the other Kata Containers repositories, see the
|
||||
|
||||
* [Installation guides](./install/README.md): Install and run Kata Containers with Docker or Kubernetes
|
||||
|
||||
## Tracing
|
||||
|
||||
See the [tracing documentation](tracing.md).
|
||||
|
||||
## More User Guides
|
||||
|
||||
* [Upgrading](Upgrading.md): how to upgrade from [Clear Containers](https://github.com/clearcontainers) and [runV](https://github.com/hyperhq/runv) to [Kata Containers](https://github.com/kata-containers) and how to upgrade an existing Kata Containers system to the latest version.
|
||||
* [Limitations](Limitations.md): differences and limitations compared with the default [Docker](https://www.docker.com/) runtime,
|
||||
[`runc`](https://github.com/opencontainers/runc).
|
||||
|
||||
### How-to guides
|
||||
### Howto guides
|
||||
|
||||
See the [how-to documentation](how-to).
|
||||
See the [howto documentation](how-to).
|
||||
|
||||
## Kata Use-Cases
|
||||
|
||||
* [GPU Passthrough with Kata](./use-cases/GPU-passthrough-and-Kata.md)
|
||||
* [OpenStack Zun with Kata Containers](./use-cases/zun_kata.md)
|
||||
* [SR-IOV with Kata](./use-cases/using-SRIOV-and-kata.md)
|
||||
* [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md)
|
||||
* [VPP with Kata](./use-cases/using-vpp-and-kata.md)
|
||||
* [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md)
|
||||
* [Intel SGX with Kata](./use-cases/using-Intel-SGX-and-kata.md)
|
||||
|
||||
## Developer Guide
|
||||
|
||||
@@ -40,30 +47,16 @@ Documents that help to understand and contribute to Kata Containers.
|
||||
|
||||
### Design and Implementations
|
||||
|
||||
* [Kata Containers Architecture](design/architecture): Architectural overview of Kata Containers
|
||||
* [Kata Containers Architecture](design/architecture.md): Architectural overview of Kata Containers
|
||||
* [Kata Containers E2E Flow](design/end-to-end-flow.md): The entire end-to-end flow of Kata Containers
|
||||
* [Kata Containers design](./design/README.md): More Kata Containers design documents
|
||||
* [Kata Containers threat model](./threat-model/threat-model.md): Kata Containers threat model
|
||||
|
||||
### How to Contribute
|
||||
|
||||
* [Developer Guide](Developer-Guide.md): Setup the Kata Containers developing environments
|
||||
* [How to contribute to Kata Containers](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md)
|
||||
* [How to contribute to Kata Containers](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md)
|
||||
* [Code of Conduct](../CODE_OF_CONDUCT.md)
|
||||
|
||||
## Help Writing a Code PR
|
||||
|
||||
* [Code PR advice](code-pr-advice.md).
|
||||
|
||||
## Help Writing Unit Tests
|
||||
|
||||
* [Unit Test Advice](Unit-Test-Advice.md)
|
||||
* [Unit testing presentation](presentations/unit-testing/kata-containers-unit-testing.md)
|
||||
|
||||
## Help Improving the Documents
|
||||
|
||||
* [Documentation Requirements](Documentation-Requirements.md)
|
||||
|
||||
### Code Licensing
|
||||
|
||||
* [Licensing](Licensing-strategy.md): About the licensing strategy of Kata Containers.
|
||||
@@ -73,9 +66,9 @@ Documents that help to understand and contribute to Kata Containers.
|
||||
* [Release strategy](Stable-Branch-Strategy.md)
|
||||
* [Release Process](Release-Process.md)
|
||||
|
||||
## Presentations
|
||||
## Help Improving the Documents
|
||||
|
||||
* [Presentations](presentations)
|
||||
* [Documentation Requirements](Documentation-Requirements.md)
|
||||
|
||||
## Website Changes
|
||||
|
||||
|
||||
@@ -1,26 +1,45 @@
|
||||
|
||||
# How to do a Kata Containers Release
|
||||
This document lists the tasks required to create a Kata Release.
|
||||
|
||||
<!-- TOC START min:1 max:3 link:true asterisk:false update:true -->
|
||||
- [How to do a Kata Containers Release](#how-to-do-a-kata-containers-release)
|
||||
- [Requirements](#requirements)
|
||||
- [Release Process](#release-process)
|
||||
- [Bump all Kata repositories](#bump-all-kata-repositories)
|
||||
- [Merge all bump version Pull requests](#merge-all-bump-version-pull-requests)
|
||||
- [Tag all Kata repositories](#tag-all-kata-repositories)
|
||||
- [Check Git-hub Actions](#check-git-hub-actions)
|
||||
- [Create release notes](#create-release-notes)
|
||||
- [Announce the release](#announce-the-release)
|
||||
<!-- TOC END -->
|
||||
|
||||
|
||||
## Requirements
|
||||
|
||||
- [hub](https://github.com/github/hub)
|
||||
* Using an [application token](https://github.com/settings/tokens) is required for hub (set to a GITHUB_TOKEN environment variable).
|
||||
|
||||
- OBS account with permissions on [`/home:katacontainers`](https://build.opensuse.org/project/subprojects/home:katacontainers)
|
||||
|
||||
- GitHub permissions to push tags and create releases in Kata repositories.
|
||||
|
||||
- GPG configured to sign git tags. https://docs.github.com/en/authentication/managing-commit-signature-verification/generating-a-new-gpg-key
|
||||
- GPG configured to sign git tags. https://help.github.com/articles/generating-a-new-gpg-key/
|
||||
|
||||
- You should configure your GitHub to use your ssh keys (to push to branches). See https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/.
|
||||
* As an alternative, configure hub to push and fork with HTTPS, `git config --global hub.protocol https` (Not tested yet) *
|
||||
|
||||
## Release Process
|
||||
|
||||
|
||||
### Bump all Kata repositories
|
||||
|
||||
Bump the repositories using a script in the Kata packaging repo, where:
|
||||
- `BRANCH=<the-branch-you-want-to-bump>`
|
||||
- `NEW_VERSION=<the-new-kata-version>`
|
||||
- We have set up a Jenkins job to bump the version in the `VERSION` file in all Kata repositories. Go to the [Jenkins bump-job page](http://jenkins.katacontainers.io/job/release/build) to trigger a new job.
|
||||
- Start a new job with variables for the job passed as:
|
||||
- `BRANCH=<the-branch-you-want-to-bump>`
|
||||
- `NEW_VERSION=<the-new-kata-version>`
|
||||
|
||||
For example, in the case where you want to make a patch release `1.10.2`, the variable `NEW_VERSION` should be `1.10.2` and `BRANCH` should point to `stable-1.10`. In case of an alpha or release candidate release, `BRANCH` should point to `master` branch.
|
||||
|
||||
Alternatively, you can also bump the repositories using a script in the Kata packaging repo
|
||||
```
|
||||
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
|
||||
$ export NEW_VERSION=<the-new-kata-version>
|
||||
@@ -28,34 +47,16 @@
|
||||
$ ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH"
|
||||
```
|
||||
|
||||
### Point tests repository to stable branch
|
||||
|
||||
If you create a new stable branch, i.e. if your release changes a major or minor version number (not a patch release), then
|
||||
you should modify the `tests` repository to point to that newly created stable branch and not the `main` branch.
|
||||
The objective is that changes in the CI on the main branch will not impact the stable branch.
|
||||
|
||||
In the test directory, change references the main branch in:
|
||||
* `README.md`
|
||||
* `versions.yaml`
|
||||
* `cmd/github-labels/labels.yaml.in`
|
||||
* `cmd/pmemctl/pmemctl.sh`
|
||||
* `.ci/lib.sh`
|
||||
* `.ci/static-checks.sh`
|
||||
|
||||
See the commits in [the corresponding PR for stable-2.1](https://github.com/kata-containers/tests/pull/3504) for an example of the changes.
|
||||
|
||||
|
||||
### Merge all bump version Pull requests
|
||||
|
||||
- The above step will create a GitHub pull request in the Kata projects. Trigger the CI using `/test` command on each bump Pull request.
|
||||
- Trigger the `test-kata-deploy` workflow which is under the `Actions` tab on the repository GitHub page (make sure to select the correct branch and validate it passes).
|
||||
- Check any failures and fix if needed.
|
||||
- Work with the Kata approvers to verify that the CI works and the pull requests are merged.
|
||||
|
||||
### Tag all Kata repositories
|
||||
|
||||
Once all the pull requests to bump versions in all Kata repositories are merged,
|
||||
tag all the repositories as shown below.
|
||||
tag all the repositories as shown below.
|
||||
```
|
||||
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
|
||||
$ git checkout <kata-branch-to-release>
|
||||
@@ -65,7 +66,7 @@
|
||||
|
||||
### Check Git-hub Actions
|
||||
|
||||
We make use of [GitHub actions](https://github.com/features/actions) in this [file](../.github/workflows/release.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-containers` repository.
|
||||
We make use of [GitHub actions](https://github.com/features/actions) in this [file](https://github.com/kata-containers/kata-containers/blob/master/.github/workflows/main.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-conatiners` repository.
|
||||
|
||||
Check the [actions status page](https://github.com/kata-containers/kata-containers/actions) to verify all steps in the actions workflow have completed successfully. On success, a static tarball containing Kata release artifacts will be uploaded to the [Release page](https://github.com/kata-containers/kata-containers/releases).
|
||||
|
||||
|
||||
@@ -32,16 +32,16 @@ provides additional information regarding release `99.123.77` in the previous ex
|
||||
changing the existing behavior*.
|
||||
|
||||
- When `MAJOR` increases, the new release adds **new features, bug fixes, or
|
||||
both** and which **changes the behavior from the previous release** (incompatible with previous releases).
|
||||
both** and which *changes the behavior from the previous release* (incompatible with previous releases).
|
||||
|
||||
A major release will also likely require a change of the container manager version used,
|
||||
for example Containerd or CRI-O. Please refer to the release notes for further details.
|
||||
for example Docker\*. Please refer to the release notes for further details.
|
||||
|
||||
## Release Strategy
|
||||
|
||||
Any new features added since the last release will be available in the next minor
|
||||
release. These will include bug fixes as well. To facilitate a stable user environment,
|
||||
Kata provides stable branch-based releases and a main branch release.
|
||||
Kata provides stable branch-based releases and a master branch release.
|
||||
|
||||
## Stable branch patch criteria
|
||||
|
||||
@@ -49,10 +49,9 @@ No new features should be introduced to stable branches. This is intended to li
|
||||
providing only bug and security fixes.
|
||||
|
||||
## Branch Management
|
||||
Kata Containers will maintain **one** stable release branch, in addition to the main branch, for
|
||||
each active major release.
|
||||
Once a new MAJOR or MINOR release is created from main, a new stable branch is created for
|
||||
the prior MAJOR or MINOR release and the previous stable branch is no longer maintained. End of
|
||||
Kata Containers will maintain two stable release branches in addition to the master branch.
|
||||
Once a new MAJOR or MINOR release is created from master, a new stable branch is created for
|
||||
the prior MAJOR or MINOR release and the older stable branch is no longer maintained. End of
|
||||
maintenance for a branch is announced on the Kata Containers mailing list. Users can determine
|
||||
the version currently installed by running `kata-runtime kata-env`. It is recommended to use the
|
||||
latest stable branch available.
|
||||
@@ -62,67 +61,67 @@ A couple of examples follow to help clarify this process.
|
||||
### New bug fix introduced
|
||||
|
||||
A bug fix is submitted against the runtime which does not introduce new inter-component dependencies.
|
||||
This fix is applied to both the main and stable branches, and there is no need to create a new
|
||||
This fix is applied to both the master and stable branches, and there is no need to create a new
|
||||
stable branch.
|
||||
|
||||
| Branch | Original version | New version |
|
||||
|--|--|--|
|
||||
| `main` | `2.3.0-rc0` | `2.3.0-rc1` |
|
||||
| `stable-2.2` | `2.2.0` | `2.2.1` |
|
||||
| `stable-2.1` | (unmaintained) | (unmaintained) |
|
||||
| `master` | `1.3.0-rc0` | `1.3.0-rc1` |
|
||||
| `stable-1.2` | `1.2.0` | `1.2.1` |
|
||||
| `stable-1.1` | `1.1.2` | `1.1.3` |
|
||||
|
||||
|
||||
### New release made feature or change adding new inter-component dependency
|
||||
|
||||
A new feature is introduced, which adds a new inter-component dependency. In this case a new stable
|
||||
branch is created (stable-2.3) starting from main and the previous stable branch (stable-2.2)
|
||||
branch is created (stable-1.3) starting from master and the older stable branch (stable-1.1)
|
||||
is dropped from maintenance.
|
||||
|
||||
|
||||
| Branch | Original version | New version |
|
||||
|--|--|--|
|
||||
| `main` | `2.3.0-rc1` | `2.3.0` |
|
||||
| `stable-2.3` | N/A| `2.3.0` |
|
||||
| `stable-2.2` | `2.2.1` | (unmaintained) |
|
||||
| `stable-2.1` | (unmaintained) | (unmaintained) |
|
||||
| `master` | `1.3.0-rc1` | `1.3.0` |
|
||||
| `stable-1.3` | N/A| `1.3.0` |
|
||||
| `stable-1.2` | `1.2.1` | `1.2.2` |
|
||||
| `stable-1.1` | `1.1.3` | (unmaintained) |
|
||||
|
||||
Note, the stable-2.2 branch will still exist with tag 2.2.1, but under current plans it is
|
||||
not maintained further. The next tag applied to main will be 2.4.0-alpha0. We would then
|
||||
Note, the stable-1.1 branch will still exist with tag 1.1.3, but under current plans it is
|
||||
not maintained further. The next tag applied to master will be 1.4.0-alpha0. We would then
|
||||
create a couple of alpha releases gathering features targeted for that particular release (in
|
||||
this case 2.4.0), followed by a release candidate. The release candidate marks a feature freeze.
|
||||
this case 1.4.0), followed by a release candidate. The release candidate marks a feature freeze.
|
||||
A new stable branch is created for the release candidate. Only bug fixes and any security issues
|
||||
are added to the branch going forward until release 2.4.0 is made.
|
||||
are added to the branch going forward until release 1.4.0 is made.
|
||||
|
||||
## Backporting Process
|
||||
|
||||
Development that occurs against the main branch and applicable code commits should also be submitted
|
||||
Development that occurs against the master branch and applicable code commits should also be submitted
|
||||
against the stable branches. Some guidelines for this process follow::
|
||||
1. Only bug and security fixes which do not introduce inter-component dependencies are
|
||||
candidates for stable branches. These PRs should be marked with "bug" in GitHub.
|
||||
2. Once a PR is created against main which meets requirement of (1), a comparable one
|
||||
2. Once a PR is created against master which meets requirement of (1), a comparable one
|
||||
should also be submitted against the stable branches. It is the responsibility of the submitter
|
||||
to apply their pull request against stable, and it is the responsibility of the
|
||||
reviewers to help identify stable-candidate pull requests.
|
||||
|
||||
## Continuous Integration Testing
|
||||
|
||||
The test repository is forked to create stable branches from main. Full CI
|
||||
runs on each stable and main PR using its respective tests repository branch.
|
||||
The test repository is forked to create stable branches from master. Full CI
|
||||
runs on each stable and master PR using its respective tests repository branch.
|
||||
|
||||
### An alternative method for CI testing:
|
||||
|
||||
Ideally, the continuous integration infrastructure will run the same test suite on both main
|
||||
Ideally, the continuous integration infrastructure will run the same test suite on both master
|
||||
and the stable branches. When tests are modified or new feature tests are introduced, explicit
|
||||
logic should exist within the testing CI to make sure only applicable tests are executed against
|
||||
stable and main. While this is not in place currently, it should be considered in the long term.
|
||||
stable and master. While this is not in place currently, it should be considered in the long term.
|
||||
|
||||
## Release Management
|
||||
|
||||
### Patch releases
|
||||
|
||||
Releases are made every four weeks, which include a GitHub release as
|
||||
Releases are made every three weeks, which include a GitHub release as
|
||||
well as binary packages. These patch releases are made for both stable branches, and a "release candidate"
|
||||
for the next `MAJOR` or `MINOR` is created from main. If there are no changes across all the repositories, no
|
||||
for the next `MAJOR` or `MINOR` is created from master. If there are no changes across all the repositories, no
|
||||
release is created and an announcement is made on the developer mailing list to highlight this.
|
||||
If a release is being made, each repository is tagged for this release, regardless
|
||||
of whether changes are introduced. The release schedule can be seen on the
|
||||
@@ -136,16 +135,17 @@ The process followed for making a release can be found at [Release Process](Rele
|
||||
|
||||
### Frequency
|
||||
Minor releases are less frequent in order to provide a more stable baseline for users. They are currently
|
||||
running on a sixteen weeks cadence. The release schedule can be seen on the
|
||||
running on a twelve week cadence. As the Kata Containers code base has reached a certain level of
|
||||
maturity, we have increased the cadence from six weeks to twelve weeks. The release schedule can be seen on the
|
||||
[release rotation wiki page](https://github.com/kata-containers/community/wiki/Release-Team-Rota).
|
||||
|
||||
### Compatibility
|
||||
Kata guarantees compatibility between components that are within one minor release of each other.
|
||||
|
||||
This is critical for dependencies which cross between host (shimv2 runtime) and
|
||||
This is critical for dependencies which cross between host (runtime, shim, proxy) and
|
||||
the guest (hypervisor, rootfs and agent). For example, consider a cluster with a long-running
|
||||
deployment, workload-never-dies, all on Kata version 2.1.3 components. If the operator updates
|
||||
the Kata components to the next new minor release (i.e. 2.2.0), we need to guarantee that the 2.2.0
|
||||
shimv2 runtime still communicates with 2.1.3 agent within workload-never-dies.
|
||||
deployment, workload-never-dies, all on Kata version 1.1.3 components. If the operator updates
|
||||
the Kata components to the next new minor release (i.e. 1.2.0), we need to guarantee that the 1.2.0
|
||||
runtime still communicates with 1.1.3 agent within workload-never-dies.
|
||||
|
||||
Handling live-update is out of the scope of this document. See this [`kata-runtime` issue](https://github.com/kata-containers/runtime/issues/492) for details.
|
||||
|
||||
@@ -1,379 +0,0 @@
|
||||
# Unit Test Advice
|
||||
|
||||
## Overview
|
||||
|
||||
This document offers advice on writing a Unit Test (UT) in
|
||||
[Golang](https://golang.org) and [Rust](https://www.rust-lang.org).
|
||||
|
||||
## General advice
|
||||
|
||||
### Unit test strategies
|
||||
|
||||
#### Positive and negative tests
|
||||
|
||||
Always add positive tests (where success is expected) *and* negative
|
||||
tests (where failure is expected).
|
||||
|
||||
#### Boundary condition tests
|
||||
|
||||
Try to add unit tests that exercise boundary conditions such as:
|
||||
|
||||
- Missing values (`null` or `None`).
|
||||
- Empty strings and huge strings.
|
||||
- Empty (or uninitialised) complex data structures
|
||||
(such as lists, vectors and hash tables).
|
||||
- Common numeric values (such as `-1`, `0`, `1` and the minimum and
|
||||
maximum values).
|
||||
|
||||
#### Test unusual values
|
||||
|
||||
Also always consider "unusual" input values such as:
|
||||
|
||||
- String values containing spaces, Unicode characters, special
|
||||
characters, escaped characters or null bytes.
|
||||
|
||||
> **Note:** Consider these unusual values in prefix, infix and
|
||||
> suffix position.
|
||||
|
||||
- String values that cannot be converted into numeric values or which
|
||||
contain invalid structured data (such as invalid JSON).
|
||||
|
||||
#### Other types of tests
|
||||
|
||||
If the code requires other forms of testing (such as stress testing,
|
||||
fuzz testing and integration testing), raise a GitHub issue and
|
||||
reference it on the issue you are using for the main work. This
|
||||
ensures the test team are aware that a new test is required.
|
||||
|
||||
### Test environment
|
||||
|
||||
#### Create unique files and directories
|
||||
|
||||
Ensure your tests do not write to a fixed file or directory. This can
|
||||
cause problems when running multiple tests simultaneously and also
|
||||
when running tests after a previous test run failure.
|
||||
|
||||
#### Assume parallel testing
|
||||
|
||||
Always assume your tests will be run *in parallel*. If this is
|
||||
problematic for a test, force it to run in isolation using the
|
||||
`serial_test` crate for Rust code for example.
|
||||
|
||||
### Running
|
||||
|
||||
Ensure you run the unit tests and they all pass before raising a PR.
|
||||
Ideally do this on different distributions on different architectures
|
||||
to maximise coverage (and so minimise surprises when your code runs in
|
||||
the CI).
|
||||
|
||||
## Assertions
|
||||
|
||||
### Golang assertions
|
||||
|
||||
Use the `testify` assertions package to create a new assertion object as this
|
||||
keeps the test code free from distracting `if` tests:
|
||||
|
||||
```go
|
||||
func TestSomething(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
err := doSomething()
|
||||
assert.NoError(err)
|
||||
}
|
||||
```
|
||||
|
||||
### Rust assertions
|
||||
|
||||
Use the standard set of `assert!()` macros.
|
||||
|
||||
## Table driven tests
|
||||
|
||||
Try to write tests using a table-based approach. This allows you to distill
|
||||
the logic into a compact table (rather than spreading the tests across
|
||||
multiple test functions). It also makes it easy to cover all the
|
||||
interesting boundary conditions:
|
||||
|
||||
### Golang table driven tests
|
||||
|
||||
Assume the following function:
|
||||
|
||||
```go
|
||||
// The function under test.
|
||||
//
|
||||
// Accepts a string and an integer and returns the
|
||||
// result of sticking them together separated by a dash as a string.
|
||||
func joinParamsWithDash(str string, num int) (string, error) {
|
||||
if str == "" {
|
||||
return "", errors.New("string cannot be blank")
|
||||
}
|
||||
|
||||
if num <= 0 {
|
||||
return "", errors.New("number must be positive")
|
||||
}
|
||||
|
||||
return fmt.Sprintf("%s-%d", str, num), nil
|
||||
}
|
||||
```
|
||||
|
||||
A table driven approach to testing it:
|
||||
|
||||
```go
|
||||
import (
|
||||
"testing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
)
|
||||
|
||||
func TestJoinParamsWithDash(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
// Type used to hold function parameters and expected results.
|
||||
type testData struct {
|
||||
param1 string
|
||||
param2 int
|
||||
expectedResult string
|
||||
expectError bool
|
||||
}
|
||||
|
||||
// List of tests to run including the expected results
|
||||
data := []testData{
|
||||
// Failure scenarios
|
||||
{"", -1, "", true},
|
||||
{"", 0, "", true},
|
||||
{"", 1, "", true},
|
||||
{"foo", 0, "", true},
|
||||
{"foo", -1, "", true},
|
||||
|
||||
// Success scenarios
|
||||
{"foo", 1, "foo-1", false},
|
||||
{"bar", 42, "bar-42", false},
|
||||
}
|
||||
|
||||
// Run the tests
|
||||
for i, d := range data {
|
||||
// Create a test-specific string that is added to each assert
|
||||
// call. It will be displayed if any assert test fails.
|
||||
msg := fmt.Sprintf("test[%d]: %+v", i, d)
|
||||
|
||||
// Call the function under test
|
||||
result, err := joinParamsWithDash(d.param1, d.param2)
|
||||
|
||||
// update the message for more information on failure
|
||||
msg = fmt.Sprintf("%s, result: %q, err: %v", msg, result, err)
|
||||
|
||||
if d.expectError {
|
||||
assert.Error(err, msg)
|
||||
|
||||
// If an error is expected, there is no point
|
||||
// performing additional checks.
|
||||
continue
|
||||
}
|
||||
|
||||
assert.NoError(err, msg)
|
||||
assert.Equal(d.expectedResult, result, msg)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Rust table driven tests
|
||||
|
||||
Assume the following function:
|
||||
|
||||
```rust
|
||||
// Convenience type to allow Result return types to only specify the type
|
||||
// for the true case; failures are specified as static strings.
|
||||
// XXX: This is an example. In real code use the "anyhow" and
|
||||
// XXX: "thiserror" crates.
|
||||
pub type Result<T> = std::result::Result<T, &'static str>;
|
||||
|
||||
// The function under test.
|
||||
//
|
||||
// Accepts a string and an integer and returns the
|
||||
// result of sticking them together separated by a dash as a string.
|
||||
fn join_params_with_dash(str: &str, num: i32) -> Result<String> {
|
||||
if str.is_empty() {
|
||||
return Err("string cannot be blank");
|
||||
}
|
||||
|
||||
if num <= 0 {
|
||||
return Err("number must be positive");
|
||||
}
|
||||
|
||||
let result = format!("{}-{}", str, num);
|
||||
|
||||
Ok(result)
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
A table driven approach to testing it:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_join_params_with_dash() {
|
||||
// This is a type used to record all details of the inputs
|
||||
// and outputs of the function under test.
|
||||
#[derive(Debug)]
|
||||
struct TestData<'a> {
|
||||
str: &'a str,
|
||||
num: i32,
|
||||
result: Result<String>,
|
||||
}
|
||||
|
||||
// The tests can now be specified as a set of inputs and outputs
|
||||
let tests = &[
|
||||
// Failure scenarios
|
||||
TestData {
|
||||
str: "",
|
||||
num: 0,
|
||||
result: Err("string cannot be blank"),
|
||||
},
|
||||
TestData {
|
||||
str: "foo",
|
||||
num: -1,
|
||||
result: Err("number must be positive"),
|
||||
},
|
||||
|
||||
// Success scenarios
|
||||
TestData {
|
||||
str: "foo",
|
||||
num: 42,
|
||||
result: Ok("foo-42".to_string()),
|
||||
},
|
||||
TestData {
|
||||
str: "-",
|
||||
num: 1,
|
||||
result: Ok("--1".to_string()),
|
||||
},
|
||||
];
|
||||
|
||||
// Run the tests
|
||||
for (i, d) in tests.iter().enumerate() {
|
||||
// Create a string containing details of the test
|
||||
let msg = format!("test[{}]: {:?}", i, d);
|
||||
|
||||
// Call the function under test
|
||||
let result = join_params_with_dash(d.str, d.num);
|
||||
|
||||
// Update the test details string with the results of the call
|
||||
let msg = format!("{}, result: {:?}", msg, result);
|
||||
|
||||
// Perform the checks
|
||||
if d.result.is_ok() {
|
||||
assert!(result == d.result, msg);
|
||||
continue;
|
||||
}
|
||||
|
||||
let expected_error = format!("{}", d.result.as_ref().unwrap_err());
|
||||
let actual_error = format!("{}", result.unwrap_err());
|
||||
assert!(actual_error == expected_error, msg);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Temporary files
|
||||
|
||||
Always delete temporary files on success.
|
||||
|
||||
### Golang temporary files
|
||||
|
||||
```go
|
||||
func TestSomething(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
// Create a temporary directory
|
||||
tmpdir, err := os.MkdirTemp("", "")
|
||||
assert.NoError(err)
|
||||
|
||||
// Delete it at the end of the test
|
||||
defer os.RemoveAll(tmpdir)
|
||||
|
||||
// Add test logic that will use the tmpdir here...
|
||||
}
|
||||
```
|
||||
|
||||
### Rust temporary files
|
||||
|
||||
Use the `tempfile` crate which allows files and directories to be deleted
|
||||
automatically:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use tempfile::tempdir;
|
||||
|
||||
#[test]
|
||||
fn test_something() {
|
||||
|
||||
// Create a temporary directory (which will be deleted automatically
|
||||
let dir = tempdir().expect("failed to create tmpdir");
|
||||
|
||||
let filename = dir.path().join("file.txt");
|
||||
|
||||
// create filename ...
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
## Test user
|
||||
|
||||
[Unit tests are run *twice*](https://github.com/kata-containers/tests/blob/main/.ci/go-test.sh):
|
||||
|
||||
- as the current user
|
||||
- as the `root` user (if different to the current user)
|
||||
|
||||
When writing a test consider which user should run it; even if the code the
|
||||
test is exercising runs as `root`, it may be necessary to *only* run the test
|
||||
as a non-`root` for the test to be meaningful. Add appropriate skip
|
||||
guards around code that requires `root` and non-`root` so that the test
|
||||
will run if the correct type of user is detected and skipped if not.
|
||||
|
||||
### Run Golang tests as a different user
|
||||
|
||||
The main repository has the most comprehensive set of skip abilities. See:
|
||||
|
||||
- [`katatestutils`](../src/runtime/pkg/katatestutils)
|
||||
|
||||
### Run Rust tests as a different user
|
||||
|
||||
One method is to use the `nix` crate along with some custom macros:
|
||||
|
||||
```
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
#[allow(unused_macros)]
|
||||
macro_rules! skip_if_root {
|
||||
() => {
|
||||
if nix::unistd::Uid::effective().is_root() {
|
||||
println!("INFO: skipping {} which needs non-root", module_path!());
|
||||
return;
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
#[allow(unused_macros)]
|
||||
macro_rules! skip_if_not_root {
|
||||
() => {
|
||||
if !nix::unistd::Uid::effective().is_root() {
|
||||
println!("INFO: skipping {} which needs root", module_path!());
|
||||
return;
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_that_must_be_run_as_root() {
|
||||
// Not running as the superuser, so skip.
|
||||
skip_if_not_root!();
|
||||
|
||||
// Run test *iff* the user running the test is root
|
||||
|
||||
// ...
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -1,3 +1,16 @@
|
||||
* [Introduction](#introduction)
|
||||
* [Maintenance warning](#maintenance-warning)
|
||||
* [Determine current version](#determine-current-version)
|
||||
* [Determine latest version](#determine-latest-version)
|
||||
* [Configuration changes](#configuration-changes)
|
||||
* [Upgrade Kata Containers](#upgrade-kata-containers)
|
||||
* [Upgrade native distribution packaged version](#upgrade-native-distribution-packaged-version)
|
||||
* [Static installation](#static-installation)
|
||||
* [Determine if you are using a static installation](#determine-if-you-are-using-a-static-installation)
|
||||
* [Remove a static installation](#remove-a-static-installation)
|
||||
* [Upgrade a static installation](#upgrade-a-static-installation)
|
||||
* [Custom assets](#custom-assets)
|
||||
|
||||
# Introduction
|
||||
|
||||
This document outlines the options for upgrading from a
|
||||
@@ -35,10 +48,10 @@ Alternatively, if you are using Kata Containers version 1.12.0 or newer, you
|
||||
can check for newer releases using the command line:
|
||||
|
||||
```bash
|
||||
$ kata-runtime check --check-version-only
|
||||
$ kata-runtime kata-check --check-version-only
|
||||
```
|
||||
|
||||
There are various other related options. Run `kata-runtime check --help`
|
||||
There are various other related options. Run `kata-runtime kata-check --help`
|
||||
for further details.
|
||||
|
||||
# Configuration changes
|
||||
@@ -102,7 +115,7 @@ first
|
||||
[install the latest release](#determine-latest-version).
|
||||
|
||||
See the
|
||||
[manual installation documentation](install/README.md#manual-installation)
|
||||
[manual installation installation documentation](install/README.md#manual-installation)
|
||||
for details on how to automatically install and configuration a static release
|
||||
with containerd.
|
||||
|
||||
@@ -114,7 +127,7 @@ with containerd.
|
||||
> kernel or image.
|
||||
|
||||
If you are using custom
|
||||
[guest assets](design/architecture/README.md#guest-assets),
|
||||
[guest assets](design/architecture.md#guest-assets),
|
||||
you must upgrade them to work with Kata Containers 2.x since Kata
|
||||
Containers 1.x assets will **not** work.
|
||||
|
||||
|
||||
@@ -1,247 +0,0 @@
|
||||
# Code PR Advice
|
||||
|
||||
Before raising a PR containing code changes, we suggest you consider
|
||||
the following to ensure a smooth and fast process.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - All the advice in this document is optional. However, if the
|
||||
> advice provided is not followed, there is no guarantee your PR
|
||||
> will be merged.
|
||||
>
|
||||
> - All the check tools will be run automatically on your PR by the CI.
|
||||
> However, if you run them locally first, there is a much better
|
||||
> chance of a successful initial CI run.
|
||||
|
||||
## Assumptions
|
||||
|
||||
This document assumes you have already read (and in the case of the
|
||||
code of conduct agreed to):
|
||||
|
||||
- The [Kata Containers code of conduct](https://github.com/kata-containers/community/blob/main/CODE_OF_CONDUCT.md).
|
||||
- The [Kata Containers contributing guide](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).
|
||||
|
||||
## Code
|
||||
|
||||
### Architectures
|
||||
|
||||
Do not write architecture-specific code if it is possible to write the
|
||||
code generically.
|
||||
|
||||
### General advice
|
||||
|
||||
- Do not write code to impress: instead write code that is easy to read and understand.
|
||||
|
||||
- Always consider which user will run the code. Try to minimise
|
||||
the privileges the code requires.
|
||||
|
||||
### Comments
|
||||
|
||||
Always add comments if the intent of the code is not obvious. However,
|
||||
try to avoid comments if the code could be made clearer (for example
|
||||
by using more meaningful variable names).
|
||||
|
||||
### Constants
|
||||
|
||||
Don't embed magic numbers and strings in functions, particularly if
|
||||
they are used repeatedly.
|
||||
|
||||
Create constants at the top of the file instead.
|
||||
|
||||
### Copyright and license
|
||||
|
||||
Ensure all new files contain a copyright statement and an SPDX license
|
||||
identifier in the comments at the top of the file.
|
||||
|
||||
### FIXME and TODO
|
||||
|
||||
If the code contains areas that are not fully implemented, make this
|
||||
clear a comment which provides a link to a GitHub issue that provides
|
||||
further information.
|
||||
|
||||
Do not just rely on comments in this case though: if possible, return
|
||||
a "`BUG: feature X not implemented see {bug-url}`" type error.
|
||||
|
||||
### Functions
|
||||
|
||||
- Keep functions relatively short (less than 100 lines is a good "rule of thumb").
|
||||
|
||||
- Document functions if the parameters, return value or general intent
|
||||
of the function is not obvious.
|
||||
|
||||
- Always return errors where possible.
|
||||
|
||||
Do not discard error return values from the functions this function
|
||||
calls.
|
||||
|
||||
### Logging
|
||||
|
||||
- Don't use multiple log calls when a single log call could be used.
|
||||
|
||||
- Use structured logging where possible to allow
|
||||
[standard tooling](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
|
||||
be able to extract the log fields.
|
||||
|
||||
### Names
|
||||
|
||||
Give functions, macros and variables clear and meaningful names.
|
||||
|
||||
### Structures
|
||||
|
||||
#### Golang structures
|
||||
|
||||
Unlike Rust, Go does not enforce that all structure members be set.
|
||||
This has lead to numerous bugs in the past where code like the
|
||||
following is used:
|
||||
|
||||
```go
|
||||
type Foo struct {
|
||||
Key string
|
||||
Value string
|
||||
}
|
||||
|
||||
// BUG: Key not set, but nobody noticed! ;(
|
||||
let foo1 = Foo {
|
||||
Value: "foo",
|
||||
}
|
||||
```
|
||||
|
||||
A much safer approach is to create a constructor function to enforce
|
||||
integrity:
|
||||
|
||||
```go
|
||||
type Foo struct {
|
||||
Key string
|
||||
Value string
|
||||
}
|
||||
|
||||
func NewFoo(key, value string) (*Foo, error) {
|
||||
if key == "" {
|
||||
return nil, errors.New("Foo needs a key")
|
||||
}
|
||||
|
||||
if value == "" {
|
||||
return nil, errors.New("Foo needs a value")
|
||||
}
|
||||
|
||||
return &Foo{
|
||||
Key: key,
|
||||
Value: value,
|
||||
}, nil
|
||||
}
|
||||
|
||||
func testFoo() error {
|
||||
// BUG: Key not set, but nobody noticed! ;(
|
||||
badFoo := Foo{Value: "value"}
|
||||
|
||||
// Ok - the constructor performs needed validation
|
||||
goodFoo, err := NewFoo("name", "value")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
return nil
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The above is just an example. The *safest* approach would be to move
|
||||
> `NewFoo()` into a separate package and make `Foo` and it's elements
|
||||
> private. The compiler would then enforce the use of the constructor
|
||||
> to guarantee correctly defined objects.
|
||||
|
||||
|
||||
### Tracing
|
||||
|
||||
Consider if the code needs to create a new
|
||||
[trace span](./tracing.md).
|
||||
|
||||
Ensure any new trace spans added to the code are completed.
|
||||
|
||||
## Tests
|
||||
|
||||
### Unit tests
|
||||
|
||||
Where possible, code changes should be accompanied by unit tests.
|
||||
|
||||
Consider using the standard
|
||||
[table-based approach](Unit-Test-Advice.md)
|
||||
as it encourages you to make functions small and simple, and also
|
||||
allows you to think about what types of value to test.
|
||||
|
||||
### Other categories of test
|
||||
|
||||
Raised a GitHub issue in the
|
||||
[`tests`](https://github.com/kata-containers/tests) repository that
|
||||
explains what sort of test is required along with as much detail as
|
||||
possible. Ensure the original issue is referenced on the `tests` issue.
|
||||
|
||||
### Unsafe code
|
||||
|
||||
#### Rust language specifics
|
||||
|
||||
Minimise the use of `unsafe` blocks in Rust code and since it is
|
||||
potentially dangerous always write [unit tests][#unit-tests]
|
||||
for this code where possible.
|
||||
|
||||
`expect()` and `unwrap()` will cause the code to panic on error.
|
||||
Prefer to return a `Result` on error rather than using these calls to
|
||||
allow the caller to deal with the error condition.
|
||||
|
||||
The table below lists the small number of cases where use of
|
||||
`expect()` and `unwrap()` are permitted:
|
||||
|
||||
| Area | Rationale for permitting |
|
||||
|-|-|
|
||||
| In test code (the `tests` module) | Panics will cause the test to fail, which is desirable. |
|
||||
| `lazy_static!()` | This magic macro cannot "return" a value as it runs before `main()`. |
|
||||
| `defer!()` | Similar to golang's `defer()` but doesn't allow the use of `?`. |
|
||||
| `tokio::spawn(async move {})` | Cannot currently return a `Result` from an `async move` closure. |
|
||||
| If an explicit test is performed before the `unwrap()` / `expect()` | *"Just about acceptable"*, but not ideal `[*]` |
|
||||
| `Mutex.lock()` | Almost unrecoverable if failed in the lock acquisition |
|
||||
|
||||
|
||||
`[*]` - There can lead to bad *future* code: consider what would
|
||||
happen if the explicit test gets dropped in the future. This is easier
|
||||
to happen if the test and the extraction of the value are two separate
|
||||
operations. In summary, this strategy can introduce an insidious
|
||||
maintenance issue.
|
||||
|
||||
## Documentation
|
||||
|
||||
### General requirements
|
||||
|
||||
- All new features should be accompanied by documentation explaining:
|
||||
|
||||
- What the new feature does
|
||||
|
||||
- Why it is useful
|
||||
|
||||
- How to use the feature
|
||||
|
||||
- Any known issues or limitations
|
||||
|
||||
Links should be provided to GitHub issues tracking the issues
|
||||
|
||||
- The [documentation requirements document](Documentation-Requirements.md)
|
||||
explains how the project formats documentation.
|
||||
|
||||
### Markdown syntax
|
||||
|
||||
Run the
|
||||
[markdown checker](https://github.com/kata-containers/tests/tree/main/cmd/check-markdown)
|
||||
on your documentation changes.
|
||||
|
||||
### Spell check
|
||||
|
||||
Run the
|
||||
[spell checker](https://github.com/kata-containers/tests/tree/main/cmd/check-spelling)
|
||||
on your documentation changes.
|
||||
|
||||
## Finally
|
||||
|
||||
You may wish to read the documentation that the
|
||||
[Kata Review Team](https://github.com/kata-containers/community/blob/main/Rota-Process.md) use to help review PRs:
|
||||
|
||||
- [PR review guide](https://github.com/kata-containers/community/blob/main/PR-Review-Guide.md).
|
||||
- [documentation review process](https://github.com/kata-containers/community/blob/main/Documentation-Review-Process.md).
|
||||
@@ -2,16 +2,10 @@
|
||||
|
||||
Kata Containers design documents:
|
||||
|
||||
- [Kata Containers architecture](architecture)
|
||||
- [Kata Containers architecture](architecture.md)
|
||||
- [API Design of Kata Containers](kata-api-design.md)
|
||||
- [Design requirements for Kata Containers](kata-design-requirements.md)
|
||||
- [VSocks](VSocks.md)
|
||||
- [VCPU handling](vcpu-handling.md)
|
||||
- [Host cgroups](host-cgroups.md)
|
||||
- [`Inotify` support](inotify.md)
|
||||
- [Metrics(Kata 2.0)](kata-2-0-metrics.md)
|
||||
- [Design for Kata Containers `Lazyload` ability with `nydus`](kata-nydus-design.md)
|
||||
|
||||
---
|
||||
|
||||
- [Design proposals](proposals)
|
||||
|
||||
@@ -1,5 +1,12 @@
|
||||
# Kata Containers and VSOCKs
|
||||
|
||||
- [Introduction](#introduction)
|
||||
- [VSOCK communication diagram](#vsock-communication-diagram)
|
||||
- [System requirements](#system-requirements)
|
||||
- [Advantages of using VSOCKs](#advantages-of-using-vsocks)
|
||||
- [High density](#high-density)
|
||||
- [Reliability](#reliability)
|
||||
|
||||
## Introduction
|
||||
|
||||
There are two different ways processes in the virtual machine can communicate
|
||||
|
||||
|
Before Width: | Height: | Size: 101 KiB |
@@ -1 +1 @@
|
||||
<mxfile host="app.diagrams.net" modified="2021-11-05T13:07:32.992Z" agent="5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" etag="j5e7J3AOXxeQrt-Zz2uw" version="15.6.8" type="device"><diagram id="XNV8G0dePIPkhS_Khqr4" name="Page-1">7Vxdd9o4EP01nLP7QI5s+fORUNhNT7rNbnqaZl/2CCywG2OxQhDIr18Z29iyZD6CDZRuHho8toQ9986dGVlNC3Yny98omvqfiIfDlg68ZQt+aOm6AUyb/4otq8TiGnpiGNPAS0wgNzwGbzgxapl1Hnh4ltoSEyMkZMFUNA5JFOEhE2yIUvIqXjYioScYpmiMJcPjEIWy9SnwmJ9YdQjd/MTvOBj76VdDCNI7n6Ds6tQw85FHXgsm2GvBLiWEJZ8myy4OY++JjulXnN3cGcUR22fAn3fPzx+jj7e9HrIXA330feIZ7czPCxTO00dO75atMh+MKZlP08swZXip8jwaZJcD+ca0zeNyomAywYyu+CXpRG3NNM1kUMoSXU+PX3OfG9nEfsHdG56gFOfxZvbcE/xD6gy1Yz7/88nuPiLwcNcZDLvEvZ10Vm1zt1+4WyIPx5NoLXj76gcMP07RMD77yqOB23w2CdPTIxKxlN78nuHtmCIv4A7qkpDQ9XzQxsjC8blREIYFu4ewMxpy+4xR8oILZ6yhgwcjfkZ2+Xa0yzDK2JzP81ZzntcNtedHI8+1LNnzo9FIHyo971kDy7Qa8Xx61gJCSGhyRHCtkXHRLLMhXGwJF659/KFrCwtHBkAbIA3rKgAAsHqdfjqDCBn/aRIYTRORUWiVa8rAwKZwcSRcuCQzFESYcrNWLz6K4HFtD9i2QrZM7HiGCjtHH8Ak3ETs+n0A+v0msdNhCTv7RkZPM1TwNSV37lb4asw6VwifDc4MnqJ88ldTTBfBjHulDB1/SibiI/o2IhEuAZGaUBiMI3445B7lvIC3sc8CXqZ20hOTwPPir1ESIqcMUORDn9DgLaZcmF7QnHKWUhqQ4TNUpUZj6GkSeuM5nvGUBl4wjeJW5odAsDnAzBJihiLgrIYgUz6DrJYSRqdvVwzblol82qJZM3Y75sfrV9wKGC+pXdEa7BTP16/s7/lLbVc0uY+8hn7lYGCkdgVKyJy0XdHkPvJn6VcOxu4C2xVte7t5zf3K0fCdv12Ry6efpl05XDgvrFvR5V7zmruVw/E6Z7OiDjcoQYK9MX5MDwllPhmTCIW93FpyXn7NPSHTFMXvmLFV6lI0Z0TEGC8D9q3w+TmeiueN5OjDMp15fbCSMdJijDg0dPUtuzI+KMwSH+bTrI+yeWRss3xO5nSYsfb+y53jDp6+P1r+y+fXj+HLU97AMETHmG1xalo+xI7cygqKQ8SCBRZuQwXxemiHUrQqXDAlQcRmhZkfYoPQBNqOmJstu0gYxQjD1ksjnBLFkrvICbd5nCNUA0qqwRidDpXMvEcDLiMCm/ZXAopnwVvaV8dcSF3IJzdvW+YHBctktmwNo727+erWHdxormWIMpEcHUYXGV0osmFz09kUZDSaYSZJymEIqyNHLqirEb5A7Xmv1hTZZB+nPa6sPVtFaqf45HyDBrRFvtnHEa5WQqklQy7ir5g6aiE6hjpqEbuQtOUCUf52p63yCFOzS6w7Lm1t9WtB1LqFNrMXjfmnlm6FcYE74CZrHH/6ZdOLeq04F/T5v92/7tqff5UoXeeyj+U5tqXsPWHHNGA2w37LPtlLib0L37auOa4IKpQXpDXHkktfq4bSV4lf1tb+HBpyXOmr75t+4L7pZ28ROQpjXY7RBxrP7eP5LH5wTBdYXla4rsATyz4LqALPaCbw1Mn7nHGXx9pzMdR2xF0eas/ZfCfJ3VDRcqrFrPa4e2fytk1bSbfq5F0eYTpmrclbyUH5XaTP2FRJzMvsOPUKJTi44QQ3Um6uqd/MXm9l/aZuiFM01x7ICwqV6J5MdqCY8QGwTqI98aQPmAbcpTFTj1wC21+P4J56tGGhCQEUK8QjaVhNs8NVzbHFezNAaSP7TlUrjTha1dDX0f1d+292B77+8dRGX7/MfHCmzPpOplaycPcah9tIslM1lpq4jWZTPGWTJBGTjjuGYVXftKXpLY0wHbh9hA5ca9uIZtpkKKfaF8RQe0KigCle6V3sXofDa2/NNfWbCgIVqm/dVLxgba76lrcvXGjb+64yetvK1u4VMKtuYTkOKjl0frQXI5VR8446FZqmXlNlCsqvQkqyXktpqnxnvMdWvOZ3hzqGmAgMR7EmYG928hR1yW5s05XCMcnaqRcsBAdZ/87j/5C4pmR7tuZkh1+gGdPlmpjZ+WzFNV9wbc/8YNJep5/FZnp+t+tvSC6uLx8ZDeejrfQ6aDtqBdTNrbzKujYjw5f6XK/a8esAwIt4hes7JgAGOKXrs/2o2o1e2qUtb51T1QZ1bL5SPoL8mvYCxEn5pkDNWKcpxir3rv+vToeGiF1BnMtSJ3n16ArUaX/XV6qTKW9Wq0md+GH+RwaSSiv/Ww2w9x8=</diagram></mxfile>
|
||||
<mxfile host="Chrome" modified="2020-07-02T06:44:28.736Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36" etag="r7FpfnbGNK7jbg54Gu9x" version="13.3.5" type="device"><diagram id="XNV8G0dePIPkhS_Khqr4" name="Page-1">7VvZcuI4FP0aHqFky+sjkNCTqfR0qtLV6fTLlMDy0hiLscWWrx8Zy3iRQkhjQxbygnVlhK1zz9HRkg4cztZfYjT3vxIHhx0VOOsOvOqoKrRthX2kkU0WMTWYBbw4cLIQKAL3wRPOgkoeXQQOTngsC1FCQhrMq8EJiSI8oZUYimOyqt7mktCpBObIw0LgfoJCMfoQONTPoiqEdlHxFw48n/80hIA/+Qzld/NA4iOHrEoheN2Bw5gQml3N1kMcpr1X7ZjRM7W7J4txRA/5wrd/v5rDewTubvrjyZDYg1l/01W0rJklChf8lfnT0k3eBzFZRA5OW1E6cLDyA4rv52iS1q4Y6izm01nIq10SUQ4jwxAOvBg5AXvCIQlJvG0PmhgZOK1zgzAsxR2ELXfC4gmNyRSXaoyJhccuqxHfmXfDEscUr0sh3gdfMJlhGm/YLby2q+i6nn2J56QOzKy8KhDWctT8Eri7rEQ8q7xd60W/swve9a+BQW8PBlWTw+C6jm0YIgyu66oTKQyOMTZ0oykYNLsGQ/7OJRgYnUQYFENvCYYWUXgvZFDss5PBuHBBVc7OBQUKvY4dNjbyIompTzwSofC6iA4KXNKULu65JWTO0fiNKd1wONCCkipWeB3Qn6Xrx7Spns5LV2ve8raw4YUyy7R9iCRkEU/4u3i3328se/zw+97wp99Wf4fTh2I0pCj2MN3TOZwjaYfsBTjGIaLBsmomGodKhQJjKI3nEymAt2jMPFql01EYeBG7nrAOwyy/B2nqBswE9XnFLHCcDF+cBE9ovG0v7fo5CSK6fR190NGvDgJjb7YJpNlZO/6rFfMkJRPoKbahVUUtKx2MBm/8Ln27Ustqmonldrt2tQ3iugnLmzqeu4c8COK9mVkRRSNkXTpwgmUFZeO/RWopt0h0ky0UfXaDos3XWzzyenblpZ+sfykKIhw73cQPZt0poqi73DXPHnf7C9nNzY2Hmqi2yhgpWJWpLQDGdX/EW6jqM/trSoUts6bCUBwLFdPMk6Csw0YDg6EcePMV3H6D4szwiDc/y4XSt9Ji8bVtSSbq5nGibouivpdjL6o6TxjQA5ZuVjMGHKc0jQqJfKwQzdQHzpwj7YAkc+Sj17nswN7HLklGofHNCbglCrjhWKahyQQc9nUN5i20JeBMnMHLAi6bzLQm35r+megGjqKbeqhQw0OF+jR8U0W+3cVp2z5eJOmL45gl8ccmnm1XiGdIltQUS2uHePJx7py8K7j2WKbaC7wrqPaYt3eSYQ5KZr1yMTsb76QQi1Min9K5FPe3OelVnyHaq+e8oMfYVWXgkVPefIKrKL3aAlN71lRcxXgWz5PxWP2YRIYHEnmXXxBa1fSyj8uvRrMJ/XBvb7q/6A348c9DF/34nvjgzANA61lzgizJMX4jNguKer9dqpqRKKCkXX917pUpW1drS48yh6Xqp1yZ0kS9Tshk2hwOsl0xCwATynDo6wBooG0cLFibYFoiCjIQYGs2VxH6+43OL/9omNu32vLyqozxpuyqKurXe9uleZYyf+BYoa6rx3mInJWGUuFkVwOn86yKgOllW6Zp0TVr2zKaRHRPvS2jiWT+8IOfqcBeDQodqucd/8TdMeSl7/iRzaCm1bYpVREEWwZCW0dFLAGEnXilCtcsGJLTO7bpANOUEEbHliNdFbXUMczO+1SBGo0AGI2aAgqqdaC0XKTK0qWdkjB79oZYWJw0f1qsDMkgc1Kk8vN15fWwzRzHyyCRzHbZi9IqGNWOjEiEa73OQ4f7Shn61blE/aidT+LgKc2vsPPCssWrzizWBVB2ZlF2ZLE1qEQb6C1wkprUKY6j9Ez8u4CrmeEJVNGBsk1Y46TwiCdKP59L0HOhOpdLkJxkutgE2dCjw/PbBGW/p7v4hB1Y5tl9gmjpLj5B6hNka+Yn9QmqaOkuPmGHjtaeT2DF4h/tsrW/4v8V4fX/</diagram></mxfile>
|
||||
|
Before Width: | Height: | Size: 90 KiB After Width: | Height: | Size: 93 KiB |
@@ -1 +0,0 @@
|
||||
<mxfile host="app.diagrams.net" modified="2022-01-18T14:06:01.890Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" etag="nId-8OV6FDjWTDgzqDu-" version="15.8.9" type="device"><diagram id="bkF_ZONM9sPFCpIYoGFl" name="Page-1">5Vtbj6M2GP01eUyEbW55nGSmM2q70mqnUnf6UrnBCdYSnIIzSfbX14AJYDsLSSDZUWdHGvxhDD7n8F1sdoTm6/1zgjfhJxaQaAStYD9CjyMIgQ1d8SezHAqL504LwyqhgexUGV7pdyKNlrRuaUDSRkfOWMTppmlcsDgmC96w4SRhu2a3JYuad93gFdEMrwsc6dY/acDDwupDr7K/ELoKyzuDcn5rXHaWM0lDHLBdzYSeRmieMMaLo/V+TqIMvBKX4rpfTpw9PlhCYt7lgpd4/J1SSP9++bR8Cb6A1de/XsZ2Mco7jrZywiFLuXxgfihR4GQv7jEL+ToSBiAOU56wb2TOIpYIS8xi0XO2pFGkmHBEV7FoLsRTEmGfvZOEU4HvgzyxpkGQ3Wa2Cyknrxu8yO65E2oStoRt44BkE7BES5+xBCEbk+xrJonAM2FrwpOD6CLP+o5kQ8rRsZ2ivavItWWXsMZrSSKWclodR64QFwcSdDMB8ycYj+f+Hw/+b3jxvN57e/vXsa8RoIHfBMEEU42XJYu5fIugfeSplC5QSBpBtHSyf/LKmr340ZgWZ9z858iHBr6BopN8INDkAwGdj6llIMSxh2JkamDEjbhEqEGN+++WlSfGaY76g+gA3c2+OimOVtnf+BBs03Ea400aMp69DHJY8ZTFyEW/H/AP+uC/D9aQNbFAkzjDiwQ8A3H+ULyVSrqCOARNxInQwjGNSRIMzth0OMacCYJN14csnTFnOkG+Tpo3GGnAQJqCJomDhyySZ1EkwmlKFzlKOOG6uYZr023WUBYTRDOBW3L4mp2cOGXzTV6ZNx738sqidWjEIBJoWYMWlFK2TRakg2DFTFaEt3kkndoab47JQ0pbQiLM6XvzeU1Eyjt8ZjR/W0rluErELD10OUQxT3lVPf9QBrIVV2+7ykAFDtpAua6O075Cauh6x97iH8ZpSNfjb5jj8TscxFn04Aocx2n3A65BUMM5AT0L7c+lwqFcqg8UHKEeAVGJdSOXdAYD0rle4tOTucvw4W8wrhyvyZU7NWQr0KB5dzCq3OupMqaZufcRVWnOzwfNVnxbiTlTg4tCP4h5/dPlXZin1KA7phxjkT3DRtZhTbxj+0Tikbc+k4SKCWWFdGHcU/61HF4cv1UJjWhVI2WNITIYdM/MxIOKStSEomtmosrNVVOcoTOTDosAncWl5LNWm6ykgirVvNX0dCMFdciBC0ruJjWkKAReKjWnZaCBpQZNRfLFUmu6sFYPdmdn1bXcuq9Xc1WFqClIV6mpA3nWjaV2aWlfl9oFkql5QgvYTYkC95Ioexd/Z/9MoVWLiJ39HWiJ0UOLEBpEeF6aDXxTmr3akrRzhv0zbZ9cl5grcdBxJL732j6BpqWDM/k1llHFNthHordZifn9EA6A4gmQYZXjtozraxxzoFFyaU2bB4hBalpggROpX1tRO9gaBNTXILLt6GX6IeH0O8KJBoNTXyOg6+zzAhGOPw6sSi3sGTZkgWlDdjhYTdXxmS7eMbn4NBSwBDQZZJ2s9OwRWfJ+qJmq+bxxq/yGxKAOteStNzc0t2BC6aZeodx1/d/LV0kdfeve8jXtB95ZvtNpO0i3VW+Hrbm2Iv70RjysL0DWS/xbrQkVL+e9qmzfP8H2uVW2Fhrs21bZyLTv2K9KykWd4wJkvx9rtK7HFFnIvZQCLNiXVFxVKt7kxmLRq47yo7g8mpmL63Mrahm4TtbTqXDjNF79nnd7tCvLF0leZmLi8mWUazYUFxIxwmyT4ZIj5czEr0Bznq1IOuJZ56INqrb4zbonfM5i8fiY5pojOOW7bO0okzzHHP+Tz1Sv4HvLiFzHLJ2adD3DZwrDxZRet7vO24MIcBoe43mP7qEQ9f3cg6VwrC6/dHUP6kYXALA//yCa1efuRffqPw2gp/8A</diagram></mxfile>
|
||||
|
Before Width: | Height: | Size: 51 KiB |
|
Before Width: | Height: | Size: 390 KiB |
|
Before Width: | Height: | Size: 942 KiB |
|
Before Width: | Height: | Size: 182 KiB |
311
docs/design/architecture.md
Normal file
@@ -0,0 +1,311 @@
|
||||
# Kata Containers Architecture
|
||||
|
||||
|
||||
- [Kata Containers Architecture](#kata-containers-architecture)
|
||||
- [Overview](#overview)
|
||||
- [Virtualization](#virtualization)
|
||||
- [Guest assets](#guest-assets)
|
||||
- [Guest kernel](#guest-kernel)
|
||||
- [Guest image](#guest-image)
|
||||
- [Root filesystem image](#root-filesystem-image)
|
||||
- [Initrd image](#initrd-image)
|
||||
- [Agent](#agent)
|
||||
- [Runtime](#runtime)
|
||||
- [Configuration](#configuration)
|
||||
- [Networking](#networking)
|
||||
- [Network Hotplug](#network-hotplug)
|
||||
- [Storage](#storage)
|
||||
- [Kubernetes support](#kubernetes-support)
|
||||
- [OCI annotations](#oci-annotations)
|
||||
- [Mixing VM based and namespace based runtimes](#mixing-vm-based-and-namespace-based-runtimes)
|
||||
- [Appendices](#appendices)
|
||||
- [DAX](#dax)
|
||||
|
||||
## Overview
|
||||
|
||||
This is an architectural overview of Kata Containers, based on the 2.0 release.
|
||||
|
||||
The primary deliverable of the Kata Containers project is a CRI friendly shim. There is also a CRI friendly library API behind them.
|
||||
|
||||
The [Kata Containers runtime](../../src/runtime)
|
||||
is compatible with the [OCI](https://github.com/opencontainers) [runtime specification](https://github.com/opencontainers/runtime-spec)
|
||||
and therefore works seamlessly with the [Kubernetes\* Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/container-runtime-interface.md)
|
||||
through the [CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and
|
||||
[Containerd\*](https://github.com/containerd/containerd) implementation.
|
||||
|
||||
Kata Containers creates a QEMU\*/KVM virtual machine for pod that `kubelet` (Kubernetes) creates respectively.
|
||||
|
||||
The [`containerd-shim-kata-v2` (shown as `shimv2` from this point onwards)](../../src/runtime/containerd-shim-v2)
|
||||
is the Kata Containers entrypoint, which
|
||||
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2) for Kata.
|
||||
|
||||
Before `shimv2` (as done in [Kata Containers 1.x releases](https://github.com/kata-containers/runtime/releases)), we need to create a `containerd-shim` and a [`kata-shim`](https://github.com/kata-containers/shim) for each container and the Pod sandbox itself, plus an optional [`kata-proxy`](https://github.com/kata-containers/proxy) when VSOCK is not available. With `shimv2`, Kubernetes can launch Pod and OCI compatible containers with one shim (the `shimv2`) per Pod instead of `2N+1` shims, and no standalone `kata-proxy` process even if no VSOCK is available.
|
||||
|
||||

|
||||
|
||||
The container process is then spawned by
|
||||
[`kata-agent`](../../src/agent), an agent process running
|
||||
as a daemon inside the virtual machine. `kata-agent` runs a [`ttRPC`](https://github.com/containerd/ttrpc-rust) server in
|
||||
the guest using a VIRTIO serial or VSOCK interface which QEMU exposes as a socket
|
||||
file on the host. `shimv2` uses a `ttRPC` protocol to communicate with
|
||||
the agent. This protocol allows the runtime to send container management
|
||||
commands to the agent. The protocol is also used to carry the I/O streams (stdout,
|
||||
stderr, stdin) between the containers and the manage engines (e.g. CRI-O or containerd).
|
||||
|
||||
For any given container, both the init process and all potentially executed
|
||||
commands within that container, together with their related I/O streams, need
|
||||
to go through the VSOCK interface exported by QEMU.
|
||||
|
||||
The container workload, that is, the actual OCI bundle rootfs, is exported from the
|
||||
host to the virtual machine. In the case where a block-based graph driver is
|
||||
configured, `virtio-scsi` will be used. In all other cases a 9pfs VIRTIO mount point
|
||||
will be used. `kata-agent` uses this mount point as the root filesystem for the
|
||||
container processes.
|
||||
|
||||
## Virtualization
|
||||
|
||||
How Kata Containers maps container concepts to virtual machine technologies, and how this is realized in the multiple
|
||||
hypervisors and VMMs that Kata supports is described within the [virtualization documentation](./virtualization.md)
|
||||
|
||||
## Guest assets
|
||||
|
||||
The hypervisor will launch a virtual machine which includes a minimal guest kernel
|
||||
and a guest image.
|
||||
|
||||
### Guest kernel
|
||||
|
||||
The guest kernel is passed to the hypervisor and used to boot the virtual
|
||||
machine. The default kernel provided in Kata Containers is highly optimized for
|
||||
kernel boot time and minimal memory footprint, providing only those services
|
||||
required by a container workload. This is based on a very current upstream Linux
|
||||
kernel.
|
||||
|
||||
### Guest image
|
||||
|
||||
Kata Containers supports both an `initrd` and `rootfs` based minimal guest image.
|
||||
|
||||
#### Root filesystem image
|
||||
|
||||
The default packaged root filesystem image, sometimes referred to as the "mini O/S", is a
|
||||
highly optimized container bootstrap system based on [Clear Linux](https://clearlinux.org/). It provides an extremely minimal environment and
|
||||
has a highly optimized boot path.
|
||||
|
||||
The only services running in the context of the mini O/S are the init daemon
|
||||
(`systemd`) and the [Agent](#agent). The real workload the user wishes to run
|
||||
is created using libcontainer, creating a container in the same manner that is done
|
||||
by `runc`.
|
||||
|
||||
For example, when `ctr run -ti ubuntu date` is run:
|
||||
|
||||
- The hypervisor will boot the mini-OS image using the guest kernel.
|
||||
- `systemd`, running inside the mini-OS context, will launch the `kata-agent` in
|
||||
the same context.
|
||||
- The agent will create a new confined context to run the specified command in
|
||||
(`date` in this example).
|
||||
- The agent will then execute the command (`date` in this example) inside this
|
||||
new context, first setting the root filesystem to the expected Ubuntu\* root
|
||||
filesystem.
|
||||
|
||||
#### Initrd image
|
||||
|
||||
A compressed `cpio(1)` archive, created from a rootfs which is loaded into memory and used as part of the Linux startup process. During startup, the kernel unpacks it into a special instance of a `tmpfs` that becomes the initial root filesystem.
|
||||
|
||||
The only service running in the context of the initrd is the [Agent](#agent) as the init daemon. The real workload the user wishes to run is created using libcontainer, creating a container in the same manner that is done by `runc`.
|
||||
|
||||
## Agent
|
||||
|
||||
[`kata-agent`](../../src/agent) is a process running in the guest as a supervisor for managing containers and processes running within those containers.
|
||||
|
||||
For the 2.0 release, the `kata-agent` is rewritten in the [RUST programming language](https://www.rust-lang.org/) so that we can minimize its memory footprint while keeping the memory safety of the original GO version of [`kata-agent` used in Kata Container 1.x](https://github.com/kata-containers/agent). This memory footprint reduction is pretty impressive, from tens of megabytes down to less than 100 kilobytes, enabling Kata Containers in more use cases like functional computing and edge computing.
|
||||
|
||||
The `kata-agent` execution unit is the sandbox. A `kata-agent` sandbox is a container sandbox defined by a set of namespaces (NS, UTS, IPC and PID). `shimv2` can
|
||||
run several containers per VM to support container engines that require multiple
|
||||
containers running inside a pod.
|
||||
|
||||
`kata-agent` communicates with the other Kata components over `ttRPC`.
|
||||
|
||||
## Runtime
|
||||
|
||||
`containerd-shim-kata-v2` is a [containerd runtime shimv2](https://github.com/containerd/containerd/blob/v1.4.1/runtime/v2/README.md) implementation and is responsible for handling the `runtime v2 shim APIs`, which is similar to [the OCI runtime specification](https://github.com/opencontainers/runtime-spec) but simplifies the architecture by loading the runtime once and making RPC calls to handle the various container lifecycle commands. This refinement is an improvement on the OCI specification which requires the container manager call the runtime binary multiple times, at least once for each lifecycle command.
|
||||
|
||||
`containerd-shim-kata-v2` heavily utilizes the
|
||||
[virtcontainers package](../../src/runtime/virtcontainers/), which provides a generic, runtime-specification agnostic, hardware-virtualized containers library.
|
||||
|
||||
### Configuration
|
||||
|
||||
The runtime uses a TOML format configuration file called `configuration.toml`. By default this file is installed in the `/usr/share/defaults/kata-containers` directory and contains various settings such as the paths to the hypervisor, the guest kernel and the mini-OS image.
|
||||
|
||||
The actual configuration file paths can be determined by running:
|
||||
```
|
||||
$ kata-runtime --kata-show-default-config-paths
|
||||
```
|
||||
Most users will not need to modify the configuration file.
|
||||
|
||||
The file is well commented and provides a few "knobs" that can be used to modify the behavior of the runtime and your chosen hypervisor.
|
||||
|
||||
The configuration file is also used to enable runtime [debug output](../Developer-Guide.md#enable-full-debug).
|
||||
|
||||
## Networking
|
||||
|
||||
Containers will typically live in their own, possibly shared, networking namespace.
|
||||
At some point in a container lifecycle, container engines will set up that namespace
|
||||
to add the container to a network which is isolated from the host network, but
|
||||
which is shared between containers
|
||||
|
||||
In order to do so, container engines will usually add one end of a virtual
|
||||
ethernet (`veth`) pair into the container networking namespace. The other end of
|
||||
the `veth` pair is added to the host networking namespace.
|
||||
|
||||
This is a very namespace-centric approach as many hypervisors/VMMs cannot handle `veth`
|
||||
interfaces. Typically, `TAP` interfaces are created for VM connectivity.
|
||||
|
||||
To overcome incompatibility between typical container engines expectations
|
||||
and virtual machines, Kata Containers networking transparently connects `veth`
|
||||
interfaces with `TAP` ones using Traffic Control:
|
||||
|
||||

|
||||
|
||||
With a TC filter in place, a redirection is created between the container network and the
|
||||
virtual machine. As an example, the CNI may create a device, `eth0`, in the container's network
|
||||
namespace, which is a VETH device. Kata Containers will create a tap device for the VM, `tap0_kata`,
|
||||
and setup a TC redirection filter to mirror traffic from `eth0`'s ingress to `tap0_kata`'s egress,
|
||||
and a second to mirror traffic from `tap0_kata`'s ingress to `eth0`'s egress.
|
||||
|
||||
Kata Containers maintains support for MACVTAP, which was an earlier implementation used in Kata. TC-filter
|
||||
is the default because it allows for simpler configuration, better CNI plugin compatibility, and performance
|
||||
on par with MACVTAP.
|
||||
|
||||
Kata Containers has deprecated support for bridge due to lacking performance relative to TC-filter and MACVTAP.
|
||||
|
||||
Kata Containers supports both
|
||||
[CNM](https://github.com/docker/libnetwork/blob/master/docs/design.md#the-container-network-model)
|
||||
and [CNI](https://github.com/containernetworking/cni) for networking management.
|
||||
|
||||
### Network Hotplug
|
||||
|
||||
Kata Containers has developed a set of network sub-commands and APIs to add, list and
|
||||
remove a guest network endpoint and to manipulate the guest route table.
|
||||
|
||||
The following diagram illustrates the Kata Containers network hotplug workflow.
|
||||
|
||||

|
||||
|
||||
## Storage
|
||||
Container workloads are shared with the virtualized environment through [virtio-fs](https://virtio-fs.gitlab.io/).
|
||||
|
||||
The [devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/master/snapshots/devmapper) is a special case. The `snapshotter` uses dedicated block devices rather than formatted filesystems, and operates at the block level rather than the file level. This knowledge is used to directly use the underlying block device instead of the overlay file system for the container root file system. The block device maps to the top read-write layer for the overlay. This approach gives much better I/O performance compared to using `virtio-fs` to share the container file system.
|
||||
|
||||
Kata Containers has the ability to hotplug and remove block devices, which makes it possible to use block devices for containers started after the VM has been launched.
|
||||
|
||||
Users can check to see if the container uses the devicemapper block device as its rootfs by calling `mount(8)` within the container. If the devicemapper block device
|
||||
is used, `/` will be mounted on `/dev/vda`. Users can disable direct mounting of the underlying block device through the runtime configuration.
|
||||
|
||||
## Kubernetes support
|
||||
|
||||
[Kubernetes\*](https://github.com/kubernetes/kubernetes/) is a popular open source
|
||||
container orchestration engine. In Kubernetes, a set of containers sharing resources
|
||||
such as networking, storage, mount, PID, etc. is called a
|
||||
[Pod](https://kubernetes.io/docs/user-guide/pods/).
|
||||
A node can have multiple pods, but at a minimum, a node within a Kubernetes cluster
|
||||
only needs to run a container runtime and a container agent (called a
|
||||
[Kubelet](https://kubernetes.io/docs/admin/kubelet/)).
|
||||
|
||||
A Kubernetes cluster runs a control plane where a scheduler (typically running on a
|
||||
dedicated master node) calls into a compute Kubelet. This Kubelet instance is
|
||||
responsible for managing the lifecycle of pods within the nodes and eventually relies
|
||||
on a container runtime to handle execution. The Kubelet architecture decouples
|
||||
lifecycle management from container execution through the dedicated
|
||||
`gRPC` based [Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/container-runtime-interface-v1.md).
|
||||
|
||||
In other words, a Kubelet is a CRI client and expects a CRI implementation to
|
||||
handle the server side of the interface.
|
||||
[CRI-O\*](https://github.com/kubernetes-incubator/cri-o) and [Containerd\*](https://github.com/containerd/containerd/) are CRI implementations that rely on [OCI](https://github.com/opencontainers/runtime-spec)
|
||||
compatible runtimes for managing container instances.
|
||||
|
||||
Kata Containers is an officially supported CRI-O and Containerd runtime. Refer to the following guides on how to set up Kata Containers with Kubernetes:
|
||||
|
||||
- [How to use Kata Containers and Containerd](../how-to/containerd-kata.md)
|
||||
- [Run Kata Containers with Kubernetes](../how-to/run-kata-with-k8s.md)
|
||||
|
||||
#### OCI annotations
|
||||
|
||||
In order for the Kata Containers runtime (or any virtual machine based OCI compatible
|
||||
runtime) to be able to understand if it needs to create a full virtual machine or if it
|
||||
has to create a new container inside an existing pod's virtual machine, CRI-O adds
|
||||
specific annotations to the OCI configuration file (`config.json`) which is passed to
|
||||
the OCI compatible runtime.
|
||||
|
||||
Before calling its runtime, CRI-O will always add a `io.kubernetes.cri-o.ContainerType`
|
||||
annotation to the `config.json` configuration file it produces from the Kubelet CRI
|
||||
request. The `io.kubernetes.cri-o.ContainerType` annotation can either be set to `sandbox`
|
||||
or `container`. Kata Containers will then use this annotation to decide if it needs to
|
||||
respectively create a virtual machine or a container inside a virtual machine associated
|
||||
with a Kubernetes pod:
|
||||
|
||||
```Go
|
||||
containerType, err := ociSpec.ContainerType()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
handleFactory(ctx, runtimeConfig)
|
||||
|
||||
disableOutput := noNeedForOutput(detach, ociSpec.Process.Terminal)
|
||||
|
||||
var process vc.Process
|
||||
switch containerType {
|
||||
case vc.PodSandbox:
|
||||
process, err = createSandbox(ctx, ociSpec, runtimeConfig, containerID, bundlePath, console, disableOutput, systemdCgroup)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
case vc.PodContainer:
|
||||
process, err = createContainer(ctx, ociSpec, containerID, bundlePath, console, disableOutput)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
#### Mixing VM based and namespace based runtimes
|
||||
|
||||
> **Note:** Since Kubernetes 1.12, the [`Kubernetes RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/)
|
||||
> has been supported and the user can specify runtime without the non-standardized annotations.
|
||||
|
||||
With `RuntimeClass`, users can define Kata Containers as a `RuntimeClass` and then explicitly specify that a pod being created as a Kata Containers pod. For details, please refer to [How to use Kata Containers and Containerd](../../docs/how-to/containerd-kata.md).
|
||||
|
||||
|
||||
# Appendices
|
||||
|
||||
## DAX
|
||||
|
||||
Kata Containers utilizes the Linux kernel DAX [(Direct Access filesystem)](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.txt)
|
||||
feature to efficiently map some host-side files into the guest VM space.
|
||||
In particular, Kata Containers uses the QEMU NVDIMM feature to provide a
|
||||
memory-mapped virtual device that can be used to DAX map the virtual machine's
|
||||
root filesystem into the guest memory address space.
|
||||
|
||||
Mapping files using DAX provides a number of benefits over more traditional VM
|
||||
file and device mapping mechanisms:
|
||||
|
||||
- Mapping as a direct access devices allows the guest to directly access
|
||||
the host memory pages (such as via Execute In Place (XIP)), bypassing the guest
|
||||
page cache. This provides both time and space optimizations.
|
||||
- Mapping as a direct access device inside the VM allows pages from the
|
||||
host to be demand loaded using page faults, rather than having to make requests
|
||||
via a virtualized device (causing expensive VM exits/hypercalls), thus providing
|
||||
a speed optimization.
|
||||
- Utilizing `MAP_SHARED` shared memory on the host allows the host to efficiently
|
||||
share pages.
|
||||
|
||||
Kata Containers uses the following steps to set up the DAX mappings:
|
||||
1. QEMU is configured with an NVDIMM memory device, with a memory file
|
||||
backend to map in the host-side file into the virtual NVDIMM space.
|
||||
2. The guest kernel command line mounts this NVDIMM device with the DAX
|
||||
feature enabled, allowing direct page mapping and access, thus bypassing the
|
||||
guest page cache.
|
||||
|
||||

|
||||
|
||||
Information on the use of NVDIMM via QEMU is available in the [QEMU source code](http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/nvdimm.txt;hb=HEAD)
|
||||
@@ -1,477 +0,0 @@
|
||||
# Kata Containers Architecture
|
||||
|
||||
## Overview
|
||||
|
||||
Kata Containers is an open source community working to build a secure
|
||||
container [runtime](#runtime) with lightweight virtual machines (VM's)
|
||||
that feel and perform like standard Linux containers, but provide
|
||||
stronger [workload](#workload) isolation using hardware
|
||||
[virtualization](#virtualization) technology as a second layer of
|
||||
defence.
|
||||
|
||||
Kata Containers runs on [multiple architectures](../../../src/runtime/README.md#platform-support)
|
||||
and supports [multiple hypervisors](../../hypervisors.md).
|
||||
|
||||
This document is a summary of the Kata Containers architecture.
|
||||
|
||||
## Background knowledge
|
||||
|
||||
This document assumes the reader understands a number of concepts
|
||||
related to containers and file systems. The
|
||||
[background](background.md) document explains these concepts.
|
||||
|
||||
## Example command
|
||||
|
||||
This document makes use of a particular [example
|
||||
command](example-command.md) throughout the text to illustrate certain
|
||||
concepts.
|
||||
|
||||
## Virtualization
|
||||
|
||||
For details on how Kata Containers maps container concepts to VM
|
||||
technologies, and how this is realized in the multiple hypervisors and
|
||||
VMMs that Kata supports see the
|
||||
[virtualization documentation](../virtualization.md).
|
||||
|
||||
## Compatibility
|
||||
|
||||
The [Kata Containers runtime](../../../src/runtime) is compatible with
|
||||
the [OCI](https://github.com/opencontainers)
|
||||
[runtime specification](https://github.com/opencontainers/runtime-spec)
|
||||
and therefore works seamlessly with the
|
||||
[Kubernetes Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/container-runtime-interface.md)
|
||||
through the [CRI-O](https://github.com/kubernetes-incubator/cri-o)
|
||||
and [containerd](https://github.com/containerd/containerd)
|
||||
implementations.
|
||||
|
||||
Kata Containers provides a ["shimv2"](#shim-v2-architecture) compatible runtime.
|
||||
|
||||
## Shim v2 architecture
|
||||
|
||||
The Kata Containers runtime is shim v2 ("shimv2") compatible. This
|
||||
section explains what this means.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> For a comparison with the Kata 1.x architecture, see
|
||||
> [the architectural history document](history.md).
|
||||
|
||||
The
|
||||
[containerd runtime shimv2 architecture](https://github.com/containerd/containerd/tree/main/runtime/v2)
|
||||
or _shim API_ architecture resolves the issues with the old
|
||||
architecture by defining a set of shimv2 APIs that a compatible
|
||||
runtime implementation must supply. Rather than calling the runtime
|
||||
binary multiple times for each new container, the shimv2 architecture
|
||||
runs a single instance of the runtime binary (for any number of
|
||||
containers). This improves performance and resolves the state handling
|
||||
issue.
|
||||
|
||||
The shimv2 API is similar to the
|
||||
[OCI runtime](https://github.com/opencontainers/runtime-spec)
|
||||
API in terms of the way the container lifecycle is split into
|
||||
different verbs. Rather than calling the runtime multiple times, the
|
||||
container manager creates a socket and passes it to the shimv2
|
||||
runtime. The socket is a bi-directional communication channel that
|
||||
uses a gRPC based protocol to allow the container manager to send API
|
||||
calls to the runtime, which returns the result to the container
|
||||
manager using the same channel.
|
||||
|
||||
The shimv2 architecture allows running several containers per VM to
|
||||
support container engines that require multiple containers running
|
||||
inside a pod.
|
||||
|
||||
With the new architecture [Kubernetes](kubernetes.md) can
|
||||
launch both Pod and OCI compatible containers with a single
|
||||
[runtime](#runtime) shim per Pod, rather than `2N+1` shims. No stand
|
||||
alone `kata-proxy` process is required, even if VSOCK is not
|
||||
available.
|
||||
|
||||
## Workload
|
||||
|
||||
The workload is the command the user requested to run in the
|
||||
container and is specified in the [OCI bundle](background.md#oci-bundle)'s
|
||||
configuration file.
|
||||
|
||||
In our [example](example-command.md), the workload is the `sh(1)` command.
|
||||
|
||||
### Workload root filesystem
|
||||
|
||||
For details of how the [runtime](#runtime) makes the
|
||||
[container image](background.md#container-image) chosen by the user available to
|
||||
the workload process, see the
|
||||
[Container creation](#container-creation) and [storage](#storage) sections.
|
||||
|
||||
Note that the workload is isolated from the [guest VM](#environments) environment by its
|
||||
surrounding [container environment](#environments). The guest VM
|
||||
environment where the container runs in is also isolated from the _outer_
|
||||
[host environment](#environments) where the container manager runs.
|
||||
|
||||
## System overview
|
||||
|
||||
### Environments
|
||||
|
||||
The following terminology is used to describe the different or
|
||||
environments (or contexts) various processes run in. It is necessary
|
||||
to study this table closely to make sense of what follows:
|
||||
|
||||
| Type | Name | Virtualized | Containerized | rootfs | Rootfs device type | Mount type | Description |
|
||||
|-|-|-|-|-|-|-|-|
|
||||
| Host | Host | no `[1]` | no | Host specific | Host specific | Host specific | The environment provided by a standard, physical non virtualized system. |
|
||||
| VM root | Guest VM | yes | no | rootfs inside the [guest image](guest-assets.md#guest-image) | Hypervisor specific `[2]` | `ext4` | The first (or top) level VM environment created on a host system. |
|
||||
| VM container root | Container | yes | yes | rootfs type requested by user ([`ubuntu` in the example](example-command.md)) | `kataShared` | [virtio FS](storage.md#virtio-fs) | The first (or top) level container environment created inside the VM. Based on the [OCI bundle](background.md#oci-bundle). |
|
||||
|
||||
**Key:**
|
||||
|
||||
- `[1]`: For simplicity, this document assumes the host environment
|
||||
runs on physical hardware.
|
||||
|
||||
- `[2]`: See the [DAX](#dax) section.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The word "root" is used to mean _top level_ here in a similar
|
||||
> manner to the term [rootfs](background.md#root-filesystem).
|
||||
>
|
||||
> - The term "first level" prefix used above is important since it implies
|
||||
> that it is possible to create multi level systems. However, they do
|
||||
> not form part of a standard Kata Containers environment so will not
|
||||
> be considered in this document.
|
||||
|
||||
The reasons for containerizing the [workload](#workload) inside the VM
|
||||
are:
|
||||
|
||||
- Isolates the workload entirely from the VM environment.
|
||||
- Provides better isolation between containers in a [pod](kubernetes.md).
|
||||
- Allows the workload to be managed and monitored through its cgroup
|
||||
confinement.
|
||||
|
||||
### Container creation
|
||||
|
||||
The steps below show at a high level how a Kata Containers container is
|
||||
created using the containerd container manager:
|
||||
|
||||
1. The user requests the creation of a container by running a command
|
||||
like the [example command](example-command.md).
|
||||
1. The container manager daemon runs a single instance of the Kata
|
||||
[runtime](#runtime).
|
||||
1. The Kata runtime loads its [configuration file](#configuration).
|
||||
1. The container manager calls a set of shimv2 API functions on the runtime.
|
||||
1. The Kata runtime launches the configured [hypervisor](#hypervisor).
|
||||
1. The hypervisor creates and starts (_boots_) a VM using the
|
||||
[guest assets](guest-assets.md#guest-assets):
|
||||
|
||||
- The hypervisor [DAX](#dax) shares the
|
||||
[guest image](guest-assets.md#guest-image)
|
||||
into the VM to become the VM [rootfs](background.md#root-filesystem) (mounted on a `/dev/pmem*` device),
|
||||
which is known as the [VM root environment](#environments).
|
||||
- The hypervisor mounts the [OCI bundle](background.md#oci-bundle), using [virtio FS](storage.md#virtio-fs),
|
||||
into a container specific directory inside the VM's rootfs.
|
||||
|
||||
This container specific directory will become the
|
||||
[container rootfs](#environments), known as the
|
||||
[container environment](#environments).
|
||||
|
||||
1. The [agent](#agent) is started as part of the VM boot.
|
||||
|
||||
1. The runtime calls the agent's `CreateSandbox` API to request the
|
||||
agent create a container:
|
||||
|
||||
1. The agent creates a [container environment](#environments)
|
||||
in the container specific directory that contains the [container rootfs](#environments).
|
||||
|
||||
The container environment hosts the [workload](#workload) in the
|
||||
[container rootfs](#environments) directory.
|
||||
|
||||
1. The agent spawns the workload inside the container environment.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The container environment created by the agent is equivalent to
|
||||
> a container environment created by the
|
||||
> [`runc`](https://github.com/opencontainers/runc) OCI runtime;
|
||||
> Linux cgroups and namespaces are created inside the VM by the
|
||||
> [guest kernel](guest-assets.md#guest-kernel) to isolate the
|
||||
> workload from the VM environment the container is created in.
|
||||
> See the [Environments](#environments) section for an
|
||||
> explanation of why this is done.
|
||||
>
|
||||
> - See the [guest image](guest-assets.md#guest-image) section for
|
||||
> details of exactly how the agent is started.
|
||||
|
||||
1. The container manager returns control of the container to the
|
||||
user running the `ctr` command.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> At this point, the container is running and:
|
||||
>
|
||||
> - The [workload](#workload) process ([`sh(1)` in the example](example-command.md))
|
||||
> is running in the [container environment](#environments).
|
||||
> - The user is now able to interact with the workload
|
||||
> (using the [`ctr` command in the example](example-command.md)).
|
||||
> - The [agent](#agent), running inside the VM is monitoring the
|
||||
> [workload](#workload) process.
|
||||
> - The [runtime](#runtime) is waiting for the agent's `WaitProcess` API
|
||||
> call to complete.
|
||||
|
||||
Further details of these steps are provided in the sections below.
|
||||
|
||||
### Container shutdown
|
||||
|
||||
There are two possible ways for the container environment to be
|
||||
terminated:
|
||||
|
||||
- When the [workload](#workload) exits.
|
||||
|
||||
This is the standard, or _graceful_ shutdown method.
|
||||
|
||||
- When the container manager forces the container to be deleted.
|
||||
|
||||
#### Workload exit
|
||||
|
||||
The [agent](#agent) will detect when the [workload](#workload) process
|
||||
exits, capture its exit status (see `wait(2)`) and return that value
|
||||
to the [runtime](#runtime) by specifying it as the response to the
|
||||
`WaitProcess` agent API call made by the [runtime](#runtime).
|
||||
|
||||
The runtime then passes the value back to the container manager by the
|
||||
`Wait` [shimv2 API](#shim-v2-architecture) call.
|
||||
|
||||
Once the workload has fully exited, the VM is no longer needed and the
|
||||
runtime cleans up the environment (which includes terminating the
|
||||
[hypervisor](#hypervisor) process).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> When [agent tracing is enabled](../../tracing.md#agent-shutdown-behaviour),
|
||||
> the shutdown behaviour is different.
|
||||
|
||||
#### Container manager requested shutdown
|
||||
|
||||
If the container manager requests the container be deleted, the
|
||||
[runtime](#runtime) will signal the agent by sending it a
|
||||
`DestroySandbox` [ttRPC API](../../../src/libs/protocols/protos/agent.proto) request.
|
||||
|
||||
## Guest assets
|
||||
|
||||
The guest assets comprise a guest image and a guest kernel that are
|
||||
used by the [hypervisor](#hypervisor).
|
||||
|
||||
See the [guest assets](guest-assets.md) document for further
|
||||
information.
|
||||
|
||||
## Hypervisor
|
||||
|
||||
The [hypervisor](../../hypervisors.md) specified in the
|
||||
[configuration file](#configuration) creates a VM to host the
|
||||
[agent](#agent) and the [workload](#workload) inside the
|
||||
[container environment](#environments).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The hypervisor process runs inside an environment slightly different
|
||||
> to the host environment:
|
||||
>
|
||||
> - It is run in a different cgroup environment to the host.
|
||||
> - It is given a separate network namespace from the host.
|
||||
> - If the [OCI configuration specifies a SELinux label](https://github.com/opencontainers/runtime-spec/blob/main/config.md#linux-process),
|
||||
> the hypervisor process will run with that label (*not* the workload running inside the hypervisor's VM).
|
||||
|
||||
## Agent
|
||||
|
||||
The Kata Containers agent ([`kata-agent`](../../../src/agent)), written
|
||||
in the [Rust programming language](https://www.rust-lang.org), is a
|
||||
long running process that runs inside the VM. It acts as the
|
||||
supervisor for managing the containers and the [workload](#workload)
|
||||
running within those containers. Only a single agent process is run
|
||||
for each VM created.
|
||||
|
||||
### Agent communications protocol
|
||||
|
||||
The agent communicates with the other Kata components (primarily the
|
||||
[runtime](#runtime)) using a
|
||||
[`ttRPC`](https://github.com/containerd/ttrpc-rust) based
|
||||
[protocol](../../../src/libs/protocols/protos).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> If you wish to learn more about this protocol, a practical way to do
|
||||
> so is to experiment with the
|
||||
> [agent control tool](#agent-control-tool) on a test system.
|
||||
> This tool is for test and development purposes only and can send
|
||||
> arbitrary ttRPC agent API commands to the [agent](#agent).
|
||||
|
||||
## Runtime
|
||||
|
||||
The Kata Containers runtime (the [`containerd-shim-kata-v2`](../../../src/runtime/cmd/containerd-shim-kata-v2
|
||||
) binary) is a [shimv2](#shim-v2-architecture) compatible runtime.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The Kata Containers runtime is sometimes referred to as the Kata
|
||||
> _shim_. Both terms are correct since the `containerd-shim-kata-v2`
|
||||
> is a container runtime, and that runtime implements the containerd
|
||||
> shim v2 API.
|
||||
|
||||
The runtime makes heavy use of the [`virtcontainers`
|
||||
package](../../../src/runtime/virtcontainers), which provides a generic,
|
||||
runtime-specification agnostic, hardware-virtualized containers
|
||||
library.
|
||||
|
||||
The runtime is responsible for starting the [hypervisor](#hypervisor)
|
||||
and it's VM, and communicating with the [agent](#agent) using a
|
||||
[ttRPC based protocol](#agent-communications-protocol) over a VSOCK
|
||||
socket that provides a communications link between the VM and the
|
||||
host.
|
||||
|
||||
This protocol allows the runtime to send container management commands
|
||||
to the agent. The protocol is also used to carry the standard I/O
|
||||
streams (`stdout`, `stderr`, `stdin`) between the containers and
|
||||
container managers (such as CRI-O or containerd).
|
||||
|
||||
## Utility program
|
||||
|
||||
The `kata-runtime` binary is a utility program that provides
|
||||
administrative commands to manipulate and query a Kata Containers
|
||||
installation.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> In Kata 1.x, this program also acted as the main
|
||||
> [runtime](#runtime), but this is no longer required due to the
|
||||
> improved shimv2 architecture.
|
||||
|
||||
### exec command
|
||||
|
||||
The `exec` command allows an administrator or developer to enter the
|
||||
[VM root environment](#environments) which is not accessible by the container
|
||||
[workload](#workload).
|
||||
|
||||
See [the developer guide](../../Developer-Guide.md#connect-to-debug-console) for further details.
|
||||
|
||||
### Configuration
|
||||
|
||||
See the [configuration file details](../../../src/runtime/README.md#configuration).
|
||||
|
||||
The configuration file is also used to enable runtime [debug output](../../Developer-Guide.md#enable-full-debug).
|
||||
|
||||
## Process overview
|
||||
|
||||
The table below shows an example of the main processes running in the
|
||||
different [environments](#environments) when a Kata Container is
|
||||
created with containerd using our [example command](example-command.md):
|
||||
|
||||
| Description | Host | VM root environment | VM container environment |
|
||||
|-|-|-|-|
|
||||
| Container manager | `containerd` | |
|
||||
| Kata Containers | [runtime](#runtime), [`virtiofsd`](storage.md#virtio-fs), [hypervisor](#hypervisor) | [agent](#agent) |
|
||||
| User [workload](#workload) | | | [`ubuntu sh`](example-command.md) |
|
||||
|
||||
## Networking
|
||||
|
||||
See the [networking document](networking.md).
|
||||
|
||||
## Storage
|
||||
|
||||
See the [storage document](storage.md).
|
||||
|
||||
## Kubernetes support
|
||||
|
||||
See the [Kubernetes document](kubernetes.md).
|
||||
|
||||
#### OCI annotations
|
||||
|
||||
In order for the Kata Containers [runtime](#runtime) (or any VM based OCI compatible
|
||||
runtime) to be able to understand if it needs to create a full VM or if it
|
||||
has to create a new container inside an existing pod's VM, CRI-O adds
|
||||
specific annotations to the OCI configuration file (`config.json`) which is passed to
|
||||
the OCI compatible runtime.
|
||||
|
||||
Before calling its runtime, CRI-O will always add a `io.kubernetes.cri-o.ContainerType`
|
||||
annotation to the `config.json` configuration file it produces from the Kubelet CRI
|
||||
request. The `io.kubernetes.cri-o.ContainerType` annotation can either be set to `sandbox`
|
||||
or `container`. Kata Containers will then use this annotation to decide if it needs to
|
||||
respectively create a virtual machine or a container inside a virtual machine associated
|
||||
with a Kubernetes pod:
|
||||
|
||||
| Annotation value | Kata VM created? | Kata container created? |
|
||||
|-|-|-|
|
||||
| `sandbox` | yes | yes (inside new VM) |
|
||||
| `container`| no | yes (in existing VM) |
|
||||
|
||||
#### Mixing VM based and namespace based runtimes
|
||||
|
||||
> **Note:** Since Kubernetes 1.12, the [`Kubernetes RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/)
|
||||
> has been supported and the user can specify runtime without the non-standardized annotations.
|
||||
|
||||
With `RuntimeClass`, users can define Kata Containers as a
|
||||
`RuntimeClass` and then explicitly specify that a pod must be created
|
||||
as a Kata Containers pod. For details, please refer to [How to use
|
||||
Kata Containers and containerd](../../../docs/how-to/containerd-kata.md).
|
||||
|
||||
## Tracing
|
||||
|
||||
The [tracing document](../../tracing.md) provides details on the tracing
|
||||
architecture.
|
||||
|
||||
# Appendices
|
||||
|
||||
## DAX
|
||||
|
||||
Kata Containers utilizes the Linux kernel DAX
|
||||
[(Direct Access filesystem)](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.rst?h=v5.14)
|
||||
feature to efficiently map the [guest image](guest-assets.md#guest-image) in the
|
||||
[host environment](#environments) into the
|
||||
[guest VM environment](#environments) to become the VM's
|
||||
[rootfs](background.md#root-filesystem).
|
||||
|
||||
If the [configured](#configuration) [hypervisor](#hypervisor) is set
|
||||
to either QEMU or Cloud Hypervisor, DAX is used with the feature shown
|
||||
in the table below:
|
||||
|
||||
| Hypervisor | Feature used | rootfs device type |
|
||||
|-|-|-|
|
||||
| Cloud Hypervisor (CH) | `dax` `FsConfig` configuration option | PMEM (emulated Persistent Memory device) |
|
||||
| QEMU | NVDIMM memory device with a memory file backend | NVDIMM (emulated Non-Volatile Dual In-line Memory Module device) |
|
||||
|
||||
The features in the table above are equivalent in that they provide a memory-mapped
|
||||
virtual device which is used to DAX map the VM's
|
||||
[rootfs](background.md#root-filesystem) into the [VM guest](#environments) memory
|
||||
address space.
|
||||
|
||||
The VM is then booted, specifying the `root=` kernel parameter to make
|
||||
the [guest kernel](guest-assets.md#guest-kernel) use the appropriate emulated device
|
||||
as its rootfs.
|
||||
|
||||
### DAX advantages
|
||||
|
||||
Mapping files using [DAX](#dax) provides a number of benefits over
|
||||
more traditional VM file and device mapping mechanisms:
|
||||
|
||||
- Mapping as a direct access device allows the guest to directly
|
||||
access the host memory pages (such as via Execute In Place (XIP)),
|
||||
bypassing the [guest kernel](guest-assets.md#guest-kernel)'s page cache. This
|
||||
zero copy provides both time and space optimizations.
|
||||
|
||||
- Mapping as a direct access device inside the VM allows pages from the
|
||||
host to be demand loaded using page faults, rather than having to make requests
|
||||
via a virtualized device (causing expensive VM exits/hypercalls), thus providing
|
||||
a speed optimization.
|
||||
|
||||
- Utilizing `mmap(2)`'s `MAP_SHARED` shared memory option on the host
|
||||
allows the host to efficiently share pages.
|
||||
|
||||

|
||||
|
||||
For further details of the use of NVDIMM with QEMU, see the [QEMU
|
||||
project documentation](https://www.qemu.org).
|
||||
|
||||
## Agent control tool
|
||||
|
||||
The [agent control tool](../../../src/tools/agent-ctl) is a test and
|
||||
development tool that can be used to learn more about a Kata
|
||||
Containers system.
|
||||
|
||||
## Terminology
|
||||
|
||||
See the [project glossary](../../../Glossary.md).
|
||||
@@ -1,81 +0,0 @@
|
||||
# Kata Containers architecture background knowledge
|
||||
|
||||
The following sections explain some of the background concepts
|
||||
required to understand the [architecture document](README.md).
|
||||
|
||||
## Root filesystem
|
||||
|
||||
This document uses the term _rootfs_ to refer to a root filesystem
|
||||
which is mounted as the top-level directory ("`/`") and often referred
|
||||
to as _slash_.
|
||||
|
||||
It is important to understand this term since the overall system uses
|
||||
multiple different rootfs's (as explained in the
|
||||
[Environments](README.md#environments) section.
|
||||
|
||||
## Container image
|
||||
|
||||
In the [example command](example-command.md) the user has specified the
|
||||
type of container they wish to run via the container image name:
|
||||
`ubuntu`. This image name corresponds to a _container image_ that can
|
||||
be used to create a container with an Ubuntu Linux environment. Hence,
|
||||
in our [example](example-command.md), the `sh(1)` command will be run
|
||||
inside a container which has an Ubuntu rootfs.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The term _container image_ is confusing since the image in question
|
||||
> is **not** a container: it is simply a set of files (_an image_)
|
||||
> that can be used to _create_ a container. The term _container
|
||||
> template_ would be more accurate but the term _container image_ is
|
||||
> commonly used so this document uses the standard term.
|
||||
|
||||
For the purposes of this document, the most important part of the
|
||||
[example command line](example-command.md) is the container image the
|
||||
user has requested. Normally, the container manager will _pull_
|
||||
(download) a container image from a remote site and store a copy
|
||||
locally. This local container image is used by the container manager
|
||||
to create an [OCI bundle](#oci-bundle) which will form the environment
|
||||
the container will run in. After creating the OCI bundle, the
|
||||
container manager launches a [runtime](README.md#runtime) which will create the
|
||||
container using the provided OCI bundle.
|
||||
|
||||
## OCI bundle
|
||||
|
||||
To understand what follows, it is important to know at a high level
|
||||
how an OCI ([Open Containers Initiative](https://opencontainers.org)) compatible container is created.
|
||||
|
||||
An OCI compatible container is created by taking a
|
||||
[container image](#container-image) and converting the embedded rootfs
|
||||
into an
|
||||
[OCI rootfs bundle](https://github.com/opencontainers/runtime-spec/blob/main/bundle.md),
|
||||
or more simply, an _OCI bundle_.
|
||||
|
||||
An OCI bundle is a `tar(1)` archive normally created by a container
|
||||
manager which is passed to an OCI [runtime](README.md#runtime) which converts
|
||||
it into a full container rootfs. The bundle contains two assets:
|
||||
|
||||
- A container image [rootfs](#root-filesystem)
|
||||
|
||||
This is simply a directory of files that will be used to represent
|
||||
the rootfs for the container.
|
||||
|
||||
For the [example command](example-command.md), the directory will
|
||||
contain the files necessary to create a minimal Ubuntu root
|
||||
filesystem.
|
||||
|
||||
- An [OCI configuration file](https://github.com/opencontainers/runtime-spec/blob/main/config.md)
|
||||
|
||||
This is a JSON file called `config.json`.
|
||||
|
||||
The container manager will create this file so that:
|
||||
|
||||
- The `root.path` value is set to the full path of the specified
|
||||
container rootfs.
|
||||
|
||||
In [the example](example-command.md) this value will be `ubuntu`.
|
||||
|
||||
- The `process.args` array specifies the list of commands the user
|
||||
wishes to run. This is known as the [workload](README.md#workload).
|
||||
|
||||
In [the example](example-command.md) the workload is `sh(1)`.
|
||||
@@ -1,30 +0,0 @@
|
||||
# Example command
|
||||
|
||||
The following containerd command creates a container. It is referred
|
||||
to throughout the architecture document to help explain various points:
|
||||
|
||||
```bash
|
||||
$ sudo ctr run --runtime "io.containerd.kata.v2" --rm -t "quay.io/libpod/ubuntu:latest" foo sh
|
||||
```
|
||||
|
||||
This command requests that containerd:
|
||||
|
||||
- Create a container (`ctr run`).
|
||||
- Use the Kata [shimv2](README.md#shim-v2-architecture) runtime (`--runtime "io.containerd.kata.v2"`).
|
||||
- Delete the container when it [exits](README.md#workload-exit) (`--rm`).
|
||||
- Attach the container to the user's terminal (`-t`).
|
||||
- Use the Ubuntu Linux [container image](background.md#container-image)
|
||||
to create the container [rootfs](background.md#root-filesystem) that will become
|
||||
the [container environment](README.md#environments)
|
||||
(`quay.io/libpod/ubuntu:latest`).
|
||||
- Create the container with the name "`foo`".
|
||||
- Run the `sh(1)` command in the Ubuntu rootfs based container
|
||||
environment.
|
||||
|
||||
The command specified here is referred to as the [workload](README.md#workload).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> For the purposes of this document and to keep explanations
|
||||
> simpler, we assume the user is running this command in the
|
||||
> [host environment](README.md#environments).
|
||||
@@ -1,152 +0,0 @@
|
||||
# Guest assets
|
||||
|
||||
Kata Containers creates a VM in which to run one or more containers.
|
||||
It does this by launching a [hypervisor](README.md#hypervisor) to
|
||||
create the VM. The hypervisor needs two assets for this task: a Linux
|
||||
kernel and a small root filesystem image to boot the VM.
|
||||
|
||||
## Guest kernel
|
||||
|
||||
The [guest kernel](../../../tools/packaging/kernel)
|
||||
is passed to the hypervisor and used to boot the VM.
|
||||
The default kernel provided in Kata Containers is highly optimized for
|
||||
kernel boot time and minimal memory footprint, providing only those
|
||||
services required by a container workload. It is based on the latest
|
||||
Linux LTS (Long Term Support) [kernel](https://www.kernel.org).
|
||||
|
||||
## Guest image
|
||||
|
||||
The hypervisor uses an image file which provides a minimal root
|
||||
filesystem used by the guest kernel to boot the VM and host the Kata
|
||||
Container. Kata Containers supports both initrd and rootfs based
|
||||
minimal guest images. The [default packages](../../install/) provide both
|
||||
an image and an initrd, both of which are created using the
|
||||
[`osbuilder`](../../../tools/osbuilder) tool.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - Although initrd and rootfs based images are supported, not all
|
||||
> [hypervisors](README.md#hypervisor) support both types of image.
|
||||
>
|
||||
> - The guest image is *unrelated* to the image used in a container
|
||||
> workload.
|
||||
>
|
||||
> For example, if a user creates a container that runs a shell in a
|
||||
> BusyBox image, they will run that shell in a BusyBox environment.
|
||||
> However, the guest image running inside the VM that is used to
|
||||
> *host* that BusyBox image could be running Clear Linux, Ubuntu,
|
||||
> Fedora or any other distribution potentially.
|
||||
>
|
||||
> The `osbuilder` tool provides
|
||||
> [configurations for various common Linux distributions](../../../tools/osbuilder/rootfs-builder)
|
||||
> which can be built into either initrd or rootfs guest images.
|
||||
>
|
||||
> - If you are using a [packaged version of Kata
|
||||
> Containers](../../install), you can see image details by running the
|
||||
> [`kata-collect-data.sh`](../../../src/runtime/data/kata-collect-data.sh.in)
|
||||
> script as `root` and looking at the "Image details" section of the
|
||||
> output.
|
||||
|
||||
#### Root filesystem image
|
||||
|
||||
The default packaged rootfs image, sometimes referred to as the _mini
|
||||
O/S_, is a highly optimized container bootstrap system.
|
||||
|
||||
If this image type is [configured](README.md#configuration), when the
|
||||
user runs the [example command](example-command.md):
|
||||
|
||||
- The [runtime](README.md#runtime) will launch the configured [hypervisor](README.md#hypervisor).
|
||||
- The hypervisor will boot the mini-OS image using the [guest kernel](#guest-kernel).
|
||||
- The kernel will start the init daemon as PID 1 (`systemd`) inside the VM root environment.
|
||||
- `systemd`, running inside the mini-OS context, will launch the [agent](README.md#agent)
|
||||
in the root context of the VM.
|
||||
- The agent will create a new container environment, setting its root
|
||||
filesystem to that requested by the user (Ubuntu in [the example](example-command.md)).
|
||||
- The agent will then execute the command (`sh(1)` in [the example](example-command.md))
|
||||
inside the new container.
|
||||
|
||||
The table below summarises the default mini O/S showing the
|
||||
environments that are created, the services running in those
|
||||
environments (for all platforms) and the root filesystem used by
|
||||
each service:
|
||||
|
||||
| Process | Environment | systemd service? | rootfs | User accessible | Notes |
|
||||
|-|-|-|-|-|-|
|
||||
| systemd | VM root | n/a | [VM guest image](#guest-image)| [debug console][debug-console] | The init daemon, running as PID 1 |
|
||||
| [Agent](README.md#agent) | VM root | yes | [VM guest image](#guest-image)| [debug console][debug-console] | Runs as a systemd service |
|
||||
| `chronyd` | VM root | yes | [VM guest image](#guest-image)| [debug console][debug-console] | Used to synchronise the time with the host |
|
||||
| container workload (`sh(1)` in [the example](example-command.md)) | VM container | no | User specified (Ubuntu in [the example](example-command.md)) | [exec command](README.md#exec-command) | Managed by the agent |
|
||||
|
||||
See also the [process overview](README.md#process-overview).
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The "User accessible" column shows how an administrator can access
|
||||
> the environment.
|
||||
>
|
||||
> - The container workload is running inside a full container
|
||||
> environment which itself is running within a VM environment.
|
||||
>
|
||||
> - See the [configuration files for the `osbuilder` tool](../../../tools/osbuilder/rootfs-builder)
|
||||
> for details of the default distribution for platforms other than
|
||||
> Intel x86_64.
|
||||
|
||||
#### Initrd image
|
||||
|
||||
The initrd image is a compressed `cpio(1)` archive, created from a
|
||||
rootfs which is loaded into memory and used as part of the Linux
|
||||
startup process. During startup, the kernel unpacks it into a special
|
||||
instance of a `tmpfs` mount that becomes the initial root filesystem.
|
||||
|
||||
If this image type is [configured](README.md#configuration), when the user runs
|
||||
the [example command](example-command.md):
|
||||
|
||||
- The [runtime](README.md#runtime) will launch the configured [hypervisor](README.md#hypervisor).
|
||||
- The hypervisor will boot the mini-OS image using the [guest kernel](#guest-kernel).
|
||||
- The kernel will start the init daemon as PID 1 (the
|
||||
[agent](README.md#agent))
|
||||
inside the VM root environment.
|
||||
- The [agent](README.md#agent) will create a new container environment, setting its root
|
||||
filesystem to that requested by the user (`ubuntu` in
|
||||
[the example](example-command.md)).
|
||||
- The agent will then execute the command (`sh(1)` in [the example](example-command.md))
|
||||
inside the new container.
|
||||
|
||||
The table below summarises the default mini O/S showing the environments that are created,
|
||||
the processes running in those environments (for all platforms) and
|
||||
the root filesystem used by each service:
|
||||
|
||||
| Process | Environment | rootfs | User accessible | Notes |
|
||||
|-|-|-|-|-|
|
||||
| [Agent](README.md#agent) | VM root | [VM guest image](#guest-image) | [debug console][debug-console] | Runs as the init daemon (PID 1) |
|
||||
| container workload | VM container | User specified (Ubuntu in this example) | [exec command](README.md#exec-command) | Managed by the agent |
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The "User accessible" column shows how an administrator can access
|
||||
> the environment.
|
||||
>
|
||||
> - It is possible to use a standard init daemon such as systemd with
|
||||
> an initrd image if this is desirable.
|
||||
|
||||
See also the [process overview](README.md#process-overview).
|
||||
|
||||
#### Image summary
|
||||
|
||||
| Image type | Default distro | Init daemon | Reason | Notes |
|
||||
|-|-|-|-|-|
|
||||
| [image](background.md#root-filesystem-image) | [Clear Linux](https://clearlinux.org) (for x86_64 systems)| systemd | Minimal and highly optimized | systemd offers flexibility |
|
||||
| [initrd](#initrd-image) | [Alpine Linux](https://alpinelinux.org) | Kata [agent](README.md#agent) (as no systemd support) | Security hardened and tiny C library |
|
||||
|
||||
See also:
|
||||
|
||||
- The [osbuilder](../../../tools/osbuilder) tool
|
||||
|
||||
This is used to build all default image types.
|
||||
|
||||
- The [versions database](../../../versions.yaml)
|
||||
|
||||
The `default-image-name` and `default-initrd-name` options specify
|
||||
the default distributions for each image type.
|
||||
|
||||
[debug-console]: ../../Developer-Guide.md#connect-to-debug-console
|
||||
@@ -1,41 +0,0 @@
|
||||
# History
|
||||
|
||||
## Kata 1.x architecture
|
||||
|
||||
In the old [Kata 1.x architecture](https://github.com/kata-containers/documentation/blob/master/design/architecture.md),
|
||||
the Kata [runtime](README.md#runtime) was an executable called `kata-runtime`.
|
||||
The container manager called this executable multiple times when
|
||||
creating each container. Each time the runtime was called a different
|
||||
OCI command-line verb was provided. This architecture was simple, but
|
||||
not well suited to creating VM based containers due to the issue of
|
||||
handling state between calls. Additionally, the architecture suffered
|
||||
from performance issues related to continually having to spawn new
|
||||
instances of the runtime binary, and
|
||||
[Kata shim](https://github.com/kata-containers/shim) and
|
||||
[Kata proxy](https://github.com/kata-containers/proxy) processes for systems
|
||||
that did not provide VSOCK.
|
||||
|
||||
## Kata 2.x architecture
|
||||
|
||||
See the ["shimv2"](README.md#shim-v2-architecture) section of the
|
||||
architecture document.
|
||||
|
||||
## Architectural comparison
|
||||
|
||||
| Kata version | Kata Runtime process calls | Kata shim processes | Kata proxy processes (if no VSOCK) |
|
||||
|-|-|-|-|
|
||||
| 1.x | multiple per container | 1 per container connection | 1 |
|
||||
| 2.x | 1 per VM (hosting any number of containers) | 0 | 0 |
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - A single VM can host one or more containers.
|
||||
>
|
||||
> - The "Kata shim processes" column refers to the old
|
||||
> [Kata shim](https://github.com/kata-containers/shim) (`kata-shim` binary),
|
||||
> *not* the new shimv2 runtime instance (`containerd-shim-kata-v2` binary).
|
||||
|
||||
The diagram below shows how the original architecture was simplified
|
||||
with the advent of shimv2.
|
||||
|
||||

|
||||
@@ -1,35 +0,0 @@
|
||||
# Kubernetes support
|
||||
|
||||
[Kubernetes](https://github.com/kubernetes/kubernetes/), or K8s, is a popular open source
|
||||
container orchestration engine. In Kubernetes, a set of containers sharing resources
|
||||
such as networking, storage, mount, PID, etc. is called a
|
||||
[pod](https://kubernetes.io/docs/user-guide/pods/).
|
||||
|
||||
A node can have multiple pods, but at a minimum, a node within a Kubernetes cluster
|
||||
only needs to run a container runtime and a container agent (called a
|
||||
[Kubelet](https://kubernetes.io/docs/admin/kubelet/)).
|
||||
|
||||
Kata Containers represents a Kubelet pod as a VM.
|
||||
|
||||
A Kubernetes cluster runs a control plane where a scheduler (typically
|
||||
running on a dedicated master node) calls into a compute Kubelet. This
|
||||
Kubelet instance is responsible for managing the lifecycle of pods
|
||||
within the nodes and eventually relies on a container runtime to
|
||||
handle execution. The Kubelet architecture decouples lifecycle
|
||||
management from container execution through a dedicated gRPC based
|
||||
[Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/container-runtime-interface-v1.md).
|
||||
|
||||
In other words, a Kubelet is a CRI client and expects a CRI
|
||||
implementation to handle the server side of the interface.
|
||||
[CRI-O](https://github.com/kubernetes-incubator/cri-o) and
|
||||
[containerd](https://github.com/containerd/containerd/) are CRI
|
||||
implementations that rely on
|
||||
[OCI](https://github.com/opencontainers/runtime-spec) compatible
|
||||
runtimes for managing container instances.
|
||||
|
||||
Kata Containers is an officially supported CRI-O and containerd
|
||||
runtime. Refer to the following guides on how to set up Kata
|
||||
Containers with Kubernetes:
|
||||
|
||||
- [How to use Kata Containers and containerd](../../how-to/containerd-kata.md)
|
||||
- [Run Kata Containers with Kubernetes](../../how-to/run-kata-with-k8s.md)
|
||||
@@ -1,49 +0,0 @@
|
||||
# Networking
|
||||
|
||||
Containers typically live in their own, possibly shared, networking namespace.
|
||||
At some point in a container lifecycle, container engines will set up that namespace
|
||||
to add the container to a network which is isolated from the host network.
|
||||
|
||||
In order to setup the network for a container, container engines call into a
|
||||
networking plugin. The network plugin will usually create a virtual
|
||||
ethernet (`veth`) pair adding one end of the `veth` pair into the container
|
||||
networking namespace, while the other end of the `veth` pair is added to the
|
||||
host networking namespace.
|
||||
|
||||
This is a very namespace-centric approach as many hypervisors or VM
|
||||
Managers (VMMs) such as `virt-manager` cannot handle `veth`
|
||||
interfaces. Typically, [`TAP`](https://www.kernel.org/doc/Documentation/networking/tuntap.txt)
|
||||
interfaces are created for VM connectivity.
|
||||
|
||||
To overcome incompatibility between typical container engines expectations
|
||||
and virtual machines, Kata Containers networking transparently connects `veth`
|
||||
interfaces with `TAP` ones using [Traffic Control](https://man7.org/linux/man-pages/man8/tc.8.html):
|
||||
|
||||

|
||||
|
||||
With a TC filter rules in place, a redirection is created between the container network
|
||||
and the virtual machine. As an example, the network plugin may place a device,
|
||||
`eth0`, in the container's network namespace, which is one end of a VETH device.
|
||||
Kata Containers will create a tap device for the VM, `tap0_kata`,
|
||||
and setup a TC redirection filter to redirect traffic from `eth0`'s ingress to `tap0_kata`'s egress,
|
||||
and a second TC filter to redirect traffic from `tap0_kata`'s ingress to `eth0`'s egress.
|
||||
|
||||
Kata Containers maintains support for MACVTAP, which was an earlier implementation used in Kata.
|
||||
With this method, Kata created a MACVTAP device to connect directly to the `eth0` device.
|
||||
TC-filter is the default because it allows for simpler configuration, better CNI plugin
|
||||
compatibility, and performance on par with MACVTAP.
|
||||
|
||||
Kata Containers has deprecated support for bridge due to lacking performance relative to TC-filter and MACVTAP.
|
||||
|
||||
Kata Containers supports both
|
||||
[CNM](https://github.com/docker/libnetwork/blob/master/docs/design.md#the-container-network-model)
|
||||
and [CNI](https://github.com/containernetworking/cni) for networking management.
|
||||
|
||||
## Network Hotplug
|
||||
|
||||
Kata Containers has developed a set of network sub-commands and APIs to add, list and
|
||||
remove a guest network endpoint and to manipulate the guest route table.
|
||||
|
||||
The following diagram illustrates the Kata Containers network hotplug workflow.
|
||||
|
||||

|
||||
@@ -1,44 +0,0 @@
|
||||
# Storage
|
||||
|
||||
## virtio SCSI
|
||||
|
||||
If a block-based graph driver is [configured](README.md#configuration),
|
||||
`virtio-scsi` is used to _share_ the workload image (such as
|
||||
`busybox:latest`) into the container's environment inside the VM.
|
||||
|
||||
## virtio FS
|
||||
|
||||
If a block-based graph driver is _not_ [configured](README.md#configuration), a
|
||||
[`virtio-fs`](https://virtio-fs.gitlab.io) (`VIRTIO`) overlay
|
||||
filesystem mount point is used to _share_ the workload image instead. The
|
||||
[agent](README.md#agent) uses this mount point as the root filesystem for the
|
||||
container processes.
|
||||
|
||||
For virtio-fs, the [runtime](README.md#runtime) starts one `virtiofsd` daemon
|
||||
(that runs in the host context) for each VM created.
|
||||
|
||||
## Devicemapper
|
||||
|
||||
The
|
||||
[devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/master/snapshots/devmapper)
|
||||
is a special case. The `snapshotter` uses dedicated block devices
|
||||
rather than formatted filesystems, and operates at the block level
|
||||
rather than the file level. This knowledge is used to directly use the
|
||||
underlying block device instead of the overlay file system for the
|
||||
container root file system. The block device maps to the top
|
||||
read-write layer for the overlay. This approach gives much better I/O
|
||||
performance compared to using `virtio-fs` to share the container file
|
||||
system.
|
||||
|
||||
#### Hot plug and unplug
|
||||
|
||||
Kata Containers has the ability to hot plug add and hot plug remove
|
||||
block devices. This makes it possible to use block devices for
|
||||
containers started after the VM has been launched.
|
||||
|
||||
Users can check to see if the container uses the `devicemapper` block
|
||||
device as its rootfs by calling `mount(8)` within the container. If
|
||||
the `devicemapper` block device is used, the root filesystem (`/`)
|
||||
will be mounted from `/dev/vda`. Users can disable direct mounting of
|
||||
the underlying block device through the runtime
|
||||
[configuration](README.md#configuration).
|
||||
@@ -1825,8 +1825,12 @@ components:
|
||||
desc: ""
|
||||
- value: grpc.StartContainerRequest
|
||||
desc: ""
|
||||
- value: grpc.StartTracingRequest
|
||||
desc: ""
|
||||
- value: grpc.StatsContainerRequest
|
||||
desc: ""
|
||||
- value: grpc.StopTracingRequest
|
||||
desc: ""
|
||||
- value: grpc.TtyWinResizeRequest
|
||||
desc: ""
|
||||
- value: grpc.UpdateContainerRequest
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
# Kata Containers E2E Flow
|
||||
|
||||
|
||||

|
||||
|
||||
@@ -1,3 +1,18 @@
|
||||
- [Host cgroup management](#host-cgroup-management)
|
||||
- [Introduction](#introduction)
|
||||
- [`SandboxCgroupOnly` enabled](#sandboxcgrouponly-enabled)
|
||||
- [What does Kata do in this configuration?](#what-does-kata-do-in-this-configuration)
|
||||
- [Why create a Kata-cgroup under the parent cgroup?](#why-create-a-kata-cgroup-under-the-parent-cgroup)
|
||||
- [Improvements](#improvements)
|
||||
- [`SandboxCgroupOnly` disabled (default, legacy)](#sandboxcgrouponly-disabled-default-legacy)
|
||||
- [What does this method do?](#what-does-this-method-do)
|
||||
- [Impact](#impact)
|
||||
- [Supported cgroups](#supported-cgroups)
|
||||
- [Cgroups V1](#cgroups-v1)
|
||||
- [Cgroups V2](#cgroups-v2)
|
||||
- [Distro Support](#distro-support)
|
||||
- [Summary](#summary)
|
||||
|
||||
# Host cgroup management
|
||||
|
||||
## Introduction
|
||||
@@ -12,244 +27,187 @@ The OCI [runtime specification][linux-config] provides guidance on where the con
|
||||
> [`cgroupsPath`][cgroupspath]: (string, OPTIONAL) path to the cgroups. It can be used to either control the cgroups
|
||||
> hierarchy for containers or to run a new process in an existing container
|
||||
|
||||
Cgroups are hierarchical, and this can be seen with the following pod example:
|
||||
cgroups are hierarchical, and this can be seen with the following pod example:
|
||||
|
||||
- Pod 1: `cgroupsPath=/kubepods/pod1`
|
||||
- Container 1: `cgroupsPath=/kubepods/pod1/container1`
|
||||
- Container 2: `cgroupsPath=/kubepods/pod1/container2`
|
||||
- Container 1:
|
||||
`cgroupsPath=/kubepods/pod1/container1`
|
||||
- Container 2:
|
||||
`cgroupsPath=/kubepods/pod1/container2`
|
||||
|
||||
- Pod 2: `cgroupsPath=/kubepods/pod2`
|
||||
- Container 1: `cgroupsPath=/kubepods/pod2/container1`
|
||||
- Container 2: `cgroupsPath=/kubepods/pod2/container2`
|
||||
- Container 1:
|
||||
`cgroupsPath=/kubepods/pod2/container2`
|
||||
- Container 2:
|
||||
`cgroupsPath=/kubepods/pod2/container2`
|
||||
|
||||
Depending on the upper-level orchestration layers, the cgroup under which the pod is placed is
|
||||
managed by the orchestrator or not. In the case of Kubernetes, the pod cgroup is created by Kubelet,
|
||||
while the container cgroups are to be handled by the runtime.
|
||||
Kubelet will size the pod cgroup based on the container resource requirements, to which it may add
|
||||
a configured set of [pod resource overheads](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/).
|
||||
Depending on the upper-level orchestrator, the cgroup under which the pod is placed is
|
||||
managed by the orchestrator. In the case of Kubernetes, the pod-cgroup is created by Kubelet,
|
||||
while the container cgroups are to be handled by the runtime. Kubelet will size the pod-cgroup
|
||||
based on the container resource requirements.
|
||||
|
||||
Kata Containers introduces a non-negligible resource overhead for running a sandbox (pod). Typically, the Kata shim,
|
||||
through its underlying VMM invocation, will create many additional threads compared to process based container runtimes:
|
||||
the para-virtualized I/O back-ends, the VMM instance or even the Kata shim process, all of those host processes consume
|
||||
memory and CPU time not directly tied to the container workload, and introduces a sandbox resource overhead.
|
||||
In order for a Kata workload to run without significant performance degradation, its sandbox overhead must be
|
||||
provisioned accordingly. Two scenarios are possible:
|
||||
Kata Containers introduces a non-negligible overhead for running a sandbox (pod). Based on this, two scenarios are possible:
|
||||
1) The upper-layer orchestrator takes the overhead of running a sandbox into account when sizing the pod-cgroup, or
|
||||
2) Kata Containers do not fully constrain the VMM and associated processes, instead placing a subset of them outside of the pod-cgroup.
|
||||
|
||||
1) The upper-layer orchestrator takes the overhead of running a sandbox into account when sizing the pod cgroup.
|
||||
For example, Kubernetes [`PodOverhead`](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/)
|
||||
feature lets the orchestrator add a configured sandbox overhead to the sum of all its containers resources. In
|
||||
that case, the pod sandbox is properly sized and all Kata created processes will run under the pod cgroup
|
||||
defined constraints and limits.
|
||||
2) The upper-layer orchestrator does **not** take the sandbox overhead into account and the pod cgroup is not
|
||||
sized to properly run all Kata created processes. With that scenario, attaching all the Kata processes to the sandbox
|
||||
cgroup may lead to non-negligible workload performance degradations. As a consequence, Kata Containers will move
|
||||
all processes but the vCPU threads into a dedicated overhead cgroup under `/kata_overhead`. The Kata runtime will
|
||||
not apply any constraints or limits to that cgroup, it is up to the infrastructure owner to optionally set it up.
|
||||
Kata Containers provides two options for how cgroups are handled on the host. Selection of these options is done through
|
||||
the `SandboxCgroupOnly` flag within the Kata Containers [configuration](../../src/runtime/README.md#configuration)
|
||||
file.
|
||||
|
||||
Those 2 scenarios are not dynamically detected by the Kata Containers runtime implementation, and thus the
|
||||
infrastructure owner must configure the runtime according to how the upper-layer orchestrator creates and sizes the
|
||||
pod cgroup. That configuration selection is done through the `sandbox_cgroup_only` flag within the Kata Containers
|
||||
[configuration](../../src/runtime/README.md#configuration) file.
|
||||
## `SandboxCgroupOnly` enabled
|
||||
|
||||
## `sandbox_cgroup_only = true`
|
||||
With `SandboxCgroupOnly` enabled, it is expected that the parent cgroup is sized to take the overhead of running
|
||||
a sandbox into account. This is ideal, as all the applicable Kata Containers components can be placed within the
|
||||
given cgroup-path.
|
||||
|
||||
Setting `sandbox_cgroup_only` to `true` from the Kata Containers configuration file means that the pod cgroup is
|
||||
properly sized and takes the pod overhead into account. This is ideal, as all the applicable Kata Containers processes
|
||||
can simply be placed within the given cgroup path.
|
||||
|
||||
In the context of Kubernetes, Kubelet can size the pod cgroup to take the overhead of running a Kata-based sandbox
|
||||
into account. This has been supported since the 1.16 Kubernetes release, through the
|
||||
[`PodOverhead`](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/) feature.
|
||||
In the context of Kubernetes, Kubelet will size the pod-cgroup to take the overhead of running a Kata-based sandbox
|
||||
into account. This will be feasible in the 1.16 Kubernetes release through the `PodOverhead` feature.
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────┐
|
||||
│ │
|
||||
│ ┌──────────────────────────────────┐ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────────────────────┐ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │
|
||||
│ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │Pod 1 │ │ │
|
||||
│ │ └─────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────────────────────┐ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │
|
||||
│ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │Pod 2 │ │ │
|
||||
│ │ └─────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │/kubepods │ │
|
||||
│ └──────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Node │
|
||||
└─────────────────────────────────────────┘
|
||||
+----------------------------------------------------------+
|
||||
| +---------------------------------------------------+ |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | | kata-shimv2, VMM and threads: | | | |
|
||||
| | | | (VMM, IO-threads, vCPU threads, etc)| | | |
|
||||
| | | | | | | |
|
||||
| | | | kata_<sandbox-id> | | | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | | | |
|
||||
| | |Pod 1 | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | | kata-shimv2, VMM and threads: | | | |
|
||||
| | | | (VMM, IO-threads, vCPU threads, etc)| | | |
|
||||
| | | | | | | |
|
||||
| | | | kata_<sandbox-id> | | | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | |Pod 2 | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| |kubepods | |
|
||||
| +---------------------------------------------------+ |
|
||||
| |
|
||||
|Node |
|
||||
+----------------------------------------------------------+
|
||||
```
|
||||
|
||||
### Implementation details
|
||||
### What does Kata do in this configuration?
|
||||
1. Given a `PodSandbox` container creation, let:
|
||||
|
||||
When `sandbox_cgroup_only` is enabled, the Kata shim will create a per pod
|
||||
sub-cgroup under the pod's dedicated cgroup. For example, in the Kubernetes context,
|
||||
it will create a `/kata_<PodSandboxID>` under the `/kubepods` cgroup hierarchy.
|
||||
On a typical cgroup v1 hierarchy mounted under `/sys/fs/cgroup/`, the memory cgroup
|
||||
subsystem for a pod with sandbox ID `12345678` would live under
|
||||
`/sys/fs/cgroup/memory/kubepods/kata_12345678`.
|
||||
```
|
||||
podCgroup=Parent(container.CgroupsPath)
|
||||
KataSandboxCgroup=<podCgroup>/kata_<PodSandboxID>
|
||||
```
|
||||
|
||||
In most cases, the `/kata_<PodSandboxID>` created cgroup is unrestricted and inherits and shares all
|
||||
constraints and limits from the parent cgroup (`/kubepods` in the Kubernetes case). The exception is
|
||||
for the `cpuset` and `devices` cgroup subsystems, which are managed by the Kata shim.
|
||||
2. Create the cgroup, `KataSandboxCgroup`
|
||||
|
||||
After creating the `/kata_<PodSandboxID>` cgroup, the Kata Containers shim will move itself to it, **before** starting
|
||||
the virtual machine. As a consequence all processes subsequently created by the Kata Containers shim (the VMM itself, and
|
||||
all vCPU and I/O related threads) will be created in the `/kata_<PodSandboxID>` cgroup.
|
||||
3. Join the `KataSandboxCgroup`
|
||||
|
||||
### Why create a kata-cgroup under the parent cgroup?
|
||||
Any process created by the runtime will be created in `KataSandboxCgroup`.
|
||||
The runtime will limit the cgroup in the host only if the sandbox doesn't have a
|
||||
container type annotation, but the caller is free to set the proper limits for the `podCgroup`.
|
||||
|
||||
And why not directly adding the per sandbox shim directly to the pod cgroup (e.g.
|
||||
`/kubepods` in the Kubernetes context)?
|
||||
In the example above the pod cgroups are `/kubepods/pod1` and `/kubepods/pod2`.
|
||||
Kata creates the unrestricted sandbox cgroup under the pod cgroup.
|
||||
|
||||
The Kata Containers shim implementation creates a per-sandbox cgroup
|
||||
(`/kata_<PodSandboxID>`) to support the `Docker` use case. Although `Docker` does not
|
||||
have a notion of pods, Kata Containers still creates a sandbox to support the pod-less,
|
||||
single container use case that `Docker` implements. Since `Docker` does create any
|
||||
cgroup hierarchy to place a container into, it would be very complex for Kata to map
|
||||
a particular container to its sandbox without placing it under a `/kata_<containerID>>`
|
||||
sub-cgroup first.
|
||||
### Why create a Kata-cgroup under the parent cgroup?
|
||||
|
||||
### Advantages
|
||||
`Docker` does not have a notion of pods, and will not create a cgroup directory
|
||||
to place a particular container in (i.e., all containers would be in a path like
|
||||
`/docker/container-id`. To simplify the implementation and continue to support `Docker`,
|
||||
Kata Containers creates the sandbox-cgroup, in the case of Kubernetes, or a container cgroup, in the case
|
||||
of docker.
|
||||
|
||||
Keeping all Kata Containers processes under a properly sized pod cgroup is ideal
|
||||
and makes for a simpler Kata Containers implementation. It also helps with gathering
|
||||
accurate statistics and preventing Kata workloads from being noisy neighbors.
|
||||
### Improvements
|
||||
|
||||
#### Pod resources statistics
|
||||
- Get statistics about pod resources
|
||||
|
||||
If the Kata caller wants to know the resource usage on the host it can get
|
||||
statistics from the pod cgroup. All cgroups stats in the hierarchy will include
|
||||
the Kata overhead. This gives the possibility of gathering usage-statics at the
|
||||
pod level and the container level.
|
||||
|
||||
#### Better host resource isolation
|
||||
- Better host resource isolation
|
||||
|
||||
Because the Kata runtime will place all the Kata processes in the pod cgroup,
|
||||
the resource limits that the caller applies to the pod cgroup will affect all
|
||||
processes that belong to the Kata sandbox in the host. This will improve the
|
||||
isolation in the host preventing Kata to become a noisy neighbor.
|
||||
|
||||
## `sandbox_cgroup_only = false` (Default setting)
|
||||
|
||||
If the cgroup provided to Kata is not sized appropriately, Kata components will
|
||||
consume resources that the actual container workloads expect to see and use.
|
||||
This can cause instability and performance degradations.
|
||||
|
||||
To avoid that situation, Kata Containers creates an unconstrained overhead
|
||||
cgroup and moves all non workload related processes (Anything but the virtual CPU
|
||||
threads) to it. The name of this overhead cgroup is `/kata_overhead` and a per
|
||||
sandbox sub cgroup will be created under it for each sandbox Kata Containers creates.
|
||||
|
||||
Kata Containers does not add any constraints or limitations on the overhead cgroup. It is up to the infrastructure
|
||||
owner to either:
|
||||
|
||||
- Provision nodes with a pre-sized `/kata_overhead` cgroup. Kata Containers will
|
||||
load that existing cgroup and move all non workload related processes to it.
|
||||
- Let Kata Containers create the `/kata_overhead` cgroup, leave it
|
||||
unconstrained or resize it a-posteriori.
|
||||
## `SandboxCgroupOnly` disabled (default, legacy)
|
||||
|
||||
If the cgroup provided to Kata is not sized appropriately, instability will be
|
||||
introduced when fully constraining Kata components, and the user-workload will
|
||||
see a subset of resources that were requested. Based on this, the default
|
||||
handling for Kata Containers is to not fully constrain the VMM and Kata
|
||||
components on the host.
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────────┐
|
||||
│ │
|
||||
│ ┌─────────────────────────────┐ ┌───────────────────────────┐ │
|
||||
│ │ │ │ │ │
|
||||
│ │ ┌─────────────────────────┼────┼─────────────────────────┐ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ │ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ │ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │ /<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ Pod 1 │ │ │ │ │
|
||||
│ │ └─────────────────────────┼────┼─────────────────────────┘ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ ┌─────────────────────────┼────┼─────────────────────────┐ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ │ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ │ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │ /<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ Pod 2 │ │ │ │ │
|
||||
│ │ └─────────────────────────┼────┼─────────────────────────┘ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ /kubepods │ │ /kata_overhead │ │
|
||||
│ └─────────────────────────────┘ └───────────────────────────┘ │
|
||||
│ │
|
||||
│ │
|
||||
│ Node │
|
||||
└────────────────────────────────────────────────────────────────────┘
|
||||
+----------------------------------------------------------+
|
||||
| +---------------------------------------------------+ |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | |Container 1 |-|Container 2 | | | |
|
||||
| | | | |-| | | | |
|
||||
| | | | Shim+container1 |-| Shim+container2 | | | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | | | |
|
||||
| | |Pod 1 | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | |Container 1 |-|Container 2 | | | |
|
||||
| | | | |-| | | | |
|
||||
| | | | Shim+container1 |-| Shim+container2 | | | |
|
||||
| | | +--------------------------------------+ | | |
|
||||
| | | | | |
|
||||
| | |Pod 2 | | |
|
||||
| | +---------------------------------------------+ | |
|
||||
| |kubepods | |
|
||||
| +---------------------------------------------------+ |
|
||||
| +---------------------------------------------------+ |
|
||||
| | Hypervisor | |
|
||||
| |Kata | |
|
||||
| +---------------------------------------------------+ |
|
||||
| |
|
||||
|Node |
|
||||
+----------------------------------------------------------+
|
||||
|
||||
```
|
||||
|
||||
### Implementation Details
|
||||
### What does this method do?
|
||||
|
||||
When `sandbox_cgroup_only` is disabled, the Kata Containers shim will create a per pod
|
||||
sub-cgroup under the pods dedicated cgroup, and another one under the overhead cgroup.
|
||||
For example, in the Kubernetes context, it will create a `/kata_<PodSandboxID>` under
|
||||
the `/kubepods` cgroup hierarchy, and a `/<PodSandboxID>` under the `/kata_overhead` one.
|
||||
1. Given a container creation let `containerCgroupHost=container.CgroupsPath`
|
||||
1. Rename `containerCgroupHost` path to add `kata_`
|
||||
1. Let `PodCgroupPath=PodSanboxContainerCgroup` where `PodSanboxContainerCgroup` is the cgroup of a container of type `PodSandbox`
|
||||
1. Limit the `PodCgroupPath` with the sum of all the container limits in the Sandbox
|
||||
1. Move only vCPU threads of hypervisor to `PodCgroupPath`
|
||||
1. Per each container, move its `kata-shim` to its own `containerCgroupHost`
|
||||
1. Move hypervisor and applicable threads to memory cgroup `/kata`
|
||||
|
||||
On a typical cgroup v1 hierarchy mounted under `/sys/fs/cgroup/`, for a pod which sandbox
|
||||
ID is `12345678`, create with `sandbox_cgroup_only` disabled, the 2 memory subsystems
|
||||
for the sandbox cgroup and the overhead cgroup would respectively live under
|
||||
`/sys/fs/cgroup/memory/kubepods/kata_12345678` and `/sys/fs/cgroup/memory/kata_overhead/12345678`.
|
||||
_Note_: the Kata Containers runtime will not add all the hypervisor threads to
|
||||
the cgroup path requested, only vCPUs. These threads are run unconstrained.
|
||||
|
||||
Unlike when `sandbox_cgroup_only` is enabled, the Kata Containers shim will move itself
|
||||
to the overhead cgroup first, and then move the vCPU threads to the sandbox cgroup as
|
||||
they're created. All Kata processes and threads will run under the overhead cgroup except for
|
||||
the vCPU threads.
|
||||
This mitigates the risk of the VMM and other threads receiving an out of memory scenario (`OOM`).
|
||||
|
||||
With `sandbox_cgroup_only` disabled, Kata Containers assumes the pod cgroup is only sized
|
||||
to accommodate for the actual container workloads processes. For Kata, this maps
|
||||
to the VMM created virtual CPU threads and so they are the only ones running under the pod
|
||||
cgroup. This mitigates the risk of the VMM, the Kata shim and the I/O threads going through
|
||||
a catastrophic out of memory scenario (`OOM`).
|
||||
|
||||
#### Pros and Cons
|
||||
#### Impact
|
||||
|
||||
Running all non vCPU threads under an unconstrained overhead cgroup could lead to workloads
|
||||
potentially consuming a large amount of host resources.
|
||||
If resources are reserved at a system level to account for the overheads of
|
||||
running sandbox containers, this configuration can be utilized with adequate
|
||||
stability. In this scenario, non-negligible amounts of CPU and memory will be
|
||||
utilized unaccounted for on the host.
|
||||
|
||||
On the other hand, running all non vCPU threads under a dedicated overhead cgroup can provide
|
||||
accurate metrics on the actual Kata Container pod overhead, allowing for tuning the overhead
|
||||
cgroup size and constraints accordingly.
|
||||
|
||||
[linux-config]: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md
|
||||
[cgroupspath]: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md#cgroups-path
|
||||
[linux-config]: https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md
|
||||
[cgroupspath]: https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#cgroups-path
|
||||
|
||||
# Supported cgroups
|
||||
|
||||
Kata Containers currently only supports cgroups `v1`.
|
||||
|
||||
In the following sections each cgroup is described briefly.
|
||||
Kata Containers supports cgroups `v1` and `v2`. In the following sections each cgroup is
|
||||
described briefly and what changes are needed in Kata Containers to support it.
|
||||
|
||||
## Cgroups V1
|
||||
|
||||
@@ -301,7 +259,7 @@ diagram:
|
||||
A process can join a cgroup by writing its process id (`pid`) to `cgroup.procs` file,
|
||||
or join a cgroup partially by writing the task (thread) id (`tid`) to the `tasks` file.
|
||||
|
||||
Kata Containers only supports `v1`.
|
||||
Kata Containers supports `v1` by default and no change in the configuration file is needed.
|
||||
To know more about `cgroups v1`, see [cgroupsv1(7)][2].
|
||||
|
||||
## Cgroups V2
|
||||
@@ -354,13 +312,22 @@ Same as `cgroups v1`, a process can join the cgroup by writing its process id (`
|
||||
`cgroup.procs` file, or join a cgroup partially by writing the task (thread) id (`tid`) to
|
||||
`cgroup.threads` file.
|
||||
|
||||
Kata Containers does not support cgroups `v2` on the host.
|
||||
For backwards compatibility Kata Containers defaults to supporting cgroups v1 by default.
|
||||
To change this to `v2`, set `sandbox_cgroup_only=true` in the `configuration.toml` file.
|
||||
To know more about `cgroups v2`, see [cgroupsv2(7)][3].
|
||||
|
||||
### Distro Support
|
||||
|
||||
Many Linux distributions do not yet support `cgroups v2`, as it is quite a recent addition.
|
||||
For more information about the status of this feature see [issue #2494][4].
|
||||
|
||||
# Summary
|
||||
|
||||
| cgroup option | default? | status | pros | cons | cgroups
|
||||
|-|-|-|-|-|-|
|
||||
| `SandboxCgroupOnly=false` | yes | legacy | Easiest to make Kata work | Unaccounted for memory and resource utilization | v1
|
||||
| `SandboxCgroupOnly=true` | no | recommended | Complete tracking of Kata memory and CPU utilization. In Kubernetes, the Kubelet can fully constrain Kata via the pod cgroup | Requires upper layer orchestrator which sizes sandbox cgroup appropriately | v1, v2
|
||||
|
||||
|
||||
[1]: http://man7.org/linux/man-pages/man5/tmpfs.5.html
|
||||
[2]: http://man7.org/linux/man-pages/man7/cgroups.7.html#CGROUPS_VERSION_1
|
||||
|
||||
@@ -1,30 +0,0 @@
|
||||
# Kata Containers support for `inotify`
|
||||
|
||||
## Background on `inotify` usage
|
||||
|
||||
A common pattern in Kubernetes is to watch for changes to files/directories passed in as `ConfigMaps`
|
||||
or `Secrets`. Sidecar's normally use `inotify` to watch for changes and then signal the primary container to reload
|
||||
the updated configuration. Kata Containers typically will pass these host files into the guest using `virtiofs`, which
|
||||
does not support `inotify` today. While we work to enable this use case in `virtiofs`, we introduced a workaround in Kata Containers.
|
||||
This document describes how Kata Containers implements this workaround.
|
||||
|
||||
### Detecting a `watchable` mount
|
||||
|
||||
Kubernetes creates `secrets` and `ConfigMap` mounts at very specific locations on the host filesystem. For container mounts,
|
||||
the `Kata Containers` runtime will check the source of the mount to identify these special cases. For these use cases, only a single file
|
||||
or very few would typically need to be watched. To avoid excessive overheads in making a mount watchable,
|
||||
we enforce a limit of eight files per mount. If a `secret` or `ConfigMap` mount contains more than 8 files, it will not be
|
||||
considered watchable. We similarly enforce a limit of 1 MB per mount to be considered watchable. Non-watchable mounts will
|
||||
continue to propagate changes from the mount on the host to the container workload, but these updates will not trigger an
|
||||
`inotify` event.
|
||||
|
||||
If at any point a mount grows beyond the eight file or 1MB limit, it will no longer be `watchable.`
|
||||
|
||||
### Presenting a `watchable` mount to the workload
|
||||
|
||||
For mounts that are considered `watchable`, inside the guest, the `kata-agent` will poll the mount presented from
|
||||
the host through `virtiofs` and copy any changed files to a `tmpfs` mount that is presented to the container. In this way,
|
||||
for `watchable` mounts, Kata will do the polling on behalf of the workload and existing workloads needn't change their usage
|
||||
of `inotify`.
|
||||
|
||||

|
||||
@@ -1,21 +1,36 @@
|
||||
# Kata 2.0 Metrics Design
|
||||
|
||||
Kata implements CRI's API and supports [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose containers metrics. User can use these interfaces to get basic metrics about containers.
|
||||
* [Limitations of Kata 1.x and the target of Kata 2.0](#limitations-of-kata-1x-and-the-target-of-kata-20)
|
||||
* [Metrics architecture](#metrics-architecture)
|
||||
* [Kata monitor](#kata-monitor)
|
||||
* [Kata runtime](#kata-runtime)
|
||||
* [Kata agent](#kata-agent)
|
||||
* [Performance and overhead](#performance-and-overhead)
|
||||
* [Metrics list](#metrics-list)
|
||||
* [Metric types](#metric-types)
|
||||
* [Kata agent metrics](#kata-agent-metrics)
|
||||
* [Firecracker metrics](#firecracker-metrics)
|
||||
* [Kata guest OS metrics](#kata-guest-os-metrics)
|
||||
* [Hypervisor metrics](#hypervisor-metrics)
|
||||
* [Kata monitor metrics](#kata-monitor-metrics)
|
||||
* [Kata containerd shim v2 metrics](#kata-containerd-shim-v2-metrics)
|
||||
|
||||
Unlike `runc`, Kata is a VM-based runtime and has a different architecture.
|
||||
Kata implement CRI's API and support [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose containers metrics. User can use these interface to get basic metrics about container.
|
||||
|
||||
## Limitations of Kata 1.x and target of Kata 2.0
|
||||
But unlike `runc`, Kata is a VM-based runtime and has a different architecture.
|
||||
|
||||
## Limitations of Kata 1.x and the target of Kata 2.0
|
||||
|
||||
Kata 1.x has a number of limitations related to observability that may be obstacles to running Kata Containers at scale.
|
||||
|
||||
In Kata 2.0, the following components will be able to provide more details about the system:
|
||||
In Kata 2.0, the following components will be able to provide more details about the system.
|
||||
|
||||
- containerd shim v2 (effectively `kata-runtime`)
|
||||
- Hypervisor statistics
|
||||
- Agent process
|
||||
- Guest OS statistics
|
||||
|
||||
> **Note**: In Kata 1.x, the main user-facing component was the runtime (`kata-runtime`). From 1.5, Kata introduced the Kata containerd shim v2 (`containerd-shim-kata-v2`) which is essentially a modified runtime that is loaded by containerd to simplify and improve the way VM-based containers are created and managed.
|
||||
> **Note**: In Kata 1.x, the main user-facing component was the runtime (`kata-runtime`). From 1.5, Kata then introduced the Kata containerd shim v2 (`containerd-shim-kata-v2`) which is essentially a modified runtime that is loaded by containerd to simplify and improve the way VM-based containers are created and managed.
|
||||
>
|
||||
> For Kata 2.0, the main component is the Kata containerd shim v2, although the deprecated `kata-runtime` binary will be maintained for a period of time.
|
||||
>
|
||||
@@ -25,15 +40,14 @@ In Kata 2.0, the following components will be able to provide more details about
|
||||
|
||||
Kata 2.0 metrics strongly depend on [Prometheus](https://prometheus.io/), a graduated project from CNCF.
|
||||
|
||||
Kata Containers 2.0 introduces a new Kata component called `kata-monitor` which is used to monitor the Kata components on the host. It's shipped with the Kata runtime to provide an interface to:
|
||||
Kata Containers 2.0 introduces a new Kata component called `kata-monitor` which is used to monitor the other Kata components on the host. It's the monitor interface with Kata runtime, and we can do something like these:
|
||||
|
||||
- Get metrics
|
||||
- Get events
|
||||
|
||||
At present, `kata-monitor` supports retrieval of metrics only: this is what will be covered in this document.
|
||||
In this document we will cover metrics only. And until now it only supports metrics function.
|
||||
|
||||
|
||||
This is the architecture overview of metrics in Kata Containers 2.0:
|
||||
This is the architecture overview metrics in Kata Containers 2.0.
|
||||
|
||||

|
||||
|
||||
@@ -46,39 +60,38 @@ For a quick evaluation, you can check out [this how to](../how-to/how-to-set-pro
|
||||
|
||||
### Kata monitor
|
||||
|
||||
The `kata-monitor` management agent should be started on each node where the Kata containers runtime is installed. `kata-monitor` will:
|
||||
`kata-monitor` is a management agent on one node, where many Kata containers are running. `kata-monitor`'s work include:
|
||||
|
||||
> **Note**: a *node* running Kata containers will be either a single host system or a worker node belonging to a K8s cluster capable of running Kata pods.
|
||||
> **Note**: node is a single host system or a node in K8s clusters.
|
||||
|
||||
- Aggregate sandbox metrics running on the node, adding the `sandbox_id` label to them.
|
||||
- Attach the additional `cri_uid`, `cri_name` and `cri_namespace` labels to the sandbox metrics, tracking the `uid`, `name` and `namespace` Kubernetes pod metadata.
|
||||
- Expose a new Prometheus target, allowing all node metrics coming from the Kata shim to be collected by Prometheus indirectly. This simplifies the targets count in Prometheus and avoids exposing shim's metrics by `ip:port`.
|
||||
- Aggregate sandbox metrics running on this node, and add `sandbox_id` label
|
||||
- As a Prometheus target, all metrics from Kata shim on this node will be collected by Prometheus indirectly. This can easy the targets count in Prometheus, and also need not to expose shim's metrics by `ip:port`
|
||||
|
||||
Only one `kata-monitor` process runs in each node.
|
||||
Only one `kata-monitor` process are running on one node.
|
||||
|
||||
`kata-monitor` uses a different communication channel than the one used by the container engine (`containerd`/`CRI-O`) to communicate with the Kata shim. The Kata shim exposes a dedicated socket address reserved to `kata-monitor`.
|
||||
`kata-monitor` is using a different communication channel other than that `conatinerd` communicating with Kata shim, and Kata shim listen on a new socket address for communicating with `kata-monitor`.
|
||||
|
||||
The shim's metrics socket file is created under the virtcontainers sandboxes directory, i.e. `vc/sbs/${PODID}/shim-monitor.sock`.
|
||||
The way `kata-monitor` get shim's metrics socket file(`monitor_address`) like that `containerd` get shim address. The socket is an abstract socket and saved as file `abstract` with the same directory of `address` for `containerd`.
|
||||
|
||||
> **Note**: If there is no Prometheus server configured, i.e., there are no scrape operations, `kata-monitor` will not collect any metrics.
|
||||
> **Note**: If there is no Prometheus server is configured, i.e., there is no scrape operations, `kata-monitor` will do nothing initiative.
|
||||
|
||||
### Kata runtime
|
||||
|
||||
Kata runtime is responsible for:
|
||||
Runtime is responsible for:
|
||||
|
||||
- Gather metrics about shim process
|
||||
- Gather metrics about hypervisor process
|
||||
- Gather metrics about running sandbox
|
||||
- Get metrics from Kata agent (through `ttrpc`)
|
||||
- Get metrics from Kata agent(through `ttrpc`)
|
||||
|
||||
### Kata agent
|
||||
|
||||
Kata agent is responsible for:
|
||||
Agent is responsible for:
|
||||
|
||||
- Gather agent process metrics
|
||||
- Gather guest OS metrics
|
||||
|
||||
In Kata 2.0, the agent adds a new interface:
|
||||
And in Kata 2.0, agent will add a new interface:
|
||||
|
||||
```protobuf
|
||||
rpc GetMetrics(GetMetricsRequest) returns (Metrics);
|
||||
@@ -95,49 +108,33 @@ The `metrics` field is Prometheus encoded content. This can avoid defining a fix
|
||||
|
||||
### Performance and overhead
|
||||
|
||||
Metrics should not become a bottleneck for the system or downgrade the performance: they should run with minimal overhead.
|
||||
Metrics should not become the bottleneck of system, downgrade the performance, and run with minimal overhead.
|
||||
|
||||
Requirements:
|
||||
|
||||
* Metrics **MUST** be quick to collect
|
||||
* Metrics **MUST** be small
|
||||
* Metrics **MUST** be small.
|
||||
* Metrics **MUST** be generated only if there are subscribers to the Kata metrics service
|
||||
* Metrics **MUST** be stateless
|
||||
|
||||
In Kata 2.0, metrics are collected only when needed (pull mode), mainly from the `/proc` filesystem, and consumed by Prometheus. This means that if the Prometheus collector is not running (so no one cares about the metrics) the overhead will be zero.
|
||||
In Kata 2.0, metrics are collected mainly from `/proc` filesystem, and consumed by Prometheus, based on a pull mode, that is mean if there is no Prometheus collector is running, so there will be zero overhead if nobody cares the metrics.
|
||||
|
||||
The metrics service also doesn't hold any metrics in memory.
|
||||
|
||||
#### Metrics size ####
|
||||
Metrics service also doesn't hold any metrics in memory.
|
||||
|
||||
|\*|No Sandbox | 1 Sandbox | 2 Sandboxes |
|
||||
|---|---|---|---|
|
||||
|Metrics count| 39 | 106 | 173 |
|
||||
|Metrics size (bytes)| 9K | 144K | 283K |
|
||||
|Metrics size (`gzipped`, bytes)| 2K | 10K | 17K |
|
||||
|Metrics size(bytes)| 9K | 144K | 283K |
|
||||
|Metrics size(`gzipped`, bytes)| 2K | 10K | 17K |
|
||||
|
||||
*Metrics size*: response size of one Prometheus scrape request.
|
||||
*Metrics size*: Response size of one Prometheus scrape request.
|
||||
|
||||
It's easy to estimate the size of one metrics fetch request issued by Prometheus.
|
||||
The formula to calculate the expected size when no gzip compression is in place is:
|
||||
9 + (144 - 9) * `number of kata sandboxes`
|
||||
|
||||
Prometheus supports `gzip compression`. When enabled, the response size of each request will be smaller:
|
||||
2 + (10 - 2) * `number of kata sandboxes`
|
||||
|
||||
**Example**
|
||||
We have 10 sandboxes running on a node. The expected size of one metrics fetch request issued by Prometheus against the kata-monitor agent running on that node will be:
|
||||
9 + (144 - 9) * 10 = **1.35M**
|
||||
|
||||
If `gzip compression` is enabled:
|
||||
2 + (10 - 2) * 10 = **82K**
|
||||
|
||||
#### Metrics delay ####
|
||||
It's easy to estimated that if there are 10 sandboxes running in the host, the size of one metrics fetch request issued by Prometheus will be about to 9 + (144 - 9) * 10 = 1.35M (not `gzipped`) or 2 + (10 - 2) * 10 = 82K (`gzipped`). Of course Prometheus support `gzip` compression, that can reduce the response size of every request.
|
||||
|
||||
And here is some test data:
|
||||
|
||||
- End-to-end (from Prometheus server to `kata-monitor` and `kata-monitor` write response back): **20ms**(avg)
|
||||
- Agent (RPC all from shim to agent): **3ms**(avg)
|
||||
- End-to-end (from Prometheus server to `kata-monitor` and `kata-monitor` write response back): 20ms(avg)
|
||||
- Agent(RPC all from shim to agent): 3ms(avg)
|
||||
|
||||
Test infrastructure:
|
||||
|
||||
@@ -146,13 +143,13 @@ Test infrastructure:
|
||||
|
||||
**Scrape interval**
|
||||
|
||||
Prometheus default `scrape_interval` is 1 minute, but it is usually set to 15 seconds. A smaller `scrape_interval` causes more overhead, so users should set it depending on their monitoring needs.
|
||||
Prometheus default `scrape_interval` is 1 minute, and usually it is set to 15s. Small `scrape_interval` will cause more overhead, so user should set it on monitor demand.
|
||||
|
||||
## Metrics list
|
||||
|
||||
Here are listed all the metrics supported by Kata 2.0. Some metrics are dependent on the VM guest kernel, so the available ones may differ based on the environment.
|
||||
Here listed is all supported metrics by Kata 2.0. Some metrics is dependent on guest kernels in the VM, so there may be some different by your environment.
|
||||
|
||||
Metrics are categorized by the component from/for which the metrics are collected.
|
||||
Metrics is categorized by component where metrics are collected from and for.
|
||||
|
||||
* [Metric types](#metric-types)
|
||||
* [Kata agent metrics](#kata-agent-metrics)
|
||||
@@ -163,15 +160,15 @@ Metrics are categorized by the component from/for which the metrics are collecte
|
||||
* [Kata containerd shim v2 metrics](#kata-containerd-shim-v2-metrics)
|
||||
|
||||
> **Note**:
|
||||
> * Labels here do not include the `instance` and `job` labels added by Prometheus.
|
||||
> * Labels here are not include `instance` and `job` labels that added by Prometheus.
|
||||
> * Notes about metrics unit
|
||||
> * `Kibibytes`, abbreviated `KiB`. 1 `KiB` equals 1024 B.
|
||||
> * For some metrics (like network devices statistics from file `/proc/net/dev`), unit depends on label( for example `recv_bytes` and `recv_packets` have different units).
|
||||
> * Most of these metrics are collected from the `/proc` filesystem, so the unit of each metric matches the unit of the relevant `/proc` entry. See the `proc(5)` manual page for further details.
|
||||
> * For some metrics (like network devices statistics from file `/proc/net/dev`), unit is depend on label( for example `recv_bytes` and `recv_packets` are having different units).
|
||||
> * Most of these metrics is collected from `/proc` filesystem, so the unit of metrics are keeping the same unit as `/proc`. See the `proc(5)` manual page for further details.
|
||||
|
||||
### Metric types
|
||||
|
||||
Prometheus offers four core metric types.
|
||||
Prometheus offer four core metric types.
|
||||
|
||||
- Counter: A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase.
|
||||
|
||||
@@ -225,7 +222,7 @@ Metrics for Firecracker vmm.
|
||||
| `kata_firecracker_uart`: <br> Metrics specific to the UART device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`flush_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vcpu`: <br> Metrics specific to VCPUs' mode of functioning. | `GAUGE` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vmm`: <br> Metrics specific to the machine manager as a whole. | `GAUGE` | | <ul><li>`item`<ul><li>`device_events`</li><li>`panic_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vsock`: <br> VSOCK-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`conn_event_fails`</li><li>`conns_added`</li><li>`conns_killed`</li><li>`conns_removed`</li><li>`ev_queue_event_fails`</li><li>`killq_resync`</li><li>`muxer_event_fails`</li><li>`rx_bytes_count`</li><li>`rx_packets_count`</li><li>`rx_queue_event_count`</li><li>`rx_queue_event_fails`</li><li>`rx_read_fails`</li><li>`tx_bytes_count`</li><li>`tx_flush_fails`</li><li>`tx_packets_count`</li><li>`tx_queue_event_count`</li><li>`tx_queue_event_fails`</li><li>`tx_write_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vsock`: <br> Vsock-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`conn_event_fails`</li><li>`conns_added`</li><li>`conns_killed`</li><li>`conns_removed`</li><li>`ev_queue_event_fails`</li><li>`killq_resync`</li><li>`muxer_event_fails`</li><li>`rx_bytes_count`</li><li>`rx_packets_count`</li><li>`rx_queue_event_count`</li><li>`rx_queue_event_fails`</li><li>`rx_read_fails`</li><li>`tx_bytes_count`</li><li>`tx_flush_fails`</li><li>`tx_packets_count`</li><li>`tx_queue_event_count`</li><li>`tx_queue_event_fails`</li><li>`tx_write_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
|
||||
### Kata guest OS metrics
|
||||
|
||||
@@ -306,7 +303,7 @@ Metrics about Kata containerd shim v2 process.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_shim_agent_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (RPC actions of Kata agent)<ul><li>`grpc.CheckRequest`</li><li>`grpc.CloseStdinRequest`</li><li>`grpc.CopyFileRequest`</li><li>`grpc.CreateContainerRequest`</li><li>`grpc.CreateSandboxRequest`</li><li>`grpc.DestroySandboxRequest`</li><li>`grpc.ExecProcessRequest`</li><li>`grpc.GetMetricsRequest`</li><li>`grpc.GuestDetailsRequest`</li><li>`grpc.ListInterfacesRequest`</li><li>`grpc.ListProcessesRequest`</li><li>`grpc.ListRoutesRequest`</li><li>`grpc.MemHotplugByProbeRequest`</li><li>`grpc.OnlineCPUMemRequest`</li><li>`grpc.PauseContainerRequest`</li><li>`grpc.RemoveContainerRequest`</li><li>`grpc.ReseedRandomDevRequest`</li><li>`grpc.ResumeContainerRequest`</li><li>`grpc.SetGuestDateTimeRequest`</li><li>`grpc.SignalProcessRequest`</li><li>`grpc.StartContainerRequest`</li><li>`grpc.StatsContainerRequest`</li><li>`grpc.TtyWinResizeRequest`</li><li>`grpc.UpdateContainerRequest`</li><li>`grpc.UpdateInterfaceRequest`</li><li>`grpc.UpdateRoutesRequest`</li><li>`grpc.WaitProcessRequest`</li><li>`grpc.WriteStreamRequest`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_agent_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (RPC actions of Kata agent)<ul><li>`grpc.CheckRequest`</li><li>`grpc.CloseStdinRequest`</li><li>`grpc.CopyFileRequest`</li><li>`grpc.CreateContainerRequest`</li><li>`grpc.CreateSandboxRequest`</li><li>`grpc.DestroySandboxRequest`</li><li>`grpc.ExecProcessRequest`</li><li>`grpc.GetMetricsRequest`</li><li>`grpc.GuestDetailsRequest`</li><li>`grpc.ListInterfacesRequest`</li><li>`grpc.ListProcessesRequest`</li><li>`grpc.ListRoutesRequest`</li><li>`grpc.MemHotplugByProbeRequest`</li><li>`grpc.OnlineCPUMemRequest`</li><li>`grpc.PauseContainerRequest`</li><li>`grpc.RemoveContainerRequest`</li><li>`grpc.ReseedRandomDevRequest`</li><li>`grpc.ResumeContainerRequest`</li><li>`grpc.SetGuestDateTimeRequest`</li><li>`grpc.SignalProcessRequest`</li><li>`grpc.StartContainerRequest`</li><li>`grpc.StartTracingRequest`</li><li>`grpc.StatsContainerRequest`</li><li>`grpc.StopTracingRequest`</li><li>`grpc.TtyWinResizeRequest`</li><li>`grpc.UpdateContainerRequest`</li><li>`grpc.UpdateInterfaceRequest`</li><li>`grpc.UpdateRoutesRequest`</li><li>`grpc.WaitProcessRequest`</li><li>`grpc.WriteStreamRequest`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_fds`: <br> Kata containerd shim v2 open FDs. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_gc_duration_seconds`: <br> A summary of the pause duration of garbage collection cycles. | `SUMMARY` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_goroutines`: <br> Number of goroutines that currently exist. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
# Kata API Design
|
||||
|
||||
To fulfill the [Kata design requirements](kata-design-requirements.md), and based on the discussion on [Virtcontainers API extensions](https://docs.google.com/presentation/d/1dbGrD1h9cpuqAPooiEgtiwWDGCYhVPdatq7owsKHDEQ), the Kata runtime library features the following APIs:
|
||||
- Sandbox based top API
|
||||
- Storage and network hotplug API
|
||||
- Plugin frameworks for external proprietary Kata runtime extensions
|
||||
- Built-in shim and proxy types and capabilities
|
||||
|
||||
## Sandbox Based API
|
||||
### Sandbox Management API
|
||||
@@ -57,7 +57,7 @@ To fulfill the [Kata design requirements](kata-design-requirements.md), and base
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`sandbox.GetOOMEvent()`| Monitor the OOM events that occur in the sandbox..|
|
||||
|`sandbox.UpdateRuntimeMetrics()`| Update the `shim/hypervisor` metrics of the running sandbox.|
|
||||
|`sandbox.UpdateRuntimeMetrics()`| Update the shim/`hypervisor`'s metrics of the running sandbox.|
|
||||
|`sandbox.GetAgentMetrics()`| Get metrics of the agent and the guest in the running sandbox.|
|
||||
|
||||
## Plugin framework for external proprietary Kata runtime extensions
|
||||
@@ -99,3 +99,32 @@ Built-in implementations include:
|
||||
|
||||
### Sandbox Connection Plugin Workflow
|
||||

|
||||
|
||||
## Built-in Shim and Proxy Types and Capabilities
|
||||
### Built-in shim/proxy sandbox configurations
|
||||
- Supported shim configurations:
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`noopshim`|Do not start any shim process.|
|
||||
|`ccshim`| Start the cc-shim binary.|
|
||||
|`katashim`| Start the `kata-shim` binary.|
|
||||
|`katashimbuiltin`|No standalone shim process but shim functionality APIs are exported.|
|
||||
- Supported proxy configurations:
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`noopProxy`| a dummy proxy implementation of the proxy interface, only used for testing purpose.|
|
||||
|`noProxy`|generic implementation for any case where no actual proxy is needed.|
|
||||
|`ccProxy`|run `ccProxy` to proxy between runtime and agent.|
|
||||
|`kataProxy`|run `kata-proxy` to translate Yamux connections between runtime and Kata agent. |
|
||||
|`kataProxyBuiltin`| no standalone proxy process and connect to Kata agent with internal Yamux translation.|
|
||||
|
||||
### Built-in Shim Capability
|
||||
Built-in shim capability is implemented by removing standalone shim process, and
|
||||
supporting the shim related APIs.
|
||||
|
||||
### Built-in Proxy Capability
|
||||
Built-in proxy capability is achieved by removing standalone proxy process, and
|
||||
connecting to Kata agent with a custom gRPC dialer that is internal Yamux translation.
|
||||
The behavior is enabled when proxy is configured as `kataProxyBuiltin`.
|
||||
|
||||
@@ -30,7 +30,7 @@ The Kata Containers runtime **MUST** implement the following command line option
|
||||
The Kata Containers project **MUST** provide two interfaces for CRI shims to manage hardware
|
||||
virtualization based Kubernetes pods and containers:
|
||||
- An OCI and `runc` compatible command line interface, as described in the previous section.
|
||||
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`containerd`](https://github.com/containerd/containerd), for example.
|
||||
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`cri-containerd`](https://github.com/containerd/cri-containerd), for example.
|
||||
- A hardware virtualization runtime library API for CRI shims to consume and provide a more
|
||||
CRI native implementation. The [`frakti`](https://github.com/kubernetes/frakti) CRI shim is an example of such a consumer.
|
||||
|
||||
|
||||
@@ -1,93 +0,0 @@
|
||||
# Background
|
||||
|
||||
[Research](https://www.usenix.org/conference/fast16/technical-sessions/presentation/harter) shows that time to take for pull operation accounts for 76% of container startup time but only 6.4% of that data is read. So if we can get data on demand (lazy load), it will speed up the container start. [`Nydus`](https://github.com/dragonflyoss/image-service) is a project which build image with new format and can get data on demand when container start.
|
||||
|
||||
The following benchmarking result shows the performance improvement compared with the OCI image for the container cold startup elapsed time on containerd. As the OCI image size increases, the container startup time of using `nydus` image remains very short. [Click here](https://github.com/dragonflyoss/image-service/blob/master/docs/nydus-design.md) to see `nydus` design.
|
||||
|
||||

|
||||
|
||||
## Proposal - Bring `lazyload` ability to Kata Containers
|
||||
|
||||
`Nydusd` is a fuse/`virtiofs` daemon which is provided by `nydus` project and it supports `PassthroughFS` and [RAFS](https://github.com/dragonflyoss/image-service/blob/master/docs/nydus-design.md) (Registry Acceleration File System) natively, so in Kata Containers, we can use `nydusd` in place of `virtiofsd` and mount `nydus` image to guest in the meanwhile.
|
||||
|
||||
The process of creating/starting Kata Containers with `virtiofsd`,
|
||||
|
||||
1. When creating sandbox, the Kata Containers Containerd v2 [shim](https://github.com/kata-containers/kata-containers/blob/main/docs/design/architecture/README.md#runtime) will launch `virtiofsd` before VM starts and share directories with VM.
|
||||
2. When creating container, the Kata Containers Containerd v2 shim will mount rootfs to `kataShared`(/run/kata-containers/shared/sandboxes/\<SANDBOX\>/mounts/\<CONTAINER\>/rootfs), so it can be seen at the path `/run/kata-containers/shared/containers/shared/\<CONTAINER\>/rootfs` in the guest and used as container's rootfs.
|
||||
|
||||
The process of creating/starting Kata Containers with `nydusd`,
|
||||
|
||||

|
||||
|
||||
1. When creating sandbox, the Kata Containers Containerd v2 shim will launch `nydusd` daemon before VM starts.
|
||||
After VM starts, `kata-agent` will mount `virtiofs` at the path `/run/kata-containers/shared` and Kata Containers Containerd v2 shim mount `passthroughfs` filesystem to path `/run/kata-containers/shared/containers` when the VM starts.
|
||||
|
||||
```bash
|
||||
# start nydusd
|
||||
$ sandbox_id=my-test-sandbox
|
||||
$ sudo /usr/local/bin/nydusd --log-level info --sock /run/vc/vm/${sandbox_id}/vhost-user-fs.sock --apisock /run/vc/vm/${sandbox_id}/api.sock
|
||||
```
|
||||
|
||||
```bash
|
||||
# source: the host sharedir which will pass through to guest
|
||||
$ sudo curl -v --unix-socket /run/vc/vm/${sandbox_id}/api.sock \
|
||||
-X POST "http://localhost/api/v1/mount?mountpoint=/containers" -H "accept: */*" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"source":"/path/to/sharedir",
|
||||
"fs_type":"passthrough_fs",
|
||||
"config":""
|
||||
}'
|
||||
```
|
||||
|
||||
2. When creating normal container, the Kata Containers Containerd v2 shim send request to `nydusd` to mount `rafs` at the path `/run/kata-containers/shared/rafs/<container_id>/lowerdir` in guest.
|
||||
|
||||
```bash
|
||||
# source: the metafile of nydus image
|
||||
# config: the config of this image
|
||||
$ sudo curl --unix-socket /run/vc/vm/${sandbox_id}/api.sock \
|
||||
-X POST "http://localhost/api/v1/mount?mountpoint=/rafs/<container_id>/lowerdir" -H "accept: */*" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"source":"/path/to/bootstrap",
|
||||
"fs_type":"rafs",
|
||||
"config":"config":"{\"device\":{\"backend\":{\"type\":\"localfs\",\"config\":{\"dir\":\"blobs\"}},\"cache\":{\"type\":\"blobcache\",\"config\":{\"work_dir\":\"cache\"}}},\"mode\":\"direct\",\"digest_validate\":true}",
|
||||
}'
|
||||
```
|
||||
|
||||
The Kata Containers Containerd v2 shim will also bind mount `snapshotdir` which `nydus-snapshotter` assigns to `sharedir`。
|
||||
So in guest, container rootfs=overlay(`lowerdir=rafs`, `upperdir=snapshotdir/fs`, `workdir=snapshotdir/work`)
|
||||
|
||||
> how to transfer the `rafs` info from `nydus-snapshotter` to the Kata Containers Containerd v2 shim?
|
||||
|
||||
By default, when creating `OCI` image container, `nydus-snapshotter` will return [`struct` Mount slice](https://github.com/containerd/containerd/blob/main/mount/mount.go#L21) below to containerd and containerd use them to mount rootfs
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
Type: "overlay",
|
||||
Source: "overlay",
|
||||
Options: [lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_A>/mnt,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/work],
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
Then, we can append `rafs` info into `Options`, but if do this, containerd will mount failed, as containerd can not identify `rafs` info. Here, we can refer to [containerd mount helper](https://github.com/containerd/containerd/blob/main/mount/mount_linux.go#L42) and provide a binary called `nydus-overlayfs`. The `Mount` slice which `nydus-snapshotter` returned becomes
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
Type: "fuse.nydus-overlayfs",
|
||||
Source: "overlay",
|
||||
Options: [lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_A>/mnt,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/work,extraoption=base64({source:xxx,config:xxx,snapshotdir:xxx})],
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
When containerd find `Type` is `fuse.nydus-overlayfs`,
|
||||
|
||||
1. containerd will call `mount.fuse` command;
|
||||
2. in `mount.fuse`, it will call `nydus-overlayfs`.
|
||||
3. in `nydus-overlayfs`, it will ignore the `extraoption` and do the overlay mount.
|
||||
|
||||
Finally, in the Kata Containers Containerd v2 shim, it parse `extraoption` and get the `rafs` info to mount the image in guest.
|
||||
@@ -1,5 +0,0 @@
|
||||
# Design proposals
|
||||
|
||||
Kata Containers design proposal documents:
|
||||
|
||||
- [Kata Containers tracing](tracing-proposals.md)
|
||||
@@ -1,213 +0,0 @@
|
||||
# Kata Tracing proposals
|
||||
|
||||
## Overview
|
||||
|
||||
This document summarises a set of proposals triggered by the
|
||||
[tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
## Required context
|
||||
|
||||
This section explains some terminology required to understand the proposals.
|
||||
Further details can be found in the
|
||||
[tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
### Agent trace mode terminology
|
||||
|
||||
| Trace mode | Description | Use-case |
|
||||
|-|-|-|
|
||||
| Static | Trace agent from startup to shutdown | Entire lifespan |
|
||||
| Dynamic | Toggle tracing on/off as desired | On-demand "snapshot" |
|
||||
|
||||
### Agent trace type terminology
|
||||
|
||||
| Trace type | Description | Use-case |
|
||||
|-|-|-|
|
||||
| isolated | traces all relate to single component | Observing lifespan |
|
||||
| collated | traces "grouped" (runtime+agent) | Understanding component interaction |
|
||||
|
||||
### Container lifespan
|
||||
|
||||
| Lifespan | trace mode | trace type |
|
||||
|-|-|-|
|
||||
| short-lived | static | collated if possible, else isolated? |
|
||||
| long-running | dynamic | collated? (to see interactions) |
|
||||
|
||||
## Original plan for agent
|
||||
|
||||
- Implement all trace types and trace modes for agent.
|
||||
|
||||
- Why?
|
||||
- Maximum flexibility.
|
||||
|
||||
> **Counterargument:**
|
||||
>
|
||||
> Due to the intrusive nature of adding tracing, we have
|
||||
> learnt that landing small incremental changes is simpler and quicker!
|
||||
|
||||
- Compatibility with [Kata 1.x tracing][kata-1x-tracing].
|
||||
|
||||
> **Counterargument:**
|
||||
>
|
||||
> Agent tracing in Kata 1.x was extremely awkward to setup (to the extent
|
||||
> that it's unclear how many users actually used it!)
|
||||
>
|
||||
> This point, coupled with the new architecture for Kata 2.x, suggests
|
||||
> that we may not need to supply the same set of tracing features (in fact
|
||||
> they may not make sense)).
|
||||
|
||||
## Agent tracing proposals
|
||||
|
||||
### Agent tracing proposal 1: Don't implement dynamic trace mode
|
||||
|
||||
- All tracing will be static.
|
||||
|
||||
- Why?
|
||||
- Because dynamic tracing will always be "partial"
|
||||
|
||||
> In fact, not only would it be only a "snapshot" of activity, it may not
|
||||
> even be possible to create a complete "trace transaction". If this is
|
||||
> true, the trace output would be partial and would appear "unstructured".
|
||||
|
||||
### Agent tracing proposal 2: Simplify handling of trace type
|
||||
|
||||
- Agent tracing will be "isolated" by default.
|
||||
- Agent tracing will be "collated" if runtime tracing is also enabled.
|
||||
|
||||
- Why?
|
||||
- Offers a graceful fallback for agent tracing if runtime tracing disabled.
|
||||
- Simpler code!
|
||||
|
||||
## Questions to ask yourself (part 1)
|
||||
|
||||
- Are your containers long-running or short-lived?
|
||||
|
||||
- Would you ever need to turn on tracing "briefly"?
|
||||
- If "yes", is a "partial trace" useful or useless?
|
||||
|
||||
> Likely to be considered useless as it is a partial snapshot.
|
||||
> Alternative tracing methods may be more appropriate to dynamic
|
||||
> OpenTelemetry tracing.
|
||||
|
||||
## Questions to ask yourself (part 2)
|
||||
|
||||
- Are you happy to stop a container to enable tracing?
|
||||
If "no", dynamic tracing may be required.
|
||||
|
||||
- Would you ever want to trace the agent and the runtime "in isolation" at the
|
||||
same time?
|
||||
- If "yes", we need to fully implement `trace_mode=isolated`
|
||||
|
||||
> This seems unlikely though.
|
||||
|
||||
## Trace collection
|
||||
|
||||
The second set of proposals affect the way traces are collected.
|
||||
|
||||
### Motivation
|
||||
|
||||
Currently:
|
||||
|
||||
- The runtime sends trace spans to Jaeger directly.
|
||||
- The agent will send trace spans to the [`trace-forwarder`][trace-forwarder] component.
|
||||
- The trace forwarder will send trace spans to Jaeger.
|
||||
|
||||
Kata agent tracing overview:
|
||||
|
||||
```
|
||||
+-------------------------------------------+
|
||||
| Host |
|
||||
| |
|
||||
| +-----------+ |
|
||||
| | Trace | |
|
||||
| | Collector | |
|
||||
| +-----+-----+ |
|
||||
| ^ +--------------+ |
|
||||
| | spans | Kata VM | |
|
||||
| +-----+-----+ | | |
|
||||
| | Kata | spans | +-----+ | |
|
||||
| | Trace |<-----------------|Kata | | |
|
||||
| | Forwarder | VSOCK | |Agent| | |
|
||||
| +-----------+ Channel | +-----+ | |
|
||||
| +--------------+ |
|
||||
+-------------------------------------------+
|
||||
```
|
||||
|
||||
Currently:
|
||||
|
||||
- If agent tracing is enabled but the trace forwarder is not running,
|
||||
the agent will error.
|
||||
|
||||
- If the trace forwarder is started but Jaeger is not running,
|
||||
the trace forwarder will error.
|
||||
|
||||
### Goals
|
||||
|
||||
- The runtime and agent should:
|
||||
- Use the same trace collection implementation.
|
||||
- Use the most the common configuration items.
|
||||
|
||||
- Kata should should support more trace collection software or `SaaS`
|
||||
(for example `Zipkin`, `datadog`).
|
||||
|
||||
- Trace collection should not block normal runtime/agent operations
|
||||
(for example if `vsock-exporter`/Jaeger is not running, Kata Containers should work normally).
|
||||
|
||||
### Trace collection proposals
|
||||
|
||||
#### Trace collection proposal 1: Send all spans to the trace forwarder as a span proxy
|
||||
|
||||
Kata runtime/agent all send spans to trace forwarder, and the trace forwarder,
|
||||
acting as a tracing proxy, sends all spans to a tracing back-end, such as Jaeger or `datadog`.
|
||||
|
||||
**Pros:**
|
||||
|
||||
- Runtime/agent will be simple.
|
||||
- Could update trace collection target while Kata Containers are running.
|
||||
|
||||
**Cons:**
|
||||
|
||||
- Requires the trace forwarder component to be running (that is a pressure to operation).
|
||||
|
||||
#### Trace collection proposal 2: Send spans to collector directly from runtime/agent
|
||||
|
||||
Send spans to collector directly from runtime/agent, this proposal need
|
||||
network accessible to the collector.
|
||||
|
||||
**Pros:**
|
||||
|
||||
- No additional trace forwarder component needed.
|
||||
|
||||
**Cons:**
|
||||
|
||||
- Need more code/configuration to support all trace collectors.
|
||||
|
||||
## Future work
|
||||
|
||||
- We could add dynamic and fully isolated tracing at a later stage,
|
||||
if required.
|
||||
|
||||
## Further details
|
||||
|
||||
- See the new [GitHub project](https://github.com/orgs/kata-containers/projects/28).
|
||||
- [kata-containers-tracing-status](https://gist.github.com/jodh-intel/0ee54d41d2a803ba761e166136b42277) gist.
|
||||
- [tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
## Summary
|
||||
|
||||
### Time line
|
||||
|
||||
- 2021-07-01: A summary of the discussion was
|
||||
[posted to the mail list](http://lists.katacontainers.io/pipermail/kata-dev/2021-July/001996.html).
|
||||
- 2021-06-22: These proposals were
|
||||
[discussed in the Kata Architecture Committee meeting](https://etherpad.opendev.org/p/Kata_Containers_2021_Architecture_Committee_Mtgs).
|
||||
- 2021-06-18: These proposals where
|
||||
[announced on the mailing list](http://lists.katacontainers.io/pipermail/kata-dev/2021-June/001980.html).
|
||||
|
||||
### Outcome
|
||||
|
||||
- Nobody opposed the agent proposals, so they are being implemented.
|
||||
- The trace collection proposals are still being considered.
|
||||
|
||||
[kata-1x-tracing]: https://github.com/kata-containers/agent/blob/master/TRACING.md
|
||||
[trace-forwarder]: /src/tools/trace-forwarder
|
||||
[tracing-doc-pr]: https://github.com/kata-containers/kata-containers/pull/1937
|
||||
@@ -1,3 +1,11 @@
|
||||
- [Virtual machine vCPU sizing in Kata Containers](#virtual-machine-vcpu-sizing-in-kata-containers)
|
||||
* [Default number of virtual CPUs](#default-number-of-virtual-cpus)
|
||||
* [Virtual CPUs and Kubernetes pods](#virtual-cpus-and-kubernetes-pods)
|
||||
* [Container lifecycle](#container-lifecycle)
|
||||
* [Container without CPU constraint](#container-without-cpu-constraint)
|
||||
* [Container with CPU constraint](#container-with-cpu-constraint)
|
||||
* [Do not waste resources](#do-not-waste-resources)
|
||||
|
||||
# Virtual machine vCPU sizing in Kata Containers
|
||||
|
||||
## Default number of virtual CPUs
|
||||
@@ -157,32 +165,6 @@ docker run --cpus 4 -ti debian bash -c "nproc; cat /sys/fs/cgroup/cpu,cpuacct/cp
|
||||
400000 # cfs quota
|
||||
```
|
||||
|
||||
## Virtual CPU handling without hotplug
|
||||
|
||||
In some cases, the hardware and/or software architecture being utilized does not support
|
||||
hotplug. For example, Firecracker VMM does not support CPU or memory hotplug. Similarly,
|
||||
the current Linux Kernel for aarch64 does not support CPU or memory hotplug. To appropriately
|
||||
size the virtual machine for the workload within the container or pod, we provide a `static_sandbox_resource_mgmt`
|
||||
flag within the Kata Containers configuration. When this is set, the runtime will:
|
||||
- Size the VM based on the workload requirements as well as the `default_vcpus` option specified in the configuration.
|
||||
- Not resize the virtual machine after it has been launched.
|
||||
|
||||
VM size determination varies depending on the type of container being run, and may not always
|
||||
be available. If workload sizing information is not available, the virtual machine will be started with the
|
||||
`default_vcpus`.
|
||||
|
||||
In the case of a pod, the initial sandbox container (pause container) typically doesn't contain any resource
|
||||
information in its runtime `spec`. It is possible that the upper layer runtime
|
||||
(i.e. containerd or CRI-O) may pass sandbox sizing annotations within the pause container's
|
||||
`spec`. If these are provided, we will use this to appropriately size the VM. In particular,
|
||||
we'll calculate the number of CPUs required for the workload and augment this by `default_vcpus`
|
||||
configuration option, and use this for the virtual machine size.
|
||||
|
||||
In the case of a single container (i.e., not a pod), if the container specifies resource requirements,
|
||||
the container's `spec` will provide the sizing information directly. If these are set, we will
|
||||
calculate the number of CPUs required for the workload and augment this by `default_vcpus`
|
||||
configuration option, and use this for the virtual machine size.
|
||||
|
||||
|
||||
[1]: https://docs.docker.com/config/containers/resource_constraints/#cpu
|
||||
[2]: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
|
||||
|
||||
@@ -1,5 +1,16 @@
|
||||
# Virtualization in Kata Containers
|
||||
|
||||
- [Virtualization in Kata Containers](#virtualization-in-kata-containers)
|
||||
- [Mapping container concepts to virtual machine technologies](#mapping-container-concepts-to-virtual-machine-technologies)
|
||||
- [Kata Containers Hypervisor and VMM support](#kata-containers-hypervisor-and-vmm-support)
|
||||
- [QEMU/KVM](#qemukvm)
|
||||
- [Machine accelerators](#machine-accelerators)
|
||||
- [Hotplug devices](#hotplug-devices)
|
||||
- [Firecracker/KVM](#firecrackerkvm)
|
||||
- [Cloud Hypervisor/KVM](#cloud-hypervisorkvm)
|
||||
- [Summary](#summary)
|
||||
|
||||
|
||||
Kata Containers, a second layer of isolation is created on top of those provided by traditional namespace-containers. The
|
||||
hardware virtualization interface is the basis of this additional layer. Kata will launch a lightweight virtual machine,
|
||||
and use the guest’s Linux kernel to create a container workload, or workloads in the case of multi-container pods. In Kubernetes
|
||||
@@ -11,10 +22,10 @@ the multiple hypervisors and virtual machine monitors that Kata supports.
|
||||
## Mapping container concepts to virtual machine technologies
|
||||
|
||||
A typical deployment of Kata Containers will be in Kubernetes by way of a Container Runtime Interface (CRI) implementation. On every node,
|
||||
Kubelet will interact with a CRI implementer (such as containerd or CRI-O), which will in turn interface with Kata Containers (an OCI based runtime).
|
||||
Kubelet will interact with a CRI implementor (such as containerd or CRI-O), which will in turn interface with Kata Containers (an OCI based runtime).
|
||||
|
||||
The CRI API, as defined at the [Kubernetes CRI-API repo](https://github.com/kubernetes/cri-api/), implies a few constructs being supported by the
|
||||
CRI implementation, and ultimately in Kata Containers. In order to support the full [API](https://github.com/kubernetes/cri-api/blob/a6f63f369f6d50e9d0886f2eda63d585fbd1ab6a/pkg/apis/runtime/v1alpha2/api.proto#L34-L110) with the CRI-implementer, Kata must provide the following constructs:
|
||||
CRI implementation, and ultimately in Kata Containers. In order to support the full [API](https://github.com/kubernetes/cri-api/blob/a6f63f369f6d50e9d0886f2eda63d585fbd1ab6a/pkg/apis/runtime/v1alpha2/api.proto#L34-L110) with the CRI-implementor, Kata must provide the following constructs:
|
||||
|
||||

|
||||
|
||||
@@ -30,9 +41,14 @@ Each hypervisor or VMM varies on how or if it handles each of these.
|
||||
|
||||
## Kata Containers Hypervisor and VMM support
|
||||
|
||||
Kata Containers [supports multiple hypervisors](../hypervisors.md).
|
||||
Kata Containers is designed to support multiple virtual machine monitors (VMMs) and hypervisors.
|
||||
Kata Containers supports:
|
||||
- [ACRN hypervisor](https://projectacrn.org/)
|
||||
- [Cloud Hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor)/[KVM](https://www.linux-kvm.org/page/Main_Page)
|
||||
- [Firecracker](https://github.com/firecracker-microvm/firecracker)/KVM
|
||||
- [QEMU](http://www.qemu-project.org/)/KVM
|
||||
|
||||
Details of each solution and a summary are provided below.
|
||||
Which configuration to use will depend on the end user's requirements. Details of each solution and a summary are provided below.
|
||||
|
||||
### QEMU/KVM
|
||||
|
||||
@@ -40,13 +56,13 @@ Kata Containers with QEMU has complete compatibility with Kubernetes.
|
||||
|
||||
Depending on the host architecture, Kata Containers supports various machine types,
|
||||
for example `pc` and `q35` on x86 systems, `virt` on ARM systems and `pseries` on IBM Power systems. The default Kata Containers
|
||||
machine type is `q35`. The machine type and its [`Machine accelerators`](#machine-accelerators) can
|
||||
be changed by editing the runtime [`configuration`](architecture/README.md#configuration) file.
|
||||
machine type is `pc`. The machine type and its [`Machine accelerators`](#machine-accelerators) can
|
||||
be changed by editing the runtime [`configuration`](./architecture.md/#configuration) file.
|
||||
|
||||
Devices and features used:
|
||||
- virtio VSOCK or virtio serial
|
||||
- virtio block or virtio SCSI
|
||||
- [virtio net](https://www.redhat.com/en/virtio-networking-series)
|
||||
- virtio net
|
||||
- virtio fs or virtio 9p (recommend: virtio fs)
|
||||
- VFIO
|
||||
- hotplug
|
||||
@@ -89,34 +105,25 @@ Devices used:
|
||||
|
||||
### Cloud Hypervisor/KVM
|
||||
|
||||
[Cloud Hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor), based
|
||||
on [rust-vmm](https://github.com/rust-vmm), is designed to have a
|
||||
lighter footprint and smaller attack surface for running modern cloud
|
||||
workloads. Kata Containers with Cloud
|
||||
Hypervisor provides mostly complete compatibility with Kubernetes
|
||||
comparable to the QEMU configuration. As of the 1.12 and 2.0.0 release
|
||||
of Kata Containers, the Cloud Hypervisor configuration supports both CPU
|
||||
and memory resize, device hotplug (disk and VFIO), file-system sharing through virtio-fs,
|
||||
block-based volumes, booting from VM images backed by pmem device, and
|
||||
fine-grained seccomp filters for each VMM threads (e.g. all virtio
|
||||
device worker threads). Please check [this GitHub Project](https://github.com/orgs/kata-containers/projects/21)
|
||||
for details of ongoing integration efforts.
|
||||
Cloud Hypervisor, based on [rust-VMM](https://github.com/rust-vmm), is designed to have a lighter footprint and attack surface. For Kata Containers,
|
||||
relative to Firecracker, the Cloud Hypervisor configuration provides better compatibility at the expense of exposing additional devices: file system
|
||||
sharing and direct device assignment. As of the 1.10 release of Kata Containers, Cloud Hypervisor does not support device hotplug, and as a result
|
||||
does not support updating container resources after boot, or utilizing block based volumes. While Cloud Hypervisor does support VFIO, Kata is still adding
|
||||
this support. As of 1.10, Kata does not support block based volumes or direct device assignment. See [Cloud Hypervisor device support documentation](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/master/docs/device_model.md)
|
||||
for more details on Cloud Hypervisor.
|
||||
|
||||
Devices and features used:
|
||||
- virtio VSOCK or virtio serial
|
||||
Devices used:
|
||||
- virtio VSOCK
|
||||
- virtio block
|
||||
- virtio net
|
||||
- virtio fs
|
||||
- virtio pmem
|
||||
- VFIO
|
||||
- hotplug
|
||||
- seccomp filters
|
||||
- [HTTP OpenAPI](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/master/vmm/src/api/openapi/cloud-hypervisor.yaml)
|
||||
|
||||
### Summary
|
||||
|
||||
| Solution | release introduced | brief summary |
|
||||
|-|-|-|
|
||||
| Cloud Hypervisor | 1.10 | upstream Cloud Hypervisor with rich feature support, e.g. hotplug, VFIO and FS sharing|
|
||||
| Firecracker | 1.5 | upstream Firecracker, rust-VMM based, no VFIO, no FS sharing, no memory/CPU hotplug |
|
||||
| QEMU | 1.0 | upstream QEMU, with support for hotplug and filesystem sharing |
|
||||
| NEMU | 1.4 | Deprecated, removed as of 1.10 release. Slimmed down fork of QEMU, with experimental support of virtio-fs |
|
||||
| Firecracker | 1.5 | upstream Firecracker, rust-VMM based, no VFIO, no FS sharing, no memory/CPU hotplug |
|
||||
| QEMU-virtio-fs | 1.7 | upstream QEMU with support for virtio-fs. Will be removed once virtio-fs lands in upstream QEMU |
|
||||
| Cloud Hypervisor | 1.10 | rust-VMM based, includes VFIO and FS sharing through virtio-fs, no hotplug |
|
||||
|
||||
@@ -1,11 +1,15 @@
|
||||
# Howto Guides
|
||||
|
||||
## Kubernetes Integration
|
||||
* [Howto Guides](#howto-guides)
|
||||
* [Kubernetes Integration](#kubernetes-integration)
|
||||
* [Hypervisors Integration](#hypervisors-integration)
|
||||
* [Advanced Topics](#advanced-topics)
|
||||
|
||||
## Kubernetes Integration
|
||||
- [Run Kata containers with `crictl`](run-kata-with-crictl.md)
|
||||
- [Run Kata Containers with Kubernetes](run-kata-with-k8s.md)
|
||||
- [How to use Kata Containers and Containerd](containerd-kata.md)
|
||||
- [How to use Kata Containers and CRI (containerd) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
|
||||
- [How to use Kata Containers and CRI (containerd plugin) with Kubernetes](how-to-use-k8s-with-cri-containerd-and-kata.md)
|
||||
- [Kata Containers and service mesh for Kubernetes](service-mesh.md)
|
||||
- [How to import Kata Containers logs into Fluentd](how-to-import-kata-logs-with-fluentd.md)
|
||||
|
||||
@@ -17,13 +21,13 @@
|
||||
- `firecracker`
|
||||
- `ACRN`
|
||||
|
||||
While `qemu` , `cloud-hypervisor` and `firecracker` work out of the box with installation of Kata,
|
||||
some additional configuration is needed in case of `ACRN`.
|
||||
While `qemu` and `cloud-hypervisor` work out of the box with installation of Kata,
|
||||
some additional configuration is needed in case of `firecracker` and `ACRN`.
|
||||
Refer to the following guides for additional configuration steps:
|
||||
- [Kata Containers with Firecracker](https://github.com/kata-containers/documentation/wiki/Initial-release-of-Kata-Containers-with-Firecracker-support)
|
||||
- [Kata Containers with ACRN Hypervisor](how-to-use-kata-containers-with-acrn.md)
|
||||
|
||||
## Advanced Topics
|
||||
|
||||
- [How to use Kata Containers with virtio-fs](how-to-use-virtio-fs-with-kata.md)
|
||||
- [Setting Sysctls with Kata](how-to-use-sysctls-with-kata.md)
|
||||
- [What Is VMCache and How To Enable It](what-is-vm-cache-and-how-do-I-use-it.md)
|
||||
@@ -33,8 +37,3 @@
|
||||
- [How to use Kata Containers with `virtio-mem`](how-to-use-virtio-mem-with-kata.md)
|
||||
- [How to set sandbox Kata Containers configurations with pod annotations](how-to-set-sandbox-config-kata.md)
|
||||
- [How to monitor Kata Containers in K8s](how-to-set-prometheus-in-k8s.md)
|
||||
- [How to use hotplug memory on arm64 in Kata Containers](how-to-hotplug-memory-arm64.md)
|
||||
- [How to setup swap devices in guest kernel](how-to-setup-swap-devices-in-guest-kernel.md)
|
||||
- [How to run rootless vmm](how-to-run-rootless-vmm.md)
|
||||
- [How to run Docker with Kata Containers](how-to-run-docker-with-kata.md)
|
||||
- [How to run Kata Containers with `nydus`](how-to-use-virtio-fs-nydus-with-kata.md)
|
||||
@@ -1,5 +1,23 @@
|
||||
# How to use Kata Containers and Containerd
|
||||
|
||||
- [Concepts](#concepts)
|
||||
- [Kubernetes `RuntimeClass`](#kubernetes-runtimeclass)
|
||||
- [Containerd Runtime V2 API: Shim V2 API](#containerd-runtime-v2-api-shim-v2-api)
|
||||
- [Install](#install)
|
||||
- [Install Kata Containers](#install-kata-containers)
|
||||
- [Install containerd with CRI plugin](#install-containerd-with-cri-plugin)
|
||||
- [Install CNI plugins](#install-cni-plugins)
|
||||
- [Install `cri-tools`](#install-cri-tools)
|
||||
- [Configuration](#configuration)
|
||||
- [Configure containerd to use Kata Containers](#configure-containerd-to-use-kata-containers)
|
||||
- [Kata Containers as a `RuntimeClass`](#kata-containers-as-a-runtimeclass)
|
||||
- [Kata Containers as the runtime for untrusted workload](#kata-containers-as-the-runtime-for-untrusted-workload)
|
||||
- [Kata Containers as the default runtime](#kata-containers-as-the-default-runtime)
|
||||
- [Configuration for `cri-tools`](#configuration-for-cri-tools)
|
||||
- [Run](#run)
|
||||
- [Launch containers with `ctr` command line](#launch-containers-with-ctr-command-line)
|
||||
- [Launch Pods with `crictl` command line](#launch-pods-with-crictl-command-line)
|
||||
|
||||
This document covers the installation and configuration of [containerd](https://containerd.io/)
|
||||
and [Kata Containers](https://katacontainers.io). The containerd provides not only the `ctr`
|
||||
command line tool, but also the [CRI](https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/)
|
||||
@@ -39,7 +57,7 @@ use `RuntimeClass` instead of the deprecated annotations.
|
||||
|
||||
### Containerd Runtime V2 API: Shim V2 API
|
||||
|
||||
The [`containerd-shim-kata-v2` (short as `shimv2` in this documentation)](../../src/runtime/cmd/containerd-shim-kata-v2/)
|
||||
The [`containerd-shim-kata-v2` (short as `shimv2` in this documentation)](../../src/runtime/containerd-shim-v2)
|
||||
implements the [Containerd Runtime V2 (Shim API)](https://github.com/containerd/containerd/tree/master/runtime/v2) for Kata.
|
||||
With `shimv2`, Kubernetes can launch Pod and OCI-compatible containers with one shim per Pod. Prior to `shimv2`, `2N+1`
|
||||
shims (i.e. a `containerd-shim` and a `kata-shim` for each container and the Pod sandbox itself) and no standalone `kata-proxy`
|
||||
@@ -188,7 +206,7 @@ If you use Containerd older than v1.2.4 or a version of Kata older than v1.6.0
|
||||
shell script with the following:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
KATA_CONF_FILE=/etc/kata-containers/firecracker.toml containerd-shim-kata-v2 $@
|
||||
```
|
||||
|
||||
|
||||
@@ -26,7 +26,7 @@ spec:
|
||||
hostNetwork: true
|
||||
containers:
|
||||
- name: kata-monitor
|
||||
image: quay.io/kata-containers/kata-monitor:2.0.0
|
||||
image: docker.io/katadocker/kata-monitor:2.0.0
|
||||
args:
|
||||
- -log-level=debug
|
||||
ports:
|
||||
|
||||
@@ -1,28 +0,0 @@
|
||||
# How to use memory hotplug feature in Kata Containers on arm64
|
||||
|
||||
## Introduction
|
||||
|
||||
Memory hotplug is a key feature for containers to allocate memory dynamically in deployment.
|
||||
As Kata Container bases on VM, this feature needs support both from VMM and guest kernel. Luckily, it has been fully supported for the current default version of QEMU and guest kernel used by Kata on arm64. For other VMMs, e.g, Cloud Hypervisor, the enablement work is on the road. Apart from VMM and guest kernel, memory hotplug also depends on ACPI which depends on firmware either. On x86, you can boot a VM using QEMU with ACPI enabled directly, because it boots up with firmware implicitly. For arm64, however, you need specify firmware explicitly. That is to say, if you are ready to run a normal Kata Container on arm64, what you need extra to do is to install the UEFI ROM before use the memory hotplug feature.
|
||||
|
||||
## Install UEFI ROM
|
||||
|
||||
We have offered a helper script for you to install the UEFI ROM. If you have installed Kata normally on your host, you just need to run the script as fellows:
|
||||
|
||||
```bash
|
||||
$ pushd $GOPATH/src/github.com/kata-containers/tests
|
||||
$ sudo .ci/aarch64/install_rom_aarch64.sh
|
||||
$ popd
|
||||
```
|
||||
|
||||
## Run for test
|
||||
|
||||
Let's test if the memory hotplug is ready for Kata after install the UEFI ROM. Make sure containerd is ready to run Kata before test.
|
||||
|
||||
```bash
|
||||
$ sudo ctr image pull docker.io/library/ubuntu:latest
|
||||
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --rm docker.io/library/ubuntu:latest hello sh -c "free -h"
|
||||
$ sudo ctr run --runtime io.containerd.run.kata.v2 -t --memory-limit 536870912 --rm docker.io/library/ubuntu:latest hello sh -c "free -h"
|
||||
```
|
||||
|
||||
Compare the results between the two tests. If the latter is 0.5G larger than the former, you have done what you want, and congratulation!
|
||||
@@ -1,10 +1,25 @@
|
||||
# Importing Kata Containers logs with Fluentd
|
||||
|
||||
* [Introduction](#introduction)
|
||||
* [Overview](#overview)
|
||||
* [Test stack](#test-stack)
|
||||
* [Importing the logs](#importing-the-logs)
|
||||
* [Direct import `logfmt` from `systemd`](#direct-import-logfmt-from-systemd)
|
||||
* [Configuring `minikube`](#configuring-minikube)
|
||||
* [Pull from `systemd`](#pull-from-systemd)
|
||||
* [Systemd Summary](#systemd-summary)
|
||||
* [Directly importing JSON](#directly-importing-json)
|
||||
* [JSON in files](#json-in-files)
|
||||
* [Prefixing all keys](#prefixing-all-keys)
|
||||
* [Kata `shimv2`](#kata-shimv2)
|
||||
* [Caveats](#caveats)
|
||||
* [Summary](#summary)
|
||||
|
||||
# Introduction
|
||||
|
||||
This document describes how to import Kata Containers logs into [Fluentd](https://www.fluentd.org/),
|
||||
typically for importing into an
|
||||
Elastic/Fluentd/Kibana([EFK](https://github.com/kubernetes-sigs/instrumentation-addons/tree/master/fluentd-elasticsearch#running-efk-stack-in-production))
|
||||
Elastic/Fluentd/Kibana([EFK](https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch#running-efk-stack-in-production))
|
||||
or Elastic/Logstash/Kibana([ELK](https://www.elastic.co/elastic-stack)) stack.
|
||||
|
||||
The majority of this document focusses on CRI-O based (classic) Kata runtime. Much of that information
|
||||
@@ -128,7 +143,7 @@ YAML can be found
|
||||
tag kata-containers
|
||||
path /run/log/journal
|
||||
pos_file /run/log/journal/kata-journald.pos
|
||||
filters [{"SYSLOG_IDENTIFIER": "kata-runtime"}, {"SYSLOG_IDENTIFIER": "kata-shim"}]
|
||||
filters [{"SYSLOG_IDENTIFIER": "kata-runtime"}, {"SYSLOG_IDENTIFIER": "kata-proxy"}, {"SYSLOG_IDENTIFIER": "kata-shim"}]
|
||||
read_from_head true
|
||||
</source>
|
||||
```
|
||||
@@ -146,7 +161,7 @@ generate some Kata specific log entries:
|
||||
|
||||
```bash
|
||||
$ minikube addons open efk
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/packaging/kata-deploy
|
||||
$ cd $GOPATH/src/github.com/kata-containers/packaging/kata-deploy
|
||||
$ kubectl apply -f examples/nginx-deployment-qemu.yaml
|
||||
```
|
||||
|
||||
@@ -163,14 +178,14 @@ sub-filter on, for instance, the `SYSLOG_IDENTIFIER` to differentiate the Kata c
|
||||
on the `PRIORITY` to filter out critical issues etc.
|
||||
|
||||
Kata generates a significant amount of Kata specific information, which can be seen as
|
||||
[`logfmt`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser#logfile-requirements).
|
||||
[`logfmt`](https://github.com/kata-containers/tests/tree/master/cmd/log-parser#logfile-requirements).
|
||||
data contained in the `MESSAGE` field. Imported as-is, there is no easy way to filter on that data
|
||||
in Kibana:
|
||||
|
||||
.
|
||||
|
||||
We can however further sub-parse the Kata entries using the
|
||||
[Fluentd plugins](https://docs.fluentbit.io/manual/pipeline/parsers/logfmt) that will parse
|
||||
[Fluentd plugins](https://docs.fluentbit.io/manual/v/1.3/parser/logfmt) that will parse
|
||||
`logfmt` formatted data. We can utilise these to parse the sub-fields using a Fluentd filter
|
||||
section. At the same time, we will prefix the new fields with `kata_` to make it clear where
|
||||
they have come from:
|
||||
@@ -207,7 +222,7 @@ test to check the parsing works. The resulting output from Fluentd is:
|
||||
"_COMM":"kata-runtime",
|
||||
"_EXE":"/opt/kata/bin/kata-runtime",
|
||||
"SYSLOG_TIMESTAMP":"Feb 21 10:31:27 ",
|
||||
"_CMDLINE":"/opt/kata/bin/kata-runtime --config /opt/kata/share/defaults/kata-containers/configuration-qemu.toml --root /run/runc state 7cdd31660d8705facdadeb8598d2c0bd008e8142c54e3b3069abd392c8d58997",
|
||||
"_CMDLINE":"/opt/kata/bin/kata-runtime --kata-config /opt/kata/share/defaults/kata-containers/configuration-qemu.toml --root /run/runc state 7cdd31660d8705facdadeb8598d2c0bd008e8142c54e3b3069abd392c8d58997",
|
||||
"SYSLOG_PID":"14314",
|
||||
"_PID":"14314",
|
||||
"MESSAGE":"time=\"2020-02-21T10:31:27.810781647Z\" level=info msg=\"release sandbox\" arch=amd64 command=state container=7cdd31660d8705facdadeb8598d2c0bd008e8142c54e3b3069abd392c8d58997 name=kata-runtime pid=14314 sandbox=1c3e77cad66aa2b6d8cc846f818370f79cb0104c0b840f67d0f502fd6562b68c source=virtcontainers subsystem=sandbox",
|
||||
@@ -257,15 +272,16 @@ go directly to a full Kata specific JSON format logfile test.
|
||||
|
||||
Kata runtime has the ability to generate JSON logs directly, rather than its default `logfmt` format. Passing
|
||||
the `--log-format=json` argument to the Kata runtime enables this. The easiest way to pass in this extra
|
||||
parameter from a [Kata deploy](../../tools/packaging/kata-deploy) installation
|
||||
is to edit the `/opt/kata/bin/kata-qemu` shell script.
|
||||
parameter from a [Kata deploy](https://github.com/kata-containers/packaging/tree/master/kata-deploy) installation
|
||||
is to edit the `/opt/kata/bin/kata-qemu` shell script (generated by the
|
||||
[Kata packaging release scripts](https://github.com/kata-containers/packaging/blob/master/release/kata-deploy-binaries.sh)).
|
||||
|
||||
At the same time, we will add the `--log=/var/log/kata-runtime.log` argument to store the Kata logs in their
|
||||
own file (rather than into the system journal).
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
/opt/kata/bin/kata-runtime --config "/opt/kata/share/defaults/kata-containers/configuration-qemu.toml" --log-format=json --log=/var/log/kata-runtime.log $@
|
||||
#!/bin/bash
|
||||
/opt/kata/bin/kata-runtime --kata-config "/opt/kata/share/defaults/kata-containers/configuration-qemu.toml" --log-format=json --log=/var/log/kata-runtime.log $@
|
||||
```
|
||||
|
||||
And then we'll add the Fluentd config section to parse that file. Note, we inform the parser that Kata is
|
||||
|
||||
@@ -56,9 +56,8 @@ There are some limitations with this approach:
|
||||
|
||||
As was mentioned above, not all containers need the same modules, therefore using
|
||||
the configuration file for specifying the list of kernel modules per [POD][3] can
|
||||
be a pain.
|
||||
Unlike the configuration file, [annotations](how-to-set-sandbox-config-kata.md)
|
||||
provide a way to specify custom configurations per POD.
|
||||
be a pain. Unlike the configuration file, annotations provide a way to specify
|
||||
custom configurations per POD.
|
||||
|
||||
The list of kernel modules and parameters can be set using the annotation
|
||||
`io.katacontainers.config.agent.kernel_modules` as a semicolon separated
|
||||
@@ -102,7 +101,7 @@ spec:
|
||||
tty: true
|
||||
```
|
||||
|
||||
> **Note**: To pass annotations to Kata containers, [CRI-O must be configured correctly](how-to-set-sandbox-config-kata.md#cri-o-configuration)
|
||||
> **Note**: To pass annotations to Kata containers, [`CRI` must to be configured correctly](how-to-set-sandbox-config-kata.md#cri-configuration)
|
||||
|
||||
[1]: ../../src/runtime
|
||||
[2]: ../../src/agent
|
||||
|
||||
@@ -1,141 +0,0 @@
|
||||
# How to run Docker in Docker with Kata Containers
|
||||
|
||||
This document describes the why and how behind running Docker in a Kata Container.
|
||||
|
||||
> **Note:** While in other environments this might be described as "Docker in Docker", the new architecture of Kata 2.x means [Docker can no longer be used to create containers using a Kata Containers runtime](https://github.com/kata-containers/kata-containers/issues/722).
|
||||
|
||||
## Requirements
|
||||
|
||||
- A working Kata Containers installation
|
||||
|
||||
## Install and configure Kata Containers
|
||||
|
||||
Follow the [Kata Containers installation guide](../install/README.md) to Install Kata Containers on your Kubernetes cluster.
|
||||
|
||||
## Background
|
||||
|
||||
Docker in Docker ("DinD") is the colloquial name for the ability to run `docker` from inside a container.
|
||||
|
||||
You can learn more about about Docker-in-Docker at the following links:
|
||||
|
||||
- [The original announcement of DinD](https://www.docker.com/blog/docker-can-now-run-within-docker/)
|
||||
- [`docker` image Docker Hub page](https://hub.docker.com/_/docker/) (this page lists the `-dind` releases)
|
||||
|
||||
While normally DinD refers to running `docker` from inside a Docker container,
|
||||
Kata Containers 2.x allows only [supported runtimes][kata-2.x-supported-runtimes] (such as [`containerd`](../install/container-manager/containerd/containerd-install.md)).
|
||||
|
||||
Running `docker` in a Kata Container implies creating Docker containers from inside a container managed by `containerd` (or another supported container manager), as illustrated below:
|
||||
|
||||
```
|
||||
container manager -> Kata Containers shim -> Docker Daemon -> Docker container
|
||||
(containerd) (containerd-shim-kata-v2) (dockerd) (busybox sh)
|
||||
```
|
||||
|
||||
[OverlayFS][OverlayFS] is the preferred storage driver for most container runtimes on Linux ([including Docker](https://docs.docker.com/storage/storagedriver/select-storage-driver)).
|
||||
|
||||
> **Note:** While in the past Kata Containers did not contain the [`overlay` kernel module (aka OverlayFS)][OverlayFS], the kernel modules have been included since the [Kata Containers v2.0.0 release][v2.0.0].
|
||||
|
||||
[OverlayFS]: https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html
|
||||
[v2.0.0]: https://github.com/kata-containers/kata-containers/releases/tag/2.0.0
|
||||
[kata-2.x-supported-runtimes]: ../install/container-manager/containerd/containerd-install.md
|
||||
|
||||
## Why Docker in Kata Containers 2.x requires special measures
|
||||
|
||||
Running Docker containers Kata Containers requires care because `VOLUME`s specified in `Dockerfile`s run by Kata Containers are given the `kataShared` mount type by default, which applies to the root directory `/`:
|
||||
|
||||
```console
|
||||
/ # mount
|
||||
kataShared on / type virtiofs (rw,relatime,dax)
|
||||
```
|
||||
|
||||
`kataShared` mount types are powered by [`virtio-fs`][virtio-fs], a marked improvement over `virtio-9p`, thanks to [PR #1016](https://github.com/kata-containers/runtime/pull/1016). While `virtio-fs` is normally an excellent choice, in the case of DinD workloads `virtio-fs` causes an issue -- [it *cannot* be used as a "upper layer" of `overlayfs` without a custom patch](http://lists.katacontainers.io/pipermail/kata-dev/2020-January/001216.html).
|
||||
|
||||
As `/var/lib/docker` is a `VOLUME` specified by DinD (i.e. the `docker` images tagged `*-dind`/`*-dind-rootless`), `docker` fill fail to start (or even worse, silently pick a worse storage driver like `vfs`) when started in a Kata Container. Special measures must be taken when running DinD-powered workloads in Kata Containers.
|
||||
|
||||
## Workarounds/Solutions
|
||||
|
||||
Thanks to various community contributions (see [issue references below](#references)) the following options, with various trade-offs have been uncovered:
|
||||
|
||||
### Use a memory backed volume
|
||||
|
||||
For small workloads (small container images, without much generated filesystem load), a memory-backed volume is sufficient. Kubernetes supports a variant of [the `EmptyDir` volume][k8s-emptydir], which allows for memdisk-backed storage -- the [the `medium: Memory` ][k8s-memory-volume-type]. An example of a `Pod` using such a setup [was contributed](https://github.com/kata-containers/runtime/issues/1429#issuecomment-477385283), and is reproduced below:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: dind
|
||||
spec:
|
||||
runtimeClassName: kata
|
||||
containers:
|
||||
- name: dind
|
||||
securityContext:
|
||||
privileged: true
|
||||
image: docker:20.10-dind
|
||||
args: ["--storage-driver=overlay2"]
|
||||
resources:
|
||||
limits:
|
||||
memory: "3G"
|
||||
volumeMounts:
|
||||
- mountPath: /var/run/
|
||||
name: dockersock
|
||||
- mountPath: /var/lib/docker
|
||||
name: docker
|
||||
volumes:
|
||||
- name: dockersock
|
||||
emptyDir: {}
|
||||
- name: docker
|
||||
emptyDir:
|
||||
medium: Memory
|
||||
```
|
||||
|
||||
Inside the container you can view the mount:
|
||||
|
||||
```console
|
||||
/ # mount | grep lib\/docker
|
||||
tmpfs on /var/lib/docker type tmpfs (rw,relatime)
|
||||
```
|
||||
|
||||
As is mentioned in the comment encapsulating this code, using volatile memory for container storage backing is a risky and could be possibly wasteful on machines that do not have a lot of RAM.
|
||||
|
||||
### Use a loop mounted disk
|
||||
|
||||
Using a loop mounted disk that is provisioned shortly before starting of the container workload is another approach that yields good performance.
|
||||
|
||||
Contributors provided [an example in issue #1888](https://github.com/kata-containers/runtime/issues/1888#issuecomment-739057384), which is reproduced in part below:
|
||||
|
||||
```yaml
|
||||
spec:
|
||||
containers:
|
||||
- name: docker
|
||||
image: docker:20.10-dind
|
||||
command: ["sh", "-c"]
|
||||
args:
|
||||
- if [[ $(df -PT /var/lib/docker | awk 'NR==2 {print $2}') == virtiofs ]]; then
|
||||
apk add e2fsprogs &&
|
||||
truncate -s 20G /tmp/disk.img &&
|
||||
mkfs.ext4 /tmp/disk.img &&
|
||||
mount /tmp/disk.img /var/lib/docker; fi &&
|
||||
dockerd-entrypoint.sh;
|
||||
securityContext:
|
||||
privileged: true
|
||||
```
|
||||
|
||||
Note that loop mounted disks are often sparse, which means they *do not* take up the full amount of space that has been provisioned. This solution seems to produce the best performance and flexibility, at the expense of increased complexity and additional required setup.
|
||||
|
||||
### Build a custom kernel
|
||||
|
||||
It's possible to [modify the kernel](https://github.com/kata-containers/runtime/issues/1888#issuecomment-616872558) (in addition to applying the earlier mentioned mailing list patch) to support using `virtio-fs` as an upper. Note that if you modify your kernel and use `virtio-fs` you may require [additional changes](https://github.com/kata-containers/runtime/issues/1888#issuecomment-739057384) for decent performance and to address other issues.
|
||||
|
||||
> **NOTE:** A future kernel release may rectify the usability and performance issues of using `virtio-fs` as an OverlayFS upper layer.
|
||||
|
||||
## References
|
||||
|
||||
The solutions proposed in this document are an amalgamation of thoughtful contributions from the Kata Containers community.
|
||||
|
||||
Find links to issues & related discussion and the fruits therein below:
|
||||
|
||||
- [How to run Docker in Docker with Kata Containers (#2474)](https://github.com/kata-containers/kata-containers/issues/2474)
|
||||
- [Does Kata-container support AUFS/OverlayFS? (#2493)](https://github.com/kata-containers/runtime/issues/2493)
|
||||
- [Unable to start docker in docker with virtio-fs (#1888)](https://github.com/kata-containers/runtime/issues/1888)
|
||||
- [Not using native diff for overlay2 (#1429)](https://github.com/kata-containers/runtime/issues/1429)
|
||||
@@ -1,33 +0,0 @@
|
||||
## Introduction
|
||||
To improve security, Kata Container supports running the VMM process (currently only QEMU) as a non-`root` user.
|
||||
This document describes how to enable the rootless VMM mode and its limitations.
|
||||
|
||||
## Pre-requisites
|
||||
The permission and ownership of the `kvm` device node (`/dev/kvm`) need to be configured to:
|
||||
```
|
||||
$ crw-rw---- 1 root kvm
|
||||
```
|
||||
use the following commands:
|
||||
```
|
||||
$ sudo groupadd kvm -r
|
||||
$ sudo chown root:kvm /dev/kvm
|
||||
$ sudo chmod 660 /dev/kvm
|
||||
```
|
||||
|
||||
## Configure rootless VMM
|
||||
By default, the VMM process still runs as the root user. There are two ways to enable rootless VMM:
|
||||
1. Set the `rootless` flag to `true` in the hypervisor section of `configuration.toml`.
|
||||
2. Set the Kubernetes annotation `io.katacontainers.hypervisor.rootless` to `true`.
|
||||
|
||||
## Implementation details
|
||||
When `rootless` flag is enabled, upon a request to create a Pod, Kata Containers runtime creates a random user and group (e.g. `kata-123`), and uses them to start the hypervisor process.
|
||||
The `kvm` group is also given to the hypervisor process as a supplemental group to give the hypervisor process access to the `/dev/kvm` device.
|
||||
Another necessary change is to move the hypervisor runtime files (e.g. `vhost-fs.sock`, `qmp.sock`) to a directory (under `/run/user/[uid]/`) where only the non-root hypervisor has access to.
|
||||
|
||||
## Limitations
|
||||
|
||||
1. Only the VMM process is running as a non-root user. Other processes such as Kata Container shimv2 and `virtiofsd` still run as the root user.
|
||||
2. Currently, this feature is only supported in QEMU. Still need to bring it to Firecracker and Cloud Hypervisor (see https://github.com/kata-containers/kata-containers/issues/2567).
|
||||
3. Certain features will not work when rootless VMM is enabled, including:
|
||||
1. Passing devices to the guest (`virtio-blk`, `virtio-scsi`) will not work if the non-privileged user does not have permission to access it (leading to a permission denied error). A more permissive permission (e.g. 666) may overcome this issue. However, you need to be aware of the potential security implications of reducing the security on such devices.
|
||||
2. `vfio` device will also not work because of permission denied error.
|
||||
@@ -2,6 +2,14 @@
|
||||
|
||||
This document describes how to run `kata-monitor` in a Kubernetes cluster using Prometheus's service discovery to scrape metrics from `kata-agent`.
|
||||
|
||||
- [Introduction](#introduction)
|
||||
- [Pre-requisites](#pre-requisites)
|
||||
- [Configure Prometheus](#configure-prometheus)
|
||||
- [Configure `kata-monitor`](#configure-kata-monitor)
|
||||
- [Setup Grafana](#setup-grafana)
|
||||
* [Create `datasource`](#create-datasource)
|
||||
* [Import dashboard](#import-dashboard)
|
||||
|
||||
> **Warning**: This how-to is only for evaluation purpose, you **SHOULD NOT** running it in production using this configurations.
|
||||
|
||||
## Introduction
|
||||
@@ -26,7 +34,7 @@ Also you should ensure that `kubectl` working correctly.
|
||||
Start Prometheus by utilizing our sample manifest:
|
||||
|
||||
```
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/docs/how-to/data/prometheus.yml
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/2.0-dev/docs/how-to/data/prometheus.yml
|
||||
```
|
||||
|
||||
This will create a new namespace, `prometheus`, and create the following resources:
|
||||
@@ -52,7 +60,7 @@ go_gc_duration_seconds{quantile="0.75"} 0.000229911
|
||||
`kata-monitor` can be started on the cluster as follows:
|
||||
|
||||
```
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/docs/how-to/data/kata-monitor-daemonset.yml
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/2.0-dev/docs/how-to/data/kata-monitor-daemonset.yml
|
||||
```
|
||||
|
||||
This will create a new namespace `kata-system` and a `daemonset` in it.
|
||||
@@ -65,7 +73,7 @@ Once the `daemonset` is running, Prometheus should discover `kata-monitor` as a
|
||||
Run this command to run Grafana in Kubernetes:
|
||||
|
||||
```
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/docs/how-to/data/grafana.yml
|
||||
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/2.0-dev/docs/how-to/data/grafana.yml
|
||||
```
|
||||
|
||||
This will create deployment and service for Grafana under namespace `prometheus`.
|
||||
@@ -91,7 +99,7 @@ You can import this dashboard using Grafana UI, or using `curl` command in conso
|
||||
$ curl -XPOST -i localhost:3000/api/dashboards/import \
|
||||
-u admin:admin \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"dashboard\":$(curl -sL https://raw.githubusercontent.com/kata-containers/kata-containers/main/docs/how-to/data/dashboard.json )}"
|
||||
-d "{\"dashboard\":$(curl -sL https://raw.githubusercontent.com/kata-containers/kata-containers/2.0-dev/docs/how-to/data/dashboard.json )}"
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
@@ -3,11 +3,6 @@
|
||||
Kata Containers gives users freedom to customize at per-pod level, by setting
|
||||
a wide range of Kata specific annotations in the pod specification.
|
||||
|
||||
Some annotations may be [restricted](#restricted-annotations) by the
|
||||
configuration file for security reasons, notably annotations that could lead the
|
||||
runtime to execute programs on the host. Such annotations are marked with _(R)_ in
|
||||
the tables below.
|
||||
|
||||
# Kata Configuration Annotations
|
||||
There are several kinds of Kata configurations and they are listed below.
|
||||
|
||||
@@ -26,14 +21,14 @@ There are several kinds of Kata configurations and they are listed below.
|
||||
| `io.katacontainers.config.runtime.disable_new_netns` | `boolean` | determines if a new netns is created for the hypervisor process |
|
||||
| `io.katacontainers.config.runtime.internetworking_model` | string| determines how the VM should be connected to the container network interface. Valid values are `macvtap`, `tcfilter` and `none` |
|
||||
| `io.katacontainers.config.runtime.sandbox_cgroup_only`| `boolean` | determines if Kata processes are managed only in sandbox cgroup |
|
||||
| `io.katacontainers.config.runtime.enable_pprof` | `boolean` | enables Golang `pprof` for `containerd-shim-kata-v2` process |
|
||||
|
||||
## Agent Options
|
||||
| Key | Value Type | Comments |
|
||||
|-------| ----- | ----- |
|
||||
| `io.katacontainers.config.agent.enable_tracing` | `boolean` | enable tracing for the agent |
|
||||
| `io.katacontainers.config.agent.container_pipe_size` | uint32 | specify the size of the std(in/out) pipes created for containers |
|
||||
| `io.katacontainers.config.agent.kernel_modules` | string | the list of kernel modules and their parameters that will be loaded in the guest kernel. Semicolon separated list of kernel modules and their parameters. These modules will be loaded in the guest kernel using `modprobe`(8). E.g., `e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0` |
|
||||
| `io.katacontainers.config.agent.trace_mode` | string | the trace mode for the agent |
|
||||
| `io.katacontainers.config.agent.trace_type` | string | the trace type for the agent |
|
||||
|
||||
## Hypervisor Options
|
||||
| Key | Value Type | Comments |
|
||||
@@ -43,27 +38,19 @@ There are several kinds of Kata configurations and they are listed below.
|
||||
| `io.katacontainers.config.hypervisor.block_device_cache_noflush` | `boolean` | Denotes whether flush requests for the device are ignored |
|
||||
| `io.katacontainers.config.hypervisor.block_device_cache_set` | `boolean` | cache-related options will be set to block devices or not |
|
||||
| `io.katacontainers.config.hypervisor.block_device_driver` | string | the driver to be used for block device, valid values are `virtio-blk`, `virtio-scsi`, `nvdimm`|
|
||||
| `io.katacontainers.config.hypervisor.cpu_features` | `string` | Comma-separated list of CPU features to pass to the CPU (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.ctlpath` (R) | `string` | Path to the `acrnctl` binary for the ACRN hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.default_max_vcpus` | uint32| the maximum number of vCPUs allocated for the VM by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.default_memory` | uint32| the memory assigned for a VM by the hypervisor in `MiB` |
|
||||
| `io.katacontainers.config.hypervisor.default_vcpus` | uint32| the default vCPUs assigned for a VM by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.disable_block_device_use` | `boolean` | disallow a block device from being used |
|
||||
| `io.katacontainers.config.hypervisor.disable_image_nvdimm` | `boolean` | specify if a `nvdimm` device should be used as rootfs for the guest (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.disable_vhost_net` | `boolean` | specify if `vhost-net` is not available on the host |
|
||||
| `io.katacontainers.config.hypervisor.enable_hugepages` | `boolean` | if the memory should be `pre-allocated` from huge pages |
|
||||
| `io.katacontainers.config.hypervisor.enable_iommu_platform` | `boolean` | enable `iommu` on CCW devices (QEMU s390x) |
|
||||
| `io.katacontainers.config.hypervisor.enable_iommu` | `boolean` | enable `iommu` on Q35 (QEMU x86_64) |
|
||||
| `io.katacontainers.config.hypervisor.enable_iothreads` | `boolean`| enable IO to be processed in a separate thread. Supported currently for virtio-`scsi` driver |
|
||||
| `io.katacontainers.config.hypervisor.enable_mem_prealloc` | `boolean` | the memory space used for `nvdimm` device by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.enable_vhost_user_store` | `boolean` | enable vhost-user storage device (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.enable_virtio_mem` | `boolean` | enable virtio-mem (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.entropy_source` (R) | string| the path to a host source of entropy (`/dev/random`, `/dev/urandom` or real hardware RNG device) |
|
||||
| `io.katacontainers.config.hypervisor.file_mem_backend` (R) | string | file based memory backend root directory |
|
||||
| `io.katacontainers.config.hypervisor.enable_swap` | `boolean` | enable swap of VM memory |
|
||||
| `io.katacontainers.config.hypervisor.entropy_source` | string| the path to a host source of entropy (`/dev/random`, `/dev/urandom` or real hardware RNG device) |
|
||||
| `io.katacontainers.config.hypervisor.file_mem_backend` | string | file based memory backend root directory |
|
||||
| `io.katacontainers.config.hypervisor.firmware_hash` | string | container firmware SHA-512 hash value |
|
||||
| `io.katacontainers.config.hypervisor.firmware` | string | the guest firmware that will run the container VM |
|
||||
| `io.katacontainers.config.hypervisor.firmware_volume_hash` | string | container firmware volume SHA-512 hash value |
|
||||
| `io.katacontainers.config.hypervisor.firmware_volume` | string | the guest firmware volume that will be passed to the container VM |
|
||||
| `io.katacontainers.config.hypervisor.guest_hook_path` | string | the path within the VM that will be used for drop in hooks |
|
||||
| `io.katacontainers.config.hypervisor.hotplug_vfio_on_root_bus` | `boolean` | indicate if devices need to be hotplugged on the root bus instead of a bridge|
|
||||
| `io.katacontainers.config.hypervisor.hypervisor_hash` | string | container hypervisor binary SHA-512 hash value |
|
||||
@@ -72,33 +59,24 @@ There are several kinds of Kata configurations and they are listed below.
|
||||
| `io.katacontainers.config.hypervisor.initrd_hash` | string | container guest initrd SHA-512 hash value |
|
||||
| `io.katacontainers.config.hypervisor.initrd` | string | the guest initrd image that will run in the container VM |
|
||||
| `io.katacontainers.config.hypervisor.jailer_hash` | string | container jailer SHA-512 hash value |
|
||||
| `io.katacontainers.config.hypervisor.jailer_path` (R) | string | the jailer that will constrain the container VM |
|
||||
| `io.katacontainers.config.hypervisor.jailer_path` | string | the jailer that will constrain the container VM |
|
||||
| `io.katacontainers.config.hypervisor.kernel_hash` | string | container kernel image SHA-512 hash value |
|
||||
| `io.katacontainers.config.hypervisor.kernel_params` | string | additional guest kernel parameters |
|
||||
| `io.katacontainers.config.hypervisor.kernel` | string | the kernel used to boot the container VM |
|
||||
| `io.katacontainers.config.hypervisor.machine_accelerators` | string | machine specific accelerators for the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.machine_type` | string | the type of machine being emulated by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.memory_offset` | uint64| the memory space used for `nvdimm` device by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.memory_offset` | uint32| the memory space used for `nvdimm` device by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.memory_slots` | uint32| the memory slots assigned to the VM by the hypervisor |
|
||||
| `io.katacontainers.config.hypervisor.msize_9p` | uint32 | the `msize` for 9p shares |
|
||||
| `io.katacontainers.config.hypervisor.path` | string | the hypervisor that will run the container VM |
|
||||
| `io.katacontainers.config.hypervisor.pcie_root_port` | specify the number of PCIe Root Port devices. The PCIe Root Port device is used to hot-plug a PCIe device (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.shared_fs` | string | the shared file system type, either `virtio-9p` or `virtio-fs` |
|
||||
| `io.katacontainers.config.hypervisor.use_vsock` | `boolean` | specify use of `vsock` for agent communication |
|
||||
| `io.katacontainers.config.hypervisor.vhost_user_store_path` (R) | `string` | specify the directory path where vhost-user devices related folders, sockets and device nodes should be (QEMU) |
|
||||
| `io.katacontainers.config.hypervisor.virtio_fs_cache_size` | uint32 | virtio-fs DAX cache size in `MiB` |
|
||||
| `io.katacontainers.config.hypervisor.virtio_fs_cache` | string | the cache mode for virtio-fs, valid values are `always`, `auto` and `none` |
|
||||
| `io.katacontainers.config.hypervisor.virtio_fs_daemon` | string | virtio-fs `vhost-user` daemon path |
|
||||
| `io.katacontainers.config.hypervisor.virtio_fs_extra_args` | string | extra options passed to `virtiofs` daemon |
|
||||
| `io.katacontainers.config.hypervisor.enable_guest_swap` | `boolean` | enable swap in the guest |
|
||||
|
||||
## Container Options
|
||||
| Key | Value Type | Comments |
|
||||
|-------| ----- | ----- |
|
||||
| `io.katacontainers.container.resource.swappiness"` | `uint64` | specify the `Resources.Memory.Swappiness` |
|
||||
| `io.katacontainers.container.resource.swap_in_bytes"` | `uint64` | specify the `Resources.Memory.Swap` |
|
||||
|
||||
# CRI-O Configuration
|
||||
# CRI Configuration
|
||||
|
||||
In case of CRI-O, all annotations specified in the pod spec are passed down to Kata.
|
||||
|
||||
@@ -106,12 +84,11 @@ In case of CRI-O, all annotations specified in the pod spec are passed down to K
|
||||
|
||||
For containerd, annotations specified in the pod spec are passed down to Kata
|
||||
starting with version `1.3.0` of containerd. Additionally, extra configuration is
|
||||
needed for containerd, by providing `pod_annotations` field and
|
||||
`container_annotations` field in the containerd config
|
||||
file. The `pod_annotations` field and `container_annotations` field are two lists of
|
||||
annotations that can be passed down to Kata as OCI annotations. They support golang match
|
||||
patterns. Since annotations supported by Kata follow the pattern `io.katacontainers.*`,
|
||||
the following configuration would work for passing annotations to Kata from containerd:
|
||||
needed for containerd, by providing a `pod_annotations` field in the containerd config
|
||||
file. The `pod_annotations` field is a list of annotations that can be passed down to
|
||||
Kata as OCI annotations. It supports golang match patterns. Since annotations supported
|
||||
by Kata follow the pattern `io.katacontainers.*`, the following configuration would work
|
||||
for passing annotations to Kata from containerd:
|
||||
|
||||
```
|
||||
$ cat /etc/containerd/config
|
||||
@@ -120,12 +97,11 @@ $ cat /etc/containerd/config
|
||||
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
|
||||
runtime_type = "io.containerd.kata.v2"
|
||||
pod_annotations = ["io.katacontainers.*"]
|
||||
container_annotations = ["io.katacontainers.*"]
|
||||
....
|
||||
|
||||
```
|
||||
|
||||
Additional documentation on the above configuration can be found in the
|
||||
Additional documentation on the above configuration can be found in the
|
||||
[containerd docs](https://github.com/containerd/cri/blob/8d5a8355d07783ba2f8f451209f6bdcc7c412346/docs/config.md).
|
||||
|
||||
# Example - Using annotations
|
||||
@@ -183,32 +159,3 @@ spec:
|
||||
stdin: true
|
||||
tty: true
|
||||
```
|
||||
|
||||
# Restricted annotations
|
||||
|
||||
Some annotations are _restricted_, meaning that the configuration file specifies
|
||||
the acceptable values. Currently, only hypervisor annotations are restricted,
|
||||
for security reason, with the intent to control which binaries the Kata
|
||||
Containers runtime will launch on your behalf.
|
||||
|
||||
The configuration file validates the annotation _name_ as well as the annotation
|
||||
_value_.
|
||||
|
||||
The acceptable annotation names are defined by the `enable_annotations` entry in
|
||||
the configuration file.
|
||||
|
||||
For restricted annotations, an additional configuration entry provides a list of
|
||||
acceptable values. Since most restricted annotations are intended to control
|
||||
which binaries the runtime can execute, the valid value is generally provided by
|
||||
a shell pattern, as defined by `glob(3)`. The table below provides the name of
|
||||
the configuration entry:
|
||||
|
||||
| Key | Config file entry | Comments |
|
||||
|-------| ----- | ----- |
|
||||
| `ctlpath` | `valid_ctlpaths` | Valid paths for `acrnctl` binary |
|
||||
| `entropy_source` | `valid_entropy_sources` | Valid entropy sources, e.g. `/dev/random` |
|
||||
| `file_mem_backend` | `valid_file_mem_backends` | Valid locations for the file-based memory backend root directory |
|
||||
| `jailer_path` | `valid_jailer_paths`| Valid paths for the jailer constraining the container VM (Firecracker) |
|
||||
| `path` | `valid_hypervisor_paths` | Valid hypervisors to run the container VM |
|
||||
| `vhost_user_store_path` | `valid_vhost_user_store_paths` | Valid paths for vhost-user related files|
|
||||
| `virtio_fs_daemon` | `valid_virtio_fs_daemon_paths` | Valid paths for the `virtiofsd` daemon |
|
||||
|
||||
@@ -1,59 +0,0 @@
|
||||
# Setup swap device in guest kernel
|
||||
|
||||
## Introduction
|
||||
|
||||
Setup swap device in guest kernel can help to increase memory capacity, handle some memory issues and increase file access speed sometimes.
|
||||
Kata Containers can insert a raw file to the guest as the swap device.
|
||||
|
||||
## Requisites
|
||||
|
||||
The swap config of the containers should be set by [annotations](how-to-set-sandbox-config-kata.md#container-options). So [extra configuration is needed for containerd](how-to-set-sandbox-config-kata.md#containerd-configuration).
|
||||
|
||||
Kata Containers just supports setup swap device in guest kernel with QEMU.
|
||||
Install and setup Kata Containers as shown [here](../install/README.md).
|
||||
|
||||
Enable setup swap device in guest kernel as follows:
|
||||
```
|
||||
$ sudo sed -i -e 's/^#enable_guest_swap.*$/enable_guest_swap = true/g' /etc/kata-containers/configuration.toml
|
||||
```
|
||||
|
||||
## Run a Kata Container utilizing swap device
|
||||
|
||||
Use following command to start a Kata Container with swappiness 60 and 1GB swap device (swap_in_bytes - memory_limit_in_bytes).
|
||||
```
|
||||
$ pod_yaml=pod.yaml
|
||||
$ container_yaml=container.yaml
|
||||
$ image="quay.io/prometheus/busybox:latest"
|
||||
$ cat << EOF > "${pod_yaml}"
|
||||
metadata:
|
||||
name: busybox-sandbox1
|
||||
EOF
|
||||
$ cat << EOF > "${container_yaml}"
|
||||
metadata:
|
||||
name: busybox-test-swap
|
||||
annotations:
|
||||
io.katacontainers.container.resource.swappiness: "60"
|
||||
io.katacontainers.container.resource.swap_in_bytes: "2147483648"
|
||||
linux:
|
||||
resources:
|
||||
memory_limit_in_bytes: 1073741824
|
||||
image:
|
||||
image: "$image"
|
||||
command:
|
||||
- top
|
||||
EOF
|
||||
$ sudo crictl pull $image
|
||||
$ podid=$(sudo crictl runp $pod_yaml)
|
||||
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
|
||||
$ sudo crictl start $cid
|
||||
```
|
||||
|
||||
Kata Container setups swap device for this container only when `io.katacontainers.container.resource.swappiness` is set.
|
||||
|
||||
The following table shows the swap size how to decide if `io.katacontainers.container.resource.swappiness` is set.
|
||||
|`io.katacontainers.container.resource.swap_in_bytes`|`memory_limit_in_bytes`|swap size|
|
||||
|---|---|---|
|
||||
|set|set| `io.katacontainers.container.resource.swap_in_bytes` - `memory_limit_in_bytes`|
|
||||
|not set|set| `memory_limit_in_bytes`|
|
||||
|not set|not set| `io.katacontainers.config.hypervisor.default_memory`|
|
||||
|set|not set|cgroup doesn't support this usage|
|
||||
@@ -1,9 +1,22 @@
|
||||
# How to use Kata Containers and CRI (containerd plugin) with Kubernetes
|
||||
|
||||
* [Requirements](#requirements)
|
||||
* [Install and configure containerd](#install-and-configure-containerd)
|
||||
* [Install and configure Kubernetes](#install-and-configure-kubernetes)
|
||||
* [Install Kubernetes](#install-kubernetes)
|
||||
* [Configure Kubelet to use containerd](#configure-kubelet-to-use-containerd)
|
||||
* [Configure HTTP proxy - OPTIONAL](#configure-http-proxy---optional)
|
||||
* [Start Kubernetes](#start-kubernetes)
|
||||
* [Configure Pod Network](#configure-pod-network)
|
||||
* [Allow pods to run in the master node](#allow-pods-to-run-in-the-master-node)
|
||||
* [Create runtime class for Kata Containers](#create-runtime-class-for-kata-containers)
|
||||
* [Run pod in Kata Containers](#run-pod-in-kata-containers)
|
||||
* [Delete created pod](#delete-created-pod)
|
||||
|
||||
This document describes how to set up a single-machine Kubernetes (k8s) cluster.
|
||||
|
||||
The Kubernetes cluster will use the
|
||||
[CRI containerd](https://github.com/containerd/containerd/) and
|
||||
[CRI containerd plugin](https://github.com/containerd/cri) and
|
||||
[Kata Containers](https://katacontainers.io) to launch untrusted workloads.
|
||||
|
||||
## Requirements
|
||||
@@ -71,12 +84,12 @@ $ for service in ${services}; do
|
||||
service_dir="/etc/systemd/system/${service}.service.d/"
|
||||
sudo mkdir -p ${service_dir}
|
||||
|
||||
cat << EOF | sudo tee "${service_dir}/proxy.conf"
|
||||
cat << EOT | sudo tee "${service_dir}/proxy.conf"
|
||||
[Service]
|
||||
Environment="HTTP_PROXY=${http_proxy}"
|
||||
Environment="HTTPS_PROXY=${https_proxy}"
|
||||
Environment="NO_PROXY=${no_proxy}"
|
||||
EOF
|
||||
EOT
|
||||
done
|
||||
|
||||
$ sudo systemctl daemon-reload
|
||||
@@ -154,7 +167,7 @@ From Kubernetes v1.12, users can use [`RuntimeClass`](https://kubernetes.io/docs
|
||||
|
||||
```bash
|
||||
$ cat > runtime.yaml <<EOF
|
||||
apiVersion: node.k8s.io/v1
|
||||
apiVersion: node.k8s.io/v1beta1
|
||||
kind: RuntimeClass
|
||||
metadata:
|
||||
name: kata
|
||||
@@ -172,7 +185,7 @@ If a pod has the `runtimeClassName` set to `kata`, the CRI plugin runs the pod w
|
||||
- Create an pod configuration that using Kata Containers runtime
|
||||
|
||||
```bash
|
||||
$ cat << EOF | tee nginx-kata.yaml
|
||||
$ cat << EOT | tee nginx-kata.yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
@@ -183,7 +196,7 @@ If a pod has the `runtimeClassName` set to `kata`, the CRI plugin runs the pod w
|
||||
- name: nginx
|
||||
image: nginx
|
||||
|
||||
EOF
|
||||
EOT
|
||||
```
|
||||
|
||||
- Create the pod
|
||||
|
||||
@@ -2,6 +2,11 @@
|
||||
|
||||
This document provides an overview on how to run Kata containers with ACRN hypervisor and device model.
|
||||
|
||||
- [Introduction](#introduction)
|
||||
- [Pre-requisites](#pre-requisites)
|
||||
- [Configure Docker](#configure-docker)
|
||||
- [Configure Kata Containers with ACRN](#configure-kata-containers-with-acrn)
|
||||
|
||||
## Introduction
|
||||
|
||||
ACRN is a flexible, lightweight Type-1 reference hypervisor built with real-time and safety-criticality in mind. ACRN uses an open source platform making it optimized to streamline embedded development.
|
||||
@@ -22,7 +27,7 @@ This document requires the presence of the ACRN hypervisor and Kata Containers o
|
||||
|
||||
- ACRN supported [Hardware](https://projectacrn.github.io/latest/hardware.html#supported-hardware).
|
||||
> **Note:** Please make sure to have a minimum of 4 logical processors (HT) or cores.
|
||||
- ACRN [software](https://projectacrn.github.io/latest/tutorials/run_kata_containers.html) setup.
|
||||
- ACRN [software](https://projectacrn.github.io/latest/tutorials/kbl-nuc-sdc.html#use-the-script-to-set-up-acrn-automatically) setup.
|
||||
- For networking, ACRN supports either MACVTAP or TAP. If MACVTAP is not enabled in the Service OS, please follow the below steps to update the kernel:
|
||||
|
||||
```sh
|
||||
@@ -86,7 +91,7 @@ To configure Kata Containers with ACRN, copy the generated `configuration-acrn.t
|
||||
The following command shows full paths to the `configuration.toml` files that the runtime loads. It will use the first path that exists. (Please make sure the kernel and image paths are set correctly in the `configuration.toml` file)
|
||||
|
||||
```bash
|
||||
$ sudo kata-runtime --show-default-config-paths
|
||||
$ sudo kata-runtime --kata-show-default-config-paths
|
||||
```
|
||||
|
||||
>**Warning:** Please offline CPUs using [this](offline_cpu.sh) script, else VM launches will fail.
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
# Setting Sysctls with Kata
|
||||
|
||||
## Sysctls
|
||||
|
||||
In Linux, the sysctl interface allows an administrator to modify kernel
|
||||
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
|
||||
process file system.
|
||||
@@ -17,10 +16,11 @@ To get a complete list of kernel parameters, run:
|
||||
$ sudo sysctl -a
|
||||
```
|
||||
|
||||
Kubernetes provide mechanisms for setting namespaced sysctls.
|
||||
Namespaced sysctls can be set per pod in the case of Kubernetes.
|
||||
Both Docker and Kubernetes provide mechanisms for setting namespaced sysctls.
|
||||
Namespaced sysctls can be set per pod in the case of Kubernetes or per container
|
||||
in case of Docker.
|
||||
The following sysctls are known to be namespaced and can be set with
|
||||
Kubernetes:
|
||||
Docker and Kubernetes:
|
||||
|
||||
- `kernel.shm*`
|
||||
- `kernel.msg*`
|
||||
@@ -30,10 +30,31 @@ Kubernetes:
|
||||
|
||||
### Namespaced Sysctls:
|
||||
|
||||
Kata Containers supports setting namespaced sysctls with Kubernetes.
|
||||
Kata Containers supports setting namespaced sysctls with Docker and Kubernetes.
|
||||
All namespaced sysctls can be set in the same way as regular Linux based
|
||||
containers, the difference being, in the case of Kata they are set inside the guest.
|
||||
|
||||
#### Setting Namespaced Sysctls with Docker:
|
||||
|
||||
```
|
||||
$ sudo docker run --runtime=kata-runtime -it alpine cat /proc/sys/fs/mqueue/queues_max
|
||||
256
|
||||
$ sudo docker run --runtime=kata-runtime --sysctl fs.mqueue.queues_max=512 -it alpine cat /proc/sys/fs/mqueue/queues_max
|
||||
512
|
||||
```
|
||||
|
||||
... and:
|
||||
|
||||
```
|
||||
$ sudo docker run --runtime=kata-runtime -it alpine cat /proc/sys/kernel/shmmax
|
||||
18446744073692774399
|
||||
$ sudo docker run --runtime=kata-runtime --sysctl kernel.shmmax=1024 -it alpine cat /proc/sys/kernel/shmmax
|
||||
1024
|
||||
```
|
||||
|
||||
For additional documentation on setting sysctls with Docker please refer to [Docker-sysctl-doc](https://docs.docker.com/engine/reference/commandline/run/#configure-namespaced-kernel-parameters-sysctls-at-runtime).
|
||||
|
||||
|
||||
#### Setting Namespaced Sysctls with Kubernetes:
|
||||
|
||||
Kubernetes considers certain sysctls as safe and others as unsafe. For detailed
|
||||
@@ -79,7 +100,7 @@ spec:
|
||||
|
||||
### Non-Namespaced Sysctls:
|
||||
|
||||
Kubernetes disallow sysctls without a namespace.
|
||||
Docker and Kubernetes disallow sysctls without a namespace.
|
||||
The recommendation is to set them directly on the host or use a privileged
|
||||
container in the case of Kubernetes.
|
||||
|
||||
|
||||
@@ -1,57 +0,0 @@
|
||||
# Kata Containers with virtio-fs-nydus
|
||||
|
||||
## Introduction
|
||||
|
||||
Refer to [kata-`nydus`-design](../design/kata-nydus-design.md) for introduction and `nydus` has supported Kata Containers with hypervisor `QEMU` and `CLH` currently.
|
||||
|
||||
## How to
|
||||
|
||||
You can use Kata Containers with `nydus` as follows,
|
||||
|
||||
1. Use [`nydus` latest branch](https://github.com/dragonflyoss/image-service);
|
||||
|
||||
2. Deploy `nydus` environment as [`Nydus` Setup for Containerd Environment](https://github.com/dragonflyoss/image-service/blob/master/docs/containerd-env-setup.md);
|
||||
|
||||
3. Start `nydus-snapshotter` with `enable_nydus_overlayfs` enabled;
|
||||
|
||||
4. Use [kata-containers](https://github.com/kata-containers/kata-containers) `latest` branch to compile and build `kata-containers.img`;
|
||||
|
||||
5. Update `configuration-qemu.toml` or `configuration-clh.toml`to include:
|
||||
|
||||
```toml
|
||||
shared_fs = "virtio-fs-nydus"
|
||||
virtio_fs_daemon = "<nydusd binary path>"
|
||||
virtio_fs_extra_args = []
|
||||
```
|
||||
|
||||
6. run `crictl run -r kata nydus-container.yaml nydus-sandbox.yaml`;
|
||||
|
||||
The `nydus-sandbox.yaml` looks like below:
|
||||
|
||||
```yaml
|
||||
metadata:
|
||||
attempt: 1
|
||||
name: nydus-sandbox
|
||||
namespace: default
|
||||
log_directory: /tmp
|
||||
linux:
|
||||
security_context:
|
||||
namespace_options:
|
||||
network: 2
|
||||
annotations:
|
||||
"io.containerd.osfeature": "nydus.remoteimage.v1"
|
||||
```
|
||||
|
||||
The `nydus-container.yaml` looks like below:
|
||||
|
||||
```yaml
|
||||
metadata:
|
||||
name: nydus-container
|
||||
image:
|
||||
image: localhost:5000/ubuntu-nydus:latest
|
||||
command:
|
||||
- /bin/sleep
|
||||
args:
|
||||
- 600
|
||||
log_path: container.1.log
|
||||
```
|
||||
@@ -1,9 +1,12 @@
|
||||
# Kata Containers with virtio-fs
|
||||
|
||||
- [Kata Containers with virtio-fs](#kata-containers-with-virtio-fs)
|
||||
- [Introduction](#introduction)
|
||||
|
||||
## Introduction
|
||||
|
||||
Container deployments utilize explicit or implicit file sharing between host filesystem and containers. From a trust perspective, avoiding a shared file-system between the trusted host and untrusted container is recommended. This is not always feasible. In Kata Containers, block-based volumes are preferred as they allow usage of either device pass through or `virtio-blk` for access within the virtual machine.
|
||||
|
||||
As of the 2.0 release of Kata Containers, [virtio-fs](https://virtio-fs.gitlab.io/) is the default filesystem sharing mechanism.
|
||||
|
||||
virtio-fs support works out of the box for `cloud-hypervisor` and `qemu`, when Kata Containers is deployed using `kata-deploy`. Learn more about `kata-deploy` and how to use `kata-deploy` in Kubernetes [here](../../tools/packaging/kata-deploy/README.md#kubernetes-quick-start).
|
||||
virtio-fs support works out of the box for `cloud-hypervisor` and `qemu`, when Kata Containers is deployed using `kata-deploy`. Learn more about `kata-deploy` and how to use `kata-deploy` in Kubernetes [here](https://github.com/kata-containers/packaging/tree/master/kata-deploy#kubernetes-quick-start).
|
||||
@@ -1,5 +1,9 @@
|
||||
# Kata Containers with `virtio-mem`
|
||||
|
||||
- [Introduction](#introduction)
|
||||
- [Requisites](#requisites)
|
||||
- [Run a Kata Container utilizing `virtio-mem`](#run-a-kata-container-utilizing-virtio-mem)
|
||||
|
||||
## Introduction
|
||||
|
||||
The basic idea of `virtio-mem` is to provide a flexible, cross-architecture memory hot plug and hot unplug solution that avoids many limitations imposed by existing technologies, architectures, and interfaces.
|
||||
@@ -9,23 +13,26 @@ Kata Containers with `virtio-mem` supports memory resize.
|
||||
|
||||
## Requisites
|
||||
|
||||
Kata Containers just supports `virtio-mem` with QEMU.
|
||||
Install and setup Kata Containers as shown [here](../install/README.md).
|
||||
Kata Containers with `virtio-mem` requires Linux and the QEMU that support `virtio-mem`.
|
||||
The Linux kernel and QEMU upstream version still not support `virtio-mem`. @davidhildenbrand is working on them.
|
||||
Please use following unofficial version of the Linux kernel and QEMU that support `virtio-mem` with Kata Containers.
|
||||
|
||||
### With x86_64
|
||||
The `virtio-mem` config of the x86_64 Kata Linux kernel is open.
|
||||
Enable `virtio-mem` as follows:
|
||||
```
|
||||
$ sudo sed -i -e 's/^#enable_virtio_mem.*$/enable_virtio_mem = true/g' /etc/kata-containers/configuration.toml
|
||||
The Linux kernel is at https://github.com/davidhildenbrand/linux/tree/virtio-mem-rfc-v4.
|
||||
The Linux kernel config that can work with Kata Containers is at https://gist.github.com/teawater/016194ee84748c768745a163d08b0fb9.
|
||||
|
||||
The QEMU is at https://github.com/teawater/qemu/tree/kata-virtio-mem. (The original source is at https://github.com/davidhildenbrand/qemu/tree/virtio-mem. Its base version of QEMU cannot work with Kata Containers. So merge the commit of `virtio-mem` to upstream QEMU.)
|
||||
|
||||
Set Linux and the QEMU that support `virtio-mem` with following line in the Kata Containers QEMU configuration `configuration-qemu.toml`:
|
||||
```toml
|
||||
[hypervisor.qemu]
|
||||
path = "qemu-dir"
|
||||
kernel = "vmlinux-dir"
|
||||
```
|
||||
|
||||
### With other architectures
|
||||
The `virtio-mem` config of the others Kata Linux kernel is not open.
|
||||
You can open `virtio-mem` config as follows:
|
||||
Enable `virtio-mem` with following line in the Kata Containers configuration:
|
||||
```toml
|
||||
enable_virtio_mem = true
|
||||
```
|
||||
CONFIG_VIRTIO_MEM=y
|
||||
```
|
||||
Then you can build and install the guest kernel image as shown [here](../../tools/packaging/kernel/README.md#build-kata-containers-kernel).
|
||||
|
||||
## Run a Kata Container utilizing `virtio-mem`
|
||||
|
||||
@@ -34,35 +41,13 @@ Use following command to enable memory overcommitment of a Linux kernel. Becaus
|
||||
$ echo 1 | sudo tee /proc/sys/vm/overcommit_memory
|
||||
```
|
||||
|
||||
Use following command to start a Kata Container.
|
||||
Use following command start a Kata Container.
|
||||
```
|
||||
$ pod_yaml=pod.yaml
|
||||
$ container_yaml=container.yaml
|
||||
$ image="quay.io/prometheus/busybox:latest"
|
||||
$ cat << EOF > "${pod_yaml}"
|
||||
metadata:
|
||||
name: busybox-sandbox1
|
||||
EOF
|
||||
$ cat << EOF > "${container_yaml}"
|
||||
metadata:
|
||||
name: busybox-killed-vmm
|
||||
image:
|
||||
image: "$image"
|
||||
command:
|
||||
- top
|
||||
EOF
|
||||
$ sudo crictl pull $image
|
||||
$ podid=$(sudo crictl runp $pod_yaml)
|
||||
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
|
||||
$ sudo crictl start $cid
|
||||
$ docker run --rm -it --runtime=kata --name test busybox
|
||||
```
|
||||
|
||||
Use the following command to set the container memory limit to 2g and the memory size of the VM to its default_memory + 2g.
|
||||
Use following command set the memory size of test to default_memory + 512m.
|
||||
```
|
||||
$ sudo crictl update --memory $((2*1024*1024*1024)) $cid
|
||||
$ docker update -m 512m --memory-swap -1 test
|
||||
```
|
||||
|
||||
Use the following command to set the container memory limit to 1g and the memory size of the VM to its default_memory + 1g.
|
||||
```
|
||||
$ sudo crictl update --memory $((1*1024*1024*1024)) $cid
|
||||
```
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
@@ -3,6 +3,11 @@
|
||||
Kata Containers supports creation of containers that are "privileged" (i.e. have additional capabilities and access
|
||||
that is not normally granted).
|
||||
|
||||
* [Warnings](#warnings)
|
||||
* [Host Devices](#host-devices)
|
||||
* [Containerd and CRI](#containerd-and-cri)
|
||||
* [CRI-O](#cri-o)
|
||||
|
||||
## Warnings
|
||||
|
||||
**Warning:** Whilst this functionality is supported, it can decrease the security of Kata Containers if not configured
|
||||
@@ -16,9 +21,9 @@ from the host, a potentially undesirable side-effect that decreases the security
|
||||
|
||||
The following sections document how to configure this behavior in different container runtimes.
|
||||
|
||||
#### Containerd
|
||||
#### Containerd and CRI
|
||||
|
||||
The Containerd allows configuring the privileged host devices behavior for each runtime in the containerd config. This is
|
||||
The Containerd CRI allows configuring the privileged host devices behavior for each runtime in the CRI config. This is
|
||||
done with the `privileged_without_host_devices` option. Setting this to `true` will disable hot plugging of the host
|
||||
devices into the guest, even when privileged is enabled.
|
||||
|
||||
@@ -41,7 +46,7 @@ See below example config:
|
||||
```
|
||||
|
||||
- [Kata Containers with Containerd and CRI documentation](how-to-use-k8s-with-cri-containerd-and-kata.md)
|
||||
- [Containerd CRI config documentation](https://github.com/containerd/containerd/blob/main/docs/cri/config.md)
|
||||
- [Containerd CRI config documentation](https://github.com/containerd/cri/blob/master/docs/config.md)
|
||||
|
||||
#### CRI-O
|
||||
|
||||
|
||||
@@ -1,5 +1,16 @@
|
||||
# Working with `crictl`
|
||||
|
||||
* [What's `cri-tools`](#whats-cri-tools)
|
||||
* [Use `crictl` run Pods in Kata containers](#use-crictl-run-pods-in-kata-containers)
|
||||
* [Run `busybox` Pod](#run-busybox-pod)
|
||||
* [Run pod sandbox with config file](#run-pod-sandbox-with-config-file)
|
||||
* [Create container in the pod sandbox with config file](#create-container-in-the-pod-sandbox-with-config-file)
|
||||
* [Start container](#start-container)
|
||||
* [Run `redis` Pod](#run-redis-pod)
|
||||
* [Create `redis-server` Pod](#create-redis-server-pod)
|
||||
* [Create `redis-client` Pod](#create-redis-client-pod)
|
||||
* [Check `redis` server is working](#check-redis-server-is-working)
|
||||
|
||||
## What's `cri-tools`
|
||||
|
||||
[`cri-tools`](https://github.com/kubernetes-sigs/cri-tools) provides debugging and validation tools for Kubelet Container Runtime Interface (CRI).
|
||||
|
||||
@@ -1,5 +1,18 @@
|
||||
# Run Kata Containers with Kubernetes
|
||||
|
||||
* [Run Kata Containers with Kubernetes](#run-kata-containers-with-kubernetes)
|
||||
* [Prerequisites](#prerequisites)
|
||||
* [Install a CRI implementation](#install-a-cri-implementation)
|
||||
* [CRI-O](#cri-o)
|
||||
* [Kubernetes Runtime Class (CRI-O v1.12 )](#kubernetes-runtime-class-cri-o-v112)
|
||||
* [Untrusted annotation (until CRI-O v1.12)](#untrusted-annotation-until-cri-o-v112)
|
||||
* [Network namespace management](#network-namespace-management)
|
||||
* [containerd with CRI plugin](#containerd-with-cri-plugin)
|
||||
* [Install Kubernetes](#install-kubernetes)
|
||||
* [Configure for CRI-O](#configure-for-cri-o)
|
||||
* [Configure for containerd](#configure-for-containerd)
|
||||
* [Run a Kubernetes pod with Kata Containers](#run-a-kubernetes-pod-with-kata-containers)
|
||||
|
||||
## Prerequisites
|
||||
This guide requires Kata Containers available on your system, install-able by following [this guide](../install/README.md).
|
||||
|
||||
@@ -9,7 +22,7 @@ Kubernetes CRI (Container Runtime Interface) implementations allow using any
|
||||
OCI-compatible runtime with Kubernetes, such as the Kata Containers runtime.
|
||||
|
||||
Kata Containers support both the [CRI-O](https://github.com/kubernetes-incubator/cri-o) and
|
||||
[containerd](https://github.com/containerd/containerd) CRI implementations.
|
||||
[CRI-containerd](https://github.com/containerd/cri) CRI implementations.
|
||||
|
||||
After choosing one CRI implementation, you must make the appropriate configuration
|
||||
to ensure it integrates with Kata Containers.
|
||||
@@ -20,9 +33,9 @@ required to spawn pods and containers, and this is the preferred way to run Kata
|
||||
An equivalent shim implementation for CRI-O is planned.
|
||||
|
||||
### CRI-O
|
||||
For CRI-O installation instructions, refer to the [CRI-O Tutorial](https://github.com/cri-o/cri-o/blob/main/tutorial.md) page.
|
||||
For CRI-O installation instructions, refer to the [CRI-O Tutorial](https://github.com/kubernetes-incubator/cri-o/blob/master/tutorial.md) page.
|
||||
|
||||
The following sections show how to set up the CRI-O snippet configuration file (default path: `/etc/crio/crio.conf`) for Kata.
|
||||
The following sections show how to set up the CRI-O configuration file (default path: `/etc/crio/crio.conf`) for Kata.
|
||||
|
||||
Unless otherwise stated, all the following settings are specific to the `crio.runtime` table:
|
||||
```toml
|
||||
@@ -30,7 +43,7 @@ Unless otherwise stated, all the following settings are specific to the `crio.ru
|
||||
# runtime used and options for how to set up and manage the OCI runtime.
|
||||
[crio.runtime]
|
||||
```
|
||||
A comprehensive documentation of the configuration file can be found [here](https://github.com/cri-o/cri-o/blob/main/docs/crio.conf.5.md).
|
||||
A comprehensive documentation of the configuration file can be found [here](https://github.com/cri-o/cri-o/blob/master/docs/crio.conf.5.md).
|
||||
|
||||
> **Note**: After any change to this file, the CRI-O daemon have to be restarted with:
|
||||
>````
|
||||
@@ -40,20 +53,82 @@ A comprehensive documentation of the configuration file can be found [here](http
|
||||
#### Kubernetes Runtime Class (CRI-O v1.12+)
|
||||
The [Kubernetes Runtime Class](https://kubernetes.io/docs/concepts/containers/runtime-class/)
|
||||
is the preferred way of specifying the container runtime configuration to run a Pod's containers.
|
||||
To use this feature, Kata must added as a runtime handler. This can be done by
|
||||
dropping a `50-kata` snippet file into `/etc/crio/crio.conf.d`, with the
|
||||
content shown below:
|
||||
To use this feature, Kata must added as a runtime handler with:
|
||||
|
||||
```toml
|
||||
[crio.runtime.runtimes.kata]
|
||||
runtime_path = "/usr/bin/containerd-shim-kata-v2"
|
||||
runtime_type = "vm"
|
||||
runtime_root = "/run/vc"
|
||||
privileged_without_host_devices = true
|
||||
[crio.runtime.runtimes.kata-runtime]
|
||||
runtime_path = "/usr/bin/kata-runtime"
|
||||
runtime_type = "oci"
|
||||
```
|
||||
|
||||
You can also add multiple entries to specify alternatives hypervisors, e.g.:
|
||||
```toml
|
||||
[crio.runtime.runtimes.kata-qemu]
|
||||
runtime_path = "/usr/bin/kata-runtime"
|
||||
runtime_type = "oci"
|
||||
|
||||
[crio.runtime.runtimes.kata-fc]
|
||||
runtime_path = "/usr/bin/kata-runtime"
|
||||
runtime_type = "oci"
|
||||
```
|
||||
|
||||
#### Untrusted annotation (until CRI-O v1.12)
|
||||
The untrusted annotation is used to specify a runtime for __untrusted__ workloads, i.e.
|
||||
a runtime to be used when the workload cannot be trusted and a higher level of security
|
||||
is required. An additional flag can be used to let CRI-O know if a workload
|
||||
should be considered _trusted_ or _untrusted_ by default.
|
||||
For further details, see the documentation
|
||||
[here](../design/architecture.md#mixing-vm-based-and-namespace-based-runtimes).
|
||||
|
||||
```toml
|
||||
# runtime is the OCI compatible runtime used for trusted container workloads.
|
||||
# This is a mandatory setting as this runtime will be the default one
|
||||
# and will also be used for untrusted container workloads if
|
||||
# runtime_untrusted_workload is not set.
|
||||
runtime = "/usr/bin/runc"
|
||||
|
||||
# runtime_untrusted_workload is the OCI compatible runtime used for untrusted
|
||||
# container workloads. This is an optional setting, except if
|
||||
# default_container_trust is set to "untrusted".
|
||||
runtime_untrusted_workload = "/usr/bin/kata-runtime"
|
||||
|
||||
# default_workload_trust is the default level of trust crio puts in container
|
||||
# workloads. It can either be "trusted" or "untrusted", and the default
|
||||
# is "trusted".
|
||||
# Containers can be run through different container runtimes, depending on
|
||||
# the trust hints we receive from kubelet:
|
||||
# - If kubelet tags a container workload as untrusted, crio will try first to
|
||||
# run it through the untrusted container workload runtime. If it is not set,
|
||||
# crio will use the trusted runtime.
|
||||
# - If kubelet does not provide any information about the container workload trust
|
||||
# level, the selected runtime will depend on the default_container_trust setting.
|
||||
# If it is set to "untrusted", then all containers except for the host privileged
|
||||
# ones, will be run by the runtime_untrusted_workload runtime. Host privileged
|
||||
# containers are by definition trusted and will always use the trusted container
|
||||
# runtime. If default_container_trust is set to "trusted", crio will use the trusted
|
||||
# container runtime for all containers.
|
||||
default_workload_trust = "untrusted"
|
||||
```
|
||||
|
||||
#### Network namespace management
|
||||
To enable networking for the workloads run by Kata, CRI-O needs to be configured to
|
||||
manage network namespaces, by setting the following key to `true`.
|
||||
|
||||
In CRI-O v1.16:
|
||||
```toml
|
||||
manage_network_ns_lifecycle = true
|
||||
```
|
||||
In CRI-O v1.17+:
|
||||
```toml
|
||||
manage_ns_lifecycle = true
|
||||
```
|
||||
|
||||
|
||||
### containerd
|
||||
### containerd with CRI plugin
|
||||
|
||||
If you select containerd with `cri` plugin, follow the "Getting Started for Developers"
|
||||
instructions [here](https://github.com/containerd/cri#getting-started-for-developers)
|
||||
to properly install it.
|
||||
|
||||
To customize containerd to select Kata Containers runtime, follow our
|
||||
"Configure containerd to use Kata Containers" internal documentation
|
||||
@@ -96,77 +171,34 @@ $ sudo systemctl daemon-reload
|
||||
$ sudo systemctl restart kubelet
|
||||
|
||||
# If using CRI-O
|
||||
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16
|
||||
$ sudo kubeadm init --skip-preflight-checks --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.244.0.0/16
|
||||
|
||||
# If using containerd
|
||||
$ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
|
||||
# If using CRI-containerd
|
||||
$ sudo kubeadm init --skip-preflight-checks --cri-socket /run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
|
||||
|
||||
$ export KUBECONFIG=/etc/kubernetes/admin.conf
|
||||
```
|
||||
|
||||
### Allow pods to run in the master node
|
||||
You can force Kubelet to use Kata Containers by adding some `untrusted`
|
||||
annotation to your pod configuration. In our case, this ensures Kata
|
||||
Containers is the selected runtime to run the described workload.
|
||||
|
||||
By default, the cluster will not schedule pods in the master node. To enable master node scheduling:
|
||||
```bash
|
||||
$ sudo -E kubectl taint nodes --all node-role.kubernetes.io/master-
|
||||
```
|
||||
|
||||
### Create runtime class for Kata Containers
|
||||
|
||||
Users can use [`RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/#runtime-class) to specify a different runtime for Pods.
|
||||
|
||||
```bash
|
||||
$ cat > runtime.yaml <<EOF
|
||||
apiVersion: node.k8s.io/v1
|
||||
kind: RuntimeClass
|
||||
`nginx-untrusted.yaml`
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: kata
|
||||
handler: kata
|
||||
EOF
|
||||
|
||||
$ sudo -E kubectl apply -f runtime.yaml
|
||||
```
|
||||
|
||||
### Run pod in Kata Containers
|
||||
|
||||
If a pod has the `runtimeClassName` set to `kata`, the CRI plugin runs the pod with the
|
||||
[Kata Containers runtime](../../src/runtime/README.md).
|
||||
|
||||
- Create an pod configuration that using Kata Containers runtime
|
||||
|
||||
```bash
|
||||
$ cat << EOF | tee nginx-kata.yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: nginx-kata
|
||||
spec:
|
||||
runtimeClassName: kata
|
||||
containers:
|
||||
name: nginx-untrusted
|
||||
annotations:
|
||||
io.kubernetes.cri.untrusted-workload: "true"
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx
|
||||
image: nginx
|
||||
|
||||
EOF
|
||||
```
|
||||
|
||||
- Create the pod
|
||||
```bash
|
||||
$ sudo -E kubectl apply -f nginx-kata.yaml
|
||||
```
|
||||
|
||||
- Check pod is running
|
||||
|
||||
```bash
|
||||
$ sudo -E kubectl get pods
|
||||
```
|
||||
|
||||
- Check hypervisor is running
|
||||
```bash
|
||||
$ ps aux | grep qemu
|
||||
```
|
||||
|
||||
### Delete created pod
|
||||
|
||||
```bash
|
||||
$ sudo -E kubectl delete -f nginx-kata.yaml
|
||||
```
|
||||
|
||||
Next, you run your pod:
|
||||
```
|
||||
$ sudo -E kubectl apply -f nginx-untrusted.yaml
|
||||
```
|
||||
|
||||
|
||||
@@ -1,5 +1,21 @@
|
||||
# Kata Containers and service mesh for Kubernetes
|
||||
|
||||
* [Assumptions](#assumptions)
|
||||
* [How they work](#how-they-work)
|
||||
* [Prerequisites](#prerequisites)
|
||||
* [Kata and Kubernetes](#kata-and-kubernetes)
|
||||
* [Restrictions](#restrictions)
|
||||
* [Install and deploy your service mesh](#install-and-deploy-your-service-mesh)
|
||||
* [Service Mesh Istio](#service-mesh-istio)
|
||||
* [Service Mesh Linkerd](#service-mesh-linkerd)
|
||||
* [Inject your services with sidecars](#inject-your-services-with-sidecars)
|
||||
* [Sidecar Istio](#sidecar-istio)
|
||||
* [Sidecar Linkerd](#sidecar-linkerd)
|
||||
* [Run your services with Kata](#run-your-services-with-kata)
|
||||
* [Lower privileges](#lower-privileges)
|
||||
* [Add annotations](#add-annotations)
|
||||
* [Deploy](#deploy)
|
||||
|
||||
A service mesh is a way to monitor and control the traffic between
|
||||
micro-services running in your Kubernetes cluster. It is a powerful
|
||||
tool that you might want to use in combination with the security
|
||||
@@ -34,7 +50,7 @@ as the proxy starts.
|
||||
|
||||
Follow the [instructions](../install/README.md)
|
||||
to get Kata Containers properly installed and configured with Kubernetes.
|
||||
You can choose between CRI-O and containerd, both are supported
|
||||
You can choose between CRI-O and CRI-containerd, both are supported
|
||||
through this document.
|
||||
|
||||
For both cases, select the workloads as _trusted_ by default. This way,
|
||||
@@ -60,16 +76,15 @@ is not able to perform a proper setup of the rules.
|
||||
|
||||
### Service Mesh Istio
|
||||
|
||||
The following is a summary of what you need to install Istio on your system:
|
||||
As a reference, you can follow Istio [instructions](https://istio.io/docs/setup/kubernetes/quick-start/#download-and-prepare-for-the-installation).
|
||||
|
||||
The following is a summary of what you need to install Istio on your system:
|
||||
```
|
||||
$ curl -L https://git.io/getLatestIstio | sh -
|
||||
$ cd istio-*
|
||||
$ export PATH=$PWD/bin:$PATH
|
||||
```
|
||||
|
||||
See the [Istio documentation](https://istio.io/docs) for further details.
|
||||
|
||||
Now deploy Istio in the control plane of your cluster with the following:
|
||||
```
|
||||
$ kubectl apply -f install/kubernetes/istio-demo.yaml
|
||||
@@ -159,7 +174,7 @@ containers with `privileged: true` to `privileged: false`.
|
||||
There is no difference between Istio and Linkerd in this section. It is
|
||||
about which CRI implementation you use.
|
||||
|
||||
For both CRI-O and containerd, you have to add an annotation indicating
|
||||
For both CRI-O and CRI-containerd, you have to add an annotation indicating
|
||||
the workload for this deployment is not _trusted_, which will trigger
|
||||
`kata-runtime` to be called instead of `runc`.
|
||||
|
||||
@@ -193,9 +208,9 @@ spec:
|
||||
...
|
||||
```
|
||||
|
||||
__containerd:__
|
||||
__CRI-containerd:__
|
||||
|
||||
Add the following annotation for containerd
|
||||
Add the following annotation for CRI-containerd
|
||||
```yaml
|
||||
io.kubernetes.cri.untrusted-workload: "true"
|
||||
```
|
||||
|
||||