Compare commits
18 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | ae3645ac3d | | |
| | de89796190 | | |
| | fcc48d1de7 | | |
| | fac05f0fa8 | | |
| | 5f1a9d28ea | | |
| | a662908478 | | |
| | e357dd3ca9 | | |
| | 57b64f35e0 | | |
| | 4c004891b5 | | |
| | 1b157e5015 | | |
| | 0d96145c29 | | |
| | c2539b1b3d | | |
| | 8af8129a1a | | |
| | 1f21947dd1 | | |
| | e902aeb5cf | | |
| | ba3078b8e5 | | |
| | a276da6194 | | |
| | bf5671357e | | |
17
.github/ISSUE_TEMPLATE.md
vendored
Normal file
@@ -0,0 +1,17 @@
# Description of problem

(replace this text with the list of steps you followed)

# Expected result

(replace this text with an explanation of what you thought would happen)

# Actual result

(replace this text with details of what actually happened)

---

(replace this text with the output of the `kata-collect-data.sh` script, after
you have reviewed its content to ensure it does not contain any private
information).
1
.github/workflows/PR-wip-checks.yaml
vendored
@@ -15,7 +15,6 @@ jobs:
    name: WIP Check
    steps:
    - name: WIP Check
      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
      uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755
      with:
        labels: '["do-not-merge", "wip", "rfc"]'
55
.github/workflows/add-issues-to-project.yaml
vendored
@@ -1,55 +0,0 @@
|
||||
# Copyright (c) 2020 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
name: Add newly created issues to the backlog project
|
||||
|
||||
on:
|
||||
issues:
|
||||
types:
|
||||
- opened
|
||||
- reopened
|
||||
|
||||
jobs:
|
||||
add-new-issues-to-backlog:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Install hub
|
||||
run: |
|
||||
HUB_ARCH="amd64"
|
||||
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
|
||||
jq -r .tag_name | sed 's/^v//')
|
||||
curl -sL \
|
||||
"https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && \
|
||||
sudo install hub /usr/local/bin
|
||||
|
||||
- name: Install hub extension script
|
||||
run: |
|
||||
# Clone into a temporary directory to avoid overwriting
|
||||
# any existing github directory.
|
||||
pushd $(mktemp -d) &>/dev/null
|
||||
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
|
||||
sudo install hub-util.sh /usr/local/bin
|
||||
popd &>/dev/null
|
||||
|
||||
- name: Checkout code to allow hub to communicate with the project
|
||||
uses: actions/checkout@v2
|
||||
|
||||
- name: Add issue to issue backlog
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
run: |
|
||||
issue=${{ github.event.issue.number }}
|
||||
|
||||
project_name="Issue backlog"
|
||||
project_type="org"
|
||||
project_column="To do"
|
||||
|
||||
hub-util.sh \
|
||||
add-issue \
|
||||
"$issue" \
|
||||
"$project_name" \
|
||||
"$project_type" \
|
||||
"$project_column"
|
||||
99
.github/workflows/commit-message-check.yaml
vendored
@@ -1,99 +0,0 @@
|
||||
name: Commit Message Check
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- reopened
|
||||
- synchronize
|
||||
|
||||
env:
|
||||
error_msg: |+
|
||||
See the document below for help on formatting commits for the project.
|
||||
|
||||
https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format
|
||||
|
||||
jobs:
|
||||
commit-message-check:
|
||||
runs-on: ubuntu-latest
|
||||
name: Commit Message Check
|
||||
steps:
|
||||
- name: Get PR Commits
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
id: 'get-pr-commits'
|
||||
uses: tim-actions/get-pr-commits@v1.2.0
|
||||
with:
|
||||
token: ${{ secrets.GITHUB_TOKEN }}
|
||||
# Filter out revert commits
|
||||
# The format of a revert commit is as follows:
|
||||
#
|
||||
# Revert "<original-subject-line>"
|
||||
#
|
||||
filter_out_pattern: '^Revert "'
|
||||
|
||||
- name: DCO Check
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
|
||||
- name: Commit Body Missing Check
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
uses: tim-actions/commit-body-check@v1.0.2
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
|
||||
- name: Check Subject Line Length
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
pattern: '^.{0,75}(\n.*)*$'
|
||||
error: 'Subject too long (max 75)'
|
||||
post_error: ${{ env.error_msg }}
|
||||
|
||||
- name: Check Body Line Length
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
# Notes:
|
||||
#
|
||||
# - The subject line is not enforced here (see other check), but has
|
||||
# to be specified at the start of the regex as the action is passed
|
||||
# the entire commit message.
|
||||
#
|
||||
# - Body lines *can* be longer than the maximum if they start
|
||||
# with a non-alphabetic character.
|
||||
#
|
||||
# This allows stack traces, log file snippets, emails, long URLs,
|
||||
# etc to be specified. Some of these naturally "work" as they start
|
||||
# with numeric timestamps or addresses. Emails can be quoted using
|
||||
# the normal ">" character, markdown bullets ("-", "*") are also
|
||||
# useful for lists of URLs, but it is always possible to override
|
||||
# the check by simply space indenting the content you need to add.
|
||||
#
|
||||
# - A SoB comment can be any length (as it is unreasonable to penalise
|
||||
# people with long names/email addresses :)
|
||||
pattern: '^.+(\n([a-zA-Z].{0,149}|[^a-zA-Z\n].*|Signed-off-by:.*|))+$'
|
||||
error: 'Body line too long (max 150)'
|
||||
post_error: ${{ env.error_msg }}
|
||||
|
||||
- name: Check Fixes
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
pattern: '\s*Fixes\s*:?\s*(#\d+|github\.com\/kata-containers\/[a-z-.]*#\d+)|^\s*release\s*:'
|
||||
flags: 'i'
|
||||
error: 'No "Fixes" found'
|
||||
post_error: ${{ env.error_msg }}
|
||||
one_pass_all_pass: 'true'
|
||||
|
||||
- name: Check Subsystem
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}
|
||||
uses: tim-actions/commit-message-checker-with-regex@v0.3.1
|
||||
with:
|
||||
commits: ${{ steps.get-pr-commits.outputs.commits }}
|
||||
pattern: '^[\s\t]*[^:\s\t]+[\s\t]*:'
|
||||
error: 'Failed to find subsystem in subject'
|
||||
post_error: ${{ env.error_msg }}
|
||||
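The subject and body length rules enforced by the regex patterns above can be approximated with plain string tests. The following is a minimal sketch, not the workflow's own implementation: the function name `check_commit_message` and the two constants are illustrative, and unlike the body regex it does not require a body to be present.

```shell
#!/bin/bash
# Sketch only: approximates the subject and body length rules with string
# tests. The workflow itself relies on regex-based actions.

MAX_SUBJECT=75   # from the '^.{0,75}' subject pattern
MAX_BODY=150     # implied by the '[a-zA-Z].{0,149}' body alternative

# Returns 0 when the message passes both length rules, 1 otherwise.
check_commit_message() {
    local msg="$1" line first=1
    while IFS= read -r line; do
        if [ "$first" -eq 1 ]; then
            # First line is the subject.
            first=0
            [ "${#line}" -le "$MAX_SUBJECT" ] || return 1
            continue
        fi
        case "$line" in
            Signed-off-by:*) continue ;;   # sign-off lines may be any length
            [a-zA-Z]*)
                # Lines starting with a letter are length-limited.
                [ "${#line}" -le "$MAX_BODY" ] || return 1 ;;
            *) continue ;;                 # blank or non-alphabetic start
        esac
    done <<< "$msg"
    return 0
}

if check_commit_message "$(printf 'docs: fix typo\n\nCorrect a spelling error.\n\nFixes: #1')"; then
    echo "commit message accepted"
fi
```

Indenting an over-long body line with a space makes it start with a non-alphabetic character, which is the override described in the workflow comments.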
25
.github/workflows/darwin-tests.yaml
vendored
@@ -1,25 +0,0 @@
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
|
||||
name: Darwin tests
|
||||
jobs:
|
||||
test:
|
||||
strategy:
|
||||
matrix:
|
||||
go-version: [1.16.x, 1.17.x]
|
||||
os: [macos-latest]
|
||||
runs-on: ${{ matrix.os }}
|
||||
steps:
|
||||
- name: Install Go
|
||||
uses: actions/setup-go@v2
|
||||
with:
|
||||
go-version: ${{ matrix.go-version }}
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v2
|
||||
- name: Build utils
|
||||
run: ./ci/darwin-test.sh
|
||||
22
.github/workflows/dco-check.yaml
vendored
Normal file
@@ -0,0 +1,22 @@
name: DCO check
on:
  pull_request:
    types:
      - opened
      - reopened
      - synchronize

jobs:
  dco_check_job:
    runs-on: ubuntu-latest
    name: DCO Check
    steps:
    - name: Get PR Commits
      id: 'get-pr-commits'
      uses: tim-actions/get-pr-commits@ed97a21c3f83c3417e67a4733ea76887293a2c8f
      with:
        token: ${{ secrets.GITHUB_TOKEN }}
    - name: DCO Check
      uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20
      with:
        commits: ${{ steps.get-pr-commits.outputs.commits }}
18
.github/workflows/gather-artifacts.sh
vendored
Executable file
@@ -0,0 +1,18 @@
#!/bin/bash
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

set -o errexit
set -o pipefail

pushd kata-artifacts >>/dev/null
for c in ./*.tar.gz
do
    echo "untarring tarball $c"
    tar -xvf "$c"
done

tar cvfJ ../kata-static.tar.xz ./opt
popd >>/dev/null
36
.github/workflows/generate-artifact-tarball.sh
vendored
Executable file
@@ -0,0 +1,36 @@
#!/bin/bash
# Copyright (c) 2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

set -o errexit
set -o pipefail


main() {
    artifact_stage=${1:-}
    artifact=$(echo ${artifact_stage} | sed -n -e 's/^install_//p' | sed -r 's/_/-/g')
    if [ -z "${artifact}" ]; then
        echo "Script needs artifact name to build"
        exit 1
    fi

    tag=$(echo $GITHUB_REF | cut -d/ -f3-)
    export GOPATH=$HOME/go

    go get github.com/kata-containers/packaging || true
    pushd $GOPATH/src/github.com/kata-containers/packaging/release >>/dev/null
    git checkout $tag
    pushd ../obs-packaging
    ./gen_versions_txt.sh $tag
    popd

    source ./kata-deploy-binaries.sh
    ${artifact_stage} $tag
    popd

    mv $HOME/go/src/github.com/kata-containers/packaging/release/kata-static-${artifact}.tar.gz .
}

main "$@"
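The two string transformations in `main()` above can be tried in isolation. This is a minimal sketch, assuming a stage name such as `install_qemu_virtiofsd` and a ref of the form `refs/tags/<tag>`; the helper names `artifact_name` and `tag_from_ref` and the sample tag are illustrative and do not appear in the script.

```shell
#!/bin/bash
# Hypothetical helpers mirroring the pipelines in generate-artifact-tarball.sh.

# Strip the "install_" prefix and turn underscores into hyphens.
# Prints nothing when the prefix is missing, because of 'sed -n ... p'.
artifact_name() {
    echo "$1" | sed -n -e 's/^install_//p' | sed -r 's/_/-/g'
}

# Drop the first two "/"-separated fields, e.g. the leading "refs/tags/".
tag_from_ref() {
    echo "$1" | cut -d/ -f3-
}

artifact_name "install_qemu_virtiofsd"
tag_from_ref "refs/tags/1.12.0"
```

An argument without the `install_` prefix yields an empty artifact name, which is the case the `-z` test in `main()` guards against.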
83
.github/workflows/kata-deploy-push.yaml
vendored
@@ -1,83 +0,0 @@
|
||||
name: kata deploy build
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
paths:
|
||||
- tools/**
|
||||
- versions.yaml
|
||||
|
||||
jobs:
|
||||
build-asset:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- kernel
|
||||
- shim-v2
|
||||
- qemu
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: Install docker
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make "${KATA_ASSET}-tarball"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r --preserve=all "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: build
|
||||
- name: merge-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make merge-builds
|
||||
- name: store-artifacts
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
path: kata-static.tar.xz
|
||||
|
||||
make-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: make kata-tarball
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
make kata-tarball
|
||||
sudo make install-tarball
|
||||
147
.github/workflows/kata-deploy-test.yaml
vendored
@@ -1,147 +0,0 @@
|
||||
on:
|
||||
issue_comment:
|
||||
types: [created, edited]
|
||||
|
||||
name: test-kata-deploy
|
||||
|
||||
jobs:
|
||||
check-comment-and-membership:
|
||||
runs-on: ubuntu-latest
|
||||
if: |
|
||||
github.event.issue.pull_request
|
||||
&& github.event_name == 'issue_comment'
|
||||
&& github.event.action == 'created'
|
||||
&& startsWith(github.event.comment.body, '/test_kata_deploy')
|
||||
steps:
|
||||
- name: Check membership
|
||||
uses: kata-containers/is-organization-member@1.0.1
|
||||
id: is_organization_member
|
||||
with:
|
||||
organization: kata-containers
|
||||
username: ${{ github.event.comment.user.login }}
|
||||
token: ${{ secrets.GITHUB_TOKEN }}
|
||||
- name: Fail if not member
|
||||
run: |
|
||||
result=${{ steps.is_organization_member.outputs.result }}
|
||||
if [ $result == false ]; then
|
||||
user=${{ github.event.comment.user.login }}
|
||||
echo Either ${user} is not part of the kata-containers organization
|
||||
echo or ${user} has its Organization Visibility set to Private at
|
||||
echo https://github.com/orgs/kata-containers/people?query=${user}
|
||||
echo
|
||||
echo Ensure you change your Organization Visibility to Public and
|
||||
echo trigger the test again.
|
||||
exit 1
|
||||
fi
|
||||
|
||||
build-asset:
|
||||
runs-on: ubuntu-latest
|
||||
needs: check-comment-and-membership
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- kernel
|
||||
- qemu
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
- shim-v2
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
id: get-PR-ref
|
||||
run: |
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
|
||||
- name: Install docker
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
run: |
|
||||
make "${KATA_ASSET}-tarball"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
id: get-PR-ref
|
||||
run: |
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-artifacts
|
||||
- name: merge-artifacts
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: create-kata-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get-PR-ref
|
||||
id: get-PR-ref
|
||||
run: |
|
||||
ref=$(cat $GITHUB_EVENT_PATH | jq -r '.issue.pull_request.url' | sed 's#^.*\/pulls#refs\/pull#' | sed 's#$#\/merge#')
|
||||
echo "reference for PR: " ${ref}
|
||||
echo "##[set-output name=pr-ref;]${ref}"
|
||||
- uses: actions/checkout@v2
|
||||
with:
|
||||
ref: ${{ steps.get-PR-ref.outputs.pr-ref }}
|
||||
- name: get-kata-tarball
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
run: |
|
||||
PR_SHA=$(git log --format=format:%H -n1)
|
||||
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t quay.io/kata-containers/kata-deploy-ci:$PR_SHA $GITHUB_WORKSPACE/tools/packaging/kata-deploy
|
||||
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
|
||||
docker push quay.io/kata-containers/kata-deploy-ci:$PR_SHA
|
||||
mkdir -p packaging/kata-deploy
|
||||
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
|
||||
echo "::set-output name=PKG_SHA::${PR_SHA}"
|
||||
- name: test-kata-deploy-ci-in-aks
|
||||
uses: ./packaging/kata-deploy/action
|
||||
with:
|
||||
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
env:
|
||||
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
AZ_APPID: ${{ secrets.AZ_APPID }}
|
||||
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
|
||||
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
|
||||
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
|
||||
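The `get-PR-ref` steps above rewrite the pull request API URL from the event payload into a merge ref for `actions/checkout`. A minimal sketch of that rewrite follows; the function name `pr_merge_ref` and the sample URL are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: turn a pull request API URL into the merge ref that the
# get-PR-ref step hands to actions/checkout.
pr_merge_ref() {
    # Replace everything up to ".../pulls" with "refs/pull", then
    # append "/merge" at the end of the line.
    echo "$1" | sed -e 's#^.*/pulls#refs/pull#' -e 's#$#/merge#'
}

# The URL below is a made-up example of '.issue.pull_request.url'.
pr_merge_ref "https://api.github.com/repos/kata-containers/kata-containers/pulls/1234"
```

Using `#` as the `sed` delimiter avoids having to escape the slashes in the URL and in the ref.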
349
.github/workflows/main.yaml
vendored
Normal file
@@ -0,0 +1,349 @@
|
||||
name: Publish release tarball
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- '*'
|
||||
|
||||
jobs:
|
||||
get-artifact-list:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get the list
|
||||
run: |
|
||||
git clone https://github.com/kata-containers/packaging
|
||||
pushd packaging
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
git checkout $tag
|
||||
popd
|
||||
./packaging/artifact-list.sh > artifact-list.txt
|
||||
- name: save-artifact-list
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
path: artifact-list.txt
|
||||
|
||||
build-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-kernel
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-kernel.tar.gz
|
||||
|
||||
build-experimental-kernel:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_experimental_kernel"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- run: |
|
||||
sudo apt-get update && sudo apt install -y flex bison libelf-dev bc iptables
|
||||
- name: build-experimental-kernel
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-experimental-kernel.tar.gz
|
||||
|
||||
build-qemu:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_qemu"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-qemu
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-qemu.tar.gz
|
||||
|
||||
build-nemu:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_nemu"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-nemu
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-nemu.tar.gz
|
||||
|
||||
# Job for building the QEMU binaries with virtiofs support
|
||||
build-qemu-virtiofsd:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_qemu_virtiofsd"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-qemu-virtiofsd
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-qemu-virtiofsd.tar.gz
|
||||
|
||||
# Job for building the image
|
||||
build-image:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_image"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-image
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-image.tar.gz
|
||||
|
||||
# Job for building firecracker hypervisor
|
||||
build-firecracker:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_firecracker"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-firecracker
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-firecracker.tar.gz
|
||||
|
||||
# Job for building cloud-hypervisor
|
||||
build-clh:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_clh"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-clh
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-clh.tar.gz
|
||||
|
||||
# Job for building kata components
|
||||
build-kata-components:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: get-artifact-list
|
||||
env:
|
||||
buildstr: "install_kata_components"
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifact-list
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: artifact-list
|
||||
- name: build-kata-components
|
||||
run: |
|
||||
if grep -q $buildstr ./artifact-list/artifact-list.txt; then
|
||||
$GITHUB_WORKSPACE/.github/workflows/generate-artifact-tarball.sh $buildstr
|
||||
echo ::set-env name=artifact-built::true
|
||||
else
|
||||
echo ::set-env name=artifact-built::false
|
||||
fi
|
||||
- name: store-artifacts
|
||||
if: env.artifact-built == 'true'
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-static-kata-components.tar.gz
|
||||
|
||||
gather-artifacts:
|
||||
runs-on: ubuntu-16.04
|
||||
needs: [build-experimental-kernel, build-kernel, build-qemu, build-qemu-virtiofsd, build-image, build-firecracker, build-kata-components, build-nemu, build-clh]
|
||||
steps:
|
||||
- uses: actions/checkout@v1
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: kata-artifacts
|
||||
- name: collate-artifacts
|
||||
run: |
|
||||
$GITHUB_WORKSPACE/.github/workflows/gather-artifacts.sh
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@v1
|
||||
with:
|
||||
name: release-candidate
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: gather-artifacts
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: release-candidate
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
git clone https://github.com/kata-containers/packaging
|
||||
pushd packaging
|
||||
git checkout $tag
|
||||
pkg_sha=$(git rev-parse HEAD)
|
||||
popd
|
||||
mv release-candidate/kata-static.tar.xz ./packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha ./packaging/kata-deploy
|
||||
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
|
||||
docker push katadocker/kata-deploy-ci:$pkg_sha
|
||||
|
||||
echo "##[set-output name=PKG_SHA;]${pkg_sha}"
|
||||
echo ::set-env name=TAG::$tag
|
||||
- name: test-kata-deploy-ci-in-aks
|
||||
uses: ./packaging/kata-deploy/action
|
||||
with:
|
||||
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
env:
|
||||
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
AZ_APPID: ${{ secrets.AZ_APPID }}
|
||||
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
|
||||
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
|
||||
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
|
||||
- name: push-tarball
|
||||
run: |
|
||||
# tag the container image we created and push to DockerHub
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag}
|
||||
docker push katadocker/kata-deploy:${tag}
|
||||
|
||||
upload-static-tarball:
|
||||
needs: kata-deploy
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: download-artifacts
|
||||
uses: actions/download-artifact@v1
|
||||
with:
|
||||
name: release-candidate
|
||||
- name: install hub
|
||||
run: |
|
||||
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')
|
||||
wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub
|
||||
- name: push static tarball to github
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tarball="kata-static-$tag-x86_64.tar.xz"
|
||||
repo="https://github.com/kata-containers/runtime.git"
|
||||
mv release-candidate/kata-static.tar.xz "release-candidate/${tarball}"
|
||||
git clone "${repo}"
|
||||
cd runtime
|
||||
echo "uploading asset '${tarball}' to '${repo}' tag: ${tag}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "../release-candidate/${tarball}" "${tag}"
|
||||
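The build jobs above pass state between steps with the `::set-env` workflow command, which GitHub has since disabled in favour of environment files. The sketch below shows how the same "was this artifact built" flag could be recorded by appending to the file named by `GITHUB_ENV`. It is an assumption-laden illustration rather than part of this change: the function name `record_artifact_built` and the sample artifact list are hypothetical, and the variable name is simply carried over from the workflow.

```shell
#!/bin/bash
# Sketch: record whether a build stage is listed, using an environment file
# instead of the "::set-env" workflow command.
# Outside GitHub Actions GITHUB_ENV is unset, so fall back to a temp file.
GITHUB_ENV="${GITHUB_ENV:-$(mktemp)}"

# $1: build stage name (e.g. install_kernel), $2: path to the artifact list
record_artifact_built() {
    local buildstr="$1" list="$2"
    if grep -q "$buildstr" "$list"; then
        echo "artifact-built=true" >> "$GITHUB_ENV"
    else
        echo "artifact-built=false" >> "$GITHUB_ENV"
    fi
}

# Hypothetical artifact list for local experimentation.
list=$(mktemp)
printf 'install_kernel\ninstall_qemu\n' > "$list"
record_artifact_built "install_kernel" "$list"
record_artifact_built "install_nemu" "$list"
cat "$GITHUB_ENV"
```

Note that `grep -q` performs a substring match, so a stage name such as `install_qemu` would also match a list that contains only `install_qemu_virtiofsd`.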
@@ -1,82 +0,0 @@
|
||||
# Copyright (c) 2020 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
name: Move issues to "In progress" in backlog project when referenced by a PR
|
||||
|
||||
on:
|
||||
pull_request_target:
|
||||
types:
|
||||
- opened
|
||||
- reopened
|
||||
|
||||
jobs:
|
||||
move-linked-issues-to-in-progress:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Install hub
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
HUB_ARCH="amd64"
|
||||
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
|
||||
jq -r .tag_name | sed 's/^v//')
|
||||
curl -sL \
|
||||
"https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && \
|
||||
sudo install hub /usr/local/bin
|
||||
|
||||
- name: Install hub extension script
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
# Clone into a temporary directory to avoid overwriting
|
||||
# any existing github directory.
|
||||
pushd $(mktemp -d) &>/dev/null
|
||||
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
|
||||
sudo install hub-util.sh /usr/local/bin
|
||||
popd &>/dev/null
|
||||
|
||||
- name: Checkout code to allow hub to communicate with the project
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
|
||||
- name: Move issue to "In progress"
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
run: |
|
||||
pr=${{ github.event.pull_request.number }}
|
||||
|
||||
linked_issue_urls=$(hub-util.sh \
|
||||
list-issues-for-pr "$pr" |\
|
||||
grep -v "^\#" |\
|
||||
cut -d';' -f3 || true)
|
||||
|
||||
# PR doesn't have any linked issues
|
||||
# (it should, but maybe a new user forgot to add a "Fixes: #XXX" commit).
|
||||
[ -z "$linked_issue_urls" ] && {
|
||||
echo "::error::No linked issues for PR $pr"
|
||||
exit 1
|
||||
}
|
||||
|
||||
project_name="Issue backlog"
|
||||
project_type="org"
|
||||
project_column="In progress"
|
||||
|
||||
for issue_url in $(echo "$linked_issue_urls")
|
||||
do
|
||||
issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)
|
||||
|
||||
[ -z "$issue" ] && {
|
||||
echo "::error::Cannot determine issue number from $issue_url for PR $pr"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Move the issue to the correct column on the project board
|
||||
hub-util.sh \
|
||||
move-issue \
|
||||
"$issue" \
|
||||
"$project_name" \
|
||||
"$project_type" \
|
||||
"$project_column"
|
||||
done
|
||||
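The "Move issue" step above keeps the third `;`-separated field of every non-comment line printed by `hub-util.sh list-issues-for-pr`, then treats the last path component of each URL as the issue number. The sketch below isolates that parsing. The function names and the sample record are hypothetical; the workflow only tells us that field 3 holds the issue URL.

```bash
#!/usr/bin/env bash
# Sketch only: issue_urls/issue_number are illustrative names and the sample
# record is made up. Only "field 3 is the issue URL" comes from the workflow.

# Print the URL field of every non-comment line read from stdin.
issue_urls() {
    grep -v '^#' | cut -d';' -f3
}

# Print the last path component of a URL (the issue number).
issue_number() {
    local url="$1"
    echo "${url##*/}"
}

sample='# comment line
field1;field2;https://github.com/kata-containers/kata-containers/issues/123'

for url in $(echo "$sample" | issue_urls); do
    issue_number "$url"
done
```

Using the `${url##*/}` expansion instead of `awk -F/ '{print $NF}'` avoids spawning a process per URL; either form should give the same result for well-formed URLs.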
176
.github/workflows/release.yaml
vendored
@@ -1,176 +0,0 @@
|
||||
name: Publish Kata 2.x release artifacts
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- '2.*'
|
||||
|
||||
jobs:
|
||||
build-asset:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
asset:
|
||||
- cloud-hypervisor
|
||||
- firecracker
|
||||
- kernel
|
||||
- qemu
|
||||
- rootfs-image
|
||||
- rootfs-initrd
|
||||
- shim-v2
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: Install docker
|
||||
run: |
|
||||
curl -fsSL https://test.docker.com -o test-docker.sh
|
||||
sh test-docker.sh
|
||||
|
||||
- name: Build ${{ matrix.asset }}
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-binaries-in-docker.sh --build="${KATA_ASSET}"
|
||||
build_dir=$(readlink -f build)
|
||||
# store-artifact does not work with symlink
|
||||
sudo cp -r "${build_dir}" "kata-build"
|
||||
env:
|
||||
KATA_ASSET: ${{ matrix.asset }}
|
||||
TAR_OUTPUT: ${{ matrix.asset }}.tar.gz
|
||||
|
||||
- name: store-artifact ${{ matrix.asset }}
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-build/kata-static-${{ matrix.asset }}.tar.xz
|
||||
if-no-files-found: error
|
||||
|
||||
create-kata-tarball:
|
||||
runs-on: ubuntu-latest
|
||||
needs: build-asset
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-artifacts
|
||||
path: kata-artifacts
|
||||
- name: merge-artifacts
|
||||
run: |
|
||||
./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts
|
||||
- name: store-artifacts
|
||||
uses: actions/upload-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
path: kata-static.tar.xz
|
||||
|
||||
kata-deploy:
|
||||
needs: create-kata-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: get-kata-tarball
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
- name: build-and-push-kata-deploy-ci
|
||||
id: build-and-push-kata-deploy-ci
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
pushd $GITHUB_WORKSPACE
|
||||
git checkout $tag
|
||||
pkg_sha=$(git rev-parse HEAD)
|
||||
popd
|
||||
mv kata-static.tar.xz $GITHUB_WORKSPACE/tools/packaging/kata-deploy/kata-static.tar.xz
|
||||
docker build --build-arg KATA_ARTIFACTS=kata-static.tar.xz -t katadocker/kata-deploy-ci:$pkg_sha -t quay.io/kata-containers/kata-deploy-ci:$pkg_sha $GITHUB_WORKSPACE/tools/packaging/kata-deploy
|
||||
docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
|
||||
docker push katadocker/kata-deploy-ci:$pkg_sha
|
||||
docker login -u ${{ secrets.QUAY_DEPLOYER_USERNAME }} -p ${{ secrets.QUAY_DEPLOYER_PASSWORD }} quay.io
|
||||
docker push quay.io/kata-containers/kata-deploy-ci:$pkg_sha
|
||||
mkdir -p packaging/kata-deploy
|
||||
ln -s $GITHUB_WORKSPACE/tools/packaging/kata-deploy/action packaging/kata-deploy/action
|
||||
echo "::set-output name=PKG_SHA::${pkg_sha}"
|
||||
- name: test-kata-deploy-ci-in-aks
|
||||
uses: ./packaging/kata-deploy/action
|
||||
with:
|
||||
packaging-sha: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
env:
|
||||
PKG_SHA: ${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}}
|
||||
AZ_APPID: ${{ secrets.AZ_APPID }}
|
||||
AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}
|
||||
AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
|
||||
AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
|
||||
- name: push-tarball
|
||||
run: |
|
||||
# tag the container image we created and push to DockerHub
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tags=($tag)
|
||||
tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))
|
||||
for tag in ${tags[@]}; do \
|
||||
docker tag katadocker/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} katadocker/kata-deploy:${tag} && \
|
||||
docker tag quay.io/kata-containers/kata-deploy-ci:${{steps.build-and-push-kata-deploy-ci.outputs.PKG_SHA}} quay.io/kata-containers/kata-deploy:${tag} && \
|
||||
docker push katadocker/kata-deploy:${tag} && \
|
||||
docker push quay.io/kata-containers/kata-deploy:${tag}; \
|
||||
done
|
||||
|
||||
upload-static-tarball:
|
||||
needs: kata-deploy
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: download-artifacts
|
||||
uses: actions/download-artifact@v2
|
||||
with:
|
||||
name: kata-static-tarball
|
||||
- name: install hub
|
||||
run: |
|
||||
HUB_VER=$(curl -s "https://api.github.com/repos/github/hub/releases/latest" | jq -r .tag_name | sed 's/^v//')
|
||||
wget -q -O- https://github.com/github/hub/releases/download/v$HUB_VER/hub-linux-amd64-$HUB_VER.tgz | \
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub
|
||||
- name: push static tarball to github
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tarball="kata-static-$tag-x86_64.tar.xz"
|
||||
mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"
|
||||
pushd $GITHUB_WORKSPACE
|
||||
echo "uploading asset '${tarball}' for tag: ${tag}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
popd
|
||||
|
||||
upload-cargo-vendored-tarball:
|
||||
needs: upload-static-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: generate-and-upload-tarball
|
||||
run: |
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
tarball="kata-containers-$tag-vendor.tar.gz"
|
||||
pushd $GITHUB_WORKSPACE
|
||||
bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"
|
||||
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
popd
|
||||
|
||||
upload-libseccomp-tarball:
|
||||
needs: upload-cargo-vendored-tarball
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
- name: download-and-upload-tarball
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}
|
||||
GOPATH: ${HOME}/go
|
||||
run: |
|
||||
pushd $GITHUB_WORKSPACE
|
||||
./ci/install_yq.sh
|
||||
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
|
||||
versions_yaml="versions.yaml"
|
||||
version=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.version")
|
||||
repo_url=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.url")
|
||||
download_url="${repo_url}/releases/download/v${version}"
|
||||
tarball="libseccomp-${version}.tar.gz"
|
||||
asc="${tarball}.asc"
|
||||
curl -sSLO "${download_url}/${tarball}"
|
||||
curl -sSLO "${download_url}/${asc}"
|
||||
# "-m" option should be empty to re-use the existing release title
|
||||
# without opening a text editor.
|
||||
# For the details, check https://hub.github.com/hub-release.1.html.
|
||||
hub release edit -m "" -a "${tarball}" "${tag}"
|
||||
hub release edit -m "" -a "${asc}" "${tag}"
|
||||
popd
|
||||
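Several jobs in the release workflow above derive the tag from `$GITHUB_REF` with `cut -d/ -f3-`, and the `push-tarball` step additionally tags pre-releases as `latest` and everything else as `stable`. A minimal sketch of both rules follows, assuming a ref of the form `refs/tags/<tag>`; the function names are illustrative and not part of the workflow.

```bash
#!/usr/bin/env bash
# Sketch only: ref_to_tag and extra_tag are illustrative names.

# "refs/tags/2.4.0" -> "2.4.0" (fields 3 onward of the '/'-separated ref).
ref_to_tag() {
    echo "$1" | cut -d/ -f3-
}

# Tags containing "alpha" or "rc" also get "latest"; all others get "stable".
extra_tag() {
    local tag="$1"
    if [[ "$tag" =~ alpha|rc ]]; then
        echo "latest"
    else
        echo "stable"
    fi
}

tag=$(ref_to_tag "refs/tags/2.4.0-rc0")
echo "${tag} $(extra_tag "${tag}")"
```

Because `-f3-` keeps everything from the third field onward, a tag that itself contains a `/` would survive intact.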
54
.github/workflows/require-pr-porting-labels.yaml
vendored
@@ -1,54 +0,0 @@
|
||||
# Copyright (c) 2020 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
name: Ensure PR has required porting labels
|
||||
|
||||
on:
|
||||
pull_request_target:
|
||||
types:
|
||||
- opened
|
||||
- reopened
|
||||
- labeled
|
||||
- unlabeled
|
||||
branches:
|
||||
- main
|
||||
|
||||
jobs:
|
||||
check-pr-porting-labels:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Install hub
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
HUB_ARCH="amd64"
|
||||
HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\
|
||||
jq -r .tag_name | sed 's/^v//')
|
||||
curl -sL \
|
||||
"https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\
|
||||
tar xz --strip-components=2 --wildcards '*/bin/hub' && \
|
||||
sudo install hub /usr/local/bin
|
||||
|
||||
- name: Checkout code to allow hub to communicate with the project
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
|
||||
- name: Install porting checker script
|
||||
run: |
|
||||
# Clone into a temporary directory to avoid overwriting
|
||||
# any existing github directory.
|
||||
pushd $(mktemp -d) &>/dev/null
|
||||
git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts
|
||||
sudo install pr-porting-checks.sh /usr/local/bin
|
||||
popd &>/dev/null
|
||||
|
||||
- name: Stop PR being merged unless it has a correct set of porting labels
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}
|
||||
run: |
|
||||
pr=${{ github.event.number }}
|
||||
repo=${{ github.repository }}
|
||||
|
||||
pr-porting-checks.sh "$pr" "$repo"
|
||||
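The "Install hub" steps above turn the `tag_name` of the latest `github/hub` release into a download URL by stripping the leading `v`. The sketch below shows only that string handling, without any network access. `hub_url` is a hypothetical helper and `v2.14.2` is just a sample tag name.

```bash
#!/usr/bin/env bash
# Sketch only: hub_url is an illustrative helper; nothing is downloaded.

# Build the release tarball URL from a tag name such as "v2.14.2".
hub_url() {
    local tag_name="$1" arch="${2:-amd64}"
    local ver="${tag_name#v}"   # same effect as: sed 's/^v//'
    echo "https://github.com/github/hub/releases/download/v${ver}/hub-linux-${arch}-${ver}.tgz"
}

hub_url "v2.14.2"
```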
39
.github/workflows/snap-release.yaml
vendored
@@ -1,39 +0,0 @@
|
||||
name: Release Kata 2.x in snapcraft store
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- '2.*'
|
||||
jobs:
|
||||
release-snap:
|
||||
runs-on: ubuntu-20.04
|
||||
steps:
|
||||
- name: Check out Git repository
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
|
||||
- name: Install Snapcraft
|
||||
uses: samuelmeuli/action-snapcraft@v1
|
||||
with:
|
||||
snapcraft_token: ${{ secrets.snapcraft_token }}
|
||||
|
||||
- name: Build snap
|
||||
run: |
|
||||
sudo apt-get install -y git git-extras
|
||||
kata_url="https://github.com/kata-containers/kata-containers"
|
||||
latest_version=$(git ls-remote --tags ${kata_url} | egrep -o "refs.*" | egrep -v "\-alpha|\-rc|{}" | egrep -o "[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+" | sort -V -r | head -1)
|
||||
current_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"
|
||||
# Check semantic versioning format (x.y.z) and if the current tag is the latest tag
|
||||
if echo "${current_version}" | grep -q "^[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+$" && echo -e "$latest_version\n$current_version" | sort -C -V; then
|
||||
# Current version is the latest version, build it
|
||||
snapcraft -d snap --destructive-mode
|
||||
fi
|
||||
|
||||
- name: Upload snap
|
||||
run: |
|
||||
snap_version="$(echo ${GITHUB_REF} | cut -d/ -f3)"
|
||||
snap_file="kata-containers_${snap_version}_amd64.snap"
|
||||
# Upload the snap if it exists
|
||||
if [ -f ${snap_file} ]; then
|
||||
snapcraft upload --release=stable ${snap_file}
|
||||
fi
|
||||
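The "Build snap" step above only builds when the pushed tag is a plain `x.y.z` version and sorts at or after the newest released version. A minimal sketch of that gate, assuming GNU `sort` (for `-V` and `-C`); `should_build_snap` is an illustrative name, not something the workflow defines.

```bash
#!/usr/bin/env bash
# Sketch only: should_build_snap is an illustrative name.
# Requires GNU sort for version ordering (-V) and silent check mode (-C).

# Succeeds when $2 (current) is a plain x.y.z version and sorts at or
# after $1 (latest released version).
should_build_snap() {
    local latest="$1" current="$2"
    echo "$current" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$' || return 1
    printf '%s\n%s\n' "$latest" "$current" | sort -C -V
}

if should_build_snap "2.3.0" "2.4.0"; then
    echo "build"
else
    echo "skip"
fi
```

Pre-release tags such as `2.5.0-rc0` fail the format check, so under this sketch they are never built or uploaded.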
27
.github/workflows/snap.yaml
vendored
@@ -1,27 +0,0 @@
|
||||
name: snap CI
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- synchronize
|
||||
- reopened
|
||||
- edited
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-20.04
|
||||
steps:
|
||||
- name: Check out
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
|
||||
- name: Install Snapcraft
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: samuelmeuli/action-snapcraft@v1
|
||||
|
||||
- name: Build snap
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
snapcraft -d snap --destructive-mode
|
||||
96
.github/workflows/static-checks.yaml
vendored
@@ -1,96 +0,0 @@
|
||||
on:
|
||||
pull_request:
|
||||
types:
|
||||
- opened
|
||||
- edited
|
||||
- reopened
|
||||
- synchronize
|
||||
|
||||
name: Static checks
|
||||
jobs:
|
||||
test:
|
||||
strategy:
|
||||
matrix:
|
||||
go-version: [1.16.x, 1.17.x]
|
||||
os: [ubuntu-20.04]
|
||||
runs-on: ${{ matrix.os }}
|
||||
env:
|
||||
TRAVIS: "true"
|
||||
TRAVIS_BRANCH: ${{ github.base_ref }}
|
||||
TRAVIS_PULL_REQUEST_BRANCH: ${{ github.head_ref }}
|
||||
TRAVIS_PULL_REQUEST_SHA : ${{ github.event.pull_request.head.sha }}
|
||||
RUST_BACKTRACE: "1"
|
||||
target_branch: ${{ github.base_ref }}
|
||||
steps:
|
||||
- name: Install Go
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/setup-go@v2
|
||||
with:
|
||||
go-version: ${{ matrix.go-version }}
|
||||
env:
|
||||
GOPATH: ${{ runner.workspace }}/kata-containers
|
||||
- name: Setup GOPATH
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "TRAVIS_BRANCH: ${TRAVIS_BRANCH}"
|
||||
echo "TRAVIS_PULL_REQUEST_BRANCH: ${TRAVIS_PULL_REQUEST_BRANCH}"
|
||||
echo "TRAVIS_PULL_REQUEST_SHA: ${TRAVIS_PULL_REQUEST_SHA}"
|
||||
echo "TRAVIS: ${TRAVIS}"
|
||||
- name: Set env
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV
|
||||
echo "${{ github.workspace }}/bin" >> $GITHUB_PATH
|
||||
- name: Checkout code
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
uses: actions/checkout@v2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
path: ./src/github.com/${{ github.repository }}
|
||||
- name: Setup travis references
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
echo "TRAVIS_BRANCH=${TRAVIS_BRANCH:-$(echo $GITHUB_REF | awk 'BEGIN { FS = \"/\" } ; { print $3 }')}"
|
||||
target_branch=${TRAVIS_BRANCH}
|
||||
- name: Setup
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh
|
||||
env:
|
||||
GOPATH: ${{ runner.workspace }}/kata-containers
|
||||
- name: Installing rust
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh
|
||||
PATH=$PATH:"$HOME/.cargo/bin"
|
||||
rustup target add x86_64-unknown-linux-musl
|
||||
rustup component add rustfmt clippy
|
||||
- name: Setup seccomp
|
||||
run: |
|
||||
libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)
|
||||
gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"
|
||||
echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"
|
||||
echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV
|
||||
echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV
|
||||
# Check whether the vendored code is up-to-date & working as the first thing
|
||||
- name: Check vendored code
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make vendor
|
||||
- name: Static Checks
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make static-checks
|
||||
- name: Run Compiler Checks
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make check
|
||||
- name: Run Unit Tests
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && make test
|
||||
- name: Run Unit Tests As Root User
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}
|
||||
run: |
|
||||
cd ${GOPATH}/src/github.com/${{ github.repository }} && sudo -E PATH="$PATH" make test
|
||||
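The "Set env" and "Setup seccomp" steps above pass values to later steps by appending `NAME=value` lines to the file named by `$GITHUB_ENV`. A plain `echo` or shell assignment, as in "Setup travis references", is visible only inside its own step, because each `run` step is a separate shell process. The sketch below simulates the mechanism with a temporary file; `persist_env` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: outside GitHub Actions GITHUB_ENV is unset, so a temporary
# file stands in for it here. persist_env is an illustrative helper.
GITHUB_ENV="${GITHUB_ENV:-$(mktemp)}"

# Append NAME=value so that subsequent workflow steps would see it.
persist_env() {
    echo "$1=$2" >> "$GITHUB_ENV"
}

persist_env LIBSECCOMP_LINK_TYPE static
cat "$GITHUB_ENV"
```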
15
.gitignore
vendored
@@ -1,12 +1,5 @@
|
||||
**/*.bk
|
||||
**/*~
|
||||
**/*.orig
|
||||
**/*.rej
|
||||
/target
|
||||
**/*.rs.bk
|
||||
**/target
|
||||
**/.vscode
|
||||
pkg/logging/Cargo.lock
|
||||
src/agent/src/version.rs
|
||||
src/agent/kata-agent.service
|
||||
src/agent/protocols/src/*.rs
|
||||
!src/agent/protocols/src/lib.rs
|
||||
|
||||
Cargo.lock
|
||||
**/Cargo.lock
|
||||
|
||||
33
.travis.yml
Normal file
@@ -0,0 +1,33 @@
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
sudo: required
|
||||
dist: bionic
|
||||
|
||||
os:
|
||||
- linux
|
||||
|
||||
language: rust
|
||||
rust:
|
||||
- stable
|
||||
|
||||
env:
|
||||
- target_branch=$TRAVIS_BRANCH RUST_AGENT=yes
|
||||
|
||||
before_install:
|
||||
- "ci/setup.sh"
|
||||
- "ci/install_go.sh"
|
||||
- "ci/install_rust.sh"
|
||||
- "ci/static-checks.sh"
|
||||
|
||||
# need to install rust from scratch?
|
||||
# still need go to download github.com/kata-containers/tests
|
||||
# which is already installed?
|
||||
|
||||
install:
|
||||
- cd ${TRAVIS_BUILD_DIR}/src/agent && make
|
||||
|
||||
script:
|
||||
- cd ${TRAVIS_BUILD_DIR}/src/agent && make check
|
||||
12
CODEOWNERS
@@ -1,12 +0,0 @@
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
# Define any code owners for this repository.
|
||||
# The code owners lists are used to help automatically enforce
|
||||
# reviews and acks of the right groups on the right PRs.
|
||||
|
||||
# Order in this file is important. Only the last match will be
|
||||
# used. See https://help.github.com/articles/about-code-owners/
|
||||
|
||||
*.md @kata-containers/documentation
|
||||
@@ -2,4 +2,4 @@
|
||||
|
||||
## This repo is part of [Kata Containers](https://katacontainers.io)
|
||||
|
||||
For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).
|
||||
For details on how to contribute to the Kata Containers project, please see the main [contributing document](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md).
|
||||
@@ -1,3 +0,0 @@
|
||||
# Glossary
|
||||
|
||||
See the [project glossary hosted in the wiki](https://github.com/kata-containers/kata-containers/wiki/Glossary).
|
||||
45
Makefile
@@ -3,46 +3,5 @@
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
# List of available components
|
||||
COMPONENTS =
|
||||
|
||||
COMPONENTS += agent
|
||||
COMPONENTS += runtime
|
||||
|
||||
# List of available tools
|
||||
TOOLS =
|
||||
|
||||
TOOLS += agent-ctl
|
||||
TOOLS += trace-forwarder
|
||||
|
||||
STANDARD_TARGETS = build check clean install test vendor
|
||||
|
||||
default: all
|
||||
|
||||
all: logging-crate-tests build
|
||||
|
||||
logging-crate-tests:
|
||||
make -C src/libs/logging
|
||||
|
||||
include utils.mk
|
||||
include ./tools/packaging/kata-deploy/local-build/Makefile
|
||||
|
||||
# Create the rules
|
||||
$(eval $(call create_all_rules,$(COMPONENTS),$(TOOLS),$(STANDARD_TARGETS)))
|
||||
|
||||
# Non-standard rules
|
||||
|
||||
generate-protocols:
|
||||
make -C src/agent generate-protocols
|
||||
|
||||
# Some static checks rely on generated source files of components.
|
||||
static-checks: build
|
||||
bash ci/static-checks.sh
|
||||
|
||||
.PHONY: \
|
||||
all \
|
||||
binary-tarball \
|
||||
default \
|
||||
install-binary-tarball \
|
||||
logging-crate-tests \
|
||||
static-checks
|
||||
test:
|
||||
bash ci/go-test.sh
|
||||
|
||||
236
README.md
@@ -2,145 +2,143 @@
|
||||
|
||||
# Kata Containers
|
||||
|
||||
* [Raising issues](#raising-issues)
|
||||
* [Kata Containers repositories](#kata-containers-repositories)
|
||||
* [Code Repositories](#code-repositories)
|
||||
* [Kata Containers-developed components](#kata-containers-developed-components)
|
||||
* [Agent](#agent)
|
||||
* [KSM throttler](#ksm-throttler)
|
||||
* [Proxy](#proxy)
|
||||
* [Runtime](#runtime)
|
||||
* [Shim](#shim)
|
||||
* [Additional](#additional)
|
||||
* [Hypervisor](#hypervisor)
|
||||
* [Kernel](#kernel)
|
||||
* [CI](#ci)
|
||||
* [Community](#community)
|
||||
* [Documentation](#documentation)
|
||||
* [Packaging](#packaging)
|
||||
* [Test code](#test-code)
|
||||
* [Utilities](#utilities)
|
||||
* [OS builder](#os-builder)
|
||||
* [Web content](#web-content)
|
||||
|
||||
---
|
||||
|
||||
Welcome to Kata Containers!
|
||||
|
||||
This repository is the home of the Kata Containers code for the 2.0 and newer
|
||||
releases.
|
||||
The purpose of this repository is to act as a "top level" site for the project. Specifically it is used:
|
||||
|
||||
If you want to learn about Kata Containers, visit the main
|
||||
[Kata Containers website](https://katacontainers.io).
|
||||
- To provide a list of the various *other* [Kata Containers repositories](#kata-containers-repositories),
|
||||
along with a brief explanation of their purpose.
|
||||
|
||||
## Introduction
|
||||
- To provide a general area for [Raising Issues](#raising-issues).
|
||||
|
||||
Kata Containers is an open source project and community working to build a
|
||||
standard implementation of lightweight Virtual Machines (VMs) that feel and
|
||||
perform like containers, but provide the workload isolation and security
|
||||
advantages of VMs.
|
||||
## Raising issues
|
||||
|
||||
## License
|
||||
This repository is used for [raising
|
||||
issues](https://github.com/kata-containers/kata-containers/issues/new):
|
||||
|
||||
The code is licensed under the Apache 2.0 license.
|
||||
See [the license file](LICENSE) for further details.
|
||||
- That might affect multiple code repositories.
|
||||
|
||||
## Platform support
|
||||
|
||||
Kata Containers currently runs on 64-bit systems supporting the following
|
||||
technologies:
|
||||
|
||||
| Architecture | Virtualization technology |
|-|-|
| `x86_64`, `amd64` | [Intel](https://www.intel.com) VT-x, AMD SVM |
| `aarch64` ("`arm64`")| [ARM](https://www.arm.com) Hyp |
| `ppc64le` | [IBM](https://www.ibm.com) Power |
| `s390x` | [IBM](https://www.ibm.com) Z & LinuxONE SIE |
|
||||
|
||||
### Hardware requirements
|
||||
|
||||
The [Kata Containers runtime](src/runtime) provides a command to
|
||||
determine if your host system is capable of running and creating a
|
||||
Kata Container:
|
||||
|
||||
```bash
|
||||
$ kata-runtime check
|
||||
```
|
||||
|
||||
> **Notes:**
>
> - This command runs a number of checks including connecting to the
>   network to determine if a newer release of Kata Containers is
>   available on GitHub. If you do not wish this check to run, add
>   the `--no-network-checks` option.
>
> - By default, only a brief success / failure message is printed.
>   If more details are needed, the `--verbose` flag can be used to display the
>   list of all the checks performed.
>
> - If the command is run as the `root` user additional checks are
>   run (including checking if another incompatible hypervisor is running).
>   When running as `root`, network checks are automatically disabled.
|
||||
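The notes above describe two options of `kata-runtime check`. As a rough illustration of how they combine, the sketch below assembles the command line without running it, so `kata-runtime` does not need to be installed. `check_command` is a hypothetical helper; only the `--no-network-checks` and `--verbose` flags come from the text.

```bash
#!/usr/bin/env bash
# Sketch only: check_command is an illustrative helper that prints the
# command line instead of executing it.
check_command() {
    local offline="$1" verbose="$2"
    local cmd=(kata-runtime check)
    if [ "$offline" = "yes" ]; then cmd+=(--no-network-checks); fi
    if [ "$verbose" = "yes" ]; then cmd+=(--verbose); fi
    echo "${cmd[*]}"
}

check_command yes yes
```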
|
||||
## Getting started
|
||||
|
||||
See the [installation documentation](docs/install).
|
||||
|
||||
## Documentation
|
||||
|
||||
See the [official documentation](docs) including:
|
||||
|
||||
- [Installation guides](docs/install)
|
||||
- [Developer guide](docs/Developer-Guide.md)
|
||||
- [Design documents](docs/design)
|
||||
- [Architecture overview](docs/design/architecture)
|
||||
|
||||
## Configuration
|
||||
|
||||
Kata Containers uses a single
|
||||
[configuration file](src/runtime/README.md#configuration)
|
||||
which contains a number of sections for various parts of the Kata
|
||||
Containers system including the [runtime](src/runtime), the
|
||||
[agent](src/agent) and the [hypervisor](#hypervisors).
|
||||
|
||||
## Hypervisors
|
||||
|
||||
See the [hypervisors document](docs/hypervisors.md) and the
|
||||
[Hypervisor specific configuration details](src/runtime/README.md#hypervisor-specific-configuration).
|
||||
|
||||
## Community
|
||||
|
||||
To learn more about the project, its community and governance, see the
|
||||
[community repository](https://github.com/kata-containers/community). This is
|
||||
the first place to go if you wish to contribute to the project.
|
||||
|
||||
## Getting help
|
||||
|
||||
See the [community](#community) section for ways to contact us.
|
||||
|
||||
### Raising issues
|
||||
|
||||
Please raise an issue
|
||||
[in this repository](https://github.com/kata-containers/kata-containers/issues).
|
||||
- Where the raiser is unsure which repositories are affected.
|
||||
|
||||
> **Note:**
|
||||
> If you are reporting a security issue, please follow the [vulnerability reporting process](https://github.com/kata-containers/community#vulnerability-handling).
|
||||
>
|
||||
> - If an issue affects only a single component, it should be raised in that
>   component's repository.
|
||||
|
||||
## Developers
|
||||
## Kata Containers repositories
|
||||
|
||||
See the [developer guide](docs/Developer-Guide.md).
|
||||
### CI
|
||||
|
||||
### Components
|
||||
The [CI](https://github.com/kata-containers/ci) repository stores the Continuous
|
||||
Integration (CI) system configuration information.
|
||||
|
||||
### Main components
|
||||
### Community
|
||||
|
||||
The table below lists the core parts of the project:
|
||||
The [Community](https://github.com/kata-containers/community) repository is
|
||||
the first place to go if you want to use or contribute to the project.
|
||||
|
||||
| Component | Type | Description |
|-|-|-|
| [runtime](src/runtime) | core | Main component run by a container manager and providing a containerd shimv2 runtime implementation. |
| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |
| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |
| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |
|
||||
### Code Repositories
|
||||
|
||||
### Additional components
|
||||
#### Kata Containers-developed components
|
||||
|
||||
The table below lists the remaining parts of the project:
|
||||
##### Agent
|
||||
|
||||
| Component | Type | Description |
|-|-|-|
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |
| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |
| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |
|
||||
The [`kata-agent`](https://github.com/kata-containers/agent) runs inside the
|
||||
virtual machine and sets up the container environment.
|
||||
|
||||
### Packaging and releases
|
||||
##### KSM throttler
|
||||
|
||||
Kata Containers is now
|
||||
[available natively for most distributions](docs/install/README.md#packaged-installation-methods).
|
||||
However, packaging scripts and metadata are still used to generate snap and GitHub releases. See
|
||||
the [components](#components) section for further details.
|
||||
The [`kata-ksm-throttler`](https://github.com/kata-containers/ksm-throttler)
|
||||
is an optional utility that monitors containers and deduplicates memory to
|
||||
maximize container density on a host.
|
||||
|
||||
## Glossary of Terms
|
||||
##### Proxy
|
||||
|
||||
See the [glossary of terms](https://github.com/kata-containers/kata-containers/wiki/Glossary) related to Kata Containers.
|
||||
The [`kata-proxy`](https://github.com/kata-containers/proxy) is a process that
|
||||
runs on the host and co-ordinates access to the agent running inside the
|
||||
virtual machine.
|
||||
|
||||
##### Runtime
|
||||
|
||||
The [`kata-runtime`](https://github.com/kata-containers/runtime) is usually
|
||||
invoked by a container manager and provides high-level verbs to manage
|
||||
containers.
|
||||
|
||||
##### Shim
|
||||
|
||||
The [`kata-shim`](https://github.com/kata-containers/shim) is a process that
|
||||
runs on the host. It acts as though it is the workload (which actually runs
|
||||
inside the virtual machine). This shim is required to be compliant with the
|
||||
expectations of the [OCI runtime
|
||||
specification](https://github.com/opencontainers/runtime-spec).
|
||||
|
||||
#### Additional
|
||||
|
||||
##### Hypervisor
|
||||
|
||||
The [`qemu`](https://github.com/kata-containers/qemu) hypervisor is used to
|
||||
create virtual machines for hosting the containers.
|
||||
|
||||
##### Kernel
|
||||
|
||||
The hypervisor uses a [Linux\* kernel](https://github.com/kata-containers/linux) to boot the guest image.
|
||||
|
||||
### Documentation
|
||||
|
||||
The [documentation](https://github.com/kata-containers/documentation)
|
||||
repository hosts documentation common to all code components.
|
||||
|
||||
### Packaging
|
||||
|
||||
We use the [packaging](https://github.com/kata-containers/packaging)
|
||||
repository to create packages for the [system
|
||||
components](#kata-containers-developed-components) including
|
||||
[rootfs](#os-builder) and [kernel](#kernel) images.
|
||||
|
||||
### Test code
|
||||
|
||||
The [tests](https://github.com/kata-containers/tests) repository hosts all
|
||||
test code except the unit testing code (which is kept in the same repository
|
||||
as the component it tests).
|
||||
|
||||
### Utilities
|
||||
|
||||
#### OS builder
|
||||
|
||||
The [osbuilder](https://github.com/kata-containers/osbuilder) tool can create
a rootfs and a "mini O/S" image. This image is used by the hypervisor to set up
the environment before switching to the workload.
|
||||
|
||||
### Web content
|
||||
|
||||
The
|
||||
[www.katacontainers.io](https://github.com/kata-containers/www.katacontainers.io)
|
||||
repository contains all sources for the https://www.katacontainers.io site.
|
||||
|
||||
## Credits
|
||||
|
||||
Kata Containers uses [packagecloud](https://packagecloud.io) for package
|
||||
hosting.
|
||||
|
||||
@@ -1,42 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2022 Apple Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
cidir=$(dirname "$0")
|
||||
runtimedir=$cidir/../src/runtime
|
||||
|
||||
build_working_packages() {
|
||||
# working packages:
|
||||
device_api=$runtimedir/virtcontainers/device/api
|
||||
device_config=$runtimedir/virtcontainers/device/config
|
||||
device_drivers=$runtimedir/virtcontainers/device/drivers
|
||||
device_manager=$runtimedir/virtcontainers/device/manager
|
||||
rc_pkg_dir=$runtimedir/pkg/resourcecontrol/
|
||||
utils_pkg_dir=$runtimedir/virtcontainers/utils
|
||||
|
||||
# broken packages :( :
|
||||
#katautils=$runtimedir/pkg/katautils
|
||||
#oci=$runtimedir/pkg/oci
|
||||
#vc=$runtimedir/virtcontainers
|
||||
|
||||
pkgs=(
|
||||
"$device_api"
|
||||
"$device_config"
|
||||
"$device_drivers"
|
||||
"$device_manager"
|
||||
"$utils_pkg_dir"
|
||||
"$rc_pkg_dir")
|
||||
for pkg in "${pkgs[@]}"; do
|
||||
echo building "$pkg"
|
||||
pushd "$pkg" &>/dev/null
|
||||
go build
|
||||
go test
|
||||
popd &>/dev/null
|
||||
done
|
||||
}
|
||||
|
||||
build_working_packages
|
||||
@@ -1,4 +1,3 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2020 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2019 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,109 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright 2021 Sony Group Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -o errexit
|
||||
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
|
||||
clone_tests_repo
|
||||
|
||||
source "${tests_repo_dir}/.ci/lib.sh"
|
||||
|
||||
# The following variables if set on the environment will change the behavior
|
||||
# of gperf and libseccomp configure scripts, that may lead this script to
|
||||
# fail. So let's ensure they are unset here.
|
||||
unset PREFIX DESTDIR
|
||||
|
||||
arch=$(uname -m)
|
||||
workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"
|
||||
|
||||
# Variables for libseccomp
|
||||
# Currently, specify the libseccomp version directly without using `versions.yaml`
|
||||
# because the current Snap workflow is incomplete.
|
||||
# After solving the issue, replace this code by using the `versions.yaml`.
|
||||
# libseccomp_version=$(get_version "externals.libseccomp.version")
|
||||
# libseccomp_url=$(get_version "externals.libseccomp.url")
|
||||
libseccomp_version="2.5.1"
|
||||
libseccomp_url="https://github.com/seccomp/libseccomp"
|
||||
libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"
|
||||
libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"
|
||||
cflags="-O2"
|
||||
|
||||
# Variables for gperf
|
||||
# Currently, specify the gperf version directly without using `versions.yaml`
|
||||
# because the current Snap workflow is incomplete.
|
||||
# After solving the issue, replace this code by using the `versions.yaml`.
|
||||
# gperf_version=$(get_version "externals.gperf.version")
|
||||
# gperf_url=$(get_version "externals.gperf.url")
|
||||
gperf_version="3.1"
|
||||
gperf_url="https://ftp.gnu.org/gnu/gperf"
|
||||
gperf_tarball="gperf-${gperf_version}.tar.gz"
|
||||
gperf_tarball_url="${gperf_url}/${gperf_tarball}"
|
||||
|
||||
# We need to build the libseccomp library from sources to create a static library for the musl libc.
|
||||
# However, ppc64le and s390x have no musl targets in Rust. Hence, we do not set cflags for the musl libc.
|
||||
if ([ "${arch}" != "ppc64le" ] && [ "${arch}" != "s390x" ]); then
|
||||
# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2
|
||||
cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"
|
||||
fi
|
||||
|
||||
die() {
|
||||
msg="$*"
|
||||
echo "[Error] ${msg}" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
finish() {
|
||||
rm -rf "${workdir}"
|
||||
}
|
||||
|
||||
trap finish EXIT
|
||||
|
||||
build_and_install_gperf() {
|
||||
echo "Build and install gperf version ${gperf_version}"
|
||||
mkdir -p "${gperf_install_dir}"
|
||||
curl -sLO "${gperf_tarball_url}"
|
||||
tar -xf "${gperf_tarball}"
|
||||
pushd "gperf-${gperf_version}"
|
||||
./configure --prefix="${gperf_install_dir}"
|
||||
make
|
||||
make install
|
||||
export PATH=$PATH:"${gperf_install_dir}"/bin
|
||||
popd
|
||||
echo "Gperf installed successfully"
|
||||
}
|
||||
|
||||
build_and_install_libseccomp() {
|
||||
echo "Build and install libseccomp version ${libseccomp_version}"
|
||||
mkdir -p "${libseccomp_install_dir}"
|
||||
curl -sLO "${libseccomp_tarball_url}"
|
||||
tar -xf "${libseccomp_tarball}"
|
||||
pushd "libseccomp-${libseccomp_version}"
|
||||
./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static
|
||||
make
|
||||
make install
|
||||
popd
|
||||
echo "Libseccomp installed successfully"
|
||||
}
|
||||
|
||||
main() {
|
||||
local libseccomp_install_dir="${1:-}"
|
||||
local gperf_install_dir="${2:-}"
|
||||
|
||||
if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then
|
||||
die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"
|
||||
fi
|
||||
|
||||
pushd "$workdir"
|
||||
# gperf is required for building the libseccomp.
|
||||
build_and_install_gperf
|
||||
build_and_install_libseccomp
|
||||
popd
|
||||
}
|
||||
|
||||
main "$@"
|
||||
@@ -1,24 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Copyright (c) 2020 Ant Group
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
set -e
|
||||
|
||||
install_aarch64_musl() {
|
||||
local arch=$(uname -m)
|
||||
if [ "${arch}" == "aarch64" ]; then
|
||||
local musl_tar="${arch}-linux-musl-native.tgz"
|
||||
local musl_dir="${arch}-linux-musl-native"
|
||||
pushd /tmp
|
||||
if curl -sLO --fail https://musl.cc/${musl_tar}; then
|
||||
tar -zxf ${musl_tar}
|
||||
mkdir -p /usr/local/musl/
|
||||
cp -r ${musl_dir}/* /usr/local/musl/
|
||||
fi
|
||||
popd
|
||||
fi
|
||||
}
|
||||
|
||||
install_aarch64_musl
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
@@ -12,5 +12,5 @@ source "${cidir}/lib.sh"
|
||||
clone_tests_repo
|
||||
|
||||
pushd ${tests_repo_dir}
|
||||
.ci/install_rust.sh ${1:-}
|
||||
.ci/install_rust.sh
|
||||
popd
|
||||
|
||||
@@ -1,19 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -e
|
||||
|
||||
cidir=$(dirname "$0")
|
||||
vcdir="${cidir}/../src/runtime/virtcontainers/"
|
||||
source "${cidir}/lib.sh"
|
||||
export CI_JOB="${CI_JOB:-default}"
|
||||
|
||||
clone_tests_repo
|
||||
|
||||
if [ "${CI_JOB}" != "PODMAN" ]; then
|
||||
echo "Install virtcontainers"
|
||||
make -C "${vcdir}" && chronic sudo make -C "${vcdir}" install
|
||||
fi
|
||||
@@ -1,77 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Copyright (c) 2019 IBM
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
# If we fail for any reason a message will be displayed
|
||||
die() {
|
||||
msg="$*"
|
||||
echo "ERROR: $msg" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Install the yq yaml query package from the mikefarah github repo
|
||||
# Install via binary download, as we may not have golang installed at this point
|
||||
function install_yq() {
|
||||
local yq_pkg="github.com/mikefarah/yq"
|
||||
local yq_version=3.4.1
|
||||
INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}
|
||||
|
||||
if [ "${INSTALL_IN_GOPATH}" == "true" ];then
|
||||
GOPATH=${GOPATH:-${HOME}/go}
|
||||
mkdir -p "${GOPATH}/bin"
|
||||
local yq_path="${GOPATH}/bin/yq"
|
||||
else
|
||||
yq_path="/usr/local/bin/yq"
|
||||
fi
|
||||
[ -x "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return
|
||||
|
||||
read -r -a sysInfo <<< "$(uname -sm)"
|
||||
|
||||
case "${sysInfo[0]}" in
|
||||
"Linux" | "Darwin")
|
||||
goos="${sysInfo[0],}"
|
||||
;;
*)
|
||||
die "OS ${sysInfo[0]} not supported"
|
||||
;;
|
||||
esac
|
||||
|
||||
case "${sysInfo[1]}" in
|
||||
"aarch64")
|
||||
goarch=arm64
|
||||
;;
|
||||
"ppc64le")
|
||||
goarch=ppc64le
|
||||
;;
|
||||
"x86_64")
|
||||
goarch=amd64
|
||||
;;
|
||||
"s390x")
|
||||
goarch=s390x
|
||||
;;
*)
|
||||
die "Arch ${sysInfo[1]} not supported"
|
||||
;;
|
||||
esac
|
||||
|
||||
|
||||
# Check curl
|
||||
if ! command -v "curl" >/dev/null; then
|
||||
die "Please install curl"
|
||||
fi
|
||||
|
||||
## NOTE: ${var,,} => gives lowercase value of var
|
||||
local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos,,}_${goarch}"
|
||||
curl -o "${yq_path}" -LSsf "${yq_url}"
|
||||
[ $? -ne 0 ] && die "Download ${yq_url} failed"
|
||||
chmod +x "${yq_path}"
|
||||
|
||||
if ! command -v "${yq_path}" >/dev/null; then
|
||||
die "Cannot get ${yq_path} executable"
|
||||
fi
|
||||
}
|
||||
|
||||
install_yq
|
||||
35
ci/lib.sh
@@ -3,40 +3,29 @@
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
|
||||
set -o nounset
|
||||
|
||||
export tests_repo="${tests_repo:-github.com/kata-containers/tests}"
|
||||
export tests_repo_dir="$GOPATH/src/$tests_repo"
|
||||
export branch="${target_branch:-main}"
|
||||
|
||||
# Clones the tests repository and checkout to the branch pointed out by
|
||||
# the global $branch variable.
|
||||
# If the clone exists and `CI` is exported then it does nothing. Otherwise
|
||||
# it will clone the repository or `git pull` the latest code.
|
||||
#
|
||||
clone_tests_repo()
|
||||
{
|
||||
if [ -d "$tests_repo_dir" ]; then
|
||||
[ -n "${CI:-}" ] && return
|
||||
pushd "${tests_repo_dir}"
|
||||
git checkout "${branch}"
|
||||
git pull
|
||||
popd
|
||||
else
|
||||
git clone -q "https://${tests_repo}" "$tests_repo_dir"
|
||||
pushd "${tests_repo_dir}"
|
||||
git checkout "${branch}"
|
||||
popd
|
||||
# KATA_CI_NO_NETWORK is (has to be) ignored if there is
|
||||
# no existing clone.
|
||||
if [ -d "$tests_repo_dir" -a -n "$KATA_CI_NO_NETWORK" ]
|
||||
then
|
||||
return
|
||||
fi
|
||||
|
||||
go get -d -u "$tests_repo" || true
|
||||
|
||||
if [ -n "${TRAVIS_BRANCH:-}" ]; then
|
||||
( cd "${tests_repo_dir}" && git checkout "${TRAVIS_BRANCH}" )
|
||||
fi
|
||||
}
|
||||
|
||||
run_static_checks()
|
||||
{
|
||||
clone_tests_repo
|
||||
# Make sure we have the targeting branch
|
||||
git remote set-branches --add origin "${branch}"
|
||||
git fetch -a
|
||||
bash "$tests_repo_dir/.ci/static-checks.sh" "$@"
|
||||
bash "$tests_repo_dir/.ci/static-checks.sh" "github.com/kata-containers/kata-containers"
|
||||
}
|
||||
|
||||
run_go_test()
|
||||
|
||||
@@ -1,14 +0,0 @@
|
||||
# Copyright (c) 2021 Red Hat, Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
# This is the build root image for Kata Containers on OpenShift CI.
|
||||
#
|
||||
FROM quay.io/centos/centos:stream8
|
||||
|
||||
RUN yum -y update && \
|
||||
yum -y install \
|
||||
git \
|
||||
sudo \
|
||||
wget && \
|
||||
yum clean all
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2019 Ant Financial
|
||||
#
|
||||
@@ -8,14 +8,9 @@
|
||||
set -e
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
export CI_JOB="${CI_JOB:-}"
|
||||
|
||||
clone_tests_repo
|
||||
|
||||
pushd ${tests_repo_dir}
|
||||
.ci/run.sh
|
||||
# temporary fix, see https://github.com/kata-containers/tests/issues/3878
|
||||
if [ "$(uname -m)" != "s390x" ] && [ "$CI_JOB" == "CRI_CONTAINERD_K8S_MINIMAL" ]; then
|
||||
tracing/test-agent-shutdown.sh
|
||||
fi
|
||||
popd
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
#!/usr/bin/env bash
|
||||
#!/bin/bash
|
||||
#
|
||||
# Copyright (c) 2017-2018 Intel Corporation
|
||||
#
|
||||
@@ -9,4 +9,4 @@ set -e
|
||||
cidir=$(dirname "$0")
|
||||
source "${cidir}/lib.sh"
|
||||
|
||||
run_static_checks "${@:-github.com/kata-containers/kata-containers}"
|
||||
run_static_checks
|
||||
|
||||
@@ -1,716 +0,0 @@
|
||||
# Warning
|
||||
|
||||
This document is written **specifically for developers**: it is not intended for end users.
|
||||
|
||||
# Assumptions
|
||||
|
||||
- You are working on a non-critical test or development system.
|
||||
|
||||
# Initial setup
|
||||
|
||||
The recommended way to create a development environment is to first
|
||||
[install the packaged versions of the Kata Containers components](install/README.md)
|
||||
to create a working system.
|
||||
|
||||
The installation guide instructions will install all required Kata Containers
|
||||
components, plus *Docker*, the hypervisor, and the Kata Containers image and
|
||||
guest kernel.
|
||||
|
||||
# Requirements to build individual components
|
||||
|
||||
You need to install the following to build Kata Containers components:
|
||||
|
||||
- [golang](https://golang.org/dl)
|
||||
|
||||
To view the versions of go known to work, see the `golang` entry in the
|
||||
[versions database](../versions.yaml).
|
||||
|
||||
- [rust](https://www.rust-lang.org/tools/install)
|
||||
|
||||
To view the versions of rust known to work, see the `rust` entry in the
|
||||
[versions database](../versions.yaml).
|
||||
|
||||
- `make`.
|
||||
- `gcc` (required for building the shim and runtime).
|
||||
|
||||
# Build and install the Kata Containers runtime
|
||||
|
||||
```
|
||||
$ go get -d -u github.com/kata-containers/kata-containers
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/src/runtime
|
||||
$ make && sudo -E PATH=$PATH make install
|
||||
```
|
||||
|
||||
The build will create the following:
|
||||
|
||||
- runtime binary: `/usr/local/bin/kata-runtime` and `/usr/local/bin/containerd-shim-kata-v2`
|
||||
- configuration file: `/usr/share/defaults/kata-containers/configuration.toml`
|
||||
|
||||
# Check hardware requirements
|
||||
|
||||
You can check if your system is capable of creating a Kata Container by running the following:
|
||||
|
||||
```
|
||||
$ sudo kata-runtime check
|
||||
```
|
||||
|
||||
If your system is *not* able to run Kata Containers, the previous command will error out and explain why.
|
||||
|
||||
## Configure to use initrd or rootfs image
|
||||
|
||||
Kata Containers can run with either an initrd image or a rootfs image.
|
||||
|
||||
If you want to test with `initrd`, make sure you have `initrd = /usr/share/kata-containers/kata-containers-initrd.img`
in your configuration file, and comment out the `image` line. The default configuration file is
`/usr/share/defaults/kata-containers/configuration.toml`; the following example copies it to `/etc/kata-containers/` and edits the copy:
|
||||
|
||||
```
|
||||
$ sudo mkdir -p /etc/kata-containers/
|
||||
$ sudo install -o root -g root -m 0640 /usr/share/defaults/kata-containers/configuration.toml /etc/kata-containers
|
||||
$ sudo sed -i 's/^\(image =.*\)/# \1/g' /etc/kata-containers/configuration.toml
|
||||
```
|
||||
You can create the initrd image as shown in the [create an initrd image](#create-an-initrd-image---optional) section.
|
||||
|
||||
If you want to test with a rootfs `image`, make sure you have `image = /usr/share/kata-containers/kata-containers.img`
|
||||
in your configuration file, commenting out the `initrd` line. For example:
|
||||
|
||||
```
|
||||
$ sudo mkdir -p /etc/kata-containers/
|
||||
$ sudo install -o root -g root -m 0640 /usr/share/defaults/kata-containers/configuration.toml /etc/kata-containers
|
||||
$ sudo sed -i 's/^\(initrd =.*\)/# \1/g' /etc/kata-containers/configuration.toml
|
||||
```
|
||||
The rootfs image is created as shown in the [create a rootfs image](#create-a-rootfs-image) section.
|
||||
|
||||
Exactly one of the `initrd` and `image` options in the Kata runtime configuration file **MUST** be set; do **not** set both.
The main difference between the options is that an `initrd` (10MB+) is significantly smaller than a
rootfs `image` (100MB+).
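
The "one but not both" rule can be checked mechanically before starting a container. The following is a minimal sketch, not part of Kata Containers: `check_guest_image_config` is a hypothetical helper name, and it assumes the options appear as uncommented `image = ...` and `initrd = ...` lines, as in the `sed` commands above.

```shell
# Hypothetical helper (not shipped with Kata Containers): report which guest
# image option is active in the configuration file given as the first argument.
check_guest_image_config() {
    local config="$1"
    local image_set initrd_set

    # Count uncommented "image =" and "initrd =" lines.
    # grep exits non-zero when the count is 0, hence the "|| true".
    image_set=$(grep -Ec '^[[:space:]]*image[[:space:]]*=' "$config" || true)
    initrd_set=$(grep -Ec '^[[:space:]]*initrd[[:space:]]*=' "$config" || true)

    if [ "$image_set" -gt 0 ] && [ "$initrd_set" -gt 0 ]; then
        echo "error: both image and initrd are set" >&2
        return 1
    elif [ "$image_set" -eq 0 ] && [ "$initrd_set" -eq 0 ]; then
        echo "error: neither image nor initrd is set" >&2
        return 1
    elif [ "$image_set" -gt 0 ]; then
        echo "image"
    else
        echo "initrd"
    fi
}
```

For example, `check_guest_image_config /etc/kata-containers/configuration.toml` prints `image` or `initrd` when the file is consistent, and returns a non-zero status otherwise.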
|
||||
|
||||
## Enable seccomp
|
||||
|
||||
Enable seccomp as follows:
|
||||
|
||||
```
|
||||
$ sudo sed -i '/^disable_guest_seccomp/ s/true/false/' /etc/kata-containers/configuration.toml
|
||||
```
|
||||
|
||||
This will pass container seccomp profiles to the Kata agent.
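
To confirm that the edit took effect, the setting can be read back from the file. This is a small sketch under the assumption that the option is written as `disable_guest_seccomp=...` on its own line; `guest_seccomp_enabled` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) only if the configuration file
# given as the first argument has disable_guest_seccomp set to false.
guest_seccomp_enabled() {
    local config="$1"
    grep -Eq '^[[:space:]]*disable_guest_seccomp[[:space:]]*=[[:space:]]*false' "$config"
}
```

It can be used as `guest_seccomp_enabled /etc/kata-containers/configuration.toml && echo "seccomp profiles will be passed to the agent"`.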
|
||||
|
||||
## Enable full debug
|
||||
|
||||
Enable full debug as follows:
|
||||
|
||||
```
|
||||
$ sudo mkdir -p /etc/kata-containers/
|
||||
$ sudo install -o root -g root -m 0640 /usr/share/defaults/kata-containers/configuration.toml /etc/kata-containers
|
||||
$ sudo sed -i -e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' /etc/kata-containers/configuration.toml
|
||||
$ sudo sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.log=debug initcall_debug"/g' /etc/kata-containers/configuration.toml
|
||||
```
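
Because the `sed` expressions above rewrite the configuration in place, it can help to preview their effect first. The sketch below applies the same two expressions to a scratch copy and prints the differences; `preview_debug_edits` is a hypothetical helper, not a Kata tool, and it assumes GNU `sed` (for `-i`).

```shell
# Hypothetical helper: show what the full-debug edits would change in the
# configuration file given as the first argument, without modifying it.
preview_debug_edits() {
    local src="$1" scratch
    scratch=$(mktemp)
    cp "$src" "$scratch"

    # Same expressions as in the instructions above, applied to the copy.
    sed -i -e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' "$scratch"
    sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.log=debug initcall_debug"/g' "$scratch"

    # diff exits non-zero when the files differ, which is the expected case.
    diff "$src" "$scratch" || true
    rm -f "$scratch"
}
```

Running `preview_debug_edits /usr/share/defaults/kata-containers/configuration.toml` lists the lines that would be uncommented or extended, leaving the original file untouched.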
|
||||
|
||||
### debug logs and shimv2
|
||||
|
||||
If you are using `containerd` and the Kata `containerd-shimv2` to launch Kata Containers, and wish
|
||||
to enable Kata debug logging, there are two ways this can be enabled via the `containerd` configuration file,
|
||||
detailed below.
|
||||
|
||||
The Kata logs appear in the `containerd` log files, along with logs from `containerd` itself.
|
||||
|
||||
For more information about `containerd` debug, please see the
|
||||
[`containerd` documentation](https://github.com/containerd/containerd/blob/master/docs/getting-started.md).
|
||||
|
||||
#### Enabling full `containerd` debug
|
||||
|
||||
Enabling full `containerd` debug also enables the shimv2 debug. Edit the `containerd` configuration file
|
||||
to include the top level debug option such as:
|
||||
|
||||
```toml
|
||||
[debug]
|
||||
level = "debug"
|
||||
```
|
||||
|
||||
#### Enabling just `containerd shim` debug
|
||||
|
||||
If you only wish to enable debug for the `containerd` shims themselves, just enable the debug
|
||||
option in the `plugins.linux` section of the `containerd` configuration file, such as:
|
||||
|
||||
```toml
|
||||
[plugins.linux]
|
||||
shim_debug = true
|
||||
```
|
||||
|
||||
#### Enabling `CRI-O` and `shimv2` debug
|
||||
|
||||
Depending on the CRI-O version being used one of the following configuration files can
|
||||
be found: `/etc/crio/crio.conf` or `/etc/crio/crio.conf.d/00-default`.
|
||||
|
||||
If the latter is found, the change must be done there as it'll take precedence, overriding
|
||||
`/etc/crio/crio.conf`.
|
||||
|
||||
```toml
|
||||
# Changes the verbosity of the logs based on the level it is set to. Options
|
||||
# are fatal, panic, error, warn, info, debug and trace. This option supports
|
||||
# live configuration reload.
|
||||
log_level = "info"
|
||||
```
|
||||
|
||||
Switching the default `log_level` from `info` to `debug` enables shimv2 debug logs.
|
||||
CRI-O logs can be found by using the `crio` identifier, and Kata specific logs can
|
||||
be found by using the `kata` identifier.
|
||||
|
||||
### journald rate limiting
|
||||
|
||||
Enabling [full debug](#enable-full-debug) results in the Kata components generating
|
||||
large amounts of logging, which by default is stored in the system log. Depending on
|
||||
your system configuration, it is possible that some events might be discarded by the
|
||||
system logging daemon. The following shows how to determine this for `systemd-journald`,
|
||||
and offers possible workarounds and fixes.
|
||||
|
||||
> **Note** The method of implementation can vary between Operating System installations.
|
||||
> Amend these instructions as necessary to your system implementation,
|
||||
> and consult with your system administrator for the appropriate configuration.
|
||||
|
||||
#### `systemd-journald` suppressing messages
|
||||
|
||||
`systemd-journald` can be configured to rate limit the number of journal entries
|
||||
it stores. When messages are suppressed, it is noted in the logs. This can be checked
|
||||
for by looking for those notifications, such as:
|
||||
|
||||
```sh
|
||||
$ sudo journalctl --since today | grep -F Suppressed
|
||||
Jun 29 14:51:17 mymachine systemd-journald[346]: Suppressed 4150 messages from /system.slice/docker.service
|
||||
```
|
||||
|
||||
This message indicates that a number of log messages from the `docker.service` slice were
|
||||
suppressed. In such a case, you can expect to have incomplete logging information
|
||||
stored from the Kata Containers components.
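
When many such notifications are present, a per-unit total shows how much logging was lost. The following sketch assumes the message format shown in the sample above (`Suppressed N messages from UNIT`); `summarize_suppressed` is a hypothetical helper name.

```shell
# Hypothetical helper: read journal text on stdin and print, for each unit,
# the total number of messages that systemd-journald reported as suppressed.
summarize_suppressed() {
    awk '/Suppressed [0-9]+ messages from/ {
        for (i = 1; i <= NF; i++) {
            # Fields after "Suppressed": count, "messages", "from", unit.
            if ($i == "Suppressed") { count[$(i + 4)] += $(i + 1) }
        }
    }
    END { for (unit in count) print count[unit], unit }'
}
```

A possible invocation is `sudo journalctl --since today | summarize_suppressed`. The order of units in the output is not defined.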
|
||||
|
||||
#### Disabling `systemd-journald` rate limiting
|
||||
|
||||
In order to capture complete logs from the Kata Containers components, you
|
||||
need to reduce or disable the `systemd-journald` rate limit. Configure
|
||||
this at the global `systemd-journald` level, and it will apply to all system slices.
|
||||
|
||||
To disable `systemd-journald` rate limiting at the global level, edit the file
|
||||
`/etc/systemd/journald.conf`, and add/uncomment the following lines:
|
||||
|
||||
```
|
||||
RateLimitInterval=0s
|
||||
RateLimitBurst=0
|
||||
```
|
||||
|
||||
Restart `systemd-journald` for the changes to take effect:
|
||||
|
||||
```sh
|
||||
$ sudo systemctl restart systemd-journald
|
||||
```
|
||||
|
||||
# Create and install rootfs and initrd image
|
||||
|
||||
## Build a custom Kata agent - OPTIONAL
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - You should only do this step if you are testing with the latest version of the agent.
|
||||
|
||||
The agent is statically linked against `musl` by default; on `ppc64le` and `s390x`, `gnu` should be used instead. To configure this:
|
||||
|
||||
```
|
||||
$ export ARCH=$(uname -m)
|
||||
$ if [ "$ARCH" = "ppc64le" -o "$ARCH" = "s390x" ]; then export LIBC=gnu; else export LIBC=musl; fi
|
||||
$ [ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
|
||||
$ rustup target add ${ARCH}-unknown-linux-${LIBC}
|
||||
```
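
The architecture-to-target mapping in the commands above can also be expressed as a single function, which avoids overwriting `ARCH` in the current shell. This is a sketch only; `rust_agent_target` is a hypothetical name.

```shell
# Hypothetical helper: print the Rust target triple for the agent build.
# Takes an optional architecture name; defaults to the output of `uname -m`.
rust_agent_target() {
    local arch="${1:-$(uname -m)}"
    local libc="musl"

    case "$arch" in
        # Rust refers to ppc64le as powerpc64le, and it uses the gnu libc.
        ppc64le) arch="powerpc64le"; libc="gnu" ;;
        s390x) libc="gnu" ;;
    esac

    echo "${arch}-unknown-linux-${libc}"
}
```

With this in place, the last step becomes `rustup target add "$(rust_agent_target)"`.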
|
||||
|
||||
To build the agent:
|
||||
|
||||
```
|
||||
$ go get -d -u github.com/kata-containers/kata-containers
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/src/agent && make
|
||||
```
|
||||
|
||||
The agent is built with seccomp capability by default.
|
||||
If you want to build the agent without the seccomp capability, you need to run `make` with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ make -C $GOPATH/src/github.com/kata-containers/kata-containers/src/agent SECCOMP=no
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - If you enable seccomp in the main configuration file but build the agent without seccomp capability,
|
||||
> the runtime exits conservatively with an error message.
|
||||
|
||||
## Get the osbuilder
|
||||
|
||||
```
|
||||
$ go get -d -u github.com/kata-containers/kata-containers
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder
|
||||
```
|
||||
|
||||
## Create a rootfs image
|
||||
### Create a local rootfs
|
||||
|
||||
As a prerequisite, you need to install Docker. Otherwise, you will not be
|
||||
able to run the `rootfs.sh` script with `USE_DOCKER=true` as expected in
|
||||
the following example.
|
||||
|
||||
```
|
||||
$ export ROOTFS_DIR=${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs
|
||||
$ sudo rm -rf ${ROOTFS_DIR}
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true ./rootfs.sh ${distro}'
|
||||
```
|
||||
|
||||
You MUST choose a distribution (e.g., `ubuntu`) for `${distro}`.
|
||||
To list the distributions supported by Kata Containers, run the following.
|
||||
|
||||
```
|
||||
$ ./rootfs.sh -l
|
||||
```
|
||||
|
||||
If you want to build the agent without seccomp capability, you need to run the `rootfs.sh` script with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true SECCOMP=no ./rootfs.sh ${distro}'
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
|
||||
> - You must ensure that the *default Docker runtime* is `runc` to make use of
|
||||
> the `USE_DOCKER` variable. If that is not the case, remove the variable
|
||||
> from the previous command. See [Checking Docker default runtime](#checking-docker-default-runtime).
|
||||
|
||||
### Add a custom agent to the image - OPTIONAL
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - You should only do this step if you are testing with the latest version of the agent.
|
||||
|
||||
```
|
||||
$ sudo install -o root -g root -m 0550 -t ${ROOTFS_DIR}/usr/bin ../../../src/agent/target/x86_64-unknown-linux-musl/release/kata-agent
|
||||
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-agent.service ${ROOTFS_DIR}/usr/lib/systemd/system/
|
||||
$ sudo install -o root -g root -m 0440 ../../../src/agent/kata-containers.target ${ROOTFS_DIR}/usr/lib/systemd/system/
|
||||
```
|
||||
|
||||
### Build a rootfs image
|
||||
|
||||
```
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/image-builder
|
||||
$ script -fec 'sudo -E USE_DOCKER=true ./image_builder.sh ${ROOTFS_DIR}'
|
||||
```
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - You must ensure that the *default Docker runtime* is `runc` to make use of
|
||||
> the `USE_DOCKER` variable. If that is not the case, remove the variable
|
||||
> from the previous command. See [Checking Docker default runtime](#checking-docker-default-runtime).
|
||||
> - If you do *not* wish to build under Docker, remove the `USE_DOCKER`
|
||||
> variable in the previous command and ensure the `qemu-img` command is
|
||||
> available on your system.
|
||||
> - If `qemu-img` is not installed, you will likely see errors such as `ERROR: File /dev/loop19p1 is not a block device` and `losetup: /tmp/tmp.bHz11oY851: Warning: file is smaller than 512 bytes; the loop device may be useless or invisible for system tools`. These can be mitigated by installing the `qemu-img` command (available in the `qemu-img` package on Fedora or the `qemu-utils` package on Debian).
|
||||
|
||||
|
||||
### Install the rootfs image
|
||||
|
||||
```
|
||||
$ commit=$(git log --format=%h -1 HEAD)
|
||||
$ date=$(date +%Y-%m-%d-%T.%N%z)
|
||||
$ image="kata-containers-${date}-${commit}"
|
||||
$ sudo install -o root -g root -m 0640 -D kata-containers.img "/usr/share/kata-containers/${image}"
|
||||
$ (cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers.img)
|
||||
```
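
The commands above install the image under a unique name and then point a fixed-name symlink at it, so an earlier image presumably stays available if the link needs to be switched back. The sketch below reproduces that pattern as a function so it can be tried in a scratch directory without `sudo`; `install_versioned_image` and its parameters are hypothetical, and the ownership options (`-o root -g root`) are left out for that reason.

```shell
# Hypothetical helper: install an image under a tagged name and point a
# fixed-name symlink at it.
#   $1: source image file      $2: destination directory
#   $3: symlink name           $4: tag, e.g. "${date}-${commit}"
install_versioned_image() {
    local src="$1" dest_dir="$2" link_name="$3" tag="$4"
    local image="kata-containers-${tag}"

    # -D creates the destination directory if it does not exist yet.
    install -m 0640 -D "$src" "${dest_dir}/${image}"

    # Use a relative link target, as in the instructions above.
    (cd "$dest_dir" && ln -sf "$image" "$link_name")

    echo "$image"
}
```

For example, `install_versioned_image kata-containers.img /tmp/kata-test kata-containers.img "${date}-${commit}"` leaves `/tmp/kata-test/kata-containers.img` as a symlink to the newly installed file.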
|
||||
|
||||
## Create an initrd image - OPTIONAL
|
||||
### Create a local rootfs for initrd image
|
||||
```
|
||||
$ export ROOTFS_DIR="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs"
|
||||
$ sudo rm -rf ${ROOTFS_DIR}
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true ./rootfs.sh ${distro}'
|
||||
```
|
||||
`AGENT_INIT` controls if the guest image uses the Kata agent as the guest `init` process. When you create an initrd image,
|
||||
always set `AGENT_INIT` to `yes`.
|
||||
|
||||
You MUST choose a distribution (e.g., `ubuntu`) for `${distro}`.
|
||||
To list the distributions supported by Kata Containers, run the following.
|
||||
|
||||
```
|
||||
$ ./rootfs.sh -l
|
||||
```
|
||||
|
||||
If you want to build the agent without seccomp capability, you need to run the `rootfs.sh` script with `SECCOMP=no` as follows.
|
||||
|
||||
```
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH AGENT_INIT=yes USE_DOCKER=true SECCOMP=no ./rootfs.sh ${distro}'
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - Check the [compatibility matrix](../tools/osbuilder/README.md#platform-distro-compatibility-matrix) before creating rootfs.
|
||||
|
||||
Optionally, add your custom agent binary to the rootfs with the following commands. The default `$LIBC` used
|
||||
is `musl`, but on ppc64le and s390x, `gnu` should be used. Also, Rust refers to ppc64le as `powerpc64le`:
|
||||
```
|
||||
$ export ARCH=$(uname -m)
|
||||
$ [ ${ARCH} == "ppc64le" ] || [ ${ARCH} == "s390x" ] && export LIBC=gnu || export LIBC=musl
|
||||
$ [ ${ARCH} == "ppc64le" ] && export ARCH=powerpc64le
|
||||
$ sudo install -o root -g root -m 0550 -T ../../../src/agent/target/${ARCH}-unknown-linux-${LIBC}/release/kata-agent ${ROOTFS_DIR}/sbin/init
|
||||
```
|
||||
|
||||
### Build an initrd image
|
||||
|
||||
```
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/initrd-builder
|
||||
$ script -fec 'sudo -E AGENT_INIT=yes USE_DOCKER=true ./initrd_builder.sh ${ROOTFS_DIR}'
|
||||
```
|
||||
|
||||
### Install the initrd image
|
||||
|
||||
```
|
||||
$ commit=$(git log --format=%h -1 HEAD)
|
||||
$ date=$(date +%Y-%m-%d-%T.%N%z)
|
||||
$ image="kata-containers-initrd-${date}-${commit}"
|
||||
$ sudo install -o root -g root -m 0640 -D kata-containers-initrd.img "/usr/share/kata-containers/${image}"
|
||||
$ (cd /usr/share/kata-containers && sudo ln -sf "$image" kata-containers-initrd.img)
|
||||
```
|
||||
|
||||
# Install guest kernel images
|
||||
|
||||
You can build and install the guest kernel image as shown [here](../tools/packaging/kernel/README.md#build-kata-containers-kernel).
|
||||
|
||||
# Install a hypervisor
|
||||
|
||||
When setting up Kata using a [packaged installation method](install/README.md#installing-on-a-linux-system), the
|
||||
`QEMU` VMM is installed automatically. Cloud-Hypervisor and Firecracker VMMs are available from the [release tarballs](https://github.com/kata-containers/kata-containers/releases), as well as through [`kata-deploy`](../tools/packaging/kata-deploy/README.md).
|
||||
You may choose to manually build your VMM/hypervisor.
|
||||
|
||||
## Build a custom QEMU
|
||||
|
||||
Kata Containers makes use of an upstream QEMU branch. The exact version
|
||||
and repository utilized can be found by looking at the [versions file](../versions.yaml).
|
||||
|
||||
Find the correct version of QEMU from the versions file:
|
||||
```
|
||||
$ source ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/scripts/lib.sh
|
||||
$ qemu_version=$(get_from_kata_deps "assets.hypervisor.qemu.version")
|
||||
$ echo ${qemu_version}
|
||||
```
|
||||
Get source from the matching branch of QEMU:
|
||||
```
|
||||
$ go get -d github.com/qemu/qemu
|
||||
$ cd ${GOPATH}/src/github.com/qemu/qemu
|
||||
$ git checkout ${qemu_version}
|
||||
$ your_qemu_directory=${GOPATH}/src/github.com/qemu/qemu
|
||||
```
|
||||
|
||||
There are scripts to manage the build and packaging of QEMU. For the examples below, set your
|
||||
environment as:
|
||||
```
|
||||
$ go get -d github.com/kata-containers/kata-containers
|
||||
$ packaging_dir="${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging"
|
||||
```
|
||||
|
||||
Kata often utilizes patches for not-yet-upstream and/or backported fixes for components,
|
||||
including QEMU. These can be found in the [packaging/QEMU directory](../tools/packaging/qemu/patches),
|
||||
and it's *recommended* that you apply them. For example, to build QEMU
version 5.2.0, run:
|
||||
```
|
||||
$ cd $your_qemu_directory
|
||||
$ $packaging_dir/scripts/apply_patches.sh $packaging_dir/qemu/patches/5.2.x/
|
||||
```
|
||||
|
||||
To build utilizing the same options as Kata, you should make use of the `configure-hypervisor.sh` script. For example:
|
||||
```
|
||||
$ cd $your_qemu_directory
|
||||
$ $packaging_dir/scripts/configure-hypervisor.sh kata-qemu > kata.cfg
|
||||
$ eval ./configure "$(cat kata.cfg)"
|
||||
$ make -j $(nproc)
|
||||
$ sudo -E make install
|
||||
```
|
||||
|
||||
See the [static-build script for QEMU](../tools/packaging/static-build/qemu/build-static-qemu.sh) for a reference on how to get, set up, configure and build QEMU for Kata.
|
||||
|
||||
### Build a custom QEMU for aarch64/arm64 - REQUIRED
|
||||
> **Note:**
|
||||
>
|
||||
> - You should only do this step if you are on aarch64/arm64.
|
||||
> - You should include [Eric Auger's latest PCDIMM/NVDIMM patches](https://patchwork.kernel.org/cover/10647305/) which are
|
||||
> under upstream review for supporting NVDIMM on aarch64.
|
||||
>
|
||||
You can build the required custom `qemu-system-aarch64` with the following commands:
|
||||
```
|
||||
$ go get -d github.com/kata-containers/tests
|
||||
$ script -fec 'sudo -E ${GOPATH}/src/github.com/kata-containers/tests/.ci/install_qemu.sh'
|
||||
```
|
||||
|
||||
# Run Kata Containers with Containerd
|
||||
Refer to the [How to use Kata Containers and Containerd](how-to/containerd-kata.md) how-to guide.
|
||||
|
||||
# Run Kata Containers with Kubernetes
|
||||
Refer to the [Run Kata Containers with Kubernetes](how-to/run-kata-with-k8s.md) how-to guide.
|
||||
|
||||
# Troubleshoot Kata Containers
|
||||
|
||||
If you are unable to create a Kata Container, first ensure you have
|
||||
[enabled full debug](#enable-full-debug)
|
||||
before attempting to create a container. Then run the
|
||||
[`kata-collect-data.sh`](../src/runtime/data/kata-collect-data.sh.in)
|
||||
script and paste its output directly into a
|
||||
[GitHub issue](https://github.com/kata-containers/kata-containers/issues/new).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The `kata-collect-data.sh` script is built from the
|
||||
> [runtime](../src/runtime) repository.
|
||||
|
||||
To perform analysis on Kata logs, use the
|
||||
[`kata-log-parser`](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
|
||||
tool, which can convert the logs into several formats (e.g. JSON, TOML, XML, and YAML).
|
||||
|
||||
See [Set up a debug console](#set-up-a-debug-console).
|
||||
|
||||
# Appendices
|
||||
|
||||
## Checking Docker default runtime
|
||||
|
||||
```
|
||||
$ sudo docker info 2>/dev/null | grep -i "default runtime" | cut -d: -f2- | grep -q runc && echo "SUCCESS" || echo "ERROR: Incorrect default Docker runtime"
|
||||
```
|
||||
## Set up a debug console
|
||||
|
||||
Kata Containers provides two ways to connect to the guest. One uses a traditional login service, which requires additional work. In contrast, the simple debug console is easy to set up.
|
||||
|
||||
### Simple debug console setup
|
||||
|
||||
Kata Containers 2.0 supports a shell-based simulated *console* for quick debugging. This approach uses VSOCK to
connect to a shell that the agent starts inside the guest. This method only requires the guest image to
contain either `/bin/sh` or `/bin/bash`.
|
||||
|
||||
#### Enable agent debug console
|
||||
|
||||
Enable `debug_console_enabled` in the `configuration.toml` configuration file:
|
||||
|
||||
```
|
||||
[agent.kata]
|
||||
debug_console_enabled = true
|
||||
```
|
||||
|
||||
This passes `agent.debug_console agent.debug_console_vport=1026` to the agent as kernel parameters. Sandboxes created with these parameters start a shell in the guest when a new connection is accepted from VSOCK.
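
The setting can also be changed from a script. The sketch below assumes the configuration file already contains a `debug_console_enabled` line, possibly commented out with `#`; if the line is absent, nothing is changed. The helper name `enable_debug_console` is hypothetical, and the example works on a scratch file rather than a live configuration.

```bash
# Hypothetical helper: uncomment or rewrite the debug_console_enabled
# setting in the configuration file given as the first argument.
enable_debug_console() {
    sed -i -e 's/^[#[:space:]]*debug_console_enabled[[:space:]]*=.*/debug_console_enabled = true/' "$1"
}

# Demonstration on a scratch file, not the live configuration.
cfg="$(mktemp)"
printf '[agent.kata]\n#debug_console_enabled = true\n' > "${cfg}"
enable_debug_console "${cfg}"
cat "${cfg}"
```

To apply it to a real installation, pass the path of your `configuration.toml` and run the function with `sudo(8)` privileges.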
|
||||
|
||||
#### Start `kata-monitor` - ONLY NEEDED FOR 2.0.x
|
||||
|
||||
For Kata Containers `2.0.x` releases, the `kata-runtime exec` command depends on a running `kata-monitor` to get the sandbox's `vsock` address to connect to. Therefore, start the `kata-monitor` process first.
|
||||
|
||||
```
|
||||
$ sudo kata-monitor
|
||||
```
|
||||
|
||||
`kata-monitor` listens on `localhost:8090` by default.
|
||||
|
||||
#### Connect to debug console
|
||||
|
||||
Use the `kata-runtime exec` command to connect to the debug console.
|
||||
|
||||
```
|
||||
$ kata-runtime exec 1a9ab65be63b8b03dfd0c75036d27f0ed09eab38abb45337fea83acd3cd7bacd
|
||||
bash-4.2# id
|
||||
uid=0(root) gid=0(root) groups=0(root)
|
||||
bash-4.2# pwd
|
||||
/
|
||||
bash-4.2# exit
|
||||
exit
|
||||
```
|
||||
|
||||
`kata-runtime exec` has a command-line option `runtime-namespace`, which specifies the [runtime namespace](https://github.com/containerd/containerd/blob/master/docs/namespaces.md) under which the particular pod was created. By default, it is set to `k8s.io`, which works for containerd when configured
with Kubernetes. For CRI-O, the namespace should be set to `default` explicitly. This should not be confused with [Kubernetes namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).
For other CRI runtimes and configurations, you may need to set the namespace using the `runtime-namespace` option.
|
||||
|
||||
If you want to access the guest OS in the traditional way, see [Traditional debug console setup](#traditional-debug-console-setup).
|
||||
|
||||
### Traditional debug console setup
|
||||
|
||||
By default you cannot log in to a virtual machine, since this can be sensitive
|
||||
from a security perspective. Also, allowing logins would require additional
|
||||
packages in the rootfs, which would increase the size of the image used to
|
||||
boot the virtual machine.
|
||||
|
||||
If you want to log in to a virtual machine that hosts your containers, complete
|
||||
the following steps (using rootfs or initrd image).
|
||||
|
||||
> **Note:** The following debug console instructions assume a systemd-based guest
|
||||
> O/S image. This means you must create a rootfs for a distro that supports systemd.
|
||||
> Currently, all distros supported by [osbuilder](../tools/osbuilder) support systemd
|
||||
> except for Alpine Linux.
|
||||
>
|
||||
> Look for `INIT_PROCESS=systemd` in the `config.sh` osbuilder rootfs config file
|
||||
> to verify that osbuilder supports systemd for the distro whose rootfs you want to build.
|
||||
> For an example, see the [Clear Linux config.sh file](../tools/osbuilder/rootfs-builder/clearlinux/config.sh).
|
||||
>
|
||||
> For a non-systemd-based distro, create an equivalent system
|
||||
> service using that distro's init system syntax. Alternatively, you can build a distro
> that contains a shell (e.g. `bash(1)`). In this circumstance it is likely you need to install
> additional packages in the rootfs and add `agent.debug_console` to the kernel parameters in the runtime
> config file. This tells the Kata agent to launch the console directly.
|
||||
>
|
||||
> Once these steps are taken you can connect to the virtual machine using the [debug console](Developer-Guide.md#connect-to-the-virtual-machine-using-the-debug-console).
|
||||
|
||||
#### Create a custom image containing a shell
|
||||
|
||||
To log in to a virtual machine, you must
|
||||
[create a custom rootfs](#create-a-rootfs-image) or [custom initrd](#create-an-initrd-image---optional)
|
||||
containing a shell such as `bash(1)`. For Clear Linux, you will need
|
||||
an additional `coreutils` package.
|
||||
|
||||
For example using CentOS:
|
||||
|
||||
```
|
||||
$ cd $GOPATH/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder
|
||||
$ export ROOTFS_DIR=${GOPATH}/src/github.com/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs
|
||||
$ script -fec 'sudo -E GOPATH=$GOPATH USE_DOCKER=true EXTRA_PKGS="bash coreutils" ./rootfs.sh centos'
|
||||
```
|
||||
|
||||
#### Build the debug image
|
||||
|
||||
Follow the instructions in the [Build a rootfs image](#build-a-rootfs-image)
|
||||
section when using rootfs, or when using initrd, complete the steps in the [Build an initrd image](#build-an-initrd-image) section.
|
||||
|
||||
#### Configure runtime for custom debug image
|
||||
|
||||
Install the image:
|
||||
|
||||
>**Note**: When using an initrd image, replace the below rootfs image name `kata-containers.img`
|
||||
>with the initrd image name `kata-containers-initrd.img`.
|
||||
|
||||
```
|
||||
$ name="kata-containers-centos-with-debug-console.img"
|
||||
$ sudo install -o root -g root -m 0640 kata-containers.img "/usr/share/kata-containers/${name}"
|
||||
```
|
||||
|
||||
Next, modify the `image=` values in the `[hypervisor.qemu]` section of the
|
||||
[configuration file](../src/runtime/README.md#configuration)
|
||||
to specify the full path to the image name specified in the previous code
|
||||
section. Alternatively, recreate the symbolic link so it points to
|
||||
the new debug image:
|
||||
|
||||
```
|
||||
$ (cd /usr/share/kata-containers && sudo ln -sf "$name" kata-containers.img)
|
||||
```
|
||||
|
||||
**Note**: You should take care to undo this change after you finish debugging
|
||||
to prevent all subsequently created containers from using the debug image.
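
One way to make the change easy to undo is to record the link target before replacing it. The following is a minimal sketch that runs in a scratch directory standing in for `/usr/share/kata-containers`; the image file names are hypothetical placeholders.

```bash
# Scratch directory standing in for /usr/share/kata-containers.
dir="$(mktemp -d)"
touch "${dir}/kata-containers-original.img" "${dir}/kata-containers-debug.img"
ln -s kata-containers-original.img "${dir}/kata-containers.img"

# Remember where the link points before switching to the debug image.
previous_target="$(readlink "${dir}/kata-containers.img")"
ln -sf kata-containers-debug.img "${dir}/kata-containers.img"

# ... debugging session happens here ...

# Restore the original link target afterwards.
ln -sf "${previous_target}" "${dir}/kata-containers.img"
readlink "${dir}/kata-containers.img"
```

On a real system the `ln` commands need `sudo(8)`, as in the command above.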
|
||||
|
||||
#### Create a container
|
||||
|
||||
Create a container as normal. For example using `crictl`:
|
||||
|
||||
```
|
||||
$ sudo crictl run -r kata container.yaml pod.yaml
|
||||
```
|
||||
|
||||
#### Connect to the virtual machine using the debug console
|
||||
|
||||
The steps required to enable the debug console for QEMU differ slightly from
those for firecracker / cloud-hypervisor.
|
||||
|
||||
##### Enabling debug console for QEMU
|
||||
|
||||
Add `agent.debug_console` to the guest kernel command line to allow the agent process to start a debug console.
|
||||
|
||||
```
|
||||
$ sudo sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.debug_console"/g' "${kata_configuration_file}"
|
||||
```
|
||||
|
||||
Here `kata_configuration_file` could point to `/etc/kata-containers/configuration.toml`
|
||||
or `/usr/share/defaults/kata-containers/configuration.toml`
|
||||
or `/opt/kata/share/defaults/kata-containers/configuration-{hypervisor}.toml`, if
|
||||
you installed Kata Containers using `kata-deploy`.
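
If you are unsure which of these files exists on your system, a small helper can pick the first one found. This is a sketch only: the helper name `first_existing_file` is hypothetical, the order in which the paths are tried is an assumption, and `configuration-qemu.toml` is just one possible expansion of `configuration-{hypervisor}.toml`.

```bash
# Hypothetical helper: print the first existing regular file among the
# arguments, or return non-zero if none of them exists.
first_existing_file() {
    for candidate in "$@"; do
        if [ -f "${candidate}" ]; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    return 1
}

# Try the locations named above, in an assumed order of preference.
kata_configuration_file="$(first_existing_file /etc/kata-containers/configuration.toml /usr/share/defaults/kata-containers/configuration.toml /opt/kata/share/defaults/kata-containers/configuration-qemu.toml)" || echo "no Kata configuration file found" >&2
```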
|
||||
|
||||
##### Enabling debug console for cloud-hypervisor / firecracker
|
||||
|
||||
A slightly different configuration is required for firecracker and cloud-hypervisor.
|
||||
Firecracker and cloud-hypervisor don't have a UNIX socket connected to `/dev/console`.
|
||||
Hence, the kernel command line option `agent.debug_console` will not work for them.
|
||||
These hypervisors support `hybrid vsocks`, which can be used for communication
|
||||
between the host and the guest. The kernel command line option `agent.debug_console_vport`
|
||||
was added to allow developers to specify the `vsock` port on which the debugging console should be connected.
|
||||
|
||||
|
||||
Add the parameter `agent.debug_console_vport=1026` to the kernel command line
|
||||
as shown below:
|
||||
```
|
||||
$ sudo sed -i -e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.debug_console_vport=1026"/g' "${kata_configuration_file}"
|
||||
```
|
||||
|
||||
> **Note** Ports 1024 and 1025 are reserved for communication with the agent
|
||||
> and gathering of agent logs respectively.
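
Running the `sed` command more than once appends the parameter each time. The sketch below wraps the same substitution in a check so that a parameter is only added if it is not already present. The helper name `add_kernel_param` is hypothetical, and the sketch assumes the parameter contains no `/` or `&` characters, which would need escaping for `sed`.

```bash
# Hypothetical helper: append one parameter to kernel_params in the file
# given as the first argument, unless it is already listed.
add_kernel_param() {
    config_file="$1"
    param="$2"
    # Current value between the quotes of the kernel_params line.
    current="$(sed -n -e 's/^kernel_params = "\(.*\)"/\1/p' "${config_file}")"
    case " ${current} " in
        *" ${param} "*) return 0 ;;   # already present as a whole word
    esac
    sed -i -e "s/^kernel_params = \"\(.*\)\"/kernel_params = \"\1 ${param}\"/" "${config_file}"
}

# Demonstration on a scratch file; the second call changes nothing.
cfg="$(mktemp)"
printf 'kernel_params = "quiet"\n' > "${cfg}"
add_kernel_param "${cfg}" "agent.debug_console_vport=1026"
add_kernel_param "${cfg}" "agent.debug_console_vport=1026"
cat "${cfg}"
```

The whole-word comparison matters because `agent.debug_console` is a prefix of `agent.debug_console_vport=1026`; a plain substring match would treat one as the other.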
|
||||
|
||||
##### Connecting to the debug console
|
||||
|
||||
Next, connect to the debug console. The VSOCK paths vary slightly between
VMM solutions.
|
||||
|
||||
In case of cloud-hypervisor, connect to the `vsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id}/root/ && socat stdin unix-connect:clh.sock'
|
||||
CONNECT 1026
|
||||
```
|
||||
|
||||
**Note**: You need to type `CONNECT 1026` and press the `RETURN` key after entering the `socat` command.
|
||||
|
||||
For firecracker, connect to the `hvsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/firecracker/${sandbox_id}/root/ && socat stdin unix-connect:kata.hvsock'
|
||||
CONNECT 1026
|
||||
```
|
||||
|
||||
**Note**: You need to press the `RETURN` key to see the shell prompt.
|
||||
|
||||
|
||||
For QEMU, connect to the `vsock` as shown:
|
||||
```
|
||||
$ sudo su -c 'cd /var/run/vc/vm/${sandbox_id} && socat "stdin,raw,echo=0,escape=0x11" "unix-connect:console.sock"'
|
||||
```
|
||||
|
||||
To disconnect from the virtual machine, type `CONTROL+q` (hold down the
|
||||
`CONTROL` key and press `q`).
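
The three socket locations used above can be summarized in a small lookup. This is a sketch for reference only; the helper name `debug_console_socket` is hypothetical and the paths are copied from the commands in this section.

```bash
# Hypothetical helper: print the debug console socket path for a
# hypervisor name and sandbox ID, using the paths listed above.
debug_console_socket() {
    hypervisor="$1"
    id="$2"
    case "${hypervisor}" in
        qemu)             echo "/var/run/vc/vm/${id}/console.sock" ;;
        cloud-hypervisor) echo "/var/run/vc/vm/${id}/root/clh.sock" ;;
        firecracker)      echo "/var/run/vc/firecracker/${id}/root/kata.hvsock" ;;
        *)                echo "unknown hypervisor: ${hypervisor}" >&2; return 1 ;;
    esac
}

debug_console_socket firecracker example-sandbox
```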
|
||||
|
||||
## Obtain details of the image
|
||||
|
||||
If the image is created using
|
||||
[osbuilder](../tools/osbuilder), the following YAML
|
||||
file exists and contains details of the image and how it was created:
|
||||
|
||||
```
|
||||
$ cat /var/lib/osbuilder/osbuilder.yaml
|
||||
```
|
||||
|
||||
## Capturing kernel boot logs
|
||||
|
||||
Sometimes it is useful to capture the kernel boot messages from a Kata Container
|
||||
launch. If the container launches to the point where you can `exec` into it, and
|
||||
if the container has the necessary components installed, often you can execute the `dmesg`
|
||||
command inside the container to view the kernel boot logs.
|
||||
|
||||
If however you are unable to `exec` into the container, you can enable some debug
|
||||
options to have the kernel boot messages logged into the system journal.
|
||||
|
||||
- Set `enable_debug = true` in the `[hypervisor.qemu]` and `[runtime]` sections
|
||||
|
||||
For generic information on enabling debug in the configuration file, see the
|
||||
[Enable full debug](#enable-full-debug) section.
|
||||
|
||||
The kernel boot messages will appear in the `containerd` or `CRI-O` log as appropriate,
|
||||
such as:
|
||||
|
||||
```bash
|
||||
$ sudo journalctl -t containerd
|
||||
-- Logs begin at Thu 2020-02-13 16:20:40 UTC, end at Thu 2020-02-13 16:30:23 UTC. --
|
||||
...
|
||||
time="2020-09-15T14:56:23.095113803+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.395399] brd: module loaded"
|
||||
time="2020-09-15T14:56:23.102633107+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.402845] random: fast init done"
|
||||
time="2020-09-15T14:56:23.103125469+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.403544] random: crng init done"
|
||||
time="2020-09-15T14:56:23.105268162+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.405599] loop: module loaded"
|
||||
time="2020-09-15T14:56:23.121121598+08:00" level=debug msg="reading guest console" console-protocol=unix console-url=/run/vc/vm/ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791/console.sock pid=107642 sandbox=ab9f633385d4987828d342e47554fc6442445b32039023eeddaa971c1bb56791 source=virtcontainers subsystem=sandbox vmconsole="[ 0.421324] memmap_init_zone_device initialised 32768 pages in 12ms"
|
||||
...
|
||||
```
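
The guest kernel messages are carried in the `vmconsole` field at the end of each log line. A minimal sketch of a filter that prints only that field follows; the function name `extract_vmconsole` is hypothetical, and the sample line is a shortened version of the output shown above.

```bash
# Hypothetical filter: print only the guest kernel messages carried in
# the vmconsole="..." field of each log line read from standard input.
extract_vmconsole() {
    sed -n -e 's/.*vmconsole="\(.*\)"$/\1/p'
}

# Example with one shortened line in the format shown above.
printf '%s\n' 'time="2020-09-15T14:56:23.095113803+08:00" level=debug msg="reading guest console" sandbox=ab9f vmconsole="[ 0.395399] brd: module loaded"' | extract_vmconsole
```

The output of `sudo journalctl -t containerd` can be piped through the same function. Lines without a `vmconsole` field are dropped.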
|
||||
@@ -1,224 +0,0 @@
|
||||
# Introduction
|
||||
|
||||
This document outlines the requirements for all documentation in the [Kata
|
||||
Containers](https://github.com/kata-containers) project.
|
||||
|
||||
# General requirements
|
||||
|
||||
All documents must:
|
||||
|
||||
- Be written in simple English.
|
||||
- Be written in [GitHub Flavored Markdown](https://github.github.com/gfm) format.
|
||||
- Have a `.md` file extension.
|
||||
- Be linked to from another document in the same repository.
|
||||
|
||||
Although GitHub allows navigation of the entire repository, it should be
|
||||
possible to access all documentation purely by navigating links inside the
|
||||
documents, starting from the repository's top-level `README`.
|
||||
|
||||
If you are adding a new document, ensure you add a link to it in the
|
||||
"closest" `README` above the directory where you created your document.
|
||||
- If the document needs to tell the user to manipulate files or commands, use a
|
||||
[code block](#code-blocks) to specify the commands.
|
||||
|
||||
If at all possible, ensure that every command in the code blocks can be run
|
||||
non-interactively. If this is possible, the document can be tested by the CI
|
||||
which can then execute the commands specified to ensure the instructions are
|
||||
correct. This avoids documents becoming out of date over time.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> Do not add a table of contents (TOC) since GitHub will auto-generate one.
|
||||
|
||||
# Linking advice
|
||||
|
||||
Linking between documents is strongly encouraged to help users and developers
|
||||
navigate the material more easily. Linking also avoids repetition - if a
|
||||
document needs to refer to a concept already well described in another section
|
||||
or document, do not repeat it, link to it
|
||||
(the [DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) principle).
|
||||
|
||||
Another advantage of this approach is that changes only need to be applied in
|
||||
one place: where the concept is defined (not the potentially many places where
|
||||
the concept is referred to using a link).
|
||||
|
||||
# Notes
|
||||
|
||||
Important information that is not part of the main document flow should be
|
||||
added as a Note in bold with all content contained within a block quote:
|
||||
|
||||
> **Note:** This is a really important point!
|
||||
>
|
||||
> This particular note also spans multiple lines. The entire note should be
|
||||
> included inside the quoted block.
|
||||
|
||||
If there are multiple notes, bullets should be used:
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - I am important point 1.
|
||||
>
|
||||
> - I am important point 2.
|
||||
>
|
||||
> - I am important point *n*.
|
||||
|
||||
# Warnings and other admonitions
|
||||
|
||||
Use the same approach as for [notes](#notes). For example:
|
||||
|
||||
> **Warning:** Running this command assumes you understand the risks of doing so.
|
||||
|
||||
Other examples:
|
||||
|
||||
> **Warnings:**
|
||||
>
|
||||
> - Do not unplug your computer!
|
||||
> - Always read the label.
|
||||
> - Do not pass go. Do not collect $200.
|
||||
|
||||
> **Tip:** Read the manual page for further information on available options.
|
||||
|
||||
> **Hint:** Look behind you!
|
||||
|
||||
# Files and command names
|
||||
|
||||
All filenames and command names should be rendered in a fixed-format font
|
||||
using backticks:
|
||||
|
||||
> Run the `foo` command to make it work.
|
||||
|
||||
> Modify the `bar` option in file `/etc/baz/baz.conf`.
|
||||
|
||||
Render any options that need to be specified to the command in the same manner:
|
||||
|
||||
> Run `bar -axz --apply foo.yaml` to make the changes.
|
||||
|
||||
For standard system commands, it is also acceptable to specify the name along
|
||||
with the manual page section that documents the command in brackets:
|
||||
|
||||
> The command to list files in a directory is called `ls(1)`.
|
||||
|
||||
# Code blocks
|
||||
|
||||
This section lists requirements for displaying commands and command output.
|
||||
|
||||
The requirements must be adhered to since documentation containing code blocks
|
||||
is validated by the CI system, which executes the command blocks with the help
|
||||
of the
|
||||
[doc-to-script](https://github.com/kata-containers/tests/tree/main/.ci/kata-doc-to-script.sh)
|
||||
utility.
|
||||
|
||||
- If a document includes commands the user should run, they **MUST** be shown
|
||||
in a *bash code block* with every command line prefixed with `$ ` to denote
|
||||
a shell prompt:
|
||||
|
||||
<pre>
|
||||
|
||||
```bash
|
||||
$ echo "Hi - I am some bash code"
|
||||
$ sudo docker run -ti busybox true
|
||||
$ [ $? -eq 0 ] && echo "success"
|
||||
```
|
||||
|
||||
</pre>
|
||||
|
||||
- If a command needs to be run as the `root` user, it must be run using
|
||||
`sudo(8)`.
|
||||
|
||||
```bash
|
||||
|
||||
$ sudo echo "I'm running as root"
|
||||
```
|
||||
|
||||
- All lines beginning `# ` should be comment lines, *NOT* commands to run as
|
||||
the `root` user.
|
||||
|
||||
- Try to avoid showing the *output* of commands.
|
||||
|
||||
The reasons for this:
|
||||
|
||||
- Command output can change, leading to confusion when the output the user
|
||||
sees does not match the output in the documentation.
|
||||
|
||||
- There is the risk the user will get confused between what parts of the
|
||||
block refer to the commands they should type and the output that they
|
||||
should not.
|
||||
|
||||
- It can make the document look overly "busy" or complex.
|
||||
|
||||
In the unusual case that you need to display command *output*, use an
|
||||
unadorned code block (\`\`\`):
|
||||
|
||||
<pre>
|
||||
|
||||
The output of the `ls(1)` command is expected to be:
|
||||
|
||||
```
|
||||
ls: cannot access '/foo': No such file or directory
|
||||
```
|
||||
|
||||
</pre>
|
||||
|
||||
- Do not split long lines across multiple lines using the `\`
continuation character.
|
||||
|
||||
GitHub automatically renders such blocks with scrollbars. Consequently,
|
||||
backslash continuation characters are not necessary and are a visual
|
||||
distraction. These characters also mess up a user's shell history when
|
||||
commands are pasted into a terminal.
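
To illustrate why these rules matter for automated testing, the following sketch extracts the commands from bash code blocks in a Markdown document. It is not the project's doc-to-script utility, only a simplified stand-in written for this example; the function name `extract_doc_commands` is hypothetical.

```bash
# Three backticks, written in octal so this example contains no fence.
fence="$(printf '\140\140\140')"

# Hypothetical filter: read Markdown on standard input and print the
# commands found in bash code blocks, without the leading "$ " prompt.
extract_doc_commands() {
    awk -v fence="${fence}" '
        $0 ~ ("^[ \t]*" fence "bash[ \t]*$") { in_block = 1; next }
        $0 ~ ("^[ \t]*" fence "[ \t]*$")     { in_block = 0; next }
        in_block && /^[ \t]*\$ /             { sub(/^[ \t]*\$ /, ""); print }
    '
}

# Example: only the prompt-prefixed line inside the bash block is printed.
printf '%s\n' "${fence}bash" '$ echo hello' 'some output' "${fence}" | extract_doc_commands
```

Lines without the `$ ` prompt, such as comments or command output, are skipped, which is why the prompt prefix and the bash block type need to be used consistently.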
|
||||
|
||||
# Images
|
||||
|
||||
All binary image files must be in a standard and well-supported format such as
|
||||
PNG. This format is preferred for line art such as diagrams because the
|
||||
information is stored more efficiently, leading to smaller file sizes. JPEG
|
||||
images are acceptable, but this format is more appropriate to store
|
||||
photographic images.
|
||||
|
||||
When possible, generate images using freely available software.
|
||||
|
||||
Every binary image file **MUST** be accompanied by the "source" file used to
|
||||
generate it. This guarantees that the image can be modified by updating the
|
||||
source file and re-generating the binary format image file.
|
||||
|
||||
Ideally, the format of all image source files is an open standard, non-binary
|
||||
one such as SVG. Text formats are highly preferable because you can manipulate
|
||||
and compare them with standard tools (e.g. `diff(1)`).
|
||||
|
||||
# Spelling
|
||||
|
||||
Since this project uses a number of terms not found in conventional
|
||||
dictionaries, we have a
|
||||
[spell checking tool](https://github.com/kata-containers/tests/tree/main/cmd/check-spelling)
|
||||
that checks both dictionary words and the additional terms we use.
|
||||
|
||||
Run the spell checking tool on your document before raising a PR to ensure it
|
||||
is free of mistakes.
|
||||
|
||||
If your document introduces new terms, you need to update the custom
|
||||
dictionary used by the spell checking tool to incorporate the new words.
|
||||
|
||||
# Names
|
||||
|
||||
Occasionally documents need to specify the names of people. Write such names in
|
||||
backticks. The main reason for this is to keep the [spell checker](#spelling) happy (since
|
||||
it cannot manage all possible names). However, since backticks render in a
|
||||
fixed-width font, this makes the names clearer:
|
||||
|
||||
> Welcome to `Clark Kent`, the newest member of the Kata Containers Architecture Committee.
|
||||
|
||||
# Version numbers
|
||||
|
||||
Write version numbers in backticks. This keeps the [spell checker](#spelling)
|
||||
happy and since backticks render in a fixed-width font, it also makes the
|
||||
numbers clearer:
|
||||
|
||||
> Ensure you are using at least version `1.2.3-alpha3.wibble.1` of the tool.
|
||||
|
||||
# The apostrophe
|
||||
|
||||
The apostrophe character (`'`) must **only** be used for showing possession
|
||||
("Peter's book") and for standard contractions (such as "don't").
|
||||
|
||||
Use double-quotes ("...") in all other circumstances where you use quotes outside of
|
||||
[code blocks](#code-blocks).
|
||||
@@ -1,21 +0,0 @@
|
||||
# Licensing strategy
|
||||
|
||||
## Project License
|
||||
|
||||
The license for the [Kata Containers](https://github.com/kata-containers)
|
||||
project is [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
|
||||
|
||||
## License file
|
||||
|
||||
All repositories in the project have a top level file called `LICENSE`. This
|
||||
file lists full details of all licenses used by the repository.
|
||||
|
||||
## License for individual files
|
||||
|
||||
Where possible all files in all repositories also contain a
|
||||
[SPDX](https://spdx.org) license identifier. This provides fine-grained
|
||||
licensing and allows automated tooling to check the license of individual
|
||||
files.
|
||||
|
||||
This SPDX license identifier requirement is enforced by the
|
||||
[CI (Continuous Integration) system](https://github.com/kata-containers/tests/blob/main/.ci/static-checks.sh).
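
As an illustration of what such a check involves, the sketch below tests whether a file contains an SPDX identifier line. It is a simplified stand-in and not the project's static checks script; the function name `has_spdx_identifier` is hypothetical.

```bash
# Hypothetical check: succeed only if the file given as the first
# argument contains an SPDX license identifier line.
has_spdx_identifier() {
    grep -q 'SPDX-License-Identifier:' "$1"
}

# Demonstration with a scratch file carrying the Apache 2.0 identifier.
sample="$(mktemp)"
printf '# SPDX-License-Identifier: Apache-2.0\n' > "${sample}"
has_spdx_identifier "${sample}" && echo "SPDX identifier found"
```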
|
||||
@@ -1,165 +0,0 @@
|
||||
# Overview
|
||||
|
||||
A [Kata Container](https://github.com/kata-containers) utilizes a Virtual Machine (VM) to enhance security and
|
||||
isolation of container workloads. As a result, the system has a number of differences
|
||||
and limitations when compared with the default [Docker*](https://www.docker.com/) runtime,
|
||||
[`runc`](https://github.com/opencontainers/runc).
|
||||
|
||||
Some of these limitations have potential solutions, whereas others exist
|
||||
due to fundamental architectural differences generally related to the
|
||||
use of VMs.
|
||||
|
||||
The [Kata Container runtime](../src/runtime)
|
||||
launches each container within its own hardware isolated VM, and each VM has
|
||||
its own kernel. Due to this higher degree of isolation, certain container
|
||||
capabilities cannot be supported or are implicitly enabled through the VM.
|
||||
|
||||
# Definition of a limitation
|
||||
|
||||
The [Open Container Initiative](https://www.opencontainers.org/)
|
||||
[Runtime Specification](https://github.com/opencontainers/runtime-spec) ("OCI spec")
|
||||
defines the minimum specifications a runtime must support to interoperate with
|
||||
container managers such as Docker. If a runtime does not support some aspect
|
||||
of the OCI spec, it is by definition a limitation.
|
||||
|
||||
However, the OCI runtime reference implementation (`runc`) does not perfectly
|
||||
align with the OCI spec itself.
|
||||
|
||||
Further, since the default OCI runtime used by Docker is `runc`, Docker
|
||||
expects runtimes to behave as `runc` does. This implies that another form of
|
||||
limitation arises if the behavior of a runtime implementation does not align
|
||||
with that of `runc`. Having two standards complicates the challenge of
|
||||
supporting a Docker environment since a runtime must support the official OCI
|
||||
spec and the non-standard extensions provided by `runc`.
|
||||
|
||||
# Scope
|
||||
|
||||
Each known limitation is captured in a separate GitHub issue that contains
|
||||
detailed information about the issue. These issues are tagged with the
|
||||
`limitation` label. This document is a curated summary of important known
|
||||
limitations and provides links to the relevant GitHub issues.
|
||||
|
||||
The following link shows the latest list of limitations:
|
||||
|
||||
- https://github.com/pulls?utf8=%E2%9C%93&q=is%3Aopen+label%3Alimitation+org%3Akata-containers
|
||||
|
||||
# Contributing
|
||||
|
||||
If you would like to work on resolving a limitation, please refer to the
|
||||
[contributors guide](https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md).
|
||||
If you wish to raise an issue for a new limitation, either
|
||||
[raise an issue directly on the runtime](https://github.com/kata-containers/kata-containers/issues/new)
|
||||
or see the
|
||||
[project table of contents](https://github.com/kata-containers/kata-containers)
|
||||
for advice on which repository to raise the issue against.
|
||||
|
||||
# Pending items
|
||||
|
||||
This section lists items that might be possible to fix.
|
||||
|
||||
## OCI CLI commands
|
||||
|
||||
### Docker and Podman support
|
||||
Currently Kata Containers does not support Docker or Podman.
|
||||
|
||||
See issue https://github.com/kata-containers/kata-containers/issues/722 for more information.
|
||||
|
||||
## Runtime commands
|
||||
|
||||
### checkpoint and restore
|
||||
|
||||
The runtime does not provide `checkpoint` and `restore` commands. There
|
||||
are discussions about using VM save and restore to give us a
|
||||
[`criu`](https://github.com/checkpoint-restore/criu)-like functionality,
|
||||
which might provide a solution.
|
||||
|
||||
Note that the OCI standard does not specify `checkpoint` and `restore`
|
||||
commands.
|
||||
|
||||
See issue https://github.com/kata-containers/runtime/issues/184 for more information.
|
||||
|
||||
### events command
|
||||
|
||||
The runtime does not fully implement the `events` command. `OOM` notifications and `Intel RDT` stats are not fully supported.
|
||||
|
||||
Note that the OCI standard does not specify an `events` command.
|
||||
|
||||
See issue https://github.com/kata-containers/runtime/issues/308 and https://github.com/kata-containers/runtime/issues/309 for more information.
|
||||
|
||||
### update command
|
||||
|
||||
Currently, block I/O weight is the only setting that is not supported.
All other configurations are supported and work properly.
|
||||
|
||||
## Networking
|
||||
|
||||
## Resource management
|
||||
|
||||
Due to the way VMs differ in their CPU and memory allocation, and sharing
|
||||
across the host system, the implementation of an equivalent method for
|
||||
these commands is potentially challenging.
|
||||
|
||||
See issue https://github.com/clearcontainers/runtime/issues/341 and [the constraints challenge](#the-constraints-challenge) for more information.
|
||||
|
||||
For CPU resource management, see
|
||||
[CPU constraints](design/vcpu-handling.md).
|
||||
|
||||
# Architectural limitations
|
||||
|
||||
This section lists items that might not be fixed due to fundamental
|
||||
architectural differences between "soft containers" (i.e. traditional Linux*
|
||||
containers) and those based on VMs.
|
||||
|
||||
## Storage limitations
|
||||
|
||||
### Kubernetes `volumeMounts.subPath`

Kubernetes `volumeMounts.subPath` is not supported by Kata Containers at the
moment.
|
||||
|
||||
See [this issue](https://github.com/kata-containers/runtime/issues/2812) for more details.
|
||||
[Another issue](https://github.com/kata-containers/kata-containers/issues/1728) focuses on the case of `emptyDir`.
|
||||
|
||||
## Host resource sharing
|
||||
|
||||
### Privileged containers
|
||||
|
||||
Privileged support in Kata differs fundamentally from that of `runc` containers.
|
||||
The container runs with elevated capabilities within the guest and is granted
|
||||
access to guest devices instead of the host devices.
|
||||
This is also true when using `securityContext privileged=true` with Kubernetes.
|
||||
|
||||
The container may also be granted full access to a subset of host devices
|
||||
(https://github.com/kata-containers/runtime/issues/1568).
|
||||
|
||||
See [Privileged Kata Containers](how-to/privileged.md) for how to configure some of this behavior.
|
||||
|
||||
# Appendices
|
||||
|
||||
## The constraints challenge
|
||||
|
||||
Applying resource constraints such as cgroup, CPU, memory, and storage to a workload is not always straightforward with a VM based system. A Kata Container runs in an isolated environment inside a virtual machine. This, coupled with the architecture of Kata Containers, offers many more possibilities than are available to traditional Linux containers due to the various layers and contexts.
|
||||
|
||||
In some cases it might be necessary to apply the constraints to multiple levels. In other cases, the hardware-isolated VM provides equivalent functionality to the requested constraint.
|
||||
|
||||
The following examples outline some of the areas where constraints can be applied:
|
||||
|
||||
- Inside the VM
|
||||
|
||||
Constrain the guest kernel. This can be achieved by passing particular values through the kernel command line used to boot the guest kernel. Alternatively, sysctl values can be applied at early boot.
|
||||
|
||||
- Inside the container
|
||||
|
||||
Constrain the container created inside the VM.
|
||||
|
||||
- Outside the VM:
|
||||
|
||||
- Constrain the hypervisor process by applying host-level constraints.
|
||||
|
||||
- Constrain all processes running inside the hypervisor.
|
||||
|
||||
This can be achieved by specifying particular hypervisor configuration options.
|
||||
|
||||
|
||||
Note that in some circumstances it might be necessary to apply particular constraints
|
||||
to more than one of the previous areas to achieve the desired level of isolation and resource control.
|
||||
@@ -1,8 +0,0 @@
|
||||
#
|
||||
# Copyright (c) 2018 Intel Corporation
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
default:
|
||||
@true
|
||||
@@ -1,84 +0,0 @@
|
||||
# Documentation
|
||||
|
||||
The [Kata Containers](https://github.com/kata-containers)
|
||||
documentation repository hosts overall system documentation, with information
|
||||
common to multiple components.
|
||||
|
||||
For details of the other Kata Containers repositories, see the
|
||||
[repository summary](https://github.com/kata-containers/kata-containers).
|
||||
|
||||
## Getting Started
|
||||
|
||||
* [Installation guides](./install/README.md): Install and run Kata Containers with Docker or Kubernetes
|
||||
|
||||
## Tracing
|
||||
|
||||
See the [tracing documentation](tracing.md).
|
||||
|
||||
## More User Guides
|
||||
|
||||
* [Upgrading](Upgrading.md): how to upgrade from [Clear Containers](https://github.com/clearcontainers) and [runV](https://github.com/hyperhq/runv) to [Kata Containers](https://github.com/kata-containers) and how to upgrade an existing Kata Containers system to the latest version.
|
||||
* [Limitations](Limitations.md): differences and limitations compared with the default [Docker](https://www.docker.com/) runtime,
|
||||
[`runc`](https://github.com/opencontainers/runc).
|
||||
|
||||
### How-to guides
|
||||
|
||||
See the [how-to documentation](how-to).
|
||||
|
||||
## Kata Use-Cases
|
||||
|
||||
* [GPU Passthrough with Kata](./use-cases/GPU-passthrough-and-Kata.md)
|
||||
* [SR-IOV with Kata](./use-cases/using-SRIOV-and-kata.md)
|
||||
* [Intel QAT with Kata](./use-cases/using-Intel-QAT-and-kata.md)
|
||||
* [VPP with Kata](./use-cases/using-vpp-and-kata.md)
|
||||
* [SPDK vhost-user with Kata](./use-cases/using-SPDK-vhostuser-and-kata.md)
|
||||
* [Intel SGX with Kata](./use-cases/using-Intel-SGX-and-kata.md)
|
||||
|
||||
## Developer Guide
|
||||
|
||||
Documents that help to understand and contribute to Kata Containers.
|
||||
|
||||
### Design and Implementations
|
||||
|
||||
* [Kata Containers Architecture](design/architecture): Architectural overview of Kata Containers
|
||||
* [Kata Containers E2E Flow](design/end-to-end-flow.md): The entire end-to-end flow of Kata Containers
|
||||
* [Kata Containers design](./design/README.md): More Kata Containers design documents
|
||||
* [Kata Containers threat model](./threat-model/threat-model.md): Kata Containers threat model
|
||||
|
||||
### How to Contribute
|
||||
|
||||
* [Developer Guide](Developer-Guide.md): Set up the Kata Containers development environment
|
||||
* [How to contribute to Kata Containers](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md)
|
||||
* [Code of Conduct](../CODE_OF_CONDUCT.md)
|
||||
|
||||
## Help Writing a Code PR
|
||||
|
||||
* [Code PR advice](code-pr-advice.md).
|
||||
|
||||
## Help Writing Unit Tests
|
||||
|
||||
* [Unit Test Advice](Unit-Test-Advice.md)
|
||||
* [Unit testing presentation](presentations/unit-testing/kata-containers-unit-testing.md)
|
||||
|
||||
## Help Improving the Documents
|
||||
|
||||
* [Documentation Requirements](Documentation-Requirements.md)
|
||||
|
||||
### Code Licensing
|
||||
|
||||
* [Licensing](Licensing-strategy.md): About the licensing strategy of Kata Containers.
|
||||
|
||||
### The Release Process
|
||||
|
||||
* [Release strategy](Stable-Branch-Strategy.md)
|
||||
* [Release Process](Release-Process.md)
|
||||
|
||||
## Presentations
|
||||
|
||||
* [Presentations](presentations)
|
||||
|
||||
## Website Changes
|
||||
|
||||
If you have a suggestion for how we can improve the
|
||||
[website](https://katacontainers.io), please raise an issue (or a PR) on
|
||||
[the repository that holds the source for the website](https://github.com/OpenStackweb/kata-netlify-refresh).
|
||||
@@ -1,89 +0,0 @@
|
||||
# How to do a Kata Containers Release
|
||||
This document lists the tasks required to create a Kata Release.
|
||||
|
||||
## Requirements
|
||||
|
||||
- [hub](https://github.com/github/hub)
|
||||
* Using an [application token](https://github.com/settings/tokens) is required for hub.
|
||||
|
||||
- GitHub permissions to push tags and create releases in Kata repositories.
|
||||
|
||||
- GPG configured to sign git tags. https://help.github.com/articles/generating-a-new-gpg-key/
|
||||
|
||||
- You should configure your GitHub account to use your SSH keys (to push to branches). See https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/.
|
||||
* As an alternative, configure hub to push and fork with HTTPS: `git config --global hub.protocol https` (not tested yet).
|
||||
|
||||
## Release Process
|
||||
|
||||
|
||||
### Bump all Kata repositories
|
||||
|
||||
Bump the repositories using a script in the Kata packaging repo, where:
|
||||
- `BRANCH=<the-branch-you-want-to-bump>`
|
||||
- `NEW_VERSION=<the-new-kata-version>`
|
||||
```
|
||||
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
|
||||
$ export NEW_VERSION=<the-new-kata-version>
|
||||
$ export BRANCH=<the-branch-you-want-to-bump>
|
||||
$ ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH"
|
||||
```
|
||||
|
||||
### Point tests repository to stable branch
|
||||
|
||||
If you create a new stable branch, i.e. if your release changes a major or minor version number (not a patch release), then
|
||||
you should modify the `tests` repository to point to that newly created stable branch and not the `main` branch.
|
||||
The objective is that changes in the CI on the main branch will not impact the stable branch.
|
||||
|
||||
In the test directory, change references to the main branch in:
|
||||
* `README.md`
|
||||
* `versions.yaml`
|
||||
* `cmd/github-labels/labels.yaml.in`
|
||||
* `cmd/pmemctl/pmemctl.sh`
|
||||
* `.ci/lib.sh`
|
||||
* `.ci/static-checks.sh`
|
||||
|
||||
See the commits in [the corresponding PR for stable-2.1](https://github.com/kata-containers/tests/pull/3504) for an example of the changes.
|
||||
|
||||
|
||||
### Merge all bump version Pull requests
|
||||
|
||||
- The above step will create a GitHub pull request in the Kata projects. Trigger the CI using the `/test` command on each bump pull request.
|
||||
- Trigger the test-kata-deploy workflow on the kata-containers repository bump pull request using `/test_kata_deploy` (monitor under the "Actions" tab).
|
||||
- Check any failures and fix if needed.
|
||||
- Work with the Kata approvers to verify that the CI works and the pull requests are merged.
|
||||
|
||||
### Tag all Kata repositories
|
||||
|
||||
Once all the pull requests to bump versions in all Kata repositories are merged,
|
||||
tag all the repositories as shown below.
|
||||
```
|
||||
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
|
||||
$ git checkout <kata-branch-to-release>
|
||||
$ git pull
|
||||
$ ./tag_repos.sh -p -b "$BRANCH" tag
|
||||
```
|
||||
|
||||
### Check GitHub Actions
|
||||
|
||||
We use [GitHub Actions](https://github.com/features/actions) in this [file](../.github/workflows/release.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is automatically triggered by the above step, when a new tag is pushed to the `kata-containers/kata-containers` repository.
|
||||
|
||||
Check the [actions status page](https://github.com/kata-containers/kata-containers/actions) to verify all steps in the actions workflow have completed successfully. On success, a static tarball containing Kata release artifacts will be uploaded to the [Release page](https://github.com/kata-containers/kata-containers/releases).
|
||||
|
||||
### Create release notes
|
||||
|
||||
We have a script in place in the packaging repository to create release notes that include a short-log of the commits across Kata components.
|
||||
|
||||
Run the script as shown below:
|
||||
|
||||
```
|
||||
$ cd ${GOPATH}/src/github.com/kata-containers/kata-containers/tools/packaging/release
|
||||
# Note: OLD_VERSION is where the script should start to get changes.
|
||||
$ ./release-notes.sh ${OLD_VERSION} ${NEW_VERSION} > notes.md
|
||||
# Edit the `notes.md` file to review and make any changes to the release notes.
|
||||
# Add the release notes in the project's GitHub.
|
||||
$ hub release edit -F notes.md "${NEW_VERSION}"
|
||||
```
|
||||
|
||||
### Announce the release
|
||||
|
||||
Announce in [Slack and on the Kata mailing list](https://github.com/kata-containers/community#join-us) that the new release is ready.
|
||||
@@ -1,151 +0,0 @@
|
||||
Branch and release maintenance for the Kata Containers project.
|
||||
|
||||
## Introduction
|
||||
|
||||
This document provides details about Kata Containers releases.
|
||||
|
||||
## Versioning
|
||||
|
||||
The Kata Containers project uses [semantic versioning](http://semver.org/) for all releases.
|
||||
Semantic versions are comprised of three fields in the form:
|
||||
|
||||
```
|
||||
MAJOR.MINOR.PATCH
|
||||
```
|
||||
|
||||
For example: `1.0.0`, `1.0.0-rc.5`, and `99.123.77+foo.bar.baz.5`.
|
||||
|
||||
Semantic versioning is used since the version number is able to convey clear
|
||||
information about how a new version relates to the previous version.
|
||||
For example, semantic versioning can also provide assurances to allow users to know
|
||||
when they must upgrade compared with when they might want to upgrade:
|
||||
|
||||
- When `PATCH` increases, the new release contains important **security fixes**
|
||||
and an upgrade is recommended.
|
||||
|
||||
The patch field can contain extra details after the number.
|
||||
Dashes denote pre-release versions. `1.0.0-rc.5` in the example denotes the fifth release
|
||||
candidate for release `1.0.0`. Plus signs denote other details. In our example, `+foo.bar.baz.5`
|
||||
provides additional information regarding release `99.123.77` in the previous example.
|
||||
|
||||
- When `MINOR` increases, the new release adds **new features** but *without
|
||||
changing the existing behavior*.
|
||||
|
||||
- When `MAJOR` increases, the new release adds **new features, bug fixes, or
|
||||
both** and **changes the behavior from the previous release** (incompatible with previous releases).
|
||||
|
||||
A major release will also likely require a change of the container manager version used,
|
||||
for example Containerd or CRI-O. Please refer to the release notes for further details.
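
The versioning rules above can be illustrated with a short Rust sketch that uses only the standard library. The `Version` and `Upgrade` types and the `parse_version()` and `classify_upgrade()` functions are hypothetical names chosen for this example; they are not part of any Kata Containers tool, and the sketch does not attempt to implement the full semantic versioning precedence rules:

```rust
/// A parsed semantic version (illustrative type, not a Kata API).
#[derive(Debug, PartialEq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    /// Text after the first `-` (for example `rc.5`), if any.
    pub pre_release: Option<String>,
    /// Text after the first `+` (for example `foo.bar.baz.5`), if any.
    pub build: Option<String>,
}

/// Which field changed between two releases.
#[derive(Debug, PartialEq)]
pub enum Upgrade {
    Major,
    Minor,
    Patch,
    /// Same MAJOR.MINOR.PATCH (only the extra details may differ).
    None,
}

/// Parse a `MAJOR.MINOR.PATCH[-pre][+build]` string.
/// Returns `None` if the three numeric fields cannot be parsed.
pub fn parse_version(s: &str) -> Option<Version> {
    // Extra details introduced by a plus sign.
    let (rest, build) = match s.split_once('+') {
        Some((r, b)) => (r, Some(b.to_string())),
        None => (s, None),
    };

    // Pre-release details introduced by a dash.
    let (core, pre_release) = match rest.split_once('-') {
        Some((c, p)) => (c, Some(p.to_string())),
        None => (rest, None),
    };

    let mut fields = core.split('.');
    let major = fields.next()?.parse().ok()?;
    let minor = fields.next()?.parse().ok()?;
    let patch = fields.next()?.parse().ok()?;

    if fields.next().is_some() {
        // More than three numeric fields.
        return None;
    }

    Some(Version { major, minor, patch, pre_release, build })
}

/// Report the most significant field that differs between two versions.
pub fn classify_upgrade(old: &Version, new: &Version) -> Upgrade {
    if new.major != old.major {
        Upgrade::Major
    } else if new.minor != old.minor {
        Upgrade::Minor
    } else if new.patch != old.patch {
        Upgrade::Patch
    } else {
        Upgrade::None
    }
}
```

With this sketch, comparing `2.2.0` with `2.2.1` is classified as a `PATCH` change (upgrade recommended), whereas comparing `2.2.0` with `3.0.0` is classified as a `MAJOR` change (read the release notes before upgrading).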
|
||||
|
||||
## Release Strategy
|
||||
|
||||
Any new features added since the last release will be available in the next minor
|
||||
release. These will include bug fixes as well. To facilitate a stable user environment,
|
||||
Kata provides stable branch-based releases and a main branch release.
|
||||
|
||||
## Stable branch patch criteria
|
||||
|
||||
No new features should be introduced to stable branches. This is intended to limit risk to users,
|
||||
providing only bug and security fixes.
|
||||
|
||||
## Branch Management
|
||||
Kata Containers will maintain **one** stable release branch, in addition to the main branch, for
|
||||
each active major release.
|
||||
Once a new MAJOR or MINOR release is created from main, a new stable branch is created for
|
||||
the prior MAJOR or MINOR release and the previous stable branch is no longer maintained. End of
|
||||
maintenance for a branch is announced on the Kata Containers mailing list. Users can determine
|
||||
the version currently installed by running `kata-runtime kata-env`. It is recommended to use the
|
||||
latest stable branch available.
|
||||
|
||||
A couple of examples follow to help clarify this process.
|
||||
|
||||
### New bug fix introduced
|
||||
|
||||
A bug fix is submitted against the runtime which does not introduce new inter-component dependencies.
|
||||
This fix is applied to both the main and stable branches, and there is no need to create a new
|
||||
stable branch.
|
||||
|
||||
| Branch | Original version | New version |
|
||||
|--|--|--|
|
||||
| `main` | `2.3.0-rc0` | `2.3.0-rc1` |
|
||||
| `stable-2.2` | `2.2.0` | `2.2.1` |
|
||||
| `stable-2.1` | (unmaintained) | (unmaintained) |
|
||||
|
||||
|
||||
### New release made feature or change adding new inter-component dependency
|
||||
|
||||
A new feature is introduced, which adds a new inter-component dependency. In this case a new stable
|
||||
branch is created (stable-2.3) starting from main and the previous stable branch (stable-2.2)
|
||||
is dropped from maintenance.
|
||||
|
||||
|
||||
| Branch | Original version | New version |
|
||||
|--|--|--|
|
||||
| `main` | `2.3.0-rc1` | `2.3.0` |
|
||||
| `stable-2.3` | N/A| `2.3.0` |
|
||||
| `stable-2.2` | `2.2.1` | (unmaintained) |
|
||||
| `stable-2.1` | (unmaintained) | (unmaintained) |
|
||||
|
||||
Note, the stable-2.2 branch will still exist with tag 2.2.1, but under current plans it is
|
||||
not maintained further. The next tag applied to main will be 2.4.0-alpha0. We would then
|
||||
create a couple of alpha releases gathering features targeted for that particular release (in
|
||||
this case 2.4.0), followed by a release candidate. The release candidate marks a feature freeze.
|
||||
A new stable branch is created for the release candidate. Only bug fixes and any security issues
|
||||
are added to the branch going forward until release 2.4.0 is made.
|
||||
|
||||
## Backporting Process
|
||||
|
||||
Development that occurs against the main branch and applicable code commits should also be submitted
|
||||
against the stable branches. Some guidelines for this process follow:
|
||||
1. Only bug and security fixes which do not introduce inter-component dependencies are
|
||||
candidates for stable branches. These PRs should be marked with "bug" in GitHub.
|
||||
2. Once a PR is created against main which meets the requirement of (1), a comparable one
|
||||
should also be submitted against the stable branches. It is the responsibility of the submitter
|
||||
to apply their pull request against stable, and it is the responsibility of the
|
||||
reviewers to help identify stable-candidate pull requests.
|
||||
|
||||
## Continuous Integration Testing
|
||||
|
||||
The test repository is forked to create stable branches from main. Full CI
|
||||
runs on each stable and main PR using its respective tests repository branch.
|
||||
|
||||
### An alternative method for CI testing:
|
||||
|
||||
Ideally, the continuous integration infrastructure will run the same test suite on both main
|
||||
and the stable branches. When tests are modified or new feature tests are introduced, explicit
|
||||
logic should exist within the testing CI to make sure only applicable tests are executed against
|
||||
stable and main. While this is not in place currently, it should be considered in the long term.
|
||||
|
||||
## Release Management
|
||||
|
||||
### Patch releases
|
||||
|
||||
Releases are made every four weeks, which include a GitHub release as
|
||||
well as binary packages. These patch releases are made for both stable branches, and a "release candidate"
|
||||
for the next `MAJOR` or `MINOR` is created from main. If there are no changes across all the repositories, no
|
||||
release is created and an announcement is made on the developer mailing list to highlight this.
|
||||
If a release is being made, each repository is tagged for this release, regardless
|
||||
of whether changes are introduced. The release schedule can be seen on the
|
||||
[release rotation wiki page](https://github.com/kata-containers/community/wiki/Release-Team-Rota).
|
||||
|
||||
If there is urgent need for a fix, a patch release will be made outside of the planned schedule.
|
||||
|
||||
The process followed for making a release can be found at [Release Process](Release-Process.md).
|
||||
|
||||
## Minor releases
|
||||
|
||||
### Frequency
|
||||
Minor releases are less frequent in order to provide a more stable baseline for users. They are currently
|
||||
running on a sixteen-week cadence. The release schedule can be seen on the
|
||||
[release rotation wiki page](https://github.com/kata-containers/community/wiki/Release-Team-Rota).
|
||||
|
||||
### Compatibility
|
||||
Kata guarantees compatibility between components that are within one minor release of each other.
|
||||
|
||||
This is critical for dependencies which cross between host (shimv2 runtime) and
|
||||
the guest (hypervisor, rootfs and agent). For example, consider a cluster with a long-running
|
||||
deployment, workload-never-dies, all on Kata version 2.1.3 components. If the operator updates
|
||||
the Kata components to the next new minor release (i.e. 2.2.0), we need to guarantee that the 2.2.0
|
||||
shimv2 runtime still communicates with 2.1.3 agent within workload-never-dies.
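
The "within one minor release" rule can be expressed as a small check. The following Rust sketch is illustrative only: the `within_one_minor()` function is a hypothetical name, and the requirement that both components share the same `MAJOR` number is an assumption made for this example rather than something this document states:

```rust
/// Return true if two `(MAJOR, MINOR)` pairs are within one minor release
/// of each other.
///
/// Assumption for this sketch: both components must share the same MAJOR.
pub fn within_one_minor(a: (u64, u64), b: (u64, u64)) -> bool {
    let (a_major, a_minor) = a;
    let (b_major, b_minor) = b;

    if a_major != b_major {
        return false;
    }

    // Absolute difference between the two MINOR fields.
    let distance = if a_minor > b_minor {
        a_minor - b_minor
    } else {
        b_minor - a_minor
    };

    distance <= 1
}
```

For the scenario described above, a 2.2.0 shimv2 runtime and a 2.1.3 agent give `within_one_minor((2, 2), (2, 1))`, which returns `true`.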
|
||||
|
||||
Handling live-update is out of the scope of this document. See this [`kata-runtime` issue](https://github.com/kata-containers/runtime/issues/492) for details.
|
||||
@@ -1,379 +0,0 @@
|
||||
# Unit Test Advice
|
||||
|
||||
## Overview
|
||||
|
||||
This document offers advice on writing a Unit Test (UT) in
|
||||
[Golang](https://golang.org) and [Rust](https://www.rust-lang.org).
|
||||
|
||||
## General advice
|
||||
|
||||
### Unit test strategies
|
||||
|
||||
#### Positive and negative tests
|
||||
|
||||
Always add positive tests (where success is expected) *and* negative
|
||||
tests (where failure is expected).
|
||||
|
||||
#### Boundary condition tests
|
||||
|
||||
Try to add unit tests that exercise boundary conditions such as:
|
||||
|
||||
- Missing values (`null` or `None`).
|
||||
- Empty strings and huge strings.
|
||||
- Empty (or uninitialised) complex data structures
|
||||
(such as lists, vectors and hash tables).
|
||||
- Common numeric values (such as `-1`, `0`, `1` and the minimum and
|
||||
maximum values).
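
As an illustration of the boundary conditions listed above, the following Rust sketch tests a hypothetical `parse_port()` function (invented for this example, not taken from the Kata code base) with a missing value, an empty string, a huge string, and the numeric values around the limits of the `u16` type:

```rust
/// Hypothetical function under test: parse an optional port number string.
/// Port 0 is rejected.
pub fn parse_port(input: Option<&str>) -> Result<u16, String> {
    let text = input.ok_or_else(|| "missing value".to_string())?;

    if text.is_empty() {
        return Err("empty value".to_string());
    }

    let port: u16 = text
        .parse()
        .map_err(|e| format!("invalid port {:?}: {}", text, e))?;

    if port == 0 {
        return Err("port must be non-zero".to_string());
    }

    Ok(port)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_port_boundaries() {
        let huge = "9".repeat(10_000);

        // Each entry is (input, whether success is expected).
        let tests: Vec<(Option<&str>, bool)> = vec![
            (None, false),                // Missing value
            (Some(""), false),            // Empty string
            (Some(huge.as_str()), false), // Huge string
            (Some("-1"), false),
            (Some("0"), false),
            (Some("1"), true),            // Minimum valid value
            (Some("65535"), true),        // Maximum valid value
            (Some("65536"), false),       // One past the maximum
        ];

        for (i, (input, ok)) in tests.iter().enumerate() {
            let result = parse_port(*input);

            assert_eq!(
                result.is_ok(),
                *ok,
                "test[{}]: input: {:?}, result: {:?}",
                i,
                input,
                result
            );
        }
    }
}
```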
|
||||
|
||||
#### Test unusual values
|
||||
|
||||
Also always consider "unusual" input values such as:
|
||||
|
||||
- String values containing spaces, Unicode characters, special
|
||||
characters, escaped characters or null bytes.
|
||||
|
||||
> **Note:** Consider these unusual values in prefix, infix and
|
||||
> suffix position.
|
||||
|
||||
- String values that cannot be converted into numeric values or which
|
||||
contain invalid structured data (such as invalid JSON).
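
One way to cover unusual values in prefix, infix and suffix position is to generate the inputs in a loop. The Rust sketch below assumes a hypothetical `validate_name()` function (invented for this example) that only accepts ASCII letters, digits, `-` and `_`:

```rust
/// Hypothetical function under test: accept only non-empty names made of
/// ASCII letters, digits, `-` and `_`.
pub fn validate_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("name cannot be empty".to_string());
    }

    match name
        .chars()
        .find(|c| !(c.is_ascii_alphanumeric() || *c == '-' || *c == '_'))
    {
        Some(bad) => Err(format!("invalid character {:?} in name", bad)),
        None => Ok(()),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_validate_name_unusual_values() {
        // A space, a Unicode character, a special character, an escaped
        // character (a literal backslash followed by "n") and a null byte.
        let unusual = [" ", "é", "/", "\\n", "\0"];

        for u in unusual.iter() {
            // Try each unusual value in prefix, infix and suffix position.
            let inputs = [
                format!("{}foo", u),
                format!("fo{}o", u),
                format!("foo{}", u),
            ];

            for input in inputs.iter() {
                let result = validate_name(input);

                assert!(result.is_err(), "input: {:?}, result: {:?}", input, result);
            }
        }

        // A positive test to show that valid names are still accepted.
        assert!(validate_name("foo-bar_1").is_ok());
    }
}
```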
|
||||
|
||||
#### Other types of tests
|
||||
|
||||
If the code requires other forms of testing (such as stress testing,
|
||||
fuzz testing and integration testing), raise a GitHub issue and
|
||||
reference it on the issue you are using for the main work. This
|
||||
ensures the test team are aware that a new test is required.
|
||||
|
||||
### Test environment
|
||||
|
||||
#### Create unique files and directories
|
||||
|
||||
Ensure your tests do not write to a fixed file or directory. This can
|
||||
cause problems when running multiple tests simultaneously and also
|
||||
when running tests after a previous test run failure.
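
The following Rust sketch shows one way to obtain a unique directory using only the standard library. The `make_unique_test_dir()` helper is a hypothetical name invented for this example; in real Rust test code, the `tempfile` crate described in the "Temporary files" section is normally the better choice because it also handles deletion:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counter used to make names unique within this process.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Create a new, uniquely named directory below the system temporary
/// directory and return its path. The caller is responsible for removing it.
pub fn make_unique_test_dir(prefix: &str) -> io::Result<PathBuf> {
    loop {
        let id = NEXT_ID.fetch_add(1, Ordering::SeqCst);
        let name = format!("{}-{}-{}", prefix, std::process::id(), id);
        let path = std::env::temp_dir().join(name);

        // create_dir() fails if the path already exists, so a stale
        // directory left behind by a previous run is never reused.
        match fs::create_dir(&path) {
            Ok(()) => return Ok(path),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Combining the process ID with a per-process counter means that tests running in parallel threads, and test binaries running at the same time, each get their own directory.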
|
||||
|
||||
#### Assume parallel testing
|
||||
|
||||
Always assume your tests will be run *in parallel*. If this is
|
||||
problematic for a test, force it to run in isolation using the
|
||||
`serial_test` crate for Rust code for example.
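
If adding the `serial_test` crate is not an option, a similar effect can be approximated with a shared lock from the standard library. This is a sketch of an alternative technique, not how `serial_test` itself works; `TEST_LOCK` and `serialise_test()` are hypothetical names, and the sketch assumes a toolchain where `Mutex::new()` can be used in a `static` (Rust 1.63 or newer):

```rust
use std::sync::{Mutex, MutexGuard};

// Lock shared by all tests that must not run at the same time.
static TEST_LOCK: Mutex<()> = Mutex::new(());

/// Take the shared lock. A test that panics while holding the lock
/// "poisons" it, so recover the guard to allow later tests to run.
pub fn serialise_test() -> MutexGuard<'static, ()> {
    TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_that_changes_global_state() {
        // The lock is held until the guard is dropped at the end of
        // the test function.
        let _guard = serialise_test();

        // Test logic that modifies process-wide state goes here...
    }
}
```

Only tests that call `serialise_test()` are serialised with respect to each other; all other tests continue to run in parallel.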
|
||||
|
||||
### Running
|
||||
|
||||
Ensure you run the unit tests and they all pass before raising a PR.
|
||||
Ideally do this on different distributions on different architectures
|
||||
to maximise coverage (and so minimise surprises when your code runs in
|
||||
the CI).
|
||||
|
||||
## Assertions
|
||||
|
||||
### Golang assertions
|
||||
|
||||
Use the `testify` assertions package to create a new assertion object as this
|
||||
keeps the test code free from distracting `if` tests:
|
||||
|
||||
```go
|
||||
func TestSomething(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
err := doSomething()
|
||||
assert.NoError(err)
|
||||
}
|
||||
```
|
||||
|
||||
### Rust assertions
|
||||
|
||||
Use the standard set of `assert!()` macros.
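
For example, the following minimal sketch uses the `assert!()`, `assert_eq!()` and `assert_ne!()` macros from the standard library. The `double()` function is a hypothetical function under test, invented for this example:

```rust
/// Hypothetical function under test.
pub fn double(n: i32) -> i32 {
    n * 2
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_double() {
        // assert!() checks a boolean expression.
        assert!(double(2) > 0);

        // assert_eq!() and assert_ne!() compare two values and print
        // both of them on failure.
        assert_eq!(double(2), 4);
        assert_ne!(double(2), 5);

        // All the macros accept an optional format string that is
        // displayed if the assertion fails.
        let n = -3;
        assert_eq!(double(n), -6, "double({}) gave the wrong result", n);
    }
}
```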
|
||||
|
||||
## Table driven tests
|
||||
|
||||
Try to write tests using a table-based approach. This allows you to distill
|
||||
the logic into a compact table (rather than spreading the tests across
|
||||
multiple test functions). It also makes it easy to cover all the
|
||||
interesting boundary conditions.
|
||||
|
||||
### Golang table driven tests
|
||||
|
||||
Assume the following function:
|
||||
|
||||
```go
|
||||
// The function under test.
|
||||
//
|
||||
// Accepts a string and an integer and returns the
|
||||
// result of sticking them together separated by a dash as a string.
|
||||
func joinParamsWithDash(str string, num int) (string, error) {
|
||||
if str == "" {
|
||||
return "", errors.New("string cannot be blank")
|
||||
}
|
||||
|
||||
if num <= 0 {
|
||||
return "", errors.New("number must be positive")
|
||||
}
|
||||
|
||||
return fmt.Sprintf("%s-%d", str, num), nil
|
||||
}
|
||||
```
|
||||
|
||||
A table driven approach to testing it:
|
||||
|
||||
```go
|
||||
import (
"fmt"
|
||||
"testing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
)
|
||||
|
||||
func TestJoinParamsWithDash(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
// Type used to hold function parameters and expected results.
|
||||
type testData struct {
|
||||
param1 string
|
||||
param2 int
|
||||
expectedResult string
|
||||
expectError bool
|
||||
}
|
||||
|
||||
// List of tests to run including the expected results
|
||||
data := []testData{
|
||||
// Failure scenarios
|
||||
{"", -1, "", true},
|
||||
{"", 0, "", true},
|
||||
{"", 1, "", true},
|
||||
{"foo", 0, "", true},
|
||||
{"foo", -1, "", true},
|
||||
|
||||
// Success scenarios
|
||||
{"foo", 1, "foo-1", false},
|
||||
{"bar", 42, "bar-42", false},
|
||||
}
|
||||
|
||||
// Run the tests
|
||||
for i, d := range data {
|
||||
// Create a test-specific string that is added to each assert
|
||||
// call. It will be displayed if any assert test fails.
|
||||
msg := fmt.Sprintf("test[%d]: %+v", i, d)
|
||||
|
||||
// Call the function under test
|
||||
result, err := joinParamsWithDash(d.param1, d.param2)
|
||||
|
||||
// Update the message for more information on failure
|
||||
msg = fmt.Sprintf("%s, result: %q, err: %v", msg, result, err)
|
||||
|
||||
if d.expectError {
|
||||
assert.Error(err, msg)
|
||||
|
||||
// If an error is expected, there is no point
|
||||
// performing additional checks.
|
||||
continue
|
||||
}
|
||||
|
||||
assert.NoError(err, msg)
|
||||
assert.Equal(d.expectedResult, result, msg)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Rust table driven tests
|
||||
|
||||
Assume the following function:
|
||||
|
||||
```rust
|
||||
// Convenience type to allow Result return types to only specify the type
|
||||
// for the true case; failures are specified as static strings.
|
||||
// XXX: This is an example. In real code use the "anyhow" and
|
||||
// XXX: "thiserror" crates.
|
||||
pub type Result<T> = std::result::Result<T, &'static str>;
|
||||
|
||||
// The function under test.
|
||||
//
|
||||
// Accepts a string and an integer and returns the
|
||||
// result of sticking them together separated by a dash as a string.
|
||||
fn join_params_with_dash(str: &str, num: i32) -> Result<String> {
|
||||
if str.is_empty() {
|
||||
return Err("string cannot be blank");
|
||||
}
|
||||
|
||||
if num <= 0 {
|
||||
return Err("number must be positive");
|
||||
}
|
||||
|
||||
let result = format!("{}-{}", str, num);
|
||||
|
||||
Ok(result)
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
A table driven approach to testing it:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_join_params_with_dash() {
|
||||
// This is a type used to record all details of the inputs
|
||||
// and outputs of the function under test.
|
||||
#[derive(Debug)]
|
||||
struct TestData<'a> {
|
||||
str: &'a str,
|
||||
num: i32,
|
||||
result: Result<String>,
|
||||
}
|
||||
|
||||
// The tests can now be specified as a set of inputs and outputs
|
||||
let tests = &[
|
||||
// Failure scenarios
|
||||
TestData {
|
||||
str: "",
|
||||
num: 0,
|
||||
result: Err("string cannot be blank"),
|
||||
},
|
||||
TestData {
|
||||
str: "foo",
|
||||
num: -1,
|
||||
result: Err("number must be positive"),
|
||||
},
|
||||
|
||||
// Success scenarios
|
||||
TestData {
|
||||
str: "foo",
|
||||
num: 42,
|
||||
result: Ok("foo-42".to_string()),
|
||||
},
|
||||
TestData {
|
||||
str: "-",
|
||||
num: 1,
|
||||
result: Ok("--1".to_string()),
|
||||
},
|
||||
];
|
||||
|
||||
// Run the tests
|
||||
for (i, d) in tests.iter().enumerate() {
|
||||
// Create a string containing details of the test
|
||||
let msg = format!("test[{}]: {:?}", i, d);
|
||||
|
||||
// Call the function under test
|
||||
let result = join_params_with_dash(d.str, d.num);
|
||||
|
||||
// Update the test details string with the results of the call
|
||||
let msg = format!("{}, result: {:?}", msg, result);
|
||||
|
||||
// Perform the checks
|
||||
if d.result.is_ok() {
|
||||
assert!(result == d.result, "{}", msg);
|
||||
continue;
|
||||
}
|
||||
|
||||
let expected_error = format!("{}", d.result.as_ref().unwrap_err());
|
||||
let actual_error = format!("{}", result.unwrap_err());
|
||||
assert!(actual_error == expected_error, "{}", msg);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Temporary files
|
||||
|
||||
Always delete temporary files on success.
|
||||
|
||||
### Golang temporary files
|
||||
|
||||
```go
|
||||
func TestSomething(t *testing.T) {
|
||||
assert := assert.New(t)
|
||||
|
||||
// Create a temporary directory
|
||||
tmpdir, err := os.MkdirTemp("", "")
|
||||
assert.NoError(err)
|
||||
|
||||
// Delete it at the end of the test
|
||||
defer os.RemoveAll(tmpdir)
|
||||
|
||||
// Add test logic that will use the tmpdir here...
|
||||
}
|
||||
```
|
||||
|
||||
### Rust temporary files
|
||||
|
||||
Use the `tempfile` crate which allows files and directories to be deleted
|
||||
automatically:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use tempfile::tempdir;
|
||||
|
||||
#[test]
|
||||
fn test_something() {
|
||||
|
||||
// Create a temporary directory (which will be deleted automatically)
|
||||
let dir = tempdir().expect("failed to create tmpdir");
|
||||
|
||||
let filename = dir.path().join("file.txt");
|
||||
|
||||
// create filename ...
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
## Test user
|
||||
|
||||
[Unit tests are run *twice*](https://github.com/kata-containers/tests/blob/main/.ci/go-test.sh):
|
||||
|
||||
- as the current user
|
||||
- as the `root` user (if different to the current user)
|
||||
|
||||
When writing a test consider which user should run it; even if the code the
|
||||
test is exercising runs as `root`, it may be necessary to *only* run the test
|
||||
as a non-`root` user for the test to be meaningful. Add appropriate skip
|
||||
guards around code that requires `root` and non-`root` so that the test
|
||||
will run if the correct type of user is detected and skipped if not.
|
||||
|
||||
### Run Golang tests as a different user
|
||||
|
||||
The main repository has the most comprehensive set of skip abilities. See:
|
||||
|
||||
- [`katatestutils`](../src/runtime/pkg/katatestutils)
|
||||
|
||||
### Run Rust tests as a different user
|
||||
|
||||
One method is to use the `nix` crate along with some custom macros:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
#[allow(unused_macros)]
|
||||
macro_rules! skip_if_root {
|
||||
() => {
|
||||
if nix::unistd::Uid::effective().is_root() {
|
||||
println!("INFO: skipping {} which needs non-root", module_path!());
|
||||
return;
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
#[allow(unused_macros)]
|
||||
macro_rules! skip_if_not_root {
|
||||
() => {
|
||||
if !nix::unistd::Uid::effective().is_root() {
|
||||
println!("INFO: skipping {} which needs root", module_path!());
|
||||
return;
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_that_must_be_run_as_root() {
|
||||
// Skip the test if not running as the superuser.
|
||||
skip_if_not_root!();
|
||||
|
||||
// Run test *iff* the user running the test is root
|
||||
|
||||
// ...
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -1,127 +0,0 @@
|
||||
# Introduction
|
||||
|
||||
This document outlines the options for upgrading from a
|
||||
[Kata Containers 1.x release](https://github.com/kata-containers/runtime/releases) to a
|
||||
[Kata Containers 2.x release](https://github.com/kata-containers/kata-containers/releases).
|
||||
|
||||
# Maintenance warning
|
||||
|
||||
Kata Containers 2.x is the new focus for the Kata Containers development
|
||||
community.
|
||||
|
||||
Although Kata Containers 1.x releases will continue to be published for a
|
||||
period of time, once a stable release for Kata Containers 2.x is published,
|
||||
Kata Containers 1.x stable users should consider switching to the Kata 2.x
|
||||
release.
|
||||
|
||||
See the [stable branch strategy documentation](Stable-Branch-Strategy.md) for
|
||||
further details.
|
||||
|
||||
# Determine current version
|
||||
|
||||
To display the current Kata Containers version, run one of the following:
|
||||
|
||||
```bash
|
||||
$ kata-runtime --version
|
||||
$ containerd-shim-kata-v2 --version
|
||||
```
|
||||
|
||||
# Determine latest version
|
||||
|
||||
Kata Containers 2.x releases are published on the
|
||||
[Kata Containers GitHub releases page](https://github.com/kata-containers/kata-containers/releases).
|
||||
|
||||
Alternatively, if you are using Kata Containers version 1.12.0 or newer, you
|
||||
can check for newer releases using the command line:
|
||||
|
||||
```bash
|
||||
$ kata-runtime check --check-version-only
|
||||
```
|
||||
|
||||
There are various other related options. Run `kata-runtime check --help`
|
||||
for further details.
|
||||
|
||||
# Configuration changes
|
||||
|
||||
The [Kata Containers 2.x configuration file](/src/runtime/README.md#configuration)
|
||||
is compatible with the
|
||||
[Kata Containers 1.x configuration file](https://github.com/kata-containers/runtime/blob/master/README.md#configuration).
|
||||
|
||||
However, if you have created a local configuration file
|
||||
(`/etc/kata-containers/configuration.toml`), this will mask the newer Kata
|
||||
Containers 2.x configuration file.
|
||||
|
||||
Since Kata Containers 2.x introduces a number of new options and changes
|
||||
some default values, we recommend that you disable the local configuration
|
||||
file (by moving or renaming it) until you have reviewed the changes to the
|
||||
official configuration file and applied them to your local file if required.
|
||||
|
||||
# Upgrade Kata Containers
|
||||
|
||||
## Upgrade native distribution packaged version
|
||||
|
||||
As shown in the
|
||||
[installation instructions](install),
|
||||
Kata Containers provides binaries for popular distributions in their native
|
||||
packaging formats. This allows Kata Containers to be upgraded using the
|
||||
standard package management tools for your distribution.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> Users should prefer the distribution packaged version of Kata Containers
|
||||
> unless they understand the implications of a manual installation.
|
||||
|
||||
## Static installation
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> Unless you are an advanced user, if you are using a static installation of
|
||||
> Kata Containers, we recommend you remove it and install a
|
||||
> [native distribution packaged version](#upgrade-native-distribution-packaged-version)
|
||||
> instead.
|
||||
|
||||
### Determine if you are using a static installation
|
||||
|
||||
If the following command displays the output "static", you are using a static
|
||||
version of Kata Containers:
|
||||
|
||||
```bash
|
||||
$ ls /opt/kata/bin/kata-runtime &>/dev/null && echo static
|
||||
```
|
||||
|
||||
### Remove a static installation
|
||||
|
||||
Static installations are installed in `/opt/kata/`, so to uninstall simply
|
||||
remove this directory.
|
||||
|
||||
### Upgrade a static installation
|
||||
|
||||
If you understand the implications of using a static installation, to upgrade,
|
||||
first
|
||||
[remove the existing static installation](#remove-a-static-installation), then
|
||||
[install the latest release](#determine-latest-version).
|
||||
|
||||
See the
|
||||
[manual installation documentation](install/README.md#manual-installation)
|
||||
for details on how to automatically install and configure a static release
|
||||
with containerd.
|
||||
|
||||
# Custom assets
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> This section only applies to advanced users who have built their own guest
|
||||
> kernel or image.
|
||||
|
||||
If you are using custom
|
||||
[guest assets](design/architecture/README.md#guest-assets),
|
||||
you must upgrade them to work with Kata Containers 2.x since Kata
|
||||
Containers 1.x assets will **not** work.
|
||||
|
||||
See the following for further details:
|
||||
|
||||
- [Guest kernel documentation](/tools/packaging/kernel)
|
||||
- [Guest image and initrd documentation](/tools/osbuilder)
|
||||
|
||||
The official assets are packaged, meaning they are automatically included in
|
||||
new releases.
|
||||
@@ -1,247 +0,0 @@
|
||||
# Code PR Advice
|
||||
|
||||
Before raising a PR containing code changes, we suggest you consider
|
||||
the following to ensure a smooth and fast process.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> - All the advice in this document is optional. However, if the
|
||||
> advice provided is not followed, there is no guarantee your PR
|
||||
> will be merged.
|
||||
>
|
||||
> - All the check tools will be run automatically on your PR by the CI.
|
||||
> However, if you run them locally first, there is a much better
|
||||
> chance of a successful initial CI run.
|
||||
|
||||
## Assumptions
|
||||
|
||||
This document assumes you have already read (and in the case of the
|
||||
code of conduct agreed to):
|
||||
|
||||
- The [Kata Containers code of conduct](https://github.com/kata-containers/community/blob/main/CODE_OF_CONDUCT.md).
|
||||
- The [Kata Containers contributing guide](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md).
|
||||
|
||||
## Code
|
||||
|
||||
### Architectures
|
||||
|
||||
Do not write architecture-specific code if it is possible to write the
|
||||
code generically.
|
||||
|
||||
### General advice
|
||||
|
||||
- Do not write code to impress: instead write code that is easy to read and understand.
|
||||
|
||||
- Always consider which user will run the code. Try to minimise
|
||||
the privileges the code requires.
|
||||
|
||||
### Comments
|
||||
|
||||
Always add comments if the intent of the code is not obvious. However,
|
||||
try to avoid comments if the code could be made clearer (for example
|
||||
by using more meaningful variable names).
|
||||
|
||||
### Constants
|
||||
|
||||
Don't embed magic numbers and strings in functions, particularly if
|
||||
they are used repeatedly.
|
||||
|
||||
Create constants at the top of the file instead.
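As an illustration only, the sketch below names its values once at the top of the file instead of repeating literals inside functions. All names and values (`defaultRetries`, `defaultTimeout`, `socketName`) are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical values, defined once at the top of the file rather than
// embedded as magic numbers and strings inside functions.
const (
	defaultRetries = 3
	defaultTimeout = 30 * time.Second
	socketName     = "agent.sock"
)

// describe builds a summary using only the named constants.
func describe() string {
	return fmt.Sprintf("dial %s: up to %d attempts, %s timeout each",
		socketName, defaultRetries, defaultTimeout)
}

func main() {
	fmt.Println(describe())
}
```

Changing a value now means editing one line, and the name documents the intent.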
|
||||
|
||||
### Copyright and license
|
||||
|
||||
Ensure all new files contain a copyright statement and an SPDX license
|
||||
identifier in the comments at the top of the file.
|
||||
|
||||
### FIXME and TODO
|
||||
|
||||
If the code contains areas that are not fully implemented, make this
|
||||
clear with a comment that links to a GitHub issue providing
|
||||
further information.
|
||||
|
||||
Do not just rely on comments in this case though: if possible, return
|
||||
a "`BUG: feature X not implemented see {bug-url}`" type error.
|
||||
|
||||
### Functions
|
||||
|
||||
- Keep functions relatively short (less than 100 lines is a good "rule of thumb").
|
||||
|
||||
- Document functions if the parameters, return value or general intent
|
||||
of the function is not obvious.
|
||||
|
||||
- Always return errors where possible.
|
||||
|
||||
Do not discard error return values from the functions this function
|
||||
calls.
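A short sketch of both points follows, using a hypothetical `parsePort` helper: it returns an error instead of panicking or substituting a default, and it passes on (with added context) the error from the function it calls.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort converts a string to a TCP port number.
// (Hypothetical helper, for illustration only.)
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		// Do not discard the callee's error: wrap it with context.
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}

	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d out of range 1-65535", port)
	}

	return port, nil
}

func main() {
	for _, s := range []string{"8080", "http", "70000"} {
		port, err := parsePort(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("port:", port)
	}
}
```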
|
||||
|
||||
### Logging
|
||||
|
||||
- Don't use multiple log calls when a single log call could be used.
|
||||
|
||||
- Use structured logging where possible to allow
|
||||
[standard tooling](https://github.com/kata-containers/tests/tree/main/cmd/log-parser)
|
||||
to extract the log fields.
|
||||
|
||||
### Names
|
||||
|
||||
Give functions, macros and variables clear and meaningful names.
|
||||
|
||||
### Structures
|
||||
|
||||
#### Golang structures
|
||||
|
||||
Unlike Rust, Go does not enforce that all structure members be set.
|
||||
This has led to numerous bugs in the past where code like the
|
||||
following is used:
|
||||
|
||||
```go
|
||||
type Foo struct {
|
||||
Key string
|
||||
Value string
|
||||
}
|
||||
|
||||
// BUG: Key not set, but nobody noticed! ;(
|
||||
var foo1 = Foo{
|
||||
Value: "foo",
|
||||
}
|
||||
```
|
||||
|
||||
A much safer approach is to create a constructor function to enforce
|
||||
integrity:
|
||||
|
||||
```go
|
||||
type Foo struct {
|
||||
Key string
|
||||
Value string
|
||||
}
|
||||
|
||||
func NewFoo(key, value string) (*Foo, error) {
|
||||
if key == "" {
|
||||
return nil, errors.New("Foo needs a key")
|
||||
}
|
||||
|
||||
if value == "" {
|
||||
return nil, errors.New("Foo needs a value")
|
||||
}
|
||||
|
||||
return &Foo{
|
||||
Key: key,
|
||||
Value: value,
|
||||
}, nil
|
||||
}
|
||||
|
||||
func testFoo() error {
	// BUG: Key not set, but nobody noticed! ;(
	badFoo := Foo{Value: "value"}
	_ = badFoo // unused otherwise; Go rejects unused local variables

	// Ok - the constructor performs needed validation
	goodFoo, err := NewFoo("name", "value")
	if err != nil {
		return err
	}
	_ = goodFoo

	return nil
}
|
||||
```
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The above is just an example. The *safest* approach would be to move
|
||||
> `NewFoo()` into a separate package and make `Foo` and its elements
|
||||
> private. The compiler would then enforce the use of the constructor
|
||||
> to guarantee correctly defined objects.
|
||||
|
||||
|
||||
### Tracing
|
||||
|
||||
Consider if the code needs to create a new
|
||||
[trace span](./tracing.md).
|
||||
|
||||
Ensure any new trace spans added to the code are completed.
|
||||
|
||||
## Tests
|
||||
|
||||
### Unit tests
|
||||
|
||||
Where possible, code changes should be accompanied by unit tests.
|
||||
|
||||
Consider using the standard
|
||||
[table-based approach](Unit-Test-Advice.md)
|
||||
as it encourages you to make functions small and simple, and also
|
||||
allows you to think about what types of value to test.
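For readers unfamiliar with the table-based style, here is a minimal, self-contained sketch. The function under test (`validName`) and its rules are hypothetical. In a real Go test the loop would live in a `_test.go` file and report failures through `testing.T`; it is written as a plain program here only so that it runs on its own.

```go
package main

import "fmt"

// validName is a hypothetical function under test: a name must be
// non-empty and at most 8 bytes long.
func validName(name string) bool {
	return name != "" && len(name) <= 8
}

// testCase is one row of the table: an input and the expected result.
type testCase struct {
	name string
	want bool
}

// checkValidName runs every row through validName and returns a
// description of each mismatch (empty if all rows pass).
func checkValidName(cases []testCase) []string {
	var failures []string

	for i, c := range cases {
		if got := validName(c.name); got != c.want {
			failures = append(failures,
				fmt.Sprintf("case %d (%q): got %v, want %v", i, c.name, got, c.want))
		}
	}

	return failures
}

func main() {
	cases := []testCase{
		{"", false},          // empty value
		{"a", true},          // shortest valid value
		{"12345678", true},   // boundary: exactly 8 bytes
		{"123456789", false}, // just over the boundary
	}

	for _, f := range checkValidName(cases) {
		fmt.Println("FAIL:", f)
	}
}
```

Adding a new scenario is a one-line change to the table, which makes it natural to list empty, boundary and invalid values.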
|
||||
|
||||
### Other categories of test
|
||||
|
||||
Raise a GitHub issue in the
|
||||
[`tests`](https://github.com/kata-containers/tests) repository that
|
||||
explains what sort of test is required along with as much detail as
|
||||
possible. Ensure the original issue is referenced on the `tests` issue.
|
||||
|
||||
### Unsafe code
|
||||
|
||||
#### Rust language specifics
|
||||
|
||||
Minimise the use of `unsafe` blocks in Rust code and, since it is
potentially dangerous, always write [unit tests](#unit-tests)
for this code where possible.
|
||||
|
||||
`expect()` and `unwrap()` will cause the code to panic on error.
|
||||
Prefer to return a `Result` on error rather than using these calls to
|
||||
allow the caller to deal with the error condition.
|
||||
|
||||
The table below lists the small number of cases where use of
|
||||
`expect()` and `unwrap()` is permitted:
|
||||
|
||||
| Area | Rationale for permitting |
|
||||
|-|-|
|
||||
| In test code (the `tests` module) | Panics will cause the test to fail, which is desirable. |
|
||||
| `lazy_static!()` | This magic macro cannot "return" a value as it runs before `main()`. |
|
||||
| `defer!()` | Similar to Go's `defer` statement but doesn't allow the use of `?`. |
|
||||
| `tokio::spawn(async move {})` | Cannot currently return a `Result` from an `async move` closure. |
|
||||
| If an explicit test is performed before the `unwrap()` / `expect()` | *"Just about acceptable"*, but not ideal `[*]` |
|
||||
| `Mutex.lock()` | Failure to acquire the lock is almost always unrecoverable. |
|
||||
|
||||
|
||||
`[*]` - This can lead to bad *future* code: consider what would
happen if the explicit test gets dropped in the future. This is more
likely to happen if the test and the extraction of the value are two separate
operations. In summary, this strategy can introduce an insidious
maintenance issue.
|
||||
|
||||
## Documentation
|
||||
|
||||
### General requirements
|
||||
|
||||
- All new features should be accompanied by documentation explaining:
|
||||
|
||||
- What the new feature does
|
||||
|
||||
- Why it is useful
|
||||
|
||||
- How to use the feature
|
||||
|
||||
- Any known issues or limitations
|
||||
|
||||
Links should be provided to GitHub issues tracking the issues
|
||||
|
||||
- The [documentation requirements document](Documentation-Requirements.md)
|
||||
explains how the project formats documentation.
|
||||
|
||||
### Markdown syntax
|
||||
|
||||
Run the
|
||||
[markdown checker](https://github.com/kata-containers/tests/tree/main/cmd/check-markdown)
|
||||
on your documentation changes.
|
||||
|
||||
### Spell check
|
||||
|
||||
Run the
|
||||
[spell checker](https://github.com/kata-containers/tests/tree/main/cmd/check-spelling)
|
||||
on your documentation changes.
|
||||
|
||||
## Finally
|
||||
|
||||
You may wish to read the documentation that the
|
||||
[Kata Review Team](https://github.com/kata-containers/community/blob/main/Rota-Process.md) use to help review PRs:
|
||||
|
||||
- [PR review guide](https://github.com/kata-containers/community/blob/main/PR-Review-Guide.md).
|
||||
- [documentation review process](https://github.com/kata-containers/community/blob/main/Documentation-Review-Process.md).
|
||||
@@ -1,17 +0,0 @@
|
||||
# Design
|
||||
|
||||
Kata Containers design documents:
|
||||
|
||||
- [Kata Containers architecture](architecture)
|
||||
- [API Design of Kata Containers](kata-api-design.md)
|
||||
- [Design requirements for Kata Containers](kata-design-requirements.md)
|
||||
- [VSocks](VSocks.md)
|
||||
- [VCPU handling](vcpu-handling.md)
|
||||
- [Host cgroups](host-cgroups.md)
|
||||
- [`Inotify` support](inotify.md)
|
||||
- [Metrics (Kata 2.0)](kata-2-0-metrics.md)
|
||||
- [Design for Kata Containers `Lazyload` ability with `nydus`](kata-nydus-design.md)
|
||||
|
||||
---
|
||||
|
||||
- [Design proposals](proposals)
|
||||
@@ -1,88 +0,0 @@
|
||||
# Kata Containers and VSOCKs
|
||||
|
||||
## Introduction
|
||||
|
||||
There are two different ways processes in the virtual machine can communicate
|
||||
with processes in the host. The first one is by using serial ports, where the
|
||||
processes in the virtual machine can read/write data from/to a serial port
|
||||
device and the processes in the host can read/write data from/to a Unix socket.
|
||||
Most GNU/Linux distributions have support for serial ports, making it the most
|
||||
portable solution. However, the serial link limits read/write access to one
|
||||
process at a time.
|
||||
|
||||
A newer, simpler method is [VSOCKs][1], which can accept connections from
|
||||
multiple clients. The following diagram shows how it's implemented in Kata Containers.
|
||||
|
||||
### VSOCK communication diagram
|
||||
|
||||
```
|
||||
.----------------------.
|
||||
| .------------------. |
|
||||
| | .-----. .-----. | |
|
||||
| | |cont1| |cont2| | |
|
||||
| | `-----' `-----' | |
|
||||
| | | | | |
|
||||
| | .---------. | |
|
||||
| | | agent | | |
|
||||
| | `---------' | |
|
||||
| | | | | |
|
||||
| | POD .-------. | |
|
||||
| `-----| vsock |----' |
|
||||
| `-------' |
|
||||
| | | |
|
||||
| .------. .------. |
|
||||
| | shim | | shim | |
|
||||
| `------' `------' |
|
||||
| Host |
|
||||
`----------------------'
|
||||
```
|
||||
|
||||
## System requirements
|
||||
|
||||
The host Linux kernel version must be greater than or equal to v4.8, and the
|
||||
`vhost_vsock` module must be loaded or built-in (`CONFIG_VHOST_VSOCK=y`). To
|
||||
load the module run the following command:
|
||||
|
||||
```
|
||||
$ sudo modprobe -i vhost_vsock
|
||||
```
|
||||
|
||||
The Kata Containers version must be greater than or equal to 1.2.0 and `use_vsock`
|
||||
must be set to `true` in the runtime [configuration file][1].
|
||||
|
||||
### With VMWare guest
|
||||
|
||||
To use Kata Containers with VSOCKs in a VMWare guest environment, first stop the `vmware-tools` service and unload the VMWare Linux kernel module.
|
||||
```
|
||||
sudo systemctl stop vmware-tools
|
||||
sudo modprobe -r vmw_vsock_vmci_transport
|
||||
sudo modprobe -i vhost_vsock
|
||||
```
|
||||
|
||||
## Advantages of using VSOCKs
|
||||
|
||||
### High density
|
||||
|
||||
Using a proxy for multiplexing the connections between the VM and the host uses
4.5MB per [POD][2]. In a high density deployment this could add up to GBs of
memory that could have been used to host more PODs. When we talk about density
each kilobyte matters and it might be the decisive factor between running another
POD or not. For example, if you have 500 PODs running on a server, the same
number of [`kata-proxy`][3] processes will be running and consuming around
2250MB of RAM. Before making the decision not to use VSOCKs, you should ask
yourself: how many more containers could run with the RAM consumed by the
Kata proxies?
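The figure quoted above is simply the per-POD cost multiplied by the number of PODs. As a quick sketch, assuming the 4.5MB per-proxy figure holds for every POD (the function name is hypothetical):

```go
package main

import "fmt"

// proxyMBPerPod is the per-POD proxy memory cost quoted in the text.
const proxyMBPerPod = 4.5

// proxyMemoryMB estimates the total memory used by proxies for the
// given number of PODs (one proxy per POD).
func proxyMemoryMB(pods int) float64 {
	return float64(pods) * proxyMBPerPod
}

func main() {
	for _, pods := range []int{100, 500, 1000} {
		fmt.Printf("%4d PODs -> %.0f MB\n", pods, proxyMemoryMB(pods))
	}
}
```

For 500 PODs this gives 500 × 4.5MB = 2250MB, matching the example.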
|
||||
|
||||
### Reliability
|
||||
|
||||
[`kata-proxy`][3] is in charge of multiplexing the connections between virtual
machine and host processes; if it dies, all connections are broken. For example,
if you have a [POD][2] with 10 containers running and `kata-proxy` dies, it would
be impossible to contact your containers, though they would still be running.
Since communication via VSOCKs is direct, the only way to lose communication
with the containers is if the VM itself or the `containerd-shim-kata-v2` dies. If this happens,
the containers are removed automatically.
|
||||
|
||||
[1]: https://wiki.qemu.org/Features/VirtioVsock
|
||||
[2]: ./vcpu-handling.md#virtual-cpus-and-kubernetes-pods
|
||||
[3]: https://github.com/kata-containers/proxy
|
||||
@@ -1 +0,0 @@
|
||||
<mxfile host="app.diagrams.net" modified="2021-11-05T13:07:32.992Z" agent="5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" etag="j5e7J3AOXxeQrt-Zz2uw" version="15.6.8" type="device"><diagram id="XNV8G0dePIPkhS_Khqr4" name="Page-1">7Vxdd9o4EP01nLP7QI5s+fORUNhNT7rNbnqaZl/2CCywG2OxQhDIr18Z29iyZD6CDZRuHho8toQ9986dGVlNC3Yny98omvqfiIfDlg68ZQt+aOm6AUyb/4otq8TiGnpiGNPAS0wgNzwGbzgxapl1Hnh4ltoSEyMkZMFUNA5JFOEhE2yIUvIqXjYioScYpmiMJcPjEIWy9SnwmJ9YdQjd/MTvOBj76VdDCNI7n6Ds6tQw85FHXgsm2GvBLiWEJZ8myy4OY++JjulXnN3cGcUR22fAn3fPzx+jj7e9HrIXA330feIZ7czPCxTO00dO75atMh+MKZlP08swZXip8jwaZJcD+ca0zeNyomAywYyu+CXpRG3NNM1kUMoSXU+PX3OfG9nEfsHdG56gFOfxZvbcE/xD6gy1Yz7/88nuPiLwcNcZDLvEvZ10Vm1zt1+4WyIPx5NoLXj76gcMP07RMD77yqOB23w2CdPTIxKxlN78nuHtmCIv4A7qkpDQ9XzQxsjC8blREIYFu4ewMxpy+4xR8oILZ6yhgwcjfkZ2+Xa0yzDK2JzP81ZzntcNtedHI8+1LNnzo9FIHyo971kDy7Qa8Xx61gJCSGhyRHCtkXHRLLMhXGwJF659/KFrCwtHBkAbIA3rKgAAsHqdfjqDCBn/aRIYTRORUWiVa8rAwKZwcSRcuCQzFESYcrNWLz6K4HFtD9i2QrZM7HiGCjtHH8Ak3ETs+n0A+v0msdNhCTv7RkZPM1TwNSV37lb4asw6VwifDc4MnqJ88ldTTBfBjHulDB1/SibiI/o2IhEuAZGaUBiMI3445B7lvIC3sc8CXqZ20hOTwPPir1ESIqcMUORDn9DgLaZcmF7QnHKWUhqQ4TNUpUZj6GkSeuM5nvGUBl4wjeJW5odAsDnAzBJihiLgrIYgUz6DrJYSRqdvVwzblol82qJZM3Y75sfrV9wKGC+pXdEa7BTP16/s7/lLbVc0uY+8hn7lYGCkdgVKyJy0XdHkPvJn6VcOxu4C2xVte7t5zf3K0fCdv12Ry6efpl05XDgvrFvR5V7zmruVw/E6Z7OiDjcoQYK9MX5MDwllPhmTCIW93FpyXn7NPSHTFMXvmLFV6lI0Z0TEGC8D9q3w+TmeiueN5OjDMp15fbCSMdJijDg0dPUtuzI+KMwSH+bTrI+yeWRss3xO5nSYsfb+y53jDp6+P1r+y+fXj+HLU97AMETHmG1xalo+xI7cygqKQ8SCBRZuQwXxemiHUrQqXDAlQcRmhZkfYoPQBNqOmJstu0gYxQjD1ksjnBLFkrvICbd5nCNUA0qqwRidDpXMvEcDLiMCm/ZXAopnwVvaV8dcSF3IJzdvW+YHBctktmwNo727+erWHdxormWIMpEcHUYXGV0osmFz09kUZDSaYSZJymEIqyNHLqirEb5A7Xmv1hTZZB+nPa6sPVtFaqf45HyDBrRFvtnHEa5WQqklQy7ir5g6aiE6hjpqEbuQtOUCUf52p63yCFOzS6w7Lm1t9WtB1LqFNrMXjfmnlm6FcYE74CZrHH/6ZdOLeq04F/T5v92/7tqff5UoXeeyj+U5tqXsPWHHNGA2w37LPtlLib0L37auOa4IKpQXpDXHkktfq4bSV4lf1tb+HBpyXOmr75t+4L7pZ28ROQpjXY7RBxrP7eP5LH5wTBdYXla4rsATyz4LqALPaCbw1Mn7nHGXx9pzMdR2xF0eas
/ZfCfJ3VDRcqrFrPa4e2fytk1bSbfq5F0eYTpmrclbyUH5XaTP2FRJzMvsOPUKJTi44QQ3Um6uqd/MXm9l/aZuiFM01x7ICwqV6J5MdqCY8QGwTqI98aQPmAbcpTFTj1wC21+P4J56tGGhCQEUK8QjaVhNs8NVzbHFezNAaSP7TlUrjTha1dDX0f1d+292B77+8dRGX7/MfHCmzPpOplaycPcah9tIslM1lpq4jWZTPGWTJBGTjjuGYVXftKXpLY0wHbh9hA5ca9uIZtpkKKfaF8RQe0KigCle6V3sXofDa2/NNfWbCgIVqm/dVLxgba76lrcvXGjb+64yetvK1u4VMKtuYTkOKjl0frQXI5VR8446FZqmXlNlCsqvQkqyXktpqnxnvMdWvOZ3hzqGmAgMR7EmYG928hR1yW5s05XCMcnaqRcsBAdZ/87j/5C4pmR7tuZkh1+gGdPlmpjZ+WzFNV9wbc/8YNJep5/FZnp+t+tvSC6uLx8ZDeejrfQ6aDtqBdTNrbzKujYjw5f6XK/a8esAwIt4hes7JgAGOKXrs/2o2o1e2qUtb51T1QZ1bL5SPoL8mvYCxEn5pkDNWKcpxir3rv+vToeGiF1BnMtSJ3n16ArUaX/XV6qTKW9Wq0md+GH+RwaSSiv/Ww2w9x8=</diagram></mxfile>
|
||||
@@ -1,47 +0,0 @@
|
||||
@startuml
|
||||
|
||||
User->CLI: network add-interface
|
||||
CLI->virtcontainers: AddInterface
|
||||
virtcontainers->QEMU:QMP-hot-add-network
|
||||
virtcontainers->agent:UpdateInterface
|
||||
note right
|
||||
the agent's UpdateInterface code will need to be augmented
|
||||
to have a timeout/wait associated with this for the network
|
||||
device to appear (i.e., wait for QMP to complete)
|
||||
end note
|
||||
agent->User: err, interface detail
|
||||
|
||||
User->CLI: network del-interface
|
||||
CLI->virtcontainers: DeleteInterface
|
||||
note right
|
||||
There will be no call to the agent. We rely on guest kernel
|
||||
to clean up any state associated with the interface.
|
||||
end note
|
||||
virtcontainers->QEMU:QMP-hot-delete-network
|
||||
virtcontainers->User: err, interface detail
|
||||
|
||||
User->CLI: network list-interface
|
||||
CLI->virtcontainers: ListInterfaces
|
||||
virtcontainers->agent:ListInterfaces
|
||||
agent->User: err, list of interface details
|
||||
|
||||
User->CLI: network update-routes
|
||||
CLI->virtcontainers: UpdateRoutes
|
||||
note right
|
||||
Routes are handled on a 'one shot' basis,
|
||||
setting all of the routes for the network. This needs to
|
||||
be called after interfaces are added, and should be called
|
||||
after interfaces are removed. It should be fine to call once
|
||||
after adding all of the expected interfaces. If you know all
|
||||
the resulting routes, simply calling set routes with the
|
||||
complete list should suffice.
|
||||
end note
|
||||
virtcontainers->agent:UpdateRoutes
|
||||
agent->User: err, list of routes
|
||||
|
||||
User->CLI: network list-routes
|
||||
CLI->virtcontainers: ListRoutes
|
||||
virtcontainers->agent:ListRoutes
|
||||
agent->User: err, list of routes
|
||||
|
||||
@enduml
|
||||
@@ -1,174 +0,0 @@
|
||||
Title: Kata Flow
|
||||
participant CRI
|
||||
participant CRIO
|
||||
participant Kata Runtime
|
||||
participant virtcontainers
|
||||
participant hypervisor
|
||||
participant agent
|
||||
participant shim-pod
|
||||
participant shim-ctr
|
||||
participant proxy
|
||||
|
||||
# Run the sandbox
|
||||
CRI->CRIO: RunPodSandbox()
|
||||
CRIO->Kata Runtime: create
|
||||
Kata Runtime->virtcontainers: CreateSandbox()
|
||||
Note left of virtcontainers: Sandbox\nReady
|
||||
virtcontainers->virtcontainers: createNetwork()
|
||||
virtcontainers->virtcontainers: Execute PreStart Hooks
|
||||
virtcontainers->+hypervisor: Start VM (inside the netns)
|
||||
hypervisor-->-virtcontainers: VM started
|
||||
virtcontainers->proxy: Start Proxy
|
||||
proxy->hypervisor: Connect the VM
|
||||
virtcontainers->+agent: CreateSandbox()
|
||||
agent-->-virtcontainers: Sandbox Created
|
||||
virtcontainers->+agent: CreateContainer()
|
||||
agent-->-virtcontainers: Container Created
|
||||
virtcontainers->shim-pod: Start Shim
|
||||
shim-pod->agent: ReadStdout() (blocking call)
|
||||
shim-pod->agent: ReadStderr() (blocking call)
|
||||
shim-pod->agent: WaitProcess() (blocking call)
|
||||
Note left of virtcontainers: Container-pod\nReady
|
||||
virtcontainers-->Kata Runtime: End of CreateSandbox()
|
||||
Kata Runtime-->CRIO: End of create
|
||||
CRIO->Kata Runtime: start
|
||||
Kata Runtime->virtcontainers: StartSandbox()
|
||||
Note left of virtcontainers: Sandbox\nRunning
|
||||
virtcontainers->+agent: StartContainer()
|
||||
agent-->-virtcontainers: Container Started
|
||||
Note left of virtcontainers: Container-pod\nRunning
|
||||
virtcontainers->virtcontainers: Execute PostStart Hooks
|
||||
virtcontainers-->Kata Runtime: End of StartSandbox()
|
||||
Kata Runtime-->CRIO: End of start
|
||||
CRIO-->CRI: End of RunPodSandbox()
|
||||
|
||||
# Create the container
|
||||
CRI->CRIO: CreateContainer()
|
||||
CRIO->Kata Runtime: create
|
||||
Kata Runtime->virtcontainers: CreateContainer()
|
||||
virtcontainers->+agent: CreateContainer()
|
||||
agent-->-virtcontainers: Container Created
|
||||
virtcontainers->shim-ctr: Start Shim
|
||||
shim-ctr->agent: ReadStdout() (blocking call)
|
||||
shim-ctr->agent: ReadStderr() (blocking call)
|
||||
shim-ctr->agent: WaitProcess() (blocking call)
|
||||
Note left of virtcontainers: Container-ctr\nReady
|
||||
virtcontainers-->Kata Runtime: End of CreateContainer()
|
||||
Kata Runtime-->CRIO: End of create
|
||||
CRIO-->CRI: End of CreateContainer()
|
||||
|
||||
# Start the container
|
||||
CRI->CRIO: StartContainer()
|
||||
CRIO->Kata Runtime: start
|
||||
Kata Runtime->virtcontainers: StartContainer()
|
||||
virtcontainers->+agent: StartContainer()
|
||||
agent-->-virtcontainers: Container Started
|
||||
Note left of virtcontainers: Container-ctr\nRunning
|
||||
virtcontainers-->Kata Runtime: End of StartContainer()
|
||||
Kata Runtime-->CRIO: End of start
|
||||
CRIO-->CRI: End of StartContainer()
|
||||
|
||||
# Stop the container
|
||||
CRI->CRIO: StopContainer()
|
||||
CRIO->Kata Runtime: kill
|
||||
Kata Runtime->virtcontainers: KillContainer()
|
||||
virtcontainers->+agent: SignalProcess()
|
||||
alt SIGTERM OR SIGKILL
|
||||
agent-->shim-ctr: WaitProcess() returns
|
||||
end
|
||||
agent-->-virtcontainers: Process Signalled
|
||||
virtcontainers-->Kata Runtime: End of KillContainer()
|
||||
alt SIGTERM OR SIGKILL
|
||||
Kata Runtime->virtcontainers: StopContainer()
|
||||
virtcontainers->+shim-ctr: waitForShim()
|
||||
alt Timeout exceeded
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->shim-ctr: WaitProcess() returns
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->shim-ctr: waitForShim()
|
||||
end
|
||||
shim-ctr-->-virtcontainers: Shim terminated
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->+agent: RemoveContainer()
|
||||
agent-->-virtcontainers: Container Removed
|
||||
Note left of virtcontainers: Container-ctr\nStopped
|
||||
virtcontainers-->Kata Runtime: End of StopContainer()
|
||||
end
|
||||
Kata Runtime-->CRIO: End of kill
|
||||
CRIO-->CRI: End of StopContainer()
|
||||
|
||||
# Remove the container
|
||||
CRI->CRIO: RemoveContainer()
|
||||
CRIO->Kata Runtime: delete
|
||||
Kata Runtime->virtcontainers: DeleteContainer()
|
||||
virtcontainers->virtcontainers: Delete container resources
|
||||
virtcontainers-->Kata Runtime: End of DeleteContainer()
|
||||
Kata Runtime-->CRIO: End of delete
|
||||
CRIO-->CRI: End of RemoveContainer()
|
||||
|
||||
# Stop the sandbox
|
||||
CRI->CRIO: StopPodSandbox()
|
||||
CRIO->Kata Runtime: kill
|
||||
Kata Runtime->virtcontainers: KillContainer()
|
||||
virtcontainers->+agent: SignalProcess()
|
||||
alt SIGTERM OR SIGKILL
|
||||
agent-->shim-pod: WaitProcess() returns
|
||||
end
|
||||
agent-->-virtcontainers: Process Signalled
|
||||
virtcontainers-->Kata Runtime: End of KillContainer()
|
||||
alt SIGTERM OR SIGKILL
|
||||
Kata Runtime->virtcontainers: StopSandbox()
|
||||
loop for each container
|
||||
alt Container-ctr
|
||||
virtcontainers->+shim-ctr: waitForShim()
|
||||
alt Timeout exceeded
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->shim-ctr: WaitProcess() returns
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->shim-ctr: waitForShim()
|
||||
end
|
||||
shim-ctr-->-virtcontainers: Shim terminated
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->+agent: RemoveContainer()
|
||||
agent-->-virtcontainers: Container Removed
|
||||
Note left of virtcontainers: Container-ctr\nStopped
|
||||
else Container-pod
|
||||
virtcontainers->+shim-pod: waitForShim()
|
||||
alt Timeout exceeded
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->shim-pod: WaitProcess() returns
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->shim-pod: waitForShim()
|
||||
end
|
||||
shim-pod-->-virtcontainers: Shim terminated
|
||||
virtcontainers->+agent: SignalProcess(SIGKILL)
|
||||
agent-->-virtcontainers: Process Signalled by SIGKILL
|
||||
virtcontainers->+agent: RemoveContainer()
|
||||
agent-->-virtcontainers: Container Removed
|
||||
Note left of virtcontainers: Container-pod\nStopped
|
||||
end
|
||||
end
|
||||
virtcontainers->+agent: DestroySandbox()
|
||||
agent-->-virtcontainers: Sandbox Destroyed
|
||||
virtcontainers->hypervisor: Stop VM
|
||||
Note left of virtcontainers: Sandbox\nStopped
|
||||
virtcontainers->virtcontainers: removeNetwork()
|
||||
virtcontainers->virtcontainers: Execute PostStop Hooks
|
||||
virtcontainers-->Kata Runtime: End of StopSandbox()
|
||||
end
|
||||
Kata Runtime-->CRIO: End of kill
|
||||
CRIO-->CRI: End of StopPodSandbox()
|
||||
|
||||
# Remove the sandbox
|
||||
CRI->CRIO: RemovePodSandbox()
|
||||
CRIO->Kata Runtime: delete
|
||||
Kata Runtime->virtcontainers: DeleteSandbox()
|
||||
loop for each container
|
||||
virtcontainers->virtcontainers: Delete container resources
|
||||
end
|
||||
virtcontainers->virtcontainers: Delete sandbox resources
|
||||
virtcontainers-->Kata Runtime: End of DeleteSandbox()
|
||||
Kata Runtime-->CRIO: End of delete
|
||||
CRIO-->CRI: End of RemovePodSandbox()
|
||||
@@ -1 +0,0 @@
|
||||
<mxfile host="Chrome" modified="2020-07-02T06:45:31.744Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36" etag="f3JpMUEY9_WRpPV9i93y" version="13.3.5" type="device"><diagram id="XNV8G0dePIPkhS_Khqr4" name="Page-1">7VrbcqM4EP0aPzolIa6PTpzMPEyqspna2tl9SclGNmwwYoRw7P36lUDY3GzjBPA4NSlXBTWtC92nT7cEI3S32nxhOPIeqUuCkQbczQhNR5qGHAeKf1KyzSSWjjLBkvluJgJ7wXf/P5IJYS5NfJfESpaJOKUB96OycE7DkMx5SYYZo29ltQUN3JIgwktSE3yf46Au/ct3uZdJNYSc/Y2vxF96amqEgFr5CufaShB72KVvBRG6H6E7RinPrlabOxJI65UN83Dg7m5ljIS8TYd4pmuY/PvHjzj+Of06ftWeH/4Zm9koaxwk6omfGF0R7pEklmsmbE2YWj/f5lYRjxLJy2QVfPMXJPBD0bqNCPNFV6GPpoESP+1lt2+ez8n3CM9l1zeBFyHz+CoQLSguhQs5Fl3Yrh0EOIr9WTorEBJG5gmL/TV5JnGGFCmlCZcz3e0QIIULMZhCk7T07cIPgjsaUJY+ADJd2zJ1IY85o6+kcAdNDB1BNUJBvkj/hLxuduUJYShONgWRcsMXIg3KtkJF3R1rmm1lnVRYaKYCyVsRZErHK+DLzhWxAvZyN/re9eJCef8MJFg1JLxijscrGvqcfkIIwBmGRGuCAADm/eShZwhAxwKnIWCAISEAYQ0DNb8XvBVRP+TpGozbkTGtuJ8y7tElDXFQBEDVKXVTHgVna/tCrWxdqBs168IG4xqG0ZNx7QbbmgGXAIxwWDKy+TORSSE11jiD+UQoaFq0SU2W3xdXS14MG3cce/5qnAbueje+WG42Rab9O5S7DmXTaBHKdlMoG32FsmYMGcrQaB/K9rn2hcgoW7dlKOuwr1DWmmL5So2rV4xr1aHbaNzegOsc5EnXX+cclkQuFuwjyO5RzOrP4wLXFdRqXiGuqO5Vc2/4+720SGE48JehZD+yUDQ998Plt7Q1lXRDQnci9xiiOQvo/FWSGk1Cl7iKt4Sz2PaHGi5t/F1ntNS/ZONzqQhuHGSrdqp7A4Cj2tNNPqxsbAuNAnJS2XloiWnC5uSYTxRGOGZLcmzAPDCkmVtRJ7gRHK2XEDiGahBGAsxFeijRchPe1PBPMswKyK6ScpVrs8dWvfaoFR7F24Kait7D8+h64zwPh/Qt42P6VklfXGQrbu6NULkMqtiALhYx4aNq0O489f44RuDTciTSL8yRqF6n/5kS4nMScmGeCifOWM6HfjgPEleQmFzpVlh47cdybwdWeY/r50toa1fDl3mQnOTLjrkQaT1xYYWrUPVI6pS+foTbumImWGcmvCTqwX/vizrcF+32PMf2RaBpX9QfddYPPH+RtLSDZWsDO+Xg0bV2aalWBHRn3Kba/UqNC0HFuqYoxbXL5n29zlzZRuglJbCXq83i8AbosJzFQd7uPIvnLjuZxnOuOJ3GnYHSOCzzqa7vMdl1Jq/CXz+RydEH9c0hMr9Wix+P8yg9XX0QP4FmHHg0Fr6e2MABcpIDIVUIGMkpvug4UYEzo5zTVXOg7EIDnAqNgwn4JGzbojY/7e8bteNa+alXM9AB0Hbmd1TzOyM8YeFH3UojEsqDKRx7KftlhJe/xrYLDAdK7OYYnXo89+RJl5sX87htlgfJltqbx7W6x/usRKyuKuZd4ZvbDV34qEHTD5Qc0pYkWFzz2UG54gCwi4qj85DNkdx7zEK7mpJbsvS5pUUDyuHxYkH0qFY+5/dAAxQYWn13Kb0r8IBDd0Y3R6Ll
cgXFaQgag0GwAgwDVF7h9VwoaPWPYD5VoXA5T9qOM2gBgOqeTPfIN2LSx18uBDs8UR6qxBMVXtnDpj1sUY/qL+E/Vay2Pn0YLqiNSk61jGGren3Yqv6o/86p6qFVfkVsgwtX9fqhqv5lmZD4Cg8S31/c6IPV19WXXVZLwjy/vq7uvvOZ3ln7iub+K/VMff+xP7r/Hw==</diagram></mxfile>
|
||||
@@ -1 +0,0 @@
|
||||
<mxfile host="app.diagrams.net" modified="2022-01-18T14:06:01.890Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" etag="nId-8OV6FDjWTDgzqDu-" version="15.8.9" type="device"><diagram id="bkF_ZONM9sPFCpIYoGFl" name="Page-1">5Vtbj6M2GP01eUyEbW55nGSmM2q70mqnUnf6UrnBCdYSnIIzSfbX14AJYDsLSSDZUWdHGvxhDD7n8F1sdoTm6/1zgjfhJxaQaAStYD9CjyMIgQ1d8SezHAqL504LwyqhgexUGV7pdyKNlrRuaUDSRkfOWMTppmlcsDgmC96w4SRhu2a3JYuad93gFdEMrwsc6dY/acDDwupDr7K/ELoKyzuDcn5rXHaWM0lDHLBdzYSeRmieMMaLo/V+TqIMvBKX4rpfTpw9PlhCYt7lgpd4/J1SSP9++bR8Cb6A1de/XsZ2Mco7jrZywiFLuXxgfihR4GQv7jEL+ToSBiAOU56wb2TOIpYIS8xi0XO2pFGkmHBEV7FoLsRTEmGfvZOEU4HvgzyxpkGQ3Wa2Cyknrxu8yO65E2oStoRt44BkE7BES5+xBCEbk+xrJonAM2FrwpOD6CLP+o5kQ8rRsZ2ivavItWWXsMZrSSKWclodR64QFwcSdDMB8ycYj+f+Hw/+b3jxvN57e/vXsa8RoIHfBMEEU42XJYu5fIugfeSplC5QSBpBtHSyf/LKmr340ZgWZ9z858iHBr6BopN8INDkAwGdj6llIMSxh2JkamDEjbhEqEGN+++WlSfGaY76g+gA3c2+OimOVtnf+BBs03Ea400aMp69DHJY8ZTFyEW/H/AP+uC/D9aQNbFAkzjDiwQ8A3H+ULyVSrqCOARNxInQwjGNSRIMzth0OMacCYJN14csnTFnOkG+Tpo3GGnAQJqCJomDhyySZ1EkwmlKFzlKOOG6uYZr023WUBYTRDOBW3L4mp2cOGXzTV6ZNx738sqidWjEIBJoWYMWlFK2TRakg2DFTFaEt3kkndoab47JQ0pbQiLM6XvzeU1Eyjt8ZjR/W0rluErELD10OUQxT3lVPf9QBrIVV2+7ykAFDtpAua6O075Cauh6x97iH8ZpSNfjb5jj8TscxFn04Aocx2n3A65BUMM5AT0L7c+lwqFcqg8UHKEeAVGJdSOXdAYD0rle4tOTucvw4W8wrhyvyZU7NWQr0KB5dzCq3OupMqaZufcRVWnOzwfNVnxbiTlTg4tCP4h5/dPlXZin1KA7phxjkT3DRtZhTbxj+0Tikbc+k4SKCWWFdGHcU/61HF4cv1UJjWhVI2WNITIYdM/MxIOKStSEomtmosrNVVOcoTOTDosAncWl5LNWm6ykgirVvNX0dCMFdciBC0ruJjWkKAReKjWnZaCBpQZNRfLFUmu6sFYPdmdn1bXcuq9Xc1WFqClIV6mpA3nWjaV2aWlfl9oFkql5QgvYTYkC95Ioexd/Z/9MoVWLiJ39HWiJ0UOLEBpEeF6aDXxTmr3akrRzhv0zbZ9cl5grcdBxJL732j6BpqWDM/k1llHFNthHordZifn9EA6A4gmQYZXjtozraxxzoFFyaU2bB4hBalpggROpX1tRO9gaBNTXILLt6GX6IeH0O8KJBoNTXyOg6+zzAhGOPw6sSi3sGTZkgWlDdjhYTdXxmS7eMbn4NBSwBDQZZJ2s9OwRWfJ+qJmq+bxxq/yGxKAOteStNzc0t2BC6aZeodx1/d/LV0kdfeve8jXtB95ZvtNpO0i3VW+Hrbm2Iv70RjysL0DWS/xbrQkVL+e9qmzfP8H2uVW2Fhrs21bZyLTv2K9KykWd4wJkvx9rtK7HFFnIvZQCLNiXVFxVKt7kxmL
Rq47yo7g8mpmL63Mrahm4TtbTqXDjNF79nnd7tCvLF0leZmLi8mWUazYUFxIxwmyT4ZIj5czEr0Bznq1IOuJZ56INqrb4zbonfM5i8fiY5pojOOW7bO0okzzHHP+Tz1Sv4HvLiFzHLJ2adD3DZwrDxZRet7vO24MIcBoe43mP7qEQ9f3cg6VwrC6/dHUP6kYXALA//yCa1efuRffqPw2gp/8A</diagram></mxfile>
|
||||
|
Before Width: | Height: | Size: 51 KiB |
@@ -1,31 +0,0 @@
|
||||
Title: Kata Flow
|
||||
participant Docker
|
||||
participant Kata Runtime
|
||||
participant virtcontainers
|
||||
participant hypervisor
|
||||
participant agent
|
||||
participant shim-pod
|
||||
participant shim-ctr
|
||||
participant proxy
|
||||
|
||||
#Docker Create!
|
||||
Docker->Kata Runtime: create
|
||||
Kata Runtime->virtcontainers: CreateSandbox()
|
||||
Note left of virtcontainers: Sandbox\nReady
|
||||
virtcontainers->virtcontainers: createNetwork()
|
||||
virtcontainers->virtcontainers: Execute PreStart Hooks
|
||||
virtcontainers->+hypervisor: Start VM (inside the netns)
|
||||
hypervisor-->-virtcontainers: VM started
|
||||
virtcontainers->proxy: Start Proxy
|
||||
proxy->hypervisor: Connect the VM
|
||||
virtcontainers->+agent: CreateSandbox()
|
||||
agent-->-virtcontainers: Sandbox Created
|
||||
virtcontainers->+agent: CreateContainer()
|
||||
agent-->-virtcontainers: Container Created
|
||||
virtcontainers->shim-pod: Start Shim
|
||||
shim->agent: ReadStdout() (blocking call)
|
||||
shim->agent: ReadStderr() (blocking call)
|
||||
shim->agent: WaitProcess() (blocking call)
|
||||
Note left of virtcontainers: Container\nReady
|
||||
virtcontainers-->Kata Runtime: End of CreateSandbox()
|
||||
Kata Runtime-->Docker: End of create
|
||||
|
Before Width: | Height: | Size: 7.8 KiB |
@@ -1,20 +0,0 @@
|
||||
Title: Docker Exec
|
||||
participant Docker
|
||||
participant kata-runtime
|
||||
participant virtcontainers
|
||||
participant shim
|
||||
participant hypervisor
|
||||
participant agent
|
||||
participant proxy
|
||||
|
||||
#Docker Exec
|
||||
Docker->kata-runtime: exec
|
||||
kata-runtime->virtcontainers: EnterContainer()
|
||||
virtcontainers->agent: exec
|
||||
agent->virtcontainers: Process started in the container
|
||||
virtcontainers->shim: start shim
|
||||
shim->agent: ReadStdout()
|
||||
shim->agent: ReadStderr()
|
||||
shim->agent: WaitProcess()
|
||||
virtcontainers->kata-runtime: End of EnterContainer()
|
||||
kata-runtime-->Docker: End of exec
|
||||
|
Before Width: | Height: | Size: 7.3 KiB |
@@ -1,20 +0,0 @@
|
||||
Title: Docker Start
|
||||
participant Docker
|
||||
participant Kata Runtime
|
||||
participant virtcontainers
|
||||
participant hypervisor
|
||||
participant agent
|
||||
participant shim-pod
|
||||
participant shim-ctr
|
||||
participant proxy
|
||||
|
||||
#Docker Start
|
||||
Docker->Kata Runtime: start
|
||||
Kata Runtime->virtcontainers: StartSandbox()
|
||||
Note left of virtcontainers: Sandbox\nRunning
|
||||
virtcontainers->+agent: StartContainer()
|
||||
agent-->-virtcontainers: Container Started
|
||||
Note left of virtcontainers: Container-pod\nRunning
|
||||
virtcontainers->virtcontainers: Execute PostStart Hooks
|
||||
virtcontainers-->Kata Runtime: End of StartSandbox()
|
||||
Kata Runtime-->Docker: End of start
|
||||
|
Before Width: | Height: | Size: 1.2 MiB |
|
Before Width: | Height: | Size: 1.0 MiB |
|
Before Width: | Height: | Size: 163 KiB |
|
Before Width: | Height: | Size: 390 KiB |
|
Before Width: | Height: | Size: 942 KiB |
|
Before Width: | Height: | Size: 182 KiB |
|
Before Width: | Height: | Size: 190 KiB |
|
Before Width: | Height: | Size: 102 KiB |
@@ -1,477 +0,0 @@
|
||||
# Kata Containers Architecture
|
||||
|
||||
## Overview
|
||||
|
||||
Kata Containers is an open source community working to build a secure
|
||||
container [runtime](#runtime) with lightweight virtual machines (VMs)
|
||||
that feel and perform like standard Linux containers, but provide
|
||||
stronger [workload](#workload) isolation using hardware
|
||||
[virtualization](#virtualization) technology as a second layer of
|
||||
defence.
|
||||
|
||||
Kata Containers runs on [multiple architectures](../../../src/runtime/README.md#platform-support)
|
||||
and supports [multiple hypervisors](../../hypervisors.md).
|
||||
|
||||
This document is a summary of the Kata Containers architecture.
|
||||
|
||||
## Background knowledge
|
||||
|
||||
This document assumes the reader understands a number of concepts
|
||||
related to containers and file systems. The
|
||||
[background](background.md) document explains these concepts.
|
||||
|
||||
## Example command
|
||||
|
||||
This document makes use of a particular [example
|
||||
command](example-command.md) throughout the text to illustrate certain
|
||||
concepts.
|
||||
|
||||
## Virtualization
|
||||
|
||||
For details on how Kata Containers maps container concepts to VM
|
||||
technologies, and how this is realized in the multiple hypervisors and
|
||||
VMMs that Kata supports see the
|
||||
[virtualization documentation](../virtualization.md).
|
||||
|
||||
## Compatibility
|
||||
|
||||
The [Kata Containers runtime](../../../src/runtime) is compatible with
|
||||
the [OCI](https://github.com/opencontainers)
|
||||
[runtime specification](https://github.com/opencontainers/runtime-spec)
|
||||
and therefore works seamlessly with the
|
||||
[Kubernetes Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/container-runtime-interface.md)
|
||||
through the [CRI-O](https://github.com/kubernetes-incubator/cri-o)
|
||||
and [containerd](https://github.com/containerd/containerd)
|
||||
implementations.
|
||||
|
||||
Kata Containers provides a ["shimv2"](#shim-v2-architecture) compatible runtime.
|
||||
|
||||
## Shim v2 architecture
|
||||
|
||||
The Kata Containers runtime is shim v2 ("shimv2") compatible. This
|
||||
section explains what this means.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> For a comparison with the Kata 1.x architecture, see
|
||||
> [the architectural history document](history.md).
|
||||
|
||||
The
|
||||
[containerd runtime shimv2 architecture](https://github.com/containerd/containerd/tree/main/runtime/v2)
|
||||
or _shim API_ architecture resolves the issues with the old
|
||||
architecture by defining a set of shimv2 APIs that a compatible
|
||||
runtime implementation must supply. Rather than calling the runtime
|
||||
binary multiple times for each new container, the shimv2 architecture
|
||||
runs a single instance of the runtime binary (for any number of
|
||||
containers). This improves performance and resolves the state handling
|
||||
issue.
|
||||
|
||||
The shimv2 API is similar to the
|
||||
[OCI runtime](https://github.com/opencontainers/runtime-spec)
|
||||
API in terms of the way the container lifecycle is split into
|
||||
different verbs. Rather than calling the runtime multiple times, the
|
||||
container manager creates a socket and passes it to the shimv2
|
||||
runtime. The socket is a bi-directional communication channel that
|
||||
uses a gRPC based protocol to allow the container manager to send API
|
||||
calls to the runtime, which returns the result to the container
|
||||
manager using the same channel.
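
The request/response pattern described above can be sketched as follows. This is a minimal illustration only: it uses newline-delimited JSON over a `socketpair` instead of the real RPC protocol, and the verb names, message fields, and the `serve`, `call` and `demo` helpers are assumptions made for this example rather than part of the shimv2 API.

```python
import json
import socket
import threading


def serve(conn):
    """One long-lived 'runtime' loop that handles every request on one socket."""
    containers = {}  # container id -> state, kept in memory between calls
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        for line in reader:
            request = json.loads(line)
            verb, cid = request["verb"], request["id"]
            if verb == "create":
                containers[cid] = "created"
            elif verb == "start":
                containers[cid] = "running"
            elif verb == "delete":
                containers.pop(cid, None)
            reply = {"id": cid, "state": containers.get(cid, "deleted")}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def call(reader, writer, verb, cid):
    """Send one request on the shared channel and wait for its reply."""
    writer.write(json.dumps({"verb": verb, "id": cid}) + "\n")
    writer.flush()
    return json.loads(reader.readline())


def demo():
    # The "container manager" creates the socket and hands one end to the runtime.
    manager_end, runtime_end = socket.socketpair()
    worker = threading.Thread(target=serve, args=(runtime_end,), daemon=True)
    worker.start()
    requests = [("create", "c1"), ("create", "c2"), ("start", "c1"), ("delete", "c2")]
    with manager_end, manager_end.makefile("r") as r, manager_end.makefile("w") as w:
        states = [call(r, w, verb, cid)["state"] for verb, cid in requests]
    worker.join(timeout=5)
    return states
```

Because the serving loop stays alive between calls, the state of every container lives in one process, which is the property the text credits with resolving the state handling issue.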
|
||||
|
||||
The shimv2 architecture allows running several containers per VM to
|
||||
support container engines that require multiple containers running
|
||||
inside a pod.
|
||||
|
||||
With the new architecture [Kubernetes](kubernetes.md) can
|
||||
launch both Pod and OCI compatible containers with a single
|
||||
[runtime](#runtime) shim per Pod, rather than `2N+1` shims. No
|
||||
standalone `kata-proxy` process is required, even if VSOCK is not
|
||||
available.
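
To make the `2N+1` figure concrete, the short sketch below compares the two shim counts. It reads `N` as the number of containers in the pod, which is an interpretation the text does not state explicitly, and the function names are hypothetical.

```python
def legacy_shim_count(containers_in_pod):
    """Shim processes under the old architecture, per the 2N+1 figure."""
    return 2 * containers_in_pod + 1


def shimv2_shim_count(containers_in_pod):
    """Under shimv2 a single runtime shim serves the whole pod."""
    return 1


def shims_saved(containers_in_pod):
    return legacy_shim_count(containers_in_pod) - shimv2_shim_count(containers_in_pod)
```

Under that reading, a pod with three containers goes from seven shim processes to one.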
|
||||
|
||||
## Workload
|
||||
|
||||
The workload is the command the user requested to run in the
|
||||
container and is specified in the [OCI bundle](background.md#oci-bundle)'s
|
||||
configuration file.
|
||||
|
||||
In our [example](example-command.md), the workload is the `sh(1)` command.
|
||||
|
||||
### Workload root filesystem
|
||||
|
||||
For details of how the [runtime](#runtime) makes the
|
||||
[container image](background.md#container-image) chosen by the user available to
|
||||
the workload process, see the
|
||||
[Container creation](#container-creation) and [storage](#storage) sections.
|
||||
|
||||
Note that the workload is isolated from the [guest VM](#environments) environment by its
|
||||
surrounding [container environment](#environments). The guest VM
|
||||
environment in which the container runs is also isolated from the _outer_
|
||||
[host environment](#environments) where the container manager runs.
|
||||
|
||||
## System overview
|
||||
|
||||
### Environments
|
||||
|
||||
The following terminology is used to describe the different
|
||||
environments (or contexts) various processes run in. It is necessary
|
||||
to study this table closely to make sense of what follows:
|
||||
|
||||
| Type | Name | Virtualized | Containerized | rootfs | Rootfs device type | Mount type | Description |
|
||||
|-|-|-|-|-|-|-|-|
|
||||
| Host | Host | no `[1]` | no | Host specific | Host specific | Host specific | The environment provided by a standard, physical, non-virtualized system. |
|
||||
| VM root | Guest VM | yes | no | rootfs inside the [guest image](guest-assets.md#guest-image) | Hypervisor specific `[2]` | `ext4` | The first (or top) level VM environment created on a host system. |
|
||||
| VM container root | Container | yes | yes | rootfs type requested by user ([`ubuntu` in the example](example-command.md)) | `kataShared` | [virtio FS](storage.md#virtio-fs) | The first (or top) level container environment created inside the VM. Based on the [OCI bundle](background.md#oci-bundle). |
|
||||
|
||||
**Key:**
|
||||
|
||||
- `[1]`: For simplicity, this document assumes the host environment
|
||||
runs on physical hardware.
|
||||
|
||||
- `[2]`: See the [DAX](#dax) section.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The word "root" is used to mean _top level_ here in a similar
|
||||
> manner to the term [rootfs](background.md#root-filesystem).
|
||||
>
|
||||
> - The "first level" prefix used above is important since it implies
|
||||
> that it is possible to create multi level systems. However, they do
|
||||
> not form part of a standard Kata Containers environment so will not
|
||||
> be considered in this document.
|
||||
|
||||
The reasons for containerizing the [workload](#workload) inside the VM
|
||||
are:
|
||||
|
||||
- Isolates the workload entirely from the VM environment.
|
||||
- Provides better isolation between containers in a [pod](kubernetes.md).
|
||||
- Allows the workload to be managed and monitored through its cgroup
|
||||
confinement.
|
||||
|
||||
### Container creation
|
||||
|
||||
The steps below show at a high level how a Kata Containers container is
|
||||
created using the containerd container manager:
|
||||
|
||||
1. The user requests the creation of a container by running a command
|
||||
like the [example command](example-command.md).
|
||||
1. The container manager daemon runs a single instance of the Kata
|
||||
[runtime](#runtime).
|
||||
1. The Kata runtime loads its [configuration file](#configuration).
|
||||
1. The container manager calls a set of shimv2 API functions on the runtime.
|
||||
1. The Kata runtime launches the configured [hypervisor](#hypervisor).
|
||||
1. The hypervisor creates and starts (_boots_) a VM using the
|
||||
[guest assets](guest-assets.md#guest-assets):
|
||||
|
||||
- The hypervisor [DAX](#dax) shares the
|
||||
[guest image](guest-assets.md#guest-image)
|
||||
into the VM to become the VM [rootfs](background.md#root-filesystem) (mounted on a `/dev/pmem*` device),
|
||||
which is known as the [VM root environment](#environments).
|
||||
- The hypervisor mounts the [OCI bundle](background.md#oci-bundle), using [virtio FS](storage.md#virtio-fs),
|
||||
into a container specific directory inside the VM's rootfs.
|
||||
|
||||
This container specific directory will become the
|
||||
[container rootfs](#environments), known as the
|
||||
[container environment](#environments).
|
||||
|
||||
1. The [agent](#agent) is started as part of the VM boot.
|
||||
|
||||
1. The runtime calls the agent's `CreateSandbox` API to request the
|
||||
agent create a container:
|
||||
|
||||
1. The agent creates a [container environment](#environments)
|
||||
in the container specific directory that contains the [container rootfs](#environments).
|
||||
|
||||
The container environment hosts the [workload](#workload) in the
|
||||
[container rootfs](#environments) directory.
|
||||
|
||||
1. The agent spawns the workload inside the container environment.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The container environment created by the agent is equivalent to
|
||||
> a container environment created by the
|
||||
> [`runc`](https://github.com/opencontainers/runc) OCI runtime;
|
||||
> Linux cgroups and namespaces are created inside the VM by the
|
||||
> [guest kernel](guest-assets.md#guest-kernel) to isolate the
|
||||
> workload from the VM environment the container is created in.
|
||||
> See the [Environments](#environments) section for an
|
||||
> explanation of why this is done.
|
||||
>
|
||||
> - See the [guest image](guest-assets.md#guest-image) section for
|
||||
> details of exactly how the agent is started.
|
||||
|
||||
1. The container manager returns control of the container to the
|
||||
user running the `ctr` command.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> At this point, the container is running and:
|
||||
>
|
||||
> - The [workload](#workload) process ([`sh(1)` in the example](example-command.md))
|
||||
> is running in the [container environment](#environments).
|
||||
> - The user is now able to interact with the workload
|
||||
> (using the [`ctr` command in the example](example-command.md)).
|
||||
> - The [agent](#agent), running inside the VM, is monitoring the
|
||||
> [workload](#workload) process.
|
||||
> - The [runtime](#runtime) is waiting for the agent's `WaitProcess` API
|
||||
> call to complete.
|
||||
|
||||
Further details of these steps are provided in the sections below.
|
||||
|
||||
### Container shutdown
|
||||
|
||||
There are two possible ways for the container environment to be
|
||||
terminated:
|
||||
|
||||
- When the [workload](#workload) exits.
|
||||
|
||||
This is the standard, or _graceful_ shutdown method.
|
||||
|
||||
- When the container manager forces the container to be deleted.
|
||||
|
||||
#### Workload exit
|
||||
|
||||
The [agent](#agent) will detect when the [workload](#workload) process
|
||||
exits, capture its exit status (see `wait(2)`) and return that value
|
||||
to the [runtime](#runtime) by specifying it as the response to the
|
||||
`WaitProcess` agent API call made by the [runtime](#runtime).
|
||||
|
||||
The runtime then passes the value back to the container manager by the
|
||||
`Wait` [shimv2 API](#shim-v2-architecture) call.
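
The idea of blocking until the workload exits and then reporting its exit status can be sketched on an ordinary host as shown below. This is not the agent's implementation (the agent is written in Rust and replies over ttRPC); `wait_for_workload`, `report_exit` and `EXAMPLE_WORKLOAD` are hypothetical names, and a short Python child process stands in for the workload.

```python
import subprocess
import sys

# Stand-in workload: a child process that exits with status 3.
EXAMPLE_WORKLOAD = [sys.executable, "-c", "raise SystemExit(3)"]


def wait_for_workload(argv):
    """Start argv as a child process, block until it exits, return its exit status."""
    child = subprocess.Popen(argv)
    return child.wait()  # waits for the child, as wait(2) would


def report_exit(argv):
    """Package the exit status the way a reply to a blocking 'wait' call might carry it."""
    return {"exit_status": wait_for_workload(argv)}
```

The caller of `report_exit` stays blocked for the lifetime of the child, mirroring how the runtime waits on the agent's `WaitProcess` call.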
|
||||
|
||||
Once the workload has fully exited, the VM is no longer needed and the
|
||||
runtime cleans up the environment (which includes terminating the
|
||||
[hypervisor](#hypervisor) process).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> When [agent tracing is enabled](../../tracing.md#agent-shutdown-behaviour),
|
||||
> the shutdown behaviour is different.
|
||||
|
||||
#### Container manager requested shutdown
|
||||
|
||||
If the container manager requests the container be deleted, the
|
||||
[runtime](#runtime) will signal the agent by sending it a
|
||||
`DestroySandbox` [ttRPC API](../../../src/libs/protocols/protos/agent.proto) request.
|
||||
|
||||
## Guest assets
|
||||
|
||||
The guest assets comprise a guest image and a guest kernel that are
|
||||
used by the [hypervisor](#hypervisor).
|
||||
|
||||
See the [guest assets](guest-assets.md) document for further
|
||||
information.
|
||||
|
||||
## Hypervisor
|
||||
|
||||
The [hypervisor](../../hypervisors.md) specified in the
|
||||
[configuration file](#configuration) creates a VM to host the
|
||||
[agent](#agent) and the [workload](#workload) inside the
|
||||
[container environment](#environments).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The hypervisor process runs inside an environment slightly different
|
||||
> to the host environment:
|
||||
>
|
||||
> - It is run in a different cgroup environment to the host.
|
||||
> - It is given a separate network namespace from the host.
|
||||
> - If the [OCI configuration specifies a SELinux label](https://github.com/opencontainers/runtime-spec/blob/main/config.md#linux-process),
|
||||
> the hypervisor process will run with that label (*not* the workload running inside the hypervisor's VM).
|
||||
|
||||
## Agent
|
||||
|
||||
The Kata Containers agent ([`kata-agent`](../../../src/agent)), written
|
||||
in the [Rust programming language](https://www.rust-lang.org), is a
|
||||
long running process that runs inside the VM. It acts as the
|
||||
supervisor for managing the containers and the [workload](#workload)
|
||||
running within those containers. Only a single agent process is run
|
||||
for each VM created.
|
||||
|
||||
### Agent communications protocol
|
||||
|
||||
The agent communicates with the other Kata components (primarily the
|
||||
[runtime](#runtime)) using a
|
||||
[`ttRPC`](https://github.com/containerd/ttrpc-rust) based
|
||||
[protocol](../../../src/libs/protocols/protos).
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> If you wish to learn more about this protocol, a practical way to do
|
||||
> so is to experiment with the
|
||||
> [agent control tool](#agent-control-tool) on a test system.
|
||||
> This tool is for test and development purposes only and can send
|
||||
> arbitrary ttRPC agent API commands to the [agent](#agent).
|
||||
|
||||
## Runtime
|
||||
|
||||
The Kata Containers runtime (the [`containerd-shim-kata-v2`](../../../src/runtime/cmd/containerd-shim-kata-v2)
|
||||
binary) is a [shimv2](#shim-v2-architecture) compatible runtime.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The Kata Containers runtime is sometimes referred to as the Kata
|
||||
> _shim_. Both terms are correct since the `containerd-shim-kata-v2`
|
||||
> is a container runtime, and that runtime implements the containerd
|
||||
> shim v2 API.
|
||||
|
||||
The runtime makes heavy use of the [`virtcontainers`
|
||||
package](../../../src/runtime/virtcontainers), which provides a generic,
|
||||
runtime-specification agnostic, hardware-virtualized containers
|
||||
library.
|
||||
|
||||
The runtime is responsible for starting the [hypervisor](#hypervisor)
|
||||
and its VM, and communicating with the [agent](#agent) using a
|
||||
[ttRPC based protocol](#agent-communications-protocol) over a VSOCK
|
||||
socket that provides a communications link between the VM and the
|
||||
host.
|
||||
|
||||
This protocol allows the runtime to send container management commands
|
||||
to the agent. The protocol is also used to carry the standard I/O
|
||||
streams (`stdout`, `stderr`, `stdin`) between the containers and
|
||||
container managers (such as CRI-O or containerd).
|
||||
|
||||
## Utility program
|
||||
|
||||
The `kata-runtime` binary is a utility program that provides
|
||||
administrative commands to manipulate and query a Kata Containers
|
||||
installation.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> In Kata 1.x, this program also acted as the main
|
||||
> [runtime](#runtime), but this is no longer required due to the
|
||||
> improved shimv2 architecture.
|
||||
|
||||
### exec command
|
||||
|
||||
The `exec` command allows an administrator or developer to enter the
|
||||
[VM root environment](#environments) which is not accessible by the container
|
||||
[workload](#workload).
|
||||
|
||||
See [the developer guide](../../Developer-Guide.md#connect-to-debug-console) for further details.
|
||||
|
||||
### Configuration
|
||||
|
||||
See the [configuration file details](../../../src/runtime/README.md#configuration).
|
||||
|
||||
The configuration file is also used to enable runtime [debug output](../../Developer-Guide.md#enable-full-debug).
|
||||
|
||||
## Process overview
|
||||
|
||||
The table below shows an example of the main processes running in the
|
||||
different [environments](#environments) when a Kata Container is
|
||||
created with containerd using our [example command](example-command.md):
|
||||
|
||||
| Description | Host | VM root environment | VM container environment |
|
||||
|-|-|-|-|
|
||||
| Container manager | `containerd` | |
|
||||
| Kata Containers | [runtime](#runtime), [`virtiofsd`](storage.md#virtio-fs), [hypervisor](#hypervisor) | [agent](#agent) |
|
||||
| User [workload](#workload) | | | [`ubuntu sh`](example-command.md) |
|
||||
|
||||
## Networking
|
||||
|
||||
See the [networking document](networking.md).
|
||||
|
||||
## Storage
|
||||
|
||||
See the [storage document](storage.md).
|
||||
|
||||
## Kubernetes support
|
||||
|
||||
See the [Kubernetes document](kubernetes.md).
|
||||
|
||||
#### OCI annotations
|
||||
|
||||
In order for the Kata Containers [runtime](#runtime) (or any VM based OCI compatible
|
||||
runtime) to be able to understand if it needs to create a full VM or if it
|
||||
has to create a new container inside an existing pod's VM, CRI-O adds
|
||||
specific annotations to the OCI configuration file (`config.json`) which is passed to
|
||||
the OCI compatible runtime.
|
||||
|
||||
Before calling its runtime, CRI-O will always add a `io.kubernetes.cri-o.ContainerType`
|
||||
annotation to the `config.json` configuration file it produces from the Kubelet CRI
|
||||
request. The `io.kubernetes.cri-o.ContainerType` annotation can either be set to `sandbox`
|
||||
or `container`. Kata Containers will then use this annotation to decide if it needs to
|
||||
respectively create a virtual machine or a container inside a virtual machine associated
|
||||
with a Kubernetes pod:
|
||||
|
||||
| Annotation value | Kata VM created? | Kata container created? |
|
||||
|-|-|-|
|
||||
| `sandbox` | yes | yes (inside new VM) |
|
||||
| `container`| no | yes (in existing VM) |
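
The decision in the table above can be sketched as a small function over a parsed `config.json`. The annotation key and its two values come from the text; the `plan_for` name, the returned dictionary shape, and the choice to raise an error for a missing or unknown value are assumptions made for this illustration.

```python
CONTAINER_TYPE_KEY = "io.kubernetes.cri-o.ContainerType"


def plan_for(config):
    """Decide what to create from the CRI-O annotation in a parsed config.json."""
    kind = config.get("annotations", {}).get(CONTAINER_TYPE_KEY)
    if kind == "sandbox":
        # New pod: create the VM, then a container inside it.
        return {"create_vm": True, "create_container": True}
    if kind == "container":
        # Existing pod: reuse its VM and only add a container.
        return {"create_vm": False, "create_container": True}
    # Assumption: this sketch rejects anything else; the text does not
    # say how a missing or unknown value is handled.
    raise ValueError(f"unsupported {CONTAINER_TYPE_KEY} value: {kind!r}")
```

For example, `plan_for({"annotations": {CONTAINER_TYPE_KEY: "container"}})` reports that no new VM is needed.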
|
||||
|
||||
#### Mixing VM based and namespace based runtimes
|
||||
|
||||
> **Note:** Since Kubernetes 1.12, the [`Kubernetes RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/)
|
||||
> has been supported and the user can specify the runtime without the non-standardized annotations.
|
||||
|
||||
With `RuntimeClass`, users can define Kata Containers as a
|
||||
`RuntimeClass` and then explicitly specify that a pod must be created
|
||||
as a Kata Containers pod. For details, please refer to [How to use
|
||||
Kata Containers and containerd](../../../docs/how-to/containerd-kata.md).
|
||||
|
||||
## Tracing
|
||||
|
||||
The [tracing document](../../tracing.md) provides details on the tracing
|
||||
architecture.
|
||||
|
||||
# Appendices
|
||||
|
||||
## DAX
|
||||
|
||||
Kata Containers utilizes the Linux kernel DAX
|
||||
[(Direct Access filesystem)](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/dax.rst?h=v5.14)
|
||||
feature to efficiently map the [guest image](guest-assets.md#guest-image) in the
|
||||
[host environment](#environments) into the
|
||||
[guest VM environment](#environments) to become the VM's
|
||||
[rootfs](background.md#root-filesystem).
|
||||
|
||||
If the [configured](#configuration) [hypervisor](#hypervisor) is set
|
||||
to either QEMU or Cloud Hypervisor, DAX is used with the feature shown
|
||||
in the table below:
|
||||
|
||||
| Hypervisor | Feature used | rootfs device type |
|
||||
|-|-|-|
|
||||
| Cloud Hypervisor (CH) | `dax` `FsConfig` configuration option | PMEM (emulated Persistent Memory device) |
|
||||
| QEMU | NVDIMM memory device with a memory file backend | NVDIMM (emulated Non-Volatile Dual In-line Memory Module device) |
|
||||
|
||||
The features in the table above are equivalent in that they provide a memory-mapped
|
||||
virtual device which is used to DAX map the VM's
|
||||
[rootfs](background.md#root-filesystem) into the [VM guest](#environments) memory
|
||||
address space.
|
||||
|
||||
The VM is then booted, specifying the `root=` kernel parameter to make
|
||||
the [guest kernel](guest-assets.md#guest-kernel) use the appropriate emulated device
|
||||
as its rootfs.
|
||||
|
||||
### DAX advantages
|
||||
|
||||
Mapping files using [DAX](#dax) provides a number of benefits over
|
||||
more traditional VM file and device mapping mechanisms:
|
||||
|
||||
- Mapping as a direct access device allows the guest to directly
|
||||
access the host memory pages (such as via Execute In Place (XIP)),
|
||||
bypassing the [guest kernel](guest-assets.md#guest-kernel)'s page cache. This
|
||||
zero copy provides both time and space optimizations.
|
||||
|
||||
- Mapping as a direct access device inside the VM allows pages from the
|
||||
host to be demand loaded using page faults, rather than having to make requests
|
||||
via a virtualized device (causing expensive VM exits/hypercalls), thus providing
|
||||
a speed optimization.
|
||||
|
||||
- Utilizing `mmap(2)`'s `MAP_SHARED` shared memory option on the host
|
||||
allows the host to efficiently share pages.
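
The `MAP_SHARED` behaviour mentioned in the last point can be observed on any Linux host with the sketch below. It only shows that two shared mappings of one file see the same pages; it does not involve a hypervisor or a DAX device, and the `shared_mapping_demo` name is hypothetical.

```python
import mmap
import tempfile


def shared_mapping_demo():
    """Map one file twice with MAP_SHARED and show a write is visible in both."""
    with tempfile.TemporaryFile() as backing:
        backing.truncate(mmap.PAGESIZE)  # the file must be at least one page long
        first = mmap.mmap(backing.fileno(), mmap.PAGESIZE, flags=mmap.MAP_SHARED)
        second = mmap.mmap(backing.fileno(), mmap.PAGESIZE, flags=mmap.MAP_SHARED)
        try:
            first[:5] = b"hello"      # write through the first mapping
            return bytes(second[:5])  # read it back through the second mapping
        finally:
            first.close()
            second.close()
```

No data is copied between the two mappings, since both refer to the same underlying pages.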
|
||||
|
||||

|
||||
|
||||
For further details of the use of NVDIMM with QEMU, see the [QEMU
|
||||
project documentation](https://www.qemu.org).
|
||||
|
||||
## Agent control tool
|
||||
|
||||
The [agent control tool](../../../src/tools/agent-ctl) is a test and
|
||||
development tool that can be used to learn more about a Kata
|
||||
Containers system.
|
||||
|
||||
## Terminology
|
||||
|
||||
See the [project glossary](../../../Glossary.md).
|
||||
@@ -1,81 +0,0 @@
|
||||
# Kata Containers architecture background knowledge
|
||||
|
||||
The following sections explain some of the background concepts
|
||||
required to understand the [architecture document](README.md).
|
||||
|
||||
## Root filesystem
|
||||
|
||||
This document uses the term _rootfs_ to refer to a root filesystem
|
||||
which is mounted as the top-level directory ("`/`") and often referred
|
||||
to as _slash_.
|
||||
|
||||
It is important to understand this term since the overall system uses
|
||||
multiple different rootfs's (as explained in the
|
||||
[Environments](README.md#environments) section).
|
||||
|
||||
## Container image
|
||||
|
||||
In the [example command](example-command.md) the user has specified the
|
||||
type of container they wish to run via the container image name:
|
||||
`ubuntu`. This image name corresponds to a _container image_ that can
|
||||
be used to create a container with an Ubuntu Linux environment. Hence,
|
||||
in our [example](example-command.md), the `sh(1)` command will be run
|
||||
inside a container which has an Ubuntu rootfs.
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> The term _container image_ is confusing since the image in question
|
||||
> is **not** a container: it is simply a set of files (_an image_)
|
||||
> that can be used to _create_ a container. The term _container
|
||||
> template_ would be more accurate but the term _container image_ is
|
||||
> commonly used so this document uses the standard term.
|
||||
|
||||
For the purposes of this document, the most important part of the
|
||||
[example command line](example-command.md) is the container image the
|
||||
user has requested. Normally, the container manager will _pull_
|
||||
(download) a container image from a remote site and store a copy
|
||||
locally. This local container image is used by the container manager
|
||||
to create an [OCI bundle](#oci-bundle) which will form the environment
|
||||
the container will run in. After creating the OCI bundle, the
|
||||
container manager launches a [runtime](README.md#runtime) which will create the
|
||||
container using the provided OCI bundle.
|
||||
|
||||
## OCI bundle
|
||||
|
||||
To understand what follows, it is important to know at a high level
|
||||
how an OCI ([Open Containers Initiative](https://opencontainers.org)) compatible container is created.
|
||||
|
||||
An OCI compatible container is created by taking a
|
||||
[container image](#container-image) and converting the embedded rootfs
|
||||
into an
|
||||
[OCI rootfs bundle](https://github.com/opencontainers/runtime-spec/blob/main/bundle.md),
|
||||
or more simply, an _OCI bundle_.
|
||||
|
||||
An OCI bundle is a `tar(1)` archive normally created by a container
|
||||
manager which is passed to an OCI [runtime](README.md#runtime) which converts
|
||||
it into a full container rootfs. The bundle contains two assets:
|
||||
|
||||
- A container image [rootfs](#root-filesystem)
|
||||
|
||||
This is simply a directory of files that will be used to represent
|
||||
the rootfs for the container.
|
||||
|
||||
For the [example command](example-command.md), the directory will
|
||||
contain the files necessary to create a minimal Ubuntu root
|
||||
filesystem.
|
||||
|
||||
- An [OCI configuration file](https://github.com/opencontainers/runtime-spec/blob/main/config.md)
|
||||
|
||||
This is a JSON file called `config.json`.
|
||||
|
||||
The container manager will create this file so that:
|
||||
|
||||
- The `root.path` value is set to the full path of the specified
|
||||
container rootfs.
|
||||
|
||||
In [the example](example-command.md) this value will be `ubuntu`.
|
||||
|
||||
- The `process.args` array specifies the list of commands the user
|
||||
wishes to run. This is known as the [workload](README.md#workload).
|
||||
|
||||
In [the example](example-command.md) the workload is `sh(1)`.
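
As a rough illustration of the two fields described above, the sketch below parses a heavily trimmed `config.json` and extracts the rootfs path and the workload. The `ociVersion` value and the `rootfs_and_workload` helper are assumptions made for this example; a real configuration file contains many more fields.

```python
import json

# Trimmed, illustrative config.json; the values follow the example in the text.
EXAMPLE_CONFIG = """
{
  "ociVersion": "1.0.2",
  "root": {"path": "ubuntu"},
  "process": {"args": ["sh"]}
}
"""


def rootfs_and_workload(config_text):
    """Return (root.path, process.args) from the text of an OCI configuration file."""
    config = json.loads(config_text)
    return config["root"]["path"], config["process"]["args"]
```

Here `rootfs_and_workload(EXAMPLE_CONFIG)` yields the rootfs path `ubuntu` and the workload `["sh"]`.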
|
||||
@@ -1,30 +0,0 @@
|
||||
# Example command
|
||||
|
||||
The following containerd command creates a container. It is referred
|
||||
to throughout the architecture document to help explain various points:
|
||||
|
||||
```bash
|
||||
$ sudo ctr run --runtime "io.containerd.kata.v2" --rm -t "quay.io/libpod/ubuntu:latest" foo sh
|
||||
```
|
||||
|
||||
This command requests that containerd:
|
||||
|
||||
- Create a container (`ctr run`).
|
||||
- Use the Kata [shimv2](README.md#shim-v2-architecture) runtime (`--runtime "io.containerd.kata.v2"`).
|
||||
- Delete the container when it [exits](README.md#workload-exit) (`--rm`).
|
||||
- Attach the container to the user's terminal (`-t`).
|
||||
- Use the Ubuntu Linux [container image](background.md#container-image)
|
||||
to create the container [rootfs](background.md#root-filesystem) that will become
|
||||
the [container environment](README.md#environments)
|
||||
(`quay.io/libpod/ubuntu:latest`).
|
||||
- Create the container with the name "`foo`".
|
||||
- Run the `sh(1)` command in the Ubuntu rootfs based container
|
||||
environment.
|
||||
|
||||
The command specified here is referred to as the [workload](README.md#workload).
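
The breakdown above can be mirrored by splitting the command line into its parts. The parser below is a hand-written sketch that handles only the options used in this one command; it is not how `ctr` parses its arguments, and the `describe` helper and its result keys are hypothetical.

```python
import shlex

COMMAND = (
    'sudo ctr run --runtime "io.containerd.kata.v2" --rm -t '
    '"quay.io/libpod/ubuntu:latest" foo sh'
)


def describe(command):
    """Split the example command into runtime, flags, image, name and workload."""
    words = shlex.split(command)
    rest = words[words.index("run") + 1:]
    runtime, switches, positional = None, set(), []
    i = 0
    while i < len(rest):
        word = rest[i]
        if word == "--runtime":      # the only option here that takes a value
            runtime = rest[i + 1]
            i += 2
        elif word.startswith("-"):   # boolean switches such as --rm and -t
            switches.add(word)
            i += 1
        else:                        # first positional word: image, name, workload
            positional = rest[i:]
            break
    image, name, *workload = positional
    return {
        "runtime": runtime,
        "remove_on_exit": "--rm" in switches,
        "tty": "-t" in switches,
        "image": image,
        "name": name,
        "workload": workload,
    }
```

The `workload` entry of the result is the `sh` command that ends up running inside the container environment.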
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> For the purposes of this document and to keep explanations
|
||||
> simpler, we assume the user is running this command in the
|
||||
> [host environment](README.md#environments).
|
||||
@@ -1,152 +0,0 @@
|
||||
# Guest assets
|
||||
|
||||
Kata Containers creates a VM in which to run one or more containers.
|
||||
It does this by launching a [hypervisor](README.md#hypervisor) to
|
||||
create the VM. The hypervisor needs two assets for this task: a Linux
|
||||
kernel and a small root filesystem image to boot the VM.
|
||||
|
||||
## Guest kernel
|
||||
|
||||
The [guest kernel](../../../tools/packaging/kernel)
|
||||
is passed to the hypervisor and used to boot the VM.
|
||||
The default kernel provided in Kata Containers is highly optimized for
|
||||
kernel boot time and minimal memory footprint, providing only those
|
||||
services required by a container workload. It is based on the latest
|
||||
Linux LTS (Long Term Support) [kernel](https://www.kernel.org).
|
||||
|
||||
## Guest image
|
||||
|
||||
The hypervisor uses an image file which provides a minimal root
|
||||
filesystem used by the guest kernel to boot the VM and host the Kata
|
||||
Container. Kata Containers supports both initrd and rootfs based
|
||||
minimal guest images. The [default packages](../../install/) provide both
|
||||
an image and an initrd, both of which are created using the
|
||||
[`osbuilder`](../../../tools/osbuilder) tool.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - Although initrd and rootfs based images are supported, not all
|
||||
> [hypervisors](README.md#hypervisor) support both types of image.
|
||||
>
|
||||
> - The guest image is *unrelated* to the image used in a container
|
||||
> workload.
|
||||
>
|
||||
> For example, if a user creates a container that runs a shell in a
|
||||
> BusyBox image, they will run that shell in a BusyBox environment.
|
||||
> However, the guest image running inside the VM that is used to
|
||||
> *host* that BusyBox image could be running Clear Linux, Ubuntu,
|
||||
> Fedora or any other distribution potentially.
|
||||
>
|
||||
> The `osbuilder` tool provides
|
||||
> [configurations for various common Linux distributions](../../../tools/osbuilder/rootfs-builder)
|
||||
> which can be built into either initrd or rootfs guest images.
|
||||
>
|
||||
> - If you are using a [packaged version of Kata
|
||||
> Containers](../../install), you can see image details by running the
|
||||
> [`kata-collect-data.sh`](../../../src/runtime/data/kata-collect-data.sh.in)
|
||||
> script as `root` and looking at the "Image details" section of the
|
||||
> output.
|
||||
|
||||
#### Root filesystem image
|
||||
|
||||
The default packaged rootfs image, sometimes referred to as the _mini
|
||||
O/S_, is a highly optimized container bootstrap system.
|
||||
|
||||
If this image type is [configured](README.md#configuration), when the
|
||||
user runs the [example command](example-command.md):
|
||||
|
||||
- The [runtime](README.md#runtime) will launch the configured [hypervisor](README.md#hypervisor).
|
||||
- The hypervisor will boot the mini-OS image using the [guest kernel](#guest-kernel).
|
||||
- The kernel will start the init daemon as PID 1 (`systemd`) inside the VM root environment.
|
||||
- `systemd`, running inside the mini-OS context, will launch the [agent](README.md#agent)
|
||||
in the root context of the VM.
|
||||
- The agent will create a new container environment, setting its root
|
||||
filesystem to that requested by the user (Ubuntu in [the example](example-command.md)).
|
||||
- The agent will then execute the command (`sh(1)` in [the example](example-command.md))
|
||||
inside the new container.
|
||||
|
||||
The table below summarises the default mini O/S showing the
|
||||
environments that are created, the services running in those
|
||||
environments (for all platforms) and the root filesystem used by
|
||||
each service:
|
||||
|
||||
| Process | Environment | systemd service? | rootfs | User accessible | Notes |
|
||||
|-|-|-|-|-|-|
|
||||
| systemd | VM root | n/a | [VM guest image](#guest-image)| [debug console][debug-console] | The init daemon, running as PID 1 |
|
||||
| [Agent](README.md#agent) | VM root | yes | [VM guest image](#guest-image)| [debug console][debug-console] | Runs as a systemd service |
|
||||
| `chronyd` | VM root | yes | [VM guest image](#guest-image)| [debug console][debug-console] | Used to synchronise the time with the host |
|
||||
| container workload (`sh(1)` in [the example](example-command.md)) | VM container | no | User specified (Ubuntu in [the example](example-command.md)) | [exec command](README.md#exec-command) | Managed by the agent |
|
||||
|
||||
See also the [process overview](README.md#process-overview).
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The "User accessible" column shows how an administrator can access
|
||||
> the environment.
|
||||
>
|
||||
> - The container workload is running inside a full container
|
||||
> environment which itself is running within a VM environment.
|
||||
>
|
||||
> - See the [configuration files for the `osbuilder` tool](../../../tools/osbuilder/rootfs-builder)
|
||||
> for details of the default distribution for platforms other than
|
||||
> Intel x86_64.
|
||||
|
||||
#### Initrd image
|
||||
|
||||
The initrd image is a compressed `cpio(1)` archive, created from a
|
||||
rootfs which is loaded into memory and used as part of the Linux
|
||||
startup process. During startup, the kernel unpacks it into a special
|
||||
instance of a `tmpfs` mount that becomes the initial root filesystem.
|
||||
|
||||
If this image type is [configured](README.md#configuration), when the user runs
|
||||
the [example command](example-command.md):
|
||||
|
||||
- The [runtime](README.md#runtime) will launch the configured [hypervisor](README.md#hypervisor).
|
||||
- The hypervisor will boot the mini-OS image using the [guest kernel](#guest-kernel).
|
||||
- The kernel will start the init daemon as PID 1 (the
|
||||
[agent](README.md#agent))
|
||||
inside the VM root environment.
|
||||
- The [agent](README.md#agent) will create a new container environment, setting its root
|
||||
filesystem to that requested by the user (`ubuntu` in
|
||||
[the example](example-command.md)).
|
||||
- The agent will then execute the command (`sh(1)` in [the example](example-command.md))
|
||||
inside the new container.
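The default boot sequences for the rootfs image and the initrd image differ mainly in which process runs as PID 1. The following is a simplified model of the two sequences as described in this document; the names `BOOT_CHAINS` and `init_daemon` are hypothetical and exist only for this illustration.

```python
# Simplified model of who starts what for each default guest image type.
BOOT_CHAINS = {
    "image": [
        "runtime", "hypervisor", "guest kernel",
        "systemd (PID 1)", "agent", "container workload",
    ],
    "initrd": [
        "runtime", "hypervisor", "guest kernel",
        "agent (PID 1)", "container workload",
    ],
}


def init_daemon(image_type):
    """Return the component that runs as PID 1 for the given image type."""
    for step in BOOT_CHAINS[image_type]:
        if "PID 1" in step:
            return step.split(" (")[0]
    raise ValueError(f"no init daemon recorded for {image_type!r}")


for kind, chain in BOOT_CHAINS.items():
    print(f"{kind}: " + " -> ".join(chain))
```

The model only covers the defaults; as noted for the initrd case, a standard init daemon can also be used with an initrd image.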
|
||||
|
||||
The table below summarises the default mini O/S showing the environments that are created,
|
||||
the processes running in those environments (for all platforms) and
|
||||
the root filesystem used by each service:
|
||||
|
||||
| Process | Environment | rootfs | User accessible | Notes |
|
||||
|-|-|-|-|-|
|
||||
| [Agent](README.md#agent) | VM root | [VM guest image](#guest-image) | [debug console][debug-console] | Runs as the init daemon (PID 1) |
|
||||
| container workload | VM container | User specified (Ubuntu in this example) | [exec command](README.md#exec-command) | Managed by the agent |
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - The "User accessible" column shows how an administrator can access
|
||||
> the environment.
|
||||
>
|
||||
> - It is possible to use a standard init daemon such as systemd with
|
||||
> an initrd image if this is desirable.
|
||||
|
||||
See also the [process overview](README.md#process-overview).
|
||||
|
||||
#### Image summary
|
||||
|
||||
| Image type | Default distro | Init daemon | Reason | Notes |
|
||||
|-|-|-|-|-|
|
||||
| [image](background.md#root-filesystem-image) | [Clear Linux](https://clearlinux.org) (for x86_64 systems)| systemd | Minimal and highly optimized | systemd offers flexibility |
|
||||
| [initrd](#initrd-image) | [Alpine Linux](https://alpinelinux.org) | Kata [agent](README.md#agent) (as no systemd support) | Security hardened and tiny C library | |
|
||||
|
||||
See also:
|
||||
|
||||
- The [osbuilder](../../../tools/osbuilder) tool
|
||||
|
||||
This is used to build all default image types.
|
||||
|
||||
- The [versions database](../../../versions.yaml)
|
||||
|
||||
The `default-image-name` and `default-initrd-name` options specify
|
||||
the default distributions for each image type.
|
||||
|
||||
[debug-console]: ../../Developer-Guide.md#connect-to-debug-console
|
||||
@@ -1,41 +0,0 @@
|
||||
# History
|
||||
|
||||
## Kata 1.x architecture
|
||||
|
||||
In the old [Kata 1.x architecture](https://github.com/kata-containers/documentation/blob/master/design/architecture.md),
|
||||
the Kata [runtime](README.md#runtime) was an executable called `kata-runtime`.
|
||||
The container manager called this executable multiple times when
|
||||
creating each container. Each time the runtime was called a different
|
||||
OCI command-line verb was provided. This architecture was simple, but
|
||||
not well suited to creating VM based containers due to the issue of
|
||||
handling state between calls. Additionally, the architecture suffered
|
||||
from performance issues related to continually having to spawn new
|
||||
instances of the runtime binary, and
|
||||
[Kata shim](https://github.com/kata-containers/shim) and
|
||||
[Kata proxy](https://github.com/kata-containers/proxy) processes for systems
|
||||
that did not provide VSOCK.
|
||||
|
||||
## Kata 2.x architecture
|
||||
|
||||
See the ["shimv2"](README.md#shim-v2-architecture) section of the
|
||||
architecture document.
|
||||
|
||||
## Architectural comparison
|
||||
|
||||
| Kata version | Kata Runtime process calls | Kata shim processes | Kata proxy processes (if no VSOCK) |
|
||||
|-|-|-|-|
|
||||
| 1.x | multiple per container | 1 per container connection | 1 |
|
||||
| 2.x | 1 per VM (hosting any number of containers) | 0 | 0 |
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
> - A single VM can host one or more containers.
|
||||
>
|
||||
> - The "Kata shim processes" column refers to the old
|
||||
> [Kata shim](https://github.com/kata-containers/shim) (`kata-shim` binary),
|
||||
> *not* the new shimv2 runtime instance (`containerd-shim-kata-v2` binary).
|
||||
|
||||
The diagram below shows how the original architecture was simplified
|
||||
with the advent of shimv2.
|
||||
|
||||

|
||||
@@ -1,35 +0,0 @@
|
||||
# Kubernetes support
|
||||
|
||||
[Kubernetes](https://github.com/kubernetes/kubernetes/), or K8s, is a popular open source
|
||||
container orchestration engine. In Kubernetes, a set of containers sharing resources
|
||||
such as networking, storage, mount, PID, etc. is called a
|
||||
[pod](https://kubernetes.io/docs/user-guide/pods/).
|
||||
|
||||
A node can have multiple pods, but at a minimum, a node within a Kubernetes cluster
|
||||
only needs to run a container runtime and a container agent (called a
|
||||
[Kubelet](https://kubernetes.io/docs/admin/kubelet/)).
|
||||
|
||||
Kata Containers represents a Kubelet pod as a VM.
|
||||
|
||||
A Kubernetes cluster runs a control plane where a scheduler (typically
|
||||
running on a dedicated master node) calls into a compute Kubelet. This
|
||||
Kubelet instance is responsible for managing the lifecycle of pods
|
||||
within the nodes and eventually relies on a container runtime to
|
||||
handle execution. The Kubelet architecture decouples lifecycle
|
||||
management from container execution through a dedicated gRPC based
|
||||
[Container Runtime Interface (CRI)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/container-runtime-interface-v1.md).
|
||||
|
||||
In other words, a Kubelet is a CRI client and expects a CRI
|
||||
implementation to handle the server side of the interface.
|
||||
[CRI-O](https://github.com/kubernetes-incubator/cri-o) and
|
||||
[containerd](https://github.com/containerd/containerd/) are CRI
|
||||
implementations that rely on
|
||||
[OCI](https://github.com/opencontainers/runtime-spec) compatible
|
||||
runtimes for managing container instances.
|
||||
|
||||
Kata Containers is an officially supported CRI-O and containerd
|
||||
runtime. Refer to the following guides on how to set up Kata
|
||||
Containers with Kubernetes:
|
||||
|
||||
- [How to use Kata Containers and containerd](../../how-to/containerd-kata.md)
|
||||
- [Run Kata Containers with Kubernetes](../../how-to/run-kata-with-k8s.md)
|
||||
@@ -1,49 +0,0 @@
|
||||
# Networking
|
||||
|
||||
Containers typically live in their own, possibly shared, networking namespace.
|
||||
At some point in a container lifecycle, container engines will set up that namespace
|
||||
to add the container to a network which is isolated from the host network.
|
||||
|
||||
In order to set up the network for a container, container engines call into a
|
||||
networking plugin. The network plugin will usually create a virtual
|
||||
ethernet (`veth`) pair adding one end of the `veth` pair into the container
|
||||
networking namespace, while the other end of the `veth` pair is added to the
|
||||
host networking namespace.
|
||||
|
||||
This is a very namespace-centric approach as many hypervisors or VM
|
||||
Managers (VMMs) such as `virt-manager` cannot handle `veth`
|
||||
interfaces. Typically, [`TAP`](https://www.kernel.org/doc/Documentation/networking/tuntap.txt)
|
||||
interfaces are created for VM connectivity.
|
||||
|
||||
To overcome the incompatibility between typical container engines' expectations
|
||||
and virtual machines, Kata Containers networking transparently connects `veth`
|
||||
interfaces with `TAP` ones using [Traffic Control](https://man7.org/linux/man-pages/man8/tc.8.html):
|
||||
|
||||

|
||||
|
||||
With TC filter rules in place, a redirection is created between the container network
|
||||
and the virtual machine. As an example, the network plugin may place a device,
|
||||
`eth0`, in the container's network namespace, which is one end of a VETH device.
|
||||
Kata Containers will create a tap device for the VM, `tap0_kata`,
|
||||
and set up a TC redirection filter to redirect traffic from `eth0`'s ingress to `tap0_kata`'s egress,
|
||||
and a second TC filter to redirect traffic from `tap0_kata`'s ingress to `eth0`'s egress.
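The two redirections described above can be approximated with `tc(8)` command lines. The sketch below only generates the command strings for the `eth0` and `tap0_kata` devices from the example; it is an illustration of roughly equivalent commands, not a description of how the Kata runtime itself installs the filters. The function name `tc_mirror_commands` is hypothetical.

```python
def tc_mirror_commands(veth="eth0", tap="tap0_kata"):
    """Return tc(8) command lines that redirect traffic in both directions."""
    commands = []
    for src, dst in ((veth, tap), (tap, veth)):
        # An ingress qdisc must exist before an ingress filter can be attached.
        commands.append(f"tc qdisc add dev {src} handle ffff: ingress")
        # Match every packet arriving on src and redirect it to dst's egress.
        commands.append(
            f"tc filter add dev {src} parent ffff: protocol all "
            f"u32 match u8 0 0 action mirred egress redirect dev {dst}"
        )
    return commands


for command in tc_mirror_commands():
    print(command)
```

Each direction needs its own qdisc and filter, which is why four commands are produced for one `veth`/`TAP` pair.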
|
||||
|
||||
Kata Containers maintains support for MACVTAP, which was an earlier implementation used in Kata.
|
||||
With this method, Kata created a MACVTAP device to connect directly to the `eth0` device.
|
||||
TC-filter is the default because it allows for simpler configuration, better CNI plugin
|
||||
compatibility, and performance on par with MACVTAP.
|
||||
|
||||
Kata Containers has deprecated support for bridge mode because its performance is poor relative to TC-filter and MACVTAP.
|
||||
|
||||
Kata Containers supports both
|
||||
[CNM](https://github.com/docker/libnetwork/blob/master/docs/design.md#the-container-network-model)
|
||||
and [CNI](https://github.com/containernetworking/cni) for networking management.
|
||||
|
||||
## Network Hotplug
|
||||
|
||||
Kata Containers has developed a set of network sub-commands and APIs to add, list and
|
||||
remove a guest network endpoint and to manipulate the guest route table.
|
||||
|
||||
The following diagram illustrates the Kata Containers network hotplug workflow.
|
||||
|
||||

|
||||
@@ -1,44 +0,0 @@
|
||||
# Storage
|
||||
|
||||
## virtio SCSI
|
||||
|
||||
If a block-based graph driver is [configured](README.md#configuration),
|
||||
`virtio-scsi` is used to _share_ the workload image (such as
|
||||
`busybox:latest`) into the container's environment inside the VM.
|
||||
|
||||
## virtio FS
|
||||
|
||||
If a block-based graph driver is _not_ [configured](README.md#configuration), a
|
||||
[`virtio-fs`](https://virtio-fs.gitlab.io) (`VIRTIO`) overlay
|
||||
filesystem mount point is used to _share_ the workload image instead. The
|
||||
[agent](README.md#agent) uses this mount point as the root filesystem for the
|
||||
container processes.
|
||||
|
||||
For virtio-fs, the [runtime](README.md#runtime) starts one `virtiofsd` daemon
|
||||
(that runs in the host context) for each VM created.
|
||||
|
||||
## Devicemapper
|
||||
|
||||
The
|
||||
[devicemapper `snapshotter`](https://github.com/containerd/containerd/tree/master/snapshots/devmapper)
|
||||
is a special case. The `snapshotter` uses dedicated block devices
|
||||
rather than formatted filesystems, and operates at the block level
|
||||
rather than the file level. This knowledge is used to directly use the
|
||||
underlying block device instead of the overlay file system for the
|
||||
container root file system. The block device maps to the top
|
||||
read-write layer for the overlay. This approach gives much better I/O
|
||||
performance compared to using `virtio-fs` to share the container file
|
||||
system.
|
||||
|
||||
#### Hot plug and unplug
|
||||
|
||||
Kata Containers has the ability to hot plug and hot unplug
|
||||
block devices. This makes it possible to use block devices for
|
||||
containers started after the VM has been launched.
|
||||
|
||||
Users can check to see if the container uses the `devicemapper` block
|
||||
device as its rootfs by calling `mount(8)` within the container. If
|
||||
the `devicemapper` block device is used, the root filesystem (`/`)
|
||||
will be mounted from `/dev/vda`. Users can disable direct mounting of
|
||||
the underlying block device through the runtime
|
||||
[configuration](README.md#configuration).
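The check described above can be sketched as a small parser. The example assumes `/proc/mounts`-style text (device, mount point, filesystem type, options) rather than the exact output format of `mount(8)`, and the sample data is hypothetical rather than captured from a real container.

```python
def root_mount_source(mounts_text):
    """Return the device mounted at `/`, given /proc/mounts-style text."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "/":
            source = fields[0]  # a later entry for `/` overrides earlier ones
    return source


# Hypothetical sample data, for illustration only.
SAMPLE = """\
/dev/vda / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""

print(root_mount_source(SAMPLE) == "/dev/vda")
```

Inside a container, the same function could be applied to the contents of `/proc/mounts` to see whether `/` is backed by `/dev/vda`.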
|
||||
@@ -1,3 +0,0 @@
|
||||
# Kata Containers E2E Flow
|
||||
|
||||

|
||||
@@ -1,368 +0,0 @@
|
||||
# Host cgroup management
|
||||
|
||||
## Introduction
|
||||
|
||||
In Kata Containers, workloads run in a virtual machine that is managed by a virtual
|
||||
machine monitor (VMM) running on the host. As a result, Kata Containers run over two layers of cgroups. The
|
||||
first layer is in the guest where the workload is placed, while the second layer is on the host where the
|
||||
VMM and associated threads are running.
|
||||
|
||||
The OCI [runtime specification][linux-config] provides guidance on where the container cgroups should be placed:
|
||||
|
||||
> [`cgroupsPath`][cgroupspath]: (string, OPTIONAL) path to the cgroups. It can be used to either control the cgroups
|
||||
> hierarchy for containers or to run a new process in an existing container
|
||||
|
||||
Cgroups are hierarchical, and this can be seen with the following pod example:
|
||||
|
||||
- Pod 1: `cgroupsPath=/kubepods/pod1`
|
||||
- Container 1: `cgroupsPath=/kubepods/pod1/container1`
|
||||
- Container 2: `cgroupsPath=/kubepods/pod1/container2`
|
||||
|
||||
- Pod 2: `cgroupsPath=/kubepods/pod2`
|
||||
- Container 1: `cgroupsPath=/kubepods/pod2/container1`
|
||||
- Container 2: `cgroupsPath=/kubepods/pod2/container2`
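To make the hierarchy in the example above concrete, the sketch below lists every ancestor cgroup of a given `cgroupsPath`. Limits set on any of those ancestors also constrain the cgroup itself. The helper name `ancestors` is hypothetical.

```python
from pathlib import PurePosixPath


def ancestors(cgroups_path):
    """List the cgroups, outermost first, that contain this path."""
    path = PurePosixPath(cgroups_path)
    # `parents` is ordered innermost first, so reverse it and drop the root.
    chain = [str(p) for p in list(path.parents)[::-1] if str(p) != "/"]
    return chain + [str(path)]


print(ancestors("/kubepods/pod1/container1"))
```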
|
||||
|
||||
Depending on the upper-level orchestration layers, the cgroup under which the pod is placed may or may not be
|
||||
managed by the orchestrator. In the case of Kubernetes, the pod cgroup is created by Kubelet,
|
||||
while the container cgroups are to be handled by the runtime.
|
||||
Kubelet will size the pod cgroup based on the container resource requirements, to which it may add
|
||||
a configured set of [pod resource overheads](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/).
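The sizing rule above amounts to simple arithmetic: the sum of the container resource requirements plus an optional overhead. The sketch below is a deliberately simplified model with hypothetical memory figures; it does not reproduce how Kubelet actually computes cgroup limits.

```python
def pod_cgroup_limit(container_limits, overhead=0):
    """Sum the container limits and add the configured pod overhead."""
    return sum(container_limits) + overhead


# Hypothetical memory figures in MiB, chosen only for illustration.
containers_mib = [256, 512]
overhead_mib = 160

print(pod_cgroup_limit(containers_mib))                # without overhead
print(pod_cgroup_limit(containers_mib, overhead_mib))  # with overhead
```

The difference between the two results is the room left for host-side processes that are not part of the container workload.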
|
||||
|
||||
Kata Containers introduces a non-negligible resource overhead for running a sandbox (pod). Typically, the Kata shim,
|
||||
through its underlying VMM invocation, will create many additional threads compared to process based container runtimes:
|
||||
the para-virtualized I/O back-ends, the VMM instance or even the Kata shim process. All of those host processes consume
|
||||
memory and CPU time not directly tied to the container workload, and introduce a sandbox resource overhead.
|
||||
In order for a Kata workload to run without significant performance degradation, its sandbox overhead must be
|
||||
provisioned accordingly. Two scenarios are possible:
|
||||
|
||||
1) The upper-layer orchestrator takes the overhead of running a sandbox into account when sizing the pod cgroup.
|
||||
For example, Kubernetes [`PodOverhead`](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/)
|
||||
feature lets the orchestrator add a configured sandbox overhead to the sum of all its containers' resources. In
|
||||
that case, the pod sandbox is properly sized and all Kata created processes will run under the pod cgroup
|
||||
defined constraints and limits.
|
||||
2) The upper-layer orchestrator does **not** take the sandbox overhead into account and the pod cgroup is not
|
||||
sized to properly run all Kata created processes. With that scenario, attaching all the Kata processes to the sandbox
|
||||
cgroup may lead to non-negligible workload performance degradations. As a consequence, Kata Containers will move
|
||||
all processes but the vCPU threads into a dedicated overhead cgroup under `/kata_overhead`. The Kata runtime will
|
||||
not apply any constraints or limits to that cgroup; it is up to the infrastructure owner to optionally set it up.
|
||||
|
||||
Those 2 scenarios are not dynamically detected by the Kata Containers runtime implementation, and thus the
|
||||
infrastructure owner must configure the runtime according to how the upper-layer orchestrator creates and sizes the
|
||||
pod cgroup. That configuration selection is done through the `sandbox_cgroup_only` flag within the Kata Containers
|
||||
[configuration](../../src/runtime/README.md#configuration) file.
|
||||
|
||||
## `sandbox_cgroup_only = true`
|
||||
|
||||
Setting `sandbox_cgroup_only` to `true` in the Kata Containers configuration file means that the pod cgroup is
|
||||
properly sized and takes the pod overhead into account. This is ideal, as all the applicable Kata Containers processes
|
||||
can simply be placed within the given cgroup path.
|
||||
|
||||
In the context of Kubernetes, Kubelet can size the pod cgroup to take the overhead of running a Kata-based sandbox
|
||||
into account. This has been supported since the 1.16 Kubernetes release, through the
|
||||
[`PodOverhead`](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/) feature.
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────┐
|
||||
│ │
|
||||
│ ┌──────────────────────────────────┐ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────────────────────┐ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │
|
||||
│ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │Pod 1 │ │ │
|
||||
│ │ └─────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────────────────────┐ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │
|
||||
│ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │Pod 2 │ │ │
|
||||
│ │ └─────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │/kubepods │ │
|
||||
│ └──────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Node │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation details
|
||||
|
||||
When `sandbox_cgroup_only` is enabled, the Kata shim will create a per pod
|
||||
sub-cgroup under the pod's dedicated cgroup. For example, in the Kubernetes context,
|
||||
it will create a `/kata_<PodSandboxID>` under the `/kubepods` cgroup hierarchy.
|
||||
On a typical cgroup v1 hierarchy mounted under `/sys/fs/cgroup/`, the memory cgroup
|
||||
subsystem for a pod with sandbox ID `12345678` would live under
|
||||
`/sys/fs/cgroup/memory/kubepods/kata_12345678`.
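The path layout in this example can be expressed as a small helper. This is a sketch that mirrors the cgroup v1 example above; the function name `sandbox_cgroup_path` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath


def sandbox_cgroup_path(sandbox_id, subsystem="memory", parent="kubepods",
                        root="/sys/fs/cgroup"):
    """Build the cgroup v1 path of the per-sandbox cgroup."""
    return str(PurePosixPath(root) / subsystem / parent / f"kata_{sandbox_id}")


print(sandbox_cgroup_path("12345678"))
```

Passing a different `subsystem` (for example `cpuset`) gives the corresponding path in that subsystem's hierarchy.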
|
||||
|
||||
In most cases, the `/kata_<PodSandboxID>` created cgroup is unrestricted and inherits and shares all
|
||||
constraints and limits from the parent cgroup (`/kubepods` in the Kubernetes case). The exception is
|
||||
for the `cpuset` and `devices` cgroup subsystems, which are managed by the Kata shim.
|
||||
|
||||
After creating the `/kata_<PodSandboxID>` cgroup, the Kata Containers shim will move itself to it, **before** starting
|
||||
the virtual machine. As a consequence all processes subsequently created by the Kata Containers shim (the VMM itself, and
|
||||
all vCPU and I/O related threads) will be created in the `/kata_<PodSandboxID>` cgroup.
|
||||
|
||||
### Why create a kata-cgroup under the parent cgroup?
|
||||
|
||||
And why not add the per-sandbox shim directly to the pod cgroup (e.g.
|
||||
`/kubepods` in the Kubernetes context)?
|
||||
|
||||
The Kata Containers shim implementation creates a per-sandbox cgroup
|
||||
(`/kata_<PodSandboxID>`) to support the `Docker` use case. Although `Docker` does not
|
||||
have a notion of pods, Kata Containers still creates a sandbox to support the pod-less,
|
||||
single container use case that `Docker` implements. Since `Docker` does not create any
|
||||
cgroup hierarchy to place a container into, it would be very complex for Kata to map
|
||||
a particular container to its sandbox without placing it under a `/kata_<containerID>`
|
||||
sub-cgroup first.
|
||||
|
||||
### Advantages
|
||||
|
||||
Keeping all Kata Containers processes under a properly sized pod cgroup is ideal
|
||||
and makes for a simpler Kata Containers implementation. It also helps with gathering
|
||||
accurate statistics and preventing Kata workloads from being noisy neighbors.
|
||||
|
||||
#### Pod resources statistics
|
||||
|
||||
If the Kata caller wants to know the resource usage on the host it can get
|
||||
statistics from the pod cgroup. All cgroups stats in the hierarchy will include
|
||||
the Kata overhead. This gives the possibility of gathering usage statistics at the
|
||||
pod level and the container level.
|
||||
|
||||
#### Better host resource isolation
|
||||
|
||||
Because the Kata runtime will place all the Kata processes in the pod cgroup,
|
||||
the resource limits that the caller applies to the pod cgroup will affect all
|
||||
processes that belong to the Kata sandbox in the host. This will improve the
|
||||
isolation in the host, preventing Kata from becoming a noisy neighbor.
|
||||
|
||||
## `sandbox_cgroup_only = false` (Default setting)
|
||||
|
||||
If the cgroup provided to Kata is not sized appropriately, Kata components will
|
||||
consume resources that the actual container workloads expect to see and use.
|
||||
This can cause instability and performance degradations.
|
||||
|
||||
To avoid that situation, Kata Containers creates an unconstrained overhead
|
||||
cgroup and moves all non workload related processes (anything but the virtual CPU
|
||||
threads) to it. The name of this overhead cgroup is `/kata_overhead` and a per
|
||||
sandbox sub cgroup will be created under it for each sandbox Kata Containers creates.
|
||||
|
||||
Kata Containers does not add any constraints or limitations on the overhead cgroup. It is up to the infrastructure
|
||||
owner to either:
|
||||
|
||||
- Provision nodes with a pre-sized `/kata_overhead` cgroup. Kata Containers will
|
||||
load that existing cgroup and move all non workload related processes to it.
|
||||
- Let Kata Containers create the `/kata_overhead` cgroup, leave it
|
||||
unconstrained or resize it afterwards.
|
||||
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────────┐
|
||||
│ │
|
||||
│ ┌─────────────────────────────┐ ┌───────────────────────────┐ │
|
||||
│ │ │ │ │ │
|
||||
│ │ ┌─────────────────────────┼────┼─────────────────────────┐ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ │ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ │ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │ /<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ Pod 1 │ │ │ │ │
|
||||
│ │ └─────────────────────────┼────┼─────────────────────────┘ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ ┌─────────────────────────┼────┼─────────────────────────┐ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ │ │ ┌─────────────────────┐ │ │ │
|
||||
│ │ │ │ vCPU threads │ │ │ │ VMM │ │ │ │
|
||||
│ │ │ │ │ │ │ │ I/O threads │ │ │ │
|
||||
│ │ │ │ │ │ │ │ Kata Shim │ │ │ │
|
||||
│ │ │ │ │ │ │ │ │ │ │ │
|
||||
│ │ │ │ /kata_<sandbox_id> │ │ │ │ /<sandbox_id> │ │ │ │
|
||||
│ │ │ └─────────────────────┘ │ │ └─────────────────────┘ │ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ │ Pod 2 │ │ │ │ │
|
||||
│ │ └─────────────────────────┼────┼─────────────────────────┘ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ /kubepods │ │ /kata_overhead │ │
|
||||
│ └─────────────────────────────┘ └───────────────────────────┘ │
|
||||
│ │
|
||||
│ │
|
||||
│ Node │
|
||||
└────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
```
|
||||
|
||||
### Implementation Details
|
||||
|
||||
When `sandbox_cgroup_only` is disabled, the Kata Containers shim will create a per pod
|
||||
sub-cgroup under the pod's dedicated cgroup, and another one under the overhead cgroup.
|
||||
For example, in the Kubernetes context, it will create a `/kata_<PodSandboxID>` under
|
||||
the `/kubepods` cgroup hierarchy, and a `/<PodSandboxID>` under the `/kata_overhead` one.
|
||||
|
||||
On a typical cgroup v1 hierarchy mounted under `/sys/fs/cgroup/`, for a pod whose sandbox
|
||||
ID is `12345678`, created with `sandbox_cgroup_only` disabled, the 2 memory subsystems
|
||||
for the sandbox cgroup and the overhead cgroup would respectively live under
|
||||
`/sys/fs/cgroup/memory/kubepods/kata_12345678` and `/sys/fs/cgroup/memory/kata_overhead/12345678`.
|
||||
|
||||
Unlike when `sandbox_cgroup_only` is enabled, the Kata Containers shim will move itself
|
||||
to the overhead cgroup first, and then move the vCPU threads to the sandbox cgroup as
|
||||
they're created. All Kata processes and threads will run under the overhead cgroup except for
|
||||
the vCPU threads.
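The placement rules for both settings of `sandbox_cgroup_only` can be summarised as a small decision function. This is a sketch of the behaviour described in this document, not code taken from the Kata shim; the names `placement`, `SANDBOX` and `OVERHEAD` are hypothetical.

```python
SANDBOX = "sandbox cgroup"    # /kata_<PodSandboxID> under the pod cgroup
OVERHEAD = "overhead cgroup"  # /<PodSandboxID> under /kata_overhead


def placement(thread_kind, sandbox_cgroup_only):
    """Return the host cgroup in which a Kata thread or process runs."""
    if sandbox_cgroup_only:
        return SANDBOX   # everything runs in the per-sandbox cgroup
    if thread_kind == "vcpu":
        return SANDBOX   # only the vCPU threads are moved here
    return OVERHEAD      # the VMM, I/O threads and shim stay here


for kind in ("vcpu", "io", "vmm", "shim"):
    print(kind, placement(kind, True), placement(kind, False), sep=" | ")
```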
|
||||
|
||||
With `sandbox_cgroup_only` disabled, Kata Containers assumes the pod cgroup is only sized
|
||||
to accommodate the actual container workload processes. For Kata, this maps
|
||||
to the VMM created virtual CPU threads and so they are the only ones running under the pod
|
||||
cgroup. This mitigates the risk of the VMM, the Kata shim and the I/O threads going through
|
||||
a catastrophic out of memory scenario (`OOM`).
|
||||
|
||||
#### Pros and Cons
|
||||
|
||||
Running all non vCPU threads under an unconstrained overhead cgroup could lead to workloads
|
||||
potentially consuming a large amount of host resources.
|
||||
|
||||
On the other hand, running all non vCPU threads under a dedicated overhead cgroup can provide
|
||||
accurate metrics on the actual Kata Container pod overhead, allowing for tuning the overhead
|
||||
cgroup size and constraints accordingly.
|
||||
|
||||
[linux-config]: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md
|
||||
[cgroupspath]: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md#cgroups-path
|
||||
|
||||
# Supported cgroups
|
||||
|
||||
Kata Containers currently only supports cgroups `v1`.
|
||||
|
||||
In the following sections each cgroup is described briefly.
|
||||
|
||||
## Cgroups V1
|
||||
|
||||
`Cgroups V1` are under a [`tmpfs`][1] filesystem mounted at `/sys/fs/cgroup`, where each cgroup controller is
|
||||
mounted under a separate cgroup filesystem. A `Cgroups v1` hierarchy may look like the following
|
||||
diagram:
|
||||
|
||||
```
|
||||
/sys/fs/cgroup/
|
||||
├── blkio
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── cpu -> cpu,cpuacct
|
||||
├── cpuacct -> cpu,cpuacct
|
||||
├── cpu,cpuacct
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── cpuset
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── devices
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── freezer
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── hugetlb
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── memory
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── net_cls -> net_cls,net_prio
|
||||
├── net_cls,net_prio
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── net_prio -> net_cls,net_prio
|
||||
├── perf_event
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
├── pids
|
||||
│ ├── cgroup.procs
|
||||
│ └── tasks
|
||||
└── systemd
|
||||
├── cgroup.procs
|
||||
└── tasks
|
||||
```
|
||||
|
||||
A process can join a cgroup by writing its process id (`pid`) to the `cgroup.procs` file,
|
||||
or join a cgroup partially by writing the task (thread) id (`tid`) to the `tasks` file.
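Joining a cgroup is therefore just a file write. The sketch below shows the idea against a scratch directory rather than a real cgroup filesystem, so it can be run without privileges; the helper name `join_cgroup` is hypothetical.

```python
import os
import tempfile


def join_cgroup(cgroup_dir, pid=None, thread_only=False):
    """Write a pid (or tid) to the cgroup v1 membership file."""
    # `tasks` moves a single thread; `cgroup.procs` moves the whole process.
    target = "tasks" if thread_only else "cgroup.procs"
    ident = os.getpid() if pid is None else pid
    with open(os.path.join(cgroup_dir, target), "w") as f:
        f.write(str(ident))
    return target


# Demonstrate against a scratch directory, not a real cgroup filesystem.
with tempfile.TemporaryDirectory() as scratch:
    used = join_cgroup(scratch, pid=1234)
    with open(os.path.join(scratch, used)) as f:
        print(used, f.read())
```

On a real host, `cgroup_dir` would be a directory such as one of those shown in the diagram above, and the write would normally require elevated privileges.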
|
||||
|
||||
Kata Containers only supports `v1`.
|
||||
To know more about `cgroups v1`, see [cgroupsv1(7)][2].
|
||||
|
||||
## Cgroups V2
|
||||
|
||||
`Cgroups v2` are also known as unified cgroups. Unlike `cgroups v1`, the cgroups are
|
||||
mounted under the same cgroup filesystem. A `Cgroups v2` hierarchy may look like the following
|
||||
diagram:
|
||||
|
||||
```
|
||||
/sys/fs/cgroup/system.slice
|
||||
├── cgroup.controllers
|
||||
├── cgroup.events
|
||||
├── cgroup.freeze
|
||||
├── cgroup.max.depth
|
||||
├── cgroup.max.descendants
|
||||
├── cgroup.procs
|
||||
├── cgroup.stat
|
||||
├── cgroup.subtree_control
|
||||
├── cgroup.threads
|
||||
├── cgroup.type
|
||||
├── cpu.max
|
||||
├── cpu.pressure
|
||||
├── cpu.stat
|
||||
├── cpu.weight
|
||||
├── cpu.weight.nice
|
||||
├── io.bfq.weight
|
||||
├── io.latency
|
||||
├── io.max
|
||||
├── io.pressure
|
||||
├── io.stat
|
||||
├── memory.current
|
||||
├── memory.events
|
||||
├── memory.events.local
|
||||
├── memory.high
|
||||
├── memory.low
|
||||
├── memory.max
|
||||
├── memory.min
|
||||
├── memory.oom.group
|
||||
├── memory.pressure
|
||||
├── memory.stat
|
||||
├── memory.swap.current
|
||||
├── memory.swap.events
|
||||
├── memory.swap.max
|
||||
├── pids.current
|
||||
├── pids.events
|
||||
└── pids.max
|
||||
```
|
||||
|
||||
As with `cgroups v1`, a process can join the cgroup by writing its process id (`pid`) to the
|
||||
`cgroup.procs` file, or join a cgroup partially by writing the task (thread) id (`tid`) to the
|
||||
`cgroup.threads` file.
|
||||
|
||||
Kata Containers does not support cgroups `v2` on the host.
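Since only `v1` is supported on the host, it can be useful to check which layout a host uses. The sketch below relies on the fact that the root of a unified (`v2`) hierarchy contains a `cgroup.controllers` file, as in the diagram above. The function name `cgroup_version` is hypothetical, and hybrid setups (where `v2` is mounted in a subdirectory alongside `v1` controllers) are reported as `1` by this simplified check.

```python
import os


def cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if `root` looks like a unified cgroups v2 mount, else 1."""
    # cgroups v2 exposes cgroup.controllers at the root of the hierarchy;
    # a v1 layout has one directory per controller instead.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return 2
    return 1
```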
|
||||
|
||||
### Distro Support
|
||||
|
||||
Many Linux distributions do not yet support `cgroups v2`, as it is quite a recent addition.
|
||||
For more information about the status of this feature see [issue #2494][4].
|
||||
|
||||
|
||||
[1]: http://man7.org/linux/man-pages/man5/tmpfs.5.html
|
||||
[2]: http://man7.org/linux/man-pages/man7/cgroups.7.html#CGROUPS_VERSION_1
|
||||
[3]: http://man7.org/linux/man-pages/man7/cgroups.7.html#CGROUPS_VERSION_2
|
||||
[4]: https://github.com/kata-containers/runtime/issues/2494
|
||||
@@ -1,30 +0,0 @@
|
||||
# Kata Containers support for `inotify`
|
||||
|
||||
## Background on `inotify` usage
|
||||
|
||||
A common pattern in Kubernetes is to watch for changes to files/directories passed in as `ConfigMaps`
|
||||
or `Secrets`. Sidecars normally use `inotify` to watch for changes and then signal the primary container to reload
|
||||
the updated configuration. Kata Containers typically will pass these host files into the guest using `virtiofs`, which
|
||||
does not support `inotify` today. While we work to enable this use case in `virtiofs`, we introduced a workaround in Kata Containers.
|
||||
This document describes how Kata Containers implements this workaround.
|
||||
|
||||
### Detecting a `watchable` mount
|
||||
|
||||
Kubernetes creates `secrets` and `ConfigMap` mounts at very specific locations on the host filesystem. For container mounts,
|
||||
the `Kata Containers` runtime will check the source of the mount to identify these special cases. For these use cases, only a single file
|
||||
or very few would typically need to be watched. To avoid excessive overheads in making a mount watchable,
|
||||
we enforce a limit of eight files per mount. If a `secret` or `ConfigMap` mount contains more than 8 files, it will not be
|
||||
considered watchable. We similarly enforce a limit of 1 MB per mount to be considered watchable. Non-watchable mounts will
|
||||
continue to propagate changes from the mount on the host to the container workload, but these updates will not trigger an
|
||||
`inotify` event.
|
||||
|
||||
If at any point a mount grows beyond the eight-file or 1 MB limit, it will no longer be `watchable`.
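The two limits can be illustrated with a short check. This is only a sketch of the rule as stated above, not the Kata runtime's implementation: the name `is_watchable` is hypothetical, and "1 MB" is assumed to mean 1024 * 1024 bytes.

```python
import os

MAX_FILES = 8             # limit stated in the text
MAX_BYTES = 1024 * 1024   # assumption: "1 MB" interpreted as 1 MiB


def is_watchable(mount_dir: str) -> bool:
    """Return True if the mount stays within both the file-count and size limits."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(mount_dir):
        for name in filenames:
            count += 1
            # getsize follows symlinks; a real implementation would need to
            # decide how to treat the symlinks Kubernetes uses in these mounts.
            total += os.path.getsize(os.path.join(dirpath, name))
            if count > MAX_FILES or total > MAX_BYTES:
                return False
    return True
```

Because a mount can grow over time, a check like this would have to be repeated on every poll rather than evaluated once at mount time.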
|
||||
|
||||
### Presenting a `watchable` mount to the workload
|
||||
|
||||
For mounts that are considered `watchable`, inside the guest, the `kata-agent` will poll the mount presented from
|
||||
the host through `virtiofs` and copy any changed files to a `tmpfs` mount that is presented to the container. In this way,
|
||||
for `watchable` mounts, Kata will do the polling on behalf of the workload and existing workloads needn't change their usage
|
||||
of `inotify`.
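One polling pass of the kind described above might look like the following sketch. It is an illustration under stated assumptions, not the `kata-agent` code: `sync_once` and the `seen` dictionary are hypothetical names, only top-level regular files are handled, and change detection uses modification time plus size.

```python
import os
import shutil


def sync_once(src_dir: str, dst_dir: str, seen: dict) -> list:
    """Copy files that changed since the last pass; return their names."""
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        st = os.stat(src)
        signature = (st.st_mtime_ns, st.st_size)
        if seen.get(name) != signature:
            # Writing into dst_dir (the tmpfs-backed copy in the text) is what
            # would generate the inotify event the workload is waiting for.
            shutil.copy2(src, os.path.join(dst_dir, name))
            seen[name] = signature
            copied.append(name)
    return copied
```

Calling `sync_once` in a loop with a sleep between passes gives the polling behaviour; the same `seen` dictionary must be reused across passes.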
|
||||
|
||||

|
||||
@@ -1,354 +0,0 @@
|
||||
# Kata 2.0 Metrics Design
|
||||
|
||||
Kata implements the CRI API and supports the [`ContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L101) and [`ListContainerStats`](https://github.com/kubernetes/kubernetes/blob/release-1.18/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto#L103) interfaces to expose container metrics. Users can use these interfaces to get basic metrics about containers.
|
||||
|
||||
Unlike `runc`, Kata is a VM-based runtime and has a different architecture.
|
||||
|
||||
## Limitations of Kata 1.x and target of Kata 2.0
|
||||
|
||||
Kata 1.x has a number of limitations related to observability that may be obstacles to running Kata Containers at scale.
|
||||
|
||||
In Kata 2.0, the following components will be able to provide more details about the system:
|
||||
|
||||
- containerd shim v2 (effectively `kata-runtime`)
|
||||
- Hypervisor statistics
|
||||
- Agent process
|
||||
- Guest OS statistics
|
||||
|
||||
> **Note**: In Kata 1.x, the main user-facing component was the runtime (`kata-runtime`). From 1.5, Kata introduced the Kata containerd shim v2 (`containerd-shim-kata-v2`) which is essentially a modified runtime that is loaded by containerd to simplify and improve the way VM-based containers are created and managed.
|
||||
>
|
||||
> For Kata 2.0, the main component is the Kata containerd shim v2, although the deprecated `kata-runtime` binary will be maintained for a period of time.
|
||||
>
|
||||
> Any mention of the "Kata runtime" in this document should be taken to refer to the Kata containerd shim v2 unless explicitly noted otherwise (for example by referring to it explicitly as the `kata-runtime` binary).
|
||||
|
||||
## Metrics architecture
|
||||
|
||||
Kata 2.0 metrics strongly depend on [Prometheus](https://prometheus.io/), a graduated project from CNCF.
|
||||
|
||||
Kata Containers 2.0 introduces a new Kata component called `kata-monitor` which is used to monitor the Kata components on the host. It's shipped with the Kata runtime to provide an interface to:
|
||||
|
||||
- Get metrics
|
||||
- Get events
|
||||
|
||||
At present, `kata-monitor` supports retrieval of metrics only: this is what will be covered in this document.
|
||||
|
||||
|
||||
This is the architecture overview of metrics in Kata Containers 2.0:
|
||||
|
||||

|
||||
|
||||
|
||||
And the sequence diagram is shown below:
|
||||
|
||||

|
||||
|
||||
For a quick evaluation, you can check out [this how to](../how-to/how-to-set-prometheus-in-k8s.md).
|
||||
|
||||
### Kata monitor
|
||||
|
||||
The `kata-monitor` management agent should be started on each node where the Kata containers runtime is installed. `kata-monitor` will:
|
||||
|
||||
> **Note**: a *node* running Kata containers will be either a single host system or a worker node belonging to a K8s cluster capable of running Kata pods.
|
||||
|
||||
- Aggregate sandbox metrics running on the node, adding the `sandbox_id` label to them.
|
||||
- Expose a new Prometheus target, allowing all node metrics coming from the Kata shim to be collected by Prometheus indirectly. This simplifies the targets count in Prometheus and avoids exposing shim's metrics by `ip:port`.
|
||||
|
||||
Only one `kata-monitor` process runs on each node.
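The aggregation step, adding a `sandbox_id` label to every sample coming from a shim, can be sketched on the Prometheus text format. This is a simplified illustration and not the `kata-monitor` implementation: `add_sandbox_label` is a hypothetical name, and the regular expression assumes label values do not contain `}` followed by whitespace.

```python
import re

# metric_name, optional {labels}, then the sample value (and optional timestamp)
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(.+)$')


def add_sandbox_label(text: str, sandbox_id: str) -> str:
    """Append sandbox_id="<id>" to every sample line; keep comments untouched."""
    out = []
    for line in text.splitlines():
        match = _SAMPLE.match(line)
        if line.startswith("#") or not match:
            out.append(line)  # HELP/TYPE comments and blank lines
            continue
        name, labels, value = match.groups()
        extra = f'sandbox_id="{sandbox_id}"'
        labels = f"{labels},{extra}" if labels else extra
        out.append(f"{name}{{{labels}}} {value}")
    return "\n".join(out)
```

With this in place, the samples from every sandbox on a node can be concatenated into one response, so Prometheus needs only a single target per node.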
|
||||
|
||||
`kata-monitor` uses a different communication channel than the one used by the container engine (`containerd`/`CRI-O`) to communicate with the Kata shim. The Kata shim exposes a dedicated socket address reserved to `kata-monitor`.
|
||||
|
||||
The shim's metrics socket file is created under the virtcontainers sandboxes directory, i.e. `vc/sbs/${PODID}/shim-monitor.sock`.
|
||||
|
||||
> **Note**: If there is no Prometheus server configured, i.e., there are no scrape operations, `kata-monitor` will not collect any metrics.
|
||||
|
||||
### Kata runtime
|
||||
|
||||
The Kata runtime's responsibilities are to:
|
||||
|
||||
- Gather metrics about shim process
|
||||
- Gather metrics about hypervisor process
|
||||
- Gather metrics about running sandbox
|
||||
- Get metrics from Kata agent (through `ttrpc`)
|
||||
|
||||
### Kata agent
|
||||
|
||||
The Kata agent's responsibilities are to:
|
||||
|
||||
- Gather agent process metrics
|
||||
- Gather guest OS metrics
|
||||
|
||||
In Kata 2.0, the agent adds a new interface:
|
||||
|
||||
```protobuf
|
||||
rpc GetMetrics(GetMetricsRequest) returns (Metrics);
|
||||
|
||||
message GetMetricsRequest {}
|
||||
|
||||
message Metrics {
|
||||
string metrics = 1;
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
The `metrics` field contains Prometheus-encoded content, which avoids having to define a fixed structure in protocol buffers.
|
||||
|
||||
### Performance and overhead
|
||||
|
||||
Metrics should not become a bottleneck for the system or degrade its performance: they should run with minimal overhead.
|
||||
|
||||
Requirements:
|
||||
|
||||
* Metrics **MUST** be quick to collect
|
||||
* Metrics **MUST** be small
|
||||
* Metrics **MUST** be generated only if there are subscribers to the Kata metrics service
|
||||
* Metrics **MUST** be stateless
|
||||
|
||||
In Kata 2.0, metrics are collected only when needed (pull mode), mainly from the `/proc` filesystem, and consumed by Prometheus. This means that if the Prometheus collector is not running (so no one cares about the metrics) the overhead will be zero.
|
||||
|
||||
The metrics service also doesn't hold any metrics in memory.
|
||||
|
||||
#### Metrics size ####
|
||||
|
||||
|\*|No Sandbox | 1 Sandbox | 2 Sandboxes |
|
||||
|---|---|---|---|
|
||||
|Metrics count| 39 | 106 | 173 |
|
||||
|Metrics size (bytes)| 9K | 144K | 283K |
|
||||
|Metrics size (`gzipped`, bytes)| 2K | 10K | 17K |
|
||||
|
||||
*Metrics size*: response size of one Prometheus scrape request.
|
||||
|
||||
It's easy to estimate the size of one metrics fetch request issued by Prometheus.
|
||||
The formula to calculate the expected size when no gzip compression is in place is:
|
||||
9 + (144 - 9) * `number of kata sandboxes`
|
||||
|
||||
Prometheus supports `gzip compression`. When enabled, the response size of each request will be smaller:
|
||||
2 + (10 - 2) * `number of kata sandboxes`
|
||||
|
||||
**Example**
|
||||
We have 10 sandboxes running on a node. The expected size of one metrics fetch request issued by Prometheus against the kata-monitor agent running on that node will be:
|
||||
9 + (144 - 9) * 10 = **1359K** (about 1.3M)
|
||||
|
||||
If `gzip compression` is enabled:
|
||||
2 + (10 - 2) * 10 = **82K**
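The two estimation formulas can be written as one small function. The constants are the measured values from the table above (in K); the function name `estimate_scrape_kb` is only illustrative.

```python
def estimate_scrape_kb(sandboxes: int, gzipped: bool = False) -> int:
    """Estimate the size (in K) of one scrape response for a node."""
    # (size with no sandbox, size with one sandbox), taken from the table above
    base, one_sandbox = (2, 10) if gzipped else (9, 144)
    return base + (one_sandbox - base) * sandboxes
```

For 10 sandboxes this returns 1359 without compression and 82 with gzip compression.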
|
||||
|
||||
#### Metrics delay ####
|
||||
|
||||
And here is some test data:
|
||||
|
||||
- End-to-end (from the Prometheus server sending the request to `kata-monitor` until `kata-monitor` writes the response back): **20ms** (avg)
|
||||
- Agent (RPC call from shim to agent): **3ms** (avg)
|
||||
|
||||
Test infrastructure:
|
||||
|
||||
- OS: Ubuntu 20.04
|
||||
- Hardware: Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz, 6 Cores, and 16GB memory.
|
||||
|
||||
**Scrape interval**
|
||||
|
||||
Prometheus default `scrape_interval` is 1 minute, but it is usually set to 15 seconds. A smaller `scrape_interval` causes more overhead, so users should set it depending on their monitoring needs.
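To get a feel for how the scrape interval affects overhead, the per-scrape size can be turned into a daily transfer volume. This is a back-of-the-envelope sketch; the function name is hypothetical and it ignores HTTP overhead and failed scrapes.

```python
def daily_scrape_volume_kb(scrape_kb: float, scrape_interval_s: float) -> float:
    """Approximate data transferred per day (in K) for one scrape target."""
    scrapes_per_day = 86400 / scrape_interval_s
    return scrape_kb * scrapes_per_day
```

For an 82K response, a 15-second interval transfers four times as much per day as a 60-second interval, since it scrapes four times as often.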
|
||||
|
||||
## Metrics list
|
||||
|
||||
All the metrics supported by Kata 2.0 are listed below. Some metrics depend on the VM guest kernel, so the available ones may differ based on the environment.
|
||||
|
||||
Metrics are categorized by the component from/for which the metrics are collected.
|
||||
|
||||
* [Metric types](#metric-types)
|
||||
* [Kata agent metrics](#kata-agent-metrics)
|
||||
* [Firecracker metrics](#firecracker-metrics)
|
||||
* [Kata guest OS metrics](#kata-guest-os-metrics)
|
||||
* [Hypervisor metrics](#hypervisor-metrics)
|
||||
* [Kata monitor metrics](#kata-monitor-metrics)
|
||||
* [Kata containerd shim v2 metrics](#kata-containerd-shim-v2-metrics)
|
||||
|
||||
> **Note**:
|
||||
> * Labels here do not include the `instance` and `job` labels added by Prometheus.
|
||||
> * Notes about metrics unit
|
||||
> * `Kibibytes`, abbreviated `KiB`. 1 `KiB` equals 1024 B.
|
||||
> * For some metrics (such as network device statistics from the file `/proc/net/dev`), the unit depends on the label (for example, `recv_bytes` and `recv_packets` have different units).
|
||||
> * Most of these metrics are collected from the `/proc` filesystem, so the unit of each metric matches the unit of the relevant `/proc` entry. See the `proc(5)` manual page for further details.
|
||||
|
||||
### Metric types
|
||||
|
||||
Prometheus offers four core metric types.
|
||||
|
||||
- Counter: A counter is a cumulative metric that represents a single monotonically increasing value; it can only increase or be reset to zero on restart.
|
||||
|
||||
- Gauge: A gauge metric represents a single numerical value that can go up and down, typically used for measured values like current memory usage.
|
||||
|
||||
- Histogram: A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
|
||||
|
||||
- Summary: Like a histogram, a summary samples observations; it can also calculate configurable quantiles over a sliding time window.
|
||||
|
||||
See [Prometheus metric types](https://prometheus.io/docs/concepts/metric_types/) for detailed explanations about these metric types.
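The practical difference between a counter and a gauge shows up when the values are consumed: a gauge is read as-is, while a counter is usually turned into an increase over a window, treating any drop as a reset. The sketch below shows that idea in simplified form; `counter_increase` is a hypothetical helper and does not reproduce the extrapolation Prometheus itself applies.

```python
def counter_increase(samples):
    """Total increase of a counter over consecutive samples, handling resets."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # A value lower than the previous one means the counter restarted
        # from zero, so the whole current value counts as new increase.
        total += (cur - prev) if cur >= prev else cur
    return total
```

For the samples `[1, 5, 2, 4]` the counter was reset between `5` and `2`, and the function returns `8.0` (4 before the reset, 2 immediately after it, and 2 more afterwards).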
|
||||
|
||||
### Kata agent metrics
|
||||
|
||||
Agent metrics describe the agent process.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_agent_io_stat`: <br> Agent process IO stat. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelled_write_byte`</li><li>`rchar`</li><li>`read_bytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`write_bytes`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_proc_stat`: <br> Agent process stat. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_proc_status`: <br> Agent process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_resident_memory_bytes`: <br> Resident memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_process_virtual_memory_bytes`: <br> Virtual memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_scrape_count`: <br> Metrics scrape count | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_total_rss`: <br> Agent process total `rss` size | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_total_time`: <br> Agent process total time | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_agent_total_vm`: <br> Agent process total `vm` size | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
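Several of the metrics above take their `item` values from `/proc/<pid>/io`, which is a list of `key: value` lines. The sketch below parses that format; it is an illustration only (the agent is not written in Python) and `parse_proc_io` is a hypothetical name.

```python
def parse_proc_io(text: str) -> dict:
    """Parse the contents of /proc/<pid>/io into a {name: int} dictionary."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a "key: value" shape
            stats[key.strip()] = int(value)
    return stats
```

On a Linux host, `parse_proc_io(open("/proc/self/io").read())` returns the counters for the calling process.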
|
||||
|
||||
### Firecracker metrics
|
||||
|
||||
Metrics for the Firecracker VMM.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_firecracker_api_server`: <br> Metrics related to the internal API server. | `GAUGE` | | <ul><li>`item`<ul><li>`process_startup_time_cpu_us`</li><li>`process_startup_time_us`</li><li>`sync_response_fails`</li><li>`sync_vmm_send_timeout_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_block`: <br> Block Device associated metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`event_fails`</li><li>`execute_fails`</li><li>`flush_count`</li><li>`invalid_reqs_count`</li><li>`no_avail_buffer`</li><li>`queue_event_count`</li><li>`rate_limiter_event_count`</li><li>`rate_limiter_throttled_events`</li><li>`read_bytes`</li><li>`read_count`</li><li>`update_count`</li><li>`update_fails`</li><li>`write_bytes`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_get_api_requests`: <br> Metrics specific to GET API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`instance_info_count`</li><li>`instance_info_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_i8042`: <br> Metrics specific to the i8042 device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`reset_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_latencies_us`: <br> Performance metrics related for the moment only to snapshots. | `GAUGE` | | <ul><li>`item`<ul><li>`diff_create_snapshot`</li><li>`full_create_snapshot`</li><li>`load_snapshot`</li><li>`pause_vm`</li><li>`resume_vm`</li><li>`vmm_diff_create_snapshot`</li><li>`vmm_full_create_snapshot`</li><li>`vmm_load_snapshot`</li><li>`vmm_pause_vm`</li><li>`vmm_resume_vm`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_logger`: <br> Metrics for the logging subsystem. | `GAUGE` | | <ul><li>`item`<ul><li>`log_fails`</li><li>`metrics_fails`</li><li>`missed_log_count`</li><li>`missed_metrics_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_mmds`: <br> Metrics for the MMDS functionality. | `GAUGE` | | <ul><li>`item`<ul><li>`connections_created`</li><li>`connections_destroyed`</li><li>`rx_accepted`</li><li>`rx_accepted_err`</li><li>`rx_accepted_unusual`</li><li>`rx_bad_eth`</li><li>`rx_count`</li><li>`tx_bytes`</li><li>`tx_count`</li><li>`tx_errors`</li><li>`tx_frames`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_net`: <br> Network-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`event_fails`</li><li>`mac_address_updates`</li><li>`no_rx_avail_buffer`</li><li>`no_tx_avail_buffer`</li><li>`rx_bytes_count`</li><li>`rx_count`</li><li>`rx_event_rate_limiter_count`</li><li>`rx_fails`</li><li>`rx_packets_count`</li><li>`rx_partial_writes`</li><li>`rx_queue_event_count`</li><li>`rx_rate_limiter_throttled`</li><li>`rx_tap_event_count`</li><li>`tap_read_fails`</li><li>`tap_write_fails`</li><li>`tx_bytes_count`</li><li>`tx_count`</li><li>`tx_fails`</li><li>`tx_malformed_frames`</li><li>`tx_packets_count`</li><li>`tx_partial_reads`</li><li>`tx_queue_event_count`</li><li>`tx_rate_limiter_event_count`</li><li>`tx_rate_limiter_throttled`</li><li>`tx_spoofed_mac_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_patch_api_requests`: <br> Metrics specific to PATCH API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`drive_count`</li><li>`drive_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_put_api_requests`: <br> Metrics specific to PUT API Requests for counting user triggered actions and/or failures. | `GAUGE` | | <ul><li>`item`<ul><li>`actions_count`</li><li>`actions_fails`</li><li>`boot_source_count`</li><li>`boot_source_fails`</li><li>`drive_count`</li><li>`drive_fails`</li><li>`logger_count`</li><li>`logger_fails`</li><li>`machine_cfg_count`</li><li>`machine_cfg_fails`</li><li>`metrics_count`</li><li>`metrics_fails`</li><li>`network_count`</li><li>`network_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_rtc`: <br> Metrics specific to the RTC device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_seccomp`: <br> Metrics for the seccomp filtering. | `GAUGE` | | <ul><li>`item`<ul><li>`num_faults`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_signals`: <br> Metrics related to signals. | `GAUGE` | | <ul><li>`item`<ul><li>`sigbus`</li><li>`sigsegv`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_uart`: <br> Metrics specific to the UART device. | `GAUGE` | | <ul><li>`item`<ul><li>`error_count`</li><li>`flush_count`</li><li>`missed_read_count`</li><li>`missed_write_count`</li><li>`read_count`</li><li>`write_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vcpu`: <br> Metrics specific to VCPUs' mode of functioning. | `GAUGE` | | <ul><li>`item`<ul><li>`exit_io_in`</li><li>`exit_io_out`</li><li>`exit_mmio_read`</li><li>`exit_mmio_write`</li><li>`failures`</li><li>`filter_cpuid`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vmm`: <br> Metrics specific to the machine manager as a whole. | `GAUGE` | | <ul><li>`item`<ul><li>`device_events`</li><li>`panic_count`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_firecracker_vsock`: <br> VSOCK-related metrics. | `GAUGE` | | <ul><li>`item`<ul><li>`activate_fails`</li><li>`cfg_fails`</li><li>`conn_event_fails`</li><li>`conns_added`</li><li>`conns_killed`</li><li>`conns_removed`</li><li>`ev_queue_event_fails`</li><li>`killq_resync`</li><li>`muxer_event_fails`</li><li>`rx_bytes_count`</li><li>`rx_packets_count`</li><li>`rx_queue_event_count`</li><li>`rx_queue_event_fails`</li><li>`rx_read_fails`</li><li>`tx_bytes_count`</li><li>`tx_flush_fails`</li><li>`tx_packets_count`</li><li>`tx_queue_event_count`</li><li>`tx_queue_event_fails`</li><li>`tx_write_fails`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
|
||||
### Kata guest OS metrics
|
||||
|
||||
Metrics for the guest OS running inside the hypervisor.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_guest_cpu_time`: <br> Guest CPU stat. | `GAUGE` | | <ul><li>`cpu` (CPU no. and total for all CPUs)<ul><li>`0` (CPU 0)</li><li>`1` (CPU 1)</li><li>`total` (for all CPUs)</li></ul></li><li>`item` (Kernel/system statistics, from `/proc/stat`)<ul><li>`guest`</li><li>`guest_nice`</li><li>`idle`</li><li>`iowait`</li><li>`irq`</li><li>`nice`</li><li>`softirq`</li><li>`steal`</li><li>`system`</li><li>`user`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_diskstat`: <br> Disks stat in system. | `GAUGE` | | <ul><li>`disk` (disk name)</li><li>`item` (see `/proc/diskstats`)<ul><li>`discards`</li><li>`discards_merged`</li><li>`flushes`</li><li>`in_progress`</li><li>`merged`</li><li>`reads`</li><li>`sectors_discarded`</li><li>`sectors_read`</li><li>`sectors_written`</li><li>`time_discarding`</li><li>`time_flushing`</li><li>`time_in_progress`</li><li>`time_reading`</li><li>`time_writing`</li><li>`weighted_time_in_progress`</li><li>`writes`</li><li>`writes_merged`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_load`: <br> Guest system load. | `GAUGE` | | <ul><li>`item`<ul><li>`load1`</li><li>`load15`</li><li>`load5`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_meminfo`: <br> Statistics about memory usage on the system. | `GAUGE` | | <ul><li>`item` (see `/proc/meminfo`)<ul><li>`active`</li><li>`active_anon`</li><li>`active_file`</li><li>`anon_hugepages`</li><li>`anon_pages`</li><li>`bounce`</li><li>`buffers`</li><li>`cached`</li><li>`cma_free`</li><li>`cma_total`</li><li>`commit_limit`</li><li>`committed_as`</li><li>`direct_map_1G`</li><li>`direct_map_2M`</li><li>`direct_map_4M`</li><li>`direct_map_4k`</li><li>`dirty`</li><li>`hardware_corrupted`</li><li>`high_free`</li><li>`high_total`</li><li>`hugepages_free`</li><li>`hugepages_rsvd`</li><li>`hugepages_surp`</li><li>`hugepages_total`</li><li>`hugepagesize`</li><li>`hugetlb`</li><li>`inactive`</li><li>`inactive_anon`</li><li>`inactive_file`</li><li>`k_reclaimable`</li><li>`kernel_stack`</li><li>`low_free`</li><li>`low_total`</li><li>`mapped`</li><li>`mem_available`</li><li>`mem_free`</li><li>`mem_total`</li><li>`mlocked`</li><li>`mmap_copy`</li><li>`nfs_unstable`</li><li>`page_tables`</li><li>`per_cpu`</li><li>`quicklists`</li><li>`s_reclaimable`</li><li>`s_unreclaim`</li><li>`shmem`</li><li>`shmem_hugepages`</li><li>`shmem_pmd_mapped`</li><li>`slab`</li><li>`swap_cached`</li><li>`swap_free`</li><li>`swap_total`</li><li>`unevictable`</li><li>`vmalloc_chunk`</li><li>`vmalloc_total`</li><li>`vmalloc_used`</li><li>`writeback`</li><li>`writeback_tmp`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_netdev_stat`: <br> Guest net devices stats. | `GAUGE` | | <ul><li>`interface` (network device name)</li><li>`item` (see `/proc/net/dev`)<ul><li>`recv_bytes`</li><li>`recv_compressed`</li><li>`recv_drop`</li><li>`recv_errs`</li><li>`recv_fifo`</li><li>`recv_frame`</li><li>`recv_multicast`</li><li>`recv_packets`</li><li>`sent_bytes`</li><li>`sent_carrier`</li><li>`sent_colls`</li><li>`sent_compressed`</li><li>`sent_drop`</li><li>`sent_errs`</li><li>`sent_fifo`</li><li>`sent_packets`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_tasks`: <br> Guest system load. | `GAUGE` | | <ul><li>`item`<ul><li>`cur`</li><li>`max`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_guest_vm_stat`: <br> Guest virtual memory stat. | `GAUGE` | | <ul><li>`item` (see `/proc/vmstat`)<ul><li>`allocstall_dma`</li><li>`allocstall_dma32`</li><li>`allocstall_movable`</li><li>`allocstall_normal`</li><li>`balloon_deflate`</li><li>`balloon_inflate`</li><li>`compact_daemon_free_scanned`</li><li>`compact_daemon_migrate_scanned`</li><li>`compact_daemon_wake`</li><li>`compact_fail`</li><li>`compact_free_scanned`</li><li>`compact_isolated`</li><li>`compact_migrate_scanned`</li><li>`compact_stall`</li><li>`compact_success`</li><li>`drop_pagecache`</li><li>`drop_slab`</li><li>`htlb_buddy_alloc_fail`</li><li>`htlb_buddy_alloc_success`</li><li>`kswapd_high_wmark_hit_quickly`</li><li>`kswapd_inodesteal`</li><li>`kswapd_low_wmark_hit_quickly`</li><li>`nr_active_anon`</li><li>`nr_active_file`</li><li>`nr_anon_pages`</li><li>`nr_anon_transparent_hugepages`</li><li>`nr_bounce`</li><li>`nr_dirtied`</li><li>`nr_dirty`</li><li>`nr_dirty_background_threshold`</li><li>`nr_dirty_threshold`</li><li>`nr_file_pages`</li><li>`nr_free_cma`</li><li>`nr_free_pages`</li><li>`nr_inactive_anon`</li><li>`nr_inactive_file`</li><li>`nr_isolated_anon`</li><li>`nr_isolated_file`</li><li>`nr_kernel_stack`</li><li>`nr_mapped`</li><li>`nr_mlock`</li><li>`nr_page_table_pages`</li><li>`nr_shmem`</li><li>`nr_shmem_hugepages`</li><li>`nr_shmem_pmdmapped`</li><li>`nr_slab_reclaimable`</li><li>`nr_slab_unreclaimable`</li><li>`nr_unevictable`</li><li>`nr_unstable`</li><li>`nr_vmscan_immediate_reclaim`</li><li>`nr_vmscan_write`</li><li>`nr_writeback`</li><li>`nr_writeback_temp`</li><li>`nr_written`</li><li>`nr_zone_active_anon`</li><li>`nr_zone_active_file`</li><li>`nr_zone_inactive_anon`</li><li>`nr_zone_inactive_file`</li><li>`nr_zone_unevictable`</li><li>`nr_zone_write_pending`</li><li>`oom_kill`</li><li>`pageoutrun`</li><li>`pgactivate`</li><li>`pgalloc_dma`</li><li>`pgalloc_dma32`</li><li>`pgalloc_movable`</li><li>`pgalloc_normal`</li><li>`pgdeactivate`</li><li>`pgfault`</li><li>`pgfree`</li><li>`pginodesteal`</li><li>`pglazyfree`</li><li>`pglazyfreed`</li><li>`pgmajfault`</li><li>`pgmigrate_fail`</li><li>`pgmigrate_success`</li><li>`pgpgin`</li><li>`pgpgout`</li><li>`pgrefill`</li><li>`pgrotated`</li><li>`pgscan_direct`</li><li>`pgscan_direct_throttle`</li><li>`pgscan_kswapd`</li><li>`pgskip_dma`</li><li>`pgskip_dma32`</li><li>`pgskip_movable`</li><li>`pgskip_normal`</li><li>`pgsteal_direct`</li><li>`pgsteal_kswapd`</li><li>`pswpin`</li><li>`pswpout`</li><li>`slabs_scanned`</li><li>`swap_ra`</li><li>`swap_ra_hit`</li><li>`unevictable_pgs_cleared`</li><li>`unevictable_pgs_culled`</li><li>`unevictable_pgs_mlocked`</li><li>`unevictable_pgs_munlocked`</li><li>`unevictable_pgs_rescued`</li><li>`unevictable_pgs_scanned`</li><li>`unevictable_pgs_stranded`</li><li>`workingset_activate`</li><li>`workingset_nodereclaim`</li><li>`workingset_refault`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
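The `load1`, `load5`, and `load15` items of `kata_guest_load` match the fields of `/proc/loadavg`, whose fourth field also holds a `current/total` pair of scheduling-entity counts. The sketch below parses that file; mapping the pair to the `cur` and `max` items of `kata_guest_tasks` is an assumption made for illustration, and `parse_loadavg` is a hypothetical name.

```python
def parse_loadavg(text: str) -> dict:
    """Parse /proc/loadavg, e.g. '0.20 0.18 0.12 1/80 11206'."""
    load1, load5, load15, tasks, _last_pid = text.split()
    cur, _, maximum = tasks.partition("/")
    return {
        "load1": float(load1),
        "load5": float(load5),
        "load15": float(load15),
        "cur": int(cur),      # currently runnable scheduling entities
        "max": int(maximum),  # scheduling entities that currently exist
    }
```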
|
||||
|
||||
### Hypervisor metrics
|
||||
|
||||
Hypervisor metrics, collected mainly from the `/proc` filesystem entries of the hypervisor process.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_hypervisor_fds`: <br> Open FDs for hypervisor. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_hypervisor_io_stat`: <br> Process IO statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelledwritebytes`</li><li>`rchar`</li><li>`readbytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`writebytes`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_hypervisor_netdev`: <br> Net devices statistics. | `GAUGE` | | <ul><li>`interface` (network device name)</li><li>`item` (see `/proc/net/dev`)<ul><li>`recv_bytes`</li><li>`recv_compressed`</li><li>`recv_drop`</li><li>`recv_errs`</li><li>`recv_fifo`</li><li>`recv_frame`</li><li>`recv_multicast`</li><li>`recv_packets`</li><li>`sent_bytes`</li><li>`sent_carrier`</li><li>`sent_colls`</li><li>`sent_compressed`</li><li>`sent_drop`</li><li>`sent_errs`</li><li>`sent_fifo`</li><li>`sent_packets`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_hypervisor_proc_stat`: <br> Hypervisor process statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_hypervisor_proc_status`: <br> Hypervisor process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpmd`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_hypervisor_threads`: <br> Hypervisor process threads. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
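Values such as `kata_hypervisor_fds` and `kata_hypervisor_threads` can be derived from the per-process directories under `/proc`: each open file descriptor has an entry in `/proc/<pid>/fd` and each thread has one in `/proc/<pid>/task`. The sketch below counts those entries; whether the Kata shim collects the values in exactly this way is an assumption, and both function names are hypothetical.

```python
import os


def count_open_fds(pid: int, proc_root: str = "/proc") -> int:
    """Number of open file descriptors of `pid` (entries in /proc/<pid>/fd)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def count_threads(pid: int, proc_root: str = "/proc") -> int:
    """Number of threads of `pid` (entries in /proc/<pid>/task)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "task")))
```

The `proc_root` parameter exists only so the functions can be exercised against a fake directory tree; on a Linux host the default is the real `/proc`.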
|
||||
|
||||
### Kata monitor metrics
|
||||
|
||||
Metrics about `kata-monitor` itself.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_monitor_go_gc_duration_seconds`: <br> A summary of the pause duration of garbage collection cycles. | `SUMMARY` | `seconds` | | 2.0.0 |
|
||||
| `kata_monitor_go_goroutines`: <br> Number of goroutines that currently exist. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_go_info`: <br> Information about the Go environment. | `GAUGE` | | <ul><li>`version` (golang version)<ul><li>`go1.13.9` (environment dependent variable)</li></ul></li></ul> | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_alloc_bytes`: <br> Number of bytes allocated and still in use. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_alloc_bytes_total`: <br> Total number of bytes allocated, even if freed. | `COUNTER` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_buck_hash_sys_bytes`: <br> Number of bytes used by the profiling bucket hash table. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_frees_total`: <br> Total number of frees. | `COUNTER` | | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_gc_cpu_fraction`: <br> The fraction of this program's available CPU time used by the GC since the program started. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_gc_sys_bytes`: <br> Number of bytes used for garbage collection system metadata. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_alloc_bytes`: <br> Number of heap bytes allocated and still in use. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_idle_bytes`: <br> Number of heap bytes waiting to be used. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_inuse_bytes`: <br> Number of heap bytes that are in use. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_objects`: <br> Number of allocated objects. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_released_bytes`: <br> Number of heap bytes released to OS. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_heap_sys_bytes`: <br> Number of heap bytes obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_last_gc_time_seconds`: <br> Number of seconds since 1970 of last garbage collection. | `GAUGE` | `seconds` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_lookups_total`: <br> Total number of pointer lookups. | `COUNTER` | | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_mallocs_total`: <br> Total number of `mallocs`. | `COUNTER` | | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_mcache_inuse_bytes`: <br> Number of bytes in use by `mcache` structures. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_mcache_sys_bytes`: <br> Number of bytes used for `mcache` structures obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_mspan_inuse_bytes`: <br> Number of bytes in use by `mspan` structures. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_mspan_sys_bytes`: <br> Number of bytes used for `mspan` structures obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_next_gc_bytes`: <br> Number of heap bytes when next garbage collection will take place. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_other_sys_bytes`: <br> Number of bytes used for other system allocations. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_stack_inuse_bytes`: <br> Number of bytes in use by the stack allocator. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_stack_sys_bytes`: <br> Number of bytes obtained from system for stack allocator. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_memstats_sys_bytes`: <br> Number of bytes obtained from system. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_go_threads`: <br> Number of OS threads created. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | | 2.0.0 |
|
||||
| `kata_monitor_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_process_resident_memory_bytes`: <br> Resident memory size in bytes. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | | 2.0.0 |
|
||||
| `kata_monitor_process_virtual_memory_bytes`: <br> Virtual memory size in bytes. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_process_virtual_memory_max_bytes`: <br> Maximum amount of virtual memory available in bytes. | `GAUGE` | `bytes` | | 2.0.0 |
|
||||
| `kata_monitor_running_shim_count`: <br> Running shim count (running sandboxes). | `GAUGE` | | | 2.0.0 |
|
||||
| `kata_monitor_scrape_count`: <br> Scrape count. | `COUNTER` | | | 2.0.0 |
|
||||
| `kata_monitor_scrape_durations_histogram_milliseconds`: <br> Time used to scrape from shims. | `HISTOGRAM` | `milliseconds` | | 2.0.0 |
|
||||
| `kata_monitor_scrape_failed_count`: <br> Failed scrape count. | `COUNTER` | | | 2.0.0 |
|
||||
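As a rough illustration of how the scrape counters above might be consumed, the sketch below parses a snippet in the Prometheus text format and derives a scrape failure ratio. The sample values are invented for the example, and `parse_samples` and `scrape_failure_ratio` are hypothetical helpers, not part of Kata. It assumes unlabelled samples without timestamps, which matches the monitor metrics listed in this table.

```python
def parse_samples(text):
    """Parse unlabelled samples (no timestamps) from Prometheus text format into a dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        samples[name] = float(value)
    return samples


def scrape_failure_ratio(samples):
    """Return failed scrapes / total scrapes, or 0.0 when nothing was scraped yet."""
    total = samples.get("kata_monitor_scrape_count", 0.0)
    failed = samples.get("kata_monitor_scrape_failed_count", 0.0)
    return failed / total if total else 0.0


# Illustrative input only; real values come from the monitor's metrics endpoint.
SAMPLE = """\
# TYPE kata_monitor_scrape_count counter
kata_monitor_scrape_count 40
# TYPE kata_monitor_scrape_failed_count counter
kata_monitor_scrape_failed_count 2
# TYPE kata_monitor_running_shim_count gauge
kata_monitor_running_shim_count 3
"""

if __name__ == "__main__":
    print(scrape_failure_ratio(parse_samples(SAMPLE)))
```

Because both metrics are counters, a production setup would more likely compare their rates over a time window than their absolute values.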
|
||||
### Kata containerd shim v2 metrics
|
||||
|
||||
Metrics about Kata containerd shim v2 process.
|
||||
|
||||
| Metric name | Type | Units | Labels | Introduced in Kata version |
|
||||
|---|---|---|---|---|
|
||||
| `kata_shim_agent_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (RPC actions of Kata agent)<ul><li>`grpc.CheckRequest`</li><li>`grpc.CloseStdinRequest`</li><li>`grpc.CopyFileRequest`</li><li>`grpc.CreateContainerRequest`</li><li>`grpc.CreateSandboxRequest`</li><li>`grpc.DestroySandboxRequest`</li><li>`grpc.ExecProcessRequest`</li><li>`grpc.GetMetricsRequest`</li><li>`grpc.GuestDetailsRequest`</li><li>`grpc.ListInterfacesRequest`</li><li>`grpc.ListProcessesRequest`</li><li>`grpc.ListRoutesRequest`</li><li>`grpc.MemHotplugByProbeRequest`</li><li>`grpc.OnlineCPUMemRequest`</li><li>`grpc.PauseContainerRequest`</li><li>`grpc.RemoveContainerRequest`</li><li>`grpc.ReseedRandomDevRequest`</li><li>`grpc.ResumeContainerRequest`</li><li>`grpc.SetGuestDateTimeRequest`</li><li>`grpc.SignalProcessRequest`</li><li>`grpc.StartContainerRequest`</li><li>`grpc.StatsContainerRequest`</li><li>`grpc.TtyWinResizeRequest`</li><li>`grpc.UpdateContainerRequest`</li><li>`grpc.UpdateInterfaceRequest`</li><li>`grpc.UpdateRoutesRequest`</li><li>`grpc.WaitProcessRequest`</li><li>`grpc.WriteStreamRequest`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_fds`: <br> Kata containerd shim v2 open FDs. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_gc_duration_seconds`: <br> A summary of the pause duration of garbage collection cycles. | `SUMMARY` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_goroutines`: <br> Number of goroutines that currently exist. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_info`: <br> Information about the Go environment. | `GAUGE` | | <ul><li>`sandbox_id`</li><li>`version` (golang version)<ul><li>`go1.13.9` (environment dependent variable)</li></ul></li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_alloc_bytes`: <br> Number of bytes allocated and still in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_alloc_bytes_total`: <br> Total number of bytes allocated, even if freed. | `COUNTER` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_buck_hash_sys_bytes`: <br> Number of bytes used by the profiling bucket hash table. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_frees_total`: <br> Total number of frees. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_gc_cpu_fraction`: <br> The fraction of this program's available CPU time used by the GC since the program started. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_gc_sys_bytes`: <br> Number of bytes used for garbage collection system metadata. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_alloc_bytes`: <br> Number of heap bytes allocated and still in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_idle_bytes`: <br> Number of heap bytes waiting to be used. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_inuse_bytes`: <br> Number of heap bytes that are in use. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_objects`: <br> Number of allocated objects. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_released_bytes`: <br> Number of heap bytes released to OS. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_heap_sys_bytes`: <br> Number of heap bytes obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_last_gc_time_seconds`: <br> Number of seconds since 1970 of last garbage collection. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_lookups_total`: <br> Total number of pointer lookups. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_mallocs_total`: <br> Total number of `mallocs`. | `COUNTER` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_mcache_inuse_bytes`: <br> Number of bytes in use by `mcache` structures. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_mcache_sys_bytes`: <br> Number of bytes used for `mcache` structures obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_mspan_inuse_bytes`: <br> Number of bytes in use by `mspan` structures. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_mspan_sys_bytes`: <br> Number of bytes used for `mspan` structures obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_next_gc_bytes`: <br> Number of heap bytes when next garbage collection will take place. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_other_sys_bytes`: <br> Number of bytes used for other system allocations. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_stack_inuse_bytes`: <br> Number of bytes in use by the stack allocator. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_stack_sys_bytes`: <br> Number of bytes obtained from system for stack allocator. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_memstats_sys_bytes`: <br> Number of bytes obtained from system. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_go_threads`: <br> Number of OS threads created. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_io_stat`: <br> Kata containerd shim v2 process IO statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/io`)<ul><li>`cancelledwritebytes`</li><li>`rchar`</li><li>`readbytes`</li><li>`syscr`</li><li>`syscw`</li><li>`wchar`</li><li>`writebytes`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_netdev`: <br> Kata containerd shim v2 network devices statistics. | `GAUGE` | | <ul><li>`interface` (network device name)</li><li>`item` (see `/proc/net/dev`)<ul><li>`recv_bytes`</li><li>`recv_compressed`</li><li>`recv_drop`</li><li>`recv_errs`</li><li>`recv_fifo`</li><li>`recv_frame`</li><li>`recv_multicast`</li><li>`recv_packets`</li><li>`sent_bytes`</li><li>`sent_carrier`</li><li>`sent_colls`</li><li>`sent_compressed`</li><li>`sent_drop`</li><li>`sent_errs`</li><li>`sent_fifo`</li><li>`sent_packets`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_pod_overhead_cpu`: <br> Kata Pod overhead for CPU resources (percent). | `GAUGE` | `percent` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_pod_overhead_memory_in_bytes`: <br> Kata Pod overhead for memory resources (bytes). | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_proc_stat`: <br> Kata containerd shim v2 process statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_proc_status`: <br> Kata containerd shim v2 process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpmd`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_resident_memory_bytes`: <br> Resident memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_virtual_memory_bytes`: <br> Virtual memory size in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_process_virtual_memory_max_bytes`: <br> Maximum amount of virtual memory available in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_rpc_durations_histogram_milliseconds`: <br> RPC latency distributions. | `HISTOGRAM` | `milliseconds` | <ul><li>`action` (Kata shim v2 actions)<ul><li>`checkpoint`</li><li>`close_io`</li><li>`connect`</li><li>`create`</li><li>`delete`</li><li>`exec`</li><li>`kill`</li><li>`pause`</li><li>`pids`</li><li>`resize_pty`</li><li>`resume`</li><li>`shutdown`</li><li>`start`</li><li>`state`</li><li>`stats`</li><li>`update`</li><li>`wait`</li></ul></li><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
| `kata_shim_threads`: <br> Kata containerd shim v2 process threads. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> | 2.0.0 |
|
||||
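Most shim metrics carry a `sandbox_id` label, and the `/proc`-derived ones add an `item` label. The sketch below shows one way to pull a single `item` out per sandbox from text-format output. The sample lines and sandbox names are invented, `parse_labelled` and `item_by_sandbox` are hypothetical helpers, and the parser deliberately ignores edge cases such as escaped quotes inside label values.

```python
import re

# Matches lines of the form: name{label="value",...} value
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_labelled(text):
    """Yield (metric name, labels dict, float value) for each labelled sample line."""
    for line in text.splitlines():
        match = _SAMPLE_RE.match(line.strip())
        if match:
            labels = dict(_LABEL_RE.findall(match.group("labels")))
            yield match.group("name"), labels, float(match.group("value"))


def item_by_sandbox(text, metric, item):
    """Map sandbox_id -> value for one `item` of a metric such as kata_shim_proc_status."""
    return {
        labels["sandbox_id"]: value
        for name, labels, value in parse_labelled(text)
        if name == metric and labels.get("item") == item and "sandbox_id" in labels
    }


# Illustrative input only.
SAMPLE = """\
kata_shim_proc_status{item="vmrss",sandbox_id="sandbox-a"} 1000
kata_shim_proc_status{item="vmsize",sandbox_id="sandbox-a"} 5000
kata_shim_proc_status{item="vmrss",sandbox_id="sandbox-b"} 2000
kata_shim_threads{sandbox_id="sandbox-a"} 12
"""

if __name__ == "__main__":
    print(item_by_sandbox(SAMPLE, "kata_shim_proc_status", "vmrss"))
```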
|
||||
|
||||
@@ -1,101 +0,0 @@
|
||||
# Kata API Design
|
||||
|
||||
To fulfill the [Kata design requirements](kata-design-requirements.md), and based on the discussion on [Virtcontainers API extensions](https://docs.google.com/presentation/d/1dbGrD1h9cpuqAPooiEgtiwWDGCYhVPdatq7owsKHDEQ), the Kata runtime library features the following APIs:
|
||||
- Sandbox-based top-level API
|
||||
- Storage and network hotplug API
|
||||
- Plugin frameworks for external proprietary Kata runtime extensions
|
||||
|
||||
## Sandbox Based API
|
||||
### Sandbox Management API
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`CreateSandbox(SandboxConfig, Factory)`| Create a sandbox and its containers, based on `SandboxConfig` and `Factory`. Return the `Sandbox` structure, but do not start the sandbox or its containers.|
|
||||
|
||||
### Sandbox Operation API
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`sandbox.Delete()`| Shut down the VM in which the sandbox runs, destroy the sandbox, and remove all persistent metadata.|
|
||||
|`sandbox.Monitor()`| Return a context handler for caller to monitor sandbox callbacks such as error termination.|
|
||||
|`sandbox.Release()`| Release a sandbox data structure, close connections to the agent, and quit any goroutines associated with the Sandbox. Mostly used for daemon restart.|
|
||||
|`sandbox.Start()`| Start a sandbox and the containers that make up the sandbox.|
|
||||
|`sandbox.Stats()`| Get the stats of a running sandbox, return a `SandboxStats` structure.|
|
||||
|`sandbox.Status()`| Get the status of the sandbox and containers, return a `SandboxStatus` structure.|
|
||||
|`sandbox.Stop(force)`| Stop a sandbox and destroy the containers in the sandbox. When `force` is true, ignore guest-related stop failures.|
|
||||
|`sandbox.CreateContainer(contConfig)`| Create new container in the sandbox with the `ContainerConfig` parameter. It will add new container config to `sandbox.config.Containers`.|
|
||||
|`sandbox.DeleteContainer(containerID)`| Delete a container from the sandbox by `containerID`, return a `Container` structure.|
|
||||
|`sandbox.EnterContainer(containerID, cmd)`| Run a new process in a container, executing the caller's `types.Cmd` command.|
|
||||
|`sandbox.KillContainer(containerID, signal, all)`| Signal a container in the sandbox by the `containerID`.|
|
||||
|`sandbox.PauseContainer(containerID)`| Pause a running container in the sandbox by the `containerID`.|
|
||||
|`sandbox.ProcessListContainer(containerID, options)`| List every process running inside a specific container in the sandbox, return a `ProcessList` structure.|
|
||||
|`sandbox.ResumeContainer(containerID)`| Resume a paused container in the sandbox by the `containerID`.|
|
||||
|`sandbox.StartContainer(containerID)`| Start a container in the sandbox by the `containerID`.|
|
||||
|`sandbox.StatsContainer(containerID)`| Get the stats of a running container, return a `ContainerStats` structure.|
|
||||
|`sandbox.StatusContainer(containerID)`| Get the status of a container in the sandbox, return a `ContainerStatus` structure.|
|
||||
|`sandbox.StopContainer(containerID, force)`| Stop a container in the sandbox by the `containerID`.|
|
||||
|`sandbox.UpdateContainer(containerID, resources)`| Update a running container in the sandbox.|
|
||||
|`sandbox.WaitProcess(containerID, processID)`| Wait on a process to terminate.|
|
||||
### Sandbox Hotplug API
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`sandbox.AddDevice(info)`| Add new storage device `DeviceInfo` to the sandbox, return a `Device` structure.|
|
||||
|`sandbox.AddInterface(inf)`| Add new NIC to the sandbox.|
|
||||
|`sandbox.RemoveInterface(inf)`| Remove a NIC from the sandbox.|
|
||||
|`sandbox.ListInterfaces()`| List all NICs and their configurations in the sandbox, return a `pbTypes.Interface` list.|
|
||||
|`sandbox.UpdateRoutes(routes)`| Update the sandbox route table (e.g. for portmapping support), return a `pbTypes.Route` list.|
|
||||
|`sandbox.ListRoutes()`| List the sandbox route table, return a `pbTypes.Route` list.|
|
||||
|
||||
### Sandbox Relay API
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`sandbox.WinsizeProcess(containerID, processID, Height, Width)`| Relay TTY resize request to a process.|
|
||||
|`sandbox.SignalProcess(containerID, processID, signalID, signalALL)`| Relay a signal to a process or all processes in a container.|
|
||||
|`sandbox.IOStream(containerID, processID)`| Relay a process stdio. Return stdin/stdout/stderr pipes to the process stdin/stdout/stderr streams.|
|
||||
|
||||
### Sandbox Monitor API
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`sandbox.GetOOMEvent()`| Monitor the OOM events that occur in the sandbox.|
|
||||
|`sandbox.UpdateRuntimeMetrics()`| Update the `shim/hypervisor` metrics of the running sandbox.|
|
||||
|`sandbox.GetAgentMetrics()`| Get metrics of the agent and the guest in the running sandbox.|
|
||||
|
||||
## Plugin framework for external proprietary Kata runtime extensions
|
||||
### Hypervisor plugin
|
||||
|
||||
TBD.
|
||||
### Metadata storage plugin
|
||||
The metadata storage plugin controls where sandbox metadata is saved.
|
||||
All metadata storage plugins must implement the following API:
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`storage.Save(key, value)`| Save a record.|
|
||||
|`storage.Load(key)`| Load a record.|
|
||||
|`storage.Delete(key)`| Delete a record.|
|
||||
|
||||
Built-in implementations include:
|
||||
- Filesystem storage
|
||||
- LevelDB storage
|
||||
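To make the three-call contract concrete, here is a language-neutral sketch of a storage backend, written in Python rather than the runtime's own Go. The method names are lower-cased versions of the table entries. The `bytes` value type, the `KeyError` on a missing record, and the `InMemoryStorage` class itself are assumptions made for the illustration and do not describe a built-in implementation.

```python
from typing import Dict, Protocol


class MetadataStorage(Protocol):
    """The three operations from the table above (names lower-cased for Python)."""

    def save(self, key: str, value: bytes) -> None: ...

    def load(self, key: str) -> bytes: ...

    def delete(self, key: str) -> None: ...


class InMemoryStorage:
    """Hypothetical backend that keeps records in a dict; not a built-in Kata plugin."""

    def __init__(self) -> None:
        self._records: Dict[str, bytes] = {}

    def save(self, key: str, value: bytes) -> None:
        # Saving under an existing key overwrites the previous record.
        self._records[key] = value

    def load(self, key: str) -> bytes:
        # Raises KeyError when the record does not exist.
        return self._records[key]

    def delete(self, key: str) -> None:
        # Deleting a missing record is treated as a no-op in this sketch.
        self._records.pop(key, None)
```

A filesystem or LevelDB backend would expose the same three calls and differ only in where the records end up.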
|
||||
### VM Factory plugin
|
||||
The VM factory plugin controls how a sandbox factory creates new VMs.
|
||||
All VM factory plugins must implement the following API:
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`VMFactory.NewVM(HypervisorConfig)`|Create a new VM based on `HypervisorConfig`.|
|
||||
|
||||
Built-in implementations include:
|
||||
|
||||
|Name|Description|
|
||||
|---|---|
|
||||
|`CreateNew()`| Create brand new VM based on `HypervisorConfig`.|
|
||||
|`CreateFromTemplate()`| Create new VM from template.|
|
||||
|`CreateFromCache()`| Create new VM from VM caches.|
|
||||
|
||||
### Sandbox Creation Plugin Workflow
|
||||

|
||||
|
||||
### Sandbox Connection Plugin Workflow
|
||||

|
||||
@@ -1,95 +0,0 @@
|
||||
## Design requirements
|
||||
|
||||
The Kata Containers runtime **MUST** fulfill all of the following requirements:
|
||||
|
||||
### OCI compatibility
|
||||
The Kata Containers runtime **MUST** implement the [OCI runtime specification](https://github.com/opencontainers/runtime-spec) and support all
|
||||
the OCI runtime operations.
|
||||
|
||||
### [`runc`](https://github.com/opencontainers/runc) CLI compatibility
|
||||
In theory, being OCI compatible should be enough. In practice, the Kata Containers runtime
|
||||
should comply with the latest *stable* `runc` CLI. In particular, it **MUST** implement the
|
||||
following `runc` commands:
|
||||
|
||||
* `create`
|
||||
* `delete`
|
||||
* `exec`
|
||||
* `kill`
|
||||
* `list`
|
||||
* `pause`
|
||||
* `ps`
|
||||
* `start`
|
||||
* `state`
|
||||
* `version`
|
||||
|
||||
The Kata Containers runtime **MUST** implement the following command line options:
|
||||
* `--console-socket`
|
||||
* `--pid-file`
|
||||
|
||||
### [CRI](http://blog.kubernetes.io/2016/12/container-runtime-interface-cri-in-kubernetes.html) and [Kubernetes](https://kubernetes.io) support
|
||||
The Kata Containers project **MUST** provide two interfaces for CRI shims to manage hardware
|
||||
virtualization based Kubernetes pods and containers:
|
||||
- An OCI and `runc` compatible command line interface, as described in the previous section.
|
||||
This interface is used by implementations such as [`CRI-O`](http://cri-o.io) and [`containerd`](https://github.com/containerd/containerd), for example.
|
||||
- A hardware virtualization runtime library API for CRI shims to consume and provide a more
|
||||
CRI native implementation. The [`frakti`](https://github.com/kubernetes/frakti) CRI shim is an example of such a consumer.
|
||||
|
||||
### Multiple hardware architectures support
|
||||
The Kata Containers runtime **MUST NOT** be architecture-specific. It should be able to support
|
||||
multiple hardware architectures and provide a modular and flexible design for adding support
|
||||
for additional ones.
|
||||
|
||||
### Multiple hypervisor support
|
||||
The Kata Containers runtime **MUST NOT** be tied to any specific hardware virtualization technology,
|
||||
hypervisor, or virtual machine monitor implementation.
|
||||
It should support multiple hypervisors and provide a pluggable and flexible design to add support
|
||||
for additional ones.
|
||||
|
||||
#### Nesting
|
||||
The Kata Containers runtime **MUST** support nested virtualization environments.
|
||||
|
||||
### Networking
|
||||
|
||||
* The Kata Containers runtime **MUST** support CNI plugins.
|
||||
* The Kata Containers runtime **MUST** support both legacy and IPv6 networks.
|
||||
|
||||
### I/O
|
||||
|
||||
#### Devices direct assignment
|
||||
In order for containers to directly consume host hardware resources, the Kata Containers runtime
|
||||
**MUST** provide containers with secure pass through for generic devices such as GPUs, SRIOV,
|
||||
RDMA, QAT, by leveraging I/O virtualization technologies (IOMMU, interrupt remapping).
|
||||
|
||||
#### Acceleration
|
||||
The Kata Containers runtime **MUST** support accelerated and user-space-based I/O operations
|
||||
for networking (e.g. DPDK) as well as storage through `vhost-user` sockets.
|
||||
|
||||
#### Scalability
|
||||
The Kata Containers runtime **MUST** support scalable I/O through the SRIOV technology.
|
||||
|
||||
|
||||
### Virtualization overhead reduction
|
||||
A compelling aspect of containers is their minimal overhead compared to bare metal applications.
|
||||
A container runtime should keep the overhead to a minimum in order to provide the expected user
|
||||
experience.
|
||||
The Kata Containers runtime implementation **SHOULD** be optimized for:
|
||||
|
||||
* Minimal workload boot and shutdown times
|
||||
* Minimal workload memory footprint
|
||||
* Maximal networking throughput
|
||||
* Minimal networking latency
|
||||
|
||||
### Testing and debugging
|
||||
|
||||
#### Continuous Integration
|
||||
Each Kata Containers runtime pull request **MUST** pass at least the following set of container-related
|
||||
tests:
|
||||
|
||||
* Unit tests: runtime unit tests coverage >75%
|
||||
* Functional tests: the entire runtime CLI and APIs
|
||||
* Integration tests: Docker and Kubernetes
|
||||
|
||||
#### Debugging
|
||||
|
||||
The Kata Containers runtime implementation **MUST** use structured logging in order to namespace
|
||||
log messages to facilitate debugging.
|
||||
@@ -1,93 +0,0 @@
|
||||
# Background
|
||||
|
||||
[Research](https://www.usenix.org/conference/fast16/technical-sessions/presentation/harter) shows that pulling the image accounts for 76% of container startup time, yet only 6.4% of the pulled data is actually read. Fetching data on demand (lazy loading) can therefore speed up container startup. [`Nydus`](https://github.com/dragonflyoss/image-service) is a project that builds images in a new format so that data can be fetched on demand when a container starts.
|
||||
|
||||
The following benchmark result shows the improvement in container cold-startup elapsed time on containerd compared with an OCI image. As the OCI image size increases, the startup time of a container using a `nydus` image remains very short. [Click here](https://github.com/dragonflyoss/image-service/blob/master/docs/nydus-design.md) to see the `nydus` design.
|
||||
|
||||

|
||||
|
||||
## Proposal - Bring `lazyload` ability to Kata Containers
|
||||
|
||||
`Nydusd` is a FUSE/`virtiofs` daemon provided by the `nydus` project. It natively supports `PassthroughFS` and [RAFS](https://github.com/dragonflyoss/image-service/blob/master/docs/nydus-design.md) (Registry Acceleration File System), so in Kata Containers we can use `nydusd` in place of `virtiofsd` and, at the same time, mount the `nydus` image into the guest.
|
||||
|
||||
The process of creating/starting Kata Containers with `virtiofsd`:
|
||||
|
||||
1. When creating a sandbox, the Kata Containers Containerd v2 [shim](https://github.com/kata-containers/kata-containers/blob/main/docs/design/architecture/README.md#runtime) launches `virtiofsd` before the VM starts and shares directories with the VM.
|
||||
2. When creating a container, the Kata Containers Containerd v2 shim mounts the rootfs to `kataShared` (`/run/kata-containers/shared/sandboxes/<SANDBOX>/mounts/<CONTAINER>/rootfs`), so it is visible in the guest at the path `/run/kata-containers/shared/containers/shared/<CONTAINER>/rootfs` and is used as the container's rootfs.
|
||||
|
||||
The process of creating/starting Kata Containers with `nydusd`:
|
||||
|
||||

|
||||
|
||||
1. When creating a sandbox, the Kata Containers Containerd v2 shim launches the `nydusd` daemon before the VM starts.
|
||||
After the VM starts, `kata-agent` mounts `virtiofs` at the path `/run/kata-containers/shared`, and the Kata Containers Containerd v2 shim mounts a `passthroughfs` filesystem at the path `/run/kata-containers/shared/containers`.
|
||||
|
||||
```bash
|
||||
# start nydusd
|
||||
$ sandbox_id=my-test-sandbox
|
||||
$ sudo /usr/local/bin/nydusd --log-level info --sock /run/vc/vm/${sandbox_id}/vhost-user-fs.sock --apisock /run/vc/vm/${sandbox_id}/api.sock
|
||||
```
|
||||
|
||||
```bash
|
||||
# source: the host sharedir which will pass through to guest
|
||||
$ sudo curl -v --unix-socket /run/vc/vm/${sandbox_id}/api.sock \
|
||||
-X POST "http://localhost/api/v1/mount?mountpoint=/containers" -H "accept: */*" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"source":"/path/to/sharedir",
|
||||
"fs_type":"passthrough_fs",
|
||||
"config":""
|
||||
}'
|
||||
```
|
||||
|
||||
2. When creating a normal container, the Kata Containers Containerd v2 shim sends a request to `nydusd` to mount `rafs` at the path `/run/kata-containers/shared/rafs/<container_id>/lowerdir` in the guest.
|
||||
|
||||
```bash
|
||||
# source: the metafile of nydus image
|
||||
# config: the config of this image
|
||||
$ sudo curl --unix-socket /run/vc/vm/${sandbox_id}/api.sock \
|
||||
-X POST "http://localhost/api/v1/mount?mountpoint=/rafs/<container_id>/lowerdir" -H "accept: */*" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"source":"/path/to/bootstrap",
|
||||
"fs_type":"rafs",
|
||||
"config":"{\"device\":{\"backend\":{\"type\":\"localfs\",\"config\":{\"dir\":\"blobs\"}},\"cache\":{\"type\":\"blobcache\",\"config\":{\"work_dir\":\"cache\"}}},\"mode\":\"direct\",\"digest_validate\":true}"
|
||||
}'
|
||||
```
|
||||
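The `config` field of the `rafs` mount request is a JSON document embedded as a string, which is easy to mis-escape by hand. The sketch below builds the request body programmatically, assuming the body shape shown in the `curl` example above. `rafs_mount_body` and `RAFS_CONFIG` are illustrative names, not part of `nydusd` or Kata.

```python
import json


def rafs_mount_body(bootstrap_path, rafs_config):
    """Build the JSON body for a `rafs` mount request; `config` is itself a JSON string."""
    return json.dumps(
        {
            "source": bootstrap_path,
            "fs_type": "rafs",
            # Serialising the inner object first yields the escaped string form.
            "config": json.dumps(rafs_config, separators=(",", ":")),
        }
    )


# Same settings as the curl example above, written as a plain dict.
RAFS_CONFIG = {
    "device": {
        "backend": {"type": "localfs", "config": {"dir": "blobs"}},
        "cache": {"type": "blobcache", "config": {"work_dir": "cache"}},
    },
    "mode": "direct",
    "digest_validate": True,
}

if __name__ == "__main__":
    print(rafs_mount_body("/path/to/bootstrap", RAFS_CONFIG))
```

Serialising twice keeps the inner quotes correctly escaped and avoids hand-editing mistakes such as a duplicated key or a trailing comma, either of which makes the body invalid JSON.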
|
||||
The Kata Containers Containerd v2 shim will also bind mount the `snapshotdir` assigned by `nydus-snapshotter` to `sharedir`.
|
||||
So in the guest, the container rootfs = overlay(`lowerdir=rafs`, `upperdir=snapshotdir/fs`, `workdir=snapshotdir/work`).
|
||||
|
||||
> How is the `rafs` info transferred from `nydus-snapshotter` to the Kata Containers Containerd v2 shim?
|
||||
|
||||
By default, when creating a container from an `OCI` image, `nydus-snapshotter` returns the [`struct` Mount slice](https://github.com/containerd/containerd/blob/main/mount/mount.go#L21) below to containerd, and containerd uses it to mount the rootfs:
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
Type: "overlay",
|
||||
Source: "overlay",
|
||||
Options: [lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_A>/mnt,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/work],
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
We could append the `rafs` info to `Options`, but then the mount would fail, because containerd cannot interpret the `rafs` info. Instead, we can follow the [containerd mount helper](https://github.com/containerd/containerd/blob/main/mount/mount_linux.go#L42) mechanism and provide a binary called `nydus-overlayfs`. The `Mount` slice returned by `nydus-snapshotter` then becomes:
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
Type: "fuse.nydus-overlayfs",
|
||||
Source: "overlay",
|
||||
Options: [lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_A>/mnt,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/<snapshot_B>/work,extraoption=base64({source:xxx,config:xxx,snapshotdir:xxx})],
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
When containerd finds that `Type` is `fuse.nydus-overlayfs`:
|
||||
|
||||
1. containerd calls the `mount.fuse` command.
|
||||
2. `mount.fuse` calls `nydus-overlayfs`.
|
||||
3. `nydus-overlayfs` ignores the `extraoption` and performs the overlay mount.
|
||||
|
||||
Finally, the Kata Containers Containerd v2 shim parses `extraoption` to obtain the `rafs` info and mount the image in the guest.
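The round trip of the `extraoption` entry can be sketched as follows. The proposal only states that the payload is base64-encoded and carries `source`, `config`, and `snapshotdir`. The JSON encoding of the payload, and the helper names `encode_extraoption` and `split_options`, are assumptions made for this illustration.

```python
import base64
import json

EXTRA_PREFIX = "extraoption="


def encode_extraoption(source, config, snapshotdir):
    """Pack the rafs info into a single mount option (assumes a JSON payload)."""
    payload = json.dumps(
        {"source": source, "config": config, "snapshotdir": snapshotdir}
    )
    return EXTRA_PREFIX + base64.b64encode(payload.encode("utf-8")).decode("ascii")


def split_options(options):
    """Return (options without extraoption, decoded rafs info or None)."""
    plain, extra = [], None
    for opt in options:
        if opt.startswith(EXTRA_PREFIX):
            # The shim-side step: decode the payload to recover the rafs info.
            extra = json.loads(base64.b64decode(opt[len(EXTRA_PREFIX):]))
        else:
            # The nydus-overlayfs-side view: ordinary overlay options only.
            plain.append(opt)
    return plain, extra
```

In this sketch the first element of the returned tuple corresponds to what `nydus-overlayfs` would pass on to the overlay mount, and the second to what the shim would use to mount the image in the guest.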
|
||||
@@ -1,5 +0,0 @@
|
||||
# Design proposals
|
||||
|
||||
Kata Containers design proposal documents:
|
||||
|
||||
- [Kata Containers tracing](tracing-proposals.md)
|
||||
@@ -1,213 +0,0 @@
|
||||
# Kata Tracing proposals
|
||||
|
||||
## Overview
|
||||
|
||||
This document summarises a set of proposals triggered by the
|
||||
[tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
## Required context
|
||||
|
||||
This section explains some terminology required to understand the proposals.
|
||||
Further details can be found in the
|
||||
[tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
### Agent trace mode terminology
|
||||
|
||||
| Trace mode | Description | Use-case |
|
||||
|-|-|-|
|
||||
| Static | Trace agent from startup to shutdown | Entire lifespan |
|
||||
| Dynamic | Toggle tracing on/off as desired | On-demand "snapshot" |
|
||||
|
||||
### Agent trace type terminology
|
||||
|
||||
| Trace type | Description | Use-case |
|
||||
|-|-|-|
|
||||
| isolated | traces all relate to single component | Observing lifespan |
|
||||
| collated | traces "grouped" (runtime+agent) | Understanding component interaction |
|
||||
|
||||
### Container lifespan
|
||||
|
||||
| Lifespan | trace mode | trace type |
|
||||
|-|-|-|
|
||||
| short-lived | static | collated if possible, else isolated? |
|
||||
| long-running | dynamic | collated? (to see interactions) |
|
||||
|
||||
## Original plan for agent
|
||||
|
||||
- Implement all trace types and trace modes for agent.
|
||||
|
||||
- Why?
|
||||
- Maximum flexibility.
|
||||
|
||||
> **Counterargument:**
|
||||
>
|
||||
> Due to the intrusive nature of adding tracing, we have
|
||||
> learnt that landing small incremental changes is simpler and quicker!
|
||||
|
||||
- Compatibility with [Kata 1.x tracing][kata-1x-tracing].
|
||||
|
||||
> **Counterargument:**
|
||||
>
|
||||
> Agent tracing in Kata 1.x was extremely awkward to set up (to the extent
|
||||
> that it's unclear how many users actually used it!)
|
||||
>
|
||||
> This point, coupled with the new architecture for Kata 2.x, suggests
|
||||
> that we may not need to supply the same set of tracing features (in fact
|
||||
> they may not make sense).
|
||||
|
||||
## Agent tracing proposals
|
||||
|
||||
### Agent tracing proposal 1: Don't implement dynamic trace mode
|
||||
|
||||
- All tracing will be static.
|
||||
|
||||
- Why?
|
||||
- Because dynamic tracing will always be "partial"
|
||||
|
||||
> In fact, not only would it be only a "snapshot" of activity, it may not
|
||||
> even be possible to create a complete "trace transaction". If this is
|
||||
> true, the trace output would be partial and would appear "unstructured".
|
||||
|
||||
### Agent tracing proposal 2: Simplify handling of trace type
|
||||
|
||||
- Agent tracing will be "isolated" by default.
|
||||
- Agent tracing will be "collated" if runtime tracing is also enabled.
|
||||
|
||||
- Why?
|
||||
- Offers a graceful fallback for agent tracing if runtime tracing disabled.
|
||||
- Simpler code!
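
The fallback rule in proposal 2 can be written as a small decision function. This is only a sketch of the rule as stated above; the function name and its boolean parameters are hypothetical and do not correspond to actual agent configuration options.

```python
def agent_trace_type(agent_tracing_enabled, runtime_tracing_enabled):
    """Return the agent trace type implied by proposal 2, or None if tracing is off."""
    if not agent_tracing_enabled:
        return None
    # Collated traces need runtime spans to group with; otherwise fall back.
    return "collated" if runtime_tracing_enabled else "isolated"


print(agent_trace_type(True, False))
print(agent_trace_type(True, True))
```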
|
||||
|
||||
## Questions to ask yourself (part 1)
|
||||
|
||||
- Are your containers long-running or short-lived?
|
||||
|
||||
- Would you ever need to turn on tracing "briefly"?
|
||||
- If "yes", is a "partial trace" useful or useless?
|
||||
|
||||
> Likely to be considered useless as it is a partial snapshot.
|
||||
> Alternative tracing methods may be more appropriate to dynamic
|
||||
> OpenTelemetry tracing.
|
||||
|
||||
## Questions to ask yourself (part 2)
|
||||
|
||||
- Are you happy to stop a container to enable tracing?
|
||||
If "no", dynamic tracing may be required.
|
||||
|
||||
- Would you ever want to trace the agent and the runtime "in isolation" at the
|
||||
same time?
|
||||
  - If "yes", we need to fully implement `trace_type=isolated`
|
||||
|
||||
> This seems unlikely though.
|
||||
|
||||
## Trace collection
|
||||
|
||||
The second set of proposals affect the way traces are collected.
|
||||
|
||||
### Motivation
|
||||
|
||||
Currently:
|
||||
|
||||
- The runtime sends trace spans to Jaeger directly.
|
||||
- The agent will send trace spans to the [`trace-forwarder`][trace-forwarder] component.
|
||||
- The trace forwarder will send trace spans to Jaeger.
|
||||
|
||||
Kata agent tracing overview:
|
||||
|
||||
```
|
||||
+-------------------------------------------+
|
||||
| Host |
|
||||
| |
|
||||
| +-----------+ |
|
||||
| | Trace | |
|
||||
| | Collector | |
|
||||
| +-----+-----+ |
|
||||
| ^ +--------------+ |
|
||||
| | spans | Kata VM | |
|
||||
| +-----+-----+ | | |
|
||||
| | Kata | spans | +-----+ | |
|
||||
| | Trace |<-----------------|Kata | | |
|
||||
| | Forwarder | VSOCK | |Agent| | |
|
||||
| +-----------+ Channel | +-----+ | |
|
||||
| +--------------+ |
|
||||
+-------------------------------------------+
|
||||
```
|
||||
|
||||
Currently:
|
||||
|
||||
- If agent tracing is enabled but the trace forwarder is not running,
|
||||
the agent will error.
|
||||
|
||||
- If the trace forwarder is started but Jaeger is not running,
|
||||
the trace forwarder will error.
|
||||
|
||||
### Goals
|
||||
|
||||
- The runtime and agent should:
|
||||
- Use the same trace collection implementation.
|
||||
  - Use the most common configuration items.
|
||||
|
||||
- Kata should support more trace collection software or `SaaS`
|
||||
(for example `Zipkin`, `datadog`).
|
||||
|
||||
- Trace collection should not block normal runtime/agent operations
|
||||
(for example if `vsock-exporter`/Jaeger is not running, Kata Containers should work normally).
|
||||
|
||||
### Trace collection proposals
|
||||
|
||||
#### Trace collection proposal 1: Send all spans to the trace forwarder as a span proxy
|
||||
|
||||
The Kata runtime and agent both send spans to the trace forwarder, and the trace forwarder,
|
||||
acting as a tracing proxy, sends all spans to a tracing back-end, such as Jaeger or `datadog`.
|
||||
|
||||
**Pros:**
|
||||
|
||||
- Runtime/agent will be simple.
|
||||
- The trace collection target could be updated while Kata Containers are running.
|
||||
|
||||
**Cons:**
|
||||
|
||||
- Requires the trace forwarder component to be running (which adds an operational burden).
|
||||
|
||||
#### Trace collection proposal 2: Send spans to collector directly from runtime/agent
|
||||
|
||||
Send spans to the collector directly from the runtime/agent. This proposal requires
|
||||
the collector to be reachable over the network.
|
||||
|
||||
**Pros:**
|
||||
|
||||
- No additional trace forwarder component needed.
|
||||
|
||||
**Cons:**
|
||||
|
||||
- Need more code/configuration to support all trace collectors.
|
||||
|
||||
## Future work
|
||||
|
||||
- We could add dynamic and fully isolated tracing at a later stage,
|
||||
if required.
|
||||
|
||||
## Further details
|
||||
|
||||
- See the new [GitHub project](https://github.com/orgs/kata-containers/projects/28).
|
||||
- [kata-containers-tracing-status](https://gist.github.com/jodh-intel/0ee54d41d2a803ba761e166136b42277) gist.
|
||||
- [tracing documentation PR][tracing-doc-pr].
|
||||
|
||||
## Summary
|
||||
|
||||
### Time line
|
||||
|
||||
- 2021-07-01: A summary of the discussion was
|
||||
[posted to the mail list](http://lists.katacontainers.io/pipermail/kata-dev/2021-July/001996.html).
|
||||
- 2021-06-22: These proposals were
|
||||
[discussed in the Kata Architecture Committee meeting](https://etherpad.opendev.org/p/Kata_Containers_2021_Architecture_Committee_Mtgs).
|
||||
- 2021-06-18: These proposals were
|
||||
[announced on the mailing list](http://lists.katacontainers.io/pipermail/kata-dev/2021-June/001980.html).
|
||||
|
||||
### Outcome
|
||||
|
||||
- Nobody opposed the agent proposals, so they are being implemented.
|
||||
- The trace collection proposals are still being considered.
|
||||
|
||||
[kata-1x-tracing]: https://github.com/kata-containers/agent/blob/master/TRACING.md
|
||||
[trace-forwarder]: /src/tools/trace-forwarder
|
||||
[tracing-doc-pr]: https://github.com/kata-containers/kata-containers/pull/1937
|
||||
@@ -1,193 +0,0 @@
|
||||
# Virtual machine vCPU sizing in Kata Containers
|
||||
|
||||
## Default number of virtual CPUs
|
||||
|
||||
Before starting a container, the [runtime][6] reads the `default_vcpus` option
|
||||
from the [configuration file][7] to determine the number of virtual CPUs
|
||||
(vCPUs) needed to start the virtual machine. By default, `default_vcpus` is
|
||||
equal to 1 for fast boot time and a small memory footprint per virtual machine.
|
||||
Be aware that increasing this value negatively impacts the virtual machine's
|
||||
boot time and memory footprint.
|
||||
In general, we recommend that you do not edit this variable, unless you know
|
||||
what you are doing. If your container needs more than one vCPU, use
|
||||
[docker `--cpus`][1], [docker update][4], or [Kubernetes `cpu` limits][2] to
|
||||
assign more resources.
|
||||
|
||||
*Docker*
|
||||
|
||||
```sh
|
||||
$ docker run --name foo -ti --cpus 2 debian bash
|
||||
$ docker update --cpus 4 foo
|
||||
```
|
||||
|
||||
|
||||
*Kubernetes*
|
||||
|
||||
```yaml
|
||||
# ~/cpu-demo.yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: cpu-demo
|
||||
namespace: sandbox
|
||||
spec:
|
||||
containers:
|
||||
- name: cpu0
|
||||
image: vish/stress
|
||||
resources:
|
||||
limits:
|
||||
cpu: "3"
|
||||
args:
|
||||
- -cpus
|
||||
- "5"
|
||||
```
|
||||
|
||||
```sh
|
||||
$ sudo -E kubectl create -f ~/cpu-demo.yaml
|
||||
```
|
||||
|
||||
## Virtual CPUs and Kubernetes pods
|
||||
|
||||
A Kubernetes pod is a group of one or more containers, with shared storage and
|
||||
network, and a specification for how to run the containers [[specification][3]].
|
||||
In Kata Containers this group of containers, which is called a sandbox, runs inside
|
||||
the same virtual machine. If you do not specify a CPU constraint, the runtime does
|
||||
not add more vCPUs and the container is not placed inside a CPU cgroup.
|
||||
Instead, the container uses the number of vCPUs specified by `default_vcpus`
|
||||
and shares these resources with other containers in the same situation
|
||||
(without a CPU constraint).
|
||||
|
||||
## Container lifecycle
|
||||
|
||||
When you create a container with a CPU constraint, the runtime adds the
|
||||
number of vCPUs required by the container. Similarly, when the container terminates,
|
||||
the runtime removes these resources.
|
||||
|
||||
## Container without CPU constraint
|
||||
|
||||
A container without a CPU constraint uses the default number of vCPUs specified
|
||||
in the configuration file. In the case of Kubernetes pods, containers without a
|
||||
CPU constraint use and share between them the default number of vCPUs. For
|
||||
example, if `default_vcpus` is equal to 1 and you have 2 containers without CPU
|
||||
constraints with each container trying to consume 100% of vCPU, the resources
|
||||
are split in two, 50% of the vCPU for each container, because your virtual
|
||||
machine does not have enough resources to satisfy the containers' needs. If you want
|
||||
to give access to a greater or lesser portion of vCPUs to a specific container,
|
||||
use [`docker --cpu-shares`][1] or [Kubernetes `cpu` requests][2].
|
||||
|
||||
*Docker*
|
||||
|
||||
```sh
|
||||
$ docker run -ti --cpu-shares=512 debian bash
|
||||
```
|
||||
|
||||
*Kubernetes*
|
||||
|
||||
```yaml
|
||||
# ~/cpu-demo.yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: cpu-demo
|
||||
namespace: sandbox
|
||||
spec:
|
||||
containers:
|
||||
- name: cpu0
|
||||
image: vish/stress
|
||||
resources:
|
||||
requests:
|
||||
cpu: "0.7"
|
||||
args:
|
||||
- -cpus
|
||||
- "3"
|
||||
```
|
||||
|
||||
```sh
|
||||
$ sudo -E kubectl create -f ~/cpu-demo.yaml
|
||||
```
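
As a worked illustration of the split described in this section, the sketch below computes how much of the shared vCPUs each unconstrained container gets when all of them are fully busy. It assumes the split is proportional to each container's CPU shares (1024 is the Docker default); the function name `cpu_split` and the container names are hypothetical.

```python
def cpu_split(shares, vcpus=1):
    """Return the vCPU portion each busy container receives, proportional to its shares."""
    total = sum(shares.values())
    return {name: vcpus * value / total for name, value in shares.items()}


# Two containers with equal shares on one vCPU: half a vCPU each.
equal = cpu_split({"c1": 1024, "c2": 1024})

# Lowering one container's shares to 512 gives it one third of the vCPU.
uneven = cpu_split({"c1": 512, "c2": 1024})

print(equal)
print(uneven)
```

The proportions only apply under contention; an otherwise idle vCPU can be used entirely by a single container.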
|
||||
|
||||
Before running containers without CPU constraint, consider that your containers
|
||||
are not running alone. Since your containers run inside a virtual machine other
|
||||
processes use the vCPUs as well (e.g. `systemd` and the Kata Containers
|
||||
[agent][5]). In general, we recommend setting `default_vcpus` equal to 1 to
|
||||
allow non-container processes to run on this vCPU and to specify a CPU
|
||||
constraint for each container. If your container is already running and needs
|
||||
more vCPUs, you can add more using [docker update][4].
|
||||
|
||||
## Container with CPU constraint
|
||||
|
||||
The runtime calculates the number of vCPUs required by a container with CPU
|
||||
constraints using the following formula: `vCPUs = ceiling( quota / period )`, where
|
||||
`quota` specifies the number of microseconds per CPU Period that the container is
|
||||
guaranteed CPU access and `period` specifies the CPU CFS scheduler period of time
|
||||
in microseconds. The result determines the number of vCPUs to hot plug into the
|
||||
virtual machine. Once the vCPUs have been added, the [agent][5] places the
|
||||
container inside a CPU cgroup. This placement allows the container to use only
|
||||
its assigned resources.
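
The formula can be checked with a short sketch. The function name is hypothetical, and treating a non-positive quota (such as the `-1` reported for an unconstrained container) as "no vCPUs to add" is an assumption made for the example.

```python
def vcpus_for_container(quota_us, period_us):
    """Compute ceiling(quota / period) using integer arithmetic."""
    if quota_us <= 0 or period_us <= 0:
        # Assumption: no CPU constraint means no vCPUs are hot plugged.
        return 0
    return (quota_us + period_us - 1) // period_us


print(vcpus_for_container(400000, 100000))  # quota of 4 full periods
print(vcpus_for_container(150000, 100000))  # 1.5 periods rounds up
print(vcpus_for_container(-1, 100000))      # unconstrained
```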
|
||||
|
||||
## Do not waste resources
|
||||
|
||||
If you already know the number of vCPUs needed for each container and pod, or
|
||||
just want to run them with the same number of vCPUs, you can specify that
|
||||
number using the `default_vcpus` option in the configuration file; each virtual
|
||||
machine starts with that number of vCPUs. One limitation of this approach is
|
||||
that these vCPUs cannot be removed later and you might be wasting
|
||||
resources. For example, if you set `default_vcpus` to 8 and run only one
|
||||
container with a CPU constraint of 1 vCPU, you might be wasting 7 vCPUs since
|
||||
the virtual machine starts with 8 vCPUs and 1 vCPU is added and assigned
|
||||
to the container. Non-container processes might be able to use 8 vCPUs but they
|
||||
use a maximum of 1 vCPU, hence 7 vCPUs might not be used.
|
||||
|
||||
|
||||
*Container without CPU constraint*
|
||||
|
||||
```sh
|
||||
$ docker run -ti debian bash -c "nproc; cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_*"
|
||||
1 # number of vCPUs
|
||||
100000 # cfs period
|
||||
-1 # cfs quota
|
||||
```
|
||||
|
||||
*Container with CPU constraint*
|
||||
|
||||
```sh
|
||||
$ docker run --cpus 4 -ti debian bash -c "nproc; cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_*"
|
||||
5 # number of vCPUs
|
||||
100000 # cfs period
|
||||
400000 # cfs quota
|
||||
```
|
||||
|
||||
## Virtual CPU handling without hotplug
|
||||
|
||||
In some cases, the hardware and/or software architecture being utilized does not support
|
||||
hotplug. For example, Firecracker VMM does not support CPU or memory hotplug. Similarly,
|
||||
the current Linux Kernel for aarch64 does not support CPU or memory hotplug. To appropriately
|
||||
size the virtual machine for the workload within the container or pod, we provide a `static_sandbox_resource_mgmt`
|
||||
flag within the Kata Containers configuration. When this is set, the runtime will:
|
||||
- Size the VM based on the workload requirements as well as the `default_vcpus` option specified in the configuration.
|
||||
- Not resize the virtual machine after it has been launched.
|
||||
|
||||
VM size determination varies depending on the type of container being run, and may not always
|
||||
be available. If workload sizing information is not available, the virtual machine will be started with the
|
||||
`default_vcpus`.
|
||||
|
||||
In the case of a pod, the initial sandbox container (pause container) typically doesn't contain any resource
|
||||
information in its runtime `spec`. It is possible that the upper layer runtime
|
||||
(i.e. containerd or CRI-O) may pass sandbox sizing annotations within the pause container's
|
||||
`spec`. If these are provided, we will use this to appropriately size the VM. In particular,
|
||||
we'll calculate the number of CPUs required for the workload, augment this by the `default_vcpus`
|
||||
configuration option, and use this for the virtual machine size.
|
||||
|
||||
In the case of a single container (i.e., not a pod), if the container specifies resource requirements,
|
||||
the container's `spec` will provide the sizing information directly. If these are set, we will
|
||||
calculate the number of CPUs required for the workload, augment this by the `default_vcpus`
|
||||
configuration option, and use this for the virtual machine size.
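
The static sizing rule described in this section can be sketched as follows. It assumes the workload requirement is expressed as a CFS quota and period and reuses the `ceiling( quota / period )` calculation; the function name and its parameters are hypothetical and do not reflect the runtime's actual code or annotation names.

```python
def static_vm_vcpus(default_vcpus, quota_us=None, period_us=None):
    """Return the vCPU count to boot the VM with when hotplug is unavailable."""
    if not quota_us or not period_us or quota_us <= 0 or period_us <= 0:
        # No usable sizing information: start with default_vcpus only.
        return default_vcpus
    workload_vcpus = (quota_us + period_us - 1) // period_us
    # The workload requirement is added on top of default_vcpus.
    return default_vcpus + workload_vcpus


print(static_vm_vcpus(1))                  # no sizing information
print(static_vm_vcpus(1, 400000, 100000))  # workload needs 4 vCPUs
print(static_vm_vcpus(2, 250000, 100000))  # workload needs 3 vCPUs
```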
|
||||
|
||||
|
||||
[1]: https://docs.docker.com/config/containers/resource_constraints/#cpu
|
||||
[2]: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
|
||||
[3]: https://kubernetes.io/docs/concepts/workloads/pods/pod/
|
||||
[4]: https://docs.docker.com/engine/reference/commandline/update/
|
||||
[5]: ../../src/agent
|
||||
[6]: ../../src/runtime
|
||||
[7]: ../../src/runtime/README.md#configuration
|
||||
@@ -1,122 +0,0 @@
|
||||
# Virtualization in Kata Containers
|
||||
|
||||
In Kata Containers, a second layer of isolation is created on top of those provided by traditional namespace-containers. The
|
||||
hardware virtualization interface is the basis of this additional layer. Kata will launch a lightweight virtual machine,
|
||||
and use the guest’s Linux kernel to create a container workload, or workloads in the case of multi-container pods. In Kubernetes
|
||||
and in the Kata implementation, the sandbox is carried out at the pod level. In Kata, this sandbox is created using a virtual machine.
|
||||
|
||||
This document describes how Kata Containers maps container technologies to virtual machine technologies, and how this is realized in
|
||||
the multiple hypervisors and virtual machine monitors that Kata supports.
|
||||
|
||||
## Mapping container concepts to virtual machine technologies
|
||||
|
||||
A typical deployment of Kata Containers will be in Kubernetes by way of a Container Runtime Interface (CRI) implementation. On every node,
|
||||
Kubelet will interact with a CRI implementer (such as containerd or CRI-O), which will in turn interface with Kata Containers (an OCI based runtime).
|
||||
|
||||
The CRI API, as defined at the [Kubernetes CRI-API repo](https://github.com/kubernetes/cri-api/), implies a few constructs being supported by the
|
||||
CRI implementation, and ultimately in Kata Containers. In order to support the full [API](https://github.com/kubernetes/cri-api/blob/a6f63f369f6d50e9d0886f2eda63d585fbd1ab6a/pkg/apis/runtime/v1alpha2/api.proto#L34-L110) with the CRI-implementer, Kata must provide the following constructs:
|
||||
|
||||

|
||||
|
||||
These constructs can then be further mapped to what devices are necessary for interfacing with the virtual machine:
|
||||
|
||||

|
||||
|
||||
Ultimately, these concepts map to specific para-virtualized devices or virtualization technologies.
|
||||
|
||||

|
||||
|
||||
Each hypervisor or VMM varies on how or if it handles each of these.
|
||||
|
||||
## Kata Containers Hypervisor and VMM support
|
||||
|
||||
Kata Containers [supports multiple hypervisors](../hypervisors.md).
|
||||
|
||||
Details of each solution and a summary are provided below.
|
||||
|
||||
### QEMU/KVM
|
||||
|
||||
Kata Containers with QEMU has complete compatibility with Kubernetes.
|
||||
|
||||
Depending on the host architecture, Kata Containers supports various machine types,
|
||||
for example `pc` and `q35` on x86 systems, `virt` on ARM systems and `pseries` on IBM Power systems. The default Kata Containers
|
||||
machine type is `q35`. The machine type and its [`Machine accelerators`](#machine-accelerators) can
|
||||
be changed by editing the runtime [`configuration`](architecture/README.md#configuration) file.
|
||||
|
||||
Devices and features used:
|
||||
- virtio VSOCK or virtio serial
|
||||
- virtio block or virtio SCSI
|
||||
- [virtio net](https://www.redhat.com/en/virtio-networking-series)
|
||||
- virtio fs or virtio 9p (recommended: virtio fs)
|
||||
- VFIO
|
||||
- hotplug
|
||||
- machine accelerators
|
||||
|
||||
Machine accelerators and hotplug are used in Kata Containers to manage resource constraints, improve boot time and reduce memory footprint. These are documented below.
|
||||
|
||||
#### Machine accelerators
|
||||
|
||||
Machine accelerators are architecture specific and can be used to improve the performance
|
||||
and enable specific features of the machine types. The following machine accelerators
|
||||
are used in Kata Containers:
|
||||
|
||||
- NVDIMM: This machine accelerator is x86 specific and only supported by `pc` and
|
||||
`q35` machine types. `nvdimm` is used to provide the root filesystem as a persistent
|
||||
memory device to the Virtual Machine.
|
||||
|
||||
#### Hotplug devices
|
||||
|
||||
The Kata Containers VM starts with a minimum amount of resources, allowing for faster boot time and a reduction in memory footprint. As the container launch progresses,
|
||||
devices are hotplugged to the VM. For example, when a CPU constraint is specified which includes additional CPUs, they can be hot added. Kata Containers has support
|
||||
for hot-adding the following devices:
|
||||
- Virtio block
|
||||
- Virtio SCSI
|
||||
- VFIO
|
||||
- CPU
|
||||
|
||||
### Firecracker/KVM
|
||||
|
||||
Firecracker, built on many rust crates that are within [rust-VMM](https://github.com/rust-vmm), has a very limited device model, providing a lighter
|
||||
footprint and attack surface, focusing on function-as-a-service like use cases. As a result, Kata Containers with Firecracker VMM supports a subset of the CRI API.
|
||||
Firecracker does not support file-system sharing, and as a result only block-based storage drivers are supported. Firecracker does not support device
|
||||
hotplug nor does it support VFIO. As a result, Kata Containers with Firecracker VMM does not support updating container resources after boot, nor
|
||||
does it support device passthrough.
|
||||
|
||||
Devices used:
|
||||
- virtio VSOCK
|
||||
- virtio block
|
||||
- virtio net
|
||||
|
||||
### Cloud Hypervisor/KVM
|
||||
|
||||
[Cloud Hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor), based
|
||||
on [rust-vmm](https://github.com/rust-vmm), is designed to have a
|
||||
lighter footprint and smaller attack surface for running modern cloud
|
||||
workloads. Kata Containers with Cloud
|
||||
Hypervisor provides mostly complete compatibility with Kubernetes
|
||||
comparable to the QEMU configuration. As of the 1.12 and 2.0.0 release
|
||||
of Kata Containers, the Cloud Hypervisor configuration supports both CPU
|
||||
and memory resize, device hotplug (disk and VFIO), file-system sharing through virtio-fs,
|
||||
block-based volumes, booting from VM images backed by pmem device, and
|
||||
fine-grained seccomp filters for each VMM thread (e.g. all virtio
|
||||
device worker threads). Please check [this GitHub Project](https://github.com/orgs/kata-containers/projects/21)
|
||||
for details of ongoing integration efforts.
|
||||
|
||||
Devices and features used:
|
||||
- virtio VSOCK or virtio serial
|
||||
- virtio block
|
||||
- virtio net
|
||||
- virtio fs
|
||||
- virtio pmem
|
||||
- VFIO
|
||||
- hotplug
|
||||
- seccomp filters
|
||||
- [HTTP OpenAPI](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/master/vmm/src/api/openapi/cloud-hypervisor.yaml)
|
||||
|
||||
### Summary
|
||||
|
||||
| Solution | release introduced | brief summary |
|
||||
|-|-|-|
|
||||
| Cloud Hypervisor | 1.10 | upstream Cloud Hypervisor with rich feature support, e.g. hotplug, VFIO and FS sharing|
|
||||
| Firecracker | 1.5 | upstream Firecracker, rust-VMM based, no VFIO, no FS sharing, no memory/CPU hotplug |
|
||||
| QEMU | 1.0 | upstream QEMU, with support for hotplug and filesystem sharing |
|
||||