metrics: Add memory footprint tests to the CI

This PR adds memory footprint metrics tests to the
tests/metrics/density folder.

Intentionally, each test exits with a zero status in all cases, so
that the tests are green when added; they will be fully enabled in a
subsequent PR.

A workflow matrix was added to define the hypervisor variant used by
each job; the jobs run sequentially.

The launch-times test was updated to make use of the matrix
environment variables.

Fixes: #7066

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
David Esparza 2023-06-22 09:23:13 -06:00
parent 5e3f617cb6
commit b2ce8b4d61
9 changed files with 1494 additions and 14 deletions


@@ -8,9 +8,15 @@ on:
jobs:
run-metrics:
strategy:
fail-fast: true
matrix:
vmm: ['clh', 'qemu']
max-parallel: 1
runs-on: metrics
env:
GOPATH: ${{ github.workspace }}
KATA_HYPERVISOR: ${{ matrix.vmm }}
steps:
- uses: actions/checkout@v3
with:
@@ -25,8 +31,12 @@ jobs:
- name: Install kata
run: bash tests/metrics/gha-run.sh install-kata kata-artifacts
- name: run launch times on qemu
run: bash tests/metrics/gha-run.sh run-test-launchtimes-qemu
- name: run launch times test
run: bash tests/metrics/gha-run.sh run-test-launchtimes
- name: run memory foot print test
run: bash tests/metrics/gha-run.sh run-test-memory-usage
- name: run memory usage inside container test
run: bash tests/metrics/gha-run.sh run-test-memory-usage-inside-container
- name: run launch times on clh
run: bash tests/metrics/gha-run.sh run-test-launchtimes-clh


@@ -55,6 +55,8 @@ For further details see the [time tests documentation](time).
Tests that measure the size and overheads of the runtime. Generally this is looking at
memory footprint sizes, but could also cover disk space or even CPU consumption.
For further details see the [density tests documentation](density).
### Networking
Tests relating to networking. General items could include:


@@ -0,0 +1,53 @@
# Kata Containers density metrics tests
This directory contains a number of tests to help measure container
memory footprint. Some measures are based around the
[PSS](https://en.wikipedia.org/wiki/Proportional_set_size) of the runtime
components, and others look at the system level (`free` and `/proc/meminfo`
for instance) impact.
## `memory_usage`
This test measures the PSS footprint of the runtime components whilst
launching a number of small ([BusyBox](https://hub.docker.com/_/busybox/)) containers
using ctr.
## `fast_footprint`
This test takes system level resource measurements after launching a number of
containers in parallel and optionally waiting for KSM to settle its memory
compaction cycles.
The script is quite configurable via environment variables, including:
* Which container workload to run.
* How many containers to launch.
* How many containers are launched in parallel.
* How long to wait until taking the measures.
See the script itself for more details.
This test shares many config options with the `footprint_data` test. Thus, referring
to the [footprint test documentation](footprint_data.md) may be useful.
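The parallel launch works in waves: full batches of `PARALLELISM` containers, followed by any remainder. A minimal sketch of that batching arithmetic (the values here are illustrative, not the defaults):

```shell
# Sketch of the wave-based batching used by fast_footprint:
# full waves of PARALLELISM containers, plus a final remainder.
NUM_CONTAINERS=25
PARALLELISM=10
(( parloops = NUM_CONTAINERS / PARALLELISM ))              # full waves
(( leftovers = NUM_CONTAINERS - parloops * PARALLELISM ))  # remainder
echo "${parloops} waves of ${PARALLELISM}, plus ${leftovers} extras"
```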
> *Note:* If this test finds KSM is enabled on the host, it will wait for KSM
> to "settle" before taking the final measurement. If your KSM is not configured
> to process all the allocated VM memory fast enough, the test will hit a timeout
> and proceed to take the final measurement anyway.
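The "settle" detection amounts to polling `/sys/kernel/mm/ksm/pages_shared` until successive reads stop changing, or a timeout expires. A self-contained sketch of that idea, with canned sample values standing in for the sysfs reads:

```shell
# Settle detection sketch: stop once two consecutive samples match.
# The samples stand in for successive reads of
# /sys/kernel/mm/ksm/pages_shared.
samples="10 40 80 80"
prev=""
settled=0
for cur in $samples; do
    if [ -n "$prev" ] && [ "$cur" -eq "$prev" ]; then
        settled=1
        break
    fi
    prev="$cur"
done
echo "settled=${settled} at pages_shared=${cur}"
```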
## `footprint_data`
Similar to the `fast_footprint` test, but this test launches the containers
sequentially and takes a system level measurement between each launch. Thus,
this test provides finer grained information on system scaling, but takes
significantly longer to run than the `fast_footprint` test. If you are only
interested in the final figure or the average impact, you may be better running
the `fast_footprint` test.
For more details see the [footprint test documentation](footprint_data.md).
## `memory_usage_inside_container`
Measures the memory statistics *inside* the container. This allows evaluation of
the overhead the VM kernel and rootfs are having on the memory that was requested
by the container co-ordination system, and thus supplied to the VM.
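The kind of statistic this test gathers boils down to reading the container-visible `/proc/meminfo`; a minimal sketch (runnable on any Linux system — inside a Kata container the numbers come from the VM kernel, not the host):

```shell
# Read the memory statistics the guest kernel exposes; inside a Kata
# container MemTotal reflects what the VM was given, rather than what
# the workload requested, which is exactly the overhead being measured.
memtotal_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
memfree_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
memused_kb=$(( memtotal_kb - memfree_kb ))
echo "MemTotal=${memtotal_kb} kB, used=${memused_kb} kB"
```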


@@ -0,0 +1,433 @@
#!/bin/bash
# Copyright (c) 2017-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# A script to gather memory 'footprint' information as we launch more
# and more containers
#
# The script gathers information about both user and kernel space consumption
# Output is into a .json file, named using some of the config component names
# (such as footprint-busybox.json)
# Pull in some common, useful, items
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_PATH}/../lib/common.bash"
# Note that all vars that can be set from outside the script (that is,
# passed in the ENV), use the ':-' setting to allow being over-ridden
# Default sleep, in seconds, to let containers come up and finish their
# initialisation before we take the measures. Some of the larger
# containers can take a number of seconds to get running.
PAYLOAD_SLEEP="${PAYLOAD_SLEEP:-10}"
# How long, in seconds, do we wait for KSM to 'settle down', before we
# timeout and just continue anyway.
KSM_WAIT_TIME="${KSM_WAIT_TIME:-300}"
# How long, in seconds, do we poll for ctr to complete launching all the
# containers?
CTR_POLL_TIMEOUT="${CTR_POLL_TIMEOUT:-300}"
# How many containers do we launch in parallel before taking the PAYLOAD_SLEEP
# nap
PARALLELISM="${PARALLELISM:-10}"
### The default config - run a small busybox image
# Define what we will be running (app under test)
# Default is we run busybox, as a 'small' workload
PAYLOAD="${PAYLOAD:-quay.io/prometheus/busybox:latest}"
PAYLOAD_ARGS="${PAYLOAD_ARGS:-tail -f /dev/null}"
###
# which RUNTIME we use is picked up from the env in
# common.bash. You can over-ride by setting RUNTIME in your env
###
# Define the cutoff checks for when we stop running the test
# Run up to this many containers
NUM_CONTAINERS="${NUM_CONTAINERS:-100}"
# Run until we have consumed this much memory (from MemFree)
MAX_MEMORY_CONSUMED="${MAX_MEMORY_CONSUMED:-256*1024*1024*1024}"
# Run until we have this much MemFree left
MIN_MEMORY_FREE="${MIN_MEMORY_FREE:-2*1024*1024*1024}"
# Tools we need to have installed in order to operate
REQUIRED_COMMANDS="smem awk"
# If we 'dump' the system caches before we measure then we get less
# noise in the results - they show more what our un-reclaimable footprint is
DUMP_CACHES="${DUMP_CACHES:-1}"
# Affects the name of the file to store the results in
TEST_NAME="${TEST_NAME:-fast-footprint-busybox}"
############# end of configurable items ###################
# vars to remember where we started so we can calc diffs
base_mem_avail=0
base_mem_free=0
# dump the kernel caches, so we get a more precise (or just different)
# view of what our footprint really is.
function dump_caches() {
sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"
}
function init() {
restart_containerd_service
check_cmds $REQUIRED_COMMANDS
sudo -E "${CTR_EXE}" image pull "$PAYLOAD"
# Modify the test name if running with KSM enabled
check_for_ksm
# Use the common init func to get to a known state
init_env
# Prepare to start storing results
metrics_json_init
# Store up baseline measures
base_mem_avail=$(free -b | head -2 | tail -1 | awk '{print $7}')
base_mem_free=$(get_memfree)
# Store our configuration for this run
save_config
}
save_config(){
metrics_json_start_array
local json="$(cat << EOF
{
"testname": "${TEST_NAME}",
"payload": "${PAYLOAD}",
"payload_args": "${PAYLOAD_ARGS}",
"payload_sleep": ${PAYLOAD_SLEEP},
"ksm_settle_time": ${KSM_WAIT_TIME},
"num_containers": ${NUM_CONTAINERS},
"parallelism": ${PARALLELISM},
"max_memory_consumed": "${MAX_MEMORY_CONSUMED}",
"min_memory_free": "${MIN_MEMORY_FREE}",
"dump_caches": "${DUMP_CACHES}"
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Config"
}
function cleanup() {
# Finish storing the results
metrics_json_save
clean_env_ctr
}
# helper function to get USS of process in arg1
function get_proc_uss() {
item=$(sudo smem -t -P "^$1" | tail -1 | awk '{print $4}')
((item*=1024))
echo $item
}
# helper function to get PSS of process in arg1
function get_proc_pss() {
item=$(sudo smem -t -P "^$1" | tail -1 | awk '{print $5}')
((item*=1024))
echo $item
}
# Get the PSS for the whole of userspace (all processes)
# This allows us to see if we had any impact on the rest of the system, for instance
# dockerd grows as we launch containers, so we should account for that in our total
# memory breakdown
function grab_all_pss() {
item=$(sudo smem -t | tail -1 | awk '{print $5}')
((item*=1024))
local json="$(cat << EOF
"all_pss": {
"pss": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_user_smem() {
# userspace
item=$(sudo smem -w | head -5 | tail -1 | awk '{print $3}')
((item*=1024))
local json="$(cat << EOF
"user_smem": {
"userspace": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_slab() {
# Grabbing slab total from meminfo is easier than doing the math
# on slabinfo
item=$(fgrep "Slab:" /proc/meminfo | awk '{print $2}')
((item*=1024))
local json="$(cat << EOF
"slab": {
"slab": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function get_memfree() {
mem_free=$(sudo smem -w | head -6 | tail -1 | awk '{print $4}')
((mem_free*=1024))
echo $mem_free
}
function grab_system() {
# avail memory, from 'free'
local avail=$(free -b | head -2 | tail -1 | awk '{print $7}')
local avail_decr=$((base_mem_avail-avail))
# cached memory, from 'free'
local cached=$(free -b | head -2 | tail -1 | awk '{print $6}')
# free memory from smem
local smem_free=$(get_memfree)
local free_decr=$((base_mem_free-smem_free))
# Anon pages
local anon=$(fgrep "AnonPages:" /proc/meminfo | awk '{print $2}')
((anon*=1024))
# Mapped pages
local mapped=$(egrep "^Mapped:" /proc/meminfo | awk '{print $2}')
((mapped*=1024))
# Cached
local meminfo_cached=$(grep "^Cached:" /proc/meminfo | awk '{print $2}')
((meminfo_cached*=1024))
local json="$(cat << EOF
"system": {
"avail": $avail,
"avail_decr": $avail_decr,
"cached": $cached,
"smem_free": $smem_free,
"free_decr": $free_decr,
"anon": $anon,
"mapped": $mapped,
"meminfo_cached": $meminfo_cached,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_stats() {
# If configured, dump the caches so we get a more stable
# view of what our static footprint really is
if [[ "$DUMP_CACHES" -eq 1 ]] ; then
dump_caches
fi
# user space data
# PSS taken all userspace
grab_all_pss
# user as reported by smem
grab_user_smem
# System overview data
# System free and cached
grab_system
# kernel data
# The 'total kernel space taken' we can work out as:
# ktotal = ((free-avail)-user)
# So, we don't grab that number from smem, as that is what it does
# internally anyhow.
# Still try to grab any finer kernel details that we can though
# totals from slabinfo
grab_slab
metrics_json_close_array_element
}
function check_limits() {
mem_free=$(get_memfree)
if ((mem_free <= MIN_MEMORY_FREE)); then
echo 1
return
fi
mem_consumed=$((base_mem_avail-mem_free))
if ((mem_consumed >= MAX_MEMORY_CONSUMED)); then
echo 1
return
fi
echo 0
}
launch_containers() {
local parloops leftovers
(( parloops=${NUM_CONTAINERS}/${PARALLELISM} ))
(( leftovers=${NUM_CONTAINERS} - (${parloops}*${PARALLELISM}) ))
echo "Launching ${parloops}x${PARALLELISM} containers + ${leftovers} extras"
containers=()
local iter n
for iter in $(seq 1 $parloops); do
echo "Launch iteration ${iter}"
for n in $(seq 1 $PARALLELISM); do
containers+=($(random_name))
sudo -E "${CTR_EXE}" run -d --runtime=$CTR_RUNTIME $PAYLOAD ${containers[-1]} sh -c $PAYLOAD_ARGS &
done
if [[ $PAYLOAD_SLEEP ]]; then
sleep $PAYLOAD_SLEEP
fi
# check if we have hit one of our limits and need to wrap up the tests
if (($(check_limits))); then
echo "Ran out of resources, check_limits failed"
return
fi
done
for n in $(seq 1 $leftovers); do
containers+=($(random_name))
sudo -E "${CTR_EXE}" run -d --runtime=$CTR_RUNTIME $PAYLOAD ${containers[-1]} sh -c $PAYLOAD_ARGS &
done
}
wait_containers() {
local t numcontainers
# nap 3s between checks
local step=3
for ((t=0; t<${CTR_POLL_TIMEOUT}; t+=step)); do
numcontainers=$(sudo -E "${CTR_EXE}" c list -q | wc -l)
if (( numcontainers >= ${NUM_CONTAINERS} )); then
echo "All containers now launched (${t}s)"
return
else
echo "Waiting for containers to launch (${numcontainers} at ${t}s)"
fi
sleep ${step}
done
echo "Timed out waiting for containers to launch (${t}s)"
cleanup
die "Timed out waiting for containers to launch (${t}s)"
}
function go() {
# Init the json cycle for this save
metrics_json_start_array
# Grab the first set of stats before we run any containers.
grab_stats
launch_containers
wait_containers
if [ "$ksm_on" == "1" ]; then
echo "Waiting for KSM to settle..."
wait_ksm_settle ${KSM_WAIT_TIME}
fi
grab_stats
# Wrap up the results array
metrics_json_end_array "Results"
}
function show_vars()
{
echo -e "\nEnvironment variables:"
echo -e "\tName (default)"
echo -e "\t\tDescription"
echo -e "\tPAYLOAD (${PAYLOAD})"
echo -e "\t\tThe ctr image to run"
echo -e "\tPAYLOAD_ARGS (${PAYLOAD_ARGS})"
echo -e "\t\tAny extra arguments passed into the ctr 'run' command"
echo -e "\tPAYLOAD_SLEEP (${PAYLOAD_SLEEP})"
echo -e "\t\tSeconds to sleep between launch and measurement, to allow settling"
echo -e "\tKSM_WAIT_TIME (${KSM_WAIT_TIME})"
echo -e "\t\tSeconds to wait for KSM to settle before we take the final measure"
echo -e "\tCTR_POLL_TIMEOUT (${CTR_POLL_TIMEOUT})"
echo -e "\t\tSeconds to poll for ctr to finish launching containers"
echo -e "\tPARALLELISM (${PARALLELISM})"
echo -e "\t\tNumber of containers we launch in parallel"
echo -e "\tNUM_CONTAINERS (${NUM_CONTAINERS})"
echo -e "\t\tThe total number of containers to run"
echo -e "\tMAX_MEMORY_CONSUMED (${MAX_MEMORY_CONSUMED})"
echo -e "\t\tThe maximum amount of memory to be consumed before terminating"
echo -e "\tMIN_MEMORY_FREE (${MIN_MEMORY_FREE})"
echo -e "\t\tThe minimum amount of memory allowed to be free before terminating"
echo -e "\tDUMP_CACHES (${DUMP_CACHES})"
echo -e "\t\tA flag to note if the system caches should be dumped before capturing stats"
echo -e "\tTEST_NAME (${TEST_NAME})"
echo -e "\t\tCan be set to over-ride the default JSON results filename"
}
function help()
{
usage=$(cat << EOF
Usage: $0 [-h] [options]
Description:
Launch a series of workloads and take memory metric measurements after
each launch.
Options:
-h, Help page.
EOF
)
echo "$usage"
show_vars
}
function main() {
local OPTIND
while getopts "h" opt;do
case ${opt} in
h)
help
exit 0;
;;
esac
done
shift $((OPTIND-1))
init
go
cleanup
}
main "$@"


@@ -0,0 +1,87 @@
# Footprint data script details
The `footprint_data.sh` script runs a number of identical containers sequentially
via ctr and takes a number of memory related measurements after each
launch. The script is generally not used in a CI type environment, but is intended
to be run and analyzed manually.
You can configure the script by setting a number of environment variables.
The following sections list details of the configurable variables, along with a
small example invocation script.
## Variables
Environment variables can take effect in two ways.
Some variables affect how the payload is executed. The `CONTAINERD_RUNTIME` and `PAYLOAD`
variables directly affect the payload execution with the following line in
the script:
`$ ctr run --memory-limit $PAYLOAD_RUNTIME_ARGS --rm --runtime=$CONTAINERD_RUNTIME $PAYLOAD $NAME sh -c $PAYLOAD_ARGS`
Other settings affect how memory footprint is measured and the test termination
conditions.
| Variable | Function
| -------- | --------
| `PAYLOAD` | The ctr image to run
| `PAYLOAD_ARGS` | Any arguments passed into the ctr image
| `PAYLOAD_RUNTIME_ARGS` | Any extra arguments passed into the ctr `run` command
| `PAYLOAD_SLEEP` | Seconds to sleep between launch and measurement, to allow settling
| `MAX_NUM_CONTAINERS` | The maximum number of containers to run before terminating
| `MAX_MEMORY_CONSUMED` | The maximum amount of memory to be consumed before terminating
| `MIN_MEMORY_FREE` | The minimum amount of memory allowed to be free before terminating
| `DUMP_CACHES` | A flag to note if the system caches should be dumped before capturing stats
| `DATAFILE` | Can be set to over-ride the default JSON results filename
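All of these variables use the shell's `${VAR:-default}` form inside the script, so exporting a value before invoking it overrides the built-in default. A sketch of the mechanism (`DEMO_PAYLOAD` is an illustrative name, not one of the script's variables):

```shell
# ${VAR:-default}: the default applies only when the variable is unset
# or empty; an exported value wins.
unset DEMO_PAYLOAD
with_default="${DEMO_PAYLOAD:-quay.io/prometheus/busybox:latest}"

DEMO_PAYLOAD="docker.io/library/alpine:latest"
with_override="${DEMO_PAYLOAD:-quay.io/prometheus/busybox:latest}"

echo "default:  ${with_default}"
echo "override: ${with_override}"
```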
## Output files
The names of the JSON files generated by the test are dictated by some of the parameters
the test is utilising. The default filename is generated in the form of:
`footprint-${PAYLOAD}[-ksm].json`
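A sketch of how such a name could be assembled (assumption: a `ksm_on` flag is set by the common helpers when KSM is active; the suffix logic here is illustrative):

```shell
# Build the results filename: footprint-${PAYLOAD}[-ksm].json
PAYLOAD="busybox"
ksm_on=1                     # assumed flag: 1 when KSM is enabled
suffix=""
[ "$ksm_on" = "1" ] && suffix="-ksm"
DATAFILE="footprint-${PAYLOAD}${suffix}.json"
echo "$DATAFILE"
```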
## Measurements
The test measures, calculates, and stores a number of data items:
| Item | Description
| ---- | -----------
| `uss` | USS for all the VM runtime components
| `pss` | PSS for all the VM runtime components
| `all_pss` | PSS of all of userspace - to monitor if we had other impact on the system
| `user_smem` | `smem` "userspace" consumption value
| `avail` | "available" memory from `free`
| `avail_decr` | "available" memory decrease since start of test
| `cached` | "Cached" memory from `/proc/meminfo`
| `smem_free` | Free memory as reported by `smem`
| `free_decr` | Decrease in Free memory reported by `smem` since start of test
| `anon` | `AnonPages` as reported from `/proc/meminfo`
| `mapped` | Mapped pages as reported from `/proc/meminfo`
| `cached` | Cached pages as reported from `/proc/meminfo`
| `slab` | Slab as reported from `/proc/meminfo`
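Most of these items are read from `/proc/meminfo` (or from `smem`) in kB and normalised to bytes. For instance, the `slab` figure is gathered roughly like this (a sketch of what `grab_slab` in the script does):

```shell
# Read the total Slab figure from /proc/meminfo (reported in kB)
# and convert it to bytes, as the footprint script does.
slab_kb=$(awk '/^Slab:/ {print $2}' /proc/meminfo)
slab_bytes=$(( slab_kb * 1024 ))
echo "Slab: ${slab_kb} kB (${slab_bytes} bytes)"
```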
## Example script
The following script is an example of how to configure the environment variables and
invoke the test script to run a number of different container tests.
```bash
#!/bin/bash
set -e
set -x
export MAX_NUM_CONTAINERS=10
export MAX_MEMORY_CONSUMED=6*1024*1024*1024
function run() {
###
# Define what we will be running (app under test)
# Default is we run busybox, as a 'small' workload
export PAYLOAD="quay.io/prometheus/busybox:latest"
export PAYLOAD_ARGS="tail -f /dev/null"
export PAYLOAD_SLEEP=10
export PAYLOAD_RUNTIME_ARGS="5120"
sudo -E bash $(pwd)/density/footprint_data.sh
}
export CONTAINERD_RUNTIME=io.containerd.kata.v2
run
```


@@ -0,0 +1,360 @@
#!/bin/bash
# Copyright (c) 2017-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# A script to gather memory 'footprint' information as we launch more
# and more containers
#
# The script gathers information about both user and kernel space consumption
# Output is into a .json file, named using some of the config component names
# (such as footprint-busybox.json)
# Pull in some common, useful, items
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_PATH}/../lib/common.bash"
KSM_ENABLE_FILE="/sys/kernel/mm/ksm/run"
# Note that all vars that can be set from outside the script (that is,
# passed in the ENV), use the ':-' setting to allow being over-ridden
# Default sleep for 10s to let containers come up and finish their
# initialisation before we take the measures. Some of the larger
# containers can take a number of seconds to get running.
PAYLOAD_SLEEP="${PAYLOAD_SLEEP:-10}"
### The default config - run a small busybox image
# Define what we will be running (app under test)
# Default is we run busybox, as a 'small' workload
PAYLOAD="${PAYLOAD:-quay.io/prometheus/busybox:latest}"
PAYLOAD_ARGS="${PAYLOAD_ARGS:-tail -f /dev/null}"
###
# Define the cutoff checks for when we stop running the test
# Run up to this many containers
MAX_NUM_CONTAINERS="${MAX_NUM_CONTAINERS:-10}"
# Run until we have consumed this much memory (from MemFree)
MAX_MEMORY_CONSUMED="${MAX_MEMORY_CONSUMED:-6*1024*1024*1024}"
# Run until we have this much MemFree left
MIN_MEMORY_FREE="${MIN_MEMORY_FREE:-2*1024*1024*1024}"
# Tools we need to have installed in order to operate
REQUIRED_COMMANDS="smem awk"
# If we 'dump' the system caches before we measure then we get less
# noise in the results - they show more what our un-reclaimable footprint is
DUMP_CACHES="${DUMP_CACHES:-1}"
# Affects the name of the file to store the results in
TEST_NAME="${TEST_NAME:-footprint-busybox}"
############# end of configurable items ###################
# vars to remember where we started so we can calc diffs
base_mem_avail=0
base_mem_free=0
# dump the kernel caches, so we get a more precise (or just different)
# view of what our footprint really is.
function dump_caches() {
sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"
}
function init() {
restart_containerd_service
check_cmds $REQUIRED_COMMANDS
sudo -E "${CTR_EXE}" image pull "$PAYLOAD"
# Modify the test name if running with KSM enabled
check_for_ksm
# Use the common init func to get to a known state
init_env
# Prepare to start storing results
metrics_json_init
# Store up baseline measures
base_mem_avail=$(free -b | head -2 | tail -1 | awk '{print $7}')
base_mem_free=$(get_memfree)
# Store our configuration for this run
save_config
}
save_config(){
metrics_json_start_array
local json="$(cat << EOF
{
"testname": "${TEST_NAME}",
"payload": "${PAYLOAD}",
"payload_args": "${PAYLOAD_ARGS}",
"payload_sleep": ${PAYLOAD_SLEEP},
"max_containers": ${MAX_NUM_CONTAINERS},
"max_memory_consumed": "${MAX_MEMORY_CONSUMED}",
"min_memory_free": "${MIN_MEMORY_FREE}",
"dump_caches": "${DUMP_CACHES}"
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Config"
}
function cleanup() {
# Finish storing the results
metrics_json_save
clean_env_ctr
}
# helper function to get USS of process in arg1
function get_proc_uss() {
item=$(sudo smem -t -P "^$1" | tail -1 | awk '{print $4}')
((item*=1024))
echo $item
}
# helper function to get PSS of process in arg1
function get_proc_pss() {
item=$(sudo smem -t -P "^$1" | tail -1 | awk '{print $5}')
((item*=1024))
echo $item
}
# Get the PSS for the whole of userspace (all processes)
# This allows us to see if we had any impact on the rest of the system, for instance
# containerd grows as we launch containers, so we should account for that in our total
# memory breakdown
function grab_all_pss() {
item=$(sudo smem -t | tail -1 | awk '{print $5}')
((item*=1024))
local json="$(cat << EOF
"all_pss": {
"pss": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_user_smem() {
# userspace
item=$(sudo smem -w | head -5 | tail -1 | awk '{print $3}')
((item*=1024))
local json="$(cat << EOF
"user_smem": {
"userspace": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_slab() {
# Grabbing slab total from meminfo is easier than doing the math
# on slabinfo
item=$(fgrep "Slab:" /proc/meminfo | awk '{print $2}')
((item*=1024))
local json="$(cat << EOF
"slab": {
"slab": $item,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function get_memfree() {
mem_free=$(sudo smem -w | head -6 | tail -1 | awk '{print $4}')
((mem_free*=1024))
echo $mem_free
}
function grab_system() {
# avail memory, from 'free'
local avail=$(free -b | head -2 | tail -1 | awk '{print $7}')
local avail_decr=$((base_mem_avail-avail))
# cached memory, from 'free'
local cached=$(free -b | head -2 | tail -1 | awk '{print $6}')
# free memory from smem
local smem_free=$(get_memfree)
local free_decr=$((base_mem_free-smem_free))
# Anon pages
local anon=$(fgrep "AnonPages:" /proc/meminfo | awk '{print $2}')
((anon*=1024))
# Mapped pages
local mapped=$(egrep "^Mapped:" /proc/meminfo | awk '{print $2}')
((mapped*=1024))
# Cached
local meminfo_cached=$(grep "^Cached:" /proc/meminfo | awk '{print $2}')
((meminfo_cached*=1024))
local json="$(cat << EOF
"system": {
"avail": $avail,
"avail_decr": $avail_decr,
"cached": $cached,
"smem_free": $smem_free,
"free_decr": $free_decr,
"anon": $anon,
"mapped": $mapped,
"meminfo_cached": $meminfo_cached,
"Units": "KB"
}
EOF
)"
metrics_json_add_array_fragment "$json"
}
function grab_stats() {
# If configured, dump the caches so we get a more stable
# view of what our static footprint really is
if [[ "$DUMP_CACHES" -eq 1 ]] ; then
dump_caches
fi
# user space data
# PSS taken all userspace
grab_all_pss
# user as reported by smem
grab_user_smem
# System overview data
# System free and cached
grab_system
# kernel data
# The 'total kernel space taken' we can work out as:
# ktotal = ((free-avail)-user)
# So, we don't grab that number from smem, as that is what it does
# internally anyhow.
# Still try to grab any finer kernel details that we can though
# totals from slabinfo
grab_slab
metrics_json_close_array_element
}
function check_limits() {
mem_free=$(get_memfree)
if ((mem_free <= MIN_MEMORY_FREE)); then
echo 1
return
fi
mem_consumed=$((base_mem_avail-mem_free))
if ((mem_consumed >= MAX_MEMORY_CONSUMED)); then
echo 1
return
fi
echo 0
}
function go() {
# Init the json cycle for this save
metrics_json_start_array
containers=()
for i in $(seq 1 $MAX_NUM_CONTAINERS); do
containers+=($(random_name))
sudo -E "${CTR_EXE}" run -d --runtime=$CTR_RUNTIME $PAYLOAD ${containers[-1]} sh -c $PAYLOAD_ARGS
if [[ $PAYLOAD_SLEEP ]]; then
sleep $PAYLOAD_SLEEP
fi
grab_stats
# check if we have hit one of our limits and need to wrap up the tests
if (($(check_limits))); then
# Wrap up the results array
metrics_json_end_array "Results"
return
fi
done
# Wrap up the results array
metrics_json_end_array "Results"
}
function show_vars()
{
echo -e "\nEnvironment variables:"
echo -e "\tName (default)"
echo -e "\t\tDescription"
echo -e "\tPAYLOAD (${PAYLOAD})"
echo -e "\t\tThe ctr image to run"
echo -e "\tPAYLOAD_ARGS (${PAYLOAD_ARGS})"
echo -e "\t\tAny extra arguments passed into the ctr 'run' command"
echo -e "\tPAYLOAD_SLEEP (${PAYLOAD_SLEEP})"
echo -e "\t\tSeconds to sleep between launch and measurement, to allow settling"
echo -e "\tMAX_NUM_CONTAINERS (${MAX_NUM_CONTAINERS})"
echo -e "\t\tThe maximum number of containers to run before terminating"
echo -e "\tMAX_MEMORY_CONSUMED (${MAX_MEMORY_CONSUMED})"
echo -e "\t\tThe maximum amount of memory to be consumed before terminating"
echo -e "\tMIN_MEMORY_FREE (${MIN_MEMORY_FREE})"
echo -e "\t\tThe minimum amount of memory allowed to be free before terminating"
echo -e "\tDUMP_CACHES (${DUMP_CACHES})"
echo -e "\t\tA flag to note if the system caches should be dumped before capturing stats"
echo -e "\tTEST_NAME (${TEST_NAME})"
echo -e "\t\tCan be set to over-ride the default JSON results filename"
}
function help()
{
usage=$(cat << EOF
Usage: $0 [-h] [options]
Description:
Launch a series of workloads and take memory metric measurements after
each launch.
Options:
-h, Help page.
EOF
)
echo "$usage"
show_vars
}
function main() {
local OPTIND
while getopts "h" opt;do
case ${opt} in
h)
help
exit 0;
;;
esac
done
shift $((OPTIND-1))
init
go
cleanup
}
main "$@"


@@ -0,0 +1,381 @@
#!/bin/bash
# Copyright (c) 2017-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# Description of the test:
# This test launches a number of containers in idle mode.
# It will then sleep for a configurable period of time to allow
# any memory optimisations to 'settle', and then checks the
# amount of memory used by all the containers to come up with
# an average (using the PSS measurements).
# This test uses the smem tool to get the memory used.
set -e
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_PATH}/../lib/common.bash"
# Busybox image: choose a small workload image, in order to
# measure the runtime footprint, not the workload footprint.
IMAGE='quay.io/prometheus/busybox:latest'
CMD='tail -f /dev/null'
NUM_CONTAINERS="$1"
WAIT_TIME="$2"
AUTO_MODE="$3"
TEST_NAME="memory footprint"
SMEM_BIN="smem"
KSM_ENABLE_FILE="/sys/kernel/mm/ksm/run"
MEM_TMP_FILE=$(mktemp meminfo.XXXXXXXXXX)
PS_TMP_FILE=$(mktemp psinfo.XXXXXXXXXX)
function remove_tmp_file() {
rm -rf $MEM_TMP_FILE $PS_TMP_FILE
}
trap remove_tmp_file EXIT
# Show help about this script
help(){
cat << EOF
Usage: $0 <count> <wait_time> [auto]
Description:
<count> : Number of containers to run.
<wait_time> : Time in seconds to wait before taking
metrics.
[auto] : Optional 'auto KSM settle' mode
waits for ksm pages_shared to settle down
EOF
}
get_runc_pss_memory(){
ctr_runc_shim_path="/usr/local/bin/containerd-shim-runc-v2"
get_pss_memory "$ctr_runc_shim_path"
}
get_runc_individual_memory() {
runc_process_result=$(cat $MEM_TMP_FILE | tr "\n" " " | sed -e 's/\s$//g' | sed 's/ /, /g')
# Verify runc process result
if [ -z "$runc_process_result" ];then
die "Runc process not found"
fi
read -r -a runc_values <<< "${runc_process_result}"
metrics_json_start_array
local json="$(cat << EOF
{
"runc individual results": [
$(for ((i=0;i<NUM_CONTAINERS;++i)); do
printf '%s\n\t\t\t' "${runc_values[i]}"
done)
]
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Raw results"
}
# This function measures the PSS average
# memory of a process.
get_pss_memory(){
ps="$1"
mem_amount=0
count=0
avg=0
if [ -z "$ps" ]; then
die "No argument to get_pss_memory()"
fi
# Save all the processes names
# This will be help us to retrieve raw information
echo $ps >> $PS_TMP_FILE
data=$(sudo "$SMEM_BIN" --no-header -P "^$ps" -c "pss" | sed 's/[[:space:]]//g')
# Save all the smem results
# This will help us to retrieve raw information
echo $data >> $MEM_TMP_FILE
for i in $data;do
if (( i > 0 ));then
mem_amount=$(( i + mem_amount ))
(( count++ ))
fi
done
if (( $count > 0 ));then
avg=$(bc -l <<< "scale=2; $mem_amount / $count")
fi
echo "$avg"
}
ppid() {
local pid
pid=$(ps -p "${1:-nopid}" -o ppid=)
echo "${pid//[[:blank:]]/}"
}
# This function measures the PSS average
# memory of virtiofsd.
# It is a special case of get_pss_memory,
# virtiofsd forks itself so, smem sees the process
# two times, this function sum both pss values:
# pss_virtiofsd=pss_fork + pss_parent
get_pss_memory_virtiofsd() {
mem_amount=0
count=0
avg=0
virtiofsd_path=${1:-}
if [ -z "${virtiofsd_path}" ]; then
die "virtiofsd_path not provided"
fi
echo "${virtiofsd_path}" >> $PS_TMP_FILE
virtiofsd_pids=$(ps aux | grep [v]irtiofsd | awk '{print $2}')
data=$(sudo smem --no-header -P "^${virtiofsd_path}" -c "pid pss")
for p in ${virtiofsd_pids}; do
parent_pid=$(ppid ${p})
cmd="$(cat /proc/${p}/cmdline | tr -d '\0')"
cmd_parent="$(cat /proc/${parent_pid}/cmdline | tr -d '\0')"
if [ "${cmd}" != "${cmd_parent}" ]; then
pss_parent=$(printf "%s" "${data}" | grep "^\s*${p}" | awk '{print $2}')
fork=$(pgrep -P ${p})
pss_fork=$(printf "%s" "${data}" | grep "^\s*${fork}" | awk '{print $2}')
pss_process=$((pss_fork + pss_parent))
# Save all the smem results
# This will help us to retrieve raw information
echo "${pss_process}" >>$MEM_TMP_FILE
if ((pss_process > 0)); then
mem_amount=$((pss_process + mem_amount))
((count++))
fi
fi
done
if (( $count > 0 ));then
avg=$(bc -l <<< "scale=2; $mem_amount / $count")
fi
echo "${avg}"
}
get_individual_memory(){
# Getting all the individual container information
first_process_name=$(cat $PS_TMP_FILE | awk 'NR==1' | awk -F "/" '{print $NF}' | sed 's/[[:space:]]//g')
first_process_result=$(cat $MEM_TMP_FILE | awk 'NR==1' | sed 's/ /, /g')
second_process_name=$(cat $PS_TMP_FILE | awk 'NR==2' | awk -F "/" '{print $NF}' | sed 's/[[:space:]]//g')
second_process_result=$(cat $MEM_TMP_FILE | awk 'NR==2' | sed 's/ /, /g')
third_process_name=$(cat $PS_TMP_FILE | awk 'NR==3' | awk -F "/" '{print $NF}' | sed 's/[[:space:]]//g')
third_process_result=$(cat $MEM_TMP_FILE | awk 'NR==3' | sed 's/ /, /g')
read -r -a first_values <<< "${first_process_result}"
read -r -a second_values <<< "${second_process_result}"
read -r -a third_values <<< "${third_process_result}"
metrics_json_start_array
local json="$(cat << EOF
{
"$first_process_name memory": [
$(for ((i=0;i<NUM_CONTAINERS;++i)); do
[ -n "${first_values[i]}" ] &&
printf '%s\n\t\t\t' "${first_values[i]}"
done)
],
"$second_process_name memory": [
$(for ((i=0;i<NUM_CONTAINERS;++i)); do
[ -n "${second_values[i]}" ] &&
printf '%s\n\t\t\t' "${second_values[i]}"
done)
],
"$third_process_name memory": [
$(for ((i=0;i<NUM_CONTAINERS;++i)); do
[ -n "${third_values[i]}" ] &&
printf '%s\n\t\t\t' "${third_values[i]}"
done)
]
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Raw results"
}
# Try to work out the 'average memory footprint' of a container.
get_docker_memory_usage(){
hypervisor_mem=0
virtiofsd_mem=0
shim_mem=0
memory_usage=0
containers=()
for ((i=1; i<= NUM_CONTAINERS; i++)); do
containers+=($(random_name))
${CTR_EXE} run --runtime "${CTR_RUNTIME}" -d ${IMAGE} ${containers[-1]} ${CMD}
done
if [ "$AUTO_MODE" == "auto" ]; then
if (( ksm_on != 1 )); then
die "KSM not enabled, cannot use auto mode"
fi
echo "Entering KSM settle auto detect mode..."
wait_ksm_settle $WAIT_TIME
else
# If KSM is enabled, then you normally want to sleep long enough to
# let it do its work and for the numbers to 'settle'.
echo "napping $WAIT_TIME s"
sleep "$WAIT_TIME"
fi
metrics_json_start_array
	# Check the runtime in order to determine which processes
	# will have their PSS measured
if [ "$RUNTIME" == "runc" ]; then
runc_workload_mem="$(get_runc_pss_memory)"
memory_usage="$runc_workload_mem"
local json="$(cat << EOF
{
"average": {
"Result": $memory_usage,
"Units" : "KB"
},
"runc": {
"Result": $runc_workload_mem,
"Units" : "KB"
}
}
EOF
)"
	elif [ "$RUNTIME" == "kata-runtime" ] || [ "$RUNTIME" == "kata-qemu" ]; then
		# Get the PSS memory of the VM runtime components,
		# and check that the smem search found each process - we get a "0"
		# back if that lookup fails (for instance, if a process has changed
		# its name or is not running when expected to be).
		# As an added bonus, this script must be run as root, so if you
		# do not have enough rights, smem's failure to read the stats
		# is also trapped here.
hypervisor_mem="$(get_pss_memory "$HYPERVISOR_PATH")"
if [ "$hypervisor_mem" == "0" ]; then
die "Failed to find PSS for $HYPERVISOR_PATH"
fi
virtiofsd_mem="$(get_pss_memory_virtiofsd "$VIRTIOFSD_PATH")"
if [ "$virtiofsd_mem" == "0" ]; then
echo >&2 "WARNING: Failed to find PSS for $VIRTIOFSD_PATH"
fi
shim_mem="$(get_pss_memory "$SHIM_PATH")"
if [ "$shim_mem" == "0" ]; then
die "Failed to find PSS for $SHIM_PATH"
fi
		mem_usage="$(bc -l <<< "scale=2; $hypervisor_mem + $virtiofsd_mem + $shim_mem")"
memory_usage="$mem_usage"
local json="$(cat << EOF
{
"average": {
"Result": $mem_usage,
"Units" : "KB"
},
"qemus": {
"Result": $hypervisor_mem,
"Units" : "KB"
},
"virtiofsds": {
"Result": $virtiofsd_mem,
"Units" : "KB"
},
"shims": {
"Result": $shim_mem,
"Units" : "KB"
}
}
EOF
)"
fi
metrics_json_add_array_element "$json"
metrics_json_end_array "Results"
clean_env_ctr
}
save_config(){
metrics_json_start_array
local json="$(cat << EOF
{
"containers": $NUM_CONTAINERS,
"ksm": $ksm_on,
"auto": "$AUTO_MODE",
"waittime": $WAIT_TIME,
"image": "$IMAGE",
"command": "$CMD"
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Config"
}
main(){
# Verify enough arguments
	if [ $# != 2 ] && [ $# != 3 ]; then
		echo >&2 "error: Invalid number of arguments [$@]"
help
exit 1
fi
	# Check for KSM before reporting the test name, as it can modify it
check_for_ksm
init_env
check_cmds "${SMEM_BIN}" bc
check_images "$IMAGE"
if [ "${CTR_RUNTIME}" == "io.containerd.kata.v2" ]; then
export RUNTIME="kata-runtime"
elif [ "${CTR_RUNTIME}" == "io.containerd.runc.v2" ]; then
export RUNTIME="runc"
else
die "Unknown runtime ${CTR_RUNTIME}"
fi
metrics_json_init
save_config
get_docker_memory_usage
if [ "$RUNTIME" == "runc" ]; then
get_runc_individual_memory
elif [ "$RUNTIME" == "kata-runtime" ]; then
get_individual_memory
fi
metrics_json_save
}
main "$@"


@ -0,0 +1,134 @@
#!/bin/bash
# Copyright (c) 2017-2023 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
# Description of the test:
# This test launches a busybox container and measures the free,
# available, and total memory inside it using /proc/meminfo.
set -e
# General env
SCRIPT_PATH=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_PATH}/../lib/common.bash"
TEST_NAME="memory footprint inside container"
VERSIONS_FILE="${SCRIPT_PATH}/../../versions.yaml"
IMAGE='quay.io/prometheus/busybox:latest'
CMD="sleep 10; cat /proc/meminfo"
# We specify here in 'k', as that then matches the results we get from the meminfo,
# which makes later direct comparison easier.
MEMSIZE=${MEMSIZE:-$((2048*1024))}
# This variable sets the maximum number of retries allowed when a test
# result is considered invalid (a zero or negative value).
MAX_FAILED_ATTEMPTS=3
memtotalAvg=0
units_memtotal=""
memfreeAvg=0
units_memfree=""
memavailableAvg=0
units_memavailable=""
# count_iters: is the index of the current iteration
count_iters=0
# valid_result: if value stored is '1' the result is valid, '0' otherwise
valid_result=0
parse_results() {
local raw_results="${1}"
	# Variables used to accumulate values across two or more repetitions,
	# and to compute average results for the 'json' output format.
local memtotal_acu="${2:-0}"
local memfree_acu="${3:-0}"
local memavailable_acu="${4:-0}"
local memtotal=$(echo "$raw_results" | awk '/MemTotal/ {print $2}')
units_memtotal=$(echo "$raw_results" | awk '/MemTotal/ {print $3}')
local memfree=$(echo "$raw_results" | awk '/MemFree/ {print $2}')
units_memfree=$(echo "$raw_results" | awk '/MemFree/ {print $3}')
local memavailable=$(echo "$raw_results" | awk '/MemAvailable/ {print $2}')
units_memavailable=$(echo "$raw_results" | awk '/MemAvailable/ {print $3}')
	# Check results: if any value is zero or negative it is considered invalid, and the test will be repeated.
if (( $(echo "$memtotal <= 0" | bc -l) )) || (( $(echo "$memfree <= 0" | bc -l) )) || (( $(echo "$memavailable <= 0" | bc -l) )); then
MAX_FAILED_ATTEMPTS=$((MAX_FAILED_ATTEMPTS-1))
valid_result=0
info "Skipping invalid result: memtotal: $memtotal memfree: $memfree memavailable: $memavailable"
return 0
fi
memtotalAvg=$((memtotal+memtotal_acu))
memfreeAvg=$((memfree+memfree_acu))
memavailableAvg=$((memavailable+memavailable_acu))
valid_result=1
info "Iteration# $count_iters memtotal: $memtotal memfree: $memfree memavailable: $memavailable"
}
store_results_json() {
metrics_json_start_array
memtotalAvg=$(echo "scale=2; $memtotalAvg / $count_iters" | bc)
memfreeAvg=$(echo "scale=2; $memfreeAvg / $count_iters" | bc)
memavailableAvg=$(echo "scale=2; $memavailableAvg / $count_iters" | bc)
local json="$(cat << EOF
{
"memrequest": {
"Result" : ${MEMSIZE},
		"Units" : "kB"
},
"memtotal": {
"Result" : ${memtotalAvg},
"Units" : "${units_memtotal}"
},
"memfree": {
"Result" : ${memfreeAvg},
"Units" : "${units_memfree}"
},
"memavailable": {
"Result" : ${memavailableAvg},
"Units" : "${units_memavailable}"
},
"repetitions": {
"Result" : ${count_iters}
}
}
EOF
)"
metrics_json_add_array_element "$json"
metrics_json_end_array "Results"
metrics_json_save
}
function main() {
# switch to select output format
local num_iterations=${1:-1}
info "Iterations: $num_iterations"
# Check tools/commands dependencies
cmds=("awk" "ctr")
init_env
check_cmds "${cmds[@]}"
check_images "${IMAGE}"
metrics_json_init
while [ $count_iters -lt $num_iterations ]; do
local output=$(sudo -E "${CTR_EXE}" run --memory-limit $((MEMSIZE*1024)) --rm --runtime=$CTR_RUNTIME $IMAGE busybox sh -c "$CMD" 2>&1)
parse_results "${output}" "${memtotalAvg}" "${memfreeAvg}" "${memavailableAvg}"
# quit if number of attempts exceeds the allowed value.
[ ${MAX_FAILED_ATTEMPTS} -eq 0 ] && die "Max number of attempts exceeded."
[ ${valid_result} -eq 1 ] && count_iters=$((count_iters+1))
done
store_results_json
clean_env_ctr
}
# Parameters
# @1: num_iterations {integer}
main "$@"
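The `/proc/meminfo` parsing done by `parse_results` can be illustrated with a canned sample; the figures below are invented for the example, but the `awk` field extraction is the same as in the script above:

```shell
#!/bin/bash
# Sketch: extract MemTotal/MemFree/MemAvailable values and units from
# meminfo-style text. Sample values are illustrative only.
raw_results="$(cat << 'EOF'
MemTotal:        2097152 kB
MemFree:         1850000 kB
MemAvailable:    1900000 kB
EOF
)"

# Field 2 is the value, field 3 the unit, on each matching line.
memtotal=$(echo "$raw_results" | awk '/MemTotal/ {print $2}')
units_memtotal=$(echo "$raw_results" | awk '/MemTotal/ {print $3}')
memfree=$(echo "$raw_results" | awk '/MemFree/ {print $2}')
memavailable=$(echo "$raw_results" | awk '/MemAvailable/ {print $2}')

echo "total=${memtotal} ${units_memtotal} free=${memfree} available=${memavailable}"
```

Because the pattern match is a plain regex, each label must appear exactly once in the input (as it does in `/proc/meminfo`) for the single-value extraction to hold.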


@ -14,12 +14,11 @@ metrics_dir="$(dirname "$(readlink -f "$0")")"
source "${metrics_dir}/../common.bash"
function create_symbolic_links() {
	local link_configuration_file="/opt/kata/share/defaults/kata-containers/configuration.toml"
	local source_configuration_file="/opt/kata/share/defaults/kata-containers/configuration-${KATA_HYPERVISOR}.toml"

	if [ ${KATA_HYPERVISOR} != 'qemu' ] && [ ${KATA_HYPERVISOR} != 'clh' ]; then
		die "Failed to set the configuration.toml: '${KATA_HYPERVISOR}' is not recognized as a valid hypervisor name."
fi
sudo ln -sf "${source_configuration_file}" "${link_configuration_file}"
@ -45,6 +44,8 @@ EOF
}
function install_kata() {
# ToDo: remove the exit once the metrics workflow is stable
exit 0
local kata_tarball="kata-static.tar.xz"
declare -r katadir="/opt/kata"
declare -r destdir="/"
@ -83,20 +84,39 @@ function check_containerd_config_for_kata() {
}
function run_test_launchtimes() {
	info "Running Launch Time test using ${KATA_HYPERVISOR} hypervisor"

	# ToDo: remove the exit once the metrics workflow is stable
	exit 0

	create_symbolic_links
bash tests/metrics/time/launch_times.sh -i public.ecr.aws/ubuntu/ubuntu:latest -n 20
}
function run_test_memory_usage() {
info "Running memory-usage test using ${KATA_HYPERVISOR} hypervisor"
# ToDo: remove the exit once the metrics workflow is stable
exit 0
create_symbolic_links
bash tests/metrics/density/memory_usage.sh 20 5
}
function run_test_memory_usage_inside_container() {
info "Running memory-usage inside the container test using ${KATA_HYPERVISOR} hypervisor"
# ToDo: remove the exit once the metrics workflow is stable
exit 0
create_symbolic_links
bash tests/metrics/density/memory_usage_inside_container.sh 5
}
function main() {
action="${1:-}"
case "${action}" in
install-kata) install_kata ;;
		run-test-launchtimes) run_test_launchtimes ;;
run-test-memory-usage) run_test_memory_usage ;;
run-test-memory-usage-inside-container) run_test_memory_usage_inside_container ;;
*) >&2 die "Invalid argument" ;;
esac
}
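The action-dispatch pattern used by `main` in `gha-run.sh` can be sketched standalone; the function and action names below are illustrative, not part of the real script:

```shell
#!/bin/bash
# Sketch: dispatch a sub-command name to a handler, as gha-run.sh does,
# with the hypervisor selected via an environment variable.
run_demo() { echo "demo ran with KATA_HYPERVISOR=${KATA_HYPERVISOR:-unset}"; }

dispatch() {
	local action="${1:-}"
	case "${action}" in
		run-demo) run_demo ;;
		*) echo "Invalid argument" >&2; return 1 ;;
	esac
}

KATA_HYPERVISOR=qemu dispatch run-demo   # prints: demo ran with KATA_HYPERVISOR=qemu
dispatch bogus 2>/dev/null || echo "rejected"
```

This mirrors how the workflow matrix drives one job per hypervisor: the job only sets `KATA_HYPERVISOR` and passes an action name, and the script resolves both.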