🔥 Delete performance_analysis directory (#1252)

parent 8778e5770c
commit 9aeb1fadea

@@ -1,107 +0,0 @@
# Performance analysis

This directory contains tools for analyzing tapper performance.

# Periodic tapper logs

In tapper logs there are some periodic lines that show the tapper's internal state and consumed resources.

Internal state example (formatted and commented):

```
stats - {
    "processedBytes":468940592,          // how many bytes we read from pcap
    "packetsCount":174883,               // how many packets we read from pcap
    "tcpPacketsCount":174883,            // how many tcp packets we read from pcap
    "reassembledTcpPayloadsCount":66893, // how many chunks sent to tcp stream
    "matchedPairs":24821,                // how many request response pairs found
    "droppedTcpStreams":2                // how many tcp streams remained stale and were dropped
}
```

Consumed resources example (formatted and commented):

```
mem: 24441240,   // golang heap size
goroutines: 29,  // how many goroutines
cpu: 91.208791,  // how much cpu the tapper process consumes (in percentage per core)
cores: 16,       // how many cores there are on the machine
rss: 87052288    // how many bytes held by the tapper process
```
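
For a quick look at these values without the plotting util described below, the periodic lines can be pulled out with `grep` (a minimal sketch; it assumes the tapper output was saved to `tapper.log`, as `run_tapper_benchmark.sh` does):

```
grep -o 'cpu: *[0-9.]*' tapper.log         # cpu samples, in order of appearance
grep -o 'rss: *[0-9]*' tapper.log          # resident set size samples
grep -o '"matchedPairs":[0-9]*' tapper.log # matched request response pairs
```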

# Plot tapper logs

In order to plot one or more tapper logs into a graph, use the `plot_from_tapper_logs.py` util.

It gets a list of tapper logs as parameters, and outputs an image with a nice graph.

The log files should be named in the format `XX_DESCRIPTION.log`, where `XX` is a number determining the color of the series in the output graph and `DESCRIPTION` is the name of the series. Logs whose names start with the same number are drawn in the same color, e.g. `01_no_pcap_00.log` and `01_no_pcap_01.log`, which allows easy comparison between various modes.

Example run:

```
cd $KUBESHARK_HOME/performance_analysis
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
python plot_from_tapper_logs.py 00_tapper.log
```
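
The resulting graph is saved as `graph.png` in the current directory (see the `plt.savefig` call at the end of `plot_from_tapper_logs.py` below).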

# Tapper Modes

Every packet seen by the tapper is processed in a pipeline that contains various stages.

* Pcap - Read the packet from libpcap
* Assembler - Assemble the packet into a TcpStream
* TcpStream - Hold stream information and TcpReaders
* Dissectors - Read from TcpReader and recognize the packet content and protocol
* Emit - Marshal the request response pair into a JSON
* Send - Send the JSON to the API Server

The tapper can be run with various debug modes, each selected via a `KUBESHARK_DEBUG_DISABLE_*` environment variable (see the sketch after this list and `run_tapper_benchmark.sh` below):

* No Pcap - Start the tapper process, but don't read any packets from pcap
* No Assembler - Read packets from pcap, but don't assemble them
* No TcpStream - Assemble the packets, but don't create a TcpStream for them
* No Dissectors - Create a TcpStream for the packets, but don't dissect their content
* No Emit - Dissect the TcpStream, but don't emit the matched request response pairs
* No Send - Emit the request response pairs, but don't send them to the API Server
* Regular mode
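
Judging from `run_tapper_benchmark.sh` below, a debug mode is toggled by exporting the corresponding variable before starting the tapper. A minimal sketch of a single manual run in No Assembler mode (agent path and flags copied from that script; `-i lo` assumes Linux):

```
export KUBESHARK_DEBUG_DISABLE_TCP_REASSEMBLY=true   # select No Assembler mode
./agent/build/kubesharkagent --tap --api-server-address ws://localhost:8899/wsTapper \
    -stats 10 -ignore-ports 8899,9099 -i lo > tapper.log 2>&1
```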

![]()

# Run benchmark with various tapper modes

## Prerequisites

In order to run the benchmark you will need:

1. An up and running API Server
2. An up and running Basenine
3. An up and running UI (optional)
4. An up and running test server, like nginx, that can return a known payload at a known endpoint
5. The KUBESHARK_HOME environment variable set to point to the kubeshark directory
6. The `hey` load generation tool installed (see below)
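
`hey` is the HTTP load generator from https://github.com/rakyll/hey; one way to install it, assuming a Go toolchain is available:

```
go install github.com/rakyll/hey@latest
```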

## Running the benchmark

In order to run a benchmark use the `run_tapper_benchmark.sh` script.

Example run:

```
cd $KUBESHARK_HOME/performance_analysis
source venv/bin/activate # Assuming you already ran plot_from_tapper_logs.py
./run_tapper_benchmark.sh
```

Running it without params uses the default values; use the following environment variables for customization:

```
export KUBESHARK_BENCHMARK_OUTPUT_DIR=/path/to/dir      # Output directory for tapper logs and the graph
export KUBESHARK_BENCHMARK_CLIENT_PERIOD=1m             # How long each test runs
export KUBESHARK_BENCHMARK_URL=http://server:port/path  # The URL to use for the benchmark (the test server endpoint)
export KUBESHARK_BENCHMARK_RUN_COUNT=3                  # How many times each tapper mode should run
export KUBESHARK_BENCHMARK_QPS=250                      # How many queries per second each client should send to the test server
export KUBESHARK_BENCHMARK_CLIENTS_COUNT=5              # How many clients should run in parallel during the benchmark
```
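
For example, a quick smoke run with one short iteration per mode (hypothetical values):

```
export KUBESHARK_BENCHMARK_CLIENT_PERIOD=30s
export KUBESHARK_BENCHMARK_RUN_COUNT=1
./run_tapper_benchmark.sh
```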

# Example output graph

An example output graph from a 15 min run with a 15K payload at 1000 QPS:

![]()
Binary file not shown. (Before: 327 KiB)
@@ -1,182 +0,0 @@
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pathlib
import re
import sys
import typing


COLORMAP = plt.get_cmap('turbo')


# Extract cpu and rss samples from log files and plot them
# Input: list of log files
#
# Example:
# python plot_from_tapper_logs.py 01_no_pcap_01.log 99_normal_00.log
#
# The script assumes that the log file names start with a number (pattern '\d+')
# and groups based on this number. Files that start with the same number will be
# plotted with the same color. Change group_pattern to an empty string to disable
# this, or change it to a regex of your liking.


def get_sample(name: str, line: str, default_value: float):
    # Match e.g. "cpu: 91.2" or '"matchedPairs":24821' and return the numeric value.
    pattern = name + r': ?(\d+(\.\d+)?)'
    maybe_sample = re.findall(pattern, line)
    if len(maybe_sample) == 0:
        return default_value

    sample = float(maybe_sample[0][0])
    return sample


def append_sample(name: str, line: str, samples: typing.List[float]):
    # -1 is used as a sentinel for "no sample on this line".
    sample = get_sample(name, line, -1)

    if sample == -1:
        return

    samples.append(sample)


def extract_samples(f: typing.IO) -> typing.Tuple[pd.Series, pd.Series, pd.Series, pd.Series, pd.Series, pd.Series, pd.Series, pd.Series]:
    cpu_samples = []
    rss_samples = []
    count_samples = []
    matched_samples = []
    live_samples = []
    processed_samples = []
    heap_samples = []
    goroutines_samples = []
    for line in f:
        append_sample('cpu', line, cpu_samples)
        append_sample('rss', line, rss_samples)
        ignored_packets_count = get_sample('"ignoredPacketsCount"', line, -1)
        packets_count = get_sample('"packetsCount"', line, -1)
        if ignored_packets_count != -1 and packets_count != -1:
            count_samples.append(packets_count - ignored_packets_count)
        append_sample('"matchedPairs"', line, matched_samples)
        append_sample('"liveTcpStreams"', line, live_samples)
        append_sample('"processedBytes"', line, processed_samples)
        append_sample('heap-alloc', line, heap_samples)
        append_sample('goroutines', line, goroutines_samples)

    cpu_samples = pd.Series(cpu_samples)
    rss_samples = pd.Series(rss_samples)
    count_samples = pd.Series(count_samples)
    matched_samples = pd.Series(matched_samples)
    live_samples = pd.Series(live_samples)
    processed_samples = pd.Series(processed_samples)
    heap_samples = pd.Series(heap_samples)
    goroutines_samples = pd.Series(goroutines_samples)

    return cpu_samples, rss_samples, count_samples, matched_samples, live_samples, processed_samples, heap_samples, goroutines_samples


def plot(ax, df: pd.DataFrame, title: str, xlabel: str, ylabel: str, group_pattern: typing.Optional[str]):
    if group_pattern:
        color = get_group_color(df.columns, group_pattern)
        df.plot(color=color, ax=ax)
    else:
        df.plot(cmap=COLORMAP, ax=ax)

    ax.ticklabel_format(style='plain')
    plt.title(title)
    plt.legend()
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)


def get_group_color(names, pattern):
    # Map each file to the leading number in its name, then assign one color per group.
    props = [int(re.findall(pattern, pathlib.Path(name).name)[0]) for name in names]
    key = dict(zip(sorted(list(set(props))), range(len(set(props)))))
    n_colors = len(key)
    color_options = plt.get_cmap('jet')(np.linspace(0, 1, n_colors))
    groups = [key[prop] for prop in props]
    color = color_options[groups]  # type: ignore
    return color


if __name__ == '__main__':
    filenames = sys.argv[1:]

    cpu_samples_all_files = []
    rss_samples_all_files = []
    count_samples_all_files = []
    matched_samples_all_files = []
    live_samples_all_files = []
    processed_samples_all_files = []
    heap_samples_all_files = []
    goroutines_samples_all_files = []

    for ii, filename in enumerate(filenames):
        print("Analyzing {}".format(filename))
        with open(filename, 'r') as f:
            cpu_samples, rss_samples, count_samples, matched_samples, live_samples, processed_samples, heap_samples, goroutines_samples = extract_samples(f)

        cpu_samples.name = pathlib.Path(filename).name
        rss_samples.name = pathlib.Path(filename).name
        count_samples.name = pathlib.Path(filename).name
        matched_samples.name = pathlib.Path(filename).name
        live_samples.name = pathlib.Path(filename).name
        processed_samples.name = pathlib.Path(filename).name
        heap_samples.name = pathlib.Path(filename).name
        goroutines_samples.name = pathlib.Path(filename).name

        cpu_samples_all_files.append(cpu_samples)
        rss_samples_all_files.append(rss_samples)
        count_samples_all_files.append(count_samples)
        matched_samples_all_files.append(matched_samples)
        live_samples_all_files.append(live_samples)
        processed_samples_all_files.append(processed_samples)
        heap_samples_all_files.append(heap_samples)
        goroutines_samples_all_files.append(goroutines_samples)

    # One column per input file.
    cpu_samples_df = pd.concat(cpu_samples_all_files, axis=1)
    rss_samples_df = pd.concat(rss_samples_all_files, axis=1)
    count_samples_df = pd.concat(count_samples_all_files, axis=1)
    matched_samples_df = pd.concat(matched_samples_all_files, axis=1)
    live_samples_df = pd.concat(live_samples_all_files, axis=1)
    processed_samples_df = pd.concat(processed_samples_all_files, axis=1)
    heap_samples_df = pd.concat(heap_samples_all_files, axis=1)
    goroutines_samples_df = pd.concat(goroutines_samples_all_files, axis=1)

    group_pattern = r'^\d+'

    cpu_plot = plt.subplot(8, 2, 1)
    plot(cpu_plot, cpu_samples_df, 'cpu', '', 'cpu (%)', group_pattern)
    cpu_plot.legend().remove()

    mem_plot = plt.subplot(8, 2, 2)
    plot(mem_plot, (rss_samples_df / 1024 / 1024), 'rss', '', 'mem (mega)', group_pattern)
    mem_plot.legend(loc='center left', bbox_to_anchor=(1, 0.5))

    packets_plot = plt.subplot(8, 2, 3)
    plot(packets_plot, count_samples_df, 'packetsCount', '', 'packetsCount', group_pattern)
    packets_plot.legend().remove()

    matched_plot = plt.subplot(8, 2, 4)
    plot(matched_plot, matched_samples_df, 'matchedCount', '', 'matchedCount', group_pattern)
    matched_plot.legend().remove()

    live_plot = plt.subplot(8, 2, 5)
    plot(live_plot, live_samples_df, 'liveStreamsCount', '', 'liveStreamsCount', group_pattern)
    live_plot.legend().remove()

    processed_plot = plt.subplot(8, 2, 6)
    plot(processed_plot, (processed_samples_df / 1024 / 1024), 'processedBytes', '', 'bytes (mega)', group_pattern)
    processed_plot.legend().remove()

    heap_plot = plt.subplot(8, 2, 7)
    plot(heap_plot, (heap_samples_df / 1024 / 1024), 'heap', '', 'heap (mega)', group_pattern)
    heap_plot.legend().remove()

    goroutines_plot = plt.subplot(8, 2, 8)
    plot(goroutines_plot, goroutines_samples_df, 'goroutines', '', 'goroutines', group_pattern)
    goroutines_plot.legend().remove()

    fig = plt.gcf()
    fig.set_size_inches(20, 18)

    print('Saving graph to graph.png')
    plt.savefig('graph.png', bbox_inches='tight')
@@ -1,2 +0,0 @@
matplotlib
pandas
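
Note that `plot_from_tapper_logs.py` also imports `numpy`, which is pulled in transitively as a dependency of both `matplotlib` and `pandas`, so it is not listed here.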
@@ -1,100 +0,0 @@
#!/bin/bash

[ -z "$KUBESHARK_HOME" ] && { echo "KUBESHARK_HOME is missing"; exit 1; }
[ -z "$KUBESHARK_BENCHMARK_OUTPUT_DIR" ] && export KUBESHARK_BENCHMARK_OUTPUT_DIR="/tmp/kubeshark-benchmark-results-$(date +%d-%m-%H-%M)"
[ -z "$KUBESHARK_BENCHMARK_CLIENT_PERIOD" ] && export KUBESHARK_BENCHMARK_CLIENT_PERIOD="1m"
[ -z "$KUBESHARK_BENCHMARK_URL" ] && export KUBESHARK_BENCHMARK_URL="http://localhost:8081/data/b.1000.json"
[ -z "$KUBESHARK_BENCHMARK_RUN_COUNT" ] && export KUBESHARK_BENCHMARK_RUN_COUNT="3"
[ -z "$KUBESHARK_BENCHMARK_QPS" ] && export KUBESHARK_BENCHMARK_QPS="500"
[ -z "$KUBESHARK_BENCHMARK_CLIENTS_COUNT" ] && export KUBESHARK_BENCHMARK_CLIENTS_COUNT="5"

function log() {
    local message=$@
    printf "[%s] %s\n" "$(date "+%d-%m %H:%M:%S")" "$message"
}

function run_single_bench() {
    local mode_num=$1
    local mode_str=$2

    log "Starting ${mode_num}_${mode_str} (runs: $KUBESHARK_BENCHMARK_RUN_COUNT) (period: $KUBESHARK_BENCHMARK_CLIENT_PERIOD)"

    for ((i=0;i<"$KUBESHARK_BENCHMARK_RUN_COUNT";i++)); do
        log " $i: Running tapper"
        rm -f tapper.log
        tapper_args=("--tap" "--api-server-address" "ws://localhost:8899/wsTapper" "-stats" "10" "-ignore-ports" "8899,9099")
        if [[ $(uname) == "Darwin" ]]
        then
            tapper_args+=("-i" "lo0" "-decoder" "Loopback")
        else
            tapper_args+=("-i" "lo")
        fi
        nohup ./agent/build/kubesharkagent "${tapper_args[@]}" > tapper.log 2>&1 &

        log " $i: Running client (hey)"
        hey -z $KUBESHARK_BENCHMARK_CLIENT_PERIOD -c $KUBESHARK_BENCHMARK_CLIENTS_COUNT -q $KUBESHARK_BENCHMARK_QPS $KUBESHARK_BENCHMARK_URL > /dev/null || return 1

        log " $i: Killing tapper"
        kill -9 $(ps -ef | grep agent/build/kubesharkagent | grep tap | grep -v grep | awk '{ print $2 }') > /dev/null 2>&1

        local output_file=$KUBESHARK_BENCHMARK_OUTPUT_DIR/${mode_num}_${mode_str}_${i}.log
        log " $i: Moving output to $output_file"
        mv tapper.log $output_file || return 1
    done
}

function generate_bench_graph() {
    cd performance_analysis/ || return 1
    source venv/bin/activate
    python plot_from_tapper_logs.py $KUBESHARK_BENCHMARK_OUTPUT_DIR/*.log || return 1
    mv graph.png $KUBESHARK_BENCHMARK_OUTPUT_DIR || return 1
}

mkdir -p $KUBESHARK_BENCHMARK_OUTPUT_DIR
rm -f $KUBESHARK_BENCHMARK_OUTPUT_DIR/*
log "Writing output to $KUBESHARK_BENCHMARK_OUTPUT_DIR"

cd $KUBESHARK_HOME || exit 1

export HOST_MODE=0
export SENSITIVE_DATA_FILTERING_OPTIONS='{}'
export KUBESHARK_DEBUG_DISABLE_PCAP=false
export KUBESHARK_DEBUG_DISABLE_TCP_REASSEMBLY=false
export KUBESHARK_DEBUG_DISABLE_TCP_STREAM=false
export KUBESHARK_DEBUG_DISABLE_NON_HTTP_EXTENSSION=false
export KUBESHARK_DEBUG_DISABLE_DISSECTORS=false
export KUBESHARK_DEBUG_DISABLE_EMITTING=false
export KUBESHARK_DEBUG_DISABLE_SENDING=false

# Each benchmark below enables exactly one debug mode, runs it, and restores the default.
export KUBESHARK_DEBUG_DISABLE_PCAP=true
run_single_bench "01" "no_pcap" || exit 1
export KUBESHARK_DEBUG_DISABLE_PCAP=false

export KUBESHARK_DEBUG_DISABLE_TCP_REASSEMBLY=true
run_single_bench "02" "no_assembler" || exit 1
export KUBESHARK_DEBUG_DISABLE_TCP_REASSEMBLY=false

export KUBESHARK_DEBUG_DISABLE_TCP_STREAM=true
run_single_bench "03" "no_tcp_stream" || exit 1
export KUBESHARK_DEBUG_DISABLE_TCP_STREAM=false

export KUBESHARK_DEBUG_DISABLE_NON_HTTP_EXTENSSION=true
run_single_bench "04" "only_http" || exit 1
export KUBESHARK_DEBUG_DISABLE_NON_HTTP_EXTENSSION=false

export KUBESHARK_DEBUG_DISABLE_DISSECTORS=true
run_single_bench "05" "no_dissectors" || exit 1
export KUBESHARK_DEBUG_DISABLE_DISSECTORS=false

export KUBESHARK_DEBUG_DISABLE_EMITTING=true
run_single_bench "06" "no_emit" || exit 1
export KUBESHARK_DEBUG_DISABLE_EMITTING=false

export KUBESHARK_DEBUG_DISABLE_SENDING=true
run_single_bench "07" "no_send" || exit 1
export KUBESHARK_DEBUG_DISABLE_SENDING=false

run_single_bench "08" "normal" || exit 1

generate_bench_graph || exit 1
log "Output written to $KUBESHARK_BENCHMARK_OUTPUT_DIR"
Binary file not shown. (Before: 259 KiB)