# Benchmarking logging

Any major change to the logging code, whether in Kubernetes or in klog, must be benchmarked before and after the change.

The recent regression https://github.com/kubernetes/kubernetes/issues/107033 shows that we need a way to automatically measure different logging configurations (structured text, JSON with and without split streams) under realistic conditions (time stamping, caller identification). System calls may affect performance, so writing into actual files is useful. A temp dir under /tmp (usually a tmpfs) is used, so the actual IO bandwidth shouldn't affect the outcome. When actual files are available that can be set as `os.Stderr` and `os.Stdout`, the normal `json.Factory` code is used to construct the JSON logger, which keeps the setup as realistic as possible. When the output is discarded instead of written, the focus is more on the rest of the pipeline, and changes there can be investigated more reliably. The benchmarks automatically gather "log entries per second" and "bytes per second", which is useful when considering requirements like the ones from https://github.com/kubernetes/kubernetes/issues/107029.
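
For orientation, the following is a minimal, hypothetical sketch of such a benchmark. It is not the code in this directory: it wires up a JSON logger directly via zap/zapr instead of going through `json.Factory`, and the benchmark name, message, and key/value pairs are made up. It only illustrates the general approach of writing JSON entries with time stamps and caller information into a file under a temp dir and reporting entries and bytes per second in addition to the usual ns/op:

```go
package benchmark_test

import (
	"os"
	"path/filepath"
	"testing"

	"github.com/go-logr/zapr"
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

// BenchmarkJSONToFile is a made-up example, not one of the benchmarks in
// this directory. It writes JSON log entries into a file under the
// per-benchmark temp dir and reports throughput metrics.
func BenchmarkJSONToFile(b *testing.B) {
	f, err := os.Create(filepath.Join(b.TempDir(), "output.log"))
	if err != nil {
		b.Fatal(err)
	}
	defer f.Close()

	// JSON encoding with production defaults (time stamp, level, message)
	// plus caller identification, roughly comparable to JSON log output.
	core := zapcore.NewCore(
		zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig()),
		zapcore.AddSync(f),
		zapcore.InfoLevel,
	)
	logger := zapr.NewLogger(zap.New(core, zap.AddCaller()))

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		logger.Info("example message", "pod", "kube-system/coredns", "attempt", i)
	}
	b.StopTimer()

	// Derive "log entries per second" and "bytes per second" from the
	// number of iterations and the size of the generated file.
	info, err := f.Stat()
	if err != nil {
		b.Fatal(err)
	}
	elapsed := b.Elapsed().Seconds()
	b.ReportMetric(float64(b.N)/elapsed, "entries/s")
	b.ReportMetric(float64(info.Size())/elapsed, "bytes/s")
}
```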

## Running the benchmark

```
$ go test -bench=. -benchmem .
```

## Real log data

The files under `data` define test cases for specific aspects of formatting. To
test with a log file that represents output under some kind of real load, copy
the log file into `data/<file name>.log` and run the benchmark as described
above. `-bench=BenchmarkLogging/<file name without .log suffix>` can be used to
benchmark just the new file.

When using `data/v<some number>/<file name>.log`, formatting will be done at
that log level. Symlinks can be created to simulate writing the same log data
at different levels.
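
As a rough illustration of why the directory level matters, a hypothetical replay helper might emit each parsed entry through logr's verbosity API; the `replay` name and signature below are made up, not taken from the benchmark code:

```go
package benchmark_test

import (
	"k8s.io/klog/v2"
)

// replay is a hypothetical helper: it emits one previously parsed log entry
// at the verbosity implied by the data directory (for example, data/v5 -> 5).
// If the logger's -v threshold is below that verbosity, the call returns
// without formatting anything, which is the difference that replaying the
// same data at different levels makes visible.
func replay(logger klog.Logger, verbosity int, msg string, keysAndValues ...interface{}) {
	logger.V(verbosity).Info(msg, keysAndValues...)
}
```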

No such real data is included in the Kubernetes repo because of its size. It can be found in the "artifacts" of the https://testgrid.kubernetes.io/sig-instrumentation-tests#kind-json-logging-master Prow job, under:

- `artifacts/logs/kind-control-plane/containers`
- `artifacts/logs/kind-*/kubelet.log`

With sufficient credentials, gsutil can be used to download everything for a job with:

```
gsutil -m cp -R gs://kubernetes-jenkins/logs/ci-kubernetes-kind-e2e-json-logging/<job ID> .
```

## Analyzing log data

While loading a file, some statistics about it are collected. Those are shown when running with:

```
$ go test -v -bench=. -benchmem .
```