mirror of
				https://github.com/k3s-io/kubernetes.git
				synced 2025-10-25 10:00:53 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			162 lines
		
	
	
		
			5.9 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			162 lines
		
	
	
		
			5.9 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| <!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
 | |
| 
 | |
| <!-- BEGIN STRIP_FOR_RELEASE -->
 | |
| 
 | |
| <img src="http://kubernetes.io/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| 
 | |
| <h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
 | |
| 
 | |
| If you are using a released version of Kubernetes, you should
 | |
| refer to the docs that go with that version.
 | |
| 
 | |
| <!-- TAG RELEASE_LINK, added by the munger automatically -->
 | |
| <strong>
 | |
| The latest release of this document can be found
 | |
| [here](http://releases.k8s.io/release-1.2/docs/devel/node-performance-testing.md).
 | |
| 
 | |
| Documentation for other releases can be found at
 | |
| [releases.k8s.io](http://releases.k8s.io).
 | |
| </strong>
 | |
| --
 | |
| 
 | |
| <!-- END STRIP_FOR_RELEASE -->
 | |
| 
 | |
| <!-- END MUNGE: UNVERSIONED_WARNING -->
 | |
| 
 | |
| # Measuring Node Performance
 | |
| 
 | |
| This document outlines the issues and pitfalls of measuring Node performance, as
 | |
| well as the tools available.
 | |
| 
 | |
| ## Cluster Set-up
 | |
| 
 | |
| There are lots of factors which can affect node performance numbers, so care
 | |
| must be taken in setting up the cluster to make the intended measurements. In
 | |
| addition to taking the following steps into consideration, it is important to
 | |
| document precisely which setup was used. For example, performance can vary
 | |
| wildly from commit-to-commit, so it is very important to **document which commit
 | |
| or version** of Kubernetes was used, which Docker version was used, etc.
 | |
| 
 | |
| ### Addon pods
 | |
| 
 | |
| Be aware of which addon pods are running on which nodes. By default Kubernetes
 | |
| runs 8 addon pods, plus another 2 per node (`fluentd-elasticsearch` and
 | |
| `kube-proxy`) in the `kube-system` namespace. The addon pods can be disabled for
 | |
| more consistent results, but doing so can also have performance implications.
 | |
| 
 | |
| For example, Heapster polls each node regularly to collect stats data. Disabling
 | |
| Heapster will hide the performance cost of serving those stats in the Kubelet.
 | |
| 
 | |
| #### Disabling Add-ons
 | |
| 
 | |
| Disabling addons is simple. Just ssh into the Kubernetes master and move the
 | |
| addon from `/etc/kubernetes/addons/` to a backup location. More details
 | |
| [here](../../cluster/addons/).
 | |
| 
 | |
| ### Which / how many pods?
 | |
| 
 | |
| Performance will vary a lot between a node with 0 pods and a node with 100 pods.
 | |
| In many cases you'll want to make measurements with several different amounts of
 | |
| pods. On a single node cluster scaling a replication controller makes this easy,
 | |
| just make sure the system reaches a steady-state before starting the
 | |
| measurement. E.g. `kubectl scale replicationcontroller pause --replicas=100`
 | |
| 
 | |
| In most cases pause pods will yield the most consistent measurements since the
 | |
| system will not be affected by pod load. However, in some special cases
 | |
| Kubernetes has been tuned to optimize pods that are not doing anything, such as
 | |
| the cAdvisor housekeeping (stats gathering). In these cases, performing a very
 | |
| light task (such as a simple network ping) can make a difference.
 | |
| 
 | |
| Finally, you should also consider which features yours pods should be using. For
 | |
| example, if you want to measure performance with probing, you should obviously
 | |
| use pods with liveness or readiness probes configured. Likewise for volumes,
 | |
| number of containers, etc.
 | |
| 
 | |
| ### Other Tips
 | |
| 
 | |
| **Number of nodes** - On the one hand, it can be easier to manage logs, pods,
 | |
| environment etc. with a single node to worry about. On the other hand, having
 | |
| multiple nodes will let you gather more data in parallel for more robust
 | |
| sampling.
 | |
| 
 | |
| ## E2E Performance Test
 | |
| 
 | |
| There is an end-to-end test for collecting overall resource usage of node
 | |
| components: [kubelet_perf.go](../../test/e2e/kubelet_perf.go). To
 | |
| run the test, simply make sure you have an e2e cluster running (`go run
 | |
| hack/e2e.go -up`) and [set up](#cluster-set-up) correctly.
 | |
| 
 | |
| Run the test with `go run hack/e2e.go -v -test
 | |
| --test_args="--ginkgo.focus=resource\susage\stracking"`. You may also wish to
 | |
| customise the number of pods or other parameters of the test (remember to rerun
 | |
| `make WHAT=test/e2e/e2e.test` after you do).
 | |
| 
 | |
| ## Profiling
 | |
| 
 | |
| Kubelet installs the [go pprof handlers]
 | |
| (https://golang.org/pkg/net/http/pprof/), which can be queried for CPU profiles:
 | |
| 
 | |
| ```console
 | |
| $ kubectl proxy &
 | |
| Starting to serve on 127.0.0.1:8001
 | |
| $ curl -G "http://localhost:8001/api/v1/proxy/nodes/${NODE}:10250/debug/pprof/profile?seconds=${DURATION_SECONDS}" > $OUTPUT
 | |
| $ KUBELET_BIN=_output/dockerized/bin/linux/amd64/kubelet
 | |
| $ go tool pprof -web $KUBELET_BIN $OUTPUT
 | |
| ```
 | |
| 
 | |
| `pprof` can also provide heap usage, from the `/debug/pprof/heap` endpoint
 | |
| (e.g. `http://localhost:8001/api/v1/proxy/nodes/${NODE}:10250/debug/pprof/heap`).
 | |
| 
 | |
| More information on go profiling can be found
 | |
| [here](http://blog.golang.org/profiling-go-programs).
 | |
| 
 | |
| ## Benchmarks
 | |
| 
 | |
| Before jumping through all the hoops to measure a live Kubernetes node in a real
 | |
| cluster, it is worth considering whether the data you need can be gathered
 | |
| through a Benchmark test. Go provides a really simple benchmarking mechanism,
 | |
| just add a unit test of the form:
 | |
| 
 | |
| ```go
 | |
| // In foo_test.go
 | |
| func BenchmarkFoo(b *testing.B) {
 | |
|   b.StopTimer()
 | |
|   setupFoo() // Perform any global setup
 | |
|   b.StartTimer()
 | |
|   for i := 0; i < b.N; i++ {
 | |
|     foo() // Functionality to measure
 | |
|   }
 | |
| }
 | |
| ```
 | |
| 
 | |
| Then:
 | |
| 
 | |
| ```console
 | |
| $ go test -bench=. -benchtime=${SECONDS}s foo_test.go
 | |
| ```
 | |
| 
 | |
| More details on benchmarking [here](https://golang.org/pkg/testing/).
 | |
| 
 | |
| ## TODO
 | |
| 
 | |
| - (taotao) Measuring docker performance
 | |
| - Expand cluster set-up section
 | |
| - (vishh) Measuring disk usage
 | |
| - (yujuhong) Measuring memory usage
 | |
| - Add section on monitoring kubelet metrics (e.g. with prometheus)
 | |
| 
 | |
| 
 | |
| 
 | |
| <!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
 | |
| []()
 | |
| <!-- END MUNGE: GENERATED_ANALYTICS -->
 |