From 5a0bc4dd83f204d4fb05ca4242aa814dd9a8ee27 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20Mart=C3=AD?=
Date: Tue, 28 Jul 2015 13:45:36 -0700
Subject: [PATCH] Add compute resource metrics API proposal

---
 .../proposals/compute-resource-metrics-api.md | 177 ++++++++++++++++++
 1 file changed, 177 insertions(+)
 create mode 100644 docs/proposals/compute-resource-metrics-api.md

diff --git a/docs/proposals/compute-resource-metrics-api.md b/docs/proposals/compute-resource-metrics-api.md
new file mode 100644
index 00000000000..472e6a377ad
--- /dev/null
+++ b/docs/proposals/compute-resource-metrics-api.md
@@ -0,0 +1,177 @@
+
+
+
+
+WARNING
+WARNING
+WARNING
+WARNING
+WARNING
+

PLEASE NOTE: This document applies to the HEAD of the source tree

+
+If you are using a released version of Kubernetes, you should
+refer to the docs that go with that version.
+
+
+The latest 1.0.x release of this document can be found
+[here](http://releases.k8s.io/release-1.0/docs/proposals/compute-resource-metrics-api.md).
+
+Documentation for other releases can be found at
+[releases.k8s.io](http://releases.k8s.io).
+
+--
+
+
+
+
+
+# Kubernetes compute resource metrics API
+
+## Goals
+
+Provide resource usage metrics for pods and nodes through the API server. The
+scheduler will use them to improve job placement, utilization, etc., and end
+users will use them to understand the resource utilization of their jobs.
+Horizontal and vertical auto-scaling are also near-term uses.
+
+## Current state
+
+Right now, the Kubelet exports container metrics via an API endpoint. This
+information is neither gathered nor served by the Kubernetes API server.
+
+## Use cases
+
+The first user will be kubectl. The resource usage data can be shown to the
+user via a periodically refreshing interface similar to `top` on Unix-like
+systems. This information could let users assign resource limits more
+efficiently.
+
+```
+$ kubectl top kubernetes-minion-abcd
+POD                        CPU         MEM
+monitoring-heapster-abcde  0.12 cores  302 MB
+kube-ui-v1-nd7in           0.07 cores  130 MB
+```
+
+A second user will be the scheduler. To assign pods to nodes efficiently, the
+scheduler needs to know the current free resources on each node.
+
+## Proposed endpoints
+
+    /api/v1/namespaces/myns/podMetrics/mypod
+    /api/v1/nodeMetrics/myNode
+
+The derived metrics include the mean, max and a few percentiles of the list of
+values.
+
+We are not adding new methods to pods and nodes, e.g.
+`/api/v1/namespaces/myns/pods/mypod/metrics`, for a number of reasons. For
+example, having a separate endpoint allows fetching all the pod metrics in a
+single request. The rate of change of the data is also too high to include in
+the pod resource.
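
As a rough illustration of the endpoint layout proposed above, the sketch below builds the two paths from their parts. The helper names `podMetricsPath` and `nodeMetricsPath` are hypothetical, and the collection path returned for an empty pod name is only an assumption about how "all the pod metrics in a single request" might be addressed; the proposal does not spell that path out.

```go
package main

import (
	"fmt"
	"net/url"
)

// podMetricsPath returns the proposed endpoint path for one pod's metrics.
// ASSUMPTION: an empty pod name yields the namespace-wide collection path;
// the proposal itself only shows the single-pod form.
func podMetricsPath(namespace, pod string) string {
	p := "/api/v1/namespaces/" + url.PathEscape(namespace) + "/podMetrics"
	if pod != "" {
		p += "/" + url.PathEscape(pod)
	}
	return p
}

// nodeMetricsPath returns the proposed endpoint path for one node's metrics.
// Nodes are not namespaced, so the path has no namespace segment.
func nodeMetricsPath(node string) string {
	return "/api/v1/nodeMetrics/" + url.PathEscape(node)
}

func main() {
	fmt.Println(podMetricsPath("myns", "mypod"))
	fmt.Println(podMetricsPath("myns", ""))
	fmt.Println(nodeMetricsPath("myNode"))
}
```

Keeping the metrics under their own resource names, rather than as a sub-path of each pod, is what makes a collection-level request like the second call plausible.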
+
+In the future, if any use cases are found that would benefit from RC
+(replication controller), namespace or service aggregation, metrics at those
+levels could also be exposed, taking advantage of the fact that Heapster
+already aggregates and provides metrics for them.
+
+Initially, this proposal included raw metrics alongside the derived metrics.
+After revising the use cases, it became clear that raw metrics could be left
+out of this proposal. They can be dealt with in a separate proposal, exposing
+them in the Kubelet API via proper versioned endpoints for Heapster to poll
+periodically.
+
+This also means that the amount of data pushed by each Kubelet to the API
+server will be much smaller.
+
+## Data gathering
+
+We will use a push-based system. Each kubelet will periodically (every 10s)
+POST its derived metrics to the API server. Then, any users of the metrics can
+register as watchers to receive the new metrics when they are available.
+
+Users of the metrics may also periodically poll the API server instead of
+registering as a watcher, keeping in mind that new data may only be available
+every 10 seconds. If any user requires metrics that are either more specific
+(e.g. last 1s) or updated more often, they should use the metrics pipeline via
+Heapster.
+
+The API server will not hold any of this data directly. For our initial
+purposes, it will store the most recent metrics obtained from each node in
+etcd. Then, when polled for metrics, the API server will serve only that most
+recent data per node.
+
+Benchmarks will be run with etcd to see if it can keep up with the frequent
+writes of data. If it turns out that etcd doesn't scale well enough, we will
+have to switch to a different storage system.
+
+If a pod gets deleted, the API server will get rid of any metrics it may
+currently be holding for it.
+
+The clients watching the metrics data may cache it for longer periods of time.
+The clearest example would be Heapster.
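
The storage behaviour described in this section (keep only the latest report per node, drop metrics for deleted pods) can be sketched as follows. This is a minimal illustration, not the proposed implementation: an in-memory map stands in for etcd, and `nodeReport`, `latestStore` and their methods are hypothetical names with a deliberately simplified payload.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeReport is a hypothetical, simplified payload for this sketch: the mean
// CPU usage in cores for each pod running on one node.
type nodeReport map[string]float64

// latestStore keeps only the most recent report per node.
// ASSUMPTION: an in-memory map stands in for etcd here.
type latestStore struct {
	mu      sync.Mutex
	reports map[string]nodeReport
}

func newLatestStore() *latestStore {
	return &latestStore{reports: make(map[string]nodeReport)}
}

// Put replaces whatever was previously stored for the node, so older
// reports are never retained.
func (s *latestStore) Put(node string, r nodeReport) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.reports[node] = r
}

// Get returns the most recent report for the node, if one exists.
func (s *latestStore) Get(node string) (nodeReport, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	r, ok := s.reports[node]
	return r, ok
}

// DeletePod drops any metrics currently held for a deleted pod.
func (s *latestStore) DeletePod(pod string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, r := range s.reports {
		delete(r, pod)
	}
}

func main() {
	s := newLatestStore()
	s.Put("node-a", nodeReport{"pod-1": 0.10, "pod-2": 0.30})
	s.Put("node-a", nodeReport{"pod-1": 0.12, "pod-2": 0.25}) // next 10s push
	s.DeletePod("pod-2")
	r, _ := s.Get("node-a")
	fmt.Println(len(r), r["pod-1"])
}
```

Because each push overwrites the previous one, the amount of stored data stays proportional to the number of nodes and pods, not to how long the cluster has been running.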
+
+In the future, we might want to store the metrics differently:
+
+* via Heapster - Since Heapster keeps data for a period of time, the API
+  server could redirect metrics requests to Heapster instead of using etcd.
+  This would also allow serving metrics other than the latest ones.
+
+An edge case that this proposal doesn't take into account is kubelets being
+restarted. With a simple implementation, a restarted kubelet would lose its
+historical data and thus take hours to gather enough information to provide
+relevant metrics again. We might want to use persistent storage, either right
+away or in the future, to improve that situation.
+
+More information on kubelet checkpoints can be found in
+[#489](https://issues.k8s.io/489).
+
+## Data structure
+
+```Go
+type DerivedPodMetrics struct {
+	TypeMeta
+	ObjectMeta // should have pod name
+	// one entry per container in the pod
+	Containers []struct {
+		ContainerReference *Container
+		Metrics            MetricsWindows
+	}
+}
+
+type DerivedNodeMetrics struct {
+	TypeMeta
+	ObjectMeta // should have node name
+	NodeMetrics      MetricsWindows
+	SystemContainers []struct {
+		ContainerReference *Container
+		Metrics            MetricsWindows
+	}
+}
+
+// Last overlapping 10s, 1m, 1h and 1d as a start
+// Updated every 10s, so the 10s window is sequential and the rest are
+// rolling.
+type MetricsWindows map[time.Duration]DerivedMetrics
+
+type DerivedMetrics struct {
+	// End time of all the time windows in Metrics
+	EndTime util.Time `json:"endtime"`
+
+	Mean       ResourceUsage `json:"mean"`
+	Max        ResourceUsage `json:"max"`
+	NinetyFive ResourceUsage `json:"95th"`
+}
+
+type ResourceUsage map[resource.Type]resource.Quantity
+```
+
+
+
+[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/proposals/compute-resource-metrics-api.md?pixel)]()
+