Removed outdated metrics docs

commit a3b79d840d (parent abd653bd97)
# Custom Metrics in Kubernetes

## Preface

Our aim is to create a mechanism in Kubernetes that will allow pods to expose custom system metrics, collect them, and make them accessible.
Custom metrics are needed by:

* the horizontal pod autoscaler, for autoscaling the number of pods based on them;
* the scheduler, for using them in a more sophisticated scheduling algorithm.

High-level goals for our solution for version 1.2 are:

* easy to use (it should be easy to export a custom metric from a user's application),
* works for most applications (it should be easy to configure monitoring for third-party applications),
* performance & scalability (the largest supported cluster should be able to handle ~5 custom metrics per pod with a reporting latency of ~30 seconds).

For version 1.2, we are not going to address the following issues (non-goals):

* security of access to custom metrics,
* general monitoring of application health by a user
  (out of the heapster scope, see [#665](https://github.com/kubernetes/heapster/issues/665)).

## Design

For Kubernetes version 1.2, we plan to implement aggregation of pod custom metrics in Prometheus format by pull.

To expose custom metrics, each pod will expose a set of Prometheus endpoints.
(For version 1.2, we assume that custom metrics are not private information and are accessible by everyone.
In the future, we may restrict this by making the endpoints accessible only to kubelet/cAdvisor.)
cAdvisor will collect metrics from these endpoints of all pods on each node by pulling, and expose them to Heapster.
Heapster will:

* collect custom metrics from all cAdvisors in the cluster, together with pulling system metrics
  (for version 1.2 we assume a polling period of ~30 seconds),
* store them in a metrics backend (InfluxDB, Prometheus, Hawkular, GCM, …),
* expose the latest snapshot of custom metrics for queries (by the HPA/scheduler/…) using the [model API](https://github.com/kubernetes/heapster/blob/master/docs/model.md).

A user can easily expose Prometheus metrics for their own application by using a Prometheus [client](http://prometheus.io/docs/instrumenting/clientlibs/) library.
To monitor third-party applications, Prometheus [exporters](http://prometheus.io/docs/instrumenting/exporters/) run as sidecar containers may be used.
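
As an illustration, here is a minimal sketch of such application-side instrumentation using the Go Prometheus client library (`client_golang`); the metric name `qps`, the `/status` path, and port 8080 are taken from the annotation example later in this document, not mandated by the proposal:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requests counts handled requests; a rate such as QPS is derived at query time.
var requests = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "qps",
	Help: "Number of requests handled by the application.",
})

func main() {
	prometheus.MustRegister(requests)

	http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
		requests.Inc() // instrument the application code path
		w.Write([]byte("ok"))
	})

	// Expose all registered metrics in Prometheus text format on :8080/status,
	// matching the endpoint configured in the annotation example below.
	http.Handle("/status", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```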

For version 1.2, to prevent a huge number of metrics from negatively affecting system performance,
the number of metrics that can be exposed by each pod will be limited to a configurable value (default: 5).
In the future, we will need a way to cap the number of exposed metrics per pod;
one possible solution is using the LimitRanger admission control plugin.

In future versions (later than 1.2), we want to extend our solution by:

* accepting pod metrics exposed in formats other than Prometheus
  (collection of the different formats will need to be supported by cAdvisor),
* supporting push metrics by exposing a push API on Heapster (e.g. in StatsD format) or on a local node collector
  (if Heapster performance is insufficient),
* supporting metrics not associated with an individual pod.

## API

For Kubernetes 1.2, defining pod Prometheus endpoints will be done using annotations.
Later, when we are sure that our API is correct and stable, we will make it a part of `PodSpec`.

We will add a new optional pod annotation with the following key: `metrics.alpha.kubernetes.io/custom-endpoints`.
It will contain a string value in JSON format.
The value will be a list of tuples defining the port, path and API
(currently we will support only the Prometheus API; this will also be the default value if the format is empty)
of each metrics endpoint exposed by the pod, and the names of the metrics that should be taken from the endpoint (obligatory, no more than the configurable limit).

The annotation will be interpreted by the kubelet during pod creation.
It will not be possible to add/delete/edit it during the lifetime of a pod: such operations will be rejected.

For example, the following configuration:

```
"metrics.alpha.kubernetes.io/custom-endpoints" = [
  {
    "api": "prometheus",
    "path": "/status",
    "port": "8080",
    "names": ["qps", "activeConnections"]
  },
  {
    "path": "/metrics",
    "port": "9090",
    "names": ["myMetric"]
  }
]
```

will expose metrics with the names `qps` and `activeConnections` from `localhost:8080/status` and the metric `myMetric` from `localhost:9090/metrics`.
Please note that both endpoints are in Prometheus format.

## Implementation notes

1. Kubelet will parse the value of the `metrics.alpha.kubernetes.io/custom-endpoints` annotation for pods.
   In case of error, the pod will not be started (it will be marked as failed) and kubelet will generate a `FailedToCreateContainer` event with an appropriate message
   (we will not introduce any new event type, as types of events are considered a part of the kubelet API and we do not want to change it).
   A sketch of such parsing and validation appears after this list.

1. Kubelet will use the application metrics support in cAdvisor for the implementation:
   * It will create a configuration file for cAdvisor based on the annotation,
   * It will mount this file as a part of the docker image to run,
   * It will set a docker label on the image to point cAdvisor to this file.
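
A minimal sketch of the parsing and validation described in item 1 might look like the following; the struct and function names, and the hard-coded limit of 5 metrics per pod, are illustrative assumptions rather than actual kubelet code:

```go
package custommetrics

import (
	"encoding/json"
	"fmt"
)

// customEndpoint mirrors one entry of the metrics.alpha.kubernetes.io/custom-endpoints
// annotation value (field names follow the example above; the type itself is hypothetical).
type customEndpoint struct {
	API   string   `json:"api"`
	Path  string   `json:"path"`
	Port  string   `json:"port"`
	Names []string `json:"names"`
}

const maxCustomMetricsPerPod = 5 // the configurable per-pod limit, default assumed here

// parseCustomEndpoints validates the annotation value; on error the caller would mark
// the pod as failed and emit a FailedToCreateContainer event, as described in item 1.
func parseCustomEndpoints(value string) ([]customEndpoint, error) {
	var endpoints []customEndpoint
	if err := json.Unmarshal([]byte(value), &endpoints); err != nil {
		return nil, fmt.Errorf("malformed custom-endpoints annotation: %v", err)
	}
	total := 0
	for i, ep := range endpoints {
		if ep.API == "" {
			endpoints[i].API = "prometheus" // Prometheus is the default and only supported API
		} else if ep.API != "prometheus" {
			return nil, fmt.Errorf("unsupported metrics API %q", ep.API)
		}
		if ep.Port == "" || ep.Path == "" || len(ep.Names) == 0 {
			return nil, fmt.Errorf("endpoint %d must specify port, path and metric names", i)
		}
		total += len(ep.Names)
	}
	if total > maxCustomMetricsPerPod {
		return nil, fmt.Errorf("pod exposes %d custom metrics, limit is %d", total, maxCustomMetricsPerPod)
	}
	return endpoints, nil
}
```

The validated endpoints would then be translated into the cAdvisor application-metrics configuration file mentioned in item 2.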
# Resource Usage Metrics plumbing in Kubernetes

**Author**: Vishnu Kannan (@vishh)

**Status**: Draft proposal; some parts are already implemented

*This document presents a design for handling container metrics in Kubernetes clusters.*

## Motivation

Resource usage metrics are critical for various reasons:

* Monitor and maintain the health of the cluster and user applications.
* Improve the efficiency of the cluster by making more optimal scheduling decisions and enabling components like auto-scalers.

There are multiple types of metrics that describe the state of a container.
Numerous strategies exist to aggregate these metrics from containers.
There are a variety of storage backends that can handle metrics.

This document presents a design to abstract out collection and storage backends, and provide stable Kubernetes APIs that can be consumed by users and other cluster components.

## Introduction

Container metrics can be of two types:

1. `Compute resource metrics` refer to the compute resources being used by a container. Ex.: CPU, memory, network, file system.
2. `Service metrics` refer to container application specific metrics. Ex.: QPS, query latency, etc.

Metrics can be collected either for cluster components or for user containers.

[cAdvisor](https://github.com/google/cadvisor) is a node level container metrics aggregator that is built into the kubelet. cAdvisor can collect both types of metrics, although the support for service metrics is limited at this point. cAdvisor collects metrics for both system components and user containers.

[heapster](https://github.com/kubernetes/heapster) is a cluster level metrics aggregator that runs by default on most Kubernetes clusters. Heapster aggregates all the metrics exposed by cAdvisor from the nodes. Heapster has a pluggable storage backend. It supports the following timeseries storage backends: InfluxDB, Google Cloud Monitoring and Hawkular.
Heapster builds a model of the cluster and can aggregate metrics across pods, nodes, namespaces and the entire cluster. It exposes this data via [REST endpoints](https://github.com/kubernetes/heapster/blob/master/docs/model.md#api-documentation).
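
As an illustration of how a client might consume this data, here is a minimal sketch that queries the model API over plain HTTP; the service address and the exact metric path are assumptions based on the linked model documentation, not guarantees of this proposal:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Assumed in-cluster address of the Heapster service and a model API path of the
	// form /api/v1/model/namespaces/{ns}/pods/{pod}/metrics/{metric}; see the linked
	// model documentation for the authoritative routes.
	url := "http://heapster.kube-system/api/v1/model/namespaces/default/pods/my-pod/metrics/cpu-usage"

	resp, err := http.Get(url)
	if err != nil {
		log.Fatalf("querying heapster model API: %v", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("reading response: %v", err)
	}
	// The model API returns a JSON list of timestamped metric points for the pod.
	fmt.Println(string(body))
}
```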

Metrics data will be consumed by many different clients: the scheduler, horizontal and vertical pod auto-scalers, the initial pod limits controller, kubectl, web consoles, cluster management software, etc.

Storage backends can be shared for both monitoring of clusters and powering advanced cluster features.

## Goals

* Abstract out timeseries storage backends from Kubernetes components.
* Provide stable Kubernetes Metrics APIs that other components can consume.

### Non Goals

* Requiring users to run a specific storage backend.
* Compatibility with other node level metrics aggregators. cAdvisor should be able to provide all the metrics.
* Support for service metrics at the cluster level is out of scope for this document.
  Once the use cases for service metrics, other than monitoring, are clear, we can explore adding support for them.
## Design

The basic idea is to evolve heapster to serve Metrics APIs which can then be consumed by other cluster components.
Heapster will be run in all clusters by default. Heapster's memory usage is proportional to the number of containers in the cluster, so it should be possible to run heapster by default even on small development or test clusters.
A cluster administrator will have to either run one of the supported storage backends or write a new storage plugin in heapster to support custom storage backends; a sketch of such a plugin interface follows this paragraph.
Heapster will manage versioning and the storage schema for the various storage backends it supports.
Heapster APIs will be exposed as Kubernetes APIs once the apiserver supports [dynamic API plugins](https://github.com/kubernetes/kubernetes/issues/991).
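
To make the storage-plugin idea concrete, a sink interface for custom backends could look like the sketch below; the type and method names are illustrative assumptions and do not correspond to heapster's actual plugin API:

```go
package sinks

import "time"

// MetricPoint is a single timestamped sample for one container-level metric.
// (Illustrative only; heapster's real data model is richer.)
type MetricPoint struct {
	Namespace string
	Pod       string
	Container string
	Name      string // e.g. a compute resource metric or a custom metric name
	Labels    map[string]string
	Timestamp time.Time
	Value     float64
}

// Sink is what a custom storage backend would implement. Heapster would call
// ExportMetrics on every collection cycle and Stop on shutdown.
type Sink interface {
	Name() string
	ExportMetrics(batch []MetricPoint) error
	Stop()
}
```

Each supported backend (InfluxDB, Google Cloud Monitoring, Hawkular) would then be one implementation of such an interface, selected by configuration.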

Heapster stores a day's worth of historical metrics. Heapster will fetch data from storage backends on demand to serve metrics that are older than a day. Setting [initial pod resources](initial-resources.md) requires access to metrics from the past 30 days.

To make heapster APIs compatible with Kubernetes API requirements, heapster will have to incorporate the API server library. Until that is possible, we will run a secondary API server binary that supports the metrics APIs being consumed by other components. The initial plan is to use etcd to store the most recent metrics. Eventually, we would like to get rid of etcd for metrics and make heapster act as a backend to the api-server.

This is the current plan for supporting the node and pod metrics API as described in this [proposal](resource-metrics-api.md).

There will be proposals in the future for adding more heapster metrics APIs to Kubernetes.
## Implementation plan

Heapster has a built-in model of the cluster and can expose the average, 95th percentile and max of compute resource metrics for containers, pods, nodes, namespaces and the entire cluster.
However, the existing APIs are not suitable for Kubernetes components.
The metrics are stored in a rolling window. Adding support for other percentiles should be straightforward (a sketch of such window aggregation appears after the work items below).
Heapster is currently stateless, so it will lose its history upon restart.
Some of the specific work items include:

1. Improve the existing API schema to be Kubernetes compatible ([related issue](https://github.com/kubernetes/heapster/issues/476)).
2. Add support for fetching historical data from storage backends.
3. Fetch historical metrics from storage backends upon restart to pre-populate the internal model.
4. Add support for image based aggregation.
5. Add support for label queries.
6. Expose heapster APIs via a Kubernetes service until the primary API server can handle plugins.
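
The following is a minimal sketch of the kind of window aggregation mentioned above (average, max and an arbitrary percentile over the samples held in a rolling window); it is illustrative only and not heapster's actual implementation:

```go
package aggregation

import (
	"math"
	"sort"
)

// Aggregate summarizes the samples currently held in a rolling window.
// percentile is given in [0, 100], e.g. 95 for the 95th percentile.
func Aggregate(window []float64, percentile float64) (avg, max, pct float64) {
	if len(window) == 0 {
		return 0, 0, 0
	}
	sorted := append([]float64(nil), window...)
	sort.Float64s(sorted)

	sum := 0.0
	for _, v := range sorted {
		sum += v
	}
	avg = sum / float64(len(sorted))
	max = sorted[len(sorted)-1]

	// Nearest-rank percentile over the sorted samples.
	rank := int(math.Ceil(percentile/100*float64(len(sorted)))) - 1
	if rank < 0 {
		rank = 0
	}
	pct = sorted[rank]
	return avg, max, pct
}
```

Supporting an additional percentile is then just another call with a different `percentile` argument.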
### Known issues

* Running other metrics aggregators

  An example here would be running collectd in place of cAdvisor and storing metrics in a custom database, or running Prometheus. We can let cluster admins run their own aggregation and storage stack as long as the storage backend is supported in heapster and the storage schema is versioned. Compatibility can be guaranteed by explicitly specifying the versions of the different components that are supported in a specific Kubernetes release.

* Heapster scalability

  Heapster's resource utilization is proportional to the number of containers running in the cluster. A fair amount of effort has gone into optimizing heapster's memory usage. As our cluster size increases, we can shard heapster. We believe the existing heapster design should scale to fairly large clusters with a reasonable amount of compute resources.

### How can you contribute?

We are tracking heapster work items using [milestones](https://github.com/kubernetes/heapster/milestones) in the heapster repo.