mirror of
https://github.com/k3s-io/kubernetes.git
synced 2025-07-23 19:56:01 +00:00
Merge pull request #25413 from pmorie/storage-proposal
Automatic merge from submit-queue Proposal: persistent volume selector Partially replaces #17056. Another proposal will follow dealing with dynamic provisioning on top of storage classes. @kubernetes/sig-storage
This commit is contained in:
commit
bf0a5e9fac
297
docs/proposals/volume-selectors.md
Normal file
297
docs/proposals/volume-selectors.md
Normal file
@ -0,0 +1,297 @@
|
||||
<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
|
||||
|
||||
<!-- BEGIN STRIP_FOR_RELEASE -->
|
||||
|
||||
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
|
||||
width="25" height="25">
|
||||
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
|
||||
width="25" height="25">
|
||||
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
|
||||
width="25" height="25">
|
||||
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
|
||||
width="25" height="25">
|
||||
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
|
||||
width="25" height="25">
|
||||
|
||||
<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
|
||||
|
||||
If you are using a released version of Kubernetes, you should
|
||||
refer to the docs that go with that version.
|
||||
|
||||
Documentation for other releases can be found at
|
||||
[releases.k8s.io](http://releases.k8s.io).
|
||||
</strong>
|
||||
--
|
||||
|
||||
<!-- END STRIP_FOR_RELEASE -->
|
||||
|
||||
<!-- END MUNGE: UNVERSIONED_WARNING -->
|
||||
|
||||
## Abstract
|
||||
|
||||
Real Kubernetes clusters have a variety of volumes which differ widely in
|
||||
size, iops performance, retention policy, and other characteristics. A
|
||||
mechanism is needed to enable administrators to describe the taxonomy of these
|
||||
volumes, and for users to make claims on these volumes based on their
|
||||
attributes within this taxonomy.
|
||||
|
||||
A label selector mechanism is proposed to enable flexible selection of volumes
|
||||
by persistent volume claims.
|
||||
|
||||
## Motivation
|
||||
|
||||
Currently, users of persistent volumes have the ability to make claims on
|
||||
those volumes based on some criteria such as the access modes the volume
|
||||
supports and minimum resources offered by a volume. In an organization, there
|
||||
are often more complex requirements for the storage volumes needed by
|
||||
different groups of users. A mechanism is needed to model these different
|
||||
types of volumes and to allow users to select those different types without
|
||||
being intimately familiar with their underlying characteristics.
|
||||
|
||||
As an example, many cloud providers offer a range of performance
|
||||
characteristics for storage, with higher performing storage being more
|
||||
expensive. Cluster administrators want the ability to:
|
||||
|
||||
1. Invent a taxonomy of logical storage classes using the attributes
|
||||
important to them
|
||||
2. Allow users to make claims on volumes using these attributes
|
||||
|
||||
## Constraints and Assumptions
|
||||
|
||||
The proposed design should:
|
||||
|
||||
1. Deal with manually-created volumes
|
||||
2. Not necessarily require users to know or understand the differences between
|
||||
volumes (ie, Kubernetes should not dictate any particular set of
|
||||
characteristics to administrators to think in terms of)
|
||||
|
||||
We will focus **only** on the barest mechanisms to describe and implement
|
||||
label selectors in this proposal. We will address the following topics in
|
||||
future proposals:
|
||||
|
||||
1. An extension resource or third party resource for storage classes
|
||||
1. Dynamically provisioning new volumes for based on storage class
|
||||
|
||||
## Use Cases
|
||||
|
||||
1. As a user, I want to be able to make a claim on a persistent volume by
|
||||
specifying a label selector as well as the currently available attributes
|
||||
|
||||
### Use Case: Taxonomy of Persistent Volumes
|
||||
|
||||
Kubernetes offers volume types for a variety of storage systems. Within each
|
||||
of those storage systems, there are numerous ways in which volume instances
|
||||
may differ from one another: iops performance, retention policy, etc.
|
||||
Administrators of real clusters typically need to manage a variety of
|
||||
different volumes with different characteristics for different groups of
|
||||
users.
|
||||
|
||||
Kubernetes should make it possible for administrators to flexibly model the
|
||||
taxonomy of volumes in their clusters and to label volumes with their storage
|
||||
class. This capability must be optional and fully backward-compatible with
|
||||
the existing API.
|
||||
|
||||
Let's look at an example. This example is *purely fictitious* and the
|
||||
taxonomies presented here are not a suggestion of any sort. In the case of
|
||||
AWS EBS there are four different types of volume (in ascending order of cost):
|
||||
|
||||
1. Cold HDD
|
||||
2. Throughput optimized HDD
|
||||
3. General purpose SSD
|
||||
4. Provisioned IOPS SSD
|
||||
|
||||
Currently, there is no way to distinguish between a group of 4 PVs where each
|
||||
volume is of one of these different types. Administrators need the ability to
|
||||
distinguish between instances of these types. An administrator might decide
|
||||
to think of these volumes as follows:
|
||||
|
||||
1. Cold HDD - `tin`
|
||||
2. Throughput optimized HDD - `bronze`
|
||||
3. General purpose SSD - `silver`
|
||||
4. Provisioned IOPS SSD - `gold`
|
||||
|
||||
This is not the only dimension that EBS volumes can differ in. Let's simplify
|
||||
things and imagine that AWS has two availability zones, `east` and `west`. Our
|
||||
administrators want to differentiate between volumes of the same type in these
|
||||
two zones, so they create a taxonomy of volumes like so:
|
||||
|
||||
1. `tin-west`
|
||||
2. `tin-east`
|
||||
3. `bronze-west`
|
||||
4. `bronze-east`
|
||||
5. `silver-west`
|
||||
6. `silver-east`
|
||||
7. `gold-west`
|
||||
8. `gold-east`
|
||||
|
||||
Another administrator of the same cluster might label things differently,
|
||||
choosing to focus on the business role of volumes. Say that the data
|
||||
warehouse department is the sole consumer of the cold HDD type, and the DB as
|
||||
a service offering is the sole consumer of provisioned IOPS volumes. The
|
||||
administrator might decide on the following taxonomy of volumes:
|
||||
|
||||
1. `warehouse-east`
|
||||
2. `warehouse-west`
|
||||
3. `dbaas-east`
|
||||
4. `dbaas-west`
|
||||
|
||||
There are any number of ways an administrator may choose to distinguish
|
||||
between volumes. Labels are used in Kubernetes to express the user-defined
|
||||
properties of API objects and are a good fit to express this information for
|
||||
volumes. In the examples above, administrators might differentiate between
|
||||
the classes of volumes using the labels `business-unit`, `volume-type`, or
|
||||
`region`.
|
||||
|
||||
Label selectors are used through the Kubernetes API to describe relationships
|
||||
between API objects using flexible, user-defined criteria. It makes sense to
|
||||
use the same mechanism with persistent volumes and storage claims to provide
|
||||
the same functionality for these API objects.
|
||||
|
||||
## Proposed Design
|
||||
|
||||
We propose that:
|
||||
|
||||
1. A new field called `Selector` be added to the `PersistentVolumeClaimSpec`
|
||||
type
|
||||
2. The persistent volume controller be modified to account for this selector
|
||||
when determining the volume to bind to a claim
|
||||
|
||||
### Persistent Volume Selector
|
||||
|
||||
Label selectors are used throughout the API to allow users to express
|
||||
relationships in a flexible manner. The problem of selecting a volume to
|
||||
match a claim fits perfectly within this metaphor. Adding a label selector to
|
||||
`PersistentVolumeClaimSpec` will allow users to label their volumes with
|
||||
criteria important to them and select volumes based on these criteria.
|
||||
|
||||
```go
|
||||
// PersistentVolumeClaimSpec describes the common attributes of storage devices
|
||||
// and allows a Source for provider-specific attributes
|
||||
type PersistentVolumeClaimSpec struct {
|
||||
// Contains the types of access modes required
|
||||
AccessModes []PersistentVolumeAccessMode `json:"accessModes,omitempty"`
|
||||
// Selector is a selector which must be true for the claim to bind to a volume
|
||||
Selector *unversioned.Selector `json:"selector,omitempty"`
|
||||
// Resources represents the minimum resources required
|
||||
Resources ResourceRequirements `json:"resources,omitempty"`
|
||||
// VolumeName is the binding reference to the PersistentVolume backing this claim
|
||||
VolumeName string `json:"volumeName,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
### Labeling volumes
|
||||
|
||||
Volumes can already be labeled:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: PersistentVolume
|
||||
metadata:
|
||||
name: ebs-pv-1
|
||||
labels:
|
||||
ebs-volume-type: iops
|
||||
aws-availability-zone: us-east-1
|
||||
spec:
|
||||
capacity:
|
||||
storage: 100Gi
|
||||
accessModes:
|
||||
- ReadWriteMany
|
||||
persistentVolumeReclaimPolicy: Retain
|
||||
awsElasticBlockStore:
|
||||
volumeID: vol-12345
|
||||
fsType: xfs
|
||||
```
|
||||
|
||||
### Controller Changes
|
||||
|
||||
At the time of this writing, the various controllers for persistent volumes
|
||||
are in the process of being refactored into a single controller (see
|
||||
[kubernetes/24331](https://github.com/kubernetes/kubernetes/pull/24331)).
|
||||
|
||||
The resulting controller should be modified to use the new
|
||||
`selector` field to match a claim to a volume. In order to
|
||||
match to a volume, all criteria must be satisfied; ie, if a label selector is
|
||||
specified on a claim, a volume must match both the label selector and any
|
||||
specified access modes and resource requirements to be considered a match.
|
||||
|
||||
## Examples
|
||||
|
||||
Let's take a look at a few examples, revisiting the taxonomy of EBS volumes and regions:
|
||||
|
||||
Volumes of the different types might be labeled as follows:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: PersistentVolume
|
||||
metadata:
|
||||
name: ebs-pv-west
|
||||
labels:
|
||||
ebs-volume-type: iops-ssd
|
||||
aws-availability-zone: us-west-1
|
||||
spec:
|
||||
capacity:
|
||||
storage: 150Gi
|
||||
accessModes:
|
||||
- ReadWriteMany
|
||||
persistentVolumeReclaimPolicy: Retain
|
||||
awsElasticBlockStore:
|
||||
volumeID: vol-23456
|
||||
fsType: xfs
|
||||
|
||||
apiVersion: v1
|
||||
kind: PersistentVolume
|
||||
metadata:
|
||||
name: ebs-pv-east
|
||||
labels:
|
||||
ebs-volume-type: gp-ssd
|
||||
aws-availability-zone: us-east-1
|
||||
spec:
|
||||
capacity:
|
||||
storage: 150Gi
|
||||
accessModes:
|
||||
- ReadWriteMany
|
||||
persistentVolumeReclaimPolicy: Retain
|
||||
awsElasticBlockStore:
|
||||
volumeID: vol-34567
|
||||
fsType: xfs
|
||||
```
|
||||
|
||||
...claims on these volumes would look like:
|
||||
|
||||
```yaml
|
||||
kind: PersistentVolumeClaim
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
name: ebs-claim-west
|
||||
spec:
|
||||
accessModes:
|
||||
- ReadWriteMany
|
||||
resources:
|
||||
requests:
|
||||
storage: 1Gi
|
||||
selector:
|
||||
matchLabels:
|
||||
ebs-volume-type: iops-ssd
|
||||
aws-availability-zone: us-west-1
|
||||
|
||||
kind: PersistentVolumeClaim
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
name: ebs-claim-east
|
||||
spec:
|
||||
accessModes:
|
||||
- ReadWriteMany
|
||||
resources:
|
||||
requests:
|
||||
storage: 1Gi
|
||||
selector:
|
||||
matchLabels:
|
||||
ebs-volume-type: gp-ssd
|
||||
aws-availability-zone: us-east-1
|
||||
```
|
||||
|
||||
|
||||
|
||||
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
|
||||
[]()
|
||||
<!-- END MUNGE: GENERATED_ANALYTICS -->
|
Loading…
Reference in New Issue
Block a user