mirror of
				https://github.com/k3s-io/kubernetes.git
				synced 2025-10-30 21:30:16 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			303 lines
		
	
	
		
			9.9 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			303 lines
		
	
	
		
			9.9 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| <!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
 | |
| 
 | |
| <!-- BEGIN STRIP_FOR_RELEASE -->
 | |
| 
 | |
| <img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| <img src="http://kubernetes.io/kubernetes/img/warning.png" alt="WARNING"
 | |
|      width="25" height="25">
 | |
| 
 | |
| <h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
 | |
| 
 | |
| If you are using a released version of Kubernetes, you should
 | |
| refer to the docs that go with that version.
 | |
| 
 | |
| <!-- TAG RELEASE_LINK, added by the munger automatically -->
 | |
| <strong>
 | |
| The latest release of this document can be found
 | |
| [here](http://releases.k8s.io/release-1.4/docs/proposals/volume-selectors.md).
 | |
| 
 | |
| Documentation for other releases can be found at
 | |
| [releases.k8s.io](http://releases.k8s.io).
 | |
| </strong>
 | |
| --
 | |
| 
 | |
| <!-- END STRIP_FOR_RELEASE -->
 | |
| 
 | |
| <!-- END MUNGE: UNVERSIONED_WARNING -->
 | |
| 
 | |
| ## Abstract
 | |
| 
 | |
| Real Kubernetes clusters have a variety of volumes which differ widely in
 | |
| size, iops performance, retention policy, and other characteristics.  A
 | |
| mechanism is needed to enable administrators to describe the taxonomy of these
 | |
| volumes, and for users to make claims on these volumes based on their
 | |
| attributes within this taxonomy.
 | |
| 
 | |
| A label selector mechanism is proposed to enable flexible selection of volumes
 | |
| by persistent volume claims.
 | |
| 
 | |
| ## Motivation
 | |
| 
 | |
| Currently, users of persistent volumes have the ability to make claims on
 | |
| those volumes based on some criteria such as the access modes the volume
 | |
| supports and minimum resources offered by a volume.  In an organization, there
 | |
| are often more complex requirements for the storage volumes needed by
 | |
| different groups of users.  A mechanism is needed to model these different
 | |
| types of volumes and to allow users to select those different types without
 | |
| being intimately familiar with their underlying characteristics.
 | |
| 
 | |
| As an example, many cloud providers offer a range of performance
 | |
| characteristics for storage, with higher performing storage being more
 | |
| expensive.  Cluster administrators want the ability to:
 | |
| 
 | |
| 1.  Invent a taxonomy of logical storage classes using the attributes
 | |
|     important to them
 | |
| 2.  Allow users to make claims on volumes using these attributes
 | |
| 
 | |
| ## Constraints and Assumptions
 | |
| 
 | |
| The proposed design should:
 | |
| 
 | |
| 1.  Deal with manually-created volumes
 | |
| 2.  Not necessarily require users to know or understand the differences between
 | |
|     volumes (ie, Kubernetes should not dictate any particular set of
 | |
|     characteristics to administrators to think in terms of)
 | |
| 
 | |
| We will focus **only** on the barest mechanisms to describe and implement
 | |
| label selectors in this proposal.  We will address the following topics in
 | |
| future proposals:
 | |
| 
 | |
| 1.  An extension resource or third party resource for storage classes
 | |
| 1.  Dynamically provisioning new volumes for based on storage class
 | |
| 
 | |
| ## Use Cases
 | |
| 
 | |
| 1.  As a user, I want to be able to make a claim on a persistent volume by
 | |
|     specifying a label selector as well as the currently available attributes
 | |
| 
 | |
| ### Use Case: Taxonomy of Persistent Volumes
 | |
| 
 | |
| Kubernetes offers volume types for a variety of storage systems.  Within each
 | |
| of those storage systems, there are numerous ways in which volume instances
 | |
| may differ from one another: iops performance, retention policy, etc.
 | |
| Administrators of real clusters typically need to manage a variety of
 | |
| different volumes with different characteristics for different groups of
 | |
| users.
 | |
| 
 | |
| Kubernetes should make it possible for administrators to flexibly model the
 | |
| taxonomy of volumes in their clusters and to label volumes with their storage
 | |
| class.  This capability must be optional and fully backward-compatible with
 | |
| the existing API.
 | |
| 
 | |
| Let's look at an example.  This example is *purely fictitious* and the
 | |
| taxonomies presented here are not a suggestion of any sort.  In the case of
 | |
| AWS EBS there are four different types of volume (in ascending order of cost):
 | |
| 
 | |
| 1.  Cold HDD
 | |
| 2.  Throughput optimized HDD
 | |
| 3.  General purpose SSD
 | |
| 4.  Provisioned IOPS SSD
 | |
| 
 | |
| Currently, there is no way to distinguish between a group of 4 PVs where each
 | |
| volume is of one of these different types.  Administrators need the ability to
 | |
| distinguish between instances of these types.  An administrator might decide
 | |
| to think of these volumes as follows:
 | |
| 
 | |
| 1.  Cold HDD - `tin`
 | |
| 2.  Throughput optimized HDD - `bronze`
 | |
| 3.  General purpose SSD - `silver`
 | |
| 4.  Provisioned IOPS SSD - `gold`
 | |
| 
 | |
| This is not the only dimension that EBS volumes can differ in.  Let's simplify
 | |
| things and imagine that AWS has two availability zones, `east` and `west`. Our
 | |
| administrators want to differentiate between volumes of the same type in these
 | |
| two zones, so they create a taxonomy of volumes like so:
 | |
| 
 | |
| 1.  `tin-west`
 | |
| 2.  `tin-east`
 | |
| 3.  `bronze-west`
 | |
| 4.  `bronze-east`
 | |
| 5.  `silver-west`
 | |
| 6.  `silver-east`
 | |
| 7.  `gold-west`
 | |
| 8.  `gold-east`
 | |
| 
 | |
| Another administrator of the same cluster might label things differently,
 | |
| choosing to focus on the business role of volumes.  Say that the data
 | |
| warehouse department is the sole consumer of the cold HDD type, and the DB as
 | |
| a service offering is the sole consumer of provisioned IOPS volumes.  The
 | |
| administrator might decide on the following taxonomy of volumes:
 | |
| 
 | |
| 1.  `warehouse-east`
 | |
| 2.  `warehouse-west`
 | |
| 3.  `dbaas-east`
 | |
| 4.  `dbaas-west`
 | |
| 
 | |
| There are any number of ways an administrator may choose to distinguish
 | |
| between volumes.  Labels are used in Kubernetes to express the user-defined
 | |
| properties of API objects and are a good fit to express this information for
 | |
| volumes.  In the examples above, administrators might differentiate between
 | |
| the classes of volumes using the labels `business-unit`, `volume-type`, or
 | |
| `region`.
 | |
| 
 | |
| Label selectors are used through the Kubernetes API to describe relationships
 | |
| between API objects using flexible, user-defined criteria.  It makes sense to
 | |
| use the same mechanism with persistent volumes and storage claims to provide
 | |
| the same functionality for these API objects.
 | |
| 
 | |
| ## Proposed Design
 | |
| 
 | |
| We propose that:
 | |
| 
 | |
| 1.  A new field called `Selector` be added to the `PersistentVolumeClaimSpec`
 | |
|     type
 | |
| 2.  The persistent volume controller be modified to account for this selector
 | |
|     when determining the volume to bind to a claim
 | |
| 
 | |
| ### Persistent Volume Selector
 | |
| 
 | |
| Label selectors are used throughout the API to allow users to express
 | |
| relationships in a flexible manner.  The problem of selecting a volume to
 | |
| match a claim fits perfectly within this metaphor.  Adding a label selector to
 | |
| `PersistentVolumeClaimSpec` will allow users to label their volumes with
 | |
| criteria important to them and select volumes based on these criteria.
 | |
| 
 | |
| ```go
 | |
| // PersistentVolumeClaimSpec describes the common attributes of storage devices
 | |
| // and allows a Source for provider-specific attributes
 | |
| type PersistentVolumeClaimSpec struct {
 | |
|     // Contains the types of access modes required
 | |
|     AccessModes []PersistentVolumeAccessMode `json:"accessModes,omitempty"`
 | |
|     // Selector is a selector which must be true for the claim to bind to a volume
 | |
|     Selector *unversioned.Selector `json:"selector,omitempty"`
 | |
|     // Resources represents the minimum resources required
 | |
|     Resources ResourceRequirements `json:"resources,omitempty"`
 | |
|     // VolumeName is the binding reference to the PersistentVolume backing this claim
 | |
|     VolumeName string `json:"volumeName,omitempty"`
 | |
| }
 | |
| ```
 | |
| 
 | |
| ### Labeling volumes
 | |
| 
 | |
| Volumes can already be labeled:
 | |
| 
 | |
| ```yaml
 | |
| apiVersion: v1
 | |
| kind: PersistentVolume
 | |
| metadata:
 | |
|   name: ebs-pv-1
 | |
|   labels:
 | |
|     ebs-volume-type: iops
 | |
|     aws-availability-zone: us-east-1
 | |
| spec:
 | |
|   capacity:
 | |
|     storage: 100Gi
 | |
|   accessModes:
 | |
|     - ReadWriteMany
 | |
|   persistentVolumeReclaimPolicy: Retain
 | |
|   awsElasticBlockStore:
 | |
|     volumeID: vol-12345
 | |
|     fsType: xfs
 | |
| ```
 | |
| 
 | |
| ### Controller Changes
 | |
| 
 | |
| At the time of this writing, the various controllers for persistent volumes
 | |
| are in the process of being refactored into a single controller (see
 | |
| [kubernetes/24331](https://github.com/kubernetes/kubernetes/pull/24331)).
 | |
| 
 | |
| The resulting controller should be modified to use the new
 | |
| `selector` field to match a claim to a volume.  In order to
 | |
| match to a volume, all criteria must be satisfied; ie, if a label selector is
 | |
| specified on a claim, a volume must match both the label selector and any
 | |
| specified access modes and resource requirements to be considered a match.
 | |
| 
 | |
| ## Examples
 | |
| 
 | |
| Let's take a look at a few examples, revisiting the taxonomy of EBS volumes and regions:
 | |
| 
 | |
| Volumes of the different types might be labeled as follows:
 | |
| 
 | |
| ```yaml
 | |
| apiVersion: v1
 | |
| kind: PersistentVolume
 | |
| metadata:
 | |
|   name: ebs-pv-west
 | |
|   labels:
 | |
|     ebs-volume-type: iops-ssd
 | |
|     aws-availability-zone: us-west-1
 | |
| spec:
 | |
|   capacity:
 | |
|     storage: 150Gi
 | |
|   accessModes:
 | |
|     - ReadWriteMany
 | |
|   persistentVolumeReclaimPolicy: Retain
 | |
|   awsElasticBlockStore:
 | |
|     volumeID: vol-23456
 | |
|     fsType: xfs
 | |
| 
 | |
| apiVersion: v1
 | |
| kind: PersistentVolume
 | |
| metadata:
 | |
|   name: ebs-pv-east
 | |
|   labels:
 | |
|     ebs-volume-type: gp-ssd
 | |
|     aws-availability-zone: us-east-1
 | |
| spec:
 | |
|   capacity:
 | |
|     storage: 150Gi
 | |
|   accessModes:
 | |
|     - ReadWriteMany
 | |
|   persistentVolumeReclaimPolicy: Retain
 | |
|   awsElasticBlockStore:
 | |
|     volumeID: vol-34567
 | |
|     fsType: xfs
 | |
| ```
 | |
| 
 | |
| ...claims on these volumes would look like:
 | |
| 
 | |
| ```yaml
 | |
| kind: PersistentVolumeClaim
 | |
| apiVersion: v1
 | |
| metadata:
 | |
|   name: ebs-claim-west
 | |
| spec:
 | |
|   accessModes:
 | |
|     - ReadWriteMany 
 | |
|   resources:
 | |
|     requests:
 | |
|       storage: 1Gi
 | |
|   selector:
 | |
|     matchLabels:
 | |
|       ebs-volume-type: iops-ssd
 | |
|       aws-availability-zone: us-west-1
 | |
| 
 | |
| kind: PersistentVolumeClaim
 | |
| apiVersion: v1
 | |
| metadata:
 | |
|   name: ebs-claim-east
 | |
| spec:
 | |
|   accessModes:
 | |
|     - ReadWriteMany 
 | |
|   resources:
 | |
|     requests:
 | |
|       storage: 1Gi
 | |
|   selector:
 | |
|     matchLabels:
 | |
|       ebs-volume-type: gp-ssd
 | |
|       aws-availability-zone: us-east-1
 | |
| ```
 | |
| 
 | |
| 
 | |
| 
 | |
| <!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
 | |
| []()
 | |
| <!-- END MUNGE: GENERATED_ANALYTICS -->
 |