diff --git a/docs/design/secrets.md b/docs/design/secrets.md new file mode 100644 index 00000000000..6d561eec583 --- /dev/null +++ b/docs/design/secrets.md @@ -0,0 +1,547 @@ +# Secret Distribution + +## Abstract + +A proposal for the distribution of secrets (passwords, keys, etc) to the Kubelet and to +containers inside Kubernetes using a custom volume type. + +## Motivation + +Secrets are needed in containers to access internal resources like the Kubernetes master or +external resources such as git repositories, databases, etc. Users may also want behaviors in the +kubelet that depend on secret data (credentials for image pull from a docker registry) associated +with pods. + +Goals of this design: + +1. Describe a secret resource +2. Define the various challenges attendant to managing secrets on the node +3. Define a mechanism for consuming secrets in containers without modification + +## Constraints and Assumptions + +* This design does not prescribe a method for storing secrets; storage of secrets should be + pluggable to accomodate different use-cases +* Encryption of secret data and node security are orthogonal concerns +* It is assumed that node and master are secure and that compromising their security could also + compromise secrets: + * If a node is compromised, the only secrets that could potentially be exposed should be the + secrets belonging to containers scheduled onto it + * If the master is compromised, all secrets in the cluster may be exposed +* Secret rotation is an orthogonal concern, but it should be facilitated by this proposal + +## Use Cases + +1. As a user, I want to store secret artifacts for my applications and consume them securely in + containers, so that I can keep the configuration for my applications separate from the images + that use them: + 1. As a cluster operator, I want to allow a pod to access the Kubernetes master using a custom + `.kubeconfig` file, so that I can securely reach the master + 2. As a cluster operator, I want to allow a pod to access a Docker registry using credentials + from a `.dockercfg` file, so that containers can push images + 3. As a cluster operator, I want to allow a pod to access a git repository using SSH keys, + so that I can push and fetch to and from the repository +2. As a user, I want to allow containers to consume supplemental information about services such + as username and password which should be kept secret, so that I can share secrets about a + service amongst the containers in my application securely +3. As a user, I want to associate a pod with a `ServiceAccount` that consumes a secret and have + the kubelet implement some reserved behaviors based on the types of secrets the service account + consumes: + 1. Use credentials for a docker registry to pull the pod's docker image + 2. Present kubernetes auth token to the pod or transparently decorate traffic between the pod + and master service +4. As a user, I want to be able to indicate that a secret expires and for that secret's value to + be rotated once it expires, so that the system can help me follow good practices + +### Use-Case: Configuration artifacts + +Many configuration files contain secrets intermixed with other configuration information. For +example, a user's application may contain a properties file than contains database credentials, +SaaS API tokens, etc. Users should be able to consume configuration artifacts in their containers +and be able to control the path on the container's filesystems where the artifact will be +presented. + +### Use-Case: Metadata about services + +Most pieces of information about how to use a service are secrets. For example, a service that +provides a MySQL database needs to provide the username, password, and database name to consumers +so that they can authenticate and use the correct database. Containers in pods consuming the MySQL +service would also consume the secrets associated with the MySQL service. + +### Use-Case: Secrets associated with service accounts + +[Service Accounts](https://github.com/GoogleCloudPlatform/kubernetes/pull/2297) are proposed as a +mechanism to decouple capabilities and security contexts from individual human users. A +`ServiceAccount` contains references to some number of secrets. A `Pod` can specify that it is +associated with a `ServiceAccount`. Secrets should have a `Type` field to allow the Kubelet and +other system components to take action based on the secret's type. + +#### Example: service account consumes auth token secret + +As an example, the service account proposal discusses service accounts consuming secrets which +contain kubernetes auth tokens. When a Kubelet starts a pod associated with a service account +which consumes this type of secret, the Kubelet may take a number of actions: + +1. Expose the secret in a `.kubernetes_auth` file in a well-known location in the container's + file system +2. Configure that node's `kube-proxy` to decorate HTTP requests from that pod to the + `kubernetes-master` service with the auth token, e. g. by adding a header to the request + (see the [LOAS Daemon](https://github.com/GoogleCloudPlatform/kubernetes/issues/2209) proposal) + +#### Example: service account consumes docker registry credentials + +Another example use case is where a pod is associated with a secret containing docker registry +credentials. The Kubelet could use these credentials for the docker pull to retrieve the image. + +### Use-Case: Secret expiry and rotation + +Rotation is considered a good practice for many types of secret data. It should be possible to +express that a secret has an expiry date; this would make it possible to implement a system +component that could regenerate expired secrets. As an example, consider a component that rotates +expired secrets. The rotator could periodically regenerate the values for expired secrets of +common types and update their expiry dates. + +## Deferral: Consuming secrets as environment variables + +Some images will expect to receive configuration items as environment variables instead of files. +We should consider what the best way to allow this is; there are a few different options: + +1. Force the user to adapt files into environment variables. Users can store secrets that need to + be presented as environment variables in a format that is easy to consume from a shell: + + $ cat /etc/secrets/my-secret.txt + export MY_SECRET_ENV=MY_SECRET_VALUE + + The user could `source` the file at `/etc/secrets/my-secret` prior to executing the command for + the image either inline in the command or in an init script, + +2. Give secrets an attribute that allows users to express the intent that the platform should + generate the above syntax in the file used to present a secret. The user could consume these + files in the same manner as the above option. + +3. Give secrets attributes that allow the user to express that the secret should be presented to + the container as an environment variable. The container's environment would contain the + desired values and the software in the container could use them without accomodation the + command or setup script. + +For our initial work, we will treat all secrets as files to narrow the problem space. There will +be a future proposal that handles exposing secrets as environment variables. + +## Flow analysis of secret data with respect to the API server + +There are two fundamentally different use-cases for access to secrets: + +1. CRUD operations on secrets by their owners +2. Read-only access to the secrets needed for a particular node by the kubelet + +### Use-Case: CRUD operations by owners + +In use cases for CRUD operations, the user experience for secrets should be no different than for +other API resources. + +#### Data store backing the REST API + +The data store backing the REST API should be pluggable because different cluster operators will +have different preferences for the central store of secret data. Some possibilities for storage: + +1. An etcd collection alongside the storage for other API resources +2. A collocated [HSM](http://en.wikipedia.org/wiki/Hardware_security_module) +3. An external datastore such as an external etcd, RDBMS, etc. + +#### Size limit for secrets + +There should be a size limit for secrets in order to: + +1. Prevent DOS attacks against the API server +2. Allow kubelet implementations that prevent secret data from touching the node's filesystem + +The size limit should satisfy the following conditions: + +1. Large enough to store common artifact types (encryption keypairs, certificates, small + configuration files) +2. Small enough to avoid large impact on node resource consumption (storage, RAM for tmpfs, etc) + +To begin discussion, we propose an initial value for this size limit of **1MB**. + +#### Other limitations on secrets + +Defining a policy for limitations on how a secret may be referenced by another API resource and how +constraints should be applied throughout the cluster is tricky due to the number of variables +involved: + +1. Should there be a maximum number of secrets a pod can reference via a volume? +2. Should there be a maximum number of secrets a service account can reference? +3. Should there be a total maximum number of secrets a pod can reference via its own spec and its + associated service account? +4. Should there be a total size limit on the amount of secret data consumed by a pod? +5. How will cluster operators want to be able to configure these limits? +6. How will these limits impact API server validations? +7. How will these limits affect scheduling? + +For now, we will not implement validations around these limits. Cluster operators will decide how +much node storage is allocated to secrets. It will be the operator's responsibility to ensure that +the allocated storage is sufficient for the workload scheduled onto a node. + +### Use-Case: Kubelet read of secrets for node + +The use-case where the kubelet reads secrets has several additional requirements: + +1. Kubelets should only be able to receive secret data which is required by pods scheduled onto + the kubelet's node +2. Kubelets should have read-only access to secret data +3. Secret data should not be transmitted over the wire insecurely +4. Kubelets must ensure pods do not have access to each other's secrets + +#### Read of secret data by the Kubelet + +The Kubelet should only be allowed to read secrets which are consumed by pods scheduled onto that +Kubelet's node and their associated service accounts. Authorization of the Kubelet to read this +data would be delegated to an authorization plugin and associated policy rule. + +#### Secret data on the node: data at rest + +Consideration must be given to whether secret data should be allowed to be at rest on the node: + +1. If secret data is not allowed to be at rest, the size of secret data becomes another draw on + the node's RAM - should it affect scheduling? +2. If secret data is allowed to be at rest, should it be encrypted? + 1. If so, how should be this be done? + 2. If not, what threats exist? What types of secret are appropriate to store this way? + +For the sake of limiting complexity, we propose that initially secret data should not be allowed +to be at rest on a node; secret data should be stored on a node-level tmpfs filesystem. This +filesystem can be subdivided into directories for use by the kubelet and by the volume plugin. + +#### Secret data on the node: resource consumption + +The Kubelet will be responsible for creating the per-node tmpfs file system for secret storage. +It is hard to make a prescriptive declaration about how much storage is appropriate to reserve for +secrets because different installations will vary widely in available resources, desired pod to +node density, overcommit policy, and other operation dimensions. That being the case, we propose +for simplicity that the amount of secret storage be controlled by a new parameter to the kubelet +with a default value of **64MB**. It is the cluster operator's responsibility to handle choosing +the right storage size for their installation and configuring their Kubelets correctly. + +Configuring each Kubelet is not the ideal story for operator experience; it is more intuitive that +the cluster-wide storage size be readable from a central configuration store like the one proposed +in [#1553](https://github.com/GoogleCloudPlatform/kubernetes/issues/1553). When such a store +exists, the Kubelet could be modified to read this configuration item from the store. + +When the Kubelet is modified to advertise node resources (as proposed in +[#4441](https://github.com/GoogleCloudPlatform/kubernetes/issues/4441)), the capacity calculation +for available memory should factor in the potential size of the node-level tmpfs in order to avoid +memory overcommit on the node. + +#### Secret data on the node: isolation + +Every pod will have a [security context](https://github.com/GoogleCloudPlatform/kubernetes/pull/3910). +Secret data on the node should be isolated according to the security context of the container. The +Kubelet volume plugin API will be changed so that a volume plugin receives the security context of +a volume along with the volume spec. This will allow volume plugins to implement setting the +security context of volumes they manage. + +## Community work: + +Several proposals / upstream patches are notable as background for this proposal: + +1. [Docker vault proposal](https://github.com/docker/docker/issues/10310) +2. [Specification for image/container standardization based on volumes](https://github.com/docker/docker/issues/9277) +3. [Kubernetes service account proposal](https://github.com/GoogleCloudPlatform/kubernetes/pull/2297) +4. [Secrets proposal for docker (1)](https://github.com/docker/docker/pull/6075) +5. [Secrets proposal for docker (2)](https://github.com/docker/docker/pull/6697) + +## Proposed Design + +We propose a new `Secret` resource which is mounted into containers with a new volume type. Secret +volumes will be handled by a volume plugin that does the actual work of fetching the secret and +storing it. Secrets contain multiple pieces of data that are presented as different files within +the secret volume (example: SSH key pair). + +In order to remove the burden from the end user in specifying every file that a secret consists of, +it should be possible to mount all files provided by a secret with a single ```VolumeMount``` entry +in the container specification. + +### Secret API Resource + +A new resource for secrets will be added to the API: + +```go +type Secret struct { + TypeMeta + ObjectMeta + + // Keys in this map are the paths relative to the volume + // presented to a container for this secret data. + Data map[string][]byte + Type SecretType +} + +type SecretType string + +const ( + SecretTypeOpaque SecretType = "opaque" // Opaque (arbitrary data; default) + SecretTypeKubernetesAuthToken SecretType = "kubernetes-auth" // Kubernetes auth token + SecretTypeDockerRegistryAuth SecretType = "docker-reg-auth" // Docker registry auth + // FUTURE: other type values +) + +const MaxSecretSize = 1 * 1024 * 1024 +``` + +A Secret can declare a type in order to provide type information to system components that work +with secrets. The default type is `opaque`, which represents arbitrary user-owned data. + +Secrets are validated against `MaxSecretSize`. + +A new REST API and registry interface will be added to accompany the `Secret` resource. The +default implementation of the registry will store `Secret` information in etcd. Future registry +implementations could store the `TypeMeta` and `ObjectMeta` fields in etcd and store the secret +data in another data store entirely, or store the whole object in another data store. + +#### Other validations related to secrets + +Initially there will be no validations for the number of secrets a pod references, or the number of +secrets that can be associated with a service account. These may be added in the future as the +finer points of secrets and resource allocation are fleshed out. + +### Secret Volume Source + +A new `SecretSource` type of volume source will be added to the ```VolumeSource``` struct in the +API: + +```go +type VolumeSource struct { + // Other fields omitted + + // SecretSource represents a secret that should be presented in a volume + SecretSource *SecretSource `json:"secret"` +} + +type SecretSource struct { + Target ObjectReference +} +``` + +Secret volume sources are validated to ensure that the specified object reference actually points +to an object of type `Secret`. + +### Secret Volume Plugin + +A new Kubelet volume plugin will be added to handle volumes with a secret source. This plugin will +require access to the API server to retrieve secret data and therefore the volume `Host` interface +will have to change to expose a client interface: + +```go +type Host interface { + // Other methods omitted + + // GetKubeClient returns a client interface + GetKubeClient() client.Interface +} +``` + +The secret volume plugin will be responsible for: + +1. Returning a `volume.Builder` implementation from `NewBuilder` that: + 1. Retrieves the secret data for the volume from the API server + 2. Places the secret data onto the container's filesystem + 3. Sets the correct security attributes for the volume based on the pod's `SecurityContext` +2. Returning a `volume.Cleaner` implementation from `NewClear` that cleans the volume from the + container's filesystem + +### Kubelet: Node-level secret storage + +The Kubelet must be modified to accept a new parameter for the secret storage size and to create +a tmpfs file system of that size to store secret data. Rough accounting of specific changes: + +1. The Kubelet should have a new field added called `secretStorageSize`; units are megabytes +2. `NewMainKubelet` should accept a value for secret storage size +3. The Kubelet server should have a new flag added for secret storage size +4. The Kubelet's `setupDataDirs` method should be changed to create the secret storage + +### Kubelet: New behaviors for secrets associated with service accounts + +For use-cases where the Kubelet's behavior is affected by the secrets associated with a pod's +`ServiceAccount`, the Kubelet will need to be changed. For example, if secrets of type +`docker-reg-auth` affect how the pod's images are pulled, the Kubelet will need to be changed +to accomodate this. Subsequent proposals can address this on a type-by-type basis. + +## Examples + +For clarity, let's examine some detailed examples of some common use-cases in terms of the +suggested changes. All of these examples are assumed to be created in a namespace called +`example`. + +### Use-Case: Pod with ssh keys + +To create a pod that uses an ssh key stored as a secret, we first need to create a secret: + +```json +{ + "apiVersion": "v1beta2", + "kind": "Secret", + "id": "ssh-key-secret", + "data": { + "id_rsa.pub": "dmFsdWUtMQ0K", + "id_rsa": "dmFsdWUtMg0KDQo=" + } +} +``` + +**Note:** The values of secret data are encoded as base64-encoded strings. + +Now we can create a pod which references the secret with the ssh key and consumes it in a volume: + +```json +{ + "id": "secret-test-pod", + "kind": "Pod", + "apiVersion":"v1beta2", + "labels": { + "name": "secret-test" + }, + "desiredState": { + "manifest": { + "version": "v1beta1", + "id": "secret-test-pod", + "containers": [{ + "name": "ssh-test-container", + "image": "mySshImage", + "volumeMounts": [{ + "name": "secret-volume", + "mountPath": "/etc/secret-volume", + "readOnly": true + }] + }], + "volumes": [{ + "name": "secret-volume", + "source": { + "secret": { + "target": { + "kind": "Secret", + "namespace": "example", + "name": "ssh-key-secret" + } + } + } + }] + } + } +} +``` + +When the container's command runs, the pieces of the key will be available in: + + /etc/secret-volume/id_rsa.pub + /etc/secret-volume/id_rsa + +The container is then free to use the secret data to establish an ssh connection. + +### Use-Case: Pods with pod / test credentials + +Let's compare examples where a pod consumes a secret containing prod credentials and another pod +consumes a secret with test environment credentials. + +The secrets: + +```json +[{ + "apiVersion": "v1beta2", + "kind": "Secret", + "id": "prod-db-secret", + "data": { + "username": "dmFsdWUtMQ0K", + "password": "dmFsdWUtMg0KDQo=" + } +}, +{ + "apiVersion": "v1beta2", + "kind": "Secret", + "id": "test-db-secret", + "data": { + "username": "dmFsdWUtMQ0K", + "password": "dmFsdWUtMg0KDQo=" + } +}] +``` + +The pods: + +```json +[{ + "id": "prod-db-client-pod", + "kind": "Pod", + "apiVersion":"v1beta2", + "labels": { + "name": "prod-db-client" + }, + "desiredState": { + "manifest": { + "version": "v1beta1", + "id": "prod-db-pod", + "containers": [{ + "name": "db-client-container", + "image": "myClientImage", + "volumeMounts": [{ + "name": "secret-volume", + "mountPath": "/etc/secret-volume", + "readOnly": true + }] + }], + "volumes": [{ + "name": "secret-volume", + "source": { + "secret": { + "target": { + "kind": "Secret", + "namespace": "example", + "name": "prod-db-secret" + } + } + } + }] + } + } +}, +{ + "id": "test-db-client-pod", + "kind": "Pod", + "apiVersion":"v1beta2", + "labels": { + "name": "test-db-client" + }, + "desiredState": { + "manifest": { + "version": "v1beta1", + "id": "test-db-pod", + "containers": [{ + "name": "db-client-container", + "image": "myClientImage", + "volumeMounts": [{ + "name": "secret-volume", + "mountPath": "/etc/secret-volume", + "readOnly": true + }] + }], + "volumes": [{ + "name": "secret-volume", + "source": { + "secret": { + "target": { + "kind": "Secret", + "namespace": "example", + "name": "test-db-secret" + } + } + } + }] + } + } +}] +``` + +The specs for the two pods differ only in the value of the object referred to by the secret volume +source. Both containers will have the following files present on their filesystems: + + /etc/secret-volume/username + /etc/secret-volume/password