Proposal for pod level security context and backward compatibility

2025-07-23 19:56:01 +00:00 · 2015-09-24 02:06:52 -04:00 · 2015-09-24 02:06:52 -04:00 · 0c70062d67
commit 0c70062d67
parent bbc90cd9ff
1 changed files with 407 additions and 0 deletions
--- a/docs/proposals/pod-security-context.md
+++ b/docs/proposals/pod-security-context.md
@ -0,0 +1,407 @@
+<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
+
+<!-- BEGIN STRIP_FOR_RELEASE -->
+
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
+     width="25" height="25">
+
+<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
+
+If you are using a released version of Kubernetes, you should
+refer to the docs that go with that version.
+
+<strong>
+The latest 1.0.x release of this document can be found
+[here](http://releases.k8s.io/release-1.0/docs/proposals/pod-security-context.md).
+
+Documentation for other releases can be found at
+[releases.k8s.io](http://releases.k8s.io).
+</strong>
+--
+
+<!-- END STRIP_FOR_RELEASE -->
+
+<!-- END MUNGE: UNVERSIONED_WARNING -->
+
+## Abstract
+
+A proposal for refactoring `SecurityContext` to have pod-level and container-level attributes in
+order to correctly model pod- and container-level security concerns.
+
+## Motivation
+
+Currently, containers have a `SecurityContext` attribute which contains information about the
+security settings the container uses.  In practice many of these attributes are uniform across all
+containers in a pod.  Simultaneously, there is also a need to apply the security context pattern
+at the pod level to correctly model security attributes that apply only at a pod level.
+
+Users should be able to:
+
+1.  Express security settings that are applicable to the entire pod
+2.  Express base security settings that apply to all containers
+3.  Override only the settings that need to be differentiated from the base in individual
+    containers
+
+This proposal is a dependency for other changes related to security context:
+
+1.  [Volume ownership management in the Kubelet](https://github.com/kubernetes/kubernetes/pull/12944)
+2.  [Generic SELinux label management in the Kubelet](https://github.com/kubernetes/kubernetes/pull/14192)
+
+Goals of this design:
+
+1.  Describe the use cases for which a pod-level security context is necessary
+2.  Thoroughly describe the API backward compatibility issues that arise from the introduction of
+    a pod-level security context
+3.  Describe all implementation changes necessary for the feature
+
+## Constraints and assumptions
+
+1.  We will not design for intra-pod security; we are not currently concerned about isolating
+    containers in the same pod from one another
+1.  We will design for backward compatibility with the current V1 API
+
+## Use Cases
+
+1.  As a developer, I want to correctly model security attributes which belong to an entire pod
+2.  As a user, I want to be able to specify container attributes that apply to all containers
+    without repeating myself
+3.  As an existing user, I want to be able to use the existing container-level security API
+
+### Use Case: Pod level security attributes
+
+Some security attributes make sense only to model at the pod level.  For example, it is a
+fundamental property of pods that all containers in a pod share the same network namespace.
+Therefore, using the host namespace makes sense to model at the pod level only, and indeed, today
+it is part of the `PodSpec`.  Other host namespace support is currently being added and these will
+also be pod-level settings; it makes sense to model them as a pod-level collection of security
+attributes.
+
+## Use Case: Override pod security context for container
+
+Some use cases require the containers in a pod to run with different security settings.  As an
+example, a user may want to have a pod with two containers, one of which runs as root with the
+privileged setting, and one that runs as a non-root UID.  To support use cases like this, it should
+be possible to override appropriate (ie, not intrinsically pod-level) security settings for
+individual containers.
+
+## Proposed Design
+
+### SecurityContext
+
+For posterity and ease of reading, note the current state of `SecurityContext`:
+
+```go
+package api
+
+type Container struct {
+    // Other fields omitted
+
+    // Optional: SecurityContext defines the security options the pod should be run with
+    SecurityContext *SecurityContext `json:"securityContext,omitempty"`
+}
+
+type SecurityContext struct {
+    // Capabilities are the capabilities to add/drop when running the container
+    Capabilities *Capabilities `json:"capabilities,omitempty"`
+
+    // Run the container in privileged mode
+    Privileged *bool `json:"privileged,omitempty"`
+
+    // SELinuxOptions are the labels to be applied to the container
+    // and volumes
+    SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`
+
+    // RunAsUser is the UID to run the entrypoint of the container process.
+    RunAsUser *int64 `json:"runAsUser,omitempty"`
+
+    // RunAsNonRoot indicates that the container should be run as a non-root user.  If the RunAsUser
+    // field is not explicitly set then the kubelet may check the image for a specified user or
+    // perform defaulting to specify a user.
+    RunAsNonRoot bool `json:"runAsNonRoot,omitempty"`
+}
+
+// SELinuxOptions contains the fields that make up the SELinux context of a container.
+type SELinuxOptions struct {
+    // SELinux user label
+    User string `json:"user,omitempty"`
+
+    // SELinux role label
+    Role string `json:"role,omitempty"`
+
+    // SELinux type label
+    Type string `json:"type,omitempty"`
+
+    // SELinux level label.
+    Level string `json:"level,omitempty"`
+}
+```
+
+### PodSecurityContext
+
+`PodSecurityContext` specifies two types of security attributes:
+
+1.  Attributes that apply to the pod itself
+2.  Attributes that apply to the containers of the pod
+
+In the internal API, fields of the `PodSpec` controlling the use of the host PID, IPC, and network
+namespaces are relocated to this type:
+
+```go
+package api
+
+type PodSpec struct {
+    // Other fields omitted
+
+    // Optional: SecurityContext specifies pod-level attributes and container security attributes
+    // that apply to all containers.
+    SecurityContext *PodSecurityContext `json:"securityContext,omitempty"`
+}
+
+// PodSecurityContext specifies security attributes of the pod and container attributes that apply
+// to all containers of the pod.
+type PodSecurityContext struct {
+    // Use the host's network namespace. If this option is set, the ports that will be
+    // used must be specified.
+    // Optional: Default to false.
+    HostNetwork bool
+    // Use the host's IPC namespace
+    HostIPC bool
+
+    // Use the host's PID namespace
+    HostPID bool
+
+    // Capabilities are the capabilities to add/drop when running containers
+    Capabilities *Capabilities `json:"capabilities,omitempty"`
+
+    // Run the container in privileged mode
+    Privileged *bool `json:"privileged,omitempty"`
+
+    // SELinuxOptions are the labels to be applied to the container
+    // and volumes
+    SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`
+
+    // RunAsUser is the UID to run the entrypoint of the container process.
+    RunAsUser *int64 `json:"runAsUser,omitempty"`
+
+    // RunAsNonRoot indicates that the container should be run as a non-root user.  If the RunAsUser
+    // field is not explicitly set then the kubelet may check the image for a specified user or
+    // perform defaulting to specify a user.
+    RunAsNonRoot bool
+}
+
+// Comments and generated docs will change for the container.SecurityContext field to indicate
+// the precedence of these fields over the pod-level ones.
+
+type Container struct {
+    // Other fields omitted
+
+    // Optional: SecurityContext defines the security options the pod should be run with.
+    // Settings specified in this field take precedence over the settings defined in
+    // pod.Spec.SecurityContext.
+    SecurityContext *SecurityContext `json:"securityContext,omitempty"`
+}
+```
+
+In the V1 API, the pod-level security attributes which are currently fields of the `PodSpec` are
+retained on the `PodSpec` for backward compatibility purposes:
+
+```go
+package v1
+
+type PodSpec struct {
+    // Other fields omitted
+
+    // Use the host's network namespace. If this option is set, the ports that will be
+    // used must be specified.
+    // Optional: Default to false.
+    HostNetwork bool `json:"hostNetwork,omitempty"`
+    // Use the host's pid namespace.
+    // Optional: Default to false.
+    HostPID bool `json:"hostPID,omitempty"`
+    // Use the host's ipc namespace.
+    // Optional: Default to false.
+    HostIPC bool `json:"hostIPC,omitempty"`
+
+    // Optional: SecurityContext specifies pod-level attributes and container security attributes
+    // that apply to all containers.
+    SecurityContext *PodSecurityContext `json:"securityContext,omitempty"`
+}
+```
+
+The `pod.Spec.SecurityContext` specifies the security context of all containers in the pod.
+The containers' `securityContext` field is overlaid on the base security context to determine the
+effective security context for the container.
+
+The new V1 API should be backward compatible with the existing API.  Backward compatibility is
+defined as:
+
+> 1.  Any API call (e.g. a structure POSTed to a REST endpoint) that worked before your change must
+>     work the same after your change.
+> 2.  Any API call that uses your change must not cause problems (e.g. crash or degrade behavior) when
+>     issued against servers that do not include your change.
+> 3.  It must be possible to round-trip your change (convert to different API versions and back) with
+>     no loss of information.
+
+Previous versions of this proposal attempted to deal with backward compatiblity by defining
+the affect of setting the pod-level fields on the container-level fields.  While trying to find
+consensus on this design, it became apparent that this approach was going to be extremely complex
+to implement, explain, and support.  Instead, we will approach backward compatibility as follows:
+
+1.  Pod-level and container-level settings will not affect one another
+2.  Old clients will be able to use container-level settings in the exact same way
+3.  Container level settings always override pod-level settings if they are set
+
+#### Examples
+
+1.  Old client using `pod.Spec.Containers[x].SecurityContext`
+
+    An old client creates a pod:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      containers:
+      - name: a
+        securityContext:
+          runAsUser: 1001
+      - name: b
+        securityContest:
+          runAsUser: 1002
+    ```
+
+    looks to old clients like:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      containers:
+      - name: a
+        securityContext:
+          runAsUser: 1001
+      - name: b
+        securityContext:
+          runAsUser: 1002
+    ```
+
+    looks to new clients like:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      containers:
+      - name: a
+        securityContext:
+          runAsUser: 1001
+      - name: b
+        securityContext:
+          runAsUser: 1002
+    ```
+
+2.  New client using `pod.Spec.SecurityContext`
+
+    A new client creates a pod using a field of `pod.Spec.SecurityContext`:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      securityContext:
+        runAsUser: 1001
+      containers:
+      - name: a
+      - name: b
+    ```
+
+    appears to new clients as:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      securityContext:
+        runAsUser: 1001
+      containers:
+      - name: a
+      - name: b
+    ```
+
+    old clients will see:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      containers:
+      - name: a
+      - name: b
+    ```
+
+3.  Pods created using `pod.Spec.SecurityContext` and `pod.Spec.Containers[x].SecurityContext`
+
+    If a field is set in both `pod.Spec.SecurityContext` and
+    `pod.Spec.Containers[x].SecurityContext`, the value in `pod.Spec.Containers[x].SecurityContext`
+    wins.  In the following pod:
+
+    ```yaml
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: test-pod
+    spec:
+      securityContext:
+        runAsUser: 1001
+      containers:
+      - name: a
+        securityContext:
+          runAsUser: 1002
+      - name: b
+    ```
+
+    The effective setting for `runAsUser` for container A is `1002`.
+
+#### Testing
+
+A backward compatibility test suite will be established for the v1 API.  The test suite will
+verify compatibility by converting objects into the internal API and back to the version API and
+examining the results.
+
+All of the examples here will be used as test-cases.  As more test cases are added, the proposal will
+be updated.
+
+An example of a test like this can be found in the
+[OpenShift API package](https://github.com/openshift/origin/blob/master/pkg/api/compatibility_test.go)
+
+E2E test cases will be added to test the correct determination of the security context for containers.
+
+### Kubelet changes
+
+1.  The Kubelet will use the new fields on the `PodSecurityContext` for host namespace control
+2.  The Kubelet will be modified to correctly implement the backward compatibility and effective
+    security context determination defined here
+
+<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
+[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/proposals/pod-security-context.md?pixel)]()
+<!-- END MUNGE: GENERATED_ANALYTICS -->