diff --git a/docs/proposals/volume-ownership-management.md b/docs/proposals/volume-ownership-management.md new file mode 100644 index 00000000000..8dd4b8bb0c0 --- /dev/null +++ b/docs/proposals/volume-ownership-management.md @@ -0,0 +1,141 @@ + + + + +WARNING +WARNING +WARNING +WARNING +WARNING + +

PLEASE NOTE: This document applies to the HEAD of the source tree

+ +If you are using a released version of Kubernetes, you should +refer to the docs that go with that version. + + +The latest release of this document can be found +[here](http://releases.k8s.io/release-1.1/docs/proposals/volume-ownership-management.md). + +Documentation for other releases can be found at +[releases.k8s.io](http://releases.k8s.io). + +-- + + + + + +## Volume plugins and idempotency + +Currently, volume plugins have a `SetUp` method which is called in the context of a higher-level +workflow within the kubelet which has externalized the problem of managing the ownership of volumes. +This design has a number of drawbacks that can be mitigated by completely internalizing all concerns +of volume setup behind the volume plugin `SetUp` method. + +### Known issues with current externalized design + +1. The ownership management is currently repeatedly applied, which breaks packages that require + special permissions in order to work correctly +2. There is a gap between files being mounted/created by volume plugins and when their ownership + is set correctly; race conditions exist around this +3. Solving the correct application of ownership management in an externalized model is difficult + and makes it clear that the a transaction boundary is being broken by the externalized design + +### Additional issues with externalization + +Fully externalizing any one concern of volumes is difficult for a number of reasons: + +1. Many types of idempotence checks exist, and are used in a variety of combinations and orders +2. Workflow in the kubelet becomes much more complex to handle: + 1. composition of plugins + 2. correct timing of application of ownership management + 3. callback to volume plugins when we know the whole `SetUp` flow is complete and correct + 4. callback to touch sentinel files + 5. etc etc +3. We want to support fully external volume plugins -- would require complex orchestration / chatty + remote API + +## Proposed implementation + +Since all of the ownership information is known in advance of the call to the volume plugin `SetUp` +method, we can easily internalize these concerns into the volume plugins and pass the ownership +information to `SetUp`. + +The volume `Builder` interface's `SetUp` method changes to accept the group that should own the +volume. Plugins become responsible for ensuring that the correct group is applied. The volume +`Attributes` struct can be modified to remove the `SupportsOwnershipManagement` field. + +```go +package volume + +type Builder interface { + // other methods omitted + + // SetUp prepares and mounts/unpacks the volume to a self-determined + // directory path and returns an error. The group ID that should own the volume + // is passed as a parameter. Plugins may choose to ignore the group ID directive + // in the event that they do not support it (example: NFS). A group ID of -1 + // indicates that the group ownership of the volume should not be modified by the plugin. + // + // SetUp will be called multiple times and should be idempotent. + SetUp(gid int64) error +} +``` + +Each volume plugin will have to change to support the new `SetUp` signature. The existing +ownership management code will be refactored into a library that volume plugins can use: + +``` +package volume + +func ManageOwnership(path string, fsGroup int64) error { + // 1. recursive chown of path + // 2. make path +setgid +} +``` + +The workflow from the Kubelet's perspective for handling volume setup and refresh becomes: + +```go +// go-ish pseudocode +func mountExternalVolumes(pod) error { + podVolumes := make(kubecontainer.VolumeMap) + for i := range pod.Spec.Volumes { + volSpec := &pod.Spec.Volumes[i] + var fsGroup int64 = 0 + if pod.Spec.SecurityContext != nil && + pod.Spec.SecurityContext.FSGroup != nil { + fsGroup = *pod.Spec.SecurityContext.FSGroup + } else { + fsGroup = -1 + } + + // Try to use a plugin for this volume. + plugin := volume.NewSpecFromVolume(volSpec) + builder, err := kl.newVolumeBuilderFromPlugins(plugin, pod) + if err != nil { + return err + } + if builder == nil { + return errUnsupportedVolumeType + } + + err := builder.SetUp(fsGroup) + if err != nil { + return nil + } + } + + return nil +} +``` + + +[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/proposals/volume-ownership-management.md?pixel)]() +