- [GPU support](#gpu-support)
  - [Objective](#objective)
  - [Background](#background)
  - [Detailed discussion](#detailed-discussion)
    - [Inventory](#inventory)
    - [Scheduling](#scheduling)
    - [The runtime](#the-runtime)
      - [NVIDIA support](#nvidia-support)
    - [Event flow](#event-flow)
    - [Too complex for now: nvidia-docker](#too-complex-for-now-nvidia-docker)
  - [Implementation plan](#implementation-plan)
    - [V0](#v0)
      - [Scheduling](#scheduling-1)
      - [Runtime](#runtime)
      - [Other](#other)
  - [Future work](#future-work)
    - [V1](#v1)
    - [V2](#v2)
    - [V3](#v3)
    - [Undetermined](#undetermined)
  - [Security considerations](#security-considerations)

# GPU support

Author: @therc

Date: Apr 2016

Status: Design in progress, early implementation of requirements

## Objective

Users should be able to request GPU resources for their workloads, as easily as
for CPU or memory. Kubernetes should keep an inventory of machines with GPU
hardware, schedule containers on appropriate nodes and set up the container
environment with everything necessary to access the GPU. All of this should
eventually be supported for clusters on either bare metal or cloud providers.

## Background

An increasing number of workloads, such as machine learning and seismic survey
processing, benefit from offloading computations to graphics hardware. While
not as tuned as traditional, dedicated high-performance computing systems such
as MPI clusters, a Kubernetes cluster can still be a great environment for
organizations that also need to run a variety of additional, "classic"
workloads, such as databases, web serving, etc.

GPU support is hard to provide comprehensively and will thus take time to tame
completely, because:

- different vendors expose the hardware to users in different ways
- some vendors require fairly tight coupling between the kernel driver
controlling the GPU and the libraries/applications that access the hardware
- it adds more resource types (whole GPUs, GPU cores, GPU memory)
- it can introduce new security pitfalls
- for systems with multiple GPUs, affinity matters, similarly to NUMA
considerations for CPUs
- running GPU code in containers is still a relatively novel idea

## Detailed discussion

Currently, this document focuses mostly on the basic use case: run GPU code on
AWS `g2.2xlarge` EC2 machine instances using Docker. This scenario is narrow
enough that it does not require large amounts of generic code yet. GCE doesn't
support GPUs at all; bare metal systems throw a lot of extra variables into the
mix.

Later sections will outline future work to support a broader set of hardware,
environments and container runtimes.

### Inventory

Before any scheduling can occur, we need to know what's available out there. In
v0, the kubelet will not perform real hardware detection; capacity will be
hardcoded based on a flag, `--experimental-nvidia-gpu`. This will result in the
user-defined resource `alpha.kubernetes.io/nvidia-gpu` being reported in
`NodeCapacity` and `NodeAllocatable`, as well as in a node label.
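
To make the flag's effect concrete, here is a minimal sketch of the
kubelet-side reporting, written against the in-tree `pkg/api` types; the helper
name and the label value are placeholders, not settled API:

```go
package kubelet

import (
	"k8s.io/kubernetes/pkg/api"
	"k8s.io/kubernetes/pkg/api/resource"
)

// ResourceNvidiaGPU is the alpha resource name proposed in this document.
const ResourceNvidiaGPU api.ResourceName = "alpha.kubernetes.io/nvidia-gpu"

// setNvidiaGPUCapacity sketches how the kubelet could advertise a single whole
// GPU when --experimental-nvidia-gpu is set. The function name and the label
// value are illustrative assumptions.
func setNvidiaGPUCapacity(node *api.Node, enabled bool) {
	if !enabled {
		return
	}
	if node.Status.Capacity == nil {
		node.Status.Capacity = api.ResourceList{}
	}
	if node.Status.Allocatable == nil {
		node.Status.Allocatable = api.ResourceList{}
	}
	one := *resource.NewQuantity(1, resource.DecimalSI) // v0: one whole device
	node.Status.Capacity[ResourceNvidiaGPU] = one
	node.Status.Allocatable[ResourceNvidiaGPU] = one
	if node.Labels == nil {
		node.Labels = map[string]string{}
	}
	node.Labels[string(ResourceNvidiaGPU)] = "1"
}
```
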


### Scheduling

GPUs will be visible as first-class resources. In v0, we'll only assign whole
devices; sharing among multiple pods is left to future implementations. It's
probable that GPUs will exacerbate the need for [a rescheduler](rescheduler.md)
or pod priorities, especially if the nodes in a cluster are not homogeneous.
Consider these two cases:

> Only half of the machines have a GPU and they're all busy with other
workloads. The other half of the cluster is doing very little work. A GPU
workload arrives, but it can't be scheduled: the devices are sitting idle on
nodes that are running something else, while the lightly loaded nodes lack the
hardware.

> Some or all of the machines have two graphics cards each. A number of jobs
get scheduled, requesting one device per pod. The scheduler puts them all on
different machines, spreading the load, perhaps by design. Then a new job comes
in that requires two devices per pod, but it can't be scheduled anywhere,
because the most any node has to offer is one unused device.
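
Whatever the eventual answer to those two cases, v0 only needs a whole-device
fit check. The sketch below illustrates the idea with made-up names; the real
check would plug into the scheduler's existing resource-fit predicates:

```go
package predicates

// podFitsNvidiaGPU is a sketch of a whole-device fit check for v0; the name
// and signature are illustrative, not the scheduler's actual predicate API.
// requested is the number of whole GPUs the pod asks for, inUse is what other
// pods on the node already claim, allocatable is the node's reported total.
func podFitsNvidiaGPU(requested, inUse, allocatable int64) bool {
	if requested == 0 {
		return true // non-GPU pods are unaffected
	}
	return inUse+requested <= allocatable
}
```
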
### The runtime

Once we know where to run the container, it's time to set up its environment.
At a minimum, we'll need to map the host device(s) into the container. Because
each manufacturer exposes different device nodes (`/dev/ati/card0`,
`/dev/nvidia0`, but also the required `/dev/nvidiactl` and `/dev/nvidia-uvm`),
some of the logic needs to be hardware-specific, mapping from a logical device
to the list of device nodes necessary for software to talk to it.

Support binaries and libraries are often versioned along with the kernel
module, so there should be further hooks to project those under `/bin` and some
kind of `/lib` before the application is started. This can be done for Docker
with the use of a versioned [Docker
volume](https://docs.docker.com/engine/userguide/containers/dockervolumes/) or
with upcoming Kubernetes-specific hooks such as init containers and volume
containers. In v0, images are expected to bundle everything they need.

#### NVIDIA support

The first implementation and testing ground will be for NVIDIA devices, by far
the most common setup.

In v0, the `--experimental-nvidia-gpu` flag will also result in the host
devices (limited to those required to drive the first card, `nvidia0`) being
mapped into the container by the dockertools library.
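
As a concrete illustration, the hardware-specific expansion for NVIDIA could
look like the sketch below, which uses the `DeviceMapping` type from Docker's
engine-api; the helper itself and the cgroup permission string are assumptions:

```go
package dockertools

import "github.com/docker/engine-api/types/container"

// nvidiaDeviceMappings sketches the expansion from a logical GPU to the device
// nodes Docker must map. Only the engine-api DeviceMapping type is real; the
// helper and the permission string are illustrative.
func nvidiaDeviceMappings() []container.DeviceMapping {
	// nvidiactl and nvidia-uvm are always required; nvidia0 is the first card.
	paths := []string{"/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia0"}
	mappings := make([]container.DeviceMapping, 0, len(paths))
	for _, path := range paths {
		mappings = append(mappings, container.DeviceMapping{
			PathOnHost:        path,
			PathInContainer:   path,
			CgroupPermissions: "mrw", // mknod, read, write
		})
	}
	return mappings
}
```

The resulting list would then end up in the container's `Resources.Devices`, as
described in the event flow below.
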
### Event flow

This is what happens before and after a user schedules a GPU pod.

1. Administrator installs a number of Kubernetes nodes with GPUs. The correct
kernel modules and device nodes under `/dev/` are present.

1. Administrator makes sure the latest CUDA/driver versions are installed.

1. Administrator enables `--experimental-nvidia-gpu` on kubelets.

1. Kubelets update node status with information about the GPU device, in
addition to cAdvisor's usual data about CPU/memory/disk.

1. User creates a Docker image compiling their application for CUDA, bundling
the necessary libraries. For now, we ignore any versioning requirements that
the image declares through labels based on [NVIDIA's
conventions](https://github.com/NVIDIA/nvidia-docker/blob/64510511e3fd0d00168eb076623854b0fcf1507d/tools/src/nvidia-docker/utils.go#L13).

1. User creates a pod using the image, requiring
`alpha.kubernetes.io/nvidia-gpu: 1` (see the sketch after this list).

1. Scheduler picks a node for the pod.

1. The kubelet notices the GPU requirement and maps the three devices. In
Docker's engine-api, this means it'll add them to the `Resources.Devices` list.

1. Docker runs the container to completion.

1. The scheduler notices that the device is available again.
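
For step 6 above, the request could look like the following sketch, written
against the in-tree API types; the pod name and image are made up, and whether
the amount belongs in limits, requests or both is still to be decided:

```go
package example

import (
	"k8s.io/kubernetes/pkg/api"
	"k8s.io/kubernetes/pkg/api/resource"
)

// gpuPod sketches the pod from step 6: one container asking for a whole GPU.
// Only the resource name comes from this proposal; everything else is an
// illustrative example.
func gpuPod() *api.Pod {
	return &api.Pod{
		ObjectMeta: api.ObjectMeta{Name: "cuda-job"},
		Spec: api.PodSpec{
			Containers: []api.Container{{
				Name:  "main",
				Image: "example.com/cuda-app:latest", // hypothetical image
				Resources: api.ResourceRequirements{
					Limits: api.ResourceList{
						"alpha.kubernetes.io/nvidia-gpu": resource.MustParse("1"),
					},
				},
			}},
		},
	}
}
```
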
### Too complex for now: nvidia-docker

For v0, we discussed the [nvidia-docker
plugin](https://github.com/NVIDIA/nvidia-docker) at length, but decided to
leave it aside initially. The plugin is an officially supported solution that
would avoid a lot of new low-level code, as it takes care of functionality such
as:

- creating a Docker volume with binaries such as `nvidia-smi` and shared
libraries
- providing HTTP endpoints that monitoring tools can use to collect GPU metrics
- abstracting details such as `/dev` entry names for each device, as well as
control devices like `nvidiactl`

The `nvidia-docker` wrapper also verifies that the CUDA version required by a
given image is supported by the host drivers, through inspection of well-known
image labels, if present. We should try to provide equivalent checks, either
for CUDA or OpenCL.

This is current sample output from `nvidia-docker-plugin`, wrapped for
readability:

    $ curl -s localhost:3476/docker/cli
    --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0
    --volume-driver=nvidia-docker
    --volume=nvidia_driver_352.68:/usr/local/nvidia:ro

It runs as a daemon listening for HTTP requests on port 3476. The endpoint
above returns flags that need to be added to the Docker command line in order
to expose GPUs to the containers. There are optional URL arguments to request
specific devices if more than one is present on the system, as well as specific
versions of the support software. An obvious improvement would be an additional
endpoint for JSON output.

The unresolved question is whether `nvidia-docker-plugin` would run standalone
as it does today (called over HTTP, perhaps with endpoints for a new Kubernetes
resource API) or whether the relevant code from its `nvidia` package should be
linked directly into the kubelet. A partial list of tradeoffs:

|                     | External binary | Linked in |
|---------------------|-----------------|-----------|
| Use of cgo          | Confined to the binary | Linked into the kubelet, but with lazy binding |
| Expandability       | Limited if we run the plugin, increased if the library is used to build a Kubernetes-tailored daemon | Can reuse the `nvidia` library as we prefer |
| Bloat               | None | Larger kubelet, even for systems without GPUs |
| Reliability         | Need to handle the binary disappearing at any time | Fewer headaches |
| (Un)Marshalling     | Need to talk over JSON | None |
| Administration cost | One more daemon to install, configure and monitor | No extra work required, other than perhaps configuring flags |
| Releases            | Potentially on its own schedule | Tied to Kubernetes' |

## Implementation plan

### V0

The first two tracks can progress in parallel.

#### Scheduling

1. Define the new resource `alpha.kubernetes.io/nvidia-gpu` in
`pkg/api/types.go` and co.
1. Plug the resource into the feasibility checks used by the kubelet, the
scheduler and `schedulercache`. Maybe gated behind a flag?
1. Plug the resource into `resource_helpers.go`.
1. Plug the resource into the `LimitRanger`.

#### Runtime

1. Add a kubelet config parameter to enable the resource.
1. Make the kubelet's `setNodeStatusMachineInfo` report the resource.
1. Add a `Devices` list to `container.RunContainerOptions`.
1. Use it from `DockerManager`'s `runContainer`.
1. Do the same for rkt (stretch goal).
1. When a pod requests a GPU, add the devices to the container options.

#### Other

1. Add the new resource to `kubectl describe` output. Optional for non-GPU
users?
1. Administrator documentation, with sample scripts.
1. User documentation.

## Future work

Above all, we need to collect feedback from real users and use that to set
priorities for any of the items below.

### V1

- Perform real detection of the installed hardware
- Figure out a standard way to avoid bundling shared libraries in images
- Support fractional resources, so multiple pods can share the same GPU
- Support bare metal setups
- Report resource usage

### V2

- Support multiple GPUs with resource hierarchies and affinities
- Support versioning of resources (e.g. "CUDA v7.5+")
- Build resource plugins into the kubelet?
- Support other device vendors
- Support Azure?
- Support rkt?

### V3

- Support OpenCL (so images can be device-agnostic)

### Undetermined

It makes sense to turn the output of this project (external resource plugins,
etc.) into a more generic abstraction at some point.

## Security considerations

There should be knobs for the cluster administrator to only allow certain users
or roles to schedule GPU workloads. Overcommitting or sharing the same device
across different pods is not considered safe. It should be possible to
segregate such GPU-sharing pods by user, namespace or a combination thereof.
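
One possible shape for such a knob, sketched as a plain helper rather than any
existing admission or quota API (all names below are hypothetical):

```go
package gpupolicy

import "fmt"

// validateGPURequest is a purely illustrative sketch of a per-namespace
// allow-list for GPU requests; it does not correspond to existing Kubernetes
// admission or quota code.
func validateGPURequest(namespace string, gpusRequested int64, allowedNamespaces map[string]bool) error {
	if gpusRequested > 0 && !allowedNamespaces[namespace] {
		return fmt.Errorf("namespace %q may not request alpha.kubernetes.io/nvidia-gpu", namespace)
	}
	return nil
}
```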