Production-Grade Container Scheduling and Management
Kubernetes Submit Queue 3a3dc827e4 Merge pull request #43467 from tvansteenburgh/gpu-support
Automatic merge from submit-queue (batch tested with PRs 44047, 43514, 44037, 43467)

Juju: Enable GPU mode if GPU hardware detected

**What this PR does / why we need it**:

Automatically configures the kubernetes-worker node to use GPU hardware when such hardware is detected.

layer-nvidia-cuda does the hardware detection, installs CUDA and Nvidia
drivers, and sets a state that the k8s-worker can react to.

When a GPU is available, the worker updates its config and restarts
kubelet to enable GPU mode. The worker then notifies the master that it
is in GPU mode via the kube-control relation.

When the master sees that a worker is in GPU mode, it switches to
privileged mode and restarts kube-apiserver.

The kube-control interface has subsumed the kube-dns interface
functionality.

An 'allow-privileged' config option has been added to both the worker
and master charms. GPU enablement respects the value of this option;
i.e., GPU mode cannot be enabled if the operator has set
allow-privileged="false".
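
The decision flow described above can be sketched roughly as follows. This is an illustration only: the function names and the string values they print are hypothetical and are not taken from the charm code, and the master-side behaviour when allow-privileged="false" is an assumption based on the statement that both charms respect the option.

```bash
#!/usr/bin/env bash
# Illustrative sketch; worker_gpu_mode and master_action are hypothetical
# names, not functions from the kubernetes-worker or kubernetes-master charms.

# Worker side: enable GPU mode only when GPU hardware was detected and the
# operator has not set allow-privileged="false".
#   $1: "true" if GPU hardware was detected
#   $2: value of the allow-privileged config option
worker_gpu_mode() {
  local gpu_detected="$1" allow_privileged="$2"
  if [ "$gpu_detected" = "true" ] && [ "$allow_privileged" != "false" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# Master side: switch to privileged mode when a worker reports GPU mode,
# unless the operator has set allow-privileged="false" (assumed behaviour).
#   $1: worker GPU mode as reported over the kube-control relation
#   $2: value of the allow-privileged config option on the master
master_action() {
  local worker_mode="$1" allow_privileged="$2"
  if [ "$worker_mode" = "enabled" ] && [ "$allow_privileged" != "false" ]; then
    echo "switch-to-privileged"
  else
    echo "no-change"
  fi
}

worker_gpu_mode true true    # prints "enabled"
worker_gpu_mode true false   # prints "disabled"
master_action "$(worker_gpu_mode true true)" true   # prints "switch-to-privileged"
```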

**Special notes for your reviewer**:

Quickest test setup is as follows:
```bash
# Bootstrap. If your AWS account doesn't have a default VPC, you'll need to
# specify one at bootstrap time so that juju can provision a p2.xlarge.
# Otherwise you can leave out the --config "vpc-id=vpc-xxxxxxxx" bit.
juju bootstrap --config "vpc-id=vpc-xxxxxxxx" --constraints "cores=4 mem=16G root-disk=64G" aws/us-east-1 k8s

# Deploy the bundle containing master and worker charms built from
# https://github.com/tvansteenburgh/kubernetes/tree/gpu-support/cluster/juju/layers
juju deploy cs:~tvansteenburgh/bundle/kubernetes-gpu-support-3

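# Set up kubectl locally. The binary is copied into the current directory,
# so run it as ./kubectl below unless kubectl is already on your PATH.
mkdir -p ~/.kube
juju scp kubernetes-master/0:config ~/.kube/config
juju scp kubernetes-master/0:kubectl ./kubectl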

# Download a gpu-dependent job spec
wget -O /tmp/nvidia-smi.yaml https://raw.githubusercontent.com/madeden/blogposts/master/k8s-gpu-cloud/src/nvidia-smi.yaml

# Create the job
kubectl create -f /tmp/nvidia-smi.yaml

# You should see a new nvidia-smi-xxxxx pod created
kubectl get pods

# Wait a bit for the job to run, then view logs; you should see the
# nvidia-smi table output
kubectl logs $(kubectl get pods -l name=nvidia-smi -o=name -a)
```

kube-control interface: https://github.com/juju-solutions/interface-kube-control
nvidia-cuda layer: https://github.com/juju-solutions/layer-nvidia-cuda
(Both are registered on http://interfaces.juju.solutions/)

**Release note**:
```release-note
Juju: Enable GPU mode if GPU hardware detected
```
2017-04-04 14:33:26 -07:00
.github PR template: Update links to kubernetes/community repo 2017-03-17 12:23:58 -04:00
api call GetHostIP from makeEnvironment 2017-03-28 20:20:21 -04:00
build Merge pull request #43817 from spxtr/owners 2017-03-31 11:29:29 -07:00
cluster Merge pull request #43467 from tvansteenburgh/gpu-support 2017-04-04 14:33:26 -07:00
cmd kube-proxy: OnServiceUpdate takes pointers 2017-04-03 17:19:39 -07:00
docs Merge pull request #42717 from andrewsykim/support-host-ip-downward-api 2017-04-03 15:48:12 -07:00
examples Fix typo in mysql-galera example Dockerfile 2017-03-31 19:52:15 +02:00
federation Merge pull request #42717 from andrewsykim/support-host-ip-downward-api 2017-04-03 15:48:12 -07:00
Godeps Update boltdb dependency to fix golang 1.7 intermittent failures 2017-04-04 10:48:19 -04:00
hack Merge pull request #42674 from nikhiljindal/secretKubeTe 2017-04-04 00:28:42 -07:00
hooks
logo
pkg Merge pull request #43514 from zjj2wry/006 2017-04-04 14:33:22 -07:00
plugin Merge pull request #43885 from zhangxiaoyu-zidif/master 2017-04-02 17:17:26 -07:00
staging Update boltdb dependency to fix golang 1.7 intermittent failures 2017-04-04 10:48:19 -04:00
test Add retries in cluster-autoscaler e2e 2017-04-04 10:28:24 +02:00
third_party Add forked etcd 2.2.1 code to allow rollback to 2.2.1 version 2017-02-10 13:56:01 +01:00
translations Update extraction script, sort messages, add .pot file. 2017-02-23 18:53:00 +00:00
vendor Update boltdb dependency to fix golang 1.7 intermittent failures 2017-04-04 10:48:19 -04:00
.bazelrc Add verify-gofmt as a Bazel test. 2017-02-10 17:00:28 -08:00
.gazelcfg.json
.generated_files Move .generated_docs to docs/ so docs OWNERS can review / approve 2017-02-16 10:11:57 -08:00
.gitattributes
.gitignore
BUILD.bazel
CHANGELOG.md Fixes links in CHANGELOG.md table of contents 2017-04-03 17:13:32 -07:00
code-of-conduct.md
CONTRIBUTING.md Close kubernetes/community#420 2017-03-08 09:59:30 -08:00
labels.yaml
LICENSE
Makefile test/e2e_node: prepull images with CRI 2017-04-01 10:18:56 +02:00
Makefile.generated_files
OWNERS Add wojtec to global approvers 2017-01-25 11:57:00 -06:00
OWNERS_ALIASES Merge pull request #42953 from kargakis/rm-myself 2017-04-03 01:50:58 -07:00
README.md Close kubernetes/community#420 2017-03-08 09:59:30 -08:00
Vagrantfile
WORKSPACE Update busybox dependency to fix bazel build 2017-03-28 12:12:31 -07:00

Kubernetes


Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale using a system called Borg, combined with best-of-breed ideas and practices from the community.

Kubernetes is hosted by the Cloud Native Computing Foundation (CNCF). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Kubernetes plays a role, read the CNCF announcement.


To start using Kubernetes

See our documentation on kubernetes.io.

Try our interactive tutorial.

Take a free course on Scalable Microservices with Kubernetes.

To start developing Kubernetes

The community repository hosts all information about building Kubernetes from source, how to contribute code and documentation, who to contact about what, etc.

If you want to build Kubernetes right away there are two options:

You have a working Go environment.

    $ go get -d k8s.io/kubernetes
    $ cd $GOPATH/src/k8s.io/kubernetes
    $ make

You have a working Docker environment.

    $ git clone https://github.com/kubernetes/kubernetes
    $ cd kubernetes
    $ make quick-release

If you are less impatient, head over to the developer's documentation.
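
As a rough sketch, the choice between the two build options above could be scripted like this. The helper names `pick_build_target` and `have` are hypothetical, and preferring the Go path when both tools are present is an assumption made for the example, not a project recommendation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the build command that matches the available
# tooling. Arguments are "yes"/"no" flags for Go and Docker respectively.
pick_build_target() {
  local have_go="$1" have_docker="$2"
  if [ "$have_go" = "yes" ]; then
    echo "make"
  elif [ "$have_docker" = "yes" ]; then
    echo "make quick-release"
  else
    echo "none"
    return 1
  fi
}

# Report whether a command is available on PATH.
have() {
  if command -v "$1" >/dev/null 2>&1; then echo "yes"; else echo "no"; fi
}

# Suggest a build command for the current machine without running it.
target="$(pick_build_target "$(have go)" "$(have docker)")" || true
echo "suggested build command: ${target}"
```

The script only prints a suggestion; run the printed command from the appropriate checkout directory shown in the options above.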

Support

If you need support, start with the troubleshooting guide and work your way through the process that we've outlined.

That said, if you have questions, reach out to us one way or another.
