Update user guide and scheduler documentation to describe node affinity.

Register image priority locality function, which the original PR that introduced
it forgot to do. Change zone and region labels to beta.
This commit is contained in:
David Oppenheimer
2016-02-14 16:56:40 -08:00
parent 5fe856c749
commit 053f1c6008
6 changed files with 107 additions and 15 deletions

View File

@@ -56,7 +56,7 @@ You can verify that it worked by re-running `kubectl get nodes` and checking tha
Take whatever pod config file you want to run, and add a nodeSelector section to it, like this. For example, if this is my pod config:
<pre>
```yaml
apiVersion: v1
kind: Pod
metadata:
@@ -67,11 +67,11 @@ spec:
containers:
- name: nginx
image: nginx
</pre>
```
Then add a nodeSelector like so:
<pre>
```yaml
apiVersion: v1
kind: Pod
metadata:
@@ -82,13 +82,95 @@ spec:
containers:
- name: nginx
image: nginx
imagePullPolicy: IfNotPresent
<b>nodeSelector:
disktype: ssd</b>
</pre>
nodeSelector:
disktype: ssd
```
When you then run `kubectl create -f pod.yaml`, the pod will get scheduled on the node that you attached the label to! You can verify that it worked by running `kubectl get pods -o wide` and looking at the "NODE" that the pod was assigned to.
#### Alpha feature in Kubernetes v1.2: Node Affinity
During the first half of 2016 we are rolling out a new mechanism, called *affinity* for controlling which nodes your pods wil be scheduled onto.
Like `nodeSelector`, affinity is based on labels. But it allows you to write much more expressive rules.
`nodeSelector` wil continue to work during the transition, but will eventually be deprecated.
Kubernetes v1.2 offers an alpha version of the first piece of the affinity mechanism, called [node affinity](../../design/nodeaffinity.md).
There are currently two types of node affinity, called `requiredDuringSchedulingIgnoredDuringExecution` and
`preferresDuringSchedulingIgnoredDuringExecution`. You can think of them as "hard" and "soft" respectively,
in the sense that the former specifies rules that *must* be met for a pod to schedule onto a node (just like
`nodeSelector` but using a more expressive syntax), while the latter specifies *preferences* that the scheduler
will try to enforce but will not guarantee. The "IgnoredDuringExecution" part of the names means that, similar
to how `nodeSelector` works, if labels on a node change at runtime such that the rules on a pod are no longer
met, the pod will still continue to run on the node. In the future we plan to offer
`requiredDuringSchedulingRequiredDuringExecution` which will be just like `requiredDuringSchedulingIgnoredDuringExecution`
except that it will evict pods from nodes that cease to satisfy the pods' node affinity requirements.
Node affinity is currently expressed using an annotation on Pod. In v1.3 it will use a field, and we will
also introduce the second piece of the affinity mechanism, called [pod affinity](../../design/podaffinity.md),
which allows you to control whether a pod schedules onto a particular node based on which other pods are
running on the node, rather than the labels on the node.
Here's an example of a pod that uses node affinity:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: with-labels
annotations:
scheduler.alpha.kubernetes.io/affinity: >
{
"nodeAffinity": {
"requiredDuringSchedulingIgnoredDuringExecution": {
"nodeSelectorTerms": [
{
"matchExpressions": [
{
"key": "kubernetes.io/e2e-az-name",
"operator": "In",
"values": ["e2e-az1", "e2e-az2"]
}
]
}
]
},
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 10,
"preference": {"matchExpressions": [
{
"key": "foo",
"operator": "In", "values": ["bar"]
}
]}
}
]
}
}
another-annotation-key: another-annotation-value
spec:
containers:
- name: with-labels
image: gcr.io/google_containers/pause:2.0
```
This node affinity rule says the pod can only be placed on a node with a label whose key is
`kubernetes.io/e2e-az-name` and whose value is either `e2e-az1` or `e2e-az2`. In addition,
among nodes that meet that criteria, nodes with a label whose key is `foo` and whose
value is `bar` should be preferred.
If you specify both `nodeSelector` and `nodeAffinity`, *both* must be satisfied for the pod
to be scheduled onto a candidate node.
### Built-in node labels
In addition to labels you [attach yourself](#step-one-attach-label-to-the-node), nodes come pre-populated
with a standard set of labels. As of Kubernetes v1.2 these labels are
* `kubernetes.io/hostname`
* `failure-domain.beta.kubernetes.io/zone`
* `failure-domain.beta.kubernetes.io/region`
* `beta.kubernetes.io/instance-type`
### Conclusion
While this example only covered one node, you can attach labels to as many nodes as you want. Then when you schedule a pod with a nodeSelector, it can be scheduled on any of the nodes that satisfy that nodeSelector. Be careful that it will match at least one node, however, because if it doesn't the pod won't be scheduled at all.