Update user guide and scheduler documentation to describe node affinity.

Register image priority locality function, which the original PR that introduced it forgot to do. Change zone and region labels to beta.
2025-09-07 04:03:20 +00:00 · 2016-02-14 16:56:40 -08:00
parent 5fe856c749
commit 053f1c6008
6 changed files with 107 additions and 15 deletions
--- a/docs/user-guide/node-selection/README.md
+++ b/docs/user-guide/node-selection/README.md
@@ -56,7 +56,7 @@ You can verify that it worked by re-running `kubectl get nodes` and checking tha

 Take whatever pod config file you want to run, and add a nodeSelector section to it, like this. For example, if this is my pod config:

-<pre>
+```yaml
 apiVersion: v1
 kind: Pod
 metadata:
@@ -67,11 +67,11 @@ spec:
  containers:
  - name: nginx
    image: nginx
-</pre>
+```

 Then add a nodeSelector like so:

-<pre>
+```yaml
 apiVersion: v1
 kind: Pod
 metadata:
@@ -82,13 +82,95 @@ spec:
  containers:
  - name: nginx
    image: nginx
-    imagePullPolicy: IfNotPresent
-  <b>nodeSelector:
-    disktype: ssd</b>
-</pre>
+  nodeSelector:
+    disktype: ssd
+```

 When you then run `kubectl create -f pod.yaml`, the pod will get scheduled on the node that you attached the label to! You can verify that it worked by running `kubectl get pods -o wide` and looking at the "NODE" that the pod was assigned to.

+#### Alpha feature in Kubernetes v1.2: Node Affinity
+
+During the first half of 2016 we are rolling out a new mechanism, called *affinity* for controlling which nodes your pods wil be scheduled onto.
+Like `nodeSelector`, affinity is based on labels. But it allows you to write much more expressive rules.
+`nodeSelector` wil continue to work during the transition, but will eventually be deprecated.
+
+Kubernetes v1.2 offers an alpha version of the first piece of the affinity mechanism, called [node affinity](../../design/nodeaffinity.md).
+There are currently two types of node affinity, called `requiredDuringSchedulingIgnoredDuringExecution` and
+`preferresDuringSchedulingIgnoredDuringExecution`. You can think of them as "hard" and "soft" respectively,
+in the sense that the former specifies rules that *must* be met for a pod to schedule onto a node (just like
+`nodeSelector` but using a more expressive syntax), while the latter specifies *preferences* that the scheduler
+will try to enforce but will not guarantee. The "IgnoredDuringExecution" part of the names means that, similar
+to how `nodeSelector` works, if labels on a node change at runtime such that the rules on a pod are no longer
+met, the pod will still continue to run on the node. In the future we plan to offer
+`requiredDuringSchedulingRequiredDuringExecution` which will be just like `requiredDuringSchedulingIgnoredDuringExecution`
+except that it will evict pods from nodes that cease to satisfy the pods' node affinity requirements.
+
+Node affinity is currently expressed using an annotation on Pod. In v1.3 it will use a field, and we will
+also introduce the second piece of the affinity mechanism, called [pod affinity](../../design/podaffinity.md),
+which allows you to control whether a pod schedules onto a particular node based on which other pods are
+running on the node, rather than the labels on the node.
+
+Here's an example of a pod that uses node affinity:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: with-labels
+  annotations:
+    scheduler.alpha.kubernetes.io/affinity: >
+      {
+        "nodeAffinity": {
+          "requiredDuringSchedulingIgnoredDuringExecution": {
+            "nodeSelectorTerms": [
+              {
+                "matchExpressions": [
+                  {
+                    "key": "kubernetes.io/e2e-az-name",
+                    "operator": "In",
+                    "values": ["e2e-az1", "e2e-az2"]
+                  }
+                ]
+              }
+            ]
+          },
+          "preferredDuringSchedulingIgnoredDuringExecution": [
+            {
+              "weight": 10,
+              "preference": {"matchExpressions": [
+                {
+                  "key": "foo",
+                  "operator": "In", "values": ["bar"]
+                }
+              ]}
+            }
+          ]
+        }
+      }
+    another-annotation-key: another-annotation-value
+spec:
+  containers:
+  - name: with-labels
+    image: gcr.io/google_containers/pause:2.0
+```
+
+This node affinity rule says the pod can only be placed on a node with a label whose key is
+`kubernetes.io/e2e-az-name` and whose value is either `e2e-az1` or `e2e-az2`. In addition,
+among nodes that meet that criteria, nodes with a label whose key is `foo` and whose
+value is `bar` should be preferred.
+
+If you specify both `nodeSelector` and `nodeAffinity`, *both* must be satisfied for the pod
+to be scheduled onto a candidate node.
+
+### Built-in node labels
+
+In addition to labels you [attach yourself](#step-one-attach-label-to-the-node), nodes come pre-populated
+with a standard set of labels. As of Kubernetes v1.2 these labels are
+* `kubernetes.io/hostname`
+* `failure-domain.beta.kubernetes.io/zone`
+* `failure-domain.beta.kubernetes.io/region`
+* `beta.kubernetes.io/instance-type`
+
 ### Conclusion

 While this example only covered one node, you can attach labels to as many nodes as you want. Then when you schedule a pod with a nodeSelector, it can be scheduled on any of the nodes that satisfy that nodeSelector. Be careful that it will match at least one node, however, because if it doesn't the pod won't be scheduled at all.