Mirror of https://github.com/k3s-io/kubernetes.git, synced 2025-09-08 20:50:24 +00:00
Deprecate the term "Ubernetes" in favor of "Cluster Federation" and "Multi-AZ Clusters"
@@ -32,7 +32,7 @@ Documentation for other releases can be found at
 <!-- END MUNGE: UNVERSIONED_WARNING -->
-# Kubernetes/Ubernetes Control Plane Resilience
+# Kubernetes and Cluster Federation Control Plane Resilience
 ## Long Term Design and Current Status
@@ -44,7 +44,7 @@ Documentation for other releases can be found at
 Some amount of confusion exists around how we currently, and in future
 want to ensure resilience of the Kubernetes (and by implication
-Ubernetes) control plane. This document is an attempt to capture that
+Kubernetes Cluster Federation) control plane. This document is an attempt to capture that
 definitively. It covers areas including self-healing, high
 availability, bootstrapping and recovery. Most of the information in
 this document already exists in the form of github comments,
@@ -32,7 +32,7 @@ Documentation for other releases can be found at
 <!-- END MUNGE: UNVERSIONED_WARNING -->
-# Kubernetes Cluster Federation (a.k.a. "Ubernetes")
+# Kubernetes Cluster Federation (previously nicknamed "Ubernetes")
 ## Cross-cluster Load Balancing and Service Discovery
@@ -106,7 +106,7 @@ Documentation for other releases can be found at
 A Kubernetes application configuration (e.g. for a Pod, Replication
 Controller, Service etc) should be able to be successfully deployed
-into any Kubernetes Cluster or Ubernetes Federation of Clusters,
+into any Kubernetes Cluster or Federation of Clusters,
 without modification. More specifically, a typical configuration
 should work correctly (although possibly not optimally) across any of
 the following environments:
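The list of environments is elided by this hunk. As a minimal sketch of the portability goal stated above (the manifest name and context names are illustrative, not taken from the proposal), the same configuration file should be deployable unchanged to a single cluster or to a federation:

    # Hypothetical sketch: one unmodified configuration, two targets.
    $ kubectl --context="cluster-1" create -f my-app.yaml        # a single Kubernetes cluster
    $ kubectl --context="federation-1" create -f my-app.yaml     # a Cluster Federation of clusters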
@@ -154,7 +154,7 @@ environments. More specifically, for example:
 ## Component Cloud Services
-Ubernetes cross-cluster load balancing is built on top of the following:
+Cross-cluster Federated load balancing is built on top of the following:
 1. [GCE Global L7 Load Balancers](https://cloud.google.com/compute/docs/load-balancing/http/global-forwarding-rules)
    provide single, static global IP addresses which load balance and
@@ -194,10 +194,11 @@ Ubernetes cross-cluster load balancing is built on top of the following:
 A generic wrapper around cloud-provided L4 and L7 load balancing services, and
 roll-your-own load balancers run in pods, e.g. HA Proxy.
-## Ubernetes API
+## Cluster Federation API
-The Ubernetes API for load balancing should be compatible with the equivalent
-Kubernetes API, to ease porting of clients between Ubernetes and Kubernetes.
+The Cluster Federation API for load balancing should be compatible with the equivalent
+Kubernetes API, to ease porting of clients between Kubernetes and
+federations of Kubernetes clusters.
 Further details below.
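As a hedged illustration of the compatibility goal above (the context names are examples only), a client written against the Kubernetes API should be able to point at a federation API endpoint instead of a single cluster's endpoint and keep working:

    # Sketch: the same client call, first against one cluster, then against
    # the federation API endpoint; only the context changes.
    $ kubectl --context="cluster-1" get services
    $ kubectl --context="federation-1" get services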
 ## Common Client Behavior
@@ -250,13 +251,13 @@ multiple) fixed server IP(s). Nothing else matters.
 ### General Control Plane Architecture
-Each cluster hosts one or more Ubernetes master components (Ubernetes API
+Each cluster hosts one or more Cluster Federation master components (Federation API
 servers, controller managers with leader election, and etcd quorum members). This
 is documented in more detail in a separate design doc:
-[Kubernetes/Ubernetes Control Plane Resilience](https://docs.google.com/document/d/1jGcUVg9HDqQZdcgcFYlWMXXdZsplDdY6w3ZGJbU7lAw/edit#).
+[Kubernetes and Cluster Federation Control Plane Resilience](https://docs.google.com/document/d/1jGcUVg9HDqQZdcgcFYlWMXXdZsplDdY6w3ZGJbU7lAw/edit#).
 In the description below, assume that 'n' clusters, named 'cluster-1'...
-'cluster-n' have been registered against an Ubernetes Federation "federation-1",
+'cluster-n' have been registered against a Cluster Federation "federation-1",
 each with their own set of Kubernetes API endpoints, so,
 "[http://endpoint-1.cluster-1](http://endpoint-1.cluster-1),
 [http://endpoint-2.cluster-1](http://endpoint-2.cluster-1)
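The remainder of the endpoint list is elided by the hunk. For orientation, such a setup might surface in a client's kubeconfig roughly as in the sketch below; the exact columns and names are illustrative, not specified by the proposal:

    $ kubectl config get-contexts
    CURRENT   NAME           CLUSTER        AUTHINFO
    *         federation-1   federation-1   federation-admin
              cluster-1      cluster-1      cluster-1-admin
              cluster-2      cluster-2      cluster-2-admin
              cluster-n      cluster-n      cluster-n-admin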
@@ -264,13 +265,13 @@ each with their own set of Kubernetes API endpoints,so,
 ### Federated Services
-Ubernetes Services are pretty straight-forward. They're comprised of multiple
+Federated Services are pretty straightforward. They're composed of multiple
 equivalent underlying Kubernetes Services, each with their own external
 endpoint, and a load balancing mechanism across them. Let's work through how
 exactly that works in practice.
-Our user creates the following Ubernetes Service (against an Ubernetes API
-endpoint):
+Our user creates the following Federated Service (against a Federation
+API endpoint):
 $ kubectl create -f my-service.yaml --context="federation-1"
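Only fragments of `my-service.yaml` survive in the next hunk (`run: my-service` and `type: LoadBalancer`). A complete manifest consistent with those fragments might look roughly like this sketch; the port numbers (chosen to match the `containerPort: 2380` quoted later) and label scheme are assumptions:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
      labels:
        run: my-service
    spec:
      selector:
        run: my-service
      ports:
        - port: 2380          # illustrative; matches the containerPort quoted later
          targetPort: 2380
          protocol: TCP
      type: LoadBalancer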
@@ -296,7 +297,7 @@ where service.yaml contains the following:
 run: my-service
 type: LoadBalancer
-Ubernetes in turn creates one equivalent service (identical config to the above)
+The Cluster Federation control system in turn creates one equivalent service (identical config to the above)
 in each of the underlying Kubernetes clusters, each of which results in
 something like this:
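The per-cluster output itself is elided by the diff. Judging from the fields the surrounding text names (`spec.clusterIP` and `status.loadBalancer.ingress.ip`), it would look roughly like the sketch below; the IP addresses and port are invented for illustration:

    $ kubectl get -o yaml --context="cluster-1" service my-service
    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
      labels:
        run: my-service
    spec:
      clusterIP: 10.123.4.56            # cluster-local virtual IP (illustrative)
      selector:
        run: my-service
      ports:
        - port: 2380
          targetPort: 2380
          protocol: TCP
      type: LoadBalancer
    status:
      loadBalancer:
        ingress:
          - ip: 104.197.74.77           # external ingress IP (illustrative)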
@@ -338,7 +339,7 @@ something like this:
 Similar services are created in `cluster-2` and `cluster-3`, each of which is
 allocated its own `spec.clusterIP`, and `status.loadBalancer.ingress.ip`.
-In Ubernetes `federation-1`, the resulting federated service looks as follows:
+In the Cluster Federation `federation-1`, the resulting federated service looks as follows:
 $ kubectl get -o yaml --context="federation-1" service my-service
@@ -382,7 +383,7 @@ Note that the federated service:
 1. has a federation-wide load balancer hostname
 In addition to the set of underlying Kubernetes services (one per cluster)
-described above, Ubernetes has also created a DNS name (e.g. on
+described above, the Cluster Federation control system has also created a DNS name (e.g. on
 [Google Cloud DNS](https://cloud.google.com/dns) or
 [AWS Route 53](https://aws.amazon.com/route53/), depending on configuration)
 which provides load balancing across all of those services. For example, in a
@@ -397,7 +398,8 @@ Each of the above IP addresses (which are just the external load balancer
 ingress IP's of each cluster service) is of course load balanced across the pods
 comprising the service in each cluster.
-In a more sophisticated configuration (e.g. on GCE or GKE), Ubernetes
+In a more sophisticated configuration (e.g. on GCE or GKE), the Cluster
+Federation control system
 automatically creates a
 [GCE Global L7 Load Balancer](https://cloud.google.com/compute/docs/load-balancing/http/global-forwarding-rules)
 which exposes a single, globally load-balanced IP:
@@ -405,7 +407,7 @@ which exposes a single, globally load-balanced IP:
 $ dig +noall +answer my-service.my-namespace.my-federation.my-domain.com
 my-service.my-namespace.my-federation.my-domain.com 180 IN A 107.194.17.44
-Optionally, Ubernetes also configures the local DNS servers (SkyDNS)
+Optionally, the Cluster Federation control system also configures the local DNS servers (SkyDNS)
 in each Kubernetes cluster to preferentially return the local
 clusterIP for the service in that cluster, with other clusters'
 external service IP's (or a global load-balanced IP) also configured
@@ -416,7 +418,7 @@ for failover purposes:
 my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.74.77
 my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.38.157
-If Ubernetes Global Service Health Checking is enabled, multiple service health
+If Cluster Federation Global Service Health Checking is enabled, multiple service health
 checkers running across the federated clusters collaborate to monitor the health
 of the service endpoints, and automatically remove unhealthy endpoints from the
 DNS record (e.g. a majority quorum is required to vote a service endpoint
@@ -460,7 +462,7 @@ where `my-service-rc.yaml` contains the following:
 - containerPort: 2380
   protocol: TCP
-Ubernetes in turn creates one equivalent replication controller
+The Cluster Federation control system in turn creates one equivalent replication controller
 (identical config to the above, except for the replica count) in each
 of the underlying Kubernetes clusters, each of which results in
 something like this:
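The per-cluster output is elided by the diff. Based on the two surviving fragments of `my-service-rc.yaml` (`containerPort: 2380`, `protocol: TCP`), each underlying cluster would end up with a replication controller roughly like the sketch below; the image name, labels, and the per-cluster replica count (a share of the federation-wide total) are illustrative assumptions:

    $ kubectl get -o yaml --context="cluster-1" rc my-service
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: my-service
      labels:
        run: my-service
    spec:
      replicas: 2                       # this cluster's share of the federation-wide count
      selector:
        run: my-service
      template:
        metadata:
          labels:
            run: my-service
        spec:
          containers:
            - name: my-service
              image: example.com/my-service:latest   # illustrative image
              ports:
                - containerPort: 2380
                  protocol: TCP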
@@ -510,8 +512,8 @@ entire cluster failures, various approaches are possible, including:
 replicas in its cluster in response to the additional traffic
 diverted from the failed cluster. This saves resources and is relatively
 simple, but there is some delay in the autoscaling.
-3. **federated replica migration**, where the Ubernetes Federation
-   Control Plane detects the cluster failure and automatically
+3. **federated replica migration**, where the Cluster Federation
+   control system detects the cluster failure and automatically
 increases the replica count in the remaining clusters to make up
 for the lost replicas in the failed cluster. This does not seem to
 offer any benefits relative to pod autoscaling above, and is
@@ -523,23 +525,24 @@ entire cluster failures, various approaches are possible, including:
 The implementation approach and architecture is very similar to Kubernetes, so
 if you're familiar with how Kubernetes works, none of what follows will be
 surprising. One additional design driver not present in Kubernetes is that
-Ubernetes aims to be resilient to individual cluster and availability zone
+the Cluster Federation control system aims to be resilient to individual cluster and availability zone
 failures. So the control plane spans multiple clusters. More specifically:
-+ Ubernetes runs it's own distinct set of API servers (typically one
++ Cluster Federation runs its own distinct set of API servers (typically one
   or more per underlying Kubernetes cluster). These are completely
   distinct from the Kubernetes API servers for each of the underlying
   clusters.
-+ Ubernetes runs it's own distinct quorum-based metadata store (etcd,
++ Cluster Federation runs its own distinct quorum-based metadata store (etcd,
   by default). Approximately 1 quorum member runs in each underlying
   cluster ("approximately" because we aim for an odd number of quorum
   members, and typically don't want more than 5 quorum members, even
   if we have a larger number of federated clusters, so 2 clusters->3
   quorum members, 3->3, 4->3, 5->5, 6->5, 7->5 etc).
-Cluster Controllers in Ubernetes watch against the Ubernetes API server/etcd
+Cluster Controllers in the Federation control system watch against the
+Federation API server/etcd
 state, and apply changes to the underlying kubernetes clusters accordingly. They
-also have the anti-entropy mechanism for reconciling ubernetes "desired desired"
+also have the anti-entropy mechanism for reconciling Cluster Federation "desired desired"
 state against kubernetes "actual desired" state.
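The quorum-sizing rule quoted above (an odd number of members, at least 3, capped at 5) can be written down as a small shell sketch; the function name is invented, and the floor of three members for a one-cluster federation is an assumption the text does not spell out:

    # Sketch of the etcd quorum-size rule described above.
    quorum_members() {
      n=$1
      [ "$n" -lt 3 ] && n=3                   # never fewer than 3 members
      [ "$n" -gt 5 ] && n=5                   # never more than 5 members
      [ $((n % 2)) -eq 0 ] && n=$((n - 1))    # round down to an odd number
      echo "$n"
    }
    quorum_members 4    # -> 3
    quorum_members 6    # -> 5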
@@ -320,8 +320,8 @@ Below is the state transition diagram.
 ## Replication Controller
-A global workload submitted to control plane is represented as an
-Ubernetes replication controller. When a replication controller
+A global workload submitted to the control plane is represented as a
+replication controller in the Cluster Federation control plane. When a replication controller
 is submitted to the control plane, clients need a way to express its
 requirements or preferences on clusters. Depending on different use
 cases it may be complex. For example:
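The examples themselves are elided by the hunk. Purely as a hypothetical illustration of how a client might express cluster preferences on a submitted replication controller, an annotation could carry that intent; the annotation key below is invented for this sketch and is not part of any proposed API:

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: my-workload
      annotations:
        # Hypothetical key, for illustration only: prefer these clusters.
        example.io/preferred-clusters: "cluster-1,cluster-2"
    spec:
      replicas: 4
      selector:
        run: my-workload
      template:
        metadata:
          labels:
            run: my-workload
        spec:
          containers:
            - name: my-workload
              image: example.com/my-workload:latest   # illustrative image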
@@ -377,11 +377,11 @@ some implicit scheduling restrictions. For example it defines
 “nodeSelector” which can only be satisfied on some particular
 clusters. How to handle this will be addressed after phase one.
-## Ubernetes Services
+## Federated Services
-The Service API object exposed by Ubernetes is similar to service
+The Service API object exposed by the Cluster Federation is similar to service
 objects on Kubernetes. It defines the access to a group of pods. The
-Ubernetes service controller will create corresponding Kubernetes
+federation service controller will create corresponding Kubernetes
 service objects on underlying clusters. These are detailed in a
 separate design document: [Federated Services](federated-services.md).
@@ -389,13 +389,13 @@ separate design document: [Federated Services](federated-services.md).
 In phase one we only support scheduling replication controllers. Pod
 scheduling will be supported in a later phase. This is primarily in
-order to keep the Ubernetes API compatible with the Kubernetes API.
+order to keep the Cluster Federation API compatible with the Kubernetes API.
 ## ACTIVITY FLOWS
 ## Scheduling
-The below diagram shows how workloads are scheduled on the Ubernetes control\
+The diagram below shows how workloads are scheduled on the Cluster Federation control\
 plane:
 1. A replication controller is created by the client.
@@ -419,20 +419,20 @@ distribution policies. The scheduling rule is basically:
 There is a potential race condition here. Say at time _T1_ the control
 plane learns there are _m_ available resources in a K8S cluster. As
 the cluster is working independently it still accepts workload
-requests from other K8S clients or even another Ubernetes control
-plane. The Ubernetes scheduling decision is based on this data of
+requests from other K8S clients or even another Cluster Federation control
+plane. The Cluster Federation scheduling decision is based on this data of
 available resources. However, when the actual RC creation happens to
 the cluster at time _T2_, the cluster may not have enough resources
 at that time. We will address this problem in later phases with some
 proposed solutions like resource reservation mechanisms.
-![image](/k3s-io/kubernetes/media/commit/103b4938d5f4e816273d8e939adef2e0edd02d47/docs/design/ubernetes-scheduling.png)
+![image](/k3s-io/kubernetes/media/commit/103b4938d5f4e816273d8e939adef2e0edd02d47/docs/design/federation-scheduling.png)
 ## Service Discovery
 This part has been included in the section “Federated Service” of
 the document
-“[Ubernetes Cross-cluster Load Balancing and Service Discovery Requirements and System Design](federated-services.md))”.
+“[Federated Cross-cluster Load Balancing and Service Discovery Requirements and System Design](federated-services.md)”.
 Please refer to that document for details.
@@ -347,7 +347,7 @@ scheduler to not put more than one pod from S in the same zone, and thus by
 definition it will not put more than one pod from S on the same node, assuming
 each node is in one zone. This rule is more useful as PreferredDuringScheduling
 anti-affinity, e.g. one might expect it to be common in
-[Ubernetes](../../docs/proposals/federation.md) clusters.)
+[Cluster Federation](../../docs/proposals/federation.md) clusters.)
 * **Don't co-locate pods of this service with pods from service "evilService"**:
   `{LabelSelector: selector that matches evilService's pods, TopologyKey: "node"}`
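For readers unfamiliar with the `{LabelSelector: ..., TopologyKey: ...}` shorthand used above, the zone-spreading rule discussed in this hunk maps roughly onto the pod anti-affinity structure Kubernetes eventually shipped. The sketch below uses that later API shape, which may differ in detail from this proposal; the labels, image, and zone topology key are illustrative:

    apiVersion: v1
    kind: Pod
    metadata:
      name: s-pod
      labels:
        service: S
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    service: S        # other pods of the same service S
                topologyKey: failure-domain.beta.kubernetes.io/zone   # spread across zones
      containers:
        - name: app
          image: example.com/app:latest   # illustrative image

Because the topology key names the zone label, the scheduler prefers, rather than requires, at most one pod of S per zone.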