Merge pull request #28558 from quinton-hoole/2016-07-06-excise-ubernetes-from-main-repo

Automatic merge from submit-queue

Deprecate the term "Ubernetes" 

Deprecate the term "Ubernetes" in favor of "Cluster Federation" and  "Multi-AZ Clusters"
This commit is contained in:
k8s-merge-robot 2016-07-11 23:20:53 -07:00 committed by GitHub
commit 629f3c159e
10 changed files with 80 additions and 74 deletions

View File

@ -32,7 +32,7 @@ Documentation for other releases can be found at
<!-- END MUNGE: UNVERSIONED_WARNING -->
# Kubernetes/Ubernetes Control Plane Resilience
# Kubernetes and Cluster Federation Control Plane Resilience
## Long Term Design and Current Status
@ -44,7 +44,7 @@ Documentation for other releases can be found at
Some amount of confusion exists around how we currently, and in future,
want to ensure resilience of the Kubernetes (and by implication
Ubernetes) control plane. This document is an attempt to capture that
Kubernetes Cluster Federation) control plane. This document is an attempt to capture that
definitively. It covers areas including self-healing, high
availability, bootstrapping and recovery. Most of the information in
this document already exists in the form of GitHub comments,

View File

@ -32,7 +32,7 @@ Documentation for other releases can be found at
<!-- END MUNGE: UNVERSIONED_WARNING -->
# Kubernetes Cluster Federation (a.k.a. "Ubernetes")
# Kubernetes Cluster Federation (previously nicknamed "Ubernetes")
## Cross-cluster Load Balancing and Service Discovery
@ -106,7 +106,7 @@ Documentation for other releases can be found at
A Kubernetes application configuration (e.g. for a Pod, Replication
Controller, Service etc) should be able to be successfully deployed
into any Kubernetes Cluster or Ubernetes Federation of Clusters,
into any Kubernetes Cluster or Federation of Clusters,
without modification. More specifically, a typical configuration
should work correctly (although possibly not optimally) across any of
the following environments:
@ -154,7 +154,7 @@ environments. More specifically, for example:
## Component Cloud Services
Ubernetes cross-cluster load balancing is built on top of the following:
Cross-cluster Federated load balancing is built on top of the following:
1. [GCE Global L7 Load Balancers](https://cloud.google.com/compute/docs/load-balancing/http/global-forwarding-rules)
provide single, static global IP addresses which load balance and
@ -194,10 +194,11 @@ Ubernetes cross-cluster load balancing is built on top of the following:
A generic wrapper around cloud-provided L4 and L7 load balancing services, and
roll-your-own load balancers running in pods, e.g. HAProxy.
## Ubernetes API
## Cluster Federation API
The Ubernetes API for load balancing should be compatible with the equivalent
Kubernetes API, to ease porting of clients between Ubernetes and Kubernetes.
The Cluster Federation API for load balancing should be compatible with the equivalent
Kubernetes API, to ease porting of clients between Kubernetes and
federations of Kubernetes clusters.
Further details below.
## Common Client Behavior
@ -250,13 +251,13 @@ multiple) fixed server IP(s). Nothing else matters.
### General Control Plane Architecture
Each cluster hosts one or more Ubernetes master components (Ubernetes API
Each cluster hosts one or more Cluster Federation master components (Federation API
servers, controller managers with leader election, and etcd quorum members). This
is documented in more detail in a separate design doc:
[Kubernetes/Ubernetes Control Plane Resilience](https://docs.google.com/document/d/1jGcUVg9HDqQZdcgcFYlWMXXdZsplDdY6w3ZGJbU7lAw/edit#).
[Kubernetes and Cluster Federation Control Plane Resilience](https://docs.google.com/document/d/1jGcUVg9HDqQZdcgcFYlWMXXdZsplDdY6w3ZGJbU7lAw/edit#).
In the description below, assume that 'n' clusters, named 'cluster-1'...
'cluster-n' have been registered against an Ubernetes Federation "federation-1",
'cluster-n' have been registered against a Cluster Federation "federation-1",
each with their own set of Kubernetes API endpoints, so,
"[http://endpoint-1.cluster-1](http://endpoint-1.cluster-1),
[http://endpoint-2.cluster-1](http://endpoint-2.cluster-1)
@ -264,13 +265,13 @@ each with their own set of Kubernetes API endpoints,so,
### Federated Services
Ubernetes Services are pretty straight-forward. They're comprised of multiple
Federated Services are pretty straightforward. They consist of multiple
equivalent underlying Kubernetes Services, each with its own external
endpoint, and a load balancing mechanism across them. Let's work through how
exactly that works in practice.
Our user creates the following Ubernetes Service (against an Ubernetes API
endpoint):
Our user creates the following Federated Service (against a Federation
API endpoint):
$ kubectl create -f my-service.yaml --context="federation-1"
@ -296,7 +297,7 @@ where service.yaml contains the following:
run: my-service
type: LoadBalancer
Ubernetes in turn creates one equivalent service (identical config to the above)
The Cluster Federation control system in turn creates one equivalent service (identical config to the above)
in each of the underlying Kubernetes clusters, each of which results in
something like this:
@ -338,7 +339,7 @@ something like this:
Similar services are created in `cluster-2` and `cluster-3`, each of which is
allocated its own `spec.clusterIP` and `status.loadBalancer.ingress.ip`.
In Ubernetes `federation-1`, the resulting federated service looks as follows:
In the Cluster Federation `federation-1`, the resulting federated service looks as follows:
$ kubectl get -o yaml --context="federation-1" service my-service
@ -382,7 +383,7 @@ Note that the federated service:
1. has a federation-wide load balancer hostname
In addition to the set of underlying Kubernetes services (one per cluster)
described above, Ubernetes has also created a DNS name (e.g. on
described above, the Cluster Federation control system has also created a DNS name (e.g. on
[Google Cloud DNS](https://cloud.google.com/dns) or
[AWS Route 53](https://aws.amazon.com/route53/), depending on configuration)
which provides load balancing across all of those services. For example, in a
@ -397,7 +398,8 @@ Each of the above IP addresses (which are just the external load balancer
ingress IPs of each cluster service) is of course load balanced across the pods
comprising the service in each cluster.
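To make the record set concrete, here is a minimal Go sketch, assuming a hypothetical helper and reusing the federated hostname and example ingress IPs shown elsewhere in this document. It only illustrates the shape of the DNS answers, not how the federation actually programs Cloud DNS or Route 53.

```go
package main

import "fmt"

// federatedDNSRecords builds illustrative A records for a federated service's
// global DNS name from each cluster service's external ingress IP.
func federatedDNSRecords(dnsName string, ingressIPs []string) []string {
	records := make([]string, 0, len(ingressIPs))
	for _, ip := range ingressIPs {
		records = append(records, fmt.Sprintf("%s 180 IN A %s", dnsName, ip))
	}
	return records
}

func main() {
	name := "my-service.my-namespace.my-federation.my-domain.com"
	ips := []string{"104.197.74.77", "104.197.38.157"} // example ingress IPs from this document
	for _, r := range federatedDNSRecords(name, ips) {
		fmt.Println(r)
	}
}
```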
In a more sophisticated configuration (e.g. on GCE or GKE), Ubernetes
In a more sophisticated configuration (e.g. on GCE or GKE), the Cluster
Federation control system
automatically creates a
[GCE Global L7 Load Balancer](https://cloud.google.com/compute/docs/load-balancing/http/global-forwarding-rules)
which exposes a single, globally load-balanced IP:
@ -405,7 +407,7 @@ which exposes a single, globally load-balanced IP:
$ dig +noall +answer my-service.my-namespace.my-federation.my-domain.com
my-service.my-namespace.my-federation.my-domain.com 180 IN A 107.194.17.44
Optionally, Ubernetes also configures the local DNS servers (SkyDNS)
Optionally, the Cluster Federation control system also configures the local DNS servers (SkyDNS)
in each Kubernetes cluster to preferentially return the local
clusterIP for the service in that cluster, with other clusters'
external service IPs (or a global load-balanced IP) also configured
@ -416,7 +418,7 @@ for failover purposes:
my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.74.77
my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.38.157
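A hedged sketch of the preference logic described above (hypothetical types, cluster names and clusterIP; not the actual SkyDNS configuration): answers for the federated name are ordered so the local cluster's clusterIP comes first, with the other clusters' external IPs kept as failover targets.

```go
package main

import "fmt"

// answersFor orders DNS answers for a federated service name: the local
// cluster's clusterIP first, then other clusters' external IPs for failover.
// Cluster names and IPs are illustrative only.
func answersFor(local string, clusterIPs, externalIPs map[string]string) []string {
	var answers []string
	if ip, ok := clusterIPs[local]; ok {
		answers = append(answers, ip) // prefer the local clusterIP
	}
	for cluster, ip := range externalIPs {
		if cluster != local {
			answers = append(answers, ip) // remote ingress IPs as failover targets
		}
	}
	return answers
}

func main() {
	clusterIPs := map[string]string{"cluster-1": "10.63.240.10"} // hypothetical local clusterIP
	externalIPs := map[string]string{
		"cluster-2": "104.197.74.77",
		"cluster-3": "104.197.38.157",
	}
	fmt.Println(answersFor("cluster-1", clusterIPs, externalIPs))
}
```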
If Ubernetes Global Service Health Checking is enabled, multiple service health
If Cluster Federation Global Service Health Checking is enabled, multiple service health
checkers running across the federated clusters collaborate to monitor the health
of the service endpoints, and automatically remove unhealthy endpoints from the
DNS record (e.g. a majority quorum is required to vote a service endpoint
@ -460,7 +462,7 @@ where `my-service-rc.yaml` contains the following:
- containerPort: 2380
protocol: TCP
Ubernetes in turn creates one equivalent replication controller
The Cluster Federation control system in turn creates one equivalent replication controller
(identical config to the above, except for the replica count) in each
of the underlying Kubernetes clusters, each of which results in
something like this:
@ -510,8 +512,8 @@ entire cluster failures, various approaches are possible, including:
replicas in its cluster in response to the additional traffic
diverted from the failed cluster. This saves resources and is relatively
simple, but there is some delay in the autoscaling.
3. **federated replica migration**, where the Ubernetes Federation
Control Plane detects the cluster failure and automatically
3. **federated replica migration**, where the Cluster Federation
control system detects the cluster failure and automatically
increases the replica count in the remaining clusters to make up
for the lost replicas in the failed cluster. This does not seem to
offer any benefits relative to pod autoscaling above, and is
@ -523,23 +525,24 @@ entire cluster failures, various approaches are possible, including:
The implementation approach and architecture is very similar to Kubernetes, so
if you're familiar with how Kubernetes works, none of what follows will be
surprising. One additional design driver not present in Kubernetes is that
Ubernetes aims to be resilient to individual cluster and availability zone
the Cluster Federation control system aims to be resilient to individual cluster and availability zone
failures. So the control plane spans multiple clusters. More specifically:
+ Ubernetes runs it's own distinct set of API servers (typically one
+ Cluster Federation runs its own distinct set of API servers (typically one
or more per underlying Kubernetes cluster). These are completely
distinct from the Kubernetes API servers for each of the underlying
clusters.
+ Ubernetes runs it's own distinct quorum-based metadata store (etcd,
+ Cluster Federation runs its own distinct quorum-based metadata store (etcd,
by default). Approximately 1 quorum member runs in each underlying
cluster ("approximately" because we aim for an odd number of quorum
members, and typically don't want more than 5 quorum members, even
if we have a larger number of federated clusters, so 2 clusters->3
quorum members, 3->3, 4->3, 5->5, 6->5, 7->5 etc).
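The sizing rule stated in the bullet above (an odd number of quorum members, at least 3, never more than 5) can be captured in a few lines. This is only a sketch of the stated heuristic, not code from the federation control plane.

```go
package main

import "fmt"

// quorumMembers returns the suggested etcd quorum size for n federated
// clusters, following the heuristic above: always odd, at least 3, capped at 5.
func quorumMembers(n int) int {
	if n < 3 {
		return 3
	}
	if n > 5 {
		n = 5
	}
	if n%2 == 0 {
		n-- // round down to the nearest odd number
	}
	return n
}

func main() {
	for n := 2; n <= 7; n++ {
		fmt.Printf("%d clusters -> %d quorum members\n", n, quorumMembers(n))
	}
}
```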
Cluster Controllers in Ubernetes watch against the Ubernetes API server/etcd
Cluster Controllers in the Federation control system watch against the
Federation API server/etcd
state, and apply changes to the underlying Kubernetes clusters accordingly. They
also have the anti-entropy mechanism for reconciling ubernetes "desired desired"
also have an anti-entropy mechanism for reconciling Cluster Federation "desired desired"
state against Kubernetes "actual desired" state.
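As a rough illustration of that anti-entropy loop, the sketch below reconciles one cluster's actual state toward the federation's desired state using an in-memory stand-in for a per-cluster API client; all types and names here are hypothetical, not the real controller code.

```go
package main

import "fmt"

// clusterClient is a hypothetical, in-memory stand-in for a per-cluster
// Kubernetes API client; the real controllers operate on typed API objects.
type clusterClient struct {
	objects map[string]string // name -> spec
}

func (c *clusterClient) Apply(name, spec string) { c.objects[name] = spec }
func (c *clusterClient) Delete(name string)      { delete(c.objects, name) }

// reconcile drives one underlying cluster toward the federation's
// "desired desired" state: create or update anything missing or different,
// and garbage-collect anything that should no longer exist there.
func reconcile(desired map[string]string, cluster *clusterClient) {
	for name, spec := range desired {
		if cluster.objects[name] != spec {
			cluster.Apply(name, spec)
		}
	}
	for name := range cluster.objects {
		if _, ok := desired[name]; !ok {
			cluster.Delete(name)
		}
	}
}

func main() {
	desired := map[string]string{"my-service": "type=LoadBalancer"}
	c := &clusterClient{objects: map[string]string{"stale-service": "old"}}
	reconcile(desired, c)
	fmt.Println(c.objects) // map[my-service:type=LoadBalancer]
}
```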

View File

@ -320,8 +320,8 @@ Below is the state transition diagram.
## Replication Controller
A global workload submitted to control plane is represented as an
Ubernetes replication controller. When a replication controller
A global workload submitted to the control plane is represented as a
replication controller in the Cluster Federation control plane. When a replication controller
is submitted to the control plane, clients need a way to express its
requirements or preferences on clusters. Depending on the use case,
this may be complex. For example:
@ -377,11 +377,11 @@ some implicit scheduling restrictions. For example it defines
“nodeSelector” which can only be satisfied on some particular
clusters. How to handle this will be addressed after phase one.
## Ubernetes Services
## Federated Services
The Service API object exposed by Ubernetes is similar to service
The Service API object exposed by the Cluster Federation is similar to service
objects on Kubernetes. It defines the access to a group of pods. The
Ubernetes service controller will create corresponding Kubernetes
federation service controller will create corresponding Kubernetes
service objects on underlying clusters. These are detailed in a
separate design document: [Federated Services](federated-services.md).
@ -389,13 +389,13 @@ separate design document: [Federated Services](federated-services.md).
In phase one we only support scheduling replication controllers. Pod
scheduling will be supported in a later phase. This is primarily in
order to keep the Ubernetes API compatible with the Kubernetes API.
order to keep the Cluster Federation API compatible with the Kubernetes API.
## ACTIVITY FLOWS
## Scheduling
The below diagram shows how workloads are scheduled on the Ubernetes control\
The diagram below shows how workloads are scheduled on the Cluster Federation control\
plane:
1. A replication controller is created by the client.
@ -419,20 +419,20 @@ distribution policies. The scheduling rule is basically:
There is a potential race condition here. Say at time _T1_ the control
plane learns there are _m_ available resources in a K8S cluster. As
the cluster is working independently, it still accepts workload
requests from other K8S clients or even another Ubernetes control
plane. The Ubernetes scheduling decision is based on this data of
requests from other K8S clients or even another Cluster Federation control
plane. The Cluster Federation scheduling decision is based on this data of
available resources. However, when the actual RC creation happens on
the cluster at time _T2_, the cluster may not have enough resources
at that time. We will address this problem in later phases with some
proposed solutions like resource reservation mechanisms.
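A minimal sketch of that race, assuming a hypothetical capacity cache: the placement decision at _T1_ uses stale data, so creation at _T2_ can fail. This is not the real scheduler, just the failure mode it has to guard against.

```go
package main

import (
	"errors"
	"fmt"
)

// pickCluster chooses the cluster with the most free resources according to
// the federation's cached view (as observed at time T1).
func pickCluster(cachedFree map[string]int, needed int) (string, error) {
	best, bestFree := "", -1
	for cluster, free := range cachedFree {
		if free >= needed && free > bestFree {
			best, bestFree = cluster, free
		}
	}
	if best == "" {
		return "", errors.New("no cluster appears to have capacity")
	}
	return best, nil
}

func main() {
	// At T1 the cache says cluster-2 has room for 10 replicas...
	cached := map[string]int{"cluster-1": 4, "cluster-2": 10}
	target, _ := pickCluster(cached, 8)

	// ...but by T2 other clients have consumed capacity directly on the
	// cluster, so the actual creation may be rejected or left pending.
	actualFree := map[string]int{"cluster-1": 4, "cluster-2": 3}
	if actualFree[target] < 8 {
		fmt.Printf("placement on %s fails at T2: only %d free\n", target, actualFree[target])
	}
}
```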
![Ubernetes Scheduling](ubernetes-scheduling.png)
![Federated Scheduling](ubernetes-scheduling.png)
## Service Discovery
This part has been included in the section “Federated Service” of
document
“[Ubernetes Cross-cluster Load Balancing and Service Discovery Requirements and System Design](federated-services.md))”.
“[Federated Cross-cluster Load Balancing and Service Discovery Requirements and System Design](federated-services.md)”.
Please refer to that document for details.

View File

@ -347,7 +347,7 @@ scheduler to not put more than one pod from S in the same zone, and thus by
definition it will not put more than one pod from S on the same node, assuming
each node is in one zone. This rule is more useful as PreferredDuringScheduling
anti-affinity, e.g. one might expect it to be common in
[Ubernetes](../../docs/proposals/federation.md) clusters.)
[Cluster Federation](../../docs/proposals/federation.md) clusters.)
* **Don't co-locate pods of this service with pods from service "evilService"**:
`{LabelSelector: selector that matches evilService's pods, TopologyKey: "node"}`

View File

@ -34,25 +34,25 @@ Documentation for other releases can be found at
# Kubernetes Multi-AZ Clusters
## (a.k.a. "Ubernetes-Lite")
## (previously nicknamed "Ubernetes-Lite")
## Introduction
Full Ubernetes will offer sophisticated federation between multiple kuberentes
Full Cluster Federation will offer sophisticated federation between multiple Kubernetes
clusters, offering true high availability, multiple provider support &
cloud-bursting, multiple region support etc. However, many users have
expressed a desire for a "reasonably" highly available cluster that runs in
multiple zones on GCE or availability zones in AWS, and can tolerate the failure
of a single zone without the complexity of running multiple clusters.
Ubernetes-Lite aims to deliver exactly that functionality: to run a single
Multi-AZ Clusters aim to deliver exactly that functionality: to run a single
Kubernetes cluster in multiple zones. It will attempt to make reasonable
scheduling decisions, in particular so that a replication controller's pods are
spread across zones, and it will try to be aware of constraints - for example
that a volume cannot be mounted on a node in a different zone.
Ubernetes-Lite is deliberately limited in scope; for many advanced functions
the answer will be "use Ubernetes (full)". For example, multiple-region
Multi-AZ Clusters are deliberately limited in scope; for many advanced functions
the answer will be "use full Cluster Federation". For example, multiple-region
support is not in scope. Routing affinity (e.g. so that a webserver will
prefer to talk to a backend service in the same zone) is similarly not in
scope.
@ -122,7 +122,7 @@ zones (in the same region). For both clouds, the behaviour of the native cloud
load-balancer is reasonable in the face of failures (indeed, this is why clouds
provide load-balancing as a primitive).
For Ubernetes-Lite we will therefore simply rely on the native cloud provider
For multi-AZ clusters we will therefore simply rely on the native cloud provider
load balancer behaviour, and we do not anticipate substantial code changes.
One notable shortcoming here is that load-balanced traffic still goes through
@ -130,8 +130,8 @@ kube-proxy controlled routing, and kube-proxy does not (currently) favor
targeting a pod running on the same instance or even the same zone. This will
likely produce a lot of unnecessary cross-zone traffic (which is likely slower
and more expensive). This might be sufficiently low-hanging fruit that we
choose to address it in kube-proxy / Ubernetes-Lite, but this can be addressed
after the initial Ubernetes-Lite implementation.
choose to address it in kube-proxy / multi-AZ clusters, but this can be addressed
after the initial implementation.
## Implementation
@ -182,8 +182,8 @@ region-wide, meaning that a single call will find instances and volumes in all
zones. In addition, instance IDs and volume IDs are unique per-region (and
hence also per-zone). I believe they are actually globally unique, but I do
not know if this is guaranteed; in any case we only need global uniqueness if
we are to span regions, which will not be supported by Ubernetes-Lite (to do
that correctly requires an Ubernetes-Full type approach).
we are to span regions, which will not be supported by multi-AZ clusters (to do
that correctly requires a full Cluster Federation type approach).
## GCE Specific Considerations
@ -197,20 +197,20 @@ combine results from calls in all relevant zones.
A further complexity is that GCE volume names are scoped per-zone, not
per-region. Thus it is permitted to have two volumes both named `myvolume` in
two different GCE zones. (Instance names are currently unique per-region, and
thus are not a problem for Ubernetes-Lite).
thus are not a problem for multi-AZ clusters).
The volume scoping leads to a (small) behavioural change for Ubernetes-Lite on
The volume scoping leads to a (small) behavioural change for multi-AZ clusters on
GCE. If you had two volumes both named `myvolume` in two different GCE zones,
this would not be ambiguous when Kubernetes is operating only in a single zone.
But, if Ubernetes-Lite is operating in multiple zones, `myvolume` is no longer
But, when operating a cluster across multiple zones, `myvolume` is no longer
sufficient to specify a volume uniquely. Worse, the fact that a volume happens
to be unambiguous at a particular time is no guarantee that it will continue to
be unambiguous in future, because a volume with the same name could
subsequently be created in a second zone. While perhaps unlikely in practice,
we cannot automatically enable Ubernetes-Lite for GCE users if this then causes
we cannot automatically enable multi-AZ clusters for GCE users if this then causes
volume mounts to stop working.
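To make the ambiguity concrete, here is a hedged sketch of a cross-zone lookup (hypothetical helper and example zone names, not the GCE cloud provider's actual code): once the same disk name exists in two managed zones, the lookup must fail rather than guess.

```go
package main

import "fmt"

// findDiskZone looks up a GCE PD by name across all managed zones and fails
// if the name is ambiguous. disksByZone stands in for per-zone API calls.
func findDiskZone(name string, disksByZone map[string][]string) (string, error) {
	var matches []string
	for zone, disks := range disksByZone {
		for _, d := range disks {
			if d == name {
				matches = append(matches, zone)
			}
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("disk %q not found in any managed zone", name)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("disk %q is ambiguous: found in zones %v", name, matches)
	}
}

func main() {
	disks := map[string][]string{
		"us-central1-a": {"myvolume"},
		"us-central1-b": {"myvolume", "othervolume"},
	}
	zone, err := findDiskZone("myvolume", disks)
	fmt.Println(zone, err) // ambiguous: found in two zones
}
```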
This suggests that (at least on GCE), Ubernetes-Lite must be optional (i.e.
This suggests that (at least on GCE), multi-AZ clusters must be optional (i.e.
there must be a feature-flag). It may be that we can make this feature
semi-automatic in future, by detecting whether nodes are running in multiple
zones, but it seems likely that kube-up could instead simply set this flag.
@ -218,14 +218,14 @@ zones, but it seems likely that kube-up could instead simply set this flag.
For the initial implementation, creating volumes with identical names will
yield undefined results. Later, we may add some way to specify the zone for a
volume (and possibly require that volumes have their zone specified when
running with Ubernetes-Lite). We could add a new `zone` field to the
running in multi-AZ cluster mode). We could add a new `zone` field to the
PersistentVolume type for GCE PD volumes, or we could use a DNS-style dotted
name for the volume name (<name>.<zone>).
Initially therefore, the GCE changes will be to:
1. change kube-up to support creation of a cluster in multiple zones
1. pass a flag enabling Ubernetes-Lite with kube-up
1. pass a flag enabling multi-AZ clusters with kube-up
1. change the Kubernetes cloud provider to iterate through relevant zones when resolving items
1. tag GCE PD volumes with the appropriate zone information
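For the last item in the list above, the tagging would presumably surface as zone/region labels on the PersistentVolume; the snippet below only shows the beta failure-domain label keys of that era as a plain Go map with illustrative values, and omits the surrounding PV object.

```go
package main

import "fmt"

func main() {
	// Illustrative only: the beta failure-domain label keys used at the time
	// for zone/region tagging of nodes and PersistentVolumes.
	pvLabels := map[string]string{
		"failure-domain.beta.kubernetes.io/zone":   "us-central1-a",
		"failure-domain.beta.kubernetes.io/region": "us-central1",
	}
	fmt.Println(pvLabels)
}
```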

View File

@ -34,7 +34,7 @@ Documentation for other releases can be found at
# Kubernetes Cluster Federation
## (a.k.a. "Ubernetes")
## (previously nicknamed "Ubernetes")
## Requirements Analysis and Product Proposal
@ -413,7 +413,7 @@ detail to be added here, but feel free to shoot down the basic DNS
idea in the meantime. In addition, some applications rely on private
networking between clusters for security (e.g. AWS VPC or more
generally VPN). It should not be necessary to forsake this in
order to use Ubernetes, for example by being forced to use public
order to use Cluster Federation, for example by being forced to use public
connectivity between clusters.
## Cross-cluster Scheduling
@ -546,7 +546,7 @@ prefers the Decoupled Hierarchical model for the reasons stated below).
here, as each underlying Kubernetes cluster can be scaled
completely independently w.r.t. scheduling, node state management,
monitoring, network connectivity etc. It is even potentially
feasible to stack "Ubernetes" federated clusters (i.e. create
feasible to stack federations of clusters (i.e. create
federations of federations) should scalability of the independent
Federation Control Plane become an issue (although the author does
not envision this being a problem worth solving in the short
@ -595,7 +595,7 @@ prefers the Decoupled Hierarchical model for the reasons stated below).
![image](federation-high-level-arch.png)
## Ubernetes API
## Cluster Federation API
It is proposed that this look a lot like the existing Kubernetes API
but be explicitly multi-cluster.
@ -603,7 +603,8 @@ but be explicitly multi-cluster.
+ Clusters become first class objects, which can be registered,
listed, described, deregistered etc via the API.
+ Compute resources can be explicitly requested in specific clusters,
or automatically scheduled to the "best" cluster by Ubernetes (by a
or automatically scheduled to the "best" cluster by the Cluster
Federation control system (by a
pluggable Policy Engine).
+ There is a federated equivalent of a replication controller type (or
perhaps a [deployment](deployment.md)),
@ -627,14 +628,15 @@ Controllers and related Services accordingly).
This should ideally be delegated to some external auth system, shared
by the underlying clusters, to avoid duplication and inconsistency.
Either that, or we end up with multilevel auth. Local readonly
eventually consistent auth slaves in each cluster and in Ubernetes
eventually consistent auth slaves in each cluster and in the Cluster
Federation control system
could potentially cache auth, to mitigate an SPOF auth system.
## Data consistency, failure and availability characteristics
The services comprising the Ubernetes Control Plane) have to run
The services comprising the Cluster Federation control plane have to run
somewhere. Several options exist here:
* For high availability Ubernetes deployments, these
* For high availability Cluster Federation deployments, these
services may run in either:
* a dedicated Kubernetes cluster, not co-located in the same
availability zone with any of the federated clusters (for fault
@ -672,7 +674,7 @@ does the zookeeper config look like for N=3 across 3 AZs -- and how
does each replica find the other replicas and how do clients find
their primary zookeeper replica? And now how do I do a shared, highly
available redis database? Use a few common specific use cases like
this to flesh out the detailed API and semantics of Ubernetes.
this to flesh out the detailed API and semantics of Cluster Federation.
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->

View File

@ -79,10 +79,11 @@ The design of the pipeline for collecting application level metrics should
be revisited, and it's not clear whether application level metrics should be
available in the API server, so the use case initially won't be supported.
#### Ubernetes
#### Cluster Federation
Ubernetes might want to consider cluster-level usage (in addition to cluster-level request)
of running pods when choosing where to schedule new pods. Although Ubernetes is still in design,
The Cluster Federation control system might want to consider cluster-level usage (in addition to cluster-level request)
of running pods when choosing where to schedule new pods. Although
Cluster Federation is still in design,
we expect the metrics API described here to be sufficient. Cluster-level usage can be
obtained by summing over usage of all nodes in the cluster.
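A trivial sketch of that summation (hypothetical types and values; the real metrics API returns structured resource quantities rather than plain integers):

```go
package main

import "fmt"

// clusterUsage derives cluster-level usage by summing the per-node usage
// reported by the metrics pipeline, as described above.
func clusterUsage(nodeUsage map[string]int64) int64 {
	var total int64
	for _, u := range nodeUsage {
		total += u
	}
	return total
}

func main() {
	// Millicores of CPU in use per node, illustrative values only.
	usage := map[string]int64{"node-1": 1500, "node-2": 900, "node-3": 2100}
	fmt.Println(clusterUsage(usage), "millicores in use across the cluster")
}
```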

View File

@ -1174,8 +1174,8 @@ func newAWSDisk(aws *Cloud, name string) (*awsDisk, error) {
// The original idea of the URL-style name was to put the AZ into the
// host, so we could find the AZ immediately from the name without
// querying the API. But it turns out we don't actually need it for
// Ubernetes-Lite, as we put the AZ into the labels on the PV instead.
// However, if in future we want to support Ubernetes-Lite
// multi-AZ clusters, as we put the AZ into the labels on the PV instead.
// However, if in future we want to support multi-AZ cluster
// volume-awareness without using PersistentVolumes, we likely will
// want the AZ in the host.

View File

@ -81,7 +81,7 @@ type GCECloud struct {
projectID string
region string
localZone string // The zone in which we are running
managedZones []string // List of zones we are spanning (for Ubernetes-Lite, primarily when running on master)
managedZones []string // List of zones we are spanning (for multi-AZ clusters, primarily when running on master)
networkURL string
nodeTags []string // List of tags to use on firewall rules for load balancers
nodeInstancePrefix string // If non-"", an advisory prefix for all nodes in the cluster
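As a hedged illustration of how a field like managedZones gets used (hypothetical helper and zone names, not the real cloud provider code), a cross-zone lookup fans out a per-zone call and merges the results:

```go
package main

import "fmt"

// listAllInstances merges per-zone listings across every zone the cluster
// spans. listInZone stands in for the real per-zone GCE API call.
func listAllInstances(managedZones []string, listInZone func(zone string) []string) []string {
	var all []string
	for _, zone := range managedZones {
		all = append(all, listInZone(zone)...)
	}
	return all
}

func main() {
	managedZones := []string{"us-central1-a", "us-central1-b"}
	fake := func(zone string) []string { return []string{"node-in-" + zone} }
	fmt.Println(listAllInstances(managedZones, fake))
}
```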

View File

@ -32,8 +32,8 @@ import (
"k8s.io/kubernetes/test/e2e/framework"
)
var _ = framework.KubeDescribe("Ubernetes Lite", func() {
f := framework.NewDefaultFramework("ubernetes-lite")
var _ = framework.KubeDescribe("Multi-AZ Clusters", func() {
f := framework.NewDefaultFramework("multi-az")
var zoneCount int
var err error
image := "gcr.io/google_containers/serve_hostname:v1.4"