More fixes based on comments

Justin Santa Barbara 2015-10-19 13:55:43 -04:00
parent 645fe1d300
commit 426346c7e3


@ -49,6 +49,18 @@ Kubernetes clusters are created on AWS. This can be particularly useful if
problems arise or in circumstances where the provided scripts are lacking and
you manually created or configured your cluster.
**Table of contents:**
* [Architecture overview](#architecture-overview)
* [Storage](#storage)
* [Auto Scaling group](#auto-scaling-group)
* [Networking](#networking)
* [NodePort and LoadBalancing services](#nodeport-and-loadbalancing-services)
* [Identity and access management (IAM)](#identity-and-access-management-iam)
* [Tagging](#tagging)
* [AWS objects](#aws-objects)
* [Manual infrastructure creation](#manual-infrastructure-creation)
* [Instance boot](#instance-boot)
### Architecture overview
Kubernetes is a cluster of several machines that consists of a Kubernetes
@ -56,17 +68,13 @@ master and a set number of nodes (previously known as 'minions') for which the
master is responsible. See the [Architecture](architecture.md) topic for
more details.
Other documents describe the general architecture of Kubernetes (all nodes run
Docker; the kubelet agent runs on each node and launches containers; the
kube-proxy relays traffic between the nodes, etc.).
By default on AWS:
* Instances run Ubuntu 15.04 (the official AMI). It includes a sufficiently
modern kernel that pairs well with Docker and doesn't require a
reboot. (The default SSH user is `ubuntu` for this and other Ubuntu images.)
* Nodes use aufs over ext4 as the filesystem / container storage (mostly
because this is what Google Compute Engine uses).
You can override these defaults by passing different environment variables to
kube-up.
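For example, a minimal sketch of overriding a couple of these defaults before running kube-up (the values below are illustrative only; `KUBE_AWS_ZONE` and `NUM_MINIONS` are the variables referred to elsewhere in this document):

```sh
# Sketch: override selected defaults, then bring up the cluster.
# The values below are illustrative, not recommendations.
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=us-west-2a
export NUM_MINIONS=4
cluster/kube-up.sh
```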
@ -82,12 +90,12 @@ unless you create pods with persistent volumes
[(EBS)](../user-guide/volumes.md#awselasticblockstore). In general, Kubernetes
containers do not have persistent storage unless you attach a persistent
volume, and so nodes on AWS use instance storage. Instance storage is cheaper,
often faster, and historically more reliable. Unless you can make do with whatever
space is left on your root partition, you must choose an instance type that provides
sufficient instance storage for your needs.
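For illustration, a minimal sketch of a pod that attaches an EBS persistent volume (the volume ID, pod name, and image are hypothetical placeholders; see the volumes guide linked above for details):

```sh
# Sketch: a pod mounting a pre-created EBS volume as persistent storage.
# The volume ID and names below are hypothetical placeholders.
kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ebs-example
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    awsElasticBlockStore:
      volumeID: vol-12345678   # must be in the same AZ as the node
      fsType: ext4
EOF
```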
Note: The master uses a persistent volume ([etcd](architecture.md#etcd)) to track
its state. Similar to nodes, containers are mostly run against instance
storage, except that we repoint some important data onto the persistent volume.
The default storage driver for Docker images is aufs. Specifying btrfs (by passing the environment
@ -96,12 +104,12 @@ is relatively reliable with Docker and has improved its reliability with modern
kernels. It can easily span multiple volumes, which is particularly useful
when we are using an instance type with multiple ephemeral instance disks.
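If you want to try btrfs, a sketch of passing it to kube-up (assuming the environment variable is `DOCKER_STORAGE`, as used by the kube-up AWS scripts; treat the variable name as an assumption):

```sh
# Sketch: select btrfs for Docker container storage when bringing up the cluster.
# DOCKER_STORAGE is assumed to be the environment variable referred to above.
export KUBERNETES_PROVIDER=aws
export DOCKER_STORAGE=btrfs
cluster/kube-up.sh
```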
### Auto Scaling group
Nodes (but not the master) are run in an
[Auto Scaling group](http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroup.html)
on AWS. Currently auto-scaling (e.g. based on CPU) is not actually enabled
([#11935](http://issues.k8s.io/11935)). Instead, the Auto Scaling group means
that AWS will relaunch any nodes that are terminated.
We do not currently run the master in an AutoScalingGroup, but we should
@ -111,14 +119,13 @@ We do not currently run the master in an AutoScalingGroup, but we should
Kubernetes uses an IP-per-pod model. This means that a node, which runs many
pods, must have many IPs. AWS uses virtual private clouds (VPCs) and advanced
routing support so each node is assigned a /24 CIDR. The assigned CIDR is then
configured to route to an instance in the VPC routing table.
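For reference, a sketch of what such a route looks like if created by hand with the AWS CLI (the route table ID, CIDR, and instance ID are hypothetical placeholders):

```sh
# Sketch: route a node's pod CIDR to that node's instance in the VPC route table.
# All IDs and the CIDR below are hypothetical placeholders.
aws ec2 create-route \
  --route-table-id rtb-0123abcd \
  --destination-cidr-block 10.244.1.0/24 \
  --instance-id i-0123abcd
```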
It is also possible to use overlay networking on AWS, but that is not the default
configuration of the kube-up script.
### NodePort and LoadBalancing services
Kubernetes on AWS integrates with [Elastic Load Balancing
(ELB)](http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/US_SetUpASLBApp.html).
@ -129,17 +136,23 @@ and modify the security group for the nodes to allow traffic from the ELB to
the nodes. This traffic reaches kube-proxy where it is then forwarded to the
pods.
ELB has some restrictions:
* it requires that all nodes listen on a single port,
* it acts as a forwarding proxy (i.e. the source IP is not preserved).
To work with these restrictions, in Kubernetes, [LoadBalancer
services](../user-guide/services.md#type-loadbalancer) are exposed as
[NodePort services](../user-guide/services.md#type-nodeport). Then
kube-proxy listens externally on the cluster-wide port that's assigned to
NodePort services and forwards traffic to the corresponding pods.
So for example, if we configure a service of type LoadBalancer with a
public port of 80:
* Kubernetes will assign a NodePort to the service (e.g. 31234).
* ELB is configured to proxy traffic on the public port 80 to the NodePort
that is assigned to the service (31234).
* Then any incoming traffic that ELB forwards to the NodePort (e.g. port 31234)
is recognized by kube-proxy and sent to the correct pods for that service.
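As a concrete sketch of the example above (the service name and selector are hypothetical placeholders; the ports match the example):

```sh
# Sketch: a service of type LoadBalancer. Kubernetes assigns a NodePort
# (e.g. 31234) and provisions an ELB that forwards port 80 to it.
# The name and selector are hypothetical placeholders.
kubectl create -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
EOF
kubectl describe service my-app   # shows the assigned NodePort and ELB address
```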
Note that we do not automatically open NodePort services in the AWS firewall
(although we do open LoadBalancer services). This is because we expect that
@ -188,31 +201,31 @@ Important: If you choose not to use kube-up, you must pick a unique cluster-id
value, and ensure that all AWS resources have a tag with
`Name=KubernetesCluster,Value=<clusterid>`.
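For manually created resources, a sketch of applying that tag with the AWS CLI (the resource ID and cluster-id below are hypothetical placeholders):

```sh
# Sketch: tag a manually created resource so it is recognized as part of the cluster.
# The resource ID and cluster-id are hypothetical placeholders.
aws ec2 create-tags \
  --resources vpc-0123abcd \
  --tags Key=KubernetesCluster,Value=mycluster
```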
### AWS objects
The kube-up script does a number of things in AWS:
* Creates an S3 bucket (`AWS_S3_BUCKET`) and then copies the Kubernetes distribution
and the Salt scripts into it. They are made world-readable and the HTTP URLs
are passed to instances; this is how Kubernetes code gets onto the machines.
* Creates two IAM profiles based on templates in [cluster/aws/templates/iam](../../cluster/aws/templates/iam/):
* `kubernetes-master` is used by the master.
* `kubernetes-minion` is used by nodes.
* Creates an AWS SSH key named `kubernetes-<fingerprint>`. Fingerprint here is
the OpenSSH key fingerprint, so that multiple users can run the script with
different keys and their keys will not collide (with near-certainty). It will
use an existing key if one is found at `AWS_SSH_KEY`, otherwise it will create
one there. (With the default Ubuntu images, if you have to SSH in: the user is
`ubuntu` and that user can `sudo`).
* Creates a VPC for use with the cluster (with a CIDR of 172.20.0.0/16) and
enables the `dns-support` and `dns-hostnames` options.
* Creates an internet gateway for the VPC.
* Creates a route table for the VPC, with the internet gateway as the default
route.
* Creates a subnet (with a CIDR of 172.20.0.0/24) in the AZ `KUBE_AWS_ZONE`
(defaults to us-west-2a). Currently, each Kubernetes cluster runs in a
single AZ on AWS. However, there are two philosophies in discussion on how to
achieve High Availability (HA):
* cluster-per-AZ: An independent cluster for each AZ, where each cluster
is entirely separate.
* cross-AZ-clusters: A single cluster spans multiple AZs.
@ -220,31 +233,31 @@ The debate is open here, where cluster-per-AZ is discussed as more robust but
cross-AZ-clusters are more convenient.
* Associates the subnet with the route table.
* Creates security groups for the master (`kubernetes-master-<clusterid>`)
and the nodes (`kubernetes-minion-<clusterid>`).
* Configures security groups so that masters and nodes can communicate. This
includes intercommunication between masters and nodes, opening SSH publicly
for both masters and nodes, and opening port 443 on the master for the HTTPS
API endpoints.
* Creates an EBS volume for the master of size `MASTER_DISK_SIZE` and type
`MASTER_DISK_TYPE`.
* Launches a master with a fixed IP address (172.20.0.9) that is also
configured for the security group and all the necessary IAM credentials. An
instance script is used to pass vital configuration information to Salt. Note:
The hope is that over time we can reduce the amount of configuration
information that must be passed in this way.
* Once the instance is up, it attaches the EBS volume and sets up a manual
routing rule for the internal network range (`MASTER_IP_RANGE`, defaults to
10.246.0.0/24).
* For auto-scaling, it creates a launch configuration and group for the nodes.
The name for both is <*KUBE_AWS_INSTANCE_PREFIX*>-minion-group. The default
name is kubernetes-minion-group. The auto-scaling group has a min and max size
that are both set to NUM_MINIONS. You can change the size of the auto-scaling
group to add or remove the total number of nodes from within the AWS API or
Console (see the example after this list). Each node self-configures, meaning
that it comes up; runs Salt with the stored configuration; connects to the
master; and is assigned an internal CIDR; the master then configures the route
table with the assigned CIDR. The kube-up script performs a health-check on the
nodes, but it's a self-check that is not required.
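For example, a sketch of resizing the group with the AWS CLI (the group name assumes the default `KUBE_AWS_INSTANCE_PREFIX`; the sizes are illustrative):

```sh
# Sketch: change the node count by resizing the Auto Scaling group.
# The group name assumes the default prefix; sizes are illustrative.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name kubernetes-minion-group \
  --min-size 6 --max-size 6 --desired-capacity 6
```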
If attempting this configuration manually, I highly recommend following along