Improve CockroachDB example

* Use an init container to eliminate a potential edge case where losing
  the first pet's data could cause it to start a second logical cluster
* Exec the cockroach binary so that it runs as PID 1 in the container
* Make some small improvements to the README
Alex Robinson 2016-10-31 15:42:11 -04:00
parent e6b2517feb
commit 6b98de39a5
3 changed files with 108 additions and 39 deletions


@@ -12,10 +12,11 @@ a PetSet. CockroachDB is a distributed, scalable NewSQL database. Please see
 Standard PetSet limitations apply: There is currently no possibility to use
 node-local storage (outside of single-node tests), and so there is likely
 a performance hit associated with running CockroachDB on some external storage.
-Note that CockroachDB already does replication and thus should not be deployed on
-a persistent volume which already replicates internally.
-High-performance use cases on a private Kubernetes cluster should consider
-a DaemonSet deployment.
+Note that CockroachDB already does replication and thus it is unnecessary to
+deploy it onto persistent volumes which already replicate internally.
+For this reason, high-performance use cases on a private Kubernetes cluster
+may want to consider a DaemonSet deployment until PetSets support node-local
+storage (see #7562).
 
 ### Recovery after persistent storage failure
@@ -27,17 +28,25 @@ first node is special in that the administrator must manually prepopulate the
 parameter. If this is not done, the first node will bootstrap a new cluster,
 which will lead to a lot of trouble.
 
-### Dynamic provisioning
+### Dynamic volume provisioning
 
-The deployment is written for a use case in which dynamic provisioning is
+The deployment is written for a use case in which dynamic volume provisioning is
 available. When that is not the case, the persistent volume claims need
 to be created manually. See [minikube.sh](minikube.sh) for the necessary
-steps.
+steps. If you're on GCE or AWS, where dynamic provisioning is supported, no
+manual work is needed to create the persistent volumes.
 
 ## Testing locally on minikube
 
 Follow the steps in [minikube.sh](minikube.sh) (or simply run that file).
 
+## Testing in the cloud on GCE or AWS
+
+Once you have a Kubernetes cluster running, just run
+`kubectl create -f cockroachdb-petset.yaml` to create your cockroachdb cluster.
+This works because GCE and AWS support dynamic volume provisioning by default,
+so persistent volumes will be created for the CockroachDB pods as needed.
+
 ## Accessing the database
 
 Along with our PetSet configuration, we expose a standard Kubernetes service
@@ -48,8 +57,7 @@ Start up a client pod and open up an interactive, (mostly) Postgres-flavor
 SQL shell using:
 
 ```console
-$ kubectl run -it cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- bash
-root@cockroach-client # ./cockroach sql --host cockroachdb-public
+$ kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --host cockroachdb-public
 ```
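Once the shell is open, a few Postgres-style statements of the kind the demo script runs might look like this (the `bank` database and `accounts` table are illustrative, not part of this example):

```sql
CREATE DATABASE IF NOT EXISTS bank;
CREATE TABLE IF NOT EXISTS bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
INSERT INTO bank.accounts VALUES (1, 1000.50);
SELECT * FROM bank.accounts;
```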
You can see example SQL statements for inserting and querying data in the
@@ -57,6 +65,19 @@ included [demo script](demo.sh), but can use almost any Postgres-style SQL
 commands. Some more basic examples can be found within
 [CockroachDB's documentation](https://www.cockroachlabs.com/docs/learn-cockroachdb-sql.html).
 
+## Accessing the admin UI
+
+If you want to see information about how the cluster is doing, you can try
+pulling up the CockroachDB admin UI by port-forwarding from your local machine
+to one of the pods:
+
+```shell
+kubectl port-forward cockroachdb-0 8080
+```
+
+Once you've done that, you should be able to access the admin UI by visiting
+http://localhost:8080/ in your web browser.
+
 ## Simulating failures
 
 When all (or enough) nodes are up, simulate a failure like this:
@@ -77,10 +98,17 @@ database and ensuring the other replicas have all data that was written.
 
 ## Scaling up or down
 
-Simply edit the PetSet (but note that you may need to create a new persistent
-volume claim first). If you ran `minikube.sh`, there's a spare volume so you
-can immediately scale up by one. Convince yourself that the new node
-immediately serves reads and writes.
+Simply patch the PetSet by running
+
+```shell
+kubectl patch petset cockroachdb -p '{"spec":{"replicas":4}}'
+```
+
+Note that you may need to create a new persistent volume claim first. If you
+ran `minikube.sh`, there's a spare volume so you can immediately scale up by
+one. If you're running on GCE or AWS, you can scale up by as many as you want
+because new volumes will automatically be created for you. Convince yourself
+that the new node immediately serves reads and writes.
 
 ## Cleaning up when you're done
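If a claim does need to be created by hand before scaling up, a minimal sketch could look like the following. PetSet claims follow the naming pattern `<template-name>-<pet-name>`, and the template here is named `datadir` (per the volume mount in the PetSet config); the requested size is an assumption to adjust to your setup:

```yaml
# Hypothetical claim for a fourth pet (cockroachdb-3); the name must match
# the PetSet's volumeClaimTemplate pattern <template-name>-<pet-name>.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-cockroachdb-3
  labels:
    app: cockroachdb
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi  # assumed size; match your volumeClaimTemplate
```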


@@ -23,17 +23,25 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
+  # This service only exists to create DNS entries for each pet in the petset
+  # such that they can resolve each other's IP addresses. It does not create a
+  # load-balanced ClusterIP and should not be used directly by clients in most
+  # circumstances.
+  name: cockroachdb
+  labels:
+    app: cockroachdb
   annotations:
+    # This is needed to make the peer-finder work properly and to help avoid
+    # edge cases where instance 0 comes up after losing its data and needs to
+    # decide whether it should create a new cluster or try to join an existing
+    # one. If it creates a new cluster when it should have joined an existing
+    # one, we'd end up with two separate clusters listening at the same service
+    # endpoint, which would be very bad.
     service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
+    # Enable automatic monitoring of all instances when Prometheus is running in the cluster.
+    prometheus.io/scrape: "true"
+    prometheus.io/path: "_status/vars"
+    prometheus.io/port: "8080"
-  # This service only exists to create DNS entries for each pet in the petset such that they can resolve
-  # each other's IP addresses. It does not create a load-balanced ClusterIP and should not be used
-  # directly by clients in most circumstances.
-  name: cockroachdb
-  labels:
-    app: cockroachdb
 spec:
   ports:
   - port: 26257
@@ -52,13 +60,50 @@ metadata:
   name: cockroachdb
 spec:
   serviceName: "cockroachdb"
-  replicas: 5
+  replicas: 3
   template:
     metadata:
       labels:
         app: cockroachdb
       annotations:
         pod.alpha.kubernetes.io/initialized: "true"
+        # Init containers are run only once in the lifetime of a pod, before
+        # it's started up for the first time. Each one has to exit successfully
+        # before the pod's main containers are allowed to start.
+        # This particular init container does a DNS lookup for other pods in
+        # the petset to help determine whether or not a cluster already exists.
+        # If any other pets exist, it creates a file in the cockroach-data
+        # directory to pass that information along to the primary container that
+        # has to decide what command-line flags to use when starting CockroachDB.
+        # This only matters when a pod's persistent volume is empty - if it has
+        # data from a previous execution, that data will always be used.
+        pod.alpha.kubernetes.io/init-containers: '[
+            {
+                "name": "bootstrap",
+                "image": "cockroachdb/cockroach-k8s-init:0.1",
+                "args": [
+                    "-on-start=/on-start.sh",
+                    "-service=cockroachdb"
+                ],
+                "env": [
+                    {
+                        "name": "POD_NAMESPACE",
+                        "valueFrom": {
+                            "fieldRef": {
+                                "apiVersion": "v1",
+                                "fieldPath": "metadata.namespace"
+                            }
+                        }
+                    }
+                ],
+                "volumeMounts": [
+                    {
+                        "name": "datadir",
+                        "mountPath": "/cockroach/cockroach-data"
+                    }
+                ]
+            }
+        ]'
     spec:
       containers:
       - name: cockroachdb
@@ -93,27 +138,23 @@ spec:
         - |
           # The use of qualified `hostname -f` is crucial:
           # Other nodes aren't able to look up the unqualified hostname.
-          CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)")
-          # TODO(tschottdorf): really want to use an init container to do
-          # the bootstrapping. The idea is that the container would know
-          # whether it's on the first node and could check whether there's
-          # already a data directory. If not, it would bootstrap the cluster.
-          # We will need some version of `cockroach init` back for this to
-          # work. For now, just do the same in a shell snippet.
-          # Of course this isn't without danger - if node0 loses its data,
-          # upon restarting it will simply bootstrap a new cluster and smack
-          # it into our existing cluster.
-          # There are likely ways out. For example, the init container could
-          # query the kubernetes API and see whether any other nodes are
-          # around, etc. Or, of course, the admin can pre-seed the lost
-          # volume somehow (and in that case we should provide a better way,
-          # for example a marker file).
+          CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)" "--http-host" "0.0.0.0")
+          # We only want to initialize a new cluster (by omitting the join flag)
+          # if we're sure that we're the first node (i.e. index 0) and that
+          # there aren't any other nodes running as part of the cluster that
+          # this is supposed to be a part of (which indicates that a cluster
+          # already exists and we should make sure not to create a new one).
+          # It's fine to run without --join on a restart if there aren't any
+          # other nodes.
           if [ ! "$(hostname)" == "cockroachdb-0" ] || \
-             [ -e "/cockroach/cockroach-data/COCKROACHDB_VERSION" ]
+             [ -e "/cockroach/cockroach-data/cluster_exists_marker" ]
           then
-            CRARGS+=("--join" "cockroachdb")
+            # We don't join cockroachdb in order to avoid a node attempting
+            # to join itself, which currently doesn't work
+            # (https://github.com/cockroachdb/cockroach/issues/9625).
+            CRARGS+=("--join" "cockroachdb-public")
           fi
-          /cockroach/cockroach ${CRARGS[*]}
+          exec /cockroach/cockroach ${CRARGS[*]}
       # No pre-stop hook is required, a SIGTERM plus some time is all that's
       # needed for graceful shutdown of a node.
       terminationGracePeriodSeconds: 60
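The `/on-start.sh` hook that the init container invokes is not part of this diff. As a rough sketch only - assuming, as with the upstream peer-finder, that the discovered peer DNS names arrive one per line on stdin - the marker logic might look like:

```shell
# Hypothetical on-start hook for the bootstrap init container; not part of
# this commit. Reads peer DNS names from stdin (one per line) and, if any
# peer other than this pod exists, drops the marker file that the start
# script checks before deciding whether to pass --join.
mark_if_cluster_exists() {
  marker="$1"   # e.g. /cockroach/cockroach-data/cluster_exists_marker
  while read -r peer; do
    if [ "${peer}" != "$(hostname -f)" ]; then
      touch "${marker}"
      return 0
    fi
  done
  return 1
}
```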


@@ -35,7 +35,7 @@ kubectl delete petsets,pods,persistentvolumes,persistentvolumeclaims,services -l
 # claims here manually even though that sounds counter-intuitive. For details
 # see https://github.com/kubernetes/contrib/pull/1295#issuecomment-230180894.
 # Note that we make an extra volume here so you can manually test scale-up.
-for i in $(seq 0 5); do
+for i in $(seq 0 3); do
   cat <<EOF | kubectl create -f -
 kind: PersistentVolume
 apiVersion: v1