diff --git a/examples/cockroachdb/README.md b/examples/cockroachdb/README.md
index 2c14b9682a3..9f6387fa180 100644
--- a/examples/cockroachdb/README.md
+++ b/examples/cockroachdb/README.md
@@ -12,10 +12,11 @@ a PetSet. CockroachDB is a distributed, scalable NewSQL database. Please see
 Standard PetSet limitations apply: There is currently no possibility to use
 node-local storage (outside of single-node tests), and so there is likely
 a performance hit associated with running CockroachDB on some external storage.
-Note that CockroachDB already does replication and thus should not be deployed on
-a persistent volume which already replicates internally.
-High-performance use cases on a private Kubernetes cluster should consider
-a DaemonSet deployment.
+Note that CockroachDB already does replication and thus it is unnecessary to
+deploy it onto persistent volumes which already replicate internally.
+For this reason, high-performance use cases on a private Kubernetes cluster
+may want to consider a DaemonSet deployment until PetSets support node-local
+storage (see #7562).
 
 ### Recovery after persistent storage failure
 
@@ -27,17 +28,25 @@ first node is special in that the administrator must manually prepopulate the
 parameter. If this is not done, the first node will bootstrap a new cluster,
 which will lead to a lot of trouble.
 
-### Dynamic provisioning
+### Dynamic volume provisioning
 
-The deployment is written for a use case in which dynamic provisioning is
+The deployment is written for a use case in which dynamic volume provisioning is
 available. When that is not the case, the persistent volume claims need to
 be created manually. See [minikube.sh](minikube.sh) for the necessary
-steps.
+steps. If you're on GCE or AWS, where dynamic provisioning is supported, no
+manual work is needed to create the persistent volumes.
 
 ## Testing locally on minikube
 
 Follow the steps in [minikube.sh](minikube.sh) (or simply run that file).
 
+## Testing in the cloud on GCE or AWS
+
+Once you have a Kubernetes cluster running, just run
+`kubectl create -f cockroachdb-petset.yaml` to create your cockroachdb cluster.
+This works because GCE and AWS support dynamic volume provisioning by default,
+so persistent volumes will be created for the CockroachDB pods as needed.
+
 ## Accessing the database
 
 Along with our PetSet configuration, we expose a standard Kubernetes service
@@ -48,8 +57,7 @@ Start up a client pod and open up an interactive, (mostly) Postgres-flavor SQL
 shell using:
 
 ```console
-$ kubectl run -it cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- bash
-root@cockroach-client # ./cockroach sql --host cockroachdb-public
+$ kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --host cockroachdb-public
 ```
 
 You can see example SQL statements for inserting and querying data in the
@@ -57,6 +65,19 @@ included [demo script](demo.sh), but can use almost any Postgres-style SQL
 commands. Some more basic examples can be found within
 [CockroachDB's documentation](https://www.cockroachlabs.com/docs/learn-cockroachdb-sql.html).
 
+## Accessing the admin UI
+
+If you want to see information about how the cluster is doing, you can try
+pulling up the CockroachDB admin UI by port-forwarding from your local machine
+to one of the pods:
+
+```shell
+kubectl port-forward cockroachdb-0 8080
+```
+
+Once you've done that, you should be able to access the admin UI by visiting
+http://localhost:8080/ in your web browser.
+
 ## Simulating failures
 
 When all (or enough) nodes are up, simulate a failure like this:
 
@@ -77,10 +98,17 @@ database and ensuring the other replicas have all data that was written.
 
 ## Scaling up or down
 
-Simply edit the PetSet (but note that you may need to create a new persistent
-volume claim first). If you ran `minikube.sh`, there's a spare volume so you
-can immediately scale up by one. Convince yourself that the new node
-immediately serves reads and writes.
+Simply patch the PetSet by running
+
+```shell
+kubectl patch petset cockroachdb -p '{"spec":{"replicas":4}}'
+```
+
+Note that you may need to create a new persistent volume claim first. If you
+ran `minikube.sh`, there's a spare volume so you can immediately scale up by
+one. If you're running on GCE or AWS, you can scale up by as many as you want
+because new volumes will automatically be created for you. Convince yourself
+that the new node immediately serves reads and writes.
 
 ## Cleaning up when you're done
 
diff --git a/examples/cockroachdb/cockroachdb-petset.yaml b/examples/cockroachdb/cockroachdb-petset.yaml
index ca172bce0a8..c7d5bf4fc53 100644
--- a/examples/cockroachdb/cockroachdb-petset.yaml
+++ b/examples/cockroachdb/cockroachdb-petset.yaml
@@ -23,17 +23,25 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
+  # This service only exists to create DNS entries for each pet in the petset
+  # such that they can resolve each other's IP addresses. It does not create a
+  # load-balanced ClusterIP and should not be used directly by clients in most
+  # circumstances.
+  name: cockroachdb
+  labels:
+    app: cockroachdb
   annotations:
+    # This is needed to make the peer-finder work properly and to help avoid
+    # edge cases where instance 0 comes up after losing its data and needs to
+    # decide whether it should create a new cluster or try to join an existing
+    # one. If it creates a new cluster when it should have joined an existing
+    # one, we'd end up with two separate clusters listening at the same service
+    # endpoint, which would be very bad.
+    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
     # Enable automatic monitoring of all instances when Prometheus is running in the cluster.
     prometheus.io/scrape: "true"
     prometheus.io/path: "_status/vars"
     prometheus.io/port: "8080"
-  # This service only exists to create DNS entries for each pet in the petset such that they can resolve
-  # each other's IP addresses. It does not create a load-balanced ClusterIP and should not be used
-  # directly by clients in most circumstances.
-  name: cockroachdb
-  labels:
-    app: cockroachdb
 spec:
   ports:
   - port: 26257
@@ -52,13 +60,50 @@ metadata:
   name: cockroachdb
 spec:
   serviceName: "cockroachdb"
-  replicas: 5
+  replicas: 3
   template:
     metadata:
       labels:
         app: cockroachdb
       annotations:
         pod.alpha.kubernetes.io/initialized: "true"
+        # Init containers are run only once in the lifetime of a pod, before
+        # it's started up for the first time. It has to exit successfully
+        # before the pod's main containers are allowed to start.
+        # This particular init container does a DNS lookup for other pods in
+        # the petset to help determine whether or not a cluster already exists.
+        # If any other pets exist, it creates a file in the cockroach-data
+        # directory to pass that information along to the primary container that
+        # has to decide what command-line flags to use when starting CockroachDB.
+        # This only matters when a pod's persistent volume is empty - if it has
+        # data from a previous execution, that data will always be used.
+        pod.alpha.kubernetes.io/init-containers: '[
+            {
+                "name": "bootstrap",
+                "image": "cockroachdb/cockroach-k8s-init:0.1",
+                "args": [
+                    "-on-start=/on-start.sh",
+                    "-service=cockroachdb"
+                ],
+                "env": [
+                    {
+                        "name": "POD_NAMESPACE",
+                        "valueFrom": {
+                            "fieldRef": {
+                                "apiVersion": "v1",
+                                "fieldPath": "metadata.namespace"
+                            }
+                        }
+                    }
+                ],
+                "volumeMounts": [
+                    {
+                        "name": "datadir",
+                        "mountPath": "/cockroach/cockroach-data"
+                    }
+                ]
+            }
+        ]'
     spec:
       containers:
       - name: cockroachdb
@@ -93,27 +138,23 @@ spec:
           - |
             # The use of qualified `hostname -f` is crucial:
             # Other nodes aren't able to look up the unqualified hostname.
-            CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)")
-            # TODO(tschottdorf): really want to use an init container to do
-            # the bootstrapping. The idea is that the container would know
-            # whether it's on the first node and could check whether there's
-            # already a data directory. If not, it would bootstrap the cluster.
-            # We will need some version of `cockroach init` back for this to
-            # work. For now, just do the same in a shell snippet.
-            # Of course this isn't without danger - if node0 loses its data,
-            # upon restarting it will simply bootstrap a new cluster and smack
-            # it into our existing cluster.
-            # There are likely ways out. For example, the init container could
-            # query the kubernetes API and see whether any other nodes are
-            # around, etc. Or, of course, the admin can pre-seed the lost
-            # volume somehow (and in that case we should provide a better way,
-            # for example a marker file).
+            CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)" "--http-host" "0.0.0.0")
+            # We only want to initialize a new cluster (by omitting the join flag)
+            # if we're sure that we're the first node (i.e. index 0) and that
+            # there aren't any other nodes running as part of the cluster that
+            # this is supposed to be a part of (which indicates that a cluster
+            # already exists and we should make sure not to create a new one).
+            # It's fine to run without --join on a restart if there aren't any
+            # other nodes.
             if [ ! "$(hostname)" == "cockroachdb-0" ] || \
-               [ -e "/cockroach/cockroach-data/COCKROACHDB_VERSION" ]
+               [ -e "/cockroach/cockroach-data/cluster_exists_marker" ]
             then
-              CRARGS+=("--join" "cockroachdb")
+              # We don't join cockroachdb in order to avoid a node attempting
+              # to join itself, which currently doesn't work
+              # (https://github.com/cockroachdb/cockroach/issues/9625).
+              CRARGS+=("--join" "cockroachdb-public")
             fi
-            /cockroach/cockroach ${CRARGS[*]}
+            exec /cockroach/cockroach ${CRARGS[*]}
       # No pre-stop hook is required, a SIGTERM plus some time is all that's
       # needed for graceful shutdown of a node.
       terminationGracePeriodSeconds: 60
diff --git a/examples/cockroachdb/minikube.sh b/examples/cockroachdb/minikube.sh
index 3c38fa0b11c..f14b72c45a5 100755
--- a/examples/cockroachdb/minikube.sh
+++ b/examples/cockroachdb/minikube.sh
@@ -35,7 +35,7 @@ kubectl delete petsets,pods,persistentvolumes,persistentvolumeclaims,services -l
 # claims here manually even though that sounds counter-intuitive. For details
 # see https://github.com/kubernetes/contrib/pull/1295#issuecomment-230180894.
 # Note that we make an extra volume here so you can manually test scale-up.
-for i in $(seq 0 5); do
+for i in $(seq 0 3); do
   cat <
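For clusters without dynamic provisioning, the README hunk above points at minikube.sh for creating the claims by hand (its loop is truncated in this diff). As a rough sketch only, not the actual contents of minikube.sh, one hand-made volume/claim pair for the first pet might look like the manifest below; the claim name assumes the PetSet's volume claim template is called `datadir` (matching the `datadir` volume mount in the diff), and the PV name, capacity, and hostPath are illustrative placeholders:

```yaml
# Hypothetical example, not part of this PR: a PersistentVolume plus the
# matching PersistentVolumeClaim for pet cockroachdb-0. PetSets bind claims
# named <template>-<petset>-<ordinal>, so the claim is assumed to be
# datadir-cockroachdb-0. Adjust size and hostPath for your environment.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0                       # placeholder name
  labels:
    type: local
spec:
  capacity:
    storage: 1Gi                  # assumed size
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/cockroachdb-pv0    # node-local path; suitable for single-node tests only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datadir-cockroachdb-0     # assumes the claim template is named "datadir"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Each additional pet would need its own pair (datadir-cockroachdb-1, and so on), created with `kubectl create -f` before the PetSet is scaled up to that ordinal.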