Merge pull request #35922 from a-robinson/crdb

Automatic merge from submit-queue

Improve CockroachDB example

This is primarily about pulling in the init container to make the config more production-ready, but I've pulled in a few other small improvements that have been made since this was first contributed.

* Use an init container to eliminate potential edge case where losing the first pet's data could cause it to start a second logical cluster
* Exec the cockroach binary so that it runs as PID 1 in the container
* Make some small improvements to the README

@bprashanth

```release-note
```
commit 079630a522
--- README.md
+++ README.md
@@ -12,10 +12,11 @@ a PetSet. CockroachDB is a distributed, scalable NewSQL database. Please see
 Standard PetSet limitations apply: There is currently no possibility to use
 node-local storage (outside of single-node tests), and so there is likely
 a performance hit associated with running CockroachDB on some external storage.
-Note that CockroachDB already does replication and thus should not be deployed on
-a persistent volume which already replicates internally.
-High-performance use cases on a private Kubernetes cluster should consider
-a DaemonSet deployment.
+Note that CockroachDB already does replication and thus it is unnecessary to
+deploy it onto persistent volumes which already replicate internally.
+For this reason, high-performance use cases on a private Kubernetes cluster
+may want to consider a DaemonSet deployment until PetSets support node-local
+storage (see #7562).

 ### Recovery after persistent storage failure

@@ -27,17 +28,25 @@ first node is special in that the administrator must manually prepopulate the
 parameter. If this is not done, the first node will bootstrap a new cluster,
 which will lead to a lot of trouble.

-### Dynamic provisioning
+### Dynamic volume provisioning

-The deployment is written for a use case in which dynamic provisioning is
+The deployment is written for a use case in which dynamic volume provisioning is
 available. When that is not the case, the persistent volume claims need
 to be created manually. See [minikube.sh](minikube.sh) for the necessary
-steps.
+steps. If you're on GCE or AWS, where dynamic provisioning is supported, no
+manual work is needed to create the persistent volumes.

 ## Testing locally on minikube

 Follow the steps in [minikube.sh](minikube.sh) (or simply run that file).

+## Testing in the cloud on GCE or AWS
+
+Once you have a Kubernetes cluster running, just run
+`kubectl create -f cockroachdb-petset.yaml` to create your cockroachdb cluster.
+This works because GCE and AWS support dynamic volume provisioning by default,
+so persistent volumes will be created for the CockroachDB pods as needed.
+
 ## Accessing the database

 Along with our PetSet configuration, we expose a standard Kubernetes service
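When dynamic provisioning is unavailable, each pet's claim must be satisfied by hand before the PetSet can schedule it. Below is a minimal sketch of pre-creating one claim; the name follows the PetSet `<template>-<petset>-<ordinal>` convention for the `datadir` template used later in this diff, while the size, access mode, and label are illustrative assumptions (minikube.sh remains the authoritative version of these steps):

```shell
# Hedged sketch: pre-create the claim that pet 0 will bind to.
# The name datadir-cockroachdb-0 follows the PetSet claim-naming
# convention; storage size and label are assumptions.
cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-cockroachdb-0
  labels:
    app: cockroachdb
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
```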
@@ -48,8 +57,7 @@ Start up a client pod and open up an interactive, (mostly) Postgres-flavor
 SQL shell using:

 ```console
-$ kubectl run -it cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- bash
-root@cockroach-client # ./cockroach sql --host cockroachdb-public
+$ kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --host cockroachdb-public
 ```

 You can see example SQL statements for inserting and querying data in the
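For a non-interactive smoke test, the same client pod can execute statements directly and exit. A sketch, assuming `cockroach sql -e` is available in the image; the `bank` schema here is illustrative and not the contents of the demo script:

```shell
# Hedged sketch: run a few statements against the public service and exit.
kubectl run -it --rm cockroach-smoke --image=cockroachdb/cockroach --restart=Never \
  --command -- ./cockroach sql --host cockroachdb-public \
  -e "CREATE DATABASE IF NOT EXISTS bank;" \
  -e "CREATE TABLE IF NOT EXISTS bank.accounts (id INT PRIMARY KEY, balance DECIMAL);" \
  -e "INSERT INTO bank.accounts VALUES (1, 1000.50);" \
  -e "SELECT * FROM bank.accounts;"
```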
@@ -57,6 +65,19 @@ included [demo script](demo.sh), but can use almost any Postgres-style SQL
 commands. Some more basic examples can be found within
 [CockroachDB's documentation](https://www.cockroachlabs.com/docs/learn-cockroachdb-sql.html).

+## Accessing the admin UI
+
+If you want to see information about how the cluster is doing, you can try
+pulling up the CockroachDB admin UI by port-forwarding from your local machine
+to one of the pods:
+
+```shell
+kubectl port-forward cockroachdb-0 8080
+```
+
+Once you've done that, you should be able to access the admin UI by visiting
+http://localhost:8080/ in your web browser.
+
 ## Simulating failures

 When all (or enough) nodes are up, simulate a failure like this:
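The failure-simulation command itself falls outside this hunk. One way to induce a crash, sketched with an illustrative pod name; note that repeatedly killing PID 1 works cleanly here precisely because the start script below now `exec`s the cockroach binary:

```shell
# Hedged sketch: repeatedly kill PID 1 (the cockroach process) in one pet
# and watch Kubernetes restart it while the rest of the cluster keeps serving.
kubectl exec cockroachdb-0 -- /bin/bash -c "while true; do kill 1; done"
```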
@@ -77,10 +98,17 @@ database and ensuring the other replicas have all data that was written.

 ## Scaling up or down

-Simply edit the PetSet (but note that you may need to create a new persistent
-volume claim first). If you ran `minikube.sh`, there's a spare volume so you
-can immediately scale up by one. Convince yourself that the new node
-immediately serves reads and writes.
+Simply patch the PetSet by running
+
+```shell
+kubectl patch petset cockroachdb -p '{"spec":{"replicas":4}}'
+```
+
+Note that you may need to create a new persistent volume claim first. If you
+ran `minikube.sh`, there's a spare volume so you can immediately scale up by
+one. If you're running on GCE or AWS, you can scale up by as many as you want
+because new volumes will automatically be created for you. Convince yourself
+that the new node immediately serves reads and writes.

 ## Cleaning up when you're done

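After patching, it's worth confirming that the new pet and its volume claim actually materialize; a hedged sketch of the checks (the claim name follows the PetSet convention assumed earlier):

```shell
# Watch the new pet come up.
kubectl get pods -l app=cockroachdb --watch

# On GCE/AWS a claim such as datadir-cockroachdb-3 should be created and
# bound automatically; elsewhere it must exist beforehand.
kubectl get pvc
```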
--- cockroachdb-petset.yaml
+++ cockroachdb-petset.yaml
@@ -23,17 +23,25 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
+  # This service only exists to create DNS entries for each pet in the petset
+  # such that they can resolve each other's IP addresses. It does not create a
+  # load-balanced ClusterIP and should not be used directly by clients in most
+  # circumstances.
+  name: cockroachdb
+  labels:
+    app: cockroachdb
   annotations:
+    # This is needed to make the peer-finder work properly and to help avoid
+    # edge cases where instance 0 comes up after losing its data and needs to
+    # decide whether it should create a new cluster or try to join an existing
+    # one. If it creates a new cluster when it should have joined an existing
+    # one, we'd end up with two separate clusters listening at the same service
+    # endpoint, which would be very bad.
+    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
     # Enable automatic monitoring of all instances when Prometheus is running in the cluster.
     prometheus.io/scrape: "true"
     prometheus.io/path: "_status/vars"
     prometheus.io/port: "8080"
-  # This service only exists to create DNS entries for each pet in the petset such that they can resolve
-  # each other's IP addresses. It does not create a load-balanced ClusterIP and should not be used
-  # directly by clients in most circumstances.
-  name: cockroachdb
-  labels:
-    app: cockroachdb
 spec:
   ports:
   - port: 26257
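To see the DNS-only behavior the comment describes, you can resolve a pet's per-instance name from inside the cluster. A sketch, assuming the `default` namespace, the standard `cluster.local` domain, and a `busybox` image for the lookup:

```shell
# Hedged sketch: each pet is resolvable as <pod>.<service>.<namespace>.svc.<domain>
# thanks to this service plus the tolerate-unready-endpoints annotation.
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
  nslookup cockroachdb-0.cockroachdb.default.svc.cluster.local
```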
@@ -52,13 +60,50 @@ metadata:
   name: cockroachdb
 spec:
   serviceName: "cockroachdb"
-  replicas: 5
+  replicas: 3
   template:
     metadata:
       labels:
         app: cockroachdb
       annotations:
         pod.alpha.kubernetes.io/initialized: "true"
+        # Init containers are run only once in the lifetime of a pod, before
+        # it's started up for the first time. It has to exit successfully
+        # before the pod's main containers are allowed to start.
+        # This particular init container does a DNS lookup for other pods in
+        # the petset to help determine whether or not a cluster already exists.
+        # If any other pets exist, it creates a file in the cockroach-data
+        # directory to pass that information along to the primary container that
+        # has to decide what command-line flags to use when starting CockroachDB.
+        # This only matters when a pod's persistent volume is empty - if it has
+        # data from a previous execution, that data will always be used.
+        pod.alpha.kubernetes.io/init-containers: '[
+            {
+                "name": "bootstrap",
+                "image": "cockroachdb/cockroach-k8s-init:0.1",
+                "args": [
+                    "-on-start=/on-start.sh",
+                    "-service=cockroachdb"
+                ],
+                "env": [
+                    {
+                        "name": "POD_NAMESPACE",
+                        "valueFrom": {
+                            "fieldRef": {
+                                "apiVersion": "v1",
+                                "fieldPath": "metadata.namespace"
+                            }
+                        }
+                    }
+                ],
+                "volumeMounts": [
+                    {
+                        "name": "datadir",
+                        "mountPath": "/cockroach/cockroach-data"
+                    }
+                ]
+            }
+        ]'
     spec:
       containers:
       - name: cockroachdb
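The `/on-start.sh` hook baked into the `cockroachdb/cockroach-k8s-init` image is not part of this diff. What follows is only a sketch of the contract the annotation comments describe, assuming the peer-finder convention of piping discovered peer names to the hook on stdin; the real script may differ:

```shell
#!/bin/bash
# Hedged sketch of an on-start hook: peer-finder pipes one discovered
# peer DNS name per line. If any peer other than this pod is already
# part of the cluster, drop a marker file so the main container knows
# to pass --join rather than bootstrap a new cluster.
while read -r peer; do
  if [ "${peer}" != "$(hostname -f)" ]; then
    touch /cockroach/cockroach-data/cluster_exists_marker
    break
  fi
done
```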
@@ -93,27 +138,23 @@ spec:
         - |
           # The use of qualified `hostname -f` is crucial:
           # Other nodes aren't able to look up the unqualified hostname.
-          CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)")
-          # TODO(tschottdorf): really want to use an init container to do
-          # the bootstrapping. The idea is that the container would know
-          # whether it's on the first node and could check whether there's
-          # already a data directory. If not, it would bootstrap the cluster.
-          # We will need some version of `cockroach init` back for this to
-          # work. For now, just do the same in a shell snippet.
-          # Of course this isn't without danger - if node0 loses its data,
-          # upon restarting it will simply bootstrap a new cluster and smack
-          # it into our existing cluster.
-          # There are likely ways out. For example, the init container could
-          # query the kubernetes API and see whether any other nodes are
-          # around, etc. Or, of course, the admin can pre-seed the lost
-          # volume somehow (and in that case we should provide a better way,
-          # for example a marker file).
+          CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)" "--http-host" "0.0.0.0")
+          # We only want to initialize a new cluster (by omitting the join flag)
+          # if we're sure that we're the first node (i.e. index 0) and that
+          # there aren't any other nodes running as part of the cluster that
+          # this is supposed to be a part of (which indicates that a cluster
+          # already exists and we should make sure not to create a new one).
+          # It's fine to run without --join on a restart if there aren't any
+          # other nodes.
           if [ ! "$(hostname)" == "cockroachdb-0" ] || \
-             [ -e "/cockroach/cockroach-data/COCKROACHDB_VERSION" ]
+             [ -e "/cockroach/cockroach-data/cluster_exists_marker" ]
           then
-            CRARGS+=("--join" "cockroachdb")
+            # We don't join cockroachdb in order to avoid a node attempting
+            # to join itself, which currently doesn't work
+            # (https://github.com/cockroachdb/cockroach/issues/9625).
+            CRARGS+=("--join" "cockroachdb-public")
           fi
-          /cockroach/cockroach ${CRARGS[*]}
+          exec /cockroach/cockroach ${CRARGS[*]}
       # No pre-stop hook is required, a SIGTERM plus some time is all that's
       # needed for graceful shutdown of a node.
       terminationGracePeriodSeconds: 60
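One of the PR's stated goals is that `exec` leaves cockroach running as PID 1, so the SIGTERM sent on pod deletion reaches the database directly. A quick, hedged way to verify this on a running pet:

```shell
# Should print "cockroach"; before this change PID 1 was the bash wrapper.
kubectl exec cockroachdb-0 -- cat /proc/1/comm
```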
--- minikube.sh
+++ minikube.sh
@@ -35,7 +35,7 @@ kubectl delete petsets,pods,persistentvolumes,persistentvolumeclaims,services -l
 # claims here manually even though that sounds counter-intuitive. For details
 # see https://github.com/kubernetes/contrib/pull/1295#issuecomment-230180894.
 # Note that we make an extra volume here so you can manually test scale-up.
-for i in $(seq 0 5); do
+for i in $(seq 0 3); do
   cat <<EOF | kubectl create -f -
 kind: PersistentVolume
 apiVersion: v1
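The heredoc body is truncated by the hunk boundary right after `apiVersion: v1`. As a sketch only, the loop might continue along the following lines; the hostPath, capacity, and names are placeholders, and minikube.sh itself remains the authoritative version:

```shell
# Hedged sketch of the volume-creation loop: one hostPath volume per pet
# (plus the spare), suitable only for single-node testing.
for i in $(seq 0 3); do
  cat <<EOF | kubectl create -f -
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv${i}
  labels:
    app: cockroachdb
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/cockroachdb-${i}"
EOF
done
```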