Revamped Elasticsearch example that now uses an Alpine Linux container with JRE 8u51 and Elasticsearch 1.7.1.

Replaced the Go-based discovery mechanism with an Elasticsearch discovery plug-in that supports Kubernetes.
Paulo Pires
2015-08-21 12:30:11 +01:00
parent ae1236303e
commit 0a64995b7b
21 changed files with 618 additions and 873 deletions

@@ -33,206 +33,130 @@ Documentation for other releases can be found at
# Elasticsearch for Kubernetes
Kubernetes makes it trivial for anyone to easily build and scale [Elasticsearch](http://www.elasticsearch.org/) clusters. Here, you'll find how to do so. The current Elasticsearch version is `1.7.1`.

The stock Elasticsearch image will not form a cluster on Kubernetes, because multicast discovery cannot find the other pod IPs. The image used here instead ships with a discovery plug-in that talks to the Kubernetes API server to find the other Elasticsearch [pods](../../docs/user-guide/pods.md) and uses them as unicast peer hosts.

[A more robust example that follows Elasticsearch best-practices of separating node concerns is also available](production_cluster/README.md).
<img src="http://kubernetes.io/img/warning.png" alt="WARNING" width="25" height="25"> Current pod descriptors use an `emptyDir` for storing data in each data node container. This keeps the example simple and [should be adapted according to your storage needs](../../docs/design/persistent-storage.md).
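As a reference, here is a minimal sketch of what such a pod descriptor might look like; the resource names, image tag, and environment variables are illustrative assumptions, not copied verbatim from `es-rc.yaml`:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: es
  labels:
    component: elasticsearch
spec:
  replicas: 1
  selector:
    component: elasticsearch
  template:
    metadata:
      labels:
        component: elasticsearch
    spec:
      serviceAccount: elasticsearch   # created by service-account.yaml (see Deploy)
      containers:
      - name: es
        # Hypothetical image reference; see the Docker image section below.
        image: pires/docker-elasticsearch-kubernetes:1.7.1
        env:
        - name: CLUSTER_NAME          # assumed to be read by the image at startup
          value: myesdb
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        volumeMounts:
        - name: storage
          mountPath: /data
      volumes:
      - name: storage
        emptyDir: {}                  # simple but ephemeral; adapt to your storage needs
```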
## Docker image
This example uses [this pre-built image](https://github.com/pires/docker-elasticsearch-kubernetes). Feel free to fork it to fit your own needs, but mind that you will need to change the Kubernetes descriptors accordingly.
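The discovery plug-in bundled in the image (it shows up as `cloud-kubernetes` in the node logs below) is configured through `elasticsearch.yml`. Conceptually, that configuration looks something like the following sketch; the exact setting names are an assumption about the plug-in, not copied from the image:

```yaml
# Hypothetical discovery settings -- key names depend on the plug-in version.
cloud:
  kubernetes:
    service: elasticsearch   # Service whose endpoints list the peer pods
    namespace: default       # namespace to search for Elasticsearch pods
discovery:
  type: kubernetes           # use the plug-in instead of multicast discovery
```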
## Deploy
Let's kickstart our cluster with 1 instance of Elasticsearch.
```
kubectl create -f examples/elasticsearch/service-account.yaml
kubectl create -f examples/elasticsearch/es-svc.yaml
kubectl create -f examples/elasticsearch/es-rc.yaml
```
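For orientation, here are minimal sketches of what the first two descriptors might contain. The service name, labels, and ports match the `kubectl get service` output shown later; everything else is an assumption:

```yaml
# service-account.yaml -- identity the pods use to query the API server
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
---
# es-svc.yaml -- fronts the Elasticsearch pods
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    component: elasticsearch
spec:
  selector:
    component: elasticsearch
  ports:
  - name: http
    port: 9200
    protocol: TCP
  - name: transport
    port: 9300
    protocol: TCP
```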
Let's see if it worked:
```console
$ kubectl get pods
NAME             READY     STATUS    RESTARTS   AGE
es-kfymw         1/1       Running   0          7m
kube-dns-p3v1u   3/3       Running   0          19m
```
Let's take a look at the logs:

```console
$ kubectl logs es-kfymw
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
[2015-08-30 10:01:31,946][INFO ][node ] [Hammerhead] version[1.7.1], pid[7], build[b88f43f/2015-07-29T09:54:16Z]
[2015-08-30 10:01:31,946][INFO ][node ] [Hammerhead] initializing ...
[2015-08-30 10:01:32,110][INFO ][plugins ] [Hammerhead] loaded [cloud-kubernetes], sites []
[2015-08-30 10:01:32,153][INFO ][env ] [Hammerhead] using [1] data paths, mounts [[/data (/dev/sda9)]], net usable_space [14.4gb], net total_space [15.5gb], types [ext4]
[2015-08-30 10:01:37,188][INFO ][node ] [Hammerhead] initialized
[2015-08-30 10:01:37,189][INFO ][node ] [Hammerhead] starting ...
[2015-08-30 10:01:37,499][INFO ][transport ] [Hammerhead] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.244.48.2:9300]}
[2015-08-30 10:01:37,550][INFO ][discovery ] [Hammerhead] myesdb/n2-6uu_UT3W5XNrjyqBPiA
[2015-08-30 10:01:43,966][INFO ][cluster.service ] [Hammerhead] new_master [Hammerhead][n2-6uu_UT3W5XNrjyqBPiA][es-kfymw][inet[/10.244.48.2:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2015-08-30 10:01:44,010][INFO ][http ] [Hammerhead] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.244.48.2:9200]}
[2015-08-30 10:01:44,011][INFO ][node ] [Hammerhead] started
[2015-08-30 10:01:44,042][INFO ][gateway ] [Hammerhead] recovered [0] indices into cluster_state
```
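Note the `plugins` line in the log above: the `cloud-kubernetes` discovery plug-in has been loaded, which is what performs the peer detection described at the top of this document.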
So we have a 1-node Elasticsearch cluster ready to handle some work.
## Scale

Scaling is as easy as:

```
kubectl scale --replicas=3 rc es
```
Did it work?
```console
$ kubectl get pods
NAME             READY     STATUS    RESTARTS   AGE
es-78e0s         1/1       Running   0          8m
es-kfymw         1/1       Running   0          17m
es-rjmer         1/1       Running   0          8m
kube-dns-p3v1u   3/3       Running   0          30m
```
Let's take a look at logs:
```console
$ kubectl logs es-kfymw
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
[2015-08-30 10:01:31,946][INFO ][node ] [Hammerhead] version[1.7.1], pid[7], build[b88f43f/2015-07-29T09:54:16Z]
[2015-08-30 10:01:31,946][INFO ][node ] [Hammerhead] initializing ...
[2015-08-30 10:01:32,110][INFO ][plugins ] [Hammerhead] loaded [cloud-kubernetes], sites []
[2015-08-30 10:01:32,153][INFO ][env ] [Hammerhead] using [1] data paths, mounts [[/data (/dev/sda9)]], net usable_space [14.4gb], net total_space [15.5gb], types [ext4]
[2015-08-30 10:01:37,188][INFO ][node ] [Hammerhead] initialized
[2015-08-30 10:01:37,189][INFO ][node ] [Hammerhead] starting ...
[2015-08-30 10:01:37,499][INFO ][transport ] [Hammerhead] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.244.48.2:9300]}
[2015-08-30 10:01:37,550][INFO ][discovery ] [Hammerhead] myesdb/n2-6uu_UT3W5XNrjyqBPiA
[2015-08-30 10:01:43,966][INFO ][cluster.service ] [Hammerhead] new_master [Hammerhead][n2-6uu_UT3W5XNrjyqBPiA][es-kfymw][inet[/10.244.48.2:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2015-08-30 10:01:44,010][INFO ][http ] [Hammerhead] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.244.48.2:9200]}
[2015-08-30 10:01:44,011][INFO ][node ] [Hammerhead] started
[2015-08-30 10:01:44,042][INFO ][gateway ] [Hammerhead] recovered [0] indices into cluster_state
[2015-08-30 10:08:02,517][INFO ][cluster.service ] [Hammerhead] added {[Tenpin][2gv5MiwhRiOSsrTOF3DhuA][es-78e0s][inet[/10.244.54.4:9300]]{master=true},}, reason: zen-disco-receive(join from node[[Tenpin][2gv5MiwhRiOSsrTOF3DhuA][es-78e0s][inet[/10.244.54.4:9300]]{master=true}])
[2015-08-30 10:10:10,645][INFO ][cluster.service ] [Hammerhead] added {[Evilhawk][ziTq2PzYRJys43rNL2tbyg][es-rjmer][inet[/10.244.33.3:9300]]{master=true},}, reason: zen-disco-receive(join from node[[Evilhawk][ziTq2PzYRJys43rNL2tbyg][es-rjmer][inet[/10.244.33.3:9300]]{master=true}])
```
So we have a 3-node Elasticsearch cluster ready to handle more work.
## Access the service
*Don't forget* that services in Kubernetes are only accessible from containers in the cluster. For different behavior you should [configure the creation of an external load-balancer](http://kubernetes.io/v1.0/docs/user-guide/services.html#type-loadbalancer). While it's supported by this example's service descriptor, its usage is out of scope of this document, for now.
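For reference, enabling an external load-balancer is a small change to the service descriptor; a minimal sketch, with field values assumed for illustration:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  type: LoadBalancer   # ask the cloud provider to provision an external IP
  selector:
    component: elasticsearch
  ports:
  - port: 9200
    targetPort: 9200
```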
```
$ kubectl get service elasticsearch
NAME            LABELS                    SELECTOR                  IP(S)           PORT(S)
elasticsearch   component=elasticsearch   component=elasticsearch   10.100.108.94   9200/TCP
                                                                                    9300/TCP
```
From any host on your cluster (that's running `kube-proxy`), run:
```
$ curl 10.100.108.94:9200
```
You should see something similar to the following:
```json
{
  "status" : 200,
  "name" : "Hammerhead",
  "cluster_name" : "myesdb",
  "version" : {
    "number" : "1.7.1",
    "build_hash" : "b88f43fc40b0bcd7f173a1f9ee2e97816de80b19",
    "build_timestamp" : "2015-07-29T09:54:16Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
```
Or if you want to check cluster information:
```console
curl 10.100.108.94:9200/_cluster/health?pretty
```
You should see something similar to the following:
```json
{
  "cluster_name" : "myesdb",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
```
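With the cluster green, you can start putting it to work. As a quick sanity check, index a document through the same service IP (the index and type names here are made up for illustration):

```console
curl -XPUT 10.100.108.94:9200/myindex/mytype/1 -d '{"hello": "world"}'
```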
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/examples/elasticsearch/README.md?pixel)]()
<!-- END MUNGE: GENERATED_ANALYTICS -->