How to deploy an ElasticSearch cluster using CoreOS and Consul

03 May, 2015

The hot potato in the room of containerized solutions is persistent services. Stateless applications are easy and trivial, but deploying a persistent service like ElasticSearch is a totally different ball game. In this blog post we will show you how easy it is to create ElasticSearch clusters on this platform. The key to this ease is the ability to look up the external IP addresses and port numbers of all cluster members in Consul, combined with the reusable power of CoreOS unit file templates. The presented solution is a ready-to-use ElasticSearch component for your application. This solution:

  • uses ephemeral ports so that we can actually run multiple ElasticSearch nodes on the same host
  • mounts persistent storage under each node to prevent data loss on server crashes
  • uses the power of the CoreOS unit template files to deploy new ElasticSearch clusters.

In previous blog posts we defined our highly available Docker container platform using CoreOS and Consul and showed how to add persistent storage to a Docker container. Once this platform is booted, the only thing you need to do to deploy an ElasticSearch cluster is to submit the fleet unit template file elasticsearch@.service and start three or more instances.
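For context, fleet instantiates a template unit such as elasticsearch@.service once per instance name (elasticsearch@1, elasticsearch@2, ...), substituting %p (the part of the name before the @) and %i (the part after it). As a rough, hypothetical sketch of such a template — the actual file in the repository wires in Consul registration and the full docker run command shown later in this post:

[bash]
[Unit]
Description=ElasticSearch node %i
# The platform's persistent storage unit, as seen in `fleetctl list-units`:
Requires=mnt-data.mount
After=mnt-data.mount docker.service

[Service]
# Simplified placeholder; the real ExecStart registers the service in Consul.
ExecStart=/bin/sh -c "/usr/bin/docker run --rm --name %p-%i -P cargonauts/consul-elasticsearch"
ExecStop=/usr/bin/docker stop %p-%i
[/bash]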

Booting the platform

To see the ElasticSearch cluster in action, first boot up our CoreOS platform.

[bash]
git clone
cd coreos-container-platform-as-a-service/vagrant
vagrant up
./
[/bash]

Starting an ElasticSearch cluster

Once the platform is started, submit the elasticsearch unit file and start three instances:

[bash]
export FLEETCTL_TUNNEL=
cd ../fleet-units/elasticsearch
fleetctl submit elasticsearch@.service
fleetctl start elasticsearch@{1..3}
[/bash]

Now wait until all ElasticSearch instances are running by checking the unit status:

[bash]
fleetctl list-units
...
UNIT                     MACHINE         ACTIVE  SUB
elasticsearch@1.service  f3337760.../    active  running
elasticsearch@2.service  ed181b87.../    active  running
elasticsearch@3.service  9e37b320.../    active  running
mnt-data.mount           9e37b320.../    active  mounted
mnt-data.mount           ed181b87.../    active  mounted
mnt-data.mount           f3337760.../    active  mounted
[/bash]

Create an ElasticSearch index

Now that the ElasticSearch cluster is running, you can create an index to store data.

[bash]
curl -XPUT -d \
'{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 2
    }
  }
}'
[/bash]

Insert a few documents

[bash]
curl -XPUT -d@- <<!
{
  "first_name" : "John",
  "last_name" : "Smith",
  "age" : 25,
  "about" : "I love to go rock climbing",
  "interests": [ "sports", "music" ]
}
!
curl -XPUT -d@- <<!
{
  "first_name" : "Jane",
  "last_name" : "Smith",
  "age" : 32,
  "about" : "I like to collect rock albums",
  "interests": [ "music" ]
}
!
curl -XPUT -d@- <<!
{
  "first_name" : "Douglas",
  "last_name" : "Fir",
  "age" : 35,
  "about": "I like to build cabinets",
  "interests": [ "forestry" ]
}
!
[/bash]

And query the index

[bash]
curl -XGET
...
{
  "took": 50,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "failed": 0 },
  "hits": { "total": 2, ... }
}
[/bash]

Restarting the cluster

Even when you restart the entire cluster, your data is persisted.

[bash]
fleetctl stop elasticsearch@{1..3}
fleetctl list-units
fleetctl start elasticsearch@{1..3}
fleetctl list-units
curl -XGET
...
{
  "took": 50,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "failed": 0 },
  "hits": { "total": 2, ... }
}
[/bash]

Open the console

Finally, you can see the servers and the distribution of the index over the cluster by opening the console.

elasticsearch head

Deploy other ElasticSearch clusters

The only thing you need to do to deploy another ElasticSearch cluster is to change the name of the template file.

[bash]
cp elasticsearch\@.service my-cluster\@.service
fleetctl submit my-cluster\@.service
fleetctl start my-cluster\@{1..3}
curl
[/bash]

How does it work?

Starting a node in an ElasticSearch cluster is quite trivial, as shown by the command below:

[bash]
exec gosu elasticsearch elasticsearch \
    \$HOST_LIST \
    --transport.publish_host=$PUBLISH_HOST \
    --transport.publish_port=$PUBLISH_PORT \
    $@
[/bash]

We use the unicast discovery protocol and specify our own publish host and port, plus the list of IP addresses and port numbers of all the other nodes in the cluster.

Finding the other nodes in the cluster

But how do we find the other nodes in the cluster? That is quite easy: we query the Consul REST API for all entries with the same service name that are tagged "es-transport". This is the service exposed by ElasticSearch on port 9300.

[bash]
curl -s https://consul:8500/v1/catalog/service/$SERVICE_NAME?tag=es-transport
...
[
  {
    "Node": "core-03",
    "Address": "",
    "ServiceID": "elasticsearch-1",
    "ServiceName": "elasticsearch",
    "ServiceTags": [ "es-transport" ],
    "ServiceAddress": "",
    "ServicePort": 49170
  },
  {
    "Node": "core-01",
    "Address": "",
    "ServiceID": "elasticsearch-2",
    "ServiceName": "elasticsearch",
    "ServiceTags": [ "es-transport" ],
    "ServiceAddress": "",
    "ServicePort": 49169
  },
  {
    "Node": "core-02",
    "Address": "",
    "ServiceID": "elasticsearch-3",
    "ServiceName": "elasticsearch",
    "ServiceTags": [ "es-transport" ],
    "ServiceAddress": "",
    "ServicePort": 49169
  }
]
[/bash]

Turning this into a comma-separated list of network endpoints is done with the following jq command:

[bash]
curl -s https://consul:8500/v1/catalog/service/$SERVICE_NAME?tag=es-transport |\
  jq -r '[ .[] | [ .Address, .ServicePort | tostring ] | join(":") ] | join(",")'
[/bash]
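To make the joining step concrete, here is the same transformation sketched in plain shell on a stubbed endpoint list. The IP addresses are made up for illustration; the real script pipes live Consul output through the jq filter.

```shell
# Stubbed "address port" pairs, standing in for the jq-extracted Consul data
# (hypothetical addresses; in the platform these come from the catalog API).
ENDPOINTS='10.0.0.3 49170
10.0.0.1 49169
10.0.0.2 49169'

# Join each pair as host:port, comma-separated, the shape the
# ElasticSearch unicast host list expects.
HOST_LIST=$(echo "$ENDPOINTS" | awk '{ printf "%s%s:%s", sep, $1, $2; sep="," }')
echo "$HOST_LIST"
# → 10.0.0.3:49170,10.0.0.1:49169,10.0.0.2:49169
```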

Finding your own network endpoint

As you can see in the JSON output above, each service entry has a unique ServiceID. To obtain our own endpoint, we use the following jq command:

[bash]
curl -s https://consul:8500/v1/catalog/service/$SERVICE_NAME?tag=es-transport |\
  jq -r ".[] | select(.ServiceID==\"$SERVICE_9300_ID\") | .Address, .ServicePort"
[/bash]
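The selection logic can be illustrated without a live Consul agent. Here the catalog is stubbed as "ServiceID Address ServicePort" lines (again with made-up addresses), and we pick out our own row by ServiceID:

```shell
# Stubbed catalog rows: ServiceID, Address, ServicePort (illustrative values).
SERVICES='elasticsearch-1 10.0.0.3 49170
elasticsearch-2 10.0.0.1 49169
elasticsearch-3 10.0.0.2 49169'

SERVICE_9300_ID=elasticsearch-2

# Select our own entry and split it into publish host and port.
set -- $(echo "$SERVICES" | awk -v id="$SERVICE_9300_ID" '$1 == id { print $2, $3 }')
PUBLISH_HOST=$1
PUBLISH_PORT=$2
echo "$PUBLISH_HOST:$PUBLISH_PORT"
# → 10.0.0.1:49169
```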

Finding the number of nodes in the cluster

The intended number of nodes in the cluster is determined by counting the number of fleet unit instance files in CoreOS on startup and passing this number in as an environment variable.

[bash]
TOTAL_NR_OF_SERVERS=$(fleetctl list-unit-files | grep '%p@[^.][^.]*.service' | wc -l)
[/bash]

The %p refers to the part of the fleet unit file name before the @ sign.
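With %p expanded to elasticsearch, the effect of that pipeline can be demonstrated on a stubbed fleetctl listing. Note that `[^.][^.]*` requires at least one character between the @ and .service, so the template file itself is not counted:

```shell
# Stand-in for `fleetctl list-unit-files` output (abbreviated, illustrative).
list_unit_files() {
  printf '%s\n' \
    'elasticsearch@.service   abc123  inactive' \
    'elasticsearch@1.service  abc123  launched' \
    'elasticsearch@2.service  def456  launched' \
    'elasticsearch@3.service  789abc  launched' \
    'mnt-data.mount           abc123  launched'
}

# Counts only the numbered instances, skipping the bare template file.
TOTAL_NR_OF_SERVERS=$(list_unit_files | grep 'elasticsearch@[^.][^.]*.service' | wc -l)
echo "$TOTAL_NR_OF_SERVERS"
# → 3
```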

The Docker run command

The Docker run command is shown below. ElasticSearch exposes two ports: port 9200 exposes a REST API to clients, and port 9300 is used as the transport protocol between the nodes in the cluster. Each port is registered as a service in Consul and tagged appropriately.

[bash]
ExecStart=/bin/sh -c "/usr/bin/docker run --rm \
    --name %p-%i \
    --env SERVICE_NAME=%p \
    --env SERVICE_9200_TAGS=http \
    --env SERVICE_9300_ID=%p-%i \
    --env SERVICE_9300_TAGS=es-transport \
    --env TOTAL_NR_OF_SERVERS=$(fleetctl list-unit-files | grep '%p@[^.][^.]*.service' | wc -l) \
    -P \
    --dns $(ifconfig docker0 | grep 'inet ' | awk '{print $2}') \
    --dns-search=service.consul \
    cargonauts/consul-elasticsearch"
[/bash]

The options are explained in the table below:

  • --env SERVICE_NAME=%p — The name of this service to be advertised in Consul, resulting in an FQDN of %p.service.consul; it is also used as the cluster name. %p refers to the part of the fleet unit template file name up to the @.
  • --env SERVICE_9200_TAGS=http — The tag assigned to the service at port 9200. This is picked up by the http-router, so that any HTTP traffic to the host elasticsearch is directed to this port.
  • --env SERVICE_9300_ID=%p-%i — The unique ID of this service in Consul. This is used by the startup script to find its own external port and IP address in Consul, and is used as the node name of the ES server. %p refers to the part of the fleet unit template file name up to the @; %i refers to the part between the @ and the .service.
  • --env SERVICE_9300_TAGS=es-transport — The tag assigned to the service at port 9300. This is used by the startup script to find the other servers in the cluster.
  • --env TOTAL_NR_OF_SERVERS=$(...) — The number of submitted unit files, counted and passed in as the environment variable TOTAL_NR_OF_SERVERS. The start script waits until this number of servers is actually registered in Consul before starting the ElasticSearch instance.
  • --dns $(...) — Points DNS at the docker0 interface, where Consul is bound on port 53. (The docker0 interface IP address is chosen at random from a specific range.)
  • --dns-search=service.consul — The default DNS search domain.
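The waiting behaviour behind TOTAL_NR_OF_SERVERS can be sketched as a simple poll loop. The Consul query is stubbed here with a counter so the loop can be shown without a live agent; the real script would count the es-transport entries returned by the Consul catalog API.

```shell
# Counter file simulating Consul's view of the cluster.
COUNT_FILE=$(mktemp)
echo 0 > "$COUNT_FILE"

# Stub: pretends one more node registers on every poll. A real implementation
# might instead run something like:
#   curl -s http://consul:8500/v1/catalog/service/$SERVICE_NAME?tag=es-transport | jq length
registered_count() {
  n=$(cat "$COUNT_FILE")
  echo $((n + 1)) > "$COUNT_FILE"
  echo "$n"
}

TOTAL_NR_OF_SERVERS=3
while [ "$(registered_count)" -lt "$TOTAL_NR_OF_SERVERS" ]; do
  sleep 0.1   # the real script would pause between Consul polls
done
echo "all $TOTAL_NR_OF_SERVERS servers registered"
```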


The sources for the ElasticSearch repository can be found on GitHub:

  • the complete startup script of elasticsearch
  • elasticsearch — CoreOS fleet unit files for the elasticsearch cluster
  • consul-elasticsearch — sources for the Consul ElasticSearch repository


CoreOS fleet template unit files are a powerful way of deploying ready-to-use components for your platform. If you want to deploy cluster-aware applications, a service registry like Consul is essential.

Mark van Holsteijn
Mark van Holsteijn is a senior software systems architect at Xebia Cloud-native solutions. He is passionate about removing waste in the software delivery process and keeping things clear and simple.
