Elasticsearch, some note about shards and replicas

Elasticsearch, some note about shards and replicas


An elasticsearch index is composed of one or more shards. Shards can be primary or replica, primary shards are the first involved in a write, replica are created or updated after the primary.

Shards.

Shards are how elasticsearch scales horizontally and distributes load among the nodes of a cluster. A shard is a lucene index, it can hold up to Integer.MAX_VALUE-128 documents (2,147,483,519). You must tune your index to make sure you won’t have shards greater than 20/30GB (or you will have out of memory issues very soon). Changing the number of shards requires to build/re-build a new index, you must set this while creating the index (or through a proper mapping configuration),

see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-params.html

{ 
  "settings":
   {
    "index.number_of_shards": 10
   }
}

See template API for details: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html

Replicas.

They are important to scale reads and lower the pressure (replica are usually involved in the reads as primary target) and make you’re index resilient to failure. A replica can become a primary shard if needed.

The replicas can be set on-the-fly

curl -X PUT "localhost:9200/indexname/_settings" -H 'Content-Type: application/json' -d' {     "index" : {         "number_of_replicas" : 2     } } '
Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s