Elasticsearch: some notes about shards and replicas

An Elasticsearch index is composed of one or more shards. A shard is either a primary or a replica: a write goes to the primary shard first, and the replicas are created or updated after the primary.
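To make the "primary first" step concrete, here is a minimal sketch of how a write could be mapped to one primary shard. It is a simplification: Elasticsearch hashes the routing value (the document id by default) with Murmur3, while this sketch uses `zlib.crc32` as a stand-in, and the function name `pick_primary_shard` is hypothetical.

```python
import zlib


def pick_primary_shard(routing_key: str, number_of_primary_shards: int) -> int:
    """Return the primary shard number a document would be routed to (simplified)."""
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash (assumption): Elasticsearch uses Murmur3 on the routing value;
    # zlib.crc32 is used here only because it ships with Python.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_primary_shards


doc_ids = ["user-1", "user-2", "user-3", "user-4"]

# Same documents, routed against 5 and then 6 primary shards.
with_5 = {doc_id: pick_primary_shard(doc_id, 5) for doc_id in doc_ids}
with_6 = {doc_id: pick_primary_shard(doc_id, 6) for doc_id in doc_ids}

# Documents whose target shard differs between the two shard counts.
moved = [doc_id for doc_id in doc_ids if with_5[doc_id] != with_6[doc_id]]
print(with_5, with_6, moved)
```

Because the shard number depends on the number of primary shards, changing that number would send existing documents to different shards, which is why the count is fixed once the index exists.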

Shards.

Shards are how Elasticsearch scales horizontally and distributes load among the nodes of a cluster. A shard is a Lucene index and can hold up to Integer.MAX_VALUE-128 documents (2,147,483,519). Tune your index so that no shard grows beyond roughly 20-30GB, or you are likely to run into out-of-memory issues. The number of primary shards is fixed when the index is created, so changing it means building (or rebuilding) a new index. Set it while creating the index, either directly in its settings or through a template that carries the settings and mappings.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-params.html for the mapping parameters. The shard count itself goes in the index settings:

{
  "settings": {
    "index.number_of_shards": 10
  }
}

See the index template API for details: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
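As a rough way to pick a value for index.number_of_shards, the two limits mentioned above (shard size and documents per shard) can be turned into a small calculation. This is only a sketch: the function names are hypothetical, the 25GB default is an assumed midpoint of the 20-30GB range, and real sizing also depends on query load and hardware.

```python
import math

# Per-shard document limit quoted above: Integer.MAX_VALUE - 128.
LUCENE_MAX_DOCS = 2_147_483_647 - 128


def estimate_primary_shards(
    expected_index_gb: float,
    expected_docs: int,
    target_shard_gb: float = 25.0,  # assumption: midpoint of the 20-30GB guideline
) -> int:
    """Estimate how many primary shards keep each shard under both limits."""
    by_size = math.ceil(expected_index_gb / target_shard_gb)
    by_docs = math.ceil(expected_docs / LUCENE_MAX_DOCS)
    return max(1, by_size, by_docs)


def index_settings(number_of_shards: int) -> dict:
    """Build the same settings body as the JSON example above."""
    return {"settings": {"index.number_of_shards": number_of_shards}}


shards = estimate_primary_shards(expected_index_gb=250, expected_docs=1_000_000)
print(index_settings(shards))
```

The estimate takes the larger of the two constraints, so an index with small documents but a huge document count is still split enough to stay under the per-shard document limit.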

Replicas.

Replicas are important to scale reads and lower the pressure on the primaries (a search can be served by a replica as well as by the primary), and they make your index resilient to failure. A replica can be promoted to primary shard if needed.
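The effect of the replica count on cluster size and resilience can be sketched with a little arithmetic. The helper names below are hypothetical, and the failure estimate only covers data availability: it relies on the fact that copies of the same shard are never allocated to the same node, and it ignores master election and disk capacity.

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Every primary shard has `replicas` extra copies."""
    return primaries * (1 + replicas)


def min_nodes_for_full_allocation(replicas: int) -> int:
    """Copies of one shard never share a node, so each copy needs its own node."""
    return replicas + 1


def node_failures_tolerated(replicas: int, nodes: int) -> int:
    """Simultaneous node losses that still leave one copy of every shard."""
    # With fewer nodes than copies, the extra replicas stay unassigned.
    assigned_copies = min(replicas + 1, nodes)
    return max(0, assigned_copies - 1)


print(total_shard_copies(primaries=10, replicas=2))
print(min_nodes_for_full_allocation(replicas=2))
print(node_failures_tolerated(replicas=2, nodes=3))
```

Adding replicas beyond what the node count can host does not add resilience, since the surplus copies cannot be assigned anywhere.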

The number of replicas can be changed on the fly:

curl -X PUT "localhost:9200/indexname/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 2
  }
}
'
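The same update can be prepared from a script. The sketch below only builds the HTTP request with Python's standard library and assumes, like the curl command, a node at localhost:9200 and an index called indexname; the helper name `build_replica_update` is hypothetical.

```python
import json
import urllib.request


def build_replica_update(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build a PUT /<index>/_settings request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_replica_update("http://localhost:9200", "indexname", 2)
print(request.get_method(), request.full_url)

# Sending needs a reachable cluster, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the request construction separate from the send makes it easy to inspect the body before touching a live cluster.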