Let's continue our Elastic journey with some basic Elasticsearch concepts, and then look at how to index, search, update, and delete data.
Basic Concepts in Elasticsearch
Before diving into Elasticsearch, let's take a quick look at some of its basic concepts.
Some concepts you need to know:
- Document: A document is a basic unit of information that can be indexed. It is expressed in JSON format.
- Shard: A shard is a single instance of Lucene, the library Elasticsearch uses to index and search data. It is a low-level worker unit that holds just a slice of all the data in an index. An index, in turn, is a logical namespace that points to one or more physical shards.
- Index: An index is a collection of documents that have somewhat similar characteristics. For example, you can have an index for customer data, another index for a product catalog, and yet another index for order data. An index is identified by a name (that must be all lowercase) and this name is used to refer to the index when performing indexing, search, update, and delete operations against the documents in it.
- Node: A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities. A node can take on different roles: it can be a master node, a data node, or a client node. By default, each node is configured to act as both a master node and a data node. In a single-node cluster, this is exactly what you want. In a production environment, however, you should set up a cluster of multiple nodes for redundancy and performance.
- Cluster: A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes. A cluster is identified by a unique name, which by default is "elasticsearch". This name is important because a node can only be part of a cluster if it is set up to join the cluster by that name.
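To make the document and index concepts more concrete, here is a minimal Python sketch. It only builds the pieces of an indexing request and sends nothing to a cluster. The helper names (`is_valid_index_name`, `build_index_request`) and the sample customer data are made up for illustration. The `/<index>/_doc/<id>` path assumes a recent Elasticsearch version, and the name check covers only the lowercase rule mentioned above, not every restriction Elasticsearch applies.

```python
import json


def is_valid_index_name(name: str) -> bool:
    """Check only the rule described above: an index name must be all lowercase."""
    return name != "" and name == name.lower()


def build_index_request(index: str, doc_id: str, document: dict) -> dict:
    """Describe the HTTP request that would store `document` in `index`.

    Hypothetical helper: it returns a plain dict and performs no network call.
    """
    if not is_valid_index_name(index):
        raise ValueError(f"index name must be lowercase: {index!r}")
    return {
        "method": "PUT",
        # Assumed path layout: /<index>/_doc/<id>
        "path": f"/{index}/_doc/{doc_id}",
        # A document is expressed as JSON.
        "body": json.dumps(document, sort_keys=True),
    }


# Sample document for a hypothetical "customer" index.
customer = {"name": "Jane Doe", "city": "Berlin"}
request = build_index_request("customer", "1", customer)

print(request["method"], request["path"])  # PUT /customer/_doc/1
print(request["body"])  # {"city": "Berlin", "name": "Jane Doe"}
```

Passing a name such as "Customer" raises a `ValueError`, which mirrors the lowercase requirement for index names.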