ElasticSearch Flashcards
What is ElasticSearch?
It is a highly scalable, open-source full-text search and analytics engine. It allows you to store, search, and analyze large volumes of data.
What is an Index and a Document in ES?
An index is a collection of documents.
A document is the basic unit of information that can be indexed, expressed in JSON format.
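As a rough illustration, a document can be pictured as a JSON object and an index as a named collection of such objects. The field names and the `books_index` variable below are made up for this sketch, and the in-memory dictionary only stands in for a real index.

```python
import json

# A document: one JSON object (field names are hypothetical).
document = {"title": "Elasticsearch basics", "year": 2020, "tags": ["search", "analytics"]}

# An index pictured as a named collection of documents keyed by ID.
# This dict is only a stand-in for a real index.
books_index = {
    "1": document,
    "2": {"title": "Lucene in brief", "year": 2018, "tags": ["search"]},
}

# Documents are sent to and returned by ES as JSON text.
as_json = json.dumps(document)
print(as_json)
print(len(books_index), "documents in the index")
```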
What are Shards in ES?
Suppose you want to store billions of documents in a single index. The size could easily exceed several TBs. For that reason, you can divide the index into several parts, called shards. It allows you to horizontally split your data.
To tolerate failures, you can maintain one or more copies of each shard, which are called replicas.
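A minimal sketch of the idea, assuming a simplified routing rule in which a document ID is hashed and taken modulo the number of primary shards. `zlib.crc32` is only a stand-in hash (not the one ES uses), and the helper name `pick_shard` is hypothetical. `number_of_shards` and `number_of_replicas` are the index settings that control the split and the copies.

```python
import zlib

# Index settings body: 3 primary shards, each with 1 replica copy.
settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 1}}

def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Simplified routing: hash the ID, then take it modulo the shard count."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

num_shards = settings["settings"]["number_of_shards"]
num_replicas = settings["settings"]["number_of_replicas"]

# Spread some document IDs over the primary shards.
placement = {}
for i in range(10):
    doc_id = f"doc-{i}"
    placement.setdefault(pick_shard(doc_id, num_shards), []).append(doc_id)

# Total shard copies stored in the cluster: primaries plus their replicas.
total_copies = num_shards * (1 + num_replicas)
print(placement)
print("total shard copies:", total_copies)
```

Under a rule like this, the shard a document lands on depends on the number of primary shards, which is why that number is fixed when the index is created, while the replica count can be changed later.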
What are the types of nodes in ES?
Master Nodes: Control the cluster.
Data Nodes: Hold data and perform data-related operations such as CRUD.
Ingest Nodes: Apply an ingest pipeline to a document in order to transform and enrich the document before indexing.
Remote Nodes: Act as cross-cluster clients that connect to remote clusters.
Transform Nodes: Run transforms; required if you want to use the transforms feature.
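In recent ES versions, node types are assigned through the `node.roles` setting in `elasticsearch.yml`. The sketch below maps each type listed above to its role name and renders the corresponding setting line. The helper `roles_line` and the dictionary name are hypothetical.

```python
# Role names as used in the node.roles setting.
ROLE_FOR_NODE_TYPE = {
    "Master": "master",
    "Data": "data",
    "Ingest": "ingest",
    "Remote": "remote_cluster_client",
    "Transform": "transform",
}

def roles_line(node_types):
    """Render a node.roles line for elasticsearch.yml from the given node types."""
    roles = [ROLE_FOR_NODE_TYPE[t] for t in node_types]
    return "node.roles: [ " + ", ".join(roles) + " ]"

# A node that both holds data and runs ingest pipelines.
print(roles_line(["Data", "Ingest"]))
```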
What is Logstash?
It is a data processing pipeline. It allows you to collect data from a variety of sources, transform it on the fly, and send it to your desired destination.
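Logstash is configured with its own pipeline syntax, so the Python below is only an analogy for the collect, transform, and send flow. All function names and the log format are assumptions made for this sketch.

```python
def read_input(lines):
    """Input stage: turn raw lines into events."""
    for line in lines:
        yield {"message": line}

def apply_filter(events):
    """Filter stage: parse and enrich each event on the fly."""
    for event in events:
        level, _, text = event["message"].partition(" ")
        yield {**event, "level": level, "text": text}

def write_output(events, sink):
    """Output stage: send events to the destination (here, a list)."""
    for event in events:
        sink.append(event)

raw = ["INFO service started", "ERROR disk full"]
destination = []
write_output(apply_filter(read_input(raw)), destination)
print(destination)
```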
What is the ELK stack?
It is a collection of three products: ElasticSearch, Logstash, and Kibana.
What is a replica in ElasticSearch?
A replica is an exact copy of a shard, which helps with high availability and fault tolerance.
What are the main operations you can perform on a document?
Indexing, Fetching, Updating and Deleting
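The four operations map onto REST calls. The sketch below only builds the requests rather than sending them, assuming the typeless document endpoints of recent ES versions. The index name `books`, the document fields, and the helper `build_request` are hypothetical.

```python
import json

def build_request(operation, index, doc_id, body=None):
    """Return (HTTP method, path, JSON body or None) for one document operation."""
    routes = {
        "index": ("PUT", f"/{index}/_doc/{doc_id}"),
        "fetch": ("GET", f"/{index}/_doc/{doc_id}"),
        "update": ("POST", f"/{index}/_update/{doc_id}"),
        "delete": ("DELETE", f"/{index}/_doc/{doc_id}"),
    }
    method, path = routes[operation]
    payload = json.dumps(body) if body is not None else None
    return method, path, payload

requests_to_send = [
    build_request("index", "books", "1", {"title": "ES basics", "year": 2020}),
    build_request("fetch", "books", "1"),
    # A partial update wraps the changed fields in "doc".
    build_request("update", "books", "1", {"doc": {"year": 2021}}),
    build_request("delete", "books", "1"),
]
for method, path, payload in requests_to_send:
    print(method, path, payload or "")
```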
Compare the different entities between ElasticSearch and an RDBMS
Index - Database
Type - Table
Document - Row
Field - Column
Mapping - Schema
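To make the Mapping - Schema and Field - Column analogy concrete, here is a small sketch of a mapping body and a helper that lists its fields the way one would list a table's columns. The field names and the helper `as_columns` are hypothetical.

```python
import json

# A mapping declares fields and their types, much like a table schema.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "year": {"type": "integer"},
            "published": {"type": "date"},
        }
    }
}

def as_columns(mapping_body):
    """List (field, type) pairs, the way one would list a table's columns."""
    props = mapping_body["mappings"]["properties"]
    return [(name, spec["type"]) for name, spec in props.items()]

print(json.dumps(mapping, indent=2))
print(as_columns(mapping))
```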
What is Type in ES?
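A type is like a table: it holds multiple documents. Mapping types were deprecated in ES 6.x and removed in later versions, so a modern index holds a single document type.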
How is Lucene related to ElasticSearch?
ES is built on top of Lucene and exposes Lucene's features through a JSON-based REST API.
ES also provides a distributed system on top of Lucene.
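Lucene's core structure for full-text search is the inverted index, which maps each term to the documents that contain it. The toy version below is only meant to convey the idea; real Lucene adds text analysis, relevance scoring, and on-disk segments. The names `build_inverted_index` and `search` are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return IDs of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {1: "Distributed search engine", 2: "Search and analytics", 3: "Relational database"}
inverted = build_inverted_index(docs)
print(sorted(search(inverted, "search")))
print(sorted(search(inverted, "search analytics")))
```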