Paging and Sorting, ElasticSearch Flashcards by Thomas Reddy

What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine built on top of Apache Lucene. It’s designed for horizontal scalability, reliability, and near real-time search.

What is Apache Lucene?

Apache Lucene is a high-performance, full-featured text search engine library written in Java. Elasticsearch builds upon Lucene’s capabilities to provide distributed search and analytics.

What are the key features of Elasticsearch?

Distributed and scalable architecture
Near real-time search and analytics
Full-text search capabilities
Support for structured and unstructured data
RESTful API for easy integration
Powerful query DSL (Domain Specific Language)

What is a cluster in Elasticsearch?

A cluster in Elasticsearch is a collection of one or more nodes (servers) that together hold all of your data and provide federated indexing and search capabilities across all nodes.

What is a node in Elasticsearch?

A node is a single running instance of Elasticsearch, typically one per server, that is part of your cluster. It stores data and participates in the cluster’s indexing and search capabilities.

What is an index in Elasticsearch?

An index in Elasticsearch is similar to a database in the relational database world. It is a collection of documents that have somewhat similar characteristics.

What is a document in Elasticsearch?

A document is a basic unit of information that can be indexed. It is expressed in JSON (JavaScript Object Notation) format and consists of field-value pairs.

What is a shard in Elasticsearch?

A shard is a subset of documents from an index. Elasticsearch divides an index into multiple shards to distribute data across multiple nodes and to achieve horizontal scalability.
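
As a rough illustration of how a document ends up on one shard, the sketch below hashes a document ID and takes the remainder modulo the number of primary shards. Elasticsearch's real routing uses a Murmur3 hash of the routing value plus extra factors; `zlib.crc32` and the name `pick_shard` are stand-ins assumed here for simplicity.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a shard number (simplified, not Elasticsearch's exact hash)."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


# Spread ten hypothetical document IDs over three primary shards.
doc_ids = [f"doc-{i}" for i in range(10)]
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
```

Because the result depends on the shard count, a scheme like this only keeps working while the number of primary shards stays fixed.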

What is a replica in Elasticsearch?

A replica is a copy of a shard. Elasticsearch creates replicas to provide high availability and failover mechanisms. Replicas also help to distribute search load across multiple nodes.

What is a query DSL in Elasticsearch?

Query DSL (Domain Specific Language) is a powerful JSON-based query language used in Elasticsearch to define various search queries and filters. It allows for complex queries to be constructed easily.
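
A minimal sketch of what a Query DSL request body can look like, built as a plain Python dict and serialized to JSON. The field names `title` and `published_at` are hypothetical; a body like this would be sent to an index's `_search` endpoint.

```python
import json

# Hypothetical field names, used only for illustration.
search_body = {
    # Full-text match on one field.
    "query": {"match": {"title": "distributed search"}},
    # Newest documents first.
    "sort": [{"published_at": {"order": "desc"}}],
}

# The JSON text that would form the HTTP request body.
request_json = json.dumps(search_body, indent=2)
```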

What is the purpose of an analyzer in Elasticsearch?

An analyzer is responsible for processing the text of fields in documents during indexing and search operations. It breaks down text into individual terms or tokens and applies various transformations like stemming, lowercase conversion, and removing stop words.
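
The pipeline described above can be sketched as a toy analyzer: a tokenizer followed by a lowercase step and a stop-word step. This is not Elasticsearch's implementation; the stop-word list and the function name `analyze` are assumptions, and stemming is left out.

```python
import re

# Tiny illustrative stop-word list, not the one Elasticsearch ships.
STOP_WORDS = {"a", "an", "and", "the", "of", "is"}


def analyze(text: str) -> list[str]:
    """Split text into word tokens, lowercase them, and drop stop words."""
    tokens = re.findall(r"\w+", text)                   # tokenizer step
    tokens = [token.lower() for token in tokens]        # lowercase filter
    return [t for t in tokens if t not in STOP_WORDS]   # stop-word filter


terms = analyze("The Quick Brown Fox and the Lazy Dog")
# terms == ["quick", "brown", "fox", "lazy", "dog"]
```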

What are the three main types of analyzers in Elasticsearch?

Standard analyzer (the default when none is specified)
Custom analyzer
Language-specific analyzer (e.g., English analyzer, French analyzer)

What is a token filter in Elasticsearch?

A token filter is a component of the analyzer that processes individual tokens generated from the text. It applies additional transformations such as lowercase conversion, stemming, synonym expansion, and stop word removal.

What is a mapping in Elasticsearch?

A mapping defines the fields and properties of documents in an index, including their data types, analyzers, and other settings. It provides the schema for the documents stored in the index.
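
As an illustration, an explicit mapping could be written as the request body below, shown here as a Python dict. The field names are hypothetical; the types (`text`, `keyword`, `date`, `integer`) and the `english` analyzer are standard Elasticsearch names.

```python
import json

# Hypothetical fields for a collection of articles.
mapping_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},  # analyzed full text
            "author": {"type": "keyword"},                     # exact-value field
            "published_at": {"type": "date"},
            "page_count": {"type": "integer"},
        }
    }
}

# JSON text that could be supplied when creating the index.
mapping_json = json.dumps(mapping_body, indent=2)
```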

What are dynamic mappings in Elasticsearch?

Dynamic mappings in Elasticsearch allow the index to automatically detect and create mappings for new fields in incoming documents. Elasticsearch infers the data type and other properties based on the content of the document.
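
A simplified sketch of the kind of type inference dynamic mapping performs, based on the JSON type of each value. The function names are hypothetical, and the real rules do more: strings can also be detected as dates, and text fields usually get a `keyword` sub-field.

```python
def infer_field_type(value) -> str:
    """Guess an Elasticsearch field type from a JSON value (simplified)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise ValueError(f"unsupported value in this sketch: {value!r}")


def infer_mapping(document: dict) -> dict:
    """Infer a type for every top-level field of a document."""
    return {field: infer_field_type(value) for field, value in document.items()}


inferred = infer_mapping(
    {"name": "Ada", "age": 36, "score": 9.5, "active": True, "address": {"city": "London"}}
)
```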

What is relevance scoring in Elasticsearch?

Relevance scoring is the process of determining the relevance of documents to a given search query. Elasticsearch calculates a relevance score (_score) for each document, by default with the BM25 algorithm, based on factors like term frequency, inverse document frequency, and field length normalization.
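
To show how those three factors combine, here is a sketch of the textbook BM25 score for a single term, with the commonly used defaults k1 = 1.2 and b = 0.75. The function name is hypothetical, and constant factors may differ from what Lucene actually computes.

```python
import math


def bm25_term_score(freq: int, doc_len: int, avg_doc_len: float,
                    num_docs: int, docs_with_term: int,
                    k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook BM25 score of one query term for one document."""
    # Inverse document frequency: rare terms weigh more.
    idf = math.log(1 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalization: longer-than-average documents are penalized.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency with saturation: repeated terms help less and less.
    tf = freq * (k1 + 1) / (freq + length_norm)
    return idf * tf


rare = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100, num_docs=1000, docs_with_term=10)
common = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100, num_docs=1000, docs_with_term=500)
```

With these inputs the rare term scores higher than the common one, which is the inverse document frequency effect.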

What is a term query in Elasticsearch?

A term query searches for documents that contain an exact term in a specific field. The query value is not analyzed, so it suits keyword, numeric, and date fields. On an analyzed text field it may fail to match, because the indexed tokens were, for example, lowercased.
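
A small sketch of a term query body, built with a hypothetical helper. The field name `status` is an assumption and is presumed to be an exact-value field.

```python
def term_query(field: str, value) -> dict:
    """Build a search body that matches documents whose field equals value exactly."""
    return {"query": {"term": {field: {"value": value}}}}


# "status" is a hypothetical exact-value field.
body = term_query("status", "published")
```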

What is a match query in Elasticsearch?

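A match query is the standard query for full-text search on a specific field. It analyzes the query string with the field's analyzer and matches documents that contain any of the resulting terms (or all of them, when the operator is set to "and"). To search several fields at once, use a multi_match query.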
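
A sketch of request bodies for a single-field match query and a multi-field `multi_match` query. The helper names and the field names `title` and `summary` are assumptions for illustration.

```python
def match_query(field: str, text: str, operator: str = "or") -> dict:
    """Full-text query on one field; operator "and" requires every term to match."""
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


def multi_match_query(fields: list[str], text: str) -> dict:
    """Full-text query run against several fields at once."""
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


single_field = match_query("title", "paging and sorting", operator="and")
several_fields = multi_match_query(["title", "summary"], "paging and sorting")
```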

What is a filter in Elasticsearch?

A filter in Elasticsearch is used to narrow down search results based on specific criteria. It does not affect the relevance score of documents but only determines whether a document matches the filter conditions.
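
As a sketch, a bool query can put scoring clauses under `must` and yes/no conditions under `filter`. The field names below are hypothetical.

```python
# Hypothetical fields: "title" (text), "status" (keyword), "published_at" (date).
filtered_search = {
    "query": {
        "bool": {
            # Contributes to the relevance score.
            "must": [{"match": {"title": "pagination"}}],
            # Only includes or excludes documents; no effect on the score.
            "filter": [
                {"term": {"status": "published"}},
                {"range": {"published_at": {"gte": "2023-01-01"}}},
            ],
        }
    }
}

filter_clauses = filtered_search["query"]["bool"]["filter"]
```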

What is the role of the inverted index in Elasticsearch?

The inverted index is a data structure used by Elasticsearch to facilitate fast full-text search. It maps terms to the documents that contain them, enabling efficient retrieval of documents containing specific terms.
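
The idea can be sketched in a few lines: map each term to the set of document IDs that contain it, then answer a multi-term lookup with a set intersection. This toy version splits on whitespace only; real inverted indexes also store positions and frequencies.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map every lowercased term to the IDs of the documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


docs = {
    1: "fast distributed search",
    2: "distributed analytics engine",
    3: "fast analytics",
}
index = build_inverted_index(docs)

# Documents containing both "fast" and "analytics".
both = index["fast"] & index["analytics"]
```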

What are OFFSET and LIMIT in SQL queries used for?

OFFSET and LIMIT are used for pagination in SQL queries. OFFSET specifies the number of rows to skip before starting to return rows, while LIMIT specifies the maximum number of rows to return.
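
A runnable sketch using Python's built-in `sqlite3` module and a hypothetical `items` table of 25 rows. Page numbers start at 1, so the offset is `(page - 1) * page_size`.

```python
import sqlite3

# In-memory database with a hypothetical table of 25 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 26)],
)


def fetch_page(page: int, page_size: int) -> list[tuple]:
    """Return one page of rows, ordered by id."""
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


second_page = fetch_page(2, 10)  # rows with id 11 through 20
```

The ORDER BY clause matters here: without a stable sort order, the same row could appear on two pages or on none.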

What is Elasticsearch?

Elasticsearch is a distributed, RESTful search and analytics engine designed for horizontal scalability, reliability, and near real-time search capabilities. It is commonly used for log analytics, full-text search, and monitoring applications.

What is the purpose of OFFSET and LIMIT in Elasticsearch queries?

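Elasticsearch's search API does not use OFFSET and LIMIT keywords. The equivalent request parameters are "from", which specifies the number of results to skip before starting to return documents, and "size", which specifies the maximum number of documents to return. By default, from + size cannot exceed 10,000 (the index.max_result_window setting); deeper paging typically uses search_after instead.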
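
In the Elasticsearch search API, the offset and limit roles are played by the `from` and `size` request parameters. The sketch below builds such a request body and rejects pages beyond the default `index.max_result_window` of 10,000; the helper name `page_request` is an assumption.

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch's default index.max_result_window


def page_request(page: int, page_size: int, query: dict) -> dict:
    """Build a search body for a 1-based page number using from/size."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    if start + page_size > MAX_RESULT_WINDOW:
        raise ValueError("page is too deep for from/size; consider search_after")
    return {"from": start, "size": page_size, "query": query}


third_page = page_request(3, 20, {"match_all": {}})
# third_page == {"from": 40, "size": 20, "query": {"match_all": {}}}
```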

What is sharding in Elasticsearch?

Sharding in Elasticsearch is the process of dividing an index into multiple smaller pieces called shards, each of which is a self-contained Lucene index. This allows data to be distributed across multiple nodes in a cluster for scalability and performance.

What is a mapping in Elasticsearch?

A mapping in Elasticsearch defines the data structure and properties of documents within an index, including field types, analyzers, and other settings. It helps Elasticsearch understand how to index and search the data efficiently.

What is pagination?

Pagination is the process of dividing a large set of data into smaller, manageable subsets or pages to improve performance and user experience when displaying data in an application.

What are the common components of pagination?

Common components of pagination include:
Page size or limit: The maximum number of items displayed per page.
Page number or offset: The position of the current page within the dataset, given either as a page index or as the number of items to skip.
Total count: The total number of items in the dataset, often used for calculating the total number of pages.
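
The arithmetic that ties these components together can be sketched as follows. The function name `page_info` and the 1-based page numbering are assumptions.

```python
import math


def page_info(page: int, page_size: int, total_count: int) -> dict:
    """Derive offset, page count, and next-page flag for a 1-based page number."""
    total_pages = math.ceil(total_count / page_size)
    return {
        "offset": (page - 1) * page_size,  # items to skip
        "limit": page_size,                # items per page
        "total_pages": total_pages,
        "has_next": page < total_pages,
    }


info = page_info(page=3, page_size=10, total_count=95)
# info == {"offset": 20, "limit": 10, "total_pages": 10, "has_next": True}
```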

What is the purpose of pagination in web applications?

Pagination is used in web applications to efficiently manage and display large datasets by breaking them into smaller chunks or pages. It improves user experience by reducing load times and minimizing the amount of data transferred over the network.

What is OFFSET in pagination?

OFFSET is a parameter used in pagination queries to specify the number of items to skip from the beginning of the dataset before returning results. It is typically used in conjunction with the LIMIT parameter to retrieve a specific subset of data.

What is LIMIT in pagination?

LIMIT is a parameter used in pagination queries to specify the maximum number of items to return in a single page or subset of data. It determines the size of each page displayed to the user.

What are the benefits of using pagination?

The benefits of pagination include:
Improved performance: By dividing large datasets into smaller pages, pagination reduces load times and resource consumption.
Enhanced user experience: Users can navigate through data more easily and efficiently, especially when dealing with large datasets.
Reduced network traffic: Displaying only a subset of data at a time minimizes the amount of data transferred over the network, improving application responsiveness.

How is pagination typically implemented in database queries?

Pagination in database queries is typically implemented using the LIMIT and OFFSET clauses. LIMIT specifies the maximum number of rows to return, while OFFSET specifies the number of rows to skip before returning results, effectively determining the current page.

What are some alternative pagination techniques?

Some alternative pagination techniques include:
Keyset pagination: Using unique identifiers or keys to paginate through results, often used for navigating sorted or filtered datasets.
Cursor-based pagination: Using opaque cursors or tokens to paginate through results, ensuring stability and consistency across paginated requests.
Infinite scrolling: Dynamically loading more data as the user scrolls down the page, providing a seamless browsing experience without traditional page navigation.
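
A minimal sketch of keyset pagination using Python's built-in `sqlite3` module and a hypothetical `items` table. Instead of skipping rows with an offset, each request asks for rows whose key is greater than the last key already seen.

```python
import sqlite3

# In-memory database with a hypothetical table of 25 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 26)],
)


def fetch_after(last_seen_id: int, page_size: int) -> list[tuple]:
    """Return the next page of rows whose id is greater than last_seen_id."""
    return conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()


first_page = fetch_after(0, 10)
# The id of the last row on a page is the key for the next request.
second_page = fetch_after(first_page[-1][0], 10)
```

Elasticsearch's `search_after` parameter follows the same idea, using the sort values of the last hit as the key.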

(33 cards)