Search - Compound Queries Flashcards

1
Q

Search

A

Use the search API to search and aggregate data stored in Elasticsearch data streams or indices. The API’s query request body parameter accepts queries written in Query DSL.

2
Q

Query DSL

A

The Search API’s query request body parameter accepts queries written in Query DSL, a JSON-based syntax for defining queries.
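
A minimal sketch of what a Query DSL request body looks like when built in code. The helper name build_search_body and the field name title are illustrative assumptions; nothing is sent to a cluster, the body is only serialized to JSON.

```python
import json


def build_search_body(field, text, size=10):
    """Return a search request body holding a single match query.

    Hypothetical helper for illustration; `field` and `text` are whatever
    the caller wants to search for.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {"match": {field: text}},
    }


body = build_search_body("title", "Elasticsearch")
print(json.dumps(body, indent=2))
```

The printed JSON is what would go in the request body of a search call.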

3
Q

Boolean Query

A

A Boolean (bool) query is a compound query that combines multiple query clauses with Boolean logic.

must - The clause must match. Only documents that match every must clause are returned, and each match contributes to the relevance score. It’s similar to an “AND” condition in SQL.

must_not - Documents matching any must_not clause are excluded from the results. These clauses run in filter context, so they do not affect scoring.

should - Optional clauses that increase the relevance score of documents that match them. If the bool query has no must or filter clause, at least one should clause has to match.

filter - Like must, the clause has to match, but it runs in filter context and does not affect the relevance score.
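
The clause rules above can be sketched as a toy model. This is not how Lucene computes scores internally; it only mirrors the include/exclude logic and the idea that matching must and should clauses add to the score. The function name bool_score and the clause labels are assumptions made for the example.

```python
def bool_score(matches, must=(), should=(), must_not=(), filters=()):
    """Toy model of bool clause logic.

    `matches` maps a clause label to the score that clause gave the document;
    a label that is absent means the clause did not match.
    Returns None when the document is excluded, else its combined score.
    """
    if any(name not in matches for name in must):
        return None  # every must clause has to match
    if any(name not in matches for name in filters):
        return None  # every filter clause has to match
    if any(name in matches for name in must_not):
        return None  # any must_not match excludes the document
    matched_should = [name for name in should if name in matches]
    if should and not must and not filters and not matched_should:
        return None  # with only should clauses, at least one has to match
    # Only must and matching should clauses contribute to the score.
    return sum(matches[name] for name in list(must) + matched_should)


doc = {"title": 1.5, "tag_search": 0.5, "recent": 0.0}
score = bool_score(
    doc,
    must=["title"],
    should=["tag_search", "tag_database"],
    must_not=["archived"],
    filters=["recent"],
)
print(score)
```

The filter clause (recent) has to match but adds nothing to the total, while the one matching should clause raises it.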

4
Q

Boolean Example

A

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" } }
      ],
      "must_not": [
        { "match": { "status": "archived" } }
      ],
      "should": [
        { "match": { "tags": "search" } },
        { "match": { "tags": "database" } }
      ],
      "filter": [
        { "range": { "date": { "gte": "2022-01-01" } } }
      ]
    }
  }
}

5
Q

Query Boosting

A

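A boosting query takes two sub-queries, positive and negative, plus a negative_boost value.

Documents must match the positive query. A document that also matches the negative query stays in the results, but its relevance score is reduced: the score is multiplied by negative_boost, a value between 0 and 1.0.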

{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "Elasticsearch"
        }
      },
      "negative": {
        "match": {
          "content": "deprecated features"
        }
      },
      "negative_boost": 0.5
    }
  }
}
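
A minimal sketch of the score arithmetic behind the boosting query, assuming the positive query has already produced a score for the document. The function name boosting_score is made up for the example.

```python
def boosting_score(positive_score, matches_negative, negative_boost=0.5):
    """Score of a document that matched the positive query.

    A document that also matches the negative query is demoted,
    not removed: its score is multiplied by negative_boost.
    """
    if not 0.0 <= negative_boost <= 1.0:
        raise ValueError("negative_boost must be between 0 and 1.0")
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


plain = boosting_score(2.0, matches_negative=False)
demoted = boosting_score(2.0, matches_negative=True)
print(plain, demoted)
```

With negative_boost set to 0.5, a document mentioning "deprecated features" keeps half of its original score and drops in the ranking without disappearing.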

6
Q

Constant Score Query

A

In a constant score query:

All matching documents receive the same relevance score, equal to the boost parameter (1.0 by default).
It wraps a filter query, which runs in filter context, so no relevance score is computed for the wrapped query.
The primary use is to simplify relevance handling by making all matches “equal” in the results.

7
Q

Constant Score Example

A

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "status": "published" } },
            { "term": { "category": "technology" } }
          ]
        }
      },
      "boost": 2.0
    }
  }
}

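The constant_score query wraps a filter that specifies the conditions: status must be published and category must be technology.

The boost is set to 2.0, so every document that matches these conditions receives a relevance score of exactly 2.0. On its own this leaves all hits tied; the boost makes a difference when the query is combined with other queries, for example inside a bool query, where it sets how much these matches weigh against the other clauses.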
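
A small sketch of the constant-score behavior, under the assumption that each document is a plain dict and the filter is a Python predicate. The names constant_score and is_published_tech are illustrative only.

```python
def constant_score(doc, matches_filter, boost=1.0):
    """Return `boost` for any document that passes the filter, else None."""
    return boost if matches_filter(doc) else None


def is_published_tech(doc):
    # Stand-in for the two term clauses in the example above.
    return doc.get("status") == "published" and doc.get("category") == "technology"


docs = [
    {"id": 1, "status": "published", "category": "technology"},
    {"id": 2, "status": "published", "category": "sports"},
    {"id": 3, "status": "published", "category": "technology"},
]
scores = {d["id"]: constant_score(d, is_published_tech, boost=2.0) for d in docs}
print(scores)
```

Documents 1 and 3 end up with the same score no matter how their content differs; document 2 is not a hit.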

8
Q

Relevance Score

A

The relevance score (_score) indicates how well a document matches the query and determines the document’s rank in the search results. By default, results are sorted by descending score, so documents with higher relevance scores appear at the top. Elasticsearch calculates relevance scores with the BM25 (Best Matching 25) algorithm by default.

9
Q

Factors influencing Relevancy

A

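Term Frequency (TF) - The more often the search term appears in a field, the higher the score.
Inverse Document Frequency (IDF) - The rarer the term is across all documents in the index, the more weight a match carries.
Field Length Normalization - A match in a short field counts for more than the same match in a long field. For example, a search term found in a title is typically scored higher than the same term found in a lengthy content field.
Boosting - Elasticsearch allows you to manually boost certain fields or terms to increase their influence on the relevance score.

Relevance scores are relative: they are only meaningful for comparing documents within the same query and don’t represent an absolute measure of relevance.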
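
The factors above can be seen in a textbook BM25-style formula for a single term. This is a sketch under assumptions: it uses the classic form with k1 = 1.2 and b = 0.75, and the exact constants in the Lucene implementation may differ, so the numbers will not match real _score values. The function name bm25_term_score is made up for the example.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, docs_with_term,
                    k1=1.2, b=0.75, boost=1.0):
    """Classic BM25 score of one term in one field of one document."""
    # IDF: rarer terms (small docs_with_term) get a larger weight.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalization: fields longer than average are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # TF saturates: each extra occurrence adds less than the previous one.
    return boost * idf * tf * (k1 + 1) / (tf + norm)


base = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, docs_with_term=50)
more_tf = bm25_term_score(tf=3, doc_len=100, avg_doc_len=100, n_docs=1000, docs_with_term=50)
rarer = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, docs_with_term=5)
shorter = bm25_term_score(tf=1, doc_len=10, avg_doc_len=100, n_docs=1000, docs_with_term=50)
boosted = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000,
                          docs_with_term=50, boost=2.0)
print(base, more_tf, rarer, shorter, boosted)
```

Each variant changes one factor and scores higher than the baseline: more occurrences, a rarer term, a shorter field, or an explicit boost.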

10
Q

Disjunction Max Query (often referred to as DisMax query)

A

The Disjunction Max Query (dis_max, often referred to as DisMax) runs multiple queries and returns documents that match at least one of them. Each sub-query assigns its own score, and dis_max takes the highest of those scores as the document’s relevance score rather than summing the scores of the individual queries.
DisMax is particularly useful in scenarios where:
A strong match in one field should dominate: for instance, a match in the title field should outweigh partial matches in description.

Weak matches add noise: unlike a bool query, which adds up the scores of its matching clauses, dis_max ignores the weaker matches by default (tie_breaker defaults to 0.0). Note that a multi_match query of the default best_fields type scores the same way as dis_max, while the most_fields type combines the scores from all matching fields.

11
Q

DisMax Example

A

{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "Elasticsearch tutorial" } },
        { "match": { "description": "Elasticsearch tutorial" } }
      ],
      "tie_breaker": 0.3
    }
  }
}

The tie_breaker is set to 0.3, meaning that the scores of the other matching sub-queries are multiplied by 0.3 and added to the highest score. This slightly boosts documents that also match in other fields, while the best-matching field still dominates.
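
The tie_breaker arithmetic can be sketched in a few lines, assuming the per-field scores have already been computed. The function name dis_max_score and the sample scores are invented for the example.

```python
def dis_max_score(scores, tie_breaker=0.0):
    """Combine sub-query scores the way dis_max does.

    `scores` holds the score of every sub-query that matched the document.
    Returns None when nothing matched.
    """
    if not scores:
        return None
    best = max(scores)
    others = sum(scores) - best
    return best + tie_breaker * others


title_score, description_score = 2.0, 1.0
strict = dis_max_score([title_score, description_score])
with_tie = dis_max_score([title_score, description_score], tie_breaker=0.3)
summed = dis_max_score([title_score, description_score], tie_breaker=1.0)
print(strict, with_tie, summed)
```

With tie_breaker at 0.0 only the best field counts, at 0.3 the weaker field contributes 30% of its score, and at 1.0 the result is the plain sum of all matching sub-queries.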

12
Q

Function Score

A

This query type is especially useful when you want to influence ranking based on additional factors, such as field values, document age, or geographic location, rather than relying solely on standard text-matching relevance. The function score query applies scoring functions to documents that match a specified query. These functions modify the document’s original score, creating a final score that accounts for the additional criteria you define.

13
Q

Function Score Example

A

{
  "query": {
    "function_score": {
      "query": {
        "match": { "content": "Elasticsearch tutorial" }
      },
      "functions": [
        {
          "gauss": {
            "publish_date": {
              "origin": "now",
              "scale": "30d",
              "offset": "7d",
              "decay": 0.5
            }
          }
        },
        {
          "field_value_factor": {
            "field": "popularity_score",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        }
      ],
      "score_mode": "sum",
      "boost_mode": "multiply"
    }
  }
}
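
A rough sketch of how the numbers in this example combine, assuming document age is given in days and the query score is already known. The function names are illustrative, and only the sqrt and none modifiers are modeled.

```python
import math


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 within `offset`, `decay` at offset + scale."""
    sigma_sq = -scale ** 2 / (2 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-effective ** 2 / (2 * sigma_sq))


def field_value_factor(value, factor=1.0, modifier="none"):
    """Scale a numeric field value, optionally taking the square root."""
    scaled = factor * value
    if modifier == "sqrt":
        return math.sqrt(scaled)
    if modifier == "none":
        return scaled
    raise ValueError("modifier not covered by this sketch: " + modifier)


def function_score(query_score, age_days, popularity):
    functions = [
        gauss_decay(age_days, scale=30, offset=7, decay=0.5),
        field_value_factor(popularity, factor=1.5, modifier="sqrt"),
    ]
    combined = sum(functions)      # score_mode: sum
    return query_score * combined  # boost_mode: multiply


fresh = function_score(query_score=2.0, age_days=3, popularity=6)
old = function_score(query_score=2.0, age_days=37, popularity=6)
print(fresh, old)
```

A document published within the 7-day offset keeps the full decay value of 1.0, while one that is 37 days old (offset plus scale) gets 0.5, so the same query score and popularity produce a lower final score.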
