Aggregations Flashcards
How do you count unique values in Elasticsearch?
GET <index>/_search { "size": 0, "aggs": { "<aggregation name>": { "cardinality": { "field": "<field name>" } } } }
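To make clear what the cardinality aggregation returns, here is a minimal sketch that computes an exact distinct count over a few in-memory documents. The sample documents and the count_unique helper are hypothetical and only illustrate the idea. Elasticsearch itself computes an approximate count (based on HyperLogLog++), so on large, high-cardinality fields its result may differ slightly from an exact count like this one.

```python
def count_unique(docs, field):
    """Exact distinct count of `field` across docs, skipping docs that lack it."""
    return len({doc[field] for doc in docs if field in doc})


# Hypothetical sample documents, for illustration only.
docs = [
    {"user": "alice"},
    {"user": "bob"},
    {"user": "alice"},
    {"status": "ok"},  # no "user" field, so it is ignored
]

unique_users = count_unique(docs, "user")
print(unique_users)
```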
Why should you aggregate on keyword fields rather than analysed text fields?
Because analysed fields are broken down into tokens, and aggregating over those tokens would be too expensive.
If you try, you get an error saying that the operation would require uninverting the inverted index, which can use a lot of memory.
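A small sketch of why tokens make poor bucket keys, assuming a very rough stand-in for an analyzer (lowercase, then split on whitespace; real analyzers do more). The simple_analyze helper and the city values are hypothetical. Bucketing on whole values keeps "New York" together, while bucketing on tokens produces keys such as "new" that no longer correspond to any original value.

```python
from collections import Counter


def simple_analyze(text):
    """Rough stand-in for a text analyzer: lowercase and split on whitespace."""
    return text.lower().split()


# Hypothetical field values, one per document.
cities = ["New York", "New Orleans", "New York"]

# Keyword-style: each whole value is a single term.
keyword_buckets = Counter(cities)

# Text-style: each value is split into tokens; a token counts once per document.
text_buckets = Counter(
    token for city in cities for token in set(simple_analyze(city))
)

print(keyword_buckets)
print(text_buckets)
```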
How do you remove the documents from the output of an aggregation query if you're only interested in the counts, not the documents?
Set "size": 0 in the request body.
How do you do a sum aggregation?
GET <index>/_search { "size": 0, "aggs": { "<aggregation name>": { "sum": { "field": "<field name>" } } } }
How do you do an average aggregation?
GET <index>/_search { "size": 0, "aggs": { "<aggregation name>": { "avg": { "field": "<field name>" } } } }
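The sum and avg metrics can be pictured with a short sketch over in-memory documents. The sum_agg and avg_agg helpers and the price field are hypothetical. Documents without the field are skipped, and the average is None when no document has the field (a choice made for this sketch).

```python
def sum_agg(docs, field):
    """Sum of `field` over the documents that have it."""
    return sum(doc[field] for doc in docs if field in doc)


def avg_agg(docs, field):
    """Mean of `field` over the documents that have it, or None if there are none."""
    values = [doc[field] for doc in docs if field in doc]
    return sum(values) / len(values) if values else None


# Hypothetical sample documents.
docs = [{"price": 10}, {"price": 20}, {"price": 30}, {"name": "no price"}]

total = sum_agg(docs, "price")
average = avg_agg(docs, "price")
print(total, average)
```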
How do you do a terms aggregation and what is it?
It groups documents into buckets, one per distinct value of a field, and counts the documents in each bucket.
GET <index>/_search { "size": 0, "aggs": { "<aggregation name>": { "terms": { "field": "<field name>", "size": 10 } } } }
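As a rough picture of what a terms aggregation computes, the sketch below buckets in-memory documents by field value and keeps the most frequent ones. The terms_agg helper and the sample data are hypothetical. The bucket shape mirrors the key and doc_count entries found in the response buckets, and size plays the same role as in the request above.

```python
from collections import Counter


def terms_agg(docs, field, size=10):
    """Return up to `size` buckets, most frequent value first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [
        {"key": key, "doc_count": count}
        for key, count in counts.most_common(size)
    ]


# Hypothetical sample documents.
docs = [
    {"status": "ok"},
    {"status": "ok"},
    {"status": "error"},
    {"status": "ok"},
    {"status": "warn"},
    {"status": "error"},
]

top_two = terms_agg(docs, "status", size=2)
print(top_two)
```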
What is a date histogram aggregation and how do you do it?
It gives you a count of documents per time interval (day, month, etc.).
GET <index>/_search { "size": 0, "aggs": { "<aggregation name>": { "date_histogram": { "field": "<date field name>", "calendar_interval": "day|month|etc" } } } }
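The bucketing behind a date histogram can be sketched by truncating each timestamp to the chosen interval and counting. The date_histogram helper, the timestamp field and the sample values are hypothetical, and only day and month intervals are handled. This sketch returns only intervals that contain at least one document.

```python
from collections import Counter
from datetime import datetime

# Format strings that truncate a timestamp to the start of each interval.
INTERVAL_FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m"}


def date_histogram(docs, field, interval="day"):
    """Count documents per interval, returned as sorted (bucket, count) pairs."""
    fmt = INTERVAL_FORMATS[interval]
    counts = Counter(
        datetime.fromisoformat(doc[field]).strftime(fmt)
        for doc in docs
        if field in doc
    )
    return sorted(counts.items())


# Hypothetical sample documents with ISO 8601 timestamps.
docs = [
    {"timestamp": "2024-01-01T10:00:00"},
    {"timestamp": "2024-01-01T18:30:00"},
    {"timestamp": "2024-01-02T09:00:00"},
    {"timestamp": "2024-02-05T12:00:00"},
]

per_day = date_histogram(docs, "timestamp", "day")
per_month = date_histogram(docs, "timestamp", "month")
print(per_day)
print(per_month)
```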
TODO: Read more about other types of aggregations.
How do you perform nested aggregations?
GET <index>/_search { "size": 0, "aggs": { "<agg name>": { "terms": { "field": "<field name>" }, "aggs": { .... } } } }
How do you sort your nested aggregations?
"aggs": { "<name>": { "terms": { ...., "order": { "<other aggregation>": "desc|asc" } }, "aggs": { "<other aggregation>": .... } } }
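A sketch of a terms aggregation with a nested sum, with buckets ordered by that sum instead of by document count. The terms_with_sum helper, the field names and the "total" key are hypothetical stand-ins for the placeholders above. Note that the "toys" bucket comes first in descending order even though it holds a single document.

```python
from collections import defaultdict


def terms_with_sum(docs, bucket_field, sum_field, order="desc"):
    """Bucket docs by `bucket_field`, sum `sum_field` per bucket, sort by that sum."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for doc in docs:
        key = doc[bucket_field]
        counts[key] += 1
        totals[key] += doc[sum_field]
    buckets = [
        {"key": key, "doc_count": counts[key], "total": totals[key]}
        for key in counts
    ]
    # Order by the nested metric, as "order" does in the request.
    return sorted(buckets, key=lambda b: b["total"], reverse=(order == "desc"))


# Hypothetical sample documents.
docs = [
    {"category": "books", "price": 5},
    {"category": "books", "price": 7},
    {"category": "toys", "price": 20},
    {"category": "games", "price": 3},
    {"category": "games", "price": 4},
]

by_total_desc = terms_with_sum(docs, "category", "price", order="desc")
print([bucket["key"] for bucket in by_total_desc])
```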
What are the 2 different types of pipeline aggregation?
- Sibling aggregation
- Parent aggregation
How do you perform sibling aggregations?
{ "aggs": { "<agg name>": { ... }, "<sibling aggregation name>": { "sum_bucket": { "buckets_path": "path>to>aggregation>...." } } } }
What is a sibling aggregation?
It lets you aggregate the output of another aggregation at the same level. For example, you can bucket the counts with one aggregation and then calculate the total with a sibling aggregation that simply adds up the values of those buckets.
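The example in this card can be sketched in a few lines: one step produces buckets, and a sibling step reads a metric from every bucket and adds the values up, which is what sum_bucket does with the values its buckets_path points at. The sum_bucket helper here, the bucket list and the "sales" metric name are hypothetical.

```python
def sum_bucket(buckets, metric):
    """Add up one metric across all buckets of another aggregation."""
    return sum(bucket[metric] for bucket in buckets)


# Hypothetical output of a bucketing aggregation with a per-bucket metric.
monthly_buckets = [
    {"key": "2024-01", "sales": 100},
    {"key": "2024-02", "sales": 150},
    {"key": "2024-03", "sales": 200},
]

total_sales = sum_bucket(monthly_buckets, "sales")
print(total_sales)
```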
What is a pipeline aggregation?
An aggregation that takes the output of another aggregation as input.
What is the difference between sibling and parent pipeline aggregations?
TODO: Do some more research on this because it wasn’t super clear.