Basics Flashcards

1
Q

Document

A

A JSON object stored in Elasticsearch. It is the basic unit of data that can be indexed.

2
Q

Index

A

Collection of Documents

3
Q

Shard

A

A shard is one slice of an index. Every index is split into shards, each holding a subset of the index's documents, and the shards are distributed across nodes.

4
Q

Node

A

An instance of Elasticsearch. Multiple nodes can run on one physical machine.

5
Q

Primary Shard & Replica Shard

A

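A primary shard holds the original copy of its slice of the index, and writes go to it first. A replica shard is a copy of a primary shard, used for failover and to serve extra read traffic.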

6
Q

Cluster

A

A collection of nodes that together hold the indices, usually set up for a specific purpose such as e-commerce search or APM. Cross-cluster searches are possible.

7
Q

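Routing Formula and No of Primary Shards

A

shard_number = hash(routing value) % number of primary shards
The routing value defaults to the document ID. The number of primary shards of an index cannot be changed after creation (short of reindexing, splitting or shrinking the index), as it would affect the routing formula. The number of replicas can be changed at any time.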
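
The routing rule can be sketched in a few lines of Python. This is only an illustration: `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for the Murmur3-based hash that Elasticsearch uses internally. It shows why the primary shard count has to stay fixed: the same document ID can map to a different shard once the divisor changes.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (the document ID by default) to a primary shard."""
    # crc32 is an assumption made for this sketch, not the real hash function.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


doc_ids = ["1567", "1568", "1569"]

# Placement with 5 primary shards, then with 6: entries may differ,
# which is why existing documents could no longer be found after a change.
placement_5 = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
placement_6 = {doc_id: pick_shard(doc_id, 6) for doc_id in doc_ids}

print(placement_5)
print(placement_6)
```

Replicas do not appear in the calculation at all, so adding or removing them never moves a document to a different shard.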

8
Q

Add documents

A

POST /index_name/_doc/
{
  "field1": "value1",
  "field2": "value2"
}

9
Q

Delete documents

A

DELETE /index_name/_doc/document_id

10
Q

Update documents

A

POST /index_name/_update/document_id
{
  "doc": {
    "field1": "new_value"
  }
}

11
Q

Create Index

A

PUT /index_name
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" },
      "field2": { "type": "keyword" }
    }
  }
}

12
Q

Delete Index

A

DELETE /index_name

13
Q

Optimistic Concurrency Control

A

To ensure an older version of a document doesn’t overwrite a newer version, every operation performed to a document is assigned a sequence number by the primary shard that coordinates that change. The sequence number is increased with each operation and thus newer operations are guaranteed to have a higher sequence number than older operations.

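As an application developer you need to pass the primary term and sequence number during the update, to make sure that you are not updating an older copy of the document.

First do a GET and read the _seq_no and _primary_term fields from the response. Then do a PUT that passes them as if_seq_no and if_primary_term to update the contents. The request is rejected with a version conflict if the document has changed in the meantime.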

PUT products/_doc/1567?if_seq_no=362&if_primary_term=2
{
  "product": "r2d2",
  "details": "A resourceful astromech droid",
  "tags": [ "droid" ]
}
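
The check itself can be imitated with a small in-memory store. This is a sketch of the idea only, not Elasticsearch client code: `TinyStore` and `VersionConflict` are hypothetical names, and the primary term is kept constant for simplicity.

```python
class VersionConflict(Exception):
    """Raised when the caller's sequence number or primary term is stale."""


class TinyStore:
    """Hypothetical in-memory store that mimics if_seq_no / if_primary_term."""

    def __init__(self):
        self.primary_term = 1   # fixed in this sketch
        self.next_seq_no = 0    # incremented on every write
        self.docs = {}          # doc_id -> (source, seq_no, primary_term)

    def get(self, doc_id):
        source, seq_no, term = self.docs[doc_id]
        return {"_source": dict(source), "_seq_no": seq_no, "_primary_term": term}

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        if if_seq_no is not None:
            current = self.docs.get(doc_id)
            # Reject the write unless the caller saw the latest version.
            if current is None or (current[1], current[2]) != (if_seq_no, if_primary_term):
                raise VersionConflict(doc_id)
        seq_no = self.next_seq_no
        self.next_seq_no += 1
        self.docs[doc_id] = (dict(source), seq_no, self.primary_term)
        return seq_no


store = TinyStore()
store.index("1567", {"product": "r2d2"})
seen = store.get("1567")  # read _seq_no and _primary_term first

# First conditional write succeeds because nothing changed since the read.
store.index("1567", {"product": "r2d2", "tags": ["droid"]},
            if_seq_no=seen["_seq_no"], if_primary_term=seen["_primary_term"])

# Reusing the same, now stale, values is rejected.
try:
    store.index("1567", {"product": "c3po"},
                if_seq_no=seen["_seq_no"], if_primary_term=seen["_primary_term"])
    stale_rejected = False
except VersionConflict:
    stale_rejected = True

print(stale_rejected)
```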

14
Q

Inverted Index

A

Mapping from each term to the documents that contain it
Example:
Document 1 :
Space : The final frontier. These are the voyages..

Document 2 :
He’s bad, he’s the number one. He’s the space cowboy with the laser gun!

space: 1,2
the: 1,2
final: 1
frontier: 1
he: 2
bad: 2
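
A minimal sketch of building such a mapping in Python, using the two example documents. The tokenizer here is a simple assumption (lowercase, split on anything that is not a letter or digit) and is not one of Elasticsearch's analyzers, so "He's" ends up as the two tokens "he" and "s".

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Return {term: sorted list of IDs of the documents containing it}."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Naive tokenizer: lowercase runs of letters and digits.
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return {term: sorted(doc_ids) for term, doc_ids in index.items()}


documents = {
    1: "Space : The final frontier. These are the voyages..",
    2: "He's bad, he's the number one. He's the space cowboy with the laser gun!",
}

inverted = build_inverted_index(documents)
for term in ["space", "the", "final", "frontier", "he", "bad"]:
    print(term, inverted[term])
```

A search for a term then becomes a dictionary lookup instead of a scan over every document.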

15
Q

TF-IDF

A

Term Frequency * Inverse Document Frequency

16
Q

Term Frequency

A

How often a term appears in a given document.

17
Q

Document Frequency

A

The number of documents, out of all documents, in which a term appears.

Important point.
The word ‘the’ is likely to have a high document frequency, because it appears in almost every document.

The word ‘space’ is likely to have a low document frequency.

18
Q

Term Frequency/ Document Frequency or Relevancy

A

Measures the relevance of a term in a document: the score is high when the term is frequent in that document but rare across the other documents.
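
As a rough worked example, the score can be computed by hand in Python. This sketch uses the plain textbook form, term frequency times log(N / document frequency). The third document and the helper names are assumptions added for illustration, and Elasticsearch's default scoring (BM25) is a refinement of this idea rather than this exact formula.

```python
import math
import re


def tokenize(text):
    """Naive tokenizer: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(term, doc_id, documents):
    """Term frequency in one document times inverse document frequency."""
    tokens = {d: tokenize(text) for d, text in documents.items()}
    term_frequency = tokens[doc_id].count(term)
    document_frequency = sum(1 for words in tokens.values() if term in words)
    if document_frequency == 0:
        return 0.0
    inverse_document_frequency = math.log(len(documents) / document_frequency)
    return term_frequency * inverse_document_frequency


documents = {
    1: "Space : The final frontier. These are the voyages..",
    2: "He's bad, he's the number one. He's the space cowboy with the laser gun!",
    3: "The quick brown fox jumps over the lazy dog.",  # hypothetical extra document
}

score_the = tf_idf("the", 2, documents)      # frequent here, but present in every document
score_space = tf_idf("space", 2, documents)  # rarer across the corpus

print(score_the, score_space)
```

Although ‘the’ occurs more often than ‘space’ in document 2, it appears in every document, so its inverse document frequency is zero and ‘space’ ends up as the more relevant term.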
