Indexing API's Flashcards by Giorgenes G

What is the use case for pipelines?

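A pipeline is a saved, named sequence of processing steps (processors) that is stored in the cluster and can be reused across different API calls to transform documents before they are indexed.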

How do you create a pipeline?

PUT _ingest/pipeline/<pipeline name>
{
   "description": ".....",
   "processors": [
   ]
}

What are some processors available?

TODO: Check on elastic website and make a list of important ones.

remove: { "field": "...." }
set: { "field": "_source.<field>", "value": "{{ _source….. }}" }  # mustache notation
convert: { "field": …, "type": …. }  # type cast
script: { ….. }  # more generic processor
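
To make the processor list concrete, here is a minimal Python sketch that assembles a pipeline body from the four processors listed above and serializes it to the JSON that would be sent with PUT _ingest/pipeline/<pipeline name>. The field names (temp_field, full_name, first_name, last_name, age) and the script are hypothetical, and nothing is sent to a cluster.

```python
import json

# Hypothetical pipeline: every field name here is made up for illustration.
pipeline = {
    "description": "example pipeline (illustrative only)",
    "processors": [
        {"remove": {"field": "temp_field"}},
        {"set": {"field": "full_name", "value": "{{first_name}} {{last_name}}"}},
        {"convert": {"field": "age", "type": "integer"}},
        {"script": {"lang": "painless", "source": "ctx.age_next_year = ctx.age + 1"}},
    ],
}

# Each processor is an object with a single key: the processor type.
processor_names = [next(iter(p)) for p in pipeline["processors"]]

# JSON body for the PUT request.
body = json.dumps(pipeline, indent=2)
print(body)
```

Processors run in the order they appear in the list, so the convert step runs before the script that reads the converted value.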

What’s the gotcha between using scripts in the ingest pipeline and outside of it?

Ingest pipeline scripts use "ctx.<field>", while scripts outside a pipeline (for example in update or re-index requests) need "ctx._source.<field>".

TODO: is this still the case in latest version?

What’s the requirement to use a pipeline?

At least one node in the cluster must be running with the ingest role.

What are the use cases for the re-index API?

- Copy data across clusters
- Re-process and modify data into a new index

What’s the API structure for the re-index API?

POST _reindex
{
   "source": { "index": ... },
   "dest": { "index": "<new index>" },
   "script": { .... }
}

How do you enable remote re-index across nodes?

Whitelist the source host in the destination cluster's config file (elasticsearch.yml):

reindex.remote.whitelist: "<ip>:<port>, …."   # comma-separated list
reindex.ssl.verification_mode: certificate
reindex.ssl.truststore.type: PKCS12
reindex.ssl.keystore.type: PKCS12
reindex.ssl.truststore.path: certs/node-1
reindex.ssl.keystore.path: certs/node-1

$ bin/elasticsearch-keystore add reindex.ssl.truststore.secure_password

$ bin/elasticsearch-keystore add reindex.ssl.keystore.secure_password

How do you do remote re-index (across clusters)?

POST _reindex
{
   "source": {
      "remote": {
         "host": "https://<ip>:<port>",
         "username": "<user>",
         "password": "<password>"
      },
      "index": ....
   },
   "dest": {
      "index": ....
   }
}

How to re-index only a subset of the data?

Add a query section to the source section.

POST _reindex
{
   "source": {
      "query": { .... }
   }
}

How do you mutate the data while copying it?

Add a “script” section:

POST _reindex
{
   "source": { … },
   "dest": { … },
   "script": { …. }
}
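
The re-index cards above all build on the same request body: a source, a dest, and optional remote, query and script sections. The following Python sketch shows one way to assemble that body. The helper name build_reindex_body, the index names, the query and the script are all hypothetical, and the sketch only builds JSON without calling a cluster.

```python
import json

def build_reindex_body(source_index, dest_index, query=None, script=None, remote=None):
    """Assemble a _reindex request body; optional parts are added only when given."""
    source = {"index": source_index}
    if remote is not None:
        source["remote"] = remote   # e.g. {"host": ..., "username": ..., "password": ...}
    if query is not None:
        source["query"] = query     # copy only the documents matching this query
    body = {"source": source, "dest": {"index": dest_index}}
    if script is not None:
        body["script"] = script     # mutate each document while it is copied
    return body

# Hypothetical index names, query and script, for illustration only.
body = build_reindex_body(
    "accounts",
    "accounts_v2",
    query={"term": {"state": "CA"}},
    script={"lang": "painless", "source": "ctx._source.remove('temp_field')"},
)
print(json.dumps(body, indent=2))
```

Note that query and remote live inside "source", while "script" sits at the top level next to "source" and "dest".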

What’s the structure of the update-by-query API?

POST <index name>/_update_by_query
{
   "script": { .... },
   "query": { ... }
}

In what instances would you want to simply increment the version of all the objects in an index?

TODO: this was mentioned in the video, but why?

How do you add multi line scripts?

You can wrap the script in triple quotes (""") to write multi-line scripts from Kibana.
This is not standard JSON; it appears to be a convenience of the Kibana console. (TODO: confirm)

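How would you increase a value by X percent with the _update_by_query API?

"script": {
   "lang": "painless",
   "source": """
      // X is the percentage, e.g. 5 for 5%
      ctx._source.balance += ctx._source.balance * X / 100.0;

      if (ctx._source.transactions == null) {
         // null check example: handle a missing field here
      }
   """
}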

TODO: move this to a painless deck of cards.
TODO: what about concurrent updates?
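
As a rough illustration of the arithmetic, the Python sketch below mimics the update on a plain dict and also builds a request body that passes the percentage through the script's "params" section instead of hard-coding it. The field name balance, the parameter name percent and the helper increase_by_percent are assumptions made for this example.

```python
def increase_by_percent(source, field, percent):
    """Mimic `ctx._source.<field> += ctx._source.<field> * percent / 100` on a dict."""
    source[field] += source[field] * percent / 100
    return source

# Body for POST <index name>/_update_by_query; names are illustrative.
request_body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.balance += ctx._source.balance * params.percent / 100.0",
        "params": {"percent": 5},
    },
    "query": {"match_all": {}},
}

# Local simulation of what the script does to one document's _source.
doc = increase_by_percent({"balance": 200.0}, "balance", 5)
print(doc)
```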

How do you use reindex and update by query with a pipeline?

_update_by_query:

Add the "?pipeline=<pipeline>" query parameter.

_reindex:

{
   "dest": {
      "pipeline": "<pipeline>"
   }
}
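
The difference between the two calls is where the pipeline name goes: in the URL for _update_by_query and in the "dest" section of the body for _reindex. A small Python sketch, with hypothetical helper names and example values, makes this explicit.

```python
from urllib.parse import urlencode

def update_by_query_path(index, pipeline):
    """For _update_by_query the pipeline is a URL query parameter."""
    return f"{index}/_update_by_query?{urlencode({'pipeline': pipeline})}"

def reindex_body_with_pipeline(source_index, dest_index, pipeline):
    """For _reindex the pipeline is set inside the "dest" section of the body."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index, "pipeline": pipeline},
    }

# Hypothetical index and pipeline names.
path = update_by_query_path("accounts", "my-pipeline")
body = reindex_body_with_pipeline("accounts", "accounts_v2", "my-pipeline")
print(path)
print(body)
```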

What are the use cases for dynamic templates?

They let you specify how new fields are mapped when they are added to an index.
You create patterns so you don’t need to specify every new field. For example: field names matching a “text” pattern are mapped as text, names matching an “is” pattern are mapped as boolean, etc. This allows you to rely on naming conventions instead of explicitly specifying everything.

How do you create a dynamic template mapping?

Set it on the index:

PUT <index name>
{
   "mappings": {
      "dynamic_templates": [
         {
            "<template name>": {
               "match_mapping_type": "<type to match>",
               "match": "<filter on field name>",
               "unmatch": "<filter on what NOT to match>",
               "mapping": {
                  … <mapping definition> …
                  "type": "…. type …"
               }
            }
         }
      ]
   }
}
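
A filled-in version of this structure can help when checking the nesting. The Python sketch below builds one such body; the template name booleans_by_prefix, the patterns and the target type are hypothetical values chosen for illustration.

```python
import json

# Hypothetical template: map new string fields named "is_*" as boolean,
# except names ending in "_raw".
index_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "booleans_by_prefix": {
                    "match_mapping_type": "string",
                    "match": "is_*",
                    "unmatch": "*_raw",
                    "mapping": {"type": "boolean"},
                }
            }
        ]
    }
}

# dynamic_templates is a list of single-key objects: {<template name>: {...}}.
template_names = [next(iter(t)) for t in index_body["mappings"]["dynamic_templates"]]
print(json.dumps(index_body, indent=2))
```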

How to filter on field names on dynamic template mappings?

Use the “match” or “unmatch” fields with a wildcard.
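
The interplay of "match" and "unmatch" can be approximated locally. The Python sketch below uses fnmatch as a stand-in for the wildcard matching; this is an assumption, since fnmatch also understands "?" and "[...]", which may not behave the same way in Elasticsearch. The helper name template_applies and the field names are hypothetical.

```python
from fnmatch import fnmatchcase

def template_applies(field_name, match=None, unmatch=None):
    """Approximation: the name must fit `match` and must not fit `unmatch`."""
    if match is not None and not fnmatchcase(field_name, match):
        return False
    if unmatch is not None and fnmatchcase(field_name, unmatch):
        return False
    return True

# Hypothetical field names checked against match="is_*", unmatch="*_raw".
for name in ["is_active", "is_flag_raw", "name"]:
    print(name, template_applies(name, match="is_*", unmatch="*_raw"))
```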

What are the use cases for index templates?

Time series data, such as logs: new indexes are created routinely (for example, one per day), and each new index should follow the same template.

How do you create an index pattern / template?

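PUT _template/<template name>
{
  "aliases": ....,
  "mappings": ....,
  "settings": ....,
  "index_patterns": ["<wildcard>"]
}

(In Elasticsearch 7.8 and later, composable templates use PUT _index_template/<template name>, with aliases, mappings and settings nested under a "template" section.)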

Explain how index templates work

Whenever a new index is created, its name is matched against the index patterns of the existing templates.
If it matches, the index is created using that template.
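
The matching step can be sketched in a few lines of Python. This is only a local approximation: fnmatch stands in for the wildcard matching, the template names and patterns are hypothetical, and the rules Elasticsearch applies when several templates match the same index are not modelled.

```python
from fnmatch import fnmatchcase

# Hypothetical templates: template name -> index_patterns.
templates = {
    "logs": ["logs-*"],
    "metrics": ["metrics-*", "stats-*"],
}

def matching_templates(index_name, templates):
    """Return the names of templates whose index_patterns match the new index name."""
    return [
        name
        for name, patterns in templates.items()
        if any(fnmatchcase(index_name, pattern) for pattern in patterns)
    ]

print(matching_templates("logs-2024.01.01", templates))
print(matching_templates("users", templates))
```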

What are some use cases for aliases?

Create a filter that returns only a subset of the data (like a saved query).
Group data from multiple indexes under one name (same alias, multiple indexes) so they can be queried together. TODO: test this and learn more about it.

How do you create an alias?

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "<index name>",
        "alias": "<alias name>"
      }
    }
  ]
}

How to access an alias?

It behaves the same way as a regular index.

How to remove an alias?

```
POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "<index name>",
        "alias": "<alias name>"
      }
    }
  ]
}
```

How do you create a filtered alias?

```
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "<index name>",
        "alias": "<alias name>",
        "filter": { .... <filter definition> .... }
      }
    }
  ]
}
```
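
The add, remove and filtered-add requests share one shape: a list of actions posted to _aliases. The Python sketch below builds such a body. The helper names, index names, alias name and filter are hypothetical, and nothing is sent to a cluster.

```python
import json

def add_alias_action(index, alias, filter_query=None):
    """Build an "add" action; a filter turns it into a filtered alias."""
    action = {"index": index, "alias": alias}
    if filter_query is not None:
        action["filter"] = filter_query
    return {"add": action}

def remove_alias_action(index, alias):
    """Build a "remove" action."""
    return {"remove": {"index": index, "alias": alias}}

# Hypothetical example: move the alias to a new index, showing only error logs.
body = {
    "actions": [
        remove_alias_action("logs-old", "logs"),
        add_alias_action("logs-new", "logs", {"term": {"level": "error"}}),
    ]
}
print(json.dumps(body, indent=2))
```

Several actions can be listed in a single request, which is why "actions" is an array.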

What are the 3 main components of indexes?

- Aliases
- Mappings
- Settings

How do you list indexes?

GET _cat/indices?v

What happens to fields (mappings) you don't define when you post new data?

The mappings are filled in automatically: Elasticsearch infers a type for each new field. TODO: Can this be disabled? How to control this?

How do you create an index?

```
# empty index
PUT <index name>
```
```
# with options
PUT <index name>
{
}
```

Explain dynamic vs explicit mapping

TODO

Where do the default settings for an index come from?

TODO

How to index an object?

```
# auto-generated id
POST <index name>/_doc
{ ... }
```
```
# with a given id
PUT <index name>/_doc/<id>
{ ... }
```

How to fetch an object?

With metadata:
GET <index name>/_doc/<id>

Without metadata (source only):
GET <index name>/_source/<id>

How are IDs auto-generated for Elasticsearch objects?

A UUID is generated.

What are the 2 types of updates you can perform to an object?

- Doc update
```
POST <index name>/_update/<id>
{
  "doc": {
    "lastname": "new last name"
  }
}
```
- Script update
```
POST <index name>/_update/<id>
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.remove('field')"
  }
}
```

What scripting languages are supported by elastic?

- painless TODO

How to remove fields from an object?

Use the update api with a script. For example: ctx._source.remove('fieldname')

How do you delete an object from the index?

DELETE [INDEX-NAME]/_doc/[ID]

What's the file format for bulk indexing?

NDJSON (newline-delimited JSON, http://ndjson.org/). Lines come in pairs:
First line: metadata { "index": { "_id": "...." } }
Second line: source {"field": ...., "field2": ....}
and so on for each object.
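
The pairing of metadata and source lines is easy to get wrong by hand, so here is a minimal Python sketch that generates the payload. The helper name to_bulk_ndjson and the sample documents are hypothetical. The trailing newline is added on purpose, since the bulk payload has to end with one.

```python
import json

def to_bulk_ndjson(docs):
    """docs: iterable of (doc_id, source) pairs -> NDJSON payload for the bulk API."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # metadata line
        lines.append(json.dumps(source))                      # source line
    return "\n".join(lines) + "\n"  # the payload must end with a newline

# Hypothetical documents.
payload = to_bulk_ndjson([("1", {"field": "a"}), ("2", {"field": "b"})])
print(payload, end="")
```

Each JSON object stays on a single line; pretty-printing them with indentation would break the format.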

How do you bulk index objects?

curl -u <user> -k -H 'Content-Type: application/x-ndjson' -X POST 'https://localhost:9200/[index-name]/_bulk?pretty' --data-binary @file.json > output.json

What formats does the bulk api support?

- NDJSON
- Apparently that's the only one (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html)

Indexing API's Flashcards (43 cards)