Optimistic Concurrency Control in Elasticsearch Flashcards
Optimistic concurrency control (OCC)
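Elasticsearch uses optimistic concurrency control to make sure that a write based on stale data does not silently overwrite a newer change when multiple processes update the same document at the same time. It takes no locks. Instead, it tracks each document's sequence number and primary term (alongside its version number) and rejects a conditional write whose expected values no longer match.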
Working
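When a document is indexed or updated, Elasticsearch assigns it a version number (_version). This version number increments with each change to the document, so it records how many times the document has been modified. For conditional writes, current Elasticsearch releases compare the sequence number and primary term rather than the internal version number.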
Sequence Number and Primary Term
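Sequence Number (_seq_no): A counter maintained per shard. Every write operation on the shard (index, update, delete) is assigned the next sequence number, so a document's _seq_no changes each time the document changes, but not necessarily by exactly one, because writes to other documents on the same shard also consume sequence numbers.
Primary Term (_primary_term): This identifies which primary has been in charge of the shard. It increments each time a new primary is assigned, for example when a replica is promoted after the old primary fails. This lets Elasticsearch distinguish operations issued by an old primary from those issued by the current one.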
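The relationship between the two values can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: ToyShard, index, and promote_new_primary are hypothetical names made up for this card, not Elasticsearch code, and a real shard does far more.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Toy stand-in for one primary shard (hypothetical, not Elasticsearch code)."""
    primary_term: int = 1
    next_seq_no: int = 0
    docs: dict = field(default_factory=dict)

    def index(self, doc_id, source):
        # Every write on the shard consumes the next shard-wide sequence number.
        meta = {
            "_seq_no": self.next_seq_no,
            "_primary_term": self.primary_term,
            "_source": source,
        }
        self.next_seq_no += 1
        self.docs[doc_id] = meta
        return meta

    def promote_new_primary(self):
        # A new primary taking over the shard starts a new primary term.
        self.primary_term += 1


shard = ToyShard()
a = shard.index("1", {"title": "A"})      # first write on the shard
b = shard.index("2", {"title": "B"})      # a different document, same shard
a2 = shard.index("1", {"title": "A v2"})  # document 1 again
shard.promote_new_primary()
a3 = shard.index("1", {"title": "A v3"})  # written under the new primary
```

In this model document 1 moves from sequence number 0 to 2 to 3, because the write to document 2 took number 1, and its primary term only changes after the simulated promotion.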
Example
PUT /my_index/_doc/1?if_seq_no=2&if_primary_term=1
{
  "title": "Another Updated Title"
}
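Elasticsearch compares the supplied if_seq_no and if_primary_term with the document's current _seq_no and _primary_term:
If both values match the current state, it applies the write and the document receives a new _seq_no.
If either value differs, the document was changed after it was read, so Elasticsearch rejects the write with a 409 Conflict error (version_conflict_engine_exception).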
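The check described above amounts to a compare-and-set. The following is a minimal sketch of that logic in plain Python, assuming a document is represented as a dict. The names conditional_write and VersionConflict are hypothetical and only stand in for the server-side check and the 409 response.

```python
class VersionConflict(Exception):
    """Stands in for the 409 Conflict response in this toy model."""


def conditional_write(stored, new_source, if_seq_no, if_primary_term, next_seq_no):
    """Return the new document state, or raise if the expected values are stale."""
    expected = (if_seq_no, if_primary_term)
    actual = (stored["_seq_no"], stored["_primary_term"])
    if expected != actual:
        raise VersionConflict(f"expected {expected}, current state is {actual}")
    # The write is accepted and gets a fresh sequence number.
    return {
        "_seq_no": next_seq_no,
        "_primary_term": stored["_primary_term"],
        "_source": new_source,
    }


# Current state matching the request above: if_seq_no=2, if_primary_term=1.
stored = {"_seq_no": 2, "_primary_term": 1, "_source": {"title": "Updated Title"}}

# The first writer read seq_no 2 and succeeds.
updated = conditional_write(stored, {"title": "Another Updated Title"}, 2, 1, next_seq_no=3)

# A second writer that also read seq_no 2 is now stale and is rejected.
try:
    conditional_write(updated, {"title": "Late Title"}, 2, 1, next_seq_no=4)
    conflict = False
except VersionConflict:
    conflict = True
```

The second call fails because the document has already moved past sequence number 2, which is the situation the 409 response reports.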
On Conflict
Retry the operation: Read the latest document state and reapply the update.
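Use the retry_on_conflict parameter of the Update API to have Elasticsearch automatically retry the update a specified number of times. Each retry fetches the latest version of the document and reapplies the change. This can be useful in high-concurrency scenarios where the update does not depend on a value the client read earlier.
POST /my_index/_update/1?retry_on_conflict=3
{
  "doc": {
    "title": "Updated Title After Retry"
  }
}
The retry_on_conflict=3 parameter tells Elasticsearch to retry the update up to three times if a version conflict occurs before returning the conflict to the client.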
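The first option above, reading the latest state and reapplying the update on the client side, can be sketched as a retry loop. This is a toy in-memory model and not a real client: ToyStore, update_with_retry, and the before_write hook (used here only to simulate a competing writer) are all hypothetical names.

```python
class VersionConflict(Exception):
    """Stands in for the 409 Conflict response in this toy model."""


class ToyStore:
    """Holds a single document plus its _seq_no and _primary_term."""

    def __init__(self):
        self.seq_no = 0
        self.primary_term = 1
        self.source = {"title": "Original", "views": 0}

    def get(self):
        return {"_seq_no": self.seq_no, "_primary_term": self.primary_term,
                "_source": dict(self.source)}

    def put(self, source, if_seq_no, if_primary_term):
        # Reject the write if the caller's expected values are stale.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict("stale _seq_no or _primary_term")
        self.seq_no += 1
        self.source = dict(source)


def update_with_retry(store, change, max_retries=3, before_write=None):
    """Read, modify, conditionally write; re-read and retry on conflict."""
    for attempt in range(max_retries + 1):
        current = store.get()                   # always start from the latest state
        new_source = change(current["_source"])
        if before_write is not None:
            before_write(attempt)               # simulation hook only
        try:
            store.put(new_source, current["_seq_no"], current["_primary_term"])
            return attempt                      # number of retries that were needed
        except VersionConflict:
            continue
    raise VersionConflict(f"gave up after {max_retries} retries")


store = ToyStore()


def competing_writer(attempt):
    # Another process slips in a write just before our first attempt lands.
    if attempt == 0:
        store.put({"title": "Original", "views": 5}, store.seq_no, store.primary_term)


attempts_used = update_with_retry(
    store,
    lambda src: {**src, "views": src["views"] + 1},
    before_write=competing_writer,
)
```

Because the change is recomputed from the freshly read document on every attempt, the increment is applied on top of the competing write instead of overwriting it.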