Application Engineering Flashcards

1
Q

What are the components of a database server?

A
  • CPU
  • Memory - inserts and updates are written here first
  • Persistent Disk
  • Journal - log of every write operation the database performs

Database writes to memory first and then writes to journal and disk. There is a lag between the write to memory and the write to disk.

2
Q

How can data be lost during an insert or update even though Mongo returns a success message?

A

The server can crash after the data is written to memory but before it is written to disk. It is up to the developer to configure Mongo to wait for persistence to disk before it returns a success message.
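The lag between the memory write and the disk write can be illustrated with a toy model. This is only a sketch in plain Python, not MongoDB code; the class `ToyServer` and its methods are hypothetical names invented for the illustration.

```python
class ToyServer:
    """Toy model of a server that acknowledges writes before they reach disk."""

    def __init__(self):
        self.memory = {}  # volatile: lost on crash
        self.disk = {}    # persistent: survives a crash

    def write(self, key, value):
        # The write lands in memory first and is acknowledged immediately.
        self.memory[key] = value
        return "ok"

    def flush(self):
        # Later, the in-memory state is synced to disk.
        self.disk.update(self.memory)

    def crash_and_restart(self):
        # A crash discards memory; only what reached disk comes back.
        self.memory = dict(self.disk)


server = ToyServer()
server.write("a", 1)
server.flush()
ack = server.write("b", 2)  # acknowledged, but not yet flushed
server.crash_and_restart()  # "b" is gone despite the "ok"
```

The client saw "ok" for both writes, yet only the flushed one survives the restart.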

3
Q

What is the default write concern for MongoDB?

A

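w = 1, j = false –> The write is acknowledged once it has been applied in memory on the server; there is no wait for the journal to sync to disk. (This reflects older releases; since MongoDB 5.0 the implicit default in most deployments is w = majority.)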

4
Q

How do you set a wait for the write to disk in Mongo?

A

w = 1, j = true –> The write is not acknowledged until it has been written to the journal on disk. This is much slower than w = 1, j = false, but less risky.

5
Q

Provided you assume that the disk is persistent, what are the w and j settings required to guarantee that an insert or update has been written all the way to disk?

A

w=1, j=1
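One way to request these settings is through the connection string, which accepts `w` and `journal` options. The sketch below only builds such a string with the standard library; the helper name `build_uri` and the host `localhost:27017` are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_uri(hosts, w=1, journal=True):
    """Return a MongoDB connection string with explicit write-concern options."""
    # The connection string expects lowercase "true"/"false" for journal.
    options = urlencode({"w": w, "journal": str(journal).lower()})
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical single-node deployment on the local machine.
uri = build_uri(["localhost:27017"])
```

Passing a string like this to the driver makes every write wait for the journal unless a weaker concern is set for a particular operation.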

6
Q

What typically happens when you don’t receive an affirmative response?

A

A network error. The error can be misleading: the write may have succeeded even though the acknowledgment never arrived.

7
Q

What are the reasons why an application may receive an error back even if the write was successful?

A

The network connection between the application and the server was reset after the server received the write but before the response was sent.
The MongoDB server can terminate between receiving the write and responding to it.
A failure can occur between the time of the write and the time the application receives the response.

Using inserts instead of updates is the only way to guard against this: a retried insert with the same _id is rejected as a duplicate rather than applied twice.
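The insert-based guard can be sketched as follows. This is a pure-Python simulation, not driver code: `ToyCollection`, `DuplicateKey`, `ConnectionLost`, and `safe_insert` are hypothetical names standing in for a real collection and the driver's exceptions.

```python
class DuplicateKey(Exception):
    """Stand-in for the driver's duplicate key error (hypothetical)."""


class ConnectionLost(Exception):
    """Stand-in for a network error reported to the client (hypothetical)."""


class ToyCollection:
    def __init__(self):
        self.docs = {}
        self.drop_next_ack = True  # simulate one lost acknowledgment

    def insert(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKey(doc["_id"])
        self.docs[doc["_id"]] = doc
        if self.drop_next_ack:
            # The write was applied, but the client never hears about it.
            self.drop_next_ack = False
            raise ConnectionLost("write applied, acknowledgment lost")


def safe_insert(coll, doc, retries=3):
    """Retry an insert; a duplicate key means an earlier attempt succeeded."""
    for _ in range(retries):
        try:
            coll.insert(doc)
            return True
        except DuplicateKey:
            return True
        except ConnectionLost:
            continue
    return False


coll = ToyCollection()
result = safe_insert(coll, {"_id": 42, "x": 1})
```

Because the application chooses the `_id` itself, the retry cannot create a second document.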

8
Q

How do we get availability and fault tolerance in MongoDB?

A

Use replication across multiple Mongo instances (a replica set). Data written to the primary node is replicated asynchronously to the secondary nodes.

9
Q

What is the minimum original number of nodes needed to assure the election of a new Primary if a node goes down?

A

3 –> If the primary goes down, the two remaining nodes still form a majority and can elect a new primary, which then receives the writes. When the old primary comes back, it rejoins as a secondary.
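The arithmetic behind the answer: an election needs a strict majority of the original set. A minimal sketch, with the hypothetical helper names `votes_needed` and `can_elect_primary`:

```python
def votes_needed(total_members):
    """Strict majority of the original number of voting members."""
    return total_members // 2 + 1


def can_elect_primary(total_members, reachable_members):
    """True if the reachable members can still form a majority."""
    return reachable_members >= votes_needed(total_members)


three_node_set_survives = can_elect_primary(3, 2)  # one of three nodes down
two_node_set_survives = can_elect_primary(2, 1)    # one of two nodes down
```

With two nodes, losing one leaves a single vote out of two, which is not a majority, so three is the smallest set that survives the loss of one node.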

10
Q

What are the different types of replica set nodes?

A
  1. Regular - has data; can be primary or secondary
  2. Arbiter - only there for voting purposes in elections
  3. Delayed/Regular - disaster recovery node (can’t become primary - priority = 0)
  4. Hidden - can’t be a primary but can participate in elections
11
Q

Which types of nodes can participate in elections of a new primary?

A

Regular
Hidden
Arbiters

12
Q

Explain Write Consistency

A
  • Writes always go to the primary, but reads don’t have to
  • If reads go somewhere else (to a secondary), you may read stale data
13
Q

During the time when failover is occurring, can writes successfully complete?

A

No

14
Q

What is the oplog?

A

It is a capped collection on a mongo instance (oplog.rs in the local database; you can see it with ‘show collections’) that records the activity of the primary. Operations are recorded in the primary’s oplog, and the secondaries copy and apply them asynchronously.

  • The oplog does not depend on the storage engine (e.g. WiredTiger) or the version, so replica set members can differ in either
15
Q

What does rs.slaveOk() do?

A

It allows queries to be run against a secondary instance; by default, secondaries reject reads.

16
Q

T or F –> A copy of the oplog is kept on both the primary and secondary servers

A

T

17
Q

T or F –> Replication supports mixed-mode storage engines, for example MMAPv1 and WiredTiger

A

T

18
Q

What happens if a node comes back up as a secondary after a period of being offline and the oplog has looped on the primary?

A

The entire dataset will be copied from the primary
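Why a looped oplog forces a full copy can be sketched with a bounded queue. This is a toy model, not MongoDB's implementation; `ToyOplog` and its integer timestamps are assumptions for illustration.

```python
from collections import deque


class ToyOplog:
    """Toy capped log: once full, the oldest entries are overwritten."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_ts = 1

    def record(self, op):
        self.entries.append((self.next_ts, op))
        self.next_ts += 1

    def entries_after(self, last_seen_ts):
        """Return ops newer than last_seen_ts, or None if a full resync is needed."""
        if not self.entries:
            return []
        oldest_ts = self.entries[0][0]
        if last_seen_ts < oldest_ts - 1:
            # The entry the secondary needs next has already been overwritten.
            return None
        return [op for ts, op in self.entries if ts > last_seen_ts]


oplog = ToyOplog(capacity=3)
for i in range(1, 6):
    oplog.record(f"op{i}")  # ops 1 and 2 fall off the log

caught_up = oplog.entries_after(4)    # short outage: replay from the oplog
fell_behind = oplog.entries_after(1)  # long outage: None, copy everything
```

A secondary that last applied operation 1 can no longer find operation 2 in the log, so incremental catch-up is impossible.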

19
Q

T or F - There is a scenario in which a write performed with w = majority gets rolled back

A

T - This can happen when the primary fails over after the write is committed on the primary but before the oplog entry is replicated to the secondaries. The client will see an exception.

20
Q

If you leave a replica set node out of the seedlist within the driver, what will happen?

A

It will be discovered anyway, as long as you list at least one valid node

21
Q

What kind of exception occurs when a primary fails and election occurs during an insert?

A

AutoReconnect exception

22
Q

What will happen if the following statement is executed in Python during a primary election?

A

db.test.insert_one({'x': 1})

23
Q

How is failover detected?

A

Wrap writes in try/except blocks to catch the exception raised during a failover. Catching it does not by itself guarantee that all data gets written, so retry the write up to 3 times or more.
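A retry loop of this kind might look like the sketch below. It runs without a database: the local `AutoReconnect` class is a stand-in for the driver exception named in the earlier card, and `insert_with_retry` and `flaky_insert` are hypothetical helpers.

```python
import time


class AutoReconnect(Exception):
    """Local stand-in for the driver's AutoReconnect exception."""


def insert_with_retry(do_insert, doc, attempts=3, delay_seconds=0.0):
    """Call do_insert(doc), retrying when a failover interrupts the write."""
    for attempt in range(1, attempts + 1):
        try:
            return do_insert(doc)
        except AutoReconnect:
            if attempt == attempts:
                raise  # out of attempts: let the caller decide what to do
            time.sleep(delay_seconds)  # give the election time to finish


calls = []


def flaky_insert(doc):
    # Simulated insert: fails twice, as if no primary were available yet.
    calls.append(doc)
    if len(calls) < 3:
        raise AutoReconnect("no primary available")
    return "inserted"


outcome = insert_with_retry(flaky_insert, {"x": 1})
```

In a real program a non-zero delay between attempts is worth considering, since an election takes some time to complete.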

24
Q

Why don’t you need to handle a duplicate key exception for reads?

A

Reads never insert documents, so a duplicate key error cannot occur when reading

25
Q

Give examples of idempotent updates and non-idempotent updates:

A

Idempotent –> $set
Non-idempotent –> $inc, $push –> these are riskier to run again because they could increment (or push) twice - not good for incrementing sensitive data like money but probably not a big deal for
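The difference shows up when the same update is applied twice, as a retry would do. The sketch below mimics `$set` and `$inc` on a plain dictionary; `apply_update` is a hypothetical helper, not a driver function.

```python
def apply_update(doc, update):
    """Apply a tiny subset of update operators ($set, $inc) to a copy of doc."""
    result = dict(doc)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount
    return result


account = {"_id": 1, "balance": 100}

# $set: running it a second time leaves the document unchanged.
set_once = apply_update(account, {"$set": {"balance": 150}})
set_twice = apply_update(set_once, {"$set": {"balance": 150}})

# $inc: running it a second time adds the amount again.
inc_once = apply_update(account, {"$inc": {"balance": 50}})
inc_twice = apply_update(inc_once, {"$inc": {"balance": 50}})
```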

26
Q

How do you handle non-idempotent updates?

A
  • Converting them to idempotent updates
  • Not retrying the update and accepting the risk
27
Q

If you want to be sure that an update with a $inc occurred exactly once in the face of failover, what’s the best way to do it?

A

Transform the update into a statement that is idempotent - however, this risks losing one update in a multithreaded program
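Both halves of this answer can be seen in a small simulation, assuming the transformation is "read the value, then set it to value + 1". The dictionary stands in for a document and `retry_safe_increment` is a hypothetical helper.

```python
def retry_safe_increment(doc, field, observed):
    """Idempotent form of an increment: set the field to observed + 1."""
    doc[field] = observed + 1


# Retrying is harmless: the second application changes nothing.
counter = {"value": 10}
seen = counter["value"]
retry_safe_increment(counter, "value", seen)
retry_safe_increment(counter, "value", seen)  # blind retry after a failover
after_retry = counter["value"]

# The risk: a concurrent increment between the read and the set is lost.
shared = {"value": 10}
seen = shared["value"]                       # this writer reads 10
shared["value"] += 1                         # another writer increments to 11
retry_safe_increment(shared, "value", seen)  # overwrites with 11 again
after_race = shared["value"]                 # two increments intended, one kept
```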