E4 Flashcards

1
Q

What are some of the challenges of creating big data applications?

A
  1. Scaling problems
  2. Fault-tolerance issues
  3. Data corruption issues
2
Q

Best approach to scaling problems?

A

Use multiple database servers and spread the table across all servers.

Each server will have a subset of the data.
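
A minimal sketch of how a key could be routed to one server so that each server holds only a subset of the rows. The server names and the `shard_for` / `server_for` helpers are hypothetical, and the modulo-of-a-hash scheme is one common choice rather than something the card prescribes.

```python
import hashlib

# Hypothetical server names, used only for illustration.
SERVERS = ["db-0", "db-1", "db-2"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def server_for(key: str) -> str:
    """Pick the server that stores the row for this key."""
    return SERVERS[shard_for(key, len(SERVERS))]


if __name__ == "__main__":
    for user in ["alice", "bob", "carol"]:
        print(user, "->", server_for(user))
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would send the same key to different servers after a restart.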

3
Q

How do we scale further once we are already using multiple databases?

A
  1. Deploy more database servers
  2. Use a different hash function
  3. Redistribute the users according to the new hash function
  4. Change the code of our application
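
The steps above can be sketched as follows, assuming the simple "hash modulo number of servers" placement (an assumption, since the card does not name the hash function). The `plan_moves` helper is hypothetical; it lists which keys would have to be copied to a different server when the server count changes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Stable hash of the key, reduced modulo the number of shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def plan_moves(keys, old_n: int, new_n: int) -> dict:
    """Return {key: (old_shard, new_shard)} for every key that must move."""
    moves = {}
    for key in keys:
        old, new = shard_for(key, old_n), shard_for(key, new_n)
        if old != new:
            moves[key] = (old, new)
    return moves


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(1000)]
    moves = plan_moves(keys, old_n=3, new_n=4)
    print(f"{len(moves)} of {len(keys)} keys must move to another server")
```

With plain modulo placement, going from 3 to 4 servers is expected to relocate roughly three quarters of the keys, which is why the redistribution step is costly and why the application code has to change along with it.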
4
Q

Fault-tolerance issues

A

With many database servers, it becomes common for the hard drive in one of them to fail.

  • We need to deal with having one of the databases down
  • We need to add backups to each of the databases

Our system is not resilient to hardware errors
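
As a rough illustration of the two bullet points, the sketch below keeps a backup copy of a shard and falls back to it when the primary is down. The `Replica` and `ReplicatedShard` classes are hypothetical in-memory stand-ins, not a real database client, and the sketch ignores how a recovered server catches up on writes it missed.

```python
class Replica:
    """In-memory stand-in for one database server (hypothetical)."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.up = True

    def _check(self):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")

    def put(self, key, value):
        self._check()
        self.data[key] = value

    def get(self, key):
        self._check()
        return self.data[key]


class ReplicatedShard:
    """Writes to every reachable replica and reads from the first one that answers."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        written = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                written += 1
            except ConnectionError:
                pass  # a real system would queue the write for later repair
        if written == 0:
            raise ConnectionError("no replica available")

    def get(self, key):
        for replica in self.replicas:
            try:
                return replica.get(key)
            except ConnectionError:
                continue  # try the next copy
        raise ConnectionError("no replica available")


if __name__ == "__main__":
    primary, backup = Replica("db-0"), Replica("db-0-backup")
    shard = ReplicatedShard([primary, backup])
    shard.put("video-1", 10)
    primary.up = False           # simulate a failed hard drive
    print(shard.get("video-1"))  # served by the backup
```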

5
Q

Data corruption issues

A

At some point we deploy code with a bug: instead of incrementing each video's viewership by one, our code increments it by two. We notice the mistake only 24 hours later.

Now we have corrupted data: every video watched in the past 24 hours has its viewership inflated. How do we solve this?

Our system is not resilient to human errors
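
The card leaves the question open. One possible answer, shown here only as a sketch, assumes the system also stores a raw, append-only record of every view (the card does not say it does). Under that assumption the counters can be thrown away and recomputed with corrected code. The event data and function names below are hypothetical.

```python
from collections import Counter

# Hypothetical raw log: one video id per recorded view.
view_events = ["v1", "v2", "v1", "v1"]


def buggy_counts(events) -> Counter:
    """What the buggy deployment stored: each view counted twice."""
    counts = Counter()
    for video_id in events:
        counts[video_id] += 2  # the bug: should be += 1
    return counts


def recompute_counts(events) -> Counter:
    """Rebuild the counters from the raw events with the fixed logic."""
    counts = Counter()
    for video_id in events:
        counts[video_id] += 1
    return counts


if __name__ == "__main__":
    print("corrupted:", dict(buggy_counts(view_events)))
    print("repaired: ", dict(recompute_counts(view_events)))
```

If only the mutable counters are kept, there is no reliable way to tell which increments came from the buggy code, which is what makes such a system fragile to human error.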

6
Q

The desired properties of Big Data systems are related to both...

A

Complexity and scalability

7
Q

Complexity

A

A term generally used to characterize something with many parts that interact with each other in multiple ways

8
Q

Scalability

A

The ability to maintain performance in the face of increasing data or load by adding resources to the system

9
Q

A big data system must

A
  1. perform well
  2. be resource-efficient
  3. be easy to reason about
10
Q

Desired properties of a Big Data system

A
  1. Robustness and fault tolerance
  2. Low latency
  3. Minimal maintenance
  4. Ad hoc queries