Data Science at Scale Flashcards

1
Q

How have companies changed in the big data era and what has enabled this?

A

Companies have become more social, customer-orientated and dynamic. They have done this by collecting data, learning from the data, and improving and adapting in response. This has been enabled by cheaper storage and processing, faster networks, and free open-source tools.

2
Q

What technological shift enabled widespread data analytics?

A

Cloud-based infrastructure (e.g., AWS, GCP) and Infrastructure as a Service solutions from internet giants like Google, Amazon, and Microsoft.

3
Q

How has big data transformed marketing?

A

Customer profiling, targeted ads, and personalised communication and recommendations.

4
Q

What are the three Vs of big data?

A

Volume, Velocity and Variety

5
Q

What are the four fundamental functionalities that data-intensive applications are built from?

A

Database - Store data so it can be retrieved later.
Caching - Store the results of expensive operations to be used again soon.
Indexing - Allow users to efficiently search the data.
Batch Processing - Periodically run specific routines on large amounts of accumulated data.
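
A minimal sketch of the four building blocks above, using only the Python standard library. The in-memory dict, the sample records, and the function names `search`, `word_count` and `batch_total_words` are illustrative assumptions, not part of any particular system.

```python
from collections import defaultdict
from functools import lru_cache

# Database: store records so they can be retrieved later.
# (Assumption: an in-memory dict stands in for a real data store.)
database = {
    1: "big data needs cheap storage",
    2: "cheap storage and fast networks",
    3: "fast networks enable cloud services",
}

# Indexing: map each word to the ids of the records containing it,
# so a search does not have to scan every record.
index = defaultdict(set)
for record_id, text in database.items():
    for word in text.split():
        index[word].add(record_id)


def search(word):
    """Return the sorted ids of records that contain the word."""
    return sorted(index.get(word, set()))


# Caching: remember the result of an operation so it can be reused soon.
@lru_cache(maxsize=128)
def word_count(record_id):
    return len(database[record_id].split())


# Batch processing: run a routine over all the accumulated data.
def batch_total_words():
    return sum(word_count(record_id) for record_id in database)
```

Calling `search("storage")` looks the word up in the index instead of scanning every record, and repeated calls to `word_count` for the same id are served from the cache.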

6
Q

What are the three important things that a data-intensive application needs to be?

A

Reliable, Scalable, Maintainable

7
Q

What does it mean for a system to be reliable?

A

It performs the function the user expected. It can tolerate the user making mistakes. Its performance is good enough for the required use case, under the expected load and volume. It prevents any unauthorised access.

8
Q

What are faults and failures?

A

A fault is when a component (hardware/software) of the system works in an unexpected way, and a failure is when the entire system stops providing the service.
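
One way to picture the difference is the sketch below, where a fault in one replica is tolerated by falling back to another, and only a fault in every replica becomes a failure. The names `ReplicaFault`, `read_with_fallback`, `healthy_replica` and `broken_replica` are hypothetical and exist only for this illustration.

```python
class ReplicaFault(Exception):
    """Raised when a single component (one replica) misbehaves."""


def healthy_replica(key):
    return f"value-for-{key}"


def broken_replica(key):
    raise ReplicaFault(f"replica could not serve {key}")


def read_with_fallback(replicas, key):
    """Try each replica in turn and return (value, number of faults tolerated).

    A fault in one replica is absorbed by moving on to the next one.
    The system as a whole only fails if every replica faults.
    """
    faults = 0
    for replica in replicas:
        try:
            return replica(key), faults
        except ReplicaFault:
            faults += 1
    raise RuntimeError("system failure: every replica faulted")
```

With `[broken_replica, healthy_replica]` the read still succeeds after tolerating one fault. With only broken replicas the call raises, which is the failure case.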

8
Q

What are Hardware Faults and what measures can be taken to stop them?

A

Usually when an HDD, memory module or PSU stops working. In large data centres this is common. We can use hardware measures such as RAID for HDDs, redundant PSUs, and hot-swappable CPUs.

9
Q

What are Software Faults?

A

When the software stops working as intended. These faults are harder to anticipate, and can be present on many nodes of a system, causing widespread failures.

10
Q

What is Scalability?

A

A system’s ability to cope with increased load

11
Q

What is Load?

A

A measure of the amount of use of a system, for example: requests per second, number of players, read/write ratio.

12
Q

What is performance?

A

How well the system is responding to the load, for example: response time, or time taken to process a dataset. The average and distribution are both important.
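
To see why the distribution matters as well as the average, here is a small sketch with made-up response times in which one slow request dominates the mean. The sample data and the nearest-rank `percentile` helper are assumptions for illustration only.

```python
import statistics

# Hypothetical response times in milliseconds; one request is very slow.
response_times_ms = [12, 14, 15, 15, 16, 18, 20, 22, 25, 950]


def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100.

    Returns the smallest value such that at least p% of the data
    is less than or equal to it.
    """
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p * n / 100
    return ordered[rank - 1]


mean_ms = statistics.mean(response_times_ms)
p50_ms = percentile(response_times_ms, 50)
p95_ms = percentile(response_times_ms, 95)
```

Here the mean is 110.7 ms even though half of the requests finish within 16 ms (the 50th percentile), while the 95th percentile of 950 ms exposes the slow outlier that the typical user never sees but some users do.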

13
Q

What is Vertical Scaling?

A

Upgrading the specs of the current system. This does not scale linearly and has limited fault tolerance.

14
Q

What is Horizontal Scaling?

A

Increasing the number of devices in the system, which scales better and has better fault tolerance.
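
A common way to spread work across several machines is to partition the data by a hash of each key. The sketch below assumes simple modulo hashing, and the helper names `node_for` and `partition` are hypothetical. Real systems often use more elaborate schemes so that adding a node moves fewer keys.

```python
import hashlib


def node_for(key, node_count):
    """Pick a node for a key using a stable hash, so every caller agrees."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(keys, node_count):
    """Group keys by the node responsible for them."""
    shards = {node: [] for node in range(node_count)}
    for key in keys:
        shards[node_for(key, node_count)].append(key)
    return shards


# Hypothetical keys spread over three nodes.
user_keys = [f"user:{i}" for i in range(100)]
shards = partition(user_keys, 3)
```

Each node then stores and serves only its own shard, so adding nodes increases total capacity.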

15
Q

What is Maintainability?

A

The overall cost of keeping a system operational and up to date.

16
Q

What is Operability?

A

How easy it is for the operations team to keep the system running. Includes good monitoring, automation, and predictable behaviour.

17
Q

What is Simplicity?

A

How easy it is for new people working on the system to understand it. Simplicity comes from removing accidental complexity, not from reducing functionality.

18
Q

What is Evolvability?

A

How easy it is to make changes and update the system, which is closely linked with simplicity.