Big Data Flashcards

1
Q

What are the three main features of Big data?

A
  • Volume
  • Velocity
  • Variety
2
Q

What does Volume mean in terms of Big data?

A

Data is too big to fit into a single server. Data must be stored over multiple servers, each composed of many hard drives.

3
Q

What does Velocity mean in terms of Big data?

A

Data on the servers are created and modified rapidly. The servers must respond to frequently changing data within milliseconds.

4
Q

What is Big Data?

A

A term used to refer to typically unstructured datasets that are large in terms of storage size, data streaming rate and/or variety.

5
Q

What does Variety mean in terms of Big data?

A

Data held on servers consist of many different types of data, from binary files to multimedia files (photos and videos).

6
Q

What is the problem with Big data being unstructured?

A

Being unstructured makes the data difficult to analyse. Conventional relational databases aren’t suited to storing big data because they require the data to conform to a fixed structure of rows and columns.

7
Q

How can useful information be extracted from Big data?

A

Using machine learning techniques to discern patterns in the data.

8
Q

What is the downside of storing data across multiple servers?

A

The processing associated with using big data must be split across multiple machines.
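As a rough illustration of splitting processing across machines, the sketch below simulates four servers in a single Python process. The helper names `partition` and `partial_sum`, the server count and the sample data are assumptions made for this example only; a real system would run each chunk on a separate machine.

```python
from functools import reduce


def partition(records, n_parts):
    """Split records into n_parts chunks, one per hypothetical server."""
    return [records[i::n_parts] for i in range(n_parts)]


def partial_sum(chunk):
    """Work one server can do on its own chunk without seeing the others."""
    return sum(chunk)


readings = list(range(1, 101))   # stand-in for a dataset too big for one server
chunks = partition(readings, 4)  # the data is spread over four servers

# Each server processes only its own chunk...
partials = [partial_sum(chunk) for chunk in chunks]

# ...and the partial results are then combined into the final answer.
total = reduce(lambda a, b: a + b, partials)
print(total)
```

The extra partition and combine steps are the cost of distribution: the job has to be expressed as independent pieces of work plus a way of merging their results.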

9
Q

Why is functional programming used to process data over multiple machines?

A

Functional programs are stateless (they have no side effects), use immutable data structures and support higher-order functions.

These properties make it easier to write correct, efficient, distributed code.
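A minimal sketch of this style in Python, assuming a small word-count task chosen only for illustration: the input is an immutable tuple, `to_pair` and `merge_counts` (hypothetical names) have no side effects, and the higher-order functions `map` and `reduce` do the iteration.

```python
from functools import reduce

# Immutable input: a tuple cannot be modified after it is created.
words = ("big", "data", "big", "servers", "data", "big")


def to_pair(word):
    """Pure function: the output depends only on the input."""
    return (word, 1)


def merge_counts(counts, pair):
    """Return a new dict instead of mutating the one passed in."""
    word, n = pair
    return {**counts, word: counts.get(word, 0) + n}


pairs = tuple(map(to_pair, words))              # map step
word_counts = reduce(merge_counts, pairs, {})   # reduce step
print(word_counts)
```

Because neither function changes shared state, the map step could in principle run on any machine, in any order, without affecting the result.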

10
Q

Why do conventional programming paradigms struggle with processing data over multiple machines?

A

Conventional programming paradigms rely on shared data that can be changed in place, so all the machines would have to be synchronised to stop data being overwritten or damaged.

11
Q

What is a Higher-order function?

A

A function that takes one or more functions as arguments and/or returns a function as its result.
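Both halves of the definition can be shown in a short Python sketch. The names `compose` and `apply_to_all` are illustrative choices rather than standard library functions: `apply_to_all` takes a function as input, and `compose` both takes functions and returns a new one.

```python
def apply_to_all(func, items):
    """Takes a function as an argument and applies it to every item."""
    return [func(item) for item in items]


def compose(f, g):
    """Takes two functions and returns a new function: x -> f(g(x))."""
    def composed(x):
        return f(g(x))
    return composed


def double(x):
    return x * 2


def increment(x):
    return x + 1


double_then_increment = compose(increment, double)
result = apply_to_all(double_then_increment, [1, 2, 3])
print(result)  # [3, 5, 7]
```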

12
Q

What is a Graph Schema?

A

A method of defining a database in terms of nodes, edges and properties.
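One possible way to picture nodes, edges and properties is the small Python sketch below. The `Node` and `Edge` classes, the customer/product data and the `neighbours` helper are all hypothetical and do not come from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                       # the kind of entity
    properties: dict = field(default_factory=dict)   # facts about the node


@dataclass
class Edge:
    source: str                                      # node_id where the edge starts
    target: str                                      # node_id where the edge ends
    relationship: str
    properties: dict = field(default_factory=dict)   # facts about the relationship


nodes = [
    Node("c1", "Customer", {"name": "Alice"}),
    Node("p1", "Product", {"name": "Laptop", "price": 899}),
]
edges = [Edge("c1", "p1", "BOUGHT", {"quantity": 1})]


def neighbours(node_id, edges):
    """Return the ids of nodes reachable from node_id along one edge."""
    return [edge.target for edge in edges if edge.source == node_id]


print(neighbours("c1", edges))  # ['p1']
```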
