BIG DATA Flashcards

1
Q

TYPES OF DATA

A
2
Q

The 4 V’s

A

Volume, Velocity, Variety, Veracity

3
Q

Volume

A

A large amount of data that increasingly requires more storage space

4
Q

Velocity

A

The speed at which data are generated and must be processed; the amount of data is growing exponentially fast

5
Q

Variety

A

Data that are generated in different formats

6
Q

Veracity

A

Data are generated by the public rather than by employees; therefore, they have varying levels of accuracy

7
Q

Where does data originate?

A

Data originate from sensors and from anything that has been scanned, entered, and released to the internet

8
Q

Collected Data can be:

A

categorized as structured or unstructured

9
Q

Structured Data

A

are created by applications that use fixed format input such as spreadsheets. May need to be manipulated into a common format such as CSV
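
As a rough illustration of manipulating fixed-format records into CSV, here is a minimal Python sketch using the standard `csv` module. The `records` list and the `to_csv` helper are hypothetical examples, not taken from the card.

```python
import csv
import io

# Hypothetical structured records: every row has the same fixed fields.
records = [
    {"customer": "Ada", "city": "London", "total": 12.5},
    {"customer": "Grace", "city": "New York", "total": 30.0},
]

def to_csv(rows):
    """Serialize a list of same-shaped dicts to CSV text (header row first)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(records)
print(csv_text)
```

Because every record shares the same fields, the header can be taken from the first row; unstructured data such as audio or free text has no such fixed columns.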

10
Q

Unstructured data

A

are generated in a freeform style such as audio, video, web pages, and tweets

11
Q

Huge Data

A
12
Q

Each day we create ____ bytes of data

A

2.5 quintillion

13
Q

To calculate the size of a database (see example in answer)

A

Assume 1000 bytes/transaction. 1000 bytes/transaction * 30 billion transactions/quarter * 4 quarters/year * 10 years = 1.2 * 10^15 bytes = 1.2 petabytes (1,200 terabytes)
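
The arithmetic can be checked with a short Python sketch. The function name `database_size_bytes` is a hypothetical helper, and the conversion assumes decimal (SI) units, where 1 petabyte = 10^15 bytes. Working the product through gives 1.2 * 10^15 bytes, that is, 1.2 petabytes or 1,200 terabytes.

```python
# Assumption: decimal (SI) units, 1 PB = 10**15 bytes and 1 TB = 10**12 bytes.
BYTES_PER_PETABYTE = 10**15
BYTES_PER_TERABYTE = 10**12

def database_size_bytes(bytes_per_transaction, transactions_per_quarter,
                        quarters_per_year, years):
    """Estimate total storage as the product of the four factors."""
    return (bytes_per_transaction * transactions_per_quarter
            * quarters_per_year * years)

# Figures from the card: 1000 bytes, 30 billion per quarter, 4 quarters, 10 years.
total = database_size_bytes(1000, 30 * 10**9, 4, 10)
petabytes = total / BYTES_PER_PETABYTE
terabytes = total / BYTES_PER_TERABYTE

print(f"{total} bytes = {petabytes} PB = {terabytes} TB")
```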

14
Q

Big Data Storage (5 major storage problems with big data)

A

Management: cloud or on premises
Security: ensuring good security policies are in place and followed
Redundancy: need good backups
Scale: data storage needs may change at any time
Access: data needs to be easy to access, with a friendly user interface

15
Q

Big Data Storage (Benefits of Big Data)

A

Analyzing a large amount of data for data-driven decisions
Businesses can utilize other big data warehouses for decision making
Improved customer service
Increased operational efficiency in manufacturing, products, and services

16
Q

Partitioning

A

Splits large tables into separate files on one machine
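
A minimal Python sketch of the idea, assuming a table held as a list of rows and partitioned by the value of one column into separate CSV files in a single directory on one machine. The `orders` data, the `partition_rows` helper, and the `part_<value>.csv` naming are hypothetical; real database engines handle partitioning internally.

```python
import csv
import os
import tempfile
from collections import defaultdict

def partition_rows(rows, key, out_dir):
    """Write rows to one CSV file per distinct value of `key`.

    Returns a dict mapping each key value to the path of its file.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    paths = {}
    for value, group in groups.items():
        path = os.path.join(out_dir, f"part_{value}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        paths[value] = path
    return paths

# Hypothetical "large" table, partitioned by year.
orders = [
    {"order_id": 1, "year": 2022, "amount": 40},
    {"order_id": 2, "year": 2023, "amount": 15},
    {"order_id": 3, "year": 2022, "amount": 70},
]

out_dir = tempfile.mkdtemp()  # all partitions live on the same machine
paths = partition_rows(orders, "year", out_dir)
print(sorted(paths))
```

A query that only needs one year can then read a single smaller file instead of scanning the whole table.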

17
Q

Constant schema

A

The schema is developed when the database is designed and does not change

18
Q

Vertical Scaling

A

Increases speed by adding resources, such as CPU, to a single machine