Big data analytics and data science role Flashcards

1
Q

What are the 5 V’s of big data?

A
Volume
Velocity
Variety
Value
Veracity
2
Q

What does Veracity mean?

A

The trustworthiness of the data: how willing you are to believe it is accurate and reliable

3
Q

What are the 2 types of data?

A

Metadata and paradata

4
Q

What does metadata mean?

A

Data about the data: the minimum you should know about it

5
Q

What does paradata mean?

A

A record of how the data has been processed
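
The difference between metadata and paradata can be sketched in a few lines of Python. This is only an illustration: the `Dataset` class, its field names, and the sample values are hypothetical and not part of the course material.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    rows: list
    metadata: dict                                 # what the data is (source, units, ...)
    paradata: list = field(default_factory=list)   # how the data has been processed

    def apply(self, step_name, func):
        """Apply func to every row and log the step name as paradata."""
        self.rows = [func(r) for r in self.rows]
        self.paradata.append(step_name)
        return self


# Hypothetical sample: raw age values captured as text.
ds = Dataset(rows=[" 12", "7 "], metadata={"source": "hypothetical survey", "unit": "years"})
ds.apply("strip whitespace", str.strip).apply("cast to int", int)
```

After the two steps, `ds.metadata` still describes what the data is, while `ds.paradata` lists what was done to it.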

6
Q

What are the 4 data structures?

A

structured
semi-structured
quasi-structured
unstructured
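
As a rough illustration of the four structures, the Python sketch below handles one tiny sample of each. The sample values and the URL are made up for the example, and the choice of CSV, XML, a clickstream URL and free text as representatives is an assumption.

```python
import csv
import io
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Structured: a fixed schema, every row has the same columns.
structured = list(csv.DictReader(io.StringIO("id,amount\n1,9.99\n2,4.50\n")))

# Semi-structured: self-describing tags, but no rigid table layout (e.g. XML).
doc = ET.fromstring("<order id='1'><item>pen</item><item>ink</item></order>")
items = [e.text for e in doc.findall("item")]

# Quasi-structured: erratic text that needs parsing effort (e.g. a clickstream URL).
click = "https://example.com/search?q=big+data&page=2"
query = parse_qs(urlparse(click).query)

# Unstructured: no inherent structure (e.g. free text).
text = "The delivery was late but support was helpful."
word_count = len(text.split())
```

The further down the list, the more work is needed before the data can be analysed.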

7
Q

what are the data repositories?

A

data islands
data warehouses
analytic sandbox

8
Q

what is a data island?

A

Isolated data marts: record keeping in spreadsheets and low-volume DBMSs

9
Q

what is a data warehouse?

A

centralised data repository. Supports BI and reporting

10
Q

what is an analytic sandbox?

A

assets from multiple sources ready for analysis

11
Q

What are the three big data project success factors?

A

timely decision making
processing throughput
flexibility

12
Q

what three ways does an analytic sandbox support big data success factors?

A

provides high performance analysis
ingests data from different sources
owned by the data scientists rather than IT

13
Q

What are the business drivers of big data/data science?

A

optimise business processes
predict new business opportunities
mitigate business risk
meet legal and regulatory requirements

14
Q

what are the four parts of the big data ecosystem?

A

data devices
data collectors
data aggregators
data users/buyers

15
Q

what are data devices in the big data ecosystem?

A

devices that continuously gather data about the world (e.g. phones)

16
Q

what are data collectors in the big data ecosystem?

A

the many organisations and institutions that people interact with; people provide them with information in order to access their services

17
Q

what are data aggregators in the big data ecosystem?

A

take data from multiple sources, then combine and enrich it to provide data to consumers

18
Q

what are data users/buyers in the big data ecosystem?

A

users consume data from their own sensor networks and data collectors, along with data acquired from data aggregators, to inform data-driven decision making

19
Q

Of BI and DS, which looks at the past and which at the future?

A

BI - the past

DS - the future

20
Q

What are the 4 key roles within DS?

A

analytical talent
data savvy professionals
technology enablers
knowledge engineers

21
Q

what does a knowledge engineer do?

A

wrangle the data ready for projects to consume

22
Q

what are the 5 things a data scientist should be?

A
quantitative - do math
curious and creative
technical - code
skeptical - question
communicate and collaborate
23
Q

what are the 3 original V words?

A

Volume
Variety
Velocity

24
Q

Describe semi-structured data.

A

Data with a self-describing structure, e.g. XML

25
Q

Describe quasi-structured data.

A

Text with erratic formats that takes effort to parse, e.g. web clickstream data

26
Q

Big data uses ELT. What does it stand for?

A

Extract, Load, Transform
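
A minimal sketch of the ELT order, assuming an in-memory SQLite database stands in for the big data store. The table names, the `run_elt` helper and the sample records are all hypothetical. The point is only that raw data is loaded first and transformed afterwards, inside the store.

```python
import sqlite3


def run_elt(raw_lines):
    """Extract -> Load -> Transform, with the transform done inside the target store."""
    conn = sqlite3.connect(":memory:")
    # Extract: raw_lines stands in for records pulled from a source system.
    # Load: store the raw records untouched in a staging table.
    conn.execute("CREATE TABLE staging (line TEXT)")
    conn.executemany("INSERT INTO staging (line) VALUES (?)", [(l,) for l in raw_lines])
    # Transform: shape the data after loading, using the store's own SQL engine.
    conn.execute(
        """
        CREATE TABLE sales AS
        SELECT substr(line, 1, instr(line, ',') - 1) AS region,
               CAST(substr(line, instr(line, ',') + 1) AS REAL) AS amount
        FROM staging
        """
    )
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows


totals = run_elt(["north,10.5", "south,4", "north,2"])
```

In ETL the parsing step would run before the insert. Here the raw lines stay available in `staging` for other analyses.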

27
Q

what is a data savvy professional?

A

someone with an introductory understanding of DS

28
Q

what is an analytical talent?

A

someone with training in quantitative methods

29
Q

what is a technology enabler?

A

someone who looks after the hardware and software