Chapter 5 Flashcards

1
Q

four v’s of big data

A

volume, velocity, variety, and veracity

2
Q

data volume

A

amount of data created and stored by an organization

3
Q

data velocity

A

pace at which data is created and stored

4
Q

data variety

A

different forms data can take

5
Q

data veracity

A

quality or trustworthiness of data

6
Q

analytics mindset is ability to

A

ask the right questions; extract, transform, and load relevant data; apply appropriate data analytic techniques; interpret and share the results with stakeholders

7
Q

asking the right questions is the 1st step of the analytics mindset: establishing objectives that are SMART

A

specific, measurable, achievable, relevant, timely

8
Q

ETL process

A

extracting, transforming, and loading data
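
The three stages can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical in-memory CSV of sales and an in-memory SQLite table; `RAW_CSV`, `extract`, `transform`, `load`, and the `sales` table are invented for the example and do not come from the chapter.

```python
import csv
import io
import sqlite3

# Hypothetical source data; amounts arrive as text with stray whitespace.
RAW_CSV = "id,amount\n1, 10.50\n2,7.25 \n"

def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: strip whitespace and convert text to numeric types."""
    return [(int(r["id"]), float(r["amount"].strip())) for r in rows]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

# Once loaded, the data can be queried for analysis.
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
```

Keeping each stage in its own function makes it easier to test and document the stages separately.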

9
Q

structured data

A

data that is highly organized and fits into fixed fields

10
Q

unstructured data

A

data that has no uniform structure

11
Q

semi-structured data

A

data that is organized in some ways but not organized enough to be inserted into a relational database

12
Q

data warehouses store

A

structured data

13
Q

data lake

A

collection of structured, semi-structured, and unstructured data in a single location

14
Q

dark data

A

info the organization has collected and stored that would be useful for analysis but is not analyzed and is ignored

15
Q

data swamps

A

data repositories that aren't accurately documented, so the stored data can't be properly identified and analyzed

17
Q

flat file

A

text file that contains data from multiple tables or sources and merges it into a single row

18
Q

delimiter

A

character that marks the end of 1 field and the beginning of the next

19
Q

text qualifier

A

2 characters that indicate the beginning and end of a field and tell the program to ignore any delimiters contained between the characters
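
A short example may make the delimiter and text qualifier cards concrete. It is a sketch that assumes a hypothetical comma-delimited line whose name field itself contains a comma; Python's standard `csv` module calls the text qualifier `quotechar`.

```python
import csv
import io

# Hypothetical line: the name field contains a comma, so it is wrapped in quotes.
line = '101,"Smith, Jane",Accounting\n'

# delimiter="," separates fields; quotechar='"' acts as the text qualifier,
# so the comma inside the quotes is treated as data, not as a field boundary.
with_qualifier = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

# Splitting naively on the delimiter ignores the qualifier and breaks the name apart.
naive = line.strip().split(",")
```

Here `with_qualifier` holds three fields, while `naive` holds four because the name was split at its internal comma.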

20
Q

4 steps for transforming data

A

understand the data and desired outcome
standardize, structure, and clean data
validate data quality and verify data meets data requirements
document the transformation process
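
Steps 2 and 3 above can be sketched as follows. This is only an illustration under stated assumptions: the records, the two accepted date formats, and the rule that amounts must be non-negative are all hypothetical, as are the helper names `standardize` and `is_valid`.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent formatting.
raw = [
    {"customer": " alice ", "date": "2023-01-05", "amount": "100"},
    {"customer": "BOB", "date": "01/07/2023", "amount": "-5"},
]

def standardize(record):
    """Step 2: standardize, structure, and clean one record."""
    date_text = record["date"]
    # Try each assumed input format until one parses.
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            parsed = datetime.strptime(date_text, fmt).date()
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unrecognized date: {date_text}")
    return {
        "customer": record["customer"].strip().title(),
        "date": parsed.isoformat(),
        "amount": float(record["amount"]),
    }

def is_valid(record):
    """Step 3: validate against an assumed requirement (non-negative amounts)."""
    return record["amount"] >= 0

cleaned = [standardize(r) for r in raw]
valid = [r for r in cleaned if is_valid(r)]
rejected = [r for r in cleaned if not is_valid(r)]
```

Keeping the rejected records, rather than silently dropping them, supports step 4, since the documentation can state what was excluded and why.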

21
Q

descriptive analytics

A

info that results from examination of data to understand the past
“what happened?”

22
Q

diagnostic analytics

A

builds on descriptive analytics to answer “why did this happen?”
attempts to determine causal relationships

23
Q

predictive analytics

A

answers “what might happen in the future?”

24
Q

prescriptive analytics

A

info that provides a recommendation of what should happen
“what should be done?”

25
Q

common way people interpret results incorrectly is

A

confusing correlation with causation

26
Q

correlation

A

tells if 2 things happen at the same time

27
Q

causation

A

tells if the occurrence of 1 thing causes the occurrence of the 2nd thing
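
The difference between the two cards can be illustrated with a small calculation. The sketch below computes a Pearson correlation coefficient by hand; the two series are hypothetical figures invented for the example, and the helper name `pearson` is an assumption, not something from the chapter.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly figures: both tend to rise in hot weather.
ice_cream_sales = [20, 35, 50, 80, 95]
sunburn_cases = [2, 4, 5, 9, 10]

# A value close to 1 means the two series move together.
r = pearson(ice_cream_sales, sunburn_cases)
```

A high `r` here shows only that the two series rise and fall together. It does not show that ice cream sales cause sunburn; a third factor such as hot weather could drive both.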

28
Q

components of sharing results

A

remember the question that initiated the analytics process, consider audience, data visualization

29
Q

good principles of visualization design

A

choosing the right type of visualization, simplifying the presentation of data, emphasizing what's important, representing data ethically

30
Q

automation

A

application of machines to automatically perform a task once performed by humans

31
Q

what is one tool that can be used to automate ETL tasks?

A

robotic process automation