Week 1: Data Science Process Flashcards

1
Q

What are the steps in the DS process?

A

Acquire, Prepare: Explore, Prepare: Pre-Process, Analyze, Communicate, Take Action

2
Q

What is the acquire step?

A

Identify suitable data and retrieve it, using all of the available relevant data.

3
Q

What are some different data sources?

A

Traditional databases, text files and spreadsheets, remote data (websites), and NoSQL storage.
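
As a rough illustration of acquiring data from two of these source types, the sketch below parses CSV text (standing in for a text file or spreadsheet export) and queries a throwaway in-memory SQLite table (standing in for a traditional database). The table name, column names, and all values are hypothetical sample data, not part of the course material.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a text file or spreadsheet export.
CSV_TEXT = "city,temp_c\nOslo,4.5\nCairo,31.0\n"


def read_csv_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_db_rows():
    """Build a throwaway in-memory relational table and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (city TEXT, temp_c REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [("Lima", 19.0), ("Pune", 27.5)],
    )
    rows = conn.execute(
        "SELECT city, temp_c FROM readings ORDER BY city"
    ).fetchall()
    conn.close()
    return rows


csv_rows = read_csv_rows(CSV_TEXT)
db_rows = read_db_rows()
```

Note that the CSV reader returns every field as a string, so numeric columns still need converting during pre-processing.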

4
Q

What is the Prepare: Explore step?

A

Look for correlations, trends, and outliers in the data. Use visualizations to help.
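
A minimal sketch of two exploration checks, assuming small numeric lists: a Pearson correlation coefficient and an outlier test based on the common 1.5 x IQR rule. The function names and the sample numbers are made up for illustration; visual tools such as scatter plots and box plots would normally accompany these numbers.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Hypothetical sample data.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson(hours, scores)            # close to +1: strong positive trend
outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 95])
```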

5
Q

What is the Prepare: Pre-Process step?

A

Clean and transform the data. Also known as data munging or data wrangling.
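
One possible shape for a cleaning pass is sketched below: trim whitespace, normalize capitalization, convert text to numbers, and drop rows that cannot be repaired. The record layout (`name`, `age`) and the drop-on-invalid policy are assumptions chosen for the example; real pipelines might impute missing values instead.

```python
def clean_records(records):
    """Trim and title-case names, convert ages to int, and drop rows
    whose name is empty or whose age is missing or not a whole number."""
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        age_text = (rec.get("age") or "").strip()
        if not name or not age_text.isdigit():
            continue  # skip rows that cannot be repaired
        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned


# Hypothetical messy input.
raw = [
    {"name": "  alice ", "age": "34"},
    {"name": "BOB", "age": ""},
    {"name": "carol", "age": " 29 "},
    {"name": "dave", "age": "n/a"},
]
tidy = clean_records(raw)
```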

6
Q

Why would you scale / normalize data?

A

To equalize the contributions of variables with different magnitudes, so that no variable dominates simply because its values are numerically larger.
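
Two common ways to do this are min-max scaling (map each variable onto the range 0 to 1) and z-score standardization (subtract the mean, divide by the standard deviation). The sketch below assumes each list contains at least two distinct values; the income and age figures are invented for illustration.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical variables on very different scales.
incomes = [30000, 60000, 90000]
ages = [20, 40, 60]

scaled_incomes = min_max_scale(incomes)
scaled_ages = min_max_scale(ages)
standardized_ages = z_score(ages)
```

After scaling, both variables occupy the same range, so a distance-based method would weigh them equally.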

7
Q

How might you do feature selection?

A

Remove redundant or irrelevant features, combine existing features, or add new ones.
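
As a small sketch of two of these moves, the code below removes a feature that never varies and adds a new feature by combining two existing ones. The column names, the zero-variance rule, and the choice of body mass index as the combined feature are all assumptions made for the example.

```python
import statistics


def drop_constant_features(rows):
    """Remove columns whose values never vary, since they carry no information."""
    keep = [
        key for key in rows[0]
        if statistics.pvariance([row[key] for row in rows]) > 0
    ]
    return [{key: row[key] for key in keep} for row in rows]


def add_bmi(rows):
    """Combine height and weight into one new feature (body mass index)."""
    return [
        {**row, "bmi": round(row["weight_kg"] / row["height_m"] ** 2, 1)}
        for row in rows
    ]


# Hypothetical records; "planet" is the same for everyone.
people = [
    {"height_m": 1.6, "weight_kg": 64.0, "planet": 3},
    {"height_m": 2.0, "weight_kg": 80.0, "planet": 3},
]
selected = add_bmi(drop_constant_features(people))
```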

8
Q

What is dimension reduction?

A

Find a smaller set of features that captures most of the variation in the data. A common method is Principal Component Analysis (PCA), which builds the new features as linear combinations of the original ones.
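
To make the idea behind Principal Component Analysis concrete, here is a rough pure-Python sketch that finds the first principal component by power iteration on the covariance matrix and then projects 2-D points onto it, reducing two features to one. Power iteration is a technique swapped in here for simplicity (libraries typically use an eigendecomposition or SVD), and the sample points and function names are hypothetical. It assumes the data is not constant.

```python
import math


def column_means(rows):
    """Mean of each column."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def first_principal_component(rows, iterations=200):
    """Unit vector along which the centered data varies the most."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def project(rows, component):
    """Reduce each row to a single coordinate along the component."""
    means = column_means(rows)
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(r)))
        for r in rows
    ]


# Hypothetical points lying almost exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
pc1 = first_principal_component(points)
reduced = project(points, pc1)  # one number per point instead of two
```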

9
Q

Why is data preparation so important?

A

Garbage in, garbage out: poor-quality input data leads to poor-quality results.

10
Q

What is a regression model?

A

Predicts a numeric value, e.g., a stock price.
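
A minimal sketch of one regression technique, ordinary least squares with a single input variable, is shown below. The day numbers and prices are invented and perfectly linear, which real prices are not; the point is only the shape of fit-then-predict.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical training data: price observed on days 1 to 4.
days = [1, 2, 3, 4]
prices = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(days, prices)
forecast = slope * 5 + intercept  # predicted price on day 5
```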

11
Q

What is a classification model?

A

Predicts the category of a thing, e.g., a weather category or the class of an image.
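
One of the simplest classifiers is nearest neighbor: give a new item the label of the most similar labeled example. The sketch below assumes Python 3.8+ (for `math.dist`); the weather readings and labels are hypothetical, and in practice the features would be scaled first.

```python
import math


def nearest_neighbor_label(train, query):
    """Return the label of the training point closest to `query`."""
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


# Hypothetical labeled examples: (temperature_c, humidity_percent) -> label.
train = [
    ((30.0, 20.0), "sunny"),
    ((28.0, 30.0), "sunny"),
    ((15.0, 90.0), "rainy"),
    ((12.0, 85.0), "rainy"),
]

cool_and_humid = nearest_neighbor_label(train, (14.0, 80.0))
hot_and_dry = nearest_neighbor_label(train, (29.0, 25.0))
```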

12
Q

What is association analysis?

A

Finds associations between items, e.g., market basket analysis.
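
Basket analysis is usually described with two measures: support (the fraction of baskets containing an item set) and confidence (how often the consequent appears in baskets that contain the antecedent). The sketch below computes both for made-up baskets; the function names are illustrative, not taken from any library.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

bread_butter_support = support(baskets, {"bread", "butter"})
bread_to_butter = confidence(baskets, {"bread"}, {"butter"})
```

Here bread and butter appear together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain butter.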

13
Q

What is graph analysis?

A

Analyzes data with a graph structure (entities and the connections between them), e.g., social networks or disease transmission.
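
As one example of a graph question, the sketch below uses breadth-first search to count how many contacts separate each person from a starting person, which loosely mirrors how far something could spread through a network. The contact network and names are hypothetical.

```python
from collections import deque


def hops_from(graph, start):
    """Breadth-first search: fewest edges from `start` to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Hypothetical contact network stored as an adjacency list.
contacts = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
    "zed": [],  # isolated: no contacts
}
spread = hops_from(contacts, "ann")
```

Nodes that cannot be reached from the start, such as `zed` here, simply do not appear in the result.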

14
Q

How do you evaluate classification and regression models?

A

Compare the predicted values with the actual (known) values.
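
Two widely used ways to make that comparison are accuracy for classification (the share of predictions that match) and root mean squared error for regression (the typical size of the miss). The sketch below is a minimal version of each; the actual and predicted values are invented for illustration.

```python
import math


def accuracy(actual, predicted):
    """Fraction of predictions that exactly match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted numbers."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical classification results: 3 of 4 labels are correct.
acc = accuracy(
    ["sunny", "rainy", "sunny", "rainy"],
    ["sunny", "rainy", "rainy", "rainy"],
)

# Hypothetical regression results: errors of 1, 0, and 2.
err = rmse([10.0, 12.0, 14.0], [11.0, 12.0, 16.0])
```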

15
Q

What are some JavaScript visualization tools for the web?

A

D3, Leaflet for maps, Timeline for timelines.
