Quiz 1 Flashcards

Question 1

Q

Steps of the data science pipeline

Answer

A

Data selection
Data Preprocessing
Data Transformation
Data Mining
Evaluation/Interpretation
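
The five steps above can be sketched on a toy data set. This is only an illustration: the records, the field names (`height_cm`, `weight_kg`), and the "pattern" being mined are all made up for the example and are not part of the quiz material.

```python
# Toy records; all values and field names are hypothetical.
records = [
    {"id": 1, "height_cm": 150.0, "weight_kg": 50.0},
    {"id": 2, "height_cm": 160.0, "weight_kg": None},  # missing value
    {"id": 3, "height_cm": 170.0, "weight_kg": 70.0},
    {"id": 4, "height_cm": 190.0, "weight_kg": 90.0},
]

# 1. Data selection: keep only the attributes relevant to the task.
selected = [(r["height_cm"], r["weight_kg"]) for r in records]

# 2. Data preprocessing: drop rows with missing values.
cleaned = [(h, w) for h, w in selected if h is not None and w is not None]

# 3. Data transformation: min-max scale heights into [0, 1].
heights = [h for h, _ in cleaned]
lo, hi = min(heights), max(heights)
scaled = [((h - lo) / (hi - lo), w) for h, w in cleaned]

# 4. Data mining: a deliberately trivial pattern, the mean weight
#    of the taller half versus the shorter half.
tall = [w for h, w in scaled if h >= 0.5]
short = [w for h, w in scaled if h < 0.5]
pattern = {
    "tall_mean_weight": sum(tall) / len(tall),
    "short_mean_weight": sum(short) / len(short),
}

# 5. Evaluation/interpretation: check whether the pattern says anything useful.
taller_is_heavier = pattern["tall_mean_weight"] > pattern["short_mean_weight"]
print(pattern, taller_is_heavier)
```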

Question 2

Q

Ways to measure central tendency

Answer

A

Mean
Median
Midrange
Mode
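
As a quick check on the four measures, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up, and the midrange is computed by hand because the module has no function for it.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]  # made-up sample

mean = statistics.mean(values)              # arithmetic average
median = statistics.median(values)          # middle of the sorted data
midrange = (min(values) + max(values)) / 2  # midpoint of the two extremes
mode = statistics.mode(values)              # most frequent value

print(mean, median, midrange, mode)
```

With an even number of values, the median is the average of the two middle values (here 3 and 5).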

Question 3

Q

Ways to measure dispersion or spread

Answer

A

Range
Quartiles
Variance
Standard Deviation
Interquartile Range
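
The measures above can be computed with the standard `statistics` module, as in the rough sketch below. The sample is made up. Two assumptions are worth flagging: quartile values depend on the interpolation convention (this uses the module's default "exclusive" method), and the variance shown is the sample variance (divide by n - 1), not the population variance.

```python
import statistics

values = [2, 4, 4, 5, 7, 9, 11]  # made-up sample

value_range = max(values) - min(values)

# Quartiles split the sorted data into four parts; q2 is the median.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # interquartile range: spread of the middle half

variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # square root of the sample variance

print(value_range, (q1, q2, q3), iqr, variance, std_dev)
```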

Question 4

Q

Unsupervised Learning

Answer

A

Includes Clustering

Find groups in data without provided labels

Question 5

Q

Types of supervised learning problems

Answer

A

Regression

Classification
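
Regression predicts a number, while classification predicts a label. The sketch below illustrates the regression side with an ordinary least-squares line fit. The helper name `fit_line` and the data are hypothetical, chosen so that the points lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


years = [1, 2, 3, 4, 5]
yields = [3.0, 5.0, 7.0, 9.0, 11.0]  # made-up data lying on y = 2x + 1

slope, intercept = fit_line(years, yields)
next_yield = slope * 6 + intercept  # numeric prediction for year 6
print(slope, intercept, next_yield)
```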

Question 6

Q

Classifier

Answer

A

Learns a pattern from labeled data that predicts which class a new data instance falls into
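
One very simple classifier is a nearest-centroid rule: average the training points of each class, then assign a new point to the class whose average is closest. The sketch below assumes 2-D points and made-up labels; `fit_centroids` and `predict` are hypothetical helper names, not a library API.

```python
import math


def fit_centroids(points, labels):
    """Return the mean (x, y) position of each class in the training data."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }


def predict(centroids, point):
    """Pick the class whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


train_points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
train_labels = ["small", "small", "small", "large", "large", "large"]

centroids = fit_centroids(train_points, train_labels)
print(predict(centroids, (7, 7)))  # a new, unlabeled instance
```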

Question 7

Q

What is clustering of data points used for?

Answer

A

Anomaly detection
Groups points based on the similarities between them
Does not require labeled data
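
A minimal sketch of how clustering can support anomaly detection, with no labels involved: group points that lie close together, then flag points that end up in very small groups. The grouping used here is a simple distance-threshold (single-linkage style) rule chosen for brevity, not the only way to cluster. The points, `max_gap`, and the minimum cluster size are made-up assumptions.

```python
import math


def cluster_by_distance(points, max_gap):
    """Group points so that a point within max_gap of any cluster member joins that cluster."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            close = any(math.dist(p, q) <= max_gap for q in c)
            (near if close else far).append(c)
        # Merge the new point with every cluster it is close to.
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (20, 0)]
clusters = cluster_by_distance(points, max_gap=1.5)

# Treat points in clusters smaller than 2 as anomalies (an arbitrary cutoff).
anomalies = [p for c in clusters if len(c) < 2 for p in c]
print(anomalies)
```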

Question 8

Q

Supervised learning examples

Answer

A

Examine a web page, and classify whether the content on the web page should be considered “child friendly” or “adult.”

In farming, given data on crop yields over the last 20 years, learn to predict next year’s crop yields.

Learn from historical data and determine whether a new user will respond to an ad campaign (or not).

Question 9

Q

True or false: Data discretization is part of data reduction

Question 10

Q

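True or false: A scatter plot is not an effective graphical method to look for correlation between two numerical variables

Answer

A

False. A scatter plot is a standard way to look for correlation between two numerical variables.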

Question 11

Q

Truths about correlation

Answer

A

The correlation between two features lies in the range [-1, 1]

If the correlation equals 1, the two features are perfectly positively correlated

If the correlation equals -1, the two features are perfectly negatively correlated

If the correlation equals 0, the two features have no linear correlation
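
These cases can be checked with a small Pearson correlation function. This is a sketch assuming the Pearson coefficient is the correlation meant here; the function name `pearson` and the data are made up. The third data set shows that a coefficient of 0 rules out only a linear relationship: y is fully determined by x, yet the coefficient is 0.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]

r_positive = pearson(xs, [2, 4, 6, 8, 10])  # y = 2x: perfectly positive
r_negative = pearson(xs, [10, 8, 6, 4, 2])  # y = 12 - 2x: perfectly negative
r_zero = pearson(xs, [4, 1, 0, 1, 4])       # y = (x - 3)^2: no linear trend

print(r_positive, r_negative, r_zero)
```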

Question 12

Q

Scatter plot

Answer

A

Can handle multiple Y values per X value

Question 13

Q

Bar Chart

Answer

A

Good for categorical X values and cases where the Y value is ratio scaled.

Question 14

Q

Line Graph

Answer

A

Implies that the connection between consecutive data points is meaningful (for example, a trend over time)

(14 cards)