Machine Learning Flashcards

1
Q

What is Machine Learning?

A

Machine learning finds patterns in data, then uses those patterns to make predictions about new data

2
Q

What is the “Learning” in Machine Learning?

A

Identifying patterns and then recognizing those patterns when you see them again

3
Q

What is the simple Machine Learning workflow?

A
  • Feed data that contains patterns
  • Machine learning algorithms find patterns
  • This outputs a “model” that recognizes patterns
  • Applications can use the model to get the probability that new data matches a pattern
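The workflow above can be sketched in a few lines of Python. This is only an illustration, not a real algorithm from any library: the data, the `train` function, and the 0.5 fallback for unseen values are all hypothetical.

```python
from collections import Counter

def train(rows):
    """Find the pattern: how often each feature value appeared with a positive label."""
    seen = Counter()
    positive = Counter()
    for value, label in rows:
        seen[value] += 1
        if label:
            positive[value] += 1

    def model(value):
        """Return the probability of a match for the given feature value."""
        if seen[value] == 0:
            return 0.5  # assumption: no evidence either way for unseen values
        return positive[value] / seen[value]

    return model

# Hypothetical data containing a pattern: (time of day, was it fraud?)
data = [
    ("night", True), ("night", True), ("night", False),
    ("day", False), ("day", False), ("day", True), ("day", False),
]

model = train(data)       # the algorithm outputs a "model"
print(model("night"))     # an application asks the model for a probability
print(model("day"))
```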
4
Q

Good machine learning requires what?

A
  • Lots of data
  • Lots of computing power
  • Effective algorithms
5
Q

Who cares about Machine Learning?

A
  • Business leaders - want solutions to problems
  • Software devs - create better applications
  • Data Scientists
6
Q

What is a Data Scientist?

A
  • Someone familiar with statistics
  • Familiar with machine learning software (and able to write code for it)
  • Familiar with some problem domain (ideally)
7
Q

Who are some Machine Learning vendors?

A
  • SAS Analytics
  • RapidMiner Studio
  • Alteryx Analytics

Also “megavendors” and cloud providers:
  • IBM, SAP, Oracle, Microsoft (Azure), Amazon

8
Q

What is R?

A
  • Open source programming language and environment for statistical computing, widely used for machine learning
  • Very popular, with many available packages
  • Has been around a long time, since the 1990s
9
Q

What are the 3 tenets of Machine Learning?

A
  • Are you asking the right question?
  • Do you have the data you need to answer that question?
  • How do you measure success? How do you know when you are done?
10
Q

What is the Machine Learning pipeline?

A
  • Start with raw data and someone with domain knowledge
  • Pre-process the raw data into prepared data, often over many iterations, until it is ready
  • Apply learning algorithm(s)
  • Get Candidate Model and iterate to find best model
  • Deploy the model
  • Recreate model regularly based on new data and changing world
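As a rough illustration of the pre-processing step in this pipeline, the sketch below turns raw rows into prepared rows. The column names (`age`, `income`) and the rule of dropping rows with missing values are assumptions made for the example; real preparation depends on the data and the domain.

```python
def prepare(raw_rows):
    """Drop rows with missing values and convert text fields to numbers."""
    prepared = []
    for row in raw_rows:
        if row.get("age") in (None, "") or row.get("income") in (None, ""):
            continue  # assumption: incomplete rows are simply discarded
        prepared.append({"age": int(row["age"]), "income": float(row["income"])})
    return prepared

# Hypothetical raw data, as it might arrive from a form or CSV export
raw = [
    {"age": "34", "income": "52000"},
    {"age": "", "income": "61000"},
    {"age": "45", "income": None},
    {"age": "29", "income": "48000.50"},
]

prepared = prepare(raw)
print(prepared)
```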
11
Q

What is Training Data?

A
  • Prepared data used to create a model

- Creating a model is “training” a model

12
Q

What is Supervised vs Unsupervised Learning?

A
  • Supervised - the value you want to predict is in the data; the data is “labeled”
  • Unsupervised - the value you want to predict is not in the data; the data is “not labeled”
  • Supervised is most common
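The difference shows up in the shape of the data. In this small sketch the rows, the `churned` column, and the `is_labeled` helper are all made up for illustration.

```python
# Labeled data: each row carries the value we want to predict ("churned").
labeled_rows = [
    {"age": 34, "monthly_spend": 20.0, "churned": False},
    {"age": 51, "monthly_spend": 75.5, "churned": True},
]

# Unlabeled data: the same kind of rows, but without the value to predict.
unlabeled_rows = [
    {"age": 29, "monthly_spend": 18.0},
    {"age": 62, "monthly_spend": 80.0},
]

def is_labeled(rows, target):
    """True when every row contains the target column."""
    return all(target in row for row in rows)

print(is_labeled(labeled_rows, "churned"))
print(is_labeled(unlabeled_rows, "churned"))
```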
13
Q

What are Features in Machine Learning?

A
  • “Features” are basically columns of the data
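A minimal sketch of features as columns, using a hypothetical housing table. Treating `price` as the value to predict and keeping it separate from the feature columns is a common convention, assumed here rather than taken from the card.

```python
# Hypothetical table: each dict is a row, each key is a column.
rows = [
    {"sqft": 850, "bedrooms": 2, "price": 200_000},
    {"sqft": 1200, "bedrooms": 3, "price": 310_000},
]

target = "price"  # assumption: the column we want to predict
feature_names = [name for name in rows[0] if name != target]

# One list of feature values per row, plus the matching target values.
X = [[row[name] for name in feature_names] for row in rows]
y = [row[target] for row in rows]

print(feature_names)
print(X)
print(y)
```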
14
Q

What are categories of Machine Learning problems?

A
  • Regression (supervised) - how many will I sell next month?
  • Classification (supervised) - is this credit card transaction fraud? Returns a probability
  • Clustering (unsupervised) - what are our customer segments?
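To make the regression case concrete, here is a sketch that fits a straight line by ordinary least squares and uses it to estimate next month's sales. The sales figures are invented, and a straight line is only one of many possible regression models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: month number and units sold
months = [1, 2, 3, 4]
units_sold = [10, 12, 14, 16]

slope, intercept = fit_line(months, units_sold)
next_month = slope * 5 + intercept  # prediction for month 5
print(next_month)
```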
15
Q

What are types of algorithms?

A
  • Decision tree
  • Neural network
  • Bayesian
  • K-Means (for clustering)
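As one example from this list, K-Means can be sketched on one-dimensional data. This is a simplified version for illustration: the starting centers are supplied by hand and the loop runs a fixed number of iterations, whereas library implementations choose starting points and stopping rules more carefully.

```python
def k_means_1d(points, centers, iterations=10):
    """Simplified k-means on 1-D points; initial centers are supplied by the caller."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical monthly spend per customer, with two obvious segments
spend = [10, 12, 11, 50, 52, 51]
centers, clusters = k_means_1d(spend, centers=[10, 50])
print(centers)
print(clusters)
```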
16
Q

What are steps for training a model with Supervised Learning?

A
  • Hire Data Scientist
  • Choose features
  • Send data to chosen algorithm
  • Only send 75% of data
  • Generate Candidate Model
  • Test the candidate model against the remaining 25% of the data
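The 75% / 25% split might look like the sketch below. The data, the threshold "algorithm", and the helper names are hypothetical stand-ins; the point is only that the candidate model is built from one part of the data and scored on the part it never saw.

```python
import random

def split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and split it into training and test sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def train_threshold(train_rows):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, label in train_rows if label]
    neg = [x for x, label in train_rows if not label]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x > threshold

def accuracy(model, rows):
    """Fraction of rows where the model's answer matches the label."""
    return sum(model(x) == label for x, label in rows) / len(rows)

# Hypothetical labeled data: (transaction amount, is it fraud?)
data = [(float(a), a >= 500) for a in range(100, 900, 50)]

train_rows, test_rows = split(data)          # 75% for training, 25% held back
candidate = train_threshold(train_rows)      # generate the candidate model
print(accuracy(candidate, test_rows))        # test it on the unseen 25%
```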