Modeling - ML Models Flashcards

1
Q

XGBoost

A
  • Extreme Gradient Boosted Trees
    • Boosted group of decision trees
    • new trees made to correct errors of previous trees
    • uses gradient descent to minimize loss as new trees are added
  • Classification or regression (using regression trees)
  • regularization term penalizes complexity of each tree
  • a node is split only if the split yields a positive reduction in the loss function
  • gamma (minimum loss reduction) is the complexity cost charged for each additional leaf; a split must reduce the loss by more than gamma to be kept
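
The split rule and the gamma penalty can be sketched with the split-gain formula from the XGBoost paper. This is an illustration, not the library's code: `split_gain` and `should_split` are hypothetical helper names, `g_*` and `h_*` are assumed to be the sums of first and second derivatives of the loss over the examples in each child, and `lam` is the L2 regularization weight.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Loss reduction from splitting one leaf into two children (sketch)."""
    def score(g, h):
        # Structure score of a single leaf with gradient sum g and hessian sum h.
        return g * g / (h + lam)

    gain = 0.5 * (
        score(g_left, h_left)
        + score(g_right, h_right)
        - score(g_left + g_right, h_left + h_right)
    )
    # gamma is the cost charged for the extra leaf the split creates.
    return gain - gamma


def should_split(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Keep the split only if the penalized gain is positive."""
    return split_gain(g_left, h_left, g_right, h_right, lam, gamma) > 0
```

Raising `gamma` makes the same candidate split less likely to be accepted, which is how it limits tree complexity.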
1
Q

Logistic Regression

A

Linear classification model (supervised)
Models the probability of each possible outcome with the logistic (sigmoid) function
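
A minimal sketch of how the logistic function turns a weighted sum of features into a probability. The function names, weights, and the 0.5 threshold are illustrative assumptions; in practice the weights would be learned from data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 class label."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0
```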

2
Q

K-means

A
  • method for grouping n observations into K clusters
  • each observation belongs to the cluster with the nearest mean
  • Unsupervised
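
The assign-to-nearest-mean idea can be sketched as Lloyd's algorithm on 2-D points. This is a simplified illustration: the function name and the choice to pass the initial centroids in explicitly are assumptions, and real implementations add smarter initialization and convergence checks.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    (
                        sum(p[0] for p in members) / len(members),
                        sum(p[1] for p in members) / len(members),
                    )
                )
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, labels
```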
3
Q

Linear Regression

A

Supervised
Regression Model

4
Q

SVM

A
  • Supervised learning models for classification or regression
  • finds a hyperplane in N-dimensional space that distinctly classifies the datapoints
  • If classes can’t be separated by a straight line (a linear boundary), a non-linear kernel is needed to find a separating hyperplane in a transformed feature space
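
A small sketch of why a non-linear mapping helps. The 1-D points below cannot be split by a single threshold, but after mapping x to (x, x²) a linear boundary separates them. The feature map is written out explicitly for illustration, and the weights are hand-picked rather than learned; a real kernel SVM computes inner products through a kernel function without building the mapped features.

```python
def feature_map(x):
    """Explicit non-linear map from 1-D input to 2-D feature space."""
    return (x, x * x)


def decision(x, w=(0.0, 1.0), b=-1.0):
    """Linear decision rule applied in the mapped space (hand-picked w, b)."""
    phi = feature_map(x)
    score = w[0] * phi[0] + w[1] * phi[1] + b
    return 1 if score >= 0 else -1


# Outer points belong to class +1, inner points to class -1.
samples = [-2.0, -0.5, 0.5, 2.0]
labels = [1, -1, -1, 1]
predictions = [decision(x) for x in samples]
```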
5
Q

Decision Trees

A
  • flowchart-like structure in which each internal node represents a test on an attribute and each leaf node represents a class label
  • paths from root to leaf represent classification rules
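
The root-to-leaf idea can be sketched with a hand-built tree stored as nested dictionaries. The attributes, values, and labels here are made up for illustration; a trained tree would choose its tests from data.

```python
# Internal nodes test one attribute; leaves carry a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {
                True: {"label": "stay in"},
                False: {"label": "play"},
            },
        },
        "rainy": {"label": "stay in"},
    },
}


def classify(node, sample, path=None):
    """Follow attribute tests from the root to a leaf.

    Returns the class label and the list of (attribute, value) tests taken,
    which together form the classification rule for this sample.
    """
    path = [] if path is None else path
    if "label" in node:
        return node["label"], path
    value = sample[node["attribute"]]
    path.append((node["attribute"], value))
    return classify(node["branches"][value], sample, path)
```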
6
Q

Random Forest

A
  • ensemble of decision tree classifiers
  • each tree is grown from an independently drawn random sample of the training data (and, typically, a random subset of features at each split)
  • tree classifiers are then combined by averaging probabilistic predictions
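
A sketch of the two ensemble ingredients, assuming the common bagging setup: each tree sees a bootstrap sample, and the forest averages the trees' class probabilities. The helper names and the per-tree probabilities are hypothetical; the trees themselves are not trained here.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one tree's training set)."""
    return [rng.choice(rows) for _ in rows]


def average_probabilities(per_tree_probs):
    """Average class probabilities predicted by each tree."""
    classes = per_tree_probs[0].keys()
    n = len(per_tree_probs)
    return {c: sum(p[c] for p in per_tree_probs) / n for c in classes}


def forest_predict(per_tree_probs):
    """Pick the class with the highest averaged probability."""
    avg = average_probabilities(per_tree_probs)
    return max(avg, key=avg.get)


# Hypothetical outputs of three trees for one sample.
tree_outputs = [
    {"cat": 0.9, "dog": 0.1},
    {"cat": 0.4, "dog": 0.6},
    {"cat": 0.4, "dog": 0.6},
]
```

In this example two of the three trees lean towards "dog", yet the averaged probabilities favor "cat", which shows how probability averaging can differ from a simple majority vote.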
7
Q

RNN

A
  • Recurrent neural network
  • connections between nodes can create a cycle, allowing output from some nodes to affect subsequent inputs to same nodes
  • Infinite impulse response class of networks
    • terminology borrowed from linear time-invariant (LTI) systems
    • h(t) does not become exactly zero past a certain point, continues indefinitely
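
The feedback loop and its lingering response can be sketched with a single recurrent unit fed one impulse. The weights `w_rec` and `w_in` are arbitrary illustrative values, not trained parameters.

```python
import math


def rnn_impulse_response(steps, w_rec=0.5, w_in=1.0):
    """Hidden state of a one-unit RNN fed a single impulse at t = 0."""
    h = 0.0
    outputs = []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0  # impulse input
        # The previous state h feeds back into the same unit (the cycle).
        h = math.tanh(w_rec * h + w_in * x)
        outputs.append(h)
    return outputs
```

With these weights the state shrinks at every step but, in exact arithmetic, never reaches zero, which is the infinite impulse response behavior described above.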
8
Q

CNN

A
  • most commonly applied to visual imagery
  • slides convolution kernels over the input to produce feature maps, typically mapping a high-dimensional input to a lower-dimensional representation
  • finite impulse response class of networks
    • impulse response does become exactly zero at times t > T for some finite T
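
The finite impulse response property can be sketched with a 1-D causal convolution. The kernel values are arbitrary; the point is that the response to a single impulse is exactly zero once the kernel has passed over it.

```python
def conv1d_causal(signal, kernel):
    """Causal 1-D convolution: y[t] = sum over k of kernel[k] * signal[t - k]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * signal[t - k]
        out.append(acc)
    return out


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = conv1d_causal(impulse, [0.5, 0.3, 0.2])
```

Here the response reproduces the three kernel values and is exactly zero from the fourth step onward, so T equals the kernel length.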
9
Q

Collaborative Filtering

A
  • Technique for recommender systems
  • makes automatic predictions about a user’s interests by collecting preference or taste information from many users (collaborating)
  • if person A has the same opinion as person B on one issue, A is more likely to share B’s opinion on a different issue than that of a randomly chosen person
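
A minimal user-based sketch of that idea: a user's missing rating is estimated from other users' ratings, weighted by how similar their past ratings are. The users, items, ratings, and function names are made up for illustration, and cosine similarity is just one possible choice of similarity measure.

```python
import math

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"book": 5, "film": 4, "game": 1},
    "bob": {"book": 5, "film": 5, "game": 1, "album": 5},
    "carol": {"book": 1, "film": 2, "game": 5, "album": 1},
}


def similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in shared)
    norm_a = math.sqrt(sum(ratings[a][i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(ratings[b][i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    neighbors = [
        (similarity(user, other), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    total = sum(s for s, _ in neighbors)
    if not total:
        return None
    return sum(s * r for s, r in neighbors) / total
```

Because alice's past ratings resemble bob's far more than carol's, her predicted rating for "album" lands closer to bob's 5 than to carol's 1.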
10
Q

Semantic Segmentation

A
  • deep learning algorithm that associates a label or category with every pixel in an image
  • used to recognize a collection of pixels that form distinct categories
  • try to draw a boundary around every object and know pixel level details
  • labeling every pixel in image and knowing to which class it belongs
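
The per-pixel labeling step can be sketched as an argmax over class scores. This assumes a model has already produced a score per class for every pixel; the class names and scores below are hypothetical.

```python
CLASSES = ["background", "road", "car"]


def label_pixels(scores):
    """scores[row][col] is a list of per-class scores for that pixel.

    Returns a grid of the same shape holding the winning class name.
    """
    return [
        [CLASSES[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in scores
    ]


# Hypothetical scores for a 2x2 image.
pixel_scores = [
    [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.1, 0.7, 0.2]],
]
mask = label_pixels(pixel_scores)
```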
11
Q

Instance Segmentation

A

Segments the image and distinguishes separate instances of the same class (for example, two cars receive two different masks)

12
Q

Linear Learner

A

SageMaker built-in algorithm
Supervised learning algorithm used for classification or regression
For regression: essentially linear regression.
For classification: a linear threshold function is used. Can do binary or multi-class.
Trained with stochastic gradient descent (SGD)
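
A generic sketch of fitting a one-feature linear model with stochastic gradient descent. This is not SageMaker's implementation; the function name, learning rate, epoch count, and toy data are illustrative assumptions.

```python
import random


def sgd_linear_regression(data, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by SGD on squared error, one sample at a time."""
    rng = random.Random(seed)
    rows = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(rows)  # visit the samples in random order each epoch
        for x, y in rows:
            error = (w * x + b) - y
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * x
            b -= lr * error
    return w, b


# Toy data generated from y = 2x + 1.
toy_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w_fit, b_fit = sgd_linear_regression(toy_data)
```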

13
Q

DeepAR

A

SageMaker built-in algorithm
Forecasting algorithm
Forecasts scalar (one-dimensional) time series using a recurrent neural network (RNN).

14
Q

Random Cut Forest

A

For anomaly detection
Unsupervised
Can detect unexpected spikes in time series data

15
Q

KNN

A

K-Nearest Neighbors
supervised
simple classification or regression algorithm.
Find K closest points to a sample point and return most frequent label or average value
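
The find-the-K-closest-points rule can be sketched in a few lines. The function names and the use of Euclidean distance are assumptions made for illustration; training data is a list of (point, target) pairs.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the most frequent label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Return the average target value of the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)
```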

16
Q

PCA

A

Dimensionality reduction
Unsupervised

17
Q

Factorization Machines

A

dealing with sparse data
good for item recommendations
supervised
classification or regression
pair-wise interactions
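
The pair-wise interaction term can be sketched with the standard second-order factorization machine formula: a bias, a linear term, and an interaction weight for each feature pair given by the dot product of the two features' latent vectors. The function name and all parameter values are illustrative, and the double loop is written for clarity rather than speed.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector.

    x  : feature values (often sparse, mostly zeros)
    w0 : global bias
    w  : one linear weight per feature
    v  : one latent vector per feature
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            # Interaction weight is the dot product of the latent vectors.
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise
```

Pairs that include a zero feature contribute nothing to the sum, which is why the model suits sparse inputs such as user and item indicators.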

18
Q

BlazingText

A

provides highly optimized implementations of the Word2Vec and text classification algorithms
- Word2Vec (unsupervised): word embeddings for downstream tasks such as sentiment analysis, entity recognition, translation
- text classification (supervised): predicts labels for a sentence
- text classification is used in web searches, information retrieval, ranking, document classification

19
Q

Sequence2Sequence

A
  • supervised algorithm
  • input is sequence of tokens
  • output generated is another sequence of tokens
  • works well for text summarization
  • encodes an input sequence of tokens, such as words, and decodes it into the output sequence
20
Q

Object2Vec

A
  • generalizes the Word2Vec word-embedding technique that is optimized in the BlazingText algorithm
  • like Word2Vec but with arbitrary objects
21
Q

Neural Topic Model (NTM)

A

Topic modeling algorithm

22
Q

Latent Dirichlet Allocation (LDA)

A

Topic modeling algorithm

23
Q

Amazon Comprehend

A

Advanced text analytics (Use natural language processing to extract insights & relationships from unstructured text)

24
Amazon CodeGuru
Automated code reviews (Automate code reviews & identify your most expensive lines of code)
25
Amazon Lex
Chatbots (Easily build conversational agents to improve customer service & increase contact center efficiency)
26
Amazon Forecast
Demand forecasting (Build accurate forecasting models on the same machine learning forecasting technology used by Amazon.com)
27
Amazon Textract
Document analysis (Automatically extract text and data from millions of documents in just hours, reducing manual efforts)
28
Amazon Kendra
Enterprise search (Add natural language search capabilities to your apps so users can find the information they need more easily)
29
Amazon Fraud Detector
Fraud prevention (Identify potentially fraudulent online activities based on the same technology used at Amazon.com)
30
Amazon Rekognition
Image and video analysis (Add image and video analysis to your applications to catalog assets, automate media workflows, and extract meaning)
31
Amazon Personalize
Personalized recommendations (Personalize experiences for your customers using machine learning technology perfected from years of use on Amazon.com)
32
Amazon Translate
Real-time translation (Expand your reach through efficient and cost-effective translation to reach audiences in multiple languages)
33
Amazon Polly
Text to speech (Turn text into life-like speech to give voice to your applications)
34
Amazon Transcribe
Transcription (Easily add high-quality speech-to-text capabilities to your applications and workflows)
35