3: Machine Learning Flashcards
Data are shuffled randomly and then divided into k equal subsamples.
One subsample is held out as the validation sample, and the other k-1 subsamples are used as training samples
K-fold cross validation
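A minimal Python sketch of the split described above (illustrative only; the function name `k_fold_splits` is invented for this example):

```python
import random

def k_fold_splits(data, k, seed=0):
    """Shuffle the data, cut it into k equal folds, and yield
    (training, validation) pairs -- each fold serves once as the
    validation sample while the other k-1 folds form the training set."""
    data = list(data)
    random.Random(seed).shuffle(data)
    fold_size = len(data) // k
    folds = [data[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation
```

Each of the k passes trains on k-1 folds and validates on the remaining one, so every observation is used for validation exactly once.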
Technique of combining predictions from a number of models, with the objective of canceling out noise
Ensemble Learning
Results in: more accurate & stable predictions (vs a single model)
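The noise-canceling idea can be sketched in a few lines (the three "models" below are invented toy functions whose errors offset by construction):

```python
from statistics import mean

def ensemble_predict(models, x):
    """Average the predictions of several models; roughly independent
    errors partially cancel, giving a more stable combined prediction."""
    return mean(m(x) for m in models)

# Three toy "models" of the true relation f(x) = 2x, each off by some noise
models = [lambda x: 2 * x + 0.3,
          lambda x: 2 * x - 0.2,
          lambda x: 2 * x - 0.1]
```

At x = 5 the individual models miss by 0.3, 0.2, and 0.1, but their average hits the true value 10.0 because the noise terms sum to zero here.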
- Nodes connected by links
- Useful in: Supervised Regression & Classification models
- Works well in presence of: nonlinearities & complex interactions among variables
- Recognizes: patterns, clusters, and classifies
Neural Networks
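A minimal sketch of "nodes connected by links" (weights and layer sizes are invented for illustration; the sigmoid activation supplies the nonlinearity that lets networks handle complex interactions):

```python
import math

def neuron(inputs, weights, bias):
    """One node: a weighted sum of its inputs passed through a
    sigmoid activation, which introduces the nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

# Two hidden nodes feeding one output node: a tiny two-layer network
def tiny_network(x1, x2):
    h1 = neuron([x1, x2], [1.0, -1.0], 0.0)
    h2 = neuron([x1, x2], [-1.0, 1.0], 0.0)
    return neuron([h1, h2], [2.0, 2.0], -2.0)
```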
Unsupervised neural networks with many hidden layers (often >20); through reinforcement learning they learn from their own prediction errors
Used for: complex tasks; image, pattern, & character recognition
Deep Learning Networks
- Algorithm learns from success & mistakes
- Seeking to maximize reward and minimize punishment
- Defined constraints
Reinforcement Learning
Inputs & outputs are identified for the computer, and the algorithm uses this labeled training data to model relationships
Supervised Learning
Computer is provided unlabeled data that the algorithm uses to determine the structure of the data
Unsupervised Learning
Least Absolute Shrinkage and Selection Operator (LASSO) is useful in building:
- Penalized regression models
- Parsimonious models, through feature reduction
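The feature reduction comes from LASSO's soft-thresholding operator, used in coordinate-descent updates of each coefficient (a minimal sketch of the operator itself, not a full LASSO fit):

```python
def soft_threshold(z, lam):
    """LASSO shrinkage: a coefficient whose magnitude falls below the
    penalty lam is set exactly to zero -- this is what drops
    unimportant features and yields a parsimonious model."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```

Small coefficients are zeroed out entirely (feature dropped), while large ones are merely shrunk toward zero.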
K-Nearest Neighbor, investment application includes:
Used in: classification & regression
- predicting bankruptcy
- assigning bond rating classes
- predicting stock prices
- creating customized indices
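A toy sketch of KNN classification, e.g. assigning a rating class by majority vote of the k nearest training points (the data, labels, and function name are invented for the example):

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training
    points (Euclidean distance). `train` is a list of
    (feature_vector, label) pairs."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```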
Random Forest investment applications include:
- factor-based asset allocation
- prediction models for IPO success
A penalized regression model assumes linear relationships and tries to use a limited number of the most important features that…
explain the variation in the dependent variable
Example: monthly returns on 100 stocks
Overfitting occurs when:
the model fits the training data too well, treating noise as signal (often by displaying spurious nonlinear characteristics)
Bias error: low
Variance error: high
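The low-bias / high-variance pattern can be shown by contrasting a model that memorizes the training data with a simple least-squares line (the data points are invented; the true relation is taken to be y = 2x):

```python
from statistics import mean

# Training data from the true relation y = 2x, with noise of +/- 0.5
train = [(1, 2.5), (2, 3.5), (3, 6.5), (4, 7.5)]
xs, ys = zip(*train)

# Overfit model: 1-nearest-neighbour lookup that memorizes training points
def overfit(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

# Simple model: ordinary least-squares line (more bias, less variance)
slope = (sum((x - mean(xs)) * (y - mean(ys)) for x, y in train)
         / sum((x - mean(xs)) ** 2 for x in xs))
intercept = mean(ys) - slope * mean(xs)
def linear(x):
    return slope * x + intercept

# Zero training error: the overfit model fits the training data "too well"
train_err = mean(abs(overfit(x) - y) for x, y in train)
```

Out of sample (say x = 2.5, true y = 5.0) the memorizing model inherits the noise of whichever training point is nearest, while the line averages the noise away.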
Generalization is the degree to which the model retains its explanatory power when:
predicting out-of-sample
Bias error is the degree to which:
the model fits the training data
Variance error shows how much the model responds to:
new data