Machine Learning - Reading 7 Flashcards
What are target variables
this is the dependent variable and can be continuous, categorical or ordinal
What are features
these are the independent variables
what is a training data set
this is the sample used to fit the model
what is a hyperparameter
this is a model input specified by the researcher
What is unsupervised learning
The ML program is not given labeled training data; instead, inputs are provided without any conclusions about those inputs
what is deep learning
deep learning algorithms are used for complex tasks such as image recognition and natural language processing
what is supervised learning
uses labeled training data to guide the ML algorithms towards superior forecasting accuracy
What is overfitting
Overfitting is an issue with supervised ML that results when a large number of features are included in the data sample. Overfitting has occurred when the noise in the target variable seems to improve the model fit. An overfit model will produce less accurate forecasts on other (out-of-sample) data
what is bias error
This is the in-sample error resulting from a model with a poor fit
what is variance error
This is the out-of-sample error resulting from overfit models that do not generalize well
what is base error
These are residual errors due to random noise
What will a graph of a robust, well generalized model show?
a robust, well-generalizing model will show an improving accuracy rate as the sample size is increased, and the in-sample and out-of-sample error rates will converge toward a desired accuracy level
What is penalized regression
penalized regression models reduce the problem of overfitting by imposing a penalty based on the number of features used in the model
what is LASSO
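LASSO (least absolute shrinkage and selection operator) adds a penalty term equal to the sum of the absolute values of the slope coefficients, so the model minimizes the sum of squared errors plus that penalty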
*automatically eliminates the least predictive features
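A minimal sketch of the quantity LASSO minimizes, assuming a penalty weight called lam (a hyperparameter; the name is chosen here for illustration). It only evaluates the objective for given coefficients and does not fit a model.

```python
def lasso_objective(X, y, intercept, slopes, lam):
    """Sum of squared errors plus lam times the sum of |slope coefficients|."""
    sse = 0.0
    for row, target in zip(X, y):
        prediction = intercept + sum(b * x for b, x in zip(slopes, row))
        sse += (target - prediction) ** 2
    # The intercept is not penalized, only the slope coefficients.
    penalty = lam * sum(abs(b) for b in slopes)
    return sse + penalty
```

A larger lam makes nonzero slopes more expensive, which is why weakly predictive features tend to end up with a coefficient of exactly zero.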
what is a support vector machine
is a linear classification algorithm that separates the data into one of two possible classes
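A minimal sketch of how a linear boundary assigns one of two classes once it has been found. The weights and intercept are hypothetical values, not fitted ones; an actual SVM chooses them so that the margin between the two classes is as wide as possible.

```python
def linear_classify(point, weights, intercept):
    """Return +1 or -1 depending on which side of the boundary the point lies."""
    score = sum(w * x for w, x in zip(weights, point)) + intercept
    return 1 if score >= 0 else -1

# Hypothetical boundary: x1 + x2 - 3 = 0
weights, intercept = (1.0, 1.0), -3.0
```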
what is k-nearest neighbor
classifies a new observation based on its nearness to the observations in the training sample
what is the tradeoff in the specification of k in KNN
when k is too small, you have a high error rate and when it is too large, you dilute the result by averaging across too many outcomes
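A minimal sketch of KNN with a majority vote among the k nearest training observations, assuming Euclidean distance (other distance measures are possible). The toy data are made up to show the dilution effect of a large k.

```python
import math
from collections import Counter

def knn_classify(train_X, train_y, query, k):
    """Label the query by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy training sample: two "A" points near the query, four "B" points far away.
train_X = [(1, 1), (1, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
train_y = ["A", "A", "B", "B", "B", "B"]
query = (1.5, 1.5)
```

With k = 3 the query is labeled "A", but with k = 6 every training point votes and the distant "B" points outvote the nearby "A" points.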
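what is the CART method
Classification and regression trees (CART). Classification trees are appropriate when the target variable is categorical, and are typically used when the target is binary; regression trees are appropriate when the target is continuous. Classification trees assign observations to one of two possible classifications at each node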
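A minimal sketch of how a single node might choose its split on one feature. It assumes Gini impurity as the measure of classification error, which is a common choice but is not stated in the card; the function names are illustrative.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Try each observed value as a cutoff; keep the lowest weighted impurity."""
    best_threshold, best_impurity = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must send observations to both branches
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity
```

A full tree repeats this search at every node, on every feature, until a stopping rule is met.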
what is ensemble and random forest method
Ensemble learning is the technique of combining predictions from multiple models rather than a single model
**The ensemble method results in a lower average error rate because the different models cancel out noise
Random forest is a variant of classification trees whereby a large number of classification trees are trained using data from the same data set
what is the ensemble method of aggregation of heterogeneous learners
Different algorithms are combined together via a voting classifier. The different algorithms each get a vote, and then we go with whichever answer gets the most votes. Ideally, the models selected will have sufficient diversity in approach, resulting in a greater level of confidence in the predictions
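A minimal sketch of a voting classifier. The three model names and their predictions are hypothetical; in practice each vote would come from a separately trained algorithm.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label that receives the most votes."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions for one observation from three different algorithms.
votes = {"svm": "up", "knn": "down", "tree": "up"}
final = majority_vote(list(votes.values()))
```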
what is the ensemble method of aggregation of homogeneous learners
The same algorithm is used, but on different training data. The different training data samples can be derived by bootstrapping
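A minimal sketch of bootstrapping, assuming each new training sample is drawn with replacement and has the same size as the original. The seed argument is only there to make the example repeatable.

```python
import random

def bootstrap_samples(data, n_samples, seed=0):
    """Draw n_samples resamples of the data, each with replacement."""
    rng = random.Random(seed)
    return [rng.choices(data, k=len(data)) for _ in range(n_samples)]
```

Each resample would then be used to train one copy of the same algorithm, and the copies' predictions are combined.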
what is principal component analysis? what are eigenvectors and eigenvalues?
**dimension reduction
PCA: Summarizes the information in a large number of correlated factors into a much smaller set of uncorrelated factors
Eigenvectors: These uncorrelated factors are linear combinations of the original features
Eigenvalue: The proportion of total variance in the data set explained by the eigenvector
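A minimal sketch for the two-feature case only, where the eigenvalues of the covariance matrix have a closed form. It shows that each eigenvalue divided by the sum of the eigenvalues gives the proportion of total variance explained; the function name is illustrative.

```python
import math
from statistics import mean

def explained_variance_2d(xs, ys):
    """Proportion of total variance explained by each of two principal components."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    midpoint = (a + c) / 2
    half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eig_1, eig_2 = midpoint + half_gap, midpoint - half_gap
    total = eig_1 + eig_2
    return eig_1 / total, eig_2 / total
```

When the two features are perfectly correlated, the first component explains essentially all of the variance, so one factor can replace two.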
what is clustering
clustering is the process of grouping observations into categories based on similarities in their attributes (called cohesion)
what is k means clustering
k-means clustering partitions observations into k non-overlapping clusters, where k is a hyperparameter. Each cluster has a centroid, and each new observation is assigned to a cluster based on its proximity to the centroid
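A minimal sketch of the k-means loop, assuming Euclidean distance, a fixed number of iterations, and starting centroids supplied by the caller (k is the number of starting centroids).

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, [(0, 0), (10, 10)])
```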
what is hierarchical clustering
builds a hierarchy of clusters without any predefined number of clusters.
In agglomerative clustering, we start with each observation as its own cluster and add other similar observations to that group, or form another non-overlapping cluster.
A divisive algorithm starts with one giant cluster, and then it partitions that cluster into smaller and smaller clusters
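A minimal sketch of the agglomerative (bottom-up) approach on one-dimensional data. It assumes the distance between two clusters is the distance between their closest members (single linkage), which is one of several possible choices.

```python
def agglomerative_levels(values):
    """Return every level of the hierarchy, from singletons to one cluster."""
    clusters = [[v] for v in values]
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        levels.append([list(c) for c in clusters])
    return levels

levels = agglomerative_levels([1, 2, 10, 11, 50])
```

Because every level is kept, the number of clusters can be chosen afterwards by picking a level rather than being fixed in advance.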
what are neural networks
are constructed as nodes connected by links. The input layer consists of nodes with values for the features. These values are scaled so that the information from multiple nodes is comparable
what are neurons in neural networks
the nodes that follow the input variables (the nodes in the hidden layers)
what is a summation operator in neural networks
collates the information (as a weighted sum of the inputs) and passes it on to an activation function
what is an activation function in neural networks
generates an output value from the input value it receives from the summation operator
what is a forward propagation in neural networks
the process by which each neuron's output value is passed forward to neurons in other hidden layers
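A minimal sketch of forward propagation through one hidden layer and one output neuron. It assumes a sigmoid activation function, which is only one possible choice; all weights in the usage example are made-up values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Summation operator: weighted sum of the incoming values.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: turns the sum into the neuron's output value.
    return sigmoid(total)

def forward(inputs, hidden_layer, output_neuron):
    """hidden_layer is a list of (weights, bias); output_neuron is (weights, bias)."""
    hidden_values = [neuron(inputs, w, b) for w, b in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron(hidden_values, out_weights, out_bias)

# Hypothetical network: two inputs, two hidden neurons, one output.
hidden = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output = ([1.0, -1.0], 0.0)
prediction = forward([1.0, 2.0], hidden, output)
```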
what is a backward propagation in neural networks
process employed to revise the weights in the summation operator
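A minimal sketch of one weight revision for a single neuron, not a full multi-layer backpropagation. It assumes a sigmoid activation, a squared-error loss, and a learning rate of 0.5, none of which are specified in the card.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(inputs, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def update_weights(inputs, target, weights, bias, learning_rate=0.5):
    """Move each weight a small step in the direction that reduces the error."""
    output = predict(inputs, weights, bias)
    error = output - target
    # Gradient of 0.5 * error**2 with respect to the weighted sum.
    grad = error * output * (1.0 - output)
    new_weights = [w - learning_rate * grad * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * grad
    return new_weights, new_bias
```

Repeating this step over many observations is what gradually reduces the prediction error.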
what is deep learning
deep learning algorithms are neural networks with many hidden layers
what is reinforcement learning
reinforcement learning algorithms have an agent that seeks to maximize a defined reward given defined constraints
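A minimal sketch of an agent maximizing a reward under a constraint, using a deliberately simple setting: a few actions with fixed (hypothetical) rewards and a limited number of steps. The agent tries each action once, then repeats whichever has paid best so far. Real reinforcement learning problems usually involve uncertain rewards and changing states.

```python
def run_agent(rewards, n_steps):
    """Explore every action once, then exploit the best average reward seen."""
    n_actions = len(rewards)
    totals = [0.0] * n_actions
    counts = [0] * n_actions
    total_reward = 0.0
    for step in range(n_steps):
        if step < n_actions:
            action = step  # exploration: try each action once
        else:
            # exploitation: pick the action with the best average so far
            action = max(range(n_actions), key=lambda a: totals[a] / counts[a])
        reward = rewards[action]
        totals[action] += reward
        counts[action] += 1
        total_reward += reward
    return total_reward, counts
```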