Intro Flashcards
Overfitting
The production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably.
Underfitting
Occurs when a model is unable to capture the relationship between the input and output variables accurately, producing a high error rate on both the training set and unseen data.
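A minimal numpy sketch contrasting the two (the degrees, sample size, and noise level are arbitrary illustrative choices): a degree-1 polynomial underfits noisy sine data, while a degree-15 polynomial overfits, with low training error but high test error.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 20)
x_test = np.linspace(0, 1, 20) + 0.025  # held-out points
y_train = true_fn(x_train) + rng.normal(0, 0.2, x_train.shape)
y_test = true_fn(x_test) + rng.normal(0, 0.2, x_test.shape)

for degree in (1, 15):  # degree 1 underfits, degree 15 overfits
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```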
Neural network
A biologically inspired mathematical function made up of artificial "neurons" that interact with one another. Neurons are typically drawn as circles, and the arrows between them represent weights that describe the relationships between the neurons.
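A minimal sketch of a feed-forward network in numpy (the 3-4-1 architecture and sigmoid activations are assumptions for illustration); the weight matrices W1 and W2 play the role of the arrows between neurons.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.normal(size=3)        # 3 input neurons
W1 = rng.normal(size=(4, 3))  # weights: input -> hidden (4 neurons)
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))  # weights: hidden -> output (1 neuron)
b2 = np.zeros(1)

hidden = sigmoid(W1 @ x + b1)  # each hidden neuron: weighted sum + activation
output = sigmoid(W2 @ hidden + b2)
print(output)
```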
Sigmoid function
A common activation function. A neuron's weighted sum of inputs is passed through it, and the output serves as an input to the next layer. When a neuron's activation function is a sigmoid, the unit's output is guaranteed to lie between 0 and 1.
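A sketch of the sigmoid, σ(x) = 1 / (1 + e^(−x)); note that every output lies strictly between 0 and 1, even for large inputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(sigmoid(z))  # every value lies strictly between 0 and 1
```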
Recurrent neural networks (RNN)
A neural network made up of identical "units" of neurons that feed into one another in sequence. The most common type of network for time-series or other sequential data.
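A minimal vanilla-RNN sketch in numpy (the sizes are illustrative): the same weights are reused at every timestep, and the hidden state h carries information forward through the sequence.

```python
import numpy as np

rng = np.random.default_rng(2)
W_xh = rng.normal(scale=0.1, size=(5, 3))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(5, 5))  # hidden -> hidden (the recurrence)
b_h = np.zeros(5)

h = np.zeros(5)
sequence = rng.normal(size=(7, 3))  # 7 timesteps, 3 features each
for x_t in sequence:
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)  # same unit applied at each step
print(h)  # final hidden state summarizes the sequence
```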
Long Short-Term Memory (LSTM) networks
A variant of the RNN, distinguished by its gating mechanism, which addresses the short-term memory problem of plain RNNs.
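A sketch of a single LSTM step in numpy (weight shapes are assumptions, and biases are omitted for brevity), showing the forget, input, and output gates acting on the cell state.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
n_in, n_hid = 3, 4
# One weight matrix per gate plus the candidate; shapes are an assumption.
W_f, W_i, W_o, W_c = (rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for _ in range(4))

h = np.zeros(n_hid)  # hidden state
c = np.zeros(n_hid)  # cell state (the "long-term memory")
x = rng.normal(size=n_in)

z = np.concatenate([x, h])
f = sigmoid(W_f @ z)              # forget gate: what to erase from c
i = sigmoid(W_i @ z)              # input gate: what new info to store
o = sigmoid(W_o @ z)              # output gate: what to reveal as h
c = f * c + i * np.tanh(W_c @ z)  # update long-term memory
h = o * np.tanh(c)                # new hidden state
print(h)
```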
Regularization
Any step taken to reduce overfitting (high variance); note that applying too much regularization can instead cause underfitting (high bias).
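One common regularization technique is an L2 (ridge) penalty, sketched here in closed form: adding λ‖w‖² to the loss shrinks the weights relative to ordinary least squares (the penalty strength lam = 0.1 is an arbitrary illustrative value).

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=50)

lam = 0.1
# Closed-form ridge regression: w = (X^T X + lam * I)^(-1) X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(10), X.T @ y)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)  # unregularized, for comparison
print(np.linalg.norm(w_ridge), np.linalg.norm(w_ols))  # ridge norm is smaller
```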
Tensor
Mathematical objects that can be used to describe physical properties, just like scalars and vectors. In fact, tensors are merely a generalization of scalars and vectors: a scalar is a zero-rank tensor, a vector is a first-rank tensor, and a matrix is a second-rank tensor.
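A numpy sketch of tensor ranks; here ndim counts the number of axes, i.e. the rank.

```python
import numpy as np

scalar = np.array(3.0)              # rank 0: magnitude only
vector = np.array([1.0, 2.0, 3.0])  # rank 1: magnitude and direction
matrix = np.eye(3)                  # rank 2
tensor3 = np.zeros((2, 3, 4))       # rank 3
for t in (scalar, vector, matrix, tensor3):
    print(t.ndim, t.shape)
```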
Scalar
A quantity that has only magnitude, no direction.
Vector
A quantity that has both magnitude and direction.
Backpropagation
Short for "backward propagation of errors": an algorithm for supervised learning of artificial neural networks using gradient descent. Given an artificial neural network and an error function, the method calculates the gradient of the error function with respect to the network's weights.
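A sketch of backpropagation on a one-neuron network with squared error, applying the chain rule term by term (the learning rate, target, and initial values are arbitrary illustrative choices): dE/dw = dE/dy · dy/dz · dz/dw.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, target = 1.5, 0.0
w, b, lr = 0.8, 0.1, 0.5

for step in range(20):
    z = w * x + b                    # forward: weighted sum
    y = sigmoid(z)                   # forward: activation
    error = 0.5 * (y - target) ** 2
    # backward pass: chain rule, term by term
    dE_dy = y - target
    dy_dz = y * (1 - y)              # derivative of the sigmoid
    dz_dw = x
    w -= lr * dE_dy * dy_dz * dz_dw  # gradient descent on the weight
    b -= lr * dE_dy * dy_dz          # dz/db = 1
print(error)  # error shrinks as the weights are updated
```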
Gradient descent
An iterative first-order optimization algorithm used to find a local minimum of a given function (its counterpart, gradient ascent, finds a local maximum). This method is commonly used in machine learning (ML) and deep learning (DL) to minimize a cost/loss function (e.g. in a linear regression).
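A sketch of gradient descent minimizing f(x) = (x − 3)², whose derivative is f′(x) = 2(x − 3); the step size of 0.1 is an arbitrary choice.

```python
x = 10.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * (x - 3)         # first-order information only
    x -= learning_rate * grad  # step downhill
print(x)  # converges to the local (here global) minimum at x = 3
```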
Gradient
Calculated by taking the partial derivative of a function with respect to each of its variables; the result is expressed as a vector, which points in the direction of steepest ascent.
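A sketch with a hypothetical function: the gradient of f(x, y) = x² + 3xy is the vector of partials (∂f/∂x, ∂f/∂y) = (2x + 3y, 3x).

```python
def grad_f(x, y):
    # partial derivatives of f(x, y) = x**2 + 3*x*y
    return (2 * x + 3 * y, 3 * x)

print(grad_f(1.0, 2.0))  # (8.0, 3.0)
```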
Jacobian matrix
The matrix of all first-order partial derivatives of a vector-valued function of several variables.
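A sketch using an illustrative function f(x, y) = (x²y, 5x + sin y), whose 2x2 Jacobian is written out analytically.

```python
import numpy as np

def jacobian(x, y):
    return np.array([
        [2 * x * y, x ** 2],     # partials of f1 = x^2 * y
        [5.0,       np.cos(y)],  # partials of f2 = 5x + sin(y)
    ])

print(jacobian(1.0, np.pi))
```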
Normalization
The transformation of features so that they are on a similar scale. This improves the performance and training stability of the model.
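A sketch of two common normalization schemes applied column-wise to a small illustrative matrix: min-max scaling to [0, 1] and z-score standardization.

```python
import numpy as np

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])

min_max = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
z_score = (X - X.mean(axis=0)) / X.std(axis=0)
print(min_max)
print(z_score)
```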