College 1 Flashcards
What are the types of artificial neurons?
- perceptron
- sigmoid neuron
Finish the sentence:
The perceptron takes several ..(1).. inputs and produces a single ..(2).. output.
1. binary
2. binary
What determines whether the perceptron neuron’s output is 0 or 1?
The neuron’s output, 0 or 1, is determined by whether the weighted sum is less than or greater than some threshold value.
If w ⋅ x + b > 0, the output is 1; otherwise it is 0.
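The threshold rule above can be sketched in a few lines; the weights, bias, and inputs here are illustrative values, not from the course material.

```python
# A minimal perceptron: output 1 if the weighted sum plus bias exceeds 0, else 0.
def perceptron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

# Example with two binary inputs and hypothetical weights.
print(perceptron([1, 0], w=[2.0, -1.0], b=-1.5))  # weighted sum 0.5 > 0, so output 1
print(perceptron([0, 0], w=[2.0, -1.0], b=-1.5))  # weighted sum -1.5 <= 0, so output 0
```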
What’s the difference between a perceptron and a sigmoid neuron?
- With perceptrons, a small change in the weights or bias of any single perceptron in the network can sometimes cause the output of that perceptron to completely flip, say from 0 to 1. Sigmoid neurons are modified so that small changes in their weights and bias cause only a small change in their output.
- Just like a perceptron, the sigmoid neuron has inputs, x1, x2, … But unlike perceptron inputs, these can take on any value between 0 and 1.
- The output is not 0 or 1. Instead, it’s σ (w ⋅ x + b), where σ is called the sigmoid function or the logistic function. (If you want a binary output, you can for example decide to interpret <0.5 as 0.)
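A sketch of the sigmoid neuron, showing the key property from the card: a small change in a weight produces only a small change in the output (example values are illustrative).

```python
import math

def sigmoid(z):
    # The sigmoid (logistic) function: maps any real z smoothly into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# A small change in a weight now changes the output only slightly,
# instead of possibly flipping it from 0 to 1 as with a perceptron.
print(sigmoid_neuron([1, 0], w=[2.00, -1.0], b=-1.5))  # sigmoid(0.50), about 0.622
print(sigmoid_neuron([1, 0], w=[2.01, -1.0], b=-1.5))  # sigmoid(0.51), about 0.625
```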
Define: multilayer perceptron (MLP)
A network with an input layer, one or more hidden layers, and an output layer is sometimes called a multilayer perceptron (MLP), despite being made up of sigmoid neurons, not perceptrons.
Define: feedforward neural networks
Neural networks where the output from one layer is used as input for the next layer, with no feedback loops.
Define: recurrent neural network
- Artificial neural networks in which feedback loops are possible.
- The idea in these models is to have neurons which fire for some limited duration of time, before becoming inactive. That firing can stimulate other neurons, which may fire a little while later, also for a limited duration.
- That causes still more neurons to fire, and so over time we get a cascade of neurons firing.
- Loops don’t cause problems in such a model, since a neuron’s output only affects its input at some later time, not instantaneously.
Define: cost / loss / objective function
A cost / loss / objective function quantifies how well the network's output approximates the desired output y(x) over all training inputs x. Training means finding weights and biases that minimize this function.
How does gradient descent work?
- You want to find a point where the cost function C achieves its global minimum.
- We try this by randomly choosing a starting point and computing derivatives. In practice we compute the gradient separately for every training example and average them.
- We step in the direction that gives the largest immediate decrease of C, i.e. opposite to the gradient (the vector of partial derivatives).
- The size of the step is controlled by the learning rate.
- We take the step and compute derivatives again, repeating until (approximate) convergence.
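The loop above can be sketched on a toy one-dimensional cost function; the cost C(v) = (v − 3)² and the learning rate are illustrative choices, not from the card.

```python
# Gradient descent on the toy cost C(v) = (v - 3)^2, whose derivative is 2(v - 3).
def gradient_descent(start, eta=0.1, steps=100):
    v = start
    for _ in range(steps):
        grad = 2 * (v - 3)   # derivative of C at the current point
        v -= eta * grad      # step opposite the gradient; eta is the learning rate
    return v

print(gradient_descent(0.0))  # converges toward the minimum at v = 3
```

With a learning rate that is too large the updates can overshoot and diverge; too small and convergence is needlessly slow.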
What is the difference between plain gradient descent and stochastic gradient descent?
- Stochastic gradient descent can speed up learning
- SGD picks out a randomly chosen mini-batch of training inputs.
- The true gradient ∇C is estimated by computing the gradient for each input in the mini-batch and averaging over this small sample.
- This is repeated until all training inputs are exhausted, which is said to complete an epoch of training. Then we start another epoch.
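One epoch of SGD can be sketched as follows; the toy per-example cost and all numeric values are illustrative assumptions.

```python
import random

# One SGD epoch: shuffle the data, split it into mini-batches, and for each
# batch estimate the true gradient by averaging the per-example gradients.
def sgd_epoch(w, data, grad_fn, eta=0.1, batch_size=2):
    random.shuffle(data)
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        grad = sum(grad_fn(w, x) for x in batch) / len(batch)
        w -= eta * grad
    return w

# Toy per-example cost (w - x)^2 with gradient 2(w - x); the combined cost
# is minimized at the mean of the data.
data = [1.0, 2.0, 3.0, 4.0]
w = 0.0
for epoch in range(50):
    w = sgd_epoch(w, data, lambda w, x: 2 * (w - x))
print(w)  # fluctuates around the mean of the data, 2.5
```

Because each mini-batch gives only a noisy estimate of ∇C, the final value hovers near the minimum rather than landing on it exactly.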
Define: online / incremental learning
SGD with a minibatch of size 1.
What does the backpropagation algorithm do?
The backpropagation algorithm is a fast way of computing the gradient of the cost function.
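For a single sigmoid neuron the backpropagated gradient is just the chain rule, and it can be checked against a numerical derivative; the quadratic cost and the input values here are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Backprop for one sigmoid neuron with quadratic cost C = (a - y)^2 / 2.
# The chain rule gives dC/dw = (a - y) * a * (1 - a) * x, and dC/db likewise.
def gradients(x, y, w, b):
    z = w * x + b
    a = sigmoid(z)                  # forward pass
    delta = (a - y) * a * (1 - a)   # backward pass: error term at the neuron
    return delta * x, delta         # dC/dw, dC/db

def cost(x, y, w, b):
    return 0.5 * (sigmoid(w * x + b) - y) ** 2

# Sanity check: compare with a central finite-difference approximation.
x, y, w, b, eps = 1.5, 1.0, 0.4, -0.2, 1e-6
dw, db = gradients(x, y, w, b)
num_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
print(abs(dw - num_dw) < 1e-8)  # True: backprop matches the numerical gradient
```

The speed advantage of backpropagation is that it computes all partial derivatives in one backward sweep, instead of one finite-difference evaluation per parameter.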
Explain the relation between:
- deep learning
- representation learning
- machine learning
- AI
Deep learning is a kind of representation learning, which is in turn a kind of machine learning, which is used for many but not all approaches to AI.
Define: Knowledge base
approach to AI
Achieve AI by hard-coding knowledge about the world in formal languages. A computer can reason automatically about statements in these formal languages using logical inference rules.
Define: Machine learning
The ability of systems to acquire their own knowledge by extracting patterns from raw data. Simple machine learning algorithms depend heavily on the representation of the data they are given. Each piece of information included in the representation (e.g. of a patient in a medical diagnosis task) is known as a feature. Many artificial intelligence tasks can be solved by designing the right set of features to extract for that task, then providing these features to a simple machine learning algorithm.
- E.g. logistic regression, naïve Bayes
Define: representation learning
An approach to use machine learning to discover not only the mapping from representation to output but also the representation itself.
- E.g. shallow autoencoders
Define: shallow autoencoder
An autoencoder is the combination of an encoder function, which converts the input data into a different representation, and a decoder function, which converts the new representation back into the original format.
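The encoder/decoder composition can be illustrated with a toy example. In a real autoencoder both functions are learned; here the code (a 2-bit index for a 4-dimensional one-hot input) is hand-picked purely to show the structure.

```python
# Autoencoder structure: decode(encode(x)) should reconstruct x.
def encode(x):
    # Compress a 4-dimensional one-hot vector into a 2-bit representation.
    i = x.index(1)
    return [i // 2, i % 2]

def decode(h):
    # Expand the 2-bit representation back into the original one-hot format.
    i = 2 * h[0] + h[1]
    return [1 if j == i else 0 for j in range(4)]

x = [0, 0, 1, 0]
print(decode(encode(x)) == x)  # True: the representation preserves the input
```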
Define: Deep learning
Deep learning represents the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones. It is the study of models that involve a greater amount of composition of either learned functions or learned concepts than traditional machine learning does.
Deep learning resolves a problem by breaking the desired complicated mapping into a series of nested simple mappings, each described by a different layer of the model. The input is presented at the visible layer, so named because it contains the variables that we are able to observe. Then a series of hidden layers extracts increasingly abstract features from the image. These layers are called “hidden” because their values are not given in the data; instead the model must determine which concepts are useful for explaining the relationships in the observed data.
- E.g. MLPs