Artificial Neural Networks and Applications Flashcards
What is a neural network?
a model that mimics the behavior of biological neurons in the human brain
How does a NN calculate an output?
by passing input through an array of neurons
What kind of complicated problems do NN perform exceptionally well in solving?
text, voice, and image recognition, and NLP
What is a type of connectionist model that has become a mainstay of AI and cognitive science?
neural networks
Where does human intelligence begin?
with the connections between neurons
What are the three milestones in neural networks models?
Single layer perceptron, multi-layer perceptron, and DNN
Who founded the single layer perceptron model and when?
Rosenblatt in 1957
Who founded the multi-layer perceptron model and when?
Rumelhart in 1986
Who founded the DNN model and when?
Hinton in 2006
Which algorithm is a key algorithm of the single layer perceptron model?
the perceptron algorithm
Which algorithm is a key algorithm of the multi-layer perceptron model?
backpropagation algorithm
Which algorithm is a key algorithm of the DNN model?
deep learning algorithm
What is the main role common to all neural networks?
the learning function
What types of information are large-scale multimedia information?
audio and video information
True or false: The data to be learned and recognized is small-scale and easy to handle?
false
What structures comprise a neuron?
dendrites, synapses, axons, and terminals
What is the role of a neuron?
to receive and transmit signals, e.g., from sensory organs, through the nervous system's network
What are the components of a perceptron?
weights, bias, and activation functions
What is an activation function?
a function that expresses the activation/deactivation of a neuron
What are a few types of activation functions?
step-fn, sigmoid, tanh, and ReLU
Why do we use activation functions?
we need non-linear functions
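The four activation functions named above can be sketched in a few lines of Python (a minimal illustration, not tied to any particular library):

```python
import math

def step(x):
    # Step function: fires 1 once the input crosses 0, otherwise 0
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    # Smooth S-shaped curve squashing any input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Like sigmoid but zero-centered, with output in (-1, 1)
    return math.tanh(x)

def relu(x):
    # Passes positive values through unchanged, zeroes out negatives
    return max(0.0, x)

print(step(0.5), sigmoid(0.0), tanh(0.0), relu(-3.0))  # 1.0 0.5 0.0 0.0
```

All four are non-linear, which is what lets stacked layers express more than a single layer could.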
What is the function of a dendrite?
receives stimuli from other neurons or surroundings and transmits impulses to cell body via electrical signals
What is a synapse?
the junction of cells where the axons of one neuron and the dendrites of the next neuron meet
What is an axon?
a branch of a neuron whose function is to transmit signals to other neurons
What is the function of a terminal?
receives transmitted electrical signals and secretes neurotransmitters into synapses
What happens to signals from sensory organs?
they pass through the brain’s network of neurons and get converted into meaningful signals
Where is the focus of current research?
implementing learning-capable computing
What is the behavior of a single neuron?
n inputs -> operation (1 or 0) -> m outputs
What happens within the operation (1 or 0) function?
the neuron emits a signal when the combined input crosses a threshold according to the rules of the cell body; otherwise it does not emit a signal
What is the function of a weight?
controls the importance of the input signal to the output
What is the function of a bias?
controls how easily neurons are activated
What is the function of an activation function?
to pass vs not to pass
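A minimal Python sketch of how these pieces fit together in one neuron (the numbers are made up for illustration): weights scale each input's importance, the bias shifts how easily the neuron fires, and a step-style activation decides pass vs. not pass.

```python
def neuron(inputs, weights, bias, threshold=0.0):
    # Weighted sum of inputs plus bias; fire (1) if it crosses the threshold
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if s > threshold else 0

# Weights control each input's importance; bias controls how easily the neuron fires
print(neuron([1, 1], [0.6, 0.6], bias=-1.0))  # 1  (0.6 + 0.6 - 1.0 > 0)
print(neuron([1, 0], [0.6, 0.6], bias=-1.0))  # 0  (0.6 - 1.0 < 0)
```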
What are adjusted through learning?
weights and biases
Why are weights and biases adjusted through learning?
to strengthen neural networks of relevant signals and weaken unrelated neural networks
True or false: The activation function used for the neural network doesn’t matter?
false, you must choose the appropriate function
An activation function does what?
takes the sum of the inputs and calculates the output
When the sum of the inputs exceeds a certain threshold, what happens?
the neuron is activated
Neurons follow what type of activation function?
a step function
True or false: MLP (multilayer perceptrons) use a variety of activation functions?
true
True or false: All activation functions are nonlinear functions?
true
Why is it difficult to improve performance if you use a linear function as an activation function?
combining linear functions just yields another linear function, so even with multiple layers the network is effectively expressed as a single layer
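The collapse of stacked linear layers can be verified directly (a toy one-dimensional example with arbitrary coefficients):

```python
# Two stacked linear "layers" y = a2*(a1*x + b1) + b2 collapse into one
a1, b1 = 2.0, 1.0   # layer 1
a2, b2 = 3.0, -2.0  # layer 2

def two_linear_layers(x):
    return a2 * (a1 * x + b1) + b2

# The same mapping as a single linear layer with combined coefficients
a, b = a2 * a1, a2 * b1 + b2   # a = 6.0, b = 1.0

for x in (-1.0, 0.0, 2.5):
    assert two_linear_layers(x) == a * x + b  # identical for every input
print("two linear layers == one linear layer")
```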
What is a multilayer perceptron?
a perceptron that has a hidden layer between the input and output layers
What are MLPs notable for?
being able to distinguish data that is not linearly separable
What do MLPs use for training the network?
backpropagation
What are the limitations of a MLP?
black box model and overfitting
What is the black box model?
since the perceptron is a black box, sometimes it is unknown how the network makes predictions or judgements
What is overfitting?
if the model is too complicated or the training data is too restricted, MLPs may easily overfit the training data
What is forward passage regarding MLPs?
the process by which input signals are applied to the input layer unit and these input signals are propagated through the hidden layer to the output layer
What is a loss function?
a mathematical function that measures the difference between predicted and actual values in a machine learning model
How does the loss function determine the model’s performance?
by comparing the distance between the prediction output and the target values
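One common loss function is mean squared error; a minimal sketch (the numbers are arbitrary examples):

```python
def mse_loss(predictions, targets):
    # Mean squared error: average squared distance between prediction and target
    n = len(predictions)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

print(mse_loss([0.9, 0.2, 0.1], [1.0, 0.0, 0.0]))  # small loss: predictions near targets
print(mse_loss([0.1, 0.9, 0.9], [1.0, 0.0, 0.0]))  # large loss: predictions far from targets
```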
What does the backpropagation algorithm do?
calculates the output by calculating the forward direction given the input and then calculates the error between the actual output and the calculated output
Why is backpropagation useful?
by propagating this error in the reverse direction, the weights are changed in the way of reducing error
What is backpropagation used for?
supervised learning of artificial neural networks using gradient descent
What is deep learning?
the technique of stacking multiple neural network layers on top of each other
What has deep learning recently been applied to?
computer vision, speech recognition, NLP, social network filtering, machine translation, etc.
Why do we use deep learning?
to mimic the human brain and improve performance of AI
What algorithm caused a jumpstart in the research for deep learning and hidden layer learning?
backpropagation
What techniques resolved the problem of overfitting?
dropout and ReLU
What activation function resolved the issue that happens when the number of hidden layers increases and the gradient value decreases?
ReLU
What is another limitation of MLP that has been resolved?
computing speed limits
What was the first neural network capable of recognizing letters?
Rosenblatt’s (single-layer) Perceptron in 1957
What was the first neural network ever?
McCulloch-Pitts neuron in 1943
What was the McCulloch-Pitts neuron the basis for?
the perceptron algorithm
What is the function of the McCulloch-Pitts neuron?
receives one or more inputs and sums them to produce an output
What can the McCulloch-Pitts neuron be used for?
classification problems
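A minimal sketch of a McCulloch-Pitts neuron in Python (thresholds chosen for illustration): it sums its binary inputs and fires when the sum reaches a threshold, which is enough to act as a simple classifier such as a logic gate.

```python
def mcp_neuron(inputs, threshold):
    # McCulloch-Pitts neuron: sum binary inputs, output 1 iff sum reaches threshold
    return 1 if sum(inputs) >= threshold else 0

# With the right threshold the same neuron acts as different logic gates
print(mcp_neuron([1, 1], threshold=2))  # AND gate: 1
print(mcp_neuron([1, 0], threshold=2))  # AND gate: 0
print(mcp_neuron([1, 0], threshold=1))  # OR gate: 1
```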
How is Hebb’s rule summarized?
neurons that fire together wire together
What does Hebb’s rule describe?
how neuronal activities influence the connections between neurons
How many layers is Rosenblatt’s perceptron made up of?
a single layer
When was the Mark I Perceptron produced and what was it?
1957, a neural network hardware device
What is a linear system?
a system that solves problems in which the input classes are linearly separable
What was the first implementation of the perceptron algorithm?
the Mark I Perceptron
What was the input for the Mark I Perceptron device?
images of printed characters, captured by an array of photocells
What did the Mark I Perceptron do with the input?
classified the characters into character classes (A, B, C, etc.), aka character recognition
Where was/is the Mark I Perceptron housed?
the Smithsonian Museum, USA
What is the structure of the single-layer perceptron?
adjusts the connection strengths of just one layer; uses the McCulloch-Pitts neuron and Hebb's learning rule for error-feedback learning
What is the structure of the input/output in neurons?
each of the n inputs is multiplied by its corresponding connection-strength (weight), and the results are summed
What is the sum determined by in the neuron?
the activation function
What is the output of the neuron?
1 if the value is greater than the threshold (usually 0), otherwise -1
What are some typical non-linear activation functions used in NN?
step function, threshold logic function, and S-shaped sigmoid function
What is the activation function commonly used in perceptrons?
the sigmoid function
What is a key characteristic of the sigmoid function?
it has smooth values between 0 and 1
Describe the perceptron learning process.
1) initialize connection strengths and thresholds, 2) present a new input and its expected output, 3) calculate the actual output value, 4) readjust connection strengths, 5) go to step 2 and repeat until no more adjustments are needed
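The five steps of the perceptron learning process can be sketched as a small Python training loop (the learning rate, iteration budget, and training data are illustrative choices, not from the source); here it learns the linearly separable AND function with ±1 outputs:

```python
def perceptron_output(x, w, b):
    # Step 3: actual output is +1 if the weighted sum exceeds threshold 0, else -1
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else -1

def train_perceptron(samples, lr=0.1, epochs=100):
    # Step 1: initialize connection strengths (weights) and threshold (bias)
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        adjusted = False
        # Step 2: present each new input with its expected output
        for x, target in samples:
            y = perceptron_output(x, w, b)
            if y != target:
                # Step 4: readjust connection strengths toward the target
                for i in range(len(w)):
                    w[i] += lr * (target - y) * x[i]
                b += lr * (target - y)
                adjusted = True
        # Step 5: go back to step 2; stop when no more adjustments are needed
        if not adjusted:
            break
    return w, b

# AND is linearly separable, so the perceptron converges
and_data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, b = train_perceptron(and_data)
print([perceptron_output(x, w, b) for x, _ in and_data])  # [-1, -1, -1, 1]
```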
Generally what is linear separability?
a property of 2 sets of points in Euclidean geometry
What is the definition of linear separable points?
data points in binary classification problems that can be separated using a linear decision boundary (basically points that can be divided into two areas by a straight line)
What binary functions are linearly separable?
AND and OR
What binary function is not linearly separable?
the XOR (exclusive-or function)
True or false: A perceptron can only converge on linearly separable data?
true
What is the limitation of the single-layer perceptron?
only linearly separable sets can be separated (so not XOR sets)
What solved the linear separability problem with the single-layer perceptron and when?
in the mid-1980s the multi-layer perceptron solved the XOR problem
Single-layer perceptron has how many nodes between the input matrix and the decision node?
one node
Is the single-layer perceptron suitable as a learning model?
no
What is the single-layer perceptron used widely for?
character recognition
What are 2 early neural network models?
Adaline (Adaptive linear neuron) and Madaline (Many Adaline)
Who proposed the first couple early neural networks in 1960?
Bernard Widrow and Ted Hoff
What are some applications of Adaline?
system modeling, statistical prediction, eliminate communication noise and echo, channel equalizer, adaptive signal processing
For how many years after 1969 did neural network research stagnate?
10 years
When was the multi-layer perceptron model proposed?
mid-1980s
What key NN research group was based out of UC San Diego?
the Parallel Distributed Processing group
What does the PDP focus on?
the study of cognitive processes using parallel distributed processing models
What are some activities of the PDP?
developing models of cognitive processes, creating software programs to simulate these models, and conducting experiments to test these models
What prominent book was published by PDP in 1986?
Parallel Distributed Processing: Explorations in the Microstructure of Cognition
What important algorithm did the PDP introduce?
backpropagation
When was the backpropagation algorithm proposed?
in 1986 in the multi-layer perceptron structure
What was the multi-layer perceptron with back propagation useful for?
overcoming limitations of single-layer perceptrons particularly the linear separability (XOR) problem
How does backpropagation implement learning?
by propagating the error signal backward, in the opposite direction of the forward pass, to update the weights
What is backpropagation used for?
supervised learning of artificial neural networks using gradient descent
What exactly is backpropagation?
a backward propagation of errors algorithm that calculates the gradient of the error function with respect to the NN’s weights
In backpropagation, what minimizes errors?
a backwards pass to adjust a neural network model’s parameters
What are the steps of backpropagation in a multi-layered network?
propagate training data through the model, adjust the model weights to reduce the error, and repeatedly update weights until convergence or iteration satisfaction
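Those steps can be illustrated with the smallest possible example: a single sigmoid neuron trained on one data point by hand-derived gradients (the weights, learning rate, and training pair are made-up values for demonstration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.0, 0.0          # one training pair (illustrative)
w, b = 0.5, 0.5               # initial weights
lr = 0.5                      # learning rate

losses = []
for _ in range(50):
    # 1) propagate the training data through the model (forward pass)
    y = sigmoid(w * x + b)
    losses.append((y - target) ** 2)   # squared error
    # 2) adjust the weights to reduce the error:
    #    gradient of the loss w.r.t. w and b via the chain rule
    grad_y = 2 * (y - target)          # dL/dy
    grad_z = grad_y * y * (1 - y)      # chain rule through the sigmoid
    w -= lr * grad_z * x
    b -= lr * grad_z
    # 3) repeat updating weights until convergence or the iteration budget ends

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")  # the error shrinks
```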
Who was both the first author of the famous backpropagation paper in 1986 and a key member of the PDP group?
Dr. David Rumelhart
What was an early goal of the PDP group?
to recognize incomplete or noisy characters
What is the structure of the multi-layer perceptron?
one or more hidden layers between the input and output layers
What is the order of connections for multi-layer perceptrons?
input -> hidden -> output
Are there connections within each layer of a multi-layer perceptron?
no
True or false: There is a direct connection from the output layer to the input layer of a multi-layer perceptron?
false
Fill in the blank: Multi-layer perceptron makes the input and output characteristics of nodes __________.
nonlinear
What are the steps in the forward pass of the backpropagation learning algorithm of multi-layer perceptron?
present the input pattern to the nodes of the input layer, the signal is transformed at each node and sent to the hidden layer, and the hidden layer's output signal is passed on to the output layer
After forward pass, what are the remaining steps of the backpropagation learning algorithm of multi-layer perceptron?
compare the output to the expected output, adjust connection strengths, backpropagate again to adjust connection strengths, and repeat until the end conditions are met
What are the end conditions for the backpropagation learning algorithm of multi-layer perceptron?
when the actual output value and the target output value are within the error range
What is the delta rule?
a weight-update rule that minimizes the squared error between the actual output and the target output value
What is the gradient descent method?
an iterative optimization algorithm used to find a local minimum of a differentiable function, viewing the squared error as a curved surface to descend
What is the idea behind the gradient descent method?
to take repeated steps in the opposite direction of the gradient of the function at the current point because this is the direction of steepest descent (to find the minimum by moving in the direction of the negative gradient)
How does backpropagation use gradient descent in learning?
used to update weights of network during training by computing gradient of loss function with respect to the weights then moving the weights in opposite direction of the gradient until weights converge to a minimum
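A tiny sketch of gradient descent itself, on a simple differentiable function (the function, starting point, and learning rate are arbitrary choices for illustration):

```python
def f(x):
    return (x - 3.0) ** 2      # differentiable function with its minimum at x = 3

def grad_f(x):
    return 2.0 * (x - 3.0)     # derivative of f

x = 0.0                        # arbitrary starting point
lr = 0.1                       # step size
for _ in range(100):
    x -= lr * grad_f(x)        # step opposite the gradient: steepest descent

print(round(x, 4))             # converges toward the minimum at 3.0
```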
What are the disadvantages of the backpropagation learning algorithm?
long training time, and the possibility of converging to a local minimum: steepest descent is likely to get stuck at a local minimum when we want the global minimum
What did the multi-layer perceptron make possible?
A neural network that can implement the XOR function
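One illustrative set of hand-picked weights (an assumption for demonstration, not from the source) showing how a hidden layer lets a perceptron compute XOR:

```python
def step(z):
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer: one neuron computes OR, the other computes NAND
    h1 = step(x1 + x2 - 0.5)    # OR
    h2 = step(-x1 - x2 + 1.5)   # NAND
    # Output layer: AND of the two hidden units yields XOR
    return step(h1 + h2 - 1.5)

print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

No single-layer perceptron can realize this truth table, which is exactly the linear separability limit the hidden layer overcomes.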
What are other applications of multi-layer perceptron models?
parity problem, encoding problem, symmetry problem, development of nettalk system to convert text to voice, stock market predictions, translation between different languages, factory automation, robots, real-time voice recognition