Neural Networks Flashcards
Neural Network
Neural networks are a class of machine learning models inspired by the structure and functioning of the human brain. They are composed of interconnected nodes, called neurons, organized in layers. Each neuron receives input signals, processes them, and produces an output signal, which is passed to neurons in the next layer.
Here’s a brief overview of neural networks:
- Neurons: Neurons are the basic units of a neural network. Each neuron receives one or more input signals, computes a weighted sum of these inputs, adds a bias term, and applies an activation function to produce an output.
- Layers: Neurons in a neural network are organized into layers. There are typically three types of layers:
- Input Layer: Receives input data and passes it to the next layer.
- Hidden Layers: Intermediate layers between the input and output layers. They perform complex transformations on the input data.
- Output Layer: Produces the final output of the network.
- Connections: Neurons in adjacent layers are linked by weighted connections. These weights determine the strength of the connection between neurons and are adjusted during the training process to optimize the network’s performance.
- Activation Functions: Activation functions introduce non-linearity into the network, allowing it to learn complex patterns in the data. Common activation functions include sigmoid, tanh, ReLU (Rectified Linear Unit), and softmax.
- Forward Propagation: During forward propagation, input data is fed into the network, and the output is computed by passing the data through the layers while applying the activation functions (see the sketch after this overview).
- Training: Neural networks are trained using a process called backpropagation, which involves iteratively adjusting the weights of the connections based on the difference between the predicted output and the true output (the loss). This process aims to minimize the loss function and improve the network’s performance on the training data.
- Deep Learning: Deep learning refers to the use of neural networks with multiple hidden layers (deep neural networks). Deep learning has revolutionized many fields, including computer vision, natural language processing, and speech recognition, by enabling the learning of complex hierarchical representations from data.
Neural networks have shown remarkable success in various applications, including image recognition, language translation, recommendation systems, and autonomous vehicles. They continue to be an active area of research and development in the field of artificial intelligence.
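A minimal sketch of the forward pass for a single neuron, using NumPy (the input values, weights, and bias here are illustrative, not from any particular trained network):

```python
import numpy as np

def sigmoid(z):
    # Squishes any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# One neuron: compute a weighted sum of the inputs, add a bias,
# and apply an activation function to produce the output.
inputs = np.array([0.5, 0.1, 0.9])    # activations feeding into the neuron
weights = np.array([0.4, -0.6, 0.2])  # strength of each connection
bias = -0.1

activation = sigmoid(np.dot(weights, inputs) + bias)
print(f"neuron activation: {activation:.3f}")
```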
Structure
A single neuron holds a single number: its activation (a higher number means the neuron is more activated)
The activation of a neuron is some function of the activations of the neurons feeding into it and of how strong the connections between them are
The activations of one layer determine the activations of the next layer
The firing of some neurons in one layer may cause some neurons in the subsequent layer to activate
Identifying Numbers
Imagine all neurons lined up in a single column – instead of in the array structure of the original image
This column becomes the input layer of our neural network
10 digits so there will be 10 output neurons
784 input “neurons” (a 28 × 28 pixel image flattened), each representing the brightness of a single pixel
The activation in each output neuron now corresponds to the probability that the input image represents a specific digit
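A minimal sketch of this set-up (shapes follow the 28 × 28 = 784 pixel example; the random weights are placeholders for what a trained network would have learned):

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((28, 28))   # grey-scale image: one brightness value per pixel
x = image.reshape(784)         # line all pixels up in a single column

# Placeholder parameters mapping 784 inputs to 10 outputs, one per digit
# (a real network would have trained weights, usually with hidden layers).
W = rng.normal(size=(10, 784))
b = np.zeros(10)

logits = W @ x + b
shifted = logits - logits.max()                  # for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum()  # softmax: outputs sum to 1

print(f"predicted digit: {probs.argmax()}, probability: {probs.max():.2f}")
```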
Hidden Layers
Layers between the input and output neurons
By moving from one layer to the next, you can create and select more abstract or composite features
If you go too deep, the network may not generalise well: it can start incorporating things like noise that are specific to the training data set (overfitting)
Properties
Number of neurons in the input and output layer are generally determined by the problem at hand
The number of hidden layers, the number of neurons in each hidden layer, and the way neurons are connected are free design choices
Determining the optimal set-up is largely empirical (e.g. trial and error or a hyperparameter search)
Calculating Neuron Values
Weights: After the input layer, the value of each neuron is a weighted sum of the neurons connected to it
Activation Function: The output of a linear combination can be any number, so a normalisation is needed to restrict outcomes to the interval [0, 1]
Sigmoid: an S-shaped curve, σ(x) = 1 / (1 + e^(−x)), which squishes the output to the interval [0, 1]
Bias: used if you want the neuron to fire only when the linear combination exceeds a certain value
It shifts the activation function to the left or to the right
If the bias is −10, the remaining linear combination needs to reach at least 10 before the result of the normalisation may be large enough for the “neuron” to “fire”
Each “neuron” has its own bias; each connection has its own weight
For a layer of k neurons fed by n inputs, we would then expect k equations, each with n weights and one bias; compactly, a = σ(Wx + b), where W is a k × n weight matrix and b holds the k biases
If the number of neurons in the hidden layer is less than the inputs, there will be some compression of the information so only key things can be extracted from the data
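Putting weights, biases, and the sigmoid together, one layer is a single matrix computation. A sketch with illustrative sizes and values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, k = 4, 3                        # n inputs feeding a layer of k neurons
x = np.array([0.2, 0.9, 0.4, 0.7])

W = np.full((k, n), 0.5)           # one weight per connection (k x n of them)
b = np.array([-1.0, 0.0, -10.0])   # one bias per neuron

# All k equations at once: each row of W holds the n weights of one neuron
a = sigmoid(W @ x + b)
print(a)   # the neuron with bias -10 stays near 0: it barely "fires"
```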
Universal Approximation Theorem
A neural network with a single hidden layer (but a potentially very, very large number of “neurons” in it) can approximate pretty much any continuous function (on a bounded domain, to arbitrary accuracy)
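A sketch of the intuition behind the theorem (all values illustrative): two steep, shifted sigmoids subtracted from each other form a localised “bump”, and a hidden layer can stack many such bumps to trace out a continuous curve such as sin(x):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A "bump" between left and right, built from two shifted sigmoids;
# this is exactly what a pair of hidden neurons can compute.
def bump(x, left, right, height, steepness=50.0):
    return height * (sigmoid(steepness * (x - left))
                     - sigmoid(steepness * (x - right)))

x = np.linspace(0, np.pi, 200)
target = np.sin(x)

# Crude approximation of sin(x) from 8 bumps, each bump's height
# taken from the target at the midpoint of its interval.
edges = np.linspace(0, np.pi, 9)
approx = sum(bump(x, a, b, np.sin((a + b) / 2))
             for a, b in zip(edges[:-1], edges[1:]))

print(f"max error with 8 bumps: {np.abs(approx - target).max():.3f}")
# More bumps (more hidden neurons) make the error arbitrarily small
```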
Learning
Need an error metric to evaluate performance: a cost function
Optimisation
In the simplest case you could assume there is one minimum and set the derivative equal to 0; real cost functions are too complex for that, so instead use gradient descent:
Start anywhere on the cost function
Look in all directions from the point you are and determine the direction with the steepest descent
Move in that direction
Repeat
Step size (the learning rate) is very important
Too small: it will take a long time to find a local minimum
Too large: you may jump back and forth and miss the local minimum
You can reduce the step size as the slope gets smaller and smaller
The negative gradient of the cost function tells you how to adjust the weights to take the next step
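A minimal gradient descent sketch on a 1-D cost function (the cost function, starting point, and step size are illustrative choices):

```python
# Gradient descent on cost(w) = (w - 3)^2, whose single minimum is at w = 3
def grad(w):
    return 2.0 * (w - 3.0)   # derivative of (w - 3)^2

w = 0.0           # start anywhere on the cost function
step_size = 0.1   # too small is slow; too large jumps back and forth

for _ in range(50):
    w -= step_size * grad(w)   # move against the gradient (steepest descent)

print(f"w after descent: {w:.4f} (true minimum at 3.0)")
```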
Backpropagation
Purpose: Adjust the weights of all neurons in all layers so that we reduce our cost function
The output layer ‘knows’ what error each of its neurons has compared to the correct answer
Each neuron can then report the required change back to its predecessor
These nodes then collect the feedback from all their successor nodes and adjust their weights based on the collective feedback
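A toy backpropagation sketch: a network with one hidden layer learning XOR. The layer sizes, learning rate, and iteration count are illustrative assumptions; the gradients follow the chain rule for a squared-error loss with sigmoid activations:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: four input pairs and their target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
lr = 0.5

for _ in range(10000):
    # Forward propagation
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: the output layer 'knows' its error, and each
    # layer reports the required change back to its predecessor
    d_out = (out - y) * out * (1 - out)   # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)    # feedback passed to the hidden layer

    # Each layer adjusts its weights based on the collected feedback
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))   # should approach [0, 1, 1, 0]
```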
Types of Neural Networks
You can, for example, create loops from a neuron back to itself (RNN), connect each neuron with every other neuron (Hopfield network), or let two networks play against each other (GAN)
These networks then obtain new properties that make them more useful than others for specific use cases
Convolutional Neural Networks (CNNs): structured neural networks that don’t require an image to be flattened, so they are very good at detecting more and more abstract features in image or audio data
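A minimal sketch of the convolution operation at the heart of a CNN: a small kernel slides over the image, so the 2-D structure is preserved rather than flattened (the image and kernel values are illustrative):

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over every position of the image (no padding)
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                       # left half dark, right half bright
edge_kernel = np.array([[-1.0, 1.0]])    # responds to brightness jumps

print(conv2d(image, edge_kernel))        # non-zero only at the vertical edge
```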
Limitations
Only works if:
Patterns actually exist in the data
There is enough data
The data distribution isn’t changing
Poor at knowing what they don’t know
Finds a pattern, doesn’t understand
Learns correlation, not causation