Deep Learning A-Z™ 2023: Neural Networks, AI & ChatGPT Bonus (UDEMY) Flashcards
Moore’s law
Moore’s law is the observation that the number of transistors in an integrated circuit (IC) doubles about every two years.
What is Deep Learning?
Deep learning is a method in artificial intelligence (AI) that teaches computers to process data in a way that is inspired by the human brain. Deep learning models can recognize complex patterns in pictures, text, sounds, and other data to produce accurate insights and predictions. You can use deep learning methods to automate tasks that typically require human intelligence, such as describing images or transcribing a sound file into text.
Activation functions
The Threshold Function: outputs 1 if the weighted sum of inputs is at least 0, and 0 otherwise; a hard yes/no activation.
The Sigmoid Function: phi(x) = 1 / (1 + e^(-x)); smooth, outputs values between 0 and 1, often used in the output layer for probabilities.
The Rectifier Function (ReLU): phi(x) = max(0, x); one of the most widely used activations for hidden layers.
The Hyperbolic Tangent (tanh) Function: phi(x) = (e^x - e^(-x)) / (e^x + e^(-x)); similar to the sigmoid but outputs values between -1 and 1 (a NumPy sketch of all four follows below).
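A minimal sketch of these four activations in plain NumPy (the function names are my own, for illustration):

```python
import numpy as np

def threshold(x):
    # Hard binary activation: 1 if x >= 0, else 0
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    # Smooth squashing into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def rectifier(x):
    # ReLU: passes positive values through, zeroes out negatives
    return np.maximum(0.0, x)

def tanh(x):
    # Squashes into (-1, 1)
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(threshold(x))   # [0. 0. 1. 1. 1.]
print(sigmoid(x))     # values between 0 and 1
print(rectifier(x))   # [0.  0.  0.  0.5 2. ]
print(tanh(x))        # values between -1 and 1
```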
The Cost Function
The cost function tells us the error in our prediction.
Our aim is to minimize the cost function. The lower the cost function, the closer Ŷ is to Y, and hence the closer our predicted output is to the actual value.
A lower cost function means higher accuracy for our Network.
Once we have the cost, a feedback process begins: the resulting error is fed back through the entire Neural Network. The weighted synapses connecting the input values to the neuron are the only thing we have any control over.
For multiple training examples, the cost is calculated separately for each example; the per-example costs are then combined (typically summed or averaged) for backpropagation.
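A minimal sketch of one common cost function, the squared-error cost C = 1/2 * sum((Ŷ - Y)^2) (the values here are illustrative):

```python
import numpy as np

def cost(y_hat, y):
    # Squared-error cost, summed over all training examples:
    # C = 1/2 * sum((y_hat - y)^2)
    return 0.5 * np.sum((y_hat - y) ** 2)

y     = np.array([1.0, 0.0, 1.0])   # actual values (Y)
y_hat = np.array([0.9, 0.2, 0.6])   # network predictions (Y-hat)
print(cost(y_hat, y))  # ~0.105
```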
Backpropagation
This is when you propagate the error from the output back through the Neural Network and adjust the weighted synapses between the input values and the neuron.
By repeating this cycle and adjusting the weights accordingly, you reduce the cost function.
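A minimal sketch of a single backpropagation/gradient-descent step for one sigmoid neuron (the data, weights, and learning rate are illustrative assumptions, not the course's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One observation with 3 input features and its actual value
x = np.array([0.5, -1.0, 2.0])
y = 1.0

w = np.random.randn(3) * 0.01  # small random weights close to 0
lr = 0.1                       # learning rate

# Forward pass
y_hat = sigmoid(w @ x)

# Cost: C = 1/2 * (y_hat - y)^2
# Backward pass: chain rule gives dC/dw
dC_dyhat = y_hat - y
dyhat_dz = y_hat * (1 - y_hat)   # derivative of the sigmoid
dC_dw = dC_dyhat * dyhat_dz * x

# Gradient-descent update: step against the gradient
w -= lr * dC_dw
```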
A step-by-step training guide for your Neural Network
Randomly initialize the weights to small numbers close to 0 (but not 0).
Input the first observation. One feature per input node.
Forward-propagation: from left to right, the neurons are activated and the output value is produced.
Compare the output value to the actual value and measure the difference between the two: the generated error.
From right to left, the generated error is back-propagated and the weights are adjusted accordingly. How much each weight is adjusted is determined by the learning rate.
Repeat steps 2-5 and either adjust the weights after each observation (stochastic/online learning) or after a batch of observations (batch learning).
When the whole training set has passed through the Neural Network, that completes one epoch. Train over multiple epochs (see the code sketch below).
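A minimal sketch of the whole procedure in Keras (the toy dataset, layer sizes, and hyperparameters are illustrative assumptions, not the course's exact setup):

```python
import numpy as np
from tensorflow import keras

# Toy data: 1000 observations, 10 features, binary target (illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X.sum(axis=1) > 0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(10,)),                     # one input node per feature
    keras.layers.Dense(6, activation="relu"),     # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # output layer
])

# Weights are randomly initialized to small values by default (step 1).
# compile() chooses the cost function and how the weights are updated.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# batch_size controls how many observations pass before a weight update;
# epochs is how many times the whole training set passes through the network.
model.fit(X, y, batch_size=32, epochs=10)
```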
Normalization vs. standardization
Normalization and standardization are both techniques used in data preprocessing to transform and scale numerical data. They are commonly employed to ensure that the data is on a similar scale, which can be beneficial for certain machine learning algorithms.
Normalization:
Normalization, also known as Min-Max scaling, rescales the data to a specific range, typically between 0 and 1. It involves subtracting the minimum value from each data point and dividing it by the range (maximum value minus minimum value). This process preserves the relative relationships between data points and is suitable when the distribution of the data is not necessarily Gaussian.
Formula for normalization: x_normalized = (x - min(x)) / (max(x) - min(x))
Standardization:
Standardization, also known as Z-score normalization, transforms the data to have a mean of 0 and a standard deviation of 1. It involves subtracting the mean from each data point and dividing by the standard deviation. Note that standardization does not make the data Gaussian: it rescales the data to mean 0 and variance 1, but the shape of the distribution is preserved.
Formula for standardization: x_standardized = (x - mean(x)) / std(x)
Key differences:
Range: Normalization scales the data to a specific range (e.g., 0 to 1), while standardization centers the data around the mean with a standard deviation of 1.
Sensitivity to outliers: Normalization is sensitive to outliers, since a single extreme value changes the minimum or maximum and compresses all other values into a narrow band. Standardization is generally less affected, although the mean and standard deviation are themselves influenced by extreme values.
Interpretability: Normalization preserves the relative relationships between data points within a fixed range. Standardization expresses every feature in the same units (standard deviations from the mean), which makes features with different scales easier to compare; it does not change the shape of each feature's distribution.
Which one to use:
The choice between normalization and standardization depends on your data and the machine learning algorithm you intend to use. In general, standardization is more widely used and works well for algorithms that expect features on a common scale (for example, gradient-based optimizers or distance-based methods). However, normalization or another scaling technique may be more appropriate in some cases, such as when the data has a predefined range or when the algorithm is sensitive to the absolute scale of the features.
It’s often a good practice to experiment with both normalization and standardization, or even other scaling methods, and evaluate their impact on your specific problem to determine the most effective approach.
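A minimal sketch comparing the two with scikit-learn's MinMaxScaler and StandardScaler (the toy column of values is illustrative):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # note the outlier

x_norm = MinMaxScaler().fit_transform(X)    # rescaled into [0, 1]
x_std  = StandardScaler().fit_transform(X)  # mean 0, std 1

print(x_norm.ravel())  # the outlier squashes the other values near 0
print(x_std.ravel())   # values in standard deviations from the mean
print(x_std.mean(), x_std.std())  # ~0.0 and ~1.0
```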
What is the benefit of combining 3 input values in one neuron?
By combining 3 input features, the neuron can learn a new feature as a weighted combination of those inputs, which adds information (value) when predicting the target.
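A minimal sketch of the idea (the inputs and weights are illustrative): one neuron mixes three raw features into a single new derived feature.

```python
import numpy as np

# Three raw input features, e.g. age, income, tenure (illustrative)
x = np.array([0.4, 0.7, 0.1])

# Learned weights decide how the inputs are mixed
w = np.array([0.2, 0.9, -0.5])

# The neuron's output is a new derived feature: a weighted
# combination of the inputs passed through an activation
new_feature = max(0.0, float(w @ x))  # rectifier activation
print(new_feature)  # ~0.66
```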