CHAP 10 : INTRO TO DEEP LEARNING Flashcards
What is deep learning?
It is one of the many branches of machine learning where operations are applied one after another.
It is a mathematical framework to learn representations from data
Operations in deep learning are structured into modules called _____; deep learning models are typically stacks of _____ (same word)
Layers
What is the goal of the process of learning for deep learning?
To find good values for weights in the layers (i.e. values that minimise a loss function)
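A minimal sketch of what "finding weights that minimise a loss function" can look like, assuming a toy model with a single weight, a mean squared error loss and plain gradient descent. The data, `learning_rate` and `steps` values are illustrative assumptions, not part of the original notes.

```python
import numpy as np

# Toy data generated from y = 3x, so a "good" weight is 3.0 (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

def loss(w):
    # Mean squared error between the predictions w * x and the targets y.
    return np.mean((w * x - y) ** 2)

def fit(w=0.0, learning_rate=0.01, steps=200):
    # Repeatedly nudge the weight in the direction that lowers the loss.
    for _ in range(steps):
        grad = np.mean(2 * (w * x - y) * x)  # derivative of the loss w.r.t. w
        w -= learning_rate * grad
    return w

w_learned = fit()
print(w_learned, loss(w_learned))
```

Real deep learning models do the same thing with millions of weights spread across many layers, with the gradients computed by backpropagation.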
What are the 2 limitations of machine learning, as compared to deep learning?
- ML places a heavy load on the CPU (central processing unit, found in every computer), resulting in longer computation times
- ML cannot extract features for complex problems like object recognition
e.g. face detection
> ML : We need to define/specify features such as eyes, ears etc., and the ML program will identify which features are more important for different people
> DL : Given a large amount of data, the deep learning framework will automatically find out which features are important for face detection
Why did Deep Learning become popular since the 2000s, and why do we use DL now? [3]
- Better algorithms and a better understanding of them
- Very powerful computing infrastructure available (GPU, TPU, cloud, e.g. AWS, Google Colab)
- Large amount of resources online -> huge amounts of labelled data, open source tools (Keras, PyTorch, TensorFlow) and pretrained models
What are pretrained models?
Pretrained models are machine learning models that have already been trained on a large amount of data. Their parameters have already been learned, so users can feed their own data into the model directly or fine-tune it for their own task.
List some applications of DL in computer vision (other than object recognition/classification in pictures)
- Image segmentation (segmenting the different objects in an image, e.g. if a picture contains 2 humans and some objects, DL will label the 2 humans as one class, objects of the same kind as another class, etc.)
- Style transfer : taking the style from a style image and applying it to an input image (e.g. taking the style of a painting like Starry Night and applying it to a photo you took, or image filters in apps that are applied to your pictures)
- Autocolouring of images
- Restoration of images by filling in missing pixels
- Image super resolution : e.g. take an 8x8 image and generate a 32x32 image which has a higher resolution. The original high-resolution image that the generated output is compared against is called the GROUND TRUTH
- Image synthesis : the process of artificially generating images that contain some particular desired content, e.g. turning a horse image into a zebra image by generating the horse with the stripes of a zebra
What are the 2 properties of CNNs?
- The patterns they learn are translation invariant. After learning a certain pattern in the lower right corner of a picture, a CNN can recognise it anywhere
- They can learn spatial hierarchies of patterns -> a first layer will learn small local patterns such as edges, a second layer will learn larger patterns made of the features of the first layer, etc.
What is the core of a CNN (the main components of a CNN)?
- Convolutional layer : consists of a series of filters known as convolutional kernels
- Filter / kernel : a small matrix of numbers (weights) that is applied to a subset of the input pixel values.
Each pixel is multiplied by the corresponding value in the kernel, and the results are summed into a single value representing one grid cell (like a pixel) in the output feature map
- Input images
- Convolution operation : the kernel strides over the input matrix of numbers, moving horizontally column by column (left to right), and then strides down vertically for subsequent rows
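The multiply-and-sum step and the striding described above can be sketched in NumPy. This is a minimal illustration, assuming a single-channel image, no padding and a hand-picked 2x2 kernel. The function name `conv2d` and the example values are hypothetical. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this computes a cross-correlation.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over a 2D `image` (no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):        # stride down, row by row
        for j in range(out_w):    # stride across, column by column
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            # Multiply each pixel by the matching kernel value, then sum.
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                   # toy 2x2 kernel
feature_map = conv2d(image, kernel)
print(feature_map.shape)  # (3, 3)
```

In a trained CNN the kernel values are not chosen by hand; they are weights learned during training.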
What are the no of channels for input images for CNN?
RGB images — 3 channels
Black and white images - 1 channel
How can we normalise pixel values of images in CNN?
What is the purpose of normalising or standardising image pixels?
For RGB images, we can normalise by dividing each pixel value by 255 (the maximum 8-bit value), which scales the values to between 0 and 1.
Data normalisation is an important step which ensures that each input feature (a pixel, in this case) has a similar data distribution. This makes convergence faster while training the network. It can also reduce the risk of exploding gradients.
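A short sketch of both options, assuming 8-bit pixel values stored in a NumPy array of shape (height, width, channels). The tiny 2x2 image is made up for illustration.

```python
import numpy as np

# A hypothetical 2x2 RGB image with 8-bit pixel values (0 to 255).
image = np.array([[[0, 128, 255], [64, 64, 64]],
                  [[255, 0, 0], [10, 20, 30]]], dtype=np.uint8)

# Normalise: scale every pixel value into the range 0 to 1.
normalised = image.astype(np.float32) / 255.0

# Standardise: zero mean and unit variance for each colour channel.
mean = normalised.mean(axis=(0, 1))
std = normalised.std(axis=(0, 1))
standardised = (normalised - mean) / std

print(normalised.min(), normalised.max())
```

In practice the mean and standard deviation are usually computed over the whole training set rather than over a single image.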
What is the goal of the pooling layer?
What functions are used?
- Goal : to reduce computation, memory usage and the number of parameters in the network by reducing the size of the image (feature map)
- Max / mean aggregate functions are used
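Max and mean pooling can be sketched as follows, assuming non-overlapping 2x2 windows on a single 2D feature map. The helper name `pool2d` and the example values are hypothetical.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Downsample a 2D feature map using non-overlapping size x size windows."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # Crop to a multiple of `size`, then split into (out_h, size, out_w, size) blocks.
    blocks = feature_map[:out_h * size, :out_w * size].reshape(out_h, size, out_w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

feature_map = np.array([[1.0, 2.0, 5.0, 6.0],
                        [3.0, 4.0, 7.0, 8.0],
                        [9.0, 10.0, 13.0, 14.0],
                        [11.0, 12.0, 15.0, 16.0]])

max_pooled = pool2d(feature_map, mode="max")    # [[4, 8], [12, 16]]
mean_pooled = pool2d(feature_map, mode="mean")  # [[2.5, 6.5], [10.5, 14.5]]
print(max_pooled)
print(mean_pooled)
```

The 4x4 input shrinks to 2x2, so every later layer has a quarter as many values to process.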
What are the 4 different layers in a CNN?
- Input layer
- Convolutional layer
- Pooling layer
- Fully-connected layer (Dense layer)
What does the fully-connected / dense layer do in CNN?
It classifies the image into its class
- like a normal neural network
- e.g. if there are 9 different classes of animals, the last dense layer typically has 9 output neurons, one per animal class, and the neuron with the highest output indicates which animal the image shows.
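A minimal sketch of a dense output layer for the 9-class animal example, assuming one output neuron per class followed by a softmax (a common setup, not necessarily the one these notes had in mind). The weights are random and untrained, and `num_features` is a hypothetical size for the flattened feature vector.

```python
import numpy as np

rng = np.random.default_rng(0)

num_features = 16   # hypothetical length of the flattened feature vector
num_classes = 9     # e.g. 9 animal classes

features = rng.normal(size=num_features)                # stand-in for conv/pooling output
weights = rng.normal(size=(num_features, num_classes))  # random, untrained weights
bias = np.zeros(num_classes)

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = features @ weights + bias     # one score per class
probabilities = softmax(scores)        # scores turned into class probabilities
predicted_class = int(np.argmax(probabilities))
print(predicted_class, probabilities.sum())
```

Because the weights here are random, the predicted class is meaningless; training is what makes the highest probability line up with the correct animal.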
What does NLP deal with?
It deals with building computational algorithms to automatically analyze and represent human language.
It allows machines to have the ability to perform complex natural language related tasks.