Deep Learning Flashcards by Maciej Mucha

what characterises supervised learning

labelled data
cost function
repeated epochs

what characterises unsupervised learning

non-labelled data
no label-based cost function (any objective is defined on the data itself)
finds representations of data

What is deep learning

usually labelled data
uses multilayered neural networks, either convolutional or perceptron based

What are the key components of a neural network

-Some cost function, e.g. RMSE
-Forward propagation: each node in each layer computes weights x inputs + bias
-An activation function after each neuron, usually something like ReLU or softmax
-Backpropagation
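
The first three components can be sketched in a few lines of plain Python. This is only an illustration: the function names (`dense_forward`, `rmse`), the weight layout (one list of weights per node) and the numbers are assumptions made for the example, not taken from the cards.

```python
import math

def relu(z):
    # ReLU activation: pass positive values through, clamp negatives to zero
    return max(0.0, z)

def dense_forward(inputs, weights, biases, activation):
    """Forward pass of one fully connected layer.

    weights[j] holds the weights of node j, one weight per input.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias  # weights x inputs + bias
        outputs.append(activation(z))
    return outputs

def rmse(predictions, targets):
    # Root mean squared error as the cost function
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)

# Hypothetical two-node layer applied to a two-value input
hidden = dense_forward([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, -0.5], relu)
cost = rmse(hidden, [0.0, 2.0])
```

Backpropagation would then use the gradient of `cost` with respect to each weight and bias to update them.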

Why is backpropagation an efficient method of optimization

It stores intermediate gradients and reuses them through the chain rule, so the partial derivatives for every layer are computed in a single backward pass. This allows for fast optimization of a neural network (at least relative to computing the derivative separately for each variable).
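
A minimal sketch of the idea, assuming a tiny hypothetical network y = w2 * sigmoid(w1 * x + b1) + b2 with a squared-error loss (the network, names and values are made up for illustration). The backward pass reuses the cached forward values and each upstream gradient, while the numerical version needs two extra forward passes per parameter.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x, target):
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)       # hidden activation
    y = w2 * h + b2                # network output
    loss = (y - target) ** 2
    return loss, (x, h, y)         # cache intermediates for the backward pass

def backward(params, cache, target):
    _, _, w2, _ = params
    x, h, y = cache
    dy = 2.0 * (y - target)        # dLoss/dy, reused by everything below
    dw2, db2 = dy * h, dy
    dz = dy * w2 * h * (1.0 - h)   # chain rule through the sigmoid, reusing dy
    dw1, db1 = dz * x, dz
    return [dw1, db1, dw2, db2]

def numerical_gradient(params, x, target, eps=1e-6):
    # Slow alternative: perturb each parameter separately (two forward passes each)
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((forward(up, x, target)[0] - forward(down, x, target)[0]) / (2 * eps))
    return grads

params = [0.5, -0.3, 1.2, 0.1]
loss, cache = forward(params, 2.0, 1.0)
analytic = backward(params, cache, 1.0)
numeric = numerical_gradient(params, 2.0, 1.0)
```

Both gradient lists should agree to within numerical error, but the backpropagation version costs one backward pass regardless of how many parameters there are.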

What is the activation function sigmoid and what are its pros and cons

-it is equal to 1 / (1 + e^-x) and maps any possible value in (-infinity, infinity) to the range (0, 1)
pros: differentiable and normalises ranges
cons: relatively expensive to compute (needs an exponential)
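
A small sketch of the sigmoid and its derivative in plain Python. The branch on the sign of x is an implementation choice assumed here to avoid overflow for large negative inputs; it is not part of the definition.

```python
import math

def sigmoid(x):
    # 1 / (1 + e^-x), written in two branches so math.exp never overflows
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_derivative(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)), largest at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)
```

The derivative being expressible through the function's own output is what makes sigmoid convenient during backpropagation.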

What is the activation function tanh and what are its pros and cons

(e^x - e^-x) / (e^x + e^-x). Normalizes values to the range (-1, 1)
-differentiable but relatively expensive to compute

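What is the activation function ReLU and what are its pros and cons

f(x) = max(0, x). Easy to compute but not differentiable at x = 0
-avoids saturation for positive inputs (saturation is when a node's output sits near the limits of the activation, e.g. -1 or +1 for tanh, where the gradient is close to zero)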
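
To make the saturation point concrete, here is a rough comparison of the ReLU gradient with the tanh gradient. Using 0 as the gradient at exactly x = 0 is a convention assumed for this sketch, since the true derivative is undefined there.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (value at x = 0 is a convention)
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2, which shrinks towards 0 as |x| grows
    return 1.0 - math.tanh(x) ** 2

# For a large input the tanh gradient has almost vanished, the ReLU gradient has not
large_input = 10.0
relu_slope = relu_grad(large_input)
tanh_slope = tanh_grad(large_input)
```

The same effect applies to sigmoid; the flip side is that ReLU passes no gradient at all for negative inputs.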

What is the neural radiance field (NeRF) neural network

Form of multi-view 3D reconstruction (related to stereo vision)
-takes as input images of a scene from different perspectives
-outputs a 3D representation of the object
-a depth image is one example of what can be rendered from it

What is the relationship between amount of data and the performance of different ML / AI methods

Large neural network performs best with large data but worst with little data
- small neural network comes 2nd in both cases
- traditional ML performs better than the rest with little data but worst with large data

What is the general goal of a convolutional neural network

Learning higher-level features from images

Neural networks do this hierarchically, i.e. they first learn low-level, then mid-level, then high-level features, and then train a classifier on top

What is the general structure of a CNN up to the feature maps

Data
Convolution
Activation
Pooling
Feature maps

CNN = multi-layer neural network with:

Local connectivity
-neurons in a layer are only connected to a small region of the layer before it
Shared weight parameters across spatial positions
-learning shift-invariant filter kernels

what parameters do convolutional kernels have

Kernel Size
Stride
Padding
Dilation
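
One way to see how these four parameters interact is the standard output-size formula for a convolution along one spatial dimension. The function name `conv_output_size` is made up for this sketch; the formula is the usual floor-based one.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial dimension."""
    # Dilation spreads the kernel taps apart, enlarging the region it covers
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (input_size + 2 * padding - effective_kernel) // stride + 1

same_size = conv_output_size(32, kernel_size=3, stride=1, padding=1)   # padding keeps the size
halved = conv_output_size(32, kernel_size=3, stride=2, padding=1)      # stride 2 halves it
dilated = conv_output_size(32, kernel_size=3, dilation=2)              # 3 taps cover 5 pixels
```

For a 2D image the same formula is applied separately to height and width.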

what are non-linear activation functions

Leaky ReLU
Parametric ReLU
Exponential Linear Units (ELU)

Good for when we want to model non-linear relationships

what is a pooling layer

taking the average or max over each small window of a feature map to reduce the feature map size for better efficiency
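
A minimal sketch of pooling on a feature map stored as a list of rows. The function name `pool2d`, the default 2x2 window and the sample values are assumptions made for the example.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Max or average pooling over square windows of a 2D feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    reduce_window = max if mode == "max" else (lambda w: sum(w) / len(w))
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current window
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reduce_window(window))
        pooled.append(pooled_row)
    return pooled

feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [3, 4, 6, 5],
]
max_pooled = pool2d(feature_map, mode="max")   # [[6, 4], [7, 9]]
avg_pooled = pool2d(feature_map, mode="avg")   # [[3.75, 1.75], [4.0, 7.0]]
```

A 4x4 map becomes 2x2, so later layers process a quarter of the values.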

What is the random wiring strategy for neural net design

Randomly wiring connections between neurons and testing how well the resulting networks perform; one example is Minsky's work

What is a non-random algorithmic approach to neural net design

Evolutionary algorithms:
-often out-perform human designed architectures
-keeps good parts of network and discards poor ones with some mutation probability

given a set of neural network architectures, we want to find best one. what can we reduce this problem to

search problem

what are 4 types of regularisation

-L2
-L1
-Dropout
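-Cross Entropy (strictly a loss function rather than a regulariser; early stopping or data augmentation would be a fourth regulariser)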
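
The first three items can be sketched as follows. The function names, the strength parameter `lam` and the use of inverted dropout (rescaling the kept activations) are assumptions for this illustration.

```python
import random

def l1_penalty(weights, lam):
    # L1: sum of absolute weights, pushes weights towards exactly zero
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    # L2: sum of squared weights, discourages large weights
    return lam * sum(w * w for w in weights)

def dropout(activations, drop_prob, rng):
    # Inverted dropout: zero each activation with probability drop_prob and
    # rescale the survivors so the expected value is unchanged
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

weights = [1.0, -2.0, 0.5]
data_loss = 0.8                                       # hypothetical unregularised loss
total_loss = data_loss + l2_penalty(weights, lam=0.1)
dropped = dropout([1.0] * 10, drop_prob=0.5, rng=random.Random(0))
```

The penalties are added to the training loss, while dropout is applied to activations during training only.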

what is the general form of gradient descent

w^(t+1) = w^(t) - lr * dL/dw(w^(t)), where L is the loss function, w the weights and lr the learning rate
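
As a quick worked example of the update rule, here it is applied to a made-up one-dimensional loss L(w) = (w - 3)^2, whose gradient is 2 * (w - 3). The loss, starting point and learning rate are all assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeat w <- w - lr * grad(w) for a fixed number of steps."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Gradient of the hypothetical loss L(w) = (w - 3)^2
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With lr = 0.1 each step shrinks the distance to the minimum at w = 3 by a factor of 0.8, so `w_min` ends up very close to 3.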

name 3 optimisation algorithms

Adam
RMSProp
AdaGrad
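
For reference, a single-parameter sketch of the Adam update with its commonly used default values for beta1, beta2 and eps. The test loss (w - 3)^2, the learning rate and the step count are assumptions chosen for the example.

```python
import math

def adam(grad, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimise a one-parameter function with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients (momentum)
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialisation
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_adam = adam(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

RMSProp keeps only the squared-gradient average `v`, and AdaGrad accumulates squared gradients without decay; Adam combines the RMSProp-style scaling with momentum.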

What are some different deep learning algorithms for computer vision

CNN:
-learns hierarchical representations of visual data; used for classification and as the backbone for detection and segmentation

R-CNN: (Region-based CNN)
-object detection algorithm combining region proposals with CNNs. It applies a CNN to each proposed region to classify and localize objects

Fast R-CNN:
-same idea as R-CNN but runs faster by sharing computation across regions. The CNN runs once over the whole image and a region-of-interest (RoI) pooling layer extracts features for each region from the shared feature map

What is a hidden layer, convolutional layer and pooling layer

Hidden layer:
-layer of forward propagation between the inputs and outputs, with weights and biases
-finds representations and relationships in a dataset
-in a CNN the hidden layers are convolutional, pooling and fully connected layers

Convolutional Layer:
-convolutional layers are filters with learnable parameters which capture patterns in an image

Pooling Layer:
-used to downsample the spatial dimensions of feature maps produced by convolutional layers
-this process retains the most salient features while reducing the spatial size of the feature maps

What is data augmentation

Adding diversity to the dataset to improve real-world performance. It may include:
-scaling, resizing
-translations
-rotations
-gaussian noise
-colour jittering
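
A rough sketch of three simple augmentations on a greyscale image stored as a list of rows. The function names and the tiny sample image are made up for this example; a real pipeline would normally use an image library instead.

```python
import random

def horizontal_flip(image):
    # Mirror each row left to right
    return [row[::-1] for row in image]

def translate_right(image, shift, fill=0.0):
    # Shift pixels to the right, filling the vacated columns with a constant
    width = len(image[0])
    shift = min(shift, width)
    return [[fill] * shift + row[: width - shift] for row in image]

def add_gaussian_noise(image, sigma, rng):
    # Add zero-mean gaussian noise with standard deviation sigma to every pixel
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]

image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
flipped = horizontal_flip(image)
shifted = translate_right(image, 1)
noisy = add_gaussian_noise(image, sigma=0.1, rng=random.Random(0))
```

Each transformed copy keeps the original label, so the model sees more varied examples of the same class.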

When is ReLU better than some other activation functions

When the network is very deep:
-more efficient to compute
-less likely to saturate (nodes stuck at outputs near -1 or +1, where gradients are close to zero), which gives faster convergence
-less prone to the vanishing gradient problem

What is the YOLO algorithm (simple)

You Only Look Once:
-object detection for locating and classifying multiple objects in an image in real time
-rather than using a sliding window, it treats detection as a regression problem and predicts likely bounding boxes and class probabilities using a single neural net

Describe the YOLO algorithm (detail)

-Input image is divided into a grid of cells, each responsible for predicting bounding boxes and class probabilities for objects in that cell
-Pre-defined anchor boxes with different shapes and sizes are used to predict the dimensions of bounding boxes
-Image is put through the CNN once; the CNN contains convolutional layers for feature extraction, then fully connected layers for regression and classification
-Outputs, for each bounding box, the probability that it contains an object plus the box's co-ordinates, width and height
-Non-maximum suppression deletes overlapping boxes with lower probability
-Final output is a set of bounding boxes with estimated probabilities
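
The non-maximum suppression step can be sketched as a greedy filter based on intersection over union (IoU). The (x1, y1, x2, y2) box format, the 0.5 threshold and the sample boxes are assumptions for this example; YOLO implementations typically run this per class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)   # the second box overlaps the first too much
```

Here the first two boxes cover nearly the same area, so only the higher-scoring one survives alongside the separate third box.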

Give some examples of supervised and unsupervised algorithms

supervised:
-CNNs
-SVM

unsupervised:
-GANs
-Autoencoders
-K-means

Deep Learning Flashcards

(29 cards)