Lecture 4 Flashcards
Parameters
Variables learned (found) during training, e.g., the weights
Hyperparameters
Variables whose values are set before the training process begins, e.g., the learning rate alpha and k in k-NN
Loss Function (or error)
Measures the error on a single training example and should be minimized to achieve the objective; a component of the cost function
Cost Function
Average of the loss function over the entire training set (e.g., mean squared error); a type of objective function
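A minimal sketch of the loss/cost distinction (function names and data are illustrative, assuming a regression setting with squared loss):

```python
import numpy as np

def squared_loss(y_pred, y_true):
    # Loss: the error on a single training example
    return (y_pred - y_true) ** 2

def mse_cost(y_pred, y_true):
    # Cost: the average of the per-example losses over the whole training set
    return np.mean(squared_loss(y_pred, y_true))

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8])
print(mse_cost(y_pred, y_true))  # 0.03
```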
Objective Function
Any function that you optimize during training (maximum likelihood, divergence between classes)
Gradient Descent
Finds optimal values for the weights w;
An iterative optimization algorithm that operates over a loss landscape (the cost function);
Follows the negative gradient of the cost with respect to w to reach the minimum cost (sketched in the code after the Learning Rate card)
Gradient of Cost Function
Points in the direction of steepest increase of the cost; stepping against it gives the direction of the steps that decrease the cost
Learning Rate
The size of the steps taken at each update, often denoted alpha
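A minimal sketch tying the last three cards together, assuming a 1-D quadratic cost so the gradient has a closed form (all names and values are illustrative):

```python
def cost(w):
    # A 1-D quadratic loss landscape with its minimum at w = 3
    return (w - 3.0) ** 2

def gradient(w):
    # Gradient of the cost: the direction of steepest increase
    return 2.0 * (w - 3.0)

w = 0.0        # initial weight
alpha = 0.1    # learning rate: the size of each step
for _ in range(100):
    w -= alpha * gradient(w)  # step against the gradient
print(w)  # converges toward 3.0, the minimum of the cost
```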
Sigmoid Function (Logistic Function)
Assumes a particular functional form (a sigmoid) applied to a linear function of the data; the output is a smooth, differentiable function of the inputs and the weights
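A sketch of the logistic function applied to a linear function of the data (the feature and weight values are illustrative):

```python
import numpy as np

def sigmoid(z):
    # Smooth, differentiable map from the reals into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])      # one example's features
w = np.array([0.8, 0.3])       # weights
b = 0.1                        # bias
p = sigmoid(np.dot(w, x) + b)  # predicted probability of class 1
print(p)
```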
Cross Entropy (Logarithmic Loss)
Compares the predicted class probability to the actual class (0 or 1);
The resulting score penalizes the predicted probability based on how far it is from the actual value;
The penalty is logarithmic in nature: small deviations cost little, while confident wrong predictions cost a lot
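A sketch of the binary cross-entropy score (the eps clipping is an illustrative numerical safeguard, not part of the definition):

```python
import numpy as np

def cross_entropy(p, y, eps=1e-12):
    # y is the actual class (0 or 1); p is the predicted probability of class 1
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

print(cross_entropy(0.9, 1))  # ~0.105: confident and correct, small penalty
print(cross_entropy(0.1, 1))  # ~2.303: confident and wrong, large penalty
```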
Regularization
Any modification to a learning algorithm that is intended to reduce its generalization error but not its training error;
Mitigates overfitting;
Useful when there are not enough samples to build a good logistic regression classification model otherwise.
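A sketch of one common form, an L2 (squared-weight) penalty added to a cost function; the function name, lam, and the data are illustrative:

```python
import numpy as np

def l2_regularized_cost(w, X, y, lam=0.01):
    # Data term: mean squared error of a linear model
    data_cost = np.mean((X @ w - y) ** 2)
    # Penalty term: discourages large weights to reduce generalization error
    return data_cost + lam * np.sum(w ** 2)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([1.0, 0.0])
w = np.array([0.5, -0.5])
print(l2_regularized_cost(w, X, y))  # 1.255
```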
Activation Functions
Applied to the hidden units;
Introduce nonlinearity;
Popular examples include sigmoid, tanh, and ReLU
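A sketch of those three common activation functions applied elementwise to hidden-unit inputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Rectified linear unit: zero for negative inputs, identity otherwise
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # squashes into (0, 1)
print(np.tanh(z))   # squashes into (-1, 1)
print(relu(z))      # [0. 0. 2.]
```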
Feedforward
Calculates the predicted output (ŷ); inference
Backpropagation
Updates the weights and biases; learning
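A minimal sketch of both passes for a single logistic unit trained with cross entropy (the toy data and learning rate are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one feature, binary labels
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b, alpha = np.zeros(1), 0.0, 0.5

for _ in range(1000):
    # Feedforward: compute the predicted output y_hat (inference)
    y_hat = sigmoid(X @ w + b)
    # Backpropagation: gradient of the cross-entropy loss updates w and b (learning)
    error = y_hat - y
    w -= alpha * (X.T @ error) / len(y)
    b -= alpha * np.mean(error)

print(np.round(sigmoid(X @ w + b), 2))  # probabilities approach the labels
```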
Artificial Neural Networks
- Highly expressive non-linear functions
- Highly parallel network of logistic function units
- Minimizes the sum of squared training errors plus the squared weights (regularization)
- Uses gradient descent as the training procedure
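A sketch tying these points together: a small network of logistic units trained by gradient descent on squared error plus a squared-weight penalty (the XOR task, layer sizes, alpha, and lam are all illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# XOR: a non-linear function that no single logistic unit can represent
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # hidden layer of logistic units
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # output unit
alpha, lam = 0.5, 1e-4  # learning rate and regularization strength

for _ in range(10000):
    # Feedforward through the network
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    # Backpropagate the gradient of: sum of squared errors + lam * squared weights
    d_out = (y_hat - y) * y_hat * (1 - y_hat)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    W2 -= alpha * (h.T @ d_out + lam * W2); b2 -= alpha * d_out.sum(axis=0)
    W1 -= alpha * (X.T @ d_hid + lam * W1); b1 -= alpha * d_hid.sum(axis=0)

print(np.round(y_hat.ravel(), 2))  # typically approaches [0, 1, 1, 0]
```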