Optimisation and gradient descent algorithm Flashcards
explain simply how a machine learning model works?
predict -> calculate error -> learn -> predict
this is called an algorithm
what is an algorithm?
an algorithm is a set of mathematical instructions for solving a problem
it is basically a word used by programmers when they don’t want to explain what they did
where did the term algorithm originate from?
Muhammad ibn Musa al-Khwarizmi, a ninth-century Persian mathematician, wrote a popular mathematics book of his time; when the book was translated into Latin, the translators were confused by his name and rendered it as "algorithm"
what is a cost function in machine learning?
-a cost function is an important parameter in deciding how well a machine learning model fits the dataset
-it is the sum of squares of the differences between the actual and fitted values
-we need a function that can find where the model is most accurate, anywhere between an undertrained and an overtrained model
-by minimizing the value of the cost function we get an optimal solution
-a cost function is a measure of how wrong the model is in estimating the relationship between X (input) and Y (output)
-also called a loss function, error function, etc. (a small sketch follows)
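a minimal sketch (the actual and fitted arrays here are made up for illustration):
import numpy as np
actual = np.array([3.0, 5.0, 7.0, 9.0])   # hypothetical actual values
fitted = np.array([2.8, 5.3, 6.9, 9.4])   # hypothetical fitted values
cost = np.sum((actual - fitted) ** 2)     # sum of squared differences
print('cost:', cost)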
what is LaTeX markup?
it is a syntax for writing down mathematical expressions (e.g. inside markdown cells)
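for example, writing the line below in a markdown cell renders it as a formatted formula:
$$f(x) = x^2 - 4x + 5$$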
what is linspace in numpy?
it generates an array of linearly spaced numbers between a and b
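for example:
import numpy as np
x = np.linspace(start=0, stop=5, num=11)  # 11 evenly spaced values from a=0 to b=5, endpoints included
print(x)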
what is subplot and how do you implement it in matplotlib?
it is used to display multiple plots in one figure, for example two plots side by side; see the sketch below
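a minimal sketch of two plots side by side:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-3, 3, 100)
plt.subplot(1, 2, 1)   # grid of 1 row x 2 columns, first plot
plt.plot(x, x ** 2)
plt.subplot(1, 2, 2)   # second plot in the same grid
plt.plot(x, 2 * x)
plt.show()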
explain the cost function implementation in python
-represent the function
-represent the derivative of the function
-at the minimum of f(x) the slope of the function is zero, which can be read off the derivative plot (see the sketch below)
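a minimal sketch, assuming the simple quadratic cost f(x) = x**2 - 4x + 5 (the course's exact function may differ); the f and df defined here are reused by the gradient descent code below:
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    # the cost function
    return x ** 2 - 4 * x + 5

def df(x):
    # derivative of the cost function (the slope)
    return 2 * x - 4

x = np.linspace(-1, 5, 100)
plt.plot(x, f(x))    # cost curve, minimum at x = 2
plt.plot(x, df(x))   # slope curve, crosses zero at the minimum
plt.show()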
explain briefly about gradient descent algorithm?
Gradient Descent is an optimization algorithm used for minimizing the cost function in various machine learning algorithms.
visualize the 3d model of the cost function
it is a convex, bowl-shaped surface (convex downward), so it has a single minimum
Implement an optimization algorithm in python
Gradient Descent
new_x = 3
previous_x = 0
step_multiplier = 0.1
precision = 0.00001
x_list = [new_x]
slope_list = [df(new_x)]
for n in range(500):
    previous_x = new_x
    gradient = df(previous_x)
    new_x = previous_x - step_multiplier * gradient
    step_size = abs(new_x - previous_x)
    # print(step_size)
    x_list.append(new_x)
    slope_list.append(df(new_x))
    if step_size < precision:
        print('Loop ran this many times:', n)
        break
print('Local minimum occurs at:', new_x)
print('Slope or df(x) value at this point is:', df(new_x))
print('f(x) value or cost at this point is:', f(new_x))
the scatter function can plot a list of x values: true or false?
False
in these notebooks the list is converted to an array with numpy before plotting; numpy arrays also support the element-wise arithmetic (such as f(x)) that plain lists do not, as shown below
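for example, reusing the f from the earlier quadratic sketch (x_list stands in for the list gathered during the loop):
import numpy as np
import matplotlib.pyplot as plt
x_list = [3, 2.8, 2.64, 2.512]   # plain python list
x_arr = np.array(x_list)         # convert list to array
plt.scatter(x_arr, f(x_arr))     # arrays allow the element-wise f(x_arr)
plt.show()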
what happens with gradient descent when there are maxima, local minima and global minima?
gradient descent depends on the initial guess
if the initial guess is near a local minimum, the algorithm converges there instead of the global minimum, giving the wrong output (see the sketch after the gradient_descent function below)
implement gradient descent by calling a function?
def gradient_descent(df, initial_guess, step_multiplier=0.02, precision=0.001):
    new_x = initial_guess
    x_list = [new_x]
    slope_list = [df(new_x)]
    for n in range(500):
        previous_x = new_x
        gradient = df(previous_x)
        new_x = previous_x - step_multiplier * gradient
        step_size = abs(new_x - previous_x)
        x_list.append(new_x)
        slope_list.append(df(new_x))
        if step_size < precision:
            print('Loop ran this many times:', n)
            break
    return new_x, x_list, slope_list

localmin, x_list, slope_list = gradient_descent(df, 0, 0.02, 0.001)
print(localmin)
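tying back to the local-minima card above, a quick sketch with a hypothetical double-well function g(x) = x**4 - 4*x**2 + 5, whose two minima sit at x = ±sqrt(2):
def dg(x):
    # derivative of the hypothetical g(x)
    return 4 * x ** 3 - 8 * x

left_min, _, _ = gradient_descent(dg, initial_guess=-0.1)
right_min, _, _ = gradient_descent(dg, initial_guess=0.1)
print(left_min, right_min)  # roughly -1.414 and 1.414: the converged point depends on the initial guess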
what is the difference between stochastic and batch gradient descent?
batch gradient descent computes the gradient over the entire dataset for every update, while stochastic gradient descent updates using one randomly chosen sample at a time
the randomness makes each step cheaper, and the noise in the updates can help the algorithm escape shallow local minima; a sketch follows
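a minimal sketch of the difference for a one-parameter linear model y ≈ w*x (all names and data here are hypothetical): batch uses every datapoint per update, stochastic uses one randomly picked datapoint:
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 3 * x + rng.normal(0, 0.1, 50)   # synthetic data with true slope 3

w_batch, w_sgd, lr = 0.0, 0.0, 0.1
for _ in range(200):
    w_batch -= lr * np.mean(2 * (w_batch * x - y) * x)  # gradient over the whole dataset
    i = rng.integers(len(x))
    w_sgd -= lr * 2 * (w_sgd * x[i] - y[i]) * x[i]      # gradient from one random sample

print(w_batch, w_sgd)  # both approach 3; the stochastic estimate is noisier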
what are divergence and overflow in gradient descent, how do they occur, and how can you solve them?
divergence occurs when the learning rate is too large, so each step overshoots the minimum further than the last; overflow happens when the resulting value becomes too large for the system to handle
it can be solved by reducing the learning rate or limiting the number of iterations (demo below)
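a tiny demo of divergence, reusing the quadratic df from the earlier sketch; a far-too-large multiplier makes every step overshoot further:
new_x = 3
for n in range(50):
    new_x = new_x - 1.5 * df(new_x)  # a multiplier of 1.5 is far too large for this function
print(new_x)  # the distance from the minimum doubles each step; with enough iterations the value overflows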
what is the sys module in python?
the sys module gives various information about the python runtime environment
for example, the largest floating-point number python can deal with
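for example:
import sys
print(sys.float_info.max)  # largest float python can represent, about 1.8e308
print(sys.version)         # runtime version information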
what is tuple packing and tuple unpacking?
packing: breakfast = "bacon", "beans", "avocado"
unpacking: x, y, z = breakfast
what is learning rate in gradient descent algorithm?
the learning rate decides how fast the algorithm converges to the minimum
if the learning rate is small, it takes more time to converge
if the learning rate is large, it might diverge and never converge to the minimum
in our example the learning rate is changed via step_multiplier
what is the bold driver learning-rate mechanism?
if the cost function has decreased since the last iteration, increase the learning rate by 5%
if the cost function has increased since the last iteration (the algorithm overshot the minimum), go back to the last iteration and reduce the learning rate by 50% (a sketch follows)
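a minimal sketch of the bold driver rule grafted onto the earlier loop (f and df as in the quadratic sketch above; the 5% and 50% adjustments follow the card):
new_x, learning_rate = 3, 0.1
prev_cost = f(new_x)
for n in range(500):
    candidate = new_x - learning_rate * df(new_x)
    if f(candidate) < prev_cost:
        new_x = candidate               # cost fell: keep the step
        prev_cost = f(new_x)
        learning_rate *= 1.05           # and speed up by 5%
    else:
        learning_rate *= 0.5            # cost rose: discard the step and halve the rate
print(new_x, learning_rate)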
how can you create a 3d model of the cost function in python? what is cmap and how is it implemented?
# generate 3d plot
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D
from matplotlib import cm

fig = plt.figure(figsize=(16, 12))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') is deprecated in newer matplotlib; gca means get current axes
ax.set_xlabel('x', fontsize=20)
ax.set_ylabel('y', fontsize=20)
ax.set_zlabel('f(x,y) - cost', fontsize=20)
ax.plot_surface(x4, y4, f(x4, y4), cmap=cm.coolwarm, alpha=0.4)  # x4, y4 are meshgrid arrays
plt.show()
cmap (colormap) maps the surface's height values to colours; cm.coolwarm shades low cost blue and high cost red, and alpha=0.4 makes the surface translucent
what is a bug?
an unintended behaviour or defect in a program that causes it to crash or malfunction
How do you find the partial derivative of a function in python? what does symbols do?
from sympy import symbols, diff
a, b = symbols('x, y')  # creates symbolic variables named x and y, bound to a and b (we can now build f(a, b) symbolically)
f(a, b)
diff(f(a, b), a)  # partial derivative of f w.r.t. a
f(a, b).evalf(subs={a: 1.8, b: 1.0})  # evaluate f(1.8, 1.0)
diff(f(a, b), a).evalf(subs={a: 1.8, b: 1.0})  # evaluate the partial derivative at (1.8, 1.0)
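a self-contained sketch with a hypothetical f(x, y) = x**2 + y**2 (the course's actual function differs):
from sympy import symbols, diff

a, b = symbols('x, y')
f_sym = a ** 2 + b ** 2                              # hypothetical cost expression
print(diff(f_sym, a))                                # partial derivative w.r.t. x: 2*x
print(f_sym.evalf(subs={a: 1.8, b: 1.0}))            # f(1.8, 1.0) = 4.24
print(diff(f_sym, a).evalf(subs={a: 1.8, b: 1.0}))   # slope w.r.t. x at (1.8, 1.0) = 3.6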
implement batch gradient descent for a multivariable cost function?
# batch gradient descent with python
for a multivariable function there are two partial derivatives, w.r.t. x and w.r.t. y; both must be used to find the minimum
multiplier = 0.1
max_iter = 200
params = np.array([1.8, 1.0])  # initial guess
for i in range(max_iter):
    gradient_x = diff(f(a, b), a).evalf(subs={a: params[0], b: params[1]})
    gradient_y = diff(f(a, b), b).evalf(subs={a: params[0], b: params[1]})
    gradients = np.array([gradient_x, gradient_y])
    params = params - multiplier * gradients
print(params[0], params[1])
print('cost is', f(params[0], params[1]))
what is the drawback of the sympy module?
computational time is higher, since sympy has to differentiate the function symbolically every time the loop runs; writing each partial derivative out as a plain python function reduces the time required (sketch below)
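for instance, for the hypothetical f(x, y) = x**2 + y**2 the partials can be hard-coded, so no symbolic differentiation happens inside the loop:
def fpx(x, y):
    # partial derivative of x**2 + y**2 w.r.t. x, written out by hand
    return 2 * x

def fpy(x, y):
    # partial derivative w.r.t. y
    return 2 * y

gradients = np.array([fpx(params[0], params[1]), fpy(params[0], params[1])])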
what type of datastructure can be used to plot a 3d function? how do you create that datastructure?
a 2d array
kirk = np.array([['Captain', 'Guitar']])
print(kirk.shape)  # (1, 2)
hs_band = np.array([['Black Thought', 'MC'], ['Questlove', 'Drums']])
print(hs_band.shape)  # (2, 2)
print('hs_band[0] :', hs_band[0])
print('hs_band[0][1] :', hs_band[0][1])
or you can use the reshape function
How do you append data to a 2d array? what is axis?
kirk = np.array([['Captain', 'Guitar']])
hs_band = np.array([['Black Thought', 'MC'], ['Questlove', 'Drums']])
the_roots = np.append(arr=hs_band, values=kirk, axis=0)  # append kirk as a new row
print(the_roots)
axis defines whether you append the data by row (axis=0) or by column (axis=1)
if you append by row, the number of columns must match; if you append by column, the number of rows must match
i.e. the dimensions should match, which you can ensure by reshaping the array, as in the sketch below
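for instance (names hypothetical), a flat array must be reshaped into a matching row before appending:
import numpy as np
grid = np.array([[1, 2], [3, 4]])
new_row = np.array([5, 6])   # shape (2,) would not match the (2, 2) grid
grid = np.append(arr=grid, values=new_row.reshape(1, 2), axis=0)
print(grid.shape)  # (3, 2)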
how do you access a particular row or column in a 2d array?
print('Printing nicknames...', the_roots[:, 0])
: selects all the rows
0 selects the first column
explain ways in which you can add elements to a 2d array?
values_array = np.append(values_array, params.reshape(1, 2), axis=0)
values_array = np.concatenate((values_array, params.reshape(1, 2)), axis=0)
what is the need for MSE when there is RSS?
with a large number of datapoints the RSS grows very large and can cause an overflow error; dividing it by the number of datapoints (giving the MSE) keeps the value easy to deal with
write a python code to return MSE without using a for loop when two arrays are passed as input
# define a function to return the MSE
def MSE(pred, actu):
    # element-wise subtraction works because pred and actu are numpy arrays
    mse_calc = (1 / len(pred)) * sum((pred - actu) ** 2)
    return mse_calc

mse = MSE(pred_v, actu_v)
print(mse)
where pred and actu are arrays
what is an array? is a tuple an array? what about a dictionary? what is the difference between an array and a dictionary?
an array is a collection of elements of the same datatype in contiguous memory locations
a tuple behaves like an array when its elements share a datatype, but it is immutable
a dictionary is like an array, but elements are accessed by keys instead of indices
what is the difference between the meshgrid and reshape functions?
meshgrid expands the input arrays into 2d coordinate grids by repeating their elements
Input : x = [0, 1, 2, 3, 4, 5]
y = [2, 3, 4, 5, 6, 7, 8]
x_1, y_1 = np.meshgrid(x, y)
Output :
x_1 = array([[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4., 5.]])
y_1 = array([[2., 2., 2., 2., 2., 2.],
[3., 3., 3., 3., 3., 3.],
[4., 4., 4., 4., 4., 4.],
[5., 5., 5., 5., 5., 5.],
[6., 6., 6., 6., 6., 6.],
[7., 7., 7., 7., 7., 7.],
[8., 8., 8., 8., 8., 8.]])
reshape cannot add new elements; it can only rearrange the existing elements into a new shape
x = np.arange(12)          # 12 elements
y = np.reshape(x, (4, 3))  # rearranged into 4 rows and 3 columns
how do you access all elements of the rows and columns separately using two for loops?
for an n x n matrix:
for i in range(n):
    for j in range(n):
        x = matrix[i][j]  # walks along the elements of row i
        y = matrix[j][i]  # walks along the elements of column i
what does unravel_index do in numpy?
it converts a flat index into a (row, column) index for a given shape; here it finds where the minimum of plot_cost sits
ij_min = np.unravel_index(indices=plot_cost.argmin(), shape=plot_cost.shape)
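a tiny demo on a made-up 2x3 matrix:
import numpy as np
m = np.array([[9, 4, 7],
              [2, 8, 5]])
row, col = np.unravel_index(m.argmin(), m.shape)
print(row, col)  # 1 0, since the minimum value 2 sits at row 1, column 0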
find the partial derivatives of the mean squared error by substituting the hypothesis equation?
substituting the hypothesis y_hat = theta0 + theta1*x into MSE = (1/n) * sum((y - y_hat)**2) gives two separate equations, one per partial derivative:
d(MSE)/d(theta0) = -(2/n) * sum(y - theta0 - theta1*x)
d(MSE)/d(theta1) = -(2/n) * sum((y - theta0 - theta1*x) * x)
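as a sketch, the two equations translate to numpy like this (x and y are the data arrays, thetas holds theta0 and theta1; the names are my own):
import numpy as np

def grad(x, y, thetas):
    # partial derivatives of MSE for the hypothesis y_hat = theta0 + theta1*x
    n = y.size
    y_hat = thetas[0] + thetas[1] * x
    slope_0 = (2 / n) * sum(y_hat - y)        # d(MSE)/d(theta0)
    slope_1 = (2 / n) * sum((y_hat - y) * x)  # d(MSE)/d(theta1)
    return np.array([slope_0, slope_1])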
how do the actual cost function and the study cost function differ?
in the actual cost function the variables are theta0 and theta1
in the study cost function the variables are x and y
in real machine learning problems we find the optimal values of the thetas with the gradient descent algorithm