Linear regression Flashcards
describe diffrent type of steps in datascience?
formulate question
gather data
clean data
explore and visualize data
train algorithm
evaluate model
why cleaning of data is required?
to deal with
missing data
incomplete data
inaccurate data
format data
what is pandas module?
it is an opensource library in python widely used for datascience and machine learning purposes
mainly used as a datastructure and data analysis tool
made on top of numpy
explain the following
-describe()
describe()-gives a details about the data like count,max,min,std etc..
what is matplotlib?
it is a data visualization library with which you can generate graphs,plot,histogram etc..
what is pyplot in matplotlib?
it is using pyplot we make use of matplotlib’s plotting properties.
The various plots we can utilize using Pyplot are Line Plot, Histogram, Scatter, 3D Plot, Image, Contour, and Polar.
explain the implementation of the following in pyplot
-plt.plot
-plt.scatter(x,y,alpha=)
-plt.show()
-plt.title()
-plt.xlabel,plt.ylabel
-plt.figure(figsize=())
-plt.xlim,plt.ylim
alpha-transperency
xlim,ylim-to limit x and yaxis
how many zeroes in a million
6
what is linear regression?explain θ0 and θ1 in hypothesis equation?explain intersept and slope?
explain actual and fitted value?what are residuals
hypothesis eqn=h(x)=θ0+θ1x
the best fitted line is so choosed so that the absolute sum of the square of residuals is minimum
why the term regression ?
outcome of the highest input tends to move towards the average of the outcomes(move backwards)
what is scikit-learn module?
scikit-learn is machine learning module in python
used for data mining and data analysis built on top of numpy,scipy and matplotlib
how do you implement a linear regression model in python?
from sklearn.linear_model import LinearRegression
regression=LinearRegression()
regression.fit(x,y)
prediction=regression.predict(x)
plot(x,prediction)
how do you interpret slope and intercept( θ1 and θ0)
in case of movie revenue prediction
slope represent m times the budget would be the revenue
intercept represent the revenue for a zero dollar budget movie
what is goodness of fit in regression?how it is coded in python?
goodness of fit represents how well a model fits the given set of data
why do we use %matplotlib inline?
it is used to display figures below each shell itself