Regression Flashcards
Is it possible to create models in machine learning?
Machine learning is mainly about creating mathematical relations between features of a dataset. These mathematical relations are called models.
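As a minimal sketch of that idea, the snippet below fits a straight line to a few made-up points with scikit-learn; the learned coefficient and intercept are the "mathematical relation" the flashcard refers to (all numbers are purely illustrative):

```python
# Minimal sketch: a "model" is a learned mathematical relation
# between features and a target. Data here is made up.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # input feature
y = np.array([2.1, 4.0, 6.2, 7.9])           # target, roughly y = 2x

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)          # the learned relation: y ≈ coef*x + intercept
```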
How was linear regression done prior to machine learning?
Long before machine learning, it was done by hand by mathematicians and statisticians, usually on very small datasets. With the availability of computers we now have high computational power, so today we can build linear regression models with virtually any amount of data.
What was fearless about ML?
It’s a reference to the fact that ML practitioners usually don’t spend much time evaluating the statistical validity of a method; they prefer to simply build a model and judge it by its performance rather than by the statistical rigour of the process.
If we don’t understand how a model’s intricacies work, how can we be 100% sure that it’s a successful model?
A mathematical interpretation of how the model works is not always possible. The algorithm’s mathematical steps are derived when it is designed, but when the model is applied it is not always possible, or even necessary, to trace those steps. It is the model’s performance on unseen data that decides its success.
Do more data points bring new insights?
Yes. With larger data, insights are supported by more evidence and are therefore more trustworthy. The same goes for the quality of model training: with more data, models train better.
Why is the model interpretation important?
Model interpretation is important because it gives an understanding of how the model reached its final results/predictions. It provides clarity and an indication of whether changes need to be made to make the model perform better.
What are the features of a machine learning model?
Features are the attributes or variables that are used in a machine learning model. Generally, there are independent and dependent features. Independent features are the input features of the model while the dependent feature is the output variable.
Is risk similar to variance?
For a machine learning model, the higher the error, the greater the chance of failure on unseen data, so the error is a measure of the model’s risk. Risk can therefore be interpreted as the error associated with the model, and minimizing risk is equivalent to minimizing error or variance.
What are overfitting and underfitting?
Overfitting is the case where a machine learning model follows the noise in the data too closely; it becomes too complex to make good predictions on unseen data. Underfitting is the opposite case, where the model is too simple to capture the patterns in the data, so it is equally unable to generalize to unseen data.
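A quick illustrative sketch: fitting polynomials of increasing degree to the same noisy synthetic data shows the degree-1 model underfitting (low score everywhere) and the degree-15 model overfitting (high train score, lower test score). The data and degrees are arbitrary choices for the demonstration:

```python
# Sketch: varying polynomial degree to illustrate under- and overfitting
# on made-up noisy data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)  # noisy nonlinear data

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for degree in (1, 4, 15):  # too simple, moderate, very flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(degree, model.score(X_tr, y_tr), model.score(X_te, y_te))
```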
What does interpolation mean?
Interpolation is a statistical method by which related known values are used to estimate an unknown value. It is useful in treating missing values in machine learning.
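For instance, pandas can interpolate missing values from the neighbouring known values; a minimal sketch with a made-up series:

```python
# Sketch: filling missing values by linear interpolation with pandas.
import pandas as pd
import numpy as np

s = pd.Series([1.0, 2.0, np.nan, 4.0, np.nan, 6.0])
print(s.interpolate())  # NaNs are estimated from neighbouring known values
```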
Why do we square the residual?
Partly for mathematical convenience: if we don’t square the residuals, the positive and negative values cancel each other out, and the total error can come out as zero or close to zero even for a poor fit, which is misleading.
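A tiny numeric demonstration with made-up values: the raw residuals of a clearly poor fit sum to zero, while the squared residuals do not:

```python
# Sketch: raw residuals can cancel to 0 even for a poor fit,
# while squared residuals expose the error.
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([2.0, 1.0, 4.0, 3.0])   # clearly off, yet...

residuals = y_true - y_pred
print(residuals.sum())         # 0.0 — positives and negatives cancel
print((residuals ** 2).sum())  # 4.0 — squaring exposes the error
```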
What is the limit of the error made by a regression model?
There is no defined limit on the errors made by a linear regression model. The only condition is that the sum of squared error terms should be minimized.
Can we use either normalization or standardization?
We use either standardization or normalization because it brings all the variables to a uniform scale. If we don’t, a less important variable that happens to have a larger numeric range can be given more priority.
Is it acceptable if the slope of a variable is very small in the model?
The slope of a variable tells us how much the dependent variable changes for a unit increase in that independent variable while keeping the other variables constant. The larger the slope, the larger the change, and vice versa. A very small slope indicates that the feature has little influence on the target variable and is a candidate for removal (assuming the features are on comparable scales).
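A small sketch with synthetic data: one feature is given a large true coefficient and the other a tiny one, and the fitted slopes reflect that difference:

```python
# Sketch: reading slopes from a fitted model (illustrative data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = 3.0 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)
print(model.coef_)  # ~[3.0, 0.01]: the second feature barely moves the target
```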
Is it always good to add more variables that could potentially have an influence on the outcome?
Yes, it is good to add more variables, provided they genuinely influence the outcome. That makes the predictions more generic and reliable on unseen data.
Is there a way to validate the independence of error terms?
Yes, this is done with hypothesis testing; for example, the Durbin-Watson test checks the residuals for autocorrelation.
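As a hedged sketch of that check, assuming statsmodels is available: the Durbin-Watson statistic is computed on the residuals of an OLS fit over synthetic data; values near 2 suggest no autocorrelation:

```python
# Sketch: Durbin-Watson statistic on OLS residuals (statsmodels);
# values near 2 suggest independent (non-autocorrelated) errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(100, 1)))
y = X @ np.array([1.0, 2.0]) + rng.normal(size=100)

results = sm.OLS(y, X).fit()
print(durbin_watson(results.resid))
```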
What do we mean by “linear predictor”?
A linear predictor means a linear relationship between the output and the input variables. The term “linear” refers to the coefficients/parameters of the model: any equation that is not linear in its parameters is a non-linear relation.
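To make that concrete: a quadratic curve is still a linear model, because it is linear in the parameters once x² is treated as just another feature. A sketch with noise-free illustrative data:

```python
# Sketch: "linear" refers to the parameters, not the features.
# y = b0 + b1*x + b2*x**2 is still linear in b0, b1, b2.
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.linspace(-2, 2, 50)
X = np.column_stack([x, x ** 2])      # features: x and x squared
y = 1.0 + 2.0 * x - 3.0 * x ** 2      # quadratic curve, linear in parameters

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)  # recovers ~1.0 and [2.0, -3.0]
```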
Don’t we divide the TSS by ‘n’ so we have something somehow ‘independent’ of the number of samples we take?
Dividing the total sum of squares by n would make no difference, because the residual sum of squares would also be divided by n. Since we take the ratio between them, the n cancels out, so there is no need to divide the TSS by n.
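A quick numeric check with made-up values, showing that dividing both sums by n leaves R² unchanged:

```python
# Sketch: R² = 1 - RSS/TSS, so dividing both sums by n changes nothing.
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.8, 5.1, 7.3, 8.8])

rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)
n = len(y)
print(1 - rss / tss)              # R²
print(1 - (rss / n) / (tss / n))  # identical — the n cancels
```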
Can we interpret R² as a measure to understand how linear our dataset is?
No. R² is used to explain how much of the variance in the data our model can capture; the higher the value, the better the performance of the model.
How to choose between Normalization and Standardization?
Normalization typically means that the values are rescaled to a range from 0 to 1, while standardization typically means that each value is re-expressed as the number of standard deviations it lies from its mean.
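A minimal side-by-side sketch using scikit-learn’s MinMaxScaler and StandardScaler on the same made-up column:

```python
# Sketch: the two scalers side by side on one illustrative column.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[10.0], [20.0], [30.0], [40.0]])

print(MinMaxScaler().fit_transform(X).ravel())    # normalized to [0, 1]
print(StandardScaler().fit_transform(X).ravel())  # standard deviations from the mean
```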
What is an acceptable R-squared in the real world?
The higher the value of R-squared, the better the fit of the model on the given data, because a higher R-squared implies a lower residual sum of squares. There is no universal threshold, though; what counts as acceptable depends on the field and on how noisy the data is.
Do we call it R squared because the square is mathematically useful somewhere else?
R is the correlation between the predicted values and the observed values of Y. R squared is the square of this coefficient and indicates the percentage of the total variation that is explained by your regression line.
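A small sketch verifying that identity on synthetic data: for ordinary least squares with an intercept, the squared correlation between observed and predicted values matches the model’s reported R²:

```python
# Sketch: for OLS with an intercept, R² equals the squared correlation
# between observed and predicted values.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.5, -0.5]) + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
y_hat = model.predict(X)

r = np.corrcoef(y, y_hat)[0, 1]
print(r ** 2)             # squared correlation...
print(model.score(X, y))  # ...matches the model's reported R²
```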
Can we screen the features before training, for example by finding the correlation between the features and the labels?
Yes, we can find the correlation between the variables, and if any two variables are highly correlated we drop one of them, as it is redundant and adds no new information to the model we develop. Beyond that, exploratory data analysis lets us remove variables based on many other criteria.
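A minimal sketch with a made-up data frame where one feature nearly duplicates another, so the correlation matrix flags the redundancy:

```python
# Sketch: inspecting pairwise correlations before training.
import pandas as pd
import numpy as np

rng = np.random.default_rng(4)
df = pd.DataFrame({"a": rng.normal(size=100)})
df["b"] = df["a"] * 0.95 + rng.normal(scale=0.1, size=100)  # nearly duplicates "a"
df["label"] = 2 * df["a"] + rng.normal(size=100)

print(df.corr())  # "a" and "b" are highly correlated — one can be dropped
```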
Do we need to test for multicollinearity in case of more variables?
Yes, we do need to check for multicollinearity before we develop a model, and we drop the redundant variables. If a variable has a VIF value greater than 5, we generally drop it.
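A hedged sketch of that check using statsmodels’ variance_inflation_factor, with synthetic data in which the third feature is nearly a copy of the first (so its VIF comes out large):

```python
# Sketch: variance inflation factors with statsmodels; a VIF > 5
# flags a variable for possible removal, per the rule of thumb above.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
X = np.column_stack([X, X[:, 0] + rng.normal(scale=0.05, size=200)])  # 3rd ≈ 1st
X = sm.add_constant(X)

for i in range(1, X.shape[1]):  # skip the constant column
    print(i, variance_inflation_factor(X, i))
```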