Non-linearity Flashcards
What is the role of polynomial regression in capturing non-linear relationships in data? Discuss the potential drawbacks of using high-degree polynomial regression and suggest strategies to mitigate these drawbacks.
Polynomial regression adds powers of a predictor (x, x^2, ..., x^d) as features, so a model that is still linear in its coefficients can capture curved, more complex relationships between variables. Too high a degree fits the training data better but tends to overfit; conversely, too low a degree underfits. High-degree fits are also more sensitive to small changes in the data, especially near the boundaries of the data range.
The degree is usually chosen by cross-validation.
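A minimal sketch of choosing the degree by k-fold cross-validation, using only NumPy. The helper name `cv_mse_for_degree`, the synthetic cubic data, and the candidate range 1 to 8 are assumptions made for illustration, not part of the original notes.

```python
import numpy as np

def cv_mse_for_degree(x, y, degree, k=5, seed=0):
    """Average validation MSE of a degree-`degree` polynomial over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], degree)  # fit on training folds only
        pred = np.polyval(coefs, x[val])                # predict the held-out fold
        errors.append(np.mean((y[val] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic data (assumption): a cubic signal plus Gaussian noise
rng = np.random.default_rng(42)
x = np.linspace(-2, 2, 80)
y = x**3 - 2 * x + rng.normal(scale=0.5, size=x.size)

scores = {d: cv_mse_for_degree(x, y, d) for d in range(1, 9)}
best_degree = min(scores, key=scores.get)
print(best_degree, scores[best_degree])
```

Degrees 1 and 2 cannot represent the cubic signal, so their validation error stays high (underfitting), while very high degrees typically stop improving or get worse (overfitting).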
We can also add a penalty term that shrinks the coefficients, either Ridge (L2 penalty) or Lasso (L1 penalty).
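As a rough illustration of the Ridge option, the sketch below solves the penalized normal equations on polynomial features. The function name `ridge_poly_fit`, the choice to leave the intercept unpenalized, and the data are assumptions; Lasso has no closed form like this and is not shown.

```python
import numpy as np

def ridge_poly_fit(x, y, degree, lam):
    """Ridge fit on polynomial features; the intercept is left unpenalized."""
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    penalty = lam * np.eye(degree + 1)
    penalty[0, 0] = 0.0                            # do not shrink the intercept
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Synthetic data (assumption): a sine curve plus noise
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.size)

beta_ols = ridge_poly_fit(x, y, degree=9, lam=0.0)    # no penalty
beta_ridge = ridge_poly_fit(x, y, degree=9, lam=1.0)  # with penalty
print(np.linalg.norm(beta_ols[1:]), np.linalg.norm(beta_ridge[1:]))
```

The penalized coefficients have a smaller norm than the unpenalized ones, which is what tames a high-degree polynomial. In practice the penalty strength `lam` would itself be chosen by cross-validation.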
Step functions: Explain the concepts of underfitting and overfitting in the context of non-linear models. How do step functions contribute to underfitting, and what are the pros and cons of using step functions in capturing non-linear patterns?
A step function splits the range of a predictor into intervals (bins) and fits a constant in each one. They are easy to implement (just set the boundaries), computationally cheap, and good for categorical approximations.
Cons:
They can lose the distinctions of the original function: all variation within an interval is flattened to a single value.
They are discontinuous, which might not be desirable.
We need to know a lot about the data to place the steps well.
So with too simplistic a model (few steps) we underfit, and with too many steps we overfit.
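A small sketch of a step-function fit, where the prediction in each interval is the mean of the responses that fall in it. The function names, the cutpoints, and the noise-free sine data are assumptions for illustration; the code also assumes every interval contains at least one observation.

```python
import numpy as np

def fit_step_function(x, y, cutpoints):
    """Piecewise-constant fit: one mean of y per interval."""
    bins = np.digitize(x, cutpoints)  # interval index 0..len(cutpoints)
    return np.array([y[bins == b].mean() for b in range(len(cutpoints) + 1)])

def predict_step_function(x_new, cutpoints, means):
    """Look up the fitted constant for the interval each x_new falls in."""
    return means[np.digitize(x_new, cutpoints)]

x = np.linspace(0, 10, 200)
y = np.sin(x)

coarse_cuts = np.array([5.0])       # 2 intervals: very rigid
fine_cuts = np.linspace(1, 9, 9)    # 10 intervals: more flexible

coarse_means = fit_step_function(x, y, coarse_cuts)
fine_means = fit_step_function(x, y, fine_cuts)

mse_coarse = np.mean((y - predict_step_function(x, coarse_cuts, coarse_means)) ** 2)
mse_fine = np.mean((y - predict_step_function(x, fine_cuts, fine_means)) ** 2)
print(mse_coarse, mse_fine)
```

With a single cutpoint the fit misses most of the curve (underfitting); with more intervals the training error drops, at the cost of more parameters and jumps at every cutpoint.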
Basis functions and Polynomial regression:
Describe the concept of basis functions in the context of polynomial regression. How are basis functions utilized to represent non-linear relationships, and what role do they play in constructing piecewise polynomial functions?
Basis functions = functions b_1(x), ..., b_K(x) applied to a predictor; the model is then linear in these transformed variables. In polynomial regression they are the powers x, x^2, ..., x^d. They are good at capturing non-linear (even complex) relationships.
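To make the idea concrete, here is a sketch in which the same least-squares routine is used with two different sets of basis functions. The helper `design_matrix`, the two example bases, and the data are assumptions for illustration.

```python
import numpy as np

def design_matrix(x, basis_functions):
    """One column per basis function, evaluated at every x."""
    return np.column_stack([b(x) for b in basis_functions])

# Polynomial basis: 1, x, x^2
poly_basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]

# Step-function basis: indicators of three intervals
step_basis = [
    lambda x: (x < 1).astype(float),
    lambda x: ((x >= 1) & (x < 2)).astype(float),
    lambda x: (x >= 2).astype(float),
]

x = np.linspace(0, 3, 31)
y = 1 + 2 * x - 0.5 * x**2  # exact quadratic, no noise (assumption)

# The model is linear in the basis functions, so ordinary least squares applies
beta_poly, *_ = np.linalg.lstsq(design_matrix(x, poly_basis), y, rcond=None)
beta_step, *_ = np.linalg.lstsq(design_matrix(x, step_basis), y, rcond=None)
print(beta_poly)
print(beta_step)
```

Only the columns of the design matrix change between the two fits; the estimation step is the same linear regression in both cases.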
Piecewise polynomials combine several polynomial functions, each defined over a specific interval.
These are good because they can capture relationships in certain REGIONS of the data.
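A sketch of an unconstrained piecewise polynomial: one quadratic is fitted on each side of a single breakpoint and compared with one global quadratic. The function names, the breakpoint at 0, and the kinked test function are assumptions for illustration.

```python
import numpy as np

def fit_piecewise(x, y, breakpoint, degree=2):
    """Fit a separate polynomial on each side of the breakpoint."""
    left = x < breakpoint
    coefs_left = np.polyfit(x[left], y[left], degree)
    coefs_right = np.polyfit(x[~left], y[~left], degree)
    return coefs_left, coefs_right

def predict_piecewise(x_new, breakpoint, coefs_left, coefs_right):
    """Evaluate whichever polynomial belongs to the region of each x_new."""
    x_new = np.asarray(x_new, dtype=float)
    return np.where(
        x_new < breakpoint,
        np.polyval(coefs_left, x_new),
        np.polyval(coefs_right, x_new),
    )

x = np.linspace(-3, 3, 121)
y = np.abs(x) + 0.1 * x**2  # has a kink at 0 that no single quadratic can follow

coefs_left, coefs_right = fit_piecewise(x, y, breakpoint=0.0)
global_coefs = np.polyfit(x, y, 2)

mse_piecewise = np.mean((y - predict_piecewise(x, 0.0, coefs_left, coefs_right)) ** 2)
mse_global = np.mean((y - np.polyval(global_coefs, x)) ** 2)
print(mse_piecewise, mse_global)
```

Because nothing ties the two pieces together, the fitted curve is in general free to jump at the breakpoint; splines add continuity constraints to prevent that.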
In non-linear modeling, what role do splits, basis functions, and knots play? Explain how different choices of basis functions and knot placements influence the flexibility and complexity of the model.
Splits are where the function transitions between two segments. This is how we create piecewise polynomials.
Basis functions are the building blocks of our non-linear models. They determine the flexibility.
Knots: points where a piecewise polynomial transitions from one piece to the next.
The number and placement of knots strongly affect the flexibility of the model. It is common to start by placing knots uniformly across the data. However, too many knots may lead to overfitting, because the fit starts to follow LOCAL variations.
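The effect of the number of knots can be sketched with a cubic truncated power basis and uniformly spaced interior knots. The function names, the noisy sine data, and the knot counts 1, 3, and 7 are assumptions for illustration; other spline bases (such as B-splines) would be a more numerically stable choice in practice.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Columns 1, x, ..., x^degree plus one truncated power term per knot."""
    cols = [x**d for d in range(degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def training_rss(x, y, knots):
    """Residual sum of squares of the least-squares spline fit on the training data."""
    B = truncated_power_basis(x, knots)
    beta, *_ = np.linalg.lstsq(B, y, rcond=None)
    return float(np.sum((y - B @ beta) ** 2))

# Synthetic data (assumption): a sine curve plus noise
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(6 * x) + rng.normal(scale=0.3, size=x.size)

rss_by_knots = {}
for n_knots in (1, 3, 7):
    knots = np.linspace(0, 1, n_knots + 2)[1:-1]  # uniformly spaced interior knots
    rss_by_knots[n_knots] = training_rss(x, y, knots)
print(rss_by_knots)
```

Training error can only go down as knots are added here, since each knot set contains the previous one. That is exactly why training error alone cannot be used to pick the number of knots; a held-out or cross-validated error is needed.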
Spline functions:
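Piecewise polynomials that are continuous at the knots and whose derivatives are continuous there too (for a cubic spline, the first and second derivatives), so one can draw smooth lines between the pieces!!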
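As a worked equation for the smoothness condition, one standard way to write a cubic spline with knots at xi_1 < ... < xi_K is the truncated power basis. The symbols beta_j, xi_k, and K are notation introduced here, not taken from the original notes.

```latex
f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3
       + \sum_{k=1}^{K} \beta_{3+k}\,(x - \xi_k)_+^{3},
\qquad
(x - \xi_k)_+ = \max(0,\; x - \xi_k)
```

Each term (x - xi_k)_+^3 and its first two derivatives are zero at xi_k, so f, f' and f'' stay continuous across every knot; only the third derivative is allowed to jump. The model has K + 4 coefficients in total.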
How do Generalized Additive Models (GAMs) extend the capabilities of linear models in capturing non-linear relationships? Discuss the significance of link functions in GAMs and provide examples of scenarios where GAMs are most useful.
GAMs usually use splines such as B-splines or cubic splines (flexible). Despite non-linear relationships between features and target, they maintain additivity, which means that the effects of the predictors on the response are added together!! This means that we can interpret the impact of each variable separately.
Estimation: the parameters of a GAM are found by MLE (maximum likelihood estimation), often with a smoothness penalty added (penalized likelihood).
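A sketch of fitting an additive model by backfitting, a classical alternative way to estimate the component functions. It assumes an identity link and Gaussian errors, where least squares coincides with maximum likelihood. The cubic-polynomial smoother, the function names, and the synthetic data are assumptions for illustration; real GAM software would typically use penalized splines instead.

```python
import numpy as np

def poly_smoother(x, r, degree=3):
    """Stand-in smoother (assumption): cubic least-squares fit of r on x."""
    return np.polyval(np.polyfit(x, r, degree), x)

def backfit(X, y, n_iter=20):
    """Fit y = alpha + f_1(x_1) + ... + f_p(x_p) by cycling over the predictors."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            # Residual with every component except f_j removed
            partial_resid = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = poly_smoother(X[:, j], partial_resid)
            f[:, j] -= f[:, j].mean()  # centre each component for identifiability
    return alpha, f

# Synthetic data (assumption): an additive signal in two predictors plus noise
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(300, 2))
y = X[:, 0] ** 2 + np.sin(2 * X[:, 1]) + rng.normal(scale=0.1, size=300)

alpha, f = backfit(X, y)
fitted = alpha + f.sum(axis=1)
print(np.mean((y - fitted) ** 2))
```

Each column of `f` is the estimated effect of one predictor, so it can be plotted against that predictor on its own. This is the interpretability that additivity buys.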
Link function:
Invertible function that maps the expected value of the response variable to the linear predictor. The choice of link function depends on the distribution of the response variable (e.g. logit for binary responses, log for counts).
Link function can stabilize variance.
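A minimal sketch of one common link, the logit, and its inverse. The function names and example values are chosen for illustration; the predictor values in `eta` are arbitrary.

```python
import numpy as np

def logit(mu):
    """Link: maps a mean in (0, 1) to the whole real line."""
    return np.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link: maps any predictor value back to a mean in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))

eta = np.array([-2.0, 0.0, 3.0])  # values of the (additive) predictor
mu = inv_logit(eta)               # implied expected values of a binary response
print(mu)
print(logit(mu))                  # applying the link recovers eta
```

Because the link is invertible, the model can be fitted on the unrestricted predictor scale while its predictions always land in the valid range for the response.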
Which of the non-linear methods are best at extrapolating?