Quantitative Methods Flashcards

Question

Which model works best with non-linear relationships?

Answer 1

Machine learning

Answer 2

The target variable is the dependent variable; a feature is an independent variable.

Answer 3

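a) Supervised learning - uses labelled data; the target and features must be defined. Covers regression (continuous target) and classification (categorical target, e.g. binary). E.g. multiple regression. b) Unsupervised learning - does not use labelled data; only features are entered, so there is no target to define as continuous or categorical. E.g. clustering. c) Deep learning - uses neural networks, e.g. for image recognition. Works for continuous and categorical data.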

Answer 4

An overfit model has a high in-sample R^2 because it treats noise as if it were signal, so it cannot generalise the pattern to new data. An underfit model fails to recognise the pattern in the data, so its predictive power is low.

Answer 5

The training data is subject to in-sample error.

Answer 6

Complexity reduction - reduce the number of independent variables. Cross-validation - use k-fold cross-validation.
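As a rough illustration of k-fold cross-validation, the sketch below splits sample indices into k folds so that each fold serves once as the validation set. The function name `k_fold_indices` is an assumption made for this example, and real workflows usually shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, validation) index pairs."""
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        folds.append((train, validation))
        start += size
    return folds


# Hypothetical example: 10 observations, 5 folds.
splits = k_fold_indices(10, 5)
```

Each observation appears in exactly one validation fold, so the model is always scored on data it was not trained on.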

Answer 7

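Penalised regression reduces the problem of overfitting and makes the model parsimonious. It seeks to minimise the sum of squared errors plus a penalty term that grows with the size of the coefficients. Technique - LASSO, a form of regularisation.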
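A minimal sketch of the quantity LASSO minimises, assuming the usual form: sum of squared errors plus lambda times the sum of absolute slope coefficients. The function name, data and lambda value are made up for illustration.

```python
def lasso_objective(X, y, intercept, coefs, lam):
    """Sum of squared errors plus an L1 penalty on the slope coefficients."""
    sse = 0.0
    for row, target in zip(X, y):
        prediction = intercept + sum(b * x for b, x in zip(coefs, row))
        sse += (target - prediction) ** 2
    # The penalty grows with the absolute size of each coefficient,
    # so features that add little are pushed towards zero.
    penalty = lam * sum(abs(b) for b in coefs)
    return sse + penalty


# Hypothetical data in which the second feature is irrelevant.
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [2.0, 4.0, 6.0]
objective = lasso_objective(X, y, intercept=0.0, coefs=[2.0, 0.0], lam=0.5)
```

With lam set to 0 the objective reduces to the ordinary least-squares sum of squared errors.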

Answer 8

When we want to predict one out of two possible outcomes.

Answer 9

A technique which helps in handling outliers in the data set

Answer 10

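Classifies an observation on the basis of its nearness to other observations. E.g. predicting bankruptcy. Problems depend on the choice of k: a) high error rate (k too small) b) dilution of results (k too large) c) no clear winner (ties when k is even).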
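The nearness idea can be sketched as a small k-nearest-neighbour classifier. The two financial ratios, the labels and the function name `knn_classify` are hypothetical, and Euclidean distance is assumed as the nearness measure.

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up firms described by two scaled ratios.
firms = [
    ((0.1, 0.9), "bankrupt"),
    ((0.2, 0.8), "bankrupt"),
    ((0.9, 0.2), "healthy"),
    ((0.8, 0.1), "healthy"),
    ((0.7, 0.3), "healthy"),
]
label = knn_classify(firms, (0.15, 0.85), k=3)
```

Using an odd k for a two-class problem avoids tied votes.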

Answer 11

Unlike algorithms that are often described as black boxes due to their opacity, CART provides a visual explanation of its predictions. Classification tree - the target variable is binary or categorical; it can be used when the relationship is non-linear. Logit and probit also allow a prediction when the target is binary, but they assume a linear relationship. Regression tree - used when the target variable is continuous.

Answer 12

Ensemble - combines the predictions of multiple models so that the errors of one model are offset by the others. Types - aggregation of heterogeneous learners, aggregation of homogeneous learners. Random forest - a collection of many CART-style decision trees, each trained on a different sample with a random subset of features; the trees' predictions are combined into a single prediction by majority vote. It increases the signal-to-noise ratio.

Answer 13

A large number of correlated features, each carrying little information on its own, is combined into a few uncorrelated composite variables. Each composite variable is defined by an eigenvector.

Answer 14

If there are many eigenvalues, we create a chart known as a scree plot. It shows how much of the total variance is explained by each eigenvector (principal component).

Answer 15

Grouping observations with similar features into one cluster. Cohesion is the goal within a cluster, and Euclidean distance is the commonly used measure. It helps uncover hidden structures or similarities in a complex data set. Techniques - a) K-means clustering b) Hierarchical clustering - agglomerative, divisive.
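One iteration of k-means can be sketched as two steps: assign each point to its nearest centroid by Euclidean distance, then move each centroid to the mean of its members. The function names and data points are illustrative assumptions, and the sketch assumes no cluster ends up empty.

```python
import math


def assign_to_clusters(points, centroids):
    """Return, for each point, the index of the nearest centroid."""
    assignments = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments


def update_centroids(points, assignments, k):
    """Move each centroid to the mean of its members (assumes none is empty)."""
    centroids = []
    for cluster in range(k):
        members = [p for p, a in zip(points, assignments) if a == cluster]
        centroids.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
    return centroids


# Hypothetical two-feature observations and starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
assignments = assign_to_clusters(points, [(0.0, 0.0), (10.0, 10.0)])
new_centroids = update_centroids(points, assignments, k=2)
```

Repeating the two steps until the assignments stop changing gives the final clusters.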

Answer 16

It is a neural network with many hidden layers (at least 2, often more than 20). It is used in image recognition and natural language processing.

Answer 17

Summation operator - multiplies each input by its weight and sums the results into the total net input. Activation function - transforms the total net input into the node's output.
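A single node can be sketched as a weighted sum followed by an activation function. The sigmoid activation and the bias term are assumptions chosen for illustration; other activation functions are common.

```python
import math


def neuron_output(inputs, weights, bias):
    """Apply the summation operator, then a sigmoid activation function."""
    # Summation operator: weighted sum of the inputs (total net input).
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: squashes the net input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-net_input))


# Hypothetical inputs and weights whose net input is exactly zero.
output = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
```

The output of one node becomes an input to nodes in the next layer.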

Answer 18

When an output is passed on to neuron 2 and processed again, this is known as forward propagation. When the weights in the summation operator are revised, based on the prediction error, while data is being processed in neuron 2, this is known as backward propagation.

Answer 19

It does not rely on labelled data; there is no predefined input-output pairing. Only features are given and the machine learns by itself.

Answer 20

Classification and regression tree (CART).

Answer 21

Volume, Veracity (validity), Variety, Velocity (speed)

Answer 22

Data preparation and wrangling

Answer 23

If the structured data is collected through external sources.

Answer 24

The input sample given to filter out data in data preparation and wrangling.

Answer 25

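a) Data transformation - includes dealing with outliers: 1) Trimming - remove the extreme values 2) Winsorization - replace outliers with the maximum (or minimum) value that is not an outlier. b) Data scaling - convert features into a common unit of measurement: 1) Normalisation - rescales to the range 0 to 1; sensitive to outliers 2) Standardisation - centres on the mean and scales by the standard deviation; less sensitive to outliers.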
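The three transformations can be sketched as follows. The cap values are passed in by hand here as an assumption; in practice they are usually taken from percentiles of the data. The sample standard deviation is used for standardisation.

```python
import statistics


def winsorize(values, lower, upper):
    """Replace values outside [lower, upper] with the nearest bound."""
    return [min(max(v, lower), upper) for v in values]


def normalise(values):
    """Rescale to the range 0 to 1 using the minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Centre on the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical feature with one extreme value.
data = [2.0, 4.0, 6.0, 8.0, 100.0]
capped = winsorize(data, lower=2.0, upper=8.0)
scaled = normalise(capped)
z_scores = standardise(data)
```

Because normalisation depends on the minimum and maximum, a single extreme value compresses every other observation, which is why outliers are handled first.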

Answer 26

Data preparation and wrangling; data exploration

Answer 27

Merging 2 or more independent variables to create a new feature. Technique - one-hot encoding: converts a categorical feature into binary (dummy) variables so that the model can process it faster and more easily. Tends to prevent model underfitting. Techniques for text - a) N-grams b) Numbers c) Named entity recognition d) Parts of speech
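One-hot encoding can be sketched in a few lines. The sector labels and the function name `one_hot` are made up for the example.

```python
def one_hot(values):
    """Turn a categorical feature into one binary column per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical categorical feature.
categories, encoded = one_hot(["tech", "bank", "tech", "energy"])
```

Each row contains exactly one 1, in the column of the category that the observation belongs to.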

Answer 28

Stemming is the text wrangling process in which inflected forms of a word are converted into a single base word (stem), e.g. integrate and integrated are both converted to integrat. Lemmatization is a more advanced version of stemming that converts words to their dictionary form (lemma).

Answer 29

A token is the equivalent of a word. The process of splitting text (e.g. sentences) into tokens is tokenisation.

Answer 30

Collection of all tokens.

Answer 31

A process in which 2 words used together are given a single token.
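Tokenisation and bigrams can be sketched as below. Splitting on whitespace and joining words with an underscore are simplifying assumptions; real pipelines also strip punctuation and handle stop words.

```python
def tokenise(text):
    """Lowercase the text and split it into tokens on whitespace."""
    return text.lower().split()


def n_grams(tokens, n):
    """Join each run of n adjacent tokens into a single token."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sentence.
tokens = tokenise("Stock market rallied")
bigrams = n_grams(tokens, 2)
```

Here "stock_market" is treated as one token, which preserves meaning that the two separate words would lose.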

Answer 32

a) High accuracy b) Random guesses

Answer 33

RMSE (root mean squared error) gives an idea of the typical size of the prediction errors. It is useful when the target variable is continuous.
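RMSE is the square root of the average squared difference between actual and predicted values. A minimal sketch with made-up numbers:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of a set of predictions."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual and predicted values.
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Squaring the errors means large misses are penalised more heavily than small ones.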

Answer 34

An increase in model complexity decreases bias error but increases variance error.

Answer 35

It is an automated process of selecting the best hyperparameter combination.

Answer 36

When the prediction error on the training dataset is small while the prediction error on the cross-validation dataset is significantly larger.

Answer 37

Evaluating what tuning is doing and identifying weak links, which in turn helps further tuning.

Answer 38

True. N-gram implementation will affect the normalization of the BOW because stop words will not be removed.

Answer 39

Bias error is the prediction error in the training data resulting from underfit models. Variance error is the prediction error in the validation sample resulting from overfitting models that do not generalize well.

Answer 40

a) Type 1 error b) Type 2 error.

Answer 41

It tells the direction of the linear relationship between 2 variables but not its strength. Range = -∞ to +∞
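The sketch below shows why covariance alone says little about strength: rescaling one variable changes the covariance but leaves the correlation unchanged. The data and function names are made up for the example.

```python
import statistics


def sample_covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Covariance divided by the product of the sample standard deviations."""
    return sample_covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


# Hypothetical data; y_scaled is y expressed in different units.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
y_scaled = [100 * v for v in y]
cov_original = sample_covariance(x, y)
cov_scaled = sample_covariance(x, y_scaled)
```

Correlation is bounded between -1 and +1, so it can be compared across pairs of variables.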

Answer 42

Y = b0 + b1X + ε

Answer 43

To check the goodness of fit of the regression. The higher the R^2, the greater the share of the variation in the dependent variable that is explained.

Answer 44

It measures the degree of variability of the actual Y values relative to the estimated Y values from the regression. It gauges the fit of the regression line; the smaller, the better.
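For a simple linear regression, the intercept, slope, R^2 and standard error of estimate (SEE) can be computed by hand. The sketch below assumes one independent variable, so the SEE uses n - 2 degrees of freedom; the data are made up.

```python
import math


def simple_regression(x, y):
    """Return intercept, slope, R^2 and SEE for a one-variable OLS fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # SSE: unexplained variation; SST: total variation in Y.
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - sse / sst
    see = math.sqrt(sse / (n - 2))
    return b0, b1, r_squared, see


# Hypothetical observations.
b0, b1, r_squared, see = simple_regression(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1]
)
```

A high R^2 and a low SEE both point to a regression line that fits the observations closely.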

Answer 45

Assesses how well a set of independent variables, as a group, explains the variation in the dependent variable. It is always a one-tailed test.

Answer 46

The smallest level of significance at which the null hypothesis can be rejected.

Answer 47

a) Type 1 error - rejecting the null when it is true. Probability = significance level. b) Type 2 error - failing to reject the null when it is false.

Answer 48

a) reject null b) Fail to reject the null

Answer 49

a) High R^2 b) Intercept = 0 c) Slope = 1

Answer 50

These are occasions where an independent variable takes the form of either "Yes" or "No". It takes the value 0 or 1. Often used to quantify the impact of qualitative events.

Answer 51

AIC is preferred when the goal is a better forecast; it is used to compare competing models. The lower, the better. BIC is preferred when the goal is to evaluate goodness of fit. It penalises the model more heavily for being too complex. The lower, the better.

Answer 52

These are nested models: one model (the full or unrestricted model) has a higher number of independent variables, while the other (the restricted model) has only a subset of those independent variables. Evaluated by an F-test.
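The F-statistic for comparing nested models can be sketched as follows, assuming the standard form F = [(SSE_R - SSE_U) / q] / [SSE_U / (n - k - 1)], where q is the number of excluded variables and k is the number of independent variables in the unrestricted model. The input values are hypothetical.

```python
def nested_f_stat(sse_restricted, sse_unrestricted, q, n, k):
    """F-statistic for testing q excluded variables jointly."""
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator


# Made-up sums of squared errors for the two models.
f_stat = nested_f_stat(
    sse_restricted=120.0, sse_unrestricted=100.0, q=2, n=53, k=4
)
```

If the statistic exceeds the critical F-value with q and n - k - 1 degrees of freedom, the excluded variables add explanatory power as a group.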

Answer 53

When the variance of the residuals is not the same across all observations in the sample. Types - a) Unconditional - when the error variance is not correlated with the IVs. b) Conditional - when the error variance is correlated with the IVs. Creates a major problem.

Answer 54

a) Coefficient standard errors are underestimated, so t-statistics are too high, leading to too many Type 1 errors.

Answer 55

a) Scatter plot of the residuals b) Breusch-Pagan (BP) chi-square test

Answer 56

Requires a regression of the squared residuals from the original regression on the independent variables of that regression. If, in this 2nd regression - a) R^2 is high - presence of conditional heteroskedasticity, since the error variance and the IVs are correlated. b) R^2 is low - absence of conditional heteroskedasticity.
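The Breusch-Pagan test statistic is commonly computed as n times the R^2 of the second regression and compared with a chi-square critical value whose degrees of freedom equal the number of independent variables. The sketch below uses made-up inputs and the 5% critical value for 2 degrees of freedom.

```python
def breusch_pagan_stat(n_obs, r_squared_aux):
    """BP statistic: observations times R^2 of the squared-residual regression."""
    return n_obs * r_squared_aux


# Hypothetical regression with 60 observations and 2 independent variables.
bp_stat = breusch_pagan_stat(n_obs=60, r_squared_aux=0.12)
critical_value = 5.991  # chi-square, 5% significance, 2 degrees of freedom
heteroskedasticity_detected = bp_stat > critical_value
```

The test is one-tailed: only a large statistic is evidence of conditional heteroskedasticity.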

Answer 57

Do not use the standard errors of the original equation; use the higher robust standard errors, known as White-corrected SE or heteroskedasticity-consistent SE.

Answer 58

Autocorrelation (serial correlation) is the situation in which the residual terms are correlated with each other. a) Positive b) Negative

Answer 59

a) Graphically b) Durbin-Watson - DW ≈ 2(1 - r), where r is the correlation between residuals from one period and those from the previous period. c) Breusch-Godfrey (BG) test
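The Durbin-Watson statistic can also be computed directly from the residuals as the sum of squared changes in the residuals divided by the sum of squared residuals. Values well below 2 suggest positive serial correlation and values well above 2 suggest negative serial correlation. The residual series below are made up.

```python
def durbin_watson(residuals):
    """Sum of squared residual changes divided by sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: slowly drifting vs. alternating in sign.
dw_positive = durbin_watson([1.0, 0.9, 0.8, 0.7, 0.6])
dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0])
```

A value near 2 corresponds to r close to 0, i.e. no first-order serial correlation.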

Answer 60

Adjust the coefficient standard errors using the Hansen method.

Answer 61

When 2 or more IVs are highly correlated with each other.

Answer 62

Coefficient estimates are consistent but unreliable. Standard errors are inflated, so t-statistics are too low, leading to too many Type 2 errors.

Answer 63

a) The F-test and t-tests give conflicting answers (significant F-statistic, insignificant t-statistics) while R^2 is high. b) Variance inflation factor (VIF)

Answer 64

Drop one of the correlated variables.

Answer 65

We start by regressing one of the IVs against the remaining IVs and calculate the R^2. Then we calculate VIF as 1/(1 - R^2). The higher the VIF, the higher the chance of multicollinearity.
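The VIF formula is easy to sketch once the R^2 of the IV-on-other-IVs regression is known. The two R^2 values below are hypothetical.

```python
def variance_inflation_factor(r_squared):
    """VIF for one IV, given the R^2 from regressing it on the other IVs."""
    return 1.0 / (1.0 - r_squared)


# Made-up R^2 values for a weakly and a strongly collinear IV.
vif_low = variance_inflation_factor(0.20)
vif_high = variance_inflation_factor(0.90)
```

An R^2 of 0 gives the minimum VIF of 1, and the VIF rises without limit as R^2 approaches 1.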

Answer 66

It regresses the residuals against the original set of IVs plus one or more lagged residuals. Checked using an F-test.

Answer 67

a) Probit model - based on the normal distribution. b) Logit model - based on the logistic distribution, which has fatter tails; uses the log odds as the dependent variable. c) Discriminant model - e.g. Altman Z-score.
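The log odds used by the logit model, and the conversion back to a probability, can be sketched as below. The function names are assumptions for this example.

```python
import math


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))


def probability_from_log_odds(z):
    """Invert the log odds with the logistic function."""
    return 1 / (1 + math.exp(-z))


# A probability of 0.5 corresponds to log odds of zero.
even_odds = log_odds(0.5)
recovered = probability_from_log_odds(log_odds(0.8))
```

Because the logistic function maps any real number into the range 0 to 1, the fitted value can always be read as a probability.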

(99 cards)