Stats Flashcards

1
Q

What is quantitative data?

A

Data expressed as numerical values.

2
Q

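What is continuous data, and how does it differ from discrete data?

A

Continuous data: values that are measured rather than counted and can take any value within a range.
Discrete data: values that are counted and can take only distinct, separate values (typically integers).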

3
Q

The set of all possible outcomes is called

A

The sample space.

4
Q

When we repeat a random experiment several times, each repetition is called a

A

Trial

5
Q

Any subset of the sample space is called an

A

Event

6
Q

We can get either an even number or an odd number, but not both. Such events are called?

A

Mutually exclusive

7
Q

In Bayesian statistics, what is a posterior probability?

A

The posterior probability is calculated by updating the prior probability using Bayes’ theorem.

8
Q

Example of mutually exclusive events

A

In a football match, both teams cannot win; the two outcomes are mutually exclusive events.

9
Q

PDF

A

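Probability density function: describes the relative likelihood of the values of a continuous random variable X. The probability that X falls in an interval is the area under the curve over that interval. (For a discrete random variable, the probability that X equals a certain value is given by the probability mass function, PMF.)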

10
Q

Example of discrete variable.

A

A variable with countable outcomes, e.g. the number of kids in a class.

11
Q

Continuous random variable

A

A continuous random variable can take any of an infinite number of values within a range, e.g. height.

12
Q

Area Under The Curve:

A

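For continuous variables, the area under the curve represents probability: the area over an interval is the probability of falling in that interval, and the total area equals 1.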

13
Q

The y-axis in a probability density function represents

A

The probability density, not the probability itself. Probabilities are areas under the curve.
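As a rough illustration of density versus probability, the sketch below integrates a normal PDF numerically. The function names `normal_pdf` and `area` and the trapezoid step count are assumptions made for this example, not part of the original cards.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x (a density, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Probability of landing within one standard deviation of the mean.
print(area(normal_pdf, -1.0, 1.0))
# Total area under the curve (the tails beyond +/-8 are negligible).
print(area(normal_pdf, -8.0, 8.0))
# A density value can exceed 1 when the distribution is narrow.
print(normal_pdf(0.0, sigma=0.1))
```

The last line is the point of the card: the height of the curve can be larger than 1, so it cannot be a probability on its own.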

14
Q

What is the exponential distribution?

A

A continuous distribution that is often used to model the time one needs to wait until an event occurs.

15
Q

A left-skewed (negatively skewed) distribution has a long tail in which direction?

A

To the left. The mean is typically less than the mode.

16
Q

A right-skewed (positively skewed) distribution has a long tail in which direction?

A

To the right.

17
Q

Higher kurtosis implies?

A

Fatter tails, i.e. a higher probability of extreme values. A kurtosis greater than 3 is leptokurtic, which means more risk.

18
Q

Mesokurtic Distribution

A

A distribution with a kurtosis of 3, such as the normal distribution.

19
Q

Leptokurtic Distribution

A

Thin and tall peak with fatter tails (mnemonic: higher to leap over); more risky.

20
Q

Platykurtic Distribution

A

Flat, wide and shallow with thinner tails: like a plate.

21
Q

Example of a log-normal distribution

A

The income of people.

22
Q

The shape of the t-distribution is determined by its

A

Degrees of freedom

23
Q

What is the exponential distribution used to model?

A

The probability distribution of the time between events.

24
Q

When is a normal distribution a standard normal distribution?

A

When the mean is exactly 0 and the standard deviation is 1.

25
Q

Which distribution is often used to model asset prices, given that they cannot be negative?

A

The lognormal distribution.

26
Q

Does the t-distribution have fat or thin tails?

A

The t-distribution has fatter tails than the normal distribution.

27
Q

What is inferential statistics?

A

Using sample data to draw conclusions and make predictions about a larger population.

28
Q

CLT

A

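Central limit theorem: the distribution of the sample mean approaches a normal distribution as the sample size grows, which helps us construct confidence intervals.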

29
Q

The population

A

Is the entire group being studied. It is a superset of the sample, which is a representative subset of the population.

30
Q

The sample mean is a random variable as it varies from sample to sample.

A

Yes

31
Q

The null hypothesis is held true

A

Until we have sufficient evidence to reject it.

32
Q

The p-value

A

The probability of observing a value at least as extreme as the test statistic, assuming the null hypothesis is true. It does not prove that the null hypothesis is true.

33
Q

Type I error

A

Rejection of an actually true null hypothesis

34
Q

Type II error

A

The failure to reject a null hypothesis that is actually false

35
Q

A statistical hypothesis is a statement

A

About a population parameter, which may or may not be true.

36
Q

Is the significance level set before the p-value is found?

A

Yes, setting the significance level in advance helps us avoid bias.

37
Q

If the p-value is less than the chosen significance level then we?

A

We reject the null hypothesis, i.e. the sample gives reasonable evidence to support the alternative hypothesis.

38
Q

If the obtained p-value is greater than the chosen significance level then we ?

A

We do not reject the null hypothesis.
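The decision rule from the last two cards can be sketched as a small function. The name `decide` and the default significance level of 0.05 are assumptions chosen for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

print(decide(0.03))  # p-value below alpha
print(decide(0.20))  # p-value above alpha
```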

39
Q

What type of statistical analysis is used to study the relationship between two or more variables?

A

Covariance and correlation

40
Q

What do positive, negative and zero covariance mean?

A

Positive: the variables move in the same direction. Negative: they move in opposite directions. Zero: the two variables are not linearly related.

41
Q

Pearson’s correlation analysis

A

Is used to establish whether two variables are negatively or positively correlated; the coefficient lies between -1 and 1.

42
Q

Correlation and causation

A

Correlation does not imply causation: two variables can be correlated without one causing the other.

43
Q

Correlation is a ?

A

Standardized version of covariance. The value of the Pearson’s correlation coefficient lies between +1 & -1

44
Q

The variance of a random variable is ?

A

The variance of a random variable is nothing but the covariance of that variable with itself.
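A minimal sketch of these relationships, assuming the sample (n - 1) convention: variance is the covariance of a variable with itself, and correlation is covariance divided by the product of the two standard deviations. The helper names and the data are made up for illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance standardized by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y moves exactly in line with x

print(covariance(x, x))   # the variance of x
print(correlation(x, y))  # close to +1 for a perfect positive relationship
```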

45
Q

The predicted variable is known as ?

A

The dependent variable; its value depends on the independent variable.

46
Q

A regression line is estimated using a method called

A

Ordinary least squares (OLS).

47
Q

R² (coefficient of determination):

A

The higher the value of R², the more of the variation in the dependent variable the model explains.

48
Q

Value of F statistics

A

The higher the value of the F-statistic, the stronger the evidence that the model as a whole is significant.

49
Q

Is multicollinearity good or bad?

A

Bad. The Variance Inflation Factor (VIF) is used to check for it.

50
Q

What do you want in your data: heteroskedasticity or homoskedasticity?

A

Homoskedasticity. The Cook-Weisberg test checks for heteroskedasticity.

51
Q

Normality of errors, what is the test?

A

Kolmogorov-Smirnov test or Shapiro-Wilk test

52
Q

Should error terms (residuals) be correlated or not? What is the test?

A

Residuals should not be correlated. The Durbin-Watson (DW) statistic is used to check; a value close to 2 indicates no autocorrelation.

53
Q

Are there multiple independent variables in a simple linear regression model?

A

No, a simple linear regression model has one independent variable and one dependent variable. A multiple linear regression model has several independent variables.

54
Q

R-squared value goes up or down with more variables ?

A

When more variables are added to the regression model, the R-squared value typically increases. It can never decrease on adding a variable.

55
Q

Multicollinearity is a desired condition for building a regression model.

A

No

56
Q

In y = mx + c, which is beta and which is alpha?

A

Beta = m (the slope) and alpha = c (the intercept).
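To connect y = mx + c with ordinary least squares, here is a small sketch that estimates the slope (beta) and intercept (alpha) for a single independent variable. The function name `ols_fit` and the data points are assumptions for illustration only.

```python
def ols_fit(xs, ys):
    """Fit y = alpha + beta * x by ordinary least squares; return (alpha, beta)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                 # slope (m)
    alpha = mean_y - beta * mean_x   # intercept (c)
    return alpha, beta

# Points that lie exactly on y = 2x + 1.
alpha, beta = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(alpha, beta)
```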

57
Q

How do we work out whether to reject a null hypothesis?

A

Reject it if the p-value is less than the significance level (100% minus the confidence level), e.g. less than 5% at a 95% confidence level.

58
Q

Is a big alpha good or bad?

A

A big positive alpha reading is good.

59
Q

What is Bayes’ theorem?

A

Bayes’ theorem, named after Thomas Bayes, describes the probability of an event A based on prior knowledge of a condition B.

P(A|B) = P(A ∩ B) / P(B) = P(B|A) × P(A) / P(B)
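A small sketch of Bayes’ theorem in its standard form, P(A|B) = P(B|A) × P(A) / P(B), with P(B) expanded over A and not-A. The function name `posterior` and the numbers in the example are made up for illustration.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from the prior P(A) and the two likelihoods of B."""
    # Total probability of B, split over A happening and A not happening.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return p_b_given_a * prior_a / p_b

# Hypothetical numbers: a 1% prior, B seen 90% of the time when A is true
# and 5% of the time when A is false.
print(posterior(0.01, 0.90, 0.05))
```

Even with a strong signal, the posterior stays modest here because the prior is small, which is the intuition behind updating a prior with evidence.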

60
Q

Expected Value

A

E[X] = sum over all values of (value × probability of that value)
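As a quick sketch of the expected value of a discrete variable, the function below multiplies each value by its probability and sums the results. The name `expected_value` and the fair-die example are assumptions for illustration.

```python
def expected_value(values, probs):
    """Expected value of a discrete variable: sum of value * probability."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in zip(values, probs))

# A fair six-sided die: each face has probability 1/6.
die_mean = expected_value([1, 2, 3, 4, 5, 6], [1 / 6] * 6)
print(die_mean)
```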

61
Q

Covariance

A

How stocks move together; if they move in line, the covariance is high.

62
Q

Correlation is?

A

-1 ≤ corr ≤ 1. If negative, the stocks tend to move in opposite directions.

63
Q

How do you scale the standard deviation of a stock over time?

A

Standard deviation × SQRT(T), where T is the number of periods.
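A minimal sketch of the square-root-of-time rule, assuming returns are independent across periods. The function name `scale_volatility`, the 1% daily figure, and the 252 trading days per year are assumptions for illustration.

```python
import math

def scale_volatility(sigma, periods):
    """Scale a per-period standard deviation to a horizon of `periods` periods."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return sigma * math.sqrt(periods)

# Hypothetical daily volatility of 1%, scaled to one year of 252 trading days.
annual_vol = scale_volatility(0.01, 252)
print(annual_vol)
```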

64
Q

Is the geometric return the same as the compounded return?

A

Yes

65
Q

Hit ratio

A

Positive trades / all trades

66
Q

Normalized Hit ratio above 65%

A

(Profitable trades × average % win) / total of (winning trades × average % win) + (losing trades × average % loss)

67
Q

Kelly fraction

A

This is used to work out the best percentage of wealth to invest.

68
Q

What is a good Sharpe ratio?

A

Average return / standard deviation; above 2 is good.
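A rough sketch of the ratio on this card: average return divided by the standard deviation of returns. Subtracting a risk-free rate, using the sample (n - 1) standard deviation, and not annualizing are assumptions of this example, as are the function name and the return series.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / n
    variance = sum((e - mean) ** 2 for e in excess) / (n - 1)
    if variance == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(variance)

# Hypothetical per-period returns.
ratio = sharpe_ratio([0.02, 0.01, 0.03, 0.02])
print(ratio)
```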

69
Q

Drawdown

A

The decline from the maximum (peak) value to the lowest subsequent point.
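One common way to compute this is the maximum drawdown: the largest fractional fall from a running peak to a later low. The sketch below assumes that definition; the function name `max_drawdown` and the price series are made up for illustration.

```python
def max_drawdown(prices):
    """Largest fractional decline from a running peak to a later price."""
    if not prices:
        raise ValueError("prices must not be empty")
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)                     # highest value seen so far
        worst = max(worst, (peak - price) / peak)   # fall from that peak
    return worst

# Peak of 120 followed by a low of 80: a drawdown of one third.
dd = max_drawdown([100, 120, 90, 110, 80, 130])
print(dd)
```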

70
Q

Hit ratio

A

Number of wins / the total number of trades.
For example, with 51 wins and 3 losses, divide 51 by 54 for a hit ratio of 94.4%.

71
Q

Normalized hit ratio

A

(Number of wins × average % win) / total of (wins × average % win) + (losses × average % loss)
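The two hit-ratio cards can be sketched as follows. The normalized version reflects one reading of the card (wins weighted by average % win, divided by the weighted wins plus weighted losses); that reading, the function names, and the average win and loss sizes are assumptions.

```python
def hit_ratio(wins, losses):
    """Share of trades that were winners."""
    total = wins + losses
    if total == 0:
        raise ValueError("no trades")
    return wins / total

def normalized_hit_ratio(wins, losses, avg_win_pct, avg_loss_pct):
    """Hit ratio weighted by the average size of wins and losses (assumed reading)."""
    gained = wins * avg_win_pct
    lost = losses * abs(avg_loss_pct)
    if gained + lost == 0:
        raise ValueError("no gains or losses to weigh")
    return gained / (gained + lost)

# 51 wins and 3 losses, as in the hit ratio card.
print(hit_ratio(51, 3))
# Hypothetical sizes: wins average 1%, losses average 5%.
print(normalized_hit_ratio(51, 3, 1.0, 5.0))
```

With these made-up sizes the normalized figure is noticeably lower than the raw hit ratio, because a few large losses weigh more than many small wins.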

72
Q

The p-values for all four coefficients are almost 0. What does this mean?

A

The coefficients are statistically significant at a very high level of confidence.

73
Q

ARCH

A

The Autoregressive Conditional Heteroskedasticity method provides a way to model a time-dependent change in variance in a time series, such as increasing or decreasing volatility.

74
Q

Homoscedasticity vs heteroscedasticity

A

Heteroscedasticity: data whose volatility (variance) changes over time, e.g. seasonally. Homoscedasticity: the variance is constant, so the data is much more stable.

75
Q

PACF

A

Partial autocorrelation function: the partial correlation of a stationary time series with its own lagged values.

76
Q

MA or AR , which shows surprises, sudden p

A

The MA model tries to capture the idiosyncratic shocks observed in financial markets.

77
Q

Check for the normality of the residuals

A

Jarque-Bera

78
Q

In the ADF test, if the p-value is greater than the level of significance, we conclude that:

A

The series is non-stationary

79
Q

If a time series process is non-linear, which of the following type of model would likely describe it better?

A

Multiplicative

80
Q

Seasonality component

A

The ‘seasonality’ component of a time series model does not try to capture the average value of the process. It shows similarities or repeating patterns over time.

81
Q

Which of the following is categorized as the non-systematic component of a time series model?

A

Noise

82
Q

Autocorrelation function (ACF):

A

Measures the correlation of a variable with a lagged version of itself. This is also called serial correlation.
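A minimal sketch of the autocorrelation at a single lag, using the common estimator that divides the lagged cross-products by the total sum of squared deviations. The function name `autocorrelation` and the alternating series are assumptions for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series has zero variance")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# An alternating series is strongly negatively correlated with its own lag 1.
s = [1, -1, 1, -1, 1, -1]
print(autocorrelation(s, 0))
print(autocorrelation(s, 1))
```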

83
Q

Which of the following statistical properties should remain constant in time for a time series to be stationary?

A

Mean
Variance
Covariance

84
Q

Implied volatility

A

A forward-looking estimate of volatility, computed from option prices, which are set by supply and demand.

85
Q

Which of the following is a covered call strategy?

A

Buy a stock and sell an ATM call.

86
Q

A protective put strategy

A

Built by going long on a stock and simultaneously buying a put option.

87
Q

A trader writes a put option at a strike price of INR 800 and receives a premium of INR 30. What is his profit or loss at expiry when the stock is trading at INR 840?

A

A profit of INR 30. The trader sold (wrote) the put; with the stock at 840, above the 800 strike, the put expires worthless and he keeps the premium.
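The payoff of a written (short) put at expiry can be sketched as the premium received minus whatever the put is worth to its buyer. The function name `short_put_pnl` is an assumption; the 800 strike, 30 premium and 840 spot come from the card, and the 750 case is a made-up contrast.

```python
def short_put_pnl(strike, premium, spot_at_expiry):
    """Profit or loss at expiry for the writer of one put option."""
    # The put is exercised only if the stock ends below the strike.
    intrinsic = max(strike - spot_at_expiry, 0.0)
    return premium - intrinsic

print(short_put_pnl(800, 30, 840))  # stock above the strike: writer keeps the premium
print(short_put_pnl(800, 30, 750))  # stock below the strike: premium minus the payout
```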

88
Q

Cointegrated vs correlated

A

Cointegrated: two non-stationary time series whose combination (spread) is stationary, so they stay close over time; they don’t have to be correlated. Correlated: the series have a similar trend direction.

89
Q

ADF

A

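The Augmented Dickey-Fuller test checks for stationarity; applied to the spread between two series, it is used to check cointegration.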

90
Q

Lambda < 0

A

If lambda is significantly less than 0, we reject the null hypothesis and state that the series is stationary.

91
Q

What does a negative gradient line of a stock price indicate?

A

A negative gradient indicates stationarity.

92
Q

What is covariance

A

The relationship between two variables; when it is positive, both move in the same direction.

93
Q

What is the difference between covariance and correlation

A

Covariance shows how two variables vary together (the direction of the relationship); correlation shows that they are related and the strength of the relationship on a standardized scale.