Stats Flashcards

(93 cards)

1
Q

What is quantitative data?

A

Data expressed as numerical values.

2
Q

What is continuous data?

A

Continuous data: represents variables that are measured rather than counted and can take any value within a range.
Discrete data: takes only separate, countable values, typically integers.

3
Q

The set of all possible outcomes is called

A

The sample space.

4
Q

When we repeat a random experiment several times, we call each repetition a ?

A

Trial

5
Q

Any subset of the sample space is called an

A

Event

6
Q

We can either get an even number or an odd number, but not both. Such events are called ?

A

Mutually exclusive

7
Q

In Bayesian statistics, what is a posterior probability?

A

The posterior probability is calculated by updating the prior probability using Bayes’ theorem.

8
Q

Example of mutually exclusive events

A

Both football teams cannot win the same match, so the two wins are mutually exclusive events.

9
Q

PDF

A

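Probability density function: describes the relative likelihood of a continuous random variable X around a given value; the probability of an interval is the area under the curve over that interval. The function that gives the probability that a discrete random variable X is equal to a certain value is the probability mass function (PMF).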

10
Q

Example of discrete variable.

A

A countable outcome, e.g. the number of kids in a class.

11
Q

Continuous random variable

A

A continuous random variable can take any of an infinite number of values within a range, e.g. height.

12
Q

Area Under The Curve:

A

Represents probability in the case of continuous variables. The total area under the curve is 1.

13
Q

The y-axis in a probability density function represents

A

The probability density, not the probability itself. Probability is the area under the curve over an interval.

14
Q

What is the exponential distribution?

A

A continuous distribution that is often used to model the time one needs to wait until an event occurs.

15
Q

A left-skewed (negatively skewed) distribution has a long tail in which direction?

A

The left. The mean is typically less than the mode.

16
Q

A right-skewed (positively skewed) distribution has a long tail in which direction?

A

The right. The mean is typically greater than the mode.

17
Q

Higher kurtosis implies ?

A

Fatter tails, i.e. a higher probability of extreme values. A kurtosis above 3 is leptokurtic and implies more risk.

18
Q

Mesokurtic Distribution

A

Kurtosis of 3, as in the normal distribution.

19
Q

Leptokurtic Distribution

A

Thin and tall with fatter tails (mnemonic: higher to leap over); more risky.

20
Q

Platykurtic Distribution

A

Wide and shallow with thinner tails (kurtosis below 3), like a plate.

21
Q

Example of a log-normal distribution

A

People's incomes, which are positive and right-skewed.

22
Q

The shape of the t-distribution is determined by its

A

Degrees of freedom

23
Q

What is the exponential distribution used to model?

A

The probability distribution of the time between events.

24
Q

A normal distribution is the standard normal distribution when ?

A

The mean is exactly 0 and the standard deviation is 1.
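A minimal sketch of how a dataset is rescaled to mean 0 and standard deviation 1 (z-scores), which is the scale the standard normal distribution lives on. The function name `standardize` and the sample data are hypothetical and chosen only for illustration.

```python
import statistics

def standardize(values):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

# Hypothetical data with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = standardize(data)

print(statistics.mean(z_scores))    # approximately 0
print(statistics.pstdev(z_scores))  # approximately 1
```

Standardizing changes the location and scale of the data but not its shape, so the result is standard normal only if the original data were normal.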

25
Which distribution is often used to model asset prices, as they cannot be negative?
A lognormal distribution
26
Does the t-distribution have fat or thin tails?
The t-distribution has fatter tails than the normal distribution
27
What is inferential statistics?
Using sample data to draw conclusions about a population and to help make predictions
28
CLT
Central limit theorem: the distribution of the sample mean approaches a normal distribution as the sample size grows, which helps us build confidence intervals
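The central limit theorem can be checked with a small simulation. This is a sketch under stated assumptions: the population is Uniform(0, 1), and the sample size, number of samples and seed are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30    # observations per sample (assumed)
NUM_SAMPLES = 2000  # number of repeated samples (assumed)

def sample_mean(n):
    """Mean of n draws from Uniform(0, 1), a clearly non-normal population."""
    return statistics.mean([random.random() for _ in range(n)])

sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.stdev(sample_means)

# Theory: the sample mean has standard deviation sigma / sqrt(n),
# where sigma = sqrt(1/12) for Uniform(0, 1).
theoretical_spread = (1 / 12) ** 0.5 / SAMPLE_SIZE ** 0.5

print(mean_of_means, spread_of_means, theoretical_spread)
```

The sample means cluster around the population mean of 0.5 with a spread close to sigma / sqrt(n), which is the quantity a confidence interval for the mean is built from.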
29
The population
Is the whole group of interest and a superset of any sample; a sample is a representative subset of that larger group
30
The sample mean is a random variable as it varies from sample to sample.
Yes
31
The null hypothesis is held true
Until we have evidence to reject it
32
The p-value
The probability of observing a value at least as extreme as the test statistic, assuming the null hypothesis is true. It does not prove that the null hypothesis is true
33
Type I error
Rejection of an actually true null hypothesis
34
Type II error
The failure to reject a null hypothesis that is actually false
35
A statistical hypothesis is a statement that
Is about a population parameter and may or may not be true.
36
Is the significance level set before the p-value is found?
Yes, setting the significance level in advance helps us to avoid bias
37
If the p-value is less than the chosen significance level then we ?
We reject the null hypothesis, i.e. the sample gives reasonable evidence to support the alternative hypothesis.
38
If the obtained p-value is greater than the chosen significance level then we ?
We do not reject the null hypothesis.
39
What type of statistical analysis is used for two or more variables?
Covariance and correlation
40
Covariance: positive, negative and 0
Positive: the variables move in the same direction. Negative: they move in opposite directions. Zero: the two variables have no linear relationship.
41
Pearson's correlation analysis
Is used to establish a negative or positive linear correlation, with a coefficient between -1 and 1
42
Correlation and causation
Correlation does not imply causation: two variables can be correlated without one causing the other
43
Correlation is a ?
Standardized version of covariance. The value of the Pearson's correlation coefficient lies between +1 & -1
44
The variance of a random variable is ?
The variance of a random variable is nothing but the covariance of that variable with itself.
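The last two cards (correlation as standardized covariance, and variance as the covariance of a variable with itself) can be sketched in a few lines. The helper names and the data below are hypothetical, and the sample (n - 1) form of covariance is an assumption.

```python
def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = covariance(xs, xs) ** 0.5  # variance is cov(X, X)
    std_y = covariance(ys, ys) ** 0.5
    return covariance(xs, ys) / (std_x * std_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]   # moves exactly in line with xs
zs = [10, 8, 6, 4, 2]   # moves exactly opposite to xs

print(covariance(xs, xs))   # variance of xs
print(correlation(xs, ys))  # close to +1
print(correlation(xs, zs))  # close to -1
```

Dividing by both standard deviations removes the units, which is why correlation is bounded by -1 and +1 while covariance is not.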
45
The predicted variable is known as ?
The dependent variable; it depends on the independent variable(s)
46
A regression line is estimated using a method called
Ordinary least squares (OLS)
47
R² (Coefficient of Determination):
The higher the value of R², the larger the share of the variance in the dependent variable that the model explains
48
Value of the F statistic
The higher the value of the F statistic, the stronger the evidence that the model as a whole is significant.
49
Is multicollinearity good or bad?
Bad. The Variance Inflation Factor (VIF) is used to check for it.
50
What do you want in your data: heteroskedasticity or homoskedasticity?
Homoskedasticity. The check is the Cook-Weisberg test
51
Normality of errors, what is the test?
Kolmogorov-Smirnov test or Shapiro-Wilk test
52
Should error terms or residuals be correlated? What is the test?
Residuals should not be correlated. The test is the Durbin-Watson statistic (DW); a value close to 2 indicates no autocorrelation
53
Are there multiple independent variables in a linear regression model?
There can be. A simple linear regression model has one independent variable and one dependent variable; a multiple linear regression model has several independent variables
54
R-squared value goes up or down with more variables ?
When more variables are added to the regression model, the R-squared value typically increases. It can never decrease on adding a variable.
55
Multicollinearity is a desired condition for building a regression model.
No
56
In y = mx + c, which is beta and which is alpha?
Beta = m (the slope) and alpha = c (the intercept)
57
How do we work out whether to reject a null hypothesis?
If the p-value is less than the significance level (100% - confidence level)
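The decision rule on this card can be written as a one-line comparison. This is a minimal sketch; the function name `reject_null` and the default 95% confidence level are assumptions for illustration.

```python
def reject_null(p_value, confidence_level=0.95):
    """Reject the null hypothesis when the p-value is below alpha.

    alpha (the significance level) is 1 minus the confidence level.
    """
    alpha = 1.0 - confidence_level
    return p_value < alpha

print(reject_null(0.03))        # True: 0.03 is below alpha = 0.05
print(reject_null(0.20))        # False: 0.20 is above alpha = 0.05
print(reject_null(0.03, 0.99))  # False: 0.03 is above alpha = 0.01
```

The same p-value can lead to different decisions at different confidence levels, which is why the level is fixed before the test is run.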
58
Is alpha good or bad?
A large positive alpha is good: it is the return in excess of what beta explains.
59
What is Bayes' theorem?
Bayes' theorem, named after Thomas Bayes, describes the probability of an event A based on prior knowledge of a related condition B: P(A|B) = P(A ∩ B) / P(B) = P(B|A) * P(A) / P(B)
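A small worked sketch of Bayes' theorem in the form P(A|B) = P(B|A) * P(A) / P(B). The function name and all numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are hypothetical and used only to illustrate how a prior is updated into a posterior.

```python
def posterior(prior_a, prob_b_given_a, prob_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if prob_b <= 0:
        raise ValueError("P(B) must be positive")
    return prob_b_given_a * prior_a / prob_b

# Hypothetical inputs.
prior = 0.01                # P(A): prior probability of the event
true_positive_rate = 0.95   # P(B|A)
false_positive_rate = 0.05  # P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A).
prob_b = true_positive_rate * prior + false_positive_rate * (1 - prior)

post = posterior(prior, true_positive_rate, prob_b)
print(round(post, 4))  # about 0.161
```

Even with a fairly accurate signal, a small prior keeps the posterior low, which is the usual intuition this formula is used to teach.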
60
Expected Value
E[X] = SUM(value * probability) over all possible values
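The expected value formula can be sketched directly as a probability-weighted sum. The function name and the fair-die example are illustrative assumptions, not part of the original card.

```python
def expected_value(values, probs):
    """Sum of each value multiplied by its probability."""
    values = list(values)
    probs = list(probs)
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in zip(values, probs))

# A fair six-sided die: each face has probability 1/6.
die_mean = expected_value(range(1, 7), [1 / 6] * 6)
print(die_mean)  # approximately 3.5
```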
61
Covariance
How stocks move together; if they move in line, the covariance is positive and high.
62
Correlation is ?
-1 <= corr <= 1: if negative, the stocks tend to move in opposite directions.
63
How do you scale a stock's standard deviation over time?
= standard deviation * SQRT(T), where T is the number of periods
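A minimal sketch of the square-root-of-time rule on this card. The daily volatility of 1% and the 252 trading days per year are assumed example values, and the rule itself assumes returns are independent across periods.

```python
import math

def scale_volatility(std_per_period, periods):
    """Scale a per-period standard deviation to a horizon of `periods` periods."""
    return std_per_period * math.sqrt(periods)

daily_std = 0.01     # hypothetical 1% daily standard deviation
trading_days = 252   # common convention, assumed here

annual_std = scale_volatility(daily_std, trading_days)
print(round(annual_std, 4))  # about 0.1587
```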
64
Is geometric return the same as compounded return?
Yes
65
Hit ratio
Positive trades / all trades
66
Normalized hit ratio above 65%
(Profitable trades * average % win) / total over both winning and losing trades
67
Kelly fraction
It is used to work out the best percentage of wealth to invest
68
What is a good Sharpe ratio?
Average return / standard deviation; a value above 2 is good
69
Drawdown
The peak value minus the lowest point that follows it (the peak-to-trough decline)
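A minimal sketch of a maximum drawdown calculation, reading the card as a peak-to-trough decline. Expressing the decline as a fraction of the peak is an assumption, as are the function name and the equity values.

```python
def max_drawdown(values):
    """Largest fractional fall from a running peak to a later low."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                   # highest level seen so far
        worst = max(worst, (peak - value) / peak) # fall from that peak
    return worst

# Hypothetical equity curve: the peak of 120 is followed by a low of 80.
equity = [100, 120, 90, 110, 80, 130]
print(round(max_drawdown(equity), 4))  # about 0.3333
```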
70
Hit ratio
Number of wins / total number of trades. For example, with 51 wins and 3 losses, divide 51 by 54 for a hit ratio of 94.4%
71
Normalized hit ratio
(Number of wins * average % win) / ((wins * average % win) + (losses * average % loss))
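The two hit ratio cards can be sketched as below. The normalized version follows one reading of the card above, weighting wins and losses by their average percentage size; that reading, the function names and the average win and loss sizes are assumptions.

```python
def hit_ratio(wins, losses):
    """Share of trades that were winners."""
    return wins / (wins + losses)

def normalized_hit_ratio(wins, avg_win_pct, losses, avg_loss_pct):
    """Win share weighted by the average size of wins and losses."""
    weighted_wins = wins * avg_win_pct
    weighted_losses = losses * avg_loss_pct
    return weighted_wins / (weighted_wins + weighted_losses)

# The card's example: 51 wins and 3 losses.
print(round(hit_ratio(51, 3), 3))  # 0.944

# Hypothetical sizes: wins average 1%, losses average 5%.
print(round(normalized_hit_ratio(51, 1.0, 3, 5.0), 3))  # 0.773
```

With these made-up sizes the normalized figure is well below the raw hit ratio, because a few large losses offset many small wins.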
72
The p-values for all four coefficients are almost 0, so the coefficients are
Statistically significant at a very high level of confidence.
73
ARCH
Autoregressive Conditional Heteroskedasticity: a method that models a time-dependent change in the variance of a time series, such as increasing or decreasing volatility
74
Homoscedasticity, heteroscedasticity
Heteroscedasticity: the variance (volatility) of the data changes over time. Homoscedasticity: the variance is constant, so the data is much more stable
75
PACF
Partial autocorrelation function: the partial correlation of a stationary time series with its own lagged values, after removing the effect of the shorter lags
76
MA or AR , which shows surprises, sudden p
The MA model tries to capture the idiosyncratic shocks observed in financial markets.
77
Check for the normality of the residuals
Jarque-Bera
78
In the ADF test, if the p-value is greater than the level of significance, we conclude that:
The series is non-stationary
79
If a time series process is non-linear, which of the following type of model would likely describe it better?
Multiplicative
80
Seasonality component
The ‘seasonality’ component of a time series model does not try to capture the average value of the process. It shows similarities or repeating patterns over time.
81
Which of the following is categorized as the non-systematic component of a time series model?
Noise
82
Autocorrelation function (ACF):
Measures the correlation of a variable with a lagged version of itself. This is also called serial correlation
83
Which of the following statistical properties should remain constant in time for a time series to be stationary?
Mean, variance and covariance between lagged values
84
Implied volatility
A forward-looking estimate of volatility, computed from option prices and so driven by supply and demand
85
Which of the following is a covered call strategy?
Buy a stock and sell an ATM call.
86
A protective put strategy
Built by going long on a stock and simultaneously buying a put option.
87
A trader writes a put option at a strike price of INR 800 and receives a premium of INR 30. What is his profit or loss at expiry when the stock is trading at INR 840?
The trader has sold (written) the put. With the stock at 840, above the 800 strike, the put expires worthless and the trader keeps the premium: a profit of INR 30
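The payoff on this card can be checked with a short sketch of a written (short) put at expiry. The function name is hypothetical, and the second call with the stock at 750 is an added illustrative case, not part of the original card.

```python
def short_put_profit(strike, premium, spot_at_expiry):
    """Profit per share for the writer of a put, held to expiry."""
    # The writer owes the buyer max(strike - spot, 0) at expiry.
    owed_to_buyer = max(strike - spot_at_expiry, 0.0)
    return premium - owed_to_buyer

# The card's case: strike 800, premium 30, stock at 840.
print(short_put_profit(800, 30, 840))  # 30.0, the put expires worthless

# Added illustration: stock at 750, below the strike.
print(short_put_profit(800, 30, 750))  # -20.0
```

The writer's profit is capped at the premium and turns negative once the stock falls below strike minus premium, which is 770 here.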
88
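Cointegrated vs correlated
Cointegrated: two non-stationary time series whose linear combination (spread) is stationary, so they stay close over the long run; they do not have to be correlated. Correlated: the two series tend to move in a similar direction.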
89
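ADF
Augmented Dickey-Fuller test: tests a series for stationarity; applied to the spread between two series, it is used to check cointegration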
90
Lambda < 0
If lambda is significantly less than 0, we reject the null hypothesis and state that the series is stationary
91
What does a negative gradient line for a stock price indicate?
A negative slope (lambda < 0 in the regression of price changes on the lagged price, as on the previous card) indicates stationarity
92
What is covariance?
The relationship between two variables; when positive, both move in the same direction
93
What is the difference between covariance and correlation?
Covariance shows how two variables vary together (the direction of the relationship); correlation is standardized and shows both the direction and the strength of the relationship.