Statistics Flashcards

1
Q

Descriptive statistics

A

Methods for organising, summarising, and presenting data in an informative way, using graphs, tables, and numerical summaries

2
Q

Inferential statistics

A

Methods for drawing conclusions about a population, from a sample

3
Q

Qualitative (Categorical)

A

Nominal - Categories that cannot be ordered (e.g. male, female)
Ordinal - Categories that can be ordered, but the numerical difference between groups cannot always be determined (e.g. low-income, middle-income, high-income)

4
Q

Quantitative (Numerical)

A

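Discrete - Counted values, usually whole numbers (e.g. number of customers)
Continuous - Measured values that can fall anywhere in a range. Includes interval data (no true zero, e.g. temperature in °C) and ratio data (true zero, e.g. weight)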

5
Q

Raw data

A

Collected data that has not been organised numerically or grouped

6
Q

Frequency

A

How many times a value or category appears in the data. Can be expressed as a count of individuals (absolute frequency) or as a fraction/percentage of the total (relative frequency)

7
Q

Quartiles

A

Values that divide the ordered data into four equal parts (quarters)
1st quartile (Q1) - the value at or below which 25% of all data points lie, with 75% at or above it

8
Q

Percentiles

A

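Values that divide the ordered data into 100 equal parts (hundredths); the p-th percentile is the value at or below which p% of all data points lie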

9
Q

Median quartile

A

Second quartile, 50th Percentile

10
Q

Interquartile range (IQR)

A

IQR = Q3 - Q1
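
A minimal sketch of the quartile, median, and IQR cards in Python, assuming a small made-up data set and the standard-library `statistics.quantiles` function. The `inclusive` method is only one of several quartile conventions, so other software or textbooks may give slightly different Q1 and Q3 values.

```python
import statistics

# Hypothetical sample of 11 observations (illustrative only).
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]

# With n=4, quantiles() returns the three cut points Q1, Q2 (median), Q3.
# method="inclusive" is an assumption; the default "exclusive" method
# can return different quartiles for the same data.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: the spread of the middle 50% of the data.
iqr = q3 - q1

print("Q1:", q1, "median:", q2, "Q3:", q3, "IQR:", iqr)
```

The second quartile returned here matches `statistics.median(data)`, which is the point of the "Median quartile" card.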

11
Q

Sturges' rule

A

A rule for determining the number of classes (bins) to use in a histogram or frequency distribution table
$k = 1 + 3.322 \log_{10}(n)$, where n is the number of observations and k is rounded to a whole number
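
A short Python sketch of the rule. It uses the equivalent base-2 form, since log2(n) = log10(n) / log10(2), which is roughly 3.32 times log10(n). The function name `sturges_bins` and the choice to round up are assumptions made for illustration; some texts round to the nearest integer instead.

```python
import math

def sturges_bins(n: int) -> int:
    """Suggested number of classes for n observations (rounded up)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # 1 + log2(n) is the same rule as 1 + (about 3.32) * log10(n).
    return math.ceil(1 + math.log2(n))

for n in (30, 100, 1000):
    print(n, "observations ->", sturges_bins(n), "classes")
```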

12
Q

Mean calculation

A

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

13
Q

Median

A

Middle value of the ordered data set; it splits the data into two equal halves

14
Q

Mode

A

Value which occurs with the greatest frequency
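
As a quick check of the mean, median, and mode cards, the sketch below uses Python's standard `statistics` module on a hypothetical list of prices (the numbers are made up for illustration).

```python
import statistics

# Hypothetical prices, already sorted so the middle value is easy to see.
prices = [10, 12, 12, 15, 21]

mean = statistics.mean(prices)      # sum of the values divided by n
median = statistics.median(prices)  # middle value of the ordered data
mode = statistics.mode(prices)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the mean is larger than the median and mode because the single high price of 21 pulls it upwards.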

15
Q

Deviation from the mean

A

Difference between each observation and the mean (e.g. each price minus the average price)
$x_i - \bar{x}$

16
Q

Symmetric distribution

A

The two halves of the graph are mirror images of each other; mean = median

17
Q

Left skewed

A

Negatively skewed: the longer tail is on the left
Typically mode > median > mean

18
Q

Right skewed

A

Positively skewed: the longer tail is on the right
Typically mean > median > mode

19
Q

Variance

A

The average of the squared deviations from the mean
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

20
Q

Standard deviation

A

A quantity expressing by how much the members of a group differ from the mean value of the group; the square root of the variance
$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
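
The deviation, variance, and standard deviation cards can be worked through step by step. This is a sketch on hypothetical prices, using the sample form with n-1 in the denominator, and comparing the hand calculation with Python's `statistics.variance` and `statistics.stdev`.

```python
import math
import statistics

# Hypothetical prices (illustrative only).
data = [10, 12, 12, 15, 21]
n = len(data)

mean = sum(data) / n

# Deviation from the mean for each observation; these always sum to zero.
deviations = [x - mean for x in data]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = math.sqrt(sample_variance)

print("deviations:", deviations)
print("variance:", sample_variance, "library:", statistics.variance(data))
print("std dev:", sample_sd, "library:", statistics.stdev(data))
```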

21
Q

Skewness

A

Pearson's coefficient of skewness: $\frac{3(\bar{x} - \text{median})}{s}$, where s is the standard deviation

22
Q

Kurtosis

A

Measure of the tailedness of a distribution - how often outliers occur
$\frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4}{s^4}$
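
A hedged sketch of the skewness and kurtosis cards in Python. The function names are made up for illustration, and s is assumed to be the sample standard deviation (n-1 denominator); some texts use the population standard deviation instead, which changes the numbers slightly.

```python
import statistics

def pearson_skewness(data):
    """3 * (mean - median) / s, with s the sample standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)
    return 3 * (mean - median) / s

def kurtosis(data):
    """Average fourth-power deviation from the mean, divided by s**4."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    fourth_moment = sum((x - mean) ** 4 for x in data) / n
    return fourth_moment / s ** 4

# Hypothetical prices with one high value pulling the mean above the median.
prices = [10, 12, 12, 15, 21]
print("skewness:", pearson_skewness(prices))
print("kurtosis:", kurtosis(prices))
```

A positive skewness value here agrees with the right-skewed pattern of mean > median.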

23
Q

Cross-sectional data

A

Observations from a particular point in time, containing different variables

24
Q

Time series data

A

Observations of the same variable recorded across successive time periods

25
Q

Heteroskedasticity

A

Non-constant variance: the volatility of the data changes from one period to another

26
Q

Serial Correlation

A

Correlation between a time series and its own past (lagged) values, so that observations close together in time tend to follow the same trend

27
Q

Growth factor

A

$X_t / X_{t-1}$

28
Q

The approximate average growth rate

A

Arithmetic mean of the period-by-period growth rates. The first data point has no growth rate (there is no earlier value to compare it with), so n data points give n-1 growth rates and the sum is divided by n-1

29
Q

The accurate growth rate

A

Geometric mean of the growth factors, minus 1
$\sqrt[n-1]{\frac{X_2}{X_1} \cdot \frac{X_3}{X_2} \cdots \frac{X_n}{X_{n-1}}} - 1$

30
Q

Approximate average log equation

A

$\frac{\ln(X_n) - \ln(X_1)}{n-1}$
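
The growth cards above can be compared side by side. This is a sketch on a hypothetical series of four values; it assumes that n data points give n-1 growth factors, so the averages and the root are taken over n-1 periods.

```python
import math

# Hypothetical series: up 20%, down 25%, up 20% (illustrative only).
values = [100.0, 120.0, 90.0, 108.0]
n = len(values)

# Growth factor for each period: X_t / X_(t-1).
growth_factors = [values[t] / values[t - 1] for t in range(1, n)]
growth_rates = [g - 1 for g in growth_factors]

# Approximate average growth rate: arithmetic mean of the n-1 growth rates.
arithmetic_avg = sum(growth_rates) / (n - 1)

# Accurate average growth rate: geometric mean of the growth factors, minus 1.
geometric_avg = math.prod(growth_factors) ** (1 / (n - 1)) - 1

# Log approximation: difference of natural logs divided by n-1.
log_avg = (math.log(values[-1]) - math.log(values[0])) / (n - 1)

print("arithmetic:", arithmetic_avg)
print("geometric:", geometric_avg)
print("log:", log_avg)
```

In this example the arithmetic average overstates the growth actually achieved between the first and last value, while the geometric and log figures stay close to each other.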

31
Q

Probability

A

A measure of how likely an outcome is, expressed as a number from 0 (impossible) to 1 (certain)

32
Q

A chance experiment

A

A procedure carried out under controlled conditions which has a well-defined set of possible results

33
Q

Sample space

A

All possible outcomes

34
Q

Simple event

A

A single outcome

35
Q

Compound event

A

Collection of possible outcomes

36
Q

The law of large numbers

A

The greater the number of trials, the closer the observed relative frequency of an outcome gets to its theoretical probability
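
A small simulation can illustrate this card. The sketch below assumes a fair coin, simulated with Python's `random` module and a fixed seed so the run is repeatable; the function name is hypothetical.

```python
import random

def relative_frequency_of_heads(trials, rng):
    """Share of simulated fair-coin tosses that come up heads."""
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers

# As the number of trials grows, the share of heads should settle near 0.5.
for trials in (10, 1_000, 100_000):
    print(trials, relative_frequency_of_heads(trials, rng))
```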

37
Q

Independent events

A

P(A|B)=P(A)
P(B|A)=P(B)
P(A&B)=P(A)P(B)

38
Q

Complement rule

A

P(A)=1-P(A’)

39
Q

Multiplication rules (Joint Probability) And

A

Dependent: P(A&B)=P(A)P(B|A)
Independent: P(A&B)=P(A)P(B)
Mutually exclusive: P(A&B)=0
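
The three multiplication cases can be checked by counting outcomes. This is a sketch assuming one roll of a fair six-sided die; the events and helper names are made up for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die: all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))

def prob(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(len(set(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))

def cond_prob(a, b):
    """P(a | b): share of the outcomes in b that are also in a."""
    return Fraction(len(set(a) & set(b)), len(set(b)))

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
odd = {1, 3, 5}

# General rule: P(A&B) = P(A) * P(B|A).
joint = prob(even & at_most_4)
general_rule_holds = joint == prob(even) * cond_prob(at_most_4, even)

# Independent events: P(A&B) = P(A) * P(B).
independent = joint == prob(even) * prob(at_most_4)

# Mutually exclusive events: P(A&B) = 0.
exclusive_joint = prob(even & odd)

print(joint, general_rule_holds, independent, exclusive_joint)
```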

40
Q

Addition Rules (Union of Events) Or

A

Non mutually exclusive events: P(A or B)=P(A) + P(B) - P(A&B)
Mutually exclusive: P(A or B)=P(A) + P(B)
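
Both addition rules can be verified by counting. The sketch assumes one roll of a fair six-sided die, with hypothetical events chosen for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die: all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))

def prob(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(len(set(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

# Not mutually exclusive: 4 and 6 are in both events, so subtract the overlap.
p_union = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)

# Mutually exclusive: a roll cannot be both 1 and 2, so there is no overlap.
p_one_or_two = prob({1}) + prob({2})

print(p_union, p_one_or_two)
```

Counting the union directly, `prob(even | greater_than_3)`, gives the same value as `p_union`.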

41
Q

Bayes’ Theorem

A

P(A|B)=P(B|A)*P(A)/P(B)
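
A worked example of the formula, using made-up probabilities. P(B) is not given directly here, so the sketch computes it with the law of total probability, which is an extra step not shown on the card.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical numbers, chosen only for illustration.
p_a = Fraction(1, 100)              # P(A)
p_b_given_a = Fraction(95, 100)     # P(B|A)
p_b_given_not_a = Fraction(5, 100)  # P(B|A')

# P(B) = P(B|A)P(A) + P(B|A')P(A'), with P(A') = 1 - P(A).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_a_given_b, float(p_a_given_b))
```

Even with P(B|A) at 95%, P(A|B) stays well below one half in this example because A itself is rare.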

42
Q

Factorial

A

n! - the product of all positive integers from 1 up to and including n, e.g. 4! = 4 × 3 × 2 × 1 = 24

43
Q

Permutation

A

Number of unique ways of arranging r items chosen from a set of n, where order matters
Calculated using factorials
P(n,r)=n!/(n-r)!
e.g. with n=6: P(6,r)=6!/(6-r)!

44
Q

Combination
Binomial Coefficient

A

Number of ways of selecting k items from n where the order of selection does not matter
C(n,k)=n!/(k!(n-k)!)
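
A short check of the permutation and combination counts, using the standard definitions n!/(n-r)! and n!/(r!(n-r)!) and comparing them with Python's built-in `math.perm` and `math.comb`. The values n = 6 and r = 2 are just an example.

```python
import math

n, r = 6, 2  # choose 2 items from 6 (example values)

# Permutations: order matters, so AB and BA count separately.
permutations = math.factorial(n) // math.factorial(n - r)

# Combinations: order does not matter, so AB and BA count once.
combinations = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print("permutations:", permutations, "library:", math.perm(n, r))
print("combinations:", combinations, "library:", math.comb(n, r))
```

Each combination of r items can be ordered in r! ways, which is why the permutation count is r! times the combination count.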

45
Q

Discrete Random variables

A

A variable whose value is random but can only take a countable number of distinct values

46
Q

Probability distribution function

A

A function giving the probability of each possible value of a random variable; for a discrete variable the probabilities sum to 1

47
Q

Expected value

A

Long-term average (mean) of a random variable over many repetitions: $E(X) = \sum x \, P(x)$

48
Q

Law of large numbers

A

The higher the number of trials, the more closely the observed results resemble the theoretical probabilities, and the closer the observed average gets to the expected value

49
Q

Standard deviation of a probability function

A

$\sigma = \sqrt{\sum (x - E(X))^2 \, P(x)}$
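
A sketch of the expected value and standard deviation of a discrete distribution. The example distribution (number of heads in two fair coin tosses) is an assumption chosen for illustration; note that the deviations are squared before being weighted by their probabilities.

```python
import math

# Hypothetical distribution: number of heads in two fair coin tosses.
distribution = {0: 0.25, 1: 0.5, 2: 0.25}

# Expected value: each value weighted by its probability.
expected_value = sum(x * p for x, p in distribution.items())

# Variance: squared deviation from E(X), weighted by probability.
variance = sum((x - expected_value) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("E(X):", expected_value, "variance:", variance, "std dev:", std_dev)
```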

50
Q

Characteristics of Binomial distribution

A

Fixed number of trials, n
Only two possible, mutually exclusive outcomes per trial (success or failure)
The trials are independent, with the same probability of success p in each
X~B(n,p)

51
Q

Binomial sample space

A

(number of outcomes per trial)^n, i.e. 2^n possible sequences for n trials with two outcomes each

52
Q

Number of combinations

A

$P(X = x) = {}^{n}C_{x} \, p^{x} (1-p)^{n-x}$
${}^{n}C_{x} = \frac{n!}{x! \, (n-x)!}$

The number of combinations, multiplied by the probability of success raised to the number of successes, multiplied by the probability of failure raised to the number of failures
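
The binomial probability formula can be sketched directly in Python with `math.comb`. The function name `binomial_pmf` and the coin-toss numbers are assumptions for illustration.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): combinations * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example: probability of exactly 2 heads in 4 tosses of a fair coin.
p_two_heads = binomial_pmf(2, 4, 0.5)
print(p_two_heads)

# The probabilities over all possible x (0 to n) should add up to 1.
total = sum(binomial_pmf(x, 4, 0.5) for x in range(5))
print(total)
```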

53
Q

Probability density function

A

Represents the distribution of a continuous random variable

54
Q

Cumulative distribution function

A

Area under the curve of the probability density function up to a given value: F(x) = P(X ≤ x)
Used to evaluate the probability of X taking values in a particular interval
Probability = Area

55
Q

Properties of Continuous probability distributions

A

The outcomes of random variable X are measured, not counted
The entire area under the curve equals 1
P(c<x<d) is the probability that the random variable X takes a value x in the interval between the values c and d. P(c<x<d) is the area under the curve, above the x-axis, to the right of c and the left of d
P(x = c) = 0 The probability that x takes on any single individual value is zero. The area below the curve, above the X-axis, and between x = c and x = c has no width, and therefore no area (area = 0)
P(c < x < d) is the same as P(c ≤ x ≤ d) because probability is equal to area (and the probability that X is equal to the end points c or d is 0)
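
These area properties can be illustrated with a cumulative distribution function. The sketch below assumes a normally distributed variable with a made-up mean and standard deviation, using `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Hypothetical continuous variable: mean 170, standard deviation 10.
X = NormalDist(mu=170, sigma=10)

c, d = 165, 185

# P(c < X < d) is the area between c and d: area left of d minus area left of c.
p_interval = X.cdf(d) - X.cdf(c)

# P(X = c) is the area between c and c, which has no width.
p_single_value = X.cdf(c) - X.cdf(c)

print("P(c < X < d):", p_interval)
print("P(X = c):", p_single_value)
```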

56
Q

Bell curve distribution

A

Mean=Median=Mode

57
Q

X~N(μ,σ)

A

Normal distribution with mean μ and standard deviation σ

58
Q

The standard normal distribution

A

Denoted by Z
Transforms any normally distributed variable X into Z, such that Z has a mean of 0 and a standard deviation of 1

For X~N(a,b), where a is the mean of X and b is its standard deviation:
Subtract the mean and then divide by the standard deviation (on both sides of a statement such as X < x): Z = (X - a) / b
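
The standardisation step can be written as a one-line function. This is a sketch with hypothetical numbers; the name `z_score` is an assumption for illustration.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Standardise x: subtract the mean, then divide by the standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical example: a value of 185 when the mean is 170 and the SD is 10.
z = z_score(185, 170, 10)
print(z)  # how many standard deviations 185 lies above the mean
```

A value equal to the mean gives z = 0, and values below the mean give negative z-scores.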

59
Q

Bell curve distribution Rule

A

About 68% of values lie within one standard deviation of the mean
About 95% within two standard deviations
About 99.7% within three standard deviations
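
These percentages can be checked against the standard normal distribution. The sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, "standard deviation(s):", round(share_within(k) * 100, 1), "%")
```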