S1 Statistics Flashcards

(46 cards)

1
Q

Positive Skew

A

Q3 - Q2 > Q2 - Q1 or Mean > Median

2
Q

Negative Skew

A

Q2 - Q1 > Q3 - Q2 or Median > Mean
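The quartile rule on the two skew cards can be sketched as a small check. This is only an illustration: the function name `quartile_skew` and the example quartiles are made up, not part of the deck.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the gaps on either side of the median (Q2)."""
    upper_gap = q3 - q2  # distance from the median up to the upper quartile
    lower_gap = q2 - q1  # distance from the lower quartile up to the median
    if upper_gap > lower_gap:
        return "positive"
    if lower_gap > upper_gap:
        return "negative"
    return "symmetric"


# Hypothetical quartiles: the upper gap (11) is larger than the lower gap (4).
print(quartile_skew(10, 14, 25))
```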

3
Q

Frequency

A

frequency density × class width

4
Q

Qualitative Variables

A

Non-numerical - e.g. red, blue or long, short etc.

5
Q

Quantitative Variables

A

Numerical - e.g. length, age, time, number of coins in pocket, etc

6
Q

Continuous Variables

A

Can take any value within a given range - e.g. height, time, age, etc.

7
Q

Discrete Variables

A

Can only take certain values - e.g. shoe size, cost in £ and p, number of coins.

8
Q

Mode

A

The value, or class interval, which occurs most often.

9
Q

Linear Interpolation

A

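Median = a + (b / c) × w, where a = lower boundary of the median group, b = number of values from the start of that group to the median (n/2 - cumulative frequency before the group), c = frequency of the group, w = width of the group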
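A minimal sketch of linear interpolation for the median of grouped data, assuming the median sits at position n/2. The function name, parameter names and the grouped data are hypothetical.

```python
def interpolated_median(lower_bound, class_width, class_freq, cum_freq_before, n):
    """Estimate the median of grouped data by linear interpolation.

    lower_bound      -- lower boundary of the group containing the median
    class_width      -- width of that group
    class_freq       -- frequency of that group
    cum_freq_before  -- total frequency of all earlier groups
    n                -- total number of values
    """
    positions_into_class = n / 2 - cum_freq_before
    return lower_bound + (positions_into_class / class_freq) * class_width


# Hypothetical data: 40 values, median group 20-30 with frequency 16,
# and 12 values in the groups before it: 20 + (8 / 16) * 10.
print(interpolated_median(20, 10, 16, 12, 40))  # 25.0
```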

10
Q

Interquartile Range

A

Q3 - Q1

11
Q

Variance Formula

A

Σx² / n - (Σx / n)², i.e. the mean of the squares minus the square of the mean

12
Q

Standard Deviation

A

the square root of the variance
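The variance and standard deviation cards can be checked numerically with a short sketch. The data set here is made up for illustration; the formula is the "mean of the squares minus the square of the mean" form.

```python
import math


def variance(xs):
    """Variance as (sum of x squared / n) minus (mean squared)."""
    n = len(xs)
    mean_of_squares = sum(x * x for x in xs) / n
    mean = sum(xs) / n
    return mean_of_squares - mean ** 2


def standard_deviation(xs):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(xs))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: sum 40, sum of squares 232
print(variance(data))            # 232/8 - 5**2 = 4.0
print(standard_deviation(data))  # 2.0
```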

13
Q

Addition Law

A

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

14
Q

P(B|A)

A

P(A ∩ B) / P(A)
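The addition law and the conditional probability formula can be checked on a small sample space. This sketch assumes one roll of a fair die; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # assumed example: one roll of a fair die
A = {2, 4, 6}                # the roll is even
B = {4, 5, 6}                # the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A) + prob(B) - prob(A & B)

# Conditional probability: P(B | A) = P(A and B) / P(A)
p_b_given_a = prob(A & B) / prob(A)

print(p_union)      # 2/3, the same as counting A | B directly
print(p_b_given_a)  # 2/3
```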

15
Q

Mutually Exclusive Addition Rule

A

P(A ∪ B) = P(A) + P(B)

16
Q

Mutually exclusive intersection

A

P(A ∩ B) = 0

17
Q

Independent Event

A

The occurrence of one event does not affect the probability of the other

18
Q

Independent: P(A|B) =

A

P(A)

19
Q

Independent multiplication rule

A

P(A ∩ B) = P(A) × P(B)
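One way to see the multiplication rule in action is to test it on a small sample space. The sketch below assumes two tosses of a fair coin; the events and the helper `is_independent` are hypothetical names for illustration.

```python
from fractions import Fraction

outcomes = {"HH", "HT", "TH", "TT"}  # assumed example: two fair coin tosses
first_head = {"HH", "HT"}
second_head = {"HH", "TH"}
both_heads = {"HH"}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


def is_independent(a, b):
    """Events are independent exactly when P(A and B) = P(A) x P(B)."""
    return prob(a & b) == prob(a) * prob(b)


print(is_independent(first_head, second_head))  # True: 1/4 == 1/2 * 1/2
print(is_independent(first_head, both_heads))   # False: 1/4 != 1/2 * 1/4
```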

20
Q

Product Moment Correlation Coefficient (PMCC)

A

A quantity between -1 and 1 that measures the strength of the linear relationship between two variables. Close to -1: strong negative correlation. Close to 1: strong positive correlation. Close to 0: little or no linear correlation.

21
Q

If scale changes for PMCC (correlation) ….

A

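The PMCC (correlation) stays the same under linear coding (adding a constant to either variable, or multiplying either by a positive constant)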
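A short sketch of the PMCC, using the usual summary statistics Sxx, Syy and Sxy (an assumption, since the cards do not state the formula), shows that coding the data leaves the value unchanged. The data and the coding are made up.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]

# Hypothetical coding: change the scale and origin of both variables.
coded_xs = [10 * x + 3 for x in xs]
coded_ys = [y / 2 for y in ys]

print(pmcc(xs, ys))
print(pmcc(coded_xs, coded_ys))  # same value, up to rounding
```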

22
Q

Linear Regression

A

a statistical method used to fit a linear model to a given data set (basically best fit line)
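As an illustration of fitting a best-fit line, here is a minimal least-squares sketch for y = a + bx. The least-squares formulas b = Sxy / Sxx and a = mean(y) - b × mean(x) are assumed, since the card does not state a method, and the data is hypothetical.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = sxy / sxx                      # gradient
    a = sum(ys) / n - b * sum(xs) / n  # intercept: the line passes through the means
    return a, b


xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]

a, b = regression_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")

# Estimate y at x = 3.5, which lies inside the range of the x data.
print(a + b * 3.5)
```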

23
Q

Reliable Regression

A

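Estimates made for values within the range of the data (interpolation). Estimates outside that range (extrapolation) are unreliable.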

24
Q

Standard Deviation Definition

A

a measure that is used to quantify the amount of variation or dispersion of a set of data values

25
Discrete Random Variable
A random variable that can only take certain values
26
Discrete Uniform Distribution
Every outcome has the same probability
27
F(x)
Cumulative distribution function: F(x) = P(X ≤ x). The probabilities accumulate up to 1 at the largest value.
28
E(X)
The expected value (the mean you would get if you repeated the experiment many times): E(X) = Σ x × P(X = x)
29
Variance of X: Var(X)
E(X²) - [E(X)]² | Mean of the squares minus the square of the mean
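The expectation and variance cards can be worked through on a small distribution. This is a sketch only: the probability table is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical probability distribution of X: value -> P(X = value)
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expectation(d):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in d.items())


def variance(d):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean_of_squares = sum(x * x * p for x, p in d.items())
    return mean_of_squares - expectation(d) ** 2


print(expectation(dist))  # 1
print(variance(dist))     # 3/2 - 1 = 1/2
```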
30
Expected value is affected by multiplication, division, subtraction, addition
E(4X + 1) = 4E(X) + 1 | E(1 - X) = 1 - E(X) | E(X/2) = E(X) / 2
31
Variance is not affected by addition and subtraction, but is affected by multiplication and division
```
For variance, you have to square the multiplier
Var(4X) = 16Var(X)
Var(X + 1) = Var(X)
Var(3X + 2) = 9Var(X)
Var(X/2) = 1/4 Var(X)
```
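The coding rules for expectation and variance can be verified directly by transforming a distribution. The distribution and the helper `transform` below are hypothetical, chosen only to illustrate the rules.

```python
from fractions import Fraction

# Hypothetical probability distribution of X: value -> P(X = value)
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expectation(d):
    return sum(x * p for x, p in d.items())


def variance(d):
    return sum(x * x * p for x, p in d.items()) - expectation(d) ** 2


def transform(d, a, b):
    """Distribution of aX + b (a must be non-zero so values stay distinct)."""
    return {a * x + b: p for x, p in d.items()}


coded = transform(dist, 3, 2)  # Y = 3X + 2

print(expectation(coded), 3 * expectation(dist) + 2)  # both 5
print(variance(coded), 9 * variance(dist))            # both 9/2
```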
32
Independent Rule: P(A ∩ B) =
P(A) × P(B)
33
Normal Distribution Formula
X ~ N(μ, σ²)
34
Standardizing Formula: Z =
(X - μ) / σ, i.e. (value - mean) / standard deviation
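A brief sketch of standardizing, assuming a hypothetical variable X ~ N(50, 4²). It uses `NormalDist` from Python's standard `statistics` module in place of the printed normal tables.

```python
from statistics import NormalDist


def standardize(x, mean, sd):
    """Z = (x - mean) / standard deviation."""
    return (x - mean) / sd


mean, sd = 50, 4  # hypothetical: X ~ N(50, 4^2), so the variance is 16
z = standardize(58, mean, sd)

# P(X < 58) equals P(Z < z) for the standard normal Z ~ N(0, 1).
p_below = NormalDist().cdf(z)

print(z)                  # 2.0
print(round(p_below, 4))
```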
35
Explain why a histogram is appropriate for this data
Data is continuous
36
Explain why this diagram would support the fitting of a regression line of x onto y.
The points lie close to a straight line of best fit, so there is strong linear correlation in the data
37
Which is the explanatory variable?
The variable that influences the other variable. E.g. the explanatory variable is the age of each coin, because the age is set and the weight varies.
38
Give a reason to support the use of normal distribution in this case
The mean and median are very close, so the data is roughly symmetric. If the data were skewed, a normal distribution would not be a good fit.
39
It was discovered that a coin in the original sample, which was 5 years old and weighed 20 grams, was a fake. State whether the exclusion of this coin would increase or decrease the value of the PMCC. Give a reason for your answer
It would decrease the value of the PMCC, moving it closer to -1, because removing the fake coin results in a better linear fit
40
Write down 2 of these events that are mutually exclusive. Give a reason for your answer
No overlap between events
41
State whether the estimate of the mean or median is a better representation of the average speed of traffic on the road
- Mean is a better representation because it uses all the data OR - Median is better because data is skewed and median is not affected by extreme values
42
Comment on shape of distribution
- skewness because ... | - symmetric because median is similar to mean
43
What happens to the median estimate when values below the median change?
The median remains the same because the values that change are below the median
44
What happens to the mean estimate when values become lower than they used to be?
The mean would decrease because the changes reduce the total Σx
45
What happens to the standard deviation estimate when values become lower than they used to be?
The standard deviation would increase because the data is more spread out
46
Response Variable
The dependent variable. The variable being studied