Week 19 Flashcards

1
Q

What are three types of values?

A

Categorical - value comes from one of n non-numeric categories, e.g. favourite colour

Numerical - value is a number, e.g. height of student

Ordinal - value comes from one of n ordered categories, e.g. number of stars rating

2
Q

What are mean, median and mode known as?

A

Measures of central tendency
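
A minimal sketch of the three measures using Python's standard `statistics` module. The `heights` list is made-up data for illustration only:

```python
import statistics

# Hypothetical sample: heights of five students in cm.
heights = [160, 165, 165, 170, 190]

mean_height = statistics.mean(heights)      # arithmetic average
median_height = statistics.median(heights)  # middle value once sorted
mode_height = statistics.mode(heights)      # most frequent value

print(mean_height, median_height, mode_height)
```

Note how the single tall student pulls the mean above the median and mode.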

3
Q

What are descriptive statistics?

A

Statistics that describe/summarize the data, e.g. computing an average, without extrapolating beyond it

4
Q

What are measures of variability?

A

Measures of how spread out the data is around the mean, e.g. variance and standard deviation
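
As a rough illustration, two made-up data sets with the same mean can have very different spread. The sketch below uses the population standard deviation (`statistics.pstdev`) as one common measure of variability:

```python
import statistics

# Two hypothetical data sets, both with mean 10.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

# Population standard deviation: square root of the mean squared
# deviation from the mean.
tight_sd = statistics.pstdev(tight)
wide_sd = statistics.pstdev(wide)

print(tight_sd, wide_sd)  # wide_sd is much larger than tight_sd
```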

5
Q

What are inferential statistics?

A

Statistics that draw conclusions going beyond the sample data, e.g. about the population the sample came from

6
Q

What are the two interpretations of probability?

A

Relative frequency: how often something happens on average

Degree of belief: subjective opinion of some individual regarding how certain an event is to occur (used when the experiment is not really repeatable)

7
Q

What are the sample space and an event?

A

Sample/outcome space: Set of all possible outcomes (e.g. {1,2,3,4,5,6} for rolling a die)

Event: subset of sample space (e.g. event E of getting a value less than 4: E={1,2,3})
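
The die example can be sketched with Python sets. The probability step assumes a fair die (every outcome equally likely), which the card does not state:

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Event E: the roll is less than 4. An event is a subset of the sample space.
event = {outcome for outcome in sample_space if outcome < 4}

# Assumption: fair die, so P(E) = |E| / |sample space|.
p_event = Fraction(len(event), len(sample_space))

print(event, p_event)
```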

8
Q

How is a random variable used with probability?

A

Assigns a numerical value to each outcome. E.g. experiment where 3 coins are tossed:

Y = number of heads

Y takes the values 0, 1, 2 or 3

Y=0 corresponds to {TTT}
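
A small sketch of the three-coin example, enumerating the outcomes and mapping each one to its value of Y. The variable names are illustrative:

```python
from itertools import product

# All 8 outcomes of tossing 3 coins, e.g. "HHT".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Random variable Y: maps each outcome to its number of heads.
Y = {outcome: outcome.count("H") for outcome in outcomes}

# Outcomes that correspond to particular values of Y.
zero_heads = [o for o, y in Y.items() if y == 0]
two_heads = sorted(o for o, y in Y.items() if y == 2)

print(zero_heads)  # ['TTT']
print(two_heads)   # ['HHT', 'HTH', 'THH']
```

Several outcomes can share the same value of Y, as the Y=2 case shows.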

9
Q

What are the two types of random variables?

A

Discrete: Takes countable values, e.g. number of heads

Continuous: Takes any real value in a range, e.g. 1.534

10
Q

What is a discrete probability distribution?

A

P(X=x) gives probabilities for each possible value of x
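
For instance, the distribution of X = number of heads in 3 coin tosses can be written out as a table of P(X=x). This sketch assumes fair coins, so each of the 2^3 outcomes is equally likely:

```python
from fractions import Fraction
from math import comb

n = 3  # number of coin tosses

# P(X = x) = (number of outcomes with x heads) / (total outcomes),
# assuming fair coins.
pmf = {x: Fraction(comb(n, x), 2 ** n) for x in range(n + 1)}

for x, p in pmf.items():
    print(f"P(X={x}) = {p}")
```

The probabilities over all possible values sum to 1.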

11
Q

What is a continuous probability distribution?

A

Defined by a probability density function. The area under the function over a range gives the probability that X falls in that range; the probability of any single exact value is 0

12
Q

With normal distributions,

__% of the data is within the first standard deviation from the mean

__% is within two stddev

__% is within three stddev

A

68%

95%

99.7%
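
These percentages are rounded. A quick check with `statistics.NormalDist` from the Python standard library; the helper name `within` is just for illustration:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Probability that a normal value lies within k stddev of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))
```

The same proportions hold for any normal distribution, not just the one with mean 0 and stddev 1.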

13
Q

What is a standard normal distribution?

A

A normal distribution with mean = 0 and stddev = 1

14
Q

How can a normally distributed variable X be converted to a Z distribution?

A

Z = (X - mean) / stddev
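
A worked instance of the formula, using made-up numbers (heights with mean 170 and stddev 10 are assumptions for illustration):

```python
def z_score(x, mean, stddev):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / stddev

# Hypothetical: a height of 185 when heights have mean 170 and stddev 10.
z = z_score(185, 170, 10)
print(z)  # 1.5
```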

15
Q

What is the difference between population and sample?

A

Population is a universe of individuals you’re interested in (e.g. all people in Colchester, infinite number of coin flips)

Sample is a subset of a population that should be representative (e.g. 100 coin flips, 100 people from Colchester)

16
Q

What is the difference between the true mean and s.d. and the sample mean and s.d.?

A

True = computed on the whole population

Sample = computed on a sample

17
Q

What is the primary concern with using sample statistics?

A

Sampling variability - statistics differ from sample to sample, and it is hard to get a representative sample

18
Q

What is the sampling distribution?

A

The distribution of a sample statistic (e.g. the mean), obtained by taking a very large number of samples of size N and plotting the statistic for each

The random variable is then the sample statistic, not the individual values

19
Q

What is the standard error?

A

The standard deviation of the sampling distribution

Aka the uncertainty of the sample mean: if I take different samples, how much do the means vary?

20
Q

What happens to the standard error as the sample size increases?

A

Standard error decreases

21
Q

How can you approximate the standard error from sample standard deviation?

A

s / sqrt(N)

where s = sample stddev, N = sample size (number of observations in the sample)
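
A minimal sketch of the calculation on a made-up sample of 8 observations. `statistics.stdev` gives the sample standard deviation (dividing by N - 1):

```python
import math
import statistics

# Hypothetical sample of 8 observations.
sample = [4, 8, 6, 5, 3, 7, 9, 6]

s = statistics.stdev(sample)              # sample standard deviation
standard_error = s / math.sqrt(len(sample))

print(s, standard_error)
```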

22
Q

What does the central limit theorem say?

A

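As the sample size N becomes large, the sampling distribution of the mean can be approximated by a normal distribution, whatever the shape of the population distribution

Rule of thumb: N of about 30 or more is usually enough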
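
A rough simulation sketch of the idea. The population here is a fair six-sided die (uniform, not normal), and the sample size, number of samples and seed are arbitrary choices:

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N = 30              # size of each sample
NUM_SAMPLES = 5000  # how many samples to draw

def roll_mean(n):
    """Mean of n rolls of a fair six-sided die."""
    return statistics.mean([random.randint(1, 6) for _ in range(n)])

# Approximate the sampling distribution of the mean.
sample_means = [roll_mean(N) for _ in range(NUM_SAMPLES)]

centre = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

# For a normal distribution roughly 68% of values lie within one stddev.
within_one_sd = sum(abs(m - centre) <= spread for m in sample_means) / NUM_SAMPLES

print(centre, spread, within_one_sd)
```

The sample means cluster around the die's true mean of 3.5, and the share within one standard deviation should come out near the 68% expected of a normal distribution.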

23
Q

What are the implications of the central limit theorem?

A

Get one sample

Compute sample mean

Get probability of the sample mean under the sampling distribution

  • Can get this probability without doing the sampling many times
24
Q

What is the Z test?

A

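Collect a sample of N observations, compute the sample mean and standard error

z = (sample mean - population mean under the null hypothesis) / standard error

Reject the null hypothesis at the 5% significance level if z < -1.96 or z > 1.96

95% of a standard normal distribution lies within 1.96 standard deviations of the mean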
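
The steps above can be sketched as a small function. The data and the hypothesised population means are made up, and the standard error is estimated from the sample as s / sqrt(N):

```python
import math
import statistics

def z_test(sample, population_mean, critical=1.96):
    """Return (z, reject) for a two-tailed z test at the 5% level."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    return z, abs(z) > critical

# Hypothetical sample of 40 observations with mean 6.
sample = [4, 8, 6, 5, 3, 7, 9, 6] * 5

print(z_test(sample, population_mean=6))  # z is 0, null hypothesis not rejected
print(z_test(sample, population_mean=5))  # z is above 1.96, null hypothesis rejected
```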

25
Q

What is the p value?

A

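The probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one observed. Reject the null hypothesis when the p value falls below the chosen significance level (e.g. 0.05)

Look up the p value on a table, e.g. for 2 tailed hypothesis, p<0.05 = 0.42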
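
Instead of a printed table, a two-tailed p value for a z statistic can be computed from the standard normal CDF. A sketch using the standard library; `two_tailed_p` is an illustrative name:

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(two_tailed_p(1.96))  # close to 0.05
print(two_tailed_p(0.0))   # 1.0: the observed mean equals the hypothesised mean
```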

26
Q

What is a common use of the z test?

A

Check whether a sample mean differs significantly from a known population mean

27
Q

What is a paired t-test?

A

Compares two population means where you have two samples in which observations in one sample can be paired with observations in the other sample

e.g. students’ scores before and after a module or course
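
A minimal sketch of the paired t statistic on made-up before/after scores for five students. The test works on the per-student differences; the resulting t would then be compared against a t table with N - 1 degrees of freedom:

```python
import math
import statistics

# Hypothetical scores for the same five students, in the same order.
before = [52, 60, 45, 70, 58]
after = [58, 65, 50, 72, 65]

# Pair each student's observations and take the difference.
differences = [a - b for a, b in zip(after, before)]

mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(len(differences))

# t statistic: mean difference measured in standard errors.
t = mean_diff / se_diff
degrees_of_freedom = len(differences) - 1

print(t, degrees_of_freedom)
```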

28
Q

Between-group testing is a form of ___________ testing

A

Non-paired

29
Q

What is involved in between-group testing?

A

An experiment with two or more groups that each have different testing conditions

E.g. control group and test group

30
Q

Repeated-measures design is also known as ___________

and is a form of ________ testing

A

within group design

paired testing

31
Q

What is repeated-measures testing?

A

Using the same subjects for every condition of the experiment

i.e. a longitudinal study where each testing condition is done at some point for each subject

32
Q

What are the advantages and disadvantages of between groups design?

A

Advantages: multiple variables can be tested at the same time

Can save time

Disadvantages: potential scale can be impractical due to limited resources

selection of subjects may not be representative