Descriptive Statistics, Inferential Statistics & Statistical Power Flashcards

1
Q

What are the three measures of central tendency that describe the middle values of a data set?

A

Mode

▪ The most frequently occurring value

Median

▪ The middle score (also called the 50th percentile) of all the data once they are ordered

  • The median depends only on the number and order of the scores, not on the magnitudes of the values

Mean

▪ Arithmetic average of all data points

  • Reflects both the number of scores and the values of all scores
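
A minimal sketch of the three measures using Python's standard `statistics` module. The scores are made up for illustration; the second list replaces the one extreme score to show that the median stays put while the mean moves.

```python
import statistics

# Made-up scores with one extreme value (40).
scores = [2, 3, 3, 5, 7, 8, 40]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle score once the data are ordered
mean = statistics.mean(scores)      # arithmetic average of all data points

# Same data with the extreme score replaced by a typical one.
no_outlier = [2, 3, 3, 5, 7, 8, 9]
median_no_outlier = statistics.median(no_outlier)
mean_no_outlier = statistics.mean(no_outlier)

print(mode, median, mean)
print(median_no_outlier, mean_no_outlier)
```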
2
Q

What are the three measures of variability that describe the spread or dispersion of the data?

A

Range & Inter-quartile Range (IQR)

Range: The spread between the highest and lowest values of data

IQR: The spread between the 75th percentile and the 25th percentile scores

Both consider only two points to determine variability

Variance (variability: how close the data are to the mean) (V, s², or σ²)

Variance uses the squared difference (d) of each score (X) from the mean (X̄) to estimate the spread of the data

Standard Deviation (s or σ)

The square root of the variance (V)

The typical distance of the scores from the mean, in the original units of the data
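
A short sketch of these measures on made-up data, using only the standard library. The variance here divides by n − 1 (the sample form), and `statistics.quantiles` uses one of several common percentile conventions, so an IQR from other software may differ slightly.

```python
import statistics

scores = [4, 8, 6, 5, 3, 7, 9, 6]  # made-up data

# Range: highest minus lowest value.
value_range = max(scores) - min(scores)

# IQR: 75th percentile minus 25th percentile.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Variance: squared differences from the mean, summed and divided by n - 1.
mean = statistics.mean(scores)
squared_diffs = [(x - mean) ** 2 for x in scores]
variance = sum(squared_diffs) / (len(scores) - 1)

# Standard deviation: square root of the variance.
std_dev = variance ** 0.5

print(value_range, iqr, variance, std_dev)
```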

3
Q

What is the difference between the population and the sample standard deviation?

A

Samples rarely capture the extreme values that are present in a population, so the standard deviation of a sample tends to underestimate that of the population.

Therefore, when dealing with sample data you subtract 1 from your sample size to derive the degrees of freedom (n − 1), and divide by that instead of n

Degrees of freedom = Number of independent pieces of information that go into the estimate of a parameter
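
As an illustration with made-up numbers, the standard library exposes both forms: `statistics.pstdev` divides by n, and `statistics.stdev` divides by n − 1.

```python
import statistics

sample = [4, 8, 6, 5, 3, 7, 9, 6]  # made-up data

pop_sd = statistics.pstdev(sample)    # treats the data as the whole population (divide by n)
sample_sd = statistics.stdev(sample)  # treats the data as a sample (divide by n - 1)

# Dividing by the smaller number (n - 1) gives the larger estimate.
print(pop_sd, sample_sd)
```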

4
Q

What is statistical significance?

A

Statistical significance implies that we are reasonably confident that an effect in our sample is big enough compared to any error in our estimate that it reflects the true state of the population

5
Q

True or False: The bigger my sample size the more confident I am that even a small difference between samples will be present in the population.

A

True

6
Q

What is common about all of these statistics used to make different inferences about a population?

A

All represent a ratio of (or a value scaled as a function of) the variance that is explained (effect) to the sampling error (i.e. an effect that could have occurred due to chance)

7
Q

What is the difference between statistical significance and effect size?

A

▪ Statistical significance – whether something is likely true about the population

▪ Effect Size – The magnitude of the difference in some standardized units (in other words, how many standard deviations the experimental group mean is from the control group mean)
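
One common standardized effect size is Cohen's d, the difference in means divided by the pooled standard deviation. The card does not name a specific measure, so treat the choice of d, the helper name `cohens_d`, and the data below as assumptions made for illustration.

```python
import statistics

control = [10, 12, 11, 13, 9, 11]        # made-up control scores
experimental = [13, 15, 14, 16, 12, 14]  # made-up experimental scores


def cohens_d(group_a, group_b):
    """Difference in means (b - a) in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_var ** 0.5


d = cohens_d(control, experimental)
print(d)
```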

8
Q

As your degrees of freedom in a sample increase, the critical value your t-statistic needs to exceed decreases. What does this mean?

A

It is easier to find significance when you have a bigger sample size

Therefore, if I perform the same experiment using 100 different random samples, I will get the same result more consistently if the samples are bigger → detecting the true result becomes more likely

9
Q

What is statistical power?

A

The likelihood that a statistical analysis of a sample will find an effect when there is an effect to be found in the population

(i.e. overlap between the control and experimental distributions leaves a region in which the experimental mean can fall without being considered different from the control mean)

10
Q

What are the 3 factors that influence statistical power for a test of difference?

A

Effect Size
Sample Size
Alpha, the significance level (i.e. 1 − level of confidence)

Power = f(effect size, sample size, level of confidence)

11
Q

What are the two ways statistical power is generally used?

A

Post-hoc → How likely was I to detect an effect if there was one to find (i.e. 1 − the probability of a Type II error)

A priori → Given an expected effect size, a level of confidence and a desired level of statistical power, what sample size do I need to achieve statistical significance?

The conventional desired statistical power is 0.8
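
Both uses can be sketched with a normal approximation for a two-sided, two-group comparison with equal group sizes. This is a rough illustration, not an exact t-test power calculation: the function names and the effect size of 0.5 are assumptions, and dedicated power software will give slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Post-hoc style: approximate power given effect size, n per group and alpha."""
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Ignores the tiny chance of rejecting in the wrong direction.
    return 1 - z.cdf(z_crit - shift)


def approx_n_per_group(effect_size, power=0.8, alpha=0.05):
    """A priori style: approximate n per group needed to reach the desired power."""
    z_crit = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


n_needed = approx_n_per_group(0.5)            # assumed effect size of 0.5
power_at_n = approx_power(0.5, n_needed)
print(n_needed, power_at_n)
```

Raising the effect size, the sample size, or alpha each raises the approximate power, which matches the three factors listed on the previous card.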

12
Q

So you’ve designed a great experiment, determined sample size, collected your data and run the appropriate statistical test but still might not achieve the “true” conclusion about the population.

Why?

A

Some error is always possible: a Type I error (finding an effect that is not there) or a Type II error (missing an effect that is there)

13
Q

What are the two types of inferential statistics?

A

Parametric → Make assumptions about the parameters of the population distribution from which the sample is drawn

Non-parametric → Do not assume a particular form for the population distribution

14
Q

What is the normal distribution and its key assumption?

A

Parametric statistics assume that the variable sampled (e.g. height) is drawn from a population where it is normally distributed.

If the sample is random and large enough, then the sample should also be approximately normal (key assumption of parametric tests!)

15
Q

What is skewness and the 3 types?

A

Skewness – Asymmetry in the distribution of a set of observations

Normal: Tails are symmetrical with high and low scores equally distributed around the center

Positive: Tail is pulled to right by extreme high scores

Negative: Tail is pulled to left by extreme low scores
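
A minimal sketch of the three cases, using the moment-based skewness (the average cubed z-score). The helper name and the data are made up for illustration; statistical packages often apply a small-sample correction, so their values may differ slightly.

```python
import statistics


def skewness(data):
    """Moment-based skewness: mean of the cubed z-scores (population form)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)


symmetric = [1, 2, 3, 4, 5]                # tails balanced around the center
right_tail = [1, 2, 2, 3, 3, 3, 20]        # extreme high score: positive skew
left_tail = [-20, -3, -3, -3, -2, -2, -1]  # extreme low score: negative skew

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
```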

16
Q

What is kurtosis and what are the 3 types?

A

Kurtosis – The sharpness of the peak of the distribution

a) Normal (or mesokurtic) distribution – Bell-shaped distribution around the mean

b) Platykurtic distribution – Scores are spread abnormally widely, giving a flatter peak

c) Leptokurtic distribution – Most scores fall within a tight range around the peak.
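
A rough sketch using excess kurtosis (the average fourth-power z-score minus 3), which is about 0 for a normal distribution, negative for platykurtic data and positive for leptokurtic data. The helper name and both data sets are assumptions chosen to make the contrast obvious.

```python
import statistics


def excess_kurtosis(data):
    """Mean of the fourth-power z-scores minus 3 (population form)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 4 for x in data) / len(data) - 3


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]     # evenly spread scores: platykurtic
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]   # tight cluster plus two extremes: leptokurtic

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```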

17
Q

What are the two common tests for normality?

A

Kolmogorov-Smirnov (KS) Test

Shapiro-Wilk Test

18
Q

What is the KS test, how does it work?

A

▪ Compares the empirical cumulative distribution of the sample with the cumulative distribution function of a reference (e.g. normal) distribution.

▪ The statistic represents the largest difference between the two cumulative distributions.

Null Hypothesis: The sample is drawn from the reference (i.e. normal) distribution

Alternate Hypothesis: The sample is not drawn from the reference distribution
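
The statistic itself can be sketched with the standard library. This version fits the normal reference to the sample's own mean and standard deviation, which is an assumption of this example: with estimated parameters the usual KS critical values no longer apply, so only the statistic is computed here, not a p-value. Both samples are constructed for illustration.

```python
import math
from statistics import NormalDist, fmean, stdev


def ks_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    data = sorted(sample)
    n = len(data)
    reference = NormalDist(fmean(data), stdev(data))
    largest_gap = 0.0
    for i, x in enumerate(data):
        cdf = reference.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        largest_gap = max(largest_gap, (i + 1) / n - cdf, cdf - i / n)
    return largest_gap


# Constructed samples: evenly spaced normal quantiles, and their exponentials.
near_normal = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [math.exp(x) for x in near_normal]

print(ks_statistic(near_normal), ks_statistic(skewed))
```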

19
Q

What is the Shapiro-Wilk test, how does it work?

A

Sensitive to a wide range of departures from normality. Compares actual quantiles to the quantiles expected under normality. If the data are normally distributed, the correlation between them will be quite high (i.e. actual scores fall on a straight line)

▪ Null hypothesis – the data are normally distributed
▪ Alternate hypothesis – the data are not normally distributed
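
The quantile-comparison idea can be illustrated with a simple correlation between the ordered scores and the quantiles expected under normality. This is not the Shapiro-Wilk W statistic itself, only a related sketch; the plotting positions `(i + 0.5) / n`, the helper names and the data are assumptions.

```python
from statistics import NormalDist, fmean


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def qq_correlation(sample):
    """Correlation between ordered scores and expected normal quantiles."""
    data = sorted(sample)
    n = len(data)
    expected = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return pearson_r(data, expected)


roughly_normal = [42, 45, 47, 48, 49, 50, 50, 51, 52, 53, 55, 58]  # made-up data
skewed = [1, 1, 2, 2, 3, 3, 4, 5, 8, 15, 40, 120]                  # made-up data

print(qq_correlation(roughly_normal), qq_correlation(skewed))
```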

20
Q

What are the 3 types of transformation to establish normality?

A

Log Transform

Take the log of the data so that it will be less skewed

Square-root transformation

Take the square root; good for variables that are frequency counts (i.e. number of instances per unit)

Arcsine transformation

Arcsine of the square root; the result is in radians. Input values should range between 0 and 1 (i.e. proportions)
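
The three transformations can be sketched with the `math` module. The data are made up, and base-10 is an assumed choice for the log (the card does not specify a base).

```python
import math

skewed = [1, 10, 100, 1000]          # made-up positively skewed values (must be > 0)
counts = [0, 1, 4, 9, 16]            # made-up frequency counts
proportions = [0.0, 0.25, 0.5, 1.0]  # made-up proportions between 0 and 1

log_transformed = [math.log10(x) for x in skewed]
sqrt_transformed = [math.sqrt(c) for c in counts]
# Arcsine of the square root; results are in radians, from 0 to pi/2.
arcsine_transformed = [math.asin(math.sqrt(p)) for p in proportions]

print(log_transformed)
print(sqrt_transformed)
print(arcsine_transformed)
```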