1. Descriptive and inferential statistics Flashcards

1
Q

Classify these variables as NOMINAL or CONTINUOUS:

A) Age

B) Gender

C) Height

A

A) Age = Continuous

B) Gender = Nominal

C) Height = Continuous

2
Q

Describe what a confounding variable is

A

A variable, other than the independent variable, that affects the outcome being measured as well as, or instead of, the independent variable.

  • Because a confounding variable is unforeseen and unaccounted for, it jeopardizes the reliability and validity of an experiment’s outcome
3
Q

If a test is valid, what does this mean?

A

The test measures what it claims to measure

4
Q

If a test is reliable what does this mean?

A

The test will give consistent results.

5
Q

The discrepancy between the numbers used to represent something that we are trying to measure and the actual value of what we are measuring is called:

A

Measurement error

6
Q

What is the ‘fit’ of the model?

A

The ‘fit’ of the model is the degree to which a statistical model represents the data collected

7
Q

What is variance?

A

The variance is the average squared error between the mean and the observations made

8
Q

A frequency distribution in which low scores are most frequent (i.e. bars on the graph are highest on the left hand side) is said to be:

A

Positively skewed

9
Q

How can we compensate for practice effects?

A

Counterbalancing

10
Q

How can we compensate for boredom effects?

A

Giving participants a break between tasks

11
Q

Variation due to variables that have not been measured is known as:

A

Unsystematic variation

  • Unsystematic variation results from random factors that exist between the experimental conditions (such as natural differences in ability, the time of day, etc.)
12
Q

What is the assumption of homogeneity of variance?

A

That the variance within each of the populations is equal.

13
Q

Variation due to the experimenter doing something in one condition but not in the other condition is known as:

A

Systematic variation

14
Q

What does residual variance tell us?

A

Residual variance helps us confirm how well a regression line that we constructed fits the actual data set. The smaller the variance, the more accurate the predictions are

15
Q

The purpose of a control condition is to

A

Allow inferences about cause

  • A properly constructed control condition provides you with a reference point to determine what change (if any) occurred when a variable was modified
16
Q

What helps to control for participant characteristics (and thus minimize unsystematic variation)?

A

Randomization

17
Q

How are Z scores calculated?

A

By subtracting the mean from the score and dividing the result by the standard deviation:

Z-SCORE = (SCORE - MEAN) / STDEV
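
The calculation above can be sketched in a few lines of Python. The data set and the function name `z_score` are made up for illustration, and the sketch assumes the population standard deviation (`pstdev`); a sample standard deviation could be used instead.

```python
from statistics import mean, pstdev

def z_score(score, scores):
    """Return (score - mean) / standard deviation for the given scores."""
    return (score - mean(scores)) / pstdev(scores)

# Hypothetical scores with mean 5 and population standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, scores))  # 9 is two standard deviations above the mean
print(z_score(5, scores))  # a score equal to the mean has a z-score of 0
```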

18
Q

The standard deviation is the square root of the:

A

Variance

19
Q

What is the coefficient of determination?

A

A measure of the proportion of variability in one variable that is shared by the other

Calculated as:

the correlation coefficient squared (r²)
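
As a rough illustration, the sketch below computes Pearson's r by hand and then squares it. The helper `pearson_r` and the two small data sets are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and the two sums of squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination

print(round(r, 3))          # about 0.775
print(round(r_squared, 3))  # 0.6, i.e. 60% of the variability is shared
```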

20
Q

Complete the following sentence:

A large standard deviation (relative to the value of the mean itself)…

A

Indicates that the data points are distant from the mean

(i.e. the mean is a poor fit of the data).

21
Q

The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the “expected value” of the number of patients who are successfully treated?

A

32

because 80% of 40 patients is 32 (or 40 x .80 = 32)
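
A minimal sketch of the same arithmetic in Python. The second calculation assumes a binomial model (independent patients, each with the same success probability), which the card does not state explicitly; it is included only to show that summing k × P(k) gives the same answer as n × p.

```python
from math import comb

n = 40     # number of patients
p = 0.80   # probability that a patient is successfully treated

# Shortcut: expected value = n * p.
expected = n * p

# Long way, assuming a binomial model: sum of k * P(X = k) over all k.
expected_from_pmf = sum(
    k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)
)

print(expected)                     # 32.0
print(round(expected_from_pmf, 6))  # 32.0
```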

22
Q

What is the Confusion of the inverse?

A

A logical fallacy in which a conditional probability is equated with its inverse

  • That is, given two events A and B, the probability of A given that B has happened is assumed to be about the same as the probability of B given A, when there is no evidence for this assumption.

More formally, P(A|B) is wrongly assumed to be approximately equal to P(B|A).
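
A small worked example, using Bayes' theorem, shows how far apart the two probabilities can be. All numbers are hypothetical: A is "has the disease", B is "tests positive", and the prevalence and error rates are invented for illustration.

```python
# Hypothetical inputs (not from the flashcards).
p_a = 0.01              # P(A): 1% of people have the disease
p_b_given_a = 0.90      # P(B|A): the test is positive in 90% of cases
p_b_given_not_a = 0.05  # P(B|not A): 5% false-positive rate

# Total probability of a positive test.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b_given_a)            # 0.9
print(round(p_a_given_b, 3))  # about 0.154, far from 0.9
```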

23
Q

The test statistics we use to assess a linear model are usually _______ based on the normal distribution.

A

Parametric tests

24
Q

What are the assumptions of the general linear model?

A

Independence:

  • The errors in your model should not be related to each other

Additivity/Linearity:

  • If you have several predictors then their combined effect is best described by adding their effects together
  • The outcome variable is, in reality, linearly related to any predictors

Normality:

  • The core of the assumption of normality is that the sampling distribution of the mean is normal, i.e. that the distribution of sample means (across independent samples) is normal

Homogeneity of variance:

  • When testing several groups of participants, samples should come from populations with the same variance
25
Q

Finish the sentence

The further the values of skewness and kurtosis are from zero, the more likely…

A

…it is that the data are not normally distributed

26
Q

Parameters are numbers that summarize data for…

A

an entire population

27
Q

Statistics are numbers that summarize data from…

A

a sample

28
Q

What are the measures of central tendency?

A
  • Mean
  • Median
  • Mode
29
Q

What are the measures of spread or dispersion?

A
  • Range
  • Variance
  • Standard Deviation
30
Q

What does kurtosis tell us?

A

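How heavy the tails of a distribution are relative to the normal distribution, and so how prone the data are to outliers

Distributions:

Leptokurtic = relatively heavy tails (more extreme scores than a normal distribution)

Platykurtic = relatively light tails (fewer extreme scores than a normal distribution)

Mesokurtic = same kurtosis as the normal distribution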

31
Q

What is Gambler’s fallacy?

A

The mistaken belief that, if something happens more frequently than normal during some period, it will happen less frequently in the future, or that, if something happens less frequently than normal during some period, it will happen more frequently in the future, even though the events are independent

32
Q

What is the Law of small numbers?

A

Exaggerated confidence in the validity of conclusions based on small samples.

  • A small sample is misperceived as being representative of the entire population
33
Q

What does the Sum of squared errors (SS) indicate?

A

The total dispersion, or total deviance of scores from the mean

34
Q

How does an increasing number of participants affect the distribution of the sample?

A
  • The sampling distribution becomes more normal
  • The spread of the sampling distribution decreases
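
The second point can be checked with a small simulation. This is only a sketch under stated assumptions: the population is an arbitrary skewed (exponential) one, and the sample sizes, repetition count, seed and function name are all chosen for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def spread_of_sample_means(n, repetitions=2000):
    """Standard deviation of the means of many samples of size n."""
    means = [
        mean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return stdev(means)

spread_small = spread_of_sample_means(5)   # few participants per sample
spread_large = spread_of_sample_means(50)  # many participants per sample

# The sample means cluster more tightly when each sample is larger.
print(spread_small > spread_large)
```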

35
Q

What do confidence intervals tell us?

A

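A confidence interval is a range of values around a sample estimate that is constructed to contain the true population value at a stated level of confidence (e.g. 95% of such intervals would contain it). There is a tradeoff between degree of certainty and width of the CI:

  • The more certain you want to be, the wider (larger) the interval needs to be
  • The goal is to have a high level of confidence paired with a small interval.
  • One way to help achieve this is to have less variability in your sample (i.e. a smaller standard error of the mean)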
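
The tradeoff can be sketched numerically. The example below assumes a normal (z-based) approximation rather than a t-distribution, and the sample mean, standard deviation and sample size are hypothetical values picked for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(sample_mean, sd, n, confidence):
    """Approximate (z-based) confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * sd / sqrt(n)                       # z * standard error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 100, SD 15, n = 36 (standard error 2.5).
ci_95 = mean_ci(100, 15, 36, 0.95)
ci_99 = mean_ci(100, 15, 36, 0.99)

width_95 = ci_95[1] - ci_95[0]
width_99 = ci_99[1] - ci_99[0]

print(round(width_95, 2))   # about 9.8
print(round(width_99, 2))   # about 12.88: more certainty, wider interval
```
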
36
Q

Sum of squares, variance and standard deviation all represent the same things.

What do they represent?

A
  • The ‘fit’ of the mean to the data
  • The variability in the data
  • How well the mean represents the observed data
  • Error
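
The three measures are built from one another, which a short sketch makes concrete. The scores are hypothetical, and the sketch assumes the sample versions of the formulas (dividing by N - 1).

```python
from math import sqrt

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean_score = sum(scores) / len(scores)

# Sum of squared errors: total squared deviation from the mean.
ss = sum((x - mean_score) ** 2 for x in scores)

# Variance: average squared error (sample version, dividing by N - 1).
variance = ss / (len(scores) - 1)

# Standard deviation: square root of the variance.
sd = sqrt(variance)

print(ss)                  # 32.0
print(round(variance, 3))  # 4.571
print(round(sd, 3))        # 2.138
```
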
37
Q

What does standard error tell you?

A

It is the standard deviation of the sampling distribution of a statistic

It indicates how accurately the mean of any given sample from a population is likely to represent the true population mean.

When the standard error increases, i.e. the means are more spread out, it becomes more likely that any given mean is an inaccurate representation of the true population mean.
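
For the mean, the standard error is commonly estimated as the standard deviation divided by the square root of the sample size. A minimal sketch, with a hypothetical standard deviation and sample sizes:

```python
from math import sqrt

def standard_error(sd, n):
    """Estimated standard error of the mean: sd / sqrt(n)."""
    return sd / sqrt(n)

# Hypothetical standard deviation of 15 with two different sample sizes.
se_small_sample = standard_error(15, 25)
se_large_sample = standard_error(15, 100)

print(se_small_sample)  # 3.0
print(se_large_sample)  # 1.5: quadrupling n halves the standard error
```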

38
Q

Which t-test has more power to find an effect given that everything else is equal?

Repeated measures

vs

independent measures

A

repeated measures t-test:

  • When the same participants are used across conditions the unsystematic variance (often called the error variance) is reduced dramatically, making it easier to detect any systematic variance
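
A rough numerical sketch of this point, using made-up scores for five participants. The same numbers are analysed twice: once as a repeated-measures design (t computed from the difference scores) and once as if the two conditions came from independent groups (pooled-variance t). The data and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical scores for the same five participants in two conditions.
cond_a = [10, 12, 14, 16, 18]
cond_b = [12, 13, 16, 17, 20]
n = len(cond_a)

# Repeated measures: work with each participant's difference score,
# which removes the stable differences between participants.
diffs = [b - a for a, b in zip(cond_a, cond_b)]
t_repeated = mean(diffs) / (stdev(diffs) / sqrt(n))

# Independent measures: individual differences stay in the error term.
pooled_var = (variance(cond_a) + variance(cond_b)) / 2  # equal group sizes
t_independent = (mean(cond_b) - mean(cond_a)) / sqrt(pooled_var * 2 / n)

print(round(t_repeated, 2))     # about 6.53
print(round(t_independent, 2))  # about 0.79
```

The mean difference is identical in both analyses (1.6); only the error term changes, which is why the repeated-measures t is so much larger here.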