Chapter 2 Flashcards

1
Q

α-level

A

the probability of making a Type I error (usually this value is 0.05).

2
Q

Alternative hypothesis

A

the prediction that there will be an effect (i.e., that your experimental manipulation will have some effect or that certain variables will relate to each other).

3
Q

β-level

A

the probability of making a Type II error (Cohen, 1992, suggests a maximum value of 0.2).

4
Q

Bonferroni correction

A

a correction applied to the α-level to control the overall Type I error rate when multiple significance tests are carried out. Each test conducted should use a criterion of significance of the α-level (normally 0.05) divided by the number of tests conducted. This correction is simple and effective, but it tends to be too strict when lots of tests are performed.
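A minimal sketch of the correction in Python; the number of tests and the p-values are made up purely for illustration:

```python
# Overall alpha of 0.05 shared across five significance tests:
# each individual test uses 0.05 / 5 = 0.01 as its criterion.
alpha = 0.05
n_tests = 5
alpha_corrected = alpha / n_tests

# Made-up p-values for the five tests (illustrative only).
p_values = [0.001, 0.020, 0.049, 0.008, 0.300]
significant = [p < alpha_corrected for p in p_values]
print(round(alpha_corrected, 4), significant)
```

Note that 0.020 and 0.049 would pass an uncorrected 0.05 criterion but fail the corrected one, which is exactly how the correction keeps the overall Type I error rate down.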

5
Q

Central limit theorem

A

this theorem states that when samples are large (above about 30) the sampling distribution will take the shape of a normal distribution regardless of the shape of the population from which the sample was drawn. For small samples the t-distribution better approximates the shape of the sampling distribution. We also know from this theorem that the standard deviation of the sampling distribution (i.e., the standard error of the sample mean) will be equal to the standard deviation of the sample (s) divided by the square root of the sample size (N).
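A quick simulation can illustrate this. The sketch below (hypothetical numbers, not from the book) draws many samples of size N = 40 from a skewed exponential population whose standard deviation is 1, and checks that the standard deviation of the sample means is close to 1/√N:

```python
import math
import random
import statistics

random.seed(42)  # reproducible illustration

# Draw 5000 samples of size N = 40 from a skewed (exponential) population.
N = 40
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(N))
    for _ in range(5000)
]

# The SD of the sampling distribution (the standard error) should be
# close to the population SD (1 for this population) divided by sqrt(N).
empirical_se = statistics.stdev(sample_means)
predicted_se = 1 / math.sqrt(N)
print(round(empirical_se, 3), round(predicted_se, 3))
```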

6
Q

Confidence interval

A

for a given statistic calculated for a sample of observations (e.g., the mean), the confidence interval is a range of values around that statistic that are believed to contain, in a certain proportion of samples (e.g., 95%), the true value of that statistic (i.e., the population parameter). What that also means is that for the other proportion of samples (e.g., 5%), the confidence interval won’t contain that true value. The trouble is, you don’t know which category your particular sample falls into.
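As a rough numerical sketch (made-up scores): for large samples the 95% interval is approximately the mean plus or minus 1.96 standard errors, although a small sample like this one would strictly need a t-based critical value.

```python
import math
import statistics

# Made-up sample of scores (illustrative only).
scores = [22, 25, 19, 30, 27, 24, 26, 21, 28, 23]

mean = statistics.mean(scores)
se = statistics.stdev(scores) / math.sqrt(len(scores))

# Approximate 95% interval: mean +/- 1.96 standard errors.
lower = mean - 1.96 * se
upper = mean + 1.96 * se
print(round(lower, 2), round(upper, 2))  # → 22.41 26.59
```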

7
Q

Degrees of freedom

A

Essentially it is the number of ‘entities’ that are free to vary when estimating some kind of statistical parameter. In a more practical sense, it has a bearing on significance tests for many commonly used test statistics (such as the F-statistic, t-statistic, chi-square test) and determines the exact form of the probability distribution for these test statistics.

8
Q

Deviance

A

the difference between the observed value of a variable and the value of that variable predicted by a statistical model.

9
Q

Experimental hypothesis

A

synonym for alternative hypothesis; the prediction that there will be an effect (i.e., that your experimental manipulation will have some effect or that certain variables will relate to each other).

10
Q

Experimentwise error rate

A

the probability of making a Type I error in an experiment involving one or more statistical comparisons when the null hypothesis is true in each case.

11
Q

Familywise error rate

A

the probability of making a Type I error in any family of tests when the null hypothesis is true in each case. The ‘family of tests’ can be loosely defined as a set of tests conducted on the same data set and addressing the same empirical question.

12
Q

Why do we use samples?

A

We are usually interested in populations, but because we cannot collect data from every human being (or whatever) in the population, we collect data from a small subset of the population (known as a sample) and use these data to infer things about the population as a whole.

13
Q

What is the mean and how do we tell if it’s representative of our data?

A

The mean is a simple statistical model of the centre of a distribution of scores: a hypothetical estimate of the ‘typical’ score. We use the variance, or standard deviation, to tell us whether it is representative of our data. The standard deviation is a measure of how much error there is associated with the mean: a small standard deviation indicates that the mean is a good representation of our data.

14
Q

What’s the difference between the standard deviation and the standard error?

A

The standard deviation tells us how much observations in our sample differ from the mean value within our sample. The standard error tells us not how well the sample mean represents the sample itself, but how well it represents the population mean. The standard error is the standard deviation of the sampling distribution of a statistic. For a given statistic (e.g., the mean) it tells us how much variability there is in this statistic across samples from the same population. Large values, therefore, indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came.
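The two quantities can be computed side by side; the scores below are made up for illustration:

```python
import math
import statistics

# Made-up sample of scores.
sample = [10, 12, 9, 14, 11, 13, 10, 12]

# Standard deviation: how much scores vary around the sample mean.
sd = statistics.stdev(sample)

# Standard error: how much sample means would vary around the
# population mean, estimated as sd divided by the square root of N.
se = sd / math.sqrt(len(sample))
print(round(sd, 3), round(se, 3))  # → 1.685 0.596
```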

15
Q

What do the sum of squares, variance and standard deviation represent? How do they differ?

A

All of these measures tell us something about how well the mean fits the observed sample data. Large values (relative to the scale of measurement) suggest the mean is a poor fit of the observed scores, and small values suggest a good fit. They are also, therefore, measures of dispersion, with large values indicating a spread-out distribution of scores and small values showing a more tightly packed distribution. These measures all represent the same thing, but differ in how they express it. The sum of squared errors is a ‘total’ and is, therefore, affected by the number of data points. The variance is the ‘average’ variability but in units squared. The standard deviation is the average variation but converted back to the original units of measurement. As such, the size of the standard deviation can be compared to the mean (because they are in the same units of measurement).
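A small worked example (hypothetical scores) makes the relationship concrete:

```python
import math

# Made-up scores.
scores = [4, 6, 8, 10, 12]
mean = sum(scores) / len(scores)           # 8.0

# Sum of squared errors: a total, so it grows with the number of scores.
ss = sum((x - mean) ** 2 for x in scores)  # 40.0

# Variance: the 'average' squared error (using N - 1), in units squared.
variance = ss / (len(scores) - 1)          # 10.0

# Standard deviation: the same information, back in the original units.
sd = math.sqrt(variance)
print(ss, variance, round(sd, 3))  # → 40.0 10.0 3.162
```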

16
Q

What is a test statistic and what does it tell us?

A

A test statistic is a statistic for which we know how frequently different values occur. The observed value of such a statistic is typically used to test hypotheses, or to establish whether a model is a reasonable representation of what’s happening in the population.

17
Q

What are Type I and Type II errors?

A

A Type I error occurs when we believe that there is a genuine effect in our population, when in fact there isn’t. A Type II error occurs when we believe that there is no effect in the population when, in reality, there is.

18
Q

What is statistical power?

A

Power is the ability of a test to detect an effect of a particular size (a value of 0.8 is a good level to aim for).

19
Q

Fit

A

the degree to which a statistical model is an accurate representation of some observed data.

20
Q

Interval estimate

A

using a range of values, rather than a single value (a point estimate), to estimate the likely value of a population parameter; a confidence interval is one example. In Bayesian statistics, the interval estimate is the credible interval: an interval within which a certain percentage of the posterior distribution falls (usually 95%). It can be used to express the limits within which a parameter falls with a fixed probability. For example, if we estimated the average length of a romantic relationship to be 6 years with a 95% credible interval of 1 to 11 years, then this would mean that 95% of the posterior distribution for the length of romantic relationships falls between 1 and 11 years. A plausible estimate of the length of romantic relationships would, therefore, be 1 to 11 years.

21
Q

Linear model

A

a statistical model that is based upon an equation of the form Y = BX + E, in which Y is a vector containing scores from an outcome variable, B represents the b-values, X the predictor variables and E the error terms associated with each predictor. The equation can represent a solitary predictor variable (B, X and E are vectors) as in simple regression, or multiple predictors (B, X and E are matrices) as in multiple regression. The key is the form of the model, which is linear (e.g., with a single predictor the equation is that of a straight line).
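For the single-predictor case, the least-squares estimates of the slope and intercept can be sketched directly (the data below are made up so that the points fall exactly on a straight line):

```python
# Made-up data lying exactly on y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope: covariance of x and y over the variance of x.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
# Intercept: the fitted line passes through the point of means.
b0 = mean_y - b * mean_x
print(b, b0)  # → 2.0 1.0
```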

22
Q

Method of least squares

A

a method of estimating parameters (such as the mean or a regression coefficient) that is based on minimizing the sum of squared errors. The parameter estimate will be the value, out of all of those possible, that has the smallest sum of squared errors.
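A tiny sketch (made-up scores) of the idea that the mean is the least-squares estimate of the centre of a distribution: among a handful of candidate values, the mean gives the smallest sum of squared errors.

```python
# Made-up scores.
scores = [2, 4, 6, 8, 10]
mean = sum(scores) / len(scores)  # 6.0

def sum_squared_errors(candidate):
    """Total squared error if `candidate` is used as the model."""
    return sum((x - candidate) ** 2 for x in scores)

# Out of these candidates, the mean minimizes the sum of squared errors.
candidates = [4.0, 5.0, 6.0, 7.0, 8.0]
best = min(candidates, key=sum_squared_errors)
print(best == mean)  # → True
```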

23
Q

Null hypothesis

A

the reverse of the experimental hypothesis: it states that your prediction is wrong and the predicted effect doesn’t exist.

24
Q

One-tailed test

A

a test of a directional hypothesis.

25
Q

Ordinary least squares

A

a method of regression in which the parameters of the model are estimated using the method of least squares.

26
Q

Parameter

A

When you fit a statistical model to your data, that model will consist of variables and parameters: variables are measured constructs that vary across entities in the sample, whereas parameters describe the relations between those variables in the population. In other words, they are constants believed to represent some fundamental truth about the measured variables. We use sample data to estimate the likely value of parameters because we don’t have direct access to the population.

27
Q

Point estimate

A

a single value, calculated from the sample, that is used to estimate the likely value of a population parameter (compare interval estimate).

28
Q

Power

A

the ability of a test to detect an effect of a particular size (a value of 0.8 is a good level to aim for).

29
Q

Sample

A

a smaller (but hopefully representative) collection of units from a population used to determine truths about that population (e.g., how a given population behaves in certain conditions).

30
Q

Sampling distribution

A

the probability distribution of a statistic. We can think of this as follows: if we take a sample from a population and calculate some statistic (e.g., the mean), the value of this statistic will depend somewhat on the sample we took. As such the statistic will vary slightly from sample to sample. If, hypothetically, we took lots and lots of samples from the population and calculated the statistic of interest, we could create a frequency distribution of the values we get. The resulting distribution is what the sampling distribution represents: the distribution of possible values of a given statistic that we could expect to get from a given population.

31
Q

Sampling variation

A

the extent to which a statistic (the mean, median, t , F , etc.) varies in samples taken from the same population.

32
Q

Standard error

A

the standard deviation of the sampling distribution of a statistic. For a given statistic (e.g., the mean) it tells us how much variability there is in this statistic across samples from the same population. Large values, therefore, indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came.

33
Q

Standard error of the mean (SE)

A

the standard error associated with the mean.

34
Q

Test statistic

A

a statistic for which we know how frequently different values occur. The observed value of such a statistic is typically used to test hypotheses.

35
Q

Two-tailed test

A

a test of a non-directional hypothesis.

36
Q

Type I error

A

occurs when we believe that there is a genuine effect in our population, when in fact there isn’t.

37
Q

Type II error

A

occurs when we believe that there is no effect in the population, when in fact there is.