Lectures 1-3 Flashcards

1
Q

Define: experiment

A

Vary an independent variable while holding everything else constant to measure changes in a dependent variable. Therefore we can infer causality.

2
Q

Define: quasi-experiment

A

The independent variable cannot be manipulated (e.g. gender differences). The DV is still measured as the IV changes, but there is a problem with confounding variables.

3
Q

Define: correlational design

A

No manipulation: measure two or more variables and determine to what extent they are co-related. Cannot infer causality.

4
Q

Define: nominal (categorical) scale

A

Mutually exclusive categories that are not necessarily ordered. Calculations are meaningless. E.g. gender.

5
Q

Define: ordinal (ranking) scale

A

Numbers indicate a relative position (rank) in a list which is meaningful, although items are not necessarily equally spaced. E.g. questionnaire answer ranks, shoe size.

6
Q

Define: interval scale

A

The difference between two values is meaningful, but there is no meaningful zero point. E.g. temperature in Celsius.

7
Q

Define: ratio scale

A

The difference between two values is meaningful and there is a meaningful zero point. Calculations (e.g. double) are meaningful. E.g. temperature in Kelvin.
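
As a small worked example of why "double" only makes sense on a ratio scale, the sketch below compares two temperatures in Celsius and in Kelvin. The two temperatures and the helper name `celsius_to_kelvin` are assumptions chosen for illustration.

```python
# Interval vs ratio: ratios are only meaningful when the scale has a true zero.
def celsius_to_kelvin(celsius):
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0  # hypothetical temperatures in Celsius

# Differences are meaningful on both scales (and equal here).
diff_c = warm_c - cold_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are not: 20 C looks like "double" 10 C, but in Kelvin it is only
# about 3.5% higher.
ratio_c = warm_c / cold_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```

The Celsius ratio of 2 is an artefact of where the zero was placed, whereas the Kelvin ratio reflects a real physical quantity.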

8
Q

Define: confounding variable

A

A variable that confounds the interpretation of the results of an experiment. Some aspect of the experimental situation varies SYSTEMATICALLY with the IV. E.g. graphical stimulus was more interesting than verbal stimulus in the experiment.

9
Q

Define: nuisance variable

A

A variable that introduces noise but does not SYSTEMATICALLY bias the results. E.g. occasional noise during the experiment.

10
Q

Define: between-subjects design

A

Each condition is applied to a different group of subjects. Often the only available option, depending on the IV (e.g. gender, task learning method). Individual differences between groups can be a problem; this can be addressed by randomly assigning subjects to groups.

11
Q

Define: within-subjects design

A

The same subject performs all levels of the IV. Also known as repeated measures design as they repeat the measure for each condition. Generally much more powerful, as each subject is their own control, eliminating individual differences. However, it can lead to a confound with order effects, e.g. the confound of practice or fatigue effects. This can be minimised by counterbalancing. However this can be complicated with complex designs.
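
Counterbalancing can be sketched in a few lines. The example below uses full counterbalancing (every possible order of the conditions), which is only one of several schemes; the condition labels and participant IDs are hypothetical.

```python
from itertools import permutations

conditions = ["A", "B", "C"]  # hypothetical condition labels

# Full counterbalancing: every possible order of the conditions.
orders = list(permutations(conditions))

# Rotate participants through the orders so each order is used equally often.
participants = [f"P{i}" for i in range(1, 13)]  # 12 hypothetical participants
schedule = {p: orders[i % len(orders)] for i, p in enumerate(participants)}

for participant, order in schedule.items():
    print(participant, " -> ".join(order))
```

Three conditions give 3! = 6 orders, so 12 participants cover each order twice. The number of orders grows factorially with the number of conditions, which is one reason counterbalancing becomes awkward in complex designs.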

12
Q

Define: matched-subjects design

A

Solves the problem of being unable to run within-subjects designs in certain cases - each participant in one group is matched with a member of the other group, and the data are treated like those from a regular within-subjects design.

13
Q

Define: correlational design

A

Sometimes the test variables cannot be manipulated (e.g. for ethical reasons/time constraints) and instead pre-existing variables are measured in terms of the extent to which they are co-related or co-varying.

14
Q

Define: experimental hypothesis

A

Also known as a research hypothesis, it is a question addressed in an experiment, based on a more general theory.

15
Q

Define: statistical hypothesis

A

This involves precise statements about the data to be collected, e.g. explaining the IV and DV.

16
Q

Define: null hypothesis

A

H0 simply states that the different samples come from the same population. For parametric stats, it often states that all the means are equal; for non-parametric stats, that all the distributions are the same. It is the default, to be retained unless there is good evidence to the contrary.

17
Q

Define: alternative hypothesis

A

H1/HA, the logical opposite of the null hypothesis - states that the conditions will have different means/distributions. The alternative and null hypotheses are therefore mutually exclusive and exhaustive.

18
Q

Define: theory, research hypothesis and statistical hypothesis.

A

A theory is a simple statement, e.g. ‘French lecturers are particularly great.’
A research hypothesis is more specific and testable, defining the IV, e.g. ‘The students of French lecturers perform better than the students of non-French lecturers’.
A statistical hypothesis defines the DV and states the null and alternative hypotheses.

19
Q

Define: test statistics

A

We reject the null hypothesis when p<α (usually 0.05). The test statistic is used in calculating p and has known probabilities associated with its values. It depends on the statistical test used.

20
Q

Define: p

A

p is the probability of collecting data at least as extreme as those observed, assuming that the null hypothesis is true, i.e. how likely such a result would be if only chance were at work.

21
Q

Define: α

A

α is the criterion level, set in advance, that p must fall below for the results to be believed to be more than just chance.
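
To make the link between a test statistic, p and α concrete, here is a minimal sketch using an exact permutation test. This test is swapped in purely for illustration; the cards do not name a specific test, and the scores below are invented.

```python
from itertools import combinations

group_a = [14, 16, 13, 15]  # hypothetical scores, condition A
group_b = [10, 12, 9, 11]   # hypothetical scores, condition B
alpha = 0.05                # criterion level, set in advance

def mean(values):
    return sum(values) / len(values)

# Test statistic: absolute difference between the two group means.
observed = abs(mean(group_a) - mean(group_b))

# Under H0 the group labels are arbitrary, so try every way of splitting
# the pooled scores into two groups of the original sizes.
pooled = group_a + group_b
total = 0
extreme = 0
for idx in combinations(range(len(pooled)), len(group_a)):
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if abs(mean(a) - mean(b)) >= observed:
        extreme += 1

# p: proportion of splits at least as extreme as the one observed.
p = extreme / total
reject_null = p < alpha
print(p, reject_null)
```

Only 2 of the 70 possible splits are as extreme as the observed one, so p = 2/70 (about 0.029), which is below α = 0.05 and H0 is rejected.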

22
Q

Define: discrete data

A

Have certain fixed values, often integers.

23
Q

Define: continuous data

A

Can take any fractional value within a given range.

24
Q

Ordinal scale variables are usually:

discrete/continuous?

A

Discrete

25
Q

Interval and ratio scales:

a) are usually continuous
b) are usually discrete
c) can be either continuous or discrete

A

Can be either continuous or discrete (e.g. time / number of correct answers).

26
Q

The richest form of data representation is…?

A

Frequency distribution!

27
Q

In frequency distributions, representation of data depends on…?

A

Measurement scale, e.g. categorical, ordinal, ratio, interval.

28
Q

Categorical data can be presented as a frequency distribution with the y-axis being…?

A

Either n or the percentage of participants.

29
Q

Discrete variables can be presented as frequency distributions in various ways:

A

The frequency of each value (if the DV has only a few possible values), cumulative frequency, percentage frequency or cumulative percentages.
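
The four presentations listed above can be computed side by side. This is a minimal sketch with made-up scores; the variable names are illustrative only.

```python
from collections import Counter

scores = [2, 3, 3, 1, 2, 3, 4, 2, 3, 5]  # hypothetical discrete scores
counts = Counter(scores)
n = len(scores)

# Each row: (value, frequency, cumulative frequency,
#            percentage frequency, cumulative percentage)
table = []
cumulative = 0
for value in sorted(counts):
    frequency = counts[value]
    cumulative += frequency
    table.append((value, frequency, cumulative,
                  100 * frequency / n, 100 * cumulative / n))

for row in table:
    print(row)
```

The final row's cumulative frequency always equals n and its cumulative percentage is 100, which is a quick sanity check on the table.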

30
Q

It is often not possible when doing frequency distributions to calculate frequencies on the basis of every possible score. This can be solved by…

A

Using ranges or intervals of scores within which frequencies can be calculated.
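
Grouping scores into intervals might look like the sketch below. The reaction times, the starting point `low` and the `bin_width` are all assumptions, and only a handful of intervals are used to keep the example short.

```python
# Hypothetical reaction times in milliseconds.
reaction_times = [312, 455, 298, 520, 610, 387, 402, 349, 475, 533, 291, 644]
bin_width = 100  # assumed interval width
low = 200        # assumed lower edge of the first interval

bins = {}
for rt in reaction_times:
    # Find the lower edge of the interval this score falls into.
    start = low + ((rt - low) // bin_width) * bin_width
    label = (start, start + bin_width)  # interval [start, start + bin_width)
    bins[label] = bins.get(label, 0) + 1

for label in sorted(bins):
    print(label, bins[label])
```

Changing `bin_width` changes the shape of the resulting distribution, so the choice of interval is part of how the data are presented.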

31
Q

There is no particular rule for the number of ranges or intervals to use when doing a frequency range distribution, as it often depends on…

A

the number of samples.

32
Q

Usually with frequency ranges, …. intervals are used.

A

10-15

33
Q

Counting frequencies is a powerful technique for condensing data whilst retaining a great deal of information, but sometimes the entire frequency distribution shape is unnecessary and it is summarised further to a single number, a …

A

Measure of the central tendency of the data.

34
Q

There are three commonly used measures of central tendency - …

A

The mean, mode and median.

35
Q

Define mode and state the pros and cons of using the mode as a measure of central tendency.

A

The mode is the most common score or category in the data.
+ can be used for categorical data
+ always gives a real data value
- sometimes gives more than one value (e.g. bimodal distributions, which have two peaks)
- varies depending on interval (bin) size - changing the interval sizes can change the mode.

36
Q

Define median and state the pros and cons of using the median as a measure of central tendency.

A

The middle value of the data set, or the mean of the two middle values if necessary.
+ insensitive to outlying data
+ often gives a real data value
+ intuitively appealing
- ignores a lot of the data
- difficult to calculate without a computer

37
Q

What kind of data does the mode tend to be used for?

A

Nominal.

38
Q

What kind of data is the median usually used for?

A

It’s the best measure for ordinal data and sometimes for skewed interval or ratio data.

39
Q

Define mean and state the pros and cons of using the mean as a measure of central tendency.

A

The mean is the sum divided by the number of samples: $\bar{x} = \frac{\sum x}{N}$
+ uses all the data and is therefore powerful
- very sensitive to outliers or skew
- does not always give a meaningful value
- only meaningful for ratio and interval data
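
The sensitivity of the mean to outliers, compared with the median and mode, can be seen in a short sketch using Python's standard `statistics` module. The scores are invented for illustration.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 4, 5, 5, 6]  # hypothetical scores
with_outlier = scores + [60]    # same data plus one extreme value

# multimode returns every most-common value, so bimodal data give two modes.
print(mean(scores), median(scores), multimode(scores))
print(mean(with_outlier), median(with_outlier), multimode(with_outlier))
```

One extreme score moves the mean from 4 to 11, while the median only shifts from 4 to 4.5 and the modes (3 and 5) do not change.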

40
Q

What kind of data is the mean the best measure for?

A

Ratio and interval data which is normally distributed.

41
Q

Outline measures of spread.

A

They are related to measures of central tendency - the median is associated with range or inter-quartile range (distance-based) and the mean with variance and standard deviation (centre-based).

42
Q

Define range and state the pros and cons of using it as a measure of spread.

A

Highest value - lowest value.

- very sensitive to outliers - the highest and lowest values are unlikely to be representative of the sample.

43
Q

Define interquartile range and state the pros and cons of using it as a measure of spread.

A

It is similar to the range but ignores the most extreme values - a quartile is the lowest score needed to include a given quarter of the population. The interquartile range is Q3-Q1 (the semi-interquartile range is the same divided by 2).
+ insensitive to outlying data
+ few assumptions about the data
- ignores a lot of the data
- difficult to calculate for large datasets without a computer
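
As a rough sketch, the range and interquartile range can be compared on data with and without an outlier. Note that several quartile conventions exist and can give slightly different values; this example assumes the default "exclusive" method of `statistics.quantiles`, and the scores and the helper name `spread` are hypothetical.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # hypothetical scores
with_outlier = scores[:-1] + [110]            # top score replaced by an outlier

def spread(data):
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _q2, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return {"range": max(data) - min(data), "iqr": iqr, "semi_iqr": iqr / 2}

print(spread(scores))
print(spread(with_outlier))
```

The outlier stretches the range from 10 to 109, but the interquartile range stays at 6 because Q1 and Q3 are unaffected by the most extreme values.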

44
Q

Define variance and state the pros and cons of using it as a measure of spread.

A

Variance is: $\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$
+ uses all the data, therefore powerful
+ forms the basis of many other tests
- sensitive to outliers
- requires a normal distribution
- does not have a sensible unit (it is in squared units)

45
Q

Define standard deviation and state the pros and cons of using it as a measure of spread.

A

Standard deviation is the square root of the variance and therefore has a sensible unit. There are several variants of the formula, depending on whether a whole population (σ) or a sample (s) is being described, and whether the sample is being used to estimate the population value (ŝ). σ and s are just the square root of the variance equation, but ŝ uses N-1 as the denominator of the equation. This gives a less biased estimate of the population's sd.
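
The difference between the N and N-1 denominators can be checked numerically. The sketch below computes both by hand and compares them with `pstdev` and `stdev` from Python's standard `statistics` module; the sample values are invented.

```python
from math import sqrt
from statistics import pstdev, stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
n = len(sample)
mean = sum(sample) / n

# Sum of squared deviations from the mean.
sum_squares = sum((x - mean) ** 2 for x in sample)

variance = sum_squares / n                   # N in the denominator
sd_n = sqrt(variance)                        # sd of the scores themselves
sd_n_minus_1 = sqrt(sum_squares / (n - 1))   # estimate of the population sd

print(variance, sd_n, round(sd_n_minus_1, 3))
print(pstdev(sample), round(stdev(sample), 3))  # library equivalents
```

Dividing by N-1 always gives a slightly larger value than dividing by N, and the gap shrinks as the sample gets larger.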