Statistics Flashcards

1
Q

What is an operational definition of variables?

A
  • a specific statement about how a variable will be measured to represent the concept under study

Makes the study more replicable

2
Q

What is a Measurement?

A

A way to describe real life factors by numbers

3
Q

What are the 4 types of measurement

A

Nominal scales
Ordinal scales
Interval scales
Ratio scales

4
Q

What is a nominal scale?

A

A measurement scale in which numbers serve only as “tags” or “labels” to identify or classify an object.
E.g. bus route numbers 19, 242, 3

5
Q

What is an ordinal scale?

A

-Data are put in rank order, but the distances between adjacent scores are not necessarily equal

6
Q

What is an interval scale?

A

A measurement scale with order, in which the difference between any two adjacent values is equal
Zero is arbitrary: it does not mean the absence of the quantity (e.g. 0 °C)

7
Q

What is a ratio scale?

A

-An interval scale with a meaningful (true) zero, where 0 means none of the quantity
-No negative numbers

8
Q

What are the measures of central tendency

A

-Mean, median, mode
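
As a rough illustration, the three measures can be computed with Python's standard `statistics` module. The scores below are made up for the example.

```python
import statistics

# Hypothetical scores, invented for illustration
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of the scores divided by how many there are
median = statistics.median(scores)  # middle value (here the average of the two middle scores)
mode = statistics.mode(scores)      # most frequently occurring score

print("mean:", mean, "median:", median, "mode:", mode)
```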

9
Q

Define measures of spread

A

Statistics that describe how much the scores vary

10
Q

What are the 3 measures of spread

A

Range
Interquartile range
Standard deviation

11
Q

What is the interquartile range?

A

The spread between the first and third quartiles (the 25th and 75th percentiles), i.e. the range of the middle 50% of scores
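
A small sketch of the range and interquartile range, assuming made-up data and the "inclusive" quartile method of Python's `statistics.quantiles`. Other quartile conventions exist and can give slightly different values.

```python
import statistics

# Hypothetical scores, invented for illustration
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(n=4) returns the three cut points: Q1, median, Q3
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

data_range = max(scores) - min(scores)  # highest score minus lowest score
iqr = q3 - q1                           # spread of the middle 50% of scores

print("range:", data_range, "Q1:", q1, "Q3:", q3, "IQR:", iqr)
```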

12
Q

What is standard deviation?

A

A measure of how far, on average, the data points lie from the mean

  • The larger the SD, the larger the spread of scores
13
Q

What is the 6 step calculation for standard deviation

A

Step 1: Find the mean.
Step 2: Subtract the mean from each score.
Step 3: Square each deviation.
Step 4: Add the squared deviations.
Step 5: Divide the sum by the number of scores.
Step 6: Take the square root of the result from step 5
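
The six steps can be sketched in Python as follows. The scores are made up. Dividing by the number of scores, as in step 5, gives the population standard deviation; a sample standard deviation would divide by one less than the number of scores instead.

```python
import math

# Hypothetical scores, invented for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)            # Step 1: find the mean
deviations = [x - mean for x in scores]     # Step 2: subtract the mean from each score
squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
sum_of_squares = sum(squared)               # Step 4: add the squared deviations
variance = sum_of_squares / len(scores)     # Step 5: divide by the number of scores
sd = math.sqrt(variance)                    # Step 6: take the square root

print("mean:", mean, "variance:", variance, "SD:", sd)
```

The standard library's `statistics.pstdev(scores)` follows the same divide-by-n convention, while `statistics.stdev(scores)` divides by n - 1.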

14
Q

What 3 things are graphs for?

A

Representing data

Indicating patterns within the data (e.g. central tendency, spread of data, correlations)

Deciding how to analyse data (e.g. if there are outliers, use the median rather than the mean)

15
Q

What kind of data are bar graphs for?

A

Ordinal data
Nominal data

16
Q

What are the 3 types of bar graphs?

A

Horizontal
Stacked
Histograms (however, the area represents the frequency)

17
Q

What are the properties of stem and leaf plots?

A

Data in a compact form
Shows the size of data subsets
Stems = multiples of ten (e.g. 0s, 10s, 20s)
Leaves = units (each leaf can only be a single digit)

18
Q

What do box plots do?

A

Summarise data and show the:
Lower and upper quartile
Median
Minimum
Maximum

19
Q

How does a box plot identify outliers?

A

Scores more than 1.5 x the interquartile range below the lower quartile or above the upper quartile are treated as outliers
(the interquartile range is shown by the length of the box)
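
A minimal sketch of the 1.5 x IQR rule, assuming made-up data and the "inclusive" quartile method of `statistics.quantiles`. The names `lower_fence` and `upper_fence` are illustrative and do not come from the cards.

```python
import statistics

# Hypothetical scores with one unusually large value
scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]

q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # length of the box in a box plot

# Anything beyond these limits is flagged as an outlier
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in scores if x < lower_fence or x > upper_fence]
print("fences:", lower_fence, upper_fence, "outliers:", outliers)
```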

20
Q

What are the properties of scatterplots

A
  • Shows the relationship between variables
  • Needs two measurements per case (one for each variable) = bivariate data
  • Can work out correlations from it
21
Q

Describe correlations on scatterplots

A

-Positive or negative relationship = direction
-Strong relationship = points lie close to a line
-Weak relationship = points are widely scattered
-Variables that are related are correlated
-Correlation makes no distinction between dependent and independent variables (it does not show cause)

22
Q

What is the purpose of correlational analysis?

A
  • To establish whether there is a linear (straight line) relationship between two variables
    -The direction of the relationship
    -The strength of the relationship
23
Q

Define correlation coefficient

A

the specific measure that quantifies the strength of the linear relationship between two variables in a correlation analysis.

  • Correlation coefficients do not change if we change the unit of measurement (e.g. gallons instead of litres)
24
Q

What are the two types of correlation coefficients

A

-Pearson r
-Spearman r (also written as Spearman's rho)

For both:
-Values lie between -1 and 1
-Positive values = positive relationship
-Negative values = negative relationship
-A larger sample size gives more certainty that the relationship is real

25
Q

Why does it matter whether a scatterplot is linear or non-linear when quantifying correlation?

A

-Linear relationship = correlation can be measured
-Non-linear relationship = measuring correlation does not make sense; the data might need to be transformed first

26
Q

Describe the correlation coefficient pearson r

A
  • Calculated directly from the raw scores
  • Used with interval or ratio data
  • Highly affected by outliers
  • Not suitable for skewed data
27
Q

Describe the correlation coefficient spearman r

A
  • Calculated from the ranking of the raw scores
  • Used with ordinal data
  • Minimally affected by outliers
  • Suitable for skewed data
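
To illustrate the difference between the two coefficients, here is a hedged sketch in plain Python. The function names `pearson_r`, `ranks` and `spearman_r` and the data are made up for this example; Spearman is computed here as Pearson r applied to the ranks, with tied scores sharing an average rank.

```python
import math

def pearson_r(xs, ys):
    """Pearson r, calculated directly from the raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank scores from 1 (smallest); tied scores share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(xs, ys):
    """Spearman r, calculated from the ranking of the raw scores."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: the last mark is an outlier
hours = [1, 2, 3, 4, 5, 6]
marks = [50, 55, 61, 64, 70, 200]

print("Pearson r: ", round(pearson_r(hours, marks), 3))
print("Spearman r:", round(spearman_r(hours, marks), 3))
```

In this made-up data the marks always rise with hours, so the ranks agree perfectly, while the single outlier pulls the Pearson value well below 1.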
28
Q

What shows distributions?

A

Density curves

29
Q

What are density curves useful for?

A
  • generalising results to the population.
  • A density curve is a smoothed, idealised version of a histogram
  • Displays overall pattern (shape) of a distribution
  • always on or above the horizontal axis
30
Q

What does the area under a curve of a distribution represent and what can you do with this area?

A
  • Curves are scaled so that the total area underneath them is exactly 1 (a probability of 1)
  • 100% of the scores lie under the curve
  • If you know certain values of the model (e.g. mean or SD) you can make predictions about the overall population
    E.g. if the area above the mean is 0.6, then 60% of the scores will be above the mean
31
Q

What is the median in relation to density curves?

A

point that divides area into two equal parts

32
Q

What are quartiles in relation to density curves?

A

points that divide area under curve into quarters

33
Q

What is the mode in relation to density curves?

A

The position of the peak of the curve

34
Q

What is the mean in relation to density curves?

A

the balancing point of the curve

35
Q

What are the properties of a normal distribution?

A

Symmetrical
Single peaked
Tails approach the x-axis and only meet it at infinity

36
Q

What is the shape of a normal distribution determined by?

A

Its standard deviation

37
Q

What is the location of a normal distribution determined by

A

Its mean

38
Q

What are z scores?

A
  • Standard scores: a z-score expresses a value as the number of standard deviations it lies above or below the mean of its data set
  • This puts values from two different data sets on a single scale, so they can be compared
39
Q

What is the calculation for a z score

A

z = ((score) - (mean)) divided by (standard deviation)
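
A short sketch of the calculation, using made-up exam results. The function name `z_score` and the numbers are illustrative only.

```python
def z_score(score, mean, sd):
    """Subtract the mean from the score first, then divide by the standard deviation."""
    return (score - mean) / sd

# Hypothetical results from two tests with different means and spreads
maths_z = z_score(score=70, mean=60, sd=5)
english_z = z_score(score=75, mean=70, sd=10)

print("maths z:", maths_z, "english z:", english_z)
```

Although the raw English mark is higher, the maths mark lies further above its own mean in standard deviation units, which is the kind of comparison z-scores make possible.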

40
Q

What is a standard normal distribution?

A
  • The normal distribution with a mean of 0 and a standard deviation of 1
  • To compare data from two different normal distributions, convert each normal data set into the standard normal distribution by calculating z-scores
41
Q

How are details of a z distribution (standard normal) worked out?

A

Using table entries
A table entry always gives the area to the left of the z-score
From this we can work out the percentage of the population above or below our point of interest
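
Instead of a printed table, the same areas can be looked up with `statistics.NormalDist` from Python's standard library. This is a sketch with an arbitrarily chosen z-score of 1.0.

```python
from statistics import NormalDist

# Standard normal (z) distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

z = 1.0  # arbitrary z-score chosen for illustration

area_left = standard_normal.cdf(z)  # what a z table entry gives: area to the left of z
area_right = 1 - area_left          # total area is 1, so the rest lies to the right

print(f"below z: {area_left:.4f}  above z: {area_right:.4f}")
```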

42
Q

What is always the standard deviation for a z distribution

A

1

43
Q

What is always the mean for a z distribution

A

0

44
Q

What is the total area under the curve for a z distribution

A

1 (representing 100% of the participants)

45
Q

What is a chi- squared test?

A
  • A non-parametric test: it makes no assumptions about population parameters, so it is distribution free
46
Q

What are the two types of chi- squared test?

A

The goodness of fit test
The test of independence

  • Both types of test look for significant differences between the observed frequencies and those expected under the null hypothesis
47
Q

What is the chi squared goodness of fit test?

A
  • Used on unrelated categorical data, where each person can only be in one category
  • Used to look at the proportions of a population
  • Looks at the categories of one variable
48
Q

What are observed and expected frequencies in the chi squared goodness of fit test?

A
  • The observed frequencies are the numbers of participants measured in individual categories e.g. number of men vs number of women
  • These frequencies are then compared to frequencies predicted by the null hypothesis (the expected frequencies)
49
Q

How do you calculate expected frequencies?

A

Sample size x the proportion expected in that category under the null hypothesis
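
A hedged sketch of the expected frequencies for a goodness of fit test, with made-up counts and a null hypothesis of equal proportions. The last step uses the usual chi-squared statistic, the sum of (observed - expected) squared divided by expected, which is not stated on the cards and is included here as an assumption about what comes next.

```python
# Hypothetical observed frequencies: each participant is in exactly one category
observed = {"men": 30, "women": 50}

# Proportions predicted by the null hypothesis (assumed equal here)
null_proportions = {"men": 0.5, "women": 0.5}

sample_size = sum(observed.values())

# Expected frequency = sample size x the proportion under the null hypothesis
expected = {category: sample_size * p for category, p in null_proportions.items()}

# Standard chi-squared statistic (assumption: not given in the cards)
chi_squared = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)

print("expected:", expected, "chi-squared:", chi_squared)
```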

50
Q

What is the chi squared test of independence?

A
  • Looks at the categories of two variables
  • Uses data in the form of frequencies in different categories, which are compared to the expected frequencies predicted by the null hypothesis
  • With two variables there are more cells to compare, e.g. 4 instead of 2 when each variable has two categories
    Data are presented in the form of a matrix displaying all categories
51
Q

How do you calculate the degrees of freedom for chi squared test of independence

A

(number of rows R minus 1) x (number of columns C minus 1)
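
A small sketch of the degrees of freedom calculation for a made-up 3 x 2 table. The expected count line uses the usual rule (row total x column total / grand total), which is not stated on the cards and is shown only as an assumption.

```python
# Hypothetical frequency matrix: 3 rows (categories of one variable)
# by 2 columns (categories of the other variable)
table = [
    [20, 30],
    [25, 25],
    [10, 40],
]

rows = len(table)
columns = len(table[0])

# Degrees of freedom = (R - 1) x (C - 1)
degrees_of_freedom = (rows - 1) * (columns - 1)

# Expected count for each cell (assumption: the standard rule, not from the cards)
row_totals = [sum(row) for row in table]
column_totals = [sum(row[c] for row in table) for c in range(columns)]
grand_total = sum(row_totals)
expected = [
    [row_totals[r] * column_totals[c] / grand_total for c in range(columns)]
    for r in range(rows)
]

print("df:", degrees_of_freedom)
print("expected:", expected)
```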

52
Q

What is probability?

A

A measure of how likely it is that some event will occur
Probability can vary from 0 (never) to 1 (always)

53
Q

Summarise testing the null hypothesis

A
  • Assume there is no difference (or no relationship) between the two conditions
  • Calculate how probable it is to get a score as extreme as, or more extreme than, the one we obtained
  • If the probability is very small, reject the null hypothesis (thus accepting the alternative hypothesis)
    By convention, "very small" means a p-value of less than 0.05 (5%)
54
Q

What are critical values?

A
  • A score that marks the edge of the 5% range: anyone scoring beyond it (lower or higher, depending on the direction of the test) falls in the most extreme 5% of scores
  • Essentially the cut-off points for statistical significance (e.g. the top 5%)
55
Q

How do you calculate critical values?

A
  • For a one-tailed test at the 5% level, the critical value is a z-score of 1.645
  • That 5% cut-off of significance is roughly 1.6 standard deviations above (or below) the mean

-The same for every normal distribution
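
The 1.645 figure can be checked with `statistics.NormalDist`. This sketch also shows the two-tailed 5% cut-off for comparison, which is not on the card; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed: the z-score with 95% of the area to its left (top 5% beyond it)
one_tailed_critical = standard_normal.inv_cdf(0.95)

# Two-tailed: 2.5% in each tail, so 97.5% of the area lies to the left
two_tailed_critical = standard_normal.inv_cdf(0.975)

print(f"one-tailed 5%: {one_tailed_critical:.3f}")
print(f"two-tailed 5%: {two_tailed_critical:.3f}")
```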

56
Q

What is a type I error?

A

Rejecting the null hypothesis (and accepting the alternative) when we shouldn’t

  • Deciding the score is statistically significant when it is not
57
Q

What is a type II error?

A

Accepting the null hypothesis (rejecting the alternative) when we shouldn't, i.e. missing an effect that is really there

58
Q

How can you decrease the likelihood of a type I error?

A

By reducing the threshold of significance from 5% (0.05) to 1% (0.01)

  • However, this could increase the possibility of a type II error
  • And decreasing the likelihood of a type II error could also increase the likelihood of a type I error
59
Q

What is an alpha level?

A

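The threshold of significance chosen before testing (e.g. 0.05). The p-value is compared against it: if the p-value is less than alpha, the result is statistically significant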

60
Q

What is a non directional (two- tailed) alternative hypothesis?

A

Does not state the direction of the difference; it only states that the conditions will differ

61
Q

What is a directional (one-tailed) alternative hypothesis?

A

States the direction of the difference (e.g. higher or lower, better or worse)

62
Q

What is statistical inference from samples?

A

Using probability theory to make inferences about a population from sample data

Why do we do it? To generalise from a sample to the population

63
Q

What is the calculation for statistical inference?

A

(Estimated mean) divided by (standard deviation from the sample)