Statistics Flashcards

1
Q

What is an operational definition of variables?

A
  • A specific statement of how a variable will be measured to represent the concept under study

Makes the study more replicable

2
Q

What is measurement?

A

A way of describing real-life factors using numbers

3
Q

What are the 4 types of measurement

A

Nominal scales
Ordinal scales
Interval scales
Ratio scales

4
Q

What is a nominal scale?

A

A measurement scale in which numbers serve only as “tags” or “labels” to identify or classify an object.
E.g. bus numbers 19, 242, 3

5
Q

What is an ordinal scale?

A

-Data are put in rank order, but the distances between adjacent scores can vary

6
Q

What is an interval scale?

A

A measurement scale with order, in which the differences between adjacent values are equal
Zero is arbitrary: it does not mean the quantity is absent (e.g. 0 °C)

7
Q

What is a ratio scale?

A

-An interval scale with a meaningful (true) zero
-No negative numbers

8
Q

What are the measures of central tendency

A

-Mean, median, mode
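
As a rough illustration, the three measures can be computed with Python's standard `statistics` module. The scores below are made up for the example.

```python
import statistics

# Made-up scores, for illustration only
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of the scores divided by how many there are
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(mean, median, mode)
```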

9
Q

Define measures of spread

A

Statistics that describe how much the scores vary

10
Q

What are the 3 measures of spread

A

Range
Interquartile range
Standard deviation

11
Q

What is the interquartile range?

A

The spread between the first and third quartiles (the 25th and 75th percentiles), i.e. the range covered by the middle 50% of scores
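
A minimal sketch of the calculation, assuming made-up scores and the "inclusive" quartile method of Python's `statistics.quantiles` (textbooks and software differ on how quartiles are interpolated, so other conventions can give slightly different values):

```python
import statistics

# Made-up scores, for illustration only
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1  # spread of the middle 50% of scores
print(q1, q3, iqr)
```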

12
Q

What is standard deviation?

A

A measure of how far, on average, the data points lie from the mean

  • The larger the SD, the larger the spread of scores
13
Q

What is the 6-step calculation for standard deviation?

A

Step 1: Find the mean.
Step 2: Subtract the mean from each score.
Step 3: Square each deviation.
Step 4: Add the squared deviations.
Step 5: Divide the sum by the number of scores (for a sample rather than a whole population, divide by the number of scores minus 1).
Step 6: Take the square root of the result from step 5.
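
The six steps can be sketched in Python as follows. The function name `population_sd` and the scores are made up for illustration. Dividing by the number of scores, as in step 5, gives the population SD.

```python
import math

def population_sd(scores):
    """Standard deviation following the six steps on the card (divides by n)."""
    mean = sum(scores) / len(scores)            # Step 1: find the mean
    deviations = [x - mean for x in scores]     # Step 2: subtract the mean from each score
    squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
    total = sum(squared)                        # Step 4: add the squared deviations
    variance = total / len(scores)              # Step 5: divide by the number of scores
    return math.sqrt(variance)                  # Step 6: take the square root

# Made-up scores: mean is 5, squared deviations sum to 32, 32 / 8 = 4
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = population_sd(scores)
print(sd)  # 2.0
```

The standard library's `statistics.pstdev` performs the same population calculation, while `statistics.stdev` divides by the number of scores minus 1.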

14
Q

What 3 things are graphs for?

A

Representing data

Indicating patterns within the data (e.g. central tendency, spread of data, correlations)

Deciding how to analyse data (e.g. if there are outliers, use the median rather than the mean)

15
Q

What kind of data are bar graphs for?

A

Ordinal data
Nominal data

16
Q

What are the 3 types of bar graphs?

A

Horizontal
Stacked
Histograms (however, here the area of each bar represents the frequency)

17
Q

What are the properties of stem and leaf plots?

A

Present data in a compact form
Show the size of data subsets
Stems = the leading digits (e.g. 0s, 10s, 20s)
Leaves = units (each leaf is a single digit)
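
A small sketch of how stems and leaves are split, assuming non-negative whole-number scores with tens as stems. The function name `stem_and_leaf` and the data are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Group each score's unit digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # e.g. 28 -> stem 2, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

# Made-up scores, for illustration only
plot = stem_and_leaf([12, 15, 21, 7, 28, 21])
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The number of leaves on each stem shows the size of that subset of the data.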

18
Q

What do box plots do?

A

Summarise data and show the:
Lower and upper quartile
Median
Minimum
Maximum

19
Q

How does a box plot identify outliers?

A

Scores more than 1.5 x the interquartile range below the lower quartile or above the upper quartile are treated as outliers
(the interquartile range is shown by the length of the box)
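
A minimal sketch of the 1.5 x interquartile range rule. The function name `find_outliers` and the scores are made up, and the "inclusive" quartile method is an assumption (plotting software may compute quartiles slightly differently).

```python
import statistics

def find_outliers(scores):
    """Return scores lying more than 1.5 x IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in scores if x < lower_fence or x > upper_fence]

# Made-up scores: quartiles are 3 and 7, so the fences are -3 and 13
scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]
outliers = find_outliers(scores)
print(outliers)  # [30]
```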

20
Q

What are the properties of scatterplots?

A
  • Show the relationship between two variables
  • Need two values per case (one for each variable) = bivariate data
  • Can be used to work out correlations
21
Q

Describe correlations on scatterplots

A

-Positive or negative relationship = direction
-Strong relationship = points lie close to a line
-Weak relationship = points are widely scattered
-Variables that are related are correlated
-Correlation makes no distinction between dependent and independent variables (it does not show cause)

22
Q

What is the purpose of correlational analysis?

A
  • To establish whether there is a linear (straight-line) relationship between two variables
    -The direction of the relationship
    -The strength of the relationship
23
Q

Define correlation coefficient

A

The specific measure that quantifies the strength and direction of the linear relationship between two variables in a correlation analysis.

  • Correlation coefficients do not change if we change the unit of measurement (e.g. gallons instead of litres)
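
The point about units can be checked with a small sketch. Here `pearson_r` is a hand-written helper, the fuel and distance figures are made up, and 0.264172 is an approximate litres-to-US-gallons factor; all are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from the raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Made-up data: fuel used and distance travelled
litres = [10, 20, 30, 40, 50]
distance = [95, 210, 290, 405, 500]
gallons = [v * 0.264172 for v in litres]  # same quantities, different unit

r_litres = pearson_r(litres, distance)
r_gallons = pearson_r(gallons, distance)
print(r_litres, r_gallons)  # the two values agree up to rounding error
```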
24
Q

What are the two types of correlation coefficient?

A

-Pearson's r
-Spearman's r (also written rho)

For both:
-Values lie between -1 and 1
-Positive values = positive relationship
-Negative values = negative relationship
-A larger sample size gives more certainty that the relationship is real

25
Why does it matter whether a scatterplot is linear or non-linear when quantifying correlation?
- Linear relationship = correlation can be measured
- Non-linear relationship = measuring correlation does not make sense; the data might need to be transformed
26
Describe the Pearson r correlation coefficient
- Calculated directly from the raw scores
- Used with interval or ratio data
- Highly affected by outliers
- Not suitable for skewed data
27
Describe the Spearman r correlation coefficient
- Calculated from the ranks of the raw scores
- Used with ordinal data
- Minimally affected by outliers
- Suitable for skewed data
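
A rough sketch of the rank-based calculation, using the common shortcut formula 1 - 6 x (sum of squared rank differences) / (n(n² - 1)), which only holds when there are no tied scores. The helper names and data are made up for illustration.

```python
def ranks(values):
    """Rank each value from 1 (smallest) upwards; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_r(xs, ys):
    """Spearman correlation from rank differences (no-ties shortcut formula)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Made-up data: 100 is an extreme score, but only its rank (5th) is used
xs = [1, 2, 3, 4, 100]
ys = [2, 4, 5, 9, 10]
rho = spearman_r(xs, ys)
print(rho)  # 1.0
```

Because the outlier is replaced by its rank, it has no more influence than any other top-ranked score.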
28
What shows distributions?
Density curves
29
What are density curves useful for?
- Generalising results to the population
- A density curve is a smoothed, idealised version of a histogram
- Displays the overall pattern (shape) of a distribution
- Always lies on or above the horizontal axis
30
What does the area under a curve of a distribution represent and what can you do with this area?
- Curves are scaled so that the area underneath them is exactly 1 (a probability)
- 100% of the scores lie under the curve
- If you know certain values of the model (e.g. mean or SD) you can make predictions about the overall population
- E.g. if the area above the mean = 0.6, then 60% of the scores lie above the mean
31
What is the median in relation to density curves?
point that divides area into two equal parts
32
What are quartiles in relation to density curves?
points that divide area under curve into quarters
33
What is the mode in relation to density curves?
The position of the peak of the curve
34
What is the mean in relation to density curves?
the balancing point of the curve
35
What are the properties of a normal distribution?
- Symmetrical
- Single-peaked
- Tails approach the x-axis but only meet it at infinity
36
What is the shape of a normal distribution determined by?
Its standard deviation
37
What is the location of a normal distribution determined by?
Its mean
38
What are z scores?
- Standard scores: each value is expressed as the number of standard deviations it lies from the mean of its data set, which lets us compare values from two different data sets on a single scale
39
What is the calculation for a z score?
z = (score - mean) / standard deviation
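
A minimal sketch of the calculation, with made-up exam figures, showing how z-scores put results from two different tests on one scale. The function name `z_score` is illustrative.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Made-up example: which result is better relative to its own class?
maths_z = z_score(70, mean=60, sd=5)     # 2.0 SDs above the maths mean
english_z = z_score(80, mean=75, sd=10)  # 0.5 SDs above the English mean

print(maths_z, english_z)
```

Although the English mark is higher in raw terms, the maths mark is further above its own mean.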
40
What is a standard normal distribution?
- The normal distribution with a mean of 0 and a standard deviation of 1
- To compare data from two different normal distributions, convert both into standard normal distributions by calculating z-scores
41
How are details of a z distribution (standard normal) worked out?
- Using table entries
- A table entry always gives the area to the left of the z-score
- From this we can work out the percentage of the population above or below our point of interest
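
In place of a printed table, the same areas can be looked up with `statistics.NormalDist` from Python's standard library (3.8 or later). This is a sketch; the helper name `area_left` is made up.

```python
from statistics import NormalDist

def area_left(z):
    """Area under the standard normal curve to the left of z (the table entry)."""
    return NormalDist().cdf(z)  # NormalDist() defaults to mean 0, SD 1

below = area_left(1.0)  # proportion of the population scoring below z = 1
above = 1 - below       # the total area is 1, so the rest lies above

print(round(below, 4), round(above, 4))
```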
42
What is always the standard deviation of a z distribution?
1
43
What is always the mean of a z distribution?
0
44
What is the total area under the curve of a z distribution?
1 (representing 100% of the participants)
45
What is a chi-squared test?
- Non-parametric (makes no assumptions about population parameters, so it is distribution-free)
46
What are the two types of chi-squared test?
- The goodness of fit test
- The test of independence
- Both test for significant differences between observed frequencies and the frequencies expected under the null hypothesis
47
What is the chi-squared goodness of fit test?
- Used on unrelated categorical data, where each person can be in only one category
- Used to look at the proportions of a population
- Looks at the categories of one variable
48
What are observed and expected frequencies in the chi-squared goodness of fit test?
- The observed frequencies are the numbers of participants counted in each category, e.g. number of men vs number of women
- These are compared with the frequencies predicted by the null hypothesis (the expected frequencies)
49
How do you calculate expected frequencies?
Sample size x the proportion predicted by the null hypothesis for that category
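
A small sketch with made-up counts. The expected frequencies follow the rule above. The chi-squared statistic itself is not spelled out on these cards, so the standard formula, the sum of (observed - expected)² / expected, is used here as an assumption.

```python
# Made-up observed counts and null-hypothesis proportions
observed = {"men": 30, "women": 70}
null_proportions = {"men": 0.5, "women": 0.5}

sample_size = sum(observed.values())

# Expected frequency = sample size x proportion under the null hypothesis
expected = {cat: sample_size * p for cat, p in null_proportions.items()}

# Standard chi-squared statistic: sum of (O - E)^2 / E over the categories
chi_squared = sum(
    (observed[cat] - expected[cat]) ** 2 / expected[cat] for cat in observed
)

print(expected, chi_squared)
```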
50
What is the chi-squared test of independence?
- Looks at the categories of two variables
- Uses data in the form of frequencies in different categories, which are compared with the expected frequencies predicted by the null hypothesis
- With two variables of two categories each, there are 4 cells instead of 2 categories
- Data are presented as a matrix (contingency table) displaying all combinations of categories
51
How do you calculate the degrees of freedom for the chi-squared test of independence?
(number of rows R minus 1) x (number of columns C minus 1)
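
A sketch for a small made-up frequency table. The degrees of freedom follow the rule above. The expected cell frequencies use the standard rule (row total x column total / grand total), which is an addition not stated on these cards. The function names are illustrative.

```python
def degrees_of_freedom(table):
    """(rows - 1) x (columns - 1) for a table given as a list of rows."""
    return (len(table) - 1) * (len(table[0]) - 1)

def expected_frequencies(table):
    """Expected count per cell if the two variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Made-up 2 x 2 table of observed frequencies (4 cells)
observed = [[20, 30],
            [30, 20]]

df = degrees_of_freedom(observed)
expected = expected_frequencies(observed)
print(df, expected)
```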
52
What is probability?
- A measure of how likely it is that some event will occur
- Probability can vary from 0 (never) to 1 (always)
53
Summarise testing the null hypothesis
- Assume there is no difference or no relationship between the two conditions (the null hypothesis)
- Calculate how probable it is to get a score as extreme as, or more extreme than, the one we obtained
- If that probability (the p-value) is very small, conventionally less than 0.05 (5%), reject the null hypothesis (thus accepting the alternative hypothesis)
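
As a sketch of this decision rule, assume the obtained result has already been converted to a z-score and that the test is one-tailed (upper tail). The z-score of 2.0 and the helper name `one_tailed_p` are made up for illustration.

```python
from statistics import NormalDist

def one_tailed_p(z):
    """Probability of a z-score at least this high if the null hypothesis is true."""
    return 1 - NormalDist().cdf(z)

alpha = 0.05             # threshold of significance (5%)
p = one_tailed_p(2.0)    # made-up obtained z-score of 2.0
reject_null = p < alpha  # reject the null hypothesis only if p is below the threshold

print(round(p, 4), reject_null)
```

With a stricter threshold of 0.01, the same result would no longer lead to rejecting the null hypothesis.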
54
What are critical values?
- Scores that mark the cut-off point for statistical significance (e.g. the top 5%)
- A score beyond the critical value (lower or higher, depending on the tail) falls in this 5% region
55
How do you calculate critical values?
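- For a cut-off with 5% of scores in one tail, you need a z-score of 1.645
- So the 5% cut-off for significance is roughly 1.6 standard deviations below or above the mean (one-tailed; a two-tailed 5% test uses about 1.96)
- In z-score terms this is the same for every normal distribution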
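
These cut-offs can be recovered from the standard normal distribution with `statistics.NormalDist.inv_cdf` (Python 3.8 or later). This is a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# z-score with 95% of the area to its left, leaving 5% in the upper tail
one_tailed_cutoff = standard_normal.inv_cdf(0.95)

# For a two-tailed test the 5% is split, leaving 2.5% in each tail
two_tailed_cutoff = standard_normal.inv_cdf(0.975)

print(round(one_tailed_cutoff, 3), round(two_tailed_cutoff, 3))
```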
56
What is a type I error?
- Rejecting the null hypothesis (and accepting the alternative) when we shouldn't
- Deciding the result is statistically significant when it's not
57
What is a type 2 error?
Failing to reject the null hypothesis (rejecting the alternative) when the null hypothesis is actually false
58
How can you decrease the likelihood of a type I error?
- By reducing the threshold of significance from 5% (0.05) to 1% (0.01)
- However, this could increase the possibility of a type 2 error
- Likewise, decreasing the likelihood of a type 2 error could increase the likelihood of a type I error
59
What is an alpha level?
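The significance threshold that the p-value is compared against (e.g. 0.05); it is also the accepted probability of a type I error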
60
What is a non-directional (two-tailed) alternative hypothesis?
Does not state the direction of the difference, only that the conditions will differ
60
What is a directional (one-tailed) alternative hypothesis?
States the direction of the difference (e.g. higher or lower, better or worse)
61
What is statistical inference from samples?
- Using probability theory to make inferences about a population from sample data
- Why do we do it? To generalise from a sample to the population
62
What is the calculation for statistical inference?
(Estimated mean) divided by (standard deviation from sample)