Statistics Flashcards
What is an operational definition of variables?
- A specific statement of how a variable will be measured to represent the concept under study
- Makes the study easier to replicate
What is a measurement?
A way to describe real-life factors using numbers
What are the 4 types of measurement scale?
Nominal scales
Ordinal scales
Interval scales
Ratio scales
What is a nominal scale?
A measurement scale, in which numbers serve as “tags” or “labels” only, to identify or classify an object.
E.g. Bus 19, 242, 3
What is an ordinal scale?
-Data are put in rank order, but the distances between adjacent scores can vary
What is an interval scale?
A measurement scale with order, where the differences between adjacent values are equal
Zero is arbitrary: it does not mean the quantity is absent (e.g. 0 °C)
What is a ratio scale?
-An interval scale where 0 is meaningful (it means none of the quantity is present)
-No negative numbers
What are the measures of central tendency?
-Mean, median, mode
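As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The scores below are made up for illustration and are not from the course material.

```python
import statistics

# Hypothetical scores, invented for this example.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores divided by the number of scores
median = statistics.median(scores)  # middle value; here the average of 3 and 5
mode = statistics.mode(scores)      # most frequently occurring score

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With an even number of scores, the median is taken as the average of the two middle values.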
Define measures of spread
Statistics that describe how much scores vary
What are the 3 measures of spread?
Range
Interquartile range
Standard deviation
What is the interquartile range?
The spread between the first and third quartiles (the 25th and 75th percentiles), i.e. the range of the middle 50% of scores
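A short sketch of the range and the interquartile range, using made-up scores. Textbooks and software use slightly different conventions for locating quartiles; this example assumes the "inclusive" method of `statistics.quantiles`, so other tools may give slightly different values for the same data.

```python
import statistics

# Hypothetical scores, invented for this example.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Range: highest score minus lowest score.
data_range = max(scores) - min(scores)

# n=4 splits the data into quarters and returns the three cut points
# (first quartile, median, third quartile).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range: distance between the first and third quartiles.
iqr = q3 - q1

print("range:", data_range)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```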
What is standard deviation?
A measure of how far, on average, the data points lie from the mean
- The larger the SD, the larger the spread of scores
What is the 6-step calculation for standard deviation?
Step 1: Find the mean.
Step 2: Subtract the mean from each score.
Step 3: Square each deviation.
Step 4: Add the squared deviations.
Step 5: Divide the sum by the number of scores (or by the number of scores minus 1 when estimating from a sample).
Step 6: Take the square root of the result from step 5.
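The six steps above can be sketched in Python as follows. The scores are made up for illustration, and the sketch divides by the number of scores in step 5 (the population form); the standard library's `statistics.pstdev` does the same, while `statistics.stdev` divides by the number of scores minus 1.

```python
import math
import statistics

# Hypothetical scores, invented for this example.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)          # Step 1: find the mean
deviations = [x - mean for x in scores]   # Step 2: subtract the mean from each score
squared = [d ** 2 for d in deviations]    # Step 3: square each deviation
total = sum(squared)                      # Step 4: add the squared deviations
variance = total / len(scores)            # Step 5: divide by the number of scores
sd = math.sqrt(variance)                  # Step 6: take the square root

print("mean:", mean)
print("SD:", sd)

# Cross-check against the standard library's population standard deviation.
print("pstdev:", statistics.pstdev(scores))
```

For these scores the mean is 5 and the squared deviations sum to 32, so the variance is 4 and the SD is 2.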
What 3 things are graphs for?
Representing data
Indicating patterns within the data (e.g. central tendency, spread of data, correlations)
Deciding how to analyse data (e.g. outliers = use the median rather than the mean)
What kind of data are bar graphs for?
Ordinal data
Nominal data
What are the 3 types of bar graphs?
Horizontal
Stacked
Histograms (however, in a histogram the area of each bar represents the frequency)
What are the properties of stem and leaf plots?
Data in a compact form
Shows the size of data subsets (the number of leaves on each stem)
Stems = multiples of ten (e.g. 0s, 10s, 20s)
Leaves = units (each leaf can only be 1 digit)
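A minimal sketch of how a stem and leaf plot could be built, assuming non-negative whole numbers with stems as tens and leaves as units. The function name `stem_and_leaf` and the data are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers into {stem: [leaves]} (stem = tens, leaf = units)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical data, invented for this example.
data = [12, 15, 21, 23, 23, 38, 7]
plot = stem_and_leaf(data)

# Print every stem from lowest to highest, including stems with no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

The row `2 | 1 3 3` stands for the scores 21, 23 and 23, and its three leaves show that the 20s are the largest subset.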
What do box plots do?
Summarise data and show the:
Lower and upper quartile
Median
Minimum
Maximum
How does a box plot identify outliers?
Points that lie more than 1.5 × the interquartile range below the lower quartile or above the upper quartile
(the interquartile range is shown by the length of the box)
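The 1.5 × interquartile range rule can be sketched as below. The data are made up, and the quartiles are found with the "inclusive" method of `statistics.quantiles`, which is an assumption; plotting software may place the quartiles slightly differently.

```python
import statistics

# Hypothetical scores with one extreme value, invented for this example.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # length of the box

# Fences sit 1.5 x IQR beyond each end of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any point outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Here the quartiles are 3 and 7, so the fences are -3 and 13, and only the score of 30 is flagged.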
What are the properties of scatterplots?
- Shows the relationship between variables
- Needs two measurements per case (one for each variable) = bivariate data
- Can work out correlations from it
Describe correlations on scatterplots
-Positive or negative relationship = direction
-Strong relationship = Points lie close to a straight line
-Weak relationship = Points are widely scattered
-Variables that are related are correlated
-Correlation makes no distinction between dependent and independent variables (it does not show cause)
What is the purpose of correlational analysis?
-To establish whether there is a linear (straight-line) relationship between two variables
-The direction of the relationship
-Strength of the relationship
Define correlation coefficient
- The specific measure that quantifies the strength and direction of the linear relationship between two variables in a correlation analysis.
- Correlation coefficients do not change if we change the unit of measurement (e.g. gallons instead of litres)
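To illustrate that the coefficient is unaffected by the unit of measurement, here is a small sketch that computes Pearson's r by hand. The helper `pearson_r` and the fuel and distance figures are hypothetical, and the conversion assumes roughly 4.546 litres per imperial gallon.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the two means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Hypothetical data: fuel used and distance travelled.
litres = [10, 20, 30, 40, 50]
distance_km = [95, 210, 290, 405, 480]

# Same fuel amounts expressed in imperial gallons (approximate conversion).
gallons = [v / 4.546 for v in litres]

r_litres = pearson_r(litres, distance_km)
r_gallons = pearson_r(gallons, distance_km)

print("r using litres: ", round(r_litres, 4))
print("r using gallons:", round(r_gallons, 4))
```

Both calls give the same value (up to floating-point rounding) because dividing every score by a constant rescales the deviations in the numerator and the denominator by the same factor.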
What are the two types of correlation coefficients?
-Pearson's r
-Spearman's rho (also written r_s), which is based on ranks
-For both, values lie between -1 and 1.
-Positive values = positive relationship
-Negative values = negative relationship
-A larger sample size leads to more certainty that the relationship is real
Why does it matter whether a relationship is linear or non-linear when quantifying correlation from a scatterplot?
-Linear relationship = Can measure correlations
-Non-linear relationship = Measuring a linear correlation does not make sense; might need to transform the data first
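A minimal sketch of why transforming the data can help. The data are made up so that y grows exponentially with x, and taking the logarithm of y is one possible transformation, assumed here for illustration. The helper `pearson_r` is a hypothetical hand-written function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Hypothetical data: y is perfectly determined by x, but along a curve.
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]

# On the raw data, r understates how tightly the variables are related.
r_raw = pearson_r(x, y)

# After a log transform the points fall on a straight line.
log_y = [math.log(v) for v in y]
r_log = pearson_r(x, log_y)

print("r on raw data:       ", round(r_raw, 3))
print("r after log transform:", round(r_log, 3))
```

On the raw data r is well below 1 even though y depends entirely on x, while after the transformation r is essentially 1.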