Data: Descriptive Statistics Flashcards
What are the various measures of central tendency
- Mode
- Median
- Mean
How do you decide which measure of central tendency to use
Based on the levels of measurement
Which measure of central tendency would you use if the data were nominal
Mode
Which measure of central tendency would you use if the data were ordinal
- the median
- then the mode if calculating the median isn’t possible
Which measure of central tendency would you use if the data were interval/ratio
- preferably the mean
- however, if the data is skewed…
- you need to calculate the median
- then the mode if calculating the median isn’t possible
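The decision rule on the cards above can be sketched as a small function. This is only an illustration: the name choose_measure and its parameters skewed and median_possible are assumptions made for this sketch, not part of any library.

```python
def choose_measure(level, skewed=False, median_possible=True):
    """Pick a measure of central tendency from the level of measurement.

    Hypothetical helper that mirrors the flashcard rules:
    nominal -> mode; ordinal -> median; interval/ratio -> mean,
    unless the data is skewed, in which case fall back to the median.
    The mode is the last resort when the median cannot be calculated.
    """
    level = level.lower()
    if level == "nominal":
        return "mode"
    if level == "ordinal" or (level in ("interval", "ratio") and skewed):
        return "median" if median_possible else "mode"
    if level in ("interval", "ratio"):
        return "mean"
    raise ValueError(f"unknown level of measurement: {level!r}")


print(choose_measure("ratio", skewed=True))
```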
What is the strength of using the mean
- takes all the scores into account so it is the most sensitive
What are the weaknesses of using the mean
- impacted by extreme values (i.e. outliers) - it will be skewed (artificially inflated or deflated)
- not useful when a fractional value makes no sense for the data (e.g. 2.4 cola bottles)
What is the strength of using the mode
- Not impacted by extreme values (i.e. outliers) - it will not be skewed (artificially inflated or deflated)
- useful for nominal data, as it is the only measure that can be used
What is the weakness of using the mode
- Doesn’t take all the scores into account so it is not as sensitive
- there may be several modes, or none at all
What is the strength of using the median
- not impacted by extreme values (i.e. outliers) - it will not be skewed (artificially inflated or deflated)
What is the weakness of using the median
- Doesn’t take all the scores into account so it is not as sensitive
EXAMPLE:
In an unethical experiment 3 groups of 8 lab rats were given a maze to complete and times were recorded in seconds.
Group 1) Rats given brain lesions - 35, 27, 26, 27, 28, 79, 27, 30
Group 2) Rats with tails cut off - 15, 10, 18, 22, 8, 49, 16, 22
Group 3) Rats with eyes damaged - 33, 33, 32, 28, 67, 45, 24, 29
Which measure of central tendency should be used + why?
- the data is ratio level, but each group contains an outlier (79, 49 and 67 seconds)
- therefore the median is the most appropriate measure of central tendency
- the mean can still be calculated, but the outliers would artificially inflate it
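To check this answer, the three measures can be computed for each group with Python's standard statistics module. This is a minimal sketch: the times are copied from the example, while the dictionary keys and the name summary are labels chosen here.

```python
from statistics import mean, median, mode

# Maze completion times in seconds, copied from the example above.
groups = {
    "brain lesions": [35, 27, 26, 27, 28, 79, 27, 30],
    "tails cut off": [15, 10, 18, 22, 8, 49, 16, 22],
    "eyes damaged": [33, 33, 32, 28, 67, 45, 24, 29],
}

# One row of mean / median / mode per group.
summary = {
    name: {"mean": mean(times), "median": median(times), "mode": mode(times)}
    for name, times in groups.items()
}

for name, stats in summary.items():
    print(name, stats)
```

In every group the mean (34.875, 20, 36.375) sits above the median (27.5, 17, 32.5), because the single slow rat pulls the mean upwards while leaving the median almost unchanged.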
Describe the characteristics of a Bar chart
Used for nominal (categorical, non-continuous) data:
- frequency = Y-axis
- categories = X-axis
- gaps between the bars represent the lack of continuity
For experiments: IV = X-axis, DV = Y-axis
Describe the characteristics of a Histogram
Used for continuous data:
- frequency = Y-axis (it must start at 0)
- continuous data = X-axis
- no gaps between the bars, which represents continuity
Describe the characteristics of a Line graph
- frequency = Y-axis (must start at 0)
- used for continuous data, which is displayed on the X-axis
- each dot is connected by a line
Describe the characteristics of a Pie chart
- suitable for nominal data
- each slice represents a proportion of the whole
- the size of each slice is calculated as a proportion of 360°
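The slice calculation can be sketched as follows. The drink counts are hypothetical numbers invented purely for illustration, and slice_angles is a name chosen for this sketch.

```python
def slice_angles(frequencies):
    """Convert category frequencies into pie-chart angles in degrees."""
    total = sum(frequencies.values())
    # Each slice gets its share of the full 360-degree circle.
    return {category: count / total * 360
            for category, count in frequencies.items()}


# Hypothetical nominal data: favourite drink of 24 participants.
drinks = {"cola": 12, "lemonade": 6, "water": 6}
angles = slice_angles(drinks)
print(angles)
```

Half of the participants chose cola, so its slice takes up half the circle (180°), and the remaining two slices take 90° each.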
Describe the characteristics of a Scatter diagram
- one co-variable is plotted on the x-axis and the other on the y-axis
- suitable for continuous data (displayed on both x and y axis)
- each dot represents 2 scores
- the scatter of the dots indicates the relationship between the co-variables
What is the definition of range
The distance between the top and bottom values in the data set
How do you calculate the range in psychology
Subtract the lowest from the highest, then add one
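A minimal sketch of this calculation, using the brain-lesion maze times from the earlier example. The function name psych_range is an assumption made here, and the "+ 1" follows the rule stated on this card rather than the plain highest-minus-lowest definition.

```python
def psych_range(scores):
    """Range as described on the card: highest minus lowest, plus one."""
    return max(scores) - min(scores) + 1


# Group 1 maze times in seconds, copied from the example.
lesion_times = [35, 27, 26, 27, 28, 79, 27, 30]
print(psych_range(lesion_times))  # 79 - 26 + 1 = 54
```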
Why is it helpful to calculate the range
- useful to describe data
- help explain differences in data, not visible by only calculating the mean
What does a high standard deviation score mean
Scores are spread widely from the mean
What does a low standard deviation score mean
Scores are clustered near the mean
What does it mean if the standard deviation score is 0
All the values in the data set are the same
How do you calculate a standard deviation
What is meant by standard deviation
A measure of dispersion, shows how the data is spread around the mean
Explain the steps you would follow to calculate a standard deviation
1) calc the mean of the scores in the data set
2) take mean away from each score in data set
3) square each difference
4) add up all the squared differences
5) divide this by the number of scores minus one
6) calc the square root of the result; this is the standard deviation
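The six steps can be followed literally in code. This is a sketch only: sample_sd is a name chosen here, the data is the Group 2 maze times from the earlier example, and the result is cross-checked against statistics.stdev, which also divides by the number of scores minus one.

```python
import math
import statistics


def sample_sd(scores):
    """Standard deviation, following the six steps on the card."""
    m = sum(scores) / len(scores)           # 1) mean of the scores
    diffs = [x - m for x in scores]         # 2) take the mean away from each score
    squared = [d ** 2 for d in diffs]       # 3) square each difference
    total = sum(squared)                    # 4) add up the squared differences
    variance = total / (len(scores) - 1)    # 5) divide by number of scores minus one
    return math.sqrt(variance)              # 6) square root = standard deviation


# Group 2 maze times in seconds, copied from the example.
tail_times = [15, 10, 18, 22, 8, 49, 16, 22]
print(round(sample_sd(tail_times), 2))
print(round(statistics.stdev(tail_times), 2))
```

For these times the mean is 20 and the squared differences add up to 1138, giving a standard deviation of roughly 12.75 seconds. A data set where every score is the same gives 0, matching the earlier card.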
What is the strength of using a range
Easy to calculate
What is the disadvantage of using a range
- impacted by extreme values (e.g. outliers), it will be skewed (artificially inflated or deflated)
- Fails to take into account the distribution of scores around the mean (don’t know if the scores are close together or far apart)
What is the strength of using standard deviation
- more precise and informative measure of dispersion than range (because it takes all the values into account)
- Highlights if the mean is an appropriate measure of central tendency
- Used in further statistical analysis, such as computing skewness
- less affected by anomalous results than range scores
What is the weakness of using standard deviation
- most meaningful when the data set is roughly normally distributed; it can be misleading when the data is skewed
- more difficult to calculate than the range score
- can only be used when the data collected is interval level or above, because it is based on the mean
- Can only be used where an IV is plotted against frequency