Data analysis Flashcards

1
Q

DESCRIPTIVE STATISTICS

A

help to summarise data sets and make it easy to see any obvious patterns or trends

2
Q

MEASURE OF CENTRAL TENDENCY

A

mathematical way to find a midpoint/average score from a data set
MEAN: measure of central tendency calculated by adding up all the data points and dividing by the number of items in the data set
+ uses all the data points in a set to calculate average - most informative measure
- can only be used with certain types of data (quantitative)
- outliers can skew the data
MEDIAN: measure of central tendency that identifies the middle score of a data set once the scores are placed in order
+ not affected by extreme scores
- doesn’t reflect outliers well
- can only be used with data that can be put in order (ordinal or quantitative)
MODE: measure of central tendency that identifies the most frequent data point(s) in a data set
+ can be used with qualitative and quantitative data
+ provides info about frequency
+ not affected by extreme scores
- data may have several modes (bi-modal = 2 modes)
- doesn’t reflect outliers well
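The three measures above can be checked on a small example. This is a minimal sketch using Python's standard `statistics` module; the scores are made up for illustration and the final value (40) is a deliberate extreme score.

```python
from statistics import mean, median, multimode

# Hypothetical test scores; the final value (40) is an extreme score.
scores = [4, 5, 5, 6, 7, 40]

print(mean(scores))       # sum of all scores / number of scores
print(median(scores))     # middle score once the data are ordered
print(multimode(scores))  # most frequent score(s), returned as a list

# Dropping the extreme score shifts the mean far more than the median.
trimmed = scores[:-1]
print(mean(trimmed), median(trimmed))
```

With the extreme score included, the mean is pulled well above most of the scores, while the median and mode barely move.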

3
Q

OUTLIER

A

data point that differs significantly from other data points in the set - may be due to variability in measurement or to an error

4
Q

MEASURE OF SPREAD

A

mathematical way to describe the variation or dispersion within a data set
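STANDARD DEVIATION: measure of the typical (roughly average) distance between each score in a data set and the mean - the higher the value, the more variation in your scores - calculated as the square root of the variance, where the variance is “the mean of the squares minus the square of the mean”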
+ all values are taken into account - more precise and sensitive measure of spread
+ looks at the difference between each data point and the mean - not just the extremes
- time consuming to calculate
- may hide some characteristics of the data (e.g. doesn’t tell us if the data is positively or negatively skewed)
RANGE: the difference between the biggest and smallest values in the data set
+ simple measure of spread
+ easy to calculate
+ shows outliers
- doesn’t take into account the number of values in data set
- doesn’t tell us if data is clustered or spread out
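Both measures of spread can be worked through by hand or in a few lines of code. The sketch below uses made-up scores and assumes the population form of the standard deviation (dividing by the number of scores); a sample standard deviation would divide by one less than that and give a slightly larger value.

```python
from math import sqrt, isclose
from statistics import mean, pstdev

# Hypothetical data set, chosen so the arithmetic comes out cleanly.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: biggest value minus smallest value.
data_range = max(scores) - min(scores)

# Variance: "the mean of the squares minus the square of the mean".
mean_of_squares = mean([x * x for x in scores])
square_of_mean = mean(scores) ** 2
variance = mean_of_squares - square_of_mean

# Standard deviation: square root of the variance (population form).
sd = sqrt(variance)

print(data_range)  # 7
print(variance)    # 29 - 25 = 4
print(sd)          # 2.0

# Cross-check against the standard library's population SD.
assert isclose(sd, pstdev(scores))
```

The range only uses two of the eight scores, whereas every score feeds into the standard deviation.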

5
Q

GRAPHS

A

visual representation to help researchers quickly communicate their results
BAR CHART:
- used for discrete data
- has gaps between each bar because the categories are separate and not related in a linear way
- often compare mean/median/mode of different levels of IV
- bars can also represent totals/frequencies etc.
- DV on y axis, levels of IV on x axis
- can’t check distribution of data
HISTOGRAM:
- used for continuous data (can take any value within a range rather than fixed categories)
- shows distribution of items in a data set
- frequency of each DV value (percentage or count) on y axis, DV values on x axis
- allows us to check if the distribution of data is skewed
SCATTER GRAPH:
- used to display data from a correlational study
- each point on the graph represents one participant, plotted where their scores on the two co-variables meet
- shows relationship between two co-variables
- helps to interpret relationship between co-variables and correlation coefficients
- a regression line (line of best fit) may be added to show the trend in data
- relationship may be strong/weak positive, strong/weak negative, or absent (no correlation)
- correlation does not show causation - correlational studies are not experiments
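The correlation coefficient that goes with a scatter graph can be sketched in code. The cards do not say which coefficient is meant, so this example assumes Pearson's r; the function name `pearson_r` and the two co-variables are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of scores (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair of scores moves together around the two means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for each co-variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical co-variables: one pair of scores per participant.
hours_revised = [1, 2, 3, 4, 5]
test_score = [2, 4, 5, 4, 5]

r = pearson_r(hours_revised, test_score)
print(round(r, 2))  # positive value: scores tend to rise together
```

A value near +1 or -1 indicates a strong positive or negative relationship and a value near 0 indicates none. Whatever the value, it describes a relationship between the co-variables, not a cause.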
