Descriptive statistics Flashcards
Data
The measurements collected from an experiment to show what effect, if any, the independent variable (IV) has on the dependent variable (DV)
Forms data can take
Primary or secondary
Quantitative or qualitative
Discrete or continuous
Raw or processed
Primary data
Data collected first-hand by the researcher, such as the scores from each condition of their own experiment
Secondary data
Data that already exists and is gathered by researchers to analyse
Quantitative data
Measurements taken in the form of numbers
Qualitative data
Measurements taken in the form of words, descriptions, observations, pictures etc
Not limited by a choice or scale and cannot be put into pre-set categories
Can qualitative data be processed into quantitative data?
Yes, and it can then be analysed
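One common way to do this is to tally how often each theme or category appears. A minimal Python sketch, assuming made-up coded responses (the labels below are hypothetical, not from any real study):

```python
from collections import Counter

# Hypothetical coded interview responses (qualitative data)
responses = ["anxious", "calm", "anxious", "happy", "calm", "anxious"]

# Counting each category turns the descriptions into numbers (quantitative data)
tally = Counter(responses)
print(tally["anxious"], tally["calm"], tally["happy"])  # 3 2 1
```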
Discrete data
Values that can only fall into fixed, specific categories or points on a scale, such as blood type
There are no in-between values, such as with number of people or number of words recalled (you cannot have half a word, for example)
Continuous data
Values that are measured on a continuous scale, so they can fall between intervals and take any value
Such as length, weight and height
Raw data
Data collected straight from the experiment which has not been processed
Such as the values entered into a table that was prepared before the experiment
Processed data
Data that has been summarised or put into graphs so that conclusions can be drawn from it
Raw data tables
Made before an experiment to input values which will be collected in the experiment
No analysis
Raw data table for repeated measures design
All in 1 big table
Participant number down the first column
Then the other columns for scores taken from each condition completed
To show that every participant took part in every condition
Raw data for matched participants/independent measures design
1 table per condition, to show that each participant took part only once, in only 1 condition, so no repeating
Descriptive statistics
Calculations made straight from raw data collected to sum up and present findings
Or drawing graphs
Measures of central tendency
Ways of determining the average or typical score in a set of data:
Mean
Mode
Median
How to calculate mean
Total of all scores ÷ how many scores there are
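As a quick worked sketch in Python, assuming a made-up list of scores from one condition:

```python
# Hypothetical scores from one condition
scores = [4, 7, 5, 8, 6]

# Mean = total of all scores divided by how many scores there are
mean = sum(scores) / len(scores)
print(mean)  # 6.0
```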
Advantages of mean
Every score is included in the calculation, so none are ignored
Disadvantages of mean
An outlier would skew the mean, so it may not represent most of the scores
How to calculate median
Put in order smallest to largest
Find middle value
If there are 2 middle values then find midpoint of those
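The steps above can be sketched in Python. The function name and the scores are hypothetical and only there for illustration:

```python
def median(scores):
    # Step 1: put the scores in order, smallest to largest
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    # Step 2: with an odd number of scores there is a single middle value
    if n % 2 == 1:
        return ordered[middle]
    # Step 3: with an even number, take the midpoint of the 2 middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 3, 9, 5, 1]))  # 5
print(median([7, 3, 9, 5]))     # 6.0
```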
Advantages of median
Not affected by an outlier score so not skewed
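A small illustration of this, assuming two made-up score lists that differ only in one extreme score. The helper `mean` is written out here for clarity; `statistics.median` is from the Python standard library:

```python
import statistics

def mean(values):
    return sum(values) / len(values)

# Hypothetical scores; the second list swaps the top score for an outlier
scores = [4, 5, 6, 7, 8]
with_outlier = [4, 5, 6, 7, 48]

print(mean(scores), statistics.median(scores))              # 6.0 6
print(mean(with_outlier), statistics.median(with_outlier))  # 14.0 6
```

The mean more than doubles while the median stays at 6.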
Disadvantages of median
Not helpful if there are too few values
Does not take into account the exact values of the other scores, so it does not use all the data
How to calculate the mode
Most common score
There can be more than 1 mode, or none at all
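A minimal Python sketch of finding the mode. The function name `modes` and the data are hypothetical, and returning an empty list when every score is different is just one way of representing "no mode":

```python
from collections import Counter

def modes(scores):
    if not scores:
        return []
    counts = Counter(scores)
    highest = max(counts.values())
    # Assumption: if no score appears more than once, report no mode
    if highest == 1:
        return []
    # Keep every score that ties for the highest count
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5, 5, 7]))  # [3, 5]  (two modes)
print(modes([1, 2, 3, 4]))        # []      (no mode)
print(modes(["A", "O", "A"]))     # ['A']   (works for categories too)
```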
Advantages of mode
Can be used for qualitative data
Easy to calculate
Not affected by outlier score
Disadvantages of mode
May not represent all of data in some cases
May not work with a small set of data
Impossible to calculate if all data is different
Bar charts
A way of representing processed data from an experiment, where each condition is a separate category with its own bar
Might use central tendency as we are comparing average scores between conditions
What to remember when making a bar chart
Labelled y axis with specific descriptive statistic eg mean
Which always starts at 0
Title
Accurate data points plotted
Labelled x axis for each condition
Measures of dispersion
Indicate how far the results are spread around the typical (central tendency) score
Range
Variance
Standard deviation
How to calculate the range
Difference between the lowest and highest scores
(Optional: add 1 so that all possible values are counted, including 0. Either method is allowed, but be consistent)
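Both versions of the calculation, sketched in Python with made-up scores:

```python
# Hypothetical scores from one condition
scores = [12, 15, 9, 20, 14]

# Range = highest score minus lowest score
score_range = max(scores) - min(scores)
print(score_range)  # 11

# Optional "add 1" version; pick one method and use it throughout
inclusive_range = score_range + 1
print(inclusive_range)  # 12
```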
Advantages of the range
Easy and quick to calculate for a sense of how dispersed scores are
Shows variety or no variety: a small range shows scores are close together
Disadvantages of range
Gives no indication of whether the scores are evenly distributed or clustered around a certain point, or whether the extreme values are outliers or the scores are simply spread out
How to overcome the disadvantage of the range?
Calculate variance and standard deviation
How to display the range on a bar chart
A line drawn for each condition on the x axis (inside each bar), extending from the smallest value to the largest value, so the length of the line shows the range
Variance
Indicates how far the data is spread from the mean score
What does a low variance show?
Within that condition, the participants' scores don't deviate much from the mean score
So overall, they are quite consistent and there is not a large range
What does a high variance show?
Within that condition, the participants' scores are more spread out from the mean score
Why is the variance good?
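It treats deviations above and below the mean in the same way (squaring removes the sign) and takes every value into account
So it is not based only on the two most extreme scores, unlike the range (though a large outlier will still inflate it)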
How to calculate the variance?
Calculate mean score per condition
For each participant score: Mean score - Participant score = d (might be negative)
Square each d to get d²
Add all of the d² values together
Divide by the number of participants in the sample
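The steps above can be followed one at a time in Python. The scores are made up. Dividing by the number of participants, as the card describes, matches the standard library's `statistics.pvariance`; some textbooks divide by one less than the number of participants instead, which is `statistics.variance`:

```python
# Hypothetical scores from one condition
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: mean score for the condition
mean = sum(scores) / len(scores)

# Step 2: d = mean score - participant score (may be negative)
deviations = [mean - score for score in scores]

# Step 3: square each d
squared = [d ** 2 for d in deviations]

# Steps 4 and 5: add up the d² values, then divide by the number of participants
variance = sum(squared) / len(scores)
print(variance)  # 4.0
```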
Methods of data collection
Observation
Correlation
Self report
Physiological measures
How to calculate the standard deviation?
Square root of the variance
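A short Python sketch, assuming made-up scores and a variance found by dividing by the number of scores (`statistics.pvariance` in the standard library):

```python
import math
import statistics

# Hypothetical scores from one condition
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Variance, dividing by the number of scores
variance = statistics.pvariance(scores)

# Standard deviation = square root of the variance
standard_deviation = math.sqrt(variance)
print(standard_deviation)  # 2.0
```

The result is in the same units as the original scores, whereas the variance is in squared units.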
Why calculate the standard deviation?
Gives us a precise measure of dispersion in the same units as the data that we measured
Examples of descriptive statistics in terms of calculation
Measures of central tendency
Measures of dispersion
Descriptive statistics for displaying data
Frequency tables (tally charts)
Line graphs
Pie charts
Bar charts
Histograms
Scatter diagrams