RM 2 data Flashcards
what is data that exists in categories with a natural order and can be ranked
ordinal data
what is ordinal data
Data that exist in categories with a natural order, it can be RANKED or ordered
what is nominal data
Data that exist in CATEGORIES with no natural order.
what is ratio data
Data with number values that can’t go below zero, for which we can tell exactly how much bigger one number is than another.
what is interval data
Data with number values that can go below zero, for which we can tell exactly how much bigger one number is than another.
number of cars sold per day is an example of
discrete data
blood pressure of people in a company is an example of
continuous data
temperature inside an office is an example of
continuous data
number of people working in a company is an example of
discrete data
shoe size is an example of
discrete data
Whether a participant chooses a male or a female person to talk to.
discrete
The length of time someone can endure extreme pain
continuous
The number of days someone spends in isolation before showing signs of loneliness.
discrete
The sum when $2$ dice are rolled.
discrete
The diameter of someone’s pupils in different light levels, measured in
continuous
what is discrete data
Discrete data is quantitative data that’s restricted to just certain numbers.
what is continuous data
Continuous data is quantitative data that’s not restricted to certain numbers.
ratio data can either be
discrete or continuous
ordinal and nominal data always has a restricted number of
values, eg A-level grades or eye colour
ordinal data is …. data
discrete
nominal data is …. data
discrete
interval data can either be
continuous or discrete
quantitative data is split into what types
nominal
ordinal
interval
ratio
what is primary data
info that is observed or collected directly by the researcher for the purpose of the study that is currently being carried out
what is primary data specific to
specifically related to the aims/hypothesis of that study
how might primary data be collected
questionnaire
interview
observation
experiment
etc
what would primary data collection involve (5)
designing the study
gaining ethical approval
piloting the study
recruiting and testing the participants
analysing the data and drawing conclusions
what is good about primary data
the data collection is designed so that it fits the aims and hypothesis of the study (it fits the purpose)
what is bad about primary data
time consuming
expensive
what is secondary data
information that was collected by someone else for a purpose other than the current study
what can secondary data include (how can the data be collected)
data collected by the researcher for a different study or data that was collected by another researcher for a different purpose
secondary data can include (examples)
government data eg crime stats, mental health admissions
data held by a hospital or another institution
example of secondary data (type of study)
review studies conducting a meta-analysis on such data
positives of secondary data
simple and cheap to access existing data
which means that it is less time consuming
the data may already have been subjected to statistical testing, which would identify whether it is significant
what is a negative of secondary data
the data may not exactly meet the needs of the study
how can quantitative data not be numbers
eg yes and no: the researcher will classify the responses into groups and count the number of participants in each group
A table with the categories in the first column and the frequency count for each category in the second column is called a
frequency table
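A minimal sketch of building a frequency table in Python, assuming ten made-up yes/no responses (the data and variable names are hypothetical, for illustration only):

```python
from collections import Counter

# Hypothetical yes/no responses from ten participants (made-up data).
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes", "yes", "no"]

# Count how many participants fall into each category.
frequency_table = Counter(responses)

# Categories in the first column, frequency counts in the second.
print(f"{'Category':<10}{'Frequency':>10}")
for category, count in frequency_table.most_common():
    print(f"{category:<10}{count:>10}")
```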
advantages of quantitative data
numbers are more objective, which leads to more accurate conclusions
advantages of qualitative data
reflects opinions, impressions of the researcher, more meaningful, provides greater detail and depth, life from the perspective of the individuals, can catch the subtleties
if an observation is unstructured what type of data is being collected
qualitative
how can continuous data be displayed (5)
histogram, bar chart, pie chart, scattergraph, frequency polygon
how can discrete data be displayed (3)
pie chart, bar chart, scattergraph (scattergraph only for correlations, with ordinal data)
what is the difference between thematic and content analysis
in thematic analysis you do not quantify/count the frequency of the themes, you just identify them
example of what thematic analysis can look like/be done on
unstructured and semi-structured interviews, diaries, unstructured observations
what can be done in thematic analysis that can't be done with content analysis
you can organise the info into a map to show the links between data and the themes
how do you carry out content analysis (5)
start by examining a SAMPLE of the media
identify and code potential categories into a tally chart
analyse ALL the media
count the frequency, quantifying the qualitative data
draw conclusions
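The tallying step can be sketched in Python. This assumes a researcher has already coded each advert into one category; the category names and the coded data are hypothetical.

```python
from collections import Counter

# Hypothetical coding categories, decided after examining a sample of the media.
CATEGORIES = ["domestic", "professional", "leisure"]

# Hypothetical codes a researcher assigned to six adverts (made-up data).
coded_adverts = [
    "domestic", "professional", "domestic",
    "leisure", "domestic", "professional",
]

def tally(codes, categories):
    """Count how often each coding category appears across all the media."""
    counts = Counter(codes)
    return {category: counts[category] for category in categories}

# The tally chart turns the qualitative codes into quantitative frequencies.
tally_chart = tally(coded_adverts, CATEGORIES)
for category, count in tally_chart.items():
    print(f"{category}: {count}")
```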
what is content analysis
a research tool used to analyse the content of various forms of communication (written, verbal or visual). It indirectly observes the presence of words/themes within the communication
example of content analysis
studying sex-role stereotyping in TV adverts
what does content analysis do
analyses qualitative data by converting it into quantitative
what are the measures of central tendency
mode, median, mean
what are the 3 descriptive statistics
measures of central tendency
measures of dispersion
graphs and charts
you have to choose which descriptive stat to use based on
level of measurement
how the data is dispersed, eg any outlying values
data tables
you can only use a descriptive statistic if it suits …………
all the sets of data
what is the mode
the most frequently occurring value
what are advantages of using the mode
unaffected by one or two extreme scores
can use on nominal data
useful when other measures are meaningless
represents a figure that is actually in the set
with the mode you can use what type of data
nominal
disadvantages of using the mode
there might not be a mode
small changes in the data can radically alter the mode
there isn't always a single value for the mode, which means that the mode becomes less useful.
doesn't tell us anything about the other values in the set
what is it called when there are 2 modes for a set of data
bi-modal
what is it called when there are more than 2 modes for a set of data
multimodal, and the mode isn't used if this is the case
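A short sketch of finding the mode(s) with Python's standard `statistics.multimode`; the helper name `describe_modes` and the example data are made up for illustration.

```python
from statistics import multimode

def describe_modes(scores):
    """Return the mode(s) of a non-empty data set and a label for how many there are."""
    modes = multimode(scores)  # every value that ties for most frequent
    labels = {1: "unimodal", 2: "bi-modal"}
    return modes, labels.get(len(modes), "multimodal")

print(describe_modes([1, 2, 2, 3]))                # one mode
print(describe_modes([1, 1, 2, 2, 3]))             # two modes
print(describe_modes(["blue", "brown", "brown"]))  # works on nominal data too
```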
what is the median value
the middle value when the scores are arranged in ascending or descending order
when there is an even number the two middle values are added and divided by 2
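The rule above can be written as a small Python function (a sketch; the function name and example scores are made up):

```python
def median(scores):
    """Middle value of the ordered scores; mean of the two middle values if the count is even."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of scores: add the two middle values and divide by 2.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 1, 2]))     # odd number of scores
print(median([4, 1, 3, 2]))  # even number of scores
```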
what are advantages of using the median
unaffected by extreme outlying values
it can be used on data with skewed distributions, skewed meaning it has an outlying value
what are disadvantages of using the median
it doesn't work as well with small sets of data
ignores most scores so less sensitive than the mean
what is the mean
the arithmetic average, calculated by adding up all of the scores in the set and dividing by the total number of scores in the set
advantages of using the mean
makes use of all the data in a set therefore it is the most powerful
disadvantages of the mean
not to be used on skewed data that has one or two anomalies/extreme values as this can give a misleading average = mean is distorted
it is inappropriate to use on ordinal and nominal data
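A quick illustration of how one extreme score distorts the mean but barely moves the median, using made-up scores:

```python
from statistics import mean, median

# Hypothetical scores (made-up data), then the same set with one extreme value added.
scores = [10, 11, 12, 13, 14]
with_outlier = scores + [90]

print(mean(scores), median(scores))              # mean and median agree
print(mean(with_outlier), median(with_outlier))  # mean is pulled up, median barely moves
```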
which measure of central tendency should you use if you could
the mean
what measures of central tendency can you use for nominal data
mode only
what measures of central tendency can you use for ordinal data
median
mode
what measures of central tendency can you use for interval/ratio data
mean
median
mode
scattergraphs can only be used for …..
correlations
what are the 3 measures of dispersion
range
interquartile range
standard deviation
what are measures of dispersion
show how spread out the scores are within a set of data
a large measure of dispersion shows what
that the scores are widely scattered
a small measure of dispersion shows what
that the scores are closely clustered
what is the range and how is it calculated
it is the difference between the highest and lowest scores in a set of data
subtract the lowest value from the highest value
advantage of the range
easy to calculate
disadvantage of the range
only considers the 2 extreme values so it can be seriously affected by outlying values
doesn't tell us anything about the distribution of scores in the middle of the set
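A sketch of the range calculation in Python, showing how a single outlying value changes it (the function name and scores are made up):

```python
def score_range(scores):
    """Subtract the lowest value from the highest value."""
    return max(scores) - min(scores)

print(score_range([4, 7, 9, 12]))      # closely clustered scores
print(score_range([4, 7, 9, 12, 60]))  # one outlying value inflates the range
```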
what is the interquartile range
measures the spread of the scores in the middle 50% of values when they are placed in numerical order
how is the interquartile range calculated
calculate the median
find the lower quartile (the median of the scores below the median) and the upper quartile (the median of the scores above the median)
the IQR is the difference between the upper and lower quartiles
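A sketch of one common convention for the IQR, assumed here: the quartiles are taken as the medians of the lower and upper halves of the ordered scores (textbooks and software differ slightly on this). The function name and data are made up.

```python
from statistics import median

def interquartile_range(scores):
    """IQR as upper quartile minus lower quartile (assumes at least two scores)."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[len(ordered) - half:]  # skips the middle score when the count is odd
    return median(upper_half) - median(lower_half)

print(interquartile_range([1, 2, 3, 4, 5, 6, 7, 8]))
print(interquartile_range([1, 3, 5, 7, 9, 11, 100]))  # the outlying 100 is ignored
```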
advantage of the interquartile range
the top and bottom 25% are ignored which gets over the problem of outlying values
what is standard deviation
measures the average distance of each score away from the mean
done using the formula
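The formula itself is not shown in these cards. A commonly taught version, assumed here, is the sample standard deviation:

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

where $x$ is each score, $\bar{x}$ is the mean and $n$ is the number of scores. Some courses divide by $n$ instead of $n - 1$ when the scores are a whole population rather than a sample.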
advantage of using SD
this is the most powerful measure of dispersion and it uses all the scores in the set of data in the calculation
can be used to describe the spread of scores in a normal distribution
disadvantage of SD
it is less effective when there are outlying scores that skew the data
comparison of SD and range
SD is less distorted by extreme scores
SD takes account of the distance of all the “verbal error scores” from the mean
therefore, compared to the range, it isn't just the difference between the highest verbal error and the lowest verbal error.
measures of dispersion can only really be used appropriately on what types of data
interval and ratio
what are characteristics of normal distribution
symmetrical bell shaped curve
mean, median, mode at same midpoint, dispersion either side of the midpoint is consistent and can be expressed as standard deviations
on a normal distribution curve, what % of the scores lie between the mean and one sd above OR below
34.1%
on a normal distribution curve what % of scores lie between one SD above AND below the mean
68.2%
for a sample of 1000 people the average IQ score was 100 and the SD was 15, what % of people and how many people would have an IQ of between 100 and 115
100 to 115 is 15 which is 1 SD
1 SD=34.1%
34.1% of 1000 is 341 people
0.341 x 1000 = 341
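The same answer can be checked with Python's standard `statistics.NormalDist`, assuming IQ scores follow a normal distribution with mean 100 and SD 15 as stated in the question:

```python
from statistics import NormalDist

# IQ modelled as a normal distribution with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

# Proportion of scores between the mean (100) and one SD above it (115).
proportion = iq.cdf(115) - iq.cdf(100)

sample_size = 1000
print(f"{proportion * 100:.1f}% of people")
print(f"about {round(proportion * sample_size)} people out of {sample_size}")
```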
what is a skewed distribution
where scores are not equally distributed around the mean with a number of extreme scores to one side or the other of the mid-score
what does a negative skew look like, where are most of the scores bunched
the tail of the curve points in the negative direction (to the left), most of the scores are bunched towards the right
what is different about the mean compared to other MOCTs in a negative skew
the mean has a lower value than the median and mode, as it is affected by the extreme lower scores to the left
what does a positive skew look like, where are most of the scores bunched
the tail of the curve points in the positive direction (to the right), most of the scores are bunched towards the left
what is different about the mean compared to other MOCTs in a positive skew
the mean is always higher than the median and mode in a +skew, as the mean is affected by extreme scores to the right
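A small check of how the three measures are ordered in skewed data, using two made-up data sets (one with a tail of high scores, one with a tail of low scores):

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: bunched low, with a tail of high values.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror image of the same scores: bunched high, with a tail of low values.
negative_skew = [20 - score for score in positive_skew]

for name, scores in [("positive skew", positive_skew), ("negative skew", negative_skew)]:
    print(name, "mode:", mode(scores), "median:", median(scores), "mean:", mean(scores))
```

In these two sets the mean sits furthest towards the tail: above the median and mode for the positive skew, below them for the negative skew.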