research methods - data Flashcards
4 types of data
-primary
-secondary
-quantitative
-qualitative
quantitative data
data expressed numerically
how can quantitative data be obtained?
using closed questions
what is qualitative data?
data expressed in words/non-numerically
how can qualitative data be obtained?
using open questions
what does quantitative data aim to produce?
-results that can be easily compared and analysed
-produces results that can go through statistical tests to see if they are significant
what does qualitative data aim to produce?
-meaningful data
-to understand phenomena from the point of view of an individual
strengths of quantitative data
-easy to analyse statistically
-lets comparisons and trends be seen
weaknesses of quantitative data
-lack of detail/representativeness
-responses can’t explain complex human behaviour
strengths of qualitative data
rich in detail:
-gives investigator meaningful insights into human behaviour
-high external validity, info is more likely to represent a real world view
weaknesses of qualitative data
-can be subjective: because participants’ responses are detailed, interpretations rely on the opinions & judgements of the researcher
primary data
-data that has been collected for a specific reason firsthand by the original researcher
-sometimes called field research
advantages of primary data
authenticity
↳ it is collected with the sole purpose of being for a specific investigation
↳ the data is collected to suit the aims of the research, which gives the researcher a high level of control
great probability that the data generated will fit the aims of the investigation:
↳ wasted time is reduced on behalf of the researcher
↳ info is relevant
disadvantages of primary data
designing and carrying out a psychological study can take a long period of time and considerable effort:
↳ expenses can accrue (time investment)
what is secondary data?
-information that was collected by other researchers for a purpose other than the investigation in which it is currently being used
-data which already exists
-sometimes referred to as desk research
advantages of secondary data
the information already exists in the public domain:
↳ less time consuming and expensive to collect
disadvantages of secondary data
concerns over accuracy:
↳ the information was not gathered to meet the specific aim of the research
quality of the data may be poor:
↳ much of the data may be of little or no value to the researchers
meta-analysis
-investigators combine findings from multiple studies (secondary data) on a specific phenomenon to make an overall analysis of trends and patterns arising across research
advantages of meta-analysis
-since the results are combined from many studies, rather than just one, the conclusions drawn will be based on a larger sample which provides greater confidence for generalisation
↳ increases the validity of the patterns and trends identified
disadvantages of meta-analysis
issues of bias
↳ since the researcher is selecting data from research which has already taken place, they may choose to omit certain findings from their investigation
(especially if the previous findings showed no significant results)
↳ the findings and conclusions from the meta-analysis will be biased as they do not accurately represent all of the relevant data on the topic
time consuming
which 3 levels of measurement do quantitative data fall into?
-nominal
-ordinal
-interval
nominal data
categorical data
how is nominal data discrete?
each participant will only appear in one category
ordinal data
-data is ordinal if it is ordered in some way and the intervals between the data are not equal
-typically, this is used to rank data
what does it mean that intervals between the data are not equal in ordinal data?
-if people were asked to rank their local restaurants in order of preference, they wouldn’t be able to say for sure that the difference between the restaurants ranked 1st and 2nd was equal to the difference between those ranked 8th and 9th. Perhaps it was a very close call between 1st and 2nd, but there was a much bigger difference between 8th and 9th
when does ordinal data often appear in psychology?
when researchers are investigating a non-physical entity, such as attitudes
interval data
-data that is ordered in some way
-with interval data we are confident that the intervals between each value are equal in measurement
-more objective and scientific
strength of nominal data
easily generated from closed questions on a questionnaire or interview
weaknesses of nominal data
data can’t express its true complexity and can therefore appear overly simplistic
strengths of ordinal data
-provides more detail than nominal data as the scores are ordered in a linear way
weaknesses of ordinal data
-the intervals between scores are not of equal value
-an average (the mean) cannot be used as a measure of central tendency
strengths of interval data
-considered more informative than the nominal and ordinal levels of measurement
-gaps in between the scores are of equal value/distance and are therefore more reliable
weaknesses of interval data
-in some instances, the intervals are arbitrary
-we can only say that the difference between 10 and 20 degrees is the same as between 30 and 40 degrees; we can’t say that 20 degrees is twice as hot as 10 degrees
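A minimal sketch of why interval scales support differences but not ratios, assuming the degrees on the card are Celsius. The Fahrenheit conversion is the standard formula; it is not part of the cards and is only here to show that the "twice as hot" ratio changes when the arbitrary zero point moves.

```python
def celsius_to_fahrenheit(celsius):
    """Standard Celsius to Fahrenheit conversion."""
    return celsius * 9 / 5 + 32

# Differences are meaningful on an interval scale: both gaps are 10 degrees.
print(20 - 10 == 40 - 30)  # True

# Ratios are not: 20 looks like "twice" 10 in Celsius...
print(20 / 10)  # 2.0

# ...but the same two temperatures in Fahrenheit (68 and 50) give a different ratio.
print(celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10))  # 1.36
```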
what must be done once quantitative data has been collected & what is this called?
it should be summarised numerically
(descriptive statistics)
why are descriptive statistics useful?
save the reader from needing to navigate through lots of results to get a basic understanding of the data
what do descriptive statistics include?
-a measure of central tendency
-a measure of dispersion
what are the measures of central tendency?
mean, median and mode
what do measures of central tendency tell us about?
the central values of a set of data
what type of data is the mean used for?
interval data
how is the mean calculated?
the sum of all numbers divided by the amount of numbers
advantages and disadvantages of mean:
advantages:
-representative of all data as every score is used in its calculation
-if numbers cluster around the central value it is a good way of showing the typical score
disadvantages:
-distorted by extremes
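A small sketch of the mean and of how one extreme score distorts it. The score lists are made-up illustrations, not data from any study.

```python
def mean(scores):
    """Sum of all scores divided by the number of scores."""
    return sum(scores) / len(scores)

typical = [4, 5, 5, 6, 5]        # hypothetical scores clustered around 5
with_extreme = [4, 5, 5, 6, 50]  # same set, but the last score is extreme

print(mean(typical))       # 5.0
print(mean(with_extreme))  # 14.0
```

In the second list no participant scored anywhere near 14, which is why the mean is a poor "typical score" when extremes are present.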
what data is the median used for?
usually ordinal data or any data with extremes
how is the median calculated?
-put the scores into order from lowest to highest
-find the middle value
(if there are two middle values, add them together and divide by two)
strength and weakness of the median:
strength:
-not distorted by extreme values
weakness:
-doesn’t represent every score as not every score is used when calculating it
-time consuming if there are many numbers
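The median steps can be sketched as below. The scores are hypothetical; the first list includes an extreme value to show that the median is not pulled towards it.

```python
def median(scores):
    """Order the scores, then take the middle value (or average the two middle values)."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of scores: add the two middle values and divide by two.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([4, 50, 5, 6, 5]))  # 5
print(median([3, 9, 5, 7]))      # 6.0
```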
what data is the mode used for?
it is the only way to represent nominal data
how is the mode calculated?
find the most common value in the data set
strength and weakness of the mode:
strength:
-only way to summarise nominal data
weakness:
-not always useful to summarise the data as there may be more than one modal value
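A sketch of finding the mode for nominal data, written to return every modal value so the "more than one mode" weakness is visible. The pet categories are invented for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(modes(["dog", "cat", "dog", "fish"]))         # ['dog']
print(modes(["dog", "cat", "dog", "cat", "fish"]))  # ['dog', 'cat']
```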
what are measures of dispersion?
descriptive statistics that define the spread of data around a central value
what are the two measures of dispersion?
range and standard deviation
how do you calculate the range?
-subtract the lowest score in the data set from the highest score
-add 1
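A sketch of the range using the two steps on this card, including the "add 1" step. Note that many other sources define the range simply as highest minus lowest, so check which convention your course expects. The scores are hypothetical.

```python
def score_range(scores):
    """Highest score minus lowest score, plus 1 (the convention used on this card)."""
    return max(scores) - min(scores) + 1

print(score_range([4, 5, 5, 6, 50]))  # 47
print(score_range([4, 5, 5, 6, 5]))   # 3
```

Swapping a single score (5 for 50) moves the range from 3 to 47, which illustrates how sensitive it is to extreme values.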
advantage and disadvantages of the range:
advantages:
-easy to calculate
disadvantages:
-distorted by extreme values
-doesn’t use all data → not representative
-doesn’t tell us whether values are closely grouped around the mean
what is standard deviation?
it looks at how far the scores deviate from the mean
what does a large deviation suggest?
there was a lot of variation around the mean
what does a small deviation suggest?
the values are very concentrated around the mean, everyone had similar results
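A minimal sketch of standard deviation, assuming the population form (dividing by the number of scores); some courses divide by one less than the number of scores instead. Both hypothetical data sets have a mean of 10, so only the spread differs.

```python
import math

def standard_deviation(scores):
    """Square root of the average squared distance from the mean (population form)."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(score - mean) ** 2 for score in scores]
    variance = sum(squared_deviations) / len(scores)
    return math.sqrt(variance)

concentrated = [9, 10, 10, 11, 10]  # everyone had similar results
varied = [2, 6, 10, 14, 18]         # a lot of variation around the mean

print(round(standard_deviation(concentrated), 2))  # 0.63
print(round(standard_deviation(varied), 2))        # 5.66
```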
advantages and disadvantages of standard deviation
advantages:
-each score is taken into account (representative)
-most sensitive measure of dispersion
disadvantages:
-extreme values can distort the value
-difficult to calculate
how are percentages useful in the summary of a dataset?
the reader can get a feel for the data at a glance, without needing to read all of the results
how to calculate percentage increase:
(difference / original) x 100
how to calculate percentage decrease
(difference / original) x 100
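Both percentage cards use the same formula, so one sketch covers them. The function name and the example figures are illustrative; a positive result is an increase and a negative result is a decrease.

```python
def percentage_change(original, new):
    """Difference divided by the original value, times 100."""
    return (new - original) * 100 / original

print(percentage_change(40, 50))  # 25.0  (a 25% increase)
print(percentage_change(50, 40))  # -20.0 (a 20% decrease)
```

The same 10-point difference gives different percentages because the original value in the denominator differs.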
how can quantitative data be displayed?
-table
-scattergram
-histogram
-bar chart
table of raw data:
-shows scores prior to analysis
-hard to identify patterns in the data
what do summary tables include?
-measures of central tendency
-measures of dispersion
-a clear summary of data
why are tables useful?
the reader can easily compare the most important values
are summary tables raw scores?
no, tables in the results section of a report have been converted into descriptive statistics
what does a scatter graph show?
the relationship between two sets of data (co-variables)
where are the sets of data on a scatter graph?
one set is on the x axis
another set is on the y axis
what does a positive correlation show?
shows an upward trend
(as one variable increases, so does the other)
what does a negative correlation show?
as one variable increases, the other decreases
what does a zero correlation mean & look like?
-there is no distinct relationship shown between the two variables
-scores are random
REMEMBER!!!
comment on the strength of a correlation
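One common way to put a number on the direction and strength of a correlation is Pearson's correlation coefficient, which runs from -1 (perfect negative) through 0 (no relationship) to +1 (perfect positive). The cards do not name this statistic, so treat the sketch below as an assumption about how strength might be quantified; the co-variables are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

hours_revised = [1, 2, 3, 4, 5]  # hypothetical co-variable on the x axis
test_score = [52, 55, 61, 64, 70]
errors_made = [9, 8, 6, 5, 2]

print(round(pearson_r(hours_revised, test_score), 2))   # 0.99  (strong positive)
print(round(pearson_r(hours_revised, errors_made), 2))  # -0.98 (strong negative)
```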
what data are bar charts used for?
to show discrete or nominal data
where is the mean/frequency on bar chart?
the y axis
where are the categories on bar chart?
the x axis
things to remember to include/about a bar chart:
-always have a title
-clearly label axes
-columns do not touch
-columns have equal width and spacing
what can histograms be called?
frequency distribution graphs
what data are histograms used to represent?
continuous data (eg: ages)
difference between bar charts & histograms:
the columns of a histogram touch, whereas the columns of a bar chart do not
what is on the x axis of a histogram?
categories (equal-sized intervals of the continuous data)
what is on the y axis of a histogram?
frequency
where is the IV on a graph?
x axis
where is the DV on a graph?
y axis
what curve does normally distributed data produce?
a bell shaped curve
what does a bell shaped curve indicate?
most scores are close to the mean, fewer are at the extremes
where would the mean, median and mode be on a normally distributed curve?
all at the same centre point (the top of the curve)
what is a good sign that the data follows a normal distribution?
if the mean, median and mode are similar
percentages of standard deviations with a normally distributed data set:
-about 68% of scores will lie within one standard deviation of the mean
-about 95% of scores will lie within two standard deviations of the mean
-only about 5% of scores will lie beyond two standard deviations from the mean
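These percentages can be checked against the theoretical normal curve. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of scores lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(round(share_within(1) * 100, 1))  # 68.3
print(round(share_within(2) * 100, 1))  # 95.4
```

The figures on the card are these values rounded to whole-number percentages.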
what does it mean when many scores fall below or above the mean?
the distribution is skewed
characteristics of positively skewed distribution:
-mean is higher than the mode
-mode is to the left of the mean
(whale is happy to be close to home)
what is an example of something that would cause a positive skew?
a hard test
characteristics of negatively skewed distribution:
-mean is lower than the mode
-mode is to the right
(whale is sad to be further away from home)
example of what would cause a negative skewed distribution:
an easy test
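A rough sketch of the mean-versus-mode rule from these cards. It is a rule of thumb, not a formal skewness statistic, and the test scores are invented: on the hard test most people score low with a few high scores, on the easy test the reverse.

```python
from statistics import mean, mode

def skew_direction(scores):
    """Compare the mean with the mode, following the rule on the cards."""
    if mean(scores) > mode(scores):
        return "positive"
    if mean(scores) < mode(scores):
        return "negative"
    return "none"

hard_test = [2, 3, 3, 3, 4, 4, 5, 9, 10]  # mode 3, mean pulled up by 9 and 10
easy_test = [1, 2, 6, 7, 7, 8, 8, 8, 9]   # mode 8, mean pulled down by 1 and 2

print(skew_direction(hard_test))  # positive
print(skew_direction(easy_test))  # negative
```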
where does the mode stay in a skewed distribution and why?
at the highest point of the peak as it isn’t affected by extreme scores
what does a skewed distribution tell us?
-there are anomalous results or extreme values
-the mean is not a very representative score & shouldn’t be used as the sole measure of central tendency when distributions are skewed