Statistics Flashcards
What are the types of data and what can they be broken down into
Quantitative - numerical - levels of measurement
-> nominal, ordinal & interval
Qualitative - language
Primary
Collected first-hand by the researcher, specifically for the study
Secondary
Collected by someone other than the person who is conducting the research -> meta analysis
What is meant by the term quantitative data (2)
This is data that is expressed numerically (1).
This type of data can be gained from individual scores in experiments, such as the number of words recalled or the number of seconds it takes to complete a task or from self report methods and the use of closed questions (2)
The data is open to being analysed statistically and can be easily converted into graphs, charts etc.
What is meant by the term qualitative data (2)
Qualitative data is expressed in words/ is descriptive data (1)
and may take the form of a written description of the thoughts, feelings and opinions of participants, such as notes recorded during an interview, a diary entry or answers to open questions in a questionnaire (2)
Qualitative methods are concerned with the interpretation of language.
What is discrete data
Information/findings that can be categorised into groups; each piece of data can only appear in one category. It can’t be sub-divided, i.e. it must be whole numbers, e.g. 25/30 on a test.
What is continuous data
Data that can be measured using scientific tools e.g. height, weight, time.
What are the 3 types of quantitative data
Nominal - discrete
Ordinal - discrete
Interval - continuous
What is nominal data
Data in the form of categories
For example: you can count how many boys and girls are in your year group - male and female are the categories and you take a count of how many in each group
Other examples: hair colour, people’s favourite football team.
What is ordinal data
Ordinal data is ordered/ranked in some way e.g. from highest to lowest
e.g. 1st, 2nd, 3rd
Ordinal data does not have equal/fixed intervals between each unit
E.g. if you were asked to rate your enjoyment of Psychology on a scale of 1-10, what is the difference in the amount of enjoyment between 6 and 7?
Data based on subjective opinions are an example of ordinal level data.
E.g. rate how much you enjoy psychology out of 10 (1 being ‘I do not like psychology’, 10 being ‘I love psychology’). If two people say ‘8’ they may have different opinions of what an 8 is.
Another example of ordinal data would be the number of items recalled in a memory test or a score on an IQ test.
For these reasons, ordinal data is often described as ‘unsafe’ data: it lacks precision, so the raw scores are not used directly in statistical testing. Instead, raw scores are converted to ranks (1st, 2nd, 3rd) and it is the ranks, not the scores, that are used in the calculation for the statistical test.
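Converting raw scores to ranks can be sketched in Python. This is only an illustration: the function name `rank_scores` and the ratings are made up, and it assumes rank 1 goes to the lowest score and that tied scores share the mean of their ranks (a common convention, not one stated in these cards).

```python
def rank_scores(scores):
    """Return one rank per score (1 = lowest); tied scores share the mean rank."""
    ordered = sorted(scores)
    ranks = []
    for score in scores:
        first = ordered.index(score) + 1       # first position this score occupies
        count = ordered.count(score)           # how many participants gave this score
        ranks.append(first + (count - 1) / 2)  # mean of the tied positions
    return ranks


# Hypothetical enjoyment ratings out of 10 from five participants
ratings = [8, 6, 8, 3, 10]
print(rank_scores(ratings))  # [3.5, 2.0, 3.5, 1.0, 5.0]
```

The two participants who both said ‘8’ share rank 3.5, and it is these ranks, not the raw ratings, that would go into the statistical test.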
What are the key features of ordinal data
Ordered / ranked
Does not have equal/ fixed intervals
Subjective opinions
What is interval data
Data is a STANDARDISED/UNIVERSAL/OFFICIAL measurement.
Data based on objective (factual) measures e.g. time in seconds, height in centimetres
Interval is based on numerical scales that include units of equal, precisely defined size.
What is meant by the term secondary data (2)
Secondary data has previously been collected by a third party (1)
(another researcher or an official body), not specifically for the aim of the study, and then used by the researcher (2).
E.g. preexisting data such as Government statistics.
What is meant by the term meta-analysis (2)
A meta-analysis is a research method that uses secondary data (1), as it gains data from a large number of studies that have investigated the same research question using similar methods. It then combines the information from all the studies to draw conclusions about behaviour (2)
Breakdown of analysing data
Quantitative -> inferential statistics -> descriptive statistics
Descriptive statistics -> central tendencies -> mean, median, mode
Descriptive statistics -> measures of dispersion -> range, standard deviation
Qualitative -> content analysis -> thematic analysis
Content analysis -> coding
Thematic analysis -> emergent themes
What are the two ways of analysing data
Content analysis
Thematic analysis
What does content analysis observe
Usually makes observations indirectly through books, films, advertisements, interview transcripts and photographs.
What is content analysis (2)
This is a method of analysing qualitative data by changing large amounts of qualitative data into quantitative data (1)
This is done by identifying meaningful codes that can be counted enabling us to present the data in a graph (2)
Why is it appropriate to use content analysis (1)
The data (name what the data is from the scenario given e.g. video recordings) being analysed is qualitative data. (1)
What is meant by coding (1)
Coding is the initial process of a content analysis where qualitative data is placed into meaningful categories.
How is content analysis carried out/ explain how you would analyse qualitative data (4)
Read /view the video or transcript (link to whatever qualitative data it refers to in the scenario) (1)
Identify/create coding categories and provide an example of a relevant category (1)
Re-read the diaries/ questionnaire or repeatedly listen to sections of the recording (choose appropriate one in relation to the scenario) and tally every time each code appears (1)
Present the quantitative data in a graph/table (1)
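The tallying step can be sketched in Python. Everything here is hypothetical: the coding categories, their keywords and the transcript are invented for illustration, and matching on single keywords is a simplification of how a human coder would judge each occurrence.

```python
# Hypothetical coding categories, each operationalised as a list of keywords
CODES = {
    "physical aggression": ["hit", "push", "kick"],
    "verbal aggression": ["shout", "insult", "swear"],
}


def tally_codes(transcript, codes):
    """Tally how many words in the transcript fall under each coding category."""
    tally = {code: 0 for code in codes}
    for word in transcript.lower().split():
        cleaned = word.strip(".,!?;:\"'")  # remove surrounding punctuation
        for code, keywords in codes.items():
            if cleaned in keywords:
                tally[code] += 1
    return tally


transcript = "He began to shout, then push. I saw him kick the door and shout again."
print(tally_codes(transcript, CODES))
# {'physical aggression': 2, 'verbal aggression': 2}
```

The resulting tallies are the quantitative data that would then be presented in a graph or table.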
What is thematic analysis (2)
This is a method of analysing qualitative data by identifying emergent themes (themes that keep cropping up), enabling us to present the data in a qualitative format.
E.g. Interview recordings, presentation/conversation, diary entries, newspapers, texts, social media, radio and tv ads.
How is a thematic analysis carried out (2-4)
If the data in the scenario is not already a transcript: watch the video or listen to the recordings to create a transcript (contextualise, e.g. refer to specific data in the scenario such as an interview about aggressive behaviour) (1)
Read & re-read transcript (familiarisation)
Identify coding (categories) - looking for words which cropped up repeatedly. (1)
Combine these codes to reduce the number of codes into three or four themes that are linked to the topic (contextualise, e.g. what is the topic being studied? Provide an example of a potential theme) (1)
Present the data in qualitative format not quantitative. (1)
Ways to assess reliability of content analysis
Test-retest
Inter-rater reliability
Test-retest
- The researcher completes the content analysis by creating a series of coding categories, (provide an example category that links to scenario) and tallying every time it occurs within the qualitative data.
- Then the same researcher repeats the content analysis on the same qualitative data e.g. interview, tallying every time the coding category occurs.
- Compare the results from each content analysis
- Then correlate the results from each content analysis using an appropriate stats test.
- A strong positive correlation of above +0.8 shows high reliability
Inter-rater reliability
- The two raters read through the qualitative data separately, then create coding categories together. (Include an example of a category here.)
- The two raters read exactly the same content (contextualise, e.g. what is the content?) but record/tally the occurrences of the categories separately.
- They compare the tallies from both raters
- Which are then correlated using an appropriate stats test.
- A strong positive correlation (above +0.8) shows high reliability.
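The correlation check used in both methods can be sketched in Python. The cards only say "an appropriate stats test", so using Pearson's r here is an assumption made for illustration; the function name and the tallies are also made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of tallies."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical tallies for four coding categories from two raters
# (or from the same researcher on two occasions for test-retest)
rater_a = [12, 7, 3, 9]
rater_b = [11, 8, 2, 10]

r = pearson_r(rater_a, rater_b)
verdict = "high reliability" if r > 0.8 else "reliability not shown"
print(f"r = {r:.2f}: {verdict}")
```

The +0.8 threshold comes from the cards; the choice of correlation test should match the level of data in the scenario.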
Define operationalising in terms of content analysis
Operationalising means being specific and clear when defining coding categories (1 mark) to make the codes more measurable (1 mark)
Why is operationalising important in terms of content analysis
If coding categories are vague (not operationalised) then it would not be possible to repeat the research to check for consistent results.
Operationalising increases reliability: if the coding categories are operationalised, other researchers can repeat the research in the same way to check for consistent results.
How do you assess the validity of content analysis
Face validity
Concurrent validity
How do you conduct face validity
The quickest, most superficial way of assessing validity. This involves an independent psychologist in the same field checking whether a coding category (contextualise: give an example) looks like it measures what it claims to measure (contextualise: refer to scenario, what are they measuring?) at first sight/face value. If they say YES, the content analysis is considered valid.
How do you conduct concurrent validity
A way of assessing validity by comparing the results of a new content analysis (contextualise here: what is the content analysis investigating?) with the results from a similar pre-existing content analysis whose validity has already been established. If the results from both are similar then we can assume the new content analysis is valid. The correlation between the two sets of coding results, gained from an appropriate stats test, should exceed +0.8.
How do you improve the validity of a content analysis
Ensure coding categories are operationalised.
Researchers are trained in how to use the coding categories
Breakdown of analysing quantitative data
Descriptive statistics -> measures of central tendency -> mean, median, mode
Descriptive statistics-> measures of dispersion -> range, standard deviation
What is meant by measures of central tendency (2)
The general term for any measure of the average value in a set of data. For example, the mean.
Describe the mode
The most common or popular number in a set of scores; there can be more than one mode in a data set
USED WITH NOMINAL
What is the mode used with
Nominal data
Describe median
Central/middle score in a list of ranked-ordered scores
If there are two central scores, add together and divide by 2.
USED WITH ORDINAL DATA
What is the median used with
Ordinal data
What is the mean used with
Used with interval data
Describe mean
All scores added up and divided by the total number of scores (mathematical average)
USED WITH INTERVAL DATA
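The three measures of central tendency can be checked with Python's built-in `statistics` module. The scores below are made up for illustration; `multimode` is used because, as noted above, a data set can have more than one mode.

```python
import statistics

# Hypothetical scores from six participants
scores = [4, 7, 7, 9, 10, 11]

modes = statistics.multimode(scores)  # most common score(s), returned as a list
median = statistics.median(scores)    # two central scores (7 and 9) added and divided by 2
mean = statistics.mean(scores)        # all scores added up (48) divided by the number of scores (6)

print("Mode(s):", modes)
print("Median:", median)
print("Mean:", mean)
```

Here the mode is 7, the median is 8 and the mean is 8.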
What is meant by measures of dispersion (2)
This is a measure of the spread of scores: how far the scores vary and differ from one another. For example, the range or standard deviation.
What are the two types of dispersion
Range
Standard deviation
What is the range used for
Ordinal data
What is the standard deviation used for
Interval data
Describe the range
The range is the spread of data from the smallest to the largest.
Calculated by subtracting the lowest value from the highest value and adding 1
USED FOR ORDINAL DATA
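A minimal sketch of the range as described on this card, including the "add 1" step. The function name and scores are made up; note that some textbooks calculate the range without adding 1, so check which convention is expected.

```python
def range_plus_one(scores):
    """Highest value minus lowest value, plus 1 (the convention used on this card)."""
    return max(scores) - min(scores) + 1


# Hypothetical scores from six participants
scores = [4, 7, 7, 9, 10, 11]
print(range_plus_one(scores))  # 11 - 4 + 1 = 8
```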
Describe the standard deviation
Measure of spread around the mean.
The higher the SD, the more the data is spread around the mean.
The larger the calculated number, the more the data is spread around the mean, so there is less consistency and there are more individual differences (suggesting that not all participants were affected by the IV in the same way).
The smaller the calculated number, the more the data is clustered around the mean, so there is more consistency and there are fewer individual differences.
USED FOR INTERVAL DATA
What does the standard deviation tell us
A HIGH SD means scores are more spread around the mean so there is more variation in scores
The scores are less consistent and there are more individual differences in the results
A LOW standard deviation (closer to 0) means scores are less spread around the mean so there is less variation in scores
The scores are more consistent and there are fewer individual differences in the results.
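The difference between a low and a high SD can be seen in a short Python sketch. The two conditions are made-up data with the same mean, and `statistics.stdev` (the sample standard deviation, dividing by n - 1) is an assumption; the cards do not say which formula is expected.

```python
import statistics

# Hypothetical scores: both conditions have a mean of 10
condition_a = [10, 11, 9, 10, 10]  # scores clustered around the mean
condition_b = [4, 16, 10, 2, 18]   # scores spread far from the mean

sd_a = statistics.stdev(condition_a)  # sample SD (divides by n - 1)
sd_b = statistics.stdev(condition_b)

print(f"Condition A: mean = {statistics.mean(condition_a)}, SD = {sd_a:.2f}")
print(f"Condition B: mean = {statistics.mean(condition_b)}, SD = {sd_b:.2f}")
```

Condition A has an SD of about 0.71 and condition B about 7.07, so although the means are identical, scores in condition B are far less consistent and show more individual differences.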
What does the mean tell us
Provides a good indication of the average/typical score participants gain
Generally the higher the mean that is gained then the greater the score/effect (although this is not always the case! Do double check in the scenario what a high score will mean)
What is the writing frame for the sd/ mean questions
STATE WHICH IS HIGHER/LOWER, STATE WHAT THIS SUGGESTS.
The mean for condition A (context) is ...
which is higher/lower than the mean for condition B (context)
which is ...
Therefore… what does this suggest about the effect on the DV? Link to the scenario. (MUST INCLUDE)
The standard deviation for condition A (context) is ...
which is higher/lower than the standard deviation for condition B (context)
which is ...
Therefore… what does this suggest about the spread of SCORES (context DV) and individual differences? Link to the scenario. (MUST INCLUDE)