Section 3: Data Handling and Data Analysis Flashcards

1
Q

what is quantitative data?

A

numerical data (data in the form of numbers)

quantitative data collection techniques tend to be more tightly controlled in order to produce precise numerical measures

for example:
- experiments such as laboratory experiments tend to measure the DV quantitatively (eg. number of words recalled from a list, or reaction time),
- the use of closed questions in a questionnaire (eg. number of hours revising per week, or ratings)
- behavioural categories in an observation
…are all quantitative in nature

2
Q

what is qualitative data?

A

non-numerical data

usually data in the form of words (eg. a transcription of an interview) but could be any non-numerical data (eg. a drawing)

qualitative data collection techniques are often used when more depth or detail is required and where tight control and precise measurements would be seen as unrepresentative and limited (eg, when analysing more complex thoughts, feelings and opinions)

for example:
- interviews are more likely to produce more detail and elaboration than questionnaires
- observational studies where written records are made (such as case studies), rather than behavioural categories simply being counted

3
Q

difference between quantitative and qualitative data

A

quantitative data is numerical data (data in the form of numbers), whereas qualitative data is non-numerical data, usually in the form of words (eg. a transcription of an interview) but possibly any other non-numerical form (eg. a drawing)

quantitative data collection techniques tend to be more tightly controlled in order to produce precise numerical measures. for example, laboratory experiments tend to measure the DV quantitatively (eg. number of words recalled from a list, or reaction time), and closed questions in a questionnaire (eg. number of hours revising per week, or ratings) and behavioural categories in an observation are also quantitative in nature

qualitative data collection techniques, by contrast, are often used when more depth or detail is required and where tight control and precise measurements would be seen as unrepresentative and limited (eg. when analysing more complex thoughts, feelings and opinions). for example, interviews are more likely to produce detail and elaboration than questionnaires, as are observational studies where written records are made (such as case studies) rather than behavioural categories simply being counted

4
Q

explain the overlap between quantitative and qualitative data collection techniques

A

researchers collecting quantitative data as part of an experiment may often interview their ppts as a way of gaining a more detailed insight into their experience of the investigation

likewise, there are a number of different ways of converting qualitative data into quantitative data in order to analyse the data in a clearer way

5
Q

what is primary data?

A

primary data refers to original data that has been directly observed and collected by the researcher specifically for the purposes of the investigation

it is data that arrives first-hand from the ppts themselves

for example, data gathered from your own experiment, questionnaire, interview or observation would be classed as primary data

6
Q

what is secondary data?

A

secondary data refers to data that has been collected by someone other than the person conducting the research

in other words, this is data that already exists (second-hand data) before the psychologist begins their investigation

for example, data from research journals, books, websites, government data (eg. population records) or data held within organisations (eg. employee absence rates) would be classed as secondary data

a meta-analysis is also secondary data

7
Q

strengths of primary data

A
  • can control the quality and accuracy of data (eg can standardise to ensure high validity)
  • can ensure data covers research objective (firm conclusions can be drawn)
8
Q

limitations of primary data

A
  • difficult to access data as it does not already exist, so it is time consuming
  • significance not known as there is limited prior research (statistical tests needed)
9
Q

strengths of secondary data

A
  • easy to access data as it already exists, so it saves time
  • significance is already known from prior research (no statistical tests needed)
10
Q

limitations of secondary data

A
  • cannot control the quality and accuracy of data (eg can't ensure high validity)
  • cannot ensure data covers research objective (unable to draw firm conclusions)
11
Q

what is a meta-analysis

A

a meta-analysis refers to the process of combining results from a number of studies on a particular topic to provide an overall view

it is a form of secondary data because the data is not gathered first-hand through the researcher's own research

a meta-analysis may involve a qualitative review of conclusions (i.e. a discussion) and/or a quantitative analysis of the results, producing an effect size across the different studies
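
as a rough illustration of the quantitative route, one simple way to combine studies is to average their effect sizes, weighting each study by its sample size. this is only a sketch under stated assumptions: the study figures and the function name `weighted_mean_effect` are made up for illustration, and published meta-analyses often use more sophisticated weighting

```python
# hypothetical (effect size, sample size) pairs for three studies; not real data
studies = [(0.30, 50), (0.50, 100), (0.10, 50)]


def weighted_mean_effect(studies):
    """Average the effect sizes, weighting each study by its sample size."""
    total_n = sum(n for _, n in studies)
    return sum(effect * n for effect, n in studies) / total_n


overall = weighted_mean_effect(studies)
print(round(overall, 2))
```

larger studies pull the overall figure towards their own result, which is why the study with 100 ppts counts for twice as much as each of the others here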

12
Q

advantage of a meta-analysis

A

P: easier to gather results on a large scale
E: because meta-analyses draw findings together from a range of studies, each with their own samples of ppts, it is easier to gather results which represent a wider, more representative sample than most single studies can
E: eg, research can consider studies of a similar type conducted in many countries to compare effects of a particular variable cross-culturally
L: allows us to view the data with more confidence and the findings may be said to be higher in population validity

13
Q

disadvantage of a meta-analysis

A

P: may suffer from the file-drawer effect
E: meta-analysis research relies on the researcher selecting a series of studies in order to analyse the overall view, but this process of selection can be open to researcher bias
E: eg, the researcher may choose to leave out any studies which do not support their hypothesis
L: the results would be biased and not a true reflection of the research as a whole

14
Q

types of descriptive statistics

A
  • measures of central tendency
  • measures of dispersion
15
Q

what is raw data?

A

when you carry out research, the results you generate are known as raw data

16
Q

why may you need to calculate a measure of central tendency or summarise data

A

when you carry out research, the results you generate are known as raw data

there is often so much raw data that it becomes difficult to see an overall effect or pattern

as a result, the data often needs to be summarised so that you can clearly see the effects

17
Q

measures of central tendency

A
  • mean
  • median
  • mode
18
Q

what is the mean?

A

a measure of central tendency where all scores are totalled & divided by the number of scores

this is the only measure of central tendency that includes all the data/scores in the calculation
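
a minimal worked example of the calculation, assuming a made-up set of recall scores (the numbers are illustrative, not real data):

```python
# hypothetical recall scores (number of words recalled); not real data
scores = [4, 7, 7, 9, 13]

# total all the scores, then divide by the number of scores
mean = sum(scores) / len(scores)
print(mean)  # prints 8.0
```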

19
Q

what is the median

A

a measure of central tendency in which scores are placed in rank order and the middle value is taken

if there is an even number of scores, the midpoint between the two middle scores is taken
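
the two cases (odd and even number of scores) can be sketched as follows. the function name `median` and the scores are illustrative assumptions, and the sketch expects a non-empty list

```python
def median(scores):
    """Middle value of the ranked scores, or the midpoint of the two
    middle scores when there is an even number of scores."""
    ranked = sorted(scores)              # place the scores in rank order
    middle = len(ranked) // 2
    if len(ranked) % 2 == 1:             # odd number of scores
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_median = median([9, 4, 13, 7, 7])        # ranked: 4 7 7 9 13
even_median = median([9, 4, 13, 7, 7, 10])   # ranked: 4 7 7 9 10 13
print(odd_median, even_median)
```

with five scores the middle value is 7; with six scores the median is the midpoint between 7 and 9, which is 8.0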

20
Q

what is the mode

A

a measure of central tendency in which the most common value in the data set is taken

in some data sets there may be two modes (bi-modal) or no mode if all scores are different

for some data, the mode is the only method you can use, for example data in categories.
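
a short sketch of the three situations described above (one mode, two modes, no mode), plus data in categories. the function name `modes` and all values are illustrative assumptions, and the sketch expects a non-empty list

```python
from collections import Counter


def modes(values):
    """Return the most common value(s); an empty list means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []                        # all scores are different: no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([4, 7, 7, 9, 13]))           # one mode
print(modes([2, 2, 5, 5, 8]))            # two modes (bi-modal)
print(modes([1, 2, 3]))                  # no mode
print(modes(["dog", "cat", "dog"]))      # categories: only the mode applies
```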

21
Q

strength of using the mean

A

it is a sensitive measure (uses all scores in the data set) so it is more representative

22
Q

limitation of using the mean

A

skewed by extreme scores
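
to see the effect of an extreme score, the sketch below adds one very high score to a made-up data set and compares the mean with the median. the scores are illustrative assumptions; `statistics` is part of Python's standard library

```python
import statistics

scores = [4, 7, 7, 9, 13]                # hypothetical scores
with_extreme = scores + [80]             # same scores plus one extreme score

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_extreme)
median_before = statistics.median(scores)
median_after = statistics.median(with_extreme)

print(mean_before, mean_after)           # the mean jumps from 8 to 20
print(median_before, median_after)       # the median only moves from 7 to 8.0
```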

23
Q

strength of using the median

A

not skewed by extreme scores

24
Q

limitation of using the median

A

it is not a sensitive measure (doesn’t use all scores in the data set) so it is less representative

25
Q

strength of using the mode

A

not skewed by extreme scores

26
Q

limitation of using the mode

A

it is not a sensitive measure (doesn’t use all scores in the data set) so it is less representative

27
Q

what are measures of dispersion

A

‘dispersion’ is another way of saying ‘spread’

this tells us, not the average ‘midpoint’ like a measure of central tendency, but how ‘spread out’ the scores are

28
Q

measures of dispersion

A
  • range
  • standard deviation
29
Q

what is the range

A

a measure of dispersion in which the lowest value is subtracted from the highest value

this is the simplest calculation to measure the spread of data

a large range indicates a large spread of data whilst a small range indicates a small spread of data
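
a minimal worked example, assuming a made-up set of scores:

```python
scores = [4, 7, 7, 9, 13]                # hypothetical scores

# subtract the lowest value from the highest value
score_range = max(scores) - min(scores)
print(score_range)  # prints 9
```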

30
Q

what is standard deviation

A

a measure of dispersion that measures the average deviation (or difference) of each score from the mean

this is a different measure of the ‘spread’ of data which, unlike the range, takes all the scores in a data set into account

a large standard deviation indicates scores are widely spread out from the mean (all ppts responded very differently) whilst a small standard deviation indicates scores are closely clustered around the mean (all ppts responded in a similar way)
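
in practice the calculation squares each score's deviation from the mean, averages the squared deviations and takes the square root. the sketch below works through those steps on made-up scores and assumes the sample formula (dividing by n - 1); dividing by n instead gives the population version

```python
import math
import statistics

scores = [4, 7, 7, 9, 13]                # hypothetical scores
mean = sum(scores) / len(scores)

# squared deviation of each score from the mean
squared_deviations = [(score - mean) ** 2 for score in scores]

# sample variance: divide by n - 1 (an assumption; see the note above)
variance = sum(squared_deviations) / (len(scores) - 1)
standard_deviation = math.sqrt(variance)

print(round(standard_deviation, 2))
# the standard library's sample standard deviation gives the same result
print(round(statistics.stdev(scores), 2))
```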

31
Q

strength of the range

A

easy to calculate, and the result is in the same unit of measurement as the original data

32
Q

limitation of the range

A

not a precise measure of spread as it depends on only 2 values (not a sensitive measure) and can be affected by extreme values

33
Q

strength of standard deviation

A

it is a sensitive measure as it takes all values into account, so it is a more precise measure of spread than the range and isn't as affected by a single extreme value

34
Q

limitation of standard deviation

A

can be difficult and time consuming to calculate in comparison to the range

35
Q

ways of presenting/displaying/interpreting quantitative data

A
  • tables
  • bar charts
  • histograms
  • line graphs
  • scattergram (scattergraph)
  • distributions
36
Q

presentation and interpretation of quantitative data:
tables

A

tables are used to organise or summarise the raw data from the study

one way of organising data to make it clearer to identify patterns is through a frequency distribution table

this involves putting the units of measurement into some sort of order before counting the number of times (frequency) each unit of measurement occurs in the raw data

tables are also used to summarise data by presenting descriptive statistics such as measures of central tendency and dispersion

it is standard practice to include a summary paragraph beneath the table explaining the results
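
a frequency distribution table can be sketched in a few lines: put the scores in order, then count how often each one occurs. the raw scores below are made up for illustration

```python
from collections import Counter

# hypothetical raw data (eg. ratings on a 1-5 scale); not real data
raw_scores = [3, 5, 4, 5, 3, 5, 2, 4, 5]

# count the number of times (frequency) each score occurs
frequency = Counter(raw_scores)

print("score | frequency")
for score in sorted(frequency):          # put the scores into order
    print(f"{score:>5} | {frequency[score]}")
```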

37
Q

presentation and interpretation of quantitative data:
bar charts

A

bar charts are used when data is divided into categories (also known as discrete data)

bar charts are also useful for presenting the difference in mean values, for example

the vertical y-axis represents the frequency and the horizontal x-axis represents the categories or conditions

in a bar chart, a space is left between each bar to indicate the lack of continuity

there must be gaps between the bars on a bar chart

38
Q

presentation and interpretation of quantitative data:
histograms

A

in a histogram the bars touch each other, which shows that data is continuous rather than discrete (as in a bar chart)

the x-axis represents a scale made up of equal-sized intervals (eg. marks on a test broken up into 0-9, 10-19, 20-29 etc.)

the y-axis again represents the frequency (eg. number of people who scored a certain mark) within each interval

there are no spaces between the bars on a histogram
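
the grouping into equal-sized intervals can be sketched as below, with a text-only stand-in for the bars. the marks and the interval width of 10 are illustrative assumptions

```python
from collections import Counter

marks = [3, 12, 15, 18, 21, 24, 25, 29, 8, 17]   # hypothetical test marks
interval_width = 10

# map each mark to the start of its interval (0-9 -> 0, 10-19 -> 10, ...)
counts = Counter((mark // interval_width) * interval_width for mark in marks)

for start in sorted(counts):
    label = f"{start}-{start + interval_width - 1}"
    print(f"{label:>6} | {'#' * counts[start]}")   # one # per mark
```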

39
Q

presentation and interpretation of quantitative data:
line graphs

A

like histograms, line graphs use continuous data on the x-axis. a dot marks the middle of the top of where each bar would be, and the dots are connected by a line

40
Q

presentation and interpretation of quantitative data:
scattergram (scattergraph)

A

unlike other forms of graphs, scattergrams do not depict differences but relationships (or associations) between co-variables

each ppt is scored on two variables (eg. hours of revision per week and exam score). one of these scores occupies the x-axis and the other occupies the y-axis (it doesn't matter which is which), and each point on the graph represents the scores of one ppt on both of the co-variables

scattergrams also use continuous data

41
Q

what is a distribution

A

a distribution is the overall pattern shown by a large set of data

42
Q

types of distribution

A
  • normal distributions
  • skewed distributions
43
Q

what is a normal distribution

A

a normal distribution is a symmetrical spread of frequency data that forms a ‘bell shaped curve’

this pattern occurs when most people are gathered in the middle of a measurement scale (the average)

as a result, the mean, median and mode are all in the exact mid-point (peak)

there are progressively fewer people either side of the ‘peak’ frequency

the dispersion of scores either side of the mid-point is consistent and can be expressed in standard deviations

44
Q

characteristics of normal distributions

A
  • the mean, median and mode are all in the exact mid-point (peak)
  • the distribution is symmetrical around the mid-point (peak)
  • the dispersion of scores either side of the mid-point is consistent and can be expressed in standard deviations
  • for any data set that is normally distributed, 68.26% of people will lie within one standard deviation of the mean
  • a total of 95.44% of people will lie within two standard deviations of the mean, which means only 4.56% lie in the area beyond this
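
these percentages can be checked with the standard result that the proportion of a normal distribution within k standard deviations of the mean is erf(k / √2). the function name `proportion_within` is an illustrative assumption; `math.erf` is in Python's standard library

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution lying within k standard
    deviations of the mean."""
    return math.erf(k / math.sqrt(2))


within_one = proportion_within(1) * 100
within_two = proportion_within(2) * 100
print(round(within_one, 2), round(within_two, 2))
```

the computed values agree with the figures above to within rounding in the last decimal place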
45
Q

what is a skewed distribution

A

skewed distributions occur when frequency data is not symmetrically spread around the mean

in other words, there is a higher frequency of people on one side of the measurement scale than the other

this can either take the form of a positive or negative skew

a positive skew is where the most common score (the mode) is concentrated on the left side of the distribution

a negative skew is where the most common score is concentrated on the right side of the distribution

46
Q

types of skewed distribution

A
  • positive skew
  • negative skew
47
Q

what is a positive skew

A

a positive skew is where the most common score (the mode) is concentrated on the left side of the distribution

48
Q

what is a negative skew

A

a negative skew is where the most common score is concentrated on the right side of the distribution

49
Q

how are the various measures of central tendency affected by skewed distributions

A

the mode remains at the highest point, the median comes next, but the mean is dragged furthest across the graph towards the tail (the extreme scores)
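
a small made-up data set with a positive skew (most scores low, a few very high) illustrates the ordering. the scores are illustrative assumptions, not real data

```python
import statistics

# hypothetical positively skewed scores: most are low, a few are very high
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode = statistics.mode(scores)
median = statistics.median(scores)
mean = statistics.mean(scores)

# in a positive skew the extreme high scores pull the mean furthest right
print(mode, median, mean)  # prints 2 3.0 4.5
```

in a negative skew the same thing happens in the other direction, with the extreme low scores pulling the mean furthest to the left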