Data Analysis Flashcards

1
Q

What is quantitative data?

A

data presented as numbers, which allows for quick comparison between individuals

2
Q

What is qualitative data?

A

data presented with words

- provides depth/detail of situation

3
Q

What are strengths of using quantitative data?

A
  • means/ranges can be calculated
  • easy to enter numbers into tables/display data in graphs or charts
  • precise details used
  • easy to check for reliability
  • easy to test hypotheses
  • easy to analyse
4
Q

What are limitations of using qualitative data?

A
  • can be difficult/time consuming to analyse as involves looking for trends and/or categorisation
  • subjective
  • hard to test hypotheses
5
Q

What are strengths of using qualitative data?

A
  • allows for detailed descriptions; rich/informative data

- useful for attitudes, opinions, beliefs

6
Q

What are limitations of using quantitative data?

A
  • reduces complex behaviour to a number

- important information may be lost

7
Q

What is primary data?

A

data collected/observed directly from first-hand experience by researcher for the purpose of their particular investigation

8
Q

What are the strengths of using primary data?

A
  • the researcher has control; data collection can be designed to fit the aims and/or hypotheses of the study
  • the data has not been altered by other researchers, which reduces the likelihood of investigator bias or subjectivity
9
Q

What are limitations of using primary data?

A
  • lengthy and time consuming, possibly expensive
10
Q

What is secondary data?

A

data collected by someone other than the researcher (usually for a purpose that differs from that of the researcher)

11
Q

What are the strengths of using secondary data?

A
  • no need to design a study, go through ethics committees, recruit participants etc.; more convenient and less expensive to obtain
  • the data may already have been subjected to inferential statistical testing, so it is known whether or not it is significant
12
Q

What are the limitations of using secondary data?

A
  • for some studies, the data will not fit the specific aims and/or hypotheses of the current researcher, so may not match their needs
  • there may be substantial variation in the quality and accuracy of secondary data; information may appear valuable initially but turn out to be incomplete
13
Q

What is meta-analysis?

A

a method where, rather than conducting new research, the researcher combines and re-analyses the data from a large number of existing studies; it therefore uses secondary data

14
Q

What are the strengths of using meta-analysis?

A
  • technique is useful when a number of small studies have found contradictory or weak results as by combining the data from these studies it may be possible to identify common trends that are not noticeable in a single study
  • reviewing the results from a number of studies, rather than just one, can increase the validity of the conclusions drawn as they are based on a larger sample of participants
15
Q

What are the limitations of using meta-analysis?

A
  • individual studies may have different designs so may not be truly comparable, may lead to a misleading conclusion
  • it’s difficult to come up with the right criteria for accepting/rejecting studies to be part of the meta-analysis
  • publication bias (the file-drawer problem): studies that give positive results may be over-represented in the meta-analysis, so any conclusions will not take into account the studies that failed to get published
16
Q

What is nominal data?

A

also called categorical data; the lowest level of measurement, recording the frequency of occurrences in each category
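
As a rough illustration of counting frequencies per category, here is a minimal Python sketch. The `responses` list is made-up data used only for the example.

```python
from collections import Counter

# Hypothetical nominal data: each participant's preferred revision method.
responses = ["flashcards", "notes", "flashcards", "videos", "notes", "flashcards"]

# Nominal data can only be counted, so tally the frequency of each category.
frequencies = Counter(responses)

for category, count in frequencies.most_common():
    print(f"{category}: {count}")
```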

17
Q

What is ordinal data?

A

measurements placed in rank order, or in terms of relative position (in relation to others in the group)

18
Q

What is interval data?

A

data measured on a scale made up of equal units

19
Q

What is ratio data?

A

like interval data, measured on a scale made up of equal units, BUT ratio data has a true zero point (so no negative values), e.g. weight, height; temperature in Celsius is interval rather than ratio, because 0°C is not a true zero

20
Q

What is the mode?

A

the value that occurs most frequently, identified once the data is arranged in numerical order
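
A minimal sketch of finding the mode in Python, assuming a small made-up list of scores. `statistics.multimode` (Python 3.8+) returns every value tied for most frequent, so it also shows when a data set is bimodal.

```python
from statistics import multimode

# Hypothetical scores; 3 and 7 both occur twice.
scores = [7, 3, 9, 3, 5, 7, 2]

# Arrange in numerical order, then pick out the most frequent value(s).
ordered = sorted(scores)
modes = multimode(ordered)

print(ordered)  # the data in numerical order
print(modes)    # more than one entry means there is more than one mode
```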

21
Q

When/why is using the mode useful?

A
  • for nominal data
  • not affected by outliers
  • can make more sense than the mean (e.g. for age, saying 2 rather than 2.4)
22
Q

When/why is using the mode not useful?

A
  • there can be more than one mode in a set of data (data is bimodal) making it more difficult to use the mode as a summary value in the data
  • does not take into account all the other values, loses a lot of information
23
Q

What is the median?

A

the middle value (mid-point) when the data is arranged in numerical order; if it lies between two numbers, take the mean of those two values
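
The procedure can be sketched in Python as follows. `median_by_hand` is a hypothetical helper written for this example, and the score lists are made up; the standard library's `statistics.median` gives the same result.

```python
from statistics import median

def median_by_hand(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)           # arrange in numerical order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                     # odd count: a single middle value
        return ordered[middle]
    # even count: mean of the two values either side of the mid-point
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_scores = [7, 1, 5, 3, 9]
even_scores = [8, 2, 6, 4]

print(median_by_hand(odd_scores), median(odd_scores))
print(median_by_hand(even_scores), median(even_scores))
```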

24
Q

When/why is using the median useful?

A
  • most appropriate for ordinal data or skewed distributions

- not affected by outliers

25
Q

When/why is using the median not useful?

A
  • some information is lost, as not all of the raw scores are used in the calculation
26
Q

What is the mean?

A

when values are added up and then divided by the total number of values

27
Q

When/why is the mean useful?

A
  • most appropriate with interval/ratio data, symmetrical distributions with no extreme values
  • includes information from all the items of data so is the most sensitive measure of central tendency (least information is lost)
28
Q

When/why is the mean not useful?

A
  • if the data is skewed or contains outliers, as extreme values distort the mean
  • mean may not be one of the original values (e.g. family does not have 3.2 children) so may be misleading
  • if the distribution is bimodal, again may be misleading
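
To illustrate how a single extreme value can distort the mean while leaving the median unchanged, here is a small Python sketch using made-up scores.

```python
from statistics import mean, median

# Hypothetical scores clustered around 5.
typical = [4, 5, 5, 6, 5]

# The same scores with one extreme value (outlier) added.
with_outlier = typical + [35]

mean_typical = mean(typical)
mean_outlier = mean(with_outlier)
median_typical = median(typical)
median_outlier = median(with_outlier)

print(mean_typical, median_typical)   # mean and median agree
print(mean_outlier, median_outlier)   # the mean is pulled up, the median is not
```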
29
Q

What are measures of dispersion?

A

measures of how spread out the data is around the mid-point, e.g. range, interquartile range, standard deviation

30
Q

What is the range?

A

calculated by subtracting the lowest from the highest value in the data set (often researchers add 1)
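
A minimal Python sketch of the calculation, using made-up scores, showing the range both with and without the added 1.

```python
# Hypothetical data set.
scores = [12, 7, 19, 4, 15]

# Highest value minus lowest value.
simple_range = max(scores) - min(scores)

# The variant where 1 is added to the difference.
range_plus_one = simple_range + 1

print(simple_range, range_plus_one)
```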

31
Q

What are the strengths of using the range?

A
  • easy/quick to calculate
32
Q

What are the limitations of using the range?

A
  • includes end values, may be distorted by outliers
  • only having information from end scores contains no information about whether the values are spread evenly or clustered
33
Q

What is standard deviation?

A

a measure of how spread out a set of values is around the mean - the larger the standard deviation, the larger the spread of scores within the data set

34
Q

How do you calculate the SD? (very unlikely this will come up)

A
  1. calculate the mean
  2. subtract mean from each value in data set to find the difference between each value and the mean
  3. square each of these differences (this gets rid of negative signs)
  4. find the sum of all of these squared differences
  5. divide by the number of values (N for a whole population, N - 1 for a sample) to get the variance
  6. find the square root of the variance
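
The six steps can be sketched in Python as below. This is an illustration only: `standard_deviation` and its `sample` flag are hypothetical names for this example, the scores are made up, and it assumes the usual convention of dividing by N for a whole population and N - 1 for a sample.

```python
from math import sqrt
from statistics import pstdev, stdev

def standard_deviation(values, sample=True):
    n = len(values)
    mean = sum(values) / n                         # 1. calculate the mean
    differences = [v - mean for v in values]       # 2. subtract the mean from each value
    squared = [d ** 2 for d in differences]        # 3. square each difference
    total = sum(squared)                           # 4. sum the squared differences
    divisor = n - 1 if sample else n               # assumption: N - 1 for a sample, N for a population
    variance = total / divisor                     # 5. divide to get the variance
    return sqrt(variance)                          # 6. square root of the variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

print(standard_deviation(scores, sample=False), pstdev(scores))
print(standard_deviation(scores, sample=True), stdev(scores))
```

The standard library functions `statistics.pstdev` (population) and `statistics.stdev` (sample) should agree with the hand-rolled version.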
35
Q

What are the strengths of using SD?

A
  • takes every value in the data set into account, so it is a more precise measure of dispersion than the range
36
Q

What are the limitations of using SD?

A
  • more difficult and time-consuming to calculate than the range
  • can be distorted by extreme values (outliers), as every value is included in the calculation
37
Q

What is a summary table?

A

a table of descriptive statistics (e.g. measures of central tendency and dispersion); it is common to follow it with a paragraph or two explaining what the results show

38
Q

What is a contingency table?

A

includes all possible contingencies; often used for nominal data to show the frequency of occurrences in each category (e.g. as well as showing those who were speeding, also show those who were not speeding, so that wrong conclusions are not drawn)
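
As a rough sketch of how such a table could be built, the Python below counts every combination of group and behaviour. The `observations` list, the group names and the behaviour labels are all made up for illustration.

```python
from collections import Counter

# Hypothetical observations: (driver group, behaviour).
observations = [
    ("young", "speeding"), ("young", "not speeding"), ("young", "speeding"),
    ("older", "not speeding"), ("older", "speeding"), ("older", "not speeding"),
    ("young", "not speeding"), ("older", "not speeding"),
]

# Count the frequency of every combination, including "not speeding".
counts = Counter(observations)

groups = ["young", "older"]
behaviours = ["speeding", "not speeding"]

# Print one row per group and one column per behaviour.
print(" " * 8 + "".join(f"{b:>14}" for b in behaviours))
for g in groups:
    print(f"{g:<8}" + "".join(f"{counts[(g, b)]:>14}" for b in behaviours))
```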

39
Q

What is a line graph and when do we use it?

A

shows continuous data: how one variable changes with respect to another (e.g. time)

40
Q

What are pie charts and when do we use them?

A

used to show the relative proportions of different categories; the frequency of each category is shown as a percentage of the whole

41
Q

What are scattergrams/scattergraphs and when do we use them?

A

used to represent data from correlational research, each pair of values plotted, one against the other, to determine if a consistent trend is apparent

42
Q

What are bar graphs and when do we use them?

A

show data in the form of categories that the researcher wishes to compare (e.g. males with females); categories go along the x-axis, frequency goes on the y-axis, and the height of each bar represents the frequency; used for discrete variables

43
Q

What is a histogram and when do we use it?

A

used for continuous variables rather than discrete ones; the continuous variable is plotted on the x-axis, indicated by there being no space between the bars, and the y-axis must show the frequency with which each value on the x-axis occurs

44
Q

What is a frequency polygon and when do we use it?

A

very similar to a histogram, and the variable on the x-axis must be continuous; drawn by joining the midpoint of the top of each bar in a histogram to the midpoint of the next with a line
- advantage: two or more frequency distributions can be displayed on the same graph, allowing comparisons to be made

45
Q

What is a distribution?

A

the pattern of data that can be seen on a graph, e.g. normal, positively skewed or negatively skewed

46
Q

What is a normal distribution?

A

an arrangement of data that is symmetrical and forms a bell-shaped pattern, where the mean, median and mode all fall in the centre at the single highest peak

47
Q

What is a skewed distribution?

A

an arrangement of data that is not symmetrical; the data is clustered towards one end of the distribution

48
Q

What order are mean, median, mode in a negatively skewed distribution?

A

Mean, median, mode (lowest to highest) - may occur when a task is too easy, so participants are expected to get high scores (ceiling effect); (left foot)

49
Q

What order are mean, median, mode in a positively skewed distribution?

A

Mode, median, mean (lowest to highest) - may occur if a task is too difficult (floor effect); (right foot)
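
The ordering in each kind of skew can be checked with a short Python sketch. Both score lists are made up: `easy_test` stands in for a task that is too easy (scores bunched at the top) and `hard_test` for one that is too difficult (scores bunched at the bottom).

```python
from statistics import mean, median, mode

# Hypothetical scores out of 10 on a test that is too easy (negative skew).
easy_test = [2, 5, 7, 8, 8, 9, 9, 9, 10, 10]

# Hypothetical scores out of 10 on a test that is too hard (positive skew).
hard_test = [0, 0, 1, 1, 1, 2, 2, 3, 5, 8]

for name, scores in [("easy", easy_test), ("hard", hard_test)]:
    print(name, "mean:", mean(scores), "median:", median(scores), "mode:", mode(scores))
```

In the negatively skewed set the mean is the lowest of the three values, and in the positively skewed set it is the highest, because the mean is pulled towards the long tail.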

50
Q

What are inferential statistics?

A

the ways of analysing data using statistical tests that allow the researcher to make conclusions about whether a hypothesis was supported by the results

51
Q

What is the minimum level of significance chosen for research and what does it mean?

A

P < 0.05 - the probability that the observed result is down to chance is less than 5%

52
Q

When might a level lower than P < 0.05 be chosen?

A

P < 0.025 or P < 0.01 - more stringent levels used if study cannot easily be checked by replication or there is an aspect of risk involved

53
Q

What is a type 1 error?

A

when we reject the null hypothesis but shouldn’t have, as the result was actually down to chance
- more likely when the level of significance is too lenient (e.g. P < 0.10)

54
Q

What is a type 2 error?

A

when we retain the null hypothesis but there was actually a real effect, so we should have rejected it
- more likely when the level of significance is too stringent (e.g. P < 0.01)

55
Q

How do we calculate the sign test?

A
  1. collect the data in a table
  2. make sure the level of measurement is NOMINAL - look at the difference between the second and first rating for each participant and record whether it is positive or negative (ignore any participants with no difference)
  3. count the number of times the less frequent sign occurs (this is S - the observed/calculated value)
  4. to see if the difference between the two conditions is significant, choose the correct statistical table - if the observed value (S) is less than or equal to the critical value for the given level of significance, the null hypothesis can be rejected
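
The steps above can be sketched in Python. This is an illustration under stated assumptions: `sign_test` is a hypothetical helper, the ratings are made up, and participants with no difference are left out of N. Instead of looking up a critical value in a table, the sketch computes the exact two-tailed binomial probability of getting S or fewer of the less frequent sign by chance, which is a different route to the same decision.

```python
from math import comb

def sign_test(first, second):
    """Return (S, N, two-tailed probability) for paired ratings."""
    differences = [b - a for a, b in zip(first, second)]
    plus = sum(1 for d in differences if d > 0)    # number of + signs
    minus = sum(1 for d in differences if d < 0)   # number of - signs
    n = plus + minus                               # ties (no difference) are ignored
    s = min(plus, minus)                           # S: count of the less frequent sign
    # Probability of S or fewer of one sign if + and - are equally likely.
    one_tailed = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    two_tailed = min(1.0, 2 * one_tailed)
    return s, n, two_tailed

# Hypothetical ratings from 10 participants in two conditions.
first_rating = [3, 5, 4, 6, 2, 5, 4, 3, 6, 5]
second_rating = [5, 6, 4, 8, 4, 4, 6, 5, 7, 7]

s, n, p = sign_test(first_rating, second_rating)
print(f"S = {s}, N = {n}, p = {p:.4f}")
print("reject the null hypothesis" if p < 0.05 else "retain the null hypothesis")
```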