U1 - Collecting data Flashcards

1
Q

Advantages of primary data (3)

Disadvantages of primary data (2)

A

ADV:

  • Collection method is known
  • Accuracy is known
  • Can find answers to very specific questions

DISADV:

  • Time-consuming to collect
  • Expensive to collect
2
Q

Advantages of secondary data (3)

Disadvantages of secondary data (5)

A

ADV:

  • Easy to obtain
  • Cheap to obtain
  • Data from some organisations can be more reliable than data you collect yourself

DISADV:

  • Method of collection is unknown
  • Data might be out of date
  • Data may contain mistakes
  • Data may come from an unreliable source
  • May be difficult to find answers to specific questions
3
Q

Advantages of a census (3)

Disadvantages of a census (4)

A

ADV:

  • Unbiased
  • Accurate
  • Takes the whole population into account therefore it’s representative

DISADV:

  • Time-consuming
  • Expensive
  • Difficult to ensure the whole population is used
  • Lots of data to handle
4
Q

Advantages of a sample (3)

Disadvantages of a sample (2)

A

ADV:

  • Cheaper than a census
  • Less time-consuming than a census
  • Less data to be considered than a census

DISADV:

  • Not completely representative
  • May be biased
5
Q

Define sampling units

A

The people or items that are to be sampled

6
Q

Define sampling frame

A

A list of all the sampling units

7
Q

How do you carry out a simple random sample?

A
  • Using the sampling frame, number each sampling unit from 1 to N, where N is the population size.
  • Then, use a random number generator to generate n numbers between 1 and N, where n is the sample size, ignoring any repeats.
  • Identify which sampling units these numbers correspond to - these units form your sample.
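The steps above can be sketched in Python. This is only an illustration: the function name `simple_random_sample`, the `seed` argument and the class of 30 students are made up for the example, and units are assumed to be numbered from 1.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct units from the sampling frame at random."""
    rng = random.Random(seed)  # seed is only there to make a demo repeatable
    population_size = len(sampling_frame)
    if sample_size > population_size:
        raise ValueError("sample size cannot exceed population size")
    chosen_numbers = []
    while len(chosen_numbers) < sample_size:
        number = rng.randint(1, population_size)  # units are numbered 1..N
        if number not in chosen_numbers:          # ignore any repeats
            chosen_numbers.append(number)
    # Unit number k sits at list index k - 1.
    return [sampling_frame[k - 1] for k in chosen_numbers]

frame = [f"Student {i}" for i in range(1, 31)]  # hypothetical class of 30
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

Every unit has the same chance of being picked on each draw, which is what makes the sample random.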
8
Q

Advantages of a simple random sample (3)

Disadvantages of a simple random sample (2)

A

ADV:

  • Free of bias
  • Sample is more likely to be representative of the population, provided it is a large sample
  • Each sampling unit has an equal chance of selection

DISADV:

  • Impractical when the population size or the sample size is large
  • A sampling frame is needed
9
Q

How do you carry out a systematic sample?

A
  • Using the sampling frame, number each sampling unit from 1 to N, where N is the population size.
  • Calculate a regular interval to use by dividing the population size by the sample size.
  • Generate a random number from 1 to the interval to determine the starting point.
  • Keep adding the interval to the starting point until the sample is complete.
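A minimal Python sketch of these steps, assuming units are numbered from 1, the interval is rounded down to a whole number, and the starting point is drawn from 1 up to the interval. The function name and the example population of 100 are hypothetical.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Select every k-th unit after a random starting point."""
    rng = random.Random(seed)  # seed is only there to make a demo repeatable
    population_size = len(sampling_frame)
    interval = population_size // sample_size  # population size / sample size
    if interval < 1:
        raise ValueError("sample size cannot exceed population size")
    start = rng.randint(1, interval)           # random starting point
    numbers = [start + i * interval for i in range(sample_size)]
    # Unit number k sits at list index k - 1.
    return [sampling_frame[k - 1] for k in numbers]

frame = list(range(1, 101))  # hypothetical population numbered 1 to 100
picked = systematic_sample(frame, 10, seed=3)
print(picked)
```

With 100 units and a sample of 10 the interval is 10, so the selected numbers are always 10 apart.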
10
Q

Advantages of a systematic sample (2)

Disadvantages of a systematic sample (2)

A

ADV:

  • Simple and quick to do
  • Suitable for large samples and populations

DISADV:

  • A sampling frame is needed
  • Can introduce bias if the interval aligns with a pattern in the data
11
Q

How do you carry out a stratified sample?

A
  • Divide the population into strata (the categories you are stratifying by)
  • Calculate the number needed from each stratum using the formula: (sample size ÷ population size) × number in stratum
  • Use a random number generator to select that many members from each stratum
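A short Python sketch of the calculation and selection, assuming the result of the formula is rounded to the nearest whole number. The function name and the year-group data are invented for illustration. With awkward numbers the rounded amounts may not add up exactly to the sample size, so they would need adjusting by hand.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """strata maps each stratum name to the list of its members."""
    rng = random.Random(seed)  # seed is only there to make a demo repeatable
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # (sample size / population size) x number in stratum, rounded
        needed = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, needed)  # random pick within stratum
    return sample

# Hypothetical school of 300 students in three year groups.
strata = {
    "Year 9": [f"Y9-{i}" for i in range(120)],
    "Year 10": [f"Y10-{i}" for i in range(100)],
    "Year 11": [f"Y11-{i}" for i in range(80)],
}
sample = stratified_sample(strata, 30, seed=0)
counts = {name: len(chosen) for name, chosen in sample.items()}
print(counts)  # {'Year 9': 12, 'Year 10': 10, 'Year 11': 8}
```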
12
Q

Advantages of a stratified sample (3)

Disadvantages of a stratified sample (1)

A

ADV:

  • Sample accurately reflects population
  • Guarantees proportional representation of groups within a population
  • Minimises bias

DISADV:
  • Population must be divided into strata, which can be costly or time-consuming, especially if the population size is large

13
Q

How do you carry out a quota sample?

A
  • Group the population by characteristics such as age/gender
  • Give each category a quota (number of members to sample)
  • Collect data until the quotas are met in all categories
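As a rough Python illustration of these steps, the sketch below takes people in the order they are encountered and keeps each one only while their category's quota is still open. The function name, the `group_of` helper and the list of passers-by are all hypothetical.

```python
def quota_sample(people, quotas, group_of):
    """Keep people, in the order met, until every quota is filled."""
    remaining = dict(quotas)  # how many are still needed per category
    sample = []
    for person in people:
        group = group_of(person)
        if remaining.get(group, 0) > 0:  # quota for this category still open
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas met, stop collecting
            break
    return sample

# Hypothetical passers-by as (name, category) pairs.
passers_by = [("A", "adult"), ("B", "child"), ("C", "adult"),
              ("D", "adult"), ("E", "child"), ("F", "child")]
chosen = quota_sample(passers_by, {"adult": 2, "child": 2},
                      group_of=lambda person: person[1])
print([name for name, _ in chosen])  # ['A', 'B', 'C', 'E']
```

Person D is skipped because the adult quota is already full, and F is never asked. Nothing here is random, which is why quota sampling can introduce bias.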
14
Q

Advantages of a quota sample (4)

Disadvantages of a quota sample (4)

A

ADV:

  • Allows a small sample to still be representative of the population
  • No sampling frame is required
  • Quick, easy, inexpensive
  • Allows for easy comparison between different groups within a population

DISADV:

  • Non-random therefore can introduce bias
  • Population must be divided into groups which can be costly or time-consuming, especially if the population size is large
  • Widening the scope of the study increases the number of groups, which adds time and expense
  • Non-responses are not recorded
15
Q

How do you carry out an opportunity sample?

A
  • Choose members of the population that are the easiest to sample e.g. the first people to walk past
16
Q

Advantages of opportunity sampling (2)

Disadvantages of opportunity sampling (2)

A

ADV:

  • Easy to carry out
  • Inexpensive

DISADV:

  • Unlikely to provide a representative sample
  • Highly dependent on the individual researcher
17
Q

Why would you want to group data?

A

Because it helps you to see the distribution of the data and spot patterns more easily

18
Q

Describe what grouped discrete data would look like.

A

Classes with non-overlapping categories like 11-20, 21-30, etc

19
Q

Disadvantages of grouped data (2)

A
  • If too many or too few class intervals are selected, trends in the data can be obscured
  • Individual data values are not known, so you can only estimate the mean and median and give a modal class rather than an exact mode - grouped data is therefore less accurate than raw data
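To show why the mean from grouped data is only an estimate, the Python sketch below uses the usual midpoint method: every value in a class is treated as if it sat at the class midpoint. The function name and the frequency table are made up for the example.

```python
def estimated_mean(grouped):
    """grouped is a list of (lower, upper, frequency) tuples."""
    total_frequency = sum(freq for _, _, freq in grouped)
    # Midpoint of each class times its frequency, summed over all classes.
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in grouped)
    return total / total_frequency

# Hypothetical grouped frequency table: classes 0-10, 10-20, 20-30.
table = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
mean = estimated_mean(table)
print(mean)  # 16.0
```

The real mean of the raw data could differ from 16, because the actual values inside each class are unknown.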
20
Q

Define continuous data

A

Data that can take any value on a continuous numerical scale e.g. length

21
Q

Define discrete data

A

Data that can only take particular values on a continuous numerical scale e.g. shoe size

22
Q

Define categorical data

A

Data that can be sorted into non-overlapping categories

23
Q

Define ordinal data

A

Data that can be written in order or can be given a numerical rating scale

24
Q

Define bivariate data

A

Data that involves pairs of related data values

25
Q

Define multivariate data

A

Data that involves sets of three or more related data values

26
Q

What is self-selection sampling?

A

A type of non-probability sampling in which people choose to be part of the sample - e.g. they choose to complete a questionnaire or volunteer to take part in a study

27
Q

Advantages of self-selection sampling (3)

A
  • Requires little time or effort in finding sample members (because they contact you)
  • People who have volunteered are more likely to respond
  • It could be the only way to get people to take part in a study, or to find members of a population
28
Q

Disadvantage of self-selection sampling (1)

A
  • There can easily be trends within the respondents, such as people having strong opinions, which would lead to bias
29
Q

Describe what grouped continuous data would look like.

A
Non-overlapping classes with no gaps between them, written as inequalities, e.g. 50 < t <= 60, 60 < t <= 70.
Classes such as 50-59, 60-69 would not work because a value like 59.3 could not be placed in any class.
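A small Python sketch of how a continuous value is placed into classes of the form lower < t <= upper. The function name and the boundaries 50, 60, 70 are chosen only for illustration.

```python
def find_class(value, boundaries):
    """Return the class 'lower < t <= upper' containing value, or None."""
    for lower, upper in zip(boundaries, boundaries[1:]):
        if lower < value <= upper:
            return f"{lower} < t <= {upper}"
    return None  # value lies outside every class

boundaries = [50, 60, 70]
print(find_class(59.3, boundaries))  # 50 < t <= 60
print(find_class(60, boundaries))    # 50 < t <= 60
print(find_class(60.1, boundaries))  # 60 < t <= 70
```

Because each upper boundary is the next class's lower boundary, every value between 50 and 70 falls into exactly one class.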
30
Q

Define population

A

Everything or everybody that could possibly be involved in an investigation

31
Q

Define a census

A

A census is a survey or investigation with data taken from every member of a population

32
Q

Define bias

A

Systematic error

33
Q

Define independent variable

A

A variable whose variation does not depend on that of another (plotted on the x-axis)

34
Q

Define dependent variable

A

A variable whose value depends on that of another (plotted on the y-axis)

35
Q

Define sample

A

A smaller number of items from the population

36
Q

What’s the problem with gathering a bigger sample?

A

It’s more costly and time-consuming

37
Q

What assumptions do you make when using the capture-recapture method? (4)

A
  • The population hasn’t changed - no members have entered or left the population and there have been no births or deaths between the release and recapture times
  • The probability of being caught is equal for all individuals
  • Marks (or tags) have not come off
  • The sample size is large enough to be representative of the population
38
Q

Petersen capture-recapture method

A
  • Capture a sample of the population
  • Mark each item
  • Put the items back into the population and ensure they’re thoroughly mixed
  • Take a second sample and count how many are marked. The second sample should be taken long enough to ensure that the items are
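The card stops at the counting step. The standard Petersen estimate that follows from it sets the proportion marked in the second sample equal to the proportion marked in the whole population, giving N = (M × n) ÷ m. The Python sketch below uses that formula. The function name, argument names and numbers are hypothetical.

```python
def petersen_estimate(marked_first, second_sample_size, marked_in_second):
    """Estimate population size N from N = (M x n) / m.

    M = number marked and released in the first sample
    n = size of the second sample
    m = number in the second sample that are marked
    """
    if marked_in_second == 0:
        raise ValueError("no marked items recaptured, so N cannot be estimated")
    return marked_first * second_sample_size / marked_in_second

# Hypothetical pond: 40 fish marked, then 50 caught of which 10 are marked.
estimate = petersen_estimate(40, 50, 10)
print(estimate)  # 200.0
```

The estimate is only as good as the assumptions it rests on: a closed population, equal catchability, marks that stay on and a large enough sample.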
39
Q

Advantages of an interview (4)

Disadvantages of an interview (5)

A

ADV:

  • Interviewer can explain questions
  • Interviewer can put people at their ease when answering personal questions
  • Respondent can explain answers
  • High response rate - every person interviewed answers the questions

DISADV:

  • Respondents may be less honest in an interview and less likely to answer personal questions
  • Interviewing can take a long time, so can be expensive
  • Sample size is smaller than for a questionnaire
  • Interviewer bias - interviewer may interpret answers to suit their own opinions
  • Respondents may try to impress the interviewer, or guess the answers the interviewer wants to hear
40
Q

Advantages of an anonymous questionnaire (4)

Disadvantages of an anonymous questionnaire (3)

A

ADV:

  • Respondents are more likely to be honest and more likely to answer personal questions
  • Respondents can all complete the questionnaire at the same time, or in their own time, so can be quick and cheap
  • Easy to send questionnaires to a large and representative sample
  • No interviewer bias

DISADV:

  • Respondent may not understand the questions
  • Researcher may not understand the respondent’s answers
  • Lower response rate - some people may not answer all the questions or return the questionnaire
41
Q

What is a lab experiment?

A

An experiment conducted in a controlled environment (not necessarily a lab).

42
Q

Advantages of a lab experiment (2)

Disadvantage of a lab experiment (1)

A

ADV:

  • Easy to replicate because you can copy the experiment exactly
  • You can control extraneous variables

DISADV:
  • Test subjects may behave differently under test conditions than they do in real life

43
Q

What is a field experiment?

A

An experiment carried out in test subjects’ everyday environment. The researcher sets up the situation and controls one or more variables.

44
Q

Advantage of a field experiment (1)

Disadvantages of a field experiment (2)

A

ADV:
  • Test subjects are more likely to reflect real-life behaviour

DISADV:

  • You can’t control extraneous variables
  • Harder to replicate the experiment exactly
45
Q

What is a natural experiment?

A

An experiment carried out in test subjects’ everyday environment, where the researcher has no control over any variables.

46
Q

Advantage of a natural experiment (1)

Disadvantages of a natural experiment (2)

A

ADV:
  • Test subjects are more likely to reflect real-life behaviour

DISADV:

  • You can’t control any variables
  • Harder to replicate the study exactly
47
Q

What is an extraneous variable?

A

A variable that you are not interested in but could affect the results of your experiment

48
Q

If replicating an experiment gives very similar data, what does this show?

A

That the data is likely to be valid and reliable

49
Q

Disadvantage of using open questions in a questionnaire

A

Every respondent could give a different answer, so it can be difficult to summarise and analyse the answers

50
Q

Disadvantage of opinion scales in a closed question questionnaire

A

Most people will answer somewhere near the middle. They are unlikely to indicate a strong opinion either way as they do not wish to seem extreme

51
Q

Problems to look for in questionnaires (5)

A
  • Boxes that do not cover all possibilities
  • Boxes that cover one option more than once
  • Biased questions that try to persuade you to agree
  • Questions that people are unlikely to answer honestly
  • Open questions that allow for personal opinions and do not have tick boxes where closed questions would be better
52
Q

Things to do when designing a questionnaire (6)

A
  • Keep questions short and use simple language
  • Avoid biased or ‘leading’ questions that suggest a particular answer
  • Give intervals that do not overlap
  • Make sure options cover all possibilities, including ‘0’, ‘never’, ‘don’t know’ or ‘other’
  • Include a time frame in questions e.g. in the last week
  • Avoid questions that respondents are unlikely to answer honestly
53
Q

What is a pilot survey?

A

A small-scale trial of a survey, conducted on a small sample to test its design and methods. It is useful because it lets you check for unforeseen problems before running the full survey

54
Q

What is an outlier/anomaly?

A

A value that does not fit the pattern of the data

55
Q

What is cleaning data? (3)

A
  • Identifying and either correcting or removing inaccurate data values (caused by recording or other errors) or extreme values
  • Removing units or other symbols from data
  • Deciding what to do about missing data
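A minimal Python sketch of two of these steps: removing units from recorded values and flagging missing entries. The function name and the raw readings are invented for the example, and dropping the missing values at the end is just one possible decision.

```python
import re

def clean_value(raw):
    """Return the number in raw as a float, or None if nothing usable is there."""
    match = re.search(r"-?\d+(?:\.\d+)?", raw)  # first number in the text
    return float(match.group()) if match else None

# Hypothetical raw readings with units, a blank and a non-answer.
raw_values = ["12kg", "15 kg", "", "14kg", "n/a"]
cleaned = [clean_value(raw) for raw in raw_values]
print(cleaned)  # [12.0, 15.0, None, 14.0, None]

# One way to deal with missing data: leave those entries out.
usable = [value for value in cleaned if value is not None]
print(usable)  # [12.0, 15.0, 14.0]
```

Inaccurate or extreme values would still need checking by eye or against the source, since the code cannot tell a typing error from a genuine reading.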
56
Q

What is a hypothesis?

A

An idea that can be tested by collecting and analysing data

57
Q

What do you need to consider when testing a hypothesis (designing an investigation)? (8)

A
  • How long it will take
  • How much it will cost
  • Ethical issues
  • If people will answer sensitive questions
  • If you can get the data locally, cheaply and in a short time frame
  • How to select your population and sample
  • How to deal with non-response
  • How to deal with unexpected results
58
Q

What is a questionnaire?

A

A set of questions designed to obtain data

59
Q

What is a control group?

A

A group selected randomly from the population that is not subject to any of the factors under investigation

60
Q

What causes outliers?

A

Human error, machine error, or a genuinely unusual value