U1 - collecting data Flashcards

1
Q

quantitative data definition

A

numerical data (numerical observations or measurements)

2
Q

qualitative data definition

A

non-numerical data (non-numerical observations)

3
Q

continuous data definition with examples

how can this data be represented?

A

can take any value on a continuous numerical scale. when grouped, the class intervals are written with inequalities (for example 150 ≤ h < 160).

e.g.
- height
- weight
- temperature
- length

can be represented by:

  • histograms
  • cumulative frequency curves
  • line graphs
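
Grouping continuous data with inequalities can be sketched in Python as below. The heights and class intervals are made up for illustration; the point is only that each interval has the form lower ≤ h < upper, so a value on a boundary falls into exactly one class.

```python
# Hypothetical heights in cm, made up for illustration.
heights = [151.2, 158.9, 160.0, 163.4, 169.9, 170.0, 174.5]

# Class intervals of the form lower <= h < upper.
intervals = [(150, 160), (160, 170), (170, 180)]

# Count how many heights fall inside each interval.
frequencies = {
    (lower, upper): sum(lower <= h < upper for h in heights)
    for lower, upper in intervals
}

for (lower, upper), frequency in frequencies.items():
    print(f"{lower} <= h < {upper}: {frequency}")
```

Note that 160.0 is counted in 160 ≤ h < 170, not in 150 ≤ h < 160.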
4
Q

discrete data definition with examples

how can this data be represented?

A

can only take particular values. when grouped, the classes are written without inequalities (for example 0-4, 5-9).

e.g.
- the number of students in a class
- shoe size
- the number of languages an individual speaks

can be represented by:

  • bar charts
  • cumulative frequency step polygons
5
Q

categorical data definition with examples

how can this data be represented?

A

can be sorted into non-overlapping categories.

e.g.
- race
- sex
- age group

can be represented by:

  • frequency tables (ordinary frequency tables, relative frequency tables, cumulative frequency tables)
  • pie charts
  • bar charts
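
A frequency table with relative frequencies can be sketched in Python as below. The survey responses are invented for illustration, and `Counter` from the standard library is just one convenient way to do the tallying.

```python
from collections import Counter

# Hypothetical survey responses (age groups), made up for illustration.
responses = ["18-25", "26-40", "18-25", "41-60", "26-40", "18-25"]

counts = Counter(responses)   # frequency of each category
total = sum(counts.values())  # total number of responses

print(f"{'age group':<10}{'freq':>6}{'rel freq':>10}")
for category in sorted(counts):
    frequency = counts[category]
    # Relative frequency = frequency / total number of responses.
    print(f"{category:<10}{frequency:>6}{frequency / total:>10.2f}")
```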
6
Q

ordinal data definition with examples

how can this data be represented?

A

categorical data whose categories can be written in order or given a rating scale

e.g.
- spicy scale (plain, mild, medium, hot, extra hot)
- income level (low income, medium income, high income)
- satisfaction level (extremely dislike, dislike, neutral, like, extremely like)

can be represented by:

  • bar charts
  • pie charts
  • tables
7
Q

bivariate data definition with examples

how can this data be represented?

A

involves pairs of related data values. helps you study the correlation between two variables

e.g.
- how temperature affects the state of an ice cream (the two variables are temperature and the state of the ice cream)

can be represented by:
- scatterplots

8
Q

multivariate data definition with examples

how can this data be represented?

A

involves sets of 3 or more related data values. several variables are considered together, often to explain one outcome

e.g.
- predicting the weather (multiple factors like pollution, humidity, precipitation, etc)

can be represented by:
- radar charts

9
Q

population definition

A

everything or everybody that could possibly be involved in an investigation

10
Q

census definition

A

a survey of a whole population

11
Q

sample definition

A

a smaller number of items from the population

12
Q

biased sample definition

A

not representative of everyone in the population

13
Q

sampling frame definition

A

a list of people/items that are to be sampled

14
Q

advantages of primary data (3)

A
  • accurate
  • collection method is known (because it's your own)
  • you can find answers to specific questions
15
Q

disadvantages of primary data (2)

A
  • time consuming
  • usually expensive

16
Q

advantages of secondary data (4)

A
  • cheap
  • easy
  • quick
  • data from some organisations can be more reliable than data collected yourself
17
Q

disadvantages of secondary data (5)

A
  • method of collection is unknown
  • data may be out of date
  • may contain mistakes
  • may come from an unreliable source
  • may be difficult to find answers to specific questions
18
Q

advantages of a census (3)

A
  • unbiased
  • accurate
  • takes the entire population into account
19
Q

disadvantages of a census (4)

A
  • time consuming
  • expensive
  • lots of data to manage
  • difficult to ensure the whole population is used - if some are missed, the survey may be biased
20
Q

advantages of a sample (4)

A
  • cheaper
  • quicker
  • less data to consider
  • easier to get hold of all the required information
21
Q

disadvantages of a sample (2)

A
  • may be biased
  • may not be representative of the entire population - each possible sample will give different results, so the one selected might not accurately reflect the population
22
Q

impact of a sample size on reliability and replication

A

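the bigger the sample size, the better the estimate of the population parameters (such as the mean), so the results are more reliable and more likely to be similar if the investigation is repeated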
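
The effect of sample size can be illustrated with a small simulation. This is a sketch under made-up assumptions: the population is 10,000 invented heights, and the helper `spread_of_sample_means` is a hypothetical name. It takes many random samples of each size and measures how much the sample mean varies from one sample to the next.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 invented heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]
print(f"population mean: {statistics.mean(population):.2f} cm")


def spread_of_sample_means(sample_size: int, trials: int = 200) -> float:
    """Standard deviation of the sample mean over repeated random samples."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


for n in (10, 100, 1000):
    spread = spread_of_sample_means(n)
    print(f"sample size {n:4d}: sample means vary by about {spread:.2f} cm")
```

The spread should shrink as the sample size grows, which is what "a better estimate" means here: repeated samples agree more closely with each other and with the population mean.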

23
Q

what is the Petersen capture-recapture method?

A

a way of estimating the size of a population, usually dealing with wildlife.

24
Q

Petersen capture-recapture: population size formula

A

population size = (number in 1st sample × number in 2nd sample) / number in 2nd sample that are marked
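
As a minimal sketch, the formula can be written as a small Python function. The function name `petersen_estimate` and its parameter names are illustrative, and the numbers in the usage line are made up.

```python
def petersen_estimate(first_sample: int, second_sample: int, marked_in_second: int) -> float:
    """Estimate population size with the capture-recapture formula.

    population size = (number in 1st sample x number in 2nd sample)
                      / number in 2nd sample that are marked
    """
    if marked_in_second <= 0:
        # The formula divides by the number of marked recaptures,
        # so it gives no estimate when none are found.
        raise ValueError("need at least one marked item in the second sample")
    return first_sample * second_sample / marked_in_second


# Hypothetical numbers: 50 marked at first, 40 caught later, 5 of those marked.
print(petersen_estimate(50, 40, 5))  # prints 400.0
```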

25
Q

A fish farmer wants to estimate the size of his fish stocks. He nets 142 fish and marks them with a special ink. The fish are released back into the fish farm. A month later he nets 127 fish and finds that 6 of them are marked.

a) Estimate the size of the fish population at the fish farm.
b) What assumptions are made in obtaining this estimation?

A

a) Working out:
(142 × 127) / 6 = 18034 / 6 ≈ 3005.67
population size ≈ 3000 (to 2 s.f.)

b) Four possible answers:
- That the population does not change between capture and recapture.
- That the same sampling method is used for both samples.
- That capture and marking do not have an effect on the fish.
- That the proportion of marked fish in the recapture sample matches the proportion in the whole population. This is unlikely to be exactly true because the sample is random. There could just as easily have been any number from 1 to 10 marked fish.
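
To see how much the last assumption matters, the sketch below recomputes the estimate for every count of marked fish from 1 to 10, keeping the two sample sizes from the question. The variable names are illustrative.

```python
first_sample = 142   # fish netted and marked
second_sample = 127  # fish netted a month later

# Recompute the estimate for each possible count of marked fish from 1 to 10.
estimates = {
    marked: first_sample * second_sample / marked
    for marked in range(1, 11)
}

for marked, estimate in estimates.items():
    print(f"{marked:2d} marked -> estimate of about {estimate:.0f} fish")
```

With 5 or 7 marked fish instead of 6, the estimate moves to roughly 3600 or 2600, so a small change in the recapture count shifts the answer a lot.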

26
Q

To estimate the size of the population of caribou in a national forest in Canada, 100 caribou are trapped at different locations throughout the forest (capture), and tags are fitted to their ears. A week later another 100 caribou are trapped (recapture). It is found that 4 of these have tags on their ears. Estimate the population of caribou in the forest.

A

Working out:
(100 × 100) / 4 = 2500
population size = 2500

27
Q

why is it important to make sure the sample is as similar to the population as possible?

A

so that it is representative. otherwise, it may be biased, and conclusions about the population based on your sample may not be correct

28
Q

how to avoid sampling bias (3)

A
  • select from the correct population and make sure no member of the population is excluded
  • select your sample at random - if members are linked in some way, it can cause bias
  • make sure all your sample members respond
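
Selecting at random from a sampling frame can be sketched with Python's standard `random` module. The sampling frame here (ID numbers 1 to 30 for a class of students) is hypothetical, and the seed is set only so the example is repeatable.

```python
import random

# Hypothetical sampling frame: ID numbers for all 30 students in a class.
sampling_frame = list(range(1, 31))

rng = random.Random(42)  # seeded only so the example is repeatable

# Every member of the frame has the same chance of being chosen,
# and nobody can be chosen twice.
sample = rng.sample(sampling_frame, k=5)

print(sample)
```

Because every member of the frame is equally likely to be picked, nobody on the list is excluded from the start. The other two points still need separate care: the frame must cover the correct population, and all chosen members need to respond.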