Statistics - Chapter 1 - Data Collection Flashcards

1
Q

Population

A

The whole set of items that are of interest. Information can be obtained from a population.

2
Q

Raw data

A

Unprocessed information

3
Q

Data vs information

A

Data: collection of raw unorganised facts
Information: collection of processed, organised facts placed into context

4
Q

Census
- Observes or measures every member of the population
- Pros: should give a completely accurate result
- Cons: time-consuming and expensive; cannot be used when the testing process destroys the item; hard to process such a large quantity of data

A

“Testing process will destroy…, so a census would destroy all the…”

5
Q

Sample

A

Selection of observations taken from a subset of the population, used to find out information about the population as a whole
Pros: less time-consuming and less expensive than a census; fewer people have to respond, so there is less data to process than in a census
Cons: data may not be as accurate; sample may not be large enough to give information about small sub-groups of the population

6
Q

How the size of the sample can affect the validity of any conclusions drawn
- size depends on the required accuracy and the available resources
- the larger the sample, the more accurate it is and the more accurate the predictions, but the greater the resources needed
- if the population is very varied, a larger sample is needed than if the population were uniform
- because of natural variation in the population, different samples can lead to different conclusions

A

“they could take a larger sample, for example… this would give a better estimate of the overall proportion of…”
“full coverage”
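
A rough simulation can illustrate why a larger sample gives a better estimate of a proportion. This is only a sketch: the population (1000 items, 300 of them with the property of interest), the function name `spread_of_estimates`, the number of trials and the seed are all made up for illustration.

```python
import random
import statistics

def spread_of_estimates(population, sample_size, trials=500, seed=0):
    """Standard deviation of the sample proportion over repeated random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        estimates.append(sum(sample) / sample_size)   # proportion of 1s in this sample
    return statistics.pstdev(estimates)

# Hypothetical population: 1 = has the property, 0 = does not.
population = [1] * 300 + [0] * 700

small = spread_of_estimates(population, 10)
large = spread_of_estimates(population, 200)
census = spread_of_estimates(population, len(population))

print("spread with n = 10: ", small)
print("spread with n = 200:", large)
print("spread with a census:", census)
```

The estimates from samples of 200 vary much less from sample to sample than those from samples of 10, and a census (full coverage) gives the same answer every time.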

7
Q

Sampling units

A

Individual units of a population, e.g. a university student or a house. Sampling units are often individually named or numbered to form a list called a sampling frame (the list of units a sample can be drawn from), e.g. a list of university students, a numbered list of the houses in the locality, a phone book, a map or the electoral roll

8
Q

Random sampling
- every member of the population has an equal chance of selection
- sample should be representative of the population
- helps to remove bias

A

simple random sampling, systematic sampling, stratified sampling

9
Q

Simple random sampling
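- every sample of the same size has an equal chance of being selected
- Pros: free of bias; easy and cheap to implement for small populations and small samples; each sampling unit has a known and equal chance of selection
- Cons: sampling frame needed; not suitable when the population or sample is large (time-consuming, expensive, potentially disruptive)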

A
  • obtain a sampling frame
  • allocate each member in the frame a unique number from 1 to the population size
  • choose n of these numbers at random, where n is the sample size
  • generate the numbers with a random number generator (calculator or computer) or a random number table, or use lottery sampling (members are written on tickets and placed in a hat, and the required number of tickets is drawn out)
  • go back to the population and select the members corresponding to the generated numbers
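
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample`, the 50-member frame and the seed are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n members from the sampling frame, each sample of size n equally likely."""
    rng = random.Random(seed)
    # Members are numbered 1..population size; draw n distinct numbers at random.
    numbers = rng.sample(range(1, len(frame) + 1), n)
    # Go back to the frame and select the members with those numbers.
    return [frame[k - 1] for k in numbers]

# Hypothetical sampling frame of 50 numbered members.
frame = [f"member {i}" for i in range(1, 51)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

`random.sample` draws without replacement, so no member can be chosen twice.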
10
Q

random number table

A
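  • assign each member a unique identifier with the same number of digits, e.g. 3-digit, so 000, 001…
  • work along the rows of the random number table, reading off 3-digit numbers
  • ignore repeats and any number that does not match an identifier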
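
Reading along a row of a random number table can be mimicked as below. This is a sketch under stated assumptions: the row of digits is invented for the example (not taken from a real table), the helper name `read_table` is hypothetical, and members are assumed to be labelled 000 up to one less than the population size.

```python
def read_table(digits, n, population_size, width=3):
    """Read width-digit numbers along a row, skipping repeats and unused labels."""
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        number = int(digits[start:start + width])
        # Keep the number only if it is a valid label and not already chosen.
        if number < population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == n:
            break
    return chosen

# Made-up row of digits, spaces removed so it can be read in blocks of three.
row = "86132 84550 03128 31005 06612".replace(" ", "")

# Population of 400 labelled 000-399, sample of 4.
print(read_table(row, 4, 400))  # → [328, 3, 128, 310]
```

Here 861 and 455 are skipped because no member has those labels.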
11
Q

Systematic sampling
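- required elements are chosen at regular intervals from an ordered list
- Pros: simple and quick to use; suitable for large samples and large populations
- Cons: sampling frame needed; can introduce bias if the frame is not in random order, e.g. a list that alternates MFMF, since patterns in the sample data might occur when taking every kth person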

A

- allocate each member a number from 1 to the population size
- calculate the interval: population size ÷ sample size
- use a random number generator to select the first person from 1 to the interval calculated
- “Select every (interval calculated)th person thereafter.”
e.g. with an interval of 5, if the first person chosen at random is number 2, the remaining ones would be 7, 12, 17, etc.
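
The same procedure as a short sketch. The function name `systematic_numbers` and its parameters are assumptions for illustration; the interval is taken as population size divided by sample size, rounded down.

```python
import random

def systematic_numbers(population_size, n, start=None, seed=None):
    """Return the numbers of the members chosen by systematic sampling."""
    k = population_size // n  # sampling interval
    if start is None:
        # Random starting point between 1 and the interval, inclusive.
        start = random.Random(seed).randint(1, k)
    # Take every kth member from the starting point onwards.
    return [start + i * k for i in range(n)]

# Population of 20, sample of 4, so the interval is 5; starting at person 2.
print(systematic_numbers(20, 4, start=2))  # → [2, 7, 12, 17]
```

Leaving `start` unset picks the first person at random, which is what makes the method a form of random sampling.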

12
Q

Stratified sampling
- population is divided into mutually exclusive strata, e.g. females and males, and a random sample is taken from each
- Pros: sample accurately reflects the population structure; proportional representation of groups within the population is guaranteed
- Cons: population must be clearly classified into distinct strata; selection within each stratum has the same disadvantages as simple random sampling

A

number sampled in a stratum = (stratum size ÷ population size) × required overall sample size

e.g. working-out layout
cricket: 121/370 × 30 = 9.8 ≈ 10
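
The calculation can be checked with a short script. Only the cricket figures (121 out of 370, overall sample of 30) come from the card; the "other" stratum simply lumps the remaining 249 members together, and the function name `stratum_sample_sizes` is made up for this sketch.

```python
def stratum_sample_sizes(strata_sizes, sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    total = sum(strata_sizes.values())
    return {
        name: round(size / total * sample_size)
        for name, size in strata_sizes.items()
    }

# "other" is a hypothetical stratum holding everyone not in the cricket group.
strata = {"cricket": 121, "other": 249}
print(stratum_sample_sizes(strata, 30))  # → {'cricket': 10, 'other': 20}
```

With more strata, the rounded values do not always add up to the required sample size, so the total should be checked and adjusted if necessary.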

13
Q

Quota sampling
- interviewer selects a sample that reflects the characteristics of the whole population. The population is divided into groups according to a given characteristic, and the size of each group determines the proportion of the sample that should have that characteristic. The interviewer meets people, assesses which group they belong to and allocates them to the appropriate quota, continuing until all quotas are filled.
- Pros: allows a small sample to still be representative of the population; no sampling frame required; quick and easy; allows easy comparison between different groups in the population
- Cons: non-random, so can introduce bias; population must be divided into groups, which can be costly or inaccurate; increasing the scope of the study increases the number of groups, which adds time and expense; non-responses are not recorded

A

Maddison has a list of 210 pupils, and wants to find out which musical instrument they prefer listening to amongst the flute, the clarinet, the guitar and the saxophone. To take a sample of size 30, Maddison surveys the first 15 girls and the first 15 boys to arrive at the school.

Non-responses elaboration: people who refuse to participate or cannot be reached are not included in the sample and are not recorded, which can affect how representative the sample is and potentially introduce bias.
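
Filling quotas in order of arrival, as in the example above, might look like the sketch below. The function name `quota_sample`, the names and the small quotas of 2 are invented for illustration; the example on the card uses quotas of 15 girls and 15 boys.

```python
def quota_sample(arrivals, quotas):
    """Take people in the order they are met until every quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are left unchanged
    sample = []
    for person, group in arrivals:
        # Only accept the person if their group's quota still has space.
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample

# Hypothetical arrivals: (name, group) in order of arrival.
arrivals = [("A", "girl"), ("B", "boy"), ("C", "girl"), ("D", "girl"), ("E", "boy")]
print(quota_sample(arrivals, {"girl": 2, "boy": 2}))  # → ['A', 'B', 'C', 'E']
```

D is skipped because the quota for girls is already full, which shows why the method is non-random: who gets in depends on who happens to turn up first.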

14
Q

Opportunity/convenience sampling
- taking the sample from people who are available at the time the study is carried out and who fit the criteria
- Pros: easy to carry out, cheap
- Cons: unlikely to provide a representative sample; highly dependent on the individual researcher (time, place)

A

“sample is likely to be biased towards … who …”
“improve by interviewing people at different locations and times, and by increasing the sample size”

15
Q

types of data

A

quantitative: associated with numerical observations
qualitative: associated with non-numerical observations
continuous variable: can take any value in a given range e.g height or time
discrete variable: can take only specific values in a given range, e.g. number of people (cannot be 2.65)

16
Q

when data presented in a grouped frequency table

A
  • groups = classes
  • specific data values are not shown
  • class boundaries: max and min values belonging in each class
  • midpoint = average of the class boundaries
  • class width = difference between upper and lower class boundaries
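
The midpoint and class width definitions can be checked on a single class. The boundaries 10 and 20 and the function name `class_summary` are made up for this sketch.

```python
def class_summary(lower, upper):
    """Midpoint and width of a class from its lower and upper class boundaries."""
    return {
        "midpoint": (lower + upper) / 2,  # average of the class boundaries
        "width": upper - lower,           # upper boundary minus lower boundary
    }

# Hypothetical class with boundaries 10 and 20.
print(class_summary(10, 20))  # → {'midpoint': 15.0, 'width': 10}
```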