C8 Flashcards

1
Q

What are sampling units and elements?

A

The elements of a sample (e.g. people / organisations) may be contained within a sampling unit. For example, a household is a sampling unit and an individual within it is a sampling element. If individuals are being sampled directly, they are both the sampling unit and the sampling element.

2
Q

What does developing a sampling plan involve? What is its purpose?

A

Selecting, without bias and with as much precision as resources allow, the sampling items / elements from which or whom you wish to collect data. Drawing up a sampling plan includes:

Defining target population
Choosing an appropriate sampling technique
Deciding on the sample size
Preparing sampling instructions

When deciding on the sample and how to select it, you must take into account: the aims & objectives of the research (what you want to find out and how it will be used); the nature of the target population and how to identify it (availability and / or selection of a sample frame or source); how its members can be reached; and how much of your time and budget can be dedicated to it.

3
Q

What is defining the population? Which criteria are used for businesses and for households / people? UoE?

A

Universe of enquiry = the people, organisations, events or items that are relevant to research problem

Any flaws in the definition of the population will mean flaws in the sample drawn from it. Criteria used in defining population:

Business:

Type of organisation 
Geographic area
Market or industry sector
Size of organisation
Type of experience / time 
Type of department / office within the organisation
Job title / role / responsibilities of employee
Type of experience of an employee

Households and people:

Geographic area
Demographic profile
Geodemographic profile
Time
Type of experience / time
4
Q

What is the relationship between target population and survey population? CE

A

For practical reasons the target population (from which results are required) can differ from the survey population (the population actually covered by the survey). E.g. people / orgs in remote / difficult to reach destinations on islands may not be included in a F2F survey population.

For this reason, it is important that the distinction is made in all documents relating to the research, e.g. non-internet users within a representative GB survey. A mismatch of this kind is known as coverage error - an error in which the sampling approach does not deliver a representative sample of the target population.

5
Q

What is the difference between a census and a sample?

A

Census - collecting data from every member or element of the population. Sample - collecting data from a representative sub-set of it.

Censuses are time and resource heavy for large populations, but may be necessary, for example, for staff surveys on ways of working. Levels of non-response can mean that census results are less representative than those from a well-designed sample. Administrative, field and data processing resources that are stretched to their limits are also more prone to errors in handling data and admin during and after the survey.

The argument for a well-designed sample rests on the practical issue of the time and cost involved in administering it, and on the methodological ability of a sample to be representative of the population. By representative we mean that the results of the sample are similar to those that would be achieved with a population census.

6
Q

What are the two main types of sampling techniques? What do they mean?

A

Random or probability sampling - each element within the population has a known chance of being selected, and the person selecting has no influence on the elements selected. Conditions needed to select a representative sample:

Sample size must be at least 100
Population should be homogeneous / well mixed, if it is not (stratified / layered in any way) a simple random selection may not deliver a truly representative sample
Sampling frame must be complete, accurate and up-to-date
Non-response must be zero - everyone selected must take part

Realistically not all conditions may hold, leading to concepts of sampling error, standard error and confidence intervals.

Purposive or non-probability sampling - the probability of each element being selected is not known, as the person selecting the sample may consciously or unconsciously favour / select particular elements.

In qual research statistical representativeness does not apply due to the small sample sizes involved, but representativeness is still an important goal.

7
Q

How should you choose a sampling technique?

A

For qual which involves small sample sizes, non-probability techniques are normally the most suitable e.g. theoretical / judgement sampling / lurk and grab / list sampling / snowball sampling and piggy-backing / multi-purposing. This will be influenced by methodological issues such as nature and aims of study, practical concerns including nature and accessibility of study population, availability of suitable sampling frame and constraints of time and budget.

If research aims are exploratory and non-conclusive (it is not necessary to obtain highly accurate estimates of population characteristics in order to make inferences about the population), then a non-probability sample is appropriate. If it is necessary to obtain measurements from samples of known accuracy or precision in order to make statistical inferences or generalisations, then a probability sample should be used.

When there is little variation within a population (the population is homogeneous), a non-probability sample can be effective in achieving a representative sample; where there is a great deal of variability in the population, a random sample is likely to be more effective.

If there is no suitable sampling frame from which to select a sample, random methods are not feasible.

8
Q

How do you choose a sample size? What impacts this decision?

A

Sample size - number of elements that will be included in sample

In exploratory research the sample size may be relatively small in comparison to one used in a conclusive study (as the latter intends to provide precise estimates of population characteristics).

Conclusive evidence may be needed to compare one group against another, meaning that sample sizes need to be large enough to provide a specified degree of confidence. This applies to sub-groups too.

The most important factors may be the time, resources and budget available; the importance of the decisions that rest on the results; the need to look at and compare findings between groups; and whether multivariate statistical techniques, which have implications for sample size, will be used.

9
Q

What does preparing sampling instructions include?

A

Drawing up a sampling plan includes:

Definition of target / study population
Sample size required
Sampling method to be used, including the way in which units and elements are to be selected
Details of sampling frame, if one is available

10
Q

How do you check that the sample has been achieved?

A

As fieldwork progresses the sample should be monitored to ensure units and elements selected meet the sample criteria, as well as when fieldwork is completed. In the event discrepancies are found (high rates of non-response, under / over rep of particular elements) it will be necessary to address them (via either further fieldwork or statistical manipulation).

11
Q

Brief sample report?

A

There should be a brief sample report included that details key information about sample planned and sample achieved. This should also include a definition of sample, how it was drawn, gross sample, quality checks made and drop-out rate. Where appropriate copy of invitation or contact text should be included.

This is useful for future users of the research to assess suitability and quality; and to those that want to repeat the research. Serves as a validation check on the representativeness of the sample.

12
Q

Population parameters?

A

Population parameters are values that describe a population / sub set of a population (e.g. its mean). They are the true values and are nearly always unknown, which is why they are estimated from samples.

13
Q

Sample statistic?

A

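A sample statistic is a value derived from a sample (e.g. the sample mean), used as an estimate of the corresponding population parameter.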

14
Q

Probability statement?

A

Findings provided by samples are estimates of the population values, and statements based on findings are always probability statements - claims cannot be made about the value of population parameters based on sample data with absolute certainty.

15
Q

Sampling distribution of the mean?

A

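The sampling distribution of the mean is the distribution of the sample means that would be obtained from repeated samples drawn from the same population; its mean is the mean of the population from which the items are sampled. A graph of the sampling distribution of the mean typically shows that each sample from a population does not produce the same value.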

16
Q

Standard error of the mean

A

The standard error of the mean is the measure of variability within the sampling distribution - the variability or spread in the values of the measures we take from each sample, and sets out to measure accuracy between the sample measure and the true population value.

This is used to measure probable accuracy or precision of a particular sample estimate. To work out the SEOTM we need to know the standard deviation of the population (S) and size of the sample (n). We are unlikely to know the value of the standard deviation of the population, so in its place we use the standard deviation of the sample (s).

Precision of sample estimate depends on sample size and variability of the sample.
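
A minimal sketch of the calculation, assuming the usual formula (standard error = s / square root of n), which the card describes but does not write out. The scores and the function name are made up for illustration.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(n)

# Hypothetical satisfaction scores from a sample of 8 respondents.
scores = [6, 7, 5, 8, 7, 6, 9, 8]
se = standard_error_of_mean(scores)
print(f"mean = {statistics.mean(scores):.2f}, standard error = {se:.2f}")
```

Because n sits under a square root, a sample with the same spread needs to be four times larger to halve the standard error.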

17
Q

Sampling variability

A

Each sample from a population does not produce the same value. This variation is sampling variability. Ordinarily we only take one sample and estimate the population value.
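
Sampling variability can be seen by simulation. The sketch below is illustrative only: the population of ages is invented, and in real research only one sample would normally be drawn.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 ages, invented for illustration.
population = [random.randint(18, 80) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sample_means(population, sample_size, n_samples):
    """Draw repeated simple random samples and return the mean of each."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(population, sample_size=100, n_samples=500)

print(f"population mean: {population_mean:.2f}")
print(f"lowest / highest sample mean: {min(means):.2f} / {max(means):.2f}")
print(f"mean of the sample means: {statistics.mean(means):.2f}")
print(f"spread (st. dev.) of the sample means: {statistics.stdev(means):.2f}")
```

The individual sample means differ from one another, but they cluster around the population mean; the spread of those means is what the standard error of the mean estimates.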

18
Q

Confidence intervals

A

A confidence interval is a way to express how precisely a sample statistic estimates the true value in the population you are studying.
The sampling distribution of the mean closely resembles a normal distribution (central value with no bias left or right, bell-shaped curve); the larger the sample, the closer the resemblance.

Standard deviation tells you how spread out the data is. It is a measure of how far each observed value is from the mean. In a normal distribution, about 95% of values will be within 2 standard deviations of the mean.

You can calculate a CI for any confidence level you like, but the most commonly used value is 95%. A 95% confidence interval is a range of values (upper and lower) that you can be 95% confident contains the true mean of the population (the sample mean plus or minus roughly 2 standard errors).
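
A rough sketch of a 95% confidence interval for a mean, assuming the normal approximation (multiplier 1.96, the more exact version of "roughly 2"). The spend figures and function name are hypothetical, and for a sample this small a t-distribution multiplier would normally be used instead.

```python
import math
import statistics

def confidence_interval_95(values):
    """Return (lower, upper) for an approximate 95% CI around the sample mean."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = 1.96  # normal-distribution multiplier for 95% confidence
    return mean - z * se, mean + z * se

# Hypothetical spend per visit (in pounds) from a sample of 10 customers.
spend = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
lower, upper = confidence_interval_95(spend)
print(f"95% CI for the mean: {lower:.2f} to {upper:.2f}")
```

A larger sample or a less variable one shrinks the standard error, which narrows the interval.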

19
Q

Significance levels

A

Significance levels are the level of probability at which you can accept that a difference is statistically significant or real (not due to chance).

20
Q

Probability or random sampling methods?

A

Each member of the population has a known and non-zero chance of being selected.

21
Q

Simple random sampling?

A

Each member of the population has an equal likelihood of being selected.

There are 2 ways of selecting a simple random sample: place all numbers in a drum, mix them thoroughly and draw at random; or number the population and select numbers using a random number table or random numbers generated by a computer program.
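
The computer-generated route might look like the sketch below. The sampling frame of numbered customer IDs and the function name are assumptions for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Select n elements without replacement, each with an equal chance."""
    rng = random.Random(seed)  # a seed is only used to make the draw repeatable
    return rng.sample(population, n)

# Hypothetical sampling frame: customer IDs numbered 1 to 500.
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=20, seed=1)
print(sorted(sample))
```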

22
Q

Systematic random sampling?

A

Systematic random sampling is a variation of simple random sampling where items in the population are numbered 1 to N and arranged in random order. Population size (N) is divided by the size of sample needed (n) to give the sampling interval (k). Starting from a randomly chosen item within the first interval, every kth (N/nth) item is then chosen.

This method will produce results similar to those of simple random sampling if the list is truly randomised. Systematic sampling can also produce a good spread across the sample, if desired, by ordering the list, e.g. by grades received in exams. Problems can arise if the list is sub-divided into categories in a repeating pattern, e.g. alternating users and non-users.

The sampling methods prior to this are only possible if list of target population is available.
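
A minimal sketch of systematic selection, assuming the interval is k = N / n rounded down and the start is picked at random within the first interval. The numbered frame and the function name are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, where k = N // n."""
    N = len(frame)
    k = N // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting point within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 1,000 numbered items; n = 50 gives k = 20.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=50, seed=7)
print(sample[:5])
```

If the frame alternated users and non-users, an even interval such as 20 would pick only one of the two groups, which is the periodicity problem noted above.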

23
Q

Stratified random sampling ?

A

Stratified random sampling allows researchers to obtain a sample population that best represents the entire population being studied.
Stratified random sampling involves dividing the entire population into homogeneous groups called strata by stratification factor e.g. grade or age
Stratified sampling is used to highlight differences between groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.

In a proportionate stratified sampling method the sample size of each stratum is proportionate to the population size of each stratum e.g. (sample size/population size) x stratum size

If particular strata need to be over / under represented in order to create a robust sub-sample for analysis in sample then disproportionate allocation may be used e.g. if examination of low incidence groups is needed.
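
Proportionate allocation can be sketched as below, using the formula (sample size / population size) x stratum size. The age-band strata and their sizes are invented for illustration, and the function names are hypothetical.

```python
import random

def proportionate_allocation(strata_sizes, sample_size):
    """Sample size per stratum: (sample size / population size) x stratum size."""
    population_size = sum(strata_sizes.values())
    # Rounding can make the total differ slightly from sample_size.
    return {
        name: round(sample_size * size / population_size)
        for name, size in strata_sizes.items()
    }

def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample within each stratum, sized proportionately."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportionate_allocation(sizes, sample_size)
    return {
        name: rng.sample(members, allocation[name])
        for name, members in strata.items()
    }

# Hypothetical population of 10,000 people split into three age bands.
strata = {
    "18-34": [f"A{i}" for i in range(3000)],
    "35-54": [f"B{i}" for i in range(4500)],
    "55+": [f"C{i}" for i in range(2500)],
}
allocation = proportionate_allocation({k: len(v) for k, v in strata.items()}, 400)
sample = stratified_sample(strata, sample_size=400, seed=5)
print(allocation)
```

For disproportionate allocation the per-stratum sizes would be set by hand instead (for example, boosting a low incidence group), with results weighted back at the analysis stage.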

24
Q

Cluster and multi-stage sampling process?

A

In cluster sampling, researchers divide a population into smaller groups known as clusters. They then randomly select among these clusters to form a sample.

Step 1: Define your population
Step 2: Divide your population into clusters
Each cluster’s population should be as diverse as possible. You want every potential characteristic of the entire population to be represented in each cluster.
Each cluster should have a similar distribution of characteristics as the distribution of the population as a whole.
Taken together, the clusters should cover the entire population.
There should not be any overlap between clusters (i.e. the same people or units do not appear in more than one cluster).
Ideally, each cluster should be a mini-representation of the entire population. However, in practice, clusters often do not perfectly represent the population’s characteristics, which is why this method provides less statistical certainty than simple random sampling.
Step 3: Randomly select clusters to use as your sample
If each cluster is itself a mini-representation of the larger population, randomly selecting and sampling from the clusters allows you to imitate simple random sampling, which in turn supports the validity of your results.
Conversely, if the clusters are not representative, then random sampling will allow you to gather data on a diverse array of clusters, which should still provide you with an overview of the population as a whole

25
Q

Cluster and multi-stage sampling +/-, uses

A

In multi-stage clustering, rather than collect data from every single unit in the selected clusters, you randomly select individual units from within the cluster to use as your sample.

This process is cost-effective when compared with simple / systematic random sampling, where the sample may be widely spread, and the interviewer travel time needed is typically less. Standard error is greater than with simple / systematic random sampling, because sampling error is introduced at each stage of multi-stage sampling, and therefore sample estimates may be less precise.
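
A simple two-stage version can be sketched as follows: clusters are selected at random first, then units are selected at random within each chosen cluster. The postcode areas, household IDs and function name are all hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select units in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical frame: 20 postcode areas, each containing 50 household IDs.
clusters = {
    f"area_{a:02d}": [f"area_{a:02d}-hh_{h:02d}" for h in range(50)]
    for a in range(20)
}
sample = two_stage_cluster_sample(clusters, n_clusters=4, n_per_cluster=10, seed=3)
for area, households in sample.items():
    print(area, len(households))
```

Fieldwork is then concentrated in 4 areas rather than spread across all 20, which is where the saving in travel time comes from.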

26
Q

Missing elements? how to combat?

A

Elements that belong to the population but do not exist in the sampling frame. Incomplete sampling frames will mean samples derived will not be representative of population.

One way to mitigate this is to look for another source of information about the same population to compare / combine the two.

27
Q

Clusters of elements (missing elements)?

A

A sampling frame may list elements not as individuals but as groups / clusters of elements, e.g. individuals living at the same address.

Options to mitigate:

Include all elements from cluster in sample - drawback, elements in same cluster may produce similar attitudes, characteristics
Choose element at random from cluster - drawback, all elements of population do not have equal chance of selection
Take sample of all clusters in the sampling frame, list all elements of each one and take random sample from list - cautions, need to take a large enough sample of clusters and an appropriate sampling interval to ensure that each of the elements in the final sample comes from a different cluster

28
Q

Blanks or foreign elements

A

A blank or foreign element is an element included in a sampling frame that does not belong there. The incidence of blank / foreign elements may be relatively high in a sampling frame that is out-of-date, e.g. elements that have died, retired, left the country or are otherwise not eligible to be considered part of the target population. Sampling frames may also cover a wider population than the population of interest, and so contain items irrelevant to the problem.

To deal with foreign / blank elements you should omit them and continue selecting sample units in the appropriate way. Substituting the next item on the list is not a suitable way of dealing with them, as the next item then has two chances of being selected (once in its own right and once as a replacement for a blank / foreign element).

29
Q

Duplicate elements?

A

Elements may be duplicated in a sampling frame, appearing more than once. Duplication is easy to deal with by using a de-duplication program.
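
A de-duplication step could be as simple as the sketch below. It assumes duplicates can be recognised by a matching key (here, an email address compared without regard to case); the addresses and function name are made up.

```python
def deduplicate(frame, key=lambda record: record):
    """Keep the first occurrence of each element, preserving list order."""
    seen = set()
    unique = []
    for record in frame:
        k = key(record)
        if k not in seen:
            seen.add(k)
            unique.append(record)
    return unique

# Hypothetical frame in which the same email appears with different casing.
frame = ["ann@example.com", "bob@example.com", "Ann@Example.com", "cy@example.com"]
clean = deduplicate(frame, key=str.lower)
print(clean)
```

In practice the hard part is choosing the key, since the same person or organisation may be listed under slightly different names or addresses.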

30
Q

What is non-response? How to combat it?

A

Non-response occurs when those included in the sample do not respond. This is important as it can lead to serious concerns about the representativeness of the sample and so the validity of the data. Respondents and non-respondents tend to differ, and thus the data will be biased.

Main causes of non-response are refusals and ‘not at homes’ / ‘non-contactables’. Refusal rates can be reduced by:

Good questionnaire design
Good research administration inc training and briefing of interviewers
Use of pre-notification
Engaging contact text and intros
Follow-ups
Use of appropriate incentives

Two main approaches to managing not at homes: varying times at which contacts are made (weekdays, weekends, day, evening) and making ‘call-backs’ or return visits. Non response can also be dealt with by providing substitutes or replacements for non-responder. Taking samples of non-responders (and using results to project to all non-responders) can help in understanding differences between respondents and non-respondents and the final sample may be adjusted accordingly.

31
Q

Why might you choose semi-random sampling?

A

Random sampling can be expensive, particularly in F2F surveys. Generating a sample and a detailed list of addresses for each interviewer to visit, and completing fieldwork (e.g. call-backs), can be time consuming and expensive. One way of reducing the time and cost involved without giving the interviewer greater discretion in selecting units of the sample (and thus introducing selection bias) is to use a semi-random sampling procedure known as random route sampling or random walk.

32
Q

Random route sampling or random walk?

A

One way of reducing the time and cost involved without giving the interviewer greater discretion in selecting units of the sample (and thus introducing selection bias) is to use a semi-random sampling procedure known as random route sampling or random walk. A list of random starting addresses is selected using a multi-stage stratified random sample, e.g. to ensure a mix of urban and rural locations or towns of varying size. Each interviewer is given a random starting address and a set of instructions for selecting subsequent addresses at which to interview.

As with random sampling methods, no substitutes for chosen subject are allowed and a number of call-backs may be necessary.