GEA1000 Chapter 1 Flashcards

1
Q

PPDAC full form

A

Problem, Plan, Data, Analysis, Conclusion

2
Q

Population

A

The entire group of individuals or objects that we wish to know about

3
Q

Research question

A

Seeks to investigate some characteristic of a population

4
Q

Population of interest

A

The group about which we want to draw conclusions in a study

5
Q

Population parameter

A

Numerical fact about a population

6
Q

Census

A

An attempt to reach out to the entire population of interest

7
Q

Drawbacks of a census

A
  1. High cost of conducting one
  2. Takes a long time to complete, and some studies are time-sensitive
  3. A 100% response rate may not be achievable
8
Q

Sample

A

A subset (portion) of the population selected for the study

9
Q

Sampling frame

A

List from which the sample was obtained

10
Q

Conditions for generalisability

A
  1. The sampling frame must cover the population of interest, i.e. be equal to or larger than it
  2. There should be no bias when we obtain the sample (selection bias, non-response bias)
11
Q

Selection bias

A
  • Associated with the researcher’s biased selection of units into the sample
  • This can be caused by an imperfect sampling frame, which excludes some units from being selected
  • Can also be caused by non-probability sampling
12
Q

Non-response bias

A
  • Associated with participants’ non-disclosure or non-participation in the research study
  • This results in the exclusion of information from this group
  • E.g. inconvenience or unwillingness to disclose sensitive information
  • Can occur regardless of whether the sampling method is probabilistic or non-probabilistic

13
Q

Probability sampling

A

A sampling scheme such that the selection process is done via a known randomised mechanism

Every unit in the sampling frame has a known, non-zero probability of being selected, but this probability does not have to be the same for all units

14
Q

Simple random sampling

A
  1. Units are randomly selected from the sampling frame
  2. Every unit in the sampling frame has an equal chance of being selected
  3. Sampling without replacement
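
The three points above can be sketched in a few lines of Python. This is only an illustration: the frame of 100 unit IDs, the function name `simple_random_sample` and the seed are assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no unit can appear twice
    return rng.sample(frame, n)

frame = list(range(1, 101))  # hypothetical sampling frame: unit IDs 1..100
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Fixing the seed only makes the draw reproducible; leaving it as `None` gives a fresh sample each run.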
15
Q

Systematic sampling

A

A method of selecting units from a list by applying a selection interval k and a random starting point from the first interval

E.g. with a selection interval of k, a random starting point r is chosen where 1 <= r <= k
The units selected are then r, r+k, r+2k, ..., r+(n-1)k
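
A minimal sketch of this selection rule, assuming a hypothetical frame of 100 unit IDs and k = 10 (the function name `systematic_sample` is made up for the example):

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start r in 1..k, then take every k-th unit after it."""
    rng = random.Random(seed)
    r = rng.randint(1, k)  # random start with 1 <= r <= k (1-based position)
    # Positions r, r+k, r+2k, ... converted to 0-based list indexing
    return r, frame[r - 1::k]

frame = list(range(1, 101))  # hypothetical frame: unit IDs equal their positions
r, sample = systematic_sample(frame, k=10, seed=0)
print(r, sample)
```

With 100 units and k = 10 this always yields n = 10 units, namely r, r+10, ..., r+90.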

16
Q

Pitfalls of systematic sampling

A

May under-represent parts of the population, e.g. when the list has a periodic pattern that coincides with the selection interval k

17
Q

Stratified sampling

A
  • Sampling frame is divided into groups called strata
  • Units within a stratum share similar characteristics, but the strata are not necessarily the same size
  • We apply SRS to each stratum to generate the overall sample
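
The procedure above could be sketched as follows. The two strata ("Y1", "Y2"), their sizes and the per-stratum sample sizes are invented for illustration, and `stratified_sample` is a hypothetical helper name.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Split the frame into strata, then apply SRS within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name, units in strata.items():
        # SRS without replacement inside this stratum
        sample.extend(rng.sample(units, n_per_stratum[name]))
    return sample

# Hypothetical frame: 60 first-year and 40 second-year students
frame = [("Y1", i) for i in range(60)] + [("Y2", i) for i in range(40)]
sample = stratified_sample(frame, lambda u: u[0], {"Y1": 6, "Y2": 4}, seed=2)
print(sample)
```

Here each stratum contributes 10% of its units, but other allocations are possible.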
18
Q

Pitfalls of stratified sampling

A

Requires a sampling frame and criteria for classifying the population into strata

19
Q

Cluster sampling

A
  • Sampling frame is divided into clusters
  • A fixed number of clusters are then selected using SRS
  • All the units from the selected clusters are then included in the overall sample
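
As a rough sketch of these steps, assuming four made-up clusters (residential blocks) and a hypothetical helper `cluster_sample`:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters by SRS, then include every unit in them."""
    rng = random.Random(seed)
    # SRS over the cluster names, not over individual units
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical clusters of different sizes
clusters = {
    "Block A": ["a1", "a2", "a3"],
    "Block B": ["b1", "b2"],
    "Block C": ["c1", "c2", "c3", "c4"],
    "Block D": ["d1"],
}
chosen, sample = cluster_sample(clusters, 2, seed=3)
print(chosen, sample)
```

Note that the overall sample size depends on which clusters happen to be selected.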
20
Q

Advantages of cluster sampling

A

Less time-consuming and less costly
Clusters are usually naturally defined, so it is easy to classify a unit under a cluster

21
Q

Disadvantage of cluster sampling

A
  • Depending on which clusters are selected, we may see high variability in the overall sample if there are dissimilar clusters with distinct characteristics
  • If the number of clusters sampled is small, there is also a risk that the clusters selected will not be representative of the population
22
Q

Pitfalls of simple random sampling

A

Time-consuming; also depends on the accessibility of information and of a complete sampling frame

23
Q

Non-probability sampling

A

A sampling method in which the selection of units is not done by randomisation
There is no element of chance in determining which units are selected; it is down to human discretion

24
Q

Convenience sampling

A

Non-probability sampling method where a researcher chooses the subjects that are most easily available
* introduces selection bias
* introduces non-response bias

25
Volunteer sampling/Self-selected sampling
Subjects volunteer themselves into a sample
* Sample contains subjects who have a stronger opinion on the research question than the rest of the population
* Such a sample is unlikely to be representative of the population of interest
26
Independent variables
Those that may be subject to manipulation, either deliberately or spontaneously, in a study
27
Dependent variables
Those that are hypothesised to change depending on how the independent variable is manipulated in the study
28
Categorical variables
Variables that take on categories or label values
* Ordinal: there is some natural ordering, and numbers can be used to represent this ordering
* Nominal: there is no intrinsic ordering
29
Numerical variables
Variables that take on numerical values on which we can meaningfully perform arithmetic operations
* Discrete: there are gaps in the set of possible values the variable can take
* Continuous: the variable can take any numerical value in a given range or interval
30
If we add a constant c to every data point, how does the mean of the dataset change?
mean + c
31
If we multiply every data point by a constant c, how does the mean of the dataset change?
c * mean
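
Both rules for the mean can be checked numerically. The data set and the constant below are arbitrary values chosen for illustration.

```python
from statistics import mean

data = [2, 4, 6, 8]  # arbitrary example data
c = 3                # arbitrary constant

shifted = [x + c for x in data]  # add c to every data point
scaled = [x * c for x in data]   # multiply every data point by c

print(mean(data), mean(shifted), mean(scaled))
```

The second value should equal the first plus c, and the third should equal the first times c.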
32
Variance formula; standard deviation formula
var = ((x1 - mean)^2 + ... + (xn - mean)^2) / (n - 1)
standard deviation = sqrt(var)
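
A direct translation of the two formulas, as a sketch. The function names and the example data are assumptions; Python's built-in `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and are printed alongside for comparison.

```python
import math
import statistics

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_sd(xs):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # arbitrary example data
print(sample_variance(data), statistics.variance(data))
print(sample_sd(data), statistics.stdev(data))
print(sample_sd([3, 3, 3]))  # identical data points, so no spread
```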
33
When is the standard deviation 0?
When all data points are identical. In this case the variance is 0, so the standard deviation is also 0
34
When a constant c is added to every data point, how does the standard deviation change?
It does not change
35
When every data point is multiplied by a constant c, how does the standard deviation change?
Standard deviation is multiplied by |c|
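
A quick numerical check of both standard deviation rules, using arbitrary data and a negative constant so that the role of |c| is visible:

```python
from statistics import stdev

data = [1, 3, 5, 9]  # arbitrary example data
c = -2               # negative on purpose: the SD scales by |c|, not c

shifted = [x + c for x in data]
scaled = [x * c for x in data]

print(stdev(data), stdev(shifted), stdev(scaled))
```

The first two values should agree, and the third should be |c| times the first (a standard deviation is never negative).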
36
Coefficient of variation
Used to quantify the degree of spread relative to the mean
Formula: coefficient of variation = standard deviation / mean
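
A small sketch of the formula. The heights are made-up values, and `coefficient_of_variation` is a hypothetical helper name. Because the coefficient of variation is a ratio, it has no units, so re-expressing the same data in metres instead of centimetres should leave it unchanged.

```python
from statistics import mean, stdev

def coefficient_of_variation(xs):
    """Spread relative to the mean: standard deviation / mean."""
    return stdev(xs) / mean(xs)

heights_cm = [150, 160, 170, 180]        # made-up data
heights_m = [h / 100 for h in heights_cm]  # same data in metres

print(coefficient_of_variation(heights_cm))
print(coefficient_of_variation(heights_m))
```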
37
How does the median change when a constant c is added to every data point?
median + c
38
How does the median change if every data point is multiplied by c?
median * c
39
How does the IQR change if we add c to every data point?
No change
40
How does the IQR change if we multiply every data point by c?
IQR is multiplied by |c|
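
The four median and IQR rules can be checked together. This sketch assumes the default quartile method of Python's `statistics.quantiles`, which may differ slightly from the quartile convention used in the course; the data and constant are arbitrary.

```python
from statistics import median, quantiles

def iqr(xs):
    """Interquartile range Q3 - Q1 (default method of statistics.quantiles)."""
    q1, _, q3 = quantiles(xs, n=4)
    return q3 - q1

data = [1, 2, 4, 7, 11, 16, 22]  # arbitrary example data
c = -3                           # negative on purpose: the IQR scales by |c|

shifted = [x + c for x in data]
scaled = [x * c for x in data]

print(median(data), median(shifted), median(scaled))
print(iqr(data), iqr(shifted), iqr(scaled))
```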
41
For numerical variables, when are the median and IQR preferred over the mean and standard deviation?
The median and IQR are preferred if the distribution of the data is not symmetrical or when there are outliers
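
To see why outliers matter, one can compare how much the mean and the median move when a single extreme value is added. The income figures below are invented for illustration.

```python
from statistics import mean, median

incomes = [30, 32, 35, 36, 40]   # made-up data, roughly symmetrical
with_outlier = incomes + [1000]  # same data plus one extreme value

print(mean(incomes), median(incomes))
print(mean(with_outlier), median(with_outlier))
```

The mean is pulled far towards the outlier, while the median barely moves.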
42
Primary goal of an experimental study
To provide evidence for a cause-and-effect relationship between two variables
43
What is a placebo?
An inactive treatment given to the control group that in fact has no effect on the subjects in the group
44
What is an observational study?
A study that measures the variables of interest without any direct or deliberate manipulation of the variables by the researchers