Module 2 Flashcards

1
Q

What is a case-control study?

A

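An observational study that compares people who have the outcome of interest (cases) with people who do not (controls) to see how their past exposures differ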

2
Q

What is a retrospective case-control study?

A

Researchers look back at how subjects behaved over time, comparing the case group with the control group

3
Q

Draw the causal model (i.e. the “directed acyclic graph”). Include the mediator, and have arrows that show “what we already know”, “what we can prove”, and “what we want to know”

A

Refer to PCV 2.1

4
Q

What is sampling variability/error?

A

When you draw a random, representative sample from a population, you will not get exactly the same sample every time you draw it, so statistics calculated from the sample (such as the mean) vary from sample to sample.

5
Q

Three students each draw a sample of 5 observations from the population. How many samples did each student draw from the population? What is the sample size used in the experiment?

A

1 sample with a sample size of 5

6
Q

What is a sampling distribution?

A

First you take several samples and take the mean of each sample. Then, you treat the sample means as the new data set and plot a histogram.
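
The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the population (10,000 normal values with mean 50 and SD 10), the sample size of 5, and the helper name `sampling_distribution_of_mean` are all made up for the example and do not come from the course notes.

```python
import random
import statistics

def sampling_distribution_of_mean(population, n, num_samples, seed=0):
    """Return one sample mean per random sample of size n (hypothetical helper)."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]

# Assumed population for illustration only: normal, mean 50, SD 10.
_rng = random.Random(1)
population = [_rng.gauss(50, 10) for _ in range(10_000)]

# Step 1: take several samples and compute the mean of each one.
sample_means = sampling_distribution_of_mean(population, n=5, num_samples=2_000)

# Step 2: treat the sample means as a new data set and draw a crude
# text histogram of them (bins are 2 units wide, one '#' per 20 means).
bins = {}
for m in sample_means:
    left_edge = int(m // 2) * 2
    bins[left_edge] = bins.get(left_edge, 0) + 1
for left_edge in sorted(bins):
    print(f"{left_edge:>3}-{left_edge + 2:<3} {'#' * (bins[left_edge] // 20)}")
```

The printed bars should pile up around the population mean, and they are narrower than a histogram of the raw population values would be.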

7
Q

What is a sample distribution?

A

A sample distribution would be a histogram of the values within one sample

8
Q

As you increase the sample size, what happens to the standard deviation, graph shape, and mean of the sampling distribution graph for a NORMAL distribution?

A

The graph tightens, variability decreases, standard deviation also decreases. The shape of the graph does not change.

The mean of the means does not change with an increased sample size.

9
Q

What is n?

A

The sample size

10
Q

As you increase the sample size, what happens to the standard deviation, graph shape, and mean of the sampling distribution graph for a UNIFORM distribution?

A

Mean of the means is unchanged

The standard deviation gets smaller

The shape of the sampling distribution becomes more and more normal

11
Q

What are some synonyms for a normal graph?

A

symmetric
Gaussian
bell-shaped

12
Q

What is the central limit theorem?

A

When n is “sufficiently” large, the sampling distribution for a particular statistic (e.g. the sample mean) will tend towards a normal distribution even if the underlying population distribution is not Gaussian.
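
A rough way to see the theorem at work is to simulate it. The sketch below assumes a right-skewed exponential population (chosen only for illustration) and uses skewness as a simple check on shape: a value near 0 means the histogram is roughly symmetric. The helper names `skewness` and `sample_means` are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Skewness: roughly 0 for a symmetric (e.g. normal) distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def sample_means(n, num_samples=5_000, seed=0):
    """Means of repeated samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(num_samples)]

# Skewness of the sampling distribution of the mean for growing n.
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 50)}
for n, skew in skew_by_n.items():
    print(f"n = {n:>2}: skewness of sample means = {skew:.2f}")
```

As n grows, the skewness of the sample means should move towards 0, even though the population itself stays strongly right-skewed.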

13
Q

What is the normal distribution? What are the two parameters for a normal distribution? If a random variable X is distributed normally, then we denote it as ….

A

The normal distribution is a continuous probability distribution for a real-valued random variable. The two parameters for a normal distribution are the mean (mu) and the variance (sigma squared).
If X is distributed normally, we denote it as X ~ N(mu, sigma squared).

14
Q

How does the normal distribution differ from the binomial distribution?

A

The binomial distribution is a discrete probability distribution. This means that the values our random variable X can take on are clearly delineated integers, like the number of heads in five coin flips. X cannot be a fraction like three and a half heads.

In comparison, the values that the random variable X can take on in a normal distribution include fractions of a whole unit.

15
Q

In a normal distribution, what do the mean and variance tell us about the graph?

A
  1. The mean tells you about the location of the distribution. The shape would remain the same if only the mean of a normal distribution is changed.
  2. The variance tells us how widely spread the values are. If only the variance is changed, the centre of the distribution stays the same but the height of the peak and the width of the graph change.
16
Q

The probability of observing random, normally-distributed values within a given range is equal to

A

the associated area under the curve (AUC)

This works by considering intervals of potential values for X as defined on the x-axis of the plot, then calculating the proportion of the total area under the curve that falls within that interval. This gives us the probability of observing a random normal variable from this population with a value in that range.

17
Q

Can you calculate the probability for a single value of X in a normal distribution?

A

No. When working with continuous distributions like the normal distribution, our probability will always be anchored to an interval; the probability of any single exact value is zero.

18
Q

If a random variable X is distributed normally, what do we know about the standard deviations?

A
  1. There is an approximately 68% chance X falls within one standard deviation of the mean
  2. There is an approximately 95% chance X falls within two standard deviations of the mean
  3. There is an approximately 99.7% chance X falls within three standard deviations of the mean
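
These three percentages can be checked numerically. The sketch below relies on a standard identity, P(|X - mu| < k sigma) = erf(k / sqrt(2)) for a normal X; the function name `prob_within_k_sd` is just an illustrative choice.

```python
import math

def prob_within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normally distributed X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {prob_within_k_sd(k):.4f}")
# within 1 SD of the mean: 0.6827
# within 2 SD of the mean: 0.9545
# within 3 SD of the mean: 0.9973
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.
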
19
Q

The value you get for the probability density function is not a probability in and of itself. What do you have to do to calculate the actual probability?

A

Use calculus to integrate the PDF across the desired range and calculate the AUC, which tells you the probability that X falls in the range.
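
As a sketch of what that integration looks like in practice, the code below approximates the area under the normal PDF with the trapezoid rule. The trapezoid rule is a stand-in chosen for simplicity (software normally uses a built-in CDF), and the function names and the mean 100 / SD 15 example values are assumptions for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal density curve at x (a density, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def prob_between(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate P(a < X < b) as the area under the PDF (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * width, mu, sigma)
    return total * width

# Hypothetical example: X normal with mean 100 and SD 15.
print(round(prob_between(85, 115, mu=100, sigma=15), 4))  # area within one SD
# A density value can exceed 1, so it cannot be a probability by itself.
print(round(normal_pdf(0, mu=0, sigma=0.1), 2))
# An interval of zero width has zero area, hence zero probability.
print(prob_between(100, 100, mu=100, sigma=15))
```

The first value should be close to 0.68, matching the one-standard-deviation rule, while the density at the peak of a narrow normal (about 3.99 here) shows why the PDF value alone is not a probability.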

20
Q

What is a transformation?

A

Transformations are just functions that map a value in one space to a value in a second space. Usually we can identify functions to “back-transform” the new data to the original space (useful transformations will allow us to do this).

21
Q

What does the log transformation look like (i.e. the calculations)? What are log transformations useful for?

A

To go from original data to log-transformed data: take the log (base 10) of the x value you are trying to transform and keep the y value the same. To back-transform, raise 10 to the power of the transformed x value.

Log transformations are useful for mapping right-skewed distributions into more normal distributions in the transformed space.
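
A small sketch of the forward and back transformation, assuming base-10 logs and a made-up right-skewed data set (the values are hypothetical, picked so the arithmetic is easy to follow):

```python
import math
import statistics

raw = [1, 10, 100, 1000, 10000]        # hypothetical right-skewed values

logged = [math.log10(x) for x in raw]  # transform: take log base 10 of each x
back = [10 ** y for y in logged]       # back-transform: 10 to the power of each value

# In the raw space the mean is dragged far above the median by the large values.
print(statistics.mean(raw), statistics.median(raw))
# In the log space the data are evenly spaced, so the mean and median agree.
print(statistics.mean(logged), statistics.median(logged))
```

In the raw space the mean is far above the median, which is a sign of right skew. After the transformation the values are evenly spaced and the two summaries coincide.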

22
Q

What is the formula for calculating the 95% confidence interval for a population mean? Define each variable.

A

Refer to notes page

23
Q

Is the confidence interval random?

A

Yes. Our X bar could be different depending on the specific random sample we take. Moreover, the confidence interval will either cover or not cover the true mean

24
Q

What is the coverage probability?

A

The fraction of samples taken from the dataset whose confidence intervals cover the true population mean

25
Q

What is the difference between mu and X bar?

A

mu is the population mean

X bar is the sample mean

26
Q

What is one reason why a CI might have a coverage probability lower than 95%? What are two ways this can be fixed?

A

If the population data is non-Gaussian (i.e. very skewed), then our confidence interval is not going to have 95% coverage probability. We can fix this by increasing the sample size taken from the dataset. You can also use a log transformation to make right-skewed data closer to Gaussian, even with a small sample size.
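
Coverage probability can be estimated by simulation: build many intervals and count how often they cover the true mean. The sketch below is a minimal illustration, not the method from the notes. It assumes the interval is X bar plus or minus 1.96 times s / sqrt(n), uses a lognormal population as the skewed example, and all function names are hypothetical.

```python
import math
import random

def ci_covers(sample, true_mean):
    """True if the assumed 95% CI (x_bar +/- 1.96 * s / sqrt(n)) covers true_mean."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    margin = 1.96 * s / math.sqrt(n)
    return x_bar - margin <= true_mean <= x_bar + margin

def coverage(draw, true_mean, n, num_samples=2_000, seed=0):
    """Fraction of simulated samples whose interval covers the true mean."""
    rng = random.Random(seed)
    hits = sum(ci_covers([draw(rng) for _ in range(n)], true_mean)
               for _ in range(num_samples))
    return hits / num_samples

LOGNORMAL_MEAN = math.exp(0.5)  # true mean of a lognormal(0, 1) population

# Gaussian population: coverage should be close to the stated 95%.
normal_cov = coverage(lambda rng: rng.gauss(10, 2), 10, n=50)
# Skewed population, small sample: coverage falls short of 95%.
skewed_small = coverage(lambda rng: rng.lognormvariate(0, 1), LOGNORMAL_MEAN, n=10)
# Fix 1: a larger sample from the same skewed population.
skewed_large = coverage(lambda rng: rng.lognormvariate(0, 1), LOGNORMAL_MEAN, n=200)
# Fix 2: log-transform first (the interval now targets the mean of the log data, 0).
skewed_logged = coverage(lambda rng: math.log(rng.lognormvariate(0, 1)), 0.0, n=10)

print(normal_cov, skewed_small, skewed_large, skewed_logged)
```

Note that the log-transformed interval answers a slightly different question, since it covers the mean of the logged data rather than the mean of the raw data.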

27
Q

How well the data conforms to the stated coverage probability depends on the ________(1)

A

(1) shape of the population distribution and the sample size

28
Q

The standard deviation of the sampling distribution of X bar is known as ______(1). What is the formula for this?

A

the standard error of X bar

formula in notes
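
For reference while the notes are not at hand, the standard textbook formula is sigma / sqrt(n), which may be written with the sample standard deviation s in the notes. The sketch below compares that formula with the standard deviation of simulated sample means; the population values (mean 50, SD 10) and the function names are assumptions for illustration.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Textbook standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def empirical_sd_of_means(mu, sigma, n, num_samples=4_000, seed=0):
    """Standard deviation of many simulated sample means of size n."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(mu, sigma) for _ in range(n)) / n
             for _ in range(num_samples)]
    return statistics.stdev(means)

for n in (4, 25, 100):
    simulated = empirical_sd_of_means(mu=50, sigma=10, n=n)
    print(f"n = {n:>3}: formula {standard_error(10, n):.2f}, simulated {simulated:.2f}")
```

The simulated values should track the formula closely, and both shrink as n grows.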

29
Q

What are the three differences between a distribution of gaussian observations vs a distribution of sample means?

A

Distribution of Gaussian observations:
- Made up of individual observations from the population
- Centered at population mean mu
- Variability quantified by standard deviation

Distribution of sample means
- Made up of sample means calculated from infinitely many samples of size n from the population
- Centered at population mean mu
- Variability quantified by standard error

30
Q

When calculating confidence intervals, we assume that the true value of mu is equal to

A

X bar

31
Q

How do you calculate the margin of error? How do you calculate the width of the 95% confidence interval?

A

Refer to notes

32
Q

As n increases, the width of the 95% confidence interval goes _____(1).

A

(1) down

33
Q

What effect would lowering the standard deviation have on the standard error, the margin of error, and the width of the confidence interval?

A

It would lower all of them
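
The chain from standard deviation to interval width can be traced with a few lines of arithmetic. This sketch assumes the common definitions (standard error = SD / sqrt(n), margin of error = 1.96 times the standard error, width = 2 times the margin); the notes may use a different multiplier, and `ci_summary` is a hypothetical helper.

```python
import math

Z_95 = 1.96  # assumed multiplier for a 95% interval

def ci_summary(sd, n):
    """Standard error, margin of error, and CI width for a given SD and n."""
    se = sd / math.sqrt(n)
    margin = Z_95 * se
    return {"se": se, "margin": margin, "width": 2 * margin}

print(ci_summary(sd=10, n=100))  # baseline
print(ci_summary(sd=5, n=100))   # lower SD: all three values shrink
print(ci_summary(sd=10, n=400))  # larger n: all three values shrink as well
```

Halving the standard deviation halves every quantity, while quadrupling n has the same effect because n enters through its square root.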

34
Q

What is the formula for calculating sample mean and sample variance?

A

Refer to notes sheet

35
Q

What are the three rules that are relevant to log transformed data?

A
  1. The mean of the logged data is almost equal to the median of the logged data (when the logged data are roughly symmetric)
  2. The log of the median (of the regular data) is equal to the median of the logged data. This is because the median is an observable value in the dataset
  3. The log of the mean (of the regular data) is NOT equal to the mean of the logged data.
36
Q

When you raise 10 to the power of the mean of the log data, what do you get?

A

the median of the raw data
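
The log rules and this back-transformation result can be checked on simulated data. The sketch assumes base-10 logs and a made-up right-skewed data set whose logged values are roughly normal; an odd number of observations is used so that the median is an actual data value.

```python
import math
import random
import statistics

rng = random.Random(0)
# Hypothetical raw data: right-skewed, with log10 values roughly normal (mean 2, SD 0.5).
raw = [10 ** rng.gauss(2, 0.5) for _ in range(10_001)]
logged = [math.log10(x) for x in raw]

mean_log = statistics.fmean(logged)
median_log = statistics.median(logged)

# Rule 1: mean and median of the logged data are almost equal.
print(mean_log, median_log)
# Rule 2: log of the raw median equals the median of the logged data.
print(math.log10(statistics.median(raw)), median_log)
# Rule 3: log of the raw mean is NOT the mean of the logged data.
print(math.log10(statistics.fmean(raw)), mean_log)
# Back-transforming the mean of the logged data lands near the raw median.
print(10 ** mean_log, statistics.median(raw))
```

The last line is an approximation that relies on rule 1, so it holds well only when the logged data are close to symmetric.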

37
Q

Mathematically write out rule 1 and rule 2 relating to log transformed data.

A

Refer to notes

38
Q

You just calculated the confidence interval for the mean of the data in log dollars. If you exponentiate these values, what would the new interval tell you?

A

The confidence interval for the median of the raw data

39
Q

What does stratified mean?

A

Stratification is the process of dividing members of the population into homogeneous subgroups before sampling.

40
Q

What is a histogram?

A

Graphs with bins that count the values in a dataset and give us an overview of the data distribution

41
Q

What are the drawbacks of a histogram?

A
  1. Hard to see the centre of distribution to compare “typical” values
  2. Hard to plot two distributions together on the same graph
42
Q

What is a density plot? What is the drawback of a density plot?

A

Similar to a histogram, but you have lines instead of bars.

It is hard to see the centre of distribution to compare typical values
