Probability/Statistical Significance Flashcards

1
Q

What are the two ways studies can screw up?

A
  1. Caused by chance = random error
  2. Not caused by chance = bias or systematic error

2
Q

What deals with random error in studies?

A

Statistical inference

3
Q

If a study has a random error, is it likely to happen again if/when the study is repeated?

A

NO

4
Q

An error that is inherent to the study method being used and results in a predictable and repeatable error for each observation is labeled a _____ error. What is it due to?

A

Systematic error due to bias

5
Q

T/F: If you repeat a study that had a systematic error, it is likely to happen again

A

TRUE

These errors are not caused by chance, and there is no formal method to deal with them.
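
A rough simulation can make the difference between random and systematic error concrete. This is only a sketch: the true value, noise size, and bias below are made-up numbers, not values from the course. Random error averages out over many repeated measurements, while the bias shows up every time.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 100.0  # hypothetical true quantity being measured
NOISE_SD = 5.0      # hypothetical size of the random error
BIAS = 3.0          # hypothetical systematic error, e.g. a miscalibrated scale


def measure(bias=0.0):
    """One measurement = truth + systematic bias + random noise."""
    return TRUE_VALUE + bias + random.gauss(0.0, NOISE_SD)


# Repeat the measurement many times with and without the bias.
unbiased = [measure() for _ in range(10_000)]
biased = [measure(BIAS) for _ in range(10_000)]

mean_unbiased = statistics.mean(unbiased)
mean_biased = statistics.mean(biased)

print(f"mean without bias: {mean_unbiased:.2f}")
print(f"mean with bias:    {mean_biased:.2f}")
```

The first mean should land close to the true value, and the second should stay shifted by roughly the bias no matter how many measurements are taken.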

6
Q

What tests will estimate the likelihood that a study result was caused by chance?

A

Tests of statistical inference

**a study result is called “statistically significant” if it is unlikely to be caused by chance

7
Q

If a study is statistically significant, is it clinically significant?

A

Not necessarily

Those terms have two different meanings

*even very small measures of association that are not large enough to matter can be statistically significant

8
Q

What is a chance occurrence?

A

Something that happens unpredictably without discernible human intention or with no observable cause: caused by chance or random variation

9
Q

What is random variation?

A

There is error in every measurement. If we measure something over and over again, we will get slightly different measurements each time AND a few measurements may be extreme

10
Q

What is statistical inference?

A

Tells us: if we measure something only once, how likely is it that our result was caused by chance

11
Q

What two methods are used for estimating how much random variation there is in our study and whether our result was likely to have been caused by chance?

A
  1. Confidence intervals
  2. P-values

12
Q

_______ estimate how much random variation there is in our measurement

A

Confidence intervals

-the range of values where the true value of our measurement could be found

13
Q

_____ are used to estimate whether the measure was likely to have been caused by chance or not

A

P values

14
Q

Will small sample sizes have a large 95% Confidence interval or small CI?

What about large sample sizes?

A

The larger the sample size, the smaller the confidence interval will be = more precise

  • small samples have large CIs
  • Large samples have small CIs
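
One way to see the effect of sample size is to compute the half-width of a 95% CI for a proportion at several sample sizes. This sketch assumes the normal-approximation formula, which the cards do not specify, and the 8% proportion and sample sizes are illustrative only.

```python
import math

Z_95 = 1.96  # standard normal multiplier for a 95% confidence interval


def ci_half_width(proportion, n):
    """Half-width of a normal-approximation 95% CI for a proportion."""
    return Z_95 * math.sqrt(proportion * (1 - proportion) / n)


# Same observed proportion, increasingly large (hypothetical) samples.
for n in (50, 200, 800):
    print(n, round(ci_half_width(0.08, n), 3))
```

Under this formula the width shrinks with the square root of the sample size, so quadrupling the sample halves the interval.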
15
Q

How do you interpret this statement?

“prevalence of disease was 8% (95% CI: 4%-12%)”

A

The estimate of the prevalence from the study was 8%, but we are 95% confident that the true prevalence lies somewhere between 4% and 12%
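
What "95% confident" means in practice can be sketched with a simulation: if the same study were repeated many times, about 95% of the intervals would be expected to contain the true prevalence. Everything here is an assumption for illustration: the true prevalence is treated as known (8%), the sample size and number of studies are invented, and the interval uses the normal approximation.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_PREVALENCE = 0.08  # borrowed from the card, treated here as the known truth
SAMPLE_SIZE = 500       # hypothetical number of animals per study
N_STUDIES = 2000        # hypothetical number of repeated studies


def wald_ci(cases, n, z=1.96):
    """Normal-approximation 95% CI for a proportion."""
    p_hat = cases / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


covered = 0
for _ in range(N_STUDIES):
    # Each animal is diseased with probability TRUE_PREVALENCE.
    cases = sum(random.random() < TRUE_PREVALENCE for _ in range(SAMPLE_SIZE))
    low, high = wald_ci(cases, SAMPLE_SIZE)
    if low <= TRUE_PREVALENCE <= high:
        covered += 1

coverage = covered / N_STUDIES
print(f"{coverage:.1%} of intervals contained the true prevalence")
```

As a side check, `wald_ci(14, 175)` gives roughly 4% to 12% around an estimate of 8%, which matches the shape of the statement on the card (the 14 and 175 are made-up counts).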

16
Q

T/F: If the 95% CI for the odds ratio (OR) does NOT include one, the OR is statistically significant

A

TRUE

Ex: The odds ratio was 3 (95% CI: 0.5 - 6)

**since this interval includes ONE, the OR is NOT statistically significant

17
Q

How do you interpret 95% confidence intervals (95% CI) for odds ratios (OR)?

A
  1. OR greater than one, 95% CI does NOT include one: Positive association, statistically significant
  2. OR greater than one, 95% CI includes one: No association, NOT statistically significant
  3. OR less than one, 95% CI does NOT include one: Negative association, statistically significant
  4. OR less than one, 95% CI includes one: No association, NOT statistically significant
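
The four rules can be written as one small function. This is a sketch; the function name `interpret_ratio` and the wording of its return values are made up for illustration, and the same logic applies to a relative risk.

```python
def interpret_ratio(estimate, ci_low, ci_high):
    """Interpret an OR (or RR) together with its 95% confidence interval."""
    includes_one = ci_low <= 1.0 <= ci_high
    if includes_one:
        # The interval contains the "no association" value of one.
        return "no association, not statistically significant"
    if estimate > 1.0:
        return "positive association, statistically significant"
    return "negative association, statistically significant"


# The example from the earlier card: OR = 3 (95% CI: 0.5 - 6).
print(interpret_ratio(3.0, 0.5, 6.0))
```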
18
Q

If the 95% CI for the relative risk (RR) does NOT include one, the RR (is / is not) statistically significant

A

IS

*remember, when the RR = one, there is no association between the two test groups

19
Q

How do you interpret a RR greater than one, combined with a 95% CI that does NOT include one?

A

Positive association

Statistically significant

20
Q

How do you interpret a RR less than one, combined with a 95% CI that includes one?

A

No association

Not statistically significant

21
Q

How do you interpret a RR less than one, combined with a 95% CI that does NOT include one?

A

Negative association

Statistically significant

22
Q

T/F: P-value gives you information about the size of the test sample

A

FALSE

**it also does NOT give you any info about the range that you can expect to find the true value

23
Q

To be statistically significant, the p-value must be less than _____

A
0.05
    * if the p-value is greater than 0.05, the association is NOT statistically significant and could have been caused by chance
24
Q

How do you interpret p-values that are less than 0.05?

A

We are 95% confident that an association as large as the one in our study was NOT caused by chance

More precisely: if there were truly no association, an association at least this large would be expected by chance in fewer than 5% of studies

25
Q

How do you interpret the following value?

OR or RR or PR = 3.0 (p = 0.02)

A

Statistically significant. There is an association. We are 95% certain that an OR of 3.0 could NOT have been caused by chance.

26
Q

T/F: No matter how large the RR or OR; if the p-value is greater than 0.05, we must say there is no association

A

TRUE

27
Q

How are p-values calculated?

A

Using statistical tests - tests for statistical inference:

  1. Chi-squared test
  2. Student’s t test
  3. Correlation

(need to know when/where to use these three tests - do not worry about calculations)

28
Q

When testing a hypothesis, can you prove something is true, untrue, or both?

A

Untrue

You cannot prove that something is true

You can’t prove an association is true

But you can prove that either is NOT true –> Hence the use of a Null hypothesis

29
Q

What is a “Null” hypothesis?

A

hypothesis that suggests NO association

It is set up to be proven untrue and rejected, which is how an association is confirmed

30
Q

What is the alternative hypothesis?

A

The actual research question that we want the answer to (that there is an association)

31
Q

What values do we use to accept or reject the null hypothesis?

A

P values OR Confidence intervals (CIs)

32
Q

If a p-value is less than 0.05, do we accept or reject the null hypothesis?

A

We will REJECT the null hypothesis

This means that there is an association - the alternative hypothesis is accepted

33
Q

What must a p-value be to accept the null hypothesis?

A

Greater than 0.05

Accepting the null hypothesis means that there is no association - the alternative hypothesis is therefore rejected
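
The decision rule from the last two cards fits in a few lines. This is a sketch using the cards' own wording (many texts say "fail to reject" rather than "accept"); the function name `decide` is made up, and treating a p-value of exactly 0.05 as not significant is an assumption, since the cards only cover "less than" and "greater than".

```python
ALPHA = 0.05  # significance cutoff used throughout these cards


def decide(p_value, alpha=ALPHA):
    """Apply the p-value cutoff to the null hypothesis."""
    if p_value < alpha:
        return "reject the null hypothesis (association)"
    return "accept the null hypothesis (no association)"


print(decide(0.02))
print(decide(0.20))
```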

34
Q

What is a type I error?

A

False positive: rejecting the null when it is actually true (no association exists)

This is set at 0.05 (95% CI)

Simply put - *saying there is an association, when really there is not
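
A simulation can show where the 0.05 comes from: when there is truly no difference between two groups, about 5% of studies should still come out "significant" by chance. This is a sketch under stated assumptions: the group size, number of studies, and standard deviation are invented, and a simple z-test with a known standard deviation stands in for the tests named in these cards.

```python
import random
from statistics import NormalDist, mean

random.seed(2)  # fixed seed so the sketch is repeatable

ALPHA = 0.05
N_PER_GROUP = 30   # hypothetical animals per group
N_STUDIES = 2000   # hypothetical number of repeated studies
SD = 1.0           # standard deviation, assumed known for simplicity


def two_sided_p(group_a, group_b):
    """z-test p-value for a difference in means, treating SD as known."""
    se = SD * (2 / N_PER_GROUP) ** 0.5
    z = (mean(group_a) - mean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


false_positives = 0
for _ in range(N_STUDIES):
    # Both groups come from the same population: the null is true.
    a = [random.gauss(0.0, SD) for _ in range(N_PER_GROUP)]
    b = [random.gauss(0.0, SD) for _ in range(N_PER_GROUP)]
    if two_sided_p(a, b) < ALPHA:
        false_positives += 1

type_i_rate = false_positives / N_STUDIES
print(f"type I error rate: {type_i_rate:.3f}")
```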

35
Q

What is a type II error?

A

False negative: not rejecting the null when it is false (an association truthfully exists)

This is set at 0.20

Simply put - *saying there is no association, but there actually is

36
Q

___________ = the ability of a study to detect an association, if one does exist

A

Power

Power = 1 - Type II error rate = 1 - 0.20 = 0.80

***larger sample sizes have more power
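
The link between sample size and power can be sketched with a normal approximation for comparing two group means. None of this is from the cards: the formula is a common textbook approximation, `effect_size` (the true difference in standard deviation units) is a hypothetical input, and the small chance of a significant result in the wrong direction is ignored.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected test statistic if the assumed effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    return NormalDist().cdf(shift - z_crit)


for n in (20, 64, 200):
    print(n, round(approx_power(0.5, n), 2))
```

With these made-up inputs, roughly 64 animals per group are needed to reach the usual power target of about 0.80.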

37
Q

Categorical data can be broken up into what two discrete categories?

A

Nominal (named, not ordered) - dichotomous test results (ex: horse vs donkey; or male vs female)

Ordinal (named and ordered, but no constant value between ranks) Ex: neonate vs juvenile vs adult vs geriatric

38
Q

What is continuous data?

A

The variable is numeric and can have one of many possible values

Ex: BG, weight, etc

39
Q

Describing categorical data by _______ _______ will summarize the number of animals in each category, counts or proportions, and the use of two-by-two tables

A

Frequency distribution

40
Q

What methods can be used to describe categorical data?

A
  1. frequency distribution
  2. Tables or bar charts
  3. Statistical test like Chi-squared etc
41
Q

What methods can be used to describe continuous data?

A
  1. frequency distribution and histogram
    - Central tendency
    - Dispersion
  2. Statistical tests (95% CI, t-test, correlation, etc)
42
Q

What information can be obtained from describing continuous data using central tendency?

A

Describes the center of the distribution and measures central tendency (mean, median, and mode)

  • mean = sum of all values/# of data points (very sensitive to extreme values)
  • Median = the value which is in the center with half the data points above and half below
  • mode = the most frequently occurring value or observation
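
A short sketch with Python's `statistics` module shows the three measures and why the mean is the one that reacts to extreme values. The two data sets are invented for illustration.

```python
import statistics

typical = [2, 3, 3, 4, 5]        # hypothetical measurements
with_outlier = [2, 3, 3, 4, 50]  # same data with one extreme value

mean_typical = statistics.mean(typical)
mean_outlier = statistics.mean(with_outlier)
median_typical = statistics.median(typical)
median_outlier = statistics.median(with_outlier)
mode_typical = statistics.mode(typical)

# The mean moves a lot; the median does not move at all.
print("mean:  ", mean_typical, "->", mean_outlier)
print("median:", median_typical, "->", median_outlier)
print("mode:  ", mode_typical)
```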
43
Q

What do you expect to see if there is a skewed distribution when analyzing continuous data with central tendency?

A

The mean and mode lines will not line up with the median

*a symmetric distribution would have the mean and mode close to the median (the same distance away)

44
Q

What two values are used in measuring the dispersion of continuous data?

A

*this method describes how closely the values are gathered around the center of the distribution

Measures:
Range (the difference between minimum and maximum)
Standard deviation (the average distance between each measurement and the mean)
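
Both measures are easy to compute. This sketch uses invented data and the `statistics` module; note that the module offers two versions of the standard deviation (population and sample), a distinction the card does not go into.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

value_range = max(values) - min(values)
population_sd = statistics.pstdev(values)  # divides by n
sample_sd = statistics.stdev(values)       # divides by n - 1

print("range:", value_range)
print("population SD:", population_sd)
print("sample SD:", round(sample_sd, 3))
```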

45
Q

The chi-squared test is used to statistically evaluate what?

A

Difference in proportions

Used for categorical data

all two-by-two tables

46
Q

The Student’s t-test is used to statistically evaluate what?

A

The difference in means

Compares the average of two groups

*used for continuous outcome data and categorical explanatory variable (independent variable)

47
Q

What is correlation used for?

A

Statistical test that measures the strength and direction of a linear relationship between two continuous variables

Used for continuous data

48
Q

What does the choice of statistical test depend on?

A

The nature of the explanatory and outcome variables

49
Q

What statistical test is a test of independence between two categorical variables?

A

Chi squared

Used to answer: Does an association exist between the variables?

Used for two by two tables
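
Although the calculations are not the focus here, a worked sketch may help show what the test compares: the observed counts in a two-by-two table against the counts expected if the two variables were independent. The table is invented, no continuity correction is applied, and the p-value line relies on the fact that a two-by-two table has one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Chi-squared statistic and p-value for a two-by-two table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if there were no association.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = exposed / unexposed, columns = diseased / healthy.
table = [[20, 10], [10, 20]]
chi2, p_value = chi_squared_2x2(table)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.4f}")
```

For this made-up table the p-value falls below 0.05, so the null hypothesis of independence would be rejected.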

50
Q

What statistical test is used to compare the mean values of a continuous variable between two groups?

A

Student’s t test

Incorporates both the mean and the variance (dispersion) around the mean

*requires the value to be normally distributed in the population and similar variance in both groups

H0 = the means for the two groups are the same
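
To show how the mean and variance enter the test, here is a sketch of the equal-variance (pooled) t statistic. The two groups are invented, and instead of computing a p-value the result is compared with 2.306, the standard two-sided 5% critical value for 8 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical continuous outcome measured in two groups of five animals.
t_stat, df = pooled_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])

T_CRITICAL_DF8 = 2.306  # two-sided 5% critical value for df = 8
reject_h0 = abs(t_stat) > T_CRITICAL_DF8
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```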

51
Q

What statistical test indicates the strength and direction of a linear relationship between two continuous variables?

A

Correlation coefficient (r)

often used for dose-response relationships

both variables are numerical, usually continuous
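
The coefficient can be computed directly from its definition. This is a sketch with invented dose and response values; the function name `pearson_r` is made up, and the data are perfectly linear on purpose so the result is easy to check.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)


dose = [1, 2, 3, 4, 5]          # hypothetical doses
rising = [2, 4, 6, 8, 10]       # response that rises with dose
falling = [10, 8, 6, 4, 2]      # response that falls with dose

r_rising = pearson_r(dose, rising)
r_falling = pearson_r(dose, falling)
print(r_rising, r_falling)
```

The sign of r gives the direction of the relationship and its size gives the strength.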

52
Q

What value is considered a strong correlation value? And a weak correlation value?

A

Range: 0.0 < r < 1.0 (strength; the sign of r shows the direction)

STRONG = r is greater than 0.8

WEAK = r is less than 0.8