Intro to Biostatistics Flashcards

1
Q

what are the two types of discrete/categorical variables?

A

nominal and ordinal

2
Q

what is a nominal variable

A

discrete, named categories with no inherent order

ex: male/female, smoker/non-smoker

3
Q

what is an ordinal variable

A

ordered without meaningful intervals

ex: class rank

4
Q

what are the two types of continuous variables?

A

interval and ratio

–> can take continuous data and turn it into categorical data

5
Q

what is an interval variable

A

ordered with meaningful intervals, without an absolute zero - rarely used in medical research

ex: temperature in celsius

6
Q

what is a ratio variable?

A

interval data with an absolute zero

ex: age, weight, cholesterol levels

7
Q

define frequency distribution

A

a systematic arrangement of numerical data from the lowest to highest

8
Q

what is a systematic arrangement of numerical data from the lowest to highest

A

frequency distribution

9
Q

what are the three ways grouped data can be presented?

A

frequency - absolute number in each category

relative frequency - percent in each category

cumulative frequency - cumulative percent
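
The three presentations above can be sketched in a few lines of Python. This is only an illustration: the sample data, the category names, and their order are assumptions made up for the example, not values from the course.

```python
from collections import Counter

# Hypothetical sample: cholesterol category for 10 patients (illustrative data only)
levels = ["normal", "high", "normal", "borderline", "normal",
          "high", "borderline", "normal", "normal", "high"]
order = ["normal", "borderline", "high"]  # assumed ordering of the categories

counts = Counter(levels)
n = len(levels)

table = []
running = 0.0
for category in order:
    frequency = counts[category]       # frequency: absolute number in the category
    relative = 100 * frequency / n     # relative frequency: percent in the category
    running += relative                # cumulative frequency: cumulative percent
    table.append((category, frequency, relative, running))

for row in table:
    print(row)
```

The last row's cumulative percent should come out at 100, which is a quick check that every observation was counted.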

10
Q

what is a bar graph

A

grouped frequencies used to display nominal data

-ex: cholesterol by gender

11
Q

how would you graph cholesterol by gender?

A

bar graph

12
Q

how would you graph grouped frequencies used to display nominal data?

A

bar graph

13
Q

what is a histogram

A

grouped frequencies generally used to display continuous variables

14
Q

what is a frequency polygon

A

midpoints of each group joined by straight lines; used to display continuous variables

“line graph”

–> excellent for displaying distribution of sample

15
Q

what type of graph would you use to show the distribution of a sample?

A

frequency polygon

16
Q

what is a cumulative frequency polygon

A

displays the cumulative frequency for continuous variables

“100% of sample has cholesterol level below 260… 15% of sample has cholesterol level below 190”

17
Q

what type of graph would you use to approximate percentiles?

A

cumulative frequency polygon

18
Q

what is a survival curve

A

plots deaths or other endpoints over time, showing the proportion of subjects who have not yet reached the endpoint

- Kaplan Meier method preferred

19
Q

why are simple survival curves not useful in actual studies?

what is the solution to these problems

A
  • patients are not enrolled at the same time
  • patients drop out
  • patients are followed for varying lengths of time

-Kaplan Meier method

20
Q

what is the Kaplan-Meier method and why does it solve the problems created by simple survival curves?

A
  • used to plot survival in medical research
  • adjusts the estimate to account for patients who are not followed for the entire study
  • the incomplete follow-up data from those patients are referred to as censored survival data
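
A minimal sketch of how such an adjustment can work, assuming the standard product-limit form of the Kaplan-Meier estimate: at each time an endpoint occurs, survival is multiplied by (1 - events / number still at risk), and censored patients simply leave the at-risk group without counting as events. The function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival)] for each time an endpoint occurred.

    times  -- follow-up time for each patient
    events -- True if the endpoint was observed, False if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)   # patients still being followed
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0      # endpoints at time t
        leaving = 0       # everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored patients drop out of the risk set here
    return curve


# Hypothetical follow-up times in months; False marks a censored patient
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for time, surv in curve:
    print(time, round(surv, 2))
```

With these made-up data the estimate steps down only at months 2, 3, and 5; the patients censored at months 3 and 8 shrink the at-risk group but do not produce a step of their own.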
21
Q

what is a normal/gaussian distribution

A

frequency polygon with the appearance of a symmetrical bell-shaped curve; often generated by biologic/medical data

  • many statistical manipulations and inferences are dependent on the assumption of this distribution
  • more accurately approximated as sample size increases
22
Q

positive and negative skewing refers to what part of the distribution: the mean or the tail?

A

the tail

23
Q

what is mean

A

the average value

  • sensitive to extreme scores
  • can misrepresent a population dramatically
24
Q

what is median

A

middle value when the data are ordered; half of the values fall above it and half below
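
To see why the mean is described as sensitive to extreme scores while the median is not, here is a small sketch using Python's standard `statistics` module. The lengths of stay are made-up numbers chosen only for illustration.

```python
import statistics

# Hypothetical lengths of hospital stay in days (illustrative data only)
stays = [2, 3, 3, 4, 5]
with_outlier = stays + [60]   # one extreme value added

mean_before = statistics.mean(stays)
median_before = statistics.median(stays)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", round(mean_after, 1))
print("median:", median_before, "->", median_after)
```

In this example a single extreme value pulls the mean from 3.4 to roughly 12.8, while the median only moves from 3 to 3.5.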

25
Q

what is mode

A

the value that occurs the most frequently

26
Q

what is range

A

difference between the highest and lowest value

27
Q

what is variance

A

quantifies the scatter present in the distribution of values

  • average of the squared differences from the mean
  • not a very intuitive measurement, won’t be asked to calculate
28
Q

what is standard deviation (SD)

A

square root of the variance, much more intuitive

  • most commonly used measure of variability
  • the more spread out the sample distribution, the larger the SD
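
The relationship between variance and SD can be sketched directly from the definitions on these cards. The data values are made up. Note one assumption: the code divides by n, matching "average of the squared differences"; many textbooks and software packages divide by n - 1 when estimating from a sample, which gives a slightly larger result.

```python
import math
import statistics

# Hypothetical measurements (illustrative data only)
values = [4, 8, 6, 5, 7]

mean = sum(values) / len(values)
squared_diffs = [(x - mean) ** 2 for x in values]

variance = sum(squared_diffs) / len(values)   # average of the squared differences
sd = math.sqrt(variance)                      # square root of the variance

print("mean:", mean)
print("variance:", variance)
print("SD:", round(sd, 3))

# The standard library offers the same population formulas
print(statistics.pvariance(values), round(statistics.pstdev(values), 3))
# and the n - 1 (sample) version for comparison
print(statistics.variance(values))
```

The SD is in the same units as the original measurements, which is part of why it is easier to interpret than the variance.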
29
Q

what is the most commonly used measure of variability

A

standard deviation

30
Q

what percentage of the population falls within 1 standard deviation of the mean?

A

about 68% (assuming a normal distribution)

31
Q

what percentage of the population falls within 2 standard deviations of the mean?

A

about 95% (assuming a normal distribution)

32
Q

what percentage of the population falls within 3 standard deviations of the mean

A

about 99.7% (assuming a normal distribution)
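
These three percentages apply to a normal distribution and can be checked with `statistics.NormalDist` from the Python standard library. The helper name `fraction_within` is just a label chosen for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1


def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, "SD:", round(100 * fraction_within(k), 1), "%")
```

The printed values round to about 68%, 95%, and 99.7%; the commonly quoted 95% is a slight rounding of the figure for exactly 2 SD.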

33
Q

What are z scores?

A
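  • express a value as the number of standard deviations it lies above or below the mean: z = (value - mean) / SD
  • used to convert standard deviations into percentile data
  • a z score of 1 corresponds to 1 SD above the mean; a z score of -2 corresponds to 2 SD below

-used when considering the percent of a population above or below a specific level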
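
As a rough sketch, a z score can be computed as (value - mean) / SD and then converted to a percentile with the normal cumulative distribution. The population mean of 200 and SD of 30 are invented numbers for illustration, and the conversion assumes the population is normally distributed.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def percent_below(value, mean, sd):
    """Percent of a normally distributed population falling below value."""
    return 100 * NormalDist().cdf(z_score(value, mean, sd))


# Hypothetical population: mean cholesterol 200, SD 30 (illustrative numbers only)
z = z_score(260, mean=200, sd=30)
below = percent_below(260, mean=200, sd=30)
print("z =", z)
print("percent below:", round(below, 1))
print("percent above:", round(100 - below, 1))
```

Here a value 2 SD above the mean has a z score of 2, so roughly 97.7% of the assumed population falls below it and about 2.3% above.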

34
Q

what is the multiplication rule?

A

used to calculate the probability of 2 (or more) independent events both occurring

-P(A and B) = P(A) x P(B)
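
A small worked example of the multiplication rule, P(A and B) = P(A) x P(B), using a coin flip and a die roll as two independent events. The scenario is an assumption chosen for illustration; the result is cross-checked by counting all equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

p_heads = Fraction(1, 2)   # event A: a fair coin lands heads
p_six = Fraction(1, 6)     # event B: a fair die shows 6

# Multiplication rule for independent events
p_both = p_heads * p_six
print(p_both)  # 1/12

# Cross-check by listing every (coin, die) outcome
outcomes = list(product(["H", "T"], range(1, 7)))
favourable = [o for o in outcomes if o == ("H", 6)]
p_counted = Fraction(len(favourable), len(outcomes))
print(p_counted)  # 1/12
```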

35
Q

how do you calculate the probability of 2 or more independent events both occurring?

A

multiplication rule

36
Q

what is the addition rule?

A

used to calculate the probability of either of 2 (or more) events occurring; the events do not need to be independent

-P(A or B) = P(A) + P(B) - P(A and B)

–> if mutually exclusive, P(A and B) = 0
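
A short sketch of the addition rule, P(A or B) = P(A) + P(B) - P(A and B), using a single roll of a fair die. The events are made up for illustration: the first pair overlaps, so the overlap has to be subtracted; the second pair is mutually exclusive, so the overlap term is 0.

```python
from fractions import Fraction

die = set(range(1, 7))


def prob(event):
    """Probability of an event (a set of faces) on one roll of a fair die."""
    return Fraction(len(event & die), len(die))


# Overlapping events: A = even number, B = greater than 4
a = {2, 4, 6}
b = {5, 6}
p_a_or_b = prob(a) + prob(b) - prob(a & b)
print(p_a_or_b)            # 2/3
print(prob(a | b))         # 2/3, counted directly

# Mutually exclusive events: C = roll a 1, D = roll a 2
c = {1}
d = {2}
p_c_or_d = prob(c) + prob(d) - prob(c & d)   # prob(c & d) is 0
print(p_c_or_d)            # 1/3
```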

37
Q

how do you calculate the probability of either of 2 or more events occurring?

A

addition rule