Intro to Biostatistics Flashcards

1
Q

what are the two types of discrete/categorical variables?

A

nominal and ordinal

2
Q

what is a nominal variable

A

discrete groups (categories) with no inherent order

ex: male/female, smoker/non-smoker

3
Q

what is an ordinal variable

A

ordered categories without meaningful intervals between them

ex: class rank

4
Q

what are the two types of continuous variables?

A

interval and ratio

–> can take continuous data and turn it into categorical data

5
Q

what is an interval variable

A

ordered with meaningful intervals, without an absolute zero - rarely used in medical research

ex: temperature in celsius

6
Q

what is a ratio variable?

A

interval data with an absolute zero

ex: age, weight, cholesterol levels

7
Q

define frequency distribution

A

a systematic arrangement of numerical data from the lowest to highest, showing how often each value (or group of values) occurs

8
Q

what is a systematic arrangement of numerical data from the lowest to highest

A

frequency distribution

9
Q

what are the three ways grouped data can be presented?

A

frequency - absolute number in each category

relative frequency - percent in each category

cumulative frequency - cumulative percent
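
The three presentations above can be sketched in a few lines of Python. The cholesterol groups and counts are hypothetical, chosen only to illustrate the arithmetic, and `frequency_table` is an illustrative helper name.

```python
# Hypothetical grouped cholesterol data (mg/dL): (group label, count).
# Groups are listed from lowest to highest so the cumulative column makes sense.
groups = [("<180", 3), ("180-199", 5), ("200-219", 8), ("220-239", 4)]

def frequency_table(groups):
    """Return rows of (label, frequency, relative %, cumulative %)."""
    total = sum(count for _, count in groups)
    rows = []
    running = 0
    for label, count in groups:
        running += count
        rows.append((label, count, 100 * count / total, 100 * running / total))
    return rows

table = frequency_table(groups)
for label, freq, rel, cum in table:
    print(f"{label:8s} n={freq:2d}  relative={rel:5.1f}%  cumulative={cum:5.1f}%")
```

The last cumulative value is always 100%, which is a quick check that no group was dropped.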

10
Q

what is a bar graph

A

grouped frequencies used to display nominal data

-ex: cholesterol by gender

11
Q

how would you graph cholesterol by gender?

A

bar graph

12
Q

how would you graph grouped frequencies used to display nominal data?

A

bar graph

13
Q

what is a histogram

A

grouped frequencies generally used to display continuous variables

14
Q

what is a frequency polygon

A

midpoints of each group joined by straight lines; used to display continuous variables

“line graph”

–> excellent for displaying distribution of sample

15
Q

what type of graph would you use to show the distribution of a sample

A

frequency polygon

16
Q

what is a cumulative frequency polygon

A

displays the cumulative frequency for continuous variables

“100% of sample has cholesterol level below 260… 15% of sample has cholesterol level below 190”

17
Q

what type of graph would you use to approximate percentiles?

A

cumulative frequency polygon

18
Q

what is a survival curve

A

plots death or other endpoints over time

- Kaplan-Meier method preferred

19
Q

why are simple survival curves not useful in actual studies?

what is the solution to these problems

A
  • patients are not enrolled at the same time
  • patients drop out
  • patients are followed for varying lengths of time

solution: Kaplan-Meier method

20
Q

what is the Kaplan-Meier method and why does it solve the problems of simple survival curves

A
  • used to plot survival in medical research
  • adjusts data to reflect the patients who are not followed for the entire study
  • the incomplete follow-up data it handles are referred to as censored survival data
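
A minimal sketch of how the adjustment works, assuming the standard product-limit form of the Kaplan-Meier estimate: at each time an endpoint occurs, survival is multiplied by the fraction of at-risk patients who did not reach the endpoint, and censored patients simply leave the at-risk group. The follow-up times are hypothetical, and `kaplan_meier` is an illustrative helper rather than a library function.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one event occurs.

    times  -- follow-up time for each patient
    events -- True if the endpoint (e.g. death) was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the at-risk group.
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; False marks a censored patient.
times = [2, 3, 3, 5, 8, 8, 12]
events = [True, True, False, True, False, True, False]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Patients censored at a given time are counted as at risk for any endpoint at that same time, which is the usual convention.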
21
Q

what is a normal/gaussian distribution

A

frequency polygon with the appearance of a symmetrical bell-shaped curve; often generated by biologic/medical data

  • many statistical manipulations and inferences are dependent on the assumption of this distribution
  • more accurately approximated as sample size increases
22
Q

positive and negative skewing refer to what part of the distribution: the mean or the tail?

A

the tail (positive skew: longer tail to the right; negative skew: longer tail to the left)

23
Q

what is mean

A

the average value

  • sensitive to extreme scores
  • can misrepresent a population dramatically
24
Q

what is median

A

middle value

25
what is mode
the value that occurs the most frequently
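
To see why the mean is sensitive to extreme scores while the median is not, here is a small sketch using Python's standard `statistics` module. The lengths of stay are hypothetical values chosen for illustration.

```python
import statistics

# Hypothetical lengths of hospital stay in days; the last value is an extreme score.
stays = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stays)      # pulled upward by the 40-day stay
median_stay = statistics.median(stays)  # middle value, unaffected by the outlier
mode_stay = statistics.mode(stays)      # most frequent value

print(mean_stay, median_stay, mode_stay)
```

Here the mean is 9 days even though six of the seven patients stayed 6 days or fewer, while the median is 4 and the mode is 3.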
26
what is range
difference between the highest and lowest value
27
what is variance
quantifies the scatter present in the distribution of values
  • average of the squared differences from the mean
  • not a very intuitive measurement; won't be asked to calculate
28
what is standard deviation (SD)
square root of the variance, much more intuitive
  • most commonly used measure of variability
  • the more spread out the sample distribution, the larger the SD
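
A short worked example of variance and SD, using hypothetical cholesterol values. It follows the card's definition (divide by n, the population form); many texts divide by n - 1 when estimating from a sample, which `statistics.variance` does.

```python
import math
import statistics

# Hypothetical cholesterol values in mg/dL.
values = [180, 190, 200, 210, 220]

mean = sum(values) / len(values)

# Variance as defined on the card: average of the squared differences from the mean.
variance = sum((x - mean) ** 2 for x in values) / len(values)

# Standard deviation: square root of the variance, back in the original units.
sd = math.sqrt(variance)

# The standard library agrees with the hand calculation.
library_variance = statistics.pvariance(values)

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(values)

print(variance, round(sd, 2), sample_variance)
```

The variance is in squared units (mg/dL squared), which is why the SD is the easier number to interpret.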
29
what is the most commonly used measure of variability
standard deviation
30
in a normal distribution, what percentage of the population falls within 1 standard deviation of the mean?
approximately 68%
31
in a normal distribution, what percentage of the population falls within 2 standard deviations of the mean?
approximately 95%
32
in a normal distribution, what percentage of the population falls within 3 standard deviations of the mean?
approximately 99.7%
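
These three percentages can be checked against the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library; `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def share_within(k):
    """Fraction of a normal population within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD: {100 * share:.2f}%")
```

The values are rounded on the cards: 2 SD covers slightly more than 95%, and exactly 95% corresponds to about 1.96 SD.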
33
What are z scores?
  • used to translate standard deviations into percentile data
  • numerically identical to the standard deviations they represent (z = 2 means 2 SD above the mean)
  • used when considering the percent of a population above or below a specific level
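
A minimal sketch of turning a z score into a percentile, assuming the values are normally distributed. The population mean and SD are hypothetical, and `z_score` is an illustrative helper.

```python
from statistics import NormalDist

# Hypothetical population: cholesterol with mean 200 mg/dL and SD 20 mg/dL.
mean, sd = 200, 20

def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z = z_score(240, mean, sd)

# Fraction of the population below this level, from the standard normal curve.
share_below = NormalDist().cdf(z)
share_above = 1 - share_below

print(f"z = {z}, below: {100 * share_below:.1f}%, above: {100 * share_above:.1f}%")
```

A level 2 SD above the mean sits at roughly the 98th percentile, so only about 2% of this hypothetical population would be above it.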
34
what is the multiplication rule?
used to calculate the probability of 2 (or more) independent events both occurring
  • P(a and b) = P(a) × P(b)
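
A quick check of the multiplication rule on a hypothetical example, two fair coin flips, comparing the rule with a direct count of the possible outcomes.

```python
from fractions import Fraction
from itertools import product

# Hypothetical independent events: heads on the first flip, heads on the second.
p_a = Fraction(1, 2)
p_b = Fraction(1, 2)

# Multiplication rule: probability that both independent events occur.
p_both = p_a * p_b

# Direct check by listing all equally likely outcomes of two flips.
outcomes = list(product("HT", repeat=2))
favorable = sum(1 for outcome in outcomes if outcome == ("H", "H"))
enumerated = Fraction(favorable, len(outcomes))

print(p_both, enumerated)
```

The rule only applies when the events are independent, meaning the outcome of one does not change the probability of the other.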
35
how do you calculate the probability of 2 or more independent events both occurring?
multiplication rule
36
what is the addition rule?
used to calculate the probability of either of 2 (or more) events occurring; for 2 events:
  • P(a or b) = P(a) + P(b) - P(a and b)
  • if mutually exclusive, P(a and b) = 0
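
The addition rule can be verified on a hypothetical example: drawing one card from a standard 52-card deck. The overlap term (both events occurring) is subtracted so the king of hearts is not counted twice. The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

def prob(event):
    """Probability that a single randomly drawn card satisfies event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_spade(card):
    return card[1] == "spades"

def is_king(card):
    return card[0] == "K"

# Overlapping events: subtract the probability of both occurring.
p_heart_or_king = (prob(is_heart) + prob(is_king)
                   - prob(lambda c: is_heart(c) and is_king(c)))
counted_directly = prob(lambda c: is_heart(c) or is_king(c))

# Mutually exclusive events: the overlap is zero, so the probabilities just add.
p_heart_or_spade = prob(is_heart) + prob(is_spade)

print(p_heart_or_king, counted_directly, p_heart_or_spade)
```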
37
how do you calculate the probability of either of 2 or more independent events occurring?
addition rule