Biostats Flashcards

1
Q

What is the research question?

A

a

2
Q

What is a population?

A

the largest collection of entities such as persons, animals, or cells for which we have an interest at a particular time

3
Q

What is a parameter?

A

A descriptive measure computed from the data in a population. Usually an unknown true value, represented by a Greek letter.

4
Q

What is a sample?

A

a subset or fraction of the population. We observe characteristics from the sample to apply to the population

5
Q

What is a statistic?

A

A descriptive measure computed from data in a sample. Usually not represented by Greek letters. May have a bar over it (for example, xbar for the sample mean).

6
Q

What is a simple random sample?

A

A sample that is selected so that every unit in the population has an equal chance of being included. In a simple random sample there are two properties:

  1. unbiased: each unit has same chance of being chosen
  2. independence: selection of one unit has no influence on selection of other units.
7
Q

What is a cluster sample?

A

Group the population into small clusters and draw a simple random sample of clusters. May be good if the cost of traveling between randomly sampled units is high.

8
Q

What is a systematic sample?

A

start with a randomly chosen unit and select every kth unit thereafter

9
Q

What is a stratified sample?

A

Divide population units into homogeneous groups and draw a simple random sample from each group
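
The four sampling schemes on the cards above can be sketched side by side. This is a minimal illustration, not a survey-design recipe: the population of 100 numbered units, the cluster size, and the two strata are all hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical units numbered 1..100

# Simple random sample: every unit has the same chance of being chosen.
srs = random.sample(population, 10)

# Systematic sample: random start, then every k-th unit thereafter.
k = 10
start = random.randrange(k)
systematic = population[start::k]

# Cluster sample: group units into clusters, then randomly sample whole clusters.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = random.sample(clusters, 2)

# Stratified sample: simple random sample drawn from each group (assumed strata).
strata = {"low": population[:50], "high": population[50:]}
stratified = {name: random.sample(units, 5) for name, units in strata.items()}
```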

10
Q

What is a variable?

A

A characteristic that can take on different values in different people, animals, or things.

11
Q

What is a constant?

A

A measurement that stays the same from observation to observation

12
Q

What are the types of variables?

A

qualitative, quantitative, continuous, categorical, discrete, dichotomous

13
Q

What is a qualitative variable?

A

categorized characteristics. This is all about measuring attributes.

14
Q

What is a quantitative variable?

A

measurements that convey information regarding amounts. Can be either discrete or continuous

15
Q

What is a continuous variable?

A

Quantitative variable that does not possess gaps or interruptions characteristic of a discrete random variable. Infinite number of values.

16
Q

What is a categorical variable?

A

observations that have the same attributes are in the same category. A qualitative variable

17
Q

What is a discrete variable?

A

A variable characterized by gaps or interruptions in the values that it can assume. Typically, they are countable.

18
Q

What is a dichotomous variable?

A

type of categorical variable that can take only one of two values.

19
Q

What are the levels of measurement?

A

nominal variables, ordinal variables, interval variables, ratio variables

20
Q

What is a nominal level of measurement?

A

a qualitative level of measurement. Naming observations or classifying them into various mutually exclusive and collectively exhaustive categories. There is no natural ordering here.

21
Q

What is an ordinal level of measurement?

A

a qualitative level of measurement in which observations are not only different from category to category but can be ranked in some order.

22
Q

What is an interval level of measurement?

A

a quantitative level of measurement in which the distance between any two measurements is known but there is no true zero point.

23
Q

What is a ratio level of measurement?

A

a quantitative level of measurement in which equality of ratios as well as equality of intervals may be determined. There is a true zero value: the zero point represents an absolute absence of the characteristic being measured.

24
Q

What is a mean?

A

Average. As a parameter, it is the population mean; there is also a sample mean. The mean is unique (there is only one) and simple to compute. Extreme values can pull the mean so far that it is no longer a good measure of central tendency. Mu is the mean of the population, which can be inferred from the sample mean if the sample mean is unbiased.

25
Q

What is the median?

A

the middle value. divides the set into two equal parts. It is unique, simple, and not as drastically affected by extreme values as the mean

26
Q

What is the mode?

A

The most common value in the set. There may be no mode, or there may be more than one. May be useful for qualitative data.
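
A small sketch of the three measures of central tendency, using Python's standard `statistics` module. The data values are hypothetical and include one extreme value to show how it pulls the mean but not the median.

```python
import statistics

values = [2, 3, 3, 4, 5, 40]  # hypothetical data with one extreme value (40)

mean = statistics.mean(values)        # 57 / 6 = 9.5, pulled up by the 40
median = statistics.median(values)    # middle of the ordered data: (3 + 4) / 2 = 3.5
modes = statistics.multimode(values)  # list of most common values: [3]
```

Dropping the 40 would move the mean to 3.4 while the median would only shift from 3.5 to 3.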

27
Q

What is a quartile?

A

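Quartiles divide the ordered data into four equal parts. To find the 25th percentile (Q1), take the value at position (n+1)/4 in the ordered data. The 50th percentile (Q2) is at position 2(n+1)/4 and is called the median. The 75th percentile (Q3) is at position 3(n+1)/4.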

28
Q

What is interquartile range?

A

This is from the 25th percentile to the 75th percentile. It reflects the variability of the middle 50% of the observations.
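
The (n+1)/4 position rule and the interquartile range can be sketched as follows. The data are hypothetical, and the linear interpolation used when a position falls between two observations is an assumption; the cards do not say how to handle fractional positions.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in ordered data.

    Interpolates linearly between neighbors for fractional positions
    (an assumed convention, not stated on the cards).
    """
    lower = int(position)        # whole part of the position
    fraction = position - lower  # fractional part
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = sorted([7, 1, 3, 9, 5, 11, 13])  # hypothetical observations, n = 7
n = len(data)

q1 = value_at_position(data, (n + 1) / 4)      # position 2 -> 3
q2 = value_at_position(data, 2 * (n + 1) / 4)  # position 4 -> 7 (the median)
q3 = value_at_position(data, 3 * (n + 1) / 4)  # position 6 -> 11
iqr = q3 - q1                                  # spread of the middle 50%
```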

29
Q

What is variance?

A

Measures dispersion in terms of the scatter of the values about their mean. Variability is not the same thing: variability tells us how much the scores differ from one another, while variance tells us how much they differ from the mean.

30
Q

What is a standard deviation?

A

the square root of the variance. You do this to put the variance results back in the original scale.

31
Q

What is the range?

A

the difference between minimum and maximum. Poor measure of dispersion as it only takes into account two values

32
Q

What is the coefficient of variation?

A

Relative variation instead of absolute variation. Ratio of standard deviation to the mean.
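
The dispersion measures on the last few cards can be computed together. This is a sketch on hypothetical sample data; it assumes the usual sample formulas, where the variance divides by n - 1 (the cards do not state the divisor).

```python
import statistics

sample = [4, 8, 6, 5, 7]  # hypothetical sample

mean = statistics.mean(sample)          # 30 / 5 = 6
variance = statistics.variance(sample)  # squared deviations sum to 10; 10 / (5 - 1) = 2.5
sd = statistics.stdev(sample)           # square root of the variance, back on the original scale
data_range = max(sample) - min(sample)  # 8 - 4 = 4, uses only two values
cv = sd / mean                          # relative variation: sd as a fraction of the mean
```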

33
Q

What is an ordered array?

A

Listing of the values of a collection (either population or sample) in an order of magnitude from smallest to largest. Can easily determine min, max, range

34
Q

What is the frequency distribution table?

A

listing of relative frequencies of each value as a percentage. usually, lists frequency, relative frequency, cumulative frequency, cumulative frequency distribution

35
Q

What is a bar chart and how do you make it?

A

Display qualitative data. Used for nominal and ordinal data.

36
Q

What is a histogram and what are its components?

A

Displays frequency distribution. Used for quantitative data. The bars are drawn touching each other to indicate data are continuous. Ratio/interval data are categorized. This is similar to a bar chart that is used for categorical data.

37
Q

What is a frequency polygon and what are its components?

A

a

38
Q

What is a box-and-whisker plot and how do you make it?

A
  1. Put the variable of interest on the horizontal axis.
  2. Draw a box: one end is Q1 and the other is Q3.
  3. Divide the box at Q2.
  4. Draw a whisker from Q1 to the lowest value.
  5. Draw a whisker from Q3 to the highest value.
  6. Add a star for the mean.
39
Q

What is a stem and leaf plot and how do you make it?

A

This helps you order data from min to max. Stem = all but last digit of data point. Leaf = last digit of data point

40
Q

What is a scatter plot and how do you make it?

A

If you have two continuous variables, then you can use this to see the relationship.

41
Q

What is the standard error?

A

a

42
Q

What is skewness and what are the types and what do they mean?

A

There is a tail. If the longer tail points to positive numbers, then it is positively skewed. If the longer tail points to negative numbers, then it is negatively skewed.

43
Q

What is symmetry and how do you determine it is there?

A

It can be divided into two halves that are mirror images of each other.

44
Q

What are random variables?

A

variables that cannot be predicted in advance due to chance factors. ex = adult height for a newborn baby.

45
Q

What is a discrete random variable?

A

a random variable that is discrete. Has a countable number of possible outcomes

46
Q

What is a continuous random variable?

A

a random variable that is continuous. Can assume any value on a continuous segment of the real number line.

47
Q

What is sample space?

A

A listing of all the possible outcomes. The fundamental counting principle allows us to figure out how many are in the sample space without having to count them all.

48
Q

What is an experiment when we talk in terms of probability?

A

a

49
Q

what is an outcome (in probability)?

A

a

50
Q

What is an event (in probability)?

A

Something that happens

51
Q

What is a union of events?

A

A or B

52
Q

What is an intersection of events?

A

A and B: both events occur at the same time.

53
Q

What does it mean to be mutually exclusive?

A

Two events cannot occur simultaneously

54
Q

What is conditional probability?

A

probability that only involves a subset of a total group. The group has been restricted due to conditions or characteristics.

55
Q

What is joint probability?

A

The probability that a subject picked at random possesses two characteristics at the same time. This is written with the intersection symbol (the upside-down U).

56
Q

What is marginal probability?

A

a

57
Q

What are complementary events?

A

the probability of event A is equal to 1 minus the probability of its complement.

58
Q

Can you tell me the addition rule?

A

Given two events A and B, the probability that event A occurs or event B occurs is the probability that A occurs plus the probability that B occurs minus the probability that the events occur simultaneously. P(A or B) = P(A)+P(B)-P(A and B)
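
The addition rule can be checked on a small sample space. The sketch below uses one roll of a fair die as a hypothetical experiment; the events A and B and the helper `prob` are illustrative names, not from the cards.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair die (hypothetical experiment)
A = {2, 4, 6}                    # event: the roll is even
B = {4, 5, 6}                    # event: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


p_union = prob(A | B)                        # P(A or B), counted directly
p_by_rule = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A and B)
```

Both routes give 2/3. Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.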

59
Q

Can you tell me the multiplication rule?

A

P(A and B) = P(B)P(A|B) = P(A)P(B|A).

60
Q

How can you tell that events are independent from each other?

A

P(B|A) = P(B). This tells us that A didn't affect B. You can multiply P(A) and P(B) together. If the product is not equal to P(A and B) from the multiplication formula, then the events are not independent.
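
A sketch of the multiplication rule and the independence check, again on a hypothetical fair die. The events and the helpers `prob` and `cond_prob` are illustrative names.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair die (hypothetical experiment)
A = {2, 4, 6}                    # roll is even
B = {1, 2, 3, 4}                 # roll is at most 4
C = {4, 5, 6}                    # roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


def cond_prob(event, given):
    """P(event | given): restrict the sample space to the 'given' outcomes."""
    return Fraction(len(event & given), len(given))


# Multiplication rule: P(A and B) = P(B) * P(A | B)
joint_by_rule = prob(B) * cond_prob(A, B)

# Independence check: does P(A and X) equal P(A) * P(X)?
a_b_independent = prob(A & B) == prob(A) * prob(B)  # 1/3 == 1/2 * 2/3
a_c_independent = prob(A & C) == prob(A) * prob(C)  # 1/3 != 1/2 * 1/2
```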

61
Q

What are combinations?

A

An arrangement of objects, without repetition, where order is not important.

62
Q

What are permutations?

A

An arrangement of objects, without repetition, where order is important.
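
The difference between the two cards above shows up when the arrangements are listed. A minimal sketch with four hypothetical objects taken two at a time:

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["A", "B", "C", "D"]  # hypothetical objects
r = 2                         # choose two at a time

ordered = list(permutations(items, r))    # order matters: (A, B) and (B, A) both appear
unordered = list(combinations(items, r))  # order ignored: only (A, B) appears

n_permutations = perm(len(items), r)  # 4! / (4 - 2)! = 12
n_combinations = comb(len(items), r)  # 4! / (2! * 2!) = 6
```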

63
Q

What is the distribution of a random variable?

A

A table, graph, formula, or other device used to specify all possible values of a discrete random variable along with their respective probabilities.

64
Q

What is a sampling distribution?

A

a

65
Q

What is a normal distribution?

A

a

66
Q

What is a standard normal distribution?

A

a

67
Q

What are Bernoulli trials and what distribution do they go with?

A

A sequence of Bernoulli trials forms a Bernoulli process. Each trial results in one of two possible, mutually exclusive outcomes; one is denoted a success and the other a failure. The experiment has n identical trials, the probability of success remains constant from trial to trial, and the trials are independent. The binomial random variable, x, is the count of the number of successes in the n trials. Bernoulli trials go with the binomial distribution.

68
Q

What is a binomial distribution?

A

Derived from the Bernoulli process. It has a random variable x, which is the number of successes. All other outcomes are failures.

69
Q

What is the central limit theorem?

A

a

70
Q

What is the difference between descriptive and inferential statistics?

A

Descriptive statistics are used to organize and summarize data in samples and populations, whereas inferential statistics are used to make educated guesses about populations based on random samples.

71
Q

Why would you do random sampling?

A

without randomized design, there can be no dependable statistical analysis. Helps to avoid bias.

72
Q

What are class intervals and how do we get them?

A

A set of contiguous, non-overlapping intervals, such that each value in the set of observations can be placed in one, and only one, of the intervals.

73
Q

How many class intervals should be used?

A

No fewer than 6, no more than 15.

74
Q

What are some characteristics of distributions?

A

shape, central tendency, variability

75
Q

What are the different shapes that a distribution can take on?

A
  1. symmetric
  2. skewed
  3. modal
76
Q

What is modal distribution?

A

this is talking about the number of peaks. Unimodal has one peak. Bimodal has two peaks. Multimodal has more than two peaks

77
Q

What is central tendency?

A

the score near the center of the distribution. typical and representative score value.

78
Q

What is variability?

A

degree to which the measurements in a distribution differ from one another. Also called variation, spread, scatter

79
Q

What are the measures of central tendency?

A

mean, median, mode

80
Q

Tell me about the mean and median in a positively skewed distribution.

A

mean greater than median

81
Q

Tell me about the mean and median in a negatively skewed distribution

A

mean less than median

82
Q

Tell me about dispersion.

A

If values are the same, there is none. Some measures of dispersion are:

  1. range
  2. variance
  3. standard deviation
83
Q

What is the fundamental counting principle?

A

if there are m possible outcomes for one thing and n possible outcomes for another, then there are mn possible outcomes of doing both.
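
A quick check of the counting principle: listing every pairing and counting them gives the same number as multiplying m by n. The outcome lists are hypothetical.

```python
from itertools import product

shirts = ["red", "blue", "green"]  # m = 3 hypothetical outcomes for one thing
pants = ["jeans", "shorts"]        # n = 2 hypothetical outcomes for another

outfits = list(product(shirts, pants))  # every (shirt, pants) pair in the sample space
count = len(outfits)                    # equals m * n without listing anything
```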

84
Q

What is a distinguishable permutation?

A

An arrangement of objects in which some of the objects are identical, so some of the orderings look the same. To find the number of distinguishable permutations, take the factorial of the total number of letters and divide by the product of the factorials of the frequencies of each unique letter.
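
The letter-counting recipe on this card can be sketched as a short function. The function name and the example words are illustrative choices.

```python
from collections import Counter
from math import factorial


def distinguishable_permutations(word):
    """Total letters factorial, divided by each unique letter's frequency factorial."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # exact integer division at every step
    return total


level = distinguishable_permutations("LEVEL")              # 5! / (2! * 2! * 1!) = 30
mississippi = distinguishable_permutations("MISSISSIPPI")  # 11! / (4! * 4! * 2! * 1!)
```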

85
Q

What are the elementary properties of probability?

A
  1. Non negative number
  2. sum of all probabilities of mutually exclusive outcomes is 1.
  3. for any 2 mutually exclusive events, the probability of occurrence of A or B is equal to sum of individual probabilities. P(A or B) = P(A)+P(B)
86
Q

What is the probability of two events (OR)?

A

P(A or B) = P (event A occurs or event B occurs or both occur).

87
Q

What is unconditional probability?

A

probability that includes the total group.

88
Q

What are the properties of a discrete probability distribution?

A

all probabilities between 0 and 1. The sum of all probabilities equals 1

89
Q

What is a cumulative distribution?

A

Successively add the probabilities together.

90
Q

How do we count the number of sequences in a large sample procedure?

A

combinations

91
Q

What is the formula for a binomial distribution?

A

P(X = x) = nCx * p^x * q^(n-x), where q = 1 - p

92
Q

What are the two parameters of a binomial distribution?

A

n and p. n is the number of trials and p is the probability of success on each trial.

93
Q

What is the mean of a binomial distribution?

A

mean = n*p

94
Q

What is the variance of a binomial distribution?

A

variance = npq
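
The binomial formula and the np and npq shortcuts can be checked against each other numerically. This is a sketch with hypothetical values n = 10 and p = 0.3; `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


n, p = 10, 0.3  # hypothetical number of trials and probability of success
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the distribution itself...
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

# ...should agree with the shortcut formulas from the cards.
mean_shortcut = n * p                # np
variance_shortcut = n * p * (1 - p)  # npq
```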

95
Q

When do we get a normal distribution?

A

When the number of values, n, approaches infinity and the width of the class intervals approaches zero, the frequency polygon becomes a smooth curve.

96
Q

How do we find the area under the curve of the normal distribution?

A

You could use calculus, but there is an easier way. You can use a continuous probability distribution.

97
Q

What is a continuous probability distribution?

A

A non-negative function is called a probability distribution of the continuous random variable, X, if:

  1. total area bounded by its curve and the x-axis is equal to 1, and
  2. the sub-area bounded by the curve, the x-axis, and the perpendiculars erected at any 2 points a and b gives the probability that X is between the points a and b.
  3. The probability of any specific value of the random variable is zero.
98
Q

What are the characteristics of the normal distribution?

A
  1. symmetrical about its mean
  2. mean, median, mode are all equal
  3. parameters of normal distribution are mu and sigma. Mu determines the left or right shift. Sigma determines the flatness or peakedness of the graph.
  4. total area under the curve is one square unit.
  5. area from -1 sigma to +1 sigma is 68% of the total area
99
Q

how much is within one sd above and below?

A

68%

100
Q

How much is within 2 sd above and below?

A

95%

101
Q

How much is within 3 sd above and below?

A

99.7%
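
The 68%, 95%, and 99.7% figures are rounded areas under the normal curve. A sketch that recovers them from the standard normal cumulative distribution, using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}
# within[1] is about 0.6827, within[2] about 0.9545, within[3] about 0.9973
```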

102
Q

What is a standard normal distribution?

A

mean = 0, standard deviation = 1. This is also called the z-distribution.

103
Q

How do you calculate the z score?

A

z=(x-mu)/sigma

104
Q

What does the z score do?

A

Transforms a data value into the number of standard deviations that value is from the mean.
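
A short sketch of the z transformation, followed by a lookup of the area under the standard normal curve in place of a printed z-table. The values of x, mu, and sigma are hypothetical.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 100, 10  # hypothetical population mean and standard deviation
x = 85               # hypothetical observed value

z = z_score(x, mu, sigma)         # (85 - 100) / 10 = -1.5
area_below = NormalDist().cdf(z)  # P(Z <= -1.5), about 0.0668
```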

105
Q

What are the two purposes of sampling distributions?

A
  1. Allows us to answer probability questions about sample statistics
  2. Provide the necessary theory for making statistical inference procedures valid.
106
Q

What is the sampling distribution and why do we care about it?

A

It is the distribution of all possible values that can be assumed by some statistic, computed from samples of the same size, randomly drawn from the same population

107
Q

How many sampling distributions are there?

A

A sampling distribution is not just one statistic but a probability distribution. There is not 1 sampling distribution but many. There is a different sampling distribution for each combination of a statistic, a sample size, and a population

108
Q

What are the steps of constructing a sampling distribution?

A
  1. Choose a specific measurement, sample size, and specific statistic. Each choice defines a new sampling distribution.
  2. Draw a random sample (size n, selected from the population with replacement).
  3. Compute the statistic that you want from your sample.
  4. Repeat steps 2 and 3 an infinite number of times (in theory).
  5. Construct the relative frequency distribution for the statistic.
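
For a tiny population, the steps above can be carried out exhaustively instead of infinitely many times: list every possible sample of size n drawn with replacement, compute the statistic for each, and tabulate. The five-value population below is hypothetical.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

population = [6, 8, 10, 12, 14]  # hypothetical population, N = 5
n = 2                            # sample size

# Every possible ordered sample with replacement: N^n = 25 of them.
samples = list(product(population, repeat=n))

# The chosen statistic (here the sample mean) for each sample.
sample_means = [sum(s) / n for s in samples]

# Relative frequency distribution of the statistic.
counts = Counter(sample_means)
relative_frequency = {m: c / len(samples) for m, c in sorted(counts.items())}

mean_of_means = mean(sample_means)           # equals the population mean
variance_of_means = pvariance(sample_means)  # equals the population variance / n
```

Here the population mean is 10 and the population variance is 8, so the sampling distribution of the mean has mean 10 and variance 8 / 2 = 4.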
109
Q

What is the relative frequency distribution?

A

this is what you construct after doing the sampling distribution steps. It is based on all possible random samples of the size n.

110
Q

What are the characteristics of a distribution?

A

mean, variance, shape

111
Q

How do the variance and mean formulas vary for the sampling distribution compared to just normally?

A

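They divide by N^n, the number of possible samples of size n that can be drawn with replacement from a population of size N.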

112
Q

What is the standard error of the mean?

A

The square root of the variance of the sampling distribution of the mean. Basically, the sd divided by the square root of n.

113
Q

What do we do if we are sampling from a non-normally distributed population?

A

central limit theorem.

114
Q

What is the central limit theorem?

A

Given a population of any non-normal form with a mean mu and a finite variance sigma^2, the sampling distribution of xbar, computed from samples of size n from this population, will have mean mu and variance sigma^2/n, and will be approximately normally distributed when the sample size is large.
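
The theorem can be illustrated by simulation. This is a rough sketch, not a proof: it assumes a strongly skewed exponential population with mu = 1 and sigma^2 = 1, and the sample size, number of replications, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(42)      # fixed seed so the run is repeatable
n = 40               # sample size (assumed "large")
replications = 5000  # number of samples drawn


def draw_sample_mean():
    """Mean of one sample of size n from an exponential population (mu = 1, sigma^2 = 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


sample_means = [draw_sample_mean() for _ in range(replications)]

mean_of_means = statistics.fmean(sample_means)          # should be close to mu = 1
variance_of_means = statistics.pvariance(sample_means)  # should be close to sigma^2 / n = 0.025
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.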

115
Q

How large does the sample size need to be?

A

Depends on nonnormality of the data. >30 usually.

116
Q

When can we be assured of an approximately normally distributed sampling distribution?

A
  1. when sampling is from normally distributed population
  2. when sample is large
  3. when sampling is from a population whose shape is unknown, as long as the sample is large.
117
Q

What is the z transformation for sampling distributions?

A

z = (xbar - mu) / (sigma / sqrt(n))
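
A worked instance of the z transformation for a sample mean. All numbers are hypothetical; the point is that the denominator is the standard error sigma / sqrt(n), not sigma itself.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical population mean and standard deviation
n = 25               # sample size
xbar = 106           # hypothetical observed sample mean

standard_error = sigma / sqrt(n)   # 15 / 5 = 3.0
z = (xbar - mu) / standard_error   # 6 / 3 = 2.0
p_above = 1 - NormalDist().cdf(z)  # P(sample mean > 106), about 0.0228
```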

118
Q

How do we do a sampling distribution for a proportion?

A

The mean of the distribution, mu_phat, is equal to the true population proportion p. The variance is p(1-p)/n.

119
Q

How do we know that the sample is large enough for CLT when we have proportions?

A

n*p>5 and n(1-p)>5

120
Q

What is the z transformation formula for proportions?

A

z = (phat - p) / sqrt(p(1-p)/n)
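
A sketch that first checks the large-sample conditions for proportions and then applies the z transformation. The values of p, n, and phat are hypothetical.

```python
from math import sqrt

p = 0.40     # hypothetical true population proportion
n = 100      # sample size
phat = 0.46  # hypothetical observed sample proportion

# Large-sample check from the cards: n*p > 5 and n*(1 - p) > 5.
large_enough = n * p > 5 and n * (1 - p) > 5

standard_error = sqrt(p * (1 - p) / n)  # square root of the variance p(1 - p)/n
z = (phat - p) / standard_error         # about 1.22 standard errors above p
```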