Midterm Flashcards by Cassidy Ruocco

Concerned with presentation, organization and summarization of data. Organizing and graphing the data to get an idea of what they show

Descriptive statistics

Allow us to generalize from a sample of data to a larger group of subjects. Using results of a study to infer how everyone else will be affected

Inferential statistics

Includes various methods of organizing and graphing the data to get an idea of what they show. Pictures, graphs, etc.

Descriptive statistics

Allow us, as researchers, to generalize from our sample of data to a larger group or population. Are used to determine the probability that a conclusion based on analysis of data from a sample is true
*Subject to random error

Inferential statistics

Whatever is being observed or measured. May be dependent or independent

Variable

1) the outcome of interest and

2) changes in response to some intervention.

Dependent variable

dependent changes in response to the independent

1) the intervention

2) what is being manipulated by the researcher.

Independent variable

Can have only one of a limited set of values. Numeric discrete variables can assume only whole numbers.
Ex: number of kids (2, 3, or 4; can't have 2.13 kids), hair color, gender. There is no in-between.

Discrete variables/data

May have any value, within a defined range.

Ex: blood pressure, weight, serum levels, time, height.

Continuous variables/data

Central or typical value for a probability distribution or in simpler terms, averages.

Central tendency

Is the measure of central tendency for interval and ratio data

Mean

Used as a measure of central tendency when the mean would be meaningless, as with ordinal data… the value such that half of the data points are above and half are below it.

Median

The only measure that may be used with nominal data and consists of the most frequently occurring category. Nominal data (ex: male vs. female) is derived from qualitative measures.

Mode
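
To make the three measures of central tendency concrete, here is a minimal sketch using Python's standard `statistics` module. The exam scores are made up for illustration and are not from the course.

```python
import statistics

# Hypothetical exam scores (illustrative data only)
scores = [70, 85, 85, 90, 100]

mean_score = statistics.mean(scores)      # sum of the values / number of values
median_score = statistics.median(scores)  # middle value once the data are sorted
mode_score = statistics.mode(scores)      # most frequently occurring value

print(mean_score, median_score, mode_score)
```

For nominal data such as hair color, only `statistics.mode` would make sense, since categories cannot be averaged or ranked.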

refers to how closely the data cluster around the measure of central tendency.

A measure of dispersion

Used with ordinal data, is the difference between the highest and lowest values.

always a single number
is easy to calculate but is unstable and it isn’t of much use

Range

measures how far a set of numbers is spread out

*if small, indicates data points are very close to the mean

Variance

Is the square root of the variance of a random variable or statistical population. It is expressed in the same units as the original measurement.
*Smaller indicates closer to the mean

Standard deviation
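
The three measures of dispersion above (range, variance, standard deviation) can be sketched on a small made-up data set. This example assumes the data are a whole population, so it uses the population formulas (`pvariance`, `pstdev`); for a sample, `statistics.variance` and `statistics.stdev` would be the usual choice.

```python
import statistics

# Hypothetical measurements (illustrative data only), treated as a full population
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)   # highest value minus lowest value
variance = statistics.pvariance(values)   # average squared distance from the mean
std_dev = statistics.pstdev(values)       # square root of the variance, in original units

print(value_range, variance, std_dev)
```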

Normal distributions are symmetrical around their…

Mean

The mean, median, and mode in a normal distribution are…

Equal

The area under the normal curve is equal to….

1.0

Normal distributions are more dense in the ________ and less dense in the ______

More dense= CENTER
Less dense= TAILS

(Bell curve)

Normal distribution is defined by these 2 parameters…

Mean

Standard deviation

_____% of the area of a normal distribution is within one standard deviation of the mean.

Approx 68%

Approx _____% of the area of a normal distribution is within two standard deviations of the mean.

Approx 95%
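
The areas within one and two standard deviations can be checked numerically. This is a small sketch using `statistics.NormalDist` from Python's standard library; the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_one = area_within(1)
within_two = area_within(2)

print(f"{within_one:.1%} within 1 SD, {within_two:.1%} within 2 SD")
```

Because the proportions depend only on the number of standard deviations from the mean, the same figures hold for any mean and standard deviation.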

Three differently shaped curves can each be normal distributions. What gives them different shapes?

Different standard deviations

* The mean, median and mode all have the same value
* The curve is symmetric around the mean
* The tails of the curve get closer and closer to the X-axis as you move away from the mean but they never quite reach it
* In mathematics the curve approaches the X-axis asymptotically

Properties of the normal curve

Many statistical tests assume a....

Normal distribution

The mean and the variance are...

Independent! You can change one and the other will stay the same

Many natural phenomena are in fact...

Normally distributed

Whatever the actual distribution of data, if we draw a large number of samples of reasonable size, the means of those samples will be approximately normally distributed. This fact arises from the...

Central limit theorem

states that if we draw equally sized samples from a non-normal distribution, the distribution of the means of these samples will still be normal, as long as the samples are large enough. How large is large enough? It depends on the shape of the distribution. Generally anything over 30 is enough

Central Limit Theorem
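
A quick simulation can illustrate the Central Limit Theorem. The sketch below assumes an exponential distribution (strongly skewed, with mean 1 and standard deviation 1) as the non-normal source, and the sample size of 30 and the 2,000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30     # the "over 30" rule of thumb
NUM_SAMPLES = 2000   # arbitrary number of repeated samples

# Draw many equally sized samples from a skewed (exponential) distribution
# and record the mean of each sample.
sample_means = [
    statistics.mean(rng.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)  # close to the population mean (1)
sd_of_means = statistics.stdev(sample_means)   # close to 1 / sqrt(30), about 0.18

print(round(mean_of_means, 3), round(sd_of_means, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying data are heavily skewed.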

What is the application of Central Limit Theorem...

Allows us to assume normal distribution

relative likelihood that a certain event will or will not occur, relative to some other events.

Probability

of an event is an "estimate" that the event will happen based on how often the event occurs after collecting data or running an experiment (in a large number of trials). It is based specifically on direct observations or experiences. All things must be equal. If circumstances change from testing scenario, the outcome probability will change

Empirical probability

Most of medical probabilities are derived through..

Empiric means

________ _______ of an event is the number of ways that the event can occur, divided by the total number of outcomes. It is finding the probability of events that come from a sample space of known equally likely outcomes. Ex: chance of winning on the roulette wheel

Theoretical probability

Two events, X and Y , are ______ ________ if the occurrence of one precludes the occurrence of the other.

Mutually exclusive

Flipping a coin is an example of this because if the head side appears, the tails side does not.

Mutually exclusive

The probability of X or Y is the probability of X plus the probability of Y: Pr(X or Y) = Pr(X) + Pr(Y)

Additive Law

Two events, X and Y , are _______ dependent if the outcome of Y depends on X , or X depends on Y . *Ex...life expectancy depends on gender, year born, access to health care, country of birth, etc.

Conditionally

Probability of being President is ______ on being a US citizen

Conditional

Pr(X and Y) = Pr(Y) x Pr(X|Y) | Pr(X|Y) means the probability of X occurring given that Y has occurred.

Multiplicative Law
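
Both laws can be worked through with a standard 52-card deck. The card scenario is an assumed example, not part of the original cards, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Additive law (mutually exclusive events):
# a single card cannot be both a king and a queen.
p_king = Fraction(4, 52)
p_queen = Fraction(4, 52)
p_king_or_queen = p_king + p_queen

# Multiplicative law (conditionally dependent events):
# drawing two aces in a row without replacement.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # one ace and one card are already gone
p_two_aces = p_first_ace * p_second_ace_given_first

print(p_king_or_queen, p_two_aces)
```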

The additive law is for...

Mutually exclusive

The multiplicative law is for....

Conditionally dependent

Shows the probabilities of different outcomes for a series of random events, each of which can have only one of two values.

Binomial distribution
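
As an illustration, the binomial distribution for four fair coin flips can be computed directly. This sketch uses the standard binomial formula; the function name `binomial_pmf` is made up for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent two-outcome trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of 0, 1, 2, 3, or 4 heads in 4 flips of a fair coin
distribution = [binomial_pmf(k, 4, 0.5) for k in range(5)]

print(distribution)
```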

When you perform a hypothesis test in statistics, a _______ helps you determine the significance of your results.

P-value

_______ tests are used to test the validity of a claim that is made about a population.

Hypothesis

Usually refers to a general statement or default position that there is no relationship between two measured phenomena, or no difference among groups.

Null hypothesis

A _____ p-value indicates strong evidence against the null hypothesis, so you reject the null hypothesis

Small *(typically ≤ 0.05)

A _____ p-value indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.

Large *(typically > 0.05)
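
A worked example may help tie the p-value to the decision rule. The sketch below assumes a made-up experiment: a coin lands heads 16 times in 20 flips, and the null hypothesis is that the coin is fair. It computes a one-sided exact p-value, which is only one of several ways such a test could be set up.

```python
from math import comb

def one_sided_p_value(heads, flips):
    """P(at least `heads` heads in `flips` flips) if the coin is fair (H0 true)."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2**flips

ALPHA = 0.05  # conventional cutoff for a "small" p-value

p_value = one_sided_p_value(16, 20)
reject_null = p_value <= ALPHA  # small p-value: reject the null hypothesis

print(round(p_value, 4), reject_null)
```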

_______ _______ in experimental measurements are caused by unknown and unpredictable changes in the experiment. These changes may occur in the measuring instruments or in the environmental conditions

Random errors

How do you minimize random error?

TAKE MORE DATA!

_______ ________ can be evaluated through statistical analysis and can be reduced by averaging over a large number of observations

Random error

includes all of the elements from a set of data. A measurable characteristic of a population, such as the mean, is called a parameter. Use N.

Population

consists of one or more observations from the population. A measurable characteristic of a sample is called a statistic. Use n.

Sample

is the purest form of probability sampling. Each member of the population has an equal and known chance of being selected. When there are very large populations, it is often difficult or impossible to identify every member of the population, so the pool of available subjects becomes biased

Random sampling

is often used instead of random sampling. It is also called an Nth name selection technique. After the required sample size has been calculated, every Nth record is selected from a list of population members. As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method. Its only advantage over the random sampling technique is simplicity. This is frequently used to select a specified number of records from a computer file.

Systematic sampling

is a commonly used probability method that is superior to random sampling because it reduces sampling error. A stratum is a subset of the population that shares at least one common characteristic. Examples of strata might be males and females, or managers and non-managers. The researcher first identifies the relevant strata and their actual representation in the population. Random sampling is then used to select a sufficient number of subjects from each stratum. "Sufficient" refers to a sample size large enough for us to be reasonably confident that the stratum represents the population.

Stratified sampling *Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata
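
The three probability sampling methods above can be compared side by side. This is a rough sketch on a made-up population of 100 records; the field names, the 25/75 split between strata, and the proportional quota rule are all assumptions for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 records, 25 labelled "F" and 75 labelled "M"
population = [{"id": i, "sex": "F" if i % 4 == 0 else "M"} for i in range(100)]
SAMPLE_SIZE = 20

# Random sampling: every member has an equal chance of selection
random_sample = rng.sample(population, SAMPLE_SIZE)

# Systematic sampling: every Nth record after a random starting point
step = len(population) // SAMPLE_SIZE
start = rng.randrange(step)
systematic_sample = population[start::step]

# Stratified sampling: random sampling within each stratum,
# here in proportion to the stratum's share of the population
stratified_sample = []
for sex in ("F", "M"):
    stratum = [person for person in population if person["sex"] == sex]
    quota = round(SAMPLE_SIZE * len(stratum) / len(population))
    stratified_sample.extend(rng.sample(stratum, quota))

print(len(random_sample), len(systematic_sample), len(stratified_sample))
```

Convenience sampling has no counterpart here because it is a nonprobability method: the sample is whoever happens to be available.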

is used in exploratory research where the researcher is interested in getting an inexpensive approximation of the truth. As the name implies, the sample is selected because they are convenient. This nonprobability method is often used during preliminary research efforts to get a gross estimate of the results, without incurring the cost or time required to select a random sample.

Convenience sampling

reflects that there will be no observed effect for our experiment. This is what we are attempting to overturn by our hypothesis test. We hope to obtain a small enough p-value that we are justified in rejecting this. WANT TO DISPROVE THIS!

Null hypothesis

H0 states there is..

no difference between the interventions being studied (null hypothesis)

Reflects that there will be an observed effect for our experiment. *Denoted by Ha or by H1

Alternative or experimental hypothesis

If the null hypothesis is rejected....

then we ACCEPT the alternative hypothesis! | there is a relationship! this is what we want!

is inversely related to beta or the probability of making a Type II error. In short, power = 1 – β.

Statistical power

is the likelihood that a study will detect an effect when there is an effect there to be detected.

Statistical power

If statistical power is high, the probability of making a _____ error, or concluding there is no effect when, in fact, there is one, goes down.

Type II

Statistical power is affected chiefly by the size of the effect and the ______________ used to detect it

Size of the sample

occurs when the investigator determines there is a difference between the 2 groups (rejects H0) when in fact there is no difference; essentially, a false positive result

Type I Error

The probability of a Type I Error is defined as the

Alpha level

Occurs less commonly in ITT (intention to treat) due to the fact that ITT is a more cautious approach to data interpretation

Type I Error

occurs when the investigator determines there is no difference between the 2 groups (fails to reject H0) when in fact there is a difference; essentially, a false negative result

Type II Error

The probability of a Type II Error is defined as the

Beta level

This type of error may occur when analysis is too cautious

Type II

A study in which people are allocated at random (by chance alone) to receive one of several clinical interventions. One of these interventions is the standard of comparison or control. The control may be a standard practice, a placebo ("sugar pill"), or no intervention at all.

Randomized Control Trial (RCT)

Someone who takes part in a randomized controlled trial (RCT) is called a

Participant or subject

seek to measure and compare the outcomes after the participants receive the interventions. Because the outcomes are measured, these are quantitative studies.

Randomized Control Trial (RCT)

are quantitative, comparative, controlled experiments in which investigators study two or more interventions in a series of individuals who receive them in random order. One of the simplest and most powerful tools in clinical research. GOLD STANDARD***

Randomized Control Trial

fall under the category of analytic study designs which aim to evaluate and identify causes or risk factors of diseases or health related events. The investigator does not intervene, yet “observes” and assesses the strength and relationship between an exposure and a disease variable.

Observational studies

Cohort studies, case-control studies, and cross-sectional studies are all examples of....

Observational studies

group of people with defined characteristics who are followed up to determine incidence of, or mortality from, some specific disease, all causes of death, or some other outcome. In a cohort study, an outcome or disease-free study population is first identified by the exposure or event of interest and followed in time until the disease or outcome of interest occurs. Can be prospective or retrospective.

Cohort Study

Similar to taking a history and physical; the diseased patient is questioned and examined, and elements from this history taking are knitted together to reveal characteristics or factors that predisposed the patient to the disease.

Case control studies

Identify subjects by outcome status at the outset of the investigation. Outcomes of interest may be whether the subject has undergone a specific type of surgery, experienced a complication, or is diagnosed with a disease. Are quick, relatively inexpensive to implement, require comparatively fewer subjects, and allow for multiple exposures or risk factors to be assessed for one outcome

Case control study

Data are collected on the whole study population at a single point in time to examine the relationship between disease (or other health related state) and other variables of interest

Cross-Sectional study

provide a snapshot of the frequency of a disease or other health related characteristics in a population at a given point in time. Can assess the burden of disease in a population

Cross-Sectional Study

answers a defined research question by collecting and summarizing all empirical evidence that fits pre-specified criteria. Ex: researching articles by other researchers to answer your question about a treatment, disease, etc.

Systematic review

a subset of systematic reviews. A method that combines pertinent qualitative and quantitative study data from several studies to develop a single conclusion with a greater statistical power. Its conclusion is stronger than any single study due to more subjects, greater diversity of subjects, and more effects and results.

Meta-analysis

recommendations to clinicians about the care of patients with specific conditions. These are based on the best available research evidence and practice experience. The guidelines include a systematic review of research evidence and a decision analysis (a set of recommendations involving both the evidence and value judgments regarding the benefits and harms of alternative care options)

Clinical practice guidelines

percent of times a diagnostic exam correctly identifies an abnormality that is present **true positive is the numerator: TP / (TP + FN)

Sensitivity

percent of times a diagnostic exam correctly comes back negative when the abnormality is absent **true negative is the numerator: TN / (TN + FP)

Specificity

A d-dimer test is _____ for blood clots but it is not _____ for PEs (the pt could have a DVT). To calculate the actual % we would have to be given a 2x2 table with true positive (TP), false positive (FP), false negative (FN) and true negative (TN) data.

it is SENSITIVE but NOT SPECIFIC

When you have a positive screening (like a d-dimer) the next indicator you need to be aware of is the __________. This % is the probability that the initial positive test result will be a *true positive* and the person actually has the disease.

PPV (positive predictive value)

The _________ is the % probability that your negative screening tests will actually mean you don’t have a disease (making sure the result is NOT a false negative).

NPV (negative predictive value)
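
The four test-performance measures above all come from the same 2x2 table. Here is a minimal sketch with made-up counts (not real d-dimer data), using the standard formulas for each measure.

```python
# Hypothetical 2x2 table counts (illustrative only)
tp = 90   # true positives: disease present, test positive
fn = 10   # false negatives: disease present, test negative
fp = 40   # false positives: disease absent, test positive
tn = 860  # true negatives: disease absent, test negative

sensitivity = tp / (tp + fn)  # of those with the disease, share testing positive
specificity = tn / (tn + fp)  # of those without the disease, share testing negative
ppv = tp / (tp + fp)          # of positive tests, share that are true positives
npv = tn / (tn + fn)          # of negative tests, share that are true negatives

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
      f"PPV={ppv:.3f} NPV={npv:.3f}")
```

In this made-up table the test is quite sensitive, yet a positive result is a true positive only about 69% of the time, because false positives are numerous relative to true positives.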

these #s are what we use to calculate sensitivity and specificity and relate closely to positive and negative predictive values.

False positives/False negatives

For example, if the hospital you work in has a lot of false positive mammogram results it means the screening mammogram program has a low.......

Positive predictive value (PPV) | comes straight from the collected data

the risk/benefit ratio of being in one group in a study.

Absolute risk

You need to calculate the _____ _____ of the control and the treatment groups before you can complete any other calculation. # outcomes (ex. survivors) / total #.

Absolute risk

The difference (thus the equation is a subtraction problem) in the risk of receiving treatment versus not. AR (control) - AR(treatment).

Absolute risk reduction

Clinically, _____ is important because you can express the risk/benefit ratio of trying a new drug to your patient in whole number forms

ARR (Absolute risk reduction)

Pretend the BMJ just published an article saying a (hypothetical) new late-onset asthma prevention drug had an ARR of .30, or 30%. And pretend that normally the occurrence of late-onset asthma in adults is 1 in every 600 adults. This ARR means that the new drug does what to the absolute risk of developing late onset asthma?

Reduces the absolute risk! From the previous 1/600 to 1/2,000 adults. So, if one of your patients were at high risk for developing late-onset asthma, she may want to consider taking this new preventative medication.

The ratio of risk between the two groups in the study. ARR / AR (control).

Relative risk reduction (RRR)
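
The chain of calculations from absolute risk to ARR to RRR can be followed step by step. The trial numbers below are invented for illustration, and "events" here means the bad outcome being counted in each group.

```python
from fractions import Fraction

# Hypothetical trial results (illustrative only)
control_events, control_total = 20, 100
treatment_events, treatment_total = 15, 100

# Absolute risk in each group: number of outcomes / total number
ar_control = Fraction(control_events, control_total)
ar_treatment = Fraction(treatment_events, treatment_total)

# Absolute risk reduction: AR(control) - AR(treatment)
arr = ar_control - ar_treatment

# Relative risk reduction: ARR / AR(control)
rrr = arr / ar_control

print(float(ar_control), float(ar_treatment), float(arr), float(rrr))
```

Here the treatment lowers the risk by 5 percentage points (ARR), which is a quarter of the control group's risk (RRR of 25%).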

Midterm Flashcards

(100 cards)