Exam 1 Flashcards
W. Edwards Deming
used statistics to aid the US Census estimates and helped to improve the quality of manufacturing
Plan-Do-Check-Act
problem solving process used in quality control
what did Deming use to call for change in management philosophy?
14 points
pattern recognition
helps to determine whether an event or observation is unique
regression to the mean
after an unusually good or bad performance, the next one tends to fall back toward the mean
availability bias
judging an event by how easily a similar event from your past comes to mind
quantitative
numerical values that can be placed on a number line
internal validity
truth within a study
external validity
truth beyond a study
objectivity
seeing things as they are without making it conform to a preconceived view
variables
the characteristics being measured
values
realized measurements
categorical
places observations into unordered categories
examples of categorical things
sex, blood types, disease status
ordinal
puts observations into categories that can be ranked
example of ordinal things
cancer stage, opinions
examples of quantitative things
age, bp, body weight
GIGO
bad input means a bad output
imprecision
inability to get the same result upon repetition
bias
tendency to systematically overestimate or underestimate the true value
surveys
quantify population characteristics
comparative studies
quantify relationships between variables
census
attempts to collect information on all individuals in the population
probability sample
sample in which each member of the population has a known probability of entering the sample
sampling fraction
n/N
n
size of sample
N
population size
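A quick sketch of the n/N arithmetic from the cards above. The function name and the numbers (50 sampled out of 2000) are made up for illustration.

```python
def sampling_fraction(n: int, N: int) -> float:
    """Return n/N, the share of the population that ends up in the sample."""
    if N <= 0 or not 0 <= n <= N:
        raise ValueError("need N > 0 and 0 <= n <= N")
    return n / N


# Hypothetical numbers: a sample of n = 50 from a population of N = 2000.
fraction = sampling_fraction(50, 2000)
print(fraction)  # 0.025
```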
undercoverage bias
some groups are left out or underrepresented
volunteer bias
self-selected participants tend to be atypical of the population
nonresponse bias
a large percentage of individuals refuse to participate
stratified random samples
draws SRSs from relatively homogeneous groups (strata)
SRS
simple random sample
cluster samples
randomly select large units consisting of smaller units
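A minimal sketch of a stratified random sample: one SRS drawn inside each stratum. The strata (age bands), the ID numbers, and the 1-in-10 rate are assumptions for illustration only.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical strata: member IDs grouped by age band.
strata = {
    "under_40": list(range(0, 60)),
    "40_to_64": list(range(60, 90)),
    "65_plus": list(range(90, 100)),
}

# Draw a simple random sample of 1 in 10 members from each stratum.
stratified = {
    name: rng.sample(members, len(members) // 10)
    for name, members in strata.items()
}

sizes = {name: len(chosen) for name, chosen in stratified.items()}
print(sizes)  # {'under_40': 6, '40_to_64': 3, '65_plus': 1}
```

Sampling inside each stratum guarantees that every group shows up in the sample, which a single SRS of the whole population does not.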
experimental studies
investigator assigns the exposure to one group while leaving the other nonexposed
non experimental studies
classifies groups as exposed or non exposed without intervention
explanatory variable
treatment/exposure that explains or predicts changes in the response variable
response variable
outcome being investigated
are discrepancies between the results of experimental and non experimental studies normal?
hek ya
confounding
occurs when effects of a lurking variable become mixed up with the effects of the explanatory variable
single-blind
subjects are kept in the dark about the specifics of the treatment they’re receiving
double-blind
subjects and investigators are kept in the dark
triple-blind
subjects, investigators and statisticians are kept in the dark
frequency distributions
tell us how often we see the various values in a batch of numbers
what does a stem and leaf plot show us?
shape, location and spread
modality
number of peaks in a distribution
kurtosis
steepness of the mound
median
point that divides the data set into a top half and a bottom half
spread
informal way to refer to the dispersion or variability of data points
what is the worst kind of sampling and why?
convenience sampling, it’s usually biased
consecutive sampling
biased, sample people with characteristics that they want, often used in healthcare
best kind of sampling and why?
simple random sampling, each member has the same chance of being selected
systematic sampling
take every nth individual
cluster sampling
random sampling of natural groupings (schools, towns etc)
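The sampling schemes on these cards can be sketched side by side. This is only an illustration: the population of 100 ID numbers, the clusters of 20, and the seed are all made up.

```python
import random

population = list(range(1, 101))  # hypothetical IDs 1..100
rng = random.Random(42)           # fixed seed so the run is repeatable

# Simple random sample: every member has the same chance of being selected.
srs = rng.sample(population, 10)

# Every-nth sampling: pick a random start, then take every k-th individual.
k = 10
start = rng.randrange(k)
every_kth = population[start::k]

# Cluster sampling: randomly pick whole groups and keep everyone in them.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]  # 5 groups of 20
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [person for group in chosen_clusters for person in group]

print(len(srs), len(every_kth), len(cluster_sample))  # 10 10 40
```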
what should you check data for?
outliers, whether variables are normally distributed, and whether any can be combined
discrete random variable
random variable with a countable set of possible outcomes
is the mean susceptible to outliers?
yea
the mean can be used to predict…
1) an individual value drawn at random from a sample
2) a value drawn at random from a population
3) the population mean
is the median more resistant to outliers and skews?
ya
mean=median
symmetrical
mean>median
positive skew
range=
maximum-minimum
how many sets do quartiles divide the data into?
4
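A small sketch of the mean, median, range, and quartile cards using Python's standard `statistics` module. The data values are made up; the 70 is a planted outlier.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 7]   # made-up values
with_outlier = data + [70]     # same data plus one extreme value

mean_before = statistics.mean(data)             # 30 / 7, about 4.29
median_before = statistics.median(data)         # 4
mean_after = statistics.mean(with_outlier)      # 100 / 8 = 12.5
median_after = statistics.median(with_outlier)  # (4 + 5) / 2 = 4.5

# range = maximum - minimum
data_range = max(with_outlier) - min(with_outlier)  # 70 - 2 = 68

# Three cut points (Q1, Q2, Q3) split the data into 4 sets; Q2 is the median.
q1, q2, q3 = statistics.quantiles(with_outlier, n=4)

print(mean_after, median_after, data_range)  # 12.5 4.5 68
```

The outlier drags the mean from about 4.29 to 12.5 while the median only moves from 4 to 4.5, and the result (mean > median) matches the positive-skew card.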
example of discrete random variable
number of leukemia cases in a geographic region in a given period
continuous random variable
describes quantities that take on an unbroken continuum of possible values
example of continuous random variable
time it takes to complete a task
probability mass function (pmf)
mathematical relation that assigns probability to all possible outcomes for a DISCRETE random variable
A
event A
Pr(A)
probability of event A
S
sampling universe
p
probability of success of each trial
q
probability of failure on each trial (q = 1 - p)
why is it better to have a bigger sample size?
with a larger sample size, the sample mean tends to be closer to the actual (population) mean
statistical inference
act of generalizing from a sample to a population with a calculated degree of certainty
Where did Deming go to school?
University of Wyoming and Yale
Where did Deming work?
Western Electric and as a US census statistician
How did Deming help during war times?
improved the manufacturing process by reducing errors and minimizing waste
Who was Deming influenced by?
Walter Andrew Shewhart
Kaizen
quality improvement is a continuous process that requires teamwork and open communication and competence in problem solving
What is Deming's quote?
quality is about people, not products
Nelson Data-to-Wisdom
starts with collecting data and then organizing it to find new insights and knowledge
normal probability graph shape
bell curve
uniform probability graph
each score is equally likely and is shaped like a rectangle
exponential probability graph
the one with the negative slope: starts high and decays toward zero
why do you need more data to get a good probability distribution?
too little makes it have a funky shape, more data means a smooth curve, like schreier
reliability
ability to get the same values for a variable on repeated measurement
example of reliability
measuring a child's height three times
validity
how truthful the data are (free from error)
how does reliability look on a bullseye?
how close the darts are together
how does validity look on a bullseye?
if the darts hit the center
scientific ethos
statisticians must maintain objectivity
incidence
number of newly developing cases of a disease occurring in a defined population over a defined period
prevalence
total number of individuals with the disease in a population at a given point of time
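A rough sketch of the difference between the two counts, turned into proportions. All numbers are invented, and dividing new cases by the people still at risk (those without the disease at the start) is an assumption about how the proportion is defined, not something stated on the cards.

```python
# Hypothetical town: 10,000 people, 200 of whom already have the disease today.
population = 10_000
existing_cases = 200

# Prevalence: everyone who has the disease at a given point in time.
prevalence = existing_cases / population  # 0.02, i.e. 2%

# Incidence: only the NEW cases that develop over a defined period (one year here).
new_cases_this_year = 50
at_risk_at_start = population - existing_cases     # 9,800 disease-free people
incidence = new_cases_this_year / at_risk_at_start  # about 0.0051 per year

print(f"prevalence = {prevalence:.3f}, incidence = {incidence:.4f}")
```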
RCT
randomized controlled trial
RCT definition
getting a group of people with the same condition then randomly assigning them to an intervention group or control group
James Lind Scurvy treatment
took 12 sailors with scurvy and gave each pair a different treatment to see what worked
key features of RCTs
- randomization
- use of a control group for comparison
- blinding or masking
- ethics
frequency tables
list all the data values and present the frequency count of each value
why are frequency tables nice?
good way to identify outliers and check for data entry errors
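A minimal frequency table built with `collections.Counter`. The ages are made up, and the 350 is a planted data entry error to show how it stands out.

```python
from collections import Counter

ages = [34, 35, 35, 36, 36, 36, 37, 350]  # made-up ages; 350 is a planted typo

freq = Counter(ages)  # maps each data value to its frequency count

# Print the table in value order: the stray 350 sits far from everything else.
for value in sorted(freq):
    print(value, freq[value])
```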
standard deviation
typical deviation around the mean (how far away from the mean the values are); the square root of the variance
variance
sum of squared deviations divided by the sample size minus one
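The variance and standard deviation cards, worked by hand and cross-checked against the standard `statistics` module. The eight data values are made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
n = len(data)

mean = sum(data) / n  # 40 / 8 = 5.0

# Sum of squared deviations around the mean: 9+1+1+1+0+0+4+16 = 32
sum_sq = sum((x - mean) ** 2 for x in data)

variance = sum_sq / (n - 1)  # 32 / 7, about 4.57
sd = math.sqrt(variance)     # about 2.14

# The library's sample variance and standard deviation use the same n - 1 divisor.
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(sd, statistics.stdev(data))
```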
random variable
numerical quantity that takes on different values depending on chance
what is the most common pdf?
normal (pdf = probability density function)
population
set of all possible values for a random variable
event
outcome or set of outcomes
probability
proportion of times an event is expected to occur in the population
binomial random variable
type of discrete random variable that counts successes in n independent trials, each with only two possible outcomes
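A sketch that ties the pmf, p, and q cards to the binomial: the probability of k successes in n independent trials is C(n, k) p^k q^(n-k). The helper name `binomial_pmf` and the choice of n = 3, p = 0.5 (three fair coin flips) are assumptions for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    q = 1 - p  # probability of failure on each trial
    return comb(n, k) * p**k * q**(n - k)


# Hypothetical example: number of heads in 3 fair coin flips.
n, p = 3, 0.5
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
print(pmf)  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```

The pmf assigns a probability to every possible outcome (0 through n), and those probabilities add up to 1.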
area under the curve
probability that the variable falls within that range of values
normal random variable
continuous random variable that describes some natural phenomena
68-95-99.7 rule
68% of the area under the curve is within μ ± σ
95% is within μ ± 2σ
99.7% is within μ ± 3σ
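The rule can be checked numerically with `statistics.NormalDist` from the standard library. This sketch uses the standard normal (μ = 0, σ = 1); the helper name `area_within` is made up.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal curve


def area_within(k: float) -> float:
    """Area under the curve between mu - k*sigma and mu + k*sigma."""
    return z.cdf(k) - z.cdf(-k)


areas = {k: round(area_within(k), 4) for k in (1, 2, 3)}
print(areas)  # {1: 0.6827, 2: 0.9545, 3: 0.9973}
```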
one side vs. two side tests
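one sided=alternative goes in one direction only (greater than the H0 value, or less than it)
two sided=alternative goes in either direction (greater or less than the H0 value)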
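A sketch of how the choice changes the P-value for the same test statistic. It assumes a z test and an invented statistic of z = 2.0; for a symmetric curve the two-sided P-value is twice the one-sided one.

```python
from statistics import NormalDist

z_stat = 2.0  # hypothetical test statistic from a z test
std_normal = NormalDist()

# One-sided (upper tail): area to the right of z_stat only.
p_one_sided = 1 - std_normal.cdf(z_stat)

# Two-sided: area in both tails, beyond +|z| and below -|z|.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z_stat)))

print(round(p_one_sided, 4), round(p_two_sided, 4))  # 0.0228 0.0455
```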
example of a normal random variable
height, weight, systolic blood pressure
sampling behavior of the mean
sample means tend to be normally distributed, with an expected value equal to the population mean and a standard deviation equal to the standard error of the mean
interval estimation
surrounds the point estimate with a margin of error
standard error of the mean
standard deviation divided by the square root of the sample size (σ/√n)
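A worked sketch of the standard error and of an interval estimate built from it. The numbers (σ = 15, n = 25, sample mean of 120) are invented, and treating σ as known so that a normal multiplier can be used is an assumption of this example.

```python
import math
from statistics import NormalDist

sigma = 15          # population standard deviation (assumed known)
n = 25              # sample size
sample_mean = 120   # point estimate, e.g. mean systolic blood pressure

sem = sigma / math.sqrt(n)  # 15 / 5 = 3.0

# Margin of error for a 95% interval: about 1.96 standard errors.
z_multiplier = NormalDist().inv_cdf(0.975)
margin = z_multiplier * sem

lower, upper = sample_mean - margin, sample_mean + margin
print(sem, round(lower, 2), round(upper, 2))  # 3.0 114.12 125.88
```

Quadrupling n to 100 would halve the standard error to 1.5, which is why bigger samples give tighter intervals.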
type 1 error
rejecting null when you should have kept it
significance level
cutoff (α) that the P-value is compared against; a smaller P-value gives stronger evidence against H0
what does the funky u (μ) mean?
expected value
type 2 error
keeping null when you should have rejected it
pdf definition
assigns probabilities to all possible ranges of outcomes for a CONTINUOUS random variable