Intro to Inferential Stats Flashcards
what is statistics?
the collection, organization, summarization, and analysis of data
t/f: statistics involves drawing an inference about data when only part of the data is observed
true
what is a population?
the largest collection of entities for which we have an interest at a particular time
what is a sample?
a subset that is representative of a population
if a sample is not representative of the population, can we draw inferences from the data?
no
what is random sampling?
sampling technique in which each member of the population has an equal opportunity of being selected into the sample
what is a stratified random sample?
sampling technique where the population is broken into subcategories (strata) and a random sample is taken from each stratum
what are some examples of strata for a stratified random sample?
age, sex, socioeconomic status, level of injury
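A minimal sketch of stratified random sampling, assuming a made-up population of 60 patients tagged with an age group. The function name `stratified_sample`, its parameters, and the data are hypothetical and only illustrate the idea of drawing a random sample from every stratum.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of n_per_stratum members from each stratum."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    strata = {}
    for member in population:
        # Group members by the stratum label returned by stratum_of
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for label in sorted(strata):
        # Every member of a stratum has an equal chance of being picked
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

# Hypothetical population: (patient id, age group)
population = [(i, "under 40" if i % 3 else "40 and over") for i in range(60)]
sample = stratified_sample(population, stratum_of=lambda p: p[1], n_per_stratum=5)
print(len(sample))  # 5 patients from each of the 2 strata
```

Sampling the same number from each stratum is one design choice; sampling in proportion to stratum size is another common option.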
what factors can make a sample unrepresentative of the population?
bias and sampling error
what is an example of volunteer/participation bias that often occurs?
women tend to volunteer more often than men
what are 2 examples of sampling error?
1) sample of convenience
2) sampling from a single area
what is a parameter?
characteristic of population
is a parameter or statistic usually denoted by Greek letters?
parameter
is a parameter usually known?
no, it is usually inferred based on the statistic
what is a statistic?
characteristic of a sample
an estimate of the ___ is based on a ____
parameter, sample
t/f: we infer the parameter based on the statistic
true
what is a variable?
characteristics, #, or quantity that can be measured/counted
what are common examples of variables?
gender, age, # of patients, etc
what is a quantitative variable?
variable measured as a #
conveys info regarding the amount
what are examples of quantitative variables?
height, weight, length, age, temp, etc
what is a qualitative variable?
things that possess some characteristic of interest
data that can be categorized/described in words, not #s
what are some examples of qualitative variables?
medical dx, ethnic group, hometown, etc
t/f: if a qualitative variable is coded as a number it becomes quantitative data
false, it is still qualitative
what are the quantitative variable types?
continuous and discrete variables
what is a continuous variable?
ordered numerical data that can assume any value (within a range)
what are examples of continuous variables?
height, weight, age, force, systolic BP, cholesterol level
what is a discrete variable?
data in whole #s
ordered numerical data restricted to integer values (count data)
are dichotomous variables continuous or discrete?
discrete
what is a dichotomous variable?
a variable with only 2 possible values (ex: dx of the flu, stroke, etc as +/-)
what are examples of discrete variables?
# of children, # of eggs per chicken, # of deaths, dx (+/-), dead or alive
what are the measurement levels? (No Oil In Rivers)
nominal, ordinal, interval, ratio variables
what is a nominal variable?
categories without a natural order
mutually exclusive
usually qualitative
what are examples of nominal variables?
gender, nationality, favorite animal
what are the 2 categorical variables?
nominal and ordinal variables
what is an ordinal variable?
categories w/a natural ordering (ranking)
quantitative order
exact dif bw measures is unknown
what are examples of ordinal variables?
socioeconomic status, position in a race, pain scale, RPE
what is an interval variable?
possible to order
intervals are known
no absolute zero (0 doesn’t mean nonexistent)
what are examples of interval variables?
temp and joint angles
what is a ratio variable?
zero represents the absence of a value
what are examples of ratio variables?
length and force
what are the 2 metric variables?
interval and ratio variables
what is observational research?
tracking people prospectively or retrospectively
not manipulating variables
correlations can be drawn
t/f: observational research yields strong cause-effect inferences
false
what is experimental research?
actively making adjustments to variables
requires planning on controls and experimental manipulations
does observational or experimental research yield greater cause-effect inferences?
experimental research
what is an independent variable?
the variable that is manipulated or controlled by the researcher (placebo, exercise, etc)
what is a dependent variable?
the variable that is measured
what you see based on manipulation of the IV
what is the purpose of descriptive statistics?
to numerically or graphically describe a set of data
what does N mean?
the population size
what does n mean?
the sample size
what are the measures of central tendency?
mean, median, mode
what are measures of central tendency showing?
where the data tends to cluster
what are measures of variability showing?
how data tends to spread out
what is the mean?
sum of all observations divided by the # of observations
the average
which measure is affected by every score in a distribution, including outliers?
the mean
what is the population mean represented by?
µ
what is the sample mean represented by?
x bar
is the population mean measured or inferred?
inferred
is the sample mean measured or inferred?
measured
what is the median?
the middle-most observation of ordered data
what measure of central tendency is unaffected by extreme values?
the median
how is the median value obtained?
order the data from largest to smallest or smallest to largest and then find the middle value
what is the mode?
the most frequently occurring data (ie the most used brand of hot packs)
t/f: the mode is not usually used bc it doesn’t always exist and doesn’t tell us a lot of info
true
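A short sketch of the 3 measures of central tendency using Python's standard `statistics` module. The scores are made up; the 40 is included as an outlier to show that it pulls the mean but not the median.

```python
import statistics

scores = [2, 3, 3, 4, 5, 6, 40]  # made-up scores; 40 is an outlier

mean = statistics.mean(scores)      # sum of observations / # of observations
median = statistics.median(scores)  # middle value of the ordered data
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)  # mean is 9, median is 4, mode is 3

# Dropping the outlier moves the mean a lot but the median only a little
trimmed = scores[:-1]
print(statistics.mean(trimmed), statistics.median(trimmed))
```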
what is percentile?
the point at which a certain % of the data lie below it
ex: GRE in 90th percentile means that 90% of test takers scored below you
does the 50th percentile represent the mean, median, or mode?
median
what is the 1st quartile?
25th percentile
what is the 2nd quartile?
50th percentile (median)
what is the 3rd quartile?
75th percentile
what is the 4th quartile?
100th percentile
what is the interquartile range (IQR)?
the range of the 1st to 3rd quartile
difference bw the 75th and 25th percentile
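A minimal sketch of quartiles and the IQR with `statistics.quantiles`. The data are made up, and `method="inclusive"` is an assumption: different software interpolates percentiles slightly differently, so other tools may report slightly different quartiles for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up ordered data

# n=4 splits the data into quarters and returns the 3 cut points
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # difference bw the 75th and 25th percentile

print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0
```

The 2nd quartile matches the median of the same data.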
when should the mean be used?
when all available info is to be considered
when should the median be used?
when the middle score is needed, the most typical score is needed, or the data has extreme scores
what are the measures of variability?
range, IQR, variance, and standard deviation (SD)
what is the range?
can be given as a raw range or calculated by subtracting the smallest data point from the largest
what is SD?
the square root of variance
what is the easiest measure of variability to interpret?
SD
what is the coefficient of variation (CV)?
normalizing SD by mean
a unitless number
how is the CV calculated?
the sample SD divided by the sample mean (sometimes multiplied by 100)
how is the population variance represented?
sigma squared
how is the sample variance represented?
S squared
how is the population variance calculated?
subtract the mean from every data point, square each difference, and add them up
then divide by the population size
how is the sample variance calculated?
subtract the mean from every data point, square each difference, and add them up
then divide by the sample size minus 1
t/f: the sample variance has no direction
true
if there is a greater variance is there a greater or lesser dispersion of data from the mean?
greater
how is the population SD represented?
sigma
how is the sample SD represented?
S
how is the population SD calculated?
take the square root of the population variance
how is the sample SD calculated?
take the square root of the sample variance
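A worked sketch of the variance and SD calculations, assuming a small made-up data set. The deviations from the mean are squared before they are added up; the only difference bw the population and sample versions is the divisor (N vs n - 1).

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

mean = sum(data) / len(data)
# Squared deviations from the mean, added up
sum_sq = sum((x - mean) ** 2 for x in data)

pop_var = sum_sq / len(data)         # population variance: divide by N
samp_var = sum_sq / (len(data) - 1)  # sample variance: divide by n - 1

pop_sd = math.sqrt(pop_var)    # population SD
samp_sd = math.sqrt(samp_var)  # sample SD

# Cross-check against the standard library
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))

print(pop_var, pop_sd)  # 4.0 2.0
```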
why is SD used more than variance?
bc it allows you to have the same units as the central tendencies
what is the advantage of CV?
it cancels out the units so that you can compare data regardless of units
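A small sketch showing why the CV is unitless, assuming made-up heights. The helper name `coefficient_of_variation` is hypothetical. The same heights are expressed in cm and in inches, and the CV comes out the same either way.

```python
import statistics

def coefficient_of_variation(values):
    """Sample SD divided by the sample mean, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [160, 165, 170, 175, 180]       # made-up heights
heights_in = [h / 2.54 for h in heights_cm]  # same heights in inches

cv_cm = coefficient_of_variation(heights_cm)
cv_in = coefficient_of_variation(heights_in)

# The SDs differ bc the units differ, but the CVs match
print(statistics.stdev(heights_cm), statistics.stdev(heights_in))
print(cv_cm, cv_in)
```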
reliability is ___ and ____
reproducibility and consistency
t/f: reproducibility and consistency are generally determined by correlations
true
what is intra-rater reliability?
analysis of a single rater’s consistency
direction, scoring, timing
what is inter-rater reliability?
test performed by 2 or more individuals on the same event
what is test-retest reliability?
repeating a test multiple times to determine consistency
same day, daily, 1 week apart
what is the range for reliability?
0-1
what does a reliability score close to zero mean?
bad reliability
what does a reliability score close to 1 mean?
good reliability
what is validity?
the soundness/appropriateness of the test in measuring what it is designed/intended to measure
what are the types of validity?
content, construct, concurrent, face, internal, and external validity
what is internal validity?
measure of the control w/in the experiment to ascertain that the results are due to the experimental manipulations
what is instrument error?
poor calibration of the treatment unit
what is investigator error?
error in technique, instructions, or investigator bias
what are common techniques to ensure internal validity?
blinding, placebo, and randomization of samples
what is external validity?
ability to generalize the results to the population from which samples were drawn
how well does the sample reflect the population of interest?
t/f: tight experimental control may make the study unrealistic (ie too many days a week of PT in the study)
true
what is sensitivity?
TP/(TP+FN)
snOUT
what is specificity?
TN/(TN+FP)
spIN
if a condition is present and the test is positive this is…
TP
if a condition is present and the test is negative this is…
FN
if a condition is absent and the test is positive this is…
FP
if a condition is absent and the test is negative this is…
TN
what is positive predictive value?
TP/(TP+FP)
proportion of positive test results that are true positives (patients who truly have the condition)
what is the negative predictive value?
TN/(TN+FN)
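A minimal sketch of the 4 diagnostic accuracy measures computed from a 2x2 table. The function name `diagnostic_metrics` and the counts are made up for illustration. Sensitivity uses only the people who have the condition (TP and FN), and specificity uses only the people who don't (TN and FP).

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # condition present, test (+)
        "specificity": tn / (tn + fp),  # condition absent, test (-)
        "ppv": tp / (tp + fp),          # (+) tests that are truly (+)
        "npv": tn / (tn + fn),          # (-) tests that are truly (-)
    }

# Hypothetical counts: 100 people with the condition, 100 without
metrics = diagnostic_metrics(tp=80, fn=20, fp=10, tn=90)
for name, value in metrics.items():
    print(name, round(value, 3))
```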
what does a scatterplot allow for?
visualization of raw data
in a scatter plot, the IV is usually on the __ axis, and the DV is usually on the __ axis
X, Y
what are histograms?
bar graphs that visualize the pattern of the frequency distribution of the data
can measure mode very easily (the tallest bar)
normal distribution is characterized by what 2 pieces of info?
mean and SD
what is the empirical rule of normal distribution?
the frequency of data declines in a predictable manner as data deviates farther from the center of the bell curve
t/f: in the normal distribution, the mean=median=mode
true
where does 68% of the data lie in a normal distribution?
µ +/- 1 SD
where does 95.4% of the data lie in a normal distribution?
µ +/- 2 SD
where does 99.7% of the data lie in a normal distribution?
µ +/- 3 SD
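The 68 / 95.4 / 99.7 figures can be checked numerically. This sketch uses the standard relationship bw the normal curve and the error function (`math.erf`); the helper name `share_within` is hypothetical.

```python
import math

def share_within(k):
    """Proportion of a normal distribution lying within mu +/- k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))  # 68.3, 95.4, 99.7
```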
what is the mean and SD of a standard normal curve?
mean=0
SD=1
when the SD of a normal curve changes what happens?
vertical curve change (height and width)
smaller SD=taller and narrower
larger SD=shorter and wider
when a normal curve has a mean of 0 and an SD of 2, what happens to the curve?
it gets shorter and wider
when a normal curve has a mean of 3 and SD of 2, what happens to the curve?
it gets shorter and wider and shifts to the right
when the mean of a normal curve changes, what happens?
horizontal curve change
(+)=R shift
(-)=L shift
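A quick numerical check of how the mean and SD change the curve, using `statistics.NormalDist` (Python 3.8+). The peak of a normal curve sits at the mean and its height is 1/(SD x sqrt(2 pi)), so doubling the SD halves the peak, while changing the mean only moves where the peak is.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # standard normal curve
wider = NormalDist(mu=0, sigma=2)     # same mean, larger SD
shifted = NormalDist(mu=3, sigma=2)   # larger mean and larger SD

# Height of each curve at its own mean (the peak)
peak_standard = standard.pdf(0)
peak_wider = wider.pdf(0)
peak_shifted = shifted.pdf(3)

print(round(peak_standard, 4))  # about 0.3989
print(round(peak_wider, 4))     # about 0.1995: shorter (and wider)
print(round(peak_shifted, 4))   # same height as above, peak moved R to x=3
```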
what is the z score?
raw score expressed in SD units
ex: if the mean is 80 and the SD is 11, 91 has a z score of +1, 69 has a z score of -1
if mean=80 and SD=11, how many data points (in %) fall bw 69-91?
68% (mu+/- 1 SD)
in the standard normal distribution, a z score of 0=__
mean
in the standard normal distribution, a z score of 1=__
1 SD
is 1.3 or -0.50 closer to the mean of a standard normal distribution?
-0.50
what does the z score tell us?
how far we are from the mean
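A minimal sketch of the z score calculation, z = (raw score - mean) / SD, using the mean of 80 and SD of 11 from the example above. The function name `z_score` is hypothetical.

```python
def z_score(raw, mean, sd):
    """Express a raw score in SD units from the mean."""
    return (raw - mean) / sd

print(z_score(91, mean=80, sd=11))  # 1.0
print(z_score(69, mean=80, sd=11))  # -1.0
print(z_score(80, mean=80, sd=11))  # 0.0 (the mean itself)
```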
what is a bimodal distribution?
a distribution with 2 equal modes (curve has 2 peaks of the same height)
what is skewness?
a disproportionate # of data points towards one end of the scale
what is (+) skew?
a curve with a longer R tail and more low value points
what is a (-) skew?
a curve with a longer L tail and more high value points
what is kurtosis?
relative peakedness of the curve
what is platykurtosis?
(-) kurtosis
lower peak
what is leptokurtosis?
(+) kurtosis
higher peak
with (+) skew, what is the relationship bw mean, median, and mode?
mode is the peak
mean follows the tail to the right and is the largest of the 3 #s
median is bw the two
with a normal curve, what is the relationship bw mean, median, and mode?
they are all the peak of the curve
mean=median=mode
with (-) skew, what is the relationship bw mean, median, and mode?
mode is the peak
mean follows the tail to the left and is the smallest of the 3 #s
median is bw the two
what does skewness of 0 and kurtosis of 0 indicate?
normal distribution
if skewness is >0, what does this mean?
the R tail is longer; most data points cluster toward the low (L) end
(+) skew
if skewness is <0, what does this mean?
the L tail is longer; most data points cluster toward the high (R) end
(-) skew
if kurtosis is >0, what does this mean?
the curve is more peaked
(+) kurtosis
leptokurtosis
if kurtosis is <0, what does this mean?
the curve is flatter
(-) kurtosis
platykurtosis
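A rough sketch of how skewness and kurtosis can be computed from the data, assuming the simple moment-based formulas (statistical software often applies small-sample corrections, so reported values may differ slightly). The function names and data are made up. Kurtosis is reported as excess kurtosis, so a normal curve scores 0.

```python
def central_moment(values, k):
    """Average of (x - mean)^k over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

right_tail = [1, 1, 2, 2, 2, 3, 3, 4, 10]  # long R tail, most points low
left_tail = [-x for x in right_tail]       # mirror image: long L tail
flat = [1, 2, 3, 4, 5]                     # symmetric and flat

print(skewness(right_tail) > 0)   # True: (+) skew
print(skewness(left_tail) < 0)    # True: (-) skew
print(skewness(flat))             # 0.0: symmetric
print(excess_kurtosis(flat) < 0)  # True: flatter than normal (platykurtosis)
```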
what is a hypothesis?
an educated guess/logical assumption that is based on prior research or known facts and that can be tested
if a hypothesis is supported over time, it can become a ___
theory
what is a theory?
belief regarding a concept/series of related concepts
what are examples of theories?
gravity, evolution, sliding filament theory of muscle contraction
what are the hypothesis testing procedures?
set up a hypothesis
choose which statistical test to use
find test statistic
find p value
draw a conclusion (based on hypothesis and research question)
what is the null hypothesis (H0)?
hypothesis that predicts no difference/no relationship bw the groups
what is the alternate hypothesis (Ha)?
hypothesis that predicts differences/relationships bw groups
is H0 or Ha the research hypothesis?
Ha
t/f: H0 and Ha must be mutually exclusive and exhaustive
true
if you reject the null, you are ___ the alternate
accepting
if you fail to reject the null, you are __ the alternate
rejecting
if H0 is true, Ha is ___
false
if H0 is false, Ha is___
true
a ___ hypothesis MUST contain equality
null
if H0: mu=mu0, Ha:…
mu is not equal to mu0
if H0: mu is less than or equal to mu0, Ha:…
mu>mu0
if H0: mu is greater than or equal to mu0, Ha:…
mu<mu0
what is a 2-sided hypothesis (2 tailed)?
H0: mu=mu0
Ha: mu is not equal to mu0
what is a 1 sided hypothesis (1 tailed)?
H0: mu is less than or equal to mu0; Ha: mu>mu0
H0: mu is greater than or equal to mu0; Ha: mu<mu0
what is a 2 tailed test?
Ha: mu1 is not equal to mu2
rejection area is divided bw the 2 tails of the sampling distribution (1/2 in each tail)
what is a 1 tailed test?
Ha: mu1>mu2; mu1<mu2
all rejection area in one tail of the sampling distribution
what is a test statistic?
a value showing how much the data support the alternate hypothesis (research hypothesis)
the bigger the test statistic (+ or -) the ___ the support for Ha
stronger
if a test statistic is closer to 0, you likely ___ Ha
reject
what is probability?
long-run proportion of a particular outcome
if p=0, the outcome is…
impossible
if p=1, the outcome is…
certain
if mean=80 and SD=11, what is the probability of getting a data point bw 80-91 if we pulled one data point out at random?
34% (1/2 of 68% bc it’s + 1 SD)
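The 34% figure can be checked as an area under the normal curve. This sketch assumes the data are normally distributed with mean 80 and SD 11 and uses `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

scores = NormalDist(mu=80, sigma=11)

# Area under the curve bw 80 (the mean) and 91 (mean + 1 SD)
p_80_to_91 = scores.cdf(91) - scores.cdf(80)
# Area bw 69 and 91 (mean +/- 1 SD)
p_69_to_91 = scores.cdf(91) - scores.cdf(69)

print(round(p_80_to_91, 2))  # 0.34
print(round(p_69_to_91, 2))  # 0.68
```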
what is level of confidence (LOC)?
% figure that establishes the probability that a statement is correct
what is the probability of error?
alpha
the % remaining after the LOC (100% minus the LOC)
what is the probability of error when the LOC is 68%?
32%
what is the probability of error when the LOC is 95%?
5%
what is alpha?
significance level
area under the normal curve that represents the probability of error
usually set at 0.05 (5%)
if p=0.01, what is the probability of getting results at least this extreme if H0 is true?
1 in 100
if p=.10, what is the probability of getting results at least this extreme if H0 is true?
10 in 100
1 in 10
10%
if p=.05, what is the probability of getting results at least this extreme if H0 is true?
5 in 100
1 in 20
5%
if p=.001, what is the probability of getting results at least this extreme if H0 is true?
1 in 1000
0.1%
the lower the p value, the decision to reject H0 is ___
stronger
if p<alpha, is this good?
yes!
if you fail to reject H0, and H0 is actually true, this is ___
TN
if you fail to reject H0, and H0 is actually false, this is ___
FN
type 2 error (beta)
what is a type 2 error?
FN
saying that there’s no change (fail to reject H0) when there is actually change (H0 is false)
beta
what is type 1 error?
FP
saying that there’s change (reject H0) when there is actually no change (H0 is true)
alpha
if you reject H0 and H0 is actually true, this is___
FP
type 1 error (alpha)
if you reject H0 and H0 is actually false, this is___
TP
if you accept H0 (fail to reject H0, reject Ha), is this positive or negative?
negative
if you reject H0 (accept Ha), is this positive or negative?
positive
alpha and beta have an ___ relationship
inverse
is beta a type 1 or 2 error?
type 2 error
is alpha a type 1 or 2 error?
type 1 error
t/f: for a fixed sample size, you can’t reduce both alpha and beta at the same time
true
if alpha is increased, beta is ___
decreased
if alpha is decreased, beta is ___
increased
for a fixed sample size, can you reduce type 1 and 2 errors at the same time?
no
what is the p value?
determined by the test statistic
associated with type 1 error (alpha)
making a decision on rejecting/failing to reject the null hypothesis depends on what?
the comparison of p to the level of significance (alpha)
if p is less than or equal to alpha, do we reject H0 and accept Ha?
yes!
if p is less than or equal to alpha, is this significant?
yes!
if p>alpha, do we fail to reject H0 (accept H0 and reject Ha)?
yes!
if p>alpha, is this significant?
no
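A minimal sketch of the decision rule (compare p to alpha), assuming a z test where the test statistic follows the standard normal curve under H0. The function names and the test statistic of 1.8 are made up. The same test statistic can be significant in a 1 tailed test but not in a 2 tailed test, bc the 2 tailed p value counts both tails.

```python
from statistics import NormalDist

def p_value_from_z(z, tails=2):
    """p value for a z test statistic under the standard normal curve.

    For tails=1 this assumes the statistic falls in the direction
    predicted by Ha.
    """
    one_tail_area = 1 - NormalDist().cdf(abs(z))
    return tails * one_tail_area

def decide(p, alpha=0.05):
    """Reject H0 when p is less than or equal to alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"

z = 1.8  # hypothetical test statistic
p_two = p_value_from_z(z, tails=2)
p_one = p_value_from_z(z, tails=1)

print(round(p_two, 3), decide(p_two))  # 0.072 fail to reject H0
print(round(p_one, 3), decide(p_one))  # 0.036 reject H0
```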
how can you write a p value?
p=#
or
p </> #
t/f: it is required to report the p value in the conclusion
true
what is included in the conclusion?
summary of the question, parameter tested, and results
don’t just state “reject” or “failed to reject” H0
what is a confidence interval?
range of values associated w/a level of confidence
estimated range that is likely to contain the population (true) parameter at a given level of confidence
usually 95%
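A rough sketch of a 95% confidence interval for a mean, assuming a normal (z-based) interval: sample mean +/- z x SD / sqrt(n). The function name and the numbers (mean 80, SD 11, n 30) are made up; with small samples a t-based interval is typically used instead, which gives a slightly wider range.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """z-based confidence interval for a population mean."""
    # z value that leaves (1 - confidence)/2 in each tail (about 1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = mean_confidence_interval(sample_mean=80, sd=11, n=30)
print(round(lower, 2), round(upper, 2))
```

A higher level of confidence (ie 99%) widens the interval, and a larger sample narrows it.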