Clinical Stats Flashcards
what are the 2 types of data?
qualitative (categorical) or quantitative (numerical)
describe the 3 types of qualitative data.
binary - two categories e.g. gender
nominal - named categories e.g. blood type
ordinal - ordered categories e.g. stage of cancer
describe the 2 types of quantitative data
continuous - any value within a range
discrete - whole numbers
what are descriptive statistics? give the 2 methods how they are used.
methods of organising, summarising and presenting data in a convenient and informative way
- graphical technique - visualise data
- numerical technique - using numeric and tabular form
when looking at collected data, what should be considered?
- where is the centre?
- what is the range?
- any outliers/anomalies?
- what’s the shape of the distribution?
how can you plot categorical data?
bar charts
how can you plot continuous data?
histograms
box plots
describe a box plot
what are the 3 distribution shapes?
left skewed
symmetric
right skewed
how can you describe numeric data? (10)
- mean
- median
- mode
- quartiles
- interquartile range
- range
- variance
- standard deviation
- coefficient of variation
- shape - skewness
what is the mean? is it affected by extreme values?
add all the values and divide by the number of values
yes, extreme values affect it
what is the median? is it affected by extreme values?
the middle number
if there are two middle numbers, take the value halfway between them
no, it’s not affected by extreme values
what is the mode?
the most common number
is there relation between mean, median and mode (the central tendencies)?
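no fixed relation - they are only all equal when the distribution is symmetric with one peak e.g. normal distribution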
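a minimal python sketch of the three central tendencies, using made-up ages (the numbers and variable names are assumptions for illustration only). it shows the mean shifting when one extreme value is added while the median barely moves:

```python
import statistics

# made-up ages for illustration; 90 is the extreme value added later
ages = [21, 22, 22, 23, 24]
ages_with_outlier = ages + [90]

mean_before = statistics.mean(ages)
median_before = statistics.median(ages)
mode_before = statistics.mode(ages)

mean_after = statistics.mean(ages_with_outlier)
median_after = statistics.median(ages_with_outlier)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode:", mode_before)
```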
4 ways to measure variation
range
interquartile range (IQR)
standard deviation
coefficient of variation
how do you measure the range?
highest value-lowest value
how is the interquartile range (IQR) measured?
75th percentile value - 25th percentile value of the data
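how is the variance of data measured?
the average of the squared differences between each value and the mean
bigger the variance = the bigger the dispersion of the values
how is standard deviation measured? and what does it show?
the square root of the variance
shows variation about the mean, in the same units as the data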
in taking standard deviations and sample variance, why are values squared?
to eliminate negative values, so differences above and below the mean don’t cancel out
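a rough python sketch of the measures of variation above, on made-up values (the data and variable names are assumptions). note that python's `statistics.quantiles` is only one of several quartile conventions, so SPSS may give slightly different quartiles:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration

mean = statistics.mean(values)
value_range = max(values) - min(values)          # highest value - lowest value

q1, q2, q3 = statistics.quantiles(values, n=4)   # 25%, 50% and 75% cut points
iqr = q3 - q1

population_variance = statistics.pvariance(values)  # divides by n
sample_variance = statistics.variance(values)       # divides by n - 1
sd = statistics.pstdev(values)                      # square root of the population variance
cv = sd / mean                                      # coefficient of variation

# without squaring, the differences from the mean cancel out to zero
sum_of_differences = sum(v - mean for v in values)

print(value_range, iqr, population_variance, sample_variance, sd, cv, sum_of_differences)
```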
on SPSS, describe how the data is input.
each column = variable
each row = case e.g. the patient
variable view - can change the names of each column
- can change the value to create a code e.g. 1 = males and 2 = females
- can change the decimals e.g. age doesn’t need decimals
- measure can be changed to scale, ordinal or nominal
e.g. age = scale, sex = nominal, dfmt = ordinal
to open an excel sheet in SPSS, save the excel file first, then open it as data
to use frequency stats
- go on analyse tab>descriptive stats>frequencies
- choose your variable
- click into the charts or statistics
to use descriptive stats
- go on analyse tab>descriptive stats>descriptives
- choose variables
- choose the descriptive stats
what is normal distribution, how does it appear on a graph? (3)
a distribution of continuous data, fully determined by the mean and the standard deviation
- symmetrical
- bell shaped
- mean, median and mode = equal
what is the central limit theorem?
under appropriate conditions, the distribution of the sample mean approaches a normal distribution
- happens as the sample size gets larger, even if the original data is not normally distributed
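a small simulation sketch of the central limit theorem. everything here is an assumption chosen for illustration: a right-skewed (exponential) population with mean 1 and S.D 1, samples of 30, 2000 repeats and a fixed random seed:

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    # draw n values from a right-skewed population (exponential, mean 1, S.D 1)
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

n = 30
means = [sample_mean(n) for _ in range(2000)]

centre = statistics.mean(means)    # expected to sit close to the population mean of 1
spread = statistics.stdev(means)   # expected to sit close to 1 / sqrt(30)

print("centre of the sample means:", round(centre, 3))
print("spread of the sample means:", round(spread, 3))
```

plotting `means` as a histogram should look roughly bell shaped even though the population itself is skewed.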
on a graph, what does the shape of the bell depend on?
the standard deviation
- larger the S.D - the wider and flatter the bell (more dispersion)
- smaller the S.D - the narrower and taller the bell (less dispersion)
from a bell on a graph, how do you find the mean?
tip of the bell, follow it down to find the value
describe the empirical rule.
area between the mean +/- 1 standard deviation = covers about 68% of the data
area between the mean +/- 2 standard deviations = covers about 95% of the data
area between the mean +/- 3 standard deviations = covers about 99.7% of the data
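a quick python check of the empirical rule using the standard library's `NormalDist` (the helper name `area_within` is made up for this sketch):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, S.D 1

def area_within(k):
    # area under the bell between mean - k S.D and mean + k S.D
    return z.cdf(k) - z.cdf(-k)

within_1 = area_within(1)
within_2 = area_within(2)
within_3 = area_within(3)

for k, area in [(1, within_1), (2, within_2), (3, within_3)]:
    print(f"within {k} S.D: {area:.1%}")
```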
how would you estimate the probability that an adult has an IQ between 70 and 115 on this graph?
area within 2 S.D of the mean = 95%
minus the middle 68% (within 1 S.D of the mean)
= 27% split between the two sides of the middle
halve it to get one side
= 13.5% the area from C-D
13.5+68 = 81.5% probability
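a sketch comparing the rule-of-thumb answer with the exact area under the curve. it assumes IQ has mean 100 and S.D 15 (not stated on the card, but it is what makes 70 sit 2 S.D below the mean and 115 sit 1 S.D above it):

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # assumed IQ distribution

exact = iq.cdf(115) - iq.cdf(70)   # exact area between 70 and 115
rule_of_thumb = 0.135 + 0.68       # 13.5% + 68% from the empirical rule

print(f"exact: {exact:.1%}, empirical rule: {rule_of_thumb:.1%}")
```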
define population and parameter
population - the group of all items of interest
parameter - descriptive measure of a population
define sample and statistic
sample - the set of data drawn from the population
statistic - a descriptive measure of a sample e.g. mean, mode, median, frequency, S.D
what’s a drawback of descriptive statistics?
doesn’t allow you to draw conclusions beyond the data you collected
what is statistical inference ?
you can draw conclusions about populations based on sample data
what is sampling variation
when statistics vary from sample to sample due to random chance
how do you reduce sample variation
repeat the sampling multiple times
plot the sample means into a histogram
will probably follow a normal distribution
= the centre would be the population mean
how do you measure sampling variability
using standard error
- it decreases with increased sample size as variability decreases
what is standard error
the standard deviation of a sample statistic
how different the population mean is likely to be from a sample mean
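a minimal sketch of the standard error of a mean, taken as sample S.D divided by the square root of the sample size. the readings are made up, and repeating them 25 times is only a trick to show the effect of a bigger sample with a similar spread:

```python
import math
import statistics

def standard_error(sample):
    # standard error of the mean = sample S.D / sqrt(sample size)
    return statistics.stdev(sample) / math.sqrt(len(sample))

small_sample = [98, 100, 102, 100]   # made-up readings, n = 4
large_sample = small_sample * 25     # same spread of values, n = 100

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)

print(round(se_small, 3), round(se_large, 3))
```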
what are confidence intervals and the formula?
a range of values that is likely to contain the true population value - shows how precise the estimate is
sample statistic +/- (measure of confidence x standard error)
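a sketch of a 95% confidence interval for a mean using the formula above. the blood pressure readings are made up, and the measure of confidence is taken from the normal distribution (about 1.96), which is an assumption; for small samples SPSS uses the t distribution, so its interval will be a bit wider:

```python
import math
import statistics
from statistics import NormalDist

# made-up systolic blood pressure readings
sample = [120, 125, 130, 135, 140, 128, 132, 126, 134, 130]

mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

z = NormalDist().inv_cdf(0.975)  # measure of confidence for 95%, about 1.96

lower = mean - z * se
upper = mean + z * se

print(f"mean {mean}, 95% CI {lower:.1f} to {upper:.1f}")
```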
how do you find the confidence interval from SPSS?
read the lower bound and upper bound from the output - you can interpret that we are 95% confident that the true mean lies between them
how is a hypothesis test performed?
- create null hypothesis
- predict the sampling variability assuming null hypothesis is true
- do the experiment
- compare observed difference vs expected difference
- reject or accept the null hypothesis
what is p value
the probability of getting data at least as extreme as what was observed, assuming the null hypothesis is true
when do you accept or reject the null hypothesis
reject the null hypothesis if the p value is less than 0.05 and accept that the result is statistically significant; if the p value is 0.05 or more, accept (fail to reject) the null hypothesis
what are the types of error, describe them, where can they be found on a graph?
a-error - if the null hypothesis is true but we rejected it
- type I error
- false positive e.g. the test says you’re pregnant but you’re not
b-error - if the null hypothesis is false but we accepted it
- type II error
- false negative
what is power?
the probability of correctly rejecting the null hypothesis
i.e. the null hypothesis is false and we reject it (power = 1 - b-error)
what is the null hypothesis
there are no differences in the study e.g. no difference between weather in Leeds vs Spain
what is the t test
the t statistic = ratio of the observed difference to the expected sampling variability (standard error of the difference)
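a sketch of the t statistic for two independent groups as a ratio: observed difference in means divided by the standard error of that difference. the function name and the scores are made up; this is the "equal variances not assumed" form, and it stops at the t value because turning it into a p value needs the t distribution, which is not in python's standard library:

```python
import math
import statistics

def t_statistic(group_a, group_b):
    observed_difference = statistics.mean(group_a) - statistics.mean(group_b)
    # standard error of the difference, without assuming equal variances
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return observed_difference / standard_error

# made-up scores for two independent groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_value = t_statistic(group_a, group_b)
t_same = t_statistic(group_a, group_a)  # no observed difference gives t = 0

print(t_value, t_same)
```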
if the null hypothesis is rejected, is the study statistically significant
yes
when concluding the hypothesis, which 3 things must you consider?
- is it statistically significant? - if the null hypothesis is rejected, then yes
- is it clinically significant? - needs to be checked with clinician
- is it a causal association
describe the results from this t test
- always look at two sided p value as it allows for a difference in either direction
- only consider one sided p value if you have prior context
- use first row if equal variances are assumed
..
- the significance of the Levene’s Test is 0.30 which is > 0.05 so we cannot reject that the groups have similar variance
- SO, we use the results from the first row (equal variances assumed)
- the p value is .852 which is > 0.05 so we accept (fail to reject) the null hypothesis that the two groups have the same mean
where is the p value on a chi squared test?
under asymptotic significance
what 3 tests can be used to measure continuous outcomes - comparing means?
- independent two sample t test - independent groups
e.g. men vs women, smoker vs non-smoker
- paired t test - correlated groups
e.g. twins, measurement before and after tx
- ANOVA - 2+ independent groups
what 2 tests are used to measure categorical outcome? - comparing proportions
chi-squared - independent observations
McNemar’s chi-square test - correlated observations
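a rough sketch of the chi-squared test on a 2x2 table of independent observations. the counts and function name are made up, no continuity correction is applied, and the p value shortcut with `math.erfc` only holds for 1 degree of freedom (a 2x2 table):

```python
import math

def chi_squared_2x2(table):
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if the row and column variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # 1 degree of freedom only
    return chi2, p_value

# made-up counts: rows = group, columns = outcome yes/no
chi2, p_value = chi_squared_2x2([[10, 20], [20, 10]])
chi2_null, p_null = chi_squared_2x2([[10, 10], [10, 10]])

print(round(chi2, 2), round(p_value, 4))
```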
what is to be assumed for continuous data?
that the outcome is normally distributed
how do you know if the observations are correlated?
if you’ve got the same test subject at the beginning and at the end
e.g. testing if Rajan is 6” at the start of the day and if he is still 6” at the end
what if the outcome is NOT normally distributed or the sample size is too small, what test is used then?
non-parametric test
the Wilcoxon signed-rank test (correlated groups) or the Mann-Whitney U test (independent groups)
- don’t calculate the means as it may be misleading
- use the median score
what does the ANOVA test mean?
Analysis of Variance test
why is the ANOVA test difficult to interpret?
it tells you that at least two of the groups differ, but not which ones, it requires more analysis
with ANOVA, why can’t you just do multiple pairwise t tests? e.g. 3 pairwise tests
more chance of making a type I error
1-(0.95)^3 = 14% chance
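the same calculation as a tiny python sketch (the function name is made up), assuming the tests are independent and each uses a 0.05 cut-off:

```python
def familywise_error(number_of_tests, alpha=0.05):
    # chance of at least one type I error across all the tests
    return 1 - (1 - alpha) ** number_of_tests

error_one_test = familywise_error(1)
error_three_tests = familywise_error(3)

print(f"{error_one_test:.0%} for 1 test, {error_three_tests:.0%} for 3 tests")
```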
what test can be used to check multiple tests are significant?
Bonferroni Correction
- do pairwise tests
- the new p value threshold = 0.05/number of tests
- use the new p value threshold to see which groups are significant
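a sketch of the Bonferroni Correction on made-up p values from 3 pairwise tests (the group names, p values and function name are all assumptions for illustration):

```python
def bonferroni(p_values, alpha=0.05):
    # divide the usual cut-off by the number of tests
    threshold = alpha / len(p_values)
    significant = {name: p < threshold for name, p in p_values.items()}
    return threshold, significant

# made-up p values from 3 pairwise tests
p_values = {"A vs B": 0.010, "A vs C": 0.030, "B vs C": 0.400}

threshold, significant = bonferroni(p_values)

print(round(threshold, 4), significant)
```

in this made-up example "A vs C" would pass the usual 0.05 cut-off but not the corrected one.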
what is the alternative to ANOVA for a non-parametric test?
Kruskal-Wallis test
what is the alternative to Chi-Squared if there are sparse cells?
Fisher’s Exact Test
what do you do when concluding from a confidence interval for a difference?
report the upper and lower bound
- if it covers 0, then it is not statistically significant, accept null hypothesis
- if it doesn’t cover 0, then it is statistically significant, reject null hypothesis
what is correlation coefficient?
a measure of the strength and direction of the linear relationship between 2 variables
result will be from -1 to 1
- 0 = no correlation
- positive number = positive correlation
- negative = negative correlation
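a minimal sketch of the (Pearson) correlation coefficient written out by hand, on made-up data where the relationship is perfectly positive and then perfectly negative. the function name is an assumption:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # how the two variables vary together, scaled by how much each varies alone
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises as x rises
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises

print(r_positive, r_negative)
```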
what is linear regression?
when the two variables are not treated as equals (unlike correlation)
- one variable = independent predictor
- other = dependent outcome
define y=mx+c
m = gradient
c = where it intercepts on the y axis
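a sketch of fitting y=mx+c by least squares, with x as the independent predictor and y as the dependent outcome. the data points and the function name are made up:

```python
def fit_line(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # gradient: how much y changes for each 1 unit change in x
    m = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    # intercept: the value of y where the line crosses the y axis (x = 0)
    c = mean_y - m * mean_x
    return m, c

m, c = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
predicted_y = m * 4 + c  # predict the outcome for a new x of 4

print(m, c, predicted_y)
```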
describe absolute risk difference, relative risk ratio and odds ratio with vitamins 12/33 (36.4%) and placebo 22/35 (62.9%)
absolute risk difference
- find the difference between the two risks in percentage
e.g. 62.9%-36.4% = 26.5%
risk ratio
- risk in treated group/risk in control group
36.4%/62.9% = 57.9%
= 42% decrease in relative risk
odds ratio
- odds = the ratio of an event occurring:not occurring
- odds ratio = odds in treated group/odds in control group
(36.4% / (100%-36.4%)) / (62.9% / (100%-62.9%))
= 0.57 / 1.69
= 0.34
= 66% decrease in relative odds
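the vitamins 12/33 vs placebo 22/35 example as a python sketch (the function and variable names are made up; the counts are the ones on the card):

```python
def risk_measures(events_treated, n_treated, events_control, n_control):
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    risk_difference = risk_control - risk_treated
    risk_ratio = risk_treated / risk_control
    odds_treated = risk_treated / (1 - risk_treated)   # event : no event
    odds_control = risk_control / (1 - risk_control)
    odds_ratio = odds_treated / odds_control
    return risk_difference, risk_ratio, odds_ratio

risk_difference, risk_ratio, odds_ratio = risk_measures(12, 33, 22, 35)

print(f"risk difference {risk_difference:.1%}")
print(f"risk ratio {risk_ratio:.2f}")
print(f"odds ratio {odds_ratio:.2f}")
```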
quick tip how to interpret odds ratio if RR>1 or RR<1
if RR>1 then odds ratio will always be bigger than the RR
if RR<1 then odds ratio will always be smaller than the RR
describe Kaplan-Meier
estimates survival functions for each group
- describes study populations
what is survival analysis?
statistical method for analysing longitudinal data on the occurrence of events
what is a time-to-event?
the time from entry into a study until the subject has a particular outcome
what is censoring in survival analysis
when subjects are lost to follow up/dropped out/study ends before they have the event e.g. before they die
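a bare-bones sketch of a Kaplan-Meier style survival estimate for one group, to show how censored subjects are handled: they count as "at risk" until they leave, but leaving does not drop the survival curve. the follow-up times, flags and function name are made up:

```python
def kaplan_meier(times, event_observed):
    """Return [(time, survival probability)] at each time an event happens."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        events = sum(
            1 for time, event in zip(times, event_observed) if time == t and event
        )
        leaving = sum(1 for time in times if time == t)  # events + censored at t
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# made-up follow-up times in months; False = censored (lost to follow up / study ended)
times = [2, 3, 3, 5, 8]
event_observed = [True, True, False, True, False]

curve = kaplan_meier(times, event_observed)
for time, survival in curve:
    print(time, round(survival, 2))
```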
what are all the tests
cant flipping make tik-toks and bake kupcakes with marshmallow
CFMTTABKWM