Week 9 Flashcards
What is biostatistics
The science of analyzing data and interpreting the results so that they can be applied to solving problems related to biology, health or related fields
What is univariate analysis
Describes one variable in a data set using simple statistics such as counts, proportions, and averages
What is multivariable analysis
Encompasses statistical tests such as multiple regression models that examine the relationships among three or more variables
(for example, to account for confounding)
What is a variable
Any quantity that varies from one entity to another (sometimes within an entity over time)
Any attribute, phenomenon, or event that can have different values
What are the two subcategories of variables
Quantitative and Qualitative
What are the two types of variables in quantitative studies
Discrete
Continuous
What are the two types of variables in qualitative studies
Nominal
Ordinal
What is a nominal variable
No intrinsic or logical order or value
You can assign numbers to different categories
But they do not have any other numeric properties
What is an ordinal variable
Intrinsic order, but with no clear or equal differences between levels
Mild vs moderate vs severe pain
We can say 5 is better than 4, etc., but we cannot say that a rating of 4 is two times larger than a rating of 2
How do we display qualitative data
Pie chart
Bar chart
Frequency table
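A frequency table can be sketched in a few lines of Python. The pain ratings below are made-up illustration data, and the name `frequency_table` is just a label chosen for this sketch.

```python
from collections import Counter

# Hypothetical pain ratings (a qualitative, ordinal variable).
pain = ["mild", "severe", "mild", "moderate", "mild", "moderate"]

counts = Counter(pain)  # category -> number of observations
total = len(pain)

# Frequency table: (count, proportion) for each category, in logical order.
frequency_table = {
    level: (counts[level], counts[level] / total)
    for level in ["mild", "moderate", "severe"]
}

for level, (n, share) in frequency_table.items():
    print(f"{level:<10}{n:>3}{share:>8.0%}")
```

The same counts are what a bar chart or pie chart would draw.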
Who created the first pie chart
William Playfair (1801)
Florence Nightingale later popularized a related chart, the polar area diagram
What is a numeric variable
Takes real-number values; depending on the nature of the variable, it can be expressed in decimals
Meaningful numeric scales
Age, blood pressure, number of friends, temperature
What is a continuous variable
Can take any value within a range
Blood pressure
Temperature
Can be plotted as a line
What is a discrete variable
Can take a finite or limited number of values
Age in years
Number of drinks
Can be plotted as dots
What is an interval variable
The interval (difference) between values is meaningful
No natural zero (a value of 0 does not mean the attribute is absent)
Arbitrary zero: a temperature of 0 in Celsius or Fahrenheit does not mean no temperature
What is a ratio variable
The ratio between values is meaningful
Zero means absence of the attribute (the zero is natural)
Temperature in kelvins: 0 means absolute zero
Blood pressure: 0 in a dead person
Age
Income
What are the three measures of central tendency
Mean, median, mode
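A small sketch of the three measures using Python's standard `statistics` module. The drink counts are made-up illustration data.

```python
import statistics

# Hypothetical number of drinks per week for nine people.
drinks = [0, 1, 1, 2, 2, 2, 3, 4, 12]

mean = statistics.mean(drinks)      # sum of values divided by how many there are
median = statistics.median(drinks)  # middle value of the sorted data
mode = statistics.mode(drinks)      # most frequent value

print(mean, median, mode)
```

Here the single large value (12) pulls the mean above the median and mode.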
What is the point of a histogram
Gives you an idea of the distribution of all the data
What is a symmetric normal distribution
Mean, median, and mode are the same
This can be blood pressure
Weight
Height
What is a negatively skewed distribution
Mode is on the high side
Mean is lower than the median and mode
Example: marks in modules and tutorials
Most people are at the high end of the distribution and the outliers are on the low end, so the tail points left
What is a positively skewed distribution
The mean is on the high side because a few people have very high values, as with salary, where the outliers pull the tail to the right but most people make a low or moderate income
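The direction of skew can be roughly checked by comparing the mean with the median. This is only a rule of thumb, not a formal skewness coefficient; the function name `skew_direction` and all three data sets are invented for illustration.

```python
import statistics

def skew_direction(values):
    """Rough rule of thumb: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"   # a few high values pull the mean up
    if mean < median:
        return "negative"   # a few low values pull the mean down
    return "symmetric"

# Hypothetical data sets (salaries in thousands, marks in percent, heights in cm).
salaries = [30, 32, 35, 38, 40, 45, 250]
marks = [20, 75, 80, 82, 85, 88, 90]
heights = [160, 165, 170, 170, 170, 175, 180]

print(skew_direction(salaries), skew_direction(marks), skew_direction(heights))
```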
What is the range
The difference between the minimum and maximum
What are the quartiles
Three values that split the sorted data into four equal parts
Q1 is the middle of the lower half
Q2 is the median
Q3 is the middle of the upper half
How do you calculate interquartile range
Q3-Q1
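A minimal sketch of the range, quartiles, and interquartile range, assuming the "median of each half" method described in these cards. Other conventions exist (for example, the interpolation used by `statistics.quantiles`) and can give slightly different values. The function name `quartiles` and the scores are illustrative.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of values the median itself is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # lower half of the sorted data
    upper = data[-half:]   # upper half of the sorted data
    return (statistics.median(lower),
            statistics.median(data),
            statistics.median(upper))

# Hypothetical scores.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 13]

data_range = max(scores) - min(scores)
q1, q2, q3 = quartiles(scores)
iqr = q3 - q1

print(data_range, q1, q2, q3, iqr)
```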
What are the main points in a boxplot
The first line on the box is Q1
The middle line in the box is the Median
The last line on the box is Q3
The left whisker extends no further than Q1 - 1.5 × IQR
The right whisker extends no further than Q3 + 1.5 × IQR
Outliers are the dots beyond the whiskers; they are not included in the box or whiskers
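The boxplot limits above can be sketched as follows, again assuming the median-of-halves quartiles from these cards. The function name `boxplot_summary` and the data are made up for illustration.

```python
import statistics

def boxplot_summary(values):
    """Compute the box, the 1.5 x IQR limits, and the outliers."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[-half:])
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    # Points outside the limits are drawn as separate dots.
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "q1": q1,
        "median": statistics.median(data),
        "q3": q3,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "outliers": outliers,
    }

# Hypothetical data with one unusually large value.
summary = boxplot_summary([2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 30])
print(summary)
```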
How do you calculate variance
Subtract the mean from each data point, square each difference, add the squares up, and divide by the number of data points (for a sample, divide by n - 1)
How do you calculate standard deviation
Take the square root of the variance
How do you calculate standard error
Divide the standard deviation by the square root of the total number of observations
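The three calculations can be worked through by hand in Python. The data values are made up; the sketch shows both the "divide by n" form and the "divide by n - 1" form usually used for samples, and it assumes the standard error is built from the sample standard deviation.

```python
import math

# Hypothetical measurements.
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

mean = sum(values) / n
squared_deviations = [(x - mean) ** 2 for x in values]

# Variance: average squared distance from the mean.
population_variance = sum(squared_deviations) / n
sample_variance = sum(squared_deviations) / (n - 1)

# Standard deviation: square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

# Standard error of the mean: standard deviation over the square root of n.
standard_error = sample_sd / math.sqrt(n)

print(population_variance, population_sd, round(standard_error, 3))
```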
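How is the normal curve split up
68% of values fall within 1 standard deviation of the mean
95% fall within 2 standard deviations
99.7% fall within 3 standard deviations
Each band next to the mean (0 to 1 standard deviation) holds 34.1%
The next band out (1 to 2) holds 13.6%
Then 2.1% (2 to 3)
Then 0.1% (beyond 3)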
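These percentages can be checked with `statistics.NormalDist` from the Python standard library. The helper names `share_within` and `share_between` are chosen for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def share_between(a, b):
    """Share of a normal distribution between a and b standard deviations."""
    return standard_normal.cdf(b) - standard_normal.cdf(a)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# One side of the curve, band by band.
bands = [share_between(0, 1), share_between(1, 2), share_between(2, 3)]
tail = 1 - standard_normal.cdf(3)
print([f"{b:.1%}" for b in bands], f"{tail:.1%}")
```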
How do you see how many observations fall within a range
Take the mean + 1 sigma
and the mean - 1 sigma
Then repeat with 2 sigma, etc., to see where a value falls on the graph
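A short sketch of the mean plus or minus sigma ranges. The mean of 120 and standard deviation of 10 are invented numbers (think of systolic blood pressure), and `sigma_range` and `z_score` are names picked for this example.

```python
# Hypothetical mean and standard deviation.
mean = 120.0
sd = 10.0

def sigma_range(k):
    """Interval from mean - k*sigma to mean + k*sigma."""
    return (mean - k * sd, mean + k * sd)

def z_score(x):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / sd

for k in (1, 2, 3):
    print(k, sigma_range(k))

print(z_score(135))  # a value between +1 and +2 sigma
```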
What does a confidence interval do
Provide information about the expected value of a measure in a source population based on the measured value in a study population
What does a 95% confidence interval do
Means that 5% of the time the confidence interval is expected to miss capturing the true value of a measure in the source population
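One common way to build a 95% confidence interval for a mean is the sample mean plus or minus about 1.96 standard errors. The sketch below assumes a large sample so the normal approximation is reasonable (small samples usually call for a t multiplier instead), and the summary numbers are made up.

```python
import math
from statistics import NormalDist

# Hypothetical study summary: mean, standard deviation, and sample size.
sample_mean = 120.0
sample_sd = 15.0
n = 100

standard_error = sample_sd / math.sqrt(n)

# Multiplier that leaves 2.5% in each tail of the normal curve (about 1.96).
z = NormalDist().inv_cdf(0.975)

margin = z * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```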
What is comparative statistics
You are comparing 2 groups
For example, comparing the main factors between exposed and unexposed groups in cohort studies
We cannot just look at the calculated values (these are estimates from samples, subject to random sampling error)
What is inferential statistics
Techniques that use statistics from a random sample of a population to make evidence-based assumptions (inference) about the values of parameters in the population as a whole
What is hypothesis testing
To test an explicit statement or a hypothesis about a population parameter
What is the null hypothesis
There is no difference between the two or more values being compared
What is the alternative hypothesis
There is a difference between the two or more populations being compared
What are the steps in hypothesis testing
- Take a random sample from the population of interest
- Set up two competing hypotheses (based on research questions)
- Use sample statistics (mean, frequency)
- Determine whether the sample provides enough evidence to reject the null hypothesis
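What is the p-value
The probability of seeing a result at least as extreme as the observed sample if the null hypothesis were true
Used to determine whether the observed sample is consistent with the null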
When do you reject the null hypothesis
If the p-value is less than the chosen significance level (e.g., 0.02, or usually 0.05)
When do you fail to reject the null hypothesis
When the p-value is larger than the significance level, e.g., 0.1 or 0.9
How is the p-value calculated
From the observed data, based on the pertinent test statistic
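As one illustration of going from a test statistic to a p-value, the sketch below uses a two-sample z-test for a difference in means. The cards do not name a specific test, so this choice is an assumption; it relies on large samples, and the group summaries and the name `two_sided_p_value` are invented.

```python
import math
from statistics import NormalDist

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical summaries: (mean, standard deviation, sample size).
exposed = (130.0, 15.0, 100)
unexposed = (125.0, 15.0, 100)

# Test statistic: observed difference divided by its standard error.
difference = exposed[0] - unexposed[0]
se_difference = math.sqrt(exposed[1] ** 2 / exposed[2]
                          + unexposed[1] ** 2 / unexposed[2])
z = difference / se_difference

p_value = two_sided_p_value(z)
reject_null = p_value < 0.05  # compare with the chosen significance level

print(round(z, 2), round(p_value, 3), reject_null)
```

With a cutoff of 0.05 the null hypothesis of no difference would be rejected for these made-up numbers.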
What is a parametric test
Assumes the variables being examined have particular distributions
Inferential methods are based on types of distributions
Often normally distributed
What is a nonparametric test
Does not make assumptions about the distributions of responses
Used for ranked variables and when the distribution of a ratio or interval variable is non-normal