final exam Flashcards
The average value of a data set (the sum of all values in the dataset divided by the number of values)
Mean
The mean, while excellent at identifying the middle of a dataset, can be skewed by _______. This can give a false impression of the dataset.
outlier data points
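A minimal sketch of how one outlier can pull the mean, using Python's standard `statistics` module. The numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical values, made up for illustration only.
values = [4, 5, 5, 6, 7]
with_outlier = values + [60]   # same data plus one extreme point

mean_before = mean(values)        # 27 / 5
mean_after = mean(with_outlier)   # 87 / 6

print(f"mean without outlier: {mean_before}")
print(f"mean with outlier:    {mean_after}")
```

A single extreme value moves the mean well above every other point in the set, which is the false impression the card describes.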
the middle value in the data set.
Median
- In odd-numbered datasets, this will always be a value in the data set.
- In even-numbered datasets, the median is the average of the two data points closest to the middle.
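The odd and even cases above can be sketched with `statistics.median`. The data are hypothetical.

```python
from statistics import median

# Hypothetical data, for illustration only.
odd_data = [3, 9, 1, 7, 5]        # sorted: 1, 3, 5, 7, 9
even_data = [3, 9, 1, 7, 5, 11]   # sorted: 1, 3, 5, 7, 9, 11

odd_median = median(odd_data)     # the single middle value
even_median = median(even_data)   # average of the two middle values, 5 and 7

print(odd_median, odd_median in odd_data)
print(even_median, even_median in even_data)
```

In the even case the median need not be one of the original values.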
the most commonly appearing value in your dataset (i.e. it appears more often than other data points)
Mode
Bimodal is the term used to describe what in statistics?
When there are two modal values in a data set
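A short sketch of finding one mode or two, assuming Python 3.8+ for `statistics.multimode`. The data are hypothetical.

```python
from statistics import mode, multimode

# Hypothetical data, for illustration only.
one_mode = [2, 3, 3, 4, 5]
two_modes = [1, 2, 2, 3, 4, 4, 5]

single = mode(one_mode)         # the one most common value
modes = multimode(two_modes)    # every value tied for most common
is_bimodal = len(modes) == 2

print(single, modes, is_bimodal)
```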
The broadness of the distribution; it is defined by the upper and lower values of a dataset
Range
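As a quick sketch, the range can be computed as the largest value minus the smallest. The helper name `value_range` and the data are hypothetical.

```python
def value_range(data):
    """Return the distance between the largest and smallest values."""
    return max(data) - min(data)

# Hypothetical data, for illustration only.
narrow = [48, 50, 51, 52]
broad = [10, 50, 51, 95]

narrow_range = value_range(narrow)
broad_range = value_range(broad)

print(narrow_range, broad_range)
```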
A narrower range of data will produce a _____ curve, while a broader range of data will produce a _____ curve
Peaked, flattened
The tendency of a curve to “lean” one direction or another because of the distribution of the data.
Skew
A _____ (+/-) skew has a peak that leans to the left and has a long tail to the right.
positive
A _____ (+/-) skew has a peak that leans to the right and has a long tail to the left.
negative
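A rough sketch of how the sign of skew follows the long tail. It uses the moment-based skewness coefficient, which is one common definition and an assumption here, since the cards do not name a formula. The function name and data are hypothetical.

```python
from statistics import mean

def skewness(data):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n   # population variance
    m3 = sum((x - m) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data, for illustration only.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]   # long tail to the right
left_tailed = [-x for x in right_tailed]    # mirror image, tail to the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))   # positive
print(skewness(left_tailed))    # negative
print(skewness(symmetric))      # zero
```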
The “peakedness” of a curve
Kurtosis
The more “peaked” the curve describing the data, the tighter the range and closeness of the values. This is called ______
Positive kurtosis
The flatter the curve, the broader the range of data points. This is called _____
Negative kurtosis
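A sketch of positive versus negative kurtosis, assuming the cards mean excess kurtosis (kurtosis minus 3, so a normal curve scores 0). That convention, the function name, and the data are assumptions for illustration.

```python
from statistics import mean

def excess_kurtosis(data):
    """Fourth central moment over variance squared, minus 3."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n   # population variance
    m4 = sum((x - m) ** 4 for x in data) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data, for illustration only.
peaked = [3, 5, 5, 5, 5, 5, 5, 7]      # most values piled at the center
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # values spread evenly

print(excess_kurtosis(peaked))   # positive
print(excess_kurtosis(flat))     # negative
```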
The measure of dispersion of data in a dataset (how spread out it is from the mean)
Standard deviation
If the data points are close to the mean, the standard deviation will be small. If they are spread out away from the mean, the standard deviation will be larger.
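A small sketch comparing two hypothetical datasets with the same mean but different spread, using `statistics.stdev` (the sample standard deviation).

```python
from statistics import mean, stdev

# Hypothetical data, for illustration only. Both sets have a mean of 10.
tight = [9, 10, 10, 10, 11]
spread = [2, 6, 10, 14, 18]

tight_sd = stdev(tight)
spread_sd = stdev(spread)

print(mean(tight), round(tight_sd, 3))
print(mean(spread), round(spread_sd, 3))
```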
What is the benefit of using standard deviation when making inferences from data?
It is less sensitive to the effects of outliers and other anomalies of distribution and combines the qualities of mean, median and mode.
Standard deviation serves as a marker of what range of values we can expect to fall within ____% of the area under the curve.
95
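A sketch of where the 95% figure comes from, assuming the card refers to a normal curve and a band of roughly two standard deviations (1.96, more precisely) on either side of the mean. The mean of 100 and standard deviation of 15 are hypothetical.

```python
from statistics import NormalDist

def share_within(dist, k):
    """Fraction of a normal curve lying within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical normal distribution, for illustration only.
dist = NormalDist(mu=100, sigma=15)

for k in (1, 1.96, 2):
    print(k, round(share_within(dist, k), 4))
```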
_____ is used as the best estimator of a dataset’s fitness to be compared to other datasets
Standard deviation
Most statistics in RCTs rely on data being normally distributed. The standard deviation is part of how we measure treatment effects against each other.
Can non-normal distributions of data be characterized using standard deviations?
Yes, but a statistical “correction” must be done to create a “normalization” of the data
Research results are measured in outcomes. What are outcomes?
Outcomes are numerical measures of results that can be calculated, compared and assessed for their “truthfulness”
From outcomes, we want to know THREE THINGS:
- statistical significance
- precision
- clinical significance
What does it mean when outcome results have statistical significance?
The range of results and the point estimate of their average are “true” and not due to chance
What does it mean when outcome results have precision?
the range of results is “tight” around the point estimate of their average, and not spread across a wide range
What does it mean when outcome results are clinically significant?
the results matter clinically
_______ are used to establish whether a point estimate of an outcome is likely to be due to chance or not.
P-values
When considering P-values, we generally want to see a result that has less than a ____% chance of being due to random chance or error. This would be reported as a P-value of _____.
5%, p = 0.05
P-values that exceed 5% likelihood of being due to error are not generally accepted as being “statistically significant”
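A minimal sketch of reading a P-value the way these cards do: as a percent chance, checked against the 5% cutoff. The helper name `describe_p` is hypothetical, and treating a value of exactly 0.05 as not significant is an assumption, since the cards do not say which side the boundary falls on.

```python
ALPHA = 0.05  # the 5% cutoff described in the cards

def describe_p(p, alpha=ALPHA):
    """Return the P-value as a percent and whether it falls below the cutoff."""
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    return {
        "percent_chance": round(p * 100, 6),
        "significant": p < alpha,   # assumption: strictly below the cutoff
    }

# Hypothetical P-values, for illustration only.
low = describe_p(0.03)
high = describe_p(0.20)

print(low)
print(high)
```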
P = 0.005 is a way of stating that the outcome has a _____% probability it was due to chance (a _____% chance the results were NOT due to random error or chance)
0.5% : 99.5%
In health care, we generally use p = _____ as our standard maximum p-value (95% certainty that the outcome is not due to random chance or error)
0.05
The number of subjects in a study (N) can affect P-values. How will a high N affect them?
The higher the N, the less likely that an outlier will affect the results and the lower the probability that the outcome is due to chance (i.e., a lower P-value).
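A rough sketch of the effect of N, using a two-sided one-sample z-test as a stand-in (the cards do not name a test, so this choice is an assumption). It holds the observed difference and standard deviation fixed and changes only N. All numbers and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n):
    """Two-sided P-value for a one-sample z-test with known SD."""
    standard_error = sd / sqrt(n)        # shrinks as n grows
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical observed difference and SD, for illustration only.
MEAN_DIFF = 2
SD = 10

p_by_n = {n: two_sided_p(MEAN_DIFF, SD, n) for n in (25, 100, 400)}

for n, p in p_by_n.items():
    print(n, round(p, 5))
```

With the same observed difference and spread, a larger N gives a smaller standard error and therefore a lower P-value.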