Descriptive Statistics Flashcards
What is a histogram?
A diagram of rectangles whose area is proportional to the frequency of a variable. The x-axis is divided into bins and the y-axis shows the frequency (number of observations) in each bin.
What are descriptive statistics?
A summary that describes the main features of a data set, either numerically or visually.
What are inferential statistics?
Using your data to infer something about the general population
Making predictions about a broader set of people, situations or events
What is the R command for histograms?
hist()
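A minimal sketch of hist() in use. The scores vector and the bin edges are made up for illustration; only hist() itself comes from the card above.

```r
# Hypothetical exam scores, invented for this example
scores <- c(52, 55, 61, 64, 64, 68, 70, 71, 73, 75, 78, 82, 85, 91)

# Draw the histogram; breaks sets the bin edges along the x-axis
h <- hist(scores,
          breaks = seq(50, 100, by = 10),
          main = "Exam scores",
          xlab = "Score",
          ylab = "Frequency")

# hist() also returns the number of observations that fell in each bin
h$counts
```

Leaving out breaks lets R choose the bins itself, which is usually fine for a first look at the data.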
What is the mean?
The arithmetic average of the values in the data set.
How do you calculate the mean?
Add up each of the data points and divide by the number of data points total (N)
What is the median?
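The middle observation when the data are sorted in order (with an even number of observations, the average of the two middle values).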
What is the mode?
The most frequently observed value in the data
What are the measures of central tendency?
Mean
Median
Mode
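The three measures can be sketched in R as follows. The data vector x is invented for illustration. Base R has mean() and median() but no built-in function for the statistical mode (mode() reports an object's storage type), so the mode here is read off a table of counts, which is one possible approach.

```r
# Hypothetical data, invented for this example
x <- c(2, 3, 3, 5, 7, 10)

# Mean: add up the data points and divide by how many there are
mean_x <- sum(x) / length(x)   # same result as mean(x)

# Median: the middle of the sorted data (here the average of 3 and 5)
median_x <- median(x)

# Mode: count each distinct value, then keep the value(s) seen most often
counts <- table(x)
mode_x <- as.numeric(names(counts)[counts == max(counts)])

mean_x     # 5
median_x   # 4
mode_x     # 3
```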
What are the measures of spread?
Range
Interquartile Range
Standard Deviation
What is the range?
The full range of values in the data set.
(0%-100%)
How do you calculate the range?
Maximum - Minimum
What is the interquartile range?
The middle half of the data set
(25%-75%)
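A short sketch of both measures in R, using an invented data vector x. Note that R's range() returns the minimum and maximum rather than their difference, and that IQR() depends on how the quartiles are estimated, so other software may give slightly different values on small data sets.

```r
# Hypothetical data, invented for this example
x <- c(2, 3, 3, 5, 7, 10)

# Range: maximum minus minimum
range_x <- max(x) - min(x)   # 8

# range() gives c(min, max); diff() turns that into a single number
diff(range(x))

# Interquartile range: 75th percentile minus 25th percentile
iqr_x <- IQR(x)

# The two quartiles that IQR() subtracts
quantile(x, c(0.25, 0.75))
```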
What is the standard deviation?
A measure of how far, on average, the data points lie from the mean.
If your data is normal, what percentage of the population is within 1 standard deviation of the mean?
68%
If your data is normal, what percentage of the population is within 2 standard deviations of the mean?
95%
If your data is normal, what percentage of the population is within 3 standard deviations of the mean?
99.7%
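The commonly quoted percentages are rounded. A quick way to check them is pnorm(), the cumulative distribution function of the standard normal distribution in R:

```r
# Number of standard deviations either side of the mean
k <- 1:3

# Proportion of a normal population within k standard deviations of the mean
within_k_sd <- pnorm(k) - pnorm(-k)

round(100 * within_k_sd, 1)   # 68.3 95.4 99.7
```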
What is the R command for calculating standard deviations?
sd()
The symbol s refers to?
The standard deviation of a sample
The letter σ represents?
The standard deviation of a population.
What is the formula for standard deviations?
The square root of the variance.
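A small check of this relationship in R, using an invented data vector x. Both sd() and var() treat the data as a sample, so they divide by N-1.

```r
# Hypothetical data, invented for this example
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

sd_x <- sd(x)                # sample standard deviation
sqrt_var_x <- sqrt(var(x))   # square root of the sample variance

# The two agree (up to floating-point tolerance)
all.equal(sd_x, sqrt_var_x)
```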
What is variance?
The average of the squared differences from the Mean.
What are the four steps for calculating the variance?
- Work out the Mean (average)
- For each number: subtract the Mean and square the result (the squared difference - to avoid negatives and positives cancelling each other out)
- Add up those squared differences
- Divide the sum of the squared differences by N for population data or N-1 for sample data
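The calculation can be sketched by hand in R. The data vector x and the variable names are invented for illustration; var() is R's built-in sample variance and is used only as a check.

```r
# Hypothetical data, invented for this example
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Work out the mean
m <- mean(x)                  # 5

# Subtract the mean from each number and square the result
sq_diff <- (x - m)^2

# Add up the squared differences
ss <- sum(sq_diff)            # 32

# Divide by N for population data, or by N-1 for sample data
var_population <- ss / n      # 4
var_sample <- ss / (n - 1)

# R's var() uses the N-1 version
all.equal(var_sample, var(x))
```

Taking sqrt() of either result gives the matching standard deviation.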
What is the difference between calculating variance when using population data or sample data?
Population data = divide by N when calculating Variance
Sample data = divide by N-1 when calculating variance
Why can SDs help determine clinical significance?
They tell us how far a value lies from the mean, which indicates whether it is normal or abnormal.
What descriptive statistics should you use to describe nominal scale variables?
Frequency Tables
What are frequency tables?
A table showing the number of times each value (category) is observed.
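In R, one way to build a frequency table is table(). The eye_colour vector is invented for illustration.

```r
# Hypothetical nominal data, invented for this example
eye_colour <- c("brown", "green", "brown", "blue", "brown", "green")

# Count how many times each category appears
freq <- table(eye_colour)
freq

# The same counts as proportions of the total
prop.table(freq)
```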
When do you use ‘cross tabulation’?
When you want to look at nominal variable relationships.
(e.g. males with brown eyes, females with green eyes etc.)
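A cross tabulation can be sketched in R by passing two nominal variables to table(). Both vectors are invented for illustration.

```r
# Hypothetical nominal data for six people, invented for this example
sex <- c("male", "female", "male", "female", "male", "female")
eye_colour <- c("brown", "green", "brown", "blue", "green", "green")

# Rows are sex, columns are eye colour, cells are counts of each combination
xtab <- table(sex, eye_colour)
xtab

# Look up one cell, e.g. males with brown eyes
xtab["male", "brown"]
```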