Week 8 - Introduction To Probability And Statistics Flashcards
What is probability?
The quality or fact of being probable
A strong likelihood or chance of something
Expressed mathematically as a value between 0 and 1.
What is subjective probability?
Personal belief in the chance of something occurring
Informal probability that has no mathematical formula
What is objective probability?
Probability derived from mathematical principles
Frequentist probabilities are the most common examples (how often an event occurs in the long run)
What is conditional probability?
Probability of an event occurring given that some other condition is true
Probability of flipping a FAIR coin and getting heads is .5.
Probability of rolling SNAKE EYES with a pair of FAIR dice is .028.
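The snake-eyes figure and the idea of conditional probability can be checked by listing every outcome for two fair dice. This is a minimal sketch; the variable names are illustrative and not from the course material.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for a pair of fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Snake eyes: both dice show 1.
snake_eyes = [(a, b) for a, b in outcomes if a == 1 and b == 1]
p_snake_eyes = Fraction(len(snake_eyes), len(outcomes))

# Conditional probability: P(snake eyes | first die shows 1).
# Only outcomes that satisfy the condition are counted in the denominator.
first_is_one = [(a, b) for a, b in outcomes if a == 1]
p_snake_given_first_one = Fraction(
    sum(1 for a, b in first_is_one if b == 1), len(first_is_one)
)

print(p_snake_eyes, round(float(p_snake_eyes), 3))  # 1/36 0.028
print(p_snake_given_first_one)                      # 1/6
```

Knowing the first die is already a 1 raises the chance of snake eyes from 1/36 to 1/6.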
Measuring and summarising data
In order to describe and analyse data we need to summarise it
- What location do the data cluster around? (central tendency)
- How spread out or distributed are the data around the central location? (variability)
What is categorical/nominal data in statistics?
An arbitrary label (homeowner, non-smoker, male)
Labels can be represented with a name or a number
Nominal: vanilla, strawberry
Numerical: 1, 3 ,
What’s an ordinal scale?
Inherent order (ranks), discrete, some info about quantity. Steps may not be equal.
What’s an interval scale?
Order + equal intervals
Continuous
Mathematical operations (addition, subtraction) are meaningful
Does not have a TRUE zero: 0 on the scale doesn’t mean the absence of the thing (0 degrees is not an absence of heat)
E.g. 24-hour time
What’s a ratio scale?
A scale of measurement. Has order, equal intervals and a true zero.
Physical quantities are ratio scale (mass, length, time etc)
What’s a discrete variable?
Specific values, typically whole numbers.
Categorical variables are discrete (male, female)
What’s a continuous variable?
Unlimited resolution between minimum and maximum
Continuous variables can be converted to discrete variables (but not vice versa). E.g. 0 ml = none, 750 ml = some, 3750 ml = too much
A construct can be continuous, but the method of quantifying it may be discrete.
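Converting a continuous measurement into discrete categories might look like the sketch below. The function name and the 1500 ml cut-off are hypothetical, chosen only so that the three example volumes fall into the three labels.

```python
def categorise_volume(volume_ml: float, too_much_from_ml: float = 1500.0) -> str:
    """Map a continuous volume in ml onto three ordered categories.

    The 1500 ml cut-off is a hypothetical value for illustration only.
    """
    if volume_ml <= 0:
        return "none"
    if volume_ml < too_much_from_ml:
        return "some"
    return "too much"


labels = [categorise_volume(v) for v in (0, 750, 3750)]
print(labels)  # ['none', 'some', 'too much']

# Information is lost: 700 ml and 800 ml both become "some",
# so the original volume cannot be recovered from the label.
print(categorise_volume(700) == categorise_volume(800))  # True
```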
What are descriptive statistics?
Can describe or summarise a set of data (scores)
Primary uses of descriptive statistics
- to describe a central tendency
- to describe variability (spread)
What is central tendency?
Observations cluster around a fixed point.
What does it mean when data are bimodal or multimodal?
Bimodal: there are two modes
Multimodal: more than two modal values
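A quick way to see one, two, or more modes in a small set of scores is `statistics.multimode` from the Python standard library (Python 3.8+). The score lists are made up for illustration.

```python
from statistics import multimode

unimodal = [1, 2, 2, 3]
bimodal = [1, 1, 2, 3, 3]
multimodal = [1, 1, 2, 2, 3, 3, 4]

# multimode returns every value tied for the highest count,
# in the order each value first appears in the data.
print(multimode(unimodal))    # [2]
print(multimode(bimodal))     # [1, 3]
print(multimode(multimodal))  # [1, 2, 3]
```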
Is the median robust to outliers?
Yes, because it’s the middle value of the ordered scores, so extreme values have little or no effect on it.
If there’s an even number of scores it’s the average of the two middle scores.
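A small sketch comparing the mean and the median when one score is replaced by an outlier. The scores are made up for illustration.

```python
from statistics import mean, median

scores = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]  # last score replaced by an extreme value

# The mean is pulled towards the outlier; the median stays put.
mean_before, mean_after = mean(scores), mean(with_outlier)
median_before, median_after = median(scores), median(with_outlier)
print(mean_before, mean_after)
print(median_before, median_after)

# Even number of scores: the median is the average of the two middle scores.
even_median = median([2, 3, 4, 5])
print(even_median)  # 3.5
```

Here the mean moves from 4 to 14.8 while the median is 4 in both cases.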
What is a symmetrical distribution?
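A distribution whose left and right sides mirror each other; the normal distribution is an example. If the longer tail is on the left-hand side the distribution is negatively skewed; if it is on the right-hand side it is positively skewed.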
How do you determine the range?
Maximum - minimum
Depends entirely on two extreme scores.
If the max or min is an outlier, the range overestimates the variability in the data.
What’s the inter-quartile range?
The range of the middle 50% of scores: the distance between the 25th and 75th percentiles
Improves on the range by reducing the influence of outliers
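The range and the inter-quartile range can be compared on data with and without an outlier. This sketch uses `statistics.quantiles` with its default settings; other software may use a slightly different quartile convention and give slightly different values. The data are made up for illustration.

```python
from statistics import quantiles


def value_range(scores):
    """Maximum minus minimum."""
    return max(scores) - min(scores)


def interquartile_range(scores):
    """75th percentile minus 25th percentile (default quantiles method)."""
    q1, _q2, q3 = quantiles(scores, n=4)
    return q3 - q1


scores = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 70]

print(value_range(scores), value_range(with_outlier))                  # 6 69
print(interquartile_range(scores), interquartile_range(with_outlier))  # 4.0 4.0
```

The single outlier inflates the range from 6 to 69, while the inter-quartile range is unchanged.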
What’s the standard deviation?
Quantifies spread of scores around the mean
Calculate how far each score is from the mean
Square each deviation score
Sum all squared deviation scores
Divide the sum by the number of scores minus 1 (n - 1)
Calculate the square root of the result.
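The steps above can be sketched in code, reading "minus 1" as dividing the sum of squared deviations by n - 1 (the usual sample formula). The scores are made up for illustration, and the result is checked against `statistics.stdev`.

```python
from math import isclose, sqrt
from statistics import stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

# Step 1: how far each score is from the mean.
mean_score = sum(scores) / n
deviations = [x - mean_score for x in scores]

# Step 2: square each deviation score.
squared = [d ** 2 for d in deviations]

# Step 3: sum the squared deviations and divide by n - 1.
variance = sum(squared) / (n - 1)

# Step 4: take the square root.
sd = sqrt(variance)

print(round(sd, 3))
print(isclose(sd, stdev(scores)))  # True: matches the library's sample SD
```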
We can calculate a confidence interval to show what values are plausible in the population, based on the sample used.
The most common is the 95% confidence interval: we are 95% confident that the true population value lies within the interval.
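A minimal sketch of a 95% confidence interval for a mean, assuming a normal approximation (mean plus or minus about 1.96 standard errors). For small samples a t-based critical value is normally used instead, which gives a somewhat wider interval. The function name and scores are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(scores, confidence=0.95):
    """Normal-approximation confidence interval for the mean (an assumption;
    a t distribution is more appropriate for small samples)."""
    n = len(scores)
    sample_mean = mean(scores)
    standard_error = stdev(scores) / sqrt(n)
    # Critical value: about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


scores = [4, 5, 5, 6, 7, 8, 8, 9]  # made-up sample
lower, upper = mean_confidence_interval(scores)
print(f"M = {mean(scores)}, 95% CI [{lower:.2f}, {upper:.2f}]")
```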
Report standard deviation and confidence interval with the mean
Example 1: make the sample mean the subject
Example 2: make the confidence interval the subject
What is an example of effect size?
Cohen’s d is the most common effect size when the difference between two means is of interest
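Cohen’s d is commonly computed as the difference between two means divided by a pooled standard deviation. The sketch below assumes that pooled-SD version; other variants exist. The function name and the two groups are illustrative.

```python
from math import isclose, sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Difference between means in units of the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_variance = (
        (n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_variance)


group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [x + 1 for x in group_a]  # same spread, mean higher by 1

d = cohens_d(group_b, group_a)
print(round(d, 2))
```

Because both groups have the same spread, d here is simply the 1-point mean difference divided by the groups' standard deviation.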