Statistics Flashcards
What is descriptive statistics?
- Summarises and describes data
- Is concerned with measures of central tendency and measures of dispersion
What is a variable?
An attribute that has two or more divisions, characteristics or categories that can be measured or observed
What is a constant?
An attribute that does not change
Levels of measurement of variables
- Nominal - city of birth
- Ordinal - pain scale
- Interval - test scores
- Ratio - age, height, weight
What are the methods of summarising data?
- Tables
- Graphs
- Charts
What are the measures of central tendency?
- Mean - the arithmetic average
- Median - middle-ranking number
- Mode - the most frequently occurring number
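Example (a minimal sketch using Python's standard `statistics` module; the `scores` list is made-up data for illustration only):

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical sample

mean_value = statistics.mean(scores)      # sum of values divided by their count
median_value = statistics.median(scores)  # midpoint of the two middle values when the count is even
mode_value = statistics.mode(scores)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

With an even number of values there is no single middle value, so the median is taken halfway between the two middle ones.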
What are measures of dispersion?
- Percentiles
- Range
- Variance/standard deviation
What is range?
The difference between the largest and smallest value.
What are percentiles?
Percentiles are numbers that divide a distribution or area of a histogram into 100 parts of equal area
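Example (a sketch assuming Python 3.8 or later, where `statistics.quantiles` is available; the data are made up, and other interpolation methods can give slightly different cut points):

```python
import statistics

data = list(range(1, 100))  # hypothetical observations 1, 2, ..., 99

# n=100 asks for the 99 cut points that split the data into 100 equal parts.
cut_points = statistics.quantiles(data, n=100)

percentile_25 = cut_points[24]  # 25th percentile
percentile_50 = cut_points[49]  # 50th percentile, which is the median
percentile_75 = cut_points[74]  # 75th percentile

print(percentile_25, percentile_50, percentile_75)
```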
What is variance?
The variance is the average of the squared deviations of the observations from their mean.
What is standard deviation?
The SD is the square root of the variance.
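Example (a minimal sketch with made-up data). The definition above divides by the number of observations, which corresponds to `statistics.pvariance`; `statistics.variance` divides by n - 1 instead and is the usual choice when the data are a sample from a larger population.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

value_range = max(data) - min(data)               # largest minus smallest value
population_variance = statistics.pvariance(data)  # mean of the squared deviations (divide by n)
population_sd = statistics.pstdev(data)           # square root of the population variance
sample_variance = statistics.variance(data)       # divides by n - 1 instead of n

print(value_range, population_variance, population_sd, sample_variance)
```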
What is a normal distribution?
- Symmetrical, unimodal, bell-shaped distribution of values
- The peak occurs at the mean value
- The median, mode and mean all coincide at the same point
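- Defined by its mean and standard deviation; in the standard normal distribution the mean is 0 and the standard deviation is 1
- The normal distribution is a good descriptor of many kinds of real data
- It is a good approximation of results that occur by chance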
- Many statistical procedures are based on normal distributions
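Example (a sketch assuming Python 3.8 or later for `statistics.NormalDist`; the mean of 170 and standard deviation of 10 are invented values). It converts a value to a z-score so that it can be read against the standard normal distribution.

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=10)  # hypothetical normal distribution
standard = NormalDist()                 # standard normal: mean 0, standard deviation 1

x = 185
z = (x - heights.mean) / heights.stdev  # number of standard deviations above the mean

# The cumulative probability is the same on either scale.
p_original_scale = heights.cdf(x)
p_standard_scale = standard.cdf(z)

# Proportion of values within one standard deviation of the mean (about 68%).
within_one_sd = standard.cdf(1) - standard.cdf(-1)

print(z, p_original_scale, within_one_sd)
```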
What are the different shapes of frequency distributions?
- Bell-shaped distribution of values
- Asymmetric distribution of values (skewed to the left or right)
- Distributions that differ in kurtosis (heavier or lighter tails than a normal distribution)
What is kurtosis?
Kurtosis is a statistical measure that defines how heavily the tails of a distribution differ from the tails of a normal distribution. In other words, kurtosis identifies whether the tails of a given distribution contain extreme values.
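Example (a minimal sketch; `excess_kurtosis` is a hypothetical helper written for this card, using the simple moment-based formula with no small-sample correction). A normal distribution has an excess kurtosis of 0, so positive values suggest heavier tails and negative values lighter tails.

```python
def excess_kurtosis(values):
    """Fourth central moment divided by the squared variance, minus 3."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    fourth_moment = sum((v - mean) ** 4 for v in values) / n
    return fourth_moment / second_moment ** 2 - 3

evenly_spread = [1, 2, 3, 4, 5]                     # no extreme values
with_extremes = [-10, 0, 0, 0, 0, 0, 0, 0, 10]      # two extreme values in the tails

print(excess_kurtosis(evenly_spread))  # negative: lighter tails
print(excess_kurtosis(with_extremes))  # positive: heavier tails
```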
What is probability?
A value defined to be between 0 and 1.
Measures ‘how likely’ it is that an event occurs.
What is a binomial distribution?
Counts the number of ‘successes’ in a series of trials.
- Only two possible outcomes: “success” and “failure”
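Example (a minimal sketch; `binomial_pmf` is a hypothetical helper, and it assumes a fixed number of independent trials that all share the same probability of success):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of exactly 2 heads in 4 tosses of a fair coin.
p_two_heads = binomial_pmf(2, 4, 0.5)

# The probabilities over every possible number of successes add up to 1.
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))

print(p_two_heads, total)
```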
What is a Poisson distribution?
Counts the number of events occurring in a fixed time period
- events occur at a constant average rate
- events occur independently of the time since the last event
- approximates the binomial distribution when N is large and 𝜋 is small
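Example (a sketch with hypothetical helper functions and made-up values N = 1000 and 𝜋 = 0.003). It compares binomial probabilities with Poisson probabilities whose average rate is N × 𝜋.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """Probability of exactly k events when events occur at the given average rate."""
    return rate ** k * exp(-rate) / factorial(k)

n, p = 1000, 0.003  # large number of trials, small probability of success
rate = n * p        # matching average number of events

# Largest difference between the two distributions for 0 to 10 events.
largest_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, rate)) for k in range(11)
)

print(largest_gap)
```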
What is a continuous random variable?
A continuous random variable is a random variable that can take any value within a range, so it has infinitely many possible values.
Continuous data is described by a probability density function, a smooth curve: the area under the curve between two points on the horizontal axis gives the probability of an observation falling between those points.
The probabilities are associated with intervals rather than single points.
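Example (a sketch assuming Python 3.8 or later for `statistics.NormalDist`; `area_under_pdf` is a hypothetical helper that uses the trapezoidal rule). It approximates the area under a density curve between two points and shows that a single point has no area.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist's density curve between a and b."""
    width = (b - a) / steps
    heights = [dist.pdf(a + i * width) for i in range(steps + 1)]
    # Trapezoidal rule: the two end heights count half as much as the others.
    return width * (sum(heights) - (heights[0] + heights[-1]) / 2)

standard = NormalDist()

p_interval = area_under_pdf(standard, 0.0, 1.0)  # probability of a value between 0 and 1
p_point = area_under_pdf(standard, 0.5, 0.5)     # an interval of zero width has zero area

print(p_interval, p_point)
```

The value of `standard.pdf(0.5)` is the height of the curve at that point, not a probability.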
Steps in hypothesis testing
- Identify the research question
- Specify the null (Ho) and alternative (Ha) hypotheses
- Select the appropriate test statistic
- Collect data
- Perform required calculations
- Evaluate findings and report
- Develop appropriate interpretations of the conclusions
What is the null hypothesis?
The null hypothesis states that there is no difference between groups or between treatments, or that one factor does not depend on the other.
We usually look for evidence of the opposite
What is the alternative hypothesis?
The hypothesis that there is a difference or a dependence. This is what we look for evidence to support
What are the consequences of the null hypothesis and alternative hypothesis?
If we can determine that the results of an experiment are unlikely to have occurred by sampling error, we are inclined to reject the null hypothesis.
If the results are likely to have occurred by sampling error, we are inclined not to reject the null hypothesis
Directional vs non-directional hypothesis
- Non-directional hypothesis - only looks for a difference
- Directional hypothesis - looks at the direction of the difference
What is a p-value?
The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic as extreme as or more extreme than the one determined by the sample data.
The decision about whether there is enough evidence to reject the null hypothesis is made by comparing the p-value to the value of α (the level of significance of the test).
Common values for α are 0.05 or 0.01
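Example (a sketch of a z-test with a known population standard deviation, chosen here only because it needs nothing beyond the standard library; `z_test_p_values` and all the numbers are hypothetical). It returns the p-value for a directional (one-sided) and a non-directional (two-sided) hypothesis.

```python
from statistics import NormalDist

def z_test_p_values(sample_mean, null_mean, sigma, n):
    """Return the z statistic with its one-sided and two-sided p-values."""
    z = (sample_mean - null_mean) / (sigma / n ** 0.5)
    standard = NormalDist()
    one_sided = 1 - standard.cdf(z)             # directional: mean is greater than null_mean
    two_sided = 2 * (1 - standard.cdf(abs(z)))  # non-directional: mean differs from null_mean
    return z, one_sided, two_sided

z, p_one_sided, p_two_sided = z_test_p_values(
    sample_mean=103, null_mean=100, sigma=15, n=100
)

alpha = 0.05                       # level of significance
reject_null = p_two_sided < alpha  # reject when the p-value is below the significance level

print(z, p_one_sided, p_two_sided, reject_null)
```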
Type I errors
Incorrectly rejecting a true null hypothesis.
- False positive
- The level of significance (α) is the probability of making a Type I error (usually set at 0.05)
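Example (a simulation sketch with invented settings; the seed, sample size and number of experiments are arbitrary choices). Every simulated experiment draws data for which the null hypothesis is true, so each rejection is a false positive, and the rejection rate should land close to the significance level.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is repeatable

SIGNIFICANCE_LEVEL = 0.05
SIGMA = 1.0        # assumed known population standard deviation
SAMPLE_SIZE = 30
EXPERIMENTS = 2000

standard = NormalDist()
false_positives = 0
for _ in range(EXPERIMENTS):
    # The null hypothesis (population mean = 0) is true by construction.
    sample = [random.gauss(0, SIGMA) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    z = sample_mean / (SIGMA / SAMPLE_SIZE ** 0.5)
    p_value = 2 * (1 - standard.cdf(abs(z)))  # two-sided p-value
    if p_value < SIGNIFICANCE_LEVEL:
        false_positives += 1

false_positive_rate = false_positives / EXPERIMENTS
print(false_positive_rate)
```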
Type II errors
Failing to reject the null hypothesis when it should be rejected
- False negative
- β (beta), the probability of a Type II error, is usually set at 0.2 or 0.1
What is power?
The probability that the null hypothesis is rejected when the null hypothesis is false. It equals 1 minus the probability of a Type II error.
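Example (a sketch for a one-sided z-test with a known population standard deviation; `one_sided_z_test_power` and the effect size, standard deviation and sample size are hypothetical). Power is computed as the chance that the test statistic exceeds the critical value when the true mean is shifted by `effect`.

```python
from statistics import NormalDist

def one_sided_z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test when the true mean exceeds the null mean by effect."""
    standard = NormalDist()
    critical_z = standard.inv_cdf(1 - alpha)  # cut-off for rejecting the null hypothesis
    shift = effect / (sigma / n ** 0.5)       # true effect measured in standard errors
    return 1 - standard.cdf(critical_z - shift)

power = one_sided_z_test_power(effect=5, sigma=15, n=56)
type_ii_error_rate = 1 - power  # probability of a Type II error

print(power, type_ii_error_rate)
```

Under these assumptions a larger sample or a larger effect raises the power, and with no effect at all the rejection probability falls back to the significance level.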