Statistics Flashcards
Descriptive statistics
Methods for organising, summarising, and presenting data in an informative way - Graphs, tables, and numbers
Inferential statistics
Methods for drawing conclusions about a population from a sample
Qualitative (Categorical)
Nominal - Categories that cannot be ordered (eg male, female)
Ordinal - Categories that can be ordered, but the numerical difference between groups cannot always be determined (eg, low-income, middle-income, high-income)
Quantitative (Numerical)
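Discrete - Countable values, usually whole numbers (eg number of children)
Continuous - Measured values that can fall anywhere within a range. Interval data has no true zero (eg temperature in °C); ratio data has a true zero (eg height)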
Raw data
Collected data that has not been organised numerically or grouped
Frequency
How many times a value/category appears in the data. Can be expressed as a count of individuals (absolute frequency) or as a fraction/percentage of the total (relative frequency)
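As a rough illustration of the frequency card, the sketch below counts absolute and relative frequencies in Python. The income-band responses are made up for the example and do not come from any real data set.

```python
from collections import Counter

# Hypothetical responses for a categorical variable (made up for illustration).
responses = ["low", "high", "middle", "low", "middle", "low"]

# Absolute frequency: how many times each category appears.
frequency = Counter(responses)

# Relative frequency: each count as a fraction of the total.
total = len(responses)
relative_frequency = {
    category: count / total for category, count in frequency.items()
}

for category, count in frequency.items():
    print(f"{category}: {count} ({relative_frequency[category]:.0%})")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no observation was missed.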
Quartiles
Values that divide the ordered data into four equal parts (quarters)
1st quartile (Q1) - 25% of all data points are equal to or lower than Q1 and 75% are equal to or higher
Percentiles
Values that divide the ordered data into 100 equal parts
Median quartile
Second quartile, 50th Percentile
Interquartile range (IQR)
Q3 - Q1, the spread of the middle 50% of the data
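The quartile, percentile and IQR cards can be tried out with a short sketch. It assumes Python 3.8+ and uses `statistics.quantiles` with its default "exclusive" method; other textbooks and software interpolate slightly differently, so their quartiles may not match exactly. The data values are hypothetical.

```python
import statistics

# Hypothetical sample of 11 made-up values.
data = [12, 15, 17, 20, 22, 25, 28, 30, 33, 36, 40]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

# n=100 gives the 99 percentile cut points; index 49 is the 50th percentile.
percentiles = statistics.quantiles(data, n=100)
p50 = percentiles[49]

print(f"Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}, P50={p50}")
```

The second quartile, the 50th percentile and the median are the same value here, as the "Median quartile" card states.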
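Sturges' rule
A rule for determining the number of classes (bins) to use in a histogram or frequency distribution table - a guide to the optimal number of bins
k = 1 + 3.322*log10(n), where n is the number of observations (equivalent to k = 1 + log2(n))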
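A minimal sketch of the rule follows. The constant in the formula is approximately 1/log10(2), so the sketch uses the equivalent form k = 1 + log2(n). Rounding up to a whole number of classes is an assumption made here; some texts round to the nearest integer instead. The function name `sturges_classes` is hypothetical.

```python
import math


def sturges_classes(n: int) -> int:
    """Suggested number of classes for n observations (rounded up)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # k = 1 + log2(n), which is about 1 + 3.322 * log10(n).
    return math.ceil(1 + math.log2(n))


for n in (30, 100, 1000):
    print(n, "observations ->", sturges_classes(n), "classes")
```

The number of classes grows slowly: multiplying the sample size by ten adds only about three classes.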
Mean calculation
X̄ = ∑xᵢ / n
Median
Middle value of the ordered data set; splits it into two equal halves
Mode
Value which occurs with the greatest frequency
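The three measures of centre can be compared on one small sample. The sketch below uses Python's standard `statistics` module; the prices are made-up values chosen so the arithmetic is easy to follow by hand.

```python
import statistics

# Hypothetical prices (made up for illustration).
prices = [2, 4, 4, 5, 7, 8]

# Mean: sum of the values divided by how many there are (30 / 6).
mean = statistics.mean(prices)

# Median: middle of the ordered data; with an even count it is the
# average of the two middle values (4 and 5).
median = statistics.median(prices)

# Mode: the value that occurs most often (4 appears twice).
mode = statistics.mode(prices)

print(f"mean={mean}, median={median}, mode={mode}")
```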
Deviation from the mean
Difference between each observation and the mean (eg between each price and the average price)
xᵢ - X̄
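A short sketch of the deviations, using made-up prices. It also shows why the deviations are squared before they are averaged: taken as they are, the positive and negative deviations cancel and always sum to zero.

```python
# Hypothetical prices (made up for illustration).
prices = [2, 4, 4, 5, 7, 8]

mean = sum(prices) / len(prices)

# Deviation from the mean for each observation: x_i - mean.
deviations = [x - mean for x in prices]

print("mean:", mean)
print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
```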
Symmetric distribution
Graph is a mirror image about its centre; mean = median
Left skewed
mode > median > mean
Negative skew (longer tail on the left)
Right skewed
mean > median > mode
Positive skew (longer tail on the right)
Variance
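The average of the squared deviations from the mean. The sample variance divides by n - 1 (σ^2 denotes the population variance, which divides by N)
S^2 = ∑(xᵢ - X̄)^2 / (n - 1)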
Standard deviation
A quantity expressing by how much the members of a group differ from the mean value of the group; the square root of the variance
S = √( ∑(xᵢ - X̄)^2 / (n - 1) )
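The variance and standard deviation formulas can be worked through step by step. The sketch below computes the sample versions (dividing by n - 1) by hand and compares them with `statistics.variance` and `statistics.stdev` from the standard library. The prices are hypothetical.

```python
import math
import statistics

# Hypothetical prices (made up for illustration).
prices = [2, 4, 4, 5, 7, 8]
n = len(prices)
mean = sum(prices) / n

# Squared deviations from the mean: (x_i - mean)^2.
squared_deviations = [(x - mean) ** 2 for x in prices]

# Sample variance divides by n - 1; population variance divides by n.
sample_variance = sum(squared_deviations) / (n - 1)
population_variance = sum(squared_deviations) / n

# Standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print("sample variance:", sample_variance)
print("library variance:", statistics.variance(prices))
print("sample standard deviation:", sample_sd)
print("library standard deviation:", statistics.stdev(prices))
```

The standard deviation is in the same units as the data, which makes it easier to interpret than the variance.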
Skewness
Measure of the asymmetry of a distribution. Pearson's second coefficient of skewness = 3(mean - median) / standard deviation
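A minimal sketch of the skewness formula, assuming the sample standard deviation (n - 1 divisor) is the one intended. The function name `pearson_skewness` and the income figures are hypothetical; one large value is included to create a right-hand tail.

```python
import statistics


def pearson_skewness(data):
    """3 * (mean - median) / sample standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    return 3 * (mean - median) / statistics.stdev(data)


# Hypothetical incomes with one large value pulling the mean upwards.
incomes = [20, 22, 25, 27, 30, 32, 110]

print("mean:", statistics.mean(incomes))
print("median:", statistics.median(incomes))
print("skewness:", pearson_skewness(incomes))
```

Because the mean is pulled towards the long tail while the median stays put, the coefficient is positive for right-skewed data, negative for left-skewed data, and zero when mean and median coincide.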
Kurtosis
Measure of the tailedness of a distribution - how often outliers occur
= [ ∑(xᵢ - X̄)^4 / n ] / S^4
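The kurtosis formula can be sketched as follows, assuming S is the sample standard deviation (n - 1 divisor) as on the standard deviation card. Other conventions use the population standard deviation or subtract 3 to report "excess" kurtosis, so results from statistical packages may differ. The function name `kurtosis` and both data sets are hypothetical.

```python
import statistics


def kurtosis(data):
    """Average fourth-power deviation divided by S^4."""
    n = len(data)
    mean = sum(data) / n
    # Average of the deviations raised to the fourth power.
    fourth_moment = sum((x - mean) ** 4 for x in data) / n
    # S is the sample standard deviation (assumption, see lead-in).
    s = statistics.stdev(data)
    return fourth_moment / s ** 4


# Hypothetical prices, then the same prices with one outlier.
prices = [2, 4, 4, 5, 7, 8]
with_outlier = [2, 4, 4, 5, 7, 30]

print("kurtosis:", kurtosis(prices))
print("kurtosis with outlier:", kurtosis(with_outlier))
```

Raising deviations to the fourth power makes extreme values dominate the sum, which is why a single outlier increases the result.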
Cross-sectional data
Observations on several variables, all collected at a single point in time