Statistics Flashcards
Variance (or Sample Variance)
The average of the squared differences from the mean.
A measure of how far a set of numbers is spread out. Because it is expressed in squared units, the value is hard to interpret on its own.
Standard Deviation
Square root of the variance.
Tells you how much the data are spread out around the mean, in the same units as the data.
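A minimal Python sketch of the two cards above. The data values are made up for illustration, and the function names are hypothetical. Dividing by n gives the population variance; dividing by n - 1 gives the usual sample variance.

```python
import math

def population_variance(values):
    """Average of the squared differences from the mean (divide by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Same sum of squared differences, divided by n - 1 instead of n."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative values, mean = 5

pop_var = population_variance(data)   # 32 / 8
pop_sd = math.sqrt(pop_var)           # square root of the variance
samp_var = sample_variance(data)      # 32 / 7
```

The standard deviation is back in the original units of the data, which is why it is easier to read than the variance.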
Regression Analysis
A way to find trends in data.
Provides an equation for a line or curve so you can make predictions from your data.
Fits a line or curve to a set of points.
In linear regression, estimates one variable as a linear function of another.
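As a sketch of the idea, the following fits a straight line by ordinary least squares, one common way to do simple linear regression. The points are made up so that they lie exactly on y = 2x + 1, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # made-up points on the line y = 2x + 1

slope, intercept = fit_line(xs, ys)
predicted = slope * 6 + intercept  # prediction for a new x value of 6
```

The fitted equation is what lets you predict y for an x that was not in the data.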
Sampling frame
The list or source of units from which a sample is actually drawn. Ideally it covers the whole population of interest.
T-test
Tests whether the difference between two means is statistically significant.
Asks: could the two groups have come from the same population?
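A rough sketch of the calculation behind a two-sample t-test. The card does not say which variant is meant, so this assumes the equal-variance (pooled) form and computes only the t statistic and degrees of freedom. Turning them into a p value needs a t distribution table or a statistics library. The group values are made up.

```python
import math

def pooled_t_statistic(a, b):
    """t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    mean_a = sum(a) / na
    mean_b = sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean_a - mean_b) / standard_error, na + nb - 2

group_a = [5, 6, 7, 8, 9]  # made-up values, mean = 7
group_b = [1, 2, 3, 4, 5]  # made-up values, mean = 3

t_stat, dof = pooled_t_statistic(group_a, group_b)
```

A larger absolute t statistic means the gap between the means is large relative to the spread within the groups.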
Chi-square test
Goodness of fit.
How well expected/predicted values match observations.
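The goodness-of-fit statistic can be sketched as the sum of (observed - expected)^2 / expected over all categories. The counts below are made up, and the degrees of freedom assume no parameters were estimated from the data.

```python
def chi_square_statistic(observed, expected):
    """Sum of squared gaps between observed and expected counts, scaled by expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 20, 50]  # made-up observed counts in three categories
expected = [25, 25, 50]  # made-up counts predicted by some hypothesis

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # categories minus one
```

A statistic near zero means the observations match the expected values closely; larger values mean a worse fit.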
Ordinal
Position in a list, ranking.
Even though the categories may be given numerical values, such as 1, 2, 3, 4, the values themselves are meaningless; only the rank counts. So, even though one might be tempted to infer that 4 is twice as much as 2, this is not correct. Examples: letter grades, suitability for development, and response scales on a survey (e.g., 1 through 5).
Interval
Data that has an ordered relationship where the difference between values has a meaningful interpretation, but there is no true zero.
Example: temperature. The difference between 40 and 30 degrees is the same as between 30 and 20 degrees, but 40 degrees is not twice as warm as 20 degrees.
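One way to see why ratios are not meaningful on an interval scale is to change units. The card does not name a unit, so this sketch assumes the temperatures are in Celsius and converts them to Fahrenheit; `c_to_f` is a hypothetical helper.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Ratio of 40 to 20 degrees in each unit.
ratio_c = 40 / 20
ratio_f = c_to_f(40) / c_to_f(20)

# Equal steps stay equal in both units.
equal_steps_c = (40 - 30) == (30 - 20)
equal_steps_f = (c_to_f(40) - c_to_f(30)) == (c_to_f(30) - c_to_f(20))
```

The ratio changes with the unit because each scale puts zero in a different, arbitrary place, while equal differences remain equal.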
Ratio
Both absolute and relative differences have a meaning. Example: distance measure, where the difference between 40 and 30 miles is the same as the difference between 30 and 20 miles, and in addition, 40 miles is twice as far as 20 miles.
p value
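Number that you get by running a hypothesis test on your data. It is the probability of seeing results at least as extreme as yours if the null hypothesis were true. A p value of 0.05 (5%) or less is conventionally treated as statistically significant; it does not by itself show that the results are repeatable.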
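As an illustration, here is a one-sided exact binomial p value for a made-up experiment: 9 heads in 10 flips, with the null hypothesis that the coin is fair. The scenario and the function name are assumptions chosen because the arithmetic needs only the standard library.

```python
from math import comb

def p_value_at_least(heads, flips):
    """P(at least `heads` heads in `flips` flips) if the coin is fair."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

p_value = p_value_at_least(9, 10)  # (10 + 1) / 1024
significant = p_value <= 0.05      # conventional 5% threshold
```

Here the p value falls below 0.05, so under the usual convention the result would be called statistically significant.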
R squared value
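In regression, tells you what proportion of the variation in the dependent variable is explained by the model.
The values usually range from 0 to 1, with 0 meaning the model explains none of the variation and 1 meaning it fits the data perfectly.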
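A small sketch using the common definition R squared = 1 - (residual sum of squares / total sum of squares). The observed values and the predictions are made up, and the predictions stand in for the output of some hypothetical model.

```python
def r_squared(observed, predicted):
    """1 minus the share of variation left unexplained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

observed = [1, 2, 3, 4]  # made-up observations, mean = 2.5

good_fit = r_squared(observed, [1.5, 1.5, 3.5, 3.5])  # close predictions
perfect_fit = r_squared(observed, observed)           # predictions match exactly
mean_only = r_squared(observed, [2.5] * 4)            # always predicts the mean
```

Predicting the mean for every point gives 0, and matching every point gives 1, which are the two ends of the scale described on the card.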
Nominal data
Mutually exclusive groups or categories that lack intrinsic order.
Examples: zoning classification, social security number
The label of the categories does not matter and should not imply any order. So, even if one category might be labeled as 1 and the other as 2, those labels can be switched.
Population
The complete set of individuals or items of interest.
Example: all planners preparing for the 2018 AICP exam.
Sample
Subset of the population.
Example: 25 candidates selected at random out of the total number of planners preparing for the 2018 AICP exam.
Descriptive Statistics
Describe the characteristics of the distribution of values in a population or in a sample.
Example: the mean could be applied to the age distribution in the population of AICP exam takers, providing a summary measure of central tendency (e.g., “on average, AICP test takers in 2018 are 30 years old”). The context will make clear whether the statistic pertains to the population (all values known) or to a sample (only partial observations). The latter is the typical case encountered in practice.
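To make the population versus sample distinction concrete, the sketch below builds a hypothetical population of 400 ages (randomly generated, not real exam data), draws a random sample of 25, and computes the mean of each.

```python
import random
import statistics

rng = random.Random(2018)  # fixed seed so the run is repeatable

# Hypothetical population: made-up ages for 400 exam takers.
population_ages = [rng.randint(24, 45) for _ in range(400)]

# Sample: 25 ages drawn at random, without replacement.
sample_ages = rng.sample(population_ages, 25)

population_mean = statistics.mean(population_ages)  # all values known
sample_mean = statistics.mean(sample_ages)          # partial observations only
```

In practice only the sample mean is usually available, and it serves as an estimate of the population mean.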