Maths Flashcards
Describe 4 types of data
- Ratio - interval data with a natural zero point. For example, time is ratio data because zero time is meaningful, so 20 seconds is twice as long as 10 seconds. The Kelvin temperature scale also has a true zero (absolute zero), and on both scales equal steps represent equal magnitudes.
- Interval - like ordinal, except that the intervals between values are equal (distance is meaningful). The most common example is temperature in degrees Fahrenheit. The difference between 29 and 30 degrees is the same magnitude as the difference between 78 and 79 (although I know I prefer the latter). Attitudinal scales and the Likert questions you usually see on a survey are rarely truly interval, although some points on the scale are likely to be equally spaced.
- Ordinal - refers to quantities that have a natural ordering. Examples include the ranking of favorite sports, the order of people’s places in a line, the order of runners finishing a race or, more often, the choice on a rating scale from 1 to 5. With ordinal data you cannot state with certainty whether the intervals between values are equal. For example, we often use rating scales (Likert questions). On a 10 point scale, the difference between a 9 and a 10 is not necessarily the same as the difference between a 6 and a 7. This is also an easy one to remember: ordinal sounds like order.
- Nominal - refers to categorical data with no natural order, such as the name of your school, the type of car you drive or the name of a book. This one is easy to remember because nominal sounds like name (it comes from the Latin nomen, meaning name).
Enumeration or census vs. sample
A census is a quantitative research method in which every member of the population is enumerated. A sample, by contrast, is a subset selected from the larger population and used to represent the entire group. Sampling is the more widely used method in statistical testing.
Primary Data vs. Secondary Data
Primary data refers to first-hand data gathered by the researchers themselves. Examples: surveys, observations, experiments, questionnaires, personal interviews.
Secondary data means data collected earlier by someone else. Examples: government publications, websites, books, journal articles, internal records.
Mean
Measure of central tendency. = Sum of items / Count of items
Median
Measure of central tendency. Sort the items in order and select the middle item. If the count of items is even, take the average of the two middle items.
Mode
The value that occurs most often.
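The three measures above can be compared side by side. This is a minimal sketch using Python's standard statistics module; the sample values are made up for illustration.

```python
from statistics import mean, median, multimode

# Made-up sample data (an assumption for illustration only).
values = [2, 3, 3, 5, 7, 10]

# Mean: sum of items / count of items.
print("mean:", mean(values))

# Median: middle of the sorted items. With an even count,
# it is the average of the two middle items (3 and 5 here).
print("median:", median(values))

# Mode: the most frequent value. multimode returns a list,
# so ties (as in a bimodal data set) are all reported.
print("mode(s):", multimode(values))
```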
Bimodal distribution
When two clearly separate groups are visible in a histogram, you have a bimodal distribution. Literally, a bimodal distribution has two modes, or two distinct clusters of data.
Range
= High value minus low value
Variance
Subtract the mean from each value. Square the difference. Sum the squares of the differences and divide by the number of cases.
Standard deviation
Square root of the variance.
Measure of the amount of variation of a random variable expected about its mean. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.
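The range, variance and standard deviation steps can be sketched directly from the definitions above. The data values and function names are assumptions for illustration. Note that dividing by the number of cases gives the population variance; a sample variance divides by the number of cases minus one instead.

```python
import math

def population_variance(values):
    """Average squared distance from the mean (divides by n)."""
    m = sum(values) / len(values)
    squared_diffs = [(v - m) ** 2 for v in values]
    return sum(squared_diffs) / len(values)

def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

# Made-up sample data (an assumption for illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(data) - min(data)  # high value minus low value
print("range:", value_range)
print("variance:", population_variance(data))
print("standard deviation:", population_std_dev(data))
```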
Normal distribution
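Symmetric, bell-shaped curve centered on the mean, usually drawn in bands that are each 1 standard deviation wide. About 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.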
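What the standard deviation bands of the bell curve contain can be checked numerically. This sketch assumes Python 3.8 or later, where statistics.NormalDist is available; the helper name share_within is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```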
Dependent variable (y variable)
Variable being predicted or explained
Independent variable (x variable)
Variable used to predict or explain.
Difference between bivariate regression and multiple regression
They differ in how many variables are used to predict. Bivariate = one X variable, while multiple = two or more X variables.
Bivariate regression would be used to predict the number of automobiles per household from a single predictor, for example household income.
Multiple regression would be used to predict house sale price, based on a number of factors including bedroom and bathroom count, accessibility to employment, etc.
Regression
Assumes a straight line can be used to describe the relationship between the independent (x) variable and the dependent (y) variable.
▪ y = a + bx (the same line is often written y = mx + b, where m is the slope and b is the intercept)
▪ a is the line’s y intercept
▪ b is the line’s slope
▪ R² measures how well the line fits the data and ranges from 0.0 to 1.0
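The line y = a + bx and its R² can be computed by hand with ordinary least squares. This is a minimal sketch, not a replacement for a statistics package; the function names and the data points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y intercept
    return a, b

def r_squared(xs, ys, a, b):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data points (an assumption for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
r2 = r_squared(xs, ys, a, b)
print(f"y = {a:.2f} + {b:.2f}x, R squared = {r2:.2f}")
```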
Cohort component model
We divide the population into cohorts by age (often five-year groups), sex, and race/ethnicity.
▪ Population change is subdivided into three components: births, deaths, and migration
▪ Calculate birth rates, survival rates, and migration rates for a recent period
▪ Extend those rates into the future, possibly adjusting them upward or downward
▪ Birth and death data are readily available; migration data are harder to obtain, and the primary source is the American Community Survey
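One projection step of the cohort component steps above can be sketched as follows. This is a heavily simplified illustration: it assumes a single group (no split by sex or race/ethnicity), three age cohorts, a projection period equal to the cohort width, and an oldest cohort that simply ages out. All rates and the function name are hypothetical.

```python
def project_one_period(population, birth_rates, survival_rates, net_migration_rates):
    """Project cohort populations one period ahead.

    Each list has one entry per age cohort, youngest first.
    Rates are per person in the cohort over the whole period.
    """
    n = len(population)
    projected = [0.0] * n

    # Births during the period form the new youngest cohort.
    projected[0] = sum(p * b for p, b in zip(population, birth_rates))

    # Every other cohort is the next-younger cohort aged forward:
    # survivors plus net migrants. The oldest cohort ages out
    # (a simplification), so its rates are not used.
    for i in range(n - 1):
        survivors = population[i] * survival_rates[i]
        net_migrants = population[i] * net_migration_rates[i]
        projected[i + 1] = survivors + net_migrants

    return projected

# Hypothetical inputs for illustration only.
population = [1000, 1200, 900]
birth_rates = [0.0, 0.5, 0.1]
survival_rates = [0.99, 0.98, 0.0]
net_migration_rates = [0.02, -0.01, 0.0]

projected = project_one_period(
    population, birth_rates, survival_rates, net_migration_rates
)
print([round(p) for p in projected])
```

Repeating the step with adjusted rates extends the projection further into the future.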
Economic base theory
Assumes two kinds of industry:
▪ Basic or export: sells to customers outside the area of analysis
▪ Service or non-basic: sells to customers within the area
Economic base multiplier
= Total employment / basic employment
A multiplier of 4.0 says that each additional basic job supports 4 total jobs (the basic job itself plus 3 non-basic jobs).
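The multiplier arithmetic can be sketched in a few lines. The employment figures and function names are hypothetical.

```python
def base_multiplier(total_employment, basic_employment):
    """Economic base multiplier: total jobs per basic job."""
    return total_employment / basic_employment

def total_jobs_from_new_basic_jobs(multiplier, new_basic_jobs):
    """Total job change implied by a change in basic employment."""
    return multiplier * new_basic_jobs

# Hypothetical region: 40,000 total jobs, 10,000 of them basic.
multiplier = base_multiplier(40_000, 10_000)
print("multiplier:", multiplier)

# 500 new basic jobs imply this many total new jobs.
new_total_jobs = total_jobs_from_new_basic_jobs(multiplier, 500)
print("total new jobs:", new_total_jobs)
```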
Location quotients
Compare the local concentration of employment in an industry to the national concentration of employment in that industry.
▪ LQi = (Local percent of employment in industry i) / (National percent of employment in industry i)
▪ If LQi is greater than 1.0, we can assume an export or basic industry
▪ If LQi is less than 1.0, we can assume the region imports some of that industry’s goods or services
▪ If LQi = 1.0, the region produces just enough to serve the region, and no more
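The location quotient and its interpretation can be sketched as follows. The employment counts and function names are hypothetical.

```python
def location_quotient(local_industry, local_total, national_industry, national_total):
    """Local share of employment in an industry divided by the national share."""
    local_share = local_industry / local_total
    national_share = national_industry / national_total
    return local_share / national_share

def interpret(lq):
    """Apply the rule of thumb for reading a location quotient."""
    if lq > 1.0:
        return "basic (export) industry"
    if lq < 1.0:
        return "region imports some goods or services"
    return "region produces just enough for itself"

# Hypothetical counts: 3,000 of 50,000 local jobs (6%) versus
# 4,000,000 of 160,000,000 national jobs (2.5%).
lq = location_quotient(3_000, 50_000, 4_000_000, 160_000_000)
print(round(lq, 2), "->", interpret(lq))
```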
Shift share analysis
Interprets changes in an industry’s local employment (over a period of x years) in terms of three components:
▪ National share: how much local industry employment would have changed if it had mirrored the change in total national employment
▪ Industry mix: how much more it would have changed if it had mirrored the difference between the national industry’s employment growth and total national employment growth
▪ Local shift: how many additional jobs the local industry gained or lost, presumably due to local competitive advantage or disadvantage
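The three components can be sketched with the common growth-rate form of shift share, in which they add up to the actual change in local industry employment. The function name and all employment figures are hypothetical.

```python
def shift_share(local_start, local_end,
                national_industry_start, national_industry_end,
                national_total_start, national_total_end):
    """Split the change in a local industry's employment into three parts."""
    # Growth rates over the period.
    national_total_growth = national_total_end / national_total_start - 1
    national_industry_growth = national_industry_end / national_industry_start - 1
    local_growth = local_end / local_start - 1

    return {
        # Change if the local industry had grown like the whole nation.
        "national_share": local_start * national_total_growth,
        # Extra change from the industry growing faster or slower nationally.
        "industry_mix": local_start * (national_industry_growth - national_total_growth),
        # Remaining change, attributed to local advantage or disadvantage.
        "local_shift": local_start * (local_growth - national_industry_growth),
    }

# Hypothetical figures: the local industry grows 20%, the national
# industry grows 10%, and total national employment grows 5%.
components = shift_share(1_000, 1_200, 100_000, 110_000, 1_000_000, 1_050_000)
for name, jobs in components.items():
    print(name, round(jobs))
print("total change:", round(sum(components.values())))
```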