Quant 1 Flashcards
Individuals vs. Variables
Individuals - the things we count or measure
Variables - characteristics of those individuals
Students - year in school, GPA
Cars - color, manufacturer, price
Types of data
Categorical - verbal (a label or description) or coded (type 1, 2, 3 …)
Numerical - discrete (whole numbers) or continuous (fractional values, possibly rounded to a whole number)
Categorical Data
Puts individuals into groups
-year in school
-color
-type of climate
Numerical Data
Assigns numbers
-GPA
-Price
-Number of errors
Discrete vs Continuous
Discrete data are countable: integer values you could count by hand.
Continuous data take fractional or decimal values and are measured with an instrument; rounding can make them look discrete.
Descriptive, Predictive and Prescriptive Analytics
Descriptive - What happened?
Predictive - What will happen?
Prescriptive - What should we do going forward?
CRM
Customer Relationship Management
Post hoc fallacy
The mistaken assumption that because A precedes B, A causes B. Sequence alone does not establish causality.
Observation
A single member of a collection
Data Set
All the values of all of the variables of all of the observations
Coding categorical info
Assign a numerical value to a nonnumerical variable.
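A minimal Python sketch of coding, reusing the vehicle-size codes that appear elsewhere in these cards. The names SIZE_CODES and code_values are hypothetical, chosen only for illustration.

```python
# Coding scheme for a nonnumerical variable (vehicle size).
# The dictionary and function names are illustrative assumptions.
SIZE_CODES = {"full-sized": 1, "compact": 2, "subcompact": 3}

def code_values(values, codes):
    """Replace each category label with its numeric code."""
    return [codes[v] for v in values]

vehicles = ["compact", "full-sized", "subcompact", "compact"]
coded = code_values(vehicles, SIZE_CODES)
print(coded)  # [2, 1, 3, 2]
```

The codes are only labels: the data are still categorical after coding.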
Time series data
Data observed over time, usually at equally spaced intervals; time goes on the x axis.
Nominal Data
Categorical (qualitative) data used only for classification. Weakest level of measurement.
Ordinal Data
Implies ranking, e.g. size of vehicle: full-sized (1), compact (2), subcompact (3). Stronger than nominal data but still weak. Averages are not meaningful because the distance between ranks is not defined.
Interval Data
E.g. survey data on satisfaction. Distances between values are meaningful, so statistics such as the average can be computed. No meaningful zero point.
Ratio Data
Meaningful zero point. All mathematical operations are applicable, including logs and ratios. Zero does not have to be observable in the data (e.g. baby weights). Ratio data can be recoded down to ordinal data, but not the reverse (e.g. categorizing blood pressures as ‘normal’, ‘elevated’ or ‘high’).
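A rough Python sketch of recoding ratio data down to ordinal data, using the blood pressure example. The function name bp_category and the cutoffs of 120 and 130 are assumptions made for illustration, not clinical guidance.

```python
def bp_category(systolic, elevated_at=120, high_at=130):
    """Recode a ratio-level reading into an ordinal label.

    The default cutoffs are illustrative assumptions only.
    """
    if systolic < elevated_at:
        return "normal"
    if systolic < high_at:
        return "elevated"
    return "high"

readings = [112, 124, 141]
labels = [bp_category(r) for r in readings]
print(labels)  # ['normal', 'elevated', 'high']
```

Going the other way is not possible: the label "elevated" does not tell you whether the reading was 121 or 129.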
Likert Scale
Survey research scale usually treated as interval data. Responses range from strongly agree to strongly disagree. Distances between coded responses can be treated as equal, but ratios are not meaningful: 4 is not twice 2.
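As a small illustration, the sketch below codes Likert responses and averages them, which is reasonable only if the scale is treated as interval data. The 5-point coding and the sample responses are made up for the example.

```python
from statistics import mean

# Hypothetical 5-point coding; labels and codes are assumptions.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "neutral", "strongly agree", "agree"]  # made-up answers
scores = [LIKERT[r] for r in responses]
avg = mean(scores)  # meaningful only if the gaps between codes are equal

print(scores)  # [4, 3, 5, 4]
print(avg)
```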
Population
All of the items we are interested in. For example: all of the passengers on a plane (finite) or all of the Coke products produced on an ongoing line (effectively infinite).
Sample
A subset of the population that we will actually analyze
Census
A sample looks at a subset; a census looks at every individual in the population. The apparent accuracy of a census may be illusory.
When a sample may be better than a census
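Infinite population - the population is indefinite, so a census is impossible
Destructive testing - testing destroys the item, so a census would destroy the population
Timely results - a sample is faster
Accuracy - limited resources spent on a careful sample can give better results than a census spread thin
Cost - a census is too expensive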
Sensitive info
When a census may be preferable to a sample
Small population
Large sample size - the required sample is nearly as large as the population
Database exists
Legal requirements
Parameter
A measurement or characteristic of the population (e.g. mean or proportion). Usually represented by Greek letters such as μ (mu) and π (pi).
Statistic
A numerical value calculated from a sample (e.g. mean or proportion). Usually represented by Roman letters such as x̄ (x bar) and p.
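The parameter/statistic distinction can be sketched in Python as below. The population is simulated with made-up GPA values, and the population size (1,000), sample size (50), and seed are arbitrary assumptions.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: GPAs of 1,000 students (simulated values).
population = [round(random.uniform(2.0, 4.0), 2) for _ in range(1000)]

mu = mean(population)                   # parameter: uses every individual
sample = random.sample(population, 50)  # subset we actually analyze
x_bar = mean(sample)                    # statistic: uses the sample only

print(f"mu = {mu:.3f}, x_bar = {x_bar:.3f}")
```

The two values will usually be close but not equal; a different sample gives a different x_bar, while mu stays fixed.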