Chapter 1 - Statistics, Data, and Statistical Thinking Flashcards
Learn your vocabulary.
Statistics
The science of data.
It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting numerical and categorical information.
[1.2] Descriptive Statistics
Utilizes NUMERICAL and graphical methods to explore data, i.e., to look for patterns in a data set, to summarize the information revealed in a data set, and to present the information in a convenient form.
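As a rough illustration of the numerical side of descriptive statistics, the sketch below summarizes a small data set with Python's standard `statistics` module. The values and the name `commute_minutes` are hypothetical and do not come from the chapter.

```python
import statistics

# Hypothetical data: travel times to school in minutes (made up for illustration).
commute_minutes = [12, 15, 15, 20, 22, 25, 30, 45]

# Numerical summaries that describe the data set in a compact form.
summary = {
    "n": len(commute_minutes),
    "mean": statistics.mean(commute_minutes),
    "median": statistics.median(commute_minutes),
    "min": min(commute_minutes),
    "max": max(commute_minutes),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A graphical method (for example, a histogram) would present the same data visually; only the numerical summaries are sketched here.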
Inferential Statistics
Utilizes sample data to make ESTIMATES, DECISIONS, PREDICTIONS, OR OTHER GENERALIZATIONS about a larger set of data.
[1.3] Experimental (or observational) unit
An object (e.g., a person, thing, transaction, or event) upon which we collect data.
Population
A set of units (usually people, objects, transactions, or events) that we are interested in studying.
Variable
A characteristic or property of an individual experimental (or observational) unit.
Measurement
The process we use to assign numbers to variables of individual population units.
Example: Measuring the preference for a food product by asking a consumer to rate the product’s taste on a scale from 1-10.
Census
Measuring a variable for every experimental unit of a population. “EVERYONE”.
Example: You’re surveying travel time by asking students at a school.
> Asking everyone at school is a CENSUS of the school.
> But asking only 50 students is a SAMPLE of the school.
Sample
A subset of the units of a population.
Statistical inference
An estimate or prediction or some other generalization about a population based on information contained in a sample.
Example: From a sample of 100 invoices, the auditor may estimate the total number of invoices containing errors in the population of 15,000 invoices. [Figure 1.2]
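The invoice example can be sketched as a short calculation. The population and sample sizes come from the example above; the number of errors found in the sample (`errors_in_sample`) is a hypothetical value chosen only to make the arithmetic concrete.

```python
# Figures from the invoice example.
population_size = 15_000   # invoices in the population
sample_size = 100          # invoices the auditor examines

# Hypothetical audit result (not from the source): invoices with errors in the sample.
errors_in_sample = 20

# Proportion of sampled invoices that contain errors.
sample_proportion = errors_in_sample / sample_size

# Scale the sample count up to the whole population.
estimated_errors_in_population = errors_in_sample * population_size / sample_size

print(f"Sample proportion with errors: {sample_proportion:.2f}")
print(f"Estimated invoices with errors in the population: {estimated_errors_in_population:.0f}")
```

The result is an estimate, not a certainty: a different sample of 100 invoices would likely give a somewhat different number.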
Reliability
How good the inference is.
- The only way we can be certain that an inference about a population is correct is to include the entire population in our sample (that is, to conduct a census).
Measure of reliability
A statement (usually quantified) about the degree of uncertainty associated with a statistical inference.
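One common way to quantify reliability is a margin of error around an estimated proportion. The sketch below is an assumption for illustration, not a method named in these flashcards: it uses the large-sample normal approximation with the conventional multiplier 1.96 for roughly 95% confidence, ignores any finite-population correction, and uses hypothetical sample counts.

```python
import math

# Hypothetical sample result (made up for illustration).
sample_size = 100
errors_in_sample = 20

# Point estimate of the population proportion.
p_hat = errors_in_sample / sample_size

# Standard error of a sample proportion (large-sample normal approximation).
standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)

# 1.96 is the standard normal multiplier for roughly 95% confidence.
margin_of_error = 1.96 * standard_error

lower = p_hat - margin_of_error
upper = p_hat + margin_of_error
print(f"Estimate: {p_hat:.3f}, approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

The width of the interval is the quantified statement of uncertainty: a larger sample would shrink `standard_error` and therefore the margin.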
Four Elements of DESCRIPTIVE Statistical Problems
- The POPULATION or SAMPLE of interest
- One or more variables (characteristics of the population or experimental units) that are to be investigated
- Tables, graphs, or numerical summary tools
- Identification of patterns in data
Five Elements of INFERENTIAL Statistical Problems
- The population of interest
- One or more variables (characteristics of the population or experimental units) that are to be investigated
- The sample of population units
- The inference about the population based on information contained in the sample
- A measure of reliability for the inference.
[1.4] Process
A series of actions or operations that transforms inputs to outputs.
- A process produces or generates output over time.
Example: Processes of interest to businesses are those of production or manufacturing.
Black box
A process whose operations or actions are unknown or unspecified.
Sample
Any set of output (objects or numbers) produced by a process is also called a sample.
Quantitative data
Measurements that are recorded on a naturally occurring numerical scale.
Example: The temperature (in degrees Celsius) at which each unit in a sample of 20 pieces of heat-resistant plastic begins to melt.
- Unemployment rate (%)
- Score of a sample on the GMAT or MCAT
- Number of females employed in each of a sample of 75 manufacturing companies
Qualitative Data (categorical)
Measurements that CANNOT be measured on a natural numerical scale; they can only be classified into one of a group of categories.
Example:
- Political party affiliation
- Defective status (defective or not defective)
- Size of a car (subcompact, mid-size, full-size)
- A taste tester’s ranking (best, worst, etc.)
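The practical difference between the two data types shows up in how they are summarized. In this minimal sketch, the car sizes reuse the categories listed above, while the individual records and the fuel economy figures are hypothetical.

```python
from collections import Counter
import statistics

# Hypothetical measurements on five cars (made up for illustration).
car_sizes = ["subcompact", "mid-size", "full-size", "mid-size", "mid-size"]  # qualitative
fuel_economy_mpg = [38, 29, 22, 31, 30]  # quantitative

# Qualitative data: count how many units fall into each category.
size_counts = Counter(car_sizes)

# Quantitative data: arithmetic on the values is meaningful.
mean_mpg = statistics.mean(fuel_economy_mpg)

print(size_counts)
print(mean_mpg)
```

Averaging the category labels would make no sense, which is why qualitative data are reported as counts or proportions per category.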
Three ways of obtaining data
- Published source
- Designed experiment
- Observational study (e.g., a survey)
[1.6] Published source
A book, journal, newspaper, or Web site.
Designed experiment
The researcher exerts full control over the characteristics of the experimental units sampled.
Example: One group of experimental units is assigned the treatment, and another group is left untreated (the control group).
Observational study
Experimental units are OBSERVED in a natural setting. No attempt is made to control the experimental units sampled.
Example: Opinion polls, surveys.
What’s so significant about surveys?
Most common type of observational study, where the researcher samples a group of people to ask questions and record responses.
Representative sample
Exhibits characteristics typical of those possessed by the population of interest.
Simple random sample
Ensures that every subset of fixed size in the population has the same chance of being included in the sample.
Random number generator
The typical tool for selecting a simple random sample. Generators are available in table form, online, and in statistical software packages.
Example: Excel/XLSTAT, Minitab, and SPSS
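General-purpose languages offer random number generators as well. As a sketch, Python's standard `random` module can draw a simple random sample of 50 students; the school size of 1,200, the ID numbering, and the seed value are all hypothetical.

```python
import random

# Hypothetical population: student ID numbers 1 through 1200 (school size is made up).
student_ids = list(range(1, 1201))

# A seeded generator makes the draw reproducible; the seed value is arbitrary.
rng = random.Random(2024)

# random.sample draws without replacement, so every subset of 50 students
# has the same chance of being chosen.
sample_ids = rng.sample(student_ids, k=50)

print(sorted(sample_ids))
```

Surveying only the students in `sample_ids` gives a sample of the school; surveying all of `student_ids` would be a census.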
Selection bias
Results when a subset of experimental units in the population has little or no chance of being selected for the sample.
Nonresponse bias
A type of selection bias that results when the response data differ from the potential data for the nonresponders.
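A small numerical sketch can make nonresponse bias concrete. All values below are hypothetical: ten people are selected for a satisfaction survey, and the less satisfied ones tend not to respond.

```python
import statistics

# Hypothetical survey: (satisfaction score from 1 to 10, whether the person responded).
# All values are made up; here the less satisfied people tend not to respond.
selected_units = [
    (9, True), (8, True), (8, True), (7, True), (6, True),
    (4, False), (3, False), (3, False), (2, False), (5, True),
]

responder_scores = [score for score, responded in selected_units if responded]
all_scores = [score for score, _ in selected_units]

# What the survey actually sees versus what full response would have shown.
responder_mean = statistics.mean(responder_scores)
full_mean = statistics.mean(all_scores)

print(f"Mean among responders: {responder_mean:.2f}")
print(f"Mean among everyone selected: {full_mean:.2f}")
```

Because the nonresponders' potential answers differ from the responders' answers, the responder mean overstates satisfaction in this made-up example.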
Measurement error
Refers to inaccuracies in the values of the data collected.
Example: In surveys, the error may be due to ambiguous or leading questions and the interviewer’s effect on the respondent.
[1.7] Business analytics
Refers to methodologies (e.g., statistical methods) that extract useful information from data in order to make better business decisions.
Statistical thinking
Involves applying rational thought and the science of statistics to critically assess data and inferences.
Unethical statistical practice
Occurs when the selection bias in the sample is intentional, with the sole purpose of misleading the public.