Chapter P: Preliminaries Flashcards
Introduction to Statistical Investigations
What is a population?
A set of units (usually people, objects, transactions, or events) that we are interested in studying.
What is a Sample?
A subset of the units of a population
What is Descriptive Statistics?
It utilizes numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present that information in a convenient form.
What is inferential Statistics?
It utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.
What is a measure of reliability?
A statement (usually quantified) about the degree of uncertainty associated with a statistical inference.
What are the six steps of statistical investigation?
- Ask a research question
- Design a study and collect Data
- Explore the data
- Draw inferences
- Formulate conclusions
- Look back and ahead
What are the 4 pillars of Statistical Inference?
- Significance - How strong is the evidence of an effect?
- Estimation - What is the size of the effect?
- Generalization - How broadly do the conclusions apply?
- Causation - Can we say what caused the observed difference?
What is Data?
Data can be thought of as the values measured or categories recorded on individual entities of interest.
What is quantitative data?
measurements that are recorded on a naturally occurring numerical scale.
What is Qualitative data?
measurements that cannot be recorded on a natural numerical scale; they can only be classified into one of a group of categories.
Distribution
describes the pattern of value/category outcomes
Class
one of the categories into which qualitative data can be classified.
Class frequency
the number of observations in a particular class.
class relative frequency
the class frequency divided by the total number of observations in the data set.
Class percentage
the class relative frequency multiplied by 100%
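A minimal sketch of the three class measures above, assuming a small made-up list of eye colors (the data and variable names are illustrative only):

```python
from collections import Counter

# Hypothetical qualitative data: one class label per observation.
observations = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

n = len(observations)

# Class frequency: the number of observations in each class.
class_frequency = Counter(observations)

# Class relative frequency: class frequency divided by the total number of observations.
class_relative_frequency = {c: f / n for c, f in class_frequency.items()}

# Class percentage: class relative frequency multiplied by 100.
class_percentage = {c: 100 * rf for c, rf in class_relative_frequency.items()}

for c in class_frequency:
    print(c, class_frequency[c], class_relative_frequency[c], class_percentage[c])
```

The relative frequencies add up to 1 and the percentages add up to 100, which is a quick check on the arithmetic.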
Sample mean
The sample mean of a set of quantitative data is the sum of the measurements divided by the number of measurements contained in the data set.
Sample Standard Deviation
The sample standard deviation is a measure of the variability of the data: roughly, the average distance of the quantitative measurements from the mean.
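A short sketch of both calculations on made-up measurements. It assumes the usual convention of dividing by n - 1 for the sample standard deviation, which these cards do not spell out:

```python
import math
import statistics

# Hypothetical quantitative data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)

# Sample mean: sum of the measurements divided by the number of measurements.
sample_mean = sum(data) / n

# Sample standard deviation: square root of the sum of squared
# deviations from the mean, divided by n - 1 (assumed convention).
squared_deviations = [(x - sample_mean) ** 2 for x in data]
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))

print(sample_mean, sample_sd)

# The standard library uses the same n - 1 convention in statistics.stdev.
print(statistics.stdev(data))
```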
Histogram
A histogram for a quantitative variable is a graph that shows “how often” measurements fall in a particular range of numerical values, called the class interval.
- calculate sqrt(n) and then round to determine the number of intervals
- calculate the max minus min then divide by number of intervals, then round to find the width of each interval
- if a measurement falls on the border of two classes, “bump” it up into the higher class
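The three steps above can be sketched as follows, using made-up data. The cards do not say how to round, so this sketch uses Python's built-in round as one possible reading, and the clamp on the last interval is an added safeguard, not part of the card:

```python
import math

# Hypothetical quantitative data (16 measurements).
data = [1, 3, 4, 6, 6, 7, 9, 10, 11, 12, 14, 15, 16, 18, 19, 20]

n = len(data)

# Step 1: number of intervals is sqrt(n), rounded.
num_intervals = round(math.sqrt(n))

# Step 2: interval width is (max - min) / number of intervals, rounded.
width = round((max(data) - min(data)) / num_intervals)

# Step 3: count the measurements per interval. Each interval includes its
# lower border and excludes its upper border, so a value that falls on a
# border is "bumped" up into the higher class.
start = min(data)
counts = [0] * num_intervals
for x in data:
    index = int((x - start) // width)
    index = min(index, num_intervals - 1)  # safeguard: keep the maximum in the last interval
    counts[index] += 1

for i, count in enumerate(counts):
    low = start + i * width
    print(f"[{low}, {low + width}): {count}")
```

Here the value 6 sits on the border between the first and second intervals and is counted in the second.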
Center
The middle figure in quantitative data which “splits” the data. Half of the values should be larger than the center and half should be smaller.
Spread/variability
how far the data stretches: typically presented as the lowest value to the highest value
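As an illustration of center and spread, the sketch below uses the median as one common choice of center (the cards do not name a specific measure) and reports spread as the lowest and highest values. The data are made up:

```python
import statistics

# Hypothetical quantitative data.
data = [3, 7, 8, 12, 15, 21]

# Center: the median splits the data so that half the values fall on each side.
center = statistics.median(data)
smaller = sum(x < center for x in data)
larger = sum(x > center for x in data)

# Spread: presented as the lowest value to the highest value.
lowest, highest = min(data), max(data)

print("center:", center, "| values below:", smaller, "| values above:", larger)
print("spread:", lowest, "to", highest)
```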
Shape
The form of the distribution
Unusual observations
Are there outliers?
Random Process
one that can be repeated a very large number of times (in principle, forever) under identical conditions with the following property
Probability
The probability of an event is the long-run proportion of times the event would occur if a random process were repeated indefinitely
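The long-run idea can be illustrated with a small simulation. This sketch assumes a fair coin as the random process and "heads" as the event. The function name and the seed are illustrative choices:

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def long_run_proportion(num_repetitions):
    """Proportion of simulated fair-coin flips that come up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_repetitions))
    return heads / num_repetitions


# The proportion tends to settle near 0.5 as the number of repetitions grows.
for n in (10, 100, 1_000, 100_000):
    print(n, long_run_proportion(n))
```

With only a few repetitions the proportion can be far from 0.5. The probability describes what happens over a very large number of repetitions, not any single short run.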