Chapter 1, Statistical Investigations Flashcards
Population
Every individual in a group
Unit of Observation
One case or element you are collecting data from (e.g., person, animal, object, time point)
A variable can be thought of as a _______ or a ________ of the population that might vary with each observation
Attribute ; Feature
Statistic
Some calculation we might make from our collected data
Variable of interest
Changing quantity that is measured (e.g., length, distance)
Statistic possibilities
Probabilities
Statistics are generated in hopes that they estimate ______ or let us make judgments about them
Parameters
Parameter
Some characteristic of the population
Nominal variables
Variables with outcomes that fall into categories with no inherent ordering/scale
What flu symptoms are you experiencing?
Nominal Variable
What fruits do you like to eat?
Nominal Variable
Ordinal variables
Variables whose outcomes fall into categories that have a meaningful ordering but not on a consistent numeric scale
Are you a freshman, sophomore, junior, or senior?
Ordinal Variables
Do you strongly approve, somewhat approve, somewhat disapprove, or strongly disapprove of this policy?
Ordinal Variables
Discrete Variables
Variables whose outcomes fall on a numeric scale but take only a limited set of values (e.g., whole numbers). These are typically countable.
How many days last month did you work out?
Discrete Variables
What is the number of bugs you squashed today?
Discrete Variables
Continuous Variables
Numeric and measurable (can take any value in a range)
What is the heaviest weight you can squat?
Continuous Variables
How much time did you spend on the homework assignment?
Continuous Variables
π is a __________ for __________
parameter ; categorical data
π represents
a population proportion (abstract, you don’t know what it is)
p̂ is a ___________ for __________________
statistic ; categorical data
p̂ represents
a sample proportion (you know what it is)
μ is a ______________ for _____________
parameter ; numeric data
μ represents
population mean
X̄ is a ______________ for ______________
statistic ; numeric data
X̄ represents
sample mean
σ is a _____________ for ______________
parameter ; deviation
σ represents
a population standard deviation
s is a ___________ for _______________
statistic ; deviation
s represents
a sample standard deviation
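The statistics on the cards above (p̂, X̄, s) can all be computed directly from a sample, while the matching parameters (π, μ, σ) stay unknown. A minimal sketch, assuming two small made-up samples; the variable names and data values are hypothetical and only there for illustration:

```python
import statistics

# Hypothetical sample data, invented for illustration only.
squat_weights = [135.0, 185.0, 155.0, 205.0, 170.0]  # numeric variable
likes_apples = [True, False, True, True, False]      # categorical (yes/no) variable

# p-hat: sample proportion of "yes" answers, an estimate of pi.
p_hat = sum(likes_apples) / len(likes_apples)

# x-bar: sample mean, an estimate of mu.
x_bar = statistics.mean(squat_weights)

# s: sample standard deviation (divides by n - 1), an estimate of sigma.
s = statistics.stdev(squat_weights)

print(p_hat, x_bar, s)
```

Note that `statistics.stdev` is the sample version; `statistics.pstdev` divides by n instead and would only be appropriate if the data were the entire population.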
Tail on the left (data bunched on the right)
left skewed/negatively skewed
Tail on the right (data bunched on the left)
right skewed/positively skewed
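Skew is named for the direction of the long tail. One rough check, not from the cards themselves, is to compare the mean with the median, since the mean tends to be pulled toward the tail. A minimal sketch of that heuristic; the function name `skew_hint` and the data are hypothetical, and the rule can fail on unusual distributions:

```python
import statistics

def skew_hint(values):
    """Rough heuristic: the mean is usually pulled toward the long tail."""
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    if mean_value > median_value:
        return "right skewed"
    if mean_value < median_value:
        return "left skewed"
    return "no skew suggested"

# Hypothetical data with one large value forming a tail on the right.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
print(skew_hint(right_tailed))
```

A histogram remains the more reliable way to judge skew; this comparison is only a quick hint.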
Bar plots
Used for a single categorical variable
Histograms
Used for single numeric variable (x-axis is variable, y is count)
Univariate question
Asks about characteristics of one variable
Multivariate question
Asks about the relationship between multiple variables
Response variable
a variable that we have an interest in better understanding or predicting. It is an outcome of interest
Explanatory variable
a variable that we think might help predict or explain the response variable. We may suspect it is the causal agent
Pearson’s correlation coefficient
A statistic between -1 and +1 that describes the direction and strength of a linear association between two numeric variables
r represents
the sample correlation coefficient (a statistic)
ρ represents
the population correlation coefficient (a parameter)
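To make the definition of Pearson's correlation coefficient concrete, the sample statistic r can be computed from paired numeric data as the sum of cross-deviations divided by the square root of the product of the summed squared deviations. A minimal sketch; the function name `pearson_r` and the example data are hypothetical:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either variable never varies.
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data: explanatory variable, then response variable.
hours_studied = [1, 2, 3, 4, 5]
quiz_scores = [2, 4, 5, 4, 5]
r = pearson_r(hours_studied, quiz_scores)
print(round(r, 3))
```

The sign of r gives the direction of the linear association and its distance from 0 gives the strength; r says nothing about nonlinear patterns.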
Q1
25th percentile ; median of the lower half
Q2
50th percentile ; median of the entire set
Q3
75th percentile ; median of the upper half
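The "median of the lower half / upper half" description of the quartiles can be sketched in code. This version assumes one common textbook convention: when the number of values is odd, the overall median is left out of both halves. The function name `quartiles_by_halves` is hypothetical, and other conventions (including the default in Python's `statistics.quantiles`) can give slightly different answers:

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 values")
    half = n // 2
    lower = data[:half]
    # Assumption: with an odd count, skip the middle value in both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
    )

# Hypothetical data set with an odd number of values.
q1, q2, q3 = quartiles_by_halves([1, 3, 5, 7, 9, 11, 13])
print(q1, q2, q3)
```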