Vocabulary Flashcards
Sample
Set of observations drawn from a population
Categorical variable (qualitative)
Takes on values that are names or labels (color of a ball, breed of dog)
Quantitative variable
Numerical; represents a measurable quantity (e.g., population: number of people)
Categorical graphs
Bar graph
Pie chart
Quantitative graphs
Box and whisker plot
Scatter plot
Dot plot
Histogram
Stem and leaf plot
Independence
Events are independent when the outcome of one does not affect the probability that the other occurs
Shape
Pattern of the distribution of data in a sample (quantitative)
Symmetry, number of peaks, skewness, uniform
Center
Middle of a distribution, commonly measured by the median
Spread
Variability of data
Wide range -> wide spread
Small range -> small spread
Mode
Most frequently occurring value in a population/sample
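A minimal sketch of the mode, assuming Python's standard `statistics` module; the `scores` list is made up for illustration. `multimode` returns every value tied for most frequent, which covers data with more than one mode.

```python
from statistics import multimode

# Hypothetical sample; 3 occurs more often than any other value.
scores = [1, 3, 3, 3, 5, 7, 7]

modes = multimode(scores)  # list of all most-frequent values
print(modes)
```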
Uniform
All observations equally spread across range of distribution
Symmetric
Describes a shape whose left and right halves are approximate mirror images
Skewed (left vs right)
Skewed right: fewer observations in higher values
Skewed left: fewer observations in lower values
Outliers
Data point that differs greatly from other values in a sample
Median
Measure of central tendency
Order values smallest -> largest; the middle value is the median (average the two middle values if the count is even)
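The ordering step above can be sketched in a few lines of Python. The function name `median` is chosen for illustration, the input is assumed to be non-empty, and an even count is handled with the usual convention of averaging the two middle values.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)          # smallest -> largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

print(median([7, 1, 5]))      # 5
print(median([7, 1, 5, 3]))   # 4.0
```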
Range
Difference between the largest and smallest values in a data set
Quartiles
Divide a rank-ordered data set into four equal parts
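Textbooks and software differ on how quartiles are computed, so the following is only a sketch of one common convention (median of each half, leaving out the overall median when the count is odd). The helper name `quartiles` is hypothetical, and the data is assumed to have at least two values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the middle position
    upper = ordered[half + n % 2:]    # values above it (skips the median if n is odd)
    return median(lower), median(ordered), median(upper)

print(quartiles([1, 2, 3, 4, 5, 6, 7]))   # (2, 4, 6)
```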
Percentile
Values that divide a rank-ordered set of elements into 100 equal parts
Mean
Average score
Sum of individual scores divided by number of individuals
Variance
Numerical value of how widely individuals vary from the mean; equal to (standard deviation)^2
Standard deviation
Numerical value of how widely individuals vary from the mean
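A short sketch tying the mean, variance, and standard deviation cards together, using Python's standard `statistics` module. The `scores` list is invented and is treated as a whole population; for a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical scores, treated as an entire population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(scores)              # sum of scores / number of scores
variance = pvariance(scores)   # average squared distance from the mean
sd = pstdev(scores)            # square root of the variance

print(mu, variance, sd)        # mean 5, variance 4, standard deviation 2
```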
Parameter
Measurable characteristic of a population (mean, standard deviation)
Statistic
Characteristic of a sample
Sampling distribution
Probability distribution of a statistic across all possible samples of a given size
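One way to picture a sampling distribution is to draw many samples and record the statistic each time. The sketch below does this for the sample mean; the population (the integers 1 to 100), the sample size of 10, and the 2000 repetitions are all arbitrary choices for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)             # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population: 1..100, mean 50.5

# Draw many samples of the same size and record the statistic (the mean) for each.
sample_means = [mean(rng.sample(population, 10)) for _ in range(2000)]

# The collection of sample means approximates the sampling distribution of the mean.
print(min(sample_means), mean(sample_means), max(sample_means))
```

The individual sample means vary, but they cluster around the population mean.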
Resistant statistic vs nonresistant
A statistic is resistant if changing a few data values (such as adding an outlier) does not change the statistic drastically
Nonresistant: changes substantially with extreme values
Mean: nonresistant
Median: resistant
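A small illustration of the difference, with made-up data: replacing one value with an extreme one moves the mean a long way but leaves the median where it was.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]   # largest value replaced by an extreme one

print(mean(data), median(data))                   # mean 3, median 3
print(mean(with_outlier), median(with_outlier))   # mean 22, median 3
```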
Z-score
How many standard deviations an element is from the mean
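As a sketch, the z-score is the distance from the mean divided by the standard deviation. The function name and the numbers below are illustrative only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(z_score(9, 5, 2))   # 2.0  (two standard deviations above the mean)
print(z_score(4, 5, 2))   # -0.5 (half a standard deviation below the mean)
```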
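Empirical rule
For a normal (bell-shaped) distribution, about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3
Law of large numbers
When a random, independent trial is repeated under the same conditions, the fraction of trials that result in a given outcome converges to a limit as the number of trials grows without bound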
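The convergence of an outcome's relative frequency described above can be seen in a small simulation. This is only a sketch: it assumes a fair coin modeled with Python's `random` module, and the helper `fraction_heads` is a hypothetical name.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable

def fraction_heads(trials):
    """Fraction of simulated fair-coin flips that come up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

# The fraction tends to settle near 0.5 as the number of trials grows.
for trials in (10, 1_000, 100_000):
    print(trials, fraction_heads(trials))
```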
Response variable
Outcome that the study measures (dependent variable)
Explanatory variable
Factor that can influence the response variable
Lurking variable
Extraneous variable, not included in the study, that may explain the observed relationship between the independent and dependent variables
Simulation
Imitation of a chance process using a model (such as random numbers); not the real thing
Trial
A repetition of an experiment
Sample survey
Study that obtains data from a subset of a population
Bias
Tendency of a measurement process to over/underestimate the value of a population parameter
Randomization
Using chance methods to assign subjects to treatments, so that lurking variables are distributed across groups by chance
Sample size
Number of elements in a sample from a population
Census
Study that obtains data from every member of a population
Simple random sample (SRS)
Population has N objects
Sample has n objects; all possible samples of n objects are equally likely to occur
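A minimal sketch of drawing a simple random sample, assuming a population of N = 50 labeled objects and a sample of n = 5 (both sizes are arbitrary). Python's `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

rng = random.Random(0)             # fixed seed so the run is repeatable
population = list(range(1, 51))    # hypothetical population of N = 50 labeled objects

# Draw n = 5 objects without replacement.
srs = rng.sample(population, 5)
print(srs)
```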
Observational study
Researcher does not control how subjects are assigned to groups or which treatments groups receive
(Example: sample survey)
Experiment
Controlled study; subjects assigned to groups, treatments assigned
Placebo
Neutral treatment that has no “real” effect on the dependent variable
Sample space
Set of elements that represents all possible outcomes of a statistical experiment
Discrete random variable
Variable whose set of possible values is countable
Continuous random variable
A variable that can take any value between minimum and maximum values
Population
Total set of observations that can be made