Summer Assignment Flashcards
individuals
objects described by a set of data
variable
any characteristic of an individual
categorical variable
places an individual into one of several groups or categories
quantitative variable
takes numerical values for which it makes sense to find an average
distribution
tells us what values the variable takes and how often it takes those values
discrete variables
a variable that can take only separate, countable values, with nothing in between two adjacent possible values. Like counting heads when flipping a coin; you can't get 2.5 heads, only 2 or 3 heads
Continuous variables
opposite of discrete variables. Can take on any value between its minimum and maximum value, such as a height or a weight
Univariate Data
when you conduct a study that only looks at one variable; data that only contains one variable
Bivariate Data
data that contains two variables and examines the relationship between them. For example, height and weight.
Census
a study that obtains data from every member of a population. In most studies, a census is not practical, because of the cost and/or time required
Boxplot
Also known as a box-and-whisker plot. A boxplot splits the data set into quartiles, showing the minimum, lower quartile, median, upper quartile, and maximum.
Bar Graph
A bar graph (or bar chart) presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent.
Conditional Probability
The conditional probability of an event B is the probability that the event will occur given the knowledge that an event A has already occurred.
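A minimal sketch of this definition using counts, assuming a small made-up data set (the records and the names owns_pet and is_student are hypothetical, chosen only for illustration). It estimates the probability of B given A by keeping only the cases where A occurred and finding the fraction of those where B also occurred:

```python
from fractions import Fraction

# Made-up records: (owns_pet, is_student). Event A = owns a pet, event B = is a student.
records = [
    (True, True), (True, False), (True, True), (False, True),
    (False, False), (True, True), (False, True), (True, False),
]

# Count the cases where A occurred, and the cases where both A and B occurred.
count_a = sum(1 for owns_pet, _ in records if owns_pet)
count_a_and_b = sum(1 for owns_pet, is_student in records if owns_pet and is_student)

# P(B given A) = (count of A and B) / (count of A)
p_b_given_a = Fraction(count_a_and_b, count_a)
print(p_b_given_a)
```

Using Fraction keeps the result exact instead of rounding it to a decimal.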
Sample
a set of observations drawn from a population.
Population
the total set of observations that can be made
Inference
Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution, typically by drawing conclusions about a population from a sample
Frequency Table
When a table shows frequency counts for a categorical variable, it is called a frequency table
Relative Frequency
To compute relative frequency, one obtains a frequency count for the total population and a frequency count for a subgroup of the population. The relative frequency for the subgroup is:
Relative frequency = Subgroup count / Total count
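The formula above can be sketched in a few lines of Python, assuming a small made-up list of survey responses (the data and variable names are hypothetical). The sketch builds a frequency table first and then divides each count by the total:

```python
from collections import Counter

# Made-up responses for one categorical variable (favorite color).
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

counts = Counter(colors)       # frequency table: category -> count
total = sum(counts.values())   # total count

# Relative frequency = subgroup count / total count
relative = {category: count / total for category, count in counts.items()}

for category, count in counts.items():
    print(category, count, relative[category])
```

The relative frequencies of all categories should add up to 1, which is a quick way to check the table.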
Table
an arrangement of data in rows and columns, or possibly in a more complex structure.
Interquartile Range
a measure of variability based on dividing a data set into quartiles. It is the distance between the lower and upper quartiles: IQR = Q3 - Q1
Five-Number Summary
A five-number summary consists of five values: the most extreme values in the data set (the maximum and minimum values), the lower and upper quartiles, and the median
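A short sketch of how a five-number summary (and the interquartile range that follows from it) might be computed. It assumes one common textbook convention, where the quartiles are the medians of the lower and upper halves of the sorted data, with the middle value left out when the count is odd. Other conventions exist and can give slightly different quartiles. The function name and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the median position
    upper = data[half + n % 2:]   # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # made-up data
summary = five_number_summary(scores)
iqr = summary[3] - summary[1]  # interquartile range = Q3 - Q1
print(summary, iqr)
```

These five values are the same ones a boxplot draws.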