Descriptive Statistics Flashcards

1
Q

Association

A

Two variables are associated if knowing the value of one variable is useful (to some degree) in predicting the value of the other variable. There are three aspects of this relationship: direction, strength, and form.

2
Q

Bar Graph

A

A graph that displays the distribution of a categorical variable.

3
Q

Binary Categorical Variable

A

A categorical variable with only two possible categories, for example, left or right.

4
Q

Bins or Classes

A

These are the subintervals of equal length that are used in constructing a histogram.

5
Q

Bivariate Data

A

Data for which there are two variables for each observation (for example, x and y).

6
Q

Boxplot

A

The graph that illustrates the five-number summary. A box is drawn from the lower quartile to the upper quartile, with a line inside it at the median. The whiskers extend from the quartiles to the minimum and maximum.

7
Q

Categorical Variable

A

A variable that records a group designation such as gender or type of vehicle.

8
Q

Causation

A

A relationship between two variables that goes a step further than correlation, stating that a change in the value of the x variable will cause a change in the value of the y variable.

9
Q

Consistency

A

This refers to how variable, or spread out, the values in a dataset are for a quantitative variable.

10
Q

Correlation Coefficient

A

A number that measures the degree to which two quantitative variables are associated, generally denoted by r.
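
As a rough illustration, r can be computed from paired data with the usual Pearson formula. The card does not name a formula, so the choice of Pearson's r is an assumption, and the function name and data below are made up for the example.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]        # hypothetical explanatory values
scores = [52, 60, 61, 70, 77]  # hypothetical response values
r = correlation(hours, scores)
print(round(r, 2))  # about 0.98: a strong positive association
```

Values of r near 1 or -1 indicate a strong linear association; values near 0 indicate a weak one.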

11
Q

Data

A

The numbers or categories recorded for the observational units in a study.

12
Q

Direction

A

One of the three aspects of association between quantitative variables, which refers to whether greater values of one variable tend to occur with greater values of the other variable (positive association) or with smaller values of the other variable (negative association).

13
Q

Distribution

A

The pattern of variation of a variable. For a categorical variable, the distribution consists of the variable’s possible categories and the proportion of responses in each.

14
Q

Dotplot

A

A display of the distribution of relatively small data sets where each data point is represented by a dot.

15
Q

Empirical Rule

A

With mound-shaped, symmetric distributions, approximately 68% of the observations fall within one standard deviation of the mean, approximately 95% of the observations fall within two standard deviations of the mean, and approximately 99.7% fall within three standard deviations of the mean.
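
The three percentages can be checked against a normal distribution, used here as an idealized mound-shaped, symmetric distribution. That choice is an assumption for the sketch, and the helper name is made up.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Real data that are only roughly mound-shaped will match these figures only approximately.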

16
Q

Extrapolation

A

An attempt to predict the response variable for values of the explanatory variable beyond those contained in the data.

17
Q

Five-number Summary

A

The minimum value, lower quartile, median, upper quartile, and maximum value.
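
A minimal sketch of computing the summary. Quartile conventions differ between textbooks and software; the "inclusive" method of Python's statistics.quantiles is used here as one reasonable choice, not necessarily the one this deck assumes. The data are made up.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # method="inclusive" is one of several quartile conventions.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```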

18
Q

Frequency or Count

A

The number of observational units in a subinterval.

19
Q

Histogram

A

A graphical display similar to a dotplot or stemplot, but more feasible for very large data sets. Bars are constructed whose heights correspond to the frequency in each subinterval.

20
Q

Influential (Observation)

A

When removing an observation from a data set substantially changes the least squares regression equation, the observation is considered influential. Typically, observations that have an extreme explanatory (x) value (far below or far above the sample mean) have the potential to be influential.

21
Q

Intercept Coefficient
(Y-Intercept)

A

The value of the response (y) variable predicted by the least squares regression line when the explanatory (x) variable has a value of 0.

22
Q

Interquartile Range

A

The difference between the upper quartile and lower quartile.

23
Q

Least Squares Regression Line

A

The line that achieves the exact minimum value of the sum of the squared residuals.
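
One way to find this line is with the standard closed-form formulas for the slope and intercept. The sketch below uses them on made-up data; the function and variable names are illustrative only.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = [1, 2, 3, 4, 5]  # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]  # hypothetical response values
intercept, slope = least_squares_line(xs, ys)
print(round(intercept, 2), round(slope, 2))  # 2.2 0.6
```

For these data the fitted line is approximately y = 2.2 + 0.6x.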

24
Q

Lower Quartile

A

The 25th percentile or the value such that 25% of the observations fall below it and 75% fall above it.

25
Q

Mean

A

The arithmetic average or balance point of a distribution.

26
Q

Median

A

The middle value in a distribution; often considered the “typical” value.

27
Q

Mode

A

The most commonly occurring value in a distribution.

28
Q

Modified Boxplot

A

A specialized boxplot that conveys additional information by treating outliers differently. On these graphs, you mark outliers using a special symbol and then extend the boxplot’s whiskers only to the most extreme non-outlier value.
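
The card does not say how outliers are identified. A common convention, assumed here, flags any value more than 1.5 times the interquartile range below the lower quartile or above the upper quartile. The sketch below applies that rule to made-up data; the quartile method is also an assumption.

```python
from statistics import quantiles

def outliers_and_whiskers(values):
    """Return (outliers, (lower whisker end, upper whisker end))."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed rule: fences sit 1.5 * IQR beyond the quartiles.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    inside = [v for v in data if low_fence <= v <= high_fence]
    # Whiskers stop at the most extreme non-outlier values.
    return outliers, (inside[0], inside[-1])

outliers, whiskers = outliers_and_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(outliers, whiskers)  # [30] (1, 8)
```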

29
Q

Mound Shaped Distribution

A

Single peak is at the center of the distribution.

30
Q

Observational Unit

A

The person or thing to which a variable’s number or category is assigned, such as a student in your class.

31
Q

Predictor Variable

A

The explanatory variable.

32
Q

Population

A

This refers to the entire group of people or objects (observational units) of interest.

33
Q

Proportion

A

A fraction between 0 and 1, possibly including 0 and 1.

34
Q

Quantitative Variable

A

A variable that measures a numerical characteristic such as height or weight.

35
Q

Relative Frequency or Proportion

A

The fraction of observational units in a subinterval.
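
A small sketch of how frequencies and relative frequencies might be tallied for equal-width subintervals. The bin boundaries, helper name, and data are all made up for illustration.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each equal-width bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # ignore values outside the binned range
            counts[index] += 1
    return counts

values = [1, 2, 2, 3, 5, 6, 6, 7, 9, 9]  # hypothetical data
counts = bin_counts(values, start=0, width=5, n_bins=2)  # bins [0, 5) and [5, 10)
relative = [c / len(values) for c in counts]
print(counts)    # [4, 6]
print(relative)  # [0.4, 0.6]
```

The relative frequencies sum to 1 when every value falls inside one of the bins.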

36
Q

Research Question

A

A question that looks for patterns in a variable or compares a variable across different groups or looks for a relationship between variables.

37
Q

Residual

A

The difference between the observed y-value and the y-value predicted by your line for the corresponding x-value. The residual indicates the vertical distance from an observation to the regression line.
Residual = observed - fitted
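
The formula can be applied directly once a line is given. In this sketch the line y = 1 + 2x and the data points are hypothetical.

```python
def residuals(xs, ys, intercept, slope):
    """Residual = observed y - fitted y, for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical line y = 1 + 2x and three made-up observations.
res = residuals([1, 2, 3], [3, 5, 8], intercept=1.0, slope=2.0)
print(res)  # [0.0, 0.0, 1.0]
```

A positive residual means the observation lies above the line; a negative residual means it lies below.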

38
Q

Resistant

A

When a measure’s value is relatively unaffected by the presence of outliers.

39
Q

Sample Size

A

The number of observational units studied in a sample.

40
Q

Scatterplot

A

The simplest graph for displaying two quantitative variables simultaneously, using the vertical axis for the response variable and the horizontal axis for the explanatory variable.

41
Q

Side-by-Side Stemplot

A

A stemplot that is used to compare two sets of data where a common set of stems is placed in the middle of the display with leaves out in either direction.

42
Q

Skewed Left

A

The tail of the distribution follows the smaller values towards the left.

43
Q

Skewed Right

A

The tail of the distribution follows the larger values towards the right.

44
Q

Slope Coefficient

A

The predicted change in the response (y) variable associated with a one-unit increase in the explanatory (x) variable when using the least squares regression line.

45
Q

Split Stemplot

A

This type of stemplot is used when there are too few stems and important details can be lost because the data points are clumped together. A split stemplot displays each stem twice, where the 0-4 leaves appear on the first stem and the 5-9 leaves appear on the second.

46
Q

Standard Deviation

A

A measure of the typical distance between a data value in a distribution and the mean of the sample.
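
A short sketch of the calculation, assuming the sample standard deviation (dividing by n - 1). The card does not say which divisor is intended, and the data are made up.

```python
from math import sqrt
from statistics import stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(values)
mean = sum(values) / n
# Squared distance of each value from the mean.
squared_deviations = [(v - mean) ** 2 for v in values]
# Sample standard deviation: divide by n - 1, then take the square root.
sample_sd = sqrt(sum(squared_deviations) / (n - 1))

library_sd = stdev(values)  # the standard library uses the same n - 1 divisor
print(round(sample_sd, 3), round(library_sd, 3))
```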

47
Q

Statistical Tendency

A

This refers to the observational units in one group being more likely to be in a certain category (for a categorical variable) or to have higher values (for a quantitative variable) than those in another group.

48
Q

Strength

A

One of the three aspects of the association between quantitative variables which indicates how closely the observations follow the relationship between the variables. In other words, the strength of the association reflects how accurately you could predict the value of one variable based on the value of the other variable.

49
Q

Sum of Squared Residuals

A

The sum of the squares of the residuals, used to determine the line of best fit. The line with the smallest sum is the line of best fit.
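
To illustrate, the sketch below compares the sum for two candidate lines on made-up data. The coefficients 2.2 and 0.6 are the least squares intercept and slope for this particular data set; the second line is an arbitrary alternative chosen for comparison.

```python
def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]  # hypothetical response values

best = sum_squared_residuals(xs, ys, intercept=2.2, slope=0.6)   # least squares line
other = sum_squared_residuals(xs, ys, intercept=2.0, slope=0.7)  # arbitrary alternative
print(round(best, 2), round(other, 2))  # 2.4 2.55
```

Any line other than the least squares line gives a larger sum for the same data.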

50
Q

Symmetric Distribution

A

Left side of a distribution is roughly a mirror of the right side.

51
Q

Upper Quartile

A

The 75th percentile or the value such that 75% of the observations fall below it and 25% fall above it.

52
Q

Variable

A

Any characteristic of a person or thing that can be assigned a number or category.

53
Q

Variability

A

The phenomenon of a variable taking on different values or categories from one observational unit to another.