statistics exam 2 Flashcards

1
Q

association

A

values of one variable tend to occur with certain values of another variable; detected when the conditional distributions differ from the marginal distribution and from each other.

2
Q

bias

A

a condition where the mean of the statistic values differs from the parameter that the statistic estimates

3
Q

bivariate data

A

data collected on two variables for each individual in a study.

4
Q

central limit theorem

A

the name of the statement telling us that the sampling distribution of x bar is approximately normal whenever the sample is large and random.
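
The statement above can be checked with a small simulation. This is only a sketch: the exponential population, the sample size of 50, the 5000 repetitions, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size n from a skewed population; return their means."""
    rng = random.Random(seed)
    # Assumed population: exponential with rate 1 (mean 1, standard deviation 1, right-skewed).
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

means = sample_means(n=50, reps=5000)
center = statistics.fmean(means)   # should land near the population mean, 1
spread = statistics.stdev(means)   # should land near sigma / sqrt(n) = 1 / sqrt(50)

# Share of sample means within one standard deviation of the center;
# a normal distribution puts roughly 68% of its values there.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 2), round(spread, 3), round(within_one_sd, 2))
```

Even though the population itself is strongly skewed, the sample means pile up symmetrically around the population mean.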

5
Q

conditional distribution

A

the distribution of the values in a single row (or a single column) of a two-way table.

6
Q

control chart

A

a statistical tool for monitoring the input or output of a process

7
Q

control limits

A

μ - 3σ/√n and μ + 3σ/√n; used to detect out-of-control signals in a control chart.
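
A minimal sketch of the control-limit arithmetic, assuming the process mean and standard deviation are known. The function name `control_limits` and the numbers passed to it are made up for illustration.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, upper) limits: mu -/+ 3 * sigma / sqrt(n)."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width

# Hypothetical process: mean 100, standard deviation 8, samples of size 16.
lower, upper = control_limits(mu=100.0, sigma=8.0, n=16)
print(lower, upper)  # 94.0 106.0
```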

8
Q

correlation coefficient

A

a measure of the strength and direction of the linear relationship between two quantitative variables.
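
One common way to compute this measure is to average the products of the z-scores of the two variables, dividing by n - 1. The sketch below assumes that formula; the function name `correlation` and the data are invented for illustration.

```python
import statistics

def correlation(xs, ys):
    """Correlation coefficient r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)  # sample standard deviations
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Made-up data with a moderately strong positive linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

The result is always between -1 and 1, and its sign matches the direction of the association.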

9
Q

disjoint events

A

events that cannot occur simultaneously

10
Q

distribution of a variable

A

a list of the possible values of a variable together with the frequency of each value (probabilities can be given instead of frequencies)

11
Q

event

A

a single outcome or a combination of outcomes from a random phenomenon

12
Q

extrapolation

A

predicting a Y value using a value of X that is outside of the range of X values used to obtain the regression equation. This prediction could be very far off.

13
Q

inference

A

using the value of a statistic computed from a sample to draw conclusions about the population parameter.

14
Q

influential observation

A

an observation that substantially alters the values of slope and y intercept in the regression equation when it is included in the computations.

15
Q

law of large numbers

A

the fact that the average (x bar) of observed values in a sample gets closer and closer to μ as the sample size increases.
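
A quick simulation can illustrate this. The sketch below assumes a fair six-sided die (population mean 3.5) and a fixed random seed; the helper `running_mean` is a name made up here.

```python
import random

rng = random.Random(1)
# Simulated rolls of a fair die; the population mean is 3.5.
rolls = [rng.randint(1, 6) for _ in range(100_000)]

def running_mean(values, k):
    """Average of the first k observations."""
    return sum(values[:k]) / k

for k in (10, 1_000, 100_000):
    print(k, round(running_mean(rolls, k), 3))
```

The average of the first few rolls can be well away from 3.5, while the average over all 100,000 rolls should sit very close to it.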

16
Q

laws of probability

A

the basis for hypothesis testing and confidence interval estimation

17
Q

least squares

A

a method for finding the equation of a line that minimizes the sum of squared residuals.
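
For a single explanatory variable, the minimizing line has a closed form: slope = Sxy / Sxx and intercept = y bar - slope * x bar. The sketch below assumes that standard result; the function names and the data are invented for illustration.

```python
import statistics

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line for paired data."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed y - predicted y) squared for a candidate line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
# Any other line, such as one with a slightly steeper slope, has a larger sum.
other = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
print(round(slope, 2), round(intercept, 2), round(best, 2), round(other, 2))
```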

18
Q

least squares regression line

A

the line with the smallest sum of squared residuals

19
Q

lurking variable

A

a variable that is not measured but explains association between two variables that are measured.

20
Q

marginal distribution

A

the distribution of the values in the “total” row (or the “total” column) of a two-way table
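
A small sketch of how marginal and conditional distributions come out of a two-way table. The table, its labels, and the helper `proportions` are invented for illustration.

```python
# Made-up two-way table: rows are class year, columns are the answer to a yes/no question.
table = {
    "freshman": {"yes": 30, "no": 20},
    "senior": {"yes": 15, "no": 35},
}

def proportions(counts):
    """Convert a dict of counts into a dict of proportions."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

# Marginal distribution of the column variable: built from the "total" row.
column_totals = {col: sum(row[col] for row in table.values()) for col in ("yes", "no")}
marginal = proportions(column_totals)

# Conditional distributions: one per row, each using only that row's counts.
conditional = {name: proportions(row) for name, row in table.items()}

print(marginal)
print(conditional)
```

Here the two conditional distributions differ from each other and from the marginal distribution, which is how an association between the two variables shows up in the table.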

21
Q

mean of the sampling distribution of x bar

A

the mean of all the sample means (x bars) from all possible samples of size n from a population; equals μ

22
Q

μ (mu)

A

the mean of the population

23
Q

no association

A

a condition where values of one variable occur independently of values of another variable; detected when the conditional distributions of a two-way table equal the marginal distribution (and each other)

24
Q

out-of-control process

A

a process showing one sample mean more than three standard deviations of x bar from the center line, or 9 sample means in a row on the same side of the center line.
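
The two signals can be checked mechanically. The sketch below assumes the process mean and standard deviation are known; the function name, the return format, and the example numbers are all made up for illustration.

```python
import math

def out_of_control_signals(sample_means, mu, sigma, n, run_length=9):
    """Return a list of (index, reason) pairs for each out-of-control signal."""
    half_width = 3 * sigma / math.sqrt(n)  # three standard deviations of x bar
    signals = []
    run_side, run_count = 0, 0
    for i, mean in enumerate(sample_means):
        # Signal 1: a single sample mean outside the control limits.
        if abs(mean - mu) > half_width:
            signals.append((i, "outside control limits"))
        # Signal 2: a run of sample means on one side of the center line.
        side = (mean > mu) - (mean < mu)  # +1 above, -1 below, 0 exactly on the line
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side, run_count = side, (1 if side != 0 else 0)
        if run_count == run_length:  # flag once, when the run first reaches the length
            signals.append((i, "run on one side of center line"))
    return signals

# Hypothetical process: mean 100, standard deviation 8, samples of size 16.
print(out_of_control_signals([100.5] * 9, mu=100, sigma=8, n=16))
```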

25
Q

outlier

A

an observation that falls outside the overall pattern of the data set

26
Q

parameter

A

a characteristic of a population that is usually unknown; this could be mean, median, proportion, standard deviation computed on all the data from the population; a parameter does not have variability

27
Q

parameter symbols

A

μ, σ, and p (mean of a population, standard deviation of a population, proportion of a population)

28
Q

positive association

A

high values of one variable tend to associate with high values of another variable.

29
Q

probability of an outcome

A

a measure of the proportion of times an outcome occurs in a very long series of repetitions that gives us an indication of the likelihood of the outcome.

30
Q

process

A

sequence of operations used in production, manufacturing, etc.

31
Q

process in statistical control

A

a process whose inputs and outputs exhibit only natural variation when observed over time

32
Q

quality control chart

A

a chart plotting the means, x bar, of regular samples of size n against time; this chart is used to assess whether the process is in control.

33
Q

quantitative bivariate

A

the type of data required for regression analysis

34
Q

r

A

the symbol for correlation coefficient

35
Q

r squared

A

the percentage of total variation in the response variable, y, that is explained by the regression equation; in other words, the percentage of total variation in the response variable, y, that is explained by the explanatory variable, X.
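
One way to compute this quantity is 1 - SSE / SST, where SSE is the sum of squared residuals and SST is the total variation in y. The sketch below assumes that formula; the function names and data are invented for illustration.

```python
import statistics

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def r_squared(xs, ys):
    """Proportion of the total variation in y explained by the regression on x."""
    slope, intercept = fit_line(xs, ys)
    y_bar = statistics.fmean(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                                # total
    return 1 - sse / sst

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 2))  # 0.6
```

For this data, 60% of the variation in y is explained by x, which matches the square of the correlation coefficient (about 0.7746).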

36
Q

random

A

a phenomenon in which individual outcomes are uncertain but the outcomes follow a regular distribution in the long run.

37
Q

regression equation

A

a formula for a line that models a linear relationship between two quantitative variables

38
Q

residual

A

the observed y minus the predicted y; denoted y-yhat

39
Q

residual plot

A

a diagnostic plot of the residuals versus the explanatory variable, used to assess how well the regression line fits the data; complete scatter in a shoebox pattern is good, whereas a megaphone pattern denotes unequal variance in the Y's across the levels of X, and curvature in the form of a smile or a frown denotes that the linear model is not the best model for that data.

40
Q

sample mean, x bar

A

the random variable of the sampling distribution of x bar

41
Q

sample space

A

the list of all possible outcomes of a random phenomenon

42
Q

sampling distribution

A

a distribution of a statistic; a list of all the possible values of a statistic together with the frequency (or probability) of each value

43
Q

sampling distribution of x bar

A

a list of all the possible values for x bar together with the frequency (or probability) of each value; in other words, the distribution of all x bar’s from all possible samples

44
Q

sampling variability

A

the variability of sample results from one sample to the next; something we must measure in order to effectively do inference

45
Q

scatterplot

A

a two dimensional plot used to examine strength of relationship between two variables as well as direction and type of relationship.

46
Q

Simpson’s paradox

A

a condition where the percentages reverse when a third (lurking) variable is ignored; in other words, a condition leading to misinterpretation of the direction of association between two variables caused by ignoring a third variable that is associated with both of the reported variables.
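
The reversal is easiest to see with numbers. The counts below are invented purely for illustration: two treatments, A and B, each applied to "easy" and "hard" cases, recorded as (successes, total).

```python
# Invented counts: (successes, total) for each treatment within each case type.
counts = {
    "A": {"easy": (90, 100), "hard": (300, 1000)},
    "B": {"easy": (800, 1000), "hard": (20, 100)},
}

def rate(successes, total):
    """Success proportion."""
    return successes / total

# Success rates within each level of the third variable (case type).
within = {
    group: {t: rate(*counts[t][group]) for t in counts}
    for group in ("easy", "hard")
}

# Success rates when the third variable is ignored and the groups are pooled.
overall = {
    t: rate(sum(s for s, _ in counts[t].values()),
            sum(n for _, n in counts[t].values()))
    for t in counts
}

print(within)   # A beats B among easy cases and among hard cases
print(overall)  # yet B beats A once the case types are pooled
```

Treatment A was given mostly hard cases and B mostly easy ones, so case type is associated with both the treatment and the outcome, which is what drives the reversal.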

47
Q

simulation

A

using random numbers to imitate chance behavior

48
Q

slope

A

a measure of the average change in the response variable for every one unit increase in the explanatory or independent variable

49
Q

standard deviation (s)

A

a measure of the variability of data in a sample about x bar.

50
Q

standard deviation of x bar, also called the standard deviation of the sampling distribution of x bar

A

a measure of the variability of the values of the statistic x bar about μ; a measure of the variability of the sampling distribution of x bar; in other words, the “average” amount that the statistic, x bar, deviates from its associated parameter; computed as σ/√n
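
A short sketch of the formula, with a made-up function name and made-up numbers, showing how the value shrinks as the sample size grows.

```python
import math

def sd_of_x_bar(sigma, n):
    """Standard deviation of the sampling distribution of x bar: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 8.
print(sd_of_x_bar(8.0, 16))  # 2.0
print(sd_of_x_bar(8.0, 64))  # 1.0
```

Quadrupling the sample size cuts the standard deviation of x bar in half.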

51
Q

statistic

A

a number computed from sample data (without any knowledge of the value of a parameter) used to estimate the value of the parameter.

52
Q

statistic symbols

A

x bar, s, p hat (mean of sample, standard deviation of sample, proportion of sample)

53
Q

statistical process control

A

a procedure used to check a process at regular intervals to detect problems and correct them before they become serious.

54
Q

sum of squared residuals (or error)

A

the total obtained by squaring each residual and adding the results; denoted SSE.

55
Q

total variation in Y

A

the sum of the squared deviations of the Y observations about their mean, y bar

56
Q

two-way table

A

a table containing counts for two categorical variables. It has r rows and c columns

57
Q

unbiased

A

a condition where the mean of the statistic values equals the parameter that the statistic estimates

58
Q

unexplained variation

A

the sum of squared residuals

59
Q

X

A

the symbol for explanatory variable

60
Q

x bar chart

A

a plot of sample means over time used to assess whether a process is in control

61
Q

Y

A

the symbol for response variable

62
Q

y hat

A

the symbol for predicted y

63
Q

z-score

A

a measure of the number of standard deviations of a value or observation from the mean.
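
As a formula, this is z = (value - mean) / standard deviation. A minimal sketch with made-up numbers and a made-up function name:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
print(z_score(85, 70, 10))  # 1.5
print(z_score(60, 70, 10))  # -1.0
```

A positive z-score means the value is above the mean, and a negative one means it is below.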