Exam 1 (Modules 1-3) Flashcards

1
Q

What should be avoided in constructing “good” graphs?

A

excess white space, clutter on the graph, and 3D effects

2
Q

Determine the five-number summary

A

(put data set in ascending order): minimum, Q1, Median (Q2), Q3, maximum

3
Q

Define “statistics”

A

The science of collecting, organizing, summarizing, and analyzing information in order to describe and understand sources of variation in data.

4
Q

Define “the lurking variable”

A

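A variable that was not considered in the study but affects the variables being studied, so an observed association may not be a causal one. “Correlation does not equal causation!”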

5
Q

Define “statistic”

A

numerical summary of a SAMPLE (Roman letters)

6
Q

Define “descriptive statistics”

A

organizing and summarizing data (numerical summaries, graphs, tables)

7
Q

Define “inferential statistics”

A

take result from a sample, extend it to the population, and measure the reliability of the result

8
Q

Define “parameter”

A

numerical summary of a POPULATION (Greek letters)

9
Q

Discrete variable

A

quantitative variable that has either a finite number of possible values OR a countable number of possible values. *Count to get the value. EX: number of pets, number of college credits, number of seats in an auditorium

10
Q

Continuous variable

A

quantitative variable that has an infinite number of possible values that are not countable. *Measure to get the value. EX: distance, total rainfall, age, data use on a cell phone per month

11
Q

The Process of Statistics

A

1) Identify the research objective (what questions need to be answered?). 2) Formulate the research question (with at least 1 variable). 3) Collect the data needed to answer the question(s). 4) Describe the data. 5) Perform inference.

12
Q

Define “statistical thinking”

A

using statistics to analyze and critique information you come across, in order to be an informed consumer of information

13
Q

Qualitative variable

A

contains a classification system for its variable values. May be text or numeric. EX: gender, zip code, nationality, phone number, numbers on team shirts

14
Q

Quantitative variable

A

the variable values are a numerical range that can be added or subtracted to provide meaningful results. Equal interval magnitude scale. Can be discrete OR continuous. EX: height, weight

15
Q

Frequency distribution

A

lists each category of data and the number of occurrences in each category of data. Frequency column = number of observations.

16
Q

Relative frequency

A

proportion/percent of observations within a category. RF = frequency / number of observations
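
A minimal Python sketch of a frequency distribution and its relative frequencies. The blood-type data are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical sample data: blood types of 10 donors.
data = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "B"]

frequency = Counter(data)        # category -> number of occurrences
n = sum(frequency.values())      # total number of observations

# Relative frequency = frequency / number of observations.
relative_frequency = {cat: count / n for cat, count in frequency.items()}

# Print categories from most to least frequent.
for cat, count in frequency.most_common():
    print(f"{cat:>2}  frequency={count}  relative frequency={relative_frequency[cat]:.2f}")
```

The relative frequencies always sum to 1 (or 100%), which is a quick check on the table.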

17
Q

Pareto chart

A

bar graph whose bars are drawn in decreasing order of frequency or relative frequency

18
Q

Classes

A

categories into which data are grouped (e.g., 25-34, 35-44). Class width = difference between consecutive lower class limits.

19
Q

Class width value (CWV)

A

CWV = (largest data value - smallest data value) / number of classes, rounded up. The number of classes is usually between 5 and 20.
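
A small Python sketch of the class width calculation. Rounding up to the next whole number is assumed here as the usual convention; the `ages` list and the function name `class_width` are hypothetical.

```python
import math

def class_width(data, num_classes):
    """Guideline class width: (largest - smallest) / number of classes, rounded up."""
    return math.ceil((max(data) - min(data)) / num_classes)

ages = [25, 31, 38, 44, 47, 52, 58, 63]   # hypothetical data
width = class_width(ages, 5)              # (63 - 25) / 5 = 7.6, rounded up
print(width)
```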

20
Q

Describe what can make a graph misleading or deceptive

A

scale of the graph, inconsistent scale, misplaced origin (aka not starting at 0), use of 3D effects

21
Q

What makes a “good” graph?

A

Not too much white space, avoid “prettifying,” avoid 3D effects

22
Q

3 characteristics of distribution

A

shape (bell-shaped, skewed), center (average value), spread (how far data goes from average value)

23
Q

Population arithmetic mean

A

($\mu$, "mu"; $N$ = size of population). $\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N}$

24
Q

Sample arithmetic mean

A

($\bar{x}$, "x-bar"; $n$ = size of sample). $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$

25
Q

Median (“typical value”)

A

(n = number of observations). Put the data in ascending order. If n is odd, M is the value in position (n + 1) / 2 ||| If n is even, M is the mean of the values in positions n/2 and (n/2) + 1
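
The expressions (n + 1) / 2, n/2, and (n/2) + 1 give positions in the sorted data, not the median's value. A minimal Python sketch of that lookup follows; the function name `median` is only illustrative.

```python
def median(values):
    """Median via positions in the sorted data (positions are 1-based in the card)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # n even: mean of 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 3]))      # odd n
print(median([4, 1, 3, 2]))   # even n
```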

26
Q

Resistant

A

numerical summary of data is resistant if extreme values (very large or very small) relative to the data do not affect its value substantially. EX: median, quartiles, IQR = resistant, mean = NOT resistant

27
Q

Mode

A

most frequent observation of the variable in the data set. A data set can have no mode, two modes (“bimodal”), or more than two (“multimodal,” not usually reported). Only measure of central tendency that can be determined for nominal data (e.g., location of injuries)

28
Q

Define “dispersion”

A

the degree to which the data are spread out

29
Q

Range

A

the difference between the largest and the smallest data value. Simplest measure of dispersion. NOT RESISTANT

30
Q

Standard deviation

A

Typical spread from the mean. The farther an observation is from the mean, the larger the absolute value of its deviation. $x_i - \mu$ = deviation about the mean for the $i$th value of a population. $x_i - \bar{x}$ = deviation about the mean for the $i$th value of a sample.

31
Q

Population standard deviation

A

$\sigma = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}}$

32
Q

Sample standard deviation

A

$s = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}}$
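
A short Python sketch of the two computational formulas (population: divide by N; sample: divide by n - 1). The data set and the function names are hypothetical; the results can be cross-checked against `statistics.pstdev` and `statistics.stdev` from the standard library.

```python
import math
import statistics

def population_sd(xs):
    """sqrt( (sum of x^2 - (sum of x)^2 / N) / N )"""
    n = len(xs)
    return math.sqrt((sum(x * x for x in xs) - sum(xs) ** 2 / n) / n)

def sample_sd(xs):
    """sqrt( (sum of x^2 - (sum of x)^2 / n) / (n - 1) )"""
    n = len(xs)
    return math.sqrt((sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
print(population_sd(data), statistics.pstdev(data))
print(sample_sd(data), statistics.stdev(data))
```

The sample version is always slightly larger than the population version for the same data, because it divides by n - 1 instead of n.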

33
Q

Variance

A

SQUARE of the standard deviation (so the answer BEFORE you take the square root for the standard deviation formula)

34
Q

the Empirical Rule

A

*bell-shaped distributions only. About 68% of the observations lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations.

35
Q

Chebyshev’s Inequality

A

*any shape of distribution. AT LEAST $\left(1 - \frac{1}{k^2}\right) \times 100\%$ of the observations lie within $k$ standard deviations of the mean, where $k > 1$.
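
A minimal Python sketch that evaluates the Chebyshev lower bound for a given k. The function name `chebyshev_lower_bound` is hypothetical.

```python
def chebyshev_lower_bound(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Inequality requires k > 1")
    return (1 - 1 / k**2) * 100

for k in (2, 3):
    print(k, chebyshev_lower_bound(k))
```

For k = 2 the bound is 75%, noticeably lower than the roughly 95% the Empirical Rule gives for bell-shaped data, because Chebyshev's Inequality has to hold for every distribution shape.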

36
Q

The variance of a population is the arithmetic average of the squared deviations about the population mean. (T/F)

A

TRUE

37
Q

z-score

A

represents the distance that a data value is from the mean in terms of the number of standard deviations. Can be positive, negative, or zero. Provides a way to compare apples to oranges.

38
Q

z-score formulas (population & sample)

A

population: $z = \dfrac{x - \mu}{\sigma}$ ||| sample: $z = \dfrac{x - \bar{x}}{s}$
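
A small Python sketch of the "apples to oranges" comparison. The two exams and their means and standard deviations are made-up numbers for illustration only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical exams on different scales.
z_exam_a = z_score(1300, mean=1000, sd=200)   # score on exam A
z_exam_b = z_score(27, mean=21, sd=5)         # score on exam B

print(z_exam_a, z_exam_b)
```

The exam with the larger z-score is the better result relative to its own distribution, even though the raw scores cannot be compared directly.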

39
Q

five-number summary

A

minimum, Q1, M (Q2), Q3, maximum

40
Q

find lower and upper fences

A

lower = Q1 - 1.5(IQR) ||| upper = Q3 + 1.5(IQR)
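
A Python sketch of the five-number summary and the fences. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" method (the median is left out of both halves when n is odd), and the data and function names are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]               # values below the median position
    upper = ordered[half + (n % 2):]     # values above it (median skipped if n is odd)
    return (ordered[0], statistics.median(lower), statistics.median(ordered),
            statistics.median(upper), ordered[-1])

def fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 6, 7, 8, 10, 12, 40]   # hypothetical data
minimum, q1, med, q3, maximum = five_number_summary(data)
lower_fence, upper_fence = fences(q1, q3)
outside = [x for x in data if x < lower_fence or x > upper_fence]
print(minimum, q1, med, q3, maximum, lower_fence, upper_fence, outside)
```

Values that fall outside the fences are commonly flagged as outliers.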

41
Q

Which variable has more dispersion? Why?

A

Variable y, because the interquartile range of variable y is larger than that of variable x.