Summer Vocab Flashcards

1
Q

Statistics

A

The study of variability.

2
Q

Variability

A

Differences. How things differ.

3
Q

Two branches of statistics

A

Inferential and descriptive

4
Q

Descriptive stats

A

Description of the data collected.

5
Q

Inferential Stats

A

Inferences made about the data collected. What conclusions can be made about the entire population based on the data? A small sample can tell a lot about the larger population.

6
Q

Descriptive v. Inferential Stats

A

Descriptive: you are describing the data set, what we know about it. Inferential: you are drawing conclusions about the population based on the data set.

7
Q

Data

A

Any collected information.

8
Q

Population

A

The group you’re interested in.

9
Q

Sample

A

A subset of a population. Taken to make inferences about the population. We take statistics from samples.

10
Q

Data v. Stats

A

Data are the little bits of information collected from the subjects. They are the INDIVIDUAL things we collect. We summarize them by, for example, finding the mean of a group of data. If the data come from a sample, we call that mean a “statistic.” If we have data from every member of the population, that mean is called a “parameter.”

11
Q

Parameter

A

A numerical summary of a population.

12
Q

Statistic

A

A numerical summary of a sample.

13
Q

We are curious about the average wait time at a Dunkin Donuts drive through in your neighborhood. You randomly sample cars one afternoon and find the average wait time is 3.2 minutes. What is the population parameter? What is the statistic? What is the parameter of interest? What is the data?

A

The parameter is the true average wait time at that Dunkin Donuts. This is a number you don’t have and will never know. The statistic is “3.2 minutes.” It is the average of the data you collected. The parameter of interest is the same thing as the population parameter. In this case, it is the true average wait time of all cars. The data are the wait times of the individual cars, something like “3.8 min, 2.2 min, 0.8 min, 3 min.” You take that data and find the average; that average is called a “statistic,” and you use it to make an inference about the true parameter.
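As a rough sketch of "take the data and find the average," the snippet below averages only the four example wait times listed on this card. That is an assumption made for illustration: the 3.2 minutes in the question comes from the full sample, which is not shown, so the number here will differ.

```python
from statistics import mean

# The four example wait times from the card, in minutes.
# This is only an illustration, not the full sample behind the 3.2 figure.
wait_times = [3.8, 2.2, 0.8, 3.0]

# The mean of a sample is a statistic: it summarizes the data we collected.
sample_mean = mean(wait_times)
print(round(sample_mean, 2))
```

The true average wait time of all cars (the parameter) never appears in the code, because it is the number we cannot observe.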

14
Q

Census

A

Data collected from an entire population: you get information from every member of the population. Makes sense for smaller populations.

15
Q

Random Variables

A

If you randomly choose people from a list, then their hair color, height, weight, and any other data collected from them can be considered random variables.

16
Q

Quantitative variables

A

Numerical measures like IQ and height.

17
Q

Categorical variables

A

Categories like eye color and music preference.

18
Q

Discrete Variables

A

Can be counted, like “number of cars sold.” They are generally integers.

19
Q

Continuous Variable

A

Can take any value in a range, like the weight of a mouse (4.344 oz.).

20
Q

Random Sample

A

When you choose a sample by rolling dice, drawing names from a hat, or some other TRULY RANDOM method. Humans can’t really do this well without the help of a calculator, cards, dice, or slips of paper.
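A computer can play the role of the hat. The sketch below uses Python's standard `random` module; the roster of names is hypothetical and made up for illustration, and the seed is there only so the example gives the same draw each time it runs.

```python
import random

# Hypothetical roster; these names are invented for the example.
roster = ["Ana", "Ben", "Cam", "Dee", "Eli", "Fay", "Gus", "Hal"]

# Seeded only so the example is repeatable; drop the seed for a fresh draw.
rng = random.Random(42)

# Draw 3 names without replacement, like pulling slips of paper from a hat.
chosen = rng.sample(roster, 3)
print(chosen)
```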

21
Q

Frequency

A

How often something comes up.

22
Q

A Frequency Distribution

A

A table, or a chart, that shows how often certain values or categories occur in a data set.

23
Q

Relative Frequency

A

The PERCENT (or proportion) of the time something comes up.

24
Q

How do you find relative frequency?

A

Divide the frequency by the total.
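Here is a small sketch of that division, assuming a made-up list of eye colors (the data are hypothetical, chosen only to keep the arithmetic easy to check by hand).

```python
from collections import Counter

# Hypothetical eye-color data, made up for illustration.
eye_colors = ["brown", "blue", "brown", "green",
              "brown", "blue", "brown", "blue"]

# Frequency: how often each category comes up.
frequency = Counter(eye_colors)

# Relative frequency: each frequency divided by the total count.
total = len(eye_colors)
relative_frequency = {color: count / total
                      for color, count in frequency.items()}
print(relative_frequency)
```

Multiply each value by 100 to read it as a percent.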

25
Q

Cumulative frequency

A

Add up the frequencies as you go. Suppose you are selling 25 pieces of candy. You sell 10 the first hour, 5 the second, 3 the third, and 7 in the last hour. The cumulative frequencies would be 10, 15, 18, 25.
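The running total in the candy example can be sketched with `itertools.accumulate` from the standard library. The variable names are just illustrative.

```python
from itertools import accumulate

# Candy sold in each of the four hours, from the card's example.
hourly_sales = [10, 5, 3, 7]

# Running total: each entry is the sum of all sales up to that hour.
cumulative = list(accumulate(hourly_sales))
print(cumulative)  # [10, 15, 18, 25]
```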

26
Q

Relative cumulative frequency

A

It is the ADDED UP PERCENTAGES. An example is selling candy, 25 pieces sold overall, with 10 the first hour, 5 the second, 3 the third, and 7 the fourth hour. We’d take the cumulative frequencies, 10, 15, 18, and 25, and divide each by the total, giving the cumulative percentages .40, .60, .72, and 1.00. Relative cumulative frequencies always end at 100 percent.
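To check the arithmetic, here is a short sketch that divides each running total by the overall total of 25 pieces. The variable names are illustrative only.

```python
from itertools import accumulate

# Candy sold in each of the four hours, from the card's example.
hourly_sales = [10, 5, 3, 7]
total = sum(hourly_sales)  # 25 pieces overall

# Cumulative frequencies, then each one divided by the total.
cumulative = list(accumulate(hourly_sales))
relative_cumulative = [c / total for c in cumulative]
print(relative_cumulative)  # [0.4, 0.6, 0.72, 1.0]
```

The last value is 1.0 because by the final hour every piece has been counted.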

27
Q

Bar charts v histograms

A

Bar charts are for categorical data (bars don’t touch), and histograms are for quantitative data (bars touch).

28
Q

Mean

A

The familiar average we have always calculated: add up the values and divide by how many there are. It is the balancing point of the histogram.

29
Q

Population mean v. sample mean

A

The population mean is the mean of a population, so it is a parameter. The sample mean is the mean of a sample, so it is a statistic. We use sample statistics to make inferences about population parameters.

30
Q

Symbols for pop and sample mean

A

Mu (μ) for the population mean, x-bar (x̄) for the sample mean.

31
Q

Histogram: mean and median

A

The mean is the balancing point of the histogram; the median splits the area of the histogram in half.

32
Q

Median

A

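The middle number of the sorted data; it splits the area in half. It is always in POSITION (n+1)/2 of the ordered list. When n is even, that position falls between two values, so average them.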
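The position rule can be sketched as a small function. `median_by_position` is a hypothetical helper written for this card, and the example lists are made up; the standard library's `statistics.median` is used only as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Return the value at position (n+1)/2 of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n+1)/2 counting from 1 is index n//2 from 0.
        return ordered[mid]
    # Even n: (n+1)/2 falls between two values, so average them.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [7, 1, 5]       # sorted: 1, 5, 7 -> position 2
even_data = [4, 1, 3, 2]   # sorted: 1, 2, 3, 4 -> position 2.5

print(median_by_position(odd_data), median(odd_data))
print(median_by_position(even_data), median(even_data))
```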

33
Q

Mode

A

The most common value, or the peaks of a histogram. We often use the mode with categorical data.

34
Q

Why don’t we always use the mean? We’ve been calculating it all our lives.

A

It is not RESISTANT: it is affected by skewness and outliers.
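A quick way to see this is to add one extreme value to a small data set and compare how far the mean and the median move. The numbers below are hypothetical, picked only to make the effect obvious.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one outlier added.
values = [2, 3, 3, 4, 5]
with_outlier = values + [50]

# Without the outlier the mean and median are close together.
print(mean(values), median(values))

# With the outlier the mean jumps, while the median barely moves.
print(mean(with_outlier), median(with_outlier))
```

In this sketch the mean rises from 3.4 to about 11.2, while the median only shifts from 3 to 3.5.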