Chapter 1 Flashcards

Question 1

Q

Cases

Answer

A

The objects described by a set of data.

Ex. Customers, companies, subjects in a study, stocks

Question 2

Q

Label

Answer

A

Is a SPECIAL VARIABLE used in some data sets to distinguish the different cases

Question 3

Q

Variable

Answer

A

Is a characteristic of the case–> different cases can have different values for variables

Question 4

Q

Observation

Answer

A

Describes the data for a particular case

Question 5

Q

Categorical Variable

Answer

A

Places a case into one of several groups or categories

Ex. Displayed with bar graphs, pie charts, and Pareto charts

Question 6

Q

Quantitative Variable

Answer

A

Takes numerical values for which arithmetic operations, such as adding and averaging, make sense

Question 7

Q

Statistical Software

Answer

A

In some statistical software, spaces are not allowed in variable names–> use an underscore instead

Question 8

Q

Ordered Categorical Variable

Answer

A

A categorical variable whose categories have a natural order
Ex. Possible values for a grade: A, B, C, D, etc., because A is better than B, which is better than C, and so on

Question 9

Q

Nominal Variable

Answer

A

A categorical variable that is not ordered

Question 10

Q

Instruments

Answer

A

Different areas of application (e.g., marketing) can also have their own special variables–> these variables are measured with instruments

Question 11

Q

Rate

Answer

A

Computing a rate is one of several ways of adjusting one variable to create another–> sometimes more meaningful than a count
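
A minimal sketch of turning a count into a rate. The function name `rate_per` and the store figures are made up for illustration; they are not from the chapter.

```python
def rate_per(count, base, per=1000):
    """Express a count relative to the size of the group it came from."""
    return count * per / base

# Hypothetical data: complaints and customers for two stores.
store_a = rate_per(30, 2000)   # 30 complaints among 2,000 customers
store_b = rate_per(45, 9000)   # 45 complaints among 9,000 customers

# Store B has the larger count, but store A has the larger rate per 1,000 customers.
print(store_a, store_b)
```

Comparing the two rates instead of the two raw counts adjusts for the stores having different numbers of customers.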

Question 12

Q

Distribution

Answer

A

Describes how the values of a variable vary from case to case

Question 13

Q

Pareto Chart

Answer

A

A bar graph whose categories are ordered from MOST frequent to least frequent–> identifies the most important categories for a categorical variable
Ex. Frequently used in quality control settings
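
The ordering step can be sketched in a few lines. The defect labels below are made-up data, and this only computes the bar order; it does not draw the chart.

```python
from collections import Counter

# Hypothetical defect records from a quality control log (made-up data).
defects = ["scratch", "dent", "scratch", "misprint", "scratch", "dent"]

# most_common() sorts categories from most frequent to least frequent,
# which is the bar order a Pareto chart uses.
pareto_order = Counter(defects).most_common()
print(pareto_order)
```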

Question 14

Q

Histogram

Answer

A

The most common graph of the distribution of a quantitative variable, where we group nearby values into classes–> for small data sets a stemplot can be used

Question 15

Q

How can you describe the overall pattern of a histogram?

Answer

A

You can describe the overall pattern of a histogram by its SHAPE, CENTER, and SPREAD

Question 16

Q

Outlier

Answer

A

The most important type of deviation–> an individual value that falls outside the overall pattern

Question 17

Q

When is a distribution symmetric?

Answer

A

If the right and left sides of the histogram are mirror images of each other

Question 18

Q

Skewed to the right

Answer

A

If the right side of the histogram extends much farther out than the left side, and vice versa for skewed to the left

Question 19

Q

Positively skewed

Answer

A

Data that skews to the right–> positive skewness is the MOST common type of skewness that we see in real data

Question 20

Q

Time plot

Answer

A

Plots each observation against the time it was measured–> time on the horizontal scale and the variable you are measuring on the vertical scale

Question 21

Q

Mean

Answer

A

The most common measure of center is the ordinary arithmetic average–> NOT a resistant measure of center as it can be influenced by outliers

Question 22

Q

Median

Answer

A

The median is the midpoint of a distribution, the number such that half the observations are smaller and half are larger

Question 23

Q

Median Odd

Answer

A

(N+1)/2 observations up from the bottom of the list

Question 24

Q

Median Even

Answer

A

It is the mean of the two numbers in the middle
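
A minimal sketch of the odd and even rules together, assuming a non-empty list of numbers. The function is written out by hand for illustration; Python's `statistics.median` follows the same rule.

```python
def median(values):
    ordered = sorted(values)   # the rules apply to the ordered list
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counting from 1 is index n // 2 counting from 0.
        return ordered[middle]
    # Even n: the mean of the two numbers in the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_result = median([7, 1, 5])       # ordered: 1, 5, 7
even_result = median([7, 1, 5, 3])   # ordered: 1, 3, 5, 7
print(odd_result, even_result)
```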

Question 25

Q

Median vs Mean

Answer

A

The median is more resistant than the mean
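
A small illustration of resistance, using made-up numbers: replacing one value with an outlier moves the mean a long way and leaves the median where it was.

```python
from statistics import mean, median

# Made-up data for illustration only.
typical = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]   # same list, but the largest value is an outlier

mean_shift = mean(with_outlier) - mean(typical)         # mean moves from 3 to 22
median_shift = median(with_outlier) - median(typical)   # median stays at 3
print(mean_shift, median_shift)
```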

Question 26

Q

Median and Mean in a Symmetric Distribution

Answer

A

They are close together–> if the distribution is exactly symmetric, they are exactly the same

Question 27

Q

Median and Mean in a skewed distribution

Answer

A

The mean is farther out on the long tail than the median

Question 28

Q

The five number summary

Answer

A

Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation–> written in order from smallest to largest. A boxplot is a graph of the five number summary
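
A sketch of the five number summary, assuming at least two observations. The quartiles here are the medians of the lower and upper halves of the ordered list, which is one common textbook rule; statistical software may use a slightly different convention. The data are made up.

```python
from statistics import median

def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations below the median's position
    upper = ordered[half + (n % 2):]    # observations above the median's position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical data set with seven observations.
summary = five_number_summary([2, 4, 6, 8, 10, 12, 14])
print(summary)
```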

Question 29

Q

The five number summary vs. distribution

Answer

A

Not the most common numerical description of a distribution

Question 30

Q

Most common numerical description of distribution

Answer

A

The mean to measure the center and the standard deviation to measure the spread

Question 31

Q

Standard deviation

Answer

A

Measures spread by calculating how far the observations are from their mean–> should only be used when the mean is chosen as the measure of center

Question 32

Q

n-1

Answer

A

Degrees of freedom of the variance or standard deviation
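
A sketch of where n-1 enters the calculation, using made-up data. The sum of squared deviations from the mean is divided by n-1 to get the variance, and S is its square root. `statistics.stdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import stdev

# Made-up observations for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
# Divide by n - 1 (the degrees of freedom), not by n.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)
s = sqrt(variance)

# The standard library's sample standard deviation uses the same n - 1 divisor.
print(s, stdev(data))
```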

Question 33

Q

S=0

Answer

A

Only when there is no spread–> means all the observations have the same value, otherwise S is greater than 0

Question 34

Q

What does it mean if the standard deviation is higher?

Answer

A

S gets larger when the observations are more spread out about their mean

Question 35

Q

Units

Answer

A

S has the same units of measurement as the original observations

Question 36

Q

S and the Mean

Answer

A

Like the mean, S is not resistant–> a few outliers or strong skewness can greatly increase S

Question 37

Q

How do you measure risk in finance?

Answer

A

By looking at the standard deviation of returns–> large spread–> less predictable–> more risky
BUT the five number summary would be more informative

Question 38

Q

Density curve

Answer

A

A density curve is a mathematical model for the distribution of a quantitative variable

Question 39

Q

What does a density curve describe?

Answer

A

The overall pattern of a distribution. The area under the curve and within any range of values is the proportion of all observations that fall within that range

Question 40

Q

68-95-99.7 rule

Answer

A

In a Normal distribution, approximately:
68% of observations fall within 1 standard deviation of the mean
95% of observations fall within 2 standard deviations of the mean
99.7% of observations fall within 3 standard deviations of the mean
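
The three percentages can be checked against a Normal distribution. This sketch uses `statistics.NormalDist` (Python 3.8 or later) and finds the area within k standard deviations of the mean as the difference of two cumulative proportions.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of observations within k standard deviations of the mean.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(k, round(proportion, 4))
```

The results are close to 0.68, 0.95, and 0.997, which is why the rule is stated as an approximation.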

Question 41

Q

Z-Score

Answer

A

Standardized value–> tells us how many standard deviations the observation falls away from the mean and in which direction
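
A minimal sketch of standardizing a value: subtract the mean, then divide by the standard deviation. The mean of 64 and standard deviation of 3 are made-up numbers for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean; the sign gives the direction."""
    return (x - mean) / sd

# Hypothetical distribution with mean 64 and standard deviation 3.
above = z_score(70, 64, 3)   # larger than the mean, so z is positive
below = z_score(61, 64, 3)   # smaller than the mean, so z is negative
print(above, below)
```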

Question 42

Q

Z-score positive

Answer

A

Observations larger than the mean

Question 43

Q

Z-score negative

Answer

A

Observations smaller than the mean

Question 44

Q

Sample survey

Answer

A

Collects data from a sample of cases that represent a larger population of cases

Question 45

Q

Observation vs Experiment

Answer

A

In an observational study we do not attempt to influence the responses; in an experiment we impose a treatment (change) and observe the response

Question 46

Q

Training Data Set

Answer

A

In some studies we use one set of data, the training data set, to generate a set of results
Ex. A model to predict something

Question 47

Q

Database

Answer

A

A source from which data sets for statistical analysis can be extracted

Question 48

Q

Data warehouse

Answer

A

System for organizing, storing, and analyzing complex data

Question 49

Q

Sampling frame

Answer

A

A list of items to be sampled

Question 50

Q

Response rate

Answer

A

The proportion of the original sample who actually provide usable data

Question 51

Q

Undercoverage

Answer

A

Some groups in the population are left out of the process of choosing the sample

Question 52

Q

Nonresponse

Answer

A

Occurs when a case chosen for the sample cannot be contacted or does not cooperate