unit 1 - chapter 2 - descriptive statistics Flashcards

Question 1

Q

mean, median and mode

Answer

A

Where is the middle of the distribution? (The example is a bar graph whose curve leans to the left.)

Mode: the highest point on the graph
Median: somewhere in the middle
Mean: pulled/dragged toward the outliers

In a bell curve (symmetric distribution), the mean, median, and mode are all the same.
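
A minimal sketch of the idea above, using Python's standard statistics module. The numbers are hypothetical (they are not read from the graph on the card). They only illustrate the mean being dragged by one outlier while the mode and median stay put, and all three agreeing on a symmetric list.

```python
from statistics import mean, median, mode

# Hypothetical data with one large outlier (30).
skewed = [2, 3, 3, 3, 4, 5, 6, 30]
skewed_mode = mode(skewed)      # most frequent value
skewed_median = median(skewed)  # middle of the ordered values
skewed_mean = mean(skewed)      # sum / count, pulled up by the outlier

# Hypothetical symmetric data: all three measures land on the same value.
symmetric = [1, 2, 3, 3, 3, 4, 5]
symmetric_stats = (mode(symmetric), median(symmetric), mean(symmetric))

print(skewed_mode, skewed_median, skewed_mean)
print(symmetric_stats)
```

Dropping the 30 from the first list brings the mean back down close to the median, which is a quick way to see how much one outlier matters.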

Question 2

Q

levels of data… measurements to use

Answer

A

1 - Nominal: Mode

2 - Ordinal: Median (p50 / 50th percentile)

3 - Interval: Mean
4 - Ratio: Mean

Question 3

Q

nominal - mode

Answer

A

Mode (by category or by value) for the graph of top billion-dollar content companies?
Disney is the mode: it is the top of the chart (the most).

Mode for Netflix original content hours?
Drama and Kids, because they are the top of the chart (the most).

Mode for time per day on Netflix?
The highest point on the graph is Sunday (2.10 HRS).
It is not Tuesday/Wednesday and Friday, even though they share the same value (1:30 HRS) and together add up to the most.

Question 4

Q

ordinal - median

Answer

A

Median for movie rating and attendance?
By value it is 300, which is PG-13, found by going halfway up the attendance counts.

Median for levels of pain and frequency?
Pain level 3.5, found by going halfway up the frequencies.

Median position = PCT × (n + 1)
PCT = percentile written as a proportion (0.50 for the median)
n = sample size
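
The position formula above can be sketched in Python as follows. The function names and the sample lists are hypothetical, and averaging the two neighbouring values when the position lands on a half is an assumption about how the card intends the formula to be used.

```python
def percentile_position(pct, n):
    """1-based position of a percentile in ordered data: pct * (n + 1)."""
    return pct * (n + 1)

def median_by_position(values):
    ordered = sorted(values)
    pos = percentile_position(0.50, len(ordered))  # e.g. 4.0 or 4.5
    lower = ordered[int(pos) - 1]   # value at the whole-number position
    if pos == int(pos):
        return lower                # position is exact: take that value
    upper = ordered[int(pos)]       # otherwise average the two neighbours
    return (lower + upper) / 2

odd_median = median_by_position([13, 1, 9, 5, 3, 11, 7])       # n = 7, position 4
even_median = median_by_position([1, 3, 5, 7, 9, 11, 13, 15])  # n = 8, position 4.5
print(odd_median, even_median)
```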

Question 5

Q

interval and ratio - mean

Answer

A

Add everything up and divide by n
x̄ = Σx / n

Mean for blended strawberry (from a frequency table)
CF is cumulative frequency
n = 52 (the top CF number)

x̄ = Σ(x × f) / n
Sum of (units × frequency), divided by the CF total

Check this: the range
Get it from the units!
Range = 60 - 55
= 5
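
A small Python sketch of the frequency-table mean and the range. The table is hypothetical: the frequencies are invented so that they total 52 and the units run from 55 to 60, matching the shape of the card but not its actual data.

```python
from itertools import accumulate

# Hypothetical frequency table: units (x) and how often each occurred (f).
units = [55, 56, 57, 58, 59, 60]
freq = [4, 8, 14, 12, 9, 5]

cumulative = list(accumulate(freq))   # running total of f (the CF column)
n = cumulative[-1]                    # the final CF value is the sample size
x_bar = sum(x * f for x, f in zip(units, freq)) / n  # sum(x * f) / n
data_range = max(units) - min(units)  # range comes from the units, not from f

print(n, round(x_bar, 2), data_range)
```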

Question 6

Q

mode

components:
quantity:
outliers:

Answer

A

components: no formula
quantity: one or more
outliers: not affected

Question 7

Q

median

components:
quantity:
outliers:

Answer

A

components: size of dataset
quantity: only one
outliers: not affected

Question 8

Q

mean

components:
quantity:
outliers:

Answer

A

components: dataset size and data points
quantity: only one
outliers: affected

Question 9

Q

4 levels of data (measure of variability for each)

Answer

A

Nominal - variation ratio
Ordinal - median deviation
Interval - standard deviation
Ratio - standard deviation
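
For the nominal level, the variation ratio is commonly defined as 1 minus the share of observations in the modal category. The sketch below assumes that definition; the function name and the genre list are hypothetical.

```python
from collections import Counter

def variation_ratio(categories):
    """1 - (count of the modal category / total count)."""
    counts = Counter(categories)
    modal_count = max(counts.values())
    return 1 - modal_count / len(categories)

# Hypothetical nominal data: 10 titles labelled by genre.
genres = ["Drama"] * 5 + ["Kids"] * 3 + ["Comedy"] * 2
vr = variation_ratio(genres)
print(vr)
```

A value of 0 means every observation is in the same category. Values closer to 1 mean the data are spread across many categories.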

Question 10

Q

standard deviation vs variance

Answer

A

Standard deviation (SD)
(more risk) Sample = s
Population = σ (sigma)

Variance
Sample = s^2
Population = σ^2

parameter = population
statistic = sample
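
Python's statistics module keeps the same sample/population split: stdev and variance divide by n - 1 (sample), while pstdev and pvariance divide by n (population). A minimal sketch with a hypothetical data set:

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

s2 = variance(data)       # sample variance, divides by n - 1
s = stdev(data)           # sample standard deviation
sigma2 = pvariance(data)  # population variance, divides by n
sigma = pstdev(data)      # population standard deviation

print(s2, s, sigma2, sigma)
```

The sample versions come out slightly larger than the population versions because of the smaller divisor.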

Question 11

Q

standard deviation

Answer

A

Different answers: s or σ (s^2 or σ^2)
Easier to solve by hand
Square the numerator because the deviations (x - x̄) sum to 0

Downside of s^2 and σ^2
The problem is that variance is on a larger scale than the data
The answer is in squared units
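
A short sketch of why the deviations are squared, assuming a hypothetical four-value sample. The raw deviations cancel to zero, the squared ones do not, and taking the square root brings the result back to the scale of the data.

```python
from math import sqrt

data = [3, 5, 7, 9]                 # hypothetical sample
n = len(data)
x_bar = sum(data) / n               # 6.0

deviations = [x - x_bar for x in data]       # -3, -1, 1, 3
sum_dev = sum(deviations)                    # cancels out to 0
sum_sq_dev = sum(d ** 2 for d in deviations) # 9 + 1 + 1 + 9

s2 = sum_sq_dev / (n - 1)           # sample variance, in squared units
s = sqrt(s2)                        # standard deviation, in original units

print(sum_dev, sum_sq_dev, s2, s)
```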

Question 12

Q

standard deviation

Answer

A

Will not zero out
Is based on the mean
Average distance (ruler)
Same scale as original data
Quiz question: Thus the standard deviation is the...
Standard (benchmark)*
Note: SD is influenced by outliers

Question 13

Q

standard deviation is used for

Answer

A

Used as a descriptor
Used to normalize data
In business as a measure of volatility, risk, control and outcome assignment
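
One common way the standard deviation is used to normalize data is the z-score, (x - mean) / SD, which restates each value as a number of standard deviations from the mean. The sketch below assumes that meaning of "normalize" and treats a hypothetical data set as a full population.

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
mu = mean(data)                   # 5
sigma = pstdev(data)              # population SD, 2.0 for this list

# z-score: how many standard deviations each value sits from the mean.
z_scores = [(x - mu) / sigma for x in data]

print(z_scores)
```

After this step the values have mean 0 and standard deviation 1, so they can be compared with data measured on a different scale.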

Question 14

Q

the variance

Answer

A

Historical value
Appears as an element of aggregate variability in statistical tools such as ANOVA, SLR (simple linear regression), and multiple regression
Dropped in bigger formulas

Question 15

Q

Facebook practice problem

Answer

A

Can you calculate the mean for less than $40 million?
What Facebook said they would do:
Avg duration of video viewed = total time watched / total number of users watching the video

What they actually did:
Avg duration of video viewed = total time watched / total number of users watching the video for 3 or more seconds
A smaller denominator makes the average bigger
Average duration metrics were inflated 150-200%
Highlight clips increased this number
TikTok vs YouTube videos on Facebook

Calculate the mean for a less than $40 million contract for Facebook
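
The denominator effect can be sketched with made-up numbers. The watch times below are hypothetical and are not Facebook's data. The numerator (total time watched) stays the same, and only the count of users in the denominator changes.

```python
# Hypothetical watch times in seconds, one entry per user who saw the video.
watch_seconds = [1, 2, 2, 5, 10, 30, 60]

total_time = sum(watch_seconds)                        # 110 seconds in total
all_users = len(watch_seconds)                         # 7 users
long_viewers = sum(1 for t in watch_seconds if t >= 3) # 4 users watched 3+ seconds

stated_avg = total_time / all_users       # what the stated formula gives
reported_avg = total_time / long_viewers  # smaller denominator, bigger average

print(round(stated_avg, 2), reported_avg)
```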

Question 16

Q

quartiles vs percentiles

Answer

A

Quartiles are special percentiles. The first quartile, Q1, is the same as the 25th percentile, and the third quartile, Q3, is the same as the 75th percentile. The median, M, is called both the second quartile and the 50th percentile.

For example, if the third quartile, Q3, is nine, then three-fourths (75%) of the ordered data set are less than nine.
The interquartile range is a number that indicates the spread of the middle half or the middle 50% of the data.

To calculate quartiles and percentiles, the data must be ordered from smallest to largest. Quartiles divide ordered data into quarters. Percentiles divide ordered data into hundredths.
To score in the 90th percentile of an exam does not necessarily mean that you received 90% on the test. It means that 90% of test scores are the same as or less than your score and 10% of the test scores are the same as or greater than your score.
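
A minimal Python sketch of quartiles, the interquartile range, and the "share at or below" reading of a percentile. The score list and the helper name are hypothetical. statistics.quantiles with its default "exclusive" method is used here as one of several accepted quartile conventions; textbooks may use a slightly different rule.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # hypothetical, already ordered

q1, q2, q3 = quantiles(scores, n=4)  # 25th, 50th and 75th percentiles
iqr = q3 - q1                        # spread of the middle 50% of the data

def share_at_or_below(values, score):
    """Fraction of values that are the same as or less than the given score."""
    return sum(1 for v in values if v <= score) / len(values)

share = share_at_or_below(scores, 10)

print(q1, q2, q3, iqr, round(share, 3))
```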

Question 17

Q

sample mean vs population mean

Answer

A

The letter used to represent the sample mean is an x with a bar over it (pronounced “x bar”): x̄

The Greek letter μ (pronounced “mew”) represents the population mean. One of the requirements for the sample mean to be a good estimate of the population mean is for the sample taken to be truly random.

Question 18

Q

when is the mean = median
when is the mean > median

Answer

A

mean = median: when the distribution is symmetrical
mean > median: when the distribution is skewed to the right

Mean, not mode!
Where does the average lie?

Question 19

Q

why is the standard deviation important?

Answer

A

The standard deviation provides a numerical measure of the overall amount of variation in a data set, and
can be used to determine whether a particular data value is close to or far from the mean.

Question 20

Q

variability in samples

Answer

A

Observational or measurement variability
Natural variability
Induced variability
Sample variability

Question 21

Q

variability in samples - measurement variability

Answer

A

Measurement variability occurs when there are differences in the instruments used to measure or in the people using those instruments.

If we are gathering data on how long it takes for a ball to drop from a height by having students measure the time of the drop with a stopwatch, we may experience measurement variability if the two stopwatches used were made by different manufacturers.

Question 22

Q

variability in samples - natural variability

Answer

A

Natural variability arises from the differences that naturally occur because members of a population differ from each other.

For example, if we have two identical corn plants and we expose both plants to the same amount of water and sunlight, they may still grow at different rates simply because they are two different corn plants.

Question 23

Q

variability in samples - induced variability

Answer

A

Induced variability is the counterpart to natural variability; this occurs because we have artificially induced an element of variation (that, by definition, was not present naturally).

For example, we assign people to two different groups to study memory, and we induce a variable in one group by limiting the amount of sleep they get.

Question 24

Q

variability in samples - sample variability

Answer

A

Sample variability occurs when multiple random samples are taken from the same population. For example, if I conduct four surveys of 50 people randomly selected from a given population, the differences in outcomes may be affected by sample variability.
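
Sample variability can be seen in a quick simulation. The sketch below draws four random samples of 50 from a hypothetical population of the numbers 1 to 1000 (an assumption made only for illustration) and compares the sample means.

```python
import random
from statistics import mean

rng = random.Random(42)            # fixed seed so the run is repeatable
population = list(range(1, 1001))  # hypothetical population, mean 500.5

sample_means = []
for _ in range(4):
    sample = rng.sample(population, 50)  # 50 members drawn without replacement
    sample_means.append(mean(sample))

print(sample_means)
```

Each sample comes from the same population, yet the four means are expected to differ from one another and from 500.5 purely because different members happened to be drawn.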

(24 cards)