Statistics Flashcards

1
Q

three methods of graphing grouped quantitative data

A
  1. Histogram - classes are marked on the horizontal axis; frequencies, relative frequencies, or percentages on the vertical axis
  2. Frequency polygon - graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines
  3. Frequency distribution curve - the smooth curve that the frequency polygon approaches as the number of classes increases and the class width shrinks
2
Q

shapes of histograms

A

symmetric
skewed
uniform/rectangular

3
Q

Skewed right

A

tail is longer on the right side

mean will be greater than median

4
Q

skewed left

A

tail is longer on the left side

mean will be less than the median

5
Q

cumulative relative frequency

A

cumulative frequency / total observations in the dataset
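
As a rough illustration of the formula above, the sketch below builds cumulative and cumulative relative frequencies from a list of class frequencies. The frequencies are made-up values, not data from the course.

```python
from itertools import accumulate

# Hypothetical class frequencies for four classes (illustrative only).
frequencies = [4, 10, 5, 1]
total = sum(frequencies)

# Running totals of the frequencies, then each one divided by the total.
cumulative = list(accumulate(frequencies))
cumulative_relative = [c / total for c in cumulative]

print(cumulative)            # [4, 14, 19, 20]
print(cumulative_relative)   # the last entry is always 1.0
```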

6
Q

stem and leaf for the number 46

A

4|6

7
Q

in 5|2, which is the stem and which is the leaf?

A

stem is 5
leaf is 2
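
A minimal sketch of how two-digit values split into stems and leaves, assuming the tens digit is the stem and the ones digit is the leaf. The function name `stem_and_leaf` and the sample values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit integers by stem (tens digit) with sorted leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # 46 -> stem 4, leaf 6
        plot[stem].append(leaf)
    return dict(plot)

# Illustrative data only.
for stem, leaves in stem_and_leaf([46, 52, 41, 58, 46]).items():
    print(f"{stem}|{' '.join(str(leaf) for leaf in leaves)}")
```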

8
Q

major shortcoming of the mean as a measure of central tendency is that

A

it is very sensitive to outliers

9
Q

median

A

value of the middle term of the data set ranked in increasing order

it is not influenced by outliers
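
To see the difference between the two measures, the sketch below adds one extreme value to a small made-up data set and compares how far the mean and the median move. It uses Python's standard `statistics` module.

```python
from statistics import mean, median

data = [10, 12, 13, 15, 16]      # illustrative values
with_outlier = data + [200]      # same data plus one hypothetical outlier

print(mean(data), median(data))                  # mean 13.2, median 13
print(mean(with_outlier), median(with_outlier))  # mean jumps above 44, median is 14.0
```

The mean more than triples while the median moves by only one unit.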

10
Q

for a left skew, is the mode greater than or less than the mean?

A

mode will be greater than the median and the mean

11
Q

order of mean, median, mode (smallest to largest) for left skew

A

mean < median < mode

12
Q

order of mean, median, mode (smallest to largest) for right skew

A

mode < median < mean

13
Q

variance

A

for a population = σ²

for a sample = s²

calculated as: the sum of all your (values - the mean) squared, divided by the population size N for σ², or by n - 1 (sample size minus one) for s²
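
A small sketch of both calculations, assuming the usual textbook convention that the population variance divides by N and the sample variance divides by n - 1. The function names and the data are illustrative.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]      # mean is 5, squared deviations sum to 32
print(population_variance(data))     # 4.0
print(sample_variance(data))         # 32 / 7
```

Python's `statistics.pvariance` and `statistics.variance` follow the same two conventions.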

14
Q

standard deviation

A

positive square root of the variance

measures ABSOLUTE variability (dispersion in the same units as the data), not relative variability

15
Q

when we want to compare the variability of two different data sets, we have to use

A

the coefficient of variation

16
Q

coefficient of variation

A

expresses the standard deviation as a percentage of the mean

CV = (the standard deviation / the mean) x 100%
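
The sketch below applies the CV formula to two made-up data sets measured in different units, which is the situation where comparing raw standard deviations would be misleading. It assumes sample data, so it uses `statistics.stdev`.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

heights_cm = [160, 170, 180]   # illustrative values
weights_kg = [50, 70, 90]      # illustrative values

print(round(coefficient_of_variation(heights_cm), 2))
print(round(coefficient_of_variation(weights_kg), 2))
```

Here the weights have the larger CV, so they vary more relative to their mean.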

17
Q

mean of a population/sample for grouped data is calculated as

A

the sum of each class midpoint multiplied by its class frequency, divided by the population size (N) or sample size (n)
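
A quick sketch of the grouped-data mean, using hypothetical class midpoints and frequencies.

```python
# (class midpoint, class frequency) pairs; illustrative values only.
classes = [(5, 3), (15, 5), (25, 2)]

n = sum(f for _, f in classes)                # total number of observations
weighted_sum = sum(m * f for m, f in classes) # sum of midpoint x frequency
grouped_mean = weighted_sum / n

print(grouped_mean)   # 140 / 10 = 14.0
```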

18
Q

Chebyshev’s theorem for standard deviation

A

for any number k greater than 1, at least (1 - 1/k^2) of the data values lie within k standard deviations of the mean. k does not have to be a whole number, but it must be greater than 1: at k = 1 the bound is 0, so the theorem says nothing useful.
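
A minimal sketch of the bound as a function. The name `chebyshev_lower_bound` is hypothetical, and the function rejects k <= 1 because the bound is zero or negative there.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1 for a useful bound")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k))   # k = 2 gives 0.75
```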

19
Q

Empirical rule (applies only to bell-shaped distributions, unlike Chebyshev’s theorem)

A

about 68% of the observations lie within 1 std dev of the mean

about 95% of the observations lie within 2 std dev of the mean

about 99.7% of the observations lie within 3 std dev of the mean
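
These percentages can be checked against a normal curve. The sketch below uses `statistics.NormalDist` (Python 3.8+) with a standard normal distribution; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k std devs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```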

20
Q

quartiles

A

three summary measures (Q1 Q2 Q3) that divide a ranked data set into four equal parts

21
Q

first quartile definition

A

the value of the middle term among the observations that are less than the median (about 25% of the ranked values are smaller than Q1)

22
Q

third quartile

A

the value of the middle term among the observations that are greater than the median (about 75% of the ranked values are smaller than Q3)
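
A sketch of one common textbook convention for quartiles: Q2 is the median, Q1 is the median of the values ranked below it, and Q3 is the median of the values ranked above it. Other conventions exist (software packages often interpolate), so treat this as an assumption rather than the only method.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]
    # With an odd count, the middle term itself belongs to neither half.
    upper = ranked[half + 1:] if n % 2 else ranked[half:]
    return median(lower), median(ranked), median(upper)

data = [2, 4, 5, 7, 8, 10, 12, 15, 18]   # illustrative values
print(quartiles(data))                   # (4.5, 8, 13.5)
```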

23
Q

how do you find the median

A

rank all data points in order, pick the middle one

location of Q2/median = (n + 1)/2

if there is an even number of data points, you add up the two values on either side of the median location and divide the sum by two (see example on page 6 of 5.3)

24
Q

percentile definition

A

the summary measures that divide a ranked data set into 100 equal parts. each data set has 99 percentiles.

the kth percentile (Pk) = a value in a data set such that about k% of the measurements are smaller than Pk and about (100 - k)% of the measurements are greater than Pk

25
how to find the location of a percentile
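location of Pk = kn/100, where k is the percentile and n is the number of values in the data set
ex: find the location of the 42nd percentile when n = 12: kn/100 = (42 x 12)/100 = 5.04, so locate the value of the 5th term in your ranked data set. If the location were 5.9, you would pick the 6th term.
interpretation: suppose that term is 11 billion; you would then say "approximately 42% of _____ had ______ less than or equal to 11 billion"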
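
The steps in this card can be sketched as follows, assuming the location kn/100 is rounded to the nearest whole term as in the example. The ranked data set is hypothetical; only k = 42 and n = 12 come from the card.

```python
import math

def percentile_value(ranked, k):
    """Return (location, value) of the kth percentile in ranked data."""
    n = len(ranked)
    location = k * n / 100
    # Round to the nearest whole term; terms are counted from 1.
    position = max(1, math.floor(location + 0.5))
    return location, ranked[position - 1]

# Twelve made-up values, already ranked in increasing order.
ranked = [3, 5, 7, 8, 9, 11, 12, 14, 15, 18, 20, 25]
location, value = percentile_value(ranked, 42)
print(location, value)   # location is about 5.04, so the 5th term (9) is used
```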
26
finding the percentile rank of x (ex: find the percentile rank for x); what does this tell you?
percentile rank of x = [(number of values less than x) / (total number of values in the data set)] x 100%
ex: say you found the percentile rank to be 67%; this means that about 67% of the companies had a ____ less than x
it tells you which percentile a given value falls in
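
A short sketch of the percentile rank formula. The function name and the data values are illustrative.

```python
def percentile_rank(values, x):
    """Percentage of values in the data set that are less than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

values = [3, 5, 7, 8, 9, 11, 12, 14, 15, 18]   # made-up data, 10 values
print(percentile_rank(values, 14))             # 7 of 10 values are below 14 -> 70.0
```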
27
equation of the lower inner fence
Q1 - 1.5(IQR)
28
equation of the upper inner fence
Q3 + 1.5(IQR)
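
The two fence formulas can be sketched together. The quartile values and data are hypothetical, and values falling outside the inner fences are flagged as possible outliers.

```python
def inner_fences(q1, q3):
    """Return (lower inner fence, upper inner fence) from Q1 and Q3."""
    iqr = q3 - q1                      # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

q1, q3 = 4.5, 13.5                     # assumed quartiles for the data below
data = [2, 4, 5, 7, 8, 10, 12, 15, 40] # illustrative values

lower, upper = inner_fences(q1, q3)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)          # -9.0 27.0 [40]
```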
29
Left skewed box plot
the distance from the minimum to the median is larger than the distance from the median to the maximum
30
relative frequency
if an experiment is repeated n times and an event A is observed f times, then the probability of A (according to the relative frequency concept) is approximately: P(A) = f/n
31
law of large numbers (probability and relative frequency)
if you repeat an experiment again and again, the probability of an event obtained from the relative frequency approaches the actual or theoretical probability
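
A simulation sketch of the relative frequency concept and the law of large numbers, assuming a fair coin with theoretical P(heads) = 0.5. The function name and the fixed seed are illustrative choices.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Toss a simulated fair coin n_tosses times and return f/n for heads."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With more tosses, the printed relative frequency tends to settle closer to 0.5.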
32
subjective probability
the probability assigned to an event based on subjective judgement, experience, and belief
33
determining total outcomes when you have an experiment with multiple steps
multiply the number of outcomes for each of the steps
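
As a check on the counting rule, the sketch below uses a hypothetical two-step experiment (toss a coin, then roll a die) and compares the product of the step counts with a full listing of outcomes.

```python
from itertools import product
from math import prod

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

total_by_rule = prod([len(coin), len(die)])   # 2 x 6
outcomes = list(product(coin, die))           # every (coin, die) pair

print(total_by_rule, len(outcomes))           # 12 12
```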
34
marginal probability (simple probability)
the probability of a single event without the consideration of any other event
35
probability
a numerical measure of the likelihood that a specific event will occur, denoted by P
the probability of an event always lies between 0 and 1 (inclusive)
36
symbol for the probability of a simple event (Ei)
P(Ei)
37
conditional probability
the probability that an event will occur given that another event has already occurred
P(A|B) = the probability of A given that B has already occurred
38
independent events
when the occurrence of one event does not affect the probability of the occurrence of the other
39
complementary events
the complement of event A is denoted by Ā (A with a line on top) and is the event that includes all the outcomes for an experiment that are not in A
40
intersection of events
the collection of all outcomes that are common to both A and B
41
joint probability
the probability of the intersection of two events, written P(A and B)
P(A and B) = P(A) P(B|A)
42
union of events
the collection of all outcomes that belong to A, to B, or to both
P(A or B) = P(A) + P(B) - P(A and B)
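
The conditional, joint, and union rules can be tried on one example. The sketch assumes a standard 52-card deck with A = "card is a heart" and B = "card is a face card"; the events are chosen for illustration, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

p_a = Fraction(13, 52)        # A: card is a heart
p_b = Fraction(12, 52)        # B: card is a face card (J, Q, K)
p_a_and_b = Fraction(3, 52)   # joint: hearts that are also face cards

p_b_given_a = p_a_and_b / p_a          # conditional probability P(B|A)
p_a_or_b = p_a + p_b - p_a_and_b       # union via the addition rule
independent = p_b_given_a == p_b       # does knowing A change P(B)?

print(p_b_given_a, p_a_or_b, independent)   # 3/13 11/26 True
```

Multiplying P(A) by P(B|A) gives back the joint probability 3/52, matching the multiplication rule.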
43
probability distribution (discrete random variable)
lists all the possible values the random variable can assume and their corresponding probabilities
44
the mean of a discrete random variable
μ = Σ x P(x): the sum of each value of x multiplied by its probability. it is also called the expected value, and is denoted by E(x)
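
A minimal sketch of the expected value calculation, E(x) = Σ x P(x), using a made-up probability distribution.

```python
# Hypothetical distribution: value of x -> P(x). Probabilities must sum to 1.
distribution = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
assert abs(sum(distribution.values()) - 1) < 1e-9

# Multiply each value by its probability and add the results.
expected_value = sum(x * p for x, p in distribution.items())
print(round(expected_value, 2))   # 1.7
```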
45