Chapter 3: Flashcards

1
Q

What is the arithmetic mean?

A

the most commonly used measure of central tendency - affected by extreme values - the sum of all numerical values divided by the total number of observations - do NOT use when the data has extreme values

2
Q

What is the most commonly used measure of central tendency?

A

arithmetic mean

3
Q

Can you use the arithmetic mean if there are extreme values?

A

you should not use it, because it is affected by extreme values

4
Q

What is the median?

A

the middle value in an ordered array of data - not affected by extreme values (outliers) - if n is odd, the median is the middle number - if n is even, the median is the average of the two middle numbers
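
A small sketch of why the median resists extreme values while the mean does not. The data set is made up for illustration, and Python's standard `statistics` module is just one convenient way to check it.

```python
from statistics import mean, median

# Hypothetical data: five typical values, then the same values plus one extreme value.
typical = [10, 12, 13, 15, 20]
with_outlier = typical + [500]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean jumps a long way; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

With an even number of values (six here), the median is the average of the two middle numbers, which matches the rule on the card.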

5
Q

Is the median affected by extreme values?

A

no

6
Q

What is the mode?

A

the value in a set of data that appears most frequently - not affected by extreme values - used only for descriptive purposes (because it is more variable from sample to sample than other measures of central tendency)
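
A quick sketch of finding the mode, assuming Python's standard `statistics.multimode`, which returns every most-frequent value. The data sets are invented for illustration.

```python
from statistics import multimode

# Hypothetical data with one extreme value (120); the mode ignores it.
values = [2, 3, 3, 5, 7, 7, 7, 9, 120]
modes = multimode(values)

# A data set can have more than one mode when values tie for most frequent.
tied_modes = multimode([1, 1, 2, 2, 3])

print(modes, tied_modes)
```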

7
Q

Is the mode affected by extreme values?

A

no

8
Q

Which measure of central tendency is used for descriptive purposes?

A

mode

9
Q

What is the geometric mean?

A

multiply all n values together, then raise the product to the power 1/n (take the nth root) - helps measure the status of an investment over time - useful measure of the rate of change of a variable over time
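
A minimal sketch of the calculation for an investment, assuming made-up yearly returns of +10%, -5% and +20% expressed as growth factors. The manual formula is compared against `statistics.geometric_mean` from the standard library.

```python
import math
from statistics import geometric_mean

# Hypothetical growth factors: 1.10 means +10%, 0.95 means -5%, 1.20 means +20%.
growth_factors = [1.10, 0.95, 1.20]

# Multiply everything together, then raise to the power 1/n.
product = math.prod(growth_factors)
manual = product ** (1 / len(growth_factors))

# The standard library gives the same answer.
library = geometric_mean(growth_factors)

# Average rate of change per period.
average_rate = manual - 1
print(f"average growth per year: {average_rate:.2%}")
```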

10
Q

Which measure of central tendency helps measure the status of an investment over time?

A

geometric mean

11
Q

Which measure of central tendency is useful for measuring the rate of change of a variable over time?

A

geometric mean

12
Q

What are quartiles?

A
  • most widely used measures of noncentral location - used to describe properties of large sets of numerical data - whereas the median is the value that splits the ordered array in half (50% of the observations are smaller and 50% are larger), quartiles are descriptive measures that split the ordered data into 4 quarters
13
Q

What is the most widely used measure of non-central location?

A

quartiles

14
Q

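How do you compute quartiles?

A
  1. Determine the location in the ordered list: (total number of values + 1) x 25/100 = location of quartile 1; (total number of values + 1) x 50/100 = location of quartile 2; (total number of values + 1) x 75/100 = location of quartile 3
  2. Locate the number in the ordered list; if the location is not a whole number, interpolate between the two neighbouring values

for instance, a location of 2.75 falls between the 2nd and 3rd values; suppose those values are 3 and 6

6 - 3 = 3, then 3 x .75 = 2.25 (.75 is the fractional part of the location; with 10 values it is .75 for the first quartile, .5 for the second and .25 for the third)

2.25 + 3 (the lower of the two values) = 5.25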
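
The two steps above can be sketched in code. The helper name `quartile` and the ten-value data set are hypothetical; the data is chosen so that the 2nd and 3rd values are 3 and 6, as in the worked example.

```python
def quartile(ordered, percent):
    """Value at location (n + 1) * percent / 100, interpolating between neighbours."""
    n = len(ordered)
    location = (n + 1) * percent / 100  # 1-based position in the ordered list
    whole = int(location)               # position of the lower neighbour
    fraction = location - whole         # how far to move toward the upper neighbour
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


# Hypothetical ordered data with 10 values.
data = sorted([1, 3, 6, 8, 10, 12, 15, 18, 21, 25])
q1, q2, q3 = (quartile(data, p) for p in (25, 50, 75))
print(q1, q2, q3)
```

Other textbooks and software use slightly different location rules, so results may differ a little from this sketch. In Python, `statistics.quantiles(data, n=4, method="exclusive")` follows the same (n + 1) rule.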

15
Q

What is a measure of variation?

A

describes numerical data - the amount of dispersion or spread in the data - two sets of data may differ in both central tendency and variation - or they may have the same measures of variation but different central tendencies - or they may have the same measures of central tendency but greatly different variation

16
Q

what are the 5 measures of variation

A
  1. range 2. interquartile range 3. variance 4. standard deviation 5. coefficient of variation
17
Q

What is the range?

A

the difference between the largest and smallest observations in a set of data - measures the total spread in the set of data - its main weakness is that it does not take into account how the data are distributed between the smallest and largest values

18
Q

What is the interquartile range?

A
  • also called midspread - difference between the third and first quartiles in a set of data - subtract the first quartile from the third quartile - not influenced by extreme values
19
Q

How do you calculate the range?

A

largest number - smallest number

20
Q

How do you calculate the variance?

A
  1. Calculate the mean (sum of all the values / number of values)
  2. Calculate the variance
    - subtract the mean from each value
    - then square each result
    - then add up the squared differences
    - then divide the sum by the number of values (N) for a population variance, or by n - 1 for a sample variance
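
The steps above, sketched by hand on a made-up data set and checked against Python's standard `statistics` module (`pvariance` divides by N, `variance` divides by n - 1). The square root at the end gives the standard deviation.

```python
import math
from statistics import pvariance, variance

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean is the sum of the values divided by how many there are.
mean_value = sum(data) / len(data)

# Step 2: subtract the mean from each value, square, and add up.
squared_differences = [(x - mean_value) ** 2 for x in data]
total = sum(squared_differences)

population_variance = total / len(data)    # divide by N
sample_variance = total / (len(data) - 1)  # divide by n - 1

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)

print(population_variance, sample_variance, population_sd)
```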
21
Q

Why can variances be hard to interpret?

A

because they can be quite large and are expressed in squared units

22
Q

Why use standard deviation?

A

because variance can be quite large and is in squared units; the standard deviation is in the same units as the data

23
Q

how do you calculate the standard deviation?

A

square root of the variance

24
Q

What is the Coefficient of Variation (CV)?

A

measure of variability relative to the mean

25
Q

How do you calculate the Coefficient of Variation?

A

Calculate:

Take the standard deviation and divide it by the mean

(standard deviation / mean) x 100 = CV as a %

Useful for comparing two data sets to see which one is more variable
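
A short sketch comparing two made-up data sets measured in different units. The helper name `coefficient_of_variation` is hypothetical, and using the sample standard deviation (`statistics.stdev`) is an assumption; the population version works the same way.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return stdev(values) / mean(values) * 100


# Hypothetical data in different units, so the raw standard deviations are not comparable.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# The data set with the larger CV is more variable relative to its mean.
print(f"heights: {cv_heights:.1f}%  weights: {cv_weights:.1f}%")
```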

26
Q

why would you use Coefficient of Variation (CV)?

A

useful for comparing two data sets to see which one is more variable

27
Q

What are the calculations for a box and whisker plot (5 number summary)?

A
  1. Determine the smallest number
  2. Determine the quartiles

(total number of values + 1) x 25/100 = location of quartile 1

Then find the value at that location in the ordered list of numbers

(total number of values + 1) x 50/100 = location of quartile 2

(total number of values + 1) x 75/100 = location of quartile 3

  3. Determine the median

Median is quartile 2

  4. Determine the largest number
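
A minimal sketch of the five numbers for a made-up data set. It assumes Python's `statistics.quantiles` with `method="exclusive"`, which places the quartiles at (number of values + 1) x 25/100, 50/100 and 75/100; the helper name `five_number_summary` is hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


# Hypothetical data with 10 values.
data = [12, 1, 25, 8, 3, 18, 6, 21, 10, 15]
smallest, q1, med, q3, largest = five_number_summary(data)
print(smallest, q1, med, q3, largest)
```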
28
Q

How do you construct a box and whisker plot?

A
  1. determine the quartiles
  2. ????????????
29
Q

How do you calculate the interquartile range?

A

3rd quartile - 1st quartile

  • the middle 50% of the observations are within the box
30
Q

Explain how to read the box and whisker plot regarding outliers

A
  • true as long as there are no outliers (shown as dots beyond the ends of the whiskers)
  • a value is an outlier if it is greater than the 3rd quartile + 1.5 x interquartile range
  • OR if it is less than the 1st quartile - 1.5 x interquartile range

these rules show you whether there are outliers
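
The 1.5 x interquartile range rule can be sketched as follows. The data set and the helper name `find_outliers` are made up, and the quartiles are assumed to come from `statistics.quantiles` with `method="exclusive"`.

```python
from statistics import quantiles


def find_outliers(values):
    """Return the values lying beyond Q1 - 1.5 x IQR or Q3 + 1.5 x IQR."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]


# Hypothetical data: nine ordinary values and one extreme value.
data = [1, 3, 6, 8, 10, 12, 15, 18, 21, 60]
outliers = find_outliers(data)
print(outliers)
```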

31
Q

what are the measures of variation?

A
  1. range
  2. variance
  3. standard deviation
  4. coefficient of variation
32
Q

What are the shapes of a distribution and what do they mean?

A

Shape describes how the data are distributed

the shape can be

  1. symmetric or
  2. skewed (left skewed or right skewed)
33
Q

What is the 5 number summary used for?

A

to determine the shape of a distribution (box and whisker summary)

34
Q

What is the 5 number summary a measure of?

A

a measure of

  1. central location as well as
  2. relative standing
35
Q

What does percentile mean?

A

the number that has a certain % of the data below it

(i.e. the 25th percentile means 25% of the numbers are below that number)

36
Q

What do the measures of association measure?

A

how strong the relationship is between two variables

  • specifically, we are most interested in linear relationships (where scatter plots show a straight line)
37
Q

What does covariance measure?

A

measures how two variables change together

  • if one goes up, does the other go up? or will it go down? or do we know nothing at all (the variables are not associated with each other)?
38
Q

How do you calculate covariance?

A

take the difference from the mean for each data point (like variance)

  • we use both data sets and multiply the paired differences together, then average the products
  • if one increases when the other decreases consistently
    - the products will be negative (positive x negative)
    - covariance will be high and negative
  • if they increase and decrease together consistently
    - the products will be positive (positive x positive or negative x negative)
    - covariance will be high and positive

if inconsistent - some positives and some negatives - covariance will be low (may be negative or positive)
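
A minimal sketch of the calculation on made-up paired data. The helper name `sample_covariance` is hypothetical, and dividing by n - 1 (the sample version) is an assumption; dividing by n gives the population version.

```python
def sample_covariance(xs, ys):
    """Average product of paired differences from the means (dividing by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # goes up when x goes up
falling = [10, 8, 6, 4, 2]  # goes down when x goes up

cov_rising = sample_covariance(x, rising)    # positive
cov_falling = sample_covariance(x, falling)  # negative
print(cov_rising, cov_falling)
```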

39
Q

For covariance, if one variable increases when the other decreases consistently, what does this show?

A

the products will be negative (positive x negative)

  • covariance will be high and negative
40
Q

For covariance, if they increase and decrease together consistently, what does this show?

A

the products will be positive (positive x positive or negative x negative)

  • covariance will be high and positive
41
Q

For covariance, if the variables are inconsistent, what does this show?

A

some positive and some negative products

  • covariance will be low (MAY BE POSITIVE OR NEGATIVE)
42
Q

For covariance, what does a higher number show?

A

a higher number (in absolute value) shows a stronger relationship, but the size also depends on the units of the variables

43
Q

What does the coefficient of correlation show?

A

shows the strength and direction of the linear relationship between two variables

44
Q

How do you calculate the coefficient of correlation?

A

COVARIANCE / (STANDARD DEVIATIONS OF THE TWO VARIABLES MULTIPLIED TOGETHER)

  • -1 means perfect negative linear relationship
  • 0 means no linear relationship
  • +1 means perfect positive linear relationship
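
A sketch of the formula on made-up paired data. The helper name `correlation` is hypothetical, and the sample versions of covariance and standard deviation (dividing by n - 1) are assumed. The last lines check that rescaling a variable leaves the result unchanged, which is what "unit free" means.

```python
import math


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return covariance / (sd_x * sd_y)


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
r_rising = correlation(x, [2, 4, 6, 8, 10])   # perfect positive line
r_falling = correlation(x, [10, 8, 6, 4, 2])  # perfect negative line
r_mixed = correlation(x, [3, 1, 4, 1, 5])     # no consistent pattern

# Changing the units of x (for example metres to centimetres) does not change r.
r_rescaled = correlation([100 * value for value in x], [3, 1, 4, 1, 5])
print(r_rising, r_falling, r_mixed, r_rescaled)
```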
45
Q

What are the features of the correlation coefficient?

A
  1. unit free
  2. ranges between -1 and 1
  3. the closer to -1, the stronger the negative linear relationship
  4. the closer to 1, the stronger the positive linear relationship
  5. the closer to 0, the weaker any positive or negative linear relationship
46
Q

How do we display correlation coefficients?

A

scatter plots

47
Q

What are some ethical considerations?

A
  1. should document both good and bad results
  2. results should be presented in a fair, objective and neutral manner
  3. should not use inappropriate summary measures to distort facts