QU1 chapter 3 notes Flashcards

1
Q

What are the measures of central tendency

A
  1. Arithmetic mean
  2. median
  3. mode
  4. geometric mean
2
Q

describe arithmetic mean

A
  • most commonly used measure of central tendency
  • affected by extreme values
  • do not use when data has extreme values
3
Q

how do you calculate arithmetic mean

A

sum all the numerical values, then divide by the total number of observations
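
A minimal Python sketch of this calculation. The function name `arithmetic_mean` and the sample numbers are illustrative and do not come from the notes; the second call shows how a single extreme value pulls the mean upward.

```python
def arithmetic_mean(values):
    """Sum the observations, then divide by how many there are."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


# Illustrative numbers only.
print(arithmetic_mean([2, 4, 6, 8]))       # 5.0
print(arithmetic_mean([2, 4, 6, 8, 100]))  # 24.0
```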

4
Q

describe the median and how to calculate it

A
  • the middle value in an ordered array of data
  • not affected by extreme values (outliers)
  • if N is odd, then median is the middle number
  • if N is even, then median is the average of the two middle numbers
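
The odd and even cases above can be sketched in Python as follows. The function name `median` and the sample numbers are illustrative assumptions, not part of the notes.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: a single middle value exists.
        return ordered[mid]
    # Even N: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 3]))           # 3
print(median([1, 2, 3, 4]))        # 2.5
print(median([1, 2, 3, 4, 1000]))  # 3, the outlier does not move the median
```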
5
Q

describe mode

A

the value in a set of data that appears most frequently

  • not affected by extreme values
  • used for descriptive purposes only (because it is more variable from sample to sample than other measures of central tendency)
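
A small sketch of finding the mode in Python. The helper name `modes` is hypothetical; it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """Return every value that appears with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())  # raises ValueError on empty input
    return [value for value, count in counts.items() if count == highest]


# Illustrative numbers only.
print(modes([5, 5, 1]))        # [5]
print(modes([1, 2, 2, 3, 3]))  # [2, 3], two modes
```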
6
Q

Describe geometric mean

A

helps measure the status of an investment over time

- useful measure of the rate of change of a variable over time

7
Q

how do you calculate the geometric mean

A

multiply all n values together, then raise the product to the power 1/n (take the nth root), where n is the number of values
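
A minimal Python sketch of this formula, assuming all values are positive. The function name `geometric_mean` and the sample numbers are illustrative.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    return math.prod(values) ** (1 / len(values))


# Illustrative numbers only: 2 * 8 = 16, and the square root of 16 is 4.
print(geometric_mean([2, 8]))  # 4.0
```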

8
Q

What are quartiles

A
  • most widely used measure of noncentral location
  • used to describe properties of large sets of numerical data
  • whereas the median is the value that splits the ordered array in half (50% of the observations are smaller and 50% are larger), quartiles are descriptive measures that split the ordered data into 4 quarters
9
Q

how do you compute quartiles
Compute the quartiles of the 3-year annualized returns after removing CI signature Select Canadian Seg I. The ordered array is:

5.34 6.15 6.85 7.11 9.05 10.16 10.79 11.35 13.43 13.43 13.93 17.1

A

Solution:
Q1 = (n + 1)/4 ordered observation
   = (13 + 1)/4 = 3.5th ordered observation

Step 2: because 3.5 falls halfway between two positions, Q1 is approximated by the arithmetic mean of the third and fourth ordered observations

Q1 = (6.85 + 7.11)/2 = 6.98

In addition:
Q3 = 3(n + 1)/4 ordered observation
   = 3(13 + 1)/4 = 10.5th ordered observation

Therefore, using rule 2, Q3 is approximated by the arithmetic mean of the 10th and 11th ordered observations

Q3 = (13.43 + 13.43)/2 = 13.43
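
The positional rule used in this solution can be sketched in Python as below. The helper names are hypothetical and the data are illustrative, not the fund returns. Averaging at a halfway position follows the "rule 2" cited in the solution; taking the observation directly at a whole-number position and rounding any other fraction to the nearest position are assumptions.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based ordered position that may be fractional."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        # Whole-number position: take that ordered observation.
        return ordered[whole - 1]
    if fraction == 0.5:
        # Halfway position ("rule 2"): average the two neighbouring observations.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Assumption: any other fraction is rounded to the nearest position.
    return ordered[round(position) - 1]


def quartiles(values):
    """Return (Q1, Q3) using positions (n + 1)/4 and 3(n + 1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three observations")
    return (value_at_position(ordered, (n + 1) / 4),
            value_at_position(ordered, 3 * (n + 1) / 4))


# Illustrative data: the 13 values 1 to 13, so the positions are 3.5 and 10.5.
print(quartiles(range(1, 14)))  # (3.5, 10.5)
```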
10
Q

What are the measures of variation

A
  1. range
  2. variance
  3. standard deviation
  4. coefficient of variation
11
Q

describe range

A

difference between the largest and the smallest observation

- ignores the way in which data are distributed
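
A short Python sketch of the range. The function name `data_range` and the numbers are illustrative. The two data sets are spread out differently but have the same range, which is the weakness noted above.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


# Illustrative numbers only.
print(data_range([1, 5, 5, 5, 9]))  # 8, values clustered in the middle
print(data_range([1, 1, 5, 9, 9]))  # 8, values piled at the extremes
```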

12
Q

what is interquartile range

A
  • measure of variation
  • also called mid-spread (spread in the middle 50%)
  • not affected by extreme values
13
Q

How do you calculate interquartile range

A

difference between the third and first quartiles: Q3 - Q1
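
As a quick sketch in Python, using the Q1 and Q3 values from the quartile card earlier in this deck (the function name `interquartile_range` is illustrative):

```python
def interquartile_range(q1, q3):
    """Spread of the middle 50% of the data: Q3 minus Q1."""
    return q3 - q1


# Q1 = 6.98 and Q3 = 13.43 are the values worked out on the quartile card.
print(round(interquartile_range(6.98, 13.43), 2))  # 6.45
```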

14
Q

what is variance

A
  • important measure of variation
  • shows variation about the mean

15
Q

how do you calculate the sample variance

A

sum of the squared differences around the arithmetic mean divided by the sample size minus 1
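
A minimal Python sketch of the sample variance as described. The function name `sample_variance` and the numbers are illustrative.

```python
def sample_variance(values):
    """Sum of squared differences from the mean, divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Illustrative numbers: mean is 3, squared differences sum to 10, and 10 / 4 = 2.5.
print(sample_variance([1, 2, 3, 4, 5]))  # 2.5
```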

16
Q

what is standard deviation

A
  • most important measure of variation
  • shows variation about the mean
  • has the same units as the original data
  • most practical and most commonly used measure of variation
17
Q

which measure of variation is most important

A

standard deviation

18
Q

how do you calculate standard deviation

A

square root of the sample variance, i.e. the square root of (the sum of the squared differences around the arithmetic mean divided by the sample size minus 1)
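
A minimal Python sketch of the sample standard deviation, with the usual n - 1 divisor. The function name `sample_standard_deviation` and the numbers are illustrative.

```python
import math


def sample_standard_deviation(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


# Illustrative numbers: the variance is 2.5, so the result is sqrt(2.5), about 1.58.
print(round(sample_standard_deviation([1, 2, 3, 4, 5]), 2))  # 1.58
```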

19
Q

What is coefficient of variation (CV)

A
  • measures relative variation
  • expressed as %
  • higher value indicates greater variability relative to the mean
  • used to compare two or more sets of data measured in different units
  • measures the scatter in the data relative to the mean
20
Q

what is the calculation for coefficient of variation

A

CV = (standard deviation / mean) × 100%
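
A short Python sketch of the CV using the standard library's `statistics.mean` and `statistics.stdev` (the sample standard deviation). The function name `coefficient_of_variation` and the data are illustrative. The same heights in metres and in centimetres give the same CV, which is why it suits comparisons across units.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Illustrative heights only, in two different units.
metres = [1.5, 1.6, 1.7, 1.8]
centimetres = [150, 160, 170, 180]
print(round(coefficient_of_variation(metres), 2))
print(round(coefficient_of_variation(centimetres), 2))  # same percentage
```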

21
Q

What is the shape of a distribution

A
  • describes how data is distributed
  • measures shape
  • can be symmetric or skewed
22
Q

If the mean and median are equal the shape will be

A

symmetric (or zero skewed)

23
Q

if the mean exceeds the median, the shape is

A

right-skewed

- the variable is called positively skewed or right-skewed

24
Q

if the median exceeds the mean the shape is

A

left-skewed

- also called negatively skewed
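
The mean-versus-median comparison from the last three cards can be sketched in Python as a rule of thumb. The function name `shape_from_mean_median` and the data are illustrative; with real data the two values are rarely exactly equal, so a tolerance would be a sensible addition.

```python
import statistics


def shape_from_mean_median(values):
    """Classify shape by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"  # mean pulled up by unusually high values
    if mean < median:
        return "left-skewed"   # mean pulled down by extremely low values
    return "symmetric"


# Illustrative numbers only.
print(shape_from_mean_median([1, 2, 3, 4, 5]))    # symmetric
print(shape_from_mean_median([1, 2, 3, 4, 50]))   # right-skewed
print(shape_from_mean_median([-50, 2, 3, 4, 5]))  # left-skewed
```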

25
Q

how does positive skew happen

A

when the mean is increased by some unusually high values

26
Q

how does negative skew happen

A

when the mean is reduced by some extremely low values

27
Q

When is a variable's distribution symmetrical (shape)

A

when there are no really extreme values (low and high values balance each other)

28
Q

What is the 5 number summary used for

A

to determine the shape of a distribution

29
Q

what does the 5 number summary include

A
  1. smallest value
  2. first quartile (Q1)
  3. the second quartile (Q2), i.e. the median
  4. the third quartile (Q3)
  5. the largest number
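
A sketch of collecting the five numbers in Python. It relies on `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions based on n + 1 and interpolates linearly between neighbours; that matches the averaging rule at halfway positions but can differ from textbook rounding rules at other fractions. The function name `five_number_summary` and the data are illustrative.

```python
import statistics


def five_number_summary(values):
    """Return (smallest, Q1, Q2, Q3, largest)."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, q2, q3, data[-1]


# Illustrative data: the 13 values 1 to 13.
print(five_number_summary(range(1, 14)))  # (1, 3.5, 7.0, 10.5, 13)
```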
30
Q

what is used to display data using 5-number summary

A

box-and-whisker plot

31
Q

using the 5 number summary to recognize symmetry in data

A
  1. the distance from x smallest to the median equals the distance from the median to x largest
  2. the distance from x smallest to Q1 equals the distance from Q3 to x largest
32
Q

5-number summary: what indicates a right-skewed distribution

A

the distance from the median to x largest is greater than the distance from x smallest to the median

also

the distance from Q3 to x largest is greater than the distance from x smallest to Q1

33
Q

5-number summary: what indicates a left-skewed distribution

A

the distance from x smallest to the median is greater than the distance from the median to x largest

also

the distance from x smallest to Q1 is greater than the distance from Q3 to x largest
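
The distance comparisons on the last three cards can be sketched in Python. The function name `shape_from_five_numbers` and the sample summaries are illustrative; the "mixed" label for cases where the two comparisons disagree is an assumption, since the notes do not cover that case.

```python
import math


def shape_from_five_numbers(smallest, q1, median, q3, largest):
    """Classify shape from the distances within a five-number summary."""
    left_half = median - smallest
    right_half = largest - median
    left_tail = q1 - smallest
    right_tail = largest - q3
    if math.isclose(left_half, right_half) and math.isclose(left_tail, right_tail):
        return "symmetric"
    if right_half > left_half and right_tail > left_tail:
        return "right-skewed"
    if left_half > right_half and left_tail > right_tail:
        return "left-skewed"
    return "mixed"  # assumption: the two comparisons disagree


# Illustrative summaries only.
print(shape_from_five_numbers(1, 3.5, 7, 10.5, 13))  # symmetric
print(shape_from_five_numbers(1, 2, 3, 6, 20))       # right-skewed
print(shape_from_five_numbers(0, 14, 17, 18, 19))    # left-skewed
```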

34
Q

what does the coefficient of correlation measure

A

measures the strength of the linear relationship between two quantitative variables

35
Q

how do you calculate the coefficient of correlation

A

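divide the sample covariance of X and Y by the product of their sample standard deviations: r = cov(X, Y) / (S_X × S_Y)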
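
One standard way to compute the sample coefficient of correlation r is to divide the sum of cross-products of deviations by the square root of the product of the two sums of squared deviations, which equals the covariance divided by the product of the standard deviations. The sketch below is a generic implementation rather than the course's own worked method; the function name `correlation` and the data are illustrative.

```python
import math


def correlation(xs, ys):
    """Sample coefficient of correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Illustrative numbers only: perfect positive and perfect negative linear relationships.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```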

36
Q

what are the features of correlation coefficient

A
  1. unit free
  2. ranges between -1 and 1
  3. the closer to -1, the stronger the negative linear relationship
  4. the closer to 1, the stronger the positive linear relationship
  5. the closer to 0 the weaker any positive or negative linear relationship
37
Q

what is the data analysis objective

A

should report the summary measures that best meet the assumptions about the data set

38
Q

what are the pitfalls in numerical descriptive measures

A
  1. data analysis is objective
  2. data interpretation is subjective (should be done in a fair, neutral and clear manner)

39
Q

What are some ethical considerations for numerical descriptive measures

A
  1. should document both good and bad results
  2. should be presented in a fair, objective and neutral manner
  3. should not use inappropriate summary measures to distort facts
40
Q

what is central tendency

A

the extent to which the values of a numerical variable group around a typical or central value