BA Chapter 4 Flashcards

1
Q

Arithmetic mean (mean)

A

Sum all observations and divide by the total number of observations. The population mean is denoted μ.

2
Q

Median

A

The middle value of a ranked data set. Half the data are above, half are below the median.
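As a minimal sketch of the two measures of location so far, the snippet below computes a mean and a median with Python's standard `statistics` module. The `orders` list is made-up illustration data, not from the course.

```python
from statistics import mean, median

# Hypothetical data: six monthly order counts.
orders = [12, 15, 11, 18, 14, 20]

# Mean: sum of the observations divided by how many there are (90 / 6).
average = mean(orders)

# Median: with an even count, the average of the two middle ranked values.
# Ranked data: 11, 12, 14, 15, 18, 20 -> (14 + 15) / 2
middle = median(orders)

print("mean:", average)
print("median:", middle)
```

With an odd number of observations the median is simply the single middle value of the ranked data.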

3
Q

Coefficient of kurtosis

A
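  • Kurtosis refers to the peakedness or flatness of the curve.
  • Coefficient of Kurtosis (CK): =KURT(data range) in Excel. Note that KURT returns excess kurtosis, which centers on 0 where the thresholds below center on 3.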
    • CK < 3 indicates the data is somewhat flat with a wide degree of dispersion.
    • CK > 3 indicates the data is somewhat peaked with less dispersion.
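The thresholds above can be checked with a small sketch. It assumes the population moment definition of kurtosis (fourth central moment divided by the squared second central moment), which is about 3 for a normal curve. This is not the sample-adjusted formula Excel uses, and both data sets are invented for illustration.

```python
from statistics import fmean

def coefficient_of_kurtosis(data):
    """Population moment kurtosis: m4 / m2**2 (assumed definition)."""
    mu = fmean(data)
    m2 = fmean([(x - mu) ** 2 for x in data])  # second central moment
    m4 = fmean([(x - mu) ** 4 for x in data])  # fourth central moment
    return m4 / m2 ** 2

# Hypothetical data sets with the same mean (5).
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # values spread evenly
peaked = [5, 5, 5, 5, 5, 5, 5, 1, 9]  # values piled at the center

ck_flat = coefficient_of_kurtosis(flat)
ck_peaked = coefficient_of_kurtosis(peaked)

print("flat:", ck_flat)      # below 3
print("peaked:", ck_peaked)  # above 3
```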
4
Q

Coefficient of skewness

A
  • Skewness describes lack of symmetry.
  • Coefficient of Skewness (CS): =SKEW(data range) in Excel.
  • CS is negative for left-skewed data.
  • CS is positive for right-skewed data.
  • |CS| > 1 suggests high degree of skewness.
  • 0.5 ≤ |CS| ≤ 1 suggests moderate skewness.
  • |CS| < 0.5 suggests relative symmetry.
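A rough sketch of how these rules of thumb could be applied follows. It assumes the population moment coefficient (third central moment over the cubed standard deviation), which differs slightly from Excel's sample-adjusted SKEW. The function names and the `incomes` data are hypothetical.

```python
from statistics import fmean

def coefficient_of_skewness(data):
    """Population moment skewness: m3 / m2**1.5 (assumed definition)."""
    mu = fmean(data)
    m2 = fmean([(x - mu) ** 2 for x in data])
    m3 = fmean([(x - mu) ** 3 for x in data])
    return m3 / m2 ** 1.5

def describe_skewness(cs):
    """Apply the |CS| thresholds from the card."""
    size = abs(cs)
    if size < 0.5:
        return "relatively symmetric"
    degree = "high" if size > 1 else "moderate"
    direction = "right" if cs > 0 else "left"
    return f"{degree} {direction} skew"

# Hypothetical incomes (in thousands) with one very large value on the right.
incomes = [30, 32, 35, 36, 38, 40, 120]
cs = coefficient_of_skewness(incomes)

print(cs, describe_skewness(cs))
```

The long tail sits on the right, so CS comes out positive; mirroring the data would flip the sign.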
5
Q

Skewness

A
  • Skewness describes lack of symmetry.
  • Coefficient of Skewness (CS): =SKEW(data range) in Excel.
  • CS is negative for left-skewed data.
  • CS is positive for right-skewed data.
  • |CS| > 1 suggests high degree of skewness.
  • 0.5 ≤ |CS| ≤ 1 suggests moderate skewness.
  • |CS| < 0.5 suggests relative symmetry.
6
Q

Relationship between variables is measured by:

  • Covariance
  • Correlation
A
  • Covariance is a measure of the linear association between two variables, X and Y. Its value depends on the units of measurement, so it is difficult to interpret.
  • Correlation is a measure of the linear association between two variables, X and Y, that does not depend on the units of measurement. It is known as the Pearson product moment correlation coefficient, r, and ranges from −1 to 1.
  • Relationships can also be visualized with a scatterplot, the standard chart for showing whether a relationship exists between two variables.
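To illustrate why covariance is harder to interpret than correlation, the sketch below computes both by hand and then changes the units of X. It assumes population (divide by N) formulas, and the advertising and sales figures are invented.

```python
from math import sqrt
from statistics import fmean

def covariance(xs, ys):
    """Population covariance: average product of paired deviations."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def correlation(xs, ys):
    """Pearson r: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired data.
ad_spend_thousands = [1, 2, 3, 4, 5]
sales_units = [12, 15, 19, 24, 26]

# Same spending, expressed in dollars instead of thousands of dollars.
ad_spend_dollars = [1000 * x for x in ad_spend_thousands]

cov_thousands = covariance(ad_spend_thousands, sales_units)
cov_dollars = covariance(ad_spend_dollars, sales_units)
r_thousands = correlation(ad_spend_thousands, sales_units)
r_dollars = correlation(ad_spend_dollars, sales_units)

print(cov_thousands, cov_dollars)  # covariance grows 1000-fold
print(r_thousands, r_dollars)      # r is unchanged
```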
7
Q

Quiz: What coefficient measures the linear relationship between two variables?

A

Correlation coefficient (r).

8
Q

Dispersion

A

The degree of variation in the data, i.e., the numerical spread of the data. Several statistical measures characterize dispersion: the range (max minus min), variance, and standard deviation (square root of the variance).
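The three measures named above can be sketched with the standard library. This example assumes the data are a whole population, so it uses `pvariance` and `pstdev`; the `delivery_days` values are hypothetical.

```python
from statistics import pstdev, pvariance

# Hypothetical data: delivery times in days for eight orders.
delivery_days = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(delivery_days) - min(delivery_days)  # max minus min
variance = pvariance(delivery_days)                   # mean squared deviation
std_dev = pstdev(delivery_days)                       # square root of variance

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

For sample data, `statistics.variance` and `statistics.stdev` divide by n − 1 instead of N.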

9
Q

Interquartile range

A
  • The difference between the third and first quartiles, Q3 − Q1, is often called the interquartile range (IQR), or the midspread.
  • NOTE: The first quartile (Q1) is the middle value between the smallest value and the median of the data set. The second quartile (Q2) is the median (the middle value) of the data. The third quartile (Q3) is the middle value between the median and the largest value of the data set.
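One way to turn the quartile definition in the note into code is to take the median of each half of the ranked data. This is only a sketch of that one convention: Excel's quartile functions and other software interpolate differently and can return slightly different values. The helper name and data are hypothetical, and the function assumes at least two observations.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]      # values below the median position
    upper = ordered[n - half:]  # values above the median position
    return median(lower), median(ordered), median(upper)

# Hypothetical data set with ten observations.
values = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]

q1, q2, q3 = quartiles(values)
iqr = q3 - q1  # interquartile range

print(q1, q2, q3, iqr)
```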

10
Q

Empirical Rule

A

If a data set has an approximately bell-shaped relative frequency histogram, then:

  1. About 68% of the observations lie within one standard deviation of the mean.
  2. About 95% of the observations lie within two standard deviations of the mean.
  3. About 99.7% of the observations lie within three standard deviations of the mean.
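The three percentages come from the normal curve, which the following sketch checks with `statistics.NormalDist`. It assumes an exactly normal distribution; real bell-shaped data will only approximate these shares.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Probability of landing within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```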
11
Q

Chebyshev’s Theorem

A
  • The Empirical Rule does not apply to all data sets, only to those that are bell-shaped, and even then is stated in terms of approximations. A result that applies to every data set is known as Chebyshev’s Theorem.
  • For any numerical data set:
    • At least 3/4 of the data lie within two standard deviations of the mean.
    • At least 8/9 of the data lie within three standard deviations of the mean.
    • In general, at least 1 − 1/k² of the data lie within k standard deviations of the mean, for any k > 1.
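As a quick check of the bound, the sketch below compares the guaranteed share, 1 − 1/k², with the actual share for a deliberately lopsided data set. The function names and the `skewed` data are hypothetical, and the population standard deviation is assumed.

```python
from statistics import fmean, pstdev

def chebyshev_bound(k):
    """Minimum share of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

def share_within(data, k):
    """Actual share of observations within k standard deviations of the mean."""
    mu, sigma = fmean(data), pstdev(data)
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)

# Hypothetical, strongly right-skewed data (not bell-shaped).
skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10, 30]

for k in (2, 3):
    print(k, chebyshev_bound(k), share_within(skewed, k))
```

The actual share is never below the bound, even though the Empirical Rule would not apply to data shaped like this.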
12
Q

Coefficient of variation & Return to risk

A
  • The coefficient of variation (CV) provides a relative measure of dispersion: CV = standard deviation / mean.
  • The return to risk = 1/CV.
  • Return to risk provides a relative measure of risk with respect to the return.
  • These relative measures are easier to compare across data sets than the standard deviation.
  • The smaller the CV, the less the risk.
  • The larger the return to risk, the better the return with respect to the risk involved.
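A small sketch of the comparison: two made-up funds with the same average return but different variability. It assumes CV is the standard deviation divided by the mean and uses the population standard deviation; the fund names and returns are hypothetical.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(values):
    """CV = standard deviation / mean (relative dispersion)."""
    return pstdev(values) / fmean(values)

def return_to_risk(values):
    """Return to risk = 1 / CV = mean / standard deviation."""
    return 1 / coefficient_of_variation(values)

# Hypothetical annual returns (%) for two funds, both averaging 10%.
fund_a = [8, 10, 12, 10, 10]
fund_b = [2, 18, 10, 4, 16]

cv_a, cv_b = coefficient_of_variation(fund_a), coefficient_of_variation(fund_b)
rr_a, rr_b = return_to_risk(fund_a), return_to_risk(fund_b)

print("CV:", cv_a, cv_b)              # fund_a has the smaller CV (less risk)
print("return to risk:", rr_a, rr_b)  # fund_a has the larger return to risk
```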
13
Q

Range

A

Highest value (Maximum) minus the lowest value (Minimum).

14
Q

Standard deviation (σ)

A

The square root of the Variance.

15
Q

Variance (σ²)

A

An overall measure of how far each value is from the mean.

  • An average of the squared deviations from the mean.
  • Units are squared.
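Written out, "an average of the squared deviations from the mean" corresponds to the population formula below. The sample version, shown alongside it, is the usual convention when the data are a sample; the symbols N, n, x̄, and s² are standard notation rather than something stated on the card.

```latex
% Population variance and standard deviation (N observations, mean \mu)
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}

% Sample variance (n observations, sample mean \bar{x})
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```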

(To square a value is to multiply it by itself.)

16
Q

Unimodal & Bimodal

A
  • Histograms that have only one “peak” are called unimodal.
  • Histograms that have two “peaks” are called bimodal.
17
Q

Quiz: Which one of the following is not a measure of dispersion?

  • Range
  • Variance
  • Standard Deviation
  • Midrange
  • Interquartile range
A

The answer is Midrange, which is a measure of location rather than dispersion. This question came from the quiz for chapter 4.

18
Q

Midrange

A

The average of the largest and smallest value in the data set. (Max + Min)/2.

19
Q

Mode

A

The observation that occurs most frequently.
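For a quick illustration, `statistics.multimode` returns every value that ties for most frequent, which also covers data with more than one mode. The data below are hypothetical.

```python
from statistics import multimode

# Hypothetical data: shoe sizes sold in one afternoon.
shoe_sizes = [8, 9, 9, 10, 10, 10, 11]

single_mode = multimode(shoe_sizes)     # one value occurs most often
tied_modes = multimode([1, 1, 2, 2, 3])  # two values tie for most frequent

print(single_mode)
print(tied_modes)
```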

20
Q

Outlier

A

Observations that are radically different from the rest. The Mean and Midrange are affected by outliers.
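The sketch below shows the effect on a small made-up data set: adding one extreme value moves the mean and the midrange a long way, while the median barely changes. The `midrange` helper is written here for illustration.

```python
from statistics import mean, median

def midrange(data):
    """Average of the largest and smallest values: (max + min) / 2."""
    return (max(data) + min(data)) / 2

# Hypothetical data, then the same data with one outlier added.
typical = [20, 22, 23, 25, 30]
with_outlier = typical + [200]

for label, data in (("typical", typical), ("with outlier", with_outlier)):
    print(label, "mean:", mean(data),
          "median:", median(data),
          "midrange:", midrange(data))
```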

21
Q

Population & Sample

A
  • Population: All items of interest for a particular decision or investigation (N).
    • Examples:
      • All married drivers over 25 years old
      • All subscribers to Netflix
  • Sample: a subset of the population (n)
    • Example:
      • A randomly selected list of individuals who rented a comedy from Netflix in the past year.
22
Q

Standardized value (z-score)

A
  • A standardized value is a transformed value. It is a relative measure of the distance an observation is from the mean.
  • Also called a z-value or z-score: z = (observation − mean) / standard deviation.
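A minimal sketch of standardizing a data set follows, assuming the usual definition: subtract the mean, then divide by the standard deviation. The population standard deviation is used here, and the `scores` data are hypothetical.

```python
from statistics import fmean, pstdev

def z_scores(data):
    """Standardize each value: (x - mean) / standard deviation."""
    mu, sigma = fmean(data), pstdev(data)
    return [(x - mu) / sigma for x in data]

# Hypothetical data with mean 5 and standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = z_scores(scores)

# A z-score of 2 means the value sits two standard deviations above the mean.
for x, z in zip(scores, standardized):
    print(x, z)
```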
23
Q

Statistical Thinking

A

Statistical Thinking = Critical Thinking!!! Statistical Thinking is a philosophy of learning and action for improvement, based on principles that:

  1. All work occurs in a system of interconnected processes.
  2. Variation exists in all processes.
  3. Better performance results from understanding and reducing variation.