Module 3 Flashcards

1
Q

Population vs sample mean

A

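Population mean: μ = Σx / N
Sample mean: x̄ = Σx / n

Symbols:
Σ: Sum
μ: Population mean
x̄: Sample mean
N: Population size
n: Sample size
Sample variance uses n−1 in the denominator so that it is an unbiased estimator of the population variance.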
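
A minimal sketch of the N versus n−1 distinction, using made-up values (the list `values` is hypothetical, not course data). The mean is computed the same way in both cases; only the variance denominator changes.

```python
import statistics

# Hypothetical data: treated first as a whole population, then as a sample.
values = [4, 8, 6, 5, 7]

# The arithmetic for the mean is identical for a population and a sample.
mean = sum(values) / len(values)

# Squared deviations from the mean are shared by both variance formulas.
squared_devs = [(x - mean) ** 2 for x in values]

# Population variance divides by N; sample variance divides by n - 1.
population_variance = sum(squared_devs) / len(values)
sample_variance = sum(squared_devs) / (len(values) - 1)

# Cross-check against the standard library.
print(population_variance, statistics.pvariance(values))
print(sample_variance, statistics.variance(values))
```

The sample variance comes out larger because dividing by n−1 compensates for measuring deviations from the sample mean rather than the true population mean.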

2
Q

Measures of central location

A

Mean, median, mode

3
Q

Median

A

Definition: The middle value of a sorted dataset; it divides the data into two equal halves.
Key Point:
The median is useful when outliers are present, because extreme values barely move it.
If the mean and median differ substantially, the data are likely skewed or contain outliers.
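
A short illustration of why the median resists outliers, assuming a hypothetical salary list (the figures are invented for the example):

```python
import statistics

# Hypothetical salaries, then the same list with one extreme value added.
salaries = [40_000, 42_000, 45_000, 47_000, 50_000]
with_outlier = salaries + [500_000]

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

# The mean jumps sharply; the median moves only slightly.
print(mean_before, mean_after)
print(median_before, median_after)
```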

4
Q

Mode

A

Definition: The value(s) occurring most frequently in a dataset.

Types:
Unimodal: One mode
Bimodal: Two modes
Multimodal: Three or more modes (less useful)

Examples:
Acetech salaries: $40,000 is the mode (most common).
Women’s sweatshirt sizes: “L” is the mode.
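
A small sketch of finding modes with Python's standard library. Both lists are hypothetical and are not the Acetech or sweatshirt data from the course.

```python
import statistics

# Hypothetical sweatshirt sizes (categorical data).
sizes = ["M", "L", "S", "L", "XL", "L", "M"]
size_modes = statistics.multimode(sizes)

# Hypothetical numeric data with two equally frequent values (bimodal).
scores = [1, 1, 2, 2, 3]
score_modes = statistics.multimode(scores)

print(size_modes)   # one mode: unimodal
print(score_modes)  # two modes: bimodal
```

`statistics.multimode` returns every most-frequent value, so it works for unimodal and bimodal data alike, and for categories as well as numbers.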

5
Q

Differences between mean, median, and mode

A
  • Use mean when data is symmetrical and free of outliers.
  • Use median when data contains outliers or is skewed.
  • Use mode for categorical variables or to identify most frequent values.
6
Q

Weighted mean

A

A mean where different data points are assigned specific weights.
Useful when observations contribute unequally to the average.
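
A minimal sketch of a weighted mean, assuming hypothetical course scores weighted by credit hours (both lists are invented for illustration):

```python
# Hypothetical scores and the credit hours used as their weights.
scores = [80, 90, 70]
credits = [2, 5, 3]

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean = sum(w * x for w, x in zip(credits, scores)) / sum(credits)

# Ordinary mean for comparison: every score counts equally.
unweighted_mean = sum(scores) / len(scores)

print(weighted_mean, unweighted_mean)
```

The weighted mean is pulled toward 90 because that score carries the most credit hours.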

7
Q

Histograms

A

Purpose:
Visualizes data distribution, clustering, spread, and shape.

Key Characteristics:
Symmetric: Mirror image on both sides of the center.
Skewed:
Positive: Long right tail.
Negative: Long left tail.
Symmetric and Unimodal Distribution:
Mean = Median = Mode.

8
Q

Percentiles

A

Definition:

Divide data into 100 equal parts.

Specific percentiles:
25th Percentile: Q1
50th Percentile: Q2 (Median)
75th Percentile: Q3

Application:
Ideal for large datasets.
Used in five-number summaries (Minimum, Q1, Median, Q3, Maximum)

9
Q

What is the five number summary?

A

Minimum, Q1, Median (Q2), Q3, Maximum.

Purpose:
Summarizes the spread and relative position of data.
Example: Growth and Value variables.
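
A sketch of a five-number summary on hypothetical data (not the Growth and Value variables). It assumes the "inclusive" quartile method of `statistics.quantiles`; textbooks and software use several quartile conventions, so Q1 and Q3 can differ slightly elsewhere.

```python
import statistics

# Hypothetical dataset.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary)
```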

10
Q

Measures of dispersion

A

Variance and standard deviation

Variance:
The average of squared differences between observations and the mean.
Units are squared.

Standard Deviation:
Square root of variance, returning to the original units.
Represents typical spread around the mean.

11
Q

Importance of variance and standard deviation

A

Variance: Highlights how data points differ from the mean.

Standard Deviation: Provides a clear measure of spread in the same units as the data.

Use Cases: Evaluate consistency, risk, or variability in data (e.g., financial analysis).
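
An illustrative comparison, assuming two hypothetical series of annual returns in percent (the numbers are invented). Both have the same mean, so only the standard deviation reveals which one is riskier.

```python
import statistics

# Hypothetical annual returns (%) for two investments.
returns_a = [4, 5, 6, 5, 5]
returns_b = [-5, 15, 5, 0, 10]

mean_a = statistics.mean(returns_a)
mean_b = statistics.mean(returns_b)

# Sample standard deviation, expressed in the same units as the data (%).
sd_a = statistics.stdev(returns_a)
sd_b = statistics.stdev(returns_b)

print(mean_a, mean_b)
print(round(sd_a, 2), round(sd_b, 2))
```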

12
Q

The Empirical Rule

A

Bell-shaped distribution:
68% of data within 1 standard deviation of the mean.
95% within 2 standard deviations.
99.7% within 3 standard deviations.
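
The three percentages can be checked against a normal distribution. This sketch assumes a standard normal (mean 0, standard deviation 1), which is one bell-shaped case; real data only approximately follow the rule.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Probability of falling within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in coverage.items():
    print(f"within {k} SD: {share:.1%}")
```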

13
Q

Z-scores for Outlier Detection

A

Measures how many standard deviations a value is from the mean.

For a symmetric, bell-shaped distribution, outliers have Z-scores less than -3 or greater than +3.
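
A minimal sketch of flagging outliers by Z-score. The data are hypothetical, and the population standard deviation (`statistics.pstdev`) is used as σ; using the sample standard deviation instead is an equally common choice.

```python
import statistics

# Hypothetical measurements: 20 typical values plus one extreme value.
data = [48, 49, 50, 51, 52] * 4 + [90]

mu = statistics.mean(data)
sigma = statistics.pstdev(data)

# Z-score: number of standard deviations a value lies from the mean.
z_scores = [(x - mu) / sigma for x in data]

# Flag values more than 3 standard deviations from the mean.
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 3]
print(outliers)
```

In very small datasets no value can reach a Z-score of 3, so the rule is most informative with a reasonably large number of observations.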

14
Q

What is covariance?

A

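Measures the direction of the linear relationship between two variables: positive when they tend to move together, negative when they move in opposite directions.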

15
Q

What is correlation?

A

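Covariance standardized by the product of the two standard deviations. It ranges from -1 to 1 and measures the strength and direction of the linear relationship.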
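
A sketch of how correlation is built from covariance, written out by hand with sample (n−1) formulas. The helper names `sample_covariance` and `correlation` and the data are hypothetical.

```python
import math

def sample_covariance(x, y):
    """Average product of paired deviations, using n - 1 in the denominator."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    products = [(a - mean_x) * (b - mean_y) for a, b in zip(x, y)]
    return sum(products) / (len(x) - 1)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    var_x = sample_covariance(x, x)  # covariance of x with itself is its variance
    var_y = sample_covariance(y, y)
    return sample_covariance(x, y) / math.sqrt(var_x * var_y)

# Hypothetical paired data: y rises with x, y_reversed falls with x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_reversed = y[::-1]

print(sample_covariance(x, y), correlation(x, y))
print(sample_covariance(x, y_reversed), correlation(x, y_reversed))
```

Covariance depends on the units of the data, while correlation is unit-free, which is why only correlation can be compared across different pairs of variables.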

16
Q

What is the Z-score equation, and what does each symbol stand for?

A

Z-score = (X − μ) / σ

X is the value,
μ is the mean,
σ is the standard deviation.

17
Q

Standard deviation

A

Square root of variance