550 Flashcards

1
Q

LOM - Nominal

A

Characteristics: Categorical
Math: Equality (=, !=)
Central Tendency: Mode
Variability: None

2
Q

LOM - Ordinal

A

Characteristics: Categorical, Rank Order
Math: Equality (=, !=), Comparison (>,<)
Central Tendency: Mode, Median
Variability: Range, Interquartile Range

3
Q

LOM - Interval

A

Characteristics: Categorical, Rank Order, Equal Spacing
Math: Equality (=, !=), Comparison (>,<), Add/Subtract (+/-)
Central Tendency: Mode, Median, Arithmetic Mean
Variability: Range, Interquartile Range, Standard Deviation, Variance

4
Q

LOM - Ratio

A

Characteristics: Categorical, Rank Order, Equal Spacing, True Zero
Math: Equality (=, !=), Comparison (>,<), Add/Subtract (+/-), Mult/Div (x /)
Central Tendency: Mode, Median, Arithmetic Mean, Geometric Mean
Variability: Range, Interquartile Range, Standard Deviation, Variance, Relative Standard Deviation

5
Q

LOM - Nominal Numeric

A

Non-numeric categories coded as numbers are not really numbers and have no quantitative meaning, e.g. T = 0, F = 1

6
Q

5 Number Summary

A

Minimum
1st Quartile - 25%
Median
3rd Quartile - 75%
Maximum

Displayed using Boxplots
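
A minimal sketch of computing the five values with Python's standard library. The function name `five_number_summary` and the sample data are illustrative assumptions, and quartile values depend on the interpolation convention chosen (here `method="inclusive"`), so other tools may report slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    # Other quartile conventions can give slightly different Q1/Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up sample
print(five_number_summary(data))
```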

7
Q

Range vs IQR

A

Range is highly influenced by outliers
Interquartile Range (IQR) is resistant to outliers
IQR = Q3 - Q1, based on the 1st and 3rd quartiles
High Outlier > Q3 + 1.5 × IQR
Low Outlier < Q1 - 1.5 × IQR
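
The 1.5 × IQR outlier rule can be sketched as follows. The function name `iqr_outliers`, the parameter `k`, and the sample data are assumptions made for illustration, and the quartiles use the `"inclusive"` convention of `statistics.quantiles`, which may differ from other software.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # made-up sample with one extreme value
print(iqr_outliers(data))
```

In this sample the range is stretched by the single extreme value, while Q1 and Q3 (and so the IQR) are unchanged by it.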

8
Q

Standard Deviation

A

Roughly the average distance of the values in a data set from the mean.
Smallest is 0.
Sensitive to outliers and skew.
Square root of the variance: sum the squared differences between each value and the mean, divide by the number of values (n for a population, n - 1 for a sample), then take the square root.
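
A short sketch of the calculation, assuming the data are treated as a whole population (dividing by n). The function name `population_sd` and the sample data are illustrative; `statistics.stdev` is shown alongside because it divides by n - 1 instead.

```python
import math
import statistics

def population_sd(values):
    """Standard deviation dividing by n (treats the data as the whole population)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample; the mean is 5
print(population_sd(data))       # divides by n
print(statistics.stdev(data))    # sample version, divides by n - 1
```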

9
Q

Measures of Central Tendency

A

Normal aka No Skew: Mean = Median = Mode
Left Skewed aka Right Hump: Mean < Median < Mode
Right Skewed aka Left Hump: Mode < Median < Mean
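
As a quick illustration of the right-skewed ordering, here is a small made-up data set with a long right tail. The data are an assumption chosen for the example, and real skewed data will not always follow the ordering this cleanly.

```python
import statistics

right_skewed = [1, 2, 2, 3, 4, 5, 12]  # made-up data with a long right tail

mode = statistics.mode(right_skewed)
median = statistics.median(right_skewed)
mean = statistics.mean(right_skewed)

# The large value 12 pulls the mean to the right of the median and mode.
print(mode, median, mean)
```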

10
Q

Normal Distribution + Empirical Rule

A

Bell-shaped, unimodal, symmetrical distribution of a quantitative variable with mean=median=mode.

About 68% of values fall within 1 standard deviation of the mean
About 95% within 2 standard deviations
About 99.7% within 3 standard deviations
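
The three percentages can be checked numerically. This sketch relies on the identity P(|Z| < k) = erf(k / sqrt(2)) for a normal variable; the function name `share_within` is an assumption made for the example.

```python
import math

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    # For a normal variable, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```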

11
Q

Z-Score

A

A standardized score that measures how many standard deviations a data point is from the mean of a group.
Z = (value - mean)/sd
Z = 0 means the value equals the mean. Z = 1 means the value is 1 sd above the mean.
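
The formula translates directly into code. The function name `z_score` and the numbers (a score of 85 in a group with mean 70 and sd 10) are made up for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(85, 70, 10))  # a value 15 points above the mean, with sd 10
```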

12
Q

Kurtosis

A

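Normal curve: kurtosis = 3, i.e. excess kurtosis = 0
Thin, pointy curve with heavy tails (leptokurtic): kurtosis > 3, excess kurtosis positive (+)
Flat, spread-out curve with light tails (platykurtic): kurtosis < 3, excess kurtosis negative (-)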
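
A rough sketch of one common way to compute kurtosis, assuming the population (moment-based) definition: the fourth central moment divided by the squared variance. The function name `kurtosis` and the data are illustrative, and statistical packages often report the excess value (kurtosis minus 3) with small-sample corrections instead.

```python
def kurtosis(values):
    """Population kurtosis: fourth central moment divided by the squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]  # made-up, evenly spread data
print(kurtosis(flat), kurtosis(flat) - 3)  # kurtosis and excess kurtosis
```

Evenly spread data like this comes out below 3, matching the flat case on the card.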

13
Q

Descriptive vs Inferential Statistics

A

Descriptive: numbers that describe the data set, e.g. batting average
vs
Inferential: using confidence intervals and significance tests to make inferences about a population from a sample, e.g. how likely a player is to perform well in the future

14
Q

Mean vs Median vs Trimmed Mean

A

Mean: balance point of the distribution, sensitive to extreme values.
Median: equal-areas point, resistant to extreme values.
Trimmed mean: the average calculated after removing a certain percentage of the highest and lowest values.
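
A small sketch comparing the three measures on data with one extreme value. The function name `trimmed_mean`, its `proportion` parameter (the share cut from each end, assumed to be below 0.5), and the data are assumptions made for the example.

```python
import statistics

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    trimmed = ordered[k:len(ordered) - k]
    return statistics.mean(trimmed)

data = [1, 2, 3, 4, 100]  # made-up sample with one extreme value
print(statistics.mean(data), statistics.median(data), trimmed_mean(data, 0.2))
```

Here the extreme value drags the mean upward, while the median and the 20% trimmed mean stay near the bulk of the data.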

15
Q

Histogram
Box and Whisker Plot
Dotplot/Stemplot
Bar Graphs

A

Histogram: good for visualizing the shape of a large amount of interval or ratio data
Box and Whisker Plot: useful for showing the distribution of data
Dotplot/Stemplot: best for small sets of quantitative data
Bar Graphs: for categorical data

16
Q

Probability vs Statistics

A

Probability: how often different events occur
- We know the model (aka the conditions) but we don’t know the data
Statistics: we know the data but we don’t know the model (aka the conditions)
- The core of inferential statistics is figuring out the model

17
Q

Probability Distribution

A

Outcomes of a trial must be disjoint aka mutually exclusive aka can’t occur at the same time.
Probabilities must be between 0 and 1.
Probabilities must sum to 1.
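
These rules can be sketched as a simple check. The function name `is_valid_distribution` is hypothetical, and the sketch assumes the dictionary keys already represent disjoint outcomes, so only the two numeric rules are tested.

```python
import math

def is_valid_distribution(probabilities):
    """Check a mapping of disjoint outcomes to probabilities against the numeric rules."""
    values = list(probabilities.values())
    each_in_range = all(0 <= p <= 1 for p in values)
    # isclose allows for floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(values), 1.0)
    return each_in_range and sums_to_one

print(is_valid_distribution({"heads": 0.5, "tails": 0.5}))
print(is_valid_distribution({"a": 0.7, "b": 0.7}))
```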

18
Q

Distribution

A

A function that shows the possible values for a variable and how often those values occur
Discrete
- Poisson
- Binomial
- Uniform
- Geometric
Continuous
- F
- Uniform
- Normal
- Chi-Square
- T

19
Q

Random Variables

A

A numerical value that depends on the outcome of a chance experiment.
Random variable T = number of tails occurring in two coin tosses.
T is a random variable since it takes numerical values (0, 1, 2) and is based on a random process (the coin toss).
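
The two-toss example can be enumerated directly. This sketch assumes a fair coin, so the four outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT (fair coin assumed)
outcomes = list(product("HT", repeat=2))

# T maps each outcome to a number: how many tails it contains
tails_counts = Counter(outcome.count("T") for outcome in outcomes)

# Probability of each value of T as an exact fraction
distribution = {t: Fraction(count, len(outcomes)) for t, count in tails_counts.items()}
print(distribution)
```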

20
Q

Discrete vs Continuous Random Variables

A

Discrete:
- Isolated points along the number line
- # of items purchased, # of customers on website
- Counting
- Histograms
Continuous:
- All points in some interval
- Temperature of a freezer, weight of a pineapple
- Measuring
- Density Curve

21
Q

Uniform Distribution (Discrete/Continuous)

A

Distribution in which all outcomes are equally likely
Discrete
- Count
- Used for a simple random sample from a population, or to model equally likely events such as rolling a die
Continuous
- Measure