EXAM 1 Flashcards
Descriptive Statistics
Pictorial and tabular methods (stem and leaf, histogram) Numerical measures (mean, median, range, variance)
Inferential Statistics
Draw conclusions about a parameter (Confidence intervals, Hypothesis testing)
Population Sample Variable Observation Data
Population- a well-defined collection of objects
Sample- a subset of the population
Variable- characteristics of the objects
Observation- an observed value of a variable
Data- a collection of observations
Variable
Characteristic whose value may change from one object to another in the population
Discrete (# of people in a room)
Continuous (length of a road)
Stem and leaf plot
Leading digit(s) of each value form the stem on the main axis; the trailing digit is the leaf listed beside its stem
In order smallest to largest
Visualize symmetric distribution, peaks and outliers
Dotplot
Represent observations with dots above the measurement
Stack dots vertically
Visualize typical values, spread of set, extremes, and gaps
Histogram
When the set is large
Relative frequency over measurement
Unimodal (one peak), Bimodal (two peaks), Multimodal (many peaks), Symmetric (left=right), Pos skewed (right tail), Neg skewed (left tail)
Relative Frequency
Relative frequency of a value c = (number of times c occurs) / (total number of observations)
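A minimal sketch of the relative frequency calculation. The sample values and the helper name `relative_frequencies` are made up for illustration.

```python
from collections import Counter

def relative_frequencies(observations):
    """Map each distinct value to (count of that value) / (total observations)."""
    n = len(observations)
    counts = Counter(observations)
    return {value: count / n for value, count in counts.items()}

# Hypothetical sample: number of people in a room, recorded 8 times.
data = [2, 3, 3, 4, 3, 2, 5, 3]
rel_freq = relative_frequencies(data)

# The value 3 occurs 4 times out of 8 observations.
print(rel_freq[3])
```

The relative frequencies of all distinct values always add up to 1.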
Sample mean
Mean of all observations in the data set
x bar = ∑ x / n
Measures location/ center of a sample
Population mean, μ, is the average of the entire population
Sample median
Middle value of the sample, x~
Order from small to large and find middle value
Average the value if there are two middle values
Population median, μ~
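A short sketch comparing the sample mean and sample median using Python's standard `statistics` module. The data values are invented; the point is that one extreme value pulls the mean but not the median.

```python
import statistics

# Hypothetical sample with one extreme value (100).
data = [3, 5, 7, 9, 100]

x_bar = statistics.mean(data)       # sum of observations divided by n
x_tilde = statistics.median(data)   # middle value of the ordered sample

# With an even number of observations the two middle values are averaged.
even_data = [3, 5, 7, 9]
x_tilde_even = statistics.median(even_data)

print(x_bar, x_tilde, x_tilde_even)
```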
Trimmed mean
Mean computed after removing a fixed percentage of the smallest and largest values, which makes it less sensitive to outliers than the ordinary mean
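One possible way to compute a trimmed mean, as a sketch. The function name `trimmed_mean`, the rounding-down of the number of values to drop, and the data are assumptions made for this example.

```python
def trimmed_mean(data, percent):
    """Drop `percent` % of the values from each end of the sorted data, then average.

    Assumption: the number of values dropped per end is rounded down.
    """
    xs = sorted(data)
    k = int(len(xs) * percent / 100)    # values to remove from each end
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

# Hypothetical sample with a low and a high outlier.
data = [1, 5, 6, 7, 8, 9, 10, 11, 12, 100]

plain_mean = sum(data) / len(data)
trimmed = trimmed_mean(data, 10)   # removes the 1 and the 100

print(plain_mean, trimmed)
```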
Quartiles
Median separates the sample into lower and upper sub-samples
Q1 is the median of the lower half
Q2 is the median
Q3 is the median of the upper half
IQR = Q3 - Q1
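A sketch of the quartile rule described on this card. Conventions differ for an odd sample size; this version assumes the median is left out of both halves (some textbooks include it in both instead). The data are made up.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    Assumption: for odd n the median itself is excluded from both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return statistics.median(lower), statistics.median(xs), statistics.median(upper)

# Hypothetical sample with an even number of observations.
data = [1, 3, 4, 6, 7, 8, 10, 12]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```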
Variance
Average squared deviation from the sample mean (using divisor n-1)
s^2 = ∑ (xi - x bar)^2 / (n-1)
Standard Deviation
s = sqrt(s^2)
Square root of variance
Degree of Freedom
n-1
Sxx
Numerator of s^2
Sxx = ∑ (xi - xbar)^2
Sxx = ∑ xi^2 - [ (∑ xi)^2 / n ]
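A quick numerical check, on invented data, that the two Sxx formulas agree and that Sxx / (n-1) matches the sample variance from Python's `statistics` module.

```python
import math
import statistics

# Hypothetical sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = sum(data) / n

# Definition: sum of squared deviations from the sample mean.
sxx_definition = sum((x - x_bar) ** 2 for x in data)

# Shortcut: sum of squares minus (sum)^2 / n.
sxx_shortcut = sum(x ** 2 for x in data) - sum(data) ** 2 / n

s_squared = sxx_definition / (n - 1)   # sample variance, n-1 degrees of freedom
s = math.sqrt(s_squared)               # sample standard deviation

print(sxx_definition, sxx_shortcut, s_squared, s)
```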
Boxplots
Five number summary (min, Q1, x~, Q3, max)
Center, spread, symmetry, outliers
1. Draw a horizontal axis; find Q1, Q2, Q3, and IQR
2. Place a rectangle above the axis with its edges at Q1 and Q3
3. Place a vertical line inside the rectangle at Q2
4. Draw whiskers out to the smallest and largest points
Experiment
Any activity or process whose outcome is subject to uncertainty
Sample space
the set of all possible outcomes of that experiment, S
Event
Any collection or subset of outcomes contained in the sample space S
Use upper case letters to denote events
Simple: One outcome
Compound: More than one outcome
Complement
A’
all outcomes of S that are not in A
Union
A ∪ B
All outcomes that are either in A or B or in both events
Intersection
A ∩ B
All outcomes that are in both A and B
Empty Set
The null event
No outcomes
“∅”
Mutually Exclusive or Disjoint
A and B have no outcomes in common
A ∩ B = ∅
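The event operations above map directly onto Python's built-in `set` type. A sketch using one roll of a six-sided die as the experiment; the particular events A, B, and `odd` are chosen only for illustration.

```python
# Sample space for one roll of a die.
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}      # "roll is even"
B = {4, 5, 6}      # "roll is at least 4"
odd = {1, 3, 5}    # "roll is odd"

A_complement = S - A        # outcomes of S not in A
A_union_B = A | B           # outcomes in A or B or both
A_intersect_B = A & B       # outcomes in both A and B

# A and `odd` share no outcomes, so their intersection is the empty set.
mutually_exclusive = A.isdisjoint(odd)

print(A_complement, A_union_B, A_intersect_B, mutually_exclusive)
```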
Probability
Measure of the chance that A will occur
Axioms
P(A) ≥ 0
P(S) = 1
If A1, A2, ... are mutually exclusive, P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...
Consequence: P(A) ≤ 1
Interpreting relative frequency
Relative frequency fluctuates a lot when the experiment is repeated only a few times
Approaches the limiting relative frequency, P(A), as the number of repetitions grows
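A small simulation sketch of this idea, assuming a fair coin (limiting relative frequency 0.5). The function name, the seed, and the checkpoint counts are arbitrary choices for the example.

```python
import random

def running_relative_frequency(n_flips, seed=0):
    """Simulate fair coin flips; record the relative frequency of heads at checkpoints."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    heads = 0
    checkpoints = {}
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5     # True counts as 1
        if i in (10, 100, 1000, 10000, 100000):
            checkpoints[i] = heads / i
    return checkpoints

freqs = running_relative_frequency(100_000)
for flips, freq in freqs.items():
    print(flips, freq)
```

The early checkpoints can sit noticeably away from 0.5, while the later ones should be close to it.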
Complement Law
P(A) = 1 − P(A′)
Addition Law
P (A∪B) = P (A) + P (B) − P (A∩B)
Mutually Exclusive: P (A∪B) = P (A) + P (B)
P (A∪B∪C) = P (A) + P (B) + P (C) − P (A∩B) − P (A∩C) − P (B∩C) + P (A∩B∩C)
Equally likely outcome
Fair coin, fair die
p = 1/N
N = # of outcomes
Equally likely multiple outcomes
p = N(A) / N
N = # of possible outcomes
N(A) = # of outcomes in the event in question, A
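With equally likely outcomes, probability reduces to counting, which also makes the addition law easy to check. A sketch for one roll of a fair die; the events A, B, C and the helper `prob` are illustrative assumptions.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # equally likely outcomes of a fair die

def prob(event):
    """P(event) = N(event) / N for equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}
B = {4, 5, 6}
C = {1, 2}

# Two events: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
two_event_rhs = prob(A) + prob(B) - prob(A & B)

# Three events: singles, minus pairwise intersections, plus the triple intersection.
three_event_rhs = (prob(A) + prob(B) + prob(C)
                   - prob(A & B) - prob(A & C) - prob(B & C)
                   + prob(A & B & C))

print(prob(A | B), two_event_rhs)
print(prob(A | B | C), three_event_rhs)
```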
Product rule for ordered pairs
Number of possible pairs = n1 * n2
n1 = # of options for the first element, n2 = # of options for the second element
Permutations
An ordered subset
P k,n = n! / (n−k)!
Combination
An unordered subset
C k,n = n! / [ k! (n−k)! ]
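The three counting rules above can be checked by brute-force enumeration with `itertools` and compared against `math.perm` and `math.comb` (available in Python 3.8 and later). The sizes n1 = 3, n2 = 4, n = 5, k = 2 are arbitrary example values.

```python
import math
from itertools import combinations, permutations, product

# Product rule: n1 choices for the first element, n2 for the second.
n1, n2 = 3, 4
pairs = list(product(range(n1), range(n2)))

# Permutations and combinations of size k from n distinct items.
items = "ABCDE"
n, k = len(items), 2
ordered_subsets = list(permutations(items, k))     # order matters
unordered_subsets = list(combinations(items, k))   # order ignored

print(len(pairs), n1 * n2)
print(len(ordered_subsets), math.perm(n, k))       # n! / (n-k)!
print(len(unordered_subsets), math.comb(n, k))     # n! / [k! (n-k)!]
```

Each unordered subset of size k corresponds to k! ordered ones, which is why the combination count is the permutation count divided by k!.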
Conditional Probability
Probability of A given that B has occurred
P (A|B) = P (A ∩ B) / P (B)
Multiplication Rule with cond prob
P (A ∩ B) = P (A|B) P (B) = P (B|A) P (A)
Bayes’ Theorem / Law of total probability
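Law of total probability: P(B) = ∑ P(B | Ai) P(Ai), where A1, ..., Ak are mutually exclusive and exhaustive
Bayes' Theorem: P(Aj | B) = P(B | Aj) P(Aj) / ∑ P(B | Ai) P(Ai)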
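A worked sketch of the law of total probability and the Bayes posterior built on it. The priors P(Ai) and conditional probabilities P(B | Ai) are invented numbers, and the Ai are assumed to be mutually exclusive and exhaustive.

```python
# Hypothetical partition A1, A2, A3 of the sample space.
priors = [0.5, 0.3, 0.2]            # P(Ai), must sum to 1
likelihoods = [0.01, 0.02, 0.05]    # P(B | Ai)

# Law of total probability: P(B) = sum of P(B | Ai) * P(Ai).
p_b = sum(l * p for l, p in zip(likelihoods, priors))

# Bayes' theorem: P(Ai | B) = P(B | Ai) * P(Ai) / P(B).
posteriors = [l * p / p_b for l, p in zip(likelihoods, priors)]

print(p_b)
print(posteriors)
```

Here A3 has the smallest prior but the largest posterior, because B is much more likely under A3 than under the others.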
Independence
Two events A and B are independent if P (A|B) = P (A), and are dependent otherwise
Multiplicative rule when independent
P (A ∩ B) = P(A) * P(B)
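A sketch that checks independence both ways, via P(A|B) = P(A) and via the multiplicative rule, for one roll of a fair die. The events and helper names are assumptions made for the example.

```python
from fractions import Fraction

S = set(range(1, 7))   # fair die, equally likely outcomes

def prob(event):
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b)."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """True when P(a ∩ b) = P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A = {2, 4, 6}       # even
B = {1, 2, 3, 4}    # at most 4
C = {4, 5, 6}       # at least 4

print(cond_prob(A, B), prob(A), independent(A, B))   # A and B
print(cond_prob(A, C), prob(A), independent(A, C))   # A and C
```

Knowing B leaves the chance of an even roll unchanged, while knowing C raises it, so A is independent of B but not of C.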
Random variable
Any rule that associates a number with each outcome in S
Upper case letters
Bernoulli random variable
Any random variable whose only possible values are 0 and 1
Discrete vs Continuous
Discrete = countable set of possible values (number of something, whole numbers)
Continuous = possible values consist of every number in an interval (many decimals); probability of any single exact value = 0
Probability mass function (discrete)
Gives the probability of each possible value of a discrete random variable: p(x) = P(X = x)
For every possible value of x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed.
Cumulative distribution function
Describes the distribution by the cumulative probability that X is less than or equal to x: F(x) = P(X ≤ x)
Creates a step function
Cumulative Distribution Equation
P(a ≤X ≤b) = F(b) −F(a−)
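A sketch of how a cdf is built from a pmf and how F(b) − F(a−) gives an interval probability. The pmf is made up; `cdf_left` is a hypothetical helper standing in for F(a−), the cumulative probability strictly below a.

```python
from fractions import Fraction

# Hypothetical pmf of a discrete random variable X.
pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def cdf_left(a):
    """F(a-) = P(X < a), the cdf just below a."""
    return sum((p for v, p in pmf.items() if v < a), Fraction(0))

def prob_between(a, b):
    """P(a <= X <= b) = F(b) - F(a-)."""
    return cdf(b) - cdf_left(a)

print(cdf(1), cdf(1.5), cdf(3))   # flat between possible values: a step function
print(prob_between(1, 2))         # same as p(1) + p(2)
```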
Expected Value
E(X) = μx = ∑ x ·p(x) = population (distribution) mean of X, not the sample mean
E(X^2) = ∑ x^2 ·p(x) = expected value of X^2, not [E(X)]^2
Expectation Variance
V (X) = ∑ (x −μ)^2 ·p(x) = E[(X −μ)^2]
SD = sqrt( E[(X −μ)^2] )
alt.
E[(X −μ)^2] = E(X^2) - [E(X)]^2
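A sketch computing the expected value and variance of a discrete random variable from its pmf, checking that the definition and the shortcut formula agree. The pmf is invented for illustration.

```python
import math
from fractions import Fraction

# Hypothetical pmf of X.
pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

mu = sum(x * p for x, p in pmf.items())          # E(X)
e_x2 = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)

var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
var_shortcut = e_x2 - mu ** 2                                     # E(X^2) - [E(X)]^2

sd = math.sqrt(var_definition)

print(mu, e_x2, var_definition, var_shortcut, sd)
```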
Variance of Linear Function
V (aX + b) = σ^2_(aX+b) = a^2 · σ^2_X and σ_(aX+b) = |a| · σ_X
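A numerical check of the linear-function rule on a made-up pmf, with a = -3 and b = 5 chosen arbitrarily. The shift b drops out of the variance, and a negative a is why the standard deviation uses |a|.

```python
import math
from fractions import Fraction

# Hypothetical pmf of X.
pmf_x = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

def variance(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

a, b = -3, 5

# Y = aX + b takes the value a*x + b with the same probability as X = x
# (a is nonzero, so distinct x give distinct y).
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

var_x, var_y = variance(pmf_x), variance(pmf_y)
sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

print(var_y, a ** 2 * var_x)
print(sd_y, abs(a) * sd_x)
```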
Binomial Distribution
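An experiment with a fixed number n of trials, where the trials are independent, each trial results in either success or failure, and the success probability p is the same on every trial.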
P = C 5,x / N
b(x; n, p) = C x,n · p^x (1-p)^(n-x), for x = 0, 1, ..., n
E(X) = np
V (X) = np(1 −p)
σX = sqrt( np(1 −p) )
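A sketch of the binomial pmf using `math.comb` (Python 3.8 and later), with the mean and variance computed directly from the pmf and compared to np and np(1-p). The values n = 5 and p = 0.5 are example choices.

```python
import math

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.5   # e.g. five flips of a fair coin, success = heads
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
sd = math.sqrt(var)

print(pmf)
print(mean, n * p)
print(var, n * p * (1 - p))
```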