Test Flashcards
Categorical
Most common: graphical display
Pie chart
Bar chart
Pictograms
Frequency tables
Numerical summaries: category counts and percentages
Quantitative
Histogram: [ = endpoint included, ( = endpoint not included
Stem plot
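Box plot (five-number summary)
- a longer or shorter quartile section shows how spread out the data in that quarter are, not that it holds more or less data (each quarter holds about 25% of the values)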
Mean
Average
Median
Middle
Mode
Most often occurring
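A minimal sketch of the three measures of center, using Python's standard `statistics` module. The quiz scores are made up for illustration.

```python
import statistics

# Hypothetical quiz scores, made up for illustration.
scores = [70, 85, 85, 90, 100]

mean_score = statistics.mean(scores)      # average: 430 / 5
median_score = statistics.median(scores)  # middle value of the sorted list
mode_score = statistics.mode(scores)      # most often occurring value

print(mean_score, median_score, mode_score)
```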
Standard Deviation
Use w/ symmetry and mean
68% fall w/in 1 SD of the mean
95% fall w/in 2 SD of the mean
99.7% fall w/in 3 SD of the mean
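The percentages in this rule can be checked against a normal curve. The sketch below assumes the data follow a normal distribution and uses `statistics.NormalDist` from the standard library; the helper name `share_within` is only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Roughly 68%, 95%, and 99.7% respectively.
    print(f"within {k} SD: {share_within(k):.1%}")
```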
IQR
Interquartile range (Q3 - Q1)
Gives us the middle 50%
Used w/skewed data and median
1.5 IQR
Used to detect outliers
Lower cutoff: Q1 - 1.5(IQR)
Upper cutoff: Q3 + 1.5(IQR)
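A small sketch of the 1.5 IQR check, with the lower cutoff at Q1 - 1.5(IQR) and the upper cutoff at Q3 + 1.5(IQR). The data values are made up, and the quartiles come from `statistics.quantiles` with `method="inclusive"`; textbooks and software use slightly different quartile conventions, so treat that choice as an assumption.

```python
import statistics

# Hypothetical data set, made up for illustration.
data = [3, 5, 7, 8, 9, 10, 11, 13, 40]

# Quartiles under the "inclusive" convention (an assumption; others exist).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_cutoff = q1 - 1.5 * iqr
upper_cutoff = q3 + 1.5 * iqr

# Any value outside the cutoffs is flagged as a suspected outlier.
outliers = [x for x in data if x < lower_cutoff or x > upper_cutoff]
print(q1, q3, iqr, lower_cutoff, upper_cutoff, outliers)
```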
Explanatory Variable
X
Variable that claims to explain, predict or affect the response
Response variable (Y)
Outcome of the study
C — Q
Box plots
C — C
Two way tables / contingency table
Q — C
Conditional percentage tables
Q — Q
Scatter plot
Positive relationship: increase in X goes with increase in Y
Negative relationship: increase in X goes with decrease in Y
U shape = neither positive nor negative
r = linear correlation coefficient
( -1 to 1 )
0 to -1 = negative relationship
0 to +1 = positive relationship
Measures direction and strength of a linear relationship; the closer to -1 or +1, the stronger
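A sketch of how r behaves, computed directly from the usual formula. The helper `pearson_r` and the data are illustrative only. Note that a perfect U shape gives r = 0 even though X and Y are clearly related, because r only measures linear relationships.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
r_increasing = pearson_r(x, [2 * v + 1 for v in x])  # perfect line going up
r_decreasing = pearson_r(x, [-3 * v for v in x])     # perfect line going down
r_u_shape = pearson_r(x, [v ** 2 for v in x])        # U shape

print(r_increasing, r_decreasing, r_u_shape)
```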
Simpsons Paradox
When a lurking variable causes us to rethink the direction of an association
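A sketch of the paradox with made-up numbers: two hypothetical hospitals, with patient severity as the lurking variable. Hospital A does better within each severity group, yet Hospital B looks better overall because it treats mostly mild cases.

```python
from fractions import Fraction

# Hypothetical (successes, patients) counts, made up for illustration.
results = {
    "A": {"mild": (95, 100), "severe": (630, 900)},
    "B": {"mild": (810, 900), "severe": (60, 100)},
}

def group_rate(hospital, severity):
    successes, patients = results[hospital][severity]
    return Fraction(successes, patients)

def overall_rate(hospital):
    successes = sum(s for s, _ in results[hospital].values())
    patients = sum(p for _, p in results[hospital].values())
    return Fraction(successes, patients)

for severity in ("mild", "severe"):
    # Within each severity group, A has the higher success rate.
    print(severity, float(group_rate("A", severity)), float(group_rate("B", severity)))

# Ignoring severity, B has the higher success rate.
print("overall", float(overall_rate("A")), float(overall_rate("B")))
```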
Population
Entire group we want to study and draw conclusions about
Sample Frame
List of individuals from which the sample is drawn
Sample
Actual individuals chosen for sample
Simple Random Sample
Individuals sampled at random without replacement.
Selecting names out of a hat
Cluster sample
Used when population is naturally divided into groups (clusters); whole clusters are chosen at random and everyone in them is sampled
Students in university divided into majors
Stratified sample
Used when population is naturally divided into subpopulations (strata); a random sample is taken from every stratum
Students in certain college divided by gender or year in college
Systematic sample
Obtain contact information and sample every so many people (e.g., every 50th person)
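A rough sketch of a simple random sample and a systematic sample drawn from the same sampling frame. The frame of 200 ID numbers, the sample size, and the fixed seed are all assumptions made for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers 1 through 200.
frame = list(range(1, 201))

rng = random.Random(0)  # fixed seed so reruns give the same sample

# Simple random sample: 10 individuals, chosen without replacement.
srs = rng.sample(frame, k=10)

# Systematic sample: random starting point, then every 50th individual.
step = 50
start = rng.randrange(step)
systematic = frame[start::step]

print(srs)
print(systematic)
```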
Observational Study
Values of variables are recorded as they naturally occur
Experiment
Researcher assigns the values of the explanatory variable
Prospective
Values of the variables recorded forward in time
Retrospective
Values of variables recorded backward in time
Blind experiment
Subjects unaware of which treatment they are receiving
Double Blind Experiment
Testing procedure designed to eliminate biased results
Where identity of those receiving a test treatment is concealed from both administrators and subjects until study is completed.
Hawthorne Effect
People in an experiment behave differently from how they would normally behave.
Lack of realism
Subjects/treatments/setting of an experiment may not realistically duplicate the conditions we want to study.
Noncompliance
Failure of subjects to conform to the rules / standards of their assigned treatment
Blocking
Divide subjects into groups of individuals who are similar with respect to an outside variable.
Matched pairs
Special case of randomized block design. Used when experiment has two treatment conditions and subjects can be grouped into pairs. Then within each pair subjects are randomly assigned to different treatments.
Randomized response
Survey technique for eliminating evasive answers.
Leading question
Questions that influence the response
Sensitive questions
Questions that may make someone answer dishonestly because of how they feel (e.g., questions about lowest grade last year).
Classical problems (theoretical/true problems)
Games of chance
Flipping coins, rolling dice, spinning spinners
Empirical problems (relative frequency)
Run a simulation or use a random sample
Use a series of trials that produce outcomes that cannot be predicted in advance
Law of large numbers
As the number of trials increases, the relative frequency approaches the actual probability
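A simulation sketch of the law of large numbers: flip a simulated fair coin more and more times and watch the relative frequency of heads settle near 0.5. The function name, seed, and flip counts are illustrative choices.

```python
import random

def heads_frequency(num_flips, seed=1):
    """Relative frequency of heads in num_flips simulated fair coin flips."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

for n in (10, 1_000, 100_000):
    # The frequency tends to get closer to 0.5 as n grows.
    print(n, heads_frequency(n))
```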
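Rule #1. Probabilities are between 0 and 1
0 ≤ P(A) ≤ 1
Rule #2. Probabilities of all possible outcomes sum to 1
Something must happen
P(A) + P(B) + P(C) = 1 when A, B, and C are the only possible outcomes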
As number increases should see a change
Rule #3. Complement Rule
P(not A) = 1-P(A)
Rule #4 addition rule for disjoint events
P(A or B) = P(A) + P(B)
P(A or B) = probability that event A occurs or event B occurs (or both)
Rule #5 multiplication rule for independent events
P(A and B) = P(A)*P(B)
P(A and B) = Probability that event A and event B occur.
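Rules #3 to #5 can be sketched with a fair six-sided die, using exact fractions. The die example is an assumption chosen for illustration.

```python
from fractions import Fraction

# One roll of a fair six-sided die: each face has probability 1/6.
p_face = Fraction(1, 6)

# Rule #3 (complement): P(not 6) = 1 - P(6)
p_not_six = 1 - p_face

# Rule #4 (disjoint events): one roll cannot be both a 1 and a 2
p_one_or_two = p_face + p_face

# Rule #5 (independent events): two separate rolls do not affect each other
p_two_sixes = p_face * p_face

print(p_not_six, p_one_or_two, p_two_sixes)
```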
Disjoint events
Events that cannot occur at the same time
Independent events
Event A occurring does not affect the probability that event B will occur.
Probability of at least one
L = at least one; not L = none
P(L) = 1 - P(not L)
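A worked sketch of the "at least one" shortcut: the probability of at least one six in four rolls of a fair die, found as 1 minus the probability of no sixes. The die example is an assumption chosen for illustration.

```python
from fractions import Fraction

p_no_six_one_roll = Fraction(5, 6)

# Rolls are independent, so multiply to get P(no six in four rolls).
p_no_six_four_rolls = p_no_six_one_roll ** 4

# Complement rule: P(at least one six) = 1 - P(no sixes).
p_at_least_one_six = 1 - p_no_six_four_rolls

print(p_at_least_one_six, float(p_at_least_one_six))
```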
Rule #6 general rule of addition
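P(A or B) = P(A) + P(B) - P(A and B)
P(A or B) = larger number (never smaller than P(A) or P(B))
P(A and B) = smaller number (never larger than P(A) or P(B))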
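A sketch of the general addition rule, P(A or B) = P(A) + P(B) - P(A and B), checked by counting cards in a standard 52-card deck. The events (drawing a heart, drawing a king) and helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability that one randomly drawn card satisfies the event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_king(card):
    return card[0] == "K"

p_heart = prob(is_heart)
p_king = prob(is_king)
p_both = prob(lambda c: is_heart(c) and is_king(c))    # the king of hearts
p_either = prob(lambda c: is_heart(c) or is_king(c))   # counted directly

# Subtracting P(A and B) keeps the king of hearts from being counted twice.
print(p_either, p_heart + p_king - p_both)
```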
Distribution means…
What values the variables take and how often the variables take those values.
Skewed right
Most data on left, minimal on right.
(Right tail (larger values) is much longer than left tail (smaller values))
Example: distribution of salary.
Skewed Left
Left tail (smaller values) is much longer than the right tail (larger values).
(Example: age of death from natural causes).
Stemplot
Retains the actual data and organizes it.
Given that the student is male, what is the probability that he has one or both ears pierced?
E = student has one or both ears pierced
M = student is male
P(E | M)
Formal definition of conditional probability
P(B | A) = P(A and B) / P(A)
If service A has failed to deliver the document on time, what is the probability that it arrives on time using service B?
P(B | not A) = P( B and not A) / P ( not A)
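A sketch of the conditional probability formula applied to a two-way table, in the spirit of the pierced-ears question. The counts are made up for illustration and are not from any real survey.

```python
from fractions import Fraction

# Hypothetical two-way table of student counts, made up for illustration.
counts = {
    ("male", "pierced"): 40,
    ("male", "not pierced"): 160,
    ("female", "pierced"): 270,
    ("female", "not pierced"): 30,
}
total = sum(counts.values())  # 500 students

# P(M): probability a randomly chosen student is male.
p_m = Fraction(counts[("male", "pierced")] + counts[("male", "not pierced")], total)

# P(E and M): probability a student is male and has pierced ears.
p_e_and_m = Fraction(counts[("male", "pierced")], total)

# P(E | M) = P(E and M) / P(M)
p_e_given_m = p_e_and_m / p_m

print(p_e_given_m)
```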
Testing for independence
Compare the overall probability to the conditional probability.
Compare P(B | A) to P(B), or P(A | B) to P(A), or P(B | A) to P(B | not A)
If the two probabilities are equal, then the events are independent.
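A sketch of this comparison using two fair dice. The events and the helper `prob` are illustrative: "sum is 7" turns out to be independent of "first die is even", while "sum is at least 10" depends on "first die is 6".

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), found by counting outcomes that satisfy 'given'."""
    pool = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in pool if event(o)), len(pool))

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return sum(o) == 7

def first_is_6(o):
    return o[0] == 6

def sum_at_least_10(o):
    return sum(o) >= 10

# Equal probabilities: knowing the first die is even tells us nothing new.
independent = prob(sum_is_7, given=first_even) == prob(sum_is_7)

# Unequal probabilities: a 6 on the first die makes a big sum more likely.
dependent = prob(sum_at_least_10, given=first_is_6) != prob(sum_at_least_10)

print(independent, dependent)
```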