Final Exam Flashcards
Definition of non-probability sampling:
Probability of selecting any particular member is unknown
Types of Non-Probability Sampling: Definitions. (CJSQ)
Convenience: Procedure of obtaining the people or units that are most conveniently available
Judgment: An experienced individual selects the sample based on his or her judgment about some appropriate characteristics required of the sample member
Snowball: Initial respondents are selected by probability methods. Additional respondents are obtained from information provided by the initial respondents.
Quota: The sample contains the same proportion of characteristics specified by the researcher as is evident in the population being examined
Probability sampling definition + pros and cons.
Probability Sampling: All population members have a known probability of being in the sample.
Advantages
* Sampling error can be computed
* Determine the degree of accuracy
Disadvantages
* Expensive
* Takes time and effort to design and execute
Types of Probability Sampling – Their definitions. (Stratified, cluster, simple random, systematic random).
Stratified Sample: Population is partitioned into mutually exclusive groups called strata, according to a criterion such as geographic location, grade, age, or income. A random sample is then drawn from each stratum.
Cluster Sample: Population is partitioned into mutually exclusive clusters. Randomly select some clusters. Members in the selected clusters are all selected.
Simple Random Sample: Each member of the population has an equal probability of being selected.
Systematic Random Sample: Sample in a systematic way: pick a random starting point, then select every kth member (e.g., every 5th or 6th person). Each member of the population has an equal probability of being selected.
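The four probability sampling methods above can be sketched in a few lines of Python. This is only an illustration: the function names, the `seed` parameter, and the example data are made up for this card, and the stratified version assumes the same number of members is drawn from every stratum.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Simple random: every member has an equal chance of being picked."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, k, seed=None):
    """Systematic: random start within the first k members, then every k-th."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Stratified: draw a random sample from EACH stratum.

    strata is a dict mapping a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, seed=None):
    """Cluster: randomly pick whole clusters, keep ALL of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical population of 60 people, numbered 1 to 60.
people = list(range(1, 61))
print(simple_random_sample(people, 10, seed=1))
print(systematic_sample(people, 6, seed=1))  # every 6th person
```

Note the difference between the last two: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.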
How to calculate Total error
Total Error: Difference between the true value and the observed value of a variable.
Total error = sampling error + non-sampling error
Sampling Error: Error that is due to sampling
Non-sampling Error: Error that is observed in both a census and a sample
Measures of central tendency and dispersion
Measures of Central Tendency: Mean, Median, Mode.
Measures of Dispersion:
Range
Deviation: The difference between each observation value and the mean.
Variance: Measure of the sample dispersion based on the degree to which a response differs from the sample average response.
Standard Deviation: Square root of variance.
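A quick way to check these measures is Python's built-in `statistics` module. The scores below are made-up data, and the sketch assumes the sample versions of variance and standard deviation (dividing by n - 1).

```python
import statistics

scores = [2, 4, 4, 5, 7, 8]  # made-up sample data

# Measures of central tendency
mean = statistics.mean(scores)      # 30 / 6 = 5
median = statistics.median(scores)  # middle of 4 and 5 = 4.5
mode = statistics.mode(scores)      # most frequent value = 4

# Measures of dispersion
data_range = max(scores) - min(scores)    # 8 - 2 = 6
deviations = [x - mean for x in scores]   # each value minus the mean
variance = statistics.variance(scores)    # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(scores)        # square root of the variance

print(mean, median, mode, data_range)
print(deviations)
print(variance, std_dev)
```

The deviations always sum to zero, which is why they are squared before being averaged into the variance.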
Probability distribution - Normal distribution
Z score = (X - µ) / σ
How many standard deviations below or above the population mean a raw score is.
A z-score of 1 is 1 standard deviation above the mean.
Z scores are a way to compare results to a “normal” population
E.g., a z-score can tell you where a person’s weight falls relative to the population’s mean weight.
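As a worked example of the z-score formula, here is a small sketch. The function name and the weight figures (population mean 70 kg, standard deviation 10 kg) are hypothetical numbers chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Hypothetical population: mean weight 70 kg, standard deviation 10 kg.
print(z_score(85, 70, 10))  # 1.5  -> 1.5 standard deviations above the mean
print(z_score(60, 70, 10))  # -1.0 -> 1 standard deviation below the mean
```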
Sampling distribution, Standard Error, Confidence interval estimation
Empirical probability distribution:
All the outcomes in a distribution of research results and each of their probabilities (what actually happened).
The probability distribution of a variable lists the possible outcomes together with their probabilities.
When to use which Chi-Square test?
Chi-Square Goodness of Fit: Used to investigate how well the observed pattern fits the expected pattern.
Chi-Square Test of Independence: Used to test whether two variables are independent, i.e., whether one variable has no influence on the other.
Relationship between P value and Alpha
Alpha, the significance level, is the probability that you will make the mistake of rejecting the null hypothesis when in fact it is true.
P-value is the probability of finding the observed, or more extreme, results when the null hypothesis is true.
The smaller the p-value, the stronger the evidence that you should reject the null hypothesis. If the p-value is less than alpha, reject the null; otherwise, fail to reject it.
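The p-value versus alpha comparison can be written as a tiny decision function. This is just a sketch: the function name `decide` is made up here, and 0.05 is assumed as the default alpha because it is the usual choice.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value < alpha:
        return "reject the null"
    return "fail to reject the null"


print(decide(0.03))              # reject the null
print(decide(0.20))              # fail to reject the null
print(decide(0.03, alpha=0.01))  # fail to reject the null (stricter alpha)
```

The last call shows that the same p-value can lead to a different decision when a stricter alpha is chosen.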
Conditions when to use Chi-square test
Test of Independence
Are there associations between two or more variables in a study?
Test of Goodness of Fit
Is there a significant difference between an observed frequency distribution and a theoretical frequency distribution?
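For the goodness of fit case, the test statistic adds up (observed - expected)² / expected over all categories. A minimal sketch, using made-up die-roll counts and a hypothetical function name:

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up data: a die rolled 60 times; a fair die expects 10 of each face.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6

chi2 = chi_square_gof(observed, expected)
dof = len(observed) - 1  # categories - 1 = 5

print(chi2, dof)  # chi2 is about 4.2
```

With 5 degrees of freedom and alpha = 0.05, the table critical value is 11.070. Since 4.2 is smaller, we fail to reject the null: the observed counts are consistent with a fair die.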
Walk through the process of Chi-Square testing.
Formulate hypotheses
Calculate row and column totals
Calculate row and column proportions
Calculate expected frequencies
Calculate test statistic (χ²)
Calculate Degrees of freedom
Get critical value from table.
Make a decision.
Know when to reject or fail to reject Null hypothesis
Reject the null if the test statistic is greater than the critical value, or if the p-value is less than alpha (e.g., 0.05). Otherwise, fail to reject the null.
How do you calculate Degrees of freedom for Chi-Square Test?
(Rows - 1) * (Columns - 1)
Rows go left and right, Columns go down.
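The test of independence steps (totals, expected frequencies, test statistic, degrees of freedom) can be sketched as follows. The function name and the 2 x 2 table are made up for illustration; each expected frequency is computed as row total x column total / grand total.

```python
def chi_square_independence(table):
    """Return (chi2, dof, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency for each cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Test statistic = sum over all cells of (O - E)^2 / E
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

    # Degrees of freedom = (rows - 1) * (columns - 1)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, expected


# Made-up 2 x 2 table: rows are two groups, columns are yes / no answers.
observed = [[30, 20],
            [20, 30]]
chi2, dof, expected = chi_square_independence(observed)
print(chi2, dof)  # 4.0 1
print(expected)   # every expected frequency is 25.0
```

With 1 degree of freedom and alpha = 0.05, the table critical value is 3.841. Since 4.0 is greater, we reject the null and conclude the two variables are associated.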
When do we use T-stat or Z-stat?
If we know the population standard deviation, use Z-stat.
If only the sample standard deviation is known, we use T-stat.
Testing hypothesis about a single mean (both one-tail and two-tail)
Compare a sample to the population. Population mean must be set.
How to decide if it’s one or two tail? It’s one-tail if the alternative hypothesis uses > or <; it’s two-tail if it uses ≠.
Z calc = (Sample mean - hypothesized mean) / (population standard deviation / √n), where n is the sample size.
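Here is a sketch of the single-mean z-test for both the one-tail and two-tail cases. It assumes the test is on the mean of a sample of n observations, so the denominator is the standard error (population standard deviation divided by √n). The function name, the `tail` labels, and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, hypothesized_mean, pop_sd, n, tail="two"):
    """Return (z, p_value) for a z-test about a single mean."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error

    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":          # H1: mean != hypothesized mean
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "greater":    # H1: mean > hypothesized mean
        p_value = 1 - std_normal.cdf(z)
    elif tail == "less":       # H1: mean < hypothesized mean
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'greater', or 'less'")
    return z, p_value


# Made-up numbers: sample mean 52, hypothesized mean 50, sigma 6, n = 36.
z, p = one_sample_z_test(52, 50, 6, 36, tail="two")
print(z, p)  # z = 2.0, p is about 0.0455
```

At alpha = 0.05 the two-tail p-value of about 0.0455 is below alpha, so the null is rejected. The one-tail p-value for the same data (`tail="greater"`) is half of that.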
Unrelated sample t-test (both one-tail and two-tail)
Compare one sample to another sample: two different groups of participants.
The hypothesis should be:
𝐻0: 𝜇1 = 𝜇2
𝐻1: 𝜇1 ≠ 𝜇2 (two-tail), or 𝜇1 > 𝜇2 or 𝜇1 < 𝜇2 (one-tail)
Know when to reject or fail to reject the null hypothesis after looking at the p-value and at the Excel output
Make sure to pay attention to your hypothesis: Excel gives both one-tail and two-tail values.
If P value is greater than alpha, we fail to reject the null.
If P value is less than alpha, we reject the null.
If T-stat is < T-Crit (compare absolute values for a two-tail test), we fail to reject the null.
If T-stat is > T-Crit (compare absolute values for a two-tail test), we reject the null.
Know what to conclude after looking at the p-value and at the Excel output
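Degrees of freedom for T test.
For a single sample, should be equal to (N minus 1). For an unrelated (two-sample) t-test assuming equal variances, it is (n1 + n2 minus 2).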
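To see where the T-stat for an unrelated sample t-test comes from, here is a sketch of the pooled (equal variances assumed) version. Note the assumption: with two groups the degrees of freedom are n1 + n2 - 2, not N - 1 as for a single sample. The function name and the data are made up.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(group1, group2):
    """Return (t_stat, dof) for an unrelated two-sample t-test,
    assuming both groups have equal population variances."""
    n1, n2 = len(group1), len(group2)
    dof = n1 + n2 - 2

    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / dof

    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean(group1) - mean(group2)) / standard_error
    return t_stat, dof


# Made-up scores from two different groups of participants.
group_a = [5, 7, 9]
group_b = [1, 3, 5]
t_stat, dof = pooled_t_test(group_a, group_b)
print(t_stat, dof)  # t is about 2.449 with 4 degrees of freedom
```

For a two-tail test at alpha = 0.05 with 4 degrees of freedom, the table critical value is 2.776. Since 2.449 is smaller, we fail to reject the null.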
Concept of correlation
Measures the strength and direction of the linear relationship between two variables.
Difference between correlation and causation
Causation allows you to see which events or initiatives led to a particular outcome.
Correlation is just a means of measuring the relationship between variables to find statistically relevant trends. Two variables can be correlated without one causing the other.
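The usual measure of correlation is the Pearson correlation coefficient r, which runs from -1 (perfect negative relationship) to +1 (perfect positive relationship). The sketch below assumes Pearson's r is the measure intended here; the function name and data are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of cross-products and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]      # made-up data
scores = [2, 4, 6, 8, 10]    # rises exactly in step with hours

print(pearson_r(hours, scores))        # perfect positive: about 1.0
print(pearson_r(hours, scores[::-1]))  # perfect negative: about -1.0
```

A value of r near 1 or -1 says the two variables move together closely. It does not say that one of them causes the other.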
Regression - Interpreting regression coefficients, R squared
Look through the regression coefficients (𝛽) and corresponding p-values.
- When you have a low p-value (typically < 0.05), the independent variable is statistically significant.
- The coefficients represent the average change in the dependent variable given a one-unit change in the independent variable (IV) while holding the other IVs constant.
(Example: Customer satisfaction increases by 0.534 units for every one-unit increase in price, holding the other IVs constant.)
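To make the coefficient and R squared concrete, here is a sketch of a regression with a single independent variable, fitted by ordinary least squares. R squared is computed as the share of the variation in the dependent variable that the fitted line explains. The function name and data are made up, and a real analysis with several IVs would use Excel or a statistics package.

```python
def simple_linear_regression(xs, ys):
    """Return (intercept, slope, r_squared) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx                    # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x

    # R squared = 1 - (unexplained variation / total variation)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


x = [1, 2, 3, 4]  # made-up independent variable
y = [2, 4, 5, 7]  # made-up dependent variable
intercept, slope, r_squared = simple_linear_regression(x, y)
print(intercept, slope, r_squared)  # about 0.5, 1.6, 0.985
```

Here the slope of about 1.6 reads as: y increases by about 1.6 units on average for every one-unit increase in x. An R squared of about 0.985 means the line explains roughly 98.5% of the variation in y.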
Do you know how to calculate Alpha?
Alpha = 1 - confidence level (e.g., a 95% confidence level gives alpha = 1 - 0.95 = 0.05)
Empirical Rule:
If the histogram of data is approximately bell shaped, then:
1. About 68% of the cases fall between Y bar - s.d. and Y bar + s.d.
2. About 95% of the data fall between Y bar - 2 s.d. and Y bar + 2 s.d.
3. All or nearly all the data fall between Y bar - 3 s.d. and Y bar + 3 s.d.
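The three percentages in the empirical rule can be checked against the standard normal distribution with Python's built-in `NormalDist`. This sketch assumes a perfectly normal (bell-shaped) distribution; the helper name `share_within` is made up.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 s.d.: about 68.3%
# 2 s.d.: about 95.4%
# 3 s.d.: about 99.7%
```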