Inferential Statistics Flashcards
What is the purpose of inferential statistics?
allows us to study samples and then make generalizations about the population.
draws conclusions about a population by examining a random sample.
Helps researchers test hypotheses, answer research questions, and derive meaning from the results.
a result found to be statistically significant by testing the sample is assumed to also hold for the population from which the sample was drawn.
Enables researchers to:
Estimate population proportions
Estimate population mean
Estimate sampling error
Estimate confidence intervals
Test for statistical significance
What are the two main types of methods in inferential statistics?
Two main methods:
Estimation: the sample statistic is used to estimate a population parameter, and a confidence interval about the estimate is constructed.
Hypothesis testing: a null hypothesis is put forward, and analysis of the data is then used to determine whether to reject it.
Differences between the null and alternative hypotheses
The alternative (research) hypothesis (H1) predicts that an effect exists; the null hypothesis (H0) predicts that it does not. In testing differences, H1 would predict that differences would be found, while H0 would predict no differences.
Tell me about the significance level and what the most common level is
Researchers set the significance level for each statistical test they conduct.
by using probability theory as a basis for their tests, researchers can assess how likely it is that the difference they find is real and not due to chance.
The level of significance is the predetermined level at which a null hypothesis is not supported. The most common level is p < .05.
P =probability
< = less than (> = more than)
If the .05 level is achieved (p is equal to or less than .05), then a researcher rejects the H0 and accepts the H1.
If the .05 significance level is not achieved (p is more than .05), then the H0 is retained.
Describe the difference between a type 1 and type 2 error.
Type I error: Erroneously rejecting the null hypothesis. Your result is significant (p < .05), so you reject the null hypothesis, but the null hypothesis is actually true.
Type II error: Erroneously accepting the null hypothesis. Your result is not significant (p > .05), so you don’t reject the null hypothesis, but the null hypothesis is actually false.
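The Type I error rate can be seen directly in a simulation. A minimal sketch (made-up population parameters, `scipy` assumed available): draw many pairs of samples from the same population, so H0 is true by construction, and count how often a t-test still comes out "significant". That false-positive rate should land near alpha = .05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
trials = 2000
false_positives = 0

for _ in range(trials):
    # Both samples come from the SAME population, so H0 is true.
    a = rng.normal(loc=50, scale=10, size=30)
    b = rng.normal(loc=50, scale=10, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1  # a Type I error: H0 rejected though it is true

print(round(false_positives / trials, 3))  # hovers near 0.05
```

The simulation illustrates why alpha is called the Type I error rate: it is the long-run proportion of true null hypotheses we reject.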
How can you control a Type II error?
Type II errors can also be controlled by the researcher.
The Type II error rate is sometimes called beta, as a complement to alpha.
How can the beta rate be controlled? The easiest way to control Type II errors is by increasing the statistical power of a test.
With a larger sample size, does the statistical power go up or down?
Power is strongly influenced by sample size. With a larger N, we are more likely to reject the null hypothesis if it is truly false.
(As N increases, the standard error shrinks. Sampling error becomes less problematic, and true differences are easier to detect.).
Power is the probability that the statistical test will correctly reject a false null hypothesis.
A researcher would like to have a high level of power.
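The sample-size effect on power can also be estimated by simulation. A hedged sketch (hypothetical population means of 50 and 55, `scipy` assumed available): fix a true difference between two populations and count how often the test correctly rejects H0 at two different sample sizes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def estimated_power(n, trials=1000, alpha=0.05):
    """Share of simulations that correctly reject a false H0."""
    rejections = 0
    for _ in range(trials):
        a = rng.normal(50, 10, n)
        b = rng.normal(55, 10, n)  # true difference of 5 points exists
        _, p = stats.ttest_ind(a, b)
        if p < alpha:
            rejections += 1
    return rejections / trials

small = estimated_power(n=15)
large = estimated_power(n=60)
print(small, large)  # power at N=60 clearly exceeds power at N=15
```

With the same true effect, the larger N detects it far more often, which is exactly the "standard error shrinks" point above.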
Factors that affect the power of an experiment
Four factors:
Alpha level
Sample size
Effect size
One-tailed or two-tailed test
Describe effect size in more detail
This is an indication of the size of the treatment effect, its meaningfulness.
With a large effect size, it will be easy to detect differences and statistical power will be high.
But, if the treatment effect is small, it will be difficult to detect differences and power will be low.
What is the most commonly used confidence level?
Estimate the population mean or proportion based on the sample survey.
Confidence level: the social science standard is 95%.
We are 95% certain that our population estimate is correct within a specified range.
This range is the precision of the estimate.
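A minimal sketch of a 95% confidence interval for a population mean, using made-up sample scores and the t distribution (since the population standard deviation is unknown):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 10 test scores
scores = np.array([72, 85, 78, 90, 66, 81, 75, 88, 79, 84])
n = len(scores)
mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(n)    # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)   # two-tailed 95% critical value
ci = (mean - t_crit * se, mean + t_crit * se)
print(f"mean = {mean:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

The interval's width is the "specified range": with 95% confidence, the population mean lies between the two bounds.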
Hypothesis testing procedure: give me the steps
1. State the hypothesis (H0)
2. Select the probability level (alpha)
3. Determine the value needed for significance
4. Calculate the test statistic
5. Accept or reject H0
What do you say if you accept the H0?
Step 1
H0: no difference between the two means; any difference found is due to sampling error.
Any significant difference found is not a TRUE difference, but CHANCE due to sampling error.
Step 2: Set the level of significance (your alpha level) for rejecting the H0
Probability that sample means are different enough to reject H0 (.05 or .01).
Also called the level of probability or level of confidence.
Step 3: Compute the calculated value
Computing Calculated Value
Use statistical test to derive some calculated value (e.g., t value or F value).
Step 4 Obtain critical value
Obtain Critical Value
A criterion value, based on df and the alpha level (.05 or .01), is compared to the calculated value to determine whether the findings are significant and therefore whether to reject H0.
Reject or accept the null
CALCULATED value is compared to the CRITICAL value to determine if the difference is significant enough to reject Ho at the predetermined level of significance.
If CRITICAL value > CALCULATED value –> fail to reject Ho
If CRITICAL value < CALCULATED value –> reject Ho
Rejecting H0 only supports H1; it does not prove H1.
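Steps 3 through 5 can be sketched in a few lines. A hedged worked example with a hypothetical calculated t value of 2.30 and df = 28, at alpha = .05 two-tailed:

```python
from scipy import stats

calculated_t = 2.30  # hypothetical result from Step 3
df = 28
alpha = 0.05

# Step 4: obtain the critical value from the t distribution
critical_t = stats.t.ppf(1 - alpha / 2, df)  # about 2.048

# Step 5: compare calculated to critical value
if abs(calculated_t) > critical_t:
    decision = "reject H0"        # significant at the .05 level
else:
    decision = "fail to reject H0"
print(round(critical_t, 3), decision)
```

Here the calculated value (2.30) exceeds the critical value (about 2.048), so H0 is rejected, matching the rule above.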
Parametric vs. nonparametric data
Parametric statistical tests generally require interval or ratio level data and assume that the scores were drawn from a normally distributed population or that both sets of scores were drawn from populations with the same variance or spread of scores.
Nonparametric tests do not make assumptions about the shape of the population distribution. These are typically less powerful and often need large samples.
Give me some details about parametric data.
What two types of quantitative data does it require?
Is it randomized?
Are scores evenly (normally) distributed?
What size does the sample have to be?
Parametric tests are based on a variety of assumptions, such as:
Interval or ratio level scores
Random sampling of participants
Scores are normally distributed
N = 30 considered a minimum by some
Homogeneity of variance
Groups are independent of each other
Nonparametric data: what types of research use this data?
Does it have as much statistical power as parametric tests?
Considered assumption free statistics.
Appropriate for nominal and ordinal data or in situations where very small sample sizes (n < 10) would probably not yield a normal distribution of scores.
Less statistical power than parametric statistics.
Improving the Probability of Meeting Assumptions
Utilize a sample that is truly representative of the population of interest.
Utilize large sample sizes.
Utilize comparison groups that have about the same number of participants.
Chi square test
What type of Data is this used on
Think of opinions on surveys and who circled a certain answer x amount of times. For example, if you gave a survey on who likes chocolate, vanilla, or strawberry ice cream and then calculated the responses on each survey, what test would you use?
The chi-square test is a nonparametric test used when the researcher is interested in the number of responses, objects, or people that fall into two or more categories.
Used with crosstabs.
Based on a mathematical formula that compares the actual (observed) data to how the data should have looked (expected) if there were no difference.
Used with nominally scaled data which are common with survey research.
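The ice-cream survey above can be sketched as a chi-square goodness-of-fit test (made-up counts for 90 respondents; under H0, the three flavors are equally popular):

```python
from scipy import stats

observed = [42, 28, 20]   # hypothetical counts: chocolate, vanilla, strawberry
expected = [30, 30, 30]   # 90 respondents split evenly if H0 is true

chi2, p = stats.chisquare(observed, f_exp=expected)
print(round(chi2, 2), round(p, 4))

if p < 0.05:
    print("reject H0: flavor preferences differ from chance")
```

The statistic sums (observed - expected)^2 / expected over the categories, which is exactly the observed-vs-expected comparison described above.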
T-test: which kind of data is it good for, and what purpose does it serve?
Test the difference between two sample means for significance.
pretest to posttest
requires interval or ratio level scores
easy to compute
a pretty good small-sample statistic
relates to research design
one group t-test
dependent groups
independent t-test
explain the differences
One-Group t-test
t-test between a sample and population mean
Independent Groups t-test
compares mean scores on two independent samples
Dependent Groups (Correlated) t-test
compares two mean scores from a repeated measures or matched pairs design.
most common situation is for comparison of pretest with posttest scores from the same sample.
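The three t-test situations map directly onto three `scipy.stats` functions. A minimal sketch with made-up score data (the population mean of 60 is an assumption for illustration):

```python
from scipy import stats

pre     = [60, 65, 58, 72, 66, 70, 63, 68]  # hypothetical pretest scores
post    = [66, 70, 64, 78, 70, 75, 66, 74]  # same people at posttest
group_b = [55, 62, 59, 64, 61, 57, 60, 63]  # a separate, independent sample

# One-group t-test: sample mean vs. a known population mean
t1, p1 = stats.ttest_1samp(pre, popmean=60)

# Independent groups t-test: two independent samples
t2, p2 = stats.ttest_ind(pre, group_b)

# Dependent (correlated) groups t-test: pretest vs. posttest, same sample
t3, p3 = stats.ttest_rel(pre, post)
print(p1, p2, p3)
```

Note the paired test treats each pretest-posttest pair as one unit, which is why it suits the repeated measures design described above.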
For nonparametric data: which tests do you use in place of the t-test when the data do not meet its assumptions (e.g., not randomized or not normally distributed)?
Mann-Whitney U-Test This test is used for the independent groups situation.
Wilcoxon Signed-Ranks Test This test is used for the paired samples situation.
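Both nonparametric alternatives are available in `scipy.stats`. A hedged sketch with hypothetical rank-friendly data:

```python
from scipy import stats

# Mann-Whitney U: two independent groups (nonparametric analogue
# of the independent groups t-test)
group_a = [12, 15, 11, 18, 14, 16, 13]
group_b = [20, 17, 22, 19, 24, 21, 18]
u, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Wilcoxon signed-ranks: paired samples (nonparametric analogue
# of the dependent groups t-test)
pre  = [12, 15, 11, 18, 14, 16, 13]
post = [14, 18, 12, 21, 15, 19, 16]
w, p_w = stats.wilcoxon(pre, post)
print(p_u, p_w)
```

Both tests work on ranks rather than raw scores, which is why they tolerate ordinal data and non-normal distributions at the cost of some power.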
When is an ANOVA used?
Analysis of variance (ANOVA) tests the difference(s) among two or more means.
It can be used to test the difference between two means.
A repeated-measures ANOVA is an extension of the dependent groups t-test, where each subject is measured on 2 or more occasions.
a.k.a. a “within subjects design”
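A between-groups one-way ANOVA (not the within-subjects version) can be sketched with `scipy.stats.f_oneway`, using made-up scores for three independent groups:

```python
from scipy import stats

# Hypothetical scores for three independent groups
group1 = [68, 72, 70, 75, 66]
group2 = [80, 78, 85, 82, 79]
group3 = [60, 65, 62, 58, 64]

f, p = stats.f_oneway(group1, group2, group3)
print(round(f, 2), p < 0.05)  # a significant F means at least one mean differs
```

A significant F value says only that at least one group mean differs; follow-up (post hoc) comparisons are needed to say which ones.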
Pearson’s correlation coefficient: describe when to use it.
Correlation—the extent to which two variables are related across a group of subjects
Pearson r
It can range from -1.00 to 1.00.
0.00 indicates the complete absence of a relationship.
Correlation coefficient: -1.00 is a perfect inverse relationship—the strongest possible inverse relationship.
Correlation coefficient: 1.00 is a perfect positive relationship—the strongest possible direct relationship
The closer a value is to 0.00, the weaker the relationship.
The closer a value is to -1.00 or +1.00, the stronger it is
Describe Spearman’s rank-order correlation coefficient
Spearman’s Rank Order Correlation (Greek letter rho)
The Spearman’s rank-order correlation is the nonparametric version of the Pearson product-moment correlation.
Spearman’s coefficient, like any correlation calculation, is appropriate for both continuous and discrete ordinal variables.
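Both coefficients are one call each in `scipy.stats`. A minimal sketch with made-up paired data (hours studied vs. exam score, chosen to show a strong direct relationship):

```python
from scipy import stats

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score    = [52, 55, 61, 60, 68, 71, 75, 80]

# Pearson r: linear relationship between interval/ratio variables
r, p_r = stats.pearsonr(hours_studied, exam_score)

# Spearman's rho: the rank-based (nonparametric) counterpart
rho, p_rho = stats.spearmanr(hours_studied, exam_score)
print(round(r, 2), round(rho, 2))  # both near +1: strong direct relationship
```

Because Spearman works on ranks, the two coefficients diverge when the relationship is monotonic but not linear, or when outliers distort the raw scores.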
Describe the differences between inter-rater reliability and intra-rater reliability.
Reliability is used to describe the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. For example, measurements of people’s height and weight are often extremely reliable.
Inter-rater reliability assesses the degree of agreement between two or more raters in their appraisals.
Test-retest reliability assesses the degree to which test scores are consistent from one test administration to the next.
Intra-rater reliability is when only one rater scores the same test repeatedly over a period of time, to see whether he or she gets similar results each time.