Exam 1 Flashcards
Definition of p-value: (2)
The probability of getting results at least as extreme as ours by chance alone, assuming the null hypothesis is true. The probability of making a type 1 error if we reject the null hypothesis.
How is p-value used to determine whether to accept or reject the null hypothesis?
The p-value is compared to alpha = 0.05. If “p” is less than or equal to 0.05, then we reject the null hypothesis; if “p” is greater than 0.05, we accept (fail to reject) it.
What is statistical power?
Statistical power is the probability of correctly rejecting the null hypothesis when it is false.
Increasing “n” will ____ power
Increase
Give 2 reasons why it is not possible for researchers to study an entire population
- It would take too many resources to study an entire population.
- There are usually too many subjects in a population to be able to gather them for a study.
What is a type 1 error?
A type 1 error is when you reject the null hypothesis but should have accepted it because it is true.
What is standard deviation a measure of?
Standard deviation is a measure of how much the data points, x, vary around the mean, (bar X).
What is random assignment and what is one advantage of using it?
Giving participants in a study an equal probability of being assigned to a certain group. One advantage is that it gets rid of some bias.
Which of the following is not a measure of sampling error?
a. Confidence limits
b. Power
c. Standard error
d. None of the above
b. Power
Describe 95% confidence limits
A researcher is 95% confident that the population mean is within the confidence limits.
On what axis is the independent variable plotted?
Horizontal
Which three measures are approximately equal in a normal distribution?
Mean, Median, and Mode
What type of statistical analysis should be used to compare observed phenotypic frequencies for a dihybrid cross involving eye color (red, white) and wingless (wildtype or wingless) to an expected ratio of 9:3:3:1?
Chi Square Goodness of Fit
How is Z-score Calculated?
(“x” - mean)/Standard deviation
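A minimal sketch of the Z-score formula above. The function name and the example numbers are made up for illustration.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical values: x = 180, mean = 170, standard deviation = 5
z = z_score(180, 170, 5)
print(z)  # 2.0
```

A positive Z-score means the value is above the mean; a negative one means it is below.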
In the tattoo hepatitis test, what type of statistic test should be used to analyze the data?
Chi square test of Independence
What is the null hypothesis for the hepatitis/tattoo case?
Hepatitis C is not dependent on the tattoo parlor.
What is the alternative hypothesis for the hepatitis/tattoo case?
Hepatitis C is related to the tattoo parlor.
What are three examples of categorical variables?
Gender, Phenotype, Genotype
What are three examples of numerical variables?
Height, Age, Biomass
What two things must a histogram have?
Measured variable on x-axis, and frequency on y-axis
What is gaussian distribution?
A normal, bell-shaped curve in which the mean, median, and mode are all equal.
How is range calculated?
Highest-Lowest
How is variance calculated?
(Standard Deviation)^2
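The range and variance cards can be checked with Python's built-in `statistics` module. This is only a sketch with made-up data, and it assumes the sample (n - 1) versions of standard deviation and variance.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements

data_range = max(data) - min(data)    # highest - lowest
sd = statistics.stdev(data)           # sample standard deviation (n - 1)
variance = statistics.variance(data)  # sample variance (n - 1)

print(data_range)                     # 7
print(abs(sd ** 2 - variance) < 1e-9) # True: variance is SD squared
```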
How is the mode of a graph recognized?
Always the highest peak
What do positive and negative skew look like on a graph?
Neg. Skew has a tail on the left, and pos. skew has a tail on the right
Describe Symmetric Distribution
50% of the data is above the mean, and 50% of the data is below the mean
What is sampling error?
How far off our estimate is from the true population value because we studied a sample instead of the whole population.
What can influence sampling error?
-Anomalies: Subjects that differ greatly from the mean
What is the standard error of the mean?
An estimate of how much our sample mean differs from the population mean (standard deviation divided by the square root of n).
What are 3 ways to lower standard error?
- Increase the sample size
- Decrease the standard deviation
- Improve measurement
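A short sketch of why the first two items lower standard error, assuming the usual formula SE = standard deviation / square root of n. The function name and numbers are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0 -> larger n, smaller standard error
print(standard_error(5, 25))    # 1.0 -> smaller SD, smaller standard error
```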
What do two sample t-tests test for?
To determine whether the difference between two sample means is statistically significant.
What is meant by “Statistically significant”?
More than we’d expect by chance
How do you decide if a 2-sample t-test or a paired t-test should be conducted?
If the same subjects can’t be measured twice (or matched into pairs), then a 2-sample t-test must be used. Otherwise use a paired t-test.
What 2 things does a 2-sample t-test assume?
- The data for each sample is normally distributed
- The variances for each sample are statistically equal
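For reference, here is a sketch of how the 2-sample t statistic is computed when both assumptions hold, using the standard pooled-variance formula. The function name and data are made up, and the p-value lookup is left out.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Return (t, df) for a 2-sample t-test that assumes equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pool the two variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se_diff
    return t, df

t, df = pooled_t([1, 2, 3], [4, 5, 6])  # made-up samples
print(round(t, 3), df)  # -3.674 4
```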
What 1 thing does a paired t-test assume?
The differences between pairs of data are normally distributed
What is a type 2 error?
When you accept the null hypothesis but should have rejected it.
What is the non-parametric alternative test to a 2-sample t-test?
Mann-Whitney U-Test
What is the non-parametric alternative test to a paired t-test?
Wilcoxon Signed Rank test
What is an advantage of random sampling?
Eliminates some bias
What is the difference between a bar graph and a histogram?
Bar Graph: Shows categorical data on the x-axis and a d.v. on the y-axis
Histogram: Shows numerical data on the x-axis and frequency on the y-axis
What are the three measures of central tendency?
Mean, median and mode
In a Z-test, what is the p-value compared to?
0.025 instead of 0.05
When is welch’s approximation used?
For samples with unequal variances, as an alternative to the standard 2-sample t-test (it is still a parametric test)
What is the difference between a parametric and non-parametric test?
A parametric test makes assumptions about the population that must be met in order to conduct the test, such as homogeneity of variances or normally distributed data. A nonparametric test does not make these assumptions about the population.
What is the nonparametric alternative to a 2-sample t-test that does not meet the assumption of normality?
Mann-Whitney U-Test
What is the nonparametric alternative to a 2-sample t-test that does not meet the assumption of equal variances?
Kolmogorov-Smirnov Test
What nonparametric test determines homogeneity of variances?
Levene’s Test
Why is 1-way ANOVA better than multiple t-tests?
It is designed to protect against alpha inflation, and it saves the time that would be spent conducting multiple tests. Reducing alpha inflation lowers the chance of getting a false positive.
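A rough sketch of alpha inflation, assuming every pairwise t-test is independent and uses alpha = 0.05 (a simplification; the function names are made up). The chance of at least one false positive is 1 - (1 - alpha)^k for k tests.

```python
def pairwise_comparisons(n_groups):
    """Number of t-tests needed to compare every pair of groups."""
    return n_groups * (n_groups - 1) // 2

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

for n_groups in (3, 4, 5):
    k = pairwise_comparisons(n_groups)
    print(n_groups, k, round(familywise_error(k), 3))
# 3 3 0.143
# 4 6 0.265
# 5 10 0.401
```

With only 3 groups the false positive risk is already about 14%, not 5%.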
What is the nonparametric alternative to One Way ANOVA?
Kruskal-Wallis Test
What is the difference behind the purpose of correlation and the purpose of linear regression?
Correlation analysis specifically tests for a linear relationship between 2 measured variables. Regression attempts to fit a line, or curve, to data to look for a dependence of one variable on another.
Describe the F-Ratio
Compares the variation between each group being tested, relative to the variation between individuals within each group. A large F-Ratio indicates more variation between groups relative to the variation within groups
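A sketch of the F-Ratio for a 1-way ANOVA, computed by hand as (mean square between groups) / (mean square within groups). The data and function name are made up for illustration.

```python
import statistics

def f_ratio(groups):
    """Return (F, df_between, df_within) for a 1-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_b, df_w = f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f, 2), df_b, df_w)  # 13.0 2 6
```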
“2 x 3” design is:
Comparing means of 2 independent variables, 1 with 2 levels and 1 with 3 levels. giving a total of 6 groups
Main Effect:
Explains whether the effect of an independent variable on a measured dependent variable is significant or not
Interaction:
Explains if two independent variables influence one another in their effect on the dependent variable
What “r” values represent a strong correlation?
Absolute values closer to “1”
What is the main assumption of repeated measures ANOVA?
Sphericity: The variances of all possible pairwise combinations of groups are equal
How are 2-way ANOVA results reported?
F(2,84)=15.75, p
How are regression results reported?
B=0.287, p
How is a regression equation set up?
Dependent = B value (for I.V.) x I.V. + B value (for the constant)
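A minimal sketch of fitting a regression equation by ordinary least squares and using it to predict. It assumes one independent variable; the data and the names `slope` and `constant` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, constant) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    return slope, constant

slope, constant = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # made-up data
predicted = slope * 5 + constant  # plug a new X value into the equation
print(slope, constant, predicted)  # 2.0 1.0 11.0
```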
GO INTO DETAIL WITH SPECIFIC CONCLUSIONS
What do you do after calculating a Z score?
Compare it to the value in the table to find your p-value, then compare that to 0.025
How are chi-square results reported?
χ²(1) = 2.251, p = 0.134
How do you calculate degrees of freedom for chi-square?
Number of categories - 1
How is the chi-square value compared to the critical value?
If the calculated value is higher than the critical value, then our p-value is lower than 0.05, so we reject the null hypothesis.
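A sketch of a chi-square goodness of fit test against a 9:3:3:1 ratio, ending with the critical value comparison. The observed counts are made up; 7.815 is the table critical value for 3 degrees of freedom at alpha = 0.05.

```python
def chi_square_gof(observed, ratio):
    """Return (chi-square, df) for observed counts vs. an expected ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories - 1
    return chi2, df

CRITICAL_DF3 = 7.815            # table value for df = 3, alpha = 0.05
observed = [95, 28, 27, 10]     # hypothetical counts for the 4 phenotypes
chi2, df = chi_square_gof(observed, [9, 3, 3, 1])
reject_null = chi2 > CRITICAL_DF3
print(round(chi2, 3), df, reject_null)  # 0.711 3 False
```

Here the calculated value is below the critical value, so p is greater than 0.05 and the null hypothesis is not rejected.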
What does multiple regression look at?
Whether “y” depends on any of the MULTIPLE independent variables
How does simultaneous multiple regression work?
All I.V.’s are analyzed at the same time to give an overall R squared value, as opposed to step-wise multiple regression in which each I.V. gets its own R squared value
What result should be examined when interpreting step-wise multiple regression results?
The column on the right tells us if the change in R squared is significant or not
How is chi-square test of independence different from correlation?
It analyzes 2 or more CATEGORICAL variables with counted FREQUENCIES to determine if they are related
What is an example of a test of independence?
Is prostate cancer outcome dependent on the type of treatment? (categories are surgery and radiation)
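A sketch of how a test of independence is calculated from a table of counts. The counts below are invented, not real treatment data, and no continuity correction is applied. Expected counts come from (row total x column total) / grand total, and df = (rows - 1) x (columns - 1).

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a table of counted frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = surgery, radiation; columns = two outcomes
table = [[30, 10],
         [20, 20]]
chi2, df = chi_square_independence(table)
print(round(chi2, 3), df)  # 5.333 1
```

With df = 1 the table critical value at alpha = 0.05 is 3.841, so these made-up counts would lead to rejecting the null hypothesis.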
How are correlation results reported?
r(18)= 0.254, p=0.280
How are degrees of freedom calculated for correlation problems?
n-2
Ex: sample of 20-2=18
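A sketch of calculating Pearson's r and its degrees of freedom by hand. The data and function name are made up for illustration, and the p-value lookup is left out.

```python
import math

def pearson_r(xs, ys):
    """Return (r, df) for two lists of paired measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    return r, n - 2  # degrees of freedom = n - 2

r, df = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3), df)  # 0.775 3
```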
When should you use a line graph?
When variable on x-axis is numerical, not categorical
What does each dot on a line graph represent?
A different mean
What must line graphs with 2 I.V.s have?
A Legend
What do asterisks indicate on a bar graph?
Which comparisons are significant, N.S. = Not significant
What does a scatterplot look like? When are they mostly used?
Variables on both axes are numerical, ALL the raw data are plotted. Used with small sample sizes. A measurement of X and a measurement of Y make a dot on the graph that shows trends and relationships
Which graph lacks error bars, bar graph or histogram?
Histogram because we are plotting counts, not means
When is a table used?
When there are too many statistical results to report in an analysis or show in a graph.
What 4 other depictions can be used to show data?
Photographs, flowcharts, diagrams, and maps
How is “and” probability calculated?
Prob. of “A” x Prob. of “B” (when the events are independent)
Ex: Flipping a coin to get heads twice= 1/2 x 1/2 = 1/4
How is “or” probability calculated?
Prob. of “a” + Prob. of “b” (when the outcomes can’t happen together)
Ex: Rolling a 2 OR a 3= 1/6 + 1/6 = 2/6 = 1/3
How are “at least” probability problems calculated?
Find the probability that the event never happens and subtract it from 1
Ex: At least 1 coin comes up heads=
Both tails is 1/2 x 1/2 = 1/4 , 1 - 1/4 = 3/4 = 75%
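The three probability examples above can be checked with exact fractions. This is only a sketch; the variable names are made up.

```python
from fractions import Fraction

p_heads = Fraction(1, 2)
p_face = Fraction(1, 6)   # any one face of a fair die

# "And": multiply (two independent coin flips both come up heads)
p_two_heads = p_heads * p_heads

# "Or": add (rolling a 2 or a 3, which can't both happen on one roll)
p_two_or_three = p_face + p_face

# "At least": 1 minus the chance it never happens (both flips are tails)
p_tails = 1 - p_heads
p_at_least_one_head = 1 - p_tails * p_tails

print(p_two_heads, p_two_or_three, p_at_least_one_head)  # 1/4 1/3 3/4
```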
How do you determine if a study is real?
Replicate the experiment and see if you get the same results
Define Validity
The extent to which a measure actually indicates what it’s intended to
Peer-Reviews don’t…….
Protect against falsified data
List three ways a report could be inadequate or false by lying with statistics:
- Sample size unrepresentative/small/biased
- Inadequate controls/comparisons
- Non-randomized
What is the problem with non-random studies?
Could be biased allocation of patients to certain groups
What is regression used for?
Predicting “Y” from “X” using a fitted equation
What is the difference between the chi square test of independence and the goodness of fit test?
The test of independence uses 2 or more categorical variables to find out whether they are related, while the goodness of fit test looks at 1 categorical variable to see if observed frequencies match the frequencies expected under a hypothesized ratio.