Statistics Flashcards

1
Q

Assumptions in ANOVA

A

The Analysis of Variance (ANOVA) test is used to explore differences among the means of two or more groups, as well as to examine the individual and interacting effects of multiple independent variables. The first assumption of ANOVA is homogeneity of variance, which expects the populations being compared to have equal variances; this is important because unequal variances might lead a researcher to erroneous conclusions about true or null differences between groups. The second assumption is normal distribution of the errors/residuals within groups/conditions. The third assumption is that the observations in the groups being examined are independent.
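
As a rough illustration (not part of the original card), the sketch below checks these assumptions with scipy before running a one-way ANOVA; the three groups and their values are made-up placeholder data.

```python
# Sketch: checking ANOVA assumptions with scipy before running the omnibus test.
# The three groups below are illustrative placeholder data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1, g2, g3 = (rng.normal(loc, 1.0, size=30) for loc in (5.0, 5.5, 6.0))

# Homogeneity of variance: Levene's test (p > .05 suggests equal variances is tenable)
lev_stat, lev_p = stats.levene(g1, g2, g3)

# Normality of residuals: subtract each group mean, then test the pooled residuals
residuals = np.concatenate([g - g.mean() for g in (g1, g2, g3)])
sw_stat, sw_p = stats.shapiro(residuals)

# Independence is a design issue (random assignment, no repeated measures), not a statistic.
f_stat, f_p = stats.f_oneway(g1, g2, g3)
print(f"Levene p={lev_p:.3f}, Shapiro p={sw_p:.3f}, ANOVA F={f_stat:.2f}, p={f_p:.3f}")
```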

2
Q

Interaction in ANOVA

A

Exploring interaction effects within a test like a factorial ANOVA asks whether the effect of one independent variable on the dependent variable depends on the level of another independent variable. A significant interaction takes precedence over significant main effects (overall differences on a single variable) when interpreting the results. An example of an interaction effect that might be investigated is whether a CBT treatment effect that differs significantly between group one and group two is independent of, or interacts with, gender. If an interaction effect is significant, results might show that the treatment effect was much greater for cisgender women than transgender women, possibly due to the lesser applicability of treatment modules to the unique characteristics of transgender women’s lives.
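
A minimal sketch of a 2x2 factorial ANOVA with an interaction term, assuming the statsmodels formula interface; the column names (score, treatment, gender) and the simulated scores are purely illustrative, not from the card.

```python
# Sketch: a 2x2 factorial ANOVA where the treatment effect may depend on gender.
# Column names (score, treatment, gender) and the data are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(1)
n = 40  # per cell
df = pd.DataFrame({
    "treatment": np.repeat(["CBT", "control"], n * 2),
    "gender": np.tile(np.repeat(["cis", "trans"], n), 2),
})
# Build in an interaction: CBT helps, but less so for one group.
effect = {("CBT", "cis"): 2.0, ("CBT", "trans"): 0.5,
          ("control", "cis"): 0.0, ("control", "trans"): 0.0}
df["score"] = [effect[(t, g)] for t, g in zip(df.treatment, df.gender)] + rng.normal(0, 1, len(df))

model = ols("score ~ C(treatment) * C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects plus the treatment x gender interaction
```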

3
Q

Simple/Main Effects

A

When an interaction is significant, follow-up tests examining simple and main effects allow one to examine the interaction further. A simple effect is defined as the effect of one IV within a single level of the second IV. For example, a simple effect might examine the scores of transgender women in the CBT treatment condition. A main effect, on the other hand, examines the effect of one IV on the DV averaged across the levels of the second IV. For instance, we might expect a main effect of treatment condition generally because one group receives an intervention and another does not. An interaction would be implied if this improvement with treatment varied depending on participants’ gender.

4
Q

Assumptions in Linear Regression

A

A linear regression examines how well predictor (X) variables predict a criterion (Y) variable; both are typically on an interval or ratio scale, or nominal if dummy coded. The first assumption of a linear regression is that the predictor and criterion have a linear relationship, with a straight line of best fit; this can be examined through scatterplots. The second assumption is that the residuals are normally distributed (unimodal, symmetrical, and mesokurtic); this can be examined through histograms and normality statistics. The third assumption is homoscedasticity, meaning the variance of the residuals is consistent across all levels of the predictor; this can be examined by plotting the residuals. The fourth assumption is independence of observations and residuals, ensuring no correlation exists between errors.
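
A small sketch of how these checks might look in practice on simulated data; the crude low-vs-high split used for homoscedasticity is an illustrative shortcut, not a formal test.

```python
# Sketch: basic linear-regression assumption checks with numpy/scipy on illustrative data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)   # roughly linear with constant error variance

slope, intercept, r, p, se = stats.linregress(x, y)
residuals = y - (intercept + slope * x)

# Linearity: inspect a scatterplot of x vs y (or residuals vs fitted values).
# Normality of residuals:
print("Shapiro-Wilk on residuals:", stats.shapiro(residuals).pvalue)
# Homoscedasticity (crude check): compare residual spread in the lower vs upper half of x.
lo, hi = residuals[x < np.median(x)], residuals[x >= np.median(x)]
print("SD of residuals, low vs high x:", lo.std(ddof=1), hi.std(ddof=1))
```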

5
Q

Central Limit Theorem/Sampling Distribution of the Mean

A

The sampling distribution of the mean is the distribution of values we would expect to obtain for the mean if we drew an infinite number of samples from the population in question and calculated the mean of each sample. The central limit theorem describes how the sampling distribution of the mean (i.e., the distribution of sample means) approaches normal as sample size increases, even if the parent population is not normally distributed. The rate at which the sampling distribution of the mean approaches normal as n increases is a function of the shape of the parent population. If the parent population is itself normal, then the sampling distribution of the mean will be normal regardless of n. When sample sizes are large enough (>30), the sampling distribution will be approximately normal even if the population does not have a normal distribution. With smaller samples, the sampling distribution more closely reflects the shape of the parent population, though if the parent population is unimodal the sampling distribution should still be relatively unimodal.
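
A quick simulation sketch of the theorem: sample means drawn from a skewed (exponential) parent become less skewed as n increases. The parent distribution and sample sizes are arbitrary choices for illustration.

```python
# Sketch: the sampling distribution of the mean from a skewed (exponential) parent
# population becomes approximately normal as n grows.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
for n in (2, 5, 30):
    sample_means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(f"n={n:>2}: skewness of sampling distribution = {stats.skew(sample_means):.2f}")
# Skewness shrinks toward 0 (normality) as n increases, even though the parent is skewed.
```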

6
Q

Box plot

A

The box plot was developed by John Tukey as an alternative to the histogram or stem-and-leaf display for examining the dispersion of data. It is designed to examine the data, and potential outliers, around the median by using the 1st and 3rd quartiles, which bracket the middle 50% of scores (i.e., the interquartile range, IQR). A box is drawn from the 1st to the 3rd quartile, with a line inside the box representing the median. The whiskers extend from the edges of the box to the most extreme scores that still fall within 1.5 x IQR of Q1 and Q3; any point beyond these fences is considered an outlier. Examining the box plot allows us to tell whether the distribution is symmetric by checking whether the median lies in the center of the box, and skewness can be judged by comparing the lengths of the whiskers. Outliers appear as values plotted beyond the whiskers.
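
A sketch of the quantities a box plot is built from (quartiles, IQR, fences, whiskers, outliers), computed directly on a small made-up data set.

```python
# Sketch: the quantities a box plot is built from, computed directly with numpy.
import numpy as np

scores = np.array([3, 5, 6, 7, 7, 8, 9, 10, 11, 12, 13, 30])  # 30 is a likely outlier
q1, median, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
whisker_low = scores[scores >= lower_fence].min()   # whiskers stop at the most extreme
whisker_high = scores[scores <= upper_fence].max()  # points still inside the fences
outliers = scores[(scores < lower_fence) | (scores > upper_fence)]
print(q1, median, q3, iqr, (whisker_low, whisker_high), outliers)
```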

7
Q

Homoscedasticity vs. Heteroscedasticity

A

Homoscedasticity (or homogeneity of variance) is an assumption of ANOVA and linear regression that the populations or variables being compared have the same variance or error. Homoscedasticity can be examined through a residual scatterplot, which should show an even, patternless band of points around the regression line. Heteroscedasticity (or heterogeneity of variance) is the opposite, in which populations or variables have different variances. Heteroscedasticity is a larger issue in research because the residuals systematically change based on the level of the IV. We expect errors in prediction, but we want those errors to be random in nature and therefore uniformly distributed around the regression line. In a plot of residuals against fitted values, heteroscedasticity is implied when the points cone or fan out.

8
Q

Confidence intervals

A

A confidence interval is a range of values that likely contains an unknown population parameter, typically computed at a confidence level of 95%. The upper and lower bounds of a confidence interval change from sample to sample. In other words, if we used the same sampling method to select different samples and computed an interval estimate each time, we would expect the intervals to contain the true population parameter 95% of the time. Some misinterpret CIs to mean that we are 95% confident the true population parameter falls within the particular interval calculated. CIs are statements about the probability that the procedure produces an interval encompassing the target population parameter, not statements about the parameter itself, because the population parameter does not vary. Since the parameter is fixed, variation in the CIs is due to sampling error.
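
A coverage-simulation sketch of this interpretation: the population mean is fixed, each sample yields a different interval, and roughly 95% of the intervals contain it. The population values used here are arbitrary.

```python
# Sketch: "95% confidence" is a statement about the procedure, not any single interval.
# Repeatedly sampling and building a 95% CI for the mean, about 95% of the intervals
# should cover the fixed population mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
true_mu, sigma, n, reps = 100.0, 15.0, 25, 10_000
covered = 0
for _ in range(reps):
    sample = rng.normal(true_mu, sigma, n)
    se = sample.std(ddof=1) / np.sqrt(n)
    t_crit = stats.t.ppf(0.975, df=n - 1)
    lo, hi = sample.mean() - t_crit * se, sample.mean() + t_crit * se
    covered += (lo <= true_mu <= hi)
print("Coverage:", covered / reps)   # close to 0.95
```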

9
Q

Family-wise Error Rate/Post-Hoc

A

When making comparisons between group means, a family of conclusions is made. This is commonly encountered when conducting post-hoc tests on an analysis like ANOVA, in which one considers how specific groups differ on an outcome. The family-wise error rate is the probability that these comparisons will produce at least one Type I error. While the error rate of each comparison on its own stays the same, the error rate across the family of comparisons is inflated (i.e., Type I error accumulates); this is why researchers like Howell (2013) emphasize making post-hoc test decisions based in theory and practice rather than searching for every possible comparison, because more comparisons increase the family-wise error rate. A common way to control the family-wise error rate is the Bonferroni correction, which uses a more conservative alpha level obtained by dividing alpha by the number of comparisons being made.
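
A small arithmetic sketch of how the family-wise error rate grows with the number of comparisons (assuming independent tests), and the corresponding Bonferroni-corrected alpha.

```python
# Sketch: how the family-wise error rate grows with the number of comparisons,
# assuming independent tests, and the Bonferroni-corrected alpha.
alpha = 0.05
for m in (1, 3, 6, 10):
    fwer = 1 - (1 - alpha) ** m          # P(at least one Type I error) across m tests
    bonferroni_alpha = alpha / m         # per-comparison alpha that keeps FWER near .05
    print(f"{m:>2} comparisons: FWER ~ {fwer:.3f}, Bonferroni alpha = {bonferroni_alpha:.4f}")
```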

10
Q

Effect size vs Statistical Significance

A

Reporting effect sizes has become common practice because it provides information about statistical results beyond significance testing. Significance testing on its own examines only whether results, whether trivial or major, are unlikely to be due to chance, based on a chosen alpha level (e.g., p < .05). Statistical significance is influenced by sample size: significance is more likely to be found with the greater power that comes from larger samples. Effect sizes (e.g., the d-family or r-family), however, speak to “importance,” or the degree of meaningfulness of the difference or relationship being examined. Effect sizes can be standardized so they are comparable across studies (e.g., Cohen’s d is expressed in standard deviation units), with larger effect sizes implying greater practical significance of a finding. A test using 15,000 individuals may reach statistical significance simply because of the large number of participants, yet the effect found might be trivial given the effect size. Reporting effect sizes is therefore necessary so researchers can interpret the meaningfulness of a significant finding for real-world applications.
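
A simulated sketch of this point: with 15,000 participants per group, a difference of less than one point on a 15-SD scale is statistically significant even though Cohen's d is trivial. The means and SDs are made up.

```python
# Sketch: with a very large n, a trivially small difference is "significant"
# even though Cohen's d shows it is practically negligible. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 15_000
a = rng.normal(100.0, 15.0, n)
b = rng.normal(100.75, 15.0, n)         # true difference of 0.75 points (d = 0.05)

t, p = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd
print(f"p = {p:.2e} (significant), Cohen's d = {cohens_d:.3f} (trivial)")
```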

11
Q

Measures of central tendency/Mean, Median, Mode

A

Mean, median, and mode are measures of central tendency. The mode is the most common score; it can be used with nominal/ordinal data and is unaffected by extreme values, but it may not represent the entire data set, can be unstable from sample to sample, and a distribution may have more than one mode. The median is the score at the 50th percentile; its value may not actually occur in the data, it is unaffected by extreme values, and the median location = (n + 1)/2. The mean is the arithmetic average; its value may not actually occur in the data, it can be manipulated algebraically, it is influenced by extreme values, and it is usually a better estimate of the population mean. Under the normal distribution, the mean, median, and mode are all equal. Skew pulls the mean toward the longer tail of the distribution.
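
A tiny computed sketch of the last point: in a positively skewed sample the mean is pulled toward the long tail while the median and mode stay near the bulk of the scores. The numbers are made up.

```python
# Sketch: in a positively skewed sample the mean is pulled toward the long right tail,
# while the median and mode stay nearer the bulk of the scores.
import numpy as np
from collections import Counter

data = np.array([2, 3, 3, 3, 4, 4, 5, 6, 9, 21])   # one long right tail
mode = Counter(data.tolist()).most_common(1)[0][0]
print("mode:", mode, "median:", np.median(data), "mean:", data.mean())
```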

12
Q

Multicollinearity

A

Collinearity occurs when the predictors within an equation or model are correlated with one another. This might happen because the predictors measure the same construct, because they are naturally correlated (e.g., weight and BMI), or because of sampling error. High multicollinearity is problematic because it inflates the standard errors of regression coefficients, which can lead one to conclude there is no relationship between a predictor and the criterion. Additionally, multicollinearity can lead to faulty conclusions about R2, since overlapping predictors add little unique predictive capability; predictor variables that overlap with one another do little to explain the intricacies of the prediction. Tolerance quantifies the degree of overlap between predictors (and thus instability in the model), and the variance inflation factor (VIF, calculated as 1 divided by tolerance) quantifies how much a predictor inflates the standard errors. We want high tolerance (at least above .10) and low VIF (<10). To correct multicollinearity: (1) eliminate redundant X variables from the analysis, (2) combine X variables through factor analysis, or (3) for some forms of multicollinearity, use centering and transform the variable into deviations from its mean.
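
A sketch of computing tolerance and VIF by hand, regressing each predictor on the others; the predictors (weight, height, and a BMI derived from them) are simulated so that they are highly collinear.

```python
# Sketch: tolerance and VIF for each predictor, computed by regressing it on the
# other predictors. Predictor names and data are illustrative.
import numpy as np

rng = np.random.default_rng(6)
n = 200
weight = rng.normal(70, 10, n)
height = rng.normal(170, 8, n)
bmi = weight / (height / 100) ** 2            # nearly redundant with weight and height
X = np.column_stack([weight, height, bmi])

for j, name in enumerate(["weight", "height", "bmi"]):
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(n), others])
    beta, *_ = np.linalg.lstsq(design, X[:, j], rcond=None)
    pred = design @ beta
    r2 = 1 - ((X[:, j] - pred) ** 2).sum() / ((X[:, j] - X[:, j].mean()) ** 2).sum()
    tolerance = 1 - r2
    print(f"{name}: tolerance = {tolerance:.3f}, VIF = {1 / tolerance:.1f}")
```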

13
Q

Residuals

A

Residuals are the differences between observed scores and the scores predicted by the regression line; they represent the error in prediction. Residuals reveal the amount of variability in the DV that is “left over” after accounting for the variability explained by the predictors in the analysis. In regression, one is making a prediction that an IV is associated with the DV; a residual is a numeric value for how much that prediction was wrong. The smaller the residuals, the more accurate the predictions in the regression, which indicates the IVs are related to, or predictive of, the DV. Scatterplots with a line of best fit can help visualize residuals in a data set. An assumption in regression is homoscedasticity, which requires that the variance of the residuals is consistent across levels of the predictor.

14
Q

Measures of Dispersion/Variability & Variance

A

Measures of dispersion/variability help indicate the degree to which individual observations are clustered around or deviate from the average value, as the average value may reflect a tight range or wide range of values. Dispersion can occur around the mean, median, or mode. Measures of dispersion are as follows:

Range – The distance between the highest and lowest scores (e.g., 1-10). It is heavily dependent on extreme scores, making it difficult to ascertain the spread of scores in the middle or the overall variability in the data.

IQR – This method attempts to mitigate the effect of extreme scores on the range by using the middle 50% of scores (Q3 - Q1) and discarding the upper and lower 25% of the distribution. This can be an issue if too much potentially meaningful data is discarded. Winsorizing is similar, but instead replaces a percentage of the most extreme scores (e.g., the top and bottom 10%) with the nearest remaining score.

Standard Deviation – Defined as the positive square root of the variance; it represents roughly the average deviation of scores from the mean. In a normal distribution, the standard deviation allows us to examine how distant a score is from the mean, giving a picture of the overall variation in a data set.

Variance – The sum of the squared deviations from the mean divided by the number of scores (N - 1 for a sample, N for a population). This also shows how scores are dispersed around the mean. The square root of the variance (the SD) is usually reported because the variance itself is in squared units and has little direct interpretability.
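
A small computed sketch of these measures on made-up scores, where one extreme value inflates the range but barely moves the IQR.

```python
# Sketch: the dispersion measures above computed on a small illustrative sample.
import numpy as np

scores = np.array([2, 4, 4, 5, 6, 7, 7, 8, 9, 28])   # one extreme score inflates the range
value_range = scores.max() - scores.min()
q1, q3 = np.percentile(scores, [25, 75])
iqr = q3 - q1
variance = scores.var(ddof=1)        # sum of squared deviations / (N - 1) for a sample
sd = np.sqrt(variance)
print(f"range={value_range}, IQR={iqr}, variance={variance:.2f}, SD={sd:.2f}")
```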

15
Q

Normal distribution (contaminated/mixed)

A

The normal distribution refers to how values are distributed overall and tends to describe natural phenomena (e.g., IQ). Under the normal distribution, the mean, median, and mode are equal, the distribution is unimodal, it has no skew, it is symmetric, and it is mesokurtic (i.e., the bell curve). In a normal distribution, about 68% of observations fall within ±1 SD of the mean. Normal distributions also allow one to calculate z-scores, which have a mean of 0 and an SD of 1. A contaminated (or mixed) distribution was described by Tukey (1960) as occurring when two normal distributions are mixed with certain probabilities, leading to heavy tails because a wider distribution contaminates the primary distribution. Some data points may be outliers or come from a distribution with a different mean or variance.
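
A simulation sketch of a contaminated normal in the spirit of Tukey (1960): mixing a small proportion of draws from a much wider normal into a standard normal produces heavy tails (positive excess kurtosis). The 10% contamination rate here is an arbitrary choice.

```python
# Sketch: a contaminated (mixed) normal distribution: most observations come from
# N(0, 1), but a small fraction come from a much wider normal, producing heavy tails.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 100_000
contaminated = np.where(rng.random(n) < 0.10,       # 10% contamination rate (illustrative)
                        rng.normal(0, 5, n),        # wide contaminating distribution
                        rng.normal(0, 1, n))        # primary distribution
print("excess kurtosis:", stats.kurtosis(contaminated))   # well above 0 (i.e., not mesokurtic)
```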

16
Q

Outliers and their effects

A

Outliers are scores that deviate markedly from the rest of the data; they can be extreme values. Outliers sometimes represent errors in recording or scoring data, or genuinely extreme cases (e.g., an actual infant defect rather than a data-entry error). Outliers affect: (1) the skewness of a distribution, pulling it in the direction of the outliers; (2) the mean, pulling its value toward the outliers; (3) the range of values, increasing it; and (4) the results of nonlinear transformations of the values. Outliers might be handled by removing them from the dataset altogether (and reporting the removal) or by trimming/Winsorizing the values to the next closest score that is not an outlier. Outliers can be identified using a box plot or histogram.

17
Q

Pearson’s correlation coefficient

A

Pearson’s product-moment correlation coefficient ranges from -1 to +1 and reveals the degree of the linear relationship between two variables. The closer the value is to -1 or +1, the stronger the relationship; in other words, the more closely the points fit a regression line, the stronger the linear relationship. The sign indicates the direction of the relationship: values of -.23 and +.23 reflect the same degree of relationship but different directions (negative [variables move in opposite directions] or positive [same direction]). However, the correlation does not give us information about cause and effect, and correlation coefficients are biased in small samples (i.e., they may not represent the population). Formula: r = cov(X, Y) / (sX sY), the covariance of X and Y divided by the product of the standard deviations of X and Y.
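
A sketch verifying the formula numerically against scipy's built-in Pearson r on simulated data.

```python
# Sketch: Pearson's r computed from the formula r = cov(X, Y) / (s_X * s_Y),
# checked against scipy's built-in on illustrative data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(50, 10, 200)
y = 0.4 * x + rng.normal(0, 8, 200)

cov_xy = np.cov(x, y, ddof=1)[0, 1]
r_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
r_scipy, p = stats.pearsonr(x, y)
print(round(r_manual, 4), round(r_scipy, 4))   # identical
```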

18
Q

Scatter Plot of Bivariate Regression

A

A scatterplot helps one visually examine the relationship between two variables (IV and DV / predictor and criterion) in a data set. The data are displayed as a collection of points, one per participant, with the IV value on the horizontal axis and the DV value on the vertical axis. A scatterplot shows whether there is a linear relationship and displays its strength, direction, and shape (e.g., linear, curvilinear). A regression equation can then be fitted that best explains the data.

19
Q

Slope and intercept in bivariate (1-predictor) regression

A

The formula is Ŷ = bX + a, where Ŷ is the predicted value of Y (DV) for a given X value; b is the slope of the regression line, the amount of change in Ŷ associated with a one-unit change in X (positive or negative), representing the steepness or rate of change of the line (it is an estimate, not a perfect value); a is the intercept, the value of Ŷ when X = 0; and X is the value of the predictor variable (IV). The regression equation defines the linear relationship between the two variables and serves as the best estimate of the average rate of change (i.e., knowing both the intercept and slope allows one to predict values of Y from a given X value).
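
A sketch computing b and a from their definitional least-squares formulas (b = cov(X, Y) / var(X) and a = mean(Y) - b x mean(X)) on simulated data.

```python
# Sketch: least-squares slope and intercept from their definitional formulas,
# b = cov(X, Y) / var(X) and a = mean(Y) - b * mean(X), on illustrative data.
import numpy as np

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(0, 1.5, 50)

b = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)   # slope: change in Y-hat per unit X
a = y.mean() - b * x.mean()                      # intercept: predicted Y when X = 0
y_hat = a + b * x                                # predicted values from Y-hat = bX + a
print(f"slope b = {b:.2f}, intercept a = {a:.2f}")
```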

20
Q

Standard error of estimate (SEE)

A

In regression, the SEE describes the average distance between the regression line and the actual data points on the dependent variable (it is the standard deviation of the residuals/errors). The SEE quantifies the prediction error in the regression model and helps us determine how well the regression line predicts Y values: the smaller the SEE, the better the fit of a regression model to its data (i.e., less error), and the greater the distance between obtained and predicted values, the higher the SEE. Around 95% of the data should fall within ±2 SEEs of the regression line. Squaring the SEE gives the residual (error) variance.
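
A sketch computing the SEE as the standard deviation of the residuals (using n - 2 degrees of freedom for a one-predictor regression) on simulated data.

```python
# Sketch: the standard error of estimate as the SD of the residuals around the
# regression line, using n - 2 degrees of freedom for a bivariate regression.
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
x = rng.uniform(0, 10, 60)
y = 5.0 + 1.2 * x + rng.normal(0, 2.0, 60)

res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)
see = np.sqrt((residuals ** 2).sum() / (len(x) - 2))   # smaller SEE = better-fitting line
print(f"SEE = {see:.2f}; residual (error) variance = SEE^2 = {see**2:.2f}")
```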

21
Q

Standard error of the mean (different from SEM/measurement)

A

The standard error of the mean (sem) is the standard deviation of the sampling distribution of the mean (i.e., the larger the sem, the wider that distribution), and it allows one to estimate how far a sample mean is likely to fall from the population mean. The sem equals the population SD (sometimes estimated by the sample SD) divided by the square root of n, which is why the sem decreases as sample size increases.
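
A sketch of the formula compared against scipy.stats.sem, showing the sem shrinking as n grows; the population mean and SD are arbitrary.

```python
# Sketch: the standard error of the mean from its formula, SD / sqrt(n),
# checked against scipy, and its decrease as n grows.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
for n in (10, 100, 1000):
    sample = rng.normal(100, 15, n)
    manual = sample.std(ddof=1) / np.sqrt(n)
    print(f"n={n:>4}: sem = {manual:.2f} (scipy: {stats.sem(sample):.2f})")
```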

22
Q

Z-scores

A

Z-scores are used to transform data points into standardized scores that preserve each score’s relative standing and the overall shape of the distribution (i.e., a non-normal distribution stays non-normal). A z-score represents the number of standard deviations a score falls above (+) or below (-) the mean. Z-scores have a distribution with a mean of 0 and an SD of 1. The formula is z = (X - μ) / σ (i.e., the value minus the population mean, divided by the population SD). For some intelligence tests such as the WAIS, z-scores are transformed to a distribution with a mean of 100 and an SD of 15. Ultimately, z-scores allow for an alternative interpretation of scores that may be simpler than raw scores; for instance, they allow a researcher to compare distributions across studies and to compare individuals’ scores. About 95% of scores fall within 2 SDs on either side of the mean.
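
A tiny sketch of the transformation and the WAIS-style rescaling; the scores and population parameters are made up.

```python
# Sketch: z-scores and a WAIS-style rescaling to mean 100, SD 15.
import numpy as np

scores = np.array([85, 100, 115, 130], dtype=float)
mu, sigma = 100.0, 15.0           # population mean and SD (illustrative)
z = (scores - mu) / sigma         # z = (X - mu) / sigma
iq_scale = 100 + 15 * z           # back to an IQ-style metric
print(z, iq_scale)
```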

23
Q

Student’s t-test

A

Student’s t-test is a null-hypothesis significance test developed by Gosset under the pseudonym “Student.” It tests whether there are mean differences between two groups (or between a sample and a population) using the t-statistic and the t-distribution. There are three types of Student’s t-test: (1) one-sample, which compares a sample mean to a known population mean, using the sample SD as an estimate of the unknown population SD, based on the premise that the sample was drawn from a normally distributed population; (2) paired/dependent-samples, in which two sets of scores are paired and the difference scores are tested against 0; and (3) independent-samples, which tests the difference between the means of two independent groups. Assumptions are normality of the sampling distribution, homogeneity of variance, and independence of observations.
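
A sketch of the three forms using scipy.stats on simulated data; the group sizes, means, and the comparison value of 52 are arbitrary choices.

```python
# Sketch: the three forms of Student's t-test in scipy on illustrative data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
pre = rng.normal(50, 10, 30)
post = pre + rng.normal(3, 5, 30)        # same people measured twice
group_a = rng.normal(50, 10, 40)
group_b = rng.normal(55, 10, 40)         # two independent groups

print(stats.ttest_1samp(group_a, popmean=52))    # one-sample: sample mean vs known mu
print(stats.ttest_rel(pre, post))                # paired: are difference scores != 0?
print(stats.ttest_ind(group_a, group_b))         # independent samples (assumes equal variances)
```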

24
Q

Type I vs. Type II Error

A

Type I error refers to rejecting the null hypothesis when it is in fact true; for example, a researcher concludes that the result confirms their hypothesis when the result was actually due to random error. The probability of a Type I error is represented by alpha, which is usually set (somewhat arbitrarily) prior to hypothesis testing (e.g., 0.05, 0.01). By reducing alpha, and thus the probability of a Type I error, we increase the probability of committing a Type II error. A Type II error is failing to reject (or accepting) the null hypothesis when the null hypothesis is in fact false and the alternative hypothesis is true (i.e., a real effect was missed); its probability is represented by beta. For example, in a given design, lowering alpha to .01 might raise beta to .92.

Type I and Type II error rates matter because, depending on the study being run, it may be important to reduce alpha, thereby increasing the odds of making a Type II error but decreasing the odds of making a Type I error. For example, a study might explore the effects of a drug with extreme side effects but impactful benefits; reducing alpha to 0.01 helps researchers err on the side of caution so they do not dispense a drug that does not truly cure the disease and would expose patients to dangerous side effects unnecessarily (a Type I error).
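
A simulation sketch of the trade-off: holding a made-up effect size and sample size fixed, lowering alpha from .05 to .01 raises beta. The specific beta values depend entirely on these assumed inputs.

```python
# Sketch: the alpha/beta trade-off by simulation. With a real (small) effect present,
# lowering alpha from .05 to .01 reduces Type I risk but raises beta (missed effects).
import numpy as np
from scipy import stats

rng = np.random.default_rng(14)
reps, n, true_effect = 5_000, 25, 0.5          # effect in SD units (illustrative)
p_values = np.array([
    stats.ttest_ind(rng.normal(true_effect, 1, n), rng.normal(0, 1, n)).pvalue
    for _ in range(reps)
])
for alpha in (0.05, 0.01):
    beta = np.mean(p_values >= alpha)          # proportion of real effects we miss
    print(f"alpha = {alpha}: beta ~ {beta:.2f}, power ~ {1 - beta:.2f}")
```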

25
Q

Shape of Frequency Distribution

A

In examining a frequency distribution, we can look at: (1) Modality - the number of major peaks in the distribution; a unimodal distribution has one major peak and a bimodal distribution has two (the peaks do not have to be the same height). (2) Symmetry - whether the distribution has the same shape on both sides of the center; a normal distribution peaks in the center and falls evenly and symmetrically on both sides. (3) Skewness - the degree of asymmetry in the distribution; a positively skewed distribution piles up on the left with a long right tail, while a negatively skewed distribution piles up on the right with a long left tail, rather than centering. (4) Kurtosis - the relative concentration of scores in the center, tails, and shoulders of the distribution. A mesokurtic (normal) distribution has tails that are neither too thin nor too thick and neither too many nor too few scores concentrated in the center. A platykurtic distribution flattens out because more scores are concentrated in the shoulders. A leptokurtic distribution is peaked, with too many scores in the center and in the tails.
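
A sketch computing skewness and excess kurtosis with scipy for a symmetric, a positively skewed, and a heavy-tailed simulated sample.

```python
# Sketch: numeric summaries of distribution shape (skewness and excess kurtosis)
# for a symmetric, a positively skewed, and a heavy-tailed sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
normal = rng.normal(0, 1, 50_000)                  # ~symmetric, mesokurtic
right_skewed = rng.exponential(1, 50_000)          # positive skew: long right tail
heavy_tailed = rng.standard_t(df=5, size=50_000)   # leptokurtic: heavy tails

for name, data in [("normal", normal), ("right-skewed", right_skewed), ("heavy-tailed", heavy_tailed)]:
    print(f"{name:>12}: skew = {stats.skew(data):+.2f}, excess kurtosis = {stats.kurtosis(data):+.2f}")
```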