Statistical Analysis Flashcards
Independent Samples t-test
Definition and Assumptions
Definition:
Determine if there is a significant difference between the means of two independent groups.
Assumptions:
Data from each group are independent.
Data are approximately normally distributed.
The variances of the two groups are approximately equal.
Analysis with t-test
Analysis and Interpretation
Compare the p-value with the chosen significance level (usually 0.05).
If the p-value is less than the significance level (e.g., p < 0.05), the difference between the group means is statistically significant.
If the p-value is greater than or equal to the significance level (e.g., p ≥ 0.05), there is insufficient evidence to conclude a significant difference.
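The analysis on this card can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the scores are hypothetical, the function name `pooled_t_test` is made up for this example, and it assumes equal variances. An exact p-value needs the t distribution, which the standard library does not provide, so the sketch compares the t statistic with the two-tailed critical value from a t table instead. That comparison leads to the same decision as checking p < 0.05.

```python
import math
import statistics

def pooled_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an equal-variance t-test."""
    n1, n2 = len(group_a), len(group_b)
    mean1, mean2 = statistics.mean(group_a), statistics.mean(group_b)
    var1, var2 = statistics.variance(group_a), statistics.variance(group_b)  # sample variances
    df = n1 + n2 - 2
    # Pool the two variances, weighting each by its own degrees of freedom
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Hypothetical scores for two independent groups
group_a = [78, 82, 85, 88, 90]
group_b = [70, 72, 75, 77, 80]

t_stat, df = pooled_t_test(group_a, group_b)

# Two-tailed critical value of t for df = 8 at alpha = 0.05 (standard t table)
T_CRITICAL = 2.306
significant = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {significant}")
```

With these made-up scores, |t| exceeds the critical value, so the difference would be called significant at the 0.05 level.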
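Repeated Measures (Paired Samples) t-test
Definition: Tests for a significant difference between the means of two related sets of scores, where each subject is measured at two time points or under two conditions. (Three or more time points call for a repeated measures ANOVA instead.)
Example: math skills measured over time, at the beginning (time 1) and at the end (time 2).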
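The paired test reduces to a one-sample test on the per-subject differences. The sketch below assumes hypothetical math scores for five students and, as above, uses a table critical value rather than an exact p-value.

```python
import math
import statistics

# Hypothetical math scores for the same five students at two time points
time1 = [60, 65, 70, 72, 80]
time2 = [66, 70, 73, 80, 84]

# The paired test works on each student's change from time 1 to time 2
differences = [after - before for before, after in zip(time1, time2)]
n = len(differences)
mean_diff = statistics.mean(differences)
std_error = statistics.stdev(differences) / math.sqrt(n)

t_stat = mean_diff / std_error
df = n - 1  # one set of differences, so df = n - 1

# Two-tailed critical value of t for df = 4 at alpha = 0.05 (standard t table)
T_CRITICAL = 2.776
print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {abs(t_stat) > T_CRITICAL}")
```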
Mean
Average of a set of numbers, calculated by summing all the values and dividing by the total count.
Calculation: Mean = sum of all values / total count
Median
The middle value of a dataset when arranged in ascending order (with an even number of values, the average of the two middle values).
Represents the central value well, especially when the data is skewed or contains outliers.
Mode
The value that appears most frequently in a dataset.
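The mean, median, and mode cards can be compared on one small dataset. The values below are hypothetical and include one outlier (40) to show how the mean is pulled toward it while the median is not.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10, 40]  # hypothetical data with one outlier (40)

mean_value = statistics.mean(scores)      # sum of all values / total count
median_value = statistics.median(scores)  # middle value of the sorted data
mode_value = statistics.mode(scores)      # most frequent value

print(f"mean = {mean_value}, median = {median_value}, mode = {mode_value}")
```

Here the mean is 10, the median is 5, and the mode is 3, so the single outlier doubles the mean relative to the median.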
Histogram
A graph showing how often different values occur in a dataset. It’s like splitting data into groups and counting how many values fall into each group.
Helps us see if data is skewed, has outliers, or follows a specific pattern, like a bell curve for normal distribution.
Helps us understand how data is spread out
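The "split into groups and count" idea behind a histogram can be sketched without any plotting library. The data and the bin width of 2 are arbitrary choices for illustration.

```python
from collections import Counter

values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # hypothetical whole-number data
bin_width = 2

# Assign each value to a bin identified by its lower edge, then count per bin
counts = Counter((v // bin_width) * bin_width for v in values)

# Print one row per bin; the row of # marks plays the role of the bar
for lower in range(0, max(values) + 1, bin_width):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * counts[lower]}")
```

The empty 6-7 bin followed by a lone value in the 8-9 bin is how an outlier shows up in a histogram.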
Histograms and Normality
A normal distribution looks like a symmetric, bell-shaped curve. It means most data points are in the middle, tapering off towards the ends.
This shape indicates that data is spread symmetrically around the average, so the mean, median, and mode coincide and the share of values within a given distance of the mean is predictable.
Normality
Data is symmetrically distributed around the mean, with the majority of values clustered near the center and fewer values spread out towards the tails.
Assessed visually using histograms, Q-Q plots, or box plots.
t-tests, ANOVA, and regression rely on the assumption of normality.
When the assumption holds, the results of these tests are more representative and valid.
Sample Size and Normality
As the sample size increases, the variability of the sampling distribution decreases. Also, as the sample size increases, the shape of the sampling distribution of the mean becomes more similar to a normal distribution, regardless of the shape of the population.
Rule of thumb: at least 30 participants are needed.
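A small simulation can make the first claim on this card concrete. The sketch below assumes a skewed (exponential) population, chosen only as an example of a non-normal shape, and a fixed random seed so the run is repeatable. The helper name `sample_means` is made up for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, n_samples=2000):
    """Draw repeated samples from a skewed population and return their means."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

# Spread of the sampling distribution of the mean for two sample sizes
spread_small = statistics.stdev(sample_means(5))
spread_large = statistics.stdev(sample_means(30))

print(f"SD of sample means, n = 5:  {spread_small:.3f}")
print(f"SD of sample means, n = 30: {spread_large:.3f}")
```

The spread of the sample means should come out clearly smaller for n = 30 than for n = 5, matching the claim that variability decreases as sample size grows.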
Q-Q Plot Graph
A Q-Q plot compares our data to a perfect “normal” dataset. It plots the quantiles of our data against the quantiles expected under a normal distribution.
Normality and Q-Q Plot Graph
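Picture dots on a graph, each pairing one of our data values with the value a normal distribution would predict for it. If the dots fall along a straight line, our data is approximately “normal”. If they curve or stray from the line, it’s not.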
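The points of a Q-Q plot can be computed by hand, which shows what the graph is actually comparing. This sketch assumes hypothetical measurements and the plotting positions (i - 0.5) / n, which is one common convention among several.

```python
import statistics

data = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 7.0]  # hypothetical measurements
n = len(data)
standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1

# Theoretical quantiles at plotting positions (i - 0.5) / n (one common convention)
theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
observed = sorted(data)

# Each (theoretical, observed) pair is one dot on the Q-Q plot
for x, y in zip(theoretical, observed):
    print(f"{x:6.2f}  {y:4.1f}")
```

If the observed values rise roughly in step with the theoretical ones, the dots would lie near a straight line.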
Box Plot
Displays the five-number summary of a set of data. The five-number summary is the minimum, first quartile, median, third quartile, and maximum.
Gives insight into the variability and central tendency of the data, as well as the presence of outliers, across groups or conditions.
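The five-number summary behind a box plot can be computed directly. The data below is hypothetical. Quartile conventions differ between textbooks and software, and this sketch uses the default ("exclusive") method of Python's `statistics.quantiles`.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3
q1, median, q3 = statistics.quantiles(data, n=4)
five_number = (min(data), q1, median, q3, max(data))

print("min, Q1, median, Q3, max =", five_number)
```

The box of the plot would span Q1 to Q3, with a line at the median and whiskers reaching towards the minimum and maximum.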
Scatterplot
Shows the relationship between two continuous variables, with each data point representing an observation.
Allows for the identification of patterns, trends, or correlations between variables.
Bar Chart
A representation of categorical data, where the height or length of each bar represents the frequency or proportion of observations in each category.
Facilitates comparisons between categories and visualizes differences in frequencies or proportions across groups or conditions.
Line Graph
Individual data points are connected by straight lines, typically to show trends or changes over time.
Suited to longitudinal data or changes in variables across different time points or conditions.
Degrees of Freedom
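Tell us how many values in a calculation are free to vary once a statistic (such as the mean) has been estimated from the data.
For an independent samples t-test: df = n1 + n2 - 2 (the total number of observations minus the number of groups, because one degree of freedom is lost for each group mean estimated).
Crucial in determining the appropriate critical values for hypothesis testing and estimating the variability of sample statistics.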
Sampling distribution of mean differences
The distribution of the differences in means between pairs of samples that are randomly drawn from the same population.
Helps us understand how much variability we might expect in the differences between sample means by chance alone.
Sampling distribution of mean differences and t-test
The t-test compares the means of two independent groups to determine if there is a significant difference between them.
By comparing the observed difference in sample means to the distribution of mean differences from the sampling distribution, we can assess whether the observed difference is statistically significant (if it is larger than the variability expected by chance, it is significant; if it falls within that variability, it is not).
P-value
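A p-value, or probability value, is a number describing how likely it is that results at least as extreme as yours would have occurred by random chance if the null hypothesis were true.
Quantifies the evidence against a null hypothesis: the smaller the p-value, the stronger the evidence.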
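One way to see what "occurred by random chance" means is a permutation test. Note that this is a different technique from the t-test on the earlier cards and is used here only to illustrate the idea of a p-value. The sketch assumes two small hypothetical groups and counts how many of the possible regroupings of the same eight values give a mean difference at least as large as the one observed.

```python
from itertools import combinations
import statistics

group_a = [12, 14, 15, 17]  # hypothetical scores
group_b = [8, 9, 11, 13]
pooled = group_a + group_b
observed = abs(statistics.mean(group_a) - statistics.mean(group_b))

# Try every way of splitting the 8 values into two groups of 4
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), len(group_a)):
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    # Count splits whose mean difference is at least as large as the observed one
    if abs(statistics.mean(a) - statistics.mean(b)) >= observed:
        extreme += 1

p_value = extreme / total
print(f"{extreme} of {total} splits are at least as extreme, p = {p_value:.3f}")
```

With these values, 4 of the 70 possible splits are at least as extreme as the observed one, giving p of about 0.057, which is just above the usual 0.05 level.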