W6: Independent and Dependent t-tests Flashcards
What are the 4 most common continuous probability distributions
- Normal
- Chi-Squared
  - Positive Skew
- (Student’s) t
  - Symmetrical
- F
  - Positive Skew
What kind of probability distributions are there? What are they defined by?
- Continuous / Discrete
- Univariate / Multivariate
  - Univariate = 1 particular statistic
    - e.g. variance
  - Multivariate = >1 statistic
    - e.g. covariance
- Central / Non-Central
  - Central
    - Under H0
  - Non-Central
    - Under H1
They are defined by their respective parameters
What are 2 ways of expressing probability distributions. How do they relate to each other
- Density Functions
  - E.g. Bell-shaped Curve
- Cumulative Distribution Functions
  - p values
- For continuous distributions, the Cumulative Distribution Function is obtained by integrating the Density Function
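A minimal R sketch of that relationship, using the standard normal as an example: integrating the density up to a point reproduces the CDF value.

```r
pnorm(1.96)                                  # CDF directly: P(Z <= 1.96), ~ .975
integrate(dnorm, lower = -Inf, upper = 1.96) # integrating the density gives the same area
```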
What is the normal density function defined by (in terms of parameters)
1.) Mean (μ)
2.) Standard Deviation (σ)
What is the difference between a “normal distribution” and a “standard normal distribution”
A normal distribution can have any mean and standard deviation; a Standard Normal Distribution fixes:
- Mean (μ) = 0
- Standard Deviation (σ) = 1
How do we change a normal distribution into a standardised normal distribution. What is it called
Argument Z or Z statistic
- Standard Normal Variate
- It transforms the x argument into a standardized Z value
  - Z = (x − μ) / σ
- Note: Capital letter Z
  - While the formula is the same as a z-score, it is NOT a z-score, but a z-distribution
  - Mean: 0; SD: 1
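A small R illustration of the transform, with hypothetical values (x = 120, μ = 100, σ = 15 are assumptions, not from the notes): the standardized Z gives the same probability as working on the raw scale.

```r
x <- 120; mu <- 100; sigma <- 15   # hypothetical raw score and population parameters
z <- (x - mu) / sigma              # Z = (x - mu) / sigma
pnorm(z)                           # probability from the standard normal
pnorm(x, mean = mu, sd = sigma)    # identical probability on the raw scale
```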
How do we calculate probabilities from a density function
Area:
- Below any value on x-axis
- Above any value on x-axis
- Between any 2 values on x-axis
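All three cases as they might be computed in R (pnorm gives the area under the standard normal density up to a value; the cut-points here are arbitrary):

```r
pnorm(-1)                      # area below -1
pnorm(2, lower.tail = FALSE)   # area above 2
pnorm(2) - pnorm(-1)           # area between -1 and 2
```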
What is so important about a probability distribution. Think through… The answer is long (Mathematically and Practically)
Mathematically
- A sampling distribution of a sample statistic will correspond to a particular probability distribution
- i.e., 1-1 relationship between sampling distribution and probability distribution
Practically
- We do not have to construct a sampling distribution to undertake research. Since we only have one sample and one summary characteristic, we can use a probability distribution to know what the distribution of the sample statistic (the sampling distribution) will be, and using that probability distribution, we can:
  - Construct CIs
  - Calculate p values
What are probability distributions used for
1.) Constructing confidence intervals
2.) Calculating p values for null hypothesis tests
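A minimal R sketch of the p-value side, with a hypothetical t statistic and degrees of freedom:

```r
t_stat <- 2.10; df <- 38        # hypothetical values, not from the notes
2 * pt(-abs(t_stat), df)        # two-sided p value from the t distribution
```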
Correlation coefficients, Cramer’s V, Odds ratio, Regression Coefficients, R2, mean difference
What are they distributed as
t (df)
- Correlation coefficients
- Regression coefficients
- Mean differences
Chi-Squared
- Cramer’s V
N (logOR, σ)
- Odds Ratios
F (df1, df2)
- R2
How do we calculate confidence intervals from probability distributions. What do we need?
Example: 95% CI for a t-distributed statistic
- Standard error estimate
  - Derived from the sample statistic (b̂)
  - σ̂_b
- Critical t-value
  - Derived from the desired confidence level
  - 95% CI → t.975
    - since (1 + .95)/2 = .975
Margin of Error (ME)
- ME = σ̂_b × t.975
- CI: (b̂ − ME) ≤ b ≤ (b̂ + ME)
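A minimal R sketch of these steps, with a hypothetical estimate, standard error, and df:

```r
b_hat <- 0.40; se <- 0.15; df <- 48   # hypothetical b-hat, SE, and df
crit  <- qt((1 + .95) / 2, df)        # critical t = t.975
me    <- crit * se                    # margin of error
b_hat + c(-1, 1) * me                 # 95% CI: (b_hat - ME, b_hat + ME)
```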
What kind of Margin of Error for Confidence Intervals do we normally get when it is derived from t-statistic probability distribution
Symmetrical Margin of Error
- (b̂ − ME) ≤ b ≤ (b̂ + ME)
- Guarantee that 95% of all CIs constructed from repeated samplings from the same population will contain the true population parameter value
When we think of a difference, we are interested in groups differing in _______
When we think of a difference, we are interested in whether groups differ in terms of their respective POPULATION MEANS
When phrasing RQ in terms of a difference, must we state which group is higher or lower
We can, but it’s not necessary.
We are interested in whether there is a difference.
When we think of a difference, we are essentially asking whether members of each group are
1.) From distinct populations defined by different means, or
2.) Subgroups within the same population with the same mean
How can two groups be formed. Some properties of the groups.
1.) Mutually-exclusive groups / Independent
- Each score in one group is independent of all scores in the other group
- Participants can only belong to one group
- Group sizes are not necessarily the same
2.) Mutually-paired groups / Dependent
- Each score in one group is linked to a score in the other group by either (a) being measured twice at different times or (b) a natural dependency (e.g., twins, husbands/wives)
- Group sizes must be the same
Are groups categorical / continuous
Categorical
How do we initially examine distribution of scores in both groups
Boxplots:
- Group medians
- Outliers
- Normality
- Homogeneity of Variance
(Note: boxplots show the median, NOT the mean)
After a boxplot to examine distribution of scores, what is nice to use to check out outliers
qqPlot (from the car package). Check the spread and outliers.
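A sketch of both checks in R, assuming a hypothetical data frame `dat` with a numeric `score` and a two-level factor `group`:

```r
boxplot(score ~ group, data = dat)     # group medians, outliers, spread/variance check
library(car)
qqPlot(dat$score[dat$group == "A"])    # spread, outliers, and normality within one group
```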
Since there will always be a difference between sample means of 2 groups, what are we actually finding out
Sample means will ALWAYS be different, but we are actually finding out whether this difference is:
- Due to random sampling variability alone, when the groups come from the same population
- Due to random sampling variability + a difference in population means, when the groups come from different populations
Sampling distribution of Group Mean Differences: What do we try to create
Confidence Intervals
Given a sampling distribution of group mean differences where M(μ1 − μ2) = 0, what does the confidence interval tell us
The range of plausible values if there is no mean difference
- (0 − crit.t × SE, 0 + crit.t × SE)
Given a sampling distribution of group mean differences where M(μ1 − μ2) = 1.25, what does the confidence interval tell us
The range of plausible values given the mean difference of 1.25
- (1.25 − crit.t × SE, 1.25 + crit.t × SE)
- Notice how the values are relative to 1.25
When investigating differences between 2 independent groups, what are the 3 assumptions
- Observations are independent
- Observed scores on the construct measure are normally distributed
- Homogeneity of variance in 2 groups
- Group 1 compared to Group 2
(No assumption of linearity)
When investigating differences between 2 independent groups, how do we ensure “Observations are independent” is met
- Usually met
- Unless there is duplication / one observation affects another
When investigating differences between 2 independent groups, how robust is the confidence interval if “Homogeneity of variance” is violated
- Balance of design:
  - Balanced
    - Protects against violation of the homogeneity of variance assumption
    - Unless variances are vastly different
  - Unbalanced
    - Not robust against even mild heterogeneity
    - Must calculate CIs using an adjusted separate-variance estimate
When investigating differences between 2 independent groups, how do we ensure “Homogeneity of variance” is met
Levene test or Fligner Test
- Can use a boxplot as a preliminary check
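Both tests as they might be run in R, assuming the same hypothetical `dat` as above (leveneTest is from the car package; fligner.test is in base R):

```r
library(car)
leveneTest(score ~ group, data = dat)    # Levene test of equal variances
fligner.test(score ~ group, data = dat)  # Fligner-Killeen test (more robust to non-normality)
```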
When investigating differences between 2 independent groups, compare unstandardised and standardised CI
Unstandardised
- Robust against mild-to-moderate non-normality (especially when it’s a balanced design)
Standardised (g or delta)
- Not robust against even mild non-normality
- Often used when the metric of the construct measure is arbitrary and the size of the observed mean difference is not easy to interpret
How robust is the standardized/unstandardized confidence interval in group differences
Unstandardized confidence intervals are robust against mild-to-moderate nonnormality in difference scores.
Standardized confidence intervals are not robust against even mild non-normality in difference scores.
What are 2 types of standardized mean differences. And what do they require?
- Hedges’ g
- Independence of Observations
- Normality
- Homogeneity of Variance
- Bonett’s
- Independence of Observations
- Normality
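A minimal hand-rolled sketch of Hedges’ g using the standard pooled-SD formula with small-sample correction (the formula is textbook-standard, not taken from these notes), again assuming the hypothetical `dat`:

```r
a  <- dat$score[dat$group == "A"]; b <- dat$score[dat$group == "B"]
n1 <- length(a); n2 <- length(b)
sp <- sqrt(((n1 - 1) * var(a) + (n2 - 1) * var(b)) / (n1 + n2 - 2))  # pooled SD
d  <- (mean(a) - mean(b)) / sp                 # standardized mean difference (Cohen's d)
g  <- d * (1 - 3 / (4 * (n1 + n2 - 2) - 1))    # small-sample bias correction -> Hedges' g
g
```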
When investigating differences between 2 independent groups, “Homogeneity of variance”: when given a high p value in the Levene test / Fligner test, what does it mean
- It is a null hypothesis test assuming homogeneity of variance
- The cut-off is typically p < .05, but if the sample size is small then p < .10 might be better
- A high p value means we can assume homogeneity of variance
  - “equal variance assumed”
In the investigation of difference between 2 independent groups, what is the convention with regards to which group is subtracted? Also, in the output, will “equal variance assumed” be the same as “unequal variance assumed”, in unstandardised and standardised
Usually it is the first group minus the second group.
Unstandardised Mean Difference
- The estimate for equal variance assumed will be the same as the estimate for unequal variance assumed (the difference itself is the same)
- The SE differs, which leads to different confidence intervals
Standardised Mean Difference
- The estimate for equal variance assumed will differ from the estimate for unequal variance assumed
- The SE differs, which leads to different confidence intervals
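Both versions as they might be run in R with the hypothetical `dat` (t.test defaults to the Welch, unequal-variance version):

```r
t.test(score ~ group, data = dat, var.equal = TRUE)  # "equal variance assumed" (pooled SE)
t.test(score ~ group, data = dat)                    # "unequal variance assumed" (Welch SE)
```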
What is the basis of analysis when we calculate in differences between 2 dependent groups
Difference Scores
- Difference between the pair of scores for each individual
- The SE in the dependent-samples t-test accounts for the correlation between scores in the two groups
What happens if the population mean difference is 0/non-0
- If the population mean difference = 0, there is no difference between the 2 dependent groups on the construct
- If the population mean difference ≠ 0, there is a difference between the 2 dependent groups on the construct
- Note: the mean difference here refers to the mean of the difference scores
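A sketch in R, assuming a hypothetical data frame `dat2` with columns `time1` and `time2` for each individual: the paired test is literally a one-sample test on the difference scores.

```r
t.test(dat2$time1, dat2$time2, paired = TRUE)  # CI and test based on difference scores
t.test(dat2$time1 - dat2$time2)                # equivalent one-sample form
```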
When investigating differences between dependent groups, what are the assumptions
- Observations are independent.
- Observed scores (i.e., the difference scores) on the construct measure are normally distributed.
[The homogeneity of variance assumption is not relevant because the analysis is undertaken on the difference scores] [No Linearity]