Statistics Flashcards
What is a single sample t-test used for?
The One Sample t-Test determines whether the sample mean is statistically different from a known or hypothesized population mean.
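As a rough sketch of the statistic behind this test, the snippet below computes t = (sample mean - hypothesized mean) / (s / sqrt(n)) using only the Python standard library. The function name `one_sample_t` and the scores are made up for illustration; they are not from the cards.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, df) for a one-sample t-test against mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)          # estimated standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical data: is the population mean different from 100?
scores = [102, 98, 110, 105, 95, 108, 101, 99, 107, 104]
t_obs, df = one_sample_t(scores, mu0=100)
print(round(t_obs, 3), df)
```

The resulting statobs would then be compared with critical values (or converted to pobs) using a t-distribution with n - 1 df.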
Define Pobs
Pobs is the probability of obtaining a statistic as extreme as, or more extreme than, Statobs, given that the null is true (i.e., that there is no difference).
What ingredients do you need to calculate power?
Delta, alpha, df, crit 1, crit 2, phiprime*
What is PhiPrime*?
A specific departure from the null
What is BetaPhiPrime*?
Type 2 error
i.e., Accept H0 when H0 false
Saying that there is no relationship when there is one
1-BetaPhiPrime*
Power
~
coming from
|
given that
nct
non-central t (ie the alternative distribution)
What are the steps in Keisen to find your critical values?
- Probability Function
- Go to Students T-Distribution (percentile)
- Input df and alpha
Cumulative Distribution in the students t-distribution (percentile) refers to what?
Alpha (remember to split your alpha if 2-tailed)
What do you need to calculate your critical values in Keisen?
Alpha and df
For calculating power you go to which distribution in Keisen?
Noncentral t-distribution
To calculate Power in Keisen, you need what 3 things?
- Crit
- df
- Delta
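The same three ingredients can be used outside an online calculator. The sketch below is not the calculator's method: it estimates power by simulation, drawing from a noncentral t built from its definition, T = (Z + delta) / sqrt(chi-square(df) / df), and counting how often T falls beyond the critical values. The function name `simulate_power`, the seed, and the input values are illustrative assumptions.

```python
import math
import random

def simulate_power(crit1, crit2, df, delta, draws=50_000, seed=1):
    """Monte Carlo estimate of P(T < crit1 or T > crit2) for T ~ nct(df, delta)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)                 # standard normal draw
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        t = (z + delta) / math.sqrt(chi2 / df)  # one noncentral t draw
        if t < crit1 or t > crit2:
            rejections += 1
    return rejections / draws

# Hypothetical inputs: df = 9, two-tailed alpha = .05 (crit = -2.262, 2.262), delta = 1.74
power = simulate_power(-2.262, 2.262, df=9, delta=1.74)
print(round(power, 2))
```

Setting delta to 0 should return a value close to alpha, which is a quick sanity check on the critical values.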
H0 is false by 0.55 standard deviations - is another way of saying what?
phiprime star is .55
In the case of ANOVA, how many null states of nature are there?
There are an infinity of null states of nature, and so too, then, null distributions.
What defines extremity in terms of H0?
Critical values
What is a sound procedure?
One where:
- the assumptions are reasonable
- the probability of Type 1 and Type 2 error have been made small.
Probability does not equal chance, rather it equals what?
proportion
What are the 5 steps that should be followed when reporting on analysis?
- Data analysis
- Discussion of error control, leading to a choice of alpha
- Assumption checking
- Presentation of decision rule
- (if focal H0 rejected) estimation of magnitude of effect
Step 1 of 5 when reporting on analysis is ‘data analysis’ - what does this include?
- Measure of centrality (mean and median)
- Dispersion (SD and range inc. max and min)
- Shape (skewness and kurtosis)
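A minimal sketch of the centrality and dispersion summaries listed above, using the standard library. The function name `describe` and the scores are hypothetical; shape statistics are left out of this sketch.

```python
import statistics

def describe(sample):
    """Centrality and dispersion summaries for a list of numbers."""
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "sd": statistics.stdev(sample),   # sample SD (n - 1 in the denominator)
        "min": min(sample),
        "max": max(sample),
        "range": max(sample) - min(sample),
    }

summary = describe([4, 7, 7, 8, 10, 12])  # hypothetical scores
print(summary)
```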
In the case that a is set to .01 and Pobs is .007, what would the presentation of a decision apropos focal hypothesis pair look like?
It must be a clear formal presentation in which a is noted, and the action to be taken in respect to H0 is stated.
e.g., a = .01 > Pobs = .007
Therefore:
reject H0: mu1 = mu2
When depicting probabilities as areas under a curve, what must you do?
You must label the curve in question; else the area shown has no meaning
When checking on the assumptions of normality, one draws a single conclusion on the basis of information available from what 3 things?
- CIs for skewness
- CIs for kurtosis
- QQ plot
How would you state that the data are normally distributed?
“Based on the evidence available from the CIs and QQ plot, the assumption of normality appears to be reasonable”
Can sample QQ plots and CIs definitively confirm/disconfirm the assumption of normality?
No. They allow one merely to draw a conclusion as to whether or not the sample is in keeping with a normally distributed population.
Where is the peak of a negatively skewed distribution?
To the right of the peak in a normal distribution
Where is the peak of a positively skewed distribution?
To the left of the peak in a normal distribution. i.e., close to the Y-axis
What is a leptokurtic distribution?
One with a pointy top
What is a mesokurtic distribution?
A normal distribution
What is a platykurtic distribution?
A flatter distribution
What is the skew of an F distribution?
An F distribution is always positively skewed
Why do we do a linear transformation?
- For convenience
- To fit with convention
What does .82 power mean?
That there is an 82% chance of detecting H1 if that is the state of nature
What are the two types of decision rules?
- point-point (statobs, critical value)
- probability-probability (pobs, alpha)
Which probability rule is easier to use in practice?
Probability-probability
Describe the non-directional point-point decision rule.
If crit1 < statobs < crit2, retain H0; else, reject H0
What is the aim of an hypothesis test?
To make a correct call regarding which of H0 and H1 is true. That is to say, to make a correct binary decision: H0 is true; or H0 is false
What is a sound inferential testing procedure?
One which yields correct decisions with high probability. Practically, this requires that: 1) the assumptions that underpin the procedure are reasonable; 2) the probability of making a type 1 error has been made small; 3) the probability of making a type 2 error has been made small
Of what is type 1 error control comprised?
1. the choice of critical values; 2. the formulation of a decision rule
What is the probability-probability decision rule?
If pobs ≤ alpha, reject H0; else, retain H0
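The two kinds of decision rule can be written as small functions. This is only a sketch: the function names are made up, and the point-point version assumes the usual non-directional form in which H0 is retained only when statobs lies strictly between the two critical values.

```python
def point_point_rule(stat_obs, crit1, crit2):
    """Non-directional point-point rule: compare statobs with the critical values."""
    # Assumed form: retain H0 only when crit1 < statobs < crit2.
    return "retain H0" if crit1 < stat_obs < crit2 else "reject H0"

def probability_rule(p_obs, alpha):
    """Probability-probability rule: reject H0 when pobs <= alpha."""
    return "reject H0" if p_obs <= alpha else "retain H0"

# Hypothetical values for illustration
print(point_point_rule(2.9, -2.262, 2.262))  # reject H0
print(probability_rule(0.007, 0.01))         # reject H0
```

Both rules lead to the same decision when the critical values and alpha refer to the same null distribution; the probability-probability form only needs one comparison.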
What is Pobs?
Pobs is simply a nonlinear transformation (onto: [0,1]) of statobs;
it quantifies inconsonancy of the result statobs with H0… the smaller the value of pobs, the less in keeping is the result statobs with H0… thus, the smaller the value of pobs, the greater is the evidence that H0 is not the extant state of nature.
inconsonancy
lacking harmony/compatibility with
What is Type 1 error?
Rejecting H0, under the condition that H0 is true
What is Type 2 error?
Retaining H0, under the condition that H0 is false
What is the boundary null distribution?
The boundary null distribution arises in the quantitative scenario wherein we are interested in making a binary decision regarding a directional hypothesis pair.
In the case:
H0: μ_k ≤ μ_GVA vs H1: μ_k > μ_GVA
Notice that there are an infinite number of possible null distributions (an infinite number of ways that μ_k < μ_GVA), but there exists only one way in which μ_k = μ_GVA. The latter is called the boundary null distribution. As such, to construct a decision rule we set α with respect to the boundary null distribution.
What is an assumption?
An assumption is any potentially false claim about the population, save for the null and alternative hypotheses. Assumptions are side conditions asserted to derive a particular model with which to test the binary hypothesis pair. There are no parametric assumptions made for this hypothesis test of statistical independence.
When labeling the alternative distribution, what components should you include?
name of curve, n-df, phiprime*
e.g.
nct (9, .55)
What is power game 2?
Wherein you error balance Type I and Type II error a priori for a number of relevant effect sizes (small, medium, large: usually guided by Cohen).
What kind of variables must the IV and DV be in order for us to run an ANOVA?
IV - must be nominal
DV - Quasi-continuous
What is the conditional mean?
The mean of the DV contingent upon the IV
What is an F ratio?
Variance explained by the IV : variance not explained by the IV
What is ANOVA technology used for?
first sentence in answering the question about how ANOVA technology can be employed
ANOVA provides us with the ingredients necessary to make inferences about relationship with a quasi-continuous DV and a nominal IV
What do we mean by relationship?
second sentence in answering the question about how ANOVA technology can be employed
Relationship is any case where X and Y are functionally related in which changes in X(IV) affect Y(DV). There are two components to relationship: type and strength. Strength is dependent on type.
What does the symbol for the conditional mean function look like?
E(Y|X)
What does the type of the relationship in ANOVA refer to?
third sentence in answering the question about how ANOVA technology can be employed
The type of relationship refers to the shape of the regression/conditional mean function: E(Y|X).
The type of relationship refers to the shape of the regression/conditional mean function: E(Y|X). In a relationship with a quasi-continuous DV and a nominal IV, there are two possible types - what are they?
Flatline
Non-flatline
A flatline is a type of relationship. Describe it.
third sentence, point a, in answering the question about how ANOVA technology can be employed
A flatline is where all conditional means are equal to the grand mean. This means that there is no relationship between X and Y.
Thus, it is true that:
mu1 = mu2 = … = muj and a1 = a2 = … = aj = 0
and sigma_a_sq = 0
Thus, H0 true
What is alpha_j?
Each conditional mean minus the grand mean
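A small sketch of the effects alpha_j, computed as each conditional mean minus the grand mean. The function name `effects` and the data are hypothetical, and equal group sizes are assumed so that the grand mean is simply the mean of the conditional means.

```python
import statistics

def effects(groups):
    """Return alpha_j = (conditional mean - grand mean) for each level of the IV."""
    cond_means = {level: statistics.mean(ys) for level, ys in groups.items()}
    # Grand mean taken as the mean of the conditional means (equal group sizes assumed).
    grand = statistics.mean(list(cond_means.values()))
    return {level: m - grand for level, m in cond_means.items()}

# Hypothetical DV scores at three levels of a nominal IV
data = {"a": [4, 6, 5], "b": [7, 9, 8], "c": [10, 12, 11]}
alphas = effects(data)
print(alphas)
```

In a flatline every alpha_j would be 0; here the conditional means differ from the grand mean, so the effects are non-zero and sum to 0.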
A non-flatline is a type of relationship. Describe it.
third sentence, point b, in answering the question about how ANOVA technology can be employed
Non-flatline: at least two muj’s are off the grand mean, showing that there is a relationship between X and Y. Thus, H1 true.
There are two components of a relationship. Which must be addressed first and why?
The type of the relationship determines the strength, and must be determined first.
What is ω2?
ω2 is a measure of effect - it tells you how much of the variability in the DV is accounted for by the IV
What is the formula for ω2subA?
σ2A/(σ2A+σ2)
ie
variability due to factor/total variance in DV
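The ratio can be sketched in a couple of lines, with the factor variance divided by the sum of the factor variance and the unexplained variance. The function name and the input values are illustrative only.

```python
def omega_squared(var_factor, var_error):
    """Population omega^2_A = sigma^2_A / (sigma^2_A + sigma^2)."""
    total = var_factor + var_error   # total variance in the DV
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_factor / total

# Hypothetical variances: 2 due to the factor, 8 unexplained
print(omega_squared(2.0, 8.0))  # 0.2
```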
σ2A
Variability of the DV associated with the factor
σ2
All variance within the DV not explained by the factor
j
levels of the IV
What are Cohen’s values for ANOVA power?
- .1 (small)
- .25 (medium)
- .4 (large)
What is the formula for SSt?
SSt = SSa + SSu/a
What is SSt?
Total variance
What is SSa?
Variance due to factor A
What is SSu/a
Within variance (or variance not explained by the factor)
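To see the partition SSt = SSa + SSu/a on actual numbers, the sketch below computes the three sums of squares for a one-way layout and, as an extra step assumed from standard ANOVA practice, the F ratio as the ratio of the two mean squares. The function name `anova_table` and the scores are hypothetical.

```python
def anova_table(groups):
    """One-way ANOVA sums of squares and F ratio for a list of groups."""
    all_scores = [y for g in groups for y in g]
    n_total, j = len(all_scores), len(groups)
    grand = sum(all_scores) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups (factor) sum of squares
    ss_a = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    # Total sum of squares
    ss_t = sum((y - grand) ** 2 for y in all_scores)
    ms_a = ss_a / (j - 1)                    # mean square for the factor
    ms_within = ss_within / (n_total - j)    # mean square within groups
    return {"SSa": ss_a, "SSu/a": ss_within, "SSt": ss_t, "F": ms_a / ms_within}

# Hypothetical DV scores at three levels of the IV
table = anova_table([[4, 6, 5], [7, 9, 8], [10, 12, 11]])
print(table)
```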
What is the difference between σ2A and SSa?
σ2A = variance explained by the factor at the population level
SSa = variance explained by the factor at the sample level
What is the difference between σ2 and SSu/a?
σ2 = within group variance at the population level
SSu/a = within group variance at the sample level
What does the E stand for in E(MS)?
Average of the…
What are mean squares?
Mean squares are estimates of variance across groups
What is the difference between Pobs and Statobs?
Pobs is a nonlinear transformation of Statobs.
Pobs represents the area under the curve (in the direction of extremity) for Statobs
Statobs is the percentile point
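The nonlinear mapping from statobs to pobs can be illustrated with a stand-in null distribution. The sketch below assumes a standard normal null and a two-tailed test purely for convenience (the cards themselves work with t and F distributions); the function name is made up.

```python
from statistics import NormalDist

def two_tailed_p(stat_obs):
    """Two-tailed pobs for statobs under a standard normal null (stand-in assumption)."""
    upper_tail = 1.0 - NormalDist().cdf(abs(stat_obs))  # area beyond |statobs|
    return 2.0 * upper_tail

# Equal steps in statobs do not give equal steps in pobs
for stat in (0.5, 1.0, 1.96, 3.0):
    print(stat, round(two_tailed_p(stat), 4))
```

As statobs moves further from 0, pobs shrinks toward 0, but not in proportion to the change in statobs.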
What is the skewness and kurtosis of a normal distribution?
0
What does the skewness and kurtosis interval (lower bound - upper bound) need to include so that the assumption of normality is supported?
0
How do you calculate the upper bound estimate for skewness and kurtosis?
Sample skew/kurtosis value + 1.96 × SE
note: SE is the estimate of standard error in the statistic; 1.96 is the z critical value for a 95% interval
How do you calculate the lower bound estimate for skewness and kurtosis?
Sample skew/kurtosis value - 1.96 × SE
note: SE is the estimate of standard error in the statistic; 1.96 is the z critical value for a 95% interval
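A hedged sketch of the interval check, assuming the bounds are formed as the sample value plus or minus 1.96 standard errors. The standard errors used here, sqrt(6/n) for skewness and sqrt(24/n) for kurtosis, are common large-sample approximations and may differ from what a given statistics package reports; the function names and data are hypothetical.

```python
import math

def shape_intervals(sample, z=1.96):
    """Approximate 95% intervals for sample skewness and excess kurtosis."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0        # excess kurtosis: 0 for a normal distribution
    se_skew = math.sqrt(6.0 / n)     # large-sample approximation (assumption)
    se_kurt = math.sqrt(24.0 / n)    # large-sample approximation (assumption)
    return {
        "skewness": (skew - z * se_skew, skew + z * se_skew),
        "kurtosis": (kurt - z * se_kurt, kurt + z * se_kurt),
    }

def includes_zero(interval):
    """True when the interval from lower bound to upper bound contains 0."""
    lower, upper = interval
    return lower <= 0.0 <= upper

intervals = shape_intervals([2, 4, 4, 5, 5, 5, 6, 6, 8])  # hypothetical sample
print({name: includes_zero(ci) for name, ci in intervals.items()})
```

If both intervals include 0, the sample is in keeping with a normally distributed population on these two criteria; the QQ plot still needs to be inspected separately.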
Bivariate
Involving 2 variables
Dichotomous
two opposing things