Stats Final Flashcards

1
Q

Alpha

A
  • the significance level against which the p value is compared
  • the probability of making a Type I error
  • the probability of rejecting the null when it is true
  • considered to be more serious than a Type II error
    • this is because we are saying something is there when it is not
  • the probability is never zero; a smaller p value does not say one result is better than another, it just means there is less risk of error
2
Q

Beta

A
  • the probability of making a type II error
  • accepting the null when it is false
  • failing to reject the null when it is false
  • saying there is not a difference, when there is.
3
Q

Alternative Hypothesis

A
  • Also known as the experimental hypothesis
  • denoted H1; the statement that a statistical hypothesis test is set up to establish
  • for example, that a new drug is better, on average, than the current drug
  • states where a difference exists
  • may have a direction (one-tailed), whereas the null does not
4
Q

Bonferroni Correction

A
  • A correction applied to the alpha level to control the Type I error rate when multiple significance tests are carried out.
  • corrected alpha = overall alpha / number of tests
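A minimal sketch of the correction (the function name is my own):

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the familywise
    Type I error rate at or below the overall alpha."""
    return alpha / n_tests

# e.g. 10 correlations tested with an overall alpha of .05
print(bonferroni_alpha(0.05, 10))
```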
5
Q

What is the central limit theorem?

A
  • The distribution of the sum or average of a large number of independently distributed variables will be approximately normal regardless of the underlying distribution.
  • The sampling distribution of the mean will be normal.
  • if we have a large enough sample size, the sample means will follow a normal distribution regardless of the population we sampled from
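An illustrative simulation sketch of this idea (numbers are my own): means of samples drawn from a strongly skewed exponential population still center on the population mean.

```python
import random
import statistics

random.seed(0)

# Exponential(lambda=1) is strongly skewed; its population mean is 1.0.
# Draw 2000 samples of size 50 and record each sample's mean.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

# The distribution of the 2000 sample means is approximately normal
# and centered on the population mean, even though the population is skewed.
print(round(statistics.mean(sample_means), 2))
```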
6
Q

Cohen’s d

A
  • Mean difference/pooled standard deviation
  • Interpreting effect sizes:
  • d = .2 = small effect
  • d = .5 = medium effect
  • d = .8 = large effect
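A sketch of the formula (this uses the pooled SD weighted by degrees of freedom, one common variant; the function name is my own):

```python
import math
import statistics

def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its df (n - 1).
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(pooled_var)
```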
7
Q

Degrees of freedom

A
  • Related to the number of values free to vary when computing a statistic
    • The number of pieces of information that can vary independently of one another
  • The minimum amount of data needed to calculate a statistic
  • A number or numbers used to approximate the number of observations in the data set for the purposes of determining statistical significance
    • In some cases one less than the number of observations

The df is necessary to interpret a chi-square statistic, an F ratio, or a t value

8
Q

What is deviation?

A
  • Difference between the observed value of the variable and the variable predicted by a statistical model.
9
Q

Effect size

A
  • An objective and standardized measure of the magnitude of an observed effect
  • Measures include Cohen’s d, Glass’s g, and Pearson’s correlation coefficient r
  • A quantitative measure of the strength of an effect
  • Reported as small, medium, and large
10
Q

What is a confidence interval?

A
  • Provides another way to assess how meaningful an observed effect is
  • We usually use a 95% CI, which corresponds to an alpha level of .05.
11
Q

experimentwise error rate

A
  • Probability of making a Type I error in an experiment involving one or more statistical comparisons where the null hypothesis is true in each case
  • Increases when there are more tests
12
Q

What is a parameter?

A
  • Variables are measured constructs that vary across entities in the sample; parameters describe the relations between those variables in the population.
  • A value that describes the population, which we estimate from the sample.
13
Q

What is the null?

A
  • Hypothesis stating that there is no difference.
14
Q

What is a population

A
  • The collection of units to which we want to generalize findings or a statistical model.
15
Q

What is power?

A
  • 1 - Beta
  • the probability of correctly rejecting a false null hypothesis.
  • the probability of correctly finding an effect when an effect really exists.
16
Q

What is a sample?

A
  • a smaller, but hopefully representative collection of units from a population used to determine truths about that given population.
17
Q

What is a sampling distribution?

A
  • The theoretical distribution of a sample statistic
  • the distribution we would get if the experiment were repeated an infinite number of times
  • closely tied to the central limit theorem
  • the distribution of the statistics derived from an infinite number of samples
18
Q

What is sample variation?

A
  • The extent to which a statistic varies across samples taken from a population because the members of each sample differ
19
Q

What is standard error?

A
  • the standard deviation of the sampling distribution of a statistic
    • tells us, on average, how far a sample statistic falls from the corresponding population value
20
Q

What is the standard error of the mean?

A
  • Tells us how confident we should be that the sample mean represents the population mean.
  • Average difference between sample mean and population mean.
  • how much error we can expect when we compare our mean to the population mean
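A minimal sketch of the usual formula SE = SD / sqrt(n), using only the standard library (the function name is my own):

```python
import math
import statistics

def sem(sample):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

scores = [2, 4, 6, 8]
print(round(sem(scores), 3))
```

This formula also answers cards 33 and 34: the standard error grows with the sample SD and shrinks as the sample size grows.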
21
Q

What is a test statistic?

A
  • how you get the F value in an ANOVA
  • equals signal/noise.
  • has a known frequency (probability) distribution
  • used to test hypotheses
22
Q

What is Type I error?

A
  • Rejecting the null hypothesis when the null is true
  • Saying there’s a group difference when there is not
  • this is inversely related to Type II error.
23
Q

What is type II error?

A
  • Failing to reject the null hypothesis when the null is false.
  • frequently due to sample size being too small
  • Saying there’s no group difference when there is
24
Q

Be familiar with the concept of a test statistic as a ratio of signal (effect) to noise (error)

A
  • Signal is the effect which is also the variance explained by the model.
  • Noise is the error, which is the variance not explained by the model.
  • The larger the error (noise), the smaller the test statistic, and vice versa.
25
Q

What is a one-tailed test of significance?

A
  • tests a directional hypothesis
  • the critical region lies entirely in one tail of the probability distribution.
26
Q

What is a two-tailed test?

A
  • tests a non-directional hypothesis
  • the critical region is split between both tails
  • this is typically what we use
27
Q

What are the three tools (or indices) that provide ways to assess how meaningful the results of statistical analysis are?

A
  • Statistical Significance
  • Confidence Intervals
  • Magnitude of the effect
28
Q

Statistical Significance

A
  • provides a way to assess how meaningful the results of statistical analyses are
  • (p value)
  • affected by sample size
  • the probability of obtaining the observed result, or one more extreme, if the null hypothesis is true
29
Q

confidence intervals

A
  • provides a way to assess how meaningful the results of statistical analyses are
  • tells us how well sample statistics generalize to the larger population
30
Q

magnitude of the effect

A
  • provides a way to assess how meaningful the results of statistical analyses are
  • the size of the effect, which, unlike statistical significance, is not influenced by sample size
31
Q

What is the relationship between p-value and alpha and Type I error?

A
  • The p-value is compared against alpha: if p < alpha, we reject the null.
  • Alpha is the probability of making a Type I error.
32
Q

Be able to describe and illustrate the method of least squares using the mean as the statistical model.

A
  • A method for estimating parameters that is based on minimizing the sum of squared errors.
  • Used to determine the line of best fit in regression
33
Q

How does the standard deviation of the sample affect the standard error of the mean?

A
  • The larger the standard deviation, the greater the assumed variation of scores in the population, and therefore the larger the standard error of the mean.
  • the more scores deviate from the mean, the more error
34
Q

How does the size of the sample affect the standard error of the mean?

A
  • The bigger the sample size, the smaller the standard error of the mean.
35
Q

Correlation coefficient

A

Is an effect size

No DV or IV distinction

The strength of association or relationship between x and y

36
Q

Covariance

A
  • When changes in one variable are met with similar changes in the other variable
  • When one deviates from its mean we expect the other to deviate from its mean in the same way
  • Formula: sum of cross-product deviations / (N - 1)
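A minimal sketch of that formula (the function name is my own):

```python
import statistics

def covariance(x, y):
    """Sum of cross-product deviations divided by N - 1."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Cross-product deviation for each pair: (x - mean of x) * (y - mean of y)
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (len(x) - 1)

print(covariance([1, 2, 3, 4], [2, 4, 6, 8]))
```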
37
Q

Cross product deviations

A
  • a measure of the total relationship between two variables
  • multiply the deviations of one variable by the corresponding deviations of the second variable
38
Q

What is a dichotomous variable?

A
  • nominal variable with two categories.
39
Q

truncated range

A
  • truncated range = restricted variance
  • when the scores on one or both of the variables in the analysis do not have much range in the distribution
    • restricted range in one or both variables may attenuate the correlation coefficient
  • lack of variability
  • this may weaken the correlation coefficient
40
Q

two fundamental characteristics of correlation coefficients

A
  1. direction (sign +,-)
  2. strength (magnitude, numerical value)
41
Q

What is a perfect positive correlation? What is a perfect negative correlation?

A
  • 1.00 would be the perfect positive correlation
    • Every case falls exactly on the regression line.
    • High score on one variable is related to high score on another variable
  • -1.00 would be the perfect negative correlation
    • High score on one variable is related to a lower score on the other variable
    • Inversely related
42
Q

Pearson’s product-moment correlation

A
  • Two continuous variables
  • interval, ratio
  • Standardized measurement of covariance
  • example:
    • trying to correlate GPA and SAT scores
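A sketch of the computation as a standardized covariance (function name and numbers are my own):

```python
import statistics

def pearson_r(x, y):
    """Covariance of x and y standardized by the product of their SDs."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Two made-up continuous variables (e.g., GPA-like and SAT-like scores)
print(round(pearson_r([1, 2, 3, 4], [2, 3, 5, 6]), 3))
```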
43
Q

Phi Coefficient

A
  • two dichotomous variables
44
Q

Spearman’s rho

A
  • two ordinal variables
45
Q

Point Biserial

A
  • one truly discrete dichotomous variable and one continuous variable
    • Discrete: pregnant vs not pregnant
46
Q

Biserial

A
  • one continuous dichotomous variable (an underlying continuum split into two categories) and one continuous variable
    • Continuous dichotomous: pass/fail
47
Q

When the correlation coefficient is zero, what is the relationship between the X and Y variable?

A
  • If the correlation coefficient is zero, there is no linear relationship between X and Y.
  • the regression line is horizontal: for any x, the predicted y equals the constant
48
Q

Be able to interpret a 95% confidence interval around a sample correlation coefficient.

A
  • If the sampling were repeated 100 times, about 95 of the resulting intervals would contain the population correlation coefficient
49
Q

If given a 95% confidence interval around a sample correlation coefficient, be able to test the null hypothesis that no relationship exists between the two variables (X & Y).

A
  • The null hypothesis is that there are no differences
  • “we can test the hypothesis that the correlation is different from zero (i.e., different from ‘no relationship’)” (Field, p. 268)
50
Q

Issues that may attenuate the correlation coefficient

A
  1. Outliers
  2. Curvilinear
  3. Truncated Range
51
Q

Regression Coefficient

A

Indicates the effect of the IV on the DV

For each unit increase in the IV there is an expected change equal to the size of b in the DV

b = (unstandardized) the slope of the regression line

B (beta) [standardized]

52
Q

F ratio

A
  • MSreg/MSres or MSbetween/MSwithin
  • known probability distribution
  • variance between groups/variances within groups
  • It is the ratio of the average variability in the data that a given model can explain to the average variability unexplained by that same model. It is used to test the overall fit of the model in simple regression and multiple regression, and to test for overall differences between group means in experiments
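A sketch of the between/within computation for a one-way ANOVA (the function name is my own); it also yields eta-squared, the SSbetween/SStotal ratio from the eta-squared card:

```python
import statistics

def one_way_f(groups):
    """F = MS_between / MS_within for a list of groups of scores.
    Also returns eta-squared = SS_between / SS_total."""
    all_scores = [s for g in groups for s in g]
    grand_mean = statistics.mean(all_scores)
    # Between-groups SS: each group's mean distance from the grand mean, weighted by n
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: each score's distance from its own group mean
    ss_within = sum((s - statistics.mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f, eta_squared
```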
53
Q

Outcome variable

A
  • A variable whose values we are trying to predict from one or more predictor variables
  • DV in regression
54
Q

Predictor Variable

A
  • A variable that is used to try to predict values of another variable known as an outcome variable
  • IV in regression
55
Q

Predicted Value

A
  • The value of an outcome variable based on specific values of the predictor variable or variables being placed into statistical model
  • Y’
  • All of the predicted values make up the predicted line
56
Q

Residual

A
  • error
  • The difference between the value a model predicts and the value observed in the data on which the model is based.
57
Q

Simple Regression

A
  • Also known as bivariate linear regression
  • A linear model in which one outcome variable is predicted from a single predictor variable.
  • The model takes the form Y = b0 + b1X + ε,
    in which Y is the outcome variable, X is the predictor variable, b1 is the regression coefficient associated with the predictor, and b0 is the value of the outcome when the predictor is zero
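A minimal least-squares sketch for the simple case (function name and data are my own): the slope is the covariance of X and Y over the variance of X, and the intercept follows from the means.

```python
import statistics

def fit_line(x, y):
    """Least-squares estimates: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    b1 = cov / statistics.variance(x)
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Data generated from Y = 1 + 2X exactly, so the fit recovers those values
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```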
58
Q

Sum of Squares

A
  • SStotal = SSreg + SSresidual
  • SSreg/SStotal = r²
  • SSres/SStotal = error variance
59
Q

T-statistic

A
  • T is a test statistic with a known probability distribution (the t distribution)
  • In the context of regression it is used to test whether a regression coefficient b is significantly different from zero; in the context of experimental work it is used to test whether the difference between two means is significantly different from zero. See also paired-samples t test and independent t test
60
Q

If the regression coefficient is zero, what does the resulting regression line look like?

A
  • If the slope is zero, the line is horizontal at a (the mean of Y), and you have to question whether X adds anything to the prediction
  • When X and Y do not covary, b = 0
61
Q

Interpret the regression coefficient in terms of the relationship between X and Y

A
  • for every change in x there is a change in y; you need to know b, the slope
  • For every 1-unit change in x there is a b-unit change in y
62
Q

In simple linear regression, how are the bivariate correlation coefficient (r) and the regression coefficient (b) similar? How are they different?

A
  • b is the slope
    • the IV on the DV, or the slope of the regression line
    • If b = slope, then if b = 0 = horizontal regression line
    • The constant or y-intercept (a) is the point in which the regression line crosses the Y-axis
    • This just tells us the direction of the relationship
  • r is the correlation
    • measure of effect size
    • the strength of the relationship
  • they are both showing a relationship between x and y
63
Q

What do we get when we partition the sum of squares for the dependent variable

A
  • SStotal = SSreg + SSresidual
  • Shared variance and the error
64
Q

95% confidence interval around a sample regression coefficient

A
  • The sample regression coefficient is statistically significant if the confidence interval does not contain zero
65
Q

Categorical Variable

A
  • nominal variable with multiple groups
    • gender as an example
66
Q

continuous variable

A
  • interval scales
    • Interval scales provide information about order, and also possess equal intervals
      • example: likert
67
Q

Dependent Variable

A
  • Another name for outcome variable
  • This is the name usually associated with experimental methodology and is used because it is the variable that is not manipulated by the experimenter, so its value depends on the variables that have been manipulated
68
Q

independent variable

A
  • Another name for predictor variable
  • This name is usually associated with experimental methodology and is used because it is the variable that is manipulated by the experimenter, so its value does not depend on any other variables
69
Q

Matched or Paired Samples

A
  • matched, dependent, or paired samples
  • within group design
  • Those samples in which the same attribute, or variable, is measured twice on each subject, under different circumstances. Commonly called repeated measures
70
Q

Standard error of difference between means

A
  • measure of variability of differences between sample means
    • over an infinite number of repeated samplings, the standard error of the difference is the SD of the distribution of mean differences
71
Q

One-Sample t- test

A
  • Goal: mean of DV compared to a test value (normative value)
  • H0: sample mean = test value
    • the mean difference is zero
  • Effect Size:
    • A common effect size for a one-sample t test is Cohen’s d
    • Cohen’s d provides estimates of differences in standard deviation units
    • Again, d values of .2, .5, and .8, regardless of sign, are interpreted as small, medium, and large effect sizes respectively
  • DV measured on a continuous scale
72
Q

Paired Samples t test

A
  • Goal: compare group means or two means or averages
    • Pairwise comparison
    • This is a within group design
    • Thus, we want to know if there is a significant change in scores from time 1 to time 2
  • The value of interest is the change score (posttest – pretest)
  • Continuous scales
  • H0: MD = 0
  • H0: MT1 = MT2
  • Effect Size:
  • A common effect size for a paired-samples t test is Cohen’s d
    • Cohen’s d provides estimates of differences in standard deviation units
    • Again, d values of .2, .5, and .8, regardless of sign, are interpreted as small, medium, and large effect sizes respectively
73
Q

Independent samples t test

A
  • Goal: An independent-samples t test is used when you want to compare the means of two independent samples (i.e., a subject cannot be a member of both sub-samples) on a given variable
    • These tests require one categorical or nominal IV, with two levels or groups, and one continuous DV (i.e., interval or ratio scale)
  • In this type of t test we want to know whether the average scores on the DV differ according to which group one belongs
  • H0: MD = 0
    • The null hypothesis is that the mean difference (MD) between the two group means equals zero
  • H0: MG1 = MG2
    • The null hypothesis would also be that there is no difference between the mean time spent talking of the low stress condition when compared to the mean time spent talking of the high stress condition
  • Effect Size:
    • Again, a common effect size for an independent-samples t test is Cohen’s d
    • Cohen’s d provides estimates of differences in standard deviation units
    • Again, d values of .2, .5, and .8, regardless of sign, are interpreted as small, medium, and large effect sizes respectively
  • Levene’s Test for Equality of Variance
    • The null hypothesis for Levene’s test is that the variances of the two groups are equal
    • When Levene’s test is not significant, then we may assume that we have equal variances
    • When Levene’s test is significant, then we may not assume that we have equal variances
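A sketch of the pooled-variance t statistic itself (function name is my own; this form assumes equal variances, i.e., a non-significant Levene's test):

```python
import math
import statistics

def independent_t(group1, group2):
    """Pooled-variance independent-samples t: mean difference (signal)
    divided by the standard error of the difference (noise)."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its df (n - 1).
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / se_diff
```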
74
Q

T Test Assumptions

A
  • Normality
  • Independence Assumption
  • Homogeneity of Error Variance
    • Only pertains to Independent Samples t-test and ANOVA
75
Q

Normality Assumption

A
  • One sample
    • DV is normally distributed
    • Larger sample size makes this more likely
    • Generally robust (not sensitive) to violations of this assumption
  • Paired sample
    • difference scores are normally distributed
    • moderate and large sample sizes may yield reasonably accurate p values when this is violated
    • power of this test is reduced considerably if the distribution is severely non-normal
  • Independent sample
    • DV sample is normally distributed in each of the 2 groups
76
Q

Independence Assumption

A
  • Random sample from the population
  • Independent from other subjects, not influenced by other subjects
77
Q

Homogeneity of Error Variance

A
  • Only pertains to Independent Samples t-test
  • Assume that error variance in each group is roughly equal
  • If this is violated, we could end up with a large variance and a large error
  • We use SPSS to compute an approximate (corrected) test when this is violated
    • Levene’s Test
78
Q

What is the null hypothesis for Levene’s test of equality of (error) variances?

A
  • The variances of the two groups are equal
    • When Levene’s test is not significant, then we may assume that we have equal variances
    • When Levene’s test is significant, then we may NOT assume that we have equal variances.
79
Q

95% confidence interval around two group means for an independent-samples t test

A
  • This confidence interval is around the mean difference
  • The confidence interval values indicate that the true population mean difference lies between 3.61 and 42.65 with a 95% probability
  • Notice the hypothesized mean difference of zero does not fall within this range
    • That is why we found the mean difference to be statistically significantly different from zero
80
Q

Eta-squared

A
  • An effect size measure that is the ratio of the model sum of squares to the total sum of squares.
  • The coefficient of determination
    • SSbetween/SStotal
    • % of variance in DV accounted for by IV
81
Q

Brown-Forsythe F

A
  • used when the variances across groups are not equal
  • part of ANOVA
82
Q

Grand variance

A
  • The variance within an entire set of observations
83
Q

Orthogonal

A
  • means perpendicular (at right angles) to something
  • it tends to be equated with independence in statistics because of the connotation that perpendicular linear models in geometric space are completely independent (one is not influenced by the other)
  • When performing statistical analysis, independent variables that affect a particular dependent variable are said to be orthogonal if they are uncorrelated
84
Q

Pairwise comparisons

A
  • comparisons between pairs of levels of the IV
  • k(k-1)/2
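A one-line sketch of that count (function name is my own):

```python
def n_pairwise(k):
    """Number of pairwise comparisons among k levels of the IV: k(k-1)/2."""
    return k * (k - 1) // 2

# e.g. a 4-group ANOVA allows 6 pairwise comparisons
print(n_pairwise(4))
```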
85
Q

One-Way Analysis of Variance (ANOVA)

A
  • Tests the significance of the difference between two or more group means
  • Allows us to test more than one pairwise comparison in one analysis
  • 1 IV that is nominal with 2 (or more) categories/groups/levels
  • 1 DV that is measured on a continuous scale
86
Q

Grand mean

A
  • the mean of all means
87
Q

How do you determine the number of pairwise comparisons?

A
  • k(k-1)/2
88
Q

What is the key difference between a priori tests and post hoc tests? In other words, why would one approach be chosen over the other?

A
  • A priori
    • Planned in advance
    • fewer comparisons
    • theory driven
  • Post hoc
    • Data driven
    • exploratory
    • after the omnibus F-test
89
Q

Overall test of significance of the F ratio

A
  • Omnibus F test
90
Q

Difference between simple linear regression and multiple linear regression

A
  • Simple
    • One predictor
  • Multiple
    • More than one predictor