Statistics Flashcards

1
Q

measures of central tendency

A

a single number that describes the general location of a set of scores

2
Q

mean

A

a measure of central tendency; the average; add up all scores and divide by the number of scores; a "weighted" mean weights each score by its frequency; the sum of all the distances (or deviations) of all scores from the mean is zero
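A quick sketch in plain Python (the scores are made up for illustration) showing both the computation and the zero-sum property of deviations:

```python
# Hypothetical scores, for illustration only.
scores = [4, 8, 6, 5, 7]

mean = sum(scores) / len(scores)          # add up all scores, divide by their count
deviations = [x - mean for x in scores]   # distance of each score from the mean

print(mean)             # 6.0
print(sum(deviations))  # 0.0 -- deviations from the mean always sum to zero
```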

3
Q

median

A

a measure of central tendency; the score corresponding to the 50th percentile; the "middle number"; when the distribution of scores is symmetric, the mean and median are equal; in a positively skewed distribution the mean is larger than the median (and vice versa)
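A small sketch using Python's statistics module (the skewed scores are made up) showing the mean pulled above the median in a positively skewed set:

```python
import statistics

# Made-up, positively skewed scores (one long right tail).
skewed = [1, 2, 2, 3, 3, 4, 20]

print(statistics.median(skewed))  # 3 -- the middle (50th percentile) score
print(statistics.mean(skewed))    # 5 -- pulled above the median by the skew
```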

4
Q

mode

A

simplest measure of central tendency; score that occurs most often

5
Q

frequency distribution

A

think how "frequently" data is "distributed" - a way of organizing data so it makes better sense; the number of observations within a given interval; shows the "frequency" of occurrence of each possible outcome of a repeatable event observed many times; for ex. - election results, test scores listed by percentile; can be shown in table form or graphed as a histogram or pie chart
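A minimal sketch with Python's collections.Counter (the letter grades are made up) building a frequency table:

```python
from collections import Counter

# Hypothetical test results, binned by letter grade for illustration.
grades = ["B", "C", "A", "B", "B", "C", "A", "B"]

freq = Counter(grades)  # frequency of each possible outcome
for outcome, count in sorted(freq.items()):
    print(outcome, count)  # A 2 / B 4 / C 2 -- a frequency table
```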

6
Q

scatter plot

A

use when you have 2 paired variables and want to view their relationship and see whether there is a positive or negative correlation; requires (x, y) data (2 variables - independent and dependent); related term - correlation
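A minimal sketch, assuming matplotlib is available (the hours/score pairs are made up):

```python
import matplotlib.pyplot as plt

# Made-up paired (x, y) data: hours studied (IV) vs. exam score (DV).
hours = [1, 2, 3, 4, 5, 6]
score = [55, 60, 62, 70, 74, 80]

plt.scatter(hours, score)          # one point per (x, y) pair
plt.xlabel("Hours studied (IV)")
plt.ylabel("Exam score (DV)")
plt.show()                         # an upward drift suggests a positive correlation
```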

7
Q

summation notation

A

a convenient/simple shorthand used to express the sum of the values of a variable concisely; there are 8 rules
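The core definition the rules build on, written out (X_i is the i-th score, N the number of scores):

```latex
\sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N
```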

8
Q

order of operations

A

PEMDAS (parentheses, exponents, multiplication/division, addition/subtraction)

9
Q

percentile rank

A

a single number that gives the percent of cases in the specific reference group scoring at or below that score; for ex. - the 5th percentile is the score in a distribution whose percentile rank is 5; used, for ex., on statewide tests to categorize test scores; can be pictured in a histogram
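A small sketch, assuming SciPy is available (the score list is made up); kind="weak" counts cases at or below the given score:

```python
from scipy import stats

# Hypothetical statewide test scores, for illustration.
scores = [40, 55, 60, 65, 70, 75, 80, 85, 90, 95]

# Percent of cases scoring at or below 75.
print(stats.percentileofscore(scores, 75, kind="weak"))  # 60.0
```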

10
Q

z-scores/transformation

A

a numerical measurement that describes a value’s relationship to the mean of a group of values; measured in standard deviations from the mean; a z-score of 0 indicates that the data point is identical to the mean; purpose - allows us to calculate the probability of a score occurring within our normal distribution and enables us to compare two scores that come from different normal distributions
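A minimal sketch, assuming SciPy for the normal-distribution lookup (the score, mean, and SD are made up):

```python
from scipy import stats

# Hypothetical: a score of 130 from a distribution with mean 100, SD 15.
x, mu, sigma = 130, 100, 15

z = (x - mu) / sigma       # distance from the mean in SD units
print(z)                   # 2.0

# Probability of scoring at or below x under the normal distribution:
print(stats.norm.cdf(z))   # ~0.977
```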

11
Q

one-sample t-test

A

used to test the statistical difference between a sample mean and a known or hypothesized value of the mean in the population; compute a t-statistic from your sample and compare it to a critical value; used when the population standard deviation is unknown (when it is known, a z-test applies)
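A minimal sketch, assuming SciPy (the sample and the hypothesized mean of 500 are made up):

```python
from scipy import stats

# Made-up sample; H0: the population mean equals 500.
sample = [520, 480, 530, 510, 495, 540, 505, 515]

t_stat, p_value = stats.ttest_1samp(sample, popmean=500)
print(t_stat, p_value)  # reject H0 if p <= .05
```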

12
Q

dependent sample t-test

A

aka paired-sample t-test; used to determine whether the mean difference between two sets of observations is zero; used when you want to find the statistical difference between two time points or conditions; each subject is measured twice, resulting in pairs of observations; involves repeated measures (think "within" subjects); for ex. - pre/post observations, where "d" is the difference between each pair of scores
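A minimal sketch, assuming SciPy (the pre/post scores are made up):

```python
from scipy import stats

# Made-up pre/post scores for the same subjects (repeated measures).
pre  = [10, 12, 9, 11, 13, 10]
post = [12, 14, 9, 13, 15, 11]

t_stat, p_value = stats.ttest_rel(pre, post)  # tests H0: mean difference d = 0
print(t_stat, p_value)
```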

13
Q

independent sample t-test

A

aka two-sample t-test; compares the means of two independent groups to determine whether there is statistical evidence that the associated population means are significantly different; assumes the dependent variable is normally distributed within each population and that the variances of the two populations are approximately equal (homogeneity of variance); the null hypothesis is that there is no difference between the 2 groups; think "between" subjects
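A minimal sketch, assuming SciPy (both groups' scores are made up):

```python
from scipy import stats

# Made-up scores from two independent ("between subjects") groups.
group_a = [84, 88, 91, 79, 85, 90]
group_b = [78, 82, 80, 75, 83, 77]

# equal_var=True reflects the homogeneity-of-variance assumption.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(t_stat, p_value)  # H0: the two population means do not differ
```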

14
Q

correlation

A

a measure of how 2 or more variables are related to one another; always distinguish between correlation (A and B have a relationship) and causation (A led to B); correlations range from -1 to +1 (anything outside that range means an error in your math); positive values reflect a positive correlation (as one score increases, scores on the other also increase); negative values reflect a negative correlation (as one score increases, scores on the other decrease); examples include linear correlation, Pearson’s correlation coefficient (r), correlation from raw scores, and Spearman rank-order correlation; ρ (rho) = population correlation coefficient
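A minimal sketch, assuming SciPy (the paired scores are made up), computing both Pearson's r and Spearman's rank-order correlation:

```python
from scipy import stats

# Made-up paired scores, for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]

r, p = stats.pearsonr(x, y)          # linear correlation from raw scores
rho, p_rank = stats.spearmanr(x, y)  # Spearman rank-order correlation
print(r, rho)                        # both fall between -1 and +1
```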

15
Q

Pearson’s r (correlation) - calculated thru SPSS

A

a measure of the strength of association between 2 variables; the nearer the scatter of points is to a straight line, the stronger the association; for ex. - assessing the relationship between age and blood pressure; the measurement units used do not matter; r = 1 is a perfect positive correlation, r = -1 is a perfect negative correlation

16
Q

null hypothesis significance testing

A

a method of statistical inference in which an observation is tested against a hypothesis of no effect; a test has "statistical significance" when the null is rejected; if the p-value is less than (or equal to) .05, the null hypothesis is rejected in favor of the alternative hypothesis; if the p-value is greater than .05, the null hypothesis is retained (not rejected); in a t-test, if the absolute value of the t-value is greater than the critical value, you reject the null hypothesis; if it is less than the critical value, you fail to reject the null (the null is retained); the null hypothesis (H0) specifies the hypothesized population parameter (for ex. - H0: the mean equals 500); when the null is rejected, we conclude that not all the means are equal - at least one mean differs from the others; the alternative hypothesis (HA) specifies another value or set of values for the population parameter (for ex. - HA: the mean does not equal 500); the p-value (probability value) tells you how likely it is that your data could have occurred under the null hypothesis, and it is compared against a significance level (alpha, normally 0.05); H0 is assumed to be true until the evidence says otherwise; H0/null is a statement of "no difference," "no association," or "no treatment effect"; HA is a statement of "difference," "association," or "treatment effect"
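A minimal sketch of the decision rule, assuming SciPy (the sample and the H0 mean of 500 are made up):

```python
from scipy import stats

# Made-up sample; H0: mean = 500, HA: mean != 500.
sample = [512, 498, 530, 505, 520, 515, 495, 525]

t_stat, p_value = stats.ttest_1samp(sample, popmean=500)

alpha = 0.05
if p_value <= alpha:
    print("Reject H0 in favor of HA")    # statistically significant
else:
    print("Retain (fail to reject) H0")
```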

17
Q

type I error

A

reject a true null hypothesis, aka a “false positive”

18
Q

type II error

A

non-rejection/retention of a false null hypothesis, aka a “false negative” or miss

19
Q

multiple comparisons

A

used if the null hypothesis is rejected; compares all possible pairs of means; for ex. - if the ANOVA test is significant, multiple comparisons would test pairs such as Brand A vs. Brand B, Brand A vs. Brand C, and Brand B vs. Brand C (see the sketch below); related terms include null hypothesis significance testing and ANOVA
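A minimal sketch, assuming SciPy (the three brands' data are made up); in practice these pairwise p-values would also be corrected (see Bonferroni, later in the deck):

```python
from itertools import combinations
from scipy import stats

# Made-up data for three brands, following a significant ANOVA.
groups = {
    "Brand A": [12, 15, 14, 13],
    "Brand B": [18, 20, 19, 21],
    "Brand C": [13, 14, 12, 15],
}

# All possible pairwise comparisons: A-B, A-C, B-C.
for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
    t, p = stats.ttest_ind(g1, g2)
    print(name1, "vs", name2, "p =", round(p, 4))
```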

20
Q

confidence intervals

A

alpha of .01 = 99% confidence (a wider interval); alpha of .05 = 95% confidence; the purpose is to show us the likely range of values of our population mean, give us richer data, and show the degree of certainty or uncertainty in a sampling method
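A minimal sketch of a 95% interval computed by hand, assuming SciPy for the t critical value (the sample is made up):

```python
import statistics
from scipy import stats

# Made-up sample, for illustration.
sample = [98, 102, 101, 97, 103, 100, 99, 104]

n = len(sample)
m = statistics.mean(sample)
sem = statistics.stdev(sample) / n ** 0.5     # standard error of the mean
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 1)  # alpha = .05 -> 95% confidence

print(m - t_crit * sem, m + t_crit * sem)     # likely range of the population mean
```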

21
Q

normal curve

A

a continuous probability distribution that is symmetric about the mean; the population mean (or sample mean) is the center line

22
Q

regression

A

used when you want to predict a continuous dependent variable from a number of IVs; moving from inference to prediction; the most common type is linear regression; the next step after correlation, used when we want to predict the value of one variable based on the value of another; we want to predict the DV; called regression b/c when |r| < 1, the predicted score will typically be closer to the mean than the actual score - the predicted scores are "regressing" (going back) toward the mean
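A minimal sketch, assuming SciPy (the hours/score data are made up), fitting a line and predicting the DV for a new x:

```python
from scipy import stats

# Made-up paired data: predict exam score (DV) from hours studied (IV).
hours = [1, 2, 3, 4, 5, 6]
score = [55, 60, 62, 70, 74, 80]

result = stats.linregress(hours, score)
predicted = result.slope * 7 + result.intercept  # predict the DV for a new x
print(result.slope, result.intercept, result.rvalue, predicted)
```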

23
Q

explained variance

A

the sum of the squared differences between each predicted y-value and the mean of y

24
Q

unexplained variance

A

the sum of the squared differences between the y-value of each ordered pair and its corresponding predicted y-value; total variance = explained variance + unexplained variance, where total variance is the sum of squared differences between each y-value and the mean of y; the smaller the variance, the closer the data points are to the mean and to each other; variance is "the average" of the squared distances from each point to the mean; related terms include correlation and regression; the decomposition is written out below
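Written out, with the predicted value as y-hat and the mean of y as y-bar:

```latex
\underbrace{\sum_i (y_i - \bar{y})^2}_{\text{total}}
= \underbrace{\sum_i (\hat{y}_i - \bar{y})^2}_{\text{explained}}
+ \underbrace{\sum_i (y_i - \hat{y}_i)^2}_{\text{unexplained}}
```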

25
Q

ANOVA

A

analysis of variance; used to analyze differences among group means using the F-distribution; a one-way ANOVA compares the means of independent (unrelated) groups on a single factor; the difference between a t-test and an ANOVA is that an ANOVA can compare three or more groups
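A minimal sketch of a one-way ANOVA, assuming SciPy (the three groups' data are made up):

```python
from scipy import stats

# Made-up samples from three independent groups.
group1 = [10, 12, 11, 13]
group2 = [14, 16, 15, 17]
group3 = [10, 11, 12, 10]

f_stat, p_value = stats.f_oneway(group1, group2, group3)  # one-way ANOVA
print(f_stat, p_value)  # H0: all group means are equal
```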

26
Q

two-way (factorial) ANOVA

A

use when you want to know how 2 IVs, in combination, affect a DV; the only difference between a one-way and a two-way ANOVA is the number of IVs (see the sketch under "interactions" below)

27
Q

repeated measures ANOVA

A

aka within-subjects ANOVA; compares means across one or more variables based on repeated observations; can include zero or more IVs, but has at least one DV; related to the dependent-samples t-test (though here there may be no IV)

28
Q

interactions

A

a special property of three or more variables, in which two or more IVs combine to affect a third (dependent) variable in a non-additive manner; usually tested using an ANOVA, though multiple regression can also be used; related term - ANOVA
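A minimal sketch of a two-way (factorial) ANOVA with an interaction term, assuming pandas and statsmodels (the drug/dose data are made up):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Made-up data with two IVs (drug, dose) and one DV (response).
df = pd.DataFrame({
    "drug":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "dose":     ["lo", "lo", "hi", "hi", "lo", "lo", "hi", "hi"],
    "response": [5, 6, 9, 10, 6, 5, 14, 15],
})

# "drug * dose" fits both main effects plus the drug-by-dose interaction.
model = ols("response ~ C(drug) * C(dose)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # the interaction row tests non-additivity
```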

29
Q

chi square

A

compares two variables in a contingency table to see if they are related; tests whether the distributions of categorical variables differ from each other; the null hypothesis of the chi-square test is that no relationship exists between the categorical variables in the population (they are independent)
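A minimal sketch, assuming SciPy (the 2x2 table counts are made up):

```python
from scipy import stats

# Made-up 2x2 contingency table: treatment (rows) x outcome (columns).
table = [[30, 10],
         [20, 25]]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p)  # H0: the two categorical variables are independent
```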

30
Q

power

A

statistical power - the probability that a hypothesis test will find an effect if there is an effect to be found; a power analysis can be used to estimate the minimum sample size required for an experiment, given a desired significance level, effect size, and statistical power; you want the power of your study to be high; power is the likelihood of finding a significant result (for ex., being able to reject the null) assuming the alternative hypothesis is actually true; if your power is low, there is a high chance that your study will not detect a true effect; to increase power - increase the sample size (most common), increase the alpha level, or conduct a directional (one-tailed) test

31
Q

a priori power

A

used when designing a study; tells you what sample size is needed to detect some level of effect with inferential stats (for ex., a p-value); base the effect size on past research; you can always be conservative by predicting a small-to-medium effect size (.20-.40); if you know alpha, "d", and the desired power level (.80 is the default), then you can find N (the sample size) - see the sketch below
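A minimal sketch, assuming statsmodels is available (the medium effect size d = .50 is an illustrative choice):

```python
from statsmodels.stats.power import TTestIndPower

# A priori: find N per group for an independent-samples t-test,
# assuming a medium effect (d = .50), alpha = .05, desired power = .80.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(n_per_group)  # ~64 subjects per group
```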

32
Q

post-hoc power analysis

A

a follow-up analysis, especially if the finding is non-significant; if you know N and alpha, and you can estimate d, then you can calculate the power the study achieved

33
Q

error/noise

A

measurement or sampling errors; the variability within a sample, or estimation error caused by a random variable; the unknown difference between the retained (measured) value and the true value

34
Q

heterogeneity

A

variability among data or study outcomes in a meta-analysis; populations, samples, or results differ from one another; the opposite is homogeneity

35
Q

Bonferroni corrections

A

a multiple-comparison correction used when several dependent or independent statistical tests are performed simultaneously; purpose - adjusts the significance (alpha) level b/c of the increased risk of a type I error; formula: divide the original alpha level (most likely 0.05) by the number of tests being performed; the most common/easiest post hoc test; can be used with any type of statistical test (not just ANOVA - correlation, etc.)
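A minimal sketch in plain Python (three tests is an illustrative count):

```python
# Bonferroni correction: divide the original alpha by the number of tests.
alpha = 0.05
n_tests = 3  # e.g., three pairwise comparisons after an ANOVA

adjusted_alpha = alpha / n_tests
print(adjusted_alpha)  # ~0.0167 -- each test must beat this to be significant
```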