Final Exam Flashcards

1
Q

What is the concept of a population?

A

An entire group of individuals

Ex. All voters in the United States

2
Q

What is the concept of a sample?

A
  • Usually populations are too large to examine the entire group, so a smaller sample is taken to represent the population
  • Goal is to use sample results to answer questions about a population (inferential statistics)

Sample statistics are not perfect representatives of population parameters (this discrepancy is called sampling error)

3
Q

Know how to define and identify a nominal variable

A
  • Unordered set of categories that are identified by their different names
  • Measurements can label and categorize observations, but do not make any quantitative distinctions between observations
  • Only determination you can make is whether two individuals are the same or different on that variable

ex. favorite ice cream flavor

4
Q

Know how to define and identify an ordinal variable

A
  • An ordered set of categories
  • Tells you the direction of difference between two individuals (but not the size of said difference)

Ex. class rank, place in a race, large vs. small drink

5
Q

Know how to define and identify an interval variable

A
  • An ordered series of equal-sized categories
  • Identifies the direction and magnitude of a difference
  • Zero point is located arbitrarily… zero does not mean none of the thing

ex. Temperature in C or F

6
Q

Know how to define and identify a ratio variable.

A
  • Ordered series of equal sized categories
  • Can identify the direction and magnitude of a difference
  • Zero indicates none of the thing

Ex. Temperature in K, distance

7
Q

Know how to define and identify a correlational study

A
  • Goal is to determine the strength and direction of the relationship between two variables
  • Uses observations of the two variables as they exist naturally
  • Correlation cannot determine causation
8
Q

Know how to define and identify an experimental study

A
  • Examine the relationship between 2 or more variables by changing one variable and observing the effects on the other variable
  • To establish a cause and effect relationship between the two variables, an experiment attempts to control all other variables to prevent them from influencing the results
9
Q

Know how to define and identify a nonexperimental study

A
  • Compare groups of scores but do not use a manipulated variable to differentiate groups
  • Therefore, no causal determinations can be made
10
Q

When given a dataset, know how to compute: the mode

A
  • The mode is the most frequently occurring score or class interval in the distribution
  • In a frequency distribution graph, the mode corresponds to the high point of the distribution
  • Can be measured for data measured on any scale of measurement; is the only measure of central tendency that can be used for data measured on a nominal scale
  • General term is also used to describe a peak in a distribution that is not necessarily the highest point… (major mode at the highest peak and a minor mode at a secondary peak, used when distribution is clearly humped)

Possible to have more than one mode: 2 = bimodal, 3+ = multimodal…
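A minimal sketch of finding the mode with Python's standard `statistics.multimode`, which returns every value tied for the highest frequency. The scores and flavors are made-up illustration data, not from the course.

```python
from statistics import multimode

# Made-up quiz scores
scores = [2, 3, 3, 4, 5, 5, 6]

# multimode returns all values that share the highest frequency
modes = multimode(scores)
print(modes)  # [3, 5] -> two modes, so this set of scores is bimodal

# The mode also works for nominal data, where a mean or median is impossible
flavors = ["vanilla", "chocolate", "vanilla", "mint"]
flavor_modes = multimode(flavors)
print(flavor_modes)  # ['vanilla']
```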

11
Q

When given a dataset, know how to compute: the median

A
  • The median divides the scores so that 50% have values equal to or less than the median
  • If scores are listed smallest to largest, the median is the midpoint of the list
  • Requires scores that can be placed in rank order, i.e. measured on an ordinal, interval, or ratio scale
  • If odd # of scores, median is the middle score… if even # of scores, median is the sum of the 2 middle scores divided by 2

Median is relatively unaffected by extreme scores, so tends to stay in the center of the distribution even when there are a few extreme scores or the distribution is very skewed. In these situations, the median serves as a good alternative to the mean
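As a rough illustration of the odd/even rule and of the median's resistance to extreme scores, here is a short sketch using Python's standard `statistics` module. All scores are made up.

```python
from statistics import mean, median

odd_scores = [3, 5, 8, 10, 11]    # odd count: the middle score
even_scores = [3, 5, 8, 10]       # even count: average of the 2 middle scores
print(median(odd_scores))    # 8
print(median(even_scores))   # 6.5

# Replacing the top score with an extreme value moves the mean a lot,
# but the median stays where it was
with_outlier = [3, 5, 8, 10, 100]
print(mean(odd_scores), median(odd_scores))       # 7.4 8
print(mean(with_outlier), median(with_outlier))   # 25.2 8
```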

12
Q

When given a dataset, know how to compute: the mean

A
  • The mean is calculated by computing the sum of the entire set of scores, and dividing this sum by the number of scores
  • Most commonly used measure of central tendency
  • Can be used for ordinal, interval, or ratio scales (best for interval and ratio)
  • Conceptually, the mean can also be defined as the balance point of the distribution (sum of the distances below the mean is exactly equal to the sum of the distances above the mean)

-Changing the value of any score will always change the mean. Adding or discarding scores will almost always change the mean (unless the score added or discarded is equal to the mean)
-If a constant value is added to or subtracted from every score, the mean changes by that same constant. The same holds when multiplying or dividing every score by a constant: the mean is multiplied or divided by that constant
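The properties of the mean listed above can be checked with a few lines of plain Python. This is only a sketch with made-up scores; the helper name `mean` is chosen here for illustration.

```python
def mean(scores):
    """Sum of all scores divided by the number of scores."""
    return sum(scores) / len(scores)

scores = [2, 4, 6, 8]             # made-up scores
m = mean(scores)                  # 20 / 4
print(m)                          # 5.0

# Balance point: distances below and above the mean cancel out
balance = sum(x - m for x in scores)
print(balance)                    # 0.0

# Adding 3 to every score adds 3 to the mean
print(mean([x + 3 for x in scores]))   # 8.0
# Doubling every score doubles the mean
print(mean([x * 2 for x in scores]))   # 10.0
# Adding a new score equal to the mean leaves the mean unchanged
print(mean(scores + [5]))              # 5.0
```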

13
Q

Know under what circumstances the mean does not provide a representative value

A
  • When the distribution contains a few extreme scores (like US income)
  • Or is very skewed… the mean will be pulled toward the tail or toward the extreme scores (the mean will not provide a central value)
  • In a definitively humped distribution, the mean score may actually represent nobody in the distribution
  • With data from a nominal scale, it is impossible to compute a mean
14
Q

Know how to identify the different shapes of distribution graphs: Symmetrical

A

-Left side is roughly a mirror image of the right (can be a normal curve, or can be bimodal if the two peaks are mirror images)

15
Q

Know how to identify Positive and Negative skew

A
  • Skewed distribution: scores pile up on one side of the distribution
  • Leave a “tail” of a few extreme values on the other side
  • Positive skew- scores pile on the left side with the tail pointing right
  • Negative skew- scores pile on the right side with the tail pointing left
16
Q

Know how to identify bimodal distribution graph

A

Clearly has two humps, or two peaks

17
Q

Relationships between mean, median, and mode

A
  • The three are often systematically related because they all measure central tendency
  • Ex. In a symmetrical distribution, mean=median=mode (if there is one mode)
  • Ex. in skewed distribution: mode located at the peak on one side, mean usually displaced toward the tail on other side, median usually located between the mean and the mode
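To see the usual ordering in a skewed distribution, the sketch below computes all three measures for a small, made-up, positively skewed set of scores using Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Made-up positively skewed scores: most pile up on the left, tail points right
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

print(mode(scores))     # 2   (at the peak)
print(median(scores))   # 3.0 (between the mode and the mean)
print(mean(scores))     # 4.6 (pulled toward the tail)
```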
18
Q

What do the notations S and σ mean?

A
  • S=sample standard deviation
  • σ=population standard deviation
19
Q

Know how to calculate the sum of squares given a dataset

A
  1. Find the mean of the dataset
  2. Subtract the mean from each score (find the deviations)
  3. Square each deviation
  4. Add up the squared deviations (this is the sum of squares)!
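The four steps can be sketched in plain Python as follows. The scores are made up for illustration.

```python
scores = [1, 3, 5, 7, 9]                      # made-up scores

mean = sum(scores) / len(scores)              # step 1: mean = 5.0
deviations = [x - mean for x in scores]       # step 2: -4, -2, 0, 2, 4
squared = [d ** 2 for d in deviations]        # step 3: 16, 4, 0, 4, 16
ss = sum(squared)                             # step 4: sum of squares
print(ss)                                     # 40.0
```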
20
Q

Know how to calculate the variance given a dataset

A
  1. Find the sum of squares
  2. Divide the sum of squares by n-1 (sample) or N (population)
  3. This value is the variance (s squared for a sample, σ squared for a population)
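A small sketch of the sample versus population calculation, using made-up scores. The only difference between the two is the divisor.

```python
import math

scores = [1, 3, 5, 7, 9]                        # made-up scores
n = len(scores)
mean = sum(scores) / n                          # 5.0
ss = sum((x - mean) ** 2 for x in scores)       # sum of squares = 40.0

sample_variance = ss / (n - 1)                  # 40 / 4
population_variance = ss / n                    # 40 / 5
print(sample_variance, population_variance)     # 10.0 8.0

# Standard deviation is the square root of the variance
sample_sd = math.sqrt(sample_variance)          # about 3.16
population_sd = math.sqrt(population_variance)  # about 2.83
```

Python's standard `statistics.variance` and `statistics.pvariance` perform the same sample and population calculations.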
21
Q

What is the difference between the standard deviation for a sample vs. a population?

A
  • For samples, we divide the SS by n-1 when finding the variance
  • For populations, we divide the SS by N when finding the variance
  • Dividing by n-1 inflates the sample estimate of variance, to account for the fact that sample variance will typically underestimate population variance (this bias is stronger with smaller samples, and the df correction helps account for that too)
22
Q

What is the most common measure of variability?

A
  • Standard deviation
  • Measures the average distance from the mean for scores in the data set
23
Q

Know how to define a Z-score

A
  • Z-scores express data in terms of the mean and standard deviation; a z-score tells us how far a point is from the mean in standard deviation units
  • The value can tell us exactly where a raw score is located relative to all other scores
  • Sign (+ or -) identifies whether the X score is above or below the mean… its numerical value = the # of standard deviation units between X and the mean
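A minimal sketch of the z-score calculation. The function name `z_score` and the distribution (mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical distribution: mean 100, standard deviation 15
print(z_score(130, 100, 15))   # 2.0  -> two SDs above the mean
print(z_score(85, 100, 15))    # -1.0 -> one SD below the mean
print(z_score(100, 100, 15))   # 0.0  -> exactly at the mean
```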
24
Q

What does a z-score of 0 indicate?

A

The X score is equal to the mean

25
Q

What is the role of probability in inferential statistics?

A
  • It’s impossible to predict exactly which scores will be obtained when you take a sample from the population
  • Probability allows us to determine the likelihood of getting specific samples
  • If the probability of getting a specific sample is low, we can say that the sample probably came from some other population
26
Q

Understand the definition of probability

A
  • Probability is the likelihood of an event occurring for a situation in which several different outcomes are possible
  • probability of A= # of outcomes classified as A / total # of possible outcomes

Typically we use proportions to summarize previous observations, and probability is used to predict future, uncertain outcomes

27
Q

Understand random sampling and the requirements that must be satisfied

A
  • Random sampling requires that each individual in the population has an equal chance of being selected. A sample obtained by this process is called a simple random sample
  • Independent random sampling (aka random sampling with replacement) also requires that the probabilities stay constant from one selection to the next (the probability that someone will be selected does not change based on the individuals already selected)
28
Q

What makes a distribution of sample means normal?

A
  • Distribution of sample means is the collection of sample means for ALL possible random samples of a particular size (n) that can be obtained from a population
  • Logically, most of the samples should have means close to the population mean. As a result, the sample means should pile up in the center of the distribution and the frequencies should taper off as the distance between the population mean and sample mean increases

The sample means obtained with a large sample size should cluster relatively close to the population mean; the means obtained from small samples should be more widely scattered (less representative)

29
Q

Understand sampling error

A
  • The natural discrepancy, or amount of error, between a sample statistic and its corresponding population parameter
30
Q

Central Limit Theorem

A
  • Mean of the theoretical distribution of sample means is called the Expected Value of M (Always equal to the population mean)
  • The standard deviation of the theoretical distribution of sample means is called the Standard Error of the Mean and is computed by (σ / square root of n)
  • The shape of the distribution of sample means is typically normal
  • Distribution of sample means approaches a normal distribution as n approaches infinity (is guaranteed to be almost perfectly normal if the population the samples are obtained from is normal, or the sample size is n=30 or more)
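As a quick illustration of the standard error formula, the sketch below shows how the standard error shrinks as n grows. The population standard deviation of 20 is a made-up value.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 20   # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(n, standard_error(sigma, n))
# 1 20.0
# 4 10.0
# 25 4.0
# 100 2.0
```

Larger samples give sample means that cluster more tightly around the population mean, which is why the standard error falls as n rises.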
31
Q

What is the alpha level?

A
  • The alpha level establishes a criterion for deciding whether or not to reject the null hypothesis
  • Critical region consists of outcomes very unlikely to occur if the null hypothesis is true (defined by associations that are very unlikely to be obtained if no effect exists)
  • If p-value is less than the alpha level, we reject the null hypothesis (if not, we fail to reject the null)
32
Q

Understand Type I and Type II errors and the differences between them

A
  • Type I error occurs when the sample data indicate an effect when no true effect exists. (Rejecting the null when the null is true)…caused by unusual, unrepresentative samples…hypothesis tests are structured to make this error unlikely
  • Type II error occurs when the hypothesis test does not indicate an effect but in reality an effect does exist. (Fail to reject the null when the null is actually false)… more likely with a small treatment effect or too small sample size
33
Q

How is p-value used in hypothesis testing?

A
  • The p-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis
  • We typically set the alpha level at α = .05 and check to see if p is less than .05
  • When p=.05 there is around a 20-50% chance of a Type I error
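A sketch of how a p-value is compared to alpha, using a two-tailed z-test and the standard library's `statistics.NormalDist`. The population values (μ = 100, σ = 15) and the sample (M = 106, n = 25) are made up for illustration.

```python
import math
from statistics import NormalDist

mu, sigma = 100, 15      # hypothetical known population mean and SD
M, n = 106, 25           # hypothetical sample mean and sample size
alpha = 0.05

standard_error = sigma / math.sqrt(n)     # 15 / 5 = 3.0
z = (M - mu) / standard_error             # 6 / 3 = 2.0

# Two-tailed p-value: probability of a z at least this extreme if H0 is true
p = 2 * (1 - NormalDist().cdf(abs(z)))    # about 0.0455

reject_null = p < alpha
print(round(p, 4), reject_null)           # 0.0455 True
```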
34
Q

Why do we calculate effect size in addition to statistical significance?

A
  • A hypothesis test evaluates statistical significance, and is influenced by both sample size and the magnitude of the treatment effect. (Even a very small effect will be significant in a large enough sample, because the standard error becomes very small)
  • Effect size measures the absolute magnitude of an effect, independent of sample size
35
Q

How do you calculate and interpret Cohen’s d?

A
  • Cohen’s d is a standardized effect size… measures mean difference in terms of the standard deviation
  • .2-.49=small effect, .5-.79=med effect, .8+=large effect
  • Effect size is not influenced by sample size, but a large standard deviation decreases it
36
Q

What is the power of a hypothesis test and why is it important? What factors influence statistical power?

A
  • The power of a hypothesis test is the probability that the test will reject the null hypothesis when there is actually an effect
  • Depends on: effect size (larger effects are easier to find), sample size (larger samples make it easier to find effects), alpha level (larger alpha level makes it easier to find effects), directional vs. non-directional (directional tests make it easier to find effects)
37
Q

When is a t-test used instead of a z-score?

A
  • A z-test compares a sample mean to a known population mean, but to do so we need the population standard deviation to calculate the standard error
  • The t-test does not require knowledge of the population standard deviation (but we do need to know or guess something about the population mean)

All that is required for a t-test is a sample and a reasonable hypothesis about the population mean

38
Q

What four assumptions should be true (or close to true) when using the t-statistic?

A
  • The data are continuous
  • The sample data have been randomly sampled from the population
  • The variability of the data in each group is similar
  • The sampled population is approximately normally distributed
39
Q

What are the steps in conducting a one-sample t-test?

A
  • Estimate the population mean from previous research or theory, or choose it to represent a defined and meaningful threshold
  • We calculate how much difference between M and μ is reasonable to expect (“the noise”)… using estimated standard error of M by using s instead of σ
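A minimal sketch of the one-sample t calculation, assuming made-up scores and a hypothesized population mean of 10. The critical value is read from a standard t table rather than computed.

```python
import math
from statistics import mean, stdev

scores = [12, 15, 11, 14, 13]    # made-up sample
mu = 10                          # hypothesized population mean (assumption)

n = len(scores)
M = mean(scores)                 # sample mean = 13
s = stdev(scores)                # sample SD: sqrt(SS / (n - 1)) = sqrt(10 / 4)
s_M = s / math.sqrt(n)           # estimated standard error of M
t = (M - mu) / s_M               # about 4.24
df = n - 1                       # 4

t_critical = 2.776               # t table: df = 4, two-tailed, alpha = .05
reject_null = abs(t) > t_critical
print(round(t, 2), df, reject_null)
```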
40
Q

What is the influence of sample size and sample variance on t-test?

A
  • With large samples, t-test is very similar to a z-test; with small samples the t-value will provide a relatively poor estimate of z
  • Large df makes the t-distribution nearly normal; small df makes it flatter and more spread out

For one sample t-test degrees of freedom: df=n-1

41
Q

How do you calculate Cohen’s d for the t-statistic?

A
  • Cohen’s d = (M − μ) / s (uses s instead of σ)
  • .2-.49=small effect
  • .5-.79=med effect
  • .8+=large effect
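The formula and ranges on this card can be sketched as below. The function names and input values are made up, and the label for values under .2 is an assumption, since the card does not name that range.

```python
def cohens_d(M, mu, s):
    """Mean difference expressed in sample standard deviation units."""
    return (M - mu) / s

def effect_label(d):
    """Labels follow the ranges on this card; values under .2 get a placeholder."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small range"   # assumption: not named on the card

d = cohens_d(M=53, mu=50, s=6)       # made-up values: 3 / 6
print(d, effect_label(d))            # 0.5 medium
```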
42
Q

What does r² mean?

A
  • r² is the proportion of variance in the scores accounted for by the IV (often reported as a percentage)
  • r² = t² / (t² + df)
  • .01-.08=small effect
  • .09-.24=medium effect
  • .25+=large effect
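A one-function sketch of this formula. The t value and df are made up for illustration.

```python
def r_squared(t, df):
    """Proportion of variance accounted for: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

r2 = r_squared(t=3.0, df=16)   # 9 / (9 + 16)
print(r2)                      # 0.36 -> a large effect by the ranges above
```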
43
Q

Independent Measures vs. Single-sample t-tests

A
  • Independent Measures uses two separate and independent samples
  • Can test for mean differences between two distinct populations (ex. those w college degrees and those without) or between two different conditions (ex. therapy vs. placebo)

Independent measures is used in situations where a researcher has no prior knowledge about either of the two populations (or treatments) being compared. Both population means and standard deviations are unknown… the values must be estimated from the sample data

44
Q

Understand the differences in hypotheses of independent measures vs. single-sample t-test (this is the independent measures ex.)

A
  • Non-directional (Two-tailed)
    H0: μ1 = μ2
    H1: μ1 ≠ μ2
  • Directional (One-tailed)
    H0: μ1 ≤ μ2
    H1: μ1 > μ2
45
Q

Be able to calculate the independent-measures t-statistic and degrees of freedom

A
  • df of sample 1 is n1-1, df of sample 2 is n2-1 (add these together for total df: df = n1 + n2 - 2)
  • Test statistic uses the M’s from both samples in the numerator, and the variability from both samples to calculate the standard error in the denominator
  • t = (M1 - M2) / s(M1-M2), where s(M1-M2) is the estimated standard error of the mean difference
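A rough sketch of the independent-measures t calculation with made-up scores. It assumes the standard pooled-variance form of the standard error, which the card does not spell out: pooled variance = (SS1 + SS2) / (df1 + df2).

```python
import math

group1 = [8, 10, 12, 14]     # made-up scores, condition 1
group2 = [4, 6, 8, 10]       # made-up scores, condition 2

def mean(xs):
    return sum(xs) / len(xs)

def sum_of_squares(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(group1), len(group2)
df = (n1 - 1) + (n2 - 1)                                   # 3 + 3 = 6

# Pooled variance: combined SS divided by combined df (assumed standard formula)
pooled_var = (sum_of_squares(group1) + sum_of_squares(group2)) / df   # 40 / 6

# Estimated standard error of the mean difference
se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)

t = (mean(group1) - mean(group2)) / se_diff                # 4 / 1.826
print(df, round(t, 3))                                     # 6 2.191
```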
46
Q

homogeneity of variance assumption

A
  • Two populations being compared must have the same variance
47
Q

Be able to calculate effect size (Cohen’s d and r-squared) (Independent Samples)

A
  • Cohen’s d: (M1 - M2) / square root of the pooled variance
  • r²: t² / (t² + df)
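Both effect sizes can be sketched from summary values. All numbers below are made up (two groups of 8, means of 12 and 10, pooled variance of 16), and the function names are chosen here for illustration.

```python
import math

def cohens_d(M1, M2, pooled_variance):
    """Mean difference in units of the pooled standard deviation."""
    return (M1 - M2) / math.sqrt(pooled_variance)

def r_squared(t, df):
    """Proportion of variance accounted for: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Made-up summary values
M1, M2, pooled_variance = 12, 10, 16
n1, n2 = 8, 8
df = n1 + n2 - 2                                               # 14

se_diff = math.sqrt(pooled_variance / n1 + pooled_variance / n2)   # 2.0
t = (M1 - M2) / se_diff                                        # 1.0

d = cohens_d(M1, M2, pooled_variance)                          # 2 / 4
r2 = r_squared(t, df)                                          # 1 / 15
print(d, round(r2, 3))                                         # 0.5 0.067
```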
48
Q

Understand when a repeated measures design is used and be able to identify examples

A
  • Repeated measures evaluates the mean difference between two measurements taken from a single sample (you have each participant complete both the congruent and incongruent condition, there is no control group, the data consist of two scores for each individual, you use difference scores to determine the effect of the conditions)
  • Tests a hypothesis about the population mean difference between two measurements using a single sample
49
Q

Understand the strengths and weaknesses of using a repeated-measures design over an independent-measures design

A
  • Requires fewer participants because individual differences in performance from one participant to another are eliminated (reduces the variance, which reduces estimated standard error, which increases power)
  • Are particularly well suited for examining changes that occur over time (ex. learning or development)
  • One disadvantage is testing effects (exposure to the first condition may influence scores in the second condition)
  • Another disadvantage is floor and ceiling effects (floor = scores can only go up, ceiling = scores can only go down)
50
Q

Be able to calculate a repeated-measures t-statistic and make a decision regarding your hypothesis

A
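  • t = (MD - μD) / sMD, where MD is the mean of the difference scores and μD is the hypothesized population mean difference
  • sMD = sD / square root of n, where sD is the standard deviation of the difference scores
  • df = n-1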
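A minimal sketch of the repeated-measures calculation using difference scores. The before/after scores are made up, μD is assumed to be 0 under the null hypothesis, and the critical value is read from a standard t table.

```python
import math

before = [10, 12, 9, 14, 11]     # made-up scores, first measurement
after = [13, 14, 10, 17, 13]     # same participants, second measurement

diffs = [a - b for a, b in zip(after, before)]    # [3, 2, 1, 3, 2]
n = len(diffs)
MD = sum(diffs) / n                                # mean difference = 2.2
ss_d = sum((d - MD) ** 2 for d in diffs)           # SS of the difference scores
s_d = math.sqrt(ss_d / (n - 1))                    # SD of the difference scores
s_MD = s_d / math.sqrt(n)                          # estimated standard error

mu_D = 0                                           # H0: no mean difference (assumption)
t = (MD - mu_D) / s_MD
df = n - 1

t_critical = 2.776                                 # t table: df = 4, two-tailed, alpha = .05
reject_null = abs(t) > t_critical
print(df, round(t, 2), reject_null)                # 4 5.88 True
```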
51
Q

What are the strengths of using an ANOVA over multiple t-tests?

A
  • Can evaluate mean differences between two or more populations
  • Protects researchers from excessive risk of a Type I error because it evaluates all the mean differences in a single test, rather than letting the risk accumulate across multiple separate tests
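To see why running many t-tests inflates the Type I error risk, the sketch below computes the chance of at least one false positive across all pairwise comparisons. It assumes the tests are independent, which is a simplification, and the function name is hypothetical.

```python
from math import comb

def familywise_error(alpha, num_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests

for groups in (2, 3, 4, 5):
    pairs = comb(groups, 2)       # number of pairwise t-tests needed
    print(groups, pairs, round(familywise_error(0.05, pairs), 3))
# 2 1 0.05
# 3 3 0.143
# 4 6 0.265
# 5 10 0.401
```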
52
Q

Understand the logic behind the F-ratio

A

F = MS between / MS within
-MS between reflects the treatment effect plus random chance or sampling error
-MS within reflects only random chance or sampling error
-F should be near 1 if the null is true (a large effect of the IV produces a large F-ratio)
-The critical value is found in the F table using df between (column) and df within (row)
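The F-ratio can be sketched by hand for a tiny one-way design. The three groups of scores are made up, and the variable names are chosen here for illustration.

```python
groups = [[2, 4, 6], [4, 6, 8], [6, 8, 10]]    # made-up scores, 3 treatments

def mean(xs):
    return sum(xs) / len(xs)

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)                                            # 6.0

# Between treatments: how far each group mean is from the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)   # 24.0
df_between = len(groups) - 1                                             # 2

# Within treatments: spread of scores around their own group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)           # 24.0
df_within = len(all_scores) - len(groups)                                # 6

ms_between = ss_between / df_between     # 12.0
ms_within = ss_within / df_within        # 4.0
F = ms_between / ms_within
print(F)                                 # 3.0
```

With df = 2 and 6, a standard F table gives a critical value of 5.14 at α = .05, so an F of 3.0 in this made-up example would not reject the null.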

53
Q

Why are post hoc tests conducted after ANOVAs and what do they tell us?

A
  • Determine exactly which groups are different and which are not
  • Done after an ANOVA where H0 is rejected
  • The tests compare the treatments two at a time to test the mean differences while correcting for concerns about experiment-wise type I error inflation
54
Q

Effect size for One-way ANOVA

A

.02-.12= small effect
.13-.25=med effect
.25+=large effect