Midterm 2 - Chapter 12 Flashcards

1
Q

When do we need stats in the research process?

A

After collecting data - need to summarize & communicate findings

2
Q

2 types of stats:

A
  1. Descriptive Statistics
  2. Inferential Statistics
3
Q

How should we most efficiently present research:

A

Want to convey maximum information using minimum space

4
Q

Purpose of descriptive statistics?

A

Summarizes mass of data points
- Understanding and interpretation
- Visual displays, appropriate calculations

In experiments, can calculate within each condition/group:
- Mean, standard deviation…

In correlation designs
- For each variable, calculate mean, standard deviations, etc
- For every pair of variables, calculate a correlation coefficient (also a descriptive statistic)

5
Q

3 Types of Descriptive Statistics

A
  1. Measures of Central Tendency (Mean, Median, Mode)
  2. Measures of Variability (Range, Variance, Standard Deviation)
  3. Measures of Relationship (Correlation, Multiple Regression, Multiple Correlation, Partial Correlation)
6
Q

Scales of Measures

A
  • Nominal
  • Ordinal
  • Interval
  • Ratio
7
Q

Nominal

A

Group or categorization
- No order or direction
- Summarized by proportion/percentages or the mode

8
Q

Ordinal

A

Ranked order (1st, 2nd, 3rd..)
- Uneven spaces between “scores”

9
Q

Interval

A

Numerical scales in which intervals have the same interpretation throughout but there is no true zero (e.g. temperature in Celsius: 0 degrees still indicates a temperature)

10
Q

Ratio

A

An interval scale with a true zero reference point (e.g. 0 pounds)
- Summarized with the mean or median and standard deviation

11
Q

Measures of central tendency

A
  • Describe what’s happening at middle of data
  • What’s “normal”?
12
Q

3 measures of central tendency:

A

Mean, Median, Mode

13
Q

Mean

A

= arithmetic average

  • What we usually use!
  • Uses information from every single score
  • Add up all scores in each group and divide by the
    number of scores in each group
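
A minimal sketch of that calculation in Python. The group names and scores are made up for illustration and are not from the course material:

```python
# Hypothetical scores for two groups (illustrative values only).
groups = {
    "control": [4, 6, 5, 7, 8],
    "treatment": [7, 9, 8, 10, 6],
}

def mean(scores):
    """Add up all scores and divide by the number of scores."""
    return sum(scores) / len(scores)

# One mean per group, as you would report per condition in an experiment.
group_means = {name: mean(scores) for name, scores in groups.items()}
print(group_means)
```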
14
Q

Downsides of Mean:

A

□ Affected by outliers (i.e., extreme scores)

15
Q

Upsides of mean:

A

□ With increasing sample size, each extreme score has less effect on the mean.
□ Maximizes use of all of our data.
□ Has mathematical properties that enable us to
use it in statistical analysis.

16
Q

Outliers - With increasing sample size, the mean is

A

Less affected by outliers

□ Main idea here - check for outliers if you only have a small sample, but try to get a large sample
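
A rough sketch of why sample size matters here. The helper `mean_shift` and the values 50 and 500 are assumptions chosen only to make the effect easy to see:

```python
from statistics import mean

def mean_shift(n, typical=50, outlier=500):
    """How far a single outlier moves the mean of n otherwise identical scores."""
    scores = [typical] * (n - 1) + [outlier]
    return mean(scores) - typical

small = mean_shift(10)     # one outlier among 10 scores
large = mean_shift(1000)   # the same outlier among 1000 scores
print(small, large)
```

With 10 scores the outlier pulls the mean up by 45 points; with 1000 scores the same outlier moves it by less than half a point.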

17
Q

Median

A

= score that divides group in half

  • 50% of the scores above, 50% below
  • Used if there are extreme scores (outliers)
18
Q

How to find median:

A

Put scores in order. Count number of scores.

If odd #: identify the middlemost score.

If even #: identify the two middle scores, take the average of them.
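
The procedure can be sketched in Python as follows. The function name and example scores are illustrative, and an empty list is not handled:

```python
def median(scores):
    """Order the scores, then take the middle one (or average the middle two)."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: middlemost score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average of middle two

odd_example = median([7, 3, 9, 1, 5])    # ordered: 1 3 5 7 9
even_example = median([7, 3, 9, 1])      # ordered: 1 3 7 9
with_outlier = median([30, 35, 40, 45, 1000])
print(odd_example, even_example, with_outlier)
```

The last call shows why the median is preferred when there are extreme scores: the value 1000 does not move it.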

19
Q

When is median useful?

A

Whenever it’s most descriptively informative to report the value for which equal numbers of people score higher
and lower (e.g. income)

  • Also, when you can spot an outlier
20
Q

Mode

A

= most frequently occurring score

  • Sometimes no mode; sometimes more than one
  • Usually used for nominal or ordinal variables
  • Put the scores in order – look for most frequently occurring score(s)
  • May be none, or more than one
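
A small sketch of finding the mode(s) in Python. Treating "every score occurs once" as "no mode" is an assumption that matches the card above, and the function expects a non-empty list:

```python
from collections import Counter

def modes(scores):
    """Return the most frequent score(s); an empty list if no score repeats."""
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        return []  # every score occurs once: treated here as "no mode"
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([2, 3, 3, 5]))            # one mode
print(modes([1, 1, 2, 2, 3]))         # two modes
print(modes([1, 2, 3]))               # no mode
print(modes(["dog", "cat", "dog"]))   # works for nominal categories too
```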
21
Q

When is mode useful?

A

Whenever it’s most descriptively informative to
report the most frequently occurring score (e.g.: employee salary distribution)

22
Q

Measures of Spread:

A
  • Variability
  • Standard Deviation
23
Q

Variability:

A

The spread in a distribution of scores

AMOUNT of spread is often measured by Standard Deviation

How much each score deviates from the mean

24
Q

Possible Measures of Variability

A

□ Range (max – min)
□ Variance
□ Standard Deviation

25
Q

Issues with range

A

can be too simplistic

26
Q

Issues with variance:

A

not very descriptive (it is expressed in squared units, so it is hard to interpret directly)

27
Q

Standard Deviation

A

□ A measure of variability that enables reference to the Normal Distribution so it’s meaningful (as opposed to variance).
□ Defines what’s “normal” for that variable

28
Q

How to calculate variance:

A
  • Find out how much each score deviates from the mean (e.g. with a mean of 5, a score of 0 deviates by -5)
  • Square each deviation
  • Add up these squared deviations
  • Divide by the NUMBER of scores MINUS 1 (how many scores there are, not their sum; e.g. for the scores 7 and 3, divide by 2 - 1 = 1, not by 10)
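
The steps can be sketched in Python like this, using the two scores 7 and 3 mentioned above. The function name is illustrative:

```python
def sample_variance(scores):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(scores)
    m = sum(scores) / n
    deviations = [x - m for x in scores]      # step 1: deviation of each score
    squared = [d ** 2 for d in deviations]    # step 2: square each deviation
    total = sum(squared)                      # step 3: add them up
    return total / (n - 1)                    # step 4: divide by n - 1

# Scores 7 and 3: mean 5, deviations +2 and -2, squares 4 and 4, sum 8, n - 1 = 1.
example = sample_variance([7, 3])
print(example)
```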
29
Q

One half of the bell curve, SD of +/-1, SD of +/- 2, SD of +/- 3

A
  • 50%
  • 68%
  • 95%
  • 99.7%
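
These percentages can be checked against the standard normal curve. A small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `share_within` is only illustrative:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, SD 1

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

half = z.cdf(0)  # proportion of scores below the mean
within = {k: round(share_within(k) * 100, 1) for k in (1, 2, 3)}
print(half, within)
```

The exact figures are closer to 68.3%, 95.4% and 99.7%; the rounded 68/95/99.7 values are the ones usually memorized.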
30
Q

How to calculate Standard Deviation:

A

SQUARE ROOT of variance
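
A quick check in Python with made-up scores, comparing the square root of the variance with the standard library's `stdev`:

```python
import math
from statistics import stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

var = variance(scores)   # sample variance (divides by n - 1)
sd = math.sqrt(var)      # standard deviation = square root of variance
print(var, sd, stdev(scores))
```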

31
Q

Measures of Relationship:

(3rd type of Descriptive Stats)

A
  • Correlation
  • Multiple Regression
  • Multiple Correlation
  • Partial Correlation
32
Q

Correlation (r) and r-squared

A

□ +/- = “direction” of relationship - Positive or negative?
□ Number = “strength” of relationship?
(How closely is one set associated with other set?)

For linear relationships!!
- r = 0 could mean no relationship OR a non-linear relationship!
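
A sketch of Pearson's r in Python, with made-up data showing a perfect positive relationship, a perfect negative one, and a U-shaped (non-linear) relationship for which r comes out as 0. The function name and data are illustrative:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled by their spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
r_positive = pearson_r(xs, [2 * x + 1 for x in xs])   # straight line going up
r_negative = pearson_r(xs, [-x for x in xs])          # straight line going down
r_curved = pearson_r(xs, [x ** 2 for x in xs])        # U-shape: y depends on x, yet r = 0
print(r_positive, r_negative, r_curved)
```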

33
Q

Correlations - Restriction of range

A

Correlations can be misleading if the full range on both variables is not measured

34
Q

Multiple Correlation (R) and R-squared

A

Expressing correlation as a percentage:
- r² = a proportion
- r² = .28 means that 28% of the variance in scores is shared by MC and written portions.
- r² = .28 means that 28% of the variance in MC scores is predicted by written scores & vice versa

Measures the proportion of variance in the dependent variable that can be explained by the independent variable in a regression model.

35
Q

Amount of Shared Variance

A

If r² = 0, there is NO shared variance!
If r² = 1, there is 100% shared variance (r² goes from 0 to 1 only)
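
A tiny sketch of going from r to shared variance. The value r = .53 is an assumed example, chosen because its square is roughly .28:

```python
def shared_variance(r):
    """Square a correlation to get the proportion of shared variance."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    return r ** 2

for r in (0.0, 0.53, -0.53, 1.0):
    print(f"r = {r:+.2f} -> {shared_variance(r):.0%} shared variance")
```

Note that r = +.53 and r = -.53 give the same shared variance: squaring removes the direction and keeps only the strength.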

36
Q

READINGS

A
37
Q

In which type of scale are the intervals between each rank order NOT equal?

A

Ordinal scale

38
Q

In which type of scale ARE the intervals between each rank order equal?

A

Interval Scale (differences between 90-95 are the same as 115-120)

39
Q

When it’s difficult to know whether an ordinal or interval scale is being used, what should we do?

A

Treat the variables as an interval scale, because when ordinal scales are averaged across many instances, they take on properties similar to an interval scale

40
Q

Data measured on __ and __ scales can be summarized using __

A

interval; ratio; MEAN

41
Q

Variables measured on interval and ratio scale are often referred to as…

A

continuous variables - the values represent an underlying continuum

42
Q

Interval and ratio scales can be treated the same way..

A

Statistically

43
Q

Why should we first explore variables separately?

A

Allows us to get a sense for what the data for each of our variables look like and also identify any possible errors that might have occurred during data collection

44
Q

Graphing frequency distributions

A

Frequency Distribution: indicates the number of participants who select each possible category/scale value on a variable

EX: a poll asks 100 people how many pets they have. They find that 38 people have no pets, 25 have one pet, 17 have two pets, 6 have three pets, and 14 have four or more pets.
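
The pet poll above can be written out as a frequency distribution in Python. The text-bar display is just one illustrative way to show it:

```python
# Counts taken from the pet poll example (100 respondents).
pet_counts = {"0": 38, "1": 25, "2": 17, "3": 6, "4+": 14}

total = sum(pet_counts.values())
percentages = {pets: 100 * count / total for pets, count in pet_counts.items()}

# Crude text bar graph: one '#' per respondent.
for pets, count in pet_counts.items():
    print(f"{pets:>2} pets | {'#' * count} {count} ({percentages[pets]:.0f}%)")

most_common = max(pet_counts, key=pet_counts.get)  # category with the highest frequency
```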

45
Q

Pros of Graphing frequency distributions:

A
  • See what scores are most common/uncommon
  • See shape of distribution
  • Identify OUTLIERS: scores that are unusual, unexpected, very different from scores of other participants
46
Q

Bar Graph:

A

Uses a separate and distinct bar for each piece of information

Used for comparing group means and percentages

47
Q

Types of Frequency Distributions:

A
  • Bar Graphs
  • Pie Charts
  • Histograms
  • Frequency Polygons
48
Q

Pie Charts

A

Divide a whole circle that represents relative percentages

Useful in representing data on a nominal scale

49
Q

Histograms:

A

Uses bars to display a frequency distribution for a continuous variable (e.g. continuous, increasing amounts of a variable)

50
Q

How do histograms differ from bar graphs:

A
  • Histograms: bars touch each other, reflecting a continuous variable (i.e. on the x-axis)
  • Bar Graph: gaps between each bar, helping communicate that values on x-axis are nominal categories
51
Q

Normal Distribution

A

A distribution of scores that is frequently observed, and rather important for stats

Majority of scores cluster around the mean

Only possible for continuous variables (interval or ratio)

52
Q

Standard Deviation:

A

How scores spread out from the mean, on average

53
Q

Breakdown of normal distribution/deviation:

A
  • About 68% of scores fall within 1 standard deviation above and below the mean
  • About 95% fall within 2 standard deviations above/below the mean
54
Q

Frequency Polygons:

A

Alternative to histograms - use a line to represent frequencies for continuous variables

Helpful when you want to examine frequencies for multiple groups simultaneously

55
Q

Descriptive Statistics:

A

Calculating statistics to describe or summarize our data

56
Q

2 main types of descriptive stats:

A

1: measures of central tendency - capture how participants scored overall, across the entire sample
2: measures of variability - how different the scores are from each other, or how widely they’re spread out or distributed

57
Q

Central Tendency

A

Tells us what the scores are like as a whole, or how people scored on average:
- Mean (represented by X̄ in calculations, M in reports)
- Median (Mdn in scientific reports)
- Mode (usable with interval, ratio, or ordinal scales, and the only measure of central tendency suitable for nominal scales)

58
Q

Variability

A

Characterizes the amount of spread in a distribution of scores, for continuous variables

59
Q

Comparing Group Percentages

A

e.g. - wanting to know how groups differ in the ways they respond to questions

  • Can calculate percentages for each group and compare
60
Q

Comparing group means

A

e.g. wanting to see how groups, on average, responded and comparing these numbers

61
Q

Graphing Nominal Data

A

Common way to graph relationships between variables when one variable is nominal is to use a bar graph or line graph

62
Q

When are bar graphs used compared to line graphs?

A

Bar graphs - when values on x-axis are nominal

Line graphs - when values on x axis are numeric

63
Q

Describing Effect-Size Between Two Groups

A

Effect-Size: describing relationships among variables in terms of size, amount, or strength; helps determine how large effects are

64
Q

Effect Size - Cohen’s d:

A

Cohen’s d: comparing two groups on their responses to a continuous variable
- Difference in means between two groups, standardized by expressing it in units of standard deviation

  • In a true experiment, the Cohen’s d value describes the magnitude of the effect of the IV on the DV
  • When studying naturally occurring groups, describes magnitude of effect of group membership on a continuous variable
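
A minimal sketch of Cohen's d in Python. It assumes the pooled standard deviation of the two groups as the standardizing unit, which is one common choice and is not stated in these notes; the scores are made up:

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled SD (assumed standardizer)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

treatment = [6, 8, 10]   # hypothetical scores: mean 8, variance 4
control = [4, 6, 8]      # hypothetical scores: mean 6, variance 4
d = cohens_d(treatment, control)
print(d)  # the means differ by one pooled standard deviation
```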
65
Q

Smallest possible value for Cohen’s d:

A

0 (no effect); there is no maximum value

66
Q

Different analyses are needed when you don’t have distinct groups you wish to compare, but rather..

A

have a range of scores to investigate in terms of their relationship with other scores

67
Q

What data is appropriate for correlational designs?

A

Correlation coefficient: statistic describing whether, how, and how much two variables relate to one another (many different types)

68
Q

Pearson R Correlation Coefficient

A
  • r ranges from -1 to +1; the size of the number (0 to 1) gives the strength of the relationship (NOT a percentage or probability)
  • the sign (+/-) tells us the direction of the relationship
69
Q

How can R be graphed - Scatterplots

A
  • Scatterplot: each pair of scores is plotted as a single point in a graph
  • Perfect relationships = perfectly diagonal lines (however, remember measurement errors!)
  • Whenever relationships aren’t perfect, if you know a person’s score on the first variable, you can’t perfectly predict what that person’s score will be on the second variable
70
Q

Pros of scatterplots:

A
  • Provide ways of seeing how variables relate to one another
  • Allow researchers to detect outliers
71
Q

Important Considerations:

A
  • Restriction of Range
  • Curvilinear Relationship
72
Q

Important Considerations - Restriction of Range:

A
  • If the full range of possible scores isn’t sampled, but instead restricted, the correlation coefficient produced with these data can be misleading - LESS variability in the scores and thus, less variability that can be explained or predicted by the other variable
  • The issue can occur when people you’re sampling are all very similar on one or both of the variables you are studying
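
A rough demonstration of restriction of range in Python. The data are artificial (a rising trend plus an alternating bump of 3 points), chosen only so the effect is easy to see:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Full range: x from 1 to 20, y rises with x but bounces up/down by 3.
xs = list(range(1, 21))
ys = [x + (3 if x % 2 else -3) for x in xs]
r_full = pearson_r(xs, ys)

# Restricted range: only people scoring 9 to 12 on x are sampled.
restricted = [(x, y) for x, y in zip(xs, ys) if 9 <= x <= 12]
r_restricted = pearson_r([x for x, _ in restricted], [y for _, y in restricted])

print(round(r_full, 2), round(r_restricted, 2))
```

Across the full range the relationship is strong and positive; within the narrow slice it nearly vanishes, even though the underlying data are the same.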
73
Q

Important Considerations - Curvilinear Relationship

A

Pearson Correlation only designed to detect linear relationships - if relationship is not linear but curvilinear, the correlation coefficient will fail to detect this relationship

Another type of statistic must be used to determine the strength of the relationship

74
Q

Correlation Coefficients as Effect-Sizes

A

Correlation coefficients not only allow us to examine relationships between continuous variables, they are also indicators of effect size

75
Q

Correlation Coefficients as Effect-Sizes - Square Value of R

A

Multiplying r by itself gives a value with a simple interpretation: THE PROPORTION OF VARIANCE BEING EXPLAINED - AKA Squared Correlation Coefficient

76
Q

Regression

A

Regression: advanced way of examining how variables relate or covary (a statistical technique); analyzes relationships among variables

77
Q

The Regression Equation

A

Y = a + bX
(Y = criterion variable: score we wish to predict, X = predictor variable: known score, a = y-intercept, b = slope of line)
The same as an equation for drawing a straight line - the line that best summarizes all of the data points

Can be used to make specific predictions
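
A sketch of fitting Y = a + bX by least squares and using it for a prediction. The variables (hours studied, exam score) and their values are hypothetical:

```python
def fit_line(xs, ys):
    """Least-squares estimates of the intercept a and slope b in Y = a + bX."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    return a, b

hours = [1, 2, 3, 4, 5]          # predictor variable X (hypothetical)
scores = [52, 55, 61, 64, 68]    # criterion variable Y (hypothetical)

a, b = fit_line(hours, scores)
predicted = a + b * 6            # predicted score for someone with X = 6
print(round(a, 1), round(b, 1), round(predicted, 1))
```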

78
Q

Multiple Correlation

A

(symbolized as R, distinguished from Pearson r)

Provides correlation between a combined set of predictor variables and a single criterion variable (as any phenomenon is likely determined by many factors, accounting for these permits a greater accuracy of prediction)

79
Q

Squared Multiple Correlation Coefficient

A

R squared can be interpreted in the same way as the Squared Correlation Coefficient (r squared)

R squared tells you the proportion of variability in the criterion variable that is accounted for by the combined set of predictor variables

80
Q

Regression is more powerful than correlation because…

A

it can be expanded to accommodate more than one predictor to predict the criterion variable - this expanded model is AKA multiple regression, allowing us to examine the unique relationship between each predictor and the criterion

In contrast to multiple correlation, which only provides a single value for the relationship between the combined set of predictors and the criterion variable

81
Q

Order of Interpreting Data Analysis:

A
  1. Correlation Coefficient
  2. Multiple Correlation
  3. Multiple Regression
82
Q

What technique helps address the third variable problem?

A

Partial Correlation: provides a way of statistically controlling for possible third variables in correlational designs

Estimates what the correlation between the two primary variables would be if the third variable were held constant - in other words, if everyone responded to this third variable in the exact same way
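
A sketch of the standard first-order partial correlation formula in Python, where X and Y are the two primary variables and Z is the third variable. The three input correlations are made-up values:

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between X and Y with the third variable Z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

original = 0.50                            # hypothetical correlation between X and Y
controlled = partial_r(0.50, 0.60, 0.70)   # hypothetical correlations with Z
print(original, round(controlled, 2))
```

Here the correlation drops from .50 to about .14 once Z is controlled, which would suggest the third variable accounted for much of the original relationship.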

83
Q

What can you do with a calculated partial correlation?

A

With a calculated partial correlation, you can compare with the original correlation to see if the third variable was influencing the original relationship

84
Q

Advanced Modelling Techniques:

A
  • Structural Equation Modelling (SEM): examines models (an expected pattern of relationships among numerous different variables) that specify a set of relationships among many variables