psych 218 - M1 Flashcards

1
Q

methods of knowing

A
  • authority: something is true due to tradition / someone says so
  • rationalism: uses reasoning
    • if the premises are sound and the reasoning is logically valid, the conclusions will be true
    • can be inadequate: phenomenon may have multiple causes
  • intuition: sudden insight that springs into consciousness all at once
    • often after reasoning has failed, mysterious process
  • scientific method: relies on objective assessment regardless of scientist’s beliefs
    • form hypothesis from reasoning or intuition > design experiment > analyze statistically > hypothesis is rejected or supported
2
Q

Variable [def]

A

Property that can have different values

3
Q

Independent Variable [def]

A

The variable systematically manipulated by the researcher

Can be “predictor variable”: presumed cause of another variable

4
Q

Dependent Variable [def]

A

Variable that is measured by the researcher to determine effect of the IV

Can be “criterion variable”

5
Q

Data [def]

A

Measurements that are made on the subjects of an experiment

6
Q

Sample [def]

A

Subset of a population

Described by sample statistics

7
Q

Population [def]

A

Complete set of individuals, objects or scores that the investigator is interested in studying

Described by parameters

8
Q

Types of Research

A
  • observational studies: no variable is actively manipulated by the investigator > cannot determine causality
    • naturalistic observation: obtain accurate description of situation being studied
    • parameter estimation: conducted on samples to estimate level of population characteristics
    • correlational studies: see whether 2+ variables are related
  • true experiments: determine if change in 1 variable causes change in another variable > determine causality
9
Q

Types of Statistics

A
  • Descriptive statistics: seek to understand patterns in the sample
  • Inferential statistics: seek to infer whether patterns generalize to the population
    • make predictions of population parameters through sample
    • help quantify confidence
10
Q

Measurement Scales

A
  • nominal scale: values are arbitrary
    • can only count things that are alike
  • ordinal scale: values are ranked, but intervals between values are not equal
    • can compare things and put them in order
    • know rank, do not know magnitude
  • interval scale: ranked with equal intervals, but no absolute 0 point
    • can add and subtract
  • ratio scale: ranked with equal intervals and an absolute zero
    • can calculate ratios
11
Q

Discrete v Continuous Variables

A
  • discrete: no possible values between adjacent units on the scale
    (e.g. number of dogs)
  • continuous: infinite possible values between adjacent units
    (e.g. weight of apple)
    • real limits: values above and below the recorded value that bound what the true value could be
    • found by adding and subtracting 1/2 of the smallest measuring unit
    • e.g. 34.45 kg: smallest unit is 0.01 kg > 0.01 / 2 = 0.005 kg > real limits are 34.445 kg and 34.455 kg
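A minimal sketch of the real-limit arithmetic, assuming the recorded value and the smallest measuring unit are passed as strings so that `Decimal` keeps the digits exact. The helper name `real_limits` is made up for illustration.

```python
from decimal import Decimal

def real_limits(recorded, unit):
    """Return (lower, upper) real limits: recorded value -/+ half the smallest unit."""
    value = Decimal(recorded)
    half_unit = Decimal(unit) / 2
    return value - half_unit, value + half_unit

lower, upper = real_limits("34.45", "0.01")
print(lower, upper)  # 34.445 34.455
```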
12
Q

Significant Figures

A
  • Descriptive statistics: 2-3 decimals
  • Correlation, Regression, p-values: always 3 decimals
13
Q

Frequency Distribution (f) [def]

A

present score values and their frequency of occurrence
- usually lowest score value at the bottom

14
Q

Grouped Frequency Distribution

A
  • create clusters within frequency distributions
  • must choose intrinsically meaningful intervals (represent simply and accurately)
15
Q

Relative f distribution [def]

A

Proportion of cases that fall into a class interval
- f / N

16
Q

Cumulative distribution

A
  • cum. f distribution: number of scores at or below the upper limit of each class interval
  • cum. % distribution: percentage of scores at or below the upper limit of each class interval
    • makes it easy to find median
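A rough sketch of how frequency, relative frequency (f / N), cumulative f and cumulative % fit together. The scores are made up and each score value is treated as its own class, with no grouping. Rows are listed lowest score first for simplicity.

```python
from collections import Counter

def frequency_table(scores):
    """Rows of (score, f, relative f, cumulative f, cumulative %), lowest score first."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum_f = 0
    for score in sorted(counts):
        f = counts[score]
        cum_f += f  # running total of scores at or below this value
        rows.append((score, f, f / n, cum_f, 100 * cum_f / n))
    return rows

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # made-up data
for row in frequency_table(scores):
    print(row)
```

The median sits in the first row whose cumulative % reaches 50, which here is the score 3.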
17
Q

Drawing Graphs

A
  • vertical axis: ordinate, Y-axis
    • shows the DV; in frequency graphs, shows the frequency of score values
  • horizontal axis: abscissa, X-axis
    • shows the IV; in frequency graphs, shows the score values
  • must have a title and a label for each axis
  • axes should start at 0
18
Q

Types of Graphs

A
  • Bar Graphs: no numerical relationship between categories
    • represents nominal data
    • have gap between bars to show discontinuity
  • Histogram
    • represents interval or ratio data
    • shows continuity of the variable > bars must touch each other
  • Frequency Polygons
    • represents interval or ratio data
    • similar to histogram, but uses plotted points at midpoints
    • minimizes importance of class > gets expected shape of distribution
19
Q

Distribution Shapes

A
  • Normal distribution: ideal
    • symmetrical
    • unimodal
  • Positive skew: data is more clustered on the lower end
    • mode < median < mean
  • Negative skew: data is more clustered on the upper end
    • mean < median < mode
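A quick check of the two orderings on made-up data, using Python's `statistics` module. The second data set is just the mirror image of the first.

```python
from statistics import mean, median, mode

positive_skew = [1, 1, 1, 2, 2, 3, 4, 10]          # clustered low, long tail to the right
negative_skew = [11 - x for x in positive_skew]    # mirror image: clustered high

for label, data in [("positive", positive_skew), ("negative", negative_skew)]:
    print(label, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```

The mean is pulled furthest toward the tail, the mode stays at the cluster, and the median falls between them.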
20
Q

Models of Central Tendency

A
  • Mean
  • Median
  • Mode
21
Q

Mode (Mo)

A
  • most frequent observation
  • only measure of central tendency that can model nominal data
  • best used when you have to be exact (close is not good enough)
    • e.g. betting in sports
  • bimodal: data has 2 modes
  • all observed values are the mode: every value in the data is unique
22
Q

Median (Mdn, P50)

A
  • splits the distribution evenly at the 50th percentile
  • can be for ordinal data
  • properties:
    • less sensitive to extreme scores than mean
    • more subject to sampling variability than the mean, but less than the mode
23
Q

Mean (X-bar, M, μ)

A
  • average of data set
    • acts as a fulcrum: balances all values in the data set
  • properties:
    • sensitive to exact value of all scores
    • sensitive to extreme scores
    • sum of deviations always = 0
    • sum of squared deviations is a minimum
    • least subject to sampling variation
    • very sensitive to outliers
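Two of these properties can be checked numerically. This is a small sketch on made-up scores: deviations from the mean sum to 0, and the sum of squared deviations is smaller around the mean than around any other value (the median is used as the comparison point here).

```python
scores = [2, 4, 6, 8, 30]  # made-up data with one extreme score
m = sum(scores) / len(scores)  # mean = 10.0

def sum_sq(center):
    """Sum of squared deviations of the scores around `center`."""
    return sum((x - center) ** 2 for x in scores)

deviations = [x - m for x in scores]
print(sum(deviations))        # 0.0: the mean balances the deviations
print(sum_sq(m), sum_sq(6))   # 520.0 around the mean, 600 around the median
```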
24
Q

Outliers

A
  • highly atypical observations
  • mean is very sensitive to outliers
  • solutions:
    • use median
    • compare mean with and without outlier
    • determine source of the outlier
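A small sketch of the "compare with and without" check, on made-up scores where 30 is treated as the outlier.

```python
from statistics import mean, median

scores = [2.0, 4.0, 6.0, 8.0, 30.0]        # made-up data; 30 is the atypical score
trimmed = [x for x in scores if x != 30.0]  # same data with the outlier dropped

print("with outlier:   ", mean(scores), median(scores))
print("without outlier:", mean(trimmed), median(trimmed))
```

Dropping the one score halves the mean (10 to 5) while the median barely moves (6 to 5).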
25
Q

Robustness of Central Tendencies

A
  • robustness increases as sample size increases: models become more accurate
  • to outliers: mode > median > mean
  • to sampling variability: mean > median > mode
    • mean has greatest consistency
    • mean is preferred to maximize consistency
26
Q

Models of Variability

A
  • Range
  • Standard Deviation
  • Variance
27
Q

Range

A
  • absolute difference between the most extreme scores
  • unlike the median
    • determined entirely by the 2 most extreme scores > strongly influenced by skews and outliers
    • affected by sampling variability
28
Q

Standard Deviation (s, sd, σ)

A
  • average raw deviance: average difference between any score and the mean
  • just like the mean
    • more robust with sampling variability
    • influenced by skew and outliers
  • properties:
    • gives measure of dispersion relative to the mean
    • sensitive to each score in the distribution
29
Q

Standard Deviation Steps

A
  • [1] calculate sample mean
  • [2] subtract the mean from each score for deviation scores
    • deviation score: amount an observation deviates from the sample mean
  • [3] square deviation scores
    • if not squared, the deviations sum to 0 because the mean is the fulcrum
  • [4] sum the squared deviation scores
    • summed deviation: will be 0 because the mean is a fulcrum
    • sum of squared deviations (SS): each squared deviation is positive or 0 > can sum
  • [5] divide by N or N-1 to get the variance
    • N-1 inflates the variability a bit when estimating for the population
  • [6] square root to get standard deviation
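The steps above can be sketched as follows. The data are made up, and the `sample` flag is an illustrative name for choosing between the N-1 and N divisors.

```python
import math
from statistics import pstdev, stdev

def standard_deviation(scores, sample=True):
    n = len(scores)
    m = sum(scores) / n                         # mean
    deviations = [x - m for x in scores]        # deviation scores (sum to 0)
    ss = sum(d ** 2 for d in deviations)        # sum of squared deviations (SS)
    variance = ss / (n - 1 if sample else n)    # divide by N-1 (sample) or N
    return math.sqrt(variance)                  # square root gives the SD

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean = 5, SS = 32
print(standard_deviation(scores, sample=False))  # 2.0
print(standard_deviation(scores))                # slightly larger, since SS is divided by N-1
```

The results agree with `statistics.pstdev` (N divisor) and `statistics.stdev` (N-1 divisor).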
30
Q

Variance (s^2, σ^2)

A
  • average squared deviance from the mean
  • just like the mean:
    • more robust with sampling variability
    • influenced by skews and outliers
31
Q

Normal Curve

A
  • ideal distribution to work with
  • perfectly symmetrical: not skewed, no outliers, mean = median = mode
  • unimodal: a single peak, located at the mean
  • perfectly variable: variation of scores is predictable
  • asymptotic tails: curve never reaches the X-axis, but gets closer and closer
  • important to know if we are working from a sample of a normally distributed population
    • our estimates will be off if not normally distributed
32
Q

z-scores (= standard scores)

A
  • transformed score to designate how many SD units the score is above/below the mean
    • how many σ's an observation is from μ
  • what it does:
    • shows how a value stacks up against the population distribution
    • transforms raw units to standard deviation > can compare different quantities with one another that were not directly comparable
  • z scores are a transformation
    • centering: subtract μ from each observation > sets μ = 0
    • standardization: divide each centered observation by the SD > sets SD = 1
33
Q

Characteristics of z-scores

A
  • z-scores have the same shape as set of raw scores
  • mean of the z-scores is set at μ = 0
  • standard deviation of the z-scores is always s = 1
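A minimal sketch of the transformation, z = (X - μ) / σ, on made-up scores treated as a whole population (so the N-divisor SD from `statistics.pstdev` is used).

```python
import math
from statistics import mean, pstdev

def z_scores(scores):
    mu = mean(scores)
    sigma = pstdev(scores)  # population SD (N divisor); an assumption for this sketch
    return [(x - mu) / sigma for x in scores]  # center, then divide by the SD

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean = 5, SD = 2
z = z_scores(scores)
print(z)                      # a score of 9 sits 2 SDs above the mean
print(mean(z), pstdev(z))     # mean 0 and SD 1 (up to rounding)
```

The order of the scores and the gaps between them are preserved, which is why the shape of the distribution does not change.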
34
Q

Relationship

A
  • pattern between 2 variables
    • best visualized through scatterplots
    • quantifies the relationship’s form, magnitude and direction
35
Q

Linear Relationship

A
  • a linear relationship between 2 variables is one in which the relation can be most accurately represented by a straight line
    • can be perfect or imperfect relationship
  • correlation coefficient: expresses quantitatively the magnitude and direction of the relationship
    • magnitude: larger absolute value = stronger relationship
    • direction: positive = direct, negative = inverse
36
Q

Equation of the straight line

A

Y = bX + a
- a = Y-intercept (value of Y when X = 0)
- b = slope of the line

37
Q

Regression v Correlation

A
  • correlation: concerned with magnitude and direction of relationship
    • uses z-score units
  • regression: focused on using the relationship for prediction
    • uses raw units
38
Q

Pearson’s r

A
  • measure of the extent to which paired scores occupy the same or opposite positions within their own distribution
  • stronger relationship > more accurate prediction, because more of the variability of Y is accounted for by X
39
Q

Pearson’s r Formulas

A
  • Conceptual Equation
    • input z scores (not raw data)
    • the more positive the products of paired z scores, the stronger the positive correlation
  • Computational Equation
    • input raw scores (more complicated, but needs fewer components)
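The card does not spell out the two equations, so this sketch assumes the usual textbook forms: r = Σ(z_X · z_Y) / (N - 1) for the conceptual version (with z scores built from the N-1 standard deviation), and the raw-score sums version for the computational one. The data are made up.

```python
import math

def pearson_r_conceptual(xs, ys):
    """Mean product of paired z scores (z scores use the N-1 standard deviation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

def pearson_r_computational(xs, ys):
    """Same r from raw-score sums only: no means, SDs or z scores needed."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

xs = [1, 2, 3, 4, 5]  # made-up paired scores
ys = [2, 4, 5, 4, 5]
print(pearson_r_conceptual(xs, ys), pearson_r_computational(xs, ys))
```

Both functions give the same r (about 0.77 here), and squaring it gives r^2 = 0.6.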
40
Q

Assumptions of the Pearson’s r

A
  • Linear relationship
  • Interval or Ratio level data
  • Absence of extreme outliers
41
Q

Coefficients of Determination (r^2)

A
  • proportion of the total variability of Y accounted for by X
  • shows variability / explained variance
  • 0 ≤ r^2 ≤ 1
  • r^2 is never larger than |r| (equal only when r = 0 or r = ±1)
42
Q

Choosing other correlation coefficients

A
  • Spearman’s rho: use when one or both variables are ordinal
  • Point biserial: 1 variable is interval/ratio, 1 variable is dichotomous
  • Pearson’s Phi: use when both variables are dichotomous
  • Eta correlation ratio: used for non-linear relationships
  • Partial correlations: relationship between 2 variables after the effect of the 3rd variable has been removed
43
Q

Effect of changing range on correlation

A
  • restricting the range will, in most cases, lower the correlation
  • line you fit will be biased in some way
    • extrapolating from a very small range
    • not seeing the big picture
44
Q

Effect of extreme scores on correlation

A
  • can drastically alter the magnitude of the correlation coefficient
  • should check scatter plot before computing > if present, must use caution when interpreting the relationship
  • will have a larger effect for a smaller sample
45
Q

Correlation does not imply causation

A
  • correlation between X and Y may be spurious
  • X may be the cause of Y
  • Y may be the cause of X
  • third variable is the cause of the correlation
46
Q

Criterion Variable

A

what you are trying to make predictions about
(e.g. attitude towards the movie)

47
Q

Predictor Variable

A

What is used to predict the criterion
(e.g. gender)

48
Q

Regression Line

A
  • line of best fit is the least-squares regression line
    • prediction line that minimizes the total squared error of prediction
    • total squared error is less for the least-squares regression line than any other possible prediction line
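A sketch of the least-squares line Y' = bX + a, assuming the standard formulas b = Σ(X - mean X)(Y - mean Y) / Σ(X - mean X)^2 and a = mean Y - b · mean X. The data and the comparison line are made up.

```python
def least_squares_line(xs, ys):
    """Return (b, a): slope and Y-intercept of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return b, a

def squared_error(xs, ys, b, a):
    """Total squared error of prediction for the line Y' = bX + a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # made-up paired scores
ys = [2, 4, 5, 4, 5]
b, a = least_squares_line(xs, ys)
print(b, a)                               # roughly 0.6 and 2.2
print(squared_error(xs, ys, b, a))        # roughly 2.4
print(squared_error(xs, ys, 0.7, 2.0))    # a nearby line does worse (roughly 2.55)
```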
49
Q

Requirements for the linear regression

A
  • sample is appropriate for a linear model
  • prediction is within the range of original variables
    • we do not know if relationship continues to be linear at more extreme values beyond our range
50
Q

Similarities between Mean and Linear Regression

A
  • smooths over all imperfections
  • acts as the fulcrum to balance the data > best prediction as mean / line of best fit
  • shows how much error is in our model
    • M: with deviation scores through the standard deviation
    • R: with prediction scores through the standard error of the estimate
  • distribution of errors
    • about 68% of scores fall within ±1 SD of the mean, or within ±1 Sy|x of the regression line, when errors are normally distributed
51
Q

Standard Error of the Estimate

A
  • the degree to which our Y values differ from what we predicted
    • with each predictor added, we subtract one more from N in the denominator (N - 2 for 1 predictor)
  • standard deviation of Y given X
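A sketch for the single-predictor case, assuming the common textbook form Sy|x = sqrt(Σ(Y - Y')^2 / (N - 2)), where Y' is the value predicted by the least-squares line. The data are made up.

```python
import math

def standard_error_of_estimate(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Least-squares slope and intercept for Y' = bX + a
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    # Squared differences between actual Y and predicted Y'
    sse = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))  # N - 2: assumed divisor for one predictor

xs = [1, 2, 3, 4, 5]  # made-up paired scores
ys = [2, 4, 5, 4, 5]
print(standard_error_of_estimate(xs, ys))  # about 0.89
```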
52
Q

Homoscedasticity

A
  • assumes the scatter is equally away from the predicted line (balanced)
  • standard error of the estimate is only meaningful if the variability of Y is constant over values of X
53
Q

Multiple Regression

A
  • contains 2+ predictors, but still only 1 criterion
  • adding a new predictor will always quantitatively improve prediction score
    • standard error of the estimate decreases
    • multiple coefficient of determination increases
  • be mindful that some relationships may be crud
    • do not just randomly add new predictor variables
54
Q

Multiple Coefficient of Determination (R^2)

A
  • helps avoid double counting
    • cannot merely just add up the different r^2 values
  • for fully redundant predictors: R^2 will not increase, but will equal the r^2 of the better predictor
  • for fully orthogonal predictors: can add r^2 together because it will not be double counting
55
Q

Meehl’s 6th law of psychology

A
  • crud factor: “everything correlates with everything else”
  • adding predictors always increases R^2, even if they are crud predictors
  • solution: use adjusted R^2 (R^2 adj), which adds a penalty for each predictor added
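A sketch of the penalty, assuming the widely used adjustment 1 - (1 - R^2)(N - 1) / (N - k - 1) with N observations and k predictors. The course may write it in different notation, and the numbers below are made up.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same R^2 = 0.50 from N = 20 cases, reached with 2 predictors versus 5
print(adjusted_r_squared(0.50, n=20, k=2))  # about 0.44
print(adjusted_r_squared(0.50, n=20, k=5))  # about 0.32
```

For the same raw R^2, the model that needed more predictors is penalized more heavily.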
56
Q

Categorical predictors in regression

A
  • dummy coding: assign variables as either “1” or “0”
    • result: only adds criterion value when “1” occurs
  • contrast coding: assign variables as “1” or “-1”
    • result: criterion value changes when it's “1” v “-1”
  • results in parallel slopes that scoot up and down
    (e.g. for all values of sleep, Dan will be a bit more grumpy when it is raining outside)
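A toy illustration of the parallel lines using dummy coding (raining = 1, not raining = 0). The intercept and both coefficients are invented for the example, not estimated from data.

```python
def predict_grumpiness(sleep_hours, raining, a=10.0, b_sleep=-1.0, b_rain=2.0):
    """Y' = a + b_sleep * sleep + b_rain * raining, with hypothetical coefficients."""
    return a + b_sleep * sleep_hours + b_rain * raining

for sleep in (4, 6, 8):
    dry = predict_grumpiness(sleep, raining=0)
    wet = predict_grumpiness(sleep, raining=1)
    print(sleep, dry, wet, wet - dry)  # the gap is b_rain at every amount of sleep
```

The slope for sleep is the same on both lines, and rain only shifts the line up by `b_rain`. With contrast coding (1 v -1) the gap between the lines would be 2 × `b_rain` instead.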