psych 218 - M1 Flashcards

1
Q

methods of knowing

A
  • authority: something is true due to tradition / someone says so
  • rationalism: uses reasoning
    • if premises are sound and carried out with logic, conclusions will be true
    • can be inadequate: phenomenon may have multiple causes
  • intuition: sudden insight that springs into consciousness all at once
    • often after reasoning has failed, mysterious process
  • scientific method: relies on objective assessment regardless of scientist’s beliefs
    • form hypothesis from reasoning or intuition > design experiment > analyze statistically > hypothesis is rejected or supported
2
Q

Variable [def]

A

Property that can have different values

3
Q

Independent Variable [def]

A

The variable systematically manipulated by the researcher

Can be “predictor variable”: presumed cause of another variable

4
Q

Dependent Variable [def]

A

Variable that is measured by the researcher to determine the effect of the IV

Can be “criterion variable”

5
Q

Data [def]

A

Measurements that are made on the subjects of an experiment

6
Q

Sample [def]

A

Subset of a population

Described by sample statistics

7
Q

Population [def]

A

Complete set of individuals, objects or scores that the investigator is interested in studying

Described by parameters

8
Q

Types of Research

A
  • observational studies: no variable is actively manipulated by the investigator > cannot determine causality
    • naturalistic observation: obtain accurate description of situation being studied
    • parameter estimation: conducted on samples to estimate level of population characteristics
    • correlational studies: see whether 2+ variables are related
  • true experiments: determine if change in 1 variable causes change in another variable > determine causality
9
Q

Types of Statistics

A
  • Descriptive statistics: seek to understand patterns in the sample
  • Inferential statistics: seek to infer whether patterns generalize to the population
    • make predictions of population parameters through sample
    • help quantify confidence
10
Q

Measurement Scales

A
  • nominal scale: values are arbitrary
    • can only count things that are alike
  • ordinal scale: values are ranked, but the intervals between values are not equal
    • can compare things and put them in order
    • know rank, do not know magnitude
  • interval scale: ranked with equal intervals, but no absolute 0 point
    • can add and subtract
  • ratio scale: ranked with equal intervals and an absolute zero
    • can calculate ratios
11
Q

Discrete v Continuous Variables

A
  • discrete: no possible values between adjacent units on the scale
    (e.g. number of dogs)
  • continuous: infinite possible values between adjacent units
    (e.g. weight of an apple)
    • real limits: the values lying half of the smallest measuring unit above and below the recorded value
    • e.g. 34.45 kg: smallest unit is 0.01 kg > half-unit is 0.01 / 2 = 0.005 kg > real limits are 34.445 kg and 34.455 kg
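A minimal Python sketch of the real-limit arithmetic above; the function name and the 34.45 kg / 0.01 kg values are just illustrations, not course data:

    # Real limits of a continuous measurement: the recorded value plus/minus
    # half of the smallest measuring unit (illustrative values only).
    def real_limits(recorded_value, smallest_unit):
        half_unit = smallest_unit / 2      # e.g. 0.01 / 2 = 0.005
        return recorded_value - half_unit, recorded_value + half_unit

    lower, upper = real_limits(34.45, 0.01)
    print(lower, upper)                    # ~34.445 and ~34.455 (floating-point rounding aside)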
12
Q

Significant Figures

A
  • Descriptive statistics: 2-3 decimals
  • Correlation, Regression, p-values: always 3 decimals
13
Q

Frequency Distribution (f) [def]

A

present score values and their frequency of occurrence
- usually lowest score value at the bottom

14
Q

Grouped Frequency Distribution

A
  • create clusters within frequency distributions
  • must choose intrinsically meaningful intervals (represent simply and accurately)
15
Q

Relative f distribution [def]

A

Proportion of cases that fall into a class interval
- f / N

16
Q

Cumulative distribution

A
  • cum. f distribution: the number of scores at and below the upper limit of each class interval
  • cum. % distribution: the percentage of scores at and below the upper limit of each class interval
    • makes it easy to find the median
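A minimal Python sketch tying cards 13-16 together: frequency, relative frequency (f / N), cumulative frequency, and cumulative percentage for a small made-up set of scores (the scores are invented for illustration):

    from collections import Counter

    scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]   # made-up data
    N = len(scores)

    freq = Counter(scores)                    # f: frequency of each score value
    cum_f = 0
    for value in sorted(freq):                # lowest score value first
        f = freq[value]
        rel_f = f / N                         # relative frequency = f / N
        cum_f += f                            # scores at and below this value
        cum_pct = 100 * cum_f / N             # cumulative percentage
        print(value, f, rel_f, cum_f, cum_pct)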
17
Q

Drawing Graphs

A
  • vertical axis: ordinate, Y-axis
    • shows the DV; in frequency graphs, shows the frequency of the score values
  • horizontal axis: abscissa, X-axis
    • shows the IV; in frequency graphs, shows the score values
  • must have a title and a label for each axis
  • axes should start at 0
18
Q

Types of Graphs

A
  • Bar Graphs: no numerical relationship between categories
    • represents nominal data
    • have gap between bars to show discontinuity
  • Histogram
    • represents ordinal data
    • shows continuity of the variable > bars must touch each other
  • Frequency Polygons
    • represents interval or ratio data
    • similar to a histogram, but uses points plotted at the class midpoints
    • de-emphasizes the class intervals > approximates the expected shape of the distribution
19
Q

Distribution Shapes

A
  • Normal distribution: ideal
    • symmetrical
    • unimodal
  • Positive skew: data is more clustered on the lower end
    • mode < median < mean
  • Negative skew: data is more clustered on the upper end
    • mean < median < mode
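A quick check of the skew orderings above, using a made-up positively skewed data set and Python's statistics module (values invented for illustration):

    from statistics import mean, median, mode

    # Made-up positively skewed scores: clustered low, with a long upper tail.
    scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

    print(mode(scores))    # 2   (most frequent value)
    print(median(scores))  # 3.0 (middle of the distribution)
    print(mean(scores))    # 4.5 (pulled toward the upper tail)
    # mode < median < mean, as expected for a positive skew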
20
Q

Models of Central Tendency

A
  • Mean
  • Median
  • Mode
21
Q

Mode (Mo)

A
  • most frequent observation
  • the only measure of central tendency that can model nominal data
  • best used when you have to be exact (close is not good enough)
    • e.g. betting in sports
  • bimodal: data has 2 modes
  • if the data are all unique, every observed value is a mode (the mode is uninformative)
22
Q

Median (Mdn, P50)

A
  • splits the distribution at the 50th percentile: half of the scores fall below it, half above
  • can be used with ordinal (or higher) data
  • properties:
    • less sensitive to extreme scores than the mean
    • more subject to sampling variability than the mean, but less than the mode
23
Q

Mean (X-bar, M, μ)

A
  • average of data set
    • acts as a fulcrum: balances all values in the data set
  • properties:
    • sensitive to exact value of all scores
    • sensitive to extreme scores
    • sum of deviations always = 0
    • sum of squared deviations is a minimum
    • least subject to sampling variation
    • very sensitive to outliers
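A quick Python check, on made-up numbers, of two of the properties above: deviations from the mean sum to 0, and the sum of squared deviations is smaller around the mean than around any other value:

    scores = [3, 5, 6, 8, 10, 13]                  # made-up data
    m = sum(scores) / len(scores)                  # the mean (7.5 here)

    # Sum of deviations from the mean is 0.
    print(sum(x - m for x in scores))              # 0.0

    # Sum of squared deviations is a minimum at the mean.
    def ss(center):
        return sum((x - center) ** 2 for x in scores)

    print(ss(m) <= ss(m + 1), ss(m) <= ss(m - 2))  # True True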
24
Q

Outliers

A
  • highly atypical observations
  • mean is very sensitive to outliers
  • solutions:
    • use median
    • compare mean with and without outlier
    • determine source of the outlier
25
Q

Robustness of Central Tendencies

A
  • robustness increases as sample size increases: the models become more accurate
  • robustness to outliers: mode > median > mean
  • robustness to sampling variability: mean > median > mode
    • the mean has the greatest consistency > preferred when maximizing consistency
26
Q

Models of Variability

A
  • Range
  • Standard Deviation
  • Variance
27
Q

Range

A
  • absolute difference between the two most extreme scores
  • based only on those two scores > strongly influenced by skews and outliers
  • heavily affected by sampling variability
28
Q

Standard Deviation (s, sd, σ)

A
  • average raw deviance: roughly the average difference between a score and the mean
  • just like the mean:
    • more robust to sampling variability
    • influenced by skew and outliers
  • properties:
    • gives a measure of dispersion relative to the mean
    • sensitive to each score in the distribution
29
Q

Standard Deviation Steps

A
  • [1] calculate the sample mean
  • [2] subtract the mean from each score to get deviation scores
    • deviation score: the amount an observation deviates from the sample mean
    • the unsquared deviations sum to 0 because the mean is the fulcrum
  • [3] square the deviation scores
  • [4] sum the squared deviations
    • sum of squared deviations (SS): squaring makes every deviation positive, so the sum is informative
  • [5] divide by N, or by N - 1 when estimating the population value (inflates the variability a bit)
  • [6] take the square root to get the standard deviation
  • (worked sketch below)
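A step-by-step sketch of the procedure in this card on made-up numbers, showing the N vs N - 1 choice:

    import math

    scores = [4, 6, 7, 9, 14]                  # made-up data
    N = len(scores)

    mean = sum(scores) / N                     # [1] sample mean (8.0 here)
    deviations = [x - mean for x in scores]    # [2] deviation scores (sum to 0)
    squared = [d ** 2 for d in deviations]     # [3] squared deviations
    SS = sum(squared)                          # [4] sum of squared deviations

    variance_n = SS / N                        # [5] divide by N (describing this data set) ...
    variance_n1 = SS / (N - 1)                 # ... or by N - 1 (estimating the population)

    sd_n = math.sqrt(variance_n)               # [6] square root = standard deviation
    sd_n1 = math.sqrt(variance_n1)
    print(sd_n, sd_n1)                         # the N - 1 version is slightly larger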
30
Q

Variance (s^2, σ^2)

A
  • average squared deviation from the mean
  • just like the mean:
    • more robust to sampling variability
    • influenced by skews and outliers
31
Q

Normal Curve

A
  • ideal distribution to work with
  • perfectly symmetrical: not skewed, no outliers, mean = median = mode
  • unimodal: mean = mode
  • perfectly variable: the variation of scores is predictable
  • asymptotic tails: the curve never reaches the X-axis, but gets closer and closer
  • important to know whether our sample comes from a normally distributed population
    • our estimates will be off if it is not normally distributed
32
Q

z-scores (= standard scores)

A
  • transformed score that designates how many SD units the score is above/below the mean
    • how many σ's an observation is from μ
  • what it does:
    • shows how a value stacks up against the population distribution
    • transforms raw units into standard-deviation units > quantities that were not directly comparable can be compared
  • z-scores are a transformation:
    • centering: subtract μ from each observation > sets the mean to 0
    • standardization: divide each centered observation by the SD > sets the SD to 1
33
Q

Characteristics of z-scores

A
  • the z-scores have the same shape of distribution as the set of raw scores
  • the mean of the z-scores is always 0
  • the standard deviation of the z-scores is always 1
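A minimal sketch of the z-transformation (centering, then dividing by the SD) on made-up scores, verifying the two characteristics above; the population SD is used here purely for illustration:

    from statistics import mean, pstdev

    scores = [10, 12, 15, 18, 20]              # made-up data
    mu = mean(scores)
    sigma = pstdev(scores)                     # population SD, for illustration

    z = [(x - mu) / sigma for x in scores]     # center, then divide by the SD
    print(round(mean(z), 10) == 0, round(pstdev(z), 10) == 1)   # True True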
34
Q

Relationship

A
  • pattern between 2 variables
  • best visualized through scatterplots
  • characterized by its form, magnitude, and direction
35
Q

Linear Relationship

A
  • a relationship between 2 variables that can be most accurately represented by a straight line
  • can be a perfect or an imperfect relationship
  • correlation coefficient: expresses quantitatively the magnitude and direction of the relationship
    • magnitude: larger absolute value = stronger relationship
    • direction: positive = direct, negative = inverse
36
Q

Equation of the straight line

A
  • Y = bX + a
    • a = Y-intercept: the value of Y when X = 0
    • b = slope of the line
37
Q

Regression v Correlation

A
  • correlation: concerned with the magnitude and direction of the relationship
    • uses z-score units
  • regression: focused on using the relationship for prediction
    • uses raw units
38
Q

Pearson's r

A
  • measure of the extent to which paired scores occupy the same or opposite positions within their own distributions
  • stronger relationship > more accurate prediction > more of the variability of Y is accounted for by X
39
Q

Pearson's r Formulas

A
  • Conceptual equation: input z-scores (not raw data)
    • the more the paired z-scores share the same sign, the stronger the positive correlation
  • Computational equation: input raw scores (more complicated-looking, but needs fewer components to be computed first)
  • (both forms are sketched below)
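The card names the two equations without reproducing them, so here is a hedged sketch of both on made-up pairs. It assumes the common conventions: the conceptual form averages the paired z-score products (divided by N - 1 because the z-scores use the sample SD; the course's divisor may differ), and the computational form uses raw-score sums:

    import math
    from statistics import mean, stdev

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    # Conceptual form: average product of paired z-scores.
    zx = [(x - mean(X)) / stdev(X) for x in X]
    zy = [(y - mean(Y)) / stdev(Y) for y in Y]
    r_conceptual = sum(a * b for a, b in zip(zx, zy)) / (N - 1)

    # Computational form: raw-score sums only.
    sum_xy = sum(x * y for x, y in zip(X, Y))
    num = N * sum_xy - sum(X) * sum(Y)
    den = math.sqrt((N * sum(x * x for x in X) - sum(X) ** 2) *
                    (N * sum(y * y for y in Y) - sum(Y) ** 2))
    r_computational = num / den

    print(r_conceptual, r_computational)       # the two forms agree (up to rounding)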
40
Q

Assumptions of Pearson's r

A
  • linear relationship
  • interval or ratio level data
  • absence of extreme outliers
41
Q

Coefficient of Determination (r^2)

A
  • proportion of the total variability of Y accounted for by X
  • shows the explained variance
  • ranges from 0 to 1
  • r^2 is never larger than |r|
42
Q

Choosing other correlation coefficients

A
  • Spearman's rho: use when one or both variables are ordinal
  • Point biserial: 1 variable is interval/ratio, 1 variable is dichotomous
  • Pearson's Phi: use when both variables are dichotomous
  • Eta correlation ratio: used for non-linear relationships
  • Partial correlation: relationship between 2 variables after the effect of a 3rd variable has been removed
43
Q

Effect of changing the range on correlation

A
  • restricting the range will, in most cases, lower the correlation
  • the line you fit will be biased in some way
  • you are extrapolating from a very small range > not seeing the big picture
44
Q

Effect of extreme scores on correlation

A
  • extreme scores can drastically alter the magnitude of the correlation coefficient
  • check the scatterplot before computing > if outliers are present, use caution when interpreting the relationship
  • outliers have a larger effect in smaller samples
45
Q

Correlation does not imply causation

A
  • a correlation between X and Y may be spurious
  • X may be the cause of Y
  • Y may be the cause of X
  • a third variable may be the cause of the correlation
46
Q

Criterion Variable

A

What you are trying to make predictions about (e.g. attitude towards the movie)
47
Q

Predictor Variable

A

What is predicted to lead to the criterion (e.g. gender)
48
Q

Regression Line

A
  • the line of best fit is the least-squares regression line
  • it is the prediction line that minimizes the total error of prediction
  • the total (squared) error of prediction is smaller for the least-squares regression line than for any other possible prediction line
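A sketch of the least-squares line on made-up pairs, using the common raw-score formulas b = SP / SSX and a = Y-bar - b * X-bar (the course's notation may differ), plus a check of the least-squares property:

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    mx, my = sum(X) / N, sum(Y) / N
    SP = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of cross-products
    SSX = sum((x - mx) ** 2 for x in X)

    b = SP / SSX                               # slope of the least-squares line
    a = my - b * mx                            # Y-intercept

    def total_squared_error(slope, intercept):
        return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(X, Y))

    # Least-squares property: total squared error is smaller for (b, a) than
    # for a nearby alternative line (the tweaks 0.3 and -0.2 are arbitrary).
    print(total_squared_error(b, a) < total_squared_error(b + 0.3, a - 0.2))  # True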
49
Q

Requirements for linear regression

A
  • the sample is appropriate for a linear model
  • the prediction is made within the range of the original variables
    • we do not know whether the relationship stays linear at more extreme values beyond our range
50
Q

Similarities between the Mean and Linear Regression

A
  • both smooth over all imperfections
  • both act as a fulcrum that balances the data > the best prediction is the mean / the line of best fit
  • both show how much error is in our model
    • mean: via deviation scores, through the standard deviation
    • regression: via prediction errors, through the standard error of the estimate
  • both describe a distribution of errors
    • roughly 68.2% of scores fall within ±1 SD of the mean / within ±1 Sy|x of the line
51
Q

Standard Error of the Estimate

A
  • the degree to which the actual Y values differ from the predicted Y values
  • the standard deviation of Y given X (Sy|x)
  • with each predictor added, we subtract one more from N in the denominator
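A sketch of the standard error of the estimate on the same made-up pairs used earlier: with one predictor, Sy|x = sqrt(SS_residual / (N - 2)), the extra subtraction reflecting the added predictor:

    import math

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    mx, my = sum(X) / N, sum(Y) / N
    b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
    a = my - b * mx

    residual_SS = sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y))
    s_yx = math.sqrt(residual_SS / (N - 2))    # one predictor -> divide by N - 2
    print(s_yx)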
52
Q

Homoscedasticity

A
  • assumes the scatter around the predicted line is equal (balanced) across values of X
  • the standard error of the estimate is only meaningful if the variability of Y is constant over values of X
53
Q

Multiple Regression

A
  • contains 2+ predictors, but still only 1 criterion
  • adding a new predictor will always numerically improve the prediction
    • standard error of the estimate decreases
    • multiple coefficient of determination increases
  • be mindful that some relationships may be crud > do not just randomly add new predictor variables
54
Q

Multiple Coefficient of Determination (R^2)

A
  • helps avoid double counting: you cannot merely add up the separate r^2 values
  • fully redundant predictors: R^2 will not increase, it will equal the r^2 of the better predictor
  • fully orthogonal predictors: the r^2 values can be added, because there is no double counting
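A sketch (numpy least squares on simulated data; the variable names and coefficients are invented) showing that a nearly redundant second predictor barely raises R^2, while an orthogonal predictor that actually relates to the criterion raises it clearly:

    import numpy as np

    rng = np.random.default_rng(0)             # simulated data for illustration
    n = 200
    x1 = rng.normal(size=n)
    x2_redundant = x1 + rng.normal(scale=0.01, size=n)   # nearly a copy of x1
    x2_orthogonal = rng.normal(size=n)                   # unrelated to x1
    y = 2 * x1 + 1.5 * x2_orthogonal + rng.normal(size=n)

    def r_squared(predictors):
        X = np.column_stack([np.ones(n)] + predictors)   # intercept + predictors
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta
        return 1 - resid.var() / y.var()

    print(r_squared([x1]))                     # baseline, one predictor
    print(r_squared([x1, x2_redundant]))       # barely changes: redundant predictor
    print(r_squared([x1, x2_orthogonal]))      # clearly higher: new information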
55
Q

Meehl's 6th law of psychology

A
  • crud factor: "everything correlates with everything else"
  • adding predictors always increases R^2, even if they are crud predictors
  • solution: use adjusted R^2, which adds a penalty for each predictor added
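A sketch of the usual adjusted-R^2 penalty, R^2_adj = 1 - (1 - R^2)(N - 1) / (N - k - 1) with k predictors; the exact formula used in the course may differ, and the numbers below are invented:

    def adjusted_r_squared(r2, n, k):
        # Penalize R^2 for the number of predictors k, given sample size n.
        return 1 - (1 - r2) * (n - 1) / (n - k - 1)

    # Same R^2 but more predictors -> lower adjusted R^2 (illustrative numbers).
    print(adjusted_r_squared(0.40, n=50, k=1))   # ~0.39
    print(adjusted_r_squared(0.40, n=50, k=5))   # ~0.33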
56
Q

Categorical predictors in regression

A
  • dummy coding: code the category as "1" or "0"
    • result: the predictor only adds to the criterion when "1" occurs
  • contrast coding: code the category as "1" or "-1"
    • result: the criterion shifts in opposite directions for "1" v "-1"
  • either way, the result is parallel slopes that scoot up and down (e.g. for all values of sleep, Dan will be a bit more grumpy when it is raining outside)
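A sketch of dummy coding with numpy least squares; the sleep/rain/grumpiness variables just mirror the card's example, and the data are invented:

    import numpy as np

    # Made-up data: grumpiness predicted from sleep (continuous) and rain
    # (dummy coded: 1 = raining, 0 = not raining).
    sleep = np.array([4, 5, 6, 7, 8, 9, 4, 5, 6, 7, 8, 9], dtype=float)
    rain = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1], dtype=float)
    grump = 10 - 0.8 * sleep + 2.0 * rain + np.random.default_rng(1).normal(0, 0.3, 12)

    X = np.column_stack([np.ones_like(sleep), sleep, rain])
    b0, b_sleep, b_rain = np.linalg.lstsq(X, grump, rcond=None)[0]

    # Parallel lines: the sleep slope is shared, and the rain dummy simply
    # shifts the intercept up by about b_rain whenever rain == 1.
    print(b0, b_sleep, b_rain)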