psych 218 - M1 Flashcards

1
Q

methods of knowing

A
  • authority: something is true due to tradition / someone says so
  • rationalism: uses reasoning
    • if premises are sound and carried out with logic, conclusions will be true
    • can be inadequate: phenomenon may have multiple causes
  • intuition: sudden insight that springs into consciousness all at once
    • often after reasoning has failed, mysterious process
  • scientific method: relies on objective assessment regardless of scientist’s beliefs
    • form hypothesis from reasoning or intuition > design experiment > analyze statistically > hypothesis is rejected or supported
2
Q

Variable [def]

A

Property that can have different values

3
Q

Independent Variable [def]

A

The variable systematically manipulated by the researcher

Can be “predictor variable”: presumed cause of another variable

4
Q

Dependent Variable [def]

A

Variable that is measured by the researcher to determine the effect of the IV

Can be “criterion variable”

5
Q

Data [def]

A

Measurements that are made on the subjects of an experiment

6
Q

Sample [def]

A

Subset of a population

Described by sample statistics

7
Q

Population [def]

A

Complete set of individuals, objects or scores that the investigator is interested in studying

Described by parameters

8
Q

Types of Research

A
  • observational studies: no variable is actively manipulated by the investigator > cannot determine causality
    • naturalistic observation: obtain accurate description of situation being studied
    • parameter estimation: conducted on samples to estimate level of population characteristics
    • correlational studies: see whether 2+ variables are related
  • true experiments: determine if change in 1 variable causes change in another variable > determine causality
9
Q

Types of Statistics

A
  • Descriptive statistics: seek to understand patterns in the sample
  • Inferential statistics: seek to infer whether patterns generalize to the population
    • make predictions of population parameters through sample
    • help quantify confidence
10
Q

Measurement Scales

A
  • nominal scale: values are arbitrary
    • can only count things that are alike
  • ordinal scale: values are ranked, but the intervals between values are not equal
    • can compare things and put them in order
    • know rank, do not know magnitude
  • interval scale: ranked with equal intervals, but no absolute 0 point
    • can add and subtract
  • ratio scale: ranked with equal intervals and an absolute zero
    • can calculate ratios
11
Q

Discrete v Continuous Variables

A
  • discrete: no possible values between adjacent units on the scale
    (e.g. number of dogs)
  • continuous: infinite possible values between adjacent units
    (e.g. weight of an apple)
    • real limits: the values lying half of the smallest measuring unit above and below the recorded value
    • e.g. 34.45 kg: smallest unit is 0.01 kg > half-unit is 0.01 / 2 = 0.005 kg > real limits are 34.445 kg and 34.455 kg
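A minimal Python sketch of the real-limit arithmetic above; the function name and the 34.45 kg / 0.01 kg values are just illustrations, not course data:

    # Real limits of a continuous measurement: the recorded value plus/minus
    # half of the smallest measuring unit (illustrative values only).
    def real_limits(recorded_value, smallest_unit):
        half_unit = smallest_unit / 2      # e.g. 0.01 / 2 = 0.005
        return recorded_value - half_unit, recorded_value + half_unit

    lower, upper = real_limits(34.45, 0.01)
    print(lower, upper)                    # ~34.445 and ~34.455 (floating-point rounding aside)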
12
Q

Significant Figures

A
  • Descriptive statistics: 2-3 decimals
  • Correlation, Regression, p-values: always 3 decimals
13
Q

Frequency Distribution (f) [def]

A

present score values and their frequency of occurrence
- usually lowest score value at the bottom

14
Q

Grouped Frequency Distribution

A
  • create clusters within frequency distributions
  • must choose intrinsically meaningful intervals (represent simply and accurately)
15
Q

Relative f distribution [def]

A

Proportion of cases that fall into a class interval
- f / N

16
Q

Cumulative distribution

A
  • cum. f distribution: the number of scores at and below the upper limit of each class interval
  • cum. % distribution: the percentage of scores at and below the upper limit of each class interval
    • makes it easy to find the median
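A minimal Python sketch tying cards 13-16 together: frequency, relative frequency (f / N), cumulative frequency, and cumulative percentage for a small made-up set of scores (the scores are invented for illustration):

    from collections import Counter

    scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]   # made-up data
    N = len(scores)

    freq = Counter(scores)                    # f: frequency of each score value
    cum_f = 0
    for value in sorted(freq):                # lowest score value first
        f = freq[value]
        rel_f = f / N                         # relative frequency = f / N
        cum_f += f                            # scores at and below this value
        cum_pct = 100 * cum_f / N             # cumulative percentage
        print(value, f, rel_f, cum_f, cum_pct)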
17
Q

Drawing Graphs

A
  • vertical axis: ordinate, Y-axis
    • shows the DV; in frequency graphs, shows the frequency of the score values
  • horizontal axis: abscissa, X-axis
    • shows the IV; in frequency graphs, shows the score values
  • must have a title and a label for each axis
  • axes should start at 0
18
Q

Types of Graphs

A
  • Bar Graphs: no numerical relationship between categories
    • represents nominal data
    • have gap between bars to show discontinuity
  • Histogram
    • represents ordinal data
    • shows continuity of the variable > bars must touch each other
  • Frequency Polygons
    • represents interval or ratio data
    • similar to a histogram, but uses points plotted at the class midpoints
    • de-emphasizes the class intervals > approximates the expected shape of the distribution
19
Q

Distribution Shapes

A
  • Normal distribution: ideal
    • symmetrical
    • unimodal
  • Positive skew: data is more clustered on the lower end
    • mode < median < mean
  • Negative skew: data is more clustered on the upper end
    • mean < median < mode
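A quick check of the skew orderings above, using a made-up positively skewed data set and Python's statistics module (values invented for illustration):

    from statistics import mean, median, mode

    # Made-up positively skewed scores: clustered low, with a long upper tail.
    scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

    print(mode(scores))    # 2   (most frequent value)
    print(median(scores))  # 3.0 (middle of the distribution)
    print(mean(scores))    # 4.5 (pulled toward the upper tail)
    # mode < median < mean, as expected for a positive skew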
20
Q

Models of Central Tendency

A
  • Mean
  • Median
  • Mode
21
Q

Mode (Mo)

A
  • most frequent observation
  • the only measure of central tendency that can model nominal data
  • best used when you have to be exact (close is not good enough)
    • e.g. betting in sports
  • bimodal: data has 2 modes
  • if the data are all unique, every observed value is a mode (the mode is uninformative)
22
Q

Median (Mdn, P50)

A
  • splits the distribution at the 50th percentile: half of the scores fall below it, half above
  • can be used with ordinal (or higher) data
  • properties:
    • less sensitive to extreme scores than the mean
    • more subject to sampling variability than the mean, but less than the mode
23
Q

Mean (X-bar, M, μ)

A
  • average of data set
    • acts as a fulcrum: balances all values in the data set
  • properties:
    • sensitive to exact value of all scores
    • sensitive to extreme scores
    • sum of deviations always = 0
    • sum of squared deviations is a minimum
    • least subject to sampling variation
    • very sensitive to outliers
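A quick Python check, on made-up numbers, of two of the properties above: deviations from the mean sum to 0, and the sum of squared deviations is smaller around the mean than around any other value:

    scores = [3, 5, 6, 8, 10, 13]                  # made-up data
    m = sum(scores) / len(scores)                  # the mean (7.5 here)

    # Sum of deviations from the mean is 0.
    print(sum(x - m for x in scores))              # 0.0

    # Sum of squared deviations is a minimum at the mean.
    def ss(center):
        return sum((x - center) ** 2 for x in scores)

    print(ss(m) <= ss(m + 1), ss(m) <= ss(m - 2))  # True True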
24
Q

Outliers

A
  • highly atypical observations
  • mean is very sensitive to outliers
  • solutions:
    • use median
    • compare mean with and without outlier
    • determine source of the outlier
25
Q

Robustness of Central Tendencies

A
  • robustness increases as sample size increases: the models become more accurate
  • robustness to outliers: mode > median > mean
  • robustness to sampling variability: mean > median > mode
    • the mean has the greatest consistency > preferred when maximizing consistency
26
Q

Models of Variability

A
  • Range
  • Standard Deviation
  • Variance
27
Q

Range

A
  • absolute difference between the two most extreme scores
  • based only on those two scores > strongly influenced by skews and outliers
  • heavily affected by sampling variability
28
Q

Standard Deviation (s, sd, σ)

A
  • average raw deviance: roughly the average difference between a score and the mean
  • just like the mean:
    • more robust to sampling variability
    • influenced by skew and outliers
  • properties:
    • gives a measure of dispersion relative to the mean
    • sensitive to each score in the distribution
29
Q

Standard Deviation Steps

A
  • [1] calculate the sample mean
  • [2] subtract the mean from each score to get deviation scores
    • deviation score: the amount an observation deviates from the sample mean
    • the unsquared deviations sum to 0 because the mean is the fulcrum
  • [3] square the deviation scores
  • [4] sum the squared deviations
    • sum of squared deviations (SS): squaring makes every deviation positive, so the sum is informative
  • [5] divide by N, or by N - 1 when estimating the population value (inflates the variability a bit)
  • [6] take the square root to get the standard deviation
  • (worked sketch below)
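A step-by-step sketch of the procedure in this card on made-up numbers, showing the N vs N - 1 choice:

    import math

    scores = [4, 6, 7, 9, 14]                  # made-up data
    N = len(scores)

    mean = sum(scores) / N                     # [1] sample mean (8.0 here)
    deviations = [x - mean for x in scores]    # [2] deviation scores (sum to 0)
    squared = [d ** 2 for d in deviations]     # [3] squared deviations
    SS = sum(squared)                          # [4] sum of squared deviations

    variance_n = SS / N                        # [5] divide by N (describing this data set) ...
    variance_n1 = SS / (N - 1)                 # ... or by N - 1 (estimating the population)

    sd_n = math.sqrt(variance_n)               # [6] square root = standard deviation
    sd_n1 = math.sqrt(variance_n1)
    print(sd_n, sd_n1)                         # the N - 1 version is slightly larger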
30
Q

Variance (s^2, σ^2)

A
  • average squared deviation from the mean
  • just like the mean:
    • more robust to sampling variability
    • influenced by skews and outliers
31
Q

Normal Curve

A
  • ideal distribution to work with
  • perfectly symmetrical: not skewed, no outliers, mean = median = mode
  • unimodal: mean = mode
  • perfectly variable: the variation of scores is predictable
  • asymptotic tails: the curve never reaches the X-axis, but gets closer and closer
  • important to know whether our sample comes from a normally distributed population
    • our estimates will be off if it is not normally distributed
32
Q

z-scores (= standard scores)

A
  • transformed score that designates how many SD units the score is above/below the mean
    • how many σ's an observation is from μ
  • what it does:
    • shows how a value stacks up against the population distribution
    • transforms raw units into standard-deviation units > quantities that were not directly comparable can be compared
  • z-scores are a transformation:
    • centering: subtract μ from each observation > sets the mean to 0
    • standardization: divide each centered observation by the SD > sets the SD to 1
33
Q

Characteristics of z-scores

A
  • the z-scores have the same shape of distribution as the set of raw scores
  • the mean of the z-scores is always 0
  • the standard deviation of the z-scores is always 1
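A minimal sketch of the z-transformation (centering, then dividing by the SD) on made-up scores, verifying the two characteristics above; the population SD is used here purely for illustration:

    from statistics import mean, pstdev

    scores = [10, 12, 15, 18, 20]              # made-up data
    mu = mean(scores)
    sigma = pstdev(scores)                     # population SD, for illustration

    z = [(x - mu) / sigma for x in scores]     # center, then divide by the SD
    print(round(mean(z), 10) == 0, round(pstdev(z), 10) == 1)   # True True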
34
Q

Relationship

A
  • pattern between 2 variables
  • best visualized through scatterplots
  • characterized by its form, magnitude, and direction
35
Q

Linear Relationship

A
  • a relationship between 2 variables that can be most accurately represented by a straight line
  • can be a perfect or an imperfect relationship
  • correlation coefficient: expresses quantitatively the magnitude and direction of the relationship
    • magnitude: larger absolute value = stronger relationship
    • direction: positive = direct, negative = inverse
36
Q

Equation of the straight line

A
  • Y = bX + a
    • a = Y-intercept: the value of Y when X = 0
    • b = slope of the line
37
Q

Regression v Correlation

A
  • correlation: concerned with the magnitude and direction of the relationship
    • uses z-score units
  • regression: focused on using the relationship for prediction
    • uses raw units
38
Q

Pearson's r

A
  • measure of the extent to which paired scores occupy the same or opposite positions within their own distributions
  • stronger relationship > more accurate prediction > more of the variability of Y is accounted for by X
39
Q

Pearson's r Formulas

A
  • Conceptual equation: input z-scores (not raw data)
    • the more the paired z-scores share the same sign, the stronger the positive correlation
  • Computational equation: input raw scores (more complicated-looking, but needs fewer components to be computed first)
  • (both forms are sketched below)
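The card names the two equations without reproducing them, so here is a hedged sketch of both on made-up pairs. It assumes the common conventions: the conceptual form averages the paired z-score products (divided by N - 1 because the z-scores use the sample SD; the course's divisor may differ), and the computational form uses raw-score sums:

    import math
    from statistics import mean, stdev

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    # Conceptual form: average product of paired z-scores.
    zx = [(x - mean(X)) / stdev(X) for x in X]
    zy = [(y - mean(Y)) / stdev(Y) for y in Y]
    r_conceptual = sum(a * b for a, b in zip(zx, zy)) / (N - 1)

    # Computational form: raw-score sums only.
    sum_xy = sum(x * y for x, y in zip(X, Y))
    num = N * sum_xy - sum(X) * sum(Y)
    den = math.sqrt((N * sum(x * x for x in X) - sum(X) ** 2) *
                    (N * sum(y * y for y in Y) - sum(Y) ** 2))
    r_computational = num / den

    print(r_conceptual, r_computational)       # the two forms agree (up to rounding)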
40
Q

Assumptions of Pearson's r

A
  • linear relationship
  • interval or ratio level data
  • absence of extreme outliers
41
Q

Coefficient of Determination (r^2)

A
  • proportion of the total variability of Y accounted for by X
  • shows the explained variance
  • ranges from 0 to 1
  • r^2 is never larger than |r|
42
Q

Choosing other correlation coefficients

A
  • Spearman's rho: use when one or both variables are ordinal
  • Point biserial: 1 variable is interval/ratio, 1 variable is dichotomous
  • Pearson's Phi: use when both variables are dichotomous
  • Eta correlation ratio: used for non-linear relationships
  • Partial correlation: relationship between 2 variables after the effect of a 3rd variable has been removed
43
Q

Effect of changing the range on correlation

A
  • restricting the range will, in most cases, lower the correlation
  • the line you fit will be biased in some way
  • you are extrapolating from a very small range > not seeing the big picture
44
Q

Effect of extreme scores on correlation

A
  • extreme scores can drastically alter the magnitude of the correlation coefficient
  • check the scatterplot before computing > if outliers are present, use caution when interpreting the relationship
  • outliers have a larger effect in smaller samples
45
Q

Correlation does not imply causation

A
  • a correlation between X and Y may be spurious
  • X may be the cause of Y
  • Y may be the cause of X
  • a third variable may be the cause of the correlation
46
Q

Criterion Variable

A

What you are trying to make predictions about (e.g. attitude towards the movie)
47
Q

Predictor Variable

A

What is predicted to lead to the criterion (e.g. gender)
48
Q

Regression Line

A
  • the line of best fit is the least-squares regression line
  • it is the prediction line that minimizes the total error of prediction
  • the total (squared) error of prediction is smaller for the least-squares regression line than for any other possible prediction line
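A sketch of the least-squares line on made-up pairs, using the common raw-score formulas b = SP / SSX and a = Y-bar - b * X-bar (the course's notation may differ), plus a check of the least-squares property:

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    mx, my = sum(X) / N, sum(Y) / N
    SP = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of cross-products
    SSX = sum((x - mx) ** 2 for x in X)

    b = SP / SSX                               # slope of the least-squares line
    a = my - b * mx                            # Y-intercept

    def total_squared_error(slope, intercept):
        return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(X, Y))

    # Least-squares property: total squared error is smaller for (b, a) than
    # for a nearby alternative line (the tweaks 0.3 and -0.2 are arbitrary).
    print(total_squared_error(b, a) < total_squared_error(b + 0.3, a - 0.2))  # True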
49
Q

Requirements for linear regression

A
  • the sample is appropriate for a linear model
  • the prediction is made within the range of the original variables
    • we do not know whether the relationship stays linear at more extreme values beyond our range
50
Q

Similarities between the Mean and Linear Regression

A
  • both smooth over all imperfections
  • both act as a fulcrum that balances the data > the best prediction is the mean / the line of best fit
  • both show how much error is in our model
    • mean: via deviation scores, through the standard deviation
    • regression: via prediction errors, through the standard error of the estimate
  • both describe a distribution of errors
    • roughly 68.2% of scores fall within ±1 SD of the mean / within ±1 Sy|x of the line
51
Q

Standard Error of the Estimate

A
  • the degree to which the actual Y values differ from the predicted Y values
  • the standard deviation of Y given X (Sy|x)
  • with each predictor added, we subtract one more from N in the denominator
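A sketch of the standard error of the estimate on the same made-up pairs used earlier: with one predictor, Sy|x = sqrt(SS_residual / (N - 2)), the extra subtraction reflecting the added predictor:

    import math

    X = [1, 2, 4, 5, 8]                        # made-up paired scores
    Y = [2, 4, 5, 4, 9]
    N = len(X)

    mx, my = sum(X) / N, sum(Y) / N
    b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
    a = my - b * mx

    residual_SS = sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y))
    s_yx = math.sqrt(residual_SS / (N - 2))    # one predictor -> divide by N - 2
    print(s_yx)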
52
Q

Homoscedasticity

A
  • assumes the scatter around the predicted line is equal (balanced) across values of X
  • the standard error of the estimate is only meaningful if the variability of Y is constant over values of X
53
Q

Multiple Regression

A
  • contains 2+ predictors, but still only 1 criterion
  • adding a new predictor will always numerically improve the prediction
    • standard error of the estimate decreases
    • multiple coefficient of determination increases
  • be mindful that some relationships may be crud > do not just randomly add new predictor variables
54
Q

Multiple Coefficient of Determination (R^2)

A
  • helps avoid double counting: you cannot merely add up the separate r^2 values
  • fully redundant predictors: R^2 will not increase, it will equal the r^2 of the better predictor
  • fully orthogonal predictors: the r^2 values can be added, because there is no double counting
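A sketch (numpy least squares on simulated data; the variable names and coefficients are invented) showing that a nearly redundant second predictor barely raises R^2, while an orthogonal predictor that actually relates to the criterion raises it clearly:

    import numpy as np

    rng = np.random.default_rng(0)             # simulated data for illustration
    n = 200
    x1 = rng.normal(size=n)
    x2_redundant = x1 + rng.normal(scale=0.01, size=n)   # nearly a copy of x1
    x2_orthogonal = rng.normal(size=n)                   # unrelated to x1
    y = 2 * x1 + 1.5 * x2_orthogonal + rng.normal(size=n)

    def r_squared(predictors):
        X = np.column_stack([np.ones(n)] + predictors)   # intercept + predictors
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta
        return 1 - resid.var() / y.var()

    print(r_squared([x1]))                     # baseline, one predictor
    print(r_squared([x1, x2_redundant]))       # barely changes: redundant predictor
    print(r_squared([x1, x2_orthogonal]))      # clearly higher: new information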
55
Q

Meehl's 6th law of psychology

A
  • crud factor: "everything correlates with everything else"
  • adding predictors always increases R^2, even if they are crud predictors
  • solution: use adjusted R^2, which adds a penalty for each predictor added
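A sketch of the usual adjusted-R^2 penalty, R^2_adj = 1 - (1 - R^2)(N - 1) / (N - k - 1) with k predictors; the exact formula used in the course may differ, and the numbers below are invented:

    def adjusted_r_squared(r2, n, k):
        # Penalize R^2 for the number of predictors k, given sample size n.
        return 1 - (1 - r2) * (n - 1) / (n - k - 1)

    # Same R^2 but more predictors -> lower adjusted R^2 (illustrative numbers).
    print(adjusted_r_squared(0.40, n=50, k=1))   # ~0.39
    print(adjusted_r_squared(0.40, n=50, k=5))   # ~0.33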
56
Q

Categorical predictors in regression

A
  • dummy coding: code the category as "1" or "0"
    • result: the predictor only adds to the criterion when "1" occurs
  • contrast coding: code the category as "1" or "-1"
    • result: the criterion shifts in opposite directions for "1" v "-1"
  • either way, the result is parallel slopes that scoot up and down (e.g. for all values of sleep, Dan will be a bit more grumpy when it is raining outside)
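A sketch of dummy coding with numpy least squares; the sleep/rain/grumpiness variables just mirror the card's example, and the data are invented:

    import numpy as np

    # Made-up data: grumpiness predicted from sleep (continuous) and rain
    # (dummy coded: 1 = raining, 0 = not raining).
    sleep = np.array([4, 5, 6, 7, 8, 9, 4, 5, 6, 7, 8, 9], dtype=float)
    rain = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1], dtype=float)
    grump = 10 - 0.8 * sleep + 2.0 * rain + np.random.default_rng(1).normal(0, 0.3, 12)

    X = np.column_stack([np.ones_like(sleep), sleep, rain])
    b0, b_sleep, b_rain = np.linalg.lstsq(X, grump, rcond=None)[0]

    # Parallel lines: the sleep slope is shared, and the rain dummy simply
    # shifts the intercept up by about b_rain whenever rain == 1.
    print(b0, b_sleep, b_rain)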