Midterm Flashcards

1
Q

Why has psychology been criticized for not being a legitimate scientific enterprise? Mention Meehl’s criticisms.

A

Meehl’s Criticisms

  • With a large enough sample size, anything can be rejected; there will always be some difference between two means
  • Findings are context dependent
  • The null is technically always wrong
  • Theories are hard to refute
  • Findings are non-cumulative
  • Theories are never decisively refuted nor corroborated; conflicting results just create more “rubble”
  • Prone to bandwagons and sinking ships
2
Q

Why has psychology been criticized for not being a legitimate scientific enterprise? Mention measurement precision, and hypothesis

A

Measurement Precision and hypothesis testing

  • In psychology, more precision leads to a better chance of finding a significant difference, whereas in the physical sciences more precision makes it harder to find a difference.
  • As our procedures get better and more powerful, we become more likely to find the null to be false
  • More precise = more likely to reject the null
  • Physics makes point predictions, whereas psychology tests against zero.
  • In psychology, null statistical hypotheses are not derived from substantive theories; thus their rejection does not increase the plausibility of the substantive theory.
3
Q

Why is cumulative progress in psychology slow?

A
  • We rarely refute theories
  • Due to limited money and resources, large sample sizes are hard to get. This, in combination with significance testing, produces a large amount of conflicting results.
  • Humans are harder to study
  • In psychology there may be many multiway interactions needed to explain/predict a certain behavior
  • Theories themselves are rarely put to the test; auxiliary theories are.
4
Q

What is the problem regarding core and auxiliary theories in psychology vs. other sciences?

A
  • There is a larger gap between core and auxiliary theories in psychology than in the hard sciences. Auxiliary theories are only loosely derived from substantive theories.
    * Independent testing of auxiliary theories is harder to do in psychology because so many variables are at play.
5
Q

Describe Lakatos’ view of scientific progress and the historical perspective

A

Lakatos
- Scientists tenaciously cling to their theories
- Theories are never abandoned after refuting evidence, due to a protective belt of auxiliary theories
- Theories that propose novel content but are not corroborated with core theories are called ad hoc theories
Historical perspective
- Theories must explain previous findings and build upon those findings in order for progress to be made

6
Q

Describe positive and negative heuristics

A
  • Negative
    * Defensive reaction to refuting evidence by attributing the problems to less important features, blaming auxiliary theories rather than the core.
    * Ex. Blame participants, too small a sample size, or the sensitivity of the measure (usually blame auxiliary factors)
  • Positive
    * How we can modify the core to account for our findings while still preserving it, adjusting or expanding the core.
    * Ex. The theory is true for men but not women, or parents are not the only reason children become psychopaths (need to look at more factors)
7
Q

Describe Progressive and degenerating research

A

-Progressive
*Theories that expand
*A theory remains progressive as long as its positive heuristic is still capable of anticipating novel facts
-Degenerating
*Theories that are no longer moving forward and have stalled
*However, it is considered acceptable to cling to a degenerating research program if no rival program exists that satisfies all the findings; that is, until a competing theory supports and explains all the same evidence.
-Signs of a degenerating research program (one that isn’t progressing): unable to account for new emerging evidence; relies on negative heuristics (always blames auxiliaries, always on the defense); not expanding

8
Q

Describe Crucial tests

A
  • Crucial tests pit one theory against another to see which is correct
    * Ex. Einstein’s theory of relativity and the solar eclipse
  • The need for a crucial test is usually seen in hindsight; history sometimes dictates whether a test was a crucial one or not
9
Q

How is Lakatos’ view relevant to the criticisms of psychological science vs. other sciences?

A
  • Hard sciences cling to their theories just as much as psychology does
  • The notion that the physical sciences are consistent or cumulative is not supported, and the claim that the psychological sciences are inconsistent or non-cumulative is not supported either.
  • Critics hold an idealized version of research practices in physics as a standard that psychology should follow; however, research in physics has shown methodological deficits proportional to those found in psychological research.
10
Q

What is construct validity and how is it related to other forms of validity?

A

A broad concept encompassing all forms of validity, essentially depicting the accuracy of the measure. Is the measure measuring what it is supposed to be measuring?
- As our understanding of the construct becomes clearer, measures shift to make them more accurate.
- It is a process rather than an endpoint
- It should be consulted at all points of developing a measure, not just at the end
- It is best understood as an overarching concept covering all types of validity
Three aspects of construct validity: substantive validity (lit reviews), structural validity, and external validity

11
Q

Why is construct validity not a static quality of a test that can be established with a single study?

A
  • Construct validity is dynamic.
  • When scales are constructed, some aspects of the results will support the theory but not all will
  • The researcher must decide whether the fault lies with the test or the theory
  • One cannot just throw away years of work because a single study didn’t support the theory
  • Construct validity is acquired by rigorous testing of alternative hypotheses to further our understanding of the construct.
12
Q

When is construct validity typically assessed and when should it be?

A
  • Construct validity is typically considered after the test has been constructed, in a post-hoc fashion. However, it is more appropriately considered a process than an endpoint.
  • Construct validity should be considered at all stages of the scale construction process
13
Q

Why is a lit review an important first step in scale construction?

A
  • Reveals whether psychologists already have a reliable scale for measuring a specific construct; if they do, development of a new scale may not be necessary.
    • However, a new scale may be important if previous scales define the construct differently or measure ranges that are too narrow or too broad
  • The literature may also reveal whether new scales are needed to advance a theory or for cross-validation with other measures of the same construct
  • It can develop a clear conceptualization of the target construct.
  • Although one may already have a general sense of the concept, literature reviews may be helpful in considering alternative explanations.
14
Q

Why is it useful to write a formal definition of a construct in the very early stages of test development?

A
  • A formal definition helps finalize the construct and define its breadth and scope.
  • The researcher can define lower-order components (subcomponents) of the construct, allowing the conceptualization of the construct to expand to include all overlapping subcomponents.
15
Q

Why should initial item pools be over-inclusive?

A
  • So every possible aspect of the construct is covered and the boundaries of the construct are defined
  • As analyses are run, weaker and unrelated items can then be dropped from the final scale
  • You can always take items away, but you can’t add new ones later.
16
Q

What is content validity?

A

The degree to which a measure is representative of all the possible facets of a target construct.

  • Related to relevancy: the appropriateness of a measure’s items for measuring the target construct. All items in the measure should fall within the target construct.
  • Also related to representativeness: the degree to which the item pool adequately samples content from all aspects of the target construct. Often achieved in the form of subscales.
17
Q

Why should items with extreme endorsement probabilities not be automatically dumped?

A
  • Often removed because researchers believe they don’t offer much information
  • It may be beneficial to know who the five percent who responded differently are, e.g., on a question about suicidal thoughts
  • Many measures are tested across a wide variety of individuals (college students to psychiatric patients), and these groups may differ in their average trait levels; thus excluding the item may dispose of important information about a certain group of people
  • It is important to look at people on the extreme ends too
18
Q

What are the pros and cons of dichotomous response formats?

A
  • Pros
    - Easier scoring and analyses (computers have made this advantage largely obsolete)
    - Less time consuming
  • Cons
    - Less reliable (Must make the scales longer in order to achieve same reliability as polytomous scales)
19
Q

Describe the rational-theoretical method of item selection

A

a method in which the scale developer simply writes items that appear to be consistent with the target construct.

  • Pros
    - Simple
    - Good convergent validity
  • Cons
    - Poor discriminant validity
20
Q

Describe the Criterion-keying method of item selection

A

items are selected for a scale based solely on their ability to discriminate between individuals in a “normal” group and those from the group exhibiting the construct being measured

  • Item content is irrelevant
  • Pros
    - Good discriminant validity
    - Good convergent validity
    - Empirical
  • Cons
    - Measures are atheoretical and don’t advance psychological theory in a meaningful way
    - Scales are highly heterogeneous and lack internal consistency, making proper interpretation of scores difficult
21
Q

What is factor analysis and why is it a useful tool in scale construction?

A

A statistical procedure used to identify clusters or groups of related items on a test. It is extremely useful for producing homogeneous scales with good discriminant validity.

22
Q

Describe the features of good and bad candidate items

A
  • Good Candidate Items
    - items that load moderately on a primary factor and only minimally on other factors
  • Bad Candidate Items
    - items that load weakly on the hypothesized factor or cross-load on other factors
23
Q

Describe the kind of factor analysis results that are required for measures of broad constructs that have subscales.

A

Groupings must be homogenous and correlated for a broad construct to be broken into subscales.

24
Q

Differences between internal consistency and homogeneity?

A
  • Internal Consistency
    • measured by Cronbach’s alpha
    • indicates the overall degree of interrelation between a set of items
    • examines the average inter-item correlation and the number of items on the scale
  • Homogeneity
    • uses internal consistency to establish homogeneity
    • indicates the extent to which all of the items on a given scale tap a single facet of a target construct.
    • examines the mean of the distribution of the inter-item correlations.
25
Q

How will the inter item correlation matrix differ for uni- and multi-dimensional pools of items?

A

- There may be significant variability in inter-item correlations in a multi-dimensional pool of items, whereas the inter-item correlations of a uni-dimensional pool of items would cluster around the average.
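The contrast can be sketched numerically. A minimal illustration with made-up off-diagonal correlation values (the numbers are hypothetical, not from any real scale):

```python
def spread(correlations):
    """Mean and range of a set of off-diagonal inter-item correlations."""
    return sum(correlations) / len(correlations), max(correlations) - min(correlations)

# Hypothetical inter-item correlations (off-diagonal matrix entries).
unidimensional = [0.42, 0.45, 0.40, 0.44, 0.43, 0.41]    # cluster tightly around the mean
multidimensional = [0.70, 0.68, 0.12, 0.08, 0.65, 0.10]  # two clusters -> wide spread

m_uni, r_uni = spread(unidimensional)
m_multi, r_multi = spread(multidimensional)
print(r_uni < r_multi)  # the multi-dimensional pool shows far more variability
```

The range (or standard deviation) of the correlations, not just their mean, is what signals multidimensionality.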

26
Q

What is examined in the external validity phase and how is the task different from the task in the structural validity phase?

A
  • Examines whether the relationships between the new measure and important test and non-test criteria are congruent with one’s theoretical understanding of the target construct and its position with respect to other similar and dissimilar constructs (the nomological net).
  • The structural validity phase, by contrast, involves analyses of items WITHIN the new measure.
27
Q

Convergent validity

A

the extent to which a measure correlates with other measures of the same construct

28
Q

Discriminant Validity

A

the extent that a measure does not correlate with measures of other constructs.

29
Q

Predictive validity

A

The extent to which a measure can predict a criterion occurring in the future.

30
Q

Concurrent validity

A

Relating a measure to criterion evidence collected at the same time as the measure itself.

31
Q

What is measurement and what problem occurs with almost all measures?

A
  • Measurement = the process of building models that represent phenomena of interest, typically in quantitative form
  • Error will always occur
  • This error occurs not just in psychology; e.g., two doctors assessing amniotic fluid could reach different conclusions
  • Almost all measures struggle with validity and reliability, as no measure will be perfectly reliable and perfectly valid
32
Q

Why should we expect measurement models to be eventually proven wrong?

A
  • All measurement models will eventually be proven wrong because other models will be developed that supersede those models
  • measurement models must be specified explicitly so that they can be evaluated, disconfirmed, and improved
  • Comparative model testing is one of the best ways to determine which model is the “least wrong.”
33
Q

Describe the two components of observed scores in classical test theory and the definition of reliability

A

1) Observed Score = True Score + Error
- The true score is the score each person would obtain if there were no error. It also represents the population parameter.
- Error is all the variation in the circumstances of measurement that is not related to the attribute being measured.
2) Reliability = the consistency of a measurement procedure and the extent to which scores produced by the measure are replicable. It is the ratio of true-score variance to observed-score variance. A reliability of 0.7 or higher is generally sufficient, but the appropriate threshold depends on the circumstances of the study (e.g., a study of suicide demands more).
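The decomposition and the variance-ratio definition of reliability can be illustrated with a small simulation (the population values 15 and 5 below are made-up assumptions, chosen only so the expected reliability is easy to compute):

```python
import random

random.seed(0)

# CTT model: each observed score = true score + random error.
n = 10_000
true_scores = [random.gauss(100, 15) for _ in range(n)]   # true-score SD = 15
observed = [t + random.gauss(0, 5) for t in true_scores]  # error SD = 5

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true-score variance / observed-score variance.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))  # close to 225 / (225 + 25) = 0.90
```

Because error variance adds to true-score variance, noisier measurement (larger error SD) drives this ratio toward zero.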

34
Q

Describe the cost of low reliability

A
  • Trouble operationalizing constructs if the “meter stick” isn’t consistently measuring the construct
    • The true correlation between a measure and the construct measured may be underestimated (attenuated)
  • Small sample sizes combined with low reliability make detecting an effect difficult
  • When true correlations are small, low reliability makes it even more difficult to detect a difference.
  • Effect sizes will be severely underestimated
35
Q

What is correction for attenuation and why should it be performed?

A
  • Correction for attenuation is a statistical procedure in which reliability indices are used to correct for underestimated observed correlations due to unreliability.
  • It should be performed so that researchers do not underestimate the strength of the relationship between two variables, which would result in lower effect sizes and possibly nonsignificant findings when an effect does exist.
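The standard correction divides the observed correlation by the square root of the product of the two measures’ reliabilities. A minimal sketch (the r = .30 and the reliabilities .70/.80 are hypothetical values):

```python
def disattenuate(r_observed, rel_x, rel_y):
    """Correction for attenuation: estimated true-score correlation,
    r_true = r_observed / sqrt(rel_x * rel_y)."""
    return r_observed / (rel_x * rel_y) ** 0.5

# Hypothetical: observed r = .30 between measures with reliabilities .70 and .80.
print(round(disattenuate(0.30, 0.70, 0.80), 2))  # ~0.40, larger than the observed .30
```

The less reliable the measures, the larger the correction, which is exactly why low reliability attenuates observed effects.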
36
Q

What is the major benefit of generalizability theory over classical test theory statistics?

A
  • Permits a decision maker to pinpoint the sources of measurement error and quantify them
    • The idea is to obtain a certain level of generalizability, which is the extent to which a score is interchangeable with other scores, e.g., similar numbers are interpreted the same way (amniotic fluid example)
  • G-theory disentangles multiple sources of error rather than the single broad error term that CTT provides
  • G-theory allows researchers to see exactly where the sources of error are coming from and possibly reduce these problems
  • It taps into every facet that may influence reliability
37
Q

Three sources of reliability

A

1) Internal Consistency
- Every item is correlated with every other item, ex. Split-half reliability
2) Interrater
- How much scores vary across different raters or judges
3) Test-Retest
- How responses vary across time, ex. IQ test results remain relatively stable across multiple sessions

38
Q

What is Cronbach’s alpha? Explain why alpha is not a pure index of internal consistency and why high alpha values do not necessarily indicate homogeneity.

A
  • The average of all possible split-half reliabilities (split-half = randomly split the item pool into two halves and compute the correlation between the two halves)
  • It is not a pure index of internal consistency because alpha reflects both the average inter-item correlation and the number of items: a long scale can yield a high alpha even when its items are only weakly interrelated. Likewise, a high alpha does not guarantee homogeneity, because the items may tap several facets rather than a single facet of the construct, so it is best to also look at homogeneity.
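Alpha can be computed directly from the item variances and the total-score variance via the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch with toy data (the responses below are made up):

```python
def cronbach_alpha(items):
    """items: one list of scores per item, all the same length
    (one entry per respondent)."""
    k = len(items)
    n = len(items[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Three toy items answered by five respondents (hypothetical data).
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 5, 1],
]
print(round(cronbach_alpha(items), 2))  # → 0.96
```

Note how k appears in the formula: adding parallel items raises alpha even when the inter-item correlations stay fixed, which is why alpha alone cannot establish homogeneity.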
39
Q

Latent trait

A

A trait that underlies and directly influences individual’s behaviors and responses.

40
Q

Item response function

A

A mathematical function describing the relationship between where an individual falls on the continuum of a given construct and the probability that they will give a particular response to a scale item designed to measure that construct.

41
Q

Item discrimination

A

A parameter indicating how well an item can discriminate between people high on a trait and people low on a trait. (The steepness of the curve)

42
Q

Item difficulty (Threshold Parameter)

A

The latent level that corresponds to a 50% chance of getting the item correct or endorsing the item. (Where the inflection point occurs)
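Discrimination and difficulty combine in the two-parameter logistic (2PL) item response function, P(theta) = 1 / (1 + exp(-a(theta - b))). A minimal sketch (the parameter values are made up for illustration):

```python
import math

def irf(theta, a, b):
    """2PL item response function: probability of endorsing (or answering
    correctly) an item, for a person at latent level theta, given
    discrimination a and difficulty/threshold b."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# Hypothetical item with discrimination a = 2 and difficulty b = 0.
print(irf(0.0, 2, 0))                      # at theta == b, probability is 0.5
print(irf(2.0, 2, 0) > irf(2.0, 0.5, 0))   # steeper slope separates high-theta responses more sharply
```

The inflection point of the curve sits at theta = b, and the slope there is governed by a, matching the two definitions above.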

43
Q

What should item discrimination and threshold parameters look like for a well-designed test?

A
  • Should have steep slopes and items whose inflection points hit every level of the latent trait continuum, meaning the scale tests every level of the latent trait, from easy through middle to difficult questions (all levels of difficulty)
  • Every item would effectively discriminate individuals who score high or low on the latent trait with fairly good accuracy.
  • If graphed using item information functions, the items should have high peaks and cover every level of the latent trait continuum. High peaks indicate that an item provides a lot of information, making it easier to differentiate between individuals high and low on the trait.
44
Q

Item information function

A

A graphical curve that displays the amount of discrimination between respondents that an item provides across the latent trait continuum.

45
Q

Scale information function

A

The sum of all item information functions. Shows precision of scale across all levels of the latent trait continuum.
-It is useful because it allows the linking of different scales. This method allows researchers to link different measures across age, cultures, genders, etc.

46
Q

Reliability information function

A

How informative a test item is across all levels of the latent trait continuum.
- This is useful because it can expose items that are redundant or are not operating properly and thus are not needed. It can also identify problems with item response options (i.e., dichotomous vs. polytomous)

47
Q

How can IRT be used to evaluate the appropriateness of response categories and item redundancy?

A
  • Response option curves can be generated to assess how well each response option discriminates from the others. If the curves overlap, this can mean that the labelling of the response options was confusing or that individuals couldn’t discriminate between, say, a 2 and a 3 on a 5-point scale.
  • Test information and reliability functions can be generated to test for redundancy in a scale. If two items have similar curves or overlap each other, then the researcher will know that those two items are providing similar information, and thus one of them can be eliminated to make the scale more efficient.
48
Q

Item bias

A

occurs when individuals from the different groups, who are at the same points on the latent trait continuum, obtain different scores on the item.

49
Q

Test bias

A

occurs when individuals from the different groups, who are at the same points on the latent trait continuum, obtain different scores on the overall test.
-It is important to note that item biases often cancel each other out, so researchers aren’t as concerned by them as test biases that affect the overall score.

50
Q

What is computer adaptive testing (CAT)

A

a form of computer-based test that adapts to the examinee’s ability level.
- When an individual begins a test, the computer starts with a question of average difficulty; whether the individual gets the question right or wrong determines which question the computer chooses next. A wrong answer leads the computer to choose an easier question; a right answer leads it to choose a harder question.

51
Q

When using CAT Explain why different test-takers may receive different items and different numbers of items, and why CAT can be used to obtain accurate scores for people using relatively few items

A
  • This enables the computer to zone in on the individual’s ability level quite accurately
  • Once the individual has reached a level at which they can consistently answer questions, but cannot answer questions of higher difficulty, the computer stops the test and provides an estimated ability level.
  • Due to anxiety or other influences, some individuals may jump along the latent trait continuum quite frequently; the computer must then administer more questions until the individual’s responses fall into a more consistent pattern and an ability level can be estimated
  • CAT allows researchers to obtain accurate estimates of an individual’s place on the latent trait continuum using relatively few questions
52
Q

Describe the essence of regression. Mention model, error, least squares, slopes, and intercepts in your answer.

A
  • Regression is the process of fitting a model or line to a set of data and using it to predict an outcome variable (DV) from a predictor variable (IV) in simple regression, or from multiple predictor variables (IVs) in multiple regression
  • Outcome = Model + Error
  • To assess the fit of a model, a linear regression line is computed using the least squares method: the line of best fit is the one for which the sum of squared differences between the observed points and the line (the residuals) is at a minimum. In other words, it is the line with the lowest error term.
  • The slope of the regression line is the change in the outcome associated with a unit change in the predictor variable.
  • The intercept of the regression line is the predicted outcome when the predictor is zero.
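The least-squares slope and intercept for simple regression have a standard closed form (slope = covariance of x and y over variance of x; intercept = mean of y minus slope times mean of x). A minimal sketch with toy data:

```python
def least_squares(x, y):
    """Closed-form simple linear regression: the slope and intercept
    that minimize the sum of squared residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx  # predicted outcome when the predictor is zero
    return slope, intercept

# Toy data: the outcome rises exactly 2 units per unit of the predictor.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = least_squares(x, y)
print(slope, intercept)  # → 2.0 1.0
```

With noisy data the same formulas return the line whose residual sum of squares is smallest, which is exactly the “least squares” criterion described above.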
53
Q

How is the goodness of fit of regression models assessed?

A
  • Assessed using the least squares method, in which the sum of squared differences of each point from the line (the residuals) is at a minimum value
  • This is done by first calculating the difference between each observed value and the mean, since the mean is a baseline model of the data. This quantity (the sum of squares total) shows how well the mean serves as a model of the observed data
  • Next, we fit a line of best fit to the data (the regression line) and compare that line to the observed values. This quantity (the sum of squares residual) represents the degree of inaccuracy (the residuals) remaining when the best model is fitted to the data
  • The improvement due to the regression line is the difference between the two (sum of squares total - sum of squares residual = sum of squares model)
  • A large sum of squares model relative to the residual indicates a line of best fit that can accurately predict outcome variables from predictors.
54
Q

Raw or unstandardized slopes

A
  • Raw slopes express the relationship between variables in their raw units
    • Relatively simple and easy to interpret; however, raw slopes cannot be meaningfully compared because they are in different metrics
  • For example, education may be measured in years and income in $1000 units; the slope of the regression line would then describe something like: for each increment of one raw unit of education (a year), projected earnings increase by one raw unit of income ($1000).
55
Q

Standardized slopes

A
  • Standardized slopes express the variables in common units: z-scores.
    • This allows researchers to compare predictors across different models, assess significance, and make meaningful comparisons, so researchers can determine whether one predictor had more of an effect than another in a multiple regression analysis.
      • For example, for one standard deviation of education, projected earnings would increase by .40 standard deviations of income.
56
Q

Forced entry method

A

a method in which all predictors are forced into the model simultaneously. No prior decisions are made about the order in which they are entered, only theoretical reasoning for including the predictors. The simplest procedure; however, as the number of predictor variables increases the results can become hard to interpret (especially since overlapping predictors are common in psychology)

57
Q

Hierarchical method

A

a method in which predictors are selected based on past work and are entered into the model by order of importance in predicting an outcome. This method is used most often.

58
Q

Forward entry method

A

a method used in stepwise methods in which the computer searches for the best predictor out of all possible predictors in predicting the outcome variable and adds it to the model

  • Then the computer selects the next best predictor and so on until the addition of a new predictor doesn’t improve the model and the computer stops
  • R does this based on the Akaike information criterion (AIC). A lower AIC indicates a better model; therefore, predictors are kept as long as they lower the AIC.
59
Q

Why should stepwise methods be avoided (or used with great caution)?

A

Stepwise methods trim your predictors to fit that specific sample, which makes the results hard to replicate and generalize. Generalization is key if you want to conclude anything meaningful from your study, and stepwise methods overfit the model to the sample at hand, making it hard to replicate in other samples.
- If results cannot generalize, they do not add anything to research or the growth of psychological theories.

60
Q

Describe how the following statistic can be used to look for outliers and influential cases: standardized residuals,

A

Standardized residuals are the differences between each point and the regression line converted into z-scores (the residual divided by its standard error).
- This is important in interpreting outliers and error because it provides researchers with a universal cutoff for what constitutes an acceptable value. If more than 5% of the points fall beyond ±1.96, the model is a poor representation of the data, assuming normality.

61
Q

Describe how the following statistic can be used to look for outliers and influential cases: DFFIT

A

DFFIT is the difference between the predicted value for a case when the model is calculated including that case and when it is calculated excluding that case. This requires an adjusted predicted value, computed by taking every case in a distribution and testing whether or not exclusion of that case improves outcome predictions. If a case is not influential, its DFFIT value will be zero. We don’t want cases to have a large influence; DFFIT values should all be relatively small.

62
Q

Describe how the following statistic can be used to look for outliers and influential cases: DFBETA

A

DFBETA is the difference between a parameter estimated using all cases and estimated when one case is excluded. This technique allows researchers to identify cases that have a particularly large influence on the parameters of the regression model (outliers).

63
Q

Explain why regression is more flexible and encompassing than ANOVA.

A

-Regression can be used with continuous or categorical (e.g., dichotomous) predictors, whereas ANOVA only handles categorical predictors.

64
Q

Why is it important that the predictors in a regression model be uncorrelated with “external variables”?

A

-If predictors correlate with external variables, this is problematic when interpreting your results. An external variable could act as a third variable that significantly influenced your results, so you cannot be confident that your outcome was due to your predictors; a third variable you did not control for could have influenced the results.

65
Q

Methods of cross validation: Adjusted R2

A
  • Indicates the loss of predictive power: how much variance in Y would be accounted for if the model had been derived from the population from which the sample was taken.
  • Always lower than the R2 value.
  • A small sample size combined with many predictors causes the adjusted R2 value to decrease. In that case the line of best fit may overfit the values, overestimating the effect size.
66
Q

Methods of cross validation: Data splitting

A
  • Randomly splitting a data set, computing a regression equation on both halves of the data and then comparing the resulting models.
  • The regression line should fit both halves similarly.
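The split-and-compare procedure can be sketched as follows. The data are simulated (true slope 2, intercept 1, made-up noise level), and the fit uses the standard closed-form least-squares solution:

```python
import random

def fit(x, y):
    """Closed-form simple least-squares fit: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return b, my - b * mx

random.seed(1)
x = [i / 10 for i in range(100)]
y = [2 * xi + 1 + random.gauss(0, 0.5) for xi in x]  # simulated: slope 2, intercept 1

# Randomly split the data set in half, fit each half, compare the models.
idx = list(range(100))
random.shuffle(idx)
half1, half2 = idx[:50], idx[50:]
b1, _ = fit([x[i] for i in half1], [y[i] for i in half1])
b2, _ = fit([x[i] for i in half2], [y[i] for i in half2])
print(abs(b1 - b2) < 0.2)  # the two half-sample slopes should agree closely
```

If the two half-sample models disagreed markedly, that would suggest the original fit capitalized on sample-specific noise.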
67
Q

Describe the factors that are important in determining sample size.

A

The general rule is about 15 cases per predictor, but the required N depends on the size of the effect we are trying to detect and how much power we need to detect it. To figure this out, a power analysis must be run to determine an appropriate sample size. So the main factors are the number of predictors, the size of the effect to be detected, and the power needed to detect it.

68
Q

Problems created by multicollinearity- untrustworthy Bs

A

As collinearity increases, so do the standard errors of the b coefficients. Big standard errors for b coefficients mean that these b’s are more variable across samples and less likely to represent the population.
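One common way to quantify this is the variance inflation factor (VIF). A minimal sketch, assuming the two-predictor case where the VIF reduces to 1/(1 − r²); the data and the rough "VIF > 10" warning threshold are illustrative:

```python
# VIF for two correlated predictors: VIF = 1 / (1 - R^2), where R^2
# comes from regressing one predictor on the other(s).

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1.1, 2.0, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9]   # nearly a copy of x1

r = pearson_r(x1, x2)
vif = 1 / (1 - r ** 2)
print(f"r = {r:.3f}, VIF = {vif:.1f}")   # VIF far above the usual warning level
```

A large VIF corresponds directly to the inflated standard errors the card describes.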

69
Q

Problems created by multicollinearity- Limits the size of R

A

When two predictors are correlated they may account for the same shared variance in the outcome, so having both accounts for little more variance than having just one, which limits how large R can get. If predictors are uncorrelated, we can be sure they are accounting for different portions of the total variance.

70
Q

Problems created by multicollinearity-Importance of predictors

A

Multicollinearity makes it difficult to assess the individual importance of a predictor. If predictors are highly correlated and account for similar variance we can’t be sure which predictor is the important one in predicting the outcome.

71
Q

Why is it important to study the variable correlation matrix before the results of a regression analysis?

A

-To assess instances of collinearity or multicollinearity. These correlations can bias our regression line and produce results that may not be entirely correct (imprecise b values, conservative significance tests); therefore, these problems should be dealt with before the results are evaluated.

72
Q

What is R2 change?

A

-R2 change is the difference in R2 between model one and model two. When several predictors are added in model two, the R2 change value tells us whether the second model accounts for more of the variance through the addition of the new predictors. The change can then be tested for significance with an F test.
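The calculation can be sketched for the simplest case, adding one predictor to a one-predictor model. This is an illustrative sketch with simulated data, using the standard two-predictor R2 formula and the F-change formula, not output from any package:

```python
# R^2 change when x2 is added to a model already containing x1,
# plus the F test for the change. Data are simulated.
import random

random.seed(2)
n = 60
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [a + 0.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

def r(xs, ys):
    m1, m2 = sum(xs) / n, sum(ys) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(xs, ys))
    den = (sum((a - m1) ** 2 for a in xs)
           * sum((b - m2) ** 2 for b in ys)) ** 0.5
    return num / den

ry1, ry2, r12 = r(y, x1), r(y, x2), r(x1, x2)
r2_model1 = ry1 ** 2
r2_model2 = (ry1**2 + ry2**2 - 2 * ry1 * ry2 * r12) / (1 - r12**2)
r2_change = r2_model2 - r2_model1
# F change = (dR^2 / k) / ((1 - R^2_full) / (n - p - 1)), k = 1 new predictor
f_change = r2_change / ((1 - r2_model2) / (n - 2 - 1))
print(f"R2 change = {r2_change:.3f}, F(1, {n - 3}) = {f_change:.2f}")
```

The F statistic for the change is what software reports for the step from model one to model two in a hierarchical regression.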

73
Q

What does a 95% confidence interval for a b coefficient tell readers?

A

-If we were to replicate this study many times on different random samples and construct a confidence interval each time, 95% of those intervals would contain the true slope of the regression line; the interval reported gives the lower and upper limits estimated from this sample.
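The interval itself is just b ± t × SE(b). A minimal sketch with simulated data; the critical value 2.101 is the two-tailed t value for df = 18 (n = 20), taken from a t table:

```python
# 95% CI for a simple-regression slope: b +/- t_crit * SE(b).
import random

random.seed(4)
n = 20
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]   # true slope = 2

mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
a = my - b * mx
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = (sse / (n - 2) / sxx) ** 0.5      # standard error of the slope
t_crit = 2.101                           # t(.975, df = 18)
ci = (b - t_crit * se_b, b + t_crit * se_b)
print(f"b = {b:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Reporting this interval alongside b lets readers judge the precision of the estimate, not just its direction.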

74
Q

Describe the consequences/implications when a data set does not meet the regression assumptions. Lack of independent samples

A

If two cases in our sample are correlated (such as two sisters taking part in a study), this can lead to imprecise b coefficients and an overestimated effect size. This means your results are not very meaningful, as they are biased and may not generalize to the larger population.

75
Q

Describe the consequences/implications when a data set does not meet the regression assumptions. Outliers and influential cases

A

An outlier in a regression analysis can change the slope of the regression line so that it is no longer accurate or meaningful. It may also inflate the residual variance, increasing the chance of making a Type II error. The best approach is to compute the regression line again without the outlier and see how much the slope changes; if it changes substantially, it is best to drop the outlier and mention this in the results section.

76
Q

Describe the consequences/implications when a data set does not meet the regression assumptions. Non-normality

A

If a data set doesn’t follow a normal distribution, the slope of the regression line may be biased and unduly affected by outliers, so the results will not be an accurate depiction of the larger population. To fix this, a robust regression method should be used, such as bootstrapping.

77
Q

What do readers need to see when reading the results of a regression analysis?

A

-Regression analyses are often reported using tables. These tables include the b coefficients, the standard errors of the b coefficients, the beta weights, the change in R2, and the p values of the significance tests for all predictors in each model. It is also good practice to report confidence intervals for each model. Some researchers also include the correlation matrix to demonstrate the absence of collinearity.

78
Q

Partial correlations

A

the correlation between two variables in which the effects of other variables are held constant. In other words, the relationship between two variables when the variance each shares with a third variable is controlled for (removed from both).

  • Can use for dichotomous or continuous variables.
  • The third variable shares variance with BOTH of the predictors.
  • Best used when you want to look at the relationship between two variables when a third variable or possibly confounding variable is controlled for.
79
Q

Semi-partial correlations

A

the correlation between two variables when the effect of the third variable is controlled for in only ONE of them.

- Third variable is pulled out of the predictor only, not the outcome
- Best used when trying to explain the variance in the outcome accounted for by one particular predictor from a set of predictors.
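Both quantities can be computed from the three pairwise correlations. An illustrative sketch with simulated data, using the standard textbook formulas; `z` stands in for the third variable:

```python
# Partial vs. semi-partial correlation from pairwise Pearson r's.
import random

random.seed(3)
n = 80
z = [random.gauss(0, 1) for _ in range(n)]          # the "third" variable
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

def r(a, b):
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a)
           * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

rxy, rxz, ryz = r(x, y), r(x, z), r(y, z)
# partial: z removed from BOTH x and y
partial = (rxy - rxz * ryz) / (((1 - rxz**2) * (1 - ryz**2)) ** 0.5)
# semi-partial: z removed from x only
semipartial = (rxy - rxz * ryz) / ((1 - rxz**2) ** 0.5)
print(f"r = {rxy:.3f}, partial = {partial:.3f}, semi-partial = {semipartial:.3f}")
```

The only difference between the two formulas is the denominator, reflecting whether the third variable is removed from both variables or just one.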
80
Q

How can categorical variables be used as predictors in regression?

A
  • The problem with using categorical predictors in regression is that in most cases a predictor will have more than two categories (ex. religion)
  • Dummy coding = a way of representing groups using only zeros and ones. The number of dummy variables needed is the number of groups minus one, as the baseline group takes a zero on every dummy variable. The dummy variables are set up as contrasts, the placement of the 1 indicating which group is which (ex. 0100 or 1000)
  • Once the contrasts are set up, a multiple regression analysis can be computed.
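Dummy coding can be sketched in a few lines. The group labels below ("none", "classical", "rock") are made-up levels for illustration, with "none" as the baseline:

```python
# Three groups -> two 0/1 dummy variables (k groups need k - 1 dummies).

levels = ["none", "classical", "rock"]            # baseline listed first
group = ["none", "rock", "classical", "rock", "none"]

def dummy_code(values, levels):
    """One 0/1 column per non-baseline level; baseline is 0 everywhere."""
    return {lv: [1 if v == lv else 0 for v in values] for lv in levels[1:]}

dummies = dummy_code(group, levels)
print(dummies)
```

Each dummy column then enters the regression as an ordinary predictor, and its b compares that group with the baseline.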
81
Q

Give an example of an interpretation of a raw b value for a dummy variable.

A

-The raw b value for a dummy variable represents the difference in the outcome between the dummy-coded group and the baseline group. Ex. hygiene scores go down (a person becomes smellier) as a person changes from listening to no music to listening to rock music.

82
Q

What are moderators, when are they typically introduced?

A
  • a variable that alters the direction or strength of the relationship between a predictor and an outcome. Essentially an interaction effect, by which the effect of one variable depends on the level of another variable
  • Represents “for whom” a variable most strongly predicts or causes an outcome, altering the magnitude and direction of the relationship.
  • Can be categorical or continuous.
  • Often sought after the fact: when a weak effect is found, researchers search for interactions that could explain the result
83
Q

How are moderator variables tested?

A
  • In regression, moderator variables are treated as another IV. To represent the interaction between the moderator and the predictor, product terms are formed by multiplying the moderator term by the predictor term (using the dummy-coded variables where the moderator is categorical). A product term is created for each level of the moderator variable. Once all of the product terms have been created, a hierarchical multiple regression approach can be used to get an F statistic that determines whether a significant amount of the variance in the outcome is accounted for by the product terms. A significant result means an interaction has occurred, and a graph must be generated to see what the interaction looks like.
  • Enhancing interactions = the effect is stronger in one group than the other
  • Buffering interactions = the effect weakens toward zero as the level of the moderator changes
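Forming the product term itself is a one-liner. A minimal sketch with made-up values, where the moderator is a dummy-coded group:

```python
# Product (interaction) term: predictor * moderator, entered as a new
# predictor in the second step of the hierarchical regression.

predictor = [1.0, 2.0, 3.0, 4.0]
moderator = [0, 1, 0, 1]          # e.g. a dummy-coded group variable

product_term = [p * m for p, m in zip(predictor, moderator)]
print(product_term)               # nonzero only where the moderator is 1
```

Entering `product_term` after the main effects lets the R2 change test show whether the interaction accounts for additional variance.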
84
Q

What are mediators and when are they sought?

A

The variable that explains the relationship between a predictor and an outcome. The mechanisms through which a predictor influences an outcome variable.

  • Causal mechanism
  • IV1 predicts DV because of IV2; if IV2 accounts for all of the shared variance, IV1 and the DV may show no direct relationship at all (full mediation).
  • Often sought when we find a relationship and want to explain why it occurs.
  • Represents “why and how” one variable predicts or causes an outcome.
85
Q

How is the form of a significant moderated relationship uncovered?

A

-ANOVA’s can be used to test for significant moderated relationships, but only when the predictor and the moderator are categorical (the DV remains continuous). If they are continuous, many researchers perform a median split to divide the continuum into groups so that they can use an ANOVA instead of regression methods.

86
Q

MANOVA is used for what kind of data?

A

One or more categorical IV’s and 2 or more continuous DV’s

87
Q

What are the (supposed) benefits of MANOVA over running multiple ANOVAs?

A

MANOVA’s protect against familywise error, which inflates as more ANOVA’s are run on the data, increasing the chances of making a Type I error. In addition, MANOVA’s provide more information about the relationships between the dependent variables; a MANOVA can therefore tell us whether groups can be distinguished by a combination of scores on several dependent measures.

88
Q

Summarize the similarities between MANOVA and ANOVA with regards to sums of squares and SSCP matrices.

A
  • F ratio compares systematic variance to unsystematic variance
    - Both are interested in how much variance can be explained by the experimental manipulation
    - The product of the F ratio is a value representing the effect of systematic over unsystematic variance.
    - Both use the sum of squares method, the total squared difference between the observed values and the mean value, telling us how much variation can be accounted for by the model.
89
Q

Summarize the differences between MANOVA and ANOVA with regards to sums of squares and SSCP matrices.

A
  • ANOVA is univariate, MANOVA is multivariate.
  • ANOVA uses single values for systematic and unsystematic variance, whereas MANOVA deals with multiple DV’s and must use a matrix representing all of the systematic variance over a matrix representing all of the unsystematic variance in all of the DV’s.
  • The product of the F ratio is a single value in ANOVA, representing the effect of systematic over unsystematic variance, whereas the product for a MANOVA is a matrix, representing the systematic variance in all the DV’s over the unsystematic variance in all the DV’s.
  • MANOVA calculates the sums of squares but also calculates the cross products of the DV’s, allowing researchers to look at the correlations between DV’s. These cross products represent the total combined error between two variables.
90
Q

Explain what is produced by an eigen analysis.

A

An eigen analysis produces the statistics for the discriminant functions. Its two main products are a matrix of eigenvectors and a list of eigenvalues. The eigenvectors define new, perpendicular dimensions for the data such that all values off the diagonal of the transformed matrix are zero and only the values along the diagonal are used for further analyses; the matrices representing systematic over unsystematic variance therefore become much smaller and easier to interpret. The eigenvalues (the values along the diagonal of the HE-1 matrix) are conceptually equivalent to the F ratio in ANOVA, indicating whether the groups differ along each discriminant function.
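For a 2×2 case the eigenvalues can be found directly from the characteristic equation. An illustrative sketch; the matrix values are made up and merely stand in for an HE⁻¹ matrix:

```python
# Eigenvalues of a 2x2 matrix via the characteristic equation:
# lambda^2 - trace*lambda + det = 0.
import math

m = [[3.0, 1.0],
     [1.0, 2.0]]                 # made-up symmetric "HE^-1"-like matrix

trace = m[0][0] + m[1][1]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
eig1, eig2 = (trace + disc) / 2, (trace - disc) / 2
print(f"eigenvalues: {eig1:.3f}, {eig2:.3f}")
```

The larger eigenvalue corresponds to the first discriminant function, the dimension along which the groups are best separated.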

91
Q

What are discriminant function variates, and how many can there be?

A
  • linear variates used to predict which group a person belongs to (to discriminate between them). Linear variates are linear combinations of the dependent variables, allowing us to investigate the relationship between combinations of DV’s.
  • The number of discriminant function variates is the smaller of the number of DV’s and the number of groups minus one. Therefore, to avoid confusion and a lot of headaches from trying to interpret so many different dimensions, it is best to stick with about 5-7 DV’s.
92
Q

Describe the similarities between discriminant functions and multiple regression equations.

A
  • Product of analyses is a set of weights for variables, indicating how much variance can be accounted for by each predictor in the total outcome variance.
    - Both are interested in predicting an outcome variable.
    - Both fit linear models to data sets in order to predict an outcome variable.
93
Q

Describe the differences between discriminant functions and multiple regression equations.

A
  • In regression the weights produced are for the predictors (IV’s), not for the DV’s as in discriminant functions.
  • In a discriminant function analysis we are interested in predicting group membership (the IV) from multiple DV’s, not a DV from multiple IV’s as in multiple regression.
  • Several discriminant functions can be produced from a set of DV’s, whereas in multiple regression all independent variables are included in a single model.
94
Q

What kinds of comparisons, or ratios, are involved in multivariate significance tests?

A

Multivariate significance tests involve eigenvalues

  • the magnitude of the eigenvalues indicates whether there is a significant difference between groups somewhere in the multivariate space
  • like ANOVA, a multivariate significance test compares systematic variance to unsystematic variance
  • unlike ANOVA, the test is of the linear combinations of the DV’s, not the individual linear models.
95
Q

Explain why researchers might be justified in choosing Roy’s test over other multivariate tests.

A

Roy’s test tends to be more powerful because it tests only the discriminant function with the largest separation among the groups, increasing the likelihood of finding a significant result, whereas the other multivariate tests combine information across all of the discriminant functions.

96
Q

What is the independence of observations assumption?

A

Observations should be statistically independent. Not correlated with each other.

97
Q

What is the homogeneity of covariance matrices assumption and how is it tested?

A
  • the assumption that the variance in each DV is equal across groups and that the correlation between any two DV’s is the same in all groups
  • tested by assessing whether the population variance-covariance matrices of the different groups in the analysis are equal
  • the test is called Box’s test, and the result should be non-significant, meaning the matrices can be treated as equal.
98
Q

Main follow-up analysis methods after a significant multivariate test : Discriminant analyses or multivariate analyses

A

finds the linear combinations of the DV’s that best separate or discriminate the groups. This method is more in keeping with the analytic spirit of MANOVA because it focuses on the relationships that exist between the DV’s and the underlying dimensions.

99
Q

Main follow-up analysis methods after a significant multivariate test :Conduct several ANOVA’s or univariate test on each of the dependent variables.

A

It is thought that the preceding MANOVA will protect the follow-up ANOVA’s from inflated Type I error. This is flawed because the MANOVA only provides this protection for the DV’s on which group differences genuinely exist. Therefore, Bonferroni adjustments are needed to correct the subsequent ANOVA’s.
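The Bonferroni correction itself is just a division. A minimal sketch with illustrative numbers:

```python
# Bonferroni adjustment for follow-up ANOVAs: divide the familywise
# alpha by the number of tests. Numbers are illustrative.

alpha_familywise = 0.05
n_followup_anovas = 4
alpha_per_test = alpha_familywise / n_followup_anovas
print(alpha_per_test)   # 0.0125
```

Each follow-up ANOVA is then judged against the stricter per-test alpha so the familywise error rate stays at .05.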