RS4 Flashcards

Question

Why do you square residuals?

Answer 1

1. makes them all positive | 2. Emphasises the larger deviations - exaggerates them

Answer 2

The squared residuals (errors) all added together.

Answer 3

We are comparing it to the line without the predictor variable (the flat line at the mean of y). If the model is good it should reduce the distances from the points to the line, and so the SSE (sum of squared errors).

Answer 4

No it's actually just the mean of a distribution

Answer 5

Min Σ (yi − ŷi)²
yi = actual value, ŷi = predicted value
Minimum of the sum of the squared residuals (the y's). This formula is used to calculate the SST, SSR and SSE.
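
As a rough illustration of what the minimisation produces for a single predictor, here is a sketch using the standard closed-form least-squares solution. The bill and tip figures are made up, and the helper names are only for this example.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (b0, b1) that minimise sum((y - (b0 + b1*x))**2)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx              # gradient
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

def sse(xs, ys, b0, b1):
    """Sum of squared residuals for the line y_hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, not taken from the course
bills = [10.0, 20.0, 30.0, 40.0, 50.0]
tips = [1.0, 2.5, 3.5, 6.0, 7.0]

b0, b1 = least_squares(bills, tips)
best_sse = sse(bills, tips, b0, b1)
```

With these made-up numbers the fitted line is ŷ = 0.155x - 0.65. Nudging b0 or b1 away from those values only makes the SSE larger, and the line passes through the point (mean x, mean y).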

Answer 6

The point of the mean x value and mean y value, the line of best fit must go through this

Answer 7

ŷi = 0.1462x - 0.8188
b1 = 0.1462 (gradient), b0 = -0.8188 (intercept)
So for every £1 the bill amount (x) increases, the tip (ŷi) increases by 0.1462; this part makes sense. If the bill amount (x) is zero then the expected tip is -0.8188: this obviously doesn't make sense, but it doesn't have to.
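
The interpretation can be checked by plugging values into the equation. A small sketch using the coefficients from this card; the function name and the £50 bill are just for illustration.

```python
B0 = -0.8188  # intercept from the card
B1 = 0.1462   # gradient from the card

def predicted_tip(bill):
    """Predicted tip for a given bill amount, using the card's equation."""
    return B1 * bill + B0

tip_50 = predicted_tip(50.0)   # a hypothetical £50 bill
tip_51 = predicted_tip(51.0)   # one pound more
tip_0 = predicted_tip(0.0)     # the intercept
```

The gap between tip_51 and tip_50 is the gradient, and tip_0 is simply the intercept, which is why it can be negative without the slope being wrong.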

Answer 8

It's the sum of squares for regression. The SST (total) is the SSE (error) of the flat line at the mean of y, the model with maximum error. In our model we want less error. The SSR (regression, sometimes called model) is SST - SSE. The better the line fits the data, the smaller the SSE will be and the bigger the SSR (model) will be. We are comparing to our bad model, where SST = SSE.

Answer 9

The difference between the predicted value of y (from the model) and the actual value at each point. The predicted value has been calculated by plugging in the x values to the simple linear regression equation.

Answer 10

Quantifies the ratio between SSR (model) and SST (total). Shows us how well the model fits. It is the R² value: R² = SSR/SST. Can multiply R² by 100 to get a percentage.
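
A sketch of the arithmetic, assuming the predicted values come from a least-squares line with an intercept (the identity SSR = SST - SSE relies on that). The data and names are made up for illustration.

```python
from statistics import mean

def sums_of_squares(ys, y_hats):
    """Return (SST, SSE, SSR) for actual values ys and predictions y_hats."""
    y_bar = mean(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)                 # error around the mean
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # error around the line
    ssr = sst - sse                                         # what the model explains
    return sst, sse, ssr

# Hypothetical tips and the predictions of a least-squares line fitted to them
tips = [1.0, 2.5, 3.5, 6.0, 7.0]
predicted = [0.9, 2.45, 4.0, 5.55, 7.1]

sst, sse, ssr = sums_of_squares(tips, predicted)
r_squared = ssr / sst
percent_explained = 100 * r_squared
```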

Answer 11

The fact that independent variables (predictor variables) can also be related to EACH other as well as to the outcome (dependent variable). This is bad because it can be harder to discern which factor is really affecting the dependent variable. If this happens the related variables are redundant and you'd take them out.

Answer 12

Means that there are many more relationships to consider: you have to account for each predictor's relationship to the outcome as well as to each other predictor.

Answer 13

ŷ = b0 + b1x1 + b2x2 + ... (however many predictors)

Answer 14

Each coefficient is the estimated change in y (outcome) corresponding to a 1 unit increase in that predictor, when all other variables are held constant.

Answer 15

The R squared adjusted for the number of independent variables (lower than R squared)
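
A sketch of the usual adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of participants and k the number of predictors. The function name and example numbers are assumptions for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R squared for k predictors and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical model: R squared of 0.5 from 21 participants
adj_two = adjusted_r_squared(0.5, n=21, k=2)
adj_five = adjusted_r_squared(0.5, n=21, k=5)
```

The same R² is penalised more heavily as predictors are added, which is the point of the adjustment.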

Answer 16

Tells us how wide the 'band' is around the regression line, in the units of the dependent (outcome) variable

Answer 17

VIF should be below 10
Tolerance should be higher than 0.2
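
Tolerance and VIF are two views of the same quantity: regress one predictor on all the others, take that R², and then tolerance = 1 - R² and VIF = 1 / tolerance. The sketch below takes that R² as given and applies the cut-offs from this card; the function name and example values are assumptions.

```python
def collinearity_check(r_squared_j, vif_cutoff=10.0, tolerance_cutoff=0.2):
    """r_squared_j: R squared from regressing predictor j on the other predictors."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("r_squared_j must be in [0, 1)")
    tolerance = 1.0 - r_squared_j
    vif = 1.0 / tolerance
    ok = vif < vif_cutoff and tolerance > tolerance_cutoff
    return {"tolerance": tolerance, "vif": vif, "ok": ok}

mild = collinearity_check(0.75)     # hypothetical predictor, moderately related to the rest
severe = collinearity_check(0.95)   # hypothetical predictor, almost redundant
```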

Answer 18

Liberal: 15 ptps per predictor
Conservative: Tabachnick and Fidell (1996) suggest 50 + 8 x (number of IVs)
Stepwise: suggests 40 per IV
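
The three rules of thumb are simple arithmetic, sketched here for comparison. The function name and the choice of three predictors are only for illustration.

```python
def required_sample_sizes(n_predictors):
    """Minimum participants under each rule of thumb on this card."""
    return {
        "liberal": 15 * n_predictors,
        "conservative": 50 + 8 * n_predictors,   # Tabachnick and Fidell (1996)
        "stepwise": 40 * n_predictors,
    }

sizes = required_sample_sizes(3)
```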

Answer 19

Random sampling variability is the fact that when you sample a population the result may come out differently with different samples, i.e. you might get a different sample mean each time. Sampling error is the fact that the mean of the sample you have will differ from the population mean.

Answer 20

Formulated by Pierre Laplace (1810): the idea that different samples taken from the same population will often have different sample stats such as mean and SD - random sampling variability

Answer 21

The standard error measures the expected difference (due to chance) between the sample mean and that of the population. It arises because of random sampling variability; the standard error of the mean is essentially the standard deviation of the distribution of sample means.
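
A small simulation can make this concrete: draw many samples from the same population, keep each sample mean, and look at how spread out those means are. The population values, sample size and seed below are arbitrary choices for the sketch.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100.0, 15.0   # hypothetical population
N = 25                           # size of each sample
N_SAMPLES = 2000                 # how many samples to draw

sample_means = [
    mean([random.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(N_SAMPLES)
]

empirical_se = stdev(sample_means)    # SD of the distribution of sample means
theoretical_se = POP_SD / sqrt(N)     # population SD / square root of N
```

The two figures should come out close to each other, which is the sense in which the standard error is the standard deviation of the sample means.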

Answer 22

The t score formula uses the estimated standard error (from the sample SD) instead of one based on the population standard deviation. It is used when we need to estimate the standard deviation (and so the standard error) of a population because we don't know it, which is most situations. In this situation we use N - 1 in the SD formula.
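
A sketch of a one-sample t score, t = (sample mean - population mean) / (s / square root of N), where s uses N - 1. The scores and the comparison mean are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def t_score(sample, population_mean):
    """One-sample t: stdev() divides by N - 1, giving the estimated SD."""
    estimated_se = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - population_mean) / estimated_se

# Hypothetical scores compared against a hypothetical population mean of 4
scores = [2, 4, 4, 4, 5, 5, 7, 9]
t = t_score(scores, 4)
```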

Answer 23

We know that 95% of sample means will be within +/-1.96 SEs (SDs of the sampling distribution) of the mean, if that distribution is normal. This holds when samples are above 100; if they are smaller you have to increase that 1.96 figure. The confidence limit is the critical value x the SE(mean) away from the sample mean, so: Upper 95% CI: sample mean + (1.96 x SE(mean)); use (-) for the lower 95% CI.
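
A sketch of the calculation from summary statistics, assuming a large sample so that 1.96 is a reasonable critical value. The function name and the example figures are assumptions.

```python
from math import sqrt

def ci95(sample_mean, sample_sd, n, critical_value=1.96):
    """Return (lower, upper) limits of the 95% confidence interval for the mean."""
    se = sample_sd / sqrt(n)
    margin = critical_value * se
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 100, SD 15, 225 participants (so SE = 1)
lower, upper = ci95(100.0, 15.0, 225)
```

For smaller samples the critical_value argument would need to be replaced with a larger figure taken from t tables.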

Answer 24

Two tailed

Answer 25

population standard deviation / square root of N

Answer 26

SD decreases | N increases

Answer 27

R²; you could add additional predictor variables to a multiple regression equation

Answer 28

The way of making a linear transformation of a normal variable into a standard normal variable
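
The transformation is z = (score - mean) / SD. A minimal sketch, with made-up IQ-style numbers:

```python
def z_score(x, mean, sd):
    """Convert a raw score to a standard score (mean 0, SD 1)."""
    return (x - mean) / sd

def from_z(z, mean, sd):
    """Convert a standard score back to the original units."""
    return mean + z * sd

z = z_score(130.0, mean=100.0, sd=15.0)   # hypothetical score on a 100/15 scale
raw = from_z(z, mean=100.0, sd=15.0)
```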

Answer 29

Measuring a psychological variable with objective numerical or categorical measures

Answer 30

- Don't have to manipulate variables - Quick and easy to administer - Potentially large number of responses - Anonymous responding may produce more truthful responses

Answer 31

- Response rate is low - internet - Difficult to correct misunderstanding (if you aren't there then people may interpret questions in different ways) - Potential influence of question order - potential influence of question wording

Answer 32

Open questions Closed questions; - Yes/No - PANAS - positive and negative affect schedule - continuum of agreement - Likert-like

Answer 33

Advantages:
- Gets all the info
- Does not lead respondent
- More naturalistic
Disadvantages:
- Can be difficult to complete
- Difficult to code and analyse
- Poor when numeric results required

Answer 34

Advantages:
- Easy to code and analyse
- Good when a numerical result is required
- Quick for respondents to complete
Disadvantages:
- Can encourage bias (if worded badly)
- Can miss possible answers
- Create opinions where none exist

Answer 35

So that you can try and eliminate error. The observed response is a combination of true response and error.

Answer 36

- Ptp must read and understand question
- Ptp must decide on their attitude
- Ptp must match their attitude to the scale in the questionnaire
Things can go wrong at any of these steps

Answer 37

- Short, clear and unambiguous: be very clear with how you define terms that can be interpreted differently, use simple language
- Use don't know or unambiguous
- Avoid double barrelled questions, i.e. only ask about one thing, and clearly do it
- Avoid quantitative statements: 'Private education is better than non-private' - could say no and mean that private was the same or worse
- Avoid leading questions
- Avoid loaded terms
- Avoid hypothetical situations
- Avoid double-negatives

Answer 38

Attitude is a continuum. 5 or 7 points give a reasonable number of choices but are not unmanageable. More than 7: labelling becomes more difficult. If you leave them unlabelled this can lead to ambiguity.

Answer 39

Very broad questions going down to narrow specific questions. This encourages completion and allows for logical expression of ideas.

Answer 40

May be several dimensions to the attitude we want to measure Minimise the effects of random error

Answer 41

A balanced scale is one where the questions are framed from both sides of a viewpoint, so do that. This can reduce acquiescence bias.

Answer 42

Tendency to agree with a statement regardless of its content

Answer 43

More than you want overall - some will not be suitable

Answer 44

The internal consistency of a test or scale. Items should correlate with each other. One set of items should correlate with another set. Use Cronbach's alpha or split-half.

Answer 45

Items with large variance are good. They will discriminate between high and low scorers.

Answer 46

How items correlate with others: look at overall correlation matrix If an item is removed then you have to relook at all the data again

Answer 47

Reject items that do not correlate with the total score. Preferred way to do this is correlate it with the total excluding the item you are measuring
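
A sketch of the preferred version, where each item is correlated with the total of the remaining items. The response matrix is made up (rows are participants, columns are items), and the helper names are assumptions.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def corrected_item_total(responses, item):
    """Correlate one item with the total score excluding that item."""
    item_scores = [row[item] for row in responses]
    rest_totals = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_totals)

# Hypothetical responses: 5 participants x 3 items
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
r_item0 = corrected_item_total(responses, 0)
```

Leaving the item out of the total stops it from correlating with itself, which would otherwise inflate the figure.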

Answer 48

Does one half correlate with the other: - Odd and even numbers - First half with the second half - Random selection

Answer 49

The coefficient for split-half reliability. 0.7 is normally adequate but 0.6 may be allowed.
(n x known reliability) / (1 + [(n - 1) x known reliability])
n is normally 2, but can be more if you need to determine how long your questionnaire needs to be.
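
A sketch of the formula, plus its rearrangement for working out how much longer a questionnaire would need to be to reach a target reliability. The function names and example values are assumptions.

```python
def spearman_brown(known_reliability, n=2):
    """Predicted reliability when the test is made n times as long."""
    return (n * known_reliability) / (1 + (n - 1) * known_reliability)

def length_factor_needed(known_reliability, target_reliability):
    """Rearranged formula: how many times longer the test must be."""
    return (target_reliability * (1 - known_reliability)) / (
        known_reliability * (1 - target_reliability)
    )

# Hypothetical correlation of 0.6 between the two halves
full_test = spearman_brown(0.6)            # n = 2: stepping up from half to whole
factor = length_factor_needed(0.6, 0.75)   # how much longer for 0.75?
```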

Answer 50

The average of every correlation between every possible half of the items with every other possible half Best measure of reliability - does not depend on one particular split

Answer 51

Add all the Spearman-Brown coefficients and divide by the number of them.
Using mean squares: (between-people variance - error variance) / between-people variance
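
For comparison, a sketch using a different but standard way of writing alpha: k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. This is swapped in for the mean-squares form on the card because it is easier to compute by hand; the data and names are made up.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one row per participant, one column per item."""
    k = len(responses[0])
    item_columns = list(zip(*responses))
    item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses: 5 participants x 3 items
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
```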

Answer 52

Item analysis works on the assumption that we are only measuring one construct - and all questions relate to that. Factor analysis is measuring the different factors that make up your questionnaires

Answer 53

Data reduction statistical technique Takes a large set of variables and reduces it using smaller set of factors/components that are independent of each other.

Answer 54

Deductive:
- All As are Bs
- All Bs are Cs
- So all As are Cs
Inductive:
- A1 is B, A2 is B, A3 is B
- So all As must be B: you are generalising from a finite number of observations
Abductive reasoning (factor analysis):
- Devising a theory from observations (but not with direct testing)
- The surprising fact C is observed. If A were true, C would be a matter of course. Hence there is reason to suspect A is true
- Not conclusively true, however

Answer 55

Hypothetical variable assumed to underlie a group of highly correlated items. The greater the loading, the more that factor explains the variance of those items. An item's 'loading' is a correlation coefficient and ranges from -1 to +1.

Answer 56

Trying to understand the underlying dimensions and psychological processes behind the responses Have as clear a solution as possible

Answer 57

Exploratory: - Highlight factors within a set of responses Confirmatory: - Used to test whether a set of data fits a pre-existing pattern of factors

Answer 58

Extraction:
- Determines how many factors underlie the data
- Normally principal components analysis
Rotation:
- Determines the loading of each item
- Either orthogonal or oblique
Orthogonal: assumes each factor is unique; the theoretical model may suggest independent variables (e.g. varimax)
Oblique: more often used; determines the relationship of factors to one another

Answer 59

Look at the correlation matrix: want a few correlations above 0.3. Bartlett's test of sphericity: should be significant (p < 0.05). KMO (Kaiser-Meyer-Olkin): should be above 0.6.

Answer 60

Look at the eigenvalue: factors with a value above 1 will be retained. Scree test: at which point does the plot become horizontal?

Answer 61

Tries to present the factors so that they are easiest to interpret, so that items load on the factors that they are most correlated with.

Answer 62

Correlation between two scores at two different times of testing. This can be affected by:
- Internal reliability
- External factors, such as mood or fatigue

Answer 63

Characteristics that are stable and do not vary much in the short term Others do vary e.g. mood

Answer 64

Attempt to reduce any carry-over effect from doing the test before Can give another version of the test Limited by internal reliability of the two forms of the test

Answer 65

The extent to which the test measures what it claims to measure.

Answer 66

Subjective test of whether the items appear to be measuring what you think they are

Answer 67

Trying to get a full range of the content the items should be measuring Subjective Seek out experts or people who are similar to those you are trying to measure

Answer 68

Comparing your test to a previously established test. May be a more established and accepted test of the same variable, e.g. physiological measures.

Answer 69

The ability of the test to predict future events, e.g. if intelligence predicts employment. Not necessary for a test, but may add to validity.

Answer 70

Convergent: measures of the same variable must all correlate with each other, regardless of the type of measure e.g. interview should correlate with questionnaire and physiological measures. Discriminant validity is the opposite, there should be little correlation between a test and measures of DIFFERENT variables

Answer 71

A form of validity when the construct is not assessed with respect to external criteria. A test that you believe is the same thing you believe in.

Answer 72

The idea that a new construct has few specifiable associations in which to pin down the construct. The construct will send out roots in many directions attaching it to associations as research proceeds. The construct will change and become more defined and rigid as research goes on.

Answer 73

TRICK QUESTION. HA. You Can't. The whole investigation is a test in a way or some other suitably fluffy bollocks explanation. Martin goes on about this for about 15mins.

Answer 74

The size of the relationship between two variables.

Answer 75

R squared or eta squared (correlation) is an effect size.
Cohen's d: look at the difference between two means and standardise it (divide by the SD), so you can compare across studies:
d = (m1 - m2) / SD
d = 0.2 is small, 0.5 is medium, 0.8 is large
Can report as SDs or in the original units
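
A sketch of Cohen's d from summary statistics. The card just says SD; the pooled SD of the two groups is used here as one common choice, which is an assumption, as are the function names and the example figures.

```python
from math import sqrt

def cohens_d(m1, m2, sd1, sd2, n1, n2):
    """Standardised difference between two means, using the pooled SD."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

def label(d):
    """Rough verbal label using the 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Hypothetical groups: means 105 and 100, both SD 10, 30 participants each
d = cohens_d(105.0, 100.0, 10.0, 10.0, 30, 30)
```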

Answer 76

Type 1 error:
- Finding a significant result when it is not really there.
Type 2 error:
- Failure to find a significant result when there is one.

Answer 77

Likelihood of finding an effect when one does really exist: A study with a power of 0.8 has an 80% chance of finding a significant effect when it does actually exist.

Answer 78

Effect size - the larger the better
Sample size - the larger the better
Alpha level - the less stringent the more power, however this increases the chance of a type 1 error. So can't change.
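
A rough sketch of how the three factors move power, using a normal approximation for a two-group comparison with equal group sizes and a two-tailed test. The approximation, the function name and the example numbers are assumptions; proper power software uses the t distribution and will give slightly different figures.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect effect size d with n_per_group in each group."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    return NormalDist().cdf(d * sqrt(n_per_group / 2) - z_crit)

baseline = approx_power(0.5, 64)                  # medium effect, 64 per group
bigger_effect = approx_power(0.8, 64)
bigger_sample = approx_power(0.5, 128)
looser_alpha = approx_power(0.5, 64, alpha=0.10)
```

Each of the three changes raises power above the baseline, but only the sample size is normally under the researcher's control.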

Answer 79

Could do a pilot. Get the average effect size from other research (meta-analysis). If the research is new then the effect size might be new (you don't know). On the side, can do a cost-benefit analysis: the minimum effect size that you need to see for the study to be worthwhile.

Answer 80

Define variables of interest.
Plan database search - inclusion and exclusion criteria.
Calculate effect sizes from other research.
Combine them: convert to z scores, calculate the average z score, then convert back.
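
A sketch of the combining step, assuming the effect sizes are correlations and that the z in question is Fisher's z (z = atanh(r), converted back with tanh). The card does not say either of those things explicitly, and the simple unweighted average and the example correlations are also assumptions.

```python
from math import atanh, tanh
from statistics import mean

def average_correlation(rs):
    """Average correlations via Fisher's z, then convert back to r."""
    z_scores = [atanh(r) for r in rs]   # convert each r to z
    return tanh(mean(z_scores))         # average z, then back to r

# Hypothetical effect sizes from three previous studies
combined = average_correlation([0.2, 0.5, 0.35])
same = average_correlation([0.3, 0.3])
```

Real meta-analyses usually weight each study by its sample size rather than taking a plain average.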

Answer 81

- Less stringent alpha (not gonna happen)
- Increase sample size
- Reduce noise (reduce SD): standardise procedures, more reliable measures, repeated measures design
- Focused (planned) contrasts rather than an omnibus test: use a test that only looks for a linear relationship, not ANOVA
- Combine results of individual studies

Answer 82

A psychological tendency expressed by evaluating a particular entity with some degree of favour or disfavour. Latent. Complex. Not the responses, but the attitude underlying that response.

(107 cards)