WEEK 2 - Measuring variables, sampling, validity and reliability Flashcards
Why is generalisability in sampling important?
Generalisable results are results that reflect the true state of affairs in the population of interest - to claim this your sample needs to be as representative of the population as you can make it
What is ‘population’?
The totality to whom/which you wish to generalise your study findings
What is ‘sample’?
the participants in your study
What are the two types of sampling procedures?
Probability sampling - simple random, systematic random, stratified, multi-stage cluster
Non-probability sampling - convenience, snowball, purposive
What is probability sampling?
- A way to ensure that your sample is representative of the population (on the characteristics deemed important to the study)
- Basic principle: A sample will be representative of the population if all members of the population have an equal chance of being selected for the sample
- Allows the researcher to calculate the relationship between the sample and the population
What are the types of probability sample?
- Simple Random sample
- Systematic random sample
- Stratified random sampling
- Multi-stage cluster sampling
What is a simple random sample?
- each member has an equal and independent chance of being selected
- define the population, list all members, assign numbers then randomly select number
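A minimal sketch in Python of drawing a simple random sample, assuming you already have a complete numbered list (sampling frame) of the population; the population and sample sizes are made up for illustration.

```python
import random

# Hypothetical sampling frame: every member of the population, numbered 1..N
population = list(range(1, 10001))   # e.g. a list of 10,000 members

random.seed(42)                      # fixed seed so the draw is reproducible
# Each member has an equal and independent chance of being selected
sample = random.sample(population, k=1000)

print(len(sample), sample[:5])
```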
What is systematic random sampling?
- Every kth person
- Randomly select the first person, then divide the size of the population by the size of the desired sample and use this to determine the interval at which the sample is selected
**size of population/size of desired sample **
Example: to select a sample of 1000 people from a list of 10 000 randomly select the first person and then select every 10th person from the list
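A Python sketch of the same procedure, using the 1,000-from-10,000 example above; the list contents are placeholders.

```python
import random

population = list(range(1, 10001))        # placeholder list of 10,000 members
desired_sample_size = 1000

# Interval = size of population / size of desired sample
k = len(population) // desired_sample_size    # 10,000 / 1,000 -> every 10th person

# Randomly select the first person from within the first interval,
# then take every kth person after that
start = random.randrange(k)
sample = population[start::k]

print(k, len(sample))
```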
What is stratified sampling?
- If you want to make sure the profile of the sample matches the profile of the population on some important characteristics (for example, age or ethnicity)
- Researcher divides population into sub-populations (strata) and then randomly samples from strata
Why do we use stratified sampling?
- Can reduce sampling error by ensuring ratios reflect actual populations (example -ratio of different ethnic groups)
- To ensure that small sub-populations are included in the sample
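A rough Python sketch of proportionate stratified sampling, assuming each member's stratum (e.g. ethnic group) is already known; the group labels and sizes are invented.

```python
import random
from collections import defaultdict

random.seed(1)
# Hypothetical population records: (person_id, stratum such as ethnic group)
population = [(i, random.choice(["group_A", "group_B", "group_C"])) for i in range(10_000)]
total_sample_size = 500

# 1. Divide the population into strata
strata = defaultdict(list)
for person_id, stratum in population:
    strata[stratum].append(person_id)

# 2. Randomly sample within each stratum in proportion to its size,
#    so the sample profile matches the population profile
sample = []
for stratum, members in strata.items():
    n = round(total_sample_size * len(members) / len(population))
    sample.extend(random.sample(members, n))

print({s: len(m) for s, m in strata.items()}, len(sample))
```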
What is multi-stage cluster sampling?
- Begins with a sample of groupings (clusters), then samples individuals within them
Example: a rural sample
- Define the rural sample as towns with populations < X
- Get a listing of all relevant towns
- Take a random sample of towns
- Randomly sample people from within the randomly sampled towns
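A Python sketch of the rural example above, with made-up towns and residents: towns (clusters) are sampled first, then people within the sampled towns.

```python
import random

random.seed(7)
# Hypothetical listing of rural towns (population < X) and their residents
towns = {f"town_{i}": [f"town_{i}_resident_{j}" for j in range(200)] for i in range(50)}

# Stage 1: take a random sample of towns (the clusters)
sampled_towns = random.sample(list(towns), k=5)

# Stage 2: randomly sample people within each sampled town only
sample = []
for town in sampled_towns:
    sample.extend(random.sample(towns[town], k=20))

print(sampled_towns, len(sample))   # 5 towns, 100 people
```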
What is the difference between multi-stage cluster sampling and stratified sampling?
Multi-stage cluster is not the same as stratified sampling as each cluster does not need to be sampled.
What is multi-stage / multi-phase sampling?
- A larger sample is obtained first to identify members of a sub-sample
- The sub-sample is then randomly chosen from that larger sample
Example: a large community survey in Australia asks whether respondents have been diagnosed with disease X; those with disease X are then followed up as the sub-sample
What is non-probability sampling?
- Not every member of the population has equal chance of being part of the sample
Why use non-probability sampling?
There are no lists for some populations under study, for example:
- The homeless
- Certain occupations (e.g. farmers)
- Hidden or specific populations (e.g. farmers with mental health issues)
- Convenience/resource restrictions
Types of non-probability samples
- Convenience sample
- Snowball sample
- Quota sample
- Purposive sample
What is a convenience sample?
A sample of available participants
Example: students enrolled in a particular course or people passing a particular location
What are the advantages and disadvantages of convenience sampling?
Advantages:
Easy
Inexpensive
Disadvantages:
No control over representativeness
Bias
What is snowball sampling?
- Involves collecting data with members of the population that can be located and then asking those members to provide information/contacts for other members of the population
- Used mainly for hard-to-study populations
for example: homeless young people, or people with characteristics that are not commonly listed
What is a quota sample?
- The non-probability sampling equivalent of a stratified random sample
- You want to reflect the relative proportions of the population, but you don’t/can’t sample randomly from each stratum as you do in stratified random sampling
What is purposive/judgment sampling?
- Selecting a sample based on knowledge of the population, its elements and the purpose of the study
- Clear purpose to sampling strategy. Select key informants, atypical cases, deviant cases or a diversity of cases
Example: If a study aimed to find problems experienced by new immigrants it may sample key people involved in agencies that help immigrants such as ethnic welfare groups, community immigration legal aid groups
Why is purposive sampling often used?
- To select cases that may be especially informative
- Select cases in a difficult to reach population
- Select cases for in depth investigation
Which method of sampling do I use?
- best method is normally a probability sampling one (as the aim of research is to generalise findings to the population)
- Sometimes different sampling methods aren’t feasible given resources, time etc.
How do you determine sample size?
- Largely determined by the analysis you plan to conduct with the data derived
- Generally: the more complex the analysis the larger the sample
When are larger sample sizes needed?
- When the sample is heterogeneous (when the sample is composed of widely different people)
- When you want to break down the sample into subcategories (e.g. look at gender separately)
- If you want to obtain a more narrow or precise confidence interval
- when you expect a small effect or weak relationship
- for some statistical techniques
What are the five simple rules for determining sample size?
- If the population is fewer than 100, use the entire population
- Larger sample sizes make it easier to detect an effect or relationship in the population
- Compare to other research studies in the area by doing a literature review
- Use a power table for a rough estimate
- Use a sample size calculator (e.g. G*Power)
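As an alternative to a stand-alone calculator such as G*Power, a rough sketch using Python's statsmodels package (assuming it is installed); the effect size, alpha and power values are illustrative, not recommendations.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Participants needed per group for an independent-samples t-test,
# assuming a medium effect (d = 0.5), alpha = .05 and 80% power
print(round(analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)))   # ~64 per group

# A smaller expected effect (d = 0.2) needs a much larger sample, as noted above
print(round(analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)))   # several hundred per group
```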
What is a metric? (levels of measurement)
When we want to measure something (e.g. religion, self-esteem, tennis ability), we need to choose a metric with which we can measure it
The metric will determine the statistical analysis we will perform
What are the levels of measurement?
Nominal
Ordinal
Ratio
Interval
(NORI)
What is nominal?
Something which is purely categorical information (quality or kind of something)
Example: religion, gender
What is ordinal?
A rank order
Ordinal variables do indicate an underlying quantity, but they do not obey mathematical laws (you cannot meaningfully subtract or divide them)
What is interval?
A true number in the sense that there are equal intervals implied but no true zero point - example: temperature in degrees Celsius
What is ratio?
A true number. The distinguishing feature of a ratio scale variable is that it has a meaningful zero point, which participants could use to indicate the quantity is completely absent
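A small illustrative sketch of how the four metrics might be represented in Python with pandas; the variables and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    # Nominal: pure categories (quality or kind), no order
    "religion": pd.Categorical(["Buddhist", "Catholic", "None"]),
    # Ordinal: rank order, but intervals between ranks are not equal
    "tennis_level": pd.Categorical(["novice", "club", "elite"],
                                   categories=["novice", "club", "elite"], ordered=True),
    # Interval: equal intervals but no true zero (0 degrees C is not "no temperature")
    "temp_celsius": [18.5, 21.0, 25.5],
    # Ratio: equal intervals and a meaningful zero point (0 = completely absent)
    "sick_days": [0, 3, 7],
})

print(df.dtypes)
```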
What is an issue with reliability and validity?
The issue is that you can’t assess these until after you have developed your questionnaire and used it. Therefore, a pilot test can be beneficial
Many people choose to use established measures instead of developing their own
What are the two broad types of validity
Internal validity
External validity
Why is validity an issue?
- Many (if not most) variables in social research cannot be directly observed
example: motivation, satisfaction, helplessness
Therefore, the challenge is to make a judgment call on whether we are measuring what we think we are measuring
What are the types of validity?
Face validity
content validity
criterion-related validity - (concurrent validity +predictive validity)
construct validity - (convergent + divergent)
What is face validity?
- Asks the question: on the face of it, does my measure seem to relate to the construct?
- Measures that lack face validity have the potential to alienate research participants (they may wonder what you are really trying to measure)
- A weak, subjective method for assessing validity, but a useful first step
What is content validity?
- The extent to which the measure represents a balanced, adequate sampling of the relevant dimensions
- Considers what should go into a measure and what should stay out - considers boundaries
- How much does the measure cover the content of the definition?
Example: which of the following would be a more valid test of mathematical ability:
- a 20-question test containing addition problems
- a 20-question test containing addition, subtraction and multiplication problems
What is criterion-related validity?
- involves checking the performance of your measure against some external criterion
What are the two types of criterion-related validity?
Concurrent - Establish the validity of your measure by comparing it to a “gold standard” (i.e., Existing validated measure of the same construct)
Predictive - does the measure predict/relate to some criterion that you would expect it to predict
What is predictive validity (in criterion-related validity)
- Does the measure predict something that it is theoretically supposed to predict
- Does the measure differentiate between people in a way that you would expect
- What should a measure of the following constructs predict?
- IQ -> perhaps some cognitive-based performance task
- Workplace depression scale -> number of mental health sick days
What is construct validity
Demonstrating that the measure relates to the theoretical construct of interest
What are the two types of construct validity
convergent
divergent
What is convergent validity (in construct validity)
Demonstrates that the measure relates to other similar measures
What is divergent validity (in construct validity)
Demonstrates that the measure does not relate to unrelated constructs
Summary of validity types
Face - in the judgment of others, items appear to relate to the construct
Content - captures the entire meaning (all elements of the definition) of a construct
Criterion - agrees with an external source
Concurrent - agrees with an existing gold-standard measure
Predictive - agrees with future behaviour
Construct - how well multiple indicators relate to each other (consistent with theory)
Convergent - similar measures (or measures of theoretically related constructs) are related
Divergent - different measures are unrelated
What is reliability?
- The consistency or repeatability of your measurement
What are the types of reliability?
- Stability of the measure (test-retest)
- Internal consistency of the measure (split-half, Cronbach’s alpha)
- Agreement or consistency across raters (inter-rater)
What is test-retest reliability?
- Addresses the stability of your measure
- you administer the measure at one point in time and then you give the same measure to the same participant at a later point in time
- You correlate the scores on the two measures
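A minimal Python sketch of the calculation, assuming the same measure has been scored for the same participants at two time points; the scores are fabricated.

```python
import numpy as np

# Fabricated scores for the same 8 participants at time 1 and time 2
time1 = np.array([12, 18, 25, 30, 22, 15, 28, 20])
time2 = np.array([14, 17, 27, 29, 21, 16, 26, 22])

# Test-retest reliability: correlate the two administrations
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 2))   # closer to 1 = more stable measure
```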
What is the problem with test-retest?
- memory effect
- practice effect (performance improves because the person has had practice taking the test before)
- Another thing to consider is the time interval between administrations: if it is too short there is a greater risk of memory effects; if it is too long there is a risk of other variables (e.g. additional learning) influencing the results
What is split-half reliability?
- Administer a battery of questions
- Split the measure into two halves
- Correlate the scores on the two halves of the measure
- A higher correlation means greater reliability
- Strengths: eliminates memory and practice effects
- Limitations: are the two halves equivalent?
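A Python sketch on fabricated item scores, splitting into odd- and even-numbered items; the Spearman-Brown step is a standard correction for each half being only half the test length (not mentioned above, but commonly applied).

```python
import numpy as np

rng = np.random.default_rng(0)
# Fabricated responses: 20 participants x 10 items tapping one construct
# (each person's item scores vary around their own "true" level)
true_level = rng.normal(3, 1, size=(20, 1))
items = true_level + rng.normal(0, 0.7, size=(20, 10))

# Split the measure into two halves (odd vs even items) and score each half
half1 = items[:, 0::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)

# Correlate the scores on the two halves
r = np.corrcoef(half1, half2)[0, 1]

# Spearman-Brown correction: estimates the reliability of the full-length test
split_half_reliability = 2 * r / (1 + r)
print(round(r, 2), round(split_half_reliability, 2))
```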
What is inter-item reliability?
- Assesses the internal consistency of the measure, i.e. tells you how well the items or questions in your measure appear to reflect the same underlying construct
- You will get good internal consistency if individuals respond in approximately the same way across your survey
- Cronbach’s alpha can range from 0 (when the items are not correlated with one another) to 1.00 (when all items are perfectly correlated with each other). The closer the alpha is to 1.00, the better the reliability of the measure
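A Python sketch that computes Cronbach's alpha directly from its standard formula on fabricated item data (most statistics packages also provide this ready-made).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = participants, columns = scale items."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Fabricated data: 20 participants answering 5 items on the same construct
rng = np.random.default_rng(1)
true_level = rng.normal(3, 1, size=(20, 1))
data = true_level + rng.normal(0, 0.8, size=(20, 5))

print(round(cronbach_alpha(data), 2))   # closer to 1.00 = better internal consistency
```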
What is inter-rater or inter-observer reliability?
- Checking the match between two or more raters or judges in your study
Calculations of inter-rater reliability
- Nominal or ordinal scale: the percentage of times different raters agree
- Interval or ratio scale: a correlation coefficient between raters’ scores
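A Python sketch of both calculations on fabricated ratings from two raters: percentage agreement for nominal/ordinal codes, and a correlation coefficient for interval/ratio scores.

```python
import numpy as np

# Nominal/ordinal codes from two raters: percentage of times they agree
rater1_codes = ["yes", "no", "yes", "yes", "no", "yes"]
rater2_codes = ["yes", "no", "no",  "yes", "no", "yes"]
agreement = sum(a == b for a, b in zip(rater1_codes, rater2_codes)) / len(rater1_codes)
print(f"{agreement:.0%} agreement")      # 5 of the 6 codes match

# Interval/ratio scores from two raters: correlation coefficient
rater1_scores = np.array([7.5, 6.0, 8.5, 5.0, 9.0])
rater2_scores = np.array([7.0, 6.5, 8.0, 5.5, 8.5])
print(round(np.corrcoef(rater1_scores, rater2_scores)[0, 1], 2))
```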
What kind of reliability coefficients should I be aiming for?
- Test-retest coefficients > .70
- Internal consistency >.70 (but ideally much higher)
- Rating consistency >.90
Reliability and measurement error
- Measurement error serves to weaken our statistical tests
- All other things being equal, more error in measurement means lower power
- Choosing a measure which is highly reliable decreases measurement error and increases the power of your design
Can a measure be reliable but not valid?
Yes! You could have a consistent measure that does not actually measure the construct
Can a measure be valid but not reliable?
Yes.
Example of a valid tool that is unreliable: something that is difficult to implement (e.g., skin fold tests, which require technical skill) may be unreliable across multiple administrators.
Summary of reliability types
Test-Retest - Same question given on different occasions and data correlated
Split half - Split questions in half and correlate data from two halves
Inter-item reliability - Overall correlation between items in scale
Inter-rater - Checking for agreements between multiple raters or judges