Experimental Validity & Measurement Flashcards

1
Q

Review of last class:

Internal Validity
External Validity
Probabilistic Knowledge
Maturation
Testing Effects
Statistical Regression to the mean
Selection of participants
Hawthorne Effect

A
2
Q

What is criterion referenced? (2)

A

Individual’s performance compared to some absolute level of performance set by a researcher

E.g. set a minimum score, set performance expectations, have age of acquisition expectation

3
Q

What is the difference between Validity and Reliability?

A

Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).

Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure)

4
Q

What is norm referenced? (2)

A

Individual performance compared to group norms

Standardized: raw scores are transformed in some way, smoothed to fit the ‘normal curve’ (e.g., z-scores, T-scores, percentiles)

5
Q

What is Measurement Reliability?

A

Stability, consistency of the measurement

6
Q

What are the four types of test reliability?

A

Intrasubject reliability
Intrarater reliability
Interrater reliability
Test-retest reliability

7
Q

What is Intrarater reliability?

A

Consistency of the data recorded by one rater over several trials

8
Q

What is Intrasubject reliability?

A

The reproducibility of the identical responses (answers) to a variety of stimuli (items) by a single subject in two or more trials.

9
Q

What is interrater reliability?

A

The degree of agreement among independent observers who rate, code, or assess the same phenomenon.

10
Q

Reliability is reported as (3)

A

correlation, standard error of measurement (SEM) or % agreement
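As a minimal sketch of how these three indices can be computed (the ratings below are made-up; SEM is estimated here with the common formula SD × √(1 − r)):

```python
import statistics as stats
from math import sqrt

# Hypothetical scores from two raters on the same 8 items
rater_a = [4, 5, 3, 4, 2, 5, 4, 3]
rater_b = [4, 5, 3, 3, 2, 5, 4, 4]

# Percent agreement: proportion of items rated identically
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Pearson correlation between the two raters, computed by hand
mx, my = stats.mean(rater_a), stats.mean(rater_b)
cov = sum((a - mx) * (b - my) for a, b in zip(rater_a, rater_b))
r = cov / sqrt(sum((a - mx) ** 2 for a in rater_a)
               * sum((b - my) ** 2 for b in rater_b))

# SEM from one rater's SD and the reliability coefficient
sem = stats.stdev(rater_a) * sqrt(1 - r)

print(f"% agreement = {agreement:.2f}, r = {r:.2f}, SEM = {sem:.2f}")
```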

11
Q

What is validity? (2)

A

The degree to which an instrument measures what it is intended to measure

How appropriate, meaningful, and useful are the inferences drawn from the measure

12
Q

What does high validity imply? (3)

A

A measurement is relatively free from error.
A valid test is also reliable.
A reliable measure is not necessarily valid.

13
Q

What are the 3 types of Validity?

A

Construct validity
Content validity
Criterion-related validity

14
Q

What is construct validity? (3)

A

Construct validity refers to whether a test adequately measures an abstract construct. Examples are constructs such as intelligence, level of emotion, proficiency, or ability.

15
Q

What are the methods for establishing construct validity? (2)

A

Known Groups Method
Factor analysis

16
Q

Define known groups method: (2)

A
  • One of the methods of establishing construct validity
  • The degree to which an instrument can demonstrate different scores for groups known to vary on the variable being measured

e.g., An intelligence test should differentiate individuals with Down syndrome (DS) from typically developing (TD) individuals
e.g., A measure of functional independence
- Should decrease with increasing age in seniors
- Should be related to the level of care needed
- Should be related to severity of impairment
e.g., A measure of functional hearing
- Should decrease with hearing level (dB)
- Should decrease in older adults

17
Q

Define Factor Analysis:

A

A construct is made up of a variety of dimensions
- Each dimension can be assessed using a variety of tasks
- If we measure a series of variables/items associated with a construct:
  - Some variables would be highly correlated
  - Some would have little correlation
  - Performance on variables that are similar would cluster together
  - The array of clusters defines the construct

18
Q

What is Content Validity?

A

Refers to the extent to which a measure represents all facets of a given construct.

19
Q

What do we mean that content validity “indicates how adequately/fully an instrument samples the variable being measured”? (2)

A

Samples all aspects of the construct
Reflects the relative importance of each part

20
Q

What do we mean by “establishing content validity is essentially a subjective process”? (3)

A
  • No statistic measures content validity
  • Content validity is established through expert opinion, review of literature, operational definitions of the test variables
  • Specific to the stated objectives
21
Q

Explain Face Validity: (3)

A

A weak type of content validity

A subjective assessment that an instrument appears to test what it is supposed to test; usually not quantified
An attempt to quantify it: the # of raters who assess the test as having face validity

22
Q

What is Criterion-Related validity?

A

Extent to which one measure is related to other measures or outcomes.

23
Q

Explain Criterion-Related Validity: (4)

A
  • Most objective measure of validity
  • Often assessed by correlating performance on the measure of interest and the criterion measure
  • Only useful if the criterion measure is stable and valid
  • The target and the criterion measurements must be measured independently and without bias (e.g. blinded)
24
Q

What are the two types of criterion-related validity?

A

Concurrent Validity
Predictive Validity

25
Q

Describe Concurrent Validity: (2)

A

- One of the two types of Criterion-Related Validity
- Two measures are collected at relatively the same time and the performance on each is related

e.g., # of words produced in a spontaneous speech sample and PPVT-V scores
26
Q

Describe Predictive Validity: (2)

A

- One of the two types of Criterion-Related Validity
- The measure of interest is collected earlier in time and related to a criterion measure collected later

e.g., does a parent-report measure of vocabulary at 2 years predict PPVT-V scores at 4 years?
27
Q

A test with good validity must be able to show two things; what are they?

A

Convergent validity
Discriminant (divergent) validity
28
Q

What is Divergent validity?

A

Low correlation with tests that measure different constructs
29
Q

What is Convergent validity?

A

High correlation with tests that measure the same construct
30
Q

What are threats to Measurement Reliability and Validity?

A

Ambiguous, unclear, inconsistent instructions
Observer bias
Reactivity
Floor and ceiling effects (validity only)
31
Q

How can Observer bias affect reliability and validity? (2)

A

Confirmatory bias
e.g., Identification of hypernasality before/after pharyngeal flap surgery in a cleft-palate patient
Carryover effects
e.g., the first scoring affects the second scoring
33
Q

How can reactivity affect validity and reliability?

A

Influences that distort the measurement
e.g., a participant’s awareness of measurement
34
Q

How can floor and ceiling effects affect validity? (2)

A

Decreases the variability of a measure
Particularly difficult when measuring change over time
35
Q

What is sampling? (4)

A

A sample is drawn from a target population
There are practical difficulties in accessing a target population
Therefore, we select from an accessible population
Participation is voluntary
36
Q

In order to generalize to a target population, the sample must be: (2)

A

Representative: same relevant characteristics
In the same proportions
37
Q

What is sampling bias?

A

Introduced when certain characteristics in the sample are over- or under-represented relative to the target population
39
Q

Explain conscious sampling bias: (2)

A

- Purposeful selection
- Strategic limiting of the population of interest
e.g., Election poll: voters
e.g., High-functioning autism
- Limits generalization in predictable ways
- Essentially saying the population of interest is a subset of the total population; only generalize to that subset
40
Q

Explain unconscious sampling bias: (3)

A

- It is a problem
- It is unplanned and unpredictable
e.g., Election poll: how people are reached (land lines, cell phones, internet)
e.g., People who respond to polls
- Limits generalization in unpredictable ways
41
Q

How can we limit unconscious bias?

A

Probability Sampling
42
Q

Explain Probability Sampling: (2)

A

Better control for unconscious bias than non-probability sampling
Randomized selection procedures are used
43
Q

Why do we use randomized selection procedures? (4)

A

- Limits unconscious sampling bias
- Assures that every member of the population has an equal chance of being chosen
- Outcomes are more generalizable
- Controls selection bias and therefore sampling error
BUT no sampling is error free, so this does not guarantee representativeness
Any sampling error is assumed to be due to chance
Theoretically possible, but participation is always a choice for ethical reasons
44
Q

When do we use Non-Probability sampling?

A

- When probability sampling is not possible
- Randomized selection procedures are NOT used (often because of population access difficulties, i.e., clinical populations)
- Sampling bias is not controlled (therefore we cannot assume the sample represents the characteristics of the larger population, which limits generalizability)
45
Q

What are 4 types of probability sampling techniques?

A

Simple random
Systematic
Stratified
Cluster
46
Q

Explain Simple random sampling: (2)

A

- Each member of the population has an equal chance of being selected, and selection of each is independent
- Simple random selection, without replacement
e.g., all clinicians in a membership list
Use a random number generator (www.random.org)
Randomly select a start point and a direction of movement; sample consecutively up to the required number (e.g., n = 15)
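The procedure above can be sketched with the stdlib `random` module (the clinician list is hypothetical):

```python
import random

# Hypothetical membership list of 40 clinicians
population = [f"clinician_{i}" for i in range(1, 41)]

random.seed(7)  # seeded only so the illustration is reproducible
# Simple random sample without replacement: every member has an
# equal, independent chance of selection
sample = random.sample(population, k=15)
print(sample)
```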
47
Q

Explain Systematic probability sampling:

A

Less laborious, more convenient
Applies to ordered lists
e.g., an alphabetized list of Kindergarteners in a school board
Randomly select a start point
Sample using a predetermined sampling interval, e.g., every 8th entry
A problem arises if the list is ordered in some significant way: ensure it is randomly ordered or ordered on an irrelevant factor
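A minimal sketch of the random-start, fixed-interval procedure (the roster is made-up):

```python
import random

# Hypothetical alphabetized list of 120 kindergarteners
roster = [f"child_{i:03d}" for i in range(120)]

random.seed(1)
interval = 8                        # predetermined sampling interval
start = random.randrange(interval)  # random start point within the first interval
sample = roster[start::interval]    # every 8th name from the start point
print(len(sample), sample[:3])
```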
48
Explain Stratisfied random probability sampling:
Used to insure to ensure sample has same proportion of subgroups in as in population or to get adequate numbers of subgroups Improves the representativeness of the sample and precision of outcomes Based on knowledge of variations of a characteristic in the population e.g., ASD population: 4:1 male to female ratio Partition the population into non-overlapping strata (levels, subsets) e.g., males and females Randomly sample in proportion to the distribution in the population Note: Choose the stratification variable carefully: relevance to the study
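Proportional stratified sampling can be sketched as follows (the 4:1 sampling frame is invented to match the ASD example):

```python
import random

random.seed(3)
# Hypothetical sampling frame with the stated 4:1 male:female ratio
males = [f"m_{i}" for i in range(400)]
females = [f"f_{i}" for i in range(100)]

n = 50  # total sample size
# Sample each stratum in proportion to its share of the population
n_m = round(n * len(males) / (len(males) + len(females)))  # 40
n_f = n - n_m                                              # 10
sample = random.sample(males, n_m) + random.sample(females, n_f)
print(len(sample))
```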
49
Q

Explain Disproportional probability sampling:

A

- A type of stratified sampling
- If a population subset of interest occurs infrequently enough to threaten statistical power, we can sample disproportionately
e.g., Population = 6000 SLPs in Canada (CIHI)
Male-to-female ratio = 1:30, but we may be interested in having male responses represented
With simple random sampling, few males would be selected (n = 100 gives 3.3 males)
Instead, select 50 males and 50 females
Then statistically weight the male scores to represent their proportional distribution in the larger population
51
Q

Explain Cluster random sampling:

A

Uses naturally occurring groups as sampling units
Often a population is too large or dispersed to obtain a complete listing of possible participants
e.g., Population: children in elementary schools in NS
A cluster or multi-stage sampling method:
Randomly sample ‘families of schools’ (clusters), e.g., schools
Randomly sample students within schools
52
Q

What are three types of Nonprobability sampling?

A

Convenience
Snowball
Purposive
53
Q

Explain Convenience sampling:

A

- Participants are chosen as they become available (volunteers)
- Self-selection introduces bias, i.e., why did they volunteer?
54
Q

Explain Quota sampling: (2)

A

- A convenience sample with restrictions
- Controls for potential confounds from known population characteristics
e.g., male:female ratio in ASD
55
Q

Explain Purposive nonprobability sampling:

A

- Hand-pick subjects on the basis of certain characteristics
e.g., chart review, participation in an intervention
- Used in qualitative research
56
Q

Explain Snowball non-probability sampling:

A

Existing participants recruit or refer further participants
e.g., word of mouth
57
Q

How do we assign subjects to groups? (4)

A

- Random: by individual, or by block (e.g., different blocks for severity level)
- Systematic
- Consecutive
- Matched
58
Q

What are the two main types of statistics?

A

Descriptive
Inferential
59
Q

What are the levels of measurement?

A

Nominal (names with no order)
Ordinal (ranked order)
Interval (equal intervals with no true 0)
Ratio (equal intervals with a true 0)
60
Q

Discrete data refer to (2)

A

Nominal and Ordinal
61
Q

Continuous data refer to (2)

A

Interval and Ratio
62
Q

What is a frequency distribution?

A

The number of times each value occurs in the data set
63
Q

What are the three measures of central tendency?

A

Mean
Median
Mode
64
Q

Which type of scales would the mean use?

A

Interval or ratio
65
Q

Which type of scales would the median use? (2)

A

Ordinal data
Interval or ratio if the distribution is not normal
66
Q

Which type of scales would the mode use? (2)

A

Nominal data
Ordinal data
67
Q

Explain skewness and measures of central tendency

A

In a skewed distribution the mean is pulled toward the tail, so the mean, median, and mode diverge; the median is then the preferred measure of central tendency
68
Q

What are the 3 measures of variability?

A

Range
Variance
Standard Deviation
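The three measures above can be computed with the stdlib `statistics` module (the scores are made-up):

```python
import statistics as stats

scores = [12, 15, 9, 14, 10, 15, 11]  # hypothetical data

value_range = max(scores) - min(scores)  # range
variance = stats.variance(scores)        # sample variance (n - 1 denominator)
sd = stats.stdev(scores)                 # standard deviation = sqrt(variance)

print(value_range, round(variance, 2), round(sd, 2))
```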
69
Q

What is the normal curve?

A

A bell-shaped probability curve that gives the probability distribution of a data set
70
Q

What is the Standard Error of Measurement?

A

An estimate of how far an individual’s score is from the ‘true’ score, or how much it would vary with repeated measurement
Determined from how much ‘error’ there is in a measure, i.e., its reliability
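One common way to compute it (an assumption here, since the card gives no formula) is SEM = SD × √(1 − reliability); the test values below are hypothetical:

```python
from math import sqrt

# Hypothetical test: SD of 15 (like an IQ scale) and reliability of .91
sd, reliability = 15, 0.91
sem = sd * sqrt(1 - reliability)  # standard error of measurement
print(round(sem, 2))  # prints 4.5
```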
71
Q

What is the Standard Error of the Mean?

A

An estimate of how far a given sample’s mean is from the population mean
Based on the notion of a ‘mean of means’ (the average of a number of sample means if you collected multiple samples from one population)
Varies with sample size: larger sample, smaller SEM
Estimated from the sample standard deviation and sample size
Used in the calculation of statistical tests
72
Q

What are confidence intervals?

A

- Calculated from the Standard Error of the Mean (SEM)
- The range in which you are confident, at a specific level, that the true population mean lies
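A sketch of both ideas: the standard error of the mean is the sample SD divided by √n, and a 95% CI uses mean ± 1.96 × SE (normal approximation; the data are made-up):

```python
import statistics as stats
from math import sqrt

sample = [98, 102, 95, 101, 99, 104, 97, 100, 103, 96]  # hypothetical scores

mean = stats.mean(sample)
se_mean = stats.stdev(sample) / sqrt(len(sample))  # standard error of the mean

# 95% CI: mean plus/minus 1.96 standard errors
ci = (mean - 1.96 * se_mean, mean + 1.96 * se_mean)
print(round(mean, 1), tuple(round(x, 2) for x in ci))
```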
73
Q

Why do we use inferential statistics?

A

- Used to ‘infer’ sample results to the population
- Determine if results (e.g., a difference between groups) are ‘significant’
- Probability: no finding is absolute; it reflects how likely the results are consistent with there being a difference, i.e., the level of confidence that the finding was ‘real’ and not due to chance
- Across studies, replication is important to confirm findings
74
Q

What are the steps in hypothesis testing? (6)

A

State null and alternative hypotheses
Set alpha level
Gather data
Perform statistical test
Compare calculated to critical value
Make decision
75
Q

Explain what occurs in the first step of Hypothesis testing:

A

- H0 = Null Hypothesis
What you’re trying to reject/disprove
Expresses no difference or no relationship between the independent and dependent variables
H0: μ1 = μ2
Also called the statistical hypothesis
- H1 = Alternative to the Null
States that there is a relationship
Can be directional or non-directional
Non-directional: H1: μ1 ≠ μ2
Directional: H1: μ1 > μ2 or H1: μ1 < μ2
Also called the substantive hypothesis
What the researcher is predicting
Tested against the null hypothesis
76
Q

Explain what occurs in the second step of Hypothesis testing:

A

Also called the significance level, probability level, or confidence level
Conventionally set at p = .05
Preliminary/exploratory research may set p = .10
Adjustments should be made to keep study-wide error at p = .05
77
Q

How do we make corrections/adjustments of the alpha level for multiple tests? (3)

A

- Adjustment to the alpha level to compensate for running multiple tests
- Controls for Type 1 error (saying there is a difference when there isn’t)
- Keeps the experiment (family)-wide error rate at .05 by adjusting the alpha level for each analysis
78
Q

What is the simplest and most conservative way to correct/adjust the alpha level for multiple tests?

A

- The Bonferroni correction is the simplest and very conservative: .05 divided by the number of tests run
E.g., if you run 3 comparisons, p = .05/3 = .0167
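The Bonferroni correction above is one line of arithmetic; as a sketch:

```python
def bonferroni(alpha: float, n_tests: int) -> float:
    """Per-comparison alpha that keeps the family-wise error rate at `alpha`."""
    return alpha / n_tests

# e.g., three planned comparisons at a study-wide alpha of .05
print(bonferroni(0.05, 3))  # approximately .0167
```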
79
Q

What are the 3 determinants of the probability of reaching significance?

A

- Sample size
- Between-group differences
- Within-group differences (variance)
80
Q

Explain what occurs in the third step of Hypothesis testing:

A

Gather the data: sampling, measurement, etc., as discussed earlier
81
Q

Explain what occurs in the fourth step of Hypothesis testing:

A

- Use sample data to get a calculated value
- It is critical to use an appropriate statistical test
- Different tests give you different statistics (e.g., t, F, U, r, etc.)
82
Q

Explain what occurs in the fifth step of Hypothesis testing:

A

- Computer software will give you an exact p value
- There are set critical values separating significant from non-significant results for each test statistic (i.e., t, F, etc.), for a given alpha level and degrees of freedom (based on ‘n’ and the number of groups)
- Look for the test statistic to be greater than or equal to the critical value; a larger t, F, etc. gives a smaller p
84
Q

Explain what occurs in the last step of Hypothesis testing:

A

If p ≤ the alpha level (usually .05), you reject the Null Hypothesis
So accept that there is a difference
Look at group means to see the direction
2-tailed vs. 1-tailed tests
85
Q

What are the two types of errors?

A

Type 1
Type 2
86
Q

What are Type 1 errors?

A

False positives: accepting the alternative hypothesis when H0 is true
87
Q

What are Type 2 errors?

A

False negatives: failing to reject H0 when H1 is true
88
Q

What is statistical power?

A

A statistical test’s ability, in a particular study, to detect a difference
89
Q

What occurs if power is too low?

A

A greater chance of a Type 2 error
90
Q

What is power affected by? (4)

A

- Sample size (bigger sample, more power)
- Between-group differences: the size of the difference between groups, i.e., effect size (larger difference, more power)
- Within-group differences, i.e., the amount of variance (smaller variance, more power)
- Alpha level (more liberal alpha, more power)
91
Q

What is statistical significance?

A

A measure of the reliability or stability of the difference
92
Q

What is Practical significance?

A

The size of the difference
93
Q

What is Effect size?

A

- A measure of the size of a difference
- A priori: the researcher determines what would be an important difference, used in power estimates for planning the study
- Calculated: computed from study data
- Unaffected by N
94
Q

What is Cohen's d?

A

- The number of standard deviations separating the group means (overlap of groups)
- Small = .2, medium = .5, large = .8
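A sketch of Cohen's d using the pooled-SD formula (the two groups' scores are made-up):

```python
import statistics as stats
from math import sqrt

# Hypothetical scores for two groups
group1 = [52, 55, 48, 60, 57, 50, 54, 58]
group2 = [45, 49, 42, 51, 47, 44, 48, 50]

m1, m2 = stats.mean(group1), stats.mean(group2)
s1, s2 = stats.stdev(group1), stats.stdev(group2)
n1, n2 = len(group1), len(group2)

# Pooled SD, then d = difference in means expressed in SD units
pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (m1 - m2) / pooled_sd
print(round(d, 2))
```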
95
Q

What are Eta squared (η²) and omega squared (ω²)?

A

- Measures of the percentage of variance accounted for by the independent variable
- Small < .06, medium .06 to .15, large > .15
96
Q

What are two other types of significance?

A

- Clinical Significance
What does the difference between the groups mean?
The importance/value of the difference
Some use it interchangeably with practical significance
Assessed at the group level
- Personal Significance
A distinction suggested by Bothe & Richardson, 2011
An individual client’s sense of the value of change for her/him
97
Q

When is the term statistical trend used?

A

- Used when the p value approaches significance
- People vary on acceptance of this type of reporting
- p = .05 is arbitrary, a convention
- Is a 6% probability of getting a difference at least as big unimportant/uninteresting?
- Typically called a ‘trend’ or ‘approaching significance’ when p is between .06 and .10