Results Flashcards
measurement
Assigning numbers to objects or events
instrumental measurement
◦ Audiometer ◦ Nasometer ◦ Visipitch
quantified measurement
◦ Hearing level ◦ Nasality ◦ Voice Fo
behavioral measurement
◦ Tests ◦ Surveys ◦ Questionnaires
nominal
attributes are only named -Nominal data deals with names, categories, or labels. -Data at the nominal level is qualitative
ordinal
-attributes can be ordered -Data at this level can be ordered, but differences between the values are not meaningful. ex) Subjects were classified based on degree of hearing loss: 1 = Mild 2 = Moderate 3 = Moderately-Severe 4 = Severe 5 = Profound
interval
distance is meaningful -The interval level of measurement deals with data that can be ordered, and in which differences between the data do make sense. -There is an equal distance between each value. -Data at this level does not have a true zero point. ex) Words were presented at 3 different levels: ◦ Soft = 40 dB ◦ Medium = 60 dB ◦ Loud = 80 dB
ratio
Interval properties plus a true (absolute) zero point, so ratios between values are meaningful
measurement accuracy
Observed score = true score + error X=T+E - True score obtained under ideal conditions - Error occurs because “ideal” never exists - Measurement with less error = more reliable
unsystematic (random) errors
- Random errors in experimental measurements are caused by unknown and unpredictable changes in the experiment. - Random errors often have a normal distribution
sources of random error
- Fluctuation in equipment - Changes in the environment - Changes in the subject - Changes in the observer - Measure is limited sample of behavior that is not stable
systematic error
- Systematic errors usually come from the measuring instruments. ◦ There is something wrong with the instrument ◦ The instrument is used incorrectly by the experimenter - There’s something wrong with the examiner :(
reliability
Can we depend on a measurement? Precision and Accuracy
precision
◦ Stability of the measurement ◦ Reproducibility
accuracy
◦ Closeness of obtained score to expected or true score
how to estimate the reliability of a measurement
- Test-retest ◦ Coefficient of stability - Equivalence of measurement ◦ Alternate (equivalent) forms ◦ Coefficient of equivalence
internal consistency
◦ How well is each item measuring the same thing?
standard error of measurement
- It is an estimate of how often you can expect errors of a given size - Based on the reliability coefficient and the test’s standard deviation - Low SEM = high level of score accuracy
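A minimal Python sketch of how SEM follows from the test's standard deviation and reliability coefficient (the values here are hypothetical):

    import math

    sd = 15.0            # hypothetical standard deviation of the test
    reliability = 0.90   # hypothetical reliability coefficient
    sem = sd * math.sqrt(1 - reliability)   # standard error of measurement
    print(round(sem, 2))   # ~4.74, so an obtained score is likely within about +/- 5 points of the true score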
inter-observer reliability
◦ Two or more observers measuring the same event ◦ Equivalence
intra-observer reliability
◦ One observer measuring event at different times ◦ Stability
coefficient of correlation
• A correlation coefficient is a statistical summary of the relation between two variables.
• It is the most common way of reporting the answer to such questions as the following:
▫ Does this test predict performance on the criterion test?
▫ Do these two tests measure the same thing?
▫ Are these two constructs related?
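A short Python sketch of computing a correlation coefficient between two tests (the scores are made up for illustration):

    from scipy import stats

    test_a = [12, 15, 20, 22, 30, 31, 40]   # hypothetical scores on the new test
    test_b = [14, 13, 22, 25, 29, 35, 38]   # hypothetical scores on the criterion test
    r, p = stats.pearsonr(test_a, test_b)   # Pearson correlation coefficient and its p-value
    print(round(r, 2))   # a value near +1 suggests the two tests rank subjects similarly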
validity of measurements
Degree to which the measurement measures what it is supposed to measure
Are we really measuring what we intend to measure?
Truthfulness of measure.
internal validity
Is the researcher justified in concluding a cause/effect relationship?
▫ Were extraneous variables controlled to avoid contamination of results?
external validity
Can findings be generalized to the population as a whole?
content validity
Subjective, logical assessment of the measurement instrument
•What behaviors are being measured?
•How well does instrument measure a sample of the behaviors?
•Are you really measuring the behavior that you think you are measuring?
criterion validity-concurrent validity
Do results agree with the results of a current, well-validated test of the same thing?
▫Scores on Spivak’s Handy Dandy IQ Test (SHDIQT) are the same as those obtained on the Wechsler Intelligence Scale
criterion validity-predictive validity
▫Criterion test administered after measurement
▫Did measurement predict criterion results?
construct validity
•The degree to which a test measures an intended hypothetical construct
▫i.e., Is the variable you're testing actually measured by the experiment?
▫I think I'm measuring memory, but maybe I'm just measuring attention, level of interest, IQ, language proficiency…
threats to internal validity-history
•Any event outside of the research study that can alter or affect subjects’ performance.
•Randomization often minimizes this risk
▫outside events that occur in one group are also likely to occur in the other
threats to internal validity-maturation
•The natural physiological or psychological changes that take place as we age.
•Maturation can play a major role in longer-term studies.
▫Was it the treatment that caused outcome or maturation of the subject?
▫Would result have occurred naturally with time?
▫This is especially important in childhood
threats to internal validity-testing
- People tend to perform better at any activity the more they are exposed to that activity.
- The act of taking a pre-test may affect performance on the post-test (reactive pre-test)
- How can I avoid the effect of the pre-test?
threats to internal validity-instrumentation
- Changes in scores may be related to the changes in the instrument or test rather than the independent variable.
- How can I control for influence of instrument?
threats to internal validity-statistical regression
- Refers to the tendency for subjects who score very high or very low to score more toward the mean on subsequent testing.
- Statistical regression, or regression to the mean, is a concern especially in studies with extreme scores.
threats to internal validity-subjects
- Manner in which subjects are selected and assigned to groups
- Groups equal on important dimensions
- Subject matching or Randomization
experimenter bias
(Rosenthal Effect)
•Researchers may be biased toward the results we want
•This bias can affect our observations
•Control for this bias by blinding
mortality
- Mortality, or subject dropout
- One group may experience greater dropout than another, thereby upsetting equivalence between groups
external validity
Generalizability of the conclusions
threats to external validity
•Effect of study location
•Reactive effects of experimental arrangements
▫Generalize from one context to another
•Studies in laboratory environment often have poor external validity
demand characteristics (John Henry effect, Hawthorne effect)
- Performance of subjects influenced by the anticipated results of a study
- Exhibit performance that they believe is expected of them.
- Control for this by blinding
order/carryover effects
•Order effects arise from the order in which treatments are administered; they can be a major threat to external validity when multiple treatments are used.
treatment interaction effects
- Treatment can affect people differently depending on the subject’s characteristics.
- Potential threats to external validity include the interaction between treatment and any of the following: selection, history, and testing.
data distribution
All quantitative data forms a distribution
◦Characteristics
Central tendency
Variability
Skewness
Kurtosis
normal data distribution
Normal distribution is symmetrical
The mean, median, and mode of a normal distribution are identical
◦About 68% of the population falls within one standard deviation of the mean
◦About 95% of the population falls within two standard deviations of the mean.
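A quick Python check of these percentages for a standard normal distribution (this is the usual 68/95 rule, not data from the cards):

    from scipy import stats

    within_1sd = stats.norm.cdf(1) - stats.norm.cdf(-1)   # proportion within 1 SD of the mean
    within_2sd = stats.norm.cdf(2) - stats.norm.cdf(-2)   # proportion within 2 SDs of the mean
    print(round(within_1sd, 3), round(within_2sd, 3))     # ~0.683 and ~0.954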
skewed data distribution
The skew of a distribution refers to how the curve leans.
The more skewed a distribution is, the more difficult it is to interpret.
kurtosis
Kurtosis refers to the peakedness or flatness of a distribution.
◦leptokurtic: positive kurtosis
◦platykurtic: negative kurtosis
◦mesokurtic: normal distribution
what does the nature of data distribution affect?
Type of distribution will affect type of statistics used
◦Normal distribution: Parametric Statistics
◦Non-normal: Non-parametric Statistics
how to test for skewness to the right
55, 55, 56, 57, 60, 61, 63, 66, 66, 310
Mean = 84.9
SD ≈ 79.2 (sample SD) so… SD x 2 ≈ 158
Mean minus 2xSD ≈ 84.9 - 158 ≈ -73
The result should not be lower than the lowest possible score (0)
A negative number indicates skew to the right (the outlier of 310 pulls the mean up)
how to test for skewness to the left
12, 46, 55, 55, 60, 65, 65
Mean ≈ 51.1
SD ≈ 18.5 (sample SD) so… SD x 2 ≈ 37
Mean plus 2xSD ≈ 51.1 + 37 ≈ 88
If the answer is larger than the largest score (65), the distribution is skewed to the left
Here 88 exceeds 65, so this distribution is skewed to the left (the outlier of 12 pulls the mean below the median of 55)
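A small Python sketch of both skewness checks, using the data sets from these two cards; the sample standard deviation is used, so rounded values may differ slightly from the cards:

    import statistics

    right_data = [55, 55, 56, 57, 60, 61, 63, 66, 66, 310]
    left_data = [12, 46, 55, 55, 60, 65, 65]

    def two_sd_bounds(scores):
        m = statistics.mean(scores)
        sd = statistics.stdev(scores)   # sample standard deviation
        return m - 2 * sd, m + 2 * sd

    low, _ = two_sd_bounds(right_data)
    print(low < 0)                  # True: mean - 2*SD falls below the lowest possible score (0), suggesting skew to the right
    _, high = two_sd_bounds(left_data)
    print(high > max(left_data))    # True: mean + 2*SD exceeds the largest score (65), suggesting skew to the left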
variability
How scores differ from mean
◦Spread out
◦Cluster around the mean
measures of variability
Range
◦Lowest to highest score
Variance
◦Average of the squared distances from the mean
Standard Deviation
◦Square root of the variance
Interquartile range (IQR)
◦When values not symmetrical around mean
◦Range of scores in middle 50% of scores
◦Excludes extreme scores in lower & higher 25%
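A brief Python sketch computing these measures for the data from the skewness-to-the-right card:

    import numpy as np

    scores = np.array([55, 55, 56, 57, 60, 61, 63, 66, 66, 310])
    score_range = scores.max() - scores.min()     # range: lowest to highest score
    variance = scores.var(ddof=1)                 # sample variance
    sd = scores.std(ddof=1)                       # standard deviation
    q1, q3 = np.percentile(scores, [25, 75])
    iqr = q3 - q1                                 # interquartile range: middle 50% of scores
    print(score_range, round(variance, 1), round(sd, 1), iqr)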
goals of inferential stats
to determine what might be happening in a population based on a sample of the population and to determine what might happen in the future (estimation and prediction)
confidence intervals
- there will be a range of sample means within any population
- how closely does the mean of our sample match the mean of our population?
standard error of the mean
- how far each sample mean varies from the population mean
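A minimal Python sketch (hypothetical sample) of the standard error of the mean and a 95% confidence interval built from it:

    import numpy as np
    from scipy import stats

    sample = np.array([40, 42, 45, 47, 50, 52, 55, 58])    # hypothetical sample of scores
    mean = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(len(sample))          # standard error of the mean
    ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=se)
    print(round(mean, 1), round(se, 2), round(ci_low, 1), round(ci_high, 1))
    # the interval estimates how closely the sample mean matches the population mean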
probability of error
- decided prior to the experiment
- common levels of acceptable error (referred to as significance: .05, .01, .001), abbreviated with "a" (alpha)
- How much error are we willing to accept?
- the probability of error itself is abbreviated with "p"
if null is accepted..
then p>a
probability of error is greater than acceptable error
if null is rejected…
then p<=a
probability of error is less than or equal to acceptable error
type I error
when the results of research show that a difference exists but in reality there is no difference
- perhaps (a) was set too high, and lowering the amount of acceptable error (a) would reduce the chances of Type I error
- lowering the amount of (a) also increases the chance of Type II error…
type II error
- the acceptance of the null hypothesis when in fact the alternative is true: there IS a significant difference in the population but we fail to find this difference
- study is said to lack power.
- power refers to a study’s strength to find a difference when it actually exists.
power analysis
- probability that your test will find a statistically significant difference when it actually exists.
- generally accepted that power should be .8 or greater (80%)
- increases with sample size, since a larger sample means you have collected more information
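One way to run this kind of calculation is with the statsmodels power module; a sketch, assuming a two-group design and a medium effect size:

    from statsmodels.stats.power import TTestIndPower

    # sample size per group needed to detect d = 0.5 with alpha = .05 and power = .80
    n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
    print(round(n_per_group))   # roughly 64 subjects per group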
effect size
- when a difference is statistically significant, it doesn't mean that it is big, important, or helpful in decision making
- Cohen's d = (mean1 - mean2) / pooled SD
- .1 = trivial effect
- .1-.3 = small effect
- .3-.5 = moderate effect
- >.5 = large difference effect
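A short Python sketch of Cohen's d with made-up scores, using the pooled standard deviation (equal group sizes assumed):

    import numpy as np

    group1 = np.array([85, 90, 88, 95, 100, 92])   # hypothetical treatment group
    group2 = np.array([80, 84, 86, 83, 90, 85])    # hypothetical control group
    pooled_sd = np.sqrt((group1.var(ddof=1) + group2.var(ddof=1)) / 2)   # pooled SD
    d = (group1.mean() - group2.mean()) / pooled_sd                      # Cohen's d
    print(round(d, 2))   # ~1.58, a large difference effect on the scale above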
t-test
- tests significance between 2 groups based on mean, SD, and # of subjects
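A minimal Python sketch of an independent-samples t-test on two hypothetical groups:

    from scipy import stats

    group1 = [85, 90, 88, 95, 100, 92]   # hypothetical scores, group 1
    group2 = [80, 84, 86, 83, 90, 85]    # hypothetical scores, group 2
    t_stat, p_value = stats.ttest_ind(group1, group2)   # tests the difference between the two means
    print(round(t_stat, 2), round(p_value, 3))
    # reject the null hypothesis if p_value <= a (e.g., .05)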
anova
- used for more than 2 groups or more than 1 IV or DV
- tells us whether there are significant differences among groups and whether the variance between groups is larger than the variance within groups
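A short Python sketch of a one-way ANOVA across three hypothetical groups:

    from scipy import stats

    group1 = [55, 60, 58, 62, 57]   # hypothetical scores for three groups
    group2 = [65, 70, 68, 66, 72]
    group3 = [75, 78, 80, 77, 79]
    f_stat, p_value = stats.f_oneway(group1, group2, group3)   # one-way ANOVA
    print(round(f_stat, 1), round(p_value, 4))
    # a large F means the variance between groups is large relative to the variance within groups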