Research and Statistics Flashcards
dependent variable
What is measured. It is what is affected by manipulation of the independent variable
Independent variable
The treatment measure - what is manipulated by the experimenter.
Cause-effect relationships
Can be found by experiments
Cross-sectional research
measures people from various age groups simultaneously
Ex Post Facto studies
(Retrospective / causal-comparative studies)
The independent variable (e.g., number of siblings, SES, number of cigarettes) is already known. Start with the effect and seek causes; no random assignment.
Solomon Four-Group Design
Uses two control groups and two experimental groups. Half of the groups receive a pretest and half do not. This tests both the treatment effect itself and the effect of the pretest.
Between Subjects Design
Participants Are Grouped into Different Conditions - Each Participant Experiences Only One Condition
Within Subject Design
Participants Take Part in the Different Conditions - Also: Repeated Measures Design
Counterbalanced Measures Design
Testing the effect of the order of treatments when no control group is available/ethical
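One way to picture counterbalancing is to list every possible treatment order and spread participants evenly across those orders. The sketch below assumes full counterbalancing; the treatment labels, participant IDs, and function names are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def counterbalanced_orders(treatments):
    """Return every possible ordering of the treatments (full counterbalancing)."""
    return [list(order) for order in permutations(treatments)]


def assign_orders(participants, treatments):
    """Cycle through the orders so each order is used about equally often."""
    orders = counterbalanced_orders(treatments)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


# Hypothetical treatments and participant IDs.
treatments = ["A", "B", "C"]
participants = [f"P{n}" for n in range(1, 7)]
schedule = assign_orders(participants, treatments)

for person, order in schedule.items():
    print(person, order)
```

With three treatments there are six orders, so six participants cover each order once and each treatment appears in each position equally often.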
Matched subjects design
Matching participants to create similar experimental- and control-groups
Double-Blind Experiment
Neither the researcher, nor the participants, know which is the control group. The results can be affected if the researcher or participants know this.
Bayesian Probability
Using Bayesian probability to interact with participants is a more advanced experimental design. It can be used in settings where there are many variables that are hard to isolate. The researcher starts with a set of initial beliefs and adjusts them according to how participants have responded.
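The idea of starting with initial beliefs and adjusting them can be sketched with a Beta prior on a yes/no response rate. This model choice is an assumption made for illustration, not something the card specifies, and the counts are made up.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Beta-binomial update: add observed counts to the prior parameters."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, i.e. the expected response rate."""
    return a / (a + b)


# Initial belief: Beta(1, 1), which treats every response rate as equally likely.
prior = (1, 1)

# Hypothetical data: 7 participants respond "yes", 3 respond "no".
posterior = update_beta(*prior, successes=7, failures=3)

print("prior mean:", beta_mean(*prior))
print("posterior:", posterior, "posterior mean:", round(beta_mean(*posterior), 3))
```

The prior mean of 0.5 shifts toward the observed proportion once the responses are counted in.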
Nominal Variables
Category or name
e.g., girls / boys
Ordinal Variables
Rank or ordering of levels
- Likert Scale
Interval Variables
Numerical with equal intervals or distance between numbers
- Scores on an exam
Ratio Variables
Same as interval scale but there is an absolute zero indicating an absence of the property measured
- e.g., syllables stuttered
Single-Subject Designs
- Help establish efficacy of treatment procedures and cause-effect relations
- Dependent variables are measured continuously and results don’t require statistical analysis
- Types include AB, ABA, ABAB, and multiple-baseline design
A= skills measured without treatment or when withdrawn
B= skills taught and results measured
Multiple-Baseline Designs
- A multiple baseline design can be across subjects so it includes several subjects who are taught one or more behaviors in a staggered way to show that only the behaviors of treated participants change.
- A multiple baseline design can also be across settings
- Collect base rates in 3 or more settings, teach behavior in one setting, repeat assessing in untreated settings then teach behavior in another setting. Teach in different settings until behavior is trained in all settings.
Group Designs
- Experimental or nonexperimental
- True experimental research helps rule out effects of confounding variables by using randomization and a control group
- Within subjects designs have one group
- Between subjects designs have two+ groups
- Pretest-Posttest Control Group Design
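The randomization used in true experimental research can be sketched as a shuffle followed by a split into two groups. The participant IDs, the `seed` parameter, and the function name are hypothetical; real studies may use blocked or stratified schemes instead.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs.
people = [f"P{n}" for n in range(1, 11)]
groups = randomly_assign(people, seed=42)

print("experimental:", groups["experimental"])
print("control:", groups["control"])
```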
Validity in Research
Degree to which an instrument measures what it intends to measure
- Content validity
- Criterion-related validity
- Concurrent validity
- Predictive validity
- Construct validity
8 Internal Validity Issues
Internally valid findings must reflect true cause-effect
- Instrumentation – measurement devices
- History – life events responsible for changes
- Statistical regression – scores at the extremes (very high or very low) tend to move toward the mean on retest
- Maturation – unexpected biological changes in participants
- Attrition – mortality and drop-out rate
- Testing – repeated measurement
- Subject selection bias – factors influencing initial selection
- Interaction of factors – combination of above
3 External Validity Issues
Externally valid findings must have generalizability
- Hawthorne Effect – study’s results affected by fact that people know they are taking part in an experiment
- Multiple-Treatment Interference – can be a concern when 2+ treatments are administered to the same person; order effects can also be a concern
- Reactive or Interactive Effects of Pretesting – may be a problem if the pretest measure is also the dependent variable
- for example, a vocal hygiene questionnaire is given to a control group subject who then tries to modify his vocal abuse
Reliability in Research
Consistency with which something is measured on repeated occasions
- Test-retest reliability
- Alternate-form reliability
- Split-half reliability
- Interobserver / interjudge reliability
- Intraobserver or intrajudge reliability
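As one illustration of the reliability types above, split-half reliability can be estimated by correlating scores on two halves of a test and then applying the Spearman-Brown correction for full test length. The scores below are invented, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


# Hypothetical scores of five examinees on the odd and even items of one test.
odd_half = [5, 7, 8, 6, 9]
even_half = [4, 7, 9, 5, 8]

r_half = pearson_r(odd_half, even_half)
print("half-test correlation:", round(r_half, 3))
print("estimated full-test reliability:", round(spearman_brown(r_half), 3))
```

The same `pearson_r` function would also serve for test-retest reliability, using scores from two occasions instead of two halves.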
Inductive method
Specific to general
- Conclusion based on patterns that you see.
- Experiment first and draw conclusion next.
EXAMPLE: What is the next number in the sequence 6, 13, 20, 27
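The example can be worked through in code: observe the pattern in the specific cases first, then generalize it to predict the next term. This sketch assumes the pattern is a constant difference between neighbors; the function name is hypothetical.

```python
def next_in_arithmetic_sequence(seq):
    """Infer a constant step from the observed terms and predict the next one."""
    diffs = {b - a for a, b in zip(seq, seq[1:])}
    if len(diffs) != 1:
        raise ValueError("no single constant difference was observed")
    return seq[-1] + diffs.pop()


print(next_in_arithmetic_sequence([6, 13, 20, 27]))  # prints 34
```

Each observed step is 7, so the generalization "add 7" gives 34 as the next number.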
Deductive method
General to specific
- Conclusion based on previously known facts.
- Make a conclusion first and verify later.
EXAMPLE: All men are mortal. (major premise)
Socrates is a man. (minor premise)
Therefore, Socrates is mortal. (conclusion)
Levels of Evidence
Ia
Well-designed meta-analysis of >1 randomized controlled trial (using a systematic review)
Ib
Well-designed randomized controlled study
IIa
Well-designed controlled study without randomization
IIb
Well-designed quasi-experimental study, including a cohort study or case-control study, from independent researchers
III
Well-designed non-experimental studies, i.e., correlational and case studies including time-series single-subject investigations
IV
Expert committee reports, opinions of authorities, descriptive studies, and descriptive clinical cases
Probability
Alpha level of .05 (a result is considered statistically significant when p < .05)
Effect Size
Cohen’s d of .2 for small, .5 for medium, and .8 for large
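Cohen's d is the difference between two group means divided by a standard deviation. A minimal sketch follows, assuming the pooled standard deviation of two independent groups as the denominator; the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Hypothetical posttest scores for a treated group and a control group.
treated = [12, 14, 15, 17, 22]
control = [10, 12, 13, 15, 20]

d = cohens_d(treated, control)
print("Cohen's d:", round(d, 2))
```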
Variability - dispersion or spread in data set
Standard deviation
Central Tendency – distribution of set of scores reflecting the average
Mean, median, mode
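The measures of central tendency and variability named above can be computed with Python's standard `statistics` module. The exam scores are invented for illustration.

```python
from statistics import mean, median, mode, stdev

# Hypothetical exam scores.
scores = [70, 75, 75, 80, 85, 90, 99]

print("mean:", mean(scores))      # arithmetic average
print("median:", median(scores))  # middle score when sorted
print("mode:", mode(scores))      # most frequent score
print("standard deviation:", round(stdev(scores), 2))  # sample SD (n - 1)
```

Note that `stdev` is the sample standard deviation; `statistics.pstdev` gives the population version.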
Parametric Statistics (vs non-parametric)
Parametric statistics require that assumptions be met about distributions:
- Adequate sample size
- Homogeneity of variance
- Fairly normal distribution
Parametric statistics are more powerful but sensitive to violations of the assumptions
Parametric = t-test (no more than two groups)
- or ANalysis Of VAriance (ANOVA) with two or more groups (Example - boys and girls at different ages on vocabulary)
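For the two-group case, the t statistic can be computed by hand. This is a sketch of the independent-samples (Student's) t-test, assuming equal variances in line with the homogeneity-of-variance assumption listed above; the data are hypothetical and no p-value is computed here.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Return (t statistic, degrees of freedom) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical vocabulary scores for two groups of children.
group_a = [20, 22, 19, 24, 25]
group_b = [15, 17, 16, 18, 19]

t, df = independent_t(group_a, group_b)
print("t =", round(t, 3), "with df =", df)
```

The resulting t is then compared with the critical value for the chosen alpha level and degrees of freedom, taken from a t table or a statistics package.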
Non-Parametric Statistics (vs parametric)
Non-parametric tests do not assume that the data follow a particular distribution; they require less information (often only ranks).
They are less powerful than the parametric tests and it tends to be more difficult to find statistical significance.
Standard parametric tests also have corresponding non-parametric counterparts.
- Wilcoxon Signed Rank test / Paired t-test
- Kruskal-Wallis test / One-way between-subjects ANOVA
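To show how a non-parametric test works from ranks rather than raw values, here is a sketch of the Wilcoxon signed-rank statistic for paired scores. It assumes the common convention of dropping zero differences and giving tied differences their average rank; the data and function names are hypothetical, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W, sum of positive ranks, sum of negative ranks)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero diffs
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus


# Hypothetical scores for six participants before and after treatment.
before = [10, 12, 9, 14, 11, 13]
after = [12, 15, 8, 18, 16, 13]

w, w_plus, w_minus = wilcoxon_signed_rank(before, after)
print("W =", w, "| positive ranks =", w_plus, "| negative ranks =", w_minus)
```

Only the order of the differences matters, which is why the test needs less information than the paired t-test.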
Positive predictive value
Probability (e.g., 95%) that a child who fails the screener will be identified by formal testing as having a speech problem
Negative predictive value
Probability (e.g., 95%) that a child who passes the screener will be identified by formal testing as having normal speech
Sensitivity
Percentage of people who truly have the condition who test positive (the true positive rate)
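Positive predictive value, negative predictive value, and sensitivity all come from the same table of screening results compared against formal testing. The counts below are invented so that both predictive values come out to 95%; the function name is hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening statistics from counts of screener vs. formal testing.

    tp: failed screener, has a problem      fp: failed screener, no problem
    fn: passed screener, has a problem      tn: passed screener, no problem
    """
    return {
        "ppv": tp / (tp + fp),          # of those who fail, how many have a problem
        "npv": tn / (tn + fn),          # of those who pass, how many are normal
        "sensitivity": tp / (tp + fn),  # of those with a problem, how many fail
    }


# Hypothetical counts for 60 screened children.
metrics = screening_metrics(tp=19, fp=1, fn=2, tn=38)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The predictive values are read across the screener result, while sensitivity is read across the true condition, so the two can differ for the same screener.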
Face validity
Extent to which measure looks valid to the examinee
Content validity
The extent to which a measure represents all facets of a given social construct.
Reliability coefficient
High reliability coefficient=test yields replicable results
Standard error of measurement
Low=test yields precise or accurate results
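The standard error of measurement is commonly computed from a test's standard deviation and its reliability coefficient as SEM = SD × √(1 − reliability). A short sketch with made-up values shows why high reliability goes with a low SEM; the function name is hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


# Hypothetical test with a standard deviation of 15 points.
for r in (0.50, 0.91, 0.99):
    sem = standard_error_of_measurement(15, r)
    print(f"reliability {r:.2f} -> SEM {sem:.2f}")
```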
Criterion Validity
a measure of how well one variable or set of variables predicts an outcome based on information from other variables, and will be achieved if a set of measures from a personality test relate to a behavioral criterion on which psychologists agree
- Concurrent validity- demonstrated where a test correlates well with a measure that has previously been validated. The two measures may be for the same construct, or for different, but presumably related, constructs.
- Predictive validity - the extent to which a score on a scale or test predicts scores on some criterion measure
Construct validity
Refers to whether a scale measures or correlates with the theorized psychological scientific construct (e.g., “fluid intelligence”) that it purports to measure. In other words, it is the extent to which what was to be measured was actually measured.
The effectiveness of treatment (tx) can be best demonstrated by
Single-subject design