Research and Program Evaluation Flashcards
Cohen’s d effect size
(ES): used to gauge how strong a relationship or treatment effect is; expresses the difference between two group means in standard deviation units
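A minimal sketch of how Cohen’s d might be computed for two groups, assuming the common pooled-standard-deviation version of the formula. The function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference between two means divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores for a treatment group and a control group.
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = cohens_d(treatment, control)
print(round(d, 2))
```

The group means differ by 2 points and the pooled SD is about 3.16, so d comes out near 0.63.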
Kurt Lewin’s concept of action research
goes beyond advancing knowledge and works to improve a situation. Bridges the gap between research and application.
experiment
most valuable type of research; used to discover cause and effect. The treatment must be controlled by the experimenter, and subjects must be randomly assigned to groups.
quasi-experiment
researcher uses pre-existing groups, so the independent variable (IV) cannot be altered. Cannot state with any degree of statistical confidence that the IV caused the dependent variable (DV)
ex post facto study
“after the fact”. Correlational study using pre-existing groups.
internal validity
whether the dependent variable (DV) was truly influenced by the experimental independent variable (IV).
external validity
whether the experimental research results can be generalized to larger populations (i.e. other people, settings, conditions)
factor analysis
statistical procedure that uses important or underlying factors in an attempt to summarize a lot of variables. Ex: a test that measures a counselor’s ability using the 3 most important variables or factors that make an effective helper, even though hundreds of factors might exist.
chi square
statistical measure that tests whether an actual (observed) distribution differs significantly from an expected theoretical distribution
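A short sketch of the chi-square statistic, which sums (observed - expected)² / expected across categories. The counts are invented; the critical value 5.991 is the standard table value for 2 degrees of freedom at the .05 level.

```python
def chi_square(observed, expected):
    """Sum of squared differences between observed and expected counts, scaled by expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 60 clients choose among 3 counseling formats; an even split is expected.
observed = [30, 20, 10]
expected = [20, 20, 20]

stat = chi_square(observed, expected)
df = len(observed) - 1          # degrees of freedom = categories - 1
CRITICAL_05_DF2 = 5.991         # table value for df = 2, alpha = .05

print(stat, df, stat > CRITICAL_05_DF2)
```

Here the statistic is 10.0, which exceeds the table value, so the observed distribution differs significantly from the expected one.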
parsimony
interpreting results in simplest way. Strive for parsimony in research (looking for least complex explanation)
Occam’s razor
experimenters should interpret results in the simplest manner (interchangeable with parsimony)
Conwy Lloyd Morgan
English psychologist and physiologist. Proposed a principle of economy, Lloyd Morgan’s canon, in 1894
William of Occam
14th-century philosopher. Early behaviorists adhered closely to Occam’s razor
bubbles
refers to flaws in research
contaminating variable
variable that enters experiment that is not being controlled by researcher (aka confounding variable)
basic research
research conducted to advance understanding of theory
applied research
aka action research or experience-near research. Conducted to advance knowledge of how theories, skills, and techniques can be used in practical application.
variable
behavior or circumstance that can exist on at least 2 levels or conditions. Factor that varies or is capable of change.
ind var (IV)
variable that the researcher manipulates or wishes to experiment with
dep var (DV)
expresses the outcome; the variable that is measured
discrete var
falls into separate categories
continuous var.
can take any value within a range
causal-comparative design
resembles a true experiment except that the groups were not randomly assigned.
code of ethics for researchers
inform subjects of risks, remove negative after effects from research, allow subjects to withdraw at any point, protect confidentiality, present results in accurate and not misleading format, should only use techniques you are trained in.
control group
do not receive IV
experimental group
does receive IV
true experiment
rule-of-thumb minimum sample sizes: at least 30 subjects for a correlational study and 100 subjects for a survey.
organismic IV
a variable that the researcher cannot control yet that exists, such as height, weight, or gender.
hypothesis testing
Associated with the work of R. A. Fisher. A hypothesis is a statement that can be tested regarding the relationship between the IV and DV.
null hyp
there will be no significant difference between the experimental group and the control group
exper hyp
a difference is evident between control group and experimental group. Aka affirmative hypothesis/alternative hypothesis.
correlational research
does not use an IV. Just compares 2 things that already exist.
descriptive stat
describes data, such as the mean, median, and mode
inferential stat
infers something about the population from sample data
percentile rank
descriptive stat that tells counselor what % of cases fell below a certain level (don’t confuse with percentage scores)
percentage scores
another way of stating raw score
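To keep percentile rank and percentage score apart, here is a small sketch with invented class scores. It assumes the definition given above (percent of cases falling below a score); the function names are illustrative only.

```python
def percentile_rank(scores, score):
    """Percent of cases in the distribution that fall below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def percentage_score(raw, max_possible):
    """Raw score restated as a percent of the points possible."""
    return 100 * raw / max_possible

# Hypothetical class of 10 on a 100-point test.
class_scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

pr = percentile_rank(class_scores, 70)    # 6 of 10 scores are below 70
pct = percentage_score(70, 100)           # 70 of 100 points
print(pr, pct)
```

The same raw score of 70 is a percentage score of 70 but a percentile rank of 60 in this class.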
test of significance
test used to determine whether differences in groups’ scores are significant or just due to chance.
t test
test of stat significance used when an experiment or study is measuring the difference between 2 groups
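A sketch of an independent-groups t statistic, assuming the pooled-variance (Student) form and equal treatment of both groups. The scores are made up; 2.306 is the standard two-tailed table value for 8 degrees of freedom at the .05 level.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, n_a + n_b - 2

# Hypothetical outcome scores.
experimental = [8, 9, 10, 11, 12]
control = [5, 6, 7, 8, 9]

t, df = independent_t(experimental, control)
CRITICAL_05_DF8 = 2.306   # two-tailed table value, df = 8, alpha = .05
print(round(t, 2), df, abs(t) > CRITICAL_05_DF8)
```

With these numbers t is 3.0 on 8 degrees of freedom, which exceeds the table value, so the null hypothesis would be rejected.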
ind group comparison design
study in which 2 groups are independent of each other in that 1 group doesn’t influence the other group
repeated measures comparison design
the researcher measures the same group without the IV and then with the IV. Aka within-subjects design.
P (in test for significance)
Probability. The lower the number, the more that chance factors are ruled out.
Parameter
a value obtained from a population
Statistic
a value drawn from a sample
Correlation coefficient
the degree or magnitude of the relationship between two variables. Abbreviated r. Makes a statement regarding the association of two variables and how a change in one is related to a change in the other. Ranges from -1.00 to +1.00; 0.00 indicates no relationship. A positive correlation is not stronger than a negative correlation of the same size. The minus sign indicates that as one variable goes up the other goes down.
Positive correlation
both variables change in same direction
Negative correlation
Inverse association of variables
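A minimal sketch of the Pearson r computation, showing one perfect positive and one perfect negative correlation. The variable names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Correlation coefficient: co-variation of x and y divided by their separate variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_variation / (spread_x * spread_y)

# Hypothetical data: study hours, test score, and number of errors.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]     # rises with hours
errors = [10, 8, 6, 4, 2]    # falls as hours rise

r_pos = pearson_r(hours, score)
r_neg = pearson_r(hours, errors)
print(round(r_pos, 2), round(r_neg, 2))
```

Both relationships are equally strong; only the sign (direction) differs.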
Biserial correlation
one variable continuous and other is dichotomous
Level of significance synonyms
alpha level, probability, confidence level, cutoff point or “where one draws the line”
Accepted level of significance/alpha level/probability/or confidence level in social science
0.05 or lower
When setting alpha
a very stringent alpha is best, and a larger sample size helps reduce the chance of error. A more stringent alpha decreases alpha errors but increases beta errors
p=0.05 means
5% chance that the difference between the control and experimental groups is due to chance factors. Aka 95% confidence level. If differences truly exist, the experiment will obtain the same results 95 times out of 100.
Type 1 error
alpha error. Researcher rejects null hypothesis when it’s true.
Type 2 error
beta error. Researcher accepts (fails to reject) the null hypothesis when it’s false.
Probability of committing type 1 error
the level of significance
1 minus beta
the power of a statistical test
Power (in statistical testing)
connotes a statistical test’s ability to reject correctly a false null hypothesis. Parametric tests have more power than nonparametric statistical tests.
Parametric test
used only with interval and ratio data
Type I Type II relationship
seesaw (when one goes up, the other goes down)
Increasing sample size
will reduce type I and type II errors. Differences revealed using larger sample sizes are more likely to be genuine.
t test
testing for a significant difference between two sample groups.
ANOVA
analysis of variance. Used for testing for significant differences among more than two groups. Yields an F statistic. A table is consulted to find the critical value of F. If the F obtained is higher than that in the table, the null hypothesis is rejected. (This describes a one-way analysis of variance.)
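A sketch of how the F statistic in a one-way ANOVA could be computed by hand, as between-groups variance divided by within-groups variance. The three groups are invented; 5.14 is the standard table value for F(2, 6) at the .05 level.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Between-groups sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: how far each score sits from its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three treatment groups.
f, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [6, 7, 8])
CRITICAL_05_F_2_6 = 5.14   # table value for F(2, 6), alpha = .05
print(f, df_b, df_w, f > CRITICAL_05_F_2_6)
```

The obtained F of 21.0 is higher than the table value, so the null hypothesis would be rejected.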
Two-way ANOVA
testing two independent variables
MANOVA
multivariate analysis of variance for when a study has more than one DV
ANCOVA
analysis of covariance. Tests two or more groups while controlling for extraneous variables called covariates. More powerful than ANOVA because it helps take out the influence of covariates.
Kruskal-Wallis
Used instead of one-way ANOVA when the data are nonparametric. Used when there are 3 or more groups.
Wilcoxon signed-rank test
used in place of the t test when the data are nonparametric and you’re testing whether two correlated means differ significantly.
Mann-Whitney U test
determines whether two uncorrelated means differ significantly when the data are nonparametric
Spearman Correlation or Kendall’s Tau
Used in place of Pearson r when parametric assumptions cannot be met.
Chi-square
nonparametric test. Examines whether obtained frequencies are significantly different from expected frequencies.
Covary positively
two variables vary together
Correlation does not mean
causation
Correlation research does not yield
cause-effect data
Bivariate
when correlational data describe the nature of two variables
Multivariate
when more than two variables are under scrutiny
N = 1 study structure
aka case study. Take a baseline of behavior, implement treatment, measure behavior again.
Single-blind study
the subjects are unaware of the treatment (control or experimental), but researcher is aware
Double-blind study
the subjects and researchers are unaware of the treatment (control or experimental). Useful to eliminate experimenter effects.
AB or ABA
time-series design. Rely on continuous measurement.
A - baseline secured
B - intervention implemented
A - outcome examined
Multiple-baseline design
when a researcher employs more than one target behavior
Pearson r
used with interval and ratio data. Memory device: I and R, as in “information and referral”
Spearman rho
used with ordinal data. Memory device: rho ends in “o” as in ordinal
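A sketch of Spearman rho for ranked (ordinal) data, assuming the standard shortcut formula rho = 1 - 6Σd² / (n(n² - 1)), which holds only when there are no tied ranks. The judges’ rankings are hypothetical.

```python
def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical: two judges rank the same five counseling interns.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_1, judge_2)
print(round(rho, 2))
```

The two judges mostly agree on order, so rho is high (0.8) without being perfect.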
Skewed distributions
not normal curve. Left and right sides of the curve are not mirror images
Mode
most frequent score. “Point of maximum concentration”
Mean
average score aka “x bar” or x̄.
Median
the middle score when data are arranged from highest to lowest.
Modal score
highest point on the curve
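The three measures of central tendency can be compared on one small, invented set of scores using Python’s standard `statistics` module. The single extreme score (150) is included on purpose to show how it pulls the mean.

```python
from statistics import mean, median, mode

# Hypothetical test scores with one extreme value.
scores = [70, 75, 75, 80, 85, 90, 150]

avg = mean(scores)       # arithmetic average, pulled upward by 150
middle = median(scores)  # middle score of the ordered list
most = mode(scores)      # most frequent score

print(round(avg, 1), middle, most)
```

The mean lands above every score except the extreme one and one other, while the median and mode stay close to where most scores sit.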
Normal distribution - 68% of scores
fall within +/- 1 SD of the mean
Normal distribution - 95% of scores
fall within +/- 2 SD of the mean
Normal distribution - 99.7% of scores
fall within +/- 3 SD of the mean
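The 68 / 95 / 99.7 figures can be checked with the standard library’s `NormalDist` (Python 3.8+). This is only a quick verification sketch; `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution lying within +/- k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1, within_2, within_3 = share_within(1), share_within(2), share_within(3)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

The values come out to roughly 0.683, 0.954, and 0.997, which the flashcards round to 68%, 95%, and 99.7%.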
The greater the SD
the greater the spread of scores
Bimodal distribution looks like
camel’s back with two humps
Factorial design
can be used when there are two or more independent variables. IVs in factorial designs are called factors, and the conditions of each factor are called levels
Mean is misleading when
the distribution is skewed or there are extreme scores.
Solomon four-group design
uses four groups: two experimental groups and two control groups. One experimental group and one control group are pretested; the other two are not. This allows the researcher to know whether results are influenced by pretesting.
Positively skewed
the tail points to the right (positive side)
Negatively skewed
the tail points to the left (negative side)
Raw score
Expressed in the units by which it was originally obtained.
Histogram
bar graph
Horizontal bar chart
when bars are drawn horizontally
Double-barred histogram
used to compare two distributions
X axis
aka “abscissa”. Horizontal. used to plot IV scores
Y axis
aka “ordinate”. vertical. Scale for DV.
Abscissa
x axis
Ordinate
y axis
Observer effect
situation in which a person observing influences or alters the situation
Naturalistic observation
occurs when clients are observed in a natural setting or situation. Researcher does not intervene.
Range
measure of variability. The difference between the highest and lowest scores. Some statistics books define the range as the highest score minus the lowest score plus 1. Range generally increases with sample size.
“Inclusive range”
highest score minus the lowest score plus 1
“Exclusive range”
highest score minus the lowest score
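A two-line illustration of the exclusive and inclusive range definitions, using invented scores.

```python
# Hypothetical scores.
scores = [52, 60, 71, 85, 98]

exclusive_range = max(scores) - min(scores)   # highest minus lowest
inclusive_range = exclusive_range + 1         # highest minus lowest plus 1

print(exclusive_range, inclusive_range)
```

For these scores the exclusive range is 46 and the inclusive range is 47.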
Sociogram
a diagram that depicts the relationships among individuals in a group in order to map the group’s social network
Scattergram
aka scatterplot - a pictorial diagram or graph of two variables being correlated
John Henry effect
threat to the internal validity of an experiment. Subjects strive to prove that an experimental treatment that could threaten their livelihood isn’t effective. E.g. counselor educators asked to use computers (and worried about computers taking over their jobs) will try to disprove the usefulness of computers by preparing more.
Variance
the measure of dispersion of scores around some measure of central tendency. Standard deviation squared.
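A brief sketch showing that variance is the standard deviation squared, using the standard library’s population formulas (`pvariance`, `pstdev`). The scores are made up, and treating them as a whole population rather than a sample is an assumption of this example.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical scores treated as an entire population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(scores)        # measure of central tendency
var = pvariance(scores)      # average squared distance from the mean
sd = pstdev(scores)          # square root of the variance

print(center, var, sd)
```

Here the mean is 5, the variance is 4, and the SD is 2, so SD squared equals the variance.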
z-score
a raw score expressed in standard deviation units above or below the mean. Z-scores are also called standard scores.
T-score
transformed standard scores. Use a mean of 50 and an SD of 10.
One tailed t test
places the rejection area at one end of the distribution. Aka one-directional experimental hypothesis.
two-tailed t test
places the rejection area at both ends of the distribution. Aka nondirectional experimental hypothesis
CEEB
College Entrance Examination Board. Scores are standardized, and the scale ranges from 200 to 800 with a mean of 500 and an SD of 100. Aka ETS score.
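The standard-score cards above (z-score, T-score, CEEB) are all rescalings of the same thing. A sketch with a hypothetical raw score of 115 on a test assumed to have a mean of 100 and an SD of 15:

```python
def z_score(raw, mean, sd):
    """Distance from the mean in standard deviation units."""
    return (raw - mean) / sd

def t_score(z):
    """T-score scale: mean 50, SD 10."""
    return 50 + 10 * z

def ceeb_score(z):
    """CEEB scale: mean 500, SD 100."""
    return 500 + 100 * z

# Hypothetical test: mean 100, SD 15, raw score 115.
z = z_score(115, 100, 15)
t = t_score(z)
ceeb = ceeb_score(z)
print(z, t, ceeb)
```

A score one SD above the mean is z = 1.0, T = 60, and CEEB = 600.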
Platykurtic distribution
looks like the upper half of a hot dog lying on its side over the abscissa. Flatter and more spread out than the normal distribution.
Kurtosis
peakedness of a frequency dist.
Leptokurtic
very tall, thin, peaked.
Stanine scores
divide the distribution into 9 intervals. Stanine 1 = lowest and stanine 9 = highest; 5 is the mean.
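One common way to assign a stanine from a z-score is sketched below. It assumes the usual convention that stanines have a mean of 5 and an SD of 2, with each middle stanine half an SD wide; the function name is illustrative.

```python
from math import floor

def stanine_from_z(z):
    """Map a z-score to a stanine (1-9), assuming mean 5 and SD 2."""
    raw = floor(2 * z + 5.5)        # round 2z + 5 to the nearest whole stanine
    return max(1, min(9, raw))      # stanines 1 and 9 absorb the extreme tails

stanines = [stanine_from_z(z) for z in (-3.0, -1.0, 0.0, 1.0, 3.0)]
print(stanines)
```

An average score (z = 0) falls in stanine 5, and very extreme scores in either direction are capped at 1 or 9.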
4 basic measurement scales
nominal, ordinal, interval, ratio. S. S. Stevens came up with these groups. Memory device: NOIR.
Nominal
strictly qualitative scale; the simplest. Distinguishes logically separated groups. No 0 point, and doesn’t indicate order.
Ordinal scale
rank-orders variables. Provides relative placement or standing but does not delineate absolutes. (ordinal=order)
interval
has numbers scaled at equal distances but has no absolute 0 point. Most test scores fall into this category. Can add and subtract but not divide or multiply.
Ratio scale
interval scale with a true 0 point. Ratio measurements are possible. Subtraction, addition, multiplication, and division can all be used. Most psychological attributes cannot be measured on a ratio scale. Examples of ratio measurements: time, weight, height, volume, distance.
survey
needs at least 100 people and a 50-75% return rate to be accurate. Drawbacks include low return rate, poor construction, and non-random selection of subjects, so the sample may not be representative of the population.
nocebo
placebo with negative effect.
Hawthorne effect
Elton Mayo and Fritz Roethlisberger, 1924-1932, at Hawthorne Works. Work production increased with both better lighting and worse lighting. If subjects know they are part of an experiment, their performance improves. Aka reactive or reactivity effect (subjects’ behavior is influenced by the presence of the researcher)
Rosenthal effect
aka experimenter expectancy effect. The experimenter’s beliefs or expectations about the experiment influence the subjects’ performance.
Covariate
correlates with the DV. unintended variable that can influence the DV.
Statistical regression
predicts very high and very low scores will move towards the mean if test is administered again. Based on law of filial regression.
Quartile
refers to the points that divide dist into fourths. 25th percentile is first quartile
Inter-quartile range
distance between the 25th and 75th percentiles.
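Quartiles and the inter-quartile range can be sketched with `statistics.quantiles` (Python 3.8+). The scores are invented, and note that textbooks and software use slightly different interpolation rules, so quartile values can differ a little between methods.

```python
from statistics import quantiles

# Hypothetical scores: the whole numbers 1 through 11.
scores = list(range(1, 12))

# n=4 asks for the three cut points that divide the distribution into fourths.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1   # distance between the 25th and 75th percentiles

print(q1, q2, q3, iqr)
```

The second quartile is the median, and the inter-quartile range covers the middle half of the scores.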
Cross sectional study
aka synchronic method. Clients are assessed at one point in time
Diachronic method
longitudinal study. Data are collected at different points in time.
Demand characteristics
relates to any bit of knowledge correct or incorrect that a subject in the experiment is aware of that will influence their behavior. These characteristics can confound an experiment.
Summative
at end. Used to assess final formal product
Pygmalion effect
experimenter falls in love with own hypothesis and so experiment becomes self fulfilling prophecy
Ahistoric therapy
any model that focuses on here and now rather than the past
Multiple treatment interference
if a subject receives more than one treatment it becomes hard to discern which treatment caused the results.
ERIC
educational resources information center. Resource bank of scholarly literature.
SPSS
Statistical package for the social sciences. Software program that helps with computing statistics.
Random sampling
each subject has the same probability of being selected. Selection of 1 subject doesn’t affect selection of another.
Stratified sampling
looking for a special characteristic, or “stratum”, to be represented. The stratification variable in the sample should mimic the population at large. Ex: if 20% of Rogerian counselors are AA, then 20% of the sample should be AA counselors.
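A sketch of proportional stratified sampling that mirrors the 20% example. The population, the group labels, and the `stratified_sample` helper are all hypothetical; random selection within each stratum is an assumption of this sketch.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)   # fixed seed so the example is repeatable
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 100 counselors: 20 in group "AA", 80 in group "other".
population = [("AA", i) for i in range(20)] + [("other", i) for i in range(80)]

sample = stratified_sample(population, stratum_of=lambda p: p[0], sample_size=10)
aa_count = sum(1 for p in sample if p[0] == "AA")
print(len(sample), aa_count)
```

Because 20% of the population is in the first group, 2 of the 10 sampled counselors come from it.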
Quota sampling
type of stratified sampling. Where specific number of cases are necessary from each stratum.
Cluster sampling
used when it is nearly impossible to find a list of the entire population. Uses an existing sample or cluster of people or selects a portion of the overall sample. Won’t be as accurate as a random sample.
Horizontal sampling
occurs when researcher selects subjects from single socio economic class.
Vertical sampling
converse of above when 2 or more socio economic classes are utilized.
Snowball sampling
aka chain-referral sample. Uses subjects to find more subjects.
Systematic sampling
take every nth person. Choose a list of people and pick every nth one (e.g. every 3rd, every 10th, etc.).
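Taking every nth person from a list maps directly onto list slicing. A sketch with a made-up roster of 20 clients, taking every 5th:

```python
def systematic_sample(people, n, start=0):
    """Take every nth person from the list, beginning at index `start`."""
    return people[start::n]

# Hypothetical roster: client_01 through client_20.
roster = [f"client_{i:02d}" for i in range(1, 21)]

every_5th = systematic_sample(roster, 5, start=4)   # start at the 5th person
print(every_5th)
```

This picks the 5th, 10th, 15th, and 20th clients on the list.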
Sampling error
what happens when sample does not mimic population.
Axiom
universally accepted idea needing no additional proof
Operational definition
outline of a procedure, especially in research, so that other researchers can replicate the experimental procedure. Must be specific.
Non parametric tests
distribution-free tests. Some examples include the Mann-Whitney U test, Wilcoxon signed-rank test for matched pairs, Solomon, and Kruskal-Wallis H test.
Matched design
subjects are matched in regards to any variable that could be correlated with DV.
Unmatched or uncorrelated groups
aka independent groups
Inductive logic
research goes from specific to general
Deductive
reduces general to specific
Impaired
deterioration in the ability to function as a counselor
Within subjects design
when 2 or more levels of IV are administered to each subject. Each subject acts as his/her own control.