Statistics Flashcards

1
Q

What does the validity of a test refer to?

A

Validity of a test refers to the extent to which the test measures what it was designed to measure

2
Q

What is internal validity?

A

Internal validity is the extent to which the intervention or manipulation of the IV accounts for changes in the DV

3
Q

What are the most common risks to internal validity?

A
  1. History (an event other than the manipulation of the IV, occurring inside or outside the experiment, that could account for change in the DV)
  2. Maturation (development/decay of biological or psychological factors)
  3. Testing effect (taking the test more than once, e.g., a pre-test effect)
  4. Instrumentation (change in instruments/measuring procedures during the course of the experiment)
  5. Regression to the mean (tendency of extreme scores to regress toward the mean)
  6. Selection bias
4
Q

What is the Rosenthal effect/Pygmalion effect?

A

The tendency for participants’ performance to be affected by the expectations of the tester (e.g., students perform better than other students because the teacher expects them to)

5
Q

How do researchers safeguard internal validity?

A
  1. Random assignment
  2. Matching subjects on possibly relevant characteristics (less powerful than random assignment, but necessary when groups cannot be randomly assigned)
  3. Blocking (study the effects of extraneous subject characteristics by treating the extraneous characteristics as IVs)
  4. ANCOVA (analysis of covariance): a statistical procedure developed to account for group differences in extraneous characteristics
6
Q

What is external validity?

A

Refers to the generalizability of a study’s or test’s results; that is, the limits or boundaries of the findings.

7
Q

What are threats to external validity?

A
  1. Interactions between subject selection and treatment (treatment effects do not generalize to other members of the population)
  2. Testing and treatment effects (treatment effects do not generalize to individuals who did not participate in the pre-testing, e.g., from demand characteristics)
  3. History and treatment effects (treatment effects depend on the history of the testing period)
  4. Demand characteristics (cues in the research setting that alert subjects to the hypothesis)
  5. Hawthorne effect (tendency of subjects to respond differently because they know they are participating in a study)
  6. Order effects (a risk in repeated-measures designs)
8
Q

What is Hawthorne effect?

A

Tendency of subjects to behave differently when they are in a research study

9
Q

How do researchers safeguard external validity?

A
  1. Random selection from population of interest
  2. Conduct naturalistic/field research
  3. Use single or double blind research designs
  4. Counterbalancing (eg, vary the order of treatment strategies among participants to eliminate order effects)
  5. Stratified random sampling (select random sample from each of several subgroups of target population)
  6. Cluster sampling (the unit of sampling is naturally occurring groups of individuals)
10
Q

What is content validity?

A

The degree to which items on a test represent the domain the test is supposed to measure (e.g., does a test of depression include items that measure vegetative symptoms)

11
Q

What is face validity?

A

The extent to which a test APPEARS to measure what it claims to measure. Face validity can be mistaken for content validity.

12
Q

What is criterion-related validity?

A

Refers to the relationship between test scores and a criterion of interest (correlation of a measure with a criterion, e.g., correlation of performance on an academic test with grades). There are 2 types of criterion-related validity: 1. concurrent validity; 2. predictive validity.

13
Q

What is concurrent validity?

A

One type of criterion-related validity. Refers to the correlation of performance on 2 measures at the same point in time (e.g., correlation of performance on an achievement test with current grades)

14
Q

What is predictive validity?

A

A type of criterion validity. Refers to the relation between test scores and future performance on a criterion of interest.

15
Q

What is construct validity?

A

There are 2 ways to think about construct validity:

  1. Construct validity of an experiment: If a causal relationship has been determined between the IV and DV (i.e., internal validity has been supported), construct validity refers to the experimenter’s explanation for why the causal relationship exists (i.e., what aspect of the intervention was the causal agent).
  2. The more common use of construct validity is in the context of test development. It refers to the extent to which a measure assesses the psychological construct or trait it was designed to measure. In this context there are 2 types of construct validity: convergent validity and divergent validity.
16
Q

Convergent validity

A

A type of construct validity. The extent to which 2 measures assess similar constructs; construct validity is supported when measures of the same construct are correlated. It also refers to the similarity among measures of the same construct administered in different formats (e.g., self- and parent-report of sxs; essay vs. multiple-choice responses)

17
Q

What is divergent validity?

A

Another type of construct validity, also known as discriminant validity: the correlation between measures of different constructs. Construct validity is supported by LOW correlations between measures of different constructs.

18
Q

How do convergent and divergent validity relate to construct validity?

A

Construct validity is supported when a measure correlates highly with other measures of the same construct (convergent validity) and does not correlate with measures of different constructs (divergent validity)

19
Q

How can construct validity of a test be established?

A
  1. Establish a relationship between performance on the test and the theoretical construct it was designed to measure (e.g., examine concurrent validity and predictive validity)
  2. Conduct a factor analysis of test items and assess whether the factors relate to the constructs of interest
  3. Perform a factor analysis of the test’s items together with items from established tests with known construct validity
  4. Establish that performance on the test correlates more highly with tests of similar constructs (convergent validity)
20
Q

What is reliability?

A

Reliability is the consistency of a measure. It refers to the extent to which the measure of a construct/trait is free from the influence of random, unsystematic factors.

21
Q

4 types of reliability

A
  1. Internal reliability (consistency among items in a test)
  2. Test-retest reliability
  3. Alternate-form reliability
  4. Interrater reliability
22
Q

How are validity and reliability related?

A

A test can be reliable but NOT valid, but not the other way around. Reliability is necessary but not sufficient for validity.

23
Q

What are the 2 components of a test score according to psychometric theory?

A
  1. True score
  2. Error score
    True score refers to all systematic information that contributes to the score. Error refers to all the random “noise” in the score. Note: true score does not refer to the underlying construct of interest, only to systematic variation in the score.
24
Q

How can true score be ID’d?

A

True score is a hypothetical construct and cannot be directly observed. It is best estimated as the mean of repeated measures on the same test (each score includes the true score and error, so the true score is estimated as the average of these repeated measures). It is also assumed that the distribution of scores is normal.

25
Q

What types of variance contribute to the reliability ratio?

A

The ratio of true-score variance to observed-score variance. Remember that measurement theory proposes that each test score consists of the true score, reflecting stable characteristics of the participant, and error, reflecting random/unpredictable variation in the score.

26
Q

What is the reliability coefficient?

A

This coefficient, often designated “r”, ranges from 0, indicating no relationship between 2 measures, to 1, indicating a perfect relationship between 2 measures; r is used to express all 4 types of reliability: internal reliability, test-retest reliability, alternate-form reliability, and interrater reliability

27
Q

What measure is used for test-retest reliability and alternate-form reliability?

A

Pearson’s product moment correlation
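As a sketch, Pearson’s product-moment correlation can be computed directly from its definition (covariance divided by the product of the standard deviations); the score lists below are made-up illustration data:

```python
# Minimal sketch of Pearson's product-moment correlation (r),
# using only the standard library. Input data are hypothetical.
from statistics import mean, pstdev

def pearson_r(x, y):
    """Correlation between paired score lists x and y."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))
```

For example, test and retest scores that rise in perfect lockstep give r = 1, and scores that move in exact opposition give r = −1.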

28
Q

What is the most common measure for internal test reliability?

A

Cronbach’s alpha; it can be used for tests in which items have a range of responses
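As an illustration, Cronbach’s alpha is k/(k−1) × (1 − Σ item variances / total-score variance). A sketch using only the standard library; the item scores are made up:

```python
# Minimal sketch of Cronbach's alpha.
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per test item, aligned across test-takers."""
    k = len(items)                                    # number of items
    item_var_sum = sum(pvariance(i) for i in items)   # sum of item variances
    totals = [sum(person) for person in zip(*items)]  # each person's total score
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))
```

Perfectly parallel items yield alpha = 1; as items become less consistent with one another, alpha falls toward 0.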

29
Q

What measure is used to evaluate the effect of lengthening or shortening a test?

A

Spearman-Brown correction formula

30
Q

What does the Kuder-Richardson formula refer to?

A

It assesses the reliability of a test with dichotomous responses (0 or 1)
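KR-20 has the same form as Cronbach’s alpha, with each item’s variance replaced by p·q (proportion passing times proportion failing the item). A sketch with made-up 0/1 responses:

```python
# Minimal sketch of the Kuder-Richardson formula 20 (KR-20).
# KR-20 = k/(k-1) * (1 - sum(p*q) / variance(total scores))
from statistics import pvariance

def kr20(items):
    """items: one list of 0/1 responses per item, aligned across test-takers."""
    k = len(items)
    pq = 0.0
    for item in items:
        p = sum(item) / len(item)  # proportion answering 1 ("passing")
        pq += p * (1 - p)
    totals = [sum(person) for person in zip(*items)]
    return (k / (k - 1)) * (1 - pq / pvariance(totals))
```

For dichotomous items, p·q equals the item variance, so KR-20 and Cronbach’s alpha agree.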

31
Q

What are acceptable scores of reliability?

A

.80 and above: good reliability
.70–.79: acceptable reliability, depending on the purpose of the test
.60–.69: marginally reliable
.59 and below: not reliable

32
Q

What does split-half reliability refer to?

A

A form of internal reliability (Cronbach’s alpha can be used). The test items are split in half and the consistency of responses between the two halves is compared.

33
Q

What is the most common measure of interrater reliability?

A

Kappa
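Cohen’s kappa corrects raw percent agreement for agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A sketch with hypothetical ratings from two raters:

```python
# Minimal sketch of Cohen's kappa for two raters assigning categories.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # chance agreement from each rater's marginal category frequencies
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance.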

34
Q

How does test length affect reliability?

A

In general, more test items increase reliability

35
Q

How does range of scores affect reliability?

A

The greater the range, the better the reliability

36
Q

What is SEM, or standard error of measurement?

A

It is the estimate of the amount of ERROR associated with an obtained score (obtained score = true score + error). In computational form, the SEM is the standard deviation of the distribution of error scores.
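A standard computational form is SEM = SD × √(1 − reliability), where SD is the standard deviation of the test’s scores. A sketch (the IQ-style numbers in the example are illustrative):

```python
# Standard error of measurement: SEM = SD * sqrt(1 - reliability).
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """sd: standard deviation of test scores; reliability: e.g., Cronbach's alpha."""
    return sd * sqrt(1 - reliability)
```

For example, an IQ-style test with SD = 15 and reliability = .91 has SEM = 4.5; a perfectly reliable test has SEM = 0.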

37
Q

How does SEM relate to reliability?

A

Inversely related; the larger the SEM, the less reliable the test is

38
Q

What does the standard deviation of the distribution of error scores (i.e., the SEM) mean?

A

Psychometric theory proposes that the “true” score can be estimated as the mean of a distribution of obtained scores. Each obtained score includes the true score and error, and the distribution of these scores falls in the normal distribution (the bell curve). If one were to calculate the error scores by subtracting the hypothetical true score from each obtained score and plot those error scores, they would also fall in a normal distribution.

39
Q

What is a confidence interval?

A

The range of scores around the obtained score that likely includes the “true” score
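In the simple textbook form, the interval is obtained score ± z × SEM (z ≈ 1.96 for 95%). A sketch using that form; more refined versions center the interval on the estimated true score rather than the obtained score:

```python
# Simple confidence interval for the true score: obtained score +/- z * SEM.
from math import sqrt

def true_score_ci(score, sd, reliability, z=1.96):
    """Returns (lower, upper) bounds around an obtained score."""
    sem = sd * sqrt(1 - reliability)  # standard error of measurement
    return (score - z * sem, score + z * sem)
```

For an obtained score of 100 on a test with SD = 15 and reliability = .91, a z of 2 gives roughly 91 to 109.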

40
Q

What is a “true experimental design”?

A

Experiment in which subjects are randomly assigned to treatment and control groups

41
Q

What are the benefits of a “true” experimental design?

A
  1. Greater experimental control
  2. Greatest protection of internal validity (support for a causal relationship between IV and DV)

42
Q

What is the name of an experimental design in which subjects are not randomly assigned to groups, but the IV is manipulated (e.g., effects of alcohol on men’s and women’s reactions to violent films)?

A

Quasi-experimental design; in this example, the level of alcohol is manipulated (IV) and the DV (reactions to violent films) is measured. Gender is also an IV of interest, but it cannot be manipulated/randomly assigned.

43
Q

What is a correlational design?

A

Variables are not manipulated, so no causal relationship can be assessed! It allows for the study of relationships among naturally occurring factors (e.g., the relationship of white-matter density to performance on cognitive tests)

44
Q

What are 3 study designs to assess developmental changes?

A
  1. Longitudinal
  2. Cross-sectional: select subjects of different ages and study them at the same time
  3. Cross-sequential: different age groups are studied repeatedly over a short period of time; this combines the benefits of cross-sectional and longitudinal designs
45
Q

What are the pros and cons of longitudinal research

A

Pros: the subject is his/her own control
Cons: cost, time-intensive, attrition, practice effects; tends to underestimate true age-related changes

46
Q

What are the pros and cons of cross-sectional research

A

Pros: less costly, provides results faster. Cons: cohort effects; differences could be due to experience rather than age. Tends to overestimate true age-related changes.

47
Q

What is a time series design?

A

Dependent variable is measured multiple times, at regular intervals, before and after treatment is administered.

48
Q

What are the pros and cons of time series designs?

A

Pros: person is own control, controls for maturation, regression to the mean, and testing effects
Cons: history effect is a threat to internal validity

49
Q

What is single-subject design?

A

Research involves a single subject (or a small group of subjects). Baseline measurements are taken and then the intervention is administered. The DV is measured several times at baseline, during administration, and after.

50
Q

What does ABAB stand for?

A

A reversal single-subject design. This design alternates multiple baseline (A) and treatment (B) conditions, so the effects of both treatment and treatment withdrawal can be measured.

51
Q

What are some uses of qualitative/descriptive research?

A
  1. Develop theories of relationships among variables
  2. Used for pilot studies to better understand IVs
  3. Used with observation, interviews, surveys, case studies
52
Q

Name the 4 scales of measurement

A
  1. Nominal: “names”; unordered categories
  2. Ordinal: “order”; ranked data, e.g., Likert scales
  3. Interval: equal intervals but no absolute zero, so ratios cannot be formed (e.g., IQ)
  4. Ratio: has an absolute zero (e.g., weight, time)
53
Q

What are parametric statistics?

A

Statistics analyzing interval and ratio data

54
Q

What are the assumptions of parametric statistics?

A
  1. Normal distribution
  2. Homogeneity of variance: variance is equal among all groups
  3. Independence of observations (one data point is not dependent on another data point)
    *Note: parametric stats are somewhat robust to violations of normal distribution and homogeneity of variance. They are not robust to multicollinearity or non-independent measures.
55
Q

Multicollinearity

A

A phenomenon in which 2 or more predictor variables in a multiple regression model are highly correlated, meaning that one can be predicted from the others with a substantial degree of accuracy

56
Q

Is multicollinearity a problem?

A

Yes; when predictors are correlated with each other, it can inflate the variance of the coefficient estimates

57
Q

Sensitivity vs. specificity

A

Sensitivity:

If a person has the disease, how often will the test be positive (true positive rate)?
Put another way, if the test is highly sensitive and the test result is negative, you can be nearly certain that the person doesn’t have the disease.
A SeNsitive test helps rule OUT disease (when the result is negative): “SnNout”
Sensitivity = true positives / (true positives + false negatives)

58
Q

Specificity

A

Specificity is the ability of the test to correctly identify those without the disease (true negative rate).

If a person does not have the disease, how often will the test be negative (true negative rate)?
In other terms, if the result of a highly specific test is positive, you can be nearly certain that the person actually has the disease.

A very SPecific test rules IN disease with a high degree of confidence: “SpPin”

Specificity = true negatives / (true negatives + false positives)

59
Q

Positive predictive value?

A

PV+ asks: “If the test result is positive, what is the probability that the patient actually has the disease?”

PV+ = true positives / (true positives + false positives)

60
Q

Negative predictive value?

A

PV− asks: “If the test result is negative, what is the probability that the patient does not have the disease?”

PV− = true negatives / (true negatives + false negatives)
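The four formulas on these cards can be sketched together from the confusion-matrix counts (the counts in the example are made up):

```python
# Sensitivity, specificity, PPV, and NPV from confusion-matrix counts.
def diagnostic_stats(tp, fp, fn, tn):
    """tp/fp/fn/tn: true-positive, false-positive, false-negative,
    true-negative counts from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate ("SnNout")
        "specificity": tn / (tn + fp),  # true negative rate ("SpPin")
        "ppv": tp / (tp + fp),          # positive predictive value (PV+)
        "npv": tn / (tn + fn),          # negative predictive value (PV-)
    }
```

Note that sensitivity and specificity are properties of the test itself, while PV+ and PV− also depend on how common the disease is in the tested sample.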