Sattler Article Flashcards

1
Q

Descriptive Statistics

A

Summarizes data obtained on a sample of individuals.

Areas studied are

  1. Scales of Measurement
  2. Measures of Central Tendency
  3. Measures of Dispersion
  4. The Normal Curve
  5. Correlations
2
Q

Scales of Measurement:

A

used to assign values or scores to some measurable trait or characteristic. The scores can then be subjected to mathematical procedures to determine relationships between the traits or characteristics of interest and other measured behaviours.

3
Q

Scales of Measurement

A
  1. Nominal
  2. Ordinal
  3. Interval
  4. Ratio
4
Q
  1. Nominal Scale:
A

Consists of a set of unordered categories, one of which is assigned to each item being scaled, e.g., Male = 1, Female = 2. It allows for classification.

5
Q
  2. Ordinal Scale:
A

Has the property of order. The variable being measured is ranked without regard for differences in the distance between scores, e.g., ranking persons from highest to lowest.

6
Q
  3. Interval
A

Has an arbitrary zero point and equal units, e.g., the Celsius scale.

7
Q
  4. Ratio Scales
A

Has equal units and a true, meaningful zero point, e.g., weight. Ratio scales are not usually found in psychology.

8
Q

Measures of Central Tendency

A
  1. Mean: the average of a set of scores.
  2. Median: the middle point in a set of scores; 50% lie above and 50% below. If there is an even # of scores, the median is the number halfway between the two middlemost scores and therefore is not any of the actual scores.
  3. Mode: the score that occurs more often than any other.
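The three measures can be computed directly with Python's standard library; the score set below is a made-up example chosen so the three measures differ:

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 5, 7, 9]  # hypothetical set of six test scores

print(mean(scores))    # average: 29/6, about 4.83
print(median(scores))  # even # of scores: halfway between 3 and 5 = 4.0
print(mode(scores))    # 3 occurs more often than any other score
```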
9
Q

Measures of Dispersion

A
  1. Range (the simplest)
  2. Variance
  3. Standard Deviation

The most frequently used are the variance and the standard deviation.

10
Q
  1. Range
A

The range represents the distance between the highest and lowest scores in a set. It is obtained by subtracting the lowest score in the set from the highest score:

Range = Highest Minus Lowest

R = H - L

11
Q
  2. Variance
A

Variance measures the amount of spread in a group of scores: the greater the spread in a group of scores, the greater the variance.

See Variance Formula
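The card defers to a formula that is not reproduced here; the standard population form (assumed, since the original figure is not included) averages the squared deviations from the mean:

```latex
s^2 = \frac{\sum (X - M)^2}{N}
```

where X is each score, M the mean, and N the number of scores. The sample form divides by N - 1 instead.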

12
Q
  3. Standard Deviation
A

Is the positive square root of the variance. A commonly used measure of the extent to which scores deviate from the mean, the standard deviation is often used in the field of testing and measurement.

See formula
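As a sketch, both measures for a made-up score set, using the population formulas from Python's standard library:

```python
from statistics import pvariance, pstdev

scores = [4, 6, 8, 10, 12]   # hypothetical scores; mean = 8

print(pvariance(scores))     # squared deviations 16+4+0+4+16 = 40; 40/5 = 8.0
print(pstdev(scores))        # positive square root of the variance, about 2.83
```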

13
Q

Normal Curve:

A

The normal, bell-shaped curve is a common type of distribution, and many psychological traits are distributed roughly along a normal curve. It enables us to calculate exactly what proportion of cases falls between any two points under the curve, and it shows the percentage of cases that fall within 1, 2, or 3 standard deviations above and below the mean.

14
Q

Correlations

A

Correlations tell us about the degree of association or co-relationship between two variables, including the strength and direction of their relationship.

The strength of the relationship is determined by the absolute magnitude of the correlation coefficient; the maximum value is 1.00.

The direction of the relationship is given by the sign of the coefficient. A positive correlation (+) indicates that a high score on one variable is associated with a high score on the second variable.

Conversely, a negative (-) relationship signifies an inverse relationship; that is, a high score on one variable is associated with a low score on the other variable.

Thus correlation coefficients range in value from -1.00 to +1.00.
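A minimal sketch of the coefficient described above, computed from scratch on made-up data (the study-hours example is invented for illustration):

```python
from math import sqrt

def pearson_r(x, y):
    """Correlation: co-variation of x and y, scaled so the result lies in [-1, +1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

hours  = [1, 2, 3, 4, 5]        # hypothetical hours of study
grades = [52, 58, 63, 71, 76]   # rise with hours: strong positive correlation

print(pearson_r(hours, grades))         # close to +1.00
print(pearson_r(hours, grades[::-1]))   # reversed pairing: close to -1.00
```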

15
Q

Prediction:

A

The higher the correlation between two variables the more accurately one can predict the value of one variable when supplied only with the value of the other variable.

A correlation of -1.00 or +1.00 means that one can predict perfectly a person's score on one variable if the score on the other variable is known.

In contrast, a correlation of 0.00 indicates that there is no (linear) way of predicting scores on one variable from knowledge of scores on the other variable.

16
Q

Pearson's r

A

is the most common correlation coefficient. It is not affected by linear transformations of the scores. When the assumptions of Pearson's r cannot be met, the Spearman rank-order method can be used.

17
Q

Correlations do not….

A

provide us with info about whether an observed relationship reflects a simple cause-effect relationship or some more complex relationship.

18
Q

Correlations

A

used as validity coefficients must be squared in order to determine the amount of variance explained by the predictor (or test). The value of r² is known as the coefficient of determination.

19
Q

Regression

A

The correlation coefficient can be used to construct the best possible linear equation for predicting the score on one variable when the score on another variable is known.

Y = bX + a
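The regression equation above can be fitted by least squares; a minimal sketch on toy data (the numbers are invented so the fit is exact):

```python
def fit_line(x, y):
    """Least-squares line Y = bX + a: slope from co-variation over X's variation,
    intercept chosen so the line passes through the point of means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return b, a

b, a = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # perfectly linear toy data
print(b, a)        # slope 2.0, intercept 1.0, i.e. Y = 2X + 1
print(b * 5 + a)   # predicted Y when X = 5: 11.0
```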

20
Q

Standard Error of Estimate

A

A measure of the accuracy of the predicted Y scores is the standard error of estimate.

See formula

The higher the correlation between X and Y, the smaller the standard error of estimate and hence the greater the average accuracy of predictions will be.

A +1.00 (or -1.00) correlation coefficient means that perfect predictions can be made. A .00 correlation means that you cannot improve your prediction of Y scores by knowing the associated X scores.
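The card defers to a formula that is not reproduced here; in the common textbook form (assumed, since the original figure is not included):

```latex
s_{est} = s_Y \sqrt{1 - r_{XY}^2}
```

where s_Y is the SD of the Y scores and r_XY the correlation between X and Y. At r = ±1.00 the error is zero (perfect prediction); at r = .00 it equals the SD of the Y scores, consistent with the two cases described above.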

21
Q

Norm Referenced Measurement

A

In norm-referenced testing, an examinee's performance is compared with the performance of a specific group of subjects. A norm provides an indication of the average or typical performance of the specified group. Norms are needed because a raw test score in itself is not very meaningful; we need to know how others performed on the test. The comparison is carried out by converting the child's raw score into some relative measure, termed a derived score, which indicates the child's standing relative to the norm group. Derived scores also allow us to compare the child's performance on one test with his performance on another test.

22
Q

Evaluating the Norm Group

A
  1. Representativeness: the norm sample should match the demographic characteristics of the population as a whole.
  2. Size: The # of subjects should be large enough to ensure stability of the test scores and inclusion of the various groups that are represented in the population. The larger the # used the better.
  3. Relevance: The correct norm group must be chosen to evaluate the examinee’s test results against.
23
Q

Derived Scores

A
  1. Age Equivalent and Grade Equivalent Scores
  2. Ratio IQ’s
  3. Percentile Ranks
  4. Standard Scores
  5. Stanines
24
Q

Percentile Ranks

A

Are derived scores that permit us to determine an individual's position relative to the standardization sample. A percentile rank is a point in a distribution at or below which the scores of a given percentage of individuals fall.

25
Q

Standard Scores

A

are raw scores that have been transformed to have a given mean and standard deviation. They express how far an examinee's score lies from the mean of the distribution in terms of the standard deviation.

26
Q

Z Scores

A

are a type of standard score with a mean of 0 and a standard deviation of 1. Most z scores range from -3.0 to +3.0. Z scores are frequently transformed into other standard scores to eliminate the + and - signs, e.g., a T score.

27
Q

T Score

A

A T Score is a standard score with a mean of 50 and a standard deviation of 10.

28
Q

The Deviation IQ is another standard score

A

It has a mean of 100 and a SD of 15.
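The standard-score scales in the cards above differ only in the mean and SD chosen; a sketch of the conversion from a z score:

```python
def rescale(z, new_mean, new_sd):
    """Place a z score onto a standard-score scale with the given mean and SD."""
    return new_mean + z * new_sd

z = 1.0                     # one SD above the mean
print(rescale(z, 50, 10))   # T score: 60.0
print(rescale(z, 100, 15))  # Deviation IQ: 115.0
print(rescale(z, 100, 16))  # Stanford-Binet deviation IQ: 116.0
```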

29
Q

Stanford Binet

A

Deviation IQ’s with a mean of 100 and a SD of 16

30
Q

Percentile Ranks Based on IQ Scores

A

An IQ of 100 represents the 50th percentile because 100 is the mean of the distribution. An IQ of 115 is 1 SD above the mean. The percentile rank associated with this IQ, the 84th, is obtained by adding 50% to 34%: the 50% represents the proportion of the population below the mean of 100, and the 34% represents the proportion of the population between the mean and +1 SD above the mean.
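The same percentile ranks can be read off the normal curve directly; a sketch using Python's NormalDist (in the standard library since 3.8), assuming the Wechsler-style scale with mean 100 and SD 15:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)   # deviation-IQ scale

print(round(iq.cdf(100) * 100))     # 50: the mean is the 50th percentile
print(round(iq.cdf(115) * 100))     # 84: +1 SD, i.e. 50% + 34%
```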

31
Q

Statistical Significance

A

Convention has established the .05 level as the minimum significance level for concluding that observed differences are real; such results would occur only 5% of the time by chance.

32
Q

Reliability

A

Refers to the consistency of measurements. Test results need to be dependable, reproducible, stable (reliable) and meaningful (valid).

Reliability is expressed by a reliability coefficient or by the standard error of measurement, which is derived from the reliability coefficient.

A test should not be trusted if its reliability coefficient is low. For most tests of cognitive and special abilities, a reliability coefficient of .80 or higher is considered acceptable.

33
Q

The theory of reliability of Measurement

A

When a child takes a test on several occasions, the scores sometimes change in a systematic way (a regular increase or decrease in scores) and sometimes in a random or unsystematic way.

A test is unreliable if scores are subject to random, unsystematic fluctuations, and therefore not dependable.

Reliability of measurement refers to the extent to which unsystematic variation affects the measurement of a trait or characteristic.

34
Q

Scoring

A

The score an examinee obtains on a test is composed of a true score and error score.

The obtained score is a composite of the amount of the trait the child actually possesses (the true score) and the error of measurement (the error score).

The child's true score is a hypothetical construct; it cannot be observed.

The theory assumes that the child possesses stable traits, that errors are random, and that the obtained score results from the addition of true and error scores.

The reliability coefficient, then, represents a ratio of the true score variance to the observed score variance.

35
Q

Reliability Coefficients

A

expresses the degree of consistency in the measurement of test scores.

Reliability coefficients range from 1.00 (perfect reliability) to 0.00 (no reliability).

36
Q

3 types of reliability

A
  1. Test-Retest
  2. Alternate Forms
  3. Split Half
37
Q
  1. Test-Retest Reliability
A

Is an index of stability: administer the same test to the same group on two different occasions within a short period of time (two weeks to a month).

The obtained correlation, called the coefficient of stability, represents the extent to which the test is consistent over time.

This correlation is affected by factors associated with the specific administration of the test or with what the child has remembered or learned in the interim.

The shorter the retest interval, the higher the reliability coefficient as there are fewer reasons for an individual’s score to change.

38
Q
  2. Alternate Form Reliability:
A

Also called equivalent or parallel form reliability, it is obtained by administering two equivalent tests to the same group of examinees. If the two forms of the test are equivalent, they should have the same means and variances and a high reliability coefficient. If there is no error in measurement, an individual should earn the same score on both forms of the test.

Half the group is given Form A and then Form B; the other half is given Form B and then Form A.

Scores from the two forms are then correlated yielding a correlation coefficient.

The two forms should be given within a short period of time.

Because examinees are not tested twice with the same items, there is less chance that their memory will affect the scores.

39
Q
  3. Split-Half
A

Divide the test into two equivalent halves, creating two alternate forms of the test. You can assign odd-numbered items to one half and even-numbered items to the other. This procedure assumes that all items measure the same trait.
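A sketch of the odd/even procedure on made-up pass/fail item data. The final step-up to full-test length uses the Spearman-Brown formula, a standard companion to split-half reliability that the card itself does not mention:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cov / sqrt(sum((xi - mx) ** 2 for xi in x) *
                      sum((yi - my) ** 2 for yi in y))

def split_half_reliability(item_matrix):
    """Score odd- vs even-numbered items per examinee, correlate the halves,
    then step up to full length (Spearman-Brown: 2r / (1 + r))."""
    odd  = [sum(row[0::2]) for row in item_matrix]  # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# hypothetical pass(1)/fail(0) responses: five examinees, six items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
print(round(split_half_reliability(responses), 2))   # about 0.93
```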

40
Q
Cronbach's Formula for Coefficient Alpha and the Kuder-Richardson Formula 20
A

Measure the uniformity or homogeneity of items throughout the test.

41
Q

Cronbach's coefficient alpha

A

A general reliability coefficient that can be used with different scoring systems; it is based on the variance of the test scores and the variance of the item scores. The coefficient reflects the extent to which items measure the same characteristic.

42
Q

The Kuder Richardson Formula 20 Coefficient

A

A special case of coefficient alpha, it is useful for tests whose items are scored pass/fail. It is obtained by calculating the proportion of people who pass and fail each item and the variance of the test scores.
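A sketch of coefficient alpha on made-up pass/fail data. Because the items here are scored 0/1, the result coincides with KR-20; the specific alpha formula shown (k/(k-1) times one minus the ratio of summed item variances to total-score variance) is the standard textbook form, not quoted from the card:

```python
def cronbach_alpha(item_matrix):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).
    Rows are examinees, columns are items; population variances are used."""
    def pvar(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    k = len(item_matrix[0])
    item_vars = [pvar([row[i] for row in item_matrix]) for i in range(k)]
    total_var = pvar([sum(row) for row in item_matrix])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# hypothetical pass(1)/fail(0) responses: five examinees, six items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
print(round(cronbach_alpha(responses), 2))   # about 0.84
```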

43
Q

Factors Affecting Reliability

A
  1. Test length: the more items there are on a test, the greater the reliability.
  2. Test-Retest Interval: the smaller the time interval between two tests, the smaller the chance of change and the higher the reliability.
  3. Variability of Scores: the greater the variance of scores on a test, the higher the reliability estimate is likely to be. Small changes in performance have a greater impact on the reliability of a test when the range or spread of scores is narrow than when it is wide.
  4. Guessing: the less guessing that occurs on a test, the higher the reliability is likely to be.
  5. Variation within the test situation: the fewer the variations in the test situation, the higher the reliability is likely to be. e.g., misleading instructions, scoring errors, illness, and daydreaming create error.
44
Q

Reliability of an Individual Examinee's Test Score

A

Unreliable results can occur if an examinee is uncooperative or anxious or has difficulty following instructions or if examiners are incompetent.

45
Q

Standard Error of Measurement

A

Because of the presence of measurement error associated with test unreliability, there is always some uncertainty about an individual's true score. The Standard Error of Measurement is an estimate of the amount of error usually attached to an examinee's score. It is directly related to reliability.

The larger the standard error of measurement, the lower the reliability.

Large standard errors of measurement mean less precise measurements and larger confidence intervals.

46
Q

Standard Error of Measurement

Formula

A

Is the standard deviation of the distribution of error scores. It can be computed from the reliability coefficient of the test by multiplying the standard deviation of the test by the square root of 1 minus the reliability coefficient.
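In symbols, the verbal rule above (SD of the test times the square root of 1 minus the reliability coefficient) reads:

```latex
SEM = SD \sqrt{1 - r_{xx}}
```

where r_xx denotes the test's reliability coefficient (notation assumed; the original card gives the rule only in words). A perfectly reliable test (r_xx = 1.00) has SEM = 0.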

47
Q

Confidence Intervals for Obtained Scores

A

A band or range of scores that has a high probability of including the examinee's true score.

The standard error of measurement provides the basis for forming the confidence interval.

The interval may be large or small depending on the degree of confidence desired.

Levels of Confidence
68%
95%
99%

A 95% confidence interval can be thought of as the range in which a person's true score will be found 95% of the time. The chances are only 5 in 100 that a person's true score lies outside this confidence interval.

48
Q

Confidence Level Formula

A

Obtained Score + or - Z(SEM)

Z score in Table
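A sketch combining the SEM and the interval formula above; the obtained score of 110, SD of 15, and reliability of .91 are invented for illustration, and z = 1.96 is the tabled value for the 95% level:

```python
from math import sqrt

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

def confidence_interval(obtained, sd, reliability, z=1.96):
    """Obtained score +/- z * SEM; z = 1.96 gives the 95% level."""
    e = z * sem(sd, reliability)
    return obtained - e, obtained + e

# hypothetical obtained IQ of 110 on a test with SD 15 and reliability .91
lo, hi = confidence_interval(110, 15, 0.91)
print(round(lo, 1), round(hi, 1))   # SEM = 15 * 0.3 = 4.5, so 110 +/- 8.82
```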

49
Q

Confidence Intervals for Predicted Scores

A

Y Predicted + or - Z(SEst)

50
Q

Validity

A

The validity of a test refers to the extent to which a test measures what it is supposed to measure, and therefore the appropriateness with which inferences can be made on the basis of the test results.

Unless the test is valid for the purpose for which it is being used, the results cannot be used with any degree of confidence.

Validity is not all or nothing, but rather a matter of degree.

Keep in mind the social consequences of using tests.

51
Q

Content Validity

A

refers to whether the items on a test are representative of the domain that the test purports to measure.

One must consider the appropriateness of the type of items, the completeness of the item sample, and the way in which the items assess the content of the domain involved.

Are the test questions appropriate and does the test measure the domain of interest?

Does the test contain enough info to cover appropriately what it is supposed to measure?

What is the level of mastery at which the content is being assessed?

52
Q

Face Validity:

A

Refers to what the test appears to measure, not what it actually does measure.

It is important to examinees in that if the test does not appear to measure what it purports to measure, they may become skeptical, and the results may then not accurately reflect their abilities.

53
Q

Criterion-Related Validity:

A

Refers to the relationship between test scores and some type of criterion or outcome, such as ratings, classifications, or other test scores.

The criterion, like the test, must possess adequate psychometric properties; it should be measurable, free from bias, and relevant to the purposes of the test.

A complementary relationship between test and criterion is necessary; otherwise it is impossible to use the criterion to determine whether the test measures the trait or characteristic it was designed to measure.

54
Q

2 Types of Criterion Related Validity are:

A
  1. Concurrent and

2. Predictive

55
Q

Concurrent Validity:

A

Refers to whether test scores are related to some currently available criterion measure.

e.g., suppose we find that test scores correlate with a teacher's assessment of the children's knowledge of math; the test is then said to have concurrent validity.

56
Q

Predictive Validity:

A

Refers to the correlation between test scores and performance on a relevant criterion where there is a time interval between the test administration and performance on the criterion.

The score obtained on the test is an accurate predictor of future performance.

e.g., a school-readiness test given in preschool predicts future success in school.

Predictive validity is established by giving a test to a group of people that has yet to perform on the criterion of interest. The group’s performance is subsequently measured. The correspondence between two scores provides a measure of the predictive validity of the test. If the test possesses high predictive validity, persons scoring high on the test will perform well on the criterion measure.

Those scoring low on the test will perform poorly on the criterion.

If the predictive validity of a test is low there will be an erratic and unpredictable relationship between the test scores and subsequent performance on the criterion.

57
Q

Construct Validity

A

Refers to the extent to which a test measures a psychological construct or trait. Various procedures are used to determine how the items in a test relate to the theoretical construct that the test purports to measure.

The construct validity of an intelligence test can be evaluated by examining how the items relate to a theory of intelligence.

Factor analysis also permits an examination of the construct validity of a test.

58
Q

Factors Affecting Validity

A

Validity coefficients are affected by the range of talent being measured and the length of the interval between the administration of the measures.

Predictive validity of the IQ can be impaired in the following ways:

Test-taking skills, anxiety, motivation, speed, understanding of instructions, degree of item or format novelty, examiner-examinee rapport, physical handicaps, degree of bilingualism, deficiencies in educational opportunities, unfamiliarity with the test material, and deviation in other ways from the norm of the standardization group.

Factors relating to the criterion may also affect validity. School grades are affected by motivation, classroom behaviour, personal appearance, and study habits. If examinees have problems in any of these areas, the predictive validity of intelligence tests may be lowered.

Intervening events and contingencies may also affect predictive validity, especially in testing handicapped children. If therapeutic intervention can increase validity, it should be used.

Finally, validity is affected by the test's reliability. Reliability is a necessary but not sufficient condition for validity. In some cases it may be difficult to determine the validity of test results on the basis of only one test session.

59
Q

Factor Analysis

A

Factor Analysis is a mathematical procedure used to analyze the intercorrelations of a group of tests that have been administered to a large # of individuals.

Because complex computations are involved, factor analysis is almost always done with a computer.

It is based on the assumption that the intercorrelations can be accounted for by some underlying set of unobservable factors that are fewer in # than the tests themselves.

Factor analysis might be used to determine the # of different mental abilities that account for the pattern of intercorrelations among the tests in the battery.

A major purpose of factor analysis is to simplify the description of behaviour by reducing the # of variables to the smallest possible #.

The findings from factor analysis tell you the extent to which varying #’s of factors account for the correlation among tests.

A factor is defined as that which a cluster of interrelated tests have in common.

The results also indicate the extent to which each test loads on, or is correlated with, one or more factors.

Factor loadings are simply correlations between factors and tests. The loadings indicate the weight of each factor in determining performance on each test.

60
Q

Procedures used in Factor Analysis

A

Most Factor Analysis programs work by extracting first the factor that accounts for the largest proportion of variance, then the factor that accounts for the largest proportion of the residual variance and so on.

The first unrotated factor is a general factor on which most variables have high loadings. A general factor is found in cases where all subtests have a considerable amount of overlap, such as in an intelligence test.

In intelligence testing, the first general factor is considered to reflect general intelligence; alternatively, there may be two or three important factors but no single factor on which all variables load.

Most researchers rotate the matrix of factor loadings to make the factor structure clearer. The rotation rearranges the factors so that ideally for every factor there are some tests with high loadings on the factor.

The order in which the factors originally were extracted is not always preserved in the rotation. In particular the first unrotated factor usually cannot be discerned. The factors resulting from rotation are referred to as group factors. It is up to the researcher to name or interpret each factor by looking at the content of the tests that have high loadings on the factor.

After all of the common factor variance has been extracted and the rotation has been completed, there still may be a significant amount of variance that has not been analyzed. This variance, present in one test but not in the other tests under study, may be termed specific factor variance or specificity.