Midterm 1 Flashcards

1
Q

What is a test?

A

A standardized process or device that yields information about a sample of behavior or cognitive processes in a quantified manner.

  • We never observe a cognitive process directly; we only assess external behaviors that indicate that process.
  • This can be tricky because we are taking a set of human traits and expressing them in a quantified form so that we can use statistical analysis.
  • This rests on a large assumption: a reductionistic stance on what it is to be human. We assume that if we have enough parts and put them together, we know what it is to be human.
2
Q

What is the difference between validity and reliability? Why are these important concepts in psychological measurement?

A
  • Validity- How well our measurement taps into the construct (whether we are measuring what we think we are measuring).
  • Reliability- Consistency in measurement over time (whether your observation of the behavior gives you the same picture of the behavior at different points in time).
  • These are important because we need both for a strong measurement: we need to make sure we are actually measuring what we think we are measuring and that it’s consistent over time.
3
Q

What are the four critical assumptions of testing and what do they mean?

A
  1. DEVIATION: People differ in important traits.
  2. We can QUANTIFY these traits: This requires an operational definition, some standard by which to judge a behavior. As a field we have not measured happiness, motivation, attitude, etc. directly; we measure the behaviors that underlie the psychological construct.
  3. The traits are reasonably STABLE: People are variable and so are emotions. Naturalistic observation matters because we act differently in a lab.
  4. MEASURES of the traits relate to actual behavior: We do not see the construct, but we use correlated behaviors to quantify the traits. Remember: Dairy Queen Strawberry Milkshake
4
Q

What are societal concerns of psychological assessment?

A
  • Extent to which tests invade privacy
  • Fair use of a test
  • Justice: Impact of testing on society. We have a responsibility to know that research is for something ethical.
  • We have an ethical code- way to help you as a professional (ethics is a mindset)
  • We have not grown into a view of a holistic person because we are stuck in a behavioristic standpoint.
  • We typically look at the average and look at the individual scores
5
Q

How do ethical standards and legal standards differ? Which are we held to as professional psychologists?

A
  • Ethical standards are what one should or should not do according to the principles and norms of professional conduct.
  • The law is what one must or must not do according to legal dictates.
  • Examples: CEO hiring to weed out people with mental illness, interrogation – legal but not ethical, reporting client abuse.
6
Q

Idea of ethical themes and the need for competence

A

• We are held to both, and we have a strong need for high standards of ethics in administering and using tests.

Our field is based on reputation.

We need to use ethics in choosing, administering, interpreting, and communicating test results.

We need to utilize tests responsibly, and we should develop competence in assessment concepts and methodology

Competence:

  • An understanding of norms, reliability, validity, and test construction
  • Knowledge of specific procedures applicable to a particular test (administration, scoring, etc.)
  • The psychologist is responsible for continually updating his or her knowledge and assessment skills
  • Recognizing the boundaries of competence
7
Q

What are ethical responsibilities of the test developer?

A
  • The test developer should define clearly what the test measures and who it applies to.
  • Accurately present the characteristics and limitations of the test.
  • Review test items for insensitive content and language.
8
Q

Ethical responsibilities of test administration

A
  • Select tests only after they have conducted a thorough review of all tests available
  • Have and maintain a thorough knowledge of all test materials (including the manual)
  • Avoid using the test for purposes that are not recommended by the test developers
  • Provide the test-taker with information about their rights
  • Inform the test-taker how long scores obtained will be kept on file and to whom they can (and will) be released
  • Explain the results of the test in language the test-taker can understand
9
Q

APA test taker rights

A
  • Be treated with courtesy, respect, and impartiality, regardless of age, disability, ethnicity, gender, national origin, religion, sexual orientation or other personal characteristics
  • Be tested with measures that meet professional standards and that are appropriate, given the manner in which the test results will be used
  • Know, in advance of testing, when the test will be administered, if and when test results will be available, and if there is a fee for testing services

Also, the test taker has the right to…

  • Have the test administered and the test results interpreted by appropriately trained individuals who follow professional codes of ethics
  • Receive a brief oral or written explanation (prior to testing) about:
    • The purpose(s) for the testing
    • The kind(s) of tests to be used
    • If the results will be reported to them or to others
    • The planned use(s) of the results
10
Q

Where can you find information on ethical principles and professional guidelines, and information about specific tests?

A
  • The Ethical Principles of Psychologists and Code of Conduct
  • The Standards for Educational and Psychological Testing
  • About tests: Mental Measurements Yearbook, Tests in Print, Test Critiques, Google Scholar
11
Q

What is measurement?

A
  • Quantification
  • The process of assigning numbers to persons in such a way that some attributes of the persons being measured are faithfully reflected by some properties of the numbers.

(We want to maintain a faithful representation of what we are studying. )

12
Q

What are the scales of measurement?

A

Nominal Scales – numbers take on the meaning of a verbal label, but don’t signify any particular amount of a trait

  • Least useful; pretty much just a label that is attached to an individual. Ex: football jerseys. We don’t analyze quantities.

Ordinal Scales – numbers denote order or ranking, but not amount of a trait, and there is no consistent difference between numbers

  • Rank ordering of information. The distances between the numbers are not equal (the distance from 2 to 3 is not the same as from 3 to 4). Ex: runner 1: 10 min, runner 2: 15 min, runner 3: 2 days

Interval Scales – numerical differences in scores represent equal differences in the trait being measured

  • Ex: on a scale of 1-100, how happy are you? There is an identifiable and known difference between points on the scale.
  • A Likert scale is not a true interval scale. It is interval-ish; we treat it as interval because we don’t have the knowledge to make it more precise.
  • An interval scale may have a 0, but the 0 does not mean the absence of the trait; it is just a point on the scale (ex: temperature).

Ratio Scales – have a true zero point, with zero = total absence of the trait being measured, AND allow proportional statements, with twice the score = twice the attribute

  • Values can go negative as long as zero is a true zero and the negative side uses the same intervals as the positive side (ex: money and debt).

The scale you use is important because it determines which conclusions you can draw.

13
Q

Which of the scales of measurement meets the minimum criteria for statistical measurement?

A

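•Interval is the minimum level of measurement assumed by most of the statistics we use (means, standard deviations, correlations). We usually cannot claim ratio level because you can’t have a true absence of an emotion.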

14
Q

What are the different measures of central tendency?

A

A Singular representation of a lot of data

Mode: most frequently occurring number

Median: the literal middle score, the number that separates the top half from the bottom half of a score distribution, the 50th percentile

Mean: the arithmetical average score, calculated by summing the scores, then dividing by the total number of scores. This is the most common!

    +: It takes into account every point of data

    -: It is susceptible to outliers
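
The three measures above can be sketched in Python with the standard-library `statistics` module. The scores are made-up illustration data, and `central_tendency` is a hypothetical helper name; the point is only to show how one outlier pulls the mean while leaving the mode alone and barely moving the median.

```python
from statistics import mean, median, mode

# Hypothetical test scores (illustration only).
scores = [70, 75, 75, 80, 85]
# Same scores plus one deliberate outlier.
scores_with_outlier = scores + [200]


def central_tendency(values):
    """Return (mode, median, mean) for a list of scores."""
    return mode(values), median(values), mean(values)


print(central_tendency(scores))               # mode, median, mean
print(central_tendency(scores_with_outlier))  # the mean shifts the most
```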
15
Q

What is the normal curve and why is it important?

A
  • Theoretical distribution of human traits in nature
  • Also called the normal distribution or bell curve
  • Mean, median, and mode are the same value in a normal distribution
  • Same proportion of scores can always be found within the same standard deviation limits.
  • It is a BIG assumption that data has this curve.
  • Normal curve is the perfect standard to which we will compare other information and data we have. We would have no statistical analysis if we did not have the concept of what is normal.
  • Science does not prove anything; we have hypotheses, but we never have definitive proof of anything.
  • The more non-normal your distribution is the less you should trust it
  • This curve assumes that within the population there is a normal distribution.
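
As a rough check on the "same proportion within the same standard deviation limits" point, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. `proportion_within` is a hypothetical helper name, and the sketch assumes a perfectly normal distribution, which real data only approximate.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return curve.cdf(k) - curve.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

Because the curve is defined in standard deviation units, these proportions (roughly 68%, 95%, and 99.7%) are the same for any normal distribution, whatever its mean and standard deviation.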
16
Q

What is variability and how is it calculated?

A

Variability- reflects the extent to which individuals differ and how far our points are from the mean.

Variance is the core of statistics. We need variance, it reflects the extent to which individuals differ. Ex: taco bell.

Calculate variance: sum of squared deviations from the mean / number of scores in the sample

17
Q

What is the standard deviation and how is it calculated?

A

Standard deviation: roughly the average distance of scores from the mean.

Calculate: Square root of the variance
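
A minimal sketch of both calculations, following the formulas on these cards (sum of squared deviations divided by the number of scores, then the square root). The scores are made-up, and the function names are hypothetical. Note that many textbooks divide by n − 1 instead of n when estimating a population value from a sample; this sketch uses n, as the card does.

```python
from math import sqrt


def variance(values):
    """Sum of squared deviations from the mean, divided by the number of scores."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))


# Hypothetical scores (illustration only); their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))            # 4.0
print(standard_deviation(scores))  # 2.0
```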

18
Q

What benefits are derived from use of a z-score? How do you calculate it?

A
  • We use a z-score to put scores from different distributions on a common scale. This allows us to compare across different scales and place them on the same curve. Apples and oranges. (Standardizing rescales the scores; it does not change the shape of the distribution.)
  • Provides insight into whether a test score is high, medium, or low. Interpreted in standard deviation units. z = (an individual’s score − the mean) / the standard deviation.
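
A small sketch of the standard z-score formula, z = (score − mean) / standard deviation, used to compare scores from two different scales. The two tests and all of their means and standard deviations are invented for illustration, and `z_score` is a hypothetical helper name.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd


# Hypothetical Test A: mean 70, SD 10. A person scores 85.
z_a = z_score(85, 70, 10)
# Hypothetical Test B: mean 500, SD 100. The same person scores 600.
z_b = z_score(600, 500, 100)

print(z_a, z_b)  # 1.5 1.0
```

Although 600 is the bigger raw number, the person did relatively better on Test A (1.5 standard deviations above the mean versus 1.0).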
19
Q

What is correlation?

A
  • Reflects the degree to which a score on one measure or variable is associated with a score on another measure or variable.
  • Range from +1 to -1, with numbers closer to +/-1 indicating a strong relationship and numbers closer to zero indicating a weak relationship
  • Relationship can be positive, indicating scores vary in the same direction, or negative, indicating scores vary in opposite directions
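
One common way to compute a correlation is the Pearson coefficient; the card does not name a specific coefficient, so treat that choice as an assumption. The sketch below uses made-up data and a hypothetical `pearson_r` helper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # How much the two variables vary together.
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # How much each variable varies on its own.
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_variation / (spread_x * spread_y)


# Hypothetical data: hours studied, exam score, and errors made.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]   # rises with hours: positive relationship
errors = [10, 8, 6, 4, 2]  # falls with hours: negative relationship

print(round(pearson_r(hours, score), 2))
print(round(pearson_r(hours, errors), 2))
```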
20
Q

Why do we say that we cannot assume causation from correlational data?

A

We cannot assume causation because:

  • We do not know if there is another variable driving the correlation
  • We do not know which caused which, because we did not observe which variable came first.
  • We can make predictions with regression, but we cannot make causal claims.
21
Q

When can we assume causation?

A

• We can assume causation only when the methodology supports it: we need experimental manipulation to infer causation.

22
Q

When can we predict using correlation?

A

If you have a strong correlation: in a positive correlation, as one variable increases we see a corresponding increase in the other; in a negative correlation, as one increases the other decreases.

If you have a moderately strong correlation in either direction, simply knowing the value of one variable will give you an idea of what the other variable is.

23
Q

When can we assume causation?

A

We can assume causation only when the methodology supports it. We need experimental manipulation to infer causation, and three conditions must hold:

  1. The cause must precede the effect in time.
  2. The cause and effect must co-vary
  3. There must be no other plausible explanations for the effect other than the presumed cause.
24
Q

What benefit is derived from regression over a bivariate correlation?

A

•Regression allows us to look at the relative contribution of one or more predictor variables to a specified outcome variable.

When we do regression modeling, we see whether each variable is significant in accounting for any of the variance.

It has built-in z-scores and shows how important each variable is and how much weight it carries. (ex: grad school) A bivariate correlation is weaker because, with only two variables, there is less ability to explain variance, so there is more error.

Linear regression – Ferrari of correlations.

25
Q

For what is factor analysis used?

A

Factor analysis is used to identify the underlying variables (or factors) that account for the correlations between test scores.

26
Q

What is the difference between exploratory and confirmatory factor analysis?

A

Exploratory: initial exploration of what the factors are

Confirmatory: Once you have the factors, determining if the factor structure holds up with the new factor. You want to confirm which factors are involved, e.g., to see if kitten lovin is similar to puppy lovin.

Exploratory/Confirmatory: Same mathematically and statistically. The difference is the PURPOSE.

27
Q

What are the six steps involved in test construction?

A
  1. Define the Test’s Purpose: Statement of Purpose
  2. Preliminary Design Issues
  3. Item Preparation
    * Stimulus, Response, Conditions governing the responses, Scoring Procedures
  4. Item Analysis
    * Item difficulty, Distractor analysis, Discrimination
  5. Standardization and Ancillary Research (Norming)
  6. Final Materials and Publication

28
Q

What information should be included in a clear statement of purpose for a test? Why is this important?

A

A statement of purpose should include:

  • Traits to be measured- Have an understanding of the construct so you know what you are dealing with (Is it uni-dimensional? If multi-dimensional, are you going to deal with all aspects or one?)
  • Target audience. Suit off the rack example: Trade off between efficiency and accuracy.
29
Q

What are the basic preliminary design issues? What do they mean?

A

• Mode of administration

  • Is it going to be pen and paper? Survey? Computer? Individually? Group system?
  • What would guide this choice is where it will be used and who will be using it (ex: for therapists we probably won’t want a group measure).
  • Pen and paper vs. computer: depends on the construct and its stability. If the construct is stable, the mode won’t matter as much and the test can be given either way. If it is not stable, the environment may end up driving responses rather than what we are actually looking for.

• Length

  • Longer tests tend to be more reliable and they give you a better overview of the construct itself.
  • Questions may evoke a different response, must examine construct.
  • If complex, more questions are needed.

• Item format

  • Selected response: Likert scale, true/false, multiple choice.
  • Or open-ended, with a constructed response. If not a lot is known about the construct, then open-ended is good for exploratory work.

• Number of scores

  • If a singular construct, then you will get a single score. If it has multiple factors you will have a set of subscores.

• Administrator training

  • Format will dictate the training needed.

• Background research:

  • Will inform all of the other categories.
30
Q

What is the tradeoff between efficiency and reliability in terms of length of the test?

A
  • A test with a lot of questions will be very reliable because we can get at every aspect we are looking for.
  • However, this is not efficient, and testing fatigue may degrade the quality of the data.
  • We need to find the “sweet spot”.
31
Q

What are the four parts of a given test item?

A

Stimulus

  • The item itself, elicits response that is hopefully correlated with behavior

Response

  • Behavior correlated to psychological construct

Conditions governing responses

  • Time limit, immediate response or time for contemplation

Scoring procedures

  • Need to have development of consistent scoring rubric that will determine contributory value of each item.
32
Q

Why is consistency across administrations of a test important?

A
  • So that it is reliable and valid.
  • Making sure that participants are given the same testing conditions and that they are responding to the same stimuli across testing sessions.
  • We want them to be responding to the stimulus alone, not the administrator. (ex: dr. coat vs. motley crue shirt)
  • Reduce measurement error
33
Q

What are the differences between selected-response items and constructed-response items?

A

Selected response- Participants choose a response out of the ones you have given

Constructed response- Give participants the stimulus and they will respond

34
Q

What are the benefits of selected-response?

A
  • Easy! Faster!
  • Scoring reliability
  • Temporal Efficiency
  • Scoring Efficiency
35
Q

What are the benefits of constructed-response?

A

Behavioral observation

Exploring/expanding of responses

Development of study habits (has nothing to do with test construction)

36
Q

In general, how large should your total pool of test item questions be for use in the item analysis procedure?

A

•2-3 times the amount you want in the measure

37
Q

What are some of the additional item review steps that many go through before undergoing a formal item analysis? Why are these important?

A
  • Be sure to try out enough items: Generally two to three times the number needed for the final test.
  • Do a simple, informal tryout (like a pilot investigation)
  • These are important so that we do not waste time and resources testing items that do not make sense or that have small flaws that could have been addressed earlier.
38
Q

What types of information are provided by a statistical item analysis?

A
  • Item difficulty: The percentage of students who took the test who answered the item correctly.
    • Difficulty (p) = number of people correct / total number taking the test
  • Item discrimination: Rests on the assumption that a single item and the test as a whole measure the same thing.
    • To determine it we use the item discrimination index (D) or discrimination coefficients
  • Distractor analysis: Useful when you have multiple-choice test items
39
Q

What does item difficulty reflect?

A
  • (P VALUE)
  • The percentage of students who took the test and answered the item correctly
  • Calculate individually, not as a whole.
40
Q

How is item difficulty (p) calculated?

A

people correct / Total taking test
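
The calculation above, sketched for a small made-up response matrix (1 = correct, 0 = incorrect). `item_difficulty` and the data are hypothetical; the point is that p is computed item by item, not for the test as a whole.

```python
def item_difficulty(item_responses):
    """p = number of people correct / total number taking the test."""
    return sum(item_responses) / len(item_responses)


# Hypothetical data: each row is one test-taker, each column is one item.
answers = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]

# One p value per item (column).
p_values = [item_difficulty([row[i] for row in answers]) for i in range(3)]
print(p_values)  # [1.0, 0.5, 0.25]
```

Here item 1 is too easy (everyone got it right, so it produces no variance), while item 2 splits the group evenly.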

41
Q

Why do I say that a p-value is basically a behavioral measure?

A
  • Only as good as the strength of the relationship that exists between the behavior we are assessing and its underlying psychological construct.
  • It is all relative. Based off characteristics of the sample.
42
Q

Why is a restriction on score variability a source of concern?

A
  • It is a concern because if an item is too easy or too hard there is no variance, AND VARIANCE IS EVERYTHING
43
Q

What is the value of an item (and I guess a test as a whole) that has high discrimination?

A
  • Assumption that a single item and test measure are the same thing.
  • The item is good because you can determine variance of the item
  • Assume that if a person does well on the test they can do well on one item.
44
Q

Why do we generally focus only on the upper and lower 27% of test-takers when calculating D?

A
  • When we have a large range we are looking at the extremes
  • Looks at a greater range
45
Q

Have a general idea what the ranges of D-scores mean.

A
  • .40 and greater - good
  • .30 to .39 - reasonably good
  • .20 to .29 - marginal
  • .19 and below - poor
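
The cards do not spell out the formula for D. A commonly used definition, assumed here, is the proportion of the upper group answering the item correctly minus the proportion of the lower group answering it correctly, with the groups taken as the top and bottom 27% by total score. The data and the `discrimination_index` helper are hypothetical.

```python
def discrimination_index(total_scores, item_correct, fraction=0.27):
    """D = p(upper group) - p(lower group) for one item.

    total_scores: total test score per test-taker
    item_correct: 1/0 per test-taker for the item being analyzed
    fraction: share of test-takers in each extreme group (assumed 27%)
    """
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Indices of test-takers ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:group_size], order[-group_size:]
    p_upper = sum(item_correct[i] for i in upper) / group_size
    p_lower = sum(item_correct[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical data for 10 test-takers.
totals = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
item = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

d = discrimination_index(totals, item)
print(round(d, 2))
```

With 10 test-takers each extreme group has 3 people: all 3 high scorers got the item right and 1 of the 3 low scorers did, so D falls in the "good" range above.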
46
Q

What are distractors?

A
  • We obtain a discrimination index for each option to determine the usefulness of each distractor
  • Anything but the correct response
  • D for a distractor should be low and preferably negative
  • Be cautious of large D values
47
Q

Why is an analysis of distractors important?

A
  • Analysis of distractors tells how much incorrect responses distracted test takers from the true correct response.
  • If distractors are too distracting we are not getting a true idea of the level of the participants.
48
Q

What are the distractor guidelines?

A
  • Create distractors that are equally plausible
  • Make all the alternatives parallel in length and grammatical structure
  • Keep the alternatives short

  • Don’t write distractors that mean the same thing
  • Alternate the position of the correct answer within the distractors
  • Use the alternatives “all the above” and “none of the above” as little as possible
  • Make sure each alternative agrees with the stem grammar.
49
Q

Why is it important to have normative data on your given measure?

A
  • It establishes a standard against which your scores can be compared
50
Q

What two steps are involved in the norming of a test?

A
  • Define the target population
    • Should have been noted in the statement of purpose
  • Select the sample
    • This is a norm sample that should match the target population
    • Select samples from that norm sample
    • You want that norm sample to be as large as possible
    • Want the standard to talk to the whole population
    • The larger the norm sample, the larger your sample population is going to be
    • You don’t need a large sample if the sample is truly similar to the target population.
51
Q

Why is a clear definition of your target population important? Why is the selection of a relevant sample from that population important?

A

• If you specify your target population you’ll be able to get a sample from that.

52
Q

What is the basic difference between probability and nonprobability forms of sampling?

A
  • Probability
    • Equal likelihood of a person being selected. Better form of sampling, known as the gold standard.
  • Nonprobability
    • Not every person has a known chance of being selected for the study, because you don’t have access to the whole population. We typically engage in this.
53
Q

What are the different forms of probability sampling? What are some limitations of these?

A
  • Random Sampling
    • The gold standard
    • Form of sampling in which all members of the population have an equal and known likelihood of being selected for the sample.
    • If you are dealing with extreme populations it is impossible, because we don’t always have access to an entire population where each member has an equal and known chance of being selected
  • Systematic Sampling
    • Form of probability sampling
    • Generally done when we have contact information and equal availability of all members of the population.
    • This will be done in small, isolated populations.
  • Stratified Sampling
    • Once you find out the population strata, you determine their relative sizes and then sample to match. This picks up on subgroups in the population that random sampling does not always pick up.
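
The three approaches can be sketched with the standard-library `random` module. Everything here is illustrative: the population is just a list of ID numbers, the function names are hypothetical, and systematic sampling is shown in its usual textbook form (a random start, then every k-th member), which the card does not spell out.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Random starting point, then every k-th member of the list."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]


def stratified_sample(strata, n, rng):
    """Sample each stratum in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))
# Hypothetical strata: 60% of the population in group A, 40% in group B.
strata = {"A": population[:60], "B": population[60:]}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(strata, 10, rng))
```

The stratified sample always contains 6 members of A and 4 of B, whereas a simple random sample of 10 can over- or under-represent either group by chance.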
54
Q

Is there an optimal sampling method that cuts across all norming situations? Why or why not?

A
  • No sampling method is perfect.
  • This is why replication is so important!
  • The best thing you can do is to tell the story for the individuals that are giving you their data.
  • The purpose is to collect data and structure in such a way that you are being representative of the population.
55
Q

What type of information is contained within a standard technical manual for a test? Why is this information important?

A
  • Means to create standardization of administration
  • The testing situation itself is becoming a stimulus, we want it to be consistent
  • Can vary on length.
56
Q

Important considerations in test construction:

A
  • The original conceptualization is more important than the technical / statistic work
  • You need to spend substantial time studying the area before writing the items.
  • THIS IS THE BULK OF TIME
  • In the original design stage, you need to think about the final score reports
  • When preparing test items, aim for simplicity
  • Be sure to try out enough items: generally two to three times the number needed for the final test.
  • Do a simple, informal tryout before the major tryout.
  • From a statistical viewpoint, the standardization group need not be very large, if properly selected.
57
Q

What are the different forms of non-probability sampling? What are some limitations of these?

A
  • Nonrandom form of sampling in which not every member of the population has an equal or known likelihood of being selected for the sample
    • We tend to deal with populations that are fairly large, therefore we do not focus too much on probability sampling
  • Convenience Sampling
    • Individuals who are part of the sample because they are easy to obtain
    • Ease of Time and Money
    • Tradeoff between efficiency and accuracy (these tend to be inaccurate).
    • Run a high risk of being a non-representative sample
    • Good for an initial look at something, and that is about it
  • Judgment Sampling
    • The researcher makes an informed judgment and collects information from a number of different areas or different partitions of the population
  • Quota Sampling
    • Non-probability version of stratified. Instead of calling them strata they are referred to as quotas
      • determine relative size of grouping
      • use non-probability (convenience etc.) to fill in the rest
  • Snowball Sampling
    • Used when it is very difficult to access a population but you do have access to either a single or a small number of people in the population. Provide the survey etc to individuals and then they give you contact info for more people in the population, or they pass on your survey.
    • A lot of the time people think they are doing this when they are actually doing a convenience sample.