Research Design/Statistics/Tests and Measurements Flashcards
Wilhelm Wundt
Wundt brought several fields together to study psychology as a science. He believed that experimental psychology was very limited, since its methodology could not be used to study higher mental processes.
He also believed that there was no thought without a mental image.
Ebbinghaus
Demonstrated that higher mental processes could be studied using experimental methodology.
Oswald Külpe
Disagreed with Wundt and proposed that thought can occur without mental images (imageless thought). His later work supported this hypothesis.
Cattell
A student of Wundt, he brought mental testing to the USA.
Alfred Binet
Published the first intelligence test (the Binet-Simon Test) with Theodore Simon. The test was originally used to identify children in France with intellectual disabilities (ID) who would not benefit from ordinary schooling. Binet also introduced the concept of mental age.
William Stern
Developed an equation to compare mental age to chronological age, later known as an IQ score.
Lewis Terman
Revised the Binet-Simon test for use in the USA. The revision is known as the Stanford-Binet Intelligence Test.
Operational Definitions
How the researcher defines the variables in the experiment so that they are measurable.
IV and DV
Independent Variable: The variable whose effect is being studied
Dependent Variable: The variable expected to change due to variation in the IV.
Three Types of Research
True Experiments: When there is random assignment and a manipulation
Quasi-Experiments: When there is a manipulation, but no random assignment.
Correlational Studies: When no manipulations take place
Naturalistic Observation
When the researcher does not intervene and just measures behavior as it naturally occurs. Also called field studies.
Sample Selection Types
Random Selection: each member of the population being studied has an equal chance of being selected for the study.
Stratified Random Sampling: Technique of ensuring that each subgroup of the population is randomly sampled in proportion to its size.
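
A minimal Python sketch of proportionate stratified sampling, as described above. The function name, the `stratum_of` parameter, and the student population are hypothetical and only for illustration.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportionate stratified random sample.

    population  : list of members
    stratum_of  : function mapping a member to its subgroup label
    sample_size : total number of members to draw
    """
    rng = random.Random(seed)

    # Group the members by subgroup (stratum).
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)

    sample = []
    for members in strata.values():
        # Each stratum contributes in proportion to its share of the population.
        # Rounding can make the total differ slightly from sample_size
        # when the proportions do not come out to whole numbers.
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 freshmen and 40 seniors.
students = [("freshman", i) for i in range(60)] + [("senior", i) for i in range(40)]
picked = stratified_sample(students, lambda s: s[0], sample_size=10, seed=1)
# picked contains 6 freshmen and 4 seniors, each chosen at random within its subgroup.
```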
Research Study Designs
Between-Subjects Design: Each subject is exposed to only one level of each IV.
Matched Subjects Design: Subjects are paired on matching characteristics (such as demographic information), and the members of each pair are split between groups.
Within Subjects Design (aka repeated measures design): When each subject is exposed to multiple conditions, to separate the effects of individual differences from the effects of the IV.
Counterbalancing
When all subjects experience both levels, but the orders are changed to ensure that there are no order effects.
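
As a rough illustration, the orders can be generated and handed out in rotation. This is a sketch of full counterbalancing; the subject labels and condition names are made up.

```python
from itertools import permutations

def counterbalanced_orders(conditions):
    """Return every possible order of the conditions (full counterbalancing)."""
    return [list(order) for order in permutations(conditions)]

def assign_orders(subjects, conditions):
    """Cycle through the orders so each one is used about equally often."""
    orders = counterbalanced_orders(conditions)
    return {subject: orders[i % len(orders)] for i, subject in enumerate(subjects)}

schedule = assign_orders(["S1", "S2", "S3", "S4"], ["A", "B"])
# S1 and S3 get A then B; S2 and S4 get B then A.
```

With more than a few conditions the number of orders grows quickly, so this full version is mainly practical for two or three levels.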
Confounding Variables
Unintended IVs which may affect the DV in an unintended way.
Nonequivalent Group Design
When the control group is not similar to the experimental group due to non-random assignment.
Experimenter Bias
Because of their expectations, the experimenter might inadvertently treat groups of subjects differently.
Demand Characteristics
Any cue that might suggest to subjects what the researcher expects from them. (If subjects know what is expected, they may try to act accordingly.)
Hawthorne Effect
The tendency of people to behave differently if they know they are being observed.
Two Types of Statistics
Descriptive Statistics: concerned with organizing/quantifying/summarizing a collection of observations
Inferential Statistics: Concerned with making inference from the sample involved in the research to the population of interest.
How do Outliers Affect Central Tendency
Generally, outliers don’t affect the median and mode, but will drastically affect the mean.
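
A quick check with Python's standard `statistics` module shows the pattern. The scores are invented for the example.

```python
from statistics import mean, median, mode

scores = [1, 2, 3, 3, 4, 5]
with_outlier = scores + [100]   # add one extreme score

# Without the outlier, all three measures agree.
mean_before, median_before, mode_before = mean(scores), median(scores), mode(scores)
# mean 3, median 3, mode 3

# With the outlier, only the mean moves.
mean_after, median_after, mode_after = mean(with_outlier), median(with_outlier), mode(with_outlier)
# mean about 16.9, median 3, mode 3
```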
Normal Distribution Percentiles
About 68% within 1 SD of the mean
About 95% within 2 SD of the mean
About 5% beyond 2 SD
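
The rounded figures above can be checked against the normal curve itself. This sketch uses the error function from Python's `math` module; the helper name is just for illustration.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution that lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

within_1 = proportion_within(1)   # about 0.683
within_2 = proportion_within(2)   # about 0.954
beyond_2 = 1 - within_2           # about 0.046
```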
z-score
A score expressed in standard deviation units: subtract the mean of the distribution from your score, and divide the difference by the standard deviation.
0 is the mean; negative means below the mean, positive means above the mean. 1 means 1 SD above, -1 means 1 SD below, etc.
About 34% of scores fall between 0 and 1, and about 34% fall between 0 and -1.
50% on either side of 0.
T-scores
Standardized scores with a mean of 50 and an SD of 10.
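
A short sketch tying the last two cards together: a raw score is first turned into a z-score, and the z-score is then rescaled to a mean of 50 and SD of 10. The raw score, mean, and SD below are hypothetical.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z-score to a distribution with mean 50 and SD 10."""
    return 50 + 10 * z

z = z_score(raw=130, mean=100, sd=15)   # 2.0, i.e. 2 SDs above the mean
t = t_score(z)                          # 70.0
```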
Correlation Coefficients
Type of descriptive statistic that measures the extent to which two variables are related. It shows both the direction and the degree of association between the two variables. (Always between -1 and +1.)
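
One common correlation coefficient is Pearson's r. The card does not name a specific coefficient, so this choice is an assumption, and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]
r_positive = pearson_r(hours, [2, 4, 6, 8, 10])   # perfect positive relationship: 1.0
r_negative = pearson_r(hours, [10, 8, 6, 4, 2])   # perfect negative relationship: -1.0
```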
Factor Analysis
Attempts to account for interrelationships found among various variables by seeing how groups of variables are related
Inferential Statistics
Allows us to use smaller batches of observations to make conclusions regarding an entire population of interest.
Experimental Hypothesis
An experimental hypothesis is supported by rejecting the null hypothesis. Generally, the null hypothesis states that there is no effect or no difference, for example that the sample comes from a population with the same mean as the comparison population.
Alpha Level
The criterion of significance. Generally we use 5%, meaning that we reject the null hypothesis only when results this extreme would occur less than 5% of the time if the null hypothesis were true.
Errors in Significance Testing
Type I: False Positive. The chance of making this error is the same as the alpha level (usually 5%)
Type II: False Negative. Chance of making this error is represented by beta.
The Three Relevant Significance Tests
t-test: used to compare the means for two groups
ANOVA: used to compare the means of more than 2 groups
chi-square: tests the equality of two proportions/frequencies
F Ratio
Used in ANOVAs, it estimates how much group means differ from each other (done by comparing between group variance to within group variance).
An F Ratio of about 1 would mean there are no group differences.
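
A sketch of the F ratio for a one-way ANOVA, computed by hand from between-group and within-group variance. The three groups of scores are hypothetical.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance / within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)        # df between = k - 1
    ms_within = ss_within / (n_total - k)    # df within = N - k
    return ms_between / ms_within

f = f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
# Group means are 2, 3, and 6, so F comes out to about 13, well above 1.
```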
Chi-Square Tests
Work with categorical data (categories, not continuous, AKA nominal data). The test compares frequencies or proportions rather than means.
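
A minimal sketch of the chi-square statistic for observed versus expected frequencies. The counts are invented, and the critical value quoted in the comment (3.84 for 1 degree of freedom at the 5% level) is the standard table value rather than something from these cards.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 50 people choose between two options; the null expects an even split.
observed = [30, 20]
expected = [25, 25]
chi2 = chi_square(observed, expected)   # 2.0

# With 2 categories there is 1 degree of freedom; 2.0 is below 3.84,
# so this difference would not be significant at the 5% level.
```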
Two ways to Interpret Test Results
Norm-Referenced: Assessing an individual’s performance in terms of how they compare to others. The population of test takers needs to be large and representative, though.
Domain-Referenced: AKA criterion-referenced testing. Assesses the individual’s knowledge regarding specified content domains.
Reliability
Consistency with which a test measures something.
Standard Error of Measurement
Index of how much, on average, we expect a person’s observed score to vary from their true score (the score they are actually capable of receiving).
(ideally this would be zero, but tests aren’t perfect).
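
The card gives no formula, but the SEM is conventionally computed from the test's standard deviation and its reliability coefficient. The sketch below assumes that standard formula; the SD and reliability values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), the conventional formula (assumed here)."""
    return sd * math.sqrt(1 - reliability)

sem = standard_error_of_measurement(sd=15, reliability=0.91)        # about 4.5
perfect = standard_error_of_measurement(sd=15, reliability=1.0)     # 0.0 for a perfectly reliable test
```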
Three (basic) Methods to Test Reliability
test-retest: when the same test is administered to the same group twice
alternate-form: two different forms of the test are given to the group at two separate times
split-half: test takers take only one test, and the test's items are split into two halves (for example, odd versus even items); scores on the two halves are then correlated.
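
In the usual form of the split-half method it is the test's items, not the test takers, that are divided. The sketch below splits the items odd versus even, correlates the two half-scores, and then applies the Spearman-Brown correction. The correction is a standard step that these cards do not mention, and the response data are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per test taker."""
    # Total score on the 1st, 3rd, ... items and on the 2nd, 4th, ... items.
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown correction: estimate reliability of the full-length test.
    return 2 * r_half / (1 + r_half)

# Hypothetical answers from 4 test takers on a 4-item test (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = split_half_reliability(responses)   # about 0.9
```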
Types of Validity
- Content: test’s coverage of the skill area it’s supposed to measure
- Face: if the test appears to measure what it’s supposed to measure
- Criterion: how well the test can predict an individual’s performance on an established test of the same skill
- Construct: how well performance on the test fits the theoretical framework related to what you want to measure
Construct Validity
How well performance on the test fits the theoretical framework related to what you want to measure
- convergent: if people score high on this test, will they score high on tests of related skills?
- discriminant: if people score high on this test, are their scores uncorrelated with variables that should not be related to it?
Criterion Validity
How well the test can predict an individual’s performance on an established test of the same skill
- Predictive: can it predict future performance
- concurrent: can it predict performance on a different, established measure taken at about the same time
Reliability + Validity
A test with zero reliability will have zero validity.
HOWEVER, a test can have 100% reliability, and zero validity.
Four Types of Measurement Scales
Nominal: (AKA categorical): labels observations
Ordinal: observations are ranked in terms of magnitude
Interval: uses numbers with equal units between them, but no true zero point (e.g. temperature in degrees Celsius)
Ratio: when there is a true zero point that indicates absence of the quantity
Two Types of Ability Tests
Aptitude: used to predict what one can accomplish through training
Achievement: used to assess what one can do/knows now.
Personality Inventory
Type of personality test where individuals rate how applicable 100-500 statements are to themselves.
While fairly accurate, there are some biases which affect the accuracy due to self-reporting.
Minnesota Multiphasic Personality Inventory (MMPI)
Consists of 550 statements to which subjects respond true, false, or cannot say. It yields scores on 10 clinical scales (e.g. depression, schizophrenia, etc.). Also has validity scales to indicate whether the person is distorting their responses, whether intentionally (as in malingering) or unintentionally.
Content scales were later added, which were formed from theoretically derived questions.
Created with the empirical criterion-keying approach.
Empirical Criterion-Keying Approach
Testing thousands of questions and retaining those that differentiate between patient and non-patient populations (even if the item seemed random).
California Psychological Inventory (CPI)
Based on the MMPI, it is a personality inventory for typical populations ages 13+. Especially oriented to HS and college students. Has 20 scales (3 validity) and measures personality traits (e.g. dominance, sociability, femininity, etc.).
Projective Tests
Stimuli are ambiguous and the test taker is not limited to a small number of responses. Generally, people are presented with a vague stimulus and asked to interpret what they see. Scoring is subjective.
Rorschach Inkblot Test
Most famous projective test; it’s made up of ten cards showing inkblots. The clinician interprets the remarks the person makes about each inkblot.
Thematic Apperception Test (TAT)
Created by Morgan and Murray, consists of 20 pictures depicting ambiguous scenes. Test taker is told to tell a story about what is happening in the picture. Scoring is completely qualitative and subjective.
Blacky Pictures
Projective test, devised specifically for children.
Features 12 pictures of “Blacky” the dog. Each picture represents a different stage in psychosexual development, and the child is asked to tell stories about the pictures they are shown.
Rotter Incomplete Sentences Blank
Projective test, specifically a sentence completion test.
Provides 40 sentences for subject to fill in to see what is on their minds.
Barnum Effect
The tendency of people to accept and approve of the interpretation of their personality that you give them. This is a form of pseudo-validation.
Interest Testing
Used to assess an individual’s interest in different lines of work.
Strong-Campbell Interest Inventory
Best known interest test. Interpretation of results is based on the Holland model of occupational themes.
Holland’s Six Themes were: realistic, investigative, artistic, social, enterprising, conventional.
Standard Deviation calculation
The Square Root of the variance
Variance calculation
The standard deviation squared
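
Since the two cards above define each term through the other, it may help to see both computed from raw scores. This sketch assumes the population formula (dividing by N); a sample estimate would divide by N - 1 instead. The scores are made up.

```python
import math

def variance(scores):
    """Population variance: the average squared deviation from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # mean is 5
var = variance(scores)                 # 4.0
sd = standard_deviation(scores)        # 2.0
```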
z-score calculation
(your score - the mean)/(standard deviation)
Deviation vs Ratio IQs
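Deviation IQ: the mean is set at 100 and scores are standardized against others of the same age, so the IQ shows how far a person deviates from the average for their age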
Ratio IQ: (mental age/chronological age)*100
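
A small sketch contrasting the two kinds of IQ. The ratio formula comes from the card above. The deviation version assumes an IQ scale with an SD of 15, which is a common convention but is not stated in these cards, and all the ages and scores are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: (mental age / chronological age) * 100."""
    return mental_age / chronological_age * 100

def deviation_iq(raw, age_group_mean, age_group_sd, iq_sd=15):
    """Deviation IQ: position relative to same-age peers, rescaled to mean 100.

    iq_sd=15 is an assumed convention, not taken from the flashcards.
    """
    z = (raw - age_group_mean) / age_group_sd
    return 100 + iq_sd * z

ratio = ratio_iq(mental_age=10, chronological_age=8)                     # 125.0
deviation = deviation_iq(raw=60, age_group_mean=50, age_group_sd=10)     # 115.0
```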