PSYC 523 Flashcards
Achievement Test
What: A test that is designed to measure an individual’s level of knowledge in a particular area; generally used in schools and educational settings. Often the distinction is made that achievement tests emphasize ability acquired through formal learning or training rather than their innate potential. Focuses specifically on how much a person knows about a specific topic.
Why: Achievement tests show how much a person has already learned, which helps gauge their ability to succeed in a given area. These tests are cost-effective, and their scoring is objective and reliable.
EX: The comps exam is an achievement test, as it is designed to measure how thoroughly clinical counseling students have learned the information in the ten core classes of the program
ANOVA
What: Analysis of Variance. A parametric statistical technique used to compare more than two experimental groups at a time. It is more flexible than a t-test because it can analyze the difference among more than two groups, even if the groups have different sample sizes.
Why: Determines whether there is a significant difference between the groups, but does not reveal where that difference lies.
EX: You are studying the effects of social media use on sleep, with low-, moderate-, and high-use social media groups. You would run an ANOVA to see if there are differences in sleep among the three groups.
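A minimal sketch of how the one-way ANOVA F ratio is computed, using three hypothetical groups (the sleep scores and the function name `one_way_anova_f` are made up for illustration). It compares variability between group means with variability within groups:

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)               # number of groups
    n_total = len(all_scores)     # total number of scores
    # Between-groups sum of squares: each group mean vs. the grand mean,
    # weighted by group size (so unequal group sizes are allowed).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: each score vs. its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical hours of sleep per night for each social media use group
low = [8, 7, 8, 9]
moderate = [7, 6, 7, 8]
high = [5, 6, 5, 4]

f_stat, df_b, df_w = one_way_anova_f([low, moderate, high])
print(f_stat, df_b, df_w)
```

The F value is then compared with a critical value for (df_between, df_within). A significant F only says that some group differs; post hoc comparisons are needed to locate the difference.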
Aptitude Test
What: In the context of testing, a test designed to measure an individual’s potential to learn or acquire a specific skill. Aptitude tests rely heavily on predictive criterion validation procedures.
Why: This is important for understanding a person’s innate potential. Aptitude tests are prone to bias.
EX: The SAT is an aptitude test designed to predict a student’s potential success in college. There is, however, reason to doubt the predictive validity of the SAT.
Clinical vs Statistical Significance
What: Clinical significance refers to the meaningfulness of change in a client’s life. Statistical significance refers to the reliability of an outcome and is calculated mathematically. Generally in psychology, a result is statistically significant if the p-value is < .05, meaning that if there were truly no effect (the null hypothesis), a result at least this extreme would be expected less than 5% of the time.
Why: Findings can be clinically significant without being statistically significant or vice versa. This is important in being a good consumer of research and in understanding whether the results of a study show that an intervention is successful for a disorder.
EX: Sarah was researching whether CBT is a good intervention for OCD. The results showed clinical significance: CBT produced a meaningful reduction in clients’ symptoms.
Construct Validity
What: This is the degree to which a test or instrument is capable of measuring the concept, trait, or other theoretical entity that it claims to be measuring. A variety of factors can threaten construct validity, such as a mismatch between the construct and its operational definition, bias, experimenter effects, and participant effects. There are two types:
Convergent validity - does the test correlate highly with other tests that measure the same concept?
Divergent validity - does the test correlate weakly with tests that measure different constructs?
Why: This is important in research to make sure you are actually measuring what you intended to measure. It can be determined by factor analyses.
EX: If a researcher were to create a scale measuring aggression, construct validity would be the extent to which the questions actually measured aggression rather than a related construct such as assertiveness.
Content Validity
What: Content validity is the degree to which a measure or study includes all of the facets/aspects of the construct that it is attempting to measure. Content validity cannot be measured empirically but is rather assessed through logical analysis.
Why: This is important to research and measure the entire range of what you want to measure. It is typically judged through expert review of the items rather than through a statistical test.
EX: If a test is designed to survey arithmetic skills at a third-grade level, content validity indicates how well it represents the range of arithmetic operations possible at that level
Correlation vs Causation
What: Correlation simply tells us if there is a relationship between two variables. This can be positive or negative and the coefficient will be between -1 and +1. Causation can only be concluded if there is a manipulation of an independent variable (determined via controlled studies). Correlation does not equal causation!!
Why: This is important when consuming research to find out if you are measuring just a potential relationship or if one variable causes a result in another.
EX: Marla is examining the relationship between social media use and body image. In this study she is not manipulating any variable, so she cannot conclude causation; she is only assessing whether a relationship exists, so this is a correlational study.
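A minimal sketch of how a Pearson correlation coefficient is computed. The data and the helper name `pearson_r` are hypothetical; the point is that the result always falls between -1 and +1 and says nothing about which variable, if either, causes the other:

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: daily hours of social media use and body image rating
hours = [1, 2, 3, 4, 5]
body_image = [9, 8, 6, 5, 2]

r = pearson_r(hours, body_image)
print(round(r, 2))  # negative: more use goes with lower ratings in this sample
```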
Dependent t-test
What: Statistical analysis that compares the means of two related groups to determine whether there is a statistically significant difference between these means. Sometimes called a correlated t-test because the data are correlated. Used when the design involves matched pairs or repeated measures and only two conditions of the independent variable.
Why: It is called “dependent” because the subjects carry across the manipulation: they take with them personal characteristics that impact the measurement at both points, so the measurements are “dependent” on those characteristics.
EX: If you are measuring cigarette smoking habits in specific smokers before and after an intervention, you would use a dependent t-test to measure habits on the same group before and after intervention.
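A minimal sketch of the dependent (paired) t statistic, assuming hypothetical cigarettes-per-day counts for the same five smokers before and after an intervention. The test works on the difference score for each person:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for repeated measures on the same subjects."""
    # One difference score per subject (after minus before)
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Mean difference divided by the standard error of the differences
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical cigarettes smoked per day
before = [20, 15, 30, 25, 10]
after = [15, 13, 24, 20, 9]

t_stat, df = paired_t(before, after)
print(round(t_stat, 2), df)
```

A negative t here reflects fewer cigarettes after the intervention. The value would then be compared with a critical t for n - 1 degrees of freedom.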
Descriptive vs Inferential
What: Descriptive statistics are used to describe or summarize data but do not tell about differences or relationships. This includes mean, median, mode, standard deviation, etc. Inferential statistics are used to make inferences about the probability or extent of a relationship/difference. Instead of just summarizing the data, they summarize the relationships found in the data set. There are parametric and non-parametric inferential statistics. They are used to test hypotheses and to determine whether conclusions drawn from a sample can be generalized to a population.
Why:
EX: A researcher conducts a study examining the rates of test anxiety in Ivy League students. Reporting the rates found in the sample is descriptive. Testing whether those rates can be generalized beyond the sample would be inferential, and because only Ivy League students were sampled, the results cannot be generalized to represent all college students.
Effect Size
What: A quantitative measure of the strength of a relationship between two variables; refers to the magnitude of an effect. Common measures include Cohen’s d, which shows the number of standard deviation units between two means, and r-squared in correlation, which shows the proportion of variance the two variables share. Effect size can be calculated for the correlation between two variables, regression coefficients, or a mean difference.
Why: It is also valuable for quantifying the effectiveness of a particular intervention relative to some comparison and is commonly used in meta-analyses. Often, effect sizes are interpreted as indicating the practical significance of a research finding.
EX: A researcher conducts a correlational research study on the relationship between caffeine and anxiety ratings. The study produces a correlation coefficient of 0.8 which is considered a large effect size. The effect size reflects a strong relationship between caffeine and anxiety.
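A minimal sketch of two common effect size calculations. The group scores are hypothetical, and Cohen's d is computed here with a pooled standard deviation, which is one common convention rather than the only one:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(g1, g2):
    """Mean difference expressed in pooled standard deviation units."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / sqrt(pooled_var)

# Hypothetical anxiety ratings for two groups
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = cohens_d(treatment, control)

# For a correlation, squaring r gives the proportion of shared variance
r = 0.8
r_squared = r ** 2

print(round(d, 2), round(r_squared, 2))
```

With r = 0.8, as in the caffeine example, r-squared is 0.64, so about 64% of the variance in one variable is shared with the other.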
Independent t-test
What: Used to determine if there are significant differences between two group means. This is used when there are two conditions of the independent variable to determine if there are differences between groups using group means. The independent t-test is specifically used when the two groups are not related to each other.
Why: We assume that if both groups are randomly selected from the same population, they will resemble each other; the null hypothesis is that there is no difference between the two groups.
EX: Gabi is researching gender differences in the use of CBT for depression. She will run an independent t-test to understand if there are differences in effectiveness between groups
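A minimal sketch of the independent t statistic for two unrelated groups, assuming equal variances (the pooled-variance form). The depression scores and group labels below are hypothetical:

```python
from math import sqrt
from statistics import mean, variance

def independent_t(g1, g2):
    """Return (t, df) for two unrelated groups, pooling their variances."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    # Standard error of the difference between the two group means
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(g1) - mean(g2)) / se, n1 + n2 - 2

# Hypothetical post-treatment depression scores for two separate groups
group_a = [4, 6, 8, 10, 12]
group_b = [9, 11, 13, 15, 17]

t_value, df_total = independent_t(group_a, group_b)
print(t_value, df_total)
```

Unlike the dependent t-test, no score in one group is paired with a score in the other, so the degrees of freedom are n1 + n2 - 2.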
Internal Consistency
What: This type of reliability refers to the extent to which different items on a test measure the same ability or trait. In other words, internal consistency reflects whether several items that purport to measure the same general construct produce similar scores and are free from error. It is usually measured with Cronbach’s alpha or with the split-half method, in which scores on the two halves of the test are correlated. The resulting reliability coefficient ranges from 0 to 1. Internal consistency is an index of the reliability of a test.
Why: It is the degree of interrelationship or homogeneity among items on a test and it is important to measure what you are supposed to be measuring.
EX: Donna is creating a questionnaire to assess the Big 5 personality traits. She checks internal consistency to make sure the items written for each trait are measuring the same thing.
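A minimal sketch of Cronbach's alpha, assuming three hypothetical items answered by the same five respondents. Alpha compares the sum of the individual item variances with the variance of the total scores:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    # Total score for each respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical responses to three items meant to measure the same trait
item1 = [1, 2, 3, 4, 5]
item2 = [2, 3, 4, 5, 6]
item3 = [1, 3, 3, 5, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 2))
```

Because these made-up items rise and fall together across respondents, alpha comes out close to 1.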
Internal Validity
What: The extent to which the observed relationship between variables in a study reflects the actual relationship between the variables. Controlling for confounding variables can increase internal validity, as can random assignment of participants to conditions. Thus, internal validity is how sure we can be that the experimental treatment was the only cause of change in a dependent variable(s). It pertains to the soundness of results obtained within the controlled conditions of a particular study, specifically with respect to whether one can draw reasonable conclusions about cause-and-effect relationships among variables.
Why:
EX: Researchers investigated a new tx for depression using tight controls in terms of who could be a participant. For instance, they did not allow anyone with comorbidity to participate. This increased the study’s internal validity. It did, however, jeopardize the ecological validity of the research.
Interrater Reliability
What: A type of reliability that measures the agreement level between independent raters. Useful with measures that are less objective and more subjective. The extent to which independent evaluators produce similar ratings in judging the same abilities or characteristics in the same target person or object. It can be expressed using a correlation coefficient.
Why: Used to account for human error in the form of distractibility, misinterpretation or simply differences in opinion.
EX: Gabi is running a study on CBT for depression. She has three of her professors independently rate the same recorded client sessions using the same depression rating scale and finds that their ratings are highly similar. This indicates high interrater reliability for the rating scale.
Measures of Central Tendency
What: Provides a statistical description of the center of the distribution, and describes a data set. Three main measures are used: the mean, mode and median.
Mean is the arithmetic average of all scores within a data set.
Mode is the most frequently occurring score.
Median is the point that separates the distribution into two equal halves.
Why:
EX: Sarah works at a school and wants to see the typical score on a math exam the whole school took. She will calculate the mean of all the scores to find that information.
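A minimal sketch of the three measures using Python's standard `statistics` module. The exam scores are hypothetical:

```python
from statistics import mean, median, mode

# Hypothetical math exam scores
scores = [70, 85, 85, 90, 100]

average = mean(scores)     # arithmetic average of all scores
middle = median(scores)    # point that splits the ordered scores in half
most_common = mode(scores) # most frequently occurring score

print(average, middle, most_common)
```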
Measures of Variability
What: Measures of variability describe how the scores in a distribution spread out around the central tendency. Three primary measures: range, variance and standard deviation.
Range is obtained by taking the two most extreme scores and subtracting the lowest from the highest.
Variance is the average squared deviation around the mean; the deviations must be squared because the sum of the raw deviations would equal zero.
Standard deviation is the square root of the variance and is highly useful in describing variability.
Why: It is important to see the outliers of data to assess if they need to be dropped to get accurate data when running tests. Helps determine which statistical analyses you can run on a data set
EX: Sarah had the entire third grade take a math test to see their current math abilities. After calculating the standard deviation, she was able to see outliers and who was struggling or excelling in this subject.
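A minimal sketch of the three measures on hypothetical test scores. It uses the population formulas (`pvariance`, `pstdev`), which divide by N and match the "average squared deviation" definition above; sample formulas divide by N - 1 instead:

```python
from statistics import mean, pvariance, pstdev

# Hypothetical math test scores
scores = [60, 70, 80, 90, 100]

# Range: highest score minus lowest score
score_range = max(scores) - min(scores)

# Raw deviations around the mean always sum to zero,
# which is why they are squared before averaging.
deviations = [x - mean(scores) for x in scores]
deviation_sum = sum(deviations)

var = pvariance(scores)  # average squared deviation around the mean
sd = pstdev(scores)      # square root of the variance

print(score_range, deviation_sum, var, round(sd, 2))
```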
Nominal/Ordinal/Interval/Ratio Measurements
What: These are all levels of measurement of variables, which all vary in the degree of precision to which they can be recorded. The level of measurement is important because it determines the type of statistical analyses that can be run with the data set.
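Nominal data: categorical data with no inherent order. Categories may be dichotomous (only two levels, such as male and female) or have more levels (such as Republican, Democrat, Independent). They have none of the 3 properties that distinguish scales (magnitude, equal intervals, absolute zero).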
Ordinal data (numbers) indicate order only (1st born, 2nd born)
Interval data: true score data where you know the score a person made and you can tell the actual distance between individuals based on their respective scores, but the measure used to generate the score has no true zero (temperature, F or C, SAT scores)
Ratio data: interval data with a true zero (age, height, weight, speed)
Why: Both nominal and ordinal data are non-continuous, while interval and ratio are continuous.
EX: Gabi is running a study to assess happiness in grad-level students. Using an ordinal scale, she is able to rank them from the least happy to the most happy.
Norm-Referenced Scoring/Tests
What: A norm referenced test evaluates a test taker’s performance against a standardized sample; typically used for the purpose of making comparisons with a larger group. Norms should be current, relevant, and representative of the group to which the individual is being compared.
Why: There are some issues with norms. Many norming samples attempt to be representative of the population, which can result in several categories being represented by very few people. This can lead to inappropriate scoring or poor test acceptability with some populations, and it has resulted in within-group norming.
EX: The child psychologist tested the adolescent’s IQ and discovered that it was 165, placing him in the 99th percentile and more than 3 SDs above the mean on the normal curve, because IQ is normally distributed. IQ testing is an example of norm-referenced scoring/testing because an individual’s score is always interpreted in terms of typical performance/results.
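A minimal sketch of how a raw score is interpreted against norms. It assumes IQ norms with a mean of 100 and an SD of 15, a common convention that is not stated in the example above:

```python
from statistics import NormalDist

# Assumed norms (not given in the example): mean 100, SD 15
norm_mean, norm_sd = 100, 15
iq_norms = NormalDist(mu=norm_mean, sigma=norm_sd)

score = 165
# z-score: how many SDs the score falls above or below the norm mean
z = (score - norm_mean) / norm_sd
# Percentage of the norm group expected to score at or below this score
percentile = iq_norms.cdf(score) * 100

print(round(z, 2), percentile > 99)
```

Under these assumed norms, the score is more than 4 SDs above the mean, consistent with the "more than 3 SDs" statement in the example.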
Normal Curve
What: A normal curve is a normal distribution, graphically represented by a bell-shaped curve. It is a frequency distribution in which most occurrences take place in the middle of the distribution and taper off on either side. For many behaviors, the population tends to show this symmetrical distribution. The mean, median, and mode fall at the same point, and about 68% of the data lie within 1 SD on either side of it.
Why: Many statistical models are based on the assumption that data follow a normal distribution.
EX: A researcher is developing a new intelligence test. After obtaining the results, they found that the scores fell along a normal curve: most participants scored in the middle range with very few obtaining either the highest or lowest scores (scores were normally distributed).
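A minimal sketch that checks the 68% figure with Python's `statistics.NormalDist`, using a standard normal curve (mean 0, SD 1):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within 1 SD and within 2 SDs of the mean
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(within_1_sd, 4), round(within_2_sd, 4))  # about 0.6827 and 0.9545
```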
Objective Tests
What: This is a type of psychological assessment instrument consisting of a set of items that have specific correct answers (for example, yes/no or true/false), such that no interpretation, judgment, or personal impressions are involved in scoring. The stimuli are unambiguous, and answers are scored quantitatively and are not influenced by rater variables. Nothing is open to interpretation; there is a correct and a wrong answer.
Why:
EX: Gabi is a teacher in a special education classroom and wants to assess the math skills of her class. She will create a test in which each question has only one correct answer to see who is understanding the material.
Probability
What: A mathematical statement indicating the likelihood that something will happen when a particular population is randomly sampled, symbolized by (p). In hypothesis testing, the p-value is the probability of obtaining a result at least as extreme as the one observed if chance alone were operating (the null hypothesis is true). A cutoff of p < .05 is generally accepted for statistical significance; it does not mean that there is a 95% chance the IV influenced the DV.
Why: The higher the p-value, the more consistent the result is with chance alone; the lower the p-value, the stronger the evidence against the null hypothesis.
EX: Gabi wants to understand the probability that a child from poverty will grow up to have a drug addiction. She will distribute a test and use the resulting p-value to assess whether poverty is significantly related to potential drug addiction.
Projective Tests
What: The general idea behind projective tests is that a person’s interpretation of an ambiguous stimulus reflects their unique characteristics; most often these are personality tests. They are based on the projective hypothesis, which says that when people attempt to understand an ambiguous or vague stimulus, their interpretation of that stimulus reflects their needs, feelings, experiences, and thought processes.
Why: TAT and Rorschach tests are generally used and require extensive training. These tests need to be interpreted so they are always more at risk for error.
EX: You are seeing a client and you ask them to interpret a black ‘blob’ while using the Rorschach inkblot test. This is a projective test that suggests the client saying that she sees a crab in the image might be indicative of her mood at the time of testing.
Parametric vs Nonparametric Statistical Analyses
What: Parametric statistics are based on a normal distribution, assume homogeneity of variance, and require data to meet certain assumptions. Nonparametric statistics use non-continuous data or collapse continuous data into a non-continuous distribution to examine relationships. They make fewer assumptions about the distribution but have less statistical power, so they are typically used only when necessary. The chi-squared test is a common nonparametric test. Nonparametric statistics are used for nominal/ordinal data and for skewed interval/ratio data.
Why: Parametric analyses are preferred because they have greater statistical power and are more likely to detect statistical significance. This is important to understand when running tests on data.
EX: Professor Johnson handed out a survey to students to assess their preferences for exams on a like-to-dislike scale. This is ordinal data, so nonparametric statistics are required.
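A minimal sketch of a chi-squared goodness-of-fit statistic, one common nonparametric calculation. The response counts are hypothetical, and the expected counts assume that students are equally likely to choose each category:

```python
def chi_square_gof(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of students answering like / neutral / dislike
observed = [30, 20, 10]
# Expected counts if all three responses were equally likely
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_gof(observed, expected)
df_chi = len(observed) - 1

print(chi2, df_chi)
```

The statistic works on category counts only, so it requires no assumption that the scores are normally distributed. The result would be compared with a critical chi-squared value for the given degrees of freedom.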
Regression
What: Regression is prediction based on correlated data. Correlation tells us whether a relationship exists; regression allows us to predict based on that relationship by identifying the line of best fit. R-squared is generally used to describe how much of the variance in an outcome (dependent) variable is explained by scores on one or more predictor (independent) variables. Simple linear regression predicts one variable from a single predictor, while multiple regression uses multiple predictor variables.
Why: This is important to assess if there is a relationship between X and Y. It produces a line of best fit. The stronger the correlation, the less error in prediction.
EX: A developmental psychologist performed a study on aggressive behavior in boys and hormone levels. Researchers performed a regression analysis on the data. Their results showed that the severity and frequency of the boys’ aggression could be accurately predicted based on the levels of testosterone
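A minimal sketch of simple linear regression by least squares, using hypothetical testosterone and aggression scores. The helper name `fit_line` is made up for illustration:

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Proportion of variance in y explained by x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical predictor (testosterone level) and outcome (aggression score)
testosterone = [1, 2, 3, 4, 5]
aggression = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(testosterone, aggression)

# Predict the aggression score for a new testosterone level of 6
predicted = intercept + slope * 6

print(round(slope, 2), round(intercept, 2), round(r_squared, 2), round(predicted, 2))
```

The closer r-squared is to 1, the less error there is in predictions made from the line of best fit.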