523 - Stats Flashcards by Deena Elraie

achievement test

WHO:

WHERE: used in schools and education settings

WHAT: A test designed to measure how much someone knows about a particular topic. It measures previous learning, not the ability to learn something new.

WHY: Achievement tests offer a standardized measure to compare individuals or groups. The scoring is objective and reliable. They may also help to highlight academic strengths and weaknesses.

EXAMPLE: Comps is an achievement test designed to measure how much students have learned in the ten core classes of the program, and whether they have learned enough to continue in the program.

ANOVA

WHO:

WHERE: Applied statistics and psychometrics

WHAT: Analysis of variance. A statistical technique used to compare the means of more than two experimental groups at a time. It differs from a t-test, which compares only two means at a time.

WHY: ANOVAs determine whether there is a significant difference between groups. Running one ANOVA instead of several pairwise t-tests also reduces the chance of a type I error (false positive).

EXAMPLE: An experiment compares test scores across three study techniques: flashcards, note reading, and practice tests. An ANOVA is run to see whether there are any significant differences between the groups.
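A minimal sketch of how the F statistic for the three-group example could be computed by hand. The scores are made up for illustration, and the function name is hypothetical; a real analysis would normally use a statistics package and also report a p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical test scores for each study technique
flashcards = [78, 82, 85, 80]
note_reading = [70, 72, 68, 74]
practice_tests = [88, 90, 85, 91]

f_stat, df_b, df_w = one_way_anova_f([flashcards, note_reading, practice_tests])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread within the groups would lead one to expect.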

aptitude test

WHO:

WHERE: Applied statistics and psychometrics

WHAT: Measures a person’s potential to learn specific skills/gain knowledge on a topic. They rely heavily on predictive criterion validation procedures.
Prone to bias (cultural, racial, language).

WHY: Aptitude tests are important to help understand a person’s innate potential. They can help predict future performance in specific areas and help ensure that students are enrolled in programs that match their capabilities.

EXAMPLE: The ACT is an aptitude test designed to predict a student’s potential success in college. There is reason to doubt the predictive validity of the ACT (racial, gender bias).

clinical vs statistical significance

WHO:

WHERE: Applied statistics and psychometrics

WHAT: Clinical = meaningfulness of change in a client’s life
How meaningful/important are the changes to the patient? Does an individual still have quality of life? Do they still meet criteria for a diagnosis? What percentage of patients are benefitting?

Statistical = reliability of an outcome; calculated mathematically; conventionally considered statistically significant if the p-value is < .05 (results at least this extreme would occur less than 5% of the time if there were no real effect)
larger sample = more precise estimates, so even small effects can reach statistical significance

WHY: Findings can be clinically significant without being statistically significant, or vice versa.
This is important to remember when reading research and when judging whether a treatment may be helpful for a disorder.

EXAMPLE: A therapist is trying to decide between two different treatments for a client. One treatment has both high clinical significance and statistical significance. The other has high statistical significance but low clinical significance. The therapist chooses the first treatment, because the patients in that study had a higher quality of life and fewer of them met diagnostic criteria post-treatment.

construct validity

WHO:

WHERE: applied stats and psychometrics

WHAT: The degree to which a test is capable of measuring all aspects of what it claims/aims to be measuring.
- Focuses on the attributes, features, nature, and ability of a measurement instrument being tested
- Does it fully measure what it aims to be measuring?
Divergent validity = how well the test does not correlate with other tests that measure different constructs
Convergent validity = how well the test correlates with other tests that measure the same constructs

WHY: It is important to keep construct validity in mind to ensure you are measuring what you intend to research. Additionally, steps can be taken to avoid things that threaten construct validity, such as a mismatch between the construct and its operational definition, bias, and experimenter and participant effects.

EXAMPLE: A group of researchers create a new test to measure depression. They want to ensure that the test has construct validity (that it is actually measuring the construct of depression). To do this, they measure how strongly the test correlates with the BDI and how weakly it correlates with a measure of another construct, like anxiety.

content validity

WHO:

WHERE: applied stats and psychometrics

WHAT: The degree to which a measure represents all aspects of a given construct
- how well a measure encompasses the full domain of what it is trying to measure
- Is the test representative of what it aims to measure?
Can’t be measured empirically; typically assessed via expert judgment of the items (factor analysis is a tool for construct validity)

WHY: Considering content validity is important in research to ensure a measure covers the entire range of what it aims to test. Useful for assessing whether items are relevant.

EXAMPLE: A test is designed to survey arithmetic skills at a fourth-grade level. The test’s level of content validity indicates how well it represents the range of arithmetic skills possible at that level.

correlation vs causation

WHO:

WHERE: applied stats and psychometrics

WHAT: correlation = relationship between two variables (correlation coefficient ranges from -1 to +1)
causation = when a change in one variable brings about a change in the other variable; determined via controlled studies

WHY: Correlation ≠ causation!! Important to consider when creating + consuming research to know how/why two variables are related, and to be able to draw accurate conclusions

EXAMPLE: Ice cream sales and drowning rates are positively correlated. This is not because one causes the other, but rather because both are more common during summer months.
Annie is examining the relationship between social media and her body image. She abstained from social media for one month and noticed her body image became more positive. She now has reason to believe there is a causal relationship between the two variables.
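A small sketch of how a correlation coefficient could be computed for the ice cream example. The monthly figures are invented for illustration, and `pearson_r` is a hypothetical helper written out by hand.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    co_deviation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical monthly figures; both rise in the summer months
ice_cream_sales = [20, 35, 50, 80, 95]
drownings = [1, 2, 4, 5, 6]

r = pearson_r(ice_cream_sales, drownings)
print(f"r = {r:.2f}")
```

The coefficient comes out strongly positive, yet nothing in the calculation says which variable (if either) causes the other; that takes a controlled study.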

dependent t-test

WHO:

WHERE: applied stats and psychometrics

WHAT: A statistical analysis that compares the means of two RELATED groups to determine whether there is a statistically significant difference between their means
- used when the design involves matched pairs or repeated measures, and has 2 levels of the IV

WHY: Called ‘dependent’ because the scores in the two groups are related: they come from the same participants or from matched pairs, so each measurement depends on the characteristics of the person (or pair) it came from.
This allows researchers to control for individual characteristics.

EXAMPLE: A researcher wants to test how effective a relaxation technique is at reducing stress levels in college students. Stress levels are recorded before and after the use of the relaxation technique. A dependent t-test is conducted to compare the mean stress levels before and after the intervention to determine if the relaxation technique made a statistically significant difference.
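A minimal sketch of the dependent t-test for the before/after design, assuming made-up stress scores for six students. The function name is hypothetical, and the resulting t would still need to be compared against a critical value or converted to a p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a dependent (paired) t-test."""
    # Work with each person's own change score, which controls
    # for stable individual characteristics.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical stress scores for the same six students
stress_before = [30, 28, 35, 32, 27, 31]
stress_after = [25, 26, 30, 29, 26, 27]

t, df = paired_t(stress_before, stress_after)
print(f"t({df}) = {t:.2f}")
```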

internal consistency

WHO:

WHERE: applied stats and psychometrics

WHAT: Type of reliability
Measures the extent to which items on a test measure a specific ability or trait.
Do items that are intended to measure the same construct produce similar scores?
Measured with Cronbach’s alpha, which typically ranges from 0 to 1

WHY: Internal consistency shows the degree of interrelationship/homogeneity of items on a test. It is important for ensuring that the items on a test consistently measure the same thing.

EXAMPLE: Molly is creating a test to measure the Big 5 personality traits. She checks the test’s internal consistency to ensure its items are measuring what she intended. The Cronbach’s alpha comes out to 0.91, indicating good internal consistency. Molly’s test is suitable for use.
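A sketch of how Cronbach’s alpha could be calculated, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The item responses below are invented, and `cronbach_alpha` is a hypothetical helper.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's score."""
    k = len(item_scores)
    sum_item_variances = sum(variance(item) for item in item_scores)
    # Total score per respondent across all items
    totals = [sum(responses) for responses in zip(*item_scores)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical responses from five people to three items on one scale
item_1 = [1, 2, 3, 4, 5]
item_2 = [2, 2, 3, 4, 5]
item_3 = [1, 3, 3, 5, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"alpha = {alpha:.2f}")
```

Because the three items rise and fall together across respondents, alpha comes out high.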

internal validity

WHO:

WHERE: applied statistics and psychometrics

WHAT: The extent to which the observed relationship between variables in a study reflects their actual relationship
Internal validity is how sure you can be that the intervention was the only reason for change in the DVs
To increase internal validity = control for confounding variables, randomly assign participants to conditions

WHY: A study with high internal validity may indicate causation. Internal validity indicates whether one can draw reasonable conclusions about the cause-and-effect relationships among variables in a study.

EXAMPLE: A group of researchers were testing a new treatment for depression. They tightly controlled who could be a participant, including not allowing anyone with a comorbid disorder. This reduced potential confounding variables, increased the study’s internal validity, and therefore increased the likelihood that their treatment was the sole reason for change in participants.

interrater reliability

WHERE: applied statistics and psychometrics

WHAT: Type of reliability
Measures the agreement level between independent raters
- the extent to which independent evaluators produce similar ratings in judging the same thing in the same person/object
- useful with measures that are less objective and more subjective
- expressed with a correlation coefficient

WHY: Interrater reliability is used to compensate/account for human error in an independent rater (distractibility, misinterpretation, differences in ability)

EXAMPLE: A naturalistic observation study is being conducted to look at the effect of violent video games on the behavior of 10-year-old boys. Three independent observers rated the level of aggressiveness of the boys’ behavior. The ratings were consistent and yielded a high correlation coefficient, indicating good interrater reliability.

measures of central tendency

WHO:

WHERE: applied statistics and psychometrics

WHAT: Statistical descriptions of the center of the distribution
Mean = average
Median = point that separates the distribution into two halves
Mode = most frequently occurring value
Note: median and mode are the most resistant to outliers

WHY: Describes a data set/distribution. Allows for a better understanding of the data, as well as for inferences to be made about trends and the shape of the distribution.

EXAMPLE: A researcher is studying how often BPD patients intentionally skip their medications per month. To better understand the gathered data, the researcher calculates the most frequently occurring number of missed doses (mode), the average number of missed doses (mean), and the number of missed doses at the center of the data set (median).
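A short sketch of the three measures using Python’s standard `statistics` module. The missed-dose counts are invented, with one deliberate outlier to show why the median and mode are more resistant than the mean.

```python
from statistics import mean, median, mode

# Hypothetical missed doses per month for nine patients (15 is an outlier)
missed_doses = [0, 1, 1, 2, 2, 2, 3, 4, 15]

average = mean(missed_doses)        # pulled upward by the outlier
middle = median(missed_doses)       # middle value of the sorted data
most_common = mode(missed_doses)    # most frequently occurring value

print(f"mean = {average:.2f}, median = {middle}, mode = {most_common}")
```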

measures of variability

WHO:

WHERE: applied statistics and psychometrics

WHAT: How the spread of a distribution varies around the central tendency
SD = square root of the variance
Range = difference between the highest and lowest values
Variance = the average of each value’s SQUARED difference from the mean

WHY: It is important to see the outliers in the data to assess whether they need to be dropped to get accurate results when running tests. Helps determine which statistical analyses you can run on a data set.

EXAMPLE:

nominal/ordinal/interval/ratio measurements

WHO:

WHERE: applied statistics and psychometrics

WHAT:

WHY:

EXAMPLE:

norm-referenced scoring/tests

WHERE: Taught in applied stats and psychometrics

WHAT: A norm-referenced test evaluates a test taker’s performance against a standardized sample; typically used for the purpose of making comparisons with a larger group. Norms should be current, relevant, and representative of the group to which the individual is being compared.

WHY: Norm-referenced scoring/tests can be problematic when tests are not normed with a culturally diverse population. Even when norming samples attempt to be representative of the population, several categories may be represented by very few people. This can lead to inappropriate scoring or limited test acceptability with some populations, and has resulted in within-group norming.

EXAMPLE: IQ testing is an example of norm-referenced scoring/testing because an individual’s score is always interpreted in terms of typical performance/results.

How well did you know this?

Not at all

Perfectly

normal curve

WHERE: Taught in applied stats and psychometrics

WHAT: The normal curve is the bell-shaped curve created by a normal distribution of a population. It describes the shape of a frequency distribution in which most occurrences take place in the middle and taper off to either side of the mean. The normal curve is symmetrical, and the mean, median, and mode are the same value. With large enough random samples, many measured traits tend to approximate a normal curve.

WHY: It is important to understand what the normal curve is as many statistical models are based on the assumption that data follow a normal distribution.

EXAMPLE: A researcher is developing a new intelligence test. After obtaining the results, they found that the scores fell along a normal curve: most participants scored in the middle range with very few obtaining either the highest or lowest scores (scores were normally distributed).
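A brief sketch of what "most scores in the middle, few at the extremes" looks like numerically, using `statistics.NormalDist` from the standard library. The mean of 100 and SD of 15 are assumptions chosen for illustration; the card does not give values for the new test.

```python
from statistics import NormalDist

# Assumed scale for illustration: mean 100, standard deviation 15
scores = NormalDist(mu=100, sigma=15)

within_1_sd = scores.cdf(115) - scores.cdf(85)   # proportion within 1 SD of the mean
within_2_sd = scores.cdf(130) - scores.cdf(70)   # proportion within 2 SDs of the mean
above_3_sd = 1 - scores.cdf(145)                 # proportion more than 3 SDs above

print(f"within 1 SD: {within_1_sd:.3f}")
print(f"within 2 SD: {within_2_sd:.3f}")
print(f"above +3 SD: {above_3_sd:.4f}")
```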

objective tests

WHERE: Taught in applied stats and psychometrics

WHAT: This is a type of unbiased and structured psychological assessment instrument consisting of a set of items that have specific response options, such as yes/no or true/false. The stimuli are unambiguous, and answers are scored quantitatively. Scoring is not open to interpretation.

WHY: Important because no interpretation, judgment, or personal impressions are involved in scoring.

EXAMPLE: The psychologist found that when assessing clients with Borderline Personality Disorder, objective tests of personality, such as the MMPI, were more valid in providing personality information than projective tests, such as the Rorschach Test, in which the psychologist’s own personal bias or judgment could distort the results and so affect the reliability and validity of the measure.

probability

WHERE: Taught in applied stats and psychometrics

WHAT: A mathematical statement indicating the likelihood that something will happen when a particular population is randomly sampled, symbolized by (p). A significance threshold of p < .05 is generally accepted; it means that, if the IV truly had no effect, results at least this extreme would be expected less than 5% of the time. It does not mean there is a 95% chance that the IV influenced the DV.

WHY: The higher the p-value, the more consistent the result is with chance alone, and the weaker the evidence against the null hypothesis.

EXAMPLE: A psychologist wants to understand the probability that a child raised in poverty will develop a drug addiction as an adult. She collects data and uses a p-value to assess whether poverty is significantly related to later drug addiction.

projective tests

WHERE: Taught in applied stats and psychometrics

WHAT: A type of unstructured assessment in which the test-taker is asked to provide a spontaneous response to ambiguous stimuli, rather than choosing an answer from provided response options. Most often used with personality tests. Based on the projective hypothesis, which says that when people attempt to understand an ambiguous or vague stimulus, their interpretation of that stimulus reflects their needs, feelings, experiences, and thought processes.

WHY: TAT and Rorschach tests are generally used and require extensive training. These tests need to be interpreted so they are more at risk for error.

EXAMPLE: The therapist administers the Rorschach inkblot test, a projective test. The purpose is for the client to give their response based on their subjective representation of the inkblot. However, the client’s response to the inkblot could be an indicator of their mood at the time of testing.

parametric vs nonparametric statistical analyses

WHERE: Taught in applied stats and psychometrics

WHAT: Parametric statistical analyses require certain assumptions about the distribution of scores, typically that they follow a normal (symmetrical) distribution. Nonparametric statistical analyses do not require such strict assumptions and can be used when the distribution is skewed (not normal).

WHY: Important to understand the differences between them to see which one to use. Parametric analyses are preferred because they have greater statistical power and are more likely to detect a true effect. Nonparametric analyses are used when necessary (for nominal/ordinal data and skewed interval/ratio data).

EXAMPLE: A researcher sets up a study categorizing participants to see what proportion of students enrolled in the program prefer exams over papers (like or dislike scale). Because the responses are categories rather than interval or ratio scores, the data call for a nonparametric analysis rather than a parametric one.

regression

WHO: Sir Francis Galton developed this descriptive statistical technique

WHERE: Taught in applied stats and psychometrics

WHAT: An analysis one step beyond correlation: prediction based on significantly correlated data. It can be used to describe, explain, or predict the variance of an outcome or DV using scores on one or more predictors or IVs.

WHY: The reasoning is that if two variables are significantly correlated, we should be able to predict one from the other. This is important for assessing whether there is a relationship between X and Y. It produces a line of best fit. The stronger the correlation, the less error in prediction.

EXAMPLE: A researcher finishes a study in which they find a positive correlation between caffeine and test scores. They want to use this data to estimate an individual’s test score based on the amount of caffeine they consume so they calculate a regression line. They will use the equation of this line to make predictions.
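A minimal sketch of fitting the line of best fit by ordinary least squares and using it to predict, with invented caffeine (mg) and test score data. The function name and all numbers are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: caffeine consumed (mg) and test score
caffeine_mg = [0, 50, 100, 150, 200]
test_score = [60, 65, 72, 74, 79]

slope, intercept = fit_line(caffeine_mg, test_score)
predicted = intercept + slope * 120  # predicted score at 120 mg
print(f"score = {intercept:.1f} + {slope:.3f} * caffeine")
print(f"predicted score at 120 mg: {predicted:.2f}")
```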

types of reliability:

WHERE: Taught in applied stats and psychometrics

WHAT: Reliability is the trustworthiness or consistency of a measure: the degree to which the instrument is free of random error, yielding the same results across multiple applications.
- Internal reliability is the extent to which a measure is consistent within itself, while external reliability is the extent to which a measure varies from one use to another
- Test-Retest Reliability examines the consistency of a measure from one time to another. The same test is given at two points in time, and reliability is the correlation between the scores obtained by the same person on the 2 occasions
- Inter-Rater Reliability examines the degree of consistency between different raters’ scores, expressed as the correlation between those scores
- Parallel Forms Reliability examines the consistency of the results of two tests constructed in the same way from the same content domain. Tests must be very similar. Expressed as the correlation between the equivalent forms of the test

WHY: Important because they are used to assess how well a testing method yields the same results across different conditions, free of measurement error.

EXAMPLE: While developing a new version of an IQ test, researchers gave the test to the same group of subjects at several different times to evaluate the instrument’s test-retest reliability.

sample vs population

WHERE: Taught in applied stats and psychometrics

WHAT: A sample is a small subset of the population that is selected to represent the population in a study. A population is all members of a group; the larger group of individuals from which a sample is selected.

WHY: It’s important to ensure that the sample is representative of the population in research because it increases the potential that any findings of importance can be generalized back to the whole population

EXAMPLE: A researcher wants to conduct a study to examine how opioid addiction affects depression rates. As it would be nearly impossible to study every individual with an opioid addiction, they select a sample of individuals that most closely represents the whole population.

standard error of estimate

WHERE: Taught in applied stats and psychometrics

WHAT: A standard deviation in a regression which indicates the amount that the actual scores differ from the predicted scores. This is also known as standard error of the residuals. It is a measure of the accuracy of an estimate.

WHY: Important to know as the smaller the standard error of estimate, the more confident one can be in the accuracy of the estimated y value.

EXAMPLE: A researcher wants to understand the relationship between caffeine and test scores, so they calculate the correlation and regression line. Next, they want to know if the predictions made using the regression equation are accurate predictions of test scores, so they calculate the standard error of estimate.
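A sketch of the standard error of estimate for a regression line that has already been fitted. The data, slope (0.094), and intercept (60.6) are hypothetical values for illustration; the formula used is the square root of the summed squared residuals divided by n − 2.

```python
from math import sqrt

def standard_error_of_estimate(xs, ys, slope, intercept):
    """Typical distance between actual scores and regression predictions."""
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # n - 2 because two parameters (slope and intercept) were estimated
    return sqrt(sum(r ** 2 for r in residuals) / (len(xs) - 2))

# Hypothetical data and a hypothetical line fitted to it
caffeine_mg = [0, 50, 100, 150, 200]
test_score = [60, 65, 72, 74, 79]

see = standard_error_of_estimate(caffeine_mg, test_score,
                                 slope=0.094, intercept=60.6)
print(f"standard error of estimate = {see:.2f} points")
```

A value of about 1.3 points would mean actual scores typically fall within a point or two of what the line predicts.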

standard error of measurement

WHERE: Taught in applied stats and psychometrics

WHAT: An estimate of how much an individual’s score on a measurement would be expected to change upon retesting with a similar or the same test.

WHY: Important as it provides an indication of how confident one may be that an individual’s obtained score on any given measurement occasion represents their true score. The smaller the SEM, the more precise the instrument’s measurement.

EXAMPLE: A researcher develops a test to measure depression, then administers it to a sample. They want to know how closely obtained scores reflect true scores, so they calculate the SEM. It turns out to be low, meaning an individual’s obtained score is likely to be close to their true score.
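The card gives no formula, so this sketch assumes the common classical test theory form, SEM = SD × sqrt(1 − reliability). The SD of 15, the reliability of .91, and the obtained score of 110 are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM under classical test theory: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

# Hypothetical test: SD of 15 points, reliability coefficient of .91
sem = standard_error_of_measurement(sd=15, reliability=0.91)

# Approximate 95% band around a hypothetical obtained score of 110
obtained = 110
low, high = obtained - 1.96 * sem, obtained + 1.96 * sem
print(f"SEM = {sem:.2f}; 95% band: {low:.1f} to {high:.1f}")
```

The more reliable the test, the narrower the band around any obtained score.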

standard error of the difference (2 sample t-test)

WHERE: Taught in applied stats and psychometrics

WHAT: The standard deviation of the sampling distribution of the difference between two sample means. It estimates how much the difference between group means would vary from sample to sample by chance alone.

WHY: Important as it is the estimate of error in the difference between the two groups, and it is the denominator of the two-sample t statistic. A two-sample t-test compares the means of two samples to see if they came from the same population.

EXAMPLE: A researcher conducts a study on how caffeine affects test scores. They take the mean of scores from each group (with and without caffeine) and calculate the difference between the means. They then use the SED to judge how large that difference is relative to the amount of error expected by chance.
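A sketch of the standard error of the difference and the resulting t statistic for two independent groups. It assumes the unpooled form, sqrt(s1²/n1 + s2²/n2); a pooled-variance version is also common. The scores and function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standard_error_of_difference(group_a, group_b):
    """Unpooled standard error of the difference between two means."""
    return sqrt(variance(group_a) / len(group_a)
                + variance(group_b) / len(group_b))

# Hypothetical test scores
with_caffeine = [82, 85, 88, 90, 85]
without_caffeine = [78, 80, 83, 79, 80]

sed = standard_error_of_difference(with_caffeine, without_caffeine)
mean_difference = mean(with_caffeine) - mean(without_caffeine)
t = mean_difference / sed  # difference expressed in standard-error units

print(f"difference = {mean_difference}, SED = {sed:.2f}, t = {t:.2f}")
```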

test bias

WHERE: Taught in applied stats and psychometrics

WHAT: A systematic error in the measurement process that differentially influences scores for identified groups. A difference in test scores that can be attributed to demographic variables such as age, sex, and race. Tests are considered biased if a test design systematically disadvantages certain groups of people over others.

WHY: This bias is a systematic error and is important to keep in mind when adding cultural and ethnic factors into test making.

EXAMPLE: Researchers develop a test that examines depression levels. The test uses language and vernacular that is not easily recognized by non-white American populations. The test has bias.

type I and type II error

WHERE: Taught in applied stats and psychometrics

WHAT: Type I error occurs when researchers incorrectly conclude that the independent variable had an effect on the dependent variable: they rejected a null hypothesis that was actually true (false positive). Type II error occurs when researchers incorrectly conclude that the independent variable did not affect the dependent variable: they failed to reject a null hypothesis that was actually false (false negative).

WHY: Important to know the differences when looking at the data. Helps with limiting errors.

EXAMPLE: A researcher is testing a new drug to treat depressive symptoms. After reviewing the results, they concluded that the drug effectively reduced symptoms; however, the results were wrong, and the drug had no impact. This is a Type I error.
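A rough simulation sketch of the Type I error rate. It assumes a simple z-test with a known SD of 1 and a two-tailed cutoff of 1.96 (alpha = .05), and it generates every "study" from a population where the null hypothesis is true, so every rejection is a false positive. All names and settings are hypothetical.

```python
import random
from math import sqrt
from statistics import mean

def false_positive_rate(n_studies=2000, n_per_study=30, seed=0):
    """Share of null-true studies that wrongly reject the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        # The drug has no effect: change scores have mean 0 and SD 1
        sample = [rng.gauss(0, 1) for _ in range(n_per_study)]
        z = mean(sample) * sqrt(n_per_study)  # z-test with known SD = 1
        if abs(z) > 1.96:                     # two-tailed cutoff, alpha = .05
            rejections += 1
    return rejections / n_studies

rate = false_positive_rate()
print(f"false positive rate: {rate:.3f}")
```

The rate should land near .05, which is what the alpha level means: about 1 in 20 studies of an ineffective treatment will still look significant.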

types of validity:

WHERE: Taught in applied stats and psychometrics

WHAT: Validity is the measure of how well a particular measure fulfills the function for which it is being used
- Content Validity: degree to which a measure represents all aspects of a given construct; how well a measure encompasses the full domain of what it is trying to measure
- Criterion Validity: extent to which the test corresponds with a particular criterion against which it is compared; how well one measure predicts the outcome of another measure
- Concurrent Validity: extent to which a new measure correlates with a previously established/validated measure
- Construct Validity: the degree to which the test measures the construct or trait it intends to measure
- Internal Validity: whether the effects observed in a study are due to the manipulation of the independent variable and not some other factor
- External Validity: the extent to which the results of a study can be generalized to other situations and to other people

WHY: Important to know to see if a measure fulfills the function it is being used for.

EXAMPLE: A group of researchers design a test to measure depression. They want to ensure that the test has construct validity, so they measure how well the test correlates with the Beck Depression Inventory and how much it differs from a measure of another construct, like anxiety.

variance

WHERE: Taught in applied stats and psychometrics

WHAT: A measure of variability defined as the average squared deviation around the mean. Deviations must be squared because the sum of deviations around the mean always equals 0. Variance is useful for statistical analysis but less useful for descriptive statistics, because it is expressed in squared units.

WHY: Variance is helpful in research because it can be quantified and used to compare samples, to see which has the most or least variance, or to see how much variance changes due to an intervention or treatment.

EXAMPLE: A researcher is studying the effects of an SSRI on depression symptoms. The variance between the placebo group and the intervention group is high relative to the variance within each group. This suggests that the SSRI has an effect on depressive symptoms.
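A short worked sketch of why deviations are squared, using a small invented data set. It computes the population variance (dividing by n, as in the card’s "average squared deviation"); a sample variance would divide by n − 1 instead.

```python
from math import sqrt
from statistics import mean

# Hypothetical symptom scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]
m = mean(scores)

deviations = [x - m for x in scores]
sum_of_deviations = sum(deviations)        # always 0, so it says nothing about spread

variance = sum(d ** 2 for d in deviations) / len(scores)  # average squared deviation
sd = sqrt(variance)                        # back in the original units

print(f"sum of deviations = {sum_of_deviations}")
print(f"variance = {variance}, SD = {sd}")
```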

(30 cards)