PSYC523 – Statistics Flashcards

1
Q

ANOVA

A

an abbreviation for an analysis of variance, which is a parametric procedure for determining whether significant differences exist in an experiment that contains two or more conditions. ANOVA is a test of inferential statistics.

Ex: A medical researcher wants to determine whether there is a difference in the mean length of time it takes 3 types of pain relievers to provide relief from headache pain. Headache sufferers are randomly selected, randomly assigned to 3 groups, and given one of the 3 medications. Each headache sufferer records the time in minutes it takes the medication to begin working. A one-way ANOVA is used to test for a difference among the medications by comparing the variance between groups with the variance within groups. The null hypothesis states that the mean relief times for the 3 medications are equal. The F-value does not fall in the rejection region, so the null hypothesis is not rejected: at the 5% level of significance, there is not enough evidence to conclude that the 3 pain relievers differ in the mean time they take to relieve headache pain.
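
The F-value in an example like this can be sketched in a few lines of Python. This is a minimal illustration, not the researcher's actual analysis: the relief times are made up, the function name `one_way_anova_f` is hypothetical, and the critical value is an assumption read from a standard F table for these degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical relief times in minutes (not data from the flashcard).
drug_a = [20, 25, 30]
drug_b = [25, 30, 35]
drug_c = [30, 35, 40]

f_value, df_b, df_w = one_way_anova_f([drug_a, drug_b, drug_c])
F_CRITICAL = 5.14  # assumed: F(2, 6) at alpha = .05 from a standard F table
print(f_value, df_b, df_w, "reject H0" if f_value > F_CRITICAL else "fail to reject H0")
```

With these made-up numbers the F-value stays below the critical value, which mirrors the card's conclusion that the null hypothesis is not rejected.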

2
Q

Construct validity

A

in research design, construct validity is the degree to which a test or study measures the construct (theoretical trait) that it claims to measure. There are two further aspects to construct validity: convergent and divergent validity. Convergent validity is how well a test agrees with other previously validated tests that measure the same construct. Divergent validity is the extent to which a test does not correlate with measures of theoretically unrelated constructs. In order to have high construct validity, a test should correlate highly with other measures of the same construct, and not correlate highly with measures of other constructs.

Ex: A client comes to therapy complaining of fatigue, loss of appetite, and feeling hopeless. The therapist uses the Beck Depression Inventory (BDI) to measure her current symptoms of depression because of the test’s high construct validity: the BDI is a psychological assessment that accurately measures depression (the construct).

3
Q

Content validity

A

in research design, content validity is the degree to which a test or study includes all of the facets of the construct it is attempting to measure; items should cover the entire range of relevant behaviors, thoughts, and feelings that define the construct being measured. An element of subjectivity exists in determining content validity, which requires a degree of agreement about what a particular construct (such as extraversion) represents. Content validity is related to face validity, but is not the same thing.

Ex: The newly developed depression scale lacked content validity because it assessed only the affective dimension of depression and failed to take into account the behavioral dimension.

4
Q

Correlational research

A

this is a form of research design that determines whether there is a relationship between two variables and, if so, what the strength of that relationship is. A correlation is a measure of a LINEAR relationship between two variables. Correlational studies yield a correlation coefficient (a number between -1.00 and 1.00) which represents the strength and direction of the relationship between the two variables.
- Correlational research cannot establish causation. It is often conducted as exploratory or preliminary research; once variables have been identified and defined, experiments can be conducted.

Ex: A psychologist is interested in testing the claim that people with more friends tend to be healthier. She surveys 500 people in her community, asking them how many friends they have and obtaining a measure of their overall health. She then makes a scatterplot and finds that r = +.3, concluding that there is a positive correlation between number of friends and health.
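
As a rough sketch of where a correlation coefficient comes from, the snippet below computes Pearson's r from paired scores. The data and the variable names `friends` and `health` are invented for illustration and do not come from the survey described above.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations, and each variable's sum of squares.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: number of friends and a health rating for five people.
friends = [1, 2, 3, 4, 5]
health = [2, 1, 4, 3, 5]

r = pearson_r(friends, health)
print(round(r, 2))
```

The sign of r gives the direction of the relationship and its absolute value gives the strength; a perfectly linear relationship would yield exactly 1 or -1.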

5
Q

Cross-sectional design

A

This type of study utilizes different groups of people who differ in the variable of interest, but share other characteristics such as socioeconomic status, educational background, and ethnicity.
Typically, several dependent variables are measured, and the study itself rarely takes more than a few months to complete. Cross-sectional designs are advantageous because they take less time and therefore less money, but are disadvantageous because they give little information about the stability of the dependent variables and the change in them over time. Cross-sectional studies are observational in nature and are known as descriptive research, not causal or relational.

Ex: Researchers studying developmental psychology might select groups of people who are remarkably similar in most areas, but differ only in age. By doing this, any differences between groups can presumably be attributed to age differences rather than to other variables.

6
Q

Dependent t-test

A

a statistical procedure that is appropriate for significance testing when the scores meet the requirements of a parametric test, the design involves matched groups or repeated measures, and there are only two conditions of the independent variable.

Ex: A dependent t-test is performed to evaluate the significance of the change in scores following an intervention to improve life satisfaction among grad students. Program “Happy” is implemented with our sample of 30 participants. Life satisfaction is measured by an index score at pretest and posttest phases. The life satisfaction index is operationally defined as a continuous and ratio measure ranging from 0 to 100, with lower scores indicating lower life satisfaction and higher scores indicating higher life satisfaction.

The intervention test phase is the independent variable (posttest versus pretest) and the life satisfaction index score is the dependent variable. Test phase is categorical and nominal with two subcategories. The life satisfaction score is continuous and ratio.

The dependent-samples t-test compares the average values of a characteristic measured on a continuous scale (life satisfaction) between two conditions of the same group (e.g., assessment pretest vs. posttest).
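
A minimal sketch of the calculation, assuming the usual paired-samples formula (mean of the difference scores divided by their estimated standard error). The pretest and posttest scores are invented, and `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, df) for a dependent-samples t-test on paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Estimated standard error of the mean difference.
    se_diff = stdev(diffs) / sqrt(n)
    return mean(diffs) / se_diff, n - 1

# Hypothetical life satisfaction index scores for five participants.
pretest = [50, 60, 55, 70, 65]
posttest = [54, 66, 57, 78, 65]

t_stat, df = paired_t(pretest, posttest)
print(round(t_stat, 3), df)
```

The obtained t would then be compared with the critical value for n - 1 degrees of freedom from a t table.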

7
Q

Descriptive vs. Inferential

A

descriptive statistics are those which are used to concisely describe a data set. Inferential statistics are those which use a smaller representative sample to draw conclusions about a larger group. Descriptive statistics can only be used to describe the sample that they are conducted on; inferential statistics can be used to make generalizations about a larger population from a small sample.

Ex: Frequency distributions, measures of central tendency (mean, median, and mode), and graphs like pie charts and bar charts that describe the data are all examples of descriptive statistics. For example: 15% of students at Made-up High have been a victim of bullying.

Examples of inferential statistics include linear regression analyses, ANOVA, and correlation analyses. For example: based on a sample, an estimated 15% of children in high school have been a victim of bullying.

8
Q

Double-blind study

A

a type of experimental design in which both the participants and the researchers are unaware of who is in the experimental condition and who is in the placebo condition. This is in contrast to a single-blind study, where only the participants are unaware. Double-blind studies guard against the possibility that the researcher may somehow communicate (knowingly or unknowingly) to a participant which condition they are in, thereby contaminating the results.

Ex: A study to test the efficacy of a new SSRI targeted to alleviate anxiety symptoms in returning veterans suffering from PTSD used a double-blind design in order to increase internal validity and reduce experimenter bias. Neither the experimenter nor the participants were aware of who was in the treatment group and who was receiving a placebo until the results were calculated. This setup ensured that the experimenter could not make subtle gestures signaling who was receiving the drug and who was not, and that experimenter expectations could not affect the study’s outcome. With this double-blind design the drug proved to be extremely efficacious in treating anxiety in PTSD.

9
Q

Ecological validity

A

in the context of research, Ecological validity is present to the degree that a result generalizes across different settings. It is the extent to which an experimental situation approximates the real-life situation which is being studied.
Reactivity, a threat to ecological validity, is defined as an alteration in performance that occurs as a result of being aware of participating in a study.
Reactivity is a problem of ecological validity because the results might only generalize to other people who are also being observed.
Researchers have called for making experiments more ecologically valid in hopes that they would generalize better to the real world.

Ex: Research carried out at the local university looked at the effects of getting 4 hours of sleep or less on cognitive performance. The subjects were primarily adults over the age of 25, and the experiment was monitored in a laboratory setting. The study has low ecological validity when applied to the population as a whole, since university participants are often accustomed to functioning on low amounts of sleep and then going to class and performing cognitive tasks. The setting also had some artificial features lacking in the real world (e.g., the research participant was aware of the goals of the study).

10
Q

Experimental research

A

a form of research in which one variable (the independent variable) is manipulated in order to see what effect it will have on another variable (the dependent variable). Researchers will try to control any other variables (confounds) that may affect the dependent variable, in order to establish that if a change occurred it was caused by the independent variable. Experimental research is the only kind of research which can establish causation.

Ex: Using experimental research, the psychologist randomly assigned the participants who fit the criteria for depressive symptoms into an experimental and a control group. He then administered an anti-depressant drug (the IV) daily to the experimental group and a placebo to the control group to determine what effects the drug would have on the pts.’ depressive symptoms. When compared to the control group, the symptoms (DV) of the experimental group improved significantly.

11
Q

Independent t-test

A

a statistical procedure used for significance testing that is appropriate when the scores meet the requirements of a parametric test, the design involves independent samples, and there are only two conditions of the independent variable. Independent t-test is important in determining significance.

Ex: In examining the life satisfaction of grad students, we are interested in the relationship between gender and life satisfaction levels. We use a sample of 30 participants and examine whether life satisfaction (measured by an index score) varies between male and female grad students.
The life satisfaction index is operationally defined as a continuous and ratio measure ranging from 0 to 100 with lower scores indicating lower life satisfaction and higher scores indicating higher life satisfaction.

Gender is the independent variable and the life satisfaction index score is the dependent variable. Gender is categorical and nominal with two subcategories (male and female). The life satisfaction index score is continuous and ratio.

The independent-samples t-test compares the average values of a characteristic measured on a continuous scale (satisfaction) between two subgroups of a categorical variable (gender).
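
A minimal sketch of that comparison, assuming the pooled-variance form of the independent-samples t-test (which presumes roughly equal variances in the two groups). The scores and the helper name `independent_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t, df) using the pooled-variance independent-samples t-test."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se_diff, n1 + n2 - 2

# Hypothetical life satisfaction index scores for two independent groups.
group_a = [70, 75, 80, 85, 90]
group_b = [60, 65, 70, 75, 80]

t_stat, df = independent_t(group_a, group_b)
print(round(t_stat, 3), df)
```

Unlike the dependent t-test, no scores are paired here; each participant contributes one score to one group only.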

12
Q

Internal consistency

A

In statistics and research, Internal consistency is a measure of reliability used to evaluate the degree to which different test items that probe the same construct produce similar results.

There are three methods used to measure internal consistency reliability:

Cronbach’s alpha: The most commonly used measurement of internal consistency.

Split-halves test: Involves splitting the test items in half (i.e., forming a group of all even items and another group with all of the odd items) and correlating the two halves.

Kuder-Richardson test: Similar to the split-halves test. You find the average correlation for all of the possible split-half combinations.

Internal consistency ranges between zero and one.
No matter which method you use, the closer your measurement is to 1, the higher your internal consistency is.

Ex: A patient comes in with symptoms of PTSD after surviving a car accident. You decide to search for a psychological test that is designed to help you detect and diagnose PTSD. You come across the Posttraumatic Stress Diagnostic Scale (PDS). The test manual indicates that the PDS is a valid measure of PTSD and reports a Cronbach’s alpha of 0.91. This indicates that the PDS has strong internal consistency.
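
For readers curious how a value like this is obtained, here is a small sketch of Cronbach's alpha using the standard formula: k/(k - 1) times (1 minus the sum of the item variances divided by the variance of the total scores). The item responses are invented and have nothing to do with the PDS; `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per test item, each ordered by respondent."""
    k = len(items)
    # Total score for each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of four people to three items probing one construct.
item_1 = [1, 2, 3, 4]
item_2 = [2, 2, 4, 4]
item_3 = [1, 3, 3, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 2))
```

When the items rise and fall together across respondents, the total-score variance is large relative to the item variances and alpha approaches 1.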

13
Q

Internal validity

A

Internal validity can be defined as the degree to which the independent variable causes the changes seen in the dependent variable being examined within the study. The internal validity of a study is related to the researcher’s control of extraneous variables. Therefore, an experiment conducted in a laboratory with high control can eliminate extraneous variables more easily and establish internal validity. Also, construct validity must be established before internal validity can be attained.

Ex: The drug company used tight controls over which participants were allowed in the study to test a new drug for depression. They did not allow anyone with a comorbidity to participate. This increased the internal validity of the research, as there were no extraneous variables from other disorders to interfere with the treatment, and the study showed high statistical and clinical significance for the efficacy of the drug. It did, however, jeopardize the external validity of the research.

14
Q

Interrater reliability

A

in research design, this is a type of reliability that measures the agreement level between independent raters. It is used with measures that are less objective and more subjective. This type of reliability is used to account for human error in the form of distractibility or misinterpretation.

Ex: Three grad students are performing a naturalistic observation study to examine violent video games and the behavior of a group of 9-year-old boys. The students rated the behavior on a scale of 1 (not aggressive) to 5 (very aggressive). However, the ratings were not consistent between the students, so the study lacked inter-rater reliability. It was decided that the raters needed more training to properly define the construct of aggressive behavior in order to increase inter-rater reliability.

15
Q

Measures of central tendency

A

in statistics, a measure of central tendency is a single value that describes the way in which a group of data cluster around a central value. They help to summarize the main features of a data set and identify the score around which most scores fall.

The mean, median, and mode are the three measures of central tendency. The mean and median can only be used for numerical data; the mode can be used with both numerical and nominal data. The mean is the arithmetic average of all scores within a data set; the mode is the most frequently occurring score; the median is the point that separates the distribution into two equal halves. The median and mode are not as affected by outliers as the mean.

Central tendency is useful: it lets us know what is normal or ‘average’ for a set of data, and it allows you to compare one data set to another. It is also useful when you want to compare one piece of data to the entire data set.

Ex: According to the bell curve that represents IQ, the mean is 100 with a standard deviation of 15. Many psychologists use this measure of central tendency when evaluating the mental stability and capacity of children and mentally ill pts. A child was given the Stanford-Binet IQ measure and scored one SD above the mean of 100, for a score of 115. The measure of intelligence was inconsistent with his failing grades, and he was referred for counseling.
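
The three measures can be compared directly with Python's standard `statistics` module. The scores below are hypothetical and include one extreme value, to illustrate the point that the median and mode are less affected by outliers than the mean.

```python
from statistics import mean, median, mode

# Hypothetical IQ scores; 160 is a deliberate outlier.
scores = [85, 100, 100, 115, 160]

avg = mean(scores)        # arithmetic average, pulled upward by the outlier
middle = median(scores)   # point that splits the ordered scores in half
most_common = mode(scores)  # most frequently occurring score

print(avg, middle, most_common)
```

Here the single high score raises the mean above the median and mode, which both stay at the value where most scores cluster.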

16
Q

Measures of variability

A

in statistics, these are measures of how scores in a distribution deviate or vary around the central tendency; variability is a measure of the spread of a data set. There are three primary measures: range, variance, and standard deviation. The range is obtained by subtracting the lowest score from the highest. The variance is the average squared deviation around the mean; the deviations are squared because the raw deviations around the mean always sum to zero. The standard deviation is the square root of the variance, which returns the measure to the original units of the scores and makes it highly useful in describing variability.

Because measures of variability are a form of descriptive statistics, we can only use them to describe the data in our study. They cannot be used to draw conclusions or make inferences that go beyond our data set.

Ex: According to the bell curve that represents IQ, the mean is 100 with a standard deviation of 15. Many psychologists use standard deviation as a measure of variability when evaluating the mental stability and capacity of children and mentally ill pts. A child was given the Stanford-Binet IQ measure and scored one SD above the mean of 100, for a score of 115. The measure of intelligence was inconsistent with his failing grades, and he was referred for counseling.
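
A short sketch of the three measures, using made-up scores. It follows the card's definition of variance as the average squared deviation, which corresponds to `pvariance` and `pstdev` in Python's `statistics` module; the sample versions (`variance`, `stdev`) divide by n - 1 instead of n.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical scores.
scores = [70, 85, 100, 115, 130]

score_range = max(scores) - min(scores)
# Raw deviations from the mean always sum to zero, hence the squaring.
deviations = [x - mean(scores) for x in scores]
var = pvariance(scores)  # average squared deviation around the mean
sd = pstdev(scores)      # square root of the variance, in original units

print(score_range, sum(deviations), var, round(sd, 2))
```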

17
Q

Nominal/Ordinal/Interval/Ratio measurements

A

these are four types of scales of measurements seen in statistics. Nominal scales are used for categorical groupings, and have none of the three properties that distinguish scales. Ordinal scales are used for rankings of individuals or variables, and have the property of magnitude. Interval scales have magnitude and have equal intervals between any two observations, but do not have the property of absolute zero. Ratio scales have all three properties of scales: magnitude, equal intervals, and absolute zero. The type of scales used dictates what statistical procedures may be utilized on a data set.

Ex: The grad student used a Likert scale, a scale used to measure levels of agreement/disagreement, to collect data for her thesis. However, this type of ordinal measure does not guarantee equal intervals between 1 and 2 or between 3 and 4 on the scale. Thus, the answers were somewhat subjective to the participant answering the questionnaire but were still valid for the type of research being done and the data needed. She calculated her results using a nonparametric statistical analysis.

18
Q

Probability

A

a mathematical statement indicating the likelihood that a particular event will occur when a particular population is randomly sampled, symbolized by p. It describes the operation of chance. A probability is a number or fraction between 0 and 1 (it can be converted to a percentage). A probability of 1 means something will always happen, and a probability of 0 means something will never happen.
More frequent events have higher probability
Lower frequency events have lower probability

Ex: A female pt presented with severe anxiety. She was obsessed with the belief that she was going to be a victim of a violent crime. The therapist attempted to counter this irrational thought with the statistic that a person has only a 5% probability of being a victim of a violent crime; thus there is a 95% probability that she would never be a victim of a violent crime.

19
Q

Parametric vs. nonparametric statistical analyses

A

Parametric and nonparametric are two broad classifications of statistical procedures.

Parametric statistical procedures rely on assumptions about the shape of the distribution (i.e., assume a normal distribution) in the underlying population and about the form or parameters (i.e., means and standard deviations) of the assumed distribution.

Nonparametric tests do not rely on assumptions about the shape or parameters of the underlying population distribution.

If your measurement scale is nominal or ordinal, then you use nonparametric statistics.
If you are using interval or ratio scales, you use parametric statistics.

If the data deviate strongly from the assumptions of a parametric procedure, using the parametric procedure could lead to incorrect conclusions.

Nonparametric procedures generally have less power for the same sample size than the corresponding parametric procedure if the data truly are normal.

Ex: The grad student used a Likert scale, a scale used to measure levels of agreement/disagreement, to collect data for her thesis. However, this type of ordinal measure does not guarantee equal intervals between 1 and 2 or between 3 and 4 on the scale. Thus, the answers were somewhat subjective to the participant answering the questionnaire but were still valid for the type of research being done and the data needed. She calculated her results using a nonparametric statistical analysis.

20
Q

Random sampling

A

a method of selecting participants from the population for a given study in which all members of the population being studied have an equal chance of being chosen or sampled. The goal is to obtain a sample that is representative of the larger population. It is important because it is used to reduce the potential for biases in experiments.

Ex: Using experimental research, the psychologist randomly assigned the participants who fit the criteria for depressive symptoms into an experimental and a control group. He then administered an anti-depressant drug (the IV) daily to the experimental group and a placebo to the control group to determine what effects the drug would have on the pts.’ depressive symptoms. When compared to the control group, the symptoms (DV) of the experimental group improved significantly. Strictly, dividing participants into groups is random assignment; random sampling concerns how those participants were drawn from the population of people with depressive symptoms in the first place.

21
Q

Regression

A

this is a statistical technique in which one variable is used to predict or estimate the score of another variable.
The two basic types of regression are linear regression and multiple regression. Linear regression uses one independent variable to explain and/or predict the outcome of Y, while multiple regression uses two or more independent variables to predict the outcome.
The goal is to find a regression line that maximizes prediction accuracy and minimizes error.

Ex: Psychologists performed a study on aggressive behavior and hormone levels. They performed a regression analysis on the data. Their results showed that the severity and frequency of the boys’ aggression could be predicted based on the levels of the hormones testosterone, DHEA, and cortisol.
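
The idea of a regression line that minimizes error can be sketched with ordinary least squares for a single predictor. This is a simplified illustration: the example above uses three hormone predictors (multiple regression), while the sketch uses one, and the data and the names `fit_line`, `predictor`, and `outcome` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical scores on one predictor and one outcome.
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

slope, intercept = fit_line(predictor, outcome)
# Use the fitted line to predict the outcome for a new predictor score.
predicted = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```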

22
Q

Sample vs. Population

A

The major use of inferential statistics is to use information from a sample to infer something about a population.

The population is the large group of all scores that would be obtained if the behavior of every individual of interest in a particular situation could be measured. A sample is a relatively small subset of the population that is selected to represent the population in inferential statistics, as it would be impossible to study the entire population. It is important that the sample is representative of the population being studied.

Ex: in order to test the efficacy of the anti-depressant drug the researchers chose a sample of participants from the population that would be representative of the population. It is too expensive and time consuming to test the entire population that qualifies for anti-depressant drugs, so the researchers find that using a subset is more feasible and should produce the needed results. In an effort to increase external validity of the research they used a sample of 1,000 people who met the criteria of MDD and randomly assigned them either to a control or experimental group.

23
Q

Scientific methodology

A
this is an empirically based way of conducting research and studying human behavior. It is the process of systematically gathering and evaluating information through careful observations to gain an understanding of a phenomenon. First, researchers conceptualize a process or problem to be studied.
Then:
Form a hypothesis
Define operational definitions
Collect data
Analyze results
Interpret results

Ex: A psychologist hypothesizes that a new experimental drug will decrease symptoms of depression. During experimental research, the psychologist randomly assigned the participants who fit the criteria for depressive symptoms into an experimental and a control group. He then administered an anti-depressant drug (the IV) daily to the experimental group and a placebo to the control group to determine what effects the drug would have on the pts.’ depressive symptoms. When compared to the control group, the symptoms (DV) of the experimental group improved significantly.

24
Q

Standard error of estimate

A

in the context of statistics, this is a standard deviation in a regression which indicates the amount that the actual Y scores differ from the predicted Y scores. Standard error of the estimate is also known as standard error of the residuals. It is a measure of the accuracy of an estimate. The larger its value, the less well the regression model fits the data, and the worse the prediction.

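Ex: The grad students collected data on the SES of people to determine what effect low vs. high SES had on diagnoses of MDD in the 2 years before and the 2 years after the economic decline. Their regression showed that 63% of the variation in diagnoses could be explained by the variation in SES and 37% of the variation was unexplained (this proportion is the coefficient of determination, r²). The standard error of estimate expresses the unexplained variation in the units of the outcome: the typical distance between the actual values and the values the regression predicts.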
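
A minimal sketch of the calculation, assuming a single-predictor regression and the common formula that divides the sum of squared residuals by n - 2 before taking the square root. The data are invented and `standard_error_of_estimate` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean

def standard_error_of_estimate(x, y):
    """Standard deviation of actual y scores around the regression line."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    # Residual: actual Y minus predicted Y for each case.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_residual = sum(e ** 2 for e in residuals)
    return sqrt(ss_residual / (len(x) - 2))

# Hypothetical predictor and outcome scores.
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(predictor, outcome)
print(round(see, 3))
```

A value of zero would mean every actual score falls exactly on the regression line; larger values mean worse prediction.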

25
Q

Standard error of the difference (2 sample t-test)

A

this is the estimated standard deviation of the differences between the means of independent samples in a two-sample experiment. In other words, this is the error between groups.

Ex: The grad students were examining the differences in verbal memory in children with musical training vs. no musical training. A verbal memory test with a possible score was administered to 90 children: 45 with musical training and 45 with no musical training. The 45 with training had an avg. score of 85 and an SD of 5.7. The children in the control group had an avg. score of 79 with an SD of 6.5. A 2 sample t-test was used to test the difference between the two population means, with the standard error of the difference as the denominator of the t statistic.
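
Using the summary figures from this example, the standard error of the difference can be sketched as below. The formula shown is the unpooled form (the square root of the sum of each variance divided by its n), which is an assumption about the intended calculation; with equal group sizes, as here, it gives the same value as the pooled form. The helper name `se_difference` is hypothetical.

```python
from math import sqrt

def se_difference(sd1, n1, sd2, n2):
    """Estimated standard error of the difference between two sample means."""
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Summary statistics from the musical-training example.
se_diff = se_difference(5.7, 45, 6.5, 45)
# The t statistic is the difference between means divided by that standard error.
t_stat = (85 - 79) / se_diff

print(round(se_diff, 2), round(t_stat, 2))
```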

26
Q

Standard error of the mean (single sample z-test)

A

this is the standard deviation of the sampling distribution of the mean. This is a type of standard deviation used when the population mean and standard deviation are known. It is used to estimate how much the sample mean deviates from the population mean.

Ex: In an advertisement, a drug company claims that its new pain reliever has a mean relief time of less than 30 minutes when used to relieve migraine headaches. A random sample of 36 migraine sufferers who were given the new medication has a sample mean of 28.5 minutes and a standard deviation of 3.5 minutes, so the standard error of the mean is 3.5/√36 ≈ 0.58 minutes. The claim is that the mean pain relief time is less than 30 minutes, so the null hypothesis is µ ≥ 30 minutes and the alternative hypothesis is µ < 30 minutes. Because n ≥ 30, a single sample z-test is used. z = -2.57 and the corresponding area is 0.0051. The p-value is less than α = 0.01, so the null hypothesis should be rejected: “at the 1% level of significance, you have sufficient evidence to conclude that the mean pain relief time is less than 30 minutes.”
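
The arithmetic in this example can be checked with a short sketch. It follows the card in treating the sample standard deviation of 3.5 as the standard deviation for the z-test, and uses `NormalDist` from Python's standard library for the area under the normal curve; the helper name `standard_error_of_mean` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def standard_error_of_mean(sd, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sd / sqrt(n)

sem = standard_error_of_mean(3.5, 36)
# z: how many standard errors the sample mean lies from the claimed mean.
z = (28.5 - 30) / sem
# Left-tail area, because the alternative hypothesis is mu < 30.
p_value = NormalDist().cdf(z)

print(round(sem, 2), round(z, 2), round(p_value, 4))
```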

27
Q

Standard error of the mean, estimated (single sample t-test)

A

is used to see if your sample is like or unlike the population when the population mean is known but the population standard deviation is not known; the sample standard deviation is used to estimate the standard error of the mean.
Can be used when the population is normal or nearly normal, the population SD is not known, and n < 30. Ex: In a test of the claim that mean pain relief time is less than 30 minutes, n < 30, so a single sample t-test is used. t = -2.57; if the corresponding p-value (found from the t distribution with n - 1 degrees of freedom) is less than α = 0.01, the null hypothesis should be rejected and the test reveals that “at the 1% level of significance, you have sufficient evidence to conclude that the mean pain relief time is less than 30 minutes.”
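
A minimal sketch of the single sample t statistic, assuming the usual formula in which the estimated standard error of the mean is the sample standard deviation divided by the square root of n. The nine relief times are invented, `one_sample_t` is a hypothetical helper name, and the critical value is an assumption read from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """Return (t, df) for a single sample t-test against population mean mu."""
    n = len(sample)
    # Estimated standard error of the mean, from the sample SD.
    est_sem = stdev(sample) / sqrt(n)
    return (mean(sample) - mu) / est_sem, n - 1

# Hypothetical relief times in minutes for a small sample (n < 30).
relief_minutes = [27, 29, 28, 30, 26, 28, 29, 27, 28]

t_stat, df = one_sample_t(relief_minutes, 30)
T_CRITICAL = -2.896  # assumed: one-tailed t(8) at alpha = .01 from a standard t table
print(round(t_stat, 3), df, "reject H0" if t_stat < T_CRITICAL else "fail to reject H0")
```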

28
Q

Statistical significance

A

in the context of statistics and research, results are statistically significant when they are unlikely to have occurred simply by chance. Testing for statistical significance compares a test statistic against its sampling distribution (for example, a t distribution), using a criterion of significance (e.g., .05 or .01) selected in advance. When a test is statistically significant, the null hypothesis is rejected.

Stated as a p-value: the conventional criterion for statistical significance is p < .05, which means that, if there were truly no relation between the variables, a result at least as extreme as the one found in our sample would occur less than 5% of the time.
Ex: A new drug to help reduce anxiety and depression is scheduled to be tested. The researchers set the significance level at .05 in order to consider the drug efficacious in treating the disorders. An experimental study using 2 randomly assigned groups is used. Group A takes the medicine and Group B takes a placebo. Group A shows improved relief from anxiety and depression. The study reveals the p-value is .03. This result is statistically significant because it is less than the set value of .05; it implies that, if the drug had no effect, a difference this large would occur only about 3% of the time by chance. Because the result is statistically significant, the researchers conclude that the drug reduced anxiety and depression in the participants.

29
Q

Type I and Type II error

A

these are two types of errors seen in research. A Type I error occurs when researchers incorrectly conclude that the independent variable(s) has had an effect on the dependent variable(s) (rejecting a null hypothesis that is actually true). A Type II error occurs when the researchers incorrectly conclude that the independent variable(s) has not had an effect on the dependent variable(s) (failing to reject a null hypothesis that is actually false).

Ex: In the clinical research to determine the efficacy of the new anti-depressant drug, the null hypothesis was that the drug would have no effect on the participants, and the alternative hypothesis was that the drug would have an effect on the participants’ depressive symptoms. The results indicated that the drug had a positive effect on the participants’ depressive symptoms. However, upon further research it was discovered that the pts had also engaged in an exercise program that contributed to the improved depressive symptoms. Thus, the researchers had made a Type I error by claiming that the drug improved the depressive symptoms when in fact it did not.