Quiz 1 Flashcards
Two branches of statistical methods
Descriptive statistics & Inferential Statistics
Descriptive statistics
Procedures psychologists use to summarize and describe a group of numbers from a research study.
Inferential statistics
Procedures psychologists use to draw conclusions and make inferences that are based on the numbers from a research study but that go beyond those numbers.
- Allow a researcher to make inferences about a large group based on a smaller, representative group
Variable
Condition or characteristic that can have different values
Value
Number or category
Score
A particular person’s value on a variable
Stress level, gender, and religion are examples of
Variable
0, 1, 2, 3, 4; 25, 28; female; Catholic are examples of
Value or score
Numeric variable/ Quantitative variable
A variable whose values are numbers (as opposed to a nominal variable).
- Continuous variable
Equal-interval variable
A variable in which the numbers stand for approximately equal amounts of what is being measured. An equal-interval variable is measured on a ratio scale if it has an absolute zero point.
- Numeric variable
- Continuous variable
Rank-order variable/ ordinal variable
A variable in which the numbers stand only for relative ranking.
- Numeric variable
- Discrete variable
Nominal variable / Categorical variable
A variable in which the values are names or categories.
- Discrete variables
Grade point average (GPA), stress level on a rating scale, and age are examples of
- Numeric variable / Quantitative variable
- Continuous variable
- Equal-interval variable
Time, weight, and distance are examples of (ratio-scale case: the number of siblings a person has, because a zero value means having no siblings)
- Equal-interval variable
- Continuous variable
A student’s class standing and the position finished in a race are examples of
- Discrete variables
- Rank-order variable / ordinal variable
Gender and psychiatric diagnosis are examples of
- Nominal variable / Categorical variable
- Discrete variables
Discrete variables
Represent counts (e.g., the number of objects in a collection).
Continuous variables
Represent measurable amounts (e.g., water volume or weight).
Levels of measurement
Nominal
Ordinal
Interval
Ratio
Nominal
The data can only be categorized
Ordinal
The data can be categorized and ranked
Interval
The data can be categorized, ranked, and evenly spaced
Ratio
The data can be categorized, ranked, and evenly spaced, and have a natural zero.
City of birth, Gender, Ethnicity, Car brands, and Marital status are examples of
Nominal data
Top 5 Olympic medallists, Language ability (e.g., beginner, intermediate, fluent), Likert-type questions (e.g., very dissatisfied to very satisfied)
are examples of
Ordinal data
Test scores (e.g., IQ or exams), Personality inventories, Temperature in Fahrenheit or Celsius
are examples of
Interval data
Height, Age, Weight, and Temperature in Kelvin are examples of
Ratio data
Mean
Sum of the scores divided by the number of scores
- most common
Mode
Value with the greatest frequency in a distribution
Median
Middle score when all the scores in a distribution are arranged from lowest to highest
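The three measures of central tendency defined above can be illustrated with a short sketch. The scores are made up for illustration, and the variable names are assumptions; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical scores; 18 is an outlier that pulls the mean upward.
scores = [2, 3, 3, 4, 5, 7, 18]

mean_score = mean(scores)      # sum of the scores / number of scores
median_score = median(scores)  # middle score once the scores are sorted
mode_score = mode(scores)      # value with the greatest frequency

print(mean_score, median_score, mode_score)
```

With an outlier in the data, the median (4) stays closer to the bulk of the scores than the mean (6).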
When is mean used
- With equal-interval variables
- Very commonly used in psychology research
When is mode used
- With nominal variables
- Rarely used in psychology research
When is median used
- With rank-ordered variables
- When a distribution has one or more outliers
- Rarely used in psychology research
mode = mean
a perfectly symmetrical unimodal distribution
Mode not equal mean
mode is not a good way of describing the central tendency of scores
Variance
A measure of how spread out a set of scores is
- the average of the squared deviations from the mean.
standard deviation
the square root of the average of the squared deviations from the mean
- the most common descriptive statistic for variation
- approximately the average amount that scores in a distribution vary from the mean.
Researchers’ definition of variance
the sum of squared deviation scores divided by 1 less than the number of scores
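The two definitions of variance above (dividing by the number of scores versus dividing by 1 less than the number of scores) can be compared in a small sketch. The scores are hypothetical and the variable names are assumptions for illustration.

```python
from math import sqrt

# Hypothetical scores for illustration.
scores = [1, 2, 3, 4, 5]
n = len(scores)
m = sum(scores) / n

# Squared deviation of each score from the mean.
squared_deviations = [(x - m) ** 2 for x in scores]

# Descriptive variance: average of the squared deviations.
variance_n = sum(squared_deviations) / n

# Standard deviation: square root of that variance.
sd_n = sqrt(variance_n)

# Researchers' definition: sum of squared deviations divided by N - 1.
variance_n_minus_1 = sum(squared_deviations) / (n - 1)

print(variance_n, sd_n, variance_n_minus_1)
```

Dividing by N - 1 always gives a slightly larger value than dividing by N, and the difference shrinks as the number of scores grows.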
If the actual score is above the mean
+ z-score (positive)
If the actual score is below the mean
- z-score (negative)
Z score
Number of standard deviations that a score is above (or below, if it is negative) the mean of its distribution
- it is thus an ordinary score transformed to better describe the score’s location in a distribution.
Raw score
Ordinary score (or any number in a distribution before it has been made into a Z score or otherwise transformed).
Mean of any distribution of Z scores =
0
The standard deviation of any distribution of Z scores =
1
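A brief sketch of turning raw scores into Z scores, assuming the usual formula Z = (raw score - mean) / standard deviation. The raw scores are invented, and the standard deviation here divides by N (`pstdev`), which is an assumption about which definition is intended.

```python
from statistics import mean, pstdev

# Hypothetical raw scores with mean 5 and standard deviation 2.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(raw_scores)
sd = pstdev(raw_scores)  # standard deviation dividing by N

# Z score: number of standard deviations a score is above or below the mean.
z_scores = [(x - m) / sd for x in raw_scores]

z_mean = mean(z_scores)  # mean of the Z scores
z_sd = pstdev(z_scores)  # standard deviation of the Z scores

print(z_scores, z_mean, z_sd)
```

Scores below the mean come out negative, scores above the mean come out positive, and the Z scores as a group have a mean of 0 and a standard deviation of 1.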
normal distribution
Frequency distribution that follows a normal curve.
normal curve
Specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal
- distributions observed in nature and in research commonly approximate it.
Percentage of scores between the mean and different standard deviations
- About 34% of scores are between the mean & 1 standard deviation from the mean (on each side)
- About 14% of scores are between 1 & 2 standard deviations above the mean
- 50% of scores are below the mean
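The percentages above can be checked against the normal curve itself. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are assumptions.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

# Proportion of scores between the mean and 1 SD above it (about .34).
mean_to_1sd = curve.cdf(1) - curve.cdf(0)

# Proportion of scores between 1 and 2 SDs above the mean (about .14).
one_to_2sd = curve.cdf(2) - curve.cdf(1)

# Proportion of scores below the mean (exactly half).
below_mean = curve.cdf(0)

print(round(mean_to_1sd, 4), round(one_to_2sd, 4), below_mean)
```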
Sample
the part of the population about which you actually have information
- In psychology research, it is common to take samples to make inferences about the population
Population
The entire set of things of interest.
Random selection
Method for selecting a sample that uses truly random procedures (usually meaning that each person in the population has an equal chance of being selected)
- Step 1: Researcher starts with complete list of population
- Step 2: Randomly selects some of them to study
Haphazard sampling
Taking whoever is available or happens to be first on a list
Why Psychologists Study Samples Instead of Populations
- Not practical
- The point of a study is to make generalizations, not to figure out answers for one specific population (most common reason)
- Researchers study people who don’t differ greatly from the general population
Methods of sampling
Haphazard selection - Taking whoever is available or happens to be first on a list
Random selection - Most ideal method of picking a sample to study
Population parameter
actual value of the mean, standard deviation, and so on, for the population; usually population parameters are not known, though often they are estimated based on information in samples.
Sample statistics
descriptive statistics, such as the mean or standard deviation, figured from the scores in a group of people studied.
When the population mean is unknown =
the best estimate is the sample mean
When the population standard deviation, σ, is unknown
the sample standard deviation is used to estimate σ in the confidence interval formula.
Confidence intervals
roughly speaking, the range of scores (that is, the scores between an upper and lower value) that is likely to include the true population mean
- more precisely, the range of possible population means from which it is not highly unlikely that you could have obtained your sample mean.
Sample means vary a lot =
can’t be confident the estimate is close to the true population mean.
Sample means close to the population mean =
the estimate is pretty close
Normally want to be more than ___ confident about estimates
68%
Psychologists use _____ confidence intervals
95% or 99%
(greater confidence = broader confidence interval)
95% Confidence level =
confidence interval in which, roughly speaking, there is a 95% chance that the population mean falls within this interval.
- want the area of the normal curve on each side between the mean & the Z score to include 47.5%
- Z score = -1.96 to 1.96
99% Confidence level =
confidence interval in which, roughly speaking, there is a 99% chance that the population mean falls within this interval.
use Z scores for middle 99% of normal curve (49.5% above & below mean)
Confidence limits
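upper and lower values of a confidence interval; for a 95% interval, the points beyond which a more extreme true population mean would not produce a sample mean like the one obtained 95% of the time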
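The confidence interval cards above can be tied together in a rough sketch. It assumes the common large-sample form, sample mean plus or minus Z times the standard error, where the standard error is the sample standard deviation divided by the square root of N; that formula, the function name `confidence_interval`, and the sample values are assumptions for illustration, not taken from the cards.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, level=0.95):
    """Return (lower limit, upper limit) around the sample mean.

    Assumes: interval = mean +/- Z * (s / sqrt(N)), with s the sample
    standard deviation (N - 1 in the denominator) standing in for sigma.
    """
    m = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    # Z score that cuts off the middle `level` of the normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (m - z * standard_error, m + z * standard_error)

# Hypothetical sample of scores.
sample = [4, 5, 6, 5, 7, 6, 5, 4, 6, 7]

z_95 = NormalDist().inv_cdf(0.975)  # about 1.96
z_99 = NormalDist().inv_cdf(0.995)  # about 2.58

ci_95 = confidence_interval(sample, 0.95)
ci_99 = confidence_interval(sample, 0.99)

print(ci_95, ci_99)
```

The 99% interval comes out wider than the 95% interval, which matches the note that greater confidence means a broader confidence interval.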
correlation
association between scores on two variables.
Linear & Curvilinear Correlations
Linear = straight line
Curvilinear = not straight line
No Correlation
No relationship between variables
Positive correlation
relation between two variables in which high scores on one go with high scores on the other, mediums with mediums, and lows with lows
- on a scatter diagram, the dots roughly follow a straight line sloping up and to the right.
Negative correlation
relation between two variables in which high scores on one go with low scores on the other, mediums with mediums, and lows with highs
- on a scatter diagram, the dots roughly follow a straight line sloping down and to the right.
Strength of the Correlation
how much there is a clear pattern of some particular relationship between two variables.
“large” (or “strong”) linear correlation =
if the dots fall close to a straight line (r of about .50)
“small” (or “weak”) correlation =
if you can barely tell there is a correlation at all; the dots fall far from a straight line (r of about .10)
“moderate” (also called a “medium” correlation) =
if the pattern of dots is somewhere between a small and a large correlation (r of about .30)
How to measure what is a high score and what is a low score =
comparing scores on different variables in a consistent way (using Z score)
(High Z score) x (High Z score) =
+ cross-product
Why? Scores above the mean are + Z scores; positive times positive equals positive.
(Low Z score) x (Low Z score) =
+ cross-product
Why? Scores below the mean are - Z scores; negative times negative equals positive.
(+ or -) of correlation coefficient =
direction of the linear correlation between 2 variables
Absolute value of the correlation coefficient (between 0 and 1) =
tell you strength of linear correlation
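The cross-product cards above suggest one way to figure a correlation coefficient: change both variables to Z scores, multiply each pair, and average the cross-products. The sketch below assumes that formula (r = sum of ZxZy divided by N, with standard deviations that divide by N); the function names and data are hypothetical.

```python
from statistics import mean, pstdev

def to_z_scores(values):
    """Change raw scores to Z scores using the mean and SD (dividing by N)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def correlation(x, y):
    """Average of the cross-products of Z scores (assumed formula for r)."""
    cross_products = [zx * zy for zx, zy in zip(to_z_scores(x), to_z_scores(y))]
    return mean(cross_products)

# Hypothetical paired scores.
x = [1, 2, 3, 4]
y_up = [2, 4, 6, 8]    # highs go with highs: positive correlation
y_down = [8, 6, 4, 2]  # highs go with lows: negative correlation

r_positive = correlation(x, y_up)
r_negative = correlation(x, y_down)

print(r_positive, r_negative)
```

The sign of the result gives the direction of the linear correlation, and its absolute value (between 0 and 1) gives the strength.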
Three Possible Directions of Causality
1) X could be causing Y.
2) Y could be causing X.
3) Some third factor could be causing both X and Y.
Ruling Out Some Possible Directions of Causality
1) The future can’t cause the past
2) Strongest way to rule out possibilities = perform true experiment
restriction in range
situation in which you figure a correlation but only a limited range of the possible values on one of the variables is included in the group studied.
Attenuation
reduction in a correlation due to unreliability of measures
The direction and strength of a correlation can be drastically distorted by
one or more individuals’ scores on the two variables, if each pair of scores is a very unusual combination (outliers)
small correlations
- can have practical importance
- impressive in demonstrating the importance of a relationship - if a study shows that the correlation holds even under what would seem to be unlikely conditions.
Correlational results are usually presented in research articles either in
1) The text, with the value of r (and usually the significance level)
2) A special table (a correlation matrix) that shows the correlations among several variables.
a correlation matrix
A unique table that shows the correlations among several variables.
Predictor (X)
in prediction, variable that is used to predict scores of individuals on another variable.
criterion (Y)
in prediction, the variable that is predicted (usually Y).
Predictor (X) vs. criterion (Y) variables
With prediction, we have to decide which variable is being predicted from (Predictor X) & which variable is being predicted (Criterion Y)
Linear prediction rule
The formula for making predictions; that is, a formula for predicting a person’s score on a criterion variable based on the person’s score on one or more predictor variables.
Prediction in research articles
The prediction rule favored by research psychologists has the form:
“to predict a person’s score on Y, start with some baseline number, which we will call a, then add to it the result of multiplying a special predictor value, which we will call b, by the person’s score on X”
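The rule quoted above amounts to: predicted Y = a + (b)(X). A minimal sketch follows; the function name `predict` and the values of a and b are hypothetical, chosen only to show the arithmetic.

```python
def predict(x, a, b):
    """Linear prediction rule: start with baseline a, then add b times X."""
    return a + b * x

# Hypothetical values: baseline a = 2.0 and predictor value b = 0.5.
a = 2.0
b = 0.5

# A person with a score of 6 on the predictor variable X.
predicted_y = predict(6, a, b)  # 2.0 + 0.5 * 6

print(predicted_y)
```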
What is a parametric statistic?
Standard deviation
What statistics is/are unbiased estimators of their population parameters?
Mean
SPSS only calculates ___ statistics
Inferential
A pet owner found that on days her puppy napped a lot she had fewer headaches. This is an example of a ____ correlation
Negative
A researcher calculated a -.23 correlation between variable X & variable Y. This is considered a ____ correlation
small
A ____ confidence interval covers the mean plus or minus 1 Z score
68%
A formula for predicting score on a criterion variable based on a score on a predictor variable is called a(n)
linear prediction rule