exam 1 Flashcards
Systematic empiricism
planning, making, recording and analyzing observations in the world
Collecting and analyzing data
Parameter
descriptive, true value in a population
Statistic
descriptive value in a sample
Inferential statistics
methods for using sample data to make general conclusions (inferences) about populations
Parameter– difference between groups in a population
Statistic– difference between groups in a sample
Sampling error
discrepancy (error) between the sample statistic and its population parameter
Chance variation in every random sample we pull from the population could explain the difference between the statistic and the parameter
Goal of inferential statistics
estimating how large the sampling error actually is
Sample should be diverse and big enough to be representative to avoid sampling error
Random sample
Covariates
other variables we know of that may correlate with our independent/dependent variables
Never manipulated by the researcher
We can statistically control for covariates and use them to ask more complex questions
Demographic variables
descriptive variables about our sample
Age, race, gender, educational attainment, income, marital status, etc
always need at least age, race, and gender
Shows how well you are representing the population
Also important for replication purposes
Construct
variables that cannot be directly observed, but are useful for describing and explaining behaviors
ex– happiness, stress levels
Operational definition
the way a construct is measured in an empirical study
We operationally define constructs with measures
Survey measures
Physiological measures
Behavior measures
Reliable
consistency across time, items, and raters
Valid
accuracy, measuring what they are supposed to measure
Indivisible categories
nothing exists between them
Example – attachment styles
Secure, avoidant, resistant, disorganized
Continuous variable
infinitely divisible at the discretion of the researcher
Example– time
Can have an infinite number of categories
Between any two points on a truly continuous variable, it’s always possible to find a third point between them
150 - 150.5 - 151 - 151.5
Nominal scales
unordered set of categories identified only by name
Only able to tell us whether individuals are the same/different
Categories, no math value to them
Example– college major
Real limits
boundaries located exactly halfway between adjacent categories and define the range of each category
Real upper limit– top boundary
Real lower limit– lower boundary
Example
4.425– 4.43
4.424– 4.42
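A minimal sketch of how real limits could be computed, assuming scores are recorded to a fixed precision (0.01 here, to match the example above). The function name `real_limits` is hypothetical.

```python
def real_limits(score, precision):
    """Return the (lower, upper) real limits of a recorded score.

    `precision` is the size of one measurement unit (assumed: 0.01).
    Each limit sits exactly halfway to the adjacent category.
    """
    half_unit = precision / 2
    return score - half_unit, score + half_unit


lower, upper = real_limits(4.42, 0.01)
print(f"4.42 covers values from {lower:.3f} up to {upper:.3f}")
```

Under this sketch, the upper real limit of 4.42 is also the lower real limit of 4.43, which is why 4.425 is the boundary between the two categories.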
Ordinal scales
ordered set of categories
Tells us the direction of differences between two categories, but not the absolute distance
Not equal intervals between the categories
Examples– restaurant drink sizes, rankings in race (first, second, third)
If you rate a single statement on a scale from strongly disagree (1) to strongly agree (5), etc, it counts as an ordinal scale
Interval scales
ordered set of categories with equally spaced intervals and an arbitrary zero point
Arbitrary– just another category, does not mean absence of the construct
Example– temperature
0 degrees does not mean no temperature
Ratio scales
interval scale with an absolute zero point
Key– zero means there is none of the construct
Can also perform mathematical operations
Examples– height, weight, running distance
helpful in determining appropriate scale
Test one– ask yourself if zero means total absence of the quantity
If it does, then it’s a ratio scale
Test two– ask yourself if you can have less than 0 of something
If you can’t, it’s likely a ratio scale
Positively/ negatively skewed
pos– scores pile up on left
neg– scores pile up on right
N
total number of scores in a population
n
total number of scores in a sample
M
sample mean
Central tendency
describe the central point or typical value of a distribution
Goal– allows researchers to summarize/condense a large data set into a single value
Allows us to quickly compare two data sets that have collected similar information
Ex– exam one scores in this class versus other psy 350 class
Variability
how spread out the scores are around the central point
Together, these measures describe distributions of scores
Usually reported together
Descriptive statistics– both central tendency and variability measures
Both define the shape of distribution
central tendency/ mean
most commonly used measure of central tendency
Only works if measure is numerically coded
Rescaling
moving the distribution on a number line
Adding/multiplying a constant value to every single score in sample
rescaling– multiplying
Mean: Is multiplied by the constant value.
Standard Deviation: Is also multiplied by the constant value (spread changes).
Distribution: Stretches or compresses depending on whether the constant is greater than or less than 1.
rescaling– adding
Mean– Increases by the constant value.
Standard Deviation– stays the same
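The two rescaling rules can be checked on a small data set. This is only an illustration: the scores and the constants (10 and 3) are made up, and the population SD formula (`pstdev`) is used for simplicity.

```python
from statistics import mean, pstdev

scores = [2, 4, 6, 8]  # hypothetical scores with a mean of 5

added = [x + 10 for x in scores]      # add a constant to every score
multiplied = [x * 3 for x in scores]  # multiply every score by a constant

for label, data in [("original", scores), ("plus 10", added), ("times 3", multiplied)]:
    print(f"{label:>8}: mean = {mean(data):.2f}, SD = {pstdev(data):.2f}")
```

Adding shifts the mean and leaves the SD alone; multiplying changes both by the same factor.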
Nominal data
It is always inappropriate to compute a mean for nominal data
No numerical value assigned to categories
ordinal data
Usually inappropriate to compute mean for ordinal data
If you have a lot of categories, it might be appropriate
8+ categories
Easier to just give percentages that fall into each category
Wiggle room with ordinal data, but never with nominal
Median
midpoint of a list of scores that are sorted in order from smallest to largest
Median is less affected by extreme scores and skew
Stays in the center of a distribution
Should be used over the mean in these situations
Median does a better job at representing the distribution over mean
Also does a better job even if skew is present
skew cut off points
perfect– 0
-2 or lower– problematic negative skew
2 or higher– problematic positive skew
Mode
most frequently occurring category or score in a distribution
Only one to report for nominal data
Central tendency and distribution shape
Normal distribution– mean, median, and mode will be equal to each other and be the same value
Skewed
Mode will be at the peak
Mean will be toward the tail
Median usually is between mean and the mode
Order (from left to right) for positive skew– mode, median, mean; reversed for negative skew
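As a quick check of that ordering, here is a sketch using a made-up, positively skewed set of scores (most scores are low, with a long tail to the right).

```python
from statistics import mean, median, mode

# Hypothetical scores: piled up on the left, tail stretching to the right
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

print("mode:  ", mode(scores))    # most frequent score, at the peak
print("median:", median(scores))  # midpoint of the sorted scores
print("mean:  ", mean(scores))    # pulled toward the tail by 9 and 19
```

The two extreme scores drag the mean toward the tail, while the median stays closer to where most scores sit.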
variability purposes
Describe the spread in a distribution
Important component of most inferential statistical tests
Three variability measures
Range
Variance
Standard deviation
Variance
measures the average squared distance between a score and the mean
Standard deviation
standard average distance between a score and the mean
Most commonly reported measure of variability
Both tell us how much spread is around the mean
Population variance
mean squared distance from the mean
Population standard deviation
standard, mean distance from the mean
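The definitions above can be followed step by step. This sketch uses a made-up population of five scores and computes the variance and standard deviation by hand rather than with a library call.

```python
from math import sqrt

scores = [1, 9, 5, 8, 7]  # hypothetical population of N = 5 scores
N = len(scores)

mu = sum(scores) / N                               # population mean
squared_deviations = [(x - mu) ** 2 for x in scores]
variance = sum(squared_deviations) / N             # mean squared distance from the mean
standard_deviation = sqrt(variance)                # back in the original units

print(f"mean = {mu}, variance = {variance}, SD = {standard_deviation:.2f}")
```

Taking the square root undoes the squaring, which is why the standard deviation is easier to interpret than the variance.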
unbiased estimates
A sample statistic is unbiased if the average value of the statistic, across all possible samples, is equal to the population parameter
On average sample statistic will be that parameter after repeating it over and over again
biased estimates
A sample statistic is biased if the average value of the statistic is systematically different from the population parameter
On average, the sample statistic will not equal the parameter, even after repeating the sampling over and over again
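One way to see the difference is to list every possible sample from a tiny population and average a statistic across all of them. The sketch below assumes a made-up population of six scores and samples of n = 2 drawn with replacement. It compares the sample variance computed by dividing by n (biased) with the version that divides by n - 1 (unbiased).

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [0, 0, 3, 3, 9, 9]        # hypothetical population
sigma_squared = pvariance(population)  # the parameter we are trying to estimate

# Every possible ordered sample of n = 2, drawn with replacement (36 samples)
samples = list(product(population, repeat=2))

# pvariance divides by n; variance divides by n - 1
average_dividing_by_n = mean([pvariance(s) for s in samples])
average_dividing_by_n_minus_1 = mean([variance(s) for s in samples])

print("population variance:        ", sigma_squared)
print("average when dividing by n:  ", average_dividing_by_n)
print("average when dividing by n-1:", average_dividing_by_n_minus_1)
```

Dividing by n systematically underestimates the population variance, while dividing by n - 1 averages out to the parameter.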
degrees of freedom
number of scores in the sample that are free to vary from the mean
Example
If M = 6 and n = 3
First score– 8
Second score– 3
Third score has to be 7 to result in a mean of 6
First two are free to vary, where third is fixed
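The same example as a small calculation: once the mean and the first n - 1 scores are known, the last score is forced. The helper name `fixed_last_score` is hypothetical.

```python
def fixed_last_score(sample_mean, n, free_scores):
    """Return the one score that is not free to vary.

    The scores must add up to sample_mean * n, so the last score is
    whatever is left after the n - 1 free scores are subtracted.
    """
    required_total = sample_mean * n
    return required_total - sum(free_scores)


third = fixed_last_score(sample_mean=6, n=3, free_scores=[8, 3])
print("third score must be", third)
print("degrees of freedom:", 3 - 1)
```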
Properties of standard deviation
if a constant is added to every score in the distribution, the standard deviation does not change
Adding a constant to every score just shifts distribution
If every score is multiplied by a constant, the standard deviation is multiplied by that same constant
Increased distance between scores = increased standard deviation
Z scores
standard score
We convert X into a z score to tell us the exact location of any given score
interpreting z scores
Positive– score is above the sample mean
Negative– score is below the sample mean
If z score is 2.5– 2.5 standard deviations above the sample mean
When converting raw scores into z score– mean becomes 0, standard deviation becomes 1
Standardized distributions
A distribution where all the original scores have been changed so that they now have a set (or specific) mean and standard deviation
When we transform all scores to z-scores
the form of the distributions does not change
Mean now is 0
Standard deviation is now 1
Advantages– puts everything on the same scale so we can make comparisons between different distributions
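A minimal sketch of the transformation, assuming the usual formula z = (X - mean) / SD. The scores are made up, and the population SD formula (`pstdev`) is an assumption here; a course may use the sample SD instead.

```python
from statistics import mean, pstdev


def to_z_scores(scores):
    """Convert every raw score to a z score: (X - mean) / SD."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: population SD formula
    return [(x - m) / sd for x in scores]


scores = [2, 4, 6, 8]  # hypothetical raw scores
z_scores = to_z_scores(scores)

print([round(z, 2) for z in z_scores])
print("mean of z scores:", round(mean(z_scores), 2))
print("SD of z scores:  ", round(pstdev(z_scores), 2))
```

The z scores keep the same order and spacing pattern as the raw scores, so the shape is unchanged; only the scale is relabeled.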
Interpreting z scores
Must pay attention to the sign (positive or negative) and the absolute value (how large it is)
Sign tells us whether the score is above or below the mean
Negative– below
Positive– above
Absolute value– tells us how many standard deviations the score is from the mean
Z score of 0 means it is the mean
Probabilistic reasoning
We typically deal with likelihood instead of certainty
Independent random sample
means that every item has an equal chance of being chosen, and the chances don’t change from one pick to the next, even if you’re picking more than one item
If the body is on the left and the tail is on the right
z score is positive
Vice versa for negative
Usually more interested in the tail
Rarest z-score will always be the one furthest from the mean
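The body and tail proportions listed in a unit normal table can also be computed. This sketch assumes a normal distribution and uses Python's `statistics.NormalDist`; the function name `body_and_tail` is hypothetical.

```python
from statistics import NormalDist


def body_and_tail(z):
    """Return (body, tail) proportions for a z score in a normal distribution.

    The body is the larger section and the tail is the smaller one,
    so the result is the same for +z and -z.
    """
    body = NormalDist().cdf(abs(z))
    return body, 1 - body


for z in (0.5, 1.0, 1.96, 2.5):
    body, tail = body_and_tail(z)
    print(f"z = {z:>4}: body = {body:.4f}, tail = {tail:.4f}")
```

The tail shrinks as z moves away from the mean, which is the sense in which the furthest z scores are the rarest.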
Distribution of sample mean
collection of sample means for all possible random samples of a given size from a population
This is a specific type of sampling distribution, or a distribution of statistics obtained from all possible samples
The more samples you draw the more normal and symmetrical your distributions of samples will look
Three distributions we have talked about so far
Distribution of individual scores in the population
Original set of scores in the population we want to study
Example– 333 million Americans
Distribution of individual scores in a sample
Actual set of scores from our sample we are able to study
Example– 20,000 Americans you pulled for your sample
Distribution of sample means
Theoretical set of means from all possible random samples
Used to draw conclusions
Central limit theorem
The average of our distribution of sample means will always be equal to the population mean
Expected value of M– most likely value for the mean in any sample
The standard deviation for the distribution of sample means is smaller than the population standard deviation
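Both claims can be checked by listing every possible sample from a very small population. The sketch assumes a made-up population of four scores and samples of n = 2 drawn with replacement, so the full distribution of sample means has only 16 entries.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]  # hypothetical population
n = 2                      # size of each sample

mu = mean(population)
sigma = pstdev(population)

# Mean of every possible ordered sample of size n (with replacement)
sample_means = [mean(s) for s in product(population, repeat=n)]

expected_value_of_M = mean(sample_means)
sd_of_sample_means = pstdev(sample_means)

print("population mean:     ", mu)
print("mean of sample means:", expected_value_of_M)
print(f"population SD:        {sigma:.3f}")
print(f"SD of sample means:   {sd_of_sample_means:.3f}")
print(f"sigma / sqrt(n):      {sigma / sqrt(n):.3f}")
```

In this enumeration the SD of the sample means matches sigma divided by the square root of n, which is the standard error.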
Law of large numbers
the larger our sample sizes (n) are, the more probable it is that a sample mean will be close to the population mean
As our sample size gets larger, our standard error gets smaller
Standard error vs sampling error
Sampling error is the diff between one sample mean and the population mean
Standard error is the average sampling error across all possible samples
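A short sketch of how the standard error shrinks as n grows, assuming the usual formula sigma / sqrt(n). The population SD of 10 is a made-up value.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(n)."""
    return sigma / sqrt(n)


sigma = 10  # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.1f}")
```

Note that quadrupling the sample size only cuts the standard error in half.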
How large is large
30
If you are taking samples of at least n = 30, the distribution of sample means will be approximately normal, even if the population is not
If it’s not 30, but the population distribution is normal, you are still guaranteed a normal distribution of sample means
Z test
Purpose– test for a difference between the true population parameter and the observed sample statistic
Decide between two explanations
Difference is so small, there does not appear to be a treatment effect
Difference is so big that there appears to be a treatment effect
Can it reasonably be explained by sampling error?
Known population
this is the known mean from the original untreated population you want to compare your sample to
What you are drawing your sample from
Have to know what the mean and sd are
Unknown population
unknown mean among the population that receives the treatment you are trying to estimate with your sample
After treatment, if the sample mean is different enough from the known population mean, the sample now belongs to an unknown treated population
Main goal– does the random sample still represent the untreated population or now represent an unknown treated population
Null hypothesis
states that there is no difference in the population as a result of some treatment
Assumes there is no effect in the study
Assumes our “treated” sample comes from the same untreated population
In the example, average stress would still be 50
The null hypothesis is what we are testing, so we assume that it’s true and that scores will be the same after applying the treatment
Trying to find enough evidence against the null so we can reject it
Is diff large enough we can conclude null is not true?
Alternative hypothesis
states that there is a difference in the general population as a result of some treatment
Assumes any change we see is beyond what we would expect because of sampling error
Assuming its not plausible that our treated sample comes from the original untreated population
We don’t make any specific prediction about the alternative hypothesis, just that it is not the same as the null
Null– H0: μ = 50
Alt– H1: μ ≠ 50
Set the criteria for a decision
Choose between the null and alternative hypotheses based on how likely it is that we would observe our test statistic if the null hypothesis were true
If it’s likely, then the null is probably correct and we would prefer it
Vice versa for alternative
Have to decide what rare is going to be
Have to make this choice in advance– before we ever look at the data
Alpha level
probability value used to define what is likely and what is unlikely
If you think 15% is an unlikely score, alpha level would be .15
In psychology 5% (.05) is the typical standard
5% of all possible sample means can be classified as unlikely to occur
After selecting alpha level, find z score associated with 5% chance in the tails
Can’t do it with a skewed distribution– has to be at least 30 people in our sample
We want to pick z scores that cut off 2.5% in each of the positive and negative tails, for 5% in total
(+/- 1.96)– cuts off the most extreme 5% of all the possible scores
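The critical z score for a given alpha level can be looked up in a table or computed. This sketch assumes a two-tailed test on a normal distribution and uses `statistics.NormalDist`; the function name `critical_z` is hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Two-tailed critical z: alpha is split evenly between the two tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.15, 0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: critical z = +/- {critical_z(alpha):.2f}")
```

A stricter (smaller) alpha pushes the critical z scores further out into the tails.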
Critical region
defined by critical z scores (+/- 1.96) that cut off the most extreme 5% of all possible sample means
Values you would deem super unlikely to occur if the null is true
In theory, any score in this region is very unlikely if the treatment has no real effect
Collect data and compute sample statistics
Recruit sample, apply treatment, compute mean score
Convert the sample statistics to a test statistic, which is just a z-score in the case of a z test
Assuming the null is true (the treated sample still belongs to the known untreated population), you want to know how probable it would be to find a z score of that magnitude
Probable– fail to reject the null hypothesis
Improbable– reject the null hypothesis (supports the alternative)
Make a decision
Use unit normal table
If it is in the critical region, we conclude that the difference is significant
In this case we reject the null
If it’s in the body, we would conclude we did not find enough evidence to reject the null
In this case we fail to reject the null
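The full sequence of steps can be sketched as one function. The known population mean of 50 comes from the stress example above; the population SD (10), sample size (25), and treated sample mean (55) are made-up values for illustration, and the function name `z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z test of a sample mean against a known population."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu) / standard_error           # test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)    # boundary of the critical region
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # both tails
    reject_null = abs(z) > critical
    return z, critical, p_value, reject_null


# Hypothetical study: n = 25 treated participants with a mean stress score of 55
z, critical, p_value, reject_null = z_test(sample_mean=55, mu=50, sigma=10, n=25)

print(f"z = {z:.2f}, critical z = +/- {critical:.2f}, p = {p_value:.4f}")
print("reject the null" if reject_null else "fail to reject the null")
```

With these made-up numbers the z score lands in the critical region, so the decision is to reject the null.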
Rejecting vs. accepting
We never accept the alternative hypothesis because that’s not what we are actually testing (always testing the null), but you can support it
For the null we either reject (p<.05) or fail to reject the null (p > .05)
Type one error
false positive
Rejecting null when you shouldn’t have
A sample statistic falls in the critical region even though the treatment has no real effect in the population
type two error
false negative
Treatment effect does exist, we just failed to detect it
More likely to make a type two error
Lack of statistical power
Power– probability the test will reject the null when the treatment does have a real effect
Depends on
Size of treatment effect
Size of sample– larger sample size, more power
Experimental validity
Increase power of a hypothesis test
Increase effect size
Conduct an experiment instead of an observational study
Use more precise measurement
Increase sample size– what we have more control over
Make sample more representative of the population
Reduces standard error of the distribution of sample means– makes it skinnier
Decrease population standard deviation
Limit target population
Reducing population so people are more similar
“20 year old, female, right handed, college student”
Increase alpha level
Ex– making it .10 instead of .05
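The factors above can be compared with a rough power calculation for a two-tailed z test. This is a sketch under stated assumptions: the treated population is assumed to be normal with a known mean, and every number other than the null mean of 50 is made up. The function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu_null, mu_treated, sigma, n, alpha=0.05):
    """Probability that a two-tailed z test rejects the null
    when the true (treated) population mean is mu_treated."""
    unit_normal = NormalDist()
    critical = unit_normal.inv_cdf(1 - alpha / 2)
    # How far the treated mean sits from the null mean, in standard errors
    shift = (mu_treated - mu_null) / (sigma / sqrt(n))
    upper_tail = 1 - unit_normal.cdf(critical - shift)
    lower_tail = unit_normal.cdf(-critical - shift)
    return upper_tail + lower_tail


baseline = z_test_power(mu_null=50, mu_treated=55, sigma=10, n=25)
bigger_sample = z_test_power(mu_null=50, mu_treated=55, sigma=10, n=100)
bigger_effect = z_test_power(mu_null=50, mu_treated=58, sigma=10, n=25)
smaller_sd = z_test_power(mu_null=50, mu_treated=55, sigma=5, n=25)
higher_alpha = z_test_power(mu_null=50, mu_treated=55, sigma=10, n=25, alpha=0.10)

print(f"baseline:      {baseline:.3f}")
print(f"n = 100:       {bigger_sample:.3f}")
print(f"larger effect: {bigger_effect:.3f}")
print(f"smaller SD:    {smaller_sd:.3f}")
print(f"alpha = .10:   {higher_alpha:.3f}")
```

Each change raises power relative to the baseline, though raising alpha also raises the chance of a type one error.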