Exam 1 Flashcards
What are nominal variables? And examples
Data that can be categorized but not ranked or measured. It is descriptive and has no numerical value.
Examples: sex/gender, country, car model, type of car
What are ordinal variables? And examples
Categories that have a distinct order or rank
Examples: first, second, and third rank, fastest to slowest ranking, elementary, middle school, and high school ranking
What are interval variables? And examples
Indicate the size of the difference between scores, but they have no true zero point (zero does not mean "none", and values can go into the negatives)
Examples: IQ and temperature
What are ratio variables? And examples
Indicate the size of the difference between scores and have a definite zero that indicates the complete absence of something, so you can say that one value is twice as much as another
Examples:
*Age
*GPA
*Time to complete a task
What are discrete quantitative variables? And examples
A variable that can only take specific, separate values (usually counts).
Examples:
*The number of children in a family
*The number of times a person has been to Brazil
What are continuous quantitative variables? And examples
A variable that can theoretically be measured ever more precisely, without end, so it can take any value within a range
Examples:
*Physical distance between two people
*Time spent working on a puzzle
What are categorical variables? and examples
They are variables that indicate different categories. Examples:
*Gender
*College major
*Experimental condition
What are quantitative variables? and examples
Variables where you can measure the size of the differences between scores. Examples:
*IQ scores
*Temperature (in Fahrenheit or Celsius)
*Age
*GPA
*Time to complete a task
What are the subcategories of categorical variables?
Nominal and ordinal
What are the subcategories of quantitative variables?
By level of measurement: interval and ratio. By the values they can take: discrete and continuous.
Which contains the most information out of ordinal, nominal, interval and ratio? (Lowest- highest amount of information)
Nominal: Lowest information (categories only)
Ordinal: Lower-middle information (adds order)
Interval: Upper-middle information (adds equal-sized differences)
Ratio: Highest information (adds a true zero)
What is the coding system for nominal variables?
It is arbitrary. The codes can be any values you want; the numbers do not matter as long as each code is matched to its category.
What is the definition of a population?
It is the entire group of people or things being studied
What is the definition of a sample?
It is a group of people that comes from the population and that you are actually able to get data from
What is the difference between an independent and dependent variable?
Independent: A variable that the researcher manipulates
Dependent: The variable that is measured and that changes based on the manipulation of the independent variable
How do you calculate frequency?
By counting how many times a number or data point shows up in the data
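Counting frequencies can be sketched in a few lines of Python. The scores below are made up for illustration, and `Counter` is just one convenient way to do the counting.

```python
from collections import Counter

# Made-up scores, for illustration only
scores = [3, 5, 3, 4, 5, 3]

# Counter maps each value to the number of times it appears
frequency = Counter(scores)

print(frequency[3])  # 3
print(frequency[4])  # 1
print(frequency[5])  # 2
```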
What does the box represent in a box plot?
The range for the middle 50% of scores (from Q1 to Q3).
What minimum level of measurement is needed for a bar chart, boxplot, and histogram?
Bar chart: Nominal
Box Plot: Ordinal
Histogram: Interval or Ratio
How can you tell if a graph is a bar graph or a histogram?
If the bars are touching, the graph is a histogram. If there are gaps between the bars, it is a bar graph
What are the benefits of using a bar graph?
BEST TO USE FOR NOMINAL VARIABLES
Easier to see the numbers than pie charts
What are the benefits of using a box plot?
They are a way of looking at an entire distribution at once
Good for seeing outliers and checking symmetry
What are the benefits of using histograms?
Used for quantitative variables, for example to see whether scores follow a bell curve
Good for getting a feel for a set of scores and for describing the shape of the distribution
What is the normal distribution? And its value?
- Refers to the bell-curve shape. ALWAYS CURVED
- ITS SKEWNESS AND KURTOSIS VALUES ARE NEAR 0
What is the bell curve?
It is the normal distribution in a graph and it has one hump (one curve)
What is the difference between unimodal and bimodal distributions?
Unimodal: has one hump in a graph
Bimodal: has two humps in a graph
What is the difference between positive and negative skewed graphs?
Positively skewed: The high point (peak) is on the left side of the graph and the tail slopes down to the right (like a slide)
Negatively skewed: The high point (peak) is on the right side of the graph and the tail slopes down to the left (like a slide)
What is kurtosis?
Kurtosis has to do with how flat or pointed the distribution is compared to a normal distribution
What is the difference in shape between platykurtic, mesokurtic, and leptokurtic?
Platykurtic: Is more flat, has a flat plateau
Mesokurtic: The normal bell curve shape
Leptokurtic: Narrow and super pointy
What affects kurtosis the most?
The outliers
What are the values of platykurtic, mesokurtic, and leptokurtic? And do they have a lot of outliers or not?
Platykurtic: the value is negative, FEW OUTLIERS
Mesokurtic: the value is 0, NO OUTLIERS
Leptokurtic: the value is positive, A LOT OF OUTLIERS
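The sign of the kurtosis value can be checked with a small sketch. It assumes the simple population formula for excess kurtosis (fourth central moment divided by the squared variance, minus 3); statistics packages often apply a sample correction, so their numbers may differ slightly. The function name and both data sets are made up for illustration.

```python
def excess_kurtosis(data):
    """Population excess kurtosis: fourth moment / variance squared, minus 3."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                            # evenly spread, no outliers
heavy_tails = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]   # mostly identical, two extreme scores

print(excess_kurtosis(flat))         # negative value (platykurtic)
print(excess_kurtosis(heavy_tails))  # positive value (leptokurtic)
```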
What does the line in the middle of the box represent in a box plot?
The 50th percentile (Q2, the median)
What does the left side (left edge) of the box represent in a box plot?
The 25th percentile (Q1)
What does the right side (right edge) of the box represent in a box plot?
The 75th percentile (Q3)
What does the left line outside of the box represent in a box plot?
Lowest non-outlying value
What does the right line outside of the box represent in a box plot?
Highest non-outlying value
What do the dots represent in a boxplot?
The outliers
How do you calculate the mean, median, and mode?
Mean: add up all the numbers in the set, then divide by the total count of numbers in the set (how many there are)
Median: the middle number in the set once the numbers are put in order (with an even count, the average of the two middle numbers)
Mode: the number that shows up most often in the set
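A short sketch of all three measures, using Python's built-in `statistics` module on made-up scores:

```python
import statistics

# Made-up scores, for illustration only
scores = [2, 3, 3, 5, 7]

mean = statistics.fmean(scores)     # (2 + 3 + 3 + 5 + 7) / 5
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequent value

print(mean)    # 4.0
print(median)  # 3
print(mode)    # 3
```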
What are the minimum levels of measurement needed for the mean, median, and mode?
Mean- NEEDS INTERVAL OR RATIO DATA so it is the most restrictive
Median- USES ORDINAL DATA
Mode- USES NOMINAL DATA
What happens to the mean and median if the distribution of scores is perfectly symmetrical?
THEY WILL HAVE THE SAME VALUE
What happens to the mean, median, and mode when the distribution of scores is perfectly symmetrical and unimodal?
THEY WILL ALL HAVE THE SAME VALUE
What happens to the mean, median, and mode in positively skewed and negatively skewed graphs?
Positive: The mode is at the peak on the left side (the lowest value), the median is in the middle, and the mean is pulled toward the tail on the right (the highest value)
Negative: The mode is at the peak on the right side (the highest value), the median is in the middle, and the mean is pulled toward the tail on the left (the lowest value)
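The ordering of the three measures under skew can be checked on small made-up data sets. In the sketch below, one large score creates a positive skew, and mirroring the scores creates a negative skew.

```python
import statistics

# Positively skewed: most scores are low, one score is far to the right
positive = [1, 1, 1, 2, 2, 3, 4, 10]
# Mirror image of the same scores gives a negative skew
negative = [11 - x for x in positive]

pos_mode = statistics.mode(positive)      # 1
pos_median = statistics.median(positive)  # 2.0
pos_mean = statistics.fmean(positive)     # 3.0

neg_mode = statistics.mode(negative)      # 10
neg_median = statistics.median(negative)  # 9.0
neg_mean = statistics.fmean(negative)     # 8.0

print(pos_mode < pos_median < pos_mean)   # True: mode lowest, mean highest
print(neg_mean < neg_median < neg_mode)   # True: mean lowest, mode highest
```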
Determine if the mean, median, and mode can handle outliers and open-ended scores
Mean: cannot handle outliers or open-ended scores
Median: can handle outliers and can handle open-ended scores
Mode: can handle outliers and can handle open-ended scores
How does the mean work best? (what situations?)
Mean- contributes directly to nearly every inferential statistic
How does the median work best (what situations)?
Median- The median’s ability to deal with statistical aberrations is one of its great advantages. This is one of the reasons that whenever there are skewed data, such as income or house prices in which most of the data are at the low end but a few are at the very high end, the median is typically reported.
How does the mode work best (what situations)?
Mode- it is the easiest to use
Which measures will be highest or lowest in positively and negatively skewed distributions?
Positive: The mean is highest and the mode is lowest
Negative: The mode is highest and the mean is lowest
How do you calculate the range of a data set?
highest number in the data set MINUS the lowest number in the data set
What is IQR (interquartile range)?
The IQR is a measure of variability that corresponds to the median
How do you calculate the IQR of a data set?
Q3 (the 75th percentile) MINUS Q1 (the 25th percentile) = IQR
How do you get the Q3 or 75%?
Q3 is the halfway point between the median and the highest number in the data set (the median of the upper half of the scores)
How do you get the Q1 or 25%?
Q1 is the halfway point between the median and the lowest number in the data set (the median of the lower half of the scores)
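The range and IQR can be sketched as below. The `quartiles` helper is a hypothetical function that uses the "median of each half" method; other textbooks and software use slightly different quartile rules, so results can differ by a small amount on the same data.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # scores below the median position
    upper = ordered[half + n % 2:]   # scores above the median position
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Made-up scores, for illustration only
scores = [1, 3, 4, 6, 7, 9, 10, 12]

q1, q2, q3 = quartiles(scores)
data_range = max(scores) - min(scores)  # highest minus lowest
iqr = q3 - q1                           # Q3 minus Q1

print(q1, q2, q3)   # 3.5 6.5 9.5
print(data_range)   # 11
print(iqr)          # 6.0
```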
How do you calculate the population variance and standard deviation?
Variance: First get the mean of the data set. Then subtract the mean from each number in the data set to get each number's deviation. Then square each deviation. After that, add up all of the squared deviations. Then divide that sum by the size of the data set (how many numbers there are). The result is the population variance.
Standard deviation: Do the same steps to get the population variance, then take the square root of the population variance to get the standard deviation.
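The population steps can be followed one at a time in code. This is a minimal sketch on made-up scores; note that each deviation is squared (not square-rooted) before summing, and the square root is taken only at the end for the standard deviation.

```python
import math

# Made-up scores, treated as an entire population
scores = [2, 4, 4, 4, 5, 5, 7, 9]

N = len(scores)
mu = sum(scores) / N                      # step 1: population mean
deviations = [x - mu for x in scores]     # step 2: each score minus the mean
squared = [d ** 2 for d in deviations]    # step 3: square each deviation
ss = sum(squared)                         # step 4: sum of squares (SS)
pop_variance = ss / N                     # step 5: divide by N
pop_sd = math.sqrt(pop_variance)          # standard deviation: square root of variance

print(mu)            # 5.0
print(ss)            # 32.0
print(pop_variance)  # 4.0
print(pop_sd)        # 2.0
```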
How do you calculate the sample variance and standard deviation?
Variance: Get the mean of the sample. Then subtract the mean from each number in the data set to get the deviation scores. Then square the deviation scores. Then add up all of the squared deviation scores. Then divide that sum by the size of the data set (how many total numbers there are) MINUS 1. The result is the sample variance.
Standard deviation: Same steps as the sample variance, except that at the end you take the square root of the sample variance.
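For a sample, the only change is the denominator: n - 1 instead of N. A minimal sketch on made-up scores, cross-checked against Python's built-in `statistics.variance`, which also divides by n - 1:

```python
import math
import statistics

# Made-up scores, treated as a sample from a larger population
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
m = sum(scores) / n                        # sample mean
ss = sum((x - m) ** 2 for x in scores)     # sum of squared deviations
sample_variance = ss / (n - 1)             # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)     # square root of the sample variance

print(round(sample_variance, 2))  # 4.57
print(round(sample_sd, 2))        # 2.14

# The built-in function gives the same answer
print(math.isclose(sample_variance, statistics.variance(scores)))  # True
```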
Why do samples have different formulas than populations?
A sample typically underestimates the variability of the population, so dividing by n - 1 instead of N makes the result slightly larger to correct for that.
Define degrees of freedom
It tells you how many numbers are free to vary (they can be whatever number you want them to be). For the sample variance it is n - 1.
What are deviation scores/values?
It is the difference between one individual’s score and the population mean. (one data point minus the mean)
Can Range handle outliers? Why or why not?
(Cannot handle outliers because it is easily distorted)
Can IQR handle outliers? Why or why not?
(YES, it is not affected by outliers because it only uses the middle 50% of scores)
Can population variance handle outliers? Why or why not?
(Cannot handle outliers because an outlier's large squared deviation drastically inflates the value)
Can population standard deviation handle outliers? Why or why not?
(Cannot handle outliers because an outlier's large squared deviation drastically inflates the value)
Can sample variance handle outliers? Why or why not?
(Cannot handle outliers because an outlier's large squared deviation drastically inflates the value)
Can sample standard deviation handle outliers? Why or why not?
(Cannot handle outliers because an outlier's large squared deviation drastically inflates the value)
Why can’t open-ended and undefined scores be used in calculations of the range, population and sample variance, and population and sample standard deviation?
BECAUSE THEY ARE CATEGORIES WITHOUT AN EXACT NUMERICAL VALUE, SO THEY CANNOT BE ADDED OR SUBTRACTED
Can IQR handle open-ended and undefined scores?
YES but they need to make up less than 25% of the data.
What does it mean when variability statistics are high vs. when they are low?
High: Scores are spread far from the mean, so there is larger variability and it is harder to get statistically significant results.
Low: Scores cluster closer to the mean, so they are easier to predict.
What does N stand for?
The size of the population, or the number of scores included in the calculations
What does σ² stand for?
population variance
What does X stand for?
One individual’s score
What does Σ stand for?
add up everybody’s scores on the part that follows.
What does (X – μ) stand for?
the difference between one individual’s score and the population mean
What does μ stand for?
population mean or average
What does (X – μ)² stand for?
The squared deviation from the mean
What does Σ(X – μ)² stand for?
The sum of squared deviations from the mean, also known as the “sum of squares” or just SS
What does n-1 stand for?
The degrees of freedom; n-1 replaces N in the denominator for the sample variance and sample standard deviation
What does Σ(X – μ)²/N stand for?
It is the population variance, or the average squared deviation from the mean
How do you calculate a Z score for an individual raw score?
Take the raw score (X) minus the mean (μ), then divide that by the standard deviation (SD): z = (X – μ) / SD
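As a quick worked example, here is the calculation for one raw score. The function name and the numbers (a score of 85 on a test with mean 70 and SD 10) are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical test: mean 70, standard deviation 10
print(z_score(85, 70, 10))  # 1.5  (one and a half SDs above the mean)
print(z_score(60, 70, 10))  # -1.0 (one SD below the mean)
```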
How does the shape of a distribution change when an entire distribution of raw scores is converted to z-scores?
(DOES NOT CHANGE)
What happens to the values of M (mean) and SD (standard deviation) when an entire distribution of raw scores is converted to z-scores?
(M= 0) (SD= 1 ALWAYS)
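This can be verified on any set of scores. The sketch below converts a made-up population of scores to z-scores and checks the resulting mean and standard deviation; `pstdev` is the population standard deviation from Python's `statistics` module.

```python
import math
import statistics

# Made-up scores, treated as an entire population
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.fmean(scores)   # 5.0
sd = statistics.pstdev(scores)    # 2.0

# Convert every raw score to a z-score
z_scores = [(x - mean) / sd for x in scores]

print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(math.isclose(statistics.fmean(z_scores), 0, abs_tol=1e-9))  # True: mean is 0
print(math.isclose(statistics.pstdev(z_scores), 1))               # True: SD is 1
```

The order and spacing of the scores relative to each other are unchanged, which is why the shape of the distribution stays the same.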
What does a Z score mean?
A z-score simply indicates how far away a score is from the distribution’s mean in terms of standard deviations: how many standard deviations above or below the mean an individual score is.
What does 34% mean in a normal distribution?
About 34% of the distribution is between z = 0 and z = +1 (and the same amount between 0 and –1).
What does 68% mean in a normal distribution?
About 68% of the distribution is within one standard deviation of the mean; put another way, –1 < z < +1 or |z| < 1 (where the vertical bars, called pipes, indicate absolute value).
What does 14% mean in a normal distribution?
About 14% (approximately) of the distribution is between z-scores of +1 and +2 (and the same amount between –1 and –2).
What does 50% mean in a normal distribution?
50% of the distribution is below the mean and 50% is above the mean.
What does over 99% mean in a normal distribution?
Over 99% of the distribution is within 3 standard deviations of the mean, or, put another way, |z| < 3.
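These percentages can be reproduced from the standard normal curve. A minimal sketch using `NormalDist` from Python's `statistics` module (Python 3.8 or later), where `cdf(z)` gives the proportion of the distribution below `z`:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)
cdf = standard_normal.cdf  # proportion of scores below a given z

between_0_and_1 = cdf(1) - cdf(0)   # about 0.34
within_1_sd = cdf(1) - cdf(-1)      # about 0.68
between_1_and_2 = cdf(2) - cdf(1)   # about 0.14
below_mean = cdf(0)                 # 0.5
within_3_sd = cdf(3) - cdf(-3)      # about 0.997

print(round(between_0_and_1, 2))  # 0.34
print(round(within_1_sd, 2))      # 0.68
print(round(between_1_and_2, 2))  # 0.14
print(round(below_mean, 2))       # 0.5
print(within_3_sd > 0.99)         # True
```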
Why do you convert X scores to Z scores and vice versa?
To put a score in whichever form is more familiar or more meaningful: z-scores show where a score stands relative to the mean, and X scores put it back in the original units
How do you convert X scores to Z scores?
Subtract the mean from the X score, then divide by the standard deviation: z = (X – M) / SD
How do you convert Z score to X scores?
Multiply the Z score by the standard deviation (of the X scores), then add the mean (of the X scores): X = (z × SD) + M
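The two conversions undo each other, which a short round trip makes clear. The function names and the scale used here (mean 100, SD 15) are hypothetical choices for illustration.

```python
def x_to_z(x, mean, sd):
    """Raw score to z-score: subtract the mean, then divide by the SD."""
    return (x - mean) / sd

def z_to_x(z, mean, sd):
    """z-score to raw score: multiply by the SD, then add the mean."""
    return z * sd + mean

# Hypothetical scale with mean 100 and standard deviation 15
mean, sd = 100, 15

x = z_to_x(2, mean, sd)      # a z-score of 2 as a raw score
z = x_to_z(x, mean, sd)      # and back again

print(x)  # 130
print(z)  # 2.0
```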
What are the characteristics of Standard normal distributions that we need to know? (3)
- It is an important distribution in statistics
- It has a bell-curve shape
- It includes a complete set of scores
What does the mean and standard deviation equal in standard normal distributions?
Mean = 0, SD = 1