Stats Flashcards
Population
The set of all measurements of interest to the investigator. It is what the investigator wants to study, analyze, and draw conclusions (inferences) about.
Examples:
All people in the United States
All cats on Mars
All bugs in your hair
Sample
A smaller group taken from the population that is actually measured in the study. Ideally it reflects the whole population. The number of items in the sample is the sample size (n).
Ex:
100 people in the United States
50 cats on Mars
5 bugs in your hair
Experimental Unit or Element
The individual or object on which a variable is measured
In other words, the single person or thing you take a measurement from in order to get data
Ex: one person in the United States
one cat on Mars
How do people collect data from the sample?
Surveys, observations, interviews, etc.
Variable
The attribute or characteristic you measure on each experimental unit in your sample. It can differ from one unit to the next.
Ex: height of people in the United States
color of bugs in your hair
Measure of center or central tendency
A single value that describes the middle or typical value of a data set, also known as the center of the "distribution". Common measures are the mean (the average), the median, and the mode.
Sample mean
Sum of all sample values (each value written xᵢ) divided by the number of values in the sample (n). Written x̄ ("x-bar").
Population mean
Sum of all population values divided by the true population count (N). Written μ (mu).
Excel formula: =AVERAGE(data)
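A minimal Python sketch of the same calculation, in case it helps to see the mean outside Excel. The heights are made-up values used only for illustration.

```python
from statistics import mean

heights_cm = [160, 172, 168, 181, 175]  # made-up sample values

# Mean by hand: add up the values, then divide by how many there are.
manual_mean = sum(heights_cm) / len(heights_cm)

# statistics.mean performs the same calculation.
library_mean = mean(heights_cm)

print(manual_mean, library_mean)
```

The arithmetic is the same for a sample mean and a population mean. Only the data you feed in (a sample or the whole population) and the symbol (x̄ or μ) change.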
Median
The middle value in an ordered set of numbers. If the count is odd, it is the single middle value. If the count is even, it is the average of the two middle values.
Resistant to outliers (very extreme values): an outlier barely changes the median, while it can pull the mean a lot
Excel formula: =MEDIAN(data)
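A short Python sketch of the odd-count and even-count cases, plus what one extreme value does to the median compared with the mean. All numbers are made up for illustration.

```python
from statistics import mean, median

odd_count = [3, 1, 7, 5, 9]        # sorted: 1 3 5 7 9
even_count = [3, 1, 7, 5]          # sorted: 1 3 5 7
with_outlier = [3, 1, 7, 5, 900]   # same as odd_count, but 9 replaced by 900

median_odd = median(odd_count)     # the single middle value
median_even = median(even_count)   # average of the two middle values

# Compare how the extreme value 900 affects each measure.
median_outlier = median(with_outlier)
mean_outlier = mean(with_outlier)

print(median_odd, median_even, median_outlier, mean_outlier)
```

The median of the list with 900 in it is the same as the median of the original odd-count list, while the mean jumps far above most of the values.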
Mode
The most frequent value in the data set (the value that occurs the most)
In a frequency table or graph, the value with the highest frequency is the mode
Frequency
The number of times a certain value shows up in a data set. It relates to the mode: the value with the highest frequency is the mode.
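One possible way to build a frequency table in Python and read the mode off it. The data is made up, and `Counter` is just one convenient tool for counting.

```python
from collections import Counter

values = [2, 3, 3, 5, 3, 2, 7]  # made-up data

# Count how many times each value appears.
frequency_table = Counter(values)
for value, count in sorted(frequency_table.items()):
    print(value, count)

# The value with the highest frequency is the mode.
most_frequent_value, highest_count = frequency_table.most_common(1)[0]
print("mode:", most_frequent_value, "appears", highest_count, "times")
```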
Bimodal distribution
A distribution with two modes (two peaks)
Multimodal distribution
A distribution with more than two modes
What is sometimes the reason for bimodal or multimodal distributions?
It usually happens when two different kinds of populations are mixed into one data set. For example, the heights of men and women: if the most common height for men is around 5'9" and for women around 5'5", the combined data has two separate peaks, so it is bimodal.
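MODE.SNGL (Excel formula)
Returns a single mode. If there are two modes, it returns only one of them: the first one it encounters in the data, which is the lowest of the two when the data is sorted in ascending order.
MODE.MULT (Excel formula)
Returns ALL modes from the data set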
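For comparison, Python's `statistics` module has a rough analogue of each formula: `mode` returns one mode and `multimode` returns all of them. This is only a sketch with made-up data, and Python's tie-breaking is not guaranteed to match Excel's.

```python
from statistics import mode, multimode

two_modes = [4, 4, 1, 1, 6]  # made-up data: 4 and 1 both appear twice

# One mode only. In Python 3.8+ this is the first mode encountered.
single_mode = mode(two_modes)

# Every mode, listed in the order each was first seen in the data.
all_modes = multimode(two_modes)

print(single_mode, all_modes)
```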
Weighted Mean
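A mean in which some values count more than others. Each value is multiplied by its weight (its importance), the results are added up, and the total is divided by the sum of the weights. Values with larger weights affect the mean more.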
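A small Python sketch of a weighted mean. The scores and weights are made up (think of an exam worth 50, a project worth 30, and homework worth 20).

```python
scores = [90, 80, 70]    # made-up scores: exam, project, homework
weights = [50, 30, 20]   # made-up weights: how much each score counts

# Multiply each score by its weight, add the results,
# then divide by the total weight.
weighted_total = sum(score * weight for score, weight in zip(scores, weights))
weighted_mean = weighted_total / sum(weights)

# Ordinary mean for comparison: every score counts equally.
plain_mean = sum(scores) / len(scores)

print(weighted_mean, plain_mean)
```

The weighted mean lands closer to 90 than the plain mean does because the exam score carries the most weight.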
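Anchor (absolute reference in Excel)
Locks a cell reference with dollar signs (for example $A$1) so that it stays constant when you copy and paste the formula to other cells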
Variance
A way to measure how spread out numbers are from their mean (average). It is calculated from the squared differences between each value and the mean, so a larger variance means the values are more spread out.
Standard deviation (and how it's different from population and sample variance)
The square root of the variance (for both sample and population). Taking the square root puts the spread back in the same units as the data.
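A Python sketch of how variance and standard deviation relate, using made-up data. The `statistics` module has separate functions for the population versions (divide by N) and the sample versions (divide by n - 1).

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values, mean is 5

# Population versions: treat the data as the whole population (divide by N).
population_variance = pvariance(data)
population_sd = pstdev(data)

# Sample versions: treat the data as a sample (divide by n - 1).
sample_variance = variance(data)
sample_sd = stdev(data)

# In both cases the standard deviation is the square root of the variance.
print(population_variance, population_sd)
print(sample_variance, sample_sd)
```

The sample variance comes out a little larger than the population variance for the same numbers because it divides by n - 1 instead of N.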
Sample vs population
Sample data comes from only a portion of the population and is used to estimate what is true of the whole population. Population data covers every member of the population.