Week 1 - Outlining the Basics (Review) Flashcards
What are statistics? What is the overarching goal of statistics?
Statistics are mathematical procedures for collecting, organizing, summarizing, and interpreting large amounts of data.
Goal: To understand variability in data.
What are the two main purposes of statistics?
1. Describing data sets by organizing and summarizing them.
2. Inferring properties of a population by testing hypotheses and finding estimates from sample data.
What is a population? What is a parameter?
The population is the set of all individuals of interest in a particular study.
A parameter is a value that describes a population.
What is a sample?
A set of individuals selected from a population intended to represent the population in a research study.
What is a statistic? What are statistical results used to estimate and when can you use statistical results?
A statistic is a value that describes a sample.
Statistical results are used to estimate population parameters and can only be used to generalize when the sample is REPRESENTATIVE.
What is an example of a population and a representative sample of that population?
Population (e.g.): University students in Ontario
Representative sample: A handful of university students from each program and year at each university in Ontario.
Are samples the same thing as a population?
Samples are not the same thing as a population as they only contain a subset of the whole population. This means they can sometimes underestimate or overestimate the characteristics of the population.
What is a variable?
Characteristic or condition that changes or has different values for different individuals.
What are the two main types of variables and how do the two types of variables differ?
Independent Variable (IV) - “independent” because it is supposed to be random with respect to all other variables in the population of interest.
Dependent Variable (DV) - not random; influenced by IV
IV is manipulated while DV is measured/observed
What are discrete variables? What are continuous variables?
Discrete variables have separate, indivisible categories while continuous variables have an infinite number of possible values that fall between any two observed values.
Discrete variables include all categorical/qualitative and some quantitative variables. (eg. # of people, countries, types of dogs).
Continuous variables include measurement/quantitative variables (eg. height, weight)
What is a nominal scale? What are some examples?
Nominal scales organize data into unordered categories that are distinguished by name only. The distinction between observations is not quantitative in nature.
Examples: favorite colors, type of animal one owns, gender***, mode of transport
What is an ordinal scale? What are some examples?
These scales are used for categorical variables that are ordered or ranked. These scales tend to have a direction.
Examples: Positions in a race, income level, level of education, Likert scales
What is an interval scale? What are some examples?
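An ordered scale in which the intervals between values are equal in size, but there is no true zero point (zero is arbitrary and does not mean "none" of the variable).
Examples: Temperature in °C or °F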
What is a ratio scale? What are some examples?
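An ordered scale in which the intervals between values are equal in size and there is a true zero point, where zero means none of the variable is present. Ratios of values are therefore meaningful.
Examples: height, weight, reaction time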
How do ordinal scales differ from interval scales? How do interval scales differ from ratio scales?
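Ordinal scales only rank observations, so the distance between ranks is not necessarily equal; interval scales have equal-sized intervals between values.
Interval scales have no true zero, so ratios are not meaningful (20 °C is not "twice as hot" as 10 °C); ratio scales have a true zero, so ratios are meaningful.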
Why does it matter which scale of measurement we use?
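The scale of measurement determines which arithmetic operations are meaningful (for example, a mean makes sense for interval and ratio data but not for nominal data), and therefore which descriptive statistics and statistical tests are appropriate.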
What tests can we use with interval and ratio data? Why are these tests preferred?
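Parametric tests, such as t-tests, ANOVA, and Pearson correlation. They are preferred because they generally have more statistical power than non-parametric alternatives when their assumptions are met.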
What type of scales do Likert scales fall under? Why do we use Likert scales with parametric tests?
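Likert scales are technically ordinal: the response options are ranked, but the distance between them is not guaranteed to be equal. In practice they are often treated as approximately interval, especially when several items are summed or averaged, which is why parametric tests are commonly applied to them.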
What tests can we use with non-parametric tests?
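Non-parametric tests, which are used with nominal or ordinal data, include the chi-square test, the Mann-Whitney U test, the Wilcoxon signed-rank test, the Kruskal-Wallis test, and Spearman's rank correlation.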
What is between-subjects design? What is repeated-measures/within-subjects design?
Between-subjects: When some participants are in one condition and other participants are in another condition.
Within-subjects: When all participants experience all conditions of the study.
What are the advantages of between-subjects design? What are the advantages of within-subjects design?
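Between-subjects: No order or carryover effects (such as practice or fatigue), because each participant experiences only one condition.
Within-subjects: Requires fewer participants and controls for individual differences, because each participant serves as their own comparison.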
What are the disadvantages of between-subjects design? What are the disadvantages of within-subjects design?
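Between-subjects: Requires more participants, and individual differences between the groups add variability that can hide or mimic an effect.
Within-subjects: Vulnerable to order and carryover effects (practice, fatigue), and participants may drop out before completing all conditions.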
What are descriptive statistics?
Statistical procedures used to summarize, organize, and simplify data.
What are the three (main) ways of describing data?
- Shape (Distribution Type)
- Central Tendency (Mean, Median, Mode)
- Variability (Range, Interquartile Range, Variance, Standard Deviation)
What is symmetrical distribution?
When the two sides of the distribution, on either side of the mean, are mirror images of each other.
What are skewed distributions? How do the means and medians compare on positively skewed data vs. negatively skewed data?
When the data points pile up at one end of the distribution and the tail stretches toward the other end.
- Positively skewed distribution (tail extends toward the larger values): Mean > Median
- Negatively skewed distribution (tail extends toward the smaller values): Median > Mean
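The mean/median ordering above can be checked on small made-up data sets. This is a minimal sketch using Python's standard `statistics` module; the two lists are hypothetical values chosen only so that one has a long right tail and the other a long left tail.

```python
from statistics import mean, median

# Hypothetical data: one extreme high value creates a long right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]
# Mirror image of the list above: one extreme low value creates a long left tail.
left_skewed = [50, 49, 49, 48, 48, 48, 47, 47, 1]

# The extreme value pulls the mean toward the tail; the median barely moves.
print(mean(right_skewed), median(right_skewed))  # 8 3   (mean > median)
print(mean(left_skewed), median(left_skewed))    # 43 48 (median > mean)
```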
What is kurtosis???? What are the different types of distribution as defined by kurtosis?
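Kurtosis describes how heavy the tails of a distribution are (often described as how peaked or flat it looks) compared with a normal distribution.
- Mesokurtic: tails similar to a normal distribution
- Leptokurtic: heavier tails and a sharper peak than a normal distribution
- Platykurtic: lighter tails and a flatter peak than a normal distribution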
What are outliers? What is non-normality and when is this more common?
Outliers are data points that lie far away from the bulk of the data.
Non-normality is when the data do not follow a normal distribution; it is more common with smaller sample sizes.
What is a central tendency?
A single value that identifies the center of a distribution and best represents the data set as a whole.
What are the three central tendency measures?
Mean, Median, Mode
How do you calculate mean? (include equations)
Add all data points and divide by the number of data points there are in the data set.
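Written as an equation, the verbal rule above looks like this. The symbols are an assumed but common textbook notation: $\bar{X}$ for a sample mean, $\mu$ for a population mean, $n$ and $N$ for the number of data points in the sample and population.

```latex
\bar{X} = \frac{\sum X}{n}
\qquad\text{(sample)}
\qquad\qquad
\mu = \frac{\sum X}{N}
\qquad\text{(population)}
```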
How do you calculate the median? (include equations)
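- Arrange data points from smallest to largest
- Calculate the midpoint position: (n + 1) / 2
- Find the value at that position within the ordered data set; if n is even, the position falls between two values, so take the average of the two middle values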
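The steps above can be sketched in a few lines of Python. The function name `median_by_hand` is made up for illustration, and averaging the two middle values when the count is even is the usual convention, stated here as an assumption.

```python
def median_by_hand(values):
    """Return the median by sorting and locating the midpoint."""
    ordered = sorted(values)          # step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2                      # step 2: locate the midpoint (0-based index)
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    # even n: average the two values on either side of the midpoint
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_hand([7, 1, 3]))      # 3
print(median_by_hand([7, 1, 3, 5]))   # 4.0
```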
In multimodal distribution are the medians and means meaningful?
No.
Thus, unimodal distributions are generally preferred.
When does the mean equal the median and the mode?
When the distribution is symmetrical and unimodal, as in a normal distribution.
Which central tendencies are mostly used for statistical analyses?
Mean and median.
What is variability?
The spread of the data around the central tendency.
What is the range? What is the interquartile range?
Range: Difference between the largest value and smallest value
Interquartile range (IQR): Difference between the values at Q3 and Q1 (Q3 - Q1)
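A short sketch of both measures, using made-up data. Quartiles can be computed under several conventions; this example assumes the default ("exclusive") method of `statistics.quantiles`, so other software or a textbook may give slightly different Q1 and Q3 for the same data.

```python
from statistics import quantiles

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def iqr(values):
    """Q3 minus Q1, using the default quartile method of statistics.quantiles."""
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical data set, and the same data with the top value made extreme.
scores = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 20]
with_outlier = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 200]

print(data_range(scores), iqr(scores))              # 18 8.0
print(data_range(with_outlier), iqr(with_outlier))  # 198 8.0
```

In this example a single extreme value changes the range from 18 to 198 while the IQR stays at 8.0, because the IQR only looks at the middle half of the data.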
What are the problems with using range?
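1. It uses only the two most extreme values and ignores every other data point, so it cannot always gauge variability accurately.
2. It is very sensitive to outliers: a single extreme value changes it dramatically.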
Why is the interquartile range potentially misleading?
It ignores the top 25% and bottom 25% of the data, so it can report less variability than the full data set actually has.
What is variance? What is standard deviation?
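Variance: the average of the squared deviations from the mean.
Standard deviation: the square root of the variance; roughly, the typical distance of a data point from the mean.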
How do you calculate variance and standard deviation? What is the relationship between variance and standard deviation?
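Variance: find each data point's deviation from the mean, square the deviations, add them up, and divide by the number of data points (N) for a population or by n - 1 for a sample.
Standard deviation (SD): the square root of the variance.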
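The calculation can be sketched as follows. The function names and the data are made up for illustration; the sample version divides by n - 1 and the population version divides by N.

```python
import math

def sum_of_squares(values):
    """Sum of squared deviations from the mean (SS)."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def population_variance(values):
    return sum_of_squares(values) / len(values)        # divide by N

def sample_variance(values):
    return sum_of_squares(values) / (len(values) - 1)  # divide by n - 1

def standard_deviation(variance):
    return math.sqrt(variance)                         # SD = sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, mean = 5, SS = 32

print(population_variance(data))                       # 4.0
print(standard_deviation(population_variance(data)))   # 2.0
print(round(sample_variance(data), 3))                 # 4.571
```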
What is degrees of freedom and why do we use it n-1 for sample variance and sample STDV?
Degrees of freedom are…….
How will the variability for a sample differ from the variability of its population and why does it differ?
The variability of a sample tends to underestimate the variability of its population. It differs because extreme values are rare in the population, so a sample is more likely to be drawn from the most frequent, central part of the population and to miss the extremes.
What is a sampling error?
A sampling error is the naturally occurring discrepancy, or error, that exists between a sample statistic and the corresponding parameter.
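As a tiny illustration, sampling error for the mean is just the sample statistic minus the population parameter. The population and sample below are made-up numbers, and the function name is hypothetical.

```python
def sampling_error(sample, population):
    """Sample mean minus population mean."""
    sample_mean = sum(sample) / len(sample)
    population_mean = sum(population) / len(population)
    return sample_mean - population_mean

population = [2, 4, 6, 8, 10]   # hypothetical population, mean = 6
sample = [2, 4, 6]              # one possible sample, mean = 4

print(sampling_error(sample, population))   # -2.0
```

A different sample from the same population would give a different error, which is why the discrepancy is described as naturally occurring.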
What are the strengths and weaknesses of the mode?
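Strengths: Can be used with any scale of measurement, including nominal data; not affected by extreme values.
Weaknesses: Ignores most of the data; a distribution can have several modes or none; tends to vary a lot from sample to sample.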
What characteristics of the sample statistics do we want?
r
What are the strengths and weaknesses of the median?
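Strengths: Not affected by outliers or skew; can be used with ordinal data.
Weaknesses: Uses only the rank order of the data, not the actual values; less useful than the mean as a basis for further statistical calculations.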
What are the strengths and weaknesses of mean?
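Strengths: Uses every data point; it is the basis for variance, standard deviation, and most inferential tests.
Weaknesses: Sensitive to outliers and skew; only meaningful for interval and ratio data.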
Why is the sample variance biased before correcting with “n-1”?
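Because the deviations are measured from the sample mean rather than the population mean. The sample mean is the value that makes the sum of squared deviations as small as possible for that sample, so dividing that sum by n systematically underestimates the population variance. Dividing by n - 1 corrects this.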
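The bias can be seen directly on a toy example. This sketch takes a small made-up population, lists every possible sample of size 2 drawn with replacement, and averages the sample variances computed each way. The population values and the sample size are assumptions chosen to keep the arithmetic exact.

```python
from itertools import product

population = [1, 2, 3, 4, 5]   # hypothetical population, mean = 3
n = 2                          # sample size

def sum_of_squares(values):
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

# Population variance divides by N.
sigma_squared = sum_of_squares(population) / len(population)

# Every possible ordered sample of size n, drawn with replacement (25 samples).
samples = list(product(population, repeat=n))

# Average the sample variance over all samples, dividing by n and by n - 1.
avg_dividing_by_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print(sigma_squared)               # 2.0
print(avg_dividing_by_n)           # 1.0  (too small on average: biased)
print(avg_dividing_by_n_minus_1)   # 2.0  (matches the population variance)
```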
What is the standard error of the mean? (SEM) How do you calculate it?
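The standard error of the mean is the standard deviation of the sampling distribution of the mean: it describes how far a sample mean typically falls from the population mean. It is calculated as the population standard deviation divided by the square root of the sample size (SEM = σ / √n); when σ is unknown, it is estimated with the sample standard deviation (s / √n).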
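A minimal sketch of the estimated SEM, assuming the population standard deviation is unknown and the sample standard deviation (n - 1 in the denominator) is used in its place. The function name and data are made up for illustration.

```python
import math
from statistics import stdev

def standard_error_of_mean(sample):
    """Estimated SEM: sample standard deviation divided by sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, n = 8

print(round(standard_error_of_mean(data), 3))   # 0.756
```

Because n is in the denominator, a larger sample with the same spread gives a smaller standard error.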