Research & Assessment Methods Flashcards
What are the 3 steps of the statistical process?
- Collect data (sampling, surveys)
- Describe & summarize data, look for patterns (descriptive statistics, exploratory data analysis)
- Interpret the data (inferential stats, statistical modeling)
Statistical Inference
testing theories/hypotheses about the data using probability theory, which allows us to draw a conclusion about a population from a sample
4 Sampling Methods:
- Sampling Frame
- Probability Sampling
- Non-Probability Sampling
- Implementation
Sampling Frame
the list or set of units, taken from the population of interest, from which the sample is actually selected (sampling method)
(eg. sampling within the frame of customers at a specific bookstore)
Probability Sampling
most sophisticated, rigorous, and defensible method of sampling
take a subset of a population in an organized manner:
-random (every unit has the same chance of being selected)
-systematic (eg. every 20th phone # in the phone book)
-stratified (eg. by age, education)
-cluster (eg. households that all live in the same neighborhood)
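The four probability sampling approaches above can be sketched in a few lines of Python. This is only an illustration: the `people` population, its `age_group` and `neighborhood` fields, and all function names are made up for the example, and the fixed seeds are there just to make the results repeatable.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Random: every unit has the same chance of being picked."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Systematic: take every step-th unit, beginning at index start."""
    return population[start::step]

def stratified_sample(population, key, n_per_stratum, seed=0):
    """Stratified: split into groups (strata), then sample within each group."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

def cluster_sample(population, key, n_clusters, seed=0):
    """Cluster: pick whole groups at random and keep everyone in them."""
    rng = random.Random(seed)
    clusters = {}
    for unit in population:
        clusters.setdefault(key(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]

# Hypothetical population: 100 people in 2 age groups and 5 neighborhoods.
people = [
    {
        "id": i,
        "age_group": "under_40" if i % 2 == 0 else "40_plus",
        "neighborhood": i // 20,
    }
    for i in range(100)
]

every_20th = systematic_sample(people, step=20, start=19)
by_age = stratified_sample(people, key=lambda p: p["age_group"], n_per_stratum=5)
two_areas = cluster_sample(people, key=lambda p: p["neighborhood"], n_clusters=2)
```

Note the difference between the last two: stratified sampling takes a few units from every group, while cluster sampling takes every unit from a few groups.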
Non-Probability Sampling
-Convenience Sampling (eg. snowball - you interview people & let them refer you to someone else)
-Volunteered (Volunteered Geographic Information -VGI)
Implementation (Sampling Method)
Mail, telephone, web, in-person
What are the 4 main scales of data measurement?
- Nominal Scale
- Ordinal Scale
- Interval Scale
- Ratio Scale (the gold standard)
What is Nominal Scale data measurement?
Categories with no inherent order; the labels are just names
What is Ordinal Scale data measurement?
Ordered categories, ranking only
What is Interval Scale data measurement?
Continuous, but only absolute differences are meaningful (there is no true zero, so ratios are not)
What is Ratio Scale data measurement?
The gold standard: both absolute and relative differences are meaningful (there is a true zero)
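A quick way to see the interval vs. ratio distinction is temperature, which is an example chosen here and not taken from the cards: Celsius is an interval scale (its zero is arbitrary), while Kelvin is a ratio scale (its zero is a true zero). The sketch below shows that differences agree across the two scales but ratios do not.

```python
def celsius_to_kelvin(celsius):
    """Shift from the Celsius scale to the Kelvin scale."""
    return celsius + 273.15

# Two illustrative temperatures (assumed values).
warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Absolute differences: the same on both scales.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Relative differences: "twice as hot" in Celsius is an artifact of where
# the Celsius zero sits; the Kelvin ratio is only slightly above 1.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```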
What are the 4 Types of Variables?
- Qualitative Variables
- Quantitative Variables
- Discrete Variables
- Continuous Variables
What are qualitative variables?
categories: nominal data, or ordinal data (categories with a ranking)
Qualitative research cannot provide a generalized understanding (such as a population trend), but it can provide a deeper understanding of a given topic.
What are quantitative variables?
interval or ratio scale
What are discrete variables?
only a countable set of separate values, often finite (eg. count of events, # of accidents on a certain street)
-special case is binary or dichotomous (where you have only two values eg. 0 or 1)
What are continuous variables?
infinite # of possible values: any value within a range (positive or negative)
In a survey design, what is sampling frame?
A sampling frame is the list or set of units, taken from the population of interest, from which the sample is drawn.
What are the differences between nominal, ordinal, and interval data?
Nominal data are categories (eg. ice cream type - think “name” = nominal)
Ordinal data are ranked (think “order” = ordinal)
Interval data are continuous (but only the differences between values have meaning)
What are discrete variables?
Discrete variables can only take a countable set of separate values (often a finite #).
A special case is a dichotomous variable, which can only take two values (often 0 and 1)
What is a distribution?
A statistical distribution describes how the values of a variable (field) are distributed. In other words, it shows which values are common and which are uncommon (which values are likely to be observed).
You can represent a distribution graphically or mathematically.
What is a histogram?
A graph that groups values into bins (ranges of values) and shows how many observations fall within each bin.
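The binning behind a histogram can be sketched as below. The function name, the data, and the choice of 5 equal-width bins from 0 to 10 are all assumptions made for illustration; plotting libraries make their own binning choices.

```python
def histogram_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # ignore values outside the plotted range
        # The top edge belongs to the last bin, so clamp the index.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Illustrative data (made up for this example).
values = [1, 2, 2, 3, 5, 6, 6, 6, 9, 10]
counts = histogram_counts(values, low=0, high=10, n_bins=5)

# Draw a rough text version of the bars, one row per bin.
for i, count in enumerate(counts):
    left = i * 2
    print(f"{left:>2}-{left + 2:<2} | {'#' * count}")
```

Each row is one bin, and the length of the bar is the number of observations in that range of values.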
What is a density curve?
A density curve is a graphical representation of a numerical distribution where the outcomes are continuous.
Ideally, you try to replace the discrete approximation (histogram) with a continuous representation. Density curves can be overlaid on a plot of binned (discrete) data.
What is a box plot/box and whisker graph?
Based on the ranking of the observations.
While a histogram is based on bins of values, a box plot is based on a ranking of the observations from low to high.
Quartiles: 25th percentile, median/mid-point/50th percentile, 75th percentile
What is the interquartile range?
a measure of spread shown in the box and whisker plot. It is the range from the 25th percentile to the 75th percentile (the two edges of the box, not the whiskers)
The box plot helps us to identify _________.
Outliers
Outliers are…
An outlier is an observation that is “extreme” / outside the reasonable range of the distribution.
In practice: an observation that is more than 2 standard deviations away from the mean, or an observation that falls outside the fences in a box plot.
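Both outlier rules can be sketched with Python's standard `statistics` module. The data are made up, and two choices here are assumptions that do not come from the cards: the fences are placed at 1.5 x IQR beyond the quartiles (a common convention), and quartiles use the "inclusive" method (other methods give slightly different cut points).

```python
import statistics

# Illustrative data with one suspiciously large value (made up).
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

# Quartiles: 25th percentile, median, 75th percentile.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the width of the box

# Box plot rule (assumed convention: fences at 1.5 x IQR past the box).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
fence_outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Standard deviation rule: more than 2 standard deviations from the mean.
mean = statistics.mean(data)
std_dev = statistics.stdev(data)  # sample standard deviation
sd_outliers = [x for x in data if abs(x - mean) > 2 * std_dev]

print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("outside the fences:", fence_outliers)
print("beyond 2 standard deviations:", sd_outliers)
```

The two rules agree on this small data set, but they will not always do so, since the mean and standard deviation are themselves pulled around by extreme values.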
What does a histogram show?
A histogram shows the distribution of a variable visualized as a bar chart.
What are the 2 hypotheses in a Hypothesis Test?
(inferential statistics)
- Null Hypothesis
- Alternative Hypothesis
What is a null hypothesis?
This is a reference statement that we typically want to reject.
Typically it’s a value, often 0.
What is the alternative hypothesis?
Its main purpose is to frame the evidence for rejecting the null hypothesis. You NEVER “accept” the alternative hypothesis; you use it to reject the null.
It is the research hypothesis, a statement one wants to find support for.
How do you reject the null hypothesis?
Find evidence in the data (a statistic like the average)
If we observe that the value of the statistic (eg. mean) is very far from the null, then we reject the null.
What is a Type I error?
Rejecting the null hypothesis when it is actually true (a false positive). The chance/probability of making this wrong decision is the significance level of the test.
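A small simulation can make the Type I error concrete. Everything specific here is an assumption for the sketch: the null is "the mean is 0", the data really do come from a normal distribution with mean 0 and standard deviation 1 (so the null is true), and the null is rejected when |z| > 1.96, which corresponds to a 5% significance level for a two-sided test.

```python
import math
import random

def type_one_error_rate(n_trials=2000, n=30, critical_z=1.96, seed=0):
    """Share of trials in which a TRUE null (mean = 0) gets rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # Draw a sample from a population where the null really holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        # z statistic for the sample mean with known standard deviation 1.
        z = (sample_mean - 0.0) / (1.0 / math.sqrt(n))
        if abs(z) > critical_z:
            rejections += 1  # wrong decision: the null was true
    return rejections / n_trials

rate = type_one_error_rate()
print(f"share of wrong rejections: {rate:.3f}")
```

Under these assumptions the printed share should land near 0.05, although the exact value depends on the seed and the number of trials.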
What are the 3 types of Test Statistics?
- Z-Score
- T-Test
- Chi-square test
What is z-score?
z = (x - mean) / standard deviation
subtract the mean from the observed value and divide by the standard deviation.
then compare the z-score to the standard normal distribution
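The two steps (standardize, then compare to the standard normal distribution) can be sketched with `statistics.NormalDist` from the Python standard library. The numbers used (an observed value of 130, mean 100, standard deviation 15) are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mean) / std_dev

# Illustrative values (assumed, not from the cards).
z = z_score(x=130, mean=100, std_dev=15)

# Compare to the standard normal distribution (mean 0, std dev 1).
standard_normal = NormalDist()
share_below = standard_normal.cdf(z)               # P(Z <= z)
two_sided_p = 2 * (1 - standard_normal.cdf(abs(z)))  # P(|Z| >= |z|)

print(f"z = {z:.2f}")
print(f"share of the distribution below z: {share_below:.4f}")
print(f"two-sided tail probability: {two_sided_p:.4f}")
```

A small two-sided tail probability means the observed value is far from the mean relative to the spread, which is the kind of evidence used to reject a null hypothesis.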