Research Methods Flashcards
Three steps of statistical process
(1) collect data (e.g., surveys), covered in Lesson 2; (2) describe and summarize the distribution of the values in the data set; (3) interpret by means of inferential statistics and statistical modeling, i.e., draw general conclusions
Quantitative Variables
variables where the actual numerical value is meaningful. Quantitative variables represent an interval or ratio measurement
Qualitative Variables
qualitative variables correspond to nominal or ordinal measurement (e.g., zoning classification)
Continuous variables
These can take an infinite number of values, both positive and negative, and with as fine a degree of precision as desired. Most measurements in the physical sciences yield continuous variables.
Discrete variables
can only take on distinct, separate values (a finite or countable set). An example is the count of the number of events, such as the number of accidents per month. Such counts cannot be negative.
dichotomous variables
can only take on two values, typically coded as 0 and 1
Descriptive Statistics
Describe the characteristics of the distribution of values in a population or in a sample.
Inferential Statistics
Use probability theory to determine characteristics of a population based on observations made on a sample from that population.
Normal distribution
Normal Distribution is a probability distribution that is symmetrical around the mean. It is bell shaped; when standardized to a mean of 0 and a variance of 1 it is called the standard normal distribution, and a value expressed in standard deviations from the mean is called a z-score.
The highest point of a normal curve lies at the center of the distribution, where the mean, median, and mode coincide.
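A minimal sketch of standardizing values, assuming the usual definition of a z-score (the value minus the mean, divided by the standard deviation). The data and the helper name z_scores are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist, mean, pstdev


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, standard deviation 2
z = z_scores(scores)

# Share of a standard normal distribution lying within one standard
# deviation of the mean (roughly 68%).
standard_normal = NormalDist(mu=0, sigma=1)
share_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
```

A z-score of 0 marks a value exactly at the mean; positive and negative z-scores mark values above and below it.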
Central tendency
is a typical or representative value for the distribution of observed values. There are several ways to measure central tendency, including mean, median, and mode.
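The three measures can be compared on a small data set. This sketch uses Python's standard statistics module; the values are hypothetical.

```python
from statistics import mean, median, mode

values = [1, 2, 2, 3, 4, 7, 9]  # hypothetical observations

average = mean(values)        # sum of the values divided by their count
middle = median(values)       # middle value once the data are ranked
most_common = mode(values)    # most frequently occurring value
```

Here the mean (4) is pulled above the median (3) by the two large values, which is typical of a right-skewed distribution.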
Two ways to measure spread around central tendency
Variance and standard deviation. They both are based on the squared difference from the mean, but the standard deviation is the square root of the variance.
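A short sketch of the relationship between the two measures, using the standard statistics module on hypothetical data. It assumes the common convention that a population variance divides by N while a sample variance divides by n - 1.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean 5

pop_var = pvariance(data)   # squared deviations sum to 32; divided by N = 8
pop_sd = pstdev(data)       # square root of the population variance

sample_var = variance(data)  # same sum of squares, divided by n - 1 = 7
sample_sd = stdev(data)      # square root of the sample variance

# The standard deviation is always the square root of the variance.
assert math.isclose(pop_sd, math.sqrt(pop_var))
assert math.isclose(sample_sd, math.sqrt(sample_var))
```

Because the standard deviation is in the same units as the data, it is usually easier to interpret than the variance.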
Coefficient of Variation
measures the relative dispersion from the mean by taking the standard deviation and dividing by the mean.
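As a rough illustration, the function below divides the standard deviation by the mean. The name coefficient_of_variation and both data sets are hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (undefined when the mean is 0)."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("coefficient of variation is undefined for a mean of 0")
    return pstdev(values) / mu


# Two hypothetical data sets with the same spread but different means.
small_scale = [2, 4, 4, 4, 5, 5, 7, 9]          # mean 5, standard deviation 2
large_scale = [v + 95 for v in small_scale]     # mean 100, standard deviation 2

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)
```

Both sets have the same standard deviation, but the dispersion is much larger relative to the mean in the first set (0.4 versus 0.02).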
inter-quartile range
This is the difference in value between the 75th percentile and the 25th percentile, i.e., the 3/4 cut-off value and the 1/4 cut-off value in a set of ranked values.
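A minimal sketch using statistics.quantiles from the standard library. Quartile conventions differ between textbooks and software; the "inclusive" method used here is one assumed choice, and the data are hypothetical.

```python
from statistics import quantiles

ranked = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical ranked values

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(ranked, n=4, method="inclusive")

iqr = q3 - q1  # spread of the middle half of the data
```

With these values the 25th percentile is 3, the 75th percentile is 7, and the inter-quartile range is 4.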
confidence interval
This is a range around the sample statistic that is expected to contain the population parameter with a given level of confidence, typically 95% or 99%.
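One common way to build such a range for a mean is sketched below. It assumes a reasonably large sample so that the normal approximation applies (small samples usually call for the t distribution instead); the function name mean_confidence_interval and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Hypothetical sample of 70 observations centered on 50.
sample = [50 + (i % 7) - 3 for i in range(70)]

low95, high95 = mean_confidence_interval(sample, confidence=0.95)
low99, high99 = mean_confidence_interval(sample, confidence=0.99)
```

Asking for more confidence widens the range: the 99% interval always contains the 95% interval computed from the same sample.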
Variance
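A variance is a measure of dispersion around the mean. It is calculated as the average of the squared deviations from the mean, i.e., the sum of the squared deviations divided by the number of observations (or by one less than that number for a sample).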
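Written out, and assuming the usual notation (population mean \mu and size N, sample mean \bar{x} and size n), the population and sample variances are:

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2
```

For example, the values 2, 4, 4, 4, 5, 5, 7, 9 have a mean of 5; their squared deviations sum to 32, so treated as a population the variance is 32 / 8 = 4.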
cluster sampling
In cluster sampling, researchers divide a population into smaller groups known as clusters. They then randomly select among these clusters to form a sample.
Cluster sampling is a method of probability sampling that is often used to study large populations.
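A minimal sketch of one-stage cluster sampling, in which whole clusters are drawn at random and every member of a chosen cluster enters the sample. The function name cluster_sample and the school data are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters; keep every member of each chosen cluster."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}


# Hypothetical population already grouped into clusters (schools).
schools = {
    "North": ["Ana", "Ben", "Cy"],
    "South": ["Dee", "Eli"],
    "East": ["Fay", "Gus", "Hal", "Ida"],
    "West": ["Jo", "Kim"],
}

chosen = cluster_sample(schools, n_clusters=2, seed=42)
```

The randomness applies to the clusters, not to individuals, which is what distinguishes this from simple random sampling.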
simple random sampling
A simple random sample takes a small, random portion of the entire population to represent the whole, with each member having an equal probability of being chosen.
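In code, this can be sketched with random.sample from the standard library, which draws without replacement and gives every member the same chance of selection. The population of ID numbers and the function name are hypothetical.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members, each with equal probability of selection."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    return rng.sample(population, k)


population = list(range(1, 101))  # hypothetical ID numbers 1..100
srs = simple_random_sample(population, k=10, seed=1)
```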
stratified random sampling
Stratified random sampling is a method of sampling that involves the division of a population into smaller sub-groups known as strata. In stratified random sampling, or stratification, the strata are formed based on members’ shared attributes or characteristics such as income or educational attainment.
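A rough sketch of stratified sampling with proportional allocation, meaning the same fraction is drawn at random from every stratum. Proportional allocation is an assumed choice here; the function name stratified_sample and the income data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=None):
    """Group records into strata, then draw the same fraction from each."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 60 low-income and 40 high-income households.
households = [("low", i) for i in range(60)] + [("high", i) for i in range(40)]

picked = stratified_sample(households, stratum_of=lambda h: h[0], fraction=0.1, seed=7)
low_count = sum(1 for h in picked if h[0] == "low")
high_count = sum(1 for h in picked if h[0] == "high")
```

The sample keeps the 60/40 split of the population, which a simple random sample of the same size would only match on average.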
systematic random sampling
Systematic sampling is a probability sampling method in which sample members are selected from a larger population at a fixed periodic interval, starting from a random point. It carries a low probability of contaminating data.
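A minimal sketch, assuming the population is available as an ordered list: pick a random starting position within the first interval, then take every k-th member after it. The function name and data are hypothetical.

```python
import random


def systematic_sample(population, interval, seed=None):
    """Take every interval-th member, beginning at a random start."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    start = rng.randrange(interval)  # random start in 0..interval-1
    return population[start::interval]


roster = list(range(100))  # hypothetical ordered list of 100 members
every_tenth = systematic_sample(roster, interval=10, seed=3)
```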
positive correlation
high values of one variable match high values of the other, and low values match low values
negative correlation
high values of one variable match low values of the other, and vice versa
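The direction of a relationship is often summarized with the Pearson correlation coefficient, which ranges from -1 to +1. The coefficient is not named in these cards, so treat it as one assumed way to quantify positive and negative correlation; the function name pearson_r and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)


# Hypothetical data for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 65, 70, 79]
absences = [9, 7, 6, 3, 1]

r_positive = pearson_r(hours_studied, exam_scores)  # high matches high
r_negative = pearson_r(absences, exam_scores)       # high matches low
```

A coefficient above 0 indicates a positive correlation and a coefficient below 0 a negative one; values near 0 indicate little linear relationship.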