Vocab / Key Terminology Flashcards
Comparative Analysis
analyzing data from different settings or groups at the same point in time, OR from the same settings or groups over a period of time, to find similarities/differences
Discourse Analysis
this is the “theory” stuff: semiotics, deconstruction, narrative analysis, etc. Studying the way versions of the world (society, events, psyche) are produced in language and discourse within various forms of knowledge/power
Ethnography
about observing/interviewing people in their “naturally occurring settings” (researcher is present in these settings with subjects of the research)
Grounded Theory
“inductive” form of qualitative research → data collection + analysis are conducted together. You don’t go in with any preconceived hypothesis about the outcome, and are not concerned with validation or description. Instead, you allow the data you collect to guide your analysis and theory creation.
Narrative Analysis
qualitative research approach whereby the researcher analyzes the stories people create, to understand the meaning of events in a person’s life. Respondents give detailed accounts of their experiences and stories, rather than answer a predetermined list of questions.
Statistical Process
(1) Collect data; (2) Describe and summarize; (3) Interpret
Types of Measurement: Nominal Data
mutually exclusive groups or categories that lack intrinsic order; for example, zoning classifications or social security numbers
Types of Measurement: Ordinal Data
ordered categories implying a ranking of the observations; the values themselves are meaningless, only the rank counts; for example, letter grades or response scales on a survey
Types of Measurement: Interval Data
an ordered relationship where the difference between values on the scale has a meaningful interpretation, but there is no true zero; for example, temperature in Celsius or Fahrenheit
Types of Measurement: Ratio Data
the gold standard for measurement; both absolute and relative differences have meaning; for example, distance
Types of Variables: Quantitative
represents an interval or ratio measurement
Types of Variables: Qualitative
represents a nominal or ordinal measurement
Types of Variables: Continuous
can take an infinite number of values, positive or negative, and with as much precision as desired
Types of Variables: Discrete
can take only a countable number of distinct values, with nothing in between; for example, whole-number counts
Types of Variables: Binary/Dichotomous
a special case of discrete variables; can only take on two values typically coded as 0 and 1
Statistical Concepts: Descriptive Statistics
describe the characteristics of the distribution of values in a population or a sample
Statistical Concepts: Inferential Statistics
use probability theory to determine the characteristics of a population based on observations made on a sample of the population
Distribution: Range
the difference between the largest and smallest value
Distribution: Symmetric
where the shape below the mean mirrors the shape above it, so an equal number of observations are below and above the mean
Distribution: Skew
an asymmetrical distribution where there are more observations either above or below the mean
Distribution: Normal/Gaussian
the gold standard in statistical analysis, the bell curve; symmetric distribution where the spread around the mean can be related to the proportion of observations
Basic Descriptive Statistics: Central tendency
a typical or representative value for the distribution of observed values
Mean
the average of a distribution; appropriate for interval- and ratio-scaled data, not for ordinal or nominal data
Weighted mean
a mean in which greater importance (weight) is placed on specific entries; used when some entries matter more or when each value stands for a group of observations
Population weighted mean
when computing a mean value across multiple countries, the value of each country is multiplied by its population, and the sum of those products is divided by the total population
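A minimal sketch of the population weighted mean, compared with the simple mean. The function name `weighted_mean`, the indicator values, and the populations are all made up for illustration.

```python
def weighted_mean(values, weights):
    """Return sum(weight * value) / sum(weights) for paired lists."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: one indicator value and one population per country.
indicator = [10.0, 20.0, 30.0]
population = [1_000_000, 3_000_000, 6_000_000]

simple = sum(indicator) / len(indicator)         # every country counts equally
weighted = weighted_mean(indicator, population)  # large countries count more

print(simple, weighted)  # 20.0 25.0
```

The weighted result is pulled toward the value of the most populous country.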
Median
the middle value of a ranked distribution
Mode
the most frequent number in a distribution; there can be more than one
Basic Descriptive Statistics: Central tendency: Symmetry
mean and median are affected by the symmetry of the distribution; very close if symmetric; different if skewed
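A small sketch of how the mean and median agree on symmetric data and separate on skewed data, using Python's standard `statistics` module. The three lists are made-up examples.

```python
import statistics

# Made-up examples: the second list replaces the top value with a large one.
symmetric = [1, 2, 3, 4, 5]
skewed = [1, 2, 3, 4, 50]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    print(name, "mean:", statistics.mean(data), "median:", statistics.median(data))

# A distribution can have more than one mode; multimode returns all of them.
two_modes = [1, 2, 2, 3, 3]
print(statistics.multimode(two_modes))  # [2, 3]
```

In the skewed list the single large value drags the mean up to 12 while the median stays at 3.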
Dispersion
characterizes how values are spread around the central tendency
Variance
the average squared difference from the mean; large variance means a greater spread or flatter distribution; small variance means a narrower spread or a spikier distribution
Formula - compute (value - mean)^2 for each value, then average all of those squared differences together
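The calculation on this card can be sketched step by step. This version divides by the number of values, so it is the population variance; the function name and data are illustrative only.

```python
def population_variance(values):
    """Average of the squared differences from the mean."""
    mean = sum(values) / len(values)
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sum(squared_diffs) / len(squared_diffs)


# Made-up data with mean 5; the squared differences sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))  # 4.0
```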
Standard deviation
the square root of the variance; in a normal distribution about 95% of the values fall within 2 standard deviations of the mean; the symbol is σ (lowercase sigma, a little o with a tail to the top right)
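A quick check of both claims on this card, using the standard library's `statistics.pstdev` and `statistics.NormalDist` (Python 3.8+). The data list is made up.

```python
import statistics
from statistics import NormalDist

# Made-up data whose population variance is 4, so the standard deviation is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
std_dev = statistics.pstdev(data)
print(std_dev)  # 2.0

# Share of a normal distribution lying within 2 standard deviations of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
share_within_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(share_within_2, 4))
```

The share comes out slightly above 95%, which is why the figure is usually quoted as approximate.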
Degree of freedom correction
necessary for finding the variance and standard deviation of a sample, because the mean itself is estimated from the sample; when averaging the squared differences, divide the sum by the number of observations minus one (n - 1) instead of by n
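A short sketch of the correction, comparing division by n with division by n - 1 on made-up sample data. `statistics.variance` from the standard library applies the n - 1 version and is used here only as a cross-check.

```python
import statistics

# Made-up sample; its mean is 5 and the squared differences sum to 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
mean = sum(sample) / n
sum_squared_diffs = sum((x - mean) ** 2 for x in sample)

uncorrected = sum_squared_diffs / n      # treats the data as a whole population
corrected = sum_squared_diffs / (n - 1)  # degrees-of-freedom correction

print(uncorrected, corrected)
print(statistics.variance(sample))       # the library's sample variance
```

The corrected value is always a little larger, and the gap shrinks as the sample grows.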
Outliers
in a normal distribution, values that fall more than two standard deviations above or below the mean (a common rule of thumb)
Coefficient of variation
measures the relative dispersion from the mean by taking the standard deviation and dividing by the mean
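A minimal sketch of the coefficient of variation. It assumes the population standard deviation (`statistics.pstdev`) and a non-zero mean; the function name and both data series are made up.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population version assumed)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a mean of 0")
    return statistics.pstdev(values) / mean


# Two made-up series with the same spread (std dev 2) but different means.
small_scale = [2, 4, 4, 4, 5, 5, 7, 9]           # mean 5
large_scale = [97, 99, 99, 99, 100, 100, 102, 104]  # mean 100

print(coefficient_of_variation(small_scale))
print(coefficient_of_variation(large_scale))
```

The same absolute spread is a much smaller relative spread when the mean is large, which is what the measure is meant to capture.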
Z-score
a standardization of the original value by subtracting the mean and dividing by the standard deviation; once all values are standardized, the mean of the group is 0 and the variance and standard deviation are 1; transforms all values into standard deviation units. Example: a z-score above 2 or below -2 means an observation is more than 2 standard deviations away from the mean, an outlier
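A sketch of standardizing a made-up data set and flagging values more than 2 standard deviations from the mean. It assumes the population standard deviation; `z_scores` is an illustrative name.

```python
import statistics


def z_scores(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    std_dev = statistics.pstdev(values)
    return [(v - mean) / std_dev for v in values]


# Made-up data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
standardized = z_scores(data)

# After standardizing, the mean is 0 and the standard deviation is 1
# (up to floating-point rounding).
print(round(statistics.mean(standardized), 10), round(statistics.pstdev(standardized), 10))

# Flag observations whose z-score is beyond 2 in either direction.
flagged = [v for v, z in zip(data, standardized) if abs(z) > 2]
print(flagged)  # [30]
```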
Inter-quartile range (IQR)
an alternate measure of dispersion; the difference in value between the 75th percentile and the 25th percentile in a set of ranked values; forms the basis of an alternate concept of outliers
Inter-quartile range (IQR): Fences
the lower fence is the 25th percentile value minus 1.5 times the IQR and the upper fence is the 75th percentile value plus 1.5 times the IQR; values outside the fences are treated as outliers
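A sketch of the fence calculation on made-up data. Percentiles can be computed under several conventions; this version assumes the "inclusive" method of `statistics.quantiles` (Python 3.8+), so other tools may give slightly different quartiles. `iqr_fences` is an illustrative name.

```python
import statistics


def iqr_fences(values, multiplier=1.5):
    """Return (lower fence, upper fence) based on the 25th and 75th percentiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Made-up data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
lower_fence, upper_fence = iqr_fences(data)

# Values outside the fences are the IQR-based outliers.
outliers = [v for v in data if v < lower_fence or v > upper_fence]
print(lower_fence, upper_fence, outliers)  # -3.5 14.5 [30]
```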
Inter-quartile range (IQR): Box/Whisker plots
visualization summarizing a set of data; the shape of the boxplot shows how the data is distributed and any outliers; useful way to compare different sets of data as you can draw more than one boxplot per graph
Statistical Inference
the process of drawing conclusions about the characteristics of a distribution from a sample of data
Hypothesis test
finding evidence in the data to reject the null hypothesis statement in the direction of the alternative hypothesis; statistical evidence only provides support to reject the null hypothesis, never to accept the alternative hypothesis
Null hypothesis
the point of departure or reference; typically consists of setting characteristics of the distribution, such as the mean, equal to a given value, often zero
Alternative hypothesis
the research hypothesis; the claim the researcher hopes to support by rejecting the null hypothesis
Two-sided - differences in both directions are considered
One-sided - only differences in one direction are considered, i.e. only larger or smaller than, but not both
Test statistic
provides a way to operationalize a hypothesis test
Sampling error - the random variation that arises because a sample does not contain all the information in the population, so any statistic computed from the sample will not be identical to the population statistic; the sampling distribution describes how that statistic varies from sample to sample
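The cards do not name a specific test statistic. As one common example, and purely as an assumption, the sketch below uses the one-sample t statistic: the gap between the sample mean and the null value, divided by the estimated sampling variation of the mean. The population, the seed, and the function name `one_sample_t` are made up for illustration.

```python
import math
import random
import statistics


def one_sample_t(sample, null_mean):
    """t = (sample mean - null mean) / (sample std dev / sqrt(n))."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)  # stdev uses n - 1
    return (statistics.mean(sample) - null_mean) / standard_error


# Hypothetical population with a known mean, so sampling error can be seen.
rng = random.Random(42)  # fixed seed so repeated runs give the same draws
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Each sample mean misses the population mean by a different random amount.
for _ in range(3):
    sample = rng.sample(population, 30)
    error = statistics.mean(sample) - population_mean
    print(round(error, 2), round(one_sample_t(sample, population_mean), 2))

# Small made-up sample tested against a null hypothesis of mean = 0.
fixed_sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_value = one_sample_t(fixed_sample, null_mean=0)
print(round(t_value, 3))
```

A t value far from 0 means the sample mean is far from the null value relative to what sampling error alone would typically produce.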
Systematic error
error caused by model misspecification, which occurs when the model or its assumptions are wrong