Stats Flashcards
What should a conceptual definition have?
1) key characteristics of the concept
2) empirical referents to which the concept refers
3) level of abstraction at which the concept is operating (how general or specific a concept is)
What is the operational definition?
statement that describes how a concept will be measured
What is a variable/indicator?
a set of observations that results from applying the operational definition
variable = concept + error
Which two errors happen in measurement?
Systematic Error (Validity)
Random Error (Reliability)
Where does the systematic error come from?
From influences other than the desired concept
Where does the random error come from?
From non-systematic influences on the observed measure
What does the random error imply?
Implies that if another researcher used the same measurement
technique, they would not get the same result - this makes the
measure unreliable.
What happens if a systematic error occurs?
If the measure is capturing other influences, it will systematically
over- or underestimate the true concept
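The two kinds of error can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true value of 50, the bias of 10, the noise spread of 5 and the helper `measure` are all hypothetical and chosen for illustration.

```python
import random
import statistics

# Hypothetical setup: the true value of the concept is 50 for every unit.
TRUE_VALUE = 50.0
rng = random.Random(42)  # fixed seed so the run is repeatable


def measure(bias, noise_sd, n=10_000):
    """Return n observations of the form: true value + systematic bias + random noise."""
    return [TRUE_VALUE + bias + rng.gauss(0, noise_sd) for _ in range(n)]


# Random error only: single measurements disagree, but the average is on target.
random_only = measure(bias=0.0, noise_sd=5.0)
# Systematic error only: every measurement agrees, but all are too high.
systematic = measure(bias=10.0, noise_sd=0.0)

mean_random_only = statistics.mean(random_only)
mean_systematic = statistics.mean(systematic)
spread_random_only = statistics.stdev(random_only)
spread_systematic = statistics.stdev(systematic)

print("random error:     mean", round(mean_random_only, 2), "spread", round(spread_random_only, 2))
print("systematic error: mean", round(mean_systematic, 2), "spread", round(spread_systematic, 2))
```

In this sketch the measure with random error is unreliable (repeated measurements differ) but centred on the true value, while the measure with systematic error is perfectly repeatable yet overestimates the concept every time.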
What are the two types of validity?
Face Validity - at first glance it appears to be valid
Construct validity - correlates strongly with other measures that are accepted as valid; comes from the concept's relationship with other measures
Example for concept, conceptual definition, operationalization
Concept - religiosity
Conceptual definition - degree to which an individual adheres to the tenets of their religion
Operationalization - Survey individuals on their religiosity
What are the major challenges to reliability?
Subjectivity - measurement relies on the judgment of the measurer
Lack of precision - Too much uncertainty to replicate
What are the three different levels of measurement?
Nominal - categories with no order
Ordinal - categories with order
Interval - numerical values
What does validity in measurement mean?
Whether one is over- or underestimating the concept
NOT THE SAME AS VALIDITY IN RESEARCH DESIGN
What are the two types of research design?
Exploratory - form theories and identify variables and relationships that need to be tested separately
Confirmatory - hypothesis testing
What are the two types of variable?
Dependent variable - what you are trying to explain
Independent variable - what explains or impacts the dependent variable
What is a hypothesis?
An explicit statement to be tested or examined with actual data. Mostly about the expected relationship between DV and IV
What are the characteristics of a hypothesis?
1) should be as specific as possible
2) clear direction of effect
3) must be falsifiable
4) must be empirically testable
What is a theory?
A set of propositions - some of which are testable as hypotheses - intended to explain an outcome
What is an assumption?
A proposition in a theory that is not testable.
What is the simplified, testable form of a theory called?
Model
What is the problem with causality?
It is impossible to prove, but can be argued
What is necessary to convincingly argue for causality?
Time order - Cause must precede effect
Covariation - Changes in the IV must be associated with changes in the DV
Non-spuriousness - relationship should not be driven by a third variable (or by time)
Theoretical consistency - a theoretical argument for the causal relationship.
What is the best way to attain non-spuriousness?
Experiments
What is HARKing?
The practice of hypothesising after the results are known –> NOT GOOD
- write down your hypothesis before looking at the results
What is a research design?
Set of procedures for testing a hypothesis –> i.e. determining the effects of the IV on the DV
What is an “effect”?
Dependent variable before –> change in IV happens –> Dependent variable after
What are the broad types of research designs?
True Experiments - treatment and control group; variables are measured before and after; researcher controls the environment
Quasi-Experiments - researcher does not control the environment; observational designs; natural experiments
What are the two types of validity for research designs?
Internal Validity - Within the study, are alternative explanations possible? Confounding variables
External Validity - Beyond the study, to what extent can the results be inferred to hold true?
How do the types of research design perform in terms of validity?
True experiments - High internal, low external
Observational/Quasi-Experiments - Low internal, high external
What are common threats to internal validity?
Omitted variables (Spuriousness)
Regression to the mean: Observations with extreme scores will tend to display less extreme scores (closer to the mean) the next time they are measured, when the sample is homogeneous.
Non-random sample selection: Difference in comparison groups is not due to treatment, but to the fact that the groups were different from the start.
▶ e.g.: Selecting on the dependent variable
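Regression to the mean, one of the threats listed above, can be seen in a short simulation. This is a minimal sketch under an assumed model in which every score is a stable ability plus independent luck; the numbers (mean 100, spread 10, top 10%) are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed model: score = stable ability + independent luck on the day.
abilities = [rng.gauss(100, 10) for _ in range(10_000)]
test1 = [a + rng.gauss(0, 10) for a in abilities]
test2 = [a + rng.gauss(0, 10) for a in abilities]

# Select the group with extreme scores on the first test (top 10%).
cutoff = sorted(test1)[int(0.9 * len(test1))]
top = [i for i, score in enumerate(test1) if score >= cutoff]

top_mean_test1 = statistics.mean(test1[i] for i in top)
top_mean_test2 = statistics.mean(test2[i] for i in top)
overall_mean_test2 = statistics.mean(test2)

print("top group, first test: ", round(top_mean_test1, 1))
print("top group, second test:", round(top_mean_test2, 1))
print("everyone, second test: ", round(overall_mean_test2, 1))
```

No treatment is applied between the two tests, yet the selected group scores lower the second time. A study that picked this group and then treated it could mistake that drop for a treatment effect.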
What are the common threats to external validity?
Selection - Groups not representative of the larger population (not randomized)
Out of Sample Extrapolation - the real-world setting lies outside the range of the independent variable covered by the study
e.g. a study administers 20 mg dosages while clinicians in real life always administer 100 mg
What are the two types of statistics?
Descriptive - describe a sample
Inferential statistics - draw inferences to a larger population
What is a distribution?
the way in which observations are
spread over possible values
What are the main features of a histogram?
- visualises the distribution of one numerical variable
- each bin covers a range of values and its bar shows how many observations fall in that range; the picture depends on the size of the bins
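How bin size changes a histogram can be sketched by counting the same observations with two different bin widths. The helper `bin_counts` and the sample values are hypothetical, made up for illustration.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count observations per bin; each bin is labelled by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


values = [1, 2, 2, 3, 7, 8, 12]

wide = bin_counts(values, 5)    # bins 0-4, 5-9, 10-14
narrow = bin_counts(values, 2)  # bins 0-1, 2-3, 4-5, ...

# Print a rough text histogram for the wide bins.
for lower_edge, count in wide.items():
    print(f"{lower_edge:>2}+ {'#' * count}")
```

With a width of 5 the data fall into three bins, while a width of 2 spreads the same observations over five occupied bins, so the apparent shape of the distribution depends on the bin size chosen.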
What are the measures of central tendency?
Mode - most frequent value
Median - the middle value
Mean - “average” value
What measures apply to which level of measurement?
▶ Nominal – Mode
▶ Ordinal – Mode & Median
▶ Interval – Mode, Median & Mean
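The three measures can be computed with Python's standard `statistics` module. The sample data below are made up for illustration.

```python
import statistics

# Interval-level data: all three measures are meaningful.
scores = [1, 2, 2, 3, 7]
score_mode = statistics.mode(scores)      # most frequent value
score_median = statistics.median(scores)  # middle value of the sorted data
score_mean = statistics.mean(scores)      # sum divided by the number of values

# Nominal data: only the mode makes sense, since categories have no order.
colours = ["red", "blue", "red", "green"]
colour_mode = statistics.mode(colours)

print(score_mode, score_median, score_mean, colour_mode)
```

Note how the single large value 7 pulls the mean above the median, while the mode and median are unaffected by it.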
What is the formula for variance?
Variance of a sample is given by the formula:
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
where \bar{x} is the sample mean and n is the number of observations
What is the standard deviation?
Square root of variance;
small std dev. –> data points tend to be very close to the mean
large std. dev. –> greater dispersion and indicates that data points are spread out over a wide range
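As a worked example, the sample variance (sum of squared deviations from the mean, divided by n - 1) and the standard deviation (its square root) can be computed by hand and checked against Python's `statistics` module. The data and the helper name `sample_variance` are made up for illustration.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


variance = sample_variance(data)  # mean is 5, squared deviations sum to 32, so 32 / 7
std_dev = math.sqrt(variance)

# The standard library uses the same n - 1 definition for samples.
print(variance, statistics.variance(data))
print(std_dev, statistics.stdev(data))
```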
When is something skewed in which direction?
mean higher than mode/median - right-skewed
mean lower than mode/median - left-skewed
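The rule of thumb above can be sketched as a simple comparison of mean and median. The function `skew_direction`, its labels and the sample data are hypothetical; this is a rough heuristic, not a formal skewness statistic.

```python
import statistics


def skew_direction(values):
    """Compare mean and median as a rough indicator of skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "roughly symmetric"


incomes = [1, 2, 2, 3, 3, 3, 4, 50]  # one very large value drags the mean up
mirrored = [-x for x in incomes]     # same shape flipped, long tail on the left
balanced = [1, 2, 3, 4, 5]

print(skew_direction(incomes))
print(skew_direction(mirrored))
print(skew_direction(balanced))
```

In `incomes` the mean is 8.5 while the median is 3, because the single value of 50 pulls the mean toward the long right tail.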
What do you do with too many extreme values?
Log it –> only for positively skewed distributions
or recode it into ordinal categories
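Both options can be sketched in a few lines. The values, the cut-offs and the helper `recode` are hypothetical; note that a logarithm is only defined for strictly positive values.

```python
import math

# Made-up positively skewed data with one extreme value.
values = [1, 10, 100, 1_000, 100_000]

# Option 1: take the log, which compresses the large values.
logged = [math.log10(v) for v in values]


# Option 2: recode into ordinal categories using assumed cut-offs.
def recode(value):
    """Map a raw value to an ordered category (cut-offs are illustrative)."""
    if value < 100:
        return "low"
    if value < 10_000:
        return "medium"
    return "high"


recoded = [recode(v) for v in values]

print(logged)
print(recoded)
```

After the log transform the extreme value 100,000 sits only a few units from the rest instead of dominating the scale, and after recoding it simply lands in the highest category.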