What is data Flashcards
What is a heuristic
A mental shortcut for drawing a conclusion that reduces effort and simplifies a complex or difficult problem. Heuristics are usually rule-of-thumb methods that are not optimal but are often good enough. They can lead to cognitive biases.
What is statistical thinking
Statistical thinking provides us with the tools to understand the world more accurately and to overcome the biases of human judgment, such as those produced by the availability heuristic.
What can statistics do for us?
Describe phenomena in a simplified way that’s easy to understand.
Decide what to do based on data, especially under uncertainty. Determine how much of a result is due to chance.
Predict new situations based on data from previous situations.
Learning from data
Previous research>
form hypothesis>
does the data support the hypothesis
Data can be used to update beliefs and prior knowledge.
Aggregation
Summarises raw data in a simple, easy-to-read format (e.g. a graph).
Uncertainty
Estimates drawn from tests and data are always uncertain. Data can support a hypothesis but can never prove it.
Sampling from a population
Samples must be representative of the population. Larger samples are generally more precise.
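The claim that larger samples are generally more precise can be checked with a small simulation. The sketch below is illustrative only: the population, its mean and spread, and the function name `spread_of_sample_means` are all assumptions made for the example. It draws many random samples of a given size and measures how much their means vary.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, n_samples=500, seed=0):
    """Standard deviation of the means of many random samples of one size."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

# Hypothetical population of 10,000 scores (mean about 100, SD about 15).
pop_rng = random.Random(42)
population = [pop_rng.gauss(100, 15) for _ in range(10_000)]

small = spread_of_sample_means(population, sample_size=10)
large = spread_of_sample_means(population, sample_size=400)
print(f"spread of sample means, n=10:  {small:.2f}")
print(f"spread of sample means, n=400: {large:.2f}")
```

The means of the larger samples cluster much more tightly around the population mean, which is what "more precise" means here. Precision is not the same as representativeness: a large but biased sample is still biased.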
Causality and statistics
Proceed with caution if you are inferring causation. You typically need an experimental design. Even then, be cautious. If you are observing, try to use terms such as association and relationship. Correlation does not necessarily mean causation.
Generalisability
Whether research results from the sample can be applied to the entire population of interest and across time. Findings must be valid. Also called external validity.
Randomised controlled trial
A sample is split into a treatment/experimental group (which receives the treatment, i.e. the independent variable) and a control group (which does not receive it). Individuals must be assigned to groups randomly; otherwise the groups may differ from each other in attitudes and other factors.
Randomising a sample provides some confidence that no factors will confound the treatment effect. Researchers often try to address these confounds using statistical analyses, but controlling for them can be very difficult.
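Random assignment can be sketched in a few lines. This is a minimal illustration, not a full trial design: the participant IDs and the function name `randomly_assign` are hypothetical, and the split is a simple half-and-half.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

participants = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical IDs
groups = randomly_assign(participants, seed=1)
print("treatment:", groups["treatment"])
print("control:  ", groups["control"])
```

Because chance alone decides who lands in which group, attitudes and other factors should be spread roughly evenly across the two groups on average.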
Quantitative data
Data measured with a numerical value.
Qualitative data
Data measured with no numerical value. Descriptive.
Binary numbers
Zeros or ones to represent true or false (logical values), or present or absent. Discrete.
Integers
Whole numbers with no fractional or decimal part. Discrete.
Real numbers
Numbers with fractions or decimal parts. Continuous.
Discrete measurement
Takes on one of a finite set of values. It may be qualitative or quantitative. There are no decimal or fractional values.
Continuous measurement
Defined in terms of a real number.
Construct
An unobservable theoretical concept. Not a physical feature. Impossible to measure without some error. Reduce error of measure by improving the quality of the measurement or by averaging over a larger number of individual measurements.
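The effect of averaging over more measurements can be illustrated with a simulation. The sketch assumes a simple model in which each measurement is a true score plus random noise; the true score, the noise level, and the function names are hypothetical.

```python
import random
import statistics

TRUE_SCORE = 50.0  # hypothetical true value of the construct
NOISE_SD = 5.0     # hypothetical size of the random measurement error

def measure(rng):
    """One noisy measurement: the true score plus random error."""
    return TRUE_SCORE + rng.gauss(0, NOISE_SD)

def typical_error(n_measurements, n_repeats=1000, seed=0):
    """Mean absolute error of the average of n_measurements noisy readings."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_repeats):
        readings = [measure(rng) for _ in range(n_measurements)]
        errors.append(abs(statistics.mean(readings) - TRUE_SCORE))
    return statistics.mean(errors)

error_single = typical_error(1)
error_averaged = typical_error(25)
print(f"typical error of a single measurement: {error_single:.2f}")
print(f"typical error of the mean of 25:       {error_averaged:.2f}")
```

Averaging only helps with random error. A measure that is systematically off stays off no matter how many readings are averaged.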
Reliability
The consistency of the measurement.
Test-retest reliability
Test-retest reliability measures the consistency of the result if performed twice.
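One common way to quantify test-retest reliability is the correlation between the two sets of scores. The sketch below assumes that approach; the scores are made-up data and `pearson_r` is a helper written for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for the same six people on two occasions.
first_test = [12, 15, 9, 20, 17, 11]
second_test = [13, 14, 10, 19, 18, 10]

r = pearson_r(first_test, second_test)
print(f"test-retest correlation: {r:.2f}")
```

A correlation close to 1 suggests that people keep roughly the same relative position across the two occasions, which is what a consistent measurement should produce.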
Inter-rater reliability
Inter-rater reliability is the consistency between multiple raters or judges of the results (eliminate subjective opinion).
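For categorical ratings, agreement between two raters is often summarised as simple percent agreement or as Cohen's kappa, which adjusts for the agreement expected by chance. Neither statistic is named in these cards, so treat this as one possible illustration; the ratings are made-up data.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same label."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance: (observed - expected) / (1 - expected).

    Undefined (division by zero) when chance agreement is already 1,
    i.e. when both raters use a single label for every item.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of eight items by two judges.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]

print(f"percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")
```

Here the raters agree on 6 of 8 items (0.75), but half of that agreement would be expected by chance, so kappa comes out at 0.50.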
Validity
The extent to which the measurement measures what it is supposed to measure.
Internal validity
Internal validity refers to the extent to which you are able to draw the correct conclusions about the causal relationships between variables. Can tell which factor is the cause within an experiment.
Face validity
Does the measurement appear to be valid or appropriate for the variable?
Construct validity
Is the measurement related to other measurements in an appropriate way? Measuring what you want to measure.
Convergent validity means that the measurement should be closely related to other measures that are thought to reflect the same construct.
Divergent validity means measurements that reflect different constructs should be unrelated.
Predictive validity
Valid measurements should be predictive of other outcomes.
Possible features of a variable
Identity: each value of the variable has a unique meaning.
Magnitude: The values of the variable reflect different magnitudes and have an ordered relationship to one another – that is, some values are larger and some are smaller.
Equal intervals: Units along the scale of measurement are equal to one another. This means, for example, that the difference between 1 and 2 would be equal in its magnitude to the difference between 19 and 20.
Absolute zero: The scale has a true meaningful zero point.
Nominal scale
A nominal variable satisfies the criterion of identity, such that each value of the variable represents something different, but the numbers simply serve as qualitative labels. Discrete.
Ordinal scale
An ordinal variable satisfies the criteria of identity and magnitude, such that the values can be ordered in terms of their magnitude. The ordering gives us information about relative magnitude, but the differences between values are not necessarily equal in magnitude. Discrete.
Interval scale
An interval scale has all of the features of an ordinal scale, but in addition the intervals between units on the measurement scale can be treated as equal. The zero point is arbitrary rather than a true absence of the quantity, and values can be negative (e.g. temperature in degrees Celsius). Can be continuous or discrete.
Ratio scale
A ratio scale variable has all four of the features outlined above: identity, magnitude, equal intervals, and absolute zero. The difference between a ratio scale variable and an interval scale variable is that the ratio scale variable has a true zero point. Can be continuous or discrete.
Numeric operations for different scales
Nominal: equal or not equal
Ordinal: greater than or less than
Interval: addition and subtraction
Ratio: multiplication and division
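The operations in this list are cumulative: each scale permits its own operations plus those of every scale before it. The sketch below encodes that as a lookup; the names `SCALE_ORDER`, `NEW_OPERATIONS`, and `allowed_operations` are made up for the example. The temperature example at the end shows why ratios are not meaningful on an interval scale.

```python
# Scales ordered from fewest to most permitted operations.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Operations that each scale adds on top of the previous ones.
NEW_OPERATIONS = {
    "nominal": ["==", "!="],
    "ordinal": ["<", ">"],
    "interval": ["+", "-"],
    "ratio": ["*", "/"],
}

def allowed_operations(scale):
    """All operations that are meaningful for a variable on the given scale."""
    position = SCALE_ORDER.index(scale)
    operations = []
    for name in SCALE_ORDER[: position + 1]:
        operations.extend(NEW_OPERATIONS[name])
    return operations

for scale in SCALE_ORDER:
    print(f"{scale:>8}: {' '.join(allowed_operations(scale))}")

# Celsius is an interval scale, Kelvin is a ratio scale.
# 20 °C is not "twice as hot" as 10 °C: the ratio changes with the zero point.
celsius_ratio = 20 / 10
kelvin_ratio = (20 + 273.15) / (10 + 273.15)
print(f"ratio in Celsius: {celsius_ratio:.2f}, ratio in Kelvin: {kelvin_ratio:.3f}")
```

The difference of 10 degrees is the same in both units, but the ratio is only meaningful in Kelvin, where zero is a true zero.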
Variable
Something that can take at least two possible values.
Constant
Something with only one value.
Likert scale
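A quasi-interval scale: the response options are ordered categories (strictly ordinal), but they are often treated as if the intervals between them were equal.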
Operationalisation
The process by which we take a meaningful but somewhat vague concept and turn it into a precise measurement.
Confounder
A confounder is an additional, often unmeasured variable that turns out to be related to both the predictors and the outcome. The existence of confounders threatens the internal validity of the study because you can’t tell whether the predictor causes the outcome, or if the confounding variable causes it.
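A simulation can show how a confounder produces an association between a predictor and an outcome that have no causal link. The data-generating model below is an assumption made purely for illustration: the confounder drives both variables and the predictor has no effect on the outcome.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)
n = 2000

# Hypothetical model: the confounder influences both variables,
# and the predictor has NO direct effect on the outcome.
confounder = [rng.gauss(0, 1) for _ in range(n)]
predictor = [z + rng.gauss(0, 1) for z in confounder]
outcome = [z + rng.gauss(0, 1) for z in confounder]

raw_r = pearson_r(predictor, outcome)

# Simplification: we know the confounder's true contribution (coefficient 1),
# so we subtract it directly. With real data it would have to be estimated.
predictor_rest = [x - z for x, z in zip(predictor, confounder)]
outcome_rest = [y - z for y, z in zip(outcome, confounder)]
adjusted_r = pearson_r(predictor_rest, outcome_rest)

print(f"correlation ignoring the confounder:      {raw_r:.2f}")
print(f"correlation after removing the confounder: {adjusted_r:.2f}")
```

The raw correlation is clearly positive even though the predictor does nothing to the outcome; once the confounder's contribution is removed, the association largely disappears. In practice the confounder is often unmeasured, which is why this adjustment is not always possible.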
Covariate
A covariate is usually an independent variable that is measured alongside the main independent variable(s) of interest. A confounding variable, by contrast, is usually an extraneous or uncontrolled factor that is associated with both the predictor and the outcome but is not part of the causal pathway between them.
Artefact
A result is said to be “artefactual” if it only holds in the special situation that you happened to test in your study. The possibility that your result is an artefact describes a threat to your external validity, because it raises the possibility that you can’t generalise or apply your results to the actual population that you care about.
History effects
History effects refer to the possibility that specific events may occur during the study that might influence the outcome measure.
Maturation effects
As with history effects, maturation effects are fundamentally about change over time. However, maturation effects aren’t in response to specific events. Rather, they relate to how people change on their own over time.
Repeated testing effects
An important type of history effect is the effect of repeated testing. Suppose I want to take two measurements of some psychological construct (e.g., anxiety). One thing I might be worried about is if the first measurement has an effect on the second measurement. In other words, this is a history effect in which the “event” that influences the second measurement is the first measurement itself!