IRM describing data Flashcards
Statistics help us to:
describe complex information in a simple manner
Learning from data in 3 steps
Previous Steps
Form a hypothesis
Test: Does data support hypothesis?
Aggregated data is
Summarised data relative to categories or levels
Raw data
Is data that is not aggregated, basically just all the information accumulated but not organised
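To make the raw versus aggregated distinction concrete, here is a minimal sketch in Python. The group labels, scores, and the function name `aggregate_mean` are made up for illustration and are not part of the original cards.

```python
from collections import defaultdict
from statistics import mean

# Raw data: one record per observation (illustrative values only).
raw_scores = [("A", 62), ("B", 75), ("A", 70), ("B", 81), ("A", 66)]

def aggregate_mean(records):
    """Summarise raw (category, value) records as one mean per category."""
    grouped = defaultdict(list)
    for category, value in records:
        grouped[category].append(value)
    return {category: mean(values) for category, values in grouped.items()}

# Aggregated data: one summary value per category.
aggregated = aggregate_mean(raw_scores)
print(aggregated)
```

The raw list keeps every individual observation, while the aggregated result only keeps the summary for each category.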
Variables can be
Continuous or discrete, and independent or dependent
An example of a discrete variable
A discrete category such as male or female: one or the other, not both
An example of a continuous variable
A score that can lie anywhere on a continuum, e.g. 1-100
The 4 levels of measurement for variables are
Nominal, Ordinal, Interval or Ratio
Nominal variable
An arbitrary numerical label (this level gives the least information), e.g. 1 for male and 2 for female. The coding is arbitrary: female is assigned a greater number than male, yet the difference means nothing
Ordinal Variable
Rankings on a class test are an example of an ordinal variable. Values can be ordered from high to low, but the ranking may not give further information, such as the size of the difference between the highest and lowest
Interval Variable
Interval variables have a consistent unit of measurement and the numerical difference between any two values is meaningful. In contrast to nominal and ordinal variables, interval variables allow for meaningful mathematical operations, such as addition and subtraction
Ratio Variable
A ratio variable is a type of quantitative variable in statistics that has a meaningful zero point. In other words, the ratio of two values of a ratio variable is itself meaningful. Examples of ratio variables include length, weight, age, height, income, and many others.
Unlike interval variables, ratio variables have a true zero point, which represents the absence of the measured attribute. This allows for meaningful comparisons between measurements using ratios and proportions. For example, if one person’s income is twice that of another person, it means they earn twice as much money, not just that they earn more.
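The interval versus ratio distinction can be checked numerically. The sketch below uses temperature as an assumed example (it is not on the original cards): Celsius is an interval scale because its zero is arbitrary, while Kelvin has a true zero and behaves as a ratio scale.

```python
def celsius_to_kelvin(celsius):
    """Convert an interval-scale reading (Celsius) to a ratio scale (Kelvin)."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0  # illustrative readings

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are only meaningful when zero means "none of the attribute".
ratio_c = warm_c / cool_c  # 2.0, yet 20 C is not "twice as hot" as 10 C
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```

The difference is the same on both scales, but the ratio changes once the arbitrary zero is replaced by a true zero, which is why "twice as much" only makes sense for ratio variables such as income.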
epistemic
things we don't know because of a lack of data or experience
aleatoric
things that cannot be known in advance because they are inherently random, like what number a die will show on the next roll
Uncertainty defined
Uncertainty relates to how the estimate might differ from the “true value” and these measures help users of ONS statistics to understand the degree of confidence in the outputs
4 measures of uncertainty
standard error
confidence interval
coefficient of variation
statistical significance
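The four measures listed above can be sketched for a sample mean as follows. This is an illustration under stated assumptions, not a definitive method: it uses a normal approximation (z = 1.96 for a 95% interval), takes the coefficient of variation as the standard error relative to the estimate (another common definition divides the standard deviation by the mean), and treats a result as significant when a hypothesised value falls outside the interval. The scores and the name `uncertainty_summary` are made up.

```python
from math import sqrt
from statistics import mean, stdev

def uncertainty_summary(sample, hypothesised_mean, z=1.96):
    """Return four uncertainty measures for the mean of a sample.

    Assumes a normal approximation and a non-zero estimate.
    """
    n = len(sample)
    estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    ci = (estimate - z * standard_error, estimate + z * standard_error)
    cv = standard_error / estimate
    # Significant at roughly the 5% level if the hypothesised value is outside the interval.
    significant = not (ci[0] <= hypothesised_mean <= ci[1])
    return {
        "estimate": estimate,
        "standard_error": standard_error,
        "confidence_interval": ci,
        "coefficient_of_variation": cv,
        "significant": significant,
    }

scores = [48, 52, 50, 47, 53, 51, 49, 50]  # illustrative values only
summary = uncertainty_summary(scores, hypothesised_mean=55)
print(summary)
```

For a sample this small, a t-based critical value would give a somewhat wider interval than z = 1.96.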
2 different types of samples
representative: proportionate
convenience sample: potential for bias
What are some alternative words that are less likely to imply causation? (i.e. rather than "effect", which implies causality, we can say:)
association, relationship
What makes a good measurement
Validity and Reliability
Which validity asks whether the measurement makes sense on its face?
Face validity
Which validity verifies whether the measurement is related to other measurements in an appropriate way?
Construct validity.
Measurements thought to reflect the same construct being closely related to one another is known as
Convergent validity
Measurements thought to reflect different constructs should be unrelated, known as
Divergent validity
(If my theory of personality says that extraversion and conscientiousness are two distinct constructs, then I should also see that my measurements of extraversion are unrelated to measurements of conscientiousness.)
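One common way to check convergent and divergent validity is with correlations between measurements. The sketch below computes Pearson's r by hand. All scores are invented for illustration, and the variable names (two hypothetical extraversion questionnaires and one conscientiousness questionnaire) are assumptions, not data from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up scores for six people.
extraversion_a = [1, 2, 3, 4, 5, 6]     # first extraversion measure
extraversion_b = [2, 1, 4, 3, 6, 5]     # second measure, same construct
conscientiousness = [4, 2, 6, 1, 5, 3]  # measure of a different construct

r_convergent = pearson_r(extraversion_a, extraversion_b)
r_divergent = pearson_r(extraversion_a, conscientiousness)
print(round(r_convergent, 2), round(r_divergent, 2))
```

A high correlation between the two extraversion measures is the pattern expected for convergent validity, and a correlation near zero with conscientiousness is the pattern expected for divergent validity.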
If our measurements are truly valid, then they should also be predictive of other outcomes
Predictive validity.
All variables must take on at least two different possible values otherwise they would be a ….
constant rather than a variable