Types of variables and presentation of data Flashcards

1
Q

What are some routinely collected sources of data?

A
  • mortality and census data
  • hospital activity data
  • primary care data
  • infectious disease notifications
  • regular national surveys (e.g. Health Survey
    for England)
2
Q

What is a strength and a weakness of research study data?

A

+ Better quality
- More expensive and time consuming

3
Q

What are the 3 types of categorical variables?

A

Ordinal (ordered categorical)
Nominal (unordered categorical)
Binary / Dichotomous

4
Q

What is categorical data?

A

Data that fall into categories rather than taking numerical values, e.g. hair colour

5
Q

What is ordinal data?

A

Has an underlying order

Categories can be ranked
e.g. highest level of education: GCSE, A level, degree

6
Q

What is nominal data?

A

No underlying order; categories cannot be ranked, e.g. blood group

7
Q

What is binary (dichotomous) data?

A

Has two categories

e.g. Male / Female
Presence of disease - Yes / No
1 / 0

8
Q

What are the 2 types of numerical variables?

A

Continuous
Discrete / count variable

9
Q

What is continuous data?

A

Can take any value within a range, including fractions and decimals

e.g. height
e.g. 5.5

10
Q

What is discrete (count) data?

A

Can only be whole numbers (integers)

11
Q

Can categorical variables be created from numerical variables?

A

Yes - categorical variables can be created from numerical variables
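
As a minimal sketch of this, a numerical age can be grouped into ordered age bands. The cut-offs (18 and 65) and the function name age_band are hypothetical, chosen only for illustration.

```python
def age_band(age_years):
    """Map a numerical age to an ordinal age band (illustrative cut-offs)."""
    if age_years < 18:
        return "under 18"
    if age_years < 65:
        return "18 to 64"
    return "65 and over"


ages = [4, 17, 18, 42, 64, 65, 90]  # made-up ages
bands = [age_band(a) for a in ages]
print(bands)
```

The exact ages cannot be recovered from the bands alone, which is one way to see why the conversion only works in this direction.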

12
Q

Can numerical variables be created from categorical variables?

A

NO - numerical variables CANNOT be created from categorical variables

13
Q

Why is the type of variable important?

A

Variable type determines appropriate way to:

  • display the data
  • summarise the data (central tendency /
    variation)
  • analyse the data using statistical testing
14
Q

How should single variable data with one categorical variable be presented?

A

= Bar chart, pie chart or frequency table
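
A frequency table for one categorical variable could be sketched as follows. The blood group values are made up for illustration.

```python
from collections import Counter

blood_groups = ["O", "A", "O", "B", "A", "O", "AB", "O"]  # made-up data

counts = Counter(blood_groups)
total = sum(counts.values())

# One row per category: label, frequency, percentage of the whole group.
for group, n in counts.most_common():
    print(f"{group:<3} {n:>3} {100 * n / total:5.1f}%")
```

The same counts are what the bar heights of a bar chart, or the sector areas of a pie chart, would represent.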

15
Q

How should single variable data with one continuous variable be presented?

A

= Histogram (a bar chart is only appropriate once the values have been grouped into categories)

16
Q

How should a pair of variables with categorical outcome and categorical exposure be presented?

A

= Contingency table
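
A contingency table cross-tabulates exposure against outcome. A minimal sketch, using made-up records and hypothetical category labels:

```python
from collections import Counter

# Made-up (exposure, outcome) pairs, one per person.
records = [
    ("smoker", "disease"), ("smoker", "disease"), ("smoker", "no disease"),
    ("non-smoker", "disease"), ("non-smoker", "no disease"), ("non-smoker", "no disease"),
]

table = Counter(records)
exposures = ["smoker", "non-smoker"]
outcomes = ["disease", "no disease"]

# Rows are exposure groups, columns are outcome groups.
print(f"{'':<12}" + "".join(f"{o:>12}" for o in outcomes))
for e in exposures:
    print(f"{e:<12}" + "".join(f"{table[(e, o)]:>12}" for o in outcomes))
```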

17
Q

How should a pair of variables with numerical outcome and categorical exposure be presented?

A

= Box and whisker plot

18
Q

How should a pair of variables with numerical outcome and numerical exposure be presented?

A

= Scatter plot

19
Q

With exposure and outcome which is the X and which is the Y variable?

A

X variable = Exposure

Y variable = Outcome

20
Q

What other terms are used for exposures?

A
  • Explanatory variable
  • Independent variable
  • Risk factor
  • Treatment group

X variable

21
Q

What other terms are used for outcomes?

A
  • Response variable
  • Dependent variable
  • Case / control group
  • Disease group

Y variable

22
Q

What are the 3 main features of a bar chart?

A
  • Heights of the bars are proportional to the
    frequencies
  • Useful for comparing frequencies relative to
    others
  • Variables MUST be categorical
23
Q

What are the 2 main features of pie charts?

A
  • Areas of the sectors are proportional to the
    frequencies
  • Useful for comparing the frequencies in each
    category with the whole group
24
Q

What are the 2 main features of a histogram?

A
  • Variable must be CONTINUOUS
  • Relative frequencies are represented by
    areas of the bars
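
The point about areas matters most when the bins have unequal widths. A rough sketch, with made-up values and bin edges, of how bar heights could be scaled so that each bar's area equals its relative frequency:

```python
# Made-up values and bin edges, for illustration only.
values = [2, 5, 8, 11, 14, 15, 19, 22, 28, 35]
bins = [(0, 10), (10, 20), (20, 40)]  # the last bin is twice as wide

counts = [sum(1 for v in values if lo <= v < hi) for lo, hi in bins]
widths = [hi - lo for lo, hi in bins]

# Bar height = relative frequency / bin width, so that area = relative frequency.
heights = [c / len(values) / w for c, w in zip(counts, widths)]
areas = [h * w for h, w in zip(heights, widths)]

for (lo, hi), c, h in zip(bins, counts, heights):
    print(f"{lo:>2} to {hi:<2}  count={c}  height={h:.3f}")
```

Here the first and last bins hold the same number of values, but the wider bin gets a shorter bar, so the two areas still match.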
25
Q

How does a box and whisker plot work?

A

Minimum / maximum indicated by whiskers
Middle 50% contained within box
Median indicated by horizontal line inside box
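
The values a box and whisker plot draws can be computed directly. This is a sketch on made-up data; quartiles can be calculated under several conventions, and the method="inclusive" choice here is an assumption, so other software may give slightly different box edges.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data

# Three cut points that split the ordered data into four equal parts.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

summary = {
    "minimum": min(values),   # end of lower whisker
    "lower quartile": q1,     # bottom of box
    "median": median,         # line inside box
    "upper quartile": q3,     # top of box
    "maximum": max(values),   # end of upper whisker
}
for name, value in summary.items():
    print(f"{name:<15} {value}")
```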

26
Q

What are the three types of distribution?

A
  • Normal distribution
  • Positively skewed (long tail to right)
  • Negatively skewed (long tail to left)
27
Q

What is the definition of the mean (average)?

A

= Sum of all values divided by number of observations

28
Q

What is the definition of standard deviation?

A

= Measure of the spread of observations around the mean

SD = √[ (sum of squared deviations) / (number of observations - 1) ]

[Variance = SD^2]
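
The calculation can be followed step by step in a short sketch. The observations are made up, and the result is compared with Python's statistics.stdev, which also divides by (number of observations - 1).

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

mean = sum(values) / len(values)
squared_deviations = [(x - mean) ** 2 for x in values]

# Divide by (n - 1), then take the square root of the whole quotient.
variance = sum(squared_deviations) / (len(values) - 1)
sd = math.sqrt(variance)

print("mean:", mean)
print("variance:", variance)
print("SD:", sd)
print("statistics.stdev:", statistics.stdev(values))
```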

29
Q

What is the formula for squared deviation?

A

= (Original value - Mean value)^2

30
Q

What is the definition of the median?

A

The middle value when values are arranged in order (with an even number of observations, the mean of the two middle values)

31
Q

What is the definition of the interquartile range?

A

The range from the first (25%) to the third (75%) quartiles of a distribution

32
Q

What is the definition of the mode?

A

= the most frequently occurring value

  • should not exist if the data are truly continuous
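
A small sketch of why this is so, using made-up data: discrete counts repeat, so one value occurs most often, whereas precisely measured continuous values rarely repeat, so every value ties as "most frequent".

```python
import statistics

visits = [0, 1, 1, 2, 3, 1, 0]        # made-up discrete counts
heights_m = [1.62, 1.71, 1.80, 1.68]  # made-up continuous measurements

mode_visits = statistics.mode(visits)

# multimode returns every value that shares the highest frequency.
modes_heights = statistics.multimode(heights_m)

print("mode of visits:", mode_visits)
print("tied 'modes' of heights:", modes_heights)
```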
33
Q

What is the definition of the range?

A

The difference between largest and smallest values in a distribution

  • depends upon the extreme values, which may give an unrepresentative view of the whole set of values
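
The effect of one extreme value can be sketched by comparing the range with the interquartile range. The data are made up, the helper names value_range and iqr are hypothetical, and the quartile method is an assumed convention.

```python
import statistics


def value_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


def iqr(values):
    """Distance from the first to the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


typical = [10, 12, 13, 14, 15, 16, 18]  # made-up data
with_extreme = typical + [95]           # same data plus one extreme value

print("range:", value_range(typical), "->", value_range(with_extreme))
print("IQR:  ", iqr(typical), "->", iqr(with_extreme))
```

In this example the single extreme value changes the range far more than it changes the interquartile range.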
34
Q

What does a 95% reference range indicate?

A

= mean ± 1.96 x SD (assumes the values are approximately normally distributed)

-> can be interpreted as the range of likely values for an individual in the population
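
As a worked sketch of the formula: the function name reference_range_95 and the mean and SD passed to it are hypothetical, not taken from any real dataset.

```python
def reference_range_95(mean, sd):
    """Return (lower, upper) limits as mean -/+ 1.96 x SD."""
    margin = 1.96 * sd
    return mean - margin, mean + margin


# Hypothetical summary statistics for some measurement.
lower, upper = reference_range_95(mean=120, sd=10)
print(f"95% reference range: {lower:.1f} to {upper:.1f}")
```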

35
Q

What is variance?

A

= (Standard Deviation)^2