Describing data Flashcards

1
Q

Name 2 types of qualitative (categorical) data

A

Nominal and Ordered categorical (ordinal)

2
Q

What is ordered categorical (ordinal) data?

A
Qualitative type of data
Data that can be put into more than 2 categories, eg social class or grade of BC
Categories are mutually exclusive and ordered
3
Q

What is nominal data?

A

Qualitative type of data
Usually displayed in a pie or bar chart
Categories are mutually exclusive and unordered
eg blood group or gender

4
Q

Name 2 types of quantitative data

A

Measured (continuous) and discrete

5
Q

What is discrete data?

A

Quantitative data type
Can only take certain whole-number values
eg number of children in a family

6
Q

What is measured (continuous) data?

A

Quantitative data type
Can take any value within a given range
Limited only by the accuracy of the measuring instrument
eg weight in kg, age, height

7
Q

What type of data does binary data fall under?

A

Nominal data (a nominal variable with only 2 categories)

8
Q

What can numerical data be plotted as?

A

Dot plot, stem and leaf diagram, histogram, box and whisker plot

9
Q

What is the mean and how do you calculate it?

A

The average of the sample

To get the mean, add up all the sample values, then divide by the sample size
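
The calculation on this card can be sketched in a few lines of Python. The sample values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 5, 10]

# Add all the sample values, then divide by the sample size.
mean = sum(values) / len(values)
print(mean)  # 5.0
```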

10
Q

What is the median and how do you calculate it?

A

Middle value of the ordered sample

Order the sample values from smallest to largest and take the middle one; with an even number of values, take the mean of the two middle values
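
A minimal Python sketch of finding the middle value of an ordered sample. The function name `median` and the sample values are illustrative assumptions; for an even-sized sample it uses the usual convention of averaging the two middle values.

```python
def median(values):
    """Return the middle value of the ordered sample."""
    ordered = sorted(values)          # order small to large
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd size: single middle value
    # Even size: mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([8, 2, 6, 4]))  # 5.0
```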

11
Q

What is the mode and how do you calculate it?

A

The mode is a third measure of location: it is simply the most common value observed
eg in a sample of 10 where the value 2 appears 6 times, the mode is 2
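
The example on this card (a sample of 10 in which 2 appears 6 times) can be checked with a short Python sketch. The other four values in the sample are made up for illustration.

```python
from collections import Counter

# Sample of 10 values; the value 2 appears 6 times (other values are hypothetical).
values = [2, 2, 2, 2, 2, 2, 1, 3, 4, 5]

# Count how often each value occurs and take the most common one.
counts = Counter(values)
mode, frequency = counts.most_common(1)[0]
print(mode, frequency)  # 2 6
```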

12
Q

What are the pros and cons of the mean/median/mode?

A

Median robust to outliers.
Median/mode reflects what ‘most’ people experience.
Mean uses all the data (more ‘efficient’).
Mean is ‘expected’ value.
Mean more common with statistical tests.
Mode useful for grouped or categorical data
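
A small Python sketch of why the median is called robust to outliers. Both samples are hypothetical; the second differs from the first only in one extreme value.

```python
import statistics

typical = [3, 4, 5, 6, 7]
with_outlier = [3, 4, 5, 6, 70]   # same sample with one extreme value

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

# The mean is pulled towards the outlier; the median does not move.
print(mean_typical, median_typical)   # 5 5
print(mean_outlier, median_outlier)   # 17.6 5
```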

13
Q

Is the median robust to outliers?

A

Yes

14
Q

Which measure of location is most commonly used with statistical tests?

A

Mean

15
Q

Which measure uses all the data?

A

Mean

16
Q

If the data are skewed, which measure do you use?

A

Median

17
Q

If the data are symmetrical, which measure do you use?

A

Mean

18
Q

What are the 3 approaches to quantifying variability?

A
  1. Range
  2. Interquartile range
  3. Standard deviation
19
Q

What is the range?

A

The simplest way to describe the spread of a data set is to quote the minimum (lowest) and maximum (highest) values
Often given as the difference between the smallest and largest values
Affected by extreme values at either end of the data
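
Both ways of reporting the range can be sketched in Python. The sample values are hypothetical.

```python
# Hypothetical sample values, chosen only for illustration.
values = [12, 7, 3, 9, 20]

lowest = min(values)
highest = max(values)

# Quote the minimum and maximum, or report the difference between them.
data_range = highest - lowest
print(lowest, highest, data_range)  # 3 20 17
```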

20
Q

What is the interquartile range?

A
Split the ordered data set into four equal parts (quartiles) using three cut-points:
Lower quartile (25th centile)
Median (50th centile)
Upper quartile (75th centile)

The interquartile range (IQR) tells you where the middle 50% of your data lies
IQR = upper quartile - lower quartile

A graphical way of summarising data using percentiles is the box and whisker plot.

Basically tells you where the median is; useful for looking at deprivation, eg comparing the upper and lower quartiles

21
Q

How do you calculate the IQR?

A

When a quartile lies between 2 observations, the easiest option is to take the mean of those 2 observations
Subtract the lower quartile from the upper quartile, eg with quartiles of 4 and 8 (each 2 away from a median of 6), IQR = 8 - 4 = 4
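
A hedged Python sketch of one simple way to get the quartiles and the IQR. It assumes the "median of each half" convention, which is only one of several in use (statistical software may give slightly different quartiles), and the helper name `quartiles` and the sample values are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Lower quartile, median, upper quartile via the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For an odd-sized sample the middle value is left out of both halves.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

# Each quartile here falls between 2 observations, so the mean of the 2 is taken.
q1, q2, q3 = quartiles([2, 3, 5, 6, 6, 7, 9, 10])
iqr = q3 - q1   # upper quartile minus lower quartile
print(q1, q2, q3, iqr)  # 4.0 6.0 8.0 4.0
```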

22
Q

What is the variance?

A

Based on the idea of averaging the squared distance of each value from the mean
Work out each value's difference from the mean, square the differences, add them up, then divide the total by the sample size minus 1 (n - 1)
The variance is not a suitable measure for describing variability because it is not in the same units as the raw data (it is in squared units)
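
The steps on this card can be followed in a short Python sketch. The sample values are hypothetical; the result is compared with `statistics.variance`, which also divides by the sample size minus 1.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 5, 10]
n = len(values)
mean = sum(values) / n                              # 5.0

# Difference of each value from the mean, squared.
squared_diffs = [(x - mean) ** 2 for x in values]   # 9, 1, 1, 0, 25

# Add the squared differences and divide by the sample size minus 1.
variance = sum(squared_diffs) / (n - 1)
print(variance)                                     # 9.0
print(statistics.variance(values) == variance)      # True
```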

23
Q

Is the variance a suitable measure for describing variability?

A

No. The variance is not a suitable measure for describing variability because it is not in the same units as the raw data

24
Q

What is a suitable measure for describing variability?

A

Standard deviation

25
Q

How do you calculate the SD?

A

Take the square root of the variance
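
A minimal Python sketch of this step, assuming the sample variance (divisor n - 1) and hypothetical sample values.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 5, 10]

variance = statistics.variance(values)   # sample variance, divisor n - 1
sd = math.sqrt(variance)                 # SD is back in the units of the raw data
print(sd)  # 3.0
```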

26
Q

SD vs IQR

A

SD is vulnerable to ‘outliers’
Not useful for skewed data

IQR is robust
Does not use all the data

27
Q

Why use the mean and SD?

A

For many variables in health sciences (those that are approximately Normally distributed) the mean ± 1 SD covers about 68% of the distribution.
The mean ± 2 SDs covers about 95% of the distribution.
The mean ± 2 SDs is called the ‘normal reference range’.
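
The 68% and 95% figures can be checked for an exactly Normal distribution with Python's `statistics.NormalDist`. The mean of 120 and SD of 10 used for the reference range are hypothetical values, not taken from any real variable.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of a Normal distribution within 1 and 2 SDs of the mean.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(within_1_sd, 3), round(within_2_sd, 3))  # 0.683 0.954

# Normal reference range for a hypothetical variable: mean ± 2 SDs.
mean, sd = 120.0, 10.0
reference_range = (mean - 2 * sd, mean + 2 * sd)
print(reference_range)  # (100.0, 140.0)
```

The exact 95% cut-off for a Normal distribution is 1.96 SDs; 2 SDs is the usual rounding.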

28
Q

Which interval covers 95% of the distribution?

A

The mean ± 2 SDs

The mean ± 1 SD covers 68%

29
Q

In a normal distribution, do the mean and median coincide?

A

Yes

30
Q

Which summary measures are appropriate for symmetrical data?

A

If symmetrical, use the mean and standard deviation

Remember: this data is bell-shaped

31
Q

Which summary measures are appropriate for skewed data?

A

If skewed, the median and interquartile range are more appropriate
Remember: this data has a long tail on one side