Interpreting Data Flashcards

1
Q

What are the two types of data

A
  • Qualitative
  • Quantitative

2
Q

Describe what qualitative data splits into

A

Qualitative data splits into nominal (unordered) and ordinal (ordered, e.g. short, medium, tall)
- Nominal data then splits into binary (yes or no questions) and categorical (e.g. different colours)

3
Q

What is the name of the data that is unordered in qualitative data

A

nominal

4
Q

What is the name of the data that is ordered in qualitative data

A

ordinal

5
Q

What is quantitative data split into

A
  • Discrete (e.g. 10 graduates: always a whole number)
  • Continuous (e.g. length in cm: doesn’t have to be a whole number)

6
Q

What are two other ways in which you can summarise data

A

  • Measure of location
  • Measure of spread

7
Q

What makes up the measure of location

A
  • Median = Middle value when the values are ordered from smallest to largest
  • Mode = the most common value
  • Mean = average = sum of all of the values divided by the number of values
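
A minimal sketch of the three measures of location, using Python's standard `statistics` module. The list of values is made up for illustration and is not from the original cards.

```python
import statistics

# Hypothetical values, chosen only for illustration
values = [2, 3, 3, 5, 7, 8, 14]

median = statistics.median(values)  # middle value once the values are sorted
mode = statistics.mode(values)      # most common value
mean = statistics.mean(values)      # sum of the values / number of values

print(median, mode, mean)
```
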
8
Q

What makes up the measure of spread

A
  • Standard deviation
  • Interquartile range

9
Q

When is it better to use the median over the mean

A
  • Better to use the median to avoid the influence of outliers (very large or very small values, which may be errors in the data)
  • Also use it when the data is skewed
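
A small sketch of why the median resists outliers. The weights are invented, and the last value is a deliberately made-up recording error.

```python
import statistics

# Hypothetical weights in kg; 790 stands in for a mistyped 79.0
weights = [70, 72, 74, 75, 77, 79, 790]

mean_weight = statistics.mean(weights)      # pulled upwards by the outlier
median_weight = statistics.median(weights)  # still the middle value

print(mean_weight, median_weight)
```

The mean ends up far above every genuine value, while the median stays in the middle of the genuine values.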
10
Q

When is it better to use the interquartile range over standard deviation

A
  • Use the interquartile range in order to avoid the influence of outliers
  • Also use it when the data is skewed
11
Q

How do you work out the interquartile range

A

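The interquartile range (IQR) is the range between the 25th and 75th percentiles (the lower and upper quartiles)

e.g. 1,  2,  3,  4,  5,  6,  7
The 25th percentile is 2 and the 75th percentile is 6, so the IQR runs from 2 to 6 (a width of 6 − 2 = 4)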
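
The quartiles for this example can be checked with a short sketch using `statistics.quantiles` (Python 3.8+). The IQR is also commonly reported as a single number, the 75th percentile minus the 25th percentile. Quartile conventions differ between textbooks and software, so treat the method used here as one assumption among several.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # the example values from the card

# quantiles(n=4) returns the three quartile cut points. Its default
# "exclusive" method gives 2 and 6 here; other conventions (for example
# linear interpolation) can give slightly different quartiles.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
print(q1, q3, iqr)
```
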
12
Q

How do you work out the standard deviation

A
  • work out the mean
  • then from each result subtract the mean and square the difference
  • then add up the squared differences and divide by N (number of participants)
  • then take the square root
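
The steps above can be sketched as follows; note that the squared differences are added together before dividing by N. The results are invented for illustration. Dividing by N gives what is usually called the population standard deviation (the same as `statistics.pstdev`); many sources divide by N − 1 for a sample, which is what `statistics.stdev` does.

```python
import math

# Hypothetical results, for illustration only
results = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(results)
mean = sum(results) / n                             # work out the mean
squared_diffs = [(x - mean) ** 2 for x in results]  # subtract the mean, then square
variance = sum(squared_diffs) / n                   # add up and divide by N
sd = math.sqrt(variance)                            # take the square root

print(mean, sd)
```
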
13
Q

Why are AFP levels important

A

If you aren’t pregnant, an AFP test can help to diagnose and monitor certain liver conditions, such as liver cancer, cirrhosis, and hepatitis.

14
Q

What is an antenatal thyroid screening test

A
  • This is a test that screens thyroid function during pregnancy, so that thyroid problems can be found and treated to help prevent defects in the baby
15
Q

What is another name for the Gaussian distribution

A

normal distribution

16
Q

What two things is the normal distribution determined by

A
  • Normal distribution is determined only by the mean and standard deviation
17
Q

What happens to the normal distribution curve if you change the mean

A
  • the curve moves left or right but keeps the same shape and height: if the mean decreases it moves to the left, whereas if it increases it moves to the right
18
Q

What happens to the normal distribution curve if you change the standard deviation

A
  • the height and width of the curve change but the area under the curve remains the same
  • as the standard deviation increases the curve becomes flatter and wider
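
As a rough check of how the mean and standard deviation shape the curve, the sketch below compares peak heights using `statistics.NormalDist` (Python 3.8+). The parameter values are arbitrary choices for illustration.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration
base = NormalDist(mu=0, sigma=1)
shifted = NormalDist(mu=3, sigma=1)  # larger mean, same standard deviation
wider = NormalDist(mu=0, sigma=2)    # same mean, larger standard deviation

# The peak of a normal curve is the density at its mean
peak_base = base.pdf(base.mean)
peak_shifted = shifted.pdf(shifted.mean)
peak_wider = wider.pdf(wider.mean)

print(peak_base, peak_shifted, peak_wider)
```

Changing only the mean leaves the peak height unchanged, while doubling the standard deviation halves it.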
19
Q

What are the characteristics of the Gaussian distribution

A

• A constant proportion of values will lie within any specified number of Standard Deviations above or below the mean

20
Q

How many standard deviations from the mean correspond to the

  • 99% range
  • 95% range
  • 90% range
A

99% range (0.5th to 99.5th centile) = mean ± 2.58 SDs
95% range (2.5th to 97.5th centile) = mean ± 1.96 SDs
90% range (5th to 95th centile) = mean ± 1.64 SDs
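
These multipliers can be recovered from the standard normal distribution. The sketch below uses `statistics.NormalDist.inv_cdf` (Python 3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# For a central range covering `coverage` of the values, the upper centile is
# 1 - (1 - coverage) / 2, e.g. 0.975 (the 97.5th centile) for the 95% range.
multipliers = {}
for coverage in (0.99, 0.95, 0.90):
    upper_centile = 1 - (1 - coverage) / 2
    multipliers[coverage] = standard_normal.inv_cdf(upper_centile)

for coverage, k in multipliers.items():
    print(f"{coverage:.0%} range: mean ± {k:.2f} SDs")
```
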

21
Q

How do you calculate the 95% range

A

Mean ± 1.96 × standard deviation

22
Q

What is statistics used for

A
  • Statistics is used so that our sample can tell us something about the population
23
Q

What does the population contain

A

Population contains the true mean

24
Q

What happens if the sample size is large enough

A

If the sample size isn’t too small then the distribution of the sample mean will be approximately Gaussian

25
Q

What is the standard deviation of the sample mean called

A

the standard error of the mean

26
Q

What is standard error

A

The standard error is a measure of the statistical accuracy of an estimate

27
Q

What is the standard error of the mean

A
  • The standard error of the mean is the standard deviation of the distribution of all possible sample means
28
Q

How do you work out the standard error of the mean

A

Standard error of the mean = standard deviation / square root of sample size
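
A one-line worked example of this formula. The standard deviation and sample size are hypothetical figures picked so the arithmetic is easy to follow.

```python
import math

# Hypothetical summary figures, for illustration only
standard_deviation = 12.0  # e.g. SD of weights in kg
sample_size = 36

standard_error = standard_deviation / math.sqrt(sample_size)
print(standard_error)  # 12 / 6 = 2.0
```
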

29
Q

How do you work out a confidence interval

A

95% confidence interval = sample mean ± 1.96 × standard error
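
A short sketch of the calculation. The sample mean, standard deviation and sample size are invented for illustration and do not come from a real dataset.

```python
import math

# Hypothetical summary figures, for illustration only
sample_mean = 78.0         # kg
standard_deviation = 12.0  # kg
sample_size = 64

standard_error = standard_deviation / math.sqrt(sample_size)  # 12 / 8 = 1.5
margin = 1.96 * standard_error

lower = sample_mean - margin
upper = sample_mean + margin
print(f"95% CI: {lower:.2f} to {upper:.2f} kg")
```
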

30
Q

Define the confidence interval

A

a range of values so defined that there is a specified probability that the value of a parameter lies within it.

31
Q

How would you write about the confidence interval in an example

A

IN THE POPULATION, we are 95% confident that the mean weight could be as low as 75 kg or as high as 81 kg

32
Q

When do we use standard deviation

A
  • use standard deviation for ranges (for individual values)
33
Q

When do we use standard errors

A
  • use standard error for confidence intervals (for means)
34
Q

What happens as the sample size increases

A
  • As the sample size increases, the 95% confidence interval gets narrower because the standard error gets smaller
  • The estimate becomes more precise, so we can be more confident in its accuracy
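
To see the effect of sample size, the sketch below computes the full width of a 95% confidence interval for a fixed, hypothetical standard deviation at three sample sizes.

```python
import math

standard_deviation = 12.0  # hypothetical SD, held fixed

widths = {}
for n in (16, 64, 256):
    standard_error = standard_deviation / math.sqrt(n)
    widths[n] = 2 * 1.96 * standard_error  # full width of the 95% CI
    print(n, round(widths[n], 2))
```

Because the standard error depends on the square root of the sample size, quadrupling the sample size halves the width of the interval.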
35
Q

Describe the different types of correlation and their values

A
  • R = 0 - no correlation
  • R = 1 – perfect positive correlation
  • R = -1 – perfect negative correlation
36
Q

What is the correlation coefficient

A
  • R = 0 - no correlation
  • R = 1 – perfect positive correlation
  • R = -1 – perfect negative correlation
37
Q

Define the correlation coefficient

A

a number between +1 and −1 calculated so as to represent the linear interdependence of two variables or sets of data.
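
A minimal sketch of how such a number can be calculated, assuming the Pearson correlation coefficient is meant (the cards do not name a specific coefficient). The helper `pearson_r` and the data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # y rises exactly in line with x
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # y falls exactly in line with x
r_none = pearson_r(xs, [2, 1, 3, 1, 2])       # no linear trend

print(r_positive, r_negative, r_none)
```
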

38
Q

How do you work out linear regression

A
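Y = a + bX
Y = outcome (dependent variable)
X = predictor (independent variable)
a = intercept: the point at which the line crosses the Y axis (the value of Y when X = 0)
b = slope: the change in Y for each one-unit increase in X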
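
A sketch of fitting a straight line Y = a + bX, where a is the intercept (the value of Y when X = 0) and b is the slope. It assumes ordinary least squares, which the cards do not state explicitly; the helper `fit_line` and the data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # made-up points lying exactly on y = 1 + 2x

a, b = fit_line(xs, ys)
print(a, b)
```
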
39
Q

What is the dependent variable

A

a variable (often denoted by y ) whose value depends on that of another

40
Q

What is the independent variable

A

a variable (often denoted by x ) whose variation does not depend on that of another

41
Q

Why do you want to know if the result is statistically significant

A
  • An observed sample difference between groups might be due to chance
  • We want to know whether a result is statistically significant i.e. unlikely to be due to chance
42
Q

How do you determine if the result is statistically significant

A

• To determine whether an observed difference was due to chance we look at confidence intervals and p-values

43
Q

How do you work out the confidence intervals between two groups

A

95% CI = mean difference ± 1.96 × SE of mean difference
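
A worked sketch of this formula with invented summary statistics. The cards do not say how the SE of the mean difference is obtained; the unpooled formula for two independent groups used below is an assumption.

```python
import math

# Hypothetical summary statistics for two independent groups
mean_a, sd_a, n_a = 80.0, 10.0, 100
mean_b, sd_b, n_b = 76.0, 10.0, 100

mean_difference = mean_a - mean_b

# Assumption: unpooled standard error of a difference between two means
se_difference = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

lower = mean_difference - 1.96 * se_difference
upper = mean_difference + 1.96 * se_difference
print(f"95% CI for the difference: {lower:.2f} to {upper:.2f}")
```

With these figures the interval lies entirely above 0.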

44
Q

What is a P value

A

A p-value for a result is the probability of observing a result as extreme as, or more extreme than, the sample result if the underlying assumption about the population (the null hypothesis, e.g. no difference) is true

45
Q

When is a confidence interval result significant

A
  • When the confidence interval doesn’t cross 0, so there is a difference in the population
  • If the confidence interval crosses 0 then there might not be a difference
46
Q

When is a P value statistically significant

A

when the value calculated is less than 0.05

47
Q

When can P values be calculated

A

When there is a comparison

  • 2 means – are they different i.e. is their difference different from 0?
  • Association – are the observed results different from those expected
  • Regression – is the slope different from 0?
48
Q

Where does the P value come from

A

The p-value comes from a statistical test, e.g. a chi-squared test. If P = 0.002, which is less than 0.05, we can be confident there is an association

49
Q

What is the chi squared test used for

A

categorical variables
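
A sketch of a chi-squared test on a 2x2 table of counts, comparing observed counts with those expected if there were no association. The counts are invented, no continuity correction is applied, and the closed-form p-value used here only holds for 1 degree of freedom (a 2x2 table).

```python
import math

# Hypothetical 2x2 table of counts:
# rows = exposed / not exposed, columns = outcome yes / outcome no
observed = [[30, 70],
            [15, 85]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected in this cell if rows and columns were unrelated
        expected = row_totals[i] * col_totals[j] / total
        chi_squared += (count - expected) ** 2 / expected

# With 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_squared / 2))
print(round(chi_squared, 3), round(p_value, 4))
```
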

50
Q

What is a T test used for

A

Comparing the means of a continuous variable between groups