Statistics Flashcards

0
Q

Categorical data

A

Data that falls into a finite number of distinct categories
Chi squared test
Odds ratios

1
Q

Continuous data

A

Data that can take any value within a range (infinitely many possible values)
Eg measurements
t test
Report the mean difference with confidence intervals

2
Q

Reference ranges

A

Used to describe the variation of a measurement in a defined population
Derived from the standard deviation (typically mean ± 1.96 SD, covering about 95% of the population)
Shows consistency of results
And whether it failed in some people

3
Q

Confidence intervals

A

Usually use 95%
Means we are 95% confident that the true population value lies within the interval
Measures the precision of an estimate
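
A 95% confidence interval for a mean can be sketched as below. The sample values are made up, and the 1.96 multiplier assumes a large-sample normal approximation (a small sample would use a t multiplier instead). The SD-based range is included only to contrast it with the interval around the estimate.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of measurements (made-up values).
sample = [4.1, 4.8, 5.0, 5.2, 5.5, 4.6, 5.1, 4.9, 5.3, 4.7]

n = len(sample)
m = mean(sample)
sd = stdev(sample)   # spread of individual values
se = sd / sqrt(n)    # uncertainty of the sample mean

# 95% confidence interval for the mean (normal approximation, an assumption).
ci_low, ci_high = m - 1.96 * se, m + 1.96 * se

# Range expected to cover about 95% of individuals (uses SD, not SE).
range_low, range_high = m - 1.96 * sd, m + 1.96 * sd

print(f"mean = {m:.2f}")
print(f"95% CI for the mean: {ci_low:.2f} to {ci_high:.2f}")
print(f"95% range of individual values: {range_low:.2f} to {range_high:.2f}")
```

The interval for the mean is narrower than the range for individuals because the standard error shrinks as the sample size grows.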

4
Q

Hypothesis testing

A

Whether the results happened by chance or whether the association is real
Start with a null hypothesis
Compare actual results with expected results
Work out the p value

5
Q

P values

A

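The probability of seeing an association at least as strong as the one observed if the null hypothesis were true (ie by chance alone)
P < 0.05 -> reject null hypothesis (conventional threshold)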

6
Q

Chi squared test

A

For categorical data that isn’t paired
1) compute expected numbers under null hypothesis
Expected count = row total x column total / overall total
2) calculate difference between observed and expected values
X² = sum of (observed - expected)² / expected
3) look up the value on a chi squared table with 1 degree of freedom (for a 2x2 table)
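
The three steps can be sketched for a 2x2 table as follows. The counts are invented for illustration, no continuity correction is applied, and the p value uses the fact that for 1 degree of freedom the upper tail of the chi squared distribution equals erfc(sqrt(X²/2)), in place of a printed table.

```python
from math import erfc, sqrt

# Hypothetical 2x2 table of observed counts (made-up numbers).
#            outcome  no outcome
observed = [[30, 70],    # exposed
            [15, 85]]    # unexposed

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Step 1: expected count = row total x column total / overall total
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Step 2: X2 = sum of (observed - expected)^2 / expected over all four cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Step 3: p value for 1 degree of freedom (valid only for 1 df)
p_value = erfc(sqrt(chi2 / 2))

print(f"X2 = {chi2:.3f}, p = {p_value:.4f}")
```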

7
Q

Independent t test

A

Independent continuous data that is normally distributed

1) difference between means
2) calculate standard error
3) T= difference/SE
4) find p value on t distribution with N-2 degrees of freedom (N = total number of observations in both groups)
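
A minimal sketch of steps 1 to 3, using made-up data and assuming equal variances in the two groups (so a pooled standard error is used). Step 4 is left to a t table or a statistics library, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements in two independent groups (made-up values).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.7]
n_a, n_b = len(group_a), len(group_b)

# Step 1: difference between means
diff = mean(group_a) - mean(group_b)

# Step 2: standard error of the difference from the pooled variance
# (equal variances in both groups is an assumption of this sketch)
pooled_var = (
    (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
) / (n_a + n_b - 2)
se = sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Step 3: t = difference / SE, with N - 2 degrees of freedom
t_value = diff / se
df = n_a + n_b - 2

print(f"difference = {diff:.3f}, SE = {se:.3f}, t = {t_value:.2f}, df = {df}")
```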

8
Q

Paired samples t test

A

Paired continuous data that is normally distributed
Mean of the paired differences over its standard error
Use n-1 degrees of freedom for finding the p value (n = number of pairs)
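
As an illustration, the paired calculation works on the within-pair differences rather than on the two groups separately. The before/after values below are invented.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same six people (made-up values).
before = [140, 152, 138, 160, 147, 155]
after = [135, 150, 130, 151, 145, 149]

# One difference per pair
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_diff = mean(diffs)
se = stdev(diffs) / sqrt(n)   # standard error of the mean difference
t_value = mean_diff / se
df = n - 1                    # degrees of freedom = number of pairs - 1

print(f"mean difference = {mean_diff:.2f}, t = {t_value:.2f}, df = {df}")
```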

9
Q

Data that isn’t normally distributed

A

Try log transforming

Use a non-parametric test -> Mann-Whitney
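
The Mann-Whitney U statistic only uses the ordering of the values, which is why it does not need normally distributed data. A rough sketch with made-up data, counting pairs directly (the helper name `mann_whitney_u` is hypothetical, and the p value is left to a table or library):

```python
# Hypothetical skewed measurements in two independent groups (made-up values).
group_a = [1.2, 3.4, 2.2, 8.9, 15.0]
group_b = [0.8, 1.1, 2.0, 1.5, 3.0]


def mann_whitney_u(a, b):
    """U for group a: number of (a, b) pairs where the a value is larger.

    Ties count as half a pair.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)

# The two U values always add up to the number of possible pairs.
print(f"U_a = {u_a}, U_b = {u_b}, pairs = {len(group_a) * len(group_b)}")
```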

10
Q

Correlation

A

Mutual relation of two or more variables

Doesn’t show causation, only association

11
Q

Regression

A

Measure of the relation between the mean value of the outcome and the value of the exposure
Used to determine the strength of association
Assumes a one way causal effect
Finds the line of best fit
R = degree of linear relationship
R² = % of variation in the outcome explained by exposure
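
R and R² can be computed from sums of squares, as in this sketch. The exposure and outcome values are made up, and the formula shown is the usual Pearson correlation for a single exposure.

```python
from math import sqrt
from statistics import mean

# Hypothetical exposure (x) and outcome (y) values (made-up numbers).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Sums of squares and cross-products around the means
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = s_xy / sqrt(s_xx * s_yy)   # degree of linear relationship, between -1 and 1
r_squared = r ** 2             # proportion of variation in y explained by x

print(f"R = {r:.4f}, R2 = {100 * r_squared:.1f}% of variation explained")
```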

12
Q

Prevalence

A

Number of people with disease / total population (at a given point in time)

13
Q

Incidence risk

A

Number of new cases in a given period / population at risk at the start of that period

14
Q

Incidence rate

A

Number of new cases / person time at risk
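
A small worked example of prevalence, incidence risk and incidence rate side by side. All counts are invented. Taking the denominator for incidence risk as the people who are disease-free at the start is an assumption of this sketch.

```python
# Hypothetical cohort (all numbers made up for illustration).
population = 1000          # everyone in the cohort at the start
existing_cases = 50        # people who already have the disease at the start
new_cases = 19             # new cases during follow-up
person_years_at_risk = 900 # total disease-free follow-up time

# Prevalence: existing cases / total population at one point in time
prevalence = existing_cases / population

# Incidence risk: new cases / people at risk (disease-free) at the start
at_risk = population - existing_cases
incidence_risk = new_cases / at_risk

# Incidence rate: new cases / person time at risk
incidence_rate = new_cases / person_years_at_risk

print(f"prevalence = {prevalence:.3f}")
print(f"incidence risk = {incidence_risk:.3f}")
print(f"incidence rate = {1000 * incidence_rate:.1f} per 1000 person-years")
```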

15
Q

Ecological fallacy

A

Can’t assume that a relationship seen at the group (population) level transfers directly to the individual level

16
Q

Relative risk ratio

A

Risk of the outcome in an exposed group compared with a baseline (unexposed) group
Risk ratio = risk of event in exposed / risk of event in unexposed
= 1 -> no difference
> 1 -> exposed more likely to have outcome
< 1 -> exposed less likely to have outcome
Categorical data

17
Q

Odds ratio

A

Odds = how likely an event is to happen compared to how likely it is not to happen
Odds ratio = odds in exposed / odds in unexposed
Must be used in case control studies
Categorical data
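
The risk ratio and the odds ratio can be compared on the same 2x2 table, as sketched here. The counts are made up.

```python
# Hypothetical 2x2 table (made-up counts).
exposed_events, exposed_no_events = 30, 70
unexposed_events, unexposed_no_events = 15, 85

# Risk = events / everyone in the group
risk_exposed = exposed_events / (exposed_events + exposed_no_events)
risk_unexposed = unexposed_events / (unexposed_events + unexposed_no_events)
risk_ratio = risk_exposed / risk_unexposed

# Odds = events / non-events in the group
odds_exposed = exposed_events / exposed_no_events
odds_unexposed = unexposed_events / unexposed_no_events
odds_ratio = odds_exposed / odds_unexposed

print(f"risk ratio = {risk_ratio:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```

With a fairly common outcome like this one, the odds ratio sits further from 1 than the risk ratio. The two get closer as the outcome becomes rarer.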

18
Q

Effect size in continuous data

A

Mean difference
= 0 -> no difference
> 0 -> outcome is higher in exposed

19
Q

Confidence intervals

A

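For measures of effect
Require a % confidence level (usually 95%)
The point estimate of effect size lies inside the interval
If the interval includes the no-effect value (0 for a mean difference, 1 for a ratio), the data are consistent with no difference, since the true effect could lie in either direction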

20
Q

Paired data

A

Difference of the responses between a pair
Eg before and after, wives and husbands
When we know in advance that observations in one data set are directly related to those in another data set

21
Q

Independent data

A

Responses of one treatment group compared to another

Two unrelated sets of units are measured

22
Q

Univariate analysis

A

Descriptive

Summarises data one variable at a time

23
Q

Bivariate analysis

A

Relationship between 2 variables, eg comparison of 2 groups

Correlation and measure of effect tests

24
Q

Regression coefficient

A

Y = a + bx
Regression coefficients
-> a = y intercept (value of y when x = 0)
-> b = gradient
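
The intercept a and gradient b of the line of best fit can be estimated by least squares, as in this sketch with made-up data.

```python
from statistics import mean

# Hypothetical exposure (x) and outcome (y) values (made-up numbers).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Least squares gradient: covariation of x and y over variation in x
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = s_xy / s_xx

# The fitted line passes through (x_bar, y_bar), which fixes the intercept
a = y_bar - b * x_bar

print(f"y = {a:.2f} + {b:.2f}x")
```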

25
Q

B coefficient

A

Gradient
An estimate of how much, on average, y increases or decreases for each unit increase in x
Positive b -> outcome increases as exposure increases -> positive correlation
Negative b -> opposite
b = 0 -> no linear relationship between outcome and exposure

26
Q

T test for b coefficient

A

Test of the relationship between the dependent variable and a specific independent variable
T value= b coefficient/ std error of b
Use the t value to find the p value
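
A sketch of the t value for b in a simple (one exposure) linear regression, using made-up data. The standard error formula and the n-2 degrees of freedom are the usual ones for a single explanatory variable. The p value lookup is left to a t table or library.

```python
from math import sqrt
from statistics import mean

# Hypothetical exposure (x) and outcome (y) values (made-up numbers).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates of gradient and intercept
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual variance: leftover scatter around the fitted line
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
residual_var = sum(e ** 2 for e in residuals) / (n - 2)

# Standard error of b, then t = b / SE(b)
se_b = sqrt(residual_var / s_xx)
t_value = b / se_b
df = n - 2

print(f"b = {b:.2f}, SE(b) = {se_b:.4f}, t = {t_value:.1f}, df = {df}")
```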

27
Q

Correlation r values

A
Measures the degree of linear association between two variables
0-0.3 -> weak positive
0.3-0.5 moderate
>0.5 strong 
Still need p values
28
Q

Analysis of variance

A
Extension of the t test
Continuous data
Compares the means of one continuous outcome across more than 2 groups
Eg demographic analysis
-> how heterogeneous is your group
29
Q

Multiple logistic regression

A

Equivalent of linear regression for binary outcomes eg yes/no
Single outcome and more than one independent variable
Still gives you regression coefficients
Used to predict the probability of each possible outcome of a categorical dependent variable, given a set of independent variables
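
To illustrate how logistic regression coefficients turn into predicted probabilities, the sketch below uses invented coefficients (they are not fitted to any data) and hypothetical variable names. It shows the logistic function and checks that exponentiating a coefficient gives the odds ratio for that variable.

```python
from math import exp

# Hypothetical coefficients (made up, not estimated from real data).
intercept = -3.0
b_age = 0.05      # change in log odds per year of age
b_smoker = 0.9    # change in log odds for smokers vs non-smokers


def predicted_probability(age, smoker):
    """Probability of the outcome (yes = 1) from the logistic function."""
    linear_predictor = intercept + b_age * age + b_smoker * smoker
    return 1 / (1 + exp(-linear_predictor))


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


p_smoker = predicted_probability(age=50, smoker=1)
p_non_smoker = predicted_probability(age=50, smoker=0)

# Odds ratio for smoking, holding age fixed; equals exp(b_smoker)
odds_ratio_smoker = odds(p_smoker) / odds(p_non_smoker)

print(f"P(outcome | smoker) = {p_smoker:.3f}")
print(f"P(outcome | non-smoker) = {p_non_smoker:.3f}")
print(f"odds ratio for smoking = {odds_ratio_smoker:.2f}")
```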

30
Q

Multiple linear regression

A

Continuous data

Two or more explanatory variables and one response variable

31
Q

Type 1 error

A

False rejection of a true null hypothesis -> find an effect that isn’t there
Often the result of excessive statistical testing
Decrease the risk by lowering the significance level
Avoid multiple significance testing or multiple sub group analysis
Defining the hypothesis with primary and secondary outcomes in advance reduces type 1 error
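
Why multiple testing inflates type 1 error can be shown with a short calculation. The sketch assumes the tests are independent. The Bonferroni correction is not mentioned on the card and is shown only as one common way of lowering the significance level.

```python
alpha = 0.05      # significance level for a single test
n_tests = 10      # hypothetical number of independent tests

# Chance of at least one false positive when every null hypothesis is true
familywise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni correction (one common approach): test each at alpha / n_tests
bonferroni_alpha = alpha / n_tests
corrected_error = 1 - (1 - bonferroni_alpha) ** n_tests

print(f"chance of at least one type 1 error: {familywise_error:.3f}")
print(f"per-test threshold after Bonferroni: {bonferroni_alpha:.4f}")
print(f"chance of at least one type 1 error after correction: {corrected_error:.3f}")
```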

32
Q

Type 2 error

A

Failure to reject a false null hypothesis -> miss an effect that is there
Not enough power to find a significant difference -> sample size too small
Need at least 80% power
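
The link between power and sample size can be sketched with the usual normal-approximation formula for comparing two means. The standard deviation and the smallest difference worth detecting are hypothetical inputs, and the formula itself is an assumption of this example rather than something stated on the card.

```python
from math import ceil
from statistics import NormalDist

alpha = 0.05   # two-sided significance level
power = 0.80   # probability of detecting a true effect (1 - type 2 error)
sd = 10.0      # hypothetical standard deviation of the outcome
delta = 5.0    # hypothetical smallest difference in means worth detecting

z = NormalDist()   # standard normal distribution
z_alpha = z.inv_cdf(1 - alpha / 2)
z_power = z.inv_cdf(power)

# Sample size per group: n = 2 * (z_alpha + z_power)^2 * (sd / delta)^2
n_per_group = ceil(2 * (z_alpha + z_power) ** 2 * (sd / delta) ** 2)

print(f"about {n_per_group} participants per group")
```

Halving the difference to detect roughly quadruples the required sample size, which is why small studies often miss real but modest effects.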