Statistics Flashcards

1
Q

Differential gene expression analysis

A

Differential expression analysis means taking the normalised read count data and performing statistical analysis to discover quantitative changes in expression levels between experimental groups. For example, we use statistical testing to decide whether, for a given gene, an observed difference in read counts is significant, that is, whether it is greater than what would be expected just due to natural random variation.
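As a rough illustration of the "test every gene" idea, the sketch below runs one unequal-variance t-test per gene on a genes x samples matrix. The data are synthetic, and the names group_a, group_b, n_genes and n_per_group are made up for this example; the values are assumed to be already normalised expression levels, not raw counts.

```python
import numpy as np
from scipy import stats

# Synthetic, already-normalised expression values (assumption for illustration):
# rows are genes, columns are samples in each experimental group.
rng = np.random.default_rng(1)
n_genes, n_per_group = 100, 10
group_a = rng.normal(10.0, 1.0, size=(n_genes, n_per_group))
group_b = rng.normal(10.0, 1.0, size=(n_genes, n_per_group))
group_b[:5] += 4.0  # make the first 5 genes truly differentially expressed

# One test per gene: axis=1 compares the samples within each row.
t_values, p_values = stats.ttest_ind(group_a, group_b, axis=1, equal_var=False)

# Genes ordered from smallest to largest p-value.
ranked_genes = np.argsort(p_values)
```

Each entry of p_values belongs to one gene; deciding which of them count as significant is a separate step.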

2
Q

Precision

A

Also called Positive Predictive Value (PPV) = TP/(TP + FP)

3
Q

Recall

A

Also called sensitivity, hit rate, or True Positive Rate (TPR) = TP/P = TP/(TP + FN)

4
Q

Accuracy

A

T/(T + F) = (TP + TN)/(TP + TN + FP + FN)

5
Q

Specificity

A

Also called True Negative Rate: TN/N = TN/(TN + FP)

6
Q

Overall error rate:

A

1 - accuracy
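A minimal sketch of how the formulas for precision, recall, specificity, accuracy and error rate fit together. The function name classification_metrics and the example counts are invented for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the confusion-matrix metrics from the four basic counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    return {
        "precision": tp / (tp + fp),    # TP / (TP + FP)
        "recall": tp / (tp + fn),       # TP / P
        "specificity": tn / (tn + fp),  # TN / N
        "accuracy": accuracy,           # (TP + TN) / all cases
        "error_rate": 1 - accuracy,     # overall error rate
    }

# Hypothetical counts: 50 positive and 50 negative cases.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

The function does not guard against division by zero (for example, when no case is predicted positive), so real code would need to handle empty denominators.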

7
Q

TP

A

True positives: positive cases identified correctly.

8
Q

TN

A

True negatives: negative cases identified correctly (correct rejections).

9
Q

FP

A

False positives: negative cases identified as positive.

10
Q

FN

A

False negatives: positive cases identified as negative.
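To make the four counts concrete, here is a small sketch that tallies TP, TN, FP and FN from true and predicted labels. It assumes binary labels coded as 1 (positive) and 0 (negative); the function name confusion_counts and the label lists are made up for the example.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded as 1 / 0 (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negatives called positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positives called negative
    return tp, tn, fp, fn

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```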

11
Q

ROC

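A

Receiver operating characteristic (ROC) curve:
Plot of sensitivity (true positive rate, Y axis) against the false positive rate (FPR, X axis) as the decision threshold varies.
The curve of a perfect classifier rises vertically from (0, 0) to (0, 1) and then runs horizontally to (1, 1), so it passes through the top-left corner of the plot.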
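A minimal sketch of how the points of a ROC curve can be computed by sweeping the threshold over classifier scores. The function name roc_points and the example data are hypothetical, labels are assumed to be 1 / 0, and tied scores are not given any special treatment. The example scores separate the classes perfectly, so the curve hugs the top-left corner.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (fpr, tpr) arrays, lowering the threshold one sample at a time."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    sorted_labels = y_true[order]
    tps = np.cumsum(sorted_labels == 1)  # true positives at each threshold
    fps = np.cumsum(sorted_labels == 0)  # false positives at each threshold
    tpr = np.concatenate(([0.0], tps / tps[-1]))  # sensitivity (Y axis)
    fpr = np.concatenate(([0.0], fps / fps[-1]))  # FPR (X axis)
    return fpr, tpr

# Perfectly separated scores: every positive outranks every negative.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
fpr, tpr = roc_points(y_true, scores)

# Area under the curve by the trapezoidal rule.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```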
12
Q

T-test

A

In many cases, we analyze microarrays with a “one gene at a time” approach. That is, for each gene we would like to know whether it is “different in the two classes”.

A classic analytical method to answer this question is the independent two-sample Student’s t-test. Its unequal-variance version is known as Welch’s t-test.

Null Hypothesis (H0): The means of the two populations are equal, i.e. the two groups are treated as coming from the same population.

Alternative Hypothesis: The means of the two populations are unequal, i.e. the two groups come from different populations.

The level of significance α is defined a priori. It is the probability we are prepared to accept of rejecting H0 when it is in fact true (a type I error).

Welch’s t-test is used to test whether the difference between the means of two populations is significant, without assuming that their variances are equal. It can be run with SciPy’s stats.ttest_ind by passing equal_var=False (with numpy imported as np and scipy.stats as stats), e.g.
t_value, p_value = stats.ttest_ind(np.array(Lum_A), np.array(Lum_B), equal_var=False)

Reject the null hypothesis if the p-value is less than or equal to α.
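A self-contained sketch of the test for a single gene, using synthetic values in place of the Lum_A and Lum_B measurements, which are not available here. The names group_a, group_b and alpha = 0.05 are assumptions for the example. The Welch statistic is also recomputed by hand to show what ttest_ind evaluates when equal_var=False.

```python
import numpy as np
from scipy import stats

# Synthetic expression values for one gene in two groups (illustration only).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)
group_b = rng.normal(loc=8.0, scale=2.0, size=30)

alpha = 0.05  # significance level, chosen before looking at the data

# Welch's t-test: equal_var=False drops the equal-variance assumption.
t_value, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

# The same statistic by hand: difference of means over its standard error,
# using the unbiased sample variances (ddof=1).
standard_error = np.sqrt(group_a.var(ddof=1) / group_a.size
                         + group_b.var(ddof=1) / group_b.size)
t_manual = (group_a.mean() - group_b.mean()) / standard_error

reject_h0 = p_value <= alpha
```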

13
Q

Bonferroni adjustment

A

When we perform multiple statistical tests, each at level α, the probability of rejecting at least one null hypothesis that is actually true can be as large as α x G, where G is the number of genes (for independent tests it is 1 - (1 - α)^G). This overall error is called the Family-Wise Error Rate (FWER).

The Bonferroni adjustment keeps the FWER at or below α by selecting only genes for which p-value ≤ α/G.
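A minimal sketch of the selection rule, assuming one p-value per gene has already been computed. The function name bonferroni_select and the p-values are invented for illustration.

```python
import numpy as np

def bonferroni_select(p_values, alpha=0.05):
    """Return a boolean mask of genes with p <= alpha / G, and the threshold."""
    p_values = np.asarray(p_values, dtype=float)
    threshold = alpha / p_values.size  # G = number of tests (genes)
    return p_values <= threshold, threshold

# Hypothetical p-values for G = 5 genes.
p_values = [0.0001, 0.004, 0.02, 0.03, 0.5]
selected, threshold = bonferroni_select(p_values, alpha=0.05)
```

With five genes the per-gene cutoff drops from 0.05 to 0.01, so the genes with p = 0.02 and p = 0.03 would pass an unadjusted test but are not selected here.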
