analysis of gene expression 2 Flashcards

Question 1

Q

identification of differentially expressed genes

Answer

A

data analysis methods:
- fold change/thresholds
- t-test

Question 2

Q

fold change

Answer

A

average the fold changes of experimental replicates
- or average of log ratios (more accurate)
decide threshold
modification = additional criteria for intensity change
- require absolute change e.g. by 10 units
- or floor intensity data (all below 10 to 0)
- reduces number of false positives and corrects for systematic errors
- fewer hypotheses to be tested
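
A minimal sketch of the fold-change call described above, assuming paired replicate intensities for a single gene. The function names, the 2-fold threshold, the 10-unit absolute-change cutoff, and the intensities are illustrative assumptions, not values taken from a specific protocol.

```python
import math

def mean_log2_ratio(treated, control):
    """Average the per-replicate log2(treated / control) ratios."""
    ratios = [math.log2(t / c) for t, c in zip(treated, control)]
    return sum(ratios) / len(ratios)

def is_differential(treated, control, fold_threshold=2.0, min_abs_change=10.0):
    """Call a gene changed only if it passes both the fold-change
    threshold and the additional absolute-change criterion."""
    mean_lr = mean_log2_ratio(treated, control)
    mean_abs = abs(sum(treated) / len(treated) - sum(control) / len(control))
    return abs(mean_lr) >= math.log2(fold_threshold) and mean_abs >= min_abs_change

# Hypothetical intensities: (treated replicates, control replicates).
high_gene = ([400.0, 420.0, 380.0], [100.0, 105.0, 95.0])  # 4-fold, large absolute change
low_gene = ([4.0, 5.0, 3.0], [1.0, 1.0, 1.0])              # about 4-fold, only 3 units

print(is_differential(*high_gene))  # passes both criteria
print(is_differential(*low_gene))   # fails the absolute-change criterion
```

The second gene shows why the extra intensity criterion helps: its fold change is large, but the absolute change is within the range that noise alone could plausibly produce.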

Question 3

Q

fold change advantages

Answer

A

simple to implement and easy to calculate
straightforward interpretation
can use with few replicates

Question 4

Q

fold change disadvantages

Answer

A

small intensity changes can produce large calculated fold changes in poorly expressed genes
doesn’t account for noisy data
- outliers have large effect on average fold change
not statistically-based
- no principled way to choose the threshold
- threshold is chosen for convenience, not derived mathematically
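
The two main weaknesses can be seen with a few made-up numbers. This is a sketch with hypothetical intensities and replicate values; the median is shown only for contrast and is not a method named in these notes.

```python
import statistics

# Same 3-unit change in a poorly expressed and a highly expressed gene.
low_expression_fc = 4.0 / 1.0          # 1 -> 4 units gives a 4-fold change
high_expression_fc = 1003.0 / 1000.0   # 1000 -> 1003 units gives about 1.003-fold

# Hypothetical replicate fold changes for one gene; the last is an outlier.
fold_changes = [1.1, 0.9, 1.0, 9.0]
mean_fc = statistics.mean(fold_changes)      # pulled up to about 3 by one replicate
median_fc = statistics.median(fold_changes)  # stays near the typical replicate

print(low_expression_fc, high_expression_fc)
print(mean_fc, median_fc)
```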

Question 5

Q

t test analysis

Answer

A

statistical version of fold change
assumes 2 samples are normally distributed
investigates whether their means are the same or different by calculating a t statistic for each gene from its replicates
null hypothesis - average expression is same for both samples
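
A minimal sketch of the t statistic for one gene. It assumes the unequal-variance (Welch) form, which the notes do not specify, and uses hypothetical replicate values. Converting t to a p-value requires the t distribution and is left out here.

```python
import math
import statistics

def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / standard_error

# Hypothetical expression values for one gene, three replicates per condition.
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 21.0]

t_stat = welch_t(treated, control)
print(round(t_stat, 3))
```

A large absolute t means the difference in means is big relative to the replicate-to-replicate variation, which is the information a plain fold change ignores.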

Question 6

Q

multiple testing correction

Answer

A

more tests means you increase the number of apparently significant results you would expect by chance alone
if alpha = 0.05, you would expect 50 unusual results in 1000 tests
need to correct for this
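
The notes do not name a correction method. As a sketch, here are two commonly used ones, Bonferroni and Benjamini-Hochberg, applied to hypothetical p-values; the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Number of significant results expected by chance if every null is true."""
    return n_tests * alpha

def bonferroni(p_values, alpha=0.05):
    """Significant only if p <= alpha / number of tests."""
    cutoff = alpha / len(p_values)
    return [p <= cutoff for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Control the false discovery rate: find the largest rank k with
    p_(k) <= (k / m) * alpha and accept the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant

p_values = [0.001, 0.012, 0.025, 0.045, 0.5]  # hypothetical per-gene p-values
print(expected_false_positives(1000))
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

Bonferroni is the stricter of the two; Benjamini-Hochberg accepts more genes in exchange for tolerating a controlled fraction of false discoveries.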

Question 7

Q

t test

advantages and disadvantages

Answer

A

advantages:
- statistical
- fewer false positives than fold change
- can combine RNAseq and microarray data
disadvantages:
- usually few replicates - limits statistical power
- can lead to large gene-to-gene fluctuations in calculated standard deviation with small replicate number

Question 8

Q

DNA binding sites

Answer

A

indicate how transcription is controlled
can predict regions that lead to expression of particular genes
experimental identification or de novo prediction to create binding site library

Question 9

Q

using DNA motif knowledge

Answer

A

search sequence for known sites
identify and search for restriction sites
use information to model binding site
create consensus
- decide number of allowed mismatches
- depends on sequence properties
create weight matrix
- create position frequency matrix
- create position weight matrix
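
A sketch of the first two modelling steps: scanning a sequence for a consensus with a fixed number of allowed mismatches, then counting bases per position to get a position frequency matrix. The consensus, the sequence, and the function names are hypothetical.

```python
def find_sites(sequence, consensus, max_mismatches=1):
    """Return (start, site) for every window that differs from the
    consensus at no more than max_mismatches positions."""
    k = len(consensus)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        mismatches = sum(1 for a, b in zip(window, consensus) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window))
    return hits

def position_frequency_matrix(sites):
    """Count how often each base occurs at each position of aligned sites."""
    length = len(sites[0])
    pfm = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for i, base in enumerate(site):
            pfm[base][i] += 1
    return pfm

hits = find_sites("GGTATAAACCTATTAAGG", "TATAAA", max_mismatches=1)
pfm = position_frequency_matrix([site for _, site in hits])
print(hits)
print(pfm)
```

The frequency matrix keeps the per-position variation that a single consensus string discards.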

Question 10

Q

binding site PWM

Answer

A

p(b,i) = probability of base b at position i
pseudocount to correct for finite number of input sequences
sigma to represent general probability of base occurrence
score sites to indicate certainty/uncertainty of particular base at that position
search sequence for subsequences that are likely to arise from that PWM
score directly related to binding energy of DNA-protein interaction
- statistical and energy-based model
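
A sketch of one common PWM formulation, the log-odds form: each entry is log2 of the pseudocount-corrected base probability divided by a background probability. The uniform background of 0.25, the pseudocount of 1 per base, the example sites, and the function names are assumptions made for illustration.

```python
import math

def position_weight_matrix(sites, pseudocount=1.0, background=None):
    """Log-odds PWM from aligned sites of equal length."""
    background = background or {b: 0.25 for b in "ACGT"}
    n_sites, length = len(sites), len(sites[0])
    pwm = {b: [0.0] * length for b in "ACGT"}
    for i in range(length):
        for b in "ACGT":
            count = sum(1 for site in sites if site[i] == b)
            # The pseudocount keeps unseen bases from getting probability zero.
            prob = (count + pseudocount) / (n_sites + 4 * pseudocount)
            pwm[b][i] = math.log2(prob / background[b])
    return pwm

def score(pwm, window):
    """Sum the per-position weights for one candidate site."""
    return sum(pwm[base][i] for i, base in enumerate(window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window."""
    k = len(pwm["A"])
    return max((score(pwm, sequence[s:s + k]), s)
               for s in range(len(sequence) - k + 1))

sites = ["TATAAA", "TATAAT", "TATATA", "TACAAA"]  # hypothetical aligned sites
pwm = position_weight_matrix(sites)
top_score, top_start = best_hit(pwm, "GGCGTATAAAGGC")
print(top_start, round(top_score, 2))
```

Positive entries mark bases seen more often than the background predicts, negative entries mark bases that are disfavoured, and the summed score ranks candidate sites along a sequence.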

Question 11

Q

assumptions of DNA motif knowledge approach

Answer

A

nucleotide at one position has no effect on nucleotide present at adjoining position
TFs have strict spatial requirements in binding sites that preclude variable spacing

Question 12

Q

de novo prediction of DNA binding sites

Answer

A

use of gene expression studies to identify coexpressed genes
something upstream of coexpressed genes may explain expression behaviour
statistical methods to identify the motif of interest in available sequences
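
A toy sketch of the idea: count which short words (k-mers) occur in the upstream regions of coexpressed genes more often than in a background set. The simple enrichment ratio, the sequences, and the function names are assumptions for illustration; real motif finders use proper statistical models rather than this ratio.

```python
from collections import Counter

def kmer_sequence_counts(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def enriched_kmers(coexpressed, background, k=6):
    """Rank k-mers by the fraction of coexpressed sequences containing them
    relative to the fraction of background sequences (most enriched first)."""
    fg = kmer_sequence_counts(coexpressed, k)
    bg = kmer_sequence_counts(background, k)

    def enrichment(kmer):
        fg_frac = fg[kmer] / len(coexpressed)
        bg_frac = (bg[kmer] + 1) / (len(background) + 1)  # +1 avoids division by zero
        return fg_frac / bg_frac

    return sorted(fg, key=lambda kmer: (-enrichment(kmer), kmer))

# Hypothetical upstream regions of coexpressed genes and unrelated background.
upstream = ["GGTGACTCACC", "ATTGACTCAGG", "CCTGACTCATT"]
background = ["GGCCGGCCGGCC", "ATATATATATAT", "CCGGTTAACCGG"]

ranked = enriched_kmers(upstream, background, k=6)
print(ranked[:2])
```

The top-ranked words overlap on the shared core of the three upstream regions, which is the kind of candidate motif that would then be modelled as a weight matrix.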

(12 cards)