Lecture 4 - K-mers and binomial and CLT Flashcards
What is a k-mer?
a sequence of k bases
When stored in binary, how long a k-mer can a 32-bit integer hold?
a k-mer of up to 16 bases (2 bits per base)
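The 2-bits-per-base packing behind this card can be sketched as follows. The particular code table (A=00, C=01, G=10, T=11) and the function names are assumptions made for illustration, not something fixed by the lecture.

```python
# Hypothetical 2-bit code; any fixed assignment of the four bases would work.
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
DECODE = "ACGT"


def pack_kmer(kmer: str) -> int:
    """Pack a k-mer (k <= 16) into an integer that fits in 32 bits."""
    if len(kmer) > 16:
        raise ValueError("32 bits hold at most 16 bases at 2 bits each")
    value = 0
    for base in kmer:
        # Shift the bases seen so far left by 2 bits and append the new base.
        value = (value << 2) | ENCODE[base]
    return value


def unpack_kmer(value: int, k: int) -> str:
    """Recover the k-mer of length k from its packed integer."""
    bases = []
    for _ in range(k):
        bases.append(DECODE[value & 0b11])  # lowest 2 bits = last base
        value >>= 2
    return "".join(reversed(bases))


print(pack_kmer("ACGT"))                       # 27, i.e. 0b00011011
print(unpack_kmer(pack_kmer("GATTACA"), 7))    # GATTACA
```

A 16-base k-mer of all T's uses every one of the 32 bits, which is why 16 is the limit.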
What is the probability of a specific k-mer being generated by the RNG?
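the product of p_i over the k positions, where p_i is the probability of the base at position i (bases drawn independently)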
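Assuming the generator draws each base independently, the probability of a specific k-mer is the product of the probabilities of its bases. The sketch below shows this; the function name and the example probability tables are illustrative assumptions.

```python
from math import prod


def kmer_probability(kmer: str, base_probs: dict) -> float:
    """Probability of one specific k-mer when bases are drawn independently."""
    return prod(base_probs[base] for base in kmer)


# Uniform generator: every base has probability 1/4, so a k-mer has (1/4)**k.
uniform = {base: 0.25 for base in "ACGT"}
print(kmer_probability("ACG", uniform))    # 0.015625

# A made-up biased generator, for illustration only.
biased = {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125}
print(kmer_probability("AAC", biased))     # 0.0625
```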
What is the conditional probability of an event A happening given an event B happens?
Pr(A | B) = Pr(A ∩ B) / Pr(B)
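As a small worked example of this formula, the sketch below counts outcomes in a sample space of the 16 equally likely dinucleotides. The two events chosen are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 16 dinucleotides, assumed equally likely.
space = ["".join(pair) for pair in product("ACGT", repeat=2)]

A = {d for d in space if "G" in d}       # event A: the dinucleotide contains a G
B = {d for d in space if d[0] == "C"}    # event B: the first base is C


def pr(event) -> Fraction:
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(len(event), len(space))


pr_a_given_b = pr(A & B) / pr(B)
print(pr(A))           # 7/16
print(pr_a_given_b)    # 1/4
```

Knowing that B happened changes the probability of A from 7/16 to 1/4, so these two events are not independent.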
What is a context dependent nucleotide generator?
the probability of observing a particular nucleotide depends on the nucleotide that came before it
How do you model a context dependent random nucleotide generator?
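One common way to model this, assumed here rather than taken from the lecture, is a first-order Markov chain: a table of conditional probabilities Pr(next base | previous base). The transition values, the start base, and the function name below are all made up for illustration.

```python
import random

# Hypothetical transition table: TRANSITIONS[prev][nxt] = Pr(nxt | prev).
# Each row sums to 1.
TRANSITIONS = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2},
    "C": {"A": 0.3, "C": 0.3, "G": 0.1, "T": 0.3},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.4},
}


def generate(n: int, rng: random.Random, start: str = "A") -> str:
    """Generate n bases where each base depends only on the previous one."""
    if n <= 0:
        return ""
    seq = [start]
    while len(seq) < n:
        row = TRANSITIONS[seq[-1]]  # distribution conditioned on the last base
        nxt = rng.choices(list(row), weights=list(row.values()))[0]
        seq.append(nxt)
    return "".join(seq)


print(generate(30, random.Random(0)))
```

An i.i.d. generator is the special case in which every row of the table is identical.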
Given a genome of size n, how can we test whether the bases are generated randomly (i.i.d.) or through a Markov process?
-calculate the frequency of each nucleotide and use it to estimate the probability of each base
-calculate the expected number of times each dinucleotide should appear in the genome if the bases are i.i.d.
-count the number of times each dinucleotide actually appears in the genome
-compare the observed and expected counts of each dinucleotide using the chi-squared test (how many degrees of freedom?)
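The steps above can be sketched as follows. This is a minimal illustration that only computes the chi-squared statistic; the function name is hypothetical, dinucleotides are counted at every overlapping position (an assumption, and one that makes the counts not strictly independent), and turning the statistic into a p-value still requires choosing the degrees of freedom.

```python
from collections import Counter


def dinucleotide_chi_squared(genome: str) -> float:
    """Chi-squared statistic comparing observed dinucleotide counts with
    the counts expected if bases were drawn i.i.d."""
    n = len(genome)

    # Step 1: estimate each base's probability from its frequency.
    probs = {base: count / n for base, count in Counter(genome).items()}

    # Step 2: count observed dinucleotides at all n - 1 overlapping positions.
    pairs = n - 1
    observed = Counter(genome[i:i + 2] for i in range(pairs))

    # Steps 3-4: expected count under i.i.d. is pairs * p(x) * p(y);
    # sum (observed - expected)^2 / expected over all dinucleotides.
    stat = 0.0
    for x in probs:
        for y in probs:
            expected = pairs * probs[x] * probs[y]
            stat += (observed[x + y] - expected) ** 2 / expected
    return stat


# A strictly alternating sequence is far from i.i.d.: AA and CC never occur.
print(dinucleotide_chi_squared("ACACACAC"))
```

A large statistic suggests that the base at one position depends on its neighbour, which is what a Markov process would produce.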
What is the distribution of counts of an outcome?
If n=4 and one outcome is 0110, how many different outcomes have zero 1s?
one
If n=4 and one outcome is 0110, how many different outcomes have two 1s?
4C2 = 6
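These counts can be checked by brute force. The sketch below enumerates all 2^4 binary outcomes of length 4 and groups them by their number of 1s; the variable names are illustrative.

```python
from itertools import product
from math import comb

n = 4
# All 2**n binary strings of length n, e.g. "0110".
outcomes = ["".join(bits) for bits in product("01", repeat=n)]

# Number of outcomes containing exactly k ones, for k = 0..n.
counts = {k: sum(1 for o in outcomes if o.count("1") == k) for k in range(n + 1)}

print(counts)        # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
print(comb(n, 2))    # 6
```

Each count equals the binomial coefficient nCk, and the counts add up to 2^4 = 16.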