02 Intro to the Genome Flashcards
How many bases in the human genome?
3 Gb x 2
How much of the genome was sequenced in 2003?
92%. Just 151 Mb of sequence remained that we weren’t sure about
How much of the genome is repetitive sequences? (tandem repeats, interspersed repeats, LINEs and SINEs)
50%
How long are SINEs and LINEs?
~400 bp and 6 kb respectively
How do we describe someone’s genome when we look at it?
Just say how it varies from the reference genome
How many differences does each of us have from the reference on average?
4-5 million
How many SNVs do we have compared to the reference genome?
4-4.5 million
How many indels (<50 bp) do we have compared to the reference genome?
700,000
How many structural variants (>50 bp) do we have compared to the reference genome?
25,000
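Worked note on the three variant cards above: a minimal Python sketch of the size cut-offs, plus a check that the three counts add up to roughly the 4-5 million total. The function name `classify_variant` is hypothetical, and putting a change of exactly 50 bp with the structural variants is an assumption, since the cards only give <50 bp and >50 bp.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference and alternate alleles."""
    if len(ref) == len(alt):
        # Same length: a single-base swap is an SNV. Longer same-length
        # substitutions are not covered by the cards, so label them "other".
        return "SNV" if len(ref) == 1 else "other"
    # Size of the insertion or deletion in base pairs.
    size = abs(len(ref) - len(alt))
    # Assumption: exactly 50 bp is counted as a structural variant.
    return "indel" if size < 50 else "SV"


# Counts taken from the cards above.
snvs_low, snvs_high = 4_000_000, 4_500_000
indels = 700_000
structural_variants = 25_000

total_low = snvs_low + indels + structural_variants
total_high = snvs_high + indels + structural_variants
print(total_low, total_high)
```

The totals come out at about 4.7 to 5.2 million, in line with the 4-5 million figure and slightly above it at the top end.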
What is the Human PanGenome project?
A project trying to capture all the natural variation that exists across the whole diversity of healthy humans
What were the initial estimates of the number of human protein coding genes?
65-80,000
How many protein coding genes do we actually have?
About 23,000. 19,800 in the main assembly and 3,300 alternative sequences.
Where was it first found that genes are broken up into chunks (exons and introns)?
In the ovalbumin gene in chicken
Do all genes have introns?
No
What’s the most introns in a gene?
360 in Titin
What is the primary transcript?
The RNA made when the whole gene is initially transcribed, before the spliceosome turns it into mature mRNA
Do introns have a function?
We don’t think so, other than allowing alternative splicing
How are alternative transcripts made?
Alternative start and end points, and splicing out whole or partial exons.
Are all transcripts that are made functional?
We don’t know. It might be that some are just mistakes
What is a Gain of Function?
When a gene is expressed when it shouldn’t be, or in the wrong type of cell, or at a higher level, or sometimes when its product gains an actual new function.
What inheritance pattern does a gain of function usually have?
Dominant
What inheritance pattern does a loss of function have?
Can be dominant or recessive
What is haploinsufficiency?
A loss of function that is inherited in a dominant pattern, because one working copy of the gene is not enough for normal function
Do we know the functions of all genes?
No! Nor do we know all the functions of single genes
Give an example of a gene where we don’t understand how a variant can be pathogenic at all
The FMR1 gene. With loss of function in males, they have intellectual disability, large facial features, and large testes. But we have no idea why.
What effect does ribose have on RNA and why?
It makes it less stable because it has an extra hydroxyl group.
How long can RNA molecules be?
20-20,000 nt long
Why is RNA single stranded normally but not DNA?
There are enzymes that will rapidly turn ssDNA into dsDNA, but no equivalent exists for RNA.
Why does the sequence of RNAs matter? What can it be used for?
As enzymes, to target other RNAs or sequences of DNA, or to make secondary structures in the RNA
What secondary structures can RNA make? Give an example
Complex self-annealed structures of stem-loops and hairpins. Examples include tRNA (looks like a clover leaf), and MRP RNA.
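Worked note on self-annealing: a stem-loop can form when one stretch of the RNA is the reverse complement of a stretch further along the same strand. The sketch below searches for such a pair. It is a toy illustration only: it assumes Watson-Crick pairs (A-U, G-C), ignores wobble pairs and folding energetics, and the names `find_stem_loop`, `stem_len` and `min_loop` are hypothetical.

```python
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with `rna`, read 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(rna))


def find_stem_loop(rna: str, stem_len: int = 4, min_loop: int = 3):
    """Return (left_arm_start, right_arm_start) of the first stem, or None.

    The right arm must start at least `min_loop` bases after the left arm
    ends, so that there is room for the unpaired loop.
    """
    last_start = len(rna) - 2 * stem_len - min_loop
    for i in range(last_start + 1):
        left_arm = rna[i:i + stem_len]
        partner = reverse_complement(left_arm)
        j = rna.find(partner, i + stem_len + min_loop)
        if j != -1:
            return i, j
    return None


# Made-up sequence: GGGC pairs with GCCC, leaving AAAA as the loop.
print(find_stem_loop("GGGCAAAAGCCC"))
```

For the made-up sequence the arms start at positions 0 and 8, with the four A bases between them forming the loop.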
We used to just think there was mRNA, tRNA and rRNA. But what are some other roles of RNA?
5 RNAs are crucial to spliceosome function. MRP is a ribozyme (RNA enzyme). Many are involved in gene regulation.
What role do RNAs have in immunoglobulin genes and T cell receptor genes?
These genes undergo complex rearrangements to create a diversity of antigen receptors. Guide RNAs help to target recombinases to the correct cut points for this rearrangement.
List some RNAs!
ncRNAs, lncRNAs, piwiRNAs, miRNAs, snRNAs, snoRNAs
How many cells do we have? (A nematode has about 1000.)
10^13
We have a similar number of protein coding and non-coding genes compared to nematodes. How do we regulate genes in a more complex way?
We have a lot more lncRNAs than they do, which may help. We also have more pseudogenes and more transcripts.
What do microRNAs (21-25 nt) do to regulate gene expression?
Bind to 3’UTRs of specific mRNAs to modulate levels of translation
What do piwiRNAs do?
Silence transposons and protect germline genome integrity
What do long ncRNAs (>200 nt) normally do?
Down-regulate gene expression. Act as antisense transcripts.
What else do long ncRNAs do?
Can activate gene expression.
Can form platforms for assembly of multiprotein complexes.
What’s a notable lncRNA and what does it do?
XIST is 17 kb and is responsible for X inactivation.
What groups can you add to histone proteins?
Methyl, acetyl, ubiquitin.
What do histone modifications do?
Allow or prevent regulatory molecules binding to the DNA. Might block the protein directly or prevent access to the sequence
Are histone modifications permanent?
Many are fairly permanent and lock cells into their set cell type. Some are more transient, in response to external cell signals.
Where are promoters located?
Just upstream of the transcription start site, where RNA polymerase binds
Where are enhancers located and what do they do?
Upstream and downstream of where RNA polymerase binds. They are responsible for tissue-specific expression of genes.
What does Cohesin do?
Wraps around DNA that is looped to bring RNA polymerase and enhancers close together
What is the series of technologies used to capture the information of how DNA is packed together?
Chromatin Conformation Capture (3C, 4C, 5C and Hi-C).
What does 3C actually look at?
Looks at DNA sequences lying close together in the interphase cell nucleus.
How big are TADs?
500 kb-1 Mb
What does TAD stand for?
Topologically associating domain
What defines the boundaries between loops of DNA in TADs?
CTCF-bound DNA does not pass through cohesin loops, so it defines the boundaries of the TAD
How far away can enhancers act?
Really far, as long as they are in the same TAD
What’s surprising about TAD boundaries across species?
They are highly conserved despite the sequences not being conserved
How can all this new genome knowledge, together with NGS, help us find new causes of disease?
Can reinvestigate old unsolved problems. Can identify new lethal dominant mutations. Can detect mosaicism where the disease is genetic but not inherited…
What’s the point of polygenic risk scores?
Identify people at higher risk of common multifactorial diseases
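Worked note on how such a score is typically built: a polygenic risk score is usually a weighted sum, where each variant's weight is multiplied by the number of risk alleles a person carries (0, 1 or 2). The sketch below shows that sum. The variant names, weights and genotype are entirely made up for illustration, and `polygenic_risk_score` is a hypothetical helper, not any particular tool's method.

```python
def polygenic_risk_score(dosages: dict, weights: dict) -> float:
    """Sum of weight x risk-allele count over all genotyped variants."""
    missing = set(dosages) - set(weights)
    if missing:
        raise KeyError(f"no weight for variants: {sorted(missing)}")
    return sum(weights[variant] * count for variant, count in dosages.items())


# Made-up per-variant weights (not real data).
toy_weights = {"variant_a": 0.2, "variant_b": -0.1, "variant_c": 0.05}

# Made-up genotype: number of risk alleles carried at each variant.
toy_person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(toy_person, toy_weights)
print(round(score, 3))
```

A single score means little on its own. It becomes useful when compared with the distribution of scores in a reference population, which is how people at higher risk are picked out.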
What two things could we do by studying cfDNA?
NIPD for pregnancies, and cancer screening and monitoring
How can improved genomics help with DNA forensics?
Confirm or exonerate individuals. Identify hair, eye and skin colour, and ethnic origin. Do a speculative search for family members (crime runs in families). Could predict facial appearance in the future.
What’s the effect of deleting repetitive sequences?
Often has no effect. Can have regulatory effects.