Lecture 06 Flashcards

1
Q

High-throughput sequencing

a. Resequencing
- Discuss resequencing(2)
- How high must coverage be?(1)
- What happens during cancer genomics(1)

A

High-throughput sequencing

a. Resequencing
- once a reference genome is available, new genomes are not assembled de novo; their reads are mapped to the existing reference
- highly repetitive regions pose challenges during mapping
- coverage must be high enough that the error rate is lower than the frequency of natural variation (e.g., SNPs)
- cancer genomics: sequence healthy control cells from same patient
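
The coverage requirement above can be illustrated with a minimal sketch. It assumes the usual definition of mean coverage (reads × read length / genome size) and a deliberately simple error model in which reads err independently and a site is miscalled only if more than half of the reads covering it are wrong. The function names, the read counts, and the per-read error rate are hypothetical and not from the lecture.

```python
from math import comb

def mean_coverage(num_reads, read_length, genome_size):
    """Mean coverage = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size

def majority_error_prob(depth, per_read_error):
    """Probability that more than half of `depth` reads are wrong at a site.

    Assumes independent errors (a simplification); this is an upper bound on
    a miscall, since the wrong reads would also have to agree on the same base.
    """
    k_min = depth // 2 + 1
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(k_min, depth + 1)
    )

# Hypothetical run: 900 million reads of 100 bp on a ~3 Gb genome.
coverage = mean_coverage(900_000_000, 100, 3_000_000_000)
# Compare the consensus error at that depth with an assumed variant frequency.
consensus_error = majority_error_prob(int(coverage), 0.01)
```

With a single read the consensus error equals the per-read error; it drops steeply as depth grows, which is why higher coverage lets real variants stand out from sequencing errors.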

2
Q

High-throughput sequencing

Exome sequencing

a. What is it?(1)
b. What does it do?(1)
c. How many exons in human genome?(1)

A

High-throughput sequencing

Exome sequencing

a. resequencing project that sequences only exons/coding regions
b. identify trait-linked mutations in protein-coding regions
c. 180K exons in human genome: ~30 Mb or 1% of entire genome
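
As a quick check on the figures in (c), the short sketch below works out the exome fraction and the mean exon length. The exon count and exome size come from the card; the ~3 Gb genome size is an assumption implied by "30 Mb or 1%", and the function names are hypothetical.

```python
EXON_COUNT = 180_000          # from the card: ~180K exons
EXOME_BP = 30_000_000         # from the card: ~30 Mb
GENOME_BP = 3_000_000_000     # assumption: ~3 Gb human genome

def exome_fraction(exome_bp, genome_bp):
    """Fraction of the genome occupied by exons."""
    return exome_bp / genome_bp

def mean_exon_length(exome_bp, exon_count):
    """Average exon length in base pairs."""
    return exome_bp / exon_count

fraction = exome_fraction(EXOME_BP, GENOME_BP)
avg_exon_bp = mean_exon_length(EXOME_BP, EXON_COUNT)
```

Under these numbers the exome is about 1% of the genome and the mean exon is roughly 170 bp long.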

3
Q

What’s in a genome? (List the five main features)(5)

A

a. Protein-coding genes (in human genome)
b. Non-protein-coding regions – encode RNA molecules
c. Pseudogenes – degenerate genes that have mutated so far from original sequences that the encoded proteins are non-functional
d. Binding sites for ligands that regulate gene expression (e.g. promoters)
e. Repetitive elements of unknown functions – see Box 1.12

4
Q

Discuss protein coding genes

a. What do they follow?(1)
b. How many are there?(1)
c. What is the incidence of genes in genome?(2)
d. What is the gene structure?(2)
e. What do they occupy?(1)
f. How are they distributed?(1)
g. How do they appear?(1) e.g(1)
h. How are unrelated genes separated?(2)
i. What is gene transcription under control of?
j. How do they occur?(1)

A

Protein-coding genes (in human genome)
a. central dogma: DNA → mRNA → protein
b. ~23,000 such genes
c. incidence of genes in genome
- gene-poor regions: subtelomeric areas on all chr’s; chr’s 18 and X(evo)
- gene-rich regions: chr’s 19 and 22
d. gene structure
- exons interrupted by introns; ave. exon length = 200 bp; intron lengths differ, resulting in gene size differences (e.g. insulin = 1.7 kb, LDL receptor = 5.45 kb and dystrophin = 2,400 kb; Titin)
- splice signal sites delineate intron-exon junctions
e. occupy a small fraction of the human genome – no more than about 2–3% of the overall sequence
f. distributed unevenly across all chr’s; appear on both strands
g. many appear in multiple copies, either identical or diverged into families
- e.g., over 400 functionally related olfactory-receptor genes in humans
h. unrelated genes are fairly well separated
- some, however, do overlap; for example, entire genes may appear on the –ve strand, within an intron of another gene
i. gene transcription may be under control of cis (upstream or downstream) and trans regulatory elements (elsewhere in genome/diff chr’s)
j. often, closely-related genes occur in same area
- identical copies may still occur on different chr’s (e.g. ubiquitin)
- evolution – gene duplication + divergence; further duplication

5
Q

Discuss the relation of the genome sequence and proteome

  1. Ideally what happens?
    a. What does alternative splicing result in? discuss(3)
    b. What does RNA editing produce?(3)
    c. What post-translational modifications occur?
    d. Where does special combinatorial DNA splicing occur?
A

1. Ideally, genome sequence -> proteome; however, there is variation in the genome-proteome relationship (see Box 1.11)

a. alternative splicing – mature mRNA is formed from diff. combinations of exons, but always in the order of appearance
- affects 95% of multi-exon protein-coding genes in human genome
- genes with multiple promoters – if reading frames are ‘out-of-phase’ then different proteins result

b. RNA editing – produces one or more proteins with amino acid sequences that differ from what is predicted from the genome
- Vitis vinifera – mitochondrial proteins have C → U editing
- humans – some genes experience A → I change; tissue-specific

c. post-translational modifications – complexes of polypeptide chains (e.g. Hb)
d. special combinatorial DNA splicing – e.g. antibodies
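
The rule in (a), that exons can be combined differently but always in their order of appearance, can be sketched as an enumeration of ordered exon subsets. This is only an illustration of the combinatorics: real genes use a small subset of these possibilities, and the function name and exon labels are hypothetical.

```python
from itertools import combinations

def ordered_exon_combinations(exons, min_exons=1):
    """Return every subset of exons, kept in their original genomic order.

    itertools.combinations preserves input order, so an isoform such as
    (E3, E1) can never appear.
    """
    isoforms = []
    for size in range(min_exons, len(exons) + 1):
        isoforms.extend(combinations(exons, size))
    return isoforms

# Hypothetical three-exon gene.
isoforms = ordered_exon_combinations(["E1", "E2", "E3"])
```

For n exons this gives at most 2^n - 1 non-empty combinations, which shows how a modest number of exons could in principle yield many distinct mRNAs.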

6
Q

1. What do non-protein-coding regions encode?(1)

Discuss what you need to know about these regions(3)

A
  1. Non-protein-coding regions – encode RNA molecules
    a. besides mRNA, there are also tRNAs, rRNAs, miRNAs, siRNAs and piRNAs
    b. about 3,000 genes encoding the ‘RNA-ome’ – thus, excl. mRNA
    c. most control gene expression (e.g., miRNA and siRNA)
7
Q
  1. What are pseudogenes?(1)

- Discuss what you need to know about pseudogenes(3)

A

1. Pseudogenes – degenerate genes that have mutated so far from original sequences that the encoded proteins are non-functional
a. processed pseudogenes – picked up by viruses from mRNA and reverse transcribed
b. lack introns and promoters; at times, they are transcribed and play regulatory roles by competing with mRNAs for miRNAs
c. some retain function – rescued by translational read-through of stop-codon

8
Q

Discuss repetitive elements of unknown functions(3)

A

Repetitive elements of unknown functions – see Box 1.12
a. large fraction of genome; LINEs (21%) and SINEs (13%); minisatellites and microsatellites (collectively 15%)
