Structure of Genes and the Human Genome Flashcards

1
Q

Approximately how much of our genome encodes proteins or structural RNAs? What do these include?

A

30%

  • solitary protein coding genes
  • duplicated protein coding gene families
  • tandemly repeated structural genes (ribosomal RNAs, 5S RNA)
2
Q

What makes up approximately 50% of the genome?

A

repetitive DNA

  • simple tandem repeats (one after another)
  • interspersed repeats - mobile genetic elements
3
Q

About how much of the genome does other non-repetitive DNA compose?

A

20%

4
Q

About how many protein coding genes are there in humans?

A

23,000

5
Q

What three components compose the human genome?

A
  • genes encoding proteins or structural RNAs (30%)
  • repetitive DNA (50%)
  • non-repetitive DNA (20%)
6
Q

What are solitary protein coding genes?

A

the expressed regions (exons) are separated by large non-expressed regions (introns)

7
Q

On average, what percent of the human genome length contributes to mRNA?

A

5%

8
Q

Blocks of very short repeated sequences (usually 5-10 nucleotides per repeat) make up approximately what percent of the human genome?

A

5-10%

9
Q

What does the largest human gene encode?

A

the protein called dystrophin

  • gene is more than 2.4 million nucleotides in length
  • contains more than 80 exons

dystrophin normally links the cytoskeleton of muscle fibers to the cell membrane
*defects in dystrophin cause Becker and Duchenne muscular dystrophies

10
Q
satellite DNA
(recognize word, don't care too much)
A

simple sequence tandem repeats

11
Q

LINE

A
  • long interspersed element
  • remnants of transposable elements (sequences that have the ability to move into and out of genomic DNA)
  • often not full length
12
Q

Insertion of a LINE into a gene can cause what?

A

genetic disease

13
Q

Certain hemophilia patients have novel insertions of _____ in the Factor ____ (clotting factor) gene. This mutation was not present in the parental DNA, indicating that it resulted from a recent insertion.

A

LINEs

VIII

14
Q

SINEs

A
  • short interspersed elements
  • most famous/abundant SINE is Alu (about 300 nucleotides long; related to the 7SL RNA, which is part of the normal signal recognition particle involved in protein secretion)
  • often not full length
14
Q

On average, every how many kb of genomic DNA is there an Alu sequence?

A

5-10 kb

*some Alu elements are also mobile and can be inserted into the genome at random locations, causing disease
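
The spacing on this card can be turned into a rough copy count and genome fraction. The sketch below is only a back-of-the-envelope estimate: the genome size of about 3.2 billion base pairs is an assumption that does not come from these cards, the 300-nucleotide Alu length is taken from the SINEs card, and `alu_estimate` is a hypothetical helper name.

```python
# Assumption (not from the flashcards): haploid human genome of ~3.2 billion bp.
GENOME_BP = 3_200_000_000
# Approximate length of one Alu element, as given on the SINEs card.
ALU_BP = 300


def alu_estimate(spacing_bp):
    """Return (copy count, genome fraction) for one Alu every `spacing_bp` bp."""
    copies = GENOME_BP // spacing_bp   # how many spacing-sized windows fit
    fraction = ALU_BP / spacing_bp     # share of each window occupied by Alu
    return copies, fraction


for spacing in (5_000, 10_000):
    copies, fraction = alu_estimate(spacing)
    print(f"one Alu per {spacing:,} bp: ~{copies:,} copies, ~{fraction:.0%} of the genome")
```

Under these assumptions, one Alu every 5-10 kb corresponds to roughly 320,000 to 640,000 copies, or about 3% to 6% of the genome. The fraction depends only on the spacing and the element length, not on the assumed genome size.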

15
Q

HERV-W

A
  • retro-viral genome
  • human genome contains numerous copies
  • one of the viral genes, called syncytin, is active in the trophoblast layer in the human embryo
  • syncytin function is essential for implantation of the embryo
16
Q

How much of the human genome is retroviral related sequences?

A

about 8% of the human genome consists of remnants of retrovirus-related sequences

17
Q

What is syncytin (a viral gene) necessary for?

A

essential for implantation of embryo

18
Q

How have duplicated protein coding genes arisen?

A

gene families (2+ related copies, often located together in the genome) arose through duplications that occurred during unequal crossing-over (recombination) events

  • more recently duplicated genes are more similar in sequence because they’ve had less time to mutate
  • it is possible for entire genes to be duplicated, or for individual regions (individual exons) to be duplicated within a single gene
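
The idea that recently duplicated copies are more similar can be illustrated by computing percent identity between aligned sequences. This is a minimal sketch: the three sequences are made up for illustration and are not real genes, and `percent_identity` is a hypothetical helper that assumes the sequences are already aligned and of equal length.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical, made-up sequences (not real gene family members).
reference_copy = "ATGGCCAAGTTGACC"
recent_copy = "ATGGCCAAGTTGACT"   # few differences: less time to mutate
older_copy = "ATGACCAAATTGTCT"    # more differences: more time to mutate

print(f"recent copy: {percent_identity(reference_copy, recent_copy):.1f}% identical")
print(f"older copy:  {percent_identity(reference_copy, older_copy):.1f}% identical")
```

In this toy example the recent copy scores higher than the older one, which is the pattern used to judge the relative ages of duplicated genes.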
19
Q

It is possible for individual regions (individual exons) to be duplicated within a single gene. What is an example of this?

A

collagen

20
Q

What are pseudogenes?

A
  • genes that have been duplicated and then lost their function
  • not transcribed into mRNA
  • often mutations in regulatory regions that control transcription
  • once the gene no longer makes functional protein, it is no longer under any selective pressure and mutations rapidly accumulate
21
Q

tandemly repeated structural RNA genes

A
  • these genes have exactly the same sequence and exactly the same orientation
  • usually high numbers of copies
22
Q

Interspersed repeats

A
  • longer repeated sequences found scattered through genome (not in tandem arrays like simple repeats)
  • not identical, but similar
  • common interspersed repeats are:
    • LINE elements (also called L1 sequences)
    • SINE elements
    • inactive remnants of retroviral genomes
23
Q

short tandem repeats (STRs)

A
  • sometimes called VNTRs (variable number of tandem repeats)
  • usually repeats of 3 or 4 nucleotides that occur either within genes or at various other locations in the genome
  • the basis of DNA fingerprinting (most people differ in the repeat number at a specific location)
  • contribute to a number of different human diseases, including Huntington disease and Fragile-X syndrome
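
The fingerprinting idea, that individuals differ in repeat number at a given location, can be sketched by counting consecutive copies of a repeat unit. The sequences, the `GATA` repeat unit, and the `longest_tandem_run` helper below are all hypothetical and chosen only for illustration.

```python
import re


def longest_tandem_run(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group()) // len(unit) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles at the same locus in two people (made-up sequences).
person_1 = "CCTA" + "GATA" * 7 + "TTGC"
person_2 = "CCTA" + "GATA" * 11 + "TTGC"

for name, allele in (("person 1", person_1), ("person 2", person_2)):
    print(name, "has", longest_tandem_run(allele, "GATA"), "GATA repeats")
```

Comparing repeat counts across several such locations is what makes a profile distinctive; a single location, as here, only shows the principle.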
24
Q

Huntington Disease

A
  • progressive neurodegenerative disease
  • triplet repeat region in the coding region of the Huntington disease gene (called huntingtin)
  • most people have between 1 and 13 copies of the sequence CAG, encoding a short polymer of the amino acid glutamine
  • in diseased people, 30 to over 100 repeats; the more copies, the earlier the onset of the disease
  • the poly-glutamine stretch causes aggregates of huntingtin protein, which interfere with normal function of the cell
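
Because the repeat lies in the coding region, each in-frame CAG codon adds one glutamine (Q) to the protein. The sketch below counts the longest run of in-frame CAG codons; the toy sequences and the `longest_cag_codon_run` helper are hypothetical, the reading frame is assumed to start at the first nucleotide, and the repeat numbers are illustrative rather than diagnostic.

```python
def longest_cag_codon_run(coding_sequence):
    """Longest run of consecutive in-frame CAG codons (frame starts at index 0)."""
    longest = current = 0
    for i in range(0, len(coding_sequence) - 2, 3):
        if coding_sequence[i:i + 3] == "CAG":
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # any other codon interrupts the run
    return longest


# Hypothetical toy coding sequences (not the real gene sequence).
short_tract = "ATG" + "CAG" * 10 + "CCGCCG"
expanded_tract = "ATG" + "CAG" * 45 + "CCGCCG"

for label, cds in (("short tract", short_tract), ("expanded tract", expanded_tract)):
    n = longest_cag_codon_run(cds)
    print(f"{label}: {n} CAG codons -> {n} glutamines ({'Q' * n})")
```

A CAG run that is out of frame would not be counted here, which mirrors why the repeat's position in the coding region matters for the poly-glutamine stretch.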
25
Q

Fragile-X syndrome

A
  • major cause of intellectual disability in males (1 in 6000 births)
  • sequence CGG is repeated approximately 60 times in the 5’ untranslated region of the Fragile-X gene
  • people with disease have more than 200 repeats
  • the sequence CGG can be methylated, leading to reduction of transcription of the gene, so that effectively no fragile-X protein is synthesized
26
Q

other non-repetitive sequences

A

remainder of the genome consists of unique (non-repeated) sequences

  • there can be enormous stretches of DNA between functional genes, and much of this is considered to be ‘spacer’ DNA, with no obvious function
    (e.g., around the lysozyme gene)