Understanding Our Genome Flashcards

1
Q

When creating a cDNA library, how would you isolate the mRNA from the rest of the RNA in a sample? What conditions would you use?

A

Use oligo dT chromatography: eukaryotic mRNAs have a poly-A tail at their 3’ end, so they bind to an affinity column that has oligo dT (short poly-T) strands attached. Hydrogen bond formation is favoured by high salt conditions, so the sample is loaded in high salt, and the mRNA is then eluted under low salt conditions, which break the hydrogen bonds.

2
Q

What things do you need if you want to copy mRNA to cDNA?

A

Reverse transcriptase, the nucleotides (dNTPs) and an oligo dT primer (that is complementary to the poly-A tail at the 3’ end of the mRNA).
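
As a rough illustration of what reverse transcriptase produces, the sketch below treats the reaction as string manipulation. The function name `first_strand_cdna` and the example transcript are made up for illustration; it only models base pairing and strand direction, not the enzyme or the primer chemistry.

```python
# Illustrative sketch only: models reverse transcription as string manipulation.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> paired DNA base


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an mRNA written 5'->3'."""
    # The new strand is antiparallel to the template, so read the mRNA backwards.
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


mrna = "AUGGCCAAAAAA"  # hypothetical transcript ending in a short poly-A tail
cdna = first_strand_cdna(mrna)
print(cdna)  # the run of T's at the 5' end corresponds to the oligo dT primer
```

The cDNA begins with a run of T's because synthesis starts from the oligo dT primer annealed to the poly-A tail.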

3
Q

What is a hybrid?

A

A molecule that contains one strand of DNA and one strand of RNA. Formed when mRNA is copied to cDNA.

4
Q

Explain the use of ribonuclease H in the synthesis of double-stranded cDNA. What enzymes need to be used following this?

A

Ribonuclease H cleaves phosphodiester bonds in an RNA molecule that is hydrogen bonded to a DNA molecule (i.e. a hybrid molecule). The enzyme cuts the RNA strand, then DNA polymerase is used to replace the RNA with DNA, and DNA ligase seals the remaining gaps in the phosphodiester backbone. This creates a double stranded DNA molecule.

5
Q

How is a genomic library different to a cDNA library?

A

A genomic library includes clones that cover the whole genome, whereas a cDNA library includes clones that correspond to the mRNA sequences. cDNA libraries therefore change depending on which cell/tissue the mRNA was isolated from and give a representation of gene expression, whereas a genomic library will not change.

6
Q

After the double stranded cDNA is produced, what are the following steps in creating a cDNA library?

A

The double stranded cDNAs are cloned into vectors e.g. pUC19, and bacteria are transformed. The cDNA library produced is a collection of clones, with each clone carrying a different cDNA molecule.

7
Q

What does Contig mean?

A

When a genome is fragmented, a contig is a set of fragments that overlap (are contiguous). The overlaps allow you to put the sequence back together and work out the positions of the fragments.
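
A minimal sketch of how an overlap lets two fragments be joined, assuming exact matches and made-up sequences. The helper names `overlap_length` and `merge_fragments` are hypothetical, and real assembly software also has to handle sequencing errors and repeats.

```python
def overlap_length(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0  # the fragments share no end-to-start overlap


def merge_fragments(left: str, right: str) -> str:
    """Join two fragments, writing their shared overlap only once.

    If there is no overlap, the fragments are simply concatenated.
    """
    return left + right[overlap_length(left, right):]


# Two hypothetical fragments that share the bases "CGT"
contig = merge_fragments("ATGGCGT", "CGTTAGA")
print(contig)
```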

8
Q

How do you get the overlapping regions/contigs in order to sequence the DNA?

A

Use partial digestion: add a limited amount of restriction enzyme and only incubate for a short amount of time, so that the DNA is not cut at all of the restriction sites and you get overlapping regions.
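
To see why partial digestion gives overlapping fragments, the sketch below compares the fragments from a complete digest with those possible when any site may be left uncut. The function names, the 10 bp molecule and the site positions are all hypothetical; fragments are written as (start, end) coordinates.

```python
from itertools import combinations


def complete_digest(length: int, sites: list[int]) -> list[tuple[int, int]]:
    """Fragments when every site is cut: only neighbouring boundaries pair up."""
    bounds = [0, *sorted(sites), length]
    return list(zip(bounds, bounds[1:]))


def partial_digest(length: int, sites: list[int]) -> list[tuple[int, int]]:
    """All fragments possible when any subset of sites may be left uncut."""
    bounds = [0, *sorted(sites), length]
    return list(combinations(bounds, 2))


# Hypothetical 10 bp molecule with restriction sites after positions 3 and 7
print(complete_digest(10, [3, 7]))  # fragments never overlap
print(partial_digest(10, [3, 7]))   # includes fragments that span uncut sites
```

In the partial digest, a fragment such as (0, 7) overlaps (3, 10) in the region between the two sites, which is what allows fragments to be ordered.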

9
Q

How do you sequence DNA using contigs?

A

You sequence the short contigs individually (as no enzyme can read the whole sequence in one go), and the ends of the long fragments are also read so that the pieces can be joined.

10
Q

The dideoxy chain termination method (Sanger sequencing) was the method used to decipher the first sequence of the human genome. Describe the principle of this method.

A

It uses the fact that when a dideoxyribonucleoside triphosphate (ddNTP) is incorporated into a growing DNA strand, DNA synthesis terminates, because a ddNTP lacks the 3’ hydroxyl group needed to form the next phosphodiester bond.

11
Q

Describe how you carry out the dideoxy chain termination method.

A

First you need to synthesise a radioactive primer that is complementary to a known sequence on the DNA, and then you synthesise the DNA using dNTPs. However, you set up 4 tubes, each containing a different ddNTP at a low concentration, so the DNA synthesis in each tube will terminate at either the A’s, C’s, G’s or T’s. This creates fragments that differ in size by just one base. Use gel electrophoresis to determine the sequence based on the lengths of the fragments, reading from the bottom of the gel up.
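
The logic of reading the gel can be sketched as below, assuming an idealised reaction in which every possible termination product is made. The function names `run_four_tubes` and `read_gel` and the example strand are hypothetical; the sequence read off is that of the newly synthesised strand, which is complementary to the template.

```python
def run_four_tubes(new_strand: str) -> dict[str, list[int]]:
    """For each ddNTP tube, list the fragment lengths that end in that base."""
    tubes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # A chain that stops at this position was terminated by dd<base>TP.
        tubes[base].append(position)
    return tubes


def read_gel(tubes: dict[str, list[int]]) -> str:
    """Read the bands from shortest (bottom of gel) to longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in tubes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


tubes = run_four_tubes("GATTACA")  # hypothetical newly synthesised strand
print(tubes)
print(read_gel(tubes))
```

Each lane on its own only says where one base occurs; the sequence comes from ordering the bands of all four lanes by length.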

12
Q

Dideoxy sequencing is very time consuming, especially if you are trying to sequence an entire genome. How can this be overcome by altering the method?

A

You can put all of the ddNTPs in the same test tube, with each ddNTP having a different coloured fluorescent tag that is detected by a computer (creating a sequence trace). All 4 reactions are carried out together and run in a single capillary, and the sequencing is automated.

13
Q

Name a use of automated dideoxy sequencing.

A

To see if someone is homozygous or heterozygous for a particular allele e.g. in Trimethylaminuria.

14
Q

Why is the Illumina method called direct sequencing?

A

As no cloning is involved - the DNA is just fragmented into very small pieces (200 to 300bp in length).

15
Q

Describe the Illumina method.

A

1) DNA is fragmented into small pieces
2) Adaptors are attached to each end of the fragment
3) The adaptors will find and attach to their complementary primer sequence on a slide, and the DNA fragment bends
4) A DNA cluster is produced by repeated replication via DNA polymerase
5) One of the DNA strands is removed to provide a single stranded template
6) Primer added that is complementary to the original adaptor molecule, and bases are incorporated into the new strand
7) A laser activates the fluorescence, allowing the incorporated bases to be detected and recorded
8) Computer registers fluorescent events in each cluster on the slide, and as the fragments were attached in an ordered array on the slide, it knows where each DNA sequence came from

16
Q

What are the drawbacks of the Illumina method?

A

You have to rely on a computer (sacrifice accuracy), fragments are small, and scientists cannot read the sequence unlike in Sanger sequencing.

17
Q

What are the advantages of the Illumina method?

A

It is quick, cheap, and you can sequence entire genomes, including mixed samples containing many genomes (metagenomics).

18
Q

How big is the nuclear genome?

A

3x10^9 base pairs and about 25,000 genes. The genome is one set of chromosomes and comprises most of our genetic material.

19
Q

Describe the characteristics of the mitochondrial genome.

A

16.6kb long with 37 genes. Contains rRNA genes, tRNA genes, and polypeptide coding genes needed for oxidative phosphorylation. Inherited from the mother, undergoes no recombination, and up to 0.5% of the DNA in a cell is mitochondrial (as there are 100s of copies of the genome per cell).
Has two strands: heavy (G-rich and on the outside) and light (C-rich and on the inside). Genes are encoded on both strands but do not overlap, and do not have introns (as the genome is thought to have come from bacteria).

20
Q

What % of our genome is protein coding?

A

1.5%

21
Q

What was the first gene to be cloned and why?

A

The beta-globin gene, as a lot of the gene is exons (38%) and people wanted to understand diseases such as beta-thalassemia.

22
Q

As the gene gets bigger, the intronic proportion gets bigger, not necessarily the % of exons. What % of the dystrophin gene (longest human gene) is protein coding?

A

0.6%

23
Q

What genes are 100% protein coding?

A

tRNA genes, Histone H4, and alpha interferon.

24
Q

What are the 2 globin gene clusters, and why are they interesting?

A

The alpha globin gene cluster and the beta globin gene cluster. Interesting as the genes are arranged in the order in which they need to be activated during development (embryo -> fetus -> adult). The genes are thought to have arisen by duplication, with the function getting increasingly better (better at binding oxygen).

25
Q

How many histone gene clusters are there?

A

11: the gene families have been duplicated and members of a family have been moved onto different chromosomes.

26
Q

One class of pseudogenes is those that have arisen by duplication of a gene sequence. Explain this and give an example.

A

Evolution ‘tried out’ duplications of genes, which have subsequently acquired mutations and lost their functions. They just sit in the genome and no mRNA is transcribed as the promoter is mutated. Examples of this class of pseudogenes are those in the alpha and beta globin gene clusters.

27
Q

What are processed pseudogenes and how are they formed?

A

They are pseudogenes that are derived from a functional gene’s transcribed mRNA. The mRNA is copied into DNA by reverse transcriptase, and this DNA becomes integrated into one of our chromosomes. These pseudogenes therefore have no introns and have a poly-A tail. They occur due to random events e.g. viral/bacterial infection, and are often found on a different chromosome to the functional gene.

28
Q

What are the two major classes that transposable elements fall into?

A

1) Those that do code for reverse transcriptase
2) Those that do not code for reverse transcriptase

29
Q

What are LINEs?

A

Long interspersed nuclear elements: transposable elements that code for reverse transcriptase, comprise about 20% of our genome, and were inserted via an RNA intermediate.

30
Q

Describe the composition of a LINE.

A

Has 2 open reading frames (ORFs have the potential to code for a protein), with ORF2 encoding RT and endonuclease activity. There are target site duplications (TSDs) on either side of the LINE element, which have arisen due to the insertion of foreign DNA into the genome. Has 5’ and 3’ UTRs.

31
Q

LINEs promote their own transposition by having 4 things. What are these 4 things and why are they needed?

A

1) A promoter - for transcription
2) Reverse transcriptase - to copy the RNA to DNA to get it into the genome
3) Endonuclease - to cleave the target DNA to allow insertion
4) Ribonuclease (RNase) H - for RNA removal

32
Q

Explain how LINEs can insert into our genomes?

A

ORF1 and ORF2 proteins bind to the LINE mRNA and escort it back to the nucleus to be integrated into our DNA. The LINE mRNA anneals to the target DNA by its poly-A sequence, and an RNA-DNA hybrid forms. The LINE endonuclease causes target-site cleavage, creating a 3’ hydroxyl group from which reverse transcriptase can synthesise cDNA. LINE encoded RNase H degrades the RNA strand, and DNA polymerase replaces the RNA with DNA. DNA ligase joins and repairs the gaps in the strand.

33
Q

Most LINEs are truncated at their 5’ ends due to RT not fully copying the mRNA. Why is this good for us?

A

Truncated LINEs cannot be transcribed. This is good for us as the insertion of LINEs can disrupt genes and their transcription - LINE insertion is responsible for about 0.2% of disease causing mutations.

34
Q

How can LINEs be silenced in the genome?

A

They tend to be highly methylated, which protects us from them.

35
Q

Name some diseases that can be caused by LINE insertions.

A

X-linked muscular dystrophy
Haemophilia A and B
Colon cancer

36
Q

What happens when a LINE is inserted into an intron?

A

It makes the intron longer and so it takes longer for the whole gene to be transcribed, and may also affect gene splicing.

37
Q

What are SINEs?

A

Short interspersed nuclear elements. Unlike LINEs they do not code for proteins (including reverse transcriptase).

38
Q

Give an example of a SINE?

A

The Alu repeat family, the most abundant sequence in the human genome. Alu repeats are 280-300bp in length and are primate specific. They can be detected by digestion with the restriction endonuclease AluI followed by gel electrophoresis: the Alu repeat is so abundant that it is seen as a band, while the rest of the DNA is a smear.

39
Q

What do all SINEs have?

A

RNA polymerase III internal promoters

40
Q

Describe the structure of the Alu SINE.

A

Has an A box and a B box, which are binding sites for RNA pol III. Has TSDs on either side of the element, and has internal RNA pol III promoter.

41
Q

What is the Alu repeat thought to originate from?

A

From the 7SL RNA that is involved in the targeting of proteins to the endoplasmic reticulum.

42
Q

What are the 3 steps in the insertion of a SINE element into the genome?

A

1) SINE is transcribed by RNA polymerase III
2) Then the SINE RNA is copied by reverse transcriptase provided by a LINE
3) The process of integration is similar to that of a LINE

43
Q

Name some diseases that can be caused by SINE insertions.

A

Haemophilia A and B
Chronic haemolytic anaemia
Cystic fibrosis