Human Genome and Genomics Flashcards

1
Q

Is DNA sequencing a common practice in genetics?

A
  • The ultimate fine structure map of a gene or chromosome is its nucleotide pair sequence
  • Prior to 1975 the thought of sequencing entire chromosomes was barely conceivable
  • Today sequencing is a routine laboratory procedure
2
Q

What DNA Sequencing procedures are there?

A
  • There are two different procedures by which the nucleotide sequences of DNA molecules can be determined:
    1. Manual sequencing
    2. Automated sequencing
3
Q

What are the first two steps of DNA sequencing?

A
  • To sequence the DNA it must first be separated into two strands.
  • The strand to be sequenced must be copied using chemically altered bases
4
Q

What are the last two steps of DNA sequencing?

A
  • The altered bases cause the copying process to stop each time one particular letter is incorporated into the growing DNA chain
  • This process is carried out for all four bases, and then the fragments are put together like a jigsaw to reveal the original piece of DNA
5
Q

What is manual DNA sequencing based on?

A

Manual DNA sequencing is also known as the Sanger Dideoxy Method and it is based on three main points:

  • The synthesis (using DNA polymerase) of new strands of DNA that are copied from the DNA (template) of interest
  • The incorporation of a 32P label (to allow detection)
  • The termination of DNA synthesis at defined positions
6
Q

What does Sanger sequencing start with?

A
  • Sanger sequencing starts with DNA synthesis
  • In DNA synthesis, the 3’-OH group is required for the formation of the phosphodiester bond that joins the next nucleotide
  • As long as this free 3’-OH is present, elongation continues
7
Q

What happens in Sanger sequencing if there is no 3’-OH group?

A
  • If the 3’-OH group is not present (i.e. the nucleotide incorporated is a 2’,3’-dideoxyribonucleotide), then synthesis of the new DNA chain terminates
  • A 2’,3’-dideoxyribonucleoside triphosphate (ddNTP) has hydrogen atoms in place of the hydroxyl groups at both the 2’ and 3’ positions (two oxygens are missing compared with a ribonucleotide), so there is no 3’-OH for the next nucleotide to join
8
Q

What is the first step in the process of Sanger sequencing?

A
  • The first step is to set up four DNA polymerisation reactions in four separate tubes, each containing the following components:
    • Template strand
    • Primer strand
    • DNA polymerase
    • dGTP, dATP, dTTP, 32P-dCTP
9
Q

What is the second step in the process of Sanger sequencing?

A
  • Add one of the four 2’3’-dideoxy-ribonucleoside triphosphate chain terminators to each of the four reaction mixtures
10
Q

What are steps 3-6 in the process of Sanger sequencing?

A
  • Denature the reaction products
  • Load them on a polyacrylamide gel
  • Separate the products based on size by gel electrophoresis
  • Expose the gel to x-ray film
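
The chain-termination logic described in the cards above can be sketched in a few lines of Python. This is a simplified illustration, not a laboratory protocol: the template sequence and all function names are made up for the example, the primer is ignored, and every possible termination product is assumed to appear as a band.

```python
# Toy model of Sanger chain termination (illustrative names and data only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesized_strand(template_3_to_5):
    """Return the new strand (5'->3') copied from a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def termination_lengths(new_strand):
    """For each ddNTP tube, list the fragment lengths at which synthesis stops."""
    tubes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        tubes[base].append(length)  # the ddNTP of this base can stop the chain here
    return tubes

def read_gel(tubes):
    """Read bands from shortest to longest fragment to recover the sequence."""
    bands = sorted((length, base) for base, lengths in tubes.items() for length in lengths)
    return "".join(base for _, base in bands)

template = "TACGGATC"                      # hypothetical template, written 3'->5'
new_strand = synthesized_strand(template)  # "ATGCCTAG"
tubes = termination_lengths(new_strand)    # e.g. the ddATP tube gives lengths [1, 7]
recovered = read_gel(tubes)                # matches new_strand
```

Reading the four lanes from the smallest fragment upwards gives the sequence of the newly synthesized strand, which is the complement of the template.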
11
Q

What does the “32P” mean?

A
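  • The 32P in front of a nucleotide means that the nucleotide is radio-labelled with the radioactive isotope phosphorus-32, which allows the newly synthesised fragments to be detected on x-ray film. The label does not remove the 3’-OH group; it is the dideoxynucleotides (ddNTPs) that lack a 3’-OH. When writing 32P, remember that 32 is the mass number of the isotope, so it is written as a superscript (^32P)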
12
Q

What is the general procedure for Automated DNA sequencing?

A
  • Similar in principle to the manual Sanger sequencing method
  • Each ddNTP has a different fluorescent tag attached (instead of radio-labelling the dNTPs)
  • All four reactions are loaded in the same lane on the gel (because all the different bases have a different fluorescent colour they can all be done at the same time in the same tube)
  • A fluorescent detector analyses the position of each fluorescent tag as they pass through the gel or capillary tube
  • Software manipulates the data to give a print out
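
As a rough sketch of what the base-calling software does, the snippet below converts a list of detected fluorescent peaks into a sequence. The dye-to-base colour assignments, the peak times, and the function name are assumptions chosen for illustration; real instruments use their own dye sets and far more elaborate signal processing.

```python
# Assumed dye colours; real instruments and chemistries may differ.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}

def call_bases(peaks):
    """Turn (detection_time, dye_colour) peaks into a base sequence.

    Shorter fragments reach the detector first, so sorting by time
    gives the bases in 5'->3' order of the synthesized strand.
    """
    return "".join(DYE_TO_BASE[colour] for _, colour in sorted(peaks))

# Hypothetical detector output, deliberately out of order.
peaks = [(3.1, "yellow"), (1.0, "green"), (2.2, "red"), (4.5, "blue"), (5.0, "blue")]
sequence = call_bases(peaks)  # "ATGCC"
```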
13
Q

Study the diagrams for DNA synthesis in Sanger sequencing, Process of Sanger sequencing and Automated DNA Sequencing

A

https://docs.google.com/document/d/1Nzo4FTzXCbwOZjpoc_J_4IF3gsOXPcoyC2BowELmx0U/edit?usp=sharing

14
Q

What is Genomics?

A
  • Branch of molecular biology concerned with the structure, function, evolution, and mapping of genomes.
15
Q

How was characterisation of genes in a genome initially performed and why was it limited in humans?

A
  • Identifying or generating mutants
  • Generating linkage maps using these mutant strains
  • This approach is limited in humans, i.e. it is restricted to linkage mapping of inherited or spontaneously acquired mutations with clear phenotypes
16
Q

What techniques were used instead of the initial characterisation of genes in a genome approach?

A
  • In the 1980s, researchers began using recombinant DNA techniques to map chromosomes
  • Resulted in the assignment of about 3,500 markers and genes to human chromosomes
  • The coupling of recombinant DNA techniques and automated DNA sequencing accelerated the study of the human genome (Genomics)
17
Q

How did the Human Genome Project begin?

A
  • The Human Genome Project (public endeavor) was launched in 1990, and was estimated to cost $3 billion and take 15 years
  • Initially focused on generating detailed physical and genetic maps of the human genome
  • Development of automated sequencing techniques and establishment of International Human Genome Sequencing Consortium (IHGSC) made large-scale sequencing possible
  • The IHGSC group used a map-based approach to sequencing the genome
18
Q

Who was Craig Venter?

A
  • In 1998, Craig Venter (Celera Genomics) announced a private bid to sequence the human genome
  • Venter proposed a shotgun sequencing approach he claimed would be quicker than the map-based approach
19
Q

What was the shotgun sequencing approach?

A
  • Shotgun sequencing is the method that was used by the private genome project
  • It requires multiple copies of the genome (Human) which are effectively blown up into millions of little fragments
  • Each fragment is then sequenced
  • The small fragments are assembled using an immense amount of computer power to match overlapping sections
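
The idea of matching overlapping sections can be illustrated with a toy greedy assembler. This is only a minimal sketch under strong assumptions (error-free reads, no repeats, a made-up example sequence, hypothetical function names); it is not the algorithm used by the genome projects, whose assemblers were far more sophisticated.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps; the remaining reads stay separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)] + [merged]
    return reads

# Hypothetical fragments from the made-up sequence ATGCGTACGTTAGC.
fragments = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
assembled = greedy_assemble(fragments)  # ["ATGCGTACGTTAGC"]
```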
20
Q

What is the drawback of the shotgun approach?

A
  • The drawback of this approach appears when dealing with repeat sequences. Often there is no way of knowing how long the repeat sequence is, or in which of the many possible positions the fragments overlap
  • Even the incredibly powerful software used to shotgun sequence the human genome couldn’t cope with this.
  • Celera, the private company which relied on this approach had to use the public data to fill in the gaps left by the repeats
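
The repeat problem can be demonstrated with a small, deliberately simplified example: two made-up sequences that differ only in the number of copies of a repeat produce exactly the same set of distinct short reads, so the reads alone cannot reveal how long the repeat is. The sequences, read length, and function name are assumptions for illustration, and the model ignores read counts (coverage depth).

```python
def read_set(genome, read_len):
    """All distinct reads of a fixed length that can be drawn from a sequence."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

# Two hypothetical sequences differing only in the number of GT repeats.
short_repeat = "AAC" + "GT" * 3 + "CAA"
long_repeat = "AAC" + "GT" * 5 + "CAA"

# Reads shorter than the repeat region cannot tell the two apart.
indistinguishable = read_set(short_repeat, 4) == read_set(long_repeat, 4)  # True
```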
21
Q

What are the first two steps in the shotgun approach?

A
  • Small overlapping DNA fragments called “contigs” are generated either using restriction enzymes that cut at different sites or through partial digestion using a single enzyme
  • The fragments are inserted into plasmids and cloned in bacteria (small-insert clones)
22
Q

What are the last two steps in the shotgun approach?

A
  • Fragments sequenced using automated high-throughput systems
  • The development of computer-automated high-throughput sequencing systems was the major technological advancement that made genomics possible
  • 96 capillary gels run simultaneously at 900 bases/run (1 hr) x 24 hr ≈ 2,000,000 bases/day
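
The throughput figure can be checked with simple arithmetic. The sketch below uses the numbers from the card; the estimate of how long one machine would need for a single pass over a 3-billion-base genome is an added back-of-the-envelope assumption (real projects used many machines and several-fold coverage).

```python
import math

capillaries = 96        # capillary gels run simultaneously
bases_per_run = 900     # bases read per capillary per run
runs_per_day = 24       # one run per hour, around the clock

bases_per_day = capillaries * bases_per_run * runs_per_day  # 2,073,600

# Assumption for illustration: one machine, one-fold coverage of 3 billion bases.
genome_bases = 3_000_000_000
days_for_one_pass = math.ceil(genome_bases / bases_per_day)  # 1447 days
```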
23
Q

How does the Map-based sequencing Approach start?

A
  • The human genome contains more than 3 billion nucleotides
  • The first stage of the public Human Genome Project focused on identifying marker sequences at regular intervals throughout the genome
  • Once enough sequences were tagged, various blocks of the genome were allocated to academic centres for sequencing
24
Q

What happened after the tagged blocks of the genome were sent for sequencing during the map based sequencing approach?

A
  • To begin the sequencing process, 7 copies of a section of DNA are cleaved to produce smaller fragments. This step is a small-scale shotgun that creates numerous random fragments. Each fragment is sequenced, and computer programs then align the overlaps between fragments to build up an entire “page”
  • Marker sequences help establish the order of the pages in the book of life (genome)
25
Q

How were the marker-sequences beneficial to the map-based approach?

A
  • This process produced huge amounts of data that have been used to virtually reassemble our genome. However, there are gaps. Repeat sequences are common in the human genome, so repeats from entirely different chromosome regions may be wrongly joined together. It will take many years to identify the mismatches caused by repeat sequences. Some regions, especially near the centromeres, may never be fully finished
26
Q

What is the process of Map-Based Sequencing?

A
  • Chromosome is partially digested with restriction enzymes and ligated into YACs or BACs to create libraries of contigs (large insert clones)
  • Large-insert clones are placed in their correct order using a pre-existing genetic map of markers or by generating a restriction enzyme cleavage map
  • A subset of overlapping clones is selected and further fractured into smaller fragments that are subsequently cloned (small-insert clones) and sequenced
27
Q

How did the Human Genome Project pan out for the public and the private groups?

A
  • Both Human Genome Sequencing Consortium (map based approach) and Celera (shotgun approach) moved forward simultaneously
  • Rough drafts were published by both parties in the same week of February 2001, and were found to be amazingly consistent
28
Q

What is Intergenic DNA?

A
  • An intergenic region is a stretch of DNA sequence located between genes. Intergenic regions are a subset of noncoding DNA. Occasionally some intergenic DNA acts to control genes nearby, but most of it has no currently known function
29
Q

What results have been found from both Human Genome project groups?

A
  • On average, there is 1 gene per 145 kb (but there are gene clusters)
  • The average human gene is 27,000 bp in length and contains 9 exons
  • Exons make up 1.1% of the genome, introns make up 24% of the genome, and 75% of the genome is intergenic DNA
  • 44% of the intergenic DNA is derived from transposable elements
  • There are 22,287 protein coding genes
  • Additional genes specify RNA products (rRNA, tRNA, snRNA, miRNA).
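
A quick consistency check of these figures, offered as a rough sketch rather than an official calculation: multiplying the gene count by the average spacing should give roughly the genome size, and the three percentage categories should sum to about 100% (small deviations come from rounding in the quoted values). The variable names are illustrative.

```python
protein_coding_genes = 22_287
average_bp_per_gene = 145_000   # 1 gene per 145 kb

# Implied genome size: about 3.2 billion bp, in line with "more than 3 billion".
implied_genome_bp = protein_coding_genes * average_bp_per_gene  # 3,231,615,000

genome_percent = {"exons": 1.1, "introns": 24.0, "intergenic": 75.0}
total_percent = sum(genome_percent.values())  # about 100.1 because of rounding
```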
30
Q

Study the diagrams of the Shotgun Approach, Map-based Approach, Human Genome Project Results Table

A
  • Google Doc
31
Q

What are the differences between Shotgun and Map-Based sequencing?

A

Map-based sequencing:
- Organizes contigs from a restriction map before sequencing
- More time-consuming, cumbersome, expensive
Shotgun sequencing:
- Random sequencing and then assembly
- Faster, cheaper, and has now become the most common method of assembling first drafts of genomes

32
Q

What are both sequencing approaches used for currently?

A

The map-based approach is now used to resolve problems encountered during the shotgun approach. For example:

  • Assembling highly repetitive sequences
  • Resolving gaps in sequence

The shotgun and map-based approaches are usually combined to assemble complete or near-complete genome sequences

33
Q

What is an NGS?

A
  • Next-generation DNA Sequencing
34
Q

How does NGS work?

A

There are several NGS systems, which have been developed by different companies. However, all these systems share at least three fundamental steps:

  • DNA preparation and immobilization onto solid support
  • Amplification (PCR)
  • Sequencing
35
Q

What are the Advantages of an NGS?

A
  • Construction of a sequencing library -> clonal amplification to generate sequencing features
  • No in vivo cloning, transformation, colony picking, etc
  • Array based sequencing
  • Higher degree of parallelism than capillary-based sequencing
36
Q

So is there only one genome that represents the human race?

A
  • There is no single genome that represents the human species, instead there is a reference genome that was based on DNA donated from several individuals.
  • In reality there are billions of human genomes (every person’s genome is likely to be unique), and there is not even a single genome for a person (cells within the same individual will vary due to somatic mutations).
37
Q

Is the human genome that was studied by the Human Genome Project groups an ancestral sequence?

A
  • The available human genome is not the ancestral sequence for all humans, but instead an arbitrary sequence based on the idiosyncrasies of those individuals whose DNA was used in the human genome projects
38
Q

Are all human genomes very similar or are there numerous differences?

A
  • Genome Sequencing has identified a number of sites where humans differ in their sequence
  • There is 99.9% sequence identity between any 2 individuals. This might not seem like much of a difference, but given that the human genome consists of about 3 billion bp, it means that there are about 3 million bp differences between any 2 individuals.
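
The arithmetic behind this figure is short enough to write out. The sketch below uses integer maths and the rounded values from the card (3 billion bp and 1 difference per 1,000 bp, i.e. 0.1%); the variable names are illustrative.

```python
genome_bp = 3_000_000_000          # approximate human genome size
differences_per_1000_bp = 1        # 99.9% identity means 0.1% of sites differ

expected_differences = genome_bp * differences_per_1000_bp // 1000  # 3,000,000
```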
39
Q

What is a Single Nucleotide Polymorphism?

A
  • A single-nucleotide polymorphism is a substitution of a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population
  • It is a site in the genome where a person differs by a single base pair, called a single nucleotide polymorphism (SNP)
40
Q

Are SNPs inherited?

A
  • SNPs are inherited as allelic variants in the same way as alleles that produce phenotypic differences.
    • Can spread throughout a population over time.
  • Numerous and present throughout genomes.
    • When the same chromosome from two different people is compared, a SNP is found approximately every 1,000 bp
41
Q

What is a haplotype?

A
  • The specific set of SNPs and other genetic variants observed on a single chromosome or part of a chromosome is called a haplotype
  • Due to the rate of crossing over events being proportional to the physical distance between genes, SNPs that are located close to each other will be strongly associated as haplotypes.
42
Q

What are SNPs used for in studies?

A
  • The variability of SNPs and their widespread occurrence throughout the genome make them valuable markers in linkage studies
43
Q

Why are SNPs used as markers in linkage studies?

A
  • SNPs that are physically close to a disease-causing locus tend to be inherited along with the disease-causing allele.
  • People with disease tend to have different SNPs than healthy people.
  • Comparing SNP haplotypes between people with a disease and healthy people can help determine the location of the disease causing gene.
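
A minimal sketch of this comparison is shown below. The haplotypes are invented toy data (three SNP positions, four people per group) and the function names are hypothetical; real association studies use large cohorts and formal statistical tests rather than a raw frequency difference.

```python
from collections import Counter

# Hypothetical haplotypes: one letter per SNP position.
cases = ["AGT", "AGT", "ACT", "AGT"]      # people with the disease
controls = ["ACT", "ACA", "ACT", "GCT"]   # healthy people

def allele_frequency(haplotypes, position, allele):
    """Fraction of haplotypes carrying the given allele at a SNP position."""
    return sum(h[position] == allele for h in haplotypes) / len(haplotypes)

def most_associated_snp(cases, controls):
    """Find the SNP whose most common case allele differs most in frequency from controls."""
    best_position, best_difference = None, -1.0
    for position in range(len(cases[0])):
        common_allele = Counter(h[position] for h in cases).most_common(1)[0][0]
        difference = abs(allele_frequency(cases, position, common_allele)
                         - allele_frequency(controls, position, common_allele))
        if difference > best_difference:
            best_position, best_difference = position, difference
    return best_position, best_difference

position, difference = most_associated_snp(cases, controls)  # (1, 0.75)
```

In this toy data the middle SNP shows the largest frequency difference, so the disease-causing locus would be suspected to lie near it.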
44
Q

What is Direct-to-consumer personal genetic testing?

A
  • Provides a saliva-based direct-to-consumer personal genome test
  • Companies such as Ancestry and MyHeritage use SNP genotyping to assess the genetic variation between members of the same species (e.g. humans).
45
Q

What is DNA profiling?

A
  • DNA profiling is the process of determining an individual’s DNA characteristics, which are as unique as fingerprints
46
Q

What was the old way of DNA profiling?

A
  • Using restriction enzymes to cut DNA into fragments (restriction fragment length polymorphism, RFLP)
  • Due to differences within the genome, fragments from two people will be different sizes
47
Q

How is DNA profiling done today?

A
  • Variable number tandem repeats (VNTRs)
    • Certain DNA motifs are repeated
    • The motif does not change, but the number of times it repeats does
    • Distributed throughout the genome
48
Q

What type of Variable Number Tandem Repeats are used in DNA profiling?

A
  • Short tandem repeats (STR)
    • Very short DNA sequences repeated in tandem (adjacent)
    • The power of STR analysis comes from simultaneously looking at multiple STR loci that assort independently.
49
Q

How are STRs detected?

A
  • STRs are detected using PCR with primers that flank the microsatellite repeats
  • The higher the number of repeats the larger the fragment and the slower it moves via electrophoresis
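
The relationship between repeat number and fragment size can be sketched as follows. The repeat motif, the flanking lengths between the primers and the repeat region, and the function name are all assumed values for illustration, not data for any particular STR locus.

```python
MOTIF = "GATA"          # assumed 4-bp repeat motif
LEFT_FLANK_BP = 50      # assumed distance from forward primer to the repeats
RIGHT_FLANK_BP = 50     # assumed distance from the repeats to the reverse primer

def amplicon_length(repeat_count):
    """PCR product size: both flanks plus the repeated motif."""
    return LEFT_FLANK_BP + repeat_count * len(MOTIF) + RIGHT_FLANK_BP

# A person carrying alleles with 12 and 8 repeats at this hypothetical locus.
alleles = [12, 8]
fragments = sorted(amplicon_length(n) for n in alleles)  # [132, 148]
```

The 132 bp fragment (8 repeats) migrates faster than the 148 bp fragment (12 repeats).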
50
Q

Study the diagrams for Single Nucleotide Polymorphism, Short Tandem Repeats and the detection of STRs

A
  • Google doc
51
Q

In DNA profiling, how are the fragments created by the STRs identified?

A
  • Using fluorescent detection, the fragments are represented as peaks on a graph.
  • Homozygotes for an STR allele have a single tall peak
  • Heterozygotes have two shorter peaks
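
The peak-reading rule above can be expressed as a small function. This is a hedged sketch: the peak heights, the noise threshold `min_height`, and the function name are assumptions for illustration, and real profiling software applies many additional quality checks.

```python
def call_genotype(peaks, min_height=100):
    """Call a genotype from (repeat_count, peak_height) pairs at one STR locus.

    Peaks below min_height (an assumed noise threshold) are ignored.
    """
    alleles = sorted(repeats for repeats, height in peaks if height >= min_height)
    if len(alleles) == 1:
        return "homozygous", (alleles[0], alleles[0])
    if len(alleles) == 2:
        return "heterozygous", tuple(alleles)
    return "inconclusive", tuple(alleles)

# Hypothetical peak data for two loci.
locus_a = call_genotype([(9, 2100)])                         # one tall peak
locus_b = call_genotype([(8, 1050), (12, 980), (10, 40)])    # two shorter peaks plus noise
```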