Bioinformatics Flashcards
How is information encoded in DNA and what is its purpose?
• DNA holds information
• The information is encoded in the nucleotide sequence
Describe the nucleotide base pairs in DNA and what their purpose is
Complementarity (A pairs with T, G pairs with C)
• Copying and repair
• Mechanism for heredity
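Worked example (a minimal sketch, not a library routine): because A pairs with T and G pairs with C, one strand fully determines the other, which is what makes copying and repair possible. The function names below are made up for illustration.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of the strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction (strands are anti-parallel)."""
    return complement(strand)[::-1]


print(complement("ATGCGT"))          # TACGCA
print(reverse_complement("ATGCGT"))  # ACGCAT
```

Taking the reverse complement twice gives back the original strand, which is a quick way to check the pairing table.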
Why do organisms differ from each other?
• Organisms differ because their DNA sequences differ
What is the genetic similarity between 2 humans?
o The genetic similarity between two humans is 99.9%
What is the genetic similarity between a human and a banana?
o The genetic similarity between a human and banana is 60%
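Worked example (a sketch under simplifying assumptions): similarity figures like these come from comparing sequences position by position. The function below assumes the two sequences are already aligned and the same length, which real genome comparisons are not; the name percent_identity and the toy sequences are made up. The last lines show what 0.1% difference means for a genome of about 3 billion base pairs.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Toy sequences that differ at one position out of ten.
print(percent_identity("ATGCGTACGT", "ATGCGTACGA"))  # 90.0

# 99.9% similar means about 1 position in 1000 differs.
GENOME_BP = 3_000_000_000  # rounded figure for the human genome
differing_bp = GENOME_BP // 1000
print(differing_bp)  # 3000000
```

So two humans who are 99.9% similar can still differ at roughly 3 million positions.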
Describe the structure of DNA
DNA structure- made of nucleotides joined into DNA strands
• Two anti-parallel strands form a double helix
• Each strand is complementary
• Genes spaced along strands
What is DNA packaged in?
Packaged into chromosomes
What is the genome?
• Genome- complete DNA/RNA sequence required to maintain or make an organism
What is genomics?
• Genomics- study of entire genome and products (RNA, proteins)
What is the aim of genomics and what does it involve?
o Aim: To understand how the genome contributes to the functioning of the cell, organism, population and ecosystem
o Involves large-scale molecular biology and genetics, which generate complex data sets
o Relies heavily on automated data acquisition and computer-based data analysis (bioinformatics)
How many base pairs are in the human genome?
• More than 3 billion base pairs
How many protein-coding genes are in the human genome?
• About 20,000 protein-coding genes (early estimates were around 30,000)
What is the function of interspersed DNA?
o Interspersed DNA- seems to have a regulatory function
Why is genomics potentially useful?
• Human health
o Detect genetic variants that increase disease risk
o Identify cancer mutations, search for cure/treatment
o Pathogen identification and mitigation (e.g. SARS disease)
Identify antibiotic targets
Rapid disease identification
• Agriculture/animal breeding
o Enhance productivity, consistency and progeny quality
o Identify disease-causing genetic variants
• Study biology of non-model organisms
• Answer fundamental questions
Describe 2 methods for sequencing DNA
• Sanger sequencing, older method still in use
• De novo sequencing of genomes
Describe how Sanger sequencing works
• Sanger sequencing, older method still in use:
o Amplify the gene of interest by PCR or by cloning it in a plasmid, to make many copies of the same piece of DNA
DNA is separated into 2 strands
o Dye-terminator (Sanger) sequencing of the product
Take purified copies of the DNA fragment
Add dNTPs, a primer (complementary to one end of the fragment; each sequencing reaction uses a single primer) and DNA polymerase
Add ddNTPs- fluorescently labelled chain terminators that identify the last base incorporated into the chain. They have no 3’-OH group, so no further nucleotide can be added.
• Once a ddNTP is added to a chain, chain elongation stops
Fragments are separated by size and the labelled last base of each is read in order of length, revealing the original DNA sequence
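Worked example (an idealised sketch, not a model of real chemistry): chain termination produces a fragment ending at every possible position, each carrying a labelled last base. Sorting the fragments by length and reading the last base of each recovers the sequence of the newly made strand. The function names are made up, and the sketch assumes exactly one fragment per position and a template written so that the new strand reads left to right.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragments(template: str) -> list[str]:
    """One chain-terminated product per position of the newly synthesised strand."""
    new_strand = "".join(COMPLEMENT[base] for base in template)
    # Each fragment ends where a ddNTP happened to be incorporated.
    return [new_strand[:end] for end in range(1, len(new_strand) + 1)]


def read_sequence(fragments: list[str]) -> str:
    """Order fragments by size and read the labelled last base of each."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


fragments = terminated_fragments("TACGGCAT")
random.Random(0).shuffle(fragments)  # the reaction tube holds fragments in no particular order
print(read_sequence(fragments))      # ATGCCGTA
```

The read-out is the complement of the template, since the labelled bases belong to the strand the polymerase built.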
What is a chromatogram and how do you read one?
o Chromatogram- result of Sanger sequencing
Each peak represents a fluorescent flash from a labelled nucleotide
• Each of the four nucleotides is labelled with a different colour- the order of the colours gives the order of the nucleotides
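Worked example (a toy sketch with invented numbers): reading a chromatogram amounts to asking, at each peak position, which of the four colour channels is strongest. The intensities below are made up for illustration, and the channels are keyed by base rather than by colour because the colour assigned to each base depends on the instrument.

```python
def call_bases(peaks: list[dict[str, float]]) -> str:
    """At each peak, call the base whose channel has the highest intensity."""
    return "".join(max(peak, key=peak.get) for peak in peaks)


# Invented fluorescence intensities for four consecutive peaks.
peaks = [
    {"A": 0.90, "C": 0.05, "G": 0.10, "T": 0.00},
    {"A": 0.10, "C": 0.00, "G": 0.05, "T": 0.80},
    {"A": 0.00, "C": 0.10, "G": 0.95, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.10, "T": 0.10},
]
print(call_bases(peaks))  # ATGC
```

When two channels are nearly equal at one position the call is ambiguous, which this sketch does not try to handle.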
What length of DNA is chromatogram (Sanger) sequencing suitable for?
Suitable for short stretches of DNA (about 700 bp) but not for long stretches, as it would take too much time
A whole genome could be sequenced a bit at a time
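Worked example (simple arithmetic, using the rounded figures from these cards): how many 700 bp reads it would take to cover a 3 billion bp genome just once, with no overlaps at all. Real projects need many more reads, because the reads must overlap to be joined up.

```python
GENOME_BP = 3_000_000_000  # human genome, rounded
READ_BP = 700              # approximate length of one Sanger read


def min_reads(genome_bp: int, read_bp: int) -> int:
    """Smallest number of reads that could cover the genome once (ceiling division)."""
    return -(-genome_bp // read_bp)


print(min_reads(GENOME_BP, READ_BP))  # 4285715
```

More than 4 million separate reactions is why sequencing a genome a bit at a time is slow.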
What are the advantages of de novo sequencing of genomes?
• De novo sequencing of genomes
o Newer method of sequencing
Works for any organism
No need to know gene sequence in advance
• Don’t need primers
o Very rapid
o Quite cheap
Human: about $1000 USD
Send a sample to a lab; the sequence may come back in small pieces within 5 days and the entire assembled genome within 2 months
What other names is de novo sequencing known by?
o Whole genome shotgun sequencing, massively parallel sequencing, or a name based on the machine manufacturer (e.g. Illumina or 454)
How does whole genome shotgun sequencing occur?
Extraction of DNA -> many copies of the genome (1/cell)
DNA samples are placed in an instrument where high-intensity sound waves break the DNA into billions of pieces, each only about 600 bases long
Special tags are added to the ends of the fragmented DNA
• Add adaptors to fragments to anchor them to a support
Tagged strands of DNA attach to a tagged slide
In a sequencer, each piece of DNA is copied hundreds to thousands of times
• Creates clusters of identical DNA fragments
• PCR to amplify each fragment
Sequencer reads the DNA in parallel, one base at a time using different coloured tags for each DNA base
Special sensors within the machine detect different coloured tags
Computers piece together the individual DNA fragments: they use overlaps to lay out the reads, build a consensus, and determine the order and orientation of the contigs
Genome sequenced
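Worked example (a toy sketch, not how production assemblers work): the "piece together using overlaps" step can be illustrated with a greedy merge that repeatedly joins the two reads sharing the longest suffix-to-prefix overlap. The function names and the min_len threshold are made up for illustration, and the sketch assumes error-free reads that all come from the same strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair until no overlaps remain; returns the contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads taken from the toy sequence ATGCGTACGTTAGC.
reads = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Reads that share no overlap are left as separate contigs, which is why gaps in coverage break an assembly into pieces.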
What are reads?
• Reads- the product of sequencing (the short sequences read from each DNA fragment)
Why do we want to sequence the strand many times in whole genome shotgun sequencing?
• Sequencing each region many times (sequencing in depth) makes it easier to find overlaps between reads
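Worked example (simple arithmetic): sequencing depth can be estimated as the total number of bases read divided by the genome length. The 600-base read length and 3 billion bp genome come from these cards; the 30x target is only an assumed figure for illustration.

```python
def depth(n_reads: int, read_bp: int, genome_bp: int) -> float:
    """Average number of times each base of the genome is covered by a read."""
    return n_reads * read_bp / genome_bp


def reads_for_depth(target_depth: int, read_bp: int, genome_bp: int) -> int:
    """Number of reads needed to reach the target average depth (ceiling division)."""
    return -(-target_depth * genome_bp // read_bp)


GENOME_BP = 3_000_000_000
READ_BP = 600

n_reads = reads_for_depth(30, READ_BP, GENOME_BP)  # 30x is an assumed target
print(n_reads)                             # 150000000
print(depth(n_reads, READ_BP, GENOME_BP))  # 30.0
```

This is an average only; individual regions can end up with much more or much less coverage.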
What are contigs?
• Contig- stretch of contiguous sequence
o Reads stitched together with no gaps
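Worked example (a minimal sketch): once reads have been laid out at their positions, a contig can be built by taking the most common base in each column. The layout format (offset, read), the function name and the toy reads are assumptions made for illustration. One read contains a deliberate error to show how depth lets the other reads outvote it, and a column with no reads raises an error because a contig has no gaps.

```python
from collections import Counter


def consensus(layout: list[tuple[int, str]]) -> str:
    """Majority-vote consensus of reads placed at known offsets along a contig."""
    length = max(offset + len(read) for offset, read in layout)
    columns = [Counter() for _ in range(length)]
    for offset, read in layout:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    if any(not column for column in columns):
        raise ValueError("layout has a gap, so it is not a single contig")
    return "".join(column.most_common(1)[0][0] for column in columns)


layout = [
    (0, "ATGCGTAC"),
    (2, "GCGTACGT"),
    (2, "GCGAACGT"),  # same region, with one sequencing error (A instead of T)
    (4, "GTACGTTA"),
]
print(consensus(layout))  # ATGCGTACGTTA
```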