Microbial Genomics Flashcards
What is a “genome”?
What is genomics?
- Genome
- entire complement of genetic information
- Includes genes, regulatory sequences, and noncoding segments of the genome
- Genomics
- Discipline of mapping, sequencing, analyzing, and comparing genomes
What are the 3 steps to sequencing a genome?
- Sequence the DNA
- Assemble the sequences into chromosomes or large fragments called “contigs”
- Predict genes (annotation)
What was the first “major” way to sequence a genome?
- Sanger Sequencing
- Used a dideoxy analog of the dNTP (a ddNTP) that is missing the OH on the 3’ carbon
- Chain extension stops wherever that analog is incorporated
What is “sequencing” in Genomics?
Determining the precise order of nucleotides in a DNA molecule
How many templates should a Sanger-esque sequencing technique have?
1, purified template
What are some of the New sequencing techniques?
- 454 - 1 million reads/run; 700 bp length (no longer supported)
- Illumina - 4 billion reads/run; 300 bp length
- PacBio - 125,000 reads/run; 2000 bp length
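As a rough comparison, the total bases per run can be estimated as reads per run times read length. The sketch below only uses the figures from the card above; the function name `bases_per_run` is made up for illustration.

```python
# Approximate yield per run = reads per run x read length (bp).
# Figures are the ones listed on the card, not vendor specifications.
platforms = {
    "454": (1_000_000, 700),
    "Illumina": (4_000_000_000, 300),
    "PacBio": (125_000, 2_000),
}

def bases_per_run(reads, read_length):
    """Return the approximate number of bases produced in one run."""
    return reads * read_length

for name, (reads, length) in platforms.items():
    total = bases_per_run(reads, length)
    # Report in megabases (1 Mb = 1,000,000 bases)
    print(f"{name}: {total / 1e6:,.0f} Mb per run")
```

By this estimate Illumina yields far more total sequence per run even though its individual reads are the shortest of the three.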
What is the “modern” standard for genome sequencing?
Illumina
What is the 4th generation sequencing Technology?
Basically, how does it work and what is its limitation?
- Nanopore sequencing
- Detects electrical current as DNA is threaded through a protein pore
- Detects 3 bases at a time
- Uses a Helicase (maybe) to feed DNA through the pore.
- High error rate - 5-38.2%
How do you assemble a genome, after sequencing?
- Align all reads to generate longer sequences - contigs
- Gaps may still remain where no reads are found
- Orientation can be assumed if closely related genomes have been completed
- Must close the gap with PCR and sequencing of PCR amplicons by Sanger Seq
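The idea of aligning reads into contigs can be sketched with a toy greedy overlap merge. This is an illustration only, assuming error-free reads on one strand; real assemblers use far more robust methods, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            # No overlaps left: whatever remains are separate contigs (gaps between them)
            return contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)] + [merged]

# Made-up reads: three overlap, one does not, so a gap remains
reads = ["ATGGCGT", "GCGTACC", "ACCTTGA", "GGGCCCAAA"]
contigs = greedy_assemble(reads)
print(contigs)
```

The read with no overlap is left as its own contig, which mirrors the gap problem: its position and orientation relative to the other contig cannot be determined from these reads alone.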
How do you fill gaps in your genome after you’ve sequenced it?
- You could try to orient it if closely related genomes have been completed
- Have to close with PCR and sequencing of PCR amplicons by Sanger seq
What is the last step in sequencing a genome?
Annotating (Bioinformatics)
What is Annotation?
What do the majority of genes encode?
What are functional RNAs?
- Converting raw sequence data into a list of genes or elements present in the genome
- proteins - mRNAs
- Functional RNAs = rRNAs, tRNAs, ncRNAs
What is one way we could use organisms with a similar ORF for sequencing?
- Most genes code for proteins, and functional RNAs like rRNAs and tRNAs are very conserved
- So, if organisms have similar ORFs you could assume the function of the ORF or even assume sequence…
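A minimal sketch of how an ORF might be located in raw sequence during annotation. It assumes ATG start codons, the standard stop codons, and the forward strand only; real gene predictors also scan the reverse strand and handle alternative starts. The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, orf_sequence) for ATG...stop ORFs in the three forward frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it is long enough (stop codon included)
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs

# Made-up sequence containing one short ORF
orfs = find_orfs("CCATGAAATTTTAGGG")
print(orfs)
```

Once ORFs are listed, each one could be compared against ORFs from related organisms to suggest a function.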
What is synteny?
arrangement of genes in a genome
What are things you can do with a genome?
- Compare gene sequences between bacteria
- Compare mutations between strains
- compare arrangement of genes in genome
- Design new therapeutics to treat
How many bases are in a kilobase?
How many bases are in a megabase?
- Kilo - 1,000
- Mega - 1,000,000
The bigger your genome –> the [] number of protein-coding genes…
Increases
The larger your genome –> The [] % of gene-coding
smaller
What may be some reasons for why bacteria have such a low % of non-coding DNA?
- They are small. Don’t want a big genome.
- The metabolic burden would be too great to have too large a genome.
What types of genes are typically the most abundant known class?
metabolic
Based on the standard deviations, there is basically no difference between the gene distribution in [] and [] …..
Bacteria and Archaea
What is the metagenome?
The total genetic content of the organisms present in an environment
What is used in the bacterial world’s metagenomics to assign species?
16S rRNA
What can/can’t Metagenomics do?
- Metagenomics Can
- Sequence of DNA in a sample
- Diversity information
- Relative quantities of community members
- Metagenomics Can’t
- Tell you the useful members of a community
- Who is alive/who is dead
- What pathways are being used in an environment
- so use TRANSCRIPTOMICS instead!
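The "relative quantities of community members" point can be illustrated by counting reads assigned to each taxon and dividing by the total. This is only a sketch: the taxon labels are made-up placeholders, and the function name `relative_abundance` is hypothetical.

```python
from collections import Counter

def relative_abundance(assignments):
    """Map each taxon to its fraction of all assigned reads."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical per-read taxon assignments (e.g., from 16S rRNA matches)
reads = ["taxon_A", "taxon_A", "taxon_B", "taxon_C"]
abundance = relative_abundance(reads)
print(abundance)
```

These fractions say who is present and in what proportion of the DNA, not who is alive or which pathways are active.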
Transcriptome:
- The entire complement of [] produced under a given set of []
- Why study?
- Global gene []
- Expression of specific groups of [] under different []
- Expression of genes with [] function
- Examine host-[] interactions
- RNA, conditions
- Why?
- Expression
- genes, conditions
- unknown
- pathogen
Which method of Transcriptome measurement is not good for Global genomics? Why?
- qRT-PCR - uses reverse transcriptase to convert RNA –> DNA - then quantitative PCR to estimate transcript abundance
- Can only use it for individual genes!
Measuring Transcriptomes:
What are Microarrays?
What is RNA-seq?
- Microarray (used for whole transcriptome)
- requires genome sequence, compare two samples together, is always relative abundance of transcripts between samples
- Limited because you have to know exactly what you’re testing
- RNA-seq (used for whole transcriptome)
- compare global gene expression in a mixed sample
Transcriptomics vs Proteomics
- Transcriptomics
- All transcripts in cells under a given condition
- what should get made
- only way to get info on ncRNAs, sRNAs, rRNAs, and tRNAs
- don’t see post-transcriptional / post-translational effects though
- Proteomics
- All proteins that are present in cells under a given condition
- what does get made
- more informative
- MUCH more difficult
- 2d-gels, Mass spec
How are 2D gels used in Proteomics?
- technique used for separation, identification, and measurement of all proteins present in a sample
- 1st horizontal dimension - separated by isoelectric point (the pH at which the protein has no net charge)
- 2nd vertical dimension - proteins separated by size
- Identified by N-terminal protein sequencing or mass spectrometry
How are proteins identified from 2D gel after separation?
N-terminal protein sequencing or mass spectrometry
Is Mass spectrometry good for complex samples?
no
What is metabolomics?
- Complete set of metabolic intermediates and other small molecules produced in an organism
What is the difference between primary and secondary metabolites?
- Primary -
- essential for growth, development, or reproduction of the organism (e.g., stuff in the TCA cycle)
- Secondary -
- Specific for environment
- Not necessary for cell but give a specific advantage
- Drugs
- Antibiotics
What is one of the primary techniques for monitoring metabolites?
Mass Spectrometry
Horizontally transferred genes typically [] [] encode core [] functions
- do not
- metabolic
What mediates horizontal gene transfer?
Transformation, conjugation, and transduction
What are ways to detect Horizontal gene Flow?
- DNA with GC or codon bias that differs from the parent strain
- presence of genes typically found in distantly related species
- Change in position of a conserved gene compared to close relatives
What are the differences between Homologous - Paralog - Ortholog - Xenolog?
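The GC-bias clue can be sketched as a sliding-window scan that flags regions whose GC content differs from the genome-wide average. The window size, step, and threshold here are arbitrary assumptions for illustration, and `atypical_windows` is a hypothetical name.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def atypical_windows(genome, window=1000, step=500, threshold=0.10):
    """Return (start, end, gc) for windows whose GC differs from the genome average by more than threshold."""
    genome_gc = gc_content(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        w_gc = gc_content(genome[start:start + window])
        if abs(w_gc - genome_gc) > threshold:
            flagged.append((start, start + window, round(w_gc, 3)))
    return flagged

# Toy "genome": AT-rich background with a GC-rich island in the middle
genome = "AT" * 50 + "GC" * 10 + "AT" * 50
flagged = atypical_windows(genome, window=20, step=20)
print(flagged)
```

A flagged window is only a candidate; a GC shift alone does not prove horizontal transfer, which is why the other clues on the card are checked as well.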
- Homologous - related in sequence to an extent that implies common genetic ancestry - can be paralog or ortholog
- Paralog - genes within an organism whose similarity to one or more genes in the same organism is the result of gene duplication
- 2 closely related genes from an ancestor, in the organism
- Ortholog - similar gene from a common ancestor but found in different species
- Xenolog - genes in the same organism that came about through horizontal gene transfer from a different organism
Core + Accessory=:
The Core genome is…
the Accessory genome is…
- Core = genes shared by all strains of the species or group
- Accessory = genes found in only some members of the species or group
- Core + Accessory = Pan
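The core, accessory, and pan genome definitions map directly onto set operations. A minimal sketch, assuming each strain is represented as a set of gene names; the strain labels and gene lists are made up for illustration, and the function needs at least one strain.

```python
def pan_core_accessory(strains):
    """Return (pan, core, accessory) gene sets for a dict of strain -> genes."""
    gene_sets = [set(genes) for genes in strains.values()]
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes shared by all strains
    accessory = pan - core                   # genes found in only some strains
    return pan, core, accessory

# Hypothetical strains and gene content
strains = {
    "strain_1": {"dnaA", "gyrB", "blaTEM"},
    "strain_2": {"dnaA", "gyrB"},
    "strain_3": {"dnaA", "gyrB", "stx1"},
}
pan, core, accessory = pan_core_accessory(strains)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
print("pan:", sorted(pan))
```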