Microbial Genomics Flashcards

1
Q

What is a “genome”?

What is genomics?

A
  • Genome
    • Entire complement of genetic information
    • Includes genes, regulatory sequences, and noncoding segments of the genome
  • Genomics
    • Discipline of mapping, sequencing, analyzing, and comparing genomes
2
Q

What are the 3 steps to sequencing a genome?

A
  1. Sequence the DNA
  2. Assemble the sequences into chromosomes or large fragments called “contigs”
  3. Predict genes (annotation)
3
Q

What was the first “major” way to sequence a genome?

A
  • Sanger Sequencing
    • Uses a dideoxy analog of each dNTP (a ddNTP) that lacks the OH group on the 3’ carbon
      • Without a 3’ OH the chain cannot be extended, so synthesis stops wherever the analog is incorporated
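The chain-termination idea can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the function name `sanger_read` and the example sequence are hypothetical, and the code models the newly synthesized strand directly rather than a real template, primer, and polymerase.

```python
def sanger_read(synthesized: str) -> str:
    """Toy model of Sanger sequencing by chain termination."""
    fragments = []
    for base in "ACGT":
        # One reaction per dideoxy analog: synthesis stops wherever that
        # analog is incorporated, giving one fragment per matching position.
        for i, b in enumerate(synthesized):
            if b == base:
                fragments.append((i + 1, base))  # (fragment length, last base)
    # Sorting by length mimics size separation; reading the terminal base
    # of each fragment from shortest to longest recovers the sequence.
    return "".join(base for _, base in sorted(fragments))


read = sanger_read("GATTACA")  # hypothetical example sequence
```

Each fragment length occurs exactly once, so ordering the fragments by size is enough to read off the bases in order.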
4
Q

What is “sequencing” in Genomics?

A

Determining the precise order of nucleotides in a DNA molecule

5
Q

How many templates should a Sanger-esque sequencing technique have?

A

One purified template

6
Q

What are some of the new sequencing techniques?

A
  • 454 - 1 million reads/run; 700 bp length (no longer supported)
  • Illumina - 4 billion reads/run; 300 bp length
  • PacBio - 125,000 reads/run; 2000 bp length
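To put these figures side by side, a rough sketch can multiply reads per run by read length to get bases per run, then divide by a genome size to get approximate coverage. The 5 Mb genome size is an assumption for illustration, and treating every read as full length makes these upper-bound estimates.

```python
# Reads per run and read length (bp), as listed on the card.
platforms = {
    "454": (1_000_000, 700),
    "Illumina": (4_000_000_000, 300),
    "PacBio": (125_000, 2000),
}

GENOME_SIZE = 5_000_000  # hypothetical 5 Mb bacterial genome (assumption)


def bases_per_run(reads: int, length: int) -> int:
    """Total bases sequenced if every read reaches the listed length."""
    return reads * length


def coverage(reads: int, length: int, genome_size: int = GENOME_SIZE) -> float:
    """Average number of times each base of the genome would be read."""
    return bases_per_run(reads, length) / genome_size


for name, (reads, length) in platforms.items():
    print(name, bases_per_run(reads, length), coverage(reads, length))
```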
7
Q

What is the “modern” standard for genome sequencing?

A

Illumina

8
Q

What is the 4th-generation sequencing technology?

Basically, how does it work and what is its limitation?

A
  • Nanopore sequencing
    • Detects changes in electrical current as DNA is threaded through a protein pore
    • Detects 3 bases at a time
    • Uses a Helicase (maybe) to feed DNA through the pore.
  • High error rate - 5-38.2%
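Two numbers on this card lend themselves to quick arithmetic: reading 3 bases at a time means 4^3 possible base combinations in the pore, and the quoted error range translates into an expected number of wrong bases per read. The sketch below assumes a hypothetical 1,000-base read; `expected_errors` is an illustrative helper, not part of any sequencing software.

```python
from itertools import product

# Every possible 3-base combination that could sit in the pore at once.
kmers = ["".join(p) for p in product("ACGT", repeat=3)]
print(len(kmers))  # 4 ** 3 combinations


def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of miscalled bases in a read of the given length."""
    return read_length * error_rate


READ_LENGTH = 1_000  # hypothetical read length (assumption)
low = expected_errors(READ_LENGTH, 0.05)    # 5% error rate from the card
high = expected_errors(READ_LENGTH, 0.382)  # 38.2% error rate from the card
```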
9
Q

How do you assemble a genome, after sequencing?

A
  1. Align all reads to generate longer sequences - contigs
  2. Gaps may still remain where no reads are found
    1. Orientation can be assumed if closely related genomes have been completed
    2. Must close the gap with PCR and sequencing of PCR amplicons by Sanger Seq
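Step 1 can be sketched as a greedy overlap merge. This is a minimal illustration, not how production assemblers work: the function names, the minimum overlap of 3 bases, and the example reads are all assumptions, and the sketch ignores sequencing errors and reverse-complement reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            # Nothing left to merge: what remains are contigs with gaps between them.
            return seqs
        merged = seqs[i] + seqs[j][n:]
        seqs = [s for k, s in enumerate(seqs) if k not in (i, j)] + [merged]


contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])  # hypothetical reads
```

When no reads span a region, the function returns more than one contig, which corresponds to the gaps described in step 2.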
10
Q

How do you fill gaps in your genome after you’ve sequenced it?

A
  1. You could try to orient it if closely related genomes have been completed
  2. Have to close the gap with PCR and sequencing of PCR amplicons by Sanger seq
11
Q

What is the last step in sequencing a genome?

A

Annotating (Bioinformatics)

12
Q

What is Annotation?

What do the majority of genes encode?

What are functional RNAs?

A
  1. Converting raw sequence data into a list of genes or elements present in the genome
  2. proteins - mRNAs
  3. Functional RNAs = rRNAs, tRNAs, ncRNAs
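One simple piece of annotation is scanning for open reading frames (ORFs). The sketch below is an assumption-laden illustration: it only looks at the forward strand, treats ATG as the only start codon, and uses a hypothetical minimum length; real gene predictors do considerably more.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, orf) for ATG...stop ORFs on the forward strand."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it is long enough (stop codon included).
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


orfs = find_orfs("CCATGAAATAGGG")  # hypothetical sequence
```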
13
Q

What is one way we could use organisms with a similar ORF for sequencing?

A
  • Most genes code for proteins, and functional RNAs like rRNAs and tRNAs are highly conserved
    • So, if organisms have similar ORFs, you could infer the function of the ORF or even predict its sequence…
14
Q

What is synteny?

A

arrangement of genes in a genome
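One way to compare gene arrangement between two genomes is to count which neighboring gene pairs they share. The sketch below is a simplified illustration; the gene labels and both gene orders are hypothetical, and strand orientation is ignored.

```python
def shared_adjacencies(order_a, order_b):
    """Adjacent gene pairs (ignoring direction) present in both gene orders."""
    def pairs(order):
        return {frozenset(p) for p in zip(order, order[1:])}
    return pairs(order_a) & pairs(order_b)


# Hypothetical gene orders: genome_b carries an inversion of the C-D-E block.
genome_a = ["A", "B", "C", "D", "E"]
genome_b = ["A", "B", "E", "D", "C"]
conserved = shared_adjacencies(genome_a, genome_b)
```

Here the inversion keeps the neighbors inside the block together but breaks the B-C junction, so that adjacency is missing from the result.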

15
Q

What are things you can do with a genome?

A
  1. Compare gene sequences between bacteria
  2. Compare mutations between strains
  3. Compare arrangement of genes in genome
  4. Design new therapeutics to treat
16
Q

How many bases are in a kilobase?

How many bases are in a megabase?

A
  1. Kilo - 1,000
  2. Mega - 1,000,000
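As a quick check on the units, a small helper can convert sizes to bases. The function name `to_bases` and the 4.6 Mb example value are assumptions for illustration.

```python
UNITS = {"bp": 1, "kb": 1_000, "Mb": 1_000_000}


def to_bases(value: float, unit: str) -> int:
    """Convert a size given in bp, kb, or Mb to a number of bases."""
    return round(value * UNITS[unit])


genome_bp = to_bases(4.6, "Mb")  # hypothetical 4.6 Mb genome
```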
17
Q

The bigger your genome –> the [] number of protein-coding genes…

A

Increases

18
Q

The larger your genome –> the [] the % of the genome that is gene-coding

A

smaller
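The last two cards can hold together: a bigger genome can carry more genes while a smaller share of it is coding. The numbers below are made up purely to show the arithmetic and are not measurements from any real organism.

```python
def coding_percent(gene_lengths_bp, genome_size_bp: int) -> float:
    """Percentage of the genome covered by protein-coding genes."""
    return 100 * sum(gene_lengths_bp) / genome_size_bp


# Hypothetical genomes (assumptions): every gene is 1,000 bp long.
small = coding_percent([1_000] * 900, 1_000_000)     # 900 genes in 1 Mb
large = coding_percent([1_000] * 5_000, 10_000_000)  # 5,000 genes in 10 Mb
```

The larger hypothetical genome has more genes (5,000 versus 900) but a lower coding percentage.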

19
Q

What may be some reasons why bacteria have such a low % of non-coding DNA?

A
  1. They are small. Don’t want a big genome.
  2. The metabolic burden would be too great to have too large a genome.
20
Q

What types of genes are typically the most abundant known class?

A

metabolic

21
Q

Based on the standard deviations, there is basically no difference between the gene distribution in [] and [] …..

A

Bacteria and Archaea

22
Q

What is the metagenome?

A

The total genetic content of the organisms present in an environment

23
Q

What is used in the bacterial world’s metagenomics to assign species?

A

16S rRNA

24
Q

What can/can’t Metagenomics do?

A
  • Metagenomics Can
    • Sequence of DNA in a sample
    • Diversity information
    • Relative quantities of community members
  • Metagenomics Can’t
    • Tell you the useful members of a community
    • Who is alive/who is dead
    • Which pathways are being used in an environment
    • so use TRANSCRIPTOMICS instead!
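The "relative quantities of community members" point can be sketched as a simple tally. This assumes each read has already been assigned to a taxon (for example from its 16S rRNA sequence); the taxon labels and counts are hypothetical.

```python
from collections import Counter


def relative_abundance(assignments):
    """Fraction of reads assigned to each taxon."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical per-read taxon assignments (assumption).
reads = ["taxon_A", "taxon_A", "taxon_B", "taxon_C"]
abundance = relative_abundance(reads)
```

These fractions describe the DNA in the sample only; they say nothing about which members are alive or active.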
25
Q

Transcriptome:

  1. The entire complement of [] produced under a given set of []
  2. Why study?
    1. Global gene []
    2. Expression of specific groups of [] under different []
    3. Expression of genes with [] function
    4. Examine host-[] interactions
A
  1. RNA, conditions
  2. Why?
    1. Expression
    2. genes, conditions
    3. unknown
    4. pathogen
26
Q

Which method of Transcriptome measurement is not good for Global genomics? Why?

A
  • qRT-PCR - uses reverse transcriptase to convert RNA –> DNA - then quantitative PCR to estimate transcript abundance
  • Can only use it for individual genes!
27
Q

Measuring Transcriptomes:

What are Microarrays?

What is RNA-seq?

A
  • Microarray (used for whole transcriptome)
    • requires genome sequence, compare two samples together, is always relative abundance of transcripts between samples
    • Limited because you have to know exactly what you’re testing
  • RNA-seq (used for whole transcriptome)
    • compare global gene expression in a mixed sample
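Comparing expression between two samples usually comes down to a ratio of transcript abundances. The sketch below uses counts-per-million scaling and a log2 ratio with a pseudocount; these are common choices but are assumptions here, not something the cards specify, and the gene names and counts are hypothetical.

```python
import math


def cpm(counts):
    """Scale raw read counts to counts per million mapped reads."""
    total = sum(counts.values())
    return {g: c * 1_000_000 / total for g, c in counts.items()}


def log2_fold_change(sample_a, sample_b, pseudocount: float = 1.0):
    """log2 ratio of sample_b to sample_a; both must list the same genes."""
    a, b = cpm(sample_a), cpm(sample_b)
    return {g: math.log2((b[g] + pseudocount) / (a[g] + pseudocount)) for g in a}


# Hypothetical read counts per gene in two conditions (assumption).
control = {"gene1": 100, "gene2": 100}
treated = {"gene1": 300, "gene2": 100}
lfc = log2_fold_change(control, treated)
```

A positive value means relatively more transcript in the second sample, a negative value relatively less.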
28
Q

Transcriptomics vs Proteomics

A
  • Transcriptomics
    • All transcripts in cells under a given condition
      • what should get made
      • only way to get info on ncRNAs, sRNAs, rRNAs, and tRNAs
      • don’t see post-transcriptional / post-translational effects though
  • Proteomics
    • All proteins that are present in cells under a given condition
      • what does get made
      • more informative
      • MUCH more difficult
      • 2d-gels, Mass spec
29
Q

How are 2D gels used in Proteomics?

A
  • technique used for separation, identification, and measurement of all proteins present in a sample
    • 1st horizontal dimension - separated by isoelectric point (the pH at which a protein has no net charge)
    • 2nd vertical dimension - proteins separated by size
    • Identified by N-terminal protein sequencing or mass spectrometry
30
Q

How are proteins identified from 2D gel after separation?

A

N-terminal protein sequencing or mass spectrometry

31
Q

Is Mass spectrometry good for complex samples?

A

no

32
Q

What is metabolomics?

A
  • Complete set of metabolic intermediates and other small molecules produced in an organism
33
Q

What is the difference between primary and secondary metabolites?

A
  • Primary -
    • essential for growth, development, or reproduction of the host (e.g., intermediates of the TCA cycle)
  • Secondary -
    • Specific for environment
    • Not necessary for cell but give a specific advantage
      • Drugs
      • Antibiotics
34
Q

What is one of the primary techniques for monitoring metabolites?

A

Mass Spectrometry

35
Q

Horizontally transferred genes typically [] [] encode core [] functions

A
  1. do not
  2. metabolic
36
Q

What mediates horizontal gene transfer?

A

Transformation, conjugation, and transduction

37
Q

What are ways to detect horizontal gene flow?

A
  • DNA with GC or codon bias that differs from the parent strain
  • presence of genes typically found in distantly related species
  • Change in position of a conserved gene compared to close relatives
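The first signal, atypical GC content, can be sketched as a window scan. This is a minimal illustration: the window size, threshold, and toy sequence are assumptions, and it does not look at codon bias.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def atypical_windows(genome: str, window: int, threshold: float):
    """Start positions of non-overlapping windows whose GC fraction differs
    from the genome-wide GC fraction by more than `threshold`."""
    overall = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, window):
        if abs(gc_content(genome[start:start + window]) - overall) > threshold:
            hits.append(start)
    return hits


# Toy genome (assumption): AT-rich background with one GC-rich insert.
genome = "ATATATATAT" * 3 + "GCGCGCGCGC" + "ATATATATAT" * 3
hits = atypical_windows(genome, window=10, threshold=0.5)
```

Windows flagged this way are only candidates; the other two signals on the card would be needed to support a horizontal transfer.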
38
Q

What is the difference between Homologous, Paralog, Ortholog, and Xenolog?

A
  • Homologous - related in sequence to an extent that implies common genetic ancestry - can be paralog or ortholog
  • Paralog - genes within an organism whose similarity to one or more genes in the same organism is the result of gene duplication
    • 2 closely related genes from an ancestor, in the organism
  • Ortholog - similar gene from a common ancestor but found in different species
  • Xenolog - genes in the same organism that came about through horizontal gene transfer from a different organism
39
Q

Core + Accessory = ?

The Core genome is…

the Accessory genome is…

A
  1. Core = genes shared by all strains of the species or group
  2. Accessory = genes found in only some members of the species or group
  3. Core + Accessory = Pan
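These definitions map directly onto set operations. The sketch below assumes each strain is represented by the set of gene names it carries; the strain and gene names are hypothetical.

```python
def pan_genome(strains):
    """Return (core, accessory, pan) gene sets for a dict of strain -> genes."""
    gene_sets = [set(genes) for genes in strains.values()]
    core = set.intersection(*gene_sets)  # genes shared by all strains
    pan = set.union(*gene_sets)          # every gene seen in any strain
    accessory = pan - core               # genes found in only some strains
    return core, accessory, pan


# Hypothetical gene content per strain (assumption).
strains = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneB"},
}
core, accessory, pan = pan_genome(strains)
```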