De Novo Gene Birth Flashcards
de novo gene birth (3)
- definition
- common or rare
- requirements of process
- formation of new genes from non-gene sequences
- considered to be very rare
- DNA region needs to take on features of an ORF and gain the ability to be recognized by transcriptional machinery
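A minimal sketch of what "features of an ORF" can mean in practice: a start codon followed by an in-frame stop codon. The function name `find_orfs` and the `min_codons` cutoff are assumptions for illustration; only the forward strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Codon count includes both the start and the stop codon.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

example_orfs = find_orfs("CCATGAAATTTTAGCC")
print(example_orfs)
```

Recognition by the transcriptional machinery is a separate requirement and is not modeled here.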
why are genes that arise by de novo gene birth called orphan genes
- because they are genes that were not derived from other genes (no parent genes)
theory of de novo gene birth: expression first (2)
- protogene model: several short ORFs, each capable of producing a short polypeptide, eventually merge into a single gene that encodes a longer transcript and protein
- OR: the ORF contains premature stop codons that prevent it from being fully translated; mutations that remove the premature stops allow the full ORF to be expressed
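A small sketch of the premature-stop scenario, using a made-up sequence: one point mutation turns an in-frame stop (TAA) into a sense codon (CAA), so the reading frame runs through to the next stop. The helper names and both sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(orf):
    """Count codons from the first codon up to (not including) the first in-frame stop."""
    count = 0
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count

def point_mutation(seq, position, base):
    """Return a copy of seq with the base at `position` replaced."""
    return seq[:position] + base + seq[position + 1:]

ancestral = "ATGGCATAAGGTCTGTGA"              # ATG GCA TAA GGT CTG TGA
derived = point_mutation(ancestral, 6, "C")   # TAA -> CAA removes the early stop

print(codons_before_stop(ancestral), codons_before_stop(derived))
```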
theory of de novo gene birth: ORF first
- an ORF already exists but has no regulatory sequences directing its expression; once mutations allow transcription factors (TFs) to bind, the ORF is expressed
overprinting (3)
- special case of de novo gene birth
- a new ORF develops that overlaps an existing ORF or gene, but in a different reading frame
- further changes occur that allow the new ORF to be transcribed
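To make the reading-frame idea concrete, here is a sketch that looks for pairs of ORFs that overlap on the same strand but start in different frames. The toy sequence and the function names (`orf_end`, `overprinted_pairs`) are invented for illustration; real overprinting analyses also consider the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_end(dna, start):
    """Return the end index (exclusive) of the ORF beginning at `start`, or None."""
    if dna[start:start + 3] != "ATG":
        return None
    for i in range(start + 3, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return i + 3
    return None

def overprinted_pairs(dna):
    """List pairs of ORFs that overlap but are read in different frames."""
    orfs = []
    for start in range(len(dna) - 2):
        end = orf_end(dna, start)
        if end is not None:
            orfs.append((start, end))
    pairs = []
    for a in orfs:
        for b in orfs:
            different_frame = a[0] % 3 != b[0] % 3
            overlapping = a[0] < b[1] and b[0] < a[1]
            if a < b and different_frame and overlapping:
                pairs.append((a, b))
    return pairs

# Frame 0 reads ATG CAT GGC CCT CTG ACC TAA; frame 1 reads ATG GCC CTC TGA from index 4.
toy = "ATGCATGGCCCTCTGACCTAA"
print(overprinted_pairs(toy))
```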
exonization (3)
- special case of de novo gene birth
- sequence within an intron gains mutations that allow it to be recognized as an exon
- new exon codes for a structure that did not exist previously within the gene or gene family
What are the steps to finding evidence of de novo gene birth? (this case: de novo genes in humans) (5)
- BLAST search of human protein sequences against other primate databases
- remove sequences that lack Start and Stop codons
- find homologous non-gene sequences within primate genomes
- attempt to trace the mutations/sequence differences that could turn non-gene sequences into a coding sequence in humans
- confirm transcription/translation of the coding sequences
what is the rationale behind:
1. BLAST search of human protein sequences against other primate databases
- identify which human genes have homologs in other primates (shared, so NOT unique to humans), leaving the human-specific candidates
what is the rationale behind:
2. remove sequences that lack Start and Stop codons
- ignore the DNA sequences that will not be translated
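The start/stop filter could be sketched as below. The candidate names and sequences are hypothetical, and requiring a length that is a multiple of three is an added assumption, not part of the card.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_complete_orf(cds):
    """True if the sequence starts with ATG, ends with a stop codon, and is in frame."""
    cds = cds.upper()
    return (
        len(cds) >= 6
        and len(cds) % 3 == 0          # assumption: whole codons only
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )

def filter_candidates(candidates):
    """Keep only candidates with both a start and a stop codon."""
    return {name: seq for name, seq in candidates.items() if has_complete_orf(seq)}

# Hypothetical candidate sequences.
candidates = {
    "cand1": "ATGGCCAAATGA",   # start and stop present
    "cand2": "GCCAAATGA",      # no start codon
    "cand3": "ATGGCCAAA",      # no stop codon
}
kept = filter_candidates(candidates)
print(sorted(kept))
```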
what is the rationale behind:
3. find homologous non-gene sequences within primate genomes
- see whether the human-specific genes match non-coding sequences in other primates, which would suggest they arose from non-genic DNA
what are some events that could create a de novo gene (7)
- mutations to the sequence
- expansion of the mutation
- fusion of the ORF to a signal peptide
- frameshift mutations to align certain sequences
- gain of 3’ and 5’ UTRs
- gain of a TATA motif (regulatory element)
- gain of a stop codon
major strategies to detect de novo gene birth (3)
- analyze sequence similarity between genes of closely related species
- use synteny-based approaches
- a combination of the two strategies
major strategies to detect de novo gene birth: analyze sequence similarity between genes of closely related species (2)
- infer whether an ancestral homolog exists between related species
- genes lacking a common ancestor may be novel orphan genes (not derived from an existing gene)
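A rough sketch of the similarity-based screen, assuming the homology search results have already been parsed into a dictionary of (subject, E-value) hits per gene. The data layout, the gene names, and the 1e-5 cutoff are all assumptions for illustration.

```python
def orphan_candidates(hits, e_value_cutoff=1e-5):
    """Return genes with no hit at or below the E-value cutoff in related species."""
    return sorted(
        gene
        for gene, gene_hits in hits.items()
        if not any(e_value <= e_value_cutoff for _subject, e_value in gene_hits)
    )

# Hypothetical parsed search results: gene -> list of (subject, E-value).
hits = {
    "geneX": [("chimp_1", 1e-40)],   # clear homolog, not an orphan
    "geneY": [],                     # no hits at all
    "geneZ": [("gorilla_7", 0.5)],   # only a weak, non-significant hit
}
candidates = orphan_candidates(hits)
print(candidates)
```

Genes on this list are only candidates: a missing hit can also reflect rapid divergence or incomplete annotation, which is why the synteny step matters.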
major strategies to detect de novo gene birth: synteny-based approaches (2)
- analyze regions of synteny and use this to identify non-genic sequences that appear related to the novel gene
- this traces the sequence changes that led to the formation of the new gene
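The synteny idea can be sketched with gene order alone: find the nearest conserved neighbors on each side of the candidate, then look at what lies between those same neighbors in the related species. Gene names, the list-based genome representation, and the function name are hypothetical.

```python
def syntenic_interval(focal_order, outgroup_order, candidate):
    """Return (left anchor, right anchor, outgroup genes between the anchors), or None."""
    idx = focal_order.index(candidate)
    shared = set(outgroup_order)
    # Nearest flanking genes that are also present in the outgroup.
    left = next((g for g in reversed(focal_order[:idx]) if g in shared), None)
    right = next((g for g in focal_order[idx + 1:] if g in shared), None)
    if left is None or right is None:
        return None
    i, j = sorted((outgroup_order.index(left), outgroup_order.index(right)))
    return left, right, outgroup_order[i + 1:j]

# Hypothetical gene orders along one chromosome region.
human = ["geneA", "geneB", "novel1", "geneC", "geneD"]
chimp = ["geneA", "geneB", "geneC", "geneD"]

result = syntenic_interval(human, chimp, "novel1")
print(result)
```

An empty list between the anchors means the outgroup has no annotated gene there, so the matching region is non-genic and its sequence can be compared with the new gene to trace the enabling mutations.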