Genome Structure Flashcards
what is the correlation between total genome size and organism size
- very weak correlation between total genome size (# of genes) and organism size/complexity (# of cells)
what is the correlation between noncoding sequences and organism complexity
- positive correlation between noncoding sequences and organism complexity
composition of human genome
- exons (1.5%)
- introns & regulatory sequences (24%)
- unique non-coding DNA (15%)
- repetitive DNA (59%)
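
To make these shares concrete, here is a minimal sketch that converts the percentages above into approximate megabases. The genome size `GENOME_MB` is an assumed round figure (roughly 3,100 Mb for the haploid human genome) that is not given in the cards, and `to_megabases` is a helper name introduced only for illustration.

```python
# Percentages come from the card above; GENOME_MB is an assumption.
GENOME_MB = 3100  # assumed approximate haploid human genome size, in Mb

composition = {
    "exons": 1.5,
    "introns & regulatory sequences": 24,
    "unique non-coding DNA": 15,
    "repetitive DNA": 59,
}


def to_megabases(percentages, genome_mb=GENOME_MB):
    """Convert percentage shares into approximate megabase amounts."""
    return {name: genome_mb * pct / 100 for name, pct in percentages.items()}


sizes = to_megabases(composition)
for name, mb in sizes.items():
    print(f"{name}: ~{mb:.0f} Mb")

# The listed shares add up to 99.5%, so the card's figures are rounded.
print(f"total listed: {sum(composition.values())}%")
```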
how are genomes organized/distributed (2)
- gene-rich regions or gene-deserts
- biological significance of these regions is unknown
gene-rich regions
- chromosomal regions that have many more genes than expected from the average gene density over the entire genome
- GC rich regions
gene deserts (3)
- regions of >1 Mb that have no identifiable genes
- AT rich regions
- gene deserts make up 3% of the human genome
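
The GC-rich versus AT-rich distinction can be checked by computing GC content, the fraction of bases that are G or C. The sketch below assumes plain DNA strings; `gc_content` is an illustrative helper, and the two sequences are made-up toy strings, not real genomic data.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    # Count only unambiguous bases; symbols such as N are skipped.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Made-up toy sequences for illustration only.
gc_rich_example = "GCGCGGCCATGCGC"
at_rich_example = "ATATTTAAGCATAT"

print(f"GC-rich example: {gc_content(gc_rich_example):.2f}")
print(f"AT-rich example: {gc_content(at_rich_example):.2f}")
```

In practice GC content is usually computed over sliding windows along a chromosome rather than over a single short string.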
how do genome sequence studies affirm evolution from a common ancestor (3)
- genetic components of the basic cellular machinery of all living organisms are remarkably similar
- this suggests that all living organisms descend from a single common ancestor
- this means that analysis of model organisms can provide biological insight into the corresponding human systems
why would a nematode have so many more genes than a single-celled yeast
- nematode is multicellular and needs additional genes for functions specific to multicellular organisms
- eg. muscles, nerves, etc
what experiments might yeast be a good model organism for
- cell cycle experiments
what experiments might nematodes be a good model organism for
- neuron or muscle experiments
protein structure
- relationship between amino acid sequence, secondary structure, motifs, domains, and overall tertiary structure
secondary structure
- beta sheets and alpha helices
protein motif (2)
- short conserved amino acid sequence (<20 aa) that forms a structure of biological significance
- simple combinations of secondary structure elements
protein domain (3)
- region of a protein that folds into a stable 3D structure independently of the rest of the protein chain
- typically combinations of secondary structures and motifs organized into a characteristic structure that is shared among members of a protein family
- can be used in combination in many types of proteins
SH2 domain (3)
- protein domain of ~100 aa
- first identified as a conserved sequence found in oncoproteins
- recognizes phosphorylated tyrosine motifs and is found in many proteins involved in tyrosine kinase signalling
SH2 function (2)
- connect other functional domains to phosphorylated proteins
- eg. dock to phosphorylated proteins or act in a signalling pathway
notes on domain size (2)
- domains have a limit on size
- majority of domains (90%) have fewer than 200 residues, and are usually ~100 residues
domains and forms of life
- they are common building material used by nature to generate new sequences, as many domain families are found in all three domains of life (bacteria, archaea, eukaryotes)
what do domains tell us about the origin and evolution of proteins (2)
- majority of genomic proteins are multidomain proteins created as a result of gene duplication and domain shuffling/mixing
- suggests that new sequences are adapted from pre-existing sequences rather than invented
what domains are shared between yeasts and nematodes
- domains that are shared pertain to basic conserved functions WITHIN cells
what domains are not shared between nematodes and yeasts
- likely pertain to multicellular function/specialized cells that nematodes have, but yeasts don’t have
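
The shared versus not-shared comparison is essentially a set operation on each organism's domain inventory. A minimal sketch, assuming each inventory is a set of domain names; the names `domain_A` through `domain_E` are hypothetical placeholders, not real annotation data, and `compare_domains` is an illustrative helper.

```python
def compare_domains(yeast, nematode):
    """Split domain names into those shared and those found only in the nematode."""
    shared = yeast & nematode          # expected: basic functions within cells
    nematode_only = nematode - yeast   # expected: multicellular/specialized functions
    return shared, nematode_only


# Hypothetical placeholder inventories for illustration only.
yeast_domains = {"domain_A", "domain_B", "domain_C"}
nematode_domains = {"domain_A", "domain_B", "domain_C", "domain_D", "domain_E"}

shared, nematode_only = compare_domains(yeast_domains, nematode_domains)
print("shared:", sorted(shared))
print("nematode only:", sorted(nematode_only))
```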
how are proteins involved in core processes related to those involved in multicellularity
- virtually all biological processes characteristic of multicellular life are performed by proteins that are not close variants of proteins responsible for the core processes, even though they might share some domains
what can we infer from knowing that “protein domains can be found in many combinations within proteins”
- most proteins have evolved as a result of gene duplication, domain shuffling, and mixing
what can we predict by comparing genomes of different species (3)
evolutionary trends:
- evolution of new regulatory or signalling domains
- evolution of new domain architectures from shared (presumably preexisting) domains
- expansion of particular domain families by a series of duplications
between DNA sequences, mRNA sequences, and amino acid sequences, where would you expect to find the greatest sequence similarity (2)
- amino acid sequences
- different nucleotide sequences can encode the same amino acid sequence (the genetic code is degenerate), and DNA still includes introns
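
The first reason can be shown with a small worked example: two coding sequences that differ at half of their nucleotide positions but translate to the same peptide. This is a sketch under stated assumptions; `CODON_TABLE` is only a subset of the standard genetic code (Met, Leu, Ser, Lys), the two sequences are made up for illustration, and `translate` and `identity` are helper names introduced here.

```python
# Subset of the standard genetic code (DNA codons), enough for this example.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "AAA": "K", "AAG": "K",
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) into amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Made-up sequences that use different synonymous codons.
seq1 = "ATGCTTTCTAAA"
seq2 = "ATGTTGAGCAAG"

print(translate(seq1), translate(seq2))
print("nucleotide identity:", identity(seq1, seq2))
print("amino acid identity:", identity(translate(seq1), translate(seq2)))
```

Here the nucleotide sequences match at only 6 of 12 positions, while the translated peptides are identical.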