What is a genome? Flashcards
Genomics
The study of an organism’s complete set of genetic information. The genome is the complete genetic information of an organism and includes both genes and non-coding DNA.
Genetics
The study of heredity, and of the function and composition of single genes. A gene is a specific sequence of DNA that codes for a functional molecule.
Gene is
difficult to define
How many genes in the human genome?
20,000-25,000 genes
What do genes do?
The basic functional unit of heredity. Genes code for products that may become proteins, RNAs or alternatively spliced versions of either. A gene can include enhancers and promoters.
How big are genes?
Highly variable; human genes range from 0.9 to 2,400 kb
What is the structure of a gene?
Potentially enhancers and promoters, a start codon, an ORF (exons and introns), a stop codon, and potentially alternative splice sites
How variable is the structure of our genes?
Must start with a start codon. Must end with a stop codon. Must contain at least a single exon. Everything else can be variable.
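The start/stop rule on this card can be illustrated with a minimal Python sketch. It assumes a protein-coding gene whose introns have already been removed (a spliced coding sequence) and the stop codons of the standard genetic code. The helper name `is_minimal_orf` is hypothetical, and this is an illustration only, not a gene-finding method.

```python
# Start and stop codons of the standard genetic code (DNA alphabet).
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_minimal_orf(cds: str) -> bool:
    """Return True if a spliced coding sequence looks like an open reading frame.

    Hypothetical helper: it checks only for a start codon, an in-frame
    length, a terminal stop codon and no premature in-frame stop codon.
    """
    cds = cds.upper()
    # At least a start and a stop codon, and a whole number of codons.
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != START_CODON or codons[-1] not in STOP_CODONS:
        return False
    # No stop codon is allowed before the final codon.
    return all(codon not in STOP_CODONS for codon in codons[1:-1])
```

For example, `is_minimal_orf("ATGGCCTAA")` is true, while a sequence with an early in-frame stop such as `"ATGTAAGCCTGA"` is rejected.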
Types of variation in our genes?
Balanced rearrangements (inversions and balanced translocations) or genomic imbalances (insertions and deletions); the imbalances are known as copy number variants (CNVs)
Gene organisation
Genes vary widely in size, ranging from 1 exon to 79 exons (DMD)
Genes can also code for
protein
Some genes are not involved in
protein coding; they code for RNA only
There’s an entire level of the
transcriptome: there could be over 100,000 transcripts, which generate more than a million different proteins that make up the proteome. All of this comes from a relatively small number of genes, around 20,000-25,000.
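As a rough sense of scale, the figures on this card can be turned into per-gene averages. The sketch below uses only the numbers quoted above; since the card says "over" 100,000 transcripts and "more than" a million proteins, the results are lower bounds, and the helper name `per_gene` is just illustrative.

```python
# Figures quoted on the card (treated here as lower bounds).
GENES_LOW, GENES_HIGH = 20_000, 25_000
TRANSCRIPTS = 100_000
PROTEINS = 1_000_000


def per_gene(total: int, genes: int) -> float:
    """Average number of products per gene."""
    return total / genes


# Range from the high gene estimate to the low gene estimate.
transcripts_per_gene = (per_gene(TRANSCRIPTS, GENES_HIGH), per_gene(TRANSCRIPTS, GENES_LOW))
proteins_per_gene = (per_gene(PROTEINS, GENES_HIGH), per_gene(PROTEINS, GENES_LOW))

print(transcripts_per_gene)  # (4.0, 5.0)
print(proteins_per_gene)     # (40.0, 50.0)
```

On these numbers, each gene gives rise to at least 4 to 5 transcripts and 40 to 50 distinct proteins on average.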
Initially how many genes did scientists think existed?
50,000, but there are actually fewer than originally expected. This does not mean there is no diversity: a large number of different proteins are still produced.
Haploid human genome sequence
Non-coding sequence elements, exons, introns, RNA coding and regulatory sequences and genes
Non-coding sequence elements
Regions of the genome whose function is not yet clear. They do not code for protein or RNA and are likely to be regulatory.
Exons
Protein-coding sequence; exons constitute only 1-2% of the genome
Introns
Non-coding regions of the genome; they constitute a larger fraction of the genome than exons
Genes
RNA coding and regulatory sequences, introns and exons (still the smaller fraction of the genome in total)
The genome unravelled
DNA organised into chromosomes.
Starting at the sequence level (DNA, RNA), there is a whole level of structure on top: DNA is wrapped around histone proteins to form nucleosomes, which make up chromatin; chromatin is then bound and coiled again to form the chromosomes. A chromosome is a supercoiled DNA molecule.
The power of 7 billion people
- Given known mutation rates, it’s almost certain that every possible single base change compatible with life exists in a living human
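A back-of-the-envelope calculation shows why this is plausible. The sketch below is only an illustration: the genome size and mutation rate are assumed round figures that do not come from these cards, and it counts only new (de novo) mutations in the current population, ignoring inherited variants, lethal changes and the fact that mutation rates differ between sites.

```python
# Assumed round figures (NOT from the flashcards).
GENOME_SIZE_BP = 3.2e9   # assumed haploid human genome size in base pairs
MUTATION_RATE = 1.2e-8   # assumed de novo mutations per base per generation
POPULATION = 7e9         # from the card title

# Each person carries two genome copies, each with its own new mutations.
new_mutations_per_person = 2 * GENOME_SIZE_BP * MUTATION_RATE
total_new_mutations = new_mutations_per_person * POPULATION

# Every position can change to one of the three other bases.
possible_single_base_changes = 3 * GENOME_SIZE_BP

# Average number of living people expected to carry any given change.
average_occurrences = total_new_mutations / possible_single_base_changes

print(f"New mutations per person: about {new_mutations_per_person:.0f}")
print(f"Average occurrences of each possible change: about {average_occurrences:.0f}")
```

Under these assumptions there are far more new mutations in the population than there are possible single base changes, which is the intuition behind the card.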
2 papers looking at variation
Exome Aggregation Consortium (ExAC) and Genome Aggregation Database (gnomAD)
Exome Aggregation Consortium (ExAC)
- 60,706 exomes
- Began in 2012, first release Oct 2014
- Preprint in Oct 2015
- Published in August 2016
Genome Aggregation Database (gnomAD)
- 125,748 exomes and 15,708 genomes
- Began in 2016, first release Oct 2016
- Latest data release 2020
- 4 papers in Nature in May 2020, which documented the level of variation at the single base level
Loss of function variants =
Variants that introduce a stop codon (stop gained), affect an essential splice site or cause a frameshift in the coding sequence. Each of these causes loss of function of the protein. These variants are very deleterious but also common.
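Two of the variant classes on this card can be recognised from the coding sequence alone, as the toy Python sketch below shows. It assumes a spliced coding sequence, 0-based positions and the standard stop codons. The function name `classify_coding_variant` and its labels are hypothetical, and essential splice-site variants are not handled because they require intron coordinates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_coding_variant(cds: str, pos: int, ref: str, alt: str) -> str:
    """Label a variant in a spliced coding sequence (0-based position).

    Hypothetical toy classifier: it recognises only frameshifts,
    in-frame indels and stop-gained single base substitutions.
    """
    cds, ref, alt = cds.upper(), ref.upper(), alt.upper()
    if cds[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")

    # Indels whose net length is not a multiple of 3 shift the reading frame.
    if len(ref) != len(alt):
        return "frameshift" if (len(alt) - len(ref)) % 3 != 0 else "in-frame indel"

    # Single base substitution: rebuild the affected codon and compare.
    if len(ref) == 1:
        offset = pos % 3
        start = pos - offset
        old_codon = cds[start:start + 3]
        new_codon = old_codon[:offset] + alt + old_codon[offset + 1:]
        if old_codon not in STOP_CODONS and new_codon in STOP_CODONS:
            return "stop gained"

    return "other"
```

With the sequence `ATGCAAGGCTAA`, changing the C at position 3 to T turns the codon CAA into TAA, so `classify_coding_variant("ATGCAAGGCTAA", 3, "C", "T")` returns `"stop gained"`, while deleting a single base returns `"frameshift"`.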
Comparative Genomics
Different organisms have different numbers of chromosomes. There is no correlation between the number of chromosomes and the complexity of an organism.
Transcriptome
Areas of the genome that are transcribed into RNA