Genomics Flashcards
what is a genome
the complete set of genetic information, including all of the genes and non-coding DNA, within an organism's cells
how do we make up for missing genes
homology-based gene prediction (using similar organisms)
de novo gene prediction (computational tools and analysis)
synthetic biology (designing and building new genes)
gene editing
where do new genes come from
lateral gene transfer, transposable elements, plasmids, viruses
what is a reference genome
high-quality genome sequence that serves as a representative example of a particular species or group
how do we differ from each other
small polymorphisms
structural variation
how is the genome split into manageable chunks
digestion with a rare-cutting restriction enzyme
what is a CONTIG
a contiguous sequence assembled from many overlapping sequences that were produced by breaking the DNA up at random
what is a scaffold
a series of contigs that additional information lets us place in the right order and orientation, although the sequence between the contigs is still incomplete
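A minimal sketch of how a scaffold can be written down as one sequence, assuming the common convention of filling unknown sequence between contigs with runs of N. The function names, the (sequence, strand, gap) tuple layout and the example contigs are hypothetical and only for illustration.

```python
# Complement table for DNA; N (unknown base) maps to itself.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def build_scaffold(placements):
    """Join contigs in the given order and orientation.

    placements is a list of (contig_sequence, strand, gap_after) tuples,
    where strand is "+" or "-" and gap_after is the estimated number of
    unknown bases before the next contig (0 for the last contig).
    """
    parts = []
    for seq, strand, gap_after in placements:
        # Flip contigs that were assembled on the opposite strand.
        parts.append(seq if strand == "+" else reverse_complement(seq))
        # Unknown sequence between contigs is written as N.
        parts.append("N" * gap_after)
    return "".join(parts)

# Two made-up contigs, the second placed in reverse orientation.
print(build_scaffold([("ATGC", "+", 3), ("GGTT", "-", 0)]))
```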
what is an assembly
the set of scaffolds for one genome
what is the N50
the largest contig/scaffold length L such that contigs/scaffolds of length L or longer together contain at least 50% of the assembled bases
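A short sketch of how the N50 can be computed from a list of contig or scaffold lengths: sort from longest to shortest, add lengths up, and stop at the first length where the running total reaches half of the assembly. The function name and the example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first length at which half the assembly is covered is the N50.
        if 2 * running >= total:
            return length

# Hypothetical contig lengths: total 290, so half is 145.
# 80 + 70 = 150 reaches the halfway point, giving an N50 of 70.
print(n50([80, 70, 50, 40, 30, 20]))
```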
why is resequencing a genome important
identify differences between strains, organisms and individuals
assembly against a reference is much easier than de novo assembly
explain the method of shotgun sequencing
the DNA is fragmented into many small pieces
adapter oligos are ligated to the ends of the fragments, which are then amplified
fragments are then sequenced producing short reads
short reads are assembled into longer contigs using overlapping regions
contigs are joined into larger scaffolds using additional information
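The fragment-and-assemble idea can be sketched as a toy example. This is only an illustration under strong assumptions: reads are error-free, all come from the same strand, and the genome has no repeats. Adapters, amplification and scaffolding are not modelled, and the greedy overlap merging used here is a simplified stand-in for what real assemblers do (they typically rely on overlap or de Bruijn graphs). All names and the example sequence are hypothetical.

```python
import random

def fragment(genome, read_len, n_reads, seed=0):
    """Sample n_reads substrings of length read_len from random positions."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]

def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b, or 0
    if no such overlap of at least min_len exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    # Drop reads that are fully contained in another read.
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)

# Hand-picked overlapping reads from a made-up 14-base sequence.
reads = ["ATGCGTA", "CGTACCTG", "CCTGATT"]
print(greedy_assemble(reads))
```

With random reads from fragment(), coverage may be incomplete, in which case greedy_assemble returns several contigs rather than one.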
what are the advantages and disadvantages of each type of sequencing
shotgun:
- adv: accurate, no prior knowledge of the genome is needed
- dis: expensive for large genomes, slow
Illumina:
- adv: cheap, accurate
- dis: limited read length