Lec 6 Flashcards
Define genome
The complete set of all the genetic information of an organism
e.g. in humans:
- chromosomes 1-22
- mitochondrial DNA
- sex chromosomes (XX or XY)
Define transcriptome
All the RNA molecules (including the mRNAs) that are expressed from the genes of an organism
What main methods do we use to study genomes?
Sequencing
Microarrays
Visualisations
What are some sequencing methods that can be used?
- Whole genome sequencing (WGS)
- Exome sequencing
- RNA-seq
- ATAC-seq
- Targeted sequencing
What are some types of microarrays?
- SNP chips
- Expression arrays
What are some visualisation methods?
- FISH (fluorescence in situ hybridisation)
- northern and southern blotting
- qPCR
FISH stands for
fluorescence in situ hybridization
Out of the main methods of sequencing, microarrays and visualisation, which is the main method we use to study genomes now?
Sequencing
What do microarrays give us?
A microarray can be a SNP chip or an expression array.
A SNP chip tells you whether an individual has a particular SNP or deletion.
A microarray may have hundreds or thousands of probes, each asking a very specific question,
e.g. does your genome have this specific change?
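The "one probe, one specific question" idea can be sketched in a few lines. This is only a toy illustration: real arrays work by hybridisation and fluorescence, not string matching, and the probe names and sequences below are invented for the example.

```python
def genotype(sample_dna, probes):
    """Return, for each probe, whether its exact sequence occurs in the sample."""
    return {name: seq in sample_dna for name, seq in probes.items()}

# Hypothetical probes: one for the reference allele, one for the variant allele.
probes = {
    "snp1_ref": "GATTACA",
    "snp1_alt": "GATCACA",
}

sample = "CCGGATCACATT"  # made-up sample that carries the variant allele
result = genotype(sample, probes)
print(result)
```

Each probe answers yes or no for one known change, which is why a chip can only find variants it was designed to look for.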
What do expression arrays tell us?
An expression array tells us how much of a gene (e.g. DGAT) is expressed in different cell types or different organisms
If there is an unknown species. How do we describe it genetically and compare it to other species?
2 main questions:
- How do we describe it genetically?
- How would we compare it to other species?
These break down into:
- How is its genome structured?
  - How many chromosomes does it have?
  - Do the chromosomes have any strange or unusual structures?
- What genes does it have?
  - Do any of these genes do interesting or unusual things?
- Are the genes similar to those of other organisms?
What are the main challenges of genetically describing an unknown species?
- What size is the genome?
- How many chromosomes are there?
- What is the genome ploidy?
- How big are the repeats, and how many are there?
- How easy is it to extract the DNA?
Unknown species. Why is the genome size important?
Is the genome a few billion base pairs, or tens of billions of base pairs?
This makes a huge difference to how much data needs to be generated in order to sequence the genome
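As a rough worked example, the amount of data scales directly with genome size. The sketch below assumes a target of 30x coverage and 2 x 150 bp read pairs. The 30x figure and the function name are illustrative assumptions, not values from the lecture.

```python
def read_pairs_needed(genome_size_bp, coverage=30, read_length_bp=150):
    """Estimate read pairs needed, using
    coverage = (pairs * 2 * read length) / genome size."""
    bases_needed = genome_size_bp * coverage
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division, so a partial pair counts as a whole one.
    return -(-bases_needed // bases_per_pair)

small = read_pairs_needed(3_000_000_000)    # genome of a few billion bp
large = read_pairs_needed(30_000_000_000)   # genome of tens of billions of bp
print(small, large)
```

A genome ten times larger needs ten times as many reads for the same coverage.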
Unknown species. Why is the number of chromosomes important?
Is there 1, or are there 99?
99 chromosomes will be harder to sequence than 1
Unknown species. Why is the genome ploidy important?
Haploid species, e.g. microbes = very easy to work with, as there is only 1 copy of the DNA
Diploid, e.g. humans = a bit harder to work with, as we have 2 copies of the DNA which are similar
Tetraploid and higher (polyploid) = harder still, as there are 4, 6 or more copies of the DNA that are basically identical except for a few changes
Unknown species. Why are the number and size of repeats important?
Most species have a lot of repetitive sequence in their genomes
e.g. in humans, over 40% of the genome is repetitive
These repetitive elements can be simple repeats of DNA, or other things such as endogenous retroviruses (viruses that copied themselves into host genomes millions of years ago, then replicated and expanded to take up more of the genome, modifying it as they went)
How big are the repeats: 10 bp or 1000s of bp? This influences how the genome is analysed
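A toy sketch of why repeat size matters: a read that lies entirely inside a repeat matches the genome in more than one place, while a read long enough to span the repeat plus its unique flanks matches only once. The sequences and the function name are made up for illustration.

```python
def count_placements(genome, read):
    """Count the positions where the read matches the genome exactly."""
    return sum(
        1
        for i in range(len(genome) - len(read) + 1)
        if genome[i:i + len(read)] == read
    )

repeat = "ACGTACGTAC"  # a 10 bp repeat that occurs twice in the toy genome
genome = "TTGCA" + repeat + "GGATC" + repeat + "CCTAG"

short_read = repeat[2:8]              # 6 bp, lies entirely inside the repeat
long_read = "GCA" + repeat + "GGA"    # spans the repeat plus unique flanks

print(count_placements(genome, short_read))  # ambiguous placement
print(count_placements(genome, long_read))   # unique placement
```

This is the reason long read methods help with repeat-rich genomes: the read has to be longer than the repeat to be placed unambiguously.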
Unknown species. Why is it important to know how easily the DNA can be extracted?
Not all species have all of their DNA in all of their cells
e.g. mature human red blood cells have no nucleus (and no mitochondria), so they contain no DNA to extract
Some species of fish have their full genome in their germ cells, but their somatic cells contain only a subset of it
WGS stands for
Whole genome sequencing
The entire genome of the organism is sequenced, not just the genes
What methods are used for WGS?
Short read methods - Illumina, Ion Torrent
Long read methods - PacBio and Oxford Nanopore
Historic methods - Sanger sequencing
The choice depends on what is being sequenced: RNA or DNA
WGS is generally thought of as a ___ sequencing method because?
WGS is generally thought of as a DNA sequencing method, as even RNA is usually converted to cDNA before it is sequenced.
Describe Illumina sequencing
Short read method of WGS
SBS (sequencing by synthesis) = highly accurate short read technology
- short reads, 2 x 150 bp
- generates huge amounts of data (up to 3000 Gb)
- cheapest sequence (90gb
Human genome in its haploid state is ___
The human genome in its haploid state is 3 gigabases (3 Gb)
Each chip of SBS sequencing technology can hold ____ human genomes
Each chip of SBS sequencing technology can hold 24 human genomes (clinical grade)
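A quick sanity check of these numbers, assuming "gb" in the notes means gigabases and that the output is shared evenly between the 24 genomes (both are assumptions, not stated in the lecture):

```python
flow_cell_output_gb = 3000   # maximum output quoted in the notes
genomes_per_flow_cell = 24   # clinical-grade human genomes per chip
haploid_genome_gb = 3        # human haploid genome size

output_per_genome_gb = flow_cell_output_gb / genomes_per_flow_cell
coverage = output_per_genome_gb / haploid_genome_gb

print(f"{output_per_genome_gb:.0f} Gb per genome, about {coverage:.0f}x coverage")
```

Under those assumptions, each genome gets roughly 125 Gb of sequence, so every position in the haploid genome is read about 40 times over.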
Describe how Illumina SBS sequencing works
- The DNA is taken and fragmented
- Adapters are stuck onto the fragments, which are then flowed across the flow cell
- Each fragment binds, via its adapter, to a point on the flow cell. The DNA is loaded at such a concentration that the fragments end up widely separated over the flow cell
- A second copy of the DNA is made. The strand is forced to bend over and attach at its other end, and the double strand is then split into 2 single strands
= 2 copies of the same piece of DNA quite close to each other
- This is then repeated over many rounds = this gives a cluster
e.g. a thousand copies of 1 piece of DNA, all located in one very tight cluster on the slide
This is the prep step
- Get rid of one strand of the dsDNA so that the cluster is single stranded, then add all 4 bases at once. Each base has a big fluorescent marker attached that stops any other base attaching after it
- The enzyme (polymerase) attaches only the base that fits (is complementary to) the DNA
- Since the big marker is blocking, the next base cannot be incorporated
- A laser is fired and a photo is taken with a microscope. Since there are a thousand copies in the cluster, a thousand molecules contribute to the same signal, because all the As = 1 colour, all the Ts = another colour, and so on
- This gives a bright spot of colour that corresponds to the base just added
- so we know that the base at that position on the slide is e.g. an A
easy to
- Add an enzyme or chemical that chops off the fluorophore, then add the next set of bases
- Repeat the process 150 times, once for each base in the read
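The cycle described above (add one blocked, labelled base, image the cluster, cleave, repeat) can be sketched as a small simulation. This is a simplified toy model, not Illumina's actual chemistry or base-calling software: the colour assignments, the error rate and the function name are all assumptions made for illustration, and the read is reported position by position without modelling strand direction.

```python
import random
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Arbitrary colour per base, chosen only for this sketch.
COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_COLOUR = {colour: base for base, colour in COLOUR.items()}

def sequence_cluster(template, copies=1000, cycles=150, error_rate=0.01, seed=0):
    """Call one base per cycle from the brightest colour in a cluster."""
    rng = random.Random(seed)
    calls = []
    for cycle in range(min(cycles, len(template))):
        correct = COMPLEMENT[template[cycle]]
        signals = Counter()
        for _ in range(copies):
            # Most copies add the complementary base; a few add a random one.
            base = correct if rng.random() > error_rate else rng.choice("ACGT")
            signals[COLOUR[base]] += 1
        # The "photo": the dominant colour of the cluster gives the base call.
        brightest = signals.most_common(1)[0][0]
        calls.append(BASE_FOR_COLOUR[brightest])
    return "".join(calls)

read = sequence_cluster("ACGTTGCA")
print(read)
```

Because a thousand copies contribute to each spot, a few wrong incorporations do not change the dominant colour, which is why clustering makes the signal both bright and reliable.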