The Covid Genome Flashcards
Genomics of COVID-19. Exeter is contributing to the national effort to sequence the genomes of as many isolates of the virus as possible. What will this help with?
Developing vaccines, informing the development of diagnostic test kits, and giving insight into the virus
Coronavirus officially known as
SARS-CoV-2
What does SARS-CoV-2 cause?
COVID-19 disease
Genome composition of SARS-CoV-2
Single-stranded RNA (ssRNA) genome in the positive-sense orientation
Genome consists of
ssRNA (under 30 kb in length)
Genome consists of several
open reading frames (ORFs) that are translated into polypeptide products
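A minimal sketch of what "open reading frames translated into polypeptides" means in practice. It assumes a simple ATG-to-stop definition of an ORF and the standard genetic code, and it ignores features of real coronavirus genomes such as ribosomal frameshifting. The function names and the toy sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate a coding sequence codon by codon ('X' for unknown codons)."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(genome, min_codons=2):
    """Return (start, end, protein) for each ATG...stop ORF on the given strand.

    Only the three forward frames are scanned, because a positive-sense
    genome is read directly in this orientation.
    """
    seq = genome.upper().replace("U", "T")  # accept RNA or cDNA letters
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, translate(seq[start:i])))
                start = None
    return orfs


# Made-up toy sequence, not real SARS-CoV-2 data.
print(find_orfs("CCAUGGCUAAAUGACC"))  # [(2, 14, 'MAK')]
```

Coordinates are 0-based with an exclusive end, and the end includes the stop codon.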
What became available at the end of 2019?
First genome sequences of isolates of the virus from China
What was the researchers' next step?
Researchers compared the whole genome sequences with whole genome sequences of other viruses and used the comparison to build a phylogenetic tree
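A rough sketch of how a tree can be built from a genome comparison. It assumes the sequences are already aligned to the same length and uses UPGMA (average-linkage clustering on the proportion of differing sites) as a simple stand-in; published analyses typically use more sophisticated methods such as maximum likelihood. The sequence names and toy sequences are invented.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Return a tree as nested tuples by repeatedly joining the closest clusters."""
    # Each cluster is (subtree, list of leaf names in that subtree).
    clusters = [(name, [name]) for name in seqs]
    dist = {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}

    def average_distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


# Invented toy alignment, for illustration only.
toy = {
    "human_1": "ACGTACGTAC",
    "human_2": "ACGTACGTAT",
    "bat": "ACGAACGTTT",
    "mers": "TCGAAGGCTT",
}
print(upgma(toy))  # ('mers', ('bat', ('human_1', 'human_2')))
```

The nested tuples read from the outside in: the two human isolates group together first, then with the bat sequence, with the most distant sequence joining last.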
Coronavirus probably circulating in
bats as the main natural reservoir, but it probably didn’t come directly from bats
Background to the development of coronavirus: why is it likely that coronavirus didn’t come directly from bats?
In December, bats are hibernating. There were no bats in the Huanan seafood market (but there were many other mammal species that could have become an intermediate host). Under 90% sequence identity to the closest bat virus
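A minimal sketch of how a figure like "under 90% identity" can be computed from two aligned sequences. It assumes the alignment has already been made and that columns containing a gap are skipped, which is only one of several conventions. The toy sequences are invented.

```python
def percent_identity(a, b, gap="-"):
    """Percentage of compared alignment columns where both sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Invented toy alignment: 8 matches out of 9 compared columns.
print(round(percent_identity("ACGTACGTAC", "ACGAACG-AC"), 1))  # 88.9
```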
SARS and MERS
emerged via intermediate hosts
SARS intermediate host
association with palm civets (virus transferred from bats to palm civets and then to humans)
MERS intermediate host
via camels to humans
Current method of COVID-19 testing
Take a swab and perform PCR (RT-PCR, since the genome is RNA)
Comparing the first few genome sequences from the current pandemic (COVID-19) against genome sequences from previous outbreaks (SARS and MERS) and other viruses from bats shows that some parts of the viral genome are
highly conserved across most, maybe all, coronaviruses (whereas other parts are more variable)
It’s essential to design
PCR primers that target the more conserved (less variable) regions of the genome
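One way to picture "target the more conserved regions" is to scan an alignment for stretches where every sequence carries the same base. The sketch below assumes an upper-case, gap-aware alignment of equal-length strings and treats a column as conserved only if all sequences share one unambiguous base; the function name and toy alignment are hypothetical.

```python
def conserved_windows(aligned, width):
    """Return 0-based start positions of windows where all sequences agree."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # A column is conserved if it holds a single base, and that base is A/C/G/T.
    conserved = [len(set(col)) == 1 and col[0] in "ACGT"
                 for col in zip(*aligned)]
    starts = []
    run = 0  # length of the current run of conserved columns
    for i, ok in enumerate(conserved):
        run = run + 1 if ok else 0
        if run >= width:
            starts.append(i - width + 1)
    return starts


# Invented toy alignment: columns 4 and 8 vary, everything else is conserved.
toy = ["ACGTACGTAA",
       "ACGTACGTCA",
       "ACGTTCGTAA"]
print(conserved_windows(toy, width=3))  # [0, 1, 5]
```

For real primer design the window width would be roughly primer length, and other criteria (melting temperature, specificity) would matter as well.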
An example of a highly conserved region that could be targeted by a PCR-based detection assay
Region 1b (in this diagram) seems quite highly conserved among the different viruses within this group (as do places at the other end of the genome, which also show relatively high levels of conservation)
Example of a study which did this
"Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR": identified regions of the genome that seemed highly conserved and designed PCR primers against them. Some of these primers are now recommended by Public Health England
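A small sketch of how a primer pair relates to the genome: the forward primer matches the genome as written, while the reverse primer matches the reverse complement of a site further downstream. The sequences are treated as DNA (as after reverse transcription), only exact matches are considered, and the genome and primers below are invented, not the published assay primers.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_amplicon(genome, forward, reverse):
    """Return (start, end) of the amplified region, or None if a primer fails.

    Assumption: both primers must match exactly; real PCR tolerates a few
    mismatches, which this sketch does not model.
    """
    start = genome.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)  # where the reverse primer binds
    hit = genome.find(site, start + len(forward))
    if hit == -1:
        return None
    return start, hit + len(site)


# Invented toy genome and primers.
genome = "TTTACGGATCCAAATTTGGGCCCATATTT"
region = find_amplicon(genome, forward="ACGGATCC", reverse="ATGGGCCC")
print(region, len(genome[region[0]:region[1]]))  # (3, 25) 22
```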
By 05/04/2020
up to 3,000 genome sequences available for the virus.
UCSC genome browser
has a genome browser for the SARS-CoV-2 viral genome
UCSC genome browser. The SARS-CoV-2 reference genome in this case was
the virus originally isolated from one of the first patients in China
UCSC genome browser. What can you see?
You can see the positions of the annotated genes (protein-coding genes). You can also see the target sites of the PCR primers used to test for and detect the virus
UCSC genome browser. Where are the PCR primers targeting?
Many PCR primers target the highly conserved 1b region of the genome. Others target other parts of the genome (also highly conserved)
UCSC genome browser. Towards the bottom of the screen you can see the variation between different isolates of the virus
Each of these rows is the genome sequence of one isolate of the virus. Where the genome of that isolate differs from the reference there is a dot. Many dots are scattered around; where a large number of isolates have a variant in the same region of the genome, the dots form long lines down the screen. As the virus passes through hosts and replicates, it accumulates mutations
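The "dots" can be thought of as the positions where an isolate differs from the reference, and the "long lines down" as positions shared by many isolates. A minimal sketch, assuming each isolate is already aligned to the reference with no insertions or deletions and that ambiguous bases (such as N) are ignored; isolate names and sequences are invented.

```python
from collections import Counter


def variants(reference, isolate):
    """List (1-based position, reference base, isolate base) for each difference."""
    return [(i + 1, ref, alt)
            for i, (ref, alt) in enumerate(zip(reference, isolate))
            if ref != alt and alt in "ACGT"]  # assumption: skip N and gaps


def variant_counts(reference, isolates):
    """Count how many isolates carry a variant at each position."""
    counts = Counter()
    for seq in isolates.values():
        counts.update(pos for pos, _, _ in variants(reference, seq))
    return counts


# Invented toy data.
reference = "ACGTACGTAC"
isolates = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACTTACGTAC",
    "iso3": "ACTTACGTAT",
}
print(variants(reference, isolates["iso3"]))  # [(3, 'G', 'T'), (10, 'C', 'T')]
print(variant_counts(reference, isolates))    # position 3 is shared by 2 isolates
```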
What do you not want to see in the highly conserved regions of the genome?
You do not want to see many differences/variants (dots) in the highly conserved regions of the genome targeted by our PCR primers: with too many variants, the primers could fail to bind the viral genome
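A hedged sketch of how one might check whether variants fall inside a primer target site. It assumes isolates are aligned to the reference without insertions or deletions and looks only at a forward primer that matches the reference exactly; the names and sequences are hypothetical.

```python
def primer_site_mismatches(reference, isolates, primer):
    """Return {isolate name: number of mismatches within the primer site}."""
    site = reference.find(primer)
    if site == -1:
        raise ValueError("primer does not match the reference")
    result = {}
    for name, seq in isolates.items():
        target = seq[site:site + len(primer)]  # same coordinates as reference
        result[name] = sum(p != t for p, t in zip(primer, target))
    return result


# Invented toy data: isolate "c" has two variants inside the primer site.
reference = "TTACGGATCCTT"
isolates = {
    "a": "TTACGGATCCTT",
    "b": "TTACGAATCCTT",
    "c": "TTTCGAATCCTT",
}
print(primer_site_mismatches(reference, isolates, "ACGGATCC"))
# {'a': 0, 'b': 1, 'c': 2}
```

Isolates with several mismatches in the site are the ones a primer would be most likely to miss.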
Genomic epidemiology of novel coronavirus: Nextstrain resource
Good for exploring genomic data from the current pandemic. Data gathered by the GISAID consortium. Over 3,000 genome sequences of SARS-CoV-2 in the data set
Historically, what did GISAID focus on?
Strains of influenza, but it is now also collecting genomic data on the current pandemic
Nextstrain resource has the
latest data and analysis
Nextstrain resource shows diversity
Genetic diversity (sequence diversity) along the 30 kb of the genome. The height of the spikes shows how much genetic diversity exists at each position: most positions have next to none (most of the genome is completely conserved), but there are some hotspots along the genome with a considerable amount of sequence variation, caused by mutations arising as the virus replicates during the pandemic
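Per-position diversity of this kind is often summarised as Shannon entropy of the bases in each alignment column: zero where every isolate agrees, higher at hotspots. The sketch below assumes that measure and an already aligned set of sequences; whether the resource computes its spikes in exactly this way is an assumption, and the toy alignment is invented.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the bases in one alignment column."""
    total = len(column)
    return sum(-(n / total) * log2(n / total)
               for n in Counter(column).values())


def site_entropies(aligned):
    """Entropy at every position of an alignment of equal-length sequences."""
    return [column_entropy(col) for col in zip(*aligned)]


# Invented toy alignment: positions 1-2 conserved, positions 3-4 variable.
toy = ["ACGT",
       "ACGA",
       "ACTA",
       "ACTA"]
print([round(h, 3) for h in site_entropies(toy)])  # [0.0, 0.0, 1.0, 0.811]
```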
Nextstrain resource shows phylogenetic tree
Phylogenetic tree of the 3,000 viral genome sequences, colour coded according to geographical origin
Nextstrain resource shows geographical map of transmission
Press play to see a reconstruction of the outbreak according to the timeline of transmission
Nextstrain resource shows filter by country
533 genome sequences from the UK. Able to tell you when and where each was first sequenced
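Filtering by country amounts to selecting records from the sequence metadata and, for example, finding the earliest one. A minimal sketch, assuming each record is a dictionary with hypothetical `strain`, `country` and `collected` fields; the records below are invented and are not real database entries.

```python
from datetime import date

# Hypothetical records: field names and values are illustrative only.
RECORDS = [
    {"strain": "sample_A", "country": "UK", "collected": date(2020, 3, 10)},
    {"strain": "sample_B", "country": "Japan", "collected": date(2020, 2, 15)},
    {"strain": "sample_C", "country": "UK", "collected": date(2020, 2, 28)},
]


def filter_by_country(records, country):
    """Keep only the records from the given country."""
    return [r for r in records if r["country"] == country]


def earliest(records):
    """Return the record with the earliest collection date, or None if empty."""
    return min(records, key=lambda r: r["collected"]) if records else None


uk = filter_by_country(RECORDS, "UK")
print(len(uk), earliest(uk)["strain"])  # 2 sample_C
```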
Nextstrain resource can explore patterns further
Go back to the main page and look at the situation reports to see analysis. For example: another 16 genome sequences have recently been deposited from Japan, and 10 of the 16 come from patients who were on a cruise. These isolates more closely resemble isolates from Europe and North America than those from Japan, which indicates that the Japanese patients were infected on the cruise by other patients from Europe or North America rather than acquiring the infection in Japan. The 2 genome sequences from Wales don’t cluster together, which suggests the virus has been introduced into Wales more than once
Exeter’s efforts
The university is using Oxford Nanopore MinION sequencers to sequence isolates of the virus acquired from the RD&E hospital. They obtained the first 4 genomes from Exeter (3rd of April 2020) and placed the Exeter isolates in the phylogeny (black circles). 2 of the isolates were genetically identical to each other, so there are only 3 black dots
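Why 4 genomes give only 3 dots: identical sequences land on the same point of the tree. A minimal sketch that groups isolates by exact sequence; the isolate names and sequences are invented placeholders, not the real Exeter data.

```python
from collections import defaultdict


def group_identical(isolates):
    """Group isolate names whose genome sequences are exactly identical."""
    groups = defaultdict(list)
    for name, seq in isolates.items():
        groups[seq.upper()].append(name)
    return sorted(groups.values())


# Invented placeholder isolates: the first and third are identical.
isolates = {
    "isolate_1": "ACGTAC",
    "isolate_2": "ACGTAT",
    "isolate_3": "ACGTAC",
    "isolate_4": "TCGTAC",
}
groups = group_identical(isolates)
print(len(groups), groups)
# 3 [['isolate_1', 'isolate_3'], ['isolate_2'], ['isolate_4']]
```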
Exeter’s efforts continued
Using Harvest software to look at a phylogenetic tree of several hundred UK isolates of the virus. Lines indicate mutations/variants compared to the reference genome. Zooming in shows single-nucleotide resolution: at particular regions you can see the mutations at the nucleotide level