Lecture #5 - CRISPR Part 1 Flashcards
Editing Technology Before CRISPR
Before CRISPR - had TALENs
TALENs = Fuse a protein that binds DNA and a protein that cuts DNA
- Sequence specificity is hardwired into the proteins (protein only binds to that sequence for the lifetime of that protein)
- Makes sequence specific cuts
Bacteriophages
Bacteriophages = major threat to bacteria
- Injects genetic material into the bacterial cell
When the bacteriophage injects genetic material into the bacterium, the viral genetic material is preferentially transcribed and translated
Original Purpose of CRISPR
CRISPR/Cas9 = evolved as a bacterial adaptive immune system to target specific sequences of invading bacteriophage DNA and destroy foreign DNA (Ex. phages or plasmids)
- Adaptive immune (has memory) - Adaptive because CRISPR remembers previous encounters with phages/plasmids
How does CRISPR work in bacterial immunity
Upon infection by a novel pathogen CRISPR/Cas identifies the viral threat and integrates part of the viral DNA into the bacterial genome at the CRISPR locus –> the sequence in the CRISPR locus is transcribed and paired with a Cas nuclease –> programs the Cas to specifically cut DNA that matches that viral sequence –> THEN upon reinfection by the same virus strain –> CRISPR/Cas9 can identify and neutralize the viral threat
- Programming of Cas = through base complementarity using the crRNA
What happens during infection by bacteriophage
During infection by bacteriophage, fragments of viral DNA can be acquired into a genomic CRISPR array –> allows for “genetic memory of infection”
Memory in CRISPR = Spacers
CRISPR (Overall)
CRISPR – Clustered Regularly Interspaced Short Palindromic Repeats
Overall - Processed crRNA from the CRISPR locus complexes with a Cas nuclease to cleave viral DNA in a sequence-specific manner –> prevents future infections
- crRNA – CRISPR RNA
CRISPR/Cas9 System overall (key points)
- 20 nt targeting (spacer) RNA is fused with a 76 nt RNA scaffold
- NGG PAM sequence
- PAM = protospacer adjacent motif (ex. Cas9 looks for NGG)
- PAM is needed because otherwise Cas9 would cut the bacterial CRISPR array (the memory) in the genome itself
- PAM is NOT in the bacterial CRISPR memory but IS in the phage that you are targeting
- Creates a dsDNA break 3 BP upstream from the PAM
- Breaks repaired by error prone NHEJ or HDR
Image - blue is DNA target ; green is gRNA
Where does gRNA come from in CRISPR
gRNA = comes from the Spacer DNA in CRISPR –> spacer is transcribed to make the gRNA –> Cas9 looks for DNA that matches the guide
gRNA ALSO comes from tracrRNA
- tracrRNA = acts as a scaffold
Bacteria need separate tracrRNA and CRISPR crRNA VS. In lab = fuse the crRNA and the tracrRNA
What does Cas9 look for
Cas9 = looks for DNA that matches the spacer AND looks for PAM sequence
- Cas9 matches the spacer and DNA BUT next to that match there needs to be a PAM (NGG for Cas9)
Cas9 Nuclease domains
Cas9 = has two nuclease domains
RuvC and HNH = 2 nuclease domains on Cas9 –> each cleaves 1 strand of the target = get dsDNA break
Cas9 = continues to cut until a mutation is introduced
Cells repairing the dsDNA break
Repair = where editing actually starts
The cell can repair the break to the original sequence –> in this case there is NO editing
Cells can repair the dsDNA break using:
1. NHEJ = sticks dsDNA back together
- Often results in INDELS at the cut site –> NOW have edited DNA because repaired the DNA wrong
- Once have INDELS = Cas9 can’t cut the sequence again because it no longer matches the guide (prevents Cas9 from cutting DNA again)
2. HDR –> Have a dsDNA break in 1 chromatid –> use the sister chromatid (or the homologous chromosome) as a template to correct the broken sequence
- IF you overwhelm the cells with a template that has the edit that you want THEN when you cut the DNA it will repair the DNA using the donor DNA instead of the sister chromatid = can insert the sequence that you wanted to add to the cell
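Sketch of the donor-template idea (a toy string model, not a real repair simulation). The function name hdr_repair and the "homology arm" arguments are assumptions made for illustration: the donor is treated as left arm + desired insert + right arm, and the arms are matched against the genome on either side of the break.

```python
def hdr_repair(genome: str, left_arm: str, insert: str, right_arm: str) -> str:
    """Toy HDR: place `insert` between two homology arms found in the genome.

    Hypothetical helper for illustration only. If either arm is not found,
    the genome is returned unchanged (no templated repair is possible).
    """
    left = genome.find(left_arm)
    if left == -1:
        return genome
    left_end = left + len(left_arm)
    right = genome.find(right_arm, left_end)
    if right == -1:
        return genome
    # Everything between the arms is replaced by the donor's insert.
    return genome[:left_end] + insert + genome[right:]


genome = "AAACCCGGGTTT"
edited = hdr_repair(genome, left_arm="AAACCC", insert="TAG", right_arm="GGGTTT")
print(edited)
```

Here the break is assumed to sit between the two arms, so the donor's extra bases end up exactly at the cut site.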
Cas9 binding to the DNA
R loop = DNA that matches the memory is unwound when Cas9 binds to DNA
- 1 strand is bound to the guide and one strand is free ssDNA
What do cas genes code for
Upstream of the array = cas proteins themselves
- Cas proteins = involved in new sequence (spacer) integration + CRISPR RNA processing + Interference
CRISPR locus
CRISPR locus = array with unique spacers targeting discrete viral sequences
- CRISPR locus = has a library of guides that target many unique viral sequences
Locus alternates a constant repeat sequence and virus-specific spacer sequences
- On each side of the spacer = has short repeats (Spacer (virus specific) –> Repeat –> Spacer –> Repeat etc.)
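The alternating Repeat –> Spacer –> Repeat layout can be sketched with plain strings. The repeat and spacer sequences below are short made-up placeholders (real repeats and spacers are longer), and build_array / extract_spacers are hypothetical helper names.

```python
def build_array(repeat: str, spacers: list[str]) -> str:
    """Lay out a toy CRISPR array: repeat, spacer, repeat, spacer, ..., repeat."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def extract_spacers(array: str, repeat: str) -> list[str]:
    """Recover the spacers by splitting the array on the constant repeat."""
    return [piece for piece in array.split(repeat) if piece]


REPEAT = "GTTTTAGAGC"                      # toy repeat (assumption)
SPACERS = ["ACGTACGTAC", "TTGACCATGA"]     # toy virus-derived spacers (assumption)

array = build_array(REPEAT, SPACERS)
print(array)
print(extract_spacers(array, REPEAT))
```

Because the repeat is constant, it works as a delimiter: each piece between two repeats is one stored memory.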
Spacer
Spacer = short DNA sequence from the phage (20-23 BP of DNA stolen from the phage that is integrated into the CRISPR array and becomes memory)
- IF have a new infection by a phage that matches the spacer (matches the 30 BP of memory) –> THEN the phage will be destroyed by the CRISPR system
- Bacteria can make new memories
- Spacer = gives the specificity for each virus/memory
- makes CRISPR adaptive
Purpose - acts as memory because the crRNA is loaded onto cas9 and cuts anything that matches
Arrays can contain tens to hundreds of spacers
crRNA
Pre-crRNAs are transcribed from an upstream leader sequence then processed into single mature crRNAs
- Upstream leader sequence = upstream of the spacer and repeat sequences (red in the image)
- Pre-crRNA transcript matures into multiple shorter segments (each shorter segment targets one motif)
crRNA = contains the 20-nucleotide spacer sequence used for base-pairing with target DNA
- From the integrated viral sequence at the CRISPR locus (from the Spacers)
- crRNA = variable
Cas9 Nuclease complex
Cas9 complex = programmed to cut specific DNA sequence by interrogating for PAM sequence THEN base pairing to the spacer sequence of the crRNA
Cas9 complex = composed of Cas9 protein and crRNA + tracrRNA
Cas9 = has 2 nuclease domains = can make dsDNA break in the target DNA
- Cuts are only made IF the DNA sequence properly matches the spacer sequence that is encoded in the crRNA
- cas9 = binds to DNA bases using an RNA guide
tracrRNA
tracrRNA = structural element that tethers the crRNA (complexes the crRNA and the Cas9 protein)
If the protospacer and the spacer sequences stored in the bacterial genome are the same –> THEN how does the CRISPR/Cas9 system differentiate between foreign viral DNA and its own genome at the CRISPR array locus
Cas protein recognizes a PAM sequence downstream of the spacer/protospacer base pairing on the dsDNA
PAM
PAM (Protospacer Adjacent Motif) - a sequence of several nucleotides
PAM = needed for Cas to dock on the DNA and open up the DNA THEN can test the crRNA for complementarity and have cleavage
- CRISPR array does NOT have PAM –> NOT encoded in the crRNA = cas will not cut the CRISPR array in the bacterial genome
Purpose - Searching for PAMs first allows the Cas9 complex to avoid self-targeting and allows Cas9 to survey the DNA more efficiently
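Sketch of the self vs. non-self check, assuming toy sequences: the same spacer sequence sits in the phage (followed by an NGG PAM) and in the bacterial array (followed by the repeat, so no PAM). The helper names and all sequences are made up for illustration, and everything is read 5'–>3' on the non-target strand.

```python
def pam_after(dna: str, protospacer: str):
    """Return the 3 nt immediately 3' of the first protospacer match, or None."""
    start = dna.find(protospacer)
    if start == -1:
        return None
    end = start + len(protospacer)
    pam = dna[end:end + 3]
    return pam if len(pam) == 3 else None


def is_ngg(pam) -> bool:
    """True if the 3-nt window fits the NGG pattern (N = any base)."""
    return pam is not None and pam[1:] == "GG"


SPACER = "GACGTTACCGGATCAATGCA"          # toy 20-nt memory (assumption)
REPEAT = "GTTTTAGAGC"                    # toy repeat (assumption)

phage_dna = "CCAT" + SPACER + "AGG" + "TTCA"     # protospacer + PAM
crispr_array = REPEAT + SPACER + REPEAT          # spacer flanked by repeats

print(is_ngg(pam_after(phage_dna, SPACER)))      # phage copy
print(is_ngg(pam_after(crispr_array, SPACER)))   # bacterial copy
```

The sequence match is identical in both places; only the flanking 3 nt decide whether the site is a legal target.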
Classes of CRISPR systems
CRISPR systems exist in two main classes (based on how many proteins bind to the associated RNA)
Class 1 – Several proteins complex to form nuclease
- Multiple proteins bind to the RNA to become active
- Has Types 1, 3, and 4 systems
Class 2 – Single nuclease effector (One protein binds RNA)
- Includes Types 2, 5, and 6 systems
Other CRISPR systems – Type 3 = transcription dependent ; Type 1 chews up DNA in an unpredictable manner
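The class/type grouping above as a small lookup table, in case it helps to quiz yourself. The dictionary layout and the function name class_of are just one possible way to write it down.

```python
# Class -> which types belong to it and what the effector looks like,
# taken directly from the card above.
CRISPR_CLASSES = {
    1: {"types": (1, 3, 4), "effector": "several proteins complex to form the nuclease"},
    2: {"types": (2, 5, 6), "effector": "single nuclease effector"},
}


def class_of(type_number: int) -> int:
    """Return the class (1 or 2) that a CRISPR system type belongs to."""
    for crispr_class, info in CRISPR_CLASSES.items():
        if type_number in info["types"]:
            return crispr_class
    raise ValueError(f"unknown CRISPR type: {type_number}")


print(class_of(2))   # Type 2 (Cas9) belongs to Class 2
```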
Why use Type 2 CRISPR systems in the lab
Type 2 = used for gene editing in lab because it is simple (only requires 1 Cas protein to bind to crRNA)
Type 2 ALSO makes precise dsDNA cuts which triggers the host DNA damage response
SpCas9 in Lab
Scientists use Streptococcus pyogenes Cas9 (SpCas9) in the lab for genome editing
How do we adapt the bacterial immune system to cut DNA in Eukaryotic cells:
1. A nuclear localization signal is added to enable nuclear import in Eukaryotes (brings the cas9 protein to the nucleus using the nuclear localization signal)
2. The crRNA and tracrRNA are combined into a single gRNA
ALL you need is 1 protein and 1 gRNA
Diversity of CRISPR systems
Each type of system has a unique PAM that is recognized
ALSO each type of system employs a different mechanism to destroy viral DNA
- Exception – Type 6 destroys RNA
Each species of bacteria can have multiple CRISPR systems + can have multiple CRISPR arrays within their genome
Why is there so much diversity – CRISPR is an adaptive immune system = requires the capacity for diversity to keep up with the quickly mutating viral threat
Cutting with sgRNA/SpCas9 Complex
SpCas9 uses NGG PAM (NGG PAM must be immediately downstream (3’ end) of the protospacer on the non-target strand)
IF have PAM and the spacer is complementary to the target then the SpCas9 creates a dsDNA break 3 nucleotides upstream (5’) of the PAM
Answer – B
NOTE – Need GG on the 5’-3’ strand (NGG is on non-target strand) + Need target sequence to be complementary to the spacer
- Cut = 3 nucleotides upstream of the PAM
Cas9 will only target if have the correct PAM and the spacer (in the crRNA) matches the target sequence
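Worked sketch of where the cut lands, assuming the DNA is written 5'–>3' on the non-target strand (so the protospacer reads the same as the spacer and the NGG sits right after it). The function name cas9_cut and the sequences are made up for illustration.

```python
def cas9_cut(dna: str, spacer: str):
    """Return (left, right) blunt fragments for the first spacer match + NGG.

    Toy model: requires a perfect spacer match directly followed by NGG,
    then cuts 3 nt upstream (5') of the PAM. Returns None if no site exists.
    """
    start = dna.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        pam = dna[pam_start:pam_start + 3]
        if len(pam) == 3 and pam.endswith("GG"):
            cut = pam_start - 3          # 3 nt upstream of the PAM
            return dna[:cut], dna[cut:]
        start = dna.find(spacer, start + 1)
    return None


SPACER = "GACGTTACCGGATCAATGCA"             # toy 20-nt spacer (assumption)
target = "TT" + SPACER + "TGG" + "AAAA"     # protospacer followed by TGG PAM

left, right = cas9_cut(target, SPACER)
print(left)
print(right)    # starts with the last 3 nt of the protospacer, then the PAM
```

So 17 nt of the protospacer stay on the left fragment and the last 3 nt stay attached to the PAM on the right fragment.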
Off-target cutting
SpCas9 is highly selective BUT it can cut off-target sites
- Cas9 can tolerate small mismatches that are far away from the PAM
Overall - As you travel farther from the PAM = complementarity requirement for base pairing is reduced –> Small mismatches at the PAM-distal end can be tolerated (the 5’ end of the guide)
Cas9 can have PAM flexibility (NAG(G/C)) –> Cas9 can pair with NAGG and NAGC PAMs with low affinity = leads to erroneous cuts
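Toy sketch of the "mismatches far from the PAM are tolerated" rule. The cutoffs used here (a 10-nt PAM-proximal region where no mismatch is allowed, at most 2 distal mismatches) are illustrative assumptions only, NOT values from the lecture, and the function names are hypothetical.

```python
def mismatch_distances(guide: str, site: str) -> list[int]:
    """Distance from the PAM of each mismatch (1 = base right next to the PAM).

    Guide and site are written 5'->3', so the last base is PAM-proximal.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    n = len(guide)
    return [n - i for i, (g, s) in enumerate(zip(guide, site)) if g != s]


def may_be_tolerated(guide: str, site: str,
                     proximal_len: int = 10, max_distal: int = 2) -> bool:
    """Assumed rule: no mismatch within `proximal_len` nt of the PAM,
    and at most `max_distal` mismatches farther away (both cutoffs made up)."""
    distances = mismatch_distances(guide, site)
    return all(d > proximal_len for d in distances) and len(distances) <= max_distal


GUIDE = "GACGTTACCGGATCAATGCA"
distal_mismatch = "T" + GUIDE[1:]        # mismatch at the 5' end of the guide
proximal_mismatch = GUIDE[:-1] + "T"     # mismatch right next to the PAM

print(mismatch_distances(GUIDE, distal_mismatch), may_be_tolerated(GUIDE, distal_mismatch))
print(mismatch_distances(GUIDE, proximal_mismatch), may_be_tolerated(GUIDE, proximal_mismatch))
```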
Beyond Cas9
- Mutated SpCas9 variants can improve cutting specificity + can change the PAM sequence + can change cutting behavior
- Example – Cas9v24 = uses NGA PAM instead of NGG PAM –> can be used if the desired cut site does not have an NGG site nearby
- Type 2 CRISPR/Cas systems from other species use different PAMs
- To find PAM diversity = look to other type 2 Cas systems in other species
Since its first use SpCas9 has seen improvement modifications
Use of CRISPR in lab
When CRISPR is used in the lab it is often with the goal of making mutations
Can make:
1. Mutations
2. Insertions
3. Deletions
What causes mutations in CRISPR
Cas9/gRNA cleaves the dsDNA BUT it does NOT directly mutate the genome INSTEAD mutations are caused by endogenous DNA damage repair mechanisms
The way that the cell responds to the break informs the modification in the genome
DNA damage repair mechanisms used in CRISPR
- NHEJ – Direct ligation of blunt ends
- Error prone mechanism –> may generate indels –> Indels = cause frameshift mutations –> Frameshift mutations are used to knockout genes
- Homology Directed Repair (HDR) - Templated repair from sister chromatid or other donor DNA
- Can precisely insert exogenous sequences
- Repairs in faithful manner
NHEJ
During NHEJ a number of proteins are recruited to the site of the dsDNA break –> Recruitment of proteins leads to an insertion or removal of nucleotides before the 2 ends are joined together –> means the process is often mutagenic
Imprecise repair causes Indels
- If the cut occurs in a protein coding gene and an INDEL occurs –> INDELs will cause a shift in reading frame of the gene = can result in a premature stop codon
What happens if the genome is repaired correctly following cleavage (get the same DNA sequence that had before the cut)
If the genome is correctly repaired THEN Cas9 will cut again until either the target sequence no longer matches the sgRNA or the PAM is mutated
- Cas9 will cut again until an INDEL is created that disrupts the target sequence or the PAM
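Toy loop showing the "cut again until an indel" idea. The repair outcomes are supplied by hand as a list of deletion sizes (0 = perfect repair), which is an assumption of this sketch; nothing here models real repair probabilities. Function names and sequences are made up.

```python
def cut_index(dna: str, spacer: str):
    """Index of the blunt cut (3 nt upstream of an NGG PAM), or None if no site."""
    start = dna.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        if dna[pam_start + 1:pam_start + 3] == "GG":
            return pam_start - 3
        start = dna.find(spacer, start + 1)
    return None


def edit_until_protected(dna: str, spacer: str, repairs: list[int]):
    """Apply repair outcomes in order while the site is still cuttable.

    Each entry in `repairs` is the number of bases lost at the cut
    (0 = perfect repair). Returns the final DNA and the number of cuts made.
    """
    cuts = 0
    for deleted in repairs:
        cut = cut_index(dna, spacer)
        if cut is None:          # site destroyed: Cas9 can no longer cut
            break
        cuts += 1
        dna = dna[:cut - deleted] + dna[cut:]
    return dna, cuts


SPACER = "GACGTTACCGGATCAATGCA"
target = "TT" + SPACER + "TGG" + "AAAA"

edited, cuts = edit_until_protected(target, SPACER, repairs=[0, 0, 1, 0])
print(cuts, cut_index(edited, SPACER))
```

The two perfect repairs leave the site intact, so it is cut a third time; the 1-bp deletion then breaks the spacer match and the last planned cut never happens.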
What do Frameshift mutations lead to
Frameshift mutations can lead to a premature stop codon
Premature stop codons can Knock out genes and lead to loss of protein expression in 2 ways:
1. Triggers nonsense-mediated decay to degrade nascent mRNA
- Nonsense-mediated decay = detects early stop codons and degrades problematic mRNA
2. Coding for a non-functional protein product
- Even if the mRNA is translated it can generate a truncated non-functional protein
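Small sketch of a frameshift creating a premature stop codon. The coding sequence is a made-up toy ORF (not PLK4 or any real gene), and translate is a minimal helper built on the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


wild_type = "ATGGCATTGACCGGTAAATCCTAA"        # toy ORF (assumption)
mutant = wild_type[:4] + wild_type[5:]        # 1-bp deletion = frameshift

print(translate(wild_type))   # full-length toy protein
print(translate(mutant))      # truncated after the shifted frame hits TGA
```

The deletion shifts the reading frame so that a TGA that used to be out of frame is now read as a stop codon, which is the situation that can trigger nonsense-mediated decay or give a truncated protein.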
Exercise #1 – Knock out the human PLK4 gene
Question 1 - Where would you target the sgRNA?
Want to target early in the gene
- Might target exon 2 because there could be a downstream AUG that can be used as an alternative start (IF you cut before that AUG THEN could still get translation starting from the second AUG in exon 2)
When targeting DNA to cause a deletion = want to make sure that translation has started (need to cut past the start codon) and THEN have a frameshift = need to be in the coding frame in exon 1 or exon 2
- Want to target early because want frameshift to start early on
Can target promoter – gives a LOF BUT probably won’t give a full KO
- Frameshift = full KO
What does Exon 1 start with
Exon 1 does NOT start with ATG (have a 5’UTR in exon 1 –> means you have a start codon in exon 1 BUT not at the very beginning)
Exercise #1 – Knock out the human PLK4 gene
Question 2 - What type of alteration are you looking for?
Indel that is NOT a multiple of 3
Need INDEL not in a multiple of 3 because want a frameshift –> frameshift gives a non-sense protein that will terminate at some point
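Quick check for the "not a multiple of 3" rule, comparing a reference allele with an edited allele by length. The sequences are toy examples and the function names are hypothetical; this only looks at net length change, so it assumes the edit is a simple insertion or deletion.

```python
def indel_size(reference: str, edited: str) -> int:
    """Net length change: negative = deletion, positive = insertion."""
    return len(edited) - len(reference)


def is_frameshift(reference: str, edited: str) -> bool:
    """An indel shifts the reading frame when its size is not a multiple of 3."""
    return indel_size(reference, edited) % 3 != 0


reference = "ATGGCATTGACC"
print(is_frameshift(reference, "ATGGATTGACC"))   # 1-bp deletion
print(is_frameshift(reference, "ATGTTGACC"))     # 3-bp deletion (in frame)
```

An in-frame 3-bp deletion only removes one amino acid, so it may leave a working protein, which is why it is not the alteration you are screening for.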
Exercise #1 – Knock out the human PLK4 gene
Question 3 - How do you confirm knockout of the intended target?
PCR the region and send for sequencing or do a western blot for the encoded protein
IF have a heterozygous deletion –> get mixed signals at the edited bases when sequencing
- Heterozygous deletion = deletion on 1 chromosome but not the other
- Homozygous could also have different INDELS in different alleles –> Sequence can look messy during sequencing
Exercise #1 – Knock out the human PLK4 gene
Question 4 - How do you deal with potential off-target events?
Do a BLAST of the guide against the whole genome to look for places that the guide could bind OR can use software to look for off target sites –> can make more specific guides/look to see if the guides are specific
- Can also use multiple gRNA
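Brute-force sketch of what an off-target search is doing. This is NOT BLAST or any real guide-design tool, just a toy scan of both strands for sites next to an NGG PAM that differ from the guide by a few bases. The function names, the max_mismatches default, and the test sequences are all assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string made of A, C, G, T."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


def scan_strand(seq: str, guide: str, max_mismatches: int) -> list[tuple[int, int]]:
    """List (start, mismatches) for every window followed by NGG on this strand."""
    n = len(guide)
    hits = []
    for i in range(len(seq) - n - 2):
        site, pam = seq[i:i + n], seq[i + n:i + n + 3]
        if pam[1:] == "GG":
            mismatches = sum(g != s for g, s in zip(guide, site))
            if mismatches <= max_mismatches:
                hits.append((i, mismatches))
    return hits


def find_candidate_sites(genome: str, guide: str, max_mismatches: int = 3) -> dict:
    """Scan both strands; '-' coordinates are on the reverse-complemented sequence."""
    return {
        "+": scan_strand(genome, guide, max_mismatches),
        "-": scan_strand(reverse_complement(genome), guide, max_mismatches),
    }


GUIDE = "GACGTTACCGGATCAATGCA"
off_target = "TTCGTTACCGGATCAATGCA"       # 2 mismatches at the 5' end
genome = ("CC" + GUIDE + "AGG" + "TTTT"
          + reverse_complement(off_target + "TGG") + "CC")

print(find_candidate_sites(genome, GUIDE))
```

Any hit with 0 mismatches is the intended target; hits with a few mismatches are the places you would go back and sequence after making the KO.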
To check:
1. After you do the KO sequence the rest of the genome to make sure there were no off target cuts (look at the sites that were predicted to be off targets and make sure there was no actual off target)
2. Complement the deletion –> IF you add the gene back you should be able to revert to the original phenotype
- THEN can know that the phenotype is specific to YOUR KO of the gene and NOT due to an off target effect
Using multiple gRNA in CRISPR
Each gRNA would have a different target
IF have 4 guides for 1 gene that all give the same phenotype then can be sure the phenotype is not due to an off target
- Adding in these 4 guides separately (KO the gene 4 ways) –> know that the phenotype is due to the KO and not off targets because the odds that all 4 caused the same off target are very low (CHECK with riana)