RNA Sequencing Flashcards
What is RNA Sequencing (RNA-Seq)?
A genomic technique that measures the quantity and presence of RNA molecules in a biological sample.
What is RNA sequencing commonly used for?
Analysing gene expression and transcription at the genome level.
What are the three steps for RNA-Seq?
- Prepare a sequencing library
- Sequence
- Data analysis
What are the 6 steps of preparing an RNA-Seq library?
- Isolate RNA
- Break RNA into small fragments (200-300bp)
- Convert the RNA fragments into dsDNA
- Add sequencing adaptors
- PCR amplify the library
- Quality Control
Why do we convert the RNA fragments into double-stranded DNA (during RNA-Seq library preparation)?
Because dsDNA is more stable than RNA, so it can be easily amplified and modified.
What is the purpose of sequencing adaptors (in RNA-Seq library preparation)?
- Allow fragments to attach to the flow cell
- Allow fragments to be identified, so multiple samples can be sequenced at once
- Allow the sequencing machine to recognise fragments
What is checked in Quality Control (library preparation)?
- Verification of library concentration
- Verification of library fragment lengths (not too long or short)
When sequencing, how many fragments are laid out in a grid?
About 400,000,000
What is the name of the grid with fragments laid out during sequencing?
Flow cell
How does sequencing work in RNA-Seq?
- Inside the flow cell, fluorescent probes are colour coded according to the type of nucleotide they bind.
- Once a probe has bound the current nucleotide in each fragment, the machine takes a picture of the flow cell from above.
- The probes are washed away.
- The process repeats, one nucleotide per cycle, until the machine has determined the nucleotide sequence of each fragment.
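The cycle described above can be sketched in a few lines of Python. This is a toy model, not how an instrument's software works: each "image" is a dictionary from a grid position on the flow cell to the colour seen there, and the `COLOUR_TO_BASE` mapping is a made-up assumption (real dye sets differ between instruments).

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, column) of a fragment on the flow cell grid

# Hypothetical colour coding, chosen only for illustration.
COLOUR_TO_BASE = {"green": "A", "red": "C", "yellow": "G", "blue": "T"}


def call_bases(images: List[Dict[Position, str]]) -> Dict[Position, str]:
    """Build one read per grid position from one image per cycle."""
    reads: Dict[Position, str] = {}
    for image in images:  # one picture is taken per cycle
        for position, colour in image.items():
            # Append the base whose probe colour was seen at this position.
            reads[position] = reads.get(position, "") + COLOUR_TO_BASE[colour]
    return reads


# Three cycles imaged for two fragments on the grid.
images = [
    {(0, 0): "green", (0, 1): "blue"},
    {(0, 0): "red", (0, 1): "blue"},
    {(0, 0): "yellow", (0, 1): "green"},
]
reads = call_bases(images)
```

Each position on the grid accumulates one base per picture, so the read length equals the number of cycles.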
What are quality scores?
Quality scores reflect how confident the machine is in the base it has called.
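The card above does not name a scale, but quality scores are usually reported as Phred scores, where Q = -10 * log10(P) and P is the estimated probability that the base call is wrong. The sketch below assumes that scale and the common "+33" ASCII encoding used in FASTQ files; the function names are illustrative.

```python
import math
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score for an estimated error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def decode_quality_string(quality: str, offset: int = 33) -> List[int]:
    """Turn a FASTQ quality string into integer Phred scores.

    The offset of 33 is an assumption (Sanger / recent Illumina encoding).
    """
    return [ord(char) - offset for char in quality]


scores = decode_quality_string("I5!")
```

Under this scale a score of 20 means a 1 in 100 chance of a wrong call, and a score of 30 means 1 in 1,000.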
What causes low quality scores?
- Low diversity (an overabundance of a single colour), which makes it hard to tell individual sequences apart.
- A probe not shining as brightly as it should.
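One rough way to spot low diversity after the fact is to look at how often the most common base appears at each cycle across all reads. The sketch below is an illustration only; the 0.8 threshold is an arbitrary assumption, not a recommended cut-off.

```python
from collections import Counter
from typing import List


def dominant_base_fraction(reads: List[str], cycle: int) -> float:
    """Fraction of reads that share the most common base at this cycle."""
    bases = [read[cycle] for read in reads if len(read) > cycle]
    if not bases:
        raise ValueError("no read is long enough to cover this cycle")
    most_common_count = Counter(bases).most_common(1)[0][1]
    return most_common_count / len(bases)


def low_diversity_cycles(reads: List[str], threshold: float = 0.8) -> List[int]:
    """Cycles (0-based) where one base exceeds the threshold fraction."""
    n_cycles = max(len(read) for read in reads)
    return [
        cycle
        for cycle in range(n_cycles)
        if dominant_base_fraction(reads, cycle) > threshold
    ]


example_reads = ["ACGT", "AGCT", "ATTA", "AACG"]
flagged = low_diversity_cycles(example_reads)
```

In this example every read starts with A, so only the first cycle is flagged.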
What are the four lines of data in a sequencing read?
- 1st line: Always starts with ‘@’, followed by a unique ID for the sequence.
- 2nd line: Contains the bases called for the sequenced fragment.
- 3rd line: Always starts with a ‘+’ character.
- 4th line: Contains the quality scores for each base in the sequenced fragment.
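The four-line layout above is what a FASTQ file stores for each read. A minimal parser sketch follows, assuming each record occupies exactly four lines with no wrapping; `FastqRecord` and `parse_fastq` are illustrative names, not part of any library.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str   # line 1 without the leading '@'
    bases: str     # line 2
    quality: str   # line 4, one character per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four non-empty lines."""
    stripped = [line.strip() for line in lines if line.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("expected four lines per read")
    for start in range(0, len(stripped), 4):
        header, bases, plus, quality = stripped[start:start + 4]
        if not header.startswith("@"):
            raise ValueError(f"line 1 must start with '@': {header!r}")
        if not plus.startswith("+"):
            raise ValueError(f"line 3 must start with '+': {plus!r}")
        if len(bases) != len(quality):
            raise ValueError("need one quality character per base")
        yield FastqRecord(header[1:], bases, quality)


example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"
records = list(parse_fastq(example.splitlines()))
```

The parser counts lines in groups of four instead of searching for ‘@’, because ‘@’ is also a valid quality character and can appear at the start of line 4.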
What are garbage reads?
- Reads with low quality base calls
- Reads that are clearly artefacts of the chemistry
What is an artefact of the chemistry?
When adaptors bind to each other instead of to DNA fragments, creating a false ‘read’.
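Both kinds of garbage read can be flagged with a simple filter. The sketch below is a simplified heuristic under stated assumptions: quality strings use the "+33" Phred encoding, a mean score below 20 counts as low quality (an arbitrary cut-off), and a read that begins with the adaptor sequence is treated as adaptor binding to adaptor. The adaptor string in the usage lines is a placeholder; use the sequence documented for your library kit.

```python
def mean_quality(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes '+33' encoding)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


def is_garbage(
    bases: str,
    quality: str,
    adaptor: str,
    min_mean_quality: float = 20.0,  # arbitrary illustrative threshold
) -> bool:
    """Return True for reads that should be filtered out."""
    if not bases or len(bases) != len(quality):
        return True  # malformed read
    if mean_quality(quality) < min_mean_quality:
        return True  # low quality base calls
    if bases.startswith(adaptor):
        return True  # no insert: the read runs straight into adaptor sequence
    return False


ADAPTOR = "AGATCGGAAGAGC"  # placeholder value for illustration only

good = is_garbage("ACGTACGT", "IIIIIIII", ADAPTOR)
low_quality = is_garbage("ACGTACGT", "########", ADAPTOR)
dimer = is_garbage(ADAPTOR + "ACGT", "I" * 17, ADAPTOR)
```

Dedicated trimming and filtering tools handle partial adaptor matches and per-base trimming, which this sketch does not attempt.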