WS2: Data Analysis II (TCGA) Flashcards
1
Q
Value of using large-scale sequencing data:
A
- Can access whole-genome sequencing data from thousands of cancer patients
- Can use this to compare genetic status of any gene across cancer types
- Research, clinical and learning applications
2
Q
New vs older generations of sequencing technology:
A
- Sanger (slab gel, capillary array electrophoresis, microchip) -> each takes a matter of years to sequence an entire human genome
- Next gen (Illumina/Solexa) -> 12 days
- Next gen (Illumina NovaSeq) -> 15 minutes
- Both next-gen platforms rely on massively parallel throughput
3
Q
What is the TCGA?
A
- The Cancer Genome Atlas
- Landmark cancer genomics program that molecularly characterised over 20,000 samples spanning 33 cancer types
- Data collected, processed and analysed using standardised approaches, meaning we can make direct comparisons between cancer types
- Raw data publicly available
4
Q
How can TCGA data be accessed?
A
- Analysing the raw data requires powerful bioinformatics pipelines
- One of the most commonly used interfaces is cBioPortal (cbioportal.org)
- It can be used to identify genetic alterations, features and mutations in a gene of interest
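
As a rough illustration of the kind of comparison such an interface supports, the sketch below tallies how often one gene is mutated in each cancer type. It is a minimal example under stated assumptions: the `SAMPLES` records, the sample IDs and the `alteration_frequency` helper are all hypothetical, invented for illustration, and are not TCGA data or part of any portal's API.

```python
from collections import defaultdict

# Hypothetical toy records: (sample_id, cancer_type, set of mutated genes).
# These values are made up for illustration; they are NOT real TCGA data.
SAMPLES = [
    ("S1", "Breast", {"TP53", "PIK3CA"}),
    ("S2", "Breast", {"PIK3CA"}),
    ("S3", "Breast", set()),
    ("S4", "Lung", {"TP53", "KRAS"}),
    ("S5", "Lung", {"TP53"}),
    ("S6", "Colorectal", {"APC", "KRAS"}),
]


def alteration_frequency(samples, gene):
    """Return {cancer_type: fraction of samples in which `gene` is mutated}."""
    totals = defaultdict(int)   # samples seen per cancer type
    altered = defaultdict(int)  # samples with the gene mutated per cancer type
    for _sample_id, cancer_type, mutated_genes in samples:
        totals[cancer_type] += 1
        if gene in mutated_genes:
            altered[cancer_type] += 1
    return {ct: altered[ct] / totals[ct] for ct in totals}


if __name__ == "__main__":
    # Print the per-cancer-type mutation frequency of the gene of interest.
    freqs = alteration_frequency(SAMPLES, "TP53")
    for cancer_type, freq in sorted(freqs.items()):
        print(f"{cancer_type}: {freq:.0%}")
```

Because TCGA samples were processed with standardised approaches, a per-cancer-type frequency like this can be compared directly across cancer types; with real data the records would come from a portal export rather than a hard-coded list.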