Final Exam Flashcards

Question

what are the cons of plate based single cell isolation

Answer 1

low throughput and efficiency (historical significance)

Answer 2

Pros: 1. targeted cell isolation 2. high-precision sorting 3. multiparameter sorting
Cons: 1. cell viability 2. limited by marker availability 3. throughput and time efficiency lower than other methods

Answer 3

Pros: 1. utilization of antibodies to specifically target and capture CTCs from peripheral blood
Cons: 1. rarity of CTCs in the bloodstream 2. potential for bias in antibody-based capture 3. sensitivity and specificity of the chosen antibodies 4. throughput and time efficiency lower than other methods

Answer 4

Pros: 1. precise manipulation of cells and fluids at the microscale 2. ability to integrate multiple steps into a single microfluidic chip, reducing sample loss and technical variability
Cons: 1. lower throughput 2. complexity and cost of the microfluidic chips 3. low efficiency for small or fragile cells

Answer 5

capturing and processing individual cells in microfluidic channels or chambers; the controlled environment benefits the study of specific cell types or low-abundance transcripts

Answer 6

encapsulating individual cells in oil droplets, each containing a unique barcode. designed to process a high number of cells in a single run

Answer 7

Pros: 1. Scalability and parallel processing 2. Reduced cost and time per cell 3. Large scale and high throughput by barcoded beads in droplets, which tag the mRNA of individual cells
Cons: 1. Difficulty in capturing large/irregularly shaped cells 2. Potential for capturing multiple cells in a droplet 3. Many cells = lower depth of sequencing per cell

Answer 8

10X genomics

Answer 9

1. uses droplets for single-cell isolation 2. no ERCC spike-ins 3. 8 bp UMI 4. no full length coverage 5. PCR amplification 6. not usable for bulk 7. paired-end sequencing

Answer 10

1. uses FACS for single cell isolation 2. ERCC spike-ins 3. no UMI 4. full length coverage 5. PCR amplification 6. usable for bulk 7. single-end sequencing

Answer 11

unique molecular identifiers (UMIs): short nucleotide sequences added to RNA molecules before amplification to tag each original RNA molecule uniquely, so that true RNA molecules can be distinguished from PCR duplicates. This significantly improves the quantitative accuracy of scRNA-seq
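
A minimal sketch of how UMIs separate original molecules from PCR duplicates, assuming each read has already been assigned to a cell barcode and a gene. The read tuples and the function name `count_molecules` are hypothetical, and the sketch ignores UMI sequencing errors, which real pipelines typically correct for.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count unique UMIs per (cell, gene); a repeated UMI is treated as a PCR duplicate."""
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}

# Hypothetical reads: (cell barcode, gene, UMI)
reads = [
    ("AAAC", "GAPDH", "ACGTACGT"),
    ("AAAC", "GAPDH", "ACGTACGT"),  # same cell, gene, and UMI: PCR duplicate
    ("AAAC", "GAPDH", "TTGCATGC"),  # new UMI: a second original molecule
    ("CCGT", "GAPDH", "ACGTACGT"),  # same UMI in another cell: separate molecule
]
counts = count_molecules(reads)
```

Four reads collapse to three molecules here, because the duplicated read in cell `AAAC` is counted once.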

Answer 12

to sequence the entire RNA molecule from the 5' to the 3' end. provides comprehensive info about transcript isoforms, alternative splicing events, and other post-transcriptional modifications

Answer 13

full length sequencing requires reading the entire RNA transcript, so if a UMI is added only to one end, it becomes ineffective or gets lost in the process of sequencing the full length transcript

Answer 14

counting gene expression (i.e. counting transcripts)

Answer 15

10X genomics

Answer 16

- myocytes - brown adipocytes - neurons - sperm cells - oocytes - hepatocytes - endocrine cells

Answer 17

- cell stress - apoptosis - low RNA integrity - low-quality RNA extraction - technical errors in library prep

Answer 18

- RNA integrity - cell viability - technical artifacts (such as cell doublets) - batch effects - cell heterogeneity

Answer 19

mitochondrial

Answer 20

Pros: - integrates well with existing 10x genomics workflow - offers a relatively large capture area, which is beneficial for analyzing tissue sections - provides high quality data with robust technical support
Cons: - limited to predefined capture areas, which may not suit all experimental designs - the cost can be relatively high - the capture areas are not cell-resolution

Answer 21

1. 10X Genomics Visium 2. Stereo-seq 3. NanoString GeoMx Digital Spatial Profiler 4. Slide-seq 5. Seq-Scope 6. MERFISH

Answer 22

Pros: - high spatial resolution - comprehensive coverage - flexibility in targeting (can target a wide variety of RNA species) - compatibility with standard histological samples
Cons: - complexity and cost - requires robust bioinformatics support - instrumentation requirements

Answer 23

Pros: - high-plex analysis, enabling simultaneous assessment of numerous targets - flexible in terms of target selection (RNA and protein) - compatible with standard FFPE samples
Cons: - lower spatial resolution compared to other platforms - dependency on predefined probes (limited novel transcript discoveries)

Answer 24

Pros: - high spatial resolution - allows for discovery of novel spatial biomarkers
Cons: - technically challenging and requires special equipment - lower throughput, limits parallel sample processing

Answer 25

Pros: - exceptionally high spatial resolution
Cons: - still in developmental stages, potentially high cost and technical complexity

Answer 26

Pros: - extremely high-plex capacity - high spatial resolution
Cons: - requires specialized and expensive equipment - complex data analysis pipeline

Answer 27

trade off between spatial resolution and throughput

Answer 28

1. single cell Whole Genome Sequencing (scWGS) 2. single cell Copy Number Variation (CNV) profiling 3. single cell Whole Exome Sequencing (scWES) and single cell targeted DNA sequencing

Answer 29

- higher technical noise compared to scRNA-seq - need for high sequencing depth to detect rare mutations - the potential for DNA amplification biases - cost-efficiency - currently rare, not a lot of data for reference/comparison

Answer 30

single cell Assay for Transposase-Accessible Chromatin using sequencing; surveys the physical structure of the genome by identifying regions of open chromatin

Answer 31

comprehensive characterization of immune cells - gene expression - surface proteins - cytokines - functional states

Answer 32

- 5' transcriptome gene expression - T and B cell repertoire - antigen specificity

Answer 33

Cellular Indexing of Transcriptomes and Epitopes by Sequencing; determines the interaction between different immune cell groups and identifies novel distinct immune cell subsets in health and disease

Answer 34

the trade off between spatial resolution and throughput

Answer 35

depends on research question, tissue type, and available resources

Answer 36

- high technical noise - high cost - potential for DNA amplification bias

Answer 37

- tumor heterogeneity (in terms of mutations) - hematology - gene editing

Answer 38

- large volume of data - low depth of sequencing per cell - biological variability across cells/samples - technical variability across cells/samples

Answer 39

a zero count can either mean the gene is not expressed or that the transcript was not detected (false negative)

Answer 40

- transcriptional bursting - varying rates of RNA processing - continuous or discrete cell identities - environmental stimuli - temporal changes

Answer 41

- cell-specific capture efficiency - library quality - amplification bias (drop out) - batch effects - dilution factor

Answer 42

- RNA isolation not performed on the same day - library prep not performed on the same day - different people performing RNA isolation/library prep for all samples - not using same reagents for all samples - RNA isolation/library prep not performed at same location

Answer 43

- split replicates of different sample groups across batches - include batch info in experimental metadata

Answer 44

DoubletDecon

Answer 45

- filter out cells based on mitochondrial reads (%) - filter out cells with too few or too many reads - filter out cells based on n features (genes) (too few or too many) - filter out genes based on expression across the cells - integrate and remove batch effects
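
A rough sketch of the per-cell filters listed above (mitochondrial percentage and number of detected features), written as plain Python for illustration. The function name `passes_qc`, the toy cells, and every threshold are hypothetical, and identifying mitochondrial genes by an `MT-` prefix is an assumption that depends on the annotation in use.

```python
def passes_qc(counts, min_features=200, max_features=2500, max_pct_mito=5.0):
    """Return True if one cell's gene counts pass simple QC thresholds.

    All thresholds are illustrative placeholders, not recommended values.
    """
    total = sum(counts.values())
    if total == 0:
        return False  # no counts at all
    # Number of genes detected (non-zero count) in this cell.
    n_features = sum(1 for c in counts.values() if c > 0)
    # ASSUMPTION: mitochondrial genes are identified by an "MT-" prefix.
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    pct_mito = 100.0 * mito / total
    return min_features <= n_features <= max_features and pct_mito <= max_pct_mito

# Hypothetical toy cells, filtered with tiny thresholds so the example stays readable.
cells = {
    "healthy": {"GAPDH": 50, "ACTB": 45, "MT-CO1": 5},
    "stressed": {"GAPDH": 10, "ACTB": 10, "MT-CO1": 80},
    "nearly_empty": {"GAPDH": 1, "ACTB": 0, "MT-CO1": 0},
}
kept = [name for name, counts in cells.items()
        if passes_qc(counts, min_features=2, max_features=10, max_pct_mito=10.0)]
```

Only the first cell survives: the second has 80% mitochondrial counts and the third has a single detected gene.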

Answer 46

CITE-seq integrates scRNA-seq with simultaneous protein-level data, enabling characterization of both transcriptomes and cell surface protein markers from single cells, whereas immune cell profiling may include 5'-end transcript sequences, offering insights into transcriptional initiation patterns specific to immune cells without direct protein-level measurements

Answer 47

because we know the approximate distance between the two reads. this is especially helpful with indels

Answer 48

find the genomic coordinates of the sequencing reads considering that RNA undergoes splicing (prioritize mapping in non-intronic regions)

Answer 49

- data (raw sequencing reads) - high performance computing platform - software - reference genome sequence in FASTA format - exonic/intronic genome coordinates or gene annotation file

Answer 50

a digital nucleic acid sequence database assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species (it does not accurately represent the set of genes of any single individual organism)

Answer 51

a description of where genetic elements (intron, exon, transcript, gene) are located in the genome, in the form of begin and end coordinates
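
As an illustration only, an annotation of this kind can be pictured as a list of records with begin and end coordinates. The `Feature` type, the `features_at` helper, the gene name, and all coordinates below are hypothetical; real annotation files carry more fields than this sketch does.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    chrom: str
    kind: str   # e.g. "gene" or "exon"
    start: int  # begin coordinate (1-based, inclusive in this sketch)
    end: int    # end coordinate (inclusive in this sketch)
    name: str

def features_at(annotation, chrom, pos):
    """Return every annotated element whose coordinates cover the given position."""
    return [f for f in annotation if f.chrom == chrom and f.start <= pos <= f.end]

# Hypothetical gene with two exons separated by an intron.
annotation = [
    Feature("chr1", "gene", 100, 900, "GENE_A"),
    Feature("chr1", "exon", 100, 250, "GENE_A"),
    Feature("chr1", "exon", 700, 900, "GENE_A"),
]
exonic = [f.kind for f in features_at(annotation, "chr1", 200)]
intronic = [f.kind for f in features_at(annotation, "chr1", 500)]
```

Position 200 falls inside both the gene and its first exon, while position 500 falls inside the gene only, which marks it as intronic.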

Answer 52

yes- RNA aligners that do not use gene annotation exist (some are called de novo aligners)

Answer 53

yes- RNA aligners can use a target transcriptome as a multi-FASTA file

Answer 54

STAR, HISAT2, BBMap

Answer 55

BWA, Bowtie2

Answer 56

1. seed searching 2. clustering, stitching, and scoring

Answer 57

- cell calling - removing PCR duplicates - assigning reads to individual genes and cells, achieved through barcode and UMI sequences

Answer 58

Smart-seq2: each cell has its own .bam. 10X: 1 combined .bam plus barcodes.tsv, features.tsv, matrix.mtx

Answer 59

1. Process data (on a server/cloud) and obtain GE per cell values (small size manageable outputs) 2. Filter out genes 3. Filter cells 4. Normalize expression values 5. Identify highly variable genes 6. Scale data, regress out unwanted variation 7. Reduce dimensions 8. Determine significant principal components 9. Use the PCs to cluster cells with graph-based clustering 10. Visualize clusters with non-linear dimensional reduction (tSNE or UMAP) 11. Detect and visualize marker genes for the clusters 12. Classify the cells by cell type

Answer 60

t- removing them makes the data smaller and computations faster

Answer 61

- low mRNA content in a cell - variable mRNA capture - variable sequencing depth

Answer 62

- divide gene's UMI count in a cell by the total number of UMIs in that cell - multiply the ratio by a scale factor (10,000 by default) - transform the results by taking natural log
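
The three steps above can be sketched for a single cell as follows. The function name `log_normalize` and the toy counts are hypothetical. One assumption to flag: the sketch takes the natural log of (1 + x) rather than of x itself, so that genes with zero counts stay at zero instead of being undefined.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Normalize one cell's UMI counts: count / total * scale_factor, then ln(1 + x)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}

# Hypothetical cell with 100 UMIs in total.
cell = {"GAPDH": 30, "ACTB": 60, "CD3E": 10, "CD8A": 0}
normalized = log_normalize(cell)
```

Here `CD3E` holds 10 of 100 UMIs, so its scaled value is 1,000 before the log transform, and `CD8A` remains 0.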

Answer 63

high expressing genes; use SCTransform instead

Answer 64

- modeling of gene expression data - normalization and variance stabilization - feature selection - mitigation of batch effects - scalability

Answer 65

strong, low expressing genes have higher variance

Answer 66

- compute the mean and variance of each gene using the unnormalized UMI counts - take log10 of mean and variance - fit a curve to predict the variance of each gene as a function of its mean expression - standardize the counts using each gene's observed mean and its expected variance from the fitted curve - for each gene, compute the variance of the standardized values across all cells - rank the genes based on standardized variance and use the top 2000 for PCA and clustering
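
A simplified sketch of this ranking procedure, under stated assumptions: a straight-line least-squares fit stands in for the fitted curve, genes with zero variance are skipped, and no clipping of extreme values is applied. The function name `rank_variable_genes` and the toy counts are hypothetical, so treat this as an illustration of the idea rather than any package's implementation.

```python
import math
from statistics import mean, variance

def rank_variable_genes(counts, n_top=2000):
    """Rank genes by variance standardized against a fitted mean-variance trend."""
    # Step 1: per-gene mean and variance on raw counts (constant genes are skipped).
    stats = {g: (mean(v), variance(v)) for g, v in counts.items() if variance(v) > 0}
    # Step 2: log10 of mean and variance.
    xs = [math.log10(m) for m, _ in stats.values()]
    ys = [math.log10(s2) for _, s2 in stats.values()]
    # Step 3: fit the trend. ASSUMPTION: a straight line replaces the fitted curve.
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    scores = {}
    for gene, (m, _) in stats.items():
        expected_sd = math.sqrt(10 ** (intercept + slope * math.log10(m)))
        # Steps 4-5: standardize the counts, then take the variance of the result.
        z = [(c - m) / expected_sd for c in counts[gene]]
        scores[gene] = variance(z)
    # Step 6: highest standardized variance first.
    return sorted(scores, key=scores.get, reverse=True)[:n_top]

# Hypothetical counts for four genes across six cells.
counts = {
    "HK1": [10, 10, 11, 9, 10, 10],
    "HK2": [100, 98, 102, 101, 99, 100],
    "HK3": [1, 1, 2, 1, 1, 0],
    "MARKER": [0, 0, 50, 0, 60, 0],
}
top = rank_variable_genes(counts, n_top=1)
```

The gene expressed in only two of the six cells sits far above the fitted trend, so it ranks first.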

Answer 67

gives equal weight in downstream analyses so the highly expressed genes do not dominate

Answer 68

Z score normalization in Seurat's ScaleData function: - shifts the expression of each gene so that the mean expression across cells is 0 - scales the expression of each gene so that the variance across cells is 1
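
The two operations can be sketched for one gene as below. This illustrates z-scoring in general, not the internals of `ScaleData`. The function name `scale_gene` and the expression values are hypothetical, and returning zeros for a constant gene is an assumption made here to avoid dividing by zero.

```python
from statistics import mean, stdev, variance

def scale_gene(values):
    """Z-score one gene across cells: subtract the mean, divide by the standard deviation."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0] * len(values)  # ASSUMPTION: a constant gene is mapped to zeros
    return [(v - mu) / sd for v in values]

# Hypothetical normalized expression of one gene in three cells.
scaled = scale_gene([1.0, 2.0, 3.0])
```

After scaling, the gene has mean 0 and variance 1 across cells regardless of how highly it was expressed.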

Answer 69

Seurat constructs linear models to predict gene expression based on user-defined variables
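
To make the idea concrete, the sketch below fits a one-variable linear model for a single gene and keeps the residuals, which is the general principle behind regressing out unwanted variation. The function name `regress_out`, the covariate, and the values are hypothetical, and a single covariate is assumed for simplicity.

```python
from statistics import mean

def regress_out(expression, covariate):
    """Fit expression ~ covariate by least squares and return the residuals."""
    x_bar, y_bar = mean(covariate), mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, expression)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed value minus the part predicted by the covariate.
    return [y - (intercept + slope * x) for x, y in zip(covariate, expression)]

# Hypothetical gene whose expression is fully explained by the covariate.
pct_mito = [1.0, 2.0, 3.0, 4.0]
expression = [2.0, 4.0, 6.0, 8.0]
residuals = regress_out(expression, pct_mito)
```

Because this toy gene tracks the covariate exactly, nothing is left after regression. For a real gene, the residuals would carry the variation not explained by the chosen variable.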

Answer 70

1. compute cell cycle scores for each cell based on its expression of G2/M and S phase markers 2. model each gene's relationship between expression and the cell cycle score 3. regress: 2 options - remove ALL signals associated with cell cycle stage - remove the difference between G2M and S phase scores (preserves signals for non-cycling vs cycling cells; only differences in cell cycle phase amongst the dividing cells are removed. useful when studying differentiation processes)

(103 cards)