UNIT 2 Flashcards
1
Q
Mapping of Sequence Reads to Reference Genomes
A
- Mapping of Sequence Reads
- Fundamental technique in genomics and bioinformatics
- Aligning reads to a reference genome
- Underpins many downstream analyses
- Key Concepts
- Sequence reads
- Reference genome
- Alignment
- Alignment algorithms and tools
- The Mapping Process
- Preprocessing of reads
- Indexing the reference genome
- Alignment
- Post-alignment processing
- Additional Key Concepts
- Mapping quality (MAPQ)
- Unique vs. multi-mapping reads
- Paired-end vs. single-end reads
- Gapped vs. ungapped alignment
- Importance of Read Mapping
- Variant detection
- Gene expression analysis
- Genome assembly and structural variation
- Comparative genomics
- Clinical diagnostics
- Challenges in Read Mapping
- Repetitive regions
- Structural variations
- Sequencing errors
- Computational demands
- Genomic diversity
- Tools and Algorithms
- BWA, Bowtie, STAR, HISAT2, Minimap2, GMAP/GSNAP
- Hash-based indexing, Burrows-Wheeler Transform, suffix arrays, seed-and-extend
- Recent Advances
- Long-read mapping
- Graph-based references
- Machine learning approaches
- Cloud computing and parallelization
- Real-time mapping
- Practical Considerations
- Choice of reference genome
- Aligner selection
- Parameter optimization
- Quality control and validation
- Handling multi-mapping reads
- Conclusion
- Essential technique in genomics
- Enables accurate interpretation of sequencing data
- Continuous advancements in tools and algorithms
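The indexing and seed-and-extend steps listed on this card can be sketched with a toy hash-based mapper. This is a minimal illustration, not how any particular aligner works: the function names (`build_kmer_index`, `map_read`), the seed length `k=4`, the mismatch limit, and the reference string are all assumptions made for the example, and the extension step is ungapped (it counts mismatches only, so it cannot find indels).

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Hash-based index: map each k-mer to the 0-based positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k, max_mismatches=2):
    """Seed with every k-mer of the read, then extend without gaps.

    Returns (reference_start, mismatches) pairs, best hit first.
    """
    hits = {}
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for ref_pos in index.get(seed, []):
            start = ref_pos - offset  # where the whole read would begin
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items(), key=lambda hit: (hit[1], hit[0]))


# Hypothetical toy data: the read is reference[8:18] with one substitution.
reference = "ACGTACGTTAGCCGATAGGCTTAACG"
index = build_kmer_index(reference, k=4)
hits = map_read("TAGCCGTTAG", reference, index, k=4)
print(hits)  # [(8, 1)]
```

A read that returned more than one hit with the same mismatch count would be a multi-mapping read in the sense used on this card.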
2
Q
Sequence Mapping and Read Mapping
A
- Sequence Mapping
- Computational process of aligning reads to a reference genome
- Essential for understanding sequencing data
- Key Concepts
- Sequence reads
- Reference genome
- Alignment
- Hashing-based methods (MAQ)
- BWT-based methods (BWA, Bowtie, Bowtie2)
- Seed-and-extend algorithms
- Splice-aware mapping (STAR, HISAT2)
- Challenges
- Sequencing errors
- Repetitive regions
- Insertion/deletion (indel) handling
- Large structural variations
- Computational demands
- Genetic diversity
- Importance
- Variant calling
- RNA-Seq analysis
- Functional genomics
- Personalized medicine
- Comparative genomics
- Practical Considerations
- Choice of reference genome
- Aligner selection
- Parameter optimization
- Quality control
- Handling multi-mapping reads
- Conclusion
- Fundamental technique in genomics
- Essential for interpreting sequencing data
- Continuous advancements in tools and algorithms
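The idea behind the BWT-based methods on this card can be sketched as a tiny FM-index with backward search for exact matches. This is a teaching sketch under stated assumptions: the names `build_fm_index` and `backward_search` are made up for the example, the suffix array is built by naive sorting (production aligners use far more efficient construction and sampled tables), and only exact matching is shown.

```python
def build_fm_index(text):
    """Return (suffix_array, bwt, first, occ) for a text ending in the sentinel '$'."""
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive, for illustration
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to the sentinel
    alphabet = sorted(set(text))
    # first[c]: index of the first sorted suffix that starts with c
    first, total = {}, 0
    for ch in alphabet:
        first[ch] = total
        total += text.count(ch)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {ch: [0] for ch in alphabet}
    for b in bwt:
        for ch in alphabet:
            occ[ch].append(occ[ch][-1] + (b == ch))
    return sa, bwt, first, occ


def backward_search(pattern, fm):
    """Return the sorted 0-based positions where pattern occurs exactly."""
    sa, _bwt, first, occ = fm
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


# Hypothetical toy reference; '$' sorts before A, C, G, T in ASCII.
reference = "ACGTACGTTAGC"
fm = build_fm_index(reference + "$")
print(backward_search("ACG", fm))  # [0, 4]
print(backward_search("TAG", fm))  # [8]
print(backward_search("GGG", fm))  # []
```

The search processes the pattern from its last character to its first, narrowing a range of suffix-array rows at each step, which is why lookup time depends on the read length rather than the reference length.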
3
Q
Read Sequence Alignment and Aligners
A
- Read Sequence Alignment
- Fundamental step in bioinformatics
- Aligning reads to a reference genome
- Essential for downstream analysis
- Process
- Preprocessing
- Mapping
- Scoring
- Storing results (SAM/BAM)
- SAM and BAM Formats
- SAM: Text-based, human-readable
- BAM: Binary, compressed, efficient
- SAM records: 11 mandatory tab-separated fields, then optional TAG:TYPE:VALUE fields
- Popular Alignment Tools
- BWA (Burrows-Wheeler Aligner)
- Bowtie/Bowtie2
- STAR (Spliced Transcripts Alignment to a Reference)
- HISAT2
- Minimap2
- Key Concepts
- Mapping quality
- Gapped vs. ungapped alignment
- Paired-end vs. single-end reads
- Structural variations
- Challenges
- Sequencing errors
- Repetitive regions
- Insertion/deletion (indel) handling
- Large structural variations
- Computational demands
- Genetic diversity
- Applications
- Variant calling
- Gene expression analysis
- Genome assembly
- Comparative genomics
- Clinical diagnostics
- Conclusion
- Essential for understanding sequencing data
- Variety of tools and algorithms available
- Continuous advancements in the field
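To make the SAM fields and mapping quality on this card concrete, here is a small pure-Python sketch that parses one alignment line. The record, the `SamRecord` class, and the helper names are hypothetical and exist only for this example; real pipelines would normally use an established library or SAMtools rather than hand-rolled parsing. The MAPQ conversion assumes the SAM definition, MAPQ = -10 log10(probability that the mapping position is wrong).

```python
import re
from dataclasses import dataclass

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


@dataclass
class SamRecord:
    qname: str
    flag: int
    rname: str
    pos: int      # 1-based leftmost mapping position
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: dict


def parse_sam_line(line):
    """Parse the 11 mandatory fields plus optional TAG:TYPE:VALUE fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    tags = {}
    for item in fields[11:]:
        tag, type_code, value = item.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    q, f, r, p, mq, c, rn, pn, tl, s, ql = fields[:11]
    return SamRecord(q, int(f), r, int(p), int(mq), c, rn, int(pn), int(tl), s, ql, tags)


def reference_span(cigar):
    """Reference bases covered: M, D, N, = and X consume the reference."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")


def mapq_to_error_probability(mapq):
    """Invert MAPQ = -10 * log10(P(wrong position))."""
    return 10 ** (-mapq / 10)


# Hypothetical record: a gapped alignment (one inserted base) of a paired read.
line = "read1\t99\tchr1\t100\t60\t5M1I4M\t=\t300\t210\tACGTAACGTA\tIIIIIIIIII\tNM:i:1"
rec = parse_sam_line(line)
end = rec.pos + reference_span(rec.cigar) - 1   # 1-based inclusive end
is_paired = bool(rec.flag & 0x1)
is_reverse = bool(rec.flag & 0x10)
print(rec.rname, rec.pos, end, is_paired, is_reverse)  # chr1 100 108 True False
print(mapq_to_error_probability(rec.mapq))
```

The `5M1I4M` CIGAR string is what makes this a gapped alignment: the read is 10 bases long but covers only 9 reference bases.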
4
Q
Manipulating Alignments in SAM/BAM Files
A
- SAM/BAM Manipulation
- Essential for preparing aligned reads for analysis
- Common Operations
- Conversion between SAM and BAM
- Sorting alignments
- Indexing BAM files
- Filtering alignments
- Marking and removing duplicates
- Adding or modifying read groups
- Merging and splitting BAM files
- Indel realignment and base quality score recalibration (BQSR)
- Tools
- SAMtools
- Picard Tools
- BEDTools
- Sambamba
- GATK
- Best Practices
- Start with high-quality alignments
- Sort and index BAM files
- Handle duplicates
- Maintain read groups
- Filter carefully
- Document steps and parameters
- Optimize computational resources
- Validate manipulated files
- Advanced Topics
- Graph-based references
- Handling multi-mapping reads
- Integration with workflow managers
- Conclusion
- Critical for genomic data analysis
- Mastery of tools and techniques essential
- Adhere to best practices for data integrity and reproducibility
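The filtering, sorting, and duplicate-marking operations on this card can be illustrated on in-memory records. This is only a conceptual sketch: in practice these steps are run on BAM files with the tools listed above, the `Alignment` tuple and function names are invented for the example, and the duplicate rule used here (same reference, position, and strand; keep the highest MAPQ) is a deliberate simplification of what real duplicate-marking tools do.

```python
from collections import namedtuple

# Hypothetical minimal record holding a few SAM fields.
Alignment = namedtuple("Alignment", "qname flag rname pos mapq")

# SAM flag bits used below.
UNMAPPED, REVERSE, SECONDARY, DUPLICATE = 0x4, 0x10, 0x100, 0x400


def filter_alignments(alignments, min_mapq=20, exclude_flags=UNMAPPED | SECONDARY):
    """Keep alignments with enough mapping quality and none of the excluded flags."""
    return [a for a in alignments
            if a.mapq >= min_mapq and not a.flag & exclude_flags]


def coordinate_sort(alignments, reference_order):
    """Sort by reference order (as in the header), then by position."""
    rank = {name: i for i, name in enumerate(reference_order)}
    return sorted(alignments, key=lambda a: (rank[a.rname], a.pos))


def mark_duplicates(alignments):
    """Simplified rule: same (rname, pos, strand) means duplicate; best MAPQ stays unflagged."""
    def key(a):
        return (a.rname, a.pos, bool(a.flag & REVERSE))

    best = {}
    for a in alignments:
        if key(a) not in best or a.mapq > best[key(a)].mapq:
            best[key(a)] = a
    return [a if best[key(a)] is a else a._replace(flag=a.flag | DUPLICATE)
            for a in alignments]


# Hypothetical toy alignments.
reads = [
    Alignment("r1", 0, "chr2", 50, 60),
    Alignment("r2", 0, "chr1", 200, 60),
    Alignment("r3", 0, "chr1", 200, 30),   # same position and strand as r2
    Alignment("r4", 4, "chr1", 0, 0),      # unmapped
    Alignment("r5", 16, "chr1", 120, 10),  # low mapping quality
    Alignment("r6", 16, "chr1", 200, 40),  # same position, reverse strand
]
kept = filter_alignments(reads)
ordered = coordinate_sort(kept, reference_order=["chr1", "chr2"])
marked = mark_duplicates(ordered)
print([(a.qname, a.flag) for a in marked])
# [('r2', 0), ('r3', 1024), ('r6', 16), ('r1', 0)]
```

Marking duplicates by setting a flag bit, rather than deleting records, keeps the step reversible, which fits the card's advice to filter carefully and document each step.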