RC Bioinformatics Flashcards
What is the main advantage of PacBio?
It produces long reads
What does ChIP-seq allow us to identify?
The binding sites of transcription factors
What can be seen when paired-end Illumina reads from a sample that carries a deletion are mapped to the reference?
The pairs spanning the deletion map further apart than expected.
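A minimal sketch of how this signal could be picked out, assuming each read pair has already been reduced to a hypothetical (leftmost start, rightmost end) tuple of reference coordinates. The expected insert size and tolerance are illustrative values, not ones taken from any real library.

```python
def flag_deletion_candidates(pairs, expected_insert, tolerance):
    """Return read pairs whose mapped span is larger than expected.

    pairs: list of (left_start, right_end) reference coordinates (hypothetical format).
    A pair is flagged when its span exceeds expected_insert + tolerance.
    """
    return [
        (left_start, right_end)
        for left_start, right_end in pairs
        if (right_end - left_start) > expected_insert + tolerance
    ]


# Two normal pairs (spans 300 and 305) and two pairs spanning a ~500 bp deletion.
pairs = [(100, 400), (150, 455), (200, 1010), (220, 1025)]
candidates = flag_deletion_candidates(pairs, expected_insert=300, tolerance=50)
print(candidates)
```

Real variant callers also use split reads and read depth; this only illustrates the insert-size signal described on the card.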
What is Illumina bridge amplification used for?
Cluster generation
Why will some RNA-seq reads be split when mapped to the reference genome?
They overlap an intron which has been spliced out of the mature transcript, but which is present in the genome sequence the reads are mapped to.
What type of alignment does BLAST perform?
Local alignment
How should the alignment of two sequences be described?
% identity
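As a sketch, % identity can be computed from two already-aligned sequences. This assumes the common convention of dividing identical columns by the total alignment length (gap columns included); other denominators are sometimes used, so treat the choice here as an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue.

    Both inputs must be aligned strings of equal length, with gaps written as `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 columns identical
```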
When might we use the word similarity to describe sequence alignment?
Similarity may be used to describe protein sequence alignment
Where does most mammalian cytosine methylation occur?
CpG sites
What can cause a problem for de novo assembly of genome sequences?
Repetitive sequences in the genome
What does a bubble in a de Bruijn graph indicate?
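A sequence variant (such as a SNP or heterozygous site) or a sequencing error, which creates two alternative paths that diverge and then rejoin. Repeats more typically produce loops or tangled branches.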
Why are adapters added to the genomic fragments in an Illumina library?
To allow PCR amplification of the library
How does TBLASTN allow us to BLAST a protein sequence against a nucleotide database?
It translates the database in all six reading frames
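A minimal sketch of six-frame translation for a single nucleotide sequence, assuming the standard genetic code and an input containing only A, C, G and T. The function names are illustrative and are not part of BLAST itself.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations for frames +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

The protein query can then be compared against each of the six translated frames, which is the idea the card describes.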
Why would we use index sequences in Illumina libraries?
To identify different samples on a flow cell
What can happen when performing multiple tests for differentially expressed genes in parallel?
We are more likely to obtain false positives, so we need to apply a multiple-testing correction, such as controlling the False Discovery Rate.
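One common way to control the False Discovery Rate is the Benjamini-Hochberg procedure. The card does not name a specific method, so its use here is an assumption, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03, 0.20]  # illustrative values, one per gene
adjusted = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in adjusted]
print([round(q, 3) for q in adjusted])
print(significant)
```

Three of the raw p-values fall below 0.05, but only one gene remains significant after adjustment, which is the effect the card warns about.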
What information is displayed in a pile-up plot?
The number of reads overlapping each base in the reference genome
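A small sketch of how the per-base depth behind such a plot could be computed, assuming reads are given as hypothetical 0-based, half-open (start, end) intervals that lie within the reference.

```python
def coverage(reads, reference_length):
    """Per-base read depth for reads given as 0-based half-open (start, end) intervals."""
    # Record +1 where a read begins and -1 just past where it ends.
    diff = [0] * (reference_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    # A running sum of the changes gives the depth at each base.
    depth = []
    running = 0
    for change in diff[:reference_length]:
        running += change
        depth.append(running)
    return depth


depth = coverage([(0, 4), (2, 6), (3, 8)], reference_length=10)
print(depth)
```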
What is caused by bisulphite treatment?
Unmethylated cytosines are converted to uracil (read as thymine after PCR), while methylated cytosines are left unchanged.
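The conversion can be illustrated with a toy simulation, assuming a hypothetical set of 0-based positions marking which cytosines are methylated. Real bisulphite conversion is not perfectly efficient; this sketch assumes it is.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Convert unmethylated C to U; cytosines at methylated positions stay as C."""
    methylated = set(methylated_positions)
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence)
    )


converted = bisulfite_convert("ACGTCCGA", methylated_positions={1})
after_pcr = converted.replace("U", "T")  # uracil is read as thymine after PCR
print(converted)
print(after_pcr)
```

Comparing the sequenced reads with the reference then shows which cytosines were protected by methylation.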
What sort of SNP is most likely to cause a genetic condition?
Non-synonymous SNPs change the amino acid sequence of the encoded protein, so are more likely to cause a genetic condition.
What was the main reason for the decrease in sequencing costs between 2007 and 2012?
The development of “next-generation” sequencing technologies
What is the name of the plot used to display genome-wide association data?
Manhattan plot
What is the benefit of RNA-seq over microarrays for studying gene expression?
RNA-seq does not require a reference genome, since the reads can be assembled de novo.