Genomic databases Flashcards
What are the two widely used genome databases?
ENSEMBL
UCSC
How is basic information about the gene given?
At the top of the page:
Position on the chromosome
Base position
Length of gene in base pairs
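As a rough illustration of how those three items relate, the sketch below splits a browser-style position string into chromosome, start, end and length. The function name `parse_position` and the example coordinates are made up for illustration, and the length formula assumes the displayed coordinates are 1-based and inclusive.

```python
def parse_position(position):
    """Split a 'chrom:start-end' string into (chromosome, start, end, length).

    Assumes 1-based, inclusive coordinates, so length = end - start + 1.
    """
    chrom, _, coords = position.partition(":")
    start_text, _, end_text = coords.partition("-")
    # Position strings often use thousands separators, e.g. "1,000".
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if end < start:
        raise ValueError("end must not be smaller than start")
    return chrom, start, end, end - start + 1


# Hypothetical coordinates, not a real gene.
print(parse_position("chr1:1,000-2,500"))
```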
How is the gene information given visually?
Idiogram of the chromosome
Thin lines with arrows - introns, which are spliced out
Bars - exons; fat bars are translated and thin bars are not
What do the different bar sizes in the chromosome idiogram mean?
Fat bars = translated into a protein
Thin bars = not translated into a protein
What are two ways in which the genome allelic information is given?
Predicted gene - all the information is piled together
RefSeq curated - reviewed and curated, much like a Wikipedia entry
What information does the expression data provide?
The level of expression of said gene in different tissues
What other information is provided by UCSC?
Acetylation
Conservation of each protein
What information can be obtained from genomic databases?
We can see how SNPs are roughly evenly spread throughout the genome
Clinically annotated CNVs and SNVs
Predesigned primers to amplify a length of DNA
Associations with disease
Molecular and biological processes
Why is looking at SNP distribution in genome browsers useful?
To identify linkage disequilibrium
How are the sequences we are interested in looking at uploaded onto UCSC?
FASTA format
Describe the FASTA format
Sequence lines of a set width
Header line beginning with ">"
Brief description of the gene
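A minimal sketch of that layout, assuming a header line that starts with ">" and carries a name plus a brief description, followed by sequence lines wrapped to a set width. The function names, the record name `demo_gene` and the width of 60 are illustrative choices, not requirements of any particular tool.

```python
def format_fasta(name, description, sequence, width=60):
    """Return one FASTA record: a '>' header line, then fixed-width sequence lines."""
    header = f">{name} {description}".rstrip()
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + lines) + "\n"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header (without '>') to its sequence."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


# Hypothetical record, wrapped at 8 bases per line to keep the output short.
record = format_fasta("demo_gene", "hypothetical example", "ACGT" * 5, width=8)
print(record)
```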
What is BLAT?
A tool for searching an unknown piece of DNA against the human genome
The query sequence is also entered in FASTA format
What do BLAT search results provide?
Score
Identity
Span
What is the score of BLAT search results?
Number of individual base matches between sequence and the reference
What is the identity of BLAT search results?
How well the query matches the genome over that region
Calculated by doing score/span
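The score/span calculation can be sketched as below. This follows the simplified definitions on these cards (score as a count of matching bases, span as the length of the matched genomic region); the real tool's reported numbers may also account for mismatches and gaps, and the function name `blat_identity` is hypothetical.

```python
def blat_identity(score, span):
    """Identity as defined on the cards: matching bases divided by the span."""
    if span <= 0:
        raise ValueError("span must be positive")
    return score / span


# Hypothetical hit: 95 matching bases over a 100-base span.
identity = blat_identity(95, 100)
print(f"{identity:.1%}")
```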