Week 4 Flashcards
Are homozygotes informative when it comes to linkage analysis?
No. Because both marker alleles are identical, we cannot tell which one is inherited together with the disease allele.
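A minimal sketch of the idea, assuming a genotype is written as a pair of marker allele labels (the function name and the labels are made up for illustration):

```python
def is_informative(marker_genotype):
    """Return True if the two marker alleles differ (heterozygous)."""
    allele_1, allele_2 = marker_genotype
    # Identical alleles cannot be told apart, so they cannot be
    # tracked alongside the disease allele through a pedigree.
    return allele_1 != allele_2

print(is_informative(("A1", "A2")))  # heterozygous marker
print(is_informative(("A1", "A1")))  # homozygous marker
```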
What is the smallest change, in Mb or bp, that cytogenetic tests can detect?
About 5 Mb (5,000,000 bp)
Define repetitive sequence
DNA fragments that are present in multiple copies in the genome.
Types of variable number tandem repeats
- Minisatellites (repeat unit is 7-49 bases)
- Microsatellites (repeat unit is 2-6 bases)
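The size ranges above can be turned into a small sketch that finds tandem repeats in a sequence and labels them by unit length. The function names, the `min_copies` threshold and the example sequence are assumptions made for illustration; real STR callers are far more careful.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each tandem repeat found in seq."""
    # Lazy unit match so the shortest repeating unit is tried first;
    # \1 then requires the same unit to follow at least min_copies - 1 times.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

def classify_repeat(unit_length):
    """Label a repeat by unit length, using the ranges from the card above."""
    if 2 <= unit_length <= 6:
        return "microsatellite"
    if 7 <= unit_length <= 49:
        return "minisatellite"
    return "other"

for start, unit, copies in find_tandem_repeats("TTGACACACACGG"):
    print(start, unit, copies, classify_repeat(len(unit)))
```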
Characterise Minisatellites
- Hypermutable (very unstable)
- Encourage crossing over
- 90% found in subtelomeric regions
Characterise Microsatellites (short tandem repeats)
- Found in coding and non-coding regions of the genome.
- Highly polymorphic and extremely useful
- STRs are used in forensics, paternity testing, ancestry testing and diagnostics
Size of copy number variants
The repeated segment may range from 50-1000 bases up to several megabases in size.
Can copy number variants cause disease?
- Most are benign
- But CNVs in developmental genes can cause nervous system disease (including Parkinson’s, autism and Alzheimer’s)
- Also very common in cancer cells
Where are most repetitive regions found in the genome?
Repetitive DNA elements are often associated with heterochromatin.
Heterochromatin dysfunction leads to ….
Heterochromatin dysfunction leads to genomic dysregulation by inducing aberrant repeat repair, chromosome segregation errors, transposon activation and replication stress.
Which variants are harder to find using current technology?
- Intermediate-size structural variants (<2000 bp)
- Inversions
- Regions with DNA composition that is GC- or AT-rich
What are recurrent CNVs?
- Similar size and recurrent breakpoints in segmental duplications.
- Enhances population diversity.
What are non-recurrent CNVs?
- Random breakpoints scattered across genomic regions
- Usually more severe phenotypic consequences
- Dependent on size and location
How to detect CNVs
- Karyotyping
- Fluorescence in situ hybridisation (FISH)
- MLPA (multiplex ligation-dependent probe amplification)
- Microarray
- Next generation sequencing
Disadvantages of detecting CNVs through Karyotyping and FISH
- Large CNVs only
- Not able to detect small intragenic rearrangements
- Time consuming
- Low throughput
Disadvantages of detecting CNVs through MLPA
- Limited number of loci
- Only known gene targets can be assessed = no discovery
- No breakpoint detection
Types of NGS methods to analyse CNVs
- Paired-end mapping
- Split read
- Read depth
- Assembly based
- Combination approach
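As an illustration of the read depth idea, here is a minimal sketch that compares per-window read counts in a sample against a control. The function name, the window counts and the `gain`/`loss` thresholds are assumptions chosen for the example, not values from any particular tool.

```python
from math import log2

def read_depth_calls(sample, control, gain=0.4, loss=-0.6):
    """Label each window 'gain', 'loss' or 'neutral' from read counts."""
    # Scale the sample so both runs have the same total coverage.
    scale = sum(control) / sum(sample)
    calls = []
    for s, c in zip(sample, control):
        # A pseudocount of 1 avoids taking the log of zero.
        ratio = log2((s * scale + 1) / (c + 1))
        if ratio >= gain:
            calls.append("gain")
        elif ratio <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical read counts for five genomic windows.
sample_counts = [100, 100, 150, 100, 50]
control_counts = [100, 100, 100, 100, 100]
print(read_depth_calls(sample_counts, control_counts))
```

More reads than expected in a window suggests a duplication, fewer suggests a deletion. Read depth alone does not give exact breakpoints, which is one reason a combination approach is listed.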
Databases to visualise/analyse CNVs
- Database of Genomic Variants (DGV)
- gnomAD-SV
- DECIPHER
The purpose of the Human Genome Project
An international research project to map each human gene and to completely sequence the entire human DNA complement.
Aims of the Human Genome Project (7)
- Determine the DNA sequence of the human genome
- Develop improved sequencing technologies
- Sequence model organisms
- Store information in a useful way
- Develop better tools for analysis
- Identify all genes and their function
- Consider ethical, legal and social implications
What two approaches did the Human Genome Project take?
- Segment assembly approach: aligning and merging fragments obtained from a longer DNA sequence to reconstruct the original sequence.
- Whole genome shotgun sequencing: sequencing many overlapping DNA fragments in parallel and using a computer to assemble the small fragments into larger contigs and eventually chromosomes.
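Both approaches rely on merging overlapping fragments. A toy sketch of that step, assuming error-free reads and a greedy "merge the best overlap first" strategy (the function names and fragments are made up; real assemblers use much more sophisticated methods):

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps, leave separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags

print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```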
Ethical and legal considerations of sequencing someone’s genome.
- Fairness and privacy: who should have access to your genetic information?
- Psychological effect: how does knowing your predisposition to disease affect you as an individual?
- Genetic testing and genetic screening: issues around the commercialisation of data.
- Reproductive implications: The use of genetic information in decision making.
What was the purpose of sequencing the genomes of smaller organisms in the Human Genome Project?
- Foster cooperation
- Smaller genomes serve as test cases for developing sequencing methodologies.
- Serve as comparative genomes
- Develop mathematical, statistical and computational tools.
What are primary sequence databases and secondary annotation databases?
- Primary sequence databases: databases that store genomic sequence data.
- Secondary annotation databases: either run algorithms that predict annotation and store the results, or simply store the annotation provided for the raw sequence.
Why would you need to access sequence data?
- Know what the sequence of a gene is
- Identify variants in the sequence
- Compare your sequence to others
- Identify similar sequences
- Find diseases associated with variation in your gene of interest.
What is PubMed?
Extensive biomedical literature database.
What is RefSeq?
Comprehensive, integrated, well-annotated set of reference sequences: genomic, transcript and protein.
What is OMIM
Online Mendelian Inheritance in Man: a database of human genes and genetic phenotypes.
What is ClinVar?
Database of genomic variation and the relationship to human health.
What is Ensembl
Resource for high quality integrated annotation data
What is UniProt?
Universal protein resource for protein sequence and functional annotation data
What is PDBe?
Protein Data Bank in Europe: a collection of 3D structural data.
What is InterPro?
Database of protein families, domains and conserved sites.
The aim of gnomAD
Enable researchers to better understand the role of genomic DNA variation in both health and disease states.
What is ExAC
The Exome Aggregation Consortium, which aggregates and harmonises exome sequencing data from a wide variety of large-scale sequencing projects.
How to interpret z-scores on gnomAD
- Positive z-score: gene has fewer variants observed than expected: highly constrained, intolerant to variation.
- Negative z-score: gene has more variants observed than expected: tolerant to variation.
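To see why the sign works this way, here is a simplified sketch that scores the gap between expected and observed variant counts. It is an illustration only: the formula, function names and counts are assumptions, and gnomAD's actual calculation is more involved.

```python
from math import sqrt

def constraint_z(observed, expected):
    """Simplified score: positive when fewer variants are seen than expected."""
    return (expected - observed) / sqrt(expected)

def interpret(z):
    """Translate the sign of the score into the flashcard wording."""
    if z > 0:
        return "fewer variants than expected: constrained, intolerant to variation"
    if z < 0:
        return "more variants than expected: tolerant to variation"
    return "as many variants as expected"

# Hypothetical counts for two genes.
for observed, expected in [(64, 100), (121, 100)]:
    z = constraint_z(observed, expected)
    print(round(z, 2), interpret(z))
```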