human genome organization Flashcards
gene rich regions
chr 19
gene poor regions
chr 13, 18, 21
stable regions
most of the genome
unstable regions
many are disease associated like
SMA: chr 5q13
DiGeorge Syndrome: chr 22q
12 diseases associated with unstable region on chr 1(1q21.1)
GC rich regions
38%
AT rich regions
54%
clustering of GC rich and AT rich regions is the basis for
chromosomal banding patterns
G banding
the G in G banding stands for
Giemsa staining
_____% of genome is translated (protein coding)
1.5 %
____% of genome is represented by genes (exons, introns, flanking sequences involved in regulating gene expression)
20-25
____% of the genome is single copy sequence
50
______% of the genome consists of classes of repetitive DNA
40-50
euchromatin
relaxed
expressing
heterochromatin
condensed
not expressing
genome sequencing is focused on _______ regions
euchromatic
_______ regions are essentially unsequenced
heterochromatic
gaps can still remain in _______ regions
euchromatic
Classes of repetitive DNA:
1. tandem repeats
2. dispersed repetitive elements
tandem repeats:
satellite DNA found in different parts of genome
hotspots for
dispersed repetitive elements
- Alu family (SINEs)
- L1 family (LINEs)
- Alus and L1s can be of significant medical relevance
- retrotransposition
- (NAHR)
NAHR
non-allelic homologous recombination
dispersed repetitive element
SINE
short interspersed repetitive element
300 bp
500,000 copies in genome
dispersed repetitive element
LINE
Long interspersed repetitive element
6 kb related members
100,000 copies in genome
Duplication rich genome architecture promotes _________
NAHR and disease
segmental dynamic mutation is
nonallelic homologous recombination (NAHR) between blocks of segmental duplication during mitosis leads to microdeletion and microduplication of the unique region bracketed by the duplications.
If the region contains dosage sensitive genes (ABC), disease results.
If not, the duplicated chromosome is predisposed to additional rounds of microdeletion and microduplication with increased probability.
Insertion-deletion polymorphisms (indels):
1. minisatellites
2. microsatellites (STRs)
minisatellites
tandemly repeated 10-100 bp blocks of DNA
VNTR (variable number of tandem repeats)
microsatellites (STRs)
di-, tri-, tetra-nucleotide repeats
5 x 10^4 per genome
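To make the idea of a di-, tri-, or tetra-nucleotide repeat concrete, here is a minimal Python sketch that scans a DNA string for short tandem repeats with a regular expression. The function name `find_strs`, the `min_copies` threshold, and the example sequence are all assumptions made for illustration; they are not part of the card set. As a simplification, a homopolymer run such as AAAAAAAA would also be reported, as a repeat of the unit AA.

```python
import re

def find_strs(seq, min_copies=4):
    """Return (start, unit, copies) for each 2-4 bp unit repeated in tandem."""
    # ([ACGT]{2,4}?) lazily captures the shortest candidate repeat unit;
    # \1{n,} then requires at least n further back-to-back copies of it.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# Hypothetical sequence: a dinucleotide (AC) and a trinucleotide (CAG) repeat.
example = "TT" + "AC" * 5 + "TT" + "CAG" * 5 + "TT"
print(find_strs(example))  # [(2, 'AC', 5), (14, 'CAG', 5)]
```

The number of copies at a given locus is what varies between individuals, which is why these repeats are useful as polymorphic markers.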
SNPs
frequency of 1 in 10^3 bp
PCR detectable markers
widely distributed
Copy number variations (CNVs)
variation in segments of genome from 200 bp-2 Mb
can range from one additional copy to many
array comparative genomic hybridization (array CGH)
Gene families are composed of
genes with high sequence similarity (e.g. >85%) that may carry out similar but distinct functions
some are clustered and some are dispersed
gene families arise through
gene duplication
major mechanism of evolutionary change
CNVs are
primary type of structural variation
CNV loci may cover 12% of the genome
implicated in increasingly large number of diseases
examples of insertions and deletions (indels)
repeats: STRs, VNTRs
estimated that _____% of genome is comprised of segmental duplications
5%
interhominoid cDNA array based comparative genomic hybridization
arrayCGH
compare human to human or normal to cancer. Can also compare human to non human
implications of highly dynamic genome
- no human genome is completely sequenced and assembled
- not all regions of the genome look/behave the same way
- rapidly changing, complex genomic regions
- missing heritability for many complex diseases
GWAS
genome wide association studies
these implicate loci that account for only a small % of expected genetic contribution
haploid human genome sequence is how long
3 x 10^9 bp
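As a rough worked example, this length can be combined with the percentages quoted on the earlier cards to get approximate absolute sizes. The sketch below is only back-of-the-envelope arithmetic; the dictionary labels and the helper name `to_megabases` are assumptions introduced here for illustration.

```python
GENOME_BP = 3 * 10**9  # haploid genome length from the card above

# Percentages as quoted on the cards; ranges are kept as (low, high).
PERCENT = {
    "protein coding": (1.5, 1.5),
    "genes (exons, introns, flanking sequences)": (20, 25),
    "single copy sequence": (50, 50),
    "repetitive DNA": (40, 50),
}

def to_megabases(percent, genome_bp=GENOME_BP):
    """Convert a percentage of the genome into megabases (Mb)."""
    return round(genome_bp * percent / 100 / 1e6, 1)

for label, (low, high) in PERCENT.items():
    print(label, to_megabases(low), "to", to_megabases(high), "Mb")
```

For instance, 1.5% of 3 x 10^9 bp works out to about 45 Mb of protein-coding sequence.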
The chromosomes are found in
23 pairs:
1) 22 autosomes (1-22)
2) 1 pair of sex chromosomes (XX or XY)
Each chromosome is believed to consist of a _______
single, continuous DNA double helix
Genotype + environment =
phenotype
“α-satellite”
repeats (171 bp repeat unit)
found near centromeric region of all human chromosomes; may be important to chromosome segregation in mitosis and meiosis.
how many new mutations per person?
30
what drives adaptation?
random genetic variations
random variation in the highly ordered structure of DNA, RNA and protein almost always results in...
deleterious consequences like genetic disease
clustering
nonrandom distribution of GC rich and AT rich regions is the basis for chromosomal banding
minisatellites
tandemly repeated 10-100 bp blocks of DNA
minisatellite examples
VNTR (variable number of tandem repeats)
microsatellites
di, tri, tetra nucleotide repeats
5 x 10^4 per genome
microsatellite examples
STRPs (short tandem repeat polymorphisms)
single nucleotide polymorphism examples
A to G
C to A
SNP frequency
1 in 10^3 bp
SNP is ___ detectable
PCR
CNVs are
copy number variants
variation in segments of genome from 200 bp-2 Mb
can range from one additional copy to many
CNVs found by
array comparative genomic hybridization (array CGH)
variants can be
silent or have a functional effect
current human genome
99% of euchromatic region of genome
241 gaps
tandem repeats
spread throughout genome
found on chr 1, 9, 16 and Y
α-satellite repeats near centromere
alu family
300 bp
500,000 copies in genome
LINE family
6 kb
100,000 copies in genome
dispersed elements may cause disease by
insertion into genes and aberrant recombination
how many genes?
25,000-30,000
types of genes
protein coding
RNA coding
pseudogenes
advantage of gene duplication
after a gene is duplicated, one copy is free to carry out the normal function while the other copy is free to change and lead to evolution
missing heritability
a large percentage of the genetic factors underlying disease are yet to be found
this is thought to be in unsequenced, dynamic portions of genome
how frequently SNP is likely to occur between two individuals
1 every 1000bp
99.9% identical and 3 million differences
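The figures on this card follow from simple arithmetic, sketched below in Python. The constant and function names are assumptions chosen for illustration; the inputs (a 3 x 10^9 bp haploid genome and one SNP per 10^3 bp) are the values quoted elsewhere in this set.

```python
GENOME_BP = 3 * 10**9    # haploid genome length
SNP_SPACING_BP = 10**3   # roughly one SNP every 1,000 bp

def expected_differences(genome_bp=GENOME_BP, spacing_bp=SNP_SPACING_BP):
    """Expected number of single-nucleotide differences between two genomes."""
    return genome_bp // spacing_bp

diffs = expected_differences()
identity = 1 - diffs / GENOME_BP

print(diffs)              # 3000000
print(f"{identity:.1%}")  # 99.9%
```

One difference per 1,000 bp therefore corresponds to about 3 million differences and 99.9% sequence identity.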
3 types of variation in genome
- insertion-deletion polymorphism (indel)
- SNP
- CNVs
indels are
1. minisatellites
2. microsatellites
retrotransposition may cause
insertional inactivation of genes
NAHR
repeats may facilitate aberrant recombination events between different copies of dispersed repeats, leading to disease