Genome Organization Flashcards

1
Q

How many base pairs are in the haploid human genome sequence?

A

3x10^9 bp

2
Q

How many chromosomes are in the human genome?

A

46 (in a diploid cell)

3
Q

Where are the chromosomes located?

A

Nucleus

4
Q

How many pairs of human chromosomes are there?
Autosomes_____
Sex Chromosomes_____

A

22 autosome pairs
1 sex chromosome pair

so
23 pairs in total

5
Q

Does each chromosome have more than one DNA strand?

A

No, each chromosome is believed to consist of a single continuous DNA double helix

6
Q

How do we generally number the chromosomes?

A

Chromosome numbering is generally based on size, with smaller chromosomes given higher numbers
e.g.
Chr1: 245 million bp
Chr22: 49 million bp

7
Q

In what sense is the human genome a record of human evolutionary history?

A

It reflects the results of different selection pressures…these pressures have shaped our genome

8
Q

In terms of evolution, which genes do we retain?

A

Adaptive ones :)…

Thus, many that were maladaptive were not retained

9
Q

A + B = phenotype

What are A and B?

A

Genotype (genome) + environment

10
Q

What is the fuel of genomic (and thus all) evolution?

A

Random variation

11
Q

In general, random variation in a highly ordered structure, such as the human genome, is almost always __________

A

Deleterious

12
Q

What is the price that we pay as a species to have a genome that can evolve, i.e. adapt to changing environments?

A

Genetic disease

Again, random variation in a highly ordered structure is almost always deleterious. Almost!

13
Q

Is the human genome static?

A

No! It is dynamic and continues to change and evolve

14
Q

Approximately how many new mutations occur in each individual?

A

30

15
Q

What properties of meiosis allow for genetic diversity?

A

Independent assortment and shuffling of regions during recombination

16
Q

The human genome is dynamic, constantly shuffling and changing. Is this true for both germ line and somatic cells?

A

Yes
Germ line cells shuffle DNA during recombination
Somatic cells also produce DNA changes, but these too can be deleterious (e.g. cancer is a disease of “genome instability”)

17
Q

What is cancer a disease of?

A

Genome instability

18
Q

Is there a “human genome”?

A

There is no “one” human genome; there are many (billions of different) human genomes

19
Q

How frequent are SNPs in the human genome?

A

Average of 1 SNP every 1000 bp between any two randomly chosen unrelated human genomes

20
Q

What percentage of the human genome is identical between any two individuals?

A

Around 99.9%

Leaving about 3,000,000 differences :)
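
A quick sanity check of that arithmetic (not part of the original deck): the sketch below combines the haploid genome size from card 1 with the 1 SNP per 1000 bp figure from card 19. The function names are made up for illustration.

```python
# Back-of-the-envelope check using the numbers from the cards above.
HAPLOID_GENOME_BP = 3 * 10**9   # haploid genome size (card 1)
SNP_SPACING_BP = 1000           # about 1 SNP per 1000 bp (card 19)

def expected_differences(genome_bp: int, snp_spacing_bp: int) -> int:
    """Expected number of single-base differences between two genomes."""
    return genome_bp // snp_spacing_bp

def percent_identical(genome_bp: int, differences: int) -> float:
    """Percentage of positions that match between the two genomes."""
    return 100 * (genome_bp - differences) / genome_bp

differences = expected_differences(HAPLOID_GENOME_BP, SNP_SPACING_BP)
print(differences)                                        # 3000000
print(percent_identical(HAPLOID_GENOME_BP, differences))  # 99.9
```

This counts single-base differences only; indels and copy number variation (covered in later cards) are ignored here.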

21
Q

Is the human genome organized in a random manner?

A

No…
there are gene rich regions
there are gene poor regions

22
Q

Which chromosome is a gene rich chromosome?

A

19

23
Q

What are the smallest chromosomes (in terms of gene content)?

A

13, 18, 21 (not counting Y)

24
Q

What special potential does the limited gene content of chromosomes 13, 18, and 21 confer?

A

Viable trisomies

25
Is the majority of the genome stable or unstable?
Stable, but there are unstable regions
26
What diseases are associated with unstable regions of the genome?
Many, e.g. spinal muscular atrophy (5q13) and DiGeorge syndrome (22q)
27
Which chromosomal region has a particularly large number of diseases associated with it?
1q21.1 (on chromosome 1)
28
Chromosome 1q21.1 is associated with how many diseases?
12 diseases are associated with this unstable region!
29
Are there regions that are particularly rich in certain base pairs?
Yes | GC rich regions | AT rich regions
30
GC rich regions comprise about what percent of the genome?
38%
31
AT rich regions comprise about what percent of the genome?
54%
32
Do we see clustering of GC and AT rich regions?
Yes! This is the basis for chromosomal banding patterns (cytogenetics, karyotype analysis)
33
What is the basis for chromosomal banding patterns?
Clusters of GC rich and AT rich regions stain differently, producing a unique banding pattern | G-banding (Giemsa staining)
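
To make "clustering of GC and AT rich regions" concrete, here is a minimal sketch (not from the lecture) that measures GC fraction in fixed-size windows along a sequence. The toy sequence and window size are invented for illustration; real banding also depends on chromatin structure and staining chemistry, which this ignores.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def windowed_gc(seq: str, window: int) -> list[float]:
    """GC fraction of consecutive, non-overlapping windows of the given size."""
    return [gc_fraction(seq[i:i + window]) for i in range(0, len(seq), window)]

# Toy sequence (hypothetical): an AT rich block followed by a GC rich block.
toy = "ATATTAAT" + "GCGGCCGC"
print(windowed_gc(toy, 8))  # [0.0, 1.0]
```

Runs of consecutive windows with similar values would correspond to the clustered GC rich or AT rich regions described in the card.
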
34
Do chromosomal size and gene content align?
Not really
35
2 Strategies for genomic sequencing
1. Construct a clone map, then sequence the clones and assemble | 2. Shotgun sequencing, then let the computer assemble | A combination works best
36
What part of the genome does the human genome sequence produced so far focus on?
Euchromatic regions
37
What are many of the remaining euchromatic gaps associated with?
Segmental duplications
38
Have we sequenced the condensed (heterochromatic) regions of the genome?
No, essentially unsequenced
39
What fraction of the human genome is protein coding (translated)?
1.5%
40
What percentage of the human genome is represented by genes (including exons, introns, flanking sequences involved in regulation, etc.)?
20-25%
41
What percentage of the human genome are "Single copy" sequences?
50%
42
What percentage of the human genome is made up of "repetitive DNA" = sequences that are repeated hundreds to millions of times?
40-50%
43
Have we fully sequenced the euchromatic portion of the genome?
No, there are still many sequence gaps (>200), many of which are associated with segmental duplications
44
What characterizes euchromatic regions?
more relaxed
45
What characterizes heterochromatic regions?
more condensed / repeat rich
46
2 broad Classes of repetitive DNA?
Tandem repeats | Dispersed repetitive elements
47
Tandem repeats are also known as
"satellite DNAs"
48
What protocol are tandem repeats used for?
Cytogenetic banding
49
What are C-bands?
Specific tandem repeats (a particular pentanucleotide sequence) found in specific heterochromatic regions on the long arms (q) of chromosomes 1, 9, 16, and Y | They are a hotspot for human-specific evolutionary changes
50
What is special about C-bands?
They are a hotspot for human-specific evolutionary changes (only found in human genome)
51
What are alpha satellite repeats?
Another example of a tandem repeat | 171 bp repeating unit
52
Where do we find alpha satellite repeats?
near centromeric regions
53
What might alpha satellite repeats be important in?
chromosome segregation in mitosis and meiosis - (remember they are close to centromeric region)
54
What are the main dispersed repetitive DNA elements? (2)
Alu family (SINEs) | L1 family (LINEs)
55
Length and Frequency of short interspersed repetitive elements (SINEs)
300bp; 500,000 copies in genome
56
Length and Frequency of Long Interspersed repetitive Elements (LINEs)
6kb; 100,000 copies in genome
57
Medical relevance of LINEs and SINEs
Retrotransposition (e.g. of Alus and L1s) may cause insertional inactivation of genes if the element is inserted back into a detrimental location
58
What is retrotransposition / retrotransposed genes?
A portion of the mRNA transcript of a gene is spontaneously reverse transcribed back into DNA and inserted into chromosomal DNA (no introns)
59
LINES and SINES are examples of what class of genomic structure?
retrotransposons
60
LINEs and SINEs can become pseudogenes. What are pseudogenes?
Sequences that look like a gene but no longer have an associated promoter
61
Repetitive DNA elements, like LINEs and SINEs, can facilitate aberrant recombination events. What is the significance of this?
Recombination between different copies of dispersed repeats is called non-allelic homologous recombination (NAHR), which results in allelic loss and gain on the participating chromosomes
62
Duplication rich genome architecture promotes NAHR and disease...in what way?
Leads to microdeletion and microduplication... if the region contains dose sensitive genes, disease may result.
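
The loss and gain produced by NAHR can be sketched with a toy model (not from the lecture). It assumes two identical homologs that misalign so the first repeat copy on one pairs with the last copy on the other, with a single crossover inside the paired repeat. The region layout and names are hypothetical.

```python
def nahr_products(chromosome: list[str], repeat: str) -> tuple[list[str], list[str]]:
    """Unequal crossover between the first and last copy of `repeat`.

    Returns (deletion_product, duplication_product).
    """
    first = chromosome.index(repeat)
    last = len(chromosome) - 1 - chromosome[::-1].index(repeat)
    if first == last:
        raise ValueError("need at least two copies of the repeat")
    # Homolog 1 up to its first repeat, joined to homolog 2 after its last repeat.
    deletion = chromosome[:first + 1] + chromosome[last + 1:]
    # Homolog 2 up to its last repeat, joined to homolog 1 after its first repeat.
    duplication = chromosome[:last + 1] + chromosome[first + 1:]
    return deletion, duplication

# Hypothetical region: two repeat copies (R) flanking a dosage-sensitive gene.
region = ["geneA", "R", "dosage_gene", "R", "geneC"]
deletion, duplication = nahr_products(region, "R")
print(deletion)     # ['geneA', 'R', 'geneC']
print(duplication)  # ['geneA', 'R', 'dosage_gene', 'R', 'dosage_gene', 'R', 'geneC']
```

One product has lost the gene between the repeats (microdeletion) and the other carries an extra copy (microduplication), which is why dosage-sensitive genes in such regions matter.
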
63
Types of human DNA variation (4 broad categories)
Insertion-deletion polymorphisms (Indels) | Single nucleotide polymorphisms (SNPs) | Copy number variation (CNV) | Chromosomal (large scale)
64
2 types of Indels
minisatellites | microsatellites (STRs)
65
Minisatellites
A type of Indel polymorphism | tandemly repeated 10-100 bp blocks of DNA | Variable number tandem repeats (VNTRs)
66
Microsatellites (STRs)
di, tri, tetra-nucleotide repeats | 5x10^4 per genome
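
As an illustration of what a microsatellite looks like in sequence, here is a small sketch (not from the lecture) that scans for di-, tri-, and tetranucleotide tandem runs with a regular expression. The minimum of 4 copies and the toy sequence are arbitrary assumptions.

```python
import re

# A unit of 2 to 4 bases followed by at least 3 more copies of itself.
# The minimum of 4 copies is an arbitrary threshold chosen for this sketch.
STR_PATTERN = re.compile(r"([ACGT]{2,4}?)\1{3,}")

def find_strs(seq: str) -> list[tuple[int, str, int]]:
    """Return (start, repeat_unit, copy_number) for each tandem run found."""
    hits = []
    for match in STR_PATTERN.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

toy = "TTG" + "CA" * 6 + "GGT" + "GATA" * 4 + "CC"
print(find_strs(toy))  # [(3, 'CA', 6), (18, 'GATA', 4)]
```

The copy number reported for each run is the quantity that varies between individuals at an STR locus.
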
67
How frequent are SNPs?
1 / 1000 bp
68
Copy number variation (CNV) size? | how many extra copies?
Variation in segments of the genome from 200 bp to 2 Mb | Can range from one additional copy to many
69
How do we analyze CNV in the genome?
Array comparative genomic hybridization (array CGH)
70
DNA variation consequence
can be silent (majority) or have functional defect
71
What are gene families?
Gene families are families of genes composed of genes with high sequence similarity (e.g. >85%) that may carry out similar but distinct functions
72
Are gene families clustered or dispersed?
some are clustered and some are dispersed
73
Where do gene families come from?
Gene families arise through duplication
74
Gene duplication is a major mechanism behind_____
evolutionary change
75
Rationale behind gene duplication and evolutionary change ->
when a gene duplicates it frees up one copy to vary while the other copy continues to carry out a critical function
76
duplications frequently co-localize with what?
disease?
77
In the broadest sense, what is genome "structural variation"?
all changes in the genome not due to single base-pair substitutions
78
What is the primary type of genomic structural variation?
copy number variations (CNVs)
79
Up to what percentage of the genome may CNV loci cover?
12%
80
CNVs are implicated in an increasingly large number of what?
diseases
81
Short tandem repeats and variable number tandem repeats are examples of which type of genomic variant?
Insertions/deletions (Indels)
82
In addition to CNVs and Indels, what other types of structural genomic variants do we see? (3)
Inversions | Duplications | Translocations
83
genomic variation is most commonly __________ and rarely ___________
detrimental | beneficial
84
What protocol do we use to analyze SNPs?
PCR detectable markers
85
What percent of the genome is comprised of segmental duplications?
around 5%
86
What defines a segmental duplication?
>10kb | >95% sequence similarity
87
Segmental duplications are often located adjacent to?
Human genome sequence gaps
88
Segmental duplications are responsible for much of what?
Much of the dynamic nature of the genome | They are clustered near some hotspots (e.g. the C-band on chromosome 1 is right next to a segmental duplication)
89
How do we study / measure genomic DNA copy number alteration?
cDNA microarray (array CGH) | Can compare e.g. hominid vs human | Fluorescence ratios are depicted on a pseudocolor scale, such that red indicates increased and green indicates decreased gene copy number compared to the reference
90
Does array CGH measure DNA or RNA?
DNA
91
Simplifying, what are the 3 steps for arrayCGH?
1. Label genomic reference DNA green and test DNA red | 2. Cohybridize to the microarray | 3. Signal color illustrates relative copy number
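
Array CGH results are commonly summarized as the log2 of the test/reference fluorescence ratio for each probe. The sketch below is not from the lecture; it shows how such a ratio could be turned into a gain or loss call. The 0.3 cutoff and the probe intensities are illustrative assumptions.

```python
import math

def log2_ratio(test_signal: float, reference_signal: float) -> float:
    """log2 of the test/reference fluorescence ratio for one array probe."""
    return math.log2(test_signal / reference_signal)

def call_copy_number(ratio: float, threshold: float = 0.3) -> str:
    """Classify a log2 ratio as gain, loss, or no change.

    The 0.3 cutoff is an illustrative assumption, not a value from the lecture.
    """
    if ratio > threshold:
        return "gain"
    if ratio < -threshold:
        return "loss"
    return "no change"

# Hypothetical probe intensities: (test, reference).
probes = {
    "probe_1": (1500.0, 1000.0),  # e.g. 3 copies vs 2
    "probe_2": (500.0, 1000.0),   # e.g. 1 copy vs 2
    "probe_3": (1000.0, 1000.0),  # equal copy number
}
for name, (test, ref) in probes.items():
    print(name, call_copy_number(log2_ratio(test, ref)))
# probe_1 gain
# probe_2 loss
# probe_3 no change
```

With the test sample labeled red, a "gain" probe would appear red and a "loss" probe green on the pseudocolor scale.
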
92
When we use arrayCGH to look at copy number variation, which chromosomes have significant human specific gene duplications?
1, 2, 5, 9
93
Regions of the genome where there are significant human specific gene duplications also happen to correspond to regions where... Why?
Regions where many of the gaps in the genome sequence are | Because their duplication rich nature makes them hard to sequence and assemble
94
What specific region of which chromosome did we focus on that was associated with 12 different human diseases?
1q21 | This region has copy number variations that have been found in 12 different human diseases | There is also a human specific inversion here | There is also a human specific C-band (constitutive heterochromatin) in this region
95
Which key sequence is highly duplicated in 1q21?
Duf1220 | protein coding domain
96
How many copies of Duf1220 are there in region 1q21?
over 200...wow
97
Amplification of Duf1220 in human evolution | _____________ specific copy number expansion, which progressively increased from __________ to __________ to ___________
Anthropoid specific expansion | monkey to ape to human
98
Which gene domain exhibits the greatest human-specific copy number expansion of any protein coding sequence?
Duf1220
99
What is the primary cause of Human increase in Duf1220?
Domain hyperamplification
100
What analysis of Duf1220 illustrates positive selection in primates
Ka/Ks analysis
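
For reference, Ka/Ks compares the rate of nonsynonymous substitutions (Ka) with the rate of synonymous substitutions (Ks); a ratio above 1 is conventionally read as a sign of positive selection. The sketch below is not from the lecture and uses invented rate values; real analyses estimate Ka and Ks from aligned coding sequences and test significance statistically.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks

def interpret(ratio: float) -> str:
    """Conventional reading of a Ka/Ks ratio."""
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral"

# Hypothetical rate estimates, for illustration only.
print(interpret(ka_ks_ratio(0.06, 0.03)))  # positive selection
print(interpret(ka_ks_ratio(0.01, 0.05)))  # purifying selection
```
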
101
Do we have more genes that have Duf1220?
NO! Humans don't really have more genes that have Duf1220, rather they have markedly increased expansion of Duf1220 domain sequence in similar numbers of genes (NBPF genes)
102
Which genes hold the Duf1220 domain?
NBPF | Neuroblastoma breakpoint family genes
103
What could account for why Duf1220 region of genome has been associated with so many diseases?
There are a lot of other, non Duf-encoding genes nearby | Remember NAHR: Duf genes can serve as recombination focal points, basically catalysts for disease
104
The Duf genome architecture serves as a facilitator for:
genome changes/variation, many of which can be disease relevant (e.g. macro and microcephaly)
105
Duf1220, Brain evolution, and Disease | Increased 1q21.1 instability led to _______(advantageous outcomes)
Increased Duf1220 copy number
106
Duf1220, Brain evolution, and disease | Increased Duf1220 copy number led to?
Evolutionary advantage (increased brain size?)
107
Increased 1q21.1 instability deleterious outcomes?
1q21.1 duplications --> macrocephaly / autism | 1q21.1 deletions --> microcephaly / schizophrenia
108
Implications of highly dynamic genome...with regards to genome assembly
No genome is completely sequenced and assembled | Some regions are either missed or too complex and duplication rich to assemble correctly with current methods
109
Do all regions of the genome look and behave similarly?
No! | We have rapidly changing and complex genomic regions
110
Rapidly changing complex genomic regions and disease?
Implicated in increasing number of genetic diseases
111
Rapidly changing complex genomic regions and sequencing?
unexamined by available sequencing and genotyping platforms... major current challenge for medical genetics
112
Highly dynamic genome and "missing heritability" implication?
GWAS implicate loci that account for only a small % of expected genetic contribution for many complex diseases
113
Genetic technology that could help with complex genome?
Long read sequencing | If a read starts in a repeat, it may be long enough that the repeat eventually ends and the read enters a single copy region, which anchors the repeat to a single copy part of the genome so you know where the repeat is located
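
The anchoring idea can be expressed as a simple coordinate check. This sketch is not from the lecture, and it is stricter than the card: it requires the read to cover the whole repeat plus some single copy sequence on both sides. The coordinates, read lengths, and `min_anchor` value are all hypothetical.

```python
def read_anchors_repeat(read_start: int, read_length: int,
                        repeat_start: int, repeat_end: int,
                        min_anchor: int = 50) -> bool:
    """True if the read covers the whole repeat plus unique sequence on both sides.

    Coordinates are 0-based, end-exclusive. min_anchor (bases of single copy
    flank required on each side) is an illustrative assumption.
    """
    read_end = read_start + read_length
    return (read_start <= repeat_start - min_anchor
            and read_end >= repeat_end + min_anchor)

# Hypothetical 6 kb repeat (about the length of a LINE) at positions 10,000-16,000.
print(read_anchors_repeat(9_900, 150, 10_000, 16_000))     # False: short read
print(read_anchors_repeat(9_000, 10_000, 10_000, 16_000))  # True: long read
```

A 150 bp read cannot reach unique sequence on the far side of a 6 kb repeat, while a 10 kb read can, which is the advantage the card describes.
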
114
GWAS are usually ______ based
SNP
115
GWAS what do they do?
They find an association between a certain part of the genome and a particular disease. However, when researchers go back and look at that region's contribution to the phenotype, the contribution is often very low (not significant enough to cause a severe phenotype)
116
Key takeaway from lecture
All regions of the genome are not created equal
117
CNV regions involved in rapid and recent evolutionary change often are enriched for human specific ______________
gene duplications
118
CNV regions involved in rapid and recent evolutionary change are often enriched for genome___________
sequence gaps
119
CNV regions involved in rapid and recent evolutionary change are often enriched for recurrent ___________
human diseases
120
So there is a link, regarding CNV regions, between ______________ and _________________ | examples?
evolutionarily adaptive copy number increases and increases in human diseases | 1q21.1, 9p13.3, 9q21.12, 5q13.3