Genome Organization Flashcards

1
Q

How many base pairs are in the haploid human genome sequence?

A

3x10^9 bp

2
Q

How many chromosomes are in the human genome?

A

46 (in a diploid cell)

3
Q

Where are the chromosomes located?

A

Nucleus

4
Q

How many pairs of human chromosomes are there?
Autosomes_____
Sex Chromosomes_____

A

22 autosome pairs
1 sex chromosome pair

so
23 pairs in total

5
Q

Does each chromosome have more than one DNA strand?

A

No, each chromosome is believed to consist of a single continuous DNA double helix

6
Q

How do we generally number the chromosomes?

A

Chromosome numbering is generally based on size, with smaller chromosomes given higher numbers, e.g.
Chr1: 245 million bp
Chr22: 49 million bp

7
Q

In what sense is the human genome a record of human evolutionary history?

A

It reflects the results of different selection pressures; these pressures have shaped our genome

8
Q

In terms of evolution, which genes do we retain?

A

Adaptive ones :)…

Thus, many that were maladaptive were not retained

9
Q

A + B = phenotype

What are A and B?

A

Genotype (genome) + environment

10
Q

What is the fuel of genomic (and thus all) evolution?

A

Random variation

11
Q

In general, random variation in a highly ordered structure, such as the human genome, is almost always __________

A

Deleterious

12
Q

What is the price that we pay as a species to have a genome that can evolve, i.e. adapt to changing environments?

A

Genetic disease

Again, random variation in a highly ordered structure is almost always deleterious. Almost!

13
Q

Is the human genome static?

A

No! It is dynamic and continues to change and evolve

14
Q

Approximately how many new mutations occur in each individual?

A

30

15
Q

What properties of meiosis allow for genetic diversity?

A

Independent assortment and shuffling of regions during recombination

16
Q

The human genome is dynamic, constantly shuffling and changing. Is this true for both germ line and somatic cells?

A

Yes
Germ line cells shuffle DNA during recombination
Somatic cells also produce DNA changes, but these too can be deleterious (e.g. cancer is a disease of “genome instability”)

17
Q

What is cancer a disease of?

A

Genome instability

18
Q

Is there a “human genome”?

A

There is no “one” human genome; there are many (billions of different) human genomes

19
Q

How frequent are SNPs in the human genome?

A

Average of 1 SNP every 1000 bp between any two randomly chosen unrelated human genomes

20
Q

What percentage of the human genome is identical between any two individuals?

A

Around 99.9%

Leaving about 3,000,000 differences :)
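
The arithmetic behind this card can be checked with a short sketch. It assumes the haploid genome size (3x10^9 bp) and the SNP rate (about 1 per 1000 bp) given on other cards in this deck; the variable names are illustrative only.

```python
# Rough check of the "99.9% identical" figure using numbers from the cards.
GENOME_BP = 3_000_000_000   # haploid genome size (3x10^9 bp)
BP_PER_SNP = 1000           # about 1 SNP per 1000 bp between two unrelated genomes

expected_differences = GENOME_BP // BP_PER_SNP
percent_identical = 100 * (1 - 1 / BP_PER_SNP)

print(expected_differences)          # 3000000
print(round(percent_identical, 1))   # 99.9
```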

21
Q

Is the human genome organized in a random manner?

A

No…
there are gene rich regions
there are gene poor regions

22
Q

Which chromosome is a gene rich chromosome?

A

19

23
Q

What are the smallest chromosomes (in terms of gene content)?

A

13, 18, 21 (not counting Y)

24
Q

What special potential does having limited gene content on chromosomes 13, 18, and 21 confer?

A

Viable trisomies

25
Q

Is the majority of the genome stable or unstable?

A

Stable, but there are unstable regions

26
Q

What diseases are associated with unstable regions of the genome?

A

Many
e.g.
Spinal muscular atrophy (5q13)
DiGeorge syndrome (22q)

27
Q

Which chromosome has a particularly large number of diseases associated with unstable regions on it?

A

Chromosome 1q21.1

28
Q

Chromosome 1q21.1 is associated with how many diseases?

A

12 diseases are associated with this unstable region!

29
Q

Are there regions that are particularly rich in certain base pairs?

A

Yes
GC rich regions
AT rich regions

30
Q

GC rich regions comprise about what percent of the genome?

A

38

31
Q

AT rich regions comprise about what percent of the genome?

A

54

32
Q

Do we see clustering of GC and AT rich regions?

A

Yes! This is the basis for chromosomal banding patterns (cytogenetics, karyotype analysis)

33
Q

What is the basis for chromosomal banding patterns?

A

Clusters of GC-rich and AT-rich regions stain differently, producing unique banding patterns
e.g. G-banding (Giemsa staining)

34
Q

Do chromosomal size and gene content align?

A

Not really

35
Q

2 Strategies for genomic sequencing

A
  1. Construct a clone map, then sequence the clones and assemble
  2. Shotgun sequence and let the computer assemble
    A combination of the two works best
36
Q

Which part of the genome does the human genome sequence produced so far focus on?

A

Euchromatic regions

37
Q

What are many of the remaining euchromatic gaps associated with?

A

Segmental duplications

38
Q

Have we sequenced the condensed (heterochromatic) regions of the genome?

A

No, essentially unsequenced

39
Q

What percentage of the human genome is protein coding (translated)?

A

1.5%

40
Q

What percentage of the human genome is represented by genes (including exons, introns, flanking sequences involved in regulation, etc.)?

A

20-25%

41
Q

What percentage of the human genome is made up of “single copy” sequences?

A

50%

42
Q

What percentage of the human genome is made up of “repetitive DNA” = sequences that are repeated hundreds to millions of times?

A

40-50%

43
Q

Have we fully sequenced the euchromatic portion of the genome?

A

No, there are still many sequence gaps (>200) that remain, many of which are associated with segmental duplications

44
Q

What characterizes euchromatic regions?

A

more relaxed

45
Q

What characterizes heterochromatic regions?

A

more condensed / repeat rich

46
Q

2 broad Classes of repetitive DNA?

A

Tandem repeats

Dispersed repetitive elements

47
Q

Tandem repeats are also known as

A

“satellite DNAs”

48
Q

What protocol are tandem repeats used for?

A

Cytogenetic banding

49
Q

What are C-bands?

A

Specific tandem repeats (a particular pentanucleotide sequence) found in specific heterochromatic regions on the long arm (q) of chromosomes 1, 9, 16, and Y. They are a hotspot for human-specific evolutionary changes

50
Q

What is special about C-bands?

A

They are a hotspot for human-specific evolutionary changes (only found in human genome)

51
Q

What are alpha satellite repeats?

A

Another example of tandem repeat

171 bp repeating unit

52
Q

Where do we find alpha satellite repeats?

A

near centromeric regions

53
Q

What might alpha satellite repeats be important in?

A

chromosome segregation in mitosis and meiosis - (remember they are close to centromeric region)

54
Q

What are the main dispersed repetitive DNA elements? (2)

A
Alu family (SINEs)
L1 family (LINEs)
55
Q

Length and Frequency of short interspersed repetitive elements (SINEs)

A

300bp; 500,000 copies in genome

56
Q

Length and Frequency of Long Interspersed repetitive Elements (LINEs)

A

6kb; 100,000 copies in genome
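
The numbers on the SINE and LINE cards can be turned into a rough share of the genome. This is only a sketch: it assumes every copy is full length (real copies can be shorter) and reuses the 3x10^9 bp haploid genome size from the first card; `genome_fraction` is a hypothetical helper name.

```python
GENOME_BP = 3_000_000_000  # haploid genome size from the first card

def genome_fraction(unit_bp, copies, genome_bp=GENOME_BP):
    """Fraction of the genome covered if every copy were full length."""
    return unit_bp * copies / genome_bp

sine_fraction = genome_fraction(300, 500_000)     # Alu family (SINEs)
line_fraction = genome_fraction(6_000, 100_000)   # L1 family (LINEs)

print(f"SINEs: {sine_fraction:.0%}")  # SINEs: 5%
print(f"LINEs: {line_fraction:.0%}")  # LINEs: 20%
```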

57
Q

Medical relevance of LINEs and SINEs

A

Retrotransposition (e.g. of Alus and L1s) may cause insertional inactivation of genes if the element is inserted back into a detrimental location

58
Q

What is retrotransposition / retrotransposed genes?

A

A portion of the mRNA transcript of a gene is spontaneously reverse transcribed back into DNA and inserted into chromosomal DNA (no introns)

59
Q

LINEs and SINEs are examples of what class of genomic structure?

A

retrotransposons

60
Q

LINEs and SINEs can become pseudogenes. What are pseudogenes?

A

Sequences that look like a gene but no longer have an associated promoter

61
Q

Repetitive DNA elements, like LINEs and SINEs, can facilitate aberrant recombination events. What is the significance of this?

A

Recombination events between different copies of dispersed repeats lead to non-allelic homologous recombination (NAHR), which results in allelic loss and gain on the participating chromosomes

62
Q

Duplication rich genome architecture promotes NAHR and disease…in what way?

A

Leads to microdeletion and microduplication. If the region contains dosage-sensitive genes, disease may result.
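
A toy model can show why a crossover between two different copies of a repeat gives one deletion product and one duplication product. This is a minimal sketch, not a biological simulation: chromosomes are lists of segment labels, and `nahr` and the segment names are hypothetical.

```python
def nahr(chrom_a, chrom_b, repeat_index_a, repeat_index_b):
    """Cross over between a repeat copy on chrom_a and a different copy on chrom_b.

    Each recombinant product is the left part of one chromosome joined
    to the right part of the other, starting at the chosen repeat copy.
    """
    product_1 = chrom_a[:repeat_index_a] + chrom_b[repeat_index_b:]
    product_2 = chrom_b[:repeat_index_b] + chrom_a[repeat_index_a:]
    return product_1, product_2

# Two homologs with the same layout: a gene flanked by two copies of repeat "R".
homolog = ["left", "R", "gene", "R", "right"]

# Misalignment: the first R (index 1) on one homolog pairs with the second R (index 3) on the other.
deletion, duplication = nahr(homolog, homolog, 1, 3)
print(deletion)     # ['left', 'R', 'right']
print(duplication)  # ['left', 'R', 'gene', 'R', 'gene', 'R', 'right']
```

One product has lost the gene and the other carries two copies, which is the loss and gain described on the NAHR cards.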

63
Q

Types of human DNA variation (4 broad categories)

A

Insertion deletion polymorphisms (Indels)
Single nucleotide polymorphisms (SNPs)
Copy number variation (CNV)
Chromosomal (large scale)

64
Q

2 types of Indels

A

minisatellites

microsatellites (STRs)

65
Q

Minisatellites

A

A type of Indel polymorphism

  • tandemly repeated 10-100 bp blocks of DNA
  • Variable number tandem repeats (VNTRs)
66
Q

Microsatellites (STRs)

A

di, tri, tetra-nucleotide repeats

5x10^4 per genome

67
Q

HOW FREQUENT ARE SNPs?

A

1 / 1000 bp

68
Q

Copy number variation (CNV) size?

how many extra copies?

A

Variation in segments of the genome from 200 bp to 2 Mb

Can range from one additional copy to many

69
Q

How do we analyze CNVs in the genome?

A

Array comparative genomic hybridization (array CGH)

70
Q

DNA variation consequence

A

can be silent (majority)
or
cause a functional defect

71
Q

What are gene families?

A

Gene families are groups of genes with high sequence similarity (e.g. >85%) that may carry out similar but distinct functions

72
Q

Are gene families clustered or dispersed?

A

some are clustered and some are dispersed

73
Q

Where do gene families come from?

A

Gene families arise through duplication

74
Q

Gene duplication is a major mechanism behind_____

A

evolutionary change

75
Q

Rationale behind gene duplication and evolutionary change ->

A

when a gene duplicates it frees up one copy to vary while the other copy continues to carry out a critical function

76
Q

duplications frequently co-localize with what?

A

disease?

77
Q

In the broadest sense, what is genome “structural variation”?

A

all changes in the genome not due to single base-pair substitutions

78
Q

What is the primary type of genomic structural variation?

A

copy number variations (CNVs)

79
Q

Up to what percentage of the genome may CNV loci cover?

A

12%

80
Q

CNVs are implicated in an increasingly large number of what?

A

diseases

81
Q

Short tandem repeats and
variable number tandem repeats
are examples of which type of genomic variant?

A

Insertions/deletions (Indels)

82
Q

In addition to CNVs and Indels, what other types of structural genomic variants do we see? (3)

A

Inversions
Duplications
Translocations

83
Q

genomic variation is most commonly __________ and rarely ___________

A

detrimental

beneficial

84
Q

What protocol do we use to analyze SNPs?

A

PCR detectable markers

85
Q

What percent of the genome is comprised of segmental duplications?

A

Around 5%

86
Q

What defines a segmental duplication?

A

> 10kb

>95% sequence similarity

87
Q

Segmental duplications are often located adjacent to?

A

Human genome sequence gaps

88
Q

Segmental duplications are responsible for much of what?

A

Much of the dynamic nature of the genome. They are clustered near some hotspots (e.g. the C-band on chromosome 1 is right next to a segmental duplication)

89
Q

How do we study / measure genomic DNA copy number alteration?

A

cDNA microarray (arrayCGH)

Can compare, e.g., hominid vs. human

Fluorescence ratios are depicted in a pseudocolor scale, such that red indicates increased and green indicates decreased gene copy number compared to the reference

90
Q

Does array CGH measure DNA or RNA?

A

DNA

91
Q

Simplifying, what are the 3 steps for arrayCGH?

A

Label the genomic reference DNA green and the test DNA red

Cohybridize to the microarray

Signal color illustrates relative DNA copy number
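
The readout step can be sketched as a log2 ratio of test (red) to reference (green) signal at each array spot. This is an illustrative sketch only: the function names, the intensity values, and the 0.3 cutoff are assumptions, not part of the lecture or of any specific analysis package.

```python
import math

def log2_ratio(test_red, reference_green):
    """log2 of test (red) over reference (green) fluorescence at one array spot."""
    return math.log2(test_red / reference_green)

def call_copy_number(ratio, threshold=0.3):
    """Classify a spot; the 0.3 cutoff is an arbitrary illustrative value."""
    if ratio > threshold:
        return "gain (red)"
    if ratio < -threshold:
        return "loss (green)"
    return "no change"

print(call_copy_number(log2_ratio(3000, 2000)))  # gain (red)
print(call_copy_number(log2_ratio(1000, 2000)))  # loss (green)
print(call_copy_number(log2_ratio(2000, 2000)))  # no change
```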

92
Q

When we use arrayCGH to look at copy number variation, which chromosomes have significant human specific gene duplications?

A

1, 2, 5, 9

93
Q

Regions of the genome where there are significant human specific gene duplications also happen to correspond to regions where…
Why?

A

Regions where many of the gaps in the genome sequence are

Because their duplication-rich nature makes them hard to sequence and assemble

94
Q

What specific region of which chromosome did we focus on that was associated with 12 different human diseases?

A

1q21
This region has copy number variations that have been found in 12 different human diseases
There’s also a human specific inversion here
There is also a human specific C-band (constitutive heterochromatin) in this region

95
Q

Which key sequence is highly duplicated in 1q21?

A

Duf1220

protein coding domain

96
Q

How many copies of Duf1220 are there in region 1q21?

A

over 200…wow

97
Q

Amplification of Duf1220 in Human Evolution…

_____________ specific copy number expansion, which progressively increased from __________ to __________ to ___________

A

Anthropoid specific expansion

monkey to ape to human

98
Q

Which gene domain exhibits the greatest human-specific copy number expansion of any protein coding sequence?

A

Duf1220

99
Q

What is the primary cause of Human increase in Duf1220?

A

Domain hyperamplification

100
Q

What analysis of Duf1220 illustrates positive selection in primates

A

Ka/Ks analysis
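
As background on how such a ratio is usually read: Ka is the rate of nonsynonymous substitutions and Ks the rate of synonymous ones, and a ratio above 1 is commonly taken as a sign of positive selection. The sketch below only interprets the ratio; the function name and the input values are hypothetical and are not Duf1220 data.

```python
def interpret_ka_ks(ka, ks):
    """Classify selection from nonsynonymous (Ka) and synonymous (Ks) rates."""
    ratio = ka / ks
    if ratio > 1:
        return ratio, "positive selection"
    if ratio < 1:
        return ratio, "purifying selection"
    return ratio, "neutral"

# Made-up rates chosen only to exercise each branch.
print(interpret_ka_ks(0.5, 0.25))   # (2.0, 'positive selection')
print(interpret_ka_ks(0.25, 0.5))   # (0.5, 'purifying selection')
```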

101
Q

Do we have more genes that have Duf1220?

A

NO!
Humans don’t really have more genes that have Duf1220, rather they have markedly increased expansion of Duf1220 domain sequence in similar numbers of genes (NBPF genes)

102
Q

Which genes hold the Duf1220 domain?

A

NBPF

Neuroblastoma breakpoint family genes

103
Q

What could account for why Duf1220 region of genome has been associated with so many diseases?

A

There are a lot of other genes that are non Duf-encoding nearby
Remember NAHR
Well, Duf genes can serve as recombination focal points - catalysts for disease basically

104
Q

The Duf genome architecture serves as a facilitator for:

A

genome changes/variation, many of which can be disease relevant (e.g. macrocephaly and microcephaly)

105
Q

Duf1220, Brain evolution, and disease

Increased 1q21.1 instability led to _______(advantageous outcomes)

A

Increased Duf1220 copy number

106
Q

Duf1220, Brain evolution, and disease

Increased Duf1220 copy number led to?

A

Evolutionary advantage (increased brain size?)

107
Q

Increased 1q21.1 instability deleterious outcomes?

A

1q21.1 duplications -> macrocephaly / autism

1q21.1 deletions -> microcephaly / schizophrenia

108
Q

Implications of highly dynamic genome…with regards to genome assembly

A

No genome is completely sequenced and assembled

- some regions are either missed or too complex and duplication rich to assemble correctly with current methods

109
Q

Do all regions of the genome look and behave similarly?

A

No!

We have rapidly changing and complex genomic regions

110
Q

Rapidly changing complex genomic regions and disease?

A

Implicated in increasing number of genetic diseases

111
Q

Rapidly changing complex genomic regions and sequencing?

A

unexamined by available sequencing and genotyping platforms… major current challenge for medical genetics

112
Q

Highly dynamic genome and “missing heritability” implication?

A

GWAS implicate loci that account for only a small % of expected genetic contribution for many complex diseases

113
Q

Genetic technology that could help with complex genome?

A

Long read sequencing. If a read starts in a repeat, it may be long enough that the repeat eventually ends and the read enters a single copy region. That single copy sequence anchors the repeat to one part of the genome, so you know where the repeat is located.
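
The anchoring idea can be illustrated with a toy exact-match search. This is a sketch under simplifying assumptions (a made-up 35 bp "genome", error-free reads, exact string matching); `find_all` is a hypothetical helper, not a real aligner.

```python
def find_all(genome, read):
    """Return every start position where read matches genome exactly."""
    positions, start = [], genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions

REPEAT = "ACACACACAC"
# Toy genome: the same repeat sits between different single-copy sequences.
genome = "GGTCT" + REPEAT + "TTGAG" + REPEAT + "CGATG"

short_read = "CACACA"                 # lies entirely inside the repeat
long_read = "TCT" + REPEAT + "TTG"    # spans the repeat into unique flanks

print(len(find_all(genome, short_read)))  # 4 (ambiguous placements)
print(find_all(genome, long_read))        # [2] (a single anchored placement)
```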

114
Q

GWAS are usually ______ based

A

SNP

115
Q

What do GWAS do?

A

They find an association between a certain part of the genome and a particular disease. However, when researchers go back and look at that region's contribution to the phenotype, the contribution is often very low (not significant enough to cause a severe phenotype)

116
Q

Key takeaway from lecture

A

All regions of the genome are not created equal

117
Q

CNV regions involved in rapid and recent evolutionary change often are enriched for human specific ______________

A

gene duplications

118
Q

CNV regions involved in rapid and recent evolutionary change are often enriched for genome___________

A

sequence gaps

119
Q

CNV regions involved in rapid and recent evolutionary change are often enriched for recurrent ___________

A

human diseases

120
Q

So there is a link regarding CNV regions, between ______________ and _________________
examples

A

evolutionary adaptive copy number increases
and
increase in human diseases

1q21.1
9p13.3
9q21.12
5q13.3