Lecture 5: The Human Genome Flashcards

1
Q

What is the human genome made up of and how many bp are in the haploid genome?

A

22 pairs of autosomes and 1 pair of sex chromosomes. The haploid genome is about 3.2 billion bp.

2
Q

What was the human genome project?

A

A public project in which the International Human Genome Sequencing Consortium aimed to obtain the entire DNA sequence of the haploid human genome within 15 years. Launched in 1990.

3
Q

How does hierarchical shotgun sequencing work?

A
  1. A library of genome segments is created using bacterial artificial chromosomes (BAC library).
  2. All the BACs are screened for markers and classified by their location on the chromosome.
  3. A set of minimally overlapping BACs is selected for sequencing.
  4. Individual BACs are divided into smaller fragments and sequenced using Sanger sequencing.
  5. The sequenced fragments are assembled based on their overlapping segments. Overlaps exist because DNA from many bacteria is used, so many copies of each BAC are fragmented at different positions.
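Step 5 can be illustrated with a toy greedy overlap assembler. This is a minimal sketch, not the software used by the public project: the function names `overlap` and `greedy_assemble`, the `min_len` threshold and the example reads are all assumptions made for illustration. It ignores sequencing errors, repeats, reverse-complement reads and reads fully contained in other reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest match is shorter than `min_len` (hypothetical cutoff).
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    the remaining sequences (the contigs).
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return reads
        # Join the two reads, writing the shared bases only once.
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


# Made-up fragments from one BAC, given out of order.
fragments = ["TTGACCA", "ATGGCGTAC", "GTACCTTGA"]
contigs = greedy_assemble(fragments)
print(contigs)
```

With these three made-up fragments the reads chain into a single contig. Fragments that share no overlap of at least `min_len` bases are returned as separate contigs.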
4
Q

What had hierarchical shotgun sequencing achieved after 10 years?

A

It had only sequenced the smallest chromosome.

5
Q

What was Celera? Who started it and what were their goals?

A

Celera's human genome sequencing effort was started by Craig Venter, who thought shotgun sequencing of the whole genome would be faster. He had already used shotgun sequencing to sequence the first bacterial genome. Celera was founded in 1998 with private funds. The goal was to sequence the entire genome in 3 years.

6
Q

What is the random shotgun strategy?

A

The whole genome is shredded into smaller fragments of a few kilobases. Each fragment is sequenced at both ends to create read pairs. Based on their overlaps, the reads are assembled into contigs, which are used to build scaffolds. Read-pair mates can be used to determine the size of any gaps between contigs.
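The last point, sizing a gap from read-pair mates, comes down to simple arithmetic when the fragment (insert) length is known. The sketch below is illustrative only: the function names, the coordinate convention and all the numbers are assumptions, not values from the lecture. It assumes one mate lies on the contig to the left of the gap, the other on the contig to the right, and that the fragment runs from the start of the left mate to the end of the right mate.

```python
from statistics import mean


def gap_from_pair(insert_size, left_contig_len, left_mate_start, right_mate_end):
    """Estimate the gap between two contigs from one read pair.

    insert_size      -- known length of the sequenced fragment (bp)
    left_contig_len  -- length of the contig left of the gap (bp)
    left_mate_start  -- 0-based start of the left mate on the left contig
    right_mate_end   -- 0-based exclusive end of the right mate on the right contig
    """
    inside_left = left_contig_len - left_mate_start  # fragment bases on the left contig
    inside_right = right_mate_end                    # fragment bases on the right contig
    return insert_size - inside_left - inside_right


def estimate_gap(insert_size, left_contig_len, pairs):
    """Average the per-pair estimates over all read pairs spanning the gap."""
    return mean([gap_from_pair(insert_size, left_contig_len, s, e) for s, e in pairs])


# Hypothetical example: 3 kb fragments, a 10 kb left contig, two spanning pairs.
pairs = [(9000, 500), (9400, 1200)]
print(estimate_gap(3000, 10_000, pairs))
```

Averaging over several spanning pairs smooths out variation in fragment length. A negative estimate would suggest that the two contigs actually overlap.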

7
Q

How many genomes were used in Celera?

A

The genomes of 5 individuals, plus BAC libraries.

8
Q

Why did both methods need each other?

A

The Celera method was cheaper and faster, so the public effort used it to finish sequencing. In turn, the Celera method relied on the physical map from the public effort.

9
Q

How are the gaps filled in?

A

PCR is used to amplify the unknown segments, which are then sequenced. For gaps greater than 20 kb, the BAC libraries are screened to identify segments containing the edge of the gap, and these are shotgun sequenced.

10
Q

Whose genome was used for the public effort?

A

10-20 people from different racial and ethnic backgrounds.

11
Q

Why were there still gaps in the sequence in 2001?

A

Because about 2% of the genome is heterochromatin that is hard to clone.

12
Q

What main points did the human genome sequence reveal?

A

It was thought there would be 50,000-100,000 genes, but there are only about 20,000-22,000. The higher estimate had seemed to make sense because the complexity of eukaryotes was expected to require more genes. There is large variation in gene size: most genes sequenced were below 10 kb, but the first intron tends to be very large because it contains many regulatory sequences.

13
Q

Where have many genes derived from?

A

Horizontal transfer from bacteria or from transposons.

14
Q

Describe how genes and chromosomes are organised in our genome.

A

Genes are not evenly distributed in the genome. More highly expressed genes tend to be in areas of high GC content. Some chromosomes are positioned in the nucleus so that they have access to particular TFs, which makes them more likely to be transcribed. Transcription complexes are located in the centre of the nucleus, where most of the open chromatin with high GC content is found.

15
Q

What is the genome made up of and in what proportions?

A

Exons = 1.5%
Introns = 25%
The rest is composed of repetitive sequences. Transposons make up a large proportion of introns and a small proportion of exons.
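To put these percentages in base pairs, they can be applied to the 3.2 billion bp haploid genome from card 1. This is only a rough sketch: the names `HAPLOID_BP`, `SHARES_PER_MILLE` and `breakdown` are made up for illustration, and the "rest" category simply follows this card's framing of everything that is neither exon nor intron.

```python
HAPLOID_BP = 3_200_000_000  # haploid genome size quoted in card 1

# Shares in tenths of a percent, so the arithmetic stays in whole numbers.
SHARES_PER_MILLE = {"exons": 15, "introns": 250}  # 1.5% and 25%


def breakdown(total_bp=HAPLOID_BP, shares=SHARES_PER_MILLE):
    """Return the number of base pairs in each category, plus the remainder."""
    sizes = {name: total_bp * share // 1000 for name, share in shares.items()}
    sizes["rest"] = total_bp - sum(sizes.values())
    return sizes


sizes = breakdown()
for name, bp in sizes.items():
    print(f"{name:8s} {bp / 1e6:8.0f} Mb")
```

Under these figures, exons come to about 48 Mb, introns to about 800 Mb, and the remaining 73.5% to roughly 2.35 Gb.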

16
Q

What are introns?

A

They can code for functional RNA molecules. They are removed from RNA before translation. They contain elements that regulate gene transcription. Some introns contain other genes nested within them. Exons are dispersed between them.

17
Q

What are the two origins of repetitive elements?

A

Tandem repeats and interspersed repeats

18
Q

Where are tandem repeats found and what are the three types?

A

Found in subtelomeres and pericentromeres. The three types are satellites, which are large and found in centromeric heterochromatin; minisatellites, which are medium-sized and found in telomeres; and microsatellites, which are small and dispersed.

19
Q

What are the two types of transposable element?

A

DNA transposons and retrotransposons.

20
Q

What are DNA transposons?

A

Elements with inverted terminal repeats and a single open reading frame that encodes a transposase.

21
Q

What are the two types of retrotransposon?

A

Autonomous and non-autonomous

22
Q

What are the two types of autonomous retrotransposons?

A

LTRs and non-LTRs

23
Q

What do non-autonomous transposons do?

A

Hijack the equipment used by the autonomous ones.

24
Q

What % of the entire genome do retrotransposons occupy?

A

about 10%

25
Q

Why are there so many transposons?

A

Junk: selection isn’t strong enough to get rid of them. Functional: they have been found to increase expression during cell stress. Chance functionality: some end up being incorporated into regulatory regions of genes (5% of exons are thought to have a transposon origin in humans).

26
Q

What are the different classes of RNA and why do we have so many?

A

Ribosomal, transfer, antisense and telomerase RNA. This diversity could reflect the start of life, which was an RNA world.

27
Q

How many long, non-coding RNAs are there and what are the majority of these?

A

Around 18,000. The majority are antisense RNAs or lincRNAs.