Lecture 1: Organisation of the Human Genome Flashcards

1
Q

DNA - What does it do?

A

Hereditary/Genetic Information carried by DNA

2
Q

Shape of DNA - describe it
- calculations, who

A

  1. Double helical structure

  2. Described by Watson and Crick (and Rosalind Franklin) in 1953

  3. 10 bp/turn, 3.4 nm/turn, 2.37 nm diameter

3
Q

Which types of DNA exist as a double helical structure: 4

A
  1. Nuclear
  2. Mitochondrial
  3. Bacterial
  4. Viral
4
Q

Every (Nucleated) human cell has HOW MANY GENOMES

  • WHAT IS THE CONTENT?
A

2 Genomes

  1. Mitochondrial DNA
  2. Nuclear DNA
5
Q

Explain Mitochondrial DNA: 8

A
  1. <0.001% of DNA
  2. 16,569 bp
  3. 37 genes
  4. 13 involved in the respiratory chain
  5. 24 non-coding RNAs
  6. Closed
  7. Circular DNA
  8. Densely packed
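As a quick sanity check on these figures, the sketch below tallies the gene counts and estimates how large one mtDNA copy is relative to the nuclear genome. The ~3 × 10^9 bp nuclear genome size is taken from the nuclear DNA card. Comparing a single mtDNA copy against one haploid nuclear genome is an assumption made here for illustration, since cells carry many mtDNA copies. All variable names are illustrative.

```python
MTDNA_BP = 16_569     # size of the mitochondrial genome (from this card)
NUCLEAR_BP = 3e9      # approximate haploid nuclear genome size (assumed, ~3 x 10^9 bp)

protein_coding = 13   # respiratory-chain polypeptides
noncoding_rna = 24    # 22 tRNAs + 2 rRNAs
total_genes = protein_coding + noncoding_rna

# Length of one mtDNA copy as a percentage of one haploid nuclear genome
mtdna_percent = MTDNA_BP / NUCLEAR_BP * 100

print(total_genes)              # 37
print(f"{mtdna_percent:.5f}%")  # 0.00055%
```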

6
Q

Explain Nuclear DNA

A
  1. >99.999% of DNA
  2. ~3 × 10^9 bp, >20,000 genes
  3. 23 pairs of chromosomes, varying sizes
  4. Genes spaced irregularly, contain introns and exons
  5. >2 m of linear DNA per cell, requires dense folding
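The ">2 m of linear DNA per cell" figure can be checked with a short calculation. The sketch below assumes B-form DNA with 3.4 nm and 10 bp per helical turn (the values on the DNA shape card) and a diploid cell carrying two copies of a ~3 × 10^9 bp genome. The constant and function names are illustrative.

```python
NM_PER_TURN = 3.4   # helical pitch in nm (from the DNA shape card)
BP_PER_TURN = 10    # base pairs per helical turn (from the DNA shape card)
HAPLOID_BP = 3e9    # approximate haploid genome size in bp

def dna_length_m(base_pairs: float) -> float:
    """Contour length in metres of B-form DNA with the given number of base pairs."""
    nm_per_bp = NM_PER_TURN / BP_PER_TURN   # 0.34 nm rise per base pair
    return base_pairs * nm_per_bp * 1e-9    # convert nm to m

# A diploid cell holds two copies of the nuclear genome
diploid_length = dna_length_m(2 * HAPLOID_BP)
print(f"{diploid_length:.2f} m")  # 2.04 m
```

Roughly 2 m of DNA in a nucleus only a few micrometres across is why dense folding is required.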

7
Q

Understanding the Mitochondrial Genome:

A

1 * Membrane-enclosed organelles

2 * 1000s per cell (depending on cell type)

3 * Converts energy from food to usable ATP

4 * Genome is ~17kb, closed circular loop

5 * Mitochondrial genome encodes :
* 2 ribosomal RNAs (rRNA)
* 22 transfer RNAs (tRNA)
* 13 polypeptides (mostly resp. chain)

6 * Genes do not have introns (cf. prokaryotes)

8
Q
  • Mitochondrial genome encodes : 3
A
  • 2 ribosomal RNAs (rRNA)
  • 22 transfer RNAs (tRNA)
  • 13 polypeptides (mostly resp. chain)
9
Q

Human Karyotype: Male vs Female

A

Male: 46 Chr, XY

Female: 46 Chr, XX

Paired 1-22 (Autosomes)
23rd pair = Sex chromosome XX, OR XY

10
Q

Chromosomal DNA packaging: Histones?

A
  1. Histones are proteins ENCODED BY NUCLEAR GENES
  2. "Beads on a string" appearance (DNA wrapped around histones)
  3. ~11 nm diameter
  4. A type of protein found in chromosomes.
  5. Histones bind to DNA, help give chromosomes their shape, and help control the activity of genes.
11
Q

Chromosomal DNA packaging: Chromosomes

A
  • 46 Chr
  • Chromosomes must be UNFOLDED and REFOLDED DURING REPLICATION AND WHEN GENES ARE EXPRESSED.
12
Q

Chromosomal DNA packaging:
CODES?

SIZES?

A

Many repeated “codes” within DNA sequence

  1. Metaphase chromosome ~1400 nm
  2. Condensed chromatin ~300-700 nm
  3. Packed chromatin fibre ~30 nm
  4. DNA double helix ~2 nm
13
Q

Structure of DNA

A
  1. Sugar: deoxyribose (ribose in RNA)
  2. Phosphate group (PO4)
  3. Nitrogenous base: cytosine or thymine (pyrimidines), adenine or guanine (purines)
  4. Items 1-3 together = a nucleotide
  5. Complementary strands of nucleotides are held together by hydrogen bonds between G-C and A-T base pairs.
  6. Purines (adenine and guanine) are double-ring nitrogenous bases.
  7. Pyrimidines (cytosine and thymine) are single-ring nitrogenous bases.
  8. In the DNA segment shown, the 5′ to 3′ directions are down the left strand and up the right strand.
    - The 5′-end (pronounced “five prime end”) designates the end of the DNA or RNA strand that has the fifth carbon in the sugar-ring of the deoxyribose or ribose at its terminus.
  9. A codon is a DNA or RNA sequence of three nucleotides (a trinucleotide) that forms a unit of genomic information encoding a particular amino acid or signalling the termination of protein synthesis (stop signals).
  10. Genes are short pieces of DNA that carry specific genetic information.
    - Genes are made up of a sequence of nucleotides.
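Base pairing and strand direction can be made concrete with a small sketch. Given one strand written 5′ to 3′, the code below derives the complementary strand, also written 5′ to 3′. The function name and example sequence are illustrative only.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'.

    The two strands run antiparallel, so the input is read backwards
    while each base is swapped for its pairing partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

print(reverse_complement("ATGGCC"))  # GGCCAT
```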
14
Q

What is TRANSCRIPTION?

WHAT ARE THE STEPS

A

In biology, the process by which a cell makes an RNA copy of a piece of DNA. This RNA copy, called messenger RNA (mRNA), carries the genetic information needed to make proteins in a cell. It carries the information from the DNA in the nucleus of the cell to the cytoplasm, where proteins are made.

  1. Transcription is the first step in gene expression. It involves copying a gene’s DNA sequence to make an RNA molecule.
  2. Transcription is performed by enzymes called RNA polymerases, which link nucleotides to form an RNA strand (using a DNA strand as a template).
  3. Transcription has three stages: initiation, elongation, and termination.
    In eukaryotes, RNA molecules must be processed after transcription: they are spliced and have a 5’ cap and poly-A tail put on their ends.
  4. Transcription is controlled separately for each gene in your genome.
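As a rough illustration of template-directed RNA synthesis, the sketch below builds an mRNA from a DNA template strand and shows that it matches the coding strand with T replaced by U. It ignores promoters, splicing, capping and polyadenylation. The function names and sequences are made up for the example.

```python
# DNA template base -> RNA base incorporated opposite it
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template_5to3: str) -> str:
    """mRNA (5'->3') made from a DNA template strand written 5'->3'."""
    # The template is read 3'->5', so walk the string in reverse
    return "".join(RNA_COMPLEMENT[b] for b in reversed(template_5to3.upper()))

def transcribe_coding(coding_5to3: str) -> str:
    """Shortcut: the mRNA has the coding-strand sequence with U in place of T."""
    return coding_5to3.upper().replace("T", "U")

coding = "ATGGCCTAA"
template = "TTAGGCCAT"  # reverse complement of the coding strand

print(transcribe_template(template))  # AUGGCCUAA
print(transcribe_coding(coding))      # AUGGCCUAA
```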
15
Q

What is TRANSLATION?

STEPS OF TRANSLATION?

A

In biology, the process by which a cell makes proteins using the genetic information carried in messenger RNA (mRNA). The mRNA is made by copying DNA, and the information it carries tells the cell how to link amino acids together to form proteins.

Translation proceeds in three phases:

  1. Initiation: The ribosome assembles around the target mRNA. The first tRNA is attached at the start codon.
  2. Elongation: The last tRNA validated by the small ribosomal subunit (accommodation) transfers the amino acid it carries to the large ribosomal subunit, which binds it to the amino acid of the previously admitted tRNA (transpeptidation). The ribosome then moves to the next mRNA codon to continue the process (translocation), creating an amino acid chain.
  3. Termination: When a stop codon is reached, the ribosome releases the polypeptide. The ribosomal complex remains
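The three phases can be mimicked with a toy translator: find the start codon (initiation), read one codon at a time (elongation), and stop at a stop codon (termination). This is a minimal sketch with a deliberately partial codon table. The entries shown are from the standard genetic code, but the function name, the "Xaa" placeholder and the example mRNA are illustrative.

```python
# Partial codon table (standard genetic code); not all 64 codons are listed
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "UUU": "Phe", "GGA": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(mrna: str) -> list[str]:
    """Translate from the first AUG until an in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")             # initiation at the start codon
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "Xaa")  # Xaa = codon not in the table
        if residue == "Stop":            # termination
            break
        peptide.append(residue)          # elongation
    return peptide

print(translate("GGAUGGCCUUUAAAUGAGG"))  # ['Met', 'Ala', 'Phe', 'Lys']
```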
16
Q

What is the HUMAN GENOME PROJECT?

A
  1. Sequenced between 1990 and 2001 by a public international
    consortium (IHGSC) and by a private company (Celera Genomics)
  2. Doesn’t represent a single individual; made of a PATCHWORK OF SEQUENCES FROM DIFFERENT INDIVIDUALS.
  3. Early analyses done “by hand”. More recently, large-scale computer-based analyses have been required
  4. Freely available online programs to compare sequences
    – (e.g. BLAST : http://www.ncbi.nlm.nih.gov/BLAST/)
  5. SEQUENCES are “ANNOTATED” WITH ALL KNOWN INFORMATION REGARDING GENES, REPETITIVE REGIONS, OTHER INFORMATION.
  • Questions :
    – Is the genome sequence complete?
    – How do we look at the genome content?
    – What is the content of the genome?
    – What are the functions of individual components of the genome?
    – How does the genome vary from individual to individual?
17
Q

With the NUCLEAR GENOME …HUMAN GENOME PROJECT WE CAN LOOK FOR: 2

A
  1. Repetitive sequences
  2. Specific sequences
    * Predicted sequences (similarity with other known genes)
    * mRNA/cDNA sequences
    * ESTs (expressed sequence tags): fragments of mRNA sequences derived through single sequencing reactions performed on randomly selected clones from cDNA libraries
18
Q

COMPOSITION OF HUMAN GENOME…

A

Human genome = 3200 Mb

  • Gene-related sequences = 1200 Mb
    - Genes = 48 Mb
    - Related sequences = 1152 Mb
      - Pseudogenes
      - Gene fragments
      - Introns and UTRs
  • Intergenic DNA = 2000 Mb
    - Interspersed repeats = 1400 Mb
      - LINEs = 640 Mb
      - LTR elements = 250 Mb
      - SINEs = 420 Mb
      - DNA transposons = 90 Mb
    - Other intergenic = 600 Mb
      - Microsatellites = 90 Mb
      - Various = 510 Mb

LOOK AND UNDERSTAND DIAGRAM SLIDE 11
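The figures on this card form a nested breakdown, and it can help to confirm that each level adds up. The sketch below stores the values from the card in a nested dictionary and sums them recursively. The structure and names are an illustrative choice, not part of the lecture.

```python
# Sizes in Mb, copied from the card
genome_mb = {
    "Gene-related sequences": {"Genes": 48, "Related sequences": 1152},
    "Intergenic DNA": {
        "Interspersed repeats": {
            "LINEs": 640, "LTR elements": 250, "SINEs": 420, "DNA transposons": 90,
        },
        "Other intergenic": {"Microsatellites": 90, "Various": 510},
    },
}

def total_mb(node) -> int:
    """Sum a category by adding up all of its sub-categories."""
    if isinstance(node, dict):
        return sum(total_mb(child) for child in node.values())
    return node

genome_total = total_mb(genome_mb)
for name, node in genome_mb.items():
    share = total_mb(node) / genome_total * 100
    print(f"{name}: {total_mb(node)} Mb ({share:.1f}%)")

genes_percent = 48 / genome_total * 100
print(f"Genes: {genes_percent:.1f}% of the genome")
```

The last line gives 1.5%, which matches the share quoted for coding genes on the "CODING" GENES card.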

19
Q

What are Retrotransposons? = 6

A

1 * Sequences related to retroviruses

2 * gag, pol and env genes
3 * LTRs (long terminal repeats)

4 * MANY TRUNCATED SEQUENCES IN THE GENOME (LACK ‘env’, OR JUST ‘LTRs’)

5 * Pol produces a reverse transcriptase
which ALLOWS DNA TO BE INTEGRATED INTO THE GENOME.

6 * Unlike retroviruses, retrotransposons can’t
move between cells

20
Q

LINEs, SINEs and ALUs…

what is LINEs? 4

A
  1. LINE : Long INterspersed Nuclear Element
    • Long (6-8kb) and can copy themselves to other parts of the genome
    • ENCODE PROTEINS WHICH ARE REQUIRED FOR THEIR INTEGRATION INTO THE GENOME.
    • 3 distinct LINE families: LINE1, LINE2 and LINE3. Only LINE1 is still transpositionally active
21
Q

What are SINEs and ALUs? =7

A
  1. SINE : Short INterspersed Nuclear Element

2 * Shorter than LINEs. Often aren’t able to integrate themselves

3 * Use LINE proteins to integrate

4 * SINEs are also divided into families (ALUs and MIRs). The only family that remains active is the ALUs

5 * ALUs occur only in primates. ~1 every 3kb

6 * ALUs classified into numerous sub-families based on their sequence

7 * ALUs are >80M years old. Can be used as molecular clocks
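The "~1 every 3kb" density implies a copy number that is easy to estimate. This back-of-the-envelope sketch assumes the 3200 Mb genome size from the composition card and an even spacing, which is a simplification. The variable names are illustrative.

```python
GENOME_BP = 3200e6     # 3200 Mb (from the genome composition card)
ALU_SPACING_BP = 3000  # roughly one ALU every 3 kb (from this card)

# Evenly spaced elements: copies = genome length / spacing
estimated_alu_copies = GENOME_BP / ALU_SPACING_BP
print(f"{estimated_alu_copies:,.0f}")  # 1,066,667
```

That is on the order of a million ALU copies per genome.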

22
Q

LINES, SINES and ALUs ..percentages..5

A

1 * About 50% of genomic DNA is transposable elements

2 * Can damage the host genome through insertional mutagenesis or
unequal crossover.

3 * They don’t move much anymore

4 * Most = SINEs (including Alu), then LINEs, LTR elements, DNA elements (e.g. mariner), and unclassified (least)

5 * Total of all types = 44.7% of the human genome
23
Q

What are Microsatellites? 4

A

1 * Short sequences (1-15bp) repeated in
tandem many times (2-50).
Dinucleotide repeats are the most
frequent

2 * Result in “low complexity” sequence * eg. ACACACACACACACACACACACA
or GCGCGCGCGCGCGCGC

3 * Prone to expansion and contraction
during replication due to polymerase
“slippage”

4 * Microsatellite sequences can be found
in coding sequences, but not very often
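Tandem repeats like these are easy to spot programmatically. The sketch below uses a regular expression with a backreference to find repeat units of 1-15 bp. The `min_repeats` threshold of 4 is an arbitrary choice for the demo (the card allows as few as 2), and the function name and test sequence are hypothetical.

```python
import re

def find_microsatellites(seq: str, max_unit: int = 15, min_repeats: int = 4):
    """Return (start, unit, repeat_count) for each tandem repeat found."""
    # ([ACGT]{1,n}?) lazily captures the shortest unit; \1{m,} demands further copies
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, count))
    return hits

sequence = "GGT" + "AC" * 6 + "ATTG"   # an (AC)6 dinucleotide repeat with flanking bases
print(find_microsatellites(sequence))  # [(3, 'AC', 6)]
```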

24
Q

What are Non-coding RNA genes (ncRNA)?

A

A gene whose product is a functional RNA molecule that is transcribed from DNA but not translated into protein

25
Q

Non-coding RNA genes (ncRNA) MAJOR CLASSES…8

A

1 * tRNA (Translational machinery; gene cluster on Chr 6 – almost complete set)

2 * rRNA (Translational machinery; 150-200 copies)

3 * Short regulatory ncRNA

  - 4 * snoRNA (RNA processing/base modification. 97 snoRNA, >85% single copy)

  - 5 * snRNA (RNA processing/splicing, multiple copies of some)

  - 6 * miRNA/piRNA/tiRNA (gene expression)

7 * lncRNA (epigenetic control of chromatin, promoter-specific gene regulation, mRNA stability, X-chromosome inactivation and imprinting)

8 * Others? (very current field of research)

26
Q

Non-coding RNA genes (ncRNA)

A

LOOK AT SLIDE 17

27
Q

What are Pseudogenes? 6

A

1 * Sequences related to coding or non-coding sequences that have mutated such that expression/function is lost (e.g. stop codons introduced,
frameshifts etc)

2 * Derived from genes (coding and non-coding) by duplication or
retrotransposition

3 * Different types include:

  - 4 * Gene fragments
    * Single exons, multiple exons. Very common

  - 5 * Whole genes
    * Includes introns. Splice sites often mutated

  - 6 * Processed pseudogenes
    * Mature mRNA from expressed gene reverse-transcribed and integrated into the genome

28
Q

Pseudogenes
Different types include : 3

A

1 * Gene fragments
* single exons, multiple exons. Very common

2 * Whole genes
* Includes introns. Splice sites often mutated

3 * Processed pseudogenes
* Mature mRNA from expressed gene reverse-transcribed and integrated
into the genome

29
Q

Pseudogenes

A
  1. Missing promoter
  2. missing start codon
  3. frameshift
  4. premature stop codon
  5. missing intron
  6. partial deletion

look at gene segment drawing SLIDE 18
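Some of the defects listed above can be detected directly from a coding sequence. The sketch below is a toy check, not a real pseudogene-annotation method: it flags a missing start codon, a length that is not a multiple of three (a possible frameshift), and an in-frame stop codon before the final codon. The function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_defects(cds: str) -> list[str]:
    """List simple defects in a candidate coding sequence (DNA, 5'->3')."""
    cds = cds.upper()
    defects = []
    if not cds.startswith("ATG"):
        defects.append("missing start codon")
    if len(cds) % 3 != 0:
        defects.append("possible frameshift")
    # Split into complete in-frame codons
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # A stop codon anywhere before the final codon truncates the protein
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        defects.append("premature stop codon")
    return defects

print(orf_defects("ATGGCCAAATAA"))     # []
print(orf_defects("ATGGCCTAAAAATAA"))  # ['premature stop codon']
print(orf_defects("GCCAAATAA"))        # ['missing start codon']
print(orf_defects("ATGGCAAATAA"))      # ['possible frameshift']
```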

30
Q

What are “CODING” GENES? 9

A

1 * Make up 1.5% of the genome, but they are the most studied

2 * Produce proteins which perform activities required by the cell (metabolism, transcription, translation, etc.)

3 * Can be single copy (e.g. Beta globin) or multiple copy (eg HLA class I genes)

4 * Genes can be grouped into families based on sequence similarity
  - 5 * Often evolved by duplication and divergence and found in clusters

6 * Some families group into superfamilies based on common protein domains (eg Ig-SF)

7 * Coding genes can be identified by comparing mRNAs (i.e. spliced sequences) with
genomic sequences

8 * Genbank is a public store of mRNA sequences generated by laboratories worldwide

9 * Gene/mutation naming conventions important for communication of findings

31
Q

Genes and repeat elements overlap = 4

A

1 * Numerous genes, different orientations (forward and reverse, opposite strands)

2 * Pseudogenes and gene fragments often intermingled (repeat content very dense)

3 * Coding genes can overlap in opposite orientations

4 * Some genes may contain complete genes within introns

32
Q

Genes and repeat elements overlap IMAGE

A

SLIDE 20.. DRAW AND LABEL

33
Q

General structure of coding genes

All genes have; 9

A

1) Promoter – TF & RNA pol binding site
(TATA vs TATA-less)

2) Introns and exons (coding)

3) 5’ UTR (drives translation)

4) Start codon (ATG)

5) Splice sites (introns starting GT and ending AG vs starting AT and ending AC)

6) Splice enhancers (exonic, intronic)

7) Stop codon (TAA, TAG, TGA)

8) 3’UTR (mRNA stability & localisation)

9) Polyadenylation signal (sequence)

34
Q

General structure of coding genes
Fig 3.6
All genes have; IMAGE DIAGRAM

A

LABEL AND DRAW THE DIAGRAM ON SLIDE 21

35
Q

Understanding Splicing and splice sites: 7

A

1 * DNA is transcribed to RNA in the nucleus

2 * RNA is processed by the spliceosome, where introns are spliced out to yield
a mature mRNA

3 * Specific sequences affect splicing
  - 4 * Splice acceptor/donor sites occur at intron/exon boundaries

5 * Enhancers/Silencers occur within introns and exons and can affect splicing in specific tissues (SR proteins)
  - 6 * SR proteins can direct the inclusion and exclusion of specific exons
  - 7 * The mixture of SR proteins differs from tissue to tissue

36
Q

Splicing and splice sites image

A

draw and label diagram on slide 22

37
Q

Alternative splicing can produce multiple protein isoforms… 6

A
  1. exon skipping
  2. intron retention
  3. alternative 5’ donor or 3’ acceptor
  4. mutually exclusive exons
  5. alternative promoters
  6. alternative splicing and polyadenylation

understand all forms and draw the diagrams on slide 23
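Exon skipping, the first mechanism in the list, can be illustrated by enumerating the transcripts a gene could produce in principle. The sketch below assumes the first and last exons are always kept and any internal exon may be skipped. Real genes use only a subset of these combinations, and the function name and exon labels are hypothetical.

```python
from itertools import combinations

def exon_skipping_isoforms(exons: list[str]) -> list[list[str]]:
    """All isoforms that keep the first and last exon and skip any internal exons."""
    first, *internal, last = exons
    isoforms = []
    # Start with every internal exon kept, then skip progressively more
    for n_kept in range(len(internal), -1, -1):
        for kept in combinations(internal, n_kept):
            isoforms.append([first, *kept, last])
    return isoforms

for isoform in exon_skipping_isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(isoform))
# E1-E2-E3-E4
# E1-E2-E4
# E1-E3-E4
# E1-E4
```

With n exons this gives up to 2^(n-2) transcripts from exon skipping alone, which is one reason a single gene can yield many protein isoforms.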

38
Q

The “Average” human gene = 4

A

1 * Large variation in gene size (2kb – 2Mb)

2 * Large variation in protein sizes

3 * Large variation in UTR lengths (3’ generally longer than 5’)

4 * Many genes have alternative first exons with different 5’UTRs

LOOK AT TABLE ON SLIDE 24

39
Q

Understanding the features of Human genes..

A

1 * Approximately 50000 – 100000 genes were predicted in the genome

2 * The completion of the genome sequence in 2001 showed 20,000-25,000

3 * Alternative splicing explains the difference (some genes can produce >10 different proteins)

4 * Common features can be identified in genes with related functions

5 * Cell Surface Receptors for example;
* Leader sequence (to direct proteins to the cell surface) ~20 amino acids
* Extracellular domains (of different families. One example is Ig, SH-linked)
* Number of EC domains can vary
* A stalk region
* A membrane anchoring sequence and/or transmembrane sequence
* An intracellular domain (of different families). Delivers signals

40
Q

Human Genes….Cell Surface Receptors for example; 6

A
  1. Leader sequence (to direct proteins to the cell surface) ~20 amino acids

2 * Extracellular domains (of different families. One example is Ig, SH-linked)

3 * Number of EC domains can vary

4 * A stalk region

5 * A membrane anchoring sequence and/or transmembrane sequence

6 * An intracellular domain (of different families). Delivers signals

41
Q

NKp44 – a receptor on NK cells

A

draw image on slide — gene segment and features…

SLIDE 25

42
Q

Protein families =12

A
  1. Cellular processes
  2. metabolism
  3. DNA replication/modification
  4. intracellular signalling
  5. cell-cell communication
  6. protein folding and degradation
  7. transport
  8. multifunctional proteins
  9. cytoskeletal/structural
  10. defence and immunity
  11. miscellaneous function
  12. transcription/translation

Look at slide 26 graph