Module 4: Genome Function Flashcards

1
Q

(T/F) The way proteins get translated (how, when, where) is dictated by the way DNA packs into the nucleus.

A

True!

Everything inside the nucleus has its place. The distribution of the various proteins and DNA sequences is a direct consequence of the function of each nuclear compartment.

The chromosomes fold so that each DNA sequence is in an OPTIMAL AREA of the nucleus to carry out its function.

2
Q

Where does most of our knowledge about how the nucleus is organized come from?

A

Most of our knowledge about how the nucleus is organized came with the ADVENT OF FLUORESCENT MARKERS.

With a conventional microscope alone, it is very hard to make any observation about the 3D structure of the nucleus.

3
Q

The nucleus is _____ and ______ bound.

The nucleolus is _______-bound.

Specialized sub-domains inside the nucleus are called ______ _____ _______.

A

circular; membrane

non-membrane

Bio-Molecular Condensates

4
Q

Briefly answer the questions regarding bio-molecular condensates:

1) What is the function of the nucleolus?

2) What are Cajal bodies?

3) What are speckles?

A

1) Transcription and processing of ribosomal RNA (rRNA).

2) Cajal bodies are sites of RNA processing. snRNAs are transcribed and processed here.

3) Speckles are sites of mRNA splicing.

5
Q

(T/F) The nucleus is a pretty static organelle.

A

False!

The nucleus is HIGHLY dynamic.

6
Q

Briefly describe the FRAP technique.

A

Fluorescence Recovery After Photobleaching (FRAP) tracks the movement of fluorescently labelled proteins.

A protein that is present everywhere in the nucleus is labelled with a GFP. Then, a pulse from a high-energy laser is used to bleach a small area of the nucleus (PHOTOBLEACH).

Eventually, the photobleached area is RECOVERED with the protein. By measuring how long it takes for recovery, you can determine if it is a highly mobile (fast recovery) or relatively static protein (slow recovery).
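
*A minimal sketch of how recovery time separates mobile from static proteins. It assumes a simple single-exponential recovery model, which is an illustrative assumption and not something stated in these cards; the function names and rate constants are hypothetical.

```python
import math

def frap_recovery(t, mobile_fraction, k):
    """Fraction of pre-bleach fluorescence recovered at time t (seconds).

    Assumes single-exponential recovery with rate constant k (illustrative).
    """
    return mobile_fraction * (1.0 - math.exp(-k * t))

def half_time(k):
    """Time needed to reach half of the final recovery plateau."""
    return math.log(2) / k

# Hypothetical rate constants (per second) for two proteins.
fast_k, slow_k = 0.5, 0.01
print(half_time(fast_k))  # short half-time: highly mobile protein
print(half_time(slow_k))  # long half-time: relatively static protein
```

A shorter half-time corresponds to the "fast recovery" case described above.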

7
Q

(T/F) By using the FRAP technique, we know that the nucleus is extremely dynamic as a single protein can move around in a matter of minutes.

A

True!

Although the nucleus is jammed full of DNA (6 billion bp of DNA), everything is organized in a way that proteins can be very mobile.

8
Q

Which one of the statements is true?

1) DAPI binds to dsDNA.

2) Even with the use of a fluorescent dye that attaches to DNA, we are unable to tell the structure of interphase chromatin, which makes it challenging to determine the typical packaging of DNA.

A

Both!

9
Q

The 10nm chromatin fiber is also known as __________.

140-150bp DNA is wrapped around a core _____ of histone proteins (which proteins?), forming a ________.

The nucleosomes are separated by 50-70bp of ____ DNA.

Linker histone (___) binds to linker DNA and _______ DNA further.

A

beads-on-a-string

octamer (two of H2A, H2B, H3, H4); nucleosome

linker

H1; compacts

*the exact role of H1 is still unclear

10
Q

In human cells, there are ___ bp of DNA that wrap around the histone octamer.

A

147

11
Q

Initially, what did we think happened to the 10nm fibre?

What were the two models we generated from this hypothesis?

A

We used to think the 10nm fibre would fold onto itself, generating a 30nm fibre. The 30nm fibre was thought to be the MAIN form of chromatin in the INTERPHASE nucleus.

The two models in which the 10nm fibre would coil onto itself:
1) The solenoid model
2) The helical ribbon model

12
Q

The in-vivo existence of the 30 nm fiber was questioned by a paper. What did they suggest would happen instead?

A

Rather than having a uniform, coiled structure (30nm fiber), we have CLUTCHES of nucleosomes (think of eggs in a nest).

ACTIVE and SILENCED compartments arise from variations in packing densities.

The formation of the interphase DNA is LINKED to the CELL TYPE and the STAGE in DIFFERENTIATION. For example, compared to a stem cell, a somatic cell has more nucleosomes clustered together.

13
Q

Briefly describe how STORM works.

A

STORM is a SUPER-RESOLUTION microscopy technique that makes use of PHOTO-SWITCHABLE FLUOROPHORES.

Histones found in the nucleosomes are tagged with the photo-switchable fluorophores. Only a small subset of the fluorophores is in the ON state at a given time.

The fluorophores switch between fluorescent and dark states.
This limits the number of fluorophores active in each frame, preventing anything closer than 250nm from being on at the same time, so each frame shows distinct spots.

The pictures taken in each round are combined into a single image built from single-molecule localizations.

14
Q

1) What does resolution mean regarding microscopes?

2) What must the minimum separation of two fluorophores be to tell them apart? What happens above and below this number?

3) Can nucleosomes show up as distinct spots?

A

1) The ability to see two separate objects as two separate objects.

2) The minimum separation must be 250nm (Rayleigh criterion). If two fluorophores are closer than 250nm, we see them as one blurry spot (unresolved). If they are farther apart than 250nm, they are seen as two distinct spots (resolved).

3) No! Nucleosomes won’t show up as distinct spots because they are closer together than 250nm. This limits our ability to investigate the organization of the nucleus.
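
*A minimal sketch of the resolution test in answer 2. The 250nm threshold comes from this card; the formula d = 0.61 x wavelength / NA is the standard Rayleigh expression, and the wavelength, numerical aperture, and the 20nm nucleosome spacing are hypothetical example values.

```python
def rayleigh_limit_nm(wavelength_nm, numerical_aperture):
    """Rayleigh criterion: smallest resolvable separation, in nm."""
    return 0.61 * wavelength_nm / numerical_aperture

def is_resolved(separation_nm, limit_nm=250.0):
    """True if two fluorophores appear as two distinct spots."""
    return separation_nm >= limit_nm

# Example values (assumed): green light and a high-NA oil objective.
print(rayleigh_limit_nm(550, 1.4))  # a bit under 250 nm

print(is_resolved(300))  # True: seen as two spots
print(is_resolved(20))   # False: neighbouring nucleosomes blur into one spot
```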

15
Q

1) How was STORM used to show that we have clutches of nucleosomes?

2) What were the results?

A

1) H2B was tagged with the photo-switchable fluorophores. Since this protein is part of the core histone octamer, localization should reflect the ARRANGEMENT OF NUCLEOSOMES within the chromatin fiber.

2) The histone protein appeared clustered in discrete and spatially separated nanodomains (no pattern). There was a higher density of it in the nuclear periphery compared to the interior.

16
Q

When there are more clutches (nucleosomes are packed), genes are turned ___.

_________ tends to cluster on the periphery, while _______ tends to cluster on the inside.

A

Off

Heterochromatin (silenced); euchromatin

*the organization of genes is not random!

17
Q

Differentiate facultative heterochromatin from constitutive heterochromatin.

A

Facultative: sometimes heterochromatin, sometimes euchromatin. Depends on cues.

Constitutive: always heterochromatin (telomeres, centromeres, highly repetitive DNA).

18
Q

1) What are the two types of fibre networks present in the nucleus? Briefly describe each.

2) Why are these important?

A

1) Nuclear lamina (just underneath the nuclear envelope) and nuclear matrix (extends into the nucleoplasm).

2) These serve as ATTACHMENT POINTS FOR CHROMOSOMAL DNA. Proteins bind to these fibres, and the same proteins bind to DNA, anchoring sequences at specific areas in the nucleus. The placement of DNA inside the nucleus is facilitated, in part, by the presence of the LAMINA and MATRIX.

19
Q

(T/F) Packaging of DNA inside the nucleus has to support the function of the cell. Thus, it has to change and has to differ from cell type to cell type.

A

True!

20
Q

Briefly answer the following questions regarding the nuclear lamina.

1) What is the thickness?

2) What kind of filament is it made of?

3) What is its role?

4) What are the types of proteins? How many genes encode the different lamins?

A

1) Its thickness can range from 10nm to 100nm, typically falling within the 10-30nm range.

2) Type V intermediate filament

3) Its role is analogous to the role of the cytoskeleton; for rigidity and structure (helps anchor the DNA in the nuclear environment)

4) Lamin A, Lamin C, Lamin B1, and Lamin B2. Lamin A and C are made from the same gene through alternative splicing, while the rest are made from different genes, thus 3 genes encode the different lamins.

21
Q

Lamins are made of three domains. Describe their structure and function.

A

1) Head domain: N-terminal

2) Coiled-coil rod domain: made of four subdomains (1A/B, 2A/B) and mediates the interaction with OTHER PROTEINS IN THE NUCLEAR LAMINA.

3) Ig-like fold domain: mediates interaction with NON-LAMINS such as histones.

22
Q

Which one of the statements is true regarding lamins?

1) For assembly, lamin monomers dimerize; the dimers are organized into head-to-tail polymers, which then assemble into anti-parallel filaments.

2) One anti-parallel filament is made of all the lamins.

3) All lamins are permanently farnesylated to the inner nuclear membrane.

A

1!

For 2, an anti-parallel filament is made up of either lamin A/C, lamin B1 or lamin B2. These can interact with each other but they don’t mix within a filament.

For 3, lamins A, B1, and B2 undergo a post-translational modification (farnesylation) that causes them to be retained at the nuclear envelope. However, only B-type lamins are permanently farnesylated, as the farnesyl group (an isoprenoid from the cholesterol pathway) is removed from lamin A. Lamin A still stays close to the membrane.

23
Q

1) What is the nuclear matrix (does it exist)?

2) What are matrins?

3) What are S/MARs?

A

1) The existence of the nuclear matrix is controversial. There is some evidence that some filaments extend throughout the nucleoplasm and deeper into the nucleus. The proposed role of the nuclear matrix is to provide anchor points that help organize the chromatin in 3D space (each chromosome occupies a certain region in the nucleus, and anchoring could explain how).

2) Matrins are proteins that bind to the matrix and could potentially bind to DNA.

3) S/MARs are scaffold/matrix attachment regions: AT-rich DNA sequences (100-1000bp in length) that are thought to interact with matrins.

24
Q

In the 1960s, the idea that each chromosome occupied its own space was abandoned.

1) Why was it abandoned?

2) How was the idea shown to be true in the 1980s/1990s?

A

1) It was abandoned in the early 60s because all we had was electron microscopy, which did not tell us anything about how chromosomes are arranged inside the nucleus.

2) Using the FISH technique where each chromosome binds a probe with a distinct colour, it was found that chromosomes occupy distinct regions. It was also found that the two neighbouring chromosomes can INTERMINGLE ON THE BORDERS OF TERRITORY.

25
Q

Chromosome conformation capture (3C) is a method commonly used for studying ______ ___________ in eukaryotic cells.

It relies on protein ___-______ and ________ ______ to detect chromatin interactions.

It uncovered general features of genome _______.

A

chromatin interactions

cross-linking; proximity ligation

organization

*any two regions of DNA close together can be cross-linked. we can use NGS to find what those pieces of DNA are.

*this tool revolutionized the study of chromatin folding inside the nucleus.

26
Q

Match the steps of 3C to the right order:

1) Step 1
2) Step 2
3) Step 3
4) Step 4
5) Step 5

A) Reverse crosslink.

B) PCR amplification of all circular DNA using primers of two known (‘biased’) genomic regions of interest.

C) Formaldehyde (freezes the nucleus) crosslinks chromatin proteins (covalent bond) to their associated DNA.

D) Digested ends are re-ligated (conditions favour ligation of juxtaposed DNA fragments; intramolecular ligation of nearby ends).

E) Digestion of crosslinked DNA with restriction enzymes (non-specific endonuclease DNase I) to find points where selected DNA regions are connected through a protein complex.

A

Step 1: Formaldehyde (freezes the nucleus) crosslinks chromatin proteins (covalent bond) to their associated DNA.

Step 2: Digestion of crosslinked DNA with restriction enzymes (non-specific endonuclease DNase I) to find points where selected DNA regions are connected through a protein complex.

Step 3: Digested ends are re-ligated (conditions favour ligation of juxtaposed DNA fragments; intramolecular ligation of nearby ends).

Step 4: Reverse crosslink.

Step 5: PCR amplification of all circular DNA using primers of two known (‘biased’) genomic regions of interest.

27
Q

In 3C, you end up with a genome-wide ligation product library in which each ligation product corresponds to …?

A

In 3C, you end up with a genome-wide ligation product library in which each ligation product corresponds to a specific interaction between the two corresponding loci.

28
Q

1) Why is 3C considered a one vs one method?

2) Why is 3C considered a biased method?

A

1) 3C is considered a one vs one method because it can only test ONE PAIRWISE INTERACTION at a time. One half of the primer pair is specific for one gene segment, while the other half of the primer pair is specific for another gene segment. These primers are tested on ALL of the circular DNA. If there is a positive signal (amplification), the two gene segments are next to each other.

2) 3C is considered a biased method because it only probes what we are asking it to probe. It only detects gene segments X and Y if we design primers for X and Y.
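
*A toy sketch of why 3C is one vs one and biased. The ligation library and locus names below are hypothetical; the "PCR" is modelled simply as a lookup for one chosen pair of loci.

```python
# Hypothetical ligation products: each is a pair of loci joined by proximity ligation.
ligation_library = {
    frozenset({"X", "Y"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
}

def pcr_3c(library, primer_locus_1, primer_locus_2):
    """Return True if this primer pair would amplify a ligation junction."""
    return frozenset({primer_locus_1, primer_locus_2}) in library

# Each call tests exactly ONE pairwise interaction.
print(pcr_3c(ligation_library, "X", "Y"))  # True
print(pcr_3c(ligation_library, "X", "A"))  # False
```

The A-B and A-C junctions are in the library too, but they stay invisible unless primers are designed for them.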

29
Q

Which statement is true?

1) 3C is a very efficient method with high throughput.

2) Non-specific DNase I cuts any accessible region. Linker DNA can be cut but DNA around nucleosome can’t be cut as it is more compact and less accessible.

A

2!

For 1) 3C is not a very efficient method. It has low throughput as it is a one vs one method. It does allow us to ask precise questions. Back when this method came out, sequencing was not common, so PCR had to be used.

30
Q

Another chromatin conformation capture technique is ChIA-PET (chromatin interaction analysis by paired-end tag sequencing).

Match the steps of this method in the right order.

1) Step 1
2) Step 2
3) Step 3
4) Step 4

A) Sonication to break DNA into smaller pieces, immunoprecipitation (antibody precipitating a protein of interest), and ligation of adapters to fragment ends.

B) Digestion generating isolated paired-end tags (PETs) and massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins.

C) Crosslinking with formaldehyde

D) Proximity ligation and reverse crosslinking

A

Step 1: Crosslinking with formaldehyde

Step 2: Sonication to break DNA into smaller pieces, immunoprecipitation (antibody precipitating a protein of interest), and ligation of adapters to fragment ends.

Step 3: Proximity ligation and reverse crosslinking

Step 4: Digestion generating isolated paired-end tags (PETs) and massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins.

31
Q

What is the specific question that is being asked by ChIA-PET?

A

What sequences are found next to each other but are mediated by a specific protein?

*this method looks at where a certain protein is bound in the genome and determines which two sequences are being brought together.

32
Q

1) What is the composition of the paired-end tags (PETs) in ChIA-PET?

2) Which DNA sequencing method can be used in ChIA-PET?

A

1) The paired-end tags (PETs) of ChIA-PET are composed of two different DNA segments brought close together in 3D space by the mediator.

2) ChIA-PET can use NGS (Illumina paired-end sequencing), where the forward read is region X and the reverse read is region Y.

33
Q

Compare and contrast 3C with ChIA-PET.

A

Differences:
3C ligates free ends while ChIA-PET adds adapters to sequence the DNA and ligates them.

3C is one vs one while ChIA-PET is all vs all but mediated by a protein.

Similarities:
Neither gives information on all of the interacting regions across the genome. 3C is very low throughput; ChIA-PET has higher throughput, but it only gives information about the interacting regions MEDIATED BY A PROTEIN.

34
Q

Another chromatin conformation capture technique is Hi-C.

Match the steps of this method in the right order.

1) Step 1
2) Step 2
3) Step 3
4) Step 4
5) Step 5
6) Step 6

A) Restriction fragment ends labelled with biotin (VB7).

B) Sonication and STREPTAVIDIN pull-down. Sequencing using NGS to identify interacting fragments.

C) Reverse crosslinking (all ligation junctions are marked with BIOTIN).

D) Formaldehyde crosslinking of chromatin proteins to their associated DNA.

E) Digested ends are re-ligated (conditions favour ligation of juxtaposed DNA fragments).

F) Digestion of crosslinked DNA with restriction enzymes.

A

Step 1: Formaldehyde crosslinking of chromatin proteins to their associated DNA.

Step 2: Digestion of crosslinked DNA with restriction enzymes.

Step 3: Restriction fragment ends labelled with biotin (VB7).

Step 4: Digested ends are re-ligated (conditions favour ligation of juxtaposed DNA fragments).

Step 5: Reverse crosslinking (all ligation junctions are marked with BIOTIN).

Step 6: Sonication and STREPTAVIDIN pull-down. Sequencing using NGS to identify interacting fragments.

35
Q

Briefly describe the step of streptavidin pull-down in Hi-C.

A

The biotin-streptavidin interaction is one of the STRONGEST non-covalent interactions found in nature.

All ligated ends are marked with a biotin molecule. Sequences with biotin = two different sequences ligated together. This step picks out sequences with BIOTIN by affinity pull-down (similar in spirit to immunoprecipitation).

Streptavidin, typically immobilized on beads, binds to biotin, so the biotin-marked ligation junctions are captured and everything else is washed away.

36
Q

Out of the three (3C, ChIA-PET, Hi-C), which one is the true all vs all method?

A

Hi-C!

Hi-C asks if sequence A is close to sequence B, C, D, E, etc. It is probing every single possible pairwise interaction across the entire genome.

It is not biased like 3C.

37
Q

1) What is Hi-C mostly used for?

2) What is the disadvantage of the Hi-C method?

A

1) Hi-C is mostly used for characterizing long-range chromatin interactions across an entire genome (how DNA folds inside the nucleus).

2) Hi-C requires MILLIONS of cells so it is revealing the architecture shared by an ENTIRE population of individual cells. Nuclear architecture is not the same for all cells; it is generating an aggregated image.

38
Q

3C uses ________, ChIA-PET uses ________, and Hi-C uses ________ enzymes to fragment the DNA into manageable pieces to analyze.

A

non-specific DNase I; sonication; restriction enzymes

39
Q

(T/F) In 2009, the first genome-wide view of interactions between all sequences in the mammalian genome was published.

A

True!

From this, basic organizing principles of the nucleus emerged.

40
Q

1) What is the Hi-C contact map/matrix? What kind of information can we get from them?

2) What do Hi-C contact points represent?

3) How is the linear genome partitioned for Hi-C?

A

1) Hi-C results are displayed in a Hi-C contact map/matrix (chromosome-wide matrix of interaction frequencies). It visually represents all of the interactions across the entire genome (which regions are close together) and gives information on the FREQUENCY of these interactions (how often they are found).

2) Hi-C contact points represent pairs of genomic positions that were adjacent to each other in 3D space.

3) Linear genomes (chromosomes, series of chromosomes) are partitioned into bins (loci) of a fixed size.

41
Q

Why were initial Hi-C studies giving low-resolution maps?

How could the resolution be increased?

A

Initial Hi-C studies used 1Mb “bins.” Each “square” on the Hi-C contact matrix represents 1 million base pairs of sequence. The low-resolution map is due to each bin being so large. While Hi-C with 1Mb bins can tell us that DNA from one bin interacts with DNA from another bin, it doesn’t reveal which specific sections within those large bins are involved in the interaction.

We can increase resolution by decreasing bin size. This can be done by sequencing more DNA and increasing sequencing runs.

42
Q

(T/F) Hi-C plays a central role in mapping the 3D architecture of the interphase genome.

A

True!

43
Q

Depending on the enzyme used, we generate more or less fragments.

If we use 6bp-cutting enzyme, the frequency of cutting is ______ (sequence will occur at random once every _____ bp). We generate ~10^6 restriction fragments; 10^12 possible pairwise interactions.

If we use 4bp-cutting enzyme, the frequency of cutting is _____ (sequence will occur at random once every ____ bp). We generate ~10^7 restriction fragments; 10^14 possible pairwise interactions.

A

1/4^6; 4096

1/4^4; 256
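
*A short worked calculation of the numbers on this card. It assumes a random sequence with equal base frequencies and a ~3 billion bp haploid human genome (the genome size is an assumption added here); the function names are hypothetical.

```python
GENOME_SIZE_BP = 3_000_000_000  # assumed haploid human genome size

def expected_site_spacing(site_length_bp):
    """A given n-bp site occurs at random once every 4^n bp."""
    return 4 ** site_length_bp

def expected_fragments(site_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Approximate number of restriction fragments across the genome."""
    return genome_size_bp // expected_site_spacing(site_length_bp)

for n in (6, 4):
    fragments = expected_fragments(n)
    # Possible pairwise interactions scale as (number of fragments)^2.
    print(n, expected_site_spacing(n), fragments, fragments ** 2)
```

The 6bp cutter gives a spacing of 4096 bp and on the order of 10^6 fragments; the 4bp cutter gives 256 bp and on the order of 10^7 fragments, matching the orders of magnitude quoted in the question.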

44
Q

Why did researchers bin data?

A

There would be one million squares on each axis if using a 6bp-cutting enzyme.

If we only sequence 100 million fragments, our matrix will remain underpopulated.

When Hi-C was initially introduced, sequencing deeply enough to cover ~10^12 possible interactions was beyond our capabilities, and it remains so even now. Consequently, data binning was employed.

Rather than stating that our chromosome is divided into 4096 bp fragments, we grouped the fragments. Bins were set at one million bps, where any fragment corresponding to that one million receives a tick in the respective box, populating the matrix.
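
*A minimal sketch of the binning step described above: each sequenced pair adds one tick to the box for its two bins. The read-pair positions and the 3 Mb toy chromosome are hypothetical.

```python
BIN_SIZE = 1_000_000  # 1 Mb bins, as in the initial Hi-C studies

def bin_contacts(contacts, chrom_length, bin_size=BIN_SIZE):
    """Build a symmetric contact matrix from (pos_a, pos_b) read pairs.

    Positions are in bp and are assumed to be smaller than chrom_length.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in contacts:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix

# Hypothetical read pairs on a 3 Mb toy chromosome.
toy_contacts = [(150_000, 2_400_000), (900_000, 2_100_000), (1_200_000, 1_800_000)]
for row in bin_contacts(toy_contacts, chrom_length=3_000_000):
    print(row)
# [0, 0, 2]
# [0, 1, 0]
# [2, 0, 0]
```

Smaller bins give more boxes to fill, which is why higher resolution needs more sequenced pairs.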

45
Q

What is a significant challenge encountered by Hi-C?

A

Achieving sufficient sequencing coverage to support maximal resolution is a significant challenge.

46
Q

____-chromosomal interactions are rare, while _____-chromosomal interactions are frequent.

A

inter (trans); intra

*a chromosome interacts more with itself than with other chromosomes

47
Q

With multi-Mb large bins for Hi-C, they saw the appearance of checkerboard patterns. What did they conclude from this?

A

DNA segregates into two main compartments in the cells: Active (A) and inactive (B).

In the A compartment, transcription of genes is occurring and there are histone modifications that correlate with transcription (acetylation).

In the B compartment, transcription of genes is repressed and there are histone modifications that correlate with transcription repression (methylation).

The checkerboard represents that the A compartment sequences tend to interact with A compartment sequences more frequently than with B.
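
*A toy illustration of how alternating A and B segments produce a checkerboard. The compartment labels and the contact scores (5 for same compartment, 1 for different) are hypothetical and idealized; real maps are noisy.

```python
def toy_contact_map(compartments, same=5, different=1):
    """Idealized contact matrix: bins in the same compartment contact more."""
    return [
        [same if a == b else different for b in compartments]
        for a in compartments
    ]

# Hypothetical compartment label for each bin along one chromosome.
labels = ["A", "A", "B", "B", "A", "B"]
for row in toy_contact_map(labels):
    print(" ".join(str(value) for value in row))
```

High values appear wherever two bins share a compartment, even when they are far apart on the linear chromosome, which is what gives the plaid pattern.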

48
Q

(T/F) There is a continuous segregation of the DNA along the length of a chromosome.

A

False!

There is NOT a continuous segregation of the DNA along the length of a chromosome. It alternates between active and inactive. In 3D space, the active regions are close to each other.

49
Q

When zooming into a chromosome, its A compartment is more towards the ______, while the B compartment is more towards the _______.

A

Central; periphery

50
Q

Decreasing sequencing costs led to richer Hi-C data sets (200-300 million paired ends), partitioning into ~40kb bins.

What did this help to identify?

A

By increasing the resolution of the maps (up from the ~10 million paired-end reads used for 1 Mb bins), they identified TOPOLOGICALLY ASSOCIATING DOMAINS (TADs).

*triangles with very sharp boundaries at the bottom of the Hi-C maps.

51
Q

TADs are smaller discontinuous ______ ______ within the ___ compartments. They are also known as __________ _______.

They are a _________ segment of DNA folded into coils and loops (______ ______).

They are a _________ feature of the genome (2-3k TADs) that can range up to _____ bp in size.

Self-interaction is very ___, creating triangles with ____ boundaries in the Hi-C maps. Interaction between TADs is ____.

A

SELF-ASSOCIATING DOMAINS; A/B compartments; Self-Interaction Domains.

continuous; (chromatin globule).

CONSERVED; 1 million bp

high; sharp; low.

*chromatin fiber makes up TADs
*TADs help isolate certain sequences

52
Q

Each chromosome occupies a distinct territory. Why must the chromosomes fold in a specific way in the territory?

A
1) To make it fit in the interphase nucleus (decondensed).

2) It has to fold to support gene expression (genes have to be accessible to the transcriptional machinery).
53
Q

(T/F) Gene sequences within a TAD are not regulated similarly.

A

False!

Gene sequences found within a TAD tend to be regulated similarly (co-ordinately).

54
Q

Cohesins form _____, very large loops of DNA, via a _____-_____ ______ model.

Cohesins are made of three proteins: _____, _____ and _______.

Structural Maintenance of Chromosomes (SMC) proteins have a ___ and a ____ domain that are linked by a _________ domain.

Cohesins are one of the main players in ______ the structure of the nucleus.

A

TADs; cohesin-mediated extrusion (loop extrusion)

SMC1; SMC3; RAD21

head; hinge; coiled-coil

organizing

55
Q

Match the steps of the loop extrusion model:

1) Step 1
2) Step 2
3) Step 3
4) Step 4

A) The cohesin ring entraps DNA and then moves along, extruding the DNA through its lumen until an obstacle (CTCF) is met.

B) Cohesin is released by WAPL and the loop falls apart. These loops are made and fall apart continuously.

C) The cohesin complex is loaded onto DNA with the help of a loading factor (NIPBL). Where cohesin loads onto the DNA is random.

D) Two convergent CTCFs prevent cohesin from moving along the DNA.

A

Step 1: The cohesin complex is loaded onto DNA with the help of a loading factor (NIPBL). Where cohesin loads onto the DNA is random.

Step 2: The cohesin ring entraps DNA and then moves along, extruding the DNA through its lumen until an obstacle (CTCF) is met.

Step 3: Two convergent CTCFs prevent cohesin from moving along the DNA.

Step 4: Cohesin is released by WAPL and the loop falls apart. These loops are made and fall apart continuously.
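
*The steps above can be sketched as a toy simulation. This is a simplified illustration, not a biophysical model: DNA is a line of integer positions, cohesin extrudes outward in both directions from a random loading point, and each arm halts only at a CTCF site whose motif points back toward the loop (convergent orientation). The site positions and the function name are hypothetical.

```python
import random

def extrude(load_pos, ctcf_sites, length):
    """Return (left_anchor, right_anchor) of the loop made by one cohesin.

    ctcf_sites maps position -> '>' (motif points right) or '<' (points left).
    The left arm stops at a '>' site and the right arm at a '<' site, so the
    two blocking sites point toward each other. Arms also stop at the ends.
    """
    left = right = load_pos
    while left > 0 and ctcf_sites.get(left) != ">":
        left -= 1
    while right < length - 1 and ctcf_sites.get(right) != "<":
        right += 1
    return left, right

# Hypothetical CTCF sites on a 100-position toy chromosome.
ctcf = {20: ">", 50: ">", 80: "<"}

# Random loading anywhere between the convergent pair gives the same loop.
load = random.randint(51, 79)
print(extrude(load, ctcf, length=100))  # (50, 80)

# Loading further left: the right arm passes the '>' site at 50,
# because that site is not convergent for it.
print(extrude(30, ctcf, length=100))    # (20, 80)
```

Release by WAPL is not modelled here; in this sketch each call simply reports where one cohesin would end up stalled.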

56
Q

(T/F) A dot within TADs on a Hi-C map indicates a highly prevalent and consistently positioned interaction, likely present in every cell across our dataset.

A

True!

57
Q

What does the dot within TADs on a Hi-C map represent?

A

The dot at the top of the TAD represents the two CTCF binding sites being brought together.

The cohesin complex pauses when it encounters a pair of CTCF proteins bound to the DNA. Since CTCF binds to a specific (consensus) sequence of DNA, a loop always forms between those two stops!

In every cell, we find a CTCF consensus binding site and in every cell, CTCF is bound to those sites.

When cohesin binds in between (randomly) and starts extruding, the loop will always bring those sites together.

58
Q

What are the two criteria required for a loop to form?

A

1) There needs to be a CTCF binding site on each side which each binds a separate CTCF protein.

2) The CTCF sites need to be convergent (point towards each other).

*CTCF binding sites are asymmetric (have directionality)

59
Q

What happens if two CTCF binding sites are divergent (not pointing toward each other)?

A

No loop is formed, as the cohesin complex does not pause even if CTCF proteins are bound to the sites.

The cohesin complex can still pass through until it reaches a convergent site.

60
Q

TADs contain multiple loop extrusion complexes, called ______.

A

Sub-TADs!

There are loops within the TAD (each also held together by the interaction between CTCF proteins that cause cohesin molecules to pause). These sub-TADs interact more with themselves than with other sub-TADs.

61
Q

Why do some TADs not have the dot at the top?

A

The formation of a TAD does not necessarily require convergent CTCF sites flanking each side.

The dot = TAD between two CTCFs.
No dot = TAD without the CTCFs. These TADs do not form exactly at the same spot in the population.

*even without the CTCFs, the cohesin complex dissociates after about 10 minutes.

62
Q

(T/F) There are more TADs without flanking CTCF binding sites.

A

True!

63
Q

As we zoom in from 1 Mb TADs to sizes ranging from tens to hundreds of kilobases, two types of sub-loops become evident within the TADs.

What are they? Describe them briefly.

A

1) CTCF loop domains - subloops within a TAD formed by CTCF binding sites (can see dots)

2) Ordinary domains (compartmental) - subloops within a TAD not formed by CTCF binding sites.

64
Q

1) Briefly describe the hierarchical model.

2) When we increased our resolution of the Hi-C matrix, what did we realize was wrong regarding the hierarchical model?

A

1) The hierarchical model involves transitioning from a broad organizational context to a more detailed structure. Within the active and inactive compartments, we have TADs and within TADs, we have subloops that are either all active or inactive but not both.

2) As we started increasing the resolution, we realized that TADs are not exclusive to active or inactive chromatin. They tend to segregate between both types and thus we have sub-TADs that can span active and inactive domains.

65
Q

(T/F) Compartments (A/B) may contain either active or inactive chromatin, and this holds true for compartmental domains as well.

A

False!

Compartmental domains are not exclusive; they can consist of a combination of active and inactive chromatin (CTCF loop SPANS both active and inactive compartmental domains) or be exclusively composed of either active or inactive chromatin (CTCF loops ENCOMPASS active or inactive compartmental domains).

66
Q

When a CTCF loop spans both compartment domains, it _______ the interactions between the 2 domains.

When CTCF loops encompass active or inactive compartment domains, they ________ interactions between the 2 domains.

A

increases; decreases

67
Q

1) What would happen to the 3D structure of chromatin if CTCF was depleted?

2) What would happen to the 3D structure of chromatin if cohesin was depleted?

A

1) No dots on the matrix; no CTCF loops (no change in compartmental domains)

2) No CTCF or compartmental loops. This results in very strong interactions within the same compartment type and none between different compartments (checkerboard pattern!).

DNA loops are created by cohesin; the absence of cohesin means the absence of these loops. These loops play a vital role in bringing together both the active and inactive regions of DNA.

68
Q

What will happen to the 3D structure of chromatin if we prevent cohesin from dissociating (no WAPL)?

A

Loop size is determined by the LENGTH of TIME that the cohesin complex remains on the DNA.

If we prevent cohesin from falling off, it can go past the obstacle and create a LARGE LOOP.

When a section of DNA is part of a larger loop, it becomes less available for interactions with sequences further downstream along the DNA molecule, LOSING LONG-RANGE INTERACTIONS.

The DNA segment is more focused on local interactions within the loop itself, INCREASING SHORT-RANGE INTERACTIONS.

The checkerboard pattern is less pronounced because we are losing long-range interactions.

This could also increase interactions between active and inactive chromatin because the larger loops will most likely be composed of both.

69
Q

Chromosomes can undergo large structural changes such as translocation and inversion.

These occur in the ______ where there is loosely packed DNA.

Chromosomes with boundaries that are interacting are more likely to undergo _________.

When there is a dsDNA break and other chromosomes are not nearby, it is most likely going to be ________.

A

interphase

translocation

inversion

70
Q

Gene-rich chromosomes are located close to _____, while gene-poor chromosomes are located close to __________.

A

center; nuclear membrane

71
Q

How did CRISPR-GO (Genome Organizer) MOVE target GENES to the NUCLEAR ENVELOPE to assess the hypothesis of high transcriptional activity within the genome’s core region?

A

CRISPR-GO moves specific sequences from the core to the nuclear envelope.

CRISPR-GO couples CRISPR-dCas9 (catalytically inactive Cas9) system with nuclear-compartment-specific-proteins.

dCas9 is fused with one segment of a DIMERIZATION domain (ABI), and the dimerization process RELIES on a LIGAND. The counterpart dimerization domain (PYL1) is linked to EMERIN, a nuclear envelope protein.

Using CRISPR, one segment of the dimerization domain (ABI) binds to the genomic locus of interest.

When we add the ligand, ABA, there is dimerization of the domains in the nuclear envelope, moving the genomic locus of interest to the nuclear envelope from the centre.

72
Q

(T/F) There is a GFP bound to the protein Emerin to allow us to visualize its location.

A

True!

73
Q

1) What is the target gene for CRISPR-GO?

2) Where does the sgRNA bind?

3) What is TRE? What is needed for it to turn on?

4) What are CMV and CFP?

A

1) In CRISPR-GO, the target gene is an artificial locus that produces CFP. There is a LacO array, which is followed by a reporter construct composed of TRE, CMV and CFP.

2) The target for sgRNA is the LacO repeat array.

3) The TRE is a doxycycline (Dox)-inducible promoter, positioned adjacent to the LacO repeat array. The transactivator rtTA binds to TRE, activating it, and this binding is dependent on the presence of Dox.

4) Downstream of TRE, there is a promoter known as CMV (controlled by TRE) responsible for regulating CFP (a fluorescent protein) expression.

74
Q

CFP is only expressed in the CRISPR-GO construct when ___ is added.

A

Dox

75
Q

Fill in the blanks regarding the CRISPR-GO construct:

____ was added to the construct to move the artificial loci to the nuclear membrane.

____ was added to turn on the promoter and the fluorescence of CFP was measured.

There was an ______ in the fluorescence of CFP in the absence of ABA.

There was a ________ in the fluorescence of CFP in the presence of ABA.

A

ABA; Dox

Increase (more gene expressed); decrease (less gene expressed)

*repositioning the locus to the nuclear envelope leads to a decrease in Dox-induced CFP expression.

76
Q

There is a causal relationship between ________ and the ___________ of the genome in the nuclear space.

A

transcription; 3D arrangement

*not clear whether it is a cause or a consequence; do we see these gene expressions because chromatin is positioned in a specific way or is chromatin positioned in a specific way because of the way genes are regulated?

77
Q

Are genes evenly distributed within a genome?

A

No! Genes are unevenly distributed across and within chromosomes.

There are also GENE DESERTS, where there is a very low density of genes over regions as long as several Mb.

The distribution of protein-coding genes is uneven between chromosomes.

78
Q

Which one of the statements is false?

1) The folding of DNA is not random. It is in part dictated by gene distribution, where regions with more genes are inside the core and regions with fewer genes are in the periphery.

2) Large chromosomes have more genes.

A

2!

A large chromosome does not mean more genes! Chromosome 19 has more genes than some bigger chromosomes, such as chromosome 13.

79
Q

1) Where are lamins found and what do they do?

2) What is the Lamin B Receptor (LBR)?

3) What happens when you knock out the Lamin B Receptor (LBR) or LaminA/C?

A

1) Lamins are co-localized in the nuclear envelope and not found in the core. These (scaffolds) help anchor specific sequences of DNA at precise positions in the nuclear envelope, determining the positioning of chromosomes in 3D space.

2) Besides the lamins, other proteins are inserted into the nuclear envelope, such as the LBR. LBR binds to lamin B and to heterochromatin. LBR is integrated into the inner nuclear membrane, while its N-terminus extends toward the nuclear interior.

3) The absence of Lamin B Receptor (LBR) or LaminA/C leads to the accumulation of heterochromatin in the nuclear core. This shows that there are structural proteins that organize the positioning of heterochromatin.

80
Q

What are LADs?

A

Lamina-associated Domains (LADs) are regions of condensed chromatin (no genes, inactive genes, constitutive heterochromatin) that are BOUND by the nuclear LAMINA.

81
Q

How did researchers determine that there are up to 1500 LADs per genome?

A

In the DamID experiment, Dam (a DNA adenine methylase) was fused with Emerin, anchoring the enzyme directly at the nuclear envelope.

They aimed to identify methylated adenines in the DNA sequence. Sequences in proximity to the enzyme, at the periphery, were more likely to be methylated, while sequences oriented towards the core were too distant for methylation.

Clusters of methylated adenines revealed up to 1500 Lamina-Associated Domains (LADs) in a genome, ranging from 10 kbp to 10 Mbp in length.

82
Q

(T/F) LADs contain a heterochromatin signature while the domains in between (loops extending away from the periphery) tend to correlate with transcription activity.

A

True!

83
Q

What is a transcription factory?

Briefly describe its structure.

A

A transcription factory is a BIOMOLECULAR CONDENSATE. It is a highly specialized area that contains many copies (up to 30) of stationary RNA POL II and TFs. It is quite big, ranging from 50 to 100 nm in size.

Structure: RNA polymerases are clustered in a region. Multiple chromosomes are folding so genes that are regulated similarly are clustered together in a single transcription factory.

84
Q

What are the two types of LADs?

A

cLAD (constitutive LAD): always forms with the same sequence at the same spot across different cell types.

fLAD (facultative LAD): forms a LAD in some cell types, while forming an interLAD in other cell types.

85
Q

1) What are NADs?

2) What kind of chromosomes are associated with the nucleolus?

A

1) Sometimes some regions of the chromosome (typically heterochromatin regions) will associate with the nucleolus rather than the nuclear lamina, forming NADs (nucleolus-associating domains). There can be loops formed in between NADs (like interLADs) that represent the active region of the chromatin.

2) Mostly the chromosomes that contain rRNA genes will be associated with the nucleolus as it is responsible for the transcription and biogenesis of rRNAs. The loops formed within NADs are DNA that encode for rRNAs.

86
Q

1) While a LAD has a ____ gene density, inter-LAD has a _____ gene density.

2) While a LAD has a ____ gene expression, inter-LAD has a _____ gene expression.

3) While a LAD is composed of the ____ compartment, inter-LAD is composed of the _____ compartment.

4) While the replication timing is ____ for LADs, it is _____ for inter-LADs.

5) While the A/T content is ____ for LADs, it is ____ for inter-LADs.

A

1) low; high

2) low; high

3) B; A

4) late; early

5) high; low

*A/T content is a hallmark of heterochromatin

87
Q

Transcription is controlled by _________ regions which are bound by ________ _____.

Most eukaryotic genes are associated with ______ control elements (segments of non-coding DNA that help regulate transcription).

Every gene contains a _____ ____ and distal _______/______ regions.

A

regulatory; transcription factors

multiple

core promoter; enhancer/repressor

88
Q

1) What are the components that can make up the core promoter?

2) What is the overarching goal of the core promoter?

3) Are the elements of the core promoter orientation specific?

4) How many RNA polymerases are found in humans, and which one transcribes mRNA?

A

1) The core promoter can consist of BRE (TFIIB recognition element -37 to -32), TATA box (-31 to -26), Initiator (-2 to +4), and Downstream Promoter Element (DPE +28 to +32), either individually or in various combinations.

2) The goal of the core promoter is to bind transcription factors, facilitating the recruitment of RNA polymerase at the +1 site (first nucleotide transcribed).

3) Yes, all elements of the core promoter are orientation-specific. A 180˚ inversion of the TATA box, for example, would result in the absence of gene transcription.

4) There are three RNA polymerases in humans, and RNA polymerase II is responsible for transcribing mRNA.

89
Q

1) What binds to the TATA box and what does this do?

2) In genes lacking a TATA box, what element does TFIID bind to?

3) Where does transcription typically start from the TATA box?

4) Why is the spacing between the Initiator and DPE important for transcription initiation?

A

1) TBP, a component of TFIID, binds to the TATA box in the core promoter. This binding recruits other transcription factors (TFs) and RNA polymerase II, initiating transcription. General transcription factors, not tissue-specific, bind to the TATA box.

2) In genes without a TATA box, TFIID binds to the Downstream Promoter Element (DPE).

3) Transcription starts 25 base pairs downstream of the TATA box.

4) Proper spacing between the Initiator and Downstream Promoter Element (DPE) is essential for accurate transcription initiation.

90
Q

(T/F) Some genes have core promoters that include proximal control elements (100bp upstream) called the GC box or the CAAT box. However, these seem to not be important for transcription.

A

False!

Some genes have core promoters that include proximal control elements (100bp upstream) called the GC box or the CAAT box.

For genes possessing these elements, they appear to play a crucial role in achieving the complete basal level of transcription.

91
Q

Distal control elements far from the TSS can be found in both _______ or ______ of the TSS.

While _______ decrease transcription, _______ increase transcription.

Unlike the core promoter elements, the elements of distal control can work in _____ orientations (inverting has ___ effect).

These elements bind ____-specific TFs.

A

upstream; downstream

silencers; enhancers

both; no

tissue

92
Q

(T/F) TATA box helps position the active site of the RNA polymerase over the +1 start site.

A

True!

93
Q

What are the four ways distal elements can influence the expression of genes?

A

1) One gene, multiple enhancers, one tissue - all enhancers work together, resulting in higher gene expression (synergistic but not additive)

2) One gene, multiple enhancers, more tissues - each enhancer targets a specific cell type!

3) Gene competition for a shared enhancer: winner takes all!

4) Gene competition for a shared enhancer: we are all winners! Transcription levels are lower as sharing one enhancer cannot reach the max rate.

*same for repressors

94
Q

In a winner-takes-all scenario where only one gene is controlled at a time, what determines the winner?

A

1) Positioning in 3D space
2) Which TF binds to enhancer/repressor

95
Q

List the statements as True or False:

1) We possess approximately 100,000 to 400,000 regulatory distal elements while having only 20,000 protein-coding genes. This highlights the complexity of eukaryotic gene expression.

2) Gene expression involves multiple levels at which we can impact the extent of gene expression. Transcription regulation is one of the later ones.

3) The link between nuclear architecture and transcriptional regulation occurs through the proximity established between enhancers and genes (done by cohesin and CTCFs).

A

1) True!

2) False. Transcription regulation is the FIRST LAYER.

3) True!

96
Q

What are insulator proteins?

Give an example of one.

A

Insulators prevent transcription by blocking distal enhancers from influencing promoters.

CTCF is an insulator protein as it sequesters chromatin in specific domains (creating physical boundaries in the genome), restricting distal enhancers from interacting with core promoters.

Example: adding a CTCF binding site between an enhancer and a promoter results in no transcription.

97
Q

How can CRISPR-Cas9 allow us to explore the link between chromatin structure and gene expression?

A

CRISPR-cas9 can induce deletions of CTCF DNA binding sites, changing TADs and sub-loops.

By disrupting specific sites, we can generate larger loops, bringing together sequences that were previously kept separate. Perhaps some distal elements that were kept away are now brought close to core promoters. This can have CONSEQUENCES for gene expression and RNA synthesis (aberrant transcription).

This allows us to explore the link between chromatin structure and gene expression.

98
Q

TADs are relatively stable (form roughly in the same area across cell types), with frequent changes occurring within the sub-loops within them. These sub-loop alterations play a role in the regulation (fine tuning) of the expression of specific genes.

Why are TADs relatively stable?

A

TADs are stable because they are integral for fundamental gene expression.

This stability is crucial because a precise pattern of gene expression must occur for the organism to develop properly.

99
Q

What causes transcription to occur in bursts; periods of high activity followed by inactivity?

A

Initially, the distal element and core promoter are spatially distant from each other. This separation is overcome when the cohesin complex randomly binds somewhere between the two elements, causing the extrusion of a loop. As a result, the distal enhancer is brought into proximity with the promoter, facilitating a transcription burst.

As the loop continues to extrude, a significant distance is created between the two elements, resulting in a pause in transcription. The cohesin complex dissociates, binds again, and once more brings the enhancer and core promoter together, initiating another transcription burst.

100
Q

In summary, the interphase structure influences the ________ of the nucleus.

A

Function (gene expression)

101
Q

What is pervasive transcription?

A

Pervasive transcription refers to the widespread occurrence of transcription across the genome, encompassing not only protein-coding genes but also non-coding regions.

75% or more of the genome is transcribed!

*The ENCODE consortium was looking for functional elements and found most of the genome to be transcribed.

102
Q

(T/F) We have confirmed that pervasive transcription is mostly noise.

A

False!

The answer lies in between. There are some lncRNAs (transcribed but not translated) that have regulatory and structural roles.

103
Q

Define transcriptome.

A

A set of RNA transcripts expressed in a given cell type at a specific time or under particular conditions.

*ALL RNA transcripts!

104
Q

1) Which RNA is most abundant by numbers?

2) Which RNA is most abundant by mass?

3) What is the total number of RNA transcripts in a given cell at a given time?

A

1) tRNA

2) rRNA (90% of mass)

3) 10 billion!!! most are tRNAs and rRNAs that are important for the protein-coding machinery.

105
Q

1) What proportion of the transcribed RNA consists of non-coding RNAs?

2) Besides rRNA and tRNA (housekeeping), what are the two types of ncRNAs?

A

1) 95% of transcribed RNA!

2) Short non-coding RNAs (<200 NTs) and long non-coding RNAs (>200 NTs). The functions of these are not clear.

106
Q

Match the short non-coding RNAs to their descriptions:

1) Small nuclear RNAs (snRNA)
2) Small nucleolar RNAs (snoRNA)
3) MicroRNAs (miRNA)
4) Piwi-interacting RNAs (piRNA)

A) located in the nucleoli and aid in chemical modifications (methylation/pseudouridylation) of rRNA by guiding the appropriate enzymes through binding to the rRNA.

B) similar to miRNAs but function in the germline to keep retrotransposons silenced. Engage with the 5’ end of the transcript, recruiting histone remodellers and facilitating methylation in regions near the promoter for transcriptional SILENCING. Alternatively, they can cleave mRNA and instigate transcript degradation for POST-transcriptional SILENCING.

C) attach to proteins to form snRNPs (small nuclear ribonucleoprotein complexes) to form the spliceosome.

D) located in the cytoplasm and cause RNA silencing.

A

Small nuclear RNAs (snRNA): attach to proteins to form snRNPs (small nuclear ribonucleoprotein complexes) to form the spliceosome.

Small nucleolar RNAs (snoRNA): located in the nucleoli and aid in chemical modifications (methylation/pseudouridylation) of rRNA by guiding the appropriate enzymes through binding to the rRNA.

MicroRNAs (miRNA): located in the cytoplasm and cause RNA silencing.

Piwi-interacting RNAs (piRNA): similar to miRNAs but function in the germline to keep retrotransposons silenced. Engage with the 5’ end of the transcript, recruiting histone remodellers and facilitating methylation in regions near the promoter for transcriptional SILENCING. Alternatively, they can cleave mRNA and instigate transcript degradation for POST-transcriptional SILENCING.

107
Q

Primary miRNAs are cleaved by ______ and exported to the _____.

These are processed into two miRNAs, each binding to _____, creating a mature miRNA.

Depending on where the miRNAs bind on the target mRNA, there are two outcomes: target mRNA ______ or _______ ______.

A

nucleases; cytoplasm

RISC

cleavage; translational repression

*either way, there is a decrease in protein synthesis.

108
Q

What are the two criteria to be lncRNAs?

A

1) RNA has to be longer than 200 NTs
2) RNA does not get translated

109
Q

Where can we find lncRNAs in our genome?

A

EVERYWHERE

1) in intergenic regions
2) in both sense/antisense strand
3) in introns
4) they can be overlapping with protein-coding exons
5) can be transcribed and alternatively spliced like mRNA; can contain introns

110
Q

What is a hypothesis for why we have so many lncRNAs?

A

Every single promoter inside the genome is bi-directional.

It can transcribe in one direction for the normal gene it encodes, as well as in the opposite direction for a long non-coding RNA.

111
Q

How can lncRNAs influence chromatin structure?

A

1) lncRNAs bind to DNA through RNA binding proteins, acting as scaffold molecules to arrange the chromatin structure.

2) lncRNAs recruit and stabilize histone modifiers (methyltransferases/acetyltransferases)

112
Q

How do lncRNAs regulate gene expression?

A
  1. Transcription of a lncRNA STIMULATES/REPRESSES transcription of a NEIGHBOURING gene.
  2. lncRNAs bind to one DNA strand, displacing the other one, ultimately creating an R-loop that regulates gene expression.
  3. lncRNAs bind to a TF, helping it bind to a promoter, or act as a decoy for TFs so that the TFs are unable to bind to promoters.
  4. lncRNAs act as a decoy for miRNAs; the miRNAs cannot bind to their target mRNA and silence gene expression.
  5. lncRNAs can enhance or prevent mRNA decay, influencing stability of mRNAs
  6. lncRNAs bind to RNA-binding proteins and affect their cellular location.
  7. lncRNAs bind to TFs and affect their cellular location (move from cytosol to nucleus or vice versa).
113
Q

Which one of the statements is false?

1) A lot of lncRNA functions have to do with the regulation of gene expression (directly/indirectly).

2) All lncRNAs are capable of fulfilling all functions.

3) Paraspeckles sequester lncRNA, regulating their functions.

A

2!

Not all lncRNAs can fulfil every single function.

114
Q

Fill in the blanks regarding a lncRNA acting as a TF decoy:

Four apoptotic genes are under regulation of the _______ TF.

______ lncRNA acts as a decoy by binding to that TF, diverting it away from the apoptotic genes.

When ______ is absent, genes are expressed (apoptosis).

When ______ is present, genes are not expressed (non-apoptotic).

A

NF-YA

PANDA

*maybe PANDA upregulated in cancer cells.

115
Q

Which statement is false regarding eRNAs?

1) eRNAs are a subtype of lncRNAs.

2) 50k different eRNAs are transcribed in a given cell type and are highly functionally relevant.

3) There are two different types of eRNAs: 2D-eRNA and 1D-eRNA.

A

2!

There are over 50k different eRNAs transcribed in a given cell type, but we ARE NOT sure if they are functionally relevant.

116
Q

Distinguish 2D-eRNA from 1D-eRNA in terms of:

1) Directionality

2) Splicing

3) Polyadenylation

4) Cis or Trans

A

Directionality: 2D-eRNA are SHORT, BIDIRECTIONAL (transcribed in either direction; both strands), while 1D-eRNA are LONG UNIDIRECTIONAL.

Splicing: 2D-eRNA are UNSPLICED, while 1D-eRNA are SPLICED.

Polyadenylation: 2D-eRNA are NON-POLYADENYLATED, while 1D-eRNA are POLYADENYLATED.

Cis or Trans: 2D-eRNA are CIS (act on the same chromosome they were transcribed from) while 1D-eRNA are TRANS (affect other chromosomes)

117
Q

What are the two main functions of the eRNAs? How do they achieve them?

A

1) eRNAs contribute to gene control by ALTERING the CHROMATIN environment.
- help cohesin complexes associate, bringing enhancers and promoters together (cis or trans)

- regulate the chromatin landscape by recruiting and stabilizing chromatin modifiers (gene control).

2) Interact with transcriptional regulators to control gene expression.
- stabilize the binding of TFs

- stabilize RNA polymerase; NELF prevents RNA polymerase from passing into elongation mode. eRNAs bind to NELF so it can dissociate.
118
Q

(T/F) Transcriptome refers to the mRNA transcripts in a given cell.

A

False!

Usually, we refer to transcriptome as the mRNA portion. However, this is incorrect because mRNA only makes up 5% of RNA in a cell. It is small but complex!

119
Q

There are around 10,000-15,000 genes expressed in a single tissue, but 100K mRNAs/proteins. How is this possible?

A

An individual gene can give rise to several mRNAs due to ALTERNATIVE SPLICING and MULTIPLE start/end points.

120
Q

Why do we want to study the transcriptome?

A

1) Gives information on how transcriptome changes from cell to cell.

2) Gives information on the effects of our treatments (are specific genes upregulated/downregulated?)

3) GENETIC SIGNATURES! Some mRNAs may be expressed at stage 4 of some kind of cancer; these can be used clinically.

121
Q

Which one is not a key goal of transcriptomics (study of transcriptome)?

1) Catalogue all species of transcript (mRNA, ncRNA, etc.)

2) Determine the transcriptional structure (start site, 5’ and 3’ ends, splice patterns, etc.)

3) Identify the physical characteristics of individual RNA molecules, such as colour and shape.

4) Quantify the changing expression levels of each transcript.

A

3!

122
Q

How does REAL-TIME PCR (RT-qPCR) differ from regular PCR?

A

RT-qPCR is the same as regular PCR, but it uses a FLUORESCENT molecule that allows us to TRACK how much DNA is being produced with each cycle of amplification.

It is used to study mRNA abundance.

123
Q

Match the steps of RT-qPCR to the proper order:

1) Step 1
2) Step 2
3) Step 3

A) Measurement of fluorescent signal in real-time (proportional to the amount of DNA present).

B) Reverse transcription of mRNA template into cDNA (dsDNA is needed for PCR).

C) Amplification of cDNA using a primer pair complementary to gene of interest.

A

Step 1: Reverse transcription of mRNA template into cDNA (dsDNA is needed for PCR).

Step 2: Amplification of cDNA using a primer pair complementary to gene of interest.

Step 3: Measurement of fluorescent signal in real-time (proportional to the amount of DNA present).

124
Q

In qPCR, the ______ the cycle number, the less abundant the starting material.

The _____ the cycle number, the more abundant the starting material.

A

higher

lower

*lower cycle #: enough DNA to get detected sooner = more starting material
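
The inverse relationship between cycle number and starting material can be sketched numerically. This is only a minimal illustration, assuming ideal amplification in which the product doubles every cycle (100% efficiency); the function name and the cycle values are hypothetical and not from the lecture.

```python
def relative_starting_amount(ct_sample: float, ct_reference: float) -> float:
    """Fold difference in starting template between a sample and a reference.

    Assumption: perfect doubling each cycle, so every extra cycle needed to
    reach the detection threshold means half as much starting template.
    """
    return 2.0 ** (ct_reference - ct_sample)


# Detected 3 cycles EARLIER than the reference: 2**3 = 8x more template.
print(relative_starting_amount(ct_sample=20, ct_reference=23))  # 8.0

# Detected 2 cycles LATER than the reference: a quarter as much template.
print(relative_starting_amount(ct_sample=25, ct_reference=23))  # 0.25
```

Real assays rarely reach perfect efficiency, so the base of 2 is an idealization.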

125
Q

DNA binding dyes (SYBR green) can be used in RT-qPCR.

1) How much does the fluorescent signal increase when bound to dsDNA?

2) What are the advantages?

3) What are the disadvantages?

A

1) Fluorescent signal increases 1,000x when bound to dsDNA.

2) Simple primer design.

3) SYBR green is a NON-SPECIFIC dye: it binds to all dsDNA and LACKS SPECIFICITY. Because of this, MULTIPLEXING (multiple genes in a well) is not possible. Each gene must go in a different well; a 96-well plate would only be enough for a handful of genes.

126
Q

To overcome the lack of multiplex (multiple genes in a well), RT-qPCR can use PROBE-based detection.

What are the two types?

A

1) Hydrolysis (Taqman) probes

2) Dual hybridization probes

127
Q

Describe how hydrolysis (Taqman) probes work and their advantages + disadvantages.

A

Hydrolysis (Taqman) probes are SEQUENCE-SPECIFIC PROBES. Each probe carries a fluorescent reporter and a quencher that suppresses the reporter's signal while the two are in proximity. These probes exploit the 5’->3’ exonuclease activity of DNA polymerase: as the polymerase extends the primer, it cleaves the probe, freeing the reporter from the quencher so that it fluoresces.

Advantages: high specificity, multiplexing (less likely to have non-specific amplification)

Disadvantages: assay design & initial cost

128
Q

Describe how Dual Hybridization probes work and their advantages + disadvantages.

A

This method uses TWO sequence-specific primers and TWO sequence-specific oligo probes (donor and acceptor) that bind adjacent sequences during ANNEALING in a head-to-tail manner. It is based on Fluorescence Resonance Energy Transfer (FRET); excite at a wavelength of the donor dye and monitor at the emission wavelength of the acceptor dye.

Advantages: the most specific way to detect only the gene of interest! There are 4 sequence-specific molecules, so there is a very low probability that they will all bind non-specifically and give a false signal.

Disadvantages: assay design & initial cost

129
Q

What is the main disadvantage of RT-qPCR?

A

It provides a minuscule snapshot of mRNA expression in the cell.

It looks at ONE GENE AT A TIME as you need to select each gene you wanna look at.

Each time you want to investigate a different gene, you must create new primers and rerun the experiment. Therefore, if you intend to study the entire cell and all its mRNA, you’ll need to design a primer for each of them.

It works best for the question: does my treatment increase gene A?

130
Q

To get a larger snapshot of mRNA expression in the cell, MICROARRAYS can be used.

1) What are they?

2) Briefly describe the steps.

3) How does it differ from CGH used for detecting abnormalities?

A

1) It is the same concept as CGH. Each spot contains multiple identical strands of DNA and each spot represents a gene (mRNA). Thousands of spots are arrayed in orderly rows and columns on a solid surface.

2) First, you extract mRNA from two samples (treated vs untreated), and you convert it to either RNA or cDNA (cDNA most common). Then, you label each sample with a different fluorescent molecule and hybridize to microarray.

3) It is the same idea as CGH but instead of using genomic DNA, we are using mRNA converted into cDNA.

131
Q

In microarrays, fluorescent ratio changes of less than or equal to _____ or greater than or equal to _____ are considered significant.

A

0.5; 2.0

*less than 0.5=decrease
*more than 2.0=increase
*in between=no change
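
The cut-offs above can be written as a small decision rule. This is a hedged sketch: the function name is hypothetical, and the thresholds are simply the 0.5 and 2.0 values stated on this card (treated/untreated fluorescence ratio).

```python
def classify_ratio(ratio: float) -> str:
    """Classify a treated/untreated fluorescence ratio using the card's cut-offs."""
    if ratio <= 0.5:
        return "decrease"
    if ratio >= 2.0:
        return "increase"
    return "no change"


for r in (0.3, 0.5, 1.2, 2.0, 3.7):
    print(r, classify_ratio(r))
```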

132
Q

What are the main advantages and disadvantages of using microarrays to study the transcriptome?

A

Advantages: relatively inexpensive; larger snapshot of the genome

Disadvantages:

Arrays tend to get saturated by high-abundance transcripts (less abundant genes can’t be detected because the signal gets lost in the background).

It is a BIASED method as it only picks up the transcripts that have corresponding probes on the array (only detects what we put). It can not be used to detect novel mRNAs.

133
Q

What is a housekeeping gene? How is it used in microarrays?

A

Housekeeping gene: gene whose expression is not affected by treatment!

This gene is used to normalize the expression of genes tested. Normalization is important to account for experimental variations (such as differences in RNA quantity among samples).

Treated: EGFR (3650) and Actin (2000)
Untreated: EGFR (757) and Actin (1000)

First, normalize each gene individually:
757/1000 = 0.757 (untreated)
3650/2000 = 1.825 (treated)

Then compare treated with untreated:
treated/untreated
1.825/0.757 = 2.41

Conclusion: treatment increases EGFR by 2.41x!
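
The same normalization can be checked in a few lines of code. This is a minimal sketch using the EGFR and Actin signal values from this card; the function and argument names are hypothetical.

```python
def normalized_fold_change(gene_treated: float, hk_treated: float,
                           gene_untreated: float, hk_untreated: float) -> float:
    """Treated/untreated ratio after dividing each signal by its housekeeping gene."""
    treated = gene_treated / hk_treated        # 3650 / 2000 = 1.825
    untreated = gene_untreated / hk_untreated  # 757 / 1000 = 0.757
    return treated / untreated


fold = normalized_fold_change(3650, 2000, 757, 1000)
print(round(fold, 2))  # 2.41
```

Without the Actin correction the raw ratio would be 3650/757, roughly 4.8x, which overstates the effect because the treated sample simply contained about twice as much RNA.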

134
Q

(T/F) RNA-seq, characterized by its simplicity and ultra-high throughput, mirrors Next-Generation Sequencing (NGS), differing only in the sequencing target—RNA instead of DNA. Any NGS platform is adaptable for RNA-seq applications.

A

True!

135
Q

Briefly describe the steps of RNA-seq.

A

Step 1: Prepare a library of cDNA fragments.
- Use total or fractionated RNA
- Fragment the RNA/DNA -> convert to cDNA (dsDNA)
- Ligate adapters (based on NGS method of choice)

Step 2: DNA sequencing (using any NGS method)
- Produce >50 million short reads
- Align the sequence reads to a REFERENCE GENOME

Step 3: Quantitation (convert read into quantitative info to compare the abundance of a gene)
- COUNT the number of reads that map to each transcript (aka TAG COUNTS).
- The probability of a read coming from a transcript is proportional to its relative ABUNDANCE (sequencing depth) and LENGTH.

136
Q

What are the two things to consider when counting how many reads map to a gene?

A
  1. How long the gene is (longer genes = more reads because more fragments)
  2. Depth of sequencing (can not sequence 5 million fragments for control and 50 million fragments for treated)
137
Q

What are the three locations reads can map to and what do they tell us?

A

Reads can map to:
1. Junction reads (one half of a read matches to one exon and the other matches to another; tells us intron was spliced)
2. Exonic reads
3. Poly(A)end reads (transcriptional stop sites)

*depending on how reads map to the reference genome, they can give us information on the transcription start site and which two exons are connected and how (precise location of transcription boundaries).

138
Q

How can RNA-seq help us determine novel/alternative splice sites?

A

When we find a read that aligns with both an exon and a portion of an intron, it suggests that this intronic region was not spliced out in the mature RNA.

This could be a known gene with alternative splicing patterns or it could reveal a novel splice site.
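
How a read maps relative to annotated exons can be sketched as a toy classifier. This is only an illustration under simplifying assumptions: exons and aligned read blocks are given as half-open (start, end) coordinates, a spliced read is represented as two or more blocks, and all names and coordinates are hypothetical. Real aligners report this information differently.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates


def inside_any(block: Interval, exons: List[Interval]) -> bool:
    """True if the aligned block lies entirely within one annotated exon."""
    return any(s <= block[0] and block[1] <= e for s, e in exons)


def overlaps_any(block: Interval, exons: List[Interval]) -> bool:
    """True if the aligned block shares at least one base with an exon."""
    return any(block[0] < e and s < block[1] for s, e in exons)


def classify_read(blocks: List[Interval], exons: List[Interval]) -> str:
    if len(blocks) > 1 and all(inside_any(b, exons) for b in blocks):
        return "junction"      # split across exons: the intron was spliced out
    if len(blocks) == 1 and inside_any(blocks[0], exons):
        return "exonic"
    if any(overlaps_any(b, exons) for b in blocks):
        return "exon-intron"   # runs from an exon into intronic sequence
    return "other"


exons = [(100, 200), (300, 400)]
print(classify_read([(150, 190)], exons))              # exonic
print(classify_read([(180, 200), (300, 320)], exons))  # junction
print(classify_read([(180, 220)], exons))              # exon-intron
```

An "exon-intron" read is the case described on this card: the intronic stretch was retained, pointing to alternative splicing or a novel splice site.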

139
Q

1) Does RNA-seq make use of housekeeping genes like in microarrays?

2) What is RPKM? How do you calculate it?

A

1) RNA-seq does not use housekeeping genes! Normalization is worked into the way we count how many reads map.

2) RPKM (Reads Per Kilobase of transcript per Million mapped reads) is a normalized expression value used to determine differential expression; it is calculated FOR EVERY GENE.

RPKM = # raw reads mapped to transcript / [exon length (kb) x (total # reads mapped in sample / 1,000,000)]

*exon length (kb): due to initial fragmentation step, longer transcripts will tend to produce more fragments than shorter transcripts.

*total # reads mapped in sample: normalize for library size. two replicates with different total library size would produce proportionally different tag counts for the same gene.
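
The RPKM calculation can be sketched as follows. This is a minimal example assuming exon length is supplied in base pairs and converted to kb; the function name and the read counts are hypothetical.

```python
def rpkm(reads_mapped_to_transcript: int, exon_length_bp: int,
         total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    length_kb = exon_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return reads_mapped_to_transcript / (length_kb * millions_of_reads)


# Same 2 kb gene in two libraries of different depth (5 M vs 50 M reads).
# Raw counts differ 10-fold, but the RPKM values agree.
print(rpkm(500, 2_000, 5_000_000))     # 50.0
print(rpkm(5_000, 2_000, 50_000_000))  # 50.0
```

This is why raw tag counts from libraries of different sizes cannot be compared directly.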

140
Q

Which statements are true regarding RNA-seq?

1) Provides information about how 2 exons are connected (long reads) or about the connectivity between multiple exons (short reads)

2) Limited to mRNAs.

3) Can reveal sequence variations (ie. SNPs) in the transcribed regions.

4) Provide information about differential gene expression (RPKM).

A

3 & 4!

1) Provides information about how 2 exons are connected (short reads) or about the connectivity between multiple exons (longer reads)

2) Can detect non-coding RNAs, not limited to mRNAs.

141
Q

What are the advantages of choosing RNA-seq over microarray?

A

1) ultra-high throughput
2) can detect novel RNA transcripts; not biased
3) can detect alternative/novel splice sites
4) quantitative accuracy - microarrays are prone to saturation effects
5) can detect non-coding RNAs
6) can detect sequence variations (SNPs)

142
Q

What are the advantages of choosing microarray over RNA-seq?

A

1) cheaper
2) library is easier to prepare
3) RNA-seq requires more analysis