Gene Regulation Flashcards

1
Q

Intron function

A
  1. Sources of non-coding RNA
  2. Carriers of transcriptional regulatory elements
  3. Contributors to alternative splicing
  4. Enhancers of meiotic crossing over within coding sequences, thus driving evolution.
  5. Signals for mRNA export from the nucleus and for nonsense-mediated decay.
2
Q

What are repetitive sequences?

A

DNA fragments that are present in multiple copies in the genome. They account for 47% of the genome.

3
Q

Characterise segmental duplications (also known as low-copy repeats)

A
  1. 1-400 kb in length
  2. Present at two or more sites within the genome, with >90% sequence similarity.
  3. Associated with chromosomal instability or evolutionary rearrangement
  4. Implicated in >25 recurrent genomic disorders.
4
Q

In which ways can segmental duplications cause problems?

A
  1. Through non-allelic homologous recombination (NAHR), segmental duplications can cause:
    - deletion and duplication
    - translocation
    - inversion
5
Q

Characterise promoters

A
  1. Regulatory region of DNA located upstream of a gene.
  2. Binds transcription factors
  3. Allows the subsequent coordination of components of the transcription initiation complex.
  4. Facilitates recruitment of RNA polymerase II and initiation of transcription.
6
Q

What makes up the core and proximal promoter?

A

Core- TATA box
Proximal - CAAT box and GC box

7
Q

Characterise the TATA box

A
  1. Consensus TATAAA
  2. A sequence usually located about 25 bp upstream of the transcription start site (position -25).
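A minimal sketch of how the consensus above could be located in a sequence. It assumes exact matches to TATAAA only and that the input is the stretch immediately upstream of the start point; the function name `find_tata_boxes` is made up for illustration, and real promoters often deviate from the consensus.

```python
import re

TATA_CONSENSUS = "TATAAA"  # consensus from the card above


def find_tata_boxes(upstream: str) -> list[int]:
    """Return positions of exact TATAAA matches relative to the start point.

    `upstream` is the sequence directly 5' of the start point, so its last
    base is position -1 and earlier bases have more negative positions.
    """
    seq = upstream.upper()
    n = len(seq)
    # A lookahead is used so that overlapping matches would also be reported.
    return [m.start() - n for m in re.finditer(f"(?={TATA_CONSENSUS})", seq)]


# 40 bases of upstream sequence with a TATA box starting at position -25
example = "G" * 15 + "TATAAA" + "G" * 19
print(find_tata_boxes(example))  # → [-25]
```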
8
Q

Characterise the CAAT box

A
  1. A consensus sequence located about 80 bp upstream of the TSS (position -80).
  2. Responsible for promoter efficiency
9
Q

Characterise the GC box

A
  1. A consensus sequence rich in guanine and cytosine.
  2. Usually found in multiple copies in the promoter region, normally surrounding the TATA box.
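As a rough illustration of "rich in guanine and cytosine", the sketch below scans a sequence for short windows with a high G+C fraction. The window size and threshold are arbitrary assumptions chosen for the example, not published cut-offs, and both function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_rich_windows(seq: str, window: int = 6, threshold: float = 0.8) -> list[int]:
    """Start indices (0-based) of windows whose GC fraction is >= threshold.

    `window` and `threshold` are illustrative defaults only.
    """
    return [
        i
        for i in range(len(seq) - window + 1)
        if gc_fraction(seq[i : i + window]) >= threshold
    ]


# Only the window made up entirely of G and C passes a threshold of 1.0
print(gc_rich_windows("ATATGGGCGGATAT", window=6, threshold=1.0))  # → [4]
```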
10
Q

What are enhancers?

A
  1. Sequences that increase the rate of transcription by interacting with trans-acting factors (activators).
  2. Enhancers do not need to be close to their target gene.
11
Q

What are simple and complex enhancers

A

Simple- bound by one transcription factor
Complex- bound by multiple transcription factors

12
Q

What are silencers

A

DNA sequences located upstream or downstream of the promoter region that bind repressor proteins.
The silencer-repressor interaction reduces the rate of transcription, or blocks it, by inhibiting the binding of an activator to the enhancer.

13
Q

Characterise microRNA

A
  1. Play a key role in the regulation of gene expression
  2. Acting at the post-transcriptional level, these molecules may fine-tune the expression of as much as 30% of all mammalian protein-coding genes.
  3. Mature microRNAs are short, single-stranded RNA molecules approximately 22 nucleotides in length (range 22-25).
14
Q

What do miRNA target prediction tools look for?

A
  1. The seed region spans nucleotides 2 to 8 of the miRNA. Most prediction algorithms include the seed region as a key biological element for miRNA-target prediction.
  2. The seed region of a miRNA is a highly conserved segment that makes it possible to classify the miRNA within families and species.
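A minimal sketch of a seed-based search, assuming the simplest rule: a candidate site is a perfect reverse-complement match to nucleotides 2-8 of the miRNA. Real prediction tools add further criteria (conservation, site context and so on); the function names and the example sequences here are illustrative only.

```python
# Watson-Crick complement for RNA bases
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed(mirna: str) -> str:
    """Nucleotides 2-8 of the miRNA (5'->3'), written as RNA."""
    return mirna.upper().replace("T", "U")[1:8]


def seed_match_sites(mirna: str, utr: str) -> list[int]:
    """0-based start positions in `utr` that perfectly pair with the seed."""
    # The target site is the reverse complement of the seed.
    site = seed(mirna).translate(COMPLEMENT)[::-1]
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # example 22-nt sequence
print(seed(mirna))                               # → GAGGUAG
print(seed_match_sites(mirna, "AAACUACCUCAAA"))  # → [3]
```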
15
Q

Delivery systems for delivering miRNA into the body.

A
  1. Viral vectors
  2. Poly-particles
  3. Neutral lipid emulsion
  4. EnGeneIC Delivery Vehicle (EDV) nanocells
  5. Dendrimers
  6. Chitosan
  7. N-acetyl-D-galactosamine
16
Q

Key challenges that arise when miRNA is delivered to the body

A
  1. Immunostimulatory effects
  2. Toxicity
  3. Endosomal escape
  4. Targeting the correct disease site
17
Q

Describe how a microarray is used in RNA detection

A
  1. Microscopic spots containing the DNA sequences of interest are attached to a solid surface such as a microscope slide.
  2. cDNA, labelled with fluorescent markers, is washed over the slide.
  3. The cDNA hybridises to complementary strands on the slide.
  4. Higher levels of fluorescence indicate that the corresponding RNA is present.
18
Q

Key RNA-seq methods (7)

A
  1. mRNA sequencing
  2. Targeted RNA sequencing
  3. Ultra-low-input and single-cell RNA-seq (separate cells into types before sequencing)
  4. RNA exome capture sequencing
  5. Total RNA sequencing
  6. Small RNA sequencing
  7. Ribosome profiling
19
Q

RNA seq (mRNA) process

A
  1. Deplete ribosomal RNA (it would otherwise overwhelm the sample)
  2. Poly(A)+ RNA is captured with beads carrying an oligo(dT) (TTTT) tail
  3. RNA is fragmented and primed
  4. First-strand cDNA is synthesised
  5. Second-strand cDNA is synthesised
  6. 3’ ends are adenylated (just one A is added) and 5’ ends are repaired
  7. DNA sequencing adapters are ligated
  8. Ligated fragments are PCR amplified
20
Q

Applications of RNA-seq (6)

A
  1. Determine which genes are expressed in a tissue
  2. Determine differential expression of genes
  3. Determine differential post-transcriptional regulation of genes
  4. Transcript assembly
  5. Alternative splicing quantification
  6. Look for novel genes or transcripts
21
Q

Advantages of RNA-seq (6)

A
  1. Genome-wide
  2. Does not require an existing genomic sequence
  3. Very low background noise: reads can be mapped with high confidence or discarded for poor quality.
  4. Resolution: 1 bp
  5. High-throughput: faster than Sanger sequencing
  6. Cost: 1000x cheaper than Sanger sequencing
22
Q

The challenges of studying microRNAs are two-fold

A
  1. MicroRNAs are very short, so traditional DNA-based methods are not sensitive enough to detect these sequences with any reliability.
  2. Closely related microRNA family members differ by as little as one nucleotide. (miRNAs can be identified by size, specifically using beads.)
23
Q

Where can transcriptomics be used clinically?

A

In cancer, for expression-level analysis.

24
Q

Problems with working with RNA

A
  1. Difficult to work with: degrades easily and needs to be stored at -70 °C
  2. Tissue/cell-specific expression: drawing blood is not enough, a biopsy of the relevant tissue is needed. The transcriptome of one tissue cannot be inferred from another.
  3. RNA-seq (whole-transcriptome sequencing) is technically challenging, costly and time-intensive, with demanding data analysis.
25
Q

What are topologically associating domains (TADs)?

A
  1. They are self-interacting domains
  2. They partition the chromatin fibre
  3. The frequency of intra-TAD interactions is higher than that of inter-TAD interactions
  4. TADs allow genes to be regulated by…
26
Q

What do TADs look like?

A
  1. TADs are collections of many chromatin loops.
  2. TADs are separated by TAD borders
27
Q

Name Architectural proteins

A

CTCF
Cohesin

28
Q

How do TADs take part in gene regulation?

A
  1. Cell type-specific enhancers form loops with the promoters of their corresponding genes, predominantly within TADs.
  2. They allow certain regions of the chromatin to interact, or close off areas of the chromatin to prevent interaction.
29
Q

Explain the parts/steps of the 3C method

A

Part 1: Formaldehyde is added to cells to cross-link interacting proteins and DNA, fixing the interactions in place.
Part 2: Cells are lysed to expose the nucleus (the cells are washed to remove everything except the DNA). Restriction enzymes are then added to cleave the DNA at specific restriction sites across the entire genome.
Part 3: Ligate fragments that are held together by proteins, then reverse the cross-links and purify the DNA by removing the formaldehyde. Then run a qPCR. (Prior knowledge is needed, as primers have to be designed.)
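The restriction step in Part 2 can be pictured with a small in-silico digest. This is only a sketch under stated assumptions: the card does not name an enzyme, so HindIII (recognition site AAGCTT, cutting after the first A) is used as an example, and only the top strand is modelled. The function name `digest` is hypothetical.

```python
def digest(seq: str, site: str = "AAGCTT", cut_offset: int = 1) -> list[str]:
    """Cut `seq` at every occurrence of `site` and return the fragments.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for the assumed HindIII example, A^AGCTT).
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


print(digest("CCAAGCTTGGAAGCTTCC"))  # → ['CCA', 'AGCTTGGA', 'AGCTTCC']
```

Joining the fragments back together reproduces the input, which is a quick sanity check that no bases were lost at the cut sites.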

30
Q

Function of the immunoprecipitation step

A

You can use an antibody to look for specific proteins instead of looking at them all in general.

31
Q

Which methods require a hypothesis and which do not?

A
  1. ChIP-loop = requires a hypothesis
  2. Hi-C and ChIA-PET = use NGS and thus do not require a hypothesis
32
Q

Differentiate between the 3C, 4C and 5C methods

A

3C: one promoter = one enhancer
4C: one promoter = multiple enhancers
5C: multiple promoters = multiple enhancers

33
Q

How is the Hi-C experiment different?

A

The cut ends are labelled with biotin before ligation, so that fragments containing a ligation junction can be pulled out and sequenced.

34
Q

Motivation for the ENCODE project

A
  1. What is the function of the rest of the genome?
  2. Determining the location of regulatory elements and how they influence gene transcription could reveal links between variation in the expression of certain genes and the development of disease.
35
Q

Primary factors for cell type selection for ENCODE

A
  1. Wide availability
  2. Ability to grow them easily
  3. Capacity to produce a sufficient number of cells for use in all technologies used by ENCODE investigators.
36
Q

Secondary factors for cell type selection for ENCODE

A
  1. Diversity in tissue source of the cell
  2. Germ layer lineage representation
  3. Availability of existing data generated using the cell type
37
Q

Main findings of ENCODE

A
  1. 80.4% of the genome has some function.
  2. 95% of the genome lies within 8 kb of a DNA-protein interaction
  3. 99% lies within 1.7 kb of at least one biochemical event
  4. Quantitative RNA production is correlated with both chromatin marks and transcription factor binding sites at promoters.
38
Q

How did they do gene annotation of transcribed and protein coding regions?

A

Manual and automated annotation to produce a comprehensive catalogue of human protein-coding and non-coding RNAs as well as pseudogenes.

39
Q

How did they do RNA analysis in ENCODE?

A

Sequenced RNA from different cell lines and multiple subcellular fractions to develop an extensive RNA expression catalogue.

40
Q

How did they annotate protein-bound regions?

A

Mapped the binding locations of 119 different DNA-binding proteins and a number of RNA polymerase components in 72 cell types using ChIP-seq.

41
Q

Explain the steps of ChIP-seq

A
  1. Cross-link DNA and proteins in samples
  2. Isolate and sonicate chromatin
  3. ChIP: add a protein-specific antibody
  4. QC ChIP
  5. Construct ChIP-seq libraries
  6. Post-library QC
  7. Sequence ChIP-seq libraries using an Illumina HiSeq
  8. Bioinformatics analysis
42
Q

How does DNase I find hypersensitive sites?

A

DNase I cannot cleave where proteins are bound. At hypersensitive sites, chromatin has lost its condensed structure, exposing the DNA and making it accessible for cleavage.

43
Q

What was used to profile DNA methylation

A

Reduced-representation bisulfite sequencing (RRBS) was used to profile DNA methylation.

44
Q

How did ENCODE identify chromosome interaction regions?

A

Assessed using chromosome conformation capture (3C).

45
Q

Limitations of the ENCODE project (5)

A
  1. Over-analysis and over-interpretation of what constitutes function
  2. Technical data and data analysis limitations
  3. Limited predictive power
  4. Functionality not supported by experimental validation
  5. Represents only a few cell types and disease states (they used cell lines, which are not considered normal cells, so findings often do not apply to normal cells).
46
Q

Aim of the Epigenetic Roadmap

A

The project aimed to discover and study all epigenetic elements, which are the chemical marks attached to the DNA backbone that control the function of our genes.

47
Q

Goals of the Epigenetic roadmap project

A
  1. Producing a public resource of human epigenetic data.
  2. Closing the gap between data generation and its public dissemination through:
    2.1 Rapid release of the raw sequence data
    2.2 Profiles of epigenomic features
    2.3 Higher-level integrated maps for the scientific community
48
Q

Difference between ENCODE and Epigenetic roadmap

A
  1. ENCODE only used cell lines.
  2. The Epigenetic Roadmap used normal cell types, not only cell lines.
49
Q

What cell types did the Epigenome Roadmap use?

A
  1. Brain (different parts) vs kidney
  2. Cells from both healthy individuals and patients with cancer, neurodegeneration and autoimmune disease.
  3. Stem cells and primary ex vivo tissue
50
Q

The Epigenetic roadmap project outcomes

A
  1. Collection of normal epigenomes
  2. Provide a framework or reference for comparison and integration within a broad array of future studies.
  3. How epigenetic elements regulate gene expression.
51
Q

What unanswered questions are left after the ENCODE and ….. projects?

A
  1. Does the epigenome change as cells and people age?
  2. Can we use these data to better predict cancer risk?
  3. Many different types of cells = a very good start
52
Q

Why investigate non-coding variation?

A
  1. Causal variants for disease were initially expected to be coding (often not the case)
  2. What proportion of causal mutations are coding versus non-coding? (88% of variants identified by GWAS are non-coding)
  3. Therefore, non-coding variants cannot be ignored.
53
Q

Consequences of non-coding variants.

A
  1. Typically do not change the amino acid sequence of a protein, unless a splice site is affected.
  2. In regulatory regions: may affect the expression of the associated gene by altering transcription.
  3. Change mRNA stability and folding
54
Q

What are some non-coding annotation tools?

A
  1. Scoring algorithms which predict pathogenicity/deleteriousness (examples: CADD, FATHMM, GWAVA)
  2. Annotation tools which give experimental evidence (examples: RegulomeDB, HaploReg, FunciSNP)
55
Q

Limitation of the annotation tools

A
  1. How often are they updated?
  2. Which build of dbSNP?
  3. Web interface vs command line (searching for many variants)
  4. Most tools cannot be used in isolation. Better analysis comes from a combination of several useful tools.
  5. In silico analysis does not replace functional lab studies to determine if variants are truly functional.
56
Q

Characterise intellectual disability.

A
  1. Limitations in learning, reasoning, problem solving
  2. Problems with adaptive behaviour
  3. Prevalence higher in males
  4. Cause can be environmental or genetic
  5. Genetic causes include aneuploidies, chromosomal rearrangements, CNVs and monogenic disorders
  6. A large number of genes linked to ID are found on the X chromosome
57
Q

What types of diagnostic testing are used for X-linked intellectual disability?

A
  1. DNA testing (if we know what we’re looking for)
  2. aCGH (if we do not know what we are looking for)
  3. Clinical assessment to exclude
  4. WES or WGS
58
Q

What questions can be answered by doing a DNase I assay?

A
  1. How is the chromatin packaged at a particular locus?
  2. What are the physical positions of nucleosomes in the genome or at a particular locus?
59
Q

How does the DNase I assay work?

A

DNase I is partially inhibited by all DNA-bound proteins and will cleave the most exposed DNA first. This allows us to see which DNA in the chromatin is the most exposed.

60
Q

What question can I answer with bisulfite sequencing?

A

This is a very specific and powerful assay for finding out which cytosines are methylated in a specific DNA sequence, in a specific sample relative to another. DNA methylation is dynamic and highly variable across tissues, cell lines, treatments and time periods.

61
Q

How does Bisulfite sequencing work?

A

Bisulfite causes deamination of unmethylated cytosines and converts them into uracil. After PCR amplification, the resulting Us are replaced by Ts. Methylated Cs are protected from conversion by bisulfite and remain as Cs.
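The conversion logic can be sketched in a few lines. This is a toy model, assuming complete conversion, a single strand and no sequencing errors; the function names are made up for illustration.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T after PCR; a C whose 0-based index is in
    `methylated` is protected and stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, converted_read: str) -> list[int]:
    """Positions where the reference has C and the converted read still has C."""
    pairs = zip(reference.upper(), converted_read.upper())
    return [i for i, (ref, read) in enumerate(pairs) if ref == "C" and read == "C"]


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated={1})
print(read)                               # → ACGTTTGA
print(call_methylation(reference, read))  # → [1]
```

Comparing the converted read against the untreated reference is what turns a C-to-T difference into a methylation call: a C that survives is inferred to have been methylated.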

62
Q

Principle of chromatin immunoprecipitation (ChIP)

A

Antibodies that specifically recognise and bind to methylated Cs are used to immunoprecipitate chromatin containing methylated Cs.