Lecture 11 Flashcards

1
Q

What method can be used for promoter analysis? What does it do?

A

Gibbs Sampler (motif identification)

Searches for the statistically most probable motifs in unaligned sequences

2
Q

What methods are used for determining transcription factor binding sites in silico?

A
  1. Word counting methods
  2. Gibbs sampling
  3. Phylogenetic footprinting / comparative genomics
3
Q

Describe the steps of Gibbs Sampling

A
  1. Set the width of the motif
  2. Choose a random position for the start of the motif in all but one sequence
  3. Estimate the amino acid (or nt) frequencies in the motif columns from all but the left-out sequence
  4. Estimate background frequencies (nt0): the frequencies of nt (or aa) in positions that are not considered part of the motif
  5. Scan the left-out sequence and estimate the probability of finding the motif at each position
    - Calculate the odds ratio for each position (a = p_observed / p_background)
  6. Add up all of the above odds ratios and divide the odds ratio for each position by the total to obtain the probability that the motif is at that position
  7. Use these probabilities as weights to choose a probable location of the motif in the left-out sequence
    Repeat > 100 times
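
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published Gibbs Sampler: it handles DNA only, adds a pseudocount of 1 to avoid zero frequencies, rotates the left-out sequence in order, and initialises a start position in every sequence (the left-out one is simply overwritten). The function names `position_probabilities` and `gibbs_sample` and the example sequences are hypothetical.

```python
import random

ALPHABET = "ACGT"

def position_probabilities(seqs, starts, left_out, width, pseudocount=1.0):
    """Steps 3-6: probability of each motif start in the left-out sequence."""
    motif_counts = [{b: pseudocount for b in ALPHABET} for _ in range(width)]
    bg_counts = {b: pseudocount for b in ALPHABET}
    for i, seq in enumerate(seqs):
        if i == left_out:
            continue
        for pos, base in enumerate(seq):
            col = pos - starts[i]
            if 0 <= col < width:
                motif_counts[col][base] += 1  # step 3: motif column counts
            else:
                bg_counts[base] += 1          # step 4: background counts
    col_total = sum(motif_counts[0].values())  # same total in every column
    bg_total = sum(bg_counts.values())
    target = seqs[left_out]
    odds = []
    for start in range(len(target) - width + 1):
        ratio = 1.0
        for col in range(width):
            base = target[start + col]
            p_observed = motif_counts[col][base] / col_total
            p_background = bg_counts[base] / bg_total
            ratio *= p_observed / p_background  # step 5: odds ratio
        odds.append(ratio)
    total = sum(odds)
    return [o / total for o in odds]            # step 6: normalise

def gibbs_sample(seqs, width, iterations=200, seed=0):
    """Steps 1-7, repeated `iterations` times; returns one start per sequence."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]  # step 2
    for it in range(iterations):
        left_out = it % len(seqs)
        probs = position_probabilities(seqs, starts, left_out, width)
        # Step 7: sample a new start, weighted by the probabilities.
        starts[left_out] = rng.choices(range(len(probs)), weights=probs)[0]
    return starts

# Made-up sequences that share the word TGACGTCA.
seqs = ["TTGACGTCAATT", "CCTGACGTCAGG", "AATTTGACGTCA", "GTGACGTCACCA"]
starts = gibbs_sample(seqs, width=8)
print(starts)
```

Because step 7 samples a position rather than always taking the best one, different seeds can give different alignments, which is part of how the method avoids getting stuck early.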
4
Q

What is the goal of Gibbs sampling?

A

Find the most probable patterns common to all of the sequences by sliding them back and forth until the ratio of the motif probability to the background probability is a maximum

5
Q

How was the Gibbs Sampler modified?

A
  1. Search for multiple motifs
  2. Seek a pattern in only a fraction of the input sequences (because not all genes are regulated by the same TF or regulatory mechanism)
  3. Look for motifs of different widths
  4. Avoid suboptimal solutions by shifting the current alignment a certain number of positions to the right and left
6
Q

For what is the hypergeometric p-value used?

A

To test whether particular GO categories are enriched in a gene cluster
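
A minimal sketch of the enrichment calculation, assuming the usual one-sided form: the probability of seeing at least the observed number of annotated genes when the cluster is drawn at random, without replacement, from all genes. The function name and all numbers in the usage line are hypothetical.

```python
from math import comb

def hypergeom_enrichment_pvalue(population, annotated, cluster, overlap):
    """P(X >= overlap) for a cluster drawn without replacement.

    population: total number of genes considered
    annotated:  genes in the population that carry the GO category
    cluster:    number of genes in the cluster
    overlap:    genes in the cluster that carry the GO category
    """
    upper = min(annotated, cluster)
    total = comb(population, cluster)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 genes, 200 in the category,
# a cluster of 100 genes of which 8 are in the category.
p_value = hypergeom_enrichment_pvalue(20000, 200, 100, 8)
print(p_value)
```

A small p-value suggests the category is over-represented in the cluster; when many GO categories are tested, a multiple-testing correction is normally applied as well.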

7
Q

Name 4 methods to study protein-protein interactions

A
  1. Classic biochemical (chromatographic) methods
  2. Yeast two-hybrid (Y2H) followed by clone sequencing
  3. Affinity purification (TAP tagging, then mass spectrometry)
  4. Interologs or BioID
8
Q

How does Y2H work?

A

Fuse the Gal4 DNA-binding domain to the bait protein
Fuse the separate Gal4 activation domain to the prey protein
If bait and prey interact in vivo in the yeast nucleus, Gal4 function is reconstituted and drives expression of a reporter gene, giving blue yeast colonies

9
Q

Problems of Y2H:

A

- The assay is done in yeast, so proteins might not receive the modifications necessary for their function
- Proteins are overexpressed and targeted to the nucleus
- Does not work well for membrane proteins

10
Q

Is Y2H useful for membrane proteins?

A

No

11
Q

Is Y2H useful for transient interactions?

A

Yes, for transient binary interactions

12
Q

What is TAP?

A

Tandem affinity purification

13
Q

How does TAP tagging work?

A

TAP is an immunoprecipitation-based purification technique for studying protein–protein interactions. The goal is to extract from a cell only the protein of interest, in complex with any other proteins it interacts with. TAP uses two types of agarose beads that bind to the protein of interest and that can be separated from the cell lysate by centrifugation, without disturbing, denaturing or contaminating the complexes involved. To enable the protein of interest to bind to the beads, it is tagged with a designed sequence, the TAP tag.

The original TAP method involves the fusion of the TAP tag to the C-terminus of the protein under study. The TAP tag consists of three components: a calmodulin binding peptide (CBP), TEV protease cleavage site, and two Protein A domains, which bind tightly to IgG (making a TAP tag a type of epitope tag).

14
Q

What are other types of arrays used for protein study?

A
  1. Protein Microarrays
  2. Glycan Arrays (hundreds of carbohydrates arrayed onto slides as a tool to understand carbohydrate biology)
15
Q

Why study glycosylation?

A

- Regulatory mechanisms
- Carbohydrates provide key structural support in plant biology

16
Q

How did Moller et al. study glycan arrays?

A

Developed monoclonal antibodies (mAbs, raised in rat or mouse) against crude extracts of cell wall polymers from Arabidopsis
- Used the mAbs to probe fixed samples to understand the distribution of glycans in different cells

17
Q

How did Moller et al. determine the specificity of individual mAbs?

A

64 different plant glycans were arrayed onto nitrocellulose; each mAb was hybridized to the array and detected using anti-mouse or anti-rat secondary antibodies linked to alkaline phosphatase
- Arrays were scanned and specificities determined by CLUSTER ANALYSIS

18
Q

What are 3 gene expression databases?

A
  1. ArrayExpress
  2. GEO (Gene Expression Omnibus)
  3. SRA (Sequence Read Archive)
19
Q

For which organisms can we find specific gene expression databases?

A

Human (RefExA)
Mouse
Worm
Fly
Arabidopsis

20
Q

For gene expression data sets what information should be available?

A

Source of tissue, age, microarray element identifiers / identifier annotation, library and fragmentation protocols, etc.

21
Q

What was the research question behind the genomic analysis of AtPERKs?

A

Although PERK1 is induced rapidly upon pathogen attack in B. napus, individual AtPERK mutants show no visible phenotype.

22
Q

What was novel about the genomic analysis of AtPERKs?

A

No experiments were done; the authors only used gene expression databases, equivalent to 6,000 northern blots

23
Q

What consideration should be made for designing higher-order mutants?

A

Choose genes with both similar gene expression profiles and sequence similarity (high for both)

24
Q

What is a consequence of coexpression analysis?

A

Coexpressed genes that have a vague functional description might be involved in the same biological process as genes whose function is known, so guilt-by-association can be used to assign function
* Example: RGL2, whose role in floral biology was unknown

25
Q

What was the research question behind seed coexpression network analysis?

A

Since no master regulator of dormancy or germination had been identified, can we use coexpression analysis to identify crucial hubs in networks?

26
Q

What was the process of seed co-expression network analysis?

A
  1. Array database (seed microarray database, 175 published samples)
  2. Coexpression calculations: calculate gene coexpression scores in samples that are dormant or can germinate
  3. Database queries: filter the database of interactions (> 4.5 M interactions)
  4. Visualize and analyze network
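
Steps 2 and 3 can be illustrated with a small sketch. The card does not say which coexpression score was used, so Pearson correlation and the 0.9 cutoff here are assumptions, and the gene names and expression values are made up.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (sd_x * sd_y)

def coexpression_edges(expression, threshold=0.9):
    """Score every gene pair, then keep pairs at or above the threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        score = pearson(expression[gene_a], expression[gene_b])
        if score >= threshold:
            edges.append((gene_a, gene_b, score))
    return edges

# Made-up expression values across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
edges = coexpression_edges(expression)
print(edges)
```

The number of gene pairs grows quadratically with the number of genes, which is why a filtering step is needed before the network can be visualized.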
27
Q

What is SeedNet?

A

Coexpression network based on seed samples

28
Q

What does a node represent in SeedNet?

A

A node is a gene, and lines between nodes denote coexpression between the genes
- Red colouring: upregulation in dormant samples
- Blue colouring: increased expression in germinating samples

Node size is proportional to the number of connections (coexpression partners) the gene has
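
As a small illustration of the node-size rule, the number of coexpression partners per gene can be counted directly from an edge list. This is only a sketch: the function name and the gene identifiers are hypothetical, and this is not SeedNet's own code.

```python
from collections import Counter

def node_degrees(edges):
    """Count coexpression partners (connections) for every gene."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return degree

# Made-up coexpression edges between four genes.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
degree = node_degrees(edges)
print(degree.most_common(1))  # g1 has the most partners (3)
```

Genes with the highest counts are the hubs, which would be drawn as the largest nodes.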

29
Q

What could the authors identify with SeedNet?

A

Novel transcriptional regulators of dormancy or germination, using guilt-by-association