Exam Questions 2 Flashcards

1
Q

“Sieve of Eratosthenes”:
What is the goal of this algorithm?

A

To find all prime numbers within a given range, typically up to a specified maximum limit.

2
Q

“Sieve of Eratosthenes”:
Pseudocode: how does it work?

A
  1. Start with a list of all numbers from 2 to the maximum limit, none of them crossed out.
  2. Begin with the first number in the list, which is 2, and mark it as prime.
  3. Cross out all multiples of this prime (other than the prime itself) as non-prime.
  4. Go to the next number; if it has not been crossed out, it is prime.
  5. Repeat steps 3 and 4 until the current number exceeds the square root of the limit; every number that is still not crossed out is prime.
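
The steps above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation; the function name `sieve_of_eratosthenes` is our own choice, and crossing out starts at p*p on the assumption that smaller multiples were already handled by smaller primes.

```python
import math

def sieve_of_eratosthenes(limit):
    """Return a list of all primes <= limit."""
    if limit < 2:
        return []
    # is_prime[k] stays True until k is crossed out.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [k for k, flag in enumerate(is_prime) if flag]

print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```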
3
Q

“Sieve of Eratosthenes”:
What is its time complexity? Briefly explain.

A

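O(n log log n), where “n” is the maximum limit up to which you want to find prime numbers.
Each prime p crosses out about n/p multiples, and the sum of 1/p over all primes up to n grows like log log n.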
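
One way to get a feel for this bound is to count the cross-out operations directly. The sketch below is only an empirical illustration (the helper name `count_crossings` is made up for this card); it prints the ratio of cross-outs to n * ln(ln(n)), which should stay bounded as n grows if the bound holds.

```python
import math

def count_crossings(limit):
    """Count the cross-out operations a basic sieve performs up to limit."""
    is_prime = [True] * (limit + 1)
    crossings = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
                crossings += 1  # counted even if already crossed out
    return crossings

for n in (10**3, 10**4, 10**5):
    ratio = count_crossings(n) / (n * math.log(math.log(n)))
    print(n, round(ratio, 3))
```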

4
Q

Given a eukaryotic genome with a GC-content of 40%, how long are open reading frames (ORFs) on average?

A

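There are three stop codons: UAA, UAG, and UGA (TAA, TAG, and TGA in DNA).
With a GC content of 40%, and assuming independent bases: P(G) = P(C) = 0.2 and P(A) = P(T) = 0.3
P(stop) = P(TAA) + P(TAG) + P(TGA) = 0.3^3 + 0.3^2 * 0.2 + 0.3^2 * 0.2 = 0.027 + 0.018 + 0.018 = 0.063
This means that in a random sequence, you would expect a stop codon approximately every 1 / 0.063 ≈ 16 codons (about 48 nucleotides).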
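
As a cross-check, here is a small sketch that computes the stop-codon probability from the GC content. It assumes independent bases with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2; the function name is hypothetical.

```python
def stop_codon_probability(gc_content):
    """Probability that a random codon is TAA, TAG or TGA."""
    p_gc = gc_content / 2        # assumed: P(G) = P(C)
    p_at = (1 - gc_content) / 2  # assumed: P(A) = P(T)
    base_prob = {"A": p_at, "T": p_at, "G": p_gc, "C": p_gc}
    total = 0.0
    for codon in ("TAA", "TAG", "TGA"):
        p = 1.0
        for base in codon:
            p *= base_prob[base]
        total += p
    return total

p_stop = stop_codon_probability(0.40)
print(round(p_stop, 3))      # 0.063
print(round(1 / p_stop, 1))  # 15.9 codons between stops on average
```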

5
Q

Do you expect the same average ORF lengths on the forward and the backward strands? Briefly explain.

A

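In a random sequence, yes: the reverse complements of the stop codons (TTA, CTA, TCA) are as likely as the stop codons themselves when P(A) = P(T) and P(G) = P(C). At an actual gene, no: the coding strand tends to have a much longer ORF than the non-coding strand at the same position.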

6
Q

Low gene expression can be detected through pairwise genome alignments

A

False. Pairwise genome alignments compare genomic sequences to identify similarities, differences, and structural variations; they carry no information about expression levels.
Gene expression is typically measured with RNA-seq.

7
Q

What does UPGMA stand for?

A

UPGMA stands for “Unweighted Pair Group Method with Arithmetic Mean.”

8
Q

A distance matrix for n sequences contains n(n-1)/2 entries, when entries on the diagonal are not counted

A

True
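
A quick way to convince yourself: the off-diagonal entries of a symmetric matrix correspond to the unordered pairs of sequences. The sketch below enumerates them for five made-up sequence names and compares the count with the formula.

```python
from itertools import combinations

def pair_count(n):
    """Number of unordered pairs among n items: n(n-1)/2."""
    return n * (n - 1) // 2

names = ["A", "B", "C", "D", "E"]  # placeholder sequence names
pairs = list(combinations(names, 2))
print(len(pairs), pair_count(len(names)))  # 10 10
```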

9
Q

Any distance matrix uniquely determines exactly one phylogenetic tree.

A

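False. Only an additive distance matrix corresponds to a unique tree; for other matrices, different methods (for example UPGMA and neighbor joining) can produce different trees.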

10
Q

Difference between TPM and FPKM

A

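Both normalize for library size and for gene length; they differ in the order of the two steps.
FPKM divides the fragment count by the library size (in millions) and by the gene length (in kilobases), so the FPKM values of a sample do not sum to a fixed total.
TPM first divides each count by the gene length and then scales the resulting rates so that they sum to one million in every sample, which makes values easier to compare across samples.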
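
The two measures can be compared on a toy example. This is a sketch using the commonly cited definitions (FPKM = count * 10^9 / (library size * gene length in bp); TPM = length-normalized rate scaled to sum to 10^6); the counts and lengths are invented.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of gene per million mapped fragments."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]

counts = [100, 200, 300]     # made-up fragment counts for three genes
lengths = [1000, 2000, 500]  # made-up gene lengths in bp

print([round(v) for v in fpkm(counts, lengths)])
print([round(v) for v in tpm(counts, lengths)])
print(round(sum(tpm(counts, lengths))))  # 1000000
```

Within one sample the two measures are proportional to each other; only the TPM values are guaranteed to add up to the same total in every sample.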

11
Q

To specify a Hidden Markov Model with n < ∞ states, one needs
- initial probabilities
- substitution probabilities
- transition probabilities
- emission probabilities
- exit probabilities
- likelihood ratios

A

Initial probabilities, transition probabilities, and emission probabilities
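
To see which parameters a working model actually uses, here is a sketch of a tiny two-state HMM with the forward algorithm. The state names and all numbers are invented for illustration; note that the likelihood cannot be computed without initial, transition, and emission probabilities.

```python
# Hypothetical two-state HMM over DNA; every number here is made up.
states = ("AT_rich", "GC_rich")
initial = {"AT_rich": 0.5, "GC_rich": 0.5}
transition = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
emission = {
    "AT_rich": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "GC_rich": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
}

def forward_likelihood(observations):
    """P(observations) under the model, via the forward algorithm."""
    # alpha[s] = P(symbols so far, current state = s)
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emission[s][symbol] * sum(alpha[r] * transition[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())

print(forward_likelihood("GCGCAT"))
```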

12
Q

What do you need to consider when choosing a window width for a sliding window method?

A

Feature characteristics: a sharply defined feature calls for a narrow window, a broad or diffuse feature for a wide one
Noise: a wide window smooths noise but lowers resolution
Computational resources
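
The trade-off is easy to see on a toy sequence. The sketch below computes GC content in a sliding window (the function name `sliding_gc` and the sequence are invented); the narrow window tracks the GC-rich core sharply, while the wide one gives a smoother but blurrier profile.

```python
def sliding_gc(sequence, window, step=1):
    """GC fraction in each window of the given width."""
    values = []
    for start in range(0, len(sequence) - window + 1, step):
        chunk = sequence[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        values.append(gc / window)
    return values

seq = "ATATGCGCGCATAT"  # made-up sequence with a GC-rich core
print(sliding_gc(seq, 4))  # narrow window
print(sliding_gc(seq, 8))  # wide window
```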

13
Q

Which of these motif descriptors (regular expression, weight matrix, Sequence Logo) is/are suitable for describing a splice site consensus?

A

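Weight matrix: it stores a probability (or score) for each nucleotide at each position, so it captures the variability of the consensus.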

14
Q

Briefly summarize the problem of the non-suitable descriptor(s) (regular expression, Sequence Logo)

A

Regular expression: it either matches or does not, so it cannot capture the graded variability of a splice site consensus
Sequence logo: it is a visualization of a motif for human readers and does not by itself assign probabilities or scores to candidate sites
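
For contrast, here is a sketch of how a weight matrix scores candidate sites. The aligned "donor site" sequences are invented, a pseudocount of 1 and a uniform background of 0.25 are assumptions, and the function names are our own.

```python
import math

def build_pwm(aligned_sites, pseudocount=1.0):
    """Per-position nucleotide probabilities from equal-length aligned sites."""
    length = len(aligned_sites[0])
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + 4 * pseudocount
        pwm.append({b: (column.count(b) + pseudocount) / total for b in "ACGT"})
    return pwm

def log_odds_score(pwm, candidate, background=0.25):
    """Sum of log2(probability / background) over all positions."""
    return sum(math.log2(pwm[i][b] / background) for i, b in enumerate(candidate))

sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT", "GTCAGT"]  # made-up examples
pwm = build_pwm(sites)
print(log_odds_score(pwm, "GTAAGT"))  # consensus-like candidate
print(log_odds_score(pwm, "CCTTCA"))  # unrelated candidate
```

A regular expression would only say whether each candidate matches; the matrix returns a graded score, so near-consensus sites rank above unrelated ones.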

15
Q

What is the difference between Smith-Waterman alignment and structural alignment?

A

Smith-Waterman: a local sequence alignment; it finds the regions of highest sequence similarity between two sequences
Structural alignment: compares 3D structures, allowing for the identification of conserved structural motifs and functional implications even when sequence similarity is low
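
To make the sequence side concrete, here is a compact sketch of Smith-Waterman with a linear gap penalty. The scoring values (match 2, mismatch -1, gap -2) are arbitrary choices for illustration, and only one optimal local alignment is traced back.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

print(smith_waterman("TTTACGTTT", "GGGACGGGG"))  # (6, 'ACG', 'ACG')
```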

16
Q

You want to find the statistical significance of a possible motif enrichment found in a gene set. You know the size of the set (number of genes) and also how many of them have this motif.
a) What else do you need?

A

A background distribution against which you can measure the significance of the motif enrichment, for example the total number of genes in the genome and how many of them contain the motif

17
Q

You want to find the statistical significance of a possible motif enrichment found in a gene set. You know the size of the set (number of genes) and also how many of them have this motif.
What statistical test would you perform?

A

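Fisher’s exact test (equivalent to a one-sided hypergeometric test), applied to the counts of genes with and without the motif inside and outside the set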
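
A sketch of the calculation using only the standard library: the one-sided p-value is the hypergeometric probability of seeing at least the observed number of motif-containing genes. It assumes the gene set is drawn from the background; the function name and the example numbers are invented.

```python
from math import comb

def enrichment_pvalue(set_size, set_hits, background_size, background_hits):
    """P(X >= set_hits) when set_size genes are drawn without replacement
    from a background in which background_hits genes carry the motif."""
    max_hits = min(set_size, background_hits)
    denominator = comb(background_size, set_size)
    tail = sum(
        comb(background_hits, k) * comb(background_size - background_hits, set_size - k)
        for k in range(set_hits, max_hits + 1)
    )
    return tail / denominator

# Made-up example: 15 of 100 genes in the set carry the motif,
# against 1000 of 20000 genes in the background.
print(enrichment_pvalue(100, 15, 20000, 1000))
```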

18
Q

What is the enrichment factor?

A

How many times more often the motif occurs in your gene set than would be expected by chance, i.e. the motif frequency in the set divided by its frequency in the background.
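
As a small worked example, assuming "expected by chance" means the motif frequency in the background (the numbers are invented):

```python
def enrichment_factor(set_size, set_hits, background_size, background_hits):
    """(set_hits / set_size) divided by (background_hits / background_size)."""
    # Written as a single fraction to avoid intermediate rounding.
    return (set_hits * background_size) / (set_size * background_hits)

# 15% of the set carries the motif versus 5% of the background.
print(enrichment_factor(100, 15, 20000, 1000))  # 3.0
```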
