Lecture 8 Flashcards
What are examples of data that naturally form a sequence?
Language, music, stock prices.
What are the two main types of sequential datasets?
Numeric sequences and symbolic sequences.
What is an example of a 1D numeric sequence?
Stock prices over time.
What is an example of a multidimensional numeric sequence?
Closing index values of AEX and FTSE100 over time.
What is an example of a 1D symbolic sequence?
A sequence of words in a sentence.
What is an example of a multidimensional symbolic sequence?
A sentence where each word has multiple tags (e.g., part-of-speech tagging).
What is the difference between a single sequence and a set-of-sequences?
A single sequence is one long, unbroken stream of values (e.g., a stock price series), while a set-of-sequences consists of many independent sequences (e.g., a corpus of separate emails).
What is an example of a sequence classification problem?
Spam detection based on email content.
What is an example of a sequence prediction problem?
Predicting future stock prices based on past trends.
How can sequential data be transformed into a standard regression problem?
By representing each point using a fixed number of preceding values.
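A minimal sketch of this sliding-window transformation (the window size `k` and the toy price list are assumptions for illustration):

```python
# Sketch: turn a sequence into a supervised regression table using a
# sliding window of the k preceding values.
def make_windows(series, k):
    """Return (features, targets): each feature row holds the k preceding values."""
    X, y = [], []
    for i in range(k, len(series)):
        X.append(series[i - k:i])  # the k values before position i
        y.append(series[i])        # the value to predict
    return X, y

prices = [10, 11, 13, 12, 14, 15]
X, y = make_windows(prices, k=3)
# X[0] == [10, 11, 13] predicts y[0] == 12
```

Each row of `X` can then be fed to any standard regression model, since the temporal structure is now encoded in the fixed-length feature vector.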
What is walk-forward validation?
A technique for evaluating models on time-ordered data: train on all data up to a point in time, test on what follows, then slide the split forward, so the model never sees the future during training.
What is the main advantage of walk-forward validation?
It simulates real-world scenarios where new data arrives sequentially.
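A sketch of walk-forward validation with a deliberately trivial "model" (the naive last-value forecast and the toy series are assumptions, chosen so the loop structure is the point):

```python
# Sketch of walk-forward validation: fit on everything up to time t,
# evaluate on the next point, then slide the split forward.
def walk_forward(series, fit, predict, min_train=3):
    errors = []
    for t in range(min_train, len(series)):
        model = fit(series[:t])            # train only on the past
        pred = predict(model, series[:t])  # forecast the next value
        errors.append(abs(pred - series[t]))
    return errors

# Toy "model": predict the last observed value (naive forecast).
fit = lambda train: None
predict = lambda model, train: train[-1]
errs = walk_forward([1, 2, 3, 4, 5], fit, predict)
```

Note that, unlike ordinary cross-validation, the test point is always strictly later than every training point.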
What is a Markov model?
A probabilistic model that assigns probabilities to sequences by estimating the probabilities of small subsequences (n-grams) from observed data.
How does a Markov model differ from Naive Bayes?
Naive Bayes assumes the elements are independent of each other, whereas Markov models explicitly model dependencies between neighbouring elements in a sequence.
What is the chain rule of probability?
p(W4, W3, W2, W1) = p(W4 | W3, W2, W1) * p(W3 | W2, W1) * p(W2 | W1) * p(W1).
What is a key assumption in Markov models?
The probability of the next state depends only on the current state, not the entire history.
What is the difference between a first-order and second-order Markov model?
A first-order model depends only on the previous state, while a second-order model depends on the two previous states.
What is a practical application of Markov models?
Speech recognition and language modeling.
How can probabilities be estimated in Markov models?
Using relative frequencies from observed sequences.
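A sketch of first-order transition probabilities estimated as relative frequencies of observed bigrams (the tiny example corpus is made up):

```python
from collections import Counter

# Sketch: estimate p(next | current) as count(current, next) / count(current).
def transition_probs(sequence):
    bigrams = Counter(zip(sequence, sequence[1:]))
    totals = Counter(sequence[:-1])  # times each state is followed by something
    return {(a, b): c / totals[a] for (a, b), c in bigrams.items()}

words = "the cat saw the dog".split()
probs = transition_probs(words)
# "the" is followed once by "cat" and once by "dog",
# so p(cat | the) = p(dog | the) = 0.5
```

A second-order model would count trigrams instead, conditioning each word on the two preceding words.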
What is a challenge of using raw frequency counts for probability estimation?
Some sequences may never appear in the training data, leading to zero probabilities.
What technique is used to handle unseen sequences in Markov models?
Smoothing techniques such as Laplace smoothing.
What is Laplace smoothing?
A method that adds a small constant to all frequency counts to prevent zero probabilities.
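A sketch of add-one (Laplace) smoothing over a fixed vocabulary; the vocabulary and counts below are illustrative:

```python
# Sketch of Laplace smoothing: add alpha to every count so that
# unseen events get a small non-zero probability.
def laplace(counts, vocab, alpha=1):
    total = sum(counts.get(w, 0) for w in vocab)
    return {w: (counts.get(w, 0) + alpha) / (total + alpha * len(vocab))
            for w in vocab}

probs = laplace({"cat": 2, "dog": 1}, vocab=["cat", "dog", "fish"])
# "fish" never occurred but still gets probability 1/6 instead of 0
```

Because `alpha * len(vocab)` is added to the denominator, the smoothed probabilities still sum to 1 over the vocabulary.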
What is a Hidden Markov Model (HMM)?
An extension of Markov models where states are hidden and only observed indirectly.
What is an example of an HMM application?
Part-of-speech tagging in natural language processing.
What are the two main components of an HMM?
The transition probabilities between states and the emission probabilities of observations.
What is the Viterbi algorithm used for?
Finding the most likely sequence of hidden states in an HMM.
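A compact sketch of Viterbi decoding on a tiny two-state HMM; the states, observations, and all probabilities below are invented for illustration:

```python
# Sketch of the Viterbi algorithm: track, for each state, the most
# probable path ending there, extending one observation at a time.
def viterbi(obs, states, start, trans, emit):
    best = {s: start[s] * emit[s][obs[0]] for s in states}  # best path prob ending in s
    path = {s: [s] for s in states}
    for o in obs[1:]:
        new_best, new_path = {}, {}
        for s in states:
            # pick the predecessor that maximises the path probability into s
            prev = max(states, key=lambda p: best[p] * trans[p][s])
            new_best[s] = best[prev] * trans[prev][s] * emit[s][o]
            new_path[s] = path[prev] + [s]
        best, path = new_best, new_path
    return path[max(states, key=lambda s: best[s])]

states = ["Noun", "Verb"]
start = {"Noun": 0.6, "Verb": 0.4}
trans = {"Noun": {"Noun": 0.3, "Verb": 0.7},
         "Verb": {"Noun": 0.8, "Verb": 0.2}}
emit = {"Noun": {"dogs": 0.7, "run": 0.3},
        "Verb": {"dogs": 0.1, "run": 0.9}}
tags = viterbi(["dogs", "run"], states, start, trans, emit)
# most likely tag sequence: ["Noun", "Verb"]
```

Real implementations work in log space to avoid numerical underflow on long sequences; this sketch multiplies raw probabilities for readability.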
What is the key advantage of Markov models in machine learning?
They efficiently model dependencies in sequential data.
What is the main limitation of Markov models?
They assume limited memory, meaning only recent history is considered.
What is the takeaway from Markov models?
They provide a simple yet powerful way to model sequential data in various domains.