18 - Semi-supervised learning Flashcards
What is the basis behind active learning?
The learner has access to unlabelled data and requests labels from an “oracle” (e.g. a human annotator)
What is query synthesis?
Generates (constructs) a new, synthetic instance based on existing data
Query the synthesised instance
Add it as labelled data
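A minimal sketch of one query-synthesis step, assuming synthesis by interpolating between two existing instances; the synthesis rule and the oracle function are illustrative assumptions, not part of the cards:

```python
import numpy as np

def query_synthesis_step(X_existing, oracle, rng=None):
    """Construct a synthetic instance from existing data, query its label
    from the oracle, and return it as a new labelled example."""
    rng = rng or np.random.default_rng(0)
    i, j = rng.choice(len(X_existing), size=2, replace=False)
    x_new = (X_existing[i] + X_existing[j]) / 2.0   # synthesise a new (pseudo) instance
    y_new = oracle(x_new)                           # query its label
    return x_new, y_new                             # add (x_new, y_new) as labelled data
```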
What is stream-based sampling?
Works on sequential data, one instance at a time; doesn’t use synthetic data
Observe an instance
Flag certain samples for query
Add it as labelled data
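A minimal sketch of stream-based sampling, assuming a model that exposes class probabilities and a fixed confidence threshold for flagging queries (both assumptions for illustration):

```python
import numpy as np

def stream_sample(instance_stream, predict_proba, oracle, threshold=0.6):
    """Observe instances one at a time; flag an instance for query when the
    model's top class probability falls below the threshold, then add the
    oracle's label to the labelled set."""
    labelled = []
    for x in instance_stream:
        probs = predict_proba(x)                  # class probabilities for this instance
        if np.max(probs) < threshold:             # model is unsure -> flag for query
            labelled.append((x, oracle(x)))       # query the oracle, add as labelled data
    return labelled
```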
What is pool-based sampling?
Review all data
Select the best instances from the entire pool of data to be queried
Add them as labelled data
What is the problem with query synthesis?
A human annotator might not recognise the synthetic (pseudo) instance and so may be unable to label it
What are active learning approaches?
Query synthesis
Stream-based sampling
Pool-based sampling
What are query strategies?
Uncertainty sampling:
- Least confident
- Margin sampling
Query by committee:
- Vote entropy
What are applications of uncertainty sampling?
Speech recognition
Machine translation
Text classification
Word segmentation
What is the least confident, uncertainty sampling method?
Query instances where the classifier is least confident in its prediction: x* = argmin_x P_theta(y_hat|x),
where y_hat = argmax_y P_theta(y|x) is the most likely label under the model
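A minimal pool-based sketch of least-confident sampling, assuming the classifier's probabilities P_theta(y|x) are already collected in a matrix (function and variable names are illustrative):

```python
import numpy as np

def least_confident_query(probs):
    """probs: (n_instances, n_classes) array of P_theta(y|x) over the pool.
    Returns the index of x* = argmin_x P_theta(y_hat|x), i.e. the instance
    whose most likely label has the lowest probability."""
    top_prob = probs.max(axis=1)      # P_theta(y_hat|x) for each instance
    return int(np.argmin(top_prob))   # least confident instance

# Instance 1 is queried: its best guess only has probability 0.55
probs = np.array([[0.9, 0.1],
                  [0.55, 0.45],
                  [0.8, 0.2]])
print(least_confident_query(probs))   # -> 1
```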
What is the margin sampling, uncertainty sampling method?
x* = argmin_x(P_theta(y_1|x) - P_theta(y_2|x))
where y_1 is the most likely label and y_2 is the second most likely label
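A corresponding sketch for margin sampling under the same assumed probability-matrix setup:

```python
import numpy as np

def margin_query(probs):
    """probs: (n_instances, n_classes) array of P_theta(y|x) over the pool.
    Returns the index of the instance with the smallest margin
    P_theta(y_1|x) - P_theta(y_2|x) between its two most likely labels."""
    sorted_probs = np.sort(probs, axis=1)[:, ::-1]     # probabilities, descending per row
    margin = sorted_probs[:, 0] - sorted_probs[:, 1]   # most likely minus second most likely
    return int(np.argmin(margin))                      # smallest margin -> most ambiguous
```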
What is the query by committee strategy?
Use multiple classifiers to predict on unlabelled data and select the instances with the highest disagreement.
Disagreement is measured with vote entropy: x* = argmax_x(-sum_i (V(y_i)/N) * log2(V(y_i)/N)), where V(y_i) is the number of committee votes for label y_i and N is the number of committee members
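A minimal sketch of vote entropy, assuming each committee member's predicted labels are available as an array (names are illustrative):

```python
import numpy as np
from collections import Counter

def vote_entropy_query(committee_votes):
    """committee_votes: (n_members, n_instances) array of predicted labels.
    Returns the index of the instance with the highest vote entropy,
    i.e. the one the committee disagrees on most."""
    n_members, n_instances = committee_votes.shape
    entropies = []
    for i in range(n_instances):
        counts = Counter(committee_votes[:, i].tolist())   # V(y_i) for each label
        p = np.array(list(counts.values())) / n_members    # V(y_i) / N
        entropies.append(-np.sum(p * np.log2(p)))          # vote entropy for instance i
    return int(np.argmax(entropies))
```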
What is semi-supervised learning?
Learning that utilises a combination of labelled and unlabelled data
What is the simple approach to semi-supervised learning?
Combine a supervised and an unsupervised learning model, e.g. cluster with k-means and assign each cluster the label of its most populous labelled class
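A sketch of that simple combination, assuming scikit-learn's KMeans for clustering and a majority vote over the labelled points in each cluster; everything apart from KMeans is an illustrative assumption:

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

def kmeans_pseudo_label(X_labelled, y_labelled, X_unlabelled, n_clusters):
    """Cluster labelled and unlabelled data together, then give each
    unlabelled point the most common label among the labelled points in
    its cluster (y_labelled is a 1-D numpy array)."""
    X_all = np.vstack([X_labelled, X_unlabelled])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_all)
    labelled_clusters = km.labels_[:len(X_labelled)]
    unlabelled_clusters = km.labels_[len(X_labelled):]

    # Majority (most populous) labelled class within each cluster
    cluster_label = {}
    for c in range(n_clusters):
        members = y_labelled[labelled_clusters == c]
        if len(members) > 0:
            cluster_label[c] = Counter(members.tolist()).most_common(1)[0][0]

    # Pseudo-labels for the unlabelled points; clusters with no labelled points get None
    return [cluster_label.get(c) for c in unlabelled_clusters]
```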
What is the difference between active and semi-supervised learning?
In semi-supervised learning, the algorithm automatically generates new (pseudo-)labels for unlabelled data, whereas in active learning the algorithm selects unlabelled instances and queries an oracle for their labels
What is the main assumption of self-training?
Similar instances are likely to have the same label
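Building on that assumption, a minimal self-training (pseudo-labelling) loop; the scikit-learn-style classifier interface and the confidence threshold are assumptions for illustration:

```python
import numpy as np

def self_train(clf, X_lab, y_lab, X_unlab, threshold=0.95, max_iters=10):
    """Repeatedly fit on the labelled set, pseudo-label the unlabelled
    instances the model is most confident about (assumed similar to the
    labelled ones), and move them into the labelled set."""
    for _ in range(max_iters):
        if len(X_unlab) == 0:
            break
        clf.fit(X_lab, y_lab)
        probs = clf.predict_proba(X_unlab)
        confident = probs.max(axis=1) >= threshold                # high-confidence predictions
        if not confident.any():
            break
        pseudo_y = clf.classes_[probs[confident].argmax(axis=1)]  # pseudo-labels
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo_y])
        X_unlab = X_unlab[~confident]                             # remove newly labelled points
    return clf.fit(X_lab, y_lab)
```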