lesson_13_flashcards
What is an embedding in machine learning?
A mapping of objects (e.g., words, nodes, images) into vectors in a continuous vector space, where proximity indicates similarity.
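A minimal sketch of the idea using a PyTorch lookup table; the vocabulary size, dimension, and object IDs below are arbitrary placeholders:

```python
# Minimal sketch: mapping object IDs to vectors with a lookup table.
# The vocabulary size (1000) and dimension (64) are arbitrary choices.
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=1000, embedding_dim=64)
ids = torch.tensor([3, 17])          # two hypothetical object IDs
vecs = embedding(ids)                # shape: (2, 64)

# Proximity (here cosine similarity) indicates how similar the objects are.
similarity = torch.cosine_similarity(vecs[0], vecs[1], dim=0)
print(similarity.item())
```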
What are word embeddings?
Vector representations of words that capture semantic and syntactic relationships, learned from co-occurrence in text data.
What is the distributional hypothesis in NLP?
The idea that words appearing in similar contexts tend to have similar meanings, forming the basis of word embeddings.
What is Word2Vec?
A neural embedding model that predicts context words given a target word (skip-gram) or target word given context words (CBOW).
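A toy skip-gram sketch, not the reference Word2Vec implementation; the corpus, window size, dimensions, and training loop are illustrative assumptions:

```python
# Toy skip-gram: given a target word, predict each word in its context window.
import torch
import torch.nn as nn

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = {w: i for i, w in enumerate(sorted(set(corpus)))}
window = 2

pairs = []  # (target, context) index pairs
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs.append((vocab[w], vocab[corpus[j]]))

dim = 16
in_emb = nn.Embedding(len(vocab), dim)   # target-word vectors (the embeddings)
out_proj = nn.Linear(dim, len(vocab))    # scores over the context vocabulary
opt = torch.optim.Adam(list(in_emb.parameters()) + list(out_proj.parameters()), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):
    targets = torch.tensor([t for t, _ in pairs])
    contexts = torch.tensor([c for _, c in pairs])
    loss = loss_fn(out_proj(in_emb(targets)), contexts)
    opt.zero_grad(); loss.backward(); opt.step()

word_vectors = in_emb.weight.detach()    # learned word embeddings
```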
What are graph embeddings?
Learned vector representations of nodes in a graph that encode structural and relational properties for downstream tasks.
What is negative sampling in Word2Vec?
A training trick that replaces the full softmax: for each true (target, context) pair, a small number of random "negative" words are sampled, and the model learns to score the true pair above them, making training far more efficient.
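A sketch of the negative-sampling objective for one positive pair; negatives are drawn uniformly here, whereas Word2Vec samples from a smoothed unigram distribution, and all sizes are assumptions:

```python
# Score one true (target, context) pair against k randomly sampled negatives.
import torch
import torch.nn.functional as F

vocab_size, dim, k = 5000, 64, 5
target_vecs = torch.randn(vocab_size, dim, requires_grad=True)
context_vecs = torch.randn(vocab_size, dim, requires_grad=True)

def neg_sampling_loss(target_id, context_id):
    t = target_vecs[target_id]
    pos = context_vecs[context_id]
    neg_ids = torch.randint(0, vocab_size, (k,))   # uniform here; Word2Vec uses
    neg = context_vecs[neg_ids]                    # a unigram^0.75 distribution
    pos_term = F.logsigmoid(t @ pos)               # pull the true pair together
    neg_term = F.logsigmoid(-(neg @ t)).sum()      # push sampled negatives apart
    return -(pos_term + neg_term)

loss = neg_sampling_loss(10, 42)
loss.backward()
```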
What are hierarchical embeddings?
Representations in hyperbolic space capturing hierarchical relationships, requiring fewer dimensions compared to Euclidean space.
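A sketch of the Poincaré-ball distance commonly used for such hyperbolic embeddings; the two example points are arbitrary:

```python
# Poincaré-ball distance: points live inside the unit ball, and points near the
# boundary typically represent more specific concepts in the hierarchy.
import torch

def poincare_distance(u, v, eps=1e-9):
    sq_u = (u * u).sum()
    sq_v = (v * v).sum()
    sq_diff = ((u - v) ** 2).sum()
    x = 1 + 2 * sq_diff / ((1 - sq_u) * (1 - sq_v) + eps)
    return torch.acosh(x)

u = torch.tensor([0.1, 0.2])   # near the origin: a general concept
v = torch.tensor([0.7, 0.6])   # near the boundary: a specific concept
print(poincare_distance(u, v))
```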
What is fairness in embeddings?
Ensuring embeddings do not amplify or perpetuate biases present in the training data, as seen in cases like gender-biased word analogies.
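One simple, illustrative bias probe: project word vectors onto a gender direction. The `vectors` dictionary of pretrained embeddings is an assumption, not provided here:

```python
# Probe gender bias by projecting occupation vectors onto the (he - she) direction.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def gender_bias(word, vectors):
    direction = vectors["he"] - vectors["she"]
    return cosine(vectors[word], direction)   # > 0 leans "he", < 0 leans "she"

# e.g. compare gender_bias("programmer", vectors) vs. gender_bias("homemaker", vectors)
```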
What is PyTorch BigGraph?
A scalable framework for training embeddings on large graphs with billions of nodes and edges, using techniques like partitioning.
What is intrinsic evaluation of embeddings?
Evaluation based on internal properties, such as nearest neighbor quality or analogy tasks, to assess semantic and syntactic relationships.
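A sketch of the classic analogy test (king - man + woman ≈ queen), assuming a `vectors` dict mapping words to numpy arrays:

```python
# Analogy test: return the nearest words to (b - a + c) by cosine similarity.
import numpy as np

def analogy(a, b, c, vectors, topn=1):
    query = vectors[b] - vectors[a] + vectors[c]
    scores = {
        w: query @ v / (np.linalg.norm(query) * np.linalg.norm(v))
        for w, v in vectors.items() if w not in (a, b, c)
    }
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# analogy("man", "king", "woman", vectors)  ->  expected: ["queen"]
```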
What is extrinsic evaluation of embeddings?
Assessment based on performance in downstream tasks like classification, clustering, or recommendation.
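A sketch of extrinsic evaluation with scikit-learn: embeddings (random stand-ins here) are used as features for a downstream classifier and judged by its accuracy:

```python
# Feed embeddings to a downstream classifier and report held-out accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.random.randn(200, 32)            # stand-in for learned embeddings
y = np.random.randint(0, 2, size=200)   # downstream task labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("downstream accuracy:", clf.score(X_te, y_te))
```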
What is matrix factorization in graph embeddings?
A method to decompose the adjacency matrix of a graph into low-dimensional latent factors, representing nodes as embeddings.
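A sketch using truncated SVD on a toy 4-node adjacency matrix; real systems factorize far larger, often reweighted, matrices:

```python
# Factorize the adjacency matrix into low-dimensional node embeddings.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

U, S, Vt = np.linalg.svd(A)
d = 2                                         # embedding dimension
node_embeddings = U[:, :d] * np.sqrt(S[:d])   # one row per node
print(node_embeddings)
```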
What is hierarchical softmax in Word2Vec?
A technique for efficiently computing word probabilities over large vocabularies: words are leaves of a binary tree, and a word's probability is the product of binary decisions along its path, reducing the cost from O(|V|) to O(log |V|).
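A sketch over a 4-word toy tree; the tree layout and vectors are illustrative assumptions, and the word probabilities still sum to 1:

```python
# Hierarchical softmax: a word's probability is a product of sigmoid decisions
# taken at the internal nodes along its path in a binary tree.
import torch

dim = 8
# Path for each word: (internal_node_id, direction), direction = +1 left, -1 right.
paths = {
    "cat": [(0, +1), (1, +1)],
    "dog": [(0, +1), (1, -1)],
    "car": [(0, -1), (2, +1)],
    "bus": [(0, -1), (2, -1)],
}
node_vecs = torch.randn(3, dim)   # one vector per internal node
hidden = torch.randn(dim)         # context representation of the input word

def word_prob(word):
    p = torch.tensor(1.0)
    for node, direction in paths[word]:
        p = p * torch.sigmoid(direction * (node_vecs[node] @ hidden))
    return p

print(sum(word_prob(w) for w in paths))   # probabilities over the tree sum to 1
```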
What is contextual word embedding?
Embeddings produced by models such as ELMo and BERT, which generate a word's representation dynamically from its surrounding sentence, so the same word can receive different vectors in different contexts.
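A sketch with the Hugging Face Transformers library (assumes the library and the bert-base-uncased weights are available): the same word gets different vectors in different sentences:

```python
# Extract the contextual vector for one word and compare it across sentences.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def token_vector(sentence, word):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

v1 = token_vector("i sat by the river bank", "bank")
v2 = token_vector("i deposited cash at the bank", "bank")
print(torch.cosine_similarity(v1, v2, dim=0))   # < 1: context changes the vector
```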
How do embeddings enable recommendation systems?
By representing users and items in the same space, embeddings help predict preferences and recommend similar items.
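A sketch of scoring items for a user by dot product in a shared space; the embedding matrices here are random stand-ins for learned ones:

```python
# Rank items for a user by dot product between user and item embeddings.
import torch

n_users, n_items, dim = 100, 500, 32
user_emb = torch.randn(n_users, dim)
item_emb = torch.randn(n_items, dim)

user_id = 7
scores = item_emb @ user_emb[user_id]          # one score per item
top_items = torch.topk(scores, k=5).indices    # recommend the 5 best matches
print(top_items)
```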