Representation Learning Flashcards

1
Q

What is representation learning? Why is it important?

A
  • learning representations of the input data
    • generally by transforming it
    • makes it easier to perform a task
  • the performance of any machine learning model is critically dependent on the representations it uses or learns
2
Q

What is Principal Component Analysis (PCA)? What does it do?

A
  • method that aims at re-expressing a given dataset using a linear transformation
  • center the data by subtracting off the mean of each measurement
  • compute the eigendecomposition XX^T = EDE^T and pose P = E^T (the rows of P are the eigenvectors)
  • select the k ≪ m most important components from P according to D (largest eigenvalues first)
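The steps above can be sketched with NumPy; the shapes and data here are made up, and the data matrix follows the card's convention of one measurement per row:

```python
import numpy as np

# hypothetical data: m = 3 measurements (rows), n = 100 samples (columns)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 100))

# 1. center the data: subtract off the mean of each measurement
X = X - X.mean(axis=1, keepdims=True)

# 2. eigendecompose XX^T = EDE^T (eigh handles symmetric matrices)
eigvals, E = np.linalg.eigh(X @ X.T)

# 3. sort by decreasing eigenvalue and pose P = E^T (rows = principal directions)
order = np.argsort(eigvals)[::-1]
P = E[:, order].T

# 4. keep the k << m most important components
k = 2
Y = P[:k] @ X   # k x n re-expressed data
```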
3
Q

What transformations give PCA a problem?

A
  • since PCA searches for a linear transformation, it is sensitive to contamination of the data, given by:
    • noise (errors and interference that make the data deviate from the norm)
    • redundancy (multiple variables that can be reduced to a single one)
  • we aim at finding a transformation that minimizes both
4
Q

How can we estimate noise and redundacy in data?

A
  • both can be estimated using measures related to the variance of the data
    • the signal-to-noise ratio (SNR) to measure noise
    • the covariance matrix to measure redundancy between features
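A minimal sketch of both estimates, assuming a made-up sinusoidal signal and a deliberately redundant second feature:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
signal = np.sin(np.linspace(0, 10, n))   # hypothetical clean signal
noise = 0.1 * rng.normal(size=n)
measured = signal + noise

# signal-to-noise ratio: variance of the signal over variance of the noise
snr = signal.var() / noise.var()         # high SNR -> measurement is mostly signal

# redundancy: a second feature that is (almost) a rescaling of the first
X = np.vstack([measured, 2 * measured + 0.01 * rng.normal(size=n)])
X = X - X.mean(axis=1, keepdims=True)
S = (X @ X.T) / n                        # large off-diagonal entry -> redundancy
```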
5
Q

How does the covariance matrix work?

A
  • the dataset must be centered (mean 0)
  • the covariance matrix is computed as
    • S_X = (1/n) X X^T
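A quick check of the formula against NumPy's built-in estimator (note that `np.cov` divides by n-1 unless `bias=True`); the data is made up:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 500))              # 4 features, 500 samples
X = X - X.mean(axis=1, keepdims=True)      # the dataset must be centered (mean 0)

n = X.shape[1]
S_X = (X @ X.T) / n                        # covariance matrix as on the card

# agrees with numpy's biased covariance estimator (rows = variables)
assert np.allclose(S_X, np.cov(X, bias=True))
```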
6
Q

How does PCA reduce features covariance?

A
  • we are aiming at diagonalizing the covariance matrix

  • find some orthonormal matrix P with X' = PX such that the covariance matrix of X' is diagonal
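A small demonstration, with made-up data, that choosing the rows of P as the eigenvectors of S_X makes the covariance of X' = PX diagonal:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 200))
X = X - X.mean(axis=1, keepdims=True)

S_X = (X @ X.T) / X.shape[1]
_, E = np.linalg.eigh(S_X)      # S_X = E D E^T, with E orthonormal
P = E.T                         # rows of P are the eigenvectors of S_X

X2 = P @ X                      # X' = PX
S_X2 = (X2 @ X2.T) / X2.shape[1]

# off-diagonal covariances vanish: the features of X' are decorrelated
assert np.allclose(S_X2, np.diag(np.diag(S_X2)), atol=1e-10)
```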

7
Q

How does PCA perform dimensionality reduction?

A
  • the most important components are the first ones
    • the directions along which the data presents the most variability
  • ignoring the less important dimensions (those with the least variance) can help in simplifying the data
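A sketch of how the eigenvalues in D quantify importance; the data is constructed (arbitrarily) so that most of the variance lies along one axis:

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical data: per-feature scales 10, 1, 0.1 -> one dominant direction
X = rng.normal(size=(3, 300)) * np.array([[10.0], [1.0], [0.1]])
X = X - X.mean(axis=1, keepdims=True)

eigvals, E = np.linalg.eigh((X @ X.T) / X.shape[1])
eigvals, E = eigvals[::-1], E[:, ::-1]      # sort by decreasing variance

# fraction of total variance explained by the first (most important) component
explained = eigvals[0] / eigvals.sum()
```

Keeping only components whose cumulative explained variance passes a chosen threshold is a common way to pick k.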
8
Q

What is kernel PCA?

A
  • extension of conventional PCA to deal with non-linear correlations using the
    kernel trick
    1. data is not used directly but mapped implicitly to some non-linear feature space
    2. center the data in feature space
    3. apply PCA in the feature space
    4. the obtained transformations are non-linear in the original data space
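The four steps can be sketched in NumPy with an RBF kernel (a hypothetical choice; any kernel works). Unlike the earlier PCA cards, rows are samples here:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 2))                      # 50 samples, 2 features

# 1. implicit non-linear mapping via an RBF kernel matrix K_ij = k(x_i, x_j)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)

# 2. center the data in feature space (double-centering of K)
n = len(X)
one = np.ones((n, n)) / n
Kc = K - one @ K - K @ one + one @ K @ one

# 3. apply PCA in feature space: eigendecompose the centered kernel matrix
eigvals, eigvecs = np.linalg.eigh(Kc)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# 4. projections onto the top-k non-linear components
k = 2
Y = eigvecs[:, :k] * np.sqrt(np.clip(eigvals[:k], 0, None))
```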
9
Q

What can PCA be used for?

A
  • apply lossy compression
  • visualize multi-dimensional data in 2D or 3D
  • reduce the number of dimensions to discard noisy features
  • perform a change of representation in order to make analysis of the data at hand easier
10
Q

What is an autoencoder?

A
  • unsupervised learning technique based on feed-forward neural networks
  • learns a representation for a set of data
    • generally for dimensionality reduction
    • can be used for learning generative models of data
11
Q

How is the autoencoder composed? What do the components do?

A
  • encoder
    • creates a new representation (the code) of the input
  • decoder
    • reconstructs the input starting from the code
  • bottleneck layer
    • compresses the data and makes the task harder, forcing a useful code
  • can have more hidden layers (simple vs deep)
    • non-linear activation functions
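A minimal forward-pass sketch of the three components; the layer sizes and the tanh activation are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(6)

# hypothetical sizes: 8-dim input squeezed through a 3-unit bottleneck
d_in, d_code = 8, 3
W_enc = rng.normal(scale=0.1, size=(d_code, d_in))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(d_in, d_code))   # decoder weights

def encode(x):
    return np.tanh(W_enc @ x)      # encoder: input -> code (non-linear activation)

def decode(c):
    return W_dec @ c               # decoder: code -> reconstruction of the input

x = rng.normal(size=d_in)
code = encode(x)                   # bottleneck representation
x_hat = decode(code)               # reconstructed input
```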
12
Q

What loss is generally used in an autoencoder? What is the learning procedure?

A
  • the standard loss function is a mean squared error loss
    • called reconstruction loss
  • backpropagation and SGD generally used for training
    • no specialized algorithm
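A toy linear autoencoder trained with the reconstruction (MSE) loss and plain full-batch gradient descent (in practice SGD with minibatches is used); the sizes and data are made up:

```python
import numpy as np

rng = np.random.default_rng(7)
# hypothetical data lying in a 2-dim subspace of a 5-dim space
Z = rng.normal(size=(2, 200))
A = rng.normal(size=(5, 2))
X = A @ Z

W_enc = rng.normal(scale=0.1, size=(2, 5))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(5, 2))   # decoder weights

def mse(X_hat, X):
    return ((X_hat - X) ** 2).mean()         # reconstruction loss

lr = 0.01
losses = []
for _ in range(500):
    C = W_enc @ X                            # forward pass: encode
    X_hat = W_dec @ C                        # forward pass: decode
    E = X_hat - X
    losses.append(mse(X_hat, X))
    g = 2.0 / E.size                         # gradient factor of the mean sq. error
    grad_dec = g * (E @ C.T)                 # backprop through the decoder
    grad_enc = g * (W_dec.T @ E @ X.T)       # backprop through the encoder
    W_dec -= lr * grad_dec                   # gradient descent step
    W_enc -= lr * grad_enc
```

The reconstruction loss falls as the network learns the 2-dim subspace.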
13
Q

What types of autoencoders are there?

A
  • regularized autoencoders
    • loss function that encourages sparsity in the representation and robustness to noise
  • sparse autoencoders
    • sparsity in the activation of hidden units -> L1/L2 regularization
  • denoising autoencoders
    • trained to remove noise and corruption from the input
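A sketch of the two loss modifications; the activations, penalty weight, and noise level are all made-up values:

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(size=16)              # hypothetical clean input
code = np.tanh(rng.normal(size=4))   # hypothetical hidden-unit activations

# sparse autoencoder: add an L1 penalty on the hidden activations to the loss
lam = 1e-3
sparsity_penalty = lam * np.abs(code).sum()

# denoising autoencoder: corrupt the input, but reconstruct the CLEAN x
x_noisy = x + 0.1 * rng.normal(size=x.shape)
# i.e. the loss would be mse(decode(encode(x_noisy)), x), not mse(..., x_noisy)
```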
14
Q

What are CNN? How do they work?

A
  • Convolutional Neural Networks
    • specialized kind of NN for processing data that has a known grid-like topology (such as images)
  • employs convolution, a specialized linear mathematical operation
  • learns different levels of abstraction of the input
    • first hidden layers detect general patterns (edges)
    • deeper hidden layers detect more specific abstractions (textures, patterns)
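A minimal "valid" convolution in NumPy (technically cross-correlation, as used in most CNN libraries), with a hand-built filter that responds to vertical edges:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # each output value = filter applied to one local patch
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# a vertical step edge and a hypothetical edge-detecting filter
image = np.zeros((5, 5))
image[:, 3:] = 1.0
edge_filter = np.array([[-1.0, 1.0]])

response = conv2d(image, edge_filter)   # strong response only at the edge
```

In a real CNN the filter values are learned, not hand-set.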
15
Q

What is a typical CNN architecture?

A
  • several convolutional layers + pooling (subsampling) layers, followed by a fully connected network with ReLU
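A common way to track the spatial sizes through such a stack is the standard formula (W - F + 2P)/S + 1; the 28x28 input and layer sizes below are made up:

```python
def out_size(W, F, P=0, S=1):
    """Output spatial size of a conv/pool layer.
    W = input size, F = filter size, P = padding, S = stride."""
    return (W - F + 2 * P) // S + 1

# hypothetical stack: 28x28 input -> 5x5 conv -> 2x2 max-pool (stride 2)
after_conv = out_size(28, 5)                 # 28 - 5 + 1 = 24
after_pool = out_size(after_conv, 2, S=2)    # (24 - 2) / 2 + 1 = 12
```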
16
Q

What is neural words embedding?

A
  • refers to a technique used in the text preprocessing phase
  • transforms text into a vector of numbers

17
Q

What is Word2Vec?

A
  • performs word embedding
  • based on a feed-forward fully connected architecture
    • encodes each word in a vector
    • the aim is to represent words so as to capture semantic and syntactic word similarity
  • similar to an AE, but trained against the context (neighboring words)
    • CBOW -> uses the context to predict a target word
    • Skip-gram -> uses a target word to predict its context
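A sketch of how skip-gram training pairs are generated from a toy sentence (the window size of 1 is an arbitrary choice):

```python
# toy corpus: one sentence, already tokenized
sentence = "the cat sat on the mat".split()

def skipgram_pairs(tokens, window=1):
    """(target, context) pairs: each word predicts its neighbors."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

pairs = skipgram_pairs(sentence)
# e.g. ("cat", "the") and ("cat", "sat") are both training pairs
```

CBOW would instead group the context words together to predict the target.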
18
Q

What is a knowledge graph? What is it used for?

A
  • multi-relational graph
    • defines relationships between entities
  • edges are facts, i.e., triples (head, relation, tail)
  • effective in representing structured data
    • hard to manipulate (symbolic nature of triples)
19
Q

What is Knowledge Graph Embedding and what is it used for?

A
  • embeds KG components into continuous vector spaces
    • simplify manipulation while preserving structure
  • can be used for various tasks such as Knowledge Graph Completion
20
Q

What is knowledge graph completion?

A
  • KGs are typically incomplete or incorrect
    • perform knowledge base completion (link prediction)
  • predict missing edges, or the probability of correctness of edges, in the graph
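A sketch of link prediction in the TransE style (one popular KG embedding model, where a true fact (h, r, t) should satisfy h + r ≈ t); the entities are made up and the relation vector is hand-set instead of learned, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(9)
entities = ["rome", "italy", "paris", "france"]
dim = 8
e = {name: rng.normal(size=dim) for name in entities}  # entity embeddings

# hand-set relation vector so the known fact (rome, capital_of, italy) fits;
# a real model would learn this from many training triples
r = {"capital_of": e["italy"] - e["rome"]}

def score(h, rel, t):
    # higher score = more plausible triple (smaller translation error)
    return -np.linalg.norm(e[h] + r[rel] - e[t])

# link prediction: rank candidate tails for the query (rome, capital_of, ?)
candidates = ["italy", "france", "paris"]
best = max(candidates, key=lambda t: score("rome", "capital_of", t))
```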