ML Flashcards
tf: In TensorFlow you imagine
everything you are computing as a graph. Nodes are the transformations on the data, or the functions you are running on the data; they can have multiple inputs and outputs. The edges (the things connecting nodes) are the data.
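A minimal sketch of that picture, using the TensorFlow 1.x API these cards assume (the names a, b, product, and total are made up for illustration):

import tensorflow as tf

# Nodes are operations (a multiply and an add); the edges between them are tensors.
a = tf.constant(2.0)
b = tf.constant(3.0)
product = tf.multiply(a, b)   # a node with two inputs and one output
total = tf.add(product, b)    # consumes the edge coming out of the multiply node

sess = tf.Session()
print(sess.run(total))        # evaluates the graph: 2 * 3 + 3 = 9.0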
dl: backpropagation is
looking at the output of a deep neural network and comparing it to the desired output. Based on the difference between the correct answer and the prediction, you adjust the weights of the layer right before the output to move it toward the correct answer. Then, based on the error attributed to the second-to-last layer, you adjust the layer before that, and so on back toward the input.
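A rough sketch of one backpropagation step in plain NumPy, assuming a tiny two-layer network with sigmoid activations and a squared-error loss (all names and numbers here are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[0.5, 0.1]])        # one sample with 2 features
y = np.array([[1.0]])             # desired output
W1 = np.random.randn(2, 3) * 0.1  # weights into the hidden layer
W2 = np.random.randn(3, 1) * 0.1  # weights into the output (last) layer

# Forward pass
h = sigmoid(x @ W1)               # hidden activations, shape (1, 3)
y_hat = sigmoid(h @ W2)           # prediction, shape (1, 1)

# Backward pass: start from the output error and push it back one layer at a time
err_out = (y_hat - y) * y_hat * (1 - y_hat)   # error at the output layer
err_hid = (err_out @ W2.T) * h * (1 - h)      # error attributed to the layer before it

lr = 0.1
W2 -= lr * (h.T @ err_out)        # adjust the last layer first
W1 -= lr * (x.T @ err_hid)        # then the layer before it, and so on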
ml: The points plotted near the decision boundary are called support vectors because
they are the ones that force the decision boundary to be where it is.
ml: A Support Vector Machine is similar to a Nearest Neighbors because
both classify a point by referring back to stored training points. The difference is that an SVM only keeps the points that define the decision boundary (the support vectors), while nearest neighbors keeps every training point, whether or not it influences the boundary.
tf: A tensor is
a typed ndarray
tf: A one hot vector is
a vector with a zero in every column except one, which holds a 1. The column the 1 is in represents the class the sample belongs to.
ml: MNIST is a
computer vision dataset with images of handwritten digits and their labels
ml: softmax is a
multinomial logistic regression
ml: Softmax is good for
when you need the probabilities of a record belonging to each class
tf: A bias is used to
add evidence that is independent of the input, for example to tell the algorithm that a certain class is more frequent in general
tf: To create the table that holds all your samples, type
x = tf.placeholder(tf.float32, [None, 784])
tf: in x = tf.placeholder(tf.float32, [None, 1000]), None means
that that dimension can vary
tf: in x = tf.placeholder(tf.float32, [None, 784]), 784 is
the number of columns (one per feature of each sample)
tf: To create the weights variable, type
W = tf.Variable(tf.zeros([784, 10]))
tf: To create the biases variable, type
b = tf.Variable(tf.zeros([10]))
tf: To create a softmax model, type
y = tf.nn.softmax(tf.matmul(x, W) + b)
tf: a good cost function is called
cross-entropy
tf: Gradient descent is a simple procedure, where
TensorFlow shifts each variable a little bit in the direction that reduces the cost
ml: Using small batches of random data for training is called
stochastic training
tf: To create your cross entropy cost function, type
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
tf: To initialize your gradient descent optimizer with a learning rate of 0.01 and a cost function called cross entropy, type
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
tf: To initialize all the variables and then run a session, type
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
tf: a feed dict is
a dict that maps each placeholder to the data it should hold for a run, e.g. a batch of samples to the x placeholder and their labels to the y_ placeholder
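A small sketch of a training loop that uses such a feed_dict with the train_step defined above (mnist here is assumed to be the input_data helper from the old TensorFlow MNIST tutorial):

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)   # a small random batch (stochastic training)
    # The feed_dict maps each placeholder to the data it should hold for this run
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})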
tf: To connect to TensorFlow's C++ back end you use
a session
ml: A bias
adds a number to the input times the weight
ml: An activation function is
a function that takes in all of the inputs and then outputs a value
ml: ReLU stands for
Rectified linear units
ml: The rectifier function's formula is
f(x) = max(0, x)
ml: The rectifier is
the most popular activation function for DNNs
ml: A Convolutional Neural network is
a neural network structured in a way that is better for images.
tf: The basic procedure for creating a tf model is
import the data, create the tensors, create a session, create your softmax layer, create your loss function, create the train step, and evaluate
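Putting the earlier cards together, a minimal end-to-end sketch of that procedure for MNIST softmax in TensorFlow 1.x (the "MNIST_data/" path and the iteration counts are assumptions, not part of these cards):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Import the data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Create the tensors
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Create your softmax layer
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Create your loss function
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

# Create the train step
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

# Create a session, initialize, and train
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluate
correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))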
tf: A more sophisticated optimizer than GradientDescentOptimizer is
AdamOptimizer
tf: keep_prob in the feed_dict
controls dropout: it is the probability that each unit is kept, so a lower keep_prob means more dropout
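A small sketch of how keep_prob is typically wired up with tf.nn.dropout (h here stands for some hidden-layer activations in your own model; h, test_xs, and test_ys are assumptions for illustration):

keep_prob = tf.placeholder(tf.float32)
h_drop = tf.nn.dropout(h, keep_prob)   # each unit is kept with probability keep_prob

# Drop units during training, keep them all during evaluation
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})
sess.run(accuracy, feed_dict={x: test_xs, y_: test_ys, keep_prob: 1.0})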
ml: A linear function is just a
giant matrix multiply
ml: A Logistic Classifier is
a linear classifier
tf: A softmax function takes all the scores from the linear functions and
turns them into class probabilities that together add up to one
ml: Scores in the context of logistic classifiers are also called
logits
ml: When you multiply to increase the size of your logits
the differences between the logits grow, the softmax probabilities get pushed toward 0 and 1, and the classifier becomes very confident
ml: When you divide to decrease the size of your logits
the logits move closer together, the softmax probabilities approach a uniform distribution, and the classifier becomes less confident
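A quick numeric sketch of both effects, using a plain NumPy softmax (the scores are made up):

import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([1.0, 2.0, 3.0])
print(softmax(scores))          # ~[0.09, 0.24, 0.67]
print(softmax(scores * 10.0))   # ~[0.00, 0.00, 1.00]  multiplied: very confident
print(softmax(scores / 10.0))   # ~[0.30, 0.33, 0.37]  divided: close to uniform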
ml: For one hot encoding
you make a vector with the same number of items as there are classes, then give each class one index of the vector that represents it by setting its value to 1 while the rest of the values are 0.
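A tiny NumPy sketch of that encoding (the labels are made up: class indices 0, 2, 1 over three classes):

import numpy as np

labels = np.array([0, 2, 1])                  # three samples, three classes
one_hot = np.zeros((labels.size, 3))
one_hot[np.arange(labels.size), labels] = 1   # set the index of each sample's class to 1
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]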
ml: A vector is
an array
ml: To make one hot encoding more efficient for models with thousands of classes we use
embeddings
ml: Cross entropy is
a measure of the distance between the predicted probability vector and the one-hot encoded vector for the correct class.
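In formula form (a standard way to write the same idea, where S is the vector of predicted probabilities and L is the one-hot label vector):

D(S, L) = -sum_i L_i * log(S_i)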
ml: Multinomial logistic classification
inputs go through a linear function to produce logits; the logits go into a softmax function to become probabilities that sum to 1; and the probabilities are then compared to the one-hot encoded label using cross entropy.
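The whole pipeline for a single sample, sketched in plain NumPy (every array here is made up for illustration):

import numpy as np

x = np.array([0.5, -1.2, 3.0])           # input features
W = np.random.randn(3, 4) * 0.1          # weights for 4 classes
b = np.zeros(4)                          # biases
label = np.array([0.0, 0.0, 1.0, 0.0])   # one-hot encoded correct class

logits = x @ W + b                        # linear function -> logits
e = np.exp(logits - logits.max())
probs = e / e.sum()                       # softmax -> probabilities that sum to 1
cross_entropy = -np.sum(label * np.log(probs))   # compare to the one-hot vector
print(probs, cross_entropy)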