Numpy and Keras Flashcards
Populate an array with a sequence of integers:
sequence_of_integers = np.arange(5, 12)  # [5, 6, 7, 8, 9, 10, 11]; the upper bound is exclusive
Matplotlib scatter plot
import numpy as np
import matplotlib.pyplot as plt

x = np.random.random(100)
y = np.random.random(100)
plt.scatter(x, y)
Populate an array with 20 random integers from 0 to 100 (inclusive)
np.random.randint(low=0, high=101, size=20)
Broadcasting
NumPy virtually expands the smaller operand to dimensions compatible with the larger operand, so element-wise operations work on arrays of different shapes (sketched below).
Multiply each cell in a vector by 3
x*3
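A slightly fuller broadcasting sketch: x*3 above broadcasts a scalar across a vector; here a 1-D row is broadcast across a 2-D matrix (the shapes and values are illustrative):

import numpy as np

matrix = np.arange(12).reshape(3, 4)  # shape (3, 4)
row = np.array([10, 20, 30, 40])      # shape (4,)

# NumPy virtually expands `row` to shape (3, 4), so it is
# added element-wise to every row of `matrix`.
print(matrix + row)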
Assign a sequence of integers from 6 to 20 (inclusive) to a NumPy array named feature.
Assign 15 values to a NumPy array named label such that:
label = (3)(feature) + 4
feature = np.arange(6, 21)
label = (3 * feature) + 4
plt.scatter(feature, label)
Make a DataFrame from an array of arrays of data and a list of column names.
my_dataframe = pd.DataFrame(data=my_data, columns=my_column_names)
Create a new column named adjusted derived from the activity column.
my_dataframe["adjusted"] = my_dataframe["activity"] + 2
Create an 3x4 (3 rows x 4 columns) pandas DataFrame in which the columns are named Eleanor, Chidi, Tahani, and Jason. Populate each of the 12 cells in the DataFrame with a random integer between 0 and 100, inclusive.
cn = ['Eleanor', 'Chidi', 'Tahani', 'Jason']
my_data = []
for i in range(3):
    my_data.append(np.random.randint(low=0, high=101, size=4))
df = pd.DataFrame(data=my_data, columns=cn)
Copying a DataFrame
pd.DataFrame.copy
Referencing. If you assign a DataFrame to a new variable, any change to the DataFrame or to the new variable will be reflected in the other.
Copying. If you call the pd.DataFrame.copy method, you create a true independent copy.
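A minimal demonstration of the difference (the DataFrame contents are illustrative):

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})
reference = df         # both names point at the same DataFrame
true_copy = df.copy()  # independent copy of the data

df.loc[0, 'a'] = 99
print(reference.loc[0, 'a'])  # 99: the change is visible through the reference
print(true_copy.loc[0, 'a'])  # 1: the copy is unaffected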
Typical hyperparameters
learning rate
epochs
batch_size
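A minimal Keras sketch showing where these knobs plug in; the model, the optimizer choice, and the hyperparameter values here are illustrative placeholders, not recommendations:

import numpy as np
import tensorflow as tf

x = np.arange(6, 21, dtype=np.float32).reshape(-1, 1)  # 15 examples, 1 feature
y = 3 * x + 4

model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(1,))])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # learning rate
    loss='mean_squared_error',
)
model.fit(x, y, epochs=30, batch_size=5)  # epochs and batch_size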
hyperparameter
The “knobs” that you tweak during successive runs of training a model. For example, learning rate is a hyperparameter.
linear model
A model that assigns one weight per feature to make predictions. (Linear models also incorporate a bias.) By contrast, the relationship of weights to features in deep models is not one-to-one.
A linear model uses the following formula:
y' = b + Σᵢ wᵢxᵢ  (the bias b plus the weighted sum of the features)
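A tiny worked example of the formula (the weights, bias, and feature values are made up):

import numpy as np

w = np.array([0.5, -1.2, 2.0])  # one weight per feature
x = np.array([3.0, 1.0, 0.5])   # feature values
b = 4.0                         # bias

y_prime = b + np.dot(w, x)      # 4.0 + (1.5 - 1.2 + 1.0) = 5.3
print(y_prime)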
epoch
A full training pass over the entire dataset such that each example has been seen once.
Relationship between epoch, batch_size and training iterations
iterations per epoch == N/batch_size
One epoch represents N/batch_size training iterations, where N is the total number of examples.
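A quick worked example (the numbers are made up):

N = 1200         # total training examples
batch_size = 60  # examples processed per iteration
print(N // batch_size)  # 20 iterations (weight updates) per epoch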
batch size
The system recalculates the model’s loss value and adjusts the model’s weights and bias after each iteration. Each iteration is the span in which the system processes one batch. For example, if the batch size is 6, then the system recalculates the model’s loss value and adjusts the model’s weights and bias after processing every 6 examples.
An oscillating loss curve strongly suggests
learning rate is too high
SGD Batch size
1. Stochastic gradient descent processes a single example per iteration.
The batch size of a mini-batch is usually between 10 and 1000. Batch size is usually fixed during training and inference.
TensorFlow does/does not permit dynamic batch sizes.
TensorFlow does permit dynamic batch sizes.
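A minimal sketch, assuming the Keras functional API: leaving the batch dimension unspecified (None) lets the same model accept any batch size at training and inference time.

import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(10,))  # batch dimension is left as None
outputs = tf.keras.layers.Dense(1)(inputs)
model = tf.keras.Model(inputs, outputs)

print(model(np.ones((4, 10), dtype='float32')).shape)   # (4, 1)
print(model(np.ones((32, 10), dtype='float32')).shape)  # (32, 1)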
Hyperparameter tuning
Training loss should steadily decrease, steeply at first, and then slowly until the slope of the loss curve approaches 0.
If the training loss does not converge, train for more epochs.
Training loss decreases too slowly: increase learning rate.
Training loss jumps around: decrease learning rate.
First, try large batch size values. Then, decrease the batch size until you see degradation.
Very large number of examples: reduce batch size to enable a batch to fit into memory.
The ideal combination of hyperparameters is...
data dependent: always experiment.
Pivot a table in pandas
data["hours_since_admitted_rounded"] = round(data["hours_since_admitted"])
# One column per component_name, one row per rounded hour; missing cells are filled with 0.
pivoted = data.pivot_table(columns="component_name", values="z_score_ord_num_value", index="hours_since_admitted_rounded", fill_value=0)
pivoted.head(10)