Week 3 (TensorFlow) Flashcards
TensorFlow, Python, Pandas
What is TensorFlow?
A graph-based computational framework for building machine learning models.
What is the purpose of the Estimator part of the toolkit (tf.estimator)?
It provides higher-level APIs for specifying predefined architectures, such as linear regressors or neural networks.
Describe pandas DataFrame.
A DataFrame in Pandas can be described as a relational data table, with rows and named columns. A DataFrame contains one or more Series and a name for each Series.
Describe pandas Series.
A Series in Pandas is a single column.
How to start using Pandas in Python?
import pandas as pd
How to create a Series object in Python?
Call the pandas constructor Series() with the data as a list: values in square brackets, separated by commas. This constructs a Series object.
city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])
population = pd.Series([852469, 1015785, 485199])
How to create a DataFrame object in Python?
Call the pandas constructor DataFrame() with a dict that maps each column name (a string) to its Series, e.g. {'column name': series}.
cities_df = pd.DataFrame({'City names': city_names, 'Population': population})
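A minimal end-to-end sketch of the two cards above, reusing the same city data. The print calls are only there to inspect the result.

```python
import pandas as pd

# Two Series of equal length, one per future column.
city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])
population = pd.Series([852469, 1015785, 485199])

# The dict keys become the column names of the DataFrame.
cities_df = pd.DataFrame({'City names': city_names, 'Population': population})

print(cities_df.shape)          # (rows, columns)
print(list(cities_df.columns))  # column names in insertion order
```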
How to read data from a CSV file into a DataFrame object in Python?
Call the pandas function read_csv() with the path or URL of the file. You can optionally specify a separator with sep (the default is a comma).
california_housing_dataframe = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv", sep=",")
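read_csv() also accepts a file-like object, which makes it possible to try the call without a network connection. The sketch below assumes a hypothetical two-row CSV string; its values are made up for illustration and are not taken from the real dataset.

```python
import io
import pandas as pd

# Hypothetical sample data (illustrative only, not from the real file).
csv_text = "housing_median_age,total_rooms\n15.0,5612.0\n19.0,7650.0\n"

# io.StringIO wraps the string so read_csv can treat it like a file.
sample_df = pd.read_csv(io.StringIO(csv_text), sep=",")

print(sample_df)
```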
How can you get quick stats about the data in a DataFrame?
Use the describe() function of a DataFrame object.
california_housing_dataframe.describe()
How can you see the first few rows in a DataFrame?
Use the head() function of the DataFrame object. By default it returns the first five rows.
california_housing_dataframe.head()
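A small sketch of describe() and head() on the cities data from the earlier cards, so it runs without downloading anything. The variable names stats and first_rows are just illustrative.

```python
import pandas as pd

cities_df = pd.DataFrame({
    'City names': ['San Francisco', 'San Jose', 'Sacramento'],
    'Population': [852469, 1015785, 485199],
})

# describe() summarizes the numeric columns (count, mean, std, min, quartiles, max).
stats = cities_df.describe()
print(stats)

# head(n) returns the first n rows; head() with no argument returns five.
first_rows = cities_df.head(2)
print(first_rows)
```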
How can you quickly see the distribution of values in a column in a DataFrame?
Use the hist() function of the DataFrame object, passing the column name.
california_housing_dataframe.hist('housing_median_age')
Describe 3 easy ways to see/access data in a DataFrame.
Specify the column name in square brackets to get that column as a Series (e.g. cities_df['City names'])
Add an index (starting at 0) to get a single value from that column (e.g. cities_df['City names'][2])
Specify a slice (starting at 0) to get all columns for the rows in that range (e.g. cities_df[0:2] returns the first two rows)
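The three access patterns above, sketched on the cities DataFrame from the earlier cards. The variable names on the left are only for illustration.

```python
import pandas as pd

cities_df = pd.DataFrame({
    'City names': ['San Francisco', 'San Jose', 'Sacramento'],
    'Population': [852469, 1015785, 485199],
})

# 1. Column name in square brackets: returns the whole column as a Series.
names = cities_df['City names']

# 2. Column, then index label: returns a single value (labels default to 0, 1, 2, ...).
third_city = cities_df['City names'][2]

# 3. Slice: returns a DataFrame with the selected rows and all columns.
first_two = cities_df[0:2]

print(names)
print(third_city)
print(first_two)
```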
Can you manipulate all data in a Series at once?
Yes. E.g. population / 1000
divides every element of the Series population by 1000 and returns the result as a new Series; population itself is unchanged.
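A short sketch of this element-wise arithmetic, using the population Series from the earlier cards. The name in_thousands is illustrative.

```python
import pandas as pd

population = pd.Series([852469, 1015785, 485199])

# The division is applied to every element; the original Series is not modified.
in_thousands = population / 1000

print(in_thousands)
print(population)  # still the original integer values
```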
What is NumPy?
NumPy is a popular toolkit for scientific computing.
You import it by:
import numpy as np
Pandas Series can be used as arguments to most NumPy functions.
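For example, np.log can be called directly on a Series. The sketch below uses the population Series from the earlier cards; the choice of np.log is just one illustrative NumPy function.

```python
import numpy as np
import pandas as pd

population = pd.Series([852469, 1015785, 485199])

# NumPy applies the function element-wise and the result comes back as a Series.
log_population = np.log(population)

print(type(log_population))
print(log_population)
```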
What can you use the Pandas Series.apply function for?
For more complex single-column transformations, you can use Series.apply. Like the Python map function, Series.apply accepts a function (often a lambda) as an argument and applies it to each value.
population.apply(lambda val: val > 1000000)
This returns a new Series, the same size as population, with boolean values that indicate whether each population value is above 1000000.
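A runnable sketch of the apply call above, using the population Series from the earlier cards. For a simple comparison like this one, the plain expression population > 1000000 gives the same result; it is shown here only for comparison.

```python
import pandas as pd

population = pd.Series([852469, 1015785, 485199])

# The lambda is called once per value and its return values form the new Series.
over_million = population.apply(lambda val: val > 1000000)

# Equivalent element-wise comparison without apply.
over_million_direct = population > 1000000

print(over_million)
```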