pandas Flashcards
Importing Data
From a CSV file
pd.read_csv(filename)
Importing Data
From a delimited text file (like TSV)
pd.read_table(filename)
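A minimal sketch of the two readers above. The sample text is invented for illustration, and io.StringIO stands in for a real file on disk.

```python
import io
import pandas as pd

# Hypothetical tab-separated content (not from a real file).
tsv_text = "name\tscore\nada\t90\nlin\t85\n"

# read_table uses a tab as its default separator.
df_tsv = pd.read_table(io.StringIO(tsv_text))

# read_csv gives the same result when the separator is set explicitly.
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(df_tsv)
```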
Importing Data
From an Excel file
pd.read_excel(filename)
Importing Data
Read from a SQL table/database
pd.read_sql(query, connection_object)
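One way to try this card without a database server is an in-memory SQLite connection. The table name and rows below are made up for the example.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database; the "scores" table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("ada", 90), ("lin", 85)])
conn.commit()

# The query result comes back as a DataFrame.
df = pd.read_sql("SELECT name, score FROM scores ORDER BY name", conn)
conn.close()

print(df)
```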
Importing Data
Read from a JSON-formatted string, URL, or file (pandas 2.1+ expects a literal string to be wrapped in io.StringIO)
pd.read_json(json_string)
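A small sketch with an invented JSON string. Wrapping the text in io.StringIO is a cautious choice that should work whether or not the installed pandas accepts raw JSON strings directly.

```python
import io
import pandas as pd

# Hypothetical JSON text: a list of records.
json_text = '[{"name": "ada", "score": 90}, {"name": "lin", "score": 85}]'

# Each record becomes a row, each key a column.
df = pd.read_json(io.StringIO(json_text))

print(df)
```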
Importing Data
Parse an HTML URL, string, or file and extract its tables into a list of DataFrames
pd.read_html(url)
Importing Data
Read the contents of your clipboard and pass them to read_csv()
pd.read_clipboard()
Importing Data
From a dict: keys become column names, values (lists) become the column data
pd.DataFrame(dict)
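As an illustration, with made-up column names and values:

```python
import pandas as pd

# Hypothetical data: each key is a column, each list holds that column's values.
data = {"name": ["ada", "lin"], "score": [90, 85]}
df = pd.DataFrame(data)

print(df)
```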
Exporting Data
Write to a CSV file
df.to_csv(filename)
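A round-trip sketch, assuming an in-memory buffer in place of a filename. Passing index=False is a choice made here so the row index is not written as an extra column.

```python
import io
import pandas as pd

df = pd.DataFrame({"name": ["ada", "lin"], "score": [90, 85]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind and read the CSV text back in.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```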
Exporting Data
Write to an Excel file
df.to_excel(filename)
Exporting Data
Write to a SQL table
df.to_sql(table_name, connection_object)
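A sketch using an in-memory SQLite connection. The table name "scores" is hypothetical, and index=False is chosen here to keep the row index out of the table.

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({"name": ["ada", "lin"], "score": [90, 85]})

# In-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
df.to_sql("scores", conn, index=False)

# Confirm the rows landed in the table.
rows = conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()
conn.close()

print(rows)
```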
Exporting Data
Write to a file in JSON format
df.to_json(filename)
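When no filename is given, to_json returns the JSON text as a string, which is handy for a quick check. The orient="records" argument is one of several layouts and is picked here only for readability.

```python
import json
import pandas as pd

df = pd.DataFrame({"name": ["ada", "lin"], "score": [90, 85]})

# With no path argument, the JSON is returned as a string.
json_text = df.to_json(orient="records")

# Parse it back with the standard library to inspect the structure.
records = json.loads(json_text)

print(records)
```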
Create Test Objects
Create a Series from an iterable my_list
pd.Series(my_list)
Create Test Objects
5 columns and 20 rows of random floats
pd.DataFrame(np.random.rand(20,5))
Create Test Objects
Add a date index
df.index = pd.date_range('1900/1/30', periods=df.shape[0])
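The three test-object cards can be combined into one short sketch. The list contents are invented, and the random values differ on every run.

```python
import numpy as np
import pandas as pd

# A Series from a plain Python list (hypothetical values).
my_list = [3, 1, 4]
s = pd.Series(my_list)

# 20 rows and 5 columns of random floats in [0, 1).
df = pd.DataFrame(np.random.rand(20, 5))

# Replace the default integer index with one daily date per row.
df.index = pd.date_range("1900/1/30", periods=df.shape[0])

print(df.head())
```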
Viewing/Inspecting Data
First n rows of the DataFrame
df.head(n)
Last n rows of the DataFrame
df.tail(n)
Number of rows and columns
df.shape
Index, data type, and memory information
df.info()
Summary statistics for numerical columns
df.describe()
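The inspection calls above, tried on a small made-up DataFrame:

```python
import pandas as pd

# Hypothetical data: ten rows, two numeric columns.
df = pd.DataFrame({"a": range(10), "b": [x * 0.5 for x in range(10)]})

first_rows = df.head(3)   # first 3 rows
last_rows = df.tail(2)    # last 2 rows
n_rows, n_cols = df.shape  # (rows, columns)

df.info()                 # prints index, dtypes, and memory usage
summary = df.describe()   # count, mean, std, min, quartiles, max per column

print(summary)
```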
Unique values and their counts for a Series s (missing values included)
s.value_counts(dropna=False)
Unique values and counts for all columns
df.apply(pd.Series.value_counts)
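A sketch of both counting cards with invented values. Note that when value_counts is applied per column, a value absent from a column shows up as NaN in that column.

```python
import pandas as pd

# Series with a missing value (hypothetical data).
s = pd.Series(["a", "b", "a", None])
with_missing = s.value_counts(dropna=False)  # counts the missing entry too
without_missing = s.value_counts()           # default drops missing values

# Per-column counts: rows are the union of values seen in any column.
df = pd.DataFrame({"x": ["a", "b", "a"], "y": ["b", "b", "c"]})
per_column = df.apply(pd.Series.value_counts)

print(per_column)
```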
Returns column with label col as Series
df[col]
Returns columns as a new DataFrame
df[[col1, col2]]
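The difference between the two bracket forms, shown on a made-up DataFrame: a single label gives a Series, a list of labels gives a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

one_column = df["a"]          # single label -> Series
two_columns = df[["a", "b"]]  # list of labels -> new DataFrame

print(type(one_column).__name__, type(two_columns).__name__)
```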
Selection by position
s.iloc[0]
Selection by index label
s.loc['index_one']
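A short comparison of the two selectors on a Series with string labels. The labels other than 'index_one' are invented for the example.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["index_one", "index_two", "index_three"])

by_position = s.iloc[0]        # first element, regardless of its label
by_label = s.loc["index_one"]  # element whose label is 'index_one'
last = s.iloc[-1]              # negative positions count from the end

print(by_position, by_label, last)
```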