Pandas Foundation Flashcards
How do you get summary info of a DataFrame?
df.info()
How do you represent a DataFrame as a NumPy array?
You can use the DataFrame attribute .values to represent a DataFrame df as a NumPy array. You can also pass pandas data structures to NumPy methods.
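A minimal sketch of both points in this card, using made-up sample data (the column names "a" and "b" are hypothetical). It also shows .to_numpy(), the method form that recent pandas documentation recommends over .values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the card.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

arr = df.values        # 2-D NumPy array holding the DataFrame's data
arr2 = df.to_numpy()   # method form that returns the same data

# NumPy functions accept pandas objects directly. Element-wise functions
# such as np.log10 return a pandas object with the labels preserved.
logged = np.log10(df)
```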
How do you build a DatetimeIndex from a list of date strings?
pd.to_datetime()
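A short sketch of how this might look in practice. The date strings and readings below are invented for illustration, and the explicit format argument is optional.

```python
import pandas as pd

# Hypothetical date strings for illustration.
date_list = ["2015-01-01 09:00", "2015-01-01 10:00", "2015-01-01 11:00"]

# Passing a list returns a DatetimeIndex; an explicit format avoids
# ambiguous parsing.
time_index = pd.to_datetime(date_list, format="%Y-%m-%d %H:%M")

# Use the resulting DatetimeIndex as the index of a Series.
ts = pd.Series([50.1, 51.3, 49.8], index=time_index)
```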
What is reindexing used for?
Reindexing is useful in preparation for adding or otherwise combining two time series data sets. To reindex the data, we provide a new index and ask pandas to match the old data to it. If data is unavailable for one of the new index dates or times, you can tell pandas how to fill it in (for example, with a fill method such as forward fill); otherwise, pandas fills the gap with NaN.
How do you reindex without a fill method?
ts3 = ts2.reindex(ts1.index)
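A minimal sketch contrasting the two behaviours, assuming two made-up daily series. The names ts1, ts2, and ts3 follow the card, while ts4 and the data are hypothetical.

```python
import pandas as pd

# Hypothetical series: ts1 covers five days, ts2 is missing two of them.
ts1 = pd.Series(range(5), index=pd.date_range("2016-07-01", periods=5, freq="D"))
ts2 = ts1.iloc[[0, 1, 4]] * 10

# No fill method: dates absent from ts2 become NaN.
ts3 = ts2.reindex(ts1.index)

# Forward fill: dates absent from ts2 take the last known value.
ts4 = ts2.reindex(ts1.index, method="ffill")
```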
Downsample the 'Temperature' column of df to 6-hour data using .resample('6h') and .mean(). Assign the result to df1.
df1 = df.Temperature.resample('6h').mean()
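A runnable sketch of the same downsampling on invented hourly data. The lowercase 'h' alias is the spelling current pandas releases prefer; older course material often writes 'H'.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly temperatures for one day (values 0.0 through 23.0).
idx = pd.date_range("2010-01-01", periods=24, freq="h")
df = pd.DataFrame({"Temperature": np.arange(24.0)}, index=idx)

# Group the hourly readings into 6-hour bins and average each bin.
df1 = df["Temperature"].resample("6h").mean()
```

With 24 hourly readings, df1 holds four values, one per 6-hour bin.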
Print the dry_bulb_faren temperature between 8 AM and 9 AM on June 20, 2011. (Useful for remembering the syntax.)
print(df_clean.loc['June 20 2011 08:00:00':'June 20 2011 09:00:00', 'dry_bulb_faren'])
Convert the wind_speed and dew_point_faren columns to numeric values.
df_clean['wind_speed'] = pd.to_numeric(df_clean['wind_speed'], errors='coerce')
df_clean['dew_point_faren'] = pd.to_numeric(df_clean['dew_point_faren'], errors='coerce')
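A small sketch of what errors='coerce' does, using a made-up Series in which "M" stands in for a hypothetical missing-value marker: anything that cannot be parsed as a number becomes NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical raw readings stored as strings; "M" is not a number.
raw = pd.Series(["10", "12.5", "M", "7"])

# Unparseable entries are replaced with NaN, so the result is numeric.
clean = pd.to_numeric(raw, errors="coerce")
```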
Print the median of the dry_bulb_faren column for the time range '2011-Apr':'2011-Jun'.
print(df_clean.loc['2011-Apr':'2011-Jun', 'dry_bulb_faren'].median())
(This is the string format used to slice the DatetimeIndex.)
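A minimal sketch of partial-string slicing on a DatetimeIndex. The data is invented, and the example uses numeric month strings ('2011-04') rather than the card's '2011-Apr' spelling. The point to remember is that both endpoints of the slice are inclusive.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings for 2011 (values 0.0, 1.0, 2.0, ...).
idx = pd.date_range("2011-01-01", "2011-12-31", freq="D")
df_clean = pd.DataFrame(
    {"dry_bulb_faren": np.arange(len(idx), dtype=float)}, index=idx
)

# Month-level strings select whole months: April 1 through June 30.
spring = df_clean.loc["2011-04":"2011-06", "dry_bulb_faren"]
spring_median = spring.median()
```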
daily_temp_2011 = df_clean[‘dry_bulb_faren’].values
What does this line of code do?
.values extracts the values of the column as a NumPy array. (Without .values, the column is returned as a pandas Series.)