assignment_4 Flashcards
How do you load an Excel file into the DataFrame ‘df’ and carry out the following operations?
- Skip the first 7 rows when parsing
- Remove the first 212 rows
- Reset the index
- Rename the columns ‘Unnamed: 4’ and ‘Unnamed: 5’ to ‘Quarter’ and ‘GDP’, respectively
import pandas as pd
df = pd.ExcelFile('gdplev.xls').parse(skiprows=7).loc[212:]
df = df[['Unnamed: 4', 'Unnamed: 5']].rename(columns={'Unnamed: 4': 'Quarter', 'Unnamed: 5': 'GDP'})
df = df.reset_index(drop=True)  # renumber rows from 0 and discard the old labels
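The trim, select, rename, and reset steps can be tried without the spreadsheet. This is a minimal sketch on a made-up frame: `raw`, `n_drop`, and the values in it are illustrative assumptions standing in for the parsed sheet, not the assignment data.

```python
import pandas as pd

# Hypothetical stand-in for the parsed sheet (made-up values).
raw = pd.DataFrame({
    'Unnamed: 3': [None] * 5,
    'Unnamed: 4': ['1999q3', '1999q4', '2000q1', '2000q2', '2000q3'],
    'Unnamed: 5': [9.0, 9.5, 10.0, 10.2, 10.3],
})

n_drop = 2  # plays the role of 212 in the card above

# With the default integer labels, .loc[n_drop:] removes rows 0..n_drop-1.
trimmed = raw.loc[n_drop:]

# Keep only the two columns of interest and give them readable names.
trimmed = trimmed[['Unnamed: 4', 'Unnamed: 5']].rename(
    columns={'Unnamed: 4': 'Quarter', 'Unnamed: 5': 'GDP'})

# drop=True discards the old labels instead of adding an 'index' column.
trimmed = trimmed.reset_index(drop=True)
print(trimmed)
```

Resetting the index last means the row labels start again at 0, which matters for any later label-based slicing.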
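For DataFrame ‘df’, find the quarter in which a recession started (the first of two consecutive quarters of GDP decline).
def get_recession_start():
    gdp = df.set_index('Quarter')['GDP']  # GDP values labelled by quarter
    rec_start = []
    recession = False
    for i in range(1, len(gdp) - 1):
        # GDP falls into quarter i and falls again into quarter i+1
        if not recession and gdp.iloc[i-1] > gdp.iloc[i] > gdp.iloc[i+1]:
            recession = True  # never reset, so only the first start is recorded
            rec_start.append(gdp.index[i])
    return rec_start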
For DataFrame ‘df’, find the quarter in which the recession ended.
def get_recession_end():
    gdp = df.set_index('Quarter')['GDP']  # GDP values labelled by quarter
    rec_end = []
    recession = False
    for i in range(1, len(gdp) - 1):
        if not recession and gdp.iloc[i-1] > gdp.iloc[i] > gdp.iloc[i+1]:
            recession = True   # two consecutive quarters of decline
        elif recession and gdp.iloc[i-1] < gdp.iloc[i] < gdp.iloc[i+1]:
            recession = False  # two consecutive quarters of growth
            rec_end.append(gdp.index[i])
    return rec_end
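To see which quarters the decline and growth tests pick out, here is a minimal sketch that runs the same scan in a single pass over made-up data. The function `find_recession_bounds`, the frame `toy`, and its GDP figures are illustrative assumptions, not part of the assignment.

```python
import pandas as pd

# Hypothetical toy data: GDP peaks, falls for three quarters, then recovers.
toy = pd.DataFrame({
    'Quarter': ['2008q1', '2008q2', '2008q3', '2008q4',
                '2009q1', '2009q2', '2009q3'],
    'GDP': [100.0, 101.0, 99.0, 97.0, 96.0, 98.0, 99.5],
})

def find_recession_bounds(frame):
    """Return (starts, ends) using the same three-quarter comparisons."""
    gdp = frame.set_index('Quarter')['GDP']
    starts, ends = [], []
    in_recession = False
    for i in range(1, len(gdp) - 1):
        prev, cur, nxt = gdp.iloc[i - 1], gdp.iloc[i], gdp.iloc[i + 1]
        if not in_recession and prev > cur > nxt:
            in_recession = True       # two consecutive declines
            starts.append(gdp.index[i])
        elif in_recession and prev < cur < nxt:
            in_recession = False      # two consecutive rises
            ends.append(gdp.index[i])
    return starts, ends

starts, ends = find_recession_bounds(toy)
print(starts, ends)  # ['2008q3'] ['2009q2']
```

Because the loop compares each quarter with both neighbours, it starts at 1 and stops one short of the last row.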
For DataFrame ‘df’, find the row where the recession started, based on the quarter returned by the function ‘get_recession_start’.
rec_start = get_recession_start()
start_row = df.loc[df['Quarter'].isin(rec_start)]
For DataFrame ‘df’, find the index of the row with the quarter in which the recession started.
(HINT: The recession start quarter was found in the function get_recession_start)
rec_start = get_recession_start()
start_index = df.index[df['Quarter'].isin(rec_start)]
Given the DataFrame ‘df’, create a new DataFrame ‘df_reduce’ that spans the rows from start_index to end_index, and find the minimum value of its ‘GDP’ column.
Hint: start_index should be the index of the row with the quarter that started the recession, and end_index the index of the row with the quarter that ended it.
rec_start = get_recession_start()
rec_end = get_recession_end()
start_index = df.index[df['Quarter'].isin(rec_start)]
end_index = df.index[df['Quarter'].isin(rec_end)]
# .ix was removed in pandas 1.0; .loc slices by label and includes both endpoints
df_reduce = df.loc[start_index[0]:end_index[0]]
print(df_reduce['GDP'].min())
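The slicing step can be checked on a small frame. This is a sketch on made-up data: `toy` and the hard-coded start and end quarters are assumptions for illustration, and `idxmin` is added only to show which quarter holds the minimum.

```python
import pandas as pd

# Hypothetical toy data (made-up GDP figures).
toy = pd.DataFrame({
    'Quarter': ['2008q1', '2008q2', '2008q3', '2008q4',
                '2009q1', '2009q2', '2009q3'],
    'GDP': [100.0, 101.0, 99.0, 97.0, 96.0, 98.0, 99.5],
})

# Assumed recession bounds for this toy frame.
start_index = toy.index[toy['Quarter'].isin(['2008q3'])]
end_index = toy.index[toy['Quarter'].isin(['2009q2'])]

# Label-based slice: both the start and the end row are included.
df_reduce = toy.loc[start_index[0]:end_index[0]]

lowest = df_reduce['GDP'].min()
bottom_quarter = df_reduce.loc[df_reduce['GDP'].idxmin(), 'Quarter']
print(lowest, bottom_quarter)  # 96.0 2009q1
```

Unlike a plain Python slice, a `.loc` slice keeps the end label, so the row for the ending quarter is part of `df_reduce`.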
Given a DataFrame ‘df’, what are two ways to drop the first 50 columns?
- df = df.iloc[:, 50:]
- df = df.drop(df.columns[:50], axis=1)
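Both approaches can be compared on a small frame. In this sketch the frame `wide` and the count `n` are made-up stand-ins: a positional slice starting at `n` keeps columns `n` onward, so exactly the first `n` columns are dropped.

```python
import pandas as pd

# Hypothetical five-column frame; drop the first n = 2 columns.
wide = pd.DataFrame([[0, 1, 2, 3, 4]], columns=list('abcde'))
n = 2

# Way 1: keep everything from position n onward.
by_position = wide.iloc[:, n:]

# Way 2: drop the first n column labels explicitly.
by_label = wide.drop(wide.columns[:n], axis=1)

print(list(by_position.columns))      # ['c', 'd', 'e']
print(by_position.equals(by_label))   # True
```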