Flashcards
.head()
returns the first few rows of the DataFrame (5 by default)
.info()
prints a summary of the columns (names, dtypes, non-null counts)
.mean()
mean of a column
.median()
median of a column
.min()
.max()
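A minimal sketch of the summary methods above, run on a small made-up DataFrame. The data and the names `df`, `type`, and `weekly_sales` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "type": ["A", "A", "B", "C"],
    "weekly_sales": [10.0, 20.0, 30.0, 60.0],
})

first_rows = df.head(2)   # first 2 rows; df.head() alone returns up to 5
df.info()                 # prints column names, dtypes, and non-null counts

mean_sales = df["weekly_sales"].mean()      # 30.0
median_sales = df["weekly_sales"].median()  # 25.0
min_sales = df["weekly_sales"].min()        # 10.0
max_sales = df["weekly_sales"].max()        # 60.0
```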
sort rows by a column
.sort_values('column')
get cumulative sum
.cumsum()
get cumulative maximum
.cummax()
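Sorting and the cumulative methods can be tried together on a toy column. This is only a sketch; the column name `weekly_sales` and the values are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({"weekly_sales": [30.0, 10.0, 60.0, 20.0]})

# Sort rows by one column (ascending by default); returns a new DataFrame.
sorted_df = df.sort_values("weekly_sales")

# Running total and running maximum, in the original row order.
df["cum_sales"] = df["weekly_sales"].cumsum()  # 30, 40, 100, 120
df["cum_max"] = df["weekly_sales"].cummax()    # 30, 30, 60, 60
```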
Calculate the total of a column over the whole dataset.
new_column = df['column'].sum()
Subset for stores where the 'type' column equals "A", and calculate the total of their 'weekly_sales' column.
new_column = df[df['type'] == 'A']['weekly_sales'].sum()
Subset for stores where the 'type' column equals "C", and calculate the total of their 'weekly_sales' column.
new_column = df[df['type'] == 'C']['weekly_sales'].sum()
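The subset-then-sum pattern can be checked on a tiny example. This is a sketch with made-up data; the variable names `is_type_a`, `sales_all`, `sales_a`, and `sales_c` are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "type": ["A", "A", "B", "C"],
    "weekly_sales": [10.0, 20.0, 30.0, 60.0],
})

# Total over the whole dataset.
sales_all = df["weekly_sales"].sum()                   # 120.0

# Boolean mask: True for rows whose type is "A".
is_type_a = df["type"] == "A"
sales_a = df[is_type_a]["weekly_sales"].sum()          # 30.0

# The same pattern written on one line for type "C".
sales_c = df[df["type"] == "C"]["weekly_sales"].sum()  # 60.0
```

Dividing `sales_a` or `sales_c` by `sales_all` would give each type's share of total sales.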
Get the min of the 'weekly_sales' column for each store 'type' using .groupby() and .agg(). Store the result as sales_stats.
sales_stats = sales.groupby('type')['weekly_sales'].agg([np.min])
Get the min, max, mean, and median of the 'weekly_sales' column for each store 'type' using .groupby() and .agg(). Store the result as sales_stats.
sales_stats = sales.groupby('type')['weekly_sales'].agg([np.min, np.max, np.mean, np.median])
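A small sketch of grouped aggregation on made-up data. It passes the statistics to .agg() as string names instead of NumPy functions; both forms are accepted, but recent pandas versions may emit a FutureWarning for the NumPy callables. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for this example.
sales = pd.DataFrame({
    "type": ["A", "A", "B", "C"],
    "weekly_sales": [10.0, 20.0, 30.0, 60.0],
})

# One row per store type, one column per statistic.
sales_stats = sales.groupby("type")["weekly_sales"].agg(
    ["min", "max", "mean", "median"]
)
print(sales_stats)
```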
Get the mean of the 'weekly_sales' column by 'type' using .pivot_table(). Store the result as mean_sales_by_type.
mean_sales_by_type = sales.pivot_table(values='weekly_sales', index='type')
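A sketch showing that .pivot_table() aggregates with the mean when no aggfunc is given, so it produces the same numbers as a grouped mean. The data and the name `grouped_mean` are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for this example.
sales = pd.DataFrame({
    "type": ["A", "A", "B", "C"],
    "weekly_sales": [10.0, 20.0, 30.0, 60.0],
})

# The default aggfunc is the mean; the result is a DataFrame indexed by type.
mean_sales_by_type = sales.pivot_table(values="weekly_sales", index="type")

# Equivalent grouped calculation, for comparison.
grouped_mean = sales.groupby("type")["weekly_sales"].mean()
```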