kag_sal_ecom Flashcards
For DataFrame ‘sal’, find out how many entries there are
sal.info()
What is the average BasePay?
sal['BasePay'].mean()
What is the highest amount of OvertimePay in the dataset?
sal['OvertimePay'].max()
Return the entire record for the highest amount of OvertimePay in the dataset
sal[sal['OvertimePay'] == sal['OvertimePay'].max()]
Return just the name of the employee with the highest amount of OvertimePay in the dataset
sal[sal['OvertimePay'] == sal['OvertimePay'].max()]['EmployeeName']
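A hedged side note on the two cards above: the boolean mask keeps every row tied for the maximum, while `idxmax` returns the index label of the first maximum only, which `loc` turns into a single record. The sketch below uses a made-up toy DataFrame (`sal_demo` and its values are assumptions for illustration, not the real salary data).

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'sal' DataFrame.
sal_demo = pd.DataFrame({
    'EmployeeName': ['ANN LEE', 'BOB RAY', 'CAT DOE'],
    'OvertimePay': [100.0, 250.0, 50.0],
})

# Boolean mask: returns a DataFrame with every row tied for the maximum.
top_rows = sal_demo[sal_demo['OvertimePay'] == sal_demo['OvertimePay'].max()]

# idxmax: index label of the first maximum; loc returns that row as a Series.
top_row = sal_demo.loc[sal_demo['OvertimePay'].idxmax()]
top_name = top_row['EmployeeName']
```

The mask form is the safer choice when ties matter; `idxmax` is convenient when a single record is enough.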
Return the JobTitle of employee JOSEPH DRISCOLL
sal[sal['EmployeeName'] == 'JOSEPH DRISCOLL']['JobTitle']
Return the TotalPayBenefits for employee JOSEPH DRISCOLL
sal[sal['EmployeeName'] == 'JOSEPH DRISCOLL']['TotalPayBenefits']
Return the record of the person with the lowest TotalPayBenefits
sal[sal['TotalPayBenefits'] == sal['TotalPayBenefits'].min()]
Return just the name of the employee with the lowest TotalPayBenefits
sal[sal['TotalPayBenefits'] == sal['TotalPayBenefits'].min()]['EmployeeName']
What is the average (mean) BasePay of all employees per year?
sal.groupby('Year')['BasePay'].mean()
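A minimal sketch of the per-year average, on a made-up toy DataFrame (`sal_demo` and its values are assumptions, not the real data). Selecting the column before aggregating avoids asking pandas to average text columns such as JobTitle, which pandas 2.0 and later refuse to do by default.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'sal' DataFrame.
sal_demo = pd.DataFrame({
    'Year': [2011, 2011, 2012, 2012],
    'JobTitle': ['CLERK', 'CHIEF OF POLICE', 'CLERK', 'NURSE'],
    'BasePay': [100.0, 300.0, 200.0, 400.0],
})

# Pick the numeric column first, then take the mean within each year.
mean_by_year = sal_demo.groupby('Year')['BasePay'].mean()
```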
How many unique job titles are there?
sal['JobTitle'].nunique()
What are the top 5 most common job titles?
sal['JobTitle'].value_counts().head(5)
How many job titles were represented by only 1 person in 2013?
(sal[sal['Year'] == 2013]['JobTitle'].value_counts() == 1).sum()
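The counting pattern in the card above can be sketched step by step: filter to one year, count each title, then count how many of those counts equal 1. The toy DataFrame is made up for illustration (`sal_demo` and its values are assumptions).

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'sal' DataFrame.
sal_demo = pd.DataFrame({
    'Year': [2013, 2013, 2013, 2014],
    'JobTitle': ['CLERK', 'CLERK', 'NURSE', 'PILOT'],
})

# Occurrences of each job title among the 2013 rows only.
counts_2013 = sal_demo[sal_demo['Year'] == 2013]['JobTitle'].value_counts()

# Comparing to 1 gives a boolean Series; summing it counts the True values.
singletons = (counts_2013 == 1).sum()
```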
How many people have the word chief in their job title?
def chief_string(title): return 'chief' in title.lower()
sal['JobTitle'].apply(chief_string).sum()
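A possible vectorized alternative to the helper function, sketched on made-up data (`sal_demo` and its titles are assumptions). `Series.str.contains` with `case=False` ignores case, and `na=False` treats missing titles as non-matches. Like the helper, it matches substrings, so a title such as "Mischief Manager" is counted too.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'sal' DataFrame.
sal_demo = pd.DataFrame({
    'JobTitle': ['Chief of Police', 'FIRE CHIEF', 'Nurse', 'Mischief Manager'],
})

# Case-insensitive substring match; summing the boolean result counts matches.
chief_count = sal_demo['JobTitle'].str.contains('chief', case=False, na=False).sum()
```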
How do you determine whether there is a correlation between the length of the JobTitle string and TotalPayBenefits?
sal['title_len'] = sal['JobTitle'].apply(len)
sal[['title_len', 'TotalPayBenefits']].corr()
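To read the result of `corr()`, look at the off-diagonal cell: by default it is the Pearson coefficient, where values near 0 suggest little linear relationship and values near 1 or -1 a strong one. The sketch below uses a deliberately artificial toy DataFrame (`sal_demo` and its values are assumptions) in which pay rises exactly in step with title length.

```python
import pandas as pd

# Hypothetical toy data: pay grows linearly with title length.
sal_demo = pd.DataFrame({
    'JobTitle': ['A', 'BB', 'CCC', 'DDDD'],
    'TotalPayBenefits': [10.0, 20.0, 30.0, 40.0],
})

# Length of each title string as a new numeric column.
sal_demo['title_len'] = sal_demo['JobTitle'].str.len()

# 2x2 correlation matrix; the off-diagonal entry is the coefficient of interest.
corr_matrix = sal_demo[['title_len', 'TotalPayBenefits']].corr()
r = corr_matrix.loc['title_len', 'TotalPayBenefits']
```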