Pandas 5 Combining 2+ DF Flashcards
How to join two or more tables in Pandas
pd.merge(DF1, DF2[['gameid', 'DF2_C1', 'DF2_C2', 'etc']], on='gameid')
This creates a DF with all columns from DF1 and the selected columns from DF2. The selection from DF2 must include the join key ('gameid'), otherwise the merge raises a KeyError.
The “on” keyword is optional. If left out, the default behavior is to join on the columns the two DFs have in common.
Or create new DF to merge with the columns of interest:
DF1 = DFa[['gameid', 'playerid', 'rush yds']]
DF2 = DFb[['gameid', 'playerid', 'rec yds']]
NewDF = pd.merge(DF1, DF2, on=['playerid', 'gameid'])
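A minimal sketch of the two-key merge above, using made-up toy data (the values in DFa and DFb are hypothetical and only there to make the example runnable). It also shows that leaving out "on" gives the same result here, because 'gameid' and 'playerid' are the only shared columns.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the card above
DFa = pd.DataFrame({
    "gameid": [1, 1, 2],
    "playerid": [10, 11, 10],
    "rush yds": [55, 12, 80],
})
DFb = pd.DataFrame({
    "gameid": [1, 2, 2],
    "playerid": [10, 10, 12],
    "rec yds": [20, 5, 64],
})

DF1 = DFa[["gameid", "playerid", "rush yds"]]
DF2 = DFb[["gameid", "playerid", "rec yds"]]

# Explicit keys: the default how="inner" keeps only (playerid, gameid)
# pairs that appear in both frames
NewDF = pd.merge(DF1, DF2, on=["playerid", "gameid"])

# No "on": pandas joins on every column the two frames share
NewDF_default = pd.merge(DF1, DF2)

print(NewDF)
```

Only the rows whose (gameid, playerid) pair exists in both frames survive, since the default join type is inner.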
How to designate left or right join in Pandas
Keyword is “how”
pd.merge(DF1, DF2, how='left', indicator=True)
Keyword “indicator” creates a column _merge which shows whether the observation was in the left, right, or both DF
NewDF['_merge'].value_counts()
right_only 813
both 355
left_only 200
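A small sketch comparing how='left' with how='outer' when indicator=True, again on hypothetical toy data (the frames and values below are assumptions for illustration, not the data behind the counts above).

```python
import pandas as pd

# Hypothetical toy data
DF1 = pd.DataFrame({"playerid": [1, 2, 3], "rush yds": [40, 15, 72]})
DF2 = pd.DataFrame({"playerid": [2, 3, 4, 5], "rec yds": [30, 8, 51, 19]})

# Left join: keep every row of DF1; columns from DF2 are NaN where unmatched
left_df = pd.merge(DF1, DF2, how="left", on="playerid", indicator=True)

# Outer join: keep rows from both sides, so all three _merge labels can appear
outer_df = pd.merge(DF1, DF2, how="outer", on="playerid", indicator=True)

left_counts = left_df["_merge"].value_counts()
outer_counts = outer_df["_merge"].value_counts()

print(left_counts)
print(outer_counts)
```

A left join never produces right_only rows, so a tally that shows a nonzero right_only count implies the merge was run with how='outer' or how='right'.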