Pandas Pt 3 (UCSD) Flashcards

1
Q

How do you stack the DataFrames left and right vertically?

A

pd.concat([left, right])  ## returns one DataFrame with the rows of both and the union of their columns; cells with no source value are NaN
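
A minimal sketch of the vertical stack, assuming two small made-up frames that share only column "b" (the data and the ignore_index choice are illustrative, not from the course):

```python
import pandas as pd

# Hypothetical sample frames; only column "b" is shared.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# Stack the rows; the columns become the union a, b, c and gaps are NaN.
# ignore_index=True renumbers the rows 0..3 instead of repeating 0, 1.
stacked = pd.concat([left, right], ignore_index=True)
print(stacked.shape)  # (4, 3)
```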

2
Q

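How do you place the DataFrames left and right side by side, keeping only rows whose index labels appear in both (an inner join on the index, with same-named columns repeated)?

A

pd.concat([left, right], axis=1, join='inner')  ## columns present in both inputs appear twice in the result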
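
A small sketch of the side-by-side concat, assuming made-up frames whose index labels overlap only on 1 and 2:

```python
import pandas as pd

# Hypothetical frames; both carry a column named "key".
left = pd.DataFrame({"key": [1, 2, 3], "x": [10, 20, 30]}, index=[0, 1, 2])
right = pd.DataFrame({"key": [2, 3, 4], "y": [200, 300, 400]}, index=[1, 2, 3])

# axis=1 glues columns together; join="inner" keeps only shared index labels.
side_by_side = pd.concat([left, right], axis=1, join="inner")
print(list(side_by_side.columns))  # ['key', 'x', 'key', 'y']
print(list(side_by_side.index))    # [1, 2]
```

The rows are matched on index labels, not on the values in "key", which is why the "key" column shows up twice.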

3
Q

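What is another way to stack left and right vertically, using append?

A

left.append(right)  ## DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; on current versions use pd.concat([left, right])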

4
Q

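How do you merge left and right like a SQL join, without repeating the key columns?

A

pd.merge(left, right, how='inner')  ## with no on= argument, joins on the column names the two share; each key column appears once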
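
A minimal sketch of the merge, assuming two made-up frames that share the single column "key":

```python
import pandas as pd

# Hypothetical frames sharing the column "key".
left = pd.DataFrame({"key": [1, 2, 3], "x": [10, 20, 30]})
right = pd.DataFrame({"key": [2, 3, 4], "y": [200, 300, 400]})

# With no on= argument, merge joins on the columns common to both frames.
merged = pd.merge(left, right, how="inner")
print(list(merged.columns))    # ['key', 'x', 'y']
print(merged["key"].tolist())  # [2, 3]
```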

5
Q

Merge the movies and tags DataFrames on movieId with an inner join.

A

t = movies.merge(tags, on='movieId', how='inner')
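
A sketch with tiny stand-in tables (the titles and tags are invented; only the movieId column name comes from the card). It illustrates that a movie with several tags yields several rows, and a movie with no tags drops out of an inner join:

```python
import pandas as pd

# Hypothetical stand-ins for the movies and tags tables.
movies = pd.DataFrame({"movieId": [1, 2, 3], "title": ["A", "B", "C"]})
tags = pd.DataFrame({"movieId": [1, 1, 3], "tag": ["fun", "classic", "dark"]})

t = movies.merge(tags, on="movieId", how="inner")
print(t["movieId"].tolist())  # [1, 1, 3]
```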

6
Q

Get the first five rows of df that satisfy both boolean filters filter1 and filter2.

A

df[filter1 & filter2][:5]  ## same result as df[filter1 & filter2].head(5)
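
A short sketch of combining two boolean masks, assuming an invented ratings table (the column names and thresholds are hypothetical):

```python
import pandas as pd

# Hypothetical data: ten rows with a rating and a year.
df = pd.DataFrame({
    "rating": [5, 3, 4, 5, 2, 5, 4, 5, 4, 5],
    "year": [2001, 1999, 2010, 2015, 2020, 1995, 2005, 2018, 2012, 2003],
})

filter1 = df["rating"] >= 4    # boolean Series, one flag per row
filter2 = df["year"] >= 2000   # boolean Series, one flag per row

# & combines the masks element-wise; [:5] keeps the first five matches.
first_five = df[filter1 & filter2][:5]
print(first_five.index.tolist())  # [0, 2, 3, 6, 7]
```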

7
Q

Split the string values in column 'city' of df on a '_' separator.

A

df['city'].str.split('_')  ## returns a new Series with one list of pieces per row; df itself is unchanged unless you assign the result back
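
A minimal sketch, assuming a made-up city column:

```python
import pandas as pd

# Hypothetical city names using "_" as the word separator.
df = pd.DataFrame({"city": ["san_diego", "los_angeles", "irvine"]})

parts = df["city"].str.split("_")
print(parts.tolist())  # [['san', 'diego'], ['los', 'angeles'], ['irvine']]
```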

8
Q

Check which values in the city column of df contain the substring '2'.

A

df['city'].str.contains('2')  ## boolean Series, one flag per row; append .any() for a single True/False over the whole column
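
A small sketch, assuming invented city labels, that shows both the per-row mask and the single yes/no answer:

```python
import pandas as pd

# Hypothetical labels; two of them contain the character "2".
df = pd.DataFrame({"city": ["city_1", "city_2", "city_12", "city_3"]})

mask = df["city"].str.contains("2")
print(mask.tolist())  # [False, True, True, False]

has_any = bool(mask.any())  # True if at least one row matched
print(has_any)  # True
```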

9
Q

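Which .str method replaces a substring in a string column?

A

df[colName].str.replace(subToReplace, replacementSub)  ## .str is an accessor on a Series (a single column), not on the whole DataFrame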
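
A minimal sketch on a made-up column. Passing regex=False is an addition here, chosen so the pattern is treated as a literal string regardless of pandas version:

```python
import pandas as pd

# Hypothetical city names.
df = pd.DataFrame({"city": ["san_diego", "los_angeles"]})

# Replace every "_" with a space; regex=False means a literal match.
cleaned = df["city"].str.replace("_", " ", regex=False)
print(cleaned.tolist())  # ['san diego', 'los angeles']
```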

10
Q

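Which .str method returns the values matched by a regex?

A

df[colName].str.extract(r'(regex)')  ## the pattern needs at least one capture group; returns a DataFrame with one column per group, or a Series if you pass expand=False with a single group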
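
A sketch of extracting a four-digit year in parentheses, assuming invented titles (the column name and pattern are illustrative):

```python
import pandas as pd

# Hypothetical titles; the last one has no year to extract.
df = pd.DataFrame({"title": ["Toy Story (1995)", "Heat (1995)", "Untitled"]})

# One capture group -> a DataFrame with a single column named 0.
years = df["title"].str.extract(r"\((\d{4})\)")

# expand=False with one group returns a Series instead.
year_series = df["title"].str.extract(r"\((\d{4})\)", expand=False)

print(type(years).__name__, type(year_series).__name__)  # DataFrame Series
```

Rows where the pattern does not match come back as missing values rather than raising an error.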

11
Q

Use split to break values out into new columns.

A

df[colName].str.split(separator, expand=True)  ## returns a DataFrame with one column per piece
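
A minimal sketch that assigns the expanded pieces to two new columns; the names "first" and "second" are hypothetical:

```python
import pandas as pd

# Hypothetical city names, each with exactly one "_".
df = pd.DataFrame({"city": ["san_diego", "los_angeles"]})

# expand=True returns a DataFrame (columns 0 and 1), assigned here
# to two new named columns on df.
df[["first", "second"]] = df["city"].str.split("_", expand=True)
print(df["first"].tolist())   # ['san', 'los']
print(df["second"].tolist())  # ['diego', 'angeles']
```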

12
Q

What is Unix / POSIX / epoch time?

A

A count of the seconds elapsed since 1970-01-01 00:00:00 UTC.

13
Q

what is datatime64[ns]

A

standard python format you can use to compare times ## df[‘time’] > ‘2020-01-01’
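
A small sketch of the comparison from the card, assuming an invented 'time' column:

```python
import pandas as pd

# Hypothetical timestamps parsed from strings.
df = pd.DataFrame({"time": pd.to_datetime(["2019-06-01", "2020-03-15", "2021-01-01"])})

# The string on the right is parsed as a date before comparing.
recent = df[df["time"] > "2020-01-01"]
print(recent.index.tolist())  # [1, 2]
```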

14
Q

Convert from Unix time to datetime64 format.

A

pd.to_datetime(tags['timestampCol'], unit='s')  ## unit='s' says the numbers are seconds since the epoch
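
A minimal sketch of the conversion, assuming a made-up tags table with three epoch values (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical epoch seconds: the epoch itself, one day later, and 2020-01-01.
tags = pd.DataFrame({"timestamp": [0, 86400, 1577836800]})

tags["parsed_time"] = pd.to_datetime(tags["timestamp"], unit="s")
print(tags["parsed_time"].iloc[2])  # 2020-01-01 00:00:00
```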

15
Q

Sort df by a parsed time column.

A

df.sort_values(by='parsedTimeCol', ascending=True)  ## returns a new sorted DataFrame; ascending=True is the default
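
A short sketch that parses epoch seconds and then sorts on the parsed column; the table and column names are invented:

```python
import pandas as pd

# Hypothetical rows with epoch seconds for 2020, 2021 and 2000.
df = pd.DataFrame({
    "tag": ["b", "c", "a"],
    "timestamp": [1577836800, 1609459200, 946684800],
})
df["parsed_time"] = pd.to_datetime(df["timestamp"], unit="s")

# sort_values returns a new frame; df keeps its original order.
ordered = df.sort_values(by="parsed_time", ascending=True)
print(ordered["tag"].tolist())  # ['a', 'b', 'c']
```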
