Pandas 4 Granularity Flashcards

1
Q

If data is a collection of structured information, _____________ is the level of detail your collection is at.

A

Granularity

2
Q

Aggregating data to be less granular is called __________

A

Grouping

It necessarily involves loss of detail

3
Q

Stacking

A

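Also known as reshaping

Moves data that was formerly in separate rows into separate columns, or vice versa. In Pandas, unstack() goes from rows to columns and stack() goes from columns to rows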
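
A minimal sketch of the reshaping described above. The DataFrame, its column names, and its values are made up for illustration and are not from the original deck.

```python
import pandas as pd

# Hypothetical long data: one row per (player, week)
long_df = pd.DataFrame({
    "player": ["A", "A", "B", "B"],
    "week": [1, 2, 1, 2],
    "yards": [50, 70, 30, 90],
})

# unstack: the 'week' index level moves into columns (rows -> columns)
wide = long_df.set_index(["player", "week"])["yards"].unstack("week")

# stack: the columns move back into the row index (columns -> rows)
back = wide.stack()

print(wide)
print(back)
```

Here wide has one row per player and one column per week, and back returns to one value per (player, week) pair.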

4
Q

What is the syntax for grouping in Pandas?

A

DF.groupby('column').sum()

sum() can be swapped for whatever aggregating function is needed, such as mean() or count()

By default the grouped column becomes the index of the result; pass as_index=False to groupby() to keep it as a regular column
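
A small sketch of the default index behavior versus as_index=False. The plays DataFrame and its values are hypothetical, chosen only to make the totals easy to check.

```python
import pandas as pd

# Hypothetical play-level data: several rows per game
plays = pd.DataFrame({
    "game_id": [1, 1, 2, 2, 2],
    "yards_gained": [5, 12, 3, 0, 20],
})

# Default: game_id becomes the index of the result
by_index = plays.groupby("game_id").sum()

# as_index=False: game_id stays a regular column
by_column = plays.groupby("game_id", as_index=False).sum()

print(by_index)
print(by_column)
```

Five play-level rows collapse to two game-level rows either way, which is the loss of detail that grouping involves.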

5
Q

How do you return a groupby result for just the columns of interest?

A

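sum_cols = ['col1', 'col2', 'col3']

DF.groupby('game_id')[sum_cols].sum()

Selecting the columns before sum() gives the same result as DF.groupby('game_id').sum()[sum_cols], but only the needed columns are aggregated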
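
One way to try this out, selecting the columns before aggregating. The DataFrame below, including the description text column, is an assumption added for illustration.

```python
import pandas as pd

# Hypothetical play-level data with one text column we do not want to sum
plays = pd.DataFrame({
    "game_id": [1, 1, 2],
    "yards_gained": [5, 12, 3],
    "touchdown": [0, 1, 0],
    "description": ["run left", "pass deep", "run right"],
})

sum_cols = ["yards_gained", "touchdown"]

# Select the columns of interest on the groupby object, then aggregate
totals = plays.groupby("game_id")[sum_cols].sum()

print(totals)
```

The text column never reaches sum(), so the result contains only the columns listed in sum_cols.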

6
Q

How to use the agg() function?

A

DF.groupby('game_id').agg({'yards_gained': 'sum',
    'Play_id': 'count',
    'Intercep': 'sum',
    'Touchdown': 'sum'})

agg() takes a dictionary mapping each column to an aggregating function

The aggregated columns keep the same names as the dictionary keys

To rename the columns, use keyword arguments with tuple pairs:
DF.groupby('game_id').agg(
    Yards=('yards_gained', 'sum'),
    Nplays=('Play_id', 'count'),
    intercep=('Intercep', 'sum'),
    Touchdown=('Touchdown', 'sum'))

This no longer passes a dictionary. Instead, agg() takes keyword arguments, each in a
new_var=('old_var', 'function-as-a-string-name')
format

Page 86
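
A runnable sketch of both agg() styles side by side. The data and the lowercase column names are hypothetical and only loosely mirror the card above.

```python
import pandas as pd

# Hypothetical play-level data
plays = pd.DataFrame({
    "game_id": [1, 1, 2, 2, 2],
    "play_id": [10, 11, 20, 21, 22],
    "yards_gained": [5, 12, 3, 0, 20],
    "touchdown": [0, 1, 0, 0, 1],
})

# Dictionary style: output columns keep the original column names
dict_style = plays.groupby("game_id").agg(
    {"yards_gained": "sum", "play_id": "count"}
)

# Named aggregation: new_name=(old_column, function) renames as it aggregates
named_style = plays.groupby("game_id").agg(
    yards=("yards_gained", "sum"),
    nplays=("play_id", "count"),
    touchdowns=("touchdown", "sum"),
)

print(dict_style)
print(named_style)
```

The keyword names become the output column names, so no separate rename step is needed.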

7
Q

Stacking is similar to

A

Pivot table
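
To see the similarity, here is a sketch that builds the same wide table two ways. The data is made up, and aggfunc="sum" is set explicitly because pivot_table() averages by default.

```python
import pandas as pd

# Hypothetical long data: one row per (player, week)
scores = pd.DataFrame({
    "player": ["A", "A", "B", "B"],
    "week": [1, 2, 1, 2],
    "yards": [50, 70, 30, 90],
})

# Pivot table: rows are players, columns are weeks
pivoted = scores.pivot_table(
    index="player", columns="week", values="yards", aggfunc="sum"
)

# Same layout by setting an index and unstacking one level
unstacked = scores.set_index(["player", "week"])["yards"].unstack("week")

print(pivoted)
print(unstacked)
```

With one row per (player, week) pair there is nothing to aggregate, so both approaches hold the same values.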

8
Q

A join in Pandas is called

A

Merging or horizontal concatenation
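
A minimal sketch of a merge next to a plain horizontal concatenation. Both tables, the team abbreviations, and the yardage values are assumptions for illustration.

```python
import pandas as pd

# Two hypothetical game-level tables that share a game_id key
games = pd.DataFrame({"game_id": [1, 2], "home": ["KC", "NE"]})
totals = pd.DataFrame({"game_id": [1, 2], "yards": [17, 23]})

# Merge: rows are matched on the shared key column
merged = pd.merge(games, totals, on="game_id")

# Horizontal concatenation: rows are matched on the index, not on a key
side_by_side = pd.concat([games, totals[["yards"]]], axis=1)

print(merged)
print(side_by_side)
```

The two results agree here only because both tables are already in the same row order; merge() is the safer choice when they are not.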

9
Q

A union in Pandas is called

A

Appending or vertical concatenation
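
A short sketch of vertical concatenation with pd.concat(). The older DataFrame.append() method was removed in pandas 2.0, so concat() is used here. The weekly tables are hypothetical.

```python
import pandas as pd

# Two hypothetical tables with the same columns
week1 = pd.DataFrame({"player": ["A", "B"], "yards": [50, 30]})
week2 = pd.DataFrame({"player": ["A", "B"], "yards": [70, 90]})

# Vertical concatenation: rows of week2 go under rows of week1
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1
combined = pd.concat([week1, week2], ignore_index=True)

print(combined)
```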
