Pandas Flashcards

1
Q

Create a new DataFrame

A

df = pd.DataFrame() (Mind the capitalization!)

2
Q

axis=0 parameter

A

The row (index) axis: the operation runs down the rows, giving one result per column

3
Q

axis=1 parameter

A

The column axis: the operation runs across the columns, giving one result per row
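
As a quick check of the two axis cards, here is a minimal sketch on a small made-up DataFrame (the column names and values are illustrative assumptions). Summing with axis=0 collapses the rows and yields one value per column; axis=1 collapses the columns and yields one value per row.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"W": [1, 2, 3], "Z": [10, 20, 30]})

col_totals = df.sum(axis=0)  # collapses the rows: one value per column
row_totals = df.sum(axis=1)  # collapses the columns: one value per row

print(col_totals.tolist())  # [6, 60]
print(row_totals.tolist())  # [11, 22, 33]
```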

4
Q

Select columns W and Z from the DataFrame

A

df[['W', 'Z']]

5
Q

Create a new column in the DataFrame

A

df['new'] = df['W'] + df['Y']

6
Q

Algorithm to cut the rows of the DataFrame at a given value

A

Find the row index with line = df.loc[df['COL'] == 'Limite'].index.min()

df = df[df.index < line]
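
A minimal sketch of the cut above, assuming a default integer index in ascending order (the comparison df.index < line relies on that). The sample data is hypothetical; 'COL' and 'Limite' are the names used on the card.

```python
import pandas as pd

# Hypothetical data; 'Limite' marks the row where the cut happens.
df = pd.DataFrame({"COL": ["a", "b", "Limite", "c"], "VAL": [1, 2, 3, 4]})

# First index label at which the marker value appears.
line = df.loc[df["COL"] == "Limite"].index.min()

# Keep only the rows that come before the marker row.
trimmed = df[df.index < line]
print(trimmed["COL"].tolist())  # ['a', 'b']
```

The marker row itself is dropped; use <= instead of < to keep it.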

7
Q

Keep only selected columns of a DataFrame

A

col_names = ['Data', 'A']

df = df[col_names]

8
Q

Uppercase the column names

A

df.columns = df.columns.str.upper()

9
Q

Replace NaN in a column with a value

A

df['A'] = df['A'].fillna('0,0')

10
Q

Apply a function to a Series

A

df['A'] = df['A'].apply(function)

11
Q

Convert a DataFrame column from string to date

A

import pandas as pd

df['A'] = pd.to_datetime(df['A'], format='%d/%m/%y')
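
A short sketch of the conversion, assuming day/month/two-digit-year strings as in the format on the card. The sample dates are made up for illustration.

```python
import pandas as pd

# Hypothetical date strings in day/month/year order.
df = pd.DataFrame({"A": ["31/12/21", "01/02/22"]})

# Parse with an explicit format so day and month are not swapped.
df["A"] = pd.to_datetime(df["A"], format="%d/%m/%y")

iso_dates = df["A"].dt.strftime("%Y-%m-%d").tolist()
print(iso_dates)  # ['2021-12-31', '2022-02-01']
```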

12
Q

Truncate (cut) values in a column

A

df['A'] = df['A'].map(str).str.slice(0, 10)

13
Q

Consolidate DataFrames into one final DataFrame

A

df_final.append(df_bradesco)  # df_final is a plain Python list of DataFrames at this point
df_final = pd.concat(df_final, axis=0)
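
The pattern above can be sketched as follows: collect the DataFrames in a plain Python list, then concatenate once at the end. The names frames and df_other and the sample values are illustrative assumptions; only df_bradesco and df_final come from the card.

```python
import pandas as pd

# Hypothetical per-source frames with the same columns.
df_bradesco = pd.DataFrame({"KEY": [1, 2], "A": [10.0, 20.0]})
df_other = pd.DataFrame({"KEY": [3], "A": [30.0]})

frames = []                 # a plain list, so list.append() applies here
frames.append(df_bradesco)
frames.append(df_other)

# Stack the rows; ignore_index=True renumbers the result from 0.
df_final = pd.concat(frames, axis=0, ignore_index=True)
print(df_final["KEY"].tolist())  # [1, 2, 3]
```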

14
Q

Convert a column to string type

A

df['A'] = df['A'].astype(str)

15
Q

Remove duplicates from a DataFrame

A

df.drop_duplicates(inplace=True)

16
Q

Left join of two DataFrames

A

df_final = pd.merge(df_final, df_teste, how='left', left_on='KEY', right_on='KEY')
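
A minimal sketch of the left join, with made-up data. Because the key column has the same name on both sides, on='KEY' is used here as a shorthand for left_on/right_on. Rows of the left frame with no match keep their place and get NaN in the right-hand columns.

```python
import pandas as pd

# Hypothetical frames sharing a 'KEY' column.
df_final = pd.DataFrame({"KEY": ["a", "b", "c"], "X": [1, 2, 3]})
df_teste = pd.DataFrame({"KEY": ["a", "c"], "Y": [100, 300]})

merged = pd.merge(df_final, df_teste, how="left", on="KEY")

# 'b' has no partner in df_teste, so its Y value is missing.
print(merged["Y"].isna().tolist())  # [False, True, False]
```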

17
Q

For each row of a Series, assign a value taken from another Series

A

df.at[chave_base, 'A'] = df_teste.at[chave_teste, 'A']

18
Q

Saving DataFrames to an Excel Workbook

A

from pandas import ExcelWriter

writer = ExcelWriter('filename.xlsx')

df1.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writer.save() was removed in pandas 2.0

19
Q

Save DataFrame as a dictionary

A

d = df.to_dict()

20
Q

Save DataFrame as a string

A

s = df.to_string()  # avoid naming the variable str, which shadows the built-in

21
Q

Save DataFrame as a NumPy array

A

m = df.to_numpy()

22
Q

Transpose rows and columns in a DataFrame

A

df = df.T

23
Q

Iterate over columns

A

df.items()  # iteritems() was removed in pandas 2.0

24
Q

Iterate over rows

A

df.iterrows()
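
A small sketch of both iteration styles on a made-up DataFrame. Iterating over columns yields (label, Series) pairs, and iterrows() yields (index, Series) pairs, one per row.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Column-wise: each item is (column label, column as a Series).
column_names = [name for name, column in df.items()]

# Row-wise: each item is (index label, row as a Series).
row_sums = [int(row.sum()) for index, row in df.iterrows()]

print(column_names)  # ['A', 'B']
print(row_sums)      # [4, 6]
```

For large DataFrames, a vectorized call such as df.sum(axis=1) is usually much faster than looping with iterrows().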

25
Q

Filter a DataFrame with like

A

df = df.filter(like='x')

26
Q

Get first column label in a DataFrame

A

label = df.columns[0]

27
Q

Get list of column labels in a DataFrame

A

lis = df.columns.tolist()

28
Q

Get an array of column labels in a DataFrame

A

a = df.columns.values

29
Q

Select column to series

A

s = df['colName']

30
Q

Select column to DataFrame

A

df = df[['colName']]

31
Q

Select all but last column to a DataFrame

A

df = df[df.columns[:-1]]

32
Q

Swap columns content in a DataFrame

A

df[['B', 'A']] = df[['A', 'B']]

33
Q

Dropping (deleting) columns

A

df.drop('col1', axis=1, inplace=True)

34
Q

Apply log to a column

A

df['log_data'] = np.log(df['col1'])

35
Q

Set column values based on criteria

A

df['d'] = df['a'].where(df.b != 0, other=df.c)
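
A minimal sketch of how where() behaves, on made-up data: values of column a are kept wherever the condition is True, and replaced by the matching value of column c wherever it is False.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0, 5, 0], "c": [10, 20, 30]})

# Keep 'a' where b != 0; otherwise fall back to 'c'.
df["d"] = df["a"].where(df["b"] != 0, other=df["c"])

print(df["d"].tolist())  # [10, 2, 30]
```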

36
Q

Find index label for min/max values in column

A

label = df['col1'].idxmin()

label = df['col1'].idxmax()

37
Q

Absolute value (modulus) of a column in a DataFrame

A

df['col'] = df['col'].abs()

38
Q

Convert column to date

A

s = pd.to_datetime(df['col'])

39
Q

Create a column with a rolling pct change

A

s = df['col'].pct_change(periods=4)

40
Q

Create a column with a rolling calculation

A

s = df['col'].rolling(window=4, min_periods=4, center=False).sum()
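
A short sketch of the two window calculations on a made-up column. With window=4 and min_periods=4, the first three results are NaN because a full window is not yet available; pct_change(periods=4) compares each value with the one four rows earlier.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})

rolled = df["col"].rolling(window=4, min_periods=4, center=False).sum()
changed = df["col"].pct_change(periods=4)

print(rolled.tolist()[3:])   # [10.0, 14.0, 18.0]
print(changed.tolist()[4:])  # [4.0, 2.0]
```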

41
Q

Append a column of row sums in a DataFrame

A

df['Total'] = df.sum(axis=1)

42
Q

Get the integer position of a column index label

A

i = df.columns.get_loc('col_name')

43
Q

Adding rows to a DataFrame

A

df = pd.concat([original_df, more_rows_in_df])  # DataFrame.append() was removed in pandas 2.0

44
Q

Dropping rows (by name)

A

df = df.drop('row_label')

45
Q

Boolean row selection by values greater than or equal to

A

df = df[df['col2'] >= 0.0]

46
Q

Boolean row selection by values with OR condition

A

df = df[(df['col3'] >= 0.0) | (df['col1'] < 0.0)]

Parentheses are required around each comparison
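
A minimal sketch of the OR selection on made-up data. The parentheses matter because Python's | operator binds more tightly than comparison operators such as >= and <.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col1": [-1.0, 2.0, 3.0], "col3": [-5.0, -6.0, 7.0]})

# A row is kept when either comparison is True.
selected = df[(df["col3"] >= 0.0) | (df["col1"] < 0.0)]

print(selected.index.tolist())  # [0, 2]
```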

47
Q

Boolean row selection by values in list

A

df = df[df['col'].isin([1, 2, 5, 7, 11])]

48
Q

Boolean row selection by values NOT in list

A

df = df[~df['col'].isin([1, 2, 5, 7, 11])]

49
Q

Boolean row selection by values containing

A

df = df[df['col'].str.contains('hello')]

50
Q

Get integer position of rows that meet condition

A

a = np.where(df['col'] >= 2)[0]

np.where returns a tuple of arrays; [0] takes the NumPy array of integer positions
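
A small sketch showing that the result holds integer positions rather than index labels. The data and the string index are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-integer index, to separate positions from labels.
df = pd.DataFrame({"col": [1, 2, 0, 5]}, index=["w", "x", "y", "z"])

positions = np.where(df["col"] >= 2)[0]  # integer positions of matching rows
labels = df.index[positions]             # the corresponding index labels

print(positions.tolist())  # [1, 3]
print(labels.tolist())     # ['x', 'z']
```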

51
Q

Find row index duplicates

A

if df.index.has_duplicates:

    print(df.index.duplicated())
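
A minimal sketch of the duplicate check on a made-up DataFrame with a repeated index label. The last line, which keeps only the first occurrence of each label, is an illustrative extra and not part of the card.

```python
import pandas as pd

# Hypothetical data: the label 'a' appears twice in the index.
df = pd.DataFrame({"v": [1, 2, 3]}, index=["a", "b", "a"])

flags = df.index.duplicated()  # True for every repeat after the first
if df.index.has_duplicates:
    print(flags.tolist())  # [False, False, True]

# Keep only the first row for each index label.
deduped = df[~df.index.duplicated(keep="first")]
```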

52
Q

Select a cell by row and column labels

A

value = df.at['row', 'col']

.at[] is the fastest label-based scalar lookup

53
Q

Grouping with an aggregating function

A

s = df.groupby('cat')['col1'].sum()
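
A short sketch of the grouping on made-up data. The result is a Series indexed by the distinct values of the grouping column, sorted by key by default.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"cat": ["x", "y", "x"], "col1": [1, 2, 3]})

s = df.groupby("cat")["col1"].sum()

print(s.index.tolist())  # ['x', 'y']
print(s.tolist())        # [4, 2]
```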

54
Q

Change a string to lower case

A

s = df['col'].str.lower()

55
Q

Get the length of the strings in a column

A

s = df['col'].str.len()

56
Q

Append values to a string

A

df['col'] += 'suffix'

57
Q

Filter a DataFrame with a list of items

A

df = df.filter(items=['a', 'b'], axis=0)
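
A minimal sketch of filter() on made-up data. Note that filter() selects by label, not by cell value: items with axis=0 picks rows by index label, and like (on the default axis for a DataFrame, the columns) keeps labels that contain the substring.

```python
import pandas as pd

# Hypothetical sample data with labelled rows.
df = pd.DataFrame(
    {"x1": [1, 2, 3], "y": [4, 5, 6], "x2": [7, 8, 9]},
    index=["a", "b", "c"],
)

rows = df.filter(items=["a", "b"], axis=0)  # rows whose index label is listed
cols = df.filter(like="x")                  # columns whose name contains 'x'

print(rows.index.tolist())    # ['a', 'b']
print(cols.columns.tolist())  # ['x1', 'x2']
```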

58
Q

Remove rows with NaN

A

df.dropna(inplace=True)

59
Q

Definition of the Strip method for strings

A

Returns a copy of the string with leading and trailing characters removed. The optional string argument is the set of characters to strip; with no argument, whitespace is removed.
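
A small sketch of strip() on plain Python strings and, as an assumed extension to the pandas setting of this deck, on a Series through the .str accessor. The sample strings are made up.

```python
import pandas as pd

plain = "  hello  ".strip()      # no argument: whitespace is removed
custom = "xxhixyx".strip("xy")   # argument: any of 'x' or 'y' is removed from both ends

# The same operation applied to every string in a Series.
s = pd.Series(["  a ", "b  "])
stripped = s.str.strip().tolist()

print(plain)     # hello
print(custom)    # hi
print(stripped)  # ['a', 'b']
```

The argument is treated as a set of characters, not as a prefix or suffix to match as a whole.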
