Pandas Flashcards by pedro martins

Create a new DataFrame

df = pd.DataFrame() (Atenção com as maiúsculas!)

How well did you know this?

Not at all

Perfectly

parâmetro axis=0

eixo x

How well did you know this?

Not at all

Perfectly

parâmetro axis=1

eixo y

How well did you know this?

Not at all

Perfectly

selecionar colunas W e Z do dataframe

df[[‘W’,’Z’]]

How well did you know this?

Not at all

Perfectly

cria coluna nova no dataframe

df[‘new’] = df[‘W’] + df[‘Y’]

How well did you know this?

Not at all

Perfectly

Algo to cut the lines of the DataFrame up to a value

Find the line index using line= df.loc[df[‘COL’] == ‘Limite’].index.min()

df = df[df.index < line]

How well did you know this?

Not at all

Perfectly

Cut the columns on a DataFrame

col_names = [‘Data’, ‘A’]

df = df[col_names]

How well did you know this?

Not at all

Perfectly

Caps Lock no nome das colunas

df.columns = df.columns.str.upper()

How well did you know this?

Not at all

Perfectly

Substituir nan em uma coluna por um valor

df[‘A’] = df[‘A’].fillna(‘0,0’)

How well did you know this?

Not at all

Perfectly

aplicar uma funcao em uma serie

df[‘A’] = df[‘A’].apply(function)

How well did you know this?

Not at all

Perfectly

change a DataFrame column from string to date

import datetime

df[‘A’] = pd.to_datetime(df[‘A’], format=’%d/%m/%y’)

How well did you know this?

Not at all

Perfectly

truncate (cut) values on a column

df[‘A’] = df[‘A’].map(str).str.slice(0,10)

How well did you know this?

Not at all

Perfectly

consolida dataframes em um dataframe final

df_final.append(df_bradesco)
df_final = pd.concat(df_final, axis=0)

How well did you know this?

Not at all

Perfectly

transforma uma coluna para o tipo string

df[‘A’].astype(str)

How well did you know this?

Not at all

Perfectly

remover duplicatas de um dataframe

df.drop_duplicates(inplace=True)

How well did you know this?

Not at all

Perfectly

left join de dois dataframes

df_final = pd.merge(df_final,df_teste,how=’left’,left_on=’KEY’,right_on=’KEY’)

How well did you know this?

Not at all

Perfectly

para cada linha da serie, recebe um valor de outra serie

df.at[chave_base, ‘A’] = df_teste.at[chave_teste, ‘A’]

How well did you know this?

Not at all

Perfectly

Saving DataFrames to an Excel Workbook

from pandas import ExcelWriter

writer = ExcelWriter(‘filename.xlsx’)

df1. to_excel(writer, ‘Sheet1’)
writer. save()

How well did you know this?

Not at all

Perfectly

Save DataFrame as a dictionary

d = df.to_dict()

How well did you know this?

Not at all

Perfectly

Save DataFrame as a string

str = df.to_string()

How well did you know this?

Not at all

Perfectly

Save DataFrame as a numpy matrix

m = df.to_matrix()

How well did you know this?

Not at all

Perfectly

Transpose rows and columns in a DataFrame

df = df.T

How well did you know this?

Not at all

Perfectly

Iterate between columns

df.iteritems()

How well did you know this?

Not at all

Perfectly

Iterate between rows

df.iterrows()

How well did you know this?

Not at all

Perfectly

Filter a DataFrame with like

df = df.filter(like='x')

Get first column label in a DataFrame

label = df.columns[0]

Get list of column labels in a DataFrame

lis = df.columns.tolist()

Get an array of column labels in a DataFrame

a = df.columns.values

Select column to series

s= df['colName']

Select column to DataFrame

df = df[['colName] ]

Select all but last column to a DataFrame

df = df[df.columns[:-1] ]

Swap columns content in a DataFrame

df[['B', 'A'] ] = df[['A', 'B'] ]

Dropping (deleting) columns

df.drop('col1', axis=1, inplace=True)

Apply log to a column

df['log\_data'] = np.log( df['col1'] )

Set column values based on criteria

df]'d'] = df['a].where(df.b != 0, other=df.c)

Find index label for min/max values in column

label = df['col1'].idxmin() label = df['col1'].idxmax()

Module of a column in a DataFrame

df['col'] = df['col'].abs()

Convert column to date

s = df['col'],to\_datetime()

Create a column with a rolling pct change

s = df['col'].pct\_change(periods=4)

Create a column with a rolling calculation

s = df['col'].rolling(window=4, min\_periods=4, center=False).sum()

Append a column of row sums in a DataFrame

df['Total'] = df.sum(axis=1)

Get the integer position of a column index label

i = df.columns.get\_loc('col\_name')

Adding rows to a DataFrame

df = original\_df.append(more\_rows\_in\_df)

Dropping rows (by name)

df = df.drop('row\_label')

Boolean row selection by values greater than

df = df[df['col2'] \>= 0.0]

Boolean row selection by values with OR condition

df = df[(df['col3'] \>= 0.0) | (df['col1'] \< 0.0) need parenthesis around comparisons

Boolean row selection by values in list

df = df[df['col'].isin( [1, 2, 5, 7, 11] ) ]

Boolean row selection by values NOT in list

df = df[~ df['col'].isin( [1, 2, 5, 7, 11] ) ]

Boolean row selection by values containing

df = df[df['col'].str.contains('hello') ]

Get integer position of rows that meet condition

a = np.where(df['col'] \>= 2) produces a numpy array

Find row index duplicates

if df.items.has\_duplicates: prin(df.index.duplicated() )

Select a cell by row and column labels

value = df.at['row', 'col'] .at[] is the fastest label based scalar lookup

Grouping with an aggregating function

s = df.groupby('cat')['col1'].sum()

Change a string to lower case

s = df['col'].str.lower()

Get the length of the strings in a column

s =df['col'].str.len()

Append values to a string

df['col'] += 'suffix'

Filter a DataFrame with a list of items

df = df.filter(items['a' , 'b'], axis=0)

Remove lines with NaN

df.dropna(inplace=True)

Definition of the Strip method for strings

Returns a copy of the string with both leading and trailing characters removed (based on the string argument passed).

Pandas Flashcards

(62 cards)