Python Pandas 2 Flashcards
Line to read a CSV file
import pandas as pd
df = pd.read_csv("k.csv")
What does df.head(3) do?
Returns the first 3 rows of the DataFrame.
What does df.tail(3) do?
Returns the last 3 rows of the DataFrame.
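A minimal sketch of head and tail on a small made-up table (the column names and values are assumptions, used only for illustration):

```python
import pandas as pd

# Hypothetical data, not taken from any real file.
df = pd.DataFrame({"Name": ["a", "b", "c", "d", "e"],
                   "HP": [45, 60, 80, 39, 58]})

first_three = df.head(3)  # rows at positions 0, 1, 2
last_three = df.tail(3)   # rows at positions 2, 3, 4

print(first_three)
print(last_three)
```

Both calls return a new DataFrame; nothing is printed unless you print the result yourself.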
What is the line to read an Excel sheet?
excel_data_df = pd.read_excel('records.xlsx', sheet_name='Employees')
How do you read a file that isn't a CSV but uses a different delimiter?
df = pd.read_csv("pk_data.txt", delimiter='\t')
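A small sketch of reading tab-separated data. To stay self-contained it reads from an in-memory string instead of a real file; the contents are made up for illustration:

```python
import io
import pandas as pd

# In-memory stand-in for a tab-separated file (hypothetical contents).
raw = "Name\tType 1\tHP\nBulbasaur\tGrass\t45\nCharmander\tFire\t39\n"

# delimiter="\t" tells read_csv to split on tabs; sep="\t" is equivalent.
df = pd.read_csv(io.StringIO(raw), delimiter="\t")

print(df.columns.tolist())
print(df)
```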
Line to get names of all columns in the data file
df.columns
Line to print one specific column
df['Col_Name']
Line to get the first 5 entries of a specific column
df['Col'][0:5]
Line to get multiple selected columns
df[['Col1', 'Col2']]
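The three column-selection cards above can be sketched like this, assuming a small made-up table (the names and values are hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b", "c"],
                   "Type 1": ["Grass", "Fire", "Water"],
                   "HP": [45, 39, 44]})

one_col = df["Name"]           # single brackets give a Series
top_two = df["Name"][0:2]      # first two values of that Series
several = df[["Name", "HP"]]   # a list of labels gives a DataFrame

print(one_col)
print(top_two)
print(several)
```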
Line to get all details of the row at a certain position
df.iloc[1]
(Returns the second row, since positions are zero-indexed.)
Line to get all details of the first few rows
df.iloc[0:2]
Line to get a specific column of a certain entry using ONLY INDICES
df.iloc[1, 2]
1 is the second entry,
and 2 is the third column (both positions are zero-indexed).
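A minimal sketch of the iloc cards above on a made-up table (the data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b", "c"],
                   "Type 1": ["Grass", "Fire", "Water"],
                   "HP": [45, 39, 44]})

second_row = df.iloc[1]     # one row, returned as a Series
first_rows = df.iloc[0:2]   # rows at positions 0 and 1 (the end is excluded)
cell = df.iloc[1, 2]        # second row, third column

print(second_row)
print(first_rows)
print(cell)
```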
Line to iterate through all entries of the data sheet
for index, row in df.iterrows():
    print(index, row)
/////df is the DataFrame; row is the Series for the current row/////
Line to iterate through rows and print specific columns of those entries
for index, row in df.iterrows():
    print(index, row['Name'])
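A runnable sketch of row iteration, assuming a tiny made-up table. The loop variable is called row here so that it does not overwrite df while the loop runs:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b", "c"], "HP": [45, 39, 44]})

names = []
for index, row in df.iterrows():
    # row is a Series holding one row; its values are looked up by column label
    names.append(row["Name"])
    print(index, row["Name"])
```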
Line to fetch data entries that satisfy a certain condition (search filter)
df.loc[df['Type 1'] == 'Grass']
df['Type 1'] returns all the values in the Type 1 column; the comparison then marks which of them are 'Grass', and df.loc keeps only those rows.
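Line to fetch data that passes multiple filter layers
df.loc[(df['Type 1'] == 'Grass') & (df['Type 2'] == 'Fire')]
(Each condition goes in parentheses; & means AND, | means OR. 'Type 2' is assumed to be a second column in the data.)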
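A sketch of single and combined filters on a made-up table. The 'Type 2' column and all values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b", "c", "d"],
                   "Type 1": ["Grass", "Grass", "Fire", "Water"],
                   "Type 2": ["Poison", "Flying", "Flying", "Poison"]})

# One condition: the comparison builds a True/False mask, loc keeps the True rows.
grass = df.loc[df["Type 1"] == "Grass"]

# Two conditions: each one in parentheses, joined with & (AND).
grass_poison = df.loc[(df["Type 1"] == "Grass") & (df["Type 2"] == "Poison")]

print(grass)
print(grass_poison)
```

The parentheses matter because & binds more tightly than == in Python.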
Line to get the standard deviation of a certain column
df['Age'].describe()['std']
What is the use of describe() in pandas?
It returns summary statistics for numeric data: count, mean, std, min, the 25th, 50th and 75th percentiles, and max.
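A small sketch of describe() on one column, using made-up ages. It also shows that the std entry matches calling .std() directly:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Age": [20, 30, 40]})

stats = df["Age"].describe()   # a Series indexed by statistic name
std_from_describe = stats["std"]
std_direct = df["Age"].std()   # same sample standard deviation

print(stats.index.tolist())
print(std_from_describe, std_direct)
```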
Sort values in ascending order
df.sort_values('Power')
Sort names in lexicographic order
df.sort_values('Name')
Sort elements in descending order
df.sort_values('Name', ascending=False)
Sort by the first column ascending and the second column descending
df.sort_values(['Type 1', 'HP'], ascending=[True, False])
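The sorting cards above, sketched on a made-up table (the data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["c", "a", "b", "d"],
                   "Type 1": ["Grass", "Fire", "Grass", "Fire"],
                   "HP": [45, 39, 60, 58]})

by_name = df.sort_values("Name")                        # ascending is the default
by_name_desc = df.sort_values("Name", ascending=False)  # descending
# One flag per column: 'Type 1' ascending, then 'HP' descending within each type.
mixed = df.sort_values(["Type 1", "HP"], ascending=[True, False])

print(by_name)
print(by_name_desc)
print(mixed)
```

sort_values returns a sorted copy and leaves df itself in its original order.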
Line to create a new column that is a function of existing columns
df['Total'] = df['HP'] + df['Attack'] - df['Damage']
/////pandas creates the new column automatically and fills it for every row/////
Remove multiple columns
df = df.drop(columns=['Total1', 'Total2'])
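A minimal sketch of adding a derived column and then dropping columns, with made-up numbers (the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"HP": [45, 39], "Attack": [49, 52], "Damage": [5, 10]})

# The arithmetic is applied row by row across the three columns.
df["Total"] = df["HP"] + df["Attack"] - df["Damage"]

# drop returns a new DataFrame, so the result is assigned back to df.
df = df.drop(columns=["Attack", "Damage"])

print(df)
```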
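Getting stats for a specific column
df["Age"].describe()
What does the describe function of a pandas DataFrame return?
It returns a DataFrame of summary statistics with one column per numeric column (called on a single column, it returns a Series). Individual quantities can be looked up by label, much like a dictionary.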
Line to get a specific column (VIA LABEL) of a certain entry (VIA INDEX)
first = df.iloc[0]['Age']
We used 0 as the index for the entry (meaning the first row) and the label as the identifier for the column (Age is the column).
Line to get the standard deviation of all columns
df.describe().loc['std']
(On a DataFrame, describe() puts the statistics in rows, so std is selected with .loc.)
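A sketch of describe() on a whole DataFrame, using made-up numbers. The result has the statistics as row labels and the original numeric columns as columns, so the std row is picked out with .loc:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Age": [20, 30, 40], "HP": [10, 10, 10]})

summary = df.describe()        # rows: count, mean, std, ...; columns: Age, HP
std_all = summary.loc["std"]   # one standard deviation per numeric column

print(summary)
print(std_all)
```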
Error debugging #1
What happens if you write print(df[Name])?
It raises a NameError, because df[...] should be given the column name as a string, df['Name'], not the bare name Name (which would only work if Name were defined as a variable holding that string).
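A small sketch that reproduces this error safely and then shows the working form (the table is made up for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b"]})

try:
    print(df[Name])  # Name is not defined as a variable, so Python stops here
except NameError as err:
    message = str(err)
    print("NameError:", message)

col = "Name"     # a variable that holds the column label as a string
print(df[col])   # works, same as df["Name"]
```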
What is the difference between the iloc and loc indexers?
loc fetches entries by label, and also accepts boolean conditions,
whereas
iloc fetches entries by integer position.
(Hence the i, which stands for integer location.)
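The difference is easiest to see when the row labels are not 0, 1, 2. A minimal sketch with made-up labels and values (all assumptions for illustration):

```python
import pandas as pd

# Hypothetical data; the row labels are strings on purpose.
df = pd.DataFrame({"HP": [45, 39, 44]}, index=["x", "y", "z"])

by_label = df.loc["y", "HP"]     # loc: row label "y", column label "HP"
by_position = df.iloc[1, 0]      # iloc: second row, first column
filtered = df.loc[df["HP"] > 40] # loc also accepts a boolean mask

print(by_label, by_position)
print(filtered)
```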
Line for filtering data of a certain column using the index of the column BUT NOT THE LABEL
df.loc[df.iloc[:,6]
Line for printing specific entries based on comparison filters
df[df['Col1']
Line for printing specific columns AFTER applying a search criterion
df.loc[df['Units']
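The answers on the three filter cards above are cut off after the column selection, so the original comparisons are not known. The sketch below fills them in with assumed thresholds (> 50 and > 3) and made-up data, purely to show the shape of each line:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["a", "b", "c"],
                   "Units": [10, 60, 80],
                   "Col1": [1, 5, 9]})

# Filter on a column chosen by position (second column), not by label.
by_position = df.loc[df.iloc[:, 1] > 50]      # assumed comparison

# Filter on a column chosen by label.
by_label = df[df["Col1"] > 3]                 # assumed comparison

# Filter rows first, then keep only selected columns.
name_only = df.loc[df["Units"] > 50, ["Name"]]  # assumed comparison

print(by_position)
print(by_label)
print(name_only)
```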
Line to replace all entries equal to a specific element (string or value) with a DIFFERENT element
df.replace(to_replace="Boston Celtics", value="Omega Warrior")
Line to replace all entries equal to any element of a set (strings or values) with a DIFFERENT element
df.replace(to_replace=["Boston Celtics", "Texas"], value="Omega Warrior")
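A minimal sketch of both replace cards on a made-up column (the 'Team' column and the 'Utah Jazz' value are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Team": ["Boston Celtics", "Texas", "Utah Jazz"]})

# Replace one value wherever a cell equals it exactly.
single = df.replace(to_replace="Boston Celtics", value="Omega Warrior")

# Replace any cell that equals one of several values.
multi = df.replace(to_replace=["Boston Celtics", "Texas"], value="Omega Warrior")

print(single)
print(multi)
```

replace returns a new DataFrame, so df keeps its original values unless the result is assigned back.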