Importing dataset Flashcards
Importing
import pandas as pd
Read the online file
# Import pandas library
import pandas as pd
# Read the online file from the URL below and assign it to the variable "df"
other_path = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-SkillsNetwork/labs/Data%20files/auto.csv"
df = pd.read_csv(other_path, header=None)
header=None
Passing the argument header=None to the read_csv() function tells pandas not to treat the first row as a header. The first row is kept as data and the columns are numbered 0, 1, 2, and so on.
import pandas as pd
url = xxxxx
df = pd.read_csv(url, header=None)
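To see what header=None changes, here is a minimal sketch that reads the same text with and without it. The two-row CSV is made-up illustration data, not the course file, and io.StringIO stands in for the URL so the example runs offline.

```python
import io
import pandas as pd

# Hypothetical two-row CSV with no header line (illustration values only).
csv_text = "3,alfa-romero,13495\n1,audi,13950\n"

# Default behaviour: the first row is consumed as the column names.
df_default = pd.read_csv(io.StringIO(csv_text))

# header=None: every row is kept as data and columns are numbered 0, 1, 2.
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(df_default.shape)             # one data row is left
print(df_no_header.shape)           # both rows are kept
print(list(df_no_header.columns))   # integer column labels
```

Without header=None, the first car in the file would silently become the column names.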
Print the first 5 rows
df.head(5)
Print the bottom 10 rows
df.tail(10)
You need to define the headers first and then print the first 10 rows
df.columns = headers
df.head(10)
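The card above assumes a list called headers already exists. A minimal sketch of the whole step follows, using a made-up three-column frame and a hypothetical headers list. The list needs exactly one name per column.

```python
import io
import pandas as pd

# Hypothetical header-less data (illustration values only).
csv_text = "3,alfa-romero,13495\n1,audi,13950\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Hypothetical header list: one name for each of the three columns.
headers = ["symboling", "make", "price"]

# Replace the default integer labels 0, 1, 2 with the names.
df.columns = headers

# head(10) returns at most 10 rows; here the frame only has 2.
print(df.head(10))
```

If the list length does not match the number of columns, pandas raises a ValueError.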
Mark missing values: replace "?" with NaN
import numpy as np
df1 = df.replace("?", np.nan)
Drop missing values along the column “price”
df = df1.dropna(subset=["price"], axis=0)
df.head(20)
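The two cards above work as a pair: first mark the "?" placeholders as NaN, then drop the rows where "price" is NaN. A minimal sketch on made-up data (the three rows below are illustration values, not taken from the course file):

```python
import io
import numpy as np
import pandas as pd

# Hypothetical data in which "?" stands for an unknown price.
csv_text = "make,price\naudi,13950\nbmw,?\nvolvo,16845\n"
df = pd.read_csv(io.StringIO(csv_text))

# Step 1: turn the "?" placeholder into a real missing value.
df1 = df.replace("?", np.nan)

# Step 2: drop every row (axis=0) whose "price" is missing.
df_clean = df1.dropna(subset=["price"], axis=0)

print(df1["price"].isna().sum())   # number of prices marked as missing
print(df_clean)
```

Because the column contained "?", pandas read the prices as text. Converting them to numbers, for example with astype, is a separate step.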
Print columns
print(df.columns)
Print data types
print(df.dtypes)
Describe all the main characteristics of the data
# describe all the columns in "df"
df.describe(include="all")
Describe certain columns
df[["length", "compression-ratio"]].describe()
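A minimal sketch comparing the three describe() calls on a small made-up frame. The values are for illustration only, and the column names simply mirror the ones used above.

```python
import pandas as pd

# Hypothetical three-row frame with one text column and two numeric columns.
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi"],
    "length": [176.6, 189.0, 192.7],
    "compression-ratio": [10.0, 8.8, 8.5],
})

# Default: numeric columns only (count, mean, std, min, quartiles, max).
numeric_only = df.describe()

# include="all": text columns too, with extra rows unique, top and freq.
everything = df.describe(include="all")

# Selected columns only: pass a list of names inside double brackets.
subset = df[["length", "compression-ratio"]].describe()

print(numeric_only.columns.tolist())
print(everything.loc[["unique", "top", "freq"], "make"])
print(subset)
```

With include="all", statistics that do not apply to a column (such as mean for "make") show up as NaN.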
Export to different formats in Python
data format: csv
read: pd.read_csv()
save: df.to_csv()
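A minimal sketch of a CSV round trip with the two calls above. It writes to an in-memory buffer instead of a file path so it runs anywhere. The frame is made-up, and index=False is an optional choice made here to keep the row index out of the output.

```python
import io
import pandas as pd

# Hypothetical frame to export (illustration values only).
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [13950, 16430]})

# save: write the frame as CSV text. A file path such as a hypothetical
# "automobile.csv" could be passed instead of the buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# read: rewind the buffer and load the CSV text back into a new frame.
buffer.seek(0)
df_back = pd.read_csv(buffer)

print(buffer.getvalue())
print(df_back)
```

Leaving out index=False writes the row index as an extra first column, which then comes back as an unnamed column on the next read.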