Intro To Pandas Flashcards
What is pandas?
One of the most common Python libraries used by data scientists.
Why is pandas so popular?
Because it can load data from just about any source, such as SQL databases, the web, Excel files and much more.
How is pandas used?
Pandas provides easy-to-use data structures and tools for efficiently loading, manipulating and exporting in-memory data in Python.
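The load, manipulate and export cycle described above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the CSV text and the "Interest Rate" and "Rate Fraction" columns are made-up sample data, and an in-memory string stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a real file, URL or database export.
csv_text = "Mortgage Name,Interest Rate\n30 Year,3.5\n15 Year,2.9\n30 Year,3.7\n"

# Load: read_csv accepts a path, a URL or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Manipulate: add a derived column computed from an existing one.
df["Rate Fraction"] = df["Interest Rate"] / 100

# Export: called without a path, to_csv returns the CSV as a string.
exported = df.to_csv(index=False)
print(exported)
```

Passing a file path to to_csv instead would write the same content to disk.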
Why do data manipulation with pandas?
The pandas library helps you explore your data and visually see the structure of your output as you are transforming your data.
Which Python libraries work very nicely with pandas DataFrames?
scikit-learn, statsmodels and data visualization libraries (matplotlib, seaborn)
How does having data cleaned in a dataframe help?
It lets you quickly visualize your data or feed it into a machine learning algorithm.
What is tabular data?
Data presented in rows and columns, as in a table
What is slicing?
Accessing just certain parts of a dataset
How to use slicing?
With square brackets: single brackets return a pandas Series and double brackets return a DataFrame
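A small sketch of the single versus double bracket behavior, using a made-up two-row DataFrame (the "Interest Rate" column is an assumption added for illustration):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"Mortgage Name": ["30 Year", "15 Year"], "Interest Rate": [3.5, 2.9]}
)

# Single brackets with one column label select that column as a Series.
single = df["Mortgage Name"]

# Double brackets pass a list of labels, so the result stays a DataFrame.
double = df[["Mortgage Name"]]

print(type(single).__name__)  # Series
print(type(double).__name__)  # DataFrame
```

The double-bracket form is also how several columns are selected at once, for example df[["Mortgage Name", "Interest Rate"]].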
Difference between a NumPy array and a pandas Series?
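The essential difference is how they are indexed: a NumPy array has an implicit integer position index, while a pandas Series has an explicit index of labels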
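The indexing difference can be seen side by side in a short sketch. The values and the labels "a", "b", "c" are arbitrary examples.

```python
import numpy as np
import pandas as pd

values = [3.5, 2.9, 3.7]

# A NumPy array is addressed by integer position only.
arr = np.array(values)

# A Series carries an explicit index; here the labels are strings.
ser = pd.Series(values, index=["a", "b", "c"])

by_position = arr[1]         # second element of the array
by_label = ser["b"]          # element of the Series labeled "b"
also_position = ser.iloc[1]  # a Series still supports positional access

print(by_position, by_label, also_position)
```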
How to mount Google Drive in Colab?
from google.colab import drive
drive.mount('/content/drive')
How to select the mortgage names from your data?
df['Mortgage Name']
How to count how often each mortgage name appears?
df['Mortgage Name'].value_counts()
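As a rough illustration of what value_counts() returns, here is a sketch on made-up mortgage names. The result is a Series whose index holds the distinct values and whose values are their counts, sorted from most to least frequent.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"Mortgage Name": ["30 Year", "15 Year", "30 Year", "5/1 ARM", "30 Year"]}
)

# Count occurrences of each distinct value in the column.
counts = df["Mortgage Name"].value_counts()

print(counts)
print(counts["30 Year"])  # 3
```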
How to filter for just 30-year mortgages?
df[df['Mortgage Name'] == '30 Year']
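The filter above works in two steps: the comparison builds a boolean Series (a mask), and indexing with that mask keeps only the rows where it is True. A sketch on made-up data, with a hypothetical "Interest Rate" column:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "Mortgage Name": ["30 Year", "15 Year", "30 Year"],
        "Interest Rate": [3.5, 2.9, 3.7],
    }
)

# Step 1: element-wise comparison gives one True/False per row.
mask = df["Mortgage Name"] == "30 Year"

# Step 2: indexing with the mask keeps only the True rows.
thirty_year = df[mask]

print(mask.tolist())  # [True, False, True]
print(thirty_year)
```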
How to combine filters?
df = df.loc[mortgage_filter & interest_filter, :]
df
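The flashcard does not show how mortgage_filter and interest_filter are built, so the definitions below are assumptions for illustration: one mask per condition, joined with & (element-wise AND). The "Interest Rate" column and the 4.0 threshold are made up.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "Mortgage Name": ["30 Year", "15 Year", "30 Year", "30 Year"],
        "Interest Rate": [3.5, 2.9, 4.2, 3.1],
    }
)

# Assumed definitions of the two filters (each is a boolean Series).
mortgage_filter = df["Mortgage Name"] == "30 Year"
interest_filter = df["Interest Rate"] < 4.0

# & keeps rows where both masks are True; ":" keeps every column.
filtered = df.loc[mortgage_filter & interest_filter, :]

print(filtered)
```

When the conditions are written inline rather than stored in variables, each one needs its own parentheses, as in df[(df['Mortgage Name'] == '30 Year') & (df['Interest Rate'] < 4.0)], because & binds more tightly than == and <. Use | for OR and ~ for NOT.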