Python Flashcards
Attribute that returns data type for a series or dataframe
Series.dtype
DataFrame.dtypes
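Ex: a minimal sketch of both attributes, using a small made-up dataframe (the column names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample data, not from the flashcards.
df = pd.DataFrame({"name": ["a", "b"], "ram": [4, 8], "price": [199.0, 349.5]})

ram_dtype = df["ram"].dtype   # dtype of a single series
all_dtypes = df.dtypes        # series of dtypes, one entry per column

print(ram_dtype)
print(all_dtypes)
```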
pandas method to show first n (5 is default) rows in a series
Series.head()
pandas method to show last n (5 is default) rows in a series
Series.tail()
pandas method to return unique values in a series
Series.unique()
pandas method chain to view the highest and lowest values in a series with their counts
Series.value_counts().sort_index() with ascending=True or False
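Ex: a short sketch, assuming the card means value_counts() chained with sort_index() so that the values (not the counts) are in order. The sample numbers are made up.

```python
import pandas as pd

s = pd.Series([3, 1, 3, 2, 1, 3])  # hypothetical data

# Index holds the distinct values, the data holds how often each occurs.
counts_low_to_high = s.value_counts().sort_index(ascending=True)
counts_high_to_low = s.value_counts().sort_index(ascending=False)

print(counts_low_to_high)
print(counts_high_to_low)
```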
pandas method to count occurrences of each unique value in a series
Series.value_counts()
dropna=False includes null values in the counts
normalize=True returns proportions instead of counts (multiply by 100 for %)
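Ex: a sketch comparing the two options on a made-up series with one missing value:

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "b", "a", np.nan])  # hypothetical data

counts = s.value_counts()                        # NaN is left out by default
counts_with_null = s.value_counts(dropna=False)  # NaN gets its own row
shares = s.value_counts(normalize=True)          # fractions that sum to 1

print(counts_with_null)
print(shares)
```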
pandas method to remove/replace specified text
Series.str.replace('text_to_replace', '')
pandas method to cast a pandas object to a specified type (ex: cast string to float or int)
Series.astype(float)
pandas method chaining to replace text and cast to number
Series.str.replace('text_to_replace', '').astype(float)
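Ex: a sketch of the chain on made-up price strings. The regex=False argument is an addition here, so that "$" is treated as plain text rather than a regular-expression anchor.

```python
import pandas as pd

prices = pd.Series(["$1,200", "$350", "$99"])  # hypothetical data

cleaned = (
    prices.str.replace("$", "", regex=False)   # drop the currency symbol
          .str.replace(",", "", regex=False)   # drop thousands separators
          .astype(float)                       # cast the remaining text to float
)

print(cleaned)
```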
pandas method to detect missing values and return a boolean same-sized object indicating if the values are NA.
Series.isnull()
Ex: Select null values in column
rev_is_null = f500["revenue"].isnull()
pandas method to detect existing (non-missing) values and return a boolean same-sized object indicating if the values are not NA.
Series.notnull()
Ex: Select non-null values in a column
rev_not_null = f500[f500["revenue"].notnull()]
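Ex: a runnable sketch of both cards. The f500 dataframe here is a tiny made-up stand-in, not the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for f500.
f500 = pd.DataFrame({
    "company": ["A", "B", "C"],
    "revenue": [100.0, np.nan, 250.0],
})

rev_is_null = f500["revenue"].isnull()          # boolean mask, same length as the column
null_rows = f500[rev_is_null]                   # rows where revenue is missing
rev_not_null = f500[f500["revenue"].notnull()]  # rows where revenue is present

print(null_rows)
print(rev_not_null)
```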
pandas method to generate descriptive statistics such as central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. Analyzes both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided.
Series.describe()
include='all' to include non-numeric columns
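Ex: a sketch of how the output changes with include='all', using a made-up dataframe with one text column and one numeric column:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"brand": ["x", "y", "x"], "ram": [4, 8, 16]})

numeric_summary = df.describe()            # numeric columns only by default
full_summary = df.describe(include="all")  # adds unique/top/freq for text columns

print(numeric_summary)
print(full_summary)
```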
pandas method to rename dataframe column or index labels
DataFrame.rename({"ram": "ram_gb"}, axis='columns', inplace=True)
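Ex: a sketch of renaming on a dataframe and on a series. The laptops data is made up; on a series, a dict relabels the index and a single string changes the series name.

```python
import pandas as pd

# Hypothetical sample data.
laptops = pd.DataFrame({"ram": [4, 8], "os": ["Windows", "macOS"]})

# Without inplace=True, rename returns a new dataframe and leaves the original alone.
renamed = laptops.rename({"ram": "ram_gb"}, axis="columns")

s = pd.Series([1, 2], index=["a", "b"], name="ram")
s_renamed_index = s.rename({"a": "first"})  # dict: relabel index entries
s_renamed_name = s.rename("ram_gb")         # scalar: change Series.name

print(renamed.columns)
print(s_renamed_index)
```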
Python string method to remove whitespace from start and end of string
String.strip()
On a pandas series: Series.str.strip()
Python string method to convert string to lowercase
String.lower()
On a pandas series: Series.str.lower()
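Ex: a sketch of the same cleanup on a plain Python string and on a pandas series through the .str accessor. The sample labels are made up.

```python
import pandas as pd

label = "  Operating System  "         # hypothetical text
clean_label = label.strip().lower()     # plain Python string methods

cols = pd.Series(["  RAM ", "Storage  "])
clean_cols = cols.str.strip().str.lower()  # vectorized over every element

print(clean_label)
print(clean_cols)
```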
pandas attribute that returns column names for dataframe
DataFrame.columns
pandas method to print a concise summary of a DataFrame, including the index dtype and column dtypes, non-null values and memory usage.
DataFrame.info()
pandas method to remove or drop rows and columns with null values
DataFrame.dropna()
axis = 0 drops rows
axis = 1 drops columns
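Ex: a sketch of both axis settings on a made-up dataframe with one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column "a" has one null.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

rows_dropped = df.dropna(axis=0)  # removes row 1, which contains the null
cols_dropped = df.dropna(axis=1)  # removes column "a", which contains the null

print(rows_dropped)
print(cols_dropped)
```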
pandas method to map values of Series according to input correspondence.
Used for substituting each value in a Series with another value, that may be derived from a function, a dict or a Series.
Series.map(mapping_dictionary)
Ex:
s = pd.Series(['fox', 'cow', np.nan, 'dog'])
s.map({'fox': 'cub', 'cow': 'calf'})
If a value from your series doesn’t exist as a key in your dictionary, it will convert that value to NaN
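Ex: a runnable sketch of mapping with a dict and with a function. The na_action="ignore" argument is an addition here so that the function is not called on the missing value.

```python
import numpy as np
import pandas as pd

s = pd.Series(["fox", "cow", np.nan, "dog"])

# Dict mapping: "dog" has no key, so it becomes NaN along with the existing NaN.
mapped = s.map({"fox": "cub", "cow": "calf"})

# Function mapping: every non-missing value is passed to str.upper.
mapped_fn = s.map(str.upper, na_action="ignore")

print(mapped)
print(mapped_fn)
```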
pandas function to read in a csv file
f = pd.read_csv('[file_name]', encoding='[encoding_type]')
encoding type examples = Latin-1, UTF-8, Windows-1251
pandas method to export a dataframe to a csv file (ex: after cleaning)
DataFrame.to_csv("file_name", index=False)
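Ex: a sketch of a write-then-read round trip. The file name, sample data and temporary folder are assumptions for illustration; the same encoding is used for writing and reading.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with accented characters.
df = pd.DataFrame({"name": ["José", "Zoë"], "score": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")           # hypothetical file name
    df.to_csv(path, index=False, encoding="Latin-1")  # index=False skips the row labels
    loaded = pd.read_csv(path, encoding="Latin-1")

print(loaded)
```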
Attribute to access labels of a series
Series.index
Ex: Series.value_counts().head(10).index
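A sketch of that chain on made-up data, taking the labels of the two most frequent values instead of ten:

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b"])  # hypothetical data

# value_counts() sorts by count (descending); .index gives the values themselves.
top_two = s.value_counts().head(2).index

print(list(top_two))
```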
Sort a dictionary in ascending order based on values (not keys)
sort_d = sorted(dictionary.items(), key=lambda kv: (kv[1], kv[0]))
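Ex: a sketch with a made-up dictionary. sorted() returns a list of (key, value) tuples, and the key acts as a tie-breaker when two values are equal.

```python
dictionary = {"b": 2, "a": 2, "c": 1}  # hypothetical data

# Sort by value first, then by key for ties.
sort_d = sorted(dictionary.items(), key=lambda kv: (kv[1], kv[0]))

# Wrap in dict() if a dictionary in that order is needed.
sorted_dict = dict(sort_d)

print(sort_d)
print(sorted_dict)
```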
pandas method to remove or drop rows or columns from a dataframe based on specified values
DataFrame.drop(["col_name1", "col_name2"], axis=1)
Change axis to 0 to drop rows
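Ex: a sketch of dropping by column name and by row label. The dataframe is made up and the "keep" column is an assumption for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "col_name1": [1, 2, 3],
    "col_name2": [4, 5, 6],
    "keep": [7, 8, 9],
})

without_cols = df.drop(["col_name1", "col_name2"], axis=1)  # drop by column name
without_rows = df.drop([0, 2], axis=0)                      # drop by index label

print(without_cols)
print(without_rows)
```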