Pandas Flashcards
Series
A one-dimensional array of values with an associated index of labels
Example: create a series
obj = Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
Get the values of obj
obj.values
Get the index of obj
obj.index
Find the value for index 'a' and for indexes 'c', 'a', 'd'
1. obj['a']
2. obj[['c', 'a', 'd']]
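The Series cards above can be tried together in one short session. This is a minimal sketch that assumes pandas is imported as pd; the variable names other than obj are made up for illustration.

```python
import pandas as pd

# Same Series as in the card above
obj = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

values = obj.values             # NumPy array holding the data
labels = list(obj.index)        # index labels, in order
single = obj["a"]               # one label returns a scalar
subset = obj[["c", "a", "d"]]   # a list of labels returns a new Series

print(values, labels, single)
print(subset)
```

Note that the subset comes back in the order the labels were requested, not in the original order.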
Series vs. dict
A Series can be thought of as a fixed-length, ordered dict that maps index labels to values
Check if index 'b' exists in obj
'b' in obj
Create a Series from the dict sdata
obj = Series(sdata)
Check whether each value in the Series obj is missing (or not missing)
1. pd.isnull(obj)
2. pd.notnull(obj)
Create name attributes for Series object obj and its index
1. obj.name = 'population'
2. obj.index.name = 'state'
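The dict, membership, missing-value, and naming cards above can be combined into one sketch. The contents of sdata and the list of states are hypothetical, chosen so that one label ('California') has no matching value.

```python
import pandas as pd

# Hypothetical data; the cards do not say what sdata contains
sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000}
states = ["California", "Ohio", "Oregon", "Texas"]

# Passing an index alongside a dict aligns on the dict keys;
# 'California' has no entry, so its value is NaN
obj = pd.Series(sdata, index=states)

has_ohio = "Ohio" in obj      # membership is tested against index labels
missing = pd.isnull(obj)      # boolean Series, True where the value is NaN
present = pd.notnull(obj)     # the inverse of isnull

obj.name = "population"
obj.index.name = "state"
print(obj)
```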
Most common ways to construct DataFrame
- from a dict of equal-length lists, e.g. if data is a dict, then frame = DataFrame(data)
- from a two-dimensional NumPy array, e.g. if data is a NumPy array, then DataFrame(data, columns=['year', 'state', 'pop'])
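Both construction routes listed above, as a minimal sketch. The data values are invented for illustration, and the array example uses only two numeric columns so that the array keeps a single dtype.

```python
import numpy as np
import pandas as pd

# Route 1: dict of equal-length lists (keys become column names)
data = {
    "state": ["Ohio", "Ohio", "Nevada"],
    "year": [2000, 2001, 2001],
    "pop": [1.5, 1.7, 2.4],
}
frame = pd.DataFrame(data)

# Route 2: two-dimensional NumPy array plus explicit column names
arr = np.array([[2000, 1.5], [2001, 1.7]])
frame2 = pd.DataFrame(arr, columns=["year", "pop"])

print(frame)
print(frame2)
```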
How to retrieve a column from a DataFrame as a Series
1. dict-like notation: frame['state']
2. attribute access: frame.year
Retrieve the row with index 'three' from DataFrame object frame
frame.loc['three'] (frame.ix['three'] in pandas versions before 1.0)
Delete the column 'eastern' from DataFrame object frame
del frame['eastern']
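Column retrieval, row retrieval, and column deletion from the cards above, in one sketch. The frame contents are invented; loc is used for the row lookup because ix is not available in current pandas.

```python
import pandas as pd

# Hypothetical frame with string row labels
frame = pd.DataFrame(
    {
        "state": ["Ohio", "Ohio", "Nevada"],
        "year": [2000, 2001, 2002],
        "pop": [1.5, 1.7, 2.4],
    },
    index=["one", "two", "three"],
)

by_key = frame["state"]    # dict-like notation, works for any column name
by_attr = frame.year       # attribute access, only for valid identifiers
row = frame.loc["three"]   # row by label, returned as a Series

# Add a boolean column, then remove it again with del
frame["eastern"] = frame["state"] == "Ohio"
del frame["eastern"]
print(row)
```

Attribute access would not work for the 'pop' column here, because frame.pop already names a DataFrame method; frame['pop'] is the safe form.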
Are Index objects mutable?
No. Index objects are immutable, so they can be safely shared among data structures
Create an Index object
index = pd.Index(np.arange(3))
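A small sketch of what immutability means in practice: assigning to an element of an Index raises a TypeError, while the same Index can be handed to a Series as its labels. The Series values are arbitrary.

```python
import numpy as np
import pandas as pd

index = pd.Index(np.arange(3))

# Item assignment is rejected because Index objects are immutable
try:
    index[1] = 10
    mutated = True
except TypeError:
    mutated = False

# The same Index can be reused as the labels of a Series
shared = pd.Series([1.5, -2.5, 0.0], index=index)
print(mutated, shared.index.equals(index))
```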
Class: Int64Index
Specialized Index for integer values (removed in pandas 2.0, where a plain Index with an integer dtype is used instead)
Class: MultiIndex
"Hierarchical" index object representing multiple levels of indexing on a single axis
Class: DatetimeIndex
Stores nanosecond timestamps (represented using NumPy's datetime64 dtype)
Class: PeriodIndex
Specialized Index for Period data (timespans)
Index methods
append, difference (formerly diff), intersection, union, isin, delete, drop, insert, is_monotonic_increasing (formerly is_monotonic), is_unique, unique
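A few of the Index methods listed above, sketched on two small made-up indexes. The set-like methods return new Index objects and leave the originals unchanged.

```python
import pandas as pd

left = pd.Index(["a", "b", "c"])
right = pd.Index(["b", "c", "d"])

union = left.union(right)            # labels found in either index
common = left.intersection(right)    # labels found in both
only_left = left.difference(right)   # labels in left but not in right
mask = left.isin(["a", "c"])         # boolean array, one entry per label
appended = left.append(right)        # concatenation, duplicates are kept

print(list(union), list(common), list(only_left))
print(appended.is_unique, list(appended.unique()))
```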
reindex
Create a new object with the data conformed to a new index, e.g. obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)
Forward fill when reindexing
ffill or pad; obj3.reindex(range(6), method='ffill')
Backward fill when reindexing
bfill or backfill; obj3.reindex(range(6), method='bfill')
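The three reindex cards above, as one sketch. The values in obj and obj3 are assumptions made for illustration; only the reindex calls come from the cards.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=["d", "b", "a", "c"])

# 'e' is not in the original index, so it takes the fill_value
obj2 = obj.reindex(["a", "b", "c", "d", "e"], fill_value=0)

# Sparse integer index: labels 1, 3 and 5 are absent
obj3 = pd.Series(["blue", "purple", "yellow"], index=[0, 2, 4])

# Forward fill carries the last seen value forward into the gaps
filled_forward = obj3.reindex(range(6), method="ffill")

# Backward fill takes the next value; label 5 has none and stays missing
filled_backward = obj3.reindex(range(6), method="bfill")

print(obj2)
print(filled_forward)
print(filled_backward)
```

Filling by method requires the original index to be sorted (monotonic), which is why obj3 uses ordered integer labels.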
Drop entry (1) 'c' (2) 'd' and 'c'
1. obj.drop('c')
2. obj.drop(['d', 'c'])
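A sketch of both drop calls on a made-up Series. drop returns a new object by default, so obj itself keeps all of its entries.

```python
import pandas as pd

obj = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d", "e"])

without_c = obj.drop("c")           # drop a single label
without_dc = obj.drop(["d", "c"])   # drop several labels at once

print(list(without_c.index), list(without_dc.index), len(obj))
```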
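Special indexing field ix
Selects a subset of the rows and columns from a DataFrame with NumPy-like notation plus axis labels. ix was deprecated in pandas 0.20 and removed in 1.0; use loc (label-based) or iloc (position-based) instead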
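A sketch of row-and-column selection using loc and iloc, the label-based and position-based indexers available in current pandas. The frame and its labels are made up for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=["one", "two", "three"],
    columns=["a", "b", "c"],
)

# Label-based: rows 'one' and 'three', columns 'a' and 'c'
by_label = frame.loc[["one", "three"], ["a", "c"]]

# Position-based: the same cells, addressed by integer position
by_position = frame.iloc[[0, 2], [0, 2]]

print(by_label)
```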
Hierarchical indexing
Enables you to have multiple index levels on an axis
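A minimal sketch of a Series with two index levels, built by passing a list of two label lists as the index. The labels and values are made up for illustration.

```python
import pandas as pd

# Outer level: 'a' / 'b'; inner level: 1 / 2
data = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=[["a", "a", "b", "b"], [1, 2, 1, 2]],
)

levels = data.index.nlevels   # number of index levels
outer = data["a"]             # select by outer label, inner level remains
inner = data.loc[:, 2]        # select by inner label across all outer labels

print(data)
print(outer)
print(inner)
```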