Pandas Index & Slice Flashcards
data['b']
returns the value stored under the index label 'b'
Like a dictionary, the Series object provides a mapping from a collection of keys to a collection of values:
'a' in data
data.keys()
list(data.items())
We can also use dictionary-like Python expressions and methods to examine the keys/indices and values:
data['e'] = 1.25
Series objects can even be modified with a dictionary-like syntax. Just as you can extend a dictionary by assigning to a new key, you can extend a Series by assigning to a new index value:
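The dictionary-like behaviour on the cards above can be sketched as follows. The contents of `data` are an assumption, since the cards never define it; any Series with string labels would behave the same way.

```python
import pandas as pd

# Assumed example data; the cards do not define `data`.
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

has_a = 'a' in data          # membership tests the index labels, not the values
keys = list(data.keys())     # the same labels as data.index
items = list(data.items())   # (label, value) pairs

data['e'] = 1.25             # assigning to a new label extends the Series
```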
data['a':'c']  # slicing by explicit index
data[0:2]  # slicing by implicit integer index
data[(data > 0.3) & (data < 0.8)]  # masking
data[['a', 'e']]  # fancy indexing
A Series builds on this dictionary-like interface and provides array-style item selection via the same basic mechanisms as NumPy arrays – that is, slices, masking, and fancy indexing. Examples of these are as follows:
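A minimal sketch of the four selection styles, using an assumed five-element Series (the values are illustrative only). It highlights one detail worth remembering: a label slice includes its endpoint, while an integer slice does not.

```python
import pandas as pd

# Assumed example data; the cards do not define `data`.
data = pd.Series([0.25, 0.5, 0.75, 1.0, 1.25],
                 index=['a', 'b', 'c', 'd', 'e'])

by_label = data['a':'c']                       # label slice: 'c' is included
by_position = data[0:2]                        # integer slice: position 2 is excluded
masked = data[(data > 0.3) & (data < 0.8)]     # Boolean mask
fancy = data[['a', 'e']]                       # list of labels
```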
data.loc[1]
data.loc[1:3]
Pandas provides some special indexer attributes that explicitly expose certain indexing schemes. These are not functional methods, but attributes that expose a particular slicing interface to the data in the Series.
First, the loc attribute allows indexing and slicing that always references the explicit index:
data.iloc[1]
data.iloc[1:3]
The iloc attribute allows indexing and slicing that always references the implicit Python-style index:
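The difference between the two indexers is easiest to see when the explicit index is itself made of integers. The sketch below assumes a small Series with the odd labels 1, 3, 5; this data is not given on the cards.

```python
import pandas as pd

# Assumed example data with an integer index that differs from the positions.
data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

loc_item = data.loc[1]        # label 1
loc_slice = data.loc[1:3]     # labels 1 through 3, endpoint included
iloc_item = data.iloc[1]      # position 1
iloc_slice = data.iloc[1:3]   # positions 1 and 2, endpoint excluded
```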
data['area']
data.area
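The individual Series that make up the columns of the DataFrame can be accessed via dictionary-style indexing of the column name, or equivalently via attribute-style access when the column name is a valid Python identifier that does not clash with a DataFrame method: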
data['density'] = data['pop'] / data['area']
data
Like with the Series objects discussed earlier, this dictionary-style syntax can also be used to modify the object, in this case adding a new column:
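A small sketch of column access and column creation. The DataFrame below is an assumption: the state names and the column names `area` and `pop` follow the cards, but the numbers are made up for illustration and are not real statistics.

```python
import pandas as pd

# Illustrative, made-up values (not real statistics).
data = pd.DataFrame(
    {'area': [400, 700, 140, 170, 150],
     'pop': [40000, 28000, 19600, 20400, 12000]},
    index=['California', 'Texas', 'New York', 'Florida', 'Illinois'],
)

area_by_key = data['area']    # dictionary-style access
area_by_attr = data.area      # attribute-style access to the same column

# `pop` is also a DataFrame method, so data.pop is the method, not the column.
pop_is_method = callable(data.pop)

# Assigning to a new key adds a column.
data['density'] = data['pop'] / data['area']
```

Because of name clashes like `pop`, dictionary-style access is the safer default, and it is the only form that can create a new column.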
data.T
Many familiar array-like operations can be performed on the DataFrame itself. For example, we can transpose the full DataFrame to swap rows and columns:
data.iloc[:3, :2]
Here Pandas again uses the loc and iloc indexers mentioned earlier (the older ix indexer has been removed from current Pandas releases). Using the iloc indexer, we can index the underlying array as if it were a simple NumPy array (using the implicit Python-style index), but the DataFrame index and column labels are maintained in the result:
data.loc[:'Illinois', :'pop']
Similarly, using the loc indexer we can index the underlying data in an array-like style but using the explicit index and column names:
data.loc[data.density > 100, ['pop', 'density']]
Any of the familiar NumPy-style data access patterns can be used within these indexers. For example, in the loc indexer we can combine masking and fancy indexing as in the following:
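The three DataFrame indexer cards can be tried together in one sketch. The DataFrame is again an assumption built from made-up numbers, chosen so that exactly two rows have a density above 100.

```python
import pandas as pd

# Illustrative, made-up values (not real statistics).
data = pd.DataFrame(
    {'area': [400, 700, 140, 170, 150],
     'pop': [40000, 28000, 19600, 20400, 12000]},
    index=['California', 'Texas', 'New York', 'Florida', 'Illinois'],
)
data['density'] = data['pop'] / data['area']

by_position = data.iloc[:3, :2]              # first 3 rows, first 2 columns
by_label = data.loc[:'Illinois', :'pop']     # label slices include both endpoints
dense = data.loc[data.density > 100, ['pop', 'density']]  # mask rows, pick columns
```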
data['Florida':'Illinois']
data[1:3]
data[data.density > 100]
Plain bracket indexing on a DataFrame refers to columns, but plain bracket slicing refers to rows, whether the slice is given by label or by position:
Direct masking operations are likewise interpreted row-wise rather than column-wise:
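A sketch of these row-wise conventions, assuming the same made-up DataFrame as a stand-in for the undefined `data` on the cards:

```python
import pandas as pd

# Illustrative, made-up values (not real statistics).
data = pd.DataFrame(
    {'area': [400, 700, 140, 170, 150],
     'pop': [40000, 28000, 19600, 20400, 12000]},
    index=['California', 'Texas', 'New York', 'Florida', 'Illinois'],
)
data['density'] = data['pop'] / data['area']

column = data['area']                        # a single key selects a column
rows_by_label = data['Florida':'Illinois']   # a label slice selects rows
rows_by_position = data[1:3]                 # an integer slice selects rows by position
rows_by_mask = data[data.density > 100]      # a Boolean mask filters rows
```

When the intent matters more than brevity, the same selections can be written with `loc` and `iloc`, which makes the row-versus-column choice explicit.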