Numpy and Pandas Fundamentals Flashcards

Fundamentals of Numpy and Pandas from the Pandas book

1
Q

pd.Series

A
  1. Index labels (and the Series name) can be any hashable type
  2. Missing values (NaN) are skipped by default in reductions such as sum and mean
  3. Referencing values by label, dictionary-style notation, and NumPy-style boolean masking all work
  4. Passing a dictionary d is equivalent to pd.Series(list(d.values()), index=list(d.keys()))
  5. Automatically aligns by label in arithmetic with differently indexed Series
  6. Index can be altered in place by assignment
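
A minimal sketch of the Series behaviours listed above. The variable names and sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Building a Series from a dict: keys become the index, values the data.
prices = pd.Series({"a": 1.0, "b": 2.0, "c": 3.0})

# Label lookup, dict-style membership, and boolean masking.
b_value = prices["b"]
has_a = "a" in prices
above_one = prices[prices > 1.0]

# Arithmetic aligns on index labels; labels present on one side only give NaN.
other = pd.Series({"b": 10.0, "c": 20.0, "d": 30.0})
total = prices + other

# NaN is skipped by default in reductions such as sum().
total_sum = total.sum()

# The index can be replaced in place by assignment.
prices.index = ["x", "y", "z"]
```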
2
Q

pd.DataFrame

A
  1. Retrieve a column by dict-like (frame['col']) or attribute (frame.col) syntax
  2. Get rows by label with frame.loc['row_index'] (frame.ix in older pandas; .ix has since been removed)
  3. Accepts lists of dicts, dicts of dicts, and dicts of the form 'key': list, where the lists (or np.arrays) are of equal length
  4. del dataframe['colname'] works as expected
  5. Column and row indices can have names, just like Series
  6. frame.values returns a 2D array of shape (rows, columns)
  7. Accepts NumPy masked arrays; masked elements are NA/missing in the result
  8. A list of lists or list of tuples is read row-wise by default
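
A short sketch of several of the points above, using made-up data. The column name "population" is a hypothetical choice; a column called "pop" would clash with the DataFrame.pop method under attribute access.

```python
import numpy as np
import pandas as pd

# Dict of equal-length lists: each key becomes a column.
frame = pd.DataFrame(
    {
        "state": ["Ohio", "Ohio", "Nevada"],
        "year": [2000, 2001, 2001],
        "population": [1.5, 1.7, 2.4],
    },
    index=["one", "two", "three"],
)

# Column access by dict-like or attribute syntax returns a Series.
years = frame["year"]
states = frame.state

# Row access by label (.loc replaces the older .ix).
row_two = frame.loc["two"]

# del removes a column in place.
del frame["population"]

# .values gives a 2D array of shape (rows, columns).
shape = frame.values.shape

# A list of tuples is read row-wise.
from_rows = pd.DataFrame([(1, 2), (3, 4)], columns=["a", "b"])

# Dict of dicts: outer keys are columns, inner keys are row labels.
nested = pd.DataFrame({"Nevada": {2001: 2.4}, "Ohio": {2000: 1.5, 2001: 1.7}})
```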
3
Q

index objects

A

methods

  1. append
  2. difference (named diff in older pandas)
  3. intersection
  4. union
  5. isin
  6. delete
  7. drop
  8. insert
  9. is_monotonic (is_monotonic_increasing in current pandas)
  10. is_unique
  11. unique
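
The methods above can be tried on two small Index objects. This is a sketch with made-up labels, using the current method names (difference, is_monotonic_increasing).

```python
import pandas as pd

left = pd.Index(["a", "b", "c"])
right = pd.Index(["b", "c", "d"])

common = left.intersection(right)    # labels in both
combined = left.union(right)         # labels in either
only_left = left.difference(right)   # labels in left only
membership = left.isin(["a", "c"])   # boolean array, one entry per label

# Index objects are immutable: each of these returns a new Index.
appended = left.append(pd.Index(["z"]))
without_first = left.delete(0)       # remove by position
without_b = left.drop("b")           # remove by label
inserted = left.insert(1, "new")     # insert before position 1

increasing = left.is_monotonic_increasing
dupes = pd.Index(["a", "a", "b"])
all_unique = dupes.is_unique
unique_vals = dupes.unique()
```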
4
Q

reindex

A
  1. Creates a new object that conforms to the new index
  2. The method option fills gaps introduced by the new index, e.g. ffill or bfill
  3. With a DataFrame, rows are reindexed by default, but columns can be specified by keyword. New columns will be NaN; columns left out will be dropped.
  4. The limit option sets the maximum number of consecutive elements to ffill or bfill
  5. fill_value sets the value used for labels that don't exist in the original
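
A sketch of the reindex options above. The data and names are illustrative only.

```python
import numpy as np
import pandas as pd

obj = pd.Series([1.0, 2.0], index=[0, 3])

# Forward-fill the gaps introduced by the new index.
filled = obj.reindex(range(5), method="ffill")

# limit=1 fills at most one consecutive new label after each original one.
limited = obj.reindex(range(5), method="ffill", limit=1)

frame = pd.DataFrame(
    np.arange(6).reshape(3, 2),
    index=["a", "c", "d"],
    columns=["Ohio", "Texas"],
)

# Rows are reindexed by default; the new row "b" is all NaN.
by_rows = frame.reindex(["a", "b", "c", "d"])

# Columns via keyword: "Utah" is new (NaN), "Ohio" is left out and dropped.
by_cols = frame.reindex(columns=["Texas", "Utah"])

# fill_value replaces the default NaN for labels that did not exist.
zero_filled = frame.reindex(["a", "b"], fill_value=0)
```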
5
Q

index selection and filtering

A

Series and DataFrames

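  1. Slicing by label differs from normal Python slicing: the endpoint is inclusive
  2. Setting works as expected: obj['b':'c'] = 5 puts 5 in the b and c slots
  3. obj[[1,3]] pulls the elements at positions 1 and 3 (the second and fourth) from a Series without an integer index; on a DataFrame the list is read as column labels. obj.iloc[[1,3]] is the unambiguous positional form
  4. obj[['a','d','b']] pulls elements out by label, in that order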

DataFrames

  1. df[:2] returns rows 0 and 1 (an integer slice is positional, so the endpoint is excluded, unlike the label slices above)
  2. df[df['col'] > 5] returns the rows where the condition is true
  3. df[df < 0] = 0 sets all elements less than 0 to 0
  4. df.loc[row_criteria, col_criteria] (df.ix in older pandas)
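
The contrast between label slices and positional slices can be sketched as follows. The data is made up, and .iloc is used for the positional case.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.0), index=["a", "b", "c", "d"])

label_slice = obj["b":"c"]        # label slice: both endpoints included
position_slice = obj.iloc[1:2]    # positional slice: endpoint excluded
reordered = obj[["a", "d", "b"]]  # list of labels, returned in that order

obj["b":"c"] = 5                  # assignment through a label slice

df = pd.DataFrame(
    np.arange(-4, 5).reshape(3, 3),
    index=["r0", "r1", "r2"],
    columns=["x", "y", "z"],
)

first_two = df[:2]                # integer slice selects rows by position
positive_x = df[df["x"] > 0]      # boolean Series filters rows
df[df < 0] = 0                    # boolean DataFrame sets matching cells
```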
6
Q

indexing options with DataFrame

A
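  1. obj[val] selects a single column or sequence of columns; special cases: boolean array (filter rows), slice (slice rows), boolean DataFrame (set values by criterion)
  2. obj.ix[val] selects a single row or subset of rows
  3. obj.ix[:, val] selects a single column or subset of columns
  4. obj.ix[val1, val2] selects both rows and columns
  5. reindex conforms one or more axes to a new index
  6. xs method selects a single row or column as a Series by label
  7. icol, irow methods select a column or row, respectively, as a Series by integer location
  8. get_value, set_value select or set a single value by row and column label

Note: .ix, icol, irow, get_value, and set_value have been removed from current pandas; use .loc (labels), .iloc (integer positions), and .at/.iat (single values) instead.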
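
A sketch of the same selections written with accessors available in current pandas (.loc, .iloc, .at, xs). The frame and its labels are made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=["r0", "r1", "r2"],
    columns=["x", "y", "z"],
)

cols = df[["x", "z"]]               # obj[val]: sequence of columns
row = df.loc["r1"]                  # single row by label
sub_cols = df.loc[:, ["x", "y"]]    # subset of columns by label
block = df.loc[["r0", "r2"], "y"]   # rows and columns together
by_position = df.iloc[1, 2]         # integer-location access
cross = df.xs("r1")                 # row by label, as a Series
cell = df.at["r2", "x"]             # single value by row and column label
df.at["r2", "x"] = 100              # set a single value by label
```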
7
Q

Arithmetic methods on DataFrames

A
  1. add
  2. sub
  3. mul
  4. div

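each is a method on DataFrame with an optional fill_value argument that substitutes for elements missing from one of the two operands

So adding elements that don't have a match produces NaN, but with fill_value=0 the missing side is treated as 0 and the result keeps the value from the other side. Elements missing from both operands are still NaN.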
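
A minimal sketch of the difference between the + operator and add with fill_value, using two made-up frames that overlap only on row "y".

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 2.0]}, index=["x", "y"])
df2 = pd.DataFrame({"a": [10.0, 20.0]}, index=["y", "z"])

# Plain operator: rows present in only one frame become NaN.
plain = df1 + df2

# fill_value=0 treats the missing side as 0, so the existing value survives.
filled = df1.add(df2, fill_value=0)
```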

8
Q

Arithmetic between Series and DataFrames

A
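  1. df - series: matches the Series index against the df's columns and broadcasts down all the rows of the df
  2. If a label is found in only one of the df's columns or the Series index, the result is reindexed to the union and that column is all NaN
  3. To match on the row index and broadcast across the columns instead, use the DataFrame arithmetic methods with axis=0

ex. dframe.sub(series, axis=0)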
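
The two broadcasting directions can be sketched like this. The frame, labels, and variable names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(6.0).reshape(3, 2),
    index=["r0", "r1", "r2"],
    columns=["b", "d"],
)

# Default: the Series index is matched to the columns and broadcast down the rows.
first_row = frame.iloc[0]
row_wise = frame - first_row

# Labels found on one side only ("d" and "e") end up as all-NaN columns.
partial = pd.Series([1.0, 2.0], index=["b", "e"])
unioned = frame + partial

# axis=0: the Series index is matched to the row index and broadcast across columns.
col_wise = frame.sub(frame["b"], axis=0)
```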

9
Q

Function application and mapping

A
  1. NumPy ufuncs (element-wise array methods) work with DataFrames and Series
  2. DataFrame has an apply method, like R's apply. With axis=0 (the default) the function receives each column and collapses the rows (colSums in R); with axis=1 it receives each row and collapses the columns (rowSums in R)
  3. The function to apply need not return a scalar; it can also return a Series
  4. applymap performs element-wise operations, ex. format = lambda x: '%.2f' % x (renamed DataFrame.map in pandas 2.1)
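
A sketch of the points above with made-up data. For the element-wise step it maps each column with Series.map inside apply, a substitute chosen here because it runs on both old and new pandas versions; applymap or DataFrame.map would be the direct calls.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({"a": [1.0, -2.0, 3.0], "b": [4.0, 5.0, -6.0]})

# NumPy ufuncs apply element-wise and keep the labels.
absolute = np.abs(frame)

def spread(s):
    """Range of a Series: max minus min."""
    return s.max() - s.min()

per_column = frame.apply(spread)        # axis=0: one result per column
per_row = frame.apply(spread, axis=1)   # axis=1: one result per row

def min_max(s):
    """Return a Series, so apply builds a DataFrame from the results."""
    return pd.Series([s.min(), s.max()], index=["min", "max"])

summary = frame.apply(min_max)

# Element-wise formatting, one column at a time.
fmt = lambda x: "%.2f" % x
formatted = frame.apply(lambda col: col.map(fmt))
```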