Pandas Missing Values Flashcards

Question 1

Q

NaN (acronym for Not a Number)

Answer

A

NaN (Not a Number) is a missing-data representation: a special floating-point value recognized by all systems that use the standard IEEE floating-point representation.
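
A minimal sketch of what this means in practice, assuming a hypothetical sample array named vals: an array containing NaN gets a floating-point dtype, arithmetic with NaN yields NaN, and NaN does not compare equal to itself.

```python
import numpy as np

# Hypothetical sample array; the presence of NaN forces a float dtype.
vals = np.array([1, np.nan, 3, 4])
dtype_name = vals.dtype.name

# Any arithmetic involving NaN produces NaN.
nan_plus_one = 1 + np.nan

# Under IEEE rules, NaN is not equal to anything, including itself.
nan_equals_itself = (np.nan == np.nan)

print(dtype_name, nan_plus_one, nan_equals_itself)
```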

Question 2

Q

np.nansum(vals2), np.nanmin(vals2), np.nanmax(vals2)

Answer

A

NumPy provides special aggregations that ignore NaN values: np.nansum, np.nanmin, and np.nanmax.
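
As an illustration, assuming vals2 is a small float array with one NaN (the values here are made up), the ordinary sum is contaminated by NaN while the nan-aware versions skip it.

```python
import numpy as np

# Assumed contents of vals2 for illustration only.
vals2 = np.array([1, np.nan, 3, 4])

plain_sum = np.sum(vals2)      # NaN propagates through the ordinary sum
safe_sum = np.nansum(vals2)    # NaN entries are skipped
safe_min = np.nanmin(vals2)
safe_max = np.nanmax(vals2)

print(plain_sum, safe_sum, safe_min, safe_max)
```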

Question 3

Q

data.isnull()

Answer

A

Pandas data structures have two useful methods for detecting null data: isnull() and notnull(). Either one returns a Boolean mask over the data. isnull() marks each null entry as True.
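
A short sketch, assuming a hypothetical Series named data that mixes numbers, a string, NaN, and None. Both NaN and None count as null.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing two kinds of null value.
data = pd.Series([1, np.nan, "hello", None])

# True wherever the entry is null (NaN or None).
mask = data.isnull()

print(mask)
```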

Question 4

Q

data[data.notnull()]

Answer

A

notnull() returns a Boolean mask that is True for every non-null entry. Using that mask as an index selects only the non-null values.
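
A minimal sketch of mask-based selection, again assuming a hypothetical Series named data. The original index labels of the surviving entries are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing two null entries.
data = pd.Series([1, np.nan, "hello", None])

# Index with the Boolean mask to keep only non-null entries.
non_null = data[data.notnull()]

print(non_null)
```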

Question 5

Q

data.dropna()

Answer

A

In addition to masking, there are two convenience methods: dropna() (which removes NA values) and fillna() (which fills in NA values). For a Series, the result of dropna() is straightforward: the null entries are removed.
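
A small sketch, assuming a hypothetical Series named data. Note that dropna() returns a new object and leaves the original untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing two null entries.
data = pd.Series([1, np.nan, "hello", None])

# dropna() returns a new Series without the null entries.
cleaned = data.dropna()

print(cleaned)
print(len(data))  # the original Series still has all four entries
```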

Question 6

Q

df.dropna(axis='columns')

Answer

A

By default, dropna() drops every row containing a null value. Alternatively, you can drop NA values along a different axis; axis=1 (equivalently axis='columns') drops all columns containing a null value.
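
To make the two axes concrete, here is a sketch on a hypothetical 3x3 DataFrame named df (the values are invented for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows 0 and 2 and columns 0 and 1 each contain a NaN.
df = pd.DataFrame([[1, np.nan, 2],
                   [2, 3, 5],
                   [np.nan, 4, 6]])

rows_kept = df.dropna()                 # default: drop rows with any null
cols_kept = df.dropna(axis="columns")   # drop columns with any null

print(rows_kept)
print(cols_kept)
```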

Question 7

Q

df.dropna(axis='columns', how='all')

Answer

A

The default is how='any', such that any row or column (depending on the axis keyword) containing a null value will be dropped. You can also specify how='all', which drops only the rows or columns that consist entirely of null values.
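
A sketch contrasting the two settings, assuming a hypothetical DataFrame named df whose last column is entirely NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column 3 holds nothing but NaN.
df = pd.DataFrame([[1, np.nan, 2, np.nan],
                   [2, 3, 5, np.nan],
                   [np.nan, 4, 6, np.nan]])

any_null = df.dropna(axis="columns")               # how='any' is the default
all_null = df.dropna(axis="columns", how="all")    # drop only all-null columns

print(any_null)
print(all_null)
```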

Question 8

Q

df.dropna(axis='rows', thresh=3)

Answer

A

For finer-grained control, the thresh parameter lets you specify a minimum number of non-null values a row or column must have to be kept. Here, rows with fewer than three non-null values are dropped.
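
A sketch of thresh on a hypothetical DataFrame named df. Only the middle row has at least three non-null values, so it is the only row kept.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; non-null counts per row are 2, 3, and 2.
df = pd.DataFrame([[1, np.nan, 2, np.nan],
                   [2, 3, 5, np.nan],
                   [np.nan, 4, 6, np.nan]])

# Keep only rows with at least 3 non-null values.
kept = df.dropna(axis="rows", thresh=3)

print(kept)
```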

Question 9

Q

data.fillna(0)

Answer

A

We can fill NA entries with a single value, such as zero. fillna() returns a copy of the data with the null entries replaced.
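
A minimal sketch, assuming a hypothetical labeled Series named data with two missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; None is stored as NaN in a float Series.
data = pd.Series([1, np.nan, 2, None, 3], index=list("abcde"))

# Replace every null entry with 0.
filled = data.fillna(0)

print(filled)
```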

Question 10

Q

# forward-fill
data.fillna(method='ffill')

# back-fill
data.fillna(method='bfill')

Answer

A

We can specify a forward-fill to propagate the previous value forward, or a back-fill to propagate the next value backward.
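
A sketch of both directions on a hypothetical Series named data. It uses the ffill() and bfill() methods rather than fillna(method=...), since recent pandas releases deprecate the method keyword of fillna(); which spelling applies depends on the installed version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with gaps at positions 'b' and 'd'.
data = pd.Series([1, np.nan, 2, None, 3], index=list("abcde"))

forward = data.ffill()    # each gap takes the previous valid value
backward = data.bfill()   # each gap takes the next valid value

print(forward)
print(backward)
```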

Question 11

Q

df.fillna(method='ffill', axis=1)

Answer

A

For DataFrames, the options are similar, but we can also specify an axis along which the fills take place. With axis=1, values are filled across each row, from left to right.
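
A sketch of a row-wise forward fill on a hypothetical DataFrame named df, written with ffill(axis=1) because recent pandas releases deprecate the method keyword of fillna(). A gap with no earlier value in its row stays NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame of floats; row 2 starts with a NaN.
df = pd.DataFrame([[1, np.nan, 2, np.nan],
                   [2, 3, 5, np.nan],
                   [np.nan, 4, 6, np.nan]], dtype=float)

# Fill each gap with the previous valid value in the same row.
filled = df.ffill(axis=1)

print(filled)
```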

(11 cards)