Pandas Missing Values Flashcards

1
Q

NaN (acronym for Not a Number)

A

NaN (short for Not a Number) is the other missing-data representation. It is a special floating-point value recognized by all systems that use the standard IEEE floating-point representation.
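
As a rough illustration of what "special floating-point value" means in practice, the sketch below builds a small array and checks how NaN behaves. The name vals2 is borrowed from the next card; its contents here are assumed sample values, not data from the original deck.

```python
import numpy as np

# Assumed sample data: one missing entry encoded as NaN.
vals2 = np.array([1, np.nan, 3, 4])

# NaN is an ordinary IEEE float, so the array gets a float dtype.
dtype_name = vals2.dtype.name

# Arithmetic involving NaN produces NaN, and so do the plain aggregations.
nan_plus_one = 1 + np.nan
plain_sum = vals2.sum()

# NaN never compares equal to anything, including itself;
# np.isnan is the reliable way to test for it.
nan_equals_itself = (np.nan == np.nan)
nan_positions = np.isnan(vals2)
```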

2
Q

np.nansum(vals2), np.nanmin(vals2), np.nanmax(vals2)

A

NumPy provides special aggregations that ignore NaN values: np.nansum(), np.nanmin(), and np.nanmax().
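
A minimal sketch of the NaN-aware aggregations, assuming vals2 holds the sample values shown (the deck does not give its contents):

```python
import numpy as np

# Assumed sample data: one missing entry encoded as NaN.
vals2 = np.array([1, np.nan, 3, 4])

# The plain aggregation is poisoned by the NaN ...
plain_total = np.sum(vals2)

# ... while the nan-prefixed versions skip it.
total = np.nansum(vals2)
smallest = np.nanmin(vals2)
largest = np.nanmax(vals2)
```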

3
Q

data.isnull()

A

Pandas data structures have two useful methods for detecting null data: isnull() and notnull(). Either one returns a Boolean mask over the data; isnull() marks the missing entries as True.
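
The mask can be sketched as follows. The Series contents are assumed for illustration; they mix NaN and None to show that both count as null.

```python
import numpy as np
import pandas as pd

# Assumed sample data containing both kinds of missing value.
data = pd.Series([1, np.nan, "hello", None])

# True where the entry is missing (NaN or None).
null_mask = data.isnull()

# notnull() is the element-wise inverse.
present_mask = data.notnull()
```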

4
Q

data[data.notnull()]

A

A Boolean mask such as the one returned by notnull() can be used directly as a Series or DataFrame index, so this expression keeps only the non-null entries.
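
A short sketch of mask-based selection, using assumed sample data:

```python
import numpy as np
import pandas as pd

# Assumed sample data containing both NaN and None.
data = pd.Series([1, np.nan, "hello", None])

# Index with the Boolean mask: only entries where the mask is True survive.
kept = data[data.notnull()]

# The original index labels of the surviving entries are preserved.
kept_labels = list(kept.index)
```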

5
Q

data.dropna()

A

In addition to the masking used before, there are the convenience methods dropna() (which removes NA values) and fillna() (which fills in NA values). For a Series, the result is straightforward: dropna() returns a new Series without the NA entries.
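
The sketch below, on assumed sample data, shows that dropna() on a Series gives the same entries as the mask-based selection and leaves the original untouched.

```python
import numpy as np
import pandas as pd

# Assumed sample data containing both NaN and None.
data = pd.Series([1, np.nan, "hello", None])

# dropna() returns a new Series; `data` itself is not modified.
cleaned = data.dropna()

original_length = len(data)
cleaned_length = len(cleaned)
```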

6
Q

df.dropna(axis='columns')

A

Alternatively, you can drop NA values along a different axis; axis=1 (equivalently, axis='columns') drops all columns containing a null value.
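
To compare the two axes, here is a small sketch on an assumed 3x3 DataFrame in which only the last column is complete:

```python
import numpy as np
import pandas as pd

# Assumed sample data: columns 0 and 1 each contain one NaN, column 2 has none.
df = pd.DataFrame([[1, np.nan, 2],
                   [2, 3, 5],
                   [np.nan, 4, 6]])

# Default (axis=0): drop every row that contains a null value.
by_rows = df.dropna()

# axis='columns' (same as axis=1): drop every column that contains a null value.
by_cols = df.dropna(axis="columns")
```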

7
Q

df.dropna(axis='columns', how='all')

A

The default is how='any', such that any row or column (depending on the axis keyword) containing a null value will be dropped. You can also specify how='all', which will only drop rows/columns that are all null values.
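
The difference between how='any' and how='all' can be sketched on an assumed DataFrame with one entirely empty column (labeled 3 here purely for illustration):

```python
import numpy as np
import pandas as pd

# Assumed sample data; column 3 is added as an all-NaN column.
df = pd.DataFrame([[1, np.nan, 2],
                   [2, 3, 5],
                   [np.nan, 4, 6]])
df[3] = np.nan

# how='any' (the default): a single NaN is enough to drop a column.
any_dropped = df.dropna(axis="columns", how="any")

# how='all': only columns made up entirely of NaN are dropped.
all_dropped = df.dropna(axis="columns", how="all")
```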

8
Q

df.dropna(axis='rows', thresh=3)

A

For finer-grained control, the thresh parameter lets you specify a minimum number of non-null values for the row/column to be kept. Here, axis='rows' with thresh=3 keeps only the rows that have at least three non-null values.
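
A minimal sketch of thresh, on an assumed DataFrame where only the middle row has three non-null values:

```python
import numpy as np
import pandas as pd

# Assumed sample data; column 3 is entirely NaN.
df = pd.DataFrame([[1, np.nan, 2],
                   [2, 3, 5],
                   [np.nan, 4, 6]])
df[3] = np.nan

# Non-null counts per row: 2, 3, 2.
non_null_per_row = df.notnull().sum(axis=1).tolist()

# Keep only rows with at least three non-null values.
kept = df.dropna(axis="rows", thresh=3)
```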

9
Q

data.fillna(0)

A

We can fill NA entries with a single value, such as zero; by default, fillna() returns a new object and leaves the original unchanged.
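
For instance, on an assumed Series with two missing entries (the values and index labels are illustrative only):

```python
import numpy as np
import pandas as pd

# Assumed sample data; None in a numeric Series is stored as NaN.
data = pd.Series([1, np.nan, 2, None, 3], index=list("abcde"))

# Replace every NA entry with zero; `data` itself keeps its NaNs.
filled = data.fillna(0)

remaining_nulls_in_original = int(data.isnull().sum())
```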

10
Q

# forward-fill
data.fillna(method='ffill')
# back-fill
data.fillna(method='bfill')

A

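We can specify a forward fill to propagate the previous value forward, or a back fill to propagate the next value backward. Recent pandas releases deprecate the method argument of fillna() in favor of the equivalent data.ffill() and data.bfill() methods.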
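
The two fill directions can be sketched as below. This version uses the ffill() and bfill() methods rather than the method keyword, on the assumption that the reader may be running a pandas release where that keyword is deprecated; the sample data is also assumed.

```python
import numpy as np
import pandas as pd

# Assumed sample data with two gaps.
data = pd.Series([1, np.nan, 2, None, 3], index=list("abcde"))

# Forward fill: each gap takes the last valid value before it.
forward = data.ffill()

# Back fill: each gap takes the next valid value after it.
backward = data.bfill()

# A gap with no earlier value stays NaN under a forward fill.
leading_gap = pd.Series([np.nan, 1.0, np.nan]).ffill()
```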

11
Q

df.fillna(method='ffill', axis=1)

A

For DataFrames, the options are similar, but we can also specify an axis along which the fills take place; axis=1 fills across each row, from left to right.
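
A rough sketch of a row-wise forward fill on an assumed all-float DataFrame. It calls df.ffill(axis=1), which should behave like the fillna(method='ffill', axis=1) call on the card while avoiding the deprecated keyword.

```python
import numpy as np
import pandas as pd

# Assumed sample data; the last column is entirely NaN.
df = pd.DataFrame(np.array([[1, np.nan, 2, np.nan],
                            [2, 3, 5, np.nan],
                            [np.nan, 4, 6, np.nan]]))

# Fill along each row: a gap takes the value to its left.
filled = df.ffill(axis=1)

# A gap in the first column has nothing to its left, so it stays NaN.
first_cell_of_last_row = filled.iloc[2, 0]
```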
