Categorical Variables Flashcards

1
Q

What is .value_counts()?

A

Returns a Series containing the count of each unique value. The result is sorted in descending order, so the first element is the most frequently occurring value. Missing values (NaN) are excluded by default.

dataframe['column'].value_counts()
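A minimal sketch of the call above, run on made-up data. The column name "color" and its values are assumptions chosen for illustration, not part of the original card.

```python
import pandas as pd

# Hypothetical example data; "color" is an assumed column name.
dataframe = pd.DataFrame({"color": ["red", "blue", "red", "green", "red", "blue"]})

# Count how often each unique value occurs, most frequent first.
counts = dataframe["color"].value_counts()
print(counts)

# normalize=True returns proportions instead of raw counts.
shares = dataframe["color"].value_counts(normalize=True)
print(shares)
```

The counts are stored in their own variable rather than assigned back to the column, because the result is indexed by category, not by row.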

2
Q

What is .str.strip()?

A

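Removes leading and trailing whitespace characters (including newlines) from every string in a Series. Whitespace inside the string is left alone.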
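A short sketch of why this matters for categorical data: stray spaces make identical labels look like different categories. The "city" column and its values are hypothetical.

```python
import pandas as pd

# Hypothetical example data with stray leading/trailing spaces.
dataframe = pd.DataFrame({"city": ["  Paris", "Paris ", " Rome  "]})

# Before stripping, "  Paris" and "Paris " count as two different categories.
before = dataframe["city"].nunique()

# Strip whitespace from both ends of every string in the column.
dataframe["city"] = dataframe["city"].str.strip()
after = dataframe["city"].nunique()

print(before, after)
```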

3
Q

What is qcut()?

A

“Quantile-based discretization function.” pd.qcut() divides the data into bins that each hold roughly the same number of observations. The bin edges are computed from quantiles (percentiles) of the data's distribution, so the bins are usually not equal in width.
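As a rough illustration, the sketch below splits twelve made-up values into quartiles. The label names "Q1" to "Q4" are assumptions chosen for this example.

```python
import pandas as pd

# Hypothetical example data: the integers 1 through 12.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

# q=4 asks for four quantile-based bins (quartiles).
quartile = pd.qcut(values, q=4, labels=["Q1", "Q2", "Q3", "Q4"])

# Each bin ends up holding the same number of observations.
print(quartile.value_counts())
```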

4
Q

How do you collapse data categories?

A
  1. Create the list of bin edges: range = ...
  2. Create the list of label names (one fewer than the number of bin edges): names = ...
  3. dataframe['column'] = pd.cut(dataframe['other column'], bins=range, labels=names)
  4. dataframe[['column', 'other column']]
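The four steps can be sketched as follows. The "income" column, the bin edges, and the group names are all hypothetical; the edges are stored in income_range rather than range so that Python's built-in range is not shadowed.

```python
import pandas as pd

# Hypothetical example data.
dataframe = pd.DataFrame({"income": [12_000, 35_000, 58_000, 91_000, 150_000]})

# Step 1: bin edges (the last bin is open-ended via infinity).
income_range = [0, 30_000, 60_000, 100_000, float("inf")]

# Step 2: one label per bin, so one fewer label than edges.
names = ["low", "middle", "upper-middle", "high"]

# Step 3: assign each income to its labeled bin.
dataframe["income_group"] = pd.cut(
    dataframe["income"], bins=income_range, labels=names
)

# Step 4: inspect the new column next to the original one.
print(dataframe[["income_group", "income"]])
```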
5
Q

What is pd.cut()?

A

pd.cut() sorts the elements of an array into discrete bins (intervals). It is mainly used to turn a continuous numeric variable into a categorical one, for example ages into age groups.
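A small sketch of how values on a bin edge are handled when no labels are given. The ages and bin edges are made up for illustration.

```python
import pandas as pd

# Hypothetical example data.
ages = pd.Series([5, 17, 18, 40, 65])

# Without labels, each value is tagged with the interval it falls into.
# Intervals are closed on the right by default, so 18 lands in (0, 18].
default_bins = pd.cut(ages, bins=[0, 18, 65, 120])

# right=False closes the intervals on the left, so 18 lands in [18, 65).
left_closed = pd.cut(ages, bins=[0, 18, 65, 120], right=False)

print(default_bins)
print(left_closed)
```

Checking which side of the interval is closed is worth doing whenever the data contains values that sit exactly on a bin edge.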

6
Q

How do you map categories to fewer categories?

A
  1. Create a mapping dictionary from old category names to new ones
  2. dataframe['column'] = dataframe['column'].replace(mapping)
  3. dataframe['column'].unique() to check the remaining categories
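The steps above can be sketched like this. The "os" column and the mapping entries are hypothetical examples.

```python
import pandas as pd

# Hypothetical example data with five detailed categories.
dataframe = pd.DataFrame(
    {"os": ["Windows 10", "Windows 11", "macOS", "Ubuntu", "Fedora"]}
)

# Step 1: map each detailed category to a broader one.
# Values missing from the dictionary (here "macOS") are left unchanged.
mapping = {
    "Windows 10": "Windows",
    "Windows 11": "Windows",
    "Ubuntu": "Linux",
    "Fedora": "Linux",
}

# Step 2: apply the mapping to the column.
dataframe["os"] = dataframe["os"].replace(mapping)

# Step 3: confirm that only the collapsed categories remain.
print(dataframe["os"].unique())
```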
7
Q

What is unique()?

A

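Returns the unique values of a Series. In pandas, dataframe['column'].unique() returns them in order of first appearance and does not sort them. NumPy's np.unique(), by contrast, returns the unique elements of an array in sorted order.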
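A brief sketch comparing the pandas method with NumPy's function on made-up data. The variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data.
sizes = pd.Series(["M", "S", "M", "L", "S"])

# pandas: unique values in order of first appearance.
appearance_order = sizes.unique()

# NumPy: unique values in sorted order.
sorted_order = np.unique(sizes.to_numpy())

print(appearance_order)
print(sorted_order)
print(sizes.nunique())  # number of distinct values
```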
