Pandas Basics Flashcards

1
Q

How to read a CSV using pandas?

A

pd.read_csv('link')

2
Q

How to get the number of rows and columns?

A

df.shape

3
Q

How to get more detailed information about the DataFrame, for example the index type?

A

df.info()

4
Q

How to display the list of columns?

A

df.columns

5
Q

How to set the maximum number of displayed columns to 85?

A

pd.set_option('display.max_columns', 85)

6
Q

How to print the first 3 rows?

A

df.head(3)

7
Q

How to print the last 3 rows?

A

df.tail(3)

8
Q

What is a data frame?

A

A DataFrame is a two-dimensional labeled array (rows and columns) with additional functions

9
Q

What is a series?

A

A Series is a one-dimensional labeled array with additional functions

10
Q

How would you return the first 3 elements of column 2 using iloc?

A

df.iloc[[0, 1, 2], 2] or df.iloc[0:3, 2]

11
Q

What’s the difference between loc and iloc?

A

iloc selects rows and columns by integer position; loc selects them by label (their actual names)
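
A minimal sketch of the difference, assuming a small made-up DataFrame with string row labels so that positions and labels differ (the data and variable names are illustrative only):

```python
import pandas as pd

# Hypothetical data; string row labels make position and label distinct.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy"], "cost": [10, 0, 25]},
    index=["a", "b", "c"],
)

by_position = df.iloc[0, 1]      # first row, second column, by integer position
by_label = df.loc["a", "cost"]   # the same cell, by row label and column name

# Slices differ too: iloc excludes the end position, loc includes the end label.
first_two_iloc = df.iloc[0:2]    # rows at positions 0 and 1
first_two_loc = df.loc["a":"b"]  # rows labeled "a" through "b", inclusive
```

Both lookups return the same cell here; only the way the cell is addressed changes.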

12
Q

What does inplace=True do?

A

Applies the change to the DataFrame itself instead of returning a modified copy

13
Q

How would you replace the index of a DataFrame with one of its columns?

A

df.set_index('column')

14
Q

How would you reset the indices?

A

df.reset_index()
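
A short sketch of setting and resetting an index, assuming a made-up DataFrame with an email column (all names and values are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"email": ["a@x.com", "b@x.com"], "age": [30, 41]})

indexed = df.set_index("email")      # returns a new DataFrame; df is unchanged
age = indexed.loc["b@x.com", "age"]  # rows can now be looked up by email
restored = indexed.reset_index()     # email becomes a column again, index is 0..1
```

Without inplace=True, both calls return new DataFrames, so the results have to be assigned to a name.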

15
Q

How would you sort the indices?

A

df.sort_index()

16
Q

What is a filter mask? Give one example

A

A filter mask is a Boolean Series produced by a condition; only the rows for which the mask is True are displayed. filt = (df['cost'] == 0)
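
A minimal sketch of building and applying a mask, assuming a made-up DataFrame with a cost column (the item names and values are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"item": ["pen", "bag", "cup"], "cost": [0, 15, 0]})

filt = (df["cost"] == 0)  # Boolean Series: one True/False per row
free_items = df[filt]     # keeps only the rows where the mask is True
```

df.loc[filt] selects the same rows and also allows choosing columns in the same call.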

17
Q

What do these signs mean: &, |, ~

A

And, or, not (applied element-wise to Boolean masks)
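
A sketch of combining conditions with these operators, assuming a made-up DataFrame (column names and values are illustrative). Each comparison is wrapped in parentheses because & and | bind more tightly than == and >:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "item": ["pen", "bag", "cup", "hat"],
    "cost": [0, 15, 0, 30],
    "in_stock": [True, True, False, True],
})

free_and_stocked = df[(df["cost"] == 0) & df["in_stock"]]   # both conditions hold
free_or_pricey = df[(df["cost"] == 0) | (df["cost"] > 20)]  # at least one holds
not_free = df[~(df["cost"] == 0)]                           # condition negated
```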

18
Q

How to check which values of a series are among certain values?

A

df[column].isin(values)

19
Q

How to check whether strings contain a certain substring? How to treat missing values (NaN) as False?

A

df[column].str.contains('substring', na=False)

20
Q

How would you change the values of a column named email to lowercase?

A

df['email'] = df['email'].str.lower()

21
Q

How does apply work?

A

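On a Series, apply calls a function on every value, one by one. On a DataFrame, apply calls the function on each column (or on each row with axis=1), so the call below returns the minimum of every column.
df.apply(pd.Series.min)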
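
A small sketch of apply on a Series and on a DataFrame, assuming a made-up numeric DataFrame (names and values are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [3, 1, 2], "b": [9, 7, 8]})

doubled = df["a"].apply(lambda x: x * 2)  # Series.apply: called once per value
col_min = df.apply(pd.Series.min)         # DataFrame.apply: called once per column
row_sum = df.apply(lambda row: row["a"] + row["b"], axis=1)  # once per row
```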

22
Q

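How does applymap work?

A

applymap works like apply, but calls the function on every single element of a DataFrame instead of on the elements of one Series. Recent pandas versions (2.1 and later) rename it to DataFrame.map.
df.applymap(str.lower)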

23
Q

How does map work?

A

map replaces the chosen values and leaves everything that is not in the mapping as NaN
df[column].map({key: value})

24
Q

How does replace work?

A

replace changes the chosen values and leaves all other values as they were
df[column].replace({key: value})
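
A sketch contrasting map and replace on the same made-up Series and mapping (all values are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series(["cat", "dog", "bird"])
mapping = {"cat": "feline", "dog": "canine"}

mapped = s.map(mapping)        # "bird" is not in the mapping, so it becomes NaN
replaced = s.replace(mapping)  # "bird" is not in the mapping, so it stays "bird"
```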

25
Q

How would you create a new column from two other columns containing strings?

A

df[new] = df[old1] + ' ' + df[old2]

26
Q

How would you delete a column?

A

df.drop(columns=[columnname], inplace=True)

27
Q

How would you expand one column into two?

A

df[column].str.split(' ', expand=True)
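
A minimal sketch of splitting one string column into two new columns, assuming a made-up full_name column where every value contains exactly one space (the column names are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"full_name": ["Ada Lovelace", "Alan Turing"]})

# expand=True returns a DataFrame with one column per piece,
# which can be assigned to two new columns at once.
df[["first", "last"]] = df["full_name"].str.split(" ", expand=True)
```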

28
Q

How would you append a row (or another DataFrame) back onto a DataFrame?

A

df = pd.concat([df, row])

29
Q

What does ignore_index do and how would you use it?

A

ignore_index discards the original indices of the concatenated DataFrames and numbers the result 0 to n-1 as one continuous index. df = pd.concat([df, row], ignore_index=True)
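
A sketch of concatenation with and without ignore_index, assuming two made-up DataFrames that each start their index at 0:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"x": [1, 2]})
row = pd.DataFrame({"x": [3]})

kept = pd.concat([df, row])                           # original labels kept: 0, 1, 0
renumbered = pd.concat([df, row], ignore_index=True)  # fresh labels: 0, 1, 2
```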

30
Q

How would you sort the values of a certain column?

A

df[column].sort_values()

31
Q

How would you sort a DataFrame by the values of certain columns?

A

df.sort_values(by=[c1, c2], inplace=True)

32
Q

How would you sort a DataFrame by index?

A

df.sort_index()

33
Q

How would you get the 10 largest and 10 smallest values of a column?

A

df[c1].nlargest(10), df[c1].nsmallest(10)

34
Q

How would you get the median and mean of a column?

A

df[c1].median(), df[c1].mean()

35
Q

How would you get summary statistics such as the mean and median for a certain DataFrame?

A

df.describe()

36
Q

How would you get the number of times each response appears in a column? How would you make the counts sum up to 1?

A

df[c1].value_counts(), df[c1].value_counts(normalize=True)
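
A small sketch of raw and normalized counts, assuming a made-up Series of yes/no responses:

```python
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series(["yes", "no", "yes", "yes"])

counts = s.value_counts()                # how often each value occurs
shares = s.value_counts(normalize=True)  # the same counts as fractions of the total
```

Here "yes" occurs 3 times out of 4, so its normalized share is 0.75.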

37
Q

How do you create a group object from a dataframe? What exactly does it group?

A

group = df.groupby(c1). It groups together the rows that share the same value in the chosen column

38
Q

How do you apply multiple aggregation functions to one column?

A

group[c1].agg(['median', 'mean'])
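
A sketch of grouping and aggregating, assuming a made-up DataFrame with country and salary columns (names and numbers are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "country": ["PL", "PL", "DE", "DE", "DE"],
    "salary": [10, 20, 30, 40, 80],
})

group = df.groupby("country")                     # one group per distinct country
stats = group["salary"].agg(["median", "mean"])   # one row per group, one column per function
```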

39
Q

How do you get the unique values of a chosen column?

A

df[c1].unique()

40
Q

How do you check which values in a DataFrame are missing (NaN)?

A

df.isna()

41
Q

How do you replace missing (NaN) values with something different?

A

df.fillna(0, inplace=True)

42
Q

How do you change the type of the values in a column?

A

df[column] = df[column].astype(type)

43
Q

How would you convert a date string to an actual date?

A

df['date'] = pd.to_datetime(df['date'], format=fmt), where fmt is a strftime-style format string matching the data (look up the format codes)

44
Q

How would you check the name of a day on a specific date? How would you do it on a series of days?

A

df.loc[0, 'date'].day_name(),

df['date'].dt.day_name()
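
A sketch of parsing dates and reading day names, assuming a made-up date column in ISO year-month-day form (the format string fits this sample data only):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"date": ["2024-01-01", "2024-01-06"]})

# Parse the strings into real timestamps; the format matches the sample data.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

first_day = df.loc[0, "date"].day_name()  # a single Timestamp has day_name()
all_days = df["date"].dt.day_name()       # a whole Series goes through the .dt accessor
```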

45
Q

How would you find the oldest and newest date?

A

df['date'].min()
df['date'].max()

46
Q

How would you get the highest value for each month?

A

highs = df['high'].resample('M').max() (requires a DatetimeIndex; recent pandas versions spell the month-end alias 'ME')
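
A sketch of monthly resampling, assuming a made-up high column on a DatetimeIndex. It uses the month-start alias 'MS' instead of the card's month-end alias, an assumption made here because 'MS' is spelled the same across pandas versions; the bins are then labeled by the first day of each month:

```python
import pandas as pd

# Hypothetical data for illustration; resample needs a DatetimeIndex.
df = pd.DataFrame(
    {"high": [5, 9, 4, 7]},
    index=pd.to_datetime(["2024-01-03", "2024-01-20", "2024-02-01", "2024-02-15"]),
)

# One bin per calendar month, labeled by month start; take the maximum in each bin.
highs = df["high"].resample("MS").max()
```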

47
Q

How would you get rid of rows of data that have missing values in a given column? How would you get rid of columns? How does the ‘how’ parameter work?

A

df.dropna(axis='index', how='all', subset=['last']),
df.dropna(axis='columns', how='all')
The how parameter determines whether a row is dropped when a value is missing in ANY of the listed columns (how='any') or only when values are missing in ALL of the listed columns (how='all'). With axis='index', subset lists the columns to check.
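
A sketch of how 'any' and 'all' differ, assuming a made-up DataFrame with some missing names and one completely empty column (all names are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "first": ["Ann", None, "Cy"],
    "last": ["Lee", None, None],
    "empty": [np.nan, np.nan, np.nan],
})

# Drop a row if ANY of the listed columns is missing: only row 0 survives.
any_missing = df.dropna(axis="index", how="any", subset=["first", "last"])

# Drop a row only if ALL of the listed columns are missing: rows 0 and 2 survive.
all_missing = df.dropna(axis="index", how="all", subset=["first", "last"])

# Drop columns in which every value is missing: "empty" is removed.
no_empty_cols = df.dropna(axis="columns", how="all")
```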