Udemy Code Review Flashcards
num = 12
name = 'Sam'
Print ‘My number is 12 and my name is Sam’ in two ways
print('My number is {} and my name is {}'.format(num, name))
print('My number is {one} and my name is {two}'.format(one=num, two=name))
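A runnable sketch of both approaches, using the num and name values from the card. The variable names by_position and by_keyword are only illustrative.

```python
num = 12
name = 'Sam'

# Empty braces are filled in the order the arguments are passed.
by_position = 'My number is {} and my name is {}'.format(num, name)

# Named braces are filled by matching keyword arguments.
by_keyword = 'My number is {one} and my name is {two}'.format(one=num, two=name)

print(by_position)
print(by_keyword)
```

Both calls build the same string; the keyword form is easier to read when there are many placeholders.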
d = {'k1': [1, 2, 3]}
Grab number 2
d['k1'][1]
What is a set?
set function
A collection of unique elements, e.g. {1, 2, 3}
Duplicates are discarded, e.g. {1, 1, 2, 2, 3, 3} evaluates to {1, 2, 3}
You can pass a list to the set() function to get its unique elements:
set([1, 1, 2, 3, 4, 4, 6, 6])
Returns: {1, 2, 3, 4, 6}
Add 5 to this set:
s = {1, 2, 3}
s.add(5)
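A short sketch of the set behaviour described above. The name unique is only illustrative.

```python
# Building a set from a list drops the duplicates.
unique = set([1, 1, 2, 3, 4, 4, 6, 6])

s = {1, 2, 3}
s.add(5)
s.add(5)  # adding an element that is already present changes nothing

print(unique)
print(s)
```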
split method
string.split('delimiter')
Splits a string into a list on the delimiter, which defaults to any run of whitespace
e.g.
s = 'hello my name is Sam'
s.split() returns: ['hello', 'my', 'name', 'is', 'Sam']
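A quick sketch of split with and without an explicit delimiter. The comma-separated string is a made-up example, not from the course.

```python
s = 'hello my name is Sam'
words = s.split()          # no argument: split on whitespace

csv_line = 'a,b,c'         # hypothetical input for illustration
fields = csv_line.split(',')

print(words)
print(fields)
```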
Tuple unpacking
x = [(1, 2), (3, 4), (5, 6)]
Print 1, 3, 5
for a, b in x:
    print(a)
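The loop above as a runnable sketch. The firsts list is added here only so the result can be checked.

```python
x = [(1, 2), (3, 4), (5, 6)]

firsts = []
for a, b in x:      # each tuple is unpacked into a and b
    print(a)
    firsts.append(a)
```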
numpy equivalent of range function?
np.arange(start, stop, increment)
Make a 1x3 array of zeros
Make a 5x5 array of zeros
pass array dimensions as a tuple
np.zeros(3)
np.zeros((5, 5))
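A small sketch of the shapes np.zeros produces. Note that np.zeros(3) is one-dimensional with shape (3,); a strictly two-dimensional 1x3 array would need the tuple (1, 3). The variable names are illustrative.

```python
import numpy as np

flat = np.zeros(3)         # 1-D array with 3 elements
row = np.zeros((1, 3))     # 2-D array with 1 row and 3 columns
square = np.zeros((5, 5))  # 2-D array, dimensions passed as a tuple

print(flat.shape, row.shape, square.shape)
```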
linspace
np.linspace(start, stop, numberpoints)
Creates numberpoints evenly spaced numbers from start to stop (stop is included by default)
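A sketch contrasting np.arange and np.linspace with made-up arguments: arange takes a step size and excludes the stop value, while linspace takes a number of points and includes the stop value by default.

```python
import numpy as np

stepped = np.arange(0, 11, 2)    # step of 2, stop (11) excluded
spaced = np.linspace(0, 10, 5)   # 5 points, stop (10) included

print(stepped)
print(spaced)
```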
Create a 4x4 identity matrix
np.eye(4)
Create a 1x5 array of random numbers from a uniform distribution
Do the same for a 5x5 array
This returns random numbers in the range 0 (included) to 1 (excluded)
np.random.rand(5)
np.random.rand(5, 5)
Create 2x3 array of random numbers from the normal distribution
np.random.randn(2, 3)
Create an array of 10 random integers from 1 to 99
low number is included but high number is not
np.random.randint(1, 100, 10)
np.random.randint(low, high, numberpoints)
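A sketch of the three random generators from the cards above. The values differ on every run, so only the shapes and ranges are checked; the variable names are illustrative.

```python
import numpy as np

uniform = np.random.rand(5, 5)         # uniform on [0, 1)
normal = np.random.randn(2, 3)         # standard normal distribution
ints = np.random.randint(1, 100, 10)   # integers from 1 to 99

print(uniform.shape, normal.shape, ints.shape)
```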
How do you reshape a 1x25 array into a 5x5 array? Use array ‘arr’
arr.reshape(5, 5)
Get the maximum value of the array ‘arr’
Get the minimum value of the array ‘arr’
Get the location of the maximum value of the array ‘arr’
argmax returns the index of the maximum value
arr.max()
arr.min()
arr.argmax()
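A sketch of reshape, max, min and argmax on small made-up arrays. The names grid and values are illustrative.

```python
import numpy as np

arr = np.arange(25)        # 25 elements: 0 to 24
grid = arr.reshape(5, 5)   # same data viewed as 5 rows by 5 columns

values = np.array([3, 9, 1, 7])
largest = values.max()
smallest = values.min()
position = values.argmax()  # index of the largest element

print(grid.shape, largest, smallest, position)
```

reshape only works when the new shape holds exactly the same number of elements as the original.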
arr = np.arange(0, 11)
slice_of_arr = arr[0:6]
slice_of_arr[:] = 99
What are the elements of arr?
array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
A slice is a view of the original array, so assigning to the slice also changes arr
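A sketch showing why arr changes: a NumPy slice is a view onto the same data. Calling .copy() on the slice is one way to avoid this. The names arr2 and independent are illustrative.

```python
import numpy as np

arr = np.arange(0, 11)
slice_of_arr = arr[0:6]    # a view, not a copy
slice_of_arr[:] = 99       # writes through to arr

arr2 = np.arange(0, 11)
independent = arr2[0:6].copy()  # explicit copy
independent[:] = 99             # arr2 is left untouched

print(arr)
print(arr2)
```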
Pandas is built upon
NumPy
What is the difference between pandas series and numpy arrays?
A pandas Series can have a labeled index
Change my_data into a pandas Series with labels as the index
labels = ['a', 'b', 'c']
my_data = [10, 20, 30]
Can also pass it a numpy array
pd.Series(data=my_data, index=labels)
What is the index when you turn a dictionary into a pandas series?
The dictionary keys become the index.
e.g.
d = {'a': 10, 'b': 20, 'c': 30}
pd.Series(d)
returns:
a    10
b    20
c    30
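A sketch of both ways of building the same Series, from a list plus labels and from a dictionary. The names from_list and from_dict are illustrative.

```python
import pandas as pd

labels = ['a', 'b', 'c']
my_data = [10, 20, 30]
from_list = pd.Series(data=my_data, index=labels)

d = {'a': 10, 'b': 20, 'c': 30}
from_dict = pd.Series(d)   # keys become the index, values become the data

print(from_dict)
```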
USA 1
Germany 2
Italy 5
Japan 4
This is a pandas series ser1. How do you pull out 1?
ser1[index]
ser1['USA']
How do you drop a column from a dataframe?
How do you drop a row from a dataframe?
inplace=True modifies the original df; the default (False) returns a new dataframe
df.drop('column1', axis=1, inplace=True)
df.drop('row1', axis=0, inplace=True)
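A sketch of drop on a small made-up dataframe, showing the difference between the default behaviour (a new dataframe is returned) and inplace=True. Column and row labels follow the card; column2 and row2 are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 2], 'column2': [3, 4]},
                  index=['row1', 'row2'])

without_col = df.drop('column1', axis=1)  # df itself is unchanged
without_row = df.drop('row1', axis=0)     # df itself is unchanged
columns_before = list(df.columns)

df.drop('column1', axis=1, inplace=True)  # now df is modified
```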
How do you do AND and OR conditions with pandas boolean series? e.g. You want the df where df['W'] > 0 and df['Y'] > 1
You cannot use Python's and / or keywords.
You need to use & or |, with each condition wrapped in parentheses
e.g.
df[(df['W'] > 0) & (df['Y'] > 1)]
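A sketch of combined conditions on a made-up dataframe with columns W and Y. The parentheses matter because & and | bind more tightly than the comparison operators.

```python
import pandas as pd

df = pd.DataFrame({'W': [1, -1, 2, -3],
                   'Y': [2, 3, 0, 0]})

both = df[(df['W'] > 0) & (df['Y'] > 1)]     # rows meeting both conditions
either = df[(df['W'] > 0) | (df['Y'] > 1)]   # rows meeting at least one

print(both)
print(either)
```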
How do you reset the index in a dataframe?
inplace is optional. True means that it will change the original dataframe. Default is False
df.reset_index(inplace=True)
How do you set a new index?
Use inplace=True if you want to change the original df
df.set_index('column1', inplace=True)
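A sketch of set_index followed by reset_index on a made-up dataframe. Both calls are used without inplace here, so each returns a new dataframe; the value column is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'column1': ['x', 'y'], 'value': [1, 2]})

indexed = df.set_index('column1')   # column1 becomes the index
restored = indexed.reset_index()    # index goes back to being a column

print(indexed)
print(restored)
```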
drop missing values from a dataframe
axis is optional and default is 0 (rows)
df.dropna(axis=0)
How do you fill in missing values in a dataframe?
df.fillna(value= )
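A sketch of dropna and fillna on a made-up dataframe with one missing value. The fill value 0 is only an illustration; the right value depends on the data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1.0, np.nan, 3.0],
                   'B': [4.0, 5.0, 6.0]})

rows_dropped = df.dropna(axis=0)   # drop rows containing a missing value
cols_dropped = df.dropna(axis=1)   # drop columns containing a missing value
filled = df.fillna(value=0)        # replace missing values with 0

print(rows_dropped)
print(cols_dropped)
print(filled)
```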
What is concatenation? How do you concatenate dataframes?
Concatenation glues dataframes together along an axis
axis=0 (the default) stacks them vertically; axis=1 places them side by side
pd.concat([df1, df2, df3], axis=0)
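A sketch of concatenation along both axes, using two small made-up dataframes. Note that the original index labels are kept, so they repeat after a vertical concat.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

stacked = pd.concat([df1, df2], axis=0)       # rows of df2 below df1
side_by_side = pd.concat([df1, df2], axis=1)  # columns of df2 next to df1

print(stacked)
print(side_by_side)
```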
How do you merge dataframes together based on a common column?
Works like joining tables in SQL
pd.merge(df1, df2, how='inner', on='commoncolumn')
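A sketch of an inner merge on a shared column, with made-up data. Only keys present in both dataframes survive an inner merge. The names left and right and the columns A and B are hypothetical.

```python
import pandas as pd

left = pd.DataFrame({'commoncolumn': ['k1', 'k2', 'k3'], 'A': [1, 2, 3]})
right = pd.DataFrame({'commoncolumn': ['k2', 'k3', 'k4'], 'B': [20, 30, 40]})

merged = pd.merge(left, right, how='inner', on='commoncolumn')

print(merged)
```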
How do you combine dataframes based on an index?
df1.join(df2)
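A sketch of join with made-up data. join matches rows on the index and keeps every row of the calling dataframe by default (a left join), so unmatched rows get missing values.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]}, index=['x', 'y'])
df2 = pd.DataFrame({'B': [3, 4]}, index=['y', 'z'])

joined = df1.join(df2)   # index 'x' has no match in df2, so B is NaN there

print(joined)
```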
How do you apply a function to a column in a dataframe?
df['col1'].apply(functionname)
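A sketch of apply on a single column. The function times_two is a hypothetical stand-in for functionname.

```python
import pandas as pd

def times_two(value):
    """Double a single value; apply calls this once per element."""
    return value * 2

df = pd.DataFrame({'col1': [1, 2, 3]})
doubled = df['col1'].apply(times_two)

print(doubled)
```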
How do you drop a column from a dataframe?
df.drop('col1', axis=1, inplace=True)
Get a list of the column names of a dataframe
df.columns
Sort a dataframe by a column
df.sort_values(by='col1')
create a pivot table with multiple indexes
df.pivot_table(values='col1', index=['col2', 'col3'], columns=['col4'])
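A sketch of the pivot table call on made-up data with the same column names. The two index columns produce a two-level row index, and combinations with no data show up as missing values.

```python
import pandas as pd

df = pd.DataFrame({
    'col2': ['a', 'a', 'b', 'b'],
    'col3': ['x', 'y', 'x', 'y'],
    'col4': ['p', 'q', 'p', 'q'],
    'col1': [1, 2, 3, 4],
})

table = df.pivot_table(values='col1', index=['col2', 'col3'], columns=['col4'])

print(table)
```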
Figure out working directory
pwd (works in IPython and Jupyter; in plain Python use os.getcwd())
Save a dataframe to a csv
If you omit index=False, the index is written out as the first column
df.to_csv('My_output', index=False)
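A sketch of the effect of index=False on made-up data. Calling to_csv without a file path returns the CSV text as a string, which is used here so that no file is written.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

with_index = df.to_csv()                # index written as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns

print(with_index)
print(without_index)
```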