Lesson 7: NumPy and Pandas Analysis Flashcards
Create an array of 10 zeros and ensure they are integers.
np.zeros(10, dtype='int')
Create a matrix with 3 rows and 5 columns in which every element is 5.45.
np.full((3,5),5.45)
Create an array of 5 evenly spaced numbers between 0 and 2 (both endpoints included).
np.linspace(0, 2, 5)
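A quick way to check the three array constructors from the cards above is to inspect the dtype, shape, and values they produce. This is a minimal sketch; the variable names zeros, filled, and spaced are chosen here for illustration only.

```python
import numpy as np

# Names below are illustrative, not part of the original cards.
zeros = np.zeros(10, dtype=int)     # ten integer zeros
filled = np.full((3, 5), 5.45)      # 3 rows, 5 columns, every entry 5.45
spaced = np.linspace(0, 2, 5)       # 5 values; the endpoint 2 is included by default

print(zeros.dtype.kind)   # 'i' means a signed integer type
print(filled.shape)       # (3, 5)
print(spaced.tolist())    # [0.0, 0.5, 1.0, 1.5, 2.0]
```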
Create a 3x3 array of random numbers drawn from a normal distribution with mean 0 and standard deviation 1.
np.random.normal(0, 1, (3,3))
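For reproducible results, the same draw can be made through NumPy's Generator interface with a fixed seed. This is a sketch of an alternative to the card's answer, not a replacement for it; the seed value 42 and the variable names are arbitrary choices made here.

```python
import numpy as np

# Seed 42 is an arbitrary choice for illustration.
rng = np.random.default_rng(seed=42)
sample = rng.normal(loc=0, scale=1, size=(3, 3))   # mean 0, standard deviation 1

# A second generator with the same seed reproduces the same numbers.
repeat = np.random.default_rng(seed=42).normal(loc=0, scale=1, size=(3, 3))

print(sample.shape)                    # (3, 3)
print(np.array_equal(sample, repeat))  # True
```

Note that normally distributed values are not confined to a fixed interval, so some entries can be negative or greater than 1.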
Combine the following arrays into one: x = np.array([1, 2, 3]), y = np.array([3, 2, 1]), z = [21, 21, 21].
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = [21,21,21]
np.concatenate([x, y,z])
Concatenate the grid array with itself, where grid = np.array([[1,2,3],[4,5,6]]).
grid = np.array([[1,2,3],[4,5,6]])
np.concatenate([grid,grid])
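With a 2D array, np.concatenate joins along axis 0 (rows) unless told otherwise. The sketch below compares the default with axis=1; the names stacked_rows and stacked_cols are illustrative only.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# Default axis=0: the second copy is placed below the first.
stacked_rows = np.concatenate([grid, grid])
# axis=1: the second copy is placed to the right of the first.
stacked_cols = np.concatenate([grid, grid], axis=1)

print(stacked_rows.shape)  # (4, 3)
print(stacked_cols.shape)  # (2, 6)
```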
Create a dataframe from a dictionary with the columns Fruit and Items. The values for Fruit are ['Peach', 'Apple', 'Pear', 'Plum', 'Kiwi'] and the values for Items are [121, 40, 100, 130, 11].
data = pd.DataFrame({'Fruit': ['Peach', 'Apple', 'Pear', 'Plum', 'Kiwi'],
'Items': [121, 40, 100, 130, 11]})
How do you get complete information on the dataset?
data.info()
Make a dataframe with the columns group and kg. Group values: 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'. Kg values: 4, 3, 12, 6, 7.5, 8, 3, 5, 6.
data = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], 'kg': [4, 3, 12, 6, 7.5, 8, 3, 5, 6]})
Sort the data df by kg in ascending order, modifying the original df.
data = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], 'kg': [4, 3, 12, 6, 7.5, 8, 3, 5, 6]})
data.sort_values(by=['kg'], ascending=True, inplace=True)
Sort data by multiple columns: group in ascending order and kg in descending order. Make sure you don't modify the original dataset.
data.sort_values(by=['group', 'kg'], ascending=[True, False], inplace=False)
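The multi-column sort can be checked on the group/kg dataframe from the earlier card. A minimal sketch, assuming that dataframe; the name sorted_df is introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'kg': [4, 3, 12, 6, 7.5, 8, 3, 5, 6],
})

# inplace=False (the default) returns a new, sorted dataframe.
sorted_df = data.sort_values(by=['group', 'kg'], ascending=[True, False])

print(sorted_df['kg'].tolist())  # [12.0, 4.0, 3.0, 8.0, 7.5, 6.0, 6.0, 5.0, 3.0]
print(data['kg'].tolist())       # original order is unchanged
```

Within each group the kg values run from largest to smallest, while the groups themselves stay in alphabetical order.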
data = pd.DataFrame({'names': ['Mila']*3 + ['Igor']*4, 'Age': [3, 2, 1, 3, 3, 4, 4]})
Remove duplicate rows from data.
data.drop_duplicates()
Remove duplicate values from the names column.
data = pd.DataFrame({'names': ['Mila']*3 + ['Igor']*4, 'Age': [3, 2, 1, 3, 3, 4, 4]})
data.drop_duplicates(subset='names')
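The two drop_duplicates cards behave differently: with no arguments, a row is dropped only if every column matches an earlier row, while subset='names' compares the names column alone. The sketch below also shows the keep parameter; the variable names are illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'names': ['Mila'] * 3 + ['Igor'] * 4,
    'Age': [3, 2, 1, 3, 3, 4, 4],
})

full_rows = data.drop_duplicates()                           # compares whole rows
by_name = data.drop_duplicates(subset='names')               # keeps first row per name
by_name_last = data.drop_duplicates(subset='names', keep='last')  # keeps last row per name

print(len(full_rows))                 # 5
print(by_name['Age'].tolist())        # [3, 3]
print(by_name_last['Age'].tolist())   # [1, 4]
```

None of these calls changes data itself, because inplace defaults to False.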
For the farm shop df (data), create a new column animal2 that maps each value in the food column to its animal using meat_to_animal. Convert the food values to lowercase first.
data['animal2'] = data['food'].map(str.lower).map(meat_to_animal)
Remove the animal2 column from the dataset (the column only, not any rows).
data.drop('animal2', axis='columns', inplace=True)
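The last two cards can be tried end to end. The cards do not show the farm shop data or the meat_to_animal mapping, so both are hypothetical stand-ins invented here for illustration; only the map and drop calls come from the cards.

```python
import pandas as pd

# Hypothetical stand-in for the farm shop dataframe (not from the original lesson).
data = pd.DataFrame({'food': ['bacon', 'Pastrami', 'Corned Beef', 'honey ham']})

# Hypothetical mapping; the real meat_to_animal dictionary is not shown in the cards.
meat_to_animal = {
    'bacon': 'pig',
    'pastrami': 'cow',
    'corned beef': 'cow',
    'honey ham': 'pig',
}

# Lowercase each food so it matches the dictionary keys, then look up the animal.
data['animal2'] = data['food'].map(str.lower).map(meat_to_animal)
animals = data['animal2'].tolist()
print(animals)  # ['pig', 'cow', 'cow', 'pig']

# Drop the column again, modifying data in place.
data.drop('animal2', axis='columns', inplace=True)
print(list(data.columns))  # ['food']
```

Any food missing from the dictionary would come out as NaN in the new column.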