09 NumPy Udemy Flashcards
NumPy is a linear algebra library for Python. The reason it is so important for data science with Python is that almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.
NumPy is also incredibly fast, as it has bindings to C libraries
Vectors are strictly 1-d arrays.
Matrices are 2-d, but a matrix can still have only one row or one column.
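A quick way to see the difference is to compare `ndim` and `shape`. This is a minimal sketch; the names `vector`, `row_matrix` and `column_matrix` are made up for illustration.

```python
import numpy as np

vector = np.array([1, 2, 3])               # 1-d: shape (3,)
row_matrix = np.array([[1, 2, 3]])         # 2-d with a single row: shape (1, 3)
column_matrix = np.array([[1], [2], [3]])  # 2-d with a single column: shape (3, 1)

for name, a in [("vector", vector),
                ("row_matrix", row_matrix),
                ("column_matrix", column_matrix)]:
    print(name, a.ndim, a.shape)
```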
NumPy Arrays
my_list = [1,2,3]
my_list
np.array(my_list)
==
[1, 2, 3]
array([1, 2, 3])
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix
np.array(my_matrix)
==
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
np.arange(0,10,2)
==
array([ 0, 2, 4, 6, 8])
np.zeros((4,5))
==
array([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
np.ones((2,3))
==
array([[ 1., 1., 1.],
[ 1., 1., 1.]])
np.linspace(0,10,3)
array([ 0., 5., 10.])
np.linspace(0,10,6)
array([ 0., 2., 4., 6., 8., 10.])
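The difference between `np.arange` and `np.linspace` is easy to mix up: the third argument is a step size for one and a number of points for the other. A small side-by-side sketch (the variable names are illustrative only):

```python
import numpy as np

by_step = np.arange(0, 10, 2)     # third argument is the step; the stop value 10 is excluded
by_count = np.linspace(0, 10, 6)  # third argument is the number of points; 10 is included

print(by_step)   # integers 0, 2, 4, 6, 8
print(by_count)  # floats 0., 2., 4., 6., 8., 10.
```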
np.eye(4)
==
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
np.random.rand(4,3)
==
array([[0.88277868, 0.64350899, 0.34540502],
[0.48679882, 0.27605306, 0.78273307],
[0.62853873, 0.68858502, 0.81299911],
[0.80075412, 0.49107607, 0.90190195]])
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1)
np.random.randn(5,4)
==
array([[-0.6898178 , -2.19758573, 0.0801776 , -0.04129733],
[-2.7230242 , 0.43690199, -0.23268161, 0.88727527],
[ 0.38131687, -0.15555836, 1.11193221, -2.1126014 ],
[-0.70322463, -0.16923125, -1.80195906, 2.00983817],
[-0.97640657, 1.14549034, -1.05173513, -0.80275212]])
Return a sample (or samples) from the “standard normal” distribution (mean 0, standard deviation 1), unlike rand, which is uniform.
np.random.randint(1,11,100)
Return random integers from low (inclusive) to high (exclusive).
==
array([ 5, 8, 10, 5, 7, 10, 5, 3, 3, 3, 9, 8, 3, 1, 9, 10, 1,
5, 10, 2, 3, 9, 8, 2, 2, 7, 3, 4, 6, 10, 8, 7, 6, 5,
8, 2, 1, 8, 9, 4, 3, 5, 2, 9, 8, 6, 10, 8, 7, 9, 9,
7, 9, 5, 3, 4, 10, 10, 3, 3, 2, 1, 8, 5, 8, 4, 7, 10,
8, 1, 10, 10, 9, 10, 8, 3, 4, 4, 5, 6, 2, 5, 1, 3, 5,
8, 10, 10, 10, 6, 4, 4, 8, 5, 6, 8, 8, 7, 1, 8])
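The three random helpers can be compared by checking the range of what they return. This is only a sketch: the seed value 101 is arbitrary and is there just to make the run repeatable.

```python
import numpy as np

np.random.seed(101)  # arbitrary seed, only so the run is repeatable

uniform = np.random.rand(4, 3)            # uniform samples in [0, 1)
normal = np.random.randn(5, 4)            # standard normal samples, can be negative
integers = np.random.randint(1, 11, 100)  # integers 1 through 10; 11 is excluded

print(uniform.min() >= 0, uniform.max() < 1)
print(normal.shape)
print(integers.min() >= 1, integers.max() <= 10)
```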
arr = np.arange(25)
arr
==
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
arr.reshape(5,5)
==
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
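One thing worth remembering about `reshape`: the new shape has to hold exactly the same number of elements. A short sketch of what happens when it does not (the flag `reshape_failed` is just an illustrative name):

```python
import numpy as np

arr = np.arange(25)
grid = arr.reshape(5, 5)   # 5 * 5 == 25, so this works

try:
    arr.reshape(4, 5)      # 4 * 5 == 20 does not match the 25 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(grid.shape, reshape_failed)
```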
ranarr = np.random.randint(0,50,10)
ranarr
==
array([38, 12, 4, 45, 4, 23, 14, 49, 41, 21])
ranarr.max() == ???
ranarr.argmax() == ???
49
7
(the same reasoning applies to the minimum: min and argmin)
argmax is the index / position of the maximum
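Using the same ten values as in the card above, a small sketch of all four methods side by side:

```python
import numpy as np

ranarr = np.array([38, 12, 4, 45, 4, 23, 14, 49, 41, 21])

print(ranarr.max())     # 49, the largest value
print(ranarr.argmax())  # 7, the position of 49
print(ranarr.min())     # 4, the smallest value
print(ranarr.argmin())  # 2, the position of the first 4
```

When the extreme value appears more than once (here 4 sits at positions 2 and 4), `argmin` and `argmax` report the first occurrence.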
x.shape
shows the shape (the size along each dimension)
x.dtype
shows the data type of the elements
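A small sketch of both attributes. The exact integer type printed depends on the platform, so the comments only state what is certain.

```python
import numpy as np

x = np.arange(25).reshape(5, 5)
z = np.zeros((4, 5))

print(x.shape)  # (5, 5)
print(x.dtype)  # an integer type; the exact width depends on the platform
print(z.shape)  # (4, 5)
print(z.dtype)  # float64
```

Note that `shape` and `dtype` are attributes, not methods, so there are no parentheses.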
arr = np.arange(0,11)
arr[1:5]
==
array([1, 2, 3, 4])
Bracket Indexing and Selection
The simplest way to pick one or several elements of an array looks very similar to indexing Python lists.
Broadcasting
NumPy arrays differ from normal Python lists in their ability to broadcast: a single value assigned to a slice is applied to every element of that slice.
arr
==>
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
arr[0:5]=100
arr
==>
array([100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10])
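To see why this is a NumPy feature and not plain Python, the same assignment can be tried on a list. A minimal sketch; `lst` and `list_error` are illustrative names.

```python
import numpy as np

arr = np.arange(0, 11)
arr[0:5] = 100            # the scalar is broadcast across the slice

lst = list(range(0, 11))
try:
    lst[0:5] = 100        # a list slice needs an iterable on the right-hand side
    list_error = None
except TypeError as exc:
    list_error = exc

print(arr)
print(type(list_error).__name__)  # TypeError
```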
slice_of_arr = arr[0:6]
slice_of_arr[:]=99
slice_of_arr
arr
==>
array([99, 99, 99, 99, 99, 99])
array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
Now note the changes also occur in our original array!
Data is not copied; the slice is a view of the original array. This avoids memory problems with large arrays.
# To get a copy, you need to be explicit
arr_copy = arr.copy()
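The view versus copy behaviour can be checked directly. This sketch uses `np.shares_memory` to confirm which arrays point at the same data; the name `view` is illustrative.

```python
import numpy as np

arr = np.arange(0, 11)

view = arr[0:6]        # a slice is a view: it shares memory with arr
view[:] = 99           # so this also changes arr

arr_copy = arr.copy()  # an explicit, independent copy
arr_copy[:] = -1       # arr is left untouched

print(arr)
print(np.shares_memory(arr, view))      # True
print(np.shares_memory(arr, arr_copy))  # False
```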
Indexing a 2D array (matrices)
The general format is arr_2d[row][col] or arr_2d[row,col].
I recommend usually using the comma notation for clarity.
arr_2d = np.array([[5,10,15],[20,25,30],[35,40,45]])
arr_2d
==>
array([[ 5, 10, 15],
[20, 25, 30],
[35, 40, 45]])
arr_2d[:2,1:]
==>
array([[10, 15],
[25, 30]])
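A short sketch comparing the two notations on the same 3x3 matrix. For a single element they agree, but for slices they do not, which is a good reason to prefer the comma notation.

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

print(arr_2d[1][2])     # 30: take row 1, then element 2 of that row
print(arr_2d[1, 2])     # 30: same element with the comma notation

print(arr_2d[:2, 1:])   # rows 0-1 and columns 1-2: [[10, 15], [25, 30]]
print(arr_2d[:2][1:])   # slices rows twice, giving only row 1: [[20, 25, 30]]
```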
Fancy Indexing
Fancy indexing allows you to select entire rows or columns out of order. To show this, let’s quickly build out a NumPy array:
# Set up matrix
arr2d = np.zeros((10,10))
# Length of array (the matrix is square, so rows == columns)
arr_length = arr2d.shape[1]
# Fill row i with the value i
for i in range(arr_length):
    arr2d[i] = i
arr2d
==>
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
[4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
[8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
[9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])
Fancy indexing accepts a list of row indices, in any order:
arr2d[[2,4,6,8]]
==>
array([[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])
arr2d[[6,4,2,7]]
==>
array([[ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[ 7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
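The same idea works for columns by putting the index list after the comma. A minimal sketch on a smaller matrix (`m` is an illustrative name), which also shows that fancy indexing returns a copy rather than a view:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)  # rows: [0..3], [4..7], [8..11]

rows = m[[2, 0]]       # rows 2 and 0, in that order
cols = m[:, [3, 0]]    # columns 3 and 0, in that order

print(rows)  # [[8, 9, 10, 11], [0, 1, 2, 3]]
print(cols)  # [[3, 0], [7, 4], [11, 8]]

rows[:] = -1           # fancy indexing made a copy, so m is unchanged
print(m[2])            # still [8, 9, 10, 11]
```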
arr1 = np.arange(1,11)
arr1
==>
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
arr1 > 4
==>
array([False, False, False, False, True, True, True, True, True, True], dtype=bool)
bool_arr = arr1 >4
bool_arr
==>
array([False, False, False, False, True, True, True, True, True, True], dtype=bool)
arr1[bool_arr]
==>
array([ 5, 6, 7, 8, 9, 10])
arr1[arr1>2]
==>
array([ 3, 4, 5, 6, 7, 8, 9, 10])
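Conditions can also be combined. This goes slightly beyond the cards above, so treat it as a sketch: each condition needs its own parentheses, and the element-wise operators are `&` (and) and `|` (or) instead of Python's `and` / `or`.

```python
import numpy as np

arr1 = np.arange(1, 11)

between = arr1[(arr1 > 2) & (arr1 < 8)]   # elements greater than 2 AND less than 8
edges = arr1[(arr1 < 3) | (arr1 > 8)]     # elements less than 3 OR greater than 8

print(between)  # [3 4 5 6 7]
print(edges)    # [ 1  2  9 10]
```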
arr = np.arange(0,10)
arr
arr + 100
arr * arr
==>
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81])
arr/arr
==>
array([nan, 1., 1., 1., 1., 1., 1., 1., 1., 1.])
# 0/0 is undefined; NumPy returns nan (not a number) instead of raising an error
1/arr
==>
array([ inf, 1. , 0.5 , 0.33333333, 0.25 ,
0.2 , 0.16666667, 0.14285714, 0.125 , 0.11111111])
# 1/0 gives infinity (inf), plus a warning
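If the warnings are expected and only add noise, one option is to silence them locally with `np.errstate`. This is a sketch of one possible approach, not something the course notes prescribe; the names `ratio` and `inverse` are illustrative.

```python
import numpy as np

arr = np.arange(0, 10)

# Suppress the divide-by-zero and invalid-value warnings only inside this block
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr    # 0/0 becomes nan
    inverse = 1 / arr    # 1/0 becomes inf

print(np.isnan(ratio[0]))    # True
print(np.isinf(inverse[0]))  # True
```

Because `nan` is not equal to anything, including itself, `np.isnan` is the way to test for it.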