DATA STRUCTURES Flashcards

Question

scientists = [('Nikola', 'Tesla'), ('Charles', 'Darwin'), ('Marie', 'Curie')]
given_names, surnames = zip(*scientists)
print(given_names)
print(surnames)

Answer 1

('Nikola', 'Charles', 'Marie')
('Tesla', 'Darwin', 'Curie')

UNZIPPING: You can also unzip an object with the * operator, as zip(*scientists) does in the question. This operation unpacks the tuples in the original list element-wise into two tuples, separating the data into variables that can be manipulated further.
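A minimal sketch of the round trip, reusing the scientists list from the question. The name `rebuilt` is illustrative and does not appear on the original card.

```python
scientists = [('Nikola', 'Tesla'), ('Charles', 'Darwin'), ('Marie', 'Curie')]

# The * operator unpacks the list so each tuple becomes a separate argument to zip().
given_names, surnames = zip(*scientists)

# Zipping the two resulting tuples again rebuilds the original pairs.
rebuilt = list(zip(given_names, surnames))

print(given_names)
print(surnames)
print(rebuilt == scientists)
```

Because zip() returns an iterator, the sketch wraps it in list() before comparing it with the original list.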

Answer 2

0 a
1 b
2 c

The enumerate() function is another built-in Python function that allows you to iterate over a sequence while keeping track of each element’s index. Similar to zip(), it returns an iterator that produces pairs of indices and elements.

Answer 3

2 a
3 b
4 c

Note that the default starting index is zero, but you can set it to any integer when you call the enumerate() function. In this case, the number two was passed as an argument to the function, and the first element of the resulting iterator had an index of two. The enumerate() function is useful when an element’s place in a sequence must be used to determine how the element should be handled in an operation.
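The two behaviors above can be sketched as follows. The list `letters` is an assumed input, chosen to be consistent with the outputs shown on the cards.

```python
letters = ['a', 'b', 'c']  # assumed input, not taken from the original cards

# Default start: indices begin at zero.
for index, letter in enumerate(letters):
    print(index, letter)

# Custom start: the second argument sets the first index.
for index, letter in enumerate(letters, 2):
    print(index, letter)

# enumerate() returns an iterator, so wrap it in list() to see the pairs at once.
numbered = list(enumerate(letters, start=2))
print(numbered)
```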

Answer 4

numbers = [1, 2, 3, 4, 5]
new_list = [x + 10 for x in numbers]
print(new_list)

Answer 5

[11, 12, 13, 14, 15]

x + 10 is the expression, x is the element, and numbers is the iterable sequence. There is no condition.

Answer 6

[('E', 'n'), ('S', 'a')]

The list comprehension extracts the first and last letter of each word as a tuple, but only if the word is more than five letters long.
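A minimal sketch of a list comprehension with a condition. The word list here is hypothetical, since the original input is not shown, so the result differs from the output on the card.

```python
# Hypothetical word list, chosen only for illustration.
words = ['Python', 'list', 'tuples', 'set', 'dictionary']

# expression: (word[0], word[-1]) | element: word | condition: len(word) > 5
first_last = [(word[0], word[-1]) for word in words if len(word) > 5]

print(first_last)
```

Words of five letters or fewer ('list' and 'set') are skipped because they fail the condition.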

Answer 7

These tools make code more efficient by reducing the need to rely on loops to process data and by simplifying work with iterables. Understanding them will save you time and make data manipulation more flexible.

Answer 8

['Alameda', 'Sacramento']

Answer 9

['Alameda', 'Sacramento']

Answer 10

No. Each key can only correspond to a single value; so, for example, this will throw an error:

Answer 11

Yes. If you enclose multiple values within another single data structure, you can create a valid dictionary. For example: {'numbers': [1, 2, 3]}

Answer 12

[1, 2, 3]

To access a specific value in a dictionary, you must refer to its key using brackets:

Answer 13

dict_values([[1, 2, 3], ['a', 'b', 'c']])

To access all values in a dictionary, use the values() method:

Answer 14

{'nums': [1, 2, 3], 'abc': ['a', 'b', 'c'], 'floats': [1.0, 2.0, 3.0]}

Dictionaries are mutable data structures in Python. You can add to and modify existing dictionaries. To add a new key to a dictionary, use brackets:

Answer 15

True
False

To check if a key exists in a dictionary, use the in keyword:

Answer 16

{'nums': [1, 2, 3]}

To delete a key-value pair from a dictionary, use the del keyword:
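The add, check, and delete steps described on these cards might look like the sketch below. The variable name `my_dict` and the missing key `'xyz'` are assumptions; the keys and values are taken from the outputs shown.

```python
my_dict = {'nums': [1, 2, 3], 'abc': ['a', 'b', 'c']}

# Access a value by its key.
print(my_dict['nums'])

# Add a new key with bracket assignment.
my_dict['floats'] = [1.0, 2.0, 3.0]
print(my_dict)

# Check for keys with the in keyword.
has_nums = 'nums' in my_dict
has_xyz = 'xyz' in my_dict  # 'xyz' is a hypothetical key that was never added
print(has_nums)
print(has_xyz)

# Delete key-value pairs with del.
del my_dict['floats']
del my_dict['abc']
print(my_dict)
```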

Answer 17

dict_items([('nums', [1, 2, 3]), ('abc', ['a', 'b', 'c'])]) Dictionaries are a core Python class. As you’ve learned, classes package data with tools to work with it. Methods are functions that belong to a class. Dictionaries have a number of built-in methods that are very useful. Some of the most commonly used methods include:
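A short sketch of three such methods, items(), keys(), and values(), which correspond to the outputs on the neighboring cards. The name `my_dict` is an assumption.

```python
my_dict = {'nums': [1, 2, 3], 'abc': ['a', 'b', 'c']}

print(my_dict.items())   # view of (key, value) pairs
print(my_dict.keys())    # view of keys
print(my_dict.values())  # view of values

# items() is commonly used to loop over keys and values together.
lengths = {key: len(value) for key, value in my_dict.items()}
print(lengths)
```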

Answer 18

dict_keys(['nums', 'abc'])

Answer 19

dict_values([[1, 2, 3], ['a', 'b', 'c']])

Answer 20

{1, 2, '2'} It's a set. Notice that, in the preceding example, 2 and 2.0 are evaluated as equivalent, even though one is an integer and the other is a float.

Answer 21

{'apple', 2, (1, 2, 2, 2, 3)}

In this example, (1, 2, 2, 2, 3) is a tuple, which is hashable (≈ immutable) and thus treated as a distinct single element in the resulting set.

Answer 22

Error on line 2:
set(example_c)
TypeError: unhashable type: 'set'

The preceding example throws an error because each element of a set must be hashable (≈ immutable), but {'a', 'b', 'c'} is a set, which is a mutable (unhashable) object.

Answer 23

{'hamster', 'mother', 'father', 'elderberries'}

Answer 24

{'c', 'b', 'd', 'a'} {'c', 'b', 'd', 'a'} UNION

Answer 25

{1.5, frozenset({'a', 'c', 'b'})} Unlike example_c previously, this set does not throw an error. This is because it contains a frozenset, which is an immutable type and can therefore be used in sets.
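A minimal sketch contrasting the two cases. The contents of `example_c` are an assumption reconstructed from the error message, and `example_d` is a hypothetical name.

```python
# Assumed input: a list whose second element is a mutable (unhashable) set.
example_c = [1.5, {'a', 'b', 'c'}]

try:
    set(example_c)
    error_message = None
except TypeError as err:
    # The exact wording of the message can vary between Python versions.
    error_message = str(err)

# Freezing the inner set makes it hashable, so the outer set can hold it.
example_d = {1.5, frozenset({'a', 'b', 'c'})}

print(error_message)
print(frozenset({'a', 'b', 'c'}) in example_d)
```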

Answer 26

{'b', 'c'} {'b', 'c'} INTERSECTION

Answer 27

{'a'} {'a'} DIFFERENCE

Answer 28

{'d', 'a'} {'d', 'a'} SYMMETRIC DIFFERENCE
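The four set operations can be sketched with both the method and the operator form, which is why each card shows its result twice. The inputs `set1` and `set2` are assumptions chosen to be consistent with the outputs shown.

```python
set1 = {'a', 'b', 'c'}  # assumed input
set2 = {'b', 'c', 'd'}  # assumed input

# Union: elements in either set.
print(set1.union(set2), set1 | set2)

# Intersection: elements in both sets.
print(set1.intersection(set2), set1 & set2)

# Difference: elements in set1 that are not in set2.
print(set1.difference(set2), set1 - set2)

# Symmetric difference: elements in exactly one of the sets.
print(set1.symmetric_difference(set2), set1 ^ set2)
```

Sets are unordered, so the printed element order may differ from run to run.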

Answer 29

epa_tuples = list(zip(state_list, county_list, aqi_list))

Answer 30

Both a dictionary’s keys and values

Answer 31

import numpy as np

Answer 32

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

Answer 33

np.array([2, 4, 6])

Answer 34

[[1 2 3]
 [4 5 6]]

Answer 35

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

Answer 36

[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]

Answer 37

[[ 1.  1.]
 [ 1.  1.]]

Answer 38

[[ 8.  8.  8.]
 [ 8.  8.  8.]
 [ 8.  8.  8.]
 [ 8.  8.  8.]
 [ 8.  8.  8.]]

This creates an array of a designated shape that is pre-filled with a specified value.

These functions are useful for various situations:
To initialize an array of a specific size and shape, then fill it with values derived from a calculation
To allocate memory for later use
To perform matrix operations
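The cards do not show the calls that produced these arrays. A plausible sketch, assuming np.zeros(), np.ones(), and np.full() with shapes read off the outputs, is:

```python
import numpy as np

zeros = np.zeros((3, 2))        # 3 rows, 2 columns, filled with 0.0
ones = np.ones((2, 2))          # 2 rows, 2 columns, filled with 1.0
eights = np.full((5, 3), 8.0)   # 5 rows, 3 columns, filled with the given value

print(zeros)
print(ones)
print(eights)
```

The fill value 8.0 is an assumption made to match the float output shown; passing the integer 8 would produce an integer array instead.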

Answer 39

[[1 2 3]
 [4 5 6]]
[1 2 3 4 5 6]

Answer 40

[[1 2 3]
 [4 5 6]]
[[1 2]
 [3 4]
 [5 6]]

This gives a new shape to an array without changing its data.

Answer 41

[[1 2 3]
 [4 5 6]]
[[1 2]
 [3 4]
 [5 6]]

Passing -1 for one dimension of the new shape is a convenience: it tells NumPy to infer that dimension from the array's size and the other given dimensions.
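A minimal sketch of reshaping with and without -1. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

explicit = arr.reshape(3, 2)    # both dimensions given
inferred = arr.reshape(3, -1)   # NumPy infers the second dimension (6 / 3 = 2)
flat = arr.reshape(-1)          # a single -1 produces a 1-D array

print(explicit)
print(inferred)
print(flat)
```

Only one dimension can be -1, and the array's size must be evenly divisible by the product of the other dimensions.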

Answer 42

[[1 2 3]
 [4 5 6]]
[[1, 2, 3], [4, 5, 6]]

This converts an array to a list object. Multidimensional arrays are converted to nested lists.

Answer 43

[[1 2 3]
 [4 5 6]]
6
3.5
1
1.70782512766

NumPy arrays also have many methods that are mathematical functions:
ndarray.max(): returns the maximum value in the array or along a specified axis.
ndarray.mean(): returns the mean of all the values in the array or along a specified axis.
ndarray.min(): returns the minimum value in the array or along a specified axis.
ndarray.std(): returns the standard deviation of all the values in the array or along a specified axis.
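The methods listed above can be sketched on the same 2 x 3 array, including the optional axis argument. The names `col_max` and `row_mean` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Whole-array statistics.
print(arr.max())
print(arr.mean())
print(arr.min())
print(arr.std())

# Along an axis: axis=0 works down the columns, axis=1 across the rows.
col_max = arr.max(axis=0)
row_mean = arr.mean(axis=1)
print(col_max)
print(row_mean)
```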

Answer 44

[[1 2 3]
 [4 5 6]]
(2, 3)
int64
6
[[1 4]
 [2 5]
 [3 6]]

NumPy arrays have several attributes that enable you to access information about the array. Some of the most commonly used attributes include the following:
ndarray.shape: returns a tuple of the array’s dimensions.
ndarray.dtype: returns the data type of the array’s contents.
ndarray.size: returns the total number of elements in the array.
ndarray.T: returns the array transposed (rows become columns, columns become rows).
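A short sketch of these attributes on the same array. Note that they are attributes rather than methods, so they are accessed without parentheses.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)  # tuple of dimensions
print(arr.dtype)  # default integer type; this can differ by platform
print(arr.size)   # total number of elements
print(arr.T)      # transposed view of the array
```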

Answer 45

[[1 2 3]
 [4 5 6]]
[4 5 6]
2
6

Access individual elements of a NumPy array using indexing and slicing. Indexing in NumPy is similar to indexing in Python lists, except multiple indices can be used to access elements in multidimensional arrays.

Answer 46

[[1 2 3]
 [4 5 6]]
[[2 3]
 [5 6]]

Slicing may also be used to access subarrays of a NumPy array:
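The index expressions below are assumptions chosen to reproduce the outputs shown on the two cards; the original code is not shown.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

second_row = arr[1]      # one index selects a whole row
element_a = arr[0, 1]    # row 0, column 1
element_b = arr[1, 2]    # row 1, column 2
sub = arr[:, 1:]         # all rows, columns 1 onward

print(second_row)
print(element_a)
print(element_b)
print(sub)
```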

Answer 47

a:
[[1 2 3]
 [4 5 6]]
b:
[[1 2 3]
 [1 2 3]]
a + b:
[[2 4 6]
 [5 7 9]]
a * b:
[[ 1  4  9]
 [ 4 10 18]]

NumPy arrays support many operations, including mathematical functions and arithmetic. These include array addition and multiplication, which perform element-wise arithmetic on arrays:
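A minimal sketch of the element-wise arithmetic shown on this card, using the same arrays a and b. The names `total` and `product` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[1, 2, 3], [1, 2, 3]])

total = a + b     # adds matching positions
product = a * b   # multiplies matching positions (not matrix multiplication)

print(total)
print(product)
```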

Answer 48

[[1 2]
 [3 4]]
[[  1   2]
 [  3 100]]

NumPy arrays are mutable, but with certain limitations. For instance, an existing element of an array can be changed:

Answer 49

Arrays cannot be lengthened or shortened:

Error on line 5:
a[3] = 100
IndexError: index 3 is out of bounds for axis 0 with size 3
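Both behaviors can be sketched together. The array contents of `b` are an assumption (the card only shows that it has size 3), and np.append() is included as one way to get a longer array, which it does by returning a new array rather than resizing the original.

```python
import numpy as np

# Changing an existing element works in place.
a = np.array([[1, 2], [3, 4]])
a[1, 1] = 100
print(a)

# Assigning past the end does not lengthen the array.
b = np.array([1, 2, 3])  # assumed contents
try:
    b[3] = 100
    grew = True
except IndexError as err:
    grew = False
    print(err)

# np.append() returns a new, longer array and leaves b unchanged.
c = np.append(b, 100)
print(b, c)
```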

Answer 50

import pandas as pd

Answer 51

   col1  col2
0     1     3
1     2     4

DataFrame from a dictionary

Answer 52

   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

DataFrame from a NumPy array

Answer 53

DataFrame from a CSV

Answer 54

             A  B         C   D
row_0    alpha  1   coconut   6
row_1    apple  2     curse   7
row_2  arsenic  3   cassava   8
row_3    angel  4    cuckoo   9
row_4  android  5  clarinet  10

loc[] lets you select rows by name. Here’s an example:

Answer 55

A    apple
B        2
C    curse
D        7
Name: row_1, dtype: object

The row index of the dataframe contains the names of the rows. Use loc[] to select rows by name:

Answer 56

           A  B      C  D
row_1  apple  2  curse  7

Inserting just the row index name in selector brackets returns a Series object. Inserting the row index name as a list returns a DataFrame object:

Answer 57

             A  B         C   D
row_2  arsenic  3   cassava   8
row_4  android  5  clarinet  10

To select multiple rows by name, use a list within selector brackets:

Answer 58

             A  B        C  D
row_0    alpha  1  coconut  6
row_1    apple  2    curse  7
row_2  arsenic  3  cassava  8
row_3    angel  4   cuckoo  9

You can even specify a range of rows by named index:
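The loc[] selections on these cards might be written as in the sketch below. The dataframe is rebuilt from the values shown in the outputs; the exact construction code and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'A': ['alpha', 'apple', 'arsenic', 'angel', 'android'],
        'B': [1, 2, 3, 4, 5],
        'C': ['coconut', 'curse', 'cassava', 'cuckoo', 'clarinet'],
        'D': [6, 7, 8, 9, 10],
    },
    index=['row_0', 'row_1', 'row_2', 'row_3', 'row_4'],
)

one_row_series = df.loc['row_1']        # single label -> Series
one_row_frame = df.loc[['row_1']]       # list of labels -> DataFrame
two_rows = df.loc[['row_2', 'row_4']]   # several labels
row_range = df.loc['row_0':'row_3']     # label slice; the end label is included

print(one_row_series)
print(one_row_frame)
print(two_rows)
print(row_range)
```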

Answer 59

             A  B         C   D
row_0    alpha  1   coconut   6
row_1    apple  2     curse   7
row_2  arsenic  3   cassava   8
row_3    angel  4    cuckoo   9
row_4  android  5  clarinet  10

A    apple
B        2
C    curse
D        7
Name: row_1, dtype: object

iloc[] lets you select rows by numeric position, similar to how you would access elements of a list or an array. Here’s an example.

Answer 60

           A  B      C  D
row_1  apple  2  curse  7

Inserting just the row index number in selector brackets returns a Series object. Inserting the row index number as a list returns a DataFrame object:

Answer 61

             A  B         C   D
row_0    alpha  1   coconut   6
row_2  arsenic  3   cassava   8
row_4  android  5  clarinet  10

To select multiple rows by index number, use a list within selector brackets:

Answer 62

             A  B        C  D
row_0    alpha  1  coconut  6
row_1    apple  2    curse  7
row_2  arsenic  3  cassava  8

Specify a range of rows by index number:
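A matching sketch for iloc[], again rebuilding the dataframe from the values shown. The construction code and variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'A': ['alpha', 'apple', 'arsenic', 'angel', 'android'],
        'B': [1, 2, 3, 4, 5],
        'C': ['coconut', 'curse', 'cassava', 'cuckoo', 'clarinet'],
        'D': [6, 7, 8, 9, 10],
    },
    index=['row_0', 'row_1', 'row_2', 'row_3', 'row_4'],
)

one_row_series = df.iloc[1]      # single position -> Series
one_row_frame = df.iloc[[1]]     # list of positions -> DataFrame
three_rows = df.iloc[[0, 2, 4]]  # several positions
row_range = df.iloc[0:3]         # positional slice; the end position is excluded

print(one_row_series)
print(one_row_frame)
print(three_rows)
print(row_range)
```

Unlike a loc[] label slice, the iloc[] slice 0:3 stops before position 3, as with Python lists.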

Answer 63

row_0     coconut
row_1       curse
row_2     cassava
row_3      cuckoo
row_4    clarinet
Name: C, dtype: object

Column selection works the same way as row selection, but there are also some shortcuts to make the process easier. For example, to select an individual column, simply put it in selector brackets after the name of the dataframe:

Answer 64

             A         C
row_0    alpha   coconut
row_1    apple     curse
row_2  arsenic   cassava
row_3    angel    cuckoo
row_4  android  clarinet

And to select multiple columns, use a list in selector brackets:

Answer 65

row_0      alpha
row_1      apple
row_2    arsenic
row_3      angel
row_4    android
Name: A, dtype: object

Dot notation: It’s possible to select columns using dot notation instead of bracket notation. For example:

Answer 66

             A  B         C   D
row_0    alpha  1   coconut   6
row_1    apple  2     curse   7
row_2  arsenic  3   cassava   8
row_3    angel  4    cuckoo   9
row_4  android  5  clarinet  10

       B   D
row_0  1   6
row_1  2   7
row_2  3   8
row_3  4   9
row_4  5  10

Note that when using loc[] to select columns, you must specify rows as well. In this example, all rows were selected using just a colon (:).

Answer 67

       B   D
row_0  1   6
row_1  2   7
row_2  3   8
row_3  4   9
row_4  5  10

Similarly, you can use iloc[] notation. Again, when using iloc[], you must specify rows, even if you want to select all rows:

Answer 68

             A        C
row_0    alpha  coconut
row_1    apple    curse
row_2  arsenic  cassava

Both loc[] and iloc[] can be used to select specific rows and columns together.

Answer 69

             A  B         C
row_2  arsenic  3   cassava
row_4  android  5  clarinet

Again, notice that when using loc[] to select a range, the final element in the range is included in the results.
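The column and row-plus-column selections above might be written as follows. The selector expressions are assumptions chosen to reproduce the outputs shown, and the dataframe is rebuilt from the card values.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'A': ['alpha', 'apple', 'arsenic', 'angel', 'android'],
        'B': [1, 2, 3, 4, 5],
        'C': ['coconut', 'curse', 'cassava', 'cuckoo', 'clarinet'],
        'D': [6, 7, 8, 9, 10],
    },
    index=['row_0', 'row_1', 'row_2', 'row_3', 'row_4'],
)

col_c = df['C']                  # single column -> Series
cols_ac = df[['A', 'C']]         # list of columns -> DataFrame
col_a = df.A                     # dot notation

cols_loc = df.loc[:, ['B', 'D']]   # all rows, columns by name
cols_iloc = df.iloc[:, [1, 3]]     # all rows, columns by position

sub_iloc = df.iloc[0:3, [0, 2]]                  # rows 0-2, columns A and C
sub_loc = df.loc[['row_2', 'row_4'], 'A':'C']    # label slice includes 'C'

print(sub_iloc)
print(sub_loc)
```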

Answer 70

Error on line 1:
print(df.loc[0:3, ['D']])

Note that, when using rows with named indices, you cannot mix numeric and named notation. For example, the following code will throw an error:

Answer 71

       D
row_0  6
row_1  7
row_2  8

row_0    6
row_1    7
row_2    8
Name: D, dtype: int64

To view rows [0:3] at column 'D' (if you don’t know the index number of column D), you’d have to use selector brackets after an iloc[] statement:

Answer 72

         A  B         C   D
0    alpha  1   coconut   6
1    apple  2     curse   7
2  arsenic  3   cassava   8
3    angel  4    cuckoo   9
4  android  5  clarinet  10

However, in many (perhaps most) cases your rows will not have named indices, but rather numeric indices. In this case, you can mix numeric and named notation. For example, here’s the same dataset, but with numeric indices instead of named indices.

Answer 73

   D
0  6
1  7
2  8
3  9

Notice that the rows are enumerated now. Now, this code will execute without error:

Answer 74

Boolean masking

Answer 75

   moons   planet  radius_km
0      0  Mercury       2440
1      0    Venus       6052
2      1    Earth       6371
3      2     Mars       3390
4     80  Jupiter      69911
5     83   Saturn      58232
6     27   Uranus      25362
7     14  Neptune      24622

Answer 76

0     True
1     True
2     True
3     True
4    False
5    False
6    False
7     True
Name: moons, dtype: bool

Answer 77

   moons   planet  radius_km
0      0  Mercury       2440
1      0    Venus       6052
2      1    Earth       6371
3      2     Mars       3390
7     14  Neptune      24622

Answer 78

   moons   planet  radius_km
0      0  Mercury       2440
1      0    Venus       6052
2      1    Earth       6371
3      2     Mars       3390
7     14  Neptune      24622

Answer 79

   moons   planet  radius_km
0      0  Mercury       2440
1      0    Venus       6052
2      1    Earth       6371
3      2     Mars       3390
7     14  Neptune      24622

Answer 80

0    Mercury
1      Venus
2      Earth
3       Mars
7    Neptune
Name: planet, dtype: object

Answer 81

0     True
1     True
2     True
3     True
4     True
5     True
6    False
7    False
Name: moons, dtype: bool

Answer 82

   moons   planet  radius_km
0      0  Mercury       2440
1      0    Venus       6052
2      1    Earth       6371
3      2     Mars       3390
4     80  Jupiter      69911
5     83   Saturn      58232

Answer 83

   moons   planet  radius_km
5     83   Saturn      58232

Answer 84

   moons   planet  radius_km
5     83   Saturn      58232
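The boolean-masking cards show only outputs. The sketch below uses conditions that are assumptions chosen to be consistent with those outputs; the original thresholds are not shown.

```python
import pandas as pd

planets = pd.DataFrame({
    'moons': [0, 0, 1, 2, 80, 83, 27, 14],
    'planet': ['Mercury', 'Venus', 'Earth', 'Mars',
               'Jupiter', 'Saturn', 'Uranus', 'Neptune'],
    'radius_km': [2440, 6052, 6371, 3390, 69911, 58232, 25362, 24622],
})

# A comparison on a column returns a Boolean Series (the mask).
mask = planets['moons'] < 20

# Indexing with the mask keeps only the rows where it is True.
few_moons = planets[mask]
names = planets.loc[mask, 'planet']

# Combine conditions with | (or), & (and), ~ (not); wrap each one in parentheses.
either = planets[(planets['moons'] < 10) | (planets['moons'] > 50)]
saturn = planets[(planets['moons'] > 20)
                 & ~(planets['moons'] == 80)
                 & ~(planets['radius_km'] < 50000)]

print(few_moons)
print(names)
print(either)
print(saturn)
```

The parentheses matter because |, &, and ~ bind more tightly than comparison operators in Python.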

Answer 85

   color  mass_g  price_usd   type
0    red     125         20  pants
1   blue     440         35  shirt
2  green     680         50  shirt
3   blue     200         40  pants
4  green     395        100  shirt
5    red     485         75  pants

Answer 86

       mass_g  price_usd
type
pants   270.0  45.000000
shirt   505.0  61.666667

Answer 87

             mass_g  price_usd
type  color
pants blue      200         40
      red       125         20
shirt blue      440         35
      green     395         50

Answer 88

type   color
pants  blue     1
       red      2
shirt  blue     1
       green    2
dtype: int64

To simply return the number of observations there are in each group, use the size() method. This will result in a Series object with the relevant information:
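A minimal groupby() sketch that rebuilds the clothes data from the values shown. The construction code and the result names are assumptions. The numeric columns are selected explicitly before calling mean(), because recent pandas versions raise an error when asked to average a text column such as color.

```python
import pandas as pd

clothes = pd.DataFrame({
    'type': ['pants', 'shirt', 'shirt', 'pants', 'shirt', 'pants'],
    'color': ['red', 'blue', 'green', 'blue', 'green', 'red'],
    'price_usd': [20, 35, 50, 40, 100, 75],
    'mass_g': [125, 440, 680, 200, 395, 485],
})

# Mean of the numeric columns for each clothing type.
type_means = clothes.groupby('type')[['mass_g', 'price_usd']].mean()

# Minimum values for each (type, color) combination.
group_mins = clothes.groupby(['type', 'color'])[['mass_g', 'price_usd']].min()

# Number of rows in each (type, color) group, returned as a Series.
group_sizes = clothes.groupby(['type', 'color']).size()

print(type_means)
print(group_mins)
print(group_sizes)
```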

Answer 89

count(): The number of non-null values in each group
sum(): The sum of values in each group
mean(): The mean of values in each group
median(): The median of values in each group
min(): The minimum value in each group
max(): The maximum value in each group
std(): The standard deviation of values in each group
var(): The variance of values in each group

Answer 90

      price_usd      mass_g
           mean  max   mean  max
color
blue       37.5   40  320.0  440
green      75.0  100  537.5  680
red        47.5   75  305.0  485

The items in clothes are grouped by color, then the mean() and max() functions are applied to each group's price_usd and mass_g columns.
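One way to produce a table like this is agg() with a dictionary that maps each column to a list of functions. This is a sketch under that assumption; the original call is not shown, and the name `summary` is illustrative.

```python
import pandas as pd

clothes = pd.DataFrame({
    'type': ['pants', 'shirt', 'shirt', 'pants', 'shirt', 'pants'],
    'color': ['red', 'blue', 'green', 'blue', 'green', 'red'],
    'price_usd': [20, 35, 50, 40, 100, 75],
    'mass_g': [125, 440, 680, 200, 395, 485],
})

# Apply mean and max to each listed column within every color group.
summary = clothes.groupby('color').agg({
    'price_usd': ['mean', 'max'],
    'mass_g': ['mean', 'max'],
})

print(summary)

# The result has two-level column labels, so a cell is addressed with a tuple.
print(summary.loc['blue', ('price_usd', 'mean')])
```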

Answer 91

            mass_g      price_usd
              mean  min      mean min
color type
blue  pants  200.0  200      40.0  40
      shirt  440.0  440      35.0  35
green shirt  537.5  395      75.0  50
red   pants  305.0  125      47.5  20

Answer 92

             mean  min
color type
blue  pants  40.0   40
      shirt  35.0   35
green shirt  75.0   50
red   pants  47.5   20

Answer 93

color  type
blue   pants    40
       shirt    35
green  shirt    50
red    pants    20
Name: (price_usd, min), dtype: int64

Answer 94

      mass_g      price_usd
        mean  min      mean min
type
pants  200.0  200      40.0  40
shirt  440.0  440      35.0  35

Answer 95

mass_g     mean    537.5
           min     395.0
price_usd  mean     75.0
           min      50.0
Name: (green, shirt), dtype: float64

Answer 96

   color   type  mass_g  price_usd
0   blue  pants   200.0       40.0
1   blue  shirt   440.0       35.0
2  green  shirt   537.5       75.0
3    red  pants   305.0       47.5

Answer 97

cities.insert(1, 'Mumbai')

Answer 98

employees.items()

Answer 99

difference()

Answer 100

reshape()

The reshape() method in NumPy is used to change the shape of an array without altering its data. In this case, if the array has three rows and two columns, the data professional can use reshape() to transform it into two rows and three columns.

Answer 101

sales['Price'].max()

Answer 102

iloc[]

iloc[] is used to select rows and columns based on their integer index positions, rather than their labels. You can specify the row and column indices to select the desired subset.