Basics Flashcards
What are four basic data types in Python?
str = string
int = integer
float = number with decimal values
bool = boolean, True or False
Can a list contain different types of data?
Yes
At what number does list indexing start?
Zero, counting forward from the start of the list
-1, counting backward from the end of the list
List slicing syntax, e.g. 1:3. Is the 3 included?
start : end
No, the end is not included
What happens if you don’t specify the start or end when slicing a list?
e.g. mylist[:4] or mylist[2:]
If you don’t specify the start, the slice starts at index zero
If you don’t specify the end, the slice runs from the start index through the end of the list
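A minimal sketch of the indexing and slicing rules above. The list contents are made up for illustration.

```python
# Hypothetical example list, chosen so each element is easy to recognise.
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

first = mylist[0]      # indexing starts at zero
last = mylist[-1]      # -1 is the last element
middle = mylist[1:3]   # start is included, end is not: indexes 1 and 2
head = mylist[:4]      # no start given: begins at index zero
tail = mylist[2:]      # no end given: runs through the end of the list
```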
x = [2, 3, 4]
y = x
y[0] = 1
How do you prevent the first element changing to 1 in the original list x?
y = x only copies the reference, so both names point to the same list. Instead of y = x, use y = list(x) or y = x[:] to make a copy. Then changes to y will not affect x.
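A short sketch comparing plain assignment with the two copying options. The variable names other than x and y are made up for illustration.

```python
# Plain assignment: y is just another name for the same list object.
x = [2, 3, 4]
y = x
y[0] = 1
x_after_alias = list(x)    # snapshot of x: the change shows up here

# Copy with list(): y is a new list, so x is left alone.
x = [2, 3, 4]
y = list(x)
y[0] = 1
x_after_list_copy = list(x)

# Copy with a full slice: same effect as list(x).
x = [2, 3, 4]
y = x[:]
y[0] = 1
x_after_slice_copy = list(x)
```

Both copies are shallow: if the list itself contained other lists, those inner lists would still be shared.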
Can a numpy array contain elements with different types?
No. If you try, some of the elements’ types are changed so that the array ends up homogeneous (type coercion).
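A small sketch of type coercion in practice. The input values are made up for illustration.

```python
import numpy as np

# An int, a float and a bool together: everything becomes a float.
mixed_numbers = np.array([1, 2.5, True])
numbers_as_list = mixed_numbers.tolist()   # [1.0, 2.5, 1.0]

# A number together with a string: everything becomes a string.
mixed_text = np.array([1, 'a'])
text_as_list = mixed_text.tolist()         # ['1', 'a']
```

The dtype attribute (mixed_numbers.dtype, mixed_text.dtype) shows which single type NumPy settled on.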
Create a 2D NumPy Array
How are rows and columns indexed in a 2D NumPy Array?
How do you subset a 2D NumPy Array?
import numpy as np
np_2d = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8]])
Both rows and columns are indexed starting at 0 from the top left corner.
np_2d[0,2] or np_2d[0][2]: the first number is the row and the second the column
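A sketch of the subsetting options on the same array. The row, column and block slices go slightly beyond the card and are included as an assumption about what "subset" usually covers.

```python
import numpy as np

np_2d = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8]])

element = np_2d[0, 2]         # row 0, column 2
same_element = np_2d[0][2]    # same result, written as two steps
first_row = np_2d[0]          # whole first row
third_column = np_2d[:, 2]    # every row, column 2
block = np_2d[:, 1:3]         # every row, columns 1 and 2
```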
np_heights = np.array([191, 184, 185])
np_positions = np.array(['GK', 'M', 'A'])
gk_heights = np_heights[np_positions == 'GK']
What will the gk_heights array contain?
It will contain a single element: array([191]).
np_positions == 'GK' creates a boolean array (here [True, False, False]). Using it to index np_heights returns only the heights that correspond to True values in the boolean array.
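The same example as a runnable sketch, with the boolean array stored under its own name. The name mask and the inverted selection with ~ are additions for illustration, not part of the card.

```python
import numpy as np

np_heights = np.array([191, 184, 185])
np_positions = np.array(['GK', 'M', 'A'])

# Element-wise comparison gives one True/False per position.
mask = np_positions == 'GK'

# Indexing with the mask keeps only the heights where the mask is True.
gk_heights = np_heights[mask]

# ~ flips the mask, selecting everyone who is not a goalkeeper.
other_heights = np_heights[~mask]
```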
How do you make an x, y line plot?
Use matplotlib
import matplotlib.pyplot as plt
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]
plt.plot(year, pop)
plt.show()
Note that you need plt.show() to display the chart
How do you create a scatter plot?
import matplotlib.pyplot as plt
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]
plt.scatter(year, pop)
plt.show()
Note that you need plt.show() to display the chart
How do you make a histogram?
import matplotlib.pyplot as plt
values = [1, 2, 3, 4]
plt.hist(values, bins=2)
plt.show()
Note that bins is the number of equal-width intervals (bars) the data is divided into. matplotlib calculates the bin boundaries automatically from the range of the data.
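A sketch for checking which boundaries were chosen: plt.hist returns the counts per bin and the bin edges. The "Agg" backend line is an assumption so the sketch runs without a display; drop it and call plt.show() to see the chart.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

values = [1, 2, 3, 4]

# hist returns (counts per bin, bin edges, drawn bar objects).
counts, edges, _ = plt.hist(values, bins=2)

counts_list = [float(c) for c in counts]   # values in each bin
edges_list = [float(e) for e in edges]     # bins + 1 boundaries

plt.clf()  # clear the figure before the next plot
```

With two bins over the range 1 to 4, the boundaries fall at 1, 2.5 and 4, and each bin holds two values.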
How do you “clean up” a plot to start afresh?
plt.clf()
How do you add names to a plot’s axes when using matplotlib.pyplot?
How do you add a title?
plt.xlabel('label1')
plt.ylabel('label2')
plt.title('title')
How do you specify the numbers or ticks to display on an axis?
How do you change the name of the ticks on an axis?
plt.yticks([0, 2, 4, 6, 8, 10])
plt.yticks([0, 2, 4, 6, 8, 10], ['0', '2B', '4B', '6B', '8B', '10B'])
Note that the second list gives the tick names, which must match the listed ticks in number and order
How do you change an axis to logarithmic scale?
plt.scatter(x, y)
plt.xscale('log')
How do you change the size of the dots on a scatter plot to reflect a third variable?
plt.scatter(x, y, s=z)
How do you add gridlines?
plt.grid(True)
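A sketch that puts the customizations from the cards above into one plot. The x, y and z values are made up, and the "Agg" backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Hypothetical data: z controls the dot sizes.
x = [1, 10, 100, 1000]
y = [2, 4, 6, 8]
z = [20, 40, 80, 160]

plt.scatter(x, y, s=z)      # s sets the size of each dot
plt.xscale('log')           # logarithmic x axis
plt.xlabel('label1')
plt.ylabel('label2')
plt.title('title')
plt.yticks([0, 2, 4, 6, 8, 10], ['0', '2B', '4B', '6B', '8B', '10B'])
plt.grid(True)

# Read the settings back from the current axes to confirm they were applied.
ax = plt.gca()
summary = (ax.get_xscale(), ax.get_xlabel(), ax.get_ylabel(), ax.get_title())

plt.clf()  # clean up so the next plot starts afresh
```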
How do you create a dictionary?
Two facts about keys
dictionaryname = {key1:value1, key2:value2, … }
When you type in dictionaryname[key2], you get value2
Keys:
- Keys in a dictionary must be unique. If you state the same key twice, the dictionary retains only the last value stated
- Keys have to be “immutable” objects, i.e. objects that cannot be changed after they’re created. Strings, booleans, integers, and floats are immutable.
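A sketch of both facts about keys. The dictionary contents and variable names are made up for illustration.

```python
# Same key stated twice: only the last value is kept.
capitals = {'spain': 'madrid', 'france': 'paris', 'spain': 'barcelona'}
spain_value = capitals['spain']
number_of_keys = len(capitals)

# A list is mutable, so using one as a key raises a TypeError.
try:
    bad = {['a', 'b']: 1}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
```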
How do you access the “keys” in a dictionary?
dictionaryname.keys()
Note that it takes no arguments
This returns a view of all the keys in the dictionary
How do you add a key:value pair to a dictionary?
How do you change a key:value pair in a dictionary?
How do you delete a key:value pair?
dictionaryname[keytobeadded] = valuetobeadded
dictionaryname[key] = newvalueforkey
del dictionaryname[key]
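A sketch of adding, changing and deleting in sequence, then listing the keys that remain. The dictionary contents are made up for illustration.

```python
capitals = {'spain': 'madrid', 'germany': 'bonn'}

capitals['italy'] = 'rome'        # new key: the pair is added
capitals['germany'] = 'berlin'    # existing key: the value is replaced
del capitals['spain']             # the pair is removed

remaining_keys = list(capitals.keys())
```

Adding and changing use the same syntax. Which one happens depends only on whether the key already exists.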
How do you get a value that is in a nested dictionary?
e.g.
countries = {'spain': {'capital': 'madrid', 'population': 46.77},
             'france': {'capital': 'paris', 'population': 66.04}}
Extract the population of France
dictionaryname[key1][key2]
countries['france']['population']
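The nested lookup as a runnable sketch. Splitting it into two steps shows what each pair of brackets returns; the intermediate name france is an addition for illustration.

```python
countries = {'spain': {'capital': 'madrid', 'population': 46.77},
             'france': {'capital': 'paris', 'population': 66.04}}

# Step by step: the first key returns the inner dictionary.
france = countries['france']
population_two_steps = france['population']

# In one go: chain the two keys.
population = countries['france']['population']
```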
How do you create a Dataframe from a Dictionary?
How do you change the default row index numbers?
Make a dictionary where the keys will be the column labels (variables) of the DataFrame, and the values are lists holding each column’s data. Avoid naming it dict, since that hides Python’s built-in dict type:
data = {'country': ['Brazil', 'Russia'], 'capital': ['Brasilia', 'Moscow']}
Then,
import pandas as pd
dataframename = pd.DataFrame(data)
For index change (give exactly one label per row):
dataframename.index = ['row1', 'row2']
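A sketch of the whole sequence from dictionary to relabelled DataFrame. The names data and df are chosen here for illustration.

```python
import pandas as pd

# Keys become column labels, each list becomes that column's values.
data = {'country': ['Brazil', 'Russia'], 'capital': ['Brasilia', 'Moscow']}
df = pd.DataFrame(data)

default_index = list(df.index)    # rows are numbered from 0 by default

# Replace the default numbers with one label per row.
df.index = ['row1', 'row2']
new_index = list(df.index)
column_labels = list(df.columns)
```

Assigning a list whose length differs from the number of rows raises a ValueError.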