2 Flashcards
Python: The lines within each block ( : ) must be
Indented e.g.
if var_1 == True:
    print("it's true")
    break
Python: The function that places a value in the {} placeholder is
str.format(“value”) e.g. print(“my name is {}”.format(“Alen”))
Python: For a while loop, typing “while True:” will cause it to
loop forever, unless you add a “break” e.g.
if var_1 == 5:
    break
Python: For loops, the statement that will cause the loop to return to the top, without continuing the rest of the loop is called
continue
Python: To iterate/loop on every value in a list, use
for made_up_var in list_var:
    print(made_up_var)
Python: Python scripts run in sequence so
Do not try to use a variable that only gets assigned later in the script
Python: For functions with arguments you must
Send in a value or else you will get an error
Python: An argument, with regards to functions, is
A value you pass into the function when you run it. e.g. my_function(argument1, argument2)
Python: You must tell a function what values to return by typing
return var_1, var_2
Python: functions start with the word
def
Python: function names should only use
lower case letters and underscores
Python: To add another if statement after the first one use
elif e.g.
elif var_1 == "SHOW":
    show_list()
    continue
Python: After defining a function, to run it you must
call it e.g. my_function(argument)
Python: To import a python library, type
import name_of_library, into the python interpreter or top of script
Python: To call the “choice” function from the “random” library, type
random.choice(my_list)
Python: Can you create a new variable inside a function
Yes
Python: It’s recommended to do all the imports
at the beginning of the script
Python: You can nest if and else statements
inside other if and else commands
Python: To get an automatic list of numbers starting from zero, type
list(range(10))
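Python: You cannot reassign variables defined outside of a function
From within the function, unless you declare them with the global keyword. Reading them works, but passing values in as arguments is clearer.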
Python: To assign multiple variables simultaneously, type
var_1, var_2 = “Phil”, “Bill”
Python: Values on the right side of an assignment (=) always get
Evaluated first
Python: To insert a list as a single nested item in the middle of another list, type
list_1.insert(4, list_2)
The 4 denotes the index to insert at. To splice in the individual items instead, type list_1[4:4] = list_2
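A minimal sketch of the difference between nesting a list with insert() and splicing its items in with slice assignment. The sample lists are made up for illustration.

```python
list_1 = [0, 1, 2, 3, 4, 5]
list_2 = ["a", "b"]

# insert() places list_2 as ONE nested item at index 4.
nested = list_1.copy()
nested.insert(4, list_2)

# Assigning to an empty slice splices the items in individually.
spliced = list_1.copy()
spliced[4:4] = list_2

print(nested)   # [0, 1, 2, 3, ['a', 'b'], 4, 5]
print(spliced)  # [0, 1, 2, 3, 'a', 'b', 4, 5]
```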
Python: To generate a range of numbers, possibly for a loop, type
range(10)
Python: You can concatenate two lists together, without placing the second list within one index, by typing
[1, 2, 3] + [4, 5]
Python: Should you add a colon : after calling, not defining, a function?
No
Python: To add a new line to a string, type
\n
Python: To do arithmetic on a variable in a shorter way, type
var += 2
var -= 2
var *= 2
var /= 2
Python: To remove all white space from the beginning and end of a string, type
“My string”.strip()
Python: Strings that are placed in a list must
have “ “ around them. e.g.
[“a”, “b”, “c”]
Python: Strings are
Immutable
Python: Lists are
Mutable
Python: To remove one item from a list by its index, type
del my_list[2]
Python: The del statement does not work on
Strings, because they are immutable.
Python: To delete a list item by passing its value, type
my_list.remove(value)
Python: The remove() function only removes
The first instance of the passed value in the list
Python: When you use remove() on a value that does not exist it
throws an exception, so use within try/except
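A minimal sketch of both remove() behaviours on a made-up list: only the first matching item is removed, and a missing value raises ValueError, which try/except can catch.

```python
my_list = ["a", "b", "a"]

# Only the first "a" is removed; the second one stays.
my_list.remove("a")

# Removing a value that is not present raises ValueError.
try:
    my_list.remove("z")
except ValueError:
    was_missing = True
else:
    was_missing = False

print(my_list, was_missing)
```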
Python: To make a string lower case, type
"STRING".lower() or str.lower("STRING")
Python: To capitalize the first letter of a string, type
"string".capitalize() or str.capitalize("string")
Python: To remove a value from a list by its index and return it, type
my_list.pop(2)
Python: Inputs, input(), that are numbers should
Be converted to int() because input() is a string by default
Python: To return a portion of a list or string, type
my_list[3:4] or “my string”[1:6]
Python: To slice until the end of a string without knowing its length, type
my_string[1:] (leaving out the stop value slices through to the end)
Python: Slices, [:], do not alter a list, they
Return a copy of it
Python: To slice a list or string by returning steps that skip, type
my_list[1::2], Add an extra colon
Python: To return a string or list backwards using slice, type
my_string[::-1], make the skipping step a negative, and swap the start and end range [10:1:-1]
Python: Slices that start a range with a negative number
Count the start point backwards from the end of the string or list and start slicing from there. e.g. my_list[-3:] returns the last three items.
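The slicing cards above can be tried out together. This is a small sketch on a made-up list of letters; every slice returns a new list and leaves the original untouched.

```python
letters = list("abcdefgh")

first_three = letters[:3]         # start defaults to 0
every_other = letters[1::2]       # from index 1, stepping by 2
reversed_all = letters[::-1]      # negative step walks backwards
last_two = letters[-2:]           # negative start counts from the end
backwards_part = letters[5:1:-1]  # start after stop when stepping backwards

print(first_three, every_other, last_two, backwards_part)
```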
Python: Standard format for a function
my_list = list(range(3))
def first_4(my_iterable):
    four_arg = my_iterable[:4]
    return four_arg
first_4(my_list)
Python: Function checklist
The function def ends with :
The function is called somewhere
All of the arguments defined in the function are being passed in
All of the lines are indented the same
The variables referenced in the function are assigned in the function or passed in as arguments
It ends with return if the caller needs a value back
Python: Can you delete from the middle of a string?
No
Python: To delete a slice, type
del my_list[1:3]
Python: To replace a slice of a list with new items, type
my_list[4:7] = [“e”, “f”]
Python: The sections of the slice function are
[start:stop:step]
Python: To return the value associated with a key inside a dictionary, type
my_dict[“key_name”]
Python: To create a dictionary, type
my_dict = {“Key”: “Value”, “Key2”: “Value2”}
Python: You cannot return a key and value from a dictionary by a numeric index because
Dictionaries are looked up by key, not by position. The keys and values are not attributed to an index (and before Python 3.7 the order was not guaranteed).
Python: Can you create a dictionary as a value of a key in another dictionary?
Yes
Python: Can you create lists within lists?
Yes
Python: The append() method is not recommended for concatenating 2 disparate lists because
It places the entirety of the appended list into the very last index of the initial list and does not set the values into individual indexes of the initial list.
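A short sketch comparing the three ways of combining lists that these cards mention, on made-up data: append() nests, while extend() and + merge the items.

```python
base = [1, 2, 3]

appended = base.copy()
appended.append([4, 5])   # the whole list becomes the last item

extended = base.copy()
extended.extend([4, 5])   # each item is added individually, in place

combined = base + [4, 5]  # builds a new list, base is unchanged

print(appended)  # [1, 2, 3, [4, 5]]
print(extended)  # [1, 2, 3, 4, 5]
print(combined)  # [1, 2, 3, 4, 5]
```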
Python: To return the value of a key in a dictionary that is itself the value of a key in an outer dictionary, type
my_dict[“key_name”][“key_name2”]
Python: Returning a value from inside a function does not print it, it
Makes the function call evaluate to that value.
Python: Functions that need to hand a value back must end with
return (without it, the function returns None)
Python: Before saving new code always check presence of
All necessary colons and indentations.
Python: The format for a “for loop” that checks for the presence of each of a list’s items in a dictionary and then adds the items that are present to another list, is
present_in_list = []
for item in my_list:
    if item in my_dict:
        present_in_list.append(item)
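A runnable sketch of this pattern with made-up sample data. It uses append() to add each matching item, because extend() on a string would add its characters one by one. A list comprehension gives the same result in one line.

```python
my_list = ["apple", "kiwi", "pear"]
my_dict = {"apple": 3, "pear": 7}

present_in_list = []
for item in my_list:
    if item in my_dict:          # "in" checks the dictionary's keys
        present_in_list.append(item)

# Equivalent list comprehension.
same_result = [item for item in my_list if item in my_dict]

# What extend() does with a string: it adds each character separately.
characters = []
characters.extend("kiwi")

print(present_in_list, characters)
```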
Python: elif requires
a True value test in order to run
Python: else does not require
a test because it runs whenever the test on “if” was False
Python: Returning a value from inside a function does not print it, it
Makes the function call evaluate to the returned value. If there is more than one value, they are returned as a tuple.
Python: The boolean values True and False must be
capitalized.
Python: To fill placeholders in a string using the format method without knowing which order to put the values, you can assign key names to values by typing
"My name is {name_key} and I am {age_key} years old".format(age_key="22", name_key="Alen")
Python: To use the key name placeholders in the format method when the values are stored in a dictionary, type
my_dict = {“state”: “California”, “name”: “Alen”}
“I am {name} and I live in {state}”.format(**my_dict)
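A small sketch showing that named placeholders give the same result whether the values are passed as keyword arguments (in any order) or unpacked from a dictionary with two asterisks.

```python
template = "I am {name} and I live in {state}"

# Keyword arguments can be given in any order.
by_keyword = template.format(state="California", name="Alen")

# ** unpacks the dictionary into keyword arguments.
my_dict = {"state": "California", "name": "Alen"}
by_dict = template.format(**my_dict)

print(by_dict)  # I am Alen and I live in California
```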
Python: To create a new key for a dictionary, type
my_dict[“new_key_name”] = “value”
Python: To change the value of an existing dictionary key name, type
my_dict[“key_name”] = “new_value”
Python: To delete a key from a dictionary, type
del my_dict[“key_name”]
Python: To add and change values of multiple dict keys at once, type
my_dict.update({“job”: “Teacher”, “age”: “23”, “gender”: “male”})
Python: To create an empty dictionary, type
my_dict = {}
Pandas: To use pandas and be able to call it by a shorter name, type
import pandas as pd
Python: This returns a
def method():
    return var_1, var_2, var_3
method()
Tuple
Python: To return a tuple with the index and value from an iterable, type
enumerate(my_iterable)
Python: To unpack an enumerate()d tuple into a “for loop”, type
for pair in enumerate(my_iterable):
    print("This is item number {} and it is a {}".format(*pair))
or
for index, item in enumerate(my_iterable):
    print("This is item number {} and it is a {}".format(index, item))
or
for pair in enumerate(my_iterable):
    print("This is item number {} and it is a {}".format(pair[0], pair[1]))
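A runnable sketch of enumerate() on a made-up list. Each step yields an (index, value) tuple, with the index at position 0 and the value at position 1. The optional start argument changes the first index.

```python
my_iterable = ["apple", "pear"]

lines = []
for index, item in enumerate(my_iterable):
    lines.append("This is item number {} and it is a {}".format(index, item))

# enumerate() yields plain tuples, so they can also be collected directly.
pairs = list(enumerate(my_iterable))

# start=1 makes the numbering begin at 1 instead of 0.
numbered_from_one = list(enumerate(my_iterable, start=1))

print(lines)
```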
Python: One asterisk in the format function
Unpacks a tuple
Python: To unpack an item tuple into a “for loop”, type
for key, value in my_dict.items():
    print("This is the key {} and the value is {}".format(key, value))
Python: You can create a tuple with either
placing a comma between 2 values or tuple()
Python: If a function returns three values you can either
Pack them into one variable or unpack them into three variables separated by commas.
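A minimal sketch of packing and unpacking a three-value return. The function name summarize is hypothetical and only serves the illustration.

```python
def summarize(values):
    # Returning comma-separated values builds a single tuple.
    return min(values), max(values), sum(values)

# Pack: one variable holds the whole tuple.
packed = summarize([3, 1, 2])

# Unpack: three names, one per returned value.
lowest, highest, total = summarize([3, 1, 2])

print(packed)  # (1, 3, 6)
```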
Python: To capitalize the first letter of every word, type
"my string".title()
Python: To create a function that doesn’t need to be passed every parameter because there are defaults, type
def my_function(param_1="A", param_2="B", param_3="Var"):
    return param_1, param_2, param_3
my_function()
IPYNB: To launch the IPython Notebook from the console, type
ipython notebook (on current installs: jupyter notebook)
Python: To run a python script, type into the console
python3 ~/myfolder/script.py
Python: When using a “for loop” on a dictionary, the “item” variable only takes on the value of
The key, not the key value
Python: To use a “for loop” on a dictionary and have the “item variable” iterate on the dictionary’s values instead of the keys, type
for value in my_dict.values():
    print(value)
or
for key in my_dict:
    print(my_dict[key])
Python: To create a tuple, type
my_tuple = (1, 2, 3)
Python: For tuples, the parentheses
Aren’t required. Only the commas are required.
Python: A tuple is an
Immutable list that can be packed and unpacked.
Python: You can turn a list into a tuple by typing
my_tuple = tuple(my_iterable)
Python: To return the value at a certain index in a tuple, type
my_tuple[2]
To enter the python interpreter in the console, type
python
Python script file names end with
.py
to exit the python Interpreter you type
exit()
To exit the help(word) function, type
q
To look up the attributes and methods of a class use
dir(nameofclass)
To assign a user input to a variable, type
var_1 = input(“Whatever you want the prompt to the user to be?”)
To invoke a newer version of a language you installed, in the terminal type
e.g. python3
To create a new text file, type
nano new_name.py
The placeholder for the str.format() method is
{} e.g. “I’m {}, and you are {}”.format(“Alen”, “Mike”)
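The if and else statement lines must end with
: (also try and except)
The lines that come after the if and else statements must
Be indented the exact same amount within a block. The number of spaces doesn’t matter as long as it is consistent (4 spaces is the convention), and tabs and spaces should not be mixed.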
A number with a decimal is called a
Float
A number without a decimal is called an
Integer
You can turn a string to an integer and a float with
int(“55”) float(“2.2”)
You can turn a float into an integer, and an integer into a float with
int(5.5) and float(5)
True + False will evaluate to
1 because True has a value of 1 and False has a value of 0
To check if a string is not in a variable string
if "searchstring" not in user_num:
    print("not here")
To compare if two values are equal use
==
To try running something that might cause errors use
try:
    1 / 0
except ZeroDivisionError:
    print("script messed up")
You can check if a string is in another string or list by typing
“g” in “dog”
This would return True
To get more info on a function, type
Into the interpreter, help(str.split)
To return the length of a list, type
len(my_list)
To return an item in a list by its index, type
my_list[3]
To change the value of one index in a list, type
my_list[3] = “new value”
To separate all the letters of a string into separate list items, type
list(“my string”)
To separate the words in a string by white space, type
my_string.split()
To join a list with a delimiter, type
"_".join(my_list)
The extend() method
Appends the items of the second list onto the initial list and returns None. The second list is unchanged, but the initial list now contains the additional items.
Console: Servers usually do not have a
GUI
Console: The “~” in the command line stands for
The home directory e.g. users/student/
Console: Usually the first word in the command line is
The username you are signed in as
Console: To list the files in your current directory, type
ls
Console: To list the files in the current directory with more detail, like permissions, type
ls -l
Console: To list all the files in the current directory including the hidden dot files, type
ls -a
Console: To clear the screen, type
clear
Console: to list the files in another directory, type
ls user/student/folder/
Console: “Folders” is synonymous with
Directories
Console: To see your current directory, type
pwd
Console: The home directory (“~”) usually contains the folders
My Documents, Pictures, etc.
Console: To change current directory, type
cd ~/myfolder/ or cd users/student/myfolder
Console: To move up one directory, type
cd ..
Console: To see a previous command you typed, press
The up arrow
Console: To view the contents of a text file, type
less ~/myfolder/file.txt or cat ~/myfolder/file.txt
Console: To exit the “less” program, type
q
Console: To concatenate two disparate files, type
cat ~/myfolder/file.txt ~/myfolder/file2.txt
Console: To edit a text file, type
nano ~/myfolder/file.txt
Console: To use the menu at the bottom on nano you must hold
control
Console: To save in nano you must
ctrl x (exit) and then Y (yes) to the save prompt
Console: To Save As in nano you must
ctrl x (exit) and then Y (yes) to the save prompt, then change the name it prompts
Console: To rename a file or directory, type
mv hello.txt hi.txt or mv myfolder/ myfolder2/
Console: To refer to current directory in a path type
.
Console: When referencing directories always add
a slash at the end of the name
Console: To move and rename a file simultaneously, type
mv python.txt /users/student/myfolder/newname.txt
Console: To copy any file just type
cp myfolder/python.txt otherfolder/pythoncopy.txt
Console: To copy a directory with all of the files included in it to another location, type
cp -r ~/myfolder/ ~/myfolder2/
Console: To remove a file, use
rm myfolder/python.txt (for an empty directory, use rmdir myfolder/)
Console: Be careful since there is no undo for
rm
Console: To use rm on a directory with files inside it, type
rm -r ~/myfolder/
Console: To create a directory, type
mkdir name_of_directory/
Console: To make a nested directory, type
mkdir -p documents/myfolder/pictures/
Console: The permissions in ls -l are ordered by
Owner (user), Group, Others (everyone else)
Python: Are tuples associated with an index?
Yes
Pandas: To give a name to a Series, type
my_series.name=”Name of My Series”
Pandas: To give a name to an index, type
my_series.index.name=”Name of My Index”
Math: The mean is also called the
Average
Math: The mode is
The number in a sequence that occurs most frequently
Math: The median is
The number that is in the middle of the sequence if you put them all in order.
Math: If you are attempting to find the median for a sequence that has two numbers in the middle, you must
Find the mean for the two middle numbers
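The four Math cards above can be checked with Python's standard statistics module. This is a small sketch on a made-up sequence with an even number of items, so the median is the mean of the two middle numbers.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum is 30, divided by 6 items
median = statistics.median(data)  # middle pair is 3 and 5, so (3 + 5) / 2
mode = statistics.mode(data)      # 3 occurs most often

# With an odd number of items the median is simply the middle one.
odd_median = statistics.median([1, 5, 9])

print(mean, median, mode, odd_median)
```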
Pandas: The data in a Series is
Homogeneous. If you change one value from an int to a float they all become floats.
Pandas: If you use a dictionary as the data for a Series, it will
Automatically use the keys as the index and the values as the data.
Pandas: A Series is an
Ordered key-value store
Pandas: To multiply all the values in a Series, type
my_series * 2
Pandas: To return a slice of a Series by the label, type
my_series[“Thur”:”Sat”]
Pandas: To return a slice of a Series by the position, type
my_series[1:5]
Pandas: To return one value in a Series based on its index, type
my_series[4]
Pandas: To set the value of one index in a Series, type
my_series[3] = 188
Pandas: To return the median of a Series, type
my_series.median()
Pandas: To return the max of a Series, type
my_series.max()
Pandas: To change the values in a Series to the cumulative sum, type
my_series.cumsum()
Pandas: To return the values of a series enumerated, type
for idx, value in enumerate(my_series):
    print(idx, value)
Note: The reverse doesn’t work.
Pandas: To check if a key is in a series, type
“Tue” in my_series
Would return true or false
Pandas: To retrieve a Series value using a key or index, type
my_series[“Tue”]
Pandas: To set a value by the key in a Series, type
my_series["Wed"] = 22
Pandas: To loop over a Series or dictionary and return keys and values, type
for key, value in my_series.items():
    print(key, value)
Python: When using my_list[4] to return a value, the square brackets contents can be
Anything that evaluates to an integer. e.g. True (1), False (0)
Python: To return the position of a string inside another string, type
“Look in my string for the position”.find(“my string”)
Python: If the find() method cannot find the string inside the main string, it returns
-1
Python: The find() method only returns the position of
The first occurrence of the value you look for.
Python: You can make the find() method start searching only after a set position by typing
“Look in my string for the position”.find(“my string”, 4)
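A quick sketch that exercises the find() cards on the same sentence: the position of the first match, the -1 result for a miss, and the optional start position.

```python
text = "Look in my string for the position"

first_hit = text.find("my string")       # index where the match begins
not_found = text.find("xyz")             # -1 when there is no match
first_o = text.find("o")                 # only the first occurrence
o_after_4 = text.find("o", 4)            # search starts at index 4
too_late = text.find("my string", 9)     # match begins before index 9

print(first_hit, not_found, first_o, o_after_4, too_late)
```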
Console: When writing an absolute file or directory path (one that starts from the root), always start the path with
A slash
Python: To do exponentiation, type
2 ** 10
Python: To create a while loop function, type
def while_function: i = 0 while i
Python: To return the index of an item in a list by its value, type
my_list.index(“list_item”)
Note: If the item is not present, this raises a ValueError.
Python: For value tests (like in while, if), an empty list and a non-empty list evaluate to
Empty: False
Not Empty: True
Internet: A network is a
Group of entities that can communicate even though they are not all directly connected.
Internet: Latency is
The time it takes for a message to go from source to destination.
Internet: Bandwidth is
The amount of information that can be transmitted per unit time.
IPYNB: To exit the ipython notebook from the console, type
control C, Y, Enter
Console: To see the route a site takes in the network to get to you and the time, type
traceroute www.google.com
Python: To time the execution of a function, type
import time
def time_execution():
    start = time.perf_counter()
    eval("25 * 25")
    stop = time.perf_counter()
    execution = stop - start
    return execution
Python: Another way to evaluate code held in a string is
eval("2 * 2")
Math: Modulo (%) is the
Remainder after dividing. e.g 5 % 2 = 1
Python: To return a number associated with a single letter, type
ord(“A”)
Python: To return a letter associated with a single number, type
chr(114)
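A small sketch combining ord(), chr() and modulo from the cards above: shifting an uppercase letter through the alphabet and wrapping around at Z. The function name shift_upper is hypothetical.

```python
def shift_upper(letter, offset):
    # Position of the letter within the alphabet: A -> 0, B -> 1, ...
    position = ord(letter) - ord("A")
    # Modulo 26 wraps positions past Z back to the start.
    return chr(ord("A") + (position + offset) % 26)

code_of_a = ord("A")
letter_114 = chr(114)

print(code_of_a, letter_114, shift_upper("Z", 1))
```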
Pandas: A pandas DataFrame is
A spreadsheet with row and column labels.
Pandas: A DataFrame’s data in a column is
Homogeneous
Pandas: A DataFrame’s data can
Be any type
Pandas: To create a DataFrame, type
my_data_dict = {"Tokyo": [23, 43, 12, 65], "London": [3, 4, 27, 55], "Date": ["1/20", "1/21", "1/22", "1/23"]}
pandas.DataFrame(my_data_dict)
Pandas: Every key-value list in the data dictionary going into a DataFrame must
Be the same length.
Pandas: To return one column in a DataFrame (as a Series), type
df[“column name”] or
df.column_name
Pandas: The data type for the return of one column in a DataFrame is
a Series
Pandas: To set a DataFrame column to be the index, type
df.set_index(“column name”)
Note: Must use quotes.
Pandas: To return only the last n rows of a DataFrame, type
my_dataframe.tail(10)
Pandas: To return all unique values from a column along with their frequency, type
my_dataframe[“Column Name”].value_counts() or
my_dataframe.Column.value_counts()
Pandas: To create a new column that totals others, type
df[“total”] = df[“Column1”] + df[“Column2”] + df[“Column3”]
Plarium: The Tracker counts the ROI up until
The current moment for all the regs within the chosen time period.
Plarium: The Payments ROI Comparison tool counts ROI until
The end date that you set at midnight.
Plarium: For the end of month campaigns report
Use the Payments ROI Comparison tool.
Pandas: To use value_counts(), you must use it on
A column Series
Pandas: When importing a csv, you do not need to type
DataFrame
Pandas: To sort a DataFrame by the index, type
df.sort_index()
Pandas: To sort the columns of a DataFrame, type
df.sort_index(axis=1)
Pandas: When referring to a column make sure to
Spell it perfectly and see if it needs quotes.
Python: To open a browser automatically to a certain site from python, type
import webbrowser
webbrowser.get("firefox")
webbrowser.open("http://www.google.com")
Pandas: To sort a DataFrame by a column’s values, type
df.sort_values(by=["Converted clicks"], ascending=False)
Pandas: To sort a DataFrame by two columns’ values, type
df.sort_values(by=["Converted clicks", "Avg. position"], ascending=[False, True])
Pandas: The Series sort_values() method (formerly order()) returns a
Series
Pandas: To sort a DataFrame by its values you must use the
sort_values() method (sort_index() sorts by the index labels)
Chrome: To open last closed tab, type
Ctrl, Shift, T
Python: Before calling new variables, make sure to
Initialize them.
eg. my_var = 0
Pandas: The groupby object is not a
DataFrame. It is a dictionary where each unique value is a key and the value is the dataframe that has that value attributed.
Python: The print function must use
Parentheses
Pandas: The groupby() function
Splits the DataFrame into separate dataframe objects for every unique value in the passed in column.
Pandas: A groupby object is dict like because (2 reasons)
- The unique column values are keys and the rest of the values are key values.
- It is iterable.
Pandas: To create an empty DataFrame, type
df = pandas.DataFrame()
Numpy: To invoke pylab, type
%pylab inline
Pandas: To create a date range, type
days = pandas.date_range("2014-01-01", "2014-02-28", freq="D")
Pandas: To slice out a small range of the rows and columns, type
df.loc[2:45, "Madrid":"Boston"]
Note: .loc selects by label and .iloc by position; the older .ix was removed in pandas 1.0.
Pandas: To slice out specific rows and specific columns, type
df.loc[[5, 22, 31], ["Madrid", "Boston", "Shanghai"]]
Pandas: To slice out specific rows and all columns, type
df.loc[[5, 22, 31], :]
Pandas: To slice out all rows and a range of columns, type
df.loc[:, "Madrid":"Boston"]
Pandas: To transpose the columns and rows of a DataFrame, type
new_df = df.T
Pandas: To look at just a few of the columns in a DataFrame, type
new_df = df[["Bangladesh", "India", "Uganda"]]
Time: To make a python script wait for a certain time, type
import time
time.sleep(60)
Console: To see the running processes, type
top -o cpu
Console: To exit the “top” , type
q
Pandas: When writing a path starting with the home directory, you must
Begin it with a slash.
Pandas: Pandas automatically removes any excess
Blank rows and column from the CSV.
Pandas: To check the number of rows in your DataFrame, type
len(df.index)
Pandas: To change one column name, type
df = df.rename(columns={"old_name": "new_name"})
Pandas: To set a default maximum number of rows to display, type
pandas.set_option(“display.max_rows”, 10)
Pandas: In order for df.to_csv(“”) to work, df must be
a DataFrame or a Series
Pandas: In order to filter in pandas you must first create a
mask
Pandas: To apply your mask to your df (filter for), type
df[mask]
Pandas: To apply the inverse of your mask (filter out), type
df[numpy.invert(mask)]
Note: Must import numpy for this.
Pandas: When you apply a boolean index (mask) to a df, it only returns the rows that the boolean index had as
True
Pandas: To create a mask for one criteria, type
df[“Ad group”]==”Banner” or
df.Cost
Pandas: To create a mask for multiple criteria, type
mask = (df[“Ad group”]==”Banner”) & (df.Cost>200)
Matplotlib: To plot a line graph based on two df columns, type
df.plot(x=”Campaign”, y=”Cost”)
Pandas: To get stats on every df column at a glance, type
df.describe()
Pandas: To get a stat (e.g. median) for one column, type
df[“Column Name”].median()
Pandas: To create a groupby, type
df.groupby(“Column Name”)
Pandas: To remove the first row of a df, type
df = df.drop(df.index[:1])
Pandas: To change the value of an exact coordinate (row and column), type
df.loc[5, “Cost”] = 10
Pandas: To return the value of an exact coordinate (row and column), type
df.loc[24, “Campaign”]
Python: To pass a function value to the third parameter, while leaving the others as their defaults, type
def my_function(param_1="A", param_2="B", param_3="Var"):
    return param_1, param_2, param_3
my_function(param_3="New Value!")
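A runnable sketch of default parameters. The function body (returning the three parameters) is an assumption added so the effect of each call is visible.

```python
def my_function(param_1="A", param_2="B", param_3="Var"):
    # Hypothetical body: hand the parameters back so they can be inspected.
    return param_1, param_2, param_3

all_defaults = my_function()                      # every default is used
only_third = my_function(param_3="New Value!")    # keyword skips the first two
only_first = my_function("X")                     # positional fills param_1

print(all_defaults, only_third, only_first)
```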
Python: Can you add an equation to the return line at the end of a function?
Yes
Python: To apply a predefined function to every item in a list, in a short way, type
list(map(my_function, my_list))
Note: In Python 3, map() returns an iterator, so wrap it in list() to see the results.
Pandas: To remove the last row of a df, type
df = df.drop(df.index[-1:])
Pandas: To create a function meant to return a boolean index to be used as a filter, type
def my_filter(a_list):
    new_list = []
    for item in a_list:
        test = item > 5
        new_list.append(test)
    return new_list
Pandas: To create a pivot table and choose the rows, columns, values, and presence of grand totals, type
table = pandas.pivot_table(df,index=[“Manager”,”Status”],columns=[“Product”],values=[“Quantity”,”Price”],aggfunc={“Quantity”:len,”Price”:[numpy.sum,numpy.mean]},fill_value=0)
Pandas: To create a new column and make the values the return of a function acting on another column, type
df[“New Column”] = df[“Old Column”].apply(function_name)
Will be useful for data cleanup
Python: To use a list comprehension to alter the items in a list, type
[float(item) for item in my_list]
Pandas: To create a lambda function, type
lambda var_name: var_name**2
Pandas: To create a list comprehension with an if statement, type
[item for item in my_list if item>5]
Pandas: Can a lambda function take in multiple values?
Yes.
e.g. lambda var_name, var_name2: var_name*var_name2
Pandas: Lambda functions automatically return
The result of the expression after the colon
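A short sketch tying together the lambda, map() and list comprehension cards on a made-up list. The lambda is assigned to a name here only so it can be reused in the example.

```python
my_list = [3, 6, 9]

square = lambda var_name: var_name ** 2      # returns the expression's result
squares = list(map(square, my_list))         # apply it to every item

converted = [float(item) for item in my_list]        # alter every item
bigger_than_5 = [item for item in my_list if item > 5]  # keep only some items

# A lambda can take more than one value.
product = (lambda a, b: a * b)(4, 5)

print(squares, converted, bigger_than_5, product)
```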
Numpy: The method that returns a standard deviation is
my_numpy_array.std()
Pandas: To reverse the order of all the rows in a df, type
df.iloc[::-1]
Pandas: To combine two tables based on similar values in one column, like a lookup table, type
df.merge(df2, on=”Similar_Column_name”)
Pandas: For a merge() to work the columns that the tables will merge on must be
Labeled the same (otherwise use left_on and right_on).
Pandas: Values (in the column that two tables are merging on) that do not match exactly on both tables when merging are
Removed (with the default inner join)
Pandas: If a value in the merged on column of one table has a duplicate value that is also an exact match in the other table
The duplicate gets included in the merged table.
Pandas: When the merged column has 2 duplicate exact match values on both tables, it
Combines the rows in every possible combination, because it does not know which of the duplicates on one table matches with which of the duplicates of the other.
Excel: To switch 2 rows or cells
Select the cell or row and press Ctrl x, then select the destination cell or row and press Ctrl, Shift, =
Pandas: To replace an exact cell value in a specified entire column of a df with another value, type
df[“Column 1”] = df[“Column 1”].replace({“Fee”:”Fee Time”})
Pandas: To strip all “$” signs from one column, type
df[“Column 1”] = df[“Column 1”].apply(lambda x: x.strip(“$”))
Pandas: To strip any arrangement of a list of characters from the beginning of a string in a column, type
df[“Column 1”] = df[“Column 1”].map(lambda x: x.lstrip(“-+=&”))
or
df["Column 1"] = df["Column 1"].str.lstrip("-+=&")
Excel: To calculate correlation between 2 arrays, type
=correl(array_1, array_2)
Numpy: To filter a numpy array with a boolean index, type
numpy_array[numpy_array > 5]
Numpy: To use two boolean index filters on a numpy array, type
numpy_array[(numpy_array==5) | (numpy_array > 6)]
Numpy: To create a numpy array, type
numpy.array([1,2,3])
Numpy: To get the position of the max value in a numpy array, type
numpy_array.argmax()
Numpy: To create a range in a numpy array, type
numpy.arange(20,30,1)
Start,stop,step.
Numpy: Nan values in a numpy array
Screw up calculations and must be dealt with beforehand.
Requests: To scrape all of the html content of a page into a variable, type
import requests
page = requests.get(“http://www.scrape.com”)
Requests: To see all of the html scraped by requests in your variable, type
page.content
BS4: To put scraped data into BS from the var used by requests, type
from bs4 import BeautifulSoup
soup_page = BeautifulSoup(page.content, “html.parser”)
BS4: To print a BS content variable in a pretty way, type
print(soup_page.prettify())
BS4: To return all of the content contained in a certain html tag, type
soup_page = BeautifulSoup(page.content, "html.parser")
soup_page.find_all(“a”)
BS4: To return all the values of a certain parameter from a list of html tags, type
for item in soup_page.find_all("a"):
    print(item.get("href"))
BS4: To return all the anchor text for every link in a soup page, type
for item in soup_page.find_all("a"):
    print(item.text)
Python: To return the value of a dictionary key using a method, type
my_dict.get(“Key name”)
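A minimal sketch of how get() differs from square-bracket lookup on a made-up dictionary: a missing key gives None (or a default you supply) instead of raising KeyError.

```python
my_dict = {"Key name": "value"}

found = my_dict.get("Key name")             # same result as my_dict["Key name"]
missing = my_dict.get("Other key")          # None, no exception
with_default = my_dict.get("Other key", 0)  # second argument is the fallback

# Square brackets raise KeyError for a missing key.
try:
    my_dict["Other key"]
except KeyError:
    bracket_lookup_failed = True
else:
    bracket_lookup_failed = False

print(found, missing, with_default, bracket_lookup_failed)
```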
Requests: The requests library is an
HTTP library for fetching pages (the html it downloads can then be scraped)
BS4: The beautiful soup is an
html parser
Pandas: To add new rows to a DataFrame, use
concat
Pandas: To add new columns to a DataFrame that are a different length than the original and have pandas fill the missing data with nan, rather than deleting any rows, type
df1.join(df2, how=’outer’)
Pandas: To groupby by two columns, type
grouped = df.groupby([“Google Name”, “Adgroup”]).agg({“Regs”: numpy.sum, “Deposits Amount”: numpy.sum})
Pandas: To turn a column value that pandas thinks is an int/float to a string, type
df[“Campaign”] = df[“Campaign”].apply(lambda x: str(x))
Pandas: To turn a groupby into a flat table, type
grouped.reset_index()
Pandas: To remove all rows with “VALUE!”, inside one column of a data frame, use
A filter.
Pandas: To remove rows with duplicate values in a df column and keep the last one, type
df.drop_duplicates(subset="Column name", keep="last")
Pandas: To filter for df rows that contain certain values in a column, type
df[df[‘Column Name’].isin([3, 6])]
Pandas: To create a filter mask with “or” criteria, type
(df[“Column 1”] >= 5) | (df[“Column 2”] > 45)
Pandas: To create a groupby for multiple columns data, and aggregate one of the columns by two functions, type
df.groupby([“Campaign”, “Adgroup”]).agg({“Adgroup”: [numpy.size, numpy.mean]})
Pandas: The apply function does not require your
function to iterate.
Pandas: To return the row of a certain value in a column, type
df[df["Column name"] == "Value name"].index.tolist()[0]
Pandas: To turn a list of digit strings into a number, use
int("".join(my_list))
BS4: The find_all(“a”) method returns
a List of all the “a” tags.
Pandas: To use a text filter on a df, type
df[df[“Column name”].str.contains(“string”)]
OS: To return the current directory automatically, type
import os
path = os.getcwd()
BS4: To parse a site with potentially broken html, type
soup_page = BeautifulSoup(page, “html.parser”)
BS4: To use .get(“href”) and then place return into a list, type
new_list = []
for item in soup_page.find_all("a", {"class": "title may-blank "}):
    new_list.append(item.get("href"))
BS4: When using soup_page.find_all(“a”, {“class”:”name”})
The parameters must be an exact match
Selenium: To run a loop that will sometimes return errors, but you want it to continue, type
for item in site_list:
    try:
        my_browser.get(item)
    except Exception:
        pass
Python: To execute some code immediately after a for loop has iterated over every item, type
for item in range(1,10):
    print(item)
else:
    print("finished")
Note: The else block is skipped if the loop exits with break.
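A sketch of for/else in a search, where it is most useful: the else block runs only when the loop finished every item without hitting break. The function name find_first_even is hypothetical.

```python
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            result = n
            break            # leaving early skips the else block
    else:
        result = None        # runs only if the loop was never broken
    return result

print(find_first_even([1, 3, 4, 5]))
print(find_first_even([1, 3]))
```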
CLI stands for
command line interface
python: Python’s GIL is
a Global Interpreter Lock. A mechanism that prevents executions of multiple python bytecode instructions simultaneously.
http: SSL stands for
secure sockets layer
http: SSL is
an encrypted connection between server and client
bash: To remove a directory type
rm -r directory_name
python: When importing a file,
point to it from your current working directory
python: Do not name your python file
the same as a library name you are importing, otherwise it will import itself.
python: To execute a string that assigns a value to a variable, type
exec(“var_name = 1”)
python: var_name = 1 is not an expression, it is a
statement
python: To return the first value in a list that matches a criteria, type
next(x for x in my_list if x == 10)
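A minimal sketch using the built-in next() on a generator expression to get the first matching item from a made-up list. Passing a second argument to next() supplies a fallback, so a miss does not raise StopIteration.

```python
my_list = [4, 10, 15, 10]

# Stops at the first item that passes the test.
first_match = next(x for x in my_list if x > 5)

# The default (None here) is returned when nothing matches.
no_match = next((x for x in my_list if x > 100), None)

print(first_match, no_match)
```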
pandas: To import mongodb into pandas, use
pymongo
python: To sort a dict by key, type
sorted(my_dict.items(), key=lambda x: x[0])
python: my_dict.items() returns
a view where each item is a tuple that contains the key and value (wrap it in list() to get a list)
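A short sketch of sorting the (key, value) tuples from items() on a made-up dictionary. Index 0 of each tuple is the key and index 1 is the value, so the lambda picks which one to sort on.

```python
my_dict = {"pear": 3, "apple": 7, "kiwi": 1}

by_key = sorted(my_dict.items(), key=lambda pair: pair[0])
by_value = sorted(my_dict.items(), key=lambda pair: pair[1])

# dict() turns the sorted tuples back into a dictionary in that order.
sorted_dict = dict(by_key)

print(by_key)
print(by_value)
```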
python: To remove a df from memory (RAM), type
del df
import gc
gc.collect()