3 Flashcards

1
Q

VC: The most senior person in a VC firm is called a

A

managing director or general partner.

2
Q

VC: The managing director or general partner

A

makes the final decision on which companies to invest in and sits on the board of directors of those companies.

3
Q

HostGator: To allow remote sql connections

A

Click “Remote MySQL” and enter the IP address of the computer you want to allow.

4
Q

HostGator: To create a database, a new user, and give permissions

A

click “MySQL Database Wizard” and go through it.

5
Q

HostGator: To upload a csv

A

Click the desired database, then click the Import tab. Select the csv file, choose CSV as the file format and, if necessary, tick the checkbox for “first line contains table columns”.

6
Q

DB: A simple DB is called

A

sqlite

7
Q

Windows: To create a new file, type

A

echo.>file_name.py

8
Q

Windows: To see current directory, type

A

echo %cd%

9
Q

Windows: To change directory, type

A

cd c:/users/alen.solomon/desktop

10
Q

Windows: To list all the files in current directory, type

A

dir /b

11
Q

Pandas: Non printable characters can cause a “ValueError: No columns to parse from file” and can be fixed by

A

adding the parameter encoding="utf-16" to read_csv

12
Q

Adwords: When exporting a report from adwords, make sure not to use

A

Excel.csv format

13
Q

Pandas: To choose a delimiter for read_csv, type

A

sep="\t"

14
Q

Pandas: To turn an excel file to a df, type

A
xl = pandas.ExcelFile("path/to.xlsx")
df = xl.parse("Sheet name")
15
Q

Pandas: To change column values with a dollar sign to a float, type

A

df["Cost"] = df["Cost"].apply(lambda x: str(x).strip("$")).astype(float)

16
Q

Pandas: When you use df[“Column name”].apply(my_function)

A

the function name does not need to end with ()

17
Q

Pandas: To return only certain columns of a df in a set order, type

A

df = df.loc[:, ["Abbrev", "Jan", "Feb", "Mar", "Total"]]

18
Q

Pandas: To replace all nans with something, type

A

df["Column name"] = df["Column name"].fillna("Replacement")

19
Q

Pandas: To write a couple dfs to two sheets in excel, type

A

with pandas.ExcelWriter("/users/student/desktop/demo.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(writer, index=False, sheet_name="Sheet1")
    df2.to_excel(writer, index=False, sheet_name="Sheet2")

20
Q

Windows: To clear the cmd screen, type

A

CLS

21
Q

Numpy: To turn a numpy array into a matrix, type

A

my_numpy_array.reshape(5,5)

Note: 5,5 are the dimensions (rows, columns), so the array must hold exactly 25 values.

22
Q

Numpy: To create a boolean index matrix from two numpy matrices, type

A

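my_numpy_matrix < my_other_numpy_matrix

Note: any elementwise comparison (>, ==, !=, ...) between two matrices of the same shape returns a boolean matrix.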

23
Q

Numpy: To turn a numpy matrix back into an array, type

A

my_numpy_matrix.ravel()

24
Q

Selenium: To import selenium, type

A

from selenium import webdriver

25
Q

Selenium: To send selenium to a site, type

A

my_browser.get("http://google.com")

26
Q

Selenium: To instantiate Chrome, type

A

my_browser = webdriver.Chrome()

27
Q

Selenium: To return the title tag, type

A

my_browser.title

28
Q

Selenium: To find an element on a page by its id, type

A

from selenium.webdriver.common.by import By
my_element = my_browser.find_element(By.ID, "lst-ib")

29
Q

Selenium: To press keyboard into a page element, type

A

my_browser.find_element(By.NAME, "aw").send_keys("Send these keys")

30
Q

Selenium: To close the browser you instantiated, type

A

my_browser.quit()

31
Q

Selenium: To submit in an element, type

A

my_element.submit()

32
Q

Selenium: To simulate an arrow key press, type

A

from selenium.webdriver.common.keys import Keys
my_element.send_keys("my text", Keys.ARROW_DOWN)

33
Q

Selenium: To empty a text field, type

A

my_browser.find_element(By.ID, "field").clear()

34
Q

Selenium: To click a submit button, type

A

my_browser.find_element(By.ID, "submit").click()

35
Q

Selenium: To select an element by its class name, type

A

my_browser.find_element(By.CLASS_NAME, "date-box")

36
Q

HTML: The <select> tag is used for

A

Drop-down menus

37
Q

Selenium: To find the first form on a page through its XPath, type

A

my_browser.find_element(By.XPATH, "//form[1]")

Note: the leading // searches the whole document for form tags at any depth, so the path does not have to start from the html and body tags.

38
Q

Selenium: An easier way to find the submit form button and press it is

A

my_browser.find_element(By.ID, "pswrd").submit()

39
Q

Selenium: To click a checkbox xpath, type

A

my_browser.find_element(By.XPATH, ".//*[@id='facebook']/body/div[2]/label[2]").click()
Note: Find xpath using firepath.

40
Q

Selenium: The best way to find a checkbox is

A

xpath

41
Q

Selenium: Sometimes to make an xpath work you need to

A
remove the [@id="js_5x"] from
//*[@id="js_5x"]/div/ul/li[6]/a/span/span
42
Q

Selenium: To get the xpath in chrome

A

Inspect element twice and then right click and click “copy xpath”

43
Q

Selenium: When looking for the path for checkboxes or buttons, often

A

the label, or the outer ridge of button outside of the button text shows the correct path in inspect element rather than the button itself.

44
Q

Selenium: When uploading a file do not

A

click on the upload button.

45
Q

Selenium: To upload a file, type

A

my_browser.find_element(By.XPATH, "//*/body/div[6]/input").send_keys("/users/student/desktop/report.csv")

46
Q

Selenium: To select a list item in a drop-down menu, type

A

from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
select = Select(my_browser.find_element(By.ID, "dropdownid"))
select.select_by_visible_text("Jan")

47
Q

Selenium: To make the browser continue looking for a page element if it is not there for a set amount of time, type

A

my_browser.implicitly_wait(50)

48
Q

Selenium: implicitly_wait(30) only has to be

A

set once and all tests for this browser will have the wait.

49
Q

BS4: To parse a soup_page for an html tag with a certain parameter (e.g. css class name), type

A

soup_page.find_all("div", {"class": "name"})

50
Q

BS4: To return a list of all the top level html tags contents separated, type

A

soup_page.contents

51
Q

BS4: To return the first item in a list of all the top level html tags contents separated, type

A

soup_page.contents[0]

52
Q

BS4: To return the first item in a list of all the top level html tags contents separated and then the first tags contents from within that list item, type

A

soup_page.contents[0].contents[0]

53
Q

Python: To use a list comprehension to return a boolean index, type

A

my_list = [1,2,3,4,5,6,7,8]

[item>3 for item in my_list]
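
A minimal sketch of how such a boolean index might then be used to filter the original list. The names mask and kept are illustrative and not part of the card.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8]

# One boolean per item: True where the item passes the test.
mask = [item > 3 for item in my_list]

# Pair each item with its flag and keep only the items flagged True.
kept = [item for item, keep in zip(my_list, mask) if keep]
```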

54
Q

IPYNB: To make a graph show up, type

A

%matplotlib inline
import matplotlib.pyplot as plt

df.plot(x="Column", y="Column2", kind="scatter")

55
Q

Pandas: A series is like a

A

One dimensional array with an index.

56
Q

Pandas: To set your own index to a Series or df, type

A
my_index_list = list(range(20))
my_series.index = my_index_list
57
Q

Pandas: To slice a section of a Series by its index, type

A

my_series.loc[3:6]

Note: .loc slices by index label and includes the end label.

58
Q

Pandas: In order to set a new index for a df or Series, the index list must

A

Be the same length as the df or Series.

59
Q

BS4: To put the page source of a selenium page into a soup_page, type

A

page = my_browser.page_source

soup_page = BeautifulSoup(page, "html.parser")

60
Q

Pandas: To return the length and width of a df in a tuple, type

A

df.shape

61
Q

Pandas: To see how many unique values are in a column, type

A

df["column name"].nunique()

Note: .unique() returns the distinct values themselves rather than how many there are.

62
Q

Pandas: Even after you change the index to something other than 0 onward, it will

A

still allow you to slice by the index number

63
Q

Pandas: To return specific indexes from a Series, type

A

my_series[[ 6, 9, 2]]
or
my_series[[ “A”, “C”, “G”]]

64
Q

Pandas: If you change the index of a Series to new numbers, you cannot select by the original zero-based positions unless you use

A

my_series.iloc[0: 5]
or
my_series.iloc[[4, 6, 9]]

65
Q

Pandas: The difference between .iloc and .loc when the index is numerical but doesn't match the zero-based positions is

A

.iloc uses the zero-based position, while .loc uses the current index labels.

66
Q

Pandas: To check if any value in a Series passes a value test, type

A

(my_series > 50).any()

67
Q

Pandas: To check if all values in a Series pass a value test, type

A

(my_series > 50).all()
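
A short sketch of why the test belongs on the boolean mask rather than on the filtered values, assuming pandas is installed. The sample values are made up for illustration.

```python
import pandas as pd

my_series = pd.Series([10, 60, 80])

# Boolean Series, one flag per row: False, True, True.
mask = my_series > 50

any_pass = bool(mask.any())    # at least one value is above 50
all_pass = bool(mask.all())    # not every value is above 50
count_pass = int(mask.sum())   # True counts as 1

# Pitfall: filtering first keeps only 60 and 80, and .all() then checks
# whether those remaining values are truthy, not whether every row passed.
misleading = bool(my_series[mask].all())
```

Here misleading comes out True even though one value fails the test, which is why the mask itself is the thing to call .any() or .all() on.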

68
Q

Pandas: To sum how many values in a Series pass a value test, type

A

sum(my_series > 5)

69
Q

Pandas: Can you do a step when slicing a Series

A

Yes

70
Q

Pandas: df[“Campaign name”] is the data type

A

Series

71
Q

Pandas: To create a copy of a df, type

A

df_copied = df.copy()

72
Q

Pandas: To add the rows of one df to another that has the same columns, type

A

df_concated = pandas.concat([df, df2])

73
Q

Pandas: With regard to nan values, pandas will

A

skip them by default in aggregations such as sum() and mean() (skipna=True)

74
Q

Pandas: When you concatenate two dfs the index will

A

be maintained, so if both dfs were zero-indexed, there will be duplicate index values.

75
Q

Pandas: To return the length of a df column, type

A

len(df["Column name"])

Note: .count() returns only the number of non-nan values.

76
Q

Pandas: Any action you can perform on a Series can also be performed on a

A

df[“Column name”]

77
Q

Pandas: To forward fill the na values in a Series, type

A

concated_df.ffill()

78
Q

Pandas: To backwards fill the na values in a Series, type

A

concated_df.bfill()

79
Q

Pandas: To replace the index of a concated df with one that is ordered, type

A

concated_df = concated_df.reset_index(drop=True)

80
Q

Pandas: df["column 1"] + df["column 2"] is an

A

index-based arithmetic. Values are matched by index label rather than by position, and labels found in only one of the two Series produce nan.

81
Q

Pandas: When multiplying two Series, if there are many indices with the same value, pandas will

A

multiply every combination of the values and add new rows for each product.

82
Q

Pandas: If you decide to create a DataFrame with lists rather than dicts, the column labels will

A

Just be set to zero index numbers.

83
Q

Pandas: To name unnamed columns, type

A

df.columns = ["Column 1", "Column 2"]

84
Q

Pandas: To filter for indexes in a pivot_table, type

A

pvt.query('Type == ["Banner", "Text"]')

85
Q

Pandas: To drill down to a certain value in a certain level of a pivot_table, type

A

pvt.xs(“Value name”, level=0)

86
Q

SQL: To select everything from a table, type

A

SELECT * FROM table

87
Q

SQL: A database's structure is called its

A

schema

88
Q

SQL: The three main types of data you can set a column to hold are

A

String, Numeric, Date and Time

89
Q

SQL: The two types of string data are

A

Text and Varchar

90
Q

SQL: Varchar string type is ideal for

A

Short strings, like names

91
Q

SQL: Text string type is ideal for

A

Long strings, like descriptions

92
Q

SQL: The numeric data types are

A

Integers, Fixed Point Decimal, Float

93
Q

SQL: The fixed point data type

A

Sets a strict number of decimal places and is ideal for dollars

94
Q

SQL: The floating point data type

A

does not set a strict number of decimal places

95
Q

SQL: The best data type to store a date and time together is

A

datetime

96
Q

SQL: To create a table with one column that stores 50 characters, type

A

CREATE TABLE tablename (columnname VARCHAR(50));

97
Q

SQL: To create a table with two columns, one that stores 50 characters, and another that store integers, type

A

CREATE TABLE tablename (columnname1 VARCHAR(50), columnname2 INTEGER);

98
Q

SQL: To insert data into a row, type

A

INSERT INTO tablename VALUES ('String', 1000);

99
Q

SQL: When inserting values into the DB, the values must

A

be in the same order you defined in the table.

100
Q

SQL: Strings you insert must have

A

quotes (single quotes in standard SQL)
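
The CREATE TABLE, INSERT and SELECT cards can be tried end to end with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name campaigns and its columns are invented for illustration, and SQLite stands in for whatever database the cards were written against.

```python
import sqlite3

# An in-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Two columns: a short string and an integer.
conn.execute("CREATE TABLE campaigns (name VARCHAR(50), clicks INTEGER)")

# Values go in the same order as the columns; the string is single-quoted.
conn.execute("INSERT INTO campaigns VALUES ('Banner', 1000)")

# Placeholders let the driver handle the quoting instead.
conn.execute("INSERT INTO campaigns VALUES (?, ?)", ("Text", 250))

rows = conn.execute("SELECT * FROM campaigns ORDER BY rowid").fetchall()
conn.close()
```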

101
Q

Facebook: To create free banners, go to

A

www.picmonkey.com

102
Q

Pep 8: Before and after a top level function put

A

two blank lines, not including #comments

103
Q

Pep 8: After every comma and operator, put a

A

space

104
Q

Pep 8: Import libraries

A

on separate lines and at the top with no lines in between

105
Q

Pep 8: Class names should start with a

A

capital letter

106
Q

Pep 8: If putting a comment on the same line as some code

A

precede it by two spaces

107
Q

Pep 8: At the end of a file put

A

an empty line

108
Q

PDB: To use python debugger, type

A

import pdb; pdb.set_trace() above the area you want to debug

109
Q

PDB: To quit the python debugger, type

A

q

110
Q

PDB: To see the return of every variable one by one in order to debug, use

A

python debugger

111
Q

PDB: When finished debugging, make sure to

A

remove pdb

112
Q

PDB: To run the next line of code and return the variable, type

A

next or n

113
Q

Pandas: To turn a pivot_table or groupby back into a filterable table, type

A

pvt.reset_index()

114
Q

Python: Every file that you create is a

A

module (it can be imported like a library)

115
Q

Python: To import one specific class, type

A

from module_name import Class_name

116
Q

Python: To create an instance of a class type

A

my_class_instance = Class_name()

117
Q

Python: Functions that belong to classes are called

A

methods
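
A small sketch tying the class cards together. Greeter and greet are hypothetical names chosen only for this example.

```python
class Greeter:
    """Class names start with a capital letter, per PEP 8."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        # A function defined inside a class is a method; it receives
        # the instance as its first argument, conventionally "self".
        return "Hello " + self.name


# Calling the class creates an instance.
my_class_instance = Greeter("Dave")
message = my_class_instance.greet()
```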

118
Q

Pandas: When importing a csv, to get pandas to recognize a date column, type

A

parse_dates=["Column name"]

119
Q

Pandas: To merge two dfs but only keep the keys on the left df, type

A

df1.merge(df2, on="Column", how="left")

120
Q

Pandas: To refer to a column by its index instead of its name, type

A

df[df.columns[1]]
or
df.iloc[:, 1]

121
Q

Pandas: When you receive a key error, double check for

A

Extra spaces

122
Q

Pandas: To return a list of all the column labels, type

A

list(df.columns.values)

123
Q

Python: To turn two lists into a dictionary, type

A

dict(zip(list1, list2))
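
A minimal worked example, with made-up month data. It also shows one behaviour worth knowing: zip stops at the shorter list, so unmatched keys are silently dropped.

```python
list1 = ["Jan", "Feb", "Mar"]
list2 = [31, 28, 31]

# zip pairs items by position; dict turns each pair into key: value.
combined = dict(zip(list1, list2))

# With lists of unequal length, the extra key "Mar" is left out.
short = dict(zip(list1, [31, 28]))
```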

124
Q

Python: To create an invisible command line input for a password, type

A

import getpass

pswd = getpass.getpass(“Password:”)

125
Q

Selenium: To find an element by link text and click it, type

A

my_browser.find_element(By.LINK_TEXT, "Link text").click()

126
Q

Selenium: To take a screenshot of the entire page you must

A

use firefox driver. chrome driver currently has a bug.

127
Q

Selenium: The find elements By. types I should remember are

A
By.PARTIAL_LINK_TEXT
By.XPATH
By.NAME
By.CLASS_NAME
By.TAG_NAME
By.ID
128
Q

Selenium: Screenshots must be saved as a

A

png

129
Q

Selenium: To refresh the current page, type

A

my_browser.refresh()

130
Q

Selenium: To create an explicit wait condition, type

A

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(my_browser, 20)
wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, "Reports")))

132
Q

Selenium: Regarding elements hidden by jquery, Selenium must

A

do what the user must do in order to make it visible.

133
Q

Selenium: To find an element based on anchor text, type

A

my_browser.find_element(By.PARTIAL_LINK_TEXT, "Link text")

134
Q

Pandas: To convert a Series or column to floats, type

A

df[“Column name”].astype(float)

135
Q

Pandas: To create a new column with values that are conditional on other columns values in the same row.

A

df["New"] = numpy.where((df["Column1"] > 35) | (df["Column2"] == "Banner"), df["Column3"] * 0.8, df["Column4"])

136
Q

Pandas: When using both the “or” (|) and “and” (&) conditions,

A

the conditions on both sides of & get evaluated together first, before the |. Wrap each comparison in parentheses, because & and | bind more tightly than > or ==.

137
Q

Pandas: To turn a column into a list type

A

column_list = df[“Column name”].tolist()

138
Q

Pandas: To combine two columns into key value pairs in a dictionary, type

A
column_list = df["Column name"].tolist()
column_list2 = df["Column name2"].tolist()
combined_dict = dict(zip(column_list, column_list2))
139
Q

Pandas: The purpose of inplace=True is to

A

change the df itself without having to re assign the return of the function to the same df variable name.

140
Q

Pandas: To filter a date formatted column for just a month and year, type

A

df.loc["2014-11"]

Note: this requires the dates to be the index of the df (a DatetimeIndex).

141
Q

Pandas: To place a df with inconsistent dates onto an index with all dates, type

A

df = pandas.DataFrame(index=pandas.date_range("2014-08-02", "2014-09-06", freq="D"))
df = df.join(inconsistent_df, how="outer")

142
Q

Pandas: Join, by default, merges on the

A

indexes

143
Q

Javascript: To refer to the current page, type

A

document

144
Q

Javascript: To create an alert, type

A

alert("Hello!");

145
Q

Javascript: To write an h1 into the current page, type

A

document.write("<h1>Hello!</h1>");

146
Q

Javascript: Javascript files end with the file extension

A

.js

147
Q

Javascript: To pull javascript code into a webpage from an external file, type

A

<script src="javascript.js"></script>

148
Q

Javascript: To print to the console, type

A

console.log("My log");

149
Q

Javascript: To create a variable with no value, type

A

var my_var;

150
Q

Javascript: To create a variable with a value, type

A

var my_var = 25;

151
Q

Javascript: Variable name cannot start with

A

a number

152
Q

Javascript: To make a quote be just a quote and not end a string you can use an

A

escape character \ right before it

153
Q

Javascript: To save a user input from a dialog box into a variable, type

A

var dialog_input = prompt("Text to display");

154
Q

Javascript: To add a string and a variable using +, type

A

var concated_string = "Hello " + visitor;

155
Q

Javascript: To update a variable that is referencing itself, type

A
var message = "Hello ";
message = message + "Dave";
156
Q

Javascript: To update a variable that is referencing itself at the beginning in a shorter way, type

A
var message = "Hello ";
message += "Dave";
157
Q

Javascript: To return the length of a string type

A

my_string.length;

158
Q

Javascript: To return the lower case of a string, type

A

my_string.toLowerCase();

159
Q

Javascript: To return the upper case of a string, type

A

my_string.toUpperCase();

160
Q

Pandas: To parse dates based on column location instead of column name, type

A

parse_dates=[0]

161
Q

Pandas: To rename a column label by its index, type

A

df = df.rename(columns={df.columns[0]: "New name"})

Note: rename returns a new df, so reassign it. Assigning to df.columns.values[0] directly is not reliable.

162
Q

Pandas: To do a value test for nan, type

A

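df["Column"].isna()

Note: df["Column"] == numpy.nan is False for every row, because nan never equals anything, including itself.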
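
A quick standard-library sketch of how nan behaves under an equality test. numpy.nan is an ordinary float nan, so the same rule applies to it; the variable names are illustrative.

```python
import math

value = float("nan")

# nan is not equal to anything, not even to itself.
equals_itself = value == value

# A dedicated check is needed to detect it.
detected = math.isnan(value)
```

This is why an equality comparison never finds missing values, while a dedicated check such as math.isnan (or isna() in pandas) does.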

163
Q

Pandas: Pandas deals with nan values by

A

skipping them by default in aggregations such as sum() and mean() (skipna=True)

164
Q

Pandas: To turn all of the values in a df to integers, type

A

df = df.astype(int)

165
Q

Pandas: To see a table of correlations, type

A

df.corr()

166
Q

Pandas: To save an image of a matplotlib plot, type

A

import matplotlib.pyplot as plt

my_plot = df.plot(x=”Column”, y=”Column2”, kind=”scatter”)
my_plot.get_figure().savefig(“/users/student/desktop/pii.png”, bbox_inches=”tight”)

167
Q

Pandas: The type of graph df.plot(x=”A”, y=”B”) returns is

A

line

168
Q

Math: The X axis is

A

the bottom going horizontal

169
Q

Matplotlib: The kinds of plots I should remember are

A

bar, scatter, pie, line

170
Q

Python: When creating a big project, remember to

A

Use pep8, document everything starting with “This”, create each part of the project in separate files for easy testing

171
Q

Python: To slice the last 8 characters from a string, type

A

my_string[-8:]

172
Q

Javascript: To turn a string to an integer, type

A

parseInt(var_string)

173
Q

Javascript: To turn a string to a float, type

A

parseFloat(var_string)

174
Q

Python: To round a number to the nearest whole number, type

A

round(1.3)

175
Q

Python: To round a number to the nearest first decimal, type

A

round(1.326, 1)
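
A brief sketch of both round cards, plus one behaviour that is easy to trip over: in Python 3, an exact tie such as 2.5 rounds to the nearest even number rather than always up.

```python
nearest_whole = round(1.3)       # no second argument: returns an int
one_decimal = round(1.326, 1)    # second argument: number of decimals kept
tie_case = round(2.5)            # ties go to the even neighbour, so 2
```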

176
Q

Python: To trigger a python script from another python script, type

A

import os

os.system(“python /users/student/desktop/script.py”)

177
Q

Python: To open a file in its default program, type

A

import subprocess

subprocess.Popen([“open”, “/Applications/Calculator.app/”])

178
Q

Pandas: To delete a column from a df, type

A

df.drop(“column_name”, axis=1, inplace=True)

179
Q

Python: To open any file on a mac, type

A

import subprocess

subprocess.call([“open”, “/Users/student/Desktop/file.app”])

180
Q

Python: To open any file on Windows, type

A

import os

os.startfile("c:/filename/path")

181
Q

Python: To delete a file from computer, type

A

import os

os.remove(“/users/student/desktop/file.png”)

182
Q

Pandas: To read_csv for specific columns, type

A

usecols=range(1,7)

183
Q

Pandas: The usecols argument for read_csv must be

A

explicit by either index or label, but not a slice

184
Q

Pandas: To find the highest value in a series, type

A

my_series.max()

185
Q

Pandas: The .apply method does not have

A

access to the index. Must use df.index.map() in the case where you must create a column based on the index.

186
Q

Pandas: To reference a date_range index to create a new column with the day of the week, type

A

df[“Day”] = df.index.map(lambda x: x.strftime(“%A”))

187
Q

Pandas: To add two new values to the bottom of a df, type

A

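df = pandas.concat([df, pandas.DataFrame([["value1", "value2"]], columns=df.columns)], ignore_index=True)

Note: df.append was removed in pandas 2.0, so pandas.concat is used here instead.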

188
Q

Pandas: To get the number of rows in each of a groupby’s keys, type

A

grouped.size()

189
Q

ML: Supervised learning is when

A

The machine learns by labeled examples. The data sample you teach it with must be labeled and then it will start to predict.

190
Q

ML: A common ratio of a training data set to a testing dataset is

A

67% train to 33% test.
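
A rough standard-library sketch of such a split. The helper name train_test_split_simple and its parameters are hypothetical; in practice a library routine would normally be used.

```python
import random


def train_test_split_simple(samples, test_fraction=0.33, seed=0):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    # A seeded generator makes the shuffle repeatable.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split_simple(range(100))
```

Shuffling before cutting matters when the data is ordered, for example by date or by label, because otherwise the test set would not resemble the training set.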

191
Q

ML: Naive bayes is useful for datasets with

A

less than 100k samples and text data

192
Q

ML: For supervised learning, data sets must be

A

Labeled

193
Q

Scraping: Remember before beginning the scraping loop to

A

create the variable that will hold the results and import BeautifulSoup

194
Q

ML: Learning new classifiers as you receive data without doing a batch update is called

A

online learning

195
Q

ML: When some features are missing it is known as

A

“missing features”

196
Q

ML: In non-online cases, when new features become available, you would need to

A

re-fit the model based on a new batch.

197
Q

ML: After you have a batch of samples you need to

A

Fit a model to it.

198
Q

Pandas: When parsing the XL file sheet, the sheet name is

A

Case sensitive

199
Q

Pandas: the to_csv command must be used on a

A

df. eg. df.to_csv(“path/to/file.csv”)

200
Q

Pandas: When adding rows with pandas.concat([df, new_rows], ignore_index=True), remember to

A

reassign the df to the returned version, e.g. df = pandas.concat([df, new_rows], ignore_index=True)

201
Q

ML: A dimension is a

A

column

202
Q

ML: In ML, every pixel of an image is its own

A

Dimension/column

203
Q

sklearn: Before you can train a model, you need to

A

instantiate it.

204
Q

sklearn: To instantiate a regression model, type

A

from sklearn.linear_model import LinearRegression

model = LinearRegression()

205
Q

sklearn: To pass an estimator your data you must call the method

A

model.fit(x, y)

206
Q

sklearn: The x and y parameters in the model.fit(x, y) method take in

A
x = samples by features
y = the labels
207
Q

sklearn: Given a trained model, to predict the label of a new data sample by highest probability, type

A

model.predict([[feature_value, feature_value, feature_value]])

208
Q

sklearn: To return the probability that a new data set has each label, type

A

model.predict_proba()

209
Q

ML: To use categorical data, you must

A

Binarize it into separate columns.

This is because models assume that if it is in the same column that it is a range, like length or height.
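
A plain-Python sketch of what binarizing a categorical column means. The function name binarize and the sample values are made up for illustration; libraries provide ready-made versions of this.

```python
def binarize(values):
    """Turn one categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


categories, rows = binarize(["Banner", "Text", "Banner", "Video"])
```

Each row now holds a single 1 in the column of its category, so the model no longer reads the categories as points along a numeric range.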

210
Q

sklearn: The LinearRegression model

A

fits a line through the data and, given an x value, returns the predicted y.

211
Q

sklearn: The DecisionTreeRegressor

A

Learns a tree of if/else threshold splits on the features and predicts the average target value of the training samples in the leaf that a new sample falls into.

212
Q

sklearn: A classification task is when

A

You give a model samples with features and have it predict the label.

213
Q

sklearn: Parameters defined by training have a

A

trailing underscore

214
Q

sklearn: The basic format for a ML prediction generator is

A

from sklearn.chosenlibrary import ChosenClass
model = ChosenClass()
x = samplesbyfeaturesdata
y = targetlabels
model.fit(x, y)
model.predict([[feature_val, feature_val2, feature_val3]])

215
Q

sklearn: clf stands for

A

classifier

216
Q

sklearn: To import the cross validation module, type

A

from sklearn import model_selection

Note: the older sklearn.cross_validation module was removed in scikit-learn 0.20.

217
Q

sklearn: To fit a model and then cross validate it, type

A

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.chosenlibrary import ChosenClass
x = samplebyfeaturedata
y = targetlabels
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
clf = ChosenClass()
clf.fit(X_train, y_train)  # fit on the training split only
pred = clf.predict(X_test)
accuracy_score(y_test, pred)

218
Q

SQL: Database normalization usually involves

A

dividing large tables into smaller (and less redundant) tables and defining relationships between them.

219
Q

Numpy: To see the number of rows and columns in a numpy array

A

my_numpy_array.shape

220
Q

Numpy: To convert a pandas DataFrame to a numpy array, type

A

my_numpy_array = numpy.array(df)

222
Q

ML: Natural language processing is

A

Training a computer to understand the meaning of words.

223
Q

ML: A decision surface is

A

a boundary on a scatter plot that divides two different types of data

224
Q

sklearn: To return the accuracy of a prediction to the test labels, type

A
from sklearn.metrics import accuracy_score
pred = clf.predict(features_test)
accuracy_score(labels_test, pred)
or
clf.score(features_test, labels_test)
225
Q

sklearn: To import naive bayes, type

A

from sklearn.naive_bayes import GaussianNB

226
Q

Pandas: To return just the fourth column of a df, type

A

df.iloc[:, 3:4]

227
Q

sklearn: Does not accept values of the type

A

string

228
Q

sklearn: To import and instantiate a decision tree classifier, type

A

from sklearn import tree

clf = tree.DecisionTreeClassifier()

229
Q

DecisionTreeClassifier: The DecisionTreeClassifier segments the data in a

A

blocky manner: each split is a straight cut along one feature, so the decision regions are rectangles.

230
Q

DecisionTreeClassifier: The min_samples_split parameter controls

A

The minimum number of samples a node must contain before the tree is allowed to split it. Higher values give a simpler decision boundary that is less prone to overfitting.

231
Q

DecisionTreeClassifier: Entropy is

A

a measure of the impurity of the data samples in a node. The tree picks the splits that reduce impurity the most.

232
Q

DecisionTreeClassifier: The default value of the criterion parameter, which chooses the impurity measure (Gini impurity or entropy), is

A

“gini”
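
Both impurity measures can be computed by hand from the class fractions in a node. The following is an illustrative sketch, not scikit-learn's implementation, and the function names are hypothetical.

```python
from collections import Counter
import math


def class_fractions(labels):
    """Fraction of the samples belonging to each class."""
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def entropy(labels):
    # 0 for a pure node, 1 for an even two-class mix.
    return sum(-p * math.log2(p) for p in class_fractions(labels))


def gini(labels):
    # 0 for a pure node, 0.5 for an even two-class mix.
    return 1 - sum(p * p for p in class_fractions(labels))


mixed = ["a", "a", "b", "b"]
pure = ["a", "a", "a", "a"]
```

Both measures are zero when every sample in the node has the same label and largest when the classes are evenly mixed, which is what a split tries to reduce.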

233
Q

DecisionTreeClassifier: DecisionTreeClassifier is prone to

A

overfitting

234
Q

KNeighborsClassifier: The KNeighborsClassifier predicts by

A

a simple majority vote of the nearest neighbors of each test point
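
The majority vote can be sketched in a few lines of plain Python. This is an illustration of the idea only, assuming Python 3.8+ for math.dist; knn_predict and the sample points are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns [(label, votes)] for the top label.
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["low", "low", "low", "high", "high", "high"]
prediction = knn_predict(points, labels, (4, 4))
```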

235
Q

NaiveBayes: NaiveBayes classifies by using

A

the probability of each feature value given the class, treating every feature as independent of the others. It does not group features.

236
Q

Numpy: To convert a two dimensional numpy array to a one dimensional numpy array, type

A

my_numpy_array.ravel()

237
Q

Pandas: To parse dates where the day, month and year are in separate columns, type

A

parse_dates={“Dates”:[“Day”, “Month”, “Year”]}

238
Q

Pandas: To bin data from a column in a new column, type

A

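ranges = [0, 6, 12, 18, 24]
labels = ["AM", "AM", "PM", "PM"]  # one label per bin, i.e. one fewer than the edges
df["New column"] = pandas.cut(df["column name"], ranges, labels=labels, ordered=False)  # ordered=False allows repeated labels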

239
Q

Pandas: To save a csv without the index, type

A

df.to_csv(“path.csv”, index=False)

240
Q

Pandas: To make a specific cell a nan value, type

A

df.iloc[1,1] = numpy.nan

241
Q

Pandas: To remove rows with nan values, type

A

df = df.dropna()

242
Q

Pandas: To convert an excel file to a df, do not try to

A

save the excel file as a csv and then use read_csv, because it will raise a unicode error

243
Q

Pandas: You cannot df.drop() the

A

header

244
Q

Pandas: To change the header to another row in the df, type

A

df.columns = df.iloc[1]

245
Q

python: to return the last index where the substring is found, type

A

my_string.rfind(“substring”)
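
One common use, sketched here with a made-up path: finding the last slash to cut out a file name. rfind returns -1 when the substring is absent.

```python
path = "/users/student/desktop/report.csv"

# Index of the last "/" in the string.
last_slash = path.rfind("/")

# Everything after that slash is the file name.
filename = path[last_slash + 1:]

# A substring that never occurs gives -1 instead of raising an error.
missing = path.rfind("?")
```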

246
Q

pandas: When getting a nonetype is not … error using .apply(lambda x: x) use the

A

str(x) or int(x) method to convert the objects to string

247
Q

pandas: iterrows() returns

A

a tuple of the index and the row data

248
Q

When creating automation scripts always add some

A

logging

note: Not the same character every time

249
Q

python: To make a one line if else statement, type

A

"true value" if 10 == 10 else "false value"
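
A small sketch of the one-line form inside a function. label_hour is a hypothetical helper used only for illustration.

```python
def label_hour(hour):
    # value_if_true if condition else value_if_false
    return "AM" if hour < 12 else "PM"


morning = label_hour(9)
afternoon = label_hour(15)
```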

250
Q

python: Think of a virtual env as

A

an environment you are running your script with, not a place you are saving your scripts.