3 Flashcards

1
Q

VC: The most senior person in a VC firm is called a

A

managing director or general partner.

2
Q

VC: The managing director or general partner

A

makes the final decision on which companies to invest in and sits on the board of directors of the companies the firm invests in.

3
Q

HostGator: To allow remote sql connections

A

Click “Remote MySQL” and enter the IP address of the computer you want to allow.

4
Q

HostGator: To create a database, a new user, and give permissions

A

click “MySQL Database Wizard” and go through it.

5
Q

HostGator: To upload a csv

A

Click on the desired db, then click the Import tab. Select the csv file and choose the CSV file format, then, if necessary, tick the checkbox for “first line contains table columns”.

6
Q

DB: A simple DB is called

A

sqlite

7
Q

Windows: To create a new file, type

A

echo.>file_name.py

8
Q

Windows: To see current directory, type

A

echo %cd%

9
Q

Windows: To change directory, type

A

cd c:/users/alen.solomon/desktop

10
Q

Windows: To list all the files in current directory, type

A

dir /b

11
Q

Pandas: Non printable characters can cause a “ValueError: No columns to parse from file” and can be fixed by

A

adding the parameter encoding="utf-16" to read_csv

12
Q

Adwords: When exporting a report from adwords, make sure not to use

A

Excel.csv format

13
Q

Pandas: To choose a delimiter for read_csv, type

A

sep="\t"

14
Q

Pandas: To turn an excel file to a df, type

A
xl = pandas.ExcelFile("path/to.xlsx")
df = xl.parse("Sheet name")
15
Q

Pandas: To change column values with a dollar sign to a float, type

A

df["Cost"] = df["Cost"].apply(lambda x: str(x).strip("$")).astype(float)

16
Q

Pandas: When you use df[“Column name”].apply(my_function)

A

the function name does not need to end with ()

17
Q

Pandas: To return only certain columns of a df in a set order, type

A

df = df.loc[:, ["Abbrev", "Jan", "Feb", "Mar", "Total"]]

18
Q

Pandas: To replace all nans with something, type

A

df["Column name"] = df["Column name"].fillna("Replacement")

19
Q

Pandas: To write a couple dfs to two sheets in excel, type

A

writer = pandas.ExcelWriter("/users/student/desktop/demo.xlsx", engine="xlsxwriter")

df.to_excel(writer, index=False, sheet_name="Sheet1")
df2.to_excel(writer, index=False, sheet_name="Sheet2")
writer.close()

20
Q

Windows: To clear the cmd screen, type

A

CLS

21
Q

Numpy: To turn a numpy array into a matrix, type

A

my_numpy_array.reshape(5,5)

Note: 5,5 are the dimensions (rows, columns), so the array must hold exactly 25 values.

22
Q

Numpy: To create a boolean index matrix from two numpy matrices, type

A

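my_numpy_matrix > my_other_matrix (any comparison operator, such as ==, < or >, returns a matrix of True/False values)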

23
Q

Numpy: To turn a numpy matrix back into an array, type

A

my_numpy_matrix.ravel()

24
Q

Selenium: To import selenium, type

A

from selenium import webdriver

25
Selenium: To send selenium to a site, type
my_browser.get("http://google.com")
26
Selenium: To instantiate Chrome, type
my_browser = webdriver.Chrome()
27
Selenium: To return the title tag, type
my_browser.title
28
Selenium: To find an element on a page by its id, type
my_element = my_browser.find_element_by_id("lst-ib")
29
Selenium: To press keyboard into a page element, type
my_browser.find_element_by_name("aw").send_keys("Send these keys")
30
Selenium: To close the browser you instantiated, type
my_browser.quit()
31
Selenium: To submit in an element, type
my_element.submit()
32
Selenium: To simulate an arrow key press, type
from selenium.webdriver.common.keys import Keys | my_element.send_keys("my text", Keys.ARROW_DOWN)
33
Selenium: To empty a text field, type
my_browser.find_element_by_id("field").clear()
34
Selenium: To click a submit button, type
my_browser.find_element_by_id("submit").click()
35
Selenium: To select an element by its class name, type
my_browser.find_element_by_class_name("date-box")
36
HTML: The <select> tag is used for
Drop-down menus
37
Selenium: To find the first form on a page through its XPath, type
my_browser.find_element_by_xpath("//form[1]") Note: the leading // searches the whole document at any depth, so the html and body tags do not need to be spelled out.
38
Selenium: An easier way to find the submit form button and press it is
my_browser.find_element_by_id("pswrd").submit()
39
Selenium: To click a checkbox xpath, type
my_browser.find_element_by_xpath(".//*[@id='facebook']/body/div[2]/label[2]").click() Note: Find xpath using firepath.
40
Selenium: The best way to find a checkbox is
xpath
41
Selenium: Sometimes to make an xpath work you need to
remove the [@id="js_5x"] from //*[@id="js_5x"]/div/ul/li[6]/a/span/span
42
Selenium: To get the xpath in chrome
Inspect element twice and then right click and click "copy xpath"
43
Selenium: When looking for the path for checkboxes or buttons, often
the label, or the outer ridge of button outside of the button text shows the correct path in inspect element rather than the button itself.
44
Selenium: When uploading a file do not
click on the upload button.
45
Selenium: To upload a file, type
my_browser.find_element_by_xpath("//*/body/div[6]/input").send_keys("/users/student/desktop/report.csv")
46
Selenium: To select a list item in a drop-down menu, type
from selenium.webdriver.support.select import Select
select = Select(my_browser.find_element_by_id("dropdownid"))
select.select_by_visible_text("Jan")
47
Selenium: To make the browser continue looking for a page element if it is not there for a set amount of time, type
my_browser.implicitly_wait(50)
48
Selenium: implicitly_wait(30) only has to be
set once and all tests for this browser will have the wait.
49
BS4: To parse a soup_page for an html tag with a certain parameter (e.g. css class name), type
soup_page.find_all("div", {"class": "name"})
50
BS4: To return a list of all the top level html tags contents separated, type
soup_page.contents
51
BS4: To return the first item in a list of all the top level html tags contents separated, type
soup_page.contents[0]
52
BS4: To return the first item in a list of all the top level html tags contents separated and then the first tags contents from within that list item, type
soup_page.contents[0].contents[0]
53
Python: To use a list comprehension to return a boolean index, type
my_list = [1, 2, 3, 4, 5, 6, 7, 8]
[item > 3 for item in my_list]
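A small sketch of what the boolean list looks like and how it can be used as a mask. The names `mask` and `kept` are illustrative only and are not part of the original card.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8]

# One True/False flag per item: True where the item is greater than 3.
mask = [item > 3 for item in my_list]

# Pair each item with its flag and keep only the flagged items.
kept = [item for item, flag in zip(my_list, mask) if flag]

print(mask)
print(kept)
```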
54
IPYNB: To make a graph show up, type
%matplotlib inline
import matplotlib.pyplot as plt
df.plot(x="Column", y="Column2", kind="scatter")
55
Pandas: A series is like a
One dimensional array with an index.
56
Pandas: To set your own index to a Series or df, type
my_index_list = list(range(20))
my_series.index = my_index_list
57
Pandas: To slice a section of a Series by its index, type
my_series.loc[3:6]
58
Pandas: In order to set a new index for a df or Series, the index list must
Be the same length as the df or Series.
59
BS4: To put the page source of a selenium page into a soup_page, type
page = my_browser.page_source
soup_page = BeautifulSoup(page, "html.parser")
60
Pandas: To return the length and width of a df in a tuple, type
df.shape
61
Pandas: To see how many unique values are in a column, type
df["column name"].nunique() Note: .unique() returns the unique values themselves rather than how many there are.
62
Pandas: Even after you change the index to something other than 0 onward, it will
still allow you to slice by the index number
63
Pandas: To return specific indexes from a Series, type
my_series[[ 6, 9, 2]] or my_series[[ "A", "C", "G"]]
64
Pandas: If you change the index of a Series to new numbers you cannot use the original index of the numbers unless you use
my_series.iloc[0: 5] or my_series.iloc[[4, 6, 9]]
65
Pandas: The difference between .iloc and .loc when the index is numerical, but doesn't match the zero index is
.iloc uses the zero index (position) while .loc uses the current, artificial, index (label). Note: .ix, which mixed both behaviours, was removed in pandas 1.0.
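A minimal sketch of position-based versus label-based lookup on a Series whose numeric index does not start at zero. The values and index labels are made up for illustration.

```python
import pandas

# Index labels deliberately do not match the zero-based positions.
my_series = pandas.Series(["a", "b", "c"], index=[10, 20, 30])

# .iloc counts positions from zero, whatever the labels are.
by_position = my_series.iloc[0]

# .loc looks up the label stored in the index.
by_label = my_series.loc[20]

print(by_position, by_label)
```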
66
Pandas: To check if any value in a Series passes a value test, type
(my_series > 50).any()
67
Pandas: To check if all values in a Series pass a value test, type
(my_series > 50).all()
68
Pandas: To sum how many values in a Series pass a value test, type
sum(my_series > 5)
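A short sketch of the three value tests above, applied to the boolean mask itself rather than to the filtered values. The sample numbers are invented for illustration.

```python
import pandas

my_series = pandas.Series([10, 60, 80])  # made-up values

# Comparing a Series to a number gives one True/False per row.
over_50 = my_series > 50

any_over = bool(over_50.any())    # is at least one value above 50?
all_over = bool(over_50.all())    # is every value above 50?
count_over = int(over_50.sum())   # True counts as 1, so this counts matches

print(any_over, all_over, count_over)
```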
69
Pandas: Can you do a step when slicing a Series
Yes
70
Pandas: df["Campaign name"] is the data type
Series
71
Pandas: To create a copy of a df, type
df_copied = df.copy()
72
Pandas: To add the rows of one df to another that has the same columns, type
df_concated = pandas.concat([df, df2])
73
Pandas: With regard to nan values, pandas will
ignore them in calculations such as sum() and mean() (skipna=True is the default)
74
Pandas: When you concatenate two dfs the index will
be maintained, so if the dfs were both zero indexed, there will be duplicate index values.
75
Pandas: To return the length of a df column, type
len(df["Column name"]) Note: df["Column name"].count() leaves out nan values, so it can be smaller than the length.
76
Pandas: Any action you can perform on a Series can also be performed on a
df["Column name"]
77
Pandas: To forward fill the na values in a Series, type
concated_df.ffill()
78
Pandas: To backwards fill the na values in a Series, type
concated_df.bfill()
79
Pandas: To replace the index of a concated df with one that is ordered, type
concated_df.index = range(len(concated_df)) or concated_df = concated_df.reset_index(drop=True)
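A minimal sketch of the duplicate index labels that concatenation leaves behind, and of replacing them with an ordered index. It uses len() for the row count, which also counts rows where a column holds nan. The column name "Clicks" and its values are assumptions for illustration.

```python
import pandas

df = pandas.DataFrame({"Clicks": [1, 2]})
df2 = pandas.DataFrame({"Clicks": [3, 4]})

# Each df keeps its own 0, 1 labels, so the result repeats them.
concated_df = pandas.concat([df, df2])
duplicated_labels = list(concated_df.index)

# Overwrite the index with one label per row, in order.
concated_df.index = range(len(concated_df))
ordered_labels = list(concated_df.index)

print(duplicated_labels, ordered_labels)
```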
80
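Pandas: df["column 1"] + df["column 2"], is an
index based arithmetic. Values are matched by index label, not by position, so if the indexes are not aligned the rows you expected will not be added together and labels found on only one side become nan.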
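A small sketch of index-based arithmetic on two Series whose labels only partly overlap. The values and labels are made up for illustration.

```python
import pandas

a = pandas.Series([1, 2, 3], index=[0, 1, 2])
b = pandas.Series([10, 20, 30], index=[1, 2, 3])

# Rows are paired by label: 1 with 1 and 2 with 2.
# Labels 0 and 3 exist on one side only, so their result is nan.
total = a + b

print(total)
```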
81
Pandas: When multiplying two Series, if there are many indices with the same value, pandas will
multiply every combination of the values and add new rows for each product.
82
Pandas: If you decide to create a DataFrame with lists rather than dicts, the column labels will
Just be set to zero index numbers.
83
Pandas: To name unnamed columns, type
df.columns = ["Column 1", "Column 2"]
84
Pandas: To filter for indexes in a pivot_table, type
pvt.query('Type == ["Banner", "Text"]')
85
Pandas: To drill down to a certain value in a certain level of a pivot_table, type
pvt.xs("Value name", level=0)
86
SQL: To select everything from a table, type
SELECT * FROM table
87
SQL: A database's structure is called its
schema
88
SQL: The three main types of data you can set a column to hold are
String, Numeric, Date and Time
89
SQL: The two types of string data are
Text and Varchar
90
SQL: Varchar string type is ideal for
Short strings, like names
91
SQL: Text string type is ideal for
Long strings, like descriptions
92
SQL: The numeric data types are
Integers, Fixed Point Decimal, Float
93
SQL: The fixed point data type
Sets a strict number of decimal places and is ideal for dollars
94
SQL: The float point data type
does not set a strict number of decimal places
95
SQL: The best data type to store a date and time together is
datetime
96
SQL: To create a table with one column that stores 50 characters, type
CREATE TABLE tablename (columnname VARCHAR(50));
97
SQL: To create a table with two columns, one that stores 50 characters, and another that store integers, type
CREATE TABLE tablename (columnname1 VARCHAR(50), columnname2 INTEGER);
98
SQL: To insert data into a row, type
INSERT INTO tablename VALUES ('String', 1000);
99
SQL: When inserting values into the DB, the values must
be in the same order you defined in the table.
100
SQL: Strings you insert must have
quotes
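The SQL cards above can be tried out without a server by using Python's built-in sqlite3 module. This is only a sketch: the table name `campaigns`, its columns and the inserted row are hypothetical.

```python
import sqlite3

# An in-memory database disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# One 50-character text column and one integer column.
conn.execute("CREATE TABLE campaigns (name VARCHAR(50), clicks INTEGER)")

# Values go in the same order as the columns; the string is quoted.
conn.execute("INSERT INTO campaigns VALUES ('Banner', 1000)")

rows = conn.execute("SELECT * FROM campaigns").fetchall()
conn.close()

print(rows)
```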
101
Facebook: To create free banners, go to
www.picmonkey.com
102
Pep 8: Before and after a top level function put
two blank lines, not including #comments
103
Pep 8: After every comma and operator, put a
space
104
Pep 8: Import libraries
on separate lines and at the top with no lines in between
105
Pep 8: Class names should start with a
capital letter
106
Pep 8: If putting a comment on the same line as some code
precede it by two spaces
107
Pep 8: At the end of a file put
an empty line
108
PDB: To use python debugger, type
import pdb; pdb.set_trace() above the area you want to debug
109
PDB: To quit the python debugger, type
q
110
PDB: To see the return of every variable one by one in order to debug, use
python debugger
111
PDB: When finished debugging, make sure to
remove pdb
112
PDB: To run the next line of code and return the variable, type
next or n
113
Pandas: To turn a pivot_table or groupby back into a filterable table, type
pvt.reset_index()
114
Python: Every file that you create is a
module, which other files can import like a library
115
Python: To import one specific class, type
from library_name import Class_name
116
Python: To create an instance of a class type
my_class_instance = Class_name()
117
Python: Functions that belong to classes are called
methods
118
Pandas: When importing a csv, to get pandas to recognize a date column, type
parse_dates=["Column name"]
119
Pandas: To merge two dfs but only keep the keys on the left df, type
df1.merge(df2, on="Column", how="left")
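A minimal sketch of a left merge: every key from the left df is kept, and keys that exist only in the right df are dropped. The column names and values are invented for illustration.

```python
import pandas

df1 = pandas.DataFrame({"Column": ["a", "b"], "Left": [1, 2]})
df2 = pandas.DataFrame({"Column": ["b", "c"], "Right": [20, 30]})

# "a" has no match in df2, so its "Right" value becomes nan.
# "c" exists only in df2, so it does not appear in the result.
merged = df1.merge(df2, on="Column", how="left")

print(merged)
```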
120
Pandas: To refer to a column by its index instead of its name, type
df.columns[1]
121
Pandas: When you receive a key error, double check for
Extra spaces
122
Pandas: To return a list of all the column labels, type
list(df.columns.values)
123
Python: To turn two lists into a dictionary, type
dict(zip(list1, list2))
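A short sketch of the dict(zip(...)) pattern, including what happens when the lists differ in length. The list contents are made up.

```python
list1 = ["a", "b", "c"]
list2 = [1, 2, 3]

# zip pairs items by position; dict turns each pair into key: value.
combined = dict(zip(list1, list2))

# zip stops at the shorter list, so the unmatched key "c" is dropped.
shorter = dict(zip(list1, [1, 2]))

print(combined, shorter)
```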
124
Python: To create an invisible command line input for a password, type
import getpass
pswd = getpass.getpass("Password:")
125
Selenium: To find an element by link text and click it, type
my_browser.find_element_by_link_text("Link text").click()
126
Selenium: To take a screenshot of the entire page you must
use firefox driver. chrome driver currently has a bug.
127
Selenium: The find elements By. types I should remember are
By.PARTIAL_LINK_TEXT, By.XPATH, By.NAME, By.CLASS_NAME, By.TAG_NAME, By.ID
128
Selenium: Screenshots must be saved as a
png
129
Selenium: To refresh the current page, type
my_browser.refresh()
130
Selenium: To create an explicit wait condition, type
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(my_browser, 20)
wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, "Reports")))
132
Selenium: Regarding elements hidden by jquery, Selenium must
do what the user must do in order to make it visible.
133
Selenium: To find an element based on anchor text, type
my_browser.find_element_by_partial_link_text("Link text")
134
Pandas: To convert a Series or column to floats, type
df["Column name"].astype(float)
135
Pandas: To create a new column with values that are conditional on other columns' values in the same row, type
df["New"] = numpy.where((df["Column"]>35) | (df["Column 2"]=="Banner"), df["Column2"] * 0.8, df["Column4"])
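A small runnable sketch of a conditional column built with numpy.where. The column names ("Clicks", "Type", "Cost", "Budget") and the values are assumptions chosen for illustration.

```python
import numpy
import pandas

df = pandas.DataFrame({
    "Clicks": [10, 40, 20],
    "Type": ["Text", "Text", "Banner"],
    "Cost": [100.0, 200.0, 300.0],
    "Budget": [1.0, 2.0, 3.0],
})

# Where either condition holds, take 80% of Cost; otherwise take Budget.
# Each condition needs its own parentheses around it.
df["New"] = numpy.where(
    (df["Clicks"] > 35) | (df["Type"] == "Banner"),
    df["Cost"] * 0.8,
    df["Budget"],
)

print(df)
```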
136
Pandas: When using both the "or" and "and" conditions,
the conditions on both sides of "and" get evaluated together first, before the "or"
137
Pandas: To turn a column into a list type
column_list = df["Column name"].tolist()
138
Pandas: To combine two columns into key value pairs in a dictionary, type
column_list = df["Column name"].tolist()
column_list2 = df["Column name2"].tolist()
combined_dict = dict(zip(column_list, column_list2))
139
Pandas: The purpose of inplace=True is to
change the df itself without having to reassign the return of the function to the same df variable name.
140
Pandas: To filter a date formatted column for just a month and year, type
df.loc["2014-11"] Note: this works when the dates are the df index (a DatetimeIndex).
141
Pandas: To place a df with inconsistent dates onto an index with all dates, type
df = pandas.DataFrame(index=pandas.date_range("2014-08-02", "2014-09-06", freq="D"))
df = df.join(inconsistent_df, how="outer")
142
Pandas: Join, by default, merges on the
indexes
143
Javascript: To refer to the current page, type
document
144
Javascript: To create an alert, type
alert("Hello!");
145
Javascript: To write an h1 into the current page, type
document.write("<h1>Hello!</h1>");
146
Javascript: Javascript files end with the file extension
.js
147
Javascript: To pull javascript code into a webpage from an external file, type
<script src="javascript.js"></script>
148
Javascript: To print to the console, type
console.log("My log");
149
Javascript: To create a variable with no value, type
var my_var;
150
Javascript: To create a variable with a value, type
var my_var = 25;
151
Javascript: Variable name cannot start with
a number
152
Javascript: To make a quote be just a quote and not end a string you can use an
escape character \ right before it
153
Javascript: To save a user input from a dialog box into a variable, type
var dialog_input = prompt("Text to display");
154
Javascript: To add a string and a variable using +, type
var concated_string = "Hello " + visitor;
155
Javascript: To update a variable that is referencing itself, type
var message = "Hello ";
message = message + "Dave";
156
Javascript: To update a variable that is referencing itself at the beginning in a shorter way, type
var message = "Hello ";
message += "Dave";
157
Javascript: To return the length of a string type
my_string.length;
158
Javascript: To return the lower case of a string, type
my_string.toLowerCase();
159
Javascript: To return the upper case of a string, type
my_string.toUpperCase();
160
Pandas: To parse dates based on column location instead of column name, type
parse_dates=[0]
161
Pandas: To rename a column label by its index, type
df = df.rename(columns={df.columns[0]: "New name"}) Note: rename returns a new df, so reassign it. Assigning into df.columns.values directly is not reliable.
162
Pandas: To do a value test for nan, type
df["Column"].isnull() Note: df["Column"] == numpy.nan is always False, because nan never equals anything, including itself.
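A quick sketch showing why an equality test cannot find nan values while isnull() can. The column contents are made up for illustration.

```python
import numpy
import pandas

df = pandas.DataFrame({"Column": [1.0, numpy.nan, 3.0]})

# nan is not equal to anything, including itself, so this finds nothing.
equality_hits = int((df["Column"] == numpy.nan).sum())

# isnull() flags the missing value correctly.
isnull_hits = int(df["Column"].isnull().sum())

print(equality_hits, isnull_hits)
```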
163
Pandas: Pandas deals with nan values by
ignoring them in calculations such as sum() and mean() (skipna=True is the default)
164
Pandas: To turn all of the values in a df to integers, type
df = df.astype(int)
165
Pandas: To see a table of correlations, type
df.corr()
166
Pandas: To save an image of a matplotlib plot, type
import matplotlib.pyplot as plt
my_plot = df.plot(x="Column", y="Column2", kind="scatter")
my_plot.get_figure().savefig("/users/student/desktop/pii.png", bbox_inches="tight")
167
Pandas: The type of graph df.plot(x="A", y="B") returns is
line
168
Math: The X axis is
the bottom going horizontal
169
Matplotlib: The kinds of plots I should remember are
bar, scatter, pie, line
170
Python: When creating a big project, remember to
Use pep8, document everything starting with "This", create each part of the project in separate files for easy testing
171
Python: To slice the last 8 characters from a string, type
my_string[-8:]
172
Javascript: To turn a string to an integer, type
parseInt(var_string)
173
Javascript: To turn a string to a float, type
parseFloat(var_string)
174
Python: To round a number to the nearest whole number, type
round(1.3)
175
Python: To round a number to the nearest first decimal, type
round(1.326, 1)
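A short sketch of round() in Python 3, including one behaviour worth remembering: exact halves are rounded to the nearest even number rather than always up. The variable names are illustrative.

```python
nearest_whole = round(1.3)      # no second argument: returns an int
one_decimal = round(1.326, 1)   # second argument: number of decimals

# Ties go to the nearest even number in Python 3.
half_to_even_down = round(2.5)
half_to_even_up = round(3.5)

print(nearest_whole, one_decimal, half_to_even_down, half_to_even_up)
```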
176
Python: To trigger a python script from another python script, type
import os | os.system("python /users/student/desktop/script.py")
177
Python: To open a file in its default program, type
import subprocess | subprocess.Popen(["open", "/Applications/Calculator.app/"])
178
Pandas: To delete a column from a df, type
df.drop("column_name", axis=1, inplace=True)
179
Python: To open any file on a mac, type
import subprocess | subprocess.call(["open", "/Users/student/Desktop/file.app"])
180
Python: To open any file on Windows, type
import os | os.startfile("c:/filename/path")
181
Python: To delete a file from computer, type
import os | os.remove("/users/student/desktop/file.png")
182
Pandas: To read_csv for specific columns, type
usecols=range(1,7)
183
Pandas: The usecols argument for read_csv must be
explicit by either index or label, but not a slice
184
Pandas: To find the highest value in a series, type
my_series.max()
185
Pandas: The .apply method does not have
access to the index. Must use df.index.map() in the case where you must create a column based on the index.
186
Pandas: To reference a date_range index to create a new column with the day of the week, type
df["Day"] = df.index.map(lambda x: x.strftime("%A"))
187
Pandas: To add two new values to the bottom of a df, type
df = pandas.concat([df, pandas.DataFrame([["value1", "value2"]], columns=df.columns)], ignore_index=True) Note: df.append was removed in pandas 2.0.
188
Pandas: To get the number of rows in each of a groupby's keys, type
grouped.size()
189
ML: Supervised learning is when
The machine learns by labeled examples. The data sample you teach it with must be labeled and then it will start to predict.
190
ML: A common ratio of a training data set to a testing dataset is
67% train to 33% test.
191
ML: Naive bayes is useful for datasets with
less than 100k samples and text data
192
ML: Data sets must be
Labeled
193
Scraping: Remember before beginning the scraping loop to
create the variable that will hold the results and import BeautifulSoup
194
ML: Learning new classifiers as you receive data without doing a batch update is called
online learning
195
ML: When some features are missing it is known as
"missing features"
196
ML: In non-online cases, when new features become available, you would need to
re-fit the model based on a new batch.
197
ML: After you have a batch of samples you need to
Fit a model to it.
198
Pandas: When parsing the XL file sheet, the sheet name is
Case sensitive
199
Pandas: the to_csv command must be used on a
df. eg. df.to_csv("path/to/file.csv")
200
Pandas: When adding rows with pandas.concat([df, new_rows], ignore_index=True), remember to
reassign the df value to the combined version, e.g. df = pandas.concat([df, new_rows], ignore_index=True). The same applied to the older df.append, which was removed in pandas 2.0.
201
ML: A dimension is a
column
202
ML: In ML, every pixel of an image is its own
Dimension/column
203
sklearn: Before you can train a model, you need to
instantiate it.
204
sklearn: To instantiate a regression model, type
from sklearn.linear_model import LinearRegression | model = LinearRegression()
205
sklearn: To pass an estimator your data you must call the method
model.fit(x, y)
206
sklearn: The x and y parameters in the model.fit(x, y) method take in
x = samples by features
y = the labels
207
sklearn: Given a trained model, to predict the label of a new data sample by highest probability, type
model.predict([[feature_value, feature_value, feature_value]])
208
sklearn: To return the probability that a new data sample has each label, type
model.predict_proba([[feature_value, feature_value, feature_value]])
209
ML: To use categorical data, you must
Binarize it into separate columns. | This is because models assume that if it is in the same column that it is a range, like length or height.
210
sklearn: The LinearRegression model
fits a straight line through the data and, given a new x value, returns the predicted y.
211
sklearn: The DecisionTreeRegressor
Learns a tree of if/else threshold rules on the features and predicts the average label of the training samples that land in the same leaf.
212
sklearn: A classification task is when
You give a model samples with features and have it predict the label.
213
sklearn: Parameters defined by training have a
trailing underscore
214
sklearn: The basic format for a ML prediction generator is
from sklearn.chosenlibrary import ChosenClass
model = ChosenClass()
x = samplesbyfeaturesdata
y = targetlabels
model.fit(x, y)
model.predict([[feature_val, feature_val2, feature_val3]])
215
sklearn: clf stands for
classifier
216
sklearn: To import the cross validation module, type
from sklearn import model_selection Note: older scikit-learn releases called this module cross_validation.
217
sklearn: To fit a model and then cross validate it, type
from sklearn import model_selection
from sklearn.chosenlibrary import ChosenClass
from sklearn.metrics import accuracy_score
x = samplebyfeaturedata
y = targetlabels
X_train, X_test, y_train, y_test = model_selection.train_test_split(x, y, test_size=0.25, random_state=0)
clf = ChosenClass()
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
accuracy_score(y_test, pred)
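A runnable sketch of the split, fit, predict and score sequence. The iris dataset and DecisionTreeClassifier are stand-ins chosen for illustration in place of the card's placeholders. The model is fitted on the training split only, so the test split stays unseen until scoring.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# x is samples by features, y is the label for each sample.
x, y = load_iris(return_X_y=True)

# Hold back 25% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)          # train on the training split only
pred = clf.predict(X_test)         # predict the held-back samples

accuracy = accuracy_score(y_test, pred)
print(accuracy)
```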
218
SQL: Database normalization usually involves
dividing large tables into smaller (and less redundant) tables and defining relationships between them.
219
Numpy: To see the number of rows and columns in a numpy array
my_numpy_array.shape
220
Numpy: To convert a pandas DataFrame to a numpy array, type
my_numpy_array = numpy.array(df)
222
ML: Natural language processing is
Training a computer to understand the meaning of words.
223
ML: A decision surface is
a boundary on a scatter plot that divides two different types of data
224
sklearn: To return the accuracy of a prediction to the test labels, type
from sklearn.metrics import accuracy_score
pred = clf.predict(features_test)
accuracy_score(labels_test, pred)
or
clf.score(features_test, labels_test)
225
sklearn: To import naive bayes, type
from sklearn.naive_bayes import GaussianNB
226
Pandas: To return just the fourth column of a df, type
df.iloc[:, 3:4]
227
sklearn: Does not accept values of the type
string
228
sklearn: To import and instantiate a decision tree classifier, type
from sklearn import tree | clf = tree.DecisionTreeClassifier()
229
DecisionTreeClassifier: The DecisionTreeClassifier segments the data in a
blocky and slicy manner.
230
DecisionTreeClassifier: The min_samples_split parameter controls
How many data samples are necessary for the DecisionTree decision boundary line to turn to fit them.
231
DecisionTreeClassifier: Entropy is
a measure of impurity of the data samples and controls how impure the data must get before splitting.
232
DecisionTreeClassifier: The default value of the criterion parameter, which chooses the impurity measure (such as entropy), is
"gini"
233
DecisionTreeClassifier: DecisionTreeClassifier is prone to
overfitting
234
KNeighborsClassifier: The KNeighborsClassifier predicts by
a simple majority vote of the nearest neighbors of each test point
235
NaiveBayes: NaiveBayes classifies by using
correlation with each feature independently to guide classification. It does not group features.
236
Numpy: To convert a two dimensional numpy array to a one dimensional numpy array, type
my_numpy_array.ravel()
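A minimal sketch of going from a flat array to two dimensions with reshape and back again with ravel. The array contents and the 2 by 3 shape are chosen only for illustration.

```python
import numpy

flat = numpy.arange(6)        # six values: 0 to 5

# reshape needs dimensions whose product matches the number of values.
grid = flat.reshape(2, 3)

# ravel flattens the rows back into a single dimension.
back = grid.ravel()

print(grid.shape, back.shape)
```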
237
Pandas: To parse dates where the day, month and year are in separate columns, type
parse_dates={"Dates":["Day", "Month", "Year"]}
238
Pandas: To bin data from a column in a new column, type
ranges = [0, 12, 24]
labels = ["AM", "PM"]
df["New column"] = pandas.cut(df["column name"], ranges, labels=labels)
Note: cut needs one label per bin, so there is always one fewer label than range edges.
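A small runnable sketch of binning with pandas.cut, using one label per bin (one fewer label than bin edges). The column names "Hour" and "Period" and the sample hours are assumptions for illustration.

```python
import pandas

df = pandas.DataFrame({"Hour": [3, 9, 15, 21]})

ranges = [0, 12, 24]     # three edges define two bins
labels = ["AM", "PM"]    # one label per bin

# By default each bin excludes its left edge and includes its right edge,
# so an hour of exactly 0 would need include_lowest=True to be binned.
df["Period"] = pandas.cut(df["Hour"], ranges, labels=labels)

print(df)
```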
239
Pandas: To save a csv without the index, type
df.to_csv("path.csv", index=False)
240
Pandas: To make a specific cell a nan value, type
df.iloc[1,1] = numpy.nan
241
Pandas: To remove rows with nan values, type
df = df.dropna()
242
Pandas: To convert an excel file to a df, do not try to
save the excel file as a csv and then use read_csv; it can raise a Unicode error
243
Pandas: You cannot df.drop() the
header
244
Pandas: To change the header to another row in the df, type
df.columns = df.iloc[1]
245
python: to return the last index where the substring is found, type
my_string.rfind("substring")
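A quick sketch of rfind next to find, including the value returned when nothing matches. The example string is made up.

```python
my_string = "report_2014_final.csv"

first_underscore = my_string.find("_")    # searches from the left
last_underscore = my_string.rfind("_")    # searches from the right
not_found = my_string.rfind("xyz")        # -1 means no match

# Slice everything after the last underscore.
tail = my_string[last_underscore + 1:]

print(first_underscore, last_underscore, not_found, tail)
```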
246
pandas: When getting a "NoneType is not ..." error using .apply(lambda x: x), use
str(x) or int(x) inside the lambda to convert the values first
247
pandas: iterrows() returns
a tuple of the index and the row data
248
When creating automation scripts always add some
logging | note: Not the same character every time
249
python: To make a one line if else statement, type
"true value" if 10==10 else "false_value"
250
python: Think of a virtual env as
an environment you are running your script with, not a place you are saving your scripts.