3 Flashcards

Question

Selenium: To send selenium to a site, type

Answer 1

my_browser.get("http://google.com")

Answer 2

my_browser = webdriver.Chrome()

Answer 3

my_browser.title

Answer 4

my_element = my_browser.find_element_by_id("lst-ib")

Answer 5

my_browser.find_element_by_name("aw").send_keys("Send these keys")

Answer 6

my_browser.quit()

Answer 7

my_element.submit()

Answer 8

my_element.send_keys("my text", Keys.ARROW_DOWN)

Answer 9

my_browser.find_element_by_id("field").clear()

Answer 10

my_browser.find_element_by_id("submit").click()

Answer 11

my_browser.find_element_by_class_name("date-box")

Answer 12

Drop-down menus

Answer 13

my_browser.find_element_by_xpath("//form[1]") Note: the slashes represent moving into the html tags and then the body tags and into the form tags.

Answer 14

my_browser.find_element_by_id("pswrd").submit()

Answer 15

my_browser.find_element_by_xpath(".//*[@id='facebook']/body/div[2]/label[2]").click() Note: Find xpath using firepath.

Answer 16

``` remove the [@id="js_5x"] from //*[@id="js_5x"]/div/ul/li[6]/a/span/span ```

Answer 17

Inspect element twice and then right click and click "copy xpath"

Answer 18

the label, or the outer edge of the button outside the button text, shows the correct path in Inspect Element, rather than the button itself.

Answer 19

click on the upload button.

Answer 20

my_browser.find_element_by_xpath("//*/body/div[6]/input").send_keys("/users/student/desktop/report.csv")

Answer 21

from selenium.webdriver.support.select import Select
select = Select(my_browser.find_element_by_id("dropdownid"))
select.select_by_visible_text("Jan")

Answer 22

my_browser.implicitly_wait(50)

Answer 23

set once and all tests for this browser will have the wait.

Answer 24

soup_page.find_all("div", {"class": "name"})

Answer 25

soup_page.contents

Answer 26

soup_page.contents[0]

Answer 27

soup_page.contents[0].contents[0]

Answer 28

my_list = [1, 2, 3, 4, 5, 6, 7, 8]
[item > 3 for item in my_list]

Answer 29

%matplotlib inline
import matplotlib.pyplot as plt
df.plot(x="Column", y="Column2", kind="scatter")

Answer 30

One dimensional array with an index.

Answer 31

```
my_index_list = list(range(20))
my_series.index = my_index_list
```

Answer 32

my_series.ix[3:6]

Answer 33

Be the same length as the df or Series.

Answer 34

page = my_browser.page_source
soup_page = BeautifulSoup(page)

Answer 35

df["column name"].unique()

Answer 36

still allow you to slice by the index number

Answer 37

my_series[[ 6, 9, 2]] or my_series[[ "A", "C", "G"]]

Answer 38

my_series.iloc[0: 5] or my_series.iloc[[4, 6, 9]]

Answer 39

.iloc uses the zero index while .ix uses the current, artificial, index
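
The difference is easiest to see on a series whose labels do not match the positions. The sketch below uses a hypothetical series with reversed labels. Because `.ix` was removed in pandas 1.0, the label-based `.loc` stands in for it here; that substitution is an assumption about what the card would say today.

```python
import pandas as pd

# Hypothetical series: the labels (3, 2, 1, 0) run opposite to the positions.
my_series = pd.Series([10, 20, 30, 40], index=[3, 2, 1, 0])

by_position = my_series.iloc[0]  # first row by zero-based position
by_label = my_series.loc[0]      # row whose index label is 0

print(by_position, by_label)
```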

Answer 40

my_series[my_series > 50].any()

Answer 41

my_series[my_series > 50].all()

Answer 42

sum(my_series > 5)

Answer 43

df_copied = df.copy()

Answer 44

df_concated = pandas.concat([df, df2])

Answer 45

ignore them

Answer 46

be maintained, so if both dfs were zero-indexed, there will be duplicate index values.
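
A minimal sketch of that behaviour, assuming two small hypothetical frames that both start at index 0. Passing `ignore_index=True` is one way to get a fresh zero-based index instead.

```python
import pandas as pd

# Two hypothetical frames, each with the default 0..1 index.
df = pd.DataFrame({"value": [1, 2]})
df2 = pd.DataFrame({"value": [3, 4]})

kept = pd.concat([df, df2])                           # original indexes are kept
renumbered = pd.concat([df, df2], ignore_index=True)  # index is rebuilt from zero

print(list(kept.index))
print(list(renumbered.index))
```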

Answer 47

df["Column name"].count()

Answer 48

df["Column name"]

Answer 49

concated_df.ffill()

Answer 50

concated_df.bfill()

Answer 51

concated_df.index = range(concated_df["Column name"].count())

Answer 52

index based arithmetic. If the indexes are not aligned it will add the wrong rows together.
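
As a rough illustration, the sketch below adds two hypothetical series whose indexes overlap only partly. pandas pairs values by index label rather than by position, so labels present on only one side come out as NaN, and the result can differ from a row-by-row sum.

```python
import pandas as pd

# Hypothetical series with shifted indexes.
a = pd.Series([1, 2, 3], index=[0, 1, 2])
b = pd.Series([10, 20, 30], index=[1, 2, 3])

by_label = a + b                  # aligned on index labels
by_position = a.values + b.values  # plain row-by-row sum, ignoring the index

print(by_label)
print(by_position)
```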

Answer 53

multiply every combination of the values and add new rows for each product.

Answer 54

Just be set to zero index numbers.

Answer 55

df.columns = ["Column 1", "Column 2"]

Answer 56

pvt.query('Type == ["Banner", "Text"]')

Answer 57

pvt.xs("Value name", level=0)

Answer 58

SELECT * FROM table

Answer 59

String, Numeric, Date and Time

Answer 60

Text and Varchar

Answer 61

Short strings, like names

Answer 62

Long strings, like descriptions

Answer 63

Integers, Fixed Point Decimal, Float

Answer 64

Sets a strict number of decimal places and is ideal for dollars

Answer 65

does not set a strict number of decimal places
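
The same contrast can be sketched in Python, using the standard `decimal` module as a stand-in for a SQL fixed-point column. This is an analogy only, and the amounts are made up.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2

# Fixed-point decimals keep exactly two decimal places here.
fixed_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.3)              # False
print(fixed_total == Decimal("0.30"))  # True
```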

Answer 66

CREATE TABLE tablename (columnname VARCHAR(50));

Answer 67

CREATE TABLE tablename (columnname VARCHAR(50), columnname INTEGER);

Answer 68

INSERT INTO table VALUES ("String", 1000);

Answer 69

be in the same order you defined in the table.
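
A small sketch using Python's built-in `sqlite3` module and a hypothetical `products` table. The first insert relies on column order; the second names its columns, which is the usual way to supply values in a different order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name VARCHAR(50), price INTEGER)")

# Values follow the order in which the columns were defined: name, then price.
conn.execute("INSERT INTO products VALUES ('Widget', 1000)")

# Naming the columns lets the values come in another order.
conn.execute("INSERT INTO products (price, name) VALUES (250, 'Gadget')")

rows = conn.execute("SELECT name, price FROM products ORDER BY rowid").fetchall()
conn.close()
print(rows)
```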

Answer 70

www.picmonkey.com

Answer 71

two blank lines, not including #comments

Answer 72

on separate lines and at the top with no lines in between

Answer 73

capital letter

Answer 74

precede it by two spaces

Answer 75

an empty line

Answer 76

import pdb; pdb.set_trace() above the area you want to debug

Answer 77

python debugger

Answer 78

remove pdb

Answer 79

pvt.reset_index()

Answer 80

from library_name import Class_name

Answer 81

my_class_instance = Class_name()

Answer 82

parse_dates=["Column name"]

Answer 83

df1.merge(df2, on="Column", how="left")

Answer 84

df.columns[1]

Answer 85

Extra spaces

Answer 86

list(df.columns.values)

Answer 87

dict(zip(list1, list2))

Answer 88

import getpass
pswd = getpass.getpass("Password:")

Answer 89

my_browser.find_element_by_link_text("Link text").click()

Answer 90

use the Firefox driver. The Chrome driver currently has a bug.

Answer 91

```
By.PARTIAL_LINK_TEXT
By.XPATH
By.NAME
By.CLASS_NAME
By.TAG_NAME
By.ID
```

Answer 92

my_browser.refresh()

Answer 93

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(my_browser, 20)
wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, "Reports")))

Answer 94

```
By.PARTIAL_LINK_TEXT
By.XPATH
By.NAME
By.CLASS_NAME
By.TAG_NAME
By.ID
```

Answer 95

do what the user must do in order to make it visible.

Answer 96

my_browser.find_element_by_partial_link_text("Link text")

Answer 97

df["Column name"].astype(float)

Answer 98

df["New"] = numpy.where((df["Column."]>35) | (df["Column 2"]=="Banner"), df["Column2"] * 0.8, df["Column4"])

Answer 99

the conditions on both sides of "and" get evaluated together first, before the "or"
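
A quick way to check this in plain Python, with made-up boolean values. The element-wise operators `&` and `|` follow the same relative order, which is why mixed conditions are usually wrapped in parentheses.

```python
a, b, c = True, False, False

implicit = a or b and c             # parsed as a or (b and c)
grouped_and_first = a or (b and c)  # same thing, written out
grouped_or_first = (a or b) and c   # parentheses force the "or" first

print(implicit, grouped_and_first, grouped_or_first)
```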

Answer 100

column_list = df["Column name"].tolist()

Answer 101

```
column_list = df["Column name"].tolist()
column_list2 = df["Column name2"].tolist()
combined_dict = dict(zip(column_list, column_list2))
```

Answer 102

change the df itself without having to reassign the return of the function to the same df variable name.
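
A minimal sketch of that difference, assuming the card is about the `inplace=True` argument and using a hypothetical two-column frame with `drop`.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})  # hypothetical data

without_b = df.drop("b", axis=1)  # returns a new frame; df is unchanged
columns_before = list(df.columns)

result = df.drop("a", axis=1, inplace=True)  # changes df itself and returns None
columns_after = list(df.columns)

print(columns_before, columns_after, result)
```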

Answer 103

df["2014-11"]

Answer 104

df = pandas.DataFrame(index=pandas.date_range("2014-08-02", "2014-09-06", freq="d"))
df.join(inconsistent_df, how="outer")

Answer 105

alert("Hello!");

Answer 106

document.write("Hello!");

Answer 107

<script src="javascript.js"></script>

Answer 108

console.log("My log");

Answer 109

var my_var;

Answer 110

var my_var = 25;

Answer 111

escape character \ right before it

Answer 112

var dialog_input = prompt("Text to display");

Answer 113

var concated_string = "Hello " + visitor;

Answer 114

```
var message = "Hello ";
message = message + "Dave";
```

Answer 115

```
var message = "Hello ";
message += "Dave";
```

Answer 116

my_string.length;

Answer 117

my_string.toLowerCase();

Answer 118

my_string.toUpperCase();

Answer 119

parse_dates=[0]

Answer 120

df.columns.values[0] = "New label" or df.rename(columns={df.columns[0]:"New name"})

Answer 121

df["Column"] == numpy.nan

Answer 122

ignoring them

Answer 123

df = df.astype(int)

Answer 124

import matplotlib.pyplot as plt
my_plot = df.plot(x="Column", y="Column2", kind="scatter")
my_plot.get_figure().savefig("/users/student/desktop/pii.png", bbox_inches="tight")

Answer 125

the bottom going horizontal

Answer 126

bar, scatter, pie, line

Answer 127

Use pep8, document everything starting with "This", create each part of the project in separate files for easy testing

Answer 128

my_string[-8:]

Answer 129

parseInt(var_string)

Answer 130

parseFloat(var_string)

Answer 131

round(1.3)

Answer 132

round(1.326, 1)

Answer 133

import os | os.system("python /users/student/desktop/script.py")

Answer 134

import subprocess | subprocess.Popen(["open", "/Applications/Calculator.app/"])

Answer 135

df.drop("column_name", axis=1, inplace=True)

Answer 136

import subprocess | subprocess.call(["open", "/Users/student/Desktop/file.app"])

Answer 137

import os | os.startfile("c:/filename/path")

Answer 138

import os | os.remove("/users/student/desktop/file.png")

Answer 139

usecols=range(1,7)

Answer 140

explicit by either index or label, but not a slice

Answer 141

my_series.max()

Answer 142

access to the index. Must use df.index.map() in the case where you must create a column based on the index.

Answer 143

df["Day"] = df.index.map(lambda x: x.strftime("%A"))

Answer 144

df = df.append([["value1", "value2"]], ignore_index=True)

Answer 145

grouped.size()

Answer 146

The machine learns by labeled examples. The data sample you teach it with must be labeled and then it will start to predict.

Answer 147

67% train to 33% test.
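
Assuming this ratio refers to splitting samples into a training set and a test set, the split can be sketched with the standard library alone. The sample data and the seed are placeholders.

```python
import random

samples = list(range(100))  # placeholder data: 100 sample ids

# Shuffle a copy with a fixed seed so the split is repeatable.
shuffled = samples[:]
random.Random(0).shuffle(shuffled)

split_at = len(shuffled) * 67 // 100  # 67% of the samples go to training
train, test = shuffled[:split_at], shuffled[split_at:]

print(len(train), len(test))
```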

Answer 148

less than 100k samples and text data

Answer 149

create the variable that will hold the results and import BeautifulSoup

Answer 150

online learning

Answer 151

"missing features"

Answer 152

re-fit the model based on a new batch.

Answer 153

Fit a model to it.

Answer 154

Case sensitive

Answer 155

df, e.g. df.to_csv("path/to/file.csv")

Answer 156

reassign the df value to the appended version. e.g. df = df.append([[val1, val2]], ignore_index=True)

Answer 157

Dimension/column

Answer 158

instantiate it.

Answer 159

from sklearn.linear_model import LinearRegression | model = LinearRegression()

Answer 160

model.fit(x, y)

Answer 161

```
x = samples by features
y = the labels
```

Answer 162

model.predict([[feature_value, feature_value, feature_value]])

Answer 163

model.predict_proba()

Answer 164

Binarize it into separate columns. | This is because models assume that if it is in the same column that it is a range, like length or height.
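
One way to picture the binarizing step, written in plain Python with a hypothetical `colors` column. Libraries provide helpers for this (for example `pandas.get_dummies`); the loop below only shows the idea.

```python
colors = ["red", "green", "blue", "green"]  # hypothetical categorical column

# One output column per distinct category, in a fixed order.
categories = sorted(set(colors))

# Each row gets a 1 in the column for its own category and 0 elsewhere.
one_hot = [
    [1 if value == category else 0 for category in categories]
    for value in colors
]

print(categories)
print(one_hot)
```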

Answer 165

plots a line through the data and based on the y axis return x.

Answer 166

Holds all samples in a library, compares new data to those samples, and returns the label that matches most closely.

Answer 167

You give a model samples with features and have it predict the label.

Answer 168

trailing underscore

Answer 169

from sklearn.chosenlibrary import ChosenClass
model = ChosenClass()
x = samplesbyfeaturesdata
y = targetlabels
model.fit(x, y)
model.predict([[feature_val, feature_val2, feature_val3]])

Answer 170

classifier

Answer 171

from sklearn import cross_validation

Answer 172

from sklearn import cross_validation
from sklearn.chosenlibrary import ChosenClass
x = samplebyfeaturedata
y = targetlabels
X_train, X_test, y_train, y_test = cross_validation.train_test_split(x, y, test_size=0.25, random_state=0)
clf = ChosenClass()
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
from sklearn.metrics import accuracy_score
accuracy_score(pred, y_test)

Answer 173

dividing large tables into smaller (and less redundant) tables and defining relationships between them.
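
A small sketch of the idea using Python's built-in `sqlite3` module. The `customers` and `orders` tables are hypothetical: the customer name is stored once, and each order points to it through `customer_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50));
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    amount INTEGER
);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (1, 1, 30), (2, 1, 45), (3, 2, 20);
""")

# The relationship between the tables is followed with a join.
rows = conn.execute("""
    SELECT customers.name, orders.amount
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    ORDER BY orders.id
""").fetchall()
conn.close()
print(rows)
```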

Answer 174

my_numpy_array.shape

Answer 175

my_numpy_array = array(df)

Answer 176

classifier.

Answer 177

Training a computer to understand the meaning of words.

Answer 178

a boundary on a scatter plot that divides two different types of data

Answer 179

```
from sklearn.metrics import accuracy_score
pred = clf.predict(features_test)
accuracy_score(pred, labels_test)
# or
model.score(feature_test, label_test)
```

Answer 180

from sklearn.naive_bayes import GaussianNB

Answer 181

df.ix[:,3:4]

Answer 182

from sklearn import tree | clf = tree.DecisionTreeClassifier()

Answer 183

blocky and slicy manner.

Answer 184

How many data samples are needed before the DecisionTree decision boundary bends to fit them.

Answer 185

a measure of impurity of the data samples and controls how impure the data must get before splitting.
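
The card does not name the measure. Assuming it is entropy, a minimal standard-library sketch looks like this; the `entropy` helper and the label lists are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


mixed = entropy(["a", "a", "b", "b"])  # evenly mixed: most impure for two classes
pure = entropy(["a", "a", "a", "a"])   # a single class: no impurity

print(mixed, abs(pure))
```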

Answer 186

overfitting

Answer 187

a simple majority vote of the nearest neighbors of each test point
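
The vote can be sketched in a few lines of plain Python. This is an illustrative toy, not how any particular library implements it; `knn_predict`, the points, and the labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b"]

near_origin = knn_predict(points, labels, (0.2, 0.2))
near_corner = knn_predict(points, labels, (5, 5.5))
print(near_origin, near_corner)
```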

Answer 188

correlation with each feature independently to guide classification. It does not group features.

Answer 189

my_numpy_array.ravel()

Answer 190

parse_dates={"Dates":["Day", "Month", "Year"]}

Answer 191

ranges = [0, 6, 12, 18, 24]
labels = ["AM", "AM", "PM", "PM"]
df["New column"] = pandas.cut(df["column name"], ranges, labels=labels, ordered=False)

Answer 192

df.to_csv("path.csv", index=False)

Answer 193

df.iloc[1,1] = numpy.nan

Answer 194

df = df.dropna()

Answer 195

save the Excel file as a CSV and then use read_csv; it will raise a Unicode error

Answer 196

df.columns = df.iloc[1]

Answer 197

my_string.rfind("substring")

Answer 198

str(x) or int(x) method to convert the objects to string

Answer 199

a tuple of the index and the row data
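
Assuming this card refers to `DataFrame.iterrows()`, the sketch below unpacks each tuple into the index label and the row. The frame is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]}, index=["x", "y"])

# Each iteration yields (index label, row as a Series).
pairs = [(index, row["score"]) for index, row in df.iterrows()]

print(pairs)
```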

Answer 200

logging | note: Not the same character every time

Answer 201

"true value" if 10==10 else "false_value"

Answer 202

an environment you are running your script with, not a place you are saving your scripts.

(250 cards)