451 - 503 Flashcards
matplotlib.pyplot.contour() / Axes.contour()
used to plot contour lines: draws curves of constant z over a 2-D grid.
x, y = np.mgrid[-3*np.pi:3*np.pi:100j, -3*np.pi:3*np.pi:100j]
z = np.sinc(x) + np.cos(y)
fig, ax = plt.subplots()
ax.contour(z)
fig.set_figwidth(8)   # width of the Figure
fig.set_figheight(8)  # height of the Figure

x, y = np.mgrid[-3*np.pi:3*np.pi:300j, -3*np.pi:3*np.pi:300j]
z = np.sinc(x) + np.cos(y)
fig, ax = plt.subplots()
ax.contour(z, levels=20)
fig.set_figwidth(12)   # width of the Figure
fig.set_figheight(12)  # height of the Figure
plt.show()
CI and CD
Continuous Integration (CI) and Continuous Deployment (CD) form a software development practice enabled by automation tooling. Regular, reliable updates shorten release cycles through continuous delivery of code.
numpy.genfromtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=" !#$%&'()*+,-./:;<=>?@[\]^{|}~", replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None, encoding='bytes', *, ndmin=0, like=None)
Load data from a text file, with missing values handled as specified.
Each line past the first skip_header lines is split at the delimiter character, and characters
following the comments character are discarded.
from io import StringIO
s = StringIO("1,1.3,abcde")
data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),('mystring','S5')], delimiter=",")
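The description stresses missing-value handling, which the example above doesn't exercise; a minimal sketch (the data and the -999 sentinel are made up for illustration):
from io import StringIO
import numpy as np

raw = StringIO("1,2,3\n4,,6")
# The empty field is treated as missing and replaced by filling_values
arr = np.genfromtxt(raw, delimiter=",", filling_values=-999)
print(arr)
👉 [[   1.    2.    3.]
    [   4. -999.    6.]]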
pandas.read_stata(filepath_or_buffer, *, convert_dates=True, convert_categoricals=True, index_col=None, convert_missing=False, preserve_dtypes=True, columns=None, order_categoricals=True, chunksize=None, iterator=False, compression='infer', storage_options=None)
Read Stata file into DataFrame.
df = pd.read_stata('animals.dta')
itr = pd.read_stata('filename.dta', chunksize=10000)
for chunk in itr:
    # Operate on a single chunk, e.g., chunk.mean()
    pass
pandas.Series.to_frame(name=_NoDefault.no_default)
is used to convert the given Series object to a DataFrame; the optional name argument sets the column name.
s = pd.Series(["a", "b", "c"], name="vals")
s.to_frame()
  vals
0    a
1    b
2    c
pandas.ExcelFile.parse(sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, parse_dates=False, date_parser=None, thousands=None, comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, **kwds)
Parse specified sheet(s) into a DataFrame.
Equivalent to read_excel(ExcelFile, …). See the read_excel docstring for more info on accepted parameters.
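A minimal sketch, assuming a local workbook 'tmp.xlsx' with a sheet named 'Sheet1' exists:
import pandas as pd

# Open the workbook once, then parse individual sheets from it
xls = pd.ExcelFile('tmp.xlsx')
df = xls.parse('Sheet1', index_col=0)  # same keywords as read_excel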
pandas.read_excel(io, sheet_name=0, *, header=0, names=None, index_col=None, usecols=None, squeeze=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, decimal='.', comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, storage_options=None)
Read an Excel file into a pandas DataFrame. Supports xls, xlsx, xlsm, xlsb, odf, ods and odt file extensions read from a local filesystem or URL. Supports an option to read a single sheet or a list of sheets.
pd.read_excel('tmp.xlsx', index_col=0)
pd.read_excel(open('tmp.xlsx', 'rb'), sheet_name='Sheet3')
pd.read_excel('tmp.xlsx', index_col=None, header=None)
pd.read_excel('tmp.xlsx', index_col=0, comment='#')
str.isalpha()
method returns True if the string is non-empty and all of its characters are alphabetic (letters); digits, spaces, and punctuation make it return False.
txt = "Company10" x = txt.isalpha() print(x) 👉 False
txt = "CompanyX" x = txt.isalpha() print(x) 👉 True
LinkedIn BOOLEAN SEARCH (AND, OR, NOT, (), "")
Operators that let you narrow or broaden a search query.
- OR → matches any of the listed terms
- AND → results must include all of the terms
- "" → the quoted words must appear exactly, in the given order
- NOT → excludes specific words or phrases from the results
- () → the expression inside the parentheses is evaluated first
"Software Developer" OR "Python Developer" OR "Data Analyst" OR "Data Engineer" OR "Data Scientist" OR "Machine Learning"
bookkeeper OR accountant
ceo OR founder OR entrepreneur
accounting AND law
ceo AND nutrition AND fitness
"freelance writer"
"business development manager"
vp OR director NOT assistant
"personal trainer" NOT "weight loss"
"business owner" AND (coach OR consultant) AND (health OR fitness OR nutrition)
"personal trainer" AND (moms OR pregnancy OR "weight loss") NOT injury
pyautogui.Window Functions
- pyautogui.getWindows() → returns a dict of window titles mapped to window IDs
- pyautogui.getWindow(str_title_or_int_id) → returns a "Win" object
- pyautogui.win.move(x, y)
- pyautogui.win.resize(width, height)
- pyautogui.win.maximize()
- pyautogui.win.minimize()
- pyautogui.win.restore()
- pyautogui.win.close()
- pyautogui.win.position() → returns (x, y) of top-left corner
- pyautogui.win.moveRel(x=0, y=0) → moves relative to the x, y of the top-left corner of the window
- pyautogui.win.clickRel(x=0, y=0, clicks=1, interval=0.0, button='left') → clicks relative to the x, y of the top-left corner of the window
- pyautogui.win.isMinimized()
- pyautogui.win.isMaximized()
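The list above reflects PyAutoGUI's older planned window API; current releases expose window control through PyGetWindow (e.g. pyautogui.getWindowsWithTitle(), pyautogui.getActiveWindow(), Windows only). A minimal sketch under that assumption, with a Notepad window open:
import pyautogui

# Returns a list of Window objects whose titles contain the text
wins = pyautogui.getWindowsWithTitle('Untitled - Notepad')
if wins:
    win = wins[0]
    win.moveTo(100, 100)    # move top-left corner to (100, 100)
    win.resizeTo(800, 600)  # resize to 800x600 pixels
    win.maximize()
    win.restore()
    print(win.isMaximized, win.isMinimized)  # properties, not methods, here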
pandas.DataFrame.cumsum(axis=None, skipna=True, *args, **kwargs)
used to find the cumulative sum value over any axis. Each cell is populated with the cumulative sum of the values seen so far.
s = pd.Series([2, np.nan, 5, -1, 0])
s.cumsum()
0    2.0
1    NaN
2    7.0
3    6.0
4    6.0

s = pd.Series([2, np.nan, 5, -1, 0])
s.cumsum(skipna=False)
0    2.0
1    NaN
2    NaN
3    NaN
4    NaN
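The method also works column-wise on a DataFrame; a small sketch with made-up values:
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, np.nan, 30]})
print(df.cumsum())  # cumulative sum down each column (axis=0)
👉    A     B
   0  1  10.0
   1  3   NaN
   2  6  40.0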
pandas.DataFrame.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)
method returns the rank of each value along an axis. Values are ranked by their position after sorting; ties share the average of their ranks by default (method='average').
s = pd.Series(range(5), index=list("abcde"))
s["d"] = s["b"]
s.rank()
a    1.0
b    2.5
c    4.0
d    2.5
e    5.0
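A small DataFrame sketch (made-up scores) showing how method changes tie handling:
import pandas as pd

df = pd.DataFrame({"score": [88, 92, 88, 75]})
df["avg"] = df["score"].rank()              # ties share the average rank
df["min"] = df["score"].rank(method="min")  # ties share the lowest rank
print(df)
👉    score  avg  min
   0     88  2.5  2.0
   1     92  4.0  4.0
   2     88  2.5  2.0
   3     75  1.0  1.0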
pandas.DataFrame.idxmin(axis=0, skipna=True, numeric_only=False)
function to find the index of the minimum value along the index axis.
df = pd.DataFrame({"A":[4, 5, 2, 6], "B":[11, 2, 5, 8], "C":[1, 8, 66, 4]}) df.idxmin(axis = 0) A 2 B 1 С 0
df = pd.DataFrame({"A":[4, 5, 2, None], "B":[11, 2, None, 8], "C":[1, 8, 66, 4]}) # Skipna = True will skip all the Na values df.idxmin(axis = 1, skipna = True) 0 C 1 B 2 A 3 C
pandas.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)
Read SQL database table into a DataFrame.
pd.read_sql_table('table_name', 'postgres:///db_name')
from sqlalchemy import create_engine
cnx = create_engine('sqlite:///students.db').connect()
df = pd.read_sql_table('students', cnx)
pandas.DataFrame.assign(**kwargs)
Assign new columns to a DataFrame. Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten.
df = pd.DataFrame({'temp_c': [17.0, 25.0]}, index=['Portland', 'Berkeley'])
df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)
          temp_c  temp_f
Portland    17.0    62.6
Berkeley    25.0    77.0
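The overwrite behavior mentioned above, in a quick sketch (assign returns a copy; the original df is unchanged):
df = pd.DataFrame({'temp_c': [17.0, 25.0]}, index=['Portland', 'Berkeley'])
# Re-assigning an existing column overwrites it in the returned copy
df.assign(temp_c=lambda x: x.temp_c + 1)
          temp_c
Portland    18.0
Berkeley    26.0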
pandas.DataFrame.quantile(q=0.5, axis=0, numeric_only=_NoDefault.no_default, interpolation='linear', method='single')
return values at the given quantile over the requested axis. A quantile divides sorted values (or a probability distribution) into equal-sized groups of adjacent values.
df = pd.DataFrame({"A":[1, 5, 3, 4, 2], "B":[3, 2, 4, 3, 4], "C":[2, 2, 7, 3, 4], "D":[4, 3, 6, 12, 7]}) df.quantile(.2, axis = 0) A 1.8 B 2.8 С 2.0 D 3.8
pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=True)
Convert argument to datetime. This function converts a scalar, array-like, Series or DataFrame/dict-like to a pandas datetime object.
df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})
pd.to_datetime(df)
0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]

s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)
s.head()
0    3/11/2000
1    3/12/2000
2    3/13/2000
3    3/11/2000
4    3/12/2000
dtype: object

pd.to_datetime(1490195805, unit='s')
Timestamp('2017-03-22 15:16:45')
pd.to_datetime(1490195805433502912, unit='ns')
Timestamp('2017-03-22 15:16:45.433502912')

pd.to_datetime(['2018-10-26 12:00 -0500', '2018-10-26 13:00 -0500'])
DatetimeIndex(['2018-10-26 12:00:00-05:00', '2018-10-26 13:00:00-05:00'],
              dtype='datetime64[ns, pytz.FixedOffset(-300)]', freq=None)
pandas.DataFrame.nlargest(n, columns, keep='first')
Return the first n rows with the largest values in columns, in descending order. The columns that are not specified are returned as well, but not used for ordering.
data = pd.read_csv("employees.csv")
data.dropna(inplace=True)
large5 = data.nlargest(5, "Salary")
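A self-contained sketch (made-up numbers) showing that unlisted columns are kept but not used for ordering, and that keep='first' breaks ties by row order:
df = pd.DataFrame({"name": ["Ann", "Bob", "Cid", "Dee"],
                   "salary": [70, 95, 95, 60]})
df.nlargest(2, "salary")
  name  salary
1  Bob      95
2  Cid      95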
pandas.DataFrame.loc
Access a group of rows and columns by label(s) or a boolean array.
# Set a cell by label while iterating: store the length of each row's "sex" value
for lab, row in cars.iterrows():
    cars.loc[lab, "name"] = len(row["sex"])
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df.loc['cobra', 'shield'] 👉 2

data = [[50, True], [40, False], [30, False]]
label_rows = ["Sally", "Mary", "John"]
label_cols = ["age", "qualified"]
df = pd.DataFrame(data, label_rows, label_cols)
print(df.loc["Mary", "age"]) 👉 40
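The description also allows a boolean array; a sketch reusing the snakes DataFrame from above:
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
# Boolean-array indexing: select the rows where the condition holds
df.loc[df['shield'] > 4]
            max_speed  shield
viper               4       5
sidewinder          7       8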
pandas.DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
allows you to apply a function along one of the axes of the DataFrame; the default is axis=0, the index (row) axis.
cars["test"] = cars["sex"].apply(str.upper)
df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
df
   A  B
0  4  9
1  4  9
2  4  9

df.apply(np.sqrt)
     A    B
0  2.0  3.0
1  2.0  3.0
2  2.0  3.0

df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]