Data Analyst Flashcards
What is the population mean?
It is the arithmetic mean of the values of a characteristic over the whole population.
mu = (1/N) * sum(x_i), where N is the size of the population
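A minimal sketch of the formula in plain Python. The five values are made up for illustration and stand in for a complete population.

```python
# Hypothetical population of five values (illustrative data only).
population = [2, 4, 4, 6, 9]

# Population mean: the sum of all values divided by the population size N.
N = len(population)
population_mean = sum(population) / N
print(population_mean)
```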
How to read a CSV file? [pandas]
import pandas as pd
data = pd.read_csv('path_name', sep=';')
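A small runnable sketch of the same call. Instead of a real path it reads from an in-memory string, so the column names and values here are assumptions made up for the example.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated content standing in for a file on disk.
csv_text = "Country;Area\nA;10\nB;20\n"

# read_csv accepts a file-like object as well as a path.
data = pd.read_csv(io.StringIO(csv_text), sep=";")
print(data)
```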
How to get the first / last n rows of the data? [pandas]
data.head(5)
data.tail(5)
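A quick sketch on a made-up ten-row frame (the column name x is hypothetical), showing which rows each call returns.

```python
import pandas as pd

# Hypothetical frame with rows 0..9.
data = pd.DataFrame({"x": range(10)})

first_three = data.head(3)  # rows 0, 1, 2
last_three = data.tail(3)   # rows 7, 8, 9
print(first_three)
print(last_three)
```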
How to replace NaN with zeroes? [pandas]
[2 options]
data.replace(np.nan, 0)  # needs: import numpy as np
data.fillna(0)
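A sketch comparing the two options on a made-up frame. Both calls return a new DataFrame and leave the original untouched, so the result has to be assigned to keep it.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value per column.
data = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

replaced = data.replace(np.nan, 0)
filled = data.fillna(0)

# The original still contains its NaN values.
print(data.isna().sum().sum())
print(filled)
```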
How to convert column to float? [pandas]
dataset[column_name] = dataset[column_name] \
.str.replace(',', '.') \
.astype(float)
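A runnable sketch of the conversion, assuming the column holds numbers written with a decimal comma. The column name Density and its values are made up for the example.

```python
import pandas as pd

# Hypothetical column of strings that use a decimal comma.
dataset = pd.DataFrame({"Density": ["1,5", "20,25", "3"]})
column_name = "Density"

# Swap the comma for a dot (literal match, not a regex), then cast to float.
dataset[column_name] = (
    dataset[column_name]
    .str.replace(",", ".", regex=False)
    .astype(float)
)
print(dataset[column_name].tolist())
```

If the data comes from a CSV file, passing decimal=',' to pd.read_csv may avoid this step altogether.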
How to sort rows by some columns? [pandas]
dataset.sort_values(by=[sortBy], ascending=False)[['Country Name', sortBy]]
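A sketch of the sort on a made-up frame. Here sortBy is assumed to hold a column name (GDP is a hypothetical choice), and the extra column shows that the trailing selection keeps only two columns.

```python
import pandas as pd

# Hypothetical data; "GDP" and "Other" are illustrative columns.
dataset = pd.DataFrame({
    "Country Name": ["A", "B", "C"],
    "GDP": [3, 10, 7],
    "Other": [0, 0, 0],
})
sortBy = "GDP"

# Sort descending by the chosen column, then keep only the two columns of interest.
top = dataset.sort_values(by=[sortBy], ascending=False)[["Country Name", sortBy]]
print(top)
```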
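How to find max element (its index) in data? [pandas]
How can we get that row?
ind_max_area = data['Area'].idxmax()  # index label of the largest value; idxmin() gives the smallest
data.loc[ind_max_area]  # idxmax() returns a label, so select the row with .loc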
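A sketch on a made-up frame whose index does not start at 0, to show why the lookup is done by label. idxmax() and idxmin() return index labels, which match positions only when the index is the default 0..n-1 range.

```python
import pandas as pd

# Hypothetical data with a non-default index.
data = pd.DataFrame(
    {"Country": ["A", "B", "C"], "Area": [10, 30, 20]},
    index=[100, 101, 102],
)

ind_max_area = data["Area"].idxmax()  # label of the row with the largest Area
row = data.loc[ind_max_area]          # label-based lookup of that row
print(ind_max_area)
print(row["Country"])
```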
How to replace nan with column mean? [pandas]
data.fillna(data.mean(numeric_only=True))  # each NaN is filled with the mean of its own column
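A sketch of filling gaps with column means on a made-up frame. It assumes the frame may also contain text columns, which is why numeric_only=True is passed to mean().

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one numeric column with a gap and one text column.
data = pd.DataFrame({"Area": [10.0, np.nan, 30.0], "Name": ["A", "B", "C"]})

# mean() gives one value per numeric column; fillna matches them by column name.
filled = data.fillna(data.mean(numeric_only=True))
print(filled)
```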
How to count occurrences of each value? [pandas]
Ex: T T K -> T = 2, K = 1
data['Population'].value_counts()
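The T T K example from the card, written out as a runnable sketch on a standalone Series.

```python
import pandas as pd

# The values from the example: T, T, K.
values = pd.Series(["T", "T", "K"])

# value_counts() returns a Series indexed by the distinct values.
counts = values.value_counts()
print(counts)
```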
How to group by? [pandas]
groupedData = data.groupby('Region')['Area'].mean()
print(str(groupedData.idxmax()) + ' that has ' + str(groupedData.max()))
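A sketch of the same grouping on made-up data. The region names and areas are hypothetical; the result is a Series indexed by region, so idxmax() returns a region name.

```python
import pandas as pd

# Hypothetical data: two rows for Asia, one for Europe.
data = pd.DataFrame({
    "Region": ["Asia", "Asia", "Europe"],
    "Area": [10.0, 30.0, 50.0],
})

# Mean Area per Region.
groupedData = data.groupby("Region")["Area"].mean()

best_region = groupedData.idxmax()  # region with the largest mean area
best_value = groupedData.max()
print(f"{best_region} that has {best_value}")
```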