Models & Libraries Flashcards
Decision tree
sklearn.tree/DecisionTreeClassifier
Random Forest
sklearn.ensemble/RandomForestClassifier
Naïve Bayes
sklearn.naive_bayes/MultinomialNB
K-means
sklearn.cluster/KMeans
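The model cards above can be tried together in a minimal sketch, assuming scikit-learn and NumPy are installed. The toy dataset, the `models` dictionary, and the variable names are made up for illustration. K-means is imported from `sklearn.cluster`.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.cluster import KMeans

# Toy data (made up): non-negative counts, two well-separated classes.
# MultinomialNB expects non-negative features such as word counts.
X = np.array([[5, 0], [4, 1], [6, 0], [0, 5], [1, 4], [0, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

# The three supervised models share the same fit/predict interface
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=10, random_state=0),
    "naive_bayes": MultinomialNB(),
}
predictions = {name: model.fit(X, y).predict(X) for name, model in models.items()}

# K-means is unsupervised: it sees only X and returns cluster labels
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

The cluster ids returned by K-means are arbitrary, so compare which points share a label rather than the label values themselves.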
Libraries for task automation
• Automate - automation
• PyAutoGUI - Graphical User Interface (GUI) automation
• Selenium - web browser automation
• Schedule - job scheduling
• Fabric - streamlining the use of SSH for application deployment
• Celery - distributed task queue
• Invoke - task execution and command-line tooling
Libraries for machine learning
• scikit-learn - general-purpose machine learning
• TensorFlow - deep learning and neural networks
• Keras - high-level deep learning
• PyTorch - deep learning and neural networks
• XGBoost - gradient boosting framework
• LightGBM - gradient boosting framework
• CatBoost - gradient boosting framework
• statsmodels - statistical modeling
• NLTK - natural language processing (NLP)
• spaCy - NLP
• Gensim - topic modeling and document similarity analysis
• fastai - deep learning
• H2O-3 - general-purpose machine learning
• Prophet - time series forecasting
• Neural Structured Learning (NSL) - neural graph learning
Libraries for data analysis
• pandas - data manipulation and analysis
• NumPy - numerical computing
• SciPy - advanced scientific computing
• math - Python’s built-in module for mathematical operations
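As a rough illustration of how the built-in `math` module differs from NumPy, here is a minimal sketch, assuming NumPy is installed. The `values` list is made up for the example.

```python
import math
import numpy as np

values = [1.0, 4.0, 9.0, 16.0]

# math operates on one scalar at a time, so a loop or comprehension is needed
roots_math = [math.sqrt(v) for v in values]

# NumPy applies the same operation to a whole array at once
roots_numpy = np.sqrt(np.array(values))
```

Both give the same numbers here. The NumPy form tends to be preferable once the data is already in arrays or is large.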
Libraries for data visualization
• matplotlib - basic plotting
• seaborn - statistical data visualization
• plotly - interactive plotting/APIs
• bokeh - interactive visualization for web browsers
• Vega-Altair - declarative statistical visualization
• GeoPandas - geospatial data visualization
• HoloViews - interactive visualization
• Pygal - Scalable Vector Graphics (SVG) plots
• folium - geospatial data visualization on interactive maps
• Dash by Plotly - analytical web applications
• plotnine - statistical visualization
• NetworkX - network graphs
t-test
from scipy.stats import ttest_ind
# Independent two-sample t-test; data is assumed to hold one column per group
result = ttest_ind(data['group1'], data['group2'])
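The snippet above assumes `data` already exists. A self-contained sketch, assuming SciPy and NumPy are installed, might look like the following. The samples are randomly generated for illustration only, and the 0.05 threshold is a common convention rather than something the card specifies.

```python
import numpy as np
from scipy.stats import ttest_ind

# Made-up samples: two groups whose true means differ by 2
rng = np.random.default_rng(0)
data = {
    'group1': rng.normal(loc=10.0, scale=1.0, size=50),
    'group2': rng.normal(loc=12.0, scale=1.0, size=50),
}

# Independent two-sample t-test (equal variances are assumed by default)
result = ttest_ind(data['group1'], data['group2'])
print(result.statistic, result.pvalue)

# Reject the null hypothesis of equal means if the p-value is below alpha
alpha = 0.05
significant = result.pvalue < alpha
```

If the two groups may have unequal variances, passing `equal_var=False` runs Welch's t-test instead.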