ML_Production (Quora) Flashcards
domain definition
- millions of Q&A
- millions of users
- thousands of topics
- main features: relevance, quality, demand
common ML algos
- logistic regression
- ElasticNets
- matrix factorization
- random forests
- DL
implicit vs explicit feedback
- implicit feedback is denser, since it is available for all users (watching a movie is implicit feedback; rating it is explicit)
- better correlated with A/B tests
- but may not correlate with long term user retention
- solution: combine implicit+explicit feedback
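One way to sketch that combination: blend the dense implicit signal with the sparse explicit one into a single preference score, falling back to implicit-only when no rating exists. The weights and the watch-fraction/rating encoding below are illustrative assumptions, not a recipe from the card.

```python
# Hypothetical blend of implicit + explicit feedback into one score.
# Weights w_implicit / w_explicit are assumptions for illustration.

def blended_score(watch_fraction, rating=None, w_implicit=0.4, w_explicit=0.6):
    """watch_fraction: implicit signal in [0, 1] (e.g. fraction of a movie watched).
    rating: explicit 1-5 star rating, or None if the user never rated."""
    implicit = watch_fraction              # dense: available for every user
    if rating is None:
        return implicit                    # sparse explicit signal missing: fall back
    explicit = (rating - 1) / 4            # normalize 1-5 stars to [0, 1]
    return w_implicit * implicit + w_explicit * explicit
```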
model learning dependencies
it will learn according to:
- training data
- target function/variable
- metric used
using ensembles
- flexible in using many different models
- flexible in using many approaches
- treat each model as a feature and add it to the ensemble (i.e. in a linear ‘supermodel’)
- avoid feedback loops
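The "each model as a feature" idea can be sketched as a tiny logistic 'supermodel' over base-model scores. The base models, toy data, and plain gradient-descent training below are stand-in assumptions; real systems would plug in actual rankers and a proper training pipeline.

```python
# Minimal sketch: base-model outputs become features of a linear/logistic
# "supermodel". base_model_a/b are hypothetical stand-ins.
import math

def base_model_a(x):   # stand-in for e.g. a matrix-factorization score
    return x[0]

def base_model_b(x):   # stand-in for e.g. a random-forest score
    return x[1]

def train_supermodel(data, epochs=500, lr=0.5):
    """data: list of (x, label). Learns linear weights over base-model outputs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = [base_model_a(x), base_model_b(x)]    # models as features
            z = sum(wi * f for wi, f in zip(w, feats)) + b
            p = 1 / (1 + math.exp(-z))                    # logistic supermodel
            g = p - y                                     # gradient of log-loss
            w = [wi - lr * g * f for wi, f in zip(w, feats)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = w[0] * base_model_a(x) + w[1] * base_model_b(x) + b
    return 1 / (1 + math.exp(-z))
```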
feature engineering
main characteristics of a feature:
- reusable (across models)
- transformable (applying different functions)
- interpretable
- reliable (easy to monitor, to fix bugs)
ML system goals
strive for all:
- allow for experiments
- reusable
- easy-to-use
- flexible
- scalable
- performant
- use same tools in production and research
- implement abstraction layers for easy access
model easy to debug?
important because:
- determines the model used
- gives answers when something fails
- determines the features to use
- determines the selections of tools for its implementation
distributed machine learning?
most practical ML can be done on a multi-core machine with:
- data sampling
- offline schemes
- efficient parallel code
optimizing computation must take into account:
- costs
- latency
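For the data-sampling point, one standard single-machine technique (my choice of example, not named on the card) is reservoir sampling: a uniform sample of fixed size from a stream too large to hold in memory.

```python
# Reservoir sampling (Algorithm R): uniform sample of k items from a stream
# of unknown length, using O(k) memory on a single machine.
import random

def reservoir_sample(stream, k, seed=None):
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)          # fill the reservoir first
        else:
            j = rng.randint(0, i)           # keep item with probability k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir
```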
hyperparameter optimization
- using Bayesian optimization (GP) better than CV
- tools like spearmint, AutoML, hyperopt
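The core idea behind those tools can be illustrated with a deliberately tiny sequential model-based search: fit a cheap surrogate to past evaluations, then pick the next point by trading off predicted value against unexplored distance. This is a toy (nearest-neighbor surrogate, not a real Gaussian process); all constants are assumptions.

```python
# Toy sequential model-based optimization, illustrating the exploit/explore
# idea behind Bayesian optimization. Not a real GP; numbers are assumptions.
import random

def toy_bayes_opt(objective, lo, hi, n_iter=30, n_candidates=200, kappa=0.3, seed=0):
    rng = random.Random(seed)
    xs = [lo, hi, (lo + hi) / 2]                 # a few initial evaluations
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best_cand, best_acq = None, None
        for _ in range(n_candidates):
            c = rng.uniform(lo, hi)
            d, y_near = min((abs(c - x), y) for x, y in zip(xs, ys))
            acq = y_near - kappa * d             # surrogate mean minus exploration bonus
            if best_acq is None or acq < best_acq:
                best_cand, best_acq = c, acq
        xs.append(best_cand)                     # evaluate the most promising point
        ys.append(objective(best_cand))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]
```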
presentation bias
- users can only click on what the app shows, and what is shown is chosen by the model's own predictions, creating a feedback loop
- address it, for example, by modeling the probability that a user examines and clicks a given position, and correcting for it
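One common correction along those lines (my example, hedged) is inverse propensity weighting: down-weight clicks at prominent positions by the estimated probability that the user even examined that slot. The per-position probabilities below are made-up assumptions.

```python
# Inverse propensity weighting for position bias. EXAMINE_PROB values are
# illustrative assumptions; real systems estimate them from logged data.

EXAMINE_PROB = {1: 0.9, 2: 0.6, 3: 0.4, 4: 0.25, 5: 0.15}

def debiased_click_value(clicked, position):
    """Weight a click by 1 / P(examined | position), so items shown low on
    the page are not unfairly penalized in the training data."""
    if not clicked:
        return 0.0
    return 1.0 / EXAMINE_PROB[position]
```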
collaborative filtering at a glance
a couple ways to do it:
- user-similarity: cosine sim using vector representation on common rated items
- item-similarity: cosine sim using vector representation on common users’ ratings
problems:
- cold start: no info to begin with
- popularity bias: tends to recommend popular items
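The user-similarity variant above can be sketched directly: cosine similarity restricted to the items both users have rated, with zero similarity when there is no overlap (which is exactly the cold-start problem). The toy ratings in the test are assumptions.

```python
# User-similarity collaborative filtering: cosine over commonly rated items.
import math

def user_cosine(ratings_a, ratings_b):
    """ratings_*: dict mapping item -> rating for one user."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0                       # cold start: nothing to compare
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)
```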
hybridization methods
- weighted models based on importance
- switching model used based on situation
- mixed: results presented together
- feature combination (from diff sources for models) for input of a single model
- cascade, feature augmentation: using output of one technique as input of another
learning to rank approaches
- pointwise: using regression or classification (logistic regression)
- pairwise: minimize the inversions in ranking
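The pairwise objective can be made concrete by counting the inversions themselves: pairs where the ground-truth relevance prefers one item but the model's scores disagree. A pairwise learner minimizes a loss over exactly these pairs; the counting helper below is an illustrative sketch.

```python
# Count ranking inversions: pairs (i, j) where relevance says i beats j
# but the model's scores do not reflect that ordering.

def count_inversions(scores, relevance):
    n = len(scores)
    inversions = 0
    for i in range(n):
        for j in range(n):
            if relevance[i] > relevance[j] and scores[i] <= scores[j]:
                inversions += 1          # model disagrees with ground truth
    return inversions
```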