05 - Recommender Systems Evaluation Flashcards by Michelle Limbach

What questions should you ask yourself if you develop a recommender system?

Objective: What do you want to achieve with the model?
How to measure: Evaluation methods and Evaluation metrics
How good/relevant are the results?

What are the goals of the business world?

A successful business
Maximum profit, income, and user satisfaction
Minimize costs
Get as many users as possible
To have the best product

What costs may arise?

Labour costs
Server
Legal/Licenses
etc.

What is Goodhart’s Law?

When a measure becomes a target, it ceases to be a good measure

What are the three main evaluation methods?

Online Evaluations
Offline Evaluations
User Studies

What is part of Online Evaluations?

Sales
Profit
Clicks

What is part of Offline Evaluations?

Errors
Accuracy

What is part of User Studies?

User feedback
User observations

How does an A/B Test work?

Typical Online Test
50% of the users see variant A
50% of the users see variant B
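
The 50/50 split can be sketched in Python as follows. This is only a minimal illustration: the function names, the experiment label, and the choice of hashing the user ID (so that a user always sees the same variant) are assumptions, not part of the original card.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "recsys-test") -> str:
    """Deterministically assign a user to variant 'A' or 'B' (roughly 50/50).

    Hashing the user ID keeps the assignment stable across visits.
    The experiment label is a hypothetical name used as a salt.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def click_through_rate(clicks: int, impressions: int) -> float:
    """Example online metric to compare between the two groups."""
    return clicks / impressions if impressions else 0.0
```

After the test has run, an online metric such as the click-through rate is computed per group and the two values are compared.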

How does Interleaving work?

Randomize Rankings
All kinds of variations (Random Mix, Top n Mix, Fixed amount Mix)
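
One possible reading of the "Random Mix" variation is sketched below in Python. The function name and the exact mixing rule (pick one of the two rankings at random for each slot, skip duplicates) are assumptions made for illustration; the card does not define the variations in detail.

```python
import random


def random_mix_interleave(ranking_a, ranking_b, n, seed=None):
    """Mix two rankings into one list of (item, source) pairs.

    For each slot, one of the two rankings is chosen at random and its
    best not-yet-shown item is taken. The source label makes it possible
    to credit clicks to ranking A or B afterwards.
    """
    rng = random.Random(seed)
    queues = {"A": list(ranking_a), "B": list(ranking_b)}
    mixed, seen = [], set()
    while len(mixed) < n:
        # Drop items that were already placed via the other ranking.
        for queue in queues.values():
            while queue and queue[0] in seen:
                queue.pop(0)
        available = [name for name, queue in queues.items() if queue]
        if not available:
            break  # both rankings are exhausted
        source = rng.choice(available)
        item = queues[source].pop(0)
        seen.add(item)
        mixed.append((item, source))
    return mixed
```

Counting which source the clicked items came from then gives a preference between the two rankings from a single result list.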

What is a typical metric for classification?

Accuracy

What is a typical metric for Regression?

Error Metrics

What is a typical metric for Ranking?

Ranking Metrics

Is Regression = Classification?

Regression tasks can be interpreted as a classification or ranking problem
Define intervals and treat them as classes (and use a classification algorithm instead of a regression algorithm)
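
A minimal Python sketch of turning a numeric target into classes via intervals. The boundaries (2.0 and 3.5) and the labels are made-up values for a hypothetical 1-to-5 rating scale, not taken from the card.

```python
import bisect


def rating_to_class(rating, boundaries=(2.0, 3.5), labels=("low", "medium", "high")):
    """Map a numeric rating to a class label.

    With the assumed defaults: rating < 2.0 is 'low',
    2.0 <= rating < 3.5 is 'medium', rating >= 3.5 is 'high'.
    labels must contain exactly one more entry than boundaries.
    """
    return labels[bisect.bisect_right(boundaries, rating)]
```

The resulting labels can then be fed to any classification algorithm in place of the original numeric values.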

What regression metrics do you know?

Mean Absolute Error (MAE)
(Root) Mean Squared Error ((R)MSE)

What is Mean Absolute Error (MAE)?

Average absolute difference between prediction and observation
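
As a small Python sketch (the function name and the example ratings are illustrative assumptions):

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |observation - prediction| over all pairs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Observed ratings vs. predicted ratings:
# absolute errors are 0.5, 0, 1 and 2, so the MAE is 3.5 / 4 = 0.875
print(mean_absolute_error([4, 3, 5, 2], [3.5, 3, 4, 4]))
```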

What is the benefit of Mean Absolute Error (MAE)?

Intuitive

What is the drawback of RMSE?

Not very intuitive
Penalizes large errors more heavily than small ones, because errors are squared
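
The effect of squaring can be seen in a short Python sketch. The helper names and the toy numbers are assumptions chosen so that both prediction sets have the same MAE.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square, average, then take the root."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


y_true = [3, 3, 3, 3]
pred_even = [4, 4, 4, 4]      # four errors of size 1
pred_outlier = [3, 3, 3, 7]   # one error of size 4

print(mae(y_true, pred_even), mae(y_true, pred_outlier))    # 1.0 1.0
print(rmse(y_true, pred_even), rmse(y_true, pred_outlier))  # 1.0 2.0
```

Both prediction sets have the same MAE, but the single large error doubles the RMSE.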

What (Ranked) Retrieval Metrics do you know?

Mean Reciprocal Rank (MRR)
Mean Average Precision (MAP)
Normalized Discounted Cumulative Gain (nDCG)

What is Mean Reciprocal Rank (MRR)?

Measures at which rank the first relevant result appears
Considers only the first relevant result
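
A minimal Python sketch of MRR over several queries. The function names and the toy rankings are illustrative assumptions; a query with no relevant result is scored as 0 here, which is a common convention.

```python
def reciprocal_rank(ranked_items, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevant_sets):
    """Average the reciprocal rank over all queries."""
    scores = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevant_sets)]
    return sum(scores) / len(scores) if scores else 0.0


rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant_sets = [{"a"}, {"e"}, {"z"}]
# Reciprocal ranks are 1, 1/2 and 0, so the MRR is 0.5
print(mean_reciprocal_rank(rankings, relevant_sets))
```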

What is Normalized Discounted Cumulative Gain (nDCG)?

Measures how well a ranking places more relevant items above less relevant ones

Into which steps can the Normalized Discounted Cumulative Gain (nDCG) be divided?

Step 1: Cumulative gain = Sum of relevance of the top n items
Step 2: Discounted Cumulative Gain: Discounts the gain of relevant items that appear lower in the ranking
Step 3: Normalized Discounted Cumulative Gain: Divides DCG by the DCG of the ideal ranking, so the score lies between 0 and 1
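
The three steps can be sketched in Python as follows. The function names are illustrative, and the logarithmic discount 1 / log2(rank + 1) is an assumption: it is a common choice, but the card does not state which discount is used. For simplicity, the ideal ranking is built from the same list of relevance scores.

```python
import math


def cumulative_gain(relevances, n):
    """Step 1: sum of the relevance scores of the top n items."""
    return sum(relevances[:n])


def dcg(relevances, n):
    """Step 2: each gain is divided by log2(rank + 1) (assumed discount)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:n], start=1)
    )


def ndcg(relevances, n):
    """Step 3: DCG divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True), n)
    return dcg(relevances, n) / ideal if ideal > 0 else 0.0
```

A ranking that already lists its items from most to least relevant gets a score of 1; placing relevant items further down lowers the score.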

What is Effectiveness?

Doing the right things

What is Efficiency?

Doing things right

What is Performance?

Sometimes a synonym for Effectiveness
Sometimes used as a generic term