Recommender systems Flashcards

Question 1

Q

What is a recommender system? What is its task?

Answer

A

software tools and techniques that suggest the items most likely to interest a
particular user
- or predict the preference (rating) a user would give to an item
can be used for various tasks, but the main ones are still prediction and suggestion

Question 2

Q

What types of data are used in RecSys?

Answer

A

input
- user data
- item data
- interactions
the RecSys processes them together with context and outputs recommendations

Question 3

Q

What is the taxonomy of RecSys?

Answer

A

personalized -> depends on the data and on each user's interactions
- collaborative filtering -> recommend items liked by users with similar tastes (Item-item similarity, User-user similarity)
non-personalized -> same items suggested to all users
- most popular
- highest rated

Question 4

Q

What are the types of interactions used in RecSys?

Answer

A

explicit feedback (likes, ratings)
- hard to collect, requires user effort
- reliable
implicit feedback (views, clicks)
- easy to collect
- noisy

Question 5

Q

What is the rating matrix?

Answer

A

a way to represent rating information
- rows: users
- columns: items
- explicit -> rating, or 0 if missing; implicit -> boolean
the matrix is generally sparse (density < 0.01%)
user and item distributions are generally long-tailed (most users interact with few items)
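
A minimal sketch of the representation described above, assuming a handful of invented (user, item, rating) triples. The toy matrix is far denser than a real one; the point is only how the matrix and its density are computed.

```python
# Toy explicit ratings as (user, item, rating) triples; all values are invented.
ratings = [
    ("u1", "i1", 5), ("u1", "i2", 3),
    ("u2", "i1", 4),
    ("u3", "i3", 2),
]

users = sorted({u for u, _, _ in ratings})
items = sorted({i for _, i, _ in ratings})

# Rows are users, columns are items, 0 marks a missing rating.
matrix = [[0] * len(items) for _ in users]
for u, i, r in ratings:
    matrix[users.index(u)][items.index(i)] = r

# Density = observed ratings / total number of cells in the matrix.
density = len(ratings) / (len(users) * len(items))

# Interactions per user, to inspect how activity is distributed.
per_user = {u: sum(1 for x in row if x != 0) for u, row in zip(users, matrix)}

print(matrix)
print(round(density, 3), per_user)
```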

Question 6

Q

What are some popular recommendation tasks?

Answer

A

Rating prediction
- explicit feedback
- predict missing ratings in the rating matrix
Top-N item recommendation
- implicit feedback
- predict the N items the user will like the most
- uses a scoring function to estimate relevance
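
A minimal sketch of the top-N step, assuming the relevance scores have already been produced by some scoring function. The `top_n` helper and the score values are hypothetical, for illustration only.

```python
def top_n(scores, seen, n):
    """Return the n highest-scoring items the user has not interacted with yet."""
    candidates = [(item, s) for item, s in scores.items() if item not in seen]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]

# Hypothetical relevance scores for one user.
scores = {"i1": 0.9, "i2": 0.4, "i3": 0.7, "i4": 0.1}
print(top_n(scores, seen={"i1"}, n=2))  # ['i3', 'i2']
```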

Question 7

Q

What are the quality indicators for RecSys? Why are they used?

Answer

A

to tell whether a system is doing a good job
- Relevance -> ability to recommend items that users like
- Coverage -> ability to recommend most of the items in the catalogue
- Novelty -> ability to recommend items unknown to the user
- Diversity -> ability to diversify the recommended items
- Serendipity -> ability to surprise the user (items that users would never have discovered by themselves)

Question 8

Q

How can a recommender system be evaluated?

Answer

A

offline
- does not require the involvement of users
- long-established practice
- based on benchmark datasets (quantitative)
- user experience not considered
online
- users directly involved
- evaluation qualitative and quantitative
- user experience considered
- users are not consistent

Question 9

Q

How does online evaluation work?

Answer

A

direct user feedback (users fill in a form giving feedback on the recommendations)
A/B testing (two groups of users, each given one version, the baseline or the new variant; the improvement is evaluated through metrics or feedback)
controlled (in-lab) experiments

Question 10

Q

What does the cold-start problem refer to in RecSys?

Answer

A

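users (or items) have no interactions the system can learn from
in offline evaluation, a train/test split can leave some test users with no ratings in the training set, so they are unknown at testing time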

Question 11

Q

In what ways can the ratings dataset be partitioned for offline evaluation?

Answer

A

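per-user hold-out (e.g. leave-one-out): some ratings of each user are held out for testing
- avoids the cold-start problem, works even for users with few ratings
random hold-out: test ratings are selected at random
- can produce cold-start users
temporal split: it is good practice to split training and test sets on the basis of the timestamp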
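
A minimal sketch of the timestamp-based split, assuming interactions are stored as (user, item, rating, timestamp) tuples. The helper names and the data are hypothetical. The second helper shows how cold-start users can appear in the test set.

```python
def temporal_split(interactions, test_fraction=0.2):
    """Oldest interactions go to training, the most recent ones to test."""
    ordered = sorted(interactions, key=lambda x: x[3])
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]

def cold_start_users(train, test):
    """Users that appear in the test set but have no training interactions."""
    known = {u for u, _, _, _ in train}
    return {u for u, _, _, _ in test if u not in known}

# Invented interactions: (user, item, rating, timestamp).
interactions = [
    ("u1", "i1", 5, 100), ("u2", "i1", 4, 200), ("u1", "i2", 3, 300),
    ("u3", "i3", 2, 400), ("u2", "i3", 4, 500),
]
train, test = temporal_split(interactions, test_fraction=0.4)
cold = cold_start_users(train, test)
print(len(train), len(test), cold)  # 3 2 {'u3'}
```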

Question 12

Q

What are some evaluation metrics used for offline evaluation?

Answer

A

explicit (rating prediction)
- Mean Absolute Error
- Mean Squared Error
- Root Mean Squared Error
implicit (top-N)
- Recall
- Precision
- Area Under Curve
- Average Precision
- Discounted Cumulative Gain
- Mean Reciprocal Rank
Diversity
- based on a similarity measure between the recommended items
Novelty
- approximately the inverse of the popularity of the retrieved items
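
A few of the metrics listed above can be sketched as follows. The function names and the sample data are illustrative assumptions; the formulas are the standard definitions.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error over paired true/predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def precision_recall_at_k(recommended, relevant, k):
    """Precision@k = hits / k, Recall@k = hits / number of relevant items."""
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / k, hits / len(relevant)

def reciprocal_rank(recommended, relevant):
    """1 / rank of the first relevant item, 0 if none is retrieved."""
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            return 1 / rank
    return 0.0

# Invented data for one user.
y_true, y_pred = [4, 3, 5], [3, 3, 4]
recommended, relevant = ["i3", "i1", "i7", "i2"], {"i1", "i2", "i9"}

print(mae(y_true, y_pred), rmse(y_true, y_pred))
print(precision_recall_at_k(recommended, relevant, k=3))
print(reciprocal_rank(recommended, relevant))  # 0.5
```

Mean Reciprocal Rank is the average of `reciprocal_rank` over all test users.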

Question 13

Q

In non-personalized RS how is most popular computed?

Answer

A

the number of ratings in each column of the rating matrix is counted
the item with the highest number of ratings is selected
if the user has already interacted with that item, the next most popular one is presented
- unless re-consumption is allowed
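
The counting rule above can be sketched as follows, assuming a small dense rating matrix where 0 means "no rating". The `most_popular` helper and its parameters are hypothetical.

```python
def most_popular(matrix, user_row, n=1, allow_reconsumption=False):
    """Rank items by number of ratings; skip items the user already rated."""
    n_items = len(matrix[0])
    # Count the non-missing entries in each column.
    counts = [sum(1 for row in matrix if row[j] != 0) for j in range(n_items)]
    ranked = sorted(range(n_items), key=lambda j: counts[j], reverse=True)
    if not allow_reconsumption:
        ranked = [j for j in ranked if matrix[user_row][j] == 0]
    return ranked[:n]

# Invented matrix: rows are users, columns are items.
matrix = [
    [5, 3, 0],
    [4, 0, 0],
    [0, 2, 2],
    [1, 0, 0],
]
print(most_popular(matrix, user_row=0))  # [2]
print(most_popular(matrix, user_row=0, allow_reconsumption=True))  # [0]
```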

Question 14

Q

In non-personalized RS how is highest rated computed?

Answer

A

the average of each column of the rating matrix is computed (over the existing ratings)
the item with the highest average rating is selected
if the user has already interacted with that item, the next highest-rated one is presented
- unless re-consumption is allowed
generally a normalization (shrinkage) factor is added to bias the ranking towards popular items
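
A minimal sketch of why the normalization factor matters. It assumes one common form, where the sum of the ratings is divided by the number of ratings plus a constant; the constant `shrink=5.0` and the data are arbitrary choices for illustration.

```python
def plain_average(ratings):
    return sum(ratings) / len(ratings)

def shrunk_average(ratings, shrink=5.0):
    """Average with a shrinkage term: items with few ratings are pulled down."""
    return sum(ratings) / (len(ratings) + shrink)

# Invented ratings: one item with a single 5, one item with many good ratings.
item_ratings = {
    "niche": [5],
    "popular": [4, 4, 5, 4, 5, 4, 4, 5],
}

best_plain = max(item_ratings, key=lambda i: plain_average(item_ratings[i]))
best_shrunk = max(item_ratings, key=lambda i: shrunk_average(item_ratings[i]))
print(best_plain, best_shrunk)  # niche popular
```

Without shrinkage, a single enthusiastic rating is enough to top the ranking; with it, the well-rated popular item wins.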

Question 15

Q

What approaches exist to collaborative filtering?

Answer

A

similarity-wise
- item-based, based on similarity between items (share many users)
- user-based, based on similarity between users (share many items)
algorithm-wise
- memory-based, compute the similarity between users or items directly from the rating matrix
- model-based, learn a model that predicts users’ ratings of unrated items

Question 16

Q

What are some similarity measures used in CF?

Answer

A

implicit feedback
- cosine similarity
explicit feedback
- Pearson correlation -> two users are similar if their ratings deviate from the mean in the same way
- adjusted cosine similarity
  - accounts for differences in rating scales; more appropriate to center ratings on the user mean
- shrinkage, re-weighting that penalizes similarities computed on few ratings
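
A minimal sketch of the measures above on plain Python lists. It assumes 0 means "no rating", computes Pearson over co-rated items only (one common variant), and uses an arbitrary shrinkage constant `h`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two interaction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def pearson(a, b):
    """Pearson correlation over the items rated by both users (0 = missing)."""
    common = [(x, y) for x, y in zip(a, b) if x != 0 and y != 0]
    if len(common) < 2:
        return 0.0
    ma = sum(x for x, _ in common) / len(common)
    mb = sum(y for _, y in common) / len(common)
    num = sum((x - ma) * (y - mb) for x, y in common)
    den = math.sqrt(sum((x - ma) ** 2 for x, _ in common)) * \
          math.sqrt(sum((y - mb) ** 2 for _, y in common))
    return num / den if den else 0.0

def shrink_similarity(sim, n_common, h=10):
    """Down-weight a similarity supported by few co-rated items."""
    return sim * n_common / (n_common + h)

print(cosine([1, 1, 0, 1], [1, 0, 0, 1]))    # implicit feedback
print(pearson([5, 3, 0, 4], [4, 2, 5, 3]))   # explicit feedback
print(shrink_similarity(1.0, n_common=3))
```

In the Pearson example the second user rates everything one point lower, yet the correlation is close to 1 because both users deviate from their own mean in the same way.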

Question 17

Q

What are some memory based methods used in CF?

Answer

A

k-Nearest Neighbours
- weighted combination of the ratings given by the most similar users (or to the most similar items)
- works with both implicit and explicit feedback
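
A minimal sketch of a user-based k-NN prediction, assuming cosine similarity on raw rating rows and 0 for missing ratings. The function name, `k`, and the data are illustrative choices, not a reference implementation.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_user_knn(matrix, user, item, k=2):
    """Similarity-weighted average of the k most similar users who rated the item."""
    neighbours = [
        (cosine(matrix[user], row), row[item])
        for other, row in enumerate(matrix)
        if other != user and row[item] != 0
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    norm = sum(abs(sim) for sim, _ in top)
    if norm == 0:
        return None  # no usable neighbour rated this item
    return sum(sim * rating for sim, rating in top) / norm

# Invented matrix: rows are users, columns are items.
matrix = [
    [5, 4, 0],
    [5, 4, 2],
    [1, 1, 5],
    [4, 5, 3],
]
print(predict_user_knn(matrix, user=0, item=2, k=2))
```

User 0 is most similar to users 1 and 3, so the prediction falls between their ratings (2 and 3) rather than near the 5 given by the dissimilar user 2.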

Question 18

Q

What are some model based methods used in CF?

Answer

A

matrix factorization
- latent representations (factors) of users and items are learnt from data
- users and items are mapped into a joint latent factor space of dimensionality k
- interactions are modeled as the scalar product between user and item vectors
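
In symbols, the scalar-product model above can be written as follows. The notation is an assumption, not taken from the cards: $\mathbf{p}_u$ and $\mathbf{q}_i$ are the k-dimensional latent vectors of user $u$ and item $i$. The second line is the regularized squared-error objective commonly paired with this model, with $\mathcal{K}$ the set of observed ratings and $\lambda$ a regularization weight.

```latex
\hat{r}_{ui} = \mathbf{p}_u^{\top}\mathbf{q}_i = \sum_{f=1}^{k} p_{uf}\, q_{if}

\min_{P,\,Q} \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - \mathbf{p}_u^{\top}\mathbf{q}_i \right)^2
  + \lambda \left( \lVert \mathbf{p}_u \rVert^2 + \lVert \mathbf{q}_i \rVert^2 \right)
```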

Question 19

Q

How is training for matrix factorization done?

Answer

A

Stochastic Gradient Descent (SGD)
Alternating Least Squares (ALS)
- easy to parallelize
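
A minimal sketch of SGD training for matrix factorization, assuming a regularized squared-error loss and (user index, item index, rating) triples. The hyperparameters (`k`, `lr`, `reg`, `epochs`) and the data are arbitrary illustrative values.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted rating = scalar product of user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    return math.sqrt(
        sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)
    )

def train_mf_sgd(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
                 epochs=500, seed=0):
    rng = random.Random(seed)
    # Small random initialization of the latent factors.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    history = [rmse(P, Q, samples)]
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the observed ratings in random order
        for u, i, r in samples:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error plus L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        history.append(rmse(P, Q, samples))
    return P, Q, history

# Invented observed ratings: (user index, item index, rating).
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P, Q, history = train_mf_sgd(ratings, n_users=3, n_items=3)
print(history[0], history[-1])  # training RMSE before and after
```

ALS minimizes the same kind of objective but alternates between fixing the user factors and solving for the item factors (and vice versa); each of those solves is independent per user or per item, which is why it parallelizes easily.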

(19 cards)