8. Recommender Systems Flashcards
What is the utility function for the formal model?
u: X × S → R
where X is the set of customers, S is the set of items, and R is the set of ratings (a totally ordered set, e.g. 0-5 stars).
Essentially, we get a rating for each customer/item pairing.
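A minimal sketch of the utility function as a sparse lookup, assuming ratings are stored per (customer, item) pair. The names and values are made up for illustration; unknown ratings come back as None.

```python
# Hypothetical toy data: customers, items and ratings are invented.
known_ratings = {
    ("alice", "matrix"): 5.0,
    ("alice", "titanic"): 1.0,
    ("bob", "matrix"): 4.0,
}


def u(customer, item):
    """Return the known rating for (customer, item), or None if it is unknown."""
    return known_ratings.get((customer, item))
```

Most (customer, item) pairs are missing from the dictionary, which is the gap the extrapolation methods try to fill.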
What are the key problems associated with the formal model for recommender systems?
- Gathering known ratings to fill the utility matrix
- Extrapolating unknown ratings from known ones
- Evaluating extrapolation methods in terms of success or performance
What are the 2 ways we can collect ratings for the utility matrix?
- Explicitly asking people to rate items, which doesn’t work well in practice
- Implicitly learning ratings from user actions
What are the approaches to recommender systems for extrapolating utilities?
- Content-based
- Collaborative
- Latent factor based
Why is extrapolating utilities a problem?
- Most people have not rated most items, so the utility matrix is sparse
- New items have no ratings
- New users have no history
- Together, these leave little information to extrapolate from
What is the main idea behind a content-based recommendation system?
To recommend items to customer x that are similar to previous items rated highly by that customer
What is an item profile?
A set of features. It is convenient to think of it as a vector with one dimension per feature.
What is the prediction heuristic for content-based recommendation systems?
Given a user profile x and item profile i, estimate u(x, i) using cosine similarity between x and i
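A rough sketch of this heuristic, assuming profiles are plain lists of feature weights. The function names `cosine` and `rank_items` and the toy profiles are hypothetical, and returning 0.0 for a zero vector is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def rank_items(user_profile, item_profiles):
    """Order item names by estimated utility u(x, i) = cos(x, i), best first."""
    scored = [(cosine(user_profile, p), name) for name, p in item_profiles.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical features: [action, romance, sci-fi]
user = [1.0, 0.0, 1.0]
items = {
    "action_scifi": [1.0, 0.0, 1.0],
    "action": [1.0, 0.0, 0.0],
    "romance": [0.0, 1.0, 0.0],
}
ranking = rank_items(user, items)
```

Here `ranking` puts the item whose profile points in the same direction as the user profile first.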
What is a user profile and how can we calculate it?
Once a user has rated items, each with its own item profile, we build the user profile as a weighted average of the rated item profiles. As a variation, we can weight each item profile by the difference between its rating and the average rating for that item.
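Both options can be sketched with one helper, assuming item profiles are lists of feature values. `build_user_profile` and the `baselines` list are hypothetical names, and dividing by the sum of absolute weights is an assumption made so that negative weights do not cancel the denominator.

```python
def build_user_profile(item_profiles, weights):
    """Weighted average of item profiles, one weight per rated item."""
    n_features = len(item_profiles[0])
    total = sum(abs(w) for w in weights)
    if total == 0:
        return [0.0] * n_features
    return [
        sum(w * p[f] for w, p in zip(weights, item_profiles)) / total
        for f in range(n_features)
    ]


item_profiles = [[1.0, 0.0], [0.0, 1.0]]  # two rated items, two features
ratings = [5.0, 1.0]

# Option 1: weight each item profile by its rating.
plain = build_user_profile(item_profiles, ratings)

# Option 2: weight by the difference from an average rating.
# The baselines are supplied by the caller and are made up here.
baselines = [3.0, 3.0]
centered = build_user_profile(
    item_profiles, [r - b for r, b in zip(ratings, baselines)]
)
```

With the second option, features of items rated below the baseline get a negative weight in the profile, so disliked content counts against similar items.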
What are the pros of the content-based recommendation system?
- No need for data on other users
- Able to recommend to users with unique tastes
- Able to recommend new and unpopular items
- Able to provide explanations of recommendations by listing the content features that caused it to be selected
What are the cons of the content-based recommendation system?
- Finding the appropriate features is hard
- Recommending to new users is difficult because they have no user profile yet
- Prone to overspecialization: it may never recommend items outside the user’s content profile
- Unable to exploit quality judgements from other users
What is the goal of a collaborative filtering system?
Finding a set N of other users whose ratings are similar to user x’s ratings. We estimate x’s ratings based on ratings of users in N
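One common way to turn the set N into an estimate is a similarity-weighted average of the neighbours' ratings. The sketch below assumes similarities to user x are already computed; `predict_rating`, the parameter `k`, and the toy numbers are hypothetical.

```python
def predict_rating(similarities, item_ratings, k=2):
    """Estimate user x's rating of one item.

    similarities: {user: similarity to x}
    item_ratings: {user: that user's rating of the item}
    """
    # Only users who rated the item and are positively similar can contribute.
    candidates = [
        (similarities[v], r)
        for v, r in item_ratings.items()
        if similarities.get(v, 0.0) > 0
    ]
    # N = the k most similar of those users.
    neighbours = sorted(candidates, reverse=True)[:k]
    total_sim = sum(s for s, _ in neighbours)
    if total_sim == 0:
        return None  # nothing to extrapolate from
    return sum(s * r for s, r in neighbours) / total_sim


sims = {"a": 0.9, "b": 0.1, "c": 0.5}
item_ratings = {"a": 5.0, "b": 1.0, "c": 3.0}
estimate = predict_rating(sims, item_ratings, k=2)
```

With `k=2` the estimate uses only users a and c, so it lands between their ratings and closer to a's.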
What is the formula for Jaccard Similarity when we have sets of ratings for users A and B?
Sim(A, B) = |rA ∩ rB| / |rA ∪ rB|
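A small sketch of the formula above, treating rA and rB as the sets of items each user has rated. The function name `jaccard` and the item IDs are hypothetical, and returning 0.0 for two empty sets is an assumption.

```python
def jaccard(items_a, items_b):
    """Jaccard similarity between the sets of items two users have rated."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two users with no ratings are not similar
    return len(a & b) / len(union)


sim = jaccard({"m1", "m2", "m3"}, {"m2", "m3", "m4"})  # 2 shared / 4 total
```

Only which items were rated matters here; the rating values themselves are ignored.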
What is the formula for cosine similarity when we have rating vectors for users A and B?
Sim(A, B) = cos(rA, rB) = (rA · rB) / (||rA|| ||rB||), where · is the dot product and ||r|| is the Euclidean norm
What is centered cosine similarity (Pearson Correlation)?
Same as cosine similarity, but we first normalize each user’s ratings by subtracting that user’s mean rating (the mean of their row in the utility matrix)
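To see the effect of centering, here is a sketch that computes both plain and centered cosine similarity, assuming each user's ratings are a dictionary from item to rating and that missing ratings count as 0. The function names and the toy ratings are hypothetical.

```python
import math


def cosine(r_a, r_b):
    """Cosine similarity between sparse rating vectors (missing = 0)."""
    dot = sum(r_a[i] * r_b[i] for i in r_a.keys() & r_b.keys())
    norm_a = math.sqrt(sum(v * v for v in r_a.values()))
    norm_b = math.sqrt(sum(v * v for v in r_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def center(r):
    """Subtract the user's mean rating from each of their ratings."""
    if not r:
        return {}
    mean = sum(r.values()) / len(r)
    return {item: v - mean for item, v in r.items()}


def centered_cosine(r_a, r_b):
    """Cosine similarity on mean-centered ratings."""
    return cosine(center(r_a), center(r_b))


# Two users with opposite tastes.
a = {"m1": 5, "m2": 1}
b = {"m1": 1, "m2": 5}
plain = cosine(a, b)
centered = centered_cosine(a, b)
```

Plain cosine gives these two users a positive similarity because all ratings are positive, while the centered version gives a negative value that reflects their opposite tastes.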