8. Recommender Systems Flashcards

1
Q

What is the utility function for the formal model?

A

u: X × S → R
where X is the set of customers, S is the set of items, and R is the set of ratings (a totally ordered set, e.g. 1 to 5 stars).
In other words, u assigns a rating to each customer/item pair.
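
A minimal sketch of the utility function as a sparse lookup table. The customer names, item names, and the `utility` helper are hypothetical and only illustrate that most customer/item pairs have no known rating.

```python
from typing import Optional

# Hypothetical toy data: the known entries of the utility matrix u: X x S -> R.
# Pairs that are absent are the unknown ratings a recommender tries to estimate.
known_ratings = {
    ("alice", "matrix"): 5.0,
    ("alice", "titanic"): 1.0,
    ("bob", "matrix"): 4.0,
}

def utility(customer: str, item: str) -> Optional[float]:
    """Return the known rating for (customer, item), or None if it is unrated."""
    return known_ratings.get((customer, item))

customers = {x for x, _ in known_ratings}
items = {s for _, s in known_ratings}

# Fraction of the |X| x |S| utility matrix that is actually filled in.
density = len(known_ratings) / (len(customers) * len(items))
```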

2
Q

What are the key problems associated with the formal model for recommender systems?

A
  1. Gathering known ratings to fill the utility matrix
  2. Extrapolating unknown ratings from known ones
  3. Evaluating extrapolation methods in terms of success or performance
3
Q

What are the 2 ways we can collect ratings for the utility matrix?

A
  1. Explicitly, by asking people to rate items (this doesn’t work well in practice because few people bother to rate)
  2. Implicitly, by inferring ratings from user actions
4
Q

What are the approaches to recommender systems for extrapolating utilities?

A
  1. Content-based
  2. Collaborative
  3. Latent factor based
5
Q

Why is extrapolating utilities a problem?

A

Most people have not rated most items
New items have no ratings
New users have no history
Not much info to extrapolate from

6
Q

What is the main idea behind a content-based recommendation system?

A

To recommend items to customer x that are similar to previous items rated highly by that customer

7
Q

What is an item profile?

A

A set of features describing the item. It is convenient to think of it as a vector with one dimension per feature

8
Q

What is the prediction heuristic for content-based recommendation systems?

A

Given a user profile x and item profile i, estimate u(x, i) using cosine similarity between x and i
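
A small sketch of this heuristic, assuming profiles are plain feature vectors of equal length. The feature names, item names, and numbers are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no direction to compare against
    return dot / (norm_a * norm_b)

# Hypothetical feature order: [action, romance, sci-fi]
user_profile = [1.0, 0.0, 1.0]
item_profiles = {
    "matrix": [1.0, 0.0, 1.0],
    "titanic": [0.0, 1.0, 0.0],
    "alien": [1.0, 0.0, 0.0],
}

# Estimate u(x, i) for every item, then rank items by estimated utility.
scores = {name: cosine(user_profile, profile) for name, profile in item_profiles.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Items whose profile points in the same direction as the user profile score close to 1, and items with no shared features score 0.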

9
Q

What is a user profile and how can we calculate it?

A

A user profile summarizes the profiles of the items the user has rated. We can compute it as a weighted average of the rated item profiles, or, as a variation, weight each item profile by the difference between the user’s rating and the average rating for that item
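
A rough sketch of both variants. The function name `build_user_profile`, the toy data, and the choice to normalize by the sum of absolute weights (so that negative weights do not cancel the denominator) are assumptions, not something the card specifies.

```python
def build_user_profile(ratings, item_profiles, item_means=None):
    """Weighted average of the profiles of the items a user has rated.

    ratings:       item -> this user's rating
    item_profiles: item -> feature vector
    item_means:    optional item -> average rating; if given, each weight is
                   the user's rating minus that item's average rating.
    """
    dims = len(next(iter(item_profiles.values())))
    total = [0.0] * dims
    weight_sum = 0.0
    for item, rating in ratings.items():
        weight = rating - item_means[item] if item_means else rating
        for d, value in enumerate(item_profiles[item]):
            total[d] += weight * value
        weight_sum += abs(weight)  # assumption: normalize by absolute weight
    if weight_sum == 0:
        return total
    return [t / weight_sum for t in total]

# Hypothetical feature order: [action, romance]
item_profiles = {"matrix": [1.0, 0.0], "titanic": [0.0, 1.0]}
ratings = {"matrix": 5.0, "titanic": 1.0}

plain = build_user_profile(ratings, item_profiles)
centered = build_user_profile(ratings, item_profiles, {"matrix": 4.0, "titanic": 3.0})
```

With the centered variant, an item the user rated below its average pushes the profile away from that item's features instead of weakly toward them.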

10
Q

What are the pros of the content-based recommendation system?

A
  1. No need for data on other users
  2. Able to recommend to users with unique tastes
  3. Able to recommend new and unpopular items
  4. Able to provide explanations of recommendations by listing the content features that caused it to be selected
11
Q

What are the cons of the content-based recommendation system?

A
  1. Finding the appropriate features is hard
  2. Recommending to new users is difficult because they have no profile yet
  3. Overspecialization: it never recommends items outside of the user’s content profile
  4. Unable to exploit quality judgements from other users
12
Q

What is the goal of a collaborative filtering system?

A

Finding a set N of other users whose ratings are similar to user x’s ratings. We estimate x’s ratings based on ratings of users in N

13
Q

What is the formula for Jaccard Similarity when we have sets of ratings for users A and B?

A

Sim(A, B) = |rA ∩ rB| / |rA ∪ rB|
where rA and rB are the sets of items rated by A and B
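
A short sketch of the formula, assuming each user's ratings are stored as a dict from item to rating (the item names are made up). Only the keys matter here, since Jaccard looks at which items were rated and not at the values.

```python
def jaccard(items_a, items_b):
    """Jaccard similarity of two collections of rated items."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0  # neither user has rated anything
    return len(a & b) / len(union)

# Hypothetical ratings; iterating a dict yields its keys (the rated items).
ratings_a = {"HP1": 4, "TW": 5, "SW1": 1}
ratings_b = {"HP1": 5, "HP2": 5, "HP3": 4}

sim = jaccard(ratings_a, ratings_b)  # 1 shared item out of 5 distinct items
```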

14
Q

What is the formula for cosine similarity when we have sets of ratings for users A and B?

A

Sim(A, B) = cos(rA, rB) = (rA · rB) / (||rA|| ||rB||)
where rA and rB are the rating vectors of A and B, with missing ratings treated as 0
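
A possible implementation on sparse rating dicts, where any item absent from a dict counts as 0. The users A, B, C and their ratings are illustrative toy data.

```python
import math

def cosine_similarity(r_a, r_b):
    """Cosine similarity of two sparse rating dicts (item -> rating).

    Items missing from a dict are treated as a rating of 0.
    """
    dot = sum(rating * r_b[item] for item, rating in r_a.items() if item in r_b)
    norm_a = math.sqrt(sum(v * v for v in r_a.values()))
    norm_b = math.sqrt(sum(v * v for v in r_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy data: A and B both like HP1; A and C disagree on TW and SW1.
ratings_a = {"HP1": 4, "TW": 5, "SW1": 1}
ratings_b = {"HP1": 5, "HP2": 5, "HP3": 4}
ratings_c = {"TW": 2, "SW1": 4, "SW2": 5}

sim_ab = cosine_similarity(ratings_a, ratings_b)
sim_ac = cosine_similarity(ratings_a, ratings_c)
```

On this data sim_ab is about 0.38 and sim_ac about 0.32, so A and C still look fairly similar even though they disagree on every item they share.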

15
Q

What is centered cosine similarity (Pearson Correlation)?

A

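Same as cosine similarity, but we first normalize each user’s ratings by subtracting that user’s mean rating (the row mean). Missing ratings then count as 0, which corresponds to the user’s average rather than to a negative rating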
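
A sketch of the centering step followed by ordinary cosine similarity, again on sparse rating dicts. The users and ratings are illustrative toy data, and the helper names are hypothetical.

```python
import math

def center(ratings):
    """Subtract the user's mean rating from each of their ratings."""
    if not ratings:
        return {}
    mean = sum(ratings.values()) / len(ratings)
    return {item: value - mean for item, value in ratings.items()}

def centered_cosine(r_a, r_b):
    """Cosine similarity computed on mean-centered rating dicts."""
    c_a, c_b = center(r_a), center(r_b)
    dot = sum(value * c_b[item] for item, value in c_a.items() if item in c_b)
    norm_a = math.sqrt(sum(v * v for v in c_a.values()))
    norm_b = math.sqrt(sum(v * v for v in c_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

ratings_a = {"HP1": 4, "TW": 5, "SW1": 1}
ratings_b = {"HP1": 5, "HP2": 5, "HP3": 4}
ratings_c = {"TW": 2, "SW1": 4, "SW2": 5}

sim_ab = centered_cosine(ratings_a, ratings_b)
sim_ac = centered_cosine(ratings_a, ratings_c)
```

After centering, users who disagree on their shared items get a negative similarity (here sim_ac is roughly -0.56), while sim_ab stays slightly positive (roughly 0.09).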

16
Q

What is the issue with the Jaccard and cosine similarity measures?

A

Jaccard ignores the rating values and only looks at which items were rated

Cosine treats missing ratings as negative, because it fills them in with 0

17
Q

How do we translate a similarity metric to a recommendation?

A

rxi = (sum over y in N of sim(x, y) * ryi) / (sum over y in N of sim(x, y))

Where N is the set of the k users most similar to x who have rated i, rx is the vector of user x’s ratings (so rxi is x’s predicted rating for item i), and ryi is user y’s rating for item i
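
The weighted average above can be sketched as follows. The function name, the input layout, and the example numbers are hypothetical, and the sketch assumes the similarities of the chosen neighbors are non-negative.

```python
def predict_rating(similarities, neighbor_ratings, k=2):
    """Similarity-weighted average of the k most similar users' ratings.

    similarities:     user y -> sim(x, y)
    neighbor_ratings: user y -> r_yi, only for users who have rated item i
    """
    candidates = [
        (similarities[y], rating)
        for y, rating in neighbor_ratings.items()
        if y in similarities
    ]
    # N = the k users most similar to x among those who rated i.
    top = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    denom = sum(sim for sim, _ in top)
    if denom == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in top) / denom

similarities = {"b": 0.8, "c": 0.4, "d": 0.1}
ratings_for_item = {"b": 5, "c": 2, "d": 1}

estimate = predict_rating(similarities, ratings_for_item, k=2)
# Uses b and c only: (0.8 * 5 + 0.4 * 2) / (0.8 + 0.4)
```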

18
Q

What is item-item collaborative filtering?

A

Unlike user-user filtering, where we compare users’ preferences, we find items similar to a given item and estimate its rating from the user’s ratings of those similar items

19
Q

What is the process for item-item collaborative filtering?

A
  1. For item i, find other similar items
  2. Estimate rating for item i based on ratings for similar items
  3. We can use the same similarity metrics and prediction functions as the user-user model
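
The steps above can be sketched as follows, using plain cosine similarity between item rating columns. The data layout (item -> {user: rating}), the function names, and the numbers are assumptions for illustration.

```python
import math

def cosine(r_a, r_b):
    """Cosine similarity of two sparse dicts (user -> rating); missing = 0."""
    dot = sum(v * r_b[u] for u, v in r_a.items() if u in r_b)
    norm_a = math.sqrt(sum(v * v for v in r_a.values()))
    norm_b = math.sqrt(sum(v * v for v in r_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_item_item(item_ratings, user, target, k=2):
    """Estimate the user's rating of target from the k most similar rated items."""
    scored = []
    for other, ratings in item_ratings.items():
        if other == target or user not in ratings:
            continue  # only items this user has already rated can help
        scored.append((cosine(item_ratings[target], ratings), ratings[user]))
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    denom = sum(sim for sim, _ in top)
    if denom == 0:
        return None
    return sum(sim * rating for sim, rating in top) / denom

item_ratings = {
    "i1": {"u1": 5, "u2": 4, "u3": 1},           # u4 has not rated i1
    "i2": {"u1": 4, "u2": 5, "u3": 1, "u4": 5},  # rated much like i1
    "i3": {"u1": 1, "u2": 2, "u3": 5, "u4": 2},  # rated unlike i1
}

estimate = predict_item_item(item_ratings, "u4", "i1", k=2)
```

Because i2 is rated much more like i1 than i3 is, u4's rating of i2 dominates the estimate.
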
20
Q

What is the upside of the collaborative filtering system?

A

It works for any kind of item; no feature selection is needed

21
Q

What are the cons of the collaborative filtering system?

A
  1. Cold start problem, where there are not enough users in the system to find a match
  2. Sparsity of the user/ratings matrix, which makes it hard to find users who have rated the same items
  3. First rater problem, where we cannot recommend an item that hasn’t been rated before
  4. Popularity bias, meaning it tends to recommend popular items and cannot recommend items to someone with unique tastes
22
Q

How do we compute a global baseline estimate and why would we need one?

A

Overall average rating + (movie’s average rating - overall average) + (user’s average rating - overall average)

We need this in case the user has not rated any movie similar to the one we are trying to estimate a rating for
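
A minimal sketch of the baseline, assuming ratings are stored as user -> {movie: rating}. The function name and the toy data are hypothetical, and a user or movie with no ratings simply contributes a deviation of 0.

```python
def global_baseline(ratings, user, movie):
    """Overall mean + user's deviation from it + movie's deviation from it."""
    all_values = [r for user_ratings in ratings.values() for r in user_ratings.values()]
    overall_mean = sum(all_values) / len(all_values)

    user_values = list(ratings.get(user, {}).values())
    movie_values = [ur[movie] for ur in ratings.values() if movie in ur]

    # Assumption: with no data for the user or movie, use a deviation of 0.
    user_dev = sum(user_values) / len(user_values) - overall_mean if user_values else 0.0
    movie_dev = sum(movie_values) / len(movie_values) - overall_mean if movie_values else 0.0
    return overall_mean + movie_dev + user_dev

ratings = {
    "joe": {"a": 2, "b": 3},
    "ann": {"a": 5, "c": 4},
    "sam": {"c": 2, "b": 3},
}

# Joe has not rated movie "c"; both Joe and "c" sit below the overall mean.
estimate = global_baseline(ratings, "joe", "c")
```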
