Quiz 6 Flashcards
- What is the main challenge faced by recommender systems due to the sparsity of the utility matrix?
A. Cold Start problem
B. Over specialization
C. Finding appropriate features
D. Low Prediction accuracy
A. Cold Start problem
- What heuristic is commonly used to pick important features in item profiles in text mining?
A. Term frequency * Document frequency
B. Term frequency * Inverse Document frequency
C. Inverse term frequency * Document frequency
D. All of the above
B. Term frequency * Inverse Document frequency
- Which of the following can be termed as recommendations?
A. Hand-curated recommendations
B. Tailored to individual users
C. Simple aggregates
D. All of the above
D. All of the above
- Which of the following refers to the cold start problem?
A. New items have no ratings
B. New users have no history
C. Both A & B
D. New users are recommended popular items
C. Both A & B
- What is a key advantage of implicit feedback in recommender systems?
A. It provides more accurate ratings
B. It eliminates the cold start problem
C. It learns ratings from user actions such as purchases
D. It enables personalized recommendations
C. It learns ratings from user actions such as purchases
- True | False New and unpopular items can be recommended using a content-based filtering approach.
True
- True | False The tf-idf score of a word that repeats in all the documents is 1.
False
- True | False Content-based recommender systems do not face the cold start problem.
False
- True | False Cosine similarity is used to estimate the similarity between user and item profiles in Content-based recommender systems.
True
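As a rough illustration of the cosine similarity mentioned above, here is a minimal sketch in plain Python. The three-feature profiles and the names `user_profile`, `item_a`, and `item_b` are made up for this example, and returning 0.0 for an all-zero vector is a convention chosen here, not part of the definition.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for an all-zero profile
    return dot / (norm_u * norm_v)

# Hypothetical profiles over the features [action, comedy, drama]
user_profile = [1.0, 0.0, 1.0]
item_a = [1.0, 0.0, 1.0]  # points in the same direction as the user profile
item_b = [0.0, 1.0, 0.0]  # shares no features with the user profile

print(cosine_similarity(user_profile, item_a))
print(cosine_similarity(user_profile, item_b))
```

An item whose profile points in the same direction as the user profile scores close to 1, while an item with no shared features scores 0, so items can be ranked by this value.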
- True | False The main idea of content-based recommendation systems is to recommend items to customer X similar to previous items rated highly by X.
True
- Describe the main approach used in content-based recommendation systems.
It uses the features of items the user has already rated highly to recommend similar items. For movies, it might look at actors, genre, and possibly even watch time to determine a recommendation.
- What are the pros and cons of content-based recommendation systems?
Pros:
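-No item cold start: new and unpopular items can be recommended without ratings from other users
-Can recommend to users with unique tastes
-Can explain recommendations by listing the content features that caused an item to be chosen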
Cons:
-Finding appropriate features
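-Cold start for new users, who have no profile yet
-Overspecialization: never recommends items outside the user's existing profile, so it is not fully suited to users with multiple interests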
- How is a user-profile constructed from an item-profile? Assume that the item-profile is an n-dimensional vector. How is locality-sensitive hashing used to find items similar to a user profile?
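The user profile is built by aggregating the profiles of the items the user has rated, for example as a rating-weighted average, so it is also an n-dimensional vector. LSH hashes the item profiles into buckets so that similar vectors tend to land in the same bucket. Once the buckets are built, the user profile is hashed the same way, and only the items in its bucket are compared with it (for example by cosine similarity) to pick recommendations.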
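One possible way to make this concrete is sketched below. It assumes a rating-weighted average for the user profile and random-hyperplane LSH (a family suited to cosine similarity); the quiz does not say which aggregation or hash family is intended, and all function names here are hypothetical.

```python
import random

def build_user_profile(rated_items):
    """Rating-weighted average of item profiles.

    rated_items: list of (item_profile, rating) pairs, all profiles n-dimensional.
    """
    n = len(rated_items[0][0])
    total = sum(rating for _, rating in rated_items)
    return [sum(p[i] * r for p, r in rated_items) / total for i in range(n)]

def make_hyperplanes(num_planes, dim, seed=0):
    """Random normal vectors, one per hyperplane (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]

def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(a * b for a, b in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )

def build_buckets(item_profiles, planes):
    """Group item names by their LSH signature."""
    buckets = {}
    for name, profile in item_profiles.items():
        buckets.setdefault(lsh_signature(profile, planes), []).append(name)
    return buckets

# Hypothetical item profiles over the features [action, comedy, drama]
items = {"m1": [1.0, 0.0, 1.0], "m2": [0.0, 1.0, 0.0]}
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = build_buckets(items, planes)

# A user who rated only m1 gets a profile equal to m1's profile
user = build_user_profile([(items["m1"], 5)])
candidates = buckets.get(lsh_signature(user, planes), [])
print(candidates)
```

Only the items in `candidates` would then be scored against the user profile, which avoids comparing the user with every item. In practice several hash tables are usually combined, because a single table can miss similar items that fall just across a hyperplane.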
- Calculate the TF-IDF scores for the words – ‘dog’, ‘lazy’, ‘sleeps’ and ‘The’ across the given three documents.
Document 1: “The quick brown fox jumps over the lazy dog”
Document 2: “A brown dog sleeps in the box”
Document 3: “The Lazy dog sleeps all day”
- Count Words in Each Document
Document 1: 9 words
Document 2: 7 words
Document 3: 6 words
- Calculate Term Frequencies (“Lazy” is counted as “lazy”; for “The”, only the capitalized form is counted)
Document 1:
“dog”: 1/9
“lazy”: 1/9
“sleeps”: 0/9 = 0
“The”: 1/9
Document 2:
“dog”: 1/7
“lazy”: 0/7 = 0
“sleeps”: 1/7
“The”: 0/7 = 0
Document 3:
“dog”: 1/6
“lazy”: 1/6
“sleeps”: 1/6
“The”: 1/6
- Calculate Document Frequencies
“dog”: Appears in all 3 documents
“lazy”: Appears in 2 documents (Document 1 and Document 3)
“sleeps”: Appears in 2 documents (Document 2 and Document 3)
“The”: Appears in 2 documents (Document 1 and Document 3)
- Calculate IDF (using the common logarithm, base 10)
“dog”: log(3/3) = 0
“lazy”: log(3/2) ≈ 0.176
“sleeps”: log(3/2) ≈ 0.176
“The”: log(3/2) ≈ 0.176
- Calculate TF-IDF
Document 1:
“dog”: 1/9 * 0 = 0
“lazy”: 1/9 * 0.176 ≈ 0.0196
“sleeps”: 0/9 * 0.176 = 0
“The”: 1/9 * 0.176 ≈ 0.0196
Document 2:
“dog”: 1/7 * 0 = 0
“lazy”: 0/7 * 0.176 = 0
“sleeps”: 1/7 * 0.176 ≈ 0.0251
“The”: 0/7 * 0.176 = 0
Document 3:
“dog”: 1/6 * 0 = 0
“lazy”: 1/6 * 0.176 ≈ 0.0293
“sleeps”: 1/6 * 0.176 ≈ 0.0293
“The”: 1/6 * 0.176 ≈ 0.0293
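The hand calculation can be checked with a short script. The sketch below assumes whitespace tokenization, TF = count / document length, and IDF = log10(N / df); it also lowercases every token, which is an assumption of this example. Under that convention Document 3 splits into 6 tokens, and “the” occurs in all three documents, so its score is 0 everywhere, unlike a count that matches only the capitalized “The”.

```python
import math

docs = [
    "The quick brown fox jumps over the lazy dog",
    "A brown dog sleeps in the box",
    "The Lazy dog sleeps all day",
]

def tokenize(text):
    """Lowercase and split on whitespace (assumed tokenization)."""
    return text.lower().split()

def tf(term, tokens):
    """Term frequency: occurrences of term divided by document length."""
    return tokens.count(term) / len(tokens)

def idf(term, corpus):
    """Inverse document frequency with a base-10 logarithm."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log10(len(corpus) / df) if df else 0.0

def tf_idf(term, tokens, corpus):
    return tf(term, tokens) * idf(term, corpus)

corpus = [tokenize(d) for d in docs]
for term in ["dog", "lazy", "sleeps", "the"]:
    scores = [round(tf_idf(term, tokens, corpus), 4) for tokens in corpus]
    print(term, scores)
```

Because the script uses the unrounded value of log10(3/2), a score can differ in the last digit from a hand calculation that first rounds the IDF to 0.176.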