Exam Flashcards by Byron OG

What is the bottleneck of user-based CF and how does item-based CF avoid it?

The search for neighbours (in real time) among a large population of potential neighbours.
Item-based CF avoids this by computing similarities between items instead of users.

What is the intuition of item-based CF?

users are interested in items similar to those previously experienced

What is the edge that item-item similarities have over user-user similarities and why?

They are more “stable”: the set of items changes less often than the set of users, allowing for less frequent system updates.

What is the benefit of adjusted cosine similarity in item-based CF?

It accounts for differences in how users rate items by subtracting each user's mean rating from their ratings
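
As a minimal sketch of the idea, adjusted cosine subtracts each user's mean rating before comparing two items. The function name `adjusted_cosine` and the nested-dict layout (user -> item -> rating) are assumptions made for illustration.

```python
import math

def adjusted_cosine(ratings, item_i, item_j):
    """ratings: dict mapping user -> dict mapping item -> rating (assumed layout)."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        # Only users who rated both items contribute.
        if item_i in user_ratings and item_j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[item_i] - mean  # deviation from the user's mean
            dj = user_ratings[item_j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0  # no co-rating users, or no variation to compare
    return num / math.sqrt(den_i * den_j)
```

A generous rater and a harsh rater who rank two items the same way then contribute in the same direction, which raw cosine on unadjusted ratings would not capture.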

What is the underlying heuristic of CF?

people who agreed or disagreed on items in the past are likely to agree or disagree on future items

What are the steps in the UBCF algorithm?

Data representation
similarity computation
neighbourhood formation
prediction/top-N list
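
The four steps can be sketched roughly as below. This is an illustrative outline rather than a reference implementation: the dict layout, the function name `predict`, the pluggable `sim` argument, and the mean-centred weighted-average prediction are all assumptions.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def predict(ratings, target, item, sim, k=2):
    """Predict target's rating for item.

    ratings: dict user -> dict item -> rating (step 1, data representation).
    sim: any user-user similarity function taking two rating dicts (step 2).
    """
    # Step 3, neighbourhood formation: the k most similar users who rated the item.
    candidates = [
        (sim(ratings[target], ratings[u]), u)
        for u in ratings
        if u != target and item in ratings[u]
    ]
    neighbours = sorted(candidates, reverse=True)[:k]

    # Step 4, prediction: target's mean plus weighted neighbour deviations.
    target_mean = _mean(ratings[target].values())
    num = sum(w * (ratings[u][item] - _mean(ratings[u].values())) for w, u in neighbours)
    den = sum(abs(w) for w, _ in neighbours)
    return target_mean if den == 0 else target_mean + num / den
```

A top-N list would follow by calling `predict` for each item the target has not rated and keeping the N highest predictions.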

What is the main issue with the MSD similarity metric?

It assumes that users rate according to similar distributions

For MSD similarity, what are two important features of the metric?

Summations run over co-rated items only; if there are none, the similarity is set to 0
The result is a value in [0, 1]
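
A minimal sketch of mean squared difference (MSD) similarity follows. Normalising by the squared rating range to map the result into [0, 1] is an assumption here, as are the function name and the default 1 to 5 rating scale.

```python
def msd_similarity(a, b, r_min=1, r_max=5):
    """a, b: dicts mapping item -> rating for two users (assumed layout)."""
    common = set(a) & set(b)  # co-rated items only
    if not common:
        return 0.0  # no co-rated items: similarity set to 0
    msd = sum((a[i] - b[i]) ** 2 for i in common) / len(common)
    # Divide by the largest possible squared difference so the result is in [0, 1].
    return 1.0 - msd / (r_max - r_min) ** 2
```

Because the raw ratings are compared directly, two users with the same preferences but different rating habits (one generous, one harsh) come out as dissimilar.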

For Pearson similarity, what are two important features of the metric?

Summations run over co-rated items only; if there are none, the similarity is set to 0
The result is a value in [-1, 1]
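
A sketch of Pearson similarity restricted to co-rated items is shown below. Computing each user's mean over the co-rated items only is one common variant and is an assumption here; the function name and dict layout are also illustrative.

```python
import math

def pearson(a, b):
    """a, b: dicts mapping item -> rating for two users (assumed layout)."""
    common = set(a) & set(b)  # co-rated items only
    if not common:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(
        sum((a[i] - mean_a) ** 2 for i in common)
        * sum((b[i] - mean_b) ** 2 for i in common)
    )
    return 0.0 if den == 0 else num / den
```

Mean-centring is what lets the value go negative: users who deviate from their own averages in opposite directions get a similarity below 0.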

What is the benefit of significance weighting to Pearson?

It adjusts for the number of co-rated items, devaluing similarities that are based on only a few co-rated items
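
One common way to apply significance weighting is to scale the similarity linearly until a threshold number of co-rated items is reached. The threshold of 50 and the function name are assumptions in this sketch.

```python
def significance_weight(similarity, n_corated, threshold=50):
    """Scale a similarity down when it rests on fewer than `threshold` co-rated items."""
    return similarity * min(n_corated, threshold) / threshold
```

With these defaults, a similarity of 0.8 based on 5 co-rated items is reduced to 0.08, while one based on 50 or more is left unchanged.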

What impacts the range of cosine similarity results?

The non-negativity of ratings: when all ratings are non-negative, the result lies in [0, 1] rather than [-1, 1]

Briefly describe some of the extensions to Pearson correlation

Jaccard index: modify similarity weights by the number of co-rated items between users divided by the size of the union of their rated items
default voting: calculate over the union of items applying a default to non-co-rated items
case amplification: emphasise weights which are close to 1 and reduce the influence of lower weights
inverse user frequency (IUF): gives more weight to ratings for niche items
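
Three of these extensions are simple enough to sketch directly. The function names, the dict layout, and the amplification exponent `rho=2.5` are assumptions chosen for illustration.

```python
import math

def jaccard_weight(a, b):
    """Co-rated items divided by the union of items rated by either user."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0

def case_amplify(w, rho=2.5):
    """Keep weights near +/-1 almost unchanged and shrink weaker ones, preserving sign."""
    return w * abs(w) ** (rho - 1)

def inverse_user_frequency(n_users, n_raters):
    """Higher for niche items rated by few users; 0 for an item everyone has rated."""
    return math.log(n_users / n_raters)
```

In use, `jaccard_weight(a, b)` would multiply a Pearson weight, `case_amplify` would transform it before prediction, and `inverse_user_frequency` would scale each item's ratings inside the correlation.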

CF advantages

captures quality and taste
no item descriptions/features required
serendipitous recommendations

CF Limitations

cold start problem
early rater problem
sparsity problem
scalability

What do recommender systems (RS) help drive?

Demand down the long tail, with benefits to both consumers and retailers

What does CF automate?

The “word-of-mouth” process

What is the key difference between CF and Content-based recommendation?

Content-based recommendation uses the item’s descriptions/features (content); CF does not

How is document-document similarity calculated in Content-based?

The cosine of the angle between the documents’ term vectors
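
As a rough sketch, cosine similarity between two documents can be computed from raw term counts. Whitespace tokenisation, lower-casing, and unweighted counts are simplifying assumptions; in practice the vectors would usually be term-weighted first.

```python
import math
from collections import Counter

def cosine(doc_a, doc_b):
    """Cosine of the angle between the term-count vectors of two strings."""
    va = Counter(doc_a.lower().split())
    vb = Counter(doc_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction
    return dot / (norm_a * norm_b)
```

Because term counts are non-negative, the result lies between 0 (no shared terms) and 1 (same term proportions).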

What is case-based recommendation?

A form of content-based recommendation which represents items using a well-defined set of features and feature values

List sources of recommendation knowledge and give examples of each

transactional and behavioural data: clicks, purchases, likes.
content and meta data: text, features, tags.
experiential data: user-generated opinions.

List some properties of consumer reviews

ubiquitous
abundant
usually independent
often insightful

Do reviews matter?

Yes. Research shows that reviews help users to make better decisions. They increase conversion rates and improve satisfaction.

What are some considerations when making recommendations and ranking them?

business imperatives: e.g. promoting items
domain
the influence of particular items

How are non-personalised recommendations usually presented?

in the form of a top-N ranked list

Personalised recommendation considerations

- acquiring users' personal information - recommendation output - personalisation: ephemeral or persistent

What is ephemeral personalisation?

Matching the user's current activity

What is persistent personalisation?

Matching the user's long-term interests

Benefits of RSs?

- turning web browsers into buyers - cross/up-selling - customer loyalty

What are the two main approaches to content-based recommendation and how do you distinguish between them?

- traditional content-based (unstructured) - case-based (structured)

List term-weighting approaches

- term frequency - normalised term frequency - inverse document frequency (IDF) - binary weighting - NTF + IDF

What is term stemming?

Reducing terms to a common stem (e.g. "recommend", "recommends", "recommending") so that variants of the same word are treated as the same term for matching purposes

How are stop words handled?

They are omitted from the term-document matrix

Differences in making non-personalised (NP) vs personalised (P) recommendations

- NP: rank recommendation candidates by similarity to the target item - P: rank recommendation candidates by similarity to the target user's profile

Why is case-based recommendation a powerful approach to recommendation?

- facilitates the search and navigation of complex information spaces - flexible user feedback options - suitable for e-commerce applications

Underlying assumptions of case-based reasoning

- the world is a regular place and similar problems tend to have similar solutions - the world is a repetitive place and similar problems tend to recur

CBR Cycle

- Retrieve - Reuse - Revise - Retain

Key differences between case-based and content-based systems

- case representation - similarity assessment

In case-based recommenders, what do the following symbols represent: Sim(T, C), w, v, Sim(v1, v2)?

- Sim(T, C): similarity between the target case T and candidate case C - w: relative importance of a feature - v: $v_{c,i}$ is the value of feature i in case C - Sim(v1, v2): feature-level similarity
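
These symbols fit together as a weighted average of feature-level similarities. The sketch below assumes that form; the function names, the range-normalised numeric similarity, and the exact-match rule for non-numeric features are illustrative assumptions.

```python
def feature_sim(v1, v2, value_range=None):
    """Sim(v1, v2): numeric features use distance scaled by the feature's range;
    other features score 1 for an exact match and 0 otherwise."""
    if value_range is not None:
        return 1.0 - abs(v1 - v2) / value_range
    return 1.0 if v1 == v2 else 0.0

def case_sim(target, candidate, weights, ranges):
    """Sim(T, C): weighted average of feature-level similarities.

    target, candidate: dicts feature -> value.
    weights: dict feature -> w (relative importance).
    ranges: dict feature -> value range, for numeric features only.
    """
    total = sum(weights.values())
    score = sum(
        w * feature_sim(target[f], candidate[f], ranges.get(f))
        for f, w in weights.items()
    )
    return score / total
```

For example, with `weights={"price": 2, "brand": 1}` a price mismatch costs twice as much as a brand mismatch.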

What is a key issue for case-based recommenders?

acquiring similarity knowledge (numerical vs non-numerical, symmetric vs asymmetric)

The ideal balance of similarity vs diversity in similarity-based recommendation

We want the top-k retrieved items to be equally similar to the target item/user profile but in different ways

Two algorithms used for balancing similarity vs diversity in similarity-based recommendation

- Bounded greedy selection - Shimazu's algorithm
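
Bounded greedy selection can be sketched as follows: first bound the search to the `b * k` cases most similar to the target, then greedily pick the case with the best combination of similarity to the target and dissimilarity to the cases already selected. The function name, the defaults, and the product form of the quality score are assumptions in this sketch.

```python
def bounded_greedy(target, cases, sim, k=3, b=2):
    """Select k cases that are similar to target but different from each other.

    sim: function returning a similarity in [0, 1] for two cases.
    """
    # Bound: keep only the b*k cases most similar to the target.
    pool = sorted(cases, key=lambda c: sim(target, c), reverse=True)[: b * k]
    selected = []

    def quality(c):
        # Relative diversity: average dissimilarity to the cases chosen so far.
        if not selected:
            rel_div = 1.0
        else:
            rel_div = sum(1.0 - sim(c, r) for r in selected) / len(selected)
        return sim(target, c) * rel_div

    while pool and len(selected) < k:
        best = max(pool, key=quality)
        selected.append(best)
        pool.remove(best)
    return selected
```

The first pick is always the most similar case; later picks trade some similarity for coverage of different regions around the target.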

Advantages of content-based systems

Recommendations can be made early, without waiting for ratings data to accumulate

Issues of content-based systems

- feature identification and extraction can be problematic - content-based filters cannot distinguish between low and high-quality items - a "more-like-this" approach -> low serendipity

Evaluation methods for recommender systems

- live-user trials - offline evaluations
