Case Study — NextStar Recommender System Flashcards

Question

What is content filtering?

Answer 1

Personalised results based on previous data interactions.

Answer 2

What similar people interacted with.

Answer 3

A hybrid of both, combined with different machine learning methods.

Answer 4

Other users are used. Chance is involved.

Answer 5

Needs more data. Problems for new users/products. If a product has no ratings it can't be used. People aren't all the same.

Answer 6

Works with less data. User-specific.

Answer 7

Over-specialisation (won't recommend new stuff).

Answer 8

Collaborative = emphasises user. Content = emphasises content.
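
To make the "content" side of this contrast concrete, here is a minimal sketch of content-based ranking. The movie titles, the feature columns (action, romance, cars) and the function names are hypothetical and chosen only for illustration. Cosine similarity is one common choice of similarity measure, not necessarily the one NextStar uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend_similar(liked_title, items):
    """Rank every other item by how similar its content is to the liked item."""
    liked = items[liked_title]
    others = [title for title in items if title != liked_title]
    return sorted(others, key=lambda t: cosine(liked, items[t]), reverse=True)

# Hypothetical catalogue: features are [action, romance, cars].
movies = {
    "Car Chase": [1, 0, 1],
    "Love Letters": [0, 1, 0],
    "Spy Run": [1, 0, 0],
}

ranking = recommend_similar("Car Chase", movies)
```

Only item features are compared here, so no other users' data is needed. A collaborative approach would instead compare users' rating histories.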

Answer 9

Cold start problem is when the system cannot draw any inferences for users or items about which it has not yet gathered sufficient information.

Answer 10

Collaborative = they don't know who you are like.

Answer 11

K-nearest neighbour (k-NN). Matrix factorisation.

Answer 12

Stores all available cases and classifies new cases based on similarity, i.e. classifies data by what it is most similar to.

Answer 13

K refers to the number of nearest cases (neighbours) a new case is compared with when deciding which class it matches.
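
The idea of storing cases and letting the k closest ones vote can be sketched as below. This is a minimal illustration, assuming Euclidean distance and a simple majority vote. The function name `knn_predict` and the sample data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(cases, query, k=3):
    """cases: list of (features, label) pairs; query: a feature tuple."""
    # Order the stored cases from nearest to farthest from the query.
    nearest = sorted(cases, key=lambda case: math.dist(case[0], query))
    # Let the k nearest cases vote on the label.
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Hypothetical users described by (action score, romance score).
stored_cases = [
    ((5, 1), "action fan"),
    ((4, 2), "action fan"),
    ((5, 2), "action fan"),
    ((1, 5), "romance fan"),
    ((2, 4), "romance fan"),
]

label = knn_predict(stored_cases, (1, 4), k=3)
```

With k=3, the query (1, 4) has two "romance fan" neighbours and one "action fan" neighbour among its three closest cases, so the vote goes to "romance fan". Changing k changes how many neighbours get a vote.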

Answer 14

Hyperparameter

Answer 15

Choosing the right value of k.

Answer 16

Trial and Error

Answer 17

Neighbours are too specific = won't try new things.

Answer 18

More work = more expensive = more storage needed.

Answer 19

Breaking down bigger things into smaller ones.

Answer 20

Uses data on how similar people have rated certain movies to find patterns that predict how other people will rate them.
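
One way to picture this is a small matrix factorisation trained with stochastic gradient descent. The sketch below is an assumption-laden illustration, not NextStar's method: the ratings, the number of factors, the learning rate and the regularisation value are all made up.

```python
import random

def predict(P, Q, user, item):
    """Predicted rating = dot product of the user's and the item's factors."""
    return sum(pf * qf for pf, qf in zip(P[user], Q[item]))

def factorise(ratings, n_users, n_items, n_factors=2,
              epochs=3000, lr=0.01, reg=0.02, seed=0):
    """ratings: list of (user, item, rating) triples for the known ratings."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Nudge both factors to shrink the error, with light regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical ratings: users 0-1 like items 0-1, users 2-3 like items 2-3.
# The ratings for (user 1, item 1) and (user 3, item 2) are unknown.
known = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1), (0, 3, 1),
    (1, 0, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 1), (3, 1, 1), (3, 3, 5),
]

P, Q = factorise(known, n_users=4, n_items=4)
guess = predict(P, Q, 1, 1)  # estimate for a rating that was never given
```

User 1 never rated item 1, but users with similar tastes rated it highly, so the learned factors should yield a high estimate for that missing rating.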

Answer 21

If someone who liked action movies loved the movie with the car, then someone else who loved action movies might love the movie with the car.

Answer 22

People aren't completely random; they have preferences that often follow patterns.

Answer 23

Instead of storing data on everyone's individual preferences, you can store people and preferences and predict ratings when necessary. It is also easier to use for training models.

Answer 24

Cold start problem = won't be as accurate until we have data to use.

Answer 25

The storage of data on servers and in databases that can be accessed over the internet.

Answer 26

Cloud delivery models describe how cloud services are provided: what they run through, who runs them and how they run. Examples are IaaS, SaaS and PaaS.

Answer 27

On-demand computing resources on the cloud, using infrastructure that will allocate more space when paid for.

Answer 28

Run through internet e.g. software/browser.

Answer 29

Run through platform. Dedicated to providing services for apps to run.

Answer 30

Does not have a unique app. Legal limitations. Potential security flaws. Doesn't work without internet connectivity.

Answer 31

Not very flexible. Data stored offsite. Dependent on server infrastructure. Compromise for security.

Answer 32

Not very scalable. Data security is in the hands of a third party. Your preferred tech stack might not be available. Transition to PaaS is hard.

Answer 33

Measures how much error there is between two data sets.

Answer 34

Evaluate the quality of machine learning predictions.

Answer 35

Information generated about you from the things you do online, used to predict your behaviour.

Answer 36

Clicking a link, watching/reading, rating an item, using customer service, placing an order, responding to adverts/marketing.

Answer 37

What you like and don't like, how you respond, what you are likely to do.

Answer 38

Who you are, where you live, who you are related to.

Answer 39

The system fits its existing data too exactly or too closely, and may therefore fail to fit additional data or predict future observations reliably.

Answer 40

The click-through rate measures how many people click on recommendations. Predictive accuracy metrics measure how close a recommender's predicted ratings are to the actual user ratings.

Answer 41

PaaS is better because you get the mixture of customisation with outside control, meaning it can be efficient with and without your intervention and still be customisable.

Answer 42

A way to measure how accurate a recommender system is. It takes into account both the system's recall score and precision score.

Answer 43

Mean absolute error

Answer 44

The average absolute difference between the observed result and the expected result. Used to measure the accuracy of a recommender system.
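
As a worked illustration, mean absolute error can be computed as below. The function name and the ratings are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical ratings: what users gave vs. what the recommender predicted.
actual_ratings = [4, 3, 5]
predicted_ratings = [5, 3, 3]

mae = mean_absolute_error(actual_ratings, predicted_ratings)
```

The individual errors here are 1, 0 and 2, so the mean absolute error is 3 / 3 = 1.0. Taking absolute values keeps over-predictions and under-predictions from cancelling each other out.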

Answer 45

Measures the ratio of correct items identified out of total items identified.

Answer 46

An optimisation algorithm to find the model parameters that correspond to the best fit between predicted and actual outputs.

Answer 47

A cost function measures the success of an algorithm by quantifying the error between predicted values and actual values at each stage.
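
How a cost function and gradient descent work together can be sketched with a one-parameter model, y = w * x. This is a toy example under stated assumptions: mean squared error as the cost, a fixed learning rate, and made-up data.

```python
def cost(w, xs, ys):
    """Mean squared error between predictions w * x and the actual values y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, lr=0.01, steps=500):
    """Repeatedly move w a small step in the direction that lowers the cost."""
    w = 0.0
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

# Hypothetical data generated by y = 2 * x.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

w_fitted = gradient_descent(xs, ys)
```

The cost starts at 30.0 for w = 0 and shrinks towards 0 as w approaches 2, the value that fits this data exactly.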

Answer 48

By using a cost function, NextStar can see where the errors are in the system, which is a starting point for spotting and fixing them. The algorithm becomes more reliable as its results keep getting checked.

Answer 49

IaaS would be best for NextStar because it is customisable but also helped by server managers. The best of both worlds.

Answer 50

A combination of two methods of accuracy, precision and recall, which tests the success rate of filtering.
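
A minimal sketch of how precision, recall and their combination could be computed for one set of recommendations. It assumes the balanced F1 form of the F-measure (the harmonic mean of precision and recall). The item names are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """recommended: items the system suggested; relevant: items the user actually liked."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: 4 items recommended, 3 items actually relevant, 2 overlap.
recommended_items = ["A", "B", "C", "D"]
relevant_items = ["A", "B", "E"]

precision, recall, f1 = precision_recall_f1(recommended_items, relevant_items)
```

Here precision is 2/4, recall is 2/3, and F1 works out to 4/7. A system that recommends everything gets high recall but low precision, which is why the two are combined.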

Answer 51

A hyperparameter is a machine learning parameter whose value is chosen before a learning algorithm is trained.

Answer 52

The mean of the absolute errors of the filtering system.