Introduction & Collaborative Filtering Flashcards

Question 1

Q

What is data mining?

Answer

A

Analyzing large datasets to discover patterns and insights
that enable data-driven decision making.

Question 2

Q

What are the 3 reasons to mine data?

Answer

A

Rapid growth,
technology advancement and
competitive advantage

Question 3

Q

What is rapid growth of data?

Answer

A

The volume of data generated is increasing exponentially due to:
- transactions,
- media,
- IoT and
- cloud

Question 4

Q

What are the technology advancements? (3)

Answer

A

Modern data storage,
processing, and
large-scale data analysis

Question 5

Q

What is competitive advantage? (3)

Answer

A

Uncover trends,
optimize operations,
strategic advantage

Question 6

Q

What is the goal of supervised learning?

Answer

A

Predict a single target variable, learning from data where the target value is known

Question 7

Q

What are two methods in supervised learning?

Answer

A

Classification and regression (prediction)

Question 8

Q

What are the 2 goals of unsupervised learning?

Answer

A

Segment data into groups;
detect patterns where the target variable is unknown

Question 9

Q

What are four methods of unsupervised learning?

Answer

A

Association rules & recommendation systems
Cluster analysis
Data & dimension reduction
Data exploration/visualization

Question 10

Q

What are the 7 steps in data mining?

Answer

A

Define business purpose
Obtain data (random sampling)
Explore, clean, pre-process (reduce data)
Specify task and choose technique
Iterative implementation and tuning
Assess and compare results
Deploy solution

Question 11

Q

What is collaborative filtering?

Answer

A

Technique to make predictions/recommendations by leveraging:
- preferences,
- behaviors, or
- interactions of groups/users

Question 12

Q

How does collaborative filtering operate?

Answer

A

Individuals with similar preferences in the past are likely to share preferences in the future

Question 13

Q

What are three examples of real-world applications of recommendation systems?

Answer

A

e-commerce platforms,
streaming services, and
social networks

Question 14

Q

What is association rules mining?

Answer

A

Focuses on discovering relationships or patterns between items in transactional data
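
As a rough illustration of finding item relationships in transactional data, the sketch below counts how often pairs of items appear in the same basket. The transactions and item names are made up for the example, and the share of baskets containing a pair (often called support) is only one of several measures that could be used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "eggs", "milk"},
]

# Count how many baskets contain each pair of items.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Share of all baskets that contain each pair.
pair_support = {
    pair: count / len(transactions) for pair, count in pair_counts.items()
}

for pair, support in sorted(pair_support.items(), key=lambda kv: -kv[1]):
    print(pair, support)
```

Pairs that appear together in a large share of baskets are candidates for rules such as "customers who buy bread also tend to buy milk".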

Question 15

Q

What does collaborative filtering aim to provide, and how?

Answer

A

Aims to provide personalized recommendations by
leveraging user interactions and similarities

Question 16

Q

What are the two ways to measure similarity?

Answer

A

Pearson correlation and
cosine similarity

Question 17

Q

What is the range when using Pearson correlation?

Answer

A

-1 (perfect negative) to 1 (perfect positive)
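
To make the range concrete, here is a minimal sketch of Pearson correlation between two users' ratings of the same items. The function name and the sample ratings are assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # undefined when one user gives the same rating to everything
    return cov / sqrt(var_x * var_y)

# Ratings that rise together give a value near 1.
same_taste = pearson([1, 2, 3, 4], [2, 4, 6, 8])
# Ratings that move in opposite directions give a value near -1.
opposite_taste = pearson([1, 2, 3, 4], [8, 6, 4, 2])
print(same_taste, opposite_taste)
```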

Question 18

Q

What is the range when using cosine similarity?

Answer

A

0 (no similarity) to 1 (perfect similarity) for non-negative ratings; if negative values are allowed, cosine similarity can range from -1 to 1
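
A minimal sketch of cosine similarity between two rating vectors follows. The function name and the sample vectors are assumptions made for illustration; unrated items are represented as 0 here, which is one common but not universal convention.

```python
from math import sqrt

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return None  # undefined for an all-zero vector
    return dot / (norm_x * norm_y)

# Vectors pointing the same way give a value near 1.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
# Users with no items in common give 0.
no_overlap = cosine_similarity([5, 0, 0], [0, 4, 0])
print(same_direction, no_overlap)
```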

Question 19

Q

Why can't collaborative filtering be used to create recommendations for new users or new items?

Answer

A

It suffers from the cold-start problem: new users and new items have no rating history from which to compute similarities
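
The sketch below shows, under assumed data, why a new user or a new item gives collaborative filtering nothing to work with. The user names, item names, and helper function are hypothetical.

```python
# Hypothetical ratings; "new_user" has not rated anything yet.
ratings = {
    "alice": {"item_a": 5, "item_b": 3},
    "bob": {"item_a": 4, "item_b": 2, "item_c": 5},
    "new_user": {},
}

def co_rated_items(user_a, user_b):
    """Items both users have rated; similarity is computed over these."""
    return set(ratings[user_a]) & set(ratings[user_b])

# Established users share items, so a similarity can be computed.
established_overlap = co_rated_items("alice", "bob")
# A new user shares nothing with anyone, so no similarity is available.
new_user_overlap = co_rated_items("new_user", "bob")
# A new item has no raters, so it cannot be linked to other items.
raters_of_new_item = [u for u, r in ratings.items() if "item_new" in r]

print(established_overlap, new_user_overlap, raters_of_new_item)
```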

Question 20

Q

What are the advantages of the clustering alternative?

Answer

A

Moves the large computations into the clustering step, so comparing a user to clusters is faster and cheaper

Question 21

Q

What are the disadvantages of the clustering alternative?

Answer

A

Less accurate recommendations
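
One way to picture the trade-off is the sketch below, which assumes the clusters were computed ahead of time and compares a user only with the cluster centroids rather than with every other user. The cluster names, centroid values, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical centroids: average ratings for three items per cluster,
# assumed to have been computed ahead of time.
centroids = {
    "action_fans": [4.5, 1.5, 3.0],
    "drama_fans": [1.0, 4.5, 3.5],
}

def euclidean(x, y):
    """Straight-line distance between two rating vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def closest_cluster(user_ratings):
    """Compare the user to a few centroids instead of to all users."""
    return min(centroids, key=lambda name: euclidean(user_ratings, centroids[name]))

assigned = closest_cluster([5, 1, 3])
print(assigned)
```

Only two comparisons are needed here, which is the speed gain; the cost is that members of the closest cluster are not necessarily the users most similar to this one.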

Question 22

Q

What is the item-based alternative?

Answer

A

Find the items that were co-rated by KNN user(s) with the item of interest, then
recommend the most popular items among those similar items
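
The two steps can be sketched as below. This is a simplified illustration: it counts co-ratings across all users rather than restricting to a neighbourhood, and the user IDs, item names, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical data: the set of items each user has rated.
rated_items = {
    "u1": {"camera", "tripod", "lens"},
    "u2": {"camera", "lens"},
    "u3": {"camera", "lens", "bag"},
    "u4": {"tripod", "bag"},
    "u5": {"camera", "tripod"},
}

def item_based_recommend(rated_items, item_of_interest, top_n=2):
    """Return the items most often co-rated with item_of_interest."""
    co_rated = Counter()
    for items in rated_items.values():
        if item_of_interest in items:
            # Step 1: collect items rated alongside the item of interest.
            for other in items:
                if other != item_of_interest:
                    co_rated[other] += 1
    # Step 2: recommend the most popular of those co-rated items.
    return [item for item, _ in co_rated.most_common(top_n)]

recommendations = item_based_recommend(rated_items, "camera")
print(recommendations)
```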

Question 23

Q

What is the user-based alternative?

Answer

A

Recommends items by identifying users with similar preferences and suggesting items those users liked
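
A minimal sketch of the user-based approach follows, assuming cosine similarity over co-rated items and a single nearest neighbour. The ratings, names, and function names are hypothetical, and real systems typically use several neighbours.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"m1": 5, "m2": 4, "m3": 1},
    "ben": {"m1": 5, "m2": 5, "m4": 4},
    "cat": {"m1": 1, "m2": 2, "m3": 5, "m5": 5},
}

def cosine_on_shared(a, b):
    """Cosine similarity computed only over items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def user_based_recommend(ratings, target):
    """Find the most similar user and suggest their items the target lacks."""
    others = [u for u in ratings if u != target]
    nearest = max(others, key=lambda u: cosine_on_shared(ratings[target], ratings[u]))
    unseen = {i: r for i, r in ratings[nearest].items() if i not in ratings[target]}
    # Highest-rated unseen items first.
    return nearest, sorted(unseen, key=unseen.get, reverse=True)

nearest, recs = user_based_recommend(ratings, "ann")
print(nearest, recs)
```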