Week 6 Flashcards

1
Q

Support Vector Machine

A
  • Another Linear Classifier
  • SVMs use a boundary called a hyperplane to partition data into groups of similar values
  • a vector space-based machine learning method that aims to find a decision boundary between two classes that is maximally far from any point in the training data
  • the goal of an SVM is to create a flat boundary called a hyperplane (a straight line in two dimensions) that divides the space into fairly homogeneous partitions on either side
  • SVM learning combines aspects of both instance-based nearest neighbor learning (lazy learning) and linear regression
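To make the hyperplane idea concrete, here is a minimal sketch of a linear SVM trained on six made-up 2-D points. It assumes sub-gradient descent on the regularized hinge loss, which is a simple stand-in for the dedicated solvers that production libraries use. The data and the `lr`, `reg`, and `epochs` values are illustrative assumptions, not part of the course material.

```python
# Toy 2-D training data (made up for illustration); labels are +1 or -1.
points = [(2.0, 2.0), (3.0, 3.0), (3.0, 1.0),
          (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]


def train_linear_svm(points, labels, lr=0.1, reg=0.01, epochs=100):
    """Fit w and b of the hyperplane w.x + b = 0 by sub-gradient descent
    on the regularized hinge loss (hyperparameters are assumptions)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            # Functional margin of this point under the current hyperplane.
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Regularization step: shrink w, which favors a wider margin.
            w = [wi * (1 - lr * reg) for wi in w]
            # Hinge-loss step: only points inside the margin move the boundary.
            if margin < 1:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b


def predict(w, b, point):
    """Classify a point by which side of the hyperplane it falls on."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, p) for p in points]
```

Only points whose margin is below 1 trigger an update, which reflects the idea that the boundary is determined by the training points closest to it.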
2
Q

Applications of SVM

A
  • classification of microarray gene expression data in the field of bioinformatics to identify cancer or other genetic diseases
  • text categorization such as identification of the language used in a document or the classification of documents by subject matter
  • the detection of rare yet important events like combustion engine failure, security breaches, or earthquakes
3
Q

Collaborative Filtering Algorithm

A
  • The technique used by Recommender Systems
  • user-based filtering system
  • if users A and B have purchased similar items in the past, the recommender system recommends items purchased by user B to user A
  • the similarity in behavior between two users is often computed with cosine similarity or Euclidean distance
  • the smaller the angle between the two users' vectors (that is, the higher the cosine similarity), the more similar their behavior
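The user-based idea above can be sketched in a few lines. This is a minimal illustration, assuming each user is represented as a 0/1 purchase vector over the same items; the users, items, and the `recommend` helper are hypothetical names invented for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no purchases is similar to nobody
    return dot / (norm_u * norm_v)


# Hypothetical purchase history: 1 = bought, 0 = not bought.
items = ["book", "lamp", "mug", "pen"]
purchases = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 1, 0],
    "C": [0, 0, 0, 1],
}


def recommend(target, purchases, items):
    """Recommend items bought by the most similar user but not by target."""
    others = [user for user in purchases if user != target]
    nearest = max(
        others,
        key=lambda user: cosine_similarity(purchases[target], purchases[user]),
    )
    return [
        item
        for item, mine, theirs in zip(items, purchases[target], purchases[nearest])
        if theirs and not mine
    ]


recommended_for_a = recommend("A", purchases, items)
```

Here user B is the closest match to user A, so the one item B bought that A has not (the mug) is what gets recommended.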
4
Q

Content-Based Filtering Algorithm

A
  • another algorithm used by recommender systems
  • unlike collaborative filtering, content-based algorithms use features of the items, such as genre or artist
  • if user A has been buying Harry Potter books, it is likely that user A will purchase another fantasy book such as ‘The Hobbit’
  • one technique to compute the similarity between two items is cosine similarity or Euclidean distance
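As a minimal sketch of the content-based approach, the example below compares items by their feature vectors using Euclidean distance. The genre scores and the `most_similar_item` helper are assumptions made up for illustration; a real system would derive the features from item metadata.

```python
import math

# Hypothetical genre scores per title, in the order:
# fantasy, adventure, romance, science.
item_features = {
    "Harry Potter": [1.0, 0.8, 0.0, 0.0],
    "The Hobbit": [1.0, 1.0, 0.0, 0.0],
    "Pride and Prejudice": [0.0, 0.0, 1.0, 0.0],
    "A Brief History of Time": [0.0, 0.0, 0.0, 1.0],
}


def most_similar_item(liked, item_features):
    """Return the other item whose feature vector is closest to `liked`."""
    candidates = [title for title in item_features if title != liked]
    return min(
        candidates,
        # math.dist computes the Euclidean distance (Python 3.8+).
        key=lambda title: math.dist(item_features[liked], item_features[title]),
    )


suggestion = most_similar_item("Harry Potter", item_features)
```

No other users are involved: the suggestion depends only on how close the items' own features are.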
5
Q

Feature Engineering

A
  • process of extracting features from a raw dataset
  • a term coined to give due importance to the domain knowledge required to select sets of features for machine learning algorithms
  • as technology becomes more sophisticated, more datasets will be available
  • but do we need all the features/variables of the dataset?
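One simple way to approach the question of whether every variable is needed is to score each feature against the target and keep only the informative ones. The sketch below assumes Pearson correlation as the score and a cutoff of 0.5; the dataset, the feature names, and the threshold are all made up for illustration, and in practice domain knowledge should guide the choice.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_x * var_y)


# Hypothetical raw dataset: three candidate features and a target.
features = {
    "size_m2": [50, 60, 80, 100, 120],
    "constant_flag": [1, 1, 1, 1, 1],
    "noise": [3, 1, 4, 1, 5],
}
price = [150, 180, 240, 300, 360]


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target
    reaches the (assumed) threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


selected = select_features(features, price)
```

In this toy dataset only the size column tracks the price closely, so the constant and noisy columns are dropped.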
6
Q

Advantages of feature engineering

A
  • improved predictive performance of the model
  • faster and less complex machine learning process
  • a better understanding of the underlying data relationships
  • explainable and implementable machine learning models and solutions
7
Q
A