Unit 2: Techniques for Supervised and Unsupervised Learning Flashcards
What is supervised learning, and what are its key characteristics?
Supervised Learning: A type of machine learning where the model is trained on labeled data (input-output pairs).
Key Characteristics:
Labeled Data: Each training example is paired with an output label.
Objective: To learn a mapping from inputs to outputs.
Applications: Classification (e.g., spam detection) and regression (e.g., predicting prices).
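As a rough illustration of learning a mapping from labeled input-output pairs, the sketch below fits a straight line to a handful of made-up (size, price) examples using ordinary least squares. The function names `fit_line` and `predict` and the data values are assumptions chosen for illustration, not part of the flashcards.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the learned mapping to a new, unlabeled input."""
    return slope * x + intercept


# Hypothetical labeled data: each input (size) is paired with an output label (price).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
estimate = predict(90, slope, intercept)
```

The labels are what make this supervised: the fit is driven entirely by the difference between predicted and known outputs.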
List common algorithms used in supervised learning and their applications.
Common Algorithms:
Linear Regression: Used for predicting continuous outcomes. Example: Predicting housing prices based on features like size and location.
Logistic Regression: Used for binary classification problems. Example: Predicting whether an email is spam or not. Equation: $P(Y=1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$
Decision Trees: Used for both classification and regression tasks. Example: Classifying types of fruits based on attributes like color and weight.
Support Vector Machines (SVM): Effective for high-dimensional spaces in classification. Example: Image classification tasks.
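The logistic regression equation above can be evaluated directly. The following is a minimal sketch for a single feature; the coefficient values and the 0.5 decision threshold are assumptions picked for illustration, and in practice the coefficients would be learned from labeled data.

```python
import math


def predict_proba(x, beta0, beta1):
    """P(Y=1 | X=x) for a one-feature logistic regression model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def classify(x, beta0, beta1, threshold=0.5):
    """Turn the probability into a binary label (1 = positive class)."""
    return 1 if predict_proba(x, beta0, beta1) >= threshold else 0


# Hypothetical coefficients, not fitted to any real dataset.
beta0, beta1 = -1.0, 1.0

p_low = predict_proba(0.0, beta0, beta1)   # below 0.5, so classified as 0
p_high = predict_proba(2.0, beta0, beta1)  # above 0.5, so classified as 1
```

When the linear term β0 + β1·x equals zero, the probability is exactly 0.5, which is why that value is a common default threshold.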
List common algorithms used in unsupervised learning and their applications.
Common Algorithms:
K-Means Clustering: Groups data points into k clusters based on feature similarity. Example: Segmenting customers into different groups based on purchasing behavior. Equation: Minimize $\sum_{i=1}^{k} \sum_{x_j \in C_i} \lVert x_j - \mu_i \rVert^2$ (where $C_i$ is the set of points assigned to cluster $i$ and $\mu_i$ is its centroid).
Hierarchical Clustering: Builds a hierarchy of clusters using either agglomerative (bottom-up merging) or divisive (top-down splitting) methods. Example: Creating a dendrogram to visualize customer groupings.
Principal Component Analysis (PCA): Reduces dimensionality while preserving as much of the variance in the data as possible. Example: Reducing the number of features in a dataset to visualize data in two dimensions.
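To make the k-means objective concrete, here is a small pure-Python sketch of the usual alternating procedure (assign each point to its nearest centroid, then move each centroid to the mean of its points). The helper names, the toy points, and the starting centroids are all assumptions for illustration; library implementations add smarter initialization and other refinements.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if not members:
            new_centroids.append(old)  # leave an empty cluster's centroid in place
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def objective(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical 2-D points forming two obvious groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels, centroids = k_means(points, centroids=[(0, 0), (10, 0)])
cost = objective(points, labels, centroids)
```

Each step can only lower the objective or leave it unchanged, so the loop settles on a local minimum, which may depend on the starting centroids.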
What is unsupervised learning, and how does it differ from supervised learning?
Unsupervised Learning: A type of machine learning where the model is trained on unlabeled data.
Key Differences from Supervised Learning:
Lack of Labeled Data: No explicit output labels for training examples.
Objective: To identify patterns or groupings within the data.
Applications: Clustering (e.g., customer segmentation) and association (e.g., market basket analysis).
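As a small illustration of association on unlabeled data, the sketch below counts how often pairs of items appear together across shopping baskets, a basic first step in market basket analysis. The function name `pair_support` and the baskets are made up for illustration; full association-rule methods go further than this.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Fraction of transactions that contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical baskets; note that no basket carries an output label.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread"],
]

support = pair_support(baskets)
```

Nothing tells the code which pairs matter; the pattern is found from co-occurrence alone, which is the defining feature of unsupervised learning.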