Artificial Intelligence & Applications Flashcards
What is Artificial Intelligence (AI)?
AI is the ability of machines to think and act intelligently, like a human would.
How does AI differ from traditional programming?
AI learns from data and makes decisions, while traditional programming involves telling a computer exactly what to do.
What are key areas of AI?
- Computer Vision
- Machine Learning (ML)
- Deep Learning (DL)
- Data Mining
What is Weak AI?
AI built for one specific task or a narrow range of tasks; it does not think like a human.
Give examples of Weak AI.
- Siri and Alexa
- Chess-playing AI
- Chatbots like GPT-4
What is Strong AI?
AI that can think, learn, and adapt like a human.
Does Strong AI exist?
Not yet! Scientists are still trying to create it.
What is Machine Learning (ML)?
A method where AI is trained on data instead of being programmed manually.
What are the two main types of Machine Learning?
- Supervised Learning
- Unsupervised Learning
What is Supervised Learning?
The AI is trained on labeled data (data with answers).
What is an example of Supervised Learning?
Teaching AI to recognize cats using labeled cat pictures.
What is Unsupervised Learning?
AI finds patterns in data on its own, without labels.
What is the difference between Machine Learning and Data Mining?
ML = algorithms learn patterns from data in order to make predictions on new data; Data Mining = analysts use algorithms and statistics to discover previously unknown patterns in large datasets.
What is Deep Learning (DL)?
A special kind of Machine Learning that uses neural networks with many layers.
What are key models in Deep Learning?
- CNNs (Convolutional Neural Networks)
- RNNs (Recurrent Neural Networks)
- Autoencoders
What is CNN best for?
Images.
What is RNN best for?
Sequences like speech and text.
Where is AI used?
- Computer Vision
- Natural Language Processing (NLP)
- Generative AI
What is an application of AI in Computer Vision?
Medical imaging for detecting diseases from scans.
What does Natural Language Processing (NLP) enable AI to do?
Understand, interpret, and generate human language.
What is an example of Generative AI?
GANs (Generative Adversarial Networks) that create new images, videos, and music.
Fill in the blank: AI is __________ technology that mimics human intelligence.
[smart]
True or False: Deep Learning makes AI less powerful than traditional Machine Learning.
False.
What are the goals of data exploration?
✔️ Visualize patterns and trends
✔️ Summarize key statistics
✔️ Detect anomalies
✔️ Understand relationships between variables
Example: Analyzing diamond prices for trends related to carat size and cut.
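The "summarize key statistics" and "understand relationships" goals can be sketched with Python's standard library. The (carat, price) pairs below are made-up illustration values, not real diamond data, and `pearson` is a helper written for this sketch.

```python
import statistics

# Hypothetical (carat, price) pairs, invented for illustration only.
diamonds = [(0.3, 500), (0.5, 1200), (0.7, 2300), (1.0, 4800), (1.2, 6500), (1.5, 9800)]
carats = [c for c, _ in diamonds]
prices = [p for _, p in diamonds]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

# Key statistics plus one measure of the carat/price relationship.
summary = {
    "mean_price": statistics.mean(prices),
    "median_price": statistics.median(prices),
    "stdev_price": statistics.stdev(prices),
    "carat_price_corr": pearson(carats, prices),
}
print(summary)
```

A correlation close to 1 would suggest that price rises with carat size, which is the kind of trend a scatter plot should then confirm visually.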
Why is data visualization important?
📌 Find patterns and trends
📌 Understand relationships between variables
📌 Detect errors or missing data
📌 Make data easier to interpret
Quote: ‘Make both calculations and graphs.’ – F.J. Anscombe, 1973
What types of charts are used for different purposes?
🔹 Relationship → Scatter plots
🔹 Composition → Pie charts
🔹 Comparison → Bar charts, line graphs
🔹 Location → Maps & heatmaps
Example: Scatter plots for height vs. weight.
What did researchers Cleveland & McGill find about chart design?
✔️ Position & length are the most accurate ways to show numbers
✔️ Pie charts are harder to interpret than bar charts
Best Practice: Use clear, simple charts.
What is the Grammar of Graphics?
✔️ A structured approach to designing visualizations
✔️ Ensures consistency in designing graphs
✔️ Used in tools like ggplot2 in R
Helps in creating clear visualizations.
What are common data issues in data pre-processing?
❌ Missing Values
❌ Duplicates
❌ Inconsistent Data
❌ Noise & Outliers
Solutions include filling missing values and standardizing formats.
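A minimal sketch of those solutions, assuming records are stored as dictionaries and that a missing price is marked with `None`. The field names, the sample rows, and the choice of mean imputation are assumptions made for this example.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"cut": "Ideal", "price": 500},
    {"cut": "ideal ", "price": None},    # inconsistent format and missing price
    {"cut": "Premium", "price": 900},
    {"cut": "Premium", "price": 900},    # exact duplicate
]

def clean(records):
    # 1. Standardize formats: trim whitespace and normalize capitalization.
    rows = [{"cut": r["cut"].strip().title(), "price": r["price"]} for r in records]
    # 2. Drop exact duplicates while preserving the original order.
    seen, unique = set(), []
    for r in rows:
        key = (r["cut"], r["price"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # 3. Fill missing prices with the mean of the known prices.
    known = [r["price"] for r in unique if r["price"] is not None]
    fill = sum(known) / len(known)
    for r in unique:
        if r["price"] is None:
            r["price"] = fill
    return unique

cleaned = clean(raw)
print(cleaned)
```

Duplicates are removed before imputation so that a repeated row does not skew the mean used to fill the gaps.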
What is feature engineering in data pre-processing?
✔️ Feature Selection → Keep important variables
✔️ Feature Transformation → Convert data into better formats
Example: Standardizing price and carat size in a diamonds dataset.
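One common way to standardize a feature is the z-score: subtract the mean and divide by the standard deviation. The sketch below assumes that interpretation and uses invented carat and price values.

```python
import statistics

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

carats = [0.3, 0.5, 0.7, 1.0, 1.5]       # hypothetical values
prices = [500, 1200, 2300, 4800, 9800]   # hypothetical values

carats_z = standardize(carats)
prices_z = standardize(prices)
print(carats_z)
print(prices_z)
```

After this transformation both features are on the same scale, so neither dominates a distance-based method such as k-NN or K-Means simply because its raw numbers are larger.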
Fill in the blank: The package used for creating visualizations in R is _______.
[ggplot2]
What is the purpose of a scatter plot in data visualization?
Helps us see the relationship between two variables
Example: Carat vs. Price in diamond datasets.
True or False: Cleaning data is essential for accuracy.
True
What are the key takeaways from the data exploration and visualization process?
✔️ Data exploration helps us understand patterns
✔️ Visualization is key for discovering insights
✔️ Choosing the right chart aids interpretation
✔️ The Grammar of Graphics helps create structured visualizations
✔️ Cleaning data is essential for accuracy
When do we use Machine Learning?
When no direct formula exists to solve a problem and when we have data that can help find patterns.
Example: Predicting customer purchase behavior.
What is Supervised Learning?
A type of ML where the model is trained on labeled data, learning from known answers.
What are examples of features (input variables) in a Supervised Learning problem, such as evaluating cars?
- Buying Price
- Maintenance Cost
- Number of Doors
- Seating Capacity
- Luggage Boot Size
- Safety Rating
What is Predictive Modeling?
When ML learns patterns from data to make predictions.
What is the difference between Regression and Classification?
- Regression → Predicts continuous values (e.g., house prices).
- Classification → Assigns data into categories (e.g., spam or not spam).
What is Linear Regression?
A method to find the best-fit line y = mx + c, where m is the slope and c is the intercept.
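A minimal sketch of how the slope m and intercept c can be estimated, assuming the usual least-squares criterion (minimize the sum of squared vertical errors). The function name `fit_line` and the data points are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # The best-fit line always passes through the point of means.
    c = mean_y - m * mean_x
    return m, c

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # generated from y = 2x + 1
m, c = fit_line(xs, ys)
print(m, c)                # 2.0 1.0
```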
What are the limitations of Linear Regression?
- Not good for non-linear relationships
- Sensitive to outliers, which can pull the line away from the main trend.
What is a Decision Tree?
A flowchart-like structure where each decision leads to an outcome.
What is the process of creating a Decision Tree?
- Pick the best feature
- Split the data into groups
- Keep splitting until groups are pure or a stopping rule (such as a maximum depth) is reached.
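The "pick the best feature" step can be sketched as follows, assuming Gini impurity as the purity measure (entropy is another common choice). The tiny dataset and the function names are made up for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for more mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the purest binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical rows: (number of doors, safety rating)
rows = [(2, 1), (4, 1), (4, 3), (5, 4)]
labels = ["bad", "bad", "good", "good"]
split = best_split(rows, labels)
print(split)   # (1, 1, 0.0): splitting on safety rating <= 1 gives pure groups
```

A full tree would apply `best_split` recursively to each resulting group until the groups are pure.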
What is Random Forest?
An ensemble of many decision trees whose predictions are combined to improve accuracy and reduce overfitting.
How does Random Forest work?
- Train many Decision Trees, each on a random sample of the data
- Consider only a random subset of features at each split
- Combine all tree predictions (majority vote for classification, average for regression).
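The three steps above can be sketched in simplified form. This is not a full Random Forest: each "tree" here is a one-split stump that uses a single randomly chosen feature, which is an assumption made to keep the example short. The data and names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, labels, feature):
    """One-split tree on a single feature; falls back to the majority class."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [lab for r, lab in zip(rows, labels) if r[feature] <= t]
        right = [lab for r, lab in zip(rows, labels) if r[feature] > t]
        if not left or not right:
            continue
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:                       # no valid split: constant prediction
        constant = majority(labels)
        return lambda row: constant
    _, t, l_lab, r_lab = best
    return lambda row: l_lab if row[feature] <= t else r_lab

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # random sample with replacement
        feature = rng.randrange(n_features)          # random feature for this stump
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    return majority([tree(row) for tree in forest])  # combine by majority vote

rows = [(1, 2), (2, 1), (3, 3), (4, 2), (7, 8), (8, 7), (9, 9), (10, 8)]
labels = ["low"] * 4 + ["high"] * 4
forest = train_forest(rows, labels)
print(predict(forest, (0, 0)), predict(forest, (11, 11)))
```

Because every stump sees different data and a different feature, individual mistakes tend to be outvoted, which is the intuition behind the reduced overfitting.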
What is k-Nearest Neighbors (k-NN)?
A method that classifies new data points based on the ‘k’ closest points in the dataset.
What is the process for k-NN classification?
- Choose k
- Compute distances from the new point to all training points
- Find the k closest neighbors
- Assign the most common class label among the neighbors (for regression, take the average of their values instead).
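A minimal sketch of these steps for classification, assuming Euclidean distance and a small made-up training set. `knn_classify` is a name chosen for this example.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """train is a list of (point, label) pairs; returns the majority label of the k nearest."""
    # Compute the distance from the new point to every training point and sort.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], new_point))
    # Keep the labels of the k closest neighbors.
    nearest_labels = [label for _, label in by_distance[:k]]
    # Assign the most common label among them.
    return Counter(nearest_labels).most_common(1)[0][0]

train = [((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
         ((6, 6), "dog"), ((7, 6), "dog"), ((6, 7), "dog")]

print(knn_classify(train, (2, 2)))   # cat
```

Note that every prediction scans the whole training set, which is why the method becomes slow as the dataset grows.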
What is a limitation of k-NN?
Prediction is slow for large datasets, because the distance to every training point must be computed.
List the main concepts of Supervised Learning.
- Uses labeled data
- Regression vs. Classification
- Linear Regression
- Decision Trees
- Random Forest
- k-NN
What is the goal of Linear Regression?
To find the best-fit line that represents the relationship between variables.
True or False: Decision Trees can overfit.
True
Fill in the blank: Random Forest is an army of _______.
[Decision Trees]
What is Unsupervised Learning?
Unsupervised learning is a machine learning approach where the model learns patterns, structures, or groupings in the data without labeled outputs.
Key characteristics include working with unlabeled data, finding hidden structures, and being used for clustering and dimensionality reduction.
What are the key characteristics of Unsupervised Learning?
- Works with unlabeled data (no predefined categories)
- Finds hidden structures in data
- Used for clustering & dimensionality reduction
Example: Grouping similar books in an unorganized library.
What is Clustering?
Clustering is a method in unsupervised learning that groups similar data points together.
Within a cluster, data points are similar; in different clusters, they are dissimilar.
Why is Clustering used?
- Data Reduction
- Outlier Detection
- Data Segmentation
Clustering helps summarize large datasets, identifies unusual patterns, and groups customers by behavior.
What are some real-world applications of Clustering?
- Social Network Analysis
- Image Segmentation
- Data Annotation
Examples include grouping users based on interests and dividing images for medical imaging.
What are the steps in Clustering?
- Define a distance metric to measure similarity
- Form clusters by grouping similar data points
- Maximize within-cluster similarity, minimize between-cluster similarity
Common distance metrics include Euclidean and Manhattan distances.
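Both metrics can be written in a few lines. The sketch assumes points are given as equal-length tuples of numbers.

```python
def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    """Grid distance: sum of the absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7
```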
What is K-Means Clustering?
K-Means is a partition-based clustering algorithm that groups data into k clusters.
It is one of the most popular clustering algorithms.
How does K-Means Clustering work?
- Choose the number of clusters (k)
- Select k random points as initial centroids
- Assign each data point to the nearest centroid
- Recalculate centroids by finding the mean of each cluster
- Repeat until centroids stop changing
Example: Grouping customers into low, medium, and high spenders.
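The loop above can be sketched for one-dimensional data such as customer spending. The spending values, the function name `kmeans_1d`, and the fixed random seed are assumptions for illustration.

```python
import random

def kmeans_1d(values, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(values, k)            # k random points as initial centroids
    for _ in range(max_iter):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its cluster (keep the old one if empty).
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:                 # centroids stopped changing
            break
        centroids = updated
    return centroids, clusters

spending = [10, 12, 15, 50, 55, 60, 200, 210, 220]   # hypothetical customer spending
centroids, clusters = kmeans_1d(spending, k=3)
print(centroids)
print(clusters)
```

The result depends on which starting points are drawn, so a different seed may produce different groups. Running the algorithm several times and keeping the best result is a common safeguard.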
What is the Elbow Method in K-Means?
The Elbow Method involves plotting the Within-Cluster Sum of Squares (WCSS) against the number of clusters (k) and looking for the ‘elbow’ point where adding more clusters stops improving the fit significantly.
The bend in the curve indicates the optimal number of clusters (k).
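A sketch of how the WCSS values behind such a plot could be computed, again for one-dimensional data. It includes its own compact K-Means so that it runs on its own; the data, the function names, and the use of ten restarts per k are assumptions of this example. Drawing the curve itself would need a plotting library and is left out.

```python
import random

def kmeans_1d(values, k, rng, max_iter=100):
    centroids = rng.sample(values, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda i: abs(v - centroids[i]))].append(v)
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

def wcss(centroids, clusters):
    """Sum of squared distances from each point to its cluster's centroid."""
    return sum((v - c) ** 2 for c, cluster in zip(centroids, clusters) for v in cluster)

def elbow_curve(values, max_k, restarts=10, seed=0):
    """Map each k to the lowest WCSS found over several random restarts."""
    rng = random.Random(seed)
    return {
        k: min(wcss(*kmeans_1d(values, k, rng)) for _ in range(restarts))
        for k in range(1, max_k + 1)
    }

spending = [10, 12, 15, 50, 55, 60, 200, 210, 220]   # hypothetical data
curve = elbow_curve(spending, max_k=5)
for k, value in curve.items():
    print(k, round(value, 1))
```

Reading the printed values from k = 1 upward, the elbow is the k after which the WCSS drops only slightly.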
What are the strengths of K-Means Clustering?
- Simple and efficient
- Works well for large datasets
K-Means is favored for its speed and ease of use.
What are the weaknesses of K-Means Clustering?
- Requires predefined k
- Sensitive to initialization
- Struggles with non-globular clusters
These limitations can affect the clustering results.