MODULE 1 Flashcards
M1S1 - M1S2
Algorithm where samples are used for training.
Machine Learning Algorithm
It is a research field at the intersection of statistics, artificial intelligence, and computer science.
Machine Learning
It is the practice of cleaning, altering, and reorganizing raw data prior to processing and analysis.
Preprocessing
The data contains inconsistent records.
Inconsistency
The data contains incorrect records or exceptions.
Noise
Creating plots and charts to visualize data distributions and relationships.
Visualization
T/F
The performance of ML algorithms adaptively improves with an increase in the number of available samples during the ‘training’ processes.
FALSE: (‘learning’)
T/F
Data reduction is a data cleansing technique.
FALSE (data reduction is a feature engineering technique)
T/F
Reducing noise in data is a feature engineering technique.
FALSE (reducing noisy data is a data cleansing technique)
It covers the ethical and moral obligations of collecting, sharing, and using data, with a focus on ensuring that data is used fairly and for good.
Data Ethics
Best Practices for Successful ML Model Deployment
- Choosing the Right Infrastructure
- Effective Versioning and Tracking
- Robust Testing and Validation
- Implementing Monitoring and Alerting
Data: _____________
Learning Algorithms: ______________
Basic Understanding: ______________
Experience (E)
Task (T)
Measure (P)
Even when intentions are good, the ___________ of data analysis can cause inadvertent harm to individuals or groups of people.
Outcome
Once deployed, models need to be continuously monitored.
Monitoring
A field of study concerned with giving computers the ability to learn without being explicitly programmed.
Machine Learning
It is a collection of data used in machine learning tasks.
Dataset
Feature Engineering techniques
- Feature scaling or normalization
- Data reduction
- Discretization
- Feature encoding
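As a study aid, three of the techniques listed above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the sample values, and the bin edges are all assumptions made for the example, and data reduction is left out for brevity.

```python
from bisect import bisect_right


def min_max_scale(values):
    """Feature scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, edges):
    """Discretization: replace each value with the index of its bin.

    `edges` must be sorted; a value equal to an edge falls in the upper bin.
    """
    return [bisect_right(edges, v) for v in values]


def one_hot(labels):
    """Feature encoding: one-hot encode categorical labels."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical sample data, chosen only for illustration.
ages = [18, 30, 42, 66]
colors = ["red", "blue", "red"]

scaled_ages = min_max_scale(ages)       # [0.0, 0.25, 0.5, 1.0]
age_bins = discretize(ages, [25, 50])   # [0, 1, 1, 2]
encoded_colors = one_hot(colors)        # [[0, 1], [1, 0], [0, 1]]
```

In the encoded output the columns follow the sorted category order, so the first column is "blue" and the second is "red".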
Calculating measures like mean, median, variance, and standard deviation.
Summary Statistics
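A short sketch of these measures using Python's standard `statistics` module. The sample list is made up for illustration, and the population forms of variance and standard deviation are used here as an assumption; `statistics.variance` and `statistics.stdev` give the sample versions instead.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)           # arithmetic average
median = statistics.median(data)       # middle value of the sorted data
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

print(mean, median, variance, std_dev)
```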
Data Cleansing techniques
- Identify and sort out missing data
- Reduce noisy data
- Identify and remove duplicates
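The three cleansing steps above can be sketched in plain Python on a small list of records. The record layout, the helper names, and the valid temperature range are assumptions made for this example. A simple range check also stands in for noise reduction here, which in practice can involve other methods such as smoothing.

```python
# Hypothetical records, chosen only for illustration.
records = [
    {"id": 1, "temp": 21.5},
    {"id": 2, "temp": None},   # missing value
    {"id": 1, "temp": 21.5},   # exact duplicate of the first record
    {"id": 3, "temp": 22.0},
    {"id": 4, "temp": 95.0},   # implausible reading, treated as noise
]


def drop_missing(rows):
    """Identify and sort out missing data: keep rows with no None values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_duplicates(rows):
    """Identify and remove duplicates, keeping the first occurrence."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def drop_out_of_range(rows, field, lo, hi):
    """Reduce noisy data: keep rows whose field lies within [lo, hi]."""
    return [r for r in rows if lo <= r[field] <= hi]


# Missing values are removed first so the range check never sees None.
cleaned = drop_out_of_range(
    drop_duplicates(drop_missing(records)), "temp", -50.0, 60.0
)
cleaned_ids = [r["id"] for r in cleaned]  # [1, 3]
```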
It is used to understand the main characteristics of the data, discover patterns, spot anomalies, test a hypothesis, or check assumptions.
Exploratory Data Analysis (EDA)
The process of creating a model from data is called ___________
Learning (training)
Rule-based algorithms: Condition
Machine Learning: _________.
Model
Algorithm where explicit programming is used.
Rule-Based Algorithm
It refers to the process of using the model obtained after learning for prediction.
Testing
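The contrast between a hand-written condition and a learned model, and between learning (training) and testing, can be illustrated with a toy sketch. Everything here is hypothetical: the spam scenario, the function names, and the one-threshold "model" are assumptions chosen to keep the example small, not a real ML algorithm from the course.

```python
def rule_based_is_spam(num_links):
    """Rule-based algorithm: the condition is written by the programmer."""
    return num_links > 3


def train_threshold(samples):
    """Learning (training): choose the threshold with the fewest errors.

    `samples` is a list of (num_links, is_spam) pairs.
    """
    candidates = sorted({x for x, _ in samples})
    best_threshold, best_errors = None, None
    for t in candidates:
        errors = sum((x > t) != label for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, num_links):
    """Testing: use the learned model to predict for a new sample."""
    return num_links > threshold


# Hypothetical labeled training samples: (number of links, is spam).
training_samples = [(0, False), (1, False), (2, False), (5, True), (7, True)]

model = train_threshold(training_samples)  # learned threshold: 2
prediction_a = predict(model, 6)           # True
prediction_b = predict(model, 1)           # False
```

The rule-based function never changes unless someone edits the code, while the learned threshold depends on the samples it is trained on.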
It is the crucial process of integrating the ML model into its production environment. It is also among the most challenging, involving several moving pieces and tools, and requiring data scientists and ML engineers to collaborate and strategize.
Model Deployment
In addition to owning their personal information, data subjects have a right to know how you plan to collect, store, and use it.
Transparency
Another ethical responsibility that comes with handling data is ensuring data subjects’ ____________
Privacy