Module 3 Flashcards
List the 4 phases of the AI development lifecycle
- Planning
- Design
- Development
- Implementation
What needs to be done in the planning stage of the AI development lifecycle?
- Determine the business objectives and requirements
- Determine the scope of the project
- Determine the governance structure and responsibilities
List the main types of business problems
- Classification business problem
- Regression business problem
- Recommendation business problem
What is a classification business problem?
You want to use an AI system to classify data into different types
What is a regression business problem?
You are looking for an AI system to predict a numeric value in the future (for example, next month's demand) based on past data
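A minimal Python sketch of the difference between the two problem types above: classification returns a category, regression returns a number. This is an illustration only, not part of the course material. The function names (`fit_centroids`, `classify`, `fit_line`) and the sample data are made up for this card.

```python
def fit_centroids(examples):
    """Learn the average value for each label from (value, label) pairs."""
    totals = {}
    for value, label in examples:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def classify(value, centroids):
    """Classification: return the label whose average is closest to value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hand-labelled order sizes, then three months of sales.
centroids = fit_centroids([(1, "small"), (2, "small"), (10, "large"), (12, "large")])
print(classify(9, centroids))     # large  (a category)

slope, intercept = fit_line([1, 2, 3], [2, 4, 6])
print(slope * 4 + intercept)      # 8.0  (a number)
```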
What is a recommendation business problem?
AI is used for recommendations of what to do for a particular problem, such as viewer recommendations or product recommendations
List key questions to help define the business problem
- What results do you expect from the AI system: what do you want to happen, and how would you like the problem to be solved
- What processes are currently in use to solve the problem that you can leverage as part of your AI system
- What improvement over your current way of working are you expecting
- What are the key performance indicators you want to track to understand how well the AI system is performing later
- What resources do you have available to use to solve the business problem
How do you figure out the answers to your business problems?
- Conduct user interviews
- Conduct market research on AI systems
- Identify AI use cases
- Use the right data
Why is it important to conduct market research?
To understand what types of AI systems are available, how they are used and what type of AI system fits into your organization for the problem you are looking to solve
What should you consider when identifying AI use cases?
Focus on the organizational mission: what you do, what is important to you, what your main goals are, and where the gaps are in reaching those goals
During the planning stage, how do you identify the right data?
See what is accessible to you, look at existing areas where you obtain data but also new areas that may offer new data
What should you consider when scoping a project?
- Use cases – what your business needs are, then prioritize which problem you want to solve first
- Impact of AI system for the problem – will the impact be big, will it solve the problem, what effort and resources are required, how long is it going to take
- Fit of AI system – how well does the use of an AI system fit with the goals of your organization and the problems you want to solve
What are the first things you should do to set up a governance structure?
- Look at existing governance structures
- Identify an executive to be a champion
What key things should you clarify in your governance structure?
- Who is responsible for maintaining and implementing the AI governance structure
- Who writes AI policies and procedures
- Who is responsible for monitoring development and testing, and selecting particular AI systems
What should be included in your data strategy during the design stage of the AI development lifecycle?
- Data gathering
- Wrangling
- Cleansing
- Labelling
- Applying PETs
What questions should you ask yourself during data gathering?
- What data is required
- How is the data collected
- Where is the data stored
- Do you want pre-trained data
- Do you want to use internal or external data
- Does the quality of data fit your needs
- What is the format of the data (structured/unstructured, static/streaming)
What is data wrangling?
Preparing the data
Describe the data wrangling stage
- Takes up about 80% of the lifecycle (a commonly cited estimate)
- Taking raw data and converting it to valuable information
- Cleansing
- Labelling
- Applying PETs
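The cleansing and labelling steps above could look roughly like this in Python. This is a sketch under assumptions: the record fields (`text`, `rating`), the helper names and the rule "rating of 4 or more means positive" are all invented for illustration.

```python
def cleanse(rows):
    """Drop incomplete rows and normalize the remaining values."""
    clean = []
    for row in rows:
        text = (row.get("text") or "").strip()
        rating = row.get("rating")
        if not text or rating is None:
            continue  # cleansing: skip rows with missing values
        clean.append({"text": text.lower(), "rating": int(rating)})
    return clean


def label(rows, positive_from=4):
    """Attach a training label derived from the rating (an assumed rule)."""
    return [
        {**row, "label": "positive" if row["rating"] >= positive_from else "negative"}
        for row in rows
    ]


# Hypothetical raw data with typical problems: stray spaces, empty text,
# a missing rating, and numbers stored as strings.
raw = [
    {"text": "  Great product ", "rating": "5"},
    {"text": "", "rating": "1"},
    {"text": "Broke quickly", "rating": None},
    {"text": "Not worth it", "rating": "2"},
]
prepared = label(cleanse(raw))
for row in prepared:
    print(row)
```

Two of the four raw rows survive cleansing; the other two are dropped because a value is missing.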
List the 5 Vs used in data wrangling
- Volume
- Velocity – how often is it updated or changed
- Variety – structured, unstructured, etc.
- Veracity – how accurate and trustworthy is it, is the data from a trusted source
- Value – what is the outcome of the AI system and do you have the right data to do that
List some PETs
- Anonymization – removing identifiers from data
- Minimization – if data is not needed in the application, do not use it to train the model
- Differential privacy
- Federated learning
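The first two PETs in the list above can be sketched in a few lines of Python. The field names and the two sets below are hypothetical. Note that removing direct identifiers alone does not guarantee that a record cannot be re-identified from the fields that remain.

```python
# Hypothetical field lists; a real project would define these per dataset.
IDENTIFIER_FIELDS = {"name", "email"}          # direct identifiers to remove
FIELDS_MODEL_NEEDS = {"age", "basket_total"}   # features the model actually uses


def anonymize(record):
    """Anonymization: remove identifier fields from a record."""
    return {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}


def minimize(record):
    """Minimization: keep only the fields needed to train the model."""
    return {k: v for k, v in record.items() if k in FIELDS_MODEL_NEEDS}


record = {
    "name": "A. Person",
    "email": "a.person@example.com",
    "age": 41,
    "postcode": "AB1",
    "basket_total": 59.9,
}
training_row = minimize(anonymize(record))
print(training_row)   # {'age': 41, 'basket_total': 59.9}
```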
What is Differential privacy?
Blurs data by applying an algorithm that adds controlled random noise, so the data stays meaningful but non-specific: you are not able to identify individuals, but you can still use the data
Provide an example of differential privacy
Altering the age of individuals in the dataset by a random number between 1 and 5
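The age example above can be sketched in Python as follows. `perturb_ages` follows the card directly; the choice to shift ages up or down at random is an assumption. `laplace_count` is an addition that is not on the card: it shows the Laplace mechanism, a standard way in formal differential privacy to add noise calibrated to a privacy parameter (`epsilon`; smaller means more noise). All function names and sample values are made up for illustration.

```python
import random


def perturb_ages(ages, max_shift=5, seed=None):
    """Shift each age by a random whole number between 1 and max_shift."""
    rng = random.Random(seed)
    noisy = []
    for age in ages:
        shift = rng.randint(1, max_shift) * rng.choice((-1, 1))
        noisy.append(max(0, age + shift))  # never report a negative age
    return noisy


def laplace_count(true_count, epsilon, seed=None):
    """Return a count with Laplace noise of scale 1 / epsilon added.

    A count changes by at most 1 when one person is added or removed,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = 1.0 / epsilon
    # The difference of two exponential samples follows a Laplace distribution.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise


ages = [30, 45, 60]
print(perturb_ages(ages, seed=0))          # each age moved by 1 to 5 years
print(laplace_count(1200, epsilon=0.5))    # close to 1200, but not exact
```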
What is federated learning?
A new way to train models without sharing potentially sensitive data between different locations
- You have one central model in a central location (for example in the cloud)
- Each different local location downloads the central model and trains it on the data within their location
- The results of the training (the updated model, not the underlying data) are sent back up to the central location to be aggregated
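The three steps above can be sketched as a toy federated averaging loop in Python. This is a simplified illustration, not a production setup: the model is a single line y = w * x + b, the two "sites" are small hypothetical datasets, and the function names and learning rate are assumptions. Only model weights travel between the sites and the centre; the raw data never leaves its site.

```python
def local_update(weights, data, lr=0.01, steps=20):
    """Download the central model and train it on one site's own data."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error for y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def aggregate(updates, sizes):
    """Average the site models, weighted by how much data each site has."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def federated_round(central, sites):
    """One round: send the model out, train locally, aggregate the results."""
    updates = [local_update(central, data) for data in sites]
    return aggregate(updates, [len(data) for data in sites])


# Two hypothetical sites whose private data both follow y = 2 * x.
sites = [[(1, 2), (2, 4)], [(3, 6), (4, 8), (5, 10)]]
central = (0.0, 0.0)
for _ in range(200):
    central = federated_round(central, sites)
print(central)   # the slope moves towards 2
```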
What should you consider when choosing the AI system architecture and selecting a model?
- Choose the algorithm according to the desired level of accuracy and interpretability of the model
- Think about what you want to learn from your data and how it is going to help solve your problem
- What are the other requirements or constraints (for example, time constraints)