AIGP general Flashcards

1
Q

Training data is best defined as a subset of data that is used to?

A Enable a model to detect and learn patterns.
B Fine-tune a model to improve accuracy and prevent overfitting.
C Detect the initial sources of biases to mitigate prior to deployment.
D Resemble the structure and statistical properties of production data.

A

Correct Answer: A
Training data is used to enable a model to detect and learn patterns. During the training phase, the model learns from the labeled data, identifying patterns and relationships that it will later use to make predictions on new, unseen data. This process is fundamental in building an AI model’s capability to perform tasks accurately. Reference: AIGP Body of Knowledge on Model Training and Pattern Recognition.
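
As a concrete illustration (not part of the AIGP text), here is a minimal Python sketch of the idea: the model learns patterns from a labeled training split and is then evaluated on data it has not seen. The dataset and model choice are illustrative.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# The training split is the subset the model learns patterns from.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)            # patterns are learned here
print(model.score(X_test, y_test))     # then checked on unseen data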

2
Q

To maintain fairness in a deployed system, it is most important to?

A Protect against loss of personal data in the model.
B Monitor for data drift that may affect performance and accuracy.
C Detect anomalies outside established metrics that require new training data.
D Optimize computational resources and data to ensure efficiency and scalability.

A

Correct Answer: B
To maintain fairness in a deployed system, it is crucial to monitor for data drift that may affect performance and accuracy. Data drift occurs when the statistical properties of the input data change over time, which can lead to a decline in model performance. Continuous monitoring and updating of the model with new data ensure that it remains fair and accurate, adapting to any changes in the data distribution. Reference: AIGP Body of Knowledge on Post-Deployment Monitoring and Model Maintenance.
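
A minimal Python sketch of what such monitoring might look like, using a two-sample Kolmogorov-Smirnov test to compare a feature's training-time distribution against recent production values. The arrays and the 0.05 threshold are illustrative assumptions, not prescribed by the AIGP material.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5000)   # training-time feature values
live = rng.normal(loc=0.4, scale=1.2, size=1000)        # shifted production values

statistic, p_value = ks_2samp(reference, live)
if p_value < 0.05:   # a small p-value suggests the distributions differ
    print(f"Possible data drift (KS={statistic:.3f}, p={p_value:.4f})")
else:
    print("No significant drift detected")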

3
Q

When monitoring the functional performance of a model that has been deployed into production, all of the following are concerns EXCEPT?

A Feature drift.
B System cost.
C Model drift.
D Data loss.

A

Correct Answer: B
When monitoring the functional performance of a model deployed into production, concerns typically include feature drift, model drift, and data loss. Feature drift refers to changes in the input features that can affect the model’s predictions. Model drift is when the model’s performance degrades over time due to changes in the data or environment. Data loss can impact the accuracy and reliability of the model. However, system cost, while important for budgeting and financial planning, is not a direct concern when monitoring the functional performance of a deployed model. Reference: AIGP Body of Knowledge on Model Monitoring and Maintenance.

4
Q

After completing model testing and validation, which of the following is the most important step that an organization takes prior to deploying the model into production?

A Perform a readiness assessment.
B Define a model-validation methodology.
C Document maintenance teams and processes.
D Identify known edge cases to monitor post-deployment.

A

Correct Answer: A
After completing model testing and validation, the most important step prior to deploying the model into production is to perform a readiness assessment. This assessment ensures that the model is fully prepared for deployment, addressing any potential issues related to infrastructure, performance, security, and compliance. It verifies that the model meets all necessary criteria for a successful launch. Other steps, such as defining a model-validation methodology, documenting maintenance teams and processes, and identifying known edge cases, are also important but come secondary to confirming overall readiness. Reference: AIGP Body of Knowledge on Deployment Readiness.

5
Q

Which type of existing assessment could best be leveraged to create an AI impact assessment?

A A safety impact assessment.
B A privacy impact assessment.
C A security impact assessment.
D An environmental impact assessment.

A

Correct Answer: B
A privacy impact assessment (PIA) can be effectively leveraged to create an AI impact assessment. A PIA evaluates the potential privacy risks associated with the use of personal data and helps in implementing measures to mitigate those risks. Since AI systems often involve processing large amounts of personal data, the principles and methodologies of a PIA are highly applicable and can be extended to assess broader impacts, including ethical, social, and legal implications of AI. Reference: AIGP Body of Knowledge on Impact Assessments.

6
Q

Which of the following steps occurs in the design phase of the AI life cycle?

A Data augmentation.
B Model explainability.
C Risk impact estimation.
D Performance evaluation.

A

C. Risk impact estimation.

In the design phase, the focus is on planning and identifying potential risks and impacts of the AI system. Risk impact estimation involves assessing the potential consequences of deploying the model, including ethical, legal, and operational risks. The other steps typically occur in later stages of the AI life cycle:

A. Data augmentation happens during the data preparation phase.
B. Model explainability is often addressed during model development or validation.
D. Performance evaluation occurs after the model is trained, during testing and validation.

7
Q

During the planning and design phases of the AI development life cycle, bias can be reduced by all of the following EXCEPT?

A Stakeholder involvement.
B Feature selection.
C Human oversight.
D Data collection.

A

B. Feature selection.

While feature selection is an important step in AI model development, it typically occurs during the modeling phase, not the planning or design phases. Bias can be reduced during planning and design through A. Stakeholder involvement, C. Human oversight, and D. Data collection, which ensure that diverse perspectives and appropriate data are considered early on. Feature selection focuses more on refining the model’s inputs and is not directly related to bias reduction at the planning and design stages.

8
Q

Which of the following use cases would be best served by a non-AI solution?

A A non-profit wants to develop a social media presence.
B An e-commerce provider wants to make personalized recommendations.
C A business analyst wants to forecast future cost overruns and underruns.
D A customer service agency wants to automate answers to common questions.

A

A. A non-profit wants to develop a social media presence.

Building a social media presence typically involves content creation, scheduling posts, and engagement strategies, which can be handled effectively with standard tools and human effort rather than requiring AI. The other use cases—such as personalized recommendations, forecasting, and automating customer service—are more suited to AI-driven solutions that can leverage data and machine learning models.

9
Q

All of the following are elements of establishing a global AI governance infrastructure EXCEPT?

A Providing training to foster a culture that promotes ethical behavior.
B Creating policies and procedures to manage third-party risk.
C Understanding differences in norms across countries.
D Publicly disclosing ethical principles.

A

Answer : D

Establishing a global AI governance infrastructure involves several key elements, including providing training to foster a culture that promotes ethical behavior, creating policies and procedures to manage third-party risk, and understanding differences in norms across countries. While publicly disclosing ethical principles can enhance transparency and trust, it is not a core element necessary for the establishment of a governance infrastructure. The focus is more on internal processes and structures rather than public disclosure. Reference: AIGP Body of Knowledge on AI Governance and Infrastructure.

10
Q

Which of the following would be the least likely step for an organization to take when designing an integrated compliance strategy for responsible AI?

A Conducting an assessment of existing compliance programs to determine overlaps and integration points.
B Employing a new software platform to modernize existing compliance processes across the organization.
C Consulting experts to consider the ethical principles underpinning the use of AI within the organization.
D Launching a survey to understand the concerns and interests of potentially impacted stakeholders.

A

Answer : B

When designing an integrated compliance strategy for responsible AI, the least likely step would be employing a new software platform to modernize existing compliance processes. While modernizing compliance processes is beneficial, it is not as directly related to the strategic integration of ethical principles and stakeholder concerns. More critical steps include conducting assessments of existing compliance programs to identify overlaps and integration points, consulting experts on ethical principles, and launching surveys to understand stakeholder concerns. These steps ensure that the compliance strategy is comprehensive and aligned with responsible AI principles.

11
Q

A company initially intended to use a large data set containing personal information to train an AI model. After consideration, the company determined that it can derive enough value from the data set without any personal information and permanently obfuscated all personal data elements before training the model.

This is an example of applying which privacy-enhancing technique (PET)?

A Anonymization.
B Pseudonymization.
C Differential privacy.
D Federated learning.

A

Answer : A

Anonymization is a privacy-enhancing technique that involves removing or permanently altering personal data elements to prevent the identification of individuals. In this case, the company obfuscated all personal data elements before training the model, which aligns with the definition of anonymization. This ensures that the data cannot be traced back to individuals, thereby protecting their privacy while still allowing the company to derive value from the dataset.
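
A minimal pandas sketch of the idea, with hypothetical column names: direct identifiers are permanently dropped and the date of birth is generalized into a coarse age band, so no re-identification key is retained (which is what would distinguish this from pseudonymization).

import pandas as pd

df = pd.DataFrame({
    "name": ["Ana Silva", "Ben Lee"],
    "email": ["ana@example.com", "ben@example.com"],
    "dob": pd.to_datetime(["1985-03-02", "1992-11-17"]),
    "purchase_amount": [120.50, 89.99],
})

# Permanently remove direct identifiers (irreversible).
anonymized = df.drop(columns=["name", "email"])

# Generalize date of birth into a coarse age band, keeping analytic value.
age = (pd.Timestamp("2024-01-01") - anonymized.pop("dob")).dt.days // 365
anonymized["age_band"] = pd.cut(age, bins=[0, 30, 50, 120], labels=["<30", "30-50", "50+"])
print(anonymized)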

12
Q

The planning phase of the AI life cycle articulates all of the following EXCEPT the?

A Objective of the model.
B Approach to governance.
C Choice of the architecture.
D Context in which the model will operate.

A

Answer : B

The planning phase of the AI life cycle typically includes defining the objective of the model, choosing the appropriate architecture, and understanding the context in which the model will operate. However, the approach to governance is usually established as part of the overall AI governance framework, not specifically within the planning phase. Governance encompasses broader organizational policies and procedures that ensure AI development and deployment align with legal, ethical, and operational standards.

13
Q

What is the best reason for a company to adopt a policy that prohibits the use of generative AI?

A Avoid using technology that cannot be monetized.
B Avoid needing to identify and hire qualified resources.
C Avoid the time necessary to train employees on acceptable use.
D Avoid accidental disclosure of its confidential and proprietary information.

A

Correct Answer: D
The primary concern for a company adopting a policy prohibiting the use of generative AI is the risk of accidental disclosure of confidential and proprietary information. Generative AI tools can inadvertently leak sensitive data during the creation process or through data sharing. This risk outweighs the other reasons listed, as protecting sensitive information is critical to maintaining the company’s competitive edge and legal compliance. This rationale is discussed in the sections on risk management and data privacy in the IAPP AIGP Body of Knowledge.

14
Q

Which of the following is an example of a high-risk application under the EU AI Act?

A A resume scanning tool that ranks applicants.
B An AI-enabled inventory management tool.
C A government-run social scoring tool.
D A customer service chatbot tool.

A

Correct Answer: A
Under the EU AI Act, AI systems used in employment are classified as high-risk: Annex III expressly covers tools for the recruitment and selection of candidates, which includes a resume scanning tool that ranks applicants. A government-run social scoring tool is not merely high-risk; social scoring falls under the Act's prohibited, unacceptable-risk practices in Article 5. AI-enabled inventory management tools and customer service chatbots are generally limited- or minimal-risk applications under the Act.

15
Q

What is the best method to proactively train an LLM so that there is mathematical proof that no specific piece of training data has more than a negligible effect on the model or its output?

A Clustering.
B Transfer learning.
C Differential privacy.
D Data compartmentalization.

A

C. Differential privacy.

Explanation:
Differential privacy is the best method to ensure that no specific piece of training data has a significant effect on the model or its output. This technique involves adding noise to the data or the training process in a controlled manner, such that it becomes mathematically provable that the model’s output does not change significantly due to the inclusion or exclusion of any single data point.

Key reasons why differential privacy is suitable:

It provides mathematical guarantees that the contribution of individual data points is limited.
It helps ensure data privacy because the model cannot be used to infer whether any specific data point was present in the training set.
Here’s why the other options are less suitable:

A. Clustering: Clustering is a method for grouping similar data points together but does not inherently protect individual data points’ influence on the model or provide mathematical guarantees about privacy.

B. Transfer learning: Transfer learning involves using a pre-trained model and fine-tuning it on new data, but it does not focus on ensuring that individual data points have a minimal impact on the overall model output.

D. Data compartmentalization: This is a method for organizing and isolating data into segments but does not directly address controlling the influence of specific data points on the model.

Differential privacy is specifically designed for scenarios where it is important to ensure that the presence or absence of any single piece of data cannot be detected or inferred from the model, making it the best choice for this purpose.
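
To make the mechanism concrete, here is a minimal numpy sketch of one DP-SGD step (per-example gradient clipping plus calibrated Gaussian noise), the standard way differential privacy is applied during training. The batch, clip norm C, and noise multiplier sigma are illustrative; in practice sigma is calibrated to a target (epsilon, delta) privacy budget.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                                  # one mini-batch
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=32)
w = np.zeros(4)
C, sigma, lr = 1.0, 1.1, 0.1

# Per-example gradients of squared error: g_i = 2 * (x_i.w - y_i) * x_i
residual = X @ w - y
per_example_grads = 2 * residual[:, None] * X

# 1) Clip each example's gradient to L2 norm at most C, bounding any
#    single record's influence on the update.
norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
clipped = per_example_grads / np.maximum(1.0, norms / C)

# 2) Sum, add Gaussian noise scaled to the clip bound, then average.
noisy_sum = clipped.sum(axis=0) + rng.normal(scale=sigma * C, size=4)
w -= lr * noisy_sum / len(X)
print(w)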

16
Q

Machine learning is best described as a type of algorithm by which?
A. Systems can mimic human intelligence with the goal of replacing humans.
B. Systems can automatically improve from experience through predictive patterns.
C. Statistical inferences are drawn from a sample with the goal of predicting human intelligence.
D. Previously unknown properties are discovered in data and used to predict and make improvements in the data.

A

B. Systems can automatically improve from experience through predictive patterns.

Explanation:
Machine learning (ML) is a branch of artificial intelligence (AI) that focuses on building algorithms and systems that can learn from data and improve their performance over time without being explicitly programmed for each specific task. ML algorithms learn from past data (experience) to identify patterns and make predictions or decisions.

17
Q

You asked a generative AI tool to recommend new restaurants to explore in Boston, Massachusetts, that have a specialty Italian dish made in a traditional fashion without spinach and wine. The generative AI tool recommended five restaurants for you to visit.
After looking up the restaurants, you discovered one restaurant did not exist and two others did not have the dish.

This information provided by the generative AI tool is an example of what is commonly called?
A. Prompt injection.
B. Model collapse.
C. Hallucination.
D. Overfitting.

A

C. Hallucination.

Explanation:
In the context of AI, hallucination refers to instances where a generative AI model produces information that is false, inaccurate, or fabricated. This means the model might generate responses that seem plausible or detailed but are not grounded in reality.

In your case, the generative AI tool recommended a restaurant that does not exist and suggested dishes that were not actually available at the other restaurants. This is a classic example of hallucination, where the model produces responses based on patterns it has learned, even though those responses do not correspond to real-world facts.

Here’s why the other options are incorrect:

A. Prompt injection: This occurs when a user manipulates the prompt to alter or exploit the AI’s behavior. It’s not relevant here, as the issue is about the AI providing inaccurate information, not about how the prompt influenced it.

B. Model collapse: This refers to a situation where a model’s performance deteriorates over time, often due to training issues. It’s not related to the generation of incorrect information.

D. Overfitting: Overfitting happens when a model learns too closely from its training data, resulting in poor performance on new, unseen data. It is not related to the generation of false information like recommending non-existent restaurants.

18
Q

Each of the following actors is typically engaged in the AI development life cycle EXCEPT?
A. Data architects.
B. Government regulators.
C. Socio-cultural and technical experts.
D. Legal and privacy governance experts.

A

B. Government regulators.

Explanation:
In the context of the AI development life cycle, various stakeholders are typically involved, such as:

A. Data architects: They play a crucial role in designing the data infrastructure, preparing and structuring data, and ensuring it is suitable for training and testing AI models.

C. Socio-cultural and technical experts: These experts help ensure that the AI system is developed with consideration for its social and cultural impact and that it aligns with technical best practices and societal values.

D. Legal and privacy governance experts: These professionals ensure that the AI system complies with laws and regulations regarding data privacy, security, and ethical considerations throughout its development.

B. Government regulators, however, are generally not directly involved in the AI development process itself. Instead, they play a role in setting standards, creating regulations, and ensuring compliance after the AI system is deployed or during audits. They might interact with organizations to ensure adherence to laws, but they are not typically part of the internal development process.

19
Q

A company is working to develop a self-driving car that can independently decide the appropriate route to take after the driver provides an address.
If they want to make this self-driving car "strong" AI, as opposed to "weak," the engineers would also need to ensure?
A. That the AI has full human cognitive abilities that can independently decide where to take the driver.
B. That they have obtained appropriate intellectual property (IP) licenses to use data for training the AI.
C. That the AI has strong cybersecurity to prevent malicious actors from taking control of the car.
D. That the AI can differentiate among ethnic backgrounds of pedestrians.

A

A. That the AI has full human cognitive abilities that can independently decide where to take the driver.

Explanation:
The distinction between “strong” AI (also known as Artificial General Intelligence, or AGI) and “weak” AI (also known as narrow AI) lies in the scope of cognitive abilities.

Weak AI is designed to perform a specific task or set of tasks, such as driving a car or playing chess. It does not possess general understanding or reasoning beyond its designated functions.

Strong AI, or AGI, would have the ability to understand, learn, and reason across a wide range of topics, similar to a human. It would be capable of making decisions autonomously in a manner that reflects broad human-like understanding.

In the context of a self-driving car, making the car “strong” AI would require it to have the capability to independently decide where to take the driver even without a specific address, reflecting human-like judgment and understanding of complex situations.

20
Q

Which of the following is NOT a common type of machine learning?
A. Deep learning.
B. Cognitive learning.
C. Unsupervised learning.
D. Reinforcement learning

A

B. Cognitive learning.

Explanation:
Cognitive learning is not a standard term used to describe a type of machine learning. It generally refers to human learning processes, such as understanding, applying knowledge, and thinking. It is not specifically related to machine learning algorithms or methods.

The other options are common types of machine learning:

A. Deep learning: A subset of machine learning that uses neural networks with many layers (deep neural networks) to learn from large amounts of data. It is particularly effective in complex tasks like image and speech recognition.

C. Unsupervised learning: A type of machine learning where the model is trained on data without labeled outcomes. It is used to find patterns or groupings within the data, such as clustering and association.

D. Reinforcement learning: A type of machine learning where an agent learns to make decisions by taking actions in an environment to maximize cumulative rewards. It is commonly used in robotics, game playing, and autonomous systems.

21
Q

An EU bank intends to launch a multi-modal AI platform for customer engagement and automated decision-making to assist with the opening of bank accounts. The platform has been subject to thorough risk assessments and testing, where it has proven effective in not discriminating against any individual on the basis of a protected class.

What additional obligations must the bank fulfill prior to deployment?

A. The bank must obtain explicit consent from users under the Privacy Directive.
B. The bank must disclose how the AI system works under the EU Digital Services Act.
C. The bank must subject the AI system to an adequacy decision and publish its appropriate safeguards.
D. The bank must disclose the use of the AI system and implement suitable measures for users to contest.

A

D. The bank must disclose the use of the AI system and implement suitable measures for users to contest.

Explanation:
Under the EU AI Act and other relevant EU regulations, when deploying an AI system that is used in high-stakes contexts like customer engagement and automated decision-making for opening bank accounts, the bank has certain obligations:

Transparency: The bank is required to disclose to customers that an AI system is being used in the decision-making process. This ensures that users are aware that decisions affecting them are partially or wholly automated.

User Rights: The bank must also implement mechanisms for users to contest decisions made by the AI system. This means that if a customer disagrees with a decision made by the AI (e.g., rejection of a bank account application), they should have a way to seek a human review or appeal the decision.

22
Q

Random forest algorithms are what type of machine learning model?

A. Symbolic.
B. Generative.
C. Discriminative.
D. Natural language processing.

A

C. Discriminative.

Explanation:
Random forest algorithms fall under the category of discriminative models in machine learning. Discriminative models are designed to classify or predict a target outcome by learning the boundary between different classes based on the features in the data.

Here’s why the other options are not correct:

A. Symbolic: Symbolic AI involves rule-based systems where knowledge is encoded in symbols and rules. Random forests do not follow this approach; they are based on data-driven learning of decision trees.

B. Generative: Generative models focus on modeling the joint probability of the input features and the output labels, allowing them to generate new data instances. Random forests do not attempt to model the joint probability; instead, they learn to differentiate between classes based on input features.

D. Natural language processing: Natural Language Processing (NLP) is a field of AI focused on interactions between computers and human language. Random forest is a type of algorithm that can be applied to NLP tasks, but it is not a category of machine learning itself.
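
A minimal scikit-learn sketch of the point: a random forest is trained to separate classes, i.e., to model P(y | x), rather than to model or generate the data itself. The synthetic dataset is illustrative.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# The model outputs class predictions, the hallmark of a discriminative
# model; it cannot generate new feature vectors.
print("Test accuracy:", clf.score(X_test, y_test))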

23
Q

Under the NIST AI Risk Management Framework, all of the following are defined as characteristics of trustworthy AI EXCEPT?

A. Tested and Effective.
B. Secure and Resilient.
C. Explainable and Interpretable.
D. Accountable and Transparent.

A

A. Tested and Effective.

Explanation:
Under the NIST AI Risk Management Framework (NIST AI RMF), the focus is on ensuring that AI systems are developed and deployed in a way that makes them trustworthy. Trustworthiness is defined through several key characteristics, including:

B. Secure and Resilient: Ensuring that AI systems are protected against adversarial attacks, vulnerabilities, and can recover from unexpected events is a key aspect of trustworthiness.

C. Explainable and Interpretable: It is important for AI systems to provide outputs that can be understood and explained to human users, especially in high-stakes environments. This ensures that stakeholders understand how decisions are made.

D. Accountable and Transparent: Trustworthy AI systems require clear accountability structures and transparency around how decisions are made, ensuring that stakeholders can understand and hold the AI system accountable.

By contrast, "Tested and Effective" is not one of the framework's named characteristics; the closest named characteristic in the NIST AI RMF is "Valid and Reliable."

24
Q

A company is creating a mobile app to enable individuals to upload images and videos, and analyze this data using ML to provide lifestyle improvement recommendations. The signup form has the following data fields:
1. First name
2. Last name
3. Mobile number
4. Email ID
5. New password
6. Date of birth
7. Gender
In addition, the app obtains a device’s IP address and location information while in use.

What GDPR privacy principles does this violate?

A. Purpose Limitation and Data Minimization.
B. Accountability and Lawfulness.
C. Transparency and Accuracy.
D. Integrity and Confidentiality.

A

A. Purpose Limitation and Data Minimization.

Explanation:
The GDPR (General Data Protection Regulation) establishes several privacy principles that organizations must adhere to when processing personal data. Two of these principles are particularly relevant in this scenario:

Purpose Limitation: This principle requires that personal data be collected only for specified, explicit, and legitimate purposes and not further processed in a way that is incompatible with those purposes. The company must clearly define why each piece of personal data is being collected (e.g., why the app needs date of birth and gender for lifestyle recommendations).

Data Minimization: This principle mandates that the data collected should be adequate, relevant, and limited to what is necessary in relation to the purposes for which it is processed. If the app collects data that is not strictly needed for providing lifestyle recommendations or delivering core functionalities, it may violate this principle. For example, if the app can function without collecting a mobile number or precise location information, then collecting this data might be considered excessive.

25
Q

What is the primary purpose of an AI impact assessment?

A. To define and evaluate the legal risks associated with developing an AI system.
B. To anticipate and manage the potential risks and harms of an AI system.
C. To define and document the roles and responsibilities of AI stakeholders.
D. To identify and measure the benefits of an AI system.

A

B. To anticipate and manage the potential risks and harms of an AI system.

Explanation:
The primary purpose of an AI impact assessment is to identify, evaluate, and manage the potential risks and harms associated with the deployment and use of an AI system. This process helps ensure that the AI system is developed and used in a way that minimizes negative consequences and aligns with ethical and legal standards.

Key aspects of an AI impact assessment include:

Identifying potential risks: Understanding how the AI system could cause harm to individuals, groups, or society.
Managing risks: Developing strategies to mitigate those risks and ensure that the AI system is safe, fair, and aligned with the organization’s values.
Considering broader impacts: Taking into account the social, ethical, and environmental implications of deploying the AI system.

26
Q

All of the following are penalties and enforcement measures outlined in the EU AI Act EXCEPT?

A. Fines for SMEs and startups will be proportionally capped.
B. Rules on General Purpose AI will apply after 6 months as a specific provision.
C. The AI Pact will act as a transitional bridge until the Regulations are fully enacted.
D. Fines for violations of banned AI applications will be €35 million or 7% of global annual turnover (whichever is higher).

A

C. The AI Pact will act as a transitional bridge until the Regulations are fully enacted.

The EU AI Act outlines specific penalties and enforcement mechanisms to ensure compliance with its regulations. Among these, fines for violations of banned AI applications can be as high as €35 million or 7% of the global annual turnover of the offending organization, whichever is higher. Proportional caps on fines are applied to SMEs and startups to ensure fairness. General Purpose AI rules are to apply after a 6-month period as a specific provision to ensure that stakeholders have adequate time to comply. The AI Pact, however, is a voluntary European Commission initiative encouraging early compliance; it is not a penalty or enforcement mechanism established by the Act itself, making option C the correct answer.

27
Q

Which of the following most encourages accountability over AI systems?

A. Determining the business objective and success criteria for the AI project.
B. Performing due diligence on third-party AI training and testing data.
C. Defining the roles and responsibilities of AI stakeholders.
D. Understanding AI legal and regulatory requirements.

A

C. Defining the roles and responsibilities of AI stakeholders.

Explanation:
Accountability in the context of AI systems means ensuring that individuals or groups are responsible for the various aspects of the AI system, including its design, deployment, and impact. Defining roles and responsibilities of stakeholders is crucial for creating clear lines of accountability. This ensures that everyone involved knows their duties and obligations regarding the AI system’s performance, monitoring, and ethical considerations. It also helps ensure that if something goes wrong, there are clear points of contact for taking corrective action.

28
Q

All of the following are common optimization techniques in deep learning to determine weights that represent the strength of the connection between artificial neurons EXCEPT?

A. Gradient descent, which initially sets weights to arbitrary values and then changes them at each step.
B. Momentum, which improves the convergence speed and stability of neural network training.
C. Autoregression, which analyzes and makes predictions about time-series data.
D. Backpropagation, which starts from the last layer working backwards.

A

C. Autoregression, which analyzes and makes predictions about time-series data.

Explanation:
Autoregression is not a common optimization technique used to determine weights in deep learning models. It is a statistical method often used for time-series analysis, where future values are predicted based on past values. While autoregression can be used in time-series forecasting, it is not a technique used for optimizing weights in neural networks.
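
For a concrete feel for options A, B, and D, here is a minimal Python sketch of gradient descent with momentum on a toy loss f(w) = (w - 3)^2; the learning rate and momentum coefficient are illustrative. (Backpropagation would compute the gradient automatically through a multi-layer network; here the gradient is analytic.)

w = 10.0        # weight initialized to an arbitrary value
velocity = 0.0
lr, beta = 0.1, 0.9   # step size and momentum coefficient

for step in range(200):
    grad = 2 * (w - 3.0)                     # analytic gradient of the loss
    velocity = beta * velocity - lr * grad   # momentum accumulates past gradients
    w += velocity                            # faster along consistent directions

print(w)   # converges toward the minimum at w = 3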

29
Q

What is the technique to remove the effects of improperly used data from an ML system?

A. Data cleansing.
B. Model inversion.
C. Data de-duplication.
D. Model disgorgement.

A

D. Model disgorgement.

Explanation:
Model disgorgement refers to the process of removing or undoing the influence of improperly used or compromised data on a trained machine learning model. This might be necessary when a model has been trained using data that was obtained or used inappropriately (e.g., data that violates privacy laws or was collected without proper consent).

In such cases, the organization might be required to retrain the model from scratch without the tainted data or modify the existing model to remove the impact of the improperly used data. Model disgorgement is a technique often discussed in regulatory and compliance contexts, especially when data privacy violations have occurred.

Here’s why the other options are not suitable:

A. Data cleansing: This involves correcting or removing errors or inconsistencies from a dataset before it is used for training a model. It does not address the issue of a model that has already been influenced by improperly used data.

B. Model inversion: This is a technique that attempts to reconstruct input data from a trained model, often used as a privacy attack technique. It is not related to removing the effects of bad data from a model.

C. Data de-duplication: This is the process of removing duplicate records from a dataset. It helps in improving data quality but does not address the issue of data that was improperly used in training.

30
Q

Pursuant to the White House Executive Order of November 2023, who is responsible for creating guidelines to conduct red-teaming tests of AI systems?

A. National Institute of Standards and Technology (NIST).
B. National Science and Technology Council (NSTC).
C. Office of Science and Technology Policy (OSTP).
D. Department of Homeland Security (DHS).

A

A. National Institute of Standards and Technology (NIST).

Explanation:
According to the White House Executive Order on AI issued in November 2023, the National Institute of Standards and Technology (NIST) is tasked with developing guidelines for conducting red-teaming tests of AI systems. These guidelines are intended to provide a framework for testing and evaluating the robustness, security, and trustworthiness of AI systems, particularly to identify vulnerabilities and risks associated with their deployment.

Red-teaming involves subjecting AI models to rigorous testing, often simulating adversarial conditions, to assess their performance under various challenging scenarios. NIST’s role is to ensure that these guidelines are comprehensive and aligned with standards that promote the safe and responsible use of AI.

The other options are less relevant for this specific responsibility:

B. National Science and Technology Council (NSTC): This body coordinates science and technology policy across federal agencies but is not specifically tasked with creating guidelines for red-teaming.

C. Office of Science and Technology Policy (OSTP): The OSTP plays a role in setting overall policy direction and priorities for AI but does not directly create testing guidelines like those developed by NIST.

D. Department of Homeland Security (DHS): The DHS is involved in matters of national security and could be concerned with the implications of AI in that context but is not responsible for creating technical testing guidelines for AI systems.

31
Q

What is the term for an algorithm that focuses on making the best choice to achieve an immediate objective at a particular step or decision point, based on the available information and without regard for the longer-term best solution?

A. Single-lane.
B. Optimized.
C. Efficient.
D. Greedy.

A

D. Greedy.

Explanation:
A greedy algorithm is one that makes the best immediate choice at each step or decision point, aiming to optimize the solution for that specific moment. It focuses on achieving the local optimum at each step without considering the broader, long-term consequences or whether that immediate decision leads to the overall best solution.

Greedy algorithms are often used in optimization problems where the goal is to find a solution quickly, though they may not always guarantee the global optimal solution for the entire problem.
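
A minimal Python sketch of greedy behavior: making change by always taking the largest coin that fits. Each step is locally optimal, and the second example shows how that can miss the globally best solution. The coin systems are illustrative.

def greedy_change(amount, coins=(25, 10, 5, 1)):
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:        # best immediate choice: the biggest coin
            result.append(coin)
            amount -= coin
    return result

print(greedy_change(68))              # [25, 25, 10, 5, 1, 1, 1]
# With coins (4, 3, 1), greedy returns [4, 1, 1] for 6, but [3, 3] is optimal.
print(greedy_change(6, coins=(4, 3, 1)))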

32
Q

All of the following are reasons to deploy a challenger AI model in addition to a champion AI model EXCEPT to?

A. Provide a framework to consider alternatives to the champion model.
B. Automate real-time monitoring of the champion model.
C. Perform testing on the champion model.
D. Retrain the champion model.

A

D. Retrain the champion model.

Explanation:
Deploying a challenger model alongside a champion model is typically done for the following reasons:

A. Provide a framework to consider alternatives to the champion model: This is a core purpose of having a challenger model. It allows for testing new models against the champion to see if they offer improved performance.

B. Automate real-time monitoring of the champion model: While not the main reason for deploying a challenger model, the comparison between the challenger and champion can inform monitoring decisions. However, direct real-time monitoring would be done through other tools, not solely through the presence of a challenger model.

C. Perform testing on the champion model: A challenger model helps test and validate the performance of the champion model by providing a benchmark for comparison. This ensures that the champion model remains effective over time.

D. Retrain the champion model is not a reason to deploy a challenger model. Retraining is a process where the champion model is updated or improved based on new data, not something that requires the presence of a challenger model. The purpose of a challenger model is to provide an alternative for evaluation, not to directly trigger or facilitate the retraining of the existing champion model.

33
Q

You are part of your organization’s ML engineering team and notice that the accuracy of a model that was recently deployed into production is deteriorating.

What is the best first step to address this?

A. Replace the model with a previous version.
B. Conduct champion/challenger testing.
C. Perform an audit of the model.
D. Run red-teaming exercises.

A

B. Conduct champion/challenger testing.

When the accuracy of a model deteriorates, the best first step is to conduct champion/challenger testing. This involves deploying a new model (challenger) alongside the current model (champion) to compare their performance. This method helps identify if the new model can perform better under current conditions without immediately discarding the existing model. It provides a controlled environment to test improvements and understand the reasons behind the deterioration. This approach is preferable to directly replacing the model, performing audits, or running red-teaming exercises, which may be subsequent steps based on the findings from the champion/challenger testing.
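
A minimal sketch of what such a comparison could look like in Python; the model objects, data names, and the 0.02 promotion margin are hypothetical placeholders, not a prescribed procedure.

from sklearn.metrics import accuracy_score

def compare(champion, challenger, X_recent, y_recent):
    champ_acc = accuracy_score(y_recent, champion.predict(X_recent))
    chall_acc = accuracy_score(y_recent, challenger.predict(X_recent))
    # Promote the challenger only if it clearly beats the champion on
    # current production data.
    if chall_acc > champ_acc + 0.02:
        return "promote challenger", champ_acc, chall_acc
    return "keep champion", champ_acc, chall_acc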

34
Q

What is the main purpose of accountability structures under the Govern function of the NIST AI Risk Management Framework?

A. To empower and train appropriate cross-functional teams.
B. To establish diverse, equitable and inclusive processes.
C. To determine responsibility for allocating budgetary resources.
D. To enable and encourage participation by external stakeholders.

A

A. To empower and train appropriate cross-functional teams.

Explanation:
Under the Govern function of the NIST AI Risk Management Framework (AI RMF), accountability structures are designed to ensure that there are clear roles, responsibilities, and processes in place for managing AI risks. This includes empowering and training cross-functional teams who are responsible for overseeing the AI system’s lifecycle, from development to deployment and monitoring.

Key aspects of accountability structures include:

Empowering teams: Ensuring that the teams responsible for AI systems have the authority and resources they need to carry out their responsibilities effectively.
Training: Providing the necessary training to cross-functional teams, including data scientists, legal experts, and compliance officers, so they understand the risks and responsibilities associated with the AI system.
Clarity of roles: Defining who is responsible for different aspects of the AI system, such as data management, ethics, and compliance.

35
Q

All of the following are included within the scope of post-deployment AI maintenance EXCEPT?

A. Ensuring that all model components are subject to a control framework.
B. Dedicating experts to continually monitor the model output.
C. Evaluating the need for an audit under certain standards.
D. Defining thresholds to conduct new impact assessments.

A

D. Defining thresholds to conduct new impact assessments.

Post-deployment AI maintenance typically includes ensuring that all model components are subject to a control framework, dedicating experts to continually monitor the model output, and evaluating the need for audits under certain standards. However, defining thresholds to conduct new impact assessments is usually part of the initial deployment and ongoing governance processes rather than a maintenance activity. Maintenance focuses more on the operational aspects of the AI system rather than setting new thresholds for impact assessments.

(Note: ChatGPT initially chose option A.)

36
Q

In the machine learning context, feature engineering is the process of?

A. Converting raw data into clean data.
B. Creating a learning schema for a model to apply.
C. Developing guidelines to train and test a model.
D. Extracting attributes and variables from raw data.

A

D. Extracting attributes and variables from raw data.

Explanation:
D. Extracting attributes and variables from raw data accurately describes feature engineering. In the context of machine learning, feature engineering involves transforming raw data into meaningful features (attributes or variables) that can be used by machine learning models to improve their accuracy and performance. This process often includes selecting, creating, or modifying variables that best capture the patterns in the data.
Why Not the Other Options?
A. Converting raw data into clean data refers more to data cleaning or preprocessing, which involves handling missing values, removing duplicates, and correcting inconsistencies in the data.

B. Creating a learning schema for a model to apply does not specifically define feature engineering. This would be more related to defining the structure or configuration of a machine learning model rather than the extraction of features.

C. Developing guidelines to train and test a model is more about model training protocols and best practices, not about creating or selecting features from data.
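
A minimal pandas sketch of feature engineering in this sense: new attributes (a signup hour, a weekend flag, an average order value) are extracted from hypothetical raw columns.

import pandas as pd

raw = pd.DataFrame({
    "signup_ts": pd.to_datetime(["2024-01-05 09:30", "2024-02-14 22:10"]),
    "total_spend": [150.0, 40.0],
    "n_orders": [5, 1],
})

features = pd.DataFrame({
    "signup_hour": raw["signup_ts"].dt.hour,                  # temporal feature
    "signup_weekend": raw["signup_ts"].dt.dayofweek >= 5,     # Monday=0
    "avg_order_value": raw["total_spend"] / raw["n_orders"],  # ratio feature
})
print(features)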