AIGP general Flashcards
Training data is best defined as a subset of data that is used to?
A Enable a model to detect and learn patterns.
B Fine-tune a model to improve accuracy and prevent overfitting.
C Detect the initial sources of biases to mitigate prior to deployment.
D Resemble the structure and statistical properties of production data.
Correct Answer: A
Training data is used to enable a model to detect and learn patterns. During the training phase, the model learns from the labeled data, identifying patterns and relationships that it will later use to make predictions on new, unseen data. This process is fundamental in building an AI model’s capability to perform tasks accurately. Reference: AIGP Body of Knowledge on Model Training and Pattern Recognition.
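To make the idea concrete, here is a minimal sketch (not from the AIGP Body of Knowledge) of a training subset teaching a model patterns that it then applies to held-out, unseen data; the scikit-learn workflow and synthetic data are illustrative assumptions.

```python
# Minimal sketch: the training subset is what the model learns patterns from;
# the held-out subset stands in for new, unseen data. Data are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))            # hypothetical feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # hypothetical labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression().fit(X_train, y_train)   # learns patterns from training data
print("accuracy on unseen data:", model.score(X_test, y_test))
```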
To maintain fairness in a deployed system, it is most important to?
A Protect against loss of personal data in the model.
B Monitor for data drift that may affect performance and accuracy.
C Detect anomalies outside established metrics that require new training data.
D Optimize computational resources and data to ensure efficiency and scalability.
Correct Answer: B
To maintain fairness in a deployed system, it is crucial to monitor for data drift that may affect performance and accuracy. Data drift occurs when the statistical properties of the input data change over time, which can lead to a decline in model performance. Continuous monitoring and updating of the model with new data ensure that it remains fair and accurate, adapting to any changes in the data distribution. Reference: AIGP Body of Knowledge on Post-Deployment Monitoring and Model Maintenance.
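As an illustration of what such monitoring can look like in practice, here is a minimal drift-check sketch; the two-sample Kolmogorov-Smirnov test, the 0.05 threshold, and the synthetic feature values are assumptions for the example, not requirements from the AIGP Body of Knowledge.

```python
# Minimal data-drift sketch: compare a production feature's distribution
# against the training-time baseline with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
baseline = rng.normal(loc=0.0, scale=1.0, size=5000)    # feature values at training time
production = rng.normal(loc=0.4, scale=1.0, size=5000)  # same feature observed in production

stat, p_value = ks_2samp(baseline, production)
if p_value < 0.05:                                      # illustrative significance threshold
    print(f"Possible data drift: KS statistic={stat:.3f}, p={p_value:.4f}")
else:
    print("No significant drift detected for this feature")
```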
When monitoring the functional performance of a model that has been deployed into production, all of the following are concerns EXCEPT?
A Feature drift.
B System cost.
C Model drift.
D Data loss.
Correct Answer: B
When monitoring the functional performance of a model deployed into production, concerns typically include feature drift, model drift, and data loss. Feature drift refers to changes in the input features that can affect the model’s predictions. Model drift is when the model’s performance degrades over time due to changes in the data or environment. Data loss can impact the accuracy and reliability of the model. However, system cost, while important for budgeting and financial planning, is not a direct concern when monitoring the functional performance of a deployed model. Reference: AIGP Body of Knowledge on Model Monitoring and Maintenance.
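To complement the data-drift check above, here is a sketch of one way to watch for model drift by tracking rolling accuracy against the accuracy recorded at validation time; the window size and the five-point tolerance are illustrative assumptions.

```python
# Minimal model-drift sketch: flag degradation when rolling accuracy on
# labeled production outcomes falls well below the validation baseline.
from collections import deque

BASELINE_ACCURACY = 0.92      # accuracy recorded during validation (assumed)
TOLERANCE = 0.05              # alert if accuracy drops more than 5 points (assumed)
recent = deque(maxlen=500)    # rolling window of correct/incorrect outcomes

def record_outcome(prediction, label):
    recent.append(int(prediction == label))
    if len(recent) == recent.maxlen:
        rolling = sum(recent) / len(recent)
        if rolling < BASELINE_ACCURACY - TOLERANCE:
            print(f"Model drift alert: rolling accuracy {rolling:.2f} "
                  f"vs baseline {BASELINE_ACCURACY:.2f}")
```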
After completing model testing and validation, which of the following is the most important step that an organization takes prior to deploying the model into production?
A Perform a readiness assessment.
B Define a model-validation methodology.
C Document maintenance teams and processes.
D Identify known edge cases to monitor post-deployment.
Correct Answer: A
After completing model testing and validation, the most important step prior to deploying the model into production is to perform a readiness assessment. This assessment ensures that the model is fully prepared for deployment, addressing any potential issues related to infrastructure, performance, security, and compliance. It verifies that the model meets all necessary criteria for a successful launch. Other steps, such as defining a model-validation methodology, documenting maintenance teams and processes, and identifying known edge cases, are also important but come secondary to confirming overall readiness. Reference: AIGP Body of Knowledge on Deployment Readiness.
Which type of existing assessment could best be leveraged to create an AI impact assessment?
A A safety impact assessment.
B A privacy impact assessment.
C A security impact assessment.
D An environmental impact assessment.
Correct Answer: B
A privacy impact assessment (PIA) can be effectively leveraged to create an AI impact assessment. A PIA evaluates the potential privacy risks associated with the use of personal data and helps in implementing measures to mitigate those risks. Since AI systems often involve processing large amounts of personal data, the principles and methodologies of a PIA are highly applicable and can be extended to assess broader impacts, including ethical, social, and legal implications of AI. Reference: AIGP Body of Knowledge on Impact Assessments.
Which of the following steps occurs in the design phase of the AI life cycle?
A Data augmentation.
B Model explainability.
C Risk impact estimation.
D Performance evaluation.
Correct Answer: C
In the design phase, the focus is on planning and identifying potential risks and impacts of the AI system. Risk impact estimation involves assessing the potential consequences of deploying the model, including ethical, legal, and operational risks. The other steps typically occur in later stages of the AI life cycle:
A. Data augmentation happens during the data preparation phase.
B. Model explainability is often addressed during model development or validation.
D. Performance evaluation occurs after the model is trained, during testing and validation.
During the planning and design phases of the AI development life cycle, bias can be reduced by all of the following EXCEPT?
A Stakeholder involvement.
B Feature selection.
C Human oversight.
D Data collection.
Correct Answer: B
While feature selection is an important step in AI model development, it typically occurs during the modeling phase, not the planning or design phases. Bias can be reduced during planning and design through A. Stakeholder involvement, C. Human oversight, and D. Data collection, which ensure that diverse perspectives and appropriate data are considered early on. Feature selection focuses more on refining the model’s inputs and is not directly related to bias reduction at the planning and design stages.
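For context on what feature selection itself looks like as a modeling-phase activity, here is a small sketch using scikit-learn's SelectKBest on synthetic data; the choice of k=2 and the scoring function are arbitrary for illustration.

```python
# Illustrative modeling-phase step: keep only the most informative features.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))               # six candidate input features
y = (X[:, 0] - X[:, 3] > 0).astype(int)     # labels driven by features 0 and 3

selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
print("features kept for modeling:", selector.get_support(indices=True))
```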
Which of the following use cases would be best served by a non-AI solution?
A A non-profit wants to develop a social media presence.
B An e-commerce provider wants to make personalized recommendations.
C A business analyst wants to forecast future cost overruns and underruns.
D A customer service agency wants to automate answers to common questions.
Correct Answer: A
Building a social media presence typically involves content creation, scheduling posts, and engagement strategies, which can be handled effectively with standard tools and human effort rather than requiring AI. The other use cases—such as personalized recommendations, forecasting, and automating customer service—are more suited to AI-driven solutions that can leverage data and machine learning models.
All of the following are elements of establishing a global AI governance infrastructure EXCEPT?
A Providing training to foster a culture that promotes ethical behavior.
B Creating policies and procedures to manage third-party risk.
C Understanding differences in norms across countries.
D Publicly disclosing ethical principles.
Correct Answer: D
Establishing a global AI governance infrastructure involves several key elements, including providing training to foster a culture that promotes ethical behavior, creating policies and procedures to manage third-party risk, and understanding differences in norms across countries. While publicly disclosing ethical principles can enhance transparency and trust, it is not a core element necessary for the establishment of a governance infrastructure. The focus is more on internal processes and structures rather than public disclosure. Reference: AIGP Body of Knowledge on AI Governance and Infrastructure.
Which of the following would be the least likely step for an organization to take when designing an integrated compliance strategy for responsible AI?
A Conducting an assessment of existing compliance programs to determine overlaps and integration points.
B Employing a new software platform to modernize existing compliance processes across the organization.
C Consulting experts to consider the ethical principles underpinning the use of AI within the organization.
D Launching a survey to understand the concerns and interests of potentially impacted stakeholders.
Correct Answer: B
When designing an integrated compliance strategy for responsible AI, the least likely step would be employing a new software platform to modernize existing compliance processes. While modernizing compliance processes is beneficial, it is not as directly related to the strategic integration of ethical principles and stakeholder concerns. More critical steps include conducting assessments of existing compliance programs to identify overlaps and integration points, consulting experts on ethical principles, and launching surveys to understand stakeholder concerns. These steps ensure that the compliance strategy is comprehensive and aligned with responsible AI principles. Reference: AIGP Body of Knowledge.
A company initially intended to use a large data set containing personal information to train an AI model. After consideration, the company determined that it can derive enough value from the data set without any personal information and permanently obfuscated all personal data elements before training the model.
This is an example of applying which privacy-enhancing technique (PET)?
A Anonymization.
B Pseudonymization.
C Differential privacy.
D Federated learning.
Correct Answer: A
Anonymization is a privacy-enhancing technique that involves removing or permanently altering personal data elements to prevent the identification of individuals. In this case, the company obfuscated all personal data elements before training the model, which aligns with the definition of anonymization. This ensures that the data cannot be traced back to individuals, thereby protecting their privacy while still allowing the company to derive value from the dataset.
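A minimal sketch of the obfuscation step described above, assuming tabular records in pandas with hypothetical column names; note that robust anonymization also has to address indirect identifiers and re-identification risk, which this sketch does not cover.

```python
# Minimal sketch: permanently drop direct identifiers before training.
# Column names are hypothetical; real anonymization must also consider
# indirect identifiers and re-identification risk.
import pandas as pd

records = pd.DataFrame({
    "name": ["Ana", "Ben"],            # direct identifier
    "email": ["a@x.com", "b@y.com"],   # direct identifier
    "age_band": ["30-39", "40-49"],    # generalized attribute kept for training
    "purchase_total": [120.50, 87.00],
})

training_data = records.drop(columns=["name", "email"])
print(training_data)
```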
The planning phase of the AI life cycle articulates all of the following EXCEPT the?
A Objective of the model.
B Approach to governance.
C Choice of the architecture.
D Context in which the model will operate.
Correct Answer: B
The planning phase of the AI life cycle typically includes defining the objective of the model, choosing the appropriate architecture, and understanding the context in which the model will operate. However, the approach to governance is usually established as part of the overall AI governance framework, not specifically within the planning phase. Governance encompasses broader organizational policies and procedures that ensure AI development and deployment align with legal, ethical, and operational standards.
What is the best reason for a company to adopt a policy that prohibits the use of generative AI?
A Avoid using technology that cannot be monetized.
B Avoid needing to identify and hire qualified resources.
C Avoid the time necessary to train employees on acceptable use.
D Avoid accidental disclosure of its confidential and proprietary information.
Correct Answer: D
The primary concern for a company adopting a policy prohibiting the use of generative AI is the risk of accidental disclosure of confidential and proprietary information. Generative AI tools can inadvertently leak sensitive data during the creation process or through data sharing. This risk outweighs the other reasons listed, as protecting sensitive information is critical to maintaining the company’s competitive edge and legal compliance. This rationale is discussed in the sections on risk management and data privacy in the IAPP AIGP Body of Knowledge.
Which of the following is an example of a high-risk application under the EU AI Act?
A A resume scanning tool that ranks applicants.
B An AI-enabled inventory management tool.
C A government-run social scoring tool.
D A customer service chatbot tool.
Correct Answer: A
The EU AI Act categorizes certain applications of AI as high-risk due to their potential impact on fundamental rights and safety. Annex III lists employment and worker management among the high-risk areas, so a resume scanning tool that ranks applicants is a textbook high-risk application. A government-run social scoring tool is not merely high-risk; it falls under the Act's prohibited, unacceptable-risk practices. The other applications, such as an AI-enabled inventory management tool or a customer service chatbot, are generally considered limited or minimal risk under the EU AI Act.
What is the best method to proactively train an LLM so that there is mathematical proof that no specific piece of training data has more than a negligible effect on the model or its output?
A Clustering.
B Transfer learning.
C Differential privacy.
D Data compartmentalization.
Correct Answer: C
Explanation:
Differential privacy is the best method to ensure that no specific piece of training data has a significant effect on the model or its output. This technique involves adding noise to the data or the training process in a controlled manner, such that it becomes mathematically provable that the model’s output does not change significantly due to the inclusion or exclusion of any single data point.
Key reasons why differential privacy is suitable:
It provides mathematical guarantees that the contribution of individual data points is limited.
It helps ensure data privacy because the model cannot be used to infer whether any specific data point was present in the training set.
Here’s why the other options are less suitable:
A. Clustering: Clustering is a method for grouping similar data points together but does not inherently protect individual data points’ influence on the model or provide mathematical guarantees about privacy.
B. Transfer learning: Transfer learning involves using a pre-trained model and fine-tuning it on new data, but it does not focus on ensuring that individual data points have a minimal impact on the overall model output.
D. Data compartmentalization: This is a method for organizing and isolating data into segments but does not directly address controlling the influence of specific data points on the model.
Differential privacy is specifically designed for scenarios where it is important to ensure that the presence or absence of any single piece of data cannot be detected or inferred from the model, making it the best choice for this purpose.
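As a rough sketch of the core mechanism behind differentially private training (per-example gradient clipping plus calibrated noise, in the spirit of DP-SGD), here is a plain-NumPy illustration; the clipping norm, noise multiplier, and synthetic gradients are assumptions, and a real deployment would use a vetted DP library with proper privacy accounting.

```python
# Rough DP-SGD-style update: clip each example's gradient to a fixed norm,
# add Gaussian noise calibrated to that norm, then average. Values assumed.
import numpy as np

rng = np.random.default_rng(3)
per_example_grads = rng.normal(size=(32, 10))   # 32 examples, 10 parameters
clip_norm = 1.0
noise_multiplier = 1.1

# 1. Clip each example's gradient so no single data point dominates the update.
norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
clipped = per_example_grads * np.minimum(1.0, clip_norm / norms)

# 2. Add noise scaled to the clipping norm, then average over the batch.
noise = rng.normal(scale=noise_multiplier * clip_norm, size=clipped.shape[1])
private_grad = (clipped.sum(axis=0) + noise) / len(clipped)
print(private_grad)
```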