AIGP Study Cases Flashcards

1
Q

Use the following to answer the next question:

ABC Corp. is a leading insurance provider offering a range of coverage options to individuals. ABC has decided to utilize artificial intelligence to streamline and improve its customer acquisition and underwriting process, including the accuracy and efficiency of pricing policies.

ABC has engaged a cloud provider to utilize and fine-tune its pre-trained, general purpose large language model (“LLM”). In particular, ABC intends to use its historical customer data (including applications, policies, and claims) and proprietary pricing and risk strategies to provide an initial qualification assessment of potential customers, which would then be routed to a human underwriter for final review.

ABC and the cloud provider have completed training and testing the LLM, performed a readiness assessment, and made the decision to deploy the LLM into production. ABC has designated an internal compliance team to monitor the model during the first month, specifically to evaluate the accuracy, fairness, and reliability of its output. After the first month in production, ABC realizes that the LLM declines a higher percentage of women’s loan applications due primarily to women historically receiving lower salaries than men.

What is the best strategy to mitigate the bias uncovered in the loan applications?

A Retrain the model with data that reflects demographic parity.
B Procure a third-party statistical bias assessment tool.
C Document all instances of bias in the data set.
D Delete all gender-based data in the data set.

A

A. Retrain the model with data that reflects demographic parity.

Explanation:
The issue described is a bias in the LLM’s output: a higher percentage of loan applications from women are being declined because the historical training data encodes salary disparities between women and men.
Retraining the model with data that reflects demographic parity is the best strategy because it addresses the root cause, making the training data representative and balanced across demographic groups. The other options do not: a bias assessment tool (B) and documentation (C) only measure or record the problem, and deleting gender data (D) leaves salary in place as a proxy for gender.
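
As a rough illustration of what "demographic parity" measures, the sketch below compares approval rates across groups. The function names and the sample decisions are hypothetical and are not taken from the scenario; a real assessment would use the model's actual decisions.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}

def demographic_parity_gap(decisions):
    """Largest difference in approval rate between any two groups."""
    rates = selection_rates(decisions)
    return max(rates.values()) - min(rates.values())

# Hypothetical model decisions: (group, approved)
sample = [
    ("women", True), ("women", False), ("women", False), ("women", False),
    ("men", True), ("men", True), ("men", True), ("men", False),
]

print(selection_rates(sample))
print(demographic_parity_gap(sample))
```

A gap near zero indicates similar approval rates across groups; a large gap, as in this made-up sample, is the kind of signal the compliance team in the scenario would have observed.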

2
Q

A local police department in the United States procured an AI system to monitor and analyze social media feeds, online marketplaces and other sources of public information to detect evidence of illegal activities (e.g., sale of drugs or stolen goods). The AI system works by surveilling the public sites in order to identify individuals that are likely to have committed a crime. It cross-references the individuals against data maintained by law enforcement and then assigns a percentage score of the likelihood of criminal activity based on certain factors like previous criminal history, location, time, race and gender.

The police department retained a third-party consultant to assist in the procurement process, specifically to evaluate two finalists. Each of the vendors provided information about their system’s accuracy rates, the diversity of their training data and how their system works. The consultant determined that the first vendor’s system has a higher accuracy rate and, based on this information, recommended this vendor to the police department.

The police department chose the first vendor and implemented its AI system. As part of the implementation, the department and consultant created a usage policy for the system, which includes training police officers on how the system works and how to incorporate it into their investigation process.

The police department has now been using the AI system for a year. An internal review has found that every time the system scored a likelihood of criminal activity at or above 90%, the police investigation subsequently confirmed that the individual had, in fact, committed a crime. Based on these results, the police department wants to forgo investigations for cases where the AI system gives a score of at least 90% and proceed directly with an arrest.

During the procurement process, what is the most likely reason that the third-party consultant asked each vendor for information about the diversity of their datasets?

A To comply with applicable law.
B To assess the fairness of the AI system.
C To evaluate the reliability of the AI system.
D To determine the explainability of the AI system.

A

B. To assess the fairness of the AI system.

Explanation:
The diversity of the datasets used in training an AI system is crucial for ensuring that the model is fair and does not disproportionately target or misclassify individuals based on attributes like race, gender, or other characteristics. If the training data is not diverse, the AI system may learn biased patterns, which can lead to unfair outcomes—such as over-representing certain groups as being more likely to engage in criminal activity.

The consultant’s request for information about the diversity of training data was likely motivated by the need to assess whether the system is designed in a way that reduces bias and ensures fair treatment across different demographic groups. This is especially important in law enforcement contexts, where biased predictions can have significant ethical and legal implications.

Here’s why the other options are less suitable:

A. To comply with applicable law: While compliance with anti-discrimination laws is important, the direct request for information about dataset diversity is more likely aimed at understanding fairness rather than merely complying with legal requirements.

C. To evaluate the reliability of the AI system: Reliability refers to the consistency and accuracy of the system’s results, but this is generally assessed through metrics like accuracy rates and error rates, not necessarily through data diversity. Data diversity impacts fairness more than reliability.

D. To determine the explainability of the AI system: Explainability concerns how easily humans can understand the AI system’s decision-making process. It relates more to how the system’s decisions are communicated rather than the nature of the training data itself.

Thus, B is the best answer because assessing dataset diversity helps ensure that the AI system treats different demographic groups equitably, which is key to preventing biased outcomes in sensitive applications like law enforcement.
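
One simple way a reviewer might check dataset diversity is to compare each group's share of the training data against a reference population. The sketch below is illustrative only: the function names, the group labels, the reference shares and the `tolerance` value are all assumptions, not details provided by either vendor.

```python
from collections import Counter

def group_shares(labels):
    """Fraction of records belonging to each group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def underrepresented(labels, reference, tolerance=0.05):
    """Groups whose share falls more than `tolerance` below the reference share."""
    shares = group_shares(labels)
    return sorted(
        group for group, expected in reference.items()
        if shares.get(group, 0.0) < expected - tolerance
    )

# Hypothetical training labels and reference population shares
training_labels = ["group_a"] * 8 + ["group_b"] * 2
reference_shares = {"group_a": 0.5, "group_b": 0.5}

print(group_shares(training_labels))
print(underrepresented(training_labels, reference_shares))
```

Balanced representation alone does not guarantee fair outputs, so a check like this would normally be one input among several.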

3
Q

Good Values Corporation (GVC) is a U.S. educational services provider that employs teachers to create and deliver enrichment courses for high school students. GVC has learned that many of its teacher employees are using generative AI to create the enrichment courses, and that many of the students are using generative AI to complete their assignments.

In particular, GVC has learned that the teachers they employ used open-source large language models (“LLM”) to develop an online tool that customizes study questions for individual students. GVC has also discovered that an art teacher has expressly incorporated the use of generative AI into the curriculum to enable students to use prompts to create digital art.

GVC has started to investigate these practices and develop a process to monitor any use of generative AI, including by teachers and students, going forward.

All of the following may be copyright risks from teachers using generative AI to create course content EXCEPT?

A. Content created by an LLM may be protectable under U.S. intellectual property law.
B. Generative AI is generally trained using intellectual property owned by third parties.
C. Students must expressly consent to this use of generative AI.
D. Generative AI often creates content without attribution.

A

C. Students must expressly consent to this use of generative AI.

Explanation:
The question asks which of the options is not a copyright risk associated with the use of generative AI by teachers in creating course content. Let’s break down each option:

A. Content created by an LLM may be protectable under U.S. intellectual property law: This option is about the uncertainty around whether or not content generated by AI can be copyrighted. The current legal framework in the U.S. does not clearly grant copyright protection to AI-generated content, which can present a copyright-related challenge for users of generative AI. Thus, this could be considered a potential risk.

B. Generative AI is generally trained using intellectual property owned by third parties: This is a significant copyright concern. Generative AI models are often trained on vast datasets that may include copyrighted content. If the generated outputs are deemed to have copied or derived from these protected works, it could create legal risks for users like teachers and educational institutions.

C. Students must expressly consent to this use of generative AI: This statement is not directly related to copyright risks. Instead, it pertains to privacy and consent issues, which are important but not specifically tied to copyright law. Therefore, it is not a copyright risk, making it the correct answer.

D. Generative AI often creates content without attribution: This can present a copyright issue because if the generated content includes or is derived from protected works, the lack of attribution could lead to claims of copyright infringement. This is indeed a copyright-related risk.

Thus, C is the answer because it is related more to the consent and privacy of students rather than being a direct copyright risk associated with using generative AI.

4
Q

A mid-size US healthcare network has decided to develop an AI solution to detect a type of cancer that is most likely to arise in adults. Specifically, the healthcare network intends to create a recognition algorithm that will perform an initial review of all imaging and then route records to a radiologist for secondary review pursuant to agreed-upon criteria (e.g., a confidence score below a threshold).

To date, the healthcare network has taken the following steps: defined its AI ethical principles; conducted discovery to identify the intended uses and success criteria for the system; established an AI governance committee; assembled a broad, cross-functional team with clear roles and responsibilities; and created policies and procedures to document standards, workflows, timelines and risk thresholds during the project.

The healthcare network intends to retain a cloud provider to host the solution and a consulting firm to help develop the algorithm using the healthcare network’s existing data and de-identified data that is licensed from a large US clinical research partner.

Which stakeholder group is most important in selecting the specific type of algorithm?
A. The cloud provider.
B. The consulting firm.
C. The healthcare network’s data science team.
D. The healthcare network’s AI governance committee.

A

C. The healthcare network’s data science team.

Explanation:
The data science team plays a crucial role in selecting the specific type of algorithm for the AI solution. This is because they have the necessary expertise in machine learning, data analysis, and algorithm selection. Their deep understanding of the data, the problem being addressed, and the technical requirements of different types of algorithms allows them to evaluate and choose the most suitable one for the task.

5
Q

A mid-size US healthcare network has decided to develop an AI solution to detect a type of cancer that is most likely to arise in adults. Specifically, the healthcare network intends to create a recognition algorithm that will perform an initial review of all imaging and then route records to a radiologist for secondary review pursuant to agreed-upon criteria (e.g., a confidence score below a threshold).

To date, the healthcare network has taken the following steps: defined its AI ethical principles; conducted discovery to identify the intended uses and success criteria for the system; established an AI governance committee; assembled a broad, cross-functional team with clear roles and responsibilities; and created policies and procedures to document standards, workflows, timelines and risk thresholds during the project.

The healthcare network intends to retain a cloud provider to host the solution and a consulting firm to help develop the algorithm using the healthcare network’s existing data and de-identified data that is licensed from a large US clinical research partner.

In the design phase, what is the most important step for the healthcare network to take when mapping its existing data to the clinical research partner data?

A. Apply privacy-enhancing technologies to the data.
B. Identify fits and gaps in the combined data.
C. Ensure the data is labeled and formatted.
D. Evaluate the country of origin of the data.

A

B. Identify fits and gaps in the combined data.

Explanation:
During the design phase of developing an AI solution, when combining data from different sources (in this case, the healthcare network’s existing data and the de-identified data from a clinical research partner), it is crucial to identify fits and gaps in the combined data. This involves understanding how well the datasets align in terms of structure, features, and content, as well as identifying any inconsistencies or missing data points.

This step is important because:

It helps ensure that both datasets are compatible for training the AI model.
Identifying gaps allows the organization to address any data deficiencies that could impact the model’s performance.
Understanding fits helps ensure that the data can be integrated effectively, maximizing the utility of the combined data for training the AI model.
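
A minimal sketch of a fit/gap comparison at the schema level is shown below. The field names are hypothetical placeholders, not the healthcare network's or the research partner's actual schema, and a real mapping exercise would also compare value ranges, units and coding systems.

```python
def fits_and_gaps(internal_fields, partner_fields):
    """Compare two field lists and report shared and unmatched fields."""
    internal, partner = set(internal_fields), set(partner_fields)
    return {
        "fits": sorted(internal & partner),           # present in both sources
        "only_internal": sorted(internal - partner),  # gap on the partner side
        "only_partner": sorted(partner - internal),   # gap on the internal side
    }

# Hypothetical field names for illustration only
internal_fields = ["age", "image_id", "scan_type", "diagnosis"]
partner_fields = ["age", "image_id", "scan_type", "tumor_stage"]

report = fits_and_gaps(internal_fields, partner_fields)
print(report)
```

Fields listed under "fits" can be mapped directly, while the two "only" lists point to the data deficiencies the team would need to resolve before training.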

6
Q

A mid-size US healthcare network has decided to develop an AI solution to detect a type of cancer that is most likely to arise in adults. Specifically, the healthcare network intends to create a recognition algorithm that will perform an initial review of all imaging and then route records to a radiologist for secondary review pursuant to agreed-upon criteria (e.g., a confidence score below a threshold).

To date, the healthcare network has taken the following steps: defined its AI ethical principles; conducted discovery to identify the intended uses and success criteria for the system; established an AI governance committee; assembled a broad, cross-functional team with clear roles and responsibilities; and created policies and procedures to document standards, workflows, timelines and risk thresholds during the project.

The healthcare network intends to retain a cloud provider to host the solution and a consulting firm to help develop the algorithm using the healthcare network’s existing data and de-identified data that is licensed from a large US clinical research partner.

In the design phase, which of the following steps is most important in gathering the data from the clinical research partner?

A. Perform a privacy impact assessment.
B. Combine only anonymized data.
C. Segregate the data sets.
D. Review the terms of use.

A

D. Review the terms of use.

Explanation:
When gathering data from an external partner, such as a clinical research partner, it is crucial to review the terms of use for the data. This ensures that the healthcare network understands the permissions, limitations, and legal obligations associated with using the data. Key considerations include:

Licensing restrictions: Understanding what the healthcare network is allowed to do with the data, such as using it for training the AI model.
Data use limitations: Ensuring that the use of the data complies with any stipulations in the agreement, such as restrictions on redistribution or conditions for de-identification.
Compliance with regulations: Confirming that data usage aligns with HIPAA (Health Insurance Portability and Accountability Act) and other relevant privacy laws.
This step is critical because misusing the data or failing to adhere to the agreed terms could result in legal consequences or the revocation of access to the data.
