Chapter 11: AI and Societal Issues Flashcards

1
Q

What are the 7 rights of individuals over their personal data under the European Union (EU)’s General Data Protection Regulation (GDPR)?

A
  1. Right of access
    - allow subjects to access the data that the company processes
  2. Right to rectification
    - right to have the data subjects provided to the company changed or corrected when they believe the data is inaccurate or out-of-date.
  3. Right to erasure
    - right to have data removed from the database, e.g. when the subject no longer consents or the data is no longer needed
  4. Right to restrict processing
    - subject’s right to request the restriction of processing, e.g. if the subject contests the accuracy of the data or objects to unlawful processing
  5. Right to data portability
    - subject’s right to receive the personal data held by the controlling company in a commonly used format and to send that data to another company for their own purposes
  6. Right to object
    - subjects have the right to object to data processing, including profiling, on grounds relating to their particular situation.
  7. Right not to be subject to a decision based solely on automated processing
2
Q

What is anonymization?

A

It is the process of removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous

3
Q

How can anonymisation be carried out on medical data in hospitals?

Why does anonymisation not work too well?

A
  • In patient records there are fields like Name, Zipcode, Birthday and Sex. Anonymisation can be carried out by replacing the name with an anonymous ID.
  • Anonymisation in this case may not work well, as individuals can still be re-identified from the remaining fields –> e.g. in Cambridge, only 6 people shared the governor’s birthday, 3 of whom were male, and only 1 of those lived in the governor’s zipcode –> anyone who buys this “anonymous” data can therefore identify the governor’s records.
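The re-identification risk above can be sketched as a linkage attack. All records below are invented for illustration (the name echoes the well-known Massachusetts governor case, but every value here is made up):

```python
# Sketch of a linkage attack on "anonymized" hospital records.
# All data is hypothetical.

hospital = [  # names replaced by anonymous IDs, quasi-identifiers kept
    {"id": "P1", "zip": "02138", "dob": "1945-07-31", "sex": "M", "dx": "hypertension"},
    {"id": "P2", "zip": "02139", "dob": "1962-01-15", "sex": "F", "dx": "asthma"},
]

voters = [  # public voter roll: same quasi-identifiers plus names
    {"name": "W. Weld", "zip": "02138", "dob": "1945-07-31", "sex": "M"},
    {"name": "J. Doe",  "zip": "02139", "dob": "1980-03-02", "sex": "F"},
]

def reidentify(hospital, voters):
    """Join the two tables on the quasi-identifiers (zip, dob, sex)."""
    matches = []
    for h in hospital:
        for v in voters:
            if (h["zip"], h["dob"], h["sex"]) == (v["zip"], v["dob"], v["sex"]):
                matches.append((v["name"], h["dx"]))
    return matches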
4
Q

What is differential privacy?

A

A system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset.

(It prevents individual records from being identified by adding noise to the data in a controlled way, while still allowing valuable insights to be extracted from the data.)
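One common way to realise this (a sketch of the Laplace mechanism, not the only option; function names are illustrative) is to add Laplace noise scaled to the query’s sensitivity:

```python
import math
import random

def laplace_noise(scale):
    """Draw Laplace(0, scale) noise by inverse-CDF sampling."""
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon):
    """Release a noisy count. A counting query has sensitivity 1 (adding or
    removing one record changes it by at most 1), so Laplace noise with
    scale 1/epsilon yields epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller epsilon means larger noise: stronger privacy, but less accurate counts.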

5
Q

When is a mechanism considered differentially private?

A

if the probability of any outcome occurring is nearly the same for any two datasets that differ in only one record.
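Formally (using conventional notation, not taken from the card): a randomized mechanism A is ε-differentially private if, for all datasets D1 and D2 differing in a single record and for every set of outputs S,

```latex
\Pr[A(D_1) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[A(D_2) \in S]
```

A small ε means the two probabilities are nearly equal, so any single record has almost no influence on the output.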

6
Q

What are the 2 advantages of differential privacy?

A
  1. Robustness during post-processing (protected against post-processing) –> any further processing of a differentially private output cannot weaken its privacy guarantee; it also ensures that predictions do not heavily depend on any single data point.
  2. Composability
    - A1(D) guarantees some privacy definition with level e1 for dataset D
    - A2(D) guarantees the same privacy definition with level e2 for dataset D

then releasing both A1(D) and A2(D) satisfies the same privacy definition with parameter f(e1, e2); for e-differential privacy, composition is additive: f(e1, e2) = e1 + e2.

7
Q

What is the disadvantage of differential privacy?

A

The added noise degrades accuracy, so it gives poorer estimates than the raw data would.

8
Q

Randomised response and perturbation: involves asking individuals to respond to a “yes” or “no” question where a truthful answer may be embarrassing, such as “Do you cheat in exams?”

How can this be done? (using simple coin flipping mechanism)

A
  • The respondent chooses an answer, “Yes” or “No”
  • Before sending the real answer to the server, the differential privacy algorithm flips a coin.
  • If heads, it sends the real answer. If tails, it flips the coin again: if the second toss lands heads, it sends the real answer; if tails, it sends the opposite answer.
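The coin-flip mechanism above can be sketched in a few lines (the function name is illustrative):

```python
import random

def randomized_response(true_answer):
    """Coin-flip mechanism from the card: first toss heads -> send the real
    answer; tails -> toss again: heads -> real answer, tails -> opposite."""
    if random.random() < 0.5:      # first toss: heads
        return true_answer
    if random.random() < 0.5:      # second toss: heads
        return true_answer
    return "No" if true_answer == "Yes" else "Yes"   # second toss: tails
```

Each respondent keeps plausible deniability: a “Yes” may simply be the result of the coin, so the server learns little about any individual.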
9
Q

Differential privacy : coin-flipping differential privacy algorithm

Suppose fraction x of students cheated in exams, and the measured fraction of “Yes” is y, how can we estimate the actual fraction of students who cheated?

A

y ≈ 0.5x + 0.25, so x ≈ 2(y - 0.25)

10
Q

Differential privacy : coin-flipping differential privacy algorithm

Suppose fraction x of students cheated in exams, and the measured fraction of “Yes” is y, how can we estimate the actual fraction of students who cheated?

A

y ≈ 0.5x + 0.25
[ 0.5x = P(heads on first toss) × P(student actually cheated);
0.25 = P(tails on first toss) × P(answer is “Yes” given tails) = 0.5 × 0.5, since after a tails the answer is “Yes” with probability 0.5 regardless of the truth ]

x ≈ 2(y - 0.25)
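A quick simulation (with an assumed true cheating rate of 30%, chosen for illustration) shows the estimator recovering x from the noisy answers:

```python
import random

def noisy_answer(cheated):
    """Coin-flip randomized response: 3/4 chance of the truth, 1/4 opposite."""
    if random.random() < 0.5:
        return cheated
    if random.random() < 0.5:
        return cheated
    return not cheated

def estimate_cheating_rate(n=100_000, x_true=0.30, seed=1):
    """Simulate n respondents, measure y = fraction of 'Yes', then invert
    y ≈ 0.5x + 0.25 to recover x. x_true is the assumed ground truth."""
    random.seed(seed)
    answers = [noisy_answer(random.random() < x_true) for _ in range(n)]
    y = sum(answers) / n
    return 2 * (y - 0.25)
```

The estimate converges to the true rate as n grows, even though no individual answer is trustworthy.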

11
Q

Which 2 companies use differential privacy to collect usage statistics while providing privacy to users?

A
  1. Google
  2. Apple (iphone)
12
Q

Amazon created a tool to review resumes to hire top talent. However, its AI tool was shown to be biased and discriminated against women.

How is it possible that this tool was biased?

A
  • The tool’s algorithm predicts whether an application will be successful by comparing the resume against those of PAST applicants
  • The outcomes for past applicants were themselves biased, as more males were hired in the past.
  • To predict those outcomes well, the algorithm therefore learned the same bias
13
Q

What does statistical parity measure ?

A

Statistical Parity measures the difference in probabilities of a positive outcome across two groups.
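As a sketch (the function and group labels are illustrative), the statistical parity difference can be computed as:

```python
def statistical_parity_difference(outcomes, groups, group_a, group_b):
    """P(positive outcome | group_a) - P(positive outcome | group_b),
    where a positive outcome is encoded as 1 and a negative one as 0."""
    def rate(g):
        members = [o for o, grp in zip(outcomes, groups) if grp == g]
        return sum(members) / len(members)
    return rate(group_a) - rate(group_b)
```

A value of 0 means both groups receive the positive outcome at the same rate; the further from 0, the larger the disparity.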

14
Q

What does equality of opportunity mean?

A

It means that the same proportion of each population/group receives the “good” outcome.

15
Q

If there is equal opportunity, statistical parity is achieved. However, it may not be fair in real life. Can you think of some examples where this may be the case?

A
  1. Employment discrimination: An employer may have an equal number of male and female employees, but if women are disproportionately assigned lower-paying or less desirable positions or denied promotions or training opportunities, there is still unfairness and discrimination.
  2. Loan approvals: A lender may approve loans at equal rates for different racial groups, but if the criteria for approval systematically disadvantage certain groups, such as requiring higher credit scores or more collateral, it can still be unfair.
  3. Access to healthcare: Equal access to healthcare may be achieved on paper, but if certain groups face more barriers to obtaining healthcare, such as lack of transportation, inadequate insurance coverage, or discrimination from healthcare providers, it can still be unfair.
  4. Educational opportunities: Statistical parity in enrollment or graduation rates may exist between different groups, but if certain groups are systematically disadvantaged in terms of access to quality education, such as through underfunding or inadequate resources, it can still be unfair.
16
Q

Why do we need to be able to interpret the model and data in ML?

A

We may not feel comfortable with machines making decisions (e.g. approving bank loans); to establish trust, we must understand ourselves, and be able to explain to others, how the model makes its decisions.

  • So that we know the model is not discriminating against a certain race / profile etc. when approving bank loans
17
Q

What are black box models?

A

Models that do not reveal their internal mechanisms; models that cannot be understood just by looking at their parameters

18
Q

Why are neural networks considered black box models?

A

We don’t know how all individual neurons work together to arrive at the final output (it is often unclear what any particular neuron is doing on its own).

19
Q

What are model agnostic methods?

A

Methods that can be applied to any model (e.g. neural networks, support vector machines, etc.)
- including black-box models –> models that do not reveal information about their internal workings (e.g. how they derive output from input)

20
Q

What is the difference between local interpretable model-agnostic methods (LIME) and global interpretable model-agnostic methods?

A

Local interpretation methods explain individual predictions (e.g. how the model arrived at the prediction for a single instance), while global interpretable model-agnostic methods describe the average behavior of a machine learning model.

21
Q

What are the 4 steps involved in LIME?

A
  1. Select the instance of interest
  2. Perturb data around the instance to get a new dataset
  3. Weight the data according to distance from the instance of interest
  4. Train an interpretable model
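The four steps can be sketched for a single feature (a toy illustration with an invented black-box model; real LIME perturbs many features and fits a multivariate surrogate):

```python
import math
import random

def black_box(x):
    """Hypothetical opaque model we want to explain locally."""
    return x ** 2 + 2 * x

def lime_1d(x0, n=500, sigma=0.3, kernel_width=0.2, seed=0):
    """Minimal one-feature LIME sketch following the card's four steps."""
    rng = random.Random(seed)
    # Step 1: the instance of interest is x0 (passed in).
    # Step 2: perturb data around the instance.
    xs = [x0 + rng.gauss(0, sigma) for _ in range(n)]
    ys = [black_box(x) for x in xs]
    # Step 3: weight samples by proximity to x0.
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    # Step 4: fit an interpretable surrogate, a weighted linear model y ≈ a + b*x.
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    b = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    a = my - b * mx
    return a, b
```

The surrogate’s slope b is the local explanation: near x0 = 1, the black box behaves like a line with slope close to its true local derivative.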
22
Q

List the name of the global model-agnostic method and the names of the 2 local interpretable model-agnostic methods.

A
  • Global model agnostic method : Partial dependence plot
  • Local model agnostic methods
    1. Counterfactual explanations
    2. Adversarial examples
23
Q

Why is a partial dependence plot considered a global model agnostic method?

A
  • A partial dependence plot shows the effect of one or two features on the machine learning model, averaged over the whole dataset; this averaging over all instances is what makes it a global method
  • It can show whether the relationship between the target and a feature is linear, monotonic or more complex.
24
Q

Model-agnostic method for interpretation : Partial dependence plot

How does a partial dependence plot work?

A
  • set different values for the specified feature
  • for each value, take the average of the model’s predictions over the dataset
  • plot the average prediction against each value of the specified feature
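These steps can be sketched as follows (function and parameter names are illustrative):

```python
def partial_dependence(model, X, feature_idx, grid):
    """PDP values for one feature: for each grid value v, overwrite the
    feature in every row of X with v and average the model's predictions."""
    averages = []
    for v in grid:
        preds = []
        for row in X:
            modified = list(row)
            modified[feature_idx] = v      # set the specified feature to v
            preds.append(model(modified))
        averages.append(sum(preds) / len(preds))
    return averages                        # plot grid vs. averages for the PDP
```

Plotting grid against the returned averages gives the partial dependence curve for that feature.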
25
Q

What are 2 disadvantages of partial dependence plots?

A
  1. It assumes that input features are independent (in affecting the outcome). This may be inconsistent with reality, as certain features are correlated with others
  2. Solely taking the average over the training data blurs out other information, such as the spread of the distribution of the data across instances.
26
Q

What is the main goal of Local Interpretable model-agnostic methods (LIME)?

A

to make ML models more secure against manipulation: by adding noise to the data (perturbing it) to get a new dataset, LIME helps test, interpret and improve models, making them less susceptible to attacks

27
Q

What is a counterfactual explanation?

A

It describes a causal situation in the form: “If X had not occurred, Y would not have occurred”.

  • For example: “If I hadn’t taken a sip of this hot coffee, I wouldn’t have burned my tongue”, where the cause X is sipping hot coffee and the effect Y is burning my tongue
28
Q

What is the purpose of adversarial examples?

A

To deceive the ML model into producing a different prediction

29
Q

What is the optimisation objective in both counterfactual explanations and adversarial examples?

A

To find the smallest change from x (the original) to x′ (the counterfactual/adversarial example) that changes the black box’s prediction from one class to another
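A common way to formalise this objective (notation is conventional, not taken from the card):

```latex
\min_{x'} \; d(x, x') \quad \text{subject to} \quad f(x') \neq f(x)
```

where f is the black-box model and d is a distance in feature space (e.g. the L1 or L2 norm). Counterfactual explanations and adversarial examples share this objective and differ mainly in intent.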

30
Q

Adversarial examples are a type of _____ which aims to _____ the model, not _____ it.

A

counterfactual ; deceive ; interpret