ETHICAL AND SOCIAL ISSUES IN NLP Flashcards

1
Q

Negative uses of NLP

A

Profiling of users?
Generating harmful tweets/comments?
Propaganda?
Manipulation and framing?

2
Q

Using the right training data

A

Incomprehensible training data – not fully disclosed, explored or understood
- The entire Web?
- A large Twitter dataset collected by a single keyword?
e.g. ChatGPT's training data is thought to skew heavily toward English-language, US-centric sources

Size doesn’t guarantee diversity
Static data vs changing social views

3
Q

Data bias

A

AI is trained on biased data
- Racial, gender and ethnic bias
All data is biased to some degree
Bias also depends on how the data is annotated

4
Q

Ethical issue with data

A

Data have significant implications
- Is the data representative? What about bias?
- Is the data appropriate for the task?
- How was the data labeled?
- Copyright? Private data?

5
Q

AI Data Pollution

A

We could run out of data to train AI
It’s going to get trickier to find good-quality, guaranteed AI-free training data

6
Q

Issues with testing and validation

A

Instead of just accuracy:

Systematic evaluation needed
- What is the value? Can the model do harm?
- How feasible is it to deploy?
- Validation should lead to trust and confidence
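
A minimal sketch of what looking beyond a single accuracy figure can mean in practice: report accuracy per group as well as overall. The function name `accuracy_by_group`, the toy labels, and the group attribute are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return overall accuracy and a dict of accuracy per group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group

# Hypothetical toy data: true labels, model predictions, and a made-up group attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(overall)    # 0.625
print(per_group)  # {'a': 1.0, 'b': 0.0}
```

In this toy case the overall accuracy looks moderate, yet every error falls on group "b", which a single aggregate number would hide.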

7
Q

LLM Risks

A
  • Discrimination, hate speech and exclusion
  • Information hazards (privacy)
  • Misinformation
  • Malicious use (fraud)
  • Human-computer interaction harms (stereotypes)
  • Environmental and socioeconomic harms (hurt creative economies?)
8
Q
A
9
Q

Environmental impact

A

Training a single BERT base model (without hyperparameter tuning) on GPUs was estimated to require as much energy as a trans-American flight

10
Q

Geopolitical impact

A

Countries competing to create the best AI
Importance of having the leading AI

11
Q

Resource challenges

A

Computational infrastructure
- Who can afford it? Only a select few?
- Training large models is challenging and very costly
- Running costs

Right staff in right places?

Right partnerships?

12
Q

Regulations

A

We need an agreed regulatory framework

We need agreed ethical and validation framework(s)

Transparency

13
Q

What is a pre-mortem (as opposed to a post-mortem)?

A

Consider the known risks, and try to understand the unknown risks and limitations, of a new product or project before it has even been designed
