Understanding the AI Development Lifecycle: Design Flashcards

1
Q

What are the elements of the design phase (data gathering)?

A
  • Implement a data strategy.
  • Determine what data is required.
  • Determine how much data is needed.
  • Determine how data will be collected and stored.
  • Consider whether pre-trained data should be used.
  • Decide whether to use internal data only or external data.
  • Consider the quality of the data.
2
Q

Name some different formats of data.

A
  • Structured data (data that is labeled and categorized)
  • Unstructured data (data that is not labeled or categorized)
  • Static data (data that does not change, e.g. past sales)
  • Streaming data (data that changes continuously, e.g. customers visiting a website)
3
Q

Which part of development takes the most effort?

A
  • Preparing the data (approx. 80% of the cycle)
4
Q

What are the 5 V’s of data preparation?

A

1) Volume of the data
2) Velocity of the data
3) Variety of the data
4) Veracity of the data
5) Value of the data

5
Q

What is the definition of cleansing data?

A

Removing erroneous or irrelevant data (helps to avoid privacy issues later).
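
As a rough illustration of cleansing, the sketch below assumes records are plain dictionaries. The field names (`age`, `city`, `favorite_color`) and the plausible age range are hypothetical, chosen only for the example.

```python
def cleanse(records, required_fields, valid_age=(0, 120)):
    """Drop erroneous rows and strip fields the application does not need."""
    low, high = valid_age
    cleaned = []
    for row in records:
        age = row.get("age")
        # Erroneous: age missing, not a number, or outside a plausible range.
        if not isinstance(age, (int, float)) or not (low <= age <= high):
            continue
        # Irrelevant: keep only the fields the application actually uses.
        cleaned.append({k: row[k] for k in required_fields if k in row})
    return cleaned


raw = [
    {"age": 34, "city": "Leeds", "favorite_color": "blue"},
    {"age": -5, "city": "York", "favorite_color": "red"},
    {"age": None, "city": "Bath", "favorite_color": "green"},
]
print(cleanse(raw, required_fields=["age", "city"]))
```

Only the first record survives, and it no longer carries the unused `favorite_color` field.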

6
Q

What is involved in labeling data?

A

Tagging or annotating data.

7
Q

How do you anonymize data?

A

Remove identifiers from the data.
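
A minimal sketch of removing identifiers, assuming each record is a dictionary. The set of identifier field names is hypothetical and would depend on the dataset.

```python
# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = frozenset({"name", "email", "phone"})


def remove_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without the identifying fields."""
    return {k: v for k, v in record.items() if k not in identifiers}


customer = {"name": "Ada", "email": "ada@example.com", "age": 36, "plan": "pro"}
print(remove_identifiers(customer))
```

Dropping direct identifiers is only the first step: the remaining fields can sometimes be combined to re-identify a person, so they may need generalizing or removing as well.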

8
Q

What is the concept of data minimization?

A

If data is not needed in the application, do not use it to train the model.

9
Q

List some privacy-enhancing technologies (PETs) that can be applied to AI systems.

A
  • Anonymization
  • Minimization
  • Differential privacy
  • Federated learning
10
Q

During which phase is the AI architecture and model chosen?

A

The Design Phase

11
Q

What is Differential Privacy?

A

Blurs the data by using an algorithm that adds calibrated random noise to the data or to query results. The data is still usable in aggregate but non-specific (unable to identify individuals).
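
One common way to do this blurring is the Laplace mechanism, sketched below for a simple counting query. The function names, the sample ages, and the choice of `epsilon` are assumptions made for illustration, not a production implementation.

```python
import random


def laplace_noise(scale, rng=random):
    # The difference of two exponential draws is Laplace-distributed
    # with the given scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=random):
    """Noisy count. A counting query has sensitivity 1, so scale = 1 / epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)


ages = [23, 35, 41, 29, 52, 67, 38]
# The result differs on every run; it is close to the true count of 3 on average.
print(private_count(ages, lambda a: a >= 40, epsilon=0.5))
```

A smaller `epsilon` adds more noise, which gives stronger privacy at the cost of less accurate results.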

12
Q

What is Federated Learning?

A

One central model is downloaded to each local site and trained on the data held at that location. This avoids sharing sensitive data. Only the results of the training (model updates) are sent back to the central location, where the updates, not the underlying data, are aggregated into the central model.
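
The round trip can be sketched with a toy one-parameter model (y = weight * x) and simple weighted averaging of the returned weights. The site data, learning rate, and function names are all hypothetical; real systems add secure aggregation and far larger models.

```python
def local_update(weight, data, lr=0.01, epochs=20):
    """Train a copy of the model on one site's private data (gradient descent)."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight, sites):
    """Each site trains locally; only weights return, averaged by data size."""
    updates = [(local_update(global_weight, data), len(data)) for data in sites]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Each inner list stays at its own site; the raw (x, y) pairs are never pooled.
sites = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(3.0, 5.9), (4.0, 8.2), (5.0, 9.8)],
]
weight = 0.0
for _ in range(10):
    weight = federated_round(weight, sites)
print(round(weight, 2))
```

The central location only ever sees one number per site per round, yet the shared weight settles near 2, the slope underlying both sites' data.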

13
Q

What are elements of implementing a data strategy?

A
  • Data gathering
  • Data wrangling
  • Data cleansing
  • Data labeling