DPP Topic 3 - Data Anonymization Flashcards

1
Q

Data Anonymization

A

The process of removing or modifying personally identifiable information from a dataset to prevent the identification of individuals.

2
Q

Original Database

A

The initial dataset containing personal information.

3
Q

Published Database

A

The anonymized dataset that is made available for use.

4
Q

Anonymized Data

A

The modified data that retains the usefulness of the original data while protecting individual privacy.

5
Q

Original Data

A

The personal information in the initial dataset.

6
Q

Balancing Data Privacy and Data Utility

A

The goal of anonymization is to make data less specific while retaining its usefulness.

7
Q
1. Determine release model
(Data Preparation)
A

Deciding whether the anonymized dataset will be made public or kept non-public.

8
Q
2. Determine re-identification risk threshold
(Data Preparation)
A

Setting the maximum acceptable likelihood of re-identification. A lower risk threshold demands stronger anonymization, which increases data anonymity but decreases data utility.

9
Q
3. Classify data attributes
(Data Preparation)
A

Identifying explicit identifiers, quasi-identifiers, and sensitive data.
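
As a rough illustration, the classification can be recorded as a simple lookup from column name to category. The column names below are hypothetical and chosen only for this sketch; real datasets need case-by-case judgment.

```python
# Hypothetical columns, classified into the three categories named on this card.
ATTRIBUTE_CLASSES = {
    "name": "explicit identifier",
    "national_id": "explicit identifier",
    "postal_code": "quasi-identifier",
    "age": "quasi-identifier",
    "diagnosis": "sensitive",
}


def columns_of(kind):
    """Return the column names assigned to the given category."""
    return [col for col, k in ATTRIBUTE_CLASSES.items() if k == kind]


quasi_identifiers = columns_of("quasi-identifier")
```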

10
Q
4. Remove unused data attributes
(Data Preparation)
A

Suppressing attributes that are not required in the anonymized dataset.

11
Q
5. Anonymize identifiers
(Data Execution)
A

Applying relevant anonymization techniques to different types of identifiers.

12
Q
6. Evaluate the solution
(Data Execution)
A

Assessing the anonymized dataset for sufficient data anonymity and utility.

13
Q
7. Determine controls required
(Data Execution)
A

Implementing technical and non-technical controls to protect the anonymized data.

14
Q
8. Document anonymization process
(Data Execution)
A

Recording the details of the anonymization process for future reference.

15
Q

1) Attribute Suppression
(Techniques)

A

Removal of an entire attribute (column) from the dataset. (Strongest Technique)
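
A minimal sketch of attribute suppression on a list of records. The sample records and the function name `suppress_attributes` are assumptions made for illustration.

```python
# Hypothetical sample records.
records = [
    {"name": "Alice Tan", "age": 34, "diagnosis": "asthma"},
    {"name": "Bob Lim", "age": 51, "diagnosis": "diabetes"},
]


def suppress_attributes(rows, columns):
    """Return copies of the rows with the given columns removed entirely."""
    drop = set(columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


published = suppress_attributes(records, ["name"])
```

The original records are left untouched; only the published copy loses the column.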

16
Q

2) Character Masking
(Techniques)

A

Replacing characters in a data value with a symbol (e.g., “*” or “x”).
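
One possible way to mask a value, keeping only the last few characters visible. The function name `mask` and its parameters are assumptions for this sketch; how many characters to keep depends on the data.

```python
def mask(value, keep_last=4, symbol="*"):
    """Replace all but the last `keep_last` characters with `symbol`."""
    if keep_last <= 0:
        return symbol * len(value)
    visible = value[-keep_last:]
    return symbol * (len(value) - len(visible)) + visible


masked_id = mask("S1234567A")            # keeps "567A"
fully_masked = mask("S1234567A", keep_last=0)
```

Note that a value no longer than `keep_last` is returned unmasked in this sketch, so short values may need a stricter rule.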

17
Q

3) Generalization
(Techniques)

A

Reducing the precision of data by converting it into a range of values.
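
A small sketch of generalization, turning an exact age into a fixed-width range. The function name `generalize_age` and the default width of 10 are assumptions for illustration.

```python
def generalize_age(age, width=10):
    """Map an exact age to a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


age_band = generalize_age(34)
```

A wider range gives more privacy but less precise data.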

18
Q

4) Swapping
(Techniques)

A

Rearranging the values of an attribute across records, so that each value still appears in the dataset but is no longer linked to its original record.
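
Swapping can be sketched as shuffling one column across all records. The function name `swap_column`, the `seed` parameter, and the sample rows are assumptions for this example.

```python
import random

# Hypothetical sample rows.
rows = [
    {"id": 1, "salary": 3000},
    {"id": 2, "salary": 4500},
    {"id": 3, "salary": 5200},
    {"id": 4, "salary": 6100},
]


def swap_column(rows, column, seed=None):
    """Return copies of the rows with the values of `column` shuffled across records."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


swapped = swap_column(rows, "salary", seed=1)
```

The set of values (and so totals and averages) is unchanged. A plain shuffle may leave some values in their original rows, so a real process might add a check for that.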

19
Q

5) Data Perturbation
(Techniques)

A

Modifying the original data values to be slightly different, for example by adding small random noise or rounding.
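
A minimal sketch of perturbation that shifts a number by a small random amount. The function name `perturb` and the `max_shift` bound are assumptions; rounding to a base is another common variant.

```python
import random


def perturb(value, max_shift=2, rng=random):
    """Return `value` shifted by a random integer in [-max_shift, max_shift]."""
    return value + rng.randint(-max_shift, max_shift)


ages = [34, 51, 29]
rng = random.Random(0)  # seeded only to make the sketch repeatable
perturbed_ages = [perturb(a, max_shift=2, rng=rng) for a in ages]
```

The shift can be zero, so some values may come out unchanged.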

20
Q

6) Synthetic Data
(Techniques)

A

Artificially creating data that captures the statistical distributions of the original data.
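
A very simplified sketch of synthetic data for a single numeric column. It assumes the values are roughly normally distributed, which is a strong assumption made here only for illustration; the function name `synthesize` and the sample ages are hypothetical.

```python
import random
import statistics

# Hypothetical original values.
original_ages = [23, 35, 41, 29, 52, 47, 38, 31]


def synthesize(values, n, seed=None):
    """Draw n artificial values from a normal distribution fitted to `values`."""
    rng = random.Random(seed)
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


synthetic_ages = synthesize(original_ages, n=1000, seed=0)
```

The synthetic values share the mean and spread of the originals but do not correspond to any real record. Real synthetic data generation usually also has to preserve relationships between columns.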

21
Q

7) Data Aggregation
(Techniques)

A

Converting a dataset from individual records to summarized values.
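
Aggregation can be sketched as grouping records and publishing only a count and an average per group. The function name `aggregate` and the sample data are assumptions for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual records.
staff = [
    {"dept": "A", "salary": 3000},
    {"dept": "A", "salary": 5000},
    {"dept": "B", "salary": 4000},
]


def aggregate(rows, group_col, value_col):
    """Summarize rows as {group: {'count': ..., 'mean': ...}}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {g: {"count": len(v), "mean": mean(v)} for g, v in groups.items()}


summary = aggregate(staff, "dept", "salary")
```

A group with a count of 1 (department B here) still reveals an individual value, so small groups typically need to be merged or suppressed.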

22
Q

8) K-anonymity
(Techniques)

A

A property of a dataset in which each combination of quasi-identifier values appears in at least k records, so that no individual can be narrowed down to a group of fewer than k.
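
A rough check for k-anonymity, assuming the combinations are taken over the quasi-identifier columns. The function names `smallest_group` and `is_k_anonymous` and the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical rows that have already been generalized.
patients = [
    {"age": "30-39", "postal": "12****", "diagnosis": "asthma"},
    {"age": "30-39", "postal": "12****", "diagnosis": "flu"},
    {"age": "40-49", "postal": "34****", "diagnosis": "diabetes"},
    {"age": "40-49", "postal": "34****", "diagnosis": "asthma"},
    {"age": "40-49", "postal": "34****", "diagnosis": "flu"},
]


def smallest_group(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) if counts else 0


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination appears in at least k rows."""
    return smallest_group(rows, quasi_identifiers) >= k


k_achieved = smallest_group(patients, ["age", "postal"])
```

Here the smallest group has 2 rows, so the sample is 2-anonymous but not 3-anonymous.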

23
Q

9) Pseudonymization

A

Replacing identifying data with made-up values that have no relationship to the original values.
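
A minimal sketch of pseudonymization using random tokens held in a lookup table. The class name `Pseudonymizer` and the `P-` prefix are assumptions for illustration. The tokens come from Python's `secrets` module, so they are not derived from the original values.

```python
import secrets


class Pseudonymizer:
    """Assign each distinct value a random token and reuse it consistently."""

    def __init__(self):
        self._table = {}    # original value -> pseudonym
        self._used = set()  # pseudonyms already handed out

    def pseudonym(self, value):
        if value not in self._table:
            token = "P-" + secrets.token_hex(4)
            while token in self._used:  # avoid the rare collision
                token = "P-" + secrets.token_hex(4)
            self._used.add(token)
            self._table[value] = token
        return self._table[value]


p = Pseudonymizer()
alice = p.pseudonym("Alice Tan")
bob = p.pseudonym("Bob Lim")
alice_again = p.pseudonym("Alice Tan")
```

The lookup table links pseudonyms back to real identities, so it must be stored securely or destroyed if re-identification should not be possible.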