DPP Topic 3 - Data Anonymization Flashcards

Question 1

Q

Data Anonymization

Answer

A

The process of removing or modifying personally identifiable information from a dataset to prevent the identification of individuals.

Question 2

Q

Original Database

Answer

A

The initial dataset containing personal information.

Question 3

Q

Published Database

Answer

A

The anonymized dataset that is made available for use.

Question 4

Q

Anonymized Data

Answer

A

The modified data that retains the usefulness of the original data while protecting individual privacy.

Question 5

Q

Original Data

Answer

A

The personal information in the initial dataset.

Question 6

Q

Balancing Data Privacy and Data Utility

Answer

A

The goal of anonymization is to make data less specific while retaining its usefulness.

Question 7

Q

Determine release model
(Data Preparation)

Answer

A

Deciding whether the anonymized dataset will be made public or kept non-public.

Question 8

Q

Determine re-identification risk threshold (Data Preparation)

Answer

A

Setting the maximum re-identification risk that is acceptable for the release. A lower (stricter) threshold increases data anonymity but decreases data utility.

Question 9

Q

Classify data attributes
(Data Preparation)

Answer

A

Identifying explicit identifiers, quasi-identifiers, and sensitive data.
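
As a rough illustration of this step, the classification can be recorded as a simple mapping from column name to attribute class. The column names and the helper `attributes_of` below are hypothetical and chosen only for the example.

```python
# Hypothetical column names and classifications, for illustration only.
ATTRIBUTE_CLASSES = {
    "name": "explicit_identifier",
    "national_id": "explicit_identifier",
    "age": "quasi_identifier",
    "postal_code": "quasi_identifier",
    "diagnosis": "sensitive",
}


def attributes_of(kind, classes=ATTRIBUTE_CLASSES):
    """Return the column names that were assigned the given class."""
    return [name for name, cls in classes.items() if cls == kind]


print(attributes_of("quasi_identifier"))
```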

Question 10

Q

Remove unused data attributes
(Data Preparation)

Answer

A

Suppressing attributes that are not required in the anonymized dataset.

Question 11

Q

Anonymize identifiers
(Data Execution)

Answer

A

Applying relevant anonymization techniques to different types of identifiers.

Question 12

Q

Evaluate the solution
(Data Execution)

Answer

A

Assessing the anonymized dataset for sufficient data anonymity and utility.

Question 13

Q

Determine controls required
(Data Execution)

Answer

A

Implementing technical and non-technical controls to protect the anonymized data.

Question 14

Q

Document anonymization process
(Data Execution)

Answer

A

Recording the details of the anonymization process for future reference.

Question 15

Q

1) Attribute Suppression
(Techniques)

Answer

A

Removal of an entire attribute (column) from the dataset. (Strongest Technique)
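
A minimal sketch of attribute suppression, assuming the dataset is held as a list of dictionaries. The function name `suppress_attributes` and the sample records are made up for this example.

```python
def suppress_attributes(records, attributes):
    """Return copies of the records without the listed attributes (columns)."""
    drop = set(attributes)
    return [{k: v for k, v in row.items() if k not in drop} for row in records]


# Made-up sample data.
records = [
    {"name": "Alice", "age": 34, "score": 81},
    {"name": "Bob", "age": 29, "score": 75},
]

published = suppress_attributes(records, ["name"])
print(published)
```

The original records are left untouched; only the published copy loses the column.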

Question 16

Q

2) Character Masking
(Techniques)

Answer

A

Replacing characters in a data value with a symbol (e.g., “*” or “x”).
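
A small sketch of character masking. How many characters stay visible is a design choice; keeping the last four is only an assumption made here, and `mask_characters` is a hypothetical helper.

```python
def mask_characters(value, keep_last=4, symbol="*"):
    """Replace all but the last `keep_last` characters with `symbol`."""
    keep_last = max(0, min(keep_last, len(value)))
    hidden = len(value) - keep_last
    return symbol * hidden + value[hidden:]


print(mask_characters("S1234567A"))
```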

Question 17

Q

3) Generalization
(Techniques)

Answer

A

Reducing the precision of data by converting it into a range or broader category (e.g., replacing an exact age with an age range).
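
One possible way to generalize a numeric value is to map it to a fixed-width band. The band width of 10 and the name `generalize_age` are assumptions for this sketch.

```python
def generalize_age(age, width=10):
    """Map an exact age to a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


print(generalize_age(34))
```

A wider band gives more anonymity and less precision, which is the privacy and utility trade-off in miniature.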

Question 18

Q

4) Swapping
(Techniques)

Answer

A

Rearranging the values of an attribute across records, so that each value still appears in the dataset but is no longer linked to its original record.
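
A minimal sketch of swapping, assuming a plain random shuffle of one column is acceptable. The function `swap_attribute` and the sample data are hypothetical.

```python
import random


def swap_attribute(records, attribute, seed=None):
    """Shuffle one attribute's values across records; other columns stay put."""
    rng = random.Random(seed)
    values = [row[attribute] for row in records]
    rng.shuffle(values)
    return [{**row, attribute: value} for row, value in zip(records, values)]


# Made-up sample data.
records = [
    {"dept": "HR", "salary": 4000},
    {"dept": "IT", "salary": 5200},
    {"dept": "Ops", "salary": 4700},
    {"dept": "Legal", "salary": 6100},
]

swapped = swap_attribute(records, "salary", seed=1)
print(swapped)
```

A random shuffle can leave some values in their original row, so a real deployment may need extra checks. Column totals and averages are preserved because the set of values does not change.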

Question 19

Q

5) Data Perturbation
(Techniques)

Answer

A

Modifying the original data values so that they are slightly different, for example by rounding or by adding random noise.
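
A sketch of perturbation by adding small random noise. The noise range of plus or minus 2 and the name `perturb` are assumptions for illustration.

```python
import random


def perturb(values, max_noise=2, seed=None):
    """Add a random integer in [-max_noise, max_noise] to each value."""
    rng = random.Random(seed)
    return [v + rng.randint(-max_noise, max_noise) for v in values]


ages = [34, 29, 51, 46]  # made-up sample data
noisy_ages = perturb(ages, max_noise=2, seed=7)
print(noisy_ages)
```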

Question 20

Q

6) Synthetic Data
(Techniques)

Answer

A

Artificially creating data that captures the statistical distributions of the original data.
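
A deliberately simple sketch of synthetic data: it assumes a single numeric column that is roughly normally distributed, estimates its mean and standard deviation, and samples new values. Real synthetic data generators model far more structure; `synthesize` is a hypothetical helper.

```python
import random
import statistics


def synthesize(values, n, seed=None):
    """Sample n artificial values from a normal fit of the original values."""
    rng = random.Random(seed)
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


scores = [50, 60, 70, 80, 90]  # made-up sample data
synthetic_scores = synthesize(scores, n=5000, seed=0)
print(round(statistics.mean(synthetic_scores), 1))
```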

Question 21

Q

7) Data Aggregation
(Techniques)

Answer

A

Converting a dataset from individual records to summarized values.
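
A minimal sketch of aggregation, assuming records are grouped by one column and summarized as a count and a mean. The function `aggregate` and the sample data are made up for this example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(records, group_by, value):
    """Summarize individual records as a count and mean per group."""
    groups = defaultdict(list)
    for row in records:
        groups[row[group_by]].append(row[value])
    return {
        key: {"count": len(vals), "mean": mean(vals)}
        for key, vals in groups.items()
    }


# Made-up sample data.
records = [
    {"ward": "A", "bill": 100},
    {"ward": "A", "bill": 200},
    {"ward": "B", "bill": 300},
]

summary = aggregate(records, "ward", "bill")
print(summary)
```

Groups with very few records (ward "B" here) can still point to an individual, so small groups are often merged or suppressed before release.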

Question 22

Q

8) K-anonymity
(Techniques)

Answer

A

A property of a dataset where each combination of quasi-identifier values appears in at least k records, so that no individual can be distinguished from at least k-1 others.
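
A short sketch of how the k value of a dataset could be measured: count how many records share each combination of quasi-identifier values and take the smallest group. The function name `k_anonymity` and the sample records are assumptions for illustration.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    counts = Counter(
        tuple(row[q] for q in quasi_identifiers) for row in records
    )
    return min(counts.values()) if counts else 0


# Made-up sample data.
records = [
    {"age_band": "30-39", "postal": "12", "diagnosis": "flu"},
    {"age_band": "30-39", "postal": "12", "diagnosis": "asthma"},
    {"age_band": "40-49", "postal": "15", "diagnosis": "flu"},
    {"age_band": "40-49", "postal": "15", "diagnosis": "diabetes"},
]

k = k_anonymity(records, ["age_band", "postal"])
print(k)
```

If the measured k is below the target, further generalization or suppression of the quasi-identifiers is typically applied and the check is repeated.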

Question 23

Q

9) Pseudonymization

Answer

A

Replacing identifying data with made-up values that have no relationship to the original values.
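
A minimal sketch of pseudonymization using random tokens, so that the replacement values carry no information about the originals. The function `pseudonymize` is hypothetical, and the token length is an arbitrary choice.

```python
import secrets


def pseudonymize(records, attribute):
    """Replace each distinct value of `attribute` with a random token.

    Returns the new records and the mapping from original value to pseudonym.
    """
    mapping = {}
    result = []
    for row in records:
        original = row[attribute]
        if original not in mapping:
            mapping[original] = secrets.token_hex(8)  # random, unrelated to the original
        result.append({**row, attribute: mapping[original]})
    return result, mapping


# Made-up sample data.
records = [
    {"name": "Alice", "visit": 1},
    {"name": "Bob", "visit": 1},
    {"name": "Alice", "visit": 2},
]

pseudonymized, mapping = pseudonymize(records, "name")
print(pseudonymized)
```

The same person receives the same pseudonym in every record, which keeps records linkable. The mapping allows re-identification, so it must be kept secret or destroyed.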