Data Publishing Flashcards

Question 1

Q

Privacy Threats

Answer

A

Membership disclosure: An attacker learns that an individual’s data is in a dataset of a sensitive nature
Attribute disclosure: An individual’s data is in a dataset, and all records in this individual’s anonymity set share the same sensitive attribute value
Record disclosure: An individual’s data is in a dataset, and this individual’s anonymity set contains only one record

Question 2

Q

K-Anonymity

Answer

A

Each person contained in the database cannot be distinguished from at least k-1 other individuals whose information also appears in the released database
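-> privacy/utility trade-off: a larger k requires more generalization or suppression, which reduces data utility
-> does not provide privacy when the sensitive values within an equivalence class lack diversity (homogeneity attack)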
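
The definition can be checked mechanically by grouping records on their quasi-identifiers and taking the smallest group size. The following is a minimal sketch; the function name `k_anonymity_level`, the column names, and the sample records are illustrative assumptions, not part of the flashcards.

```python
from collections import Counter


def k_anonymity_level(records, quasi_identifiers):
    """Return the size of the smallest equivalence class,
    i.e. the largest k for which the table is k-anonymous."""
    classes = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(classes.values()) if classes else 0


# Hypothetical, already generalized table (assumed for illustration).
records = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "cancer"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "asthma"},
]

k = k_anonymity_level(records, ["zip", "age"])
print(k)
```

In this sample the first equivalence class is 2-anonymous, yet both of its records carry the same disease, so k-anonymity alone does not hide the sensitive value.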

Question 3

Q

l-Diversity

Answer

A

An equivalence class has l-diversity if there are at least l well-represented values for the sensitive attribute
-> does not consider the semantics of sensitive values: distinct values may still be semantically similar (similarity attack)
-> does not consider the overall distribution of sensitive values (skewness attack)
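
One simple reading of "well-represented" is "distinct": every equivalence class must contain at least l different sensitive values. The sketch below checks only this distinct variant (stricter variants such as entropy l-diversity exist); the function name and the sample table are assumptions made for illustration.

```python
from collections import defaultdict


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found
    in any equivalence class (the largest l under distinct l-diversity)."""
    classes = defaultdict(set)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        classes[key].add(record[sensitive])
    return min(len(values) for values in classes.values()) if classes else 0


# Hypothetical generalized table (assumed for illustration).
table = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "cancer"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "asthma"},
]

l = distinct_l_diversity(table, ["zip", "age"], "disease")
print(l)
```

The result is limited by the least diverse class: here one class holds a single disease value, so the table is only 1-diverse.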

Question 4

Q

t-Closeness

Answer

A

An equivalence class has t-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is no more than a threshold t
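
To make the definition concrete, the sketch below computes the largest distance between any class distribution and the whole-table distribution; the table satisfies t-closeness if that value is at most t. The flashcard does not name a distance measure, so total variation distance is used here as an assumed, simple stand-in (the Earth Mover's Distance is a common alternative). All names and data are illustrative.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between a class distribution and the table distribution."""
    if not records:
        return 0.0
    overall = distribution([r[sensitive] for r in records])
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])
    return max(
        total_variation(distribution(values), overall)
        for values in classes.values()
    )


# Hypothetical generalized table (assumed for illustration).
rows = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "cancer"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "asthma"},
]

distance = max_class_distance(rows, ["zip", "age"], "disease")
t = 0.3  # assumed threshold
print(distance <= t)
```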

Question 5

Q

How to achieve Differential Privacy

Answer

A

Input perturbation:
-> Add noise directly to the database
+ independent of the algorithm & easy to reproduce
- determining the amount of required noise is difficult
Output perturbation:
-> Add noise to the function (statistic) output
+ easier to control privacy & better guarantees than input perturbation
- results cannot be reproduced
Algorithm perturbation:
-> Inherently add noise to the algorithm
+ algorithm can be optimized with the noise addition
- difficult to generalize & depends on the inputs
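
Output perturbation is commonly illustrated with the Laplace mechanism: compute the true statistic, then add Laplace noise with scale sensitivity / epsilon. The sketch below applies it to a counting query, which has sensitivity 1. The function names, the `ages` data, and the choice of epsilon are assumptions for illustration; this is not a hardened implementation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(values, predicate, epsilon, rng=None):
    """Output perturbation for a counting query.

    Adding or removing one individual changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data (assumed for illustration).
ages = [23, 35, 41, 52, 29, 64]
rng = random.Random(0)  # seeded only to make this demonstration repeatable

first = noisy_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
second = noisy_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(first, second)
```

Each call draws fresh noise, so repeated queries return different answers. A smaller epsilon means a larger noise scale, giving stronger privacy at the cost of accuracy.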