Data Publishing Flashcards

1
Q

Privacy Threats

A
  • Membership disclosure: An adversary learns that an individual’s data is in a dataset of a sensitive nature
  • Attribute disclosure: An individual’s data is in a dataset, and all records in this individual’s anonymity set share the same sensitive attribute value, so that value is revealed
  • Record disclosure: An individual’s data is in a dataset, and this individual’s anonymity set contains only one record, so the record can be linked to the individual
2
Q

K-Anonymity

A

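Each person contained in the released database cannot be distinguished, based on the quasi-identifiers, from at least k-1 other individuals whose information also appears in the released database
-> privacy/utility trade-off: a larger k requires more generalization or suppression, which lowers data utility
-> does not provide privacy when the sensitive values within a group lack diversity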
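
The definition above can be checked mechanically by grouping records on their quasi-identifiers. The following is a minimal sketch, not a reference implementation; the function name `is_k_anonymous`, the column names, and the sample records are hypothetical and chosen only for illustration.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical, already generalized records (illustration only).
records = [
    {"zip": "476**", "age": "20-29", "disease": "flu"},
    {"zip": "476**", "age": "20-29", "disease": "flu"},
    {"zip": "479**", "age": "30-39", "disease": "cancer"},
    {"zip": "479**", "age": "30-39", "disease": "flu"},
]

print(is_k_anonymous(records, ["zip", "age"], 2))  # True
print(is_k_anonymous(records, ["zip", "age"], 3))  # False
```

The sample table is 2-anonymous, yet both records in the first group carry the same disease, so the sensitive value of anyone in that group is still revealed. This is the lack-of-diversity weakness noted on the card.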

3
Q

l-Diversity

A
  • An equivalence class has l-diversity if there are at least l well-represented values for the sensitive attribute
    -> does not consider semantics of sensitive values
    -> does not consider overall distribution of sensitive values
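
As a rough illustration, the sketch below computes the largest l for which a table is l-diverse, assuming the simplest reading of "well-represented", namely l distinct sensitive values per equivalence class (other readings, such as entropy-based ones, exist). The function name `distinct_l_diversity` and the sample records are hypothetical.

```python
from collections import defaultdict

def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values in any equivalence class."""
    classes = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        classes[key].add(r[sensitive])
    # An empty table has no classes; report 0 in that case.
    return min((len(values) for values in classes.values()), default=0)

# Hypothetical records (illustration only).
records = [
    {"zip": "476**", "disease": "gastric ulcer"},
    {"zip": "476**", "disease": "gastritis"},
    {"zip": "479**", "disease": "flu"},
    {"zip": "479**", "disease": "cancer"},
]

print(distinct_l_diversity(records, ["zip"], "disease"))  # 2
```

The table is 2-diverse under this reading, but both values in the first class are stomach conditions, so an adversary still learns the kind of illness. That is the "semantics" limitation listed above.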
4
Q

t-Closeness

A
  • An equivalence class has t-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is no more than a threshold t
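
The card does not name a distance measure. The sketch below assumes total variation distance (half the L1 distance between the two distributions), which is one common choice for categorical attributes; the function names and sample records are hypothetical, and the table is assumed to be non-empty.

```python
from collections import Counter, defaultdict

def distribution(values):
    """Relative frequency of each value in a non-empty list."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two discrete distributions (assumed measure)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between any class distribution and the whole-table distribution."""
    overall = distribution([r[sensitive] for r in records])
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])
    return max(total_variation(distribution(v), overall) for v in classes.values())

# Hypothetical records (illustration only).
records = [
    {"zip": "476**", "disease": "flu"},
    {"zip": "476**", "disease": "flu"},
    {"zip": "479**", "disease": "cancer"},
    {"zip": "479**", "disease": "flu"},
]

print(max_class_distance(records, ["zip"], "disease"))  # 0.25
```

Under these assumptions the sample table satisfies t-closeness for any threshold t of at least 0.25.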
5
Q

How to achieve Differential Privacy

A
  • Input perturbation:
    -> Add noise directly to the database
    + independent of the algorithm & easy to reproduce
    - determining the amount of required noise is difficult
  • Output perturbation:
    -> Add noise to the function (statistic) output
    + easier to control privacy & better guarantees than input perturbation
    - results cannot be reproduced
  • Algorithm perturbation:
    -> Inherently add noise to the algorithm
    + algorithm can be optimized with the noise addition
    - difficult to generalize & depends on the inputs
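
Output perturbation is commonly illustrated with the Laplace mechanism applied to a counting query. The sketch below assumes a query sensitivity of 1 (adding or removing one person changes the count by at most 1) and a privacy parameter `epsilon`; the names `laplace_noise` and `noisy_count` and the sample data are hypothetical, and the code is a teaching sketch rather than a hardened implementation.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def noisy_count(records, predicate, epsilon, rng=None):
    """Counting query with Laplace noise of scale sensitivity / epsilon (sensitivity = 1)."""
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data (illustration only).
ages = [23, 35, 41, 29, 52, 38]
over_30 = lambda age: age > 30

# Each call draws fresh noise, so repeated queries return different answers.
print(noisy_count(ages, over_30, epsilon=1.0))
print(noisy_count(ages, over_30, epsilon=1.0))
```

A smaller `epsilon` means a larger noise scale and therefore stronger privacy at the cost of accuracy. Passing a seeded `random.Random` makes a run repeatable for demonstration purposes only; in normal use the fresh noise on every call is why the results cannot be reproduced.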