Privacy Enhancing Technologies Flashcards
Which requirement categories of a system may conflict with privacy?
Functionality, efficiency, accountability
What are key attributes vs quasi-identifiers?
Key attributes: uniquely identifying information
Quasi-identifiers: attributes that are not identifying on their own but, in combination, can be used to identify users in many cases (e.g. ZIP code, birth date, gender)
What is K-anonymity?
Each person represented in the table is in an anonymity set of at least k people.
Or: A table is k-anonymous if every combination of quasi-identifier values present in the released table appears in at least k records.
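The second definition can be checked mechanically. A minimal sketch, assuming the table is a list of dicts; the function name, column names, and sample rows are made up for illustration:

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())

# Hypothetical released table: zip and age are quasi-identifiers,
# disease is the sensitive attribute.
rows = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "cancer"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
]

print(is_k_anonymous(rows, ["zip", "age"], k=2))
print(is_k_anonymous(rows, ["zip", "age"], k=3))
```

Each quasi-identifier combination occurs exactly twice here, so the table is 2-anonymous but not 3-anonymous.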
Name two techniques to reach k-anonymity
- Generalization: replace quasi-identifiers with less specific values
- Suppression: withhold a value (or a whole record) entirely, e.g. replace it with "*" (extreme case of generalization)
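Both techniques can be sketched as small value transformations. The helper names, the masking character, and the bucket width below are assumptions chosen for illustration, not a fixed scheme:

```python
def generalize_zip(zip_code, keep=3):
    """Generalization: keep the first `keep` digits, mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def generalize_age(age, width=10):
    """Generalization: replace an exact age with its range of `width` years."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress(value):
    """Suppression: drop the value entirely, whatever it was."""
    return "*"

record = {"zip": "13053", "age": 27, "name": "Alice"}
released = {
    "zip": generalize_zip(record["zip"]),
    "age": generalize_age(record["age"]),
    "name": suppress(record["name"]),
}
print(released)
```

Calling generalize_zip with keep=0 masks every digit, which is why suppression can be seen as the extreme case of generalization.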
Describe two attacks on k-anonymity
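- Homogeneity Attack: exploit a lack of diversity in an equivalence class -> if all its records share the same sensitive value, that value is revealed for everyone in the class
- Background Knowledge Attack: use background knowledge about the target to rule out sensitive values -> may be problematic even if records are diverse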
What is L-diversity?
A table is L-diverse if every equivalence class in the table has at least L different values of the sensitive attribute.
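A minimal sketch of this check, assuming the table is a list of dicts and that equivalence classes are formed by identical quasi-identifier values; all names and sample rows are illustrative:

```python
from collections import defaultdict

def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    classes = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].add(row[sensitive])
    return all(len(values) >= l for values in classes.values())

# Hypothetical 2-anonymous table; the second class contains only "flu".
table = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "cancer"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
]

print(is_l_diverse(table, ["zip", "age"], "disease", l=2))
```

The second class is homogeneous, so the table fails 2-diversity even though it is 2-anonymous.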
Name three other types of L-diversity
Probabilistic L-diversity, Entropy L-diversity, Recursive (c,L)-diversity
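As an illustration of one of these variants: entropy L-diversity is commonly stated as requiring that the entropy of the sensitive attribute in every equivalence class is at least log(L). A rough sketch under that definition, with each class given as a plain list of sensitive values (the input layout and the small float tolerance are assumptions):

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (natural log) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())

def is_entropy_l_diverse(classes, l, tolerance=1e-12):
    """True if every class has entropy >= log(l); tolerance absorbs float rounding."""
    return all(entropy(values) >= math.log(l) - tolerance for values in classes)

balanced = [["flu", "cancer"], ["flu", "cancer"]]
skewed = [["flu", "flu", "flu", "cancer"]]

print(is_entropy_l_diverse(balanced, l=2))
print(is_entropy_l_diverse(skewed, l=2))
```

The skewed class contains two distinct values, so it passes the plain distinct-value check for L = 2, but its entropy is below log(2), so it fails the entropy variant.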
Name two attacks on L-diversity
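- Skewness Attack: if the distribution of sensitive values in a block differs strongly from the overall distribution, membership in that block is already telling
- Similarity Attack: the sensitive values in a block may be distinct but semantically similar (e.g. all stomach-related diseases)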
What is t-Closeness?
t-closeness is defined via the distance between two distributions:
- Q: distribution of the sensitive attribute value in the whole table
- P: distribution of the sensitive attribute value in one block
A table has t-closeness if for every block the distance between P and Q is at most a threshold t
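A minimal sketch of this check. The card does not fix the distance measure; the sketch assumes the variational distance (half the L1 distance) between the two distributions, and it assumes a distance exactly equal to t still passes. Blocks are given as plain lists of sensitive values, and all names are illustrative:

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    n = len(values)
    return {value: count / n for value, count in Counter(values).items()}

def variational_distance(p, q):
    """Half the L1 distance between two distributions (assumed measure)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def has_t_closeness(blocks, t):
    """True if every block's distribution P is within t of the table-wide Q."""
    q = distribution([value for block in blocks for value in block])
    return all(variational_distance(distribution(b), q) <= t for b in blocks)

blocks = [["flu", "flu"], ["cancer", "cancer"]]
print(has_t_closeness(blocks, t=0.5))
print(has_t_closeness(blocks, t=0.4))
```

In this example Q is 50% flu and 50% cancer, while each block holds a single value, so each block is at distance 0.5 from Q.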
What is the idea behind Differential Privacy?
Differential privacy aims to maximize the accuracy of queries on statistical databases while minimizing the chance of identifying individual records. A query result should barely change whether or not any one person's record is included -> plausible deniability
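One common way to realize this idea (not named on the card) is the Laplace mechanism: add noise with scale sensitivity / epsilon to the true query result. A minimal sketch for a counting query, whose sensitivity is 1; the function names, the epsilon value, and the sample records are assumptions for illustration:

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Noisy count: true count plus Laplace noise with scale 1 / epsilon."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: 20 of 100 people are older than 40.
ages = [25, 31, 47, 52, 38] * 20
noisy = private_count(ages, lambda age: age > 40, epsilon=1.0)
print(noisy)
```

Smaller epsilon means more noise and stronger deniability; larger epsilon means more accurate answers.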
How is access to data monitored in an inverse transparency system?
- Direct access: blocked
- API access: Monitored API wrapper
- Analytical tool access: Monitored plug-ins
What path does data take through the monitored API wrapper?
Request: Request authenticator -> allowance module -> request translator -> data source
Response: data source -> Risk computation -> allowance module, access logger -> response generator
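The two paths can be sketched as a single handler. Everything below is hypothetical: the token check, the allow-list, the risk score (fewer returned records means higher risk), and the idea that the allowance module withholds responses above a risk threshold are assumptions made only to make the stage order concrete:

```python
def compute_risk(records):
    """Hypothetical risk score: the fewer records returned, the higher the risk."""
    return 1.0 / len(records) if records else 0.0

def handle_request(request, data_source, access_log, allowed_users, max_risk=0.5):
    # --- Request path ---
    if not request.get("token"):                 # request authenticator
        return {"status": "unauthenticated"}
    if request["user"] not in allowed_users:     # allowance module
        return {"status": "denied"}
    key = request["resource"].lower()            # request translator
    records = data_source.get(key, [])           # data source

    # --- Response path ---
    risk = compute_risk(records)                 # risk computation
    released = risk <= max_risk                  # allowance module (assumed rule)
    access_log.append(                           # access logger
        {"user": request["user"], "resource": key, "risk": risk, "released": released}
    )
    if released:                                 # response generator
        return {"status": "ok", "data": records}
    return {"status": "denied_by_risk"}

source = {"salaries": [10, 20, 30, 40], "ceo": [99]}
log = []
reply = handle_request(
    {"token": "t", "user": "alice", "resource": "Salaries"}, source, log, {"alice"}
)
print(reply, log)
```

In this sketch every access that reaches the data source is logged, whether or not the data is released.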
What are two benefits of monitored plug-ins vs. APIs?
- Raw data never leaves the plug-in (?)
- Additional benefit of rich access semantics