Anonymization Flashcards
1
Q
Information Security
A
Preservation of confidentiality, integrity and availability of information
2
Q
Data Privacy
A
- Use and governance of personal data and information
- Security is necessary for protecting data, but it is not sufficient for addressing privacy
3
Q
Key points of data protection
A
- Right of every person to decide who may access which of their personal data, and when
4
Q
EU GDPR (German: EU-DSGVO)
A
Protection of natural persons in the automated processing of personal data
5
Q
Personally Identifiable Information (PII)
A
- Any information relating directly or indirectly to specific characteristics of an identified or identifiable natural person (not a company, not a deceased person)
- Identifiers such as a name, an identification number, location data, an online identifier, or other factors specific to the identity of the natural person
6
Q
Pseudonymization
A
- increases security, but doesn’t offer anonymization
- e.g. simple masking (phone numbers), hashes, encryption
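A minimal sketch of the three techniques named above, and of why they do not amount to anonymization. The helper names, the demo key and the phone numbers are made up for illustration. The last step shows that an unsalted hash over a small value space can be reversed by simply trying every candidate.

```python
import hashlib
import hmac

def mask_phone(phone: str, visible: int = 3) -> str:
    """Replace every digit except the last `visible` ones with '*'."""
    to_mask = max(sum(ch.isdigit() for ch in phone) - visible, 0)
    out = []
    for ch in phone:
        if ch.isdigit() and to_mask > 0:
            out.append("*")
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)

def keyed_pseudonym(value: str, key: bytes) -> str:
    """HMAC-SHA256 of the value, truncated to 16 hex characters for readability."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def unsalted_hash(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only

masked = mask_phone("+49 170 1234567")
p1 = keyed_pseudonym("alice@example.com", SECRET_KEY)
p2 = keyed_pseudonym("alice@example.com", SECRET_KEY)  # same input, same pseudonym

# Reversing an unsalted hash: only 10,000 candidates need to be tried here.
target = unsalted_hash("555-0142")
recovered = next(
    f"555-{n:04d}" for n in range(10_000) if unsalted_hash(f"555-{n:04d}") == target
)

print(masked)     # +** *** ****567
print(p1 == p2)   # True: records stay linkable
print(recovered)  # 555-0142
```

Because the same input always maps to the same pseudonym, records of one person remain linkable, and whoever holds the key (or can enumerate the inputs) can re-identify them.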
7
Q
Anonymization
A
- Goal: “Anonymized data is no longer restricted by GDPR”
- irreversible
8
Q
Removal of Identifiers
A
- Remove PII
- -> Linkage attack with quasi-identifiers (a combination of individually non-identifying attributes that together uniquely identify an individual), e.g. ZIP code, birth date and gender uniquely identify 87% of the U.S. population
- -> Differencing attack: exploit differences in result sets
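A linkage attack can be sketched as a join on the quasi-identifiers. All records, names and the `link` helper below are invented for illustration. The "released" table has no names, yet every row whose quasi-identifier combination matches exactly one person in a public register is re-identified.

```python
QID = ("zip", "birth_date", "gender")

# Released medical data: direct identifiers removed, quasi-identifiers kept.
medical = [
    {"zip": "10115", "birth_date": "1980-03-14", "gender": "F", "diagnosis": "asthma"},
    {"zip": "10115", "birth_date": "1975-11-02", "gender": "M", "diagnosis": "diabetes"},
    {"zip": "20095", "birth_date": "1990-06-30", "gender": "F", "diagnosis": "migraine"},
]

# Public register (e.g. a voter list) that contains names.
register = [
    {"name": "Anna Example", "zip": "10115", "birth_date": "1980-03-14", "gender": "F"},
    {"name": "Ben Example", "zip": "10115", "birth_date": "1975-11-02", "gender": "M"},
    {"name": "Cara Example", "zip": "20095", "birth_date": "1990-06-30", "gender": "F"},
    {"name": "Dana Example", "zip": "20095", "birth_date": "1990-06-30", "gender": "F"},
]

def link(released, public, qid=QID):
    """Return {name: diagnosis} for every released row with a unique register match."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[a] for a in qid), []).append(person["name"])
    matches = {}
    for row in released:
        names = index.get(tuple(row[a] for a in qid), [])
        if len(names) == 1:  # unique match: the row is re-identified
            matches[names[0]] = row["diagnosis"]
    return matches

reidentified = link(medical, register)
print(reidentified)  # {'Anna Example': 'asthma', 'Ben Example': 'diabetes'}
```

The third medical row stays ambiguous only because two register entries share its quasi-identifiers, which is exactly the situation k-anonymity tries to enforce for every row.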
9
Q
k-Anonymity
A
- Generalize quasi-identifiers until there are at least k records for each group
- Suppress any remaining records violating the k-anonymity property
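The two steps can be sketched as follows. As a simplifying assumption, the sketch applies one fixed generalization level (truncated ZIP code, 10-year age buckets) instead of searching for the weakest level that works. The records and function names are made up.

```python
from collections import Counter

def generalize(record, zip_digits, age_bucket):
    """Coarsen the quasi-identifiers: keep a ZIP prefix and bucket the age."""
    zip_code = record["zip"][:zip_digits] + "*" * (len(record["zip"]) - zip_digits)
    low = (record["age"] // age_bucket) * age_bucket
    return (zip_code, f"{low}-{low + age_bucket - 1}")

def k_anonymize(records, k, zip_digits=3, age_bucket=10):
    """Generalize all records, then suppress those in groups smaller than k."""
    keys = [generalize(r, zip_digits, age_bucket) for r in records]
    sizes = Counter(keys)
    released, suppressed = [], 0
    for record, key in zip(records, keys):
        if sizes[key] >= k:
            released.append({"zip": key[0], "age": key[1], "disease": record["disease"]})
        else:
            suppressed += 1
    return released, suppressed

def is_k_anonymous(rows, qid, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[a] for a in qid) for row in rows)
    return all(size >= k for size in groups.values())

records = [
    {"zip": "13053", "age": 28, "disease": "flu"},
    {"zip": "13068", "age": 29, "disease": "cancer"},
    {"zip": "13068", "age": 21, "disease": "flu"},
    {"zip": "14850", "age": 50, "disease": "flu"},
    {"zip": "14853", "age": 55, "disease": "asthma"},
    {"zip": "14850", "age": 47, "disease": "cancer"},
]

released, suppressed = k_anonymize(records, k=2)
print(len(released), suppressed)                     # 5 1
print(is_k_anonymous(records, ("zip", "age"), 2))    # False
print(is_k_anonymous(released, ("zip", "age"), 2))   # True
```

The 47-year-old ends up alone in the group ("148**", "40-49") and is suppressed, which illustrates the utility cost of the approach.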
10
Q
k-Anonymity issues
A
- Homogeneity attack (same disease)
- Background knowledge attack (statistics, probabilities)
- Curse of dimensionality -> more attributes mean more unique attribute combinations; either suppress more records or accept a reduced level of anonymity
11
Q
l-diversity
A
- prevents homogeneity attack by ensuring at least l “well-represented” values per QID group
- Remaining weakness: skewness attack
- Remaining weakness: similarity attack
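A small check for the simplest reading of “well-represented”, namely counting distinct sensitive values per QID group (often called distinct l-diversity; other definitions exist). The tables and the function name are illustrative assumptions.

```python
from collections import defaultdict

def distinct_l_diversity(rows, qid, sensitive):
    """Return the smallest number of distinct sensitive values in any QID group."""
    values = defaultdict(set)
    for row in rows:
        values[tuple(row[a] for a in qid)].add(row[sensitive])
    return min((len(v) for v in values.values()), default=0)

QID = ("zip", "age")

# 3-anonymous, but the first group is homogeneous: everyone in it has the flu.
homogeneous = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "148**", "age": "40-59", "disease": "flu"},
    {"zip": "148**", "age": "40-59", "disease": "asthma"},
    {"zip": "148**", "age": "40-59", "disease": "cancer"},
]

# Same groups, but each now has at least two distinct diseases.
diverse = [dict(row) for row in homogeneous]
diverse[0]["disease"] = "cancer"

print(distinct_l_diversity(homogeneous, QID, "disease"))  # 1
print(distinct_l_diversity(diverse, QID, "disease"))      # 2
```

A result of 1 means that at least one group leaks its sensitive value to anyone who can place a person in that group, regardless of how large k is.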
12
Q
t-Closeness
A
- Sensitive attribute’s distribution in each QID group should be “close” to its global distribution to prevent skewness and similarity attacks
- How meaningful is the data if the distribution in all QID groups is similar?
- General issue: false assumption that the attacker will only use some attributes to identify individuals
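A sketch of the “closeness” check. As an assumption, it measures distance with total variation distance, a simple choice for categorical attributes; t-closeness is usually defined with Earth Mover's Distance, so treat this as a stand-in. The data and function names are made up.

```python
from collections import Counter, defaultdict

def distribution(values):
    """Relative frequency of each value."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def max_group_distance(rows, qid, sensitive):
    """Largest distance between any QID group's distribution and the global one."""
    overall = distribution([row[sensitive] for row in rows])
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in qid)].append(row[sensitive])
    return max(total_variation(distribution(v), overall) for v in groups.values())

rows = [
    {"zip": "130**", "disease": "flu"},
    {"zip": "130**", "disease": "flu"},
    {"zip": "148**", "disease": "cancer"},
    {"zip": "148**", "disease": "flu"},
]

distance = max_group_distance(rows, ("zip",), "disease")
print(distance)          # 0.25
print(distance <= 0.3)   # True: satisfies t = 0.3 under this distance
print(distance <= 0.2)   # False: violates t = 0.2
```

Lowering t forces every group to look like the whole table, which is the utility concern raised in the second bullet.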
13
Q
summary of generalization approach
A
- Data released under group-based anonymization can still be re-identified via identifiers, quasi-identifiers and sensitive attributes
- Game of cat-and-mouse
- Conflict between data utility and anonymity
- Always depends on the adversary’s side knowledge
14
Q
k-anonymity vs. Differential Privacy
A
- Privacy conditioned on data set vs. Privacy conditioned on function
- Hide individual in data subset (“crowd”) vs. Hide individual’s impact on function
- Syntactic privacy (attributes are either “public” or “sensitive”) vs. semantic privacy
- Informal privacy guarantee vs. formal privacy guarantee
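To make “hide individual’s impact on function” concrete, here is a minimal sketch of the Laplace mechanism on a counting query. A count changes by at most 1 when one person is added or removed, so Laplace noise with scale 1/epsilon is the standard calibration. The data, the helper names and the epsilon value are illustrative assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(rows, predicate, epsilon, rng):
    """Counting query (sensitivity 1) with Laplace noise of scale 1/epsilon."""
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)

def has_cancer(patient):
    return patient["disease"] == "cancer"

patients = [  # made-up records
    {"name": "P1", "disease": "flu"},
    {"name": "P2", "disease": "cancer"},
    {"name": "P3", "disease": "flu"},
]
without_p2 = [p for p in patients if p["name"] != "P2"]

# Exact answers differ by exactly P2's contribution (a differencing attack).
exact_gap = sum(map(has_cancer, patients)) - sum(map(has_cancer, without_p2))

# Noisy answers: the gap between them no longer reveals P2's record reliably.
rng = random.Random(0)
answer_with = noisy_count(patients, has_cancer, epsilon=0.5, rng=rng)
answer_without = noisy_count(without_p2, has_cancer, epsilon=0.5, rng=rng)

print(exact_gap)  # 1
print(answer_with, answer_without)
```

The guarantee is attached to the query function rather than to a published table, which is the contrast with k-anonymity drawn above. A smaller epsilon means more noise and stronger privacy.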