Applying Data Protection Flashcards
Define anonymity. How can we understand it in information technology?
Anonymity is a situation where the identity of the person acting is unknown. In information technology, it refers to data from which the individual whose data is processed cannot be identified.
Give examples of data that can be used to re-identify individuals.
IP addresses, which the Internet Service Provider (ISP) that assigned them can map to an individual,
location data history, which can often reveal an individual,
unique identifiers such as social security numbers, customer numbers, or personnel numbers, which can be mapped to an individual, and
filtering of statistical data, where combining several criteria can narrow a group down to a single individual.
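What is K-Anonymity, and how can we achieve it?
K-anonymity is a concept that manages the conflict between using data and protecting the privacy of the individuals the data describes. A data set is k-anonymous if every record is indistinguishable from at least k-1 other records with respect to its identifying attributes (quasi-identifiers). This compromise is achieved by making the data less accurate.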
Which methods can we use to achieve K-Anonymity?
Suppression: Attributes are removed completely, for example the name and the country.
Generalization: Specific values are replaced with broader categories. Example: instead of the exact age (29), use a range (20 to 30) or a bound (<30).
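The two methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, and the helper functions anonymize and k_of are all hypothetical, and the age is generalized to decade ranges (20-29) rather than the 20 to 30 range mentioned above.

```python
from collections import Counter

# Hypothetical toy records; all names and values are invented for illustration.
records = [
    {"name": "Alice", "country": "DE", "age": 29, "zip": "10115", "diagnosis": "flu"},
    {"name": "Bob", "country": "FR", "age": 24, "zip": "10117", "diagnosis": "asthma"},
    {"name": "Carol", "country": "DE", "age": 35, "zip": "10243", "diagnosis": "flu"},
    {"name": "Dave", "country": "IT", "age": 38, "zip": "10245", "diagnosis": "diabetes"},
]

def anonymize(record):
    """Suppress name and country; generalize age to a decade and zip to a prefix."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",          # generalization: 29 becomes "20-29"
        "zip": record["zip"][:3] + "**",    # generalization: keep only the prefix
        "diagnosis": record["diagnosis"],   # sensitive value, kept for analysis
    }

def k_of(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

anonymized = [anonymize(r) for r in records]
k_before = k_of(records, ["age", "zip"])     # every raw record is unique
k_after = k_of(anonymized, ["age", "zip"])   # records now fall into groups
print(k_before, k_after)
```

In this toy data set each raw record is unique, while after suppression and generalization every record shares its age range and zip prefix with one other record, so the result is 2-anonymous.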
What is L-Diversity?
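L-diversity is a form of group-based anonymization and an extension of the k-anonymity model. It requires that each group of records sharing the same identifying attributes contains at least l distinct values of the sensitive attribute, so that membership in a group does not reveal that value. As with k-anonymity, the groups are formed by reducing the granularity of the data representation. Example: instead of the exact age (29), write (2*).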
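A minimal sketch of how l-diversity could be checked, assuming the simplest variant (counting distinct sensitive values per group). The rows, the field names, and the helper l_of are hypothetical and only serve as an illustration.

```python
from collections import defaultdict

def l_of(rows, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found in any group."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return min(len(values) for values in groups.values())

# Hypothetical, already generalized rows: each age group has two members.
rows = [
    {"age": "2*", "diagnosis": "flu"},
    {"age": "2*", "diagnosis": "asthma"},
    {"age": "3*", "diagnosis": "flu"},
    {"age": "3*", "diagnosis": "flu"},
]

l = l_of(rows, ["age"], "diagnosis")
print(l)
```

Both groups contain two records, but everyone in the 3* group has the same diagnosis, so knowing that a person is in that group reveals the diagnosis. The check therefore reports l = 1 for this data.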
What is Differential Privacy and what is it used for?
Differential privacy changes data so that it can still be used statistically while the privacy of individuals is maintained. A common example is a survey that asks individuals whether they possess a certain trait, with each answer randomized before it is recorded.
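The survey example is often implemented with a technique called randomized response. The sketch below assumes the classic two-coin variant, which satisfies differential privacy with ε = ln 3; the function names, the 30% trait rate, and the sample size are hypothetical choices for illustration.

```python
import random

def randomized_response(truth, rng):
    """First coin: heads means answer truthfully, tails means answer with a second coin."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(responses):
    """Invert the noise: P(yes) = 0.5 * p + 0.25, so p = 2 * P(yes) - 0.5."""
    observed = sum(responses) / len(responses)
    return 2 * observed - 0.5

rng = random.Random(42)
true_answers = [i < 3000 for i in range(10000)]  # assumed: 30% possess the trait
responses = [randomized_response(t, rng) for t in true_answers]
estimate = estimate_true_rate(responses)
print(round(estimate, 3))
```

No single recorded answer proves anything about the person who gave it, yet the estimate over all respondents should land close to the true rate of 0.3.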
How does Differential Privacy achieve the requested level of privacy?
Differential privacy adds random noise to a data set in order to achieve the requested level of privacy. The noise can take the form of random dummy records that are added or random changes to existing records. The parameter ε measures the privacy loss: the smaller ε is, the stronger the privacy guarantee and the more noise is needed.
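One common way to relate the noise to ε is the Laplace mechanism, which adds noise to a query result rather than to the records themselves. The following is a minimal sketch under that assumption; the helper names, the age list, and the choice ε = 0.5 are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(rows, predicate, epsilon, rng):
    """A counting query changes by at most 1 per person, so the noise scale is 1/epsilon."""
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [23, 29, 31, 35, 41, 47, 52, 58]  # hypothetical data; 4 values are below 40
noisy = private_count(ages, lambda a: a < 40, epsilon=0.5, rng=rng)
print(noisy)
```

A single noisy count may be off by several units, but the noise averages out over many queries or large counts. Lowering ε increases the noise scale and with it the privacy protection.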