Chapter 2 Flashcards
what are the roles in data collection and data publishing
Data Recipient
Data Publisher
Record Owners
an example of data recipient
Medical Center
(data mining)
an example of data Publisher
Hospital
(data anonymization)
an example of record owners
patients
what are the data attributes
Explicit Identifier
Quasi Identifier (QID)
Sensitive Attributes
Non-Sensitive Attributes
what are explicit identifiers
Data attributes that explicitly identify record owners, e.g., name, identity card number, mobile phone number.
what are Quasi Identifiers (QID)
Data attributes that could potentially identify record owners, e.g., postal code, age, gender.
what are sensitive attributes
Data attributes that are sensitive person-specific information, e.g., salary, disease, disability status.
what are non-sensitive attributes
Data attributes that do not fall into any of the other categories.
what are the roles responsible for data collection
data publisher
record owners
what are the roles responsible for data publishing
data recipient
what are the privacy attacks
record linkage
attribute linkage
table linkage
probabilistic attack
what is the record linkage model
- A small number of records share the same Quasi Identifier (QID) values and form a group
- The victim’s QID matches this group, so the victim is linked to it
- The group leaves only a small number of possibilities for the victim’s record
- With additional information, the adversary can identify the victim’s record within the group
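A minimal Python sketch of why small QID groups matter. The records are made up for illustration (they are not taken from the chapter), and `qid_group_sizes` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical published records: (job, sex, age) is the QID, disease is sensitive.
published = [
    {"job": "Engineer", "sex": "Male", "age": 35, "disease": "Hepatitis"},
    {"job": "Engineer", "sex": "Male", "age": 38, "disease": "Hepatitis"},
    {"job": "Lawyer", "sex": "Male", "age": 38, "disease": "HIV"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "Flu"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "HIV"},
    {"job": "Dancer", "sex": "Female", "age": 30, "disease": "HIV"},
]
QID = ("job", "sex", "age")

def qid_group_sizes(records, qid):
    """Count how many records share each combination of QID values."""
    return Counter(tuple(r[a] for a in qid) for r in records)

sizes = qid_group_sizes(published, QID)

# A group of size 1 leaves the adversary exactly one candidate record.
unique_groups = [group for group, n in sizes.items() if n == 1]
```

The smaller a group in `sizes`, the fewer records the adversary has to choose from once the victim's QID is known.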
example of record linkage model
A hospital wants to publish the patient records in Table 1 to a research center
- The research center has access to the external table, Table 2
- The research center knows that every person with a record in Table 2 has a record in Table 1
- Joining the two tables on the common attributes Job, Sex, and Age may link the identity of a person to his/her Disease
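The join can be sketched in Python. Tables 1 and 2 are not reproduced in these cards, so `table1` and `table2` below are hypothetical stand-ins with invented rows, and `link` is an illustrative helper name.

```python
# Hypothetical stand-in for Table 1: published patient records, names removed.
table1 = [
    {"job": "Engineer", "sex": "Male", "age": 35, "disease": "Hepatitis"},
    {"job": "Lawyer", "sex": "Male", "age": 38, "disease": "HIV"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "Flu"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "HIV"},
]
# Hypothetical stand-in for Table 2: external table that includes names.
table2 = [
    {"name": "Alice", "job": "Writer", "sex": "Female", "age": 30},
    {"name": "Bob", "job": "Lawyer", "sex": "Male", "age": 38},
]
COMMON = ("job", "sex", "age")

def link(external, published, keys):
    """Join the tables on `keys`; map each name to its candidate diseases."""
    result = {}
    for person in external:
        key = tuple(person[k] for k in keys)
        result[person["name"]] = [
            row["disease"]
            for row in published
            if tuple(row[k] for k in keys) == key
        ]
    return result

candidates = link(table2, table1, COMMON)
# Bob matches exactly one record, so his disease is disclosed.
# Alice matches two records, so the join alone leaves two candidates.
```

In this made-up data, a person whose Job, Sex, and Age are unique in the published table is linked to a single Disease value.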
what is the attribute linkage model
- Adversary may not precisely identify the record of the target victim
- Victim belongs to a group that is associated with a set of sensitive attribute values
- Adversary could infer the victim’s sensitive values from the published data
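A rough Python sketch of this inference, assuming the adversary knows the victim's QID values. The records are invented for illustration, and `infer_sensitive` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical published records: (job, sex, age) is the QID, disease is sensitive.
published = [
    {"job": "Dancer", "sex": "Female", "age": 30, "disease": "HIV"},
    {"job": "Dancer", "sex": "Female", "age": 30, "disease": "HIV"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "Flu"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "HIV"},
]
QID = ("job", "sex", "age")

def infer_sensitive(records, victim_qid, qid, sensitive):
    """Return the share of each sensitive value within the victim's QID group."""
    group = [
        r[sensitive]
        for r in records
        if all(r[a] == victim_qid[a] for a in qid)
    ]
    counts = Counter(group)
    return {value: n / len(group) for value, n in counts.items()}

dancer = infer_sensitive(
    published, {"job": "Dancer", "sex": "Female", "age": 30}, QID, "disease"
)
writer = infer_sensitive(
    published, {"job": "Writer", "sex": "Female", "age": 30}, QID, "disease"
)
```

In this made-up data every record in the dancer's group carries the same disease, so the adversary learns it without knowing which record belongs to the victim; the writer's group only narrows the answer to two equally likely values.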