Tracking/Ethics Flashcards

Question 1

Q

How do you approach anonymization in general?

Answer

A

Eliminate fields that would identify a user: hostnames, user IDs, IP addresses
If devices must remain distinguishable: use pseudonyms (random strings) or one-way hashes
Individual fields may be obfuscated, but combinations of fields might still reveal personal information
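
A minimal sketch of the pseudonym idea, assuming records are plain dicts with hypothetical field names (hostname, user_id, ip, mac). It uses a keyed hash (HMAC) rather than a bare hash, on the assumption that identifiers such as MAC addresses come from a small enough space to be guessed by hashing every candidate:

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key; it would have to be stored separately from the data.
SECRET_KEY = secrets.token_bytes(32)

# Hypothetical names of the directly identifying fields.
IDENTIFYING_FIELDS = {"hostname", "user_id", "ip", "mac"}


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Map an identifier to a stable pseudonym via a keyed one-way hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub(record: dict) -> dict:
    """Drop identifying fields but keep a pseudonym so the device stays distinguishable."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["device"] = pseudonymize(record["mac"])
    return cleaned
```

The same MAC address always maps to the same pseudonym, so per-device statistics still work. The remaining fields, taken together, may of course still identify someone.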

Question 2

Q

Explain what device manufacturers do to make tracking harder

Answer

A

MAC address randomization: new MAC address for every probe request
Probe requests without SSID
Systems can no longer easily track your movement

However: once connected, the address is stable
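
To make the randomization concrete, here is a sketch of how a random MAC address can be generated. The function name is hypothetical and real device implementations differ. The only fixed part is the first octet, where the "locally administered" bit is set and the multicast bit is cleared:

```python
import secrets


def random_mac() -> str:
    """Return a random, locally administered, unicast MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set bit 1 (locally administered) and clear bit 0 (multicast) of the first octet.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)
```

An observer who sees a fresh address on every probe request cannot link the probes by address alone.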

Question 3

Q

Name and explain 5 ethics theories and practices

Answer

A

Consequentialism: the end justifies the means (actions are judged by their outcomes)
Deontology: the morality of the action itself is the only measure
Virtue ethics: how we should be rather than what we should do
Principlism: moral principles of autonomy, beneficence, nonmaleficence, justice
Pluralism and casuistry: appeal to common sense when weighing reasons and balancing risks against benefits

Question 4

Q

Name 3 models of data publication

Answer

A

Interactive model: data owner is gatekeeper; others have to submit queries and the data owner returns anonymized answers
“Send me your code”: data owner executes the code on their own system and reports the result (problem: is the code malicious? is it error-free?)
Offline, aka “publish and be damned”: data owner publishes an anonymized dataset
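
The interactive model can be pictured as a small gatekeeper object. This is only a sketch under assumptions: the class name, the count-only interface and the minimum-count rule are all hypothetical, chosen as one possible way of giving "anonymized answers":

```python
from typing import Callable, Optional


class QueryGatekeeper:
    """Data owner keeps the records and only answers sufficiently large count queries."""

    def __init__(self, records: list[dict], min_count: int = 5):
        self._records = records
        self._min_count = min_count

    def count(self, predicate: Callable[[dict], bool]) -> Optional[int]:
        """Return the number of matching records, or None if the group is too small."""
        matches = sum(1 for r in self._records if predicate(r))
        return matches if matches >= self._min_count else None
```

A threshold like this is not sufficient on its own, since several allowed queries can be combined to isolate one person.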

Question 5

Q

What categories of attributes exist in a record?

Answer

A

Identifiers (personally identifiable information, PII): e.g. full name
Quasi-identifiers: identifying only in combination, e.g. the triple (zip code, date of birth, gender)
Sensitive attributes: should not be linkable to individuals
Other

Question 6

Q

Name 3 anonymization objectives

Answer

A

Protect individuals against:

Membership disclosure
Sensitive attribute disclosure
Identity disclosure

Be aware that an attacker can use additional (external) information

Question 7

Q

Name 4 anonymization approaches

Answer

A

Suppression: remove (parts of) attributes
Generalization: limit granularity, e.g. age 21 -> age 20-30
Perturbation: add noise to the data while preserving its general properties
Permutation: Swap association of attributes across records
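
The four approaches can each be sketched in a few lines. All function names, parameters and defaults below are illustrative assumptions, not a reference implementation:

```python
import random


def suppress(zip_code: str, keep: int = 3) -> str:
    """Suppression: remove part of an attribute by masking trailing characters."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def generalize(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age by the range it falls into."""
    low = (age // width) * width
    return f"{low}-{low + width}"


def perturb(values: list[float], scale: float, rng: random.Random) -> list[float]:
    """Perturbation: add zero-mean Gaussian noise, so the mean is roughly preserved."""
    return [v + rng.gauss(0.0, scale) for v in values]


def permute(records: list[dict], attribute: str, rng: random.Random) -> list[dict]:
    """Permutation: shuffle one attribute's values across records."""
    shuffled = [r[attribute] for r in records]
    rng.shuffle(shuffled)
    return [{**r, attribute: v} for r, v in zip(records, shuffled)]
```

For example, generalize(21) yields "20-30" and suppress("12345") yields "123**". permute keeps the overall set of values of the attribute but breaks its link to the individual records.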

Question 8

Q

What is k-Anonymity and how does it work?

Answer

A

Each record should be indistinguishable from at least (k-1) other records on its quasi-identifier (QI) attributes
or: the cardinality of any query result should be at least k

We have to consider all other available data sources and try to prevent records from being linked across them
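
The definition translates directly into a check over groups of records. A sketch, assuming records are dicts and the caller names the quasi-identifier columns (the function name is hypothetical):

```python
from collections import Counter


def is_k_anonymous(records: list[dict], quasi_identifiers: list[str], k: int) -> bool:
    """True if every combination of QI values is shared by at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(size >= k for size in groups.values())
```

Note that this only inspects the table itself. It says nothing about what an attacker can learn by joining the table with other data sources.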

Question 9

Q

Name some problems with k-anonymity

Answer

A

Efficiency problem: it is expensive to find a k-anonymous transformation with maximum utility

Security problem (even with e.g. k=4): k-anonymity only distinguishes identifiers, quasi-identifiers and other attributes, and only protects the quasi-identifiers.

1: The other attributes might have little diversity within a group, so information may still be leaked
2: In the presence of side information, any attribute can become a quasi-identifier

Question 10

Q

What is l-diversity?

Answer

A

Build q-blocks (groups of records that share the same quasi-identifier values).
A q-block is l-diverse if it contains at least l well-represented values for the sensitive attribute S.
A table is l-diverse if every q-block is l-diverse
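
"Well represented" can be interpreted in several ways. The sketch below assumes the simplest one, counting distinct sensitive values per q-block, and uses hypothetical names throughout:

```python
from collections import defaultdict


def is_l_diverse(records: list[dict], quasi_identifiers: list[str],
                 sensitive: str, l: int) -> bool:
    """True if every q-block contains at least l distinct sensitive values."""
    blocks = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        blocks[key].add(r[sensitive])
    return all(len(values) >= l for values in blocks.values())
```

A q-block in which every record has the same sensitive value fails this check for any l above 1, even if the block is large enough to be k-anonymous.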

Question 11

Q

What is t-closeness?

Answer

A

An equivalence class is said to have t-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is not more than t
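
The card does not name a distance measure. t-closeness is usually stated with the Earth Mover's Distance. The sketch below swaps in the total variation distance between categorical distributions as a simpler stand-in for illustration. All names are hypothetical:

```python
from collections import Counter, defaultdict


def distribution(values: list) -> dict:
    """Relative frequency of each value."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def total_variation(p: dict, q: dict) -> float:
    """Total variation distance: half the sum of absolute probability differences."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(records: list[dict], quasi_identifiers: list[str],
                          sensitive: str, t: float) -> bool:
    """True if every equivalence class is within distance t of the whole table."""
    overall = distribution([r[sensitive] for r in records])
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])
    return all(total_variation(distribution(vals), overall) <= t
               for vals in classes.values())
```

A smaller t forces each class to resemble the whole table more closely, which limits what an attacker learns from knowing someone's class but also reduces utility.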

Question 12

Q

How can we practically deal with anonymity vs utility?

Answer

A

Releasing a sanitized data set rarely preserves both privacy and utility, so rich data can only be released under license to designated trusted parties