6 - Big data's end run - Nissenbaum Flashcards

1
Q

What is big data and how is it used?

A

Paradigm of knowledge as data + framework for decision making
Uses power of patterns hidden in massive datasets
Used to get analytic insights

2
Q

What problems are associated with big data, and how are they dealt with?

A

Creates new classes of goods & services => requires legislative principles, BUT without killing development
Solution: anonymization (solves identification, NOT reachability) + informed consent (CANNOT fully specify terms of interaction)
Cons of solution: perceived as the best & only option, yet it is elusive & difficult to implement

3
Q

What is privacy for Nissenbaum?

A

Contextual integrity = right over control of informational flow among social contexts with respect to roles & relations in each specific context (~Rachels definition)
Eg: medical records shared with doctor preserve contextual integrity

4
Q

When are concerns about privacy raised?

A

When expected and actual info flows are different
Eg: race/sex info should not influence hiring decisions (expected), but sometimes it does (actual) => privacy concern

5
Q

What is anonymity, and what are its problems?

A

Elimination of link between data and owner
Problems:
1. Impossible when data is unique info
2. Prone to re-identification attacks (linkage = overlap anonymous dataset with non-anonymous ones, differencing = use multiple queries to get subset of identifying attributes)
3. May not avoid reachability (= contact someone without knowing their identity)
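
The two re-identification attacks named above can be illustrated with a minimal Python sketch. All data, field names, and function names here are hypothetical and chosen only for illustration; the differencing part shows one common form of the attack (subtracting two aggregate queries to isolate one person), which may differ in detail from the formulation in the source text.

```python
# Toy data (hypothetical): a dataset with names removed, and a public one with names.
anonymized = [
    {"zip": "02138", "birth": "1945-07-31", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth": "1960-01-15", "sex": "M", "diagnosis": "diabetes"},
]
public = [
    {"name": "Alice Example", "zip": "02138", "birth": "1945-07-31", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth": "1960-01-15", "sex": "M"},
    {"name": "Carol Example", "zip": "02139", "birth": "1971-03-02", "sex": "F"},
]

# Attributes shared by both datasets; none is a name, but together they can be unique.
QUASI_IDS = ("zip", "birth", "sex")


def link(anonymized, public):
    """Linkage: overlap the anonymous dataset with a non-anonymous one."""
    index = {}
    for person in public:
        key = tuple(person[q] for q in QUASI_IDS)
        index.setdefault(key, []).append(person["name"])
    matches = {}
    for row in anonymized:
        names = index.get(tuple(row[q] for q in QUASI_IDS), [])
        if len(names) == 1:  # a unique match re-identifies the row
            matches[names[0]] = row["diagnosis"]
    return matches


def sum_query(db, predicate):
    """Aggregate-only interface: returns a total, never an individual value."""
    return sum(value for name, value in db.items() if predicate(name))


def differencing(db, target):
    """Differencing: combine two aggregate queries to isolate one person."""
    everyone = sum_query(db, lambda name: True)
    everyone_but_target = sum_query(db, lambda name: name != target)
    return everyone - everyone_but_target


salaries = {"ann": 50000, "ben": 62000, "dana": 70000}
print(link(anonymized, public))
print(differencing(salaries, "dana"))
```

Neither attack needs a name or ID column in the "anonymous" data: the linkage relies only on overlapping attributes, and the differencing relies only on answers that each look harmless on their own.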

6
Q

In which ways is anonymity implemented, and what are the problems with each?

A

Implemented with:
1. Anonymous identifiers: still an identifier, just different from the commonly used one (pseudonym)
Cons: reuse turns them into actual identifiers; if generated following a pattern, they can still identify someone
2. Differential privacy: research field for useful analysis preserving anonymity
Cons: still breakable

Broken by:
1. Comprehensiveness: rich datasets allow identification through attributes => identification without knowing the common identifier
2. Inference: use big data itself to extract hidden knowledge (common ids are just noise)
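
Differential privacy, listed above as one way of implementing anonymity, is often introduced through the Laplace mechanism: add calibrated random noise to an aggregate answer so that any single person's presence changes the output distribution only slightly. The sketch below is a minimal illustration of that standard technique, not something described on the card; the function names, the `epsilon` parameter name, and the toy records are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Noisy count of matching records.

    A count changes by at most 1 when one person is added or removed
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy records: (age, has_condition)
records = [(34, True), (51, False), (29, True), (62, True), (45, False)]
noisy = private_count(records, lambda r: r[1], epsilon=0.5)
print(noisy)  # a different value on each run, centered on the true count of 3
```

A smaller `epsilon` means more noise and stronger protection but less useful answers. Repeating the same query many times and averaging the results erodes the protection, which is one reason such schemes need a limit on the total number of queries.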

7
Q

What is informed consent, and what are its problems?

A

Corollary of privacy as control over info flow
Aims to inform users about collectors of data, which data are collected, how they will be used/shared
1. Difficult to put into practice: privacy policies are not read/understood, even if made readable or using opt-in as default (agree to something before enrolling, instead of disagreeing and exiting after enrollment)
2. Transparency paradox: simplicity & clarity result in loss of fidelity; policies that are complete yet in plain language (if possible at all) become too heavy + disrupt the user experience flow
3. Unpredictability: uses for data unpredictable because of big data paradigm (hidden patterns => hidden scopes) & (potentially) infinite chain of collectors
4. Tyranny of minority: volunteered info about few (~20%) can unlock same info about rest (because of big data), no explicit connection required => very powerful
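
The "tyranny of the minority" point can be made concrete with a deliberately simple sketch: a rule learned from the few who volunteer a sensitive attribute is applied to everyone else who shares a visible attribute with them. The data, the attribute names, and the majority-vote rule are all hypothetical; real inference would rely on far richer models and datasets.

```python
from collections import Counter, defaultdict


def learn_rule(volunteers):
    """Map each visible attribute to the sensitive value most often seen with it."""
    by_visible = defaultdict(Counter)
    for visible, sensitive in volunteers:
        by_visible[visible][sensitive] += 1
    return {v: counts.most_common(1)[0][0] for v, counts in by_visible.items()}


def infer(rule, visible_values):
    """Guess the sensitive value for people who never disclosed it."""
    return [rule.get(v) for v in visible_values]


# A small volunteering minority: (visible behavior, disclosed sensitive attribute)
volunteers = [("buys_A", "X"), ("buys_A", "X"), ("buys_A", "Y"), ("buys_B", "Y")]
rule = learn_rule(volunteers)

# The rest never consented to anything, yet still receive a guess.
print(infer(rule, ["buys_A", "buys_B", "buys_C"]))
```

The people in the second group gave no consent and have no explicit connection to the volunteers; sharing an observable pattern with them is enough.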

8
Q

Are anonymity and informed consent sufficient to deal with privacy concerns and, if not, in which ways can they be improved?

A

Not sufficient (probably a dead end), BUT no actual alternative => still meaningful
Informed consent needs contextualization: cover (with agreements) only departures from the EXPECTED info flow
ALSO the burden of proving that actions over data are legitimate should be moved from users to collectors
