Ch 15 - Emerging Issues Flashcards
IOT and Big Data Facts and Background
Big Data is a term used to describe the nearly ubiquitous collection of data about individuals from multitudinous sources, coupled with the low costs to store such data and the new data mining techniques used to draw connections and make predictions based on this collected information.
IoT: The data analyzed by analytics programs, algorithms, machine learning, and other data mining techniques—the underpinnings of the term Big Data—are often gathered by devices collectively known as the IoT. This next evolution of interaction with computing devices combines sensors placed almost anywhere with connections to the Internet.
The number of sensors connected to the Internet is now counted in the tens of billions.
By 2025, it is estimated that the amount of data will double every 12 hours.
Big Data is characterized by the “three Vs”: velocity (how fast the data is coming in), volume (the amount of data coming in), and variety (what different forms of data are being analyzed).
Microsoft’s CEO Satya Nadella proposed a list of AI design principles, including several that focus on privacy issues: “AI must be designed to help humanity, AI must be designed for intelligent privacy, AI must be transparent, and AI needs algorithmic accountability so humans can undo unintended harm.”
Friends and Family Test
Would the managers feel comfortable if data on themselves and their family and friends were in the database, subject to possible breach?
For instance, would managers at the bank feel comfortable with their own family’s data going into the ACF database?
If not, that is a reason to take greater precautions from a cybersecurity perspective.
Means of preventing Big Data Breach
- Data minimization
- Segmentation
- De-identification
- Collection, purpose, and use limitations (FIPPs)
- Access controls
Direct and indirect identifiers
Direct identifiers = data that identify an individual with little or no additional effort.
Examples: address, phone number
Indirect identifiers = data such as age or gender that can increase the likelihood of identifying an individual.
De-identification terms: pseudonymous, de-identified, anonymous
- Pseudonymous data: Information from which the direct identifiers have been eliminated. Indirect identifiers remain intact.
- De-identified data: Direct and known indirect identifiers have been removed.
- Anonymous data: Direct and indirect identifiers have been removed or technically manipulated to prevent re-identification.
These categories do not result from a single method or from reducing the identifiability of data. Instead, reduction of the risk of re-identification results from a collection of techniques that can be applied to different kinds of data with differing levels of effectiveness.
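The three categories above can be sketched as simple filtering operations. This is a minimal illustration only; the record fields and the sets of direct and indirect identifiers are hypothetical assumptions, not drawn from the text:

```python
# Illustrative sketch of pseudonymization vs. de-identification.
# The field names and identifier sets below are hypothetical.
DIRECT = {"name", "address", "phone"}       # identify with little effort
INDIRECT = {"age", "gender", "zip"}          # increase re-identification risk

def pseudonymize(record):
    """Remove direct identifiers; indirect identifiers remain intact."""
    return {k: v for k, v in record.items() if k not in DIRECT}

def deidentify(record):
    """Remove direct and known indirect identifiers."""
    return {k: v for k, v in record.items()
            if k not in DIRECT and k not in INDIRECT}

record = {"name": "Ana", "phone": "555-0100", "age": 34,
          "gender": "F", "zip": "60601", "purchase": "book"}

print(pseudonymize(record))  # indirect identifiers and purchase remain
print(deidentify(record))    # only the non-identifying attribute remains
```

True anonymization would go further, technically manipulating the remaining values (e.g., via blurring or perturbation, below) so that re-identification is prevented rather than merely made harder.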
Blurring
This technique reduces the precision of disclosed data to reduce the certainty of individual identification.
For example, date of birth is highly identifying (because a small portion of people are born on a particular day of a particular year), but year of birth is less identifying.
Similarly, a broader set of years (such as 1971-1980, or 1981-1990) is less identifying than year of birth.
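Blurring a date of birth as described above can be shown in a few lines. The function names and the specific ten-year bucketing are illustrative assumptions, chosen to match the example ranges (1971-1980, 1981-1990):

```python
from datetime import date

def blur_to_year(dob: date) -> int:
    """Reduce a full date of birth to just the year (less identifying)."""
    return dob.year

def blur_to_decade(dob: date) -> str:
    """Reduce further to a ten-year range such as '1981-1990'."""
    start = (dob.year - 1) // 10 * 10 + 1
    return f"{start}-{start + 9}"

dob = date(1984, 7, 21)
print(blur_to_year(dob))    # 1984
print(blur_to_decade(dob))  # 1981-1990
```

Each step trades precision for privacy: fewer people share a full birth date than a birth year, and fewer share a birth year than a decade.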
Masking
This technique masks the original values in a data set with the goal of protecting data privacy. One way this may be accomplished is perturbation—making small changes to the data while maintaining overall averages—so that it is more difficult to identify individuals.
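A minimal sketch of perturbation, under the assumption that "maintaining overall averages" means the noise sums to zero across the data set (the function name and noise scale are illustrative, not from the text):

```python
import random

def perturb(values, scale=1.0, seed=0):
    """Add small random noise to each value, then subtract the mean of
    the noise so the overall average is exactly unchanged."""
    rng = random.Random(seed)
    noise = [rng.uniform(-scale, scale) for _ in values]
    offset = sum(noise) / len(noise)
    return [v + n - offset for v, n in zip(values, noise)]

ages = [34, 41, 29, 57, 45]
masked = perturb(ages, scale=3.0)
# The averages match, but no individual value is preserved exactly.
print(sum(ages) / len(ages), sum(masked) / len(masked))
```

Aggregate statistics computed over the masked column remain accurate, while any single released value no longer matches the individual's true value.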
Differential Privacy
This technique uses a mathematical approach to ensure that the risk to an individual’s privacy is not substantially increased as a result of being part of the database.
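One common mathematical mechanism used in differential privacy is adding Laplace noise to query results. The sketch below is a simplified illustration for a counting query (sensitivity 1), not a production-grade implementation; the function name is hypothetical:

```python
import random

def private_count(true_count, epsilon, seed=None):
    """Release a count query result with Laplace(0, 1/epsilon) noise.
    The difference of two independent Exponential(epsilon) draws is
    Laplace-distributed, which avoids inverse-CDF arithmetic."""
    rng = random.Random(seed)
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

# Smaller epsilon means more noise and stronger privacy: adding or
# removing any one individual changes the answer distribution only slightly.
print(private_count(1000, epsilon=0.5, seed=42))
```

Because the noise distribution barely shifts when one person's record is added or removed, an observer of the released count cannot confidently infer whether any particular individual is in the database.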
FTC Characterization of a Data Broker in 2014
The FTC characterized the data broker industry as: collecting consumer data from numerous sources, usually without consumers’ knowledge or consent; storing billions of data elements on nearly every U.S. consumer; analyzing data about consumers to draw inferences about them; and combining online and offline data to market to consumers online.
FTC broad categories of products offered by data brokers - 2014
(1) marketing (such as appending data to customer information that a marketing company already has),
(2) risk mitigation (such as information that may reduce the risk of fraud) and
(3) location of individuals (such as identifying an individual from partial information).
For each of these segments of the industry, the FTC suggested that data brokers engage in data minimization practices, review collection practices carefully as they relate to children and teens, and take reasonable precautions to ensure that downstream users did not use the data for discriminatory or criminal purposes.
FTC Report on Big Data (2016)
The agency expressed its understanding that Big Data brought with it significant benefits coupled with significant risks. Examples of the benefits identified included providing healthcare tailored to individual patients, enhancing educational opportunities by tailoring the experience to the individual student, and increasing equal access to employment. Examples of the risks included: exposing sensitive information; reinforcing existing disparities; and creating new justifications for exclusion.
IOT Background
In 2016, estimates for the number of IoT devices in use topped 15 billion worldwide,
with spending on these devices approaching $1 trillion globally.
By 2020, the number of wearable device shipments is estimated to be more than 200 million.
Much of the data in IoT—such as temperature readings, traffic statistics, and sensors around industrial production—often does not implicate PII.
IOT devices share 2 characteristics that are important for privacy and security discussions
(1) the devices interact with software running elsewhere (often in the cloud) and function autonomously and
(2) when coupled with data analysis, the devices may take proactive steps and make decisions about or suggest next steps for users.
Concerns regarding privacy and cybersecurity with respect to IoT devices stem from:
(1) limited user interfaces in the products;
(2) lack of industry experience with privacy and cybersecurity;
(3) lack of incentives in the industries to deploy updates after products are purchased; and
(4) limitations of the devices themselves, such as lack of effective hardware security measures.
Wearables - Issues
Most of this information is not protected by HIPAA, because HIPAA applies only to the activities of covered entities such as providers and health insurance plans
Challenges include:
- Right to be forgotten - data is hard to remember to delete
- Impact of location disclosure - stalking
- Screens read by those nearby
- Video/audio recording without knowledge - e.g., Google Glass
- Lack of control over data - how will it be used?
- Automatic syncing with social media - without controls
- Facial recognition
FPF Best Practices for Wearables
- access, deletion, and correction rights;
- opt-in consent for sharing with third parties;
- sharing of data for scientific research purposes, with informed consent;
- compliance with leading app platform standards and global privacy frameworks;
- strong data security requirements; and strong requirements for de-identification.
Connected Cars
- Examples:
a vehicle that wirelessly alerts the dealership when tires need to be rotated.
app from a car insurance company that records braking habits.
- Privacy experts raise concerns that these configurations place sensitive information at risk to unauthorized access or hacking. The complexity of these issues has led to a situation where numerous federal agencies are considering regulating connected cars: the National Highway Traffic Safety Administration (NHTSA), FTC, and the Federal Communications Commission (FCC).
Smart Homes: Privacy Issues
- Smart thermostats - ransomware
- Smart TVs - not effectively secured; could overhear conversations?
- Communication systems - hacking into home Wi-Fi networks
- Security systems - hackers could use them to break in?
Smart Cities: Basics and examples
primarily refers to municipalities and other government entities using sensors to monitor functions and improve government services.
Examples:
wireless sensors in lighting
garbage collection
parking meters
Smart Cities: DHS Report
Highlighted 3 themes in cyber risks that arise when integrating with city infrastructure:
- Changing seams - between legacy and new systems, and between urban and rural systems - boundaries disappearing
- Inconsistent adoption
- Increased automation
can lead to more threat vectors, cascading failures, and removal of manual overrides.
Surveillance and the Big Brother aspect come into play with smart cities, because they are government led.
Service-surveillance spectrum
Broadband Internet Technical Advisory Group (BITAG) recommendations on IoT and privacy/security (2016)
- IoT devices should follow security and encryption best practices
- For devices that can be customized by the users, the company should test the IoT devices in different possible configurations
- IoT devices should be designed to facilitate automated, secure software updates
- IoT devices should be secured by default by the inclusion of a password
- IoT devices should be shipped originally with reasonably up-to-date software
- IoT devices should be shipped with a privacy policy that is understandable and easy to find
- IoT devices should communicate with restrictive rather than permissive protocols
- IoT devices should continue to function if Internet connectivity is disrupted or if cloud back-up fails
Note: The real risk from IoT is not that an individual device is compromised, but that IoT devices can enable network-wide attacks.
FTC Report on IOT (2015) - Internet of Things: Privacy and Security in a Connected World
- The volume and personal nature of data in IoT heightens the need for protection.
If a device has no consumer interface, no choice is available on the device itself. So, consider choice at point of sale, video tutorials, choice during setup, dashboards and icons, and email/text.
Security risks identified:
- lax security that could allow intruders access to personal information collected by the devices,
- security vulnerabilities on a consumer’s device that could lead to attacks on networks connected to the device,
- and security issues with the devices that could place physical safety at issue, such as changes to instructions for an insulin pump or to a lock on a front door to a house.
In response, FTC urged:
- security by design
- deploy security
- vendor management
- security at several levels
- access controls
- monitor throughout lifecycle
Differential privacy:
An approach for analyzing database content without disclosing information about the user.
*Not sure if it works: there is criticism around this method.