Big Data Flashcards
Definition of Big Data
extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions
Which characteristics of big data are relevant to security, privacy and welfare concerns? (5)
1) Volume
2) Velocity (speed)
3) Variety
4) Variability
5) Complexity
Explain Volume
- huge amounts of data from a wide range of sources (transactions; unstructured streams of text, images, audio, voice, VoIP, video, TV and other media; sensor and machine-to-machine data)
Explain Velocity
- some data is time-sensitive
-> speed is more important than volume
- needs to be stored, processed and analyzed quickly
Explain Variety
- data comes in various formats:
- structured numeric data in traditional databases, unstructured text documents, email, video, audio, financial transactions
Explain Variability
- data flows can vary greatly, with periodic peaks and troughs
- related to social media trends, daily, seasonal and event-triggered peak data loads, and other factors
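Example (a minimal sketch, not a production method): peak loads can be flagged by comparing each hour's event count with a typical baseline. The function name peak_hours, the factor of 2.0 and the sample numbers are all assumptions made for illustration.

```python
from statistics import median

def peak_hours(events_per_hour, factor=2.0):
    """Return the indices of hours whose load exceeds factor * median load."""
    baseline = median(events_per_hour)
    return [hour for hour, count in enumerate(events_per_hour)
            if count > factor * baseline]

# Hypothetical hourly event counts with two event-triggered spikes
load = [100, 120, 110, 900, 130, 95, 105, 400]
print(peak_hours(load))  # [3, 7]
```

The median is used as the baseline because, unlike the mean, it is barely shifted by the spikes themselves.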
Explain Complexity
- data comes from multiple sources, which requires different data preparation strategies such as linking, cleansing, and transforming data across different systems
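Example (a minimal sketch under stated assumptions): the three preparation steps can be shown on two tiny, made-up sources. The field names, the sample records and the helper names clean_email and link_and_transform are hypothetical.

```python
# Hypothetical records from two different systems
crm_rows = [
    {"email": " Alice@Example.com ", "name": "Alice"},
    {"email": "bob@example.com", "name": "Bob"},
]
shop_rows = [
    {"customer_email": "alice@example.com", "amount_cents": "1250"},
    {"customer_email": "BOB@example.com", "amount_cents": "300"},
    {"customer_email": "alice@example.com", "amount_cents": "750"},
]

def clean_email(raw):
    """Cleansing: strip whitespace and normalise case so keys match."""
    return raw.strip().lower()

def link_and_transform(crm, shop):
    """Link both sources on the cleaned email and sum purchases per person."""
    totals = {}
    for row in shop:
        key = clean_email(row["customer_email"])
        # Transforming: amounts arrive as strings in cents
        totals[key] = totals.get(key, 0) + int(row["amount_cents"])
    linked = []
    for row in crm:
        key = clean_email(row["email"])
        linked.append({
            "email": key,
            "name": row["name"],
            "total_eur": totals.get(key, 0) / 100,
        })
    return linked

profiles = link_and_transform(crm_rows, shop_rows)
print(profiles[0])  # {'email': 'alice@example.com', 'name': 'Alice', 'total_eur': 20.0}
```

The linked profile says more about a person than either source does on its own, which is why prepared data raises privacy concerns.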
Explain Collection/storing for the characteristic volume
- high volume -> higher attractiveness for cybercriminals
- amplified technical impact
- transparency principles might be violated
Explain Collection/storing for the characteristic velocity
- customer concerns over privacy are increasing because of behavioral advertising based on real-time profiling and tracking technologies such as cookies
- the individual participation principle of the FIPPs is violated (individuals cannot give or refuse consent to data usage)
What is PII?
- personally identifiable information
- can be used to distinguish or trace an individual's identity (e.g. name, social security number, biometric records)
- highly personal
What are the FIPPs?
The Fair Information Practice Principles (FIPPs) are a set of eight principles regarding data usage, collection, and privacy. They were published by the Organisation for Economic Co-operation and Development (OECD).
Explain Collection/storing for the characteristic variety
- unstructured data is more likely to conceal PII
- large variety makes it more difficult to detect security breaches, react and respond appropriately
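Example (a minimal sketch, not a complete detector): free text can hide PII that a structured column would expose at once, so it has to be searched for. The two patterns below (email addresses and US-style social security numbers) and the names PII_PATTERNS and find_pii are assumptions for illustration; real detection needs far more than two regular expressions.

```python
import re

# Hypothetical, deliberately simple patterns
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text):
    """Return (label, match) pairs for every pattern hit in the text."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits

# Made-up support note with PII buried in unstructured text
note = "Call back J. Doe, SSN 123-45-6789, or mail jdoe@example.org."
print(find_pii(note))
# [('email', 'jdoe@example.org'), ('ssn_like', '123-45-6789')]
```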
Explain Collection/storing for the characteristic variability
- Organizations may lack capabilities to securely store huge amounts of data and manage the collected data during peak data traffic
- Attractiveness as a crime target increases during peak data traffic
Explain Collection/storing for the characteristic complexity
- prepared, complex data is often more personal than the data a person would consent to give
- Data collected from illicit sources is more likely to have information on technologically less savvy consumers
Explain sharing/accessibility by third parties and various user types for the characteristic volume
- firms may need to outsource data analysis to cloud service providers, which may give rise to privacy and security issues