Big Data Flashcards

1
Q

Define Big Data

A

extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions

2
Q

Which characteristics does big data have in relation to security, privacy and welfare concerns? (5)

A

1) Volume
2) Velocity (speed)
3) Variety
4) Variability
5) Complexity

3
Q

Explain Volume

A

- huge amounts of data from a wide range of sources (transactions; unstructured streams of text, images, audio, voice, VoIP, video, TV and other media; sensor and machine-to-machine data)

4
Q

Explain Velocity

A
  • some data is time-sensitive
  • -> for such data, speed is more important than volume
  • it needs to be stored, processed and analyzed quickly
5
Q

Explain Variety

A
  • data comes in various formats:
  • structured: numeric data in traditional databases
  • unstructured: text documents, email, video, audio, financial transactions

6
Q

Explain Variability

A
  • data flows can vary greatly, with periodic peaks and troughs
  • related to social media trends; daily, seasonal and event-triggered peak data loads; and other factors
7
Q

Explain complexity

A

- data comes from multiple sources, which requires different data preparation strategies such as linking, cleansing, and transforming data across different systems

8
Q

Explain Collection/storing for the characteristic volume

A
  • high volume -> higher attractiveness for cybercriminals
  • amplified technical impact of a breach
  • transparency principles might be violated
9
Q

Explain Collection/storing for the characteristic velocity

A
  • customer concerns over privacy are increasing because of behavioral advertising based on real-time profiling and tracking technologies such as cookies
  • the individual participation principle of the FIPPs is violated (individuals cannot give or deny consent to data usage)
10
Q

What is PII?

A
  • personally identifiable information
  • can be used to distinguish or trace an individual's identity (e.g. name, social security number, biometric records)
  • highly personal
11
Q

What are the FIPPs?

A

The Fair Information Practice Principles (FIPPs) are a set of eight principles regarding data usage, collection, and privacy. They were published by the Organisation for Economic Co-operation and Development (OECD).

12
Q

Explain Collection/storing for the characteristic variety

A
  • unstructured data is more likely to conceal PII
  • a large variety makes it more difficult to detect security breaches and to react and respond appropriately
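Why unstructured data tends to conceal PII can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pattern `SSN_PATTERN`, the helper `find_ssn_like` and the sample records are hypothetical, and a single regular expression is nowhere near a complete PII scanner.

```python
import re

# Hypothetical pattern: US social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text: str) -> list[str]:
    """Return all substrings of `text` that look like an SSN."""
    return SSN_PATTERN.findall(text)


# Structured record: the PII sits in a known, labelled field and is easy to protect.
structured_record = {"name": "Jane Doe", "ssn": "123-45-6789"}

# Unstructured text: the same PII is buried in prose and must be searched for.
free_text = "Hi, my SSN is 123-45-6789, please update my file."

print(find_ssn_like(free_text))
```

In the structured record the sensitive field can be masked or access-controlled by name. In free text it has to be discovered first, and every additional format (email, audio, video) would need its own detection approach.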

13
Q

Explain Collection/storing for the characteristic variability

A
  • Organizations may lack capabilities to securely store huge amounts of data and manage the collected data during peak data traffic
  • Attractiveness as a crime target increases during peak data traffic
14
Q

Explain Collection/storing for the characteristic complexity

A
  • prepared, complex data is often more personal than the data a person would consent to give
  • Data collected from illicit sources is more likely to have information on technologically less savvy consumers
15
Q

Explain sharing/accessibility by third parties and various user types for the characteristic volume

A

- firms may need to outsource data analysis to cloud service providers, which may give rise to privacy and security issues

16
Q

Explain sharing/accessibility by third parties and various user types for the characteristic velocity (fast data)

A
  • increase in supply of and demand for location-based, real-time personal information (e.g. stalking people in real time)
  • risk of intruding on a person's private sphere, plus physical security risks
17
Q

Explain sharing/accessibility by third parties and various user types for the characteristic variety

A

- most organizations lack mechanisms to ensure that employees and third parties have appropriate access to unstructured data and that they comply with data protection regulations

18
Q

Explain sharing/accessibility by third parties and various user types for the characteristic variability

A

- peak data traffic may increase the need to outsource to cloud service providers, which may lead to security issues

19
Q

Explain sharing/accessibility by third parties and various user types for the characteristic complexity

A
  • data from different sources can be combined
  • this allows de-identified data to be re-identified
  • -> violation of the FIPPs
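How combining sources can re-identify de-identified data can be sketched as a simple record linkage in Python. All data sets, field names and the helper `link_records` below are hypothetical and chosen only for illustration; real linkage attacks use larger data sets and fuzzier matching.

```python
# De-identified data set: names removed, but quasi-identifiers kept.
health_records = [
    {"zip": "10115", "birth_year": 1985, "gender": "f", "diagnosis": "asthma"},
    {"zip": "80331", "birth_year": 1990, "gender": "m", "diagnosis": "diabetes"},
]

# Second, separately obtained source that still contains names.
public_register = [
    {"name": "A. Example", "zip": "10115", "birth_year": 1985, "gender": "f"},
    {"name": "B. Sample", "zip": "20095", "birth_year": 1972, "gender": "m"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link_records(anonymous, named, keys=QUASI_IDENTIFIERS):
    """Join two data sets on shared quasi-identifiers."""
    # Index the named source by its quasi-identifier combination.
    index = {}
    for row in named:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    linked = []
    for row in anonymous:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the record
            linked.append({"name": names[0], **row})
    return linked


reidentified = link_records(health_records, public_register)
print(reidentified)
```

Neither source reveals a named person's diagnosis on its own. Only the combination does, which is why complexity (many linkable sources) raises the sharing risk.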
20
Q

Give a company as an example for the characteristic volume

A
  • Amazon
  • has a massive database with all the customer details, search history and purchase activities
  • has its own cloud service AWS (Amazon Web Services) to handle data and offer services to other customers
21
Q

Give a company as an example for the characteristic velocity

A
  • Starbucks
  • uses geo-push to make highly targeted offers
  • tracks users via GPS or cell towers
  • geo-fences: virtual boundaries around Starbucks shops
  • crossing them triggers a specific action, such as sending a coupon to your phone
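The geo-fence idea can be sketched in Python as a circular boundary plus a check for the moment a user crosses into it. This is a minimal sketch, not the company's actual implementation: the coordinates, the 200 m radius and the function names `haversine_m` and `crossed_into_fence` are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def crossed_into_fence(prev_pos, cur_pos, fence_center, radius_m):
    """True only when the user moved from outside the fence to inside it."""
    was_inside = haversine_m(*prev_pos, *fence_center) <= radius_m
    is_inside = haversine_m(*cur_pos, *fence_center) <= radius_m
    return is_inside and not was_inside


shop = (52.5200, 13.4050)  # hypothetical shop location
far = (52.5300, 13.4050)   # roughly 1.1 km north of the shop
near = (52.5205, 13.4050)  # roughly 55 m north of the shop

if crossed_into_fence(far, near, shop, radius_m=200):
    print("send coupon")
```

Triggering on the crossing rather than on mere presence means the action fires once per visit, which is also why this requires continuous, real-time location tracking.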
22
Q

Give a company as an example for the characteristic complexity

A
  • Starbucks
  • combines geo data (geo-fences) with users' purchase history to anticipate their desires and lure potential customers into its stores