Theory: Big Data Flashcards
Give the three defining features of big data
-volume
-velocity
-variety
Describe the volume feature of big data
There is too much data to fit on a single conventional hard drive or server. The data is therefore stored across multiple servers, each composed of many hard drives.
Describe the velocity feature of big data
Refers to how rapidly the data is generated and modified. Servers must keep up with data that changes frequently.
Describe the variety component of big data
Big data consists of many different types of data, such as text, images, video and sensor readings, some structured and some unstructured.
What is the most challenging quality of big data?
Its lack of structure
Why does the unstructured nature of big data cause issues?
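Unstructured data does not fit neatly into the rows and columns of a relational database, which makes it difficult to analyse with conventional techniques.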
How is the problem of lack of structure overcome?
Machine learning techniques are used to discern patterns in data
What is a second problem with big data?
As data is stored across multiple servers, the processing must also be across multiple machines. With conventional programming paradigms, all servers would have to be synchronised to prevent data being overwritten.
How is the problem of processing data stored across multiple servers dealt with?
Functional programming can be used to process data across multiple servers.
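Example: a minimal Python sketch of why this helps, not a real distributed system. The `partitions` data and the function names `count_records` and `merge_counts` are hypothetical, and each tuple only stands in for the records held on one server. Because both functions are pure, each partition could in principle be processed on a different machine, in any order, with no synchronisation.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner tuple stands in for the records on one server.
partitions = (
    ("error", "ok", "ok"),
    ("ok", "error"),
    ("timeout", "ok"),
)

def count_records(partition):
    """Pure function: reads one partition and returns a new Counter."""
    return Counter(partition)

def merge_counts(left, right):
    """Pure function: builds a new Counter without mutating either input."""
    return left + right

# Map step: every partition is counted independently of the others.
partial_counts = tuple(map(count_records, partitions))

# Reduce step: the partial results are combined into one total.
total_counts = reduce(merge_counts, partial_counts, Counter())

print(total_counts["ok"], total_counts["error"], total_counts["timeout"])
```

Here the map step runs sequentially on one machine. The point is only that nothing in `count_records` depends on shared state, so the work could be split up safely.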
What are some of the features of functional programming paradigms?
Functional programs are stateless: functions have no side effects, and data structures are immutable. The paradigm also supports higher-order functions, which take functions as arguments or return them as results.
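Example: a short Python sketch of these features, under the assumption that tuples are an acceptable stand-in for immutable data structures. The names `readings`, `apply_to_all` and `double` are made up for illustration.

```python
from functools import reduce

# Immutable data: a tuple cannot be changed after it is created.
readings = (3, 8, 5, 12)

def double(x):
    """Pure function: the result depends only on the argument."""
    return x * 2

def apply_to_all(func, values):
    """Higher-order function: takes another function as an argument."""
    return tuple(map(func, values))

# Each step returns a new tuple and leaves its input untouched.
doubled = apply_to_all(double, readings)
large = tuple(filter(lambda x: x > 10, doubled))

# reduce folds the values into a single result, starting from 0.
total = reduce(lambda acc, x: acc + x, large, 0)

print(doubled, large, total)
```

The built-in `map` and `filter`, together with `functools.reduce`, are themselves higher-order functions, since each takes a function as its first argument.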