3.11 Big Data Flashcards
Big Data meaning
describes data that is too large in volume to fit on a single server and that is generally unstructured
three defining features of big data
volume, velocity, variety
volume big data
there is too much data to fit on a conventional hard drive or even a single server, so it is stored across multiple servers, each made up of many hard drives
velocity big data
data is created and modified rapidly, so it often has to be processed as it arrives, with responses needed within milliseconds to seconds
variety big data
the data held takes many different forms, e.g. structured, unstructured, text, multimedia
big data’s biggest challenge
lack of structure: unstructured data does not fit the rows and columns of a relational database, which makes it difficult to store and analyse; machine learning techniques must be used to find patterns and extract useful info; the data is distributed across several servers, so the processing has to be distributed too
functional programming and Big Data
helps solve the problem of processing data over multiple machines; functional programs are stateless (no side effects) and make use of immutable data structures (state cannot change after creation), which makes it easier to write correct, efficient, distributed code
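A minimal sketch of why statelessness and immutability help, assuming the data has already been split into partitions (one per server). The names `PARTITIONS`, `partial_sum`, `combine` and `total` are made up for illustration, and everything runs on one machine here; the point is only that each partition can be processed independently because nothing is shared or modified.

```python
from functools import reduce

# Hypothetical sample data: each inner tuple stands in for the records on one server.
# Tuples are immutable, so no function can change them after creation.
PARTITIONS = (
    (3, 1, 4),
    (1, 5, 9),
    (2, 6),
)


def partial_sum(partition):
    """Pure function: the result depends only on the argument; nothing is modified."""
    return sum(partition)


def combine(a, b):
    """Associative combine step, so partial results can be merged in any grouping."""
    return a + b


def total(partitions):
    # map step: could run on each machine separately because no state is shared
    partials = tuple(map(partial_sum, partitions))
    # reduce step: merge the partial results into one answer
    return reduce(combine, partials, 0)


print(total(PARTITIONS))  # 31
```

Because `partial_sum` has no side effects, the order in which partitions are processed does not affect the answer, which is the property a distributed system relies on.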
fact-based model
each piece of information is stored as a fact; facts are immutable and cannot be overwritten, so new data is appended rather than replacing old data; each fact is stored with a timestamp so the most recent value can be identified; this reduces the risk of losing data accidentally
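A rough sketch of the fact-based model, assuming a fact can be represented as an (entity, attribute, value, timestamp) record. The `Fact` type, the helper names and the sample values are hypothetical; real systems store facts on disk across many servers, not in a tuple in memory.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One immutable fact; the field names are illustrative assumptions."""
    entity: str
    attribute: str
    value: str
    timestamp: int


def add_fact(facts, fact):
    """Append only: returns a new tuple and leaves the old one untouched."""
    return facts + (fact,)


def latest(facts, entity, attribute):
    """Return the most recent value for an attribute, or None if no fact exists."""
    matching = [f for f in facts if f.entity == entity and f.attribute == attribute]
    if not matching:
        return None
    return max(matching, key=lambda f: f.timestamp).value


store = ()
store = add_fact(store, Fact("user42", "city", "Leeds", 1))
store = add_fact(store, Fact("user42", "city", "York", 2))

print(latest(store, "user42", "city"))  # York
print(len(store))                       # 2, the older fact is still there
```

The older "Leeds" fact is never overwritten, so the history survives and a mistaken update cannot destroy earlier data.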