4.11.1 Big Data Flashcards
What is Big Data?
a catch-all term for data that won’t fit the usual containers
Why is the lack of structure of big data such an issue?
analysing the data is made significantly more difficult
relational databases are not appropriate because they require the data to fit into a row-and-column format.
What is big data in terms of volume?
Volume: the sheer amount of data is on a very large scale, too big to fit on a single server. This matters because relational databases don’t scale well across multiple machines.
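One common way to spread data that is too big for one server is to split it by key, so that each machine holds only a share. The sketch below is a minimal illustration of that idea, not a description of any particular database: the server count, the record keys and the helper name `server_for` are all made up for the example.

```python
import zlib

SERVER_COUNT = 3  # hypothetical number of machines


def server_for(key, server_count=SERVER_COUNT):
    """Pick a server for a record by hashing its key.

    zlib.crc32 is used because it gives the same value on every run,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % server_count


# Hypothetical record keys standing in for a very large data set.
records = ["user:1", "user:2", "user:3", "user:4", "user:5", "user:6"]

# Each list stands in for the data stored on one server.
shards = {n: [] for n in range(SERVER_COUNT)}
for key in records:
    shards[server_for(key)].append(key)

for server, keys in shards.items():
    print(server, keys)
```

Every record lands on exactly one server, and the same key always maps to the same server, so a later lookup knows where to go.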
What is big data in terms of variety?
Variety: the type of data being collected is wide-ranging, varied and may be difficult to classify. The data comes in many forms, such as structured, unstructured, text and multimedia.
What is big data in terms of velocity?
Velocity: the data is generated and changes quickly, often streamed from constantly changing sources, so it has to be processed with very little latency (responses within milliseconds to seconds).
What happens when data is too big to fit on one server?
the processing must be distributed across more than one machine
functional programming is a solution because it makes it easier to write correct and efficient distributed code: functions have no side effects and data is immutable, so the same code can run safely across multiple servers.
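The link between functional programming and distributed processing can be sketched with a small map/reduce word count. This is only an illustration on one machine: the strings in `chunks` are made-up sample data standing in for the slices held on different servers, and the function names `count_words` and `merge_counts` are chosen for the example.

```python
from collections import Counter
from functools import reduce


def count_words(chunk):
    """Pure function: the result depends only on the input and no shared state changes."""
    return Counter(chunk.lower().split())


def merge_counts(left, right):
    """Combine two partial results into a new Counter without mutating either one."""
    return left + right


# Each string stands in for the slice of data held on one server (hypothetical data).
chunks = [
    "big data big volume",
    "data velocity",
    "big variety",
]

# Map step: each chunk could be processed on a different machine, in any order.
partial_counts = list(map(count_words, chunks))

# Reduce step: the partial results are merged into one total.
total = reduce(merge_counts, partial_counts, Counter())

print(total["big"], total["data"])  # prints: 3 2
```

Because `count_words` has no side effects, it does not matter which machine handles which chunk or in what order the results come back; the merged total is the same.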
What is structured / unstructured data?
Structured data - data that fits into a standard database structure of columns and rows (fields and records).
Unstructured data - data that doesn’t fit into a standard database structure of columns and rows (fields and records).
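The difference can be seen in a short sketch. The records and the sentence below are invented sample data; the point is only that a field lookup is direct on structured data, while the same question on unstructured text needs ad hoc parsing.

```python
import re

# Structured: every record has the same named fields, so it maps onto rows and columns.
structured = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Alan", "age": 41},
]

# Unstructured: free text with no fixed fields (hypothetical sample).
unstructured = "Ada, who is 36, emailed Alan about the report. Alan (41) replied with a photo."

# On structured data, asking for a field is a direct lookup.
ages = [row["age"] for row in structured]

# On unstructured text, the best we can do here is guess that any number is an age.
guessed_ages = [int(n) for n in re.findall(r"\b\d+\b", unstructured)]

print(ages)          # prints: [36, 41]
print(guessed_ages)  # prints: [36, 41]
```

The guess happens to work for this sentence, but it would break as soon as the text mentioned any other number, which is why unstructured data is harder to analyse.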