Big Data Flashcards
What are the 3 main features of big data?
Volume
Variety
Velocity
Where is big data used?
Scientific research
Retail
Banking
Government
Security
Why is volume a feature of big data?
The amount of data is on a very large scale.
Why is variety a feature of big data?
The type of data being collected is wide-ranging, varied and may be difficult to classify.
Why is velocity a feature of big data?
The data changes quickly and may include constantly changing data sources.
What are the main features of structured big data?
Defined using traditional techniques (fields and records)
It is possible to give each data item a name and type.
Examples: customer and banking data.
What are the main features of unstructured big data?
Data cannot be defined in rows and columns.
Examples: web pages, emails, documents, presentations.
Each example has its own structure but does not conform to any universal structure.
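The contrast between the two cards above can be sketched in Python. This is a minimal illustration, not a real dataset: the field names (customer_id, name, balance) and the email text are hypothetical.

```python
# Structured: every item has a name and a type (fields and records).
# The field names and values here are hypothetical.
customers = [
    {"customer_id": 1, "name": "Ada", "balance": 120.50},
    {"customer_id": 2, "name": "Grace", "balance": 75.00},
]

# Because the fields are known in advance, a query is straightforward.
total_balance = sum(c["balance"] for c in customers)

# Unstructured: free text with no universal set of fields to query.
email = "Hi, my card was declined twice yesterday. Can you help?"

# Any "field" has to be inferred from the content, e.g. by keyword search.
mentions_card = "card" in email.lower()
```

The structured records can be summed or filtered by field name, whereas the email has to be interpreted before anything can be extracted from it.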
What are some of the issues with big data?
Analysis is difficult (lots of data)
Automation of analysis is also difficult
Massive storage and processing power are needed (use of supercomputers or large dedicated networks)
Difficult to monitor constantly changing data
Concurrency - when more than one user works on the data at once.
What types of data are likely to be structured and unstructured (big data)?
Structured - Quantitative data (stored in databases)
Unstructured - Qualitative data
On what type of data is machine learning generally used?
Unstructured qualitative data
How does functional programming help address big data issues?
Immutable data structures - data cannot be changed once created, so there are strict fixed rules as to how it can be used/manipulated and it is safe to share between processes.
Statelessness - a function doesn't remember results or states of previous calls, so its output depends only on its inputs.
Higher-order functions - functions that take or return other functions (e.g. map, filter, reduce), which allows the language to be highly adaptable.
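The three ideas above can be sketched together in Python. This is a minimal illustration under stated assumptions: the Reading record, the sensor names and the values are all hypothetical, and Python is used only for convenience (it is not a purely functional language).

```python
from functools import reduce
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical immutable record: its fields cannot be reassigned."""
    sensor: str
    value: float


# A tuple of immutable records; nothing below modifies it in place.
READINGS = (
    Reading("a", 2.0),
    Reading("b", -1.0),
    Reading("a", 3.5),
)


def is_valid(reading: Reading) -> bool:
    # Stateless: the result depends only on the argument passed in.
    return reading.value >= 0


def total_valid(readings: tuple) -> float:
    # filter, map and reduce are higher-order functions:
    # each one takes another function as an argument.
    valid = filter(is_valid, readings)
    values = map(lambda r: r.value, valid)
    return reduce(lambda acc, v: acc + v, values, 0.0)


total = total_valid(READINGS)
```

Because the functions keep no state and the data is never modified, the same call could be run on separate chunks of a large dataset in parallel without the chunks interfering with each other.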