Big data analytics and data science role Flashcards
What are the 5 V's of big data?
Volume, Velocity, Variety, Value, Veracity
What does Veracity mean?
how far the data can be trusted to be accurate and of good quality
What are the 2 types of data?
Metadata and paradata
What does metadata mean?
data that describes the data: the minimum you should know about it
What does paradata mean?
a record of how the data has been processed
What are the 4 data structures?
structured
semi-structured
quasi-structured
unstructured
What are the three types of data repository?
data islands
data warehouses
analytic sandbox
What is a data island?
isolated data marts: record keeping in spreadsheets and low-volume DBMSs
What is a data warehouse?
centralised data repository that supports BI and reporting
What is an analytic sandbox?
data assets gathered from multiple sources, ready for analysis
What are the three big data project success factors?
timely decision making
processing throughput
flexibility
In what three ways does an analytic sandbox support the big data success factors?
provides high performance analysis
ingests data from different sources
owned by the data scientists (DS) rather than IT
What are the business drivers of big data/data science?
optimise business processes
predict new business opportunities
mitigate business risk
meet legal and regulatory requirements
What are the four parts of the big data ecosystem?
data devices
data collectors
data aggregators
data users/buyers
What are data devices in the big data ecosystem?
devices that continuously gather data about the world (e.g. phones)