W1 Flashcards
Big Data is usually described by which 3 V’s, and which 2 additional V’s?
a) Volume, Velocity, Variety b) Veracity & Value
What does Volume mean?
Terabytes, records, transactions, tables, files
What does Variety mean?
Structured or unstructured, multi-factor, probabilistic
What does Velocity mean?
Batch, real-time or near-real-time, processes, streams
What does Value mean?
Statistical, events, correlation, hypothetical
What does Veracity mean?
Trustworthiness, authenticity, origin, reputation, availability, accountability
What are 9 concepts for big data?
Big data:
- is challenging to manage
- can be impactful
- is changing how we live our lives
- is changing how companies compete
- should be important to strategy
- can be complex
- by itself doesn’t create value
- can create value if analyzed and acted on
How is big data generated?
by living our personal and corporate lives
Is big data traditional or non-traditional?
non-traditional
What does big data generally require, in terms of technology, to manage and apply it?
It requires innovations in technology to manage and apply it.
Big data is already created by whom?
someone other than us
Is big data always straightforward, or can it be misleading?
misleading
What are the 3 types of data structures? (Describe and give examples.)
Structured
Data is organized in a tabular format at the field level
E.g. spreadsheets and database tables
Unstructured
The structure is unknown
E.g. images, audio, video
Multi-structured
The structure includes a combination of tabular, hierarchical, tagged, other and unknown
E.g. emails and Twitter feeds contain some known fields, such as the From and To parties, combined with unstructured text
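The email example can be sketched in Python as a rough illustration. The sample message below is invented for this card, and the standard-library `email` parser is only one way to separate the known fields from the free text.

```python
from email import message_from_string

# Hypothetical sample message, invented for illustration only.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Lunch\n"
    "\n"
    "Are we still on for noon? Bring the report.\n"
)

msg = message_from_string(raw)

# Known, tagged fields: these behave like structured data.
sender = msg["From"]
recipient = msg["To"]

# Free text with no predefined fields: this part is unstructured.
body = msg.get_payload()

print(sender, recipient)
print(body)
```

The headers can be looked up by name, like columns in a table, while the body is just a string that needs further processing before it can be queried.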
What’s the percentage of structured vs. unstructured data?
80% unstructured
20% structured
What are the 5 differences between structured and unstructured data?
Structured data:
- can be displayed in rows, columns and relational databases
- includes numbers, dates and strings
- is an estimated 20% of enterprise data
- requires less storage
- is easier to manage and protect with legacy solutions
Unstructured data:
- can’t be displayed in rows, columns or relational databases
- includes images, audio, video, emails, spreadsheets
- is an estimated 80% of enterprise data
- requires more storage
- is more difficult to manage and protect with legacy solutions
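As a small sketch of the first difference, the Python below puts a few structured records into an in-memory SQLite table and compares that with a piece of free text. The `orders` table, its columns and the review sentence are all made up for illustration.

```python
import sqlite3

# Structured: numbers, dates and strings fit into rows and columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2024-01-05", 19.99), (2, "2024-01-06", 5.00)],
)

# Because every row has the same fields, one query can summarize them.
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Unstructured: free text has no columns to filter or sum on,
# so even a simple question needs ad hoc text processing.
review = "Arrived late but the quality is great, would order again."
mentions_late = "late" in review.lower()

print(total, mentions_late)
conn.close()
```

The table can be queried directly with SQL, which is part of why legacy tools handle structured data more easily. The review text needs custom logic for each new question.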