Big Data and Social Media Flashcards
What is Big Data?
Data that arrives in increasing VOLUMES with greater VARIETY and with more VELOCITY.
What are the 3Vs of Big Data?
- Variety
- Velocity
- Volumes
What are the other 2Vs of Big Data?
- Value
- Veracity
What is Variety in Big Data?
Variety = The many types of data that are available, such as text, video, audio and images.
What is Volumes in Big Data?
Volumes = The amount of data that is received.
Big data = High volumes of low-density, unstructured data.
What is Velocity in Big Data?
Velocity = The fast rate at which data is received and (perhaps) acted on.
What is Value in Big Data?
Value = Is the data valuable, and can a business analyse and use it to become more efficient or develop more products?
What is Veracity in Big Data?
Veracity = Is the data trustworthy, or is it unreliable (e.g. fake news)?
What is involved in analysing Big Data? (3 methods)
1 - Data Mining
2 - Deep Learning
3 - Predictive Analysis
What is Data Mining (Analytical Method of Big Data)?
Data Mining = Sorts through large data sets to identify patterns + relationships by finding anomalies and creating data clusters.
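Example (a minimal sketch, not a standard data mining library): the two ideas above, finding anomalies and creating clusters, shown on a small list of numbers. The function names, the z-score threshold of 2.0, the gap of 5 and the `daily_orders` figures are all assumptions made for illustration.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return values whose z-score (distance from the mean in
    standard deviations) is greater than the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

def cluster_by_gap(values, max_gap):
    """Group sorted values into clusters, starting a new cluster
    whenever the gap to the previous value is larger than max_gap."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters

# Hypothetical data: orders per day, with one unusual day.
daily_orders = [10, 12, 11, 13, 12, 11, 95]
anomalies = find_anomalies(daily_orders)
clusters = cluster_by_gap(daily_orders, max_gap=5)
print(anomalies)
print(clusters)
```

Here the unusual day (95) is flagged as an anomaly and also ends up in its own cluster, apart from the ordinary days.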
What is Deep Learning (Analytical Method of Big Data)?
Deep Learning = Imitates human learning patterns by using artificial intelligence + machine learning to layer algorithms and find patterns.
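Example (a rough sketch of the "layering" idea only): data is passed through one layer after another, and each layer transforms the output of the previous one. The weights below are hand-picked assumptions; in real deep learning they would be learned from data during training, which this sketch leaves out.

```python
def relu(values):
    """Activation: keep positive values, replace negatives with 0."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the inputs through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations

# Hypothetical two-layer network: 2 inputs -> 2 hidden values -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
output = forward([2.0, 1.0], layers)
print(output)
```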
What is Predictive Analysis (Analytical Method of Big Data)?
Predictive Analysis = Uses historical data to make predictions about the future, identifying risks and opportunities and generating ideas from them.
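Example (a minimal sketch, assuming a straight-line trend is a fair model): a least-squares line is fitted to past figures and then extended to the next period. The `months` and `sales` values and the function names are made up for illustration; real predictive analysis usually uses richer models and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Extend the fitted line to a new x value."""
    return slope * x + intercept

# Hypothetical historical data: sales for months 1 to 4.
months = [1, 2, 3, 4]
sales = [100, 120, 140, 160]

slope, intercept = fit_line(months, sales)
forecast = predict(slope, intercept, 5)  # prediction for month 5
print(forecast)
```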
What is Big Data Analytics?
Big Data Analytics = Refers to collecting, processing, cleaning + analysing large datasets to help organisations operationalise their big data.
- Collect it.
- Process it.
- Clean it.
- Analyse it.
What is involved in the collection of Big Data?
- Collect unstructured data from a variety of sources (such as cloud storage and mobile applications).
What is involved in processing Big Data?
- Data must be organised properly to get accurate results on analytical queries, especially when it’s large + unstructured.
What is involved in cleaning Big Data?
- All data must be formatted correctly.
- Irrelevant + duplicate data must be removed or accounted for.
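Example (a minimal sketch of the two cleaning steps above): records are formatted consistently, then irrelevant and duplicate records are removed and counted so that they are accounted for. The field names (`name`, `email`) and the rules (a record with no email is irrelevant, two records with the same email are duplicates) are assumptions made for illustration.

```python
def clean_records(records):
    """Return (cleaned_records, removed_counts)."""
    cleaned = []
    seen_emails = set()
    removed = {"irrelevant": 0, "duplicate": 0}
    for record in records:
        # Format: trim whitespace and use consistent letter case.
        name = record.get("name", "").strip().title()
        email = record.get("email", "").strip().lower()
        if not email:
            removed["irrelevant"] += 1  # no email, so not usable here
            continue
        if email in seen_emails:
            removed["duplicate"] += 1  # same email already kept
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned, removed

# Hypothetical raw data with messy formatting, a duplicate and a blank.
raw = [
    {"name": " ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "no contact", "email": ""},
]
cleaned, removed = clean_records(raw)
print(cleaned)
print(removed)
```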