Big data and analytics Flashcards
What is Data Mining?
Data Mining is the process of analysing data to extract hidden knowledge (insights) that cannot be seen directly in the raw data alone.
DM analyses data using established algorithms, which allows new knowledge to be discovered.
Examples of data mining tools:
- Data mining algorithms:
  • cluster analysis (explained in the recording)
  • association detection (explained in the recording)
  • artificial neural networks (explained in the recording)
- Statistical analysis
- Mathematical modelling
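To make association detection from the list above more concrete, here is a minimal sketch in Python. It is an illustration only, not the method covered in the recording: the basket data is made up, and the helper names `pair_support` and `confidence` are hypothetical. It counts how often two items appear together (support) and how often one item appears given another (confidence).

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (made-up data, for illustration only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_support(transactions):
    """Return, for each pair of items, the fraction of transactions containing both."""
    counts = Counter()
    for items in transactions:
        # Sort so each pair is always stored in the same (alphabetical) order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the share that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    return sum(consequent in t for t in with_antecedent) / len(with_antecedent)


support = pair_support(baskets)
bread_butter_support = support[("bread", "butter")]
bread_to_butter = confidence(baskets, "bread", "butter")
print(f"support(bread, butter) = {bread_butter_support:.2f}")
print(f"confidence(bread -> butter) = {bread_to_butter:.2f}")
```

A pair with high support and high confidence is a candidate "hidden" rule of the kind data mining aims to surface, for example that customers who buy bread often also buy butter. Real association detection tools typically add a minimum-support threshold and handle item sets larger than pairs.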
What is big data?
Big data is not structured and it is not stored in a uniform manner!
Unlike structured data elements, big data can arrive as streams of data. Big data is extremely large in Volume. It is generated and captured not only in huge volumes but also at very high speed, or Velocity. Big data is not uniform: it can be a mix of different types of data, such as text, spreadsheet data, data from XML documents or email, and video or audio files. Because it is so complex, Big Data is said to have much Variety.
Where is Big Data stored?
It generally cannot be stored in conventional relational databases, as the data gathered is not structured. Big data uses special storage systems that allow extremely large amounts of data to be stored. One example is Hadoop.