Lecture 1 (done) Flashcards
Why data mining?
The explosive growth of data
We need automated analysis of massive data
Major sources of abundant data:
1. Business: e-commerce, transactions, stocks, product descriptions, ...
2. Science: remote sensing, bioinformatics, scientific experiments, ...
3. Society and everyone: news, social networks, digital cameras, YouTube, ...
Data Mining
Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data.
An example of data mining:
Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest)
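A minimal sketch of the document-grouping idea, assuming a simple bag-of-words cosine similarity and a greedy one-pass grouping (real search engines use more refined clustering). The documents, the function names, and the 0.3 threshold are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercased word counts for one document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def group_documents(docs, threshold=0.3):
    """Greedy grouping: a document joins the first group whose first
    member is similar enough; otherwise it starts a new group."""
    vectors = [bag_of_words(d) for d in docs]
    groups = []  # each group is a list of document indices
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

# Hypothetical search results (assumed data, not from the lecture)
docs = [
    "Amazon rainforest trees and river wildlife",
    "Deforestation threatens the Amazon rainforest wildlife",
    "Online shopping deals and fast delivery",
    "Fast delivery for online shopping orders",
]
groups = group_documents(docs)
print(groups)
```

With these sample documents, the two rainforest results end up in one group and the two shopping results in another.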
What are the data view, knowledge view, method view, and application view?
Data view: Kinds of data to be mined
Knowledge view (Data mining functions): Kinds of knowledge or patterns to be discovered
Method view: Kinds of techniques utilized
Application view: Kinds of applications adapted
Relational database system
A relational database system is a collection of tables, with the ER (entity-relationship) model used for modeling and SQL used for querying.
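A small sketch of "tables plus SQL", using Python's built-in sqlite3 module with an in-memory database. The customer and purchase tables, their columns, and the rows are hypothetical; they only illustrate two related entities and one SQL query over them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two entities (customer, purchase) linked by a one-to-many relationship
conn.executescript("""
CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,
    name    TEXT
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    cust_id     INTEGER REFERENCES customer(cust_id),
    amount      REAL
);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO purchase VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
""")

# SQL query: total amount spent per customer
rows = conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM customer AS c
    JOIN purchase AS p ON p.cust_id = c.cust_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
conn.close()

print(rows)
```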
What is a data warehouse?
A data warehouse is a repository of information collected from multiple sources and stored under a unified schema at a single site, in order to facilitate management decision making.
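A toy sketch of the "multiple sources, unified schema" idea, assuming two hypothetical stores that record sales with different field names. All names and numbers here are made up for illustration; a real warehouse also involves cleaning, loading, and periodic refreshing.

```python
# Two hypothetical source systems with different record layouts
store_a = [{"sku": "I1", "qty": 2, "price_usd": 3.0}]
store_b = [{"item_id": "I1", "units": 1, "unit_price": 3.5}]

def from_store_a(record):
    """Map a store A record onto the unified schema."""
    return {
        "item": record["sku"],
        "quantity": record["qty"],
        "revenue": record["qty"] * record["price_usd"],
    }

def from_store_b(record):
    """Map a store B record onto the unified schema."""
    return {
        "item": record["item_id"],
        "quantity": record["units"],
        "revenue": record["units"] * record["unit_price"],
    }

# The "warehouse": every record in one place, in one schema
warehouse = [from_store_a(r) for r in store_a] + [from_store_b(r) for r in store_b]

# A decision-support style summary: revenue per item across all sources
revenue_by_item = {}
for row in warehouse:
    revenue_by_item[row["item"]] = revenue_by_item.get(row["item"], 0.0) + row["revenue"]

print(revenue_by_item)
```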
Transactional database
A file in which each record represents a transaction, such as a customer's purchase: sales(transID, list of item IDs)
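A sketch of the sales(transID, list of item IDs) layout as a Python dictionary, assuming hypothetical transaction and item IDs. Counting how often items and item pairs appear across transactions is a typical first step when mining this kind of data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales records: transID -> list of item IDs
sales = {
    "T100": ["I1", "I3", "I8"],
    "T200": ["I2", "I8"],
    "T300": ["I1", "I8"],
}

# Number of transactions containing each item
item_counts = Counter(
    item for items in sales.values() for item in set(items)
)

# Number of transactions containing each pair of items
pair_counts = Counter(
    pair
    for items in sales.values()
    for pair in combinations(sorted(set(items)), 2)
)

print(item_counts.most_common(1))
print(pair_counts.most_common(1))
```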
Other Kinds of Data (Advanced datasets)
Data streams and sensor data; spatial data; time-series data, temporal data, sequence data; graphs and social network data; object-relational databases; multimedia databases; text databases; the World Wide Web