Chapter 2 Flashcards
Characteristics of data warehousing
- Subject oriented
- Integrated
- Time-variant (time series)
- Nonvolatile
- Web based
- Relational/multi-dimensional
- Client/server
- Real-time
- Include metadata
What is data?
A collection of facts usually obtained as the result of experiences, observations or experiments.
- the lowest level of abstraction
Data in Analytics can be categorized into:
- structured data
- unstructured or semi-structured data
Structured data can be categorized into:
- categorical
- numerical
Categorical data can be categorized into:
- nominal
- ordinal
Numerical data can be categorized into:
- interval
- ratio
Unstructured or semi-structured data can be categorized into:
- textual
- multimedia
- XML/JSON
Multimedia data can be categorized into:
- image
- audio
- video
What are the measures of centrality?
- arithmetic mean
- median
- mode
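As a minimal sketch of the three measures above, the snippet below uses Python's standard `statistics` module. The sample list `values` is a hypothetical data set chosen only for illustration; it does not come from the chapter.

```python
import statistics

# Hypothetical sample (an assumption for illustration only).
values = [2, 3, 3, 5, 7, 10]

# Arithmetic mean: sum of the values divided by their count (30 / 6).
mean = statistics.mean(values)

# Median: middle value of the sorted data; with an even count,
# the average of the two middle values (3 and 5).
median = statistics.median(values)

# Mode: the most frequently occurring value.
mode = statistics.mode(values)

print(mean, median, mode)
```

On skewed data the mean is pulled toward the extreme values, while the median is less affected, which is why both are usually reported together.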
What are the measures of dispersion?
- range
- variance
- standard deviation
- mean absolute deviation
- quartiles
- box plots
- shape of a distribution: skewness, kurtosis
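The numeric measures in this list can be sketched with Python's standard `statistics` module. This is an illustrative sketch under stated assumptions: `values` is a hypothetical sample, the variance and standard deviation are the population versions (use `statistics.variance` and `statistics.stdev` for sample versions), and the quartiles use the module's default interpolation method. Box plots and the shape measures (skewness, kurtosis) are left out here.

```python
import statistics

# Hypothetical sample (an assumption for illustration only).
values = [2, 3, 3, 5, 7, 10]

# Range: largest value minus smallest value.
value_range = max(values) - min(values)

# Population variance: mean of the squared deviations from the mean.
variance = statistics.pvariance(values)

# Population standard deviation: square root of the variance.
std_dev = statistics.pstdev(values)

# Mean absolute deviation: mean of the absolute deviations from the mean.
mean = statistics.mean(values)
mad = sum(abs(x - mean) for x in values) / len(values)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(values, n=4)

# Interquartile range: the spread of the middle half (the box in a box plot).
iqr = q3 - q1

print(value_range, variance, std_dev, mad, (q1, q2, q3), iqr)
```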
Define data visualization
The use of visual representations to explore, make sense of, and communicate data
What is the role of dashboards?
they provide visual displays of important information that is consolidated and arranged on a single screen
What are the best practices in dashboard design?
- Benchmark KPIs with industry standards
- Wrap metrics with contextual metadata
- Have the design validated by a usability specialist
- Prioritize and rank alerts and exceptions
- Pick the right visual constructs
- Provide guided analytics
What are types of Information Retrieval?
- Document Matching
- Link Analysis
- Search Engines
What are types of Web Mining?
- Web content mining
- Web structure mining
- Web usage mining