Understanding Big Data and Its Impact on Business Flashcards
distributed computing
Processes and manages algorithms across many machines in a computing environment.
data mining
The process of analyzing data to extract information not offered by the raw data alone.
data profiling
The process of collecting statistics and information about data in an existing source.
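A minimal sketch of data profiling on a single column, assuming the data is already loaded as a Python list. The function name `profile_column` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter

def profile_column(values):
    """Collect basic statistics about one column of an existing source."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "rows": len(values),                      # total number of entries
        "nulls": len(values) - len(non_null),     # missing entries
        "distinct": len(counts),                  # unique non-null values
        "most_common": counts.most_common(1)[0] if counts else None,
    }

# Hypothetical sample column of state codes
states = ["CO", "CO", None, "NY", "CO", "TX", None]
print(profile_column(states))
# {'rows': 7, 'nulls': 2, 'distinct': 3, 'most_common': ('CO', 3)}
```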
data replication
The process of sharing information to ensure consistency between multiple data sources.
recommendation engine
A data-mining algorithm that analyzes a customer’s purchases and actions on a website and then uses the data to recommend complementary products.
estimation analysis
Determines the value of an unknown continuous variable, such as a behavior or an estimated future value.
market basket analysis
Evaluates such items as websites and checkout scanner information to detect customers’ buying behavior and predict future behavior by identifying affinities among customers’ choices of products and services.
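One simple way to look for product affinities is to count how often two items appear in the same basket. The sketch below assumes each basket is a list of item names. The function `pair_counts` and the sample baskets are hypothetical, and real tools typically go further (for example, computing support and confidence).

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical checkout data
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "eggs"],
    ["bread", "milk", "eggs"],
]
counts = pair_counts(baskets)
print(counts.most_common(1))  # [(('bread', 'milk'), 3)]
```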
affinity grouping analysis
Reveals the relationship between variables along with the nature and frequency of the relationships.
cluster analysis
A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close to one another as possible and the different groups are as far apart as possible.
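K-means is one common clustering technique. The flashcard does not name a specific method, so treat this as an illustrative choice. The sketch below works on one-dimensional numbers and assumes the starting centers are supplied by the caller. `kmeans_1d` and the sample points are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign the point to the closest current center
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Recompute each center; keep the old one if its group is empty
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

groups, centers = kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0, 5])
print(groups)   # [[1, 2, 3], [10, 11, 12]]
print(centers)  # [2.0, 11.0]
```

Each point lands in exactly one group, which matches the "mutually exclusive" requirement in the definition.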
classification analysis
The process of organizing data into categories or groups for its most effective and efficient use.
data mining tools
A variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making.
prediction
A statement about what will or might happen in the future, such as future sales or employee turnover.
cube
The common term for the representation of multidimensional information.
algorithm
A mathematical formula placed in software that performs an analysis on a data set.
anomaly detection
The process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set.
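A minimal sketch of one simple approach, flagging values that lie far from the mean in units of standard deviation (a z-score test). The threshold of 2.0, the function name `find_anomalies`, and the sample data are assumptions for illustration. Other detection methods exist.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []           # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily order counts with one unusual day
print(find_anomalies([10, 11, 9, 10, 12, 10, 50]))  # [50]
```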