Final Flashcards

1
Q

Business Intelligence

A
  • The umbrella term that includes the applications, infrastructure, tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.
  • The use of data visualization and reporting to become aware of and understand “what happened and what is happening”
  • Uses charts, tables, and dashboards to display, examine, and explore data
  • Process of turning raw data into interpreted information
  • The process of going from raw data to intelligent information
2
Q

Business Analytics

A
  • Extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions
  • Practice and art of bringing quantitative data to bear on decision making
  • Subset of business intelligence
  • Relies on a number of different disciplines to collect and analyze data
  • Ex: upselling to customers
3
Q

Data Mining

A
  • Business analytics methods that go beyond counts, descriptive techniques, reporting, and methods based on business rules.
  • Extracts useful info from large data sets (finding gold)
  • Process of exploration and analysis of large quantities of data in order to discover meaningful patterns and rules
  • Employs pattern recognition technologies as well as statistical and mathematical techniques
  • Ex: evaluating which customers are going to switch providers (customer retention at a phone provider)
    o Subset of business intelligence
    o Intersection of IT and statistics
4
Q

Unsupervised Learning

A
  • Search for patterns and structure among all variables
  • No predefined outcome groups (no dependent or outcome variables)
  • Define groups of cases with similar characteristics
  • Find out the structure of the data
  • Ex: the average characteristics of the cases in each group or cluster
  • Find out classifier
  • Ex: cluster analysis
5
Q

Supervised Learning

A
  • Have a target variable
  • Ex: regression analysis
  • Predefined outcome groups or variable (the dependent variable is known)
  • Decide to which class each case belongs (by calculating membership score of each case)
  • Find out major characteristics that differentiate predefined groups
6
Q

Data Mining Techniques

A
  • Prediction
  • Classification
  • Association
7
Q

Prediction

A
  • Dependent (response) variable is a continuous variable
  • Formula or model to predict future observations
  • Ex: multiple regression and decision trees
  • Ex: predicting amount of time to sort when using gloves
8
Q

Classification

A
  • Dependent variable is a categorical variable
  • Identify categories of data (buy vs. not buy)
  • Ex: logistic regression, decision trees, cluster analysis
  • Ex: will someone buy or not buy products
9
Q

Association

A
  • Relationship among entities
  • Ex: if you bought cornflakes did you also buy bananas
  • Ex: market basket analysis (not in class)
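
The cornflakes and bananas example can be made concrete with a small sketch. The baskets below are made-up data, and the `support` and `confidence` measures are the ones commonly used in market basket analysis; the card itself does not name them, so treat both as assumptions.

```python
# Hypothetical shopping baskets (illustrative data, not from the course).
baskets = [
    {"cornflakes", "bananas", "milk"},
    {"cornflakes", "bananas"},
    {"cornflakes", "milk"},
    {"bananas", "bread"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Among baskets containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# "If you bought cornflakes, did you also buy bananas?"
rule_confidence = confidence({"cornflakes"}, {"bananas"}, baskets)
print(f"support(cornflakes & bananas) = {support({'cornflakes', 'bananas'}, baskets):.2f}")
print(f"confidence(cornflakes -> bananas) = {rule_confidence:.2f}")
```

In this toy data, three baskets contain cornflakes and two of those also contain bananas, so the rule holds for two out of three cornflakes buyers.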
10
Q

Business Intelligence

A
  • Umbrella term that spans people, processes, and tools
  • Organizes data/information, enables access to it, analyzes it, improves decisions, and manages performance

11
Q

Business Analytics

A
  • Process of “doing” analysis in a particular domain
  • Uses analytical techniques (data mining)

12
Q

Data Mining

A

Process of discovering new patterns in large data sets, using methods from artificial intelligence, statistics, and database systems

13
Q

CRISP-DM

A
  • Cross-Industry Standard Process for Data Mining
  • Fits data mining into the general problem-solving strategy of a business/research unit

14
Q

CRISP-DM Phases (stages of Data Mining Process)

A
  1. Business Understanding
  2. Data Understanding
  3. Data Preparation
  4. Modeling
  5. Evaluation
  6. Deployment
15
Q

Business Understanding

A
  • Determine business objectives (why the study is being done: a specific problem or knowledge discovery, e.g., increasing sales of a new shirt)
  • Assess situation (set up a concise and clear description of the problem)
  • Determine data mining goals (what to achieve in technical terms and what the success criteria are)
  • Produce project plan (establish a budget)
16
Q

Data Understanding

A
  • Collect initial data
  • Describe data
  • Explore data
  • Verify data quality
17
Q

Data Preparation

A
  • Select data
  • Clean data (handle outliers, transform variables)
  • Construct data
  • Integrate data
  • Format data
18
Q

Modeling

A
  • Select modeling technique
  • Generate test design
  • Build model
  • Assess model
19
Q

Modeling Techniques

A
  • Classification
  • Clustering
  • Predictions
  • Sequential patterns
20
Q

Classification

A

Map each data item into one of a set of predefined classes

21
Q

Clustering

A

Group data with no predefined classes

22
Q

Predictions

A

Predict the value of a variable (e.g., regression analysis)

23
Q

Sequential Patterns

A

Analyze time series data (e.g., find a seasonal pattern)

24
Q

Evaluation

A
  • Evaluate the results (interpret them and check whether the business objectives are met)
  • Review process
  • Determine next steps
25
Q

Deployment

A

The knowledge needs to be reported to managers so they can reflect on it, tie it to business processes, and enhance performance or solve issues.

26
Q

Types of Data

A
  • Qualitative (categorical)
  • Quantitative (continuous)

27
Q

Qualitative Types of Data

A

Nominal and ordinal

28
Q

Nominal

A
  • Categorically discrete data (name of school, type of car, number assigned to country)
  • Nominal sounds like name
  • Gender, political party
29
Q

Ordinal Data

A

Values that have a natural ordering (class rank, place in line)

Sounds like “order”

30
Q

Quantitative Types of Data

A

Interval and Ratio

31
Q

Interval

A

Like ordinal except the intervals between each value are equal (temperature)

32
Q

Ratio

A

Interval data with a natural zero point or a well-adjusted scale (time, weight, height, age)

33
Q

Data Quality Characteristics

A
  • Accuracy
  • Completeness
  • Consistency
  • Uniqueness (each only represented once)
  • Timeliness
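
Two of these characteristics, completeness and uniqueness, are easy to check mechanically. The sketch below is one possible way to do it; the records, field names, and helper functions (`completeness`, `duplicate_keys`) are hypothetical and only meant to illustrate the idea.

```python
from collections import Counter

# Hypothetical customer records (illustrative data only).
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
]

def completeness(records, field):
    """Fraction of records where the field is present and not empty."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)

def duplicate_keys(records, key):
    """Key values that appear more than once (a uniqueness violation)."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, c in counts.items() if c > 1)

email_completeness = completeness(records, "email")
repeated_ids = duplicate_keys(records, "id")
print(f"email completeness: {email_completeness:.0%}")
print(f"ids represented more than once: {repeated_ids}")
```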
34
Q

Types of Visualization Charts

A
  • Frequency tables
  • Bar chart, line graph, scatterplot
  • Distribution plots
  • Histograms
  • Stem and leaf
  • Box plots
  • Pareto chart
  • Maps
  • Cross tabulations
35
Q

Bar chart, line graph, scatterplot

A

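Bar charts summarize categorical data; line graphs and scatterplots plot quantitative values (over time or against another variable)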

36
Q

Histograms

A
  • Show shape of distribution
  • Use continuous data

37
Q

Stem and Leaf

A
  • Used to visualize the data (not used if large data set)
  • More meaningful than histogram (can still see the actual numbers)

38
Q

Box plots

A
  • Helpful to give more details about data set than you would get just from a histogram
  • Whiskers
  • Summarizes quantitative data (can be drawn side by side to compare nominal groups)
  • Used to identify outliers
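
One common way a box plot flags outliers is the 1.5 × IQR rule: any point beyond the quartiles by more than 1.5 times the interquartile range is drawn past the whiskers. The card does not say which rule the course uses, so the rule, the data, and the helper name `iqr_outliers` below are assumptions for illustration.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical measurements with one suspicious value.
data = [12, 14, 15, 15, 16, 18, 19, 21, 60]
flagged = iqr_outliers(data)
print(f"flagged as outliers: {flagged}")
```

Different software computes quartiles slightly differently, so the exact fences can vary between tools.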
39
Q

Central Tendency

A
  • Measure that represents the center or middle of the data
  • May or may not be a typical value

40
Q

Measures of Central Tendency

A

Mean, median, mode
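
A quick sketch of the three measures using Python's standard `statistics` module. The scores are made-up data, chosen so that one high value pulls the mean above the median.

```python
import statistics

# Hypothetical exam scores (illustrative data only).
scores = [70, 75, 75, 80, 100]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

print(f"mean={mean_score}, median={median_score}, mode={mode_score}")
```

Here the mean is 80 while the median and mode are both 75, which shows how a single large value shifts the mean more than the other two measures.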

41
Q

Relationship of Mean, Median, and Mode in a Normal Curve

A

All three are equal and fall at the center of the curve

42
Q

Empirical Rule

A
  • 68% within 1 standard deviation of the mean
  • 95% within 2 standard deviations of the mean
  • 99.7% within 3 standard deviations of the mean
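
These percentages can be checked against the normal distribution directly. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k):
    """Probability that a normally distributed value lies within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The computed shares come out close to 0.68, 0.95, and 0.997, which is where the rounded figures on the card come from. The rule only applies to data that is approximately normal.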