5.1 BUSINESS INTELLIGENCE & DATA ANALYTICS Flashcards

1
Q

Q1: As said earlier, it is assumed that about 80% of all firm data is
a) structured.
b) unstructured.
c) semi-structured.
d) metadata.

A

b) unstructured.

2
Q

Q2: Which statement is NOT CORRECT?
a) RFM analysis is sometimes referred to as a poor man’s approach to customer lifetime value (CLV) analysis.

b) The RFM framework is a well-known and well-developed measurement framework used in marketing across different industries such as banking, insurance, telco, non-profit, travel, online retail, and even government.

c) In the RFM framework, R stands for Recency, F for Frequency and M for Monetary.

d) The RFM framework focusses on prospective customers, instead of existing customers.

A

d) The RFM framework focusses on prospective customers, instead of existing customers.
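As an illustration of the three RFM dimensions, here is a minimal sketch that derives them from the purchase history of existing customers. The function name `rfm_table`, the sample transactions, and the choice to measure M as total spend (some variants use the average amount) are assumptions made for this example, not part of the flashcard source.

```python
from datetime import date

def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer, purchase_date, amount) tuples.
    Monetary is taken as total spend here (an assumption).
    """
    stats = {}
    for customer, day, amount in transactions:
        last, freq, total = stats.get(customer, (day, 0, 0.0))
        stats[customer] = (max(last, day), freq + 1, total + amount)
    # Recency = days since the most recent purchase.
    return {c: ((today - last).days, f, m) for c, (last, f, m) in stats.items()}

# Hypothetical purchase history of existing customers.
transactions = [
    ("alice", date(2024, 1, 5), 50.0),
    ("alice", date(2024, 3, 1), 30.0),
    ("bob", date(2023, 11, 20), 200.0),
]
rfm = rfm_table(transactions, today=date(2024, 3, 31))
```

Because every input is a past purchase, the sketch only works for customers who already have a history, which is why statement d) is the incorrect one.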

3
Q

Q3: According to Van Vlasselaer and Baesens, which is a key characteristic of fraud?

a) uncommon
b) well-considered
c) imperceptibly concealed
d) time-evolving
e) often carefully organized
f) all of the above

A

f) all of the above

4
Q

Q4: In response modeling, reading an advertisement email, clicking on a link, downloading a product description, configuring a product such as a car for example, or contacting the customer service desk for a price quote could be considered as examples of

a) implicit response.
b) explicit response.

A

a) implicit response.

5
Q

Q5: Which of the following are component(s) of the Customer Lifetime Value (CLV)?

a) Costs
b) Revenues
c) Discount rate
d) Time horizon
e) All of the above.

A

e) All of the above.
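To show how the four components fit together, here is a minimal sketch of one common simplified CLV formulation: the sum of discounted net cash flows (revenue minus cost) over the time horizon. The function name, the sample figures, and the omission of a survival or retention probability are assumptions for illustration only.

```python
def customer_lifetime_value(revenues, costs, discount_rate):
    """Sum (revenue - cost) / (1 + d)^t over periods t = 1..T.

    The time horizon T is the length of the two lists (one entry per period).
    """
    if len(revenues) != len(costs):
        raise ValueError("revenues and costs must cover the same periods")
    return sum(
        (r - c) / (1 + discount_rate) ** t
        for t, (r, c) in enumerate(zip(revenues, costs), start=1)
    )

# Hypothetical customer: 3 periods, 100 revenue and 40 cost each, 10% discount rate.
clv = customer_lifetime_value([100, 100, 100], [40, 40, 40], 0.10)
```

A higher discount rate or a shorter horizon lowers the result, which is why both count as CLV components alongside costs and revenues.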

6
Q

Q6: Open data is data that

a) only a very specific set of people can access, use and share.
b) only the government can access, use and share.
c) anyone can access, use and share.
d) no one can access, use and share.

A

c) anyone can access, use and share.

7
Q

Q7: Which are key skills of a data scientist?

a) Quantitative skills
b) Business skills
c) Creativity
d) Programming skills
e) All of the above.

A

e) All of the above.

8
Q

Q8: What data relates to the core entities a company is working with such as customers, products, employees, suppliers and vendors?

a) Master data
b) Transactional data
c) Metadata
d) External data

A

a) Master data
Master data is the core data that is essential for running operations within a business enterprise or unit.
The most common categories are customer, supplier, product, and location.

9
Q

Q9: Customer journey analysis can be used to

a) get a clear and comprehensive picture of the overall process.
b) highlight process deficiencies such as excessive processing times, indicate deadlock situations, circular references, and unwanted customer leakage, among others.
c) verify if the process is compliant with both internal and external regulations.
d) all of the above.

A

d) all of the above.

10
Q

Q10: Which of the following is a characteristic of Big Data?

a) Volume
b) Velocity
c) Variety
d) Veracity
e) Value
f) All of the above

A

f) All of the above

11
Q

Q11: Which statement is NOT CORRECT?

a) Active churn implies that the customer stops the relationship with the firm.
b) Passive churn occurs when the customer stays with the firm but decreases the intensity of the relationship.
c) Forced churn implies that the customer stops the relationship with the company because its products or services are too expensive.
d) Expected churn occurs when the customer no longer needs the product or service.

A

c) Forced churn implies that the customer stops the relationship with the company because its products or services are too expensive.
Forced churn actually means that customers are involuntarily removed from a service or subscription, e.g. because of payment issues or a violation of the terms of service.

12
Q

Q12: The goal of machine learning models is to
a) complement human expert-based insights.
b) replace human expert-based insights.

A

a) complement human expert-based insights.

13
Q

Q13: In a recommender setting, recommending a product or service, which the user was not aware of and thus not looking for, but turns out to be very interesting to him/her is an example of

a) user interest.
b) serendipity.
c) simplicity.
d) item relevance.

A

b) serendipity.

a) User interest: the user is already aware of and looking for such recommendations.
b) Serendipity: the discovery of something interesting or valuable that the user was not looking for.
c) Simplicity: the ease of use or straightforwardness of a system.
d) Item relevance: the degree to which a recommended item is pertinent or suitable for the user based on their preferences.

14
Q

Q14: In fraud detection, the date and location of an accident picture is an example of:

a) master data.
b) transactional data.
c) metadata.
d) external data.

A

c) metadata.

15
Q

Q15: A web page is an example of

a) structured data.
b) unstructured data.
c) semi-structured data.
d) metadata.

A

c) semi-structured data.

16
Q

Q16: Which statement is NOT CORRECT?

a) In unsupervised machine learning, there is no target variable available. The idea is to find structure in the data. Popular examples are clustering, association rule mining and sequence rule mining. It is also referred to as descriptive analytics as the idea is to describe patterns in the data.

b) Usually, the machine learning or analytics step is the most complex and most time consuming. Estimates say that it takes about 80% of the total effort.

c) Once the data preprocessing step of the analytics process model is finished, the process proceeds with a data transformation step. Here, data can be aggregated such as from zip code to city, state or even country for example.

d) Supervised machine learning is characterized by the presence of a target variable. The idea is to relate or map predictor variables X to a target variable Y. Popular examples are churn prediction, fraud detection, response modeling, and credit risk modeling. It is also referred to as predictive analytics since the aim is to predict.

A

b) Usually, the machine learning or analytics step is the most complex and most time consuming. Estimates say that it takes about 80% of the total effort.

Data preprocessing, not the machine learning step, is the most time-consuming part of the process.
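The aggregation mentioned in statement c) (from zip code to city) can be sketched as follows. The lookup table `ZIP_TO_CITY`, the sample records, and the function name are hypothetical and serve only as an illustration. In practice the mapping would come from a reference dataset.

```python
from collections import defaultdict

# Hypothetical lookup table, for illustration only.
ZIP_TO_CITY = {"3000": "Leuven", "3001": "Leuven", "9000": "Ghent"}

def aggregate_by_city(records):
    """Roll up (zip_code, amount) records to city-level totals."""
    totals = defaultdict(float)
    for zip_code, amount in records:
        # Unknown zip codes are grouped under "unknown" rather than dropped.
        totals[ZIP_TO_CITY.get(zip_code, "unknown")] += amount
    return dict(totals)

city_totals = aggregate_by_city([("3000", 10.0), ("3001", 5.0), ("9000", 7.5)])
```

Aggregating to a coarser level reduces the number of distinct categories a model has to handle, at the cost of some detail.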

17
Q

Q17: The Pareto principle states

a) For many events, roughly 80% of the effects come from 20% of the causes.
b) For many events, roughly 90% of the effects come from 10% of the causes.
c) For many events, roughly 20% of the effects come from 80% of the causes.
d) For many events, roughly 10% of the effects come from 90% of the causes.

A

a) For many events, roughly 80% of the effects come from 20% of the causes.
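A quick way to check whether a dataset roughly follows the Pareto principle is to compute the share of the total contributed by the top 20% of causes. The sketch below assumes made-up customer revenues and a hypothetical helper `top_share`.

```python
def top_share(values, top_fraction=0.2):
    """Fraction of the total contributed by the largest `top_fraction` of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))  # at least one item
    return sum(ordered[:k]) / sum(ordered)

# Made-up revenue per customer (10 customers).
revenues = [500, 300, 50, 40, 30, 25, 20, 15, 10, 10]
share = top_share(revenues)
```

In this constructed example the two largest customers account for 800 of the 1000 total, so `share` is 0.8. Real data will only approximate the 80/20 split.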

18
Q

Q18: A Tweet is an example of

a) structured data.
b) unstructured data.
c) semi-structured data.
d) metadata.

A

b) unstructured data.

a) Structured data: e.g. ID, name, age.
b) Unstructured data: e.g. a social media post, an email, an article.
c) Semi-structured data: e.g. JSON, XML.
d) Metadata: e.g. the date and location of a digital photo.

19
Q

Q19: A churn prediction model essentially tries to predict which customers will

a) become fraudsters.
b) leave you or decrease their product/service usage.
c) turn into bad payers.
d) respond to your marketing campaign.

A

b) leave you or decrease their product/service usage.

20
Q

Q20: Which statement is NOT CORRECT?

a) The goal of response modeling is to model whether customers will respond to a marketing campaign or not.
b) The focus of response modeling can be on either customer acquisition or on deepening customer relationships by selling additional products or services to your existing customer portfolio.
c) Customer acquisition is a lot easier than customer retention.
d) Just as with churn prediction, response modeling essentially boils down to a binary classification task so many of the ideas of churn prediction also apply here.

A

c) Customer acquisition is a lot easier than customer retention.

21
Q

Q21: The ACFE, or Association of Certified Fraud Examiners, estimates that a typical organization loses

1% of its revenues to fraud each year.
5% of its revenues to fraud each year.
10% of its revenues to fraud each year.
20% of its revenues to fraud each year.

A

5% of its revenues to fraud each year.

22
Q

Q22: Credit bureaus are

data pooling organizations that gather default information from various financial institutions such as delinquency history, bureau checks, and bureau score.

governmental institutions that gather credit data at country level.

business units that develop credit scoring models.

consultancy firms that provide credit scoring solutions.

A

data pooling organizations that gather default information from various financial institutions such as delinquency history, bureau checks, and bureau score.

23
Q

Q23: Search data such as Google Trends can be used for nowcasting where the aim is to

forecast the past.
forecast the future.
forecast the present or near future.

A

forecast the present or near future.

24
Q

Q24: Customer journey analysis can be used to

get a clear and comprehensive picture of the overall process.

highlight process deficiencies such as excessive processing times, indicate deadlock situations, circular references, and unwanted customer leakage, among others.

verify if the process is compliant with both internal and external regulations.

all of the above.

A

all of the above.

25
Q

Q25: In a subscription or contractual setting, churn can be defined

as the customer explicitly cancelling the contract.
by the company itself.

A

as the customer explicitly cancelling the contract.

26
Q

Q26: A Point of Sale (POS) application typically gathers

Master data.
Transactional data.
Metadata.
External data.

A

Transactional data.

27
Q

Q27: Which statement about credit scoring is NOT CORRECT?

The aim of credit scoring is to come up with a statistically based decision model that makes it possible to score future credit applications and decide which ones to accept or reject.

A key assumption which is made when building a credit scoring model is that the future differs from the past.

In credit scoring, the target variable is binary.

Credit scoring is a key risk management tool for a bank to optimally manage, understand and model the credit risk it is exposed to.

A

A key assumption which is made when building a credit scoring model is that the future differs from the past.
The key assumption is actually that the future resembles the past.

28
Q

Q28: Which statement is NOT CORRECT?

The ultimate aim of churn prediction is to increase the number of long term loyal customers since these generate higher profits, tend to be less sensitive to competitive marketing activities, are typically less costly to serve, tend to generate positive word-of-mouth effect and thus result in a higher CLV.

Losing customers leads to opportunity costs because of the reduced sales. This is nicely illustrated by a quote from an article by Markey in Harvard Business Review: loyalty leaders grow revenues roughly 2.5 times as fast as their industry peers and deliver 2 to 5 times the shareholder returns over the next 10 years.

Significant improvements in customer retention typically generate small returns.

It is a well-known marketing fact that attracting new customers is 5 to 6 times more expensive than keeping existing clients satisfied through retention campaigns such as special customised offers.

A

Significant improvements in customer retention typically generate small returns.
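This statement is incorrect: as the other options indicate, retaining customers is cheaper than acquiring new ones and results in a higher CLV.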

29
Q

Q29: Which statement is NOT CORRECT?

Information retrieval retrieves documents or web pages based on search terms. An example of this is the Google search engine.

Text summarization summarizes text into a few concepts or keywords. This could be useful in a complaint analysis application.

Text clustering groups text into a set of overlapping clusters.

Text classification assigns text to a set of predefined categories. Think about spam filtering where emails are classified as spam or not based on their content.

Opinion mining distills customers’ opinions about a product or service.

A