Business Intelligence and Data Analytics Flashcards

1
Q

What are the benefits of these large data sets?

A

New, better, cheaper products and services, deep insight into behavior.

2
Q

What are the challenges associated with large data sets?

A

High complexity of behavior (more detail in the data, higher dimensionality); traditional analytics and engineering are insufficient.

3
Q

What is master data, and what are some examples?

A

Master data relates to core entities a company works with, such as customers, products, employees, suppliers, and vendors.

4
Q

What is transactional data, and what are some examples?

A

Transactional data captures timing, quantity, and items related to business activities. Examples include POS data, credit card transactions, money transfers, web visits, and RFM variables (Recency, Frequency, Monetary).

5
Q

What is external data, and what are some examples?

A

External data is collected from sources outside the organization. Examples include social media data, macroeconomic data, weather data, competitor data, search data, and web-scraped data.

6
Q

What is open data?

A

Open data is data that anyone can access, use, and share without copyright restrictions.

7
Q

What are the five V’s of big data?

A

Volume (data at rest), Velocity (data in motion), Variety (data in its many forms), Veracity (data in doubt), and Value.

8
Q

What is structured data, and what are some examples?

A

Structured data is organized in a predefined format, like a table. Examples include numbers and the name of a customer.

9
Q

What is unstructured data, and what are some examples?

A

Unstructured data is not organized in a predefined format. Examples include product reviews and Tweets.

10
Q

What is semi-structured data, and what are some examples?

A

Semi-structured data has some structure, but it’s not as rigid as structured data. Examples include web pages and resumes.

11
Q

What is metadata?

A

Metadata is data that describes other data, often referred to as “data about data.”

12
Q

What are some examples of metadata?

A

Data definitions, stored in the catalog of a DBMS. In fraud detection, metadata might include the date and location of an accident picture. In web analytics, metadata could include web visits and geospatial analysis.

13
Q

What kind of information do banks gather to assess credit risk?

A

Banks gather applicant information that helps predict an individual’s default behavior, such as date of birth, gender, income, and employment status.

14
Q

What is up-selling?

A

Up-selling is offering a customer a higher-priced or premium version of a product they were originally considering. For example, suggesting Westmalle instead of Stella Artois.

15
Q

What is cross-selling?

A

Cross-selling is offering a complementary product to a customer’s original purchase. For example, pairing cheese with Westmalle.

16
Q

What is down-selling?

A

Down-selling is offering a customer a lower-priced or less feature-rich version of a product when they can’t afford or don’t need the original choice. For example, suggesting fewer beers if a customer has already ordered too many.

17
Q

What is RFM analysis used for?

A

RFM is a well-known and well-developed measurement framework used in marketing across various industries. It helps monitor customers’ behavior and develop suitable CRM strategies.

18
Q

What is the focus of RFM analysis?

A

RFM analysis focuses on existing customers.

19
Q

How does RFM analysis summarize consumer purchasing habits?

A

RFM summarizes purchasing habits using three dimensions: Recency, Frequency, and Monetary.

20
Q

What is the Pareto principle, and how does it relate to RFM?

A

The Pareto principle states that 80% of effects come from 20% of causes. In RFM, this means 20% of customers generate 80% of profit.
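
A quick way to check whether a customer base follows the 80/20 pattern is to rank customers by profit and measure the share contributed by the top 20%. The sketch below uses made-up profit figures; the function name `top_share` and the data are illustrative assumptions, not part of the course material.

```python
def top_share(profits, fraction=0.2):
    """Share of total profit generated by the top `fraction` of customers."""
    ranked = sorted(profits, reverse=True)
    # Number of customers in the top group (at least one).
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical profit per customer (ten customers).
profits = [500, 300, 50, 40, 30, 30, 20, 15, 10, 5]
share = top_share(profits)
print(f"Top 20% of customers generate {share:.0%} of profit")
```

Real customer bases rarely split at exactly 80/20; the ratio is a rule of thumb rather than a law.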

21
Q

What are the three dimensions of RFM analysis?

A

Recency (time since most recent purchase), Frequency (total number of purchase transactions), and Monetary (value of purchases).
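
The three dimensions can be derived from a plain transaction log. The following is a minimal sketch, assuming a hypothetical list of `(customer_id, purchase_date, amount)` tuples and taking Monetary as the total amount spent (an average per purchase is another common choice).

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 20), 60.0),
    ("bob", date(2023, 11, 2), 15.0),
]

def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in transactions:
        # Track the latest purchase date, purchase count, and total spend.
        last, count, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }

scores = rfm(transactions, today=date(2024, 4, 1))
for customer, (recency, frequency, monetary) in scores.items():
    print(customer, recency, frequency, monetary)
```

In practice each dimension is often binned (for example into quintiles) before customers are segmented.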

22
Q

Retention Modeling and Churn Prediction

A

Retention modeling predicts which customers will leave or decrease their product/service usage.

23
Q

Why is customer retention important?

A

Long-term loyal customers generate higher profits, are less sensitive to competition, are less costly to serve, have a positive word-of-mouth effect, and have a higher CLV.

24
Q

What are the different types of churn?

A

Active churn (customer stops the relationship), passive churn (customer decreases the intensity of the relationship), forced churn (company stops the relationship), and expected churn (customer no longer needs the product/service).

25
Q

Transaction versus Relationship buyers

A

– transaction buyers: buy because of low price
– relationship buyers: build loyal relationship with firm

26
Q

Goal and Focus of response modelling

A

Goal: Model response to marketing campaigns.
Focus:
Customer acquisition.
Deepening customer relationships.

27
Q

Targeted Marketing

A

Targeted Marketing: Delivering marketing messages to specific individuals based on their characteristics and behaviors.
Classification Task: Predicting whether a customer will respond to a marketing campaign (e.g., purchase, click, unsubscribe)

28
Q

Implicit and Explicit Responses

A

Implicit Response: Actions that indicate interest but may not be direct purchases.
Reading an email
Clicking on a link
Downloading a product description
Configuring a product
Contacting customer service
Explicit Response: Direct actions related to purchasing and advocacy.
Purchase
Purchase and good review
Purchase, good review, and word-of-mouth

29
Q

Definition of Recommender Systems

A

Recommender Systems: Software tools and techniques that provide suggestions for items to be of use to a user.

30
Q

Data used in geospatial applications

A

Historical CRM Dataset: Contains information about customers, including payment history.
Open Data from StatBel: Provides socioeconomic data about different neighborhoods.

31
Q

Approach used in geospatial analysis

A

Neighborhood Characterization: Use open data to describe neighborhoods based on socioeconomic factors.
Geocoding: Match customer addresses to specific neighborhoods.
Model Development: Create a model to predict the likelihood of a customer becoming a “bad payer” based on neighborhood characteristics and other relevant factors.
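
The first two steps amount to a join between customer records and neighborhood statistics. The sketch below is only an illustration under strong assumptions: all tables, field names, and values are hypothetical, and a postal-code lookup stands in for real geocoding, which would normally resolve addresses to coordinates or statistical sectors.

```python
# Hypothetical open-data table: neighborhood id -> socioeconomic indicators.
neighborhoods = {
    "N1": {"median_income": 32000, "unemployment_rate": 0.04},
    "N2": {"median_income": 21000, "unemployment_rate": 0.11},
}

# Hypothetical geocoding lookup: postal code -> neighborhood id.
postal_to_neighborhood = {"3000": "N1", "1000": "N2"}

# Hypothetical CRM records with payment history.
customers = [
    {"id": 1, "postal_code": "3000", "late_payments": 0},
    {"id": 2, "postal_code": "1000", "late_payments": 3},
    {"id": 3, "postal_code": "9999", "late_payments": 1},
]

def build_features(customers):
    """Attach neighborhood indicators to each customer that can be geocoded."""
    rows = []
    for c in customers:
        hood = postal_to_neighborhood.get(c["postal_code"])
        if hood is None:
            continue  # address could not be matched to a neighborhood
        rows.append({
            "id": c["id"],
            "late_payments": c["late_payments"],
            **neighborhoods[hood],
        })
    return rows

features = build_features(customers)
print(features)
```

The resulting rows would then feed the model development step, with a "bad payer" label as the target.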

32
Q

Customer Lifetime Value (CLTV):

A

CLTV = Σ_{t=1}^{n} (R_t - C_t) * s_t / (1 + d)^t
Where:
n: Time horizon in months
R_t: Revenue in month t
C_t: Cost in month t
s_t: Survival probability in month t
d: Monthly discount rate
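
As a worked illustration, the per-month term can be summed over a horizon of several months. This is a minimal sketch that assumes the per-month values are summed from month 1 onward and that `d` is a monthly rate; the function name `cltv` and the input figures are made up.

```python
def cltv(revenues, costs, survival, d):
    """Sum the discounted, survival-weighted margin over months t = 1..n."""
    total = 0.0
    months = zip(revenues, costs, survival)
    for t, (r_t, c_t, s_t) in enumerate(months, start=1):
        total += (r_t - c_t) * s_t / (1 + d) ** t
    return total

# Hypothetical two-month horizon.
revenues = [100.0, 100.0]   # R_t
costs = [40.0, 40.0]        # C_t
survival = [1.0, 0.5]       # s_t
value = cltv(revenues, costs, survival, d=0.01)
print(round(value, 2))
```

With `d = 0` the example reduces to 60 + 30 = 90; any positive discount rate yields a smaller value.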

33
Q

In a subscription or contractual setting churn can be defined

A

as the customer explicitly cancelling the contract.

34
Q

Predicting which customers will leave you or decrease their product/service usage is called

A

churn prediction

35
Q

Customer journey analysis can be used to

A

get a clear and comprehensive picture of the overall process.
highlight process deficiencies such as excessive processing times, deadlock situations, circular references, and unwanted customer leakage, among others.
verify whether the process is compliant with both internal and external regulations.

36
Q

RFM analysis is sometimes referred to as

A

a poor man’s approach to customer lifetime value (CLV) analysis.

37
Q

The RFM framework is a well-known and well-developed measurement framework

A

The RFM framework is a well-known and well-developed measurement framework used in marketing across different industries such as banking, insurance, telco, non-profit, travel, online retail, and even government.

38
Q

A web page is an example of

A

semi-structured data.

39
Q

credit scoring

A

The aim of credit scoring is to build a statistically based decision model that scores future credit applications and decides which ones to accept or reject.

40
Q

In credit scoring, the target variable is

A

In credit scoring, the target variable is binary.

41
Q

Information retrieval

A

Information retrieval retrieves documents or web pages based on search terms. An example of this is the Google search engine.

42
Q

Text summarization

A

Text summarization summarizes text into a few concepts or keywords. This could be useful in a complaint analysis application.

43
Q

Text classification assigns

A

Text classification assigns text to a set of predefined categories. Think about spam filtering, where emails are classified as spam or not based on their content.
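
To make the idea concrete, here is a deliberately naive keyword-based classifier. It is a toy sketch only: the word list `SPAM_WORDS` and the threshold are invented for illustration, and real spam filters typically learn from labeled emails rather than rely on fixed rules.

```python
# Hypothetical list of words that hint at spam.
SPAM_WORDS = {"free", "winner", "prize"}

def classify(email_text, threshold=2):
    """Label an email 'spam' if it contains at least `threshold` spam words."""
    words = email_text.lower().split()
    # Strip trailing punctuation so "winner!" still matches "winner".
    hits = sum(1 for w in words if w.strip(".,!?") in SPAM_WORDS)
    return "spam" if hits >= threshold else "not spam"

print(classify("You are a WINNER! Claim your free prize"))
print(classify("Meeting moved to Friday"))
```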

44
Q

Search data such as Google Trends can be used for nowcasting where the aim is to

A

forecast the present or near future.

45
Q

Credit bureaus are

A

data pooling organizations that gather default information, such as delinquency history, bureau checks, and bureau scores, from various financial institutions.

46
Q

According to Van Vlasselaer and Baesens, what are key characteristics of fraud?

A

well-considered
imperceptibly concealed
time-evolving
carefully organized