M1 Flashcards

Question 1

Q

[ ] are fundamental components in various fields, providing tools for data interpretation and decision-making.

Answer

A

Statistical Analysis and Modeling (SAM)

Question 2

Q

[ ] primarily focuses on the collection, analysis, interpretation, presentation, and organization of data.

Answer

A

Statistics

Question 3

Q

[ ] provides foundational tools for understanding data distributions, variability, and relationships through methods such as hypothesis testing and regression analysis.

Answer

A

Statistics

Question 4

Q

[ ] encompasses a broader scope, integrating statistical methods with advanced computational techniques to derive insights from data.

Answer

A

Analytics

Question 5

Q

[ ] emphasizes predictive modeling, data mining, and the application of algorithms to inform strategic decisions and optimize processes.

Answer

A

Analytics

Question 6

Q

A [ ] is an interlinked set of activities that an organization performs to convert inputs to outputs that are valuable to a market.

Answer

A

value chain

Question 7

Q

Insights that prescribe direct and meaningful actions then [ ].

Answer

A

drive decision making

Question 8

Q

When is the value of data to organizations and their market realized?

Answer

A

When actions or decisions based on the data are implemented.

Question 9

Q

What are the five main data sources?

Answer

A

(TransData - ContractSub - Surve - DataPool - Unstruct)
- Transactional Data
- Contractual, Subscription, or Account Data
- Surveys
- Data Poolers
- Unstructured Data

Question 10

Q

This type of data source consists of structured, detailed information capturing the key characteristics of a transaction.

Answer

A

Transactional Data

Question 11

Q

This type of data source includes information about the type of product combined with customer characteristics.

Answer

A

Contractual, Subscription, or Account Data

Question 12

Q

This type of data source consists of questionnaires aimed at extracting sociodemographic and behavioral data from a particular group of people.

Answer

A

Surveys

Question 13

Q

These are companies that gather data in particular settings or for particular purposes and sell them to interested customers looking to enrich or extend their data sources.

Answer

A

Data Poolers

Question 14

Q

In the world of big data, this refers to information that does not reside in a traditional row-column database.

Answer

A

Unstructured Data

Question 15

Q

What are the phases of Data Analytics?

Answer

A

(Bu - Du - Dp - M - E - D)
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment

Question 16

Q

[ ] is knowing what the study is for or identifying a business task.

Answer

A

Business Understanding

Question 17

Q

[ ] is when you select the related data from many available databases to correctly describe a given business task; identifying relevant data for the problem description.

Answer

A

Data Understanding

Question 18

Q

[ ] is also known as data preprocessing.

Answer

A

Data Preparation

Question 19

Q

[ ] involves filtering, aggregating, and filling in (imputing) missing values.

Answer

A

Data Preparation
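
Example (a minimal sketch, not from the course material): the three preparation steps on a tiny hand-made data set. The records, the field names `region` and `sales`, the rule that negative sales are invalid, and the choice of mean imputation are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"region": "north", "sales": 120.0},
    {"region": "north", "sales": None},
    {"region": "south", "sales": 80.0},
    {"region": "south", "sales": 100.0},
    {"region": "east", "sales": -5.0},  # assumed invalid: negative sales
]

# Filter: drop rows whose value is present but invalid.
valid = [r for r in rows if r["sales"] is None or r["sales"] >= 0]

# Impute: fill missing values with the mean of the observed values.
observed = [r["sales"] for r in valid if r["sales"] is not None]
fill = mean(observed)
imputed = [
    {**r, "sales": fill if r["sales"] is None else r["sales"]}
    for r in valid
]

# Aggregate: total sales per region.
totals = {}
for r in imputed:
    totals[r["region"]] = totals.get(r["region"], 0.0) + r["sales"]
```

Mean imputation is only one option; median or model-based imputation may suit skewed data better.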

Question 20

Q

[ ] uses mathematical formulations to convert different measurements into a unified numerical scale.

Answer

A

Data Transformation

Question 21

Q

Transforming numerical to numerical scales [ ].

Answer

A

shrinks or enlarges the data
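
Example (a sketch under assumptions): two common numerical-to-numerical transformations, min-max scaling and z-score standardization. The card does not name a specific formula, so both are illustrative choices, and the `incomes` values are made up.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values to the range [0, 1]. Assumes values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation.
    Assumes values are not all equal (non-zero standard deviation)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

incomes = [20000, 35000, 50000, 80000]  # hypothetical data
scaled = min_max(incomes)
standardized = z_score(incomes)
```

Min-max shrinks the data into a fixed range, while z-scores express each value as a distance from the mean in standard deviations.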

Question 22

Q

Transforming categorical to numerical values can be [ ].

Answer

A

ordinal (less, moderate, strong) or nominal (red, yellow, blue).
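
Example (a sketch, using the category values from the card): ordinal categories can be mapped to their rank, while nominal categories have no order and are often encoded one-hot. One-hot encoding is an assumed technique here; the card does not prescribe one.

```python
# Ordinal: the order carries meaning, so map each value to its rank.
ORDER = ["less", "moderate", "strong"]

def encode_ordinal(value):
    return ORDER.index(value)

# Nominal: no inherent order, so use one indicator per category.
COLORS = ["red", "yellow", "blue"]

def encode_one_hot(value):
    return [1 if value == color else 0 for color in COLORS]

rank = encode_ordinal("moderate")
indicator = encode_one_hot("yellow")
```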

Question 23

Q

What are the two major categories of modeling?

Answer

A

Predictive Modeling
Descriptive Modeling

Question 24

Q

[ ] predicts the value of an attribute based on the values of other attributes.

Answer

A

Predictive Modeling

Question 25

Q

[ ] derives patterns that summarize the underlying relationships in the data.

Answer

A

Descriptive Modeling

Question 26

Q

[ ] summarizes the general characteristics or features of a target class of data.

Answer

A

Characterization

Question 27

Q

What are the methods of Characterization and Discrimination?

Answer

A

Data summaries based on statistical measures and plots.
User-controlled data summarization using OLAP, Excel, spreadsheets, SQL, Python, etc.
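
Example (a minimal sketch): a data summary based on statistical measures, computed with Python's standard `statistics` module. The `ages` list and the `summarize` helper are hypothetical, and the set of measures shown is an illustrative choice.

```python
from statistics import mean, median, pstdev

def summarize(values):
    """Return basic statistical measures for one attribute of a target class."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "stdev": pstdev(values),  # population standard deviation
    }

ages = [22, 25, 25, 30, 48]  # hypothetical target-class attribute
summary = summarize(ages)
```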

Question 28

Q

What are the outputs of Characterization and Discrimination?

Answer

A

pie charts, bar charts, curves, crosstabs
Characteristic Rules

Question 29

Q

[ ] compares the general features of the target class data objects against the general features of objects from one or multiple contrasting classes.

Answer

A

Discrimination

Question 30

Q

[ ] is the process of finding a model or function that describes and distinguishes classes or concepts.

Answer

A

Classification
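
Example (a sketch under assumptions): one very simple classification model is the 1-nearest-neighbor rule, which assigns a new object the class label of the most similar labeled example. The technique, the single numeric feature, and the labels are illustrative choices, not taken from the card.

```python
def nearest_neighbor(train, x):
    """Return the label of the training example whose feature is closest to x.

    train is a list of (feature, label) pairs with one numeric feature each.
    """
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label

# Hypothetical labeled data: monthly spend -> customer class.
train = [(10, "low"), (15, "low"), (80, "high"), (95, "high")]
predicted = nearest_neighbor(train, 70)
```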

Question 31

Q

[ ] is a statistical methodology that is most often used for numeric prediction.

Answer

A

Regression Analysis
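
Example (a minimal sketch): numeric prediction with simple linear regression fitted by ordinary least squares. The data points and the function name `fit_line` are hypothetical, and simple linear regression is only one form of regression analysis.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]  # generated from y = 2x + 1
slope, intercept = fit_line(xs, ys)

# Predict the numeric value for an unseen x.
prediction = slope * 5 + intercept
```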

Question 32

Q

[ ] is when objects are grouped based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity.

Answer

A

Clustering
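
Example (a sketch under assumptions): k-means is one common clustering method. The version below works on one-dimensional points with fixed starting centroids so the result is reproducible. The data, the starting centroids, and the iteration count are illustrative.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points around the nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
```

Points inside each resulting cluster are close to one another, while the two clusters are far apart.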

Question 33

Q

[ ] detects objects in data that do not follow norms; its methods may include statistical tests or distance measures.

Answer

A

Outlier Analysis
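
Example (a minimal sketch): a simple statistical check that flags values lying more than a chosen number of standard deviations from the mean. The threshold of 2.0 and the `daily_orders` data are assumptions for illustration. Distance-based methods are an alternative not shown here.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

daily_orders = [10, 11, 9, 10, 12, 10, 11, 50]  # hypothetical data
outliers = z_score_outliers(daily_orders)
```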

Question 34

Q

What are the two things to consider in the Data Interpretation stage?

Answer

A

How to recognize business value from knowledge patterns discovered.
How to visualize the results to properly interpret patterns.

Question 35

Q

A pattern is interesting if:

Answer

A

easily understood by humans
valid on new or test data with some degree of certainty
potentially useful
novel

Question 36

Q

What are the two primary techniques of descriptive analytics?

Answer

A

Data Aggregation
Data Presentation
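
Example (a sketch, not from the course material): the two techniques in sequence, first aggregating raw records into counts, then presenting them as a text bar chart. The `orders` data and the `bar_chart` helper are hypothetical.

```python
from collections import Counter

# Data aggregation: collapse raw records into counts per category.
orders = ["coffee", "tea", "coffee", "juice", "coffee", "tea"]
counts = Counter(orders)

# Data presentation: render the aggregated counts as a text bar chart.
def bar_chart(counts):
    width = max(len(name) for name in counts)
    return [
        f"{name.ljust(width)} | {'#' * n} {n}"
        for name, n in counts.most_common()
    ]

chart = bar_chart(counts)
for line in chart:
    print(line)
```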