M2 U1 - Distilling the Analytic Objective - Q1 Flashcards

Question 1

Q

At what point do you begin to define analytic objectives?

Answer

A

Once you have set the business goals.

Question 2

Q

Who is responsible for framing the project’s analytic objective?

Answer

A

The client and the data science team collaborate on all aspects of framing the analytic goal, with each side helping the other understand its business-related and technical components.

Question 3

Q

The problem (statement?) underlying an analytic objective should satisfy the following criteria:

Answer

A

Solving the problem must be part of a possible solution vision towards the business objective.
Data can facilitate that solution, and that data either exists or is feasible to gather.
The problem must be specific and realistic, and solving it must add enough value if executed properly.

Question 4

Q

What should be done before the team commits to fulfilling any analytical expectations on the part of the client?

Answer

A

Data science projects and consultations involve a preliminary data survey that informs, or even precedes, longer substantive discussion. This survey checks for:

Readiness of the organization
Technical objections relating to the data.
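
One kind of technical check a preliminary data survey might include is a quick look at row counts and missing values. The sketch below is only an illustration under that assumption: the function name `survey`, the sample columns, and the sample rows are all hypothetical and not part of the course material.

```python
import csv
import io


def survey(csv_text: str) -> dict:
    """Return the row count and the share of empty cells per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = rows[0].keys() if rows else []
    missing = {
        # A cell counts as missing when it is absent or only whitespace.
        col: sum(1 for r in rows if not (r[col] or "").strip()) / len(rows)
        for col in columns
    }
    return {"rows": len(rows), "missing_share": missing}


# Hypothetical sample data, invented for this sketch.
sample = "customer_id,age,churned\n1,34,no\n2,,yes\n3,51,\n4,29,no\n"
report = survey(sample)
print(report)
```

A high share of missing values in a column the task depends on would be one example of a technical objection worth raising before committing to the client's expectations.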

Question 5

Q

What is the main purpose of the problem statement in the analytic objective?

Answer

A

It focuses on specific steps necessary to achieve the business objective that can benefit from data-driven methods in the project.

Question 6

Q

Describe a common variant of classification

Answer

A

Sequence labeling, in which individual data points are not independent but form a series. (Typical examples of classification in general: sentiment analysis of text, labeling images as depicting certain objects such as animals or cars.)

Question 7

Q

What are the ways to characterize a data science task? (7)

Answer

A

Classification: Individual instances in the dataset have a categorical (i.e. non-numerical) label associated with them, or will be labeled as such. The goal of the project is to develop a system capable of categorizing data points using these labels. A common variant of classification is sequence labeling, where individual data points are not independent but form a series. Typical examples: sentiment analysis of text, labeling images as depicting certain objects (animals, cars, etc.).
Regression: Individual instances in the dataset have a numerical label associated with them whose magnitude carries a specific meaning. The goal of the project is to develop a model capable of predicting this target score for individual data instances. Like classification, regression is often applied over dependent series of data points. Typical examples: predicting measurements in medical or demographic data over time.
Retrieval & Ranking: The dataset can be thought of as one or more collections of data points and queries. In response to a query, one or more “ideal” data points should be retrieved and presented in a “correct” order. Typical example: search engines.
Recommendation: The dataset consists of users and items, as well as information about preferences of users for specific items. The task is to find items to recommend to users towards the maximization of some utility. Typical examples: movie and product recommendations.
Clustering: The dataset is assumed to have some latent structure which should be discovered by dividing data points into groups that are close to each other in variants of the feature space. Typical examples: exploratory analyses of unlabeled data.
Anomaly Detection: The dataset is assumed to have some latent structure which should be discovered in order to identify instances that do not adhere to the pattern. Typical example: fake customer review detection in online retail data.
Domain-specific tasks: Aside from the generic task types explained above, some domains have developed specific task patterns and associated evaluation metrics that should be used if the analytic goal is sufficiently specific. This is particularly true of natural language processing and image analysis. For example, machine translation will commonly be characterized as a text generation task and models will be evaluated using a specialized BLEU score. Similarly, models that segment a part of an image containing a specific object may be evaluated using average precision in conjunction with an “intersection over union” threshold.
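
The “intersection over union” measure mentioned in the last item can be illustrated with a small worked example. This is a minimal sketch that assumes axis-aligned rectangular boxes given as (x_min, y_min, x_max, y_max); segmentation masks would need a pixel-wise version. The function name, the example boxes, and the 0.5 threshold are assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def intersection_over_union(a: Box, b: Box) -> float:
    """Return area of overlap divided by area of union for two boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0.0, 0.0, 2.0, 2.0)      # hypothetical model output
ground_truth = (1.0, 1.0, 3.0, 3.0)   # hypothetical annotation
iou = intersection_over_union(predicted, ground_truth)
is_match = iou >= 0.5  # assumed threshold; the right value depends on the benchmark
print(iou, is_match)
```

Here the overlap has area 1 and the union has area 7, so the IoU is 1/7 and the prediction would not count as a match at the assumed threshold.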

Question 8

Q

Which primary role does the task statement play in the analytic objective?

Answer

A

It characterizes the task for purposes of planning and the nature of the eventual evaluation.

Question 9

Q

The statement of methods fits what criteria? (3)

Answer

A

It must be precise enough that the data science team understands it as a concise summary of the technical approach.
At the same time, it should allow for testing different techniques around the main conceptual idea.
The methods should be suitable to produce the target functionality/insight for the task given the available data.

Question 10

Q

The most general categories of methods one can identify in the methods statement typically include:

Answer

A

Supervised learning methods involve learning to predict a target variable (typically through regression or classification) by training on “true” example data points whose target variable has been labeled manually or is available by other means.
Unsupervised learning methods deal with finding patterns in unlabeled data without an explicit prediction target.
Semi-supervised learning methods are hybrid methods that combine supervised and unsupervised learning in different ways.
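
The difference between the first two categories can be sketched with toy one-dimensional data. This is an illustrative sketch only: the nearest-centroid classifier and the two-group k-means below are stand-ins chosen for brevity, and all names and data values are hypothetical. The `two_means` helper assumes the input contains at least two distinct values.

```python
from statistics import mean

# Supervised: labeled examples (feature, label) are available for training.
labeled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]


def fit_centroids(examples):
    """Learn one mean feature value per label from labeled data."""
    by_label = {}
    for x, label in examples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


centroids = fit_centroids(labeled)
prediction = predict(centroids, 7.0)


# Unsupervised: only features, no labels; look for two groups in the data.
def two_means(points, iterations=10):
    """Tiny 1-D k-means with k=2, initialized at the min and max values."""
    c0, c1 = min(points), max(points)
    for _ in range(iterations):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0, c1 = mean(g0), mean(g1)
    return sorted(g0), sorted(g1)


groups = two_means([1.0, 1.5, 8.0, 9.0, 1.2, 8.5])
print(prediction, groups)
```

The supervised part needs the labels to learn anything, while the unsupervised part finds the two groups from the feature values alone and leaves their interpretation to the analyst.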

Question 11

Q

What’s a good beginning strategy when formulating (evaluating?) proposed methods?

Answer

A

Check your proposed methods against the target functionality or insight by conceptually thinking through their application and explicitly formulating the expected results.

Question 12

Q

The statement of methods is important because:

Answer

A

As with the task and problem statements, formulating it helps you gain an understanding of the business needs.

Question 13

Q

Unsupervised learning methods are best applied to

Answer

A

Tasks where the outcome is not known.

Question 14

Q

What are the most important criteria for evaluating data?

Answer

A

The data is presumed to contain patterns that are informative for the analytic objective.
The data allows one to successfully tackle the proposed task and make progress towards solving the problem in a data-driven way.

Question 15

Q

How does data collection and curation relate to the field of data science?

Answer

A

Data collection and curation is a complex sub-discipline of data science and is as important as data analysis.

Question 16

Q

What is a constructive analytical objective?

Answer

A

It states that it is in principle possible to develop a desired functionality from the available methods and data, without the need to fully optimize its performance yet. One can think of it as a proof-of-concept or prototyping endeavor.

Question 17

Q

What’s different about analytic objectives in academic settings (as opposed to industry settings)?

Answer

A

In academic settings, the overarching interest may be that of advancing the state of the art in research, and hence the statement may either not include an explicit business objective or state it as a problem solution vision.

Question 18

Q

What are Benchmarking objectives?

Answer

A

In scenarios where the feasibility of an analytical task has been established, projects may be targeted towards improving over the state of the art in some performance metric by using innovative methods/features/data. This is typically the case if one works on leaderboard-type datasets where there are models.

Question 19

Q

What are Exploratory objectives?

Answer

A

Exploratory objectives are typically formed when data is available that is related to a problem of interest, but needs to be surveyed before it can be used in projects pursuing constructive or benchmarking objectives.

Seems like this is used to create something toward enabling something else in the future?

Question 20

Q

What are the elements of a well-framed analytic objective? (6)

Answer

A

Understanding the business objective
Identifying the problem
Focusing on a well-defined task
Checking proposed method against the target insight
Proposing data collection and curation methods
Framing the analytic objective

Question 21

Q

What should an analytic objective state? (3)

Answer

A

An analytic objective should state (1) what specific functionality, insight, or resource is gained from leveraging the data and methods you propose, (2) relative to the current situation, and (3) assuming the project is successful as proposed.

Question 22

Q

What’s the template for an analytic objective?

Answer

A

As an incremental step towards business objective O

We work towards solving problem P

by focusing on specific task T

and applying analytic methods M in conjunction with data D

to create valuable functionality F and/or produce insight I
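
The template above can be filled in mechanically once O, P, T, M, D, and F/I are known. The sketch below shows one way to do that; the class name `AnalyticObjective`, its field names, and the churn example values are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnalyticObjective:
    business_objective: str  # O
    problem: str             # P
    task: str                # T
    methods: str             # M
    data: str                # D
    outcome: str             # F and/or I

    def render(self) -> str:
        """Fill the template with the stored components."""
        return (
            f"As an incremental step towards {self.business_objective}, "
            f"we work towards solving {self.problem} "
            f"by focusing on {self.task} "
            f"and applying {self.methods} in conjunction with {self.data} "
            f"to create {self.outcome}."
        )


# Hypothetical example values, not taken from the course material.
objective = AnalyticObjective(
    business_objective="reducing customer churn",
    problem="the late detection of at-risk customers",
    task="classifying customers as likely or unlikely to cancel",
    methods="supervised learning methods",
    data="historical subscription records",
    outcome="a prototype churn-risk score",
)
statement = objective.render()
print(statement)
```

Writing the statement this way makes a missing component obvious, because every field has to be supplied before the sentence can be rendered.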

(22 cards)