Einstein Discovery Terms Flashcards

Question 1

Q

Licenses needed to use Einstein Discovery

Answer

A

CRM Analytics Plus or Einstein Predictions License

Question 2

Q

Permission sets for Einstein Discovery

Answer

A

CRM Analytics Plus User

CRM Analytics Plus Admin

Question 3

Q

What can the user permission set do?

Answer

A

Use Einstein Discovery - view created stories only
View Einstein Discovery recommendations.

Admin can do everything.

Question 4

Q

Descriptive

Answer

A

What happened in the historical data?

Question 5

Q

Diagnostic

Answer

A

Why did it happen?

Question 6

Q

Comparative

Answer

A

What is the difference between subgroups?

Question 7

Q

Predictive

Answer

A

What could happen?

Question 8

Q

Prescriptive

Answer

A

How can I improve the predicted outcome?

Question 9

Q

2 story types

Answer

A

Insights only

Insights and Predictions

Question 10

Q

Insights Only story type

Answer

A

Produces descriptive insights only

Question 11

Q

Insights and Predictions Story Type

Answer

A

Creates a model and produces all insight types

Question 12

Q

Einstein Discovery Use Cases

Answer

A

Numerical - numeric outcomes
Binary - two possible results represented as text data (yes/no)
Multiclass - predict outcomes with 3 to 10 possible results, represented as text data
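
A minimal sketch of how the three use cases above map onto an outcome column. `pick_use_case` is a hypothetical helper written for this card, not a Salesforce API; it only applies the thresholds listed above (numeric, 2 values, 3 to 10 values).

```python
def pick_use_case(outcome_values):
    """Return the use case suggested by a list of outcome values.

    Hypothetical helper for study purposes, not a Salesforce API.
    """
    # Ignore missing values when inspecting the outcome column.
    values = [v for v in outcome_values if v is not None]
    if not values:
        return "unsupported"
    # bool is a subclass of int in Python, so exclude it from the numeric check.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numeric"
    distinct = len(set(values))
    if distinct == 2:
        return "binary"
    if 3 <= distinct <= 10:
        return "multiclass"
    return "unsupported"


print(pick_use_case(["yes", "no", "yes"]))
print(pick_use_case([120.5, 80, 99]))
```

An outcome with a single distinct text value, or more than 10, falls outside all three use cases in this sketch.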

Question 13

Q

3 story templates

Answer

A

Maximize customer revenue
Maximize win rate
Minimize time to close

Question 14

Q

Einstein Discovery AI ethical features

Answer

A

Bias Detection - alerts you to bias

Model cards - document the models

Question 15

Q

CRM Analytics Dataset Limits - # Rows

Answer

A

Minimum Descriptive Insights: 50
Minimum Predictive Insights: 400

Max: 20M

Question 16

Q

CRM Analytics Dataset Limits - # Columns

Answer

A

Min: 3 (1 outcome, 2 dataset columns)
Max: 50
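
A small sketch that checks a dataset's shape against the row and column limits on the two dataset-limit cards. The constants are copied from those cards; `check_dataset` and its messages are hypothetical names for illustration, not part of CRM Analytics.

```python
# Limits as stated on the flashcards (rows and columns).
MIN_ROWS_DESCRIPTIVE = 50
MIN_ROWS_PREDICTIVE = 400
MAX_ROWS = 20_000_000
MIN_COLUMNS = 3   # 1 outcome column plus 2 other dataset columns
MAX_COLUMNS = 50


def check_dataset(rows, columns):
    """Return a list of limit problems; an empty list means the shape fits."""
    problems = []
    if rows < MIN_ROWS_DESCRIPTIVE:
        problems.append("too few rows for descriptive insights")
    if rows < MIN_ROWS_PREDICTIVE:
        problems.append("too few rows for predictive insights")
    if rows > MAX_ROWS:
        problems.append("too many rows")
    if columns < MIN_COLUMNS:
        problems.append("too few columns")
    if columns > MAX_COLUMNS:
        problems.append("too many columns")
    return problems


print(check_dataset(rows=300, columns=12))
```

With 300 rows the dataset clears the descriptive minimum but not the predictive one, so only that problem is reported.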

Question 17

Q

story creations per org per day

Question 18

Q

story creations per org per month

Question 19

Q

concurrent story creations per org

Question 20

Q

concurrent queries per user

Question 21

Q

queries per user per day

Question 22

Q

Prediction Limits for Automated Predictions

Answer

A

Models can be deployed to 5 Salesforce objects

10 active models deployed to a single prediction

Question 23

Q

Number of unique predictions on a single entity with automated prediction fields

Question 24

Q

Number of predictions requested per org per day using automated prediction fields

Question 25

Q

Number of records that can be scored per org per day

Question 26

Q

Number of active prediction definitions

Answer

A

CRM Analytics Plus License: Unlimited

Einstein Predictions License: 10

Question 27

Q

Number of active models deployed to a single prediction definition using manually configured prediction fields.

Question 28

Q

Number of Einstein Prediction Service API calls per org per day. The Usage Statistics chart displays the cumulative total for this metric.

Question 29

Q

Number of Einstein Prediction Service API requests per user per hour.

Question 30

Q

Number of concurrent Einstein Prediction Service API requests within an org.

Answer

A

Depends on how the model associated with the request was built:

Einstein Discovery-built models: 25
Externally-built models uploaded to Salesforce: 1

Question 31

Q

Multivalue fields in ED

Answer

A

Not supported in ED

Question 32

Q

Usage Statistics Stats

Answer

A

Number of predictions run today
Number of story versions created today
Number of concurrent stories that can be analyzed
Number of prediction API calls run today
Number of story versions created this month

Question 33

Q

Actionable Variable

Answer

A

Explanatory variable people can control.
If a variable is designated as actionable, the model uses prescriptive analytics to suggest actions the user can take to improve the predicted outcome.

Question 34

Q

Actual Outcome

Answer

A

An actual outcome is the real-world value of an observation’s outcome variable after the outcome has occurred. Einstein Discovery calculates model performance by comparing how closely predicted outcomes come to actual outcomes. An actual outcome is sometimes called an observed outcome.

Question 35

Q

Bias

Answer

A

Variables are being treated unequally in your model

Question 36

Q

Cardinality

Answer

A

Number of distinct values in a category.
ED supports 100 categories per variable.
Null values are put into a category called Unspecified.
Remaining categories (those with fewer than 25 observations) can be consolidated into an ‘Other’ category.
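
A rough sketch of the category handling described above: nulls become "Unspecified", and categories with fewer than 25 observations are folded into "Other". `consolidate` is a hypothetical function; how Einstein Discovery actually orders categories or counts "Other" toward the 100-category limit is an assumption here.

```python
from collections import Counter


def consolidate(values, min_obs=25, max_categories=100):
    """Relabel a categorical column using the thresholds from the card.

    Assumption: the most frequent categories are kept first, and
    "Unspecified" is treated like any other category.
    """
    # Nulls get their own category.
    labeled = ["Unspecified" if v is None else v for v in values]
    counts = Counter(labeled)
    # Keep categories with enough observations, most frequent first.
    frequent = [cat for cat, n in counts.most_common() if n >= min_obs]
    keep = set(frequent[:max_categories])
    return [v if v in keep else "Other" for v in labeled]


column = ["Gold"] * 30 + ["Silver"] * 5 + [None] * 26
print(Counter(consolidate(column)))
```

Here "Silver" has only 5 observations, so it ends up in "Other", while the 26 nulls stay together as "Unspecified".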

Question 37

Q

Categorical Variable

Answer

A

Represents qualitative values. The outcome variable of a binary or multiclass story is categorical.

Question 38

Q

Causation

Answer

A

Cause-and-effect relationship between variables. Not the same as correlation, which is only a statistical association between variables.

Question 39

Q

Diagnostic Insights

Answer

A

Insights derived from a model. Show ‘why’ it happened. Drill into correlated variables.

Question 40

Q

Disparate Impact

Answer

A

Data reflects discriminatory practices toward a particular demographic

Question 41

Q

Dominant values

Answer

A

Data is unbalanced. Most values are in same category.
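
A quick way to see how unbalanced a variable is: compute the share of rows held by its most common category. `dominant_share` is a hypothetical helper; the cutoff at which Einstein Discovery flags dominant values is not stated on this card, so none is assumed.

```python
from collections import Counter


def dominant_share(values):
    """Return (most common category, fraction of rows it covers)."""
    if not values:
        raise ValueError("values must not be empty")
    category, count = Counter(values).most_common(1)[0]
    return category, count / len(values)


print(dominant_share(["Open"] * 9 + ["Closed"]))
```

A share of exactly 1.0 means every row has the same value.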

Question 42

Q

Drift

Answer

A

Drift can occur due to changing factors in the data or in your business environment. Drift also results from now-obsolete assumptions built into the story on which the model is based. To remedy a model that has drifted, you can refresh it by adjusting story settings, retraining it on newer data, and redeploying it.

Question 43

Q

Duplicates

Answer

A

Two or more explanatory variables are highly correlated (ex: zip code and city). ED recommends choosing just one of them to improve results.

Question 44

Q

Explanatory Variable

Answer

A

Variable you explore to determine whether and to what degree it can influence the outcome variable.
Also called input variable, feature, predictor, or independent variable

Question 45

Q

Feature Selection

Answer

A

Picking the best explanatory variables in a story. Too few features could result in underfitting; too many could result in overfitting.

Select the most influential explanatory variables with no significant lurking variables.

Question 46

Q

First Order analysis

Answer

A

How one explanatory variable explains variation in the outcome variable. Also called bivariate analysis.

Question 47

Q

Generalized Linear Model

Answer

A

Regression-based model

Question 48

Q

Goal

Answer

A

Specifies the desired outcome for the story. Includes the story’s outcome variable plus your preferred direction for the outcome.

Question 49

Q

Gradient Boosting

Answer

A

Decision-Tree based algorithm.

Question 50

Q

Identical Values

Answer

A

All values for a variable belong to the same category

Question 51

Q

Improvement

Answer

A

Suggested action, based on prescriptive analytics, that the user can take to improve the likelihood of the desired outcome. Associated with actionable variables.

Question 52

Q

Imputation

Answer

A

Statistical technique for replacing missing numeric values with values derived from a subset of the data.
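
One common form of imputation fills missing numeric values with the mean of the observed ones. The sketch below shows only that variant; `impute_mean` is a hypothetical helper, and the exact method Einstein Discovery applies is not described on this card.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the non-missing values."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to derive a fill value from; return the column unchanged.
        return list(values)
    fill = mean(observed)
    return [fill if v is None else v for v in values]


print(impute_mean([1.0, None, 3.0]))
```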

Question 53

Q

Insight

Answer

A

Starting point for you to investigate the relationships among a story’s explanatory variables and its goal.

Question 54

Q

k-fold Cross-Validation

Answer

A

Model validation process in which Einstein Discovery randomly divides all the observations in the Analytics dataset into four separate partitions of equal size. Next, it completes four test passes (folds) in which three of the partitions serve as the training set and one partition serves as the test set. For each fold, Einstein Discovery compiles model metrics, then averages the metrics for all four folds.
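
The four-partition procedure above can be sketched in plain Python. `k_fold_splits` and `average_metric` are hypothetical names, and the shuffle and round-robin partitioning are assumptions for illustration; only the 4 folds, the 3-to-1 train/test split, and the averaging come from the card.

```python
import random


def k_fold_splits(n_observations, k=4, seed=0):
    """Return k (train_indices, test_indices) pairs over shuffled row indices."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so partition sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Every partition except the current one forms the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def average_metric(splits, score_fold):
    """Score each fold with score_fold(train, test) and average the results."""
    scores = [score_fold(train, test) for train, test in splits]
    return sum(scores) / len(scores)


splits = k_fold_splits(100)
print([(len(train), len(test)) for train, test in splits])
```

Each observation lands in the test set exactly once, so every row contributes to the averaged metric.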

Question 55

Q

Leakage

Answer

A

Leakage occurs when the data used to train your model includes one or more variables that contain the information that you are trying to predict. This can result in models that are extremely accurate when, in actuality, they are problematic. To remedy data leakage, remove any variables from your model that are causing the leakage.

Question 56

Q

Lurking Variable

Answer

A

A lurking variable is an explanatory variable that is missing from your story but which significantly explains variations in the outcome variable.

Question 57

Q

Modeling Algorithm

Answer

A

A modeling algorithm is what Einstein Discovery uses to create a model for a story. Einstein Discovery uses one of several algorithms: generalized linear model (GLM) is a regression-based algorithm, while gradient boosting machine (GBM) and XGBoost are decision tree-based machine learning algorithms.

Question 58

Q

Model Manager

Answer

A

The Model Manager is the Einstein Discovery tool used to manage predictions and models you have deployed.

Question 59

Q

Model Metrics

Answer

A

Model metrics describe the performance of the predictive model associated with your story. It provides metrics (quality indicators, which are sometimes called fit statistics) to show how well the model’s predictions fit the training data in the dataset. For definitions of quality indicators shown in the Model Metrics tabs, see Evaluate Model Quality.

Question 60

Q

Multiclass Classification Use Case

Answer

A

The multiclass classification use case addresses business outcomes that have between 3 and 10 outcome values, such as five possible service plans or eight possible insurance policies. Multiclass classification is one of the main use cases that Einstein Discovery supports. Compare with Binary Classification.

Question 61

Q

Noise

Answer

A

Any data that does not meaningfully explain variations in your outcome variable.

Question 62

Q

Overfitting

Answer

A

In predictive analytics, overfitting occurs when a model performs well in predicting outcomes on the training data in the dataset, but less well when predicting outcomes for other data, such as production data. Using too many explanatory variables can result in an overly complex predictive model that captures the noise in your data. To mitigate overfitting, Einstein Discovery uses ridge regression and regularization.
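
To see what the ridge penalty does, here is a toy one-variable ridge regression with no intercept. It minimizes the squared error plus `penalty * w**2`, which gives the closed form `w = sum(x*y) / (sum(x*x) + penalty)`. This is a textbook illustration, not Einstein Discovery's implementation; `ridge_slope` is a hypothetical name.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w of y ~ w * x minimizing sum((y - w*x)**2) + penalty * w**2.

    Assumes sum(x*x) + penalty is non-zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


xs, ys = [1, 2, 3], [2, 4, 6]
print(ridge_slope(xs, ys, penalty=0))   # ordinary least squares slope
print(ridge_slope(xs, ys, penalty=14))  # penalized slope, shrunk toward zero
```

A larger penalty pulls the coefficient toward zero, which limits how closely the model can chase noise in the training data.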

Question 63

Q

Second-Order Analysis

Answer

A

In an insight, a second-order analysis examines how the combination of two explanatory variables explains variation in the outcome variable. In second-order analysis, the combined impact of both variables together on the outcome is sometimes called the interaction effect. Second-order analysis is sometimes called multivariate analysis.
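
A small sketch contrasting first-order and second-order analysis: average the outcome by one explanatory variable, then by a pair. The rows, column names, and `mean_outcome_by` helper are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_outcome_by(rows, keys, outcome):
    """Average the outcome column for each combination of the key columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[outcome])
    return {group: mean(vals) for group, vals in groups.items()}


rows = [
    {"region": "West", "product": "A", "amount": 100},
    {"region": "West", "product": "B", "amount": 300},
    {"region": "East", "product": "A", "amount": 200},
    {"region": "East", "product": "B", "amount": 200},
]

first_order = mean_outcome_by(rows, ["region"], "amount")
second_order = mean_outcome_by(rows, ["region", "product"], "amount")
print(first_order)
print(second_order)
```

In this made-up data the regions look identical on their own, but the region and product combination reveals a difference, which is the kind of interaction effect second-order analysis surfaces.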