Einstein Discovery Terms Flashcards
Licenses needed to use Einstein Discovery
CRM Analytics Plus or Einstein Predictions License
Permission sets for Einstein Discovery
CRM Analytics Plus User
CRM Analytics Plus Admin
What can the user permission set do?
Use Einstein Discovery: can only view stories that have already been created
View Einstein Discovery recommendations.
Admin can do everything.
Descriptive
What happened in the historical data?
Diagnostic
Why did it happen?
Comparative
What is the difference between subgroups?
Predictive
What could happen?
Prescriptive
How can I improve the predicted outcome?
2 story types
Insights only
Insights and Predictions
Insights Only story type
produces Descriptive insights only
Insights and Predictions Story Type
Creates a model and produces all insight types
Einstein Discovery Use Cases
Numerical - numeric outcomes
Binary - two possible results represented as text data (yes/no)
Multiclass - predict outcomes with 3 to 10 possible results, represented as text data
3 story templates
Maximize customer revenue
Maximize win rate
Minimize time to close
Einstein Discovery AI ethical features
Bias Detection - alerts you to bias
Model cards - document the models
CRM Analytics Dataset Limits - # Rows
Minimum Descriptive Insights: 50
Minimum Predictive Insights: 400
Max: 20M
CRM Analytics Dataset Limits - # Columns
Min: 3 (1 outcome, 2 dataset columns)
Max: 50
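Study aid: the row and column limits on the two cards above can be turned into a quick pre-flight check. This is a minimal sketch, not a Salesforce API; the function and constant names are hypothetical, and the numbers are copied from the cards.

```python
# Limits copied from the flashcards; all names here are hypothetical.
MIN_ROWS_DESCRIPTIVE = 50
MIN_ROWS_PREDICTIVE = 400
MAX_ROWS = 20_000_000
MIN_COLUMNS = 3   # 1 outcome column plus 2 other dataset columns
MAX_COLUMNS = 50


def check_dataset_size(n_rows, n_columns):
    """Return a list of problems; an empty list means the size is within limits."""
    problems = []
    if n_rows < MIN_ROWS_DESCRIPTIVE:
        problems.append("too few rows for descriptive insights")
    if n_rows < MIN_ROWS_PREDICTIVE:
        problems.append("too few rows for predictive insights")
    if n_rows > MAX_ROWS:
        problems.append("too many rows")
    if n_columns < MIN_COLUMNS:
        problems.append("too few columns")
    if n_columns > MAX_COLUMNS:
        problems.append("too many columns")
    return problems
```

For example, a 100-row dataset passes the descriptive minimum but is flagged as too small for predictive insights.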
story creations per org per day
100
story creations per org per month
500
concurrent story creations per org
2
concurrent queries per user
10
queries per user per day
10,000
Prediction Limits for Automated Predictions
Models can be deployed to 5 Salesforce objects
10 active models deployed to a single prediction
Number of unique predictions on a single entity with automated prediction fields
3
Number of predictions requested per org per day using automated prediction fields
500K
Number of records that can be scored per org per day
1M
Number of active prediction definitions
CRM Analytics Plus License: Unlimited
Einstein Predictions License: 10
Number of active models deployed to a single prediction definition using manually configured prediction fields.
10
Number of Einstein Prediction Service API calls per org per day. The Usage Statistics chart displays the cumulative total for this metric.
50K
Number of Einstein Prediction Service API requests per user per hour.
2000
Number of concurrent Einstein Prediction Service API requests within an org.
Depends on how the model associated with the request was built:
Einstein Discovery-built models: 25
Externally-built models uploaded to Salesforce: 1
Multivalue fields in ED
Not supported in ED
Usage Statistics Stats
Number of predictions run today
Number of story versions created today
Number of concurrent stories that can be analyzed
Number of prediction API calls run today
Number of story versions created this month
Actionable Variable
Explanatory variable people can control.
If variable is designated as actionable, model uses prescriptive analytics to suggest actions user can take to improve predicted outcome
Actual Outcome
An actual outcome is the real-world value of an observation’s outcome variable after the outcome has occurred. Einstein Discovery calculates model performance by comparing how closely predicted outcomes come to actual outcomes. An actual outcome is sometimes called an observed outcome.
Bias
Variables are being treated unequally in your model
Cardinality
Number of distinct values in a category.
ED supports 100 categories per variable.
Null values are put into a category called Unspecified.
Remaining categories (those with fewer than 25 observations) can be consolidated into an ‘Other’ category.
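Study aid: the consolidation rules on this card can be sketched in a few lines. This is an illustration under assumptions, not how Einstein Discovery implements it: the function name is hypothetical, and treating ‘Unspecified’ like any other category when counting is a choice made only for this sketch.

```python
from collections import Counter

MAX_CATEGORIES = 100    # per-variable category limit from the card
MIN_OBSERVATIONS = 25   # categories below this count are folded into "Other"


def consolidate_categories(values):
    """Map nulls to 'Unspecified' and rare or overflow categories to 'Other'."""
    cleaned = ["Unspecified" if v is None or v == "" else v for v in values]
    counts = Counter(cleaned)
    # Keep the most frequent categories that also meet the observation threshold.
    # Assumption: 'Unspecified' is counted like any other category here.
    keep = {
        category
        for category, n in counts.most_common(MAX_CATEGORIES)
        if n >= MIN_OBSERVATIONS
    }
    return [v if v in keep else "Other" for v in cleaned]
```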
Categorical Variable
Represents qualitative values. A story with a binary or multiclass outcome has a categorical outcome variable.
Causation
A cause-and-effect relationship between variables. Not the same as correlation, which is only a statistical association between variables.
Diagnostic Insights
Insights derived from a model. Show ‘why’ it happened. Drill into correlated variables.
Disparate Impact
Data reflects discriminatory practices toward a particular demographic.
Dominant values
Data is unbalanced. Most values are in same category.
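Study aid: one way to spot a dominant value is to measure the share of rows that fall into the most common category. The card does not state a threshold, so this hypothetical helper only reports the share and leaves the cutoff to the reader.

```python
from collections import Counter


def dominant_share(values):
    """Return (most_common_value, share_of_rows) for one variable."""
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)
```

A share close to 1.0 means nearly every row is in the same category; a share of exactly 1.0 corresponds to the Identical Values case.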
Drift
Drift can occur due to changing factors in the data or in your business environment. Drift also results from now-obsolete assumptions built into the story on which the model is based. To remedy a model that has drifted, you can refresh it by adjusting story settings, retraining it on newer data, and redeploying it.
Duplicates
Two or more explanatory variables are highly correlated (e.g., Zipcode and City). ED recommends choosing just one of them to improve results.
Explanatory Variable
Variable you explore to determine whether, and to what degree, it can influence the outcome variable.
Also called input variable, feature, predictor, or independent variable
Feature Selection
Picking the best explanatory variables in a story. Too few features could result in underfitting; too many could result in overfitting.
Select the most influential explanatory variables with no significant lurking variables.
First Order analysis
How one explanatory variable explains variation in the outcome variable. Also called bivariate analysis.
Generalized Linear Model
Regression-based model
Goal
Specifies the desired outcome for the story. Includes the story’s outcome variable plus your preferred direction for the outcome.
Gradient Boosting
Decision-Tree based algorithm.
Identical Values
All values for a variable belong to the same category
Improvement
Suggested action, based on prescriptive analytics, that a user can take to improve the likelihood of the desired outcome. Associated with actionable variables.
Imputation
Statistical technique for replacing missing numeric values with values derived from a subset of the data.
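Study aid: a minimal sketch of imputation, assuming the "subset of data" is the group a row belongs to and the derived value is that group's mean. Both choices, and the function name, are assumptions for illustration; the card does not say which statistic or subset Einstein Discovery uses.

```python
from statistics import mean


def impute_by_group(rows, group_key, value_key):
    """Fill missing (None) numeric values with the mean of the same group."""
    observed = {}
    for row in rows:
        if row[value_key] is not None:
            observed.setdefault(row[group_key], []).append(row[value_key])
    group_means = {group: mean(vals) for group, vals in observed.items()}

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row[value_key] is None:
            # Stays None when every value in the group is missing.
            new_row[value_key] = group_means.get(new_row[group_key])
        filled.append(new_row)
    return filled
```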
Insight
Starting point for you to investigate the relationships among a story’s explanatory variables and its goal.
k-fold Cross-Validation
Model validation process in which Einstein Discovery randomly divides all the observations in the Analytics dataset into four separate partitions of equal size. Next, it completes four test passes (folds) in which three of the partitions serve as the training set and one partition serves as the test set. For each fold, Einstein Discovery compiles model metrics, then averages the metrics for all four folds.
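Study aid: the four-fold procedure described on this card can be sketched as follows. This is an illustration of the general technique, not Einstein Discovery's code; the function names, the `seed` parameter, and the `score_fold` callback are hypothetical. When the row count is not divisible by four, the partitions differ in size by at most one row.

```python
import random


def four_fold_splits(n_observations, seed=0):
    """Yield (train_indices, test_indices) for each of four folds."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    # A stride of 4 over the shuffled indices gives four near-equal partitions.
    partitions = [indices[i::4] for i in range(4)]
    for test_fold in range(4):
        test = partitions[test_fold]
        train = [
            idx
            for fold, part in enumerate(partitions)
            if fold != test_fold
            for idx in part
        ]
        yield train, test


def cross_validate(n_observations, score_fold):
    """Average the metric returned by score_fold(train, test) over four folds."""
    scores = [
        score_fold(train, test)
        for train, test in four_fold_splits(n_observations)
    ]
    return sum(scores) / len(scores)
```

Each observation lands in the test set exactly once and in the training set three times, which is why the averaged metric uses every row for evaluation.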
Leakage
Leakage occurs when the data used to train your model includes one or more variables that contain the information that you are trying to predict. This can result in models that are extremely accurate when, in actuality, they are problematic. To remedy data leakage, remove any variables from your model that are causing the leakage.
Lurking Variable
A lurking variable is an explanatory variable that is missing from your story but which significantly explains variations in the outcome variable.
Modeling Algorithm
A modeling algorithm is what Einstein Discovery uses to create a model for a story. Einstein Discovery uses one of several algorithms: generalized linear model (GLM) is a regression-based algorithm, while gradient boosting machine (GBM) and XGBoost are decision tree-based machine learning algorithms.
Model Manager
The Model Manager is the Einstein Discovery tool used to manage predictions and models you have deployed.
Model Metrics
Model metrics describe the performance of the predictive model associated with your story. They provide quality indicators (sometimes called fit statistics) that show how well the model’s predictions fit the training data in the dataset. For definitions of quality indicators shown in the Model Metrics tabs, see Evaluate Model Quality.
Multiclass Classification Use Case
The multiclass classification use case addresses business outcomes that have between 3 and 10 outcome values, such as five possible service plans or eight possible insurance policies. Multiclass classification is one of the main use cases that Einstein Discovery supports. Compare with Binary Classification.
Noise
Any data that does not meaningfully explain variations in your outcome variable.
Overfitting
In predictive analytics, overfitting occurs when a model performs well in predicting outcomes on the training data in the dataset, but less well when predicting outcomes for other data, such as production data. Using too many explanatory variables can result in an overly complex predictive model that captures the noise in your data. To mitigate overfitting, Einstein Discovery uses ridge regression and regularization.
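Study aid: to see why ridge regression counters overfitting, consider the simplest case of one explanatory variable. The penalty term shrinks the fitted slope toward zero, which makes the model less sensitive to noise. This is a textbook sketch with a hypothetical function name, not Einstein Discovery's implementation.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form ridge slope for one explanatory variable (intercept unpenalized)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    # penalty = 0 gives ordinary least squares; larger values shrink the slope.
    return sxy / (sxx + penalty)
```

With `xs = [1, 2, 3, 4]` and `ys = [2, 4, 6, 8]`, a penalty of 0 recovers the exact slope of 2, while a penalty of 5 halves it to 1.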
Second-Order Analysis
In an insight, a second-order analysis examines how the combination of two explanatory variables explains variation in the outcome variable. In second-order analysis, the combined impact of both variables together on the outcome is sometimes called the interaction effect. Second-order analysis is sometimes called multivariate analysis.
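Study aid: a rough way to picture second-order analysis is to average the outcome for every combination of two explanatory variables and compare that with the averages for each variable on its own. The helper below is a hypothetical sketch of that grouping step only; it does not reproduce how Einstein Discovery computes insights.

```python
from collections import defaultdict


def mean_outcome_by(rows, keys, outcome):
    """Average the outcome for each combination of the given variables."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group][0] += row[outcome]
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}
```

Calling it with one key, such as `("region",)`, gives a first-order view; calling it with two keys, such as `("region", "product")`, gives the per-combination averages in which an interaction effect would show up.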