CAP Model Building Flashcards
Gain
Cumulative expected response using predictive model over expected response using random selection
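As a minimal sketch of one reading of this definition, the ratio below compares the response rate among the top-scored records with the overall response rate, which is what random selection would be expected to yield. The function name, the 0/1 labels, and the scores are illustrative assumptions. Some texts instead define cumulative gain as the share of all responders captured.

```python
def cumulative_gain_ratio(actual, scores, top_n):
    """Response rate in the top_n highest-scored records divided by
    the overall response rate (the expected rate under random selection)."""
    # Rank the 0/1 outcomes by model score, highest score first.
    ranked = [a for _, a in sorted(zip(scores, actual),
                                   key=lambda pair: pair[0],
                                   reverse=True)]
    model_rate = sum(ranked[:top_n]) / top_n
    random_rate = sum(actual) / len(actual)
    return model_rate / random_rate


# Hypothetical data: 1 = responded, 0 = did not respond.
actual = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.2, 0.85, 0.6, 0.1, 0.7, 0.3, 0.4, 0.05, 0.5]

gain_top4 = cumulative_gain_ratio(actual, scores, top_n=4)
print(round(gain_top4, 3))
```

Selecting every record gives a ratio of 1, since the model can no longer do better than random selection.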
Non-parametric modeling
A way to represent a system where no finite set of factors can fully describe the system’s performance independent of observed data
Diagnostic modeling
A way to analyze data to identify business needs
Iterative (agile) method
Method under which requirements and solutions evolve through the collaborative effort of self-organizing and cross-functional teams and their customer(s)/end user(s)
Prescriptive modeling
A way to analyze data to determine the best way to proceed in the future
Model
An abstraction that emphasizes certain aspects of reality to assess or understand the behavior of a system under study; the system may be physical, logical, mathematical, or some other representation of reality, such as an enterprise or some portion of one.
Model reliability
Ability of a model to produce consistent results
Parallel method
Method in which there is a need for separate development paths to diverge from a common starting point so that there are two or more concurrent “latest” configurations
Confusion matrix
A table with two rows and two columns that reports the number of false positives, false negatives, true positives, and true negatives
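The four cells can be counted directly from paired labels. This is a small sketch assuming binary labels encoded as 0/1; the function name and sample data are made up for illustration.

```python
def confusion_matrix(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    return tp, fp, fn, tn


# Hypothetical labels for eight records.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_matrix(actual, predicted)
print(f"TP={tp} FP={fp}")
print(f"FN={fn} TN={tn}")
```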
Proprietary code
Non-free computer software for which the software’s publisher or another person retains intellectual property rights—usually copyright of the source code, but sometimes patent rights
Automated modeling
Where a computer develops a representation of a process without the user describing the nature of the observed data
Steps in building predictive model
1) Understand the problem and data 2) Explore and clean the data 3) Feature extraction and/or selection 4) Model evaluation and selection 5) Model optimization 6) Interpretation of results and predictions
Predictive modeling
A way to use factors to forecast outcomes of events
Deterministic modeling
A descriptive or predictive model whose performance can be described without any random variation
Model extensibility
Ability of a model to be adapted to varying operation modes
Bias-variance trade-off
Balance model complexity: using more features tends to reduce bias but increase variance, while using fewer features tends to reduce variance but increase bias
Personally Identifiable Information (PII)
Any information relating to a specific identifiable person
Document and communicate findings
Tailor message to audience; avoid message distortion to non-technical audiences; use graphics to simplify results and uncover patterns
Model scalability
Ability of a model to perform well under different magnitudes of data volume
Waterfall method
Method in which the systems development life cycle tasks occur sequentially, with one activity starting only after the previous one has been completed.
Model fidelity
The degree to which a model or simulation reproduces the state and behaviour of a real world object, feature or condition; encourages parsimony of parameters
Model stability
A notion in computational learning theory of how much the output of a machine learning algorithm is perturbed by small changes to its inputs
Conway’s Law
Organizations which design systems are constrained to produce designs which are copies of the communication structures of those organizations. SE management should facilitate communications, streamline controls, and simplify paperwork.
Receiver Operating Characteristic (ROC)
Method to assess classifier model by comparing true positive rate to false positive rate as classifier’s discrimination threshold is varied
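The threshold sweep can be sketched as follows, assuming 0/1 labels and a score where higher means "more likely positive". The function name, data, and threshold values are illustrative assumptions.

```python
def roc_points(actual, scores, thresholds):
    """Return a list of (false positive rate, true positive rate) pairs,
    one per threshold. A record is predicted positive when score >= threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for a, s in zip(actual, scores) if a == 1 and s >= t)
        fp = sum(1 for a, s in zip(actual, scores) if a == 0 and s >= t)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical labels and classifier scores.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = roc_points(actual, scores, thresholds=[1.0, 0.75, 0.5, 0.0])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f} TPR={tpr:.2f}")
```

Plotting these pairs with the false positive rate on the horizontal axis gives the ROC curve. A threshold above every score yields (0, 0), and a threshold at or below every score yields (1, 1).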
Tips for communication
Describe not just what you did, but why you did it, how the steps are connected, and what it all means.
Supervised modeling
Building a representation of a process with user input describing the groups of different observations or records
Parametric modeling
A way to represent a system where a finite set of factors describe the system’s performance independent of observed data
Descriptive modeling
A way to describe real world events and the relationships between factors responsible for them
Multiple use modeling
A representation of events that can be applied to more than one case study
Coefficient of determination
R^2 = 1 - (SS_res/SS_tot), the proportion of variation in the data that is explained by the model
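A short worked sketch of the formula, where the numerator is the residual sum of squares (observed minus predicted) and SS_tot is the total sum of squares (observed minus the mean of the observed values). The data values are made up for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    # Total sum of squares: variation of the data around its mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observations and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(observed, predicted)
print(round(r2, 4))
```

A perfect fit gives 1, and a model that always predicts the mean of the observed values gives 0.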
Interpretability
Ease with which an analyst can describe the model’s decision-making method to stakeholders
Sensitivity analysis uses
Testing the model for validity or accuracy; searching for errors in the model; simplifying the model; calibrating the model; coping with poor or missing data; prioritizing acquisition of information
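One simple form of sensitivity analysis is a one-at-a-time perturbation: nudge each input by a small percentage and record the relative change in the output. The sketch below assumes that approach. The `profit` model, its parameter names, and the baseline values are hypothetical.

```python
def one_at_a_time_sensitivity(model, baseline, delta=0.01):
    """Increase each input by a fraction `delta`, one at a time, and
    return the relative change in model output for each input."""
    base_output = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1 + delta)
        effects[name] = (model(**perturbed) - base_output) / base_output
    return effects


def profit(price, units, unit_cost):
    """Hypothetical toy model used only for illustration."""
    return (price - unit_cost) * units


effects = one_at_a_time_sensitivity(
    profit, {"price": 10.0, "units": 100.0, "unit_cost": 6.0}
)
for name, effect in effects.items():
    print(f"{name}: {effect:+.3%}")
```

Inputs with the largest absolute effect are natural candidates when prioritizing which information to acquire or check first. This approach ignores interactions between inputs.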
Lift
Expected response of an interval of data selected using predictive model over expected response of interval selected using random selection
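As a rough sketch, lift per interval can be computed by ranking records by model score, splitting them into equal-sized intervals, and dividing each interval's response rate by the overall response rate. The function name and data are illustrative, and the code assumes the record count divides evenly by the number of intervals.

```python
def lift_by_interval(actual, scores, n_intervals):
    """Return the lift of each score-ranked interval: the interval's
    response rate divided by the overall response rate.
    Assumes len(actual) is divisible by n_intervals."""
    ranked = [a for _, a in sorted(zip(scores, actual),
                                   key=lambda pair: pair[0],
                                   reverse=True)]
    size = len(ranked) // n_intervals
    overall_rate = sum(actual) / len(actual)
    lifts = []
    for i in range(n_intervals):
        chunk = ranked[i * size:(i + 1) * size]
        lifts.append((sum(chunk) / len(chunk)) / overall_rate)
    return lifts


# Hypothetical data: 1 = responded, 0 = did not respond.
actual = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.2, 0.85, 0.6, 0.1, 0.7, 0.3, 0.4, 0.05, 0.5]

lifts = lift_by_interval(actual, scores, n_intervals=5)
print([round(x, 2) for x in lifts])
```

A lift above 1 means the interval responds better than a randomly selected interval of the same size would be expected to.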
Static (snapshot) modeling
Modeling a system for purpose of evaluating performance at a specific point in time
Stochastic modeling
A descriptive or predictive probability model yielding a location or time sequence representing the state of a system that is subject to random variation
Dynamic (movie) modeling
Modeling a system for purpose of evaluating performance change over time