6 - Building Your First Model Flashcards

1
Q

What is prior authorization?

A

The process in which an insurance company requires a physician to get clearance for reimbursement before providing a service or procedure.

Prior authorization is used to ensure reimbursement for medically necessary services and to control costs.

2
Q

Why do insurers use prior authorizations?

A

To ensure reimbursement for medically necessary services and to control costs by adding friction to the system for physicians and patients.

3
Q

What is a significant issue with current prior authorization processes?

A

There are too many false positives and false negatives regarding surgery approvals, leading to higher costs and complications for patients.

Patients who need surgery may be denied, while those who don’t need it may be approved.

4
Q

What role does data science play in improving prior authorization decisions?

A

Data science can empower decision-makers with data to make better choices about surgery approvals and denials.

5
Q

What is the challenge faced by reviewers in the prior authorization process?

A

Reviewers have difficulty predicting whether a patient would benefit from nonsurgical treatment options.

6
Q

What is the main goal of the model that David and the team want to build?

A

To accurately determine which patients would benefit from nonsurgical treatment options for back pain.

7
Q

What is the difference between prediction and explanation in data modeling?

A

Prediction focuses on accurately forecasting outcomes, while explanation seeks to understand the variables leading to those outcomes.

8
Q

What is meant by defining the outcome in a predictive model?

A

The outcome is the quantity that the model aims to predict, such as a patient’s chance of success on a nonsurgical treatment plan.

9
Q

How did the team propose to measure success for nonsurgical treatment?

A

By predicting health care utilization related to back pain, such as medications and physical therapy, rather than just recovery from pain.

10
Q

What potential bias did Jenna raise regarding health care utilization predictions?

A

Patients may have lower utilization due to not seeking care, which does not necessarily mean their back pain is well managed.

11
Q

What is feature engineering?

A

The process of creating variables to feed into a predictive model.

12
Q

What type of data will the model use?

A

Tabular data that can be organized in spreadsheet format.

13
Q

What temporal restriction is important when building the model?

A

The model should use only data collected before the prior authorization request, so that the same data will actually be available when the model is applied in practice.

14
Q

What are catastrophic events mentioned as potential outcomes?

A

Hospitalizations or deaths from complications, which are significant for both patient welfare and cost considerations.

15
Q

What time frame did Jenna suggest for measuring health care expenditure?

A

Three years after the prior authorization request for surgery.

16
Q

What is the significance of defining specific outcomes for the model?

A

Specific outcomes help focus the model on what is actually being measured, improving its accuracy and relevance.

17
Q

True or False: The model can use any type of data for predictions.

A

False.

The model requires data to be in a tabular format, which excludes unstructured text data unless it is first engineered into tabular features.

18
Q

What is the primary model outcome discussed in the meeting?

A

Health care utilization related to back pain.

19
Q

What is the potential risk of using health care expenditure as an outcome measure?

A

It may not accurately reflect a patient’s condition if they do not seek care, leading to misleading conclusions.

20
Q

What did David suggest about the prediction window for health care utilization?

A

To try a range of prediction windows from one month to one year to see how model accuracy changes.

21
Q

What was Kamala’s concern about older patients in the model?

A

Older patients may die soon after submitting a request, potentially skewing health care expenditure predictions.

22
Q

What is feature engineering?

A

The process by which we create variables to feed into our model from other data sources.

23
Q

How can clinician’s notes be utilized in modeling?

A

Feature engineering techniques from natural language processing can convert free text into a table of numbers.
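
One way to picture this (a minimal sketch using scikit-learn's bag-of-words vectorizer; the note snippets are invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical free-text clinician notes (invented examples)
notes = [
    "chronic low back pain radiating to the left leg, trialed NSAIDs",
    "acute back pain after lifting, no radiculopathy, started physical therapy",
    "back pain with numbness, MRI shows disc herniation at L4-L5",
]

# Bag-of-words: each note becomes a row of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(notes)

# X is now a table of numbers: rows are notes, columns are words
print(vectorizer.get_feature_names_out())
print(X.toarray())
```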

24
Q

What is optical character recognition used for?

A

To extract numerical and text data from scanned PDFs, turning them into machine-readable format.

25
What must be considered before using scanned PDF data in modeling?
Whether all existing tabular data has been exhausted; scanned PDFs are usually worth the extra effort only after simpler data sources have been used.
26
What is an example of creating a feature from claims data?
Counting the total number of times a patient was hospitalized in the year before the prior authorization submission.
27
What are some metrics that can be created from existing claims data?
* Total number of hospital visits
* Total number of medications prescribed
* Average number of claims per month
* Rate of change in prescription drug use
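A rough sketch of how such features could be computed with pandas; the table, column names, and dates are made up for illustration, and the count is restricted to the year before the prior authorization date:

```python
import pandas as pd

# Hypothetical claims table (column names and values are invented)
claims = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "claim_date": pd.to_datetime(
        ["2022-03-01", "2022-07-15", "2021-01-10", "2022-05-20", "2022-06-02"]),
    "claim_type": ["hospitalization", "medication", "hospitalization",
                   "medication", "hospitalization"],
})
prior_auth_date = pd.to_datetime("2022-09-01")

# Temporal restriction: keep only claims from the year before the request
window = claims[
    (claims["claim_date"] < prior_auth_date)
    & (claims["claim_date"] >= prior_auth_date - pd.DateOffset(years=1))
]

# Engineered features, one row per patient
features = window.groupby("patient_id").agg(
    n_hospitalizations=("claim_type", lambda s: (s == "hospitalization").sum()),
    n_medications=("claim_type", lambda s: (s == "medication").sum()),
    claims_per_month=("claim_date", lambda s: len(s) / 12),
)
print(features)
```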
28
Is there a specific number of features needed for a model?
There’s no set rule; the goal is to find the most highly predictive features.
29
What is the role of the model in feature selection?
To identify statistical relationships between variables and find the strongest predictors.
30
True or False: A model can become confused if there are too many features.
True.
31
What is feature selection?
The process of identifying which features to include in your model.
32
What can happen if irrelevant variables are included in a model?
It can add more noise, resulting in worse predictions.
33
What is a common filtering method for variable selection?
Selecting variables that have a high correlation with the outcome variable.
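A small sketch of this filtering step, using a synthetic feature table in pandas:

```python
import numpy as np
import pandas as pd

# Synthetic feature table; the outcome depends mainly on feature_0 and feature_1
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 5)),
                  columns=[f"feature_{i}" for i in range(5)])
df["outcome"] = 2 * df["feature_0"] - df["feature_1"] + rng.normal(size=200)

# Filtering: rank features by absolute correlation with the outcome
correlations = df.drop(columns="outcome").corrwith(df["outcome"]).abs()
print(correlations.sort_values(ascending=False).round(2))

# Keep, say, the two most correlated features
selected = correlations.sort_values(ascending=False).head(2).index.tolist()
print("Selected:", selected)
```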
34
What are wrappers in feature selection?
Methods like forward selection, backward selection, and stepwise selection that examine how useful each feature is.
35
What does forward selection do?
Adds the single most predictive variable first, then keeps adding variables one at a time until none of the remaining variables adds significant predictive value.
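A bare-bones illustration of the forward-selection idea (not the book's exact procedure), using cross-validated R-squared to decide whether adding another feature still helps:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=8, n_informative=3,
                       noise=10, random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_score = -np.inf
while remaining:
    # Try adding each remaining feature and keep the one that helps most
    scores = {
        f: cross_val_score(LinearRegression(), X[:, selected + [f]], y, cv=5).mean()
        for f in remaining
    }
    best_feature, score = max(scores.items(), key=lambda kv: kv[1])
    if score <= best_score + 1e-4:  # stop when no variable adds a meaningful gain
        break
    selected.append(best_feature)
    remaining.remove(best_feature)
    best_score = score

print("Features chosen by forward selection:", selected)
```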
36
What is an embedded approach in feature selection?
A method that automatically includes variable selection as part of the model’s optimization algorithm.
37
What is the LASSO algorithm?
A variation of regression that minimizes the sum of the squared residuals plus a penalty weight times the sum of the absolute values of the regression coefficients.
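A quick sketch with scikit-learn's Lasso; the penalty weight alpha below is arbitrary:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5, random_state=0)

# Roughly: minimize (squared-error term) + alpha * sum(|coefficients|).
# The penalty pushes the coefficients of unhelpful features to exactly zero,
# so variable selection happens as part of fitting the model.
lasso = Lasso(alpha=1.0).fit(X, y)
print("Coefficients:", lasso.coef_.round(2))
print("Features kept:", [i for i, c in enumerate(lasso.coef_) if c != 0])
```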
38
What is the main goal of model training?
To feed the model data and allow it to learn patterns.
39
What does model testing involve?
Showing previously unseen data to the model and evaluating its predictions.
40
How does a model identify statistical patterns?
Using statistical associations and correlations from the training data.
41
What is the concept of fitting a trend line in modeling?
Finding the optimal values of parameters that best fit the data points.
42
What is the equation of a line used in linear prediction models?
Y = mX + b, where m is the slope and b is the y-intercept.
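A toy example of fitting m and b with numpy; the data are simulated from a known line plus noise:

```python
import numpy as np

# Simulated data from Y = 2X + 1 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
Y = 2 * X + 1 + rng.normal(scale=1.0, size=50)

# Training: find the slope m and intercept b that best fit the points
m, b = np.polyfit(X, Y, deg=1)
print(f"estimated slope m = {m:.2f}, intercept b = {b:.2f}")

# The fitted line can then predict Y for a new X
print("prediction at X = 5:", round(m * 5 + b, 2))
```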
43
What happens if the model encounters a data point that is very different from the training data?
It may perform poorly due to lack of representative training data.
44
What happens if the training data is not representative of real-life situations?
The model may perform poorly in unexpected scenarios. For example, if a model is trained only on pictures of cats and dogs, it may struggle with pictures of hippos.
45
How can we ensure our model is robust?
By making our training data as representative as possible and filtering out dissimilar data points. This helps the model learn from a diverse array of cases.
46
What is the trade-off when deciding how much data to use for training versus testing?
Using more data for training generally improves the fitted model but leaves less data for testing, which makes the performance evaluation less reliable. A common rule of thumb is to reserve 20 to 40 percent of the data for testing.
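A minimal train/test split with scikit-learn, holding out 30 percent (within the 20-to-40 percent rule of thumb); the data are synthetic:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=10, random_state=0)

# Hold out 30% of the data as a test set the model never sees during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("R^2 on training data:", round(model.score(X_train, y_train), 3))
print("R^2 on held-out test data:", round(model.score(X_test, y_test), 3))
```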
47
How does the size of the data set affect the amount reserved for testing?
In huge data sets, just 1 percent may be sufficient as a representative test set. This allows for effective evaluation without needing a large portion of the data.
48
What should be considered when handling new patient data?
The model's ability to generalize to new populations that may differ from the training population. For example, coal miners may have different health care utilization patterns than the general population.
49
Can we retrain the model if it performs poorly on testing?
Yes, but tweaking the model after testing can lead to overfitting. This is similar to studying for a practice test and then not performing well on the actual test.
50
What is overfitting?
Optimizing the model to perform well on test data at the expense of generalizing to other data sets. This can lead to poor performance when applied to new, unseen data.
51
What is the benefit of using cross-validation?
It reduces the likelihood of overfitting by testing the model on multiple validation sets. This involves splitting the data into folds and validating across them.
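A sketch of k-fold cross-validation with scikit-learn; five folds is just a common default:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=500, n_features=5, noise=10, random_state=0)

# Split the data into 5 folds; each fold takes one turn as the validation set
scores = cross_val_score(LinearRegression(), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
print("Validation R^2 per fold:", scores.round(3))
print("Average across folds:", round(scores.mean(), 3))
```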
52
What are the three main levers to improve model performance?
* Changing the data used by the model
* Changing the type of model
* Tuning hyperparameters

These adjustments can significantly affect prediction accuracy.
53
What is a hyperparameter?
A setting knob on a model that can be tuned to optimize performance. For example, the time spent studying a page in a textbook can be viewed as a hyperparameter.
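For instance, the LASSO penalty alpha is a hyperparameter; a grid search over a handful of candidate values (arbitrary here) is one common way to tune it:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=300, n_features=10, n_informative=3,
                       noise=5, random_state=0)

# Try several settings of the alpha "knob" and keep the one that
# cross-validates best
search = GridSearchCV(Lasso(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print("Best alpha:", search.best_params_["alpha"])
```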
54
How do we decide when the model is performing well enough?
It depends on how the model will be used and the required accuracy for practical applications. This is often determined by stakeholder needs.
55
What is the mean absolute error?
It is the average absolute difference between predicted and actual expenditures. This metric helps quantify model performance in a comprehensible way.
56
What is a limitation of mean absolute error?
It weights every dollar of error the same regardless of the size of the miss. A $200 error is penalized only twice as much as a $100 error, which can mask occasional very large discrepancies.
57
What is mean absolute error?
The mean absolute error is the average of the absolute values of the differences between predicted and actual values. It provides a single number indicating how much the model is off, on average.
58
What is a limitation of mean absolute error?
Every dollar of error is weighted the same, so an error of $200 is penalized only twice as much as an error of $100 rather than disproportionately more.
59
What is mean squared error?
Mean squared error is the average of the squared differences between predicted and actual values; it penalizes larger discrepancies more than smaller ones.
60
How does squared error compare to absolute error?
If the error is $100, the squared error is $10,000; if the error is $200, the squared error is $40,000.
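The same arithmetic written out in Python; the expenditures are made up:

```python
import numpy as np

actual = np.array([1000, 1000, 1000])      # hypothetical actual expenditures ($)
predicted = np.array([1000, 1100, 1200])   # the model is off by $0, $100, and $200

errors = predicted - actual

# Mean absolute error: the $200 miss counts only twice as much as the $100 miss
mae = np.mean(np.abs(errors))   # (0 + 100 + 200) / 3 = 100

# Mean squared error: the $200 miss counts four times as much as the $100 miss
mse = np.mean(errors ** 2)      # (0 + 10,000 + 40,000) / 3, about 16,667

print(f"MAE = {mae:.0f}, MSE = {mse:.0f}")
```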
61
What is the Brier score?
The Brier score is the mean squared error between predicted probabilities and actual binary outcomes (0 or 1).
62
What does a lower Brier score indicate?
A lower Brier score indicates better model performance.
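A quick calculation; the predicted probabilities and outcomes are invented:

```python
import numpy as np

# Hypothetical predicted probabilities and the outcomes that actually occurred
predicted_prob = np.array([0.9, 0.2, 0.7, 0.1])
actual_outcome = np.array([1, 0, 0, 0])  # 1 = event happened, 0 = it did not

# Brier score: mean squared difference between probability and outcome
brier = np.mean((predicted_prob - actual_outcome) ** 2)
print(f"Brier score = {brier:.3f}")  # lower is better
```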
63
What does it mean for a model to be well calibrated?
A well-calibrated model's predicted probabilities behave like true probabilities.
64
How can you check the calibration of predicted probabilities?
By comparing predicted probabilities to the actual frequency of outcomes within a group of data points.
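One way to eyeball calibration, assuming you already have predicted probabilities and observed outcomes; binning into ten groups is just one convention:

```python
import numpy as np
import pandas as pd

# Simulated probabilities and outcomes drawn to match them,
# i.e. a roughly well-calibrated model
rng = np.random.default_rng(0)
predicted_prob = rng.uniform(0, 1, size=5000)
actual_outcome = rng.binomial(1, predicted_prob)

df = pd.DataFrame({"prob": predicted_prob, "outcome": actual_outcome})
df["bin"] = pd.cut(df["prob"], bins=np.linspace(0, 1, 11))

# In each bin, the mean predicted probability should be close to the
# observed event rate if the model is well calibrated
calibration = df.groupby("bin", observed=True).agg(
    mean_predicted=("prob", "mean"),
    observed_rate=("outcome", "mean"),
)
print(calibration.round(2))
```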
65
What is thresholding in the context of binary outcome models?
Thresholding involves picking a probability threshold and designating predicted probabilities above it as 'yes' and below as 'no.'
66
What common threshold do model builders often use?
Many model builders arbitrarily pick a threshold of 0.5.
67
What assumption does a 0.5 threshold make?
It assumes that false negatives and false positives are equally undesirable.
68
Why might a threshold of 0.5 be inappropriate in certain models?
Because false positives and false negatives are often not equally costly. If false negatives are worse, a threshold closer to 0 may be more suitable; if false positives are worse, a threshold closer to 1 may be.
69
Who should determine the costs of false positives and false negatives?
The person using the model should communicate the relative costs of false positives and false negatives.
70
What are the four important numbers for measuring classifier performance?
True positives, true negatives, false positives, false negatives.
71
What is sensitivity in model performance metrics?
Sensitivity, or true positive rate, is the percentage of all actual positives that the model correctly identifies as positive.
72
What is specificity in model performance metrics?
Specificity, or true negative rate, is the percentage of all actual negatives that the model correctly identifies as negative.
73
What is positive predictive value?
The probability that a predicted positive is a true positive, calculated as true positives divided by true positives plus false positives.
74
What is negative predictive value?
The probability that a predicted negative is a true negative, calculated as true negatives divided by true negatives plus false negatives.
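A compact sketch tying the last several cards together: threshold the predicted probabilities, count the four cell types, and compute the four rates (the 0.5 threshold and the simulated data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
predicted_prob = rng.uniform(0, 1, size=1000)
actual = rng.binomial(1, predicted_prob)       # simulated true outcomes

# Thresholding: call everything at or above 0.5 a "yes"
predicted = (predicted_prob >= 0.5).astype(int)

tp = np.sum((predicted == 1) & (actual == 1))  # true positives
tn = np.sum((predicted == 0) & (actual == 0))  # true negatives
fp = np.sum((predicted == 1) & (actual == 0))  # false positives
fn = np.sum((predicted == 0) & (actual == 1))  # false negatives

sensitivity = tp / (tp + fn)  # share of actual positives the model catches
specificity = tn / (tn + fp)  # share of actual negatives the model catches
ppv = tp / (tp + fp)          # predicted positives that are truly positive
npv = tn / (tn + fn)          # predicted negatives that are truly negative

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
      f"ppv={ppv:.2f}, npv={npv:.2f}")
```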
75
What key questions should be asked when building prediction models?
Questions regarding population, outcome variable, feature selection, training, and model performance should be considered.
76
What should be considered in feature selection?
Which features are most predictive, what steps were taken to select them, how feature importance was assessed, and what feature engineering techniques were used.
77
What strategies can be implemented to avoid overfitting?
Regularization techniques, cross-validation, or early stopping.
78
What performance metrics should be used for model evaluation?
Metrics appropriate for the use case, including sensitivity, specificity, and predictive values.
79
What is the importance of model evaluation techniques?
They assess the performance of trained models and validate generalizability.