Objective 4 - Predictive Modeling Flashcards
Risk factors that indicate whether a person may have high claims
- Inherent risk factors, such as age, sex, and race
- Medical condition-related factors, such as diabetes or cancer
- Family history (for conditions that are inheritable)
- Lifestyle risk factors, such as smoking, lack of exercise, and poor nutrition
- External risk factors, such as industry, location, and education
Types of medical management interventions
- Care coordination (focuses on the system) - includes case management, discharge planning, and in-hospital care coordination
- Condition management (focuses on the patient) - includes disease management and risk factor management
- Provider management (focuses on the provider) - includes provider profiling, pay-for-performance, and accountable care organizations
Areas where condition-based models are used in healthcare financial applications
- Program management - identifying high-risk individuals, financial modeling and resource allocation, and program evaluation (eg, calculating savings)
- Provider or health plan reimbursement - normalizing populations to pay providers or plans for the risks they accept and to evaluate provider effectiveness. Profiling providers to assess quality and efficiency
- Actuarial and U/W functions - pricing health plans, underwriting groups, and projecting future claims costs
Types of predictive models that are not based on medical conditions (traditional “non-condition risk-based” models)
- Age/sex - rates are established for a group based on the average age/sex factor of the members in the group (works best for large groups w/ age/sex factors close to 1.0)
- Prior cost - the prior year’s claims are used to project future costs (is reasonably accurate for large groups, but not for smaller groups)
- Combination of age/sex and prior cost - often used for rating smaller groups
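As a rough illustration of the three approaches above, the sketch below averages member age/sex factors into a group factor and then blends the resulting age/sex-based rate with prior-year cost. The credibility weight, the function names, and every dollar amount are assumptions made for illustration. The flashcard does not say how the two methods are combined, and trend is ignored.

```python
def group_age_sex_factor(member_factors):
    """Average age/sex factor across the members of a group."""
    if not member_factors:
        raise ValueError("group has no members")
    return sum(member_factors) / len(member_factors)


def blended_projection(manual_pmpm, age_sex_factor, prior_pmpm, credibility):
    """Blend prior-cost experience with an age/sex-adjusted manual rate.

    credibility is the weight (0 to 1) given to the group's own prior cost.
    This weighting is an assumed, simplified way to combine the two methods.
    """
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must be between 0 and 1")
    age_sex_rate = manual_pmpm * age_sex_factor
    return credibility * prior_pmpm + (1.0 - credibility) * age_sex_rate


# Hypothetical small group: four members with made-up age/sex factors.
factors = [0.8, 1.0, 1.2, 1.4]
group_factor = group_age_sex_factor(factors)

# Hypothetical inputs: $400 manual PMPM, $500 prior-year PMPM, 25% credibility.
projected = blended_projection(400.0, group_factor, 500.0, 0.25)
print(round(group_factor, 2), round(projected, 2))
```

With these made-up inputs the group factor is about 1.1 and the blended projection is about $455 PMPM. A larger group would typically be given a higher credibility weight, moving the result toward its own prior cost.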
Sources of data for developing risk factors
- Claims data - for medical condition-related risk factors such as diabetes or cancer
- Self-reported data - for lifestyle-related risk factors such as smoking, stress, lack of exercise, poor nutrition, etc. (see separate list of risk factors identified by a health risk assessment)
- External data - for external risk factors such as industry, geography, education, and income level
Risk factors identified by a health risk assessment
- Personal disease history
- Family disease history
- Health screenings and immunizations
- Alcohol consumption
- Injury prevention behavior
- Nutrition
- Physical activity
- Skin protection
- Stress and well-being
- Tobacco use
- Weight management
- Women’s health (eg, pregnancy status)
- General health assessment
- Functional health status
- Mental health status
Types of data sources for predictive modeling
- Physician referral/chart (high reliability, low practicality) - medical charts provide the most information, but have serious drawbacks (see separate list)
- Enrollment (high reliability, high practicality) - can be used to convert claims data into PMPM amounts
- Claims (medium reliability, high practicality) - usually available to health plans and continually refreshed as events occur. Data quality varies greatly (must check for accuracy). Lots of info is provided in claim forms for hospital (UB-04) and professional (CMS-1500) claims.
- Pharmacy (medium reliability, high practicality) - high quality data that completes quickly. But there is no diagnosis on the claims, and prescriptions that aren’t filled won’t generate claims
- Laboratory values (high reliability, low practicality) - can be difficult to obtain, and vendors do not use a standard format
- Self-reported (low/medium reliability, low practicality) - will become important since members can report info that isn’t available elsewhere, but there are drawbacks (see separate list)
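The role of enrollment data in the list above can be shown with a small calculation: claims supply the dollars, and enrollment supplies the member months in the denominator. This is a minimal sketch. The member IDs, amounts, and the `pmpm` function name are hypothetical.

```python
def pmpm(total_claims, member_months):
    """Per-member-per-month cost: claim dollars divided by member months."""
    if member_months <= 0:
        raise ValueError("member_months must be positive")
    return total_claims / member_months


# Hypothetical enrollment file: member ID -> months enrolled in the period.
enrollment = {"A": 12, "B": 6, "C": 12}

# Hypothetical claims file: (member ID, paid amount). Member C has no claims.
claims = [("A", 1200.0), ("B", 300.0), ("A", 600.0)]

total_paid = sum(amount for _, amount in claims)
total_member_months = sum(enrollment.values())

cost_pmpm = pmpm(total_paid, total_member_months)
print(cost_pmpm)
```

Member C generates no claims but still contributes 12 member months, which is why claims data alone cannot produce a PMPM. Here $2,100 of claims over 30 member months gives $70 PMPM.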
Drawbacks of using data from medical charts
- They do not cover out-of-network (OON) services or drugs prescribed by OON providers
- They do not record the patient’s compliance with physician orders (such as prescription filling)
- Transcribing the data and transferring it to a uniform format is time-consuming and requires highly trained staff
- There is no uniformity in how physicians code conditions and their severity
- Charts are typically unavailable to the health plan or the actuary
Advantages and disadvantages of using diagnosis codes for identifying member conditions
Advantages:
1. Codes are almost always present on medical claims
2. A uniform format exists
3. Codes are useful for identifying conditions
Disadvantages:
1. Usually only the primary and secondary codes are populated in the claims data
2. Coding errors may occur
3. Codes may sometimes be selected to drive maximum reimbursement
4. Different physicians may follow different coding practices
Drawbacks of using survey data
- Surveys must be commissioned, budgeted, and executed in order to generate the data
- Data isn’t updated as medical events occur, so it can become stale unless the survey is updated periodically
- Response bias can make it dangerous to draw conclusions from survey responses
- Respondents may submit untruthful answers
Questions to answer when building a clinical identification algorithm
A clinical identification algorithm is a set of rules that is applied to a claims data set to identify the conditions present in the population
- Where are the diagnoses?
- What is the source of the diagnosis (claims, medical charts, etc.)?
- If the source is claims, what claims should be considered (inpatient, outpatient, lab, etc.)?
- If the claim contains more than one diagnosis, how many diagnoses will be considered for identification?
- Over what time span, and how often, will a diagnosis have to appear in claims for that diagnosis to be incorporated?
- What procedures may be useful for determining severity of a diagnosis?
- What prescription drugs may be used to identify conditions?
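The questions above translate fairly directly into the parameters of a rule. The sketch below is one hypothetical claims-based rule, not a published algorithm: a member is flagged if, within a time window, a qualifying diagnosis appears on at least one inpatient claim or at least two outpatient claims, looking only at the first two diagnosis positions. The thresholds, the claim layout, and the function name are all assumptions. Procedure- and drug-based identification is left out.

```python
from datetime import date

# Illustrative ICD-10 category prefixes for diabetes (type 1 and type 2).
DIABETES_PREFIXES = ("E10", "E11")


def identify_members(claims, code_prefixes, start, end,
                     max_dx_positions=2, min_inpatient=1, min_outpatient=2):
    """Return the set of member IDs that satisfy the identification rule."""
    counts = {}
    for claim in claims:
        # Time span: ignore claims outside the identification window.
        if not (start <= claim["service_date"] <= end):
            continue
        # Number of diagnoses considered: only the first few positions.
        considered = claim["diagnoses"][:max_dx_positions]
        if not any(code.startswith(code_prefixes) for code in considered):
            continue
        member = counts.setdefault(claim["member_id"],
                                   {"inpatient": 0, "outpatient": 0})
        # Claim types considered: other settings (e.g. lab) are not counted.
        if claim["setting"] in member:
            member[claim["setting"]] += 1
    # Frequency: one inpatient claim or two outpatient claims qualifies.
    return {m for m, c in counts.items()
            if c["inpatient"] >= min_inpatient
            or c["outpatient"] >= min_outpatient}


# Hypothetical claims for five members.
claims = [
    {"member_id": "M1", "setting": "outpatient",
     "service_date": date(2023, 2, 1), "diagnoses": ["E11.9"]},
    {"member_id": "M1", "setting": "outpatient",
     "service_date": date(2023, 8, 15), "diagnoses": ["I10", "E11.9"]},
    {"member_id": "M2", "setting": "outpatient",
     "service_date": date(2023, 3, 3), "diagnoses": ["E11.9"]},
    {"member_id": "M3", "setting": "inpatient",
     "service_date": date(2023, 5, 20), "diagnoses": ["E10.9"]},
    {"member_id": "M4", "setting": "lab",
     "service_date": date(2023, 6, 1), "diagnoses": ["E11.9"]},
    {"member_id": "M4", "setting": "lab",
     "service_date": date(2023, 7, 1), "diagnoses": ["E11.9"]},
    {"member_id": "M5", "setting": "outpatient",
     "service_date": date(2022, 11, 1), "diagnoses": ["E11.9"]},
    {"member_id": "M5", "setting": "outpatient",
     "service_date": date(2023, 4, 1), "diagnoses": ["E11.9"]},
]

identified = identify_members(claims, DIABETES_PREFIXES,
                              date(2023, 1, 1), date(2023, 12, 31))
print(sorted(identified))
```

In this made-up data only M1 (two outpatient claims) and M3 (one inpatient claim) are flagged. Loosening the thresholds would flag more members at the cost of more false positives.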
Challenges when constructing a condition-based model
- The large # of procedure and drug codes
- Deciding the severity level at which to recognize the condition
- The impact of co-morbidities for conditions that are often found together
- The degree of certainty with which the diagnosis has been identified
- The extent of the data (claims data will cover all members, but self-reported data will not)
- The type of benefit design that underlies the data
Definitions of sensitivity and specificity
When building clinical identification algorithms, the proper balance between sensitivity and specificity must be found
1. Sensitivity - the % of members who actually have a condition that are correctly identified as having it (the "true positive" rate)
2. Specificity - the % of members who do not have a condition that are correctly identified as not having it (the "true negative" rate)
Specificity may be more important for underwriting, while sensitivity may be more important for care management, since clinicians can verify the presence of a condition.
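The two measures can be computed by comparing an algorithm's flags with a reference source such as chart review. This is a minimal sketch. The member IDs, the `actual` and `flagged` data, and the function name are hypothetical.

```python
def sensitivity_specificity(actual, flagged):
    """Compare algorithm flags with known condition status.

    actual  : dict of member ID -> True if the member really has the condition
    flagged : set of member IDs the algorithm identified as having it
    """
    tp = fn = tn = fp = 0
    for member, has_condition in actual.items():
        is_flagged = member in flagged
        if has_condition and is_flagged:
            tp += 1  # true positive
        elif has_condition:
            fn += 1  # false negative (missed case)
        elif is_flagged:
            fp += 1  # false positive
        else:
            tn += 1  # true negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one member with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical population: 4 members with the condition, 6 without.
actual = {f"M{i}": i <= 4 for i in range(1, 11)}

# Hypothetical algorithm output: finds 3 of the 4 true cases, plus 1 false alarm.
flagged = {"M1", "M2", "M3", "M7"}

sens, spec = sensitivity_specificity(actual, flagged)
print(round(sens, 2), round(spec, 2))
```

Here sensitivity is 3/4 = 0.75 and specificity is 5/6, or about 0.83. A stricter rule would usually raise specificity and lower sensitivity, and a looser rule does the opposite.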
External sources of clinical identification algorithms
- HEDIS (from the NCQA) has algorithms for identifying some conditions (eg, asthma, high blood pressure, diabetes)
- Disease Management Association of America (now Care Continuum Alliance) developed algorithms for identifying chronic diseases
- Grouper models - commercially-available models that identify member conditions and score them for relative risk and cost
- Literature - articles will sometimes report the codes that are used for analysis
Reasons for using commercially-available grouper models
- Building algorithms from scratch requires a considerable amount of work
- Models must be maintained to accommodate new codes, which requires even more work
- Commercially-available models are accessible to many users. Providers and plans often require that payments be based on a model that is available for review and validation