Data-Driven Simulation for Healthcare Facility Utilization Modeling and Evaluation

Utilization evaluation for healthcare facilities such as hospitals and nursing homes is crucial for providing high-quality healthcare services in various communities. In this paper, a data-driven simulation framework integrating statistical modeling and agent-based simulation (ABS) is proposed to evaluate the utilization of various healthcare facilities. A Bayesian modeling approach is proposed to model the relationship between heterogeneous individuals’ characteristics and their time to readmission to the hospital and nursing home. An agent-based simulation model is developed to model the dynamically changing health conditions of individuals and their readmission/discharge events. The individuals are modeled as agents in the agent-based simulation model, and their time to readmission and length of stay are driven by the developed Bayesian individualized models. An application based on Florida’s Medicare and Medicaid claims data demonstrates that the proposed framework can effectively evaluate healthcare facility utilization under various scenarios.

Achieving high-quality healthcare service delivery and cost-effective resource management across different healthcare facilities is an essential objective of current healthcare research and practice. The increasing healthcare demand of the elderly, driven by rapid population aging and the high prevalence of diseases and disabilities, coupled with the costly and limited resources available in healthcare facilities, poses great challenges for the U.S. healthcare system in ensuring high quality of care. The utilization of healthcare facilities is a key measure of healthcare service demand. Matching healthcare demand with capacity requires a deep understanding of the relationship between healthcare facility utilization and the individual characteristics of the aging population. Healthcare administrative claims data, originally generated for administrative and billing purposes, contain important information, such as time to readmission and length of stay (LoS), that can be leveraged to investigate healthcare facility utilization. These data provide valuable tracking information on the admissions and discharges of elderly individuals with different health conditions and demographics. Healthcare facility utilization may be affected by various individual characteristics, and different types of healthcare facilities, such as acute care and long-term care facilities, may have different influencing factors. It is therefore desirable to develop data-driven models by analyzing individuals’ time to readmission and LoS from historical administrative claims data. Compared with low-level clinical data, administrative claims data are less fragmented and contain integrated, compatible readmission information across different healthcare facilities. However, detailed individual health conditions, such as physiological measurements, are unavailable in the high-level claims data because of privacy concerns. These unobserved factors may also affect individuals’ healthcare utilization and need to be quantified explicitly. Thus, an efficient and effective statistical model needs to be developed to capture such unobserved heterogeneity and to quantify the influence of observed individual characteristics on the utilization of different types of healthcare facilities.

Several challenges arise in healthcare readmission/LoS modeling and claims data analysis. In practice, readmission data are right-skewed, which invalidates the normality assumption of conventional statistical modeling approaches. To address this skewness and the influence of different factors, many statistical modeling approaches have been investigated (Bernatz et al. 2015; Jasti 2008). However, these methods consider only the heterogeneity induced by observed factors. They ignore unobserved heterogeneity, that is, the effect of unobserved or unmeasurable factors such as the physiological information that is missing from high-level claims data. Some approaches have recently been developed to address unobserved heterogeneity (Lee et al. 2012; Kansagara et al. 2011). However, these studies mainly employ non-Bayesian estimation methods, such as maximum likelihood estimation. They can assess only the average healthcare utilization over a population and cannot provide an individualized model for each care recipient. Non-Bayesian methods also have difficulty estimating unknown parameters when the sample size is small. In addition, most previous studies consider only a single type of healthcare facility and fail to address multiple types of competing healthcare facilities.
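The right-skewness issue can be illustrated with a minimal sketch. The data below are synthetic: the lognormal distribution, its parameters, and the sample size are assumptions chosen for illustration and are not taken from the claims data. The sketch computes the moment-based sample skewness of hypothetical times to readmission and shows that a log transform removes most of the asymmetry, which is one reason skewed time-to-event distributions are usually preferred over a normal model on the raw scale.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Return the (biased) moment-based sample skewness m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)

# Hypothetical times to readmission in days (synthetic, lognormal by assumption).
times = [rng.lognormvariate(3.0, 1.0) for _ in range(5000)]
log_times = [math.log(t) for t in times]

skew_raw = sample_skewness(times)      # strongly positive for right-skewed data
skew_log = sample_skewness(log_times)  # close to zero after the log transform
mean_days = statistics.mean(times)
median_days = statistics.median(times)

print(f"skewness (raw days): {skew_raw:.2f}")
print(f"skewness (log days): {skew_log:.2f}")
print(f"mean = {mean_days:.1f} days, median = {median_days:.1f} days")
```

For right-skewed data the mean exceeds the median, so a model that assumes symmetry around the mean would misrepresent the typical time to readmission.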

Although great effort has been devoted to developing statistical models that estimate an individual patient’s time to readmission and LoS in healthcare facilities, evaluating the performance of a complex healthcare system design that involves many individuals is still challenging. Simulation is a powerful tool for studying the behavior of complex systems in a dynamic environment. Agent-based simulation (ABS) is an emerging simulation paradigm for studying individuals’ decision making in various applications, such as transportation (Kim et al. 2017), homeland security (Yuan et al. 2015), and supply chain management (Meng et al. 2014). One of its major advantages is the capability to model individual objects (e.g., patients) and their interactions with each other and with the external environment (e.g., healthcare insurance policy). In a healthcare system, patients’ behaviors directly affect the utilization of different healthcare resources (e.g., facilities and personnel). The allocation of healthcare resources in turn affects individuals’ decisions on healthcare service selection. In addition, individuals’ health conditions and service requirements change dynamically over time. Therefore, agent-based simulation is considered an effective approach for studying complex systems such as healthcare systems.

Agent-based modeling

State chart for agents

To model healthcare systems in an agent-based simulation, the individuals’ characteristics and their decisions on healthcare facility selection must be properly defined. To do so, the agents, which represent the individuals, must be driven by valid statistical models developed from real data. It is therefore critical to integrate agent-based simulation and statistical models to produce realistic outputs. In this paper, we propose a data-driven simulation approach that integrates Bayesian analytical modeling and agent-based simulation to evaluate healthcare facility utilization. We consider two types of healthcare facilities: hospitals, which are acute care facilities, and nursing homes, which are long-term care facilities. The proposed statistical models jointly estimate the facility-specific observed and unobserved heterogeneity of individuals, and capture both within-individual dependence and between-individual independence. The derived healthcare demand can be used as a service metric and performance measure (e.g., length of stay) for the utilization evaluation. For each individual, multiple characteristics, such as ethnic group, age, gender, availability of a caregiver, and health condition, are considered to determine the time to readmission and LoS in the hospital and nursing home. An agent-based simulation model is then developed to model the individuals’ readmissions and LoS in the hospital and nursing home. The readmission and discharge events of each individual in the simulation model are driven by the Bayesian individualized models. Healthcare facility utilization is defined as the simulation output.

The remainder of the paper is organized as follows. Section 2 discusses the integration of the Bayesian modeling approach and agent-based simulation. Section 3 presents a real application to demonstrate the effectiveness of the proposed data-driven simulation approach. Section 4 concludes the paper with suggestions for future work.

Data-Driven Simulation Approach

The proposed data-driven simulation approach integrates the Bayesian modeling approach and agent-based simulation for healthcare systems with a heterogeneous population. The Bayesian modeling approach estimates each individual’s time to readmission to different types of healthcare facilities. The agent-based simulation simulates each individual’s readmission and discharge events to estimate the utilization of different types of healthcare facilities.
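The interplay between the two components can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model developed in this paper: the exponential distributions, the log-linear form of the mean time to readmission, every coefficient in `READMIT` and `MEAN_LOS`, and the `frailty` term standing in for unobserved heterogeneity are all hypothetical. In the actual framework, the individualized Bayesian models would supply the sampled times. Each agent alternates between living in the community and staying in a facility, and the two facility types compete: whichever sampled admission time comes first determines the next stay.

```python
import math
import random
from dataclasses import dataclass

FACILITIES = ("hospital", "nursing_home")

# Hypothetical coefficients (intercept, age effect, caregiver effect) for the
# log of the mean time to readmission in days. Illustration only.
READMIT = {"hospital": (5.0, -0.3, 0.4), "nursing_home": (5.5, -0.5, 0.6)}
# Hypothetical mean length of stay in days for each facility type.
MEAN_LOS = {"hospital": 6.0, "nursing_home": 60.0}


@dataclass
class Agent:
    age: float
    has_caregiver: bool
    frailty: float  # stand-in for unobserved individual heterogeneity

    def mean_time_to_readmission(self, facility):
        b0, b_age, b_care = READMIT[facility]
        linear = b0 + b_age * (self.age - 75) / 10 + b_care * self.has_caregiver
        return math.exp(linear + self.frailty)


def simulate_agent(agent, horizon, rng):
    """Return the days the agent spends in each facility within [0, horizon]."""
    days = {f: 0.0 for f in FACILITIES}
    t = 0.0
    while t < horizon:
        # Competing facilities: the earliest sampled admission time wins.
        waits = {
            f: rng.expovariate(1.0 / agent.mean_time_to_readmission(f))
            for f in FACILITIES
        }
        facility = min(waits, key=waits.get)
        t += waits[facility]
        if t >= horizon:
            break
        # Sample the length of stay and truncate it at the horizon.
        stay = min(rng.expovariate(1.0 / MEAN_LOS[facility]), horizon - t)
        days[facility] += stay
        t += stay  # discharge event: the agent returns to the community
    return days


def simulate(n_agents, horizon, seed=0):
    """Return the average number of occupied beds per day for each facility."""
    rng = random.Random(seed)
    totals = {f: 0.0 for f in FACILITIES}
    for _ in range(n_agents):
        agent = Agent(
            age=rng.uniform(65, 95),
            has_caregiver=rng.random() < 0.5,
            frailty=rng.gauss(0.0, 0.3),
        )
        for facility, d in simulate_agent(agent, horizon, rng).items():
            totals[facility] += d
    return {f: totals[f] / horizon for f in FACILITIES}


occupancy = simulate(n_agents=500, horizon=365.0)
print(occupancy)
```

Here utilization is summarized as the average number of occupied beds per day. Dividing by a facility's capacity would give an occupancy rate, and changing the population mix or the coefficients corresponds to evaluating a different scenario.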
