White Paper

Improving COVID Readiness and Responsiveness with Local Patient Data and Machine Learning

Ray Bamford

Executive Summary

COVID-19 has wreaked havoc in countries around the world and in states across the U.S. With a “fall wave” widely expected by epidemiologists, COVID-19 continues to be a major threat to public health, the economy, and the normal functioning of our society.

In this white paper, we outline an approach to forecasting COVID-19 hospitalizations based on local, patient-level demographic and clinical data. We combine local patient data with classical and modern machine learning methods that are uniquely well suited to this challenge. This enables us to produce forecasts with high relevance, robustness, and accuracy.

Our objective is to empower decision makers with timely, actionable insights that enable them to make informed, data-driven decisions. The approach we describe applies directly to forecasting other resource requirements, such as ICU capacity, ventilator supply, and medical staffing needs. These forecasts will enable leaders at all levels to anticipate demand, maximize readiness, and improve patient outcomes – ultimately saving lives and accelerating the recovery of our economy.

The COVID-19 Challenge

Like many Americans, we are concerned about the growing COVID-19 case rates and deaths in so many parts of our country. We perceive a real need, and opportunity, to help government and healthcare leaders make better-informed, more timely decisions. We believe this would save lives and help the nation and its economy get back on their feet as quickly as possible.

It is well established that access to first-rate healthcare services significantly improves patient outcomes – fatality rates have risen dramatically when health systems have been overrun with cases. Yet in July, months after COVID-19 ravaged China, Europe, and New York City, deaths increased dramatically across the southern U.S. In south Texas, hospitals ran out of capacity and supplies, and at least one hospital convened a “death panel” to decide which patients to treat and which to send home [2] [3] [4]. Over 50 Florida hospitals reached ICU capacity. As of August 15th, twenty-six Florida counties had zero ICU beds available [5] [6]. Over the first two weeks of August, the U.S. averaged over 1,000 deaths per day – the equivalent of one September 11 attack every three days [7].

We believe that we can and should do better. With a new wave of COVID-19 cases expected in the autumn flu season, leaders at the national, state, and local levels need to remain vigilant. Lives and livelihoods depend on it.

Our Objectives

Our objective is to provide leaders with timely insights in the face of uncertainty, enabling them to make informed, data-driven, risk-adjusted decisions that improve patient outcomes and save lives. With robust forecasts, decision makers can anticipate demand and take proactive actions to prepare for that demand.

In this paper, we outline an approach to forecasting COVID-19 hospitalization that will help leaders at all levels, from federal agencies to local hospitals, anticipate demand and improve readiness and responsiveness. This approach is directly extensible to related resource planning needs such as ICU capacity, ventilator supply, and medical staffing needs.

The Solution: Technical Approach

Background and context

Many world-class scientists and data scientists have developed COVID-19 forecasting tools. These include teams from Stanford, Harvard, the University of Washington, and numerous other institutions [8]. Each of the surveyed tools bases its forecasts on macro factors such as case growth rates, the degree of social distancing, and the average hospital Length of Stay (LOS), among others.

Our approach to forecasting hospitalization incorporates patient-level data and combines classical methods, such as biostatistical survival modeling, with deep neural networks. We expect this combination to yield predictions that are more robust and more relevant to local decision makers than forecasts built on macro factors alone.

Project goals

We established several goals for this effort:

  1. Develop tools that are relevant for local decision makers, as well as regional, state-wide, and national decision makers.
  2. Create forecasts based on patient-level demographic and clinical data, including known COVID-19 risk factors.
  3. Forecast COVID-19 resource requirements, such as hospitalization demand, as opposed to case rates.
  4. Deliver a range of forecasts, including expected and “worst-case” scenarios, rather than simply point forecasts, in order to enable leaders to make better risk-adjusted decisions.
  5. Provide forecasts that extend out six weeks (42 days) or longer.

Patient-level data

Our models incorporate patient-level data from confirmed positive cases, as well as forecasted cases. COVID-19 severity and mortality have been shown to depend heavily on patient-specific characteristics. Not all cases are equivalent.

In July, researchers in the U.K. published an extensive study of COVID-19 risk factors based on the records of over 17 million patients, among whom there were almost 11,000 COVID-19-related deaths. They found that age is the most prominent risk factor, especially for patients older than 70. Other major risk factors include obesity; diabetes; certain cancers, particularly cancers of the blood; reduced kidney function; a history of stroke, dementia, or other neurological disease; and a compromised immune system, among others [9].

Hence, an outbreak in a nursing home will result in many more severe cases and much greater demand for healthcare resources than a similarly sized outbreak in a nursery or elementary school. By basing our models on patient demographics, we can greatly improve the accuracy and relevance of our forecasts of COVID-19-related resource needs.
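The nursing home versus school comparison can be made concrete with a minimal sketch. It assumes each patient is assigned a hospitalization probability from an age band; the probabilities, band boundaries, ages, and function names below are made-up placeholders for illustration, not estimates from this paper or from the study cited above.

```python
# Illustrative only: these per-age-band hospitalization probabilities are
# hypothetical placeholders, NOT estimates from this paper or any study.
HYPOTHETICAL_RISK_BY_AGE_BAND = {
    "0-17": 0.01,
    "18-49": 0.03,
    "50-69": 0.10,
    "70+": 0.30,
}


def age_band(age):
    """Map an age in years to one of the hypothetical risk bands."""
    if age < 18:
        return "0-17"
    if age < 50:
        return "18-49"
    if age < 70:
        return "50-69"
    return "70+"


def expected_hospitalizations(ages):
    """Expected number of hospitalizations = sum of per-patient probabilities."""
    return sum(HYPOTHETICAL_RISK_BY_AGE_BAND[age_band(a)] for a in ages)


# Two outbreaks of the same size (10 cases each) with different demographics.
nursing_home_ages = [82, 77, 91, 85, 79, 88, 74, 90, 81, 86]
school_ages = [6, 7, 8, 9, 10, 7, 8, 6, 9, 35]  # nine pupils and one teacher

nursing_home_demand = expected_hospitalizations(nursing_home_ages)
school_demand = expected_hospitalizations(school_ages)

print(f"Nursing home: {nursing_home_demand:.2f} expected hospitalizations")
print(f"School:       {school_demand:.2f} expected hospitalizations")
```

A case-count model would treat these two outbreaks as identical; a patient-level model does not.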

Biostatistical survival models

We utilize sophisticated biostatistical methods, including time-to-event survival regression models, which allow us to forecast not only the probability that an event such as hospitalization will occur, but also the timing of the event. These are transparent models that produce hazard ratios for each predictor, so we will understand the relative contributions of each predictor variable to the forecast.
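This paper does not name a specific survival regression. As one widely used example, and purely as an assumption for illustration, the Cox proportional hazards model expresses the hazard for a patient with predictors x_1, ..., x_p as a shared baseline hazard h_0(t) scaled by the predictors. The hazard ratio for predictor j is then the exponential of its coefficient:

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \frac{h(t \mid x_1, \dots, x_j + 1, \dots, x_p)}{h(t \mid x_1, \dots, x_j, \dots, x_p)} = e^{\beta_j}
```

Under this form, a hazard ratio above 1 means that a one-unit increase in the predictor raises the instantaneous rate of the event (here, hospitalization), and a ratio below 1 means it lowers that rate.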

Most traditional modeling approaches require complete datasets with equal amounts of information for each record. Survival analysis models naturally accommodate the use of all available data, including in-progress cases with anywhere from 0 to 42 days of data. By making use of all available case data, including unresolved cases, we improve predictive power and forecast accuracy.
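To show how in-progress cases contribute information, here is a minimal sketch of the Kaplan-Meier estimator, a standard non-parametric survival method. It is offered as an illustration of how censored records are handled, not as the regression model used in this work. The toy data and the function name are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of P(not yet hospitalized after a given day).

    durations[i]: days of follow-up recorded for case i
    observed[i]:  True if case i was hospitalized on that day,
                  False if the case is still in progress (right-censored)

    Returns a list of (day, probability) pairs, one per day on which
    at least one hospitalization was observed.
    """
    event_days = sorted({d for d, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for day in event_days:
        # Every case followed at least until this day is still "at risk",
        # including cases that are later censored.
        at_risk = sum(1 for d in durations if d >= day)
        events = sum(1 for d, e in zip(durations, observed) if d == day and e)
        survival *= 1.0 - events / at_risk
        curve.append((day, survival))
    return curve


# Hypothetical toy data: eight cases, four of them still unresolved.
durations = [3, 5, 5, 8, 10, 12, 20, 42]
observed = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
for day, prob in curve:
    print(f"day {day:2d}: P(not yet hospitalized) = {prob:.3f}")
```

The unresolved cases are never counted as events, but they do enlarge the at-risk denominator for every day they were followed. Discarding them would bias the estimate toward earlier and more frequent hospitalization.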

Neural networks

Neural networks use a hierarchical structure of nodes, organized in layers, to map inputs (predictor variables) to outputs (predictions). These models have the ability to capture extremely complex, non-linear relationships between inputs and outputs.

In our current case, we develop a neural network with patient demographic information and other predictors as inputs and hospitalization probability by day as outputs. An additional advantage of the neural network models is that they do not require us to assume proportional hazards, an assumption that requires each predictor to maintain the same relative influence over all time periods. This gives our models additional flexibility and robustness.
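One way to obtain hospitalization probability by day from a network is to give it one sigmoid output per forecast day and interpret each output as a daily hazard, that is, the probability of hospitalization on that day given none so far. The sketch below assumes this discrete-time formulation; the architecture, weights, and input features are hypothetical and untrained, and it is written in plain Python rather than Keras so that it runs without dependencies.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer, then one sigmoid output per forecast day (a daily hazard)."""
    hidden = [
        relu(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(w_out, b_out)
    ]


def cumulative_probability(daily_hazards):
    """P(hospitalized by end of day t) = 1 - product over k <= t of (1 - hazard_k)."""
    not_yet = 1.0
    cumulative = []
    for hazard in daily_hazards:
        not_yet *= 1.0 - hazard
        cumulative.append(1.0 - not_yet)
    return cumulative


# Hypothetical, untrained weights: 2 inputs -> 2 hidden units -> 3 forecast days.
w_hidden = [[0.5, 1.0], [-0.3, 0.8]]
b_hidden = [0.0, 0.1]
w_out = [[0.4, 0.2], [0.6, 0.1], [0.2, 0.5]]
b_out = [-3.0, -2.5, -3.5]

patient = [0.8, 1.0]  # illustrative inputs: a scaled age and a comorbidity flag
hazards = forward(patient, w_hidden, b_hidden, w_out, b_out)
by_day = cumulative_probability(hazards)

for day, prob in enumerate(by_day, start=1):
    print(f"day {day}: P(hospitalized by this day) = {prob:.4f}")
```

Because each day has its own output weights, a predictor can matter more on some days than on others, which is exactly what a proportional hazards assumption rules out.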

Model aggregation

The biostatistical survival models and the neural networks each have benefits and limitations. Our approach is to utilize both types of models. Model aggregation is a commonly used technique to improve model performance. By combining predictions of models with different assumptions, we can achieve higher levels of model accuracy and robustness.
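This paper does not specify how the two model types are combined. As a minimal sketch, assuming the simplest option of a weighted average of the two predicted probability curves, aggregation could look like the following. The curves, the weight, and the function name are hypothetical.

```python
def aggregate(curve_a, curve_b, weight_a=0.5):
    """Weighted average of two per-day probability curves of equal length."""
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must cover the same forecast days")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(curve_a, curve_b)]


# Hypothetical cumulative hospitalization probabilities for days 1 to 3.
survival_model_curve = [0.02, 0.05, 0.09]
neural_network_curve = [0.04, 0.07, 0.10]

combined = aggregate(survival_model_curve, neural_network_curve, weight_a=0.5)
print([round(p, 3) for p in combined])
```

In practice the weight would be chosen on held-out data, and more elaborate schemes such as stacking are also common. The averaged curve always lies between the two inputs, so it cannot be more extreme than either model.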

Confidence intervals and forecast ranges

Decision makers benefit from knowing not just expected or most-likely outcomes, but also the range of possible outcomes, including “worst case” scenarios. As such, our approach provides confidence intervals and a range of forecasts (e.g., 25th, 50th, 75th, and 90th percentiles). This allows decision makers to make better decisions that are tailored to their own risk sensitivities and required safety factors.
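One way to turn patient-level probabilities into a forecast range is Monte Carlo simulation: draw each patient's outcome many times and read percentiles off the simulated totals. The sketch below assumes patients' outcomes are independent and uses hypothetical probabilities; it captures only the randomness of outcomes, not uncertainty in the model itself, and is not a description of the method used in this work.

```python
import math
import random


def simulate_admissions(probabilities, n_simulations=2000, seed=0):
    """Simulate total admissions, treating each patient as an independent coin flip."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    totals = []
    for _ in range(n_simulations):
        totals.append(sum(1 for p in probabilities if rng.random() < p))
    return sorted(totals)


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, for q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


# Hypothetical per-patient hospitalization probabilities for 50 active cases.
probabilities = [0.30] * 10 + [0.10] * 15 + [0.03] * 25

totals = simulate_admissions(probabilities)
forecast = {q: percentile(totals, q) for q in (25, 50, 75, 90)}

for q, beds in forecast.items():
    print(f"{q}th percentile: {beds} admissions")
```

A planner who wants to be covered nine times out of ten would provision for the 90th percentile rather than the median.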

Open Source Software

We make extensive use of Free Open Source Software (FOSS), including Python, R, Keras, and Jupyter notebooks. These tools are common in the data science and machine learning communities and are easy to incorporate into a general analytical workflow.

The Benefits: Expected Results

Our models produce forecasts of local hospitalization requirements by day, going out six weeks (or longer). By basing our models on local patient data that includes known risk factors related to illness severity and fatality, we can produce much more accurate and relevant forecasts. By combining biostatistical survival models with neural networks, we gain many advantages that improve forecast flexibility and predictive power. And by providing a range of forecast outcomes, from “best case” to “worst case”, we give decision makers the ability to assess and incorporate risk considerations into their decisions.

Our forecasts will be delivered daily (or more frequently if new data is available more often) and shown visually in a dashboard. This will provide timely, actionable insights to leaders at all levels, from local and regional levels to state and national levels. These insights enable leaders to (a) anticipate demand, so they can take proactive actions that maximize readiness, and (b) recognize when conditions change, so they can respond rapidly and intervene effectively.

Our forecasts can help leaders make informed, data-driven decisions and address questions such as the following:

  • How much capacity do my hospital or local healthcare providers require for COVID-19 cases?
  • What are expected capacity needs and worst-case (90th percentile) capacity needs?
  • How can local leaders be proactive in ensuring COVID-19 readiness and responsiveness?
  • Which hospitals and counties are at greatest risk of running out of hospital beds, ICU capacity, or ventilator availability?
  • Where should state and federal agencies focus their attention to have the greatest positive impact?

Ultimately, we aim to improve patient outcomes, save lives, and accelerate the recovery of our economies.

About Alucida

Alucida specializes in enabling data-driven decisions and improving the speed and quality of those decisions. We combine expertise in strategic thinking, design thinking, and data science — with a focus on planning under extreme uncertainty (which certainly applies to our current situation with COVID-19!). Our data scientists have deep expertise across a wide range of both classical and modern machine learning algorithms and technologies.

Contact Us to Receive the Technical White Paper

This white paper is an excerpt from an accompanying technical white paper that describes our technical approach in more detail. The technical white paper goes into greater depth on topics such as data sources and assumptions, feature selection, and model training and validation. It provides deeper explanations of how we apply biostatistical survival models and neural networks to build our forecasting models. It also specifies how we evaluate and measure model accuracy. Please contact us today if you are interested in learning more.


Works Cited

  1. "Worldometer - United States," [Online]. Available: https://www.worldometers.info/coronavirus/country/us/. [Accessed 15 August 2020].
  2. D. J. Livingston, "The COVID-19 Crisis in South Texas Spirals Out of Control," 22 July 2020. [Online]. Available: https://medium.com/beingwell/the-covid-19-crisis-in-south-texas-spirals-out-of-control-8a8a0f489f69.
  3. "Texas hospital forced to set up 'death panel' as Covid-19 cases surge," The Guardian, 26 July 2020. [Online]. Available: https://www.theguardian.com/world/2020/jul/26/covid-19-death-panels-starr-county-hospital-texas.
  4. "Laredo hospitals ‘at capacity’; mayor says ‘things are critical’ after 10 more COVID-19 deaths," BorderReport, 3 August 2020. [Online]. Available: https://www.borderreport.com/health/coronavirus/laredo-hospitals-at-capacity-mayor-says-things-are-critical-after-10-more-covid-19-deaths/.
  5. "At least 50 Florida hospital ICUs hit capacity," The Hill, 29 July 2020. [Online]. Available: https://thehill.com/homenews/coronavirus-report/509636-at-least-50-florida-hospital-icus-hit-capacity.
  6. "Florida Hospital ICU Beds," [Online]. Available: https://www.wpbf.com/article/icu-beds-florida-coronavirus/33562716. [Accessed 15 August 2020].
  7. "September 11 Attacks," [Online]. Available: https://en.wikipedia.org/wiki/September_11_attacks. [Accessed 15 August 2020].
  8. "A Compendium of Models that Predict the Spread of COVID-19," American Hospital Association, [Online]. Available: https://www.aha.org/guidesreports/2020-04-09-compendium-models-predict-spread-covid-19. [Accessed 15 August 2020].
  9. E. J. Williamson et al., "Factors associated with COVID-19-related death using OpenSAFELY," Nature, 8 July 2020. [Online]. Available: https://www.nature.com/articles/s41586-020-2521-4.
