Abstract
Introduction
Influenza is a contagious disease that causes epidemics in many parts of the world. The World Health Organization estimates that influenza causes three to five million severe illnesses each year and 250,000-500,000 deaths [1]. Predicting and characterizing outbreaks of influenza is an important public health problem, and significant progress has been made in predicting single outbreaks. However, multiple temporally overlapping outbreaks are also common. These may be caused by different subtypes or by outbreaks in multiple demographic groups. We describe our Multiple Outbreak Detection System (MODS) and its performance on two actual outbreaks. This work extends previous work by our group [2,3,4] by using model averaging and a new method to estimate non-influenza influenza-like illness (NI-ILI). We also apply MODS to a real dataset with a double outbreak.

Methods
MODS is part of a framework for disease surveillance developed by our group. In this framework, a natural language processing system extracts symptoms from emergency department patient-care reports. These features are combined with laboratory results and passed to a case detection system that infers a probability distribution over the diseases each patient may have. These diseases include influenza, NI-ILI, and other (appendicitis, trauma, etc.). This distribution is expressed in terms of the likelihoods of the patients’ data. These are given to MODS, which searches a space of multiple outbreak models, computes the likelihood of each model, and calculates the expected number of influenza cases day by day. This work differs from past work in three important ways. First, we address the problem of detecting and characterizing multiple, overlapping outbreaks. Second, we do not rely on simple counts, but use likelihoods given evidence in the free-text portion of patient-care reports as well as laboratory findings. Third, we explicitly account for non-influenza influenza-like illnesses.
This is important because some forms of influenza-like illness (such as respiratory syncytial virus) are contagious and exhibit outbreak activity. This research was approved by the University of Pittsburgh and Intermountain Healthcare IRBs.

Results
We conducted a set of experiments with simulated outbreaks. MODS is able to detect a single outbreak six to eight weeks before the peak. It is also able to recognize a second outbreak approximately halfway between peaks for simulated double outbreaks. We conducted experiments using real outbreaks and compared our results to thermometer sales [5]. Using data from Allegheny County, Pennsylvania, for the 2009-2010 influenza season, on September 1 MODS predicted an outbreak with a peak on October 5. The thermometer peak was October 21. The figure “Prediction on October 1 for Allegheny County” compares MODS’ prediction on October 1 to thermometer sales. Using data from Salt Lake City, Utah, for the 2010-2011 influenza season, on November 1 MODS predicted an outbreak with a peak on December 7. The first thermometer peak was December 29. On January 20 MODS predicted a second outbreak with a peak on February 9. The second thermometer peak was March 5. The figure “Prediction on January 20 for Salt Lake City” compares MODS’ prediction on January 20 to thermometer sales.

Conclusions
We have built a Multiple Outbreak Detection System that can detect and characterize overlapping outbreaks of influenza. Although the system currently predicts outbreaks of influenza, it is built on a general Bayesian framework that can be extended to other diseases. Future work includes incorporating multiple forms of evidence, modeling other known contagious diseases, and detecting outbreaks of new, previously unknown diseases.

Figure: Prediction on October 1 for Allegheny County, 2009-2010
Figure: Prediction on January 20 for Salt Lake City, 2010-2011
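
Illustrative sketch
The Methods describe scoring candidate outbreak models from per-patient likelihoods and then computing an expected daily influenza count. The Python sketch below shows one way such a calculation could look under Bayesian model averaging. It is not the MODS implementation. The representation of an outbreak model as a list of per-day disease priors, the uniform prior over models, and all function and variable names are assumptions made for illustration only.

```python
import math

# Assumed disease order for every likelihood and prior triple in this sketch.
DISEASES = ("influenza", "ni_ili", "other")


def patient_evidence(likelihoods, priors):
    """P(patient data | day priors) = sum_k P(data | disease_k) * P(disease_k)."""
    return sum(l * p for l, p in zip(likelihoods, priors))


def model_log_likelihood(data, model):
    """Log-likelihood of all patients on all days under one outbreak model.

    data:  list of days; each day is a list of per-patient likelihood triples.
    model: list of days; each day is a triple of disease prior probabilities.
    """
    total = 0.0
    for day_patients, day_priors in zip(data, model):
        for likelihoods in day_patients:
            total += math.log(patient_evidence(likelihoods, day_priors))
    return total


def posterior_over_models(data, models, model_priors=None):
    """Posterior probability of each candidate model (uniform prior by default)."""
    if model_priors is None:
        model_priors = [1.0 / len(models)] * len(models)
    log_scores = [
        math.log(prior) + model_log_likelihood(data, model)
        for prior, model in zip(model_priors, models)
    ]
    top = max(log_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in log_scores]
    norm = sum(weights)
    return [w / norm for w in weights]


def expected_influenza_cases(data, models, posterior):
    """Model-averaged expected number of influenza patients on each day."""
    expected = [0.0] * len(data)
    for weight, model in zip(posterior, models):
        for d, (day_patients, day_priors) in enumerate(zip(data, model)):
            for likelihoods in day_patients:
                evidence = patient_evidence(likelihoods, day_priors)
                # Posterior probability that this patient has influenza.
                p_flu = likelihoods[0] * day_priors[0] / evidence
                expected[d] += weight * p_flu
    return expected


# Toy input: two days, two patients per day (made-up likelihood values).
toy_data = [
    [(0.20, 0.30, 0.50), (0.10, 0.40, 0.50)],
    [(0.70, 0.20, 0.10), (0.60, 0.30, 0.10)],
]
# Two hypothetical models: flat low influenza activity vs. rising activity.
toy_models = [
    [(0.02, 0.10, 0.88), (0.02, 0.10, 0.88)],
    [(0.05, 0.10, 0.85), (0.30, 0.10, 0.60)],
]
toy_posterior = posterior_over_models(toy_data, toy_models)
print(toy_posterior)
print(expected_influenza_cases(toy_data, toy_models, toy_posterior))
```

Working in log space keeps the product of many small patient likelihoods from underflowing, which matters once a season of emergency department visits is scored at once.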