Abstract
Objective
We apply an empirical Bayesian framework to perform changepoint analysis on multiple cattle mortality data streams, accounting for delayed reporting of syndromes.

Introduction
Taking reporting delays into account in surveillance systems is not methodologically trivial. Consequently, most systems use the date on which data are received rather than the (often unknown) date of the health event itself. The main drawback of this approach is the resulting reduction in sensitivity and specificity [1]. Because most health events may leave a “signature” in multiple data sources, syndromic data from multiple data streams may be combined in a Bayesian framework where the result is presented in the form of a posterior probability for a disease [2].

Methods
We used a historical national database on Swiss cattle mortality to model daily baseline counts of two syndromic time series [3]. Reporting delay was defined as the number of days between the reported date of occurrence and the reporting date. The cumulative probability distribution of the estimated reporting delays was used to calculate, for each day, the proportion of cases that were reported either on the same day or with a delay of 1 to 14 days.

We evaluated outbreak detection performance under three scenarios: (A) delayed data reporting occurs but is not accounted for; (B) delayed data reporting occurs and is accounted for; and (C) absence of delayed data reporting (i.e. an ideal system). Outputs are presented as the value of evidence (V) in favour of an ongoing outbreak accumulated over n points in time (30 days in this case). At each time t, V is defined as the ratio between the posterior and prior odds for H1 versus H0:

[insert equation 1 here]

Performance of the V-based system on large and small non-specific outbreaks was measured using sensitivity, time to detection and in-control run length.

Results
The evolution of V based on the information available on the 1st, 5th and 10th day after the onset of an outbreak is shown in Fig. 1.
After 5 days, V shows evidence in favour of an outbreak for both syndromes combined, as well as for on-farm deaths alone, only in the “Delay aware” and “No delay” scenarios. The development of V for perinatal deaths alone highlights the importance of considering multiple syndromic data streams for outbreak detection, as it speaks in favour of an outbreak at a later stage than on-farm deaths alone or both syndromes combined.

Conclusions
Our empirical Bayes approach is an attractive alternative to multivariate CUSUM algorithms, offering a logical approach to weighting variables and incorporating additional information such as delayed reporting, with performance comparable to that of an ideal (no delay) system. Outbreaks are detected earlier and with only a marginal loss of specificity compared to a system where reporting delay is present but unaccounted for.

We also found that accumulating evidence over several days resulted in significantly better outbreak detection timeliness for a given specificity, or similar timeliness but higher specificity, compared to an algorithm [4] that only looks for days with unusually high counts.

Fig. 1: Evolution of V over three time points (t) for the three scenarios. Outbreak starts at t=651. Number of observed perinatal (circle) and on-farm deaths (cross), V for both (solid grey) and individual syndromes (dotted grey and black respectively), prior probability that an outbreak is ongoing (grey dashed) and posterior probability that an outbreak is ongoing given the evidence (black dashed). Horizontal grey solid line shows V=1.
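
The delay-aware value of evidence described in the Methods can be illustrated with a minimal sketch. Since V is the ratio of posterior to prior odds, Bayes' theorem makes it equal to the likelihood ratio P(E | H1) / P(E | H0), so the posterior odds are V times the prior odds. Everything else below is an assumption made for illustration and is not taken from the abstract: the Poisson model for daily counts, the independence of days within the window, the fixed relative increase under H1 (rel_increase), the function names and all numbers.

```python
import math
from itertools import accumulate


def delay_cdf(delay_pmf):
    """Cumulative proportion of cases reported within 0, 1, ..., D days."""
    return [min(p, 1.0) for p in accumulate(delay_pmf)]


def poisson_logpmf(k, mu):
    """Log Poisson probability of observing k events with mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def log_value_of_evidence(counts, baseline, cdf, rel_increase=0.5):
    """Log V accumulated over a window of days ordered oldest to newest.

    Assumptions (hypothetical, for illustration only): counts are Poisson,
    days are independent, H0 is the baseline mean and H1 is the baseline
    mean multiplied by (1 + rel_increase). Both means are scaled by the
    proportion of cases expected to have been reported so far.
    """
    n = len(counts)
    log_v = 0.0
    for i, (k, mu) in enumerate(zip(counts, baseline)):
        age = n - 1 - i                      # days elapsed since the event date
        frac = cdf[min(age, len(cdf) - 1)]   # share of cases reported by now
        mu0 = mu * frac                      # expected reported count, no outbreak
        mu1 = mu * (1.0 + rel_increase) * frac  # expected reported count, outbreak
        log_v += poisson_logpmf(k, mu1) - poisson_logpmf(k, mu0)
    return log_v


def posterior_probability(prior, log_v):
    """Posterior probability of H1 from posterior odds = V * prior odds."""
    prior_odds = prior / (1.0 - prior)
    post_odds = math.exp(log_v) * prior_odds
    return post_odds / (1.0 + post_odds)


# Made-up example: baseline of 10 deaths per day, 5-day window, and a delay
# distribution in which 50% of cases arrive on day 0, 30% on day 1, 20% on day 2.
cdf = delay_cdf([0.5, 0.3, 0.2])
baseline = [10.0] * 5
counts = [15, 15, 15, 12, 8]   # the two most recent days are still incomplete

aware = log_value_of_evidence(counts, baseline, cdf)      # like scenario B
unaware = log_value_of_evidence(counts, baseline, [1.0])  # like scenario A

print("log V, delay aware:  ", aware)
print("log V, delay unaware:", unaware)
print("posterior P(outbreak), prior 0.01:", posterior_probability(0.01, aware))
```

In this toy example the delay-unaware calculation compares the incomplete recent counts against the full baseline and therefore accumulates less evidence for an outbreak than the delay-aware calculation, which is the qualitative effect reported for scenarios A and B. Working with log V avoids numerical underflow when evidence is summed over a 30-day window.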