Abstract
Objective
To determine the merits of different surveillance methods for cluster detection, in particular when used in conjunction with small area data. This will be investigated using a simulated framework, with a view to supporting further surveillance work using real small area data.
Introduction
Health surveillance is well established for infectious diseases, but less so for non-communicable diseases. When spatio-temporal methods are used, the choice of method often appears to be driven by arbitrary criteria rather than by optimal detection capabilities. Our aim is to use a theoretical simulation framework with known spatio-temporal clusters to investigate the sensitivity and specificity of several traditional (e.g. SaTScan and Cusum) and Bayesian (including BaySTDetect and DCluster) statistical methods for spatio-temporal cluster detection of non-communicable disease.
Methods
Count data were generated using various random effects (RE). A subset of areas was randomly given an increased relative risk (RR) to simulate disease clusters. Simulations were conducted in R using a grid of 625 areas. We used 12 time steps within a hierarchical Poisson model. Multiple values of model parameters, including REs and the RR within clusters, were then tested. The range of RE values was derived from real-world data from England on common and rare diseases. RRs ranging between 1.2 and 1.8 were tested to reflect both low and high exposures to pollutants and other risk factors. ROC analysis, based on 50 simulations, was used to assess the performance of each statistical method for each combination of parameter values.
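The data-generating step described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, not the R code used in the study, and every specific value other than the 625 areas and 12 time steps is an assumption made for illustration: the expected count per area (`expected`), the standard deviation of the area-level random effect (`re_sd`), the number of cluster areas (`n_cluster`), and the simplification that the raised RR applies at every time step.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(side=25, n_times=12, expected=5.0, re_sd=0.2,
             rr=1.5, n_cluster=10, seed=1):
    """Simulate counts on a side x side grid; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    n_areas = side * side  # 25 x 25 = 625 areas
    # Randomly choose the areas that receive an increased relative risk.
    cluster = set(rng.sample(range(n_areas), n_cluster))
    counts = []
    for area in range(n_areas):
        # Area-level random effect on the log scale (assumed normal).
        effect = rng.gauss(0.0, re_sd)
        risk = math.exp(effect) * (rr if area in cluster else 1.0)
        # Assumption: the raised risk applies at every time step.
        counts.append([poisson(rng, expected * risk) for _ in range(n_times)])
    return counts, cluster


counts, cluster = simulate()
```

Repeating `simulate` with different seeds and different values of `rr` and `re_sd` gives one data set per simulation and parameter combination, which is the kind of input a detection method would then be run on.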
Results
Our ROC analysis suggested that SaTScan usually had the highest specificity at low sensitivities (<0.5), although its maximum sensitivity was often lower than that of the Bayesian methods. In scenarios where the RR within clusters was lower, all methods had lower sensitivity at a given specificity. Cusum usually performed quite similarly to SaTScan, while the two Bayesian methods considered often misidentified a high proportion of disease clusters. P-values generated by SaTScan need to be interpreted with caution, as they did not correspond closely to the sensitivity or specificity of the ROC curves from our simulations.
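For readers less familiar with ROC analysis, the following sketch shows how sensitivity and specificity can be computed at each decision threshold when the true cluster membership is known, as it is in a simulation. It is a generic Python illustration, not the analysis code used here; the function name `roc_points` and the toy `scores` and `truth` values are hypothetical.

```python
def roc_points(scores, truth):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    scores: per-area detection scores, higher meaning more suspicious.
    truth:  per-area booleans, True if the area belongs to a simulated cluster.
    """
    positives = sum(truth)
    negatives = len(truth) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        flagged = [s >= threshold for s in scores]
        true_pos = sum(f and t for f, t in zip(flagged, truth))
        true_neg = sum((not f) and (not t) for f, t in zip(flagged, truth))
        points.append((threshold, true_pos / positives, true_neg / negatives))
    return points


# Toy example: six areas, two of which are true cluster areas.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
truth = [True, False, True, False, False, False]
points = roc_points(scores, truth)
```

Lowering the threshold flags more areas, so sensitivity rises while specificity falls; a method whose curve stops short of sensitivity 1 at useful specificities has a limited maximum sensitivity in the sense used above.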
Conclusions
Real-world investigations of spatio-temporal signals (e.g. disease clusters) are often complex and time consuming. Identifying the method that best reduces the risks of flagging false positives and of missing real clusters is therefore essential. Despite the inherent constraints of theoretical simulations, such a framework allows the performance of different methods to be assessed objectively. Overall, our simulation framework suggested that SaTScan would usually be the easiest, most user-friendly and best-performing space-time method for non-communicable disease surveillance.