Improving mapping of flu epidemics through data science

In the UK, the seasonal flu epidemic kills an average of 600 people each year, but this can fluctuate enormously, especially in pandemic years. However, capturing enough data to predict an outbreak is difficult due to the unpredictable nature of the influenza virus.

Many factors can influence the length and severity of an outbreak. These range from which types of influenza virus are spreading and when an outbreak peaks, to whether scientists can offer a vaccine that matches the circulating virus. Furthermore, as the influenza virus adapts and mutates, every solution must be provided on a case-by-case basis. For example, the swine flu outbreak of 2009 involved an irregular strain, in that it did not severely affect the age groups who are usually most vulnerable. While seasonal flu often has its greatest impact on older people, those worst affected by swine flu were adults under 65 years old.

Since 1967, the Royal College of General Practitioners Research and Surveillance Centre (RCGP RSC) network has been actively monitoring incidents of influenza, alongside other diseases, in a long-standing surveillance and research partnership with Public Health England and its predecessor organisations. The network is the principal surveillance system for England. It extracts anonymised data from around 200 volunteer GP practices. This data enables the NHS to detect spikes in outbreaks across the seasons of the year, and to support vaccination programmes and assess their effectiveness. The network played an important role in reporting the swine flu epidemic. (The RCGP RSC secure data and analytics hub moved to Surrey in 2015.)

The quality and accuracy of the UK’s medical records are high, but factors such as the structure of GP appointments and human input can lead to inconsistencies. Most GP appointments are 10 minutes long and a GP can see as many as 40 people a day. These constraints can limit data quality in computerised records. Any inconsistencies in the recorded data or reports limit the knowledge we can glean from the retrospective data. We at the University of Surrey usually run weekly data quality checks to ensure the data received is coherent. However, due to data quality issues or incomplete extraction, we have to exclude around 10% of recorded data because it does not meet the required standards.


We have recently started work with NPL, using their data mining expertise to study the data monitored by RCGP RSC and correct missing or miscoded incidents not properly tracked by the network, in particular differentiating a new incident case of an influenza-like illness from a follow-up. NPL’s algorithm will enable us to analyse the corrected data weekly, and historical data will be analysed retrospectively. This level of accuracy should provide us with a clearer and more comprehensive picture of peaks in different influenza epidemics, and allow us to include data we currently discard. Such visibility will allow us to identify trends over time more easily, and ultimately to improve the accuracy of epidemic early warnings and increase our understanding of the impact of infection on certain demographics. It will also give us a new perspective on the efficacy of new vaccines and other treatments and improve our ability to plan for and treat flu epidemics.
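To illustrate the distinction between a new incident case and a follow-up, here is a minimal Python sketch. It is not NPL's algorithm, whose details are not described here. It assumes a simple rule invented for this example: a consultation counts as a follow-up if the same patient had an influenza-like illness consultation within the previous 28 days. The threshold, the function names and the sample records are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical threshold, not taken from the RCGP RSC or NPL: a consultation
# within this many days of the patient's previous one is treated as a follow-up.
FOLLOW_UP_WINDOW = timedelta(days=28)


def label_consultations(records, window=FOLLOW_UP_WINDOW):
    """Label each (patient_id, consultation_date) pair as 'new' or 'follow-up'."""
    last_seen = {}  # most recent consultation date per patient
    labelled = []
    # Process consultations in date order so "previous" is well defined.
    for patient_id, day in sorted(records, key=lambda r: (r[1], r[0])):
        previous = last_seen.get(patient_id)
        is_follow_up = previous is not None and day - previous <= window
        labelled.append((patient_id, day, "follow-up" if is_follow_up else "new"))
        last_seen[patient_id] = day
    return labelled


def weekly_new_cases(labelled):
    """Count new incident cases per ISO (year, week)."""
    counts = Counter()
    for _, day, label in labelled:
        if label == "new":
            iso = day.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Made-up anonymised records, for illustration only.
records = [
    ("p1", date(2016, 1, 4)),
    ("p1", date(2016, 1, 11)),  # 7 days after p1's first visit
    ("p2", date(2016, 1, 12)),
    ("p1", date(2016, 3, 7)),   # 56 days after p1's previous visit
]
labelled = label_consultations(records)
weekly = weekly_new_cases(labelled)
```

Counting only the consultations labelled as new gives a weekly series in which repeat visits for the same illness do not inflate the apparent size of a peak. A real classifier would need to handle miscoded and missing records, which this sketch ignores.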

This opens many avenues for making better use of this data in the future. For example, there is the potential to combine current data sets with other information, such as data from social media. We have researched this possibility in conjunction with University College London and PHE. The study found a correlation between our data sets and data from Twitter. It is best to be cautious about social media in the context of predicting epidemics, given the self-reported and unverified nature of the data. However, it is certainly true that as we improve our ability to analyse and utilise big data, such as through our partnership with NPL, we will unlock exciting new opportunities to improve and innovate healthcare in the UK.
