Power of Population Data Science Webinar: Future Directions in Probabilistic Linkage

Thursday, October 11, 2018
2:30pm (London, UK time zone)

All sessions will be delivered live and online via the GoToWebinar system.

Can’t attend the live session? This presentation will be recorded and posted on PopData's YouTube channel and the International Journal of Population Data Science (IJPDS) website for future reference. We recommend that you register for the presentations of your choice so we can send you links to the recorded sessions as they become available.

Probabilistic linkage as currently implemented is probabilistic in name only. The application of thresholds and clerical review reduces probabilistic linkage to being essentially deterministic once more: all record pairs are treated as either links or nonlinks. This fails to capture information about the uncertainty and error that is usually implicit in the linkage process, and does not support probabilistic analysis of linked data. This seminar will start by examining some common misconceptions about probabilistic linkage and its relation to deterministic techniques, highlighting the important differences that do exist and introducing emerging alternatives to the Fellegi-Sunter approach. We will then look at how probabilistic linkage techniques can be built on and used to support analysis that carries information about error and uncertainty in linkage through to results and conclusions.
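To make the point about thresholds concrete, here is a minimal Python sketch of Fellegi-Sunter style scoring. It is an illustration only, not the presenters' method. The field names, the m- and u-probabilities, the two thresholds, and the prior match probability are all hypothetical values chosen for the example, and the fields are assumed to be conditionally independent.

```python
import math

# Hypothetical m- and u-probabilities for each compared field:
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
M_U = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.002),
}


def match_weight(agreement):
    """Sum the log2 likelihood ratios over fields (assumes independence)."""
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_U[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Threshold-based decision; the thresholds here are hypothetical."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "nonlink"
    return "clerical review"


def match_probability(weight, prior):
    """Posterior match probability: posterior odds = prior odds * 2**weight."""
    odds = prior / (1 - prior) * 2 ** weight
    return odds / (1 + odds)


pair = {"surname": True, "birth_year": True, "postcode": False}
weight = match_weight(pair)
print(classify(weight))                        # categorical decision
print(match_probability(weight, prior=0.001))  # retained uncertainty
```

The output of classify discards how close a pair sat to a threshold, whereas match_probability keeps a value that later analysis could, in principle, use as a weight or as input to imputation.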

Topics covered will include:

  • Understanding error and uncertainty in data linkage
  • Common myths and misconceptions about probabilistic linkage
  • Emerging alternatives to Fellegi-Sunter
  • Moving towards probabilistic analysis of linked data
  • Bias analysis
  • The missing data toolbox: Imputation-based analysis of linked data
  • Summary of lessons learned and implications for enhancing research in Population Data Science

View a preview of the webinar below or view the full webinar presentation on YouTube.


Dr. James Doidge is an epidemiologist based at University College London, Great Ormond Street Institute of Child Health. James has worked with linked administrative data in Australia and the UK and has advised the UK Office for National Statistics and NHS Digital on data linkage matters. He has taught courses on data linkage at several institutions around the UK through his work with the Administrative Data Research Network. His current interests focus on developing new methods for data linkage and analysis of linked data, which he applies to a diverse range of research topics in population health. Read more about James here.

Dr. Harvey Goldstein is a Professor of Social Statistics at the University of Bristol and the University College London Institute of Child Health, as well as a visiting professor at the London School of Hygiene.

He is currently a joint editor of the Journal of the Royal Statistical Society, Series A, and was awarded the RSS Guy Medal in Silver in 1998 and elected a Fellow of the British Academy in 1997. Dr. Goldstein recently authored a substantive reference text in the area of statistical data analysis entitled Multilevel Statistical Models (Wiley, 4th edition).

His research interests include statistical modelling techniques in the construction and analysis of educational tests, with a particular interest in institutional and international comparisons. Dr. Goldstein’s most recent work has focussed on developing efficient methods for handling missing data and measurement errors in complex models, including multilevel ones; procedures for unbiased and efficient record linkage of large datasets; and procedures for maintaining data integrity while ensuring privacy in the release and analysis of big data sets. Read more about Harvey here.

Page last revised: October 15, 2018