Identification algorithms and related considerations when using administrative data for epidemiology

12:00 noon to 1:15 pm PST

This webinar is part of the Advanced Methods Webinar Series

Administrative health data (e.g., healthcare practitioner encounters, hospitalizations, and prescription dispensations) are routinely-collected, population-based sources of health information that hold substantial potential for health services evaluation and epidemiological research.

Since such data are collected for administration, budgeting, and/or billing purposes – not for research purposes per se – understanding how best to use and make inferences based on such administrative health data for epidemiology is important.

Administrative data often require modification and transformation before they are useful for research. This contrasts with researcher-designed data collection (e.g., cohort studies, surveys), in which there is typically a high degree of control over the type and timing of information collected. Hence, in many instances administrative data cannot be taken at face value.

Failure to appropriately transform administrative data can introduce various information biases into analyses and results, including misclassification of variables (e.g., exposures, outcomes) and under- or over-counting of health events (e.g., the number of hospitalizations). Accordingly, a central component of appropriately using administrative health data is creating, evaluating, and applying so-called ‘identification algorithms’ to help mitigate such biases.

Such algorithms are typically a set of conditions or rules used to transform data from their raw format into a less biased version. A common example is a case-finding algorithm – a pattern of healthcare service use (including healthcare practitioner encounters and/or medication dispensations) that is deemed to meaningfully classify a person as having a certain health condition (e.g., asthma). Another example is an algorithm that identifies a hospitalization episode of care (e.g., combining a patient’s sequential hospital stays that are ‘components’ of a single episode, to avoid counting each component as a distinct hospitalization episode). The precise criteria used to define such algorithms carry important methodological considerations, which can affect research findings and the decisions based on them.
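The two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an algorithm presented in the webinar: the case rule (at least one hospitalization, or at least two practitioner visits on different days within 365 days), the one-day gap used to join hospital stays, and the function names `is_case` and `build_episodes` are all hypothetical choices made for the example. Real definitions should come from validated, published algorithms.

```python
from datetime import date


def is_case(visit_dates, hosp_dates, window_days=365):
    """Hypothetical case-finding rule for one person.

    Inputs are dates of encounters that already carry a relevant
    diagnosis code (e.g., asthma). The person is classified as a case if
    they have >= 1 hospitalization, or >= 2 practitioner visits on
    different days that fall within `window_days` of each other.
    """
    if hosp_dates:
        return True
    visits = sorted(set(visit_dates))  # count distinct visit days only
    # Any two consecutive distinct visit days close enough together qualify.
    return any((later - earlier).days <= window_days
               for earlier, later in zip(visits, visits[1:]))


def build_episodes(stays, max_gap_days=1):
    """Combine one patient's hospital stays into episodes of care.

    `stays` is a list of (admit_date, discharge_date) tuples. A stay that
    begins within `max_gap_days` of the previous episode's discharge
    (an assumed threshold) is treated as a component of that episode,
    for example a transfer between hospitals.
    """
    episodes = []
    for admit, discharge in sorted(stays):
        if episodes and (admit - episodes[-1][1]).days <= max_gap_days:
            prev_admit, prev_discharge = episodes[-1]
            episodes[-1] = (prev_admit, max(prev_discharge, discharge))
        else:
            episodes.append((admit, discharge))
    return episodes


# Example: three stay records, the first two joined by a same-day transfer.
stays = [
    (date(2020, 1, 1), date(2020, 1, 5)),
    (date(2020, 1, 5), date(2020, 1, 9)),
    (date(2020, 3, 1), date(2020, 3, 2)),
]
episodes = build_episodes(stays)
print(len(stays), "stay records ->", len(episodes), "episodes")
```

Counting the raw records here would report three hospitalizations, whereas the episode logic reports two. Changing `max_gap_days` or `window_days` changes who and what gets counted, which is why the definition criteria matter.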

This webinar will focus on identification algorithms and related considerations within the context of epidemiological research leveraging linked multi-source administrative health data in Canada. Specifically, this session will:

  • Introduce various important concepts relevant to using identification algorithms with administrative health data, including the nature of administrative health data, data quality considerations, and examples of algorithm types (e.g., case-finding, healthcare events [such as hospitalization episodes of care], and socio-demographic characteristics).
  • Highlight considerations when using diagnostic codes (e.g., ICD), drug identification numbers (DINs), and other attributes/values located within administrative health data that are often relevant as inputs to identification algorithms.
  • Unpack how potential biases can be mitigated when using administrative health data for epidemiological research – with respect to identification algorithms (e.g., generating and evaluating algorithms’ validity evidence, conceptual considerations, understanding limitations and strengths of administrative data and other sources included in a data linkage).
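As a rough illustration of what evaluating an algorithm’s validity evidence can involve, the sketch below compares algorithm classifications against a reference standard (e.g., chart review) and computes common accuracy measures. The function name `validity_metrics` and the example data are hypothetical; this is a minimal sketch, not the approach presented in the webinar.

```python
def validity_metrics(algorithm_flags, reference_flags):
    """Compare algorithm classifications with a reference standard.

    Both inputs are equal-length sequences of booleans, one entry per
    person. Returns sensitivity, specificity, PPV, and NPV; a measure is
    None when its denominator is zero.
    """
    if len(algorithm_flags) != len(reference_flags):
        raise ValueError("inputs must have the same length")

    tp = fp = fn = tn = 0
    for flagged, truly_has_condition in zip(algorithm_flags, reference_flags):
        if flagged and truly_has_condition:
            tp += 1
        elif flagged and not truly_has_condition:
            fp += 1
        elif not flagged and truly_has_condition:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true cases the algorithm finds
        "specificity": ratio(tn, tn + fp),  # non-cases correctly excluded
        "ppv": ratio(tp, tp + fp),          # flagged people who are true cases
        "npv": ratio(tn, tn + fn),          # unflagged people who are non-cases
    }


# Made-up example: five people, algorithm result vs. reference standard.
algorithm = [True, True, False, False, True]
reference = [True, False, False, False, True]
metrics = validity_metrics(algorithm, reference)
print(metrics)
```

Predictive values depend on how common the condition is in the validation sample, so they may not transfer directly to a population with a different prevalence.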



Scott Emerson, MSc, is an Epidemiologist within the Epidemiology and Population Health Program of the BC Centre for Excellence in HIV/AIDS (BC-CfE), and is based at St. Paul’s Hospital in Vancouver.

He completed his MSc (Epidemiology) at UBC’s School of Population and Public Health (SPPH). During his graduate training, Scott gained experience in understanding, managing, and analyzing administrative healthcare data as well as their multi-source linkage with survey, educational, and immigration data. His master’s thesis focused on generating and evaluating validity evidence for a quality of life measure, and was supported by a Tri-Council Canada Graduate Scholarship (CGS-M).

His prior experience includes analytic epidemiological roles at ICES (the Institute for Clinical Evaluative Sciences; Toronto) and UBC’s Human Early Learning Partnership (a child health research centre). In his current role, Scott supports various initiatives focused on education and capacity-building in the application of linked administrative healthcare data for epidemiology.


What did you think of this webinar?

Please take a few minutes to complete our online survey. Your feedback will help shape future webinar series!

