Data sources and methodological differences

We source the data for the Global Health Explorer from three main sources: 

  • World Health Organization’s Global Health Estimates
  • World Health Organization’s Global Health Observatory
  • Institute for Health Metrics and Evaluation’s Global Burden of Disease Study

Global Health Estimates (WHO GHE)

The Global Health Estimates are primarily calculated using cause-of-death statistics that are reported to the WHO by individual countries. 

These vital registration (VR) statistics are submitted to the WHO Mortality Database on an annual basis by country, year, cause, age and sex. This data is included in the Global Health Estimates if it meets criteria assessing completeness and quality. Since many countries don’t meet these criteria, the GHE does not incorporate VR statistics for every country.

A number of specialist WHO groups and UN agencies collect topic- or disease-specific data. The dataset on HIV and AIDS, collected and published by UNAIDS, is one example. The estimates based on the VR data are compared with the data from these specialist groups and agencies, and adjustments are made if necessary.

Where the VR data is not usable and there is no other nationally representative cause-of-death data, the World Health Organization adopts the IHME’s Global Burden of Disease data to fill the gaps. Full details on how the WHO’s Global Health Estimates are calculated are available here.
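The order of preference described above can be sketched as a simple decision rule. This is only an illustrative simplification of the prose, not the WHO's actual procedure: the function name, its two boolean inputs, and the returned labels are all hypothetical, and the real process also includes the comparison and adjustment step against specialist agency data.

```python
def choose_cause_of_death_source(vr_usable: bool, other_national_data: bool) -> str:
    """Return a label for the data source a country's estimates would draw on.

    Hypothetical helper: the inputs stand in for the WHO's completeness and
    quality criteria, which are not reproduced here.
    """
    if vr_usable:
        # Vital registration statistics meet the completeness and quality criteria.
        return "vital registration"
    if other_national_data:
        # Some other nationally representative cause-of-death data exists.
        return "other nationally representative data"
    # Neither is available, so the gap is filled with IHME estimates.
    return "IHME Global Burden of Disease"


# A country with unusable VR data and no other national source:
fallback = choose_cause_of_death_source(vr_usable=False, other_national_data=False)
```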

Global Health Observatory (WHO GHO)

The WHO’s Global Health Observatory brings together a large number of variables produced by the WHO and specialist UN agencies. 

The variables found within the Global Health Observatory are limited to the leading causes of death and injury. This means the WHO GHO is less ‘complete’ than the IHME Global Burden of Disease study and WHO GHE in terms of the diseases included.

The method used to produce each of these variables is different and tailored to the specific cause of death or injury. This differs from both the Global Burden of Disease (IHME) and the Global Health Estimates (WHO), which attempt to use a consistent modeling approach for all causes of death and injury. Additionally, variables in the Global Health Observatory are not consistently disaggregated by age or sex. Instead, they show the data that is available and most relevant for the given variable.

The full list of variables in the Global Health Observatory is available here; each variable has its own associated metadata and method.

Global Burden of Disease (IHME)

In the Global Burden of Disease study, the IHME uses a wide range of input data. This includes, but is not limited to, census data, household surveys, civil registration and vital statistics, disease registries, health service use, air pollution monitors, satellite imaging, disease notifications, and other sources. These data are available through scientific journals, reports, online databases, books, news reports, and other resources. For most diseases and injuries, the data are used as input into a series of standardized models, which generate estimates of each disease or injury for all age groups, sexes, locations, and years.

Since the publication of its Global Burden of Disease study in 2017, the IHME has used its own population estimates. These differ from those used by the UN Population Division, which are used by the World Health Organization (WHO). This key difference propagates through all of the resulting health datasets. Even if the IHME and WHO assumed the same rates of health and mortality burden, this difference in population estimates means many of their population-adjusted figures would be different.
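A small worked example may make this concrete. The numbers below are entirely hypothetical (they are not IHME or UN figures), and the helper names are ours. The sketch shows that applying the same death rate to two different population estimates yields different death counts, and that the same death count yields different rates.

```python
def deaths_from_rate(rate_per_100k: float, population: float) -> float:
    """Convert a rate per 100,000 people into an absolute number of deaths."""
    return rate_per_100k * population / 100_000


def rate_per_100k(deaths: float, population: float) -> float:
    """Convert an absolute number of deaths into a rate per 100,000 people."""
    return deaths / population * 100_000


# Hypothetical population estimates for the same country and year.
population_a = 50_000_000  # stand-in for one source's estimate
population_b = 52_000_000  # stand-in for another source's estimate

# Same assumed death rate, different populations: the counts differ.
same_rate = 120.0
deaths_a = deaths_from_rate(same_rate, population_a)
deaths_b = deaths_from_rate(same_rate, population_b)
count_gap = deaths_b - deaths_a

# Same assumed death count, different populations: the rates differ.
rate_a = rate_per_100k(60_000, population_a)
rate_b = rate_per_100k(60_000, population_b)
```

In this illustration a 4% difference in the population estimate produces a gap of 2,400 deaths at an identical rate, before any difference in health modeling is considered.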

Full details on how the IHME’s Global Burden of Disease is calculated are available here and here.

Why do different sources disagree on figures for the same metric?

Collecting precise data on global health is difficult: we can never know exactly which diseases or injuries affect people across the world at any given point in time. In the absence of perfect data, health researchers have a number of ways by which they try to estimate the burden of global health outcomes. These estimates are what we present in the Global Health Explorer. 

Each of these sources has slightly different methods for estimating the burden of disease and causes of death, and in some cases the definitions of the diseases and injuries in question differ slightly, as our Explorer shows.

Differences in disease definition between sources

The diseases and injuries presented in the Global Health Explorer are typically aggregates of multiple similar causes of death. For example, ‘Deaths from falls’ includes 20 different types of fall which are combined into this one variable. 

Individual causes of death have an associated International Classification of Diseases (ICD) code. The tenth revision of the ICD, referred to as ICD-10, is used in the most recent Global Health Estimates and Global Burden of Disease. For example, ‘Fall from tree’ has the ICD-10 code ‘W14’, and the ICD-10 codes for all ‘Deaths from falls’ are ‘W00-W19’.
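To illustrate how a range such as ‘W00-W19’ groups individual codes into one aggregate, here is a minimal sketch. The helper names are hypothetical, and the check assumes codes whose three-character category is a letter followed by two digits, so that plain string comparison orders them correctly.

```python
def icd10_category(code: str) -> str:
    """Return the three-character category of an ICD-10 code, e.g. 'W14.1' -> 'W14'."""
    return code.strip().upper()[:3]


def in_icd10_range(code: str, start: str, end: str) -> bool:
    """Check whether a code's category lies within an inclusive range such as W00-W19."""
    # Lexicographic comparison works because categories share a fixed
    # letter-plus-two-digits layout.
    return start <= icd10_category(code) <= end


# 'Fall from tree' (W14) belongs to the 'Deaths from falls' range W00-W19.
fall_from_tree_is_fall = in_icd10_range("W14", "W00", "W19")

# A code outside the range, used here only as a counterexample.
outside_range = in_icd10_range("X59", "W00", "W19")
```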

Both the IHME and the WHO follow the same definition for ‘Deaths from falls’, but for some other causes of death their definitions of the same aggregate cause don’t include exactly the same ICD-10 codes. For example, ICD-10 code ‘F02.3 – A dementia developing in the course of established Parkinson disease’ is included as ‘Dementia’ in the WHO’s Global Health Estimates but as ‘Parkinson’s disease’ in the IHME’s Global Burden of Disease.
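One way to picture this is as two lookup tables that map each ICD-10 code to an aggregate cause. The sketch below contains only the two examples mentioned in the text; the dictionary layout, the label ‘Falls’, and the function name are our own illustrative choices, not either organization's published format.

```python
# Tiny illustrative mappings from ICD-10 code to aggregate cause of death.
who_ghe = {
    "W14": "Falls",
    "F02.3": "Dementia",
}
ihme_gbd = {
    "W14": "Falls",
    "F02.3": "Parkinson's disease",
}


def differing_codes(a: dict, b: dict) -> dict:
    """Return codes present in both mappings that are assigned to different causes."""
    shared = a.keys() & b.keys()
    return {code: (a[code], b[code]) for code in shared if a[code] != b[code]}


# Only F02.3 is classified differently in this example.
mismatches = differing_codes(who_ghe, ihme_gbd)
```

Deaths coded F02.3 would therefore be counted under a different aggregate in each source, even when the underlying death records are identical.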

These slight differences in the definition of diseases may also contribute to the different values we see when looking at the same metric from different sources. The ICD-10 codes used by IHME to define causes of death can be found here and the ICD-10 codes used by the WHO can be found here.