How we choose our data sources

Our articles and data visualizations rely on work from many different people and organizations. When citing this entry, please also cite the underlying data sources. This entry can be cited as:

Max Roser (2017) - "How we choose our data sources". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/about/how-we-chose-our-data-sources' [Online Resource]
Our World in Data presents the empirical evidence on global development in entries dedicated to specific topics.

This post is part of our series in which we reflect on our work, linked to from our about page.

I. Sources of data

One of our key tasks in producing this publication is to bring together the best and most informative data sets on a particular topic.

The data that we bring together is published by three types of sources:

  1. specialized institutes, like the Peace Research Institute Oslo (PRIO)
  2. research articles, like Bourguignon & Morrisson, 'Inequality Among World Citizens: 1820-1992', in the American Economic Review (2002)
  3. international institutions or statistical agencies, like the OECD, the World Bank, and UN institutions

In every visualization we indicate the source of the presented data.

II. How we choose which data to present

We follow six guidelines to decide which sources to accept and which data to present.

1) As far back into the past as possible - but up to today

The goal is to give a perspective on long-term development, and we therefore always aim to find time series data that reaches back as far as possible. Unfortunately, the availability of data is often itself an achievement of modern development, and data is frequently not available for the more distant past. One solution to this problem is data that researchers have later reconstructed, and we aim to give a more complete picture by taking such reconstructions into account.

At the same time, the idea is also to present a 'history of today', and we therefore want to ensure that the data presented reaches up to the present. The limitation here is that it often takes several years for researchers and international institutions to publish important data for the most recent period.

2) As global as possible

A second objective is to give an account of each topic that includes as many societies, countries, and world regions as possible.

3) Present data in its entirety

The merit of taking a historical perspective that studies long-term trends is that it shows the direction in which some aspect of our world is developing. Shorter sample periods may mask important trends, and a recent reversal of a long-term trend could be falsely interpreted as the direction of the long-term trend. We therefore always present the whole dataset and do not cut off the original data.
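
The point about short sample periods can be illustrated with a small sketch. The series below is entirely made up for illustration (it is not taken from any dataset we present), and the helper name `slope` is our own: it computes an ordinary least-squares trend line. Under these assumptions, the trend fitted to the full period points downward, while the trend fitted to only the last four years points upward.

```python
# Hypothetical yearly series: a long decline followed by a brief recent uptick.
# All numbers are invented for illustration only.
years = list(range(1990, 2020))
values = [100 - 2 * (y - 1990) for y in years[:-3]] + [50, 52, 54]


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Trend over the whole dataset versus trend over a truncated recent window.
full_trend = slope(years, values)
recent_trend = slope(years[-4:], values[-4:])

print("full period trend is negative:", full_trend < 0)
print("last four years trend is positive:", recent_trend > 0)
```

A reader shown only the last four years would see a rising series, even though the series as a whole has fallen over the long run.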

4) Comparable through time and across societies

A further objective is to ensure that the data we present is comparable across time and across societies.

When data is not comparable across countries or through time, we highlight this in the text accompanying the visualization.

5) There is no other data - or we would include this data

An important promise is that we do not withhold any data that would give a different impression of the long-term development of some aspect. If two credible sources publish statistics that contradict each other, indicating an open debate between researchers, then we say so.

6) Reference the original source

To make the database useful for readers and to credit the important work of those who construct the data presented here, we aim to always reference the original source of the data.

We could very well fail to notice that we have violated our own guidelines. If you find that we are not following our own guidelines, or you have any other complaints, please contact us through the form below by clicking on 'Give us Feedback'.