Having access to reliable cross-country data on taxation is important because it helps us understand and contextualize the changing landscape of public policy around the world. And yet, while the importance of high-quality data on taxation that is comparable across time and countries is widely recognised, there are substantial deficiencies in the estimates published by traditional sources. In this blog post I want to highlight the efforts of the International Centre for Tax and Development (ICTD) to produce a new dataset that tries to address some of these deficiencies, and that has perhaps not received all the attention it deserves.
Today, the most widely used source of cross-country data on taxation is arguably the IMF Government Finance Statistics (IMF-GFS). Unfortunately, despite IMF efforts, the estimates published in this source are problematic for a number of reasons.
Perhaps the most important limitation of IMF-GFS estimates is coverage: there is extensive missing data, with large spatial and temporal gaps. Yet coverage is not the only limitation. The IMF-GFS estimates are inconsistent with estimates published in other mainstream sources such as the World Bank World Development Indicators (WB-WDI), both because of differences in the methodology used to construct the variables, and because of inconsistencies in the way countries collect and report underlying revenue data.
The following chart, from Prichard (2016), shows an example of data discrepancies in tax revenues for Ghana. The different series correspond to different sources: the blue line denotes estimates using IMF Article IV reports, the orange line denotes estimates from the IMF-GFS, the yellow line denotes data from IMF Country Reports, and the green line denotes estimates from the World Bank World Development Indicators.
As can be seen, research findings and policy conclusions would be quite different depending on the source that one uses to characterize the evolution of tax revenues in Ghana.
In response to these problems, the International Centre for Tax and Development (ICTD) recently started producing a cleaned dataset that combines data from multiple sources (including the OECD, the IMF-GFS and the WB-WDI, among others), applies consistency checks, and flags potential problems that may arise in interpreting the estimates. This is a very valuable resource for research and a clear improvement over the individual underlying sources.
The following visualization maps cross-country estimates of total tax revenues as a share of GDP from the new ICTD Government Revenue Dataset (ICTD-GRD). You can find many other visualizations using this data in our entry on taxation.
One of the key advantages of the ICTD-GRD is that it stipulates a clear hierarchy for underlying sources, country by country. Generally speaking, the ICTD-GRD prioritises sources with (i) longer time series, and (ii) higher levels of disaggregation of sub-categories of revenues, such as social contributions. Yet each case is analysed independently, so the underlying sources are hand-picked, country by country, year by year. This yields fewer missing observations and more consistent estimates across countries and time.
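The general rule described above (prefer longer series, then finer disaggregation) can be illustrated with a minimal sketch. This is not the ICTD's procedure: as noted, the actual sources are hand-picked case by case. The class `SourceSeries`, the function `rank_sources`, the source labels and all figures below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSeries:
    """One candidate source for a single country (hypothetical structure)."""
    name: str             # placeholder label, not a real source
    years: tuple          # years for which the source reports a value
    n_subcategories: int  # number of revenue sub-categories reported


def rank_sources(candidates):
    """Order candidates: longer time series first, then finer disaggregation."""
    return sorted(
        candidates,
        key=lambda s: (len(s.years), s.n_subcategories),
        reverse=True,
    )


# Made-up candidates for one country.
candidates = [
    SourceSeries("source_a", tuple(range(1990, 2001)), 3),
    SourceSeries("source_b", tuple(range(1980, 2011)), 2),
    SourceSeries("source_c", tuple(range(1980, 2011)), 5),
]

ranking = [s.name for s in rank_sources(candidates)]
print(ranking)
```

A mechanical ranking like this would only be a starting point; the manual, country-by-country review is what the dataset actually relies on.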
Another important advantage of the ICTD-GRD is that it flags estimates that seem problematic. These flags include remarks such as ‘Data not credible’, ‘Data is of Questionable Analytical Comparability’, and ‘Cannot Exclude Resource Revenue from Sub-Components of Total Tax Revenue’. This is convenient for anyone interested in doing econometric analysis: researchers can report results both with and without the flagged data.
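As a rough illustration of reporting results with and without flagged data, the sketch below computes a simple average twice. The field names (`tax_share_gdp`, `flag`), the function `mean_tax_share` and all values are assumptions made up for the example; they do not reflect the actual layout or contents of the ICTD-GRD.

```python
from statistics import mean

# Illustrative rows only: countries and values are invented for the example.
rows = [
    {"country": "A", "year": 2010, "tax_share_gdp": 12.0, "flag": None},
    {"country": "B", "year": 2010, "tax_share_gdp": 30.0,
     "flag": "Data not credible"},
    {"country": "C", "year": 2010, "tax_share_gdp": 18.0, "flag": None},
    {"country": "D", "year": 2010, "tax_share_gdp": 20.0,
     "flag": "Data is of Questionable Analytical Comparability"},
]


def mean_tax_share(rows, include_flagged):
    """Average tax share, optionally dropping observations that carry a flag."""
    selected = [
        r["tax_share_gdp"]
        for r in rows
        if include_flagged or r["flag"] is None
    ]
    return mean(selected)


with_flagged = mean_tax_share(rows, include_flagged=True)
without_flagged = mean_tax_share(rows, include_flagged=False)
print(with_flagged, without_flagged)
```

The same pattern carries over to a regression: estimate the model once on the full sample and once on the unflagged subsample, then compare the two sets of results.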
Of course, there remains much to be done regarding cross-country data quality in the field of taxation. While the transparent process of manual data cleaning in the ICTD-GRD is clearly a step forward, it is by no means a definitive solution. Misreporting by countries makes it difficult to unambiguously rank sources.
You can read more about how the ICTD-GRD was constructed in Prichard et al. (2014). And you can read more about how this new dataset has been used to re-examine major research questions – such as the relationships between tax and aid, elections, economic growth, and democratization – in Prichard (2016).