
# Military expenditure


This data combines nine different military expenditure data sources using a latent variable model. The model links the country-year data together and estimates a mean with a prediction interval for each observation. For more information about the methodology, see the original article.

"Latent variable model

In the main manuscript, we present, estimate, and describe a latent variable model that links together observed dataset values from across many sources of military expenditure data.

We are interested in estimating country-year military spending. Using military expenditure data presents several challenges because the datasets are incomplete, cover short periods of time, and are presented in many different monetary units-of-measurement. To overcome these challenges, we specify a dynamic latent variable measurement model that links all of the available information across different contemporary and historical sources of arms spending data. We essentially want to estimate the country-year distribution or simply the average of military spending across all the available observed dataset values so that we generate the best estimate of military spending for each of the country-year units.

The observed dataset values are linked together through the estimation of a country-year parameter or latent trait. However, the latent trait parameter itself is not directly of interest for inference because it does not have a direct monetary interpretation. This is because it is scaled by the item-specific intercept parameter which transforms the latent trait into the unit-of-measurement of any one of the originally observed military expenditure variables. The measurement model provides predictive intervals for each of the original observed variables on the original scales of these variables. Notationally, we represent the observed country-year dataset values as $y_{itj}$ where $i$ indexes countries, $t$ indexes years of time, and $j$ indexes the dataset. The model then produces posterior predictive distributions of $y_{itj}$, which we denote as $\tilde{y}_{itj}$. These are normally distributed values (on the natural log scale). We can therefore take the average of $\tilde{y}_{itj}$ as $E(\tilde{y}_{itj})$ or the standard deviation of $\tilde{y}_{itj}$ as $\mathrm{sd}(\tilde{y}_{itj})$.

For the applications in the main manuscript and in this appendix, $\tilde{y}_{itj}$ is the key quantity we care about. It is the estimated value of $y_{itj}$, conditional on all the other observed information about military spending for a given country-year unit, which is captured by the latent trait $\theta_{\mathrm{cur}[it]}$ and then scaled by the item-specific intercept parameter $\alpha_j$. Note that, as described in the main manuscript, we also account for the relationship between current and constant monetary values through inflation by this year scaling relationship: $\theta_{\mathrm{con}[it]} = \beta_t \cdot \theta_{\mathrm{cur}[it]}$.

We approximate the posterior distributions of $\tilde{y}_{itj}$ by taking repeated draws from a Bayesian simulation model. Specifically, the measurement models are estimated with four MCMC chains run for 2,000 iterations each using the Stan software (Stan Development Team, 2021). The first 1,000 iterations are thrown away as a burn-in or warmup period. The 4,000 remaining samples were thinned by a factor of 2 and are used to generate the posterior prediction intervals for the original observed variables. Diagnostics (i.e., trace plots, effective sample size, and R-hats) all suggest convergence (Gelman and Hill, 2007).

So in the end, we have a normally distributed posterior prediction interval, $\tilde{y}_{itj}$, for every country-year dataset value. We can then compare the observed dataset values to these prediction intervals to see how well the model is doing at approximating these observed dataset values. We learn a lot from these descriptive comparisons as we demonstrate in the main manuscript and in additional detail in the rest of this appendix. Ultimately, these comparisons help us validate the resulting estimates relative to other estimates. Even the original data represents historic and government estimates, so such validation efforts are essential, especially when comparing long-term historical trends and making predictions about the future."
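The summary steps described in the quoted appendix can be illustrated with a minimal Python sketch. It is not the authors' code (the original model is estimated in Stan). The function names, the simulated draws, and the example values of roughly 5 billion US\$ are hypothetical; only the sampler settings (four chains, 2,000 iterations, 1,000 warmup, thinning by 2) come from the text. The sketch counts the retained draws, summarizes log-scale draws with a mean, standard deviation, and central 95% interval, and checks whether an observed value falls inside that interval.

```python
import math
import random
import statistics

# Sampler settings as stated in the quoted appendix.
N_CHAINS, N_ITER, N_WARMUP, THIN = 4, 2000, 1000, 2


def retained_draws(n_chains=N_CHAINS, n_iter=N_ITER, n_warmup=N_WARMUP, thin=THIN):
    """Draws left after discarding warmup and keeping every `thin`-th sample."""
    return n_chains * (n_iter - n_warmup) // thin


def summarize(log_draws, level=0.95):
    """Mean, sd, and central interval from draws on the natural log scale."""
    ordered = sorted(log_draws)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return {
        "mean_log": statistics.fmean(ordered),
        "sd_log": statistics.stdev(ordered),
        # Exponentiate the interval bounds to return to the monetary scale.
        "lower_usd": math.exp(lower),
        "upper_usd": math.exp(upper),
    }


def covers(summary, observed_usd):
    """True if an observed dataset value lies inside the prediction interval."""
    return summary["lower_usd"] <= observed_usd <= summary["upper_usd"]


# Hypothetical draws for one country-year-dataset value; simulated here
# because the real posterior draws are not part of this page.
rng = random.Random(0)
n_draws = retained_draws()
log_draws = [rng.gauss(math.log(5e9), 0.2) for _ in range(n_draws)]

summary = summarize(log_draws)
print(n_draws, summary, covers(summary, 5.2e9))
```

Working on the log scale and exponentiating only the interval bounds keeps the interval strictly positive, which matches the description of the draws as normally distributed on the natural log scale.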

Military expenditure
This data is expressed in US dollars. It is adjusted for inflation but does not account for differences in the cost of living between countries.
• Source: Barnum et al. - Global Military Spending Dataset (2024) – with minor processing by Our World in Data
• Last updated: July 22, 2024
• Next expected update: July 2025
• Date range: 1816–2022
• Unit: constant 2021 US\$

## Sources and processing

### This data is based on the following sources

Military spending data measure key international relations concepts such as balancing, arms races, the distribution of power, and the severity of military burdens. Unfortunately, missing values and measurement error threaten the validity of existing findings. Addressing this challenge, we introduce the Global Military Spending Dataset (GMSD). GMSD collates new and existing expenditure variables from a comprehensive collection of sources, expands data coverage, and employs a latent variable model to estimate missing values and quantify measurement error.

Retrieved on: July 22, 2024

Citation
This is the citation of the original data obtained from the source, prior to any processing or adaptation by Our World in Data. To cite data downloaded from this page, please use the suggested citation given in Reuse This Work below.
``````
Miriam Barnum; Christopher Fariss; Jonathan Markowitz; Gaea Morales (2024). Measuring Arms: Introducing the Global Military Spending Dataset. Journal of Conflict Resolution, 0(0). https://doi.org/10.1177/00220027241232964
Miriam Barnum; Christopher Fariss; Jonathan Markowitz; Gaea Morales (2022). "Global Military Spending Dataset", https://doi.org/10.7910/DVN/DHMZOW, Harvard Dataverse, V4; estimates_milex_con_20231205.rds [fileName]
``````

### How we process data at Our World in Data

All data and visualizations on Our World in Data rely on data sourced from one or several original data providers. Preparing this original data involves several processing steps. Depending on the data, this can include standardizing country names and world region definitions, converting units, calculating derived indicators such as per capita measures, as well as adding or adapting metadata such as the name or the description given to an indicator.

At the link below you can find a detailed description of the structure of our data pipeline, including links to all the code used to prepare data across Our World in Data.

## Reuse this work

• All data produced by third-party providers and made available by Our World in Data are subject to the license terms from the original providers. Our work would not be possible without the data providers we rely on, so we ask you to always cite them appropriately (see below). This is crucial to allow data providers to continue doing their work, enhancing, maintaining and updating valuable data.
• All data, visualizations, and code produced by Our World in Data are completely open access under the Creative Commons BY license. You have the permission to use, distribute, and reproduce these in any medium, provided the source and authors are credited.

### Citations

``“Data Page: Military expenditure”, part of the following publication: Bastian Herre and Pablo Arriagada (2013) - “Military Personnel and Spending”. Data adapted from Barnum et al. Retrieved from https://ourworldindata.org/grapher/military-spending-gmsd [online resource]``
``Barnum et al. - Global Military Spending Dataset (2024) – with minor processing by Our World in Data``
``Barnum et al. - Global Military Spending Dataset (2024) – with minor processing by Our World in Data. “Military expenditure” [dataset]. Barnum et al., “Global Military Spending Dataset Version 4” [original data]. Retrieved September 13, 2024 from https://ourworldindata.org/grapher/military-spending-gmsd``