
Scholarly publications on artificial intelligence per million people

CSET

What you should know about this indicator

  • The data covers scholarly publications on AI, including journal articles, conference papers, working papers, and preprints.
  • Articles are flagged as AI-related using machine learning models trained on subject tags from arXiv, an open repository for scientific papers.
  • The models only run on articles with English titles or abstracts. Research published only in other languages is missed. Coverage of Chinese research is also limited, since many Chinese journals are not in the underlying sources.
  • An article counts for a country if at least one of its authors is affiliated with an institution there. If authors are based in different countries, it counts once for each.
  • Authors are linked to the country of the institution they worked at when the article was published, not their country of origin.
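
The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the source's actual pipeline: the function name `count_by_country` and the sample articles are hypothetical, and each article is represented simply as a list of its authors' affiliation countries at publication time.

```python
from collections import Counter


def count_by_country(articles):
    """Count each article once per distinct country among its authors' affiliations."""
    counts = Counter()
    for author_countries in articles:
        # set() drops repeats, so two authors in the same country count once.
        for country in set(author_countries):
            counts[country] += 1
    return counts


# Hypothetical articles; each list holds the affiliation country of every author.
articles = [
    ["Germany", "Germany", "France"],  # counts once for Germany and once for France
    ["France"],
    ["Japan", "Germany"],
]

counts = count_by_country(articles)
print(dict(counts))
```

Because a multi-country article is counted once for each country, the country totals in this sketch add up to five even though there are only three articles. Summing country values therefore overstates the number of distinct publications.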
Scholarly publications on AI per million people, including journal articles, conference papers, working papers, and preprints. The data only covers articles with an English-language title or abstract.
Source
Center for Security and Emerging Technology (2026); Population based on various sources (2024) – with major processing by Our World in Data
Last updated
April 27, 2026
Next expected update
October 2026
Date range
2016–2024
Unit
publications per million people

Sources and processing

Center for Security and Emerging Technology – Country Activity Tracker: Artificial Intelligence

ETO's Country AI Activity Metrics dataset includes national-level metrics for AI-related research, patents, and private-market investment.

The metrics are derived from a variety of underlying data sources, including ETO's Merged Academic Corpus for research data; The Lens, PATSTAT, and 1790 Analytics for patents; and Crunchbase for company and investment data.

The dataset focuses on countries, not organizations or individuals, and on AI and its subfields. There are many ways to assess countries' AI activities, and the three types of metrics included here, while meaningful, are not exhaustive. The data also has a lag, making counts incomplete for recent years; the lag is especially significant for patent data.

Retrieved on
April 27, 2026
Retrieved from
Citation
This is the citation of the original data obtained from the source, prior to any processing or adaptation by Our World in Data. To cite data downloaded from this page, please use the suggested citation given in Reuse This Work below.

Various sources – Population

Our World in Data builds and maintains a long-run dataset on population by country, region, and for the world, based on various sources.

You can find more information on these sources and how our time series is constructed on this page: https://ourworldindata.org/population-sources

Retrieved on
March 31, 2026
Citation
This is the citation of the original data obtained from the source, prior to any processing or adaptation by Our World in Data. To cite data downloaded from this page, please use the suggested citation given in Reuse This Work below.
The long-run data on population is based on various sources, described on this page: https://ourworldindata.org/population-sources

All data and visualizations on Our World in Data rely on data sourced from one or several original data providers. Preparing this original data involves several processing steps. Depending on the data, this can include standardizing country names and world region definitions, converting units, calculating derived indicators such as per capita measures, as well as adding or adapting metadata such as the name or the description given to an indicator.

At the link below you can find a detailed description of the structure of our data pipeline, including links to all the code used to prepare data across Our World in Data.

Read about our data pipeline
Notes on our processing step for this indicator

We divided the source's country-level figures by population to calculate per-million-people values.
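
As a rough illustration of this step, the calculation amounts to a single division scaled to one million people. The function name `per_million` and the input figures below are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the dataset.

```python
def per_million(publications, population):
    """Return publications per one million people."""
    # Multiply before dividing so the scaling is done on whole numbers.
    return publications * 1_000_000 / population


# Hypothetical country: 5,000 AI publications and a population of 50 million.
value = per_million(5_000, 50_000_000)
print(value)  # 100.0 publications per million people
```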

How to cite this page

To cite this page overall, including any descriptions, FAQs or explanations of the data authored by Our World in Data, please use the following citation:

“Data Page: Scholarly publications on artificial intelligence per million people”, part of the following publication: Charlie Giattino, Edouard Mathieu, Veronika Samborska, and Max Roser (2023) - “Artificial Intelligence”. Data adapted from Center for Security and Emerging Technology, Various sources. Retrieved from https://archive.ourworldindata.org/20260430-092147/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.html [online resource] (archived on April 30, 2026).

How to cite this data

In-line citation

If you have limited space (e.g. in data visualizations), you can use this abbreviated in-line citation:

Center for Security and Emerging Technology (2026); Population based on various sources (2024) – with major processing by Our World in Data

Full citation

Center for Security and Emerging Technology (2026); Population based on various sources (2024) – with major processing by Our World in Data. “Scholarly publications on artificial intelligence per million people – CSET” [dataset]. Center for Security and Emerging Technology, “Country Activity Tracker: Artificial Intelligence”; Various sources, “Population” [original data]. Retrieved May 2, 2026 from https://archive.ourworldindata.org/20260430-092147/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.html (archived on April 30, 2026).

Quick download

Download the data shown in this chart as a ZIP file containing a CSV file, metadata in JSON format, and a README. The CSV file can be opened in Excel, Google Sheets, and other data analysis tools.

Data API

Use these URLs to programmatically access this chart's data and configure your requests with the options below. Our documentation provides more information on how to use the API, and you can find a few code examples below.

Data URL (CSV format)
https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.csv?v=1&csvType=full&useColumnShortNames=false
Metadata URL (JSON format)
https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.metadata.json?v=1&csvType=full&useColumnShortNames=false
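
The two URLs above share a base path and differ only in the file extension, so they can be assembled from their parts. The sketch below is a minimal example that assumes the query parameters behave as they appear in the URLs above (`v`, `csvType`, `useColumnShortNames`); the helper name `build_url` is hypothetical, and only the parameter values already shown on this page are used.

```python
from urllib.parse import urlencode

BASE = "https://ourworldindata.org/grapher/"
SLUG = "scholarly-publications-on-artificial-intelligence-per-million-people"


def build_url(extension, **params):
    """Join the chart slug, a file extension, and query parameters into one URL."""
    # urlencode keeps the keyword order, so the result matches the URLs listed above.
    return f"{BASE}{SLUG}{extension}?{urlencode(params)}"


options = {"v": 1, "csvType": "full", "useColumnShortNames": "false"}

csv_url = build_url(".csv", **options)
metadata_url = build_url(".metadata.json", **options)

print(csv_url)
print(metadata_url)
```

Keeping the options in one dictionary means the data and metadata requests stay in sync when an option is changed.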

Code examples

Examples of how to load this data into different data analysis tools.

Excel / Google Sheets
=IMPORTDATA("https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.csv?v=1&csvType=full&useColumnShortNames=false")
Python with Pandas
import pandas as pd
import requests

# Send the same descriptive User-Agent with both requests.
headers = {'User-Agent': 'Our World In Data data fetch/1.0'}

# Fetch the data.
df = pd.read_csv("https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.csv?v=1&csvType=full&useColumnShortNames=false", storage_options=headers)

# Fetch the metadata.
response = requests.get("https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.metadata.json?v=1&csvType=full&useColumnShortNames=false", headers=headers)
response.raise_for_status()  # Stop early if the server returned an HTTP error.
metadata = response.json()
R
library(jsonlite)

# Fetch the data
df <- read.csv("https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.csv?v=1&csvType=full&useColumnShortNames=false")

# Fetch the metadata
metadata <- fromJSON("https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.metadata.json?v=1&csvType=full&useColumnShortNames=false")
Stata
import delimited "https://ourworldindata.org/grapher/scholarly-publications-on-artificial-intelligence-per-million-people.csv?v=1&csvType=full&useColumnShortNames=false", encoding("utf-8") clear