
Cumulative number of large-scale AI systems by country


About this data

Refers to the location of the primary organization with which the authors of a large-scale AI system are affiliated. An AI system can have multiple authors, each potentially affiliated with different institutions, so a single system can contribute to the count for multiple countries. The 2024 data is incomplete and was last updated 20 June 2024.
Source
Epoch (2024) – with major processing by Our World in Data
Last updated
June 19, 2024
Next expected update
December 2024
Date range
2017–2024
Unit
AI systems

Sources and processing

This data is based on the following sources

A dataset that tracks compute-intensive AI models, with training compute over 10²³ floating point operations (FLOP). This corresponds to training costs of hundreds of thousands of dollars or more. 

To identify compute-intensive AI models, the team at Epoch AI used various resources, estimating compute when it was not directly reported. They searched benchmarks and repositories, such as Papers With Code and Hugging Face, for models exceeding 10²³ FLOP. They also explored non-English media and specific leaderboards, focusing particularly on Chinese sources.

Additionally, they examined blog posts, press releases from major labs, and scholarly literature to track new models. A separate table was created for models with unconfirmed but plausible compute levels. Despite thorough methods, proprietary and secretive models may have been missed.

Retrieved on
June 19, 2024
Citation
This is the citation of the original data obtained from the source, prior to any processing or adaptation by Our World in Data. To cite data downloaded from this page, please use the suggested citation given in Reuse This Work below.
Robi Rahman, David Owen and Josh You (2024), "Tracking Compute-Intensive AI Models". Published online at epochai.org. Retrieved from: 'https://epochai.org/blog/tracking-compute-intensive-ai-models' [online resource]

How we process data at Our World in Data

All data and visualizations on Our World in Data rely on data sourced from one or several original data providers. Preparing this original data involves several processing steps. Depending on the data, this can include standardizing country names and world region definitions, converting units, calculating derived indicators such as per capita measures, as well as adding or adapting metadata such as the name or the description given to an indicator.

At the link below you can find a detailed description of the structure of our data pipeline, including links to all the code used to prepare data across Our World in Data.

Read about our data pipeline
Notes on our processing step for this indicator

The number of large-scale AI systems by country is determined by tallying machine learning models according to the geographical location of the institutions with which their authors are affiliated. A single model can have multiple authors, each potentially affiliated with different institutions, so it can contribute to the count for multiple countries.
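
The tallying described above can be illustrated with a minimal Python sketch. It is not the actual Our World in Data pipeline code: the record layout, the model names, and the `cumulative_counts` helper are hypothetical, and it assumes that a model counts once per country even if several of its authors are affiliated with institutions in that country.

```python
from collections import defaultdict

# Hypothetical records: (model name, publication year, set of countries of
# the authors' affiliated organizations). These are NOT real Epoch data.
MODELS = [
    ("model-a", 2022, {"United States"}),
    ("model-b", 2023, {"United States", "United Kingdom"}),
    ("model-c", 2023, {"China"}),
    ("model-d", 2024, {"China", "United States"}),
]


def cumulative_counts(models):
    """Return {country: {year: cumulative number of models up to that year}}.

    A model with authors in several countries adds one to each of them,
    so the per-country totals can sum to more than the number of models.
    """
    # Step 1: count new models per country and year.
    yearly = defaultdict(lambda: defaultdict(int))
    for _name, year, countries in models:
        for country in countries:
            yearly[country][year] += 1

    # Step 2: turn the yearly counts into running totals over the full range.
    first = min(year for _, year, _ in models)
    last = max(year for _, year, _ in models)
    result = {}
    for country, per_year in yearly.items():
        running = 0
        result[country] = {}
        for year in range(first, last + 1):
            running += per_year.get(year, 0)
            result[country][year] = running
    return result


counts = cumulative_counts(MODELS)
print(counts["United States"])  # running totals for one country
```

In this toy input there are four models, but the 2024 totals across countries add up to six, because two of the models are attributed to two countries each.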

Reuse this work

  • All data produced by third-party providers and made available by Our World in Data are subject to the license terms from the original providers. Our work would not be possible without the data providers we rely on, so we ask you to always cite them appropriately (see below). This is crucial to allow data providers to continue doing their work, enhancing, maintaining and updating valuable data.
  • All data, visualizations, and code produced by Our World in Data are completely open access under the Creative Commons BY license. You have permission to use, distribute, and reproduce these in any medium, provided the source and authors are credited.

Citations

How to cite this page

To cite this page overall, including any descriptions, FAQs or explanations of the data authored by Our World in Data, please use the following citation:

“Data Page: Cumulative number of large-scale AI systems by country”, part of the following publication: Charlie Giattino, Edouard Mathieu, Veronika Samborska and Max Roser (2023) - “Artificial Intelligence”. Data adapted from Epoch. Retrieved from https://ourworldindata.org/grapher/cumulative-number-of-large-scale-ai-systems-by-country [online resource]
How to cite this data

In-line citation

If you have limited space (e.g. in data visualizations), you can use this abbreviated in-line citation:

Epoch (2024) – with major processing by Our World in Data

Full citation

Epoch (2024) – with major processing by Our World in Data. “Cumulative number of large-scale AI systems by country” [dataset]. Epoch, “Tracking Compute-Intensive AI Models” [original data]. Retrieved July 15, 2024 from https://ourworldindata.org/grapher/cumulative-number-of-large-scale-ai-systems-by-country