Description: We are looking for a new Data Scientist to join our data management team. This team is responsible for the whole chain of collection, transformation, documentation and dissemination of data on the many topics that we cover on Our World in Data.
Contract type: Contractor (flexible hours, starting from 3 days per week, preferably full time)
Deadline: Hiring on a rolling basis – please apply early, even if you are not available soon
Interview process: We will review applications on a rolling basis and contact candidates for intro calls. Shortlisted candidates will then be contacted for interviews and assessment.
Compensation: We will consider candidates at different experience levels. Compensation will be discussed early in the selection process and will depend on your profile and experience.
Research and data are crucial to making progress against the large problems the world is facing and to building a better future. At Our World in Data we are building a publishing platform to make research and data on the world’s largest problems accessible and understandable.
The problems the world faces are very diverse – global poverty, CO₂ emissions, child mortality, mental health, and many more. Our World in Data readers who are concerned about these problems should be able to rely on our compilation of research, our database, and our visualizations to understand them clearly, and learn how it is possible to make progress against them.
Over the past year, we’ve done a lot of work on the COVID-19 pandemic, and that is still an important focus – and it will likely remain so for the months to come. Millions of people rely on our work on the pandemic, from the public, to teachers and researchers, to policymakers and world leaders.
Global Change Data Lab, the non-profit organization that produces the web publication Our World in Data, is seeking a new Data Scientist to join its data team.
Data is at the very heart of all the work we do. Everything in our day-to-day job is tied back to data that allows us to understand the state of the world and how it is changing. As a Data Scientist, you will be responsible for contributing to the whole chain of collection, transformation, documentation and dissemination of our data on the many topics that we cover on Our World in Data. The ideal candidate is passionate about and skilled in data analysis and data engineering, interested in learning about many research topics they may not have previously worked on, and capable of understanding academic publications and datasets. It’s not just the technical skills for data analysis and management that are essential; equally important is the ability to understand what this data tells us about the world and how to communicate that to our millions of users.
We typically work with datasets that can be considered small by industry standards, and with technologies that are not necessarily at the cutting edge of data science and cloud services. But we are strong believers that machine learning and AI are not the most urgent next steps to make progress on many of the world’s largest problems. Instead, what matters most are clean, reliable and well-documented CSV files that let our research team and our users generate clear, easy-to-understand, and accurate visualizations.
Global Change Data Lab (GCDL) is a small educational charity with a focus on large global problems and international development. Our flagship project, the web publication Our World in Data, is focused on increasing the use of data and evidence to make progress against the world’s largest problems. Among those who rely on our work are international organizations, the governments of countries around the world, major businesses and NGOs, and large media and news outlets.
This is an exciting time to join us. GCDL is growing quickly and our projects have a huge positive impact. Our World in Data receives more than 5 million visitors every month, and for many relevant online search queries in global development – ‘CO2 emissions’, ‘world poverty’, ‘child mortality’, ‘population growth’ – we are one of the top search results in many parts of the world. Our work is also cited widely in the media: in 2020 our work was referenced in more than 1,500 articles at major media outlets (those with more than 10 million monthly visitors) and in many thousands of smaller ones.
We are constantly looking for ways to innovate, and we don’t hesitate to start new projects or change course if doing so is the best way to further our mission. We are a small team, but we are growing quickly, and this means every individual contribution now has the potential to make a big difference going forward.
Duties for this role include:
- Writing scripts to import, clean, and collate data from many sources and in many formats;
- Designing and implementing data pipelines to facilitate or automate regular updates of our datasets;
- Developing metadata (title, subtitles, descriptions) for our variables and charts that is understandable, perfectly accurate, and consistent across sources;
- Implementing and maintaining transparent and clear documentation of sources, including original and transformed data.
- Contributing to the development and management of our public datasets, made available through GitHub and a future data API.
- Identifying new datasets of potential interest, and assessing their relevance based on documentation, variable catalogs, and exploratory data analysis;
- Designing and implementing derived variables in or across datasets, such as per capita measures, averages, aggregates, and smoothed time series.
- Thoroughly testing newly-implemented features and variables to ensure their reliability across geographical locations and time.
- Working with our software engineers to design, pilot and test improvements of our data exploration tools.
- Understanding, prioritizing and replying to user feedback on our data and its presentation;
- Engaging with external data providers, journalists and policymakers on data availability, quality, and what it tells us across the many topics we cover;
- Constantly improving our public datasets and charts based on suggestions received via email, communication with experts, GitHub, or social media.
- Attention to detail;
- Ability and drive to work without supervision;
- Curiosity, openness to new ideas, flexibility to learn from new evidence and receive feedback;
- Being able to assess which data is accurate and insightful and which is not;
- Proactive, assertive and action-oriented;
- Ability to think systematically about problems and recognize shared behaviors and patterns to provide solutions.
- Minimum of a Bachelor’s degree, preferably in a quantitative field;
- Experience with importing, transforming, and maintaining datasets for various audiences, in a research or industry environment;
- Excellent skills in data wrangling, preferably in Python (pandas) but we also welcome applications from proficient R users (tidyverse, data.table);
- Very good knowledge of data visualization principles and good practices;
- Good understanding of our work at Our World in Data;
- Fluency in English (all our work, internally and externally, is conducted in English).
- Fundamental knowledge of bash scripts, SQL, GitHub, and web scraping libraries;
- Experience with the development and maintenance of datasets;
- Previous knowledge of or interest in global development;
- Experience with academic research and science communication.