Climate Action Tracking & Data Sharing

This prompt addresses the key challenge of providing clear accountability for non-state actors as they work toward meeting their climate pledges.

PROMPT WS 1: Climate action data sharing is fraught with difficulties, particularly when it entails non-state actors such as cities and companies. Building on our solution from the 2019 Collabathon, develop a homomorphic encryption scheme that allows actors to encrypt their granular data while letting third-party organizations run sophisticated checks on the encrypted data without ever gaining access to the true, decrypted data. This solution should be incorporated in a strategic pitch deck that presents a value proposition for cities, companies, or other stakeholder organizations with confidential climate data to make that data available for integration.
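
To make the idea of running checks on encrypted data concrete, here is a minimal sketch of an additively homomorphic scheme (textbook Paillier) in Python. It is an illustration only, not the solution from the 2019 Collabathon: the function names, the fixed toy primes, and the emissions figures are all assumptions made for the example, and the key size is far too small for real use. A third party can sum the ciphertexts without the private key; only the key holder can decrypt the total.

```python
import math
import secrets

# Toy key material (assumption for illustration): two known Mersenne primes.
# Real deployments need randomly generated primes of 1024+ bits each.
P, Q = 2**31 - 1, 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a non-negative integer m < n under public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c with the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiply ciphertexts, which adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


# Hypothetical usage: three actors report emissions (tCO2e) in encrypted form.
n, priv = keygen()
reports = [encrypt(n, m) for m in (1200, 850, 430)]
total = reports[0]
for c in reports[1:]:
    total = add_encrypted(n, total, c)
print(decrypt(n, priv, total))  # 2480
```

Paillier supports only addition of plaintexts (and multiplication by a known constant), so checks beyond sums and weighted totals would need a different or more general scheme.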

Prompt Host: Data-Driven Lab (contact: Willie Khoo)


The Data-Driven Lab (DDL) uses cutting-edge data analytics to understand the role of subnational and non-state actors in addressing the climate challenge. In compiling and accounting for global climate action pledges and progress by non-state actors, DDL faces several issues with regard to both data processing and data sharing. These issues can be divided into two main data categories:

1) Data that DDL has collected for our analysis and global aggregation reports

  • Costs associated with climate emissions data: Global climate disclosure initiatives like CDP spend significant resources collecting, processing, cleaning and verifying data. They currently charge for access to their full dataset, which is ingrained in their business model. They do provide some datasets for free on their open data portal, particularly on cities, states and regions.

  • Data sharing/reposting permissions: The data that DDL has received directly from data providers (e.g., Global Covenant of Mayors, Carbonn, etc.) cannot be shared as-is. The data providers would likely object to resharing, so we would need to obtain their permission once it is clearly defined what we are doing with the data.

  • Data updating mechanisms: Many of these organizations do not use APIs to store or display data on websites. In practice, our process of getting the data from providers usually involves: 1) them sending us a spreadsheet, often including data that is not publicly available on the website; 2) us scraping data from their websites and recompiling it using our R package and other cleaning procedures. This cumbersome process is currently done manually, and its cost is covered by research funds rather than shared by the global climate action tracking community that ultimately benefits from it.

  • Reconciling diverse datasets: There are often errors and inconsistencies in the data that we have to spend copious amounts of time manually checking and verifying, including:

    • Erroneous baseline or inventory emissions data

    • Inconsistent reduction targets (e.g., an actor has updated their targets but these updates have not yet been reported to their network or data provider)

    • Incorrect demographic information

    • Inconsistent targets: at present we have only cleaned/modeled economy-wide emission reduction targets, but in reality actors set a wide range of other targets covering many sectors, and they often report these targets without the accompanying information needed to quantify their impact (e.g., quantifying a renewable electricity generation target requires the current energy mix breakdown, which is very rarely reported alongside the target)

    • Inconsistent units, particularly for intensity-based targets
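
Some of the checks listed above could be partly automated. The following is a minimal sketch in Python, not DDL's actual cleaning procedure (which the text says relies on an R package and manual review). The record field names, the accepted unit labels, and the validity rules (for example, treating a reduction target above 100 percent as an error) are all assumptions made for illustration.

```python
# Conversion factors to tonnes of CO2-equivalent (metric prefixes only).
TONNES_PER_UNIT = {
    "kgCO2e": 0.001,
    "tCO2e": 1.0,
    "ktCO2e": 1_000.0,
    "MtCO2e": 1_000_000.0,
}


def to_tonnes(value, unit):
    """Convert an absolute emissions value to tCO2e, rejecting unknown units."""
    try:
        return value * TONNES_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None


def check_record(rec):
    """Return a list of human-readable issues found in one reported record."""
    issues = []
    try:
        baseline = to_tonnes(rec["baseline_emissions"], rec["unit"])
    except ValueError as err:
        issues.append(str(err))
    else:
        if baseline <= 0:
            issues.append("baseline emissions must be positive")
    if not 0 < rec["target_reduction_pct"] <= 100:
        issues.append("reduction target outside (0, 100] percent")
    if rec["target_year"] <= rec["baseline_year"]:
        issues.append("target year not after baseline year")
    return issues


# Hypothetical record with an unrecognized unit and a reversed year pair.
example = {
    "baseline_emissions": 12.5,
    "unit": "kilotons",
    "target_reduction_pct": 40,
    "baseline_year": 2030,
    "target_year": 2015,
}
for issue in check_record(example):
    print(issue)
```

Checks like these only flag candidate problems; intensity-based targets would additionally need the denominator (e.g., per capita or per unit of output) to be reported and normalized before values can be compared.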

For further context, here is an explanation of what CDP does to clean/make accessible their investor dataset:

2) Data that actors themselves (e.g., businesses, cities, etc.) possess and could report

  • High Cost: It is very time-consuming and expensive to develop an emissions inventory

  • Reporting Fatigue: Actors do not want to have to report to yet another platform because of the associated costs and labor. Even having to report on an annual cycle is a cumbersome affair

  • Privacy concerns: Some emissions data may be considered sensitive and could reveal proprietary secrets

  • Data Inequity: Some actors, particularly in the global south, lack capacity to develop their own inventories

  • Security concerns: The data files might contain viruses or other security threats that could compromise the entire ecosystem.

Additional Resources:

Introduction video from Nov 16 2019 Singapore node

Password: singapore