Second development efforts

This section describes the work done by the Yale Openlab team from September 2019 to February 2020.

The Data (Consensus) Challenge on blockchains

Building an accounting system on a blockchain for external assets is exactly as difficult as it sounds. One of our first tasks was to populate our database with data, but the datasets we had were completely incompatible with one another. We want to develop a nested accounting structure in which everybody can verify datasets and algorithms. To meet this challenge, we started building a PostgreSQL database and a blockchain state to manage it.

Public blockchains are meant to be decentralized, and the way they establish consensus around information is defined in the consensus engine that block producers must run. One of the stranger concepts to deal with here is the integrity of the datasets. Before we can provide climate data, that data has to be onboarded onto the platform, which means the identity of the transaction signer should be known and the sources referenced. This paradigm allows the whole community to verify identity and supports a point-of-view-based approach to data confidence.
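
One way to picture point-of-view-based confidence is as a score that depends on which signers a given viewer trusts. The sketch below is only an illustration under stated assumptions: the function `confidenceFor`, the record shape (a creator key `by` plus a list `veri` of verifier entries), and the scoring rule itself (the share of signers the viewer trusts) are hypothetical and not part of the platform.

```javascript
'use strict'

// Hypothetical scoring rule: the share of a record's signers that the viewer trusts.
function confidenceFor(record, trustedKeys) {
  // The creator and every verifier count as signers of the record.
  const signers = [record.by, ...record.veri.map(([key]) => key)]
  const trustedCount = signers.filter((key) => trustedKeys.has(key)).length
  return trustedCount / signers.length
}

const record = { by: 'key-a', veri: [['key-b', 0], ['key-c', 0], ['key-d', 0]] }

// Two viewers with different trust sets see different confidence in the same record.
console.log(confidenceFor(record, new Set(['key-a', 'key-b']))) // 0.5
console.log(confidenceFor(record, new Set(['key-z']))) // 0
```

The same record yields different scores for different viewers, which is the sense in which confidence depends on a point of view.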

Traditionally, you might publish your work in a journal. That process allows a social graph to verify your identity, your academic and professional relationships, the methods you used, and finally the data you produced, which helps others understand your work. In this way, others can reproduce your work and provide consensus through reproducible results.

The first hurdle here is an identity graph. Our approach will be to let anybody register a keypair with any name and email. Once a keypair is registered, it will be able to perform a few actions, such as requesting endorsements. In the future we will restrict other actions based on how well endorsed a new account is. A problem with this relatively open model is Sybil attacks, in which malicious actors register several keypairs that endorse each other to build a competing identity graph. One solution we may use to reduce this risk is to anchor our trust around .edu email addresses and have some accounts verify email access. In the near future we would like to integrate Hyperledger Indy to leverage its robust digital identity framework.
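
The endorsement idea can be sketched as a small in-memory graph. This is a minimal illustration rather than the project's implementation: the class `IdentityGraph` and its methods are hypothetical names, and the rule that only accounts with a verified .edu address act as trust anchors is an assumption drawn from the paragraph above.

```javascript
'use strict'

// Hypothetical in-memory identity graph; all names here are illustrative.
class IdentityGraph {
  constructor() {
    this.accounts = new Map() // publicKey -> { name, email, emailVerified }
    this.endorsements = new Map() // publicKey -> Set of endorsed publicKeys
  }

  register(publicKey, name, email) {
    if (this.accounts.has(publicKey)) {
      throw new Error('Keypair already registered')
    }
    this.accounts.set(publicKey, { name, email, emailVerified: false })
    this.endorsements.set(publicKey, new Set())
  }

  verifyEmail(publicKey) {
    this.accounts.get(publicKey).emailVerified = true
  }

  endorse(fromKey, toKey) {
    if (!this.accounts.has(fromKey) || !this.accounts.has(toKey)) {
      throw new Error('Both accounts must be registered')
    }
    if (fromKey === toKey) {
      throw new Error('Accounts cannot endorse themselves')
    }
    this.endorsements.get(fromKey).add(toKey)
  }

  // Anchors are accounts with a verified .edu email address.
  isAnchor(publicKey) {
    const account = this.accounts.get(publicKey)
    return account.emailVerified && account.email.toLowerCase().endsWith('.edu')
  }

  // Breadth-first walk from the anchors: an account is trusted only if a chain
  // of endorsements connects it back to an anchor, so a cluster of keypairs
  // that only endorse each other is never reached.
  trustedAccounts() {
    const trusted = new Set()
    const queue = []
    for (const key of this.accounts.keys()) {
      if (this.isAnchor(key)) {
        trusted.add(key)
        queue.push(key)
      }
    }
    while (queue.length > 0) {
      const current = queue.shift()
      for (const endorsed of this.endorsements.get(current)) {
        if (!trusted.has(endorsed)) {
          trusted.add(endorsed)
          queue.push(endorsed)
        }
      }
    }
    return trusted
  }
}

const graph = new IdentityGraph()
graph.register('key-a', 'Alice', 'alice@example.edu')
graph.register('key-b', 'Bob', 'bob@example.org')
graph.register('key-s1', 'Sybil One', 's1@example.com')
graph.register('key-s2', 'Sybil Two', 's2@example.com')
graph.verifyEmail('key-a')
graph.endorse('key-a', 'key-b')
graph.endorse('key-s1', 'key-s2') // the two Sybil accounts only endorse each other
graph.endorse('key-s2', 'key-s1')

const trusted = graph.trustedAccounts()
console.log([...trusted]) // [ 'key-a', 'key-b' ]
```

Anchoring reduces the risk but does not remove it: a trusted account that endorses Sybil keypairs would still pull them into the trusted set.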

Pilot work on Hyperledger Sawtooth

We have deployed Hyperledger Sawtooth infrastructure on AWS. One of the advantages of Sawtooth is its integrated Seth (Sawtooth Ethereum virtual machine) transaction processor family. It lets us port many already-written smart contracts to our execution pipeline, which is free in gas-token terms. For the OpenClimate collabathon, we encourage you to run your own instance and also to interact with ours.

The other issue with data consensus is that there are several differing accounting methods. The anchors of our data accountability pipeline will be roll-ups into ISO 3166 numeric codes. We hope to encourage a voluntary, distributed approach to developing consensus around data by allowing our collaborators to submit records for schemas (file types), algorithms (executables for reproducibility), dApps (executables for interactions and data visualization), and of course datasets formatted according to those schemas.
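
A roll-up into ISO 3166 numeric codes might look like the following sketch. The record shape (`iso3166`, `year`, `mmtco2e`), the function name `rollUpByCountry`, and the sample values are all assumptions made for illustration; they are not taken from real datasets.

```javascript
'use strict'

// Hypothetical record shape: { iso3166: '840', year: 2018, mmtco2e: 10.5 }
// Sums reported emissions per country code and year.
function rollUpByCountry(records) {
  const totals = new Map()
  for (const record of records) {
    // ISO 3166-1 numeric codes are three digits, kept as strings to preserve leading zeros.
    if (!/^\d{3}$/.test(record.iso3166)) {
      throw new Error(`Not an ISO 3166-1 numeric code: ${record.iso3166}`)
    }
    const key = `${record.iso3166}:${record.year}`
    totals.set(key, (totals.get(key) || 0) + record.mmtco2e)
  }
  return totals
}

// Made-up sample values; 840 is the United States and 124 is Canada.
const totals = rollUpByCountry([
  { iso3166: '840', year: 2018, mmtco2e: 10.5 },
  { iso3166: '840', year: 2018, mmtco2e: 4.5 },
  { iso3166: '124', year: 2018, mmtco2e: 2 }
])
console.log(totals.get('840:2018')) // 15
console.log(totals.get('124:2018')) // 2
```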

Most of these records will be simple structures showing who made each record attribution and who has verified the same data. The namespace for records in the Sawtooth platform can be used to prevent duplicate records, so that identical results simply receive an additional signature. Eventually we'll have datasets generated from different algorithms, each with confidence scores, differing points of view from the identity graph, and cradle-to-grave accountability. This consensus is required to build higher abstractions such as financial instruments.
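
To illustrate how addressing can prevent duplicates, the sketch below derives a state address from a record's contents, so two submissions of the same data land on the same address. It assumes the usual Sawtooth layout of a 70-character hex address whose first 6 characters identify the namespace. The helper `contentAddress`, the chosen fields, and the sample values are hypothetical.

```javascript
'use strict'
const crypto = require('crypto')

const sha512 = (x) => crypto.createHash('sha512').update(x).digest('hex')

const FAMILY = 'emre'
const NAMESPACE = sha512(FAMILY).substring(0, 6) // 6-character namespace prefix

// Hypothetical helper: build the address from the fields that define a record,
// so identical data always maps to the same address.
function contentAddress(record) {
  const key = [record.country, record.year, record.mmtco2e, record.gasType].join('|')
  return NAMESPACE + sha512(key).substring(0, 64)
}

const first = contentAddress({ country: '840', year: 2018, mmtco2e: 15, gasType: 'co2' })
const same = contentAddress({ country: '840', year: 2018, mmtco2e: 15, gasType: 'co2' })
const other = contentAddress({ country: '840', year: 2018, mmtco2e: 16, gasType: 'co2' })

console.log(first === same) // true
console.log(first === other) // false
console.log(first.length) // 70
```

A second submitter of identical data would find an existing record at that address and could add a signature to it instead of creating a duplicate.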

One of our weak areas currently is integrating the transaction processors and configuring Sawtooth for our purposes. If you are familiar with these technologies, please contact the datathon prompt host, Steven Ettinger, via Discord.

Our First Transaction Processor (Smart Contract)

Have a look at the Hyperledger Sawtooth source documents.

First, we need a way to construct the payload that will trigger our process. Payloads are built by the client and signed locally. Once a transaction is signed, our processor can validate it to determine whether state should change.
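
As a rough sketch of the client side, the payload can be thought of as a comma-separated string of up to four fields (name, action, description, address). The helpers `encodePayload` and `splitPayload` below are hypothetical. Signing and batching are left out because they depend on the Sawtooth SDK. The split uses a quote-aware regular expression so that a comma inside a double-quoted field does not start a new field.

```javascript
'use strict'

// Hypothetical client-side helper: join the defined fields into one
// comma-separated payload and return it as bytes.
function encodePayload(name, action, desc, addr) {
  const fields = [name, action, desc, addr].filter((field) => field !== undefined)
  return Buffer.from(fields.join(','))
}

// Split on commas that are followed by an even number of double quotes,
// which means commas that sit outside any quoted field.
function splitPayload(bytes) {
  return bytes.toString().split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/)
}

const bytes = encodePayload('rec1', 'create', '"gas,sector"', 'addr1')
const fields = splitPayload(bytes)
console.log(fields) // [ 'rec1', 'create', '"gas,sector"', 'addr1' ]

// A two-field payload, such as a verification request, round-trips the same way.
console.log(splitPayload(encodePayload('rec1', 'veri'))) // [ 'rec1', 'veri' ]
```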

er_payload.js
'use strict'

const { InvalidTransaction } = require('sawtooth-sdk/processor/exceptions')

class EmRePayload {
  constructor(name, action, desc, addr) {
    this.name = name
    this.action = action
    this.desc = desc
    this.addr = addr
  }

  // Parse a comma-separated payload of two, three, or four fields.
  static fromBytes(payload) {
    // Split on commas outside double quotes; this allows nesting stringified objects
    payload = payload.toString().split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/gm)
    if (payload.length === 4) {
      // name, action, desc, addr
      let emrePayload = new EmRePayload(payload[0], payload[1], payload[2], payload[3])
      if (!emrePayload.name) {
        throw new InvalidTransaction('Name is required')
      }
      if (emrePayload.name.indexOf('|') !== -1) {
        throw new InvalidTransaction('Name cannot contain "|"')
      }
      if (!emrePayload.action) {
        throw new InvalidTransaction('Action is required')
      }
      return emrePayload
    } else if (payload.length === 2) {
      // name, action
      let emrePayload = new EmRePayload(payload[0], payload[1])
      if (!emrePayload.name) {
        throw new InvalidTransaction('Name is required')
      }
      if (emrePayload.name.indexOf('|') !== -1) {
        throw new InvalidTransaction('Name cannot contain "|"')
      }
      // Reject the payload unless the action is 'veri' or 'fix'
      if (emrePayload.action !== 'veri' && emrePayload.action !== 'fix') {
        throw new InvalidTransaction('Invalid action or data')
      }
      return emrePayload
    } else if (payload.length === 3) {
      // name, action, desc
      let emrePayload = new EmRePayload(payload[0], payload[1], payload[2])
      if (!emrePayload.name) {
        throw new InvalidTransaction('Name is required')
      }
      if (emrePayload.name.indexOf('|') !== -1) {
        throw new InvalidTransaction('Name cannot contain "|"')
      }
      if (emrePayload.action !== 'fix') {
        throw new InvalidTransaction('Invalid action or data')
      }
      return emrePayload
    } else {
      throw new InvalidTransaction('Invalid payload serialization')
    }
  }
}

module.exports = EmRePayload

Now we need a way to store the information in Sawtooth state.

er_state.js
'use strict'

const stringify = require('json-stable-stringify')
const crypto = require('crypto')

class EmReState {
  constructor(context) {
    this.context = context
    this.addressCache = new Map([])
    this.timeout = 500 // Timeout in milliseconds
  }

  getdRec(name) {
    return this._loaddRecs(name).then((dRecs) => dRecs.get(name))
  }

  setdRec(name, dRec) {
    // By choosing our namespace with different portions of data we can ensure no record duplication
    let address = _makeEmReAddress(name)
    return this._loaddRecs(name).then((dRecs) => {
      dRecs.set(name, dRec)
      return dRecs
    }).then((dRecs) => {
      let data = _serialize(dRecs)
      this.addressCache.set(address, data)
      let entries = {
        [address]: data
      }
      return this.context.setState(entries, this.timeout)
    })
  }

  deletedRec(name) {
    let address = _makeEmReAddress(name)
    return this._loaddRecs(name).then((dRecs) => {
      dRecs.delete(name)
      if (dRecs.size === 0) {
        // Nothing left at this address, so remove it from state
        this.addressCache.set(address, null)
        return this.context.deleteState([address], this.timeout)
      } else {
        let data = _serialize(dRecs)
        this.addressCache.set(address, data)
        let entries = {
          [address]: data
        }
        return this.context.setState(entries, this.timeout)
      }
    })
  }

  // Load every record stored at the address derived from name, using the cache when possible
  _loaddRecs(name) {
    let address = _makeEmReAddress(name)
    if (this.addressCache.has(address)) {
      if (this.addressCache.get(address) === null) {
        return Promise.resolve(new Map([]))
      } else {
        // The cache may hold a Buffer (after a write) or a string (after a read)
        return Promise.resolve(_deserialize(this.addressCache.get(address).toString()))
      }
    } else {
      return this.context.getState([address], this.timeout)
        .then((addressValues) => {
          if (!addressValues[address].toString()) {
            this.addressCache.set(address, null)
            return new Map([])
          } else {
            let data = addressValues[address].toString()
            this.addressCache.set(address, data)
            return _deserialize(data)
          }
        })
    }
  }
}

const _hash = (x) =>
  crypto.createHash('sha512').update(x).digest('hex').toLowerCase().substring(0, 64)

const EMRE_FAMILY = 'emre'
const EMRE_NAMESPACE = _hash(EMRE_FAMILY).substring(0, 6)

// 6-character namespace prefix plus 64 characters of the name hash
const _makeEmReAddress = (x) => EMRE_NAMESPACE + _hash(x)

module.exports = {
  EMRE_NAMESPACE,
  EMRE_FAMILY,
  EmReState
}

// Records are separated by '|'; fields within a record by commas outside double quotes
const _deserialize = (data) => {
  let dRecsIterable = data.split('|').map(x => x.split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/gm))
    .map(x => [x[0], { name: x[0], by: x[1], addr: x[2], veri: JSON.parse(x[3]), data: JSON.parse(x[4]) }])
  return new Map(dRecsIterable)
}

const _serialize = (dRecs) => {
  let dRecStrs = []
  for (let namedRec of dRecs) {
    let name = namedRec[0]
    let dRec = namedRec[1]
    dRecStrs.push([name, dRec.by, dRec.addr, stringify(dRec.veri), stringify(dRec.data)].join(','))
  }
  return Buffer.from(dRecStrs.join('|'))
}

The actual logic to handle transactions fits in the handler file. It generally chooses its functions by the action in the payload.

er_handler.js
'use strict'

const EmRePayload = require('./er_payload')
const { EMRE_NAMESPACE, EMRE_FAMILY, EmReState } = require('./er_state')
const { TransactionHandler } = require('sawtooth-sdk/processor/handler')
const { InvalidTransaction } = require('sawtooth-sdk/processor/exceptions')

// Reports whether the description contains at least one character from the
// listed set; it does not reject descriptions that also contain other characters.
const isAllowed = (str) => /[A-Za-z,.\-_^@$%*()=+\/]/.test(str)

class EMREHandler extends TransactionHandler {
  constructor() {
    super(EMRE_FAMILY, ['1.0'], [EMRE_NAMESPACE])
  }

  apply(transactionProcessRequest, context) {
    let payload = EmRePayload.fromBytes(transactionProcessRequest.payload)
    let emreState = new EmReState(context)
    let header = transactionProcessRequest.header
    let by = header.signerPublicKey // public key of the transaction signer

    if (payload.action === 'create') {
      return emreState.getdRec(payload.name)
        .then((dRec) => {
          if (dRec !== undefined) {
            throw new InvalidTransaction('Invalid Action: dRec already exists.')
          }
          if (!isAllowed(payload.desc)) {
            throw new InvalidTransaction('Invalid Description: Only "Aa-Zz-_,.^@$%+/=" allowed in description.')
          }
          // Assumption: desc carries a JSON-stringified object, as the payload
          // parser's note about nested stringified objects suggests.
          let desc
          try {
            desc = JSON.parse(payload.desc)
          } catch (err) {
            throw new InvalidTransaction('Invalid Description: expected a stringified object.')
          }
          let createddRec = { // addressing records by their contents can ensure everybody signs the same records
            name: desc.country + desc.last_reported_year + desc.last_reported_mmtco2e + desc.gas_type,
            addr: payload.addr,
            by: by,
            veri: [],
            data: {
              t: desc.gas_type,
              s: desc.sector,
              y: desc.last_reported_year,
              r: desc.last_reported_mmtco2e,
              c: desc.country,
              d: desc.data_source,
              p: desc.data_reporter_public_key,
              i: desc.ipfs_identity
            }
          }
          return emreState.setdRec(payload.name, createddRec)
        })
    } else if (payload.action === 'veri') {
      return emreState.getdRec(payload.name)
        .then((dRec) => {
          if (dRec === undefined) {
            throw new InvalidTransaction(
              'Invalid Action: Verify requires an existing dRec.'
            )
          }
          // Has this signer already verified the record?
          let veri = false
          for (let i = 0; i < dRec.veri.length; i++) {
            if (by === dRec.veri[i][0]) {
              veri = true
            }
          }
          // Creators cannot verify their own record, and each signer is listed once
          if (by !== dRec.by && !veri) {
            dRec.veri.push([by, 0])
          }
          return emreState.setdRec(payload.name, dRec)
        })
    } else if (payload.action === 'fix') {
      return emreState.getdRec(payload.name)
        .then((dRec) => {
          if (dRec === undefined) {
            throw new InvalidTransaction(
              'Invalid Action: Fix requires an existing dRec.'
            )
          }
          let veri = false
          for (let i = 0; i < dRec.veri.length; i++) {
            if (by === dRec.veri[i][0]) {
              veri = true
            }
          }
          // Record the signer together with the proposed correction
          if (by !== dRec.by && !veri) {
            dRec.veri.push([by, payload.desc])
          }
          return emreState.setdRec(payload.name, dRec)
        })
    } else if (payload.action === 'delete') {
      return emreState.getdRec(payload.name)
        .then((dRec) => {
          if (dRec === undefined) {
            throw new InvalidTransaction(
              `No dRec exists with name ${payload.name}: unable to delete`)
          } else if (by === dRec.by) {
            // The creator removes the whole record
            return emreState.deletedRec(payload.name)
          } else {
            // A verifier may only withdraw their own entry
            const arr = dRec.veri
            let index = -1
            for (let i = 0; i < arr.length; i++) {
              if (arr[i][0] === by) {
                index = i
                break
              }
            }
            if (index > -1) {
              dRec.veri.splice(index, 1)
              return emreState.setdRec(payload.name, dRec)
            } else {
              throw new InvalidTransaction(
                `You don't have permission to do that.`)
            }
          }
        })
    } else {
      throw new InvalidTransaction(
        `Action must be create, veri, fix, or delete, not ${payload.action}`
      )
    }
  }
}

module.exports = EMREHandler
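
The verification rule in the handler can be isolated as a pure function, which makes it easier to reason about and to test without a validator. This is a sketch under assumptions: `addVerifier` is a hypothetical helper, not part of the handler, and it returns a new record rather than mutating the existing one.

```javascript
'use strict'

// Hypothetical pure helper mirroring the 'veri' rule: the signer is appended
// to veri unless the signer created the record or has already verified it.
function addVerifier(dRec, signer, note = 0) {
  const alreadyVerified = dRec.veri.some(([key]) => key === signer)
  if (signer === dRec.by || alreadyVerified) {
    return dRec
  }
  return { ...dRec, veri: [...dRec.veri, [signer, note]] }
}

let record = { name: 'rec1', by: 'creator-key', veri: [] }
record = addVerifier(record, 'creator-key') // ignored: creators cannot verify their own record
record = addVerifier(record, 'verifier-key')
record = addVerifier(record, 'verifier-key') // ignored: duplicate verification
console.log(record.veri) // [ [ 'verifier-key', 0 ] ]
```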

Finally, a package.json and an index file are built to handle connections from Sawtooth:

package.json
{
  "name": "drec_javascript",
  "version": "1.0.0",
  "description": "An implementation of the drec transaction family using the sawtooth JS sdk",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "start": "node index.js"
  },
  "author": "",
  "license": "Apache-2.0",
  "dependencies": {
    "cbor": "^3.0.0",
    "json-stable-stringify": "^1.0.1",
    "sawtooth-sdk": "file:./../../javascript"
  }
}

index.js
'use strict'

const { TransactionProcessor } = require('sawtooth-sdk/processor')
const EMREHandler = require('./er_handler')

// The validator address is expected as the first command-line argument
if (process.argv.length < 3) {
  console.log('missing a validator address')
  process.exit(1)
}

const address = process.argv[2]
const transactionProcessor = new TransactionProcessor(address)
transactionProcessor.addHandler(new EMREHandler())
transactionProcessor.start()

This should form the entirety of the transaction handler. As you may be able to see, this first approach won't quite meet all of our requirements for data consensus.