Tag Archives: hash function

An Akka Actor-based Blockchain

As proposed at the beginning of this blockchain mini blog series, we’ll have an Actor-based blockchain application at the end of the series. The fully functional application is written in Scala along with the Akka toolkit.

While this is part of a blog series, this post could still be viewed as an independent one that illustrates the functional flow of a blockchain application implemented in Scala/Akka.

What is a blockchain?

Summarizing a few key characteristics of a blockchain (primarily from the angle of a cryptocurrency system):

  • At the core of a cryptocurrency system is a distributed ledger with a collection of transactions stored in individual “blocks” each of which is successively chained to another, thus the term “blockchain”.
  • There is no centralized database storing the ledger as the authoritative data source. Instead, each of the decentralized “nodes” maintains its own copy of the blockchain that gets updated in a consensual fashion.
  • At the heart of the so-called “mining” process lies a “consensus” algorithm that determines how participants can earn the mining reward as an incentive for them to collaboratively grow the blockchain.
  • One of the most popular consensus algorithms is Proof of Work (PoW), which is a computationally demanding task for the “miners” to compete for a reward (i.e. a certain amount of digital coins) offered by the system upon successfully adding a new block to the existing blockchain.
  • In a cryptocurrency system like Bitcoin, the blockchain that has the highest PoW value (generally measured by the length, or technically referred to as height) of the blockchain overrides the rest.

Beyond cryptocurrency

While blockchain is commonly associated with cryptocurrency, the term has been generalized to become a computing class (namely blockchain computing) covering a wide range of use cases, such as supply chain management, asset tokenization. For instance, Ethereum, a prominent cryptocurrency, is also a increasingly popular computing platform for building blockchain-based decentralized applications. Its codebase is primarily in Golang and C++.

Within the Ethereum ecosystem, Truffle (a development environment for decentralized applications) and Solidity (a JavaScript alike scripting language for developing “smart contracts”), among others, have prospered and attracted many programmers from different industry sectors to develop decentralized applications on the platform.

In the Scala world, there is a blockchain framework, Scorex 2.0 that allows one to build blockchain applications not limited to cryptocurrency systems. Supporting multiple kinds of consensus algorithms, it offers a versatile framework for developing custom blockchain applications. Its predecessor, Scorex, is what powers the Waves blockchain. As of this post, the framework is still largely in experimental stage though.

How Akka Actors fit into running a blockchain system

A predominant implementation of the Actor model, Akka Actors offer a comprehensive API for building scalable distributed systems such as Internet-of-Things (IoT) systems. It comes as no surprise the toolset also works great for what a blockchain application requires.

Lightweight and loosely-coupled by design, actors can serve as an efficient construct to model the behaviors of the blockchain mining activities that involve autonomous block creations in parallel using a consensus algorithm on multiple nodes. In addition, the non-blocking interactions among actors via message passing (i.e. the fire-and-forget tell or query-alike ask methods) allow individual modules to effectively interact with custom logic flow or share states with each other. The versatile interaction functionality makes actors useful for building various kinds of modules from highly interactive routines such as simulation of transaction bookkeeping to request/response queries like blockchain validation.

On distributed cluster functionality, Akka provides a suite of cluster features — cluster-wide routing, distributed data replication, cluster sharding, distributed publish/subscribe, etc. There are different approaches to maintaining the decentralized blockchains on individual cluster nodes. For multiple independent mining processes to consensually share with each others their latest blockchains in a decentralized fashion, Akka’s distributed pub/sub proves to be a superb tool.

A blockchain application that mimics a simplified cryptocurrency system

UPDATE: A new version of this application that uses Akka Typed actors (as opposed to the Akka classic actors) is available. An overview of the new application is at this blog post. Also available is a mini blog series that describes the basic how-to’s for migrating from Akka classic to Akka Typed.

It should be noted that Akka Actors has started moving towards Typed Actors since release 2.5, although both classic and typed actors are being supported in the current 2.6 release. While the Akka Typed API which enforces type-safe code is now a stable release, it’s still relatively new and the API change is rather drastic, requiring experimental effort to ensure everything does what it advertises. Partly because of that, Akka classic actors are used in this blockchain application. Nonetheless, the code should run fine on both Akka 2.5 and 2.6.

Build tool for the application is the good old sbt, with the library dependencies specified in built.sbt and all the configurative values of the Akka cluster and blockchain specifics such as the proof-of-work difficulty level, mining reward, time-out settings in application.conf:

Note that Artery TCP remoting, as opposed to the classic Netty-base remoting, is used.

With the default configuration, the application will launch an Akka cluster on a single host with two seed nodes at port 2551 and 2552 for additional nodes to join the cluster. Each user can participate the network with their cryptographic public key (for collecting mining reward) provided as an argument for the main program on one of the cluster nodes to perform simulated mining tasks.

For illustration purpose, the main program will either by default enter a periodic mining loop with configurable timeout, or run a ~1 minute quick test by adding “test” to the program’s argument list.

Functional flow of the blockchain application

Rather than stepping through the application logic in text, the following diagram illustrates the functional flow of the Akka actor-based blockchain application:

Akka Blockchain - functional flow

Below is a summary of the key roles played by the various actors in the application:

Blockchainer – A top-level actor that maintains a distributed copy of the blockchain and transaction queue for a given network participant (e.g. a miner) identified by their cryptographic public key. It collects submitted transactions in the queue and updates the blockchain according to the consensual rules via the cluster-wide distributed pub/sub. Playing a managerial role, the actor delegates mining work to actor Miner and validation of mined blocks to actor BlockInspector.

Miner – A child actor of Blockchainer responsible for processing mining tasks, carrying out computationally demanding Proof of Work using a non-blocking routine and returning the proofs back to the parent actor via the Akka ask pattern.

BlockInspector – Another child actor of Blockchainer for validating content of a given block, typically a newly mined block. The validation verifies that generated proof and goes “vertically” down to the nested data structure (transactions/transactionItems, merkleRoot, etc) within a block as well as “horizontally” across all the preceding blocks. The result is then returned to the parent actor via Akka ask.

Simulator – A top-level actor that simulates mining requests and transaction submissions sent to actor Blockchainer. It spawns periodic mining requests by successively calling Akka scheduler function scheduleOnce with randomized variants of configurable time intervals. Transaction submissions are delegated to actor TransactionFeeder.

TransactionFeeder – A child actor of actor Simulator responsible for periodically submitting transactions to actor Blockchainer via an Akka scheduler. Transactions are created with random user accounts and transaction amounts. Since accounts are represented by their cryptographic public keys, a number of PKCS#8 PEM keypair files under “{project-root}/src/main/resources/keys/” were created in advance to save initial setup time.

As for the underlying data structures including Account, Transactions, MerkleTree, Block and ProofOfWork, it’s rather trivial to sort out their inter-relationship by skimming through the relevant classes/companion objects in the source code. For details at the code level of 1) how they constitute the “backbone” of the blockchain, and 2) how Proof of Work is carried out in the mining process, please refer to the previous couple of posts of this mini series.

Complete source code of the blockchain application is available at GitHub.

Test running the blockchain application

Below is sample console output with edited annotations from an Akka cluster of two nodes, each running the blockchain application with the default configuration on its own JVM.

Note that, for illustration purpose, each block as defined in trait Block‘s toString method:

is represented in an abbreviated format as:

where proof is the first incremented nonce value in PoW that satisfies the requirement at the specified difficulty level.

As can be seen in the latest copies of blockchain maintained on the individual cluster nodes, they get updated via distributed pub/sub in accordance with the consensual rule, but still may differ from each other (typically by one or more most recently added blocks) when examined at any given point of time.

Reliability and efficiency

The application is primarily for proof of concept, hence the abundant side-effecting console logging for illustration purpose. From a reliability and efficiency perspective, it would benefit from the following enhancements to be slightly more robust:

  • Fault tolerance: Akka Persistence via journals and snapshots over Redis, Cassandra, etc, woud help recover an actor’s state in case of a system crash. In particular, the distributed blockchain copy (and maybe transactionQueue as well) maintained within actor Blockchainer could be crash-proofed with persistence. One approach would be to refactor actor Blockchainer to delegate maintenance of blockchain to a dedicated child PersistentActor.
  • Serialization: Akka’s default Java serializer is known for not being very efficient. Other serializers such as Protocol Buffers, Kryo should be considered.

Feature enhancement

Feature-wise, the following enhancements would help make the application one step closer to a real-world cryptocurrency system:

  • Data privacy: Currently the transactions stored in the blockchain isn’t encrypted, despite PKCS public keys are being used within individual transactions. The individual transaction items could be encrypted, each of which to be stored with the associated cryptographic public key/signature, requiring miners to validate the signature while allowing only those who have the private key for certain transactions to see the content.
  • Self regulation: A self-regulatory mechanism that adjusts the difficulty level of the Proof of Work in accordance with network load would help stabilize the currency. As an example, in a recent drastic plunge of the Bitcoin market value in mid March, there was reportedly a significant self-regulatory reduction in the PoW difficulty to temporarily make mining more rewarding that helped dampen the fall.
  • Currency supply: In a cryptocurrency like Bitcoin, issuance of the mining reward by the network is essentially the “minting” of the digital coins. To keep inflation rate under control as the currency supply grows, the rate of coin minting must be proportionately regulated over time. For instance, Bitcoin has a periodic “halfing” mechanism that reduces the mining reward by half for every 210,000 blocks added to the blockchain and will cease producing new coins once the total supply reaches 21 million coins.
  • Blockchain versioning: Versioning of the blockchain would make it possible for future algorithmic changes by means of a fork, akin to Bitcoin’s soft/hard forks, without having to discard the old system.
  • User Interface: The existing application focuses mainly on how to operate a blockchain network, thus supplementing it with, say, a Web-based user interface (e.g. using Play framework) would certainly make it a more complete system.

Blockchain Mining And Proof-of-Work

This is the second part of the Blockchain mini-series. Core focus of this post is to define the building “block” of a blockchain and illustrate how to “mine” a block in Scala.

First, let’s make available the key entity classes (Account, Transactions, MerkleTree, etc) and utility/crypto functions we assembled in the previous post.

The building “block” of a Blockchain

We now define the building “block” of a blockchain. In our simple model, it’s a “reversed” linked list with each node consisting of the hash value of its preceding block. Key fields in the Block class include the hash values of the current and preceding blocks, the transaction collection and the corresponding Merkle root, block creation timestamp and a couple of fields for Proof of Work (PoW) – a consensus algorithm we’ll get into shortly.

Let’s define object ProofOfWork, which includes just a couple of values relevant to the PoW process, as follows for now:

RootBlock is the primordial block (a.k.a. genesis block) and any subsequent block chained after it is a LinkedBlock which consists of an additional field blockPrev, a “pointer” to the preceding block. For simplicity in function signatures, the hash function hashFcn for the Block subclasses and other previously defined classes is not provided as constructor or class method arguments but rather a predefined function for all.

Proof of Work – a computationally demanding task

Proof of Work (PoW) is a kind of consensus algorithms, arguably popularized by Bitcoin. Due to the lack of a centralized authoritative source like the central bank for a conventional currency, a decentralized cryptocurrency network needs a consensus protocol that every participant agrees to.

The idea of PoW is to incentivize participants to work on a computationally demanding task in order to be qualified for adding a new block of unspent transactions to the existing blockchain – the distributed ledger. The incentive is a reward, typically a specific amount of the cryptocurrency, the network offers to the participants who:

  1. completed the task, and,
  2. successfully added to the blockchain a new block consisting of the task completion proof.

These participants are referred to as the miners. The copy of blockchain maintained by any competing miner with the highest PoW value (generally referred to the length of the blockchain) overrides the rest.

A commonly employed PoW scheme is to require a miner to repeatedly apply a hash function to a given string of text combined with an incrementing integer until the resulting hash value has a certain number of leading zeros. Mathematically this is a task requiring exponentially more trials for every additional zero in the requirement.

We now expand object ProofOfWork to include all the key elements for PoW.

Borrowing some terms from the Bitcoin model, defaultDifficulty is the default difficulty level of the PoW task. The level value represents the number of leading zeros required to be in the resulting hash value after repeatedly applying a hash function against the concatenation of the hash of a block and a monotonically increasing integer. The incremental integer is called nonce, and defaultNonce is the network-default value. Note that, by design, the output of hashing is so unpredictable and sensitive to input change that any given starting nonce value will not have any advantage over any other values.

Method validProof concatenates a dataset of type byte-array and a byte-converted nonce, applies hashFcn to the combined data and reports whether the resulting hash value has the number of leading zeros specified by difficulty. And method generateProof is a simple recursive snippet that takes a Base64 string and runs validProof repeatedly with an monotonically incrementing nonce until it satisfies the difficulty requirement.

Generating Proof

Through repeated trials of hashing with the incrementing nonce value, the “proof” is the first incremented nonce value that satisfies the requirement at the set difficulty level. As it could take significant time to arrive at the required hash value, it’s logical to perform PoW asynchronously.

Using the scheduleOnce Akka scheduler to complete a Promise with a TimeoutException after a certain elapsed time, we’ve created an asynchronous method with a non-blocking timeout mechanism (as opposed to Await) that returns a Future of the wanted nonce value.

Test running generatePoW

To test out generating Proof of Work, we borrow the random account/transaction creation snippet from the previous blog:

Generating a few Transactions objects with random content, each to be used in building a block:

The miner will use their own account to receive the blockchain reward upon successful block acceptance. Creating a new block requires a couple of things:

  • the reference to the last block in the existing blockchain
  • a collection of unspent transactions from a distributed queue

The miner will then save the last block reference as blockPrev in the new block, prepend to the transaction collection an additional transaction with their own account reference for the reward, and start the PoW process. Upon finishing the PoW, the incremented nonce that satisfies the requirement at the predefined difficulty level will be kept in the new block as the proof for validation by the blockchain system. Below is the method mine that creates the new block.

Let’s just say the miner owns the first of the previously created accounts:

We’ll start mining a new block to be appended to the root block (i.e. the genesis block RootBlock).

Now that the last block of the blockchain is block1, a subsequent mining attempt will make it the preceding block; and further mining will be performed like so, successively.

A couple of notes:

  1. At difficulty level 3, the 20-second timeout is generally sufficient on a computer with, say, a 3GHz 4-core CPU, although one may still get a timeout once in a while. If we elevate it to level 4, the required proof-generating time will be about 100-fold (i.e. 30+ minutes) of what its takes for level 3.
  2. The incremented nonce saved in the returned block is the exact number of hashing trials to have finally satisfied the PoW requirement at the predefined difficulty level. At difficulty level 3, the number of trials ranges from millions to tens of millions. A level up would require billions of trials.

To look at what the final blockchain (i.e. block3) is like, let’s create a simple function that converts the linked blocks to elements of a List:

As a side note, the above detailed steps are for illustration purpose. The list of transactions and blocks could’ve been created in a more concise manner:

Proof validation

Validating a proof (i.e. the incremented nonce) saved in a block is simple. Object ProofOfWork includes method validProofIn(block: Block) that takes a block and verifies whether the hash function can reproduce using the block’s hash and nonce a result with the number of leading zeros matching the difficulty level.

For example, the following verification confirms that block3 has a valid proof.

In the next blog post, we’ll port the above proof-of-concept Scala snippets to run on a scalable Akka cluster.

Transaction Hash Tree In A Blockchain

I’m starting a mini blog series that centers around the Blockchain topic. At the end of the series will be a simple blockchain application in Scala on an Actor-based Akka cluster. The application will in some way follow a simplified version of the Bitcoin cryptocurrency’s operational model, including its proof-of-work consensus algorithm.

Cryptocurrency and Blockchain

Some quick background info about blockchain – In 2009, Bitcoin emerged as the first decentralized cryptocurrency and took the world by storm. Besides proving to the world the possibility of running a digital currency without the need of a centralized authority, it has also fascinated people (particularly in the finance and technology industries) with its simple yet effective operational model.

Cryptocurrency has also popularized the term “blockchain” which represents its underlying data structure and has since been broadened to a computing class that covers a wide range of applications (e.g. “smart contracts”) in different domains. Even though conceptually how a cryptocurrency like Bitcoin works isn’t complicated, it does require some basic knowledge in cryptography, particularly in PKCS (public key cryptography standards).

Utility functions

First, a few utility functions:

Hashing is a critical process for integrity check in blockchain’s underlying data structure. We’ll use SHA-256, which has a function signature of Array[Byte] => Array[Byte]. We’ll also need some minimal cryptographic functions for creating public keys that serve as IDs for user accounts.

Basic cryptography

To load Base64 public keys from the key files commonly in PKCS#8 PEM format on a file system, we use Bouncy Castle and Apache Commons Codec. As a side note, neither of the additional packages would be needed if the key files were in PKCS#8 DER format, which is binary and less commonly used.

Transactions in a Blockchain

With some basic utility and crypto functions in place, we now create class Account, which represents a user (e.g. a transaction originator or a miner) with the user’s cryptographic public key as the account ID. The corresponding private key, supposedly kept in private by the user, is for decrypting a transaction encrypted with the key. In our simplified model, the transactions won’t be encrypted hence private keys won’t be used.

Next, we create class TransactionItem that represents a single transaction.

The id of TransactionItem is the hash value of the concatenated class fields in bytes. Note that the apply factory method performs the necessary hashing of the provided arguments to assemble a TransactionItem with the hash-value ID.

Next, we define class Transactions, representing a collection of TransactionItems. The id of Transactions is just a random-UUID. It could’ve been defined as a collective hash value like TransactionItem’s id to ensure content integrity but we’re going to leave that to be taken care in a hash-tree data structure, the Merkle Tree.

For illustration, we create a method that instantiates a Transactions object consisting of a random number of TransactionItems, each with a pair of random Accounts and a random amount (remember each Account needs a public key as its id).

Generating a couple of Transactions objects with random content:

Hashing transactions into a Merkle tree

Merkle trees (or hash trees) are commonly used in blockchain computing. The main purpose of using a hash tree is to guarantee the authenticity of the contents in the dataset by successively composing their hashes in a hard-to-tamper fashion while keeping the resulting data structure relatively lightweight.

In a previous blog post re: Merkle tree implementation in Scala, we saw how a hash tree can be created from a collection of dataset. Even though the transaction collection data structure is now slightly more complex than the trivial example in the previous post, there is no added complexity in creating the hash tree. Borrowing a slight variant of the Merkle tree class from that post:

Using MerkleTree’s apply factory method, we simply supply transactions objects we’ve created as the method argument:

For transaction collection trans1, Merkle root mRoot1 is all that is needed to ensure its integrity. So is mRoot2 for trans2. For a given collection of transactions, the recursive hashing of the transaction items in the tree nodes all the way to the root node makes it mathematically difficult to tamper with the transaction content. The Merkle root along with the associated transaction collection will be kept in an immutable “block”.

While the term “blockchain” has been used ad libitum throughout the post, we have not seen anything remotely resembling a “block” yet, have we? So far, we’ve only put in place some simple data structures along with a few utility/crypto functions. Nonetheless, they’re the essential elements for the building “block” of a blockchain, which we’ll dig into in the next post of this blog series.