Monthly Archives: March 2020

Blockchain Mining And Proof-of-Work

This is the second part of the Blockchain mini-series. Core focus of this post is to define the building “block” of a blockchain and illustrate how to “mine” a block in Scala.

First, let’s make available the key entity classes (Account, Transactions, MerkleTree, etc) and utility/crypto functions we assembled in the previous post.

The building “block” of a Blockchain

We now define the building “block” of a blockchain. In our simple model, it’s a “reversed” linked list with each node consisting of the hash value of its preceding block. Key fields in the Block class include the hash values of the current and preceding blocks, the transaction collection and the corresponding Merkle root, block creation timestamp and a couple of fields for Proof of Work (PoW) – a consensus algorithm we’ll get into shortly.

Let’s define object ProofOfWork, which includes just a couple of values relevant to the PoW process, as follows for now:

RootBlock is the primordial block (a.k.a. genesis block) and any subsequent block chained after it is a LinkedBlock which consists of an additional field blockPrev, a “pointer” to the preceding block. For simplicity in function signatures, the hash function hashFcn for the Block subclasses and other previously defined classes is not provided as constructor or class method arguments but rather a predefined function for all.

Proof of Work – a computationally demanding task

Proof of Work (PoW) is a kind of consensus algorithms, arguably popularized by Bitcoin. Due to the lack of a centralized authoritative source like the central bank for a conventional currency, a decentralized cryptocurrency network needs a consensus protocol that every participant agrees to.

The idea of PoW is to incentivize participants to work on a computationally demanding task in order to be qualified for adding a new block of unspent transactions to the existing blockchain – the distributed ledger. The incentive is a reward, typically a specific amount of the cryptocurrency, the network offers to the participants who:

  1. completed the task, and,
  2. successfully added to the blockchain a new block consisting of the task completion proof.

These participants are referred to as the miners. The copy of blockchain maintained by any competing miner with the highest PoW value (generally referred to the length of the blockchain) overrides the rest.

A commonly employed PoW scheme is to require a miner to repeatedly apply a hash function to a given string of text combined with an incrementing integer until the resulting hash value has a certain number of leading zeros. Mathematically this is a task requiring exponentially more trials for every additional zero in the requirement.

We now expand object ProofOfWork to include all the key elements for PoW.

Borrowing some terms from the Bitcoin model, defaultDifficulty is the default difficulty level of the PoW task. The level value represents the number of leading zeros required to be in the resulting hash value after repeatedly applying a hash function against the concatenation of the hash of a block and a monotonically increasing integer. The incremental integer is called nonce, and defaultNonce is the network-default value. Note that, by design, the output of hashing is so unpredictable and sensitive to input change that any given starting nonce value will not have any advantage over any other values.

Method validProof concatenates a dataset of type byte-array and a byte-converted nonce, applies hashFcn to the combined data and reports whether the resulting hash value has the number of leading zeros specified by difficulty. And method generateProof is a simple recursive snippet that takes a Base64 string and runs validProof repeatedly with an monotonically incrementing nonce until it satisfies the difficulty requirement.

Generating Proof

Through repeated trials of hashing with the incrementing nonce value, the “proof” is the first incremented nonce value that satisfies the requirement at the set difficulty level. As it could take significant time to arrive at the required hash value, it’s logical to perform PoW asynchronously.

Using the scheduleOnce Akka scheduler to complete a Promise with a TimeoutException after a certain elapsed time, we’ve created an asynchronous method with a non-blocking timeout mechanism (as opposed to Await) that returns a Future of the wanted nonce value.

Test running generatePoW

To test out generating Proof of Work, we borrow the random account/transaction creation snippet from the previous blog:

Generating a few Transactions objects with random content, each to be used in building a block:

The miner will use their own account to receive the blockchain reward upon successful block acceptance. Creating a new block requires a couple of things:

  • the reference to the last block in the existing blockchain
  • a collection of unspent transactions from a distributed queue

The miner will then save the last block reference as blockPrev in the new block, prepend to the transaction collection an additional transaction with their own account reference for the reward, and start the PoW process. Upon finishing the PoW, the incremented nonce that satisfies the requirement at the predefined difficulty level will be kept in the new block as the proof for validation by the blockchain system. Below is the method mine that creates the new block.

Let’s just say the miner owns the first of the previously created accounts:

We’ll start mining a new block to be appended to the root block (i.e. the genesis block RootBlock).

Now that the last block of the blockchain is block1, a subsequent mining attempt will make it the preceding block; and further mining will be performed like so, successively.

A couple of notes:

  1. At difficulty level 3, the 20-second timeout is generally sufficient on a computer with, say, a 3GHz 4-core CPU, although one may still get a timeout once in a while. If we elevate it to level 4, the required proof-generating time will be about 100-fold (i.e. 30+ minutes) of what its takes for level 3.
  2. The incremented nonce saved in the returned block is the exact number of hashing trials to have finally satisfied the PoW requirement at the predefined difficulty level. At difficulty level 3, the number of trials ranges from millions to tens of millions. A level up would require billions of trials.

To look at what the final blockchain (i.e. block3) is like, let’s create a simple function that converts the linked blocks to elements of a List:

As a side note, the above detailed steps are for illustration purpose. The list of transactions and blocks could’ve been created in a more concise manner:

Proof validation

Validating a proof (i.e. the incremented nonce) saved in a block is simple. Object ProofOfWork includes method validProofIn(block: Block) that takes a block and verifies whether the hash function can reproduce using the block’s hash and nonce a result with the number of leading zeros matching the difficulty level.

For example, the following verification confirms that block3 has a valid proof.

In the next blog post, we’ll port the above proof-of-concept Scala snippets to run on a scalable Akka cluster.

Transaction Hash Tree In A Blockchain

I’m starting a mini blog series that centers around the Blockchain topic. At the end of the series will be a simple blockchain application in Scala on an Actor-based Akka cluster. The application will in some way follow a simplified version of the Bitcoin cryptocurrency’s operational model, including its proof-of-work consensus algorithm.

Cryptocurrency and Blockchain

Some quick background info about blockchain – In 2009, Bitcoin emerged as the first decentralized cryptocurrency and took the world by storm. Besides proving to the world the possibility of running a digital currency without the need of a centralized authority, it has also fascinated people (particularly in the finance and technology industries) with its simple yet effective operational model.

Cryptocurrency has also popularized the term “blockchain” which represents its underlying data structure and has since been broadened to a computing class that covers a wide range of applications (e.g. “smart contracts”) in different domains. Even though conceptually how a cryptocurrency like Bitcoin works isn’t complicated, it does require some basic knowledge in cryptography, particularly in PKCS (public key cryptography standards).

Utility functions

First, a few utility functions:

Hashing is a critical process for integrity check in blockchain’s underlying data structure. We’ll use SHA-256, which has a function signature of Array[Byte] => Array[Byte]. We’ll also need some minimal cryptographic functions for creating public keys that serve as IDs for user accounts.

Basic cryptography

To load Base64 public keys from the key files commonly in PKCS#8 PEM format on a file system, we use Bouncy Castle and Apache Commons Codec. As a side note, neither of the additional packages would be needed if the key files were in PKCS#8 DER format, which is binary and less commonly used.

Transactions in a Blockchain

With some basic utility and crypto functions in place, we now create class Account, which represents a user (e.g. a transaction originator or a miner) with the user’s cryptographic public key as the account ID. The corresponding private key, supposedly kept in private by the user, is for decrypting a transaction encrypted with the key. In our simplified model, the transactions won’t be encrypted hence private keys won’t be used.

Next, we create class TransactionItem that represents a single transaction.

The id of TransactionItem is the hash value of the concatenated class fields in bytes. Note that the apply factory method performs the necessary hashing of the provided arguments to assemble a TransactionItem with the hash-value ID.

Next, we define class Transactions, representing a collection of TransactionItems. The id of Transactions is just a random-UUID. It could’ve been defined as a collective hash value like TransactionItem’s id to ensure content integrity but we’re going to leave that to be taken care in a hash-tree data structure, the Merkle Tree.

For illustration, we create a method that instantiates a Transactions object consisting of a random number of TransactionItems, each with a pair of random Accounts and a random amount (remember each Account needs a public key as its id).

Generating a couple of Transactions objects with random content:

Hashing transactions into a Merkle tree

Merkle trees (or hash trees) are commonly used in blockchain computing. The main purpose of using a hash tree is to guarantee the authenticity of the contents in the dataset by successively composing their hashes in a hard-to-tamper fashion while keeping the resulting data structure relatively lightweight.

In a previous blog post re: Merkle tree implementation in Scala, we saw how a hash tree can be created from a collection of dataset. Even though the transaction collection data structure is now slightly more complex than the trivial example in the previous post, there is no added complexity in creating the hash tree. Borrowing a slight variant of the Merkle tree class from that post:

Using MerkleTree’s apply factory method, we simply supply transactions objects we’ve created as the method argument:

For transaction collection trans1, Merkle root mRoot1 is all that is needed to ensure its integrity. So is mRoot2 for trans2. For a given collection of transactions, the recursive hashing of the transaction items in the tree nodes all the way to the root node makes it mathematically difficult to tamper with the transaction content. The Merkle root along with the associated transaction collection will be kept in an immutable “block”.

While the term “blockchain” has been used ad libitum throughout the post, we have not seen anything remotely resembling a “block” yet, have we? So far, we’ve only put in place some simple data structures along with a few utility/crypto functions. Nonetheless, they’re the essential elements for the building “block” of a blockchain, which we’ll dig into in the next post of this blog series.