I’m starting a mini blog series that centers around the Blockchain topic. At the end of the series will be a simple blockchain application in Scala on an Actor-based Akka cluster. The application will in some way follow a simplified version of the Bitcoin cryptocurrency’s operational model, including its proof-of-work consensus algorithm.
Cryptocurrency and Blockchain
Some quick background info about blockchain – In 2009, Bitcoin emerged as the first decentralized cryptocurrency and took the world by storm. Besides proving to the world the possibility of running a digital currency without the need of a centralized authority, it has also fascinated people (particularly in the finance and technology industries) with its simple yet effective operational model.
Cryptocurrency has also popularized the term “blockchain” which represents its underlying data structure and has since been broadened to a computing class that covers a wide range of applications (e.g. “smart contracts”) in different domains. Even though conceptually how a cryptocurrency like Bitcoin works isn’t complicated, it does require some basic knowledge in cryptography, particularly in PKCS (public key cryptography standards).
Utility functions
First, a few utility functions:
import java.time.{Instant, LocalDateTime, ZoneId} import java.time.format.DateTimeFormatter object Util { def bytesToBase64(bytes: Array[Byte]): String = java.util.Base64.getEncoder.encodeToString(bytes) def base64ToBytes(base64: String): Array[Byte] = java.util.Base64.getDecoder.decode(base64) def longToBytes(num: Long): Array[Byte] = java.nio.ByteBuffer.allocate(8).putLong(num).array def timestampToDateTime(timestamp: Long, zone: String = "UTC"): String = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").format( LocalDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneId.of(zone)) ) } import Util._
Hashing is a critical process for integrity check in blockchain’s underlying data structure. We’ll use SHA-256, which has a function signature of Array[Byte] => Array[Byte]. We’ll also need some minimal cryptographic functions for creating public keys that serve as IDs for user accounts.
Basic cryptography
import scala.util.{Try, Success, Failure} import java.io.{OutputStreamWriter, FileOutputStream, IOException} import java.security.spec.X509EncodedKeySpec import java.security._ import org.apache.commons.codec.binary.Base64 import org.bouncycastle.jce.provider.BouncyCastleProvider import org.bouncycastle.util.io.pem.{PemObject, PemWriter} object Crypto { def sha256(byteArr: Array[Byte]): Array[Byte] = java.security.MessageDigest.getInstance("SHA-256").digest(byteArr) val hashFcn = sha256 _ def generateKeyPair(keySize: Int = 4096): KeyPair = { val generator = java.security.KeyPairGenerator.getInstance("RSA") generator.initialize(keySize) generator.genKeyPair } def writePemFile(key: Key, description: String, filename: String): Unit = { val pemObject = new PemObject(description, key.getEncoded()) val pemWriter = new PemWriter(new OutputStreamWriter(new FileOutputStream(filename))) try { pemWriter.writeObject(pemObject) } catch { case e: IOException => println("ERROR: IO Exception $e") } finally { pemWriter.close() } } def generateKeyPairPemFiles(filePrefix: String, keySize: Int = 4096): Unit = { Security.addProvider(new BouncyCastleProvider()) val generator = KeyPairGenerator.getInstance("RSA") generator.initialize(keySize) val keyPair = generator.generateKeyPair() writePemFile(keyPair.getPublic(), "RSA PUBLIC KEY", s"${filePrefix}_public.pem") writePemFile(keyPair.getPrivate(), "RSA PRIVATE KEY", s"${filePrefix}_private.pem") } def publicKeyFromPemFile(keyFile: String): Option[PublicKey] = { Try(scala.io.Source.fromFile(keyFile)) match { case Success(k) => val keyString = k.mkString. replace("-----BEGIN RSA PUBLIC KEY-----\n", ""). replace("-----END RSA PUBLIC KEY-----\n", "") val keyBytes = Base64.decodeBase64(keyString) val publicKey = KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(keyBytes)) Some(publicKey) case Failure(e) => None } } def publicKeyToBase64(publicKey: PublicKey): String = bytesToBase64(publicKey.getEncoded) } import Crypto._
To load Base64 public keys from the key files commonly in PKCS#8 PEM format on a file system, we use Bouncy Castle and Apache Commons Codec. As a side note, neither of the additional packages would be needed if the key files were in PKCS#8 DER format, which is binary and less commonly used.
Transactions in a Blockchain
With some basic utility and crypto functions in place, we now create class Account, which represents a user (e.g. a transaction originator or a miner) with the user’s cryptographic public key as the account ID. The corresponding private key, supposedly kept in private by the user, is for decrypting a transaction encrypted with the key. In our simplified model, the transactions won’t be encrypted hence private keys won’t be used.
case class Account(key: String, name: String) { override def toString: String = s"A(${key.substring(0,2)}*${key.substring(128,132)}*, $name)" } object Account { def fromKeyFile(keyFile: String, name: String): Account = publicKeyFromPemFile(keyFile) match { case Some(key) => new Account(publicKeyToBase64(key), name) case None => throw new Exception(s"ERROR: Problem loading $keyFile!") } }
Next, we create class TransactionItem that represents a single transaction.
case class TransactionItem(id: String, accountFrom: Account, accountTo: Account, amount: Long, timestamp: Long) { override def toString: String = { val datetime = timestampToDateTime(timestamp) s"TI(${id.substring(0, 4)}, ${accountFrom.name} -> ${accountTo.name}, ${amount}, ${datetime})" } } object TransactionItem { def apply(accountFrom: Account, accountTo: Account, amount: Long, timestamp: Long): TransactionItem = { val bytes = accountFrom.key.getBytes ++ accountTo.key.getBytes ++ longToBytes(amount) ++ longToBytes(timestamp) new TransactionItem(bytesToBase64(hashFcn(bytes)), accountFrom, accountTo, amount, timestamp) } }
The `id` of TransactionItem is the hash value of the concatenated class fields in bytes. Note that the `apply` factory method performs the necessary hashing of the provided arguments to assemble a TransactionItem with the hash-value ID.
Next, we define class Transactions, representing a collection of TransactionItems. The `id` of Transactions is just a random-UUID. It could’ve been defined as a collective hash value like TransactionItem’s id to ensure content integrity but we’re going to leave that to be taken care in a hash-tree data structure, the Merkle Tree.
import java.util.UUID.randomUUID case class Transactions(id: String, items: Array[TransactionItem], timestamp: Long) { override def toString: String = { val datetime = timestampToDateTime(timestamp) s"T(${id.substring(0, 4)}, ${items.map(_.amount).sum}/${items.size}, ${datetime})" } } object Transactions { def apply(transactions: Array[TransactionItem], timestamp: Long): Transactions = new Transactions(randomUUID.toString, transactions, timestamp) }
For illustration, we create a method that instantiates a Transactions object consisting of a random number of TransactionItems, each with a pair of random Accounts and a random amount (remember each Account needs a public key as its `id`).
val numOfAccounts = 5 val maxTransItems = 3 val keyPath = "/tmp/" def randomFcn = java.util.concurrent.ThreadLocalRandom.current def distinctRandomIntPair(lower: Int, upper: Int): List[Int] = { val rand1 = randomFcn.nextInt(lower, upper) val rand2 = randomFcn.nextInt(lower, upper) if (rand1 != rand2) List(rand1, rand2) else List(rand1, if (rand2 < upper - 1) rand2 + 1 else lower) } def generateTrans( numOfAccounts: Int, maxTransItems: Int, keyPath: String, keyFiles: List[String] ): Transactions = { def genTransItem: TransactionItem = { val idx = distinctRandomIntPair(0, numOfAccounts) val accountFrom = Account.fromKeyFile(s"${keyPath}${keyFiles(idx(0))}", s"User${idx(0)}") val accountTo = Account.fromKeyFile(s"${keyPath}${keyFiles(idx(1))}", s"User${idx(1)}") val amount = 1000L + randomFcn.nextInt(0, 5) * 500L val transItem = TransactionItem(accountFrom, accountTo, amount, System.currentTimeMillis) println(s"accountFrom: $accountFrom accountTo: $accountTo") println(s"TransactionItem: $transItem") transItem } val numOfTransItems = randomFcn.nextInt(1, maxTransItems + 1) val transItems = Array.tabulate(numOfTransItems)(_ => genTransItem) Transactions(transItems, System.currentTimeMillis) }
Generating a couple of Transactions objects with random content:
(0 until numOfAccounts).foreach(i => generateKeyPairPemFiles(s"${keyPath}account${i}")) val keyFiles = List.tabulate(numOfAccounts)(i => s"account${i}_public.pem") val trans1 = generateTrans(numOfAccounts, maxTransItems, keyPath, keyFiles) // accountFrom: A(MI*kOJv*, User1) accountTo: A(MI*rlVg*, User2) // TransactionItem: TI(Ih+j, User1 -> User2, 2000, 2020-03-02 21:36:23) // accountFrom: A(MI*rlVg*, User2) accountTo: A(MI*Qpgs*, User4) // TransactionItem: TI(3vjD, User2 -> User4, 2500, 2020-03-02 21:36:23) // trans1: Transactions = T(dca2, 4500/2, 2020-03-02 21:36:23) val trans2 = generateTrans(numOfAccounts, maxTransItems, keyPath, keyFiles) // accountFrom: A(MI*3SG5*, User3) accountTo: A(MI*kOJv*, User1) // TransactionItem: TI(EQgn, User3 -> User1, 1000, 2020-03-02 21:36:26) // accountFrom: A(MI*rlVg*, User2) accountTo: A(MI*kOJv*, User1) // TransactionItem: TI(+cR/, User2 -> User1, 2500, 2020-03-02 21:36:26) // accountFrom: A(MI*rlVg*, User2) accountTo: A(MI*3SG5*, User3) // TransactionItem: TI(G1Ue, User2 -> User3, 1500, 2020-03-02 21:36:26) // trans2: Transactions = T(7915, 5000/3, 2020-03-02 21:36:26)
Hashing transactions into a Merkle tree
Merkle trees (or hash trees) are commonly used in blockchain computing. The main purpose of using a hash tree is to guarantee the authenticity of the contents in the dataset by successively composing their hashes in a hard-to-tamper fashion while keeping the resulting data structure relatively lightweight.
In a previous blog post re: Merkle tree implementation in Scala, we saw how a hash tree can be created from a collection of dataset. Even though the transaction collection data structure is now slightly more complex than the trivial example in the previous post, there is no added complexity in creating the hash tree. Borrowing a slight variant of the Merkle tree class from that post:
class MerkleTree( val hash: Array[Byte], val left: Option[MerkleTree] = None, val right: Option[MerkleTree] = None ) extends Serializable { def height: Int = { def loop(node: MerkleTree): Int = { if (!node.left.isEmpty && !node.right.isEmpty) math.max(loop(node.left.get), loop(node.right.get)) + 1 else if (!node.left.isEmpty) loop(node.left.get) + 1 else if(!node.right.isEmpty) loop(node.right.get) + 1 else 1 } loop(this) } def printNodes: Unit = { def printlnByLevel(t: MerkleTree): Unit = { for (l <- 1 to t.height) { loopByLevel(t, l) println } } def loopByLevel(node: MerkleTree, level: Int): Unit = { if (level <= 1) print(s"$node ") else { if (!node.left.isEmpty) loopByLevel(node.left.get, level - 1) else () if (!node.right.isEmpty) loopByLevel(node.right.get, level - 1) else () } } printlnByLevel(this) } override def toString: String = s"MT(${bytesToBase64(hash).substring(0, 4)})" } object MerkleTree { def apply(data: Array[Array[Byte]]): MerkleTree = { @scala.annotation.tailrec def buildTree(nodes: Array[MerkleTree]): Array[MerkleTree] = nodes match { case ns if ns.size <= 1 => ns case ns => val pairedNodes = ns.grouped(2).map{ case Array(a, b) => new MerkleTree(hashFcn(a.hash ++ b.hash), Some(a), Some(b)) case Array(a) => new MerkleTree(hashFcn(a.hash), Some(a), None) }.toArray buildTree(pairedNodes) } if (data.isEmpty) new MerkleTree(hashFcn(Array.empty[Byte])) else { val nodes = data.map(byteArr => new MerkleTree(hashFcn(byteArr))) buildTree(nodes)(0) // Return root of the tree } } }
Using MerkleTree’s `apply` factory method, we simply supply transactions objects we’ve created as the method argument:
val mRoot1 = MerkleTree(trans1.items.map(_.id.getBytes)) // mRoot1: MerkleTree = MT(OLsC) mRoot1.printNodes // MT(OLsC) // MT(v7GL) MT(FbT3) val mRoot2 = MerkleTree(trans2.items.map(_.id.getBytes)) // mRoot2: MerkleTree = MT(B0yx) mRoot2.printNodes // MT(B0yx) // MT(o85z) MT(MFDs) // MT(/t9G) MT(qxR3) MT(iGYW)
For transaction collection `trans1`, Merkle root `mRoot1` is all that is needed to ensure its integrity. So is `mRoot2` for `trans2`. For a given collection of transactions, the recursive hashing of the transaction items in the tree nodes all the way to the root node makes it mathematically difficult to tamper with the transaction content. The Merkle root along with the associated transaction collection will be kept in an immutable “block”.
While the term “blockchain” has been used ad libitum throughout the post, we have not seen anything remotely resembling a “block” yet, have we? So far, we’ve only put in place some simple data structures along with a few utility/crypto functions. Nonetheless, they’re the essential elements for the building “block” of a blockchain, which we’ll dig into in the next post of this blog series.
Pingback: An Akka Actor-based Blockchain | Genuine Blog
Pingback: Blockchain Mining And Proof-of-Work | Genuine Blog
Pingback: Actor-based Blockchain In Akka Typed | Genuine Blog