In Scala 2.13, method `unfold` is added to the standard library without a big bang it probably deserves. A first glance at the method signature might make one wonder how it could possibly be useful. Admittedly, it’s not intuitive to reason about how to make use of it. Although it’s new in the Scala standard library, a couple of Akka Stream operators like `Source.unfold`, `Source.unfoldAsync` with similar method signature and functionality have already been available for a while.
While the method is available for a number of Scala collections, I’ll illustrate the using of it with the `Iterator` collection. One reason `Iterator` is chosen is that its “laziness” allows the method to be used for generating infinite sequences. Here’s the method unfold’s signature:
def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): Iterator[A]
Method fold versus unfold
Looking at a method by the name `unfold`, one might begin to ponder its correlation to method fold. The contrary between `fold` and `unfold` is in some way analogous to that between `apply` and unapply, except that it’s a little more intuitive to “reverse” the logic from `apply` to `unapply` than from `fold` to `unfold`.
Let’s take a look at the method signature of `fold`:
def fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1
Given a collection (in this case, an Iterator), method `fold` allows one to iteratively transform the elements of the collection into an aggregated element of similar type (a supertype of the elements to be precise) by means of a binary operator. For example:
Iterator(1 to 10: _*).fold(1000)(_ + _) // res1: Int = 1055
Reversing method fold
In the above example, the binary operator `_ + _`, which is a shorthand for `(acc, x) => acc + x`, iteratively adds a number from a sequence of number, and `fold` applies the operator against the given Iterator’s content starting with an initial number 1000. It’s in essence doing this:
1000 + 1 + 2 + ... + 10
To interpret the “reverse” logic in a loose fashion, let’s hypothesize a problem with the following requirement:
Given the number 1055 (the “folded” sum), iteratively assemble a monotonically increasing sequence from 1 such that subtracting the cumulative sum of the sequence elements from 1055 remains larger than 1000.
Here’s one way of doing it using `unfold`:
val iter = Iterator.unfold((1, 1055)){ case (i, n) if n > 1000 => Some((i, (i+1, n-i))) case _ => None } iter.toList // res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
How does unfold work?
Recall that Iterator’s `unfold` has the following method signature:
def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): Iterator[A]
As can be seen from the signature, starting from a given “folded” initial state value, elements of a yet-to-be-generated sequence are iteratively “unfolded” by means of the function `f`. In each iteration, the returned tuple of type Option[(A, S)] determines a few things:
- the 1st tuple element of type `A` is the new element to be added to the resulting sequence
- the 2nd tuple element of type `S` is the next `state` value, revealing how the state is being iteratively mutated
- a returned `Some((elem, state))` signals a new element being generated whereas a returned `None` signals the “termination” for the sequence generation operation
In the above example, the `state` is itself a tuple with initial state value `(1, 1055)` and next state value `(i+1, n-i)`. The current state `(i, n)` is then iteratively transformed into an `Option` of tuple with:
- the element value incrementing from `i` to `i+1`
- the state value decrementing from `n` to `n-i`, which will be iteratively checked against the `n > 1000` condition
Modified examples from Akka Stream API doc
Let’s look at a couple of examples modified from the Akka Stream API doc for Source.unfold. The modification is minor but necessary due to difference in the method signatures.
Example 1:
def countDown(n: Int): Iterator[Int] = Iterator.unfold(n) { count => if (count > 0) Some((count, count - 1)) else None } countDown(10).toList // res1: List[Int] = List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
This is a nice “Hello World” example of `unfold`. Following the above bullet points of how-it-works, #1 and #2 tell us the resulting sequence has a starting element `count` iteratively decremented by 1 and how-it-works #3 says when `count` is not larger than 0 (i.e. decremented down to 0) the sequence generation operation stops.
Example 2:
val fibonacci: Iterator[BigInt] = Iterator.unfold((BigInt(0), BigInt(1))) { case (x, y) => Some((x, (y, x + y))) } fibonacci.take(10).toList // res1: List[BigInt] = List(0, 1, 1, 2, 3, 5, 8, 13, 21, 34) fibonacci.drop(90).next // res2: BigInt = 2880067194370816120
This example showcases a slick way of generating a Fibonacci sequence. Here, we use a tuple as the initial `state` value, resulting in the operator returning a value with a nested tuple. Tuples are used for the `state` because each number in a Fibonacci sequence depends on two preceding numbers, `Fib(n) = Fib(n-1) + Fib(n-2)`, hence in composing the sequence content we want to carry over more than one number in every iteration.
Applying the logic of how-it-works #1 and #2, if `x` and `y` represent the current and next elements, respectively, generated for the resulting sequence, `x + y` would be the value of the following element in accordance with the definition of Fibonacci numbers. In essence, the tuple `state` represents the next two values of elements to be generated. What about how-it-works #3? The missing of `None` case in the return value of the binary operator indicates that there is no terminating condition, hence we have an infinite sequence here.
Another example:
Here’s one more example which illustrates how one could generate a Factorial sequence using `unfold`.
val factorial: Iterator[BigInt] = Iterator.unfold((BigInt(1), 0)){ case (n, i) => Some((n, (n*(i+1), i+1))) } factorial.take(10).toList // res1: List[BigInt] = List(1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880)
In this example, we also use a tuple to represent the `state`, although there is a critical difference between what the tuple elements represent when compared with the previous example. By definition, the next number in a Factorial sequence only depends on the immediately preceding number, `Fact(i+1) = Fact(i) * (i+1)`, thus the first tuple element, `n * (i+1)`, takes care of that, defining what the next element of the resulting sequence will be. But there is also a need to carry over the next index value and that’s what the second tuple element is for. Again, without the `None` case in the return value of the binary operator, the resulting sequence will be infinite.
As a side note, we could also use method iterate that comes with Scala `Iterator` collection with similar iteration logic like below:
def factorial(n: Int): Iterator[BigInt] = Iterator.iterate((BigInt(1), 0)){ case (n, i) => (n*(i+1), i+1) }. take(n).map(_._1) factorial(10).toList // res2: List[BigInt] = List(1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880)
Thank you for this unfold explanation.
Brain benching, but finally it unfolded for me. Thanks.
If your blog would have a subscribe field I would have used.
cool, still relevant