Scala Unfold

In Scala 2.13, method unfold is added to the standard library without a big bang it probably deserves. A first glance at the method signature might make one wonder how it could possibly be useful. Admittedly, it’s not intuitive to reason about how to make use of it. Although it’s new in the Scala standard library, a couple of Akka Stream operators like Source.unfold, Source.unfoldAsync with similar method signature and functionality have already been available for a while.

While the method is available for a number of Scala collections, I’ll illustrate the using of it with the Iterator collection. One reason Iterator is chosen is that its “laziness” allows the method to be used for generating infinite sequences. Here’s the method unfold’s signature:

Method fold versus unfold

Looking at a method by the name unfold, one might begin to ponder its correlation to method fold. The contrary between fold and unfold is in some way analogous to that between apply and unapply, except that it’s a little more intuitive to “reverse” the logic from apply to unapply than from fold to unfold.

Let’s take a look at the method signature of fold:

Given a collection (in this case, an Iterator), method fold allows one to iteratively transform the elements of the collection into an aggregated element of similar type (a supertype of the elements to be precise) by means of a binary operator. For example:

Reversing method fold

In the above example, the binary operator _ + _, which is a shorthand for (acc, x) => acc + x, iteratively adds a number from a sequence of number, and fold applies the operator against the given Iterator’s content starting with an initial number 1000. It’s in essence doing this:

To interpret the “reverse” logic in a loose fashion, let’s hypothesize a problem with the following requirement:

Given the number 1055 (the “folded” sum), iteratively assemble a monotonically increasing sequence from 1 such that subtracting the cumulative sum of the sequence elements from 1055 remains larger than 1000.

Here’s one way of doing it using unfold:

How does unfold work?

Recall that Iterator’s unfold has the following method signature:

As can be seen from the signature, starting from a given “folded” initial state value, elements of a yet-to-be-generated sequence are iteratively “unfolded” by means of the function f. In each iteration, the returned tuple of type Option[(A, S)] determines a few things:

  1. the 1st tuple element of type A is the new element to be added to the resulting sequence
  2. the 2nd tuple element of type S is the next state value, revealing how the state is being iteratively mutated
  3. a returned Some((elem, state)) signals a new element being generated whereas a returned None signals the “termination” for the sequence generation operation

In the above example, the state is itself a tuple with initial state value (1, 1055) and next state value (i+1, n-i). The current state (i, n) is then iteratively transformed into an Option of tuple with:

  • the element value incrementing from i to i+1
  • the state value decrementing from n to n-i, which will be iteratively checked against the n > 1000 condition

Modified examples from Akka Stream API doc

Let’s look at a couple of examples modified from the Akka Stream API doc for Source.unfold. The modification is minor but necessary due to difference in the method signatures.

Example 1:

This is a nice “Hello World” example of unfold. Following the above bullet points of how-it-works, #1 and #2 tell us the resulting sequence has a starting element count iteratively decremented by 1 and how-it-works #3 says when count is not larger than 0 (i.e. decremented down to 0) the sequence generation operation stops.

Example 2:

This example showcases a slick way of generating a Fibonacci sequence. Here, we use a tuple as the initial state value, resulting in the operator returning a value with a nested tuple. Tuples are used for the state because each number in a Fibonacci sequence depends on two preceding numbers, Fib(n) = Fib(n-1) + Fib(n-2), hence in composing the sequence content we want to carry over more than one number in every iteration.

Applying the logic of how-it-works #1 and #2, if x and y represent the current and next elements, respectively, generated for the resulting sequence, x + y would be the value of the following element in accordance with the definition of Fibonacci numbers. In essence, the tuple state represents the next two values of elements to be generated. What about how-it-works #3? The missing of None case in the return value of the binary operator indicates that there is no terminating condition, hence we have an infinite sequence here.

Another example:

Here’s one more example which illustrates how one could generate a Factorial sequence using unfold.

In this example, we also use a tuple to represent the state, although there is a critical difference between what the tuple elements represent when compared with the previous example. By definition, the next number in a Factorial sequence only depends on the immediately preceding number, Fact(i+1) = Fact(i) * (i+1), thus the first tuple element, n * (i+1), takes care of that, defining what the next element of the resulting sequence will be. But there is also a need to carry over the next index value and that’s what the second tuple element is for. Again, without the None case in the return value of the binary operator, the resulting sequence will be infinite.

As a side note, we could also use method iterate that comes with Scala Iterator collection with similar iteration logic like below:

1 thought on “Scala Unfold

  1. jeroen dijkmeijer

    Thank you for this unfold explanation.

    Brain benching, but finally it unfolded for me. Thanks.
    If your blog would have a subscribe field I would have used.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *