Tag Archives: scala generics

Ad-hoc Polymorphism In Scala

Over the past few years, there seems to be a subtle trend of software engineers favoring typeclass patterns that implement polymorphism in an ad-hoc fashion, namely, Ad-hoc Polymorphism. To see the benefits of such kind of polymorphism, let’s first look at what F-bounded polymorphism, a subtype polymorphism, has to offer.

F-bounded polymorphism

// F-bounded polymorphism
trait Car[T <: Car[T]] { self: T =>
  def vin: String
  def model: String
  def price: Double
  def setPrice(newPrice: Double): T
}

Next, a couple of child classes are defined:

// F-bounded polymorphism continued
case class Sedan(vin: String, model: String, price: Double) extends Car[Sedan] {
  def setPrice(newPrice: Double) = copy(price = newPrice)
}

case class Sports(vin: String, model: String, price: Double) extends Car[Sports] {
  def setPrice(newPrice: Double) = copy(price = newPrice)
}

A F-bounded type has a peculiar signature of the self-recursive `A[T <: A[T]]` which mandates the given type `T` itself a sub-type of `A[T]`, like how type `Sedan` is defined (Sedan <: Car[Sedan]). Note that the self-type annotation used in the trait isn’t requirement for F-bounded type. Rather, it’s a common practice for safeguarding against undesirable mix-up of sub-classes like below:

// F-bounded polymorphism with self-type
case class Sports(vin: String, model: String, price: Double) extends Car[Sedan] {
  def setPrice(newPrice: Double) = Sedan(vin, model, newPrice)
}
// ERROR: illegal inheritance; self-type Sports does not conform to Car[Sedan]'s selftype Sedan ...

“Type argument” versus “Type member”

Rather than a `type argument`, a F-bounded type could also be expressed as a `type member` which needs to be defined in its child classes.:

// F-bounded polymorphism expressed as type member
trait Car {
  type T <: Car
  def vin: String
  def model: String
  def price: Double
  def setPrice(newPrice: Double): T
}

case class Sedan(...) extends Car {
  type T = Sedan
  ...
}

case class Sports(...) extends Car {
  type T = Sports
  ...
}

It should be noted that with the `type member` approach, self-type would not be applicable, hence mix-up of sub-classes mentioned above is possible.

Let’s define a sedan and test out method `setPrice`:

// F-bounded polymorphism example continued
val sedan1 = Sedan("1ABC*234", "Honda Accord", 20500)

sedan1.setPrice(19500)
// res1: Sedan = Sedan("1ABC*234", "Honda Accord", 19500.0)

Under the F-bounded type’s “contract”, a method such as the following would work as intended to return the specified sub-type:

Had the Car/Sedan hierarchy been set up as the less specific `T <: Car`, the corresponding method:

// Problem with simple subtype
def setSalePrice[T <: Car](car: T, discount: Double): T =
  car.setPrice(car.price * (1.0 - discount max 0.0))

would fail as it couldn’t guarantee the returning type is the exact type of the input.

F-bounded type collection

Next, let’s look at a collection of cars.

// Collection of F-bounded elements
val sedan2 = Sedan("2DEF*345", "BMW 330i", 38000)
val sports1 = Sports("5MNO*678", "Ford Mustang", 34000)

val cars = List(sedan1, sedan2, sports1)
// cars: List[Product with Serializable with Car[_ >: Sports with SUV with Sedan <: Product with Serializable with Car[_ >: Sports with SUV with Sedan <: Product with Serializable]]] = ...

The resulting type is a rather ugly sequence of gibberish. To help the compiler a little, give it some hints about `T <: Car[T]` as shown below:

// Existential type hints
import scala.language.existentials

val cars = List[T forSome { type T <: Car[T] }](sedan1, sedan2, sports1)
// cars: List[T forSome { type T <: Car[T] }] = List(...)

Ad-hoc polymorphism

Contrary to subtype polymorphism which orients around a supertype with a rigid subtype structure, let’s explore a different approach using typeclasses, known as Ad-hoc polymorphism.

// Ad-hoc polymorphism example
trait Car[T] {
  def vin(car: T): String
  def model(car: T): String
  def price(car: T): Double
  def setPrice(car: T, newPrice: Double): T
}

case class Sedan(vin: String, model: String, price: Double)
case class Sports(vin: String, model: String, price: Double)

Next, a couple of “ad-hoc” implicit objects are created to implement the trait methods.

// Ad-hoc polymorphism example continued
import scala.language.implicitConversions

implicit object SedanCar extends Car[Sedan] {
  def vin(car: Sedan) = car.vin
  def model(car: Sedan) = car.model
  def price(car: Sedan) = car.price
  def setPrice(car: Sedan, newPrice: Double): Sedan = car.copy(price = newPrice)
}

implicit object SportsCar extends Car[Sports] {
  def vin(car: Sports) = car.vin
  def model(car: Sports) = car.model
  def price(car: Sports) = car.price
  def setPrice(car: Sports, newPrice: Double): Sports = car.copy(price = newPrice)
}

Note that alternatively, the implicit objects could be set up as ordinary companion objects of the case classes with implicit anonymous classes:

// Alternative - defining implicits in companion objects
case class Sedan(vin: String, model: String, price: Double)
object Sedan {
  implicit val SedanCar = new Car[Sedan] {
    def vin(car: Sedan) = car.vin
    def model(car: Sedan) = car.model
    def price(car: Sedan) = car.price
    def setPrice(car: Sedan, newPrice: Double): Sedan = car.copy(price = newPrice)
  }
}

case class Sports(vin: String, model: String, price: Double)
object Sports {
  implicit val SportsCar = new Car[Sports] {
    def vin(car: Sports) = car.vin
    def model(car: Sports) = car.model
    def price(car: Sports) = car.price
    def setPrice(car: Sports, newPrice: Double): Sports = car.copy(price = newPrice)
  }
}

Unifying implemented methods

Finally, an implicit conversion for cars of type `T` is provided by means of an implicit class to create a “unified” method that takes the corresponding method implementations from the provided implicit `Car[T]` parameter.

// Ad-hoc polymorphism example continued
implicit class CarOps[T](car: T)(implicit ev: Car[T]) {
  def setPrice(newPrice: Double): T = ev.setPrice(car: T, newPrice: Double)
}

Testing it out:

// Ad-hoc polymorphism example continued
val sedan1 = Sedan("1ABC*234", "Honda Accord", 20500)

sedan1.setPrice(19500)
// res1: Sedan = Sedan("1ABC*234", "Honda Accord", 19500.0)

New methods, like `setSalePrice`, can be added as needed in the implicit objects:

// Ad-hoc polymorphism example continued
implicit object SedanCar extends Car[Sedan] {
  ...
  def setSalePrice(car: Sedan, discount: Double): Sedan =
    car.setPrice(car.price * (1.0 - discount max 0.0))
}

implicit object SportsCar extends Car[Sports] {
    ...
  def setSalePrice(car: Sports, discount: Double): Sports =
    car.setPrice(car.price * (1.0 - discount max 0.0))
}

Ad-hoc type collection

Next, a collection of cars:

// Collection of elements of ad-hoc type
val sedan2 = Sedan("2DEF*345", "BMW 330i", 38000)
val sports1 = Sports("5MNO*678", "Ford Mustang", 34000)

val cars = List(sedan1, sedan2, sports1)
// cars: List[Product with java.io.Serializable] = List(
//   Sedan("1ABC*234", "Honda Accord", 20500.0),
//   Sedan("2DEF*345", "BMW 330i", 38000.0),
//   Sports("5MNO*678", "Ford Mustang", 34000.0)
// )

Similar to the F-bounded collection, the inferred resulting type isn’t very helpful. Unlike in the F-bounded case, we do not have a `T <: Car[T]` contract. Using an approach illustrated in this blog post, we could assemble the collection as a list of `(car, type)` tuples:

// Existential type hints
import scala.language.existentials

val cars = List[(T, Car[T]) forSome { type T }](
  (sedan1, implicitly[Car[Sedan]]),
  (sedan2, implicitly[Car[Sedan]]), 
  (sports1, implicitly[Car[Sports]])
)
// cars: List[(T, Car[T] forSome { type T })] = List(...)

By means of a simple example, we’ve now got a sense of how Ad-hoc polymorphism works. The F-bounded example serves as a contrasting reference of how the polymorphism bound by a more “strict” contract plays out in comparison. Given the flexibility of not having to bind the base classes into a stringent subtype relationship upfront, the rising popularity of Ad-hoc polymorphism certainly has its merits.

That said, lots of class models in real-world applications still fits perfectly well into a subtype relationship. In suitable use cases, F-bounded polymorphism generally imposes less boilerplate code. In addition, Ad-hoc polymorphism typically involves using of implicits that may impact code maintainability.

Patching Numeric Sequence In Scala

Like fetching top N elements from a sequence of comparable elements, patching numeric sequence is also a common need, especially when processing data that isn’t complete or clean. By “patching”, I mean interpolating missing spots in a list of numbers. A simplistic patch or interpolation is to fill a missing number with the average of the previous few numbers.

For example, given the following list of numbers:

60, 10, 50, (), 20, 90, 40, 80, (), (), 70, 30

we would like to replace each of the missing numbers with the average of, say, its previous 3 numbers. In this case, the leftmost missing number should be replace with 40 (i.e. (60 + 10 + 50) / 3).

Below is a simple snippet that patches missing numbers in a Double-type sequence with the average of the previous N numbers. The missing (or bad) numbers in the original sequence are represented as Double.NaN.

// Patching Double-type numbers with average of previous N numbers

def patchAvgLastN(n: Int)(list: List[Double]) = {
  def patchCurrElem(i: Int)(l: List[Double], e: Double) = {
    val avgLastN =
      if (e.isNaN) l match {
        case Nil => 0.0
        case _ =>
          val lastN = l.take(i)
          lastN.sum / lastN.size
      } else e
    avgLastN :: l
  }

  list.foldLeft( List[Double]() )( patchCurrElem(n) ).reverse
}

val list: List[Double] = List(60, 10, 50, Double.NaN, 20, 90, 40, 80, Double.NaN, Double.NaN, 70, 30)

patchAvgLastN(3)(list)
// res1: List[Double] = List(60.0, 10.0, 50.0, 40.0, 20.0, 90.0, 40.0, 80.0, 70.0, 63.3, 70.0, 30.0)

As shown in the code, method ‘patchCurrElem’ is created to prepend the calculated average of the previous N numbers to the supplied list. Its signature fits well to be a function taken by ‘foldLeft’ to traverse the entire sequence for applying the patch. Since ‘patchCurrElem’ prepends the sub-sequence for optimal operations in Scala List, the final list requires a reversal.

Note that ‘lastN.size’ rather than the literal ‘N’ is used to handle cases when there is less than N prior numbers available for average calculation. And ‘case Nil’ will cover cases when there is no prior number.

Generalizing the patch method

In deriving a generic method for ‘patchAvgLastN’, We’re not going to generalize it for Scala Numeric, as ‘average’ isn’t quite meaningful for non-fractional numbers such as integers. Instead, we’ll generalize it for Scala Fractional, which provides method ‘mkNumericOps’ for access to FractionalOps that consists of division operator (i.e. ‘/’) necessary for average calculation.

Since we’re no longer handling Double-type number, ‘-999’ is used as the default value (of type Int) to replace Double.Nan as the missing (or bad) number.

// Patching sequence of generic numbers with average of previous N numbers

def patchAvgLastN[T](n: Int, badNum: Int = -999)(list: List[T])(implicit num: Fractional[T]) = {
  def patchCurrElem(i: Int)(l: List[T], e: T) = {
    import num.mkNumericOps
    val unk = num.fromInt(badNum)
    val avgLastN =
      if (e == unk) l match {
        case Nil => num.zero
        case _ =>
          val lastN = l.take(i)
          lastN.reduce(_ + _) / num.fromInt(lastN.size)
      } else e
    avgLastN :: l
  }

  list.foldLeft( List[T]() )( patchCurrElem(n) ).reverse
}

val list: List[Double] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[Double](3)(list)
// res1: List[Double] = List(60.0, 10.0, 50.0, 40.0, 20.0, 90.0, 40.0, 80.0, 70.0, 63.3, 70.0, 30.0)

val list: List[Float] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[Float](3)(list)
// res2: List[Float] = List(60.0, 10.0, 50.0, 40.0, 20.0, 90.0, 40.0, 80.0, 70.0, 63.3, 70.0, 30.0)

val list: List[BigDecimal] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[BigDecimal](3)(list)
// res3: List[BigDecimal] = List(60, 10, 50, 40, 20, 90, 40, 80, 70, 63.33, 70, 30)

Work-arounds for patching integer sequences

A quick work-around for interpolating a list of integers (type Int, Long or BigInt) would be to transform the integers to a Fractional type, apply the patch and transform back to the original type. Note that in some cases, rounding or truncation might occur in the transformations. For example:

// A work-around for patching integer sequences

val list: List[Int] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[Double](3)(list.map(_.toDouble)).map(_.toInt)
// res1: List[Int] = List(60, 10, 50, 40, 20, 90, 40, 80, 70, 63, 70, 30)

val list: List[Long] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[Double](3)(list.map(_.toDouble)).map(_.toLong)
// res2: List[Long] = List(60, 10, 50, 40, 20, 90, 40, 80, 70, 63, 70, 30)

val list: List[BigInt] = List(60, 10, 50, -999, 20, 90, 40, 80, -999, -999, 70, 30)
patchAvgLastN[BigDecimal](3)(list.map(BigDecimal(_))).map(_.toBigInt)
// res3: List[scala.math.BigInt] = List(60, 10, 50, 40, 20, 90, 40, 80, 70, 63, 70, 30)

Generic Top N Elements In Scala

Getting the top N elements from a list of elements is a common need in applications that involve data retrieval. If the list is big, it’ll be inefficient and wasteful (in terms of processing resource) to sort the entire list when one is interested in only the top few elements.

Consider a list of real numbers (i.e. Double or Float-typed) and let’s say we want to fetch the smallest N numbers from the list. A commonly used algorithm for the task is rather straight forward:

Start with the first N numbers of the list as the selected N-element sublist, then check for each of the remaining numbers of the list and if it’s smaller than the largest number in the sublist swap out in each iteration the largest number with it.

Algorithmic steps

Formulating that as programmatic steps, we have:

  1. Maintain a sorted N-element list in descending order, hence its head is the max in the list (Assuming N isn’t a big number, the cost of sorting is trivial).
  2. In each iteration, if the current element in the original list is smaller than the head element in the N-element list, replace the head element with the current element; otherwise leave the current N-element list unchanged.

Upon completing the iterations, the N-element list will consist of the smallest elements in the original list and a final sorting in ascending order will result in a sorted N-element list:

// Top N (smallest) elements from a Double-type list

def topN(n: Int, list: List[Double]): List[Double] = {
  def maxListHead(l: List[Double], e: Double): List[Double] = {
    if (e < l.head) (e :: l.tail).sortWith(_ > _) else l
  }
  list.drop(n).foldLeft( list.take(n).sortWith(_ > _) )( maxListHead ).
    sortWith(_ < _)
}

val list: List[Double] = List(3, 2, 8, 2, 9, 1, 5, 5, 9, 1, 7, 3, 4)

topN(5, list)
// res1: List[Double] = List(1.0, 1.0, 2.0, 2.0, 3.0)

Note that it would be trivial to modify the method to fetch the largest N numbers (instead of the smallest) in which case one only needs to reverse the inequality operator in ‘e < l.head’, the iterative ‘sorthWith(_ > _)’ and the final ‘sorthWith(_ < _)’.

Refactoring to eliminate sorting

Now, let’s say we’re going to use it to fetch some top N elements where N is a little bigger, like top 5,000 from a 1 million element list. Except for the inevitable final sorting of the sublist, all the other ‘sorthWith()’ operations can be replaced with something less expensive. Since all we care is to be able to conditionally swap out the largest number in the sublist, we just need the largest number to be placed at the head of the sublist and the same algorithmic flow will work fine.

The refactored topN method below replaces all ‘sortWith()’ (except for the final sorting) with ‘bigHead()’ which places the largest number of the input list at its head position:

// Top N (smallest) elements from a Double-type list (refactored)

def topN(n: Int, list: List[Double]): List[Double] = {
  def bigHead(l: List[Double]): List[Double] = list match {
    case Nil => list
    case _ =>
      l.tail.foldLeft( List(l.head) )( (acc, x) =>
        if (x >= acc.head) x :: acc else acc.head :: x :: acc.tail
    )
  }
  def maxListHead(l: List[Double], e: Double): List[Double] = {
    if (e < l.head) bigHead(e :: l.tail) else l
  }
  list.drop(n).foldLeft( bigHead(list.take(n)) )( maxListHead ).
    sortWith(_ < _)
}

Generic top N numeric elements

Next, we generalize method topN to handle any list of elements of the type implicitly associated with Numeric, with a typeclass pattern known as context bound.

// Top N (smallest) elements from a generic numeric list

def topN[T](n: Int, list: List[T])(implicit num: Numeric[T]): List[T] = {
  import num.{mkNumericOps, mkOrderingOps}
  def bigHead(l: List[T]): List[T] = list match {
    case Nil => list
    case _ =>       l.tail.foldLeft( List(l.head) )( (acc, x) =>
        if (x >= acc.head) x :: acc else acc.head :: x :: acc.tail
      )
  }
  def maxListHead(l: List[T], e: T): List[T] = {
    if (e < l.head) bigHead(e :: l.tail) else l
  }
  list.drop(n).foldLeft( bigHead(list.take(n)) )( maxListHead ).
    sortWith(_ < _)
}

val list: List[Double] = List(3, 2, 8, 2, 9, 1, 5, 5, 9, 1, 7, 3, 4)
topN[Double](5, list)
// res1: List[Double] = List(1.0, 1.0, 2.0, 2.0, 3.0)

val list: List[Int] = List(3, 2, 8, 2, 9, 1, 5, 5, 9, 1, 7, 3, 4)
topN[Int](5, list)
// res2: List[Int] = List(1, 1, 2, 2, 3)

With the context bound for type ‘T’, importing the implicit mkNumericOps and mkOrderingOps methods makes arithmetic and comparison operators available for the list elements to be compared and ordered.

Generic top N objects ordered by mapped values

To further generalize topN, rather than being limited to numeric elements we enable it to take a list of generic objects and return the top N elements ordered by the individual element’s corresponding value (e.g. an order-able class field of the object). To accomplish that, we revise topN as follows:

  • Loosen the context bound from ‘Numeric’ to the more generic Ordering so that items can be ordered by non-numeric values such as strings
  • Take as an additional parameter a mapping function that tells what values corresponding to the objects are to be ordered by
// Top N elements from a list of objects ordered by object-mapped values

case class Score[T : Numeric](id: String, rank: T)

def topN[S, T : Ordering](n: Int, list: List[S], f: S => T): List[S] = {
  val orderer = implicitly[Ordering[T]]
  import orderer._
  def bigHead(l: List[S]): List[S] = list match {
    case Nil => list
    case _ =>
      l.tail.foldLeft( List(l.head) )( (acc, x) =>
        if (f(x) >= f(acc.head)) x :: acc else acc.head :: x :: acc.tail
      )
  }
  def maxListHead(l: List[S], e: S): List[S] = {
    if (f(e) < f(l.head)) bigHead((e :: l.tail)) else l
  }
  list.drop(n).foldLeft( bigHead(list.take(n)) )( maxListHead ).
    sortWith(f(_) < f(_))
}

val scores = List(
  Score("a", 3), Score("b", 6), Score("c", 8), Score("d", 1), Score("e", 11),
  Score("f", 5), Score("g", 9), Score("h", 12), Score("i", 2), Score("j", 10)
)
topN(5, scores, (s: Score[Int]) => s.rank)
// res1: List[Score[Int]] = List(Score(d,1), Score(i,2), Score(a,3), Score(f,5), Score(b,6))

Note that the type parameter ‘T : Ordering’, which signifies a context bound, is the shorthand notation for:

// Context Bound
case class Score[T](id: String, rank: T)(implicit orderer: Ordering[T])