Domain Models - Late Evaluation buys you better Composition
<br/>
<br/>
In the <a href="http://debasishg.blogspot.in/2017/06/domain-models-early-abstractions-and.html">last post</a> we talked about early abstractions that allow you to design generic interfaces which can be polymorphic in the type parameter. As long as you don't abuse the type system of a permissive language like Scala and you adhere to the principles of parametricity, this approach helps you implement abstractions that are reusable under various contexts. We saw this when we implemented the generic contract of the <code>mapReduce</code> function and its various specializations by supplying different concrete instances of the <code>Monoid</code> algebra.
<br/>
<br/>
In this post we will take a look at the other end of the spectrum in designing functional domain models. We will discuss evaluation semantics of model behaviors - the precise problem of when to commit to specific concrete evaluation semantics. Consider the following definition of a domain service module ..
<pre class="brush: scala">
type ErrorOr[A] = Either[String, A]
trait PaymentService {
  def paymentCycle: ErrorOr[PaymentCycle]
  def qualifyingAccounts(paymentCycle: PaymentCycle): ErrorOr[Vector[Account]]
  def payments(accounts: Vector[Account]): ErrorOr[Vector[Payment]]
  def adjustTax(payments: Vector[Payment]): ErrorOr[Vector[Payment]]
  def postToLedger(payments: Vector[Payment]): ErrorOr[Unit]
}</pre>
Such definitions are quite common these days. We have a nice monadic definition that can be composed to implement larger behaviors out of smaller ones ..
<pre class="brush: scala">
def processPayments() = for {
  p <- paymentCycle
  a <- qualifyingAccounts(p)
  m <- payments(a)
  t <- adjustTax(m)
  _ <- postToLedger(t)
} yield ()
</pre>
Can we improve upon this design ?
<br/>
<br/>
<h2>Committing to the concrete early - the pitfalls ..</h2>
One of the defining aspects of reusable abstractions is the ability to run them under different contexts. This is one lesson that we learnt in the last post as well - make the abstractions depend on the <i>least</i> powerful algebra. In this example our service functions return <code>Either</code>, which is a monad. But it's not necessarily the least powerful algebra in the context. Users may choose some other monad, or maybe even an applicative, to thread through the context of building larger behaviors. Why not keep the algebra unspecified at the service definition level and leave the specializations to the implementations, or even to the usages at the end of the world? Here's what we can do ..
<pre class="brush: scala">
// algebra
trait PaymentService[M[_]] {
  def paymentCycle: M[PaymentCycle]
  def qualifyingAccounts(paymentCycle: PaymentCycle): M[Vector[Account]]
  def payments(accounts: Vector[Account]): M[Vector[Payment]]
  def adjustTax(payments: Vector[Payment]): M[Vector[Payment]]
  def postToLedger(payments: Vector[Payment]): M[Unit]
}</pre>
A top level service definition that keeps the algebra unspecified. Now if we want to implement a larger behavior with monadic composition, we can do this ..
<pre class="brush: scala">
// weaving the monad
// requires the monadic syntax for M (e.g. cats.implicits._) in scope
def processPayments()(implicit me: Monad[M]) = for {
  p <- paymentCycle
  a <- qualifyingAccounts(p)
  m <- payments(a)
  t <- adjustTax(m)
  _ <- postToLedger(t)
} yield p
</pre>
Note that we are using only the monadic bind in composing the larger behavior - hence the least powerful algebra that we can use is that of a <code>Monad</code>. And we express this exact constraint by requiring an instance of <code>Monad</code> for the type constructor <code>M</code>.
<br/>
<br/>
<h2>What about Implementation ?</h2>
Well, we could avoid the commitment to a concrete algebra in the definition of the service. What about the implementation? One of the core issues with the implementation is how you need to handle errors. This is an issue which often makes you commit to an implementation when you write the interpreter / implementation of a service contract. You may use <code>Failure</code> for a <code>Try</code> based implementation, or <code>Left</code> for an <code>Either</code> based implementation etc. Can we abstract over this behavior through a generic error handling strategy? Some libraries like <a href="http://typelevel.org/cats/">cats</a> offer abstractions like <code>MonadError</code> that help you implement error reporting functionality using generic monadic APIs. Here's how we can do this ..
<pre class="brush: scala">
class PaymentServiceInterpreter[M[_]](implicit me: MonadError[M, Throwable])
  extends PaymentService[M] {

  //..

  def payments(accounts: Vector[Account]): M[Vector[Payment]] =
    if (accounts.isEmpty)
      me.raiseError(new IllegalArgumentException("Empty account list"))
    else //..

  //..
}
</pre>
Note we needed a monad with error handling capabilities and we used <code>MonadError</code> for that. We have kept the error type in <code>MonadError</code> as <code>Throwable</code>, which may seem a bit unethical in the context of pure functional programming. But it's also true that many libraries (especially Java ones) and underlying abstractions like <code>Future</code> or <code>Try</code> play well with exceptions. Anyway, that's a digression and has nothing to do with the current topic of discussion. The main point is that you need to supply a <code>MonadError</code> for which you have an instance.
<br/>
<br/>
Here's how cats defines the trait <code>MonadError</code> ..
<pre class="brush: scala">
trait MonadError[F[_], E] extends ApplicativeError[F, E] with Monad[F] { //..
</pre>
.. and that's exactly what we will commit to. We are still dealing with a generic <code>Monad</code> even in the implementation, without committing to any concrete instance.
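For instance (a quick sketch of mine, not shown in the original post), the <code>Either</code> based formulation that we started with can be recovered as just another specialization - with <code>Throwable</code> as the error type - since cats ships a <code>MonadError</code> instance for <code>Either[Throwable, ?]</code> out of the box ..
<pre class="brush: scala">
import cats._
import cats.implicits._

type ErrorOr[A] = Either[Throwable, A]

// the generic interpreter specialized to Either - the required
// MonadError[ErrorOr, Throwable] instance comes from cats.implicits._
val eitherBasedInterpreter: PaymentService[ErrorOr] =
  new PaymentServiceInterpreter[ErrorOr]
</pre>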
<br/>
<br/>
<h2>End of the World!</h2>
The basic reason we wanted to delay committing to the concrete instance was to allow the users the flexibility to choose their own implementations. This is what we call the principle of delayed evaluation. Abstract early, evaluate late and decouple the concerns of building and evaluating the abstractions. We have already seen two of these principles - we will see that our design so far accommodates the third one as well, at least for some instances of <code>M</code>.
<br/>
<br/>
The user of our API has the flexibility to choose the monad as long as she supplies the <code>MonadError[M, Throwable]</code> instance. And we have many to choose from. Here's an example of the above service implementation in use that composes with another service in a monadic way and chooses the exact concrete instance of the <code>Monad</code> at the end of the world ..
<pre class="brush: scala">
import cats._
import cats.data._
import cats.implicits._
// monix task based computation
object MonixTaskModule {
  import monix.eval.Task
  import monix.cats._

  val paymentInterpreter = new PaymentServiceInterpreter[Task]
  val emailInterpreter = new EmailServiceInterpreter[Task]

  for {
    p <- paymentInterpreter.processPayments
    e <- emailInterpreter.sendEmail(p)
  } yield e
}

// future based computation
object FutureModule {
  import scala.concurrent.Future
  import scala.concurrent.ExecutionContext.Implicits.global

  val paymentInterpreter = new PaymentServiceInterpreter[Future]
  val emailInterpreter = new EmailServiceInterpreter[Future]

  for {
    p <- paymentInterpreter.processPayments
    e <- emailInterpreter.sendEmail(p)
  } yield e
}

// Try based computation
object TryModule {
  import scala.util.Try

  val paymentInterpreter = new PaymentServiceInterpreter[Try]
  val emailInterpreter = new EmailServiceInterpreter[Try]

  for {
    p <- paymentInterpreter.processPayments
    e <- emailInterpreter.sendEmail(p)
  } yield e
}
</pre>
<a href="http://monix.io">Monix</a> <code>Task</code> is an abstraction that decouples the building of the abstraction from execution. So the <code>Task</code> that we get from building the composed behavior as in the above example can be executed in a deferred way depending of the requirements of the application. It can also be composed with other Tasks to build larger ones as well.
<br/>
<br/>
<h2>Vertical Composition - stacking abstractions</h2>
When you have not committed to an implementation early enough, all you have is an unspecified algebra. You can do fun stuff like stacking abstractions vertically. Suppose we want to implement auditability in some of our service methods. Here we consider a simple strategy of logging as a means to audit the behaviors. How can we take an existing implementation and plug in the audit function selectively? The answer is we compose algebras .. here's an example that stacks the <code>Writer</code> monad (as the <code>WriterT</code> transformer) on top of an already existing algebra to make the <code>payments</code> function auditable ..
<pre class="brush: scala">
final class AuditablePaymentService[M[_]: Applicative](paymentService: PaymentService[M])
  extends PaymentService[WriterT[M, Vector[String], ?]] {

  def paymentCycle: WriterT[M, Vector[String], PaymentCycle] =
    WriterT.lift(paymentService.paymentCycle)

  def qualifyingAccounts(paymentCycle: PaymentCycle): WriterT[M, Vector[String], Vector[Account]] =
    WriterT.lift(paymentService.qualifyingAccounts(paymentCycle))

  def payments(accounts: Vector[Account]): WriterT[M, Vector[String], Vector[Payment]] =
    WriterT.putT(paymentService.payments(accounts))(accounts.map(_.no))

  //..
}

val auditablePaymentInterpreter = new AuditablePaymentService[Future](
  new PaymentServiceInterpreter[Future]
)
</pre>
We took the decision to abstract the return type in the form of a type constructor early on, but committed to the specific type only during the actual usage of the service implementation. Early abstraction and late commitment to implementation make great composition possible, often in ways that may pleasantly surprise you later ..
<br/>
<br/>
Domain Models - Early Abstractions and Polymorphic Domain Behaviors
<br/>
<br/>
<div dir="ltr" style="text-align: left;" trbidi="on">Let's talk genericity or generic abstractions. In the <a href="http://debasishg.blogspot.in/2017/06/domain-models-algebraic-laws-and-unit.html">last post</a> we talked about an abstraction <code>Money</code>, which, BTW, was not generic. But we expressed some of the operations on <code>Money</code> in terms of a <code>Monoid[Money]</code>, where <code>Monoid</code> is a generic algebraic structure. By algebraic we mean that a <code>Monoid</code> <br />
<br />
<ol><li>is generic in types</li>
<li>offers operations that are completely generic on the types</li>
<li>has operations that honor the algebraic laws of left and right identity and associativity</li>
</ol><br />
But when we design a domain model, what does this really buy us ? We already saw in the earlier post how law abiding abstractions save you from writing some unit tests just through generic verification of the laws using property based testing. That's just a couple of lines in any of the available libraries out there. <br />
<br />
Besides reducing the burden of your unit tests, what does <code>Monoid[Money]</code> buy us in the bigger scheme of things? Let's look at a simple operation that we defined in <code>Money</code> .. <br />
<br />
Just to recapitulate, here's the definition of <code>Money</code><br />
<br />
<pre class="brush: scala">class Money (val items: Map[Currency, BigDecimal]) { //..
}
object Money {
final val zeroMoney = new Money(Map.empty[Currency, BigDecimal])
def apply(amount: BigDecimal, ccy: Currency) = new Money(Map(ccy -> amount))
// concrete naive implementation: don't
def add(m: Money, n: Money) = new Money(
(m.items.toList ++ n.items.toList)
.groupBy(_._1)
.map { case (k, v) =>
(k, v.map(_._2).sum)
}
)
//..
}</pre><br />
<code>add</code> is a naive implementation though it's possibly the most frequent one that you will ever encounter in domain models around you. It picks up the <code>Map</code> elements and then adds the ones with the same key to come up with the new <code>Money</code>.<br />
<br />
Why is this a naive implementation ?<br />
<br />
First of all it deconstructs the implementation of <code>Money</code>, instead of using the algebraic properties that the implementation may have. Here we implement <code>Money</code> in terms of a <code>Map</code>, which itself forms a <code>Monoid</code> under the operations defined by <code>Monoid[Map[K, V]]</code>. So why don't we use the monoidal algebra of a <code>Map</code> to implement the operations of <code>Money</code>?<br />
<br />
<pre class="brush: scala">object Money {
//..
def add(m: Money, n: Money) = new Money(m.items |+| n.items)
//..
}</pre><br />
<code>|+|</code> is the combine operator from cats' semigroup syntax, and it merges the 2 Maps in a monoidal manner. The concrete piece of code that you wrote in the naive implementation is now delegated to the implementation of the monoid algebra for a <code>Map</code> in a completely generic way. The advantage is that this implementation needs to be written only once (and possibly someone else has already done it for you) and can be reused everywhere you use a <code>Map</code>. <b>Reusability of polymorphic code is not via documentation but by actual code reuse.</b> <br />
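To see what <code>|+|</code> does for Maps, here's a quick sketch (my own example, assuming <code>cats.implicits._</code> in scope) - values for common keys are combined using the value's Monoid, the rest are just carried over ..
<pre class="brush: scala">
import cats.implicits._

val wallet1 = Map("USD" -> BigDecimal(100), "AUD" -> BigDecimal(50))
val wallet2 = Map("USD" -> BigDecimal(25),  "JPY" -> BigDecimal(1000))

wallet1 |+| wallet2
// Map(USD -> 125, AUD -> 50, JPY -> 1000)
</pre>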
<br />
On to some more reusability of generic patterns ..<br />
<br />
Consider the following abstraction that builds on top of <code>Money</code> ..<br />
<br />
<pre class="brush: scala">import java.time.OffsetDateTime
import Money._
import cats._
import cats.data._
import cats.implicits._
object Payments {
  case class Account(no: String, name: String, openDate: OffsetDateTime,
    closeDate: Option[OffsetDateTime] = None)

  case class Payment(account: Account, amount: Money, dateOfPayment: OffsetDateTime)

  // returns the Money for credit payment, zeroMoney otherwise
  def creditsOnly(p: Payment): Money = if (p.amount.isDebit) zeroMoney else p.amount

  // compute valuation of all credit payments
  def valuation(payments: List[Payment]) = payments.foldLeft(zeroMoney) { (a, e) =>
    add(a, creditsOnly(e))
  }
  //..
}</pre><br />
<br />
<code>valuation</code> gives a standard implementation folding over the <code>List</code> that it gets. Now let's try to critique the implementation ..<br />
<br />
1. The function does a <code>foldLeft</code> on the passed in collection <code>payments</code>. The collection only needs to have the ability to be folded over, and <code>List</code> can do much more than that. We violate the <i>principle of using the least powerful abstraction</i> as part of the implementation. The function that implements the fold over the collection only needs to take a <code>Foldable</code> - that prevents misuse on the part of a user feeling like a child in a toy store with something more grandiose than what she needs.<br />
<br />
2. The implementation uses the <code>add</code> function of <code>Money</code>, which is nothing but a concrete wrapper over a monoidal operation. If we can replace this with something more generic then it will be a step forward towards a generic implementation of the whole function.<br />
<br />
3. If we squint a bit, we can shed some more light on the generic nature of all the components of this 2 line implementation. <code>zeroMoney</code> is the <code>zero</code> of a <code>Monoid</code>, <code>fold</code> is a generic operation of a <code>Foldable</code>, <code>add</code> is a wrapper over a monoidal operation and <code>creditsOnly</code> is a mapping operation over every payment that the collection hands you. In summary, the implementation folds over a <code>Foldable</code>, mapping each element using a function and using the monoidal operation to collapse the <code>fold</code>.<br />
<br />
Well, it's actually a concrete implementation of a generic map-reduce function ..<br />
<br />
<pre class="brush: scala">def mapReduce[F[_], A, B](as: F[A])(f: A => B)
(implicit fd: Foldable[F], m: Monoid[B]): B =
fd.foldLeft(as, m.empty)((b, a) => m.combine(b, f(a)))</pre><br />
In fact the <code>Foldable</code> trait contains this implementation in the name of <code>foldMap</code>, which makes our implementation of <code>mapReduce</code> even simpler ..<br />
<br />
<pre class="brush: scala">def mapReduce1[F[_], A, B](as: F[A])(f: A => B)
(implicit fd: Foldable[F], m: Monoid[B]): B = fd.foldMap(as)(f)</pre><br />
And <code>List</code> is a <code>Foldable</code> and our implementation of valuation becomes as generic as ..<br />
<br />
<pre class="brush: scala">object Payments {
//..
// generic implementation
def valuation(payments: List[Payment]): Money = {
implicit val m: Monoid[Money] = Money.MoneyAddMonoid
mapReduce(payments)(creditsOnly)
}
}</pre><br />
The implementation is generic and the type system will ensure that the <code>Money</code> that we produce can <i>only</i> come from the list of payments that we pass. In the naive implementation there's always a chance that the user subverts the type system and plays foul by plugging in some additional <code>Money</code> into the output. If you look at the type signature of <code>mapReduce</code>, you will see that the only way we can get a <code>B</code> is by invoking the function <code>f</code> on an element of <code>F[A]</code>. Since the function is generic on types we cannot ever produce a <code>B</code> otherwise. Parametricity FTW.<br />
<br />
<code>mapReduce</code> is completely generic on types - there's no specific implementation that asks it to add the payments passed to it. This abstraction over operations is provided by the <code>Monoid[B]</code>. And the abstraction over the form of collection is provided by <code>Foldable[F]</code>. It's now no surprise that we can pass in any concrete operation or structure that honors the contracts of <code>mapReduce</code>. Here's another example from the same model ..<br />
<br />
<pre class="brush: scala">object Payments {
//..
// generic implementation
def maxPayment(payments: List[Payment]): Money = {
implicit val m: Monoid[Money] = Money.MoneyOrderMonoid
mapReduce(payments)(creditsOnly)
}
}</pre><br />
We want to compute the maximum credit payment amount from a collection of payments. A different domain behavior needs to be modeled but we can think of it as belonging to the same form as <code>valuation</code> and implemented using the same structure as <code>mapReduce</code>, only passing a different instance of <code>Monoid[Money]</code>. No additional client code, no fiddling around with concrete data types, just matching the type contracts of a polymorphic function. <br />
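The post doesn't show <code>MoneyOrderMonoid</code>, but one possible sketch (my own, comparing by valuation in the base currency and assuming non-negative valuations so that <code>zeroMoney</code> is a valid identity) could look like this ..
<pre class="brush: scala">
import cats.Monoid

// a "max by base currency valuation" monoid for Money - a sketch, not the
// actual definition from the author's code base
val MoneyOrderMonoid: Monoid[Money] = new Monoid[Money] {
  def empty: Money = zeroMoney
  def combine(m: Money, n: Money): Money =
    if (m.toBaseCurrency >= n.toBaseCurrency) m else n
}
</pre>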
<br />
Looks like our investment in an early abstraction of <code>mapReduce</code> has started to pay off. The domain model remains clean, with much of the domain logic being implemented in terms of the algebra that the likes of Foldables and Monoids offer. I discussed some of these topics at length in my book <a href="https://www.manning.com/books/functional-and-reactive-domain-modeling">Functional and Reactive Domain Modeling</a>. In the next instalment we will explore some more complex algebra as part of domain modeling .. <br />
<br />
</div>
<br />
<br />
Domain models, Algebraic laws and Unit tests
<br />
<br />
In a domain model, when you have a domain element that forms an algebraic abstraction honoring certain laws, you can get rid of many of your explicitly written unit tests just by checking the laws. Of course you have to squint hard and discover the lawful abstraction that hides behind your concrete domain element.<br />
<br />
Consider this simple abstraction for <code>Money</code> that keeps track of amounts in various currencies. <br />
<br />
<pre class="brush: scala">scala> import Money._
import Money._
// 1000 USD
scala> val m = Money(1000, USD)
m: laws.Money = (USD,1000)
// add 248 AUD
scala> val n = add(m, Money(248, AUD))
n: laws.Money = (AUD,248),(USD,1000)
// add 230 USD more
scala> val p = add(n, Money(230, USD))
p: laws.Money = (AUD,248),(USD,1230)
// value of the money in base currency (USD)
scala> p.toBaseCurrency
res1: BigDecimal = 1418.48
// debit amount
scala> val q = Money(-250, USD)
q: laws.Money = (USD,-250)
scala> val r = add(p, q)
r: laws.Money = (AUD,248),(USD,980)</pre><br />
The valuation of <code>Money</code> is done in terms of its base currency which is usually <code>USD</code>. One of the possible implementations of <code>Money</code> is the following (some parts elided for future explanations) ..<br />
<br />
<pre class="brush: scala">sealed trait Currency
case object USD extends Currency
case object AUD extends Currency
case object JPY extends Currency
case object INR extends Currency
class Money private[laws] (val items: Map[Currency, BigDecimal]) {
def toBaseCurrency: BigDecimal =
items.foldLeft(BigDecimal(0)) { case (a, (ccy, amount)) =>
a + Money.exchangeRateWithUSD.get(ccy).getOrElse(BigDecimal(1)) * amount
}
override def toString = items.toList.mkString(",")
}
object Money {
final val zeroMoney = new Money(Map.empty[Currency, BigDecimal])
def apply(amount: BigDecimal, ccy: Currency) = new Money(Map(ccy -> amount))
def add(m: Money, amount: BigDecimal, ccy: Currency) = ???
final val exchangeRateWithUSD: Map[Currency, BigDecimal] =
Map(AUD -> 0.76, JPY -> 0.009, INR -> 0.016, USD -> 1.0)
}</pre><br />
Needless to say we will have quite a number of unit tests that check for addition of <code>Money</code>, including the boundary cases of adding to <code>zeroMoney</code>.<br />
<br />
It's not very hard to see that the type <code>Money</code> forms a <code>Monoid</code> under the <code>add</code> operation. Or to speak a bit loosely we can say that <code>Money</code> is a <code>Monoid</code> under the <code>add</code> operation.<br />
<br />
A <code>Monoid</code> has <a href="https://en.wikibooks.org/wiki/Haskell/Monoids">laws</a> that every instance needs to honor - associativity, left identity and right identity. And when your model element needs to honor the laws of algebra, it's always recommended to include the verification of the laws as part of your test suite. Besides validating the sanity of your abstractions, one side-effect of verifying laws is that you can get rid of many of your explicitly written unit tests for the operation that forms the <code>Monoid</code>. They will be automatically verified when verifying the laws of <code>Monoid[Money]</code>.<br />
<br />
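Spelled out for <code>Money</code>, the three laws look like the following (a hand-rolled sketch of mine using the 2-arg <code>add</code> that <code>combine</code> delegates to, comparing the underlying maps since <code>Money</code> has no structural equality; the law-checking machinery shown below packages exactly these checks) ..
<pre class="brush: scala">
// Monoid laws for Money written as plain predicates
def associativity(x: Money, y: Money, z: Money): Boolean =
  add(add(x, y), z).items == add(x, add(y, z)).items

def leftIdentity(x: Money): Boolean  = add(zeroMoney, x).items == x.items
def rightIdentity(x: Money): Boolean = add(x, zeroMoney).items == x.items
</pre>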
Here's how we define <code>Monoid[Money]</code> using <a href="http://typelevel.org/cats">Cats</a> ..<br />
<br />
<pre class="brush: scala">val MoneyAddMonoid: Monoid[Money] = new Monoid[Money] {
def combine(m: Money, n: Money): Money = add(m, n)
def empty: Money = zeroMoney
}</pre><br />
and the implementation of the previously elided add operation on <code>Money</code> using <code>Monoid</code> on <code>Map</code> ..<br />
<br />
<pre class="brush: scala">object Money {
//..
def add(m: Money, amount: BigDecimal, ccy: Currency) =
new Money(m.items |+| Map(ccy -> amount))
//..
}</pre><br />
Now we can verify the laws of <code>Monoid[Money]</code> using <a href="http://specs2.org">specs2</a> and <a href="http://scalacheck.org">ScalaCheck</a> and the helper classes that Cats offers ..<br />
<br />
<pre class="brush: scala">import cats._
import kernel.laws.GroupLaws
import org.scalacheck.{ Arbitrary, Gen }
import Arbitrary.arbitrary
class MoneySpec extends CatsSpec { def is = s2"""
This is a specification for validating laws of Money
(Money) should
form a monoid under addition $e1
"""
implicit lazy val arbCurrency: Arbitrary[Currency] = Arbitrary {
Gen.oneOf(AUD, USD, INR, JPY)
}
implicit def moneyArbitrary: Arbitrary[Money] =
Arbitrary {
for {
i <- Arbitrary.arbitrary[Map[Currency, BigDecimal]]
} yield new Money(i)
}
def e1 = checkAll("Money", GroupLaws[Money].monoid(Money.MoneyAddMonoid))
}</pre>
<br>
<br>
and running the test suite will verify the Monoid laws for <code>Monoid[Money]</code> ..
<br>
<br>[info] This is a specification for validating laws of Money
<br>[info]
<br>[info] (Money) should
<br>[info] form a monoid under addition monoid laws must hold for Money
<br>[info] + monoid.associativity
<br>[info] + monoid.combineAll
<br>[info] + monoid.combineAll(Nil) == id
<br>[info] + monoid.combineAllOption
<br>[info] + monoid.combineN(a, 0) == id
<br>[info] + monoid.combineN(a, 1) == a
<br>[info] + monoid.combineN(a, 2) == a |+| a
<br>[info] + monoid.isEmpty
<br>[info] + monoid.leftIdentity
<br>[info] + monoid.rightIdentity
<br>[info] + monoid.serializable
<br>
<br>
In summary ..
<ul><li>strive to find abstractions in your domain model that are constrained by algebraic laws</li>
<li>check all laws as part of your test suite</li>
<li>you will find that you can get rid of quite a few explicitly written unit tests just by checking the laws of your abstraction</li>
<li>and of course use property based testing for unit tests</li>
</ul>In case you want to take a look at the full code base, it's there on my <a href="https://github.com/debasishg/pigeon/tree/master/laws-tango">Github repo</a>. In the next post we will take the next step towards modeling with generic algebraic code using the Monoid pattern from this example. Code written in parametric form without depending on specialized concrete types can be more robust, easier to test and easier to reason about. I have also discussed this at length in my book <a href="https://www.manning.com/books/functional-and-reactive-domain-modeling">Functional and Reactive Domain Modeling</a>. I plan to supplement the materials covered there with more examples and code patterns ..<br />
<br />
<br />
Baking a π can teach you a bit of Parametricity
<br />
<br />
Even though I got my copy of Prof. Eugenia Cheng's awesome <a href="http://www.amazon.com/How-Bake-Pi-Exploration-Mathematics/dp/0465051715">How to Bake π</a> a couple of weeks back, I started reading it only over this weekend. I am only on page 19, enjoying all the stuff regarding cookies that Prof. Cheng is using to explain abstraction. This is a beautiful piece of explanation and if you are a programmer you may get an extra mile out of the concepts that she explains here. Let's see if we can unravel a few of them ..<br />
<br />
She starts with a real life situation such as:<br />
<br />
<blockquote>If Grandma gives you five cookies and Grandpa gives you five cookies, how many cookies will you have ?</blockquote><br />
Let's model this as a box of cookies that you get from your Grandma and a box that you get from your Grandpa - you need to count them and find the total. Modeling this in Scala, we may have something like the following ..<br />
<br />
<pre class="brush: scala">case class CookieBox(count: Int)</pre><br />
and we can define a function that gives you a <code>CookieBox</code> containing the total number of cookies from the 2 boxes that we pass to the function ..<br />
<br />
<pre class="brush: scala">def howManyCookies(gm: CookieBox, gp: CookieBox) = {
CookieBox(gm.count + gp.count)
}</pre><br />
and we use <code>howManyCookies</code> to find the count ..<br />
<br />
<pre class="brush: scala">scala> val gm = CookieBox(5)
gm: CookieBox = CookieBox(5)
scala> val gp = CookieBox(5)
gp: CookieBox = CookieBox(5)
scala> howManyCookies(gm, gp)
res5: CookieBox = CookieBox(10)</pre><br />
.. so we have 10 cookies from our Grandma & Grandpa .. Perfect!<br />
<br />
The problem is .. the child answers: <i>"None, because I'll eat them all"</i>. To model this let's add a function <code>eat</code> to our <code>CookieBox</code> abstraction ..<br />
<br />
<pre class="brush: scala">case class CookieBox(count: Int) {
// let's assume n < count for simplicity
def eat(n: Int): CookieBox = CookieBox(count - n)
}</pre><br />
So instead of the correct way to answer the question, the child cheats and implements <code>howManyCookies</code> as ..<br />
<br />
<pre class="brush: scala">def howManyCookies(gm: CookieBox, gp: CookieBox) = {
CookieBox(gm.eat(gm.count).count + gp.eat(gp.count).count)
}</pre><br />
and we get the following ..<br />
<br />
<pre class="brush: scala">scala> howManyCookies(gm, gf)
res6: CookieBox = CookieBox(0)</pre><br />
Prof. Cheng continues ..<br />
<br />
<blockquote>The trouble here is that cookies do not obey the rules of logic, so using math to study them doesn't quite work. .. We could impose an extra rule on the situation by adding "... and you're not allowed to eat the cookies". If you're not allowed to eat them, what's the point of them being cookies ?</blockquote><br />
This is profound indeed. When we are asked to count some stuff, it really doesn't matter if they are cookies or stones or pastries. The only property we need here is the ability to add together the 2 things that we are handed over. The fact that we have implemented <code>howManyCookies</code> in terms of <code>CookieBox</code> gives the little child the opportunity to cheat by using the <code>eat</code> function. More information is actually hurting us here - being concrete with data types creates more avenues for incorrect implementation.<br />
<br />
Prof. Cheng is succinct here when she explains ..<br />
<br />
<blockquote>We could treat the cookies as just things rather than cookies. We lose some resemblance to reality, but we gain scope and with it efficiency. The point of numbers is that we can reason about "things" without having to change the reasoning depending on what "thing" we are thinking about.</blockquote><br />
Yes, she is talking about generalization, being polymorphic over what we count. We just need the ability to add 2 "things", be it cookies, monkeys or anchovies. In programming we model this with parametric polymorphism, and use a universal quantification over the set of types for which we implement the behavior.<br />
<br />
<pre class="brush: scala">def howMany[A](gm: A, gp: A) = //..</pre><br />
We have made the implementation parametric and got rid of the concrete data type <code>CookieBox</code>. But how do we add the capability to sum the 2 objects and get the result ? You got it right - we already have an abstraction that makes this algebra available to a generic data type. Monoids FTW .. and it doesn't get simpler than this ..<br />
<br />
<pre class="brush: scala">trait Monoid[T] {
def zero: T
def append(t1: T, t2: T): T
}</pre><br />
<code>zero</code> is the identity element and <code>append</code> is a binary associative operation over 2 objects of the type. So given a monoid instance for our data type, we can model <code>howMany</code> in a completely generic way irrespective of whether <code>A</code> is a <code>CookieBox</code> or <code>Monkey</code>.<br />
<br />
<pre class="brush: scala">def howMany[A : Monoid](gm: A, gp: A): A = gm append gp</pre><br />
Implementing a monoid for <code>CookieBox</code> is also simple ..<br />
<br />
<pre class="brush: scala">object CookieBox {
implicit val CookieBoxMonoid = new Monoid[CookieBox] {
val zero = CookieBox(0)
def append(i: CookieBox, j: CookieBox) = CookieBox(i.count + j.count)
}
}
</pre>With the above implementation of <code>howMany</code>, the little child will not be able to cheat. By providing a simpler data type we have made the implementation more robust and reusable across multiple data types.<br />
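Here's a quick usage sketch (the <code>Monkey</code> type is hypothetical, purely to illustrate reuse across data types) ..
<pre class="brush: scala">
val total = howMany(CookieBox(5), CookieBox(5))   // CookieBox(10)

// for any other type, all we need is a Monoid instance
case class Monkey(count: Int)

object Monkey {
  implicit val MonkeyMonoid = new Monoid[Monkey] {
    val zero = Monkey(0)
    def append(t1: Monkey, t2: Monkey) = Monkey(t1.count + t2.count)
  }
}

val troop = howMany(Monkey(3), Monkey(4))         // Monkey(7)
</pre>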
<br />
Next time someone wants me to explain parametricity, I will point them to Page 19 of How to Bake π.
<br />
<br />
Randomization and Probabilistic Techniques to scale up Machine Learning
<div dir="ltr" style="text-align: left;" trbidi="on">
Some time back I <a href="http://debasishg.blogspot.in/2015/01/probabilistic-techniques-data-streams.html">blogged</a> about the possibilities that probabilistic techniques and randomization bring to the paradigm of stream computing. Architectures based on big data relate not only to high volume storage but also to low latency processing, and this is exactly where stream computing has a role to play. I discussed a few data structures like bloom filters, count-min sketch and hyperloglog and algorithms like Locality Sensitive Hashing that use probabilistic techniques to reduce the search and storage space while processing huge volumes of data.<br />
<br />
Of late, I have been studying some of the theory behind machine learning algorithms and how they can be used in conjunction with the petabytes of data that we generate every day. And the same thing strikes here - there are algorithms that can model the most perfect classifier. But you need randomization and probabilistic techniques to make them scale, even at the expense of a small amount of inaccuracy creeping into your model. In most cases we will see that the small inaccuracy that creeps into your algorithm because of probabilistic bounds can be compensated for by the ability to process more data within the specified computational timeline. This is true even for some of the basic algorithms like matrix multiplication that form the core of machine learning models.<br />
<br />
The contents of this post are nothing original or new. It's just to share some of my thoughts on learning the usage of approximation techniques in building machine learning classifiers.<br />
<br />
<h1>
Matrix Multiplication</h1>
<br />
Randomization has been found to be quite effective not only in these specialized data structures and algorithms, but also in processing large data sets with standard algorithms like matrix multiplication, polynomial identity verification or min-cut identification in large graphs. In all such cases the best available algorithms have a computational complexity that works well for a small data set but doesn't scale well enough with the volume of data.<br />
<br />
Consider a case where we are given 3 matrices, $A$, $B$ and $C$ and we need to verify if $AB = C$. The standard algorithm for matrix multiplication takes $\Theta(n^3)$ operations and there's also a sophisticated algorithm that works in $\Theta(n^{2.37})$ operations. Instead let's consider some randomization and choose a random vector $\bar{r} = (r_1, r_2, .. r_n) \in \{0, 1\}^n$. Now we can compute $AB\bar{r}$ by first computing $B\bar{r}$ and then $A(B\bar{r})$. And then we compute $C\bar{r}$. If we find $A(B\bar{r}) \neq C\bar{r}$, then $AB \neq C$. Otherwise we return $AB = C$. Instead of matrix-matrix multiplication our randomized algorithm uses matrix-vector multiplication, which can be done in $\Theta(n^2)$ operations the standard way.<br />
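Here's a small sketch of this randomized check in Scala (my own illustration - the function name, the dense <code>Array</code> representation and the tolerance-based comparison are arbitrary choices; with integer matrices you would compare exactly) ..
<pre class="brush: scala">
import scala.util.Random

// verify AB == C probabilistically in O(trials * n^2) operations
// instead of actually multiplying the matrices
def verifyProduct(a: Array[Array[Double]], b: Array[Array[Double]],
                  c: Array[Array[Double]], trials: Int = 100): Boolean = {
  val n = a.length

  def matVec(m: Array[Array[Double]], v: Array[Double]): Array[Double] =
    m.map(row => row.zip(v).map { case (x, y) => x * y }.sum)

  (1 to trials).forall { _ =>
    val r   = Array.fill(n)(if (Random.nextBoolean()) 1.0 else 0.0)  // r in {0,1}^n
    val abr = matVec(a, matVec(b, r))                                // A(Br)
    val cr  = matVec(c, r)                                           // Cr
    abr.zip(cr).forall { case (x, y) => math.abs(x - y) < 1e-9 }
  }
}
</pre>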
<br />
Obviously a $\Theta(n^2)$ algorithm has a lower computational complexity than $\Theta(n^3)$ and scales better with larger data sets. Now the question is how accurate is this algorithm ? Is it guaranteed to give the correct answer every time we run it ? As with other probabilistic algorithms, there's a chance that our algorithm will return a wrong result. But as long as we can show that the chance is minimal and can be reduced by tuning some parameters, we should be fine.<br />
<br />
It can be shown that if $AB \neq C$ and if $\bar{r}$ is chosen uniformly at random from $\{0, 1\}^n$ then $Pr(AB\bar{r} = C\bar{r}) <= 1/2$. But the trick is that we can run our randomized algorithm many times choosing $\bar{r}$ with replacement from $\{0, 1\}^n$. If for any of these trials we get $AB\bar{r} \neq C\bar{r}$, then we can conclude $AB \neq C$. And the probability that we get $AB\bar{r} = C\bar{r}$ for all $k$ trials despite $AB \neq C$ is $2^{-k}$. So for $100$ trials, the chance of error is $2^{-100}$, which we can see is really small. The detailed proof of this analysis can be found in the excellent book <a href="http://www.amazon.com/Probability-Computing-Randomized-Algorithms-Probabilistic/dp/0521835402">Probability and Computing</a> by Michael Mitzenmacher & Eli Upfal.<br />
<br />
Matrix multiplication is something that's used heavily especially in implementing machine learning classifiers. And if we can tolerate that little chance of error we get an algorithm with lower computational complexity that scales much better.<br />
<br />
<h1>
Stochastic Gradient Descent</h1>
<br />
Consider another use case from core machine learning classifier design. Gradient descent is a standard way to minimize the empirical risk for measuring training set performance. The empirical risk is given by the following equation:<br />
$$E_n(f) = (1/n)\sum_i l(f_w(x_i),y_i)$$<br />
where $l$ is the loss function that measures, over the $n$ training examples, the cost of predicting $f_w(x_i)$ when the actual answer is $y_i$, and $f_w(x)$ is the function parameterized by the weight vector $w$. Each iteration of gradient descent updates the weights $w$ on the basis of the gradient of $E_n(f_w)$ according to the following iterative step:<br />
<br />
$$w_{t+1} = w_t - \gamma (1/n) \sum_i \nabla_w Q(z_i, w_t)$$<br />
where $\gamma$ is an adequately chosen gain (the learning rate) and $Q(z_i, w) = l(f_w(x_i), y_i)$ denotes the loss on the training example $z_i = (x_i, y_i)$. Note that a single update step for the parameters runs through all the training examples, and this gets repeated for every update step that you do before convergence. Compare this with Stochastic Gradient Descent (SGD) where the update step is given by the following:<br />
<br />
$$w_{t+1} = w_t - \gamma \nabla_w Q(z_t, w_t)$$<br />
Note that instead of running through all examples and computing the exact gradient, SGD computes the gradient based on <i>one randomly picked example</i> $z_t$. So SGD does a noisy approximation to the true gradient. But since it does not have to process all the examples in every iteration, it scales better with a large data set. In this paper on <a href="http://leon.bottou.org/publications/pdf/compstat-2010.pdf">Large Scale Machine Learning With Stochastic Gradient Descent</a>, Leon Bottou classifies the error in building the classifier into 3 components:<br />
<br />
<li><i>Approximation Error</i>, which comes from the fact that the function $f$ that we choose is different from the optimal function $f^*$ and we approximate using a few examples</li>
<br />
<br />
<li><i>Estimation Error</i>, which comes from the fact that we have a finite number of training examples and would have gone away with infinite number of them</li>
<br />
<br />
<li><i>Optimization Error</i>, which comes from the fact that we are using an inferior algorithm to estimate the gradient</li>
</div>
<br />
With normal gradient descent we will have a low optimization error since we run through all the training examples in every iteration to compute the gradient, which is clearly superior to SGD's noisy approximation. But SGD will report lower approximation and estimation errors since we will be able to process a larger dataset within the stipulated computation time. So it's a tradeoff that we make using SGD, but clearly we scale better with larger data sets.<br />
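To make the difference concrete, here's a minimal sketch (my own, for least squares linear regression with loss $Q(z, w) = \frac{1}{2}(w \cdot x - y)^2$) of the two update rules ..
<pre class="brush: scala">
import scala.util.Random

type Example = (Array[Double], Double)   // z = (x, y)

// gradient of the squared loss on a single example
def grad(w: Array[Double], z: Example): Array[Double] = {
  val (x, y) = z
  val err = x.zip(w).map { case (xi, wi) => xi * wi }.sum - y
  x.map(_ * err)
}

// batch gradient descent: one step touches every example
def batchStep(w: Array[Double], data: Vector[Example], gamma: Double): Array[Double] = {
  val grads = data.map(grad(w, _))
  val avg   = w.indices.map(i => grads.map(_(i)).sum / data.size).toArray
  w.zip(avg).map { case (wi, gi) => wi - gamma * gi }
}

// stochastic gradient descent: one step uses a single randomly picked example
def sgdStep(w: Array[Double], data: Vector[Example], gamma: Double): Array[Double] = {
  val g = grad(w, data(Random.nextInt(data.size)))
  w.zip(g).map { case (wi, gi) => wi - gamma * gi }
}
</pre>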
<br />
<h1>
Singular Value Decomposition</h1>
<br />
Singular Value Decomposition is a dimensionality reduction technique to unearth a smaller number of intrinsic concepts from a high dimensional matrix by removing unnecessary information. It does so by projecting the original matrix on to lower dimensions such that the reconstruction error is minimized. What this means is that given a matrix $A$ we decompose it into lower dimensional matrices by removing the lesser important information. And we do this in such a way that we can reconstruct a fairly close approximation to $A$ from those lower dimensional matrices. In theory SVD gives the best possible projection in terms of reconstruction error (optimal low rank approximation). But in practice it suffers from scalability problems with large data sets. It generates dense singular vectors even if the original matrix is a sparse one and hence is computationally inefficient, taking cubic time in the size of the data.<br />
<br />
This can be addressed by another algorithm, the CUR algorithm which allows larger reconstruction error but lesser computation time. CUR decomposes the original matrix into ones of lesser dimensions but uses a randomized algorithm in selection of columns and rows based on their probability distribution. Now it can be shown that CUR reconstruction is just an additive term away from SVD reconstruction and it's a probabilistic bound subject to the condition that we select a specific range of columns and rows from $A$. The computational bound of CUR is of the order of the data set, which is much less than that of SVD (which as I mentioned earlier is cubic). This is yet another example where we apply randomization and probabilistic techniques to scale our algorithm better for larger data sets in exchange for a little amount of inaccuracy.<br />
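In symbols (standard notation, not taken from the post): truncated SVD computes the rank-$k$ factorization $A \approx U_k \Sigma_k V_k^T$ that minimizes the reconstruction error $\|A - U_k \Sigma_k V_k^T\|_F$ over all rank-$k$ matrices, whereas CUR samples actual columns $C$ and rows $R$ of $A$ (with probabilities proportional to their squared norms) and computes a small linking matrix $U$ such that, with high probability, $\|A - CUR\|_F \leq \|A - A_k\|_F + \epsilon\|A\|_F$ - the additive term mentioned above.<br />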
<br />
These are only a few instances of probabilistic bounds being applied to solve real world machine learning problems. There are lots more. In fact I find that the scalability of machine learning has a very direct correlation with the application of probabilistic techniques to the model. As I mentioned earlier, the point of this post is to share some of my thoughts as I continue to learn techniques to scale up machine learning models. Feel free to share your ideas, thoughts and discussions in the comments.<br />
<div>
<br /></div>
</div>
<br />
<br />
Functional Patterns in Domain Modeling - Composing a domain workflow with statically checked invariants
<br />
<br />
I have been doing quite a bit of domain modeling using functional programming, mostly in Scala. And as it happens when you work on something for a long period of time, you tend to identify more and more patterns that come up repeatedly within your implementations. You may ignore these as patterns the first time, get a feeling of mere coincidence the next time, but the third time really gives you that aha! moment and you feel like documenting it as a design pattern. In the course of my learning I have started blogging on some of these patterns - you can find the earlier ones in the series in:<br />
<br />
<li><a href="http://debasishg.blogspot.in/2014/03/functional-patterns-in-domain-modeling.html">Functional Patterns in Domain Modeling - The Specification Pattern</a></li><br />
<li><a href="http://debasishg.blogspot.in/2014/04/functional-patterns-in-domain-modeling.html">Functional Patterns in Domain Modeling - Immutable Aggregates and Functional Updates</a></li><br />
<li><a href="http://debasishg.blogspot.in/2014/05/functional-patterns-in-domain-modeling.html">Functional Patterns in Domain Modeling - Anemic Models and Compositional Domain Behaviors</a></li><br />
<br />
In this continuing series of functional patterns in domain modeling, I will go through yet another idiom which has been a quite common occurrence in my explorations across various domain models. You will find many of these patterns explained in detail in my upcoming book on <a href="http://manning.com/ghosh2">Functional and Reactive Domain Modeling</a>, the early access edition of which has already been published by Manning.<br />
<br />
One of the things that I strive to achieve in implementing domain models is to use the type system to encode as much domain logic as possible. If you can use the type system effectively then you get the benefits of parametricity, which not only makes your code generic, concise and polymorphic, but also makes it self-testing. But that's another story which we can discuss in another post. In this post I will talk about a pattern that helps you design domain workflows compositionally, and also enables implementing domain invariants within the workflow - all done statically with a little help from the type system.<br />
<br />
As an example let's consider a loan processing system (simplified for illustration purposes) typically followed by banks issuing loans to customers. A typical simplified workflow looks like the following :-<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPYRjDPShDDiudv3mK3mwa6uy4M225TCgYWtzXLEbEU_nZQe0uXvpTj27Fh01PLmj-gozDQy0prYNQNE7ddxLhTtHAwqFW3XYE9fzCinyGL83wSJHgwcmp90e4nzJXiCdrAvQMFA/s1600/Screen+Shot+2015-02-11+at+12.35.56+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPYRjDPShDDiudv3mK3mwa6uy4M225TCgYWtzXLEbEU_nZQe0uXvpTj27Fh01PLmj-gozDQy0prYNQNE7ddxLhTtHAwqFW3XYE9fzCinyGL83wSJHgwcmp90e4nzJXiCdrAvQMFA/s400/Screen+Shot+2015-02-11+at+12.35.56+AM.png" /></a></div><h1>The Domain Model</h1><br />
The details of each process are not important - we will focus on how we compose the sequence and ensure that the API verifies statically that the correct sequence is followed. Let's start with a domain model for the loan application - we will keep on enriching it as we traverse the workflow.<br />
<br />
<pre class="brush: scala">case class LoanApplication private[Loans](
// date of application
date: Date,
// name of applicant
name: String,
// purpose of loan
purpose: String,
// intended period of repayment in years
repayIn: Int,
// actually sanctioned repayment period in years
actualRepaymentYears: Option[Int] = None,
// actual start date of loan repayment
startDate: Option[Date] = None,
// loan application number
loanNo: Option[String] = None,
// emi
emi: Option[BigDecimal] = None
)</pre><br />
Note we have a bunch of attributes that are defined as optional and will be filled out later as the loan application traverses through the sequence of workflow. Also we have declared the class private and we will have a smart constructor to create an instance of the class. <br />
<br />
<h1>Wiring the workflow with Kleisli</h1><br />
Here are the various domain behaviors modeling the stages of the workflow .. I will be using the <a href="https://github.com/scalaz/scalaz">scalaz</a> library for the <code>Kleisli</code> implementation.<br />
<br />
<pre class="brush: scala">def applyLoan(name: String, purpose: String, repayIn: Int,
date: Date = today) =
LoanApplication(date, name, purpose, repayIn)
def approve = Kleisli[Option, LoanApplication, LoanApplication] { l =>
// .. some logic to approve
l.copy(
loanNo = scala.util.Random.nextString(10).some,
actualRepaymentYears = 15.some,
startDate = today.some
).some
}
def enrich = Kleisli[Option, LoanApplication, LoanApplication] { l =>
//.. may be some logic here
val x = for {
y <- l.actualRepaymentYears
s <- l.startDate
} yield (y, s)
l.copy(emi = x.map { case (y, s) => calculateEMI(y, s) }).some
}</pre><br />
<code>applyLoan</code> is the smart constructor that creates the initial instance of <code>LoanApplication</code>. The other 2 functions <code>approve</code> and <code>enrich</code> perform the approval and enrichment steps of the workflow. Note both of them return an enriched version of the <code>LoanApplication</code> within a Kleisli, so that we can use the power of <code>Kleisli</code> composition and wire them together to model the workflow ..<br />
<br />
<pre class="brush: scala">val l = applyLoan("john", "house building", 10)
val op = approve andThen enrich
op run l</pre><br />
When you have a sequence to model that takes an initial object and then applies a chain of functions, you can use plain function composition like <code>h(g(f(x)))</code>, or the point free notation <code>(h compose g compose f)</code>, or the more readable order <code>(f andThen g andThen h)</code>. But in the above case we need to have effects along with the composition - we are returning <code>Option</code> from each stage of the workflow. So here, instead of plain composition we need effectful composition of functions, and that's exactly what <code>Kleisli</code> offers. The <code>andThen</code> combinator in the above code snippet is actually a <code>Kleisli</code> composition, aka function composition with effects.<br />
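In essence (a simplified sketch of mine, not the scalaz implementation), Kleisli composition for <code>Option</code> is just this ..
<pre class="brush: scala">
// composing A => Option[B] with B => Option[C] through flatMap -
// this is what Kleisli's andThen does for us, generalized to any monad
def composeK[A, B, C](f: A => Option[B], g: B => Option[C]): A => Option[C] =
  a => f(a).flatMap(g)
</pre>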
<br />
So we have everything the workflow needs, and clients use our API to construct workflows for processing loan applications. But one of the qualities of good API design is to make it difficult for the client to use the API in the wrong way. Consider what happens with the above design of the workflow if we invoke the sequence as <code>enrich andThen approve</code>. This violates the domain invariant that enrichment happens after approval - approval of the application generates some information which the enrichment process needs to use. But because our types align, the compiler will be perfectly happy to let this semantically invalid composition pass through, and we will have the error reported at run time.<br />
<br />
Remembering that we have a static type system at our disposal, can we do better ?<br />
<br />
<h1>Phantom Types in the Mix</h1><br />
Let's throw in some more types and see if we can tag in some more information for the compiler to help us. Let's tag each state of the workflow with a separate type ..<br />
<br />
<pre class="brush: scala">trait Applied
trait Approved
trait Enriched</pre><br />
Finally make the main model <code>LoanApplication</code> parameterized on a type that indicates which state it is in. And we have some helpful type aliases ..<br />
<br />
<pre class="brush: scala">case class LoanApplication[Status] private[Loans]( //..
type LoanApplied = LoanApplication[Applied]
type LoanApproved = LoanApplication[Approved]
type LoanEnriched = LoanApplication[Enriched]</pre><br />
These types will have no role in modeling domain behaviors - they will just be used to dispatch to the correct state of the sequence that the domain invariants mandate. The workflow functions need to be modified slightly to take care of this ..<br />
<br />
<pre class="brush: scala">def applyLoan(name: String, purpose: String, repayIn: Int,
date: Date = today) =
LoanApplication[Applied](date, name, purpose, repayIn)
def approve = Kleisli[Option, LoanApplied, LoanApproved] { l =>
l.copy(
loanNo = scala.util.Random.nextString(10).some,
actualRepaymentYears = 15.some,
startDate = today.some
).some.map(identity[LoanApproved])
}
def enrich = Kleisli[Option, LoanApproved, LoanEnriched] { l =>
val x = for {
y <- l.actualRepaymentYears
s <- l.startDate
} yield (y, s)
l.copy(emi = x.map { case (y, s) => calculateEMI(y, s) }).some.map(identity[LoanEnriched])
}</pre><br />
Note how we use the phantom types within the <code>Kleisli</code> and ensure statically that the sequence can flow only in one direction - that which is mandated by the domain invariant. So now an invocation of <code>enrich andThen approve</code> will result in a compilation error because the types don't match. So once again yay! for having the correct encoding of domain logic with proper types.
<br />
<br />
Probabilistic techniques, data streams and online learning - Looking forward to a bigger 2015
<br />
<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I look forward to 2015 as the year when randomized algorithms, probabilistic techniques and data structures become more pervasive and mainstream. The primary driving factors for this will be the increasing prevalence of big data and the necessity to process it in near real time using minimal (or constant) memory bandwidth. You are given data streams where possibly you will see each data item only once in your lifetime, and you need to churn out analytics from them in real time. You cannot afford to store all of them in a database on disk, since that will incur an unrealistic performance penalty for serving queries in real time. And you cannot afford to store all information in memory even if you add RAM at will. You need to find clever ways to optimize your storage, and employ algorithms and data structures that use sublinear space and yet deliver information in real time.<br />
<br />
Many such data structures are already being used quite heavily for specialized processing of data streams ..<br />
<br />
<ul style="text-align: left;">
<li><a href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom Filters</a> for set membership using sublinear storage</li>
<li><a href="http://research.google.com/pubs/pub40671.html">HyperLogLog</a> for efficient cardinality estimation</li>
<li><a href="http://debasishg.blogspot.in/2014/01/count-min-sketch-data-structure-for.html">Count Min Sketch</a> for estimating frequency related properties of a data stream</li>
</ul>
<br />
These data structures are becoming more and more useful as we prepare to embrace and process larger data sets with fairly strict online requirements. And it has started making a difference. Take for example Impala, the open source analytic database from Cloudera that works on top of Hadoop. Impala's NDV aggregate function (number of distinct values) uses the HyperLogLog algorithm to estimate this number, in parallel, in a fixed amount of space. This <a href="https://gist.github.com/avibryant/8275649">blog post</a> has the details of the performance improvement that it offers in comparison to the standard distinct count. The immensely popular NoSQL store Redis also offers a HyperLogLog implementation that you can use to get an approximation on the cardinality of a set using randomization. Salvatore has the details <a href="http://antirez.com/news/75">here</a> on the implementation of HyperLogLog algorithm in Redis.<br />
<br />
The most important reason these algorithms and data structures are becoming popular is the increased focus on our <i>"online"</i> requirements. We are not only processing bigger and bigger data sets, we need results faster too. We just cannot afford to push all analytics to batch mode and expect results to come out after an overnight batch run. Various architectural paradigms like the <a href="http://en.wikipedia.org/wiki/Lambda_architecture">lambda architecture</a> also target this niche area. But before investing in such complex architectures, often some neat data structures that use probabilistic techniques and randomization may offer the much lighter weight solution that you are looking for.<br />
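As an example of how lightweight these structures can be, here's a tiny Bloom filter sketch (illustrative only - the parameters and hashing scheme are arbitrary choices of mine, not a tuned implementation) ..
<pre class="brush: scala">
import scala.util.hashing.MurmurHash3

// set membership in m bits with k hash functions: false positives are
// possible, false negatives are not
class BloomFilter(m: Int, k: Int) {
  private val bits = new java.util.BitSet(m)

  private def indices(s: String): Seq[Int] =
    (0 until k).map(i => (MurmurHash3.stringHash(s, i) & Int.MaxValue) % m)

  def add(s: String): Unit = indices(s).foreach(i => bits.set(i))
  def mightContain(s: String): Boolean = indices(s).forall(i => bits.get(i))
}

val bf = new BloomFilter(1 << 20, 5)
bf.add("user-42")
bf.mightContain("user-42")     // true
bf.mightContain("user-99")     // false with high probability
</pre>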
<br />
Consider processing the Twitter stream and generating analytics (of whatever form) <i>online</i>. This means that immediately after seeing one twitter feed you must be able to predict something and update your model at the same time. Which means you need to memorize the data that you see in the feed, apply it to update your model and yet cannot store the entire hose that you have seen so far. This is <i>online learning</i> and is the essence of techniques like <a href="http://en.wikipedia.org/wiki/Stochastic_gradient_descent">stochastic gradient descent</a> that help you do this - the model is capable of making up to date predictions after every data that you see. John Myles White has an excellent <a href="https://www.hakkalabs.co/articles/streaming-data-analysis-and-online-learning/#!">presentation</a> on this topic.<br />
<br />
Consider this other problem of detecting similarities between documents. When you are doing this on a Web scale you will have to deal with millions of documents to find the similar sets. There are techniques like <a href="http://en.wikipedia.org/wiki/MinHash">minhash</a> which enable you to compress documents into signature matrices. But even then the scale becomes too big to be processed and reported to the user in a meaningful amount of time. As an example (from <a href="http://www.mmds.org/">Mining Massive Datasets</a>), if you process 1 million documents using signatures of length 250, you still have to use 1000 bytes per document - the total comes to 1 gigabyte, which fits quite comfortably into the memory of a standard laptop. But when you check for similar pairs, you need to process (1,000,000 choose 2), or half a trillion, pairs of documents, which will take almost 6 days to compute all similarities on a laptop. Enter probabilistic techniques - the <a href="http://www.mit.edu/~andoni/LSH/">locality sensitive hashing</a> (LSH) algorithm fits this problem like a charm. Detecting similarity is a problem that arises in recommender systems with collaborative filtering, and LSH can be used there as well. The basic idea of LSH as applied to similarity detection is to hash multiple times and identify candidate pairs that qualify for similarity checking. The idea is to reduce the search space using probabilistic techniques so that we can eliminate a class of candidates which have a very low chance of being similar.<br />
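Here's what the signature part of that pipeline could look like in Scala - a minimal minhash sketch of my own (the banding step of LSH, which buckets these signatures to produce candidate pairs, is left out for brevity) ..<br />
<pre class="brush: scala">
import scala.util.hashing.MurmurHash3

// minhash: the probability that two signatures agree at a position
// equals the Jaccard similarity of the two shingle sets
def minHashSignature(shingles: Set[String], numHashes: Int = 250): Vector[Int] = {
  require(shingles.nonEmpty, "cannot sign an empty document")
  (0 until numHashes).toVector.map { seed =>
    // the i-th hash function is simulated by seeding murmur3 with i
    shingles.map(s => MurmurHash3.stringHash(s, seed)).min
  }
}

// estimate similarity by comparing fixed-length signatures instead of full documents
def estimatedJaccard(sig1: Vector[Int], sig2: Vector[Int]): Double =
  sig1.zip(sig2).count { case (a, b) => a == b }.toDouble / sig1.length
</pre>
<br />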
<br />
Here I have only scratched the surface of the areas where we apply randomization and probabilistic techniques to solve problems that are very real today. There are plenty of other areas in data mining, graph clustering, machine learning and big data processing where similar techniques are employed to reduce the curse of dimensionality and provide practical solutions at scale. 2014 has already seen a big surge in terms of popularizing these techniques. I expect 2015 to be bigger and more mainstream in terms of their usage.<br />
<br />
Personally I have been exploring data stream algorithms a lot and have prepared a <a href="https://gist.github.com/debasishg/8172796">collection</a> of some useful references. Feel free to share in case you find it useful. I hope to do something more meaningful with stream processing data structures and online learning in 2015. Have a very happy and joyous new year ..<br />
<div>
<br /></div>
</div>
http://debasishg.blogspot.com/2015/01/probabilistic-techniques-data-streams.html[email protected] (Anonymous)0tag:blogger.com,1999:blog-22587889.post-822205679628878691Mon, 03 Nov 2014 07:45:00 +00002014-11-03T13:17:02.301+05:30dddfunctionalscalaFunctional and Reactive Domain Modeling<div dir="ltr" style="text-align: left;" trbidi="on">
Manning has launched the MEAP of my <a href="http://manning.com/ghosh2">upcoming book</a> on Domain Modeling.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgVaMNp24AZcLDlZ5-Hii0A_Br9L97yOvQUiC1o2oDzugWzUflmLFj_t8X6h5FkFgbw1ZmuDxZEzGNXU55z8Qkuo4oLDyhVOHFmJc3wkI1X7-eqXGXgdbvbunJHHgGuA33nYjUaA/s1600/Screen+Shot+2014-11-01+at+1.00.08+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgVaMNp24AZcLDlZ5-Hii0A_Br9L97yOvQUiC1o2oDzugWzUflmLFj_t8X6h5FkFgbw1ZmuDxZEzGNXU55z8Qkuo4oLDyhVOHFmJc3wkI1X7-eqXGXgdbvbunJHHgGuA33nYjUaA/s1600/Screen+Shot+2014-11-01+at+1.00.08+PM.png" height="189" width="320" /></a></div>
<br />
<br />
The first time I was formally introduced to the topic was way back when I played around with Eric Evans' <a href="http://www.amazon.in/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215">awesome text</a> on the subject of Domain Driven Design. In the book he discusses various object lifecycle patterns like the Factory, Aggregate or Repository that help separation of concerns when you are implementing the various interactions between the elements of the domain model. Entities are artifacts with identities, value objects are pure values while services model the coarse level use cases of the model components.<br />
<br />
Traditionally we followed the recommendations of Eric in our real world implementations and used the object oriented paradigm for modeling all interactions. We started talking about rich domain models and anemic domain models. The rich model espoused a richer agglomeration of state and behavior within the model, while the anemic model preferred to keep them decoupled. Martin Fowler sums this up in his <a href="http://www.martinfowler.com/bliki/AnemicDomainModel.html">post</a> on Anemic Domain Models ..<br />
<blockquote class="tr_bq">
"The basic symptom of an Anemic Domain Model is that at first blush it looks like the real thing. There are objects, many named after the nouns in the domain space, and these objects are connected with the rich relationships and structure that true domain models have. The catch comes when you look at the behavior, and you realize that there is hardly any behavior on these objects, making them little more than bags of getters and setters. Indeed often these models come with design rules that say that you are not to put any domain logic in the the domain objects. Instead there are a set of service objects which capture all the domain logic. These services live on top of the domain model and use the domain model for data."</blockquote>
<h2 style="text-align: left;">
</h2>
<h2 style="text-align: left;">
Go Functional</h2>
In <i>Functional and Reactive Domain Modeling</i> I look at the problem with a different lens. The primary focus of the book is to encourage building domain models using the principles of functional programming. It's an approach completely orthogonal to OO - it focuses on verbs first (as opposed to nouns first in OO), algebra first (as opposed to objects in OO), function composition first (as opposed to object composition in OO), and lightweight objects as ADTs (instead of rich class models).<br />
<br />
The book starts with the basics of functional programming principles and discusses the virtues of purity and the advantages of keeping side-effects decoupled from the core business logic. The book uses Scala as the programming language and discusses extensively why the OO and functional features of Scala are a perfect fit for modelling complex domains. Chapter 3 starts the core subject of functional domain modeling with real world examples illustrating how we can make good use of patterns like smart constructors, monads and monoids in implementing a domain model. The main virtue that these patterns bring to your model is genericity - they help you extract generic algebra from domain specific logic into parametric functions which are far more reusable and less error prone. Chapter 4 focuses on advanced usages like typeclass based design and patterns like monad transformers, kleislis and other compositional idioms of functional programming. One of the primary focuses of the book is algebraic API design and developing an appreciation of the ability to reason about your model.<br />
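Just to give a taste of the simplest of these patterns, here's what a smart constructor looks like - a sketch of my own for this post, not an excerpt from the book. The invariants of the domain are checked at the only sanctioned place where an instance can be created, so an invalid value never escapes into the model ..<br />
<pre class="brush: scala">
case class Account(no: String, name: String, balance: BigDecimal)

object Account {
  // the smart constructor: the sanctioned way of building an Account.
  // In real code you would also restrict the primary constructor and copy.
  def account(no: String, name: String, balance: BigDecimal): Either[String, Account] =
    if (no.trim.isEmpty) Left("Account no must be non empty")
    else if (balance < 0) Left("Balance must be non negative")
    else Right(Account(no, name, balance))
}

Account.account("a-123", "debasish", BigDecimal(1000)) // Right(Account(a-123,debasish,1000))
Account.account("", "debasish", BigDecimal(-10))       // Left(Account no must be non empty)
</pre>
<br />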
<h2 style="text-align: left;">
</h2>
<h2 style="text-align: left;">
Go Reactive</h2>
The term <i>"reactive"</i> has recently become quite popular in describing systems that are responsive, scalable and adaptive. In implementing complex domain models, we find many areas which can be made more responsive by implementing them as non blocking operations instead of the standard blocking function calls. Using higher level concurrency primitives like actors, futures and data flow based computing we can compose asynchronous operations and increase the net throughput of your model. The second part of the book discusses how you can combine the principles of functional programming with the reactive way of implementing behaviors. The paradigm of application design using the principles of functional programming together with an asynchronous non-blocking mode of communication between the participating entities promises to be a potent combination towards developing performant systems that are relatively easy to manage, maintain and evolve. But designing and implementing such models need a different way of thinking. The behaviors that you implement have to be composable using pure functions and they can form building blocks of bigger abstractions that communicate between them using non-blocking asynchronous message passing.<br />
<h2 style="text-align: left;">
</h2>
<h2 style="text-align: left;">
Functional meets Reactive</h2>
<i>Functional and Reactive Domain Modeling</i> takes you through the steps teaching you how to think of the domain model in terms of pure functions and how to compose them to build larger abstractions. You will start learning with the basics of functional programming principles and gradually progress to the advanced concepts and patterns that you need to know to implement complex domain models. The book demonstrates how advanced FP patterns like algebraic data types, typeclass based design and isolation of side-effects can make your model compose for readability and verifiability. On the subject of reactive modeling, the book focuses on the higher order concurrency patterns like actors and futures. It uses the Akka framework as the reference implementation and demonstrates how advanced architectural patterns like <a href="http://martinfowler.com/eaaDev/EventSourcing.html">event sourcing</a> and <a href="http://martinfowler.com/bliki/CQRS.html">command-query-responsibility-segregation</a> can be put to great use while implementing scalable models. You will learn techniques that are radically different from the standard RDBMS based applications that are based on mutation of records. You’ll also pick up important patterns like using asynchronous messaging for interaction based on non blocking concurrency and model persistence, which delivers the speed of in-memory processing along with suitable guarantees of reliability.<br />
<br />
Looking forward to an exciting journey with the book. I am sure you will also find interest in the topics that I discuss there. And feel free to jump on to <a href="http://www.manning-sandbox.com/forum.jspa?forumID=935">AuthorOnline</a> and fire questions that we all can discuss. I am sure this will also lead to an overall improvement in the quality of the book.</div>
http://debasishg.blogspot.com/2014/11/functional-and-reactive-domain-modeling.html[email protected] (Anonymous)1tag:blogger.com,1999:blog-22587889.post-3133487324467498659Mon, 12 May 2014 09:03:00 +00002014-05-12T14:33:21.750+05:30dddfunctionalscalaFunctional Patterns in Domain Modeling - Anemic Models and Compositional Domain BehaviorsI was looking at the <a HREF="http://www.slideshare.net/deanwampler/reactive-design-languages-and-paradigms">presentation</A> that Dean Wampler made recently regarding domain driven design, anemic domain models and how using functional programming principles helps ameliorate some of the problems there. There are some statements that he made which, I am sure, made many OO practitioners chuckle. They contradict popular beliefs that encourage OOP as the primary way of modeling using DDD principles.<br />
<br />
One statement that resonates a lot with my thought is <i>"DDD encourages understanding of the domain, but don't implement the models"</I>. DDD does a great job in encouraging developers to understand the underlying domain model and ensuring a uniform vocabulary throughout the lifecycle of design and implementation. This is what design patterns also do by giving you a vocabulary that you can heartily exchange with your fellow developers without influencing any bit of implementation of the underlying pattern.<br />
<br />
On the flip side of it, trying to implement DDD concepts using standard techniques of OO with joined state and behavior often gives you a muddled mutable model. The model may be rich in the sense that you will find all concepts related to the particular domain abstraction baked into the class you are modeling. But it makes the class fragile as well, since the abstraction becomes more locally focused, losing the global perspective of reusability and composability. As a result, when you try to compose multiple abstractions within the domain service layer, it becomes heavily polluted with glue code that resolves the impedance mismatch between class boundaries.<br />
<br />
So when Dean claims <i>"Models should be anemic"</I>, I think he means to avoid this bundling of state and behavior within the domain object that gives you a false sense of security about the richness of the model. He encourages the practice of building domain objects that carry only state, while behaviors are modeled as standalone functions.<br />
<br />
<blockquote class="twitter-tweet" lang="en"><p>Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function.</p>— John Carmack (@ID_AA_Carmack) <a href="https://twitter.com/ID_AA_Carmack/statuses/53512300451201024">March 31, 2011</a></blockquote><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script><br />
<br />
One other strawman argument that I come across very frequently is that bundling state and behavior by modeling the latter as methods of the class increases encapsulation. If you are still a believer of this school of thought, have a look at Scott Meyers' excellent <a HREF="http://www.drdobbs.com/cpp/how-non-member-functions-improve-encapsu/184401197">article</A> which he wrote as early as 2000. He eschews the view that a class is the right level of modularization and encourages more powerful module systems as better containers of your domain behaviors. <br />
<br />
As continuation of my series on functional domain modeling, we continue with the example of the earlier posts and explore the theme that Dean discusses ..<br />
<br />
Here's the anemic domain model of the Order abstraction ..<br />
<br />
<pre class="brush: scala">case class Order(orderNo: String, orderDate: Date, customer: Customer,
lineItems: Vector[LineItem], shipTo: ShipTo,
netOrderValue: Option[BigDecimal] = None, status: OrderStatus = Placed)</pre><br />
In the <a HREF="http://debasishg.blogspot.in/2014/04/functional-patterns-in-domain-modeling.html">earlier</A> <a HREF="http://debasishg.blogspot.in/2014/03/functional-patterns-in-domain-modeling.html">posts</A> we discussed how to implement the Specification and Aggregate Patterns of DDD using functional programming principles. We also discussed how to do functional updates of aggregates using data structures like Lens. In this post we will use these as the building blocks, use more functional patterns and build larger behaviors that model the ubiquitous language of the domain. After all, one of the basic principles behind DDD is to lift the domain model vocabulary into your implementation so that the functionality becomes apparent to the developer maintaining your model. <br />
<br />
The core idea is to validate the assumption that building domain behaviors as standalone functions leads to an effective realization of the domain model according to the principles of DDD. The base classes of the model contain only the states that can be mutated functionally. All domain behaviors are modeled through functions that reside within the module that represents the aggregate.<br />
<br />
Functions compose, and that's precisely how we will chain a sequence of domain behaviors to build bigger abstractions out of smaller ones. Here's a small function that values an Order. Note it returns a <CODE>Kleisli</CODE>, which essentially gives us a composition over monadic functions. So instead of composing <CODE>a -> b</CODE> and <CODE>b -> c</CODE>, which we do with normal function composition, we can do the same over <CODE>a -> m b</CODE> and <CODE>b -> m c</CODE>, where <CODE>m</CODE> is a monad. Composition with effects, if you will. <br />
<br />
<pre class="brush: scala">def valueOrder = Kleisli[ProcessingStatus, Order, Order] {order =>
val o = orderLineItems.set(
order,
setLineItemValues(order.lineItems)
)
o.lineItems.map(_.value).sequenceU match {
case Some(_) => right(o)
case _ => left("Missing value for items")
}
}</pre><br />
But what does that buy us ? What exactly do we gain from these functional patterns ? It's the power to abstract over families of similar abstractions like applicatives and monads. Well, that may sound a bit rhetorical and it needs a separate post to justify their use. Stated simply, they encapsulate the effects and side-effects of your computation so that you can focus on the domain behavior itself. Have a look at the <CODE>process</CODE> function below - it's actually a composition of monadic functions in action. But all the machinery that does the processing of effects and side-effects is abstracted within the <CODE>Kleisli</CODE> itself so that the user level implementation is simple and concise. <br />
<br />
With <code>Kleisli</CODE> it's the power to compose over monadic functions. Every domain behavior has a chance of failure, which we model using the <code>Either</CODE> monad - here <code>ProcessingStatus</CODE> is just a type alias for this .. <code>type ProcessingStatus[S] = \/[String, S]</CODE>. Using the <code>Kleisli</CODE>, we don't have to write any code for handling failures. As you will see below, the composition is just like the normal functions - the design pattern takes care of alternate flows.<br />
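For readers who want to see the short-circuiting behavior outside the domain example, here's a tiny standalone sketch of the same principle - my own illustration using a right-biased <code>Either</code> (Scala 2.12 or later) instead of scalaz's <code>\/</code> ..<br />
<pre class="brush: scala">
type ErrorOr[A] = Either[String, A]

def parseAmount(s: String): ErrorOr[BigDecimal] =
  try Right(BigDecimal(s)) catch { case _: NumberFormatException => Left(s"$s is not a number") }

def checkPositive(amount: BigDecimal): ErrorOr[BigDecimal] =
  if (amount > 0) Right(amount) else Left(s"$amount is not positive")

// Kleisli composition by hand: (A => M[B]) composed with (B => M[C])
def validated(s: String): ErrorOr[BigDecimal] = parseAmount(s).flatMap(checkPositive)

validated("100.50")  // Right(100.50)
validated("-3")      // Left(-3 is not positive) - the failure short-circuits
validated("oops")    // Left(oops is not a number)
</pre>
The domain functions above do exactly this, only with scalaz's <CODE>Kleisli</CODE> doing the wrapping and the sequencing for us.<br />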
<br />
Once the <code>Order</CODE> is valued, we need to apply discounts to qualifying items. It's another behavior that follows the same pattern of implementation as <code>valueOrder</CODE>.<br />
<br />
<pre class="brush: scala">def applyDiscounts = Kleisli[ProcessingStatus, Order, Order] {order =>
val o = orderLineItems.set(
order,
setLineItemDiscounts(order.lineItems) // assumed helper analogous to setLineItemValues in valueOrder
)
o.lineItems.map(_.discount).sequenceU match {
case Some(_) => right(o)
case _ => left("Missing discount for items")
}
}</pre><br />
Finally we check out the <code>Order </CODE>..<br />
<br />
<pre class="brush: scala">def checkOut = Kleisli[ProcessingStatus, Order, Order] {order =>
val netOrderValue = order.lineItems.foldLeft(BigDecimal(0).some) {(s, i) =>
s |+| (i.value |+| i.discount.map(d => Tags.Multiplication(BigDecimal(-1)) |+| Tags.Multiplication(d)))
}
right(orderNetValue.set(order, netOrderValue))
}</pre><br />
And here's the service method that composes all of the above domain behaviors into the big abstraction. We don't have any object to instantiate. Just plain function composition that results in an expression modeling the entire flow of events. And it's the cleanliness of abstraction that makes the code readable and succinct.<br />
<br />
<pre class="brush: scala">def process(order: Order) = {
(valueOrder andThen
applyDiscounts andThen checkOut) =<< right(orderStatus.set(order, Validated))
}</pre>
In case you are interested in the full source code of this small example, feel free to take a peek at my <A HREF="https://github.com/debasishg/scala-snippets/blob/master/src/main/scala/aggregate.scala">github repo</A>.<br />
http://debasishg.blogspot.com/2014/05/functional-patterns-in-domain-modeling.html[email protected] (Anonymous)3tag:blogger.com,1999:blog-22587889.post-5635474409431333107Sun, 06 Apr 2014 15:19:00 +00002014-04-06T20:49:49.264+05:30dddfunctionalscalaFunctional Patterns in Domain Modeling - Immutable Aggregates and Functional UpdatesIn the last <a href="http://debasishg.blogspot.in/2014/03/functional-patterns-in-domain-modeling.html">post</a> I looked at a pattern that enforces constraints to ensure domain objects honor the domain rules. But what exactly is a domain object ? What should be the granularity of an object that my solution model should expose so that it makes sense to a domain user ? After all, the domain model should speak the language of the domain. We may have a cluster of entities modeling various concepts of the domain. But only some of them can be published as abstractions to the user of the model. The others can be treated as implementation artifacts and are best hidden under the covers of the published ones.<br />
<br />
An aggregate in domain driven design is a published abstraction that provides a single point of interaction to a specific domain concept. Considering the classes I introduced in the last post, an <code>Order</code> is an aggregate. It encapsulates the details that an <code>Order</code> is composed of in the real world (well, only barely in this example, which is only for illustration purposes :-)). <br />
<br />
Note an aggregate can consist of other aggregates - e.g. we have a <code>Customer</code> instance within an <code>Order</code>. Eric Evans in his book on <a href="http://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215">Domain Driven Design</a> provides an excellent discussion of what constitutes an Aggregate.<br />
<br />
<h1>Functional Updates of Aggregates with Lens</h1><br />
This is not a post about Aggregates and how they fit in the realm of domain driven design. In this post I will talk about how to use some patterns to build immutable aggregates. Immutable data structures offer a lot of advantages, so build your aggregates ground up as immutable objects. Algebraic Data Type (ADT) is one of the patterns to build immutable aggregate objects. Primarily coming from the domain of functional programming, ADTs offer powerful techniques of pattern matching that help you match values against patterns and bind variables to successful matches. In Scala we use case classes as ADTs that give immutable objects out of the box ..<br />
<br />
<pre class="brush: scala">case class Order(orderNo: String, orderDate: Date, customer: Customer,
lineItems: Vector[LineItem], shipTo: ShipTo, netOrderValue: Option[BigDecimal] = None,
status: OrderStatus = Placed)
</pre><br />
Like all good aggregates, we need to provide a single point of interaction to users. Of course we can access all properties using accessors of case classes. But what about updates ? We can update the <code>orderNo</code> of an order like this ..<br />
<br />
<pre class="brush: scala">val o = Order( .. )
o.copy(orderNo = newOrderNo)</pre><br />
which gives us a copy of the original order with the new order no. We don't mutate the original order. But anybody having some knowledge of Scala will realize that this becomes pretty clunky when we have to deal with nested object updates. E.g. in the above case, <code>ShipTo</code> is defined as follows ..<br />
<br />
<pre class="brush: scala">case class Address(number: String, street: String, city: String, zip: String)
case class ShipTo(name: String, address: Address)</pre><br />
So, here you go in order to update the zip code of a <code>ShipTo</code> ..<br />
<br />
<pre class="brush: scala">val s = ShipTo("abc", Address("123", "Monroe Street", "Denver", "80233"))
s.copy(address = s.address.copy(zip = "80231"))</pre><br />
Not really pleasing, and it can get out of hand in terms of comprehensibility pretty soon.<br />
<br />
In our domain model we use an abstraction called a <code>Lens</code> for updating Aggregates. In layman's terms, a lens is an encapsulated <code>get</CODE> and <code>set</CODE> combination. The <code>get</CODE> extracts a small part from a larger whole, while the <code>set</CODE> transforms the larger abstraction with a smaller part taken as a parameter. <br />
<br />
<pre class="brush: scala">case class Lens[A, B](get: A => B, set: (A, B) => A)</pre><br />
This is a naive definition of a <code>Lens</CODE> in Scala. Sophisticated <code>lens</CODE> designs go a long way to ensure proper abstraction and composition. scalaz provides one such implementation out of the box that exploits the similarity in structure between the <code>get</CODE> and the <code>set</CODE> to generalize the lens definition in terms of another abstraction named <code>Store</CODE>. As happens so often in functional programming, <code>Store</CODE> turns out to abstract yet another pattern called the <code>Comonad</CODE>. You can think of a <code>Comonad</CODE> as the dual of a <code>Monad</CODE>. But in case you are more curious, and have wondered how lenses form <i>"the Coalgebras for the Store Comonad"</i>, have a look at the 2 papers <a href="http://r6research.livejournal.com/23705.html">here</a> and <a href="https://www.fpcomplete.com/user/tel/lenses-from-scratch">here</a>.<br />
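Before jumping to scalaz, it may help to see that even the naive <code>Lens</code> above composes with just a few lines (a sketch only - the scalaz lens is a richer abstraction than this) ..<br />
<pre class="brush: scala">
// read through both getters; write by rebuilding the outer structure
// with the updated inner part
def composeLens[A, B, C](outer: Lens[A, B], inner: Lens[B, C]): Lens[A, C] =
  Lens[A, C](
    get = a => inner.get(outer.get(a)),
    set = (a, c) => outer.set(a, inner.set(outer.get(a), c))
  )

// e.g. a lens from ShipTo straight down to the zip code
val naiveShipToAddress = Lens[ShipTo, Address](_.address, (s, a) => s.copy(address = a))
val naiveAddressZip    = Lens[Address, String](_.zip, (a, z) => a.copy(zip = z))
val naiveShipToZip: Lens[ShipTo, String] = composeLens(naiveShipToAddress, naiveAddressZip)
</pre>
<br />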
<br />
Anyway for us mere domain modelers, we will use the <code>Lens</CODE> implementation as in scalaz .. here's a lens that helps us update the <code>OrderStatus</code> within an <code>Order</code> ..<br />
<br />
<pre class="brush: scala">val orderStatus = Lens.lensu[Order, OrderStatus] (
(o, value) => o.copy(status = value),
_.status
)</pre><br />
and use it as follows ..<br />
<br />
<pre class="brush: scala">val o = Order( .. )
orderStatus.set(o, Placed)</pre><br />
will change the <code>status</CODE> field of the <code>Order</code> to <code>Placed</code>. Let's have a look at some of the compositional properties of a lens which help us write readable code for functionally updating nested structures. <br />
<br />
<h1>Composition of Lenses</H1>First let's define some individual lenses ..<br />
<br />
<pre class="brush: scala">// lens for updating a ShipTo of an Order
val orderShipTo = Lens.lensu[Order, ShipTo] (
(o, sh) => o.copy(shipTo = sh),
_.shipTo
)
// lens for updating an address of a ShipTo
val shipToAddress = Lens.lensu[ShipTo, Address] (
(sh, add) => sh.copy(address = add),
_.address
)
// lens for updating a city of an address
val addressToCity = Lens.lensu[Address, String] (
(add, c) => add.copy(city = c),
_.city
)</pre><br />
And now we compose them to define a lens that directly updates the <code>city</CODE> of a <code>ShipTo</code> belonging to an <code>Order</code> ..<br />
<br />
<pre class="brush: scala">// compositionality FTW
def orderShipToCity = orderShipTo andThen shipToAddress andThen addressToCity</pre><br />
Now updating a <code>city</code> of a <code>ShipTo</code> in an <code>Order</code> is as simple and expressive as ..<br />
<br />
<pre class="brush: scala">val o = Order( .. )
orderShipToCity.set(o, "London")</pre><br />
The best part of using such compositional data structures is that it makes your domain model implementation readable and expressive to the users of your API. And yet your aggregate remains immutable.<br />
<br />
Let's look at another use case when the nested object is a collection. scalaz offers partial lenses that you can use for such composition. Here's an example where we build a lens that updates the value member within a <code>LineItem</code> of an <code>Order</code>. A <code>LineItem</code> is defined as ..<br />
<br />
<pre class="brush: scala">case class LineItem(item: Item, quantity: BigDecimal, value: Option[BigDecimal] = None,
discount: Option[BigDecimal] = None)
</pre><br />
and an <code>Order</code> has a collection of <code>LineItem</code>s. Let's define a lens that updates the <code>value</code> within a <code>LineItem</code> ..<br />
<br />
<pre class="brush: scala">val lineItemValue = Lens.lensu[LineItem, Option[BigDecimal]] (
(l, v) => l.copy(value = v),
_.value
)</pre><br />
and then compose it with a partial lens that helps us update a specific item within a vector. Note how we convert our <code>lineItemValue</code> lens to a partial lens using the unary operator <code>~</code> ..<br />
<br />
<pre class="brush: scala">// a lens that updates the value in a specific LineItem within an Order
def lineItemValues(i: Int) = ~lineItemValue compose vectorNthPLens(i)</pre><br />
Now we can use this composite lens to functionally update the <code>value</CODE> field of each of the items in a <code>Vector</CODE> of <code>LineItem</CODE>s using some specific business rules ..<br />
<br />
<pre class="brush: scala">(0 to lis.length - 1).foldLeft(lis) {(s, i) =>
val li = lis(i)
lineItemValues(i).set(s, unitPrice(li.item).map(_ * li.quantity)).getOrElse(s)
}</pre><br />
In this post we saw how we can handle aggregates functionally and without any in-place mutation. This keeps the model pure and helps us implement domain models that have sane behavior even in concurrent settings without any explicit use of locks and semaphores. In the next post we will take a look at how we can use such compositional structures to make the domain model speak the ubiquitous language of the domain - another pattern recommended by Eric Evans in domain driven design.<br />
<br />
http://debasishg.blogspot.com/2014/04/functional-patterns-in-domain-modeling.html[email protected] (Anonymous)2tag:blogger.com,1999:blog-22587889.post-3331080191889588519Mon, 31 Mar 2014 06:27:00 +00002014-03-31T20:18:31.101+05:30dddfunctionalscalaFunctional Patterns in Domain Modeling - The Specification PatternWhen you model a domain, you model its entities and behaviors. As Eric Evans mentions in his book <a HREF="http://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215">Domain Driven Design</A>, the focus is on the domain itself. The model that you design and implement must speak the ubiquitous language so that the essence of the domain is not lost in the myriads of incidental complexities that your implementation enforces. While being expressive the model needs to be extensible too. And when we talk about extensibility, one related attribute is compositionality. <br />
<br />
Functions compose more naturally than objects, and in this post I will use functional programming idioms to implement one of the patterns that form the core of domain driven design - the <a HREF="http://en.wikipedia.org/wiki/Specification_pattern">Specification</A> pattern, whose most common use case is to implement domain validation. Eric's book on DDD says regarding the Specification pattern ..<br />
<blockquote>It has multiple uses, but one that conveys the most basic concept is that a SPECIFICATION can test any object to see if it satisfies the specified criteria.</blockquote>A specification is defined as a predicate, whereby business rules can be combined by chaining them together using boolean logic. So there's a concept of composition and we can talk about Composite Specification when we talk about this pattern. Various literature on DDD implement this using the Composite design pattern so commonly implemented using class hierarchies and composition. In this post we will use function composition instead.<br />
<br />
<h1>Specification - Where ?</H1>One of the very common confusions that we have when we design a model is where to keep the validation code of an aggregate root or any entity, for that matter.<br />
<ul><li>Should we have the validation as part of the entity ? No, it makes the entity bloated. Also validations may vary based on some context, while the core of the entity remains the same.</LI>
<li>Should we have validations as part of the interface ? Maybe we consume JSON and build entities out of it. Indeed <i>some</I> validations can belong to the interface - don't hesitate to put them there.</LI>
<li>But the most interesting validations are those that belong to the domain layer. They are business validations (or specifications), which Eric Evans defines as something that <i>"states a constraint on the state of another object"</I>. They are business rules which the entity needs to honor in order to proceed to the next stage of processing.</LI> </UL>We consider the following simple example. We take an <code>Order</CODE> entity and the model identifies the following domain "specifications" that a new <code>Order</CODE> must satisfy before being thrown into the processing pipeline:<br />
<br />
<ol><li>it must be a <i>valid</I> order obeying the constraints that the domain requires e.g. valid date, valid no of line items etc.</LI>
<li>it must be <i>approved</I> by a valid approving authority - only then it proceeds to the next stage of the pipeline</LI>
<li>customer status check must be passed to ensure that the customer is not black-listed</LI>
<li>the line items from the order must be checked against inventory to see if the order can be fulfilled</LI> </OL>These are the separate steps that are to be done in sequence by the order processing pipeline as pre-order checks before the actual order is ready for fulfilment. A failure in any of them takes the order out of the pipeline and the process stops there. So the model that we will design needs to honor the sequence as well as check all constraints that need to be done as part of every step.<br />
<br />
An important point to note here is that none of the above steps mutate the order - so every specification gets a copy of the original <code>Order</CODE> object as input, on which it checks some domain rules and determines if it's suitable to be passed to the next step of the pipeline.<br />
<br />
<h1>Jumping on to the implementation ..</H1>Let's take down some implementation notes from what we learnt above ..<br />
<br />
<ul><li>The <code>Order</CODE> can be an immutable entity at least for this sequence of operations</LI>
<li>Every specification needs an order - can we pull some trick out of our hat which prevents cluttering the API by passing an <code>Order</CODE> instance to every specification in the sequence ?</LI>
<li>Since we plan to use functional programming principles, how can we model the above sequence as an <i>expression</I> so that our final result still remains composable with the next process of order fulfilment (which we will discuss in a future post) ?</LI>
<li>All these functions look like having similar signatures - we need to make them compose with each other</LI> </UL>Before I present more of any explanation or theory, here are the basic building blocks which will implement the notes that we took after talking to the domain experts ..<br />
<br />
<pre class="brush: scala">type ValidationStatus[S] = \/[String, S]
type ReaderTStatus[A, S] = ReaderT[ValidationStatus, A, S]
object ReaderTStatus extends KleisliInstances with KleisliFunctions {
def apply[A, S](f: A => ValidationStatus[S]): ReaderTStatus[A, S] = kleisli(f)
}</pre><br />
<code>ValidationStatus</CODE> defines the type that we will return from each of the functions. It's either some status <code>S</CODE> or an error string that explains what went wrong. It's actually an <code>Either</CODE> type (right biased) as implemented in <a HREF="https://github.com/scalaz/scalaz">scalaz</A>.<br />
<br />
One of the things which we thought would be cool is to avoid repeating the <code>Order</CODE> parameter for every method when we invoke the sequence. And one of the idiomatic ways of doing it is to use the Reader monad. But here we already have a monad - <code>\/</CODE> is a monad. So we need to stack them together using a monad transformer. <code>ReaderT</CODE> does this job and <code>ReaderTStatus</CODE> defines the type that somehow makes our life easier by combining the two of them.<br />
<br />
The next step is an implementation of <code>ReaderTStatus</CODE>, which we do in terms of another abstraction called <code>Kleisli</CODE>. We will use the scalaz library for this, which implements <code>ReaderT</CODE> in terms of <code>Kleisli</CODE>. I will not go into the details of this implementation - in case you are curious, refer to this excellent <a HREF="http://eed3si9n.com/learning-scalaz/Composing+monadic+functions.html">piece</A> by Eugene.<br />
<br />
So, how does one sample specification look like ?<br />
<br />
Before going into that, here are some basic abstractions, grossly simplified only for illustration purposes ..<br />
<br />
<pre class="brush: scala">// the base abstraction
sealed trait Item {
def itemCode: String
}
// sample implementations
case class ItemA(itemCode: String, desc: Option[String],
minPurchaseUnit: Int) extends Item
case class ItemB(itemCode: String, desc: Option[String],
nutritionInfo: String) extends Item
case class LineItem(item: Item, quantity: Int)
case class Customer(custId: String, name: String, category: Int)
// a skeleton order
case class Order(orderNo: String, orderDate: Date, customer: Customer,
lineItems: List[LineItem])</pre><br />
And here's a specification that checks some of the constraints on the <code>Order</CODE> object ..<br />
<br />
<pre class="brush: scala">// a basic validation
private def validate = ReaderTStatus[Order, Boolean] {order =>
if (order.lineItems isEmpty) left(s"Validation failed for order $order")
else right(true)
}</pre><br />
It's just for illustration and does not contain many domain rules. The important part is how we use the above defined types to implement the function. <code>Order</CODE> is not an explicit argument to the function - it's curried. The function returns a <code>ReaderTStatus</CODE>, which itself is a monad and hence allows us to sequence in the pipeline with other specifications. So we get the requirement of sequencing without breaking out of the expression oriented programming style.<br />
<br />
Here are a few other specifications based on the domain knowledge that we have gathered ..<br />
<br />
<pre class="brush: scala">private def approve = ReaderTStatus[Order, Boolean] {order =>
right(true)
}
private def checkCustomerStatus(customer: Customer) = ReaderTStatus[Order, Boolean] {order =>
right(true)
}
private def checkInventory = ReaderTStatus[Order, Boolean] {order =>
right(true)
}</pre><br />
<h1>Wiring them together</H1>But how do we wire these pieces together so that we have the sequence of operations that the domain mandates and yet retain all the goodness of compositionality in our model ? It's actually quite easy since we have already done the hard work of defining the appropriate types that compose ..<br />
<br />
Here's the <code>isReadyForFulfilment</code> method that defines the composite specification and invokes all the individual specifications in sequence using for-comprehension, which, as you all know does the monadic bind in Scala and gives us the final expression that needs to be evaluated for the <code>Order</code> supplied.<br />
<br />
<pre class="brush: scala">def isReadyForFulfilment(order: Order) = {
val s = for {
_ <- validate
_ <- approve
_ <- checkCustomerStatus(order.customer)
c <- checkInventory
} yield c
s(order)
}</pre>
<br />
So we have the monadic bind implement the sequencing without breaking the compositionality of the abstractions. In the next instalment we will see how this can be composed with the downstream processing of the order that will not only read stuff from the entity but mutate it too, of course in a functional way.http://debasishg.blogspot.com/2014/03/functional-patterns-in-domain-modeling.html[email protected] (Anonymous)15tag:blogger.com,1999:blog-22587889.post-6027103926883320551Wed, 22 Jan 2014 19:46:00 +00002014-01-23T01:17:17.869+05:30count-min-sketcheventsourcingA Sketch as the Query Model of an EventSourced SystemIn my last <a href="http://debasishg.blogspot.in/2014/01/count-min-sketch-data-structure-for.html">post</a> I discussed the count-min sketch data structure that can be used to process data streams using sub-linear space. In this post I will continue with some of my thoughts on how count-min sketches can be used in a typical event sourced application architecture. An event sourcing system typically has a query model which provides a read only view of how all the events are <i>folded</i> to provide a coherent view of the system. I have seen applications where the query model is typically rendered from a relational database. And the queries can take a lot of time to be successfully processed and displayed to the user if the data volume is huge. And when we are talking about Big Data, this is not a very uncommon use case.<br />
<br />
Instead of rendering the query from the RDBMS, quite a few types of them can be rendered from a count-min sketch using sub-linear space. Consider the use case where you need to report the highest occurring user-ids in a Twitter stream. The stream is continuous, huge and non ending and you get to see each item once. So from each item you parse out the user-id occurring in it and update the sketch. Each entry of the sketch then contains the combined frequency of the user-ids that hash to that slot. And we can take the minimum of all the slots to which a user-id hashes, in order to get the (approximate) frequency of that user-id. The details of how this works can be found in my last post.<br />
<br />
Consider the case where we need to find the heavy-hitters - those user-ids whose frequency exceeds a pre-determined threshold. For that, in addition to the sketch we can also maintain a data structure like a heap or a tree where we update the top-k heavy hitters. When a user-id appears, we update the sketch, get its estimated frequency from the sketch and, if it exceeds the threshold, also record it in the data structure. So at any point in time we can probe this accessory data structure to get the current heavy-hitters. Spark <a href="https://github.com/apache/incubator-spark/blob/master/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala">examples</a> contain a sample implementation of this heavy hitters query from a Twitter stream using the CountMinSketchMonoid of <a href="https://github.com/twitter/algebird">Algebird</a>.<br />
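Here's a rough sketch in Scala of that auxiliary-structure idea. The <code>FreqSketch</code> interface below is hypothetical - it just stands in for a count-min sketch with an add and an estimate operation, and is not Algebird's API ..<br />
<pre class="brush: scala">
// hypothetical frequency sketch interface - think count-min sketch underneath
trait FreqSketch {
  def add(item: String): FreqSketch   // returns the updated sketch
  def estimate(item: String): Long    // approximate frequency (never an undercount)
}

final case class HeavyHitters(sketch: FreqSketch, threshold: Long,
                              hitters: Map[String, Long] = Map.empty) {
  // update the sketch on every user-id and remember it only if its
  // estimated frequency has crossed the threshold
  def observe(userId: String): HeavyHitters = {
    val s = sketch.add(userId)
    val f = s.estimate(userId)
    val h = if (f >= threshold) hitters.updated(userId, f) else hitters
    copy(sketch = s, hitters = h)
  }

  // the query model: current heavy hitters, ready to be rendered on a dashboard
  def topK(k: Int): Seq[(String, Long)] =
    hitters.toSeq.sortBy { case (_, freq) => -freq }.take(k)
}
</pre>
A heap or a tree bounded at k entries, as mentioned above, would be a more frugal choice than the unbounded <code>Map</code> used here - this is only to show the shape of the interaction.<br />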
<br />
Can this be a viable approach of implementing the query model in an event sourced system if the use case fits the approximation query approach ? It can be faster, relatively cheap in space and can prove to be responsive enough to be displayed in dashboards in the form of charts or graphs.http://debasishg.blogspot.com/2014/01/a-sketch-as-query-model-of-eventsourced.html[email protected] (Anonymous)3tag:blogger.com,1999:blog-22587889.post-2689121936470253763Sun, 19 Jan 2014 17:48:00 +00002014-01-21T17:37:18.123+05:30count-min-sketchstream-processingCount-Min Sketch - A Data Structure for Stream Mining ApplicationsIn today's age of Big Data, streaming is one of the techniques for low latency computing. Besides the batch processing infrastructure of map/reduce paradigm, we are seeing a plethora of ways in which streaming data is processed at near real time to cater to some specific kinds of applications. Libraries like <a href="http://storm-project.net">Storm</a>, <a href="http://samza.incubator.apache.org">Samza</a> and <a href="https://www.cs.duke.edu/~kmoses/cps516/dstream.html">Spark</a> belong to this genre and are starting to get their share of user base in the industry today.<br />
<br />
This post is not about Spark, Storm or Samza. It's about a data structure which is one of the relatively new entrants in the domain of stream processing - simple to implement, yet already proving to be of immense use in serving a certain class of queries over huge streams of data. I have been doing some reading on the application of such structures and thought of sharing it with the readers of my blog.<br />
<br />
<h1>Using Sublinear Space</h1><br />
Besides data processing, these tools also support data mining over streams, which includes serving specialized queries over data using limited space and time. Ok, so once we store all data as they come we can always serve queries with O(n) space. But since we are talking about huge data streams, it may not even be possible to run algorithms on the full set of data - it simply will be too expensive. Even if we have the entire set of data in a data warehouse, the processing of the entire data set may take time and consume resources that we cannot afford to have, considering the fees charged under the evolving models of using platform-as-a-service within cloud based infrastructure. Also, since these algorithms will be working on data streams, there's a high likelihood that they will get to see the data only in a single pass. The bottom line is that we need algorithms that work in sub-linear space.<br />
<br />
Working in sublinear space implies that we don't get to store or see all the data - hence an obvious conclusion is that we also cannot deliver exact answers to some queries. We rely on approximation techniques and deliver answers with a reasonably high probability bound on their accuracy. We don't store all the data; instead we store a lossy compressed representation of the data and serve user queries from this summary instead of the entire set.<br />
<br />
One widely used technique for storing a subset of data is through Random Sampling, where the data stored is selected through some stochastic mechanism. There are various ways to determine which data we select for storing and how we build the estimator for querying the data. There are pros and cons with this approach, but it's one of the simplest ways to do approximation based queries on streaming data.<br />
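The classic example of this approach is reservoir sampling (Vitter's Algorithm R), which maintains a uniform random sample of fixed size <code>k</code> from a stream of unknown length while seeing every element exactly once - here's a small Scala sketch of it ..<br />
<pre class="brush: scala">
import scala.util.Random

// after n elements have streamed past, every one of them has probability k/n
// of being in the reservoir - and we never use more than O(k) space
def reservoirSample[A](stream: Iterator[A], k: Int, rng: Random = new Random): Vector[A] = {
  val reservoir = scala.collection.mutable.ArrayBuffer.empty[A]
  var seen = 0L
  stream.foreach { elem =>
    seen += 1
    if (reservoir.size < k) reservoir += elem
    else {
      val j = (rng.nextDouble() * seen).toLong  // uniform in [0, seen)
      if (j < k) reservoir(j.toInt) = elem
    }
  }
  reservoir.toVector
}

// a uniform sample of 1000 items from a stream we get to see only once
// val sample = reservoirSample(hugeStream, 1000)
</pre>
<br />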
<br />
There are a few other options like Histograms and Wavelet based synopses. But one of the most interesting data structures that have been developed in recent times is the <i>Sketch</i>, which uses summary based techniques for delivering approximate answers, gets around the typical problems that sampling techniques have and is highly parallelizable in practice.<br />
<br />
An important class of sketch is one where the sketch vector (which is the summary information) is a linear transform of the input vector. So if we model the input as a vector we can multiply it by a sketch matrix to obtain the sketch vector that contains the synopses data that we can use for serving approximation queries. Here's a diagrammatic representation of the sketch as a linear transform of the input data.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3FLkp6QwHACdnmy4ubEg8uhyXEJ18ox3V5swlL9SabCq4uTgd56eMgu2Wcoz4ULnh1zI6_svHX8MNggzv2i49SVYM60qfkpJnUBpadfChQLF5cuBNZE9jRevgDvW2Bn9EkWnXSQ/s1600/cm.002.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3FLkp6QwHACdnmy4ubEg8uhyXEJ18ox3V5swlL9SabCq4uTgd56eMgu2Wcoz4ULnh1zI6_svHX8MNggzv2i49SVYM60qfkpJnUBpadfChQLF5cuBNZE9jRevgDvW2Bn9EkWnXSQ/s640/cm.002.png" /></a></div><br />
<br />
<h1>Count-Min Sketch</h1><br />
One of the most popular forms of the sketch data structure is the <a href="http://dimacs.rutgers.edu/%7Egraham/pubs/papers/cm-full.pdf">Count-Min Sketch</a> introduced by Muthukrishnan and Cormode in 2003. The idea is quite simple and the data structure is based on probabilistic algorithms to serve various types of queries on streaming data. The data structure is parameterized by two factors - ε and δ, where the error in answering a query is within an additive factor of ε (relative to the size of the stream) with probability at least 1 - δ. So you can tune these parameters based on the space that you can afford and trade off accordingly the accuracy of the results that the data structure can serve you.<br />
<br />
Consider this situation where you have a stream of data (typically modeled as a vector) like updates to stock quotes in a financial processing system arriving continuously that you need to process and report statistical queries on a real time basis. <br />
<br />
<li>We model the data stream as a vector <tt>a[1 .. n]</tt> and the updates received at time <tt>t</tt> are of the form (i<sub>t</sub>, c<sub>t</sub>) which means that the stock quote for <tt>a[i<sub>t</sub>]</tt> has been incremented by c<sub>t</sub>. There are various models in which this update can appear as discussed in <a href="http://www.amazon.com/Data-Streams-Applications-Foundations-Theoretical/dp/193301914X">Data Streams: Algorithms and Applications</a> by Muthukrishnan which includes negative updates as well and the data structure can be tuned to handle each of these variants.</li><br />
<br />
<li>The core of the data structure is a 2 dimensional array <tt>count[d, w]</tt> (<tt>d</tt> rows, one per hash function, and <tt>w</tt> columns) that stores the synopses of the original vector and which is used to report results of queries using approximation techniques. Hence the total space requirement of the data structure is <tt>(w * d)</tt>. We can bound each of <tt>w</tt> and <tt>d</tt> in terms of our parameters ε and δ (typically <tt>w = ⌈e/ε⌉</tt> and <tt>d = ⌈ln(1/δ)⌉</tt>) and control the level of accuracy that we want our data structure to serve.</li><br />
<br />
<li>The data structure uses hashing techniques to process these updates and report queries using sublinear space. So assume we have d <a href="http://en.wikipedia.org/wiki/Universal_hashing">pairwise-independent</a> hash functions <tt>{h<sub>1</sub> .. h<sub>d</sub>}</tt> that hash each of our inputs to the range <tt>(1 .. w)</tt>. Just for the more curious mind, pairwise independence is a method to construct a universal hash family, a technique that ensures lower number of collisions in the hash implementation.</li><br />
<br />
<li>When an update (i<sub>t</sub>, c<sub>t</sub>) comes for the stream, we hash the index <tt>i<sub>t</sub></tt> through each of the hash functions <tt>h<sub>1</sub> .. h<sub>d</sub></tt> and increment, in each of the <tt>d</tt> rows, the single entry that the corresponding hash function maps it to. </li><br />
<br />
<tt><br />
for i = 1 to d<br />
v = h(i)(i<sub>t</sub>) // v is between 1 and w<br />
count[i, v] += c<sub>t</sub> // increment the cell count by c<sub>t</sub><br />
end<br />
</tt><br />
At any point in time if we want to know the approximate value of an element <tt>a[i]</tt> of the vector <tt>a</tt>, we can get it by computing the minimum of the values in the <tt>d</tt> cells of <tt>count</tt> that <tt>i</tt> hashes to. This can be proved formally. But the general intuition is that since we are using hash functions there's always a possibility of multiple <tt>i</tt>'s colliding on to the same cell and contributing additively to the value of the cell. Hence the minimum among these cell values is the closest candidate to give us the correct result for the query - and with non-negative updates it can only overestimate, never underestimate.<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-X05n34HwtMHOxjJSrpd2LE7H8GwsiDo_flQOoAFUJWnWLXiyIS2tWEjIzXeOJzkRZcfREJ7HDD6BSILA9M3rEkUXGeKX3anuW4dGiQwjBzZDLHJwGSwXK3uuZUMgCTTqT-wgnQ/s1600/cm.001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-X05n34HwtMHOxjJSrpd2LE7H8GwsiDo_flQOoAFUJWnWLXiyIS2tWEjIzXeOJzkRZcfREJ7HDD6BSILA9M3rEkUXGeKX3anuW4dGiQwjBzZDLHJwGSwXK3uuZUMgCTTqT-wgnQ/s640/cm.001.png" /></a></div><br />
The figure above shows the processing of the updates in a Count-Min sketch. This is typically called the Point Query, which returns an approximation of <tt>a[i]</tt>. Similarly we can use a Count-Min sketch to serve approximate range queries, which are typically summations over multiple point queries. Another interesting application is to serve inner product queries, where the data structure is used to estimate the inner product of 2 vectors - a typical application of this being the estimation of join sizes in relational query processing. The paper <a href="http://www.cise.ufl.edu/%7Eadobra/Publications/sigmod-2007-statistics.pdf">Statistical Analysis of Sketch Estimators</a> gives all the details of how to use sketching as a technique for this. <br />
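To make the structure concrete, here's a minimal and unoptimized Count-Min sketch in Scala - a sketch of the idea of my own, not a production implementation like the ones mentioned below. In particular I simulate the d hash functions by seeding a single hash differently, which is a simplification of the pairwise-independent family that the paper uses ..<br />
<pre class="brush: scala">
import scala.util.hashing.MurmurHash3

// width w ~ ceil(e / eps) and depth d ~ ceil(ln(1 / delta)), as in the paper
final case class CountMinSketch(width: Int, depth: Int, counts: Vector[Vector[Long]]) {

  private def bucket(item: String, row: Int): Int =
    (MurmurHash3.stringHash(item, row) & Int.MaxValue) % width

  // process an update (item, count): exactly one cell per row gets incremented
  def add(item: String, count: Long = 1L): CountMinSketch = {
    val updated = (0 until depth).foldLeft(counts) { (cs, row) =>
      val col = bucket(item, row)
      cs.updated(row, cs(row).updated(col, cs(row)(col) + count))
    }
    copy(counts = updated)
  }

  // point query: the minimum over the d cells the item hashes to
  def estimate(item: String): Long =
    (0 until depth).map(row => counts(row)(bucket(item, row))).min

  // sketches of the same shape merge by cell-wise addition - this is what makes
  // them monoids and hence easy to parallelize across stream partitions
  def merge(that: CountMinSketch): CountMinSketch = {
    require(width == that.width && depth == that.depth, "incompatible sketches")
    copy(counts = counts.zip(that.counts).map { case (r1, r2) =>
      r1.zip(r2).map { case (a, b) => a + b }
    })
  }
}

object CountMinSketch {
  def apply(eps: Double, delta: Double): CountMinSketch = {
    val w = math.ceil(math.E / eps).toInt
    val d = math.ceil(math.log(1.0 / delta)).toInt
    CountMinSketch(w, d, Vector.fill(d)(Vector.fill(w)(0L)))
  }
}
</pre>
Two sketches built independently on two partitions of the stream can be merged and will give the same estimates as one sketch built over the whole stream - which is exactly the monoidal property discussed next.<br />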
<br />
Count-Min sketches have some great properties which make them a very useful data structure when processing distributed streams. They merge associatively and can be modelled as monoids, and hence perform extremely well in a distributed environment where you can parallelize sketch operations. In a future post I will discuss some implementation techniques and how we can use count-min sketches to serve some useful applications over data streams. Meanwhile Twitter's <a href="https://github.com/twitter/algebird">algebird</a> and ClearSpring's <a href="https://github.com/addthis/stream-lib">stream-lib</a> offer implementations of the Count-Min sketch and various other data structures applicable for stream mining applications.<br />
<br />
<br />
<br />
http://debasishg.blogspot.com/2014/01/count-min-sketch-data-structure-for.html[email protected] (Anonymous)6tag:blogger.com,1999:blog-22587889.post-2348936043843108242Wed, 24 Jul 2013 04:37:00 +00002013-07-24T10:07:39.278+05:30akkaredisscalaScala Redis client goes non blocking : uses Akka IO<div dir="ltr" style="text-align: left;" trbidi="on">
scala-redis is getting a <a href="https://github.com/debasishg/scala-redis-nb">new non blocking version</a> based on a kernel implemented with the <a href="http://doc.akka.io/docs/akka/snapshot/scala/io.html">new Akka IO</a>. The result is that all APIs are non blocking and return a <code>Future</code>. We are trying to keep the API as close to the blocking version as possible. But bits rot and some of the technical debt needs to be repaid. We have cleaned up some of the return types which had unnecessary <code>Option[]</code>
wrappers, made some improvements and standardizations on the API type signatures, and are also working on making the byte array manipulation faster using <code>akka.util.ByteString</code> at the implementation level. We also have plans of using the Akka IO pipeline for abstracting the various stages of handling the Redis protocol request and response.
<br />
<br />
As of today we have quite a bit ready for review by the users. The APIs may change a bit here and there, but the core APIs are up there. There are a few areas which have not yet been implemented like PubSub or clustering. Stay tuned for more updates on this blog ..
Here are a few code snippets that demonstrate the usage of the APIs ..
<br />
<br />
<h3>
Non blocking get/set</h3>
<pre class="brush: scala">@volatile var callbackExecuted = false
val ks = (1 to 10).map(i => s"client_key_$i")
val kvs = ks.zip(1 to 10)
val sets: Seq[Future[Boolean]] = kvs map {
case (k, v) => client.set(k, v)
}
val setResult = Future.sequence(sets) map { r: Seq[Boolean] =>
callbackExecuted = true
r
}
callbackExecuted should be (false)
setResult.futureValue should contain only (true)
callbackExecuted should be (true)
callbackExecuted = false
val gets: Seq[Future[Option[Long]]] = ks.map { k => client.get[Long](k) }
val getResult = Future.sequence(gets).map { rs =>
callbackExecuted = true
rs.flatten.sum
}
callbackExecuted should be (false)
getResult.futureValue should equal (55)
callbackExecuted should be (true)
</pre>
<h3>
Composing through sequential combinators</h3>
<pre class="brush: scala">val key = "client_key_seq"
val values = (1 to 100).toList
val pushResult = client.lpush(key, 0, values:_*)
val getResult = client.lrange[Long](key, 0, -1)
val res = for {
p <- pushResult.mapTo[Long]
if p > 0
r <- getResult.mapTo[List[Long]]
} yield (p, r)
val (count, list) = res.futureValue
count should equal (101)
list.reverse should equal (0 to 100)
</pre>
<h3>
Error handling using Promise Failure</h3>
<pre class="brush: scala">val key = "client_err"
val v = client.set(key, "value200")
v.futureValue should be (true)
val x = client.lpush(key, 1200)
val thrown = evaluating { x.futureValue } should produce [TestFailedException]
thrown.getCause.getMessage should equal ("ERR Operation against a key holding the wrong kind of value")
</pre>
Feedback is welcome, especially on the APIs and their usage. All the code is on Github with all tests in the test folder. Jisoo Park (<a href="twitter.com/guersam">@guersam</a>) has been doing an awesome job contributing a lot to all the goodness that's there in the repo. Thanks a lot for all the help ..
</div>
http://debasishg.blogspot.com/2013/07/scala-redis-client-goes-non-blocking.html[email protected] (Anonymous)6tag:blogger.com,1999:blog-22587889.post-4047485455524974789Mon, 22 Jul 2013 05:44:00 +00002013-07-22T11:14:13.005+05:30clojurelispracketThe Realm of Racket is an enjoyable read<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSeiFDDvqZIy2B7n94YdC3umTXF_yb_XO_M75YZjiD7Z8uVIqiyYkSCxAWzflm2vtVfumb9OP_yS-jWjkjQ9OeKNuqZPI-nGYkqLKYJM5bN9zN7ZHk2QWFYz62gsR4tlDtuE61wA/s1600/81oh1EuOHRL._SL1500_.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSeiFDDvqZIy2B7n94YdC3umTXF_yb_XO_M75YZjiD7Z8uVIqiyYkSCxAWzflm2vtVfumb9OP_yS-jWjkjQ9OeKNuqZPI-nGYkqLKYJM5bN9zN7ZHk2QWFYz62gsR4tlDtuE61wA/s320/81oh1EuOHRL._SL1500_.jpg" width="242" /></a></div>
<br />
There are many ways to write a programming language book. You can start introducing the syntax and semantics of the language in a naturally comprehensible sequence of complexity and usage. Or you can choose to introduce the various features of the language with real world examples using the standard library that the language offers. IIRC <a href="http://www.amazon.com/Accelerated-C-Practical-Programming-Example/dp/020170353X">Accelerated C++</a> by Andrew Koenig and Barbara Moo takes this route. I really loved this approach and enjoyed reading the book.<br />
<br />
Of course Matthias Felleisen is known for a third way of teaching a language - the fun way. <a href="http://www.amazon.com/The-Little-Schemer-4th-Edition/dp/0262560992/">The Little Schemer</a> and <a href="http://www.amazon.com/The-Seasoned-Schemer-Daniel-Friedman/dp/026256100X/">The Seasoned Schemer</a> have introduced a novel way of learning a language. <a href="http://realmofracket.com/">The Realm of Racket</a> follows a similar style of teaching the latest descendant of Lisp, one game at a time. The implementation of every game introduces the idioms and language features with increasing degrees of complexity. There's a nice progression which helps understanding the complex features of the language building upon the already acquired knowledge of the simpler ones in earlier chapters.<br />
<br />
The book begins with a history of the Racket programming language and how it evolved as a descendant of Scheme, how it makes programming fun and how it can be used successfully as an introductory language to students aspiring to learn programming. Then it starts with Getting Started with Dr Racket for the impatient programmer and explains the IDE that will serve as your companion for the entire duration of your playing around with the book.<br />
<br />
Every chapter introduces some of the new language features and either develops a new game or builds on some improvement of a game developed earlier. This not only demonstrates a real world usage of the syntax and semantics of the language but makes the programmer aware of how the various features interact as a whole to build complex abstractions out of simpler ones. The book also takes great pains to defer the complexity of the features to the right point so that the reader is not burdened upfront. For example, lambdas are introduced only when the authors have introduced all basics of programming with functions and recursion. Mutants are introduced only after teaching the virtues of immutability. For loops and comprehensions appear only when the book has introduced all list processing functions like folds, maps and filters. And then the book goes into great depth explaining why the language has so many variants of the for loop like <span style="font-family: Courier New, Courier, monospace;">for/list</span>, <span style="font-family: Courier New, Courier, monospace;">for/fold</span>, <span style="font-family: Courier New, Courier, monospace;">for*</span>, <span style="font-family: Courier New, Courier, monospace;">for/first</span>, <span style="font-family: Courier New, Courier, monospace;">for/last</span> etc. In this entire discussion of list processing, for loops etc., I would have loved to see a more detailed discussion on <span style="font-family: Courier New, Courier, monospace;">sequence</span> in the book. Sequence abstracts a large number of data types, but, much like Clojure, it introduces a new way of API design - a single <span style="font-family: Courier New, Courier, monospace;">sequence</span> to rule them all. API designers would surely like to have more of this sauce as part of their repertoire. Racket's uniform way of handling <span style="font-family: Courier New, Courier, monospace;">sequence</span> is definitely a potent model of abstraction as compared to Scheme or other versions of Lisp.<br />
<br />
The games developed progress in complexity and we can see the powers of the language being put to great use when the authors introduce <i>lazy evaluation</i> and <i>memoized computations</i> and use them to improve the Dice of Doom. Then the authors introduce distributed game development which is the final frontier that the book covers. It's truly an enjoyable ride through the entire series.<br />
<br />
The concluding chapter talks about some of the advanced features like classes, objects and meta-programming. Any Lisp book will be incomplete without a discussion of macros and language development. But I think the authors have done well to defer these features till the end. Considering that this is a book for beginners learning the language, this seems like a sane measure.<br />
<br />
However, as a programmer experienced in other languages and wanting to take a look at Racket, I would have loved to see some coverage of testing. Introducing a bit on testing practices, maybe a unit testing library, would have made the book more complete.<br />
<br />
The style of writing of this book has an underlying tone of humor and simplicity, which makes it a thoroughly enjoyable read. The use of illustrations and comics takes away the monotony of learning the prosaics of a language. And the fact that Racket is a simple enough language makes this combination with pictures very refreshing.<br />
<br />
On the whole, as a book introducing a language, The Realm of Racket is a fun read. I enjoyed reading it a lot and recommend it without reservations for your bookshelf.<br />
<br />
<br />
<br /></div>
http://debasishg.blogspot.com/2013/07/the-realm-of-racket-is-enjoyable-read.html[email protected] (Anonymous)3tag:blogger.com,1999:blog-22587889.post-2712980585988306755Mon, 03 Jun 2013 06:40:00 +00002013-06-03T12:10:47.564+05:30dslfunctionalscalascalazEndo is the new fluent APII tweeted this over the weekend ..
<blockquote class="twitter-tweet">
<p>
a good title for a possible blog post .. endo is the new fluent API ..</p>
— Debasish Ghosh (@debasishg) <a href="https://twitter.com/debasishg/status/340885372735193089">June 1, 2013</a></blockquote>
My last two blog posts have been about endomorphisms and how they combine with the other functional structures to help you write expressive and composable code. In <a HREF="http://debasishg.blogspot.in/2013/02/a-dsl-with-endo-monoids-for-free.html">A DSL with an Endo - monoids for free</A>, endos play with the Writer monad and implement a DSL for a sequence of activities through monoidal composition. And in <a HREF="http://debasishg.blogspot.in/2013/03/an-exercise-in-refactoring-playing.html">An exercise in Refactoring - Playing around with Monoids and Endomorphisms</A>, I discuss a refactoring exercise that exploits the monoid of an endo to make composition easier.
Endomorphisms help you lift your computation into a data type that gives you an instance of a monoid. And the <code>mappend</code> operation of the monoid is the function composition. Hence once you have the <code>Endo</code> for your type defined, you get a nice declarative syntax for the operations that you want to compose, resulting in a fluent API.
Just a quick recap .. endomorphisms are functions that map a type on to itself and form a monoid under composition. Given an endomorphism we can define an implicit monoid instance ..
<pre class="brush: scala">implicit def endoInstance[A]: Monoid[Endo[A]] = new Monoid[Endo[A]] {
def append(f1: Endo[A], f2: => Endo[A]) = f1 compose f2
def zero = Endo.idEndo
}
</pre>
I am not going into the details of this, which I discussed at length in my earlier posts. In this article I will sum up with yet another use case for making fluent APIs using the monoid instance of an Endo. Consider an example from the domain of securities trading, where a security trade goes through a sequence of transformations in its lifecycle through the trading process ..
Here's a typical <code>Trade</CODE> model (very much simplified for demonstration) ..
<pre class="brush: scala">sealed trait Instrument
case class Security(isin: String, name: String) extends Instrument
case class Trade(refNo: String, tradeDate: Date, valueDate: Option[Date] = None,
ins: Instrument, principal: BigDecimal, net: Option[BigDecimal] = None,
status: TradeStatus = CREATED)
</pre>
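The <code>Trade</CODE> model above also refers to a <code>TradeStatus</CODE> and the status values used later (<code>CREATED</code>, <code>VALIDATED</code> etc.) which are not shown here. A minimal sketch of one possible encoding (purely illustrative) ..
<pre class="brush: scala">sealed trait TradeStatus
case object CREATED extends TradeStatus
case object VALIDATED extends TradeStatus
case object VALUE_DATE_ADDED extends TradeStatus
case object ENRICHED extends TradeStatus
case object FINALIZED extends TradeStatus
</pre>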
Modeling a typical lifecycle of a trade is complex. But for illustration, let's consider these simple ones which need to be executed on a trade in sequence ..
<ol>
<li>Validate the trade</LI>
<li>Assign value date to the trade, which will ideally be the settlement date</LI>
<li>Enrich the trade with tax/fees and net trade value</LI>
<li>Journalize the trade in books</LI>
</OL>Each of the functions takes a <code>Trade</CODE> and returns a copy of the <code>Trade</CODE> with some attributes modified. A naive way of doing that will be as follows .. <pre class="brush: scala">def validate(t: Trade): Trade = //..
def addValueDate(t: Trade): Trade = //..
def enrich(t: Trade): Trade = //..
def journalize(t: Trade): Trade = //..
</pre>
and invoke these methods in sequence while modeling the lifecycle. Instead we try to make it more composable and lift the function <code>Trade => Trade</CODE> within the <code>Endo</CODE> .. <pre class="brush: scala">type TradeLifecycle = Endo[Trade]
</pre>
and here's the implementation .. <pre class="brush: scala">// validate the trade: business logic elided
def validate: TradeLifecycle =
((t: Trade) => t.copy(status = VALIDATED)).endo
// add value date to the trade (for settlement)
def addValueDate: TradeLifecycle =
((t: Trade) => t.copy(valueDate = Some(t.tradeDate), status = VALUE_DATE_ADDED)).endo
// enrich the trade: add taxes and compute net value: business logic elided
def enrich: TradeLifecycle =
((t: Trade) => t.copy(net = Some(t.principal + 100), status = ENRICHED)).endo
// journalize the trade into book: business logic elided
def journalize: TradeLifecycle =
((t: Trade) => t.copy(status = FINALIZED)).endo
</pre>
Now endo has an instance of <code>Monoid</CODE> defined by scalaz and the <code>mappend</CODE> of <code>Endo</CODE> is function composition .. Hence here's our lifecycle model using the holy monoid of endo .. <pre class="brush: scala">def doTrade(t: Trade) =
(journalize |+| enrich |+| addValueDate |+| validate).apply(t)
</pre>
It's almost the specification that we listed above in numbered bullets. Note the inside out sequence that's required for the composition to take place in proper order.
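Since the <code>append</code> of the <code>Endo</code> monoid is function composition, the expression above is just a nest of function applications - a quick expansion makes the evaluation order explicit (the names <code>lifecycle</code> and <code>doTradeExpanded</code> are used here only for illustration) ..
<pre class="brush: scala">// append on Endo is compose, hence the composition reads inside out
val lifecycle: TradeLifecycle = journalize |+| enrich |+| addValueDate |+| validate

// (lifecycle run t) expands to
// journalize.run(enrich.run(addValueDate.run(validate.run(t))))
// i.e. validate runs first and journalize runs last
def doTradeExpanded(t: Trade): Trade = lifecycle run t
</pre>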
<br/><br/><h3>Why not plain old composition ?</H3>
<br/>
A valid question. The reason - abstraction. Abstracting the composition within types helps you compose the result with other types, as we saw in my earlier blog posts. In one of them we built larger abstractions using the Writer monad with <code>Endo</code> and in the other we used the <code>mzero</code> of the monoid as a fallback during composition thereby avoiding any special case branch statements.
<br/><br/><h3>
One size doesn't fit all .. </H3>
<br/>
The endo and its monoid compose beautifully and give us a domain friendly syntax that expresses the business functionality in a nice succinct way. But it's not a pattern which you can apply everywhere you need to compose a bunch of domain behaviors. Like every idiom, it has its shortcomings and you need different sets of solutions in your repertoire. For example the above solution doesn't handle any of the domain exceptions - what if the validation fails ? With the above strategy the only way you can handle this situation is to throw exceptions from the validate function. But exceptions are side-effects and in functional programming there are cleaner ways to tame the evil. And for that you need different patterns in practice. More on that in subsequent posts ..http://debasishg.blogspot.com/2013/06/endo-is-new-fluent-api.html[email protected] (Anonymous)2tag:blogger.com,1999:blog-22587889.post-8500761222734440572Mon, 04 Mar 2013 05:30:00 +00002013-03-04T11:00:02.367+05:30An exercise in Refactoring - Playing around with Monoids and EndomorphismsA language is powerful when it offers sufficient building blocks for library design and adequate syntactic sugar that helps build expressive syntax on top of the lower level APIs that the library publishes. In this post I will discuss an exercise in refactoring while trying to raise the level of abstraction of a modeling problem. <br />
<br />
Consider the following modeling problem that I recently discussed in one of the Scala training sessions. It's simple but offers ample opportunities to explore how we can raise the level of abstraction in designing the solution model. We will start with an imperative solution and then incrementally work on raising the level of abstraction to make the final code functional and more composable.<br />
<br />
<b><i>A Problem of Transformation ..</I></B><br />
<br />
The problem is to compute the salary of a person through composition of multiple salary components calculated based on some percentage of other components. It's a problem of applying repeated transformations to a pipeline of successive computations - hence it can be generalized as a case study in function composition. But with some constraints as we will see shortly.<br />
<br />
Let's say that the salary of a person is computed as per the following algorithm :<br />
<br />
<ol><li>basic = the basic component of his salary</LI>
<li>allowances = 20% of basic</LI>
<li>bonus = 10% of (basic + allowances)</LI>
<li>tax = 30% of (basic + allowances + bonus)</LI>
<li>surcharge = 10% of (basic + allowances + bonus - tax)</LI> </OL>Note that the computation process starts with a basic salary and computes successive components, each taking its input from the previous computation of the pipeline. But there's a catch, which makes the problem a bit more interesting from the modeling perspective. Not all components of the salary are mandatory - of course the <i>basic</I> is mandatory. Hence the final components of the salary will be determined by a configuration object which can be like the following ..<br />
<br />
<pre class="brush: scala">// an item = true means the component should be activated in the computation
case class SalaryConfig(
surcharge: Boolean = true,
tax: Boolean = true,
bonus: Boolean = true,
allowance: Boolean = true
)
</pre><br />
So when we compute the salary we need to take care of this configuration object and activate the relevant components for calculation.<br />
<br />
<b><i>A Function defines a Transformation ..</I></B><br />
<br />
Let's first translate the above components into separate Scala functions ..<br />
<br />
<pre class="brush: scala">// B = basic + 20%
val plusAllowance = (b: Double) => b * 1.2
// C = B + 10%
val plusBonus = (b: Double) => b * 1.1
// D = C - 30%
val plusTax = (b: Double) => 0.7 * b
// E = D - 10%
val plusSurcharge = (b: Double) => 0.9 * b
</pre><br />
Note that every function computes the salary up to the stage which will be fed to the next component computation. So the final salary is really the chained composition of all of these functions <i>in a specific order</I> as determined by the above stated algorithm.<br />
<br />
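If all the components were mandatory, plain function composition would be enough. Here's what that direct chaining looks like (without the configuration) - shown only to contrast with what follows ..<br />
<br />
<pre class="brush: scala">// allowance applied first, surcharge last - but no way to switch
// individual components on or off
val fullSalary: Double => Double =
  plusSurcharge compose plusTax compose plusBonus compose plusAllowance

fullSalary(1000.0)   // 1000 * 1.2 * 1.1 * 0.7 * 0.9
</pre><br />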
But we need to selectively activate and deactivate the components depending on the <code>SalaryConfig</CODE> passed. Here's the version that comes straight from the imperative mindset ..<br />
<br />
<b><i>The Imperative Solution ..</I></B><br />
<br />
<pre class="brush: scala">// no abstraction, imperative, using var
def computeSalary(sc: SalaryConfig, basic: Double) = {
var salary = basic
if (sc.allowance) salary = plusAllowance(salary)
if (sc.bonus) salary = plusBonus(salary)
if (sc.tax) salary = plusTax(salary)
if (sc.surcharge) salary = plusSurcharge(salary)
salary
}
</pre><br />
Straight, imperative, mutating (using var) and finally rejected by our functional mindset.<br />
<br />
<b><i>Thinking in terms of Expressions and Composition ..</I></B><br />
<br />
Think in terms of expressions (not statements) that compose. We have functions defined above that we could compose together and get the result. But, but .. the config, which we somehow need to incorporate as part of our composable expressions.<br />
<br />
So direct composition of functions won't work because we need some conditional support to take care of the config. How else can we have a chain of functions to compose ?<br />
<br />
Note that all of the above functions for computing the components are of type <code>(Double => Double)</CODE>. Hmm .. this means they are <i>endomorphisms</I>, which are functions that have the same argument and return type - <i>"endo"</I> means <i>"inside"</I> and <i>"morphism"</I> means <i>"transformation"</I>. So an endomorphism maps a type on to itself. <a HREF="https://github.com/scalaz/scalaz">Scalaz</A> defines it as ..<br />
<br />
<pre class="brush: scala">sealed trait Endo[A] {
/** The captured function. */
def run: A => A
//..
}
</pre><br />
But the interesting part is that there's a monoid instance for <code>Endo</CODE> and the associative <code>append</CODE> operation of the monoid for <code>Endo</CODE> is <i>function composition</I>. That seems a mouthful .. so let's dissect what we just said ..<br />
<br />
As you all know, a monoid is defined as "a semigroup with an identity", i.e.<br />
<br />
<pre class="brush: scala">trait Monoid[A] {
def append(m1: A, m2: A): A
def zero: A
}
</pre><br />
and <code>append</CODE> has to be associative.<br />
<br />
<code>Endo</CODE> forms a monoid where <code>zero</CODE> is the identity endomorphism and <code>append</CODE> composes the underlying functions. Isn't that what we need ? Of course we need to figure out how to sneak in those conditionals ..<br />
<br />
<pre class="brush: scala">implicit def endoInstance[A]: Monoid[Endo[A]] = new Monoid[Endo[A]] {
def append(f1: Endo[A], f2: => Endo[A]) = f1 compose f2
def zero = Endo.idEndo
}
</pre><br />
But we need to <code>append</CODE> the <code>Endo</CODE> only if the corresponding bit in <code>SalaryConfig</CODE> is true. Scala allows extending a class with custom methods and scalaz gives us the following as an extension method on <code>Boolean</CODE> ..<br />
<br />
<pre class="brush: scala">/**
* Returns the given argument if this is `true`, otherwise, the zero element
* for the type of the given argument.
*/
final def ??[A](a: => A)(implicit z: Monoid[A]): A = b.valueOrZero(self)(a)
</pre><br />
That's exactly what we need to have the following implementation of a functional <code>computeSalary</CODE> that uses monoids on Endomorphisms to compose our functions of computing the salary components ..<br />
<br />
<pre class="brush: scala">// compose using mappend of endomorphism
def computeSalary(sc: SalaryConfig, basic: Double) = {
val e =
sc.surcharge ?? plusSurcharge.endo |+|
sc.tax ?? plusTax.endo |+|
sc.bonus ?? plusBonus.endo |+|
sc.allowance ?? plusAllowance.endo
e run basic
}
</pre><br />
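A quick usage sketch (the numbers are only for illustration) - when a component is switched off in the config, <code>??</code> returns the <code>zero</code> of the monoid, i.e. the identity endomorphism, which silently drops out of the composition ..<br />
<br />
<pre class="brush: scala">// all components active
computeSalary(SalaryConfig(), 1000.0)
// = 1000 * 1.2 * 1.1 * 0.7 * 0.9

// bonus and surcharge switched off - their endos collapse to the identity
computeSalary(SalaryConfig(surcharge = false, bonus = false), 1000.0)
// = 1000 * 1.2 * 0.7
</pre><br />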
<b><i>More Generalization - Abstracting over Types ..</I></B><br />
<br />
We can generalize the solution further and abstract upon the type that represents the collection of component functions. In the above implementation we are picking each function individually and doing an <code>append</CODE> on the monoid. Instead we can abstract over a type constructor that allows us to fold the append operation over a collection of elements.<br />
<br />
<code>Foldable[]</CODE> is an abstraction which allows its elements to be folded over. Scalaz defines instances of <code>Foldable[]</CODE> typeclass for <code>List</CODE>, <code>Vector</CODE> etc. so we don't care about the underlying type as long as it has an instance of <code>Foldable[]</CODE>. And <code>Foldable[]</CODE> has a method <code>foldMap</CODE> that makes a <code>Monoid</CODE> out of every element of the <code>Foldable[]</CODE> using a supplied function and then folds over the structure using the <code>append</CODE> function of the <code>Monoid</CODE>.<br />
<br />
<pre class="brush: scala">trait Foldable[F[_]] { self =>
def foldMap[A,B](fa: F[A])(f: A => B)(implicit F: Monoid[B]): B
//..
}
</pre><br />
In our example, <code>f: A => B</CODE> is the <code>endo</CODE> function and the <code>append</CODE> is the <code>append</CODE> of <code>Endo</CODE> which composes all the functions that form the <code>Foldable[]</CODE> structure. Here's the version using <code>foldMap</CODE> ..<br />
<br />
<pre class="brush: scala">def computeSalary(sc: SalaryConfig, basic: Double) = {
val components =
List((sc.surcharge, plusSurcharge),
(sc.tax, plusTax),
(sc.bonus, plusBonus),
(sc.allowance, plusAllowance)
)
val e = components.foldMap(e => e._1 ?? e._2.endo)
e run basic
}
</pre><br />
This is an exercise which discusses how to apply transformations on values when you need to model endomorphisms. Instead of thinking in terms of generic composition of functions, we exploited the types more, discovered that our transformations are actually endomorphisms. And then we applied the properties of endomorphisms to model function composition as monoidal appends. The moment we modeled at a higher level of abstraction (endomorphisms rather than native functions), we could use the zero element of the monoid as the composable null object in the sequence of function transformations. <br />
<br />
In case you are interested I have the whole <a HREF="https://github.com/debasishg/scala-snippets/blob/master/src/main/scala/monoids.scala">working example</A> in my github repo.<br />
http://debasishg.blogspot.com/2013/03/an-exercise-in-refactoring-playing.html[email protected] (Anonymous)10tag:blogger.com,1999:blog-22587889.post-5099860375622051800Fri, 15 Feb 2013 12:05:00 +00002013-02-15T17:35:15.330+05:30dslscalaA DSL with an Endo - monoids for freeWhen we design a domain model, one of the issues that we care about is abstraction of implementation from the user level API. Besides making the published contract simple, this also decouples the implementation and allows post facto optimization to be done without any impact on the user level API.<br />
<br />
Consider a class like the following ..<br />
<br />
<pre class="brush: scala">// a sample task in a project
case class Task(name: String)
// a project with a list of tasks & dependencies amongst the
// various tasks
case class Project(name: String,
startDate: java.util.Date,
endDate: Option[java.util.Date] = None,
tasks: List[Task] = List(),
deps: List[(Task, Task)] = List())
</pre><br />
We can always use the algebraic data type definition above to add tasks and dependencies to a project. Besides being cumbersome as a user level API, it also is a way to program too close to the implementation. The user is coupled to the fact that we use a <code>List</code> to store tasks, making it difficult to use any alternate implementation in the future. We can offer a Builder like OO interface with fluent APIs, but that also adds to the verbosity of implementation, makes builders mutable and is generally more difficult to compose with other generic functional abstractions. <br />
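Just to make the point concrete, here's what a client ends up writing when programming directly against the ADT (an illustrative snippet - <code>now</code> stands for the current date, as used later in the post) ..<br />
<br />
<pre class="brush: scala">val t1 = Task("study customer requirements")
val t2 = Task("analyze usecases")

// the client has to know that tasks and deps are Lists and
// thread every intermediate copy manually
val p0 = Project("xenos", now)
val p1 = p0.copy(tasks = t1 :: p0.tasks)
val p2 = p1.copy(tasks = t2 :: p1.tasks, deps = (t2, t1) :: p1.deps)
</pre><br />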
<br />
Ideally we should be having a DSL that lets users create projects and add tasks and dependencies to them.<br />
<br />
In this post I will discuss a few functional abstractions that will stay hidden behind the user APIs, and yet provide the compositional power to wire up the DSL. This is a post inspired by this <a HREF="http://ocharles.org.uk/blog/posts/2013-02-12-quick-dsls-with-endo-writers.html">post</A> which discusses a similar DSL design using Endo and Writers in Haskell.<br />
<br />
Let's address the issues one by one. We need to <i>accumulate</I> tasks that belong to the project. So we need an abstraction that helps in this <i>accumulation</I> e.g. concatenation in a list, or in a set or in a Map .. etc. One abstraction that comes to mind is a <code>Monoid</CODE> that gives us an associative binary operation between two objects of a type that form a monoid. <br />
<br />
<pre class="brush: scala">trait Monoid[T] {
def append(m1: T, m2: T): T
def zero: T
}
</pre><br />
A <code>List</CODE> is a monoid with concatenation as the <code>append</code>. But since we don't want to expose the concrete data structure to the client API, we can talk in terms of monoids.<br />
<br />
The other data structure that we need is some form of an abstraction that will offer us the writing operation into the monoid. A <code>Writer</CODE> monad is an example of this. In fact the combination of a <code>Writer</CODE> and a <code>Monoid</CODE> is potent enough to have such a DSL in the making. Tony Morris used this combo to implement a <a HREF="https://github.com/tonymorris/writer-monad/blob/master/src/docbook/WriterDemo.scala">logging functionality</A> ..<br />
<br />
<pre class="brush: scala">for {
a <- k withvaluelog ("starting with " + _)
b <- (a + 7) withlog "adding 7"
c <- (b * 3).nolog
d <- c.toString.reverse.toInt withvaluelog ("switcheroo with " + _)
e <- (d % 2 == 0) withlog "is even?"
} yield e
</pre>
We could use this same technique here. But we have a problem - <code>Project</CODE> is not a monoid and we don't have a definition of <code>zero</CODE> for a <code>Project</CODE> that we can use to make it a <code>Monoid</CODE>.
Is there something that would help us get a monoid from <code>Project</CODE> i.e. allow us to use <code>Project</CODE> in a monoid ?
<br><br>
Enter <code>Endo</CODE> .. an endomorphism which is simply a function that takes an argument of type <code>T</CODE> and returns the same type. In Scala, we can state this as ..
<pre class="brush: scala">sealed trait Endo[A] {
// The captured function
def run: A => A
//..
}
</pre><a HREF="https://github.com/scalaz/scalaz">Scalaz</A> defines <code>Endo[A]</CODE> and provides a lot of helper functions and syntactic sugars to use endomorphisms. Among its other properties, <code>Endo[A]</CODE> provides a natural monoid and allows us to use <code>A</CODE> in a <code>Monoid</CODE>. In other words, endomorphisms of <code>A</CODE> form a monoid under composition. In our case we can define an <code>Endo[Project]</CODE> as a function that takes a <code>Project</CODE> and returns a <code>Project</CODE>. We can then use it with a <code>Writer</CODE> (as above) and implement the accumulation of tasks within a <code>Project</CODE>.
<br><br>
<i>Exercise: Implement Tony Morris' logger without side-effects using an Endo.</i>
<br><br>
Here's how we would like to accumulate tasks in our DSL ..
<pre class="brush: scala">for {
_ <- task("Study Customer Requirements")
_ <- task("Analyze Use Cases")
a <- task("Develop code")
} yield a
</pre>
<br><br>
Let's define a function that adds a <code>Task</CODE> to a <code>Project</CODE> ..
<pre class="brush: scala">// add task to a project
val withTask = (t: Task, p: Project) => p.copy(tasks = t :: p.tasks)
</pre>
<br><br>
and use this function to define the DSL API <code>task</CODE> which makes an <code>Endo[Project]</CODE> and passes it as a <code>Monoid</CODE> to the <code>Writer</CODE> monad. In the following snippet, <code>(p: Project) => withTask(t, p)</CODE> is a mapping from <code>Project => Project</CODE>, which gets converted to an <code>Endo</CODE> and then passed to the <code>Writer</CODE> monad for adding to the task list of the <code>Project</CODE>.
<pre class="brush: scala">def task(n: String): Writer[Endo[Project], Task] = {
val t = Task(n)
for {
_ <- tell(((p: Project) => withTask(t, p)).endo)
} yield t
}
</pre><br>
The DSL snippet above is a monad comprehension. Let's add some more syntax to the DSL by defining dependencies of a Project. That's also a mapping from one <code>Project</CODE> state to another and can be realized using a similar function like <code>withTask</CODE> ..
<pre class="brush: scala">// add project dependency
val withDependency = (t: Task, on: Task, p: Project) =>
p.copy(deps = (t, on) :: p.deps)
</pre><br>
.. and add a function <code>dependsOn</CODE> to our DSL that allows the user to express explicit dependencies amongst tasks. But this time instead of making it a standalone function we will make it a method of the class <code>Task</CODE>. This is only for getting some free syntactic sugar in the DSL. Here's the modified <code>Task</code> ADT ..
<pre class="brush: scala">case class Task(name: String) {
def dependsOn(on: Task): Writer[Endo[Project], Task] = {
for {
_ <- tell(((p: Project) => withDependency(this, on, p)).endo)
} yield this
}
}
</pre>Finally we define the last API of our DSL that glues together the building of the Project and the addition of tasks and dependencies without directly coupling the user to some of the underlying implementation artifacts.
<pre class="brush: scala">def project(name: String, startDate: Date)(e: Writer[Endo[Project], Task]) = {
val p = Project(name, startDate)
e.run._1(p)
}
</pre>And we can finally create a <code>Project</CODE> along with tasks and dependencies using our DSL ..
<pre class="brush: scala">project("xenos", now) {
for {
a <- task("study customer requirements")
b <- task("analyze usecases")
_ <- b dependsOn a
c <- task("design & code")
_ <- c dependsOn b
d <- c dependsOn a
} yield d
}
</pre>
In case you are interested I have the whole <a HREF="https://github.com/debasishg/scala-snippets/blob/master/src/main/scala/endodsl.scala">working example</A> in my github repo.http://debasishg.blogspot.com/2013/02/a-dsl-with-endo-monoids-for-free.html[email protected] (Anonymous)1tag:blogger.com,1999:blog-22587889.post-4489064465464541598Fri, 01 Feb 2013 06:57:00 +00002013-02-01T16:28:23.592+05:30DIpatternsscalaModular Abstractions in Scala with Cakes and Path Dependent TypesI have been trying out various options of implementing the <a HREF="http://lampwww.epfl.ch/~odersky/papers/ScalableComponent.html">Cake pattern</A> in Scala, considered to be one of the many ways of doing dependency injection without using any additional framework. There are other (more functional) ways of doing the same thing, one of which I <a HREF="http://debasishg.blogspot.in/2010/02/dependency-injection-as-function.html">blogged</A> about before and also <a HREF="http://vimeo.com/23547652">talked</A> about in a NY Scala meetup. But I digress ..<br />
<br />
Call it DI or not, the Cake pattern is one of the helpful techniques to implement modular abstractions in Scala. You weave your abstract components (aka traits), layering on the dependencies and commit to implementations only at the end of the world. I was trying to come up with an implementation that does not use self type annotations. It's not that I think self type annotations are kludgy or anything but I don't find them used elsewhere much besides the Cake pattern. And of course mutually recursive self annotations are a code smell that makes your system anti-modular.<br />
<br />
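For contrast, here's the shape of the classic self type based wiring that I am trying to avoid - the component names below are made up just for illustration ..<br />
<br />
<pre class="brush: scala">trait RepositoryComponent {
  def repository: String => Option[String]
}

// the dependency is declared through the self type annotation
trait ServiceComponent { self: RepositoryComponent =>
  def find(id: String): Option[String] = repository(id)
}

// commit to the implementations only at the end of the world
object UserModule extends ServiceComponent with RepositoryComponent {
  private val data = Map("1" -> "debasish")
  val repository = (id: String) => data.get(id)
}
</pre><br />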
In the following implementation I use path dependent types, which have become a regular feature in Scala 2.10. Incidentally they had been around for quite some time as an experimental feature, but have come out in public only in 2.10. The consequence is that instead of self type annotations or inheritance I will be configuring my dependencies using composition.<br />
<br />
Let me start with some basic abstractions of a very simple domain model. The core component that I will build is a service that reports a client's portfolio as a list of balances. The example has been simplified for illustration purposes - the actual real life model has a much more complex implementation.<br />
<br />
A <i>Portfolio</I> is a collection of <i>Balances</I>. A <i>Balance</I> is a position of an <i>Account</I> in a specific <i>Currency</I> as on a particular <i>Date</I>. Expressing this in simple terms, we have the following traits ..<br />
<br />
<pre class="brush: scala">// currency
sealed trait Currency
case object USD extends Currency
case object EUR extends Currency
case object AUD extends Currency
//account
case class Account(no: String, name: String, openedOn: Date, status: String)
trait BalanceComponent {
type Balance
def balance(amount: Double, currency: Currency, asOf: Date): Balance
def inBaseCurrency(b: Balance): Balance
}
</pre><br />
The interesting point to note is that the actual type of <code>Balance</CODE> has been abstracted in <code>BalanceComponent</CODE>, since various services may choose to use various representations of a Balance. And this is one of the layers of the Cake that we will mix finally ..<br />
<br />
Just a note for the uninitiated, a base currency is typically considered the domestic currency or accounting currency. For accounting purposes, a firm may use the base currency to represent all profits and losses. So we may have some service or component that would like to have the balances reported in base currency.<br />
<br />
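The <code>inBaseCurrency</code> implementations below refer to a <code>baseCurrency</code> and a <code>baseCurrencyFactor</code> which are not shown in the post. A minimal sketch - the conversion factors are hard coded only for illustration (they are consistent with the sample REPL session at the end; in real life they would come from a rates service) ..<br />
<br />
<pre class="brush: scala">val baseCurrency: Currency = USD

// conversion factors to the base currency
val baseCurrencyFactor: Map[Currency, Double] =
  Map(USD -> 1.0, EUR -> 1.3, AUD -> 1.2)
</pre><br />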
<pre class="brush: scala">trait Portfolio {
val bal: BalanceComponent
import bal._
def currentPortfolio(account: Account): List[Balance]
}
</pre><br />
Portfolio uses the abstract <code>BalanceComponent</CODE> and does not commit to any specific implementation. And the <code>Balance</CODE> in the return type of the method <code>currentPortfolio</CODE> is actually a path dependent type, made to look nice through the object import syntax.<br />
<br />
Now let's have some standalone implementations of the above components .. we are still not there yet to mix the cake ..<br />
<br />
<pre class="brush: scala">// report balance as a TUPLE3 - simple
trait SimpleBalanceComponent extends BalanceComponent {
type Balance = (Double, Currency, Date)
override def balance(amount: Double, currency: Currency, asOf: Date) =
(amount, currency, asOf)
override def inBaseCurrency(b: Balance) =
((b._1) * baseCurrencyFactor.get(b._2).get, baseCurrency, b._3)
}
// report balance as an ADT
trait CustomBalanceComponent extends BalanceComponent {
type Balance = BalanceRep
// balance representation
case class BalanceRep(amount: Double, currency: Currency, asOf: Date)
override def balance(amount: Double, currency: Currency, asOf: Date) =
BalanceRep(amount, currency, asOf)
override def inBaseCurrency(b: Balance) =
BalanceRep((b.amount) * baseCurrencyFactor.get(b.currency).get, baseCurrency, b.asOf)
}
</pre><br />
And a sample implementation of <code>ClientPortfolio</CODE> that adds logic without yet committing to any concrete type for the <code>BalanceComponent</CODE>.<br />
<br />
<pre class="brush: scala">trait ClientPortfolio extends Portfolio {
val bal: BalanceComponent
import bal._
override def currentPortfolio(account: Account) = {
//.. actual impl will fetch from database
List(
balance(1000, EUR, Calendar.getInstance.getTime),
balance(1500, AUD, Calendar.getInstance.getTime)
)
}
}
</pre><br />
Similar to <code>ClientPortfolio</CODE>, we can have multiple implementations of <code>Portfolio</CODE> that report balances in various forms. So our cake has started taking shape. We have the <code>Portfolio</CODE> component and the BalanceComponent already woven in without any implementation committed. Let's add yet another layer to the mix, maybe for fun - a decorator for the <code>Portfolio</CODE>.<br />
<br />
We add <code>Auditing</CODE> as a component which can decorate *any* <code>Portfolio</CODE> component and report the balance of an account in base currency. Note that <code>Auditing</CODE> needs to abstract implementations of <code>BalanceComponent</CODE> as well as <code>Portfolio</CODE> since the idea is to decorate any <code>Portfolio</CODE> component using any of the underlying <code>BalanceComponent</CODE> implementations.<br />
<br />
Many cake implementations use self type annotations (or inheritance) for this. I will be using composition and path dependent types.<br />
<br />
<pre class="brush: scala">trait Auditing extends Portfolio {
val semantics: Portfolio
val bal: semantics.bal.type
import bal._
override def currentPortfolio(account: Account) = {
semantics.currentPortfolio(account) map inBaseCurrency
}
}
</pre><br />
Note how the <code>Auditing</CODE> component uses the same <code>Balance</CODE> implementation as the underlying decorated <code>Portfolio</CODE> component, enforced through path dependent types.<br />
<br />
And we have reached the end of the world without yet committing to any implementation of our components .. But now let's do that and get a concrete service instantiated ..<br />
<br />
<pre class="brush: scala">object SimpleBalanceComponent extends SimpleBalanceComponent
object CustomBalanceComponent extends CustomBalanceComponent
object ClientPortfolioAuditService1 extends Auditing {
val semantics = new ClientPortfolio { val bal = SimpleBalanceComponent }
val bal: semantics.bal.type = semantics.bal
}
object ClientPortfolioAuditService2 extends Auditing {
val semantics = new ClientPortfolio { val bal = CustomBalanceComponent }
val bal: semantics.bal.type = semantics.bal
}
</pre><br />
Try them out in your REPL and see how the two services behave the same way, abstracting away all component implementations from the user ..<br />
<br />
<pre class="brush: scala">scala> ClientPortfolioAuditService1.currentPortfolio(Account("100", "dg", java.util.Calendar.getInstance.getTime, "a"))
res0: List[(Double, com.redis.cake.Currency, java.util.Date)] = List((1300.0,USD,Thu Jan 31 12:58:35 IST 2013), (1800.0,USD,Thu Jan 31 12:58:35 IST 2013))
scala> ClientPortfolioAuditService2.currentPortfolio(Account("100", "dg", java.util.Calendar.getInstance.getTime, "a"))
res1: List[com.redis.cake.ClientPortfolioAuditService2.bal.Balance] = List(BalanceRep(1300.0,USD,Thu Jan 31 12:58:46 IST 2013), BalanceRep(1800.0,USD,Thu Jan 31 12:58:46 IST 2013))
</pre><br />
The technique discussed above is inspired by the paper <a HREF="http://www.mathematik.uni-marburg.de/~rendel/hofer08polymorphic.pdf">Polymorphic Embedding of DSLs</A>. I have been using this technique for quite some time and I have discussed a somewhat similar implementation in my book <a HREF="http://manning.com/ghosh">DSLs In Action</A> while discussing internal DSL design in Scala.<br />
<br />
And in case you are interested in the full code, I have uploaded it on my <a HREF="https://github.com/debasishg/scala-snippets/blob/master/src/main/scala/cake.scala">Github</A>.http://debasishg.blogspot.com/2013/02/modular-abstractions-in-scala-with.html[email protected] (Anonymous)20tag:blogger.com,1999:blog-22587889.post-1927998603084595620Mon, 14 Jan 2013 09:50:00 +00002013-01-14T15:20:48.761+05:30functionalhaskellmonadA language and its interpretation - Learning free monadsI have been playing around with free monads of late and finding them more and more useful in implementing separation of concerns between pure data and its interpretation. Monads generally don't compose. But if you restrict monads to a particular form, then you can define a sum type that composes. In the paper <a HREF="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.101.4131">Data Types a la carte</A>, Wouter Swierstra describes this form as <br />
<pre>data Term f a =
Pure a
| Impure (f (Term f a))
</pre>These monads consist of either pure values or an impure effect, constructed using <code>f</CODE>. When <code>f</CODE> is a functor, <code>Term f</CODE> is a monad. And in this case, <code>Term f</CODE> is the free monad on the functor <code>f</CODE> - free in the sense that the construction is the left adjoint to the forgetful functor from monads to functors.<br />
<br />
I am not going into the details of what makes a monad <i>free</I>. I once asked this question on google+ and Edward Kmett came up with a beautiful explanation. So instead of trying to come up with a half assed version of the same, have a look at Ed's response <a href="https://plus.google.com/u/0/106871002817915335660/posts/g9LASrMjeFS">here</a> .<br />
<br />
In short, we can say that a free monad is the freest object possible that's still a monad. <br />
<br />
<h1>Composition</H1>Free monads compose and help you build larger abstractions which are pure data and yet manage to retain all properties of a monad. Hmm .. this sounds interesting because now we can not only build abstractions but make them extensible through composition by clients using the fact that it's still a monad.<br />
<br />
A free monad is pure data, not yet interpreted, as we will see shortly. You can pass it to a separate interpreter (possibly multiple interpreters) which can do whatever you feel like with the structure. Your free monad remains pure while all impurities can be put inside your interpreters.<br />
<br />
<h1>And so we interpret ..</H1>In this post I will describe how I implemented an interpreter for a Joy like concatenative language that uses the stack for its computation. Of course it's just a prototype and addresses an extremely simplified subset, but you get the idea. The basic purpose is to explore the power of free monads and come up with something that can potentially be extended to a full blown implementation of the language.<br />
<br />
When designing an interpreter there's always the risk of conflating the language along with the concerns of interpreting it. Here we will have the 2 completely decoupled by designing the core language constructs as free monads. Gabriel Gonzalez has written a series of blog posts [<a HREF="http://www.haskellforall.com/2012/06/you-could-have-invented-free-monads.html">1</A>,<a HREF="http://www.haskellforall.com/2012/07/purify-code-using-free-monads.html">2</A>,<a HREF="http://www.haskellforall.com/2012/07/free-monad-transformers.html">3</A>] on the use of free monads which contain details of their virtues and usage patterns. My post is just an account of my learning experience. In fact after I wrote the post, I discovered a similar <a HREF="https://github.com/TikhonJelvis/Forth-nonsense/blob/master/Forth.hs">exercise</A> done for embedding Forth in Haskell - so I guess I'm not the first one to learn free monads using a language interpreter.<br />
<br />
Let's start with some code, which is basically a snippet of Joy like code that will run within our Haskell interpreter ..<br />
<pre>p :: Joy ()
p = do push 5
push 6
add
incr
add2
square
cube
end
</pre><br />
This is our wish list and at the end of the post we will see if we can interpret this correctly and reason about some aspects of this code. Let's not bother much about the details of the above snippet. The important point is that if we fire it up in ghci, we will see that it's pure data!<br />
<pre>*Joy> p
Free (Push 5 (Free (Push 6 (Free (Add (Free (Push 1 (Free (Add (Free (Push 1 (Free (Add (Free (Push 1 (Free (Add (Free (Dup (Free (Mult (Free (Dup (Free (Dup (Free (Mult (Free (Mult (Free End))))))))))))))))))))))))))))))
*Joy>
</pre><br />
We haven't yet executed anything of the above code. It's completely free to be interpreted and possibly in multiple ways. So, we have achieved this isolation - that the data is freed from the interpreter. You want to develop a pretty printer for this data - go for it. You want to apply semantics and give an execution model based on Joy - do it.<br />
<br />
<h1>Building the pure data (aka Builder)</H1>Let's first define the core operators of the language .. <br />
<pre>data JoyOperator cont = Push Int cont
| Add cont
| Mult cont
| Dup cont
| End
deriving (Show, Functor)
</pre><br />
The interesting piece is the derivation of the <code>Functor</CODE>, which is required for building the free monad over it. Keeping the technical mumbo jumbo aside, free monads are just a general way of turning functors into monads. So if we have a core operator <code>f</CODE> as a functor, we can get a free monad <code>Free f</CODE> out of it. And knowing something is a free monad helps you transform an operation over the monad (the monad homomorphism) into an operation over the functor (the functor homomorphism). We will see how this helps later .. <br />
<br />
The other point to note is that all operators take a continuation argument that points to the next operation in the chain. <code>End</CODE> is the terminal symbol and we plan to ignore anything that the user enters after an <code>End</CODE>.<br />
<br />
<code>Push</CODE> takes an <code>Int</CODE> and pushes into the stack, <code>Add</CODE> pops the top 2 elements of the stack and pushes the sum, <code>Mult</CODE> does the same for multiplication and <code>Dup</CODE> duplicates the top element of the stack. <code>End</CODE> signifies the end of program.<br />
<br />
Next we define the free monad over <code>JoyOperator</CODE> by using the <code>Free</CODE> data constructor, defined as part of <code>Control.Monad.Free</CODE> ..<br />
<br />
<pre>data Free f a = Pure a | Free (f (Free f a))</pre><br />
<pre>-- | The free monad over JoyOperator
type Joy = Free JoyOperator
</pre><br />
And then follow it up with some of the definitions of Joy operators as operations over free monads. Note that <code>liftF</CODE> lifts an operator (which is a <code>Functor</CODE>) into the context of a free monad. <code>liftF</CODE> has the following type ..<br />
<br />
<pre>liftF :: Functor f => f a -> Free f a</pre><br />
As a property, the free monad construction is the left adjoint of the forgetful functor. The unlifting from the monad back to the functor is given by the <code>retract</CODE> function ..<br />
<br />
<pre>retract :: Monad f => Free f a -> f a</pre><br />
and needless to say <br />
<br />
<pre>retract . liftF = id</pre><br />
<pre>-- | Push an integer to the stack
push :: Int -> Joy ()
push n = liftF $ Push n ()
-- | Add the top two numbers of the stack and push the sum
add :: Joy ()
add = liftF $ Add ()
</pre><br />
.. and this can be done for all operators that we wish to support.<br />
<br />
Not only this, we can also combine the above operators and build newer ones. Remember we are working with monads and hence the *do* notation based sequencing comes for free ..<br />
<br />
<pre>-- | This combinator adds 1 to a number.
incr :: Joy ()
incr = do {push 1; add}
-- | This combinator increments twice
add2 :: Joy ()
add2 = do {incr; incr}
-- | This combinator squares a number
square :: Joy ()
square = do {dup; mult}
</pre><br />
Now we can have a composite program which sequences through the core operators as well as the ones we derive from them. And that's what we posted as our first example snippet of a target program.<br />
<br />
<h1>An Interpreter (aka Visitor)</H1>Once we have the pure data part done, let's try and build an interpreter that does the actual execution based on the semantics that we defined on the operators.<br />
<br />
<pre>-- | Run a joy program. Result is either an Int or an error
runProgram :: Joy n -> Either JoyError Int
runProgram program = joy [] program
where joy stack (Free (Push v cont)) = joy (v : stack) cont
joy (a : b : stack) (Free (Add cont)) = joy (a + b : stack) cont
joy (a : b : stack) (Free (Mult cont)) = joy (a * b : stack) cont
joy (a : stack) (Free (Dup cont)) = joy (a : a : stack) cont
joy _ (Free Add {}) = Left NotEnoughParamsOnStack
joy _ (Free Dup {}) = Left NotEnoughParamsOnStack
joy [] (Free End) = Left NotEnoughParamsOnStack
joy [result] (Free End) = Right result
joy _ (Free End) = Left NotEmptyOnEnd
joy _ Pure {} = Left NoEnd
</pre><br />
<code>runProgram</CODE> is the interpreter that takes a free monad as its input. Its implementation is quite trivial - it just matches on the recursive structure of the data and pushes the appropriate results on the stack. Now if we run our program <code>p</CODE> using the above interpreter we get the correct result ..<br />
<br />
<pre>*Joy> runProgram p
Right 7529536
*Joy>
</pre><br />
<h1>Equational Reasoning</H1>Being Haskell and being pure, we obviously can prove some bits and pieces of our program as mathematical equations. At the beginning I said that End is the end of the program and anything after End needs to be ignored. What happens if we do the following ..<br />
<br />
<pre>runProgram $ do {push 5; incr; incr; end; incr; incr}</pre><br />
If you guessed correctly, we get <code>Right 7</CODE>, which means that all operations after <code>end</CODE> have been ignored. But can we prove that our program indeed does this ? The following is a proof that <code>end</CODE> indeed ends our program. Consider some operation <code>m</CODE> that follows <code>end</CODE> ..<br />
<br />
<pre>end >> m
-- definition of end
= liftF $ End >> m
-- m >> m' = m >>= \_ -> m'
= liftF $ End >>= \_ -> m
-- definition of liftF
= Free End >>= \_ -> m
-- Free m >>= f = Free (fmap (>>= f) m)
= Free (fmap (>>= \_ -> m) End)
-- fmap f End = End
= Free End
-- liftF f = Free f (f is a functor)
= liftF End
-- definition of End
= end
</pre><br />
So this shows that any operation we do after <code>end</CODE> is never executed.<br />
<br />
<h1>Improving the bind</H1>Free monads offer improved modularity by allowing us to separate the building of pure data from the (possibly) impure concerns of interpreting it with the external world. But a naive implementation leads to a penalty in runtime efficiency as Janis Voigtlander discusses in his paper <a HREF="http://www.iai.uni-bonn.de/~jv/mpc08.pdf">Asymptotic Improvement of Computations over Free Monads</A>. And Haskell's implementation of free monads had this inefficiency, where the asymptotic complexity of substitution was quadratic because of left associative bind. Edward Kmett implemented the Janis trick and engineered a solution that gets over this inefficiency. I will discuss this in a future post. And if you would like to play around with the interpreter, the source is there in my <a HREF="https://github.com/debasishg/joy-free-monads/blob/master/joy_free.hs">github</A>.<br />
http://debasishg.blogspot.com/2013/01/a-language-and-its-interpretation.html[email protected] (Anonymous)3tag:blogger.com,1999:blog-22587889.post-8922324363920234723Thu, 03 Jan 2013 05:36:00 +00002013-01-03T11:06:06.538+05:30corecursionfunctionalhaskellinductivestrict : recursion :: non-strict : co-recursionConsider the very popular algorithm that uses a tail recursive call to implement a <code>map</CODE> over a <code>List</CODE>. Here's the implementation in F# <br />
<br />
<pre>let map converter l =
let rec loop acc = function
| [] -> acc
| hd :: tl -> loop (converter hd :: acc) tl
List.rev (loop [] l)</pre><br />
Scala is also a statically typed functional programming language, though it uses a completely different trick and mutability to implement <code>map</CODE> over a sequence. Let's ignore it for the time being and imagine we are being bona fide functional programmers.<br />
<br />
The above code uses a very common idiom of accumulate-and-reverse to implement a tail recursive algorithm. Though Scala stdlib does not use this technique and uses a mutable construct for this implementation, we could have done the same thing in Scala as well ..<br />
<br />
Both Scala and F# are languages with strict evaluation semantics as the default. What would a similar tail recursive Haskell implementation look like ?<br />
<br />
<pre>map' f xs = reverse $ go [] xs
where
go accum [] = accum
go accum (x:xs) = go (f x : accum) xs
</pre><br />
Let's now have a look at the actual <a HREF="http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/GHC-Base.html#map">implementation</A> of <code>map</CODE> in Haskell prelude ..<br />
<br />
<pre>map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
</pre><br />
Whoa! It's not tail recursive - instead it's a body recursive implementation. Why is that ? We have been taught that tail recursion is the holy grail since it takes constant stack space and all ..<br />
<br />
In a strict language the above implementation of <code>map</CODE> will be a bad idea since it uses linear stack space. On the other hand the initial implementation is tail recursive. But Haskell has non-strict semantics and hence the last version of map consumes only one element of the input and yields one element of the output. The earlier versions consume the whole input before yielding any output.<br />
<br />
In a lazy language what you need is to make your algorithm co-recursive. And this needs co-data. Understanding co-data or co-recursion needs a brief background of <i>co-induction</I> and how it differs from <i>induction</I>. <br />
<br />
When we define a list in Haskell as <br />
<br />
<pre>data [a] = [] | a : [a]
</pre><br />
it means that the set "List of a" is the <i>smallest</I> set such that [] is in the List of a, and if xs is in the List of a and x is in a, then x : xs is in the List of a. This is the inductive definition of a List and we can use recursion to implement various properties of a List. Also once we have a recursive definition, we can use structural induction to prove various properties of the data structure.<br />
<br />
If an inductive definition on data gives us the smallest set, a co-inductive definition on co-data gives us the largest set. Let's define a Stream in Haskell ..<br />
<br />
<pre>data Stream a = Cons a (Stream a)
</pre><br />
First to note - unlike the definition of a <code>List</CODE>, there's no special case for empty <code>Stream</CODE>. Hence no base case unlike the inductive definition above. And a "Stream of a" is the <i>largest</I> set consisting of a pair of an "a" and a "Stream of a". Which means that <code>Stream</CODE> is an infinite data structure i.e. the range (or co-domain) of a Stream is infinite. And we implement properties on co-inductive data structures using co-recursive programs. A co-recursive program is defined as one whose range is a type defined recursively as the greatest solution of some equation.<br />
<br />
In terms of a little bit of mathematical concepts, if we model types as sets, then an infinite list of integers is the <i>greatest</I> set X for which there is a bijection X ≅ ℤ x X and a program that <i>generates</I> such a list is a co-recursive program. As its dual, the type of a finite list is the <i>least</I> set X for which X ≅ 1 + (ℤ x X), where 1 is a singleton set and + is the disjoint union of sets. Any program which <i>consumes</I> such an input is a recursive program. (Note the dualities in italics).<br />
<br />
Strictly speaking, the co-domain does not necessarily have to be infinite. If you have read my earlier <a HREF="http://debasishg.blogspot.in/2012/01/list-algebras-and-fixpoint-combinator.html">post</A> on initial algebra, co-recursion recursively defines functions whose co-domain is a <i>final data type</I>, dual to the way that ordinary recursion recursively defines functions whose domain is an <i>initial data type</I>.<br />
<br />
In case of primitive recursion (with the <code>List</CODE> data type above), you always frame the recursive step to operate on data which gets reduced in size in subsequent steps of recursive calls. It's not so with co-recursion since you may be working with co-data and infinite data structures. Since in Haskell, a <code>List</CODE> can also be infinite, using the above co-recursive definition of <code>map</CODE>, we can have<br />
<br />
<pre>head $ map (1+) [1..]
</pre><br />
which invokes <code>map</CODE> on an infinite list of integers. And here the co-recursive steps of map operate successively on sets of data which are not less than the earlier set. So, it's not tail recursion that makes for an efficient implementation in Haskell - you need to make the co-recursive call within the application of a constructor.<br />
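Since the post started out comparing Scala and F#, here's a small Scala sketch of the same idea - a hand rolled co-data structure where the tail is a thunk, so the co-recursive call sits inside the application of the constructor (the names are mine, just for illustration) ..<br />
<br />
<pre class="brush: scala">// co-data: no empty case, and the tail is lazy, so the structure can be infinite
final case class CoStream[A](head: A, tail: () => CoStream[A])

// generates an infinite stream - a co-recursive program
def from(n: Int): CoStream[Int] = CoStream(n, () => from(n + 1))

// map yields the head before ever touching the (possibly infinite) tail
def mapCo[A, B](s: CoStream[A])(f: A => B): CoStream[B] =
  CoStream(f(s.head), () => mapCo(s.tail())(f))

mapCo(from(1))(_ + 1).head   // 2 - only one element is ever produced
</pre><br />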
<br />
As the first post of the new year, that's all friends. Consider this as the notes from someone learning the principles of co-inductive definitions. Who knew co-recursion is so interesting ? It will be more interesting once we get down into the proof techniques for co-recursive programs. But that's for a future post ..<br />
http://debasishg.blogspot.com/2013/01/strict-recursion-non-strict-co-recursion.html[email protected] (Anonymous)1tag:blogger.com,1999:blog-22587889.post-6494088149187798898Fri, 28 Dec 2012 19:01:00 +00002012-12-29T00:31:59.733+05:30The Best of 2012 - Looking BackThe year 2012 was a bit different for me in terms of readings and enlightenment. While once again Scala proved to be the most dominant language that I used throughout the year, I also devoted some serious productive time and mind share to learning machine learning and related topics. Online courses at <A HREF="http://www.coursera.org">Coursera</A> were of immense help in this regard and proved to be the catalyst of inspiration. Being in India, staying within the comfort zones of my happy home, I could attend the classes of professors like Daphne Koller, Martin Odersky and Geoffrey Hinton. Thanks a lot Coursera for this enlightening experience ..<br />
<br />
<H1>online courses that I attended ..</H1><br />
<UL> <LI><A HREF="https://class.coursera.org/pgm/class/index">Probabilistic Graphical Models</A> (Daphne Koller)</LI>
<LI><A HREF="https://class.coursera.org/progfun-2012-001/class/index">Functional Programming Principles in Scala</A> (Martin Odersky)</LI>
<LI><A HREF="https://class.coursera.org/neuralnets-2012-001/class/index">Neural Networks in Machine Learning</A> (Geoffrey Hinton)</LI>
</UL><br />
<H1>the best of the papers that I read ..</H1><br />
I usually read lots of papers on my subjects of interest. They are too many to list here. But here are a few of them, mostly from the domain of programming languages and type theory. In this regard I must confess that I am not a programming language theory person. But I sincerely believe that you need to know some theory to make sense of its implementation. That's the reason I think programmers (yes, serious programmers using a decent programming language with a static type system) will do better to learn a bit of <A HREF="http://debasishg.blogspot.in/2012/07/does-category-theory-make-you-better.html">category</A> <A HREF="http://debasishg.blogspot.in/2012/08/category-theory-and-programming.html">theory</A>. <br />
<br />
As I mentioned at the beginning, currently my programming language of choice is Scala and I have been exploring how to write better code with Scala's multi-paradigm capabilities. Though I tend to use the functional power of Scala more than its object oriented features, there are quite a few idioms that I use where I find its object oriented capabilities useful. Scala 2.10 will now be released any time and I am even more excited to try out many of the new features that it will offer.<br />
<br />
Ok .. here are some of the papers that I enjoyed reading in 2012 ..<br />
<br />
<UL> <LI><A HREF="http://adam.chlipala.net/papers/CpdtJFR/">An Introduction to Programming and Proving with Dependent Types in Coq</A> - Adam Chlipala</LI>
<LI><A HREF="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875">Theorems for free!</A> - Philip Wadler</LI>
<LI><A HREF="https://personal.cis.strath.ac.uk/conor.mcbride/pub/Totality.pdf">Totality</A> - Conor McBride</LI>
<LI><A HREF="http://homepages.inf.ed.ac.uk/wadler/papers/frege/frege.pdf">Proofs are Programs: 19th Century Logic and 21st Century Computing</A> - Philip Wadler</LI>
<LI><A HREF="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.101.4131">Data types à la carte</A> - Wouter Swierstra</LI>
<LI><A HREF="http://dl.acm.org/citation.cfm?id=2364529">Agda-curious?: an exploration of programming with dependent types</A> - Conor McBride</LI>
<LI><A HREF="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.6279">Ideal Hash Trees</A> - Phil Bagwell</LI>
</UL><br />
<H1>a few good books ..</H1><br />
This is also a different list than what I expected at the beginning of the year. It's mostly due to my focus on machine learning and AI related topics that I took up in the course of my journey.<br />
<br />
<UL> <LI><A HREF="http://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X">Causality: Models, Reasoning and Inference</A> - Judea Pearl</LI>
<LI><A HREF="http://www.amazon.com/Probabilistic-Graphical-Models-Principles-Computation/dp/0262013193">Probabilistic Graphical Models: Principles and Techniques</A> - Daphne Koller</LI>
<LI><A HREF="http://www.amazon.com/World-Without-Time-Forgotten-Einstein/dp/0465092942">A World Without Time: The Forgotten Legacy of Godel and Einstein</A> - Palle Yourgrau</LI>
<LI><A HREF="http://www.amazon.com/Neural-Networks-Pattern-Recognition-Christopher/dp/0198538642">Neural Networks for Pattern Recognition</A> - Christopher M. Bishop</LI>
<LI><A HREF="http://www.amazon.com/Turing-Novel-Computation-Christos-Papadimitriou/dp/0262661918">Turing (A Novel about Computation)</A> - Christos H. Papadimitriou</LI>
<LI><A HREF="http://www.cs.cmu.edu/~rwh/plbook/book.pdf">Practical Foundations for Programming Languages</A> - Robert Harper (Read the freely available ebook)</LI>
<LI><A HREF="http://www.amazon.com/Haskell-Functional-Programming-International-Computer/dp/0201882957">Haskell: The Craft of Functional Programming (3rd Edition)</A> - Simon Thompson</LI>
</UL><br />
<H1>3 conferences in a year - my best so far ..</H1><br />
This was my second consecutive <a href="http://phillyemergingtech.com">PhillyETE</a> and I spoke on a functional approach to domain modeling. In case you missed it earlier, here's the <A HREF="http://www.slideshare.net/debasishg/functional-and-event-driven-another-approach-to-domain-modeling">presentation</A> up on slideshare. I combined my PhillyETE trip with a trip to London to attend <a href="http://days2012.scala-lang.org/">ScalaDays 2012</a>. It was a great experience listening to 3 keynotes by Guy Steele, Simon Peyton Jones and Martin Odersky. I also got to meet a number of my friends on Twitter who rock the Scala community .. it was so nice getting to know the faces behind the Twitter handles. Later in the year I also spoke at <a href="http://qconnewyork.com/">QCon NY</a> on the topic of <A HREF="http://www.slideshare.net/debasishg/qconny-12">domain modeling in the functional world</A>.<br />
<br />
One of the conferences that I plan to attend in 2013 is <a href="https://thestrangeloop.com/">The Strange Loop</a>. It has been one of the awesomest conferences so far and I hate to be tracking it only on Twitter every year. Keeping fingers crossed .. <br />
<br />
<H1>some plans for 2013 (Not Resolutions though!)</H1><br />
In no specific order ..<br />
<br />
<UL> <LI>Do more Haskell - Haskell is surely in for some big renaissance. I just spent a couple of hours of very productive time watching Edward Kmett explain his new lens library.</LI>
<LI>Continue Scala - obviously .. needs no explanation ..</LI>
<LI>Explore more of machine learning and work on at least one concrete project. I am now reading 2 solid books on ML theory - as I mentioned above I feel a bit uncomfortable learning implementations without any of the theory that drives them. While both of these books assume a solid background in mathematics and statistics (which I don't have), I am trying to crawl through them, greedily accumulating as much of the necessary math background as I can. Wish I could take a sabbatical leave for a year and finish both of them.</LI>
</UL><br />
Guess that's all .. See you all on the other side of 2013. Happy holidays and have a gorgeous 2013.http://debasishg.blogspot.com/2012/12/the-best-of-2012-looking-back.html[email protected] (Anonymous)2tag:blogger.com,1999:blog-22587889.post-6185794315412664736Mon, 01 Oct 2012 06:03:00 +00002012-10-01T11:41:06.820+05:30functionalscalaTowards better refactoring support in IDEs for functional programming<br />
A couple of days back I was thinking about how we could improve the state of IDEs and let them give richer feedback to users, focusing on code improvement and refactoring. When you write Java programs, IntelliJ IDEA is possibly the best IDE you can have, with an extensive set of refactoring tools. But most of these are targeted towards reducing boilerplate and refactor based on structural changes only (e.g. replace constructor with builder, or use interfaces wherever possible, etc.). Can we improve the state of the art when we are using a semantically richer language like Scala or Haskell ? How about suggesting some idioms that relate to the paradigm of functional programming and exploring some of the best practices suggested by the masters of the trade ?<br />
<br />
I always find many programmers use <code>Option.get</CODE> in Scala when they should be using higher order structures instead. Here's an example of the anti-pattern ..<br />
<br />
<pre class="brush: scala">if oName.isDefined oName.get.toUpperCase else "Name not present"</pre><br />
which should ideally be refactored to <br />
<br />
<pre class="brush: scala">oName.map(_.toUpperCase).getOrElse("Name not present")</pre><br />
The advantage with the latter is that it's more composable and can be pipelined to other higher order functions down the line without introducing any temporary variables.<br />
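<br />
For example (a made-up snippet, just to illustrate the point), the refactored form threads straight into a larger pipeline without any <code>isDefined</CODE>/<code>get</CODE> or temporary variables ..<br />
<br />
<pre class="brush: scala">val oName: Option[String] = Some("debasish")

// the Option flows through the pipeline; the default kicks in only at the end
val greeting: String =
  oName.map(_.trim)
       .filter(_.nonEmpty)
       .map(n => "Hello, " + n.toUpperCase)
       .getOrElse("Name not present")
</pre><br />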
<br />
Many people also use pattern matching for destructuring an <code>Option</CODE> ..<br />
<br />
<pre class="brush: scala">oName match {
case Some(n) => n.toUpperCase
case _ => "Name not present"
}
</pre><br />
This is also not what the idioms espouse, and it's almost equivalent to explicit null checking.<br />
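<br />
As an aside, if I remember correctly Scala 2.10 adds a <code>fold</CODE> on <code>Option</CODE> that expresses the same intent directly (assuming you are on 2.10 or later) ..<br />
<br />
<pre class="brush: scala">// default value first, then the function applied if a value is present
oName.fold("Name not present")(_.toUpperCase)
</pre><br />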
<br />
Can an IDE suggest such refactoring ? I <a HREF="https://twitter.com/debasishg/status/251891770424688640">tweeted</A> about my thoughts and almost immediately <a HREF="http://twitter.com/m_st">Mirko</A>, who is doing some great stuff on the Scala IDE, added this to his list of ideas. Nice!<br />
<br />
Here's another idea. How about suggesting some program transformations for functional programming ? I know Scala is not a pure functional language and we don't have effect tracking. I have seen people use side-effecting functions in <code>map</CODE>, which, I know, is a strict no-no. But again, the compiler doesn't complain. Still, we can have the IDE suggest some refactorings, assuming the programmer sticks to purity as recommended. After all, these are only suggestions, and they can enlighten a programmer about how to model code based on best practices.<br />
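<br />
As a contrived example (mine, not from any existing IDE inspection), such a suggestion could flag a side-effecting function inside <code>map</CODE> and propose <code>foreach</CODE> when the side effect is the whole point ..<br />
<br />
<pre class="brush: scala">val names = List("martin", "rich", "simon")

// anti-pattern: map used only for its side effect, the results are discarded
names.map(n => println(n.toUpperCase))

// intent is clearer with foreach, which returns Unit
names.foreach(n => println(n.toUpperCase))
</pre><br />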
<br />
Consider this ..<br />
<br />
<pre class="brush: scala">(-1000000 to 1000000).map(_ + 1).filter(_ > 0)</pre><br />
Here <code>map</CODE> is executed on all the elements and then <code>filter</CODE> processes the whole output. One option to optimize would be to defer the computations using <code>view</CODE> ..<br />
<br />
<pre class="brush: scala">(-1000000 to 1000000).view.map(_ + 1).filter(_ > 0)</pre><br />
But this doesn't reduce the load of computation on <code>map</CODE>. It just prevents the creation of a temporary list as an output of the <code>map</CODE>. We can apply a transformation called <b>filter promotion</B> that does the filtering first so that <code>map</CODE> can operate on a potentially smaller list. Of course this assumes that both functions are *pure* so that operations can be reordered - but that's what we should anyway ensure while doing functional programming .. right ? Here's the transformation ..<br />
<br />
<pre class="brush: scala">(-1000000 to 1000000).filter(((_: Int) > 0) compose ((_: Int) + 1)).map(_ + 1)</pre><br />
which can be further simplified into .. <br />
<br />
<pre class="brush: scala">(-1000000 to 1000000).filter((_: Int) >= 0).map(_ + 1)</pre><br />
.. and this has the same computational load on <code>filter</CODE> but potentially less load on <code>map</CODE>, since it now has to process a filtered list.<br />
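<br />
A quick sanity check (not a proof, just a spot check on this concrete range) shows that the promoted pipeline computes the same result ..<br />
<br />
<pre class="brush: scala">val xs = (-1000000 to 1000000)

val original = xs.map(_ + 1).filter(_ > 0)
val promoted = xs.filter(_ >= 0).map(_ + 1)

assert(original == promoted)   // same elements, but map now sees a smaller input
</pre><br />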
<br />
Can we have such transformations available in the IDE ? I am a faithful user of vim, but with such enrichments I may consider going back to an IDE. This really has no limits. Consider Wadler's <a HREF="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875">free theorems</A>. Many times we tend to write functions whose arguments are more specialized in types than what's required. Generalization of such functions using polymorphic arguments can lead to better abstraction and verification through free theorems. An IDE can suggest all these and may also help generate properties for property-based testing using tools like QuickCheck and ScalaCheck. But that's a discussion for another day ..<br />
<br />
P.S. Simon Thompson's <i>Haskell: The Craft of Functional Programming</I> (3rd edition) discusses lots of similar transformations in the various chapters on reasoning about Haskell programs.<br />
http://debasishg.blogspot.com/2012/10/towards-better-refactoring-support-in.html[email protected] (Anonymous)6tag:blogger.com,1999:blog-22587889.post-3953733830027547453Fri, 10 Aug 2012 03:52:00 +00002012-08-10T09:24:07.036+05:30category_theoryfunctionalhaskellscalaCategory Theory and Programming - Reinforcing the value of generic abstractionsOk .. so my last <a HREF="http://debasishg.blogspot.in/2012/07/does-category-theory-make-you-better.html">post</A> on the probable usefulness of Category Theory in programming generated an overwhelming response all over the place. It initiated lots of discussions, which are always welcome and offer a great conduit towards enlightenment for everyone. From all these discussions, we can summarize that there's no boolean answer to whether category theory has any usefulness in making you a better programmer. Category theory is abstract, probably more abstract than what a normal programmer's mind can comprehend. Here's what one <a HREF="https://news.ycombinator.com/item?id=4314522">mathematician's bias</A> tells us on HN ..<br />
<br />
<blockquote>"This may be a mathematician's bias, but Category Theory is somewhat underwhelming because you can't really do anything with it. It's actually too abstract--you can model basically every branch of mathematics with Categories, but you can't actually prove very much about them from this perspective. All you get is a different vocabulary to talk about results that you already knew. It's also less approachable because its objects of study are other branches of mathematics, whereas most other branches have number, vectors, and functions as standard objects."</blockquote><br />
Now let me digress a bit towards personal experience. I don't have a strong math background and have only started learning category theory. I don't get most of the mathy stuff that category theorists talk about, since I cannot relate it to my programming world. But the subset that Pierce talks about, and which I can relate to when I write code, gives me an alternate perspective from which to think about functional abstractions. And, as I mentioned in my last post, category theory focuses more on the morphisms than on the objects. When I am programming in a functional language, this has a strong parallel in the emphasis on how abstractions compose. Specifically, point-free style composition takes its inspiration from category theory and the way it focuses on manipulating arrows across objects.<br />
<br />
<h1>Focusing on Generic Abstractions</H1>Category theory gives us Functors, Applicatives, Monads, Semigroups, Monoids and a host of other abstractions which we use extensively in our everyday programming. Or at least I try to. These have taught me to think in terms of generic abstractions. I will give you a practical example from my personal experience that served as a great eye opener towards appreciating the value of generic abstractions ..<br />
<br />
I program mostly in Scala and sometimes in Haskell. And one thing I need to do frequently (and it's not at all something unique to me) is apply a function <code>f</CODE> to a collection of elements (say of type <code>Foo</CODE>) with the caveat that the function may sometimes return no result. In Scala the obvious way to model this is for the function to return an <code>Option</CODE> type. Just for the record, <code>Option</CODE> is the equivalent of <code>Maybe</CODE> in Haskell and can be one of the two concrete types, <code>Some</CODE> (indicating the presence of a value) and <code>None</CODE> (indicating no value). So, applying <code>f: Foo => Option[Bar]</CODE> over the elements of a <code>List[Foo]</CODE> gives me <code>List[Option[Bar]]</CODE>. My requirement was to convert the result into an <code>Option[List[Bar]]</CODE>, which would have a value only if all the <code>Foo</CODE>s in the <code>List</CODE> gave me <code>Some[Bar]</CODE>, and <code>None</CODE> otherwise. Initially I implemented this function as a specialized utility for the <code>Option</CODE> data type only. But then I realized that this function could very well be applicable to many other data types like <code>Either</CODE>, <code>List</CODE> etc. And this aha! moment came courtesy of <a HREF="https://github.com/scalaz/scalaz/">Scalaz</A> which has a function <code><a HREF="https://github.com/scalaz/scalaz/blob/master/core/src/main/scala/scalaz/MA.scala#L124">sequence</A></CODE> that does the exact same thing and works for all <code>Traversable</CODE> data structures (not just <code>List</CODE>s) containing anything that's an instance of an Applicative Functor. The Scala standard library doesn't have generic abstractions like <code>Monad</CODE>, <code>Monoid</CODE> or <code>Semigroup</CODE> and I constantly find myself reinventing them within the real world domain models that I program. No wonder I have Scalaz as a default dependency in almost all of my Scala projects these days. <br />
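<br />
Here's roughly what that looks like with Scalaz (a sketch; the exact imports can vary a bit across Scalaz versions) ..<br />
<br />
<pre class="brush: scala">import scalaz._
import Scalaz._

val allPresent: List[Option[Int]] = List(Some(1), Some(2), Some(3))
val oneMissing: List[Option[Int]] = List(Some(1), None, Some(3))

allPresent.sequence    // Some(List(1, 2, 3))
oneMissing.sequence    // None
</pre><br />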
<br />
So much for generic abstractions .. Now the question is do I need to know Category Theory to learn the generic abstractions like Applicative Functors or Monads ? Possibly no, but that's because some of the benevolent souls in the programming community have translated all those concepts and abstractions to a language that's more approachable to a programmer like me. At the same time knowing CT certainly gives you a broader perspective. Like the above quote says, you can't pinpoint and say that I am using this from my category theory knowledge. But you can always get a categorical perspective and a common vocabulary that links all of them to the world of mathematical composition.<br />
<br />
<h1>Natural Transformation, Polymorphism and Free Theorems</H1>Consider yet another aha! moment that I had some time back when I was trying to grok <a HREF="http://en.wikipedia.org/wiki/Natural_transformation">Natural Transformations</A>, which are really morphisms between Functors. Suppose I have 2 functors <i>F</I> and <i>G</I>. Then a natural transformation η<i>: F → G</I> is a map that takes an object (in Haskell, a type, say, <i>a</I>) and returns a morphism from <i>F(a)</I> to <i>G(a)</I>, i.e. <br />
<br />
η <i>:: forall a. F a → G a</I><br />
<br />
Additionally, a natural transformation also needs to satisfy the following constraint ..<br />
<br />
For any <i>f :: A → B</I>, we must have <i>fmap<sub>G</SUB> f . η = η . fmap<sub>F</SUB> f</I><br />
<br />
Consider an example from Haskell, the function <code>reverse</CODE> on lists, which is defined as <br />
<br />
<code>reverse :: [a] -> [a]</CODE>, which we can more specifically state as <code>reverse :: forall a. [a] -> [a]</CODE> ..<br />
<br />
Comparing <code>reverse</CODE> with our earlier definition of η, we have <i>F ≡ []</I> and <i>G ≡ []</I>. And since for lists, <code>fmap</CODE> = <code>map</CODE>, we have the other criterion for natural transformation as <i>map f . reverse = reverse . map f</I>. And this is the <a HREF="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875">free theorem</A> for reverse! <br />
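<br />
The naturality condition is also easy to spot-check from Scala (a quick illustration with made-up values, not a proof) ..<br />
<br />
<pre class="brush: scala">val xs = List(1, 2, 3, 4)
val f: Int => String = _.toString

// map f . reverse == reverse . map f
assert(xs.reverse.map(f) == xs.map(f).reverse)
</pre><br />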
<br />
Not only this, we can get more insight from the application of natural transformation. Note the <i>forall</I> annotation above. It literally means that η (or <code>reverse</CODE>) works for <i>any</I> type <code>a</CODE>. That is, the function is not allowed to peek inside the arguments and its behavior does not depend on the <i>exact</I> type of <code>a</CODE>. We can freely substitute one type for another and yet have the same behavior of the function. Hence a natural transformation is all about polymorphic functions.<br />
<br />
<h1>Summary</H1>I agree that Category Theory is not a must read to do programming, particularly if you are not using a typed functional language. But often I find a category diagram or a categorical analysis reveals a lot of insight about an abstraction, particularly with respect to how it composes with other similar abstractions in the world.<br />http://debasishg.blogspot.com/2012/08/category-theory-and-programming.html[email protected] (Anonymous)1tag:blogger.com,1999:blog-22587889.post-5226755050089626353Mon, 30 Jul 2012 04:55:00 +00002012-08-10T09:24:42.341+05:30category_theoryfunctionalhaskellscalaDoes category theory make you a better programmer ?How much of category theory knowledge should a working programmer have ? I guess this depends on what kind of language the programmer uses in his daily life. Given the proliferation of functional languages today, specifically typed functional languages (Haskell, Scala etc.) that embeds the typed lambda calculus in some form or the other, the question looks relevant to me. And apparently to <a HREF="http://www.cs.toronto.edu/~sme/presentations/cat101.pdf">a</A> <a HREF="http://reperiendi.wordpress.com/2007/11/03/category-theory-for-the-java-programmer/">few</A> <a HREF="http://lambda-the-ultimate.org/node/2604">others</A> as well. In one of his courses on <a HREF="http://www.cs.nott.ac.uk/~gmh/cat.html">Category Theory</A>, Graham Hutton mentioned the following points when talking about the usefulness of the theory : <br />
<br />
<ul><li>Building bridges—exploring relationships between various mathematical objects, e.g., Products and Functions</LI>
<li>Unifying ideas - abstracting from unnecessary details to give general definitions and results, e.g., Functors</LI>
<li>High level language - focusing on how things behave rather than what their implementation details are e.g. specification vs implementation</LI>
<li>Type safety - using types to ensure that things are combined only in sensible ways, e.g. (f: A -> B, g: B -> C) => (g o f: A -> C)</LI>
<li>Equational proofs—performing proofs in a purely equational style of reasoning</LI> </UL>Many of the above points can be related to the experience that we encounter while programming in a functional language today. We use <i>Product </I>and <i>Sum </I>types, we use Functors to abstract our computation, we marry types together to encode domain logic within the structures that we build and many of us use <a HREF="http://www.amazon.com/Pearls-Functional-Algorithm-Design-Richard/dp/0521513383">equational reasoning</A> to optimize algorithms and data structures.<br />
<br />
But how much do we need to care about how category theory models these structures and how that model maps to the ones that we use in our programming model ?<br />
<br />
Let's start with the classical definition of a Category. [<a HREF="http://www.amazon.com/Category-Computer-Scientists-Foundations-Computing/dp/0262660717/ref=la_B001IR3BUA_1_3?ie=UTF8&qid=1343622949&sr=1-3">Pierce</A>] defines a Category as comprising of:<br />
<br />
<ol><li>a collection of <b>objects</B></LI>
<li>a collection of <b>arrows </B>(often called <b>morphisms</B>)</LI>
<li>operations assigning to each arrow <i>f</I> an object <i>dom f</I>, its <b>domain</B>, and an object <i>cod f</I>, its <b>codomain </B>(<i>f: A → B</I>, where <i>dom f = A</I> and <i>cod f = B</I>)</LI>
<li>a composition operator assigning to each pair of arrows <i>f</I> and <i>g</I> with <i>cod f = dom g</I>, a composite arrow <i>g o f: dom f → cod g</I>, satisfying the following associative law:</LI> for any arrows <i>f: A → B</I>, <i>g: B → C</I>, and <i>h: C → D</I>, <i>h o (g o f) = (h o g) o f</I>
<li>for each object <i>A</I>, an <b>identity</B> arrow <i>id<sub>A</SUB>: A → A</I> satisfying the following identity law:</LI> for any arrow <i>f: A → B</I>, <i>id<sub>B</SUB> o f = f </I>and <i>f o id<sub>A</SUB> = f</I> </OL><h1>Translating to Scala</h1><br />
Ok, let's see how this definition can be mapped to your daily programming chores. If we consider Haskell, there's a category of Haskell types called Hask, which forms the collection of objects of the Category. For this post, I will use Scala, and for all practical purposes assume that we use Scala's pure functional capabilities. In our model we consider the Scala types to be the objects of our category. <br />
<br />
You define any function in Scala from <i>type A</I> to <i>type B</I> (<code>A => B</CODE>) and you have an example of a morphism. For every function we have a domain and a co-domain. In our example, <code>val foo: A => B = //..</CODE> we have the <i>type A</I> as the domain and the <i>type B</I> as the co-domain.<br />
<br />
Of course we can define composition of arrows or functions in Scala, as can be demonstrated with the following REPL session ..<br />
<br />
<pre class="brush: scala">scala> val f: Int => String = _.toString
f: Int => String = <function1>
scala> val g: String => Int = _.length
g: String => Int = <function1>
scala> f compose g
res23: String => String = <function1>
</pre><br />
and it's very easy to verify that the composition satisfies the associative law.<br />
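<br />
For instance, throwing in a third (made-up) arrow <code>h</CODE> and checking pointwise ..<br />
<br />
<pre class="brush: scala">val h: Int => Double = _ / 2.0

val lhs = (h compose g) compose f   // Int => Double
val rhs = h compose (g compose f)   // Int => Double

// functions can't be compared directly, so check them on a sample value
assert(lhs(42) == rhs(42))
</pre><br />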
<br />
And now the <b>identity </B>law, which is, of course, about composing with a special kind of arrow. Let's define some functions and play around with the identity in the REPL ..<br />
<br />
<pre class="brush: scala">scala> val foo: Int => String = _.toString
foo: Int => String = <function1>
scala> val idInt: Int => Int = identity(_: Int)
idInt: Int => Int = <function1>
scala> val idString: String => String = identity(_: String)
idString: String => String = <function1>
scala> idString compose foo
res24: Int => String = <function1>
scala> foo compose idInt
res25: Int => String = <function1>
</pre><br />
Ok .. so we have the identity law of the Category verified above.<br />
<br />
<h1>Category theory & programming languages</h1><br />
Now that we understand the most basic correspondence between category theory and programming language theory, it's time to dig a bit deeper into some of the implicit correspondences. We will definitely come back to the more explicit ones very soon when we talk about products, co-products, functors and natural transformations.<br />
<br />
Do you really think that understanding category theory helps you understand programming language theory better ? It all depends on how much of the *theory* you really care about. If you are doing enterprise software development and/or really don't care to learn a language outside your comfort zone, then possibly you will come back with a resounding *no* as the answer. Category theory is a subject that provides a uniform model of set theory, algebra, logic and computation. And many of the concepts of category theory map quite nicely to structures in programming (particularly in a language that offers a decent type system and preferably has some underpinnings of the typed lambda calculus).<br />
<br />
Categorical reasoning helps you reason about your programs, if they are written using a typed functional language like Haskell or Scala. Some of the basic structures that you encounter in your everyday programming (like <i>Product </I>types or <i>Sum </I>types) have their correspondences in category theory. Analyzing them from CT point of view often illustrates various properties that we tend to overlook (or take for granted) while programming. And this is not coincidental. It has been shown that there's indeed a strong <a HREF="http://scienceblogs.com/goodmath/2006/08/10/from-lambda-calculus-to-cartes/">correspondence</A> between typed lambda calculus and cartesian closed categories. And Haskell is essentially an encoding of the typed lambda calculus.<br />
<br />
Here's an example of how we can explain the properties of a data type in terms of its categorical model. Consider the category of Products of elements and for simplicity let's take the example of cartesian products from the category of Sets. A cartesian product of 2 sets <i>A</I> and <i>B</I> is defined by:<br />
<br />
<i>A X B = {(a, b) | a ∈ A and b ∈ B}</I><br />
<br />
So we have the tuples as the objects in the category. What could be the relevant morphisms ? In the case of products, the applicable arrows (or morphisms) are the <b>projection functions</B> <i>π<sub>1</SUB>: A X B → A</I> and <i>π<sub>2</SUB>: A X B → B</I>. Now draw the category diagram with the product <i>A X B</I> and any other object <i>C</I> that comes with 2 functions <i>f: C → A</I> and <i>g: C → B</I>. The pairing function into the product is represented by <i><f, g>: C → A X B</I> and is defined as <i><f, g>(x) = (f(x), g(x))</I>. Here's the diagram corresponding to the above category ..<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTkuDaRGIuHMZzDkYe6q1H4c0rvplMmTcpKYHLKq0V3pPASLEsGlK-nCTw7uJxyQcCTx43HuJI5CxnceyYwK5mUXJ2r3cJ1X_02nnPZKZZ75sOZEEg5w_ov9Gxj6l-0D0z_hP8gw/s1600/product.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="300" width="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTkuDaRGIuHMZzDkYe6q1H4c0rvplMmTcpKYHLKq0V3pPASLEsGlK-nCTw7uJxyQcCTx43HuJI5CxnceyYwK5mUXJ2r3cJ1X_02nnPZKZZ75sOZEEg5w_ov9Gxj6l-0D0z_hP8gw/s400/product.jpg" /></a></div><br />
and according to the category theory definition of a Product, the above diagram commutes. Note, by commuting we mean that for every pair of vertices <i>X</I> and <i>Y</I>, all paths in the diagram from <i>X</I> to <i>Y</I> are equal in the sense that each path forms an arrow and these arrows are equal in the category. So here commutativity of the diagram gives <br />
<i>π<sub>1</SUB> o <f, g> = f</I> and <br />
<i>π<sub>2</SUB> o <f, g> = g</I>.<br />
<br />
Let's now define each of the functions above in Scala and see how the commutativity of the above diagram maps to the programming domain. As programmers we use the projection functions (<code>_1</CODE> and <code>_2</CODE> in Scala's <code>Tuple2</CODE> or <code>fst</CODE> and <code>snd</CODE> on a Haskell <code>Pair</CODE>) on a regular basis. The above category diagram, as we will see, gives some additional insights into the abstraction and helps us understand some of the mathematical properties of how a cartesian product of Sets translates to the composition of functions in the programming model.<br />
<br />
<pre class="brush: scala">scala> val ip = (10, "debasish")
ip: (Int, java.lang.String) = (10,debasish)
scala> val pi1: ((Int, String)) => Int = (p => p._1)
pi1: ((Int, String)) => Int = <function1>
scala> val pi2: ((Int, String)) => String = (p => p._2)
pi2: ((Int, String)) => String = <function1>
scala> val f: Int => Int = (_ * 2)
f: Int => Int = <function1>
scala> val g: Int => String = _.toString
g: Int => String = <function1>
scala> val `<f, g>`: Int => (Int, String) = (x => (f(x), g(x)))
<f, g>: Int => (Int, String) = <function1>
scala> pi1 compose `<f, g>`
res26: Int => Int = <function1>
scala> pi2 compose `<f, g>`
res27: Int => String = <function1>
</pre><br />
So, as we claim from the commutativity of the diagram, we see that <code>pi1 compose `<f, g>`</CODE> has the same type as <code>f</CODE> and <code>pi2 compose `<f, g>`</CODE> has the same type as <code>g</CODE>. Now the definition of a Product in Category Theory says that the morphism between <i>C</I> and <i>A X B</I> is unique and that <i>A X B</I> is defined up to isomorphism. And the uniqueness is indicated by the symbol <b>!</B> in the diagram. I am going to skip the proof, since it's quite trivial and follows from the definition of what a Product of 2 objects means. This makes sense intuitively in the programming model as well: we can have one unique type consisting of the Pair of A and B.<br />
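<br />
We can also check the commutativity pointwise on a sample value (again, just an illustration in the spirit of the REPL session above) ..<br />
<br />
<pre class="brush: scala">// the two paths in the diagram agree on a sample input
assert((pi1 compose `<f, g>`)(7) == f(7))   // both give 14
assert((pi2 compose `<f, g>`)(7) == g(7))   // both give "7"
</pre><br />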
<br />
Now for some differences in semantics between the categorical model and the programming model. If you consider an eager (or eager-by-default) language like Scala, the Product type fails miserably in the presence of the <a HREF="http://james-iry.blogspot.in/2009/08/getting-to-bottom-of-nothing-at-all.html">Bottom</A> data type (_|_), represented by Nothing. For Haskell, the non-strict language, it also fails when we consider the fact that a Product type needs to satisfy the equation <code>(fst(p), snd(p)) == p</CODE> and we substitute Bottom (_|_) for <code>p</CODE>. So, the programming model remains true only when we eliminate the Bottom type from the equation. Have a look at this <a HREF="http://james-iry.blogspot.in/2011/05/why-eager-languages-dont-have-products.html#comment-201436254">comment from Dan Doel</A> in James Iry's blog post on sum and product types.<br />
<br />
This is an instance where a programmer can benefit from knowledge of category theory. It's actually a bidirectional win-win: knowledge of category theory helps you understand data types in real life programming, and working with those data types deepens your grasp of the theory. <br />
<br />
<h1>Interface driven modeling</h1><br />
One other aspect where category theory maps very closely with the programming model is its focus on the arrows rather than the objects. This corresponds to the notion of an <i>interface</I> in programming. Category theory typically <i>"abstracts away from elements, treating objects as black boxes with unexamined internal structure and focusing attention on the properties of arrows between objects"</I> [<a HREF="http://www.amazon.com/Category-Computer-Scientists-Foundations-Computing/dp/0262660717/ref=la_B001IR3BUA_1_3?ie=UTF8&qid=1343622949&sr=1-3">Pierce</A>]. In programming also we encourage interface driven modeling, where the implementation is typically abstracted away from the client. When we talk about objects upto isomorphism, we focus solely on the arrows rather than what the objects are made of. Learning programming and category theory in an iterative manner serves to enrich your knowledge on both. If you know what a Functor means in category theory, then when you are designing something that looks like a Functor, you can immediately make it generic enough so that it composes seamlessly with all other functors out there in the world.<br />
<br />
<h1>Thinking generically</h1><br />
Category theory talks about objects and morphisms and how arrows compose. A special kind of morphism is <b>Identity</B> morphism, which maps to the Identity function in programming. This is 0 when we talk about addition, 1 when we talk about multiplication, and so on. Category theory generalizes this concept by using the same vocabulary (morphism) to denote both stuff that <i>do</I> some operations and those that <i>don't</I>. And it sets this up nicely by saying that for every object <i>X</I>, there exists a morphism <i>id<sub>X</SUB> : X → X</I> called the identity morphism on <i>X</I>, such that for every morphism <i>f: A → B</I> we have <i>id<sub>B</SUB> o f = f = f o id<sub>A</SUB></I>. This (the concept of a generic zero) has been a great lesson at least for me when I identify structures like monoids in my programming today.<br />
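<br />
Here's a minimal sketch of that idea (my own toy definition, not the Scalaz one) where <code>zero</CODE> plays the role of the identity for the operation ..<br />
<br />
<pre class="brush: scala">trait Monoid[A] {
  def zero: A                      // the identity element
  def append(a1: A, a2: A): A      // an associative binary operation
}

val intAddition = new Monoid[Int] {
  val zero = 0
  def append(a1: Int, a2: Int) = a1 + a2
}

val stringConcat = new Monoid[String] {
  val zero = ""
  def append(a1: String, a2: String) = a1 + a2
}

// identity laws: append(zero, x) == x == append(x, zero)
assert(intAddition.append(intAddition.zero, 42) == 42)
assert(stringConcat.append("scala", stringConcat.zero) == "scala")
</pre><br />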
<br />
<h1>Duality</h1><br />
In the programming model, many dualities are not explicit. Category theory has an explicit way of teaching you the dualities in the form of category diagrams. Consider the example of the Sum type (also known as a Coproduct) and the Product type. We have an abundance of these in languages like Scala and Haskell, but programmers, particularly people coming from the imperative programming world, are often not aware of this duality. But have a look at the category diagram of the sum type <i>A + B</I> for objects <i>A</I> and <i>B</I> ..<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuch-6_tJUkAhtZn8tN3EyaD6GB7rxo-3p1oXJ_1JYwnSsYZBNEdlczThVuKi1o2PsajP-LDIC5oHobxgpb5sDpfLx47VuX1x8sb1N2bo45LEm7XS1BYUrG3880UcyamQZJdjIng/s1600/sum.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="300" width="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuch-6_tJUkAhtZn8tN3EyaD6GB7rxo-3p1oXJ_1JYwnSsYZBNEdlczThVuKi1o2PsajP-LDIC5oHobxgpb5sDpfLx47VuX1x8sb1N2bo45LEm7XS1BYUrG3880UcyamQZJdjIng/s400/sum.jpg" /></a></div><br />
It's the same diagram as the Product, only with the arrows reversed. Indeed a Sum type <i>A + B</I> is the categorical dual of the Product type <i>A X B</I>. In Scala we model it as a union type like <code>Either</CODE>, where the value of the sum type comes either from the left or from the right. Studying the category diagram and deriving the properties that come out of its commutativity helps you understand a lot of the theory behind the design of the data type.<br />
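<br />
A tiny example with <code>Either</CODE> (made up for illustration) shows the duality in code: constructing a sum means injecting from one of the two sides, and consuming it means handling both ..<br />
<br />
<pre class="brush: scala">// the two injections into the sum type
val fromLeft:  Either[Int, String] = Left(10)
val fromRight: Either[Int, String] = Right("debasish")

// consuming the sum type means providing one handler per alternative
def show(e: Either[Int, String]): String = e match {
  case Left(i)  => "from the left: " + i
  case Right(s) => "from the right: " + s
}

show(fromLeft)    // "from the left: 10"
show(fromRight)   // "from the right: debasish"
</pre><br />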
<br />
In the next part of this discussion I will explore some other structures like Functors and Natural Transformations and how they map to important concepts in programming which we use on a daily basis. So far, my feeling has been that if you use a typed functional language, a basic knowledge of category theory helps a lot in designing generic abstractions and making them compose with related ones out there in the world.http://debasishg.blogspot.com/2012/07/does-category-theory-make-you-better.html[email protected] (Anonymous)23