Introductory TyDD in Scala: Deconstructing complex Types

These types are ugly and cumbersome; they are not at all human readable. How do we get data out of such a type and format it in a way that is useful to the operator? Let's again start with a naive example.

def stringify1[A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: List[(A, (B, C))]): String = {
  in.map{case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }.mkString("(", "; ", ")")
}

Here we take a List of nested pairs and return a string that is hopefully more human readable than the toString method would provide.

Abstract the F

Like before, we will abstract the F from our function that it may be used with any type constructor of arity 1 rather than hard-coded to List.

import cats.Functor
val functorList = new Functor[List]{
  override def map[A, B](fa: List[A])(f: A => B): F[B] =
    fa.map(f)
}
def stringify2[F[_]: Functor, A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: F[(A, (B, C))]): String = {
  val F = implicitly[Functor[F]]
  F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  ???
}

The cats library provides a nifty Functor typeclass for us so, we can abstract the map call pretty easily. What of the mkString? As it turns out, cats provides a typeclass for this as well! It is called Show; let's see how it works.

import cats.Show
def stringify3[F[_]: Functor, A, B, C](
  fa: A => String, fb: B => String, fc: C => String,
  in: F[(A, (B, C))])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val result = F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  FS.show(result)
}

And extending this idea of Show to the Function1 instances we have

def stringify4[F[_]: Functor, A: Show, B: Show, C: Show](
  in: F[(A, (B, C))])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  val fc = implicitly[Show[C]].show _
  val result = F.map(in){case (a, (b, c)) =>
    fa(a) + ", " + fb(b) + ", " + fc(c)
  }
  FS.show(result)
}

Abstracting over Arity

We'll attempt to build a recursive version of this function using the same principled we've used in previous posts.

def stringify5[F[_]: Functor, A: Show, B: Show](
  in: F[(A, B)])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  val result = F.map(in){case (a, b) =>
    fa(a) + ", " + fb(b)
  }
  FS.show(result)
}

This is not what we want! We need to recurse on the Show instance inside the Functor. Let's make a recursive function for Show. This will follow the code we read from Shapeless very closely.

implicit def makeShow[A: Show, B: Show]: Show[(A, B)] = {
  val fa = implicitly[Show[A]].show _
  val fb = implicitly[Show[B]].show _
  new Show[(A, B)]{
    override def show(t: (A, B)): String = {
      val (a, b) = t
      "(" + fa(a) + ", " + fb(b) + ")“
    }
  }
}

This says, Given any two Show instances, a Show instances for their pair can be produced. Alternatively, it can be said that given a proof for Show[A] and a proof for Show[B] a proof for Show[(A, B)] follows.

So, now we have the following stringify function:

def stringify[F[_]: Functor, A: Show](
  in: F[A])(implicit
  FS: Show[F[String]]): String = {
  val F = implicitly[Functor[F]]
  val fa = implicitly[Show[A]].show _
  val result = F.map(in)(fa)
  FS.show(result)
}

Notice how simple our code has become. All of the specific type information has been absorbed into the recursive implicit functions. This is simply the Show instance for Functor itself. Furthermore, our makeShow function will produce Show instances for any nesting of Tuple2 instances; it generates Show instances for binary trees. The makeShow function is 10 lines of code (even including the Scala boilerplate) and gave us a giant boost in usefulness for our library code.

What's the Point?

Let's see an example of how this can be used. Given our individual Show instances:

implicit val ShowListString = new Show[List[String]]{
  def show(in: List[String]): String =
    in.mkString("(", "; ", ")")
}
implicit val showInt = new Show[Int]{
  override def show(in: Int): String = in.toString}
implicit val showLong = new Show[Long]{
  override def show(in: Long): String = in.toString}
implicit val showString = new Show[String]{
  override def show(in: String): String = in}
implicit val showDouble = new Show[Double]{
  override def show(in: Double): String = f"$in%.2f"}
implicit val showFloat = new Show[Float]{
  override def show(in: Float): String = f"$in%.2f"}
implicit val showArrayByte = new Show[Array[Byte]]{
  override def show(in: Array[Byte]): String = new String(in)}

Our writer and reader code becomes:

//Write the thing with our writer
val result = implicitly[Result1]
//Read the thing back with our reader
println(stringify(result))

For anyone to create an application which reads and writes data with our library they need only define the specific business logic and types for their application. If they miss writing an instance, the compiler will tell them. If they rearrange the order of the types in IO, the compiler will figure it out for them. Many common pain points of writing data readers and writers have been taken care of by the library implementor giving the rest of the team the ability to focus on business logic.

The goal here is not to have an entire company of developers who write this kind of code. Just like the goal of any business is to have employees who each bring something new to the team, the goal here is to have a few developers who write these libraries that other developers can use to develop business applications. The benefit of this is the libraries help limit the kinds of errors that can make it to production; the entire production cycle gets a confidence boost.

Introductory TyDD in Scala: Writing Type Level Functions

Type level functions consume types and produce types. This allows us to tell the compiler exactly what we want in our code that errors may be caught at compile time rather than at later stages in our product deployment cycle. The true power of this approach lies in the fact that with Scala no application can be produced without compilation. No matter how undisciplined a developer may be, she cannot skip compilation. Moving error checking into compilation gives us more confidence in the performance of our binary than confidence gained by any other means. In short, when there is a failure in production we always ask "were the tests run?" but never do we ask "was it compiled?"

A Basic Zipper

Here we have a zipper.

def zipper[A, B, C](
  a: List[A], b: List[B], c: List[C]
): List[(A, (B, C))] = a.zip(b.zip(c))

From previous posts in this series we know this can be generalized with a type class. Let's exercise this muscle immediately.

trait Zip[F[_]]{
  def apply[A, B](a: F[A], b: F[B]): F[(A, B)]
}
def zipper[F[_]: Zip, A, B, C](
  a: F[A], b: F[B], c: F[C]): F[(A, (B, C))] = {
  val F = implicitly[Zip[F]]
  F(a, F(b, c))
}

This is nicer than the first version but it is still super restrictive. It only works for zipping exactly 3 values. If we want to zip 2 or 4 or 70 values, we are out of luck! We saw how the shapeless HList allowed us to compose an arbitrary number of arbitrary types into a single type. Let's try to use the same kinds of methods here to produce a zipper of arbitrary arity.

Step 1 - Simplify as much as is possible

We will simplify a zipper to the purest form we can. The purest zipper takes two instances of a particular type constructor and produces a single instance of that type constructor on a pair. Let's write that.

def zipper[F[_]: Zip, H, T](
  h: F[H], t: F[T]): F[(H, T)] = {
  val F = implicitly[Zip[F]]
  F(h, t)
}

Now, we don't need separate functions for different arity versions of a zipper. We can simply call this function recursively to produce the desired result.

val (list1, list2, list3, list4, list5, list6) = ...
implicit val ZipList = new Zip[List]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[(A, B)] = a.zip(b)
}
val with2 = zipper(list1, list2)
val with3 = zipper(list1,
            zipper(list2, list3))
val with6 = zipper(list1,
            zipper(list2,
            zipper(list3,
            zipper(list4,
            zipper(list5, list6)))))

If we recall the shapeless HList code learned to read, we see the same pattern here of a recursive type being produced from recursive calls. This can be done in the compiler using implicits.

Step 2 - Replace explicit recursion with implicit

implicit def zipper[F[_]: Zip, H, T](implicit
  h: F[H], t: F[T]): F[(H, T)] = {
  val F = implicitly[Zip[F]]
  F(h, t)
}

This is the same code just the keyword implicit is placed in two locations. In scala we can promote an explicit call to an implicit call simply by adding a keyword. We inform the inputs with implicit so the compiler knows to find the head and tail by itself. We inform the function with implicit so the compiler knows to call the function implicitly if a value of the necessary type is not found.

We communicate our intent to the compiler with implicits and types. Type aliases help simplify business logic.

type F[A] = List[A]
type Result =
  F[
    (Int, (Long, (String, (Double, (Float, Array[Byte])
  ))))]

To tell the compiler which values it can use during the application of functions we inform the values as implicit

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

The implicitly function tells the compiler what it needs to execute.

implicitly[Result]

And that's it! The compiler assembles the recursive calls for us. No more errors from placing things in the wrong order; refactoring is as simple as rearranging the order of the types in our alias Result. One change on one line propagates throughout the entire code base automatically.

Why stop at Tuple2?

There is nothing here that requires that a tuple be used. In fact, the only thing we need is a type constructor of arity 2. We can express as

trait ZipG[F[_], G[_, _]]{
  def apply[A, B](a: F[A], b: F[B]): F[G[A, B]]
}
implicit def zipper[F[_], G[_, _], H, T](implicit
  F: ZipG[F, G], h: F[H], t: F[T]): F[G[H, T]] = {
  F(h, t)
}

And we can zip Tuple2s or Eithers by creating type class instances

implicit val zipListTuple2 = new ZipG[List, Tuple2]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[(A, B)] = a.zip(b)
}
implicit val zipListEither = new ZipG[List, Either]{
  override def apply[A, B](
    a: List[A], b: List[B]): List[Either[A, B]] =
    for{a <- a; b <- b}yield{
      if(a.toString.size < b.toString.size) Left(a)
      else Right(b)
    }
}

For Tuple2 we have business logic like

type F[A] = List[A]
type Result =
  F[
    (Int, (Long, (String, (Double, (Float, Array[Byte])
  ))))]

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

implicitly[Result]

This is the same as before. Commonly, further abstractions at the type level have little or no effect on code at the value level. Abstractions should allow the code to be more expressive than it was prior to the abstraction exercise never less expressive.

Changing our business logic to use Either as our Zipping class is simple

type F[A] = List[A]
type Result =
  F[
    Either[Int, Either[Long, Either[String,
    Either[Double, Either[Float, Array[Byte]]
  ]]]]]

implicit val list1: List[Int] = ???
implicit val list2: List[Long] = ???
implicit val list3: List[String] = ???
implicit val list4: List[Double] = ???
implicit val list5: List[Float] = ???
implicit val list6: List[Array[Byte]] = ???

implicitly[Result]

This is very powerful. By changing our type aliases, we were able to entirely change the meaning of our business logic without complex refactorings. As long as there are the correct type classes in implicit scope, the business logic need not be bothered by those implementation details.

What we have here is a sort of data pipeline and writer. Now that we have formatted data that we can work with in code, how do we present that data to an operator? Next, we'll write a reader for our types.

 

Introductory TyDD in Scala: Basic Type Class Development

(A more complete treatment of type classes and higer kinds.)

A simple Type Class

Take the following

trait Mapping[A, B]{
  def map(a: A): B
}

and an instance for it

val mapping: Mapping[List[Int], List[String]] =
  new Mapping[List[Int], List[String]]{
    override def map(a: List[Int]): List[String] =
      a.map(_.toString)
  }

This instance is super restrictive. It only works for taking Int into String. We want to map a List of any type. Since we know what our type parameters are, we can achieve our goal by passing in a function

trait ListMapping[A, B]{
  def map(list: List[A])(f: A => B): List[B]
}

So, given a List[A] and a function, A => B, we can get a List[B]. And by taking the type parameters from the trait definition and placing them onto the function definition, we can squeeze out a bit more freedom.

trait ListMapping{
  def map[A, B](list: List[A])(f: A => B): List[B]
}
val mapping: ListMapping =
  new ListMapping{
    override def map[A, B](a: List[A])(f: A => B): List[B] =
      a.map(f)
  }

Now, why would anyone ever do this? The List type provides a map function which does exactly this. With this approach one may provide any number of methods for mapping a list. Take this alternative:

val reverseMapping: ListMapping =
  new ListMapping{
    override def map[A, B](a: List[A])(f: A => B): List[B] =
      a.reverse.map(f)
  }

Through type classes, we can define new functionality for old data structures on the fly. Similar code can be written for sort order, string formatting or just about anything else.

Making things even more general

While, the ability to map Lists of any type in any number of ways is fairly abstract it is not abstract enough for our purposes. What if we want to map a different data structure such as an Option or a Stream or a spark Dataset?

Luckily, Scala has a language feature which can help us out here.

trait WithMap[F[_]]{
  def map[A, B](m: F[A])(f: A => B): F[B]
}

The type parameter, F[_], has a type parameter of _, this tells the compiler that our type parameter itself requires a type parameter. Notice in our definition all mention of List has been replaced by our parameter, F. This just says that given a type, F, which itself takes a type parameter, we can change the inner type or F without changing F. We can do this with any parameterized type of arity 1.

implicit val listWithMap = new WithMap[List]{
  override def map[A, B](m: List[A])(f: A => B): List[B] =
    m.map(f)
}
implicit val optionWithMap = new WithMap[Option]{
  override def map[A, B](m: Option[A])(f: A => B): Option[B] =
    m.map(f)
}
implicit val streamWithMap = new WithMap[Stream]{
  override def map[A, B](m: Stream[A])(f: A => B): Stream[B] =
    m.map(f)
}
val reverseListWithMap = new WithMap[List]{
  override def map[A, B](m: List[A])(f: A => B): List[B] =
    m.reverse.map(f)
}


With these techniques we can define super polymorphic functions. Take this pretty strinfigy function

def prettyString[F[_]: WithMap, A](m: F[A])(f: A => String): String = {
  implicitly[WithMap[F]].map(m)(f).toString
}

This takes two type parameters, F[_]: WithMap and A. The `:` character in the first type parameter tells the compiler that it needs an implicit instance of WithMap defined for our type F.

And here is a data processor defined in the same way

def processData[F[_]: WithMap, A, B, C, D](
  m1: F[A])(f1: A => B)(f2: B => C)(f3: C => D): F[D] = {
  val F = implicitly[WithMap[F]] 
  val m2 = F.map(m1)(f1)
  val m3 = F.map(m2)(f2)
  F.map(m3)(f3)
}

We have taken an implementation detail (the map function on List, Option, etc...) and brought it outside the type. This has given us the ability to talk about data which has a sensible map function without knowing what that data necessarily looks like.

Next we'll learn how to read some of the Type Level functions that exist in the shapeless library.

Introductory TyDD in Scala: Anatomy of a Type Level Function

Value Level Functions

A value level function typically looks like

def f(a: Int, b: Int): String = {
  a.toString + b.toString
}

The key parts are

  1. A keyword: def
  2. Inputs: a: Int, b: Int
  3. Outputs: String
  4. Body: a.toString + b.toString

When you see these 4 parts, you know you are reading a value level function. Nothing surprising here. Now, let's see what a similar definition looks like at the type level.

Type Level Functions

A type level function looks like

trait MyTrait[A, B]{type Out}
object MyTrait{
  def apply[A, B, C](): MyTrait[A, B]{type Out = C} =
    new MyTrait[A, B]{override type Out = C}
}

In this definition there is a type refinement, MyTrait[A, B]{type Out = C}. These are undesirable artifacts of type level development. To simplify these definitions we use the Aux alias (a document about on this). Aux helps us remove type refinements from our logic.

A type level function looks like

trait MyTrait[A, B]{type Out}
object MyTrait{
  type Aux[A, B, C] = MyTrait[A, B]{type Out = C}
  def apply[A, B, C](): Aux[A, B, C] =
    new MyTrait[A, B]{override type Out = C}
}

The type refinement from the previous example is replaced by the nicer (more readable, fewer braces, less code) Aux type.

Type level functions have the same 4 key parts as value level functions

  1. Keywords: trait object
  2. Inputs: A, B
  3. Outputs: type Out
  4. Body: def apply[A, B, C](): Aux[A, B, C]

Here the inputs are type parameters and outputs are type members. This is so the output types are not erased and can be referenced later in business logic. This is similar to value level functions as the result of a value level function does not expose the inputs required by the function.

Bodies of type level functions are value level functions. They are typically only a few lines long. Their purpose is to present the compiler with a way to construct a new type from the types provided. This is what these blog posts will focus on.

Whenever you see a set of definitions which have these 4 qualities, you know you are looking at a type level function.

The type class is the fundamental element of this style of type driven development. The next post will give an overview of this concept.

Introductory TyDD in Scala

Here are the slides (pptx, pdf) and code from the PHASE talk on Type Driven Development in Scala. Long form blog post version is forthcoming.

  1. Anatomy of a Type Level Function
  2. Type Classes & Higher Kinds
  3. Reading Type Level Functions
  4. Writing Type Level Functions
  5. Reading Data that is this strongly typed

Implicits and Type Classes for Mortals

Implicits and Type Classes can be fairly hard to nail down for a developer learning Functional Programming. They can seem more like magic than like provably-correct code and, clever compiler operations can exacerbate the issue.

For those of us who have been working with these concepts for a while, implicit scope and type class polymorphism come second natrure. They are like an extension of the fingers. An expression of this divide appeared in my twitter feed a few months ago: 

This post aims to impart some of this natural feeling to those who are just breaking in. Hopefully, type class code will be easier to work with after reading.

A Crash Course in Implicit Resolution

Implicit scope in scala is insanely complex. Luckily, understanding the subtleties of the compiler algorithm are not a necessity. We only need to understand a few simple points:

  1. The implicit keyword
  2. Implicit values
  3. Implicit function parameters

The Implicit Keyword

This keyword tells the compiler that a function or value should be searchable in implicit scope. Implicit scope is the set of all implicit declarations in scope.

We'll now build up to using identifiers that are searchable in implicit scope.

Implicit Values

An implicit value is a value defined using the implicit keyword.

implicit val x = 7

Now, the value x is searchable in implicit scope. We'll make use of it later.

Implicit Function Parameters

An implicit function parameter, like an implicit value, is a parameter to a function that is marked with implicit.

def mkList(begin: Int)(implicit end: Int): List[Int] = {
  (begin to end).toList
}
assert(List(1,2,3,4,5,6,7) == mkList(1)(7))

An implicit parameter works just like a parameter without the implicit declaration. So, what's the point?

Implicit parameters do not need to be explicitly written; if a value of the correct type is found in implicit scope, the compiler will automatically input that value for the developer to the function. The following is valid:

implicit val x = 7
def mkList(begin: Int)(implicit end: Int): List[Int] = {
  (begin to end).toList
}
assert(List(1,2,3,4,5,6,7) == mkList(1))
assert(List(5,6,7) == mkList(5))

So What?

Implicit resolution is very handy for applying contexts to an application. A simple example is logging.

import scala.concurrent._
import ExecutionContext.Implicits.global
case class Logger(location: String){
  def log(message: String) =
    println(s"$message\t$location")
}

def monitoredOperation(a: String, b: Int)(implicit logger: Logger) = {
  logger.log(a)
  logger.log(b.toString)
  logger.log(a + (2 * b))
}

val logA = Logger("logDir/a.log")
val logB = Logger("logDir/b.log")

def performA() = Future{
  implicit val log = logA
  monitoredOperation("A1", 5)
  monitoredOperation("A2", 4)
  monitoredOperation("A3", 3)
  monitoredOperation("A4", 6)
}
def performB() = Future{
  implicit val log = logB
  monitoredOperation("B1", 4)
  monitoredOperation("B2", 6)
  monitoredOperation("B3", 5)
  monitoredOperation("B4", 7)
}

implicit val logMain = Logger("logDir/.log")
for{
  a <- performA()
  b <- performB()
}{
  monitoredOperation(".1", 8)
}

There are three different paths for logging which are necessary at three separate points in the application. The implicit dependency monitoredOperation has on the logger lets the developer define a logging context at an appropriate scope and then code without giving thought to logging. This helps in two ways:

  1. The developer is free to code to the business case without being further burdened by logging
  2. Forces the developer to decouple code which logs to separate locations

The second help is illustrated by the following

implicit val logA = Logger("logDir/a.log")
implicit val logB = Logger("logDir/b.log")
monitoredOperation("", 0)//compile error: ambiguous implicit values

Bringing two values of the same type into implicit scope causes the compiler to fail on implicit resolution. The compiler cannot reason about which is the one you want.

Type Classes for Mortals

Type Classes are a technique for purely functional polymorphism. They are a powerful tool for using the scala compiler to help the developer group data types with similar use cases.

Building Type Classes

Say you need to extract lengths from GUIDs for validation however the GUIDs are presented in multiple formats.

  1. String
  2. List[Char]
  3. List[String]

For example:

val valid = 32
val id1 = "3F2504E0-4F89-41D3-9A0C-0305E82C3301"
val id2 = List('3', 'F', '2', '5', '0', '4', 'E', '0',
  '4', 'F', '8', '9',
  '4', '1', 'D', '3',
  '9', 'A', '0', 'C',
  '0', '3', '0', '5', 'E', '8', '2', 'C', '3', '3', '0', '1')
val id3 = List("3F2504E0", "4F89", "41D3", "9A0C", "0305E82C3301")
println(valid == id1.size)//false
println(valid == id2.size)//true
println(valid == id3.size)//false

In scala there is a built in size member for all three of these classes. The issue is, the three values represent the same GUID logically but, their size methods return very different results. Most of the time in OO, we would gain access to the classes, create a trait which validates them as GUIDs and have them each implement that trait. This is not possible since we are dealing with standard library classes. Type Classes provide a performant solution to this gap in functionality.

What is a Type Class?

A type class in scala is a trait with a parameter and at least one abstract method  which contracts over the parameterized type. In our example it would look like:

trait Guid[T]{
  def isGuid(item: T): Boolean
}

This interface allows us to write a function for validating a GUID given an implementation of the Guid trait.

def guidIsValid[T](item: T, guid: Guid[T]): Boolean = {
  guid.isGuid(item)
}

Now, we just need to implement this trait for our three classes and use them as required. All together we have

val valid = 32
val id1 = "3F2504E0-4F89-41D3-9A0C-0305E82C3301"
val id2 = List('3', 'F', '2', '5', '0', '4', 'E', '0',
  '4', 'F', '8', '9',
  '4', '1', 'D', '3',
  '9', 'A', '0', 'C',
  '0', '3', '0', '5', 'E', '8', '2', 'C', '3', '3', '0', '1')
val id3 = List("3F2504E0", "4F89", "41D3", "9A0C", "0305E82C3301")
//use case

trait Guid[T]{
  def isGuid(item: T): Boolean
}
def guidIsValid[T](item: T, guid: Guid[T]): Boolean = {
  guid.isGuid(item)
}
val stringGuid = new Guid[String]{
  override def isGuid(str: String): Boolean = {
    val stripped = str.filter('-' != _)
    valid == stripped.size
  }
}
val lCharGuid = new Guid[List[Char]]{
  override def isGuid(chars: List[Char]): Boolean = {
    valid == chars.size
  }
}
val lStringGuid = new Guid[List[String]]{
  override def isGuid(strs: List[String]): Boolean = {
    val size = strs.map(_.size).sum
    valid == size
  }
}
guidIsValid(id1, stringGuid)//true
guidIsValid(id2, lCharGuid)//true
guidIsValid(id3, lStringGuid)//true

Here, the compiler makes sure we have a valid implementation of our type class, Guid, for each call. This is nice but, the extra argument to the guidIsValid function makes the syntax more cumbersome than the OO version (OO version would have a validGuid method on String, List[Char] and List[String]). What we would like is for the last three lines to read

guidIsValid(id1)//true
guidIsValid(id2)//true
guidIsValid(id3)//true

Remember Implicits

Let's redefine our guidIsValid function so we can leverage the implicit functionality baked into the scala compiler to input our type class argument for our function

def guidIsValid[T](item: T)(implicit guid: Guid[T]): Boolean = {
  guid.isGuid(item)
}

Now, if we redefine our type class implementations as implicits we have

trait Guid[T]{
  def isGuid(item: T): Boolean
}
def guidIsValid[T](item: T)(implicit guid: Guid[T]): Boolean = {
  guid.isGuid(item)
}
implicit val stringGuid = new Guid[String]{
  override def isGuid(str: String): Boolean = {
    val stripped = str.filter('-' != _)
    valid == stripped.size
  }
}
implicit val lCharGuid = new Guid[List[Char]]{
  override def isGuid(chars: List[Char]): Boolean = {
    valid == chars.size
  }
}
implicit val lStringGuid = new Guid[List[String]]{
  override def isGuid(strs: List[String]): Boolean = {
    val size = strs.map(_.size).sum
    valid == size
  }
}
guidIsValid(id1)//true
guidIsValid(id2)//true
guidIsValid(id3)//true

Better compiler messages

To make it easier on implementors to reason about compile errors, type class library designers should always use the implicit not found annotation on their type classes.

@annotation.implicitNotFound("${T} has no implicit Guid instance defined in scope.")
trait Guid[T]{
  def isGuid(item: T): Boolean
}

Working With Type Classes

We finally get to the main point of the tweet we referenced at the very beginning of this post: 

how hard this code is ... to work with

First let's pick out just the type class and a single implementation from our code.

val valid = 32
trait Guid[T]{
  def isGuid(item: T): Boolean
}
implicit val stringGuid = new Guid[String]{
  override def isGuid(str: String): Boolean = {
    val stripped = str.filter('-' != _)
    valid == stripped.size
  }
}

val id1 = "3F2504E0-4F89-41D3-9A0C-0305E82C3301"

This is the most basic kind of type class; its just a type class. The library has no functionality or guidance about how to use it. Most library designers add in a lot of extra implementation to the companion object for the type class. The companion object typically holds

  1. convenience functions for performing operations functionally
  2. something called Ops which wraps values in a context for the type class
  3. an implicit def into the Ops class for quickly performing operations with OO-style
trait Guid[T]{
  def isGuid(item: T): Boolean
}
object Guid{
  def apply[T](item: T)(implicit guid: Guid[T]): Boolean = {
guid.isGuid(item)
  }
  class GuidOps[T](item: T, guid: Guid[T]){
    def isGuid(): Boolean = guid.isGuid(item)
  }
  implicit def lift[T](item: T)(implicit guid: Guid[T]): GuidOps[T] = {
    new GuidOps[T](item, guid)
  }
}

With this definition in the library, the developer can choose which style to use

import Guid._
implicit val stringGuid = new Guid[String]{
  override def isGuid(str: String): Boolean = {
    val stripped = str.filter('-' != _)
    valid == stripped.size
  }
}
val id1 = "3F2504E0-4F89-41D3-9A0C-0305E82C3301"
Guid(id1)//functional
id1.isGuid()//object oriented

Type classes are simply the FP version of the decorator pattern from OO (This is made clear by the implicit def from T to GuidOps[T]; a direct mapping from the type class for some T to a decorator for T). They take a class and apply new uses to a single instance of that class without affecting other instances of that class. Together with implicits, they seem to augment a class directly (js prototype style) but, this appearance is simply an illusion created by the compiler as it decorates functions and objects for the developer.

Type Variance - Why it Matters

The company I work for has a robust Intern program; as a result, I work with a lot of young engineers and computer scientists. To date:

  1. 100% of their resumes mention they have a tight grasp on Object Oriented Programming
  2. 100% of them fail to understand the finer points of subtyping and furthermore subclassing

I have given more explanations of variance than I have given explanations of anything else on the job. So, in a effort to practice the DRY principle in all my affairs, I decided to put it into documentation I can point to.

Note: Type Variance has A LOT of math (type theory & category theory) behind it. This post will focus on its usage in the Scala language not on the math.

Sub Classes

class Foo[T]
def check[A, B](a:A, b:B)(implicit ev: A <:< B): Unit = {}

Here the class Foo is parameterized by T and the check function simply checks if the type of its first argument is a subclass of the type of its second argument. Don't worry about the check function, its implementation details are beyond the scope of this post but, its a nifty trick!

val str = ""
val obj = new Object()
check(str, obj)//compiles

So, the check here compiles which tells us String is a subclass of Object.

val fStr = new Foo[String]
val fObj = new Foo[Object]
check(fStr, fObj)//error: Cannot prove that zzz.simple.Foo[String] <:< zzz.simple.Foo[Object].

This doesn't compile because the type parameter of Foo allows for no variation in its relationship; Foo is invariant in T.

We will begin by briefly describing variance.

Variance

Variance is a huge part of programming with types. It is an intrinsic property of class hierarchies and can be witnessed as such in languages like C++ and Scala who have compiler errors along the lines of:

  • covariant whatever in contravariant position
  • whatever is in covariant position but is not a subclass of whatever

In short, type variance describes the types that may be substituted in place of another type.

Covariance: We say a substitution is covariant with a type, Foo, if Foo or any other class with a subclass relationship to Foo is valid.
Contravariance: We say a substitution is contravariant with a type, Bar, if Bar or any other class with a superclass relationship to Bar is valid.
Invariance: We say a substitution is in variant with a type, Foo, if only types that are exactly Foo are valid.

Variance in Practice

We already saw invariance above. In Scala covariance and contravariance are denoted by using the + and - symbols respectively.

Covariance

case class Foo[+T]//covariant in T

Redeclaring Foo in this way makes it covariant so our test now validates

val str = ""
val obj = new Object()
check(str, obj)//compiles
val fStr = new Foo[String]
val fObj = new Foo[Object]
check(fStr, fObj)//compiles

Declaring the type variable with a + (as covariant) tells the compiler that the subclass relationship between type parameters gives rise to a direct subclass relationship in Foo. So any def, val or var requiring a Foo[Object] can take a Foo[String] as an argument in place.

Contravariance

case class Foo[-T]//contravariant in T

This redeclaration makes Foo contravariant and breaks our test again

val str = ""
val obj = new Object()
check(str, obj)//compiles
val fStr = new Foo[String]
val fObj = new Foo[Object]
check(fStr, fObj)//error: Cannot prove that zzz.simple.Foo[String] <:< zzz.simple.Foo[Object].

This is what we expect! Contravariance implies a superclass relationship not a subclass relationship. We can fix this by reversing our input arguments

check(fObj, fStr)//compiles

This declaration is a hint to the compiler that the subclass relationship between type parameters gives rise to a superclass relationship in Foo. So any def, val or var requiring a Foo[String] can take a Foo[Object].

How to use Variance

Where covariance preserves the subclass relationship from the type parameter into the type, contraveriance reverses this relationship.

Covariance is used a lot in Scala by the collections library. Most of the immutable collections are covariant. This makes working with your data types inside the collection the same as working with them outside the collection when writing interfaces.

Contravariance is less prominent. I use contravariance for typeclasses a lot.

trait Bar[-T]{
  def bar(t:T): Unit
}
implicit val bar = new Bar[Object]{def bar(o:Object): Unit = ()}
def procBar[T:Bar](t: T){
  implicitly[Bar[T]].bar(t)
}
procBar(obj)//compiles
procBar(str)//pulled the superclass instance in

If a class does not have a typeclass instance in implicit scope for its type it can use a contravariant instance if one is in scope.

 

 

5. Typeclasses over Subclasses (Introducing Typelevel Scala into an OO Environment)

(Examples can be found on GitHub)

In the previous post, we introduced the Argonaut library to convert between values and JSON strings. The important part of this conversion is there was no superclass or interface to implement in order to get the benefit of JSON across classes. All we needed to do was define values of type CodecJson for each of the types we wanted to convert. We added the functionality to the class without changing the class itself. 

Argonaut allowed us to call toJson on classes with a codec and decodeOption on Strings to produce values of classes with a codec defined. This type of polymorphism, where a function's implementation depends on its inputs is called ad-hoc polymorphism. Furthermore, when we define a type, T, which defines functionality across classes to be used in ad-hoc polymorphic functions we call T a Type Class. Type Class polymorphism is a specific flavor of ad-hoc polymorphism.

Type Class polymorphism is a powerful tool for expressing context based functionality far more powerful than subclass polymorphism. As a well-known example take the Java interfaces Comparable and Comparator. If some data is defined in a class which implements Comparable, it can be sorted one way and needs an entire second class definition to be sorted with a different method. On the other hand, using Comparator the data is defined with a single class and each sort method gets its own Comparator. Comparator is a Type Class and allows the developer to determine in which contexts which sorting method should be used.

Subclass Method

Take the following traits:

trait Adder[Type]{
  def add(other: Type): Type
}
trait Chainer[Arg, Type[Arg]]{
  def chain[Res](f: Arg => Type[Res]): Type[Res]
}

Adder describes how to add two values of some Type together. Chainer describes how to chain operations over a parameterized type.

We'll use the idea of a team to illustrate. For the sake of simplicity we say a team consists of people of a certain profession. So we can have a team of engineers or a team of doctors or a team of cashiers or ...

Teams can (trivially) grow by hiring but, they can also grow by combining with other teams. Teams can be added.

Teams can have members who are themselves team leads. At times, the members of a lead's team must join the team the lead belongs to. This implies an operation which develops teams out of the members of teams. Teams can be chained.

Here is our implementation of Team given this functionality:

case class Team[Type](members: List[Type])
  extends Adder[Team[Type]]
  with Chainer[Type, Team]{
  override def add(other: Team[Type]): Team[Type] = {
    Team(members ++ other.members)
  }
  override def chain[Res](
      f: Type => Team[Res]): Team[Res] = {
    val list = members.flatMap(member => f(member).members)
    Team(list)
  }
}

Simple enough but this doesn't account for an organization of structured teams. For an organization who develops teams that each have one product lead and one technical lead, simple concatenation won't maintain a soft ranking of individuals within the new team. We need a new Team definition which accounts for this.

case class TeamStructured[Type](members: List[Type])
  extends Adder[TeamStructured[Type]]
  with Chainer[Type, TeamStructured]{
  override def add(
      other: TeamStructured[Type]): TeamStructured[Type] = {
    val (lead1, indi1) = members.splitAt(2)
    val (lead2, indi2) = other.members.splitAt(2)
    TeamStructured(lead1 ++ lead2 ++ indi1 ++ indi2)
  }
  override def chain[Res](
      f: Type => TeamStructured[Res]): TeamStructured[Res] = {
    val (leaders, individuals) = members.map{member =>
      val mems = f(member).members
      mems.splitAt(2)
    }.unzip
    TeamStructured(
        leaders.flatMap {x=>x} ++
        individuals.flatMap{x=>x})
  }
}

Now we have two definitions for the same data that differ only by functionality. We have a triple coupling here:

  1. Data Definition
  2. Addition Description
  3. Chaining Description

If the data needs to change (from List to Set is a good place to start) the change needs to be made in two places. Each function which accepts a Team for the purpose of team composition and combination needs to know which style of team it needs at development time. These problems gets worse for each possibility for combining and chaining teams (maybe a round robin or reverse algorithm would fit in certain situations). Type Classes solve these issues.

Type Class Method

Our traits become:

//The underscore here implies we need a parameterized type.
trait Adder[Type[_]]{
  def add[Item](
      left: Type[Item], right: Type[Item]): Type[Item]
}
trait Chainer[Type[_]]{
  def chain[Item, Res](
      arg: Type[Item], f: Item => Type[Res]): Type[Res]
}

These have the same uses as their counterparts above. However we have a single definition of the Team type:

case class Team[Type](members: List[Type])

The data is defined in a single place. Each piece of software which requires a Team has a consistent idea about what a Team is and means. The two versions of functionality are defined by:

object unstructured{
  implicit def adder: Adder[Team] = new Adder[Team]{
    override def add[Item](
        left: Team[Item], right: Team[Item]): Team[Item] = {
      Team(left.members ++ right.members)
    }
  }

  implicit def chainer: Chainer[Team] = new Chainer[Team]{
    override def chain[Item, Res](
        arg: Team[Item], f: Item => Team[Res]): Team[Res] = {
      val list = arg.members.flatMap(
          member => f(member).members)
      Team(list)
    }
  }
}

object structured{
  implicit def adder: Adder[Team] = new Adder[Team]{
    override def add[Item](
        left: Team[Item], right: Team[Item]): Team[Item] = {
      val (lead1, indi1) = left.members.splitAt(2)
      val (lead2, indi2) = right.members.splitAt(2)
      Team(lead1 ++ lead2 ++ indi1 ++ indi2)
    }
  }

  implicit def chainer: Chainer[Team] = new Chainer[Team]{
    override def chain[Item, Res](
        arg: Team[Item], f: Item => Team[Res]): Team[Res] = {
      val (leaders, individuals) = arg.members.map{member =>
        val mems = f(member).members
        mems.splitAt(2)
      }.unzip
      Team(
          leaders.flatMap {x=>x} ++
          individuals.flatMap{x=>x})
    }
  }
}

Now, each function which accepts a team, if needed, will also accept an adder or chainer or both (wholly decoupled). The down side here is each call to such a function requires at least one extra argument from the subclass versions. Scala has a fix for this limitation.

Implicits

The implicit keyword before a definition is an important part of making Type Class polymorphism beneficial to the developer. The word implicit, according to Oxford Dictionaries, means Implied though not plainly expressed. In Scala it means we can prepend the implicit keyword to an argument list and not explicitly produce the value in code assuming a valid value is in scope. For example:

def chainTeams[Type, Result](
  team: Team[Type])(
  func: Type => Team[Result])(
  implicit chain: Chainer[Team]): Team[Result] = {
  chain.chain(team, func)
}

This has three arguments, the team to operate on, the operation to perform, and the chainer for application. However, since the final argument is implicit, if we bring a valid implicit value into scope, there is no need to pass it in directly.

import structured._
val team: Team[Person] = Team(List(???))
val func: Person => Team[Person] = {(p: Person) => ???}
val newTeam: Team[Person] = chainTeams(team)(func)//valid

Since, we don't need to explicitly state the Chainer it keeps boilerplate clean. A nice effect of implicit resolution is if you have scoped two separate valid values for the implicit argument, the compiler will complain. The suggestion if you have multiple valid implicits in scope is to decouple your code functionally. No single scope should have use of more than one implicit of the same type; this is a code smell. A corollary to this is one should not explicitly provide implicit arguments; let the compiler do its work.

In the final post of this series, we will introduce another library, Cats.