December 15, 2022,
Recently, Simon Peyton-Jones, who has been working with Epic Games for about the last year, keynoted the Haskell Exchange Conference to introduce Epic’s forthcoming new programming language, Verse. Please do watch the video, which offers both a solid overview of “MaxVerse,” which is apparently the forthcoming syntactically-sugared and elaborated language most programmers will actually use, and Dr. Peyton-Jones’ characteristic charm and enthusiasm in talking about his work. It’s only in the final five minutes or so, however, that he reveals the core calculus, which is intended to serve as a new foundation for logic-functional programming languages. As Dr. Peyton-Jones points out, logic-functional languages are “a niche within a niche.” You may be a Scala developer or even a Haskell developer, but never have heard of Curry or Mercury. Ultimately, the Epic team’s goal is to provide a mainstream logic-functional language for everyone from beginners to seasoned professionals. Part and parcel of such an ambitious goal is to approach the design from the same level as you would the foundations of a new imperative programming language (the Universal Turing Machine) or a new functional programming language (the Lambda Calculus). That’s what the Verse Calculus aims to be.
The Epic team is admirably forthright in describing the Verse Calculus as very early work-in-progress. As a lifelong PLT (Programming Language Theory) enthusiast who has also been following the Unreal Engine and Tim Sweeney’s technology interests for decades, this makes me happy: maybe I can make some small contribution to Verse’s development! To that end, I’ve begun to assemble some resources with which to equip my study of the Verse Calculus paper. First, I’m quite interested in formalization and verification, so I essentially take the use of the Coq Platform for granted. The Coq Platform is described very well in Dr. Andrew Appel’s invited POPL 2022 talk. Coq is an interactive proof assistant that is often used for PLT-related tasks and has many potentially-useful tools to bring to bear on the study (and maybe implementation!) of a calculus such as the Verse Calculus.
For the rest of this post, I will attempt to walk through the paper and call out tools that seem to me, off the cuff, like they might be useful in addressing questions the paper raises, or just studying the calculus when doing so with some of the Coq Platform’s (or other third-party) tools seems unproblematic. I should emphasize that I am far from a Coq expert, so unlike a well-typed program, it’s entirely likely that I will get stuck, or even go wrong.
Since I’m aiming for formalization, let me ignore the motivating and informal sections of the paper, which quickly brings us to Section 3, which briefly explains why the use of rewrite rules has been chosen for exposition of the Verse Calculus. One of these reasons is that this is nearly the standard presentation for exposition of lambda calculi, and is also the means by which programmers in functional languages are encouraged to reason about their programs. A proper discussion of term-rewriting is out of scope for this post, so please feel free to search for other resources regarding it if it’s unfamiliar (you might find playing with the Pure programming language interesting, for example).
The first obvious sign of trouble comes in Section 3.7, “Evaluation strategy.” We find:
Any decent evaluation strategy should (a) guarantee to terminate if there is any terminating sequence of reductions; and (b) be amenable to compilation into efficient code. For example, in the pure lambda calculus, normal-order reduction, sometimes called leftmost outermost reduction, is an evaluation strategy that guarantees to terminate if any strategy does so.
It would be even better if the strategy could (c) guarantee to find the result in the minimal number of rewrite steps—so called “optimal reduction” [Asperti and Guerrini 1999; Lamping 1990; Lévy 1978]—but optimal reduction is typically very hard, even in theory, and invariably involves reducing under lambdas, so for practical purposes it is well out of reach.
My first thought here is to wonder whether any of the work the HVM project has done on Symmetric Interaction Calculus can be brought to bear. First of all, this document shares the Asperti and Guerrini citation above. Secondly, a quick look at the calculus shows some striking (to me) similarities between the Symmetric Interaction Calculus and the Verse calculus. Finally, some of the constraints applied to the Symmetric Interaction Calculus described here seem at least superficially similar to some identified for the Verse Calculus.
My second thought is that I’m probably being incredibly naïve, and shouldn’t expect a magic bullet here, so I should spend some time studying The Zoo of Lambda-Calculus Reduction Strategies, And Coq, which provides “a tool to inspect and categorize the “zoo” of existing strategies, as well as to discover and study new ones with particular properties."
Section 4.2 discusses confluence of the calculus in some detail, and hints at not being terribly satisfied with the proof sketch offered there. There’s an intriguing suggestion that a weaker notion of conflluence such as that decribed in Lambda calculus plus letrec may be applicable. I find this idea intriguing because the paper explicitly relates “lambda calculus plus letrec” to the infinitary lambda calculus, and I recently discovered A New Coinductive Confluence Proof for Infinitary Lambda Calculus, suggesting it might be possible to apply or modify the same proof techniques for the infinitary lambda calculus to the Verse Calculus.
Returning to Section 3.7, we find “we leave evaluation strategy and compilation for future work.” Having discussed what I expect to study regarding how an evaluation strategy might be provided, I also intend to study the new version of Vellvm, a formalization of a significant subset of the LLVM intermediate representation, to use as a compilation target.
All of this is most definitely significantly personally ambitious. But like many people who have studied computer science formally and found themselves with a bent for exploring many different kinds of programming languages, I’ve always wanted to design a really good one. I think that old goal is out of reach for someone with a $DAYJOB and other personal responsibilities, but I hope I can devote enough time and energy to this to make some small contribution to the future of Verse, which is the language I’ve been most excited about in many years.
July 18, 2020,
Recently, an article on Railway-Oriented Programming in Scala appeared and was linked to on the Scala subreddit. It, in turn, is based on this post, which presents the concept in the context of the F# language. In the Scala subreddit thread, I observed that ROP is a great example of the power of typed functional programming, and is a reasonably well-known pattern that goes by a different (and, let’s be honest, less approachable) name in languages and/or libraries that offer the more general form. That’s what I’d like to talk about here. First, though, please go read the other articles. They’re very good treatments of and motivators for the concept, so I won’t recapitulate them. Rather, I’ll begin by discussing what the point of ROP is, how Scala offers language features making the more general approach practical, and how ROP fits into the bigger typed functional programming picture.
Back from reading the articles? Good.
What’s the point?
So, what is ROP about? I’ll claim it’s about the following
- ROP is about function composition.
I imagine most Scala programmers have some intuition that if you have something like:
and
you can, of course:
and get a C
. But maybe fewer Scala programmers know you can say the same thing in the following way:
val fun1: A => B = ???
val fun2: B => C = ???
(fun1 andThen fun2)(a)
As the Scala-based article notes, in Scala, methods and functions are different. Methods are defined with def
. While most Scala developers have probably seen the =>
syntax, it might not be obvious that A => B
is syntactic sugar for Function1[A, B]
. Whether you say fun2(fun1(a))
or (fun1 andThen fun2)(a)
, we call this “functional composition.” It’s what we’re trying to do when we do “functional programming.”
But there’s a hitch, isn’t there? As the articles point out, most useful functions in most domains have some kind of effect. And it turns out you have to be careful how and when effects in a function happen if you want functions to “make sense” when you compose them. You want:
f andThen g andThen h andThen i...
to “work the way you expect” no matter what f
, g
, h
, i
… do, if their types line up. So:
- ROP is about taking effects into account.
You know what’s coming, right? Yes, I’m going to talk about monads.
Relax. You use monads in Scala constantly. In fact, you use monads literally every time you write a for
-comprehension. The Scala-based article notes that porting the F# code for Result
to Scala ends up yielding (no pun intended) something a lot like Scala’s Either
, which is often used to represent either a success (Right
, by convention, in Scala) or failure (Left
). And Scala developers probably know you can use Either
in for
-comprehensions:
for {
a <- Left[String, Int]("Something bad happened.")
b <- Right[String, Int](42 + a)
} yield b
res0: Either[String, Int] = Left("Something bad happened.")
So Either
is a monad, because it works in for
-comprehensions. And it’s a monad that, again as the Scala-based article points out, is often used to model “operations that can fail,” like validating user input, or writing to a database. So I’ll make a third claim:
- ROP is about taking failure into account.
But there’s another hitch, isn’t there? We’ve lost the “railway” notion along the way. There’s something appealing about the simplicity of:
f andThen g andThen h andThen i...
But if:
val f: A => Either[String, B] = ???
val g: B => Either[String, C] = ???
val h: C => Either[String, D] = ???
val i: D => Either[String, E] = ???
then the “railway” version of their composition won’t even compile, because the arguments aren’t Either
s. But the for
-comprehension works:
def compose(a: A): Either[String, E] = for {
b <- f(a)
c <- g(b)
d <- h(c)
e <- i(d)
} yield e
The reason is that all for
-comprehensions care about is that what’s to the right of the <-
is a monad. Because we’re applying all of the functions to their arguments manually, we have a monad—an Either
—on the right, so the comprehension works. Another important aspect of the reason it works is that monads have constructors, a map
method, and a flatMap
method that, together, obey certain laws, so for
-comprehensions are just syntactic sugar for a chain of flatMap
s followed by a map
for the yield
keyword. This point will be important later. The point is, it’d be really nice if there were some way to provide andThen
for Function1
s returning monads, especially monads we use to represent failure.
It turns out there is, in a library called Cats.
First, Cats helps us address the quandary I alluded to a moment ago: we want something that acts a lot like a for
-comprehension, but we’d like to just compose functions with something like andThen
, and this seems out of reach because an A
is not an Either[E, A]
. The key observation above is that for
-comprehensions “know” they’re dealing with a monad: if the type you try to use doesn’t have map
and flatMap
, your for
-comprehension won’t compile. Is there some more general way to let the compiler “know” a type T
is a monad?
It turns out there is: we can say Monad[T]
, thanks to Cats.
But there is, again, a hitch: T
isn’t really a type, but a type constructor. To use another example, Option
isn’t a type. It’s a type constructor that takes one argument, the type of its value, if it exists. So Monad
, from Cats, is a type constructor that takes another type constructor as an argument, and the result is a type, regardless of the type argument of the nested type constructor. This is a feature that distinguishes Scala’s type system from F#’s type system, and is the reason this post exists. We call these higher-kinded types. Again, don’t panic: “kind” just means “type of a type,” so “higher-kinded type” just means “type of a type of a type,” and honestly, you’ll almost certainly never have to think about that phrase again, even if you use “higher-kinded types” every day, like I do.
Higher-Kinded Types and Typeclasses
I’m not going to dwell on the formalities of higher-kinded types. The link above does a much better job of that than I could ever hope to. What I want to do instead is just lay down the intuition, which you already have from for
-comprehensions, that a higher-kinded type just makes explicit something about a type constructor that we already know implicitly. We know Option
is a monad, because we can use it in a for
-comprehension, which means an Option[A]
has a constructor, map
, and flatMap
obeying the monad laws. Here’s the thing: do the constructor, map
, and flatMap
know anything at all about A
? No. They don’t. The type Monad[Option]
—and remember, this is a type, not a type constructor, because it’s a higher-kinded type—captures this. Option
is a Monad
for all, as the logicians say, A
in Option[A]
. You may think, especially if you have a Java background, that this sounds like Option
could inherit from Monad
, because I keep saying “Option
is-a Monad
.” And in traditional object-oriented programming, you’d be right. But Cats isn’t the standard library; Option
is in the standard library; and besides, the problems with modeling “is-a” relationships with implementation inheritance are well-known even to object-oriented programmers. So what we usually call this use-pattern of higher-kinded types, especially when they obey a set of algebraic laws, is a “typeclass,” a term that comes from the purely-functional programming language Haskell. I’m not going to dwell on their formalities either. I’m just going to say “typeclass” from now on, instead of the mouthful “higher-kinded type obeying a set of algebraic laws,” or, worse, “type of a type of a type obeying a set of algebraic laws.” And now maybe you can begin to see why some of the jargon exists.
Can I Get a Witness?
Speaking of jargon, here’s some more: we’ve established that Monad[Option]
is a perfectly good type, even though Option
, by itself, is a type constructor. We’ve also established that Option
doesn’t inherit from Monad
. So what do we call a value of type Monad[Option]
? I’m afraid we still call it an “instance” of the typeclass. Now I want to ask you to make a big mental leap:
Read Monad[Option]
as “Option
implies Monad
.”
In fact, you can think of all types as propositions (that’s the Curry-Howard-Lambek correspondence, which I talked about here, but that’s out of scope for this post). But let’s stay focused on Monad[Option]
for now. How do you know whether the proposition “Option
implies Monad
” is true or not? It’s true if, and only if, there is at least one value of that type. Now let me dive into how that works in Cats in practice. First, I’ll fire up Li Haoyi’s wonderful Ammonite REPL to noodle around in (as I have actually been doing all along):
psnively@oryx-pro:~|⇒ amm
Loading...
Welcome to the Ammonite Repl 2.1.4 (Scala 2.13.2 Java 11.0.7)
@
One of the great things about the Ammonite REPL is its “magic imports” feature. It lets us import libraries directly from the usual repositories. Because I’m going to talk about effects more later, let me go ahead and import the cats-effect library:
@ import $ivy.`org.typelevel::cats-effect:2.1.4`
https://repo1.maven.org/maven2/org/typelevel/cats-effect_2.13/2.1.4/cats-effect_2.13-2.1.4.pom
100.0% [##########] 2.5 KiB (4.0 KiB / s)
https://repo1.maven.org/maven2/org/typelevel/cats-effect_2.13/2.1.4/cats-effect_2.13-2.1.4-sources.jar
100.0% [##########] 142.5 KiB (598.9 KiB / s)
https://repo1.maven.org/maven2/org/typelevel/cats-effect_2.13/2.1.4/cats-effect_2.13-2.1.4.jar
100.0% [##########] 1.1 MiB (3.0 MiB / s)
import $ivy.$
@
Now we import what we need from Cats. Usually, Cats developers do whole-package imports like this:
@ import cats._, implicits._
import cats._, implicits._
That gets us a lot of stuff that’s native to Cats, which is great. But we also want Cats typeclass instances for standard library types like Option
. Those live in a different package:
@ import cats.instances._
import cats.instances._
When we want instances of a typeclass, we usually want them to be “implicitly” in scope, because we want, in a broad sense, to know that the proposition their type represents is true. The value of type Monad[Option]
—and ideally, there is only one—is sometimes called “evidence” or a “witness.” In fact, you may already have seen code like:
def doSomething[F[_]](arg: Whatever)(implicit ev: Monad[F]) = {
...
}
and wondered what “ev
” meant. It’s short for “evidence.” Here, we’re saying we don’t care what type constructor F
actually is, or what its type argument is, as long as there’s implicit evidence it’s a Monad
. Do we have that evidence for Monad[Option]
?
@ implicitly[Monad[Option]]
res4: Monad[Option] = cats.instances.OptionInstances$$anon$1@5d3634c8
We do. Importing cats.instances._
has brought a Monad[Option]
implicitly in scope. OK, the compiler now knows Option
implies Monad
. So what? Well, types define legal operations on terms. So a value (“term” is Programming Language Theory jargon I won’t dig into further here) of a type has certain legal operations. Remember, a Monad[Option]
is not an Option
; it defines operations any Monad
supports. What are some of them?
@ res4.
*> asRight lift productL tuple3
<* compose map productLEval tuple4
<*> composeApply map10 productR tuple5
ap composeContravariant map11 productREval tuple6
ap10 composeContravariantMonoidal map12 pure tuple7
ap11 composeFunctor map13 raiseError tuple8
ap12 flatMap map14 replicateA tuple9
ap13 flatTap map15 rightIor tupleLeft
ap14 flatten map16 rightNec tupleRight
ap15 fmap map17 rightNel unit
ap16 foreverM map18 some unlessA
ap17 fproduct map19 tailRecM untilDefinedM
ap18 ifA map2 tell untilM
ap19 ifF map20 tuple10 untilM_
ap2 ifM map21 tuple11 unzip
ap20 imap map22 tuple12 valid
ap21 invalid map2Eval tuple13 validNec
ap22 invalidNec map3 tuple14 validNel
ap3 invalidNel map4 tuple15 void
ap4 iterateForeverM map5 tuple16 whenA
ap5 iterateUntil map6 tuple17 whileM
ap6 iterateUntilM map7 tuple18 whileM_
ap7 iterateWhile map8 tuple19 widen
ap8 iterateWhileM map9 tuple2 writer
ap9 leftIor mproduct tuple20
as leftNec point tuple21
asLeft leftNel product tuple22
@ res4.
As you can see, there’s a lot you can do with a Monad
. But for all that, there’s no andThen
. And that makes sense. A Monad[Option]
isn’t a Function1[A, Option[B]]
, and here we can see the problem pretty clearly. We need some type that, if possible, represents a monadic function like our val f: A => Either[String, B]
. But first, let’s deal with another hitch: Option
takes one type argument, but Either
takes two. Also, Either
explicitly represents failure with one of its type arguments. Is there a typeclass that accounts for that?
It turns out there is: MonadError
. MonadError
takes two type arguments, a type constructor taking one type argument, and a type representing error values. You see the problem: you can’t just use Either
as the first type argument for MonadError
because it takes two type arguments. It’s significant, though, that you want to use the same type for the Left
of the Either
and the second type argument of MonadError
. You want to be able to say:
MonadError[Either[Throwable, *], Throwable]
where we’re using the JVM’s pervasive Throwable
to represent errors, but the Right
Either
type can be anything, and this is a complete type, just like Monad[Option]
is a complete type in spite of not knowing what the (unwritten) type argument of Option
is.
It turns out, we can do this, with a compiler plugin called “kind-projector,” which we can use in the Ammonite REPL like this:
@ import $plugin.$ivy.`org.typelevel:::kind-projector:0.11.0`
import $plugin.$
Now we can ask: “Does Either[Throwable, *]
imply MonadError
?”
@ implicitly[MonadError[Either[Throwable, *], Throwable]]
res6: MonadError[Either[Throwable, β$0$], Throwable] = cats.instances.EitherInstances$$anon$2@724e483f
Indeed it does.
OK. We’ve managed to convince the compiler that Option
implies Monad
and that Either
implies MonadError
, and we’ve seen that Monad
offers a lot of functionality. By the way, MonadError
offers even more functionality, and all MonadError
s are Monad
s. We still haven’t seen a representation of “monadic function,” though.
Kleisli
Yeah, it’s named after the Swiss mathematician who worked it out. Sorry about that. Because it’s a complete data type, not a typeclass or instance of a typeclass for a type in the standard library, it lives in yet another package:
@ import cats.data.Kleisli
import cats.data.Kleisli
A Kleisli[M, A, B]
, sometimes called a “Kleisli triple” in the literature, represents an A => M[B]
.
Pretty underwhelming, right?
By now, though, you may be able to hazard a guess as to where this is going. Because if Kleisli
can find evidence that M
implies Monad
or MonadError
, might Kleisli
be able to offer a lot of functionality, like Monad
and MonadError
do?
The easiest way to construct a Kleisli
is just to literally apply it to an A => M[B]
:
@ import scala.util.Try
import scala.util.Try
@ Kleisli { (s: String) => Try { s.toInt }.toEither }
res13: Kleisli[Either[Throwable, B], String, Int] = Kleisli(ammonite.$sess.cmd13$$$Lambda$2649/0x0000000840b94840@47b8e2)
I’m using Try
here to turn the possible failure—a thrown exception—into an Either
with the Throwable
on the Left
, which we already know implies MonadError
. And this gives us a perfectly good Kleisli
.
@ res13("foo")
res14: Either[Throwable, Int] = Left(java.lang.NumberFormatException: For input string: "foo")
Exactly as we’d hope, passing a String
that isn’t an Int
gives us the appropriate Left
.
@ res13("42")
res15: Either[Throwable, Int] = Right(42)
Exactly as we’d hope, passing a String
that is a valid Int
gives us the appropriate Right
.
Boy, that’s a long-winded way to write def stringToInt(s: String): Either[Throwable, Int] = Try { s.toInt }.toEither
, isn’t it?
But hang on:
@ res13.
&&& attemptT handleErrorWith mapF pure tapWith
&> canEqual imap mapK raiseError tapWithF
*** choice index merge recover tell
*> choose invalid mkString_ recoverWith toReader
+++ combine invalidNec mproduct redeem traverse
<& combineK invalidNel onError redeemWith tupleLeft
<* combineN isEmpty orElse reduceA tupleRight
<*> compose iterateForeverM parFoldMapA reduceMapK typeClassInstance
<+> copy iterateUntil parUnorderedFlatTraverse reject unlessA
<<< cosequence iterateUntilM parUnorderedTraverse replicateA untilM
>> dimap iterateWhile partitionBifold right untilM_
>>= distribute iterateWhileM partitionBifoldM rightIor valid
>>> ensure left partitionEitherM rightNec validNec
adaptErr ensureOr leftIor product rightNel validNel
adaptError first leftNec productArity rightc void
andThen flatMap leftNel productElement rmap whenA
ap flatMapF leftc productElementName run whileM
ap2 flatTap lift productElementNames second whileM_
apply flatten lmap productIterator self widen
as fmap local productL some writer
asLeft foldMapK lower productLEval split |+|
asRight foreverM map productPrefix sum |||
attempt fproduct map2 productR tailRecM
attemptNarrow handleError map2Eval productREval tap
As we hoped, we get a staggering amount of functionality on Kleisli
, some of it derived from the typeclass instances M
has:
@ implicitly[MonadError[Kleisli[Either[Throwable, *], String, *], Throwable]]
res16: MonadError[Kleisli[Either[Throwable, β$0$], String, γ$1$], Throwable] = cats.data.KleisliInstances0_5$$anon$9@5d2d89a6
“Either
implies MonadError
implies (Either
implies Kleisli
) implies (Kleisli
implies MonadError
).”
But look: because Kleisli
captures the notion of “monadic function,” we have andThen
!
val getDivisor: String => Either[Throwable, Int] = s => Try { s.toInt }.toEither
val divide56By: Int => Either[Throwable, Int] = i => Try { 56 / i }.toEither
val getDivisorK = Kleisli(getDivisor)
val divide56ByK = Kleisli(divide56By)
val divide56ByString = getDivisorK andThen divide56ByK
@ divide56ByString("3")
res27: Either[Throwable, Int] = Right(18)
@ divide56ByString("foo")
res28: Either[Throwable, Int] = Left(java.lang.NumberFormatException: For input string: "foo")
andThen
on Kleisli
is traditionally spelled >=>
, which is affectionately known as the “fish operator.” This generalization of function composition, which can account for effects and failure, is called “Kleisli composition.”
So my claim is that “Railway-Oriented Programming” is Kleisli composition, and my suggestion is that you should use languages and libraries that make it explicit, if at all possible.
There’s much more to say, especially about effects and Kleisli
’s many other aspects and its relationship to types and even category theory generally. But this post is already long, because I wanted to try to spell out some details about Cats and make some intuitions explicit. If you’re still reading, thanks for hanging out with me, and I hope you found something worthwhile here.