Spec-first and code-first, hand in hand

scala, smithy, codegen, metaprogramming, derivation

By Olivier Mélois

Yesterday, I released smithy4s-deriving, a Scala 3-only library that provides meta-programming-based derivation mechanisms for the core abstractions of Smithy4s. These abstractions, which are the foundational pillars of Smithy4s, are usually code-generated by the build plugins provided by the library, using Smithy as a source of truth.

As the original author and main contributor of Smithy4s, I could be asked why I've decided to compete with myself by providing what could essentially be seen as a competing product. Am I lacking sanity? Probably. But I'll try to describe the why in this blogpost.

The idea behind Smithy4s

Though it's reductive to describe it this way, Smithy4s can be seen as a code-generation tool that was created with the intention to retain and exploit the protocol-agnostic nature of the Smithy interface definition language. It aims at generating code that can be used in vastly different contexts without reconfiguration, whether "context" means different serialisation formats or different protocols. This goal is achieved by means of revolutionary (strike that) somewhat interesting abstractions that I've shamelessly borrowed from people way smarter than me and then proceeded to spend countless hours adapting, redesigning and refining, to varying degrees of success.

There are two main abstractions in Smithy4s: smithy4s.Service, which allows abstracting over interfaces, and smithy4s.Schema, which allows abstracting over datatypes. I won't describe these abstractions in depth here, as I've already tried to do so reasonably exhaustively in the official design documentation of Smithy4s. However, I'll try to express their goals:

The smithy4s.Schema abstraction makes it possible to produce codecs for virtually any serialisation format.

schema.svg

whereas the smithy4s.Service abstraction makes it possible to turn instances of interfaces into low-level routers (such as HTTP routers), and to turn low-level reverse-routers (such as HTTP clients) into interface stubs, for virtually any communication protocol.

service.svg
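To make the Schema idea a bit less abstract, here is a toy sketch in Scala 3. It is not the real smithy4s.Schema API: ToySchema, Field, Target, JsonTarget and ShapeTextTarget are all hypothetical names invented for illustration, and the JSON rendering does no string escaping. What it tries to show is that a single description of a datatype can be compiled into several unrelated things, provided each format is written once against the abstraction.

```scala
// Toy stand-in for smithy4s.Schema (hypothetical and heavily simplified).
// A schema describes a datatype and can be compiled into any codec type F.
trait ToySchema[A]:
  def compile[F[_]](target: Target[F]): F[A]

// One field of a structure S: its label, its schema and its getter.
final case class Field[S, A](label: String, schema: ToySchema[A], get: S => A)

// What a format has to provide in order to work with every schema.
// `make` is unused by the encoders below; a decoder would need it.
trait Target[F[_]]:
  def string: F[String]
  def int: F[Int]
  def struct[S, A, B](name: String, a: Field[S, A], b: Field[S, B])(make: (A, B) => S): F[S]

object ToySchema:
  val string: ToySchema[String] = new ToySchema[String]:
    def compile[F[_]](target: Target[F]): F[String] = target.string
  val int: ToySchema[Int] = new ToySchema[Int]:
    def compile[F[_]](target: Target[F]): F[Int] = target.int
  def struct[S, A, B](name: String, a: Field[S, A], b: Field[S, B])(make: (A, B) => S): ToySchema[S] =
    new ToySchema[S]:
      def compile[F[_]](target: Target[F]): F[S] = target.struct(name, a, b)(make)

// Format 1: a naive JSON encoder (no escaping, for illustration only).
type JsonEncoder[A] = A => String
object JsonTarget extends Target[JsonEncoder]:
  def string: JsonEncoder[String] = s => "\"" + s + "\""
  def int: JsonEncoder[Int] = i => i.toString
  def struct[S, A, B](name: String, a: Field[S, A], b: Field[S, B])(make: (A, B) => S): JsonEncoder[S] =
    val encodeA = a.schema.compile[JsonEncoder](this)
    val encodeB = b.schema.compile[JsonEncoder](this)
    s => s"""{"${a.label}":${encodeA(a.get(s))},"${b.label}":${encodeB(b.get(s))}}"""

// Format 2: not a codec at all, just a Smithy-looking textual description.
type ShapeText[A] = String
object ShapeTextTarget extends Target[ShapeText]:
  def string: ShapeText[String] = "String"
  def int: ShapeText[Int] = "Integer"
  def struct[S, A, B](name: String, a: Field[S, A], b: Field[S, B])(make: (A, B) => S): ShapeText[S] =
    val typeA = a.schema.compile[ShapeText](this)
    val typeB = b.schema.compile[ShapeText](this)
    s"structure $name { ${a.label}: $typeA, ${b.label}: $typeB }"

final case class Person(name: String, age: Int)

// Written (or generated, or derived) once per datatype.
val personSchema: ToySchema[Person] =
  ToySchema.struct(
    "Person",
    Field[Person, String]("name", ToySchema.string, _.name),
    Field[Person, Int]("age", ToySchema.int, _.age)
  )((name, age) => Person(name, age))

@main def schemaDemo(): Unit =
  val toJson = personSchema.compile[JsonEncoder](JsonTarget)
  println(toJson(Person("Ada", 36)))                          // {"name":"Ada","age":36}
  println(personSchema.compile[ShapeText](ShapeTextTarget))   // structure Person { name: String, age: Integer }
```

Adding a third format here would mean writing one more Target, without touching Person or its schema.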

Why are those indirections valuable? Because they drastically reduce the cost of pivoting from one technology involved in remote communication to another. They allow for the ultimate form of pivoting between JSON libraries, HTTP libraries, protocols, etc., without impacting the business logic. They make it possible to centralise the handling of cross-cutting concerns that engineers worry about, without impacting the most valuable bits of the software that the business worries about.

They also save us, library maintainers, a lot of time: writing code-generators that produce enticing code that developers want to use is really, REALLY hard. Therefore, decoupling the responsibility of generating datatypes and interfaces from the responsibility of providing interoperability between the generated code and various serialisation and communication libraries is highly desirable. In short, the abstractions let us build a lot more tooling with a lot fewer people.

Think of it this way: you want to add support for a new serialisation format or a new HTTP library? Just write it against Schema or Service, and it becomes instantly available to use with the types generated by Smithy4s, past, present and future. There's no need to tweak the code-generator: you just have to write some good old, test-friendly Scala code. Want to log all method calls to an interface? Just write it once against the Service abstraction and all your existing services can benefit from it. The impact of writing library integrations against these abstractions scales wholesomely, which, in medium to large organisations, makes it extremely cost efficient.
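The "log all method calls" case can be sketched as follows. This is a toy, untyped model written for this post's example and not the real smithy4s.Service API (which is considerably richer): Call, Handler, ToyService and logged are hypothetical names. The point is only that once an interface can be converted to and from a generic handler, cross-cutting logic is written a single time.

```scala
// Hypothetical toy version of the idea behind smithy4s.Service: a method call
// is reified as data, so generic code can intercept it. Untyped for brevity.
final case class Call(operation: String, args: List[Any])
type Handler = Call => Any

trait ToyService[Alg]:
  def name: String
  def toHandler(impl: Alg): Handler      // interface instance => generic handler
  def fromHandler(handler: Handler): Alg // generic handler => interface stub

// Written once, works for every interface that has a ToyService instance.
def logged[Alg](impl: Alg, log: String => Unit)(using service: ToyService[Alg]): Alg =
  val underlying = service.toHandler(impl)
  service.fromHandler { call =>
    val args = call.args.mkString(", ")
    log(s"${service.name}.${call.operation}($args)")
    underlying(call)
  }

trait HelloService:
  def greet(firstName: String, lastName: String): String

// Handwritten here; this is the kind of boilerplate that is normally
// generated or derived rather than typed out.
given ToyService[HelloService] = new ToyService[HelloService]:
  val name = "HelloService"
  def toHandler(impl: HelloService): Handler = {
    case Call("greet", List(first: String, last: String)) => impl.greet(first, last)
    case other => throw IllegalArgumentException(s"Unsupported call: $other")
  }
  def fromHandler(handler: Handler): HelloService = new HelloService:
    def greet(firstName: String, lastName: String): String =
      handler(Call("greet", List(firstName, lastName))).asInstanceOf[String]

@main def serviceDemo(): Unit =
  val impl = new HelloService:
    def greet(firstName: String, lastName: String): String = s"Hello $firstName $lastName!"
  val service = logged[HelloService](impl, line => println(line))
  println(service.greet("Ada", "Lovelace"))
  // prints "HelloService.greet(Ada, Lovelace)" and then "Hello Ada Lovelace!"
```

The logged function knows nothing about HelloService; any other interface with a ToyService instance would get the same behaviour for free.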

spec-first

The advertised usage of Smithy4s is spec-first. Users are expected to write a specification in the Smithy IDL, which looks like this:

$version: "2"

namespace hello

structure Greeting {
  @required
  message: String
}

operation Greet {
  input := {
    @required
    firstName: String
    @required
    lastName: String
  }
  output: Greeting
}

service HelloService {
  operations: [Greet]
}

in order to get generated code that looks virtually like this:

case class Greeting(message: String)

object Greeting {
  implicit val schema: Schema[Greeting] = ???
}

trait HelloService[F[_]] {
  def greet(firstName: String, lastName: String): F[Greeting]
}

object HelloService {
  implicit val service: Service[HelloService] = ???
}

The specification-first approach really shines in medium-to-large organisations, as it facilitates a higher degree of communication between engineering teams that write pieces of software that need to talk to each other. It also has the benefit of offering a clear, common terminology across teams that do not necessarily use the same programming languages on a day-to-day basis. In other words: specifications written in an IDL are akin to legally binding contracts that help lay out responsibilities and expectations, before starting any implementation work.

The spec-first approach has other qualities: the simplicity of the Smithy metamodel makes it easy to build validation tools that catch incompatible changes to API specifications. This is pretty valuable in high-pressure environments, where deploying an incompatible API change could lead to a lot of precious hours of sleep being lost.

Overall, the spec-first approach works pretty well in large organisations, and it's probably for that very reason that AWS has developed the Smithy IDL.

code-first?

That being said, for all its qualities, the spec-first approach comes at the cost of cognitive overhead: users need to read the Smithy documentation, get used to its terminology (which, although pretty intuitive, differs slightly from Scala's), figure out how to get editor support for it, etc. This overhead is harder to justify in pet projects and small businesses. Additionally, the Scala ecosystem has recently seen the rise of scala-cli as an amazing tool for small POCs and projects, and the lack of support for code-generation in scala-cli creates a barrier for people who'd like to try Smithy4s.

This is where smithy4s-deriving comes into play. It allows deriving the implementations of Schema and Service (which are quite boilerplate-heavy) from Scala code. In essence, it tries to achieve this kind of UX:

case class Greeting(message: String) derives Schema

trait HelloService derives Service {
  def greet(firstName: String, lastName: String): IO[Greeting]
}

This allows users to derive instances of the abstractions that Smithy4s is built on top of, and therefore to use nearly all the integrations provided OOTB by Smithy4s itself or by third-party libraries, with the kind of tidily handcrafted code that you'd likely write on a day-to-day basis, all for free.
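For readers who haven't played with the derives keyword, here is a minimal sketch of the mechanism it relies on, using only the Scala 3 standard library. Describe is a hypothetical toy type class invented for this example, not smithy4s.Schema, and this is not how smithy4s-deriving is implemented. It only covers case classes, via scala.deriving.Mirror; deriving from the methods of a trait is not something Mirror offers and is not shown here.

```scala
import scala.deriving.Mirror
import scala.compiletime.{constValue, constValueTuple, erasedValue, summonInline}

// Hypothetical toy type class (NOT smithy4s.Schema): renders a type as text.
trait Describe[A]:
  def describe: String

object Describe:
  given Describe[String] with
    def describe = "String"
  given Describe[Int] with
    def describe = "Integer"

  // Summons a Describe instance for each element type of a tuple, at compile time.
  inline def describeAll[T <: Tuple]: List[Describe[?]] =
    inline erasedValue[T] match
      case _: EmptyTuple => Nil
      case _: (h *: t)   => summonInline[Describe[h]] :: describeAll[t]

  // The compiler calls this when a case class says `derives Describe`.
  inline def derived[A](using m: Mirror.ProductOf[A]): Describe[A] =
    val name = constValue[m.MirroredLabel]
    val labels = constValueTuple[m.MirroredElemLabels].toList.map(_.toString)
    val types = describeAll[m.MirroredElemTypes].map(_.describe)
    val fields = labels.zip(types).map((label, tpe) => s"$label: $tpe").mkString(", ")
    new Describe[A]:
      def describe = s"structure $name { $fields }"

// No handwritten instance: the compiler fills it in from the case class itself.
case class Greeting(message: String) derives Describe

@main def derivingDemo(): Unit =
  println(summon[Describe[Greeting]].describe) // structure Greeting { message: String }
```

The derived instance is computed from the field names and types of the case class, which is the same information a code generator would otherwise read from a specification.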

Isn't this fundamentally competing with the idea of Smithy4s as a tool that promotes spec-first? Well, a little, for sure. But at the same time, the abstractions provided by Smithy4s reflect the semantics of the Smithy IDL. This implies that Smithy models can be recreated from the Smithy4s abstractions. It's therefore possible to imagine a pet project starting as a small POC that uses scala-cli and smithy4s-deriving to create an API with a code-first approach, and then organically pivoting to a spec-first approach when the POC unavoidably grows into a multi-zillion-euro company that employs hundreds of Scala engineers.

But more importantly: the main reason for me to want to do it is because it's SO. GODDAMN. COOL.

  • How cool is it, right, that Scala 3's meta-programming features are so awesome that they make it possible, with a teeny-tiny bit of elbow grease, to derive instances not only for case classes and enums, but also for whole interfaces?
  • How cool is it that a library that was initially designed to be primarily used with generated code can be used with handcrafted datatypes and interfaces, granting access to a bunch of battle-tested features at the cost of a ridiculously small number of lines of code?
  • How cool is it that the right abstractions single-handedly make it possible to reconcile spec-first and code-first, which are the virtually un-mixable yin and yang of API design?

Am I blowing my own horn? Yes. Indubitably. Maybe this smithy4s-deriving thing won't actually go anywhere or be used by anyone... But as someone who constantly doubts himself and second-guesses every single decision he makes, for once I'm gonna treat myself to a pat on the back, because I feel like I've built something goddamn cool and I'm proud of it. Well done, me!


Anyway, maybe this is an intro to a series of posts that I'll write over the next 15 years (when time permits) about the design of Smithy4s and other software, and about my flaws as an engineer and human bean. Or maybe I'll just give up after this post. Who cares, it's not like anybody's reading any of this, right?