  • Johan Walters

Improving DateTimeFormatter.parseBest using Scala 3

Ever since I found out Scala 3 would be getting union types and some built-in generic programming through tuples (similar to shapeless), I knew I just had to tackle an old pet peeve of mine.


See, java.time.format.DateTimeFormatter has two methods for parsing text into data structures capable of storing the various time components supported by a given DateTimeFormatter, differing mainly in the number of queries you would like to pass:


public <T> T parse(CharSequence text, TemporalQuery<T> query)

public TemporalAccessor parseBest(CharSequence text, TemporalQuery<?>... queries)

Suppose your DateTimeFormatter lines up with a LocalDateTime, you could use

val ldt: LocalDateTime =
  ISO_LOCAL_DATE_TIME.parse("2021-09-14T01:02:03", LocalDateTime.from(_))

and the static type of your result is LocalDateTime (although exceptions can be thrown).


But if your parser has a lot of optional parts, you'd need to try a few data types to see which works best:


val ta: TemporalAccessor =
  ISO_DATE_TIME.parseBest("2021-09-14T01:02:03Z", OffsetDateTime.from(_), LocalDateTime.from(_))

yet the result type is simply 'TemporalAccessor'. So you use 'parse' when a single TemporalQuery fits, and get a precise result type, or 'parseBest' when several might fit, and lose that precision. Furthermore, if we provide only a single query to parseBest, an IllegalArgumentException is thrown!
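
To make the difference concrete, here is a minimal Scala 3 sketch of the behaviours described so far. The value names ('offsetQuery', 'localQuery' and so on) are made up for this illustration, and the queries are typed explicitly so that SAM conversion has a concrete target type:

```scala
import java.time.{LocalDateTime, OffsetDateTime}
import java.time.format.DateTimeFormatter.{ISO_DATE_TIME, ISO_LOCAL_DATE_TIME}
import java.time.temporal.{TemporalAccessor, TemporalQuery}
import scala.util.Try

// Explicitly typed queries (names are illustrative only).
val offsetQuery: TemporalQuery[OffsetDateTime] = OffsetDateTime.from(_)
val localQuery: TemporalQuery[LocalDateTime] = LocalDateTime.from(_)

// parse: a single query, and the static result type is precise.
val ldt: LocalDateTime =
  ISO_LOCAL_DATE_TIME.parse("2021-09-14T01:02:03", localQuery)

// parseBest: queries are tried in order, but the static type is only TemporalAccessor.
val withOffset: TemporalAccessor =
  ISO_DATE_TIME.parseBest("2021-09-14T01:02:03Z", offsetQuery, localQuery)
val withoutOffset: TemporalAccessor =
  ISO_DATE_TIME.parseBest("2021-09-14T01:02:03", offsetQuery, localQuery)

// parseBest with a single query fails at run time, not at compile time.
val singleQuery: Try[TemporalAccessor] =
  Try(ISO_DATE_TIME.parseBest("2021-09-14T01:02:03Z", offsetQuery))
```

Here 'withOffset' holds an OffsetDateTime and 'withoutOffset' a LocalDateTime at run time, yet the compiler knows neither, and 'singleQuery' is a Failure wrapping the IllegalArgumentException.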


To summarize, the problems are:

  • you need to choose between two different methods depending on whether you need one or more data types matching your DateTimeFormatter

  • passing fewer than two queries to parseBest results in an exception

  • the result type for two or more arguments is imprecise, needing instanceof checking and casting in Java or pattern matching in Scala with redundant catch-all clauses.


ta match {
  case o: OffsetDateTime => ???
  case l: LocalDateTime  => ???
  case d: LocalDate => ??? // will never reach, but no warning
  case _ => ??? // will never reach, but required!
}

Union types to the rescue


Can we do better? Yes! I introduce 'parseBetter':

val dateTime: OffsetDateTime | LocalDateTime = ISO_DATE_TIME.parseBetter("2021-09-14T01:02:03Z")(OffsetDateTime.from(_), LocalDateTime.from(_))

The result type is a union derived from the presented query types and is fully inferred by the Scala 3 compiler. The only way to properly handle that union type is using a match expression with cases for exactly the given types. Providing too many or too few match cases will produce a compile error (Unreachable case) or a warning (match may not be exhaustive) respectively.

dateTime match {
  case o: OffsetDateTime => ???
  case l: LocalDateTime  => ???
  // no more, no less!
}
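
The same kind of match can be tried out without parseBetter. The following is a small sketch on a plain union type; 'describe', 'local' and 'utc' are hypothetical names used only for this illustration:

```scala
import java.time.{LocalDateTime, OffsetDateTime, ZoneOffset}

// Hypothetical helper: a match over exactly the members of the union type.
def describe(value: OffsetDateTime | LocalDateTime): String = value match {
  case o: OffsetDateTime => s"offset ${o.getOffset}"
  case l: LocalDateTime  => "no offset"
}

val local = LocalDateTime.of(2021, 9, 14, 1, 2, 3)
val utc = OffsetDateTime.of(local, ZoneOffset.UTC)
```

Removing one of the two cases, or adding a 'case d: LocalDate', is a quick way to see how the compiler reacts.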

Undecipherable type signatures?


Now hold on to your seats and behold the type signature of parseBetter:

extension (dtf: DateTimeFormatter) {
  inline def parseBetter[T <: NonEmptyTuple: IsMappedBy[[R] =>> TemporalAccessor => R]](text: String)(queries: T): Union[InverseMap[T, [R] =>> TemporalAccessor => R]]
}

Your reaction may vary between

  • I know Scala, but I can't read that.

  • I don't know Scala and you've convinced me to run away even faster.

Now y'all relax, this is just an exercise in improving type safety in type/method signatures to decrease stupid mistakes in general. I'm not saying your average Scala production code should use all the new Scala 3 features everywhere as I did here. I guess Scala does require a bit of acquired taste for allowing 'ugly' type signatures (CanBuildFrom anyone?) to get 'clean' calling code (no Java-esque .collect(toList()) boilerplate please).


Now let's break this down a bit.

  • extension: this is the new Scala 3 extension method syntax.

  • 'inline' helps the compiler (in this case) to know the types at use-site more precisely, similarly to how Kotlin handles reified generics.

  • scala.NonEmptyTuple: a scala.Tuple which cannot be empty. It is the type of our queries parameter. So our queries are not a vararg (losing the element-specific generics) but a Tuple (remembering each type precisely). Luckily at use-site, using a tuple argument looks very much the same as a vararg argument, so double parentheses are not needed, as can be seen in the example above.

  • [R] =>> TemporalAccessor => R: a type lambda. Essentially the same as separately defining 'type TempQuery[R] = TemporalAccessor => R', and then using e.g. 'IsMappedBy[TempQuery]' instead. But hey, I don't like bothering users with one-off type aliases just for usage in a single method. In a way, that type alias does exist: 'TemporalQuery[R]' is a Single Abstract Method (SAM) interface corresponding to 'TemporalAccessor => R'. I tried using 'IsMappedBy[TemporalQuery]' directly, but the compiler cannot infer here that 'LocalDateTime.from(_)' is a valid 'TemporalQuery[LocalDateTime]'. This substitution of lambdas into the required SAM interface by the compiler is called SAM conversion, which normally works fine, but apparently doesn't play well with all the inlining happening here.

  • scala.Tuple.IsMappedBy tells us each element in the tuple must have the same 'wrapper type', in this case essentially 'TemporalQuery[*]'.

  • scala.Tuple.InverseMap peels off those wrappers, so we go from e.g. (TemporalAccessor => LocalDateTime, TemporalAccessor => OffsetDateTime) to (LocalDateTime, OffsetDateTime).

  • scala.Tuple.Union builds a union type of the tuple elements, so again e.g. from (LocalDateTime, OffsetDateTime) to LocalDateTime | OffsetDateTime. Which is exactly the return type we want.
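
These three building blocks can also be checked in isolation. The sketch below does that; the aliases 'Query' and 'Queries' and all value names are hypothetical, introduced only for this illustration:

```scala
import java.time.{LocalDateTime, OffsetDateTime, ZoneOffset}
import java.time.temporal.TemporalAccessor
import scala.Tuple.{InverseMap, IsMappedBy, Union}

// Hypothetical alias standing in for the type lambda [R] =>> TemporalAccessor => R.
type Query[R] = TemporalAccessor => R
type Queries = (Query[OffsetDateTime], Query[LocalDateTime])

// IsMappedBy: evidence that every element of Queries is a Query[...].
val evidence = summon[IsMappedBy[Query][Queries]]

val local = LocalDateTime.of(2021, 9, 14, 1, 2, 3)
val utc = OffsetDateTime.of(local, ZoneOffset.UTC)

// InverseMap: with the Query wrapper peeled off, a plain pair of values fits.
val peeled: InverseMap[Queries, Query] = (utc, local)

// Union: a value of either element type is accepted.
val first: Union[InverseMap[Queries, Query]] = utc
val second: Union[InverseMap[Queries, Query]] = local
```

If this compiles, the match types reduced the way the bullets above describe; assigning, say, a LocalDate to 'first' should be rejected.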

All this tuple/wrapping/unwrapping stuff may be familiar if you have used shapeless before but is also common in TypeScript. The concept of type lambdas is not new either.


For this post I have chosen not to use an Either/Try/Option return type to stay close to the original Java API, still throwing DateTimeParseExceptions (mind that this is essentially a wrapper around DateTimeFormatter.parseBest).
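
For comparison, a non-throwing variant could look roughly like the sketch below. It wraps the plain Java parseBest rather than parseBetter, and 'parseBestTry' is a hypothetical name that is not part of the code in this post:

```scala
import java.time.{LocalDateTime, OffsetDateTime}
import java.time.format.DateTimeFormatter
import java.time.temporal.{TemporalAccessor, TemporalQuery}
import scala.util.Try

extension (dtf: DateTimeFormatter) {
  // Hypothetical helper: captures the exceptions thrown by parseBest in a Try.
  def parseBestTry(text: String, queries: TemporalQuery[?]*): Try[TemporalAccessor] =
    Try(dtf.parseBest(text, queries*))
}

val offsetQuery: TemporalQuery[OffsetDateTime] = OffsetDateTime.from(_)
val localQuery: TemporalQuery[LocalDateTime] = LocalDateTime.from(_)

val ok = DateTimeFormatter.ISO_DATE_TIME.parseBestTry("2021-09-14T01:02:03Z", offsetQuery, localQuery)
val bad = DateTimeFormatter.ISO_DATE_TIME.parseBestTry("not a date", offsetQuery, localQuery)
```

Here 'ok' is a Success and 'bad' a Failure holding the DateTimeParseException, at the cost of an extra layer for every caller to unwrap.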


Throw less


If parseBetter is just a type-safe wrapper around parseBest, then what about that IllegalArgumentException being thrown when we don't provide enough queries? We can catch that at compile time too! We could change the signature to:

inline def parseBetter[T <: NonEmptyTuple: IsMappedBy[[R] =>> TemporalAccessor => R]](text: String)(queries: T)(using Tuple.Size[T] > 1 =:= true): Union[InverseMap[T, [R] =>> TemporalAccessor => R]]

The added part is the trailing 'using' clause. Note that '1' and 'true' are singleton types here. It's not a value of type Int, but really type 1 (which holds exactly one value, indeed 1, just like Unit only holds ()). Also note the '>', which is part of scala.compiletime.ops.int: compile-time arithmetic! Now providing a single query won't compile, so an IllegalArgumentException is impossible.
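
Singleton types and compile-time arithmetic can be tried on their own too. In the sketch below, 'atLeastTwo' is a hypothetical helper that uses the same kind of 'using' clause as the signature above; it is not part of parseBetter:

```scala
import scala.compiletime.ops.int.>

// Singleton types: each of these types has exactly one inhabitant.
val one: 1 = 1
val yes: true = true

// Compile-time arithmetic: the type 2 > 1 reduces to the singleton type true.
val twoIsMore: 2 > 1 = true

// Hypothetical helper: only accepts tuples with at least two elements.
def atLeastTwo[T <: Tuple](t: T)(using Tuple.Size[T] > 1 =:= true): Int =
  t.productArity

val arity = atLeastTwo((1, "a"))
// atLeastTwo(Tuple1(1)) // expected to be rejected: no evidence that false =:= true
```

Uncommenting the last line is an easy way to see the kind of error message such a constraint produces.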


I must say the error message isn't beautiful when we try to use this method with a single query argument. Although Scala 3 provides facilities for custom compile error messages, I won't dive into that, if it is possible at all here. Instead, why don't we just allow a single argument? We just need to choose between forwarding to DateTimeFormatter.parse and DateTimeFormatter.parseBest. We can keep using the original type signature I presented.


Now we have also tackled the nuisance of needing two different methods depending on the number of arguments to provide. The only illegal number of arguments is zero. I could have chosen to enforce that by adapting the provided 'using' clause to '> 0', but luckily there exists the type 'scala.NonEmptyTuple' for this use case.


Just show me the code!


Now for the implementation:

import java.time.*
import java.time.format.*
import java.time.temporal.*
import scala.Tuple.*
import scala.compiletime.ops.int.>
import scala.runtime.Tuples
import cats.implicits._
import cats.data.NonEmptyList

extension (dtf: DateTimeFormatter) {
  inline def parseBetter[T <: NonEmptyTuple: IsMappedBy[[R] =>> TemporalAccessor => R]](text: String)(queries: T): Union[InverseMap[T, [R] =>> TemporalAccessor => R]] = {
    type U = Union[InverseMap[T, [R] =>> TemporalAccessor => R]]
    val list: NonEmptyList[TemporalQuery[U]] =
      queries.toNelF.map { tq => ta => tq(ta) }
    inline if (valueOf[Tuple.Size[T]] == 1) {
      dtf.parse(text, list.head)
    } else {
      dtf.parseBest(text, list.toList*)
    }.asInstanceOf[U]
  }
}
extension [T <: NonEmptyTuple](t: T) {
  // 'IsMappedBy'-aware non-empty version of Tuple#toList
  inline def toNelF[F[_]](using IsMappedBy[F][T]): NonEmptyList[F[Union[InverseMap[T, F]]]] = {
    NonEmptyList.fromListUnsafe(t.productIterator.toList).asInstanceOf[NonEmptyList[F[Union[InverseMap[T, F]]]]]
  }
}

The body of parseBetter consists of

  • a type alias to prevent a bit of repetition

  • a conversion from e.g. (TemporalAccessor => OffsetDateTime, TemporalAccessor => LocalDateTime) to NonEmptyList[TemporalAccessor => OffsetDateTime | LocalDateTime], followed by SAM conversion to TemporalQueries needed for the Java API.

  • a compile-time evaluated if-then-else! At call-site, only the then or else part is effectively inlined, depending on whether the query tuple has 1 or more elements. Omitting 'inline' also works but results in a run-time check.
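
The compile-time if-then-else can be isolated in a few lines. This sketch uses 'scala.compiletime.constValue' where the implementation above uses 'valueOf', and 'arityLabel' is a hypothetical helper made up for the illustration:

```scala
import scala.compiletime.constValue

// Hypothetical helper: the branch is selected while inlining, based on the tuple's static size.
inline def arityLabel[T <: Tuple](t: T): String =
  inline if (constValue[Tuple.Size[T]] == 1) "single" else "multiple"

val single = arityLabel(1 *: EmptyTuple)
val multiple = arityLabel((1, "a"))
```

After inlining, each call site should contain only the string literal of the selected branch, without a size check at run time.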

Remarks:

  • Some casting remains necessary inside the implementation, here '.asInstanceOf[U]'. This is because we re-use the more loosely typed Java implementation. The 'we know better what the return type is' part is basically the whole point of this exercise.

  • There exists a method Tuple.toList, but that won't infer the nested function and union types properly as I need them. I guess a future iteration of the Scala 3 compiler could improve on this? Instead, I created my own version (toNelF) that is 'IsMappedBy'-aware, and made it rely on cats' 'NonEmptyList' (hence also the NonEmptyTuple) too. There is still a bit of casting going on inside the implementation, reflecting the implementation of Tuple.toList. Perhaps writing toNelF as an extension method is a bit over the top, but I'd like to put emphasis on the fact that it resembles and replaces Tuple.toList here.

  • The type lambda '[R] =>> TemporalAccessor => R' can also be written as 'TemporalAccessor => _' with scalac flag '-Ykind-projector:underscores', or as 'TemporalAccessor => *' using scalac flag '-Ykind-projector'. The underscore syntax will be the official Scala 3 syntax but isn't (yet) enabled without a compiler flag as it would break Scala 2 code which needs to be migrated or cross-compiled. The asterisk syntax is identical to what the kind-projector compiler plugin for Scala 2 allows you to write. See here for more details.

  • I would have loved to have a type bound R <: TemporalAccessor in the type lambdas, but then the compiler throws a NullPointerException (!).

  • In my own code, I switched the 'text' and 'query' arguments around which is less similar to the original but improves ergonomics/type inference a bit.


Conclusion

Ok, so I got to play around with shiny new toys provided by Scala 3. I found a not-that-problematic API in the Java standard library but improved on it anyway (unless you find that the type signature isn't worth the benefit). But we all stumble into situations now and then where we know what the return type should be, and it is not the least upper bound (LUB) of the arguments. Sometimes we use type classes. Sometimes we use macros. Sometimes we use dependent types (which are also formalized in Scala 3). Or we use the new Scala 3 match types. Here I show how to use some of the new Tuple machinery (which is built on top of match types!). If you like Scala as I do, you've found another tool for your toolbox to apply tastefully.


P.S.

If you feel Scala is already too complex, I would say: just use the subset that matches your preference! Don't be fooled into thinking that typical Scala blog post material is anywhere near representative of typical Scala production code in the wild. A large portion of my daily code could be almost grep-replaced with 's/case class/data class/' and 's/def/fun/' and turned into Kotlin. That's to show how much I like Kotlin, but every so often I would be annoyed by limitations that I simply hardly encounter in Scala. Type-safety and expressiveness FTW!


You can play around with a slightly modified version of the code here: https://scastie.scala-lang.org/t3nImN1pSnmpKyL54IbuwA
