# An odd type inference problem

This is more of a note than a full thought. Consider the following SML code:

``````datatype 'a bal = LEAF of 'a
| NODE of ('a * 'a) bal

fun first (LEAF a) = a
| first (NODE x) = #1 (first x)
``````

The definition of `first` does not typecheck, although it has a very clear meaning: it gets the first element of its argument, a perfectly balanced binary "tree" (an odd tree to be fair, as all data is stored in its leaves, but this is just an example). Conceptually, it should have type `'a bal -> 'a`. The reason why it doesn't typecheck, though, is fairly clear: in the recursive invocation `(first x)`, `x`'s type, `('a * 'a) bal`, is unified with `first`'s parameter's type, `'a bal`. This would result in the infinite type `'a = 'a * 'a`, which is caught by the occurs check. Perhaps what we're doing here seems vaguely suspect to you. In that case, consider the following code:

``````datatype ('a,'b) alt = NIL
| CONS of 'a * ('b,'a) alt

fun altmap _ _ NIL = NIL
| altmap f g (CONS (x,xs)) = CONS (f x, altmap g f xs)

(* Which of these type signatures is correct? *)
val altmap : ('a -> 'c) -> ('a -> 'c) -> ('a,'a) alt -> ('c,'c) alt
val altmap : ('a -> 'c) -> ('b -> 'd) -> ('a,'b) alt -> ('c,'d) alt
``````

What is the type of `altmap`? Intuitively, it ought to be the second type: there's nothing in `altmap` that prevents its final argument from being of type `('a,'b) alt`. `f` is only ever applied to `'a`'s, and `g` is only ever applied to `'b`'s. But SML infers (and requires) the former type. Again, it's clear why this happens from a type inference perspective: it's precisely the same reason as above. The type of the arguments to the recursive call to `altmap` are unified with the types of the parameters to `altmap`, unifying `'a` and `'b`.

Having written a basic type inferencer, I can say that acheiving the "desired" behavior above is at best nontrivial. Unfortunately I don't know enough type theory to say more than that, especially regarding whether assigning the looser types allows for anything fundamentally "wrong" from a theoretical point of view.

A footnote: The choice of SML here is irrelevant: both OCaml and Haskell have the same behavior. However, if I explicitly type-annotate `first` and `altmap` in Haskell, GHC accepts them without complaint. I do not know whether this is a feature of Haskell per se or of GHC, with its innumerable extensions.