An odd type inference problem
This is more of a note than a full thought. Consider the following SML code:
datatype 'a bal = LEAF of 'a
| NODE of ('a * 'a) bal
fun first (LEAF a) = a
| first (NODE x) = #1 (first x)
The definition of first
does not typecheck, although it has a very clear
meaning: it gets the first element of its argument, a perfectly balanced binary
"tree" (an odd tree to be fair, as all data is stored in its leaves, but this is
just an example). Conceptually, it should have type 'a bal -> 'a
. The reason
why it doesn't typecheck, though, is fairly clear: in the recursive invocation
(first x)
, x
's type, ('a * 'a) bal
, is unified with first
's parameter's
type, 'a bal
. This would result in the infinite type 'a = 'a * 'a
, which is
caught by the occurs check. Perhaps what we're doing here seems vaguely suspect
to you. In that case, consider the following code:
datatype ('a,'b) alt = NIL
| CONS of 'a * ('b,'a) alt
fun altmap _ _ NIL = NIL
| altmap f g (CONS (x,xs)) = CONS (f x, altmap g f xs)
(* Which of these type signatures is correct? *)
val altmap : ('a -> 'c) -> ('a -> 'c) -> ('a,'a) alt -> ('c,'c) alt
val altmap : ('a -> 'c) -> ('b -> 'd) -> ('a,'b) alt -> ('c,'d) alt
What is the type of altmap
? Intuitively, it ought to be the second type:
there's nothing in altmap
that prevents its final argument from being of type
('a,'b) alt
. f
is only ever applied to 'a
's, and g
is only ever applied
to 'b
's. But SML infers (and requires) the former type. Again, it's clear
why this happens from a type inference perspective: it's precisely the same
reason as above. The type of the arguments to the recursive call to altmap
are
unified with the types of the parameters to altmap
, unifying 'a
and 'b
.
Having written a basic type inferencer, I can say that acheiving the "desired" behavior above is at best nontrivial. Unfortunately I don't know enough type theory to say more than that, especially regarding whether assigning the looser types allows for anything fundamentally "wrong" from a theoretical point of view.
A footnote: The choice of SML here is irrelevant: both OCaml and Haskell have
the same behavior. However, if I explicitly type-annotate first
and altmap
in Haskell, GHC accepts them without complaint. I do not know whether this is a
feature of Haskell per se or of GHC, with its innumerable extensions.