Monday, March 2, 2009

Complete and Decidable Type Inference for GADTs

We have a new approach to type inference for GADTs that's much simpler than previous approaches and not ad-hoc:

Complete and Decidable Type Inference for GADTs
Tom Schrijvers, Simon Peyton Jones, Martin Sulzmann, Dimitrios Vytiniotis
Submitted to ICFP 2009

GADTs have proven to be an invaluable language extension, a.o. for ensuring
data invariants and program correctness. Unfortunately, they pose a tough
problem for type inference: we lose the principal-type property, which is
necessary for modular type inference.

We present a novel and simplified type inference approach for local type
assumptions from GADT pattern matches. Our approach is complete and decidable,
while more liberal than previous such approaches.

Get the draft.

7 comments:

Thomas Schilling said...

Is there a typo in the paper?

data T :: * -> * where
T1 :: Int -> T Bool
T2 :: Int

f2 (T1 n) = n -- shouldn't 'n' have type Int?
f2 (T2 xs) = null xs

How can the above function ever be typeable with result type Bool? Or are you saying that the function is not typeable but the compiler can't figure that out on its own? If so, this is a bit confusing if you don't say so.

Tom Schrijvers said...

Thanks for telling me. It's a typo. The code should read

f2 (T1 n) = n > 0
f2 (T2 xs) = null xs

Anonymous said...

I expect the ICFP reviewers will point this out, but in the meantime, you should expand the strange abbreviation "a.o." in the abstract. Most people will have no idea what this means; it is not even guessable, because "among others" normally follows the things that are among others, like this:

GADTs have proven to be an invaluable language extension for ensuring data invariants and program correctness, among others.

gfh said...
This post has been removed by a blog administrator.
Anthony Clayden said...

Somewhere in GHC 6.10 releases it became possible to put repeated type vars in type instance heads, such as:
> type instance F (a, b) a = ...
> type instance F (a, b) b = ...

This is useful for type-indexing tuples (in the style of HLists).

In 6.9 repeated tyvars were banned, and I'd tried (and failed) to achieve this as:
> type instance (a ~ a') => F (a, b) a' = ...

(Whether or not this might be valid one day, we *still* can't put constraints on type instances grr.)
In any case, the equality as a constraint would only act to validate the instance after selection, whereas the repeated tyvar plays a part in determining which instance applies.

How is this instance selection implemented in the type inference engine? I can't see it mentioned in this GADT paper, nor last year's "Type Checking ..." paper.

Tom Schrijvers said...

Anthony,

I don't know about the current state of GHC, but the type instances should be confluent for soundness reasons. You don't want

> type instance F (a,b) a = Int
> type instance F (a,b) b = Bool

because then F (Char,Char) Char can reduce to both Int and Bool. An easy solution is to disallow overlap of the type instance heads. More involved approaches also look into the bodies.

Anthony Clayden said...

Thank you Tom,
Yes I realise those are overlapping instances, and that's the real point of asking.
I know by construction I won't get overlaps. (That is, for all uses: a /~ b. Of course, I'm not really using mere tuples, but a specialised data type.)
Using FunDeps I can tell the compiler to 'trust' me with -XIncoherentInstances and/or -XUndecidableInstances -- but I agree this rapidly turns into 'swampy ground'.
So Type Functions ought to offer a more tractable way forward. But instead the overlaps opportunity got 'solved' by banning them altogether (or allowing only a very limited form where the instances must be confluent, as you say).
It seemed to me that the GADT paper's approach of introducing 'local' implications at use sites might offer a way forward. If only there were a way of telling type inference that there are to be no uses falling into the area of instance overlaps.

Post a Comment