Hindley-Milner type inference algorithim, explained

joe_the_user · on Dec 29, 2008

OK,

The article indeed shows what the Hindley-Milner algorithm is. The discussion is similar to a google techtalk I've seen. The discussion seems straightforward and I can see the connection to automatic theorem proving and prolog. But the article doesn't particular say why the algorithm is cool. I mean, one could embedded some logic problem in the types of the functions one used and get the answer to the problem using Hindley-Milner but it would be easier to just use prolog or an automatic theorem prover. Further, not all programming problems are easily amenable to being put into the form of a logic problem.

So I'm waiting for the article that shows why this is cool, I.E., what concrete problems the Hindley-Milner algorithm would be useful for.

rcoder · on Dec 29, 2008

"So I'm waiting for the article that shows why this is cool, I.E., what concrete problems the Hindley-Milner algorithm would be useful for."

Here's a quick take:

Ruby programmers (and, to a lesser extent, other dynamic OO coders) tend to get very excited about concepts like "duck typing" -- 'if it walks like a duck, it may as well be a duck' -- and tend to use it heavily for polymorphism, future-proof APIs, and other nifty "agile" programming techniques.

However, with HM type inference and either modular programming (ala ML) or typeclasses (ala Haskell), you can gain most of the same terseness and dynamism of your "duck-typed" Ruby code, while still having strong compile-time checks that your "duck-like" objects in fact satisfy all the requirements of "duck-ness".

The biggest trade-off, IMO, is that you now have a little logic programming language embedded in your type system, and can't compile your program until all the type constraints can be resolved.

Also, unlike Java, where such problems are (from the production code I've seen, anyway) usually handled with a runtime cast through the Object ground type, languages with HM type inference usually require there to actually be a valid type-conversion function between the mis-matched types in question.

None of this is unique to Hindley-Milner type inference, of course; HM is just the standard algorithm for languages which implement the simply-typed lambda calculus.

silentbicycle · on Dec 29, 2008

> Also, unlike Java, where such problems are [...] usually handled with a runtime cast through the Object ground type, languages with HM type inference usually require there to actually be a valid type-conversion function between the mis-matched types in question.

And they usually also have pattern-matching, which makes this relatively straightforward (when there is in fact a valid conversion).

It's worth taking a look at the OO system in OCaml, by the way. It works very similarly to duck-typing, but it also infers the required methods at compile time: "this function requires its first argument to be any object that takes this message with these argument type(s) and returns that type." This decouples polymorphism from inheritance/subclassing. That way, you can use OO without getting too bogged down in yucky "Circle/Ellipse" problems ( http://c2.com/cgi/wiki?CircleAndEllipseProblem ).

Ironically, the OO system in OCaml doesn't seem to get used all that much, though - with the other abilities the language already gives you, you don't really need to use OO except for problems for which it is actually a good fit, and in practice that's fewer than many people would claim.

joe_the_user · on Dec 30, 2008

"Ruby programmers (and, to a lesser extent, other dynamic OO coders) tend to get very excited about concepts like "duck typing" -- 'if it walks like a duck, it may as well be a duck' -- and tend to use it heavily for polymorphism, future-proof APIs, and other nifty "agile" programming techniques."

OK, I think I get the advantage in terseness and power relative other static languages. Does it, or how does it, give you any of the other advantages of Duck-typing (such test functions that equally use mocks, stubs or the real objects?

rcoder · on Dec 30, 2008

"OK, I think I get the advantage in terseness and power relative other static languages. Does it, or how does it, give you any of the other advantages of Duck-typing (such test functions that equally use mocks, stubs or the real objects?"

First, having a robust type system (whether simplified for the programmer by the presence of type inference or not) makes a large number of the types of tests performed in scripting language code unnecessary. Return types, errors raised, bounds checks, etc., are all implicit in the type signature of a function, and so do not need to be checked.

Secondly, at least in a pure language like Haskell, stateful code must be delimited via monads. Good program design dictates that you make those stateful components as small as possible, which then allows you to test the remainder of your code without resorting to mocks and stubs at all.

As a basic example, instead of passing a mocked file handle object between multiple methods, you might change your API to use a sequence of strings representing lines in the source file. At test time, you can simply pass any such sequence to your functions rather than mocking or stubbing all the potential behaviors of a "real" file.

That being said, your real question seems to be, "can I substitute X for Y, as I can in Ruby/Python/etc.?" My answer to that would be "yes, as long as there are (potentially trivial) implementations of all the required functions in the typeclass(es) of Y that we care about."

Haskell typing, at least, acts a lot more like an abstract interface than a concrete class, and a single datum is likely to be a member of many different typeclasses. However, each parameter to a function is constrained by a single typeclass, so you need not provide for every possible operation on a piece of data in order to mock it for testing.

silentbicycle · on Dec 30, 2008

H-M doesn't necessitate an object system, but that will work in OCaml's, specifically. See the Duck Typing section here: http://enfranchisedmind.com/blog/2008/04/14/useful-things-ab...

davidw · on Dec 29, 2008

If I understand things right, it can give you static typing, and the benefits that entails, without all the "Foobar foobar = new Foobar()" crap. It just figures it out for you based on how your code is written. Maybe I'm wrong though...

silentbicycle · on Dec 29, 2008

without all the "Foobar foobar = new Foobar()" crap

Exactly. You will spend less time restating things the compiler should be able to figure out by itself.

Vanilla H-M type inference (as in SML and OCaml) requires you to write code which the type inference can completely resolve, however. This tends to lead to compilers that seem really strict until you learn to work with rather than against them, but are excellent at catching several entire categories of bugs at compile-time. The type inference also works with complicated higher-order functions (where the full type signature might be several lines long!), not just trivial examples, and rapidly determines what code will be affected by any small changes, but it can add a little more friction to rapidly changing code, and every once in a while you will still need to do things to appease it.

Whether or not this is a net plus or minus depends on your programming style, the project, etc. (Dynamically-typed languages don't need type annotations either, but they have different trade-offs.)

stcredzero · on Dec 29, 2008

What if someone developed a language where the compiler did Hindley-Milner type inference and was implemented in the language itself? This would make the inferencing available to IDEs, which could then automatically fill in type annotations for the programmer.

silentbicycle · on Dec 29, 2008

Haskell does this already. You can ask the Haskell runtime the type annotation for a given function. The Haskell's Emacs mode (http://www.haskell.org/haskell-mode/) can insert it at point.

Also, PLT Scheme's "Mr. Spidey" (http://www.plt-scheme.org/software/mrspidey/) looks like an interesting type-inference system (non-HM), though I haven't personally used it.

nostrademons · on Dec 29, 2008

Hindley Milner type inference in 112 lines of Haskell:

http://github.com/nostrademons/eve-language/tree/dc33a5a187b...

I'm working on a constraint-typing algorithm now (like Haskell's) that has records and polymorphism. It's quite a bit harder, but vanilla Hindley-Milner makes it impossible to have any sort of subtyping or ad-hoc polymorphism, both of which are pretty necessary for practical programming.

If you're into this stuff, I'd check out the paper "Typing Haskell in Haskell", which presents a full implementation of Haskell's typechecker as a literate Haskell program.

silentbicycle · on Dec 29, 2008

Haskell has typeclasses (http://haskell.org/haskellwiki/A_brief_introduction_to_Haske...), though. They allow inference of required type characteristics ("requires a type which can be sorted", "requires a numeric type", etc.), rather than type identity, which can solve many of the same problems.

nostrademons · on Dec 29, 2008

Right, that's part of what I'm trying to implement. They're well-covered in THiH.

Typeclasses have some pitfalls though: in particular, they are not types, so you can't have a typeclass as a field in a data structure. (You can have them as part of the context of an explicit forall, eg. existential types, but those are more complicated than the average programmer can understand.)

Bruno Oliveira and Martin Sulzmann have recently done some interesting work with unifying typeclasses and ordinary algebraic data types:

http://www.comlab.ox.ac.uk/people/Bruno.Oliveira/objects.pdf

But currently type inference for this is "left to a future paper", so it's not all that useful for my purposes. I suspect it may be undecidable, given the known undecidability results for System F-omega and other type systems with subtyping.

schtog · on Dec 29, 2008

Since there seem to be some confusion over what it is good for I'll post some excerpts from a Haskell-session: http://hpaste.org/13412

Basically you get the best of both worlds, static typing, ie types checked at compile-time but you don't have to write out all the types(you can however if you want to).

  add x y = x + y
  Prelude> :t add
  add :: (Num a) => a -> a -> a

nihilocrat · on Dec 29, 2008

Ground Zero

I think he means 'square one'. One of my pet peeves.

river_styx · on Dec 29, 2008

*algorithm