Journal of Functional Programming, vol. 24 (5) (2014)
Andreas Rossberg, Claudio Russo, Derek Dreyer
ML modules are a powerful language mechanism for decomposing programs into reusable components. Unfortunately, they also have a reputation for being "complex" and requiring fancy type theory that is mostly opaque to non-experts. While this reputation is certainly understandable, given the many non-standard methodologies that have been developed in the process of studying modules, we aim here to demonstrate that it is undeserved. To do so, we give a very simple elaboration semantics for a full-featured, higher-order ML-like module language. Our elaboration defines the meaning of module expressions by a straightforward, compositional translation into vanilla System Fω (the higher-order polymorphic λ-calculus), under plain Fω typing environments. We thereby show that ML modules are merely a particular mode of use of System Fω. We start out with a module language that supports the usual second-class modules with Standard ML-style generative functors, and includes local module definitions. To demonstrate the versatility of our approach, we further extend the language with the ability to package modules as first-class values — a very simple extension, as it turns out — and a novel treatment of OCaml-style applicative functors. Unlike previous work combining both generative and applicative functors, we do not require two distinct forms of functor or sealing expressions. Instead, whether a functor is applicative or not depends only on the computational purity of its body — in fact, we argue that applicative/generative is rather incidental terminology for what is best understood as pure vs. impure functors. This approach results in a semantics that we feel is simpler and more natural, and moreover prohibits breaches of data abstraction that are possible under earlier semantics for applicative functors. We also revive (in refined form) the long-lost notion of structure sharing from SML'90. Although previous work on module type systems has disparaged structure sharing as type-theoretically questionable, we observe that (1) some variant of it is in fact necessary in order to provide a proper treatment of abstraction in the presence of applicative functors, and (2) it is straightforward to account for using ``phantom types''. Based on this, we can even justify the (previously poorly understood) "where module" operator for signatures and the related notion of manifest module specifications. Altogether, we describe a comprehensive, unified, and yet simple semantics of a full-blown module language that — with the main exception of cross-module recursion — covers almost all interesting features that can be found in either the literature or in practical implementations of ML modules. We prove the language sound and its type checking decidable.