Reiner Pope

October 10, 2009

Instance Datatypes: a language extension

Filed under: Haskell,Instance-datatypes — Reiner Pope @ 9:12 am

Consider: for any Applicative f and Num a, we could make (f a) a Num, using the declaration

> instance (Applicative f, Num a) => Num (f a) where
>     (+) = liftA2 (+)
>     (*) = liftA2 (*)
>     ...

However, we never actually want to declare such a general instance as the one above, because it rules out instances which aren’t implemented via Applicative. Instead, a better approach is for some library to define

> applicativePlus  = liftA2 (+)
> applicativeTimes = liftA2 (*)
> ...

and then have everyone who wants the Applicative instance for their particular type (say, Foo) write

> instance Num (Foo a) where
>     (+) = applicativePlus
>     (*) = applicativeTimes
>     ...

Unfortunately, each instance we declare still needs to mention all 6 functions in the Num class. What I would like to do is to say to Haskell, “use the Applicative instance for Num”, and say it just once.1

Here is a language extension that could allow you to do that. For want of a better name, I will refer to it as “Instance Datatypes” (ID).

The extension

For each class declaration

> class cxt => C a1 ... an where
>    v1 :: ty1
>    ...
>    vn :: tyn

we implicitly declare the following datatype:

> data @C a1 ... an = 
>    @C {
>       @v1 :: ty1,
>       ...
>       @vn :: tyn
>     }

Note that superclasses are irrelevant as far as the datatype is concerned.

A new syntactic form for instance declarations is added:

> instance cxt => C ty1 ... tyn = expr

which is semantically equivalent to

> d :: cxt => @C ty1 ... tyn
> d = expr
> instance cxt => C ty1 ... tyn where
>    v1 = @v1 d
>    ...
>    vn = @vn d

where d is some unused variable name.2

The @C values can be constructed and deconstructed just as ordinary record datatypes. So, for example, we could build an @Eq value:

> buildEq :: (a -> a -> Bool) -> @Eq a
> buildEq eq = @Eq {
>                    (@==) = eq,
>                    (@/=) = \a b -> not (a `eq` b)
>                  }

Some more details of the extension are discussed later, but first I give some examples.


The “Applicative instance for Num” problem

We solve the original problem:

> -- in some library:
> applicativeNum :: (Applicative f, Num a) => @Num (f a)
> applicativeNum = @Num {
>                     (@+) = liftA2 (+),
>                     (@*) = liftA2 (*),
>                     ...
>                   }
> -- when you actually want to declare an instance
> instance Num (MyApplicative Double) = applicativeNum

Generalized Newtype Deriving

There is an intersection between Instance Datatypes and Generalized Newtype Deriving (GND). When it works, GND requires no additional code on the class designer’s part, and requires only a “deriving” clause on the class instantiator’s part. To support something like GND using ID, something like the following code would be required:

> -- in some library:
> type Converter a b = (a -> b, b -> a)
> -- in the class designer's library:
> class C a where ...
> liftC :: C a => Converter a b -> @C b
> liftC = ...
> -- in the class instantiator's library:
> newtype MyType = Wrap { unwrap :: SomeOtherType }
> instance C MyType = liftC (Wrap,unwrap)

So, both the class designer, and the class instantiator would need to write more code than in GND: the class designer actually has to implement the lifter, and the class user has to write slightly more.

However, ID also has a few advantages over GND:

  • more flexibility is afforded in the converter. For example, the type need not actually be a /newtype/; it could be convertible by some other mechanism. For example, we could use a Converter (Complex a) (a,a).
  • more flexibility is afforded in the lifting. For example, the library designer can still make a lifter even if “the eta-reduction property does not hold”, which would cause GHC to give up.
  • It’s not broken, unlike GND.

Default class methods

ID would partly obsolete default class methods. Instead of providing default class methods, the class designer could just provide a function which creates the instance value given the minimal complete definition. For example:

> class Eq a where ...
> fromEquality :: (a -> a -> Bool) -> @Eq a
> fromEquality (==) = ...

This has the advantage of moving the “minimal complete definition” statement from documentation to being patently obvious, and compiler-checkable.3

Extra sharing

In some cases, polymorphic instances can cause an unavoidable loss of sharing in Haskell98. For example, if we have4:

> {-# LANGUAGE UndecidableInstances #-}
> class C a where
>     f :: a
>     g :: a
> h :: Num a => a
> h = 5^100 -- an "expensive" computatio
> instance Num a => C a where
>     f = ... h ...
>     g = ... h ...

Since h is polymorphic, its result will not be cached, and f and g will not share the result of h. I know of no way to allow f and g to share the computation of h if the instance of C is to be polymorphic.

However, with instance datatypes, we might write

> mkC :: Num a => @C a
> mkC = let h = 5^100 in @C { @f = ... h ... , @g = ... h ... }
> instance Num a => C a = mkC

In this case, the h in the let expression is monomorphic, so f and g should in fact share the computation. Although the desugaring given earlier will lose this sharing, a direct translation into GHC Core should keep it.


Special scoping rules

For this extension to be useful, we need to have access to the Prelude typeclasses, so we can build, for example, @Num values. However, the Prelude of course doesn’t (explicitly) export the @Num datatype. So, I propose the following rule to govern when the @Num datatype and its internals are in scope:

  • @v is in scope if and only v is in scope. Furthermore, @v may not be mentioned in any import or export lists.

Collisions with current identifier names

If v is a textual identifier (rather than a symbol), then @v is currently not a valid identifier. However, if v is symbol, then @v /is/ a valid identifier, and may collide with an existing identifier.

Associated types

If a class has associated datatypes or type synonyms, they can’t be declared in the datatype. As far as I can see, they must be declared at the actual instance declaration, so the syntax should perhaps be extended to

> instance cxt => C ty1 ... tyn = expr where
>     type TyFam1 ty1 ... tyn = tty1
>     ...
>     type TyFamk ty1 ... tyn = ttyk

At this point, the two instance declaration forms look very similar, and one wonders if they should in fact be merged. I believe not, because this could potentially make unclear how a class member is being defined: is it defined (a) by the declaration in the where, (b) by the @C data structure of the =expr, or (c) by the default class method?

  1. You already can do this, using the CPP macros of applicative-numbers. However, while practical, it is not a particularly satisfying approach.
  2. Note that this desugaring may lose an opportunity for sharing. See the “Extra sharing” section for details.
  3. For an admittedly weak example, consider that if you write
    > instance Eq Foo where
    >     -- no definitions

    then GHC will not warn you that any method definitions are missing: (==) and (/=) are defined in terms of each other, so the compiler cannot tell that anything is actually missing.

  4. Note that the UndecidableInstances extension is far from necessary to construct examples of this sort. However, it is useful for giving a simple example.

The Rubric Theme Blog at


Get every new post delivered to your Inbox.