Chapter 9
Uniqueness Typ
ing

9.1

Basic Ideas behind Uniqueness Typing

9.2

Attribute Propagation

9.3

Defining New Types with Uniqueness Attributes

9.4

Uniqueness and Sharing

9.5

Combining Uniqueness Typing and Overloading

9.6

Higher-Order Type Definitions

9.7

Destructive Updates using Uniqueness Typing

 

Although Clean is purely functional, operations with side-effects (I/O operations, for instance) are permitted. To achieve this without violating the semantics, the classical types are supplied with so cal­led uniqueness attributes. If an argument of a function is indicated as unique, it is guaranteed that at run-time the corresponding actual object is local, i.e. there are no other references to it. Clearly, a de­structive update of such a “unique object” can be performed safely.

The uniqueness type system makes it possible to define di­rect interfaces with an operating sys­tem, a file system (updating per­sis­tent data), with GUI’s libra­ries, it allows to create ar­rays, records or user defined data struc­tures that can be updated de­structively. The time and space behavior of a functional pro­gram therefore greatly benefits from the unique­ness typ­ing.

Uniqueness types are deduced automatically. Type attributes are poly­mor­p­hic: attribute variables and inequalities on these variables can be used to indicate relations between and restrictions on the corre­s­ponding concrete attribute values.

Sometimes the inferred ty­pe attributes give some extra information on the run-time behavior of a function. The uniqueness type system is a transparent extension of classical typing that means that if one is not interested in the uniqueness information one can simply ignore it.

Since the uniqueness typing is a rather complex matter we ex­plain this type system and the mo­tivation behind it in more detail. The first Section (9.1) explains the basic motivation for and ideas behind uni­queness typing. Section 9.2 focuses on the so-called uniqueness propagation property of (algebraic) type constructors. Then we show how new data structures can be defined containing unique objects (Section 9.3). Sharing may destroy locality properties of objects. In Section 9.4 we describe the ef­fect of sharing on uniqueness types. In order to maintain referential transparency, it appears that func­tion ty­pes have to be treated specially. The last Section (9.5) describes the combination of unique­ness typing and overloading. Especially, the subsections on constructor classes and higher-oder type defini­tions are very complex: we suggest that the reader skips these sections at first instance.

9.1    Basic Ideas behind Uniqueness Typing

The uniqueness typing is an extension of classical Milner/Mycroft typing. In the unique­ness type sys­tem u­niqueness type attributes  are attached to the classical types. Uniqueness type at­tributes appear in the type speci­fi­cations of func­tions (see 9.4) but are also permitted in the definitions of new data types (see 9.3). A classical type can be prefixed by one of the follo­w­ing unique­ness type at­tributes:

 

Type

=

{BrackType}+

 

BrackType

=

[UniversalQuantVariables] [Strict] [UnqTypeAttrib] SimpleType

 

UnqTypeAttrib

=

*

// type attribute “unique”

 

|

UniqueTypeVariable:

// a type attribute variable

 

|

.

// an anonymous type attribute variable

The basic idea behind uniqueness typing is the following. Suppose a function, say F, has a unique ar­gument (an argument with type *s, for some s). This attribute imposes an ad­ditional re­striction on applications of F.

It is guaranteed that F will have private (“unique”) access to this particular argument (see Barendsen and Smet­sers, 1993; Plasmeijer and Van Eeke­len, 1993): the object will have a refer­ence count of 11 at the moment it is in­spected by the func­tion. It is important to know that there can be more than 1 reference to the ob­ject before this specific ac­cess takes place. If a uniquely typed argument is not used to construct the function re­sult it will be­come garbage (the reference has dropped to zero). Due to the fact that this analysis is performed statically the object can be garbage collected (see Chapter 1) at compile-time. It is harmless to reuse the space occupied by the argu­ment to create the func­tion re­sult. In other words: it is allowed to up­date the unique ob­ject de­struc­tively without any conse­quences for referential transparency.

 

Example: the I/O library function fwritec is used to write a character to a file yielding a new file as result. In gen­eral it is se­mantically not allowed to overwrite the argument file with the given character to construct the re­sult­ing file. Howe­ver, by demanding the argument file to be unique by specifying

 

fwritec:: Char *File -> *File

It is guaranteed by the type system that fwritec has private access to the file such that overwriting the file can be done without violating the functional semantics of the program. The result­ing file is unique as well and can therefore be passed as continuation to another call of e.g. fwritec to make further writ­ing possi­ble.

 

WriteABC:: *File -> *File

WriteABC file = fwritec 'c' (fwritec 'b' (fwritec 'a' file))

Observe that a unique file is passed in a single threaded way (as a kind of unique token) from one func­tion to another where each function can safely modify the file knowing that is has private ac­cess to that file.

 

One can make these intermediate files more vissible by by writing the WriteABC as follows.

 

WriteABC file = file3

where

    file1 = fwritec 'a' file

    file2 = fwritec 'b' file1

    file3 = fwritec 'c' file2

 

or, alternatively (to avoid the explicit numbering of the files),

 

WriteABC file

    #   file = fwritec 'a' file

        file = fwritec 'b' file

    =   fwritec 'c' file

The type system makes it possible to make no distinction between a Clean file and a physical file of the real world: file I/O can be treated as efficiently as in imperative lan­guages. The uniqueness typing prevents writing while other readers/writers are active. E.g. one can­not ap­ply fwritec to a file being used elsewhere.

 

For instance, the following expression is not approved by the type system:

 

(file, fwritec 'a' file)

Function arguments with no uniqueness attributes added to the classical type are considered as “non-unique”: there are no reference requirements for these arguments. The function is only al­lowed to have read access (as usual in a func­tional lan­guage) even if in some of the function appli­cations the actual ar­gu­ment ap­pears to have reference count 1.

freadc:: File -> (Char, File)

The function freadc can be applied to both a unique as well as non-unique file. This is fine since the function only wants read ac­cess on the file. The type indicates that the result is always a non-unique file. Such a file can be passed for further reading, but not for further writing.

To indicate that functions don’t change uniqueness properties of arguments, one can use attribute variables.

 

The simplest example is the identity functions that can be typed as follows:

 

id:: u:a -> u:a

id x = x

 

Here a is an ordinary type variable, whereas u is an attribute variable. If id is applied to an unique object the result is also unique (in that case u is instantiated with the concrete attribute *). Of course, if id is applied to a non-unique object, the result remains non-unique. As with ordinary type variables, attribute variables should be instantiated uniformly.

 

A more interesting example is the function freadc that is typed as

 

freadc:: u:File -> u:(Char, u:File)

 

Again freadc can be applied to both unique and non-unique files. In the first case the resulting file is also unique and can, for example, be used for further reading or writing. Moreover, observe that not only the resulting file is attributed, but also the tuple containing that file and the charac­ter that has been read. This is due to the so called uniqueness propagation rule; see below.

To summarize, uniqueness typing makes it possible to update objects destructively within a purely functional lan­guage. For the development of real world applications (which manipulate files, windows, arrays, data­bases, states etc.) this is an indispensable property.

9.2    Attribute Propagation

Having explained the general ideas of uniqueness typing, we can now focus on some details of this ty­ping system.

If a unique object is stored in a data structure, the data structure itself becomes unique as well. This uniqueness propagation rule prevents that unique objects are shared indirectly via the data structure in which these objects are stored. To explain this form of hidden sharing, consider the following defini­tion of the function head

 

head:: [*a] -> *a

head [hd:tl] = hd

The pattern causes head to have access to the “deeper” arguments hd and tl. Note that head does not have any uniqueness requirements on its direct list argument. This means that in an application of head the list might be shared, as can be seen in the following function heads

 

heads list = (head list, head list)

If one wants to formulate uniqueness requirements on, for instance, the hd argument of head, it is not sufficient to attribute the corresponding type variable a with *; the surrounding list itself should also be­come unique. One can easily see that, without this additional requirement the heads example with type

 

heads:: [*a] -> (*a,*a)

heads list = (head list, head list)

would still be valid although it delivers the same object twice. By demanding that the surrounding list becomes unique as well, (so the type of head becomes head:: *[*a] -> *a) the function heads is re­jected. In general one could say that uniqueness propa­gates outwards.

Some of the readers will have noticed that, by using attribute variables, one can assign a more general uniqueness type to head:

head:: u:[u:a] -> u:a

The above propagation rule imposes additional (implicit) restrictions on the attributes appearing in type specifications of functions.

Another explicit way of indicating restrictions on attributes is by using coercion statements. These state­ments consist of attribute variable inequalities of the form u <= v. The idea is that attribute substituti­ons are only allowed if the resulting attribute inequalities are valid, i.e. not resulting in an equality of the form

‘non-unique ≤ unique’.

The use of coercion statements is illustrated by the next example in which the uniqueness type of the well-known append function is shown.

 

append:: v:[u:a] w:[u:a] -> x:[u:a],     [v<=u, w<=u, x<=u,w<=x]

The first three coercion statements express the uniqueness propagation for lists: if the elements a are unique (by choosing * for u) these statements force v,w and x to be instantiated with * also. (Note that u <= * iff u = *.) The statement w <= x expresses that the spine uniqueness of append’s result depends only on the spine attribute w of the second argument.

In Clean it is permitted to omit attribute variables and attribute inequalities that arise from propaga­tion properties; these will be added automatically by the type system. As a consequence, the following type for append is also valid.

 

append:: [u:a] w:[u:a] -> x:[u:a],       [w<=x]

Of course, it is always allowed to specify a more specific type (by instantiating type or attribute vari­a­bles). All types given below are valid types for append.

 

append:: [u:a] x:[u:a] -> x:[u:a],

append:: *[*Int] *[*Int] -> *[*Int],

append:: [a] *[a] -> *[a].

To make types more readable, Clean offers the possibility to use anonymous attribute variables. These can be used as a shorthand for indicating attribute variables of which the actual names are not essential. This allows us to specify the type for append as follows.

 

append:: [.a] w:[.a] -> x:[.a],      [w<=x]

The type system of Clean will substitute real attribute variables for the anonymous ones. Each dot gives rise to a new attribute variable except for the dots attached to type variables: type variables are at­tribu­ted uniformly in the sense that all occurrences of the same type variable will obtain the same at­tribute. In the above example this means that all dots are replaced by one and the same new attribute variable.

9.3    Defining New Types with Uniqueness Attributes

Although one mostly uses uniqueness attributes in type specifications of functions, they are also al­lo­wed in the definition of new data types.

 

AlgebraicTypeDef

=

::TypeLhs

= ConstructorDef

 

 

 

{| ConstructorDef} ;

ConstructorDef

=

 [ExistentalQuantVariables] ConstructorName {BrackType}

 

|

[ExistentalQuantVariables] (ConstructorName) [Fix][Prec] {BrackType}

 

TypeLhs

=

[*]TypeConstructor {TypeVariable}

TypeConstructor

=

TypeName

 

ExistentalQuantVariables

=

E.{TypeVariable }+:

UniversalQuantVariables

=

A.{TypeVariable }+:

 

BrackType

=

[UniversalQuantVariables] [Strict] [UnqTypeAttrib] SimpleType

UnqTypeAttrib

=

*

 

|

UniqueTypeVariable:

 

|

.

As can be inferred from the syntax, the attributes that are actually allowed in data type definitions are ‘*’ and ‘.’; attribute variables are not permitted. The (unique) * attribute can be used at any subtype whereas the (anonymous). attribute is restricted to non-variable positions.

If no uniqueness attributes are specified, this does not mean that one can only build non-unique in­stances of such a data type. Attributes not explicitly specified by the programmer are added automati­cally by the type system. To explain this standard uniqueness attribution mechanism, first remember that the types of data constructors are not specified by the programmer but derived from their corre­s­ponding data type definition.

 

For example, the (classical) definition of the List type

 

:: List a = Cons a (List a) | Nil

 

leads to the following types for its data constructors:

 

Cons:: a (List a) -> List a

Nil:: List a

To be able to create unique instances of data types, the standard attribution of Clean will automati­cally derive ap­propriate uniqueness variants for the types of the corresponding data constructors. Such a uniqueness variant is obtained via a consistent attribution of all types and subtypes appearing in a data type defini­tion. Here, consistency means that such an attribution obeys the following rules (assume that we have a type definition for some type T).

1)       Attributes that are explicitly specified are adopted.

2)       Each (unattributed) type variable and each occurrence of T will receive an attribute variable. This is done in a uniform way: equal type variables will receive equal attributes, and all occurrence of T are also equally attributed.

3)       Attribute variables are added at non-variable positions if they are required by the propagation pro­perties of the corresponding type constructor. The attribute variable that is chosen depends on the argument types of this constructor: the attribution scheme takes the attribute variable of first ar­gument appearing on a propagating position (see example below).

4)       All occurrences of the. attribute are replaced by the attribute variable assigned to the occurrences of T.

 

Example of standard attribution for data constructors. For Cons the standard attribution leads to the type

 

Cons:: u:a v:(List u:a) -> v:List u:a, [v<=u]

 

The type of Nil becomes

 

Nil:: v:List u:a, [v<=u]

 

Consider the following Tree definition

 

:: Tree a     =    Node a [Tree a]

 

The type of the data constructor Node is

 

Node:: u:a v:[v:Tree u:a] -> v:Tree u:a, [v<=u]

 

Another Tree variant.

 

:: Tree *a    =    Node *a [Tree *a]

 

leading to

 

Node:: *a *[*Tree *a] -> *Tree *a

 

Note that, due to propagation, all subtypes have become unique.

Next, we will formalize the notion of uniqueness propagation. We say that an argument of a type con­structor, say T, is propagating if the corresponding type variable appears on a propagating position in one of the types used in the right-hand side of T’s definition. A propagating position is characterized by the fact that it is not surrounded by an arrow type or by a type constructor with non-propaga­t­ing arguments. Observe that the definition of propagation is cyclic: a general way to solve this problem is via a fixed-point construction.

 

Example of the propagation rule. Consider the (record) type definition for Object.

 

Object a b::  {state:: a, fun:: b -> a}

 

The argument a is propagating. Since b does not appear on a propagating position inside this definition, Object is not propagating in its second argument.

9.4    Uniqueness and Sharing

The type inference system of Clean will derive uniqueness information after the classical Mil­ner/Mycroft ty­pes of func­tions have been inferred (see 4.3). As explained in Section 9.1, a function may require a non-unique object, a unique object or a possi­bly unique ob­ject. Uniqueness of the result of a func­tion will de­pend on the attributes of its arguments and how the result is constructed. Until now, we distinguished objects with reference count 1 from objects with a larger reference count: only the former might be unique (depending on the uniqueness type of the object itself). In practice, how­ever, one can be more liberal if one takes the evaluation order into account. The idea is that mul­tiple refer­ence to an (unique) object are harmless if one knows that only one of the references will be present at the moment it is accessed destructively. This has been used in the following function.

 

AppendAorB:: *File -> *File

AppendAorB file

|   fc == 'a' = fwritec 'a' file

              = fwritec 'b' file

where

    (fc,nf)   = freadc file

When the right-hand side of AppendAorB is evaluated, the guard is determined first (so access from freadc to file is not unique), and subsequently one of the alternatives is chosen and evaluated. De­pending on cond, either the reference from the first fwritec application to function file or that of the second application is left and therefore unique.

For this reason, the uniqueness type system uses a kind of shar­ing analysis. This sharing analy­sis is in­put for the uniqueness type system itself to check uniqueness type consis­tency (see 9.3). The analysis will label each refer­ence in the right-hand side of a function definition as read-only (if destructive access might be dangerous) or write-permitted (otherwise). Objects accessed via a read-only reference are al­ways non-unique. On the other hand, uniqueness of objects accessed via a reference labeled with write-permitted solely depends on the types of the objects themselves.

Before describing the labeling mechanism of Clean we mention that the “lifetime” of references is de­termined on a syntactical basis. For this reason we classify references to the same expression in a func­tion definition (say for f) according to their estimated run-time use, as alternative, observing and paral­lel.

•         Two references are alternative if they belong to different alternatives of f. Note that alternatives are distinguished by patterns (including case expressions) or by guards.

•         A reference r is observing w.r.t. a reference r’ if the expression containing r’ is either (1) guarded by an expression or (2) preceded by a strict let before expression containing r.

•         Otherwise, references are in parallel.

The rules used by the sharing analysis to label each reference are the following.

•         A reference, say r, to a certain object is labeled with read-only if there exist another reference, say r’, to the same object such that either r is observing w.r.t r’ or r and r’ are in parallel.

•         Multiple references to cyclic structures are always labeled as read-only.

•         All other references are labeled with write-permitted.

Unfortunately, there is still a subtlety that has to be dealt with. Observing references belonging in a strict context do not always vanish totally after the expression containing the reference has been evalu­ated: further analysis appears to be necessary to ensure their disappearance. More concretely, suppose e[r] denotes the expression containing r. If the type of e[r] is a basic type then, after evaluation, e[r] will be reference-free. In particular, it does not contain the reference r any­more. However, If the type of e[r] is not a basic type it is assumed that, after evaluation, e[r] might still refer to r. But even in the latter case a further refinement is possible. The idea is, depending on e[r], to correct the type of the object to which r refers partially in such way that only the parts of this object that are still shared lose their uniqueness.

 

Consider, for example, the following rule

 

f l =

#! x = hd (hd l)

= (x, l)

 

Clearly, x and l share a common substructure; x is even part of l. But the whole “spine” of l (of type [[...]]) does not contain any new external references. Thus, if l was spine-unique originally, it re­mains spine unique in the result of f. Apparently, the access to l only affected part of l’s structure. More technically, the type of l itself is corrected to take the partial access on l into account. In the pre­vious example, x, regarded as a function on l has type [[a]] -> a. In f’s definition the part of l’s type corresponding to the variable a is made non-unique. This is clearly reflected in the derived type for f, being

 

f:: u:[w:[a]] -> (a,v:[x:[a]]), [w <= x, u <= v]

In Clean this principle has been generalized: If the strict let expression e[r] regarded as a function on r has type

T (... a...) -> a

Then the a-part of the type of the object to which r refers becomes non-unique; the rest of the type re­mains unaffected. If the type of e[r] is not of the indicated form, r is not considered as an observing reference (w.r.t. some reference r’), but, instead, as in parallel with r’.

9.4.1 Higher Order Uniqueness Typing

Higher-order functions give rise to partial (often called Curried) applications, i.e. applications in which the actual number of arguments is less than the arity of the corresponding symbol. If these partial ap­plications contain unique sub-expressions one has to be careful.

 

Consider, for example the following the function fwritec with type

 

fwritec:: *File Char -> *File

 

in the application

 

fwritec unifile

 

(assuming that unifile returns a unique file). Clearly, the type of this application is of the form o:(Char -> *File). The question is: what kind of attribute is o? Is it a variable, is it *, or, is it not uni­que? Before making a decision, one should notice that it is dangerous to allow the above application to be shared. For example, if the expression fwritec unifile is passed to a function

 

WriteAB write_fun = (write_fun 'a', write_fun 'b')

Then the argument of fwritec is no longer unique at the moment one of the two write operations take place. Apparently, the fwritec unifile expression is essentially unique: its reference count should never become greater than 1. To prevent such an essentially unique expression from being co­pied, Clean considers the -> type constructor in combination with the * attribute as special: it is not permitted to discard its uniqueness. Now, the question about the attribute o can be answered: it is set to *. If WriteAB is typed as follows

 

WriteAB:: (Char -> u:File) -> (u:File, u:File)

WriteAB write_fun = (write_fun 'a', write_fun 'b')

the expression WriteAB (fwritec unifile) is rejected by the type system because it does not allow the ar­gument of type *(Char -> *File) to be coerced to (Char -> u:File). One can easily see that it is im­possi­ble to type WriteAB in such a way that the expression becomes typable.

To define data structures containing Curried applications it is often convenient to use the (anonymous). attribute. Example

:: Object a b = { state:: a, fun::.(b -> a) }

new:: * Object *File Char

new = { state = unifile, fun = fwritec unifile }

By adding an attribute variable to the function type in the definition of Object, it is possible to store unique functions in this data structure. This is shown by the function new. Since new contains an es­sen­tially unique expression it becomes essentially unique itself. So, new can never lose its uniqueness, and hence, it can only be used in a context in which a unique object is demanded.

Determining the type of a Curried application of a function (or data constructor) is somewhat more involved if the type of that function contains attribute variables instead of concrete attributes. Mostly, these variables will result in additional coercion statements. as can be seen in the example below.

 

Prepend:: u:[.a] [.a] -> v:[.a],     [u<=v]

Prepend a b = Append b a

 

PrependList:: u:[.a] -> w:([.a] -> v:[.a]),       [u<=v, w<=u]

PrependList a = Prepend a

Some explanation is in place. The expression (PrependList some_list) yields a function that, when applied to another list, say other_list, delivers a new list extended consisting of the concatenation of other_list and some_list. Let’s call this final result new_list. If new_list should be unique (i.e. v be­comes *) then, because of the coercion statement u<=v the attribute u also becomes *. But, if u = * then also w = *, for, w<=u. This implies that (arrow) type of the original expression (PrependList some_list) becomes unique, and hence this expression cannot be shared.

9.4.2 Uniqueness Type Coercions

As said before, offering a unique object to a function that requires a non-unique argument is safe (unless we are dealing with unique arrow types; see above). The technical tool to express this is via a co­ercion (subtype) relation based on the ordering

‘unique’ ≤ ‘non-unique’

on attributes. Roughly, the validity of ss’ depends subtype-wise on the validity of uu’ with u,u’ at­tributes in s,s’. One has, for example

u:[v:[w:Int]]u’:[v’:[w’:Int]] iff uu’, vv’, ww’.

However, a few refinements are necessary. Firstly, the uniqueness constraints expressed in terms of co­ercion statements (on attribute variables) have to be taken into account. Secondly, the coercion re­stric­tion on arrow types should be handled correctly. And thirdly, due to the so-called contravariance of -> in its first argument we have that

u:(s -> s’) ≤ u:(t -> t’) iff ts, s’ ≤ t

Since -> may appear in the definitions of algebraic type constructors, these constructors may inherit the co- and contravariant subtyping behavior with respect to their arguments. We can classify the ‘sign’ of the arguments of each type constructor as + (positive, covariant), - (negative, contravariant) or top (both positive and negative). In general this is done by analysing the (possible mutually recursive) alge­braic type definitions by a fixed-point construction, with basis sign(->) = (-,+).

 

Example: a has sign T, b has sign + in

 

::FunList a b = FunCons (a, a -> b) (FunList a b)

              | FunNil

This leads to the following coercion rules

5)       Attributes of two corresponding type variables as well as of two corresponding arrow types must be equal.

6)       The sign classification of each type constructor is obeyed. If, for instance, the sign of T’s argument is negative, then

T sT s’ iff s’ ≤ s

7)       In all other cases, the validity of a coercion relation depends on the validity of uu’, where u,u’ are attributes of the two corresponding subtypes.

The presence of sharing inherently causes a (possibly unique) object to become non-unique, if it is ac­cessed via a read-only reference. In Clean this is achieved by a type correction operation that con­verts each unique type S to its smallest non-unique supertype, simply by making the outermost at­tribute of S non-unique. Note that this operation fails if S is a function type.

9.5    Combining Uniqueness Typing and Overloading

An overloaded function actually stands for a collection of real functions. The types of these real functi­ons are obtained from the type of the overloaded function by substituting the corresponding instance type for the class variable. These instance types may contain uniqueness information, and, due to the propagation requirement, the above-mentioned substitution might give rise to uniqueness attributes overloaded type specification.

Consider, for instance, the identity class

 

class id a:: a -> a

If we want to define an instance of id for lists, say id L, which leaves uniqueness of the list elements in­tact, the (fully expanded) type of idL becomes

 

instance id L v:[u:a] -> v:[u:a]

However, as said before, the type specification of such an instance is not specified completely: it is deri­ved from the overloaded type in combination with the instance type (i.e. [...] in this particular ex­am­ple).

In Clean we require that the type specification of an overloaded operator anticipates on attributes ari­sing from uniqueness propagation, that is, the uniqueness attribute of the class variable should be cho­sen in such a way that for any instance type this ‘class attribute’ does not conflict with the correspon­d­ing uniqueness attribute(s) in the fully expanded type of this instance. In the above example this means that the type of id becomes

 

class id a:: a -> a

Another possibility is

 

class id a:: *a -> *a

However, the latter version of id will be more restrictive in its use, since it will always require that its argument is unique.

9.5.1 Constructor Classes

The combination of uniqueness typing and constructor classes (with their higher-order class variables) introduces another difficulty. Consider, for example, the overloaded map function.

 

class map m:: (a -> b) (m a) -> m b

Suppose we would add (distinct) attribute variables to the type variables a and b (to allow ‘unique in­stances’ of map)

 

class map m:: (.a ->.b) (m.a) -> m.b

The question that arises is: Which attributes should be added to the two applications of the class va­ri­able m? Clearly, this depends on the actual instance type filled in for m. E.g., if m is instantiated with a propagating type constructor (like []), the attributes of the applications of m are either attribute vari­a­bles or the concrete attribute ‘unique’. Otherwise, one can choose anything.

 

Example

 

instance map []

where

    map f l = [ f x // x <- l ]

 

 

::  T a = C (Int -> a)

 

instance map T

where

    map f (C g) = C (f o g)

 

In this example, the respective expanded types of the instances are

 

map:: (u:a -> v:b) w:[u:a] -> x:[v:b], w <= u, x <= v

map:: (u:a -> v:b) (T u:a) -> T v:b

The type system of Clean requires that a possible propagation attribute is explicitly indicated in the type specification of the overloaded function. In order to obtain versions of map producing spine uni­que data structures, its overloaded type should be specified as follows:

 

class map m:: (.a ->.b).(m.a) ->.(m.b)

This type will provide that for an application like

 

map inc [1,2,3]

indeed yields a spine unique list.

Observe that if you would omit the (anonymous) attribute variable of the second argument, the input data structure cannot contain unique data on propagating positions, e.g. one could not use such a ver­sion of map for mapping a destructive write operator on a list of unique files.

In fact, the propagation rule is used to translate uniqueness properties of objects into uniqueness pro­p­erties of the data structures in which these objects are stored. As said before, in some cases the actual data structures are unknown.

Consider the following function

 

DoubleMap f l = (map f l, map f l)

The type of this function is something like

 

DoubleMap:: (.a ->.b) (m.a) -> (.(m.b),.(m.b))

Clearly, l is duplicated. However, this does not necessarily mean that a cannot be unique anymore. If, for instance, m is instantiated with a non-propagating type constructor (like T as defined on the pre­vi­ous page) then uniqueness of a is still permitted. On the other hand, if m is instantiated with a propa­gat­ing type constructor, a unique instantiation of a should be disapproved. In Clean, the type system ‘remembers’ sharing of objects (like l in the above example) by making the corresponding type at­tri­bute non-unique. Thus, the given type for DoubleMap is exactly the type inferred by Clean’s type sys­tem. If one tries to instantiate m with a propagating type constructor, and, at the same type, a with some unique type, this will fail.

The presence of higher-order class variables, not only influences propagation properties of types, but also the coercion relation between types. These type coercions depend on the sign classification of type constructors. The problem with higher-order polymorphism is that in some cases the actual type con­structors substituted for the higher order type variables are unknown, and therefore one cannot decide whether coercions in which higher-order type variable are involved, are valid.

Consider the functions

 

double x = (x,x)

dm f l = double (map f l)

Here, map’s result (of type.(m.a)) is coerced to the non-unique supertype (m.a). However, this is only allowed if m is instantiated with type constructors that have no coercion restrictions. E.g., if one tries to substitute *WriteFun for m, where

 

WriteFun a = C.(a -> *File)

this should fail, for, *WriteFun is essentially unique. The to solve this problem is to restrict coercion properties of type variable applications (m s) to

u:(m s) ≤ u:(m t) iff st && ts

A slightly modified version of this solution has been adopted in Clean. For convenience, we have added the following refinement. The instances of type constructors classes are restricted to type con­structors with no coercion restrictions. Moreover, it is assumed that these type constructors are uniqueness pro­pagating. This means that the WriteFun cannot be used as an instance for map. Consequently, our coercion relation we can be more liberal if it involves such class variable applica­tions.

Overruling this requirement can be done adding the anonymous attribute. the class variable. E.g.

 

class map.m:: (.a ->.b).(m.a) ->.(m.b)

Now

 

instance map WriteFun

where

    map...

is valid, but the coercions in which (parts of) map’s type are involved are now restricted as explained above.

To see the difference between the two indicated variants of constructor variables, we slightly modify map’s type.

 

class map m:: (.a ->.b) *(m.a) ->.(m.b)

Without overruling the instance requirement for m the type of dm (dm as given on the previous page) be­comes.

 

dm:: (.a ->.b) *(m.a) ->.(m b, m b)

Observe that the attribute of disappeared due to the fact that each type constructor substituted for m is assumed to be propagating.

If one explicitly indicates that there are no instance restriction for the class variable m (by attributing m with.), the function dm becomes untypable.

9.6    Higher-Order Type Definitions

We will describe the effect of uniqueness typing on type definitions containing higher-order type va­ri­ables. At it turns out, this combination introduces a number of difficulties which would make a full description very complex. But even after skipping a lot of details we have to warn the reader that some of the remaining parts are still hard to understand.

As mentioned earlier, two properties of newly defined type constructor concerning uniqueness typing are important, namely, propagation and sign classification. One can probably image that, when dealing with higher-order types the determination on these properties becomes more involved. Consider, for example, the following type definition.

 

::  T m a = C (m a)

The question whether T is propagating in its second argument cannot be decided by examining this definition only; it depends on the actual instantiation of the (higher-order) type variable m. If m is in­stantiated with a propagating type constructor, like [], then T becomes propagating in its second ar­gu­ment as well. Actually, propagation is not only a property of type constructors, but also of types them­selves, particularly of ‘partial types’ For example, the partial type [] is propagating in its (only) argu­ment (Note that the number of arguments a partial type expects, directly follows from the kinds of the type constructors that have been used). The type T [] is also propagating in its argument, so is the type T ((,) Int)).

The analysis in Clean that determines propagation properties of (partial) types has been split into two phases. During the first phase, new type definitions are examined in order to determine the propaga­tion dependencies between the arguments of each new type constructor. To explain the idea, we return to our previous example.

 

::  T m a = C (m a)

First observe that the propagation of the type variable m is not interesting because m does not stand for ‘real data’ (which is always of kind *). We associate the propagation of m in T with the position(s) of the occurrence(s) of m’s applications. So in general, T is propagating in a higher-order variable m if one of m’s applications appears on a propagating position in the definition of T. Moreover, for each higher order type variable, we determine the propagation properties of all first order type variables in the fol­lowing way: m is propagating in a, where m and a are higher-order respectively first-order type variables of T, if a appears on a propagating position in one of m’s applications. In the above example, m is pro­p­agating in a, since a is on a propagating position in the application (m a). During the second phase, the propaga­tion properties of (partial) types are determined using the results of the first phase. This (roughly) pro­ceeds as follows. Consider the type T s for some (partial) type s, and T as defined earlier. First, deter­mine (recursively) the propagation of s. Then the type T s is propagating if (1) s is propa­gating, (2) T is propagating in m, and moreover (3) m is propagating in a (the second argument of the type construc­tor). With T as defined above, (2) and (3) are fulfilled. Thus, for example T [] is propa­gating and there­fore also T (T []). Now define

 

::  T2 a = C2 (a -> Int)

Then T T2 is not propagating.

The adjusted uniqueness propagation rule (see also...) becomes:

•         Let s,t be two uniqueness types. Suppose s has attribute u. Then, if t is propagating the applica­tion (t s) should have an attribute v such that vu.

Some of the readers might have inferred that this propagation rule is a ‘higher-order’ generalization of the old ‘first-order’ propagation rule.

As to the sign classification, we restrict ourselves to the remark that that sign analysis used in Clean is adjusted in a similar way as described above for the propagation case.

Example

 

::  T m a = C ((m a) -> Int)

The sign classification of T if (-,^). Here ^ denotes the fact the a is neither directly used on a positive nor on a negative position. The sign classification of m w.r.t. a is +. The partial type T [] has sign -, which e.g. implies that

T [] Int ≤ T [] *Int

The type T T2 (with T2 as defined on the previous page) has sign +, so

T T2 Int ≥ T T2 *Int

It will be clear that combining uniqueness typing with higher-order types is far from trivial: the de­scription given above is complex and moreover incomplete. However explaining all the details of this combination is far beyond the scope of the reference manual.

9.7    Destructive Updates using Uniqueness Typing

So, it is allowed to update a uniquely typed function argument (*) destructively when the ar­gument does not reappear in the function result. The question is: when does the compiler indeed make use of this possibility.

Destructive updates takes place in some predefined functions and operators which work on predefined data structures such arrays (&-operator) and files (writing to a file). Arrays and files are intended to be updated destructively and their use can have a big influence on the space and time behavior of your application (a new node does not have to be clai­med and fil­led, the garbage collec­tor is invoked less often and the locality of memory references is in­creased).

Performing destructive updates is only sensible when informa­tion is stored in nodes. Arguments of basic type (Int, Real, Char or Bool) are stored on the B-stack or in registers and it therefore does not make sense to make them unique.

The Clean compiler also has an option to re-use user-defined unique data structures: the space being occupied by a function argument of unique type will under certain conditions be reused de­struc­tively to con­struct the function result when (part of) this result is of the same type. So, a more space and time efficient program can be obtained by turning heavily used data structures into unique data structures. This is not just a matter of changing the uniqueness type attributes (like turning a lazy data structure into a strict one). A unique data structure also has to be used in a “single threaded” way (see Chapter 4). This means that one might have to restructure parts of the program to maintain the unicity of objects.

The compiler will do compile-time garbage collection for user defined unique data-structures only in certain cases. In that case run-time garbage collection time is reduced. It might even drop to zero. It is also possible that you gain even more than just garbage collection time due to better cache behavior.