Available online at www.sciencedirect.com

W ScienceDirect

ELSEVIER Electronic Notes in Theoretical Computer Science 229 (5) (2011) 39-56

www.elsevier.com/locate/entcs

Attribute Grammars as Recursion Schemes over Cyclic Representations of Zippers

Eric Badouel1

INRIA Rennes - Bretagne Atlantique, Campus universitaire de Beaulieu, F-35402 Rennes, France

Bernard Fotsing2 Rodrigue Tchougong3

IRISA, Université de Rennes 1, Campus universitaire de Beaulieu, F-35402 Rennes, France

Abstract

Evaluation of attributes w.r.t. an attribute grammar can be obtained by inductively computing a function expressing the dependencies of the synthesized attributes on inherited attributes. This higher-order functional approach to attribute grammars leads to a straightforward implementation using a higher-order lazy functional language like Haskell. The resulting evaluation functions are, however, not easily amenable to optimization rules. We present an alternative first-order functional interpretation of attribute grammars where the input tree is replaced with an extended cyclic tree each node of which is aware of its context viewed as an additional child tree. By the way, we demonstrate that these cyclic representations of zippers (trees with their context) are natural generalizations of doubly-linked lists to trees over an arbitrary signature.

Keywords: Attribute Grammar, Attribute Evaluation, Cyclic Data Structure, Zipper

1 Introduction

Attribute grammars [21,24] were introduced to make possible the manipulation of context-sensitive information, like the scope of a variable in a program. This formalism, used mainly in the context of language processing tools, can be used with two purposes: either to decorate an input tree with attributes (thus adding information locally at each node) or to define a syntax-directed computation in order to translate that input tree into some semantic domain. These two problems are related since the result of syntax-directed computation is usually given by the value

1 Email: ebadouel@irisa.fr

2 Email: bfotsing@irisa.fr

3 Email: rtchougo@irisa.fr

1571-0661/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.entcs.2011.02.015

of a specific attribute at the root node; which can be extracted once the decoration of the tree has been computed (even though one may not have to compute the whole decoration to obtain the required result). On the other hand the decorated tree can be given by a specific attribute, even though values of these attributes at different nodes can share whole subexpressions. In order to obtain efficient implementations different algorithmic solutions have often been put forward for these two situations. However if a lazy functional language like Haskell is used, one can adopt the same solution in both cases. Indeed, on the one hand, lazy evaluation avoids unecessary computations and, in the other hand, it allows sharing of subexpressions at different places.

The set of input trees is a regular set of trees given as the set of abtract-syntax trees of a context-free grammar or, if we are not interested in the concrete syntax but only on the abstract structure of trees, as the set of terms build from a multi-sorted signature. Values of attributes are defined with a set of recursive definitions given by the so-called semantic rules of the attribute grammar. If the attribute grammar is non circular (there are no cyclic dependencies between attributes) then one can can compute the value of each attribute using a topological sort of the dependency graph (whose arcs indicate the dependencies between attribute values). One can nevertheless use an order-algebraic approach based on least fixed-points [6,23] in order to compute attributes for potentially circular attribute grammars (and on potentially infinite input trees).

General attribute grammars use both synthesized and inherited attributes bearing information respectively from the subtree stemming from the given node and the context of that subtree. Attribute grammars with only synthesized attributes amount to primitive recursive schemes [9] and value of attributes are easily computed by structural recursion on trees. Things are more involved for general attribute grammars due to the manipulation of contextual information. However it was soon recognized that we can resort to attribute grammars with only synthesized attributes whose attributes are functions expressing the dependencies of the synthesized attributes on inherited attributes. Thus attribute grammars reduce to structural induction on trees at the price of using higher-order values. This higher-order functional approach to attribute grammars [19,11,2,22] leads to efficient implementations in a higher-order lazy functional language like Haskell. The Elegant system developped at Philips [1] and the UUAG system [25] from Utrecht university both illustrate this approach.

Unfortunately the resulting evaluation functions are not easily amenable to optimization techniques like short-cut fusion which are based on first-order representations of functions. We thus present an alternative first-order functional interpretation of attribute grammars where the input tree is replaced with an extended cyclic tree where each node is aware of its context viewed as an additional child tree. The price to pay is a preprocessing phase to unfold a tree into its extended cyclic version. By the way, we demonstrate that these cyclic representations of zippers (trees with their context [16]) are natural generalizations of doubly-linked lists to trees over an arbitrary signature. More precisely there are two natural ways of representing

lists in order to be able to navigate through them in both directions. Either we represent the list together with its context (zipper) for instance by using a pair of pushdown stacks: one for the current list itself and the other, in reverse order, for its context; or by using at each node a pointer to its preceding node (doubly-linked list). From a given multi-sorted signature £ we derive an extended signature Z^ whose corresponding trees are associated with £-trees or with their contexts. A zipper is introduced as a pair made of a tree and its context; thus generalizing the pair of stacks representation of lists to trees over an arbitrary signature. We also introduce a cyclic representation of zippers where each tree (respectively context) is aware of its context (resp. attached subtree) given as an extra argument; this gives rise to a new signature CZ^ generalizing the doubly-linked representation of lists. In both cases, we present a corresponding algorithm for attribute evaluation. The first one (related to the zipper representation) is similar to the solution presented by Uustalu and Vene [26] even though we do not make use of the underlying structure of comonad. The second algorithm (related to the cyclic representation of zipper) is new.

The rest of the paper is organized as follows. In section 2 we recall the basic definitions on attribute grammars and present a variant of the higher-order functional interpretation of attribute grammars. In section 3 we introduce the zippers, and the evaluation of attributes based on zipper representation. In section 4 we introduce the cyclic representation of zippers and the unfolding of a tree into its cyclic representation. In section 5 we introduce our first-order interpretation of attribute grammars based on the cyclic representation of trees.

2 Higher-order functional interpretation of an attribute grammar

In order to fix some notations, we first very briefly recall some mathematical definitions on multi-sorted signatures and their algebras (we assume the reader to be familiar with these notions, he may wish to consult [14,4] for a more detailed presentation); then we proceed to the definition of an attribute grammar and its associated interpretation. We conclude this section by introducing the notion of a rooted attribute grammar

2.1 Signature and algebra

Definition 2.1 A (multi-sorted) signature £ = (5, Op) consists of a finite set S of sorts, and a finite set Op of operators. Each operator op has an arity a(op) £ S* and a sort a (op) £ S. We let notation op : s1 x- ■ -x sn ^ s mean that op is an operator of arity a (op) = s1 ■ ■■ sn and sort a (op) = s. The rank of operator op is the length of its arity: p(op) = |a(op)|. If a (op) = e, op is said to be a constant of sort a(op).

As an example we consider the signature with only one sort Tree and whose

operators are as follows:

Fork : Tree x Tree ^ Tree

Leaf label : ^ Tree

Thus we have a set of constants indexed by a set of labels together with a binary operator. It corresponds to the following Haskell datatype definition.

data Tree a = Leaf a | Fork (Tree a) (Tree a)

Definition 2.2 Let E = (S, Op) be a signature, a E-algebra A consists of a domain of interpretation, a set As, for each sort s eS, and a function opA : AS1 x ■ ■ -xASn ^ As associated with each operator op : si x^ ■ ■x sn ^ s. A morphism of algebras f : A ^ B is a family of maps fs : As ^ Bs such that, for every ai e ASi

fs (opA(a1,...,an)) = 0PB (fsi (ai),...Js„ (an))

An algebra is said to be continuous when the domains of interpretations are complete partial orders and the interpretations of operators are continuous functions. We let T(E)s denote the set of £ terms of type s, and Tree(E)s the set of finite or infinite trees of sort s build upon the signature E together with their approximants. These sets are the respective carrier sets of the free E-algebra and the free continuous E-algebra. We identify terms with finite trees, and we interpret a tree as a partial map t : N* ^ E whose domain Dom(t) is a non-empty prefix-closed language such that for every u e Dom(t) with t(u) = op : s1 x ■■■ x sn ^ s, and i e N, one has u ■ i e Dom(t) ^ 1 < i < n and a(t(u.i)) = si. Moreover we let tu stand for the subtree of t rooted at u given by Dom(tu) = {v e N* | u ■ v e Dom(t)} and tu(v) = t(u ■ v).

2.2 Attribute grammar

The nodes of a E-tree can be decorated by attributes whose values are computed according to semantic rules.

Definition 2.3 An (abstract) attribute grammar G = (E,Attr, D,sem) is a signature E = (S, Op) each grammatical symbol (sort) of which is associated with a set of attributes in which we distinguish inherited attributes from synthesized attributes: Attr(s) = Inh(s) W Syn(s). The domain of evaluation of an attribute q e Attr(s) is a complete partial order Dq. We let

= nqelnh(s) Dq and = nqeSyn(s) Dq denote respectively the domains of interpretation for the inherited and the synthesized attributes of a node of sort s. Moreover a set of rules (the so-called semantic rules) are associated with each operator of the signature. These rules give the functional dependencies between the values of attributes and are given by the function:

sem(op) : V[s xV^ x-^xV\n —>V¡ xV^ x-^xV^

A node u e Dom(t) is said to be an occurrence of grammatical symbol s = a(t(u)), then it has the same attributes as s. The rationale of the distinction made

between inherited and synthesized attributes is the following. Synthesized attributes in a given node of a tree represent information comming from the subtree rooted at that node. Conversely, inherited attributes represent information comming from outside this subtree (i.e. from its context). For this reason, we define as input attribute of operator op : s1 • • • sn ^ s either an inherited attribute of s (whose value comes from the context) or a synthesized attribute of some of the si for 1 < i < n (whose value comes from the respective subtree). The remaining attributes, i.e. the synthesized attributes of s and the inherited attributes of the si for 1 < i < n are called ouput attributes or defined attributes. The set of semantic rules sem(op) associated with production op does actually contain exactly one definition for each output attribute in term of the input attributes.

Let us consider, as an illustration, the following attribute grammar for computing the frontier of a binary tree (the list of labels of its leaves from left to right) given by synthesized attribute flatten using an accumulating parameter (inherited attribute coflat).

coflat flatten = a:coflat

♦ Leaf *

I Fork *

i left '

y right *

Fig. 1. an attribute grammar for computing the frontier of a binary tree

Leaf a :: —> Tree£ | Tree£ • flatten = a : (Tree£ • coflat)

Fork :: Tree1 x Tree2 —> Tree£

Tree£ • flatten = Tree1 • flatten

Tree1 • coflat = Tree2 • flatten

Tree2 • coflat = Tree£ • coflat

In order to present the semantic rules we need to distinguish the different occurrences of a same grammatical symbol (sort). For this purpose if op : s1 x^ • xsn ^ s is an operator we let op :: (s1)1 x • • • x (sn)n ^ s£ for the extended notation where each occurrence of sort is tagged with its position, and by a slight abuse of notation we shall often write op :: X1 x • • • x Xn ^ X£ where Xi is an abreviation for (si)i and X£ = s£. Then the semantic rules attached to an operator op are of the form

op :: X1 x • • • x Xn —> X£

X£ • syn = sem(op)£,Syn (X\ • q;(X, q) E In op) where syn E Syn(s)

Xi • inh = sem(op)iiinh (X\ • q;(A, q) E Inop) where 1 < i < n and inh E Inh(si)

Inop = {(e, q) | q e Inh(s)}[J {(i,q) | 1 < i < n q e Syn(si)}

Outop = {(e, q) | q e Syn(s)} U {(i, q) | 1 < i < n q e Inh(si)}

represent the sets of occurrences of input attributes and output attributes respectively.

The semantic functions are actually rule schemes whose purpose is to define the value of each attribute at every node of (the tree representation of) the term. For instance if t is a tree and u e Dom(t) is such that t(u) = Fork then the above equations should be interpreted as

flatten(tu) = flatten(tu.i)

coflat(tu.i) = flatten(tu.2)

coflat(tu-2) = coflat(tu)

Thus each tree is associated with a system of equations whose variables are the occurrences of attributes

Vt = {vt,n,q^ e Dom(t) t(n) :: s1 x^ ■ ■ sn — s ; q e Att(s)}

U {vt,n-i,q^ e Dom(t) t(n) :: si x^ ■ ■ sn ^ s ; q e Att(si)}

whose resolution provides the interpretation of tree t e Tree(E)s w.r.t. to attribute grammar G as the map (t])g : VÍ -^Vl given by:

dt]) G(v)(q) = vt,s,q

where vt,q = sem(t(n))£,q (vt^-x,q'f(^,q') e In^)) for q e Syn(a(t(n)))

vt,K-i,q = sem(t(n))i,q {vt,n-x,q'/(A, q') e Intn) for q e Inh(a(t(n ■ i)))

vt,£,q = v(q) for q e Inh(a(t(e)))

where it is assumed that the vector {vt,x,q) appearing in the "where" clause is the least solution of the corresponding system of equations. We shall make this assumption each time a "where" clause occurs in a definition; this conforms to the interpretation of Haskell programs. Figure 2 displays the flow of computations of attribute occurrences that produces the frontier of a binary tree assuming the initial value of the accumulating parameter (value of attribute coflat at the root node) is the empty list.

2.3 Algebra associated with an attribute grammar

The semantic rules of an attribute grammar are syntax-directed in the sense that they are given by rule schemes associated with each operator of the signature. For this reason we can exhibit a E-algebra AG derived from the attribute grammar G such that ([t])g = tAG, i.e. the interpretation of a tree as defined in the previous section is given by the evaluation morphism (catamorphism) associated with algebra Ag.

Definition 2.4 The E-algebra Ag derived from attribute grammar G is such that

(Ag)s = Ds -Ds, and the interpretation of an operator op : s1 x • • • x sn — s is the map opA<a given by:

opAG (f1,...,fn )(v)(q) = Vop,£,q

where vcp£q = v(q) if q E Inh(s)

Vop,i,q = fi(vi)(q) where q E Syn(si) and Vi(q') = Vop,i,q> for q' E Inh(si) Vop,£,q = sem(op)£,q (vop,a,q'/ (X, q') E Inop) if q E Syn(s) Vop,i,q = sem(op)i,q (Vop, A , q' /(X, q') E Inop) if q E Inh(si)

This definition is circular [5] since in the "where" clause the inherited attributes Vi(q') = Vopi,q' for q' E Inh(si) appear both in the left-hand side and in the right-hand side of the defining equations. Thus it should be interpreted as the characterization of the vector {vop,A,q)(A,q)ein,opuoutop as the least fixed-point of the corresponding transformation.

Proposition 2.5 at)G = tAc-

Proof. Straightforward, details are provided in [3]. □

The above semantics of attribute grammars follows the approaches presented in [19,2], it also draws its inspiration from [23,6] in the sense that it gives a fixed-point semantics of attribute grammars. We have an almost literal transcription of the above definition into the language Haskell as the mechanism of lazy evaluation escapes the apparent cyclicity of the resulting program [5]. A translation of attribute grammars into catamorphism (evaluation function for an algebra) was already presented in [11]. However the presentation that we have just given is, in our opinion, more explicit and far more elementary than the one given there and it leads to a straightforward implementation in Haskell. Notice that another advantage of lazy evaluation is that we can define computations of attributes on potentially infinite data structures. For instance we can define semantics rules on streams as long as every approximations of the value of a given attribute can be computed using only a finite prefix of the stream.

The evaluation of a tree w.r.t. the algebra induced by the attribute grammar is then given by the inductive definition:

(op(t1,...,tn)AG (v)(q) = Vop,£,q

where vop,£,q = v(q) if q £ Inh(s)

Vop,i,q = tAa (Vi)(q) where q £ Syn(si) and Vi(q') = Vop,i,q for q' £ Inh(si)

Vop,£,q = sem(op)£,q (Vop,xq /(X,q') £ InP if q £ Syn(s)

Vop,i,q = sem(op)i,q (vVop,X,q'/(X, q') £ Inop) if q £ Inh(si)

Returning to our example, we derive the following Haskell code:

flatten :: Tree a -> [a] -> [a] flatten (Leaf a) coflat = a:coflat

flatten (Fork left right) coflat = flatten left (flatten right coflat)

2.4 Rooted attribute grammar

From now on we consider that each signature has a specific sort a, called its axioms. It is often convenient to consider a top level function that uses an attribute grammar to evaluate a tree after an appropriate initialization of the inherited attributes of the root node (these attributes are parameters of the corresponding system of equations). This can be done by extending the attribute grammar with an additional operator Root : a ^ T where a is the axiom of the grammar and T an additional sort, together with the associated semantic rules. A tree of the form Root(t) where t £ Tree(L)a represents a rooted tree, i.e. a tree with an empty context. We assume that the additional sort T has no inherited attribute and one synthesized attribute corresponding to the end result. The semantic rules associated with this additional operator root are then given by a pair of functions: init : T>a ^Va providing the initialization of the inherited attributes at the root node, and result : V

where V = V is the domain of values for the end result. We shall not explicitely add this new operator to the signature but consider that a rooted attribute grammar is an attribute grammar together with these two functions. The top level function is then the evaluation function associated with (the now implicit) operator Root in the algebra induced by the extended attribute grammar, it is therefore given by the expression shown in Figure 3. In our running example the result function is the

,'result

return : T(E)a —> V " init >

return t = result(val)

where val = tAG (init(val))

Fig. 3. Tying the knot: the top level function of a rooted attribute grammar

identity and the init function is the constant function returning the empty list (the accumulating parameter associated with the inherited attribute coflat is initialized

to the empty list regardless of the value of the synthesized attribute flatten at the root node), hence it is given by the following function:

return :: Tree a -> [a] return tree = flatten tree []

3 Attribute grammar as a zipper transformer

In order to account for context-dependent information we manipulate a subtree together with its corresponding context. We restrict attention to trees whose sort is the axiom. A zipper (of sort s) is given by a pair made of a tree of sort s together with a context for that tree. The representation of a context in the zipper comes from the following observation: either the context of the considered subtree t is empty or it is of the form

Opi(tl,...,ti-l,C,ti+l,...,tn) = C[op(t1,- • • [], ti+l, • • • , tn)\

where op : s1 x • • • x sn ^ s is an operator such that si is the sort of ti, and C is a context whose hole is of sort s. The trees tj for 1 < j < n and j = i are the siblings of t. Thus trees and contexts are given by the following signature.

Definition 3.1 In the signature Z^ we find two sorts denoted s and s, associated with each sort s £ S in £ and each operator op :: s1 x • • • x sn ^ s in £ is also an operator of Z^ with the same arity and sort; but it gives also rise to a family of operators opi for 1 < i < n where

op% : s1 x • • • x si-1 x s x si+i x • • • x sn ^ si

We finally have a constant operator Empty of sort a representing the empty context. A zipper c@t of sort s is a pair made of a subtree t £ Tree(Z^)s together with its context c £ Tree(Zs)s.

The interpretation of the semantic rule

X£ • syn = sem(op)£,syn (X\ • q;(X, q) £ Inop) associated with op :: X1 x • • • x Xn —> X£ is given by the following inductive rule

syn (x£@op(x1,...,xn)) =

sem(op)£,syn (q (X£@op(x1,...,xn)) /q £ Inh(s) ;

q (op1 (x1,...,xi_1,x£,xi+1,...,xn)@xi) /q £ Syn(si) )

whose interpretation is as follows. If the subtree t of zipper c@t matches the pattern x£@op(x1,...,xn), i.e. t = op(t1,...,tn), then the expression syn(c@t), standing for the value of the synthesized attribute syn of subtree t within context c, is given by the expression in the right-hand side where variables x£ and xi are replaced respectively by the context c and the subtrees ti given by pattern matching. Similarly, the interpretation of the semantic rule

Xi • inh = sem(op)in (X\ • q;(X, q) £ Inop)

is given by

inh (op1 (x1,...,Xi-1,X£,Xi+i,...,Xn)@Xi) =

sem(op)i,inh (q (X£@op(xi ,...,Xn)) /q £ Inh(s) ;

q (op1 (Xi,...,Xi-1,X£,Xi+1,...,Xn)@Xi) /q £ Syn(si) )

In this case, an inherited attribute appears as an attribute of the context (which is tested against a pattern) relative to a given subtree. If the attribute grammar is rooted the semantic rules associated with the (implicit) operator Root are translated as:

inh (Empty@X) = init (q (Empty@X) /q £ Syn(a))inh return(X) = result (q (Empty@X) /q £ Syn(a))

The Haskell code corresponding to our running example is given as follows. data Tree a = Leaf a | Fork (Tree a)(Tree a)

data Cxt a = Empty | LCxt (Cxt a)(Tree a) | RCxt (Tree a)(Cxt a) data Zipper a = Cxt a :> Tree a

flatten :: Zipper a -> [a]

flatten (cxt :> tree@(Leaf a)) = a :(coflat (cxt :> tree)) flatten (cxt :> (Fork left right)) = flatten ((LCxt cxt right):>left)

coflat :: Zipper a -> [a]

coflat (CoRoot:>tree) = []

coflat ((LCxt cxt right):>left) = flatten ((RCxt left cxt):>right)

coflat ((RCxt left cxt):>right) = coflat (cxt:>(Fork left right))

return :: Tree a -> [a]

return tree = flatten (Empty:>tree)

This code is an immediate transcription of the semantic rules, where a synthesized attribute is defined inductively on the structure of the tree component and an inherited attribute is defined inductively on the structure of the component giving the context. However we have in the right-hand side of each rule to update accordingly the various parameters. For instance the rule

coflat ((LCxt cxt right):>left) = flatten ((RCxt left cxt):>right)

states that the inherited attribute coflat when applied to a context of the form LCxt cxt right and a subtree left is given by the synthesized attribute flatten for subtree right in the corresponding context, namely RCxt left cxt. We can make this extra parameter implicit if each subtree is aware of its context, given as an extra parameter, and symmetrically each context is aware of the subtree to which it is applied. We achieve this result using cyclic representations of zippers which we define in the next section.

E. Badouel et al. /Electronic Notes in Theoretical Computer Science 229 (5) (2011) 39—56 49

4 Zippers as cyclic data structures

First we introduce an extended signature such that the corresponding trees are encodings of cyclic representations of zippers. In a second part we describe the process of unfolding a rooted tree (i.e. with an initial empty context) into its cyclic representation.

4.1 Cyclic data structures

Definition 4.1 In the signature CZs we find two sorts denoted s and s, associated with each sort s £S in £ and each operator op : s1 x • • • x sn ^ s gives rise to a family of operators op\ for A £ {e} U{1,...,n} where op£ :: s x s1 x • • • x sn ^ s and opi :: s x s1 x • • • x sn ^ si. Finally we have an operator CoRoot : a ^ a where a is the axiom of £. A tree t £ Tree(CZs)s is a representation of a subtree of type s and a tree c £ Tree(CZs)s is a representation of a context of type s.

However most of the trees build from this signature CZs are not valid representations of subtrees or contexts. Let us illustrate this phenomenon with the example of doubly-linked streams. If A is an alphabet, a stream is a tree on the monosorted signature £ (with sort S = {st}) containing one unary operator a :: st ^ st for each letter a £ A. The tree a(st) stands for the stream whose root node is labelled a and such that the remaining stream obtained by removing this root node is st. The signature Zs provides the associated structure of zipper

data Stream a = Cons{val::a, suc::Stream a}

data StreamCxt a = Snoc{val::a, pred::StreamCxt a} | Empty

data StreamZipper a = (StreamCxt a):>(Stream a)

The structure of zipper allows to navigate streams non destructively:

left, right :: StreamZipper a -> StreamZipper a right (cxt:>(Cons a str)) = (Snoc a cxt):>str left ((Snoc a cxt):>str) = cxt:>(Cons a str)

In order to navigate a stream in both direction we can alternatively add to each node a pointer to the preceding node, leading us to the structure of a doubly-linked stream:

data DStream a = Node{val:: a, prev::CxtDStream, suc ::DStream a} data CxtDStream a = CoRoot (DStream a)

| CoNode{val':: a , prev'::CxtDStream a, suc'::DStream a}

which is the inductive data structure associated with signature CZs. If we were to consider doubly-linked lists rather than doubly-linked streams then we would just have to add one unary constructor associated with the constant operator associated with the empty list:

data DList a = Node{val:: a, prev::CxtDList, suc ::DList a}

| Nil (CxtDList) data CxtDList a = CoRoot (DList a)

| CoNode{val':: a , prev'::CxtDList a, suc'::DList a}

These two data structures are isomorphic, and by identifying them we obtain a more traditional representation of doubly-linked lists as:

data DList a = Node{val:: a, prev,suc ::DList a} | Nil (DList)

An implicit assumption is that if suc xs is defined then prev (suc xs)= xs, and if prev xs is defined then suc (prev xs)= xs, similarly prev xs = Nil ys or suc xs = Nil ys entails ys=xs. These conditions are met in the following doubly-linked representation of the list [1, 2, 3]

dlist = nodel

where nodel = Node 1 (Nil nodel) node2 node2 = Node 2 nodel node3 node3 = Node 3 node2 (Nil node3)

An abstract data type is often presented by a multi-sorted signature together with equational constraints stated in terms of the constructors of the signature. They thus constrain the class of valid interpretations to belong to the corresponding equational variety of algebras. The abstract data type is then identified with the initial object of that category; namely, the quotient of the initial algebra by the induced congruence. In the present case, equations are stated in terms of the selectors of the signature. They limit the class of valid generators and the abstract data type can be identified with a subcoalgebra of the terminal coalgebra. Elements of this abstract representation can be represented by graphs whose tree unfolding satisfies the equations in the following sense. The set of equational contraints determines a binary relation on the set of nodes of a tree. The tree satisfies the equational contraints if two subtrees rooted at related nodes are the same. The carrier of the abstract data type is then given as the set of trees satisfying the equational constraints; and each such element can be seen as a tree representation of the graph whose nodes are the isomorphic class of its subtrees. Due to this graphical representation we use the expression of cyclic data structures to stand for abstract data types defined from a multi-sorted signature and a base of cycles given by a set of equations using the selectors of the signature. It could be interesting to investigate more deeply such a coalgebraic presentation of cyclic data structures [15,20,7,13,18].

In order to generate only doubly-linked lists that are well-formed (i.e. that satisfy the above identities) we will exclusively generate them using some stream coalgebra. Such a coalgebra allows to generate streams:

data StrCoalg b a = StrCoalg{out::b->a, next::b->b}

streamGen :: StrCoalg b a -> b -> Stream a streamGen (StrCoalg out next) = build

where build gen = Cons (out gen)(build (next gen))

For instance one can generate the stream of prime numbers using the sieve of Er-

atosthenes as follows:

sieve = StrCoalg head next

where next xs = filter (\n -> not((n 'mod' (head xs))==0))(tail xs)

primes = streamGen sieve [2..]

In order to generate a well-formed double-linked stream from a stream coalgebra we only have to adapt the above definition of the stream generation function by adding a new parameter for handling the context:

dStreamGen :: StrCoalg b a -> b -> DStream a dStreamGen (StrCoalg out next) gen = dstr where dstr = build gen (CoRoot dstr)

build gen cxts = Node (out gen) cxts dstr'

where dstr' = build (next gen) (CoNode (out gen) cxts dstr')

dprimes = dStreamGen sieve [2..]

Then we derive a function translating a stream into a corresponding double-linked stream:

stream2dStream :: Stream a -> DStream a stream2dStream = dStreamGen (StrCoalg val suc)

Or equivalently by expanding the definition of function dStreamGen:

stream2dStream str = dstr

where dstr = build str (CoRoot dstr)

build (Cons val suc) cxts = Node val cxts dstr' where dstr' = build suc (CoNode val cxts dstr')

Now one can navigate a doubly-linked stream:

right (Node a cxts dstr) = dstr

left (Node _ (CoNode b cxts dstr) _) = Node b cxts dstr

first :: Int -> DStream a -> [a] first 0 str = []

first (n+1) (Node a cxts dstr)= a:(first n dstr)

test = first 5 ((left.right.right.left.right.right.right.right) dprimes) > test

[11,13,17,19,23]

4-2 Unfolding of a tree

In this section we generalize on the previous example of doubly-linked streams to present a translation of trees into zippers. For that purpose we introduce an attribute grammar canonically associated with a given signature.

Definition 4.2 The rooted attribute grammar ns associated with signature £ = (5, Op) and axiom a G S is defined as follows. It has one inherited attribute associated with each sort Inh = {cxts\s GS} representing the context at the given node of the tree, and one synthesized attribute Syn = {trees\s G S} representing the subtree rooted at that node, with arities and sorts given by trees : s — s and cxts : s — s. The semantic domains are given by Dtrees = Tree(CZs)s and Vcxts = Tree(CZs)s. The semantic rules associated with operator op : si x---x sn — s are given by

op :: X1 x ■ ■■ x Xn — X£

X£ ■ trees = op£ (X ■ cxts,Xi ■ trees1 ,...,Xn ■ tree,Sn)

Xi ■ cxtsi = opi (X ■ cxts ,Xi ■ tree,n ,...,Xn ■ tree,Sn)

and the auxiliary functions result : Tree(CZs)a ^ Tree(CZs)a and init : Tree(CZs) a —^ Tree(CZs)a are respectively the identity and the operator CoRoot.

We let U, for "unfolding", denote the algebra induced by attribute grammar ns The interpretation of operator op is thus given by

opU (fi,...,fn) cxt = op£ (cxt, treei,...,treen)

where treei = fi (opi (cxt, treei ,...,treen))

and we let unfold denote the corresponding top level function:

unfold :: Tree(£)a — Tree(CZs)a

unfold(tree) = ctree where ctree = ^(CoRoot(ctree)) Thus the unfolding function can be written as: unfold tree = ctree

where ctree = builda tree (CoRoot ctree)

builds (op(ti,...,tn)) cxt = op£ (cxt, treei,...,treen) where treei = buildsi ti opi (cxt, treei,...,treen) In our example of binary trees the unfolding can be given as follows: data Tree a = Leaf a | Fork (Tree a) (Tree a)

data ZTree a = ZLeaf a (ZCxt a) | ZFork (ZCxt a) (ZTree a) (ZTree a) data ZCxt a = CoRoot (ZTree a) | ZLeft (ZCxt a) (ZTree a) (ZTree a)

| ZRight (ZCxt a) (ZTree a) (ZTree a)

unfold :: Tree a -> ZTree a unfold tree = ctree where ctree = build tree (CoRoot ctree) build (Leaf a) cxt = ZLeaf a cxt

build (Fork left right) cxt = ZFork cxt cleft cright

where cleft = build left (ZLeft cxt cleft cright) cright = build right (ZRight cxt cleft cright)

In the example of doubly-linked lists we introduced this unfolding function as a special case of the function generating a doubly-linked list from a list coalgebra. The same can be done by using an adaptation of the above unfolding function:

data Trunk a b = Leaf a | Fork b b

data Tree a = In{out:: Trunk a (Tree a)}

data TreeCoalg c a = TreeCoalg{next::c-> Trunk a c}

punfold :: TreeCoalg c a -> c -> ZTree a punfold coalg gen = ctree where ctree = build gen (CoRoot ctree)

build gen cxt = case next coalg gen of Leaf a -> ZLeaf a cxt

Fork left right -> ZFork cxt cleft cright where cleft = build left (ZLeft cxt cleft cright) cright = build right (ZRight cxt cleft cright)

unfold = punfold (TreeCoalg out)

More generally this parametric unfolding will be given as:

punfold coalg gen = tree where tree = builda gen (CoRoot tree)

builds gen cxt = case coalg gen of

op(gen\,...,genn) ^ op£ (cxt, treel,...,treen)

where treei = buildSi geni opi (cxt, treel,...,treen)

5 First-order functional interpretation of an attribute grammar

We associate a rooted attribute grammar G with a CZ^-algebra AG where

(AG) s = DS and (AG) s = VS

op£ G (v)(syn) = sem(op)£,syn(v) A

opi G(v)(inh) = sem(op)i,inh(v) CoRootA G = init

where op : s1 x • • • x sn — s, v £ vS x V^S1 x ■ ■ ■ x v\n, syn £ Syn(s), and inh £ Inh(si).

Proposition 5.1 (unfold tree)A(i = val where val = tAG (init(val)) Proof. Straightforward, details may be found in [3]. □

Therefore result unfold tree)Acoincides with the value return tree returned by the rooted attribute grammar G from the given input tree. It gives an algorithm for computing that value by simple structural recursion on the unfolding of the input tree. In our running example, we obtain the following Haskell code.

data Tree a = Leaf a | Fork (Tree a) (Tree a)

data ZTree a = ZLeaf a (ZCxt a) | ZFork (ZCxt a) (ZTree a) (ZTree a) data ZCxt a = CoRoot (ZTree a) | ZLeft (ZCxt a) (ZTree a) (ZTree a)

| ZRight (ZCxt a) (ZTree a) (ZTree a)

unfold :: Tree a -> ZTree a unfold tree = ctree ....

flatten :: ZTree a -> [a]

flatten (ZLeaf a cxt) = a:(coflat cxt)

flatten (ZFork cxt left right) = flatten left

coflat :: ZCxt a -> [a]

coflat (CoRoot tree) = []

coflat (ZLeft cxt left right) = flatten right

coflat (ZRight cxt left right) = coflat cxt

return :: Tree a -> [a]

return tree = flatten (unfold tree)

6 Conclusion

We have presented a new interpretation of attribute grammars (with both inherited and synthesized attributes), based on Huet's zipper datatype. More precisely we introduced a signature CZ^ for representing zippers as cyclic data structures, together with an unfolding operation transforming a E-tree into their cyclic representation (a CZ^-tree). An attribute grammar on signature E can immediately be identified with an CZ^-algebra so that attribute values can be computed by structural induction on the unfolding of the input tree.

If the domain of interpretation of attributes are trees over an output (or semantic) signature, and if the semantic rules are accordingly given by expressions build on this output signature, then an attribute grammar can be interpreted as a tree transformer. In this context, our result amounts to transform an attribute grammar into a (deterministic, top-down) tree transducer with input signature CZ^. It is

much easier to compose (top-down) tree transducers than to compute the syntactic composition of attribute grammars. The latter operation introduced by Ganzinger and Giegerich, as the co-called attributed coupled grammars [12], has already been related to the functional programming deforestation technique in [8,10]. We would like to recover the syntactic composition of attribute grammars through the composition of the associated (top-down) tree transducers acting on cyclic representations of zippers.

We can notice that the Haskell code we ended with is almost an immediate transcription of the semantic rules of the attribute grammar. Still the programmer need to be aware of the underlying cyclic representation of zippers and this is an undesirable overhead and a potential source of programming errors. We would like to be able to encapsulate these aspects into a structure of monad (or a structure of arrows) so that all these considerations would be totally transparent to the programmer. we are thus looking for a set of functional combinators (similar to the functional monadic parser combinators [17]) providing an attribute grammar designer with a Domain Specific Language embedded in Haskell. Using these combinators it would specify an attribute grammar (mainly by writing down semantic rules) but by doing so he would actually build an Haskell program for the corresponding evaluator of attributes or for related tools.

References

[1] Augusteijn, A., "Functional Programming, Program Transformations and Compiler Construction", PhD thesis, Technische Universiteit Eindhoven, i993.

[2] Backhouse, K. S., A functional semantics of attribute grammars, in: J.-P. Katoen and P. Stevens, eds., "Proc. of 8th Int. Conf. on Tools and Algorithms for Construction and Analysis of Systems, TACAS 2002

(Grenoble, Apr. 2002)," Lecture Notes in Computer Science 2280, Springer, 2002, pp. i42-i57.

[3] Badouel, E., B. Fotsing and R. Tchougong, Yet another implementation of attribute evaluation, Research Report No. 63i5, INRIA, 2007. Available at https://hal.inria.fr/inria-00175810.

[4] Badouel, E. and M. Tonga, Growing a domain specific language with split extensions, Research Report No. 63i4, INRIA, 2007. Available at https://hal.inria.fr/inria-00175805.

[5] Bird, R., Using circular programs to eliminate multiple traversals of data, Acta Inform. 21 (1984), 239-250.

[6] Chirica, L. M. and D. F. Martin, An order-algebraic definition of Knuthian semantics, Math. Syst.

Theory 13 (1979), pp. 1-27.

[7] Clack, C., S. Clayman and D. Parrott, Dynamic cyclic data structures in lazy functional languages, draft, 1995. Available at http://www.cs.ucl.ac.uk/staff/C.Clack/papers/NotSubmitted/graph.ps.

[8] Correnson, L., E. Duris, D. Parigot, and G. Roussel, Declarative program transformation: a deforestation case-study, in: G. Nadathur, ed., "Proc. of Int. Conf. on Principles and Practice of Declarative Programming, PPDP '99 (Paris, Sept./Oct. 1999)," Lecture Notes in Computer Science 1702, Springer, 1999, pp. 360-377.

[9] Courcelle, B. and P. Franchi-Zannettacci, Attribute grammars and recursive program schemes, I, Theor. Comput. Sci. 17 (1982), pp. 163-191; Attribute grammars and recursive program schemes, II, Theor. Comput. Sci. 17 (1982), pp. 235-257.

[10] Duris, E., D. Parigot, G. Roussel, and M. Jourdan, Structured-directed genericity in functional programming and attribute grammars, Research Report No. 3105, INRIA, 1997.

[11] Fokkinga, M., J. Jeuring, L. Meertens and E. Meijer, A translation from attribute grammars to catamorphisms, The Squiggolist 2 (1991), pp. 20-26.

[12] Ganzinger, H. and R. Giegerich, Attribute coupled grammars, in: "Proc. of 1984 SIGPLAN Symp. on Compiler Construction (Montreal, June 1984)", ACM Press, 1984, pp. 157-170.

[13] Ghani, N., M. Hamana, T. Uustalu and V. Vene, Representing cyclic structures as nested datatypes, in: H. Nilsson, ed., "Proc. of 7th Symp. on Trends in Functional Programming (Nottingham, Apr. 2006),", Univ. of Nottingham, 2006, pp. 173-188.

[14] Goguen, J., J. W. Thatcher, E. G. Wagner and J. B. Wright, Initial algebra semantics and continuous algebras, J. of ACM 24(1) (1977), pp. 68-95.

[15] The Haskell Wiki, Tying the knot: how to build a cyclic data structure, wiki note, 2002. Available at http://haskell.org/wikisnapshot/TyingTheKnot.html.

[16] Huet, G., The zipper, J. of Funct. Programming 7(5) (1997), pp. 549-554.

[17] Hutton, G. and E. Meijer, Monadic parsing in Haskell, J. of Funct. Program. 8(4) (1998), pp. 437-444.

[18] Johann, P. and N. Ghani, Initial algebra semantics is enough!, in: S. Ronchi Della Rocca, ed., "Proc. of 8th Int. Conf. on Typed Lambda Calculi and Applications, TLCA 2007 (Paris, June 2007)," Lecture Notes in Computer Science 4583, Springer, 2007, pp. 207-222.

[19] Johnsson, T., Attribute grammars as functional programming paradigm, in: G. Kahn, ed., "Proc. of 3rd Int. Conf. on Functional Programming and Computer Architecture, FPCA '87 (Portland, OR, Spept. 1987)," Lecture Notes in Computer Science 274, Springer, 1987, pp. 154-173.

[20] Klarlund, N. and M. I. Schwartzbach, Graph types, in: "Conf. Record of 20th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, POPL '93 (Charleston, SC, Jan. 1993)," ACM Press, 1993, pp. 196-205.

[21] Knuth, D. E., Semantics of context free languages, Math. Syst. Theory 2 (1968), pp. 127-145.

[22] Kuiper, M. and S. D. Swierstra, Using attribute grammars to derive efficient functional programs, Technical Report RUU-CS-86-16, Universiteit Utrecht, 1986. Available at http://www.cs.uu.nl/ research/techreps/RUU-CS-86-16.html.

[23] Mayoh, B., Attribute grammars and mathematical semantics, SIAM J. of Comput. 10 (1981), pp. 503518.

[24] Paakki, J., Attribute grammar paradigms: a high-level methodology in language implementation, ACM Comput. Surv. 27(2) (1995), pp. 196-255.

[25] Swierstra, S. D., P. R. Azero Alcocer and J. Saraiva, Designing and implementing combinator languages, in: S. D. Swierstra, P. R. Henriques and J. N. Oliveira, eds., "Revised Lectures from 3rd Int. School on Advanced Functional Programming, AFP '98 (Braga, Sept. 1998)," Lecture Notes in Computer Science 1608, Springer, 1999, pp. 150-206. See also UUAGG (Utrecht University Attribute Grammar Compiler), http://www.cs.uu.nl/wiki/HUT/AttributeGrammarSystem.

[26] Uustalu, T. and V. Vene, Comonadic functional attribute evaluation, in: M. van Eekelen, ed., "Trends in Functional Programming 6," Intellect, 2007, pp. 145-162.