Electronic Notes in Theoretical Computer Science 76 (2002)

URL: http://www.elsevier.nl/locate/entcs/volume76.html 22 pages

Redundancy of Arguments Reduced to Induction

Maria Alpuente †,1,3   Rachid Echahed ‡,4   Santiago Escobar †,1,2,5   Salvador Lucas †,1,6

† DSIC, UPV, Camino de Vera s/n, 46022 Valencia, Spain.
‡ Leibniz-IMAG, 46, Av. Felix Viallet, 38031 Grenoble, France.

Abstract

We demonstrate that the problem of identifying redundant arguments of function symbols, i.e. parameters which can be replaced by any expression without changing the associated semantics, boils down to proving the validity of a particular class of inductive theorems in the equational theory of confluent, sufficiently complete term rewriting systems (TRSs). Hence, existing results for proving inductive theorems can be exploited to solve the problem in many interesting cases where previously developed methods fail to recognize and remove redundancies. In particular, this novel formulation directly yields a new decidability result for the redundancy problem which is based on the so-called standard theories. As an additional result which stems from the inductive encoding of the redundancy problem, we finally propose two different techniques for the analysis of redundant arguments, which are respectively based on inductionless induction and abstract rewriting (a technique for approximating normal forms in sufficiently complete, left linear, canonical TRSs).

1 Introduction

The application of automatic transformation processes during the formal development and optimization of programs can introduce encumbrances in the generated code that programmers usually (or presumably) do not write. Examples are redundant arguments in the functions defined in the transformed program [1,2,7,13,15,18,20,21,24,27].

1 Work partially supported by CICYT TIC2001-2705-C03-01, Acciones Integradas HI2000-0161, HA2001-0059, HU2001-0019, and Generalitat Valenciana GV01-424.

2 S. Escobar was supported by grant 4342 of Universidad Politécnica de Valencia during a stay at Leibniz-IMAG Lab.

3 Email: alpuente@dsic.upv.es, URL: http://www.dsic.upv.es/users/elp/alpuente.html

4 Email: Rachid.Echahed@imag.fr, URL: http://www-leibniz.imag.fr/PMP/Rachid.Echahed

5 Email: sescobar@dsic.upv.es, URL: http://www.dsic.upv.es/users/elp/sescobar.html

6 Email: slucas@dsic.upv.es, URL: http://www.dsic.upv.es/users/elp/slucas.html

©2002 Published by Elsevier Science B. V.

Example 1.1 Consider the following program, borrowed from [8], which can be used for adding and subtracting natural numbers in Peano notation:

minus(x,0) → x                          p(0) → 0
minus(0,s(y)) → 0                       p(s(0)) → 0
minus(s(x),s(y)) → p(s(minus(x,y)))     p(s(s(x))) → s(x)
plus(0,y) → y                           plus(s(x),y) → s(plus(x,y))

If we specialize this program for the call minusplus(x,y) = minus(plus(y,x),y), which adds x to y and then removes y from the sum, thus returning the original x, the optimized program which can be obtained by using an automatic specializer of functional programs such as the one described in [3] is:

minusplus(x,0) → x                          p(0) → 0
minusplus(x,s(y)) → p(minusplus2(x,y))      p(s(x)) → s(x)
minusplus2(x,0) → x
minusplus2(x,s(y)) → p(minusplus2(x,y))

Note that the second argument of the function minusplus is redundant for the semantics of computed values. Known procedures for removing dead code such as [7,18,21], as well as standard (post-specialization) renaming/compression procedures (see e.g. [3]), cannot remove this redundant argument. Moreover, redundant argument filtering procedures for logic programs, such as the one included in the partial deduction system ECCE [19], do not recognize the redundancy of this parameter either.
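The redundancy claimed for Example 1.1 can be observed experimentally. The following is a minimal sketch, assuming Peano numerals are encoded as non-negative Python integers (an encoding chosen here for illustration, not part of the paper). It transcribes the rules of the optimized program and checks, up to a bound, that the value of minusplus(x,y) does not depend on y. Such a bounded test is only evidence; it does not replace the inductive proof.

```python
from itertools import product

# Peano numerals are encoded as non-negative ints (assumption of this sketch):
# 0 is 0 and s(n) is n + 1.

def p(n):
    # p(0) -> 0 ; p(s(x)) -> s(x): both rules return their argument unchanged.
    return n

def minusplus2(x, y):
    # minusplus2(x,0) -> x ; minusplus2(x,s(y)) -> p(minusplus2(x,y))
    return x if y == 0 else p(minusplus2(x, y - 1))

def minusplus(x, y):
    # minusplus(x,0) -> x ; minusplus(x,s(y)) -> p(minusplus2(x,y))
    return x if y == 0 else p(minusplus2(x, y - 1))

def second_argument_ignored(bound):
    """Bounded check of minusplus(x,y) = minusplus(x,w) for all x, y, w < bound."""
    return all(minusplus(x, y) == minusplus(x, w)
               for x, y, w in product(range(bound), repeat=3))
```

Calling `second_argument_ignored(8)` exercises all 512 triples below the bound.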

In this paper, we provide a characterization of the problem of redundancy of arguments in terms of inductive theorems (see [2] for a detailed comparison of the problem of redundancy of arguments w.r.t. existing techniques). Then, undecidability and decidability results as well as different methods for recognizing inductive theorems can be exploited. This complements some previous results presented in [2], where the decidability of the redundancy for the class of right-ground TRSs was proven. Now, by exploiting induction we are able to prove decidability in a different and incomparable class of programs, namely the standard theories of [23]. In [2] we also proposed two different criteria for recognizing redundancy, which require either that the redundant arguments have a variable in every left-hand side or a joinability condition on the rhs's of the rules; this prevented our methods from coping with many interesting examples, such as the program of Example 1.1. In this paper we provide two different, novel criteria for recognizing redundancy, which are based on inductionless induction [10] and abstract rewriting [8], respectively. The combination of these methods catches redundancy in many new practical cases, including Example 1.1. Of course, in exchange for the conditions of [2], different extra conditions are required, which are also discussed in the paper.

1.1 Plan of the paper

After some preliminaries in Section 2, in Section 3 we recall from [2] the notion of redundancy of arguments of function symbols and we show how the redundancy of arguments is reduced to the validity of inductive theorems. In Section 4, we recall the inductive proof technique called inductionless induction [10], and show how it can be applied for detecting redundant arguments. We also identify a new class of rewrite systems for which the problem of detection of redundant arguments is decidable. In Section 5, we show how the abstract rewriting technique of [8] can be used to detect new redundancies. Section 6 concludes the paper. We have included in the Appendix the practical generation of the optimized program of Example 1.1 together with the inductive proof necessary for detecting redundancies in the program.

2 Preliminaries

Let us first introduce the main notations used in the paper. For full or missing definitions about term rewriting, we refer to [14]; and for theorem proving in automated reasoning, we refer to [6]. Let $\to\ \subseteq A \times A$ be a binary relation on a set $A$. We denote the inverse of $\to$ by $\leftarrow$, the symmetric closure by $\leftrightarrow$, the transitive closure by $\to^+$, the reflexive and transitive closure by $\to^*$, and the reflexive, symmetric and transitive closure by $\leftrightarrow^*$. We say that $\to$ is confluent if, for every $a,b,c \in A$, whenever $a \to^* b$ and $a \to^* c$, there exists $d \in A$ such that $b \to^* d$ and $c \to^* d$. We say that $\to$ is terminating (or well-founded) iff there is no infinite sequence $a_1 \to a_2 \to a_3 \cdots$.

Throughout the paper, $X$ denotes a countable set of variables and $\Sigma$ denotes a finite set of function symbols $\{f, g, \ldots\}$, each one having a fixed arity given by a function $ar : \Sigma \to \mathbb{N}$. By $T(\Sigma, X)$ we denote the set of terms; $T(\Sigma)$ is the set of ground terms or Herbrand domain, i.e., terms without variable occurrences. A term is said to be linear if it has no multiple occurrences of a single variable. A $k$-tuple $t_1,\ldots,t_k$ of terms is written $\bar{t}$. The number $k$ of elements of the tuple $\bar{t}$ will be clarified by the context. $Var(t)$ is the set of variables in $t$. A substitution is a mapping $\sigma : X \to T(\Sigma, X)$ which homomorphically extends to a mapping $\sigma : T(\Sigma, X) \to T(\Sigma, X)$. Let $Subst(\Sigma, X)$ denote the set of substitutions and $Subst(\Sigma)$ be the set of ground substitutions, i.e., substitutions on $T(\Sigma)$. If $\sigma(t)$ is a ground term, we call $\sigma$ a grounding substitution for $t$. A unifier of two terms $t, s$ is a substitution $\sigma$ with $\sigma(t) = \sigma(s)$. A most general unifier (mgu) of $t, s$ is a unifier $\sigma$ such that for each unifier $\sigma'$ of $t, s$ there exists $\theta$ such that $\sigma' = \theta \circ \sigma$.
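To make the notion of most general unifier concrete, here is a minimal sketch of syntactic unification. The term encoding is an assumption of this example: variables are Python strings, and an application f(t1,...,tn) is the tuple `("f", t1, ..., tn)`, so the constant 0 is `("0",)`. The function names `apply`, `occurs` and `mgu` are hypothetical helpers, not taken from the paper.

```python
def is_var(t):
    # Variables are strings; applications are tuples (symbol, arg1, ..., argn).
    return isinstance(t, str)

def apply(sub, t):
    """Apply the substitution sub (dict: variable -> term), following bindings."""
    if is_var(t):
        return apply(sub, sub[t]) if t in sub else t
    return (t[0],) + tuple(apply(sub, a) for a in t[1:])

def occurs(v, t, sub):
    """Occurs check: does variable v occur in t once sub has been applied?"""
    t = apply(sub, t)
    if is_var(t):
        return v == t
    return any(occurs(v, a, sub) for a in t[1:])

def mgu(s, t):
    """Return a most general unifier of s and t as a dict, or None if none exists."""
    sub = {}
    pending = [(s, t)]
    while pending:
        a, b = pending.pop()
        a, b = apply(sub, a), apply(sub, b)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, sub):
                return None
            sub[a] = b
        elif is_var(b):
            if occurs(b, a, sub):
                return None
            sub[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            pending.extend(zip(a[1:], b[1:]))
        else:
            return None  # symbol clash
    # Resolve chained bindings so that the result can be read off directly.
    return {v: apply(sub, v) for v in sub}
```

For instance, unifying f(x, s(y)) with f(0, z) binds x to 0 and z to s(y), while x and s(x) have no unifier because of the occurs check.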

Terms are viewed as labelled trees in the usual way. Positions $p, q, \ldots$ are represented by chains of positive natural numbers used to address subterms of $t$. By $\Lambda$, we denote the empty chain. The subterm at position $p$ of $t$ is denoted as $t|_p$ and $t[s]_p$ is the term $t$ with the subterm at position $p$ replaced by $s$. By $Pos(t)$ we denote the set of positions of a term $t$. The symbol labeling the root of $t$ is denoted as $root(t)$. A context is a term $C$ with zero or more 'holes', $\Box$ (a fresh constant symbol). We usually write simply $C[\ ]$ to denote an arbitrary context, clarifying the number and location of holes 'in situ'. If $C$ is a context and $t$ a term, $C[t]$ denotes the result of replacing the hole in $C$ by $t$.
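The position-based operations just introduced can be sketched in a few lines. The encoding is an assumption of this example: variables are strings, an application f(t1,...,tn) is the tuple `("f", t1, ..., tn)`, and a position is a tuple of positive integers, with `()` playing the role of the empty chain. The helper names are hypothetical.

```python
def positions(t):
    """Yield every position of term t; () is the root position."""
    yield ()
    if not isinstance(t, str):
        for i, arg in enumerate(t[1:], start=1):
            for p in positions(arg):
                yield (i,) + p

def subterm(t, p):
    """Return the subterm of t at position p."""
    for i in p:
        t = t[i]  # index 0 holds the symbol, so argument i sits at index i
    return t

def replace(t, p, s):
    """Return t with the subterm at position p replaced by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]
```

With `t = ("minus", ("plus", "y", "x"), "y")`, the call `subterm(t, (1, 2))` selects the variable x, and `replace(t, (2,), ("0",))` replaces the second argument by 0.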

A rewrite rule is an ordered pair $(l,r)$, written $l \to r$, with $l, r \in T(\Sigma, X)$, $l \notin X$ and $Var(r) \subseteq Var(l)$. The left-hand side (lhs) of the rule is $l$ and $r$ is the right-hand side (rhs). A TRS is a pair $\mathcal{R} = (\Sigma, R)$ where $R$ is a set of rewrite rules and $\Sigma$ is called the signature. An instance $\sigma(l)$ of the lhs of a rule $l \to r$ is a redex. A term $t$ without redexes is called a normal form. By $NF_\mathcal{R}$ we denote the set of finite normal forms of $\mathcal{R}$. Given $\mathcal{R} = (\Sigma, R)$, we consider $\Sigma$ as the disjoint union $\Sigma = \mathcal{C} \uplus \mathcal{F}$ of symbols $c \in \mathcal{C}$, called constructors, and symbols $f \in \mathcal{F}$, called defined functions, where $\mathcal{F} = \{f \mid f(\bar{l}) \to r \in R\}$ and $\mathcal{C} = \Sigma - \mathcal{F}$. Then, $T(\mathcal{C}, X)$ is the set of constructor terms. A pattern is a term $f(l_1,\ldots,l_n)$ such that $f \in \mathcal{F}$ and $l_1,\ldots,l_n \in T(\mathcal{C}, X)$. A term $t$ rewrites to $s$ (at position $p$), written $t \stackrel{p}{\to}_{\mathcal{R}} s$ (or just $t \to s$), if $t|_p = \sigma(l)$ and $s = t[\sigma(r)]_p$, for some rule $l \to r \in R$, $p \in Pos(t)$ and substitution $\sigma$. A TRS $\mathcal{R}$ is left linear if all its lhs's are linear terms. A constructor system (CS) is a TRS whose lhs's are patterns. A TRS $\mathcal{R}$ is terminating (resp. confluent) if the relation $\to_\mathcal{R}$ is terminating (resp. confluent). A TRS $\mathcal{R}$ is canonical or convergent if the relation $\to_\mathcal{R}$ is terminating and confluent. If the TRS $\mathcal{R}$ is canonical, the normal form of a term $t \in T(\Sigma)$ exists, it is unique, and it will be denoted by $t\!\downarrow_\mathcal{R} \in NF_\mathcal{R}$. A TRS $\mathcal{R}$ is sufficiently complete if $\forall t \in T(\Sigma)$, $\exists t' \in T(\mathcal{C})$ such that $t \to^* t'$. Two terms $t, s$ are joinable, denoted by $t \downarrow s$, if there exists a term $u$ such that $t \to^* u$ and $s \to^* u$.
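As an illustration of these definitions, the following sketch computes normal forms of ground terms by innermost rewriting, using the original program of Example 1.1. The tuple encoding of terms (variables as strings, applications as `(symbol, args...)`) and the helper names `match`, `instantiate`, `normalize` and `num` are assumptions of this example. Innermost normalization is just one possible strategy; it terminates here because the program is terminating.

```python
def match(pattern, term, sub=None):
    """Return a substitution sub with sub(pattern) == term, or None."""
    sub = dict(sub or {})
    if isinstance(pattern, str):  # a variable of the rule
        if pattern in sub and sub[pattern] != term:
            return None
        sub[pattern] = term
        return sub
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        sub = match(p_arg, t_arg, sub)
        if sub is None:
            return None
    return sub

def instantiate(t, sub):
    """Apply the matching substitution to a right-hand side."""
    if isinstance(t, str):
        return sub[t]
    return (t[0],) + tuple(instantiate(a, sub) for a in t[1:])

def normalize(t, rules):
    """Innermost normalization of a ground term (may loop on non-terminating rules)."""
    t = (t[0],) + tuple(normalize(a, rules) for a in t[1:])
    for lhs, rhs in rules:
        sub = match(lhs, t)
        if sub is not None:
            return normalize(instantiate(rhs, sub), rules)
    return t  # no redex left: t is a normal form

Z = ("0",)

def S(t):
    return ("s", t)

def num(n):
    """Peano numeral s^n(0)."""
    return Z if n == 0 else S(num(n - 1))

# Original program of Example 1.1.
RULES = [
    (("minus", "x", Z), "x"),
    (("minus", Z, S("y")), Z),
    (("minus", S("x"), S("y")), ("p", S(("minus", "x", "y")))),
    (("p", Z), Z),
    (("p", S(Z)), Z),
    (("p", S(S("x"))), S("x")),
    (("plus", Z, "y"), "y"),
    (("plus", S("x"), "y"), S(("plus", "x", "y"))),
]

# Defined symbols are the root symbols of left-hand sides; the rest are constructors.
DEFINED = {lhs[0] for lhs, _ in RULES}
```

Normalizing `("minus", ("plus", num(2), num(3)), num(2))` corresponds to the call minusplus(3,2) of Example 1.1.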

To avoid confusion, in the sequel syntactic equality of terms is represented by $\equiv$. An equation is a formula of the form $r = s$ (or $s = r$) where $r, s \in T(\Sigma, X)$. An equational system is a set of equations. If $E$ is a set of equations between terms of $T(\Sigma, X)$, $\leftrightarrow^*_E$ is the smallest congruence on $T(\Sigma, X)$ such that $\sigma(s) \leftrightarrow^*_E \sigma(t)$ for all equations $s = t \in E$ and for all substitutions $\sigma$. Given a set of equations (or rewrite rules) $E$, $s = t$ is a logical consequence of $E$, denoted by $E \vdash s = t$, if $s \leftrightarrow^*_E t$. The equational theory of $E$ is the set of equations that are logical consequences of $E$. The minimal Herbrand model (often called minimal model) $I_E$ of a set of equations $E$ is the quotient algebra $T(\Sigma)/\!\leftrightarrow^*_E$. We say that a first-order equation $s = t$ is an inductive consequence of a set of equations (or rewrite rules) $E$ iff $I_E \models s = t$, i.e. $\sigma(s) \leftrightarrow^*_E \sigma(t)$ for all grounding substitutions $\sigma$ for $t$ and $s$. The set of all inductive consequences of $E$ is called the inductive theory of $E$. Inductive consequences of $E$ will also be called inductive theorems in what follows.
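The difference between the equational theory and the inductive theory can be seen on a small, standard illustration. The choice of the system below (the two plus rules of Example 1.1 read as equations over the signature 0, s, plus) is an assumption made here for the sake of the example.

```latex
% Assumed illustration: E consists of the two plus-rules of Example 1.1.
\[
  E = \{\; plus(0,y) = y, \quad plus(s(x),y) = s(plus(x,y)) \;\}
\]
% Every ground term is E-equal to a numeral s^n(0), and by induction on n:
\[
  plus(s^n(0),0) \;\leftrightarrow^*_E\; s^n(0) \quad (n \ge 0)
  \qquad\Longrightarrow\qquad
  I_E \models plus(x,0) = x
\]
% However, plus(x,0) and x are distinct normal forms of the confluent and
% terminating rules obtained by orienting E from left to right, hence:
\[
  E \nvdash plus(x,0) = x
\]
```

So plus(x,0) = x is an inductive theorem of E without being a logical consequence of E.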

3 Redundant arguments

The redundancy of an argument of a function f in a TRS $\mathcal{R}$ depends on the semantic properties of $\mathcal{R}$ that we are interested in observing. The semantics considered in this paper, which is the most commonly considered in functional programming, is the set of values (ground constructor terms) that $\mathcal{R}$ is able to produce in a finite number of rewriting steps from a ground term ($eval_\mathcal{R}(t) = \{s \in T(\mathcal{C}) \mid t \to^*_\mathcal{R} s\}$). We often omit the subscript $\mathcal{R}$ when it is clear from the context. Other semantics which are relevant to the redundancy problem are discussed in [1,2,22].

Roughly speaking, a redundant argument of a function f is an argument $t_i$ which we do not need to consider in order to compute the semantics of any call containing a subterm $f(t_1,\ldots,t_k)$.

Definition 3.1 [2] Let $\mathcal{R} = (\Sigma, R)$ be a TRS, $f \in \Sigma$, and $i \in \{1,\ldots,ar(f)\}$. The i-th argument of f is redundant (w.r.t. $eval_\mathcal{R}$) if, for all contexts $C[\ ]$ and for all $t, s \in T(\Sigma)$ such that $root(t) = f$, $eval_\mathcal{R}(C[t]) = eval_\mathcal{R}(C[t[s]_i])$.

We denote by $rarg_{eval_\mathcal{R}}(f)$ the set of redundant arguments of a symbol $f \in \Sigma$ w.r.t. the semantics $eval_\mathcal{R}$ for $\mathcal{R}$.

When analyzing a property of a function f in R, it is useful to get rid of the contexts and perform easier, local analyses which allow us to center the attention on the syntactic structure of the rewriting rules. This motivates the following.

Definition 3.2 [2] Let $\mathcal{R} = (\Sigma, R)$ be a TRS, $f \in \Sigma$, and $i \in \{1,\ldots,ar(f)\}$. The i-th argument of f is locally redundant (w.r.t. $eval_\mathcal{R}$) if, for all $t, s \in T(\Sigma)$ such that $root(t) = f$, $eval_\mathcal{R}(t) = eval_\mathcal{R}(t[s]_i)$.

We denote by $lrarg_{eval_\mathcal{R}}(f)$ the set of locally redundant arguments of a symbol f w.r.t. $eval_\mathcal{R}$.

Redundancy of an argument w.r.t. the semantics eval implies local redundancy w.r.t. eval, i.e., $rarg_{eval}(f) \subseteq lrarg_{eval}(f)$. Unfortunately, the converse statement is not generally true. The following result in [2] ensures that local redundancy implies redundancy when the TRS is ground confluent and sufficiently complete.

Theorem 3.3 Let R be a ground confluent and sufficiently complete TRS. Then, for all $f \in \Sigma$, $lrarg_{eval}(f) = rarg_{eval}(f)$.

Now the question of how to single out locally redundant arguments arises. In order to tackle this problem, we formalize the redundancy problem in terms of the inductive theory of the program.

3.1 Inductive theorems expressing redundancy of arguments

The following results formalize the relation between inductive theorems and redundancy of arguments in confluent and sufficiently complete TRSs.

Proposition 3.4 Let R be a confluent TRS, $f \in \Sigma$, and $i \in \{1,\ldots,ar(f)\}$. The i-th argument of f is locally redundant (w.r.t. eval) if the equation $t = t[y]_i$ is an inductive theorem of R, where $t = f(x_1,\ldots,x_{ar(f)})$ and $x_1,\ldots,x_{ar(f)}, y$ are distinct variables.

Theorem 3.5 Let R be a confluent and sufficiently complete TRS, $f \in \Sigma$, and $i \in \{1,\ldots,ar(f)\}$. The i-th argument of f is redundant (w.r.t. eval) iff the equation $t = t[y]_i$ is an inductive theorem of R, where $t = f(x_1,\ldots,x_{ar(f)})$ and $x_1,\ldots,x_{ar(f)}, y$ are distinct variables.
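The equation of Theorem 3.5 is easy to generate mechanically. Below is a minimal sketch, assuming terms are encoded as tuples `(symbol, args...)` with variables as strings; the function name `redundancy_equation` and the variable names x1, ..., xk, y are choices made for this example.

```python
def redundancy_equation(f, arity, i):
    """Return the pair (t, t[y]_i) with t = f(x1,...,x_arity) and 1 <= i <= arity."""
    if not 1 <= i <= arity:
        raise ValueError("argument index out of range")
    xs = tuple(f"x{k}" for k in range(1, arity + 1))
    t = (f,) + xs
    # Replace the i-th argument by the fresh variable y.
    return t, t[:i] + ("y",) + t[i + 1:]
```

For the second argument of minusplus this yields minusplus(x1,x2) = minusplus(x1,y), which is the conjecture to be proved inductively.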

Hence, for confluent, sufficiently complete TRSs, the redundancy problem is reduced to the problem of checking validity of a particular class of inductive theorems. The problem of identifying the inductive theory of a TRS is in general undecidable, as shown by [11] even for a very restricted class of TRSs: finite, canonical, left- and right-linear, and right monadic (right hand sides have depth at most 1) CSs. However, several methods for (semi)-automatically proving validity of inductive theorems have been developed, such as the cover set method [28], test set method [9], rewriting induction method [25], and inductionless induction method [10,11], which generalizes the former ones. Also, the abstract rewriting method of [8] can be used for proving inductive theorems.

In the following, we show how both the inductionless induction and the abstract rewriting methods can be successfully applied for detecting redundancy of arguments in cases where the methods discussed in [2] fail.

4 Inductionless induction

We briefly recall the inductionless induction method for proving validity of inductive theorems (see [10,11] for details). Inductionless induction tackles how to (semi)-automatically prove a set of equations $C$ in the minimal Herbrand model of a set of equations $E$ without making use of induction schemes (induction rules). It uses a (first-order) axiomatization $A$ of the minimal model $I_E$ of $E$ such that $C \cup A \cup E$ is consistent if and only if $C$ is valid in $I_E$, i.e. $I_E \models C$. A normal axiomatization $A$ of $I_E$ is a finite recursive set of purely universal formulas such that $I_E \models A$, $I_E$ is the only Herbrand model of $E \cup A$ up to isomorphism, and for all ground terms $s, t$ that are representatives of their congruence classes in $I_E$, $s \not\equiv t \Rightarrow A \models s \neq t$. The method relies on saturation techniques [5,6] for performing the proof by consistency of $C \cup A \cup E$; thus any saturation-based general-purpose first-order theorem prover can be used for inductive validity. The (in)consistency proofs are performed in two stages: first, deductions on $C \cup E$ are computed by saturation, yielding new consequences; then, these new consequences are checked for inconsistency w.r.t. $A$.

Deductions are performed by superposition, defined by the following inference rules between the set of equations $E$ and an equation $c \in C$. We assume below that $\succ$ is a reduction ordering [14] which is total on ground terms, i.e. a relation $\succ$ which is irreflexive, transitive, well-founded, total, monotonic, and stable under substitutions. A well-known reduction ordering is the recursive path ordering, based on a total ordering (called precedence) on $\Sigma$.

Superposition: $\dfrac{l = r \qquad c[s]}{\sigma(c[r])}$ if $\sigma = mgu(l, s)$, $s$ is not a variable, $\sigma(r) \not\succeq \sigma(l)$, $l = r \in E$, and $c[s] \in C$.

Equality resolution: $\dfrac{C \lor s \neq t}{\sigma(C)}$ if $\sigma = mgu(s, t)$ and the clause $C \lor s \neq t$ belongs to the set of conjectures.

Given a ground equation $c$, $C_{\prec c}$ is the set of ground instances of equations in $C$ that are strictly smaller than $c$ in this ordering. A ground equation (conjecture) $c$ is entailed by a set of equations (conjectures) $C$ if $E \cup A \cup C_{\prec c} \vdash c$. A non-ground equation is entailed if all its ground instances are. An inference is redundant if one of its premises or its conclusion is entailed by $C$. A set of equations is saturated if all inferences are redundant. A derivation sequence is a sequence $C_0, C_1, \ldots, C_n, \ldots$ such that each $C_{i+1}$ is obtained from $C_i$ by adding some logical consequences or by removing some entailed equations. A derivation sequence is fair if every equation which can be persistently derived is eventually derived.

The application of inductionless induction to the redundancy problem is illustrated in the following.

Example 4.1 Consider the optimized TRS of Example 1.1, which is saturated and can be oriented using the recursive path ordering with the precedence minusplus > minusplus2 > p > s > 0, and the axiomatization $\{\forall x, y.\ s(x) \neq 0 \wedge (s(x) = s(y) \Rightarrow x = y)\}$.

We can prove the redundancy of the second argument of minusplus by proving the validity of the equation:

($c_1$) minusplus(x,y) = minusplus(x,w)

We have two possible inferences by saturation; the other ones are renamings:

($c_{1,1}$) x = minusplus(x,w)
($c_{1,2}$) p(minusplus2(x,y)) = minusplus(x,w)

Here, no equation is entailed. After superposing again, we obtain (up to renaming):

($c_{1,3}$) x = x
($c_{1,4}$) x = p(minusplus2(x,w))
($c_{1,5}$) p(minusplus2(x,y)) = x
($c_{1,6}$) p(minusplus2(x,y)) = p(minusplus2(x,w))

Here, equation $c_{1,3}$ is trivially entailed, $c_{1,4} \vdash c_{1,5}$, and $c_{1,4} \vdash c_{1,6}$. If we superpose one more time, we obtain (up to renaming):

($c_{1,7}$) x = p(x)
($c_{1,8}$) x = p(p(minusplus2(x,w)))

Here, equation $c_{1,8}$ is entailed: $c_{1,4} \cup c_{1,7} \vdash c_{1,8}$. And, finally, after superposing, we obtain:

($c_{1,9}$) 0 = 0
($c_{1,10}$) s(x) = s(x)

And here, the two equations are trivially entailed.

Then, the set $R \cup \{c_1, c_{1,1}, c_{1,2}, c_{1,4}, c_{1,7}\}$ is saturated and it is immediate to check its consistency w.r.t. the axiomatization. Therefore, the theorem is proved and hence the second argument of minusplus is redundant. We have checked this automatically by using the theorem prover Spike [9], which implements a particular implicit induction technique, namely the one based on test sets (see Subsection A.4 for the practical execution in Spike).
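The orientation claim at the start of Example 4.1 can be checked mechanically. The sketch below implements the lexicographic path ordering, which is one concrete instance of a recursive path ordering; the choice of left-to-right lexicographic status for every symbol is an assumption of this example, since the example only states the precedence. Terms are encoded as tuples `(symbol, args...)` with variables as strings, and the names `lpo_gt`, `PREC` and `RULES` are hypothetical.

```python
def variables(t):
    """Set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def lpo_gt(s, t, prec):
    """Decide s >_lpo t; prec maps each symbol to an int (larger means greater)."""
    if isinstance(s, str):
        return False  # a variable is never greater than anything
    if isinstance(t, str):
        return t in variables(s)  # s is not a variable, so s differs from t
    # (a) some argument of s is equal to or greater than t
    if any(si == t or lpo_gt(si, t, prec) for si in s[1:]):
        return True
    # Both remaining cases require s to dominate every argument of t.
    if not all(lpo_gt(s, tj, prec) for tj in t[1:]):
        return False
    # (b) the root of s is greater in the precedence
    if prec[s[0]] > prec[t[0]]:
        return True
    # (c) equal roots: compare the arguments lexicographically, left to right
    if s[0] == t[0]:
        for si, ti in zip(s[1:], t[1:]):
            if si != ti:
                return lpo_gt(si, ti, prec)
    return False

# Precedence of Example 4.1: minusplus > minusplus2 > p > s > 0.
PREC = {"minusplus": 4, "minusplus2": 3, "p": 2, "s": 1, "0": 0}
Z = ("0",)

# Optimized program of Example 1.1 as (lhs, rhs) pairs.
RULES = [
    (("minusplus", "x", Z), "x"),
    (("minusplus", "x", ("s", "y")), ("p", ("minusplus2", "x", "y"))),
    (("minusplus2", "x", Z), "x"),
    (("minusplus2", "x", ("s", "y")), ("p", ("minusplus2", "x", "y"))),
    (("p", Z), Z),
    (("p", ("s", "x")), ("s", "x")),
]

def all_oriented(rules, prec):
    """True if every rule decreases from left to right in the ordering."""
    return all(lpo_gt(lhs, rhs, prec) for lhs, rhs in rules)
```

Under these assumptions every rule of the optimized program decreases from left to right, which is what the saturation process relies on.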

Unfortunately, in many interesting cases, existing methods for inductive validity may run forever without proving validity of inductive theorems as in the following example.

Example 4.2 Let R be the following saturated TRS oriented using the recursive path ordering with the precedence f > id > s > 0 and status left to right for f:

f(0,0) → 0                        id(0) → 0
f(0,s(y)) → f(0,y)                id(s(x)) → s(id(x))
f(s(x),0) → f(x,0)
f(s(x),s(y)) → f(x,s(id(y)))

Consider the axiomatization of Example 4.1. The first and second arguments of f are redundant, but this cannot be detected by the redundancy results of [2]. In order to prove the redundancy of the first argument of f, we consider the conjecture ($c_1$) f(x,y) = f(z,y). Superposing $c_1$ with R, we get the following (the other equations are renamings):

($c_{1,1}$) 0 = f(z,0)
($c_{1,2}$) f(0,y) = f(z,s(y))
($c_{1,3}$) f(x,0) = f(z,0)
($c_{1,4}$) f(x,s(id(y))) = f(z,s(y))

Here, equation $c_{1,3}$ is entailed. Superposing $c_{1,1}$, $c_{1,2}$ and $c_{1,4}$ with R, we obtain (up to renaming):

($c_{1,5}$) 0 = f(z,s(0))
($c_{1,6}$) f(0,y) = f(z,s(s(y)))
($c_{1,7}$) 0 = f(z,0)
($c_{1,8}$) f(0,y) = f(0,y)
($c_{1,9}$) f(0,y) = f(z,s(id(y)))
($c_{1,10}$) f(x,s(id(y))) = f(z,s(id(y)))
($c_{1,11}$) f(x,s(0)) = f(z,s(0))
($c_{1,12}$) f(x,s(s(id(y)))) = f(z,s(s(y)))

Here, all equations except $c_{1,12}$ are entailed. Thus, we can obtain infinitely many equations $f(x, s^n(id(y))) = f(z, s^n(y))$ and the process may run forever unless an extra lemma id(x) = x is manually provided.
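The effect of the extra lemma can be sketched as follows. This is an informal reading of why the lemma helps, under the assumption that it is oriented as a simplification rule id(y) → y; it is not a transcript of an actual prover run.

```latex
% Divergent family of Example 4.2, for n >= 1:
\[
  f(x, s^n(id(y))) = f(z, s^n(y))
\]
% Simplifying the left-hand side with the lemma id(y) -> y gives
\[
  f(x, s^n(y)) = f(z, s^n(y))
\]
% which is the instance of the conjecture (c_1): f(x,y) = f(z,y)
% under the substitution y \mapsto s^n(y).
```

Each member of the family thus collapses to an instance of the original conjecture, so no genuinely new equation has to be generated.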

Several techniques to improve termination of the inductive validity process have been developed, such as deduction of lemmas which might help to prove an inductive theorem [26,17,16]. On the other hand, different criteria can be used to stop the saturation process, such as the homeomorphic embedding (which is commonly used in program transformation for avoiding infinite sequences [3]). Unfortunately, important properties, such as refutational completeness or finite saturation under common conditions, get lost.

Therefore, it is interesting to consider decidable classes of TRSs where the inductionless induction method terminates, and thus, redundancy of arguments can be decided. In the next section we present a result for the decidability of redundancy based on inductionless induction, which is complementary to the decidability result of [2]. We postpone to Section 5 the use of finite approximations based on abstract interpretation, such as the abstract rewriting of [8], to formalize static analyses of redundancy.

4.1 Standard Theories

In this section we consider the standard theories of [23], a class of TRSs where the saturation process is finite, thus the validity of an inductive theorem is decidable.

Standard theories are particular sets of equations which are finitely closed by superposition. We need the following: the depth $d$ of a subterm $s = t|_p$ is the length of the position $p$: $d = |p|$; and a variable is shallow in a term if it occurs only at depth 0 or 1 in the term.

Definition 4.3 [23] A standard signature $\Sigma$ is a signature where every function symbol f in $\Sigma$ has an associated set of shallow positions $sh(f)$ and a set of linear positions $lin(f)$, such that $lin(f) \cap sh(f) = \emptyset$ and $sh(f) \cup lin(f) = \{1,\ldots,ar(f)\}$.

Definition 4.4 [23] A term $s$ is a standard term iff it is a variable or a term of the form $f(s_1,\ldots,s_n)$ where if $i \in sh(f)$ then $s_i$ is a variable or a ground term and if $i \in lin(f)$ then all variables in $s_i$ are linear in $s$.

Note that, according to the previous definition, all ground terms are standard, whereas not every linear term is, because no term with variables occurring at depth > 1 is allowed at shallow positions. Furthermore, the only non-linear variables of a standard term are shallow variables occurring at shallow positions.
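Definition 4.4 can be turned into a simple check. The sketch below follows the definition literally as stated here (it inspects only the top-level arguments of the term), which is a reading of this text and not necessarily the full definition of [23]. Terms are encoded as tuples `(symbol, args...)` with variables as strings; since lin(f) is the complement of sh(f), only the shallow positions are passed, as a dict `sh` from symbols to sets of 1-based argument indices. All names are hypothetical.

```python
from collections import Counter

def var_occurrences(t):
    """List of variable occurrences in t, with repetitions."""
    if isinstance(t, str):
        return [t]
    return [v for a in t[1:] for v in var_occurrences(a)]

def is_ground(t):
    return not var_occurrences(t)

def is_standard(t, sh):
    """Check Definition 4.4 for t; positions not in sh[f] are treated as linear."""
    if isinstance(t, str):
        return True  # variables are standard terms
    f, args = t[0], t[1:]
    count = Counter(var_occurrences(t))
    for i, s_i in enumerate(args, start=1):
        if i in sh.get(f, set()):
            # Shallow position: the argument must be a variable or a ground term.
            if not (isinstance(s_i, str) or is_ground(s_i)):
                return False
        else:
            # Linear position: each variable of the argument occurs once in t.
            if any(count[v] > 1 for v in var_occurrences(s_i)):
                return False
    return True
```

With sh(f) = {1}, the term f(x, g(y)) passes the check, while f(g(x), y) fails at the shallow position and f(x, g(x)) fails because x is not linear.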

Definition 4.5 [23] An equation s = t is standard iff

(i) s is linear and t is ground or

(ii) s is a standard term f (..., g(t),...) and t is a variable or

(iii) s and t are standard terms sharing only shallow variables and no variable x is both a shallow position argument and a linear position argument in s = t.

A standard presentation is a set of standard equations and a standard theory is a theory axiomatizable by a standard presentation.

Theorem 4.6 [23] Every standard presentation E can be finitely closed under superposition.

Hence, for standard theories, the saturation process is finite and then the inductionless induction method (hence, the redundancy of arguments) is decidable. Note that we naturally specialize the notion of standard presentations (as originally defined in [23]) to "standard TRSs".

Theorem 4.7 Let R be a standard confluent TRS. Let the equation t = s be standard (within R). It is decidable if t = s is an inductive theorem of R.

Corollary 4.8 Let R be a standard, confluent, and sufficiently complete TRS, $f \in \Sigma$, and $i \in \{1,\ldots,ar(f)\}$. Let the equation $t = t[y]_i$ be standard (within R) such that $t = f(x_1,\ldots,x_{ar(f)})$ and $x_1,\ldots,x_{ar(f)}, y$ are distinct variables. It is decidable if the i-th argument of f is redundant (w.r.t. eval).

Even if the class of standard theories is somewhat restrictive, it still allows redundancy of arguments to be detected in significant examples.

Example 4.9 Consider the following TRS, where extra variables are allowed in right hand sides.

f(0,y) → y               g(0,y) → y
f(s(x),y) → g(u,y)       g(s(x),y) → f(u,y)

Here, we can automatically prove that the first argument of f (and g) is redundant. Note that this example cannot be dealt with by previous results in [2].

We consider the problem of identifying new decidable classes of TRSs (w.r.t. the particular class of inductive theorems which express redundancy), as well as developing new decision algorithms for these programs, as an interesting line of work which we plan to pursue in the future. In the following section, however, instead of focusing on deeper decidability matters, we investigate finite approximations of the validity problem which lead us to formalize more practical static redundancy analyses. As an application of the analysis, we revisit Example 4.2, which is shown to be correctly analyzed by applying the new methodology based on abstract rewriting, whereas it cannot be handled by (unoptimized) inductionless induction.

5 Abstract rewriting

Abstract interpretation is a theory to extract relevant information from programs without considering all details given by the standard semantics [12]. In [8], Bert and Echahed proposed a framework called abstract rewriting which is based on an abstract interpretation of (conditional) term rewriting systems for approximating the normal form $t\!\downarrow_\mathcal{R}$ of a term $t$ in a canonical CS $\mathcal{R}$. They make use of the notion of an abstract domain of terms. In this section, we use the technique of abstract rewriting in order to prove inductive theorems, and thus to detect redundant arguments, in the setting of canonical and sufficiently complete CS's. We first recall the abstract rewriting methodology of [8].

Definition 5.1 [8] Let $\Sigma$ be a signature. The abstract specification of $\Sigma$ is $A(\Sigma) = \Sigma \cup \{\top, \bot, \sqcup, \sqcap\}$.

Intuitively, an abstract term $t$ approximates the set of its ground instances, where the symbols $\bot$ and $\top$ stand for the empty set and the set of all constructor terms, respectively. Similarly, the symbols $\sqcup$ and $\sqcap$ correspond to set union and set intersection operators, respectively.

Let $\_^\alpha : \Sigma \to A(\Sigma)$ be the obvious identity signature morphism between the concrete and the abstract signatures. The signature morphism $\_^\alpha$ is extended to a translation function on terms $\_^\alpha : T(\Sigma, X) \to T(A(\Sigma))$ such that $x^\alpha = \top$ for all $x \in X$ and $(f(t_1,\ldots,t_n))^\alpha = f^\alpha(t_1^\alpha,\ldots,t_n^\alpha)$ for all $f \in \Sigma$. Besides, given an abstract term $t$, the concrete set $\gamma(t)$ is the largest set of ground terms such that $\gamma(\top) = T(\Sigma)$, $\gamma(\bot) = \emptyset$, $\gamma(f^\alpha(t_1,\ldots,t_{ar(f)})) = \{f(s_1,\ldots,s_{ar(f)}) \mid \forall 1 \le i \le ar(f),\ s_i \in \gamma(t_i)\}$, $\gamma(t_1 \sqcup t_2) = \gamma(t_1) \cup \gamma(t_2)$, and $\gamma(t_1 \sqcap t_2) = \gamma(t_1) \cap \gamma(t_2)$.

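In order to approximate normal forms of terms, a partial order $\le$ is defined on abstract terms such that $t \le t'$ iff $\gamma(t) \subseteq \gamma(t')$. Given a term $s \in T(\Sigma, X)$, $t \in T(A(\Sigma))$ is an approximation of $s$ iff $\forall s' \in T(\Sigma)$ such that $(s')^\alpha \le s^\alpha$, $(s'\!\downarrow_\mathcal{R})^\alpha \le t$, or equivalently $\{\sigma(s)\!\downarrow_\mathcal{R} \mid \sigma(s) \in T(\Sigma)\} \subseteq \gamma(t)$.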
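The concretization function and the induced ordering can be sketched as a membership test. The encoding is an assumption of this example: the strings `"TOP"` and `"BOT"` stand for the top and bottom symbols, the reserved tuple heads `"join"` and `"meet"` stand for the union and intersection symbols, and any other tuple `(f, args...)` stands for an abstract application. Because the concrete sets are usually infinite, the ordering is only tested on a finite universe of ground terms, which approximates the real definition.

```python
TOP, BOT = "TOP", "BOT"

def in_gamma(s, t):
    """Is the ground term s a member of the concretization of abstract term t?"""
    if t == TOP:
        return True  # top denotes every ground term considered here
    if t == BOT:
        return False  # bottom denotes the empty set
    if t[0] == "join":
        return in_gamma(s, t[1]) or in_gamma(s, t[2])
    if t[0] == "meet":
        return in_gamma(s, t[1]) and in_gamma(s, t[2])
    # Abstract application: same root symbol and argument-wise membership.
    return (s[0] == t[0] and len(s) == len(t)
            and all(in_gamma(si, ti) for si, ti in zip(s[1:], t[1:])))

def leq_on(universe, t1, t2):
    """Test the inclusion of concretizations, restricted to a finite universe."""
    return all(in_gamma(s, t2) for s in universe if in_gamma(s, t1))
```

For example, s(0) belongs to the concretization of s applied to top, whereas 0 does not, and the intersection of 0 with s applied to top has no members.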

The set of abstract terms is larger than the set of concrete terms. In order to compute approximations of normal forms of concrete terms, finite subsets of the set of abstract terms are introduced by the so-called finite upper closures $up : T(A(\Sigma)) \to T(A(\Sigma))$ such that $up$ is monotonic ($\forall t, t' \in T(A(\Sigma)),\ t \le t' \Rightarrow up(t) \le up(t')$), extensive ($\forall t \in T(A(\Sigma)),\ t \le up(t)$), and idempotent ($up \circ up = up$). The main objective of a finite upper closure is to restrict $T(A(\mathcal{C}))$ to a finite set $T_{up}(A(\mathcal{C}))$.

The notions of abstract rewriting system and abstract rewriting calculus are defined next. An abstract rewriting system is associated to a TRS R and a finite upper closure $up$ in order to approximate the normal forms of ground instances of concrete terms with variables. Concretely, a "computed" abstract TRS efficiently determines an approximation for any concrete term. Due to the properties of abstract terms and the ordering $\le$, the classical definition of rewriting is extended to the abstract rewriting calculus (see [8] for details).

Now, we can exploit abstract rewriting for proving inductive theorems, which demonstrates redundancy of arguments. Let us first introduce some auxiliary results.

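Definition 5.2 We define the set of up-minimal substitutions as: $Subst_{up}(\mathcal{C}, X) = \{\sigma \in Subst(\mathcal{C}, X) \mid \forall \sigma' \in Subst(\mathcal{C}, X) \wedge \forall x \in X,\ \sigma'(x)^\alpha \le \sigma(x)^\alpha \Rightarrow up(\sigma'(x)^\alpha) = \sigma(x)^\alpha\}$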

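Theorem 5.3 Let R be a left linear, canonical, sufficiently complete CS, and $up$ be a finite upper closure. Let $\mathcal{R}^{up}$ be the "computed" abstract TRS associated to R and $up$. The equation $s = t$ is an inductive theorem of R if for all $\sigma \in Subst_{up}(\mathcal{C}, X)$, $up(\sigma(s)^\alpha)\!\downarrow_{\mathcal{R}^{up}} = up(\sigma(t)^\alpha)\!\downarrow_{\mathcal{R}^{up}} = \delta$ such that $\delta \in T(A(\mathcal{C}) - \{\top, \bot, \sqcup, \sqcap\})$.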

The following result is the key for the detection of redundant arguments.

Corollary 5.4 Let R be a left linear, canonical, sufficiently complete CS, and $up$ be a finite upper closure. Let $\mathcal{R}^{up}$ be the "computed" abstract TRS associated to R and $up$. Let $f \in \Sigma$ and $i \in \{1,\ldots,ar(f)\}$. The i-th argument of f is redundant (w.r.t. eval) if for all $\sigma \in Subst_{up}(\mathcal{C}, X)$, $up(\sigma(t)^\alpha)\!\downarrow_{\mathcal{R}^{up}} = up(\sigma(t[y]_i)^\alpha)\!\downarrow_{\mathcal{R}^{up}} = \delta$ such that $\delta \in T(A(\mathcal{C}) - \{\top, \bot, \sqcup, \sqcap\})$, where $t = f(x_1,\ldots,x_{ar(f)})$ and $x_1,\ldots,x_{ar(f)}, y$ are distinct variables.

Clearly, the analysis depends on the chosen upper closure up. A standard upper closure is defined by taking the maximum depth of all constructor terms of left-hand sides of a TRS.

Example 5.5 Consider the TRS of Example 4.2 and the finite upper closure head defined by $head(0^\alpha) = 0^\alpha$, $head(s(t)) = s(\top)$ if $t \in T(\mathcal{C})$, and $head(f(t_1,\ldots,t_n)) = f(head(t_1),\ldots,head(t_n))$ otherwise. We can prove by abstract rewriting that both arguments of f are redundant. The "computed" abstract TRS for R and head is:

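$f^\alpha(0^\alpha, 0^\alpha) \to 0^\alpha$
$f^\alpha(0^\alpha, s^\alpha(\top)) \to 0^\alpha$
$f^\alpha(s^\alpha(\top), 0^\alpha) \to 0^\alpha$
$f^\alpha(s^\alpha(\top), s^\alpha(\top)) \to 0^\alpha$
$id^\alpha(0^\alpha) \to 0^\alpha$
$id^\alpha(s^\alpha(\top)) \to s^\alpha(\top)$

Then, the first and second arguments of f are redundant since for the equations f(x,y) = f(x',y) and f(x,y) = f(x,y'), and every substitution $\sigma \in Subst_{head}(\mathcal{C}, X)$, $\sigma(f(x,y))^\alpha\!\downarrow_{\mathcal{R}^{up}} = \sigma(f(x',y))^\alpha\!\downarrow_{\mathcal{R}^{up}} = 0^\alpha$ and $\sigma(f(x,y))^\alpha\!\downarrow_{\mathcal{R}^{up}} = \sigma(f(x,y'))^\alpha\!\downarrow_{\mathcal{R}^{up}} = 0^\alpha$. Note that this example cannot be handled by the inductionless induction method nor by previously discussed methods such as [1,2].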
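The computation of Example 5.5 can be imitated with a small table-driven sketch. It makes several simplifying assumptions that are not taken from [8]: abstract rewriting is reduced to innermost lookup of syntactically equal left-hand sides in the computed abstract TRS, `head` maps the top symbol to itself, and the up-minimal substitutions for head are taken to assign each variable one of the two abstract values 0 and s applied to top. Abstract terms are tuples `(symbol, args...)`, with the string `"TOP"` for the top symbol.

```python
from itertools import product

TOP = "TOP"
ZERO = ("0",)
S_TOP = ("s", TOP)
CONSTRUCTORS = {"0", "s"}

def is_constructor_term(t):
    """Abstract constructor term: built from 0, s and TOP only."""
    if t == TOP:
        return True
    return t[0] in CONSTRUCTORS and all(is_constructor_term(a) for a in t[1:])

def head(t):
    """Upper closure of Example 5.5 (head(TOP) = TOP is an assumption)."""
    if t == TOP or t == ZERO:
        return t
    if t[0] == "s" and is_constructor_term(t[1]):
        return S_TOP
    return (t[0],) + tuple(head(a) for a in t[1:])

# Computed abstract TRS of Example 5.5, as a lookup table.
ABSTRACT_RULES = {
    ("f", ZERO, ZERO): ZERO,
    ("f", ZERO, S_TOP): ZERO,
    ("f", S_TOP, ZERO): ZERO,
    ("f", S_TOP, S_TOP): ZERO,
    ("id", ZERO): ZERO,
    ("id", S_TOP): S_TOP,
}

def abstract_normal_form(t):
    """Innermost normalization by exact lookup (a simplification of [8])."""
    if t == TOP:
        return t
    t = (t[0],) + tuple(abstract_normal_form(a) for a in t[1:])
    return ABSTRACT_RULES.get(t, t)

def contains_top(t):
    return t == TOP or any(contains_top(a) for a in t[1:])

def f_argument_redundant(i):
    """Check the condition of Corollary 5.4 for the i-th argument of f."""
    values = [ZERO, S_TOP]
    for a, b, fresh in product(values, repeat=3):
        args = [a, b]
        left = abstract_normal_form(head(("f",) + tuple(args)))
        args[i - 1] = fresh  # replace the i-th argument
        right = abstract_normal_form(head(("f",) + tuple(args)))
        if left != right or contains_top(left):
            return False
    return True
```

Under these assumptions both `f_argument_redundant(1)` and `f_argument_redundant(2)` hold, matching the conclusion of the example.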

For the sake of clarity, let us finally show, by means of one example, that there still exist interesting cases where the method in [2] succeeds whereas none of the results in this paper apply. This demonstrates that the different methods are incomparable and hence could be fruitfully combined to develop a practical tool for the detection of redundant arguments.

Example 5.6 Consider the following TRS R which is a slight modification of Example 3.26 of [4]:

h(0,0) → s(0)
h(0,s(y)) → s(0)
h(s(0),y) → s(s(0))
h(s(s(x)),y) → s(h(h(x,y),y))

The inductionless induction method cannot automatically prove that the second argument of h is redundant, since there is no automatizable reduction ordering for the TRS which could orient the equations (see [4]). On the other hand, abstract rewriting cannot prove the redundancy of the second argument of h, since for any finite upper closure there exists a value for the second argument which returns a term containing $\top$, e.g. using the upper closure head of Example 5.5, $h(s(\top), 0)\!\downarrow_{\mathcal{R}^{up}} = s(\top)$. Nevertheless, the method in [2] succeeds in proving that the second argument of h is redundant, since all variables of the second argument appear in positions of redundant arguments of the rhs of the corresponding rule and $s(0) \downarrow s(0)$.

6 Conclusion

We have shown how the problem of detecting redundant arguments reduces to that of validity of inductive theorems in confluent, sufficiently complete TRSs. As the set of inductive theorems is not recursively enumerable in general, we identify a class of rewrite systems in which detection of redundant arguments is decidable. We have also shown how "inductionless induction" as well as "abstract rewriting" techniques can be applied to detect redundant arguments and particularly in some examples that cannot be handled by previously developed methods.

However, the natural question arises whether methods for proving inductive validity can be specialized to the concrete problem of redundancy. In future work, we plan to investigate this point further and to integrate the methods described in this paper into the prototype tool presented in [2].

References

[1] M. Alpuente, S. Escobar, and S. Lucas. Redundant Arguments in Term Rewriting. In 9th Int'l Workshop on Functional and Logic Programming (WFLP2000), SPUPV 2000.2039, pages 309-323, 2000. 1, 3, 5.5

[2] M. Alpuente, S. Escobar, and S. Lucas. Removing Redundant Arguments of Functions. In 9th International Conference on Algebraic Methodology And Software Technology, AMAST 2002, volume 2422 of LNCS, pages 117-131. Springer-Verlag, 2002. 1, 1, 1.1, 3, 3.1, 3.2, 3, 3.1, 4.2, 4, 4.9, 5.5, 5, 5.6, 6

[3] M. Alpuente, M. Falaschi, and G. Vidal. Partial Evaluation of Functional Logic Programs. ACM Toplas, 20(4):768-844, 1998. 1.1, 4, 1

[4] T. Arts and J. Giesl. A collection of examples for termination of term rewriting using dependency pairs. Technical Report AIB-2001-09, RWTH Aachen, Germany, 2001. 5.6

[5] L. Bachmair and H. Ganzinger. Rewrite-based equational theorem proving with selection and simplification. Journal of Logic and Computation, 4(3):217-247, June 1994. 4

[6] L. Bachmair and H. Ganzinger. Resolution theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 1, chapter 2, pages 19-99. Elsevier Science, 2001. 2, 4

[7] S. Berardi, M. Coppo, F. Damiani, and P. Giannini. Type-based useless-code elimination for functional programs. In Proceedings of SAIG 2000, volume 1924 of LNCS, pages 172-189. Springer-Verlag, 2000. 1, 1.1

[8] D. Bert and R. Echahed. Abstraction of Conditional Term Rewriting Systems. In J. Lloyd, editor, International Symposium on Logic Programming, ILPS-95, Portland, Oregon, pages 162-176. MIT Press, 1995. 1.1, 1, 1.1, 3.1, 4, 5, 5.1, 5

[9] A. Bouhoula and M. Rusinowitch. Implicit induction in conditional theories. Journal of Automated Reasoning, 14:189-235, 1995. 3.1, 4.1, 2

[10] H. Comon. Inductionless induction. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 1, chapter 14, pages 913-962. Elsevier Science, 2001. 1, 1.1, 3.1, 4

[11] H. Comon and R. Nieuwenhuis. Induction = I-Axiomatization + First-Order Consistency. Information and Computation, 159(1/2):151-186, 2000. 3.1, 4

[12] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 238-252, Los Angeles, California, 1977. ACM Press, New York, NY. 5

[13] P. Cousot and R. Cousot. Higher-order abstract interpretation (and application to comportment analysis generalizing strictness, termination, projection and PER analysis of functional languages), invited paper. In Proceedings of the 1994 International Conference on Computer Languages, ICCL'94, pages 95-112, Toulouse, France, 16-19 May 1994. IEEE Computer Society Press, Los Alamitos, California. 1

[14] N. Dershowitz and D. Plaisted. Rewriting. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 1, chapter 9, pages 535-610. Elsevier Science, 2001. 2, 4

[15] J. Hughes. Backwards Analysis of Functional Programs. In D. Bjørner, A.P. Ershov, and N.D. Jones, editors, IFIP Workshop on Partial Evaluation and Mixed Computation, pages 187-208, 1988. 1

[16] A. Ireland and A. Bundy. Using failure to guide inductive proofs. Automated Reasoning, 16:38-35, 1996. 4

[17] D. Kapur and M. Subramaniam. Lemma discovery in automating induction. In Proc. of CADE'96, LNCS 914, pages 403-407. Springer-Verlag, Berlin, 1996.

[18] N. Kobayashi. Type-based useless variable elimination. In Proceedings of PEPM-00, pages 84-93. ACM Press, 2000. 1, 1.1

[19] M. Leuschel. The ecce partial deduction system and the dppd library of benchmarks. Accessible via http://www.cs.kuleuven.ac.be/~lpai., 1998. 1.1

[20] M. Leuschel and M. H. Sørensen. Redundant Argument Filtering of Logic Programs. In John Gallagher, editor, Proceedings of the 6th International Workshop on Logic Program Synthesis and Transformation (LOPSTR'96), volume 1207 of Lecture Notes in Computer Science, pages 83-103, Stockholm, Sweden, August 1996. Springer-Verlag, Berlin. 1

[21] Y. A. Liu and S. D. Stoller. Eliminating dead code on recursive data. Science of Computer Programming, 2002. Preliminary version in Proc. of SAS'99, LNCS 1694:211-231. Springer-Verlag, Berlin, 1999. 1, 1.1

[22] S. Lucas. Transfinite rewriting semantics for term rewriting systems. In Proc. of 12th International Conference on Rewriting Techniques and Applications, RTA01, LNCS 2051, pages 216-230. Springer-Verlag, Berlin, 2001. 3

[23] R. Nieuwenhuis. Basic paramodulation and decidable theories. In Proceedings of the Eleventh Annual IEEE Symposium On Logic In Computer Science (LICS'96), pages 473-483, New York, USA, 1996. IEEE Computer Society Press. 1, 4.1, 4.3, 4.4, 4.5, 4.6, 4.1

[24] A. Pettorossi and M. Proietti. Transformation of logic programs: Foundations and techniques. Journal of Logic Programming, pages 261-320, 1994. 1

[25] U.S. Reddy. Term rewriting induction. In M. E. Stickel, editor, Proceedings of the 10th International Conference on Automated Deduction, volume 449 of Lecture Notes in Computer Science, pages 162-177. Springer-Verlag, 1990. 3.1

[26] T. Walsh. A divergence critic for inductive proofs. Journal of Artificial Intelligence Research, 4:209-235, 1996. 4

[27] M. Wand and I. Siveroni. Constraint systems for useless variable elimination. In Proceedings of POPL'99, LNCS, pages 291-302. Springer-Verlag, 1999. 1

[28] H. Zhang. Reduction, Superposition, and Induction: Automated Reasoning in an Equation Logic. PhD thesis, Rensselaer Polytechnic Institute, 1988. 3.1

A Sketch of execution

In this Appendix, we present the practical analysis and detection of redundant arguments in the program of Example 1.1.

First, in Subsection A.1, we express this program in the syntax of the partial evaluator^7 Indy [3]. Then, in Subsection A.2, we show the partially evaluated program, which corresponds to the optimized program of Example 1.1. Next, in order to illustrate the use of the inductionless induction method for detecting redundant arguments, the optimized program is translated into the syntax of the theorem prover^8 Spike [9] in Subsection A.3. Finally, we transcribe the inductive proof generated by Spike in Subsection A.4.

A.1 Original program in Indy

plus(:0,Y) -> Y;
plus(:s(X),Y) -> :s(plus(X,Y));

minus(X,:0) -> X;
minus(:0,:s(Y)) -> :0;
minus(:s(X),:s(Y)) -> p(:s(minus(X,Y)));

p(:0) -> :0;
p(:s(:0)) -> :0;
p(:s(:s(X))) -> :s(X);

minusplus(X,Y) -> minus(plus(Y,X),Y);
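The behaviour of the original program can be explored with a direct transliteration of the rules above into Python. This is a sketch under stated assumptions rather than the Indy semantics: the integer n encodes the constructor term s^n(0), each Python function mirrors the rules of the function symbol with the same name, and the helper minusplus_ignores_second_argument is hypothetical and only performs a bounded test.

```python
def plus(x: int, y: int) -> int:
    # plus(0,Y) -> Y ; plus(s(X),Y) -> s(plus(X,Y))
    return y if x == 0 else 1 + plus(x - 1, y)


def p(x: int) -> int:
    # p(0) -> 0 ; p(s(0)) -> 0 ; p(s(s(X))) -> s(X)
    return 0 if x <= 1 else x - 1


def minus(x: int, y: int) -> int:
    # minus(X,0) -> X ; minus(0,s(Y)) -> 0
    if y == 0:
        return x
    if x == 0:
        return 0
    # minus(s(X),s(Y)) -> p(s(minus(X,Y)))
    return p(1 + minus(x - 1, y - 1))


def minusplus(x: int, y: int) -> int:
    # minusplus(X,Y) -> minus(plus(Y,X),Y)
    return minus(plus(y, x), y)


def minusplus_ignores_second_argument(bound: int) -> bool:
    """Bounded check that minusplus(x, y) == minusplus(x, 0) for small inputs."""
    return all(minusplus(x, y) == minusplus(x, 0)
               for x in range(bound) for y in range(bound))


if __name__ == "__main__":
    print(minusplus_ignores_second_argument(15))
```

A successful bounded check only suggests that the result does not depend on the second argument of minusplus; it does not replace an inductive proof.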

A.2 Specialized program in Indy

p1_1(:0) -> :0;
p1_1(:s(A)) -> :s(A);

minusplus2_2(A,:0) -> A;
minusplus2_2(A,:s(B)) -> p1_1(minusplus2_2(A,B));

minusplus2_1(A,:0) -> A;
minusplus2_1(A,:s(B)) -> p1_1(minusplus2_2(A,B));
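To experiment with the specialized rules as a rewrite system rather than as functions, one can normalize ground terms with a small innermost rewriter. The following is a minimal sketch, not Indy's evaluation mechanism: the term encoding (nested tuples for applications, plain strings for rule variables) and the helper names match, substitute, normalize and numeral are assumptions made for illustration.

```python
ZERO = ("0",)


def s(t):
    return ("s", t)


# Rules of the specialized program, as (lhs, rhs) pairs; strings are rule variables.
RULES = [
    (("p1_1", ZERO), ZERO),
    (("p1_1", s("A")), s("A")),
    (("minusplus2_2", "A", ZERO), "A"),
    (("minusplus2_2", "A", s("B")), ("p1_1", ("minusplus2_2", "A", "B"))),
    (("minusplus2_1", "A", ZERO), "A"),
    (("minusplus2_1", "A", s("B")), ("p1_1", ("minusplus2_2", "A", "B"))),
]


def match(pattern, term, binding):
    """Extend `binding` so that the instantiated pattern equals `term`, else None."""
    if isinstance(pattern, str):                      # rule variable
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        binding = match(sub_pattern, sub_term, binding)
        if binding is None:
            return None
    return binding


def substitute(term, binding):
    """Replace the variables of `term` according to `binding`."""
    if isinstance(term, str):
        return binding.get(term, term)
    return (term[0],) + tuple(substitute(arg, binding) for arg in term[1:])


def normalize(term):
    """Innermost normalization; it terminates here because these rules terminate."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(arg) for arg in term[1:])
    for lhs, rhs in RULES:
        binding = match(lhs, term, {})
        if binding is not None:
            return normalize(substitute(rhs, binding))
    return term


def numeral(n):
    """Build the constructor term s^n(0)."""
    return ZERO if n == 0 else s(numeral(n - 1))


if __name__ == "__main__":
    print(normalize(("minusplus2_1", numeral(3), numeral(2))) == numeral(3))
```

Normalizing minusplus2_1 applied to numerals returns the first argument for the small inputs tried here, whatever the second argument is.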

A.3 Specialized program translated to Spike

specification : minusplus

sorts nat;

constructors :
0 : -> nat;
s_ : nat -> nat;

defined functions :
p_ : nat -> nat;
minusplus__ : nat nat -> nat;
minusplus2__ : nat nat -> nat;

axioms:
p(0) = 0;
p(s(x)) = s(x);
minusplus2(x,0) = x;
minusplus2(x,s(y)) = p(minusplus2(x,y));
minusplus(x,0) = x;
minusplus(x,s(y)) = p(minusplus2(x,y));

7 Available at http://www.dsic.upv.es/users/elp/indy/.

8 Available at http://www.loria.fr/equipes/cassis/softwares/spike/.

A.4 Inductive proof in Spike

Below, we include the transcription of a proving session with Spike. The theorem which proves the redundancy of the second argument of minusplus is minusplus(x1,x2)=minusplus(x1,x3). This theorem appears as the initial set E0 of theorems to prove. Some additional notes are included in the transcription to help the reader.

All the rules are oriented !

test set of R :

-> nat = {0 ; s(x1)}

induction positions of functions:

-> p : [[1]] -> minusplus : [[2]] -> minusplus2 : [[2]]

E0 = {minusplus(x1,x2) = minusplus(x1,x3)}
| E0 is the initial set of theorems to prove

Application of generate on:
minusplus(x1,x2) = minusplus(x1,x3) with cover substitutions: x2 -> {0; s(x1)}

1) x1 = minusplus(x1,x3) ;

2) p(minusplus2(x1,x2)) = minusplus(x1,x3)

E1 = {x1 = minusplus(x1,x3) ;
p(minusplus2(x1,x2)) = minusplus(x1,x3)}
| superposition

H1 = {minusplus(x1,x2) = minusplus(x1,x3)}
| H1 is the initial set of inductive hypothesis

Application of generate on: x1 = minusplus(x1,x3) with cover substitutions: x3 -> {0; s(x1)}

1) x1 = x1 ;

2) x1 = p(minusplus2(x1,x2))

Delete x1 = x1
| Equation is entailed and removed

E2 = {p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = p(minusplus2(x1,x2))}

H2 = {x1 = minusplus(x1,x3) ;
minusplus(x1,x2) = minusplus(x1,x3)}

Application of generate on:
p(minusplus2(x1,x2)) = minusplus(x1,x3) with cover substitutions: x2 -> {0; s(x1)}

1) p(x1) = minusplus(x1,x3) ;

2) p(p(minusplus2(x1,x2))) = minusplus(x1,x3)

E3 = {x1 = p(minusplus2(x1,x2)) ; p(x1) = minusplus(x1,x3) ; p(p(minusplus2(x1,x2))) = minusplus(x1,x3)}

H3 = {p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Simplification of:
p(p(minusplus2(x1,x2))) = minusplus(x1,x3) by H3 U E3[R]: p(x1) = minusplus(x1,x3)
| Simplification detects possible entailed equations

E4 = {x1 = p(minusplus2(x1,x2)) ; p(x1) = minusplus(x1,x3) ; p(x1) = minusplus(x1,x3)}

H4 = {p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Delete p(x1) = minusplus(x1,x3) it is subsumed by:p(x1) = minusplus(x1,x3) of E4
| Subsumption also detects entailed equations

E5 = {x1 = p(minusplus2(x1,x2)) ;
p(x1) = minusplus(x1,x3)}

H5 = {p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Application of generate on: x1 = p(minusplus2(x1,x2)) with cover substitutions: x2 -> {0; s(x1)}

1) x1 = p(x1) ;

2) x1 = p(p(minusplus2(x1,x2)))

E6 = {p(x1) = minusplus(x1,x3) ; x1 = p(x1) ;

x1 = p(p(minusplus2(x1,x2)))}

H6 = {x1 = p(minusplus2(x1,x2)) ;

p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Simplification of:

x1 = p(p(minusplus2(x1,x2))) by H6 U E6[R]: x1 = p(minusplus2(x1,x2))

E7 = {p(x1) = minusplus(x1,x3) ; x1 = p(x1) ;

x1 = p(minusplus2(x1,x2))}

H7 = {x1 = p(minusplus2(x1,x2)) ;

p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Delete x1 = p(minusplus2(x1,x2)) it is subsumed by:x1 = p(minusplus2(x1,x2)) of H7

E8 = {p(x1) = minusplus(x1,x3) ; x1 = p(x1)}

H8 = {x1 = p(minusplus2(x1,x2)) ;

p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Application of generate on: p(x1) = minusplus(x1,x3) with cover substitutions: x3 -> {0; s(x1)}

1) p(x1) = x1 ;

2) p(x1) = p(minusplus2(x1,x2))

E9 = {x1 = p(x1) ; p(x1) = x1 ;

p(x1) = p(minusplus2(x1,x2))}

H9 = {p(x1) = minusplus(x1,x3) ;

x1 = p(minusplus2(x1,x2)) ; p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Delete x1 = p(x1) it is subsumed by:p(x1) = x1 of E9

E10 = {p(x1) = x1 ;

p(x1) = p(minusplus2(x1,x2))}

H10 = {p(x1) = minusplus(x1,x3) ;

x1 = p(minusplus2(x1,x2)) ; p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Application of generate on: p(x1) = x1 with cover substitutions: x1 -> {0; s(x1)}

1) 0 = 0 ; 2) s(x1) = s(x1)

Delete 0 = 0

Delete s(x1) = s(x1)

E11 = {p(x1) = p(minusplus2(x1,x2))}

H11 = {p(x1) = x1 ;

p(x1) = minusplus(x1,x3) ; x1 = p(minusplus2(x1,x2)) ; p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Application of generate on:

p(x1) = p(minusplus2(x1,x2)) with cover substitutions: x2 -> {0; s(x1)}

1) p(x1) = p(x1) ;

2) p(x1) = p(p(minusplus2(x1,x2)))

Delete p(x1) = p(x1)

E12 = {p(x1) = p(p(minusplus2(x1,x2)))}

H12 = {p(x1) = p(minusplus2(x1,x2)) ; p(x1) = x1 ;

p(x1) = minusplus(x1,x3) ; x1 = p(minusplus2(x1,x2)) ; p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Simplification of:

p(x1) = p(p(minusplus2(x1,x2))) by H12 U E12[R]: p(x1) = p(x1)

E13 = {p(x1) = p(x1)}

H13 = {p(x1) = p(minusplus2(x1,x2)) ; p(x1) = x1 ;

p(x1) = minusplus(x1,x3) ; x1 = p(minusplus2(x1,x2)) ; p(minusplus2(x1,x2)) = minusplus(x1,x3) ; x1 = minusplus(x1,x3) ; minusplus(x1,x2) = minusplus(x1,x3)}

Delete p(x1) = p(x1)
| E13 becomes empty and the saturation process ends without any inconsistency

The initial conjectures are inductive theorems of R.
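The generate steps of the session above can be imitated in a few lines of Python: each step instantiates one variable of an equation by the test set {0; s(x)} and simplifies the resulting instances with the axioms of Subsection A.3. This is a hedged sketch and not Spike's implementation. The term encoding (tuples for applications, upper-case strings for axiom variables, lower-case strings for conjecture variables), the helper names, the explicit fresh variable name, and the choice to normalize both sides fully are all assumptions made here.

```python
ZERO = ("0",)


def s(t):
    return ("s", t)


# Axioms of Subsection A.3, oriented left to right.
AXIOMS = [
    (("p", ZERO), ZERO),
    (("p", s("X")), s("X")),
    (("minusplus2", "X", ZERO), "X"),
    (("minusplus2", "X", s("Y")), ("p", ("minusplus2", "X", "Y"))),
    (("minusplus", "X", ZERO), "X"),
    (("minusplus", "X", s("Y")), ("p", ("minusplus2", "X", "Y"))),
]


def match(pattern, term, binding):
    """Match an axiom left-hand side against a term; conjecture variables act as constants."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        binding = match(sub_pattern, sub_term, binding)
        if binding is None:
            return None
    return binding


def substitute(term, binding):
    if isinstance(term, str):
        return binding.get(term, term)
    return (term[0],) + tuple(substitute(arg, binding) for arg in term[1:])


def normalize(term):
    """Innermost rewriting with AXIOMS, also on terms that contain variables."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(arg) for arg in term[1:])
    for lhs, rhs in AXIOMS:
        binding = match(lhs, term, {})
        if binding is not None:
            return normalize(substitute(rhs, binding))
    return term


def generate(equation, variable, fresh):
    """Instantiate `variable` by 0 and s(fresh), then simplify both sides."""
    lhs, rhs = equation
    instances = []
    for value in (ZERO, s(fresh)):
        cover = {variable: value}
        instances.append((normalize(substitute(lhs, cover)),
                          normalize(substitute(rhs, cover))))
    return instances


if __name__ == "__main__":
    e0 = (("minusplus", "x1", "x2"), ("minusplus", "x1", "x3"))
    for left, right in generate(e0, "x2", "x4"):
        print(left, "=", right)
```

Applied to the initial conjecture with induction variable x2, the sketch yields the two equations listed after the first generate step of the session, up to the name of the fresh variable (x4 here, where Spike prints x2 after renaming).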