URL: http://www.elsevier.nl/locate/entcs/volume86.html 12 pages

Transforming equality logic to propositional logic

Hans Zantema1 and Jan Friso Groote 2

Department of Computer Science Eindhoven University of Technology P.O. Box 513, 5600 MB Eindhoven, The Netherlands

Abstract

We investigate and compare various ways of transforming equality formulas to propositional formulas, in order to be able to solve satisfiability in equality logic by means of satisfiability in propositional logic. We propose equality substitution as a new approach combining desirable properties of earlier methods, we prove its correctness and show its applicability by experiments.

1 Introduction

We consider equality formulas, i.e., propositional formulas in which the atoms are equalities between variables. Two such formulas are called equality equivalent, denoted by $\equiv_E$, if for any interpretation of the variables in any domain they yield the same result. For instance, we have

$x = y \wedge x = z \;\equiv_E\; x = y \wedge y = z$

since for both formulas the result is true if and only if the variables $x, y, z$ all three have the same interpretation. On the other hand, in propositional logic they are not equivalent: writing $p, q, r$ for $x = y$, $x = z$, $y = z$, respectively, we do not have $p \wedge q \equiv p \wedge r$.
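The contrast between the two notions of equivalence can be checked mechanically on this example. The following is a minimal sketch, not part of the original development: the helper name `equality_equivalent` and the representation of formulas as Python predicates are illustrative assumptions. It relies on the observation that only the pattern of coinciding variables matters, so a domain with as many elements as there are variables covers every case.

```python
from itertools import product

def equality_equivalent(f, g, variables):
    """Brute-force check that f and g agree under every assignment.

    f and g are predicates taking a dict {variable: value}. A domain of
    size len(variables) suffices, because the truth value of an equality
    formula depends only on which variables receive the same value.
    """
    domain = range(len(variables))
    for values in product(domain, repeat=len(variables)):
        e = dict(zip(variables, values))
        if f(e) != g(e):
            return False
    return True

# x = y and x = z   versus   x = y and y = z
left = lambda e: e["x"] == e["y"] and e["x"] == e["z"]
right = lambda e: e["x"] == e["y"] and e["y"] == e["z"]
print(equality_equivalent(left, right, ["x", "y", "z"]))  # True

# Treating the three equalities as independent atoms p, q, r instead:
prop_equiv = all((p and q) == (p and r)
                 for p, q, r in product([False, True], repeat=3))
print(prop_equiv)  # False
```

Exhaustive enumeration is exponential in the number of variables, which is why the rest of the paper reduces the problem to propositional satisfiability instead.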

The main question we address is how to check automatically whether two (big) equality formulas are equality equivalent. A direct observation shows that

$\phi \equiv_E \psi \iff \neg(\phi \leftrightarrow \psi) \equiv_E \text{false}$,

hence checking equivalence of two formulas can be done by checking whether a formula is equivalent to false. Deciding the latter is the satisfiability problem, hence we are interested in satisfiability of equality formulas.

1 Email: h.zantema@tue.nl

2 Email: jfg@win.tue.nl

©2003 Published by Elsevier Science B. V.

This problem plays an important role in hardware verification. In fact there one is interested in a slightly more extensive logic: the logic of equality with uninterpreted functions (UIF, [5]). However, by Ackermann's transformation ([1]) the problem of deciding validity of a formula in UIF is reduced to satisfiability of equality formulas. More recently, an improved transformation serving the same goal was proposed in [2].

One approach was presented in [7], where a variant of BDD technology (EQ-BDDs) was developed for satisfiability of equality formulas. The given method is complete in the sense that the algorithm always terminates and decides whether the given formula is satisfiable. Unfortunately, in EQ-BDDs there is no unique representation as is the case in ordinary BDDs for propositional formulas. Another method is proposed in [11]. There a resolution-like method was developed for checking satisfiability of formulas in CNF.

A different approach is to first transform the equality formula to a propositional formula and then analyze this propositional formula. For propositional formulas a lot of work has been done on efficient satisfiability checking, yielding a variety of efficient and usable implementations. In this paper we concentrate on transformations $\Psi$ from equality formulas to propositional formulas by which satisfiability of equality formulas is transformed to satisfiability of propositional formulas, i.e.,

$\phi \equiv_E \text{false} \iff \Psi(\phi) \equiv \text{false}$.

Given such a transformation $\Psi$, checking satisfiability of an equality formula $\phi$ proceeds as follows: compute $\Psi(\phi)$ and decide whether $\Psi(\phi) \equiv \text{false}$ by a standard satisfiability checker for propositional formulas. For such a transformation $\Psi$ a number of properties are desirable:

• the size of $\Psi(\phi)$ is not too big;

• the structure of $\Psi(\phi)$ reflects the structure of $\phi$;

• the variables of $\Psi(\phi)$ represent equalities in $\phi$.

The main goal of these properties is that checking (propositional) satisfiability of $\Psi(\phi)$ by standard techniques is feasible for a reasonable class of formulas $\phi$. Roughly speaking, two main approaches can be distinguished:

(i) Addition of transitivity. In this approach it is analyzed which transitivity properties may be relevant for $\phi$, and $\Psi$ is defined by

$\Psi(\phi) = \phi \wedge T$

where $T$ is the conjunction of the relevant transitivity properties. This approach is followed in [6,3,4].

(ii) Bit vector encoding. In this approach $\lceil \log(\#A) \rceil$ boolean variables $x_i$ are introduced for every variable $x$, where $\#A$ is the size of the set $A$ of variables, and $\Psi(\phi)$ is obtained from $\phi$ by replacing every $x = y$ by

$\bigwedge_{i=1}^{\lceil \log(\#A) \rceil} (x_i \leftrightarrow y_i)$.

In [6] this is already mentioned as a folklore method. Closely related is range allocation [9,10]. In this approach the formula structure is analyzed to define a small domain for each variable, preferably smaller than $\#A$. Then a standard BDD based tool is used to check satisfiability of the formula under the domain.

By addition of transitivity the variables of $\Psi(\phi)$ represent equalities in $\phi$, but the structure of $\Psi(\phi)$ does not reflect the structure of $\phi$. For instance, if $\phi$ is a formula over $n$ variables then the size of $T$ is $O(n^3)$, which can be much bigger than the size of $\phi$ itself. On the other hand, by bit vector encoding the structure of $\Psi(\phi)$ reflects the structure of $\phi$, but the variables of $\Psi(\phi)$ do not represent equalities in $\phi$. Moreover, although the size of the transformed formula is small, it often turns out that the efficiency of proving unsatisfiability of this formula by standard approaches is very bad.

In this paper we define equality substitution eqs as an alternative transformation that combines both desired properties. The emphasis is on proving correctness: both for the earlier approaches and equality substitution we prove the basic correctness property $\phi \equiv_E \text{false} \iff \Psi(\phi) \equiv \text{false}$. We are not aware of earlier full proofs for the earlier approaches. In the last section we report some experiments showing that equality substitution outperforms the bit vector encoding for a class of formulas similar to the pigeon hole formulas. Comparison of equality substitution to addition of transitivity shows a similar performance, but equality substitution yields much smaller formulas.

2 Basic definitions and properties

Let A be a finite set of variable symbols. We define an equality formula by the syntax

$V ::= x \mid y \mid z \mid \cdots$ where $A = \{x, y, z, \ldots\}$

$E ::= V = V \mid \text{true} \mid \text{false} \mid \neg E \mid (E \vee E) \mid (E \wedge E) \mid (E \to E) \mid (E \leftrightarrow E)$

Hence an equality formula consists of equations $x = y$ for $x, y \in A$ and the usual boolean connectives. As usual, redundant parentheses will be omitted. For instance, if $x, y, z \in A$ then $(x = y \wedge y = z) \to x = z$ is an equality formula.

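A domain $D$ is defined to be a non-empty set. For any domain $D$ we call a function $\varepsilon : A \to D$ an assignment to $D$. For any assignment $\varepsilon$ we define its interpretation $\overline{\varepsilon}$ on equality formulas inductively as follows:

$\overline{\varepsilon}(x = y) = \text{true}$ if $\varepsilon(x) = \varepsilon(y)$, and $\overline{\varepsilon}(x = y) = \text{false}$ otherwise,

$\overline{\varepsilon}(\text{true}) = \text{true}$, $\overline{\varepsilon}(\text{false}) = \text{false}$, $\overline{\varepsilon}(\neg\phi) = \neg\overline{\varepsilon}(\phi)$,

$\overline{\varepsilon}(\phi \vee \psi) = \overline{\varepsilon}(\phi) \vee \overline{\varepsilon}(\psi)$, $\overline{\varepsilon}(\phi \wedge \psi) = \overline{\varepsilon}(\phi) \wedge \overline{\varepsilon}(\psi)$,

$\overline{\varepsilon}(\phi \to \psi) = \overline{\varepsilon}(\phi) \to \overline{\varepsilon}(\psi)$, $\overline{\varepsilon}(\phi \leftrightarrow \psi) = \overline{\varepsilon}(\phi) \leftrightarrow \overline{\varepsilon}(\psi)$.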

Two equality formulas $\phi, \psi$ are called equality equivalent, denoted as $\phi \equiv_E \psi$, if $\overline{\varepsilon}(\phi) = \overline{\varepsilon}(\psi)$ for every domain $D$ and every assignment $\varepsilon$ to $D$. For instance, one can check that

$(x = y \wedge y = z) \to x = z \;\equiv_E\; \text{true}$.

We will concentrate on the question how to decide whether $\phi \equiv_E \psi$ for arbitrary equality formulas $\phi, \psi$. It is easily checked that

$\phi \equiv_E \psi \iff \neg(\phi \leftrightarrow \psi) \equiv_E \text{false}$,

hence we may and shall concentrate on the question whether $\phi \equiv_E \text{false}$ for a given equality formula $\phi$.

Fix a total order $<$ on $A$. For an equality formula $\phi$ write $R(\phi)$ for the equality formula obtained from $\phi$ by replacing every $x = x$ by true and replacing $x = y$ by $y = x$ if $y < x$, for all $x, y \in A$. Clearly $R(\phi) \equiv_E \phi$ for every equality formula $\phi$. An equality formula $\phi$ is called reduced if $\phi = R(\phi)$, i.e., it only contains equations $x = y$ satisfying $x < y$. By applying this reduction our question of deciding $\phi \equiv_E \text{false}$ for arbitrary equality formulas reduces to the question of deciding $\phi \equiv_E \text{false}$ for a reduced equality formula $\phi$.
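The reduction R is a purely syntactic pass over the formula. A minimal sketch, assuming formulas are stored as nested Python tuples such as `("eq", "y", "x")` and that the fixed total order is the ordinary string order on variable names (both choices, and the name `reduce_formula`, are illustrative and not taken from the paper):

```python
def reduce_formula(f):
    """Return R(f): x = x becomes true, and every atom is oriented as x = y with x < y."""
    op = f[0]
    if op == "eq":
        _, x, y = f
        if x == y:
            return ("true",)
        return ("eq", min(x, y), max(x, y))
    if op in ("true", "false"):
        return f
    # Connectives ("not", "and", "or", "imp", "iff") are rebuilt around reduced parts.
    return (op,) + tuple(reduce_formula(g) for g in f[1:])

phi = ("imp", ("and", ("eq", "y", "x"), ("eq", "z", "z")), ("eq", "z", "x"))
print(reduce_formula(phi))
# ('imp', ('and', ('eq', 'x', 'y'), ('true',)), ('eq', 'x', 'z'))
```

After this pass two atoms denote the same equality exactly when they are syntactically identical, which is what allows them to be treated as propositional atoms.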

We write $\equiv$ for logical equivalence in the sense of propositional logic; if applied to equality formulas this means that an equation $x = y$ is considered as a propositional atom.

Write $T$ for the conjunction of all formulas

$\neg R(x = y) \vee \neg R(y = z) \vee R(x = z)$

for which $x, y, z \in A$ are all three distinct.

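Theorem 2.1 Let $\phi$ be a reduced equality formula. Then $\phi \equiv_E \text{false}$ if and only if $\phi \wedge T \equiv \text{false}$.

Proof. First assume that $\phi \wedge T \equiv \text{false}$. Let $\varepsilon : A \to D$ be arbitrary; we have to prove that $\overline{\varepsilon}(\phi) = \text{false}$. By transitivity of equality in $D$ we obtain that

$\overline{\varepsilon}(\neg R(x = y) \vee \neg R(y = z) \vee R(x = z)) = \text{true}$.

As a consequence we obtain $\overline{\varepsilon}(T) = \text{true}$. Hence

$\overline{\varepsilon}(\phi) = \overline{\varepsilon}(\phi) \wedge \overline{\varepsilon}(T) = \overline{\varepsilon}(\phi \wedge T) = \text{false}$;

the last step follows from $\phi \wedge T \equiv \text{false}$ and the definition of $\overline{\varepsilon}$.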

Conversely assume that $\phi \equiv_E \text{false}$ holds and $\phi \wedge T \not\equiv \text{false}$; we have to derive a contradiction. Since $\phi \wedge T$ is satisfiable there is an assignment $\delta$ of booleans to the atoms of the shape $x = y$ such that $\overline{\delta}(\phi \wedge T) = \text{true}$, where $\overline{\delta}$ is the interpretation corresponding to $\delta$. Hence $\overline{\delta}(\phi) = \overline{\delta}(T) = \text{true}$. Define the relation $\sim$ on $A$ as follows:

$x \sim y \iff \overline{\delta}(R(x = y))$.

From the definition of $R$ it follows that $\sim$ is reflexive and symmetric; since $\overline{\delta}(T) = \text{true}$ we conclude that $\sim$ is transitive. Hence $\sim$ is an equivalence relation. By injectively mapping the equivalence classes of $\sim$ to some domain $D$ we obtain an assignment $\varepsilon : A \to D$ satisfying

$x \sim y \iff \varepsilon(x) = \varepsilon(y)$.

By construction we now have $\overline{\varepsilon}(\phi) = \overline{\delta}(\phi) = \text{true}$, contradicting the assumption $\phi \equiv_E \text{false}$. □
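Theorem 2.1 can be observed on a small instance by exhaustive search. The sketch below is an illustration under stated assumptions: variables are the integers 0, ..., n-1 with their natural order, formulas are nested tuples, and plain enumeration stands in for a real satisfiability checker. The names `atom`, `holds`, `transitivity`, `sat_equality` and `sat_propositional` are hypothetical helpers, not from the paper.

```python
from itertools import combinations, permutations, product

def atom(x, y):
    """Reduced atom R(x = y) for distinct integer variables x and y."""
    return ("eq", min(x, y), max(x, y))

def holds(f, val):
    """Evaluate a formula; atoms are looked up through the callable val."""
    op = f[0]
    if op == "eq":
        return val(f)
    if op == "not":
        return not holds(f[1], val)
    if op == "and":
        return all(holds(g, val) for g in f[1:])
    if op == "or":
        return any(holds(g, val) for g in f[1:])
    raise ValueError(op)

def transitivity(n):
    """T: one clause for every triple of pairwise distinct variables."""
    clauses = [("or", ("not", atom(x, y)), ("not", atom(y, z)), atom(x, z))
               for x, y, z in permutations(range(n), 3)]
    return ("and", *clauses)

def sat_equality(f, n):
    """Satisfiable in equality logic: try every assignment into {0..n-1}."""
    return any(holds(f, lambda a: e[a[1]] == e[a[2]])
               for e in product(range(n), repeat=n))

def sat_propositional(f, n):
    """Satisfiable when the atoms are treated as independent booleans."""
    atoms = [atom(x, y) for x, y in combinations(range(n), 2)]
    for bits in product([False, True], repeat=len(atoms)):
        delta = dict(zip(atoms, bits))
        if holds(f, delta.get):
            return True
    return False

# phi: x0 = x1 and x1 = x2 and not (x0 = x2)
phi = ("and", atom(0, 1), atom(1, 2), ("not", atom(0, 2)))
print(sat_equality(phi, 3))                                   # False
print(sat_propositional(phi, 3))                              # True
print(sat_propositional(("and", phi, transitivity(3)), 3))    # False
```

Without T the propositional reading of phi is satisfiable, so the transitivity constraints are exactly what makes the two notions agree here.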

Theorem 2.1 shows that addition of transitivity is a valid approach for transforming equality formulas to propositional formulas by which satisfiability of equality formulas is transformed to satisfiability of propositional formulas. The next theorem states validity of the bit vector encoding approach. Fix $N$ to be the smallest number satisfying $2^N \geq \#A$. For every $x \in A$ introduce $N$ boolean variables $x_1, \ldots, x_N$. Write $A_N$ for the set of all of these $N \cdot \#A$ boolean variables. The bit vector encoding bve transforming equality formulas over $A$ to propositional formulas over $A_N$ is defined as follows:

$\text{bve}(x = y) = \bigwedge_{i=1}^{N} (x_i \leftrightarrow y_i)$,

$\text{bve}(\text{true}) = \text{true}$, $\text{bve}(\text{false}) = \text{false}$, $\text{bve}(\neg\phi) = \neg\,\text{bve}(\phi)$, $\text{bve}(\phi \circ \psi) = \text{bve}(\phi) \circ \text{bve}(\psi)$

for $x, y \in A$, $\circ \in \{\vee, \wedge, \to, \leftrightarrow\}$.

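Theorem 2.2 Let $\phi$ be an equality formula over $A$. Then $\phi \equiv_E \text{false}$ if and only if $\text{bve}(\phi) \equiv \text{false}$.

Proof. For the 'if' part we take an arbitrary assignment $\varepsilon : A \to D$ satisfying $\overline{\varepsilon}(\phi) = \text{true}$ and we prove that this gives rise to a satisfying assignment for $\text{bve}(\phi)$. Since $\#\varepsilon(A) \leq \#A \leq 2^N$ there exists an injective map $\alpha : \varepsilon(A) \to \{\text{false}, \text{true}\}^N$. Define $a : A_N \to \{\text{false}, \text{true}\}$ by

$\alpha(\varepsilon(x)) = (a(x_1), \ldots, a(x_N))$

for all $x \in A$. Extend $a$ to an interpretation $\overline{a}$ of propositional formulas over $A_N$ by defining

$\overline{a}(\text{true}) = \text{true}$, $\overline{a}(\text{false}) = \text{false}$, $\overline{a}(\neg\phi) = \neg\overline{a}(\phi)$,

$\overline{a}(\phi \circ \psi) = \overline{a}(\phi) \circ \overline{a}(\psi)$

for $\circ \in \{\vee, \wedge, \to, \leftrightarrow\}$. For $x, y \in A$ we obtain

$\overline{\varepsilon}(x = y) = \text{true}$
$\iff \varepsilon(x) = \varepsilon(y)$
$\iff \alpha(\varepsilon(x)) = \alpha(\varepsilon(y))$ (since $\alpha$ is injective)
$\iff (a(x_1), \ldots, a(x_N)) = (a(y_1), \ldots, a(y_N))$
$\iff a(x_1) = a(y_1) \wedge \cdots \wedge a(x_N) = a(y_N)$
$\iff \overline{a}(\bigwedge_{i=1}^{N} (x_i \leftrightarrow y_i)) = \text{true}$
$\iff \overline{a}(\text{bve}(x = y)) = \text{true}$.

This holds for every equality $x = y$. Hence,

$\overline{a}(\text{bve}(\phi)) = \overline{\varepsilon}(\phi) = \text{true}$.

So, we have a satisfying assignment $a$ for $\text{bve}(\phi)$, which we had to prove.

For the converse assume $a : A_N \to \{\text{false}, \text{true}\}$ is a satisfying assignment for $\text{bve}(\phi)$. Let $D = \{\text{false}, \text{true}\}^N$. Define $\varepsilon : A \to D$ by $\varepsilon(x) = (a(x_1), \ldots, a(x_N))$. Similarly as above we obtain $\overline{a}(\text{bve}(x = y)) = \text{true} \iff \overline{\varepsilon}(x = y) = \text{true}$, hence from $\overline{a}(\text{bve}(\phi)) = \text{true}$ we may conclude $\overline{\varepsilon}(\phi) = \text{true}$, contradicting the assumption $\phi \equiv_E \text{false}$. □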

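The requirement $2^N \geq \#A$ is essential for the validity of Theorem 2.2, as is shown by the following example. Let $A = \{x_1, \ldots, x_n\}$ and $n > 2^N$. Then

$\bigwedge_{1 \leq i < j \leq n} \neg(x_i = x_j) \not\equiv_E \text{false}$,

$\text{bve}(\bigwedge_{1 \leq i < j \leq n} \neg(x_i = x_j)) = \bigwedge_{1 \leq i < j \leq n} \neg(\bigwedge_{k=1}^{N} (x_{ik} \leftrightarrow x_{jk})) \equiv \text{false}$.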
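The role of the bound on N in this example can be reproduced by brute force. The following sketch is illustrative only: it represents each variable directly by a tuple of N booleans rather than building a propositional formula, and the helper names `bits_needed`, `bve_equal` and `all_distinct_bve_satisfiable` are assumptions of this example.

```python
from itertools import combinations, product

def bits_needed(n):
    """Smallest N with 2**N >= n, for n >= 1."""
    return (n - 1).bit_length()

def bve_equal(x_bits, y_bits):
    """bve(x = y): the conjunction of x_i <-> y_i over all bit positions."""
    return all(a == b for a, b in zip(x_bits, y_bits))

def all_distinct_bve_satisfiable(n, N):
    """Is the encoding of 'x_1..x_n pairwise different' satisfiable with N bits each?"""
    for flat in product([False, True], repeat=n * N):
        vecs = [flat[i * N:(i + 1) * N] for i in range(n)]
        if all(not bve_equal(vecs[i], vecs[j])
               for i, j in combinations(range(n), 2)):
            return True
    return False

print(bits_needed(3))                      # 2
print(all_distinct_bve_satisfiable(3, 2))  # True: enough bit patterns
print(all_distinct_bve_satisfiable(3, 1))  # False: only 2 patterns for 3 variables
```

With too few bits the encoded formula becomes unsatisfiable although the equality formula itself is satisfiable, which is the failure the example describes.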

3 Equality substitution

In this section equality substitution eqs is introduced for transforming equality formulas to propositional formulas, combining desired properties of the two transformations considered until now. Just as in bit vector encoding, a substitution is applied to the equalities in the formula, and the rest of the formula remains unchanged. The main point is to define $\text{eqs}(x = y)$ for variables $x, y$ such that $\phi \equiv_E \psi \iff \text{eqs}(\phi) \equiv \text{eqs}(\psi)$.

Let $<$ on $A$ be the order that we already fixed for defining $R$. It is convenient to number the elements of $A$ with respect to this order, i.e., we assume $A = \{x_1, x_2, \ldots, x_n\}$ for $n = \#A$ satisfying

$x_i < x_j \iff i < j$.

For every $i, j$ satisfying $1 \leq i < j \leq n$ we introduce a fresh propositional variable $p_{ij}$; the set of all these $\frac{n(n-1)}{2}$ variables is denoted by $P_A$. For $1 \leq k \leq i < j \leq n$ we define $P(k, i, j)$ inductively by

$P(i, i, j) = p_{ij}$

for all $i, j$ satisfying $1 \leq i < j \leq n$, and

$P(k, i, j) = (p_{ki} \wedge p_{kj}) \vee (\neg p_{ki} \wedge \neg p_{kj} \wedge P(k+1, i, j))$

for all $k, i, j$ satisfying $1 \leq k < i < j \leq n$. We will use these formulas only for $k = 1$; the formula $P(1, i, j)$ is a propositional formula over $P_A$ of size $O(i)$. For instance, $P(1, 3, 5)$ is equal to

$(p_{13} \wedge p_{15}) \vee (\neg p_{13} \wedge \neg p_{15} \wedge ((p_{23} \wedge p_{25}) \vee (\neg p_{23} \wedge \neg p_{25} \wedge p_{35})))$.

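We define the transformation eqs from equality formulas over $A$ to propositional formulas over $P_A$ as follows:

$\text{eqs}(x_i = x_j) = \text{true}$ if $i = j$,
$\text{eqs}(x_i = x_j) = P(1, i, j)$ if $i < j$,
$\text{eqs}(x_i = x_j) = P(1, j, i)$ if $j < i$,

$\text{eqs}(\text{true}) = \text{true}$, $\text{eqs}(\text{false}) = \text{false}$, $\text{eqs}(\neg\phi) = \neg\,\text{eqs}(\phi)$,

$\text{eqs}(\phi \circ \psi) = \text{eqs}(\phi) \circ \text{eqs}(\psi)$

for $\circ \in \{\vee, \wedge, \to, \leftrightarrow\}$.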
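The inductive definition of P and the atom case of eqs translate directly into a recursive function. A minimal sketch, assuming a nested-tuple syntax tree in which `("var", i, j)` stands for the propositional variable with indices i and j; the names `P`, `eqs_atom`, `size` and `holds` are illustrative helpers, and `size` counts binary connectives with the three-way conjunction written as two nested binary ones.

```python
def P(k, i, j):
    """Syntax tree of P(k, i, j) for 1 <= k <= i < j."""
    if k == i:
        return ("var", i, j)
    pki, pkj = ("var", k, i), ("var", k, j)
    return ("or",
            ("and", pki, pkj),
            ("and", ("not", pki), ("and", ("not", pkj), P(k + 1, i, j))))

def eqs_atom(i, j):
    """eqs applied to the atom x_i = x_j."""
    if i == j:
        return ("true",)
    return P(1, min(i, j), max(i, j))

def size(f):
    """Number of binary connectives in a syntax tree."""
    if f[0] in ("and", "or"):
        return 1 + size(f[1]) + size(f[2])
    if f[0] == "not":
        return size(f[1])
    return 0

def holds(f, delta):
    """Evaluate a tree under delta, a dict mapping (i, j) to a boolean."""
    op = f[0]
    if op == "var":
        return delta[(f[1], f[2])]
    if op == "true":
        return True
    if op == "not":
        return not holds(f[1], delta)
    if op == "and":
        return holds(f[1], delta) and holds(f[2], delta)
    return holds(f[1], delta) or holds(f[2], delta)  # op == "or"

print(size(P(1, 3, 5)))                              # 8
print([size(eqs_atom(i, 9)) for i in range(1, 9)])   # grows linearly in i
```

Each unfolding step adds a constant number of connectives, which matches the stated linear size of P(1, i, j) in i.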

It is hard to give an intuition for eqs other than what follows directly from its definition; surprisingly the original intuition we had for eqs turned out to be wrong. Many modifications of eqs turned out to violate the essential property below.

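Theorem 3.1 Let $\phi, \psi$ be arbitrary equality formulas over $A$. Then

$\phi \equiv_E \psi \iff \text{eqs}(\phi) \equiv \text{eqs}(\psi)$.

As an illustration, $\text{eqs}((x_1 = x_2 \wedge x_2 = x_3) \to x_1 = x_3)$ is equal to

$(p_{12} \wedge ((p_{12} \wedge p_{13}) \vee (\neg p_{12} \wedge \neg p_{13} \wedge p_{23}))) \to p_{13}$

which is indeed logically equivalent to $\text{eqs}(\text{true}) = \text{true}$.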

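In the remainder of this section we prove Theorem 3.1. We start by proving

$\phi \equiv_E \psi \Longleftarrow \text{eqs}(\phi) \equiv \text{eqs}(\psi)$.

We assume that

$\overline{\delta}(\text{eqs}(\phi)) = \overline{\delta}(\text{eqs}(\psi))$

for all $\delta : P_A \to \text{Bool}$, and we have to prove that $\overline{\varepsilon}(\phi) = \overline{\varepsilon}(\psi)$ for every domain $D$ and every assignment $\varepsilon : A \to D$. This follows from the following lemma, proving the $\Leftarrow$-part of Theorem 3.1.

For an assignment $\varepsilon : A \to D$ we define $\delta_\varepsilon : P_A \to \text{Bool}$ by

$\delta_\varepsilon(p_{ij}) \iff \varepsilon(x_i) = \varepsilon(x_j)$.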

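Lemma 3.2 Let $\phi$ be an equality formula and let $\varepsilon : A \to D$ be any assignment. Then

$\overline{\varepsilon}(\phi) = \overline{\delta_\varepsilon}(\text{eqs}(\phi))$.

Proof. Due to the compositional definition of eqs it suffices to prove this for $\phi$ being of the shape $x_i = x_j$. In case of $i = j$ this holds since $\overline{\varepsilon}(x_i = x_i) = \text{true} = \overline{\delta_\varepsilon}(\text{true}) = \overline{\delta_\varepsilon}(\text{eqs}(x_i = x_j))$. In the remaining case $i \neq j$ we may assume $i < j$ by

$\overline{\varepsilon}(x_i = x_j) = \overline{\varepsilon}(x_j = x_i)$

and symmetry in the definition of eqs. Since $\text{eqs}(x_i = x_j)$ is equal to $P(1, i, j)$ it remains to prove

$\overline{\varepsilon}(x_i = x_j) \iff \overline{\delta_\varepsilon}(P(1, i, j))$.

We prove this by proving the stronger claim

$\overline{\varepsilon}(x_i = x_j) \iff \overline{\delta_\varepsilon}(P(k, i, j))$

for all $k = 1, 2, \ldots, i$ by reverse induction on $k$. For $k = i$ this holds by definition. As the induction hypothesis we now assume

$\varepsilon(x_i) = \varepsilon(x_j) \iff \overline{\varepsilon}(x_i = x_j) \iff \overline{\delta_\varepsilon}(P(k+1, i, j))$.

Now we have

$\overline{\delta_\varepsilon}(P(k, i, j))$

$\iff$ (by definition)
$\overline{\delta_\varepsilon}((p_{ki} \wedge p_{kj}) \vee (\neg p_{ki} \wedge \neg p_{kj} \wedge P(k+1, i, j)))$

$\iff$ (by definition)
$(\varepsilon(x_k) = \varepsilon(x_i) \wedge \varepsilon(x_k) = \varepsilon(x_j)) \vee (\varepsilon(x_k) \neq \varepsilon(x_i) \wedge \varepsilon(x_k) \neq \varepsilon(x_j) \wedge \overline{\delta_\varepsilon}(P(k+1, i, j)))$

$\iff$ (by the induction hypothesis)
$(\varepsilon(x_k) = \varepsilon(x_i) \wedge \varepsilon(x_k) = \varepsilon(x_j)) \vee (\varepsilon(x_k) \neq \varepsilon(x_i) \wedge \varepsilon(x_k) \neq \varepsilon(x_j) \wedge \varepsilon(x_i) = \varepsilon(x_j))$

$\iff$ (by transitivity of $=$)
$(\varepsilon(x_k) = \varepsilon(x_i) \wedge \varepsilon(x_i) = \varepsilon(x_j)) \vee (\varepsilon(x_k) \neq \varepsilon(x_i) \wedge \varepsilon(x_k) \neq \varepsilon(x_j) \wedge \varepsilon(x_i) = \varepsilon(x_j))$

$\iff$ (propositional logic)
$(\varepsilon(x_k) = \varepsilon(x_i) \vee \varepsilon(x_k) \neq \varepsilon(x_j) \vee \varepsilon(x_i) \neq \varepsilon(x_j)) \wedge \varepsilon(x_i) = \varepsilon(x_j)$

$\iff$ (by transitivity of $=$)
$\varepsilon(x_i) = \varepsilon(x_j)$

$\iff \overline{\varepsilon}(x_i = x_j)$

which we had to prove. □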
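The statement of Lemma 3.2 for atoms can be confirmed exhaustively for small n. The sketch below is an illustrative check rather than a proof: variables are indexed 1..n, assignments range over the domain {0, ..., n-1}, and the helper names `P`, `delta_from` and `lemma_holds` are assumptions of this example.

```python
from itertools import combinations, product

def P(k, i, j, delta):
    """Truth value of P(k, i, j) under delta, a dict keyed by pairs (i, j) with i < j."""
    if k == i:
        return delta[(i, j)]
    both = delta[(k, i)] and delta[(k, j)]
    neither = not delta[(k, i)] and not delta[(k, j)]
    return both or (neither and P(k + 1, i, j, delta))

def delta_from(eps, n):
    """The induced boolean assignment: p_ij holds exactly when x_i and x_j coincide."""
    return {(i, j): eps[i] == eps[j]
            for i, j in combinations(range(1, n + 1), 2)}

def lemma_holds(n):
    """Check that P(1, i, j) agrees with eps(x_i) = eps(x_j) for every eps and i < j."""
    for values in product(range(n), repeat=n):
        eps = dict(zip(range(1, n + 1), values))
        delta = delta_from(eps, n)
        for i, j in combinations(range(1, n + 1), 2):
            if P(1, i, j, delta) != (eps[i] == eps[j]):
                return False
    return True

print(all(lemma_holds(n) for n in range(2, 6)))  # True
```

Because eqs is compositional, agreement on the atoms extends to all formulas, as the proof notes.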

The hard part of Theorem 3.1 is the $\Rightarrow$-part. For that we need a lemma.

Lemma 3.3 Let $T$ be the conjunction of all formulas

$\neg R(x = y) \vee \neg R(y = z) \vee R(x = z)$

for which $x, y, z \in A$ are all three distinct. Then $\text{eqs}(T) \equiv \text{true}$.

Proof. We have to prove that $\text{eqs}(\neg R(x = y) \vee \neg R(y = z) \vee R(x = z)) \equiv \text{true}$. Let $j, k, m$ satisfying $1 \leq j < k < m \leq n$ be the numbers of the variables $x, y, z$ in some order. Then the required property is one of the following three propositional equivalences:

$\neg P(1, j, k) \vee \neg P(1, j, m) \vee P(1, k, m) \equiv \text{true}$,

$\neg P(1, j, k) \vee P(1, j, m) \vee \neg P(1, k, m) \equiv \text{true}$,

$P(1, j, k) \vee \neg P(1, j, m) \vee \neg P(1, k, m) \equiv \text{true}$.

We will prove the more general property that for every $i$ satisfying $1 \leq i \leq j$ the following three propositional equivalences hold:

$\neg P(i, j, k) \vee \neg P(i, j, m) \vee P(i, k, m) \equiv \text{true}$,

$\neg P(i, j, k) \vee P(i, j, m) \vee \neg P(i, k, m) \equiv \text{true}$,

$P(i, j, k) \vee \neg P(i, j, m) \vee \neg P(i, k, m) \equiv \text{true}$.

First assume that the first equivalence does not hold. Then there is an assignment such that the propositions

$P(i, j, k) = (p_{ij} \wedge p_{ik}) \vee (\neg p_{ij} \wedge \neg p_{ik} \wedge P(i+1, j, k))$,

$P(i, j, m) = (p_{ij} \wedge p_{im}) \vee (\neg p_{ij} \wedge \neg p_{im} \wedge P(i+1, j, m))$,

$\neg P(i, k, m) = (\neg p_{ik} \vee \neg p_{im}) \wedge (p_{ik} \vee p_{im} \vee \neg P(i+1, k, m))$

all three hold. If $p_{ij}$ holds then we conclude from the validity of the first two propositions that $p_{ik}$ and $p_{im}$ both hold too, contradicting the validity of the third proposition. Hence $\neg p_{ij}$ holds. Then by validity of all three propositions we conclude that $\neg p_{ik}$, $\neg p_{im}$, $P(i+1, j, k)$, $P(i+1, j, m)$ and $\neg P(i+1, k, m)$ all hold. Repeating the same argument $j - i$ times yields that

$P(j, j, k) = p_{jk}$, $P(j, j, m) = p_{jm}$,

$\neg P(j, k, m) = (\neg p_{jk} \vee \neg p_{jm}) \wedge (p_{jk} \vee p_{jm} \vee \neg P(j+1, k, m))$

all three hold, contradiction. Hence the first equivalence to be proved holds. Next assume that the second equivalence does not hold. Then in a similar way after $j - i$ steps we obtain that

$P(j, j, k) = p_{jk}$, $\neg P(j, j, m) = \neg p_{jm}$,

$P(j, k, m) = (p_{jk} \wedge p_{jm}) \vee (\neg p_{jk} \wedge \neg p_{jm} \wedge P(j+1, k, m))$

all three hold, contradiction.

Finally assuming that the third equivalence does not hold yields in a similar way that

$\neg P(j, j, k) = \neg p_{jk}$, $P(j, j, m) = p_{jm}$,

$P(j, k, m) = (p_{jk} \wedge p_{jm}) \vee (\neg p_{jk} \wedge \neg p_{jm} \wedge P(j+1, k, m))$

all three hold, contradiction. □
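Lemma 3.3 says that every transitivity instance becomes a tautology after equality substitution. For small n this can be confirmed by enumerating all truth assignments to the propositional variables. The sketch below is an illustrative brute-force check, with hypothetical helper names `P` and `eqs_of_T_is_tautology`; it does not replace the proof.

```python
from itertools import combinations, permutations, product

def P(k, i, j, delta):
    """Truth value of P(k, i, j) under delta, a dict keyed by pairs (i, j) with i < j."""
    if k == i:
        return delta[(i, j)]
    both = delta[(k, i)] and delta[(k, j)]
    neither = not delta[(k, i)] and not delta[(k, j)]
    return both or (neither and P(k + 1, i, j, delta))

def eqs_of_T_is_tautology(n):
    """Check every transitivity instance under every assignment to the p_ij."""
    idx = range(1, n + 1)
    pairs = list(combinations(idx, 2))
    for bits in product([False, True], repeat=len(pairs)):
        delta = dict(zip(pairs, bits))
        # eqs applied to the (reduced) atom between distinct x_a and x_b
        eq = lambda a, b: P(1, min(a, b), max(a, b), delta)
        for x, y, z in permutations(idx, 3):
            if eq(x, y) and eq(y, z) and not eq(x, z):
                return False
    return True

print(all(eqs_of_T_is_tautology(n) for n in (3, 4, 5)))  # True
```

The number of truth assignments grows as 2 to the power n(n-1)/2, so this check is only practical for very small n.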

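Now we prove the $\Rightarrow$-part of Theorem 3.1.

Assume $\phi \equiv_E \psi$. Then $\neg(\phi \leftrightarrow \psi) \equiv_E \text{false}$. From Theorem 2.1 we conclude that $\neg(\phi \leftrightarrow \psi) \wedge T \equiv \text{false}$. In this equivalence the equalities are considered as propositional variables. Since eqs has been defined as a substitution on these variables we conclude $\text{eqs}(\neg(\phi \leftrightarrow \psi) \wedge T) \equiv \text{false}$. We obtain

$\neg(\text{eqs}(\phi) \leftrightarrow \text{eqs}(\psi)) = \text{eqs}(\neg(\phi \leftrightarrow \psi))$

$\equiv \text{eqs}(\neg(\phi \leftrightarrow \psi)) \wedge \text{eqs}(T)$ (by Lemma 3.3)

$= \text{eqs}(\neg(\phi \leftrightarrow \psi) \wedge T)$

$\equiv \text{false}$

hence $\text{eqs}(\phi) \equiv \text{eqs}(\psi)$, which concludes the proof of Theorem 3.1.

4 Experimental results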

In this section we report some experimental results comparing addition of transitivity, bit vector encoding and equality substitution, all three in combination with various propositional satisfiability provers.

We consider the formulas $\text{form}_n$ from [11] that are related to the pigeon hole formulas in propositional calculus. Just like pigeon hole formulas these are parameterized by a number $n$, they are easily seen to be contradictory by a meta argument, and each of the formulas is the conjunction of two subformulas. The formulas are defined as follows.

$\text{form}_n = (\bigwedge_{1 \leq i < j \leq n} x_i \neq x_j) \wedge \bigwedge_{j=1}^{n} (\bigvee_{i \in \{1, \ldots, n\},\, i \neq j} x_i = y)$

There are $n + 1$ variables $x_1, \ldots, x_n, y$. The first subformula states that all values of $x_1, \ldots, x_n$ are different.

The second subformula states that the value of $y$ occurs in every subset of size $n - 1$ of $\{x_1, \ldots, x_n\}$, hence it will occur at least twice in $\{x_1, \ldots, x_n\}$, contradicting the property of the first subformula. Hence the total formula is unsatisfiable. This is a non-trivial kind of unsatisfiability in the following sense: the whole formula is a conjunction of a great number of formulas, and for each of these conjuncts the formula is satisfiable after removing the conjunct. Moreover, for every pair of variables the equality between these variables occurs in the formula, either positively or negatively. Since pigeon hole like formulas are well-known to be notoriously hard in propositional logic, we consider this formula to be an interesting candidate for experiments with techniques for checking satisfiability of equality formulas. We did our experiments on the formula $\text{form}_n$ for $n$ having the values 10, 15, 20, 30, 40, 50, 60.
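Both claims about these formulas, unsatisfiability and satisfiability after removing any single conjunct, can be confirmed for small n by enumeration. A minimal sketch under stated assumptions: an assignment is a tuple (x_1, ..., x_n, y) over a domain of n + 1 values, each conjunct is a Python predicate, and the names `form` and `satisfiable` are illustrative.

```python
from itertools import combinations, product

def form(n):
    """Conjuncts of form_n as predicates over e = (x_1, ..., x_n, y)."""
    conjuncts = []
    # First subformula: x_1, ..., x_n are pairwise different.
    for i, j in combinations(range(n), 2):
        conjuncts.append(lambda e, i=i, j=j: e[i] != e[j])
    # Second subformula: for every j, y equals some x_i with i != j.
    for j in range(n):
        conjuncts.append(
            lambda e, j=j: any(e[i] == e[n] for i in range(n) if i != j))
    return conjuncts

def satisfiable(conjuncts, num_vars):
    """Enumerate all assignments into a domain with num_vars elements."""
    return any(all(c(e) for c in conjuncts)
               for e in product(range(num_vars), repeat=num_vars))

for n in (2, 3, 4):
    cs = form(n)
    whole = satisfiable(cs, n + 1)
    minus_one = all(satisfiable(cs[:k] + cs[k + 1:], n + 1)
                    for k in range(len(cs)))
    print(n, len(cs), whole, minus_one)
```

For each of these n the loop reports that the whole formula is unsatisfiable while every formula with one conjunct removed is satisfiable; the enumeration is far too slow for the values of n used in the experiments.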

We used three different propositional satisfiability checkers. The first one consists of computing the BDD using the package CUDD, see http://supportweb.cs.bham.ac.uk/documentation/cudd/. In the table this checker is denoted by 'bdd'. The second one first transforms the formula to CNF using Tseitin's transformation and then applies zChaff, see http://ee.princeton.edu/~chaff/zchaff.php. In the table this checker is denoted by 'ch'. The last one is the checker HeerHugo ([8]), denoted by 'hh'. All experiments were carried out under Linux on a 1 GHz Pentium 4.

The following table reports the results. Times are in seconds; '-' means that more than 600 seconds were required. Size indicates the number of binary symbols in the propositional formula.

      | addition of transitivity | bit vector encoding     | equality substitution
   n  |   size   bdd   ch   hh   |   size   bdd   ch   hh  |   size   bdd   ch   hh
  10  |   1619     1    0    0   |   1079    56    1  113  |    794     0    0    0
  15  |   5354     -    0    0   |   2519     -    7    -  |   2554     1    0    1
  20  |  12539     -    0    0   |   5699     -   91    -  |   5889    20    0    1
  30  |  41759     -    0    1   |  13049     -    -    -  |  19284     -    0    4
  40  |  98279     -    1    3   |  28079     -    -    -  |  44979     -    1   16
  50  | 191099     -    2    6   |  44099     -    -    -  |  86974     -    2   49
  60  | 329219     -    4   11   |  63719     -    -    -  | 149269     -    5  123

About the bdd experiments with addition of transitivity we note that the order in which the big conjunction is computed is of great influence on the result. In the table we first computed the bdds of $\text{form}_n$ and $T$ separately and then computed the conjunction, as is suggested by the shape of the formula. Computing the bdd of $T$ alone is already very expensive: for 12 variables the resulting bdd has over one million nodes. However, computing the bdd of $\text{form}_n$ and then consecutively taking the conjunction with each of the transitivity properties gives a much better result: then unsatisfiability of $\text{form}_{60}$ is proved in 62 seconds.

As a conclusion from the table we may state that the best results are obtained by the two transformations addition of transitivity and equality substitution, both in combination with zChaff: then unsatisfiability of $\text{form}_{60}$ is proved in only a few seconds. Of these two transformations equality substitution gives rise to the smaller formulas. Although bit vector encoding gives rise to much smaller formulas, it gives a very bad performance on proving unsatisfiability.

5 Concluding Remarks

We proposed equality substitution as a new transformation by which the satisfiability problem for equality logic is transformed to the satisfiability problem for propositional logic. Both for earlier approaches and for this new approach we gave proofs of correctness. We did some experiments on pigeon hole like formulas showing that equality substitution serves well for proving unsatisfiability of equality formulas in combination with the propositional prover zChaff. Although this involves only one particular class of formulas, it is an indication of practical applicability.

References

[1] Ackermann, W., "Solvable cases of the decision problem," Studies in Logic and the Foundations of Mathematics, North-Holland, Amsterdam, 1954.

[2] Bryant, R., S. German, and M. Velev, Processor verification using efficient reductions of the logic of uninterpreted functions to propositional logic, ACM Transactions on Computational Logic 2 (2001), pp. 93-134.

[3] Bryant, R. and M. Velev, Boolean satisfiability with transitivity constraints, in: E. Emerson and A. Sistla, editors, Computer-Aided Verification (CAV'00), LNCS 1855 (2000), pp. 85-98.

[4] Bryant, R. and M. Velev, Boolean satisfiability with transitivity constraints, ACM Transactions on Computational Logic 3 (2002), pp. 604-627.

[5] Burch, J. and D. Dill, Automated verification of pipelined microprocessor control, in: D. Dill, editor, Computer-Aided Verification (CAV'94), LNCS 818 (1994), pp. 68-80.

[6] Goel, A., K. Sajid, H. Zhou, A. Aziz and V. Singhal, BDD based procedures for a theory of equality with uninterpreted functions, in: Proceedings of Conference on Computer-Aided Verification (CAV), Lecture Notes in Computer Science 1427 (1998), pp. 244-255.

[7] Groote, J. F. and J. C. van de Pol, Equational binary decision diagrams, in: M. Parigot and A. Voronkov, editors, Logic for Programming and Reasoning (LPAR), Lecture Notes in Artificial Intelligence 1955 (2000), pp. 161-178.

[8] Groote, J. F. and J. P. Warners, The propositional formula checker HeerHugo, Journal of Automated Reasoning 24 (2000), pp. 101-125.

[9] Pnueli, A., Y. Rodeh, O. Shtrichman and M. Siegel, Deciding equality formulas by small domains instantiations, in: Computer Aided Verification (CAV'99), LNCS 1633 (1999), pp. 455-469.

[10] Rodeh, Y. and O. Shtrichman, Finite instantiations in equivalence logic with uninterpreted functions, in: Computer Aided Verification (CAV'01), LNCS 2102 (2001), pp. 144-154.

[11] Tveretina, O. and H. Zantema, A proof system and a decision procedure for equality logic, Technical Report CS-report 03-02, Eindhoven University of Technology (2003), available via http://www.win.tue.nl/~hzantema/TZ.pdf.