Scholarly article on topic 'A Categorical Theory of Patches'

A Categorical Theory of Patches Academic research paper on "Computer and information sciences"

CC BY-NC-ND
0
0
Share paper
Keywords
{"Category theory" / patch / "coinitial patch"}

Abstract of research paper on Computer and information sciences, author of scientific article — Samuel Mimram, Cinzia Di Giusto

Abstract When working with distant collaborators on the same documents, one often uses a version control system, which is a program tracking the history of files and helping importing modifications brought by others as patches. The implementation of such a system requires to handle lots of situations depending on the operations performed by users on files, and it is thus difficult to ensure that all the corner cases have been correctly addressed. Here, instead of verifying the implementation of such a system, we adopt a complementary approach: we introduce a theoretical model, which is defined abstractly by the universal property that it should satisfy, and work out a concrete description of it. We begin by defining a category of files and patches, where the operation of merging the effect of two coinitial patches is defined by pushout. Since two patches can be incompatible, such a pushout does not necessarily exist in the category, which raises the question of which is the correct category to represent and manipulate files in conflicting state. We provide an answer by investigating the free completion of the category of files under finite colimits, and give an explicit description of this category: its objects are finite sets labeled by lines equipped with a transitive relation and morphisms are partial functions respecting labeling and relations.

Academic research paper on topic "A Categorical Theory of Patches"

Available online at www.sciencedirect.com

ScienceDirect

Electronic Notes in Theoretical Computer Science 298 (2013) 283-307

www.elsevier.com/locate/entcs

A Categorical Theory of Patches

Samuel Mimram and Cinzia Di Giusto

CEA, LIST

Abstract

When working with distant collaborators on the same documents, one often uses a version control system, which is a program tracking the history of files and helping importing modifications brought by others as patches. The implementation of such a system requires to handle lots of situations depending on the operations performed by users on files, and it is thus difficult to ensure that all the corner cases have been correctly addressed. Here, instead of verifying the implementation of such a system, we adopt a complementary approach: we introduce a theoretical model, which is defined abstractly by the universal property that it should satisfy, and work out a concrete description of it. We begin by defining a category of files and patches, where the operation of merging the effect of two coinitial patches is defined by pushout. Since two patches can be incompatible, such a pushout does not necessarily exist in the category, which raises the question of which is the correct category to represent and manipulate files in conflicting state. We provide an answer by investigating the free completion of the category of files under finite colimits, and give an explicit description of this category: its objects are finite sets labeled by lines equipped with a transitive relation and morphisms are partial functions respecting labeling and relations.

Keywords: Category theory, patch, coinitial patch

1 Introduction

It is common nowadays, when working with distant collaborators on the same files (multiple authors writing an article together for instance), to use a program which will track the history of files and handle the operation of importing modifications of other participants. These software called version control systems (vcs for short), like git or Darcs, implement two main operations. When a user is happy with the changes it has brought to the files it can record those changes in a patch (a file coding the differences between the current version and the last recorded version) and commit them to a server, called a repository. The user can also update its current version of the file by importing new patches added by other users to the repository and applying the corresponding modifications to the files. One of the main difficulties to address here is that there is no global notion of "time": patches are only partially ordered. For instance consider a repository with one file A and

1 This work was partially supported by the French project ANR-11-INSE-0007 REVER.

1571-0661/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.entcs.2013.09.018

two users ui and u2. Suppose that ui modifies file A into B by committing a patch f, which is then imported by U2, and then ui and U2 concurrently modify the file B into C (resp. D) by committing a patch g (resp. h). The evolution of the file is depicted on the left and the partial ordering of patches in the middle:

C d hl^E>Kglh

\ A g h CD

B B X /

/t f B B h

Now, suppose that u2 imports the patch g or that u1 imports the patch h. Clearly, this file resulting from the merging of the two patches should be the same in both cases, call it E. One way to compute this file, is to say that there should be a patch h/g, the residual of h after g, which transforms C into E and has the "same effect" as h once g has been applied, and similarly there should be a patch g/h transforming D into E. Thus, after each user has imported changes from the other, the evolution of the file is as pictured on the right above. In this article, we introduce a category L whose objects are files and morphisms are patches. Since residuals should be computed in the most general way, we formally define them as the arrows of pushout cocones, i.e. the square in the figure on the right should be a pushout.

However, as expected, not every pair of coinitial morphisms have a pushout in the category L: this reflects the fact that two patches can be conflicting (for instance if two users modify the same line of a file). Representing and handling such conflicts in a coherent way is one of the most difficult part of implementing a vcs (as witnessed for instance by the various proposals for Darcs: mergers, conflictors, graphictors, etc. [10]). In order to be able to have a representation for all conflicting files, we investigate the free completion of the category L under all pushouts, this category being denoted P, which corresponds to adding all conflicting files to the category, in the most general way as possible. This category can easily be shown to exist for general abstract reasons, and one of the main contributions of this work is to provide an explicit description by applying the theory of presheaves. This approach paves the way towards the implementation of a vcs whose correctness is deduced from universal categorical properties.

Related work. The Darcs community has investigated a formalization of patches based on commutation properties [10]. Operational transformations tackle essentially the same issues by axiomatizing the notion of residual patches [9]. In both cases, the fact that residual should form a pushout cocone is never explicitly stated, excepting in informal sentences saying that "g/f should have the same effect as g once f has been applied". We should also mention another interesting approach to the problem using inverse semigroups in [4]. Finally, Houston has proposed a category with pushouts, similar to ours, in order to model conflicting files [3], see Section 6.

Plan of the paper. We begin by defining a category L of files and patches in Sec-

tion 2. Then, in Section 3, we abstractly define the category P of conflicting files obtained by free finite cocompletion. Section 4 provides a concrete description of the construction in the simpler case where patches can only insert lines. We give some concrete examples in Section 5 and adapt the framework to the general case in Section 6. We conclude in Section 7.

2 Categories of files and patches

In this section, we investigate a model for a simplified vcs: it handles only one file and the only allowed operations are insertion and deletion of lines (modification of a line can be encoded by a deletion followed by an insertion). We suppose fixed a set L = {a,b,...} of lines (typically words over an alphabet of characters). A file A is a finite sequence of lines, which will be seen as a function A : [n] ^ L for some number of lines n £ N, where the set [n] = {0,1,...,n — 1} indexes the lines of the files. For instance, a file A with three lines such that A(0) = a, A(1) = b and A(2) = c models the file abc. Given a £ L, we sometimes simply write a for the file A : [1] ^ L such that A(0) = a. A morphism between two files A : [to] ^ L and B : [n] ^ L is an injective increasing partial function f : [m] ^ [n] such that Vi £ [m],B o f (i) = A(i) whenever f (i) is defined. Such a morphism is called a patch.

Definition 2.1 The category L has files as objects and patches as morphisms.

Notice that the category L is strictly monoidal with [m] ® [n] = [m + n] and for every file A : [m] ^ L and B : [n] ^ L, (A ® B)(i) = A(i) if i < m and (A ® B)(i) = B(i — m) otherwise, the unit being the empty file I : [0] ^ L, and tensor being defined on morphisms in the obvious way. The following proposition shows that patches are generated by the operations of inserting and deleting a line:

Proposition 2.2 The category L is the free monoidal category containing L as objects and containing, for every line a £ L, morphisms na : I ^ a (insertion of a line a) and ea : a ^ I (deletion of a line a) such that ea o na = id/ (deleting an inserted line amounts to do nothing).

Example 2.3 The patch corresponding to transforming the file abc into dadeb, by deleting the line c and inserting the lines labeled by d and e, is modeled by the partial function f : [3] ^ [5] such that f (0) = 1 and f (1) = 4 and f (2) is undefined. Graphically,

The deleted line is the one on which f is not defined and the inserted lines are those which are not in the image of f. In other words, f keeps track of the unchanged lines.

In order to increase readability, we shall consider the particular case where L is reduced to a single element. In this unlabeled case, the objects of L can be identified with integers (the labeling function is trivial), and Proposition 2.2 can be adapted to achieve the following description of the category, see also [6].

Proposition 2.4 If L is reduced to a singleton, the category L is the free category whose objects are integers and morphisms are generated by sn : n ^ n + 1 and dn : n + 1 ^ n for every n G N and i G [n + 1] (respectively corresponding to insertion and deletion of a line at i-th position), subject to the relations

sn+1sn = sn+isn dnsn = idn dndn+1 = dndn+11 (1)

whenever 0 < i < j < n.

We will also consider the subcategory L+ of L, with same objects, and total injective increasing functions as morphisms. This category models patches where the only possible operation is the insertion of lines: Proposition 2.2 can be adapted to show that L+ is the free monoidal category containing morphisms na : I ^ a and, in the unlabeled case, Proposition 2.4 can be similarly adapted to show that it is the free category generated by morphisms s" : n ^ n + 1 satisfying s"+1sn = s^+^s™ with 0 < i < j < n.

3 Towards a category of conflicting files

Suppose that A is a file which is edited by two users, respectively applying patches fi : A ^ A1 and f2 : A ^ A2 to the file. For instance,

a c c b a b — a b c d (2)

Now, each of the two users imports the modification from the other one. The resulting file, after the import, should be the smallest file containing both modifications on the original file: accbcd. It is thus natural to state that it should be a pushout of the diagram (2). Now, it can be noticed that not every diagram in L has a pushout. For instance, the diagram

a c b ¿f- a b a d b (3)

does not admit a pushout in L. In this case, the two patches f1 and f2 are said to be conflicting.

In order to represent the state of files after applying two conflicting patches, we investigate the definition of a category P which is obtained by completing the category L under all pushouts. Since, this completion should also contain an initial object (i.e. the empty file), we are actually defining the category P as the free completion of L under finite colimits: recall that a category is finitely cocomplete (has all finite colimits) if and only if it has an initial object and is closed under

pushouts [6]. Intuitively, this category is obtained by adding files whose lines are

not linearly ordered, but only partially ordered, such as on the left of

/ <<<<<<< HEAD

c d ======= (4)

b »»»> 5c55...

which would intuitively model the pushout of the diagram (3) if it existed, indicating that the user has to choose between c and d for the second line. Notice the similarities with the corresponding textual notation in git on the right. The name of the category L reflects the facts that its objects are files whose lines are linearly ordered, whereas the objects of P can be thought as files whose lines are only partially ordered. More formally, the category is defined as follows.

Definition 3.1 The category P is the free finite conservative cocompletion of L: it is (up to equivalence of categories) the unique finitely cocomplete category together with an embedding functor y : L^P preserving finite colimits, such that for every finitely cocomplete category C and functor F : L ^ C preserving finite colimits, there exists, up to unique isomorphism, a unique functor F : P ^ C preserving finite colimits and satisfying F o y = F:

Above, the term conservative refers to the fact that we preserve colimits which already exist in L (we will only consider such completions here). The "standard" way to characterize the category P, which always exists, is to use the following folklore theorem, often attributed to Kelly [5,1]:

Theorem 3.2 The conservative cocompletion of the category L is equivalent to the full subcategory of L whose objects are presheaves which preserve finite limits, i.e. the image of a limit in Lop (or equivalently a colimit in L) is a limit in Set (and limiting cones are transported to limiting cones). The finite conservative co-completion P can be obtained by further restricting to presheaves which are finite colimits of representables.

Example 3.3 The category FinSet of finite sets and functions is the conservative cocompletion of the terminal category 1.

We recall that the category L of presheaves over L, is the category of functors Lop ^ Set and natural transformations between them. The Yoneda functor y : L^ L defined on objects n £L by yn = L(—,n), and on morphisms by postcomposition, provides a full and faithful embedding of L into the corresponding presheaf category, and can be shown to corestrict into a functor y : L^P [1]. A presheaf of the form yn for some n £L is called representable.

Extracting a concrete description of the category P from the above proposition is a challenging task, because we a priori need to characterize firstly all diagrams admitting a colimit in L, and secondly all presheaves in L which preserve those diagrams. This paper introduces a general methodology to build such a category. In particular, perhaps a bit surprisingly, it turns out that we have to "allow cycles" in the objects of the category P, which will be described as the category whose objects are finite sets labeled by lines together with a transitive relation and morphisms are partial functions respecting labels and relations.

4 A cocompletion of files and insertions of lines

In order to make our presentation clearer, we shall begin our investigation of the category P in a simpler case, which will be generalized in Section 6: we compute the free finite cocompletion of the category L+ (patches can only insert lines) in the case where the set of labels is a singleton. To further lighten notations, in this section, we simply write L for this category.

We sometimes characterize the objects in L as finite colimits of objects in a subcategory G of L. This category G is the full subcategory of L whose objects are 1 and 2: it is the free category on the graph 1 % 2 , the two arrows being sj and s^ The category G of presheaves over G is the category of graphs: a presheaf P G G is a graph with P(1) as vertices, P(2) as edges, the functions P(s1) and P(s1) associate to a vertex its source and target respectively, and morphisms correspond to usual morphisms of graphs. We denote by x ^ y a path going from a vertex x to a vertex y in such a graph. The inclusion functor I : G — L induces, by precomposition, a functor I * : L — G. The image of a presheaf in L under this functor is called its underlying graph. By well known results about presheaves categories, this functor admits a right adjoint I* : G — L: given a graph G G G, its image under the right adjoint is the presheaf G* G L such that for every n G N, G*(n + 1) is the set of paths of length n in the graph G, with the expected source maps, and G* (0) is reduced to one element.

Recall that every functor F : C — D induces a nerve functor Np : D — C defined on an object A G C by Np(A) = D(F-,A) [7]. Here, we will consider the nerve Ni : L — G associated to the inclusion functor I : G — L. An easy computation shows that the image Ni(n) of n gL is a graph with n vertices, so that its objects are isomorphic to [n], and there is an arrow i — j for every i,j G [n] such that i < j. For instance,

It is, therefore, easy to check that this embedding is full and faithful, i.e. morphisms in L correspond to natural transformations in G. Moreover, since Ni(1) is the graph reduced to a vertex and Ni (2) is the graph reduced to two vertices and one arrow between them, every graph can be obtained as a finite colimit of the graphs Ni(1) and Ni(2) by "gluing arrows along vertices". For instance, the initial graph Ni(0) is

Ni (3) = 2

the colimit of the empty diagram, and the graph Ni(3) is the colimit of the diagram

Ni (i):

Nji^ NI(2) Nkso) NiS^ NI(2) -

which may also be drawn as on the left of

/ \ / \ U 1 ,1

by drawing the graphs Nj(0) and Nj(1). Notice, that the object 3 is the colimit of the corresponding diagram in L (on the right), and this is generally true for all objects of L, moreover this diagram is described by the functor El(Nj(3)) — L. The notation El(P) refers to the category of elements of a presheaf P £ C, whose objects are pairs (A,p) with A £C and p £ P(A) and morphisms f : (A,p) — (B, q) are morphisms f : A — B in C such that P(f )(q) = p, and n is the first projection functor. The functor I : G — L is thus a dense functor in the sense of Definition 4.2 below, see [7] for details.

Proposition 4.1 Given a functor F : C — D, with D cocomplete, the associated nerve Nf : D — C admits a left adjoint Rf : C —D called the realization along F. This functor is defined on objects P £C by

Rf (P) = colim(El(P) — C — D)

Proof Given a presheaf P £ C and an object D, it can be checked directly that

morphisms P — NfD in C with cocones from El(P) —> to D, which in turn are in bijection with morphisms Rf(P) — D in D, see [7]. □

Definition 4.2 A functor F : C — D is dense if it satisfies one of the two equivalent conditions:

(i) the associated nerve functor Nf : D — C is full and faithful,

(ii) every object of D is canonically a colimit of objects in C: for every D £D,

D = colim(El(NF D) — C — D) (5)

Since the functor I is dense, every object of L is a finite colimit of objects in G, and G does not have any non-trivial colimit. One could expect the free conservative finite cocompletion of L to be the free finite cocompletion P of G. We will see that this is not the case because the image in L of a non-trivial diagram in G might still have a colimit. By Theorem 3.2, the category P is the full subcategory of L of presheaves preserving limits, which we now describe explicitly. This category will

turn out to be equivalent to a full subcategory of G (Theorem 4.8). We should first remark that those presheaves satisfy the following properties:

Proposition 4.3 Given a presheaf P G L which is an object of P,

(i) the underlying graph of P is finite,

(ii) for each non-empty path x ^ y there exists exactly one edge x — y (in particular there is at most one edge between two vertices),

(iii) P(n + 1) is the set of paths of length n in the underlying graph of P, and P(0) is reduced to one element.

Proof We suppose given a presheaf P G P, it preserves limits by Theorem 3.2. The diagram on the left

s2 3 s2 P(s2) P(3) P(s2)

-,, s2) v ' p (s0) v k0 ^ ^

2 2 P (2) P (2)

s0 1 si P(s1) P(1) P(s\)

is a pushout in L, or equivalently the dual diagram is a pullback in Lop. Therefore, writing D for the diagram 2 s, 1 il 2 in L, a presheaf P G P should satisfy P((colimD)op) = limP(Dop), i.e. the above pushout diagram in L should be transported by P into the pullback diagram in Set depicted on the right of the above figure. This condition can be summarized by saying that P should satisfy the isomorphism P(3) = P (2) x p(1) P (2) (and this isomorphism should respect obvious source and target maps given by the fact that the functor P should send a limiting cone to a limiting cone). From this fact, one can deduce that the elements a of P(3) are in bijection with the paths x — y — z of length 2 in the underlying graph of P going from x = P(s|s1)(a) to z = P(s2SQ)(a). In particular, this implies that for any path a = x — y — z of length 2 in the underlying graph of P, there exists an edge x — z, which is P(sf)(a). More generally, given any integer n > 1, the object n + 1 is the colimit in L of the diagram

s !_ 2 s0 s 1 2 s0 s l 2 s0 s 1 2 s0 (6)

v01> \0 y x01> v01

with n + 1 occurrences of the object 1, and n occurrences of the object 2. Therefore, for every n G N, P(n + 1) is isomorphic to the set of paths of length n in the underlying graph. Moreover, since the diagram

>2<0 s> 2<0 *^2<0 sly 2k0 (7)

with n + 1 occurrences of the object 1 also admits the object n + 1 as colimit, we should have P(n + 1) = P(n + 1) x P(2) between any two vertices x and y, i.e. for every non-empty path x ^ y there exists exactly one edge x — y. Also, since the

object 0 is initial in L, it is the colimit of the empty diagram. The set P(0) should thus be the terminal set, i.e. reduced to one element. Finally, since I is dense, P should be a finite colimit of the representables Ni (1) and Ni (2), the set P (1) is necessarily finite, as well as the set P(2) since there is at most one edge between two vertices. □

Conversely, we wish to show that the conditions mentioned in the above proposition exactly characterize the presheaves in P among those in L. In order to prove so, by Theorem 3.2, we have to show that presheaves P satisfying these conditions preserve finite limits in L, i.e. that for every finite diagram D : J 4 L admitting a colimit we have P(colim D) = lim(P o Dop). It seems quite difficult to characterize the diagrams admitting a colimit in L, however the following lemma shows that it is enough to check diagrams "generated" by a graph which admits a colimit.

Lemma 4.4 A presheaf P G L preserves finite limits if and only if it sends the colimits of diagrams of the form

El(G) -4 G4 L (8)

to limits in Set, where G G G is a finite graph such that the above diagram admits a colimit. Such a diagram in L is said to be generated by the graph G.

Proof In order to check that a presheaf P G L preserves finite limits, we have to check that it sends colimits of finite diagrams in L which admit a colimit to limits in Set, and therefore we have to characterize diagrams which admit colimits in L. Suppose given a diagram K : J 4 L. Since I is dense, every object of linear is a colimit of a diagram involving only the objects 1 and 2 (see Definition 4.2). We can therefore suppose that this is the case in the diagram K. Finally, it can be shown that diagram K admits the same colimits as a diagram containing only sq and s} as arrows (these are the only non-trivial arrows in L whose source and target are 1 or 2), in which every object 2 is the target of exactly one arrow sq and one arrow s}. For instance, the diagram in L below on the left admits the same colimits as the diagram in the middle.

2 4*1 3 s2si 2 s0 si 2 s0 si 2 si

A / \00

s0 1 11 1 1 1 1

Any such diagram K is obtained by gluing a finite number of diagrams of the form 1 il 2 sk 1 along objects 1, and is therefore of the form El(G) 4 G 4 L for some

finite graph G G G: the objects of G are the objects 1 in K, the edges of G are the objects 2 in K and the source and target of an edge 2 are respectively given by the sources of the corresponding arrows s} and s0 admitting it as target. For instance, the diagram in the middle above is generated by the graph on the right. The fact that every diagram is generated by a presheaf (is a discrete fibration) also

follows more abstractly and generally from the construction of the comprehensive factorization system on Cat [8,11]. □

Among diagrams generated by graphs, those admitting a colimit can be characterized using the following proposition:

Lemma 4.5 Given a graph G G G, the associated diagram (8) admits a colimit in L if and only if there exists n gL and a morphism f : G — Njn in L such that every morphism g : G — Nj m in L, with m gL, factorizes uniquely through Nj n:

g4.Ni^ Nj m g

Proof Follows from the existence of a partially defined left adjoint to Nj, in the sense of [8], given by the fact that I is dense (see Definition 4.2). □

We finally arrive at the following concrete characterization of diagrams admitting colimits:

Lemma 4.6 A finite graph G G G induces a diagram (8) in L which admits a colimit if and only if it is "tree-shaped", i.e. it is

(i) acyclic: for any vertex x, the only path x ^ x is the empty path,

(ii) connected: for any pair of vertices x and y there exists a path x ^ y or a path y ^ x.

Proof Given an object n G L, recall that Njn is the graph whose objects are elements of [n] and there is an arrow i — j if and only if i < j. Given a finite graph G, morphisms f : G — Njn are therefore in bijection with functions f : Vg — [n], where Vg denotes the set of vertices of G, such that f (x) < f (y) whenever there exists an edge x — y (or equivalently, there exists a non-empty path x ^ y).

Consider a finite graph G G G, by Lemma 4.5, it induces a diagram (8) admitting a colimit if there is a universal arrow f : G — Nj n with n gL. From this it follows that the graph is acyclic: otherwise, we would have a non-empty path x ^ x for some vertex x, which would imply f (x) < f (x). Similarly, suppose that G is a graph with vertices x and y such that there is no path x ^ y or y ^ x, and there is an universal morphism f : G — Nj n for some n gL. Suppose that f (x) < f (y) (the case where f (y) < f (x) is similar). We can define a morphism g : G — Nj(n +1) by g(z) = f (z)+1 if there is a path x ^ z, g(y) = f (x) and g(z) = f (z) otherwise. This morphism is easily checked to be well-defined. Since we always have f (x) < f (y) and g(x) > g(y), there is no morphism h : Njn — Nj(n + 1) such that h o f = g.

Conversely, given a finite acyclic connected graph G, the relation < defined on morphisms by x < y whenever there exists a path x ^ y is a total order. Writing n for the number of vertices in G, the function f : G — Njn, which to a vertex associates the number of vertices strictly below it wrt <, is universal in the sense of Lemma 4.5. □

Proposition 4.7 The free conservative finite cocompletion P of L is equivalent to the full subcategory of L whose objects are presheaves P satisfying the conditions of Proposition 4.3.

Proof By Lemma 4.4, the category P is equivalent to the full subcategory of L whose objects are presheaves preserving limits of diagrams of the form (8) generated by some graph G G G which admits a colimit, i.e. by Lemma 4.6 the finite graphs which are acyclic and connected. We write Gn for the graph with [n] as vertices and edges i 4 (i + 1) for 0 < i < n — 1. It can be shown that any acyclic and connected finite graph can be obtained from the graph Gn, for some n G N, by iteratively adding an edge x 4 y for some vertices x and y such that there exists a non-empty path x ^ y. Namely, suppose given an acyclic and connected finite graph G. The relation < on its vertices, defined by x < y whenever there exists a path x ^ y, is a total order, and therefore the graph G contains Gn, where n is the number of edges of G. An edge in G which is not in Gn is necessarily of the form x 4 y with x < y, otherwise it would not be acyclic. Since by Proposition 4.3, see (7), the diagram generated by a graph of the form

is preserved by presheaves in P (which corresponds to adding an edge between vertices at the source and target of a non-empty path), it is enough to show that presheaves in P preserve diagrams generated by graphs Gn. This follows again by Proposition 4.3, see (6). □

One can notice that a presheaf P EP is characterized by its underlying graph since P(0) is reduced to one element and P(n) with n > 2 is the set of paths of length n in this underlying graph: P = I* (I*P). We can therefore simplify the description of the cocompletion of L as follows:

Theorem 4.8 The free conservative finite cocompletion P of L is equivalent to the full subcategory of the category G of graphs, whose objects are finite graphs such that for every non-empty path x ^ y there exists exactly one edge x 4 y. Equivalently, it can be described as the category whose objects are finite sets equipped with a transitive relation <, and functions respecting relations.

In this category, pushouts can be explicitly described as follows:

Proposition 4.9 With the last above description, the pushout of a diagram in P (B, <B) (A, <A) 44 (C, <C) is B i±J C/ ~ with B 3 b ~ c E C whenever there exists a E A with f (a) = b and f (a) = c, equipped with the transitive closure of the relation inherited by <b and <c .

Lines with labels. The construction can be extended to the labeled case (i.e. L is not necessarily a singleton). The forgetful functor L 4 Set sending a presheaf P to the set P(1) admits a right adjoint ! : Set 4 L. Given n E N* the elements of !L(n) are words u of length n over L, with !L(srn-l)(u) being the word obtained from u by removing the i-th letter. The free conservative finite cocompletion P of L is the slice category L/!L, whose objects are pairs (P,£) consisting of a finite presheaf P E L together with a labeling morphism £ : P 4 !L of presheaves. Alternatively, the description of Proposition 4.8 can be straightforwardly adapted by labeling the

elements of the objects by elements of L (labels should be preserved by morphisms), thus justifying the use of labels for the vertices in following examples.

5 Examples

In this section, we give some examples of merging (i.e. pushout) of patches.

Example 5.1 Suppose that starting from a file ab, one user inserts a line a' at the beginning and c in the middle, while another one inserts a line d in the middle. After merging the two patches, the resulting file is the pushout of

h al db

which is

Example 5.2 Write Gi for the graph with one vertex and no edges, and G2 for the graph with two vertices and one edge between them. We write s,t : G1 — G2 for the two morphisms in P. Since P is finitely cocomplete, there is a coproduct G1 + G1 which gives, by universal property, an arrow seq : G1 + G1 — G2:

Gi G\ + G\

or graphically

that we call the sequentialization morphism. This morphism corresponds to the following patch: given two possibilities for a line, a user can decide to turn them into two consecutive lines. We also write seq' : G1 + G1 — G2 for the morphism obtained similarly by exchanging s and t in the above cocone. Now, the pushout of

which illustrates how cyclic graphs appear in P during the cocompletion of L.

Example 5.3 With the notations of the previous example, by taking the coproduct of two copies of idG 1 : G1 — G1, there is a universal morphism G1 + G1 — G1, which illustrates how two independent lines can be merged by a patch (in order to resolve conflicts).

6 Handling deletions of lines

All the steps performed in previous sections in order to compute the free conservative finite cocompletion of the category L+ can be adapted in order to compute the cocompletion P of the category L as introduced in Definition 2.1, thus adding

support for deletion of lines in patches. In particular, the generalization of the description given by Theorem 4.8 turns out to be as follows.

Theorem 6.1 The free conservative finite cocompletion P of the category L is the category whose objects are triples (A, <,££) where A is a finite set of lines, < is a transitive relation on A and £ : A 4 L associates a la,beJ to each line, and morphisms f : (A, <a,£a) 4 (B, <b ,£b) are partial functions f : A 4 B such that for every a, a' G A both admitting an image under f, we have £b(f (a)) = £a(o), and a <a a' implies f (a) <b f (a').

Similarly, pushouts in this category can be computed as described in Proposition 4.9, generalized in the obvious way to partial functions.

Example 6.2 Suppose that starting from a file abc, one user inserts a line d after a and the other one deletes the line b. The merging of the two patches (in P') is the pushout of

i.e. the file adc. Notice that the morphism /2 is partial: b has no image.

Interestingly, a category very similar to the one we have described in Theorem 6.1 was independently proposed by Houston [3] based on a construction performed in [2] for modeling asynchronous processes. This category is not equivalent to ours because morphisms are reversed partial functions: it is thus not the most general model (in the sense of being the free finite cocompletion). As a simplified explanation for this, consider the category FinSet which is the finite cocompletion of 1. This category is finitely complete (in addition to cocomplete), thus FinSetop is finitely cocomplete and 1 embeds fully and faithfully in it. However, FinSetop is not the finite cocompletion of 1. Another way to see this is that this category does not contain the "merging" morphism of Example 5.3, but it contains a dual morphism "duplicating" lines.

7 Concluding remarks and future works

In this paper, we have detailed how we could derive from universal constructions a category which suitably models files resulting from conflicting modifications. It is finitely cocomplete, thus the merging of any modifications of the file is well-defined.

We believe that the interest of our methodology lies in the fact that it adapts easily to other more complicated base categories L than the two investigated here: in future works, we should explain how to extend the model in order to cope with multiple files (which can be moved, deleted, etc.), different file types (containing text, or more structured data such as xml trees). Also, the structure of repositories (partially ordered sets of patches) is naturally modeled by event structures labeled by morphisms in P, which will be detailed in future works, as well as how to

model usual operations on repositories: cherry-picking (importing only one patch from another repository), using branches, removing a patch, etc. It would also be interesting to explore axiomatically the addition of inverses for patches, following other works hinted at in the introduction.

Once the theoretical setting is clearly established, we plan to investigate algorithmic issues (in particular, how to efficiently represent and manipulate the conflicting files, which are objects in P). This should eventually serve as a basis for the implementation of a theoretically sound and complete distributed version control system (no unhandled corner-cases as in most current implementations of vcs).

Acknowledgement

The authors would like to thank P.-A. Mellies, E. Haucourt, T. Heindel, T. Hirschowitz and the anonymous reviewers for their enlightening comments and suggestions.

References

[1] J. Adamek and J. Rosicky. Locally presentable and accessible categories, volume 189. Cambridge Univ. Press, 1994.

[2] R. Cockett and D. Spooner. Categories for synchrony and asynchrony. Electronic Notes in Theoretical Computer Science, 1:66—90, 1995.

[3] R. Houston. On editing text. http://bosker.wordpress.com/2012/05/10/on-editing-text.

[4] J. Jacobson. A formalization of darcs patch theory using inverse semigroups. Technical report, CAM report 09-83, UCLA, 2009.

[5] M. Kelly. Basic concepts of enriched category theory, volume 64. Cambridge Univ. Press, 1982.

[6] S. Mac Lane. Categories for the Working Mathematician, volume 5 of Graduate Texts in Mathematics. Springer Verlag, 1971.

[7] S. Mac Lane and I. Moerdijk. Sheaves in geometry and logic: A first introduction to topos theory. Springer, 1992.

[8] R. Pare. Connected components and colimits. Journal of Pure and Applied Algebra, 3(1):21—42, 1973.

[9] M. Ressel, D. Nitsche-Ruhland, and R. Gunzenhauser. An integrating, transformation-oriented approach to concurrency control and undo in group editors. In Proceedings of the 1996 ACM conference on Computer supported cooperative work, pages 288—297. ACM, 1996.

[10] D. Roundy and al. The Darcs Theory. http://darcs.net/Theory.

[11] R. Street and R. Walters. The comprehensive factorization of a functor. Bull. Amer. Math. Soc, 79(2):936—941, 1973.

A A geometric interpretation of presheaves on L+

Since presheaf categories are sometimes a bit difficult to grasp, we recall here the geometric interpretation that can be done for presheaves in L+. We forget about labels of lines and for simplicity suppose that the empty file is not allowed (the objects are strictly positive integers). In this section, we denote this category by L. The same reasoning can be performed on the usual category L+, and even L, but the geometrical explanation is a bit more involved to describe.

In this case, the presheaves in L can easily be described in geometrical terms: the elements P of L are presimplicial sets. Recall from Proposition 2.4 that the category L is the free category whose objects are strictly positive natural integers, containing for every integers n £ N* and i £ [n+1] morphisms sn : n 4 n+1, subject to the relations s"+1sn = s^is™ whenever 0 < i < j < n. Writing y : L4 L for the Yoneda embedding, a representable presheaf y(n +1) £ L®11 can be pictured geometrically as an n-simplex: a 0-simplex is a point, a 1-simplex is a segment, a 2-simplex is a (filled) triangle, a 3-simplex is a (filled) tetrahedron, etc.:

y(1) y(2) y(3) y(4)

Notice that the n-simplex has n faces which are (n — 1)-dimensional simplices, and these are given by the image under y(s"n), with i £ [n], of the unique element of y(n+1)(n+1): the i-th face of an n-simplex is the (n—1)-simplex obtained by removing the i-th vertex from the simplex. More generally, a presheaf P £ L®11 (a finite presimplicial set) is a finite colimit of rep- b g c

resentables: every such presheaf can be pictured as a gluing of sim- •--m

plices. For instance, the half-filled square on the right corresponds to / j h the presimplicial set P with P(1) = {a, b, c, d}, P(2) = {/, g, h, i,j}, •——

P(3) = {a} with faces P(s1)(/) = a, P(s0)(/)= b, etc. a i d

Similarly, in the labeled case, a labeled presheaf (P,£) £ L/! can be pictured as a presimplicial set whose vertices (0-simplices) are labeled by elements of L. The word labeling of higher-dimensional simplices can then be a

ab /"a

deduced by concatenating the labels of the vertices it has as iterated y

faces. For instance, an edge (a 1-simplex) whose source is labeled by a b aC and target is labeled by b is necessarily labeled by the word ab, etc. bc "c

More generally, presheaves in L+ can be pictured as augmented presimplicial sets and presheaves in L as augmented simplicial sets, a description of those can for instance be found in Hatcher's book Algebraic Topology.

B Proofs of classical propositions

In this section, we briefly recall proofs of well-known propositions as our proofs rely on a fine understanding of those. We refer the reader to [7] for further details.

Proposition 4.1 Given a functor F : C — D, with D cocomplete, the associated nerve Nf : D — C admits a left adjoint Rf : C — D called the realization along F. This functor is defined on objects P £ C by

Rf(P) = colim(El(P) C — D)

Proof In order to show the adjunction, we have to construct a natural family of isomorphisms D(Rf(P),D) = C(P,NfD) indexed by a presheaf P £ C and an object D £ D. A natural transformation d £ C(P, NfD) is a family of functions (do : PC — NfDC)cec such that for every morphism f : C' — C in C the diagram

P(CD(FC, D)

P(f)\ p(Ff,D)

P (C') ^ D(FC',D)

commutes. It can also be seen as a family (do(p) : FC — D)(o,p)eEi(P) of morphisms in D such that the diagram

FC a < , Fnp(C,p)

~ "" go (p)

1c> (p')

Oc (p)

D or equivalently Fnpf

FC'Oc' (p (f )(p)) Fnp (C ',p')

commutes for every morphism f : C' — C in C. This thus defines a cocone from Fnp : El(P) — D to D, and those cocones are in bijection with morphisms Rf(P) — D by definition of Rf(P) as a colimit: we have shown D(Rf(P),D) = C(P,Nf(D)), from which we conclude. □

The equivalence between the two conditions of Definition 4.2 can be shown as follows.

Proposition B.1 Given a functor F : C — D, the two following conditions are equivalent:

(i) the associated nerve functor Nf : D — C is full and faithful,

(ii) every object of D is canonically a colimit of objects in C: for every D GD,

D = colim(El(NFD) —^ C — D)

Proof In the case where D is cocomplete, the nerve functor Nf : D — C admits Rf : C — D as right adjoint, and the equivalence amounts to showing that the right adjoint is full and faithful if and only if the counit is an isomorphism, which is a classical theorem [6, Theorem IV.3.1]. The construction can be adapted to the general case where D is not necessarily cocomplete by considering colim(El(-) —+ C — D) : C — D as a partially defined left adjoint (see [8]) and generalizing the theorem. □

S. Mimram, C. Di Giusto / Electronic Notes in Theoretical Computer Science 298 (2013) 283—307 299

C Proofs of the construction of the finite cocompletion

Lemma 4.4 A presheaf P £ L preserves finite limits, if and only if it sends the colimits of diagrams of the form

El(G) —4 g4 L

to limits in Set, where G £ G is a finite graph such that the above diagram admits a colimit. Such a diagram in L is said to be generated by the graph G.

Proof In order to check that a presheaf P £ L preserves finite limits, we have to

check that it sends colimits of finite diagrams in L which admit a colimit to limits

in Set, and therefore we have to characterize diagrams which admit colimits in L.

The number of diagrams to check can be reduced by using the facts that limits

commute with limits [6]. For instance, the inclusion functor I : G 4 L is dense,

which implies that every object n £L is canonically a colimit of the objects 1 and 2

by the formula n = colim(El(^In) 4 G 4 L), see Definition 4.2. Thus, given a

finite diagram K : J 4 L, we can replace any object n different from 1 and 2

occurring in the diagram by the corresponding diagram El(Win) 4 G 4 L, thus obtaining a new diagram K' : J 4 L which admits the same colimit as K. This shows that P will preserve finite limits if and only if it preserves limits of finite diagrams in L in which the only occurring objects are 1 and 2. Since the only non-trivial arrows in L between the objects 1 and 2 are sq,s{; : 1 4 2, and removing an identity arrow in a diagram does not change its colimit, the diagram K can thus be assimilated to a bipartite graph with vertices labeled by 1 or 2 and edges labeled by s0 or sQ, all edges going from vertices 1 to vertices 2.

We can also reduce the number diagrams to check by remarking that some pairs of diagrams are "equivalent" in the sense that their image under P have the same limit, independently of P. For instance, consider a diagram in which an object 2 is the target of two arrows labeled by sQ (on the left). The diagram obtained by identifying the two arrows along with the objects 1 in their source (on the right) can easily be checked to be equivalent by constructing a bijection between cocones of the first and cocones of the second.

2 ••• 2 ^2^ 2 ••• 2

1 •• 11

1 •■"■■ 1 ■•■•• 1

2 -• 2 2 2 2

A W-4-W A

1 1*^1 1 •• 1

More precisely, if we write K : J' 4L and K : J 4 L for the two diagrams and J : J' 4 J for the obvious functor, the canonical arrow colim(Ko J) 4 colim(K) is an isomorphism, i.e. the functor J is final. The same reasoning of course also holds with sQ instead of sj. We can therefore restrict ourselves to considering diagrams in which 2 is the target of at most one arrow sQ, and of at most one arrow sQ. Conversely, if an object 2 is the target of no arrow sj (on the left), then we can add a new object 1 and a new arrow from this object to the object 2 (on the right) and

300 S. Mimram, C. Di Giusto /Electronic Notes in Theoretical Computer Science 298 (2013) 283-307 obtain an equivalent diagram:

2 2 ••• 2 2 2 ••• 2

f-.4¥ a si|f-i¥i

1 01 1

The same reasoning holds with s1 instead of s0 and we can therefore restrict ourselves to diagrams in which every object 2 is the target of exactly one arrow s0 and one arrow s1.

Any such diagram K is obtained by gluing a finite number of diagrams of the form

„1 2 „1 1 1

along objects 1, and is therefore of the form El(G) — G — L for some finite graph G G G: the objects of G are the objects 1 in K, the edges of G are the objects 2 in K and the source and target of an edge 2 are respectively given by the sources of the corresponding arrows s1 and s1 admitting it as target. For instance, the diagram on the left

/ \ / \ / \ 1 K 1 A

is generated by the graph on the right. □

Lemma 4.5 Given a graph G G G, the associated diagram (8) admits a colimit in L if and only if there exists n gL and a morphism f : G — Njn in L such that every morphism g : G — Nj m in L, with m GL, factorizes uniquely through Nj n:

g4NI^ Nj m g

Proof We have seen in proof of Proposition 4.1 that morphisms in L(G,Njn) are in bijection with cocones in L from El(G) G L to n, and moreover given a morphism h : n — m in G the morphism L(G, Njn) — L(G, Njm) induced by post-composition with Nj h is easily checked to correspond to the usual notion of morphism between n-cocones and m-cocones induced by Nj h (every morphism Nj n — Nj m is of this form since Nj is full and faithful). We can finally conclude using the universal property defining colimiting cocones. □

D Proofs for deletions of lines

In this section, we detail proofs of properties mentioned in Section 6.

D.1 Sets and partial functions

Before considering the conservative finite cocompletion of the category L, as introduced in Definition 2.1, it is enlightening to study the category PSet of sets and partial functions. A partial function f : A — B can always be seen

(i) as a total function f : A — B W {1a} where 1a is a fresh element wrt A, where given a G A, f (a) = 1a means that the partial function is undefined on A,

(ii) alternatively, as a total function f : Aw{1a} — Bw{1b} such that f (1a) = 1B. This thus suggests to consider the following category:

Definition D.1 The category pSet of pointed sets has pairs (A, a) where A is a set and a G A as objects, and morphisms f : (A, a) — (B,b) are (total) functions f : A — B such that f (a) = b.

Point (ii) of the preceding discussion can be summarized by saying that a partial function can be seen as a pointed function and conversely:

Proposition D.2 The category PSet of sets and partial functions is equivalent to the category pSet of pointed sets.

It is easily shown that the forgetful functor U : pSet — Set, sending a pointed set (A, a) to the underlying set A, admits a left adjoint F : Set — pSet, defined on objects by FA = (A W {-La}, -a). This adjunction induces a monad T = UF on Set, from which point (i) can be formalized:

Proposition D.3 The category PSet is equivalent to the Kleisli category Sety associated to the monad T : Set — Set.

Finally, it turns out that the category pSet of pointed sets might have been discovered from PSet using "presheaf thinking" as follows. We write G for the full subcategory of PSet containing two objects: the empty set 0 = 0 and a set 1 = {*} with only one element, and two non-trivial arrows * : 0 — 1 and 1 : 1 — 0 (the undefined function) such that lo* = id0. We write I : G — PSet for the inclusion functor. Consider the associated nerve functor Nj : PSet — G. Given a set A the presheaf Nj A G G is such that:

• NjAO = PSet(I0,A) = {*}: the only morphism 0 — A in PSet is noted *,

• NjA1 = PSet(11,A) = A W {1A}: a morphism 1 — A is characterized by the image of *G A which is either an element of A or undefined,

• Nj A* : Nj A1 — Nj A0 is the constant function whose image is *,

• NjAl : NjA0 — NjA1 is the function such that the image of * is 1a.

Moreover, given A,B G PSet a natural transformation from Nj A to Nj B is a pair of functions f : A W {1a} — B W {1b } and g : {*} — {*} such that the diagrams

A W{1a} f B W {1b} A W{1a} ^w{1b}

Nj A4 ¡Nj B* and Nj A^ fNj B±

{*} {*}

commutes. Since {*} is the terminal set, such a natural transformation is characterized by a function / : A W {-La} 4 B W {Lb} such that /(-a) = —B. The functor Ni : PSet 4 G is thus dense and its image is equivalent to pSet.

D.2 A cocompletion of L

The situation with regards to the category L is very similar. We follow the plan of Section 4 and first investigate the unlabeled case: L is the category with integers as objects and partial injective increasing functions / : [to] 4 [n] as morphisms / : m 4 n.

We write G for the full subcategory of L whose objects are 0, 1 and 2. This is the free category on the graph

d0 ^0-d1

subject to the relations

sjsj = sQsJ djsj = idi dJsQ = id2 dQsQ = id2 djdj = dJdQ (D.1)

(see Proposition 2.4). We write I : G 4 L for the embedding and consider the associated nerve functor Ni : L4G. Suppose given an object n £L, the associated presheaf Nin can be described as follows. Its sets are

• Ni n0 = L(I0,n) = {★},

• Nin1 = L(11, n) = [n] W {-},

• Ni n2 = L(12,n) =

{(i,j) £ [n] x [n] | i < j} W {(L,i) | i £ [n]} W {(i, L) | i £ [n]} W {(-, -)}: a partial function / : 2 4 n is characterized by the pair of images (/(0),/(1)) of 0,1 £ [n], where L means undefined.

and morphisms are

• NInsj : NIn1 4 Nin0 is the constant function whose image is *,

• Nindj : Ni n0 4 Nin1 is the function whose image is L,

• NInsQ : NIn2 4 Nin1 is the second projection,

• Ni nsQ : Ni n2 4 Ni n1 is the first projection,

• NIndj : NIn1 4 NIn2 sends i £ [n] W {-L} to (L,i)

• NIndQ : NIn1 4 NIn2 sends i £ [n] W {-L} to (i, L)

Such a presheaf can be pictured as a graph with Nin1 as set of vertices, Nin2 as set of edges, source and target being respectively given by the functions Ni nsQ and

Ni nsj:

Its vertices are elements of [n] W {1} and edges are of the form

• i — j with i,j G [n] such that i < j,

• i — 1 for i G [n]

• 1— i for i G [n]

• 1 — 1

Morphisms are usual graphs morphisms which preserve the vertex 1. We are thus naturally lead to define the following categories of pointed graphs and graphs with partial functions. We recall that a graph G = (V, s, t, E) consists of a set V of vertices, a set E of edges and two functions s,t : E — V associating to each edge its source and its target respectively.

Definition D.4 We define the category pGraph of pointed graphs as the category whose objects are pairs (G, x) with G = (V, E) and x G V such that for every vertex there is exactly one edge from and to the distinguished vertex x, and morphisms f : G — G' are usual graph morphisms consisting of a pair (fv ,fE) of functions fv : Vg — Vg< and fE : Eg — Eg such that for every edge e G Eg, fv(s(e)) = s(fE(e)) and fv(t(e)) = t(fE(e)), which are such that the distinguished vertex is preserved by fv.

Definition D.5 We define the category PGraph of graphs and partial morphisms as the category whose objects are graphs and morphisms f : G — G' are pairs (fv,fE) of partial functions fv : Vg — Vg< and fE : Eg — Eg> such that

• for every edge e G Eg such that fE(e) is defined, fv(s(e)) and fv(t(e)) are both defined and satisfy fv(s(e)) = s(fE(e)) and fv(t(e)) = t(fE(e)),

• for every edge e G Eg such that fv(s(e)) and fv(t(e)) are both defined, fE(e) is also defined.

More briefly: a morphism is defined on an edge if and only it is defined on its source and on its target.

Similarly to previous section, a partial morphism of graph can be seen as a pointed morphism of graph and conversely:

Proposition D.6 The categories pGraph and PGraph are equivalent.

Now, notice that the category L is isomorphic to the full subcategory of PGraph whose objects are the graphs whose set of objects is [n] for some n G N, and

such that there is an edge i — j precisely when i < j. Also notice that the full subcategory of pGraph whose objects are the graphs Njn (with 1 as distinguished vertex) with n G N is isomorphic to the full subcategory of G whose objects are the Njn with n G N. And finally, the two categories are equivalent via the isomorphism of Proposition D.6. From this, we immediately deduce that the functor Nj : L — G is full and faithful, i.e.

Proposition D.7 The functor I : G — L is dense.

We can now follow Section 4 step by step, adapting each proposition as necessary. The conditions satisfied by presheaves in P introduced in Proposition 4.3 are still valid in our new case:

Proposition D.8 Given a presheaf P G L which is an object of P,

(i) the underlying graph of P is finite,

(ii) for each non-empty path x ^ y there exists exactly one edge x — y,

(iii) P(n + 1) is the set of paths of length n in the underlying graph of P, and P(0) is reduced to one element.

Proof The diagrams of the form (6) and (7) used in proof of Proposition 4.3 still admit the same colimit n + 1 with the new definition of L and 0 is still initial. It can be checked that the limit of the image under a presheaf P G L of a diagram (6) is still the set of paths of length n in the underlying graph of P. □

Lemma 4.4 is also still valid:

Lemma D.9 A presheaf P G L preserves finite limits, if and only if it sends the colimits of diagrams of the form

El(G) G — L

to limits in Set, where G G G is a finite pointed graph such that the above diagram admits a colimit. Such a diagram in L is said to be generated by the pointed graph G.

Proof The proof of Lemma 4.4 was done "by hand", but we mentioned a more abstract alternative proof. In the present case, a similar proof can be done but would be really tedious, so we provide the abstract one. In order to illustrate why we have to do so, one can consider the category of elements associated to the presheaves representable by 0 and 1, which are clearly much bigger than in the case of Section 4:

«0 i ; El(NJ°) = *= ,0 —-

0 " dj—

El(Nj 1)

subject to relations which follow directly from (D.1).

Before going on with the proof, we need to introduce a few notions. A functor F : C 4 D is called final if for every category E and diagram G : D 4 E the canonical morphism colim(G o F) 4 colim(G) is an isomorphism [6]: restricting a diagram along F does not change its colimit. Alternatively, these functors can be characterized as functors such that for every object D £ D the category is nonempty and connected (there is a zig-zag of morphisms between any two objects). A functor F : C 4 D is called a discrete fibration if for any object C £ C and morphism g : D 4 FC in D there exists a unique morphism / : C' 4 C in C such that F/ = g called the lifting of g. To any such discrete fibration one can associate a presheaf P £D defined on any D £D by PD = F-1(D) = {C £C | FC = D} and on morphisms g : D' 4 D as the function Pg which to C £ PD associates the source of the lifting of g with codomain C. Conversely, any presheaf P £D induces a discrete fibration El(P) 4 D, and these two operations induce an equivalence of categories between the category D and the category of discrete fibrations over D. It was shown by Pare, Street and Walters [8,11] that any functor F : C4 D factorizes as final functor J : C 4 E followed by a discrete fibration K : E 4 D, and this factorization is essentially unique: this is called the comprehensive factorization of a functor. More explicitly, the functor K can be defined as follows. The inclusion functor Set 4 Cat which send a set to the corresponding discrete category admits a left adjoint n0 : Cat 4 Set, sending a category to its connected components (its set of objects quotiented by the relation identifying two objects linked by a zig-zag of morphisms). The discrete fibration part K above can be defined as El(P) 4 D where P £ I? is the presheaf defined by P = n0(—/F). In this precise sense, every diagram F in D is "equivalent" to one which is "generated" by a presheaf P on D (we adopted this informal terminology in the article in order to avoid having to introduce too many categorical notions).

In our case, we can thus restrict to diagrams in L generated by presheaves on L. Finally, since I : G 4 L is dense, we can further restrict to diagrams generated by presheaves on G by interchange of colimits. □

Lemma 4.6 applies almost as in Section 4: since the morphism / : G 4 Nin (seen as a partial functions between graphs) has to satisfy the universal property of Lemma 4.5, by choosing for every vertex x of G a partial function gx : G 4 Nim which is defined on x (such a partial function always exists), it can be shown that the function / has to be total. The rest of the proof can be kept unchanged. Similarly,

Proposition 4.7 applies with proof unchanged. Finally, we have that

Theorem D.10 The free conservative finite cocompletion P of L is equivalent to the full subcategory of L whose objects are presheaves P satisfying the conditions of Proposition D.8. Since its objects P satisfy I*I*(P) = P, it can equivalently be characterized as the full subcategory of G whose objects P are

(i) finite,

(ii) transitive: for each non-empty path x ^ y there exists exactly one edge x — y,

(iii) pointed: P(0) is reduced to one element.

From this characterization (which can easily be extended to the labeled case), along with the correspondence between pointed graphs and graphs with partial functions (Proposition D.6), the category is shown to be equivalent to the category described in Theorem 6.1: the relation is defined on vertices x,y of a graph G by x < y whenever there exists a path x ^ y.

As in case of previous section, the forgetful functor pGraph — Graph admits a left adjoint, thus inducing a monad on Graph. The category pGraph is equivalent to the Kleisli category associated to this monad, which is closely related to the exception monad as discussed in [3].

E Modeling repositories

We briefly detail here the modeling of repositories evoked in Section 7. As explained in the introduction, repositories can be modeled as partially ordered sets of patches, i.e. morphisms in L. Since some of them can be incompatible, it is natural to model them as particular labeled event structures.

Definition E.1 An event structure (E, <, #) consists of a set E of events, a partial order relation < on E and incompatibility relation on events. We require that

(i) for any event e, the downward closure of {e} is finite and

(ii) given ei, el and e2 such that ei < ei and ei#e2, we have e1#e2.

Two events e1 and e2 are compatible when they are not incompatible, and independent when they are compatible and neither e1 < e2 nor e2 < e1. A configuration x is a finite downward-closed set of compatible events. An event e2 is a successor of an event ei when ei < e2 and there is no event in between. Given an event e we write le for the configuration, called the cause of e, obtained as the downward closure of {e} from which e was removed. A morphism of event structures f : (E, <, #) — (E', <', #') is an injective function f : E — E' such that the image of a configuration is a configuration. We write ES for the category of event structures.

To every event structure E, we can associate a trace graph T(E) whose vertices are configurations and edges are of the form x — x Ы {e} where x is a configuration such that e $ x and x Ы {e} is a configuration. A trace is a path x ^ y in this

graph. Notice that two paths x ^ y are of the same length. Moreover, given two configurations x and y such that x C y, there exists necessarily a path x ^ y. It can be shown that this operation provides a faithful embedding T : ES 4 Graph from the category of event structures to the category of graphs, which admits a right adjoint.

Example E.2 An event structure with five events is pictured on the left (arrows represent causal dependencies and ~ incompatibilities). The associated trace graph is pictured on the right.

{a, b, c, d} 4

{a,b,c}

b c-^c a

{a,b,c'} {a,b} h {a,c'}-

Definition E.3 A categorical event structure (E, A) in a category C with an initial object consists of an event structure equipped with a labeling functor A : (TE)* 4 C, where (TE)* is the free category generated by the graph TE, such that A0 is the initial object of C and the image under A of every square

eyf X^1

e\ /^2

in TE is a pushout in C.

The following proposition shows that a categorical event structure is characterized by a suitable labeling of events of E by morphisms of C.

Proposition E.4 The functor A is characterized, up to isomorphism, by the image of the transitions le 4 le W {e}.

We can now define a repository to be simply a finite categorical event structure (E, <, #,A : T(E) 4 L). Such a repository extends to a categorical event structure (E, <, #o,I ◦ A : T(E) 4 P), where #o is the empty conflict relation. The .state S of such an event structure is the file obtained as the image S = I o A(E) of the maximal configuration: this is the file that the users is currently editing given his repository. Usual operations on repositories can be modeled in this context, for instance importing the patches of another repository is obtained by a pushout construction (the category of repositories is finitely cocomplete).