Eventual Consistency

From DiVersions
Revision as of 10:25, 10 April 2022 by Michael Murtaugh

Michael Murtaugh

Got milk? (1968)

A now canonized moment in many tellings of the history of computation is the so-called Mother of all demos, when Douglas Engelbart and a team of engineers performed a kind of technical theater at a computer conference in San Francisco in 1968. In the demo, Engelbart sits in front of a computer display and proceeds to use something that (to a modern eye) seems to fuse elements of a text editor, outliner, drawing program, and web browser. At one point he begins to prepare a grocery list starting with a list of desired items, then proceeds to show off the system’s ability to categorise and display the list in a variety of orders, eventually producing a graphical optimal route to go home and complete his to-dos. At another point (through a bit of stage magic using then state-of-the-art video screen splitting equipment), Engelbart appears to initiate a kind of video conference with a colleague at a remote location and the two proceed to edit some computer code together.[1]

The Wikipedia article Collaborative real-time editor was first created by user MattisManzel on May 30, 2005[2], and initially listed just two software examples: SubEthaEdit and MoonEdit. On November 19, 2008, a link is added to a new Wikipedia page (also created that day) called Etherpad. A reference to Douglas Engelbart and the “mother of all demos” is added on December 7, 2009 by an anonymous editor. Despite getting marked as vandalism and removed the same day, the reference returns on the 20th of December, where it has remained until today.[3]

etherpad + OT

This essay started as research into the algorithms underlying the Etherpad application[4], a browser-based collaborative text editor that has become central to many practices around Constant. In my shared role as a Constant system administrator, and as a participant in the Diversions worksession, I have developed a somewhat intimate relationship to this software object. The particularities of the software, and the centrality of its role to much of the collaborative work around Constant, have significantly driven my own software practices, as well as those of others around me. The necessity to manage problems such as etherpad’s ever-growing (and eventually unmanageably large) database size, as well as the need to provide continuous access to documents given the often unstable nature of the server (and its related ecosystem of plugins), has led to the development of scripts to manage static snapshots of etherpad content[5]. Similarly, the software’s high demands on network connectivity when a large number of users work together in the same (physical) place led to extensive thinking about and implementation of local and portable infrastructures[6].

If you want to find out how Etherpad’s Easysync works (the library that makes it really real-time), start with this PDF (complex, but worth reading).[7]

My explorations started in the Etherpad documentation, my interest piqued by the phrase “complex, but worth reading” that appears in the README of the project’s source code. This led, in typical web fashion, to a mesh of interlinking (video) presentations and academic papers describing algorithms and techniques around collaborative (digital) writing systems. Two particular techniques were repeatedly invoked and appeared central to this research: the operational transformation (OT) and the conflict-free replicated data type (CRDT).

David Greenspan, one of the original developers of Etherpad (and presumably an author of that complex, but worth reading PDF document), gave a helpful presentation at a special “Etherpad Meetup” in San Francisco in 2013 providing a technical description of Operational Transformation situated in the particular history of the etherpad project.[8] Greenspan starts his presentation by describing his current work for a startup web-framework project. Etherpad was itself born from another web framework called AppJet, a project Greenspan co-founded with two ex-Google engineers in 2007 to push the limits of applications based on a browser/server infrastructure rather than that of traditional offline applications. In mid 2008 he writes an email to colleagues proposing that they should try to build a cross between superhappychat (another early AppJet prototype) and SubEthaEdit (a Mac-based collaborative editor) where ‘merging should be done to an extent that people can comfortably co-edit a document, and perhaps text could be colored by author’. The project is announced publicly just months later, on November 19.


  • A document is a list of characters, or a string.
  • A document can also be represented as a list of changesets.
  • A changeset represents a change to a document.
  • A changeset can be applied to a document to produce a new document.[9]

Greenspan proceeds to describe how etherpad documents are based on the concept of changesets – a concept he claims “developed as the result of an all-nighter”. Changesets are a concise representation of just the changes an author makes to a particular text. For instance, if one were to type the following into etherpad:

this
is
a
text


And then change the last word from “text” to “test”, the complete document as a list of changesets would be:

Changeset               Interpretation
--------------------    --------------------
Z:1>3*0+3$thi           insert 3 characters: thi
Z:4>2=3*0|1+2$s\n       keep 3 chars, insert 1 line: s (newline)
Z:6>2|1=5*0+2$is        keep 1 line, insert 2 characters: is
Z:8>1|1=5=2*0|1+1$\n    keep 1 line, keep 2 characters, insert (newline) 
Z:9>2|2=8*0|1+2$a\n     keep 2 lines, insert 1 line: a (newline)
Z:b>2|3=a*0+2$te        keep 3 lines, insert 2 characters: te
Z:d>2|3=a=2*0+2$xt      keep 3 lines, keep 2 characters, insert xt
Z:f<1|3=a=2-1$          keep 3 lines, keep 2 characters, delete 1 character
Z:e>1|3=a=2*0+1$s       keep 3 lines, keep 2 characters, insert 1 character: s
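
The notation in the table can be read mechanically. What follows is a minimal Python sketch, not Etherpad's own code, that applies such changeset strings to a text. It assumes the reading given in the interpretation column: numbers are written in base 36, = keeps characters, + inserts characters taken from the text after the $, and - deletes. The *0 attribute markers and the | line counts are parsed but ignored. The name apply_changeset and the assumption that an empty pad holds a single newline are illustrative.

```python
import re

# Header: Z:<old length><'>' or '<'><length change><operations>$<inserted text>
HEADER = re.compile(r"Z:([0-9a-z]+)([><])([0-9a-z]+)([^$]*)\$(.*)", re.DOTALL)
# One operation: optional *attributes, optional |line count, then =, + or - and a length
OPERATION = re.compile(r"(?:\*[0-9a-z]+)*(?:\|[0-9a-z]+)?([=+-])([0-9a-z]+)")


def apply_changeset(changeset, text):
    """Apply one changeset string to text and return the new text."""
    header = HEADER.fullmatch(changeset)
    if header is None:
        raise ValueError("not a changeset: %r" % changeset)
    old_length = int(header.group(1), 36)
    sign = 1 if header.group(2) == ">" else -1
    new_length = old_length + sign * int(header.group(3), 36)
    operations, inserted = header.group(4), header.group(5)
    if len(text) != old_length:
        raise ValueError("changeset does not fit this text")

    result = []
    position = 0  # read position in the old text
    used = 0      # read position in the inserted characters
    for kind, length in OPERATION.findall(operations):
        count = int(length, 36)
        if kind == "=":    # keep characters from the old text
            result.append(text[position:position + count])
            position += count
        elif kind == "-":  # delete: skip characters of the old text
            position += count
        else:              # "+": take characters from after the "$"
            result.append(inserted[used:used + count])
            used += count
    result.append(text[position:])  # the untouched tail is kept implicitly
    new_text = "".join(result)
    if len(new_text) != new_length:
        raise ValueError("length mismatch after applying changeset")
    return new_text


# The nine changesets from the table, with \n written as a real newline.
CHANGESETS = [
    "Z:1>3*0+3$thi",
    "Z:4>2=3*0|1+2$s\n",
    "Z:6>2|1=5*0+2$is",
    "Z:8>1|1=5=2*0|1+1$\n",
    "Z:9>2|2=8*0|1+2$a\n",
    "Z:b>2|3=a*0+2$te",
    "Z:d>2|3=a=2*0+2$xt",
    "Z:f<1|3=a=2-1$",
    "Z:e>1|3=a=2*0+1$s",
]

text = "\n"  # assumption: an empty pad holds a single newline
for changeset in CHANGESETS:
    text = apply_changeset(changeset, text)
print(repr(text))  # 'this\nis\na\ntest\n'
```

Replaying only the first seven changesets gives the text as first typed, ending in “text”; the last two carry the correction to “test”.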

Representing the “document as list of changesets” has something in common with a Wikipedia article, where internally, a database is used to record the entire history of changes editors have made. In this way, it’s possible to view the entire timeline of edits, view differences between those edits, and eventually make “reversions” to previously written formulations or remove unwanted edits. The fact that the history of an article is stored was a crucial part of supporting the radical decision to allow wikis to be edited by users without first “authorizing” them (with for instance a login). In the case of Wikipedia, the granularity of this history is each time an editor saves a version. In the case of etherpad, the granularity of editing is much finer, often reducing edits to a few keystrokes that in effect are automatically committed as each editor types. As each changeset is considered a “revision”, it is usual for a document to have tens of thousands of revisions. This fact, combined with the added overhead that each changeset is timestamped and associated with an author, makes the seemingly compact representation grow to the potentially unmanageable database sizes that can give etherpad system administrators headaches.

While changesets and the accompanying easysync algorithm are not designed for compactness, they are designed for speed in distributing changes to other authors and for automatically merging them into a shared version. To demonstrate this, Greenspan gives the example of the text “Here is a very nice sentence” changed by two editors simultaneously into “Here is a sentence” and “I like a nice sentence”. In the slide, red cross-outs (deletions) and green additions (insertions) show the changesets. The merge algorithm is designed to honor both editors’ deletions, then perform both of their insertions. In this case, the resulting merged sentence is “HereI like a sentence”.

Two editors’ divergent edits are merged

Users of etherpad will recognize the multi-colored formatting of the final sentence, with the two authors’ voices spliced together like wedges of Neapolitan ice cream (each assigned a distinct color). Also familiar would be the way using the software involves moments of rupture, when collectively written texts temporarily break, or swell with duplications to then be resutured or whittled down by further editing. The important point is that the software doesn’t usually choose a final version of the text; rather, the algorithmic merging happens quickly enough to catalyze a social process of negotiation to correct or rework the text. Greenspan, for his part, is cognisant of this, and at one point reminds the audience that the algorithm is really only good for merging changes made in the relatively short time span of network delays. The merging is purely to manage the kind of real or near real time conflicts where editors are assumed present and actively engaged with the text. Using the algorithm to merge changes made independently over longer periods of disconnection, outside the context of this live social negotiation, would not make sense.


Though Greenspan only mentions it by name once at the very beginning of this presentation, the easysync algorithm is an exemplary implementation of an Operational Transformation (OT). In order to zoom into the algorithm, and this idea of transforming operations, Greenspan pares back the edits to an even simpler moment where the text “Hello” is again edited simultaneously to “Hello!” and “Yellow”. He shows a series of diagrams where the two edits are represented as change vectors extending from a common starting point to two points of divergence (opposite corners of the diagram). The key idea of OT is how changes are exchanged and transformed among editors so that changes made by others can be applied to one’s own local (and thus often different) context. In other words, editor two will transform editor one’s exclamation point (made to the original “Hello”) and apply it to their edit “Yellow”, while editor one will receive editor two’s “Y” and “w” (also made to the original “Hello”) and apply them to their edit “Hello!”. Ultimately, in addition to being designed to be efficient, the goal is for the results of all transformations to converge, that is, to produce the same final text. Thus OT “squares” the diagram, with the result of applying the transformed transformations (in principle) returning back to a shared text.
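
The squaring of the diagram can be tried out on the “Hello” example. The following Python sketch is not the easysync algorithm itself but a simplified transformation in the same spirit, in which an edit is a list of keep, insert and delete steps over the whole text. The names apply_edit and transform are hypothetical, and so is the rule that decides the order of insertions landing at the same position (here, those of the first edit go first).

```python
def apply_edit(edit, text):
    """Apply a list of ("keep", n), ("ins", s) and ("del", n) steps to text."""
    result, position = [], 0
    for kind, value in edit:
        if kind == "keep":
            result.append(text[position:position + value])
            position += value
        elif kind == "ins":
            result.append(value)
        else:  # "del"
            position += value
    if position != len(text):
        raise ValueError("edit does not span the whole text")
    return "".join(result)


def transform(first, second):
    """Rewrite two edits of the same text so that each can follow the other.

    Returns (first2, second2) such that applying first then second2
    gives the same text as applying second then first2.
    """
    first, second = list(first), list(second)
    first2, second2 = [], []
    while first or second:
        if first and first[0][0] == "ins":
            # Assumed tie-break: insertions of the first edit go first.
            inserted = first.pop(0)[1]
            first2.append(("ins", inserted))
            second2.append(("keep", len(inserted)))
        elif second and second[0][0] == "ins":
            inserted = second.pop(0)[1]
            first2.append(("keep", len(inserted)))
            second2.append(("ins", inserted))
        else:
            if not first or not second:
                raise ValueError("edits were made to texts of different length")
            (kind1, n1), (kind2, n2) = first.pop(0), second.pop(0)
            n = min(n1, n2)
            # Put back whatever part of the longer step is still unprocessed.
            if n1 > n:
                first.insert(0, (kind1, n1 - n))
            if n2 > n:
                second.insert(0, (kind2, n2 - n))
            if kind1 == "keep" and kind2 == "keep":
                first2.append(("keep", n))
                second2.append(("keep", n))
            elif kind1 == "del" and kind2 == "keep":
                first2.append(("del", n))
            elif kind1 == "keep" and kind2 == "del":
                second2.append(("del", n))
            # If both deleted the same characters, nothing remains to be done.
    return first2, second2


editor_one = [("keep", 5), ("ins", "!")]                            # Hello -> Hello!
editor_two = [("del", 1), ("ins", "Y"), ("keep", 4), ("ins", "w")]  # Hello -> Yellow
one2, two2 = transform(editor_one, editor_two)
at_editor_one = apply_edit(two2, apply_edit(editor_one, "Hello"))
at_editor_two = apply_edit(one2, apply_edit(editor_two, "Hello"))
print(at_editor_one, at_editor_two)  # Yello!w Yello!w
```

Both editors arrive at the same text. With this tie-break that text is “Yello!w”; giving priority to the other editor yields “Yellow!” instead. The transformation guarantees agreement, not that the agreed text is the one either editor had in mind.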

Part of the challenge in working with OT is that (at least to me) there is something enchanting in the teetering between fascination for how it appears to work, and yet a complexity that keeps it just out of reach of full comprehension. The very idea of transforming editing operations, themselves already a kind of transformation (insert/delete characters), seems enticingly recursive. Greenspan: ‘I put this in my head as you’re transforming one operation against another, you are transforming it to allow for this other operation to already have happened’. Greenspan also notes, with a muted sense of wonder, how each of the 5 possible transitions shown in the diagram represents a (slightly) different transformation, and yet somehow they all relate to each other.

Diffraction does not produce “the same” displaced, as reflection and refraction do. Diffraction is a mapping of interference, not of replication, reflection, or reproduction. A diffraction pattern does not map where differences appear, but rather maps where the effects of difference appear.[10]

Thinking about OT, I’m reminded of this passage from Donna Haraway on diffraction. And yet, the results of OT’s merge (“HereI like a sentence”) don’t quite seem to live up to the speculative potentials of such a radical (potentially) diffractive representation.


Hello World! :-)

In a 2018 presentation, researcher Martin Kleppmann describes an alternative technique to OT called Conflict-free replicated data types (CRDTs).[11] Early in the presentation he presents an example of “text editing” nearly identical to that of Greenspan, again showing two simple parallel edits merged into one. As in etherpad, the edits are shown as arrows from one text to another and labeled with their “transformative content” such as “insert World after Hello”. Here, Kleppmann notes that this is the way that Google Docs works. Though Kleppmann never refers to etherpad, the points he makes are valid as both etherpad and Google Docs are based on the same OT-based mechanisms.

Tombstoning OT

Kleppmann then goes on to draw a line between OT and a more recently developed concept, that of CRDTs. The dates listed on his slide (OT (1989-2006), CRDTs (2006-present)) clearly position CRDTs as OT’s successor -- to use terminology “from the literature”, it could be said that Kleppmann here is attempting to tombstone OT.[12] Kleppmann, a researcher in distributed systems at the University of Cambridge, reminds his audience that OT, though originating in 1989 and implemented in a number of projects, has a checkered academic history with many of its theories later disproven. In particular, many “edge cases” have been demonstrated where diverging edits will not converge unless somehow additionally constrained. Notably, the implementation used by Google Docs (and etherpad) works only because it relies on there being a single (centralized) server to ensure a consistent shared state. This, he notes, has serious consequences, such as mobile phones and laptops synchronizing, say, calendar data via a server in a data center rather than directly (despite the fact they may be lying physically next to each other). Thus, he gives a compelling example of how a formal flaw in an algorithm or implementation can have physical consequences, shaping economic decisions and carrying social and ecological implications.

Inserting in the same place

Near the end of his presentation, Kleppmann dives in deep, elaborating the specifics of how a particular CRDT model works to mathematically guarantee convergence between several editors making changes at the same place in a document. The demonstration is also a logical tour de force as Kleppmann flawlessly recites a sublimely abstract brain-fuck of an explanation as both textual content and “meta” information about the positions of the text are reduced to a stream of alphanumeric symbols.[13] The CRDT models Kleppmann presents have been subjected to rigorous formal testing and shown to demonstrate a property that academic literature terms Strong Eventual Consistency: a guarantee that given enough time for messages between editors to circulate, each editor’s document will arrive at the same state.[14] Thus CRDTs (at least theoretically) may be better able to exploit peer to peer network infrastructure and avoid the need for a centralized server.
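
Kleppmann's own model is not reproduced here. As a rough illustration of how identifiers can settle insertions made at the same place, the following Python sketch follows the general idea of list CRDTs in which every character carries a unique identifier and each insertion names the character it comes after. The identifier scheme (a counter paired with an editor name, where the counter is assumed to be larger than any the editor has already seen), the function names and the example edits are all assumptions made for illustration.

```python
def insert(document, new_id, after_id, character):
    """Insert a character after the element with identifier after_id.

    document is a list of (identifier, character) pairs; identifiers are
    (counter, editor) tuples that are unique and can be compared.
    """
    identifiers = [identifier for identifier, _ in document]
    position = 0 if after_id is None else identifiers.index(after_id) + 1
    # Concurrent insertions at the same place: skip past those with a
    # larger identifier, so that every editor ends up with the same order.
    while position < len(document) and document[position][0] > new_id:
        position += 1
    return document[:position] + [(new_id, character)] + document[position:]


def text_of(document):
    return "".join(character for _, character in document)


base = [((i + 1, "base"), c) for i, c in enumerate("Hello")]
last = base[-1][0]

# Two editors insert at the same place (after the final "o") at the same time.
edit_a = ((6, "A"), last, "!")
edit_b = ((6, "B"), last, "?")

# Editor A applies their own edit first and then receives B's; B does the reverse.
seen_by_a = insert(insert(base, *edit_a), *edit_b)
seen_by_b = insert(insert(base, *edit_b), *edit_a)
print(text_of(seen_by_a), text_of(seen_by_b))  # Hello?! Hello?!
```

The two editors receive the edits in opposite orders and still hold the same document, without a server deciding between them. The order of “?” and “!” follows from comparing identifiers, not from anything either editor intended.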

1 + 1 = 3?

Something else significant occurs earlier on in the presentation as Kleppmann generalizes the technique of CRDTs to “data types” other than text. The DT in CRDT refers, after all, to potentially many data types. Kleppmann follows his “Hello World” with two other examples of “collaborative editing”, the graphical similarity of the diagrams suggesting a congruity between each considered case. The first shows that instead of text, editors might be editing sets. Starting from set {a, b}, one editor removes element {b} while another adds element {c} and voila: the algorithm correctly merges both timelines to {a, c} -- no argument there. The second example is one that often appears in discussions of CRDTs as the simplest of cases, a “counter”. Here the two editors are shown with an initial state that says “c = 1”. Arrows are then shown labeled “increment” transforming this “text” into “c = 2”. Kleppmann notes here that you might be tempted to say that both editors are in agreement with c = 2, but that this would in fact “lose information”. Instead, he explains that when the two edits are merged, both editors should arrive at the final state “c = 3”. To support this he notes:

We need to consider not just the state – the value at any one time – because if we compare just 2 and 2 they look the same, but actually we need to capture the sequence of changes that were made to the data and re-apply those, and that will allow us to reach a sensible merged outcome.[15]
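
The set and counter examples can be sketched in a few lines of Python. This is an illustration under assumptions, not Kleppmann's or automerge's code. The counter keeps one tally per editor and merges by taking the larger tally for each editor (a construction usually called a grow-only counter). The set keeps a record of removed elements alongside the added ones (a two-phase set). The editor names and function names are hypothetical.

```python
def merge_counters(left, right):
    """Merge two counters stored as {editor: number of increments}."""
    editors = set(left) | set(right)
    return {editor: max(left.get(editor, 0), right.get(editor, 0)) for editor in editors}


def value(counter):
    return sum(counter.values())


# Shared starting point: c = 1 (one increment, attributed here to "start").
start = {"start": 1}
editor_one = dict(start, one=1)  # editor one increments: c = 2
editor_two = dict(start, two=1)  # editor two increments: c = 2

merged = merge_counters(editor_one, editor_two)
print(value(editor_one), value(editor_two), value(merged))  # 2 2 3

# Comparing only the visible values cannot tell the two increments apart.
print(max(value(editor_one), value(editor_two)))  # 2


def merge_sets(left, right):
    """Merge two sets stored as (added, removed) pairs of Python sets."""
    return (left[0] | right[0], left[1] | right[1])


def members(two_phase_set):
    added, removed = two_phase_set
    return added - removed


shared = ({"a", "b"}, set())
removes_b = (shared[0], shared[1] | {"b"})
adds_c = (shared[0] | {"c"}, shared[1])
print(sorted(members(merge_sets(removes_b, adds_c))))  # ['a', 'c']
```

The counter reaches 3 because the data structure records who incremented, so the two increments count as distinct events. That they are distinct is a choice built into the structure before any merge takes place.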

Here Kleppmann makes a significant and subtle shift from speaking of text to speaking of data. Rather than performing insertions and deletions to text, the object at hand is “data” and “information” that, apparently, are unambiguous and thus can be “sensibly merged” with the precision of a logical operation.

In a textual sense, what exactly two editors might mean when writing “c=2” depends of course on the context. If they were transcribing an algebra lecture, the repetition may indeed simply be that, an insistence upon the fact that c = 2 is indeed what was said. But even accepting the shift from “text” to “data”, the question of context still remains. If indeed the “incrementer” is a “like counter” then arriving at a value of 3 might be the desired outcome.[16] If instead the editors are collectively counting (in drinking-game fashion) the number of times a speaker uses the phrase “eventual consistency”, then c=3 might incorrectly represent a double count (and thus an unearned extra shot of whiskey).

Here the error that Kleppmann makes is not logical (his logic is indeed pristine), but rather an instance of what Katherine Hayles has described as information losing its body. His error is one of omission as logical precision obscures and defers relevant questions of context and meaning.

To cut through this Gordian knot, Shannon and Wiener defined information so that it would be calculated as the same value regardless of the contexts in which it was embedded, which is to say, they divorced it from meaning. In context, this was an appropriate and sensible decision. Taken out of context, the definition allowed information to be conceptualized as if it were an entity that can flow unchanged between different material substrates [...] Thus, a simplification necessitated by engineering considerations becomes an ideology in which a reified concept of information is treated as if it were fully commensurate with the complexities of human thought.[17]

Got milk? (2018)

Kleppmann’s presentation of CRDTs as a solution to the problem of OT’s lack of theoretical precision mirrors a larger problem: technical discussions of collaborative tools often lack consideration of social context, which is necessary in assessing the ultimate efficacy of such systems. Speaking of etherpad, Greenspan makes reference to the fact that any kind of “auto merge” algorithm necessarily involves questions of “preserving intention”. When he arrives at the merged text “HereI like a sentence”, he notes how the editors will ‘have to say... wait a minute, what were we trying to do there.’[18] When he describes the worst-case scenario of two editors simultaneously deleting then editing opposing halves of text, and notes that the algorithm’s output is ‘guaranteed to look weird, but also guaranteed to work’, he’s also assuming that editors are engaged with each other in the realtime loop of the etherpad editor and thus are likely to stop what they’re doing and negotiate. In contrast, when Kleppmann devotes a portion of his presentation to an “application” in a system he co-developed called “automerge”, the example he gives is disappointing. The “toy example” of a shared shopping list doesn’t even bother to address what “conflict” in this case might mean -- namely two containers of milk being purchased at around the same time (a situation for which strong eventual consistency has nothing to offer, as it makes no stipulations about the length of potential gaps of time when participants would be out of communication).

But practical concerns aside, my frustration is less with the uselessness of the example, and more with the lack of engagement with the actual materiality (and very real potential) such systems actually (might) possess. Developing truly inspirational new forms of distributed collaboration requires both a technical precision and level of real-world engagement with what such systems then actually do in a social context. The data do not speak for themselves.

Towards diffractive technotexts

Technotexts: When a literary work interrogates the inscription technology that produces it, it mobilizes reflexive loops between its imaginative world and the material apparatus embodying that creation as a physical presence. [19]

As an antidote to the dreary examples of optimized distributed shopping lists, consider the following two projects that could be said to make productive and imaginative use of the particular materials they make use of and create.

Epicpedia, Annemieke van der Hoek (2008)

In Epicpedia (2008), Annemieke van der Hoek creates a work that makes use of the underlying history that lies beneath the surface of each Wikipedia article.[20] Inspired by the work of Bertolt Brecht and the notion of Epic Theater, Epicpedia presents Wikipedia articles as screenplays, where each edit becomes an utterance performed by a cast of characters (both major and minor) that takes place over a span of time, typically many years. The work uses the API of Wikipedia to retrieve for a given article the sequence of revisions, their corresponding user handles, the summary message (that allows editors to describe the nature of their edit), and the timestamp to then produce a differential reading.

As the series of edits is run through, in this case the article for Bertolt Brecht himself, a particular edit is preceded by “6 days later, at noon”, a readable representation of the elapsed time between the timestamps of the current and previous edits. The script then imagines the action of a given actor/editor, in this case the Wikipedia user handle “Jeremy Butler”, in “embodied” terms reflecting whether the new text of the edit represents an addition, deletion, or both to the previous version. In this case: ‘MAKING HEAVY GESTURE WITH ARMS: [[theater director|director]] OPENS EYES: [[playwright]]’; here the edit has replaced a link to the wiki article on theater directors with one to playwright. Above this appears the (stage) note ‘Brecht was more a playwright than a director’ (the author’s summary when making the edit). The project’s use of the different levels of text of the screenplay (differentiating between “stage directions” and spoken dialog) echoes the similarly variegated textual content of a Wikipedia article, from the actual words that appear in the article, to the edit comment, to the metadata stored alongside the edit in the database. In this way, the reading of a Wikipedia article through the lens of the Epicpedia screenplay is if not truer, then at least more reflective of the sociality underlying its creation. In this form even the appearance of spam, a constant concern to the Wikipedia community, itself finds a dramatic staging, revealing the many normally hidden caretakers of the community hastily making an appearance to chase away the interruptions.

Your World of Text, Andrew Badr (2009)


In Your World of Text (2009), Andrew Badr produces a radically different kind of collaborative writing system to etherpad, with a digital map-style sprawl of an infinite grid of characters.[21] Here conflict resolution occurs at the level of the character, and territory is claimed and reclaimed like graffiti in an urban space, by overwriting in specific locations. The fact that the system is grid oriented enables the system to interface to communities of practice and a history of tools around ASCII art, and the combination with the navigational map paradigm supports surprising content such as role playing style villages mixed with textual and other (typo)graphical content.

In the field of Computer Supported Cooperative Work (CSCW), research aims to develop computer systems which support people working together. What this means in practical terms is the subject of considerable debate in the field, the main tension arguably being between approaches focusing on the development of new technology and those focusing on understanding how people collaborate. [...]

Systems of ideas about how the world works, are important because they influence decisions we make. The designer of a CSCW system will be influenced by how she sees collaboration, just as designers of robots in Artificial Intelligence have been influenced by notions of plan execution.

Part of the work of collaboration [...] is exchanging information about changes [emphasis added]. Support for the process must therefore be flexible enough to allow not only changes about the state of the process and potential influences to take place, but also information contributing to the coordination of the work. Because the whole environment of the work is highly relevant to individual decisions, the process is too complex to be completely described in models. Instead, technological support could be provided for parts of the process in a way which allows its users to define and develop their own methods of coordination and control. [...] Such a view of collaboration allows a great deal of variability. Instead of seeing it as a technical modelling problem, the approach poses variation as central to collaborative work.[22]

Beck’s research reminds us that designers’ and programmers’ ideas of how the world works, and how they view collaboration, have an impact on the final design. The technical decisions made (like storing history, requiring a login, using color to indicate authors) have real world social impact. Beck is writing in 1995 about a case study where she followed a pair of researchers collaboratively writing a research paper using a mixture of technologies available at the time: conventional word processing software, e-mail and telephone calls. Beck reminds us of the importance, to collaboration, of exchanging information about changes and of coordinating work.

When Kleppmann speaks of “preserving information” in the context of his “simple counter” example, we should be reminded of Hayles’s warnings of that ‘ideology in which a reified concept of information is treated as if it were fully commensurate with the complexities of human thought’. In Kleppmann’s case, this preservation is done at the cost of removing/ignoring context. The apparent simplicity of his “toy” examples belies the actual subtleties of human communication and the multiplicities of what it could mean to write something collaboratively.[23]

Beck reminds us of the incompleteness of any model, and the need to consider that which is better left outside of the system. Rather than embracing the premature resolution of a “conflict-free” model, what alternatives might exist for models imbued with Haraway’s “diffractive representation”, based on “mapping the effects of difference”? How might data structures designed to support parallel versions (like git with its many possibilities for branching) be integrated into writing systems as a primary feature that remains legible and rewritable, rather than hidden away in a “history” view? How might such a view support or even encourage convergence (when desired), providing a new kind of parallel legibility of diversion? Artistic projects like Epicpedia and Your World of Text, which feature the inherent messiness of a collaborative writing process, suggest a bridge to a class of future collaborative writing spaces based on a diffractive model that sees ‘variation as central to collaborative work.’

  1. Thierry Bardini, Bootstrapping (Stanford University Press, 2000), 141.
  2. https://en.wikipedia.org/w/index.php?title=Collaborative_real-time_editor&oldid=14449448
  3. https://en.wikipedia.org/w/Collaborative_real-time_editor
  4. See https://etherpad.org/ .
  5. See https://gitlab.constantvzw.org/aa/etherdump .
  6. See https://networksofonesown.constantvzw.org/etherbox/ .
  7. See https://github.com/ether/etherpad-lite/#things-you-should-know .
  8. David Greenspan, "Transforming Text," recorded 2013 at Etherpad SF Meetup, video, 48:04, https://www.youtube.com/watch?v=bV2JGpS-1R0 .
  9. See https://github.com/ether/etherpad-lite/blob/develop/doc/easysync/easysync-full-description.pdf .
  10. Donna Haraway, “The Promise of Monsters,” in The Haraway Reader (Routledge, 2004), 70.
  11. Martin Kleppmann, "CRDTs and the Quest for Distributed Consistency," recorded at QCon London 2018, video, 43:38, https://www.youtube.com/watch?v=B5NULPSiOGw .
  12. See: Gérald Oster, Pascal Molli, Pascal Urso, Abdessamad Imine. Tombstone Transformation Functions for Ensuring Consistency in Collaborative Editing Systems. 2006. https://hal.inria.fr/file/index/docid/109039/filename/OsterCollaborateCom06.pdf
  13. See https://www.youtube.com/watch?v=B5NULPSiOGw&t=2308 .
  14. See also this earlier presentation by Kleppmann: Conflict Resolution for Eventual Consistency, Berlin 2016, https://www.youtube.com/watch?v=yCcWpzY8dIA .
  15. See https://www.youtube.com/watch?v=B5NULPSiOGw&t=536 .
  16. See "Consistency without consensus in production systems" which describes how Soundcloud uses CRDTs to efficiently maintain “like counts” and other statistics on a globally accessible service that scales to artists with millions of followers, https://www.youtube.com/watch?v=em9zLzM8O7c .
  17. N. Katherine Hayles, How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics (University of Chicago, 1999), 53-54.
  18. See https://www.youtube.com/watch?v=bV2JGpS-1R0&t=280 .
  19. N. Katherine Hayles, Writing Machines (MIT Press, 2002), 25.
  20. See also: [Epic Web Design], Annemieke van der Hoek (2008)
  21. https://www.yourworldoftext.com/
  22. Beck, Eevi E. 1995. "Changing documents/documenting changes: using computers for collaborative writing over distance." In The Cultures of Computing, edited by Susan Leigh Star, pp.55-56. Oxford, UK: Blackwell.
  23. In a more recent talk Kleppmann addresses the “hard problems” around CRDTs, many indeed stemming from real world constraints; however, the cleft between the messiness of practice and the precision of his proposed models is still very much in evidence. See: https://www.youtube.com/watch?v=x7drE24geUw