How can we cope with texts that have been heavily corrected? There are two basic approaches: the documentary method and the layering method 1).

The documentary method

In the documentary method 2) no attempt is made to connect changes together. Instead, markup is used to record the simple facts on the page, such as: ‘there was a deletion here’ or ‘this word was replaced by this other word, then it was crossed out and replaced by this’, etc.

The advantage of the documentary method is simplicity. Because it doesn’t claim to interpret the changes, it doesn’t assert any possibly incorrect theory about how the text evolved. But the documentary method also doesn’t say very much, and instead of reaching out to a wider public it really looks inward, to a small clique of scholarly users.

Almost the only thing you can do with texts encoded this way is to display them in a diplomatic view. A complex display, which is basically a redrawing of the manuscript page, can be built up from a combination of diacritics, layout and text colours. Ordinary users, coming across such a display, will quickly hit the button that normalises the text into something more readable. Scholarly users will probably get more out of the original page image, since that is not subject to the distortion of the encoding process. Since parts of a clearly readable transcription can be linked to parts of the image, diplomatic representations do not appear to be very useful and, depending on the complexity of the document, they can also be very expensive to produce.

The layering method

The rationale behind layering is that the author or other persons evidently revised the document in a sequence of stages. If that sequence can be recovered, then the different states of the text can be separated from the apparent mass of corrections so that coherent transcriptions of each state can be extracted. These states are akin to separate drafts of a document, except that they were all written within one physical copy.

The problem with this method is that it is often not possible to completely separate all the layers of revision across an entire document. However, there are several ways to connect many of the changes that initially appear to be independent. Examples will be taken from Harpur because images of his manuscripts are out of copyright. But you would find identical cases in texts of any language, time period or text type. Writing, after all, is writing.

1. Changes may be connected by metre

In poetry a change in one part of a line that disrupts the metre may be connected to a change in the same line to compensate. Consider the following pair of changes in Harpur’s Genius Lost:

Figure 1: Changes connected by metre

The original reading was ‘That startled me with something of her air’. When the author changed ‘startled’ to ‘struck’ a syllable was lost, and so ‘something’ was changed into ‘a semblance’ to compensate. The rhythm is disturbed if one or the other change occurred independently, but is preserved if they are both taken together.
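The arithmetic behind this argument can be checked mechanically. The sketch below is only an illustration: the syllable counts are assigned by hand (an assumption, since automatic syllable counting is unreliable), and the names SYLLABLES, TEMPLATE and syllable_count are invented for the example. It counts the syllables in each of the four combinations of the two changes.

```python
from itertools import product

# Syllable counts assigned by hand: an assumption of this sketch, not computed.
SYLLABLES = {
    "That": 1, "startled": 2, "struck": 1, "me": 1, "with": 1,
    "something": 2, "a": 1, "semblance": 2, "of": 1, "her": 1, "air": 1,
}

# The line with the two points of change left open.
TEMPLATE = "That {verb} me with {noun} of her air"
VERBS = ["startled", "struck"]
NOUNS = ["something", "a semblance"]


def syllable_count(line):
    """Sum the hand-assigned syllable counts of every word in the line."""
    return sum(SYLLABLES[word] for word in line.split())


# Syllable count for every combination of the two changes.
counts = {
    (verb, noun): syllable_count(TEMPLATE.format(verb=verb, noun=noun))
    for verb, noun in product(VERBS, NOUNS)
}

for (verb, noun), total in counts.items():
    print(f"{verb!r} + {noun!r}: {total} syllables")
```

Only the original pair and the fully revised pair give ten syllables. Applying one change without the other gives nine or eleven, which is why the two changes are best treated as a single act of revision.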

2. Changes may be connected by pen colour or hand

Consider the following fragment, again from Genius Lost:

Figure 2: Changes connected by pen colour

Here the first version reads:

There is not one green mound existent long,
    Nor ancient wayside stone,
But upon there some child of bitter wrong
    Hath sate him down alone
    To bite his pallid lip, and groan.

All the corrections are done in an ink of clearly lighter colour, which suggests they were carried out at the same time. The revised text reads:

There is not one green mound existent long,
    In any region nor old wayside stone,
On which some weary child of shame and wrong
    Hath sate not there alone
    To bite his pallid lip, and hear the unpitied groan.

A further layer is discernible when ‘unpitied’ is changed into ‘unheeded’. So in this one section there are three states, all of which make sense.

3. Changes may be connected by grammar

Changes may also be connected by grammar: for example, the change from a singular to a plural subject leads to a change in a subsequent verb, as in this case:

Figure 3: Changes connected by grammar

The original version read:

    Of the most desolate death beds of those
Who against this same Darkness fought
    (Each in the age, whereon they rose
To burn, like the peculiar stars of night,

And was changed through a series of connected corrections to:

    Of all the desolate death beds of those
Who against this same Darkness fought
    (Each in his age, on which his spirit rose
Like some peculiar star of night,

Here the plural ‘stars’ is changed to ‘star’ to agree with the change in subject from ‘they’ to ‘his spirit’. So we know that these changes all occurred together on the basis of grammar alone.

4. Changes may be connected by transposition

Changes may also be connected by a corresponding deletion and addition used to carry out a transposition:

Figure 4: Changes connected by transposition

Here ‘beyond it’ and ‘eternal’ are transposed, so we know that both changes occurred at the same time.

5. Changes may be connected by sense

Changes may simply be connected by sense:

Figure 5: Changes connected by sense

Here the changes from the first to the final version are governed in part by grammar, metre and transposition but also by sense, from

                And when the sun’s
Broad disk hath sunk into the tree tops, there
Seeming a moment ere he sets all barred

to

                And when the sun
Hath sunk into the tree tops and is seen
A moment ere it sets all barred across

You can’t have ‘is seen’ and ‘Seeming’ in the same sentence so these changes must have occurred at the same time.

‘Islands’ of change

Outside the immediate vicinity of an individual change our ability to connect changes diminishes rapidly. Once we have gone beyond a few sentences the only possible way to connect changes is by hand or pen colour. So this leaves the connected changes as ‘islands’ that have no special relation to each other.

It’s true that treating island-states as if they were document-wide states implies that, e.g., a level three change in one part of the document was contemporary with a level three change elsewhere, when in fact it probably wasn’t. But there are three defences to this objection.

  1. Unimportance. Since there is no connection by definition between changes that occur in two islands, there can be no harm in displaying their states together in the same layer.
  2. Utility. The point of discerning layers of correction is that it results in coherent texts. The alternative is to leave the changes inline and get incoherent texts that cannot be compared. This way you can see the text as it mutates over time, and it is always readable.
  3. Hypocrisy. Even when the changes are recorded inline, a succession of local changes is still being recorded. The claim that it is better not to number the succession of corrections explicitly, because doing so might misinterpret the text, falls down on the simple observation that the documentary method also defines and numbers layers, only implicitly.
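
One way to picture the first two defences is as a simple rule for assembling document-wide layers from island-local states. The sketch below assumes that an island with fewer states than the deepest island carries its last state forward. That rule and the function name document_layers are assumptions of this example, not something prescribed by any particular tool. The readings are those quoted for Figures 6 and 7 later in this section, treated here as two unrelated islands.

```python
def document_layers(islands):
    """Combine island-local states into document-wide layers.

    `islands` is a list of islands in document order; each island is a list
    of its successive coherent states, earliest first. Layer n of the
    document takes state n of every island, or the island's last state if
    it has fewer than n states (an assumption of this sketch).
    """
    depth = max(len(states) for states in islands)
    layers = []
    for level in range(depth):
        layers.append(
            [states[min(level, len(states) - 1)] for states in islands]
        )
    return layers


islands = [
    # Island with two states.
    ["Of mine own soul!", "Of mine own spirit!"],
    # Island with four states.
    [
        "Of evil: in the clearest well there lies",
        "As ever in the clearest fountain lies",
        "So in the clearest fountain ever lies",
        "As in the clearest fountain ever lies",
    ],
]

layers = document_layers(islands)
for number, layer in enumerate(layers, start=1):
    print(f"Layer {number}:")
    for line in layer:
        print("   ", line)
```

Every layer produced this way is a readable text, even though nothing is claimed about whether state two of one island was written at the same time as state two of the other.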

Transcribing layers

But how does one begin to transcribe a complex document into layers? For each change we must ask: what was the original text written by the author? If possible, that should be written down first. And changes are not always above or below the line, or in the margins. Often the first change is in the same line as its replacement text:

Figure 6: Inline correction

Here the author originally wrote:

Of mine own soul!

which he immediately corrected to:

Of mine own spirit!

then continued writing. Here the sentence ends immediately after ‘soul!/spirit!’, but quite often the sentence continues without any clear substitution. In such cases the replacement text can be regarded as a substitute for the remainder of the sentence.

Having discerned the initial text, each correction should be read to discern the coherent states that existed in each local place of change, and then the successive states should be written down. This is based on the assumption that when the author changed the text he/she would very probably have left it in a coherent state, e.g.

Figure 7: Layers of correction

Here the coherent states of the last line are:

Of evil: in the clearest well there lies
As ever in the clearest fountain lies
So in the clearest fountain ever lies 
As in the clearest fountain ever lies

The layers can be recorded in two ways. The first method is to use embedded markup like TEI-XML, and then create the layers by separating them later using a ‘splitter’ program (as in the Ecdosis Import service). In this way you can record each layer as a <rdg> (reading) element within an <app>. Since this has a hierarchical structure, and variation often does not, you may be forced to repeat text between layers. However, this doesn’t matter since the layers will later be merged and any text in common will be reduced to one copy automatically. In this way it is conceptually possible to encode even the most convoluted nests of variants.
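
A minimal sketch of what such a splitter might do is given below. It is not the Ecdosis Import service: the fragment is simplified TEI without a namespace, and identifying layers by a wit attribute with values such as #layer1 is an assumption made for this example, as are the function names.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free TEI fragment (hypothetical encoding of the
# inline correction in Figure 6). The wit values are assumptions.
SAMPLE = (
    '<l>Of mine own '
    '<app><rdg wit="#layer1">soul</rdg><rdg wit="#layer2">spirit</rdg></app>'
    '!</l>'
)


def layer_text(elem, layer):
    """Return the text of elem as it stood in the requested layer."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "app":
            # Keep only the reading that belongs to the requested layer.
            for rdg in child.findall("rdg"):
                if rdg.get("wit") == layer:
                    parts.append(layer_text(rdg, layer))
        else:
            # Any other element is shared by all layers; recurse into it.
            parts.append(layer_text(child, layer))
        # Text following the child element is common to all layers.
        parts.append(child.tail or "")
    return "".join(parts)


def split_layers(xml, layer_ids):
    """Map each layer id to the plain text of that layer."""
    root = ET.fromstring(xml)
    return {layer: layer_text(root, layer) for layer in layer_ids}


layers = split_layers(SAMPLE, ["#layer1", "#layer2"])
for layer_id, text in layers.items():
    print(layer_id, "->", text)
```

The text outside the <app> is shared, while each <rdg> contributes only to its own layer, so the two layers come out as ‘Of mine own soul!’ and ‘Of mine own spirit!’.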

The second method is the same except that you copy the text into separate files, or do the equivalent in the MML editor: create a new layer, copy the text from the previous layer, change it until it matches the current state, and save. This is the preferred technique because it is much easier on the eyes of the transcriber.
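
Whichever way the layers are recorded, merging them relies on finding the text they have in common. The sketch below illustrates the idea for two layers using Python’s standard difflib module. It is not the merging algorithm actually used, and the function name merge_two_layers and the labels ‘both’, ‘old’ and ‘new’ are invented for the example.

```python
from difflib import SequenceMatcher


def merge_two_layers(old, new):
    """Merge two layers word by word, storing shared text only once.

    Returns a list of (label, text) pairs, where label is 'both' for text
    common to the two layers, 'old' for text only in the earlier layer and
    'new' for text only in the later one.
    """
    a, b = old.split(), new.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    merged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.append(("both", " ".join(a[i1:i2])))
        else:
            # 'replace', 'delete' or 'insert': keep each side separately.
            if i1 < i2:
                merged.append(("old", " ".join(a[i1:i2])))
            if j1 < j2:
                merged.append(("new", " ".join(b[j1:j2])))
    return merged


merged = merge_two_layers("Of mine own soul!", "Of mine own spirit!")
for label, text in merged:
    print(f"{label:>4}: {text}")
```

Here ‘Of mine own’ is stored once and only the differing words are kept per layer, which is why repeating text between layers during transcription costs nothing in the end.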

Economising on layers

Teasing out all the layers in a document can be tedious; it may also cost you more in transcription and proofreading than you can afford. A reasonable shortcut therefore is to transcribe the first and last layers only, or at a pinch only the final layer. Beware, though, that this reduces the information content of the transcription. It moves it away from being a true record of the changes and towards a utilitarian transcription for a particular purpose.


1) Domenico Fiormonte, Valentina Martiradonna, Desmond Schmidt, 2010. ‘Digital Encoding as a Hermeneutic and Semiotic Act: The Case of Valerio Magrelli’, Digital Humanities Quarterly 4.1.

2) Elena Pierazzo, 2014. ‘Digital documentary editions and the others’, Scholarly Editing 35.


Last modified Thu Oct 8 18:54:28 AEST 2015