E-TEXTS

TRANSCRIPTION POLICY




Transcribing text from Victorian periodicals onto the Web raises several problems. These e-texts seek to preserve as much information as possible from the original and to reproduce as closely as possible the contemporary reading experience.

No changes are made to the text itself. Wilkie Collins was very particular about the appearance of his work and gave specific instructions to the printer. Victorian spelling, punctuation, and even printing errors are kept. So the texts here are exact character-for-character, line-for-line, and page-for-page transcripts of the original. Where the original was printed in two columns side by side on one page, each column is transcribed separately and the second follows below the first.

However, limitations of the language which is used to write Web pages (called HTML) and of the browsers which read it mean that exact preservation of typography is impossible. But it is kept as far as is possible within these restrictions.

Five areas in particular cause difficulty.

Justification
The type in Victorian periodicals was always fully justified; that is, it had a straight margin on both sides of the type. That was achieved by adding spaces between words so that every line was a constant length. This highly skilled task was done by hand and eye, and indeed had been since printing from moveable type began in the 15th century. Nowadays it is done automatically by electronic typesetters and word processors. Unfortunately, there is presently no way of creating a straight right-hand margin in HTML, so it has been left ragged.

Typeface
Different computers have different capabilities for representing fonts, and font specification in HTML is in its infancy. Set your browser to use Times Roman.

Headings
Page headings are preserved but are difficult to display correctly. HTML strips out multiple spaces back to one space; it does not support tabs; and centred text lines up around the centre line of the display, not of the text. Centred headings and ranged-right page numbers on narrow text are difficult to get right on Web pages. Headings are placed at the top of the page in the best representation possible. Sometimes page headings were split over left-hand and right-hand pages, so that the even-numbered left-hand page has the first half of the heading and the odd-numbered right-hand page has the other half. If a second column follows on, it is left without page number or heading and is denoted by a rule.
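
The effect of HTML on a spaced-out heading can be illustrated with a short sketch in Python. It is only an approximation of what a browser does to ordinary text, not a description of how these e-texts were produced; the function name collapse_whitespace and the sample heading are invented for the illustration.

```python
import re


def collapse_whitespace(line: str) -> str:
    """Roughly mimic how a browser displays ordinary HTML text.

    Runs of spaces, tabs and line breaks are shown as a single space,
    so any spacing used to centre a heading or push a page number to
    the right-hand margin is lost. (Hypothetical helper, for
    illustration only.)
    """
    return re.sub(r"[ \t\r\n]+", " ", line).strip()


# An invented page heading, spaced by hand as a printer might set it.
heading = "        A PAGE HEADING.              123"
displayed = collapse_whitespace(heading)
print(displayed)
```

The centring spaces and the gap before the page number both disappear, which is why headings here can only approximate the printed layout.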

Paragraphs
Indented paragraphs are not possible in HTML. In these e-texts paragraphs are shown by a blank line.

Victorian punctuation
To our eyes Victorian punctuation was weird. Exclamation and question marks are freely used within sentences; quotation marks are placed after rather than before full stops (periods); and commas are used frequently. All this is preserved, as are the extra spaces before many punctuation marks. However, long dashes, which were very common especially in humorous work, are represented by three short dashes, thus ---, which is about the same length. Opening and closing quotation marks, which should be curved one way for opening and another for closing, are simply kept straight, as the curved versions are not standard characters that will appear in a common way on all browsers.
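
The two substitutions described above (long dashes to three short dashes, curved quotation marks to straight ones) could be sketched in Python as follows. This is a minimal illustration under assumptions: the name normalise_punctuation and the replacement table are invented here, and the source does not say that the e-texts were prepared with any such program. Everything else, including a space before a punctuation mark, passes through unchanged.

```python
# Hypothetical replacement table: only long dashes and curved
# quotation marks are altered; all other characters are kept.
REPLACEMENTS = {
    "\u2014": "---",  # long (em) dash -> three short dashes
    "\u2018": "'",    # opening single quotation mark -> straight
    "\u2019": "'",    # closing single quotation mark -> straight
    "\u201c": '"',    # opening double quotation mark -> straight
    "\u201d": '"',    # closing double quotation mark -> straight
}


def normalise_punctuation(text: str) -> str:
    """Apply the substitutions above and leave the rest untouched."""
    for original, substitute in REPLACEMENTS.items():
        text = text.replace(original, substitute)
    return text


# An invented sentence with Victorian-style spacing before the "!".
sample = "\u201cStop !\u201d he cried\u2014and then was silent."
print(normalise_punctuation(sample))
```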

Data capture
The text is scanned and then translated into ASCII characters using optical character recognition (OCR) software. It is then carefully checked against the original both for text and typography.


All material on these pages is © Paul Lewis 1996