What You Get on a CD
Apart from CDs containing only photographs, the data on these CD ROMs consists of pictures (graphics files) of every page, generated by scanning the original book(s). The scans of the text pages are not photographic images but black and white 1 bit images, which means that a printed page is very close to the original. Examples of the pages are given for most of the books.
There are no plain text files of the books on these CD ROMs. The files are Adobe Acrobat .pdf files (unless stated otherwise). Readers are provided on the CD ROMs for both PCs and Macs (including iMacs). Navigation of the CD ROM within Acrobat is made easy and quick by means of "navigation panes". Most of the CDs contain "active links" (similar to hyperlinks on a web page) on the Contents and Index pages, and all but the National Gazetteer are machine searchable. Please note that those CDs that are machine searchable will generally not have active links on the index pages.
Included on all our Yorkshire CDroms are 5 maps:
- A map showing the wapentakes.
- 3 maps showing the parishes within the Ridings.
- A detailed map of Yorkshire showing which places were within which parishes in the pre 1832 era (i.e. the ancient parishes).
The CDs contain the best scanned images we can produce, so that they are easily readable on a computer monitor without strain. The scanned images of text pages on the CD ROMs are black and white (not greyscale) unless otherwise stated, and when printed will give you a good facsimile of the page as it was when originally printed. Around 95% of the spurious dots have been removed from the images (and 99.9% of the smaller ones), and the images are centred on pages which are, where possible, a consistent size. The images are as square to the page as possible.
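As an illustration of what removing spurious dots from a 1 bit image involves, here is a minimal Python sketch. It is not the method used to produce these CDs (the cleaning process is not described here); the function name `despeckle`, the `min_pixels` threshold and the tiny sample page are all assumptions made for the example. It treats any connected group of black pixels smaller than the threshold as a speck and clears it.

```python
def despeckle(image, min_pixels=3):
    """Return a copy of a 1 bit image (rows of 0/1, 1 = black) with small isolated blobs cleared."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill to collect one connected group of black pixels
            # (pixels touching diagonally count as connected).
            blob, stack = [], [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            # Groups below the (assumed) threshold are treated as specks.
            if len(blob) < min_pixels:
                for by, bx in blob:
                    cleaned[by][bx] = 0
    return cleaned


# Hypothetical 6 x 4 pixel "page": a 2 x 2 block of ink and one stray dot.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
cleaned = despeckle(page)
for row in cleaned:
    print("".join("#" if pixel else "." for pixel in row))
```

A size threshold of this kind cannot tell a stray dot from a genuine full stop or the dot of an "i", which is one plausible reason why a cleaning step would stop short of removing every spurious dot.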
All of the books except for the National Gazetteer are machine searchable i.e. you can search for a word (or name) or a phrase. Being machine searchable also allows the Acrobat "Read out loud" facility to work for those who are visually impaired.
All books are scanned with the pages as flat as possible to the image sensor in order to keep scanning distortions to a minimum. Overhead scanners are not used, as these give horizontally compressed characters towards the centre of the book due to the curve of the pages, and do not remove the print skewing relative to the page edges that occurs in many of the older books.
The images on the CDrom are in Acrobat pdf format, and are presented as individual books accessible from a common starting point which should appear automatically when the CDrom is inserted in the computer. Within each book, each page is presented separately. Photographs and detailed drawings are scanned in photographic mode, and where these are part of a text page, they are presented separately with a resolution such that all the detail of the original can be seen.
To make an Acrobat image searchable, the content of the image must also be available as plain text, so that the search software can look for individual words rather than for the image of a word. Generating the plain text file requires that the whole of the book is put through OCR (Optical Character Recognition). The resulting text file is rarely perfect, so a search for a given word or phrase cannot be guaranteed to be 100% reliable, though in most cases it will be around 99.9%. Within the OCR software (currently Omnipage 15), proof-reading is carried out in order to remove errors in the text due to typos in the original (such as "succeeeding", reversed pairs of letters such as "teh" for "the", inverted letters, etc.), and to missing or distorted letters or words. As a rule, the sample .pdf files given on this site for the books are searchable.
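The following Python sketch illustrates why a search of OCR text can miss a word. It is only an illustration and says nothing about how Acrobat or Omnipage work internally; the word list is a made-up fragment of OCR output in which "Baines" has been misread as "Bames", and the helper names `exact_hits` and `near_hits` are hypothetical. An exact search finds nothing, while a looser comparison using Python's standard `difflib` module still finds the misread word.

```python
import difflib


def exact_hits(words, term):
    """Positions of words that match the search term exactly (ignoring case)."""
    return [i for i, word in enumerate(words) if word.lower() == term.lower()]


def near_hits(words, term, cutoff=0.7):
    """Words that are similar to the search term, for catching OCR misreadings."""
    lowered = [word.lower() for word in words]
    return difflib.get_close_matches(term.lower(), lowered, n=5, cutoff=cutoff)


# Hypothetical OCR output: the letters "in" have been read as "m".
ocr_words = ["Bames", "Directory", "of", "Yorkshire"]

print(exact_hits(ocr_words, "Baines"))  # the exact search misses the word
print(near_hits(ocr_words, "Baines"))   # the looser search still finds it
```

If a name you expect to find does not turn up, it can therefore be worth trying likely misreadings of it, or searching for another word from the same entry.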
The CDRoms & Cases
Recordable CDs (CD-Rs) from various manufacturers are used, all branded, such as TDK, DataSafe, DataMedia, Memorex etc., and the data on every CD is verified after "burning".
All CD ROMs are supplied in a virtually unbreakable black plastic case. These cases are similar to a DVD case, with the same width and height as a standard CD jewel case but 1mm thicker. To remove the CD, simply press the button in the middle and the CD will pop up from the clip. If you try to remove the CD by pulling, you will break the CD.
The CDroms are mailed in a "Jiffy" (bubble wrap padded) bag, in which the CD case just fits. This has proved very reliable with no CDrom case damage. If more than one CD is ordered, each CD will be dispatched as a single item - this is to keep the possibility of damage/losses to a minimum.
About the Scanning Process
All text pages are scanned at 300 dots per inch (dpi). If printed at the same size as the page scanned, you will get a good quality reproduction of the original page, but please remember that the original page may have been anything but perfect (Baines' Directory and Langdale's in particular). To the best of our knowledge, each of the pages on these CD ROMs is fully readable, though in some cases there may be only partial characters. The scanned page files are 1 bit images and not photographic images (i.e. each point on the viewed page is either black or white, with no greys). This ensures that you obtain a well defined image when printing.
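To give a feel for what 300 dpi and 1 bit mean in practice, the short Python calculation below works out the pixel dimensions and uncompressed size of a scanned page. The 5 x 8 inch page is an assumed size chosen for the example, not the size of any particular book, and the files on the CDs will be smaller than these figures because the images inside .pdf files are normally compressed.

```python
import math


def scan_size(width_in, height_in, dpi=300, bits_per_pixel=1):
    """Return (pixel_width, pixel_height, uncompressed_bytes) for a scanned page."""
    px_w = round(width_in * dpi)
    px_h = round(height_in * dpi)
    # Each row of pixels is rounded up to a whole number of bytes.
    row_bytes = math.ceil(px_w * bits_per_pixel / 8)
    return px_w, px_h, row_bytes * px_h


# Hypothetical 5 x 8 inch page (an assumption for illustration only).
one_bit = scan_size(5, 8)                     # black and white, 1 bit per pixel
greyscale = scan_size(5, 8, bits_per_pixel=8)  # 256 greys, 8 bits per pixel

print(one_bit)    # (1500, 2400, 451200)
print(greyscale)  # (1500, 2400, 3600000)
```

On these assumptions a 1 bit page needs roughly an eighth of the space of the same page in greyscale before compression.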
Typical problems with the original books are:
- Brown spotting or "foxing": the spots have usually been removed.
- Too much ink, causing blurring and "tails".
- Too little ink, causing partially missing characters.
- Leaning type, causing only the left or the right hand side of a character to appear.
- "Print through" from the back of the page causing double images to appear when the page is scanned. This problem is normally invisible to you on the CDs, as the scanner is adjusted for each page that has this problem.
- "Ink spread" caused by the ink seeping across the paper over the years. This causes the original page to go a yellowy brown. Again, this problem is normally invisible to you on the CDs, as the scanner is adjusted for each page that has this problem.
- Skewed printing. In general, all the pages are presented to you within 0.3 degrees of true. However, in some cases the original is out of true, particularly the last pages of the Baines East and North Riding.
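To put the 0.3 degree figure for skewed printing in context, the Python calculation below estimates how far a line of text drifts up or down from one end to the other at that angle. The 4 inch line length is an assumption for the example; the 300 dpi resolution is the scanning resolution stated above.

```python
import math


def skew_drift(line_length_in, skew_degrees, dpi=300):
    """Vertical drift in pixels between the two ends of a text line at a given skew angle."""
    return line_length_in * dpi * math.tan(math.radians(skew_degrees))


# Hypothetical 4 inch line of text at the 0.3 degree limit.
drift_px = skew_drift(4, 0.3)
drift_mm = drift_px / 300 * 25.4

print(round(drift_px, 1))  # about 6.3 pixels
print(round(drift_mm, 2))  # about 0.53 mm
```

On these assumptions the drift is about half a millimetre across the line.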