As part of my interest in digitization and its influence on historical research, I recently attended a conference on digitization hosted by the Van Leer Institute of Jerusalem. One of the sessions, ‘Digitization and the Humanities’, dealt with the developments in the field of digitization, and in particular with an ongoing project, called The Friedberg Genizah Project, which aims to digitize the entire corpus of finds included in the Cairo Genizah.
What is the Cairo Genizah exactly?
The Cairo Genizah is a staggering number of document fragments (some 250,000 in total), quires and books found in a locked synagogue room in Egypt toward the end of the 19th century. Most of the documents, which date from the 9th to the 16th century, were taken from Egypt to England by Professor Schechter of Cambridge, and are still kept there today. The remainder was eventually dispersed throughout the world, with major concentrations of documents found today in the UK, Israel, the United States and some central and eastern European countries. The fragments include Rabbinical, liturgical, biblical and Talmudic works on a variety of subjects, and are generally in a deteriorated state because of the conditions in which they were stored.
The Goals of the Project
The digitization of these fragments therefore poses a serious challenge to the Friedberg Genizah Project teams, not only because of the sheer quantity of documents and their physical condition, but also because the aims of the project are twofold. The first goal is to digitize the documents: the project team scans the fragments at a resolution of 600 DPI, giving a clear image of each fragment even at very high zoom levels. When it emerged that the scanning computer had difficulty distinguishing a fragment from its background, the researchers had to separate the fragments from their original background materials. The team found that the color giving maximum contrast with the fragments was a particular shade of blue, so that is the background they used for the scans.
So far, the team has managed to scan some 85,000 pictures, and has now begun scanning the largest repository of fragments, found in Cambridge, at a rate of 10,000 per month.
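To make the idea of a high-contrast background concrete, here is a minimal sketch of how a scan might be split into "fragment" and "blue background" pixels. It is not the project's actual software, which was not described at the session; the function names, the sample colors and the `margin` threshold are all assumptions made for illustration.

```python
# Hypothetical sketch: separate a fragment from a blue background.
# Pixels are (R, G, B) tuples with values from 0 to 255.
# The margin of 40 is an assumed threshold, not a value used by the project.

def is_blue_background(pixel, margin=40):
    """Return True when blue clearly dominates both red and green."""
    r, g, b = pixel
    return b - max(r, g) >= margin

def fragment_mask(image, margin=40):
    """Return a grid of booleans: True where a pixel belongs to the fragment."""
    return [[not is_blue_background(p, margin) for p in row] for row in image]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the fragment, or None if absent."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, inside in enumerate(row) if inside]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))

BLUE = (30, 60, 200)         # assumed background color
PARCHMENT = (190, 160, 110)  # assumed fragment color

# A tiny 3x4 "scan": three parchment pixels surrounded by blue.
image = [
    [BLUE, BLUE, BLUE, BLUE],
    [BLUE, PARCHMENT, PARCHMENT, BLUE],
    [BLUE, PARCHMENT, BLUE, BLUE],
]

print(bounding_box(fragment_mask(image)))
```

Because brownish parchment and dark ink contain little blue, a simple "does blue dominate?" test is enough in this toy case; a real scan would need a more careful treatment of shadows and faded areas.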
The second goal of the project is perhaps even more ambitious. The Friedberg Genizah team intends to add a second layer of information to the existing scans. When complete, this layer will include an identification, transcription and translation of each fragment. Because there are so many fragments, an identification system was needed, both to catalog the pieces by their various attributes and to help match separate pieces that once belonged to a single, original manuscript.
When the team finds such a match, which is quite rare, the matched fragments are called a ‘join’. Some joins are composed of fragments located in different parts of the world, and have been identified as such for the first time thanks to the Friedberg Genizah Project.
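As a rough illustration of how cataloged attributes could be used to propose joins, the sketch below compares every pair of fragments and keeps the pairs whose measurements agree. The attributes chosen (script type, line height, line spacing), the tolerance and the shelfmarks are invented for this example; the session did not specify which attributes or matching rules the project actually uses.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Fragment:
    shelfmark: str          # hypothetical identifier, not a real shelfmark
    script: str             # assumed attribute: style of handwriting
    line_height_mm: float   # assumed attribute: height of a written line
    line_spacing_mm: float  # assumed attribute: distance between lines

def may_join(a, b, tolerance_mm=0.5):
    """Return True when two fragments share a script and similar measurements."""
    return (
        a.script == b.script
        and abs(a.line_height_mm - b.line_height_mm) <= tolerance_mm
        and abs(a.line_spacing_mm - b.line_spacing_mm) <= tolerance_mm
    )

def candidate_joins(fragments, tolerance_mm=0.5):
    """Return shelfmark pairs that a scholar might want to examine as joins."""
    return [
        (a.shelfmark, b.shelfmark)
        for a, b in combinations(fragments, 2)
        if may_join(a, b, tolerance_mm)
    ]

# Invented sample data for illustration only.
fragments = [
    Fragment("A-1", "square", 4.0, 7.5),
    Fragment("B-7", "square", 4.2, 7.6),
    Fragment("C-3", "cursive", 4.1, 7.5),
]

print(candidate_joins(fragments))
```

A pairwise comparison like this only produces candidates; deciding whether two fragments really belong to the same manuscript remains a judgment for the researchers.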
The second layer also contains a collection of all the research literature ever published on the subject of the Cairo Genizah, as well as software designed to navigate through it. Currently, the site contains roughly one million items of data, describing, among other things, the dimensions and characteristics of the fragments.
There are several ambitious goals for the future of the project. The first is to integrate the digitized images with the metadata, i.e. the literature and annotations. This will allow for a more in-depth experience of the Genizah documents, complete with recent bibliography, transcription and translation. The second aim is to improve the optical character recognition (OCR) capabilities of the research team, allowing for smoother and speedier transcription, and therefore understanding, of the documents. The third aim is to compare different sets of fragments, something that is not currently being done for lack of time. The fourth and final goal is to create an interface allowing scholars around the world to offer their insights on joins previously suggested by the computer.
The Friedberg Genizah Project unites teams from around the world, and has constructed a site that is designed to become a working tool for researchers but is also suitable for people who want to learn more about the project or the Genizah itself. Access to the website is free, and the site itself is quite remarkable in detail and scope.