
A Codicological and Linguistic Typology of Common Torah Codices from the Cairo Genizah





Arrant, Estara 


This PhD thesis develops a typology of common Torah codices from the Cairo Genizah. As many of these codices have a non-standard form of Tiberian vocalisation (NST), the thesis also develops a typology of the vocalisation present in the corpus. Common codices lack the full set of features associated with exemplary Bibles (at least two columns of text, written on parchment, with Masoretic notes). NST is the use of Tiberian Hebrew vowel signs and orthography in ways that deviate from standard Tiberian vocalisation (ST), the main reference point for which is Codex Leningradensis. Previous research studied only a small number of NST MSS in order to contextualise them within the development of standard Tiberian, and paid little attention to codicology. Likewise, Bibles which do not reach a certain level of codicological sophistication have never been systematically assessed across a significant corpus. This thesis takes a representative sample of ~1,500 Torah fragments (~1,800 if fragments from a preliminary case study are included) and subjects them to a holistic statistical, codicological, and linguistic analysis. The thesis aims to establish whether the codicological and linguistic features of these Torahs fall into meaningful, correlating patterns reflecting regional variations in codicological style and vowel sign usage. I store the MS data in a MySQL database which I built, and I employ statistical machine learning algorithms to find naturally occurring patterns in this data. These patterns are then analysed for their linguistic and codicological meaningfulness. Each chapter analyses a subsection of MSS: (a) parchment Torahs with 2 columns, but no Masoretic notes; (b) single-column parchment Torahs; (c) paper Torahs written in scribal hands; (d) Torahs written by children and non-scribe laymen. This research has uncovered three key insights into the popular writing of the Torah in the Masoretic and immediate post-Masoretic period.
First, common Torahs show a rich diversity of regional codicological styles. Second, their NST falls into different regional patterns; these can reflect a mixture of phonological language contact with dialectal Arabic, Aramaic influence from Targums, and an imperfect application of standard Tiberian vocalisation rules. Third, these codicological and NST patterns tend to complement each other, so that NST and codicology generally correlate, region by region, throughout the world of the Genizah.





Khan, Geoffrey


Cairo Genizah, Hebrew Bible, Arabic, Aramaic, Torah, Codicology, Linguistics, Language Contact, Phonology, Machine Learning, Digital Humanities, Statistics, Data Science


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
Hebrew Studentship Funding: Faculty of Asian and Middle Eastern Studies; Christ's College Bursary