Cantus Planus Austria

Handwritten Text and Music Recognition (HTMR)

The CANTUS Database and CANTUS Index projects show that short inventories of the texts and melodies of chants are an important precondition for documenting medieval music manuscripts and making them accessible. However, one significant drawback of the current inventory and referencing procedure is the way the data is generated: scholars must carry out this task manually. Because the work is so laborious, musicological research has only a very limited and incomplete data pool available for more in-depth analyses. To achieve meaningful results, both the quantity and the quality of the data have to be significantly improved.

Newer approaches in Digital Humanities and Computer Science allow this error-prone and time-consuming method to be automated. Technological advances in Handwritten Text and Music Recognition make it possible to automatically recognise and transcribe the content of digitised music manuscripts. This content can then be systematically structured, classified and referenced. Referencing based on comparing the full texts with each other, together with these technological advances, will improve data quality and ultimately save time.
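As a rough illustration of referencing by full-text comparison, the following minimal Python sketch matches a recognised chant text against a small reference index using the standard library's difflib. It is not the project's method: the function names, the similarity threshold of 0.8 and the identifiers in REFERENCE_INDEX are assumptions chosen for this example only.

```python
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lowercase the text, drop punctuation and collapse whitespace."""
    kept = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def best_match(recognised: str, index: dict[str, str], threshold: float = 0.8):
    """Return (reference id, score) for the most similar reference text.

    The id is None when no reference reaches the threshold (an assumed
    cut-off, not a value taken from the project).
    """
    query = normalise(recognised)
    best_id, best_score = None, 0.0
    for ref_id, ref_text in index.items():
        score = SequenceMatcher(None, query, normalise(ref_text)).ratio()
        if score > best_score:
            best_id, best_score = ref_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Hypothetical reference index: the ids are placeholders, not real CANTUS IDs.
REFERENCE_INDEX = {
    "ref-001": "Ad te levavi animam meam",
    "ref-002": "Populus Sion ecce Dominus veniet",
    "ref-003": "Gaudete in Domino semper",
}

# A recognised text with a typical orthographic variant (u for v).
match_id, match_score = best_match("Ad te leuaui animam meam", REFERENCE_INDEX)
print(match_id, round(match_score, 2))
```

Fuzzy rather than exact matching is used here because recognised texts can differ from reference texts through spelling variants and recognition errors; a real system would need a far larger index and a more careful choice of similarity measure.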

In the context of a planned project, the training data required for text and music recognition (deep learning processes developed within the field of artificial intelligence research) will be compiled by chant research specialists, drawing upon a corpus of around 300 medieval Austrian manuscripts with plainchant. Developing these technologies further and applying them to this corpus offers a comprehensive solution with far-reaching benefits. The extensive body of manuscripts is ideal training material, and the newly developed online tools will be made openly accessible to the scientific community free of charge, enabling further research. Additionally, the project will open up for study a corpus of manuscripts that has so far been insufficiently documented.

Alongside the more technical tasks described above, musicological analysis of the automatically generated data pool will be the main focus of the project work. For the first time, this data will provide complete transcriptions of the melodies of Mass and Office chants from a comprehensive corpus of secular and monastic traditions that is representative at least of the German-Austrian area.

Based on this extensive data stock, an attempt will be made to carry out far-reaching analyses of the repertoire and the melodies. Automated comparisons between the repertoire recorded in the normative data from Libri ordinarii (generated in a previous project) and the contents of the practical music sources investigated here will play an important role. Further analysis of the melodies will also provide evidence of their melodic variants.