Context
Music scores held in libraries, archives and cathedrals constitute an immense cultural heritage that is physically deteriorating. Scanning preserves the page images before they degrade further, but the real value comes from transcribing the visual content into a structured symbolic format (a digital score), which enables search, indexing and computational analysis. Optical Music Recognition (OMR) aims to automate this transcription, yet current deep-learning approaches require large labelled datasets. Manual annotation is costly and is made harder by the heterogeneity of music documents, which differ in era, notation system (mensural, neumatic, modern), regional style, material and degree of degradation.
Objectives
The project investigates four complementary strategies to minimise the need for labelled data in OMR:
- Active learning (AL) — selecting the most informative samples to annotate, so that human effort is concentrated where it has the greatest impact on model performance.
- Parameter-efficient fine-tuning (PEFT) — adapting pre-trained models to new collections by updating only a small subset of parameters, including the exploration of large-scale models in OMR tasks.
- Few-shot learning (FSL) — training effective models from a handful of annotated examples, crucial for rare or under-represented collections.
- Domain adaptation (DA) — transferring knowledge from labelled collections to related but unlabelled ones, handling differences in style, notation and document condition.
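To give a sense of scale for "updating only a small subset of parameters", the sketch below counts trainable parameters for one common PEFT family, low-rank adapters (LoRA-style). This is an illustrative assumption: the project description does not name a specific PEFT technique, and the function names and the layer size used here are hypothetical.

```python
def adapter_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of a low-rank adapter for a d_out x d_in weight matrix.

    Assumption (not from the project text): the frozen weight W is adapted as
    W + B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank).
    Only A and B are trained.
    """
    if rank <= 0 or rank > min(d_in, d_out):
        raise ValueError("rank must be between 1 and min(d_in, d_out)")
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Share of parameters trained, relative to fully fine-tuning the matrix."""
    return adapter_trainable_params(d_in, d_out, rank) / (d_in * d_out)


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    d_in, d_out, rank = 768, 768, 8
    print(adapter_trainable_params(d_in, d_out, rank))   # 12288
    print(f"{trainable_fraction(d_in, d_out, rank):.2%}")  # 2.08%
```

Under these assumptions, a rank-8 adapter trains roughly 2% of the parameters of a single 768 x 768 layer, which is the kind of saving that can make adaptation to a new collection feasible with little labelled data.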
Work plan
The project runs from September 2025 to August 2027 and is organised in three work packages:
- WP1 — Efficient data selection and labelling, including uncertainty-based prioritisation and semi-supervised methods.
- WP2 — Adaptation of pre-trained models, with a study of domain-shift impact and PEFT techniques.
- WP3 — Few-shot learning and domain adaptation, validated across heterogeneous collections of historical scores, manuscripts and modern prints.
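The uncertainty-based prioritisation mentioned in WP1 can be sketched as follows. This is a minimal illustration, assuming that a model outputs class probabilities for each unlabelled sample and that predictive entropy is used as the uncertainty score. The scoring rule, the function names and the sample identifiers are assumptions made for this example, not the project's stated method.

```python
import math
from typing import Dict, List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def rank_by_uncertainty(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the `budget` sample ids with the highest predictive entropy."""
    ranked = sorted(
        predictions,
        key=lambda sample_id: predictive_entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical model outputs for three unlabelled score regions.
    predictions = {
        "page_a": [0.98, 0.01, 0.01],  # confident prediction
        "page_b": [0.40, 0.35, 0.25],  # close to uniform, most uncertain
        "page_c": [0.70, 0.20, 0.10],
    }
    print(rank_by_uncertainty(predictions, budget=2))  # ['page_b', 'page_c']
```

The samples returned first are those where the model is least certain, so an annotator with a fixed budget would label them before the rest.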
Results will be disseminated through open datasets, open-access publications, and venues such as ISMIR, ICDAR and ICPR.
Team
- Francisco José Castellanos Regalado (Principal Investigator) — University of Alicante, GRFIA group. Optical music recognition, document analysis, computer vision.
- Gonzalo Romero-García — EPITA Research Laboratory (LRE), Image Processing and Pattern Recognition group. Mathematics, computational music, signal processing, document analysis.
- Joseph Chazalon — EPITA Research Laboratory (LRE), Image Processing and Pattern Recognition group. Computer vision, historical document analysis, assisted information extraction.
The project builds a bridge between the GRFIA group's expertise in OMR and the LRE document analysis team's experience with historical documents (e.g. the Mezanno and HoloCheck projects, and the ICDAR historical-map competitions).
Funding
Grant CIGE/2024/Fa-Sol-La, Subvenciones a Grupos de Investigación Emergentes, Conselleria d'Educació, Cultura, Universitats i Ocupació — Generalitat Valenciana. Area: Information and Communication Technologies.