Yale DHLab - ATHENA

DIGITAL PROJECTS @ YALE

Visual Analysis

ATHENA

Friday, January 17, 2025

Overview

Automatic Text Height ExtractioN for the Analysis of old handwritten manuscripts (ATHENA) has developed a layout analysis method that estimates text height automatically, even in very noisy and damaged handwritten manuscripts. The method has been tested on a large corpus of medieval manuscripts with different writing styles and with degradations such as ink bleed-through, background noise, and overlapping text lines.

Goals & Methods

ATHENA presents a parameter-free, automatic method to perform text height estimation. Given the image of a manuscript page, a multi-scale representation is first produced. Then, for each sub-image at each level, a robust, frequency-based descriptor is computed. Finally, a voting procedure finds the predominant spatial frequency in the document page, whose period is the value of the text height. This proves to be an efficient and reliable technique in the case of very noisy and damaged old handwritten manuscripts. The major contributions of the proposed approach include the following:

Frequency-based descriptor. A new local image descriptor based on a frequency analysis of the y-axis projected profile of the normalized image autocorrelation function.

Multi-scale framework. A multi-level approach with a voting procedure that exploits the spatial consistency between frequency-based image descriptors at different scale levels.

Evaluation. An extensive evaluation of the proposed algorithm on a large, heterogeneous corpus of manuscripts.
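
The pipeline described above (multi-scale sub-images, a frequency-based descriptor per sub-image, then a vote) can be illustrated with a minimal NumPy sketch. This is not the ATHENA implementation. The function names, the quadtree-style tiling, the `levels` argument, and the rounding of periods to whole pixels before voting are all assumptions made for illustration; the published method is described as parameter-free and may differ in each of these details.

```python
import numpy as np


def dominant_period(patch):
    """Return the dominant vertical period of a grayscale patch, in pixels.

    Hypothetical helper: returns None for flat or very small patches.
    """
    patch = np.asarray(patch, dtype=float)
    h, w = patch.shape
    if h < 4 or w < 1 or patch.std() < 1e-9:
        return None
    patch = patch - patch.mean()

    # Autocorrelation via the Wiener-Khinchin theorem; zero-padding to twice
    # the size gives a linear (non-circular) autocorrelation.
    spectrum = np.fft.fft2(patch, s=(2 * h, 2 * w))
    acf = np.fft.ifft2(np.abs(spectrum) ** 2).real
    acf /= acf[0, 0]  # normalize so the zero-lag value is 1

    # Project onto the y axis: sum over horizontal lags for each vertical lag.
    profile = acf[:h].sum(axis=1)
    profile -= profile.mean()

    # Frequency analysis of the projected profile; ignore the DC bin.
    magnitude = np.abs(np.fft.rfft(profile))
    magnitude[0] = 0.0
    k = int(np.argmax(magnitude))
    if k == 0:
        return None
    return h / k  # period (pixels) of the strongest frequency bin


def tiles(page, levels):
    """Yield sub-images: level n splits the page into 2**n x 2**n tiles.

    The quadtree-style split is an assumption, not taken from the source.
    """
    h, w = page.shape
    for level in range(levels):
        n = 2 ** level
        for i in range(n):
            for j in range(n):
                yield page[i * h // n:(i + 1) * h // n,
                           j * w // n:(j + 1) * w // n]


def estimate_text_height(page, levels=3):
    """Vote over per-tile periods and return the most common one, in pixels."""
    votes = []
    for tile in tiles(page, levels):
        period = dominant_period(tile)
        if period is not None:
            votes.append(int(round(period)))
    if not votes:
        return None
    return int(np.bincount(votes).argmax())
```

Calling `estimate_text_height(page)` on a 2-D grayscale array returns the period, in pixels, that wins the vote across all tiles. Because each tile's estimate is limited to the frequency bins of its own transform, coarse tiles give coarse period estimates; a real implementation would likely refine the peak location, which this sketch does not attempt.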
