LLMs

Veras Audire et Reddere Voces — Prosodically‑Accurate Latin Text-to-Speech Corpus and Methodology

An openly licensed, expertly validated corpus of Latin poetic audio that makes classical metre audible with modern large‑language‑model TTS. Drawing on Pedecerto’s machine‑readable scansions, we steered a general‑purpose model to recite 216 lines—Vergil Aeneid 1.1–100 (hexameter) and Ovid Heroides 1.1–116 (elegiac couplets)—with metrically faithful stress, yielding nearly 24 minutes of line‑aligned, classroom‑ready audio. This project was presented at the ACL 2025 Student Research Workshop (Vienna) and at CLiC‑it 2025 (Cagliari).

Jul 26, 2025