Automating the Detection of Poetic Features: The Limerick as Model Organism

Almas Abdibayev, Yohei Igarashi, Allen Riddell, Daniel Rockmore


Abstract
In this paper we take up the problem of “limerick detection” and describe a system that classifies five-line poems as limericks or not. This turns out to be a surprisingly difficult challenge with many subtleties. More precisely, we produce an algorithm that focuses on the structural aspects of the limerick, namely rhyme scheme and rhythm (i.e., stress patterns). When tested on a culled data set of 98,454 publicly available limericks, our “limerick filter” accepts 67% as limericks. The primary failure of our filter is in the detection of “non-standard” rhymes, which we highlight as an outstanding challenge in computational poetics. Our accent detection algorithm proves to be very robust. Our main contributions are (1) a novel rhyme detection algorithm that works on English words, including rare proper nouns and made-up words (and thus words not in the widely used CMUDict database); and (2) a novel rhythm-identifying heuristic that is robust to language noise at moderate levels and comparable in accuracy to state-of-the-art scansion algorithms. As a third significant contribution, (3) we make publicly available a large corpus of limericks that includes tags of “limerick” or “not-limerick” as determined by our identification software, thereby providing a benchmark for the community. The poetic tasks that we have identified as challenges for machines suggest that the limerick is a useful “model organism” for the study of machine capabilities in poetry and, more broadly, literature and language. We also include a list of open challenges. Generally, we anticipate that this work will provide useful material and benchmarks for future explorations in the field.
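To make the structural side of the task concrete, the following is a minimal sketch of a rhyme-scheme check. It is not the algorithm described in the paper. It assumes the conventional AABBA limerick scheme and swaps in a deliberately crude, spelling-based rhyme key in place of any pronunciation model. The function names `rhyme_key` and `has_aabba_scheme` and the example poem are hypothetical, chosen only for illustration.

```python
import re


def rhyme_key(line: str) -> str:
    """Crude orthographic rhyme key (an assumption, not the paper's method):
    the last vowel group of the final word plus any trailing consonants."""
    words = re.findall(r"[a-z']+", line.lower())
    if not words:
        return ""
    word = words[-1].replace("'", "")
    # Leftmost span of vowels followed only by consonants up to the word end.
    match = re.search(r"[aeiouy]+[^aeiouy]*$", word)
    return match.group(0) if match else word


def has_aabba_scheme(lines: list[str]) -> bool:
    """Return True if five lines follow an AABBA pattern under rhyme_key."""
    if len(lines) != 5:
        return False
    keys = [rhyme_key(line) for line in lines]
    if "" in keys:
        return False
    a_rhymes = keys[0] == keys[1] == keys[4]
    b_rhymes = keys[2] == keys[3]
    # The A and B rhymes must also differ from each other.
    return a_rhymes and b_rhymes and keys[0] != keys[2]


# Hypothetical example poem written for this sketch.
example = [
    "There once was a cat in a hat",
    "Who sat on a very soft mat",
    "He slept through the day",
    "Then wandered away",
    "And came back a little too fat",
]
print(has_aabba_scheme(example))
```

A spelling-based key like this one mishandles silent letters and irregular spellings (for example, it would treat "time" and "move" as rhyming because both end in "e"), and it says nothing about stress. Those gaps illustrate why rhyme detection for rare or made-up words is treated as a hard problem in the abstract above.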
Anthology ID:
2021.latechclfl-1.9
Volume:
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic (online)
Editors:
Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
Venue:
LaTeCHCLfL
SIG:
SIGHUM
Publisher:
Association for Computational Linguistics
Pages:
80–90
URL:
https://aclanthology.org/2021.latechclfl-1.9
DOI:
10.18653/v1/2021.latechclfl-1.9
Cite (ACL):
Almas Abdibayev, Yohei Igarashi, Allen Riddell, and Daniel Rockmore. 2021. Automating the Detection of Poetic Features: The Limerick as Model Organism. In Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 80–90, Punta Cana, Dominican Republic (online). Association for Computational Linguistics.
Cite (Informal):
Automating the Detection of Poetic Features: The Limerick as Model Organism (Abdibayev et al., LaTeCHCLfL 2021)
PDF:
https://aclanthology.org/2021.latechclfl-1.9.pdf
Video:
https://aclanthology.org/2021.latechclfl-1.9.mp4