Abstract
To improve the accessibility of visual figures, automatic generation of text descriptions for individual images has been studied. However, this approach cannot be directly applied to comics, because the descriptions become redundant when similar scenes appear in a row. To address this issue, we propose generating descriptions per group of related images, demonstrate how a dense captioning technique for videos can be used for this purpose, and present ways to improve its performance. To assess the effectiveness of our approach and to identify factors affecting the quality of text descriptions of comics, we conducted a preliminary study with 3 sighted evaluators and a main user study with 12 participants with visual impairments. The results show that, for sighted evaluators, descriptions generated per group of images were perceived as better than those generated per image in terms of accuracy, clarity, understandability, length, informativeness, and preference when the annotator was human. Under the same conditions, when the annotator was AI, the per-group descriptions performed better in terms of length. In addition, people with visual impairments preferred group descriptions because of their conciseness, the smooth connection between sentences, and their lack of repetition. Based on these findings, we provide design recommendations for generating accessible comic descriptions at scale for blind users.
Original language | English |
---|---|
Title of host publication | Proceedings of 2024 29th Annual Conference on Intelligent User Interfaces, IUI 2024 |
Publisher | Association for Computing Machinery |
Pages | 750-760 |
Number of pages | 11 |
ISBN (Electronic) | 9798400705083 |
State | Published - 18 Mar 2024 |
Event | 29th Annual Conference on Intelligent User Interfaces, IUI 2024 - Greenville, United States. Duration: 18 Mar 2024 → 21 Mar 2024 |
Publication series
Name | ACM International Conference Proceeding Series |
---|---|
Conference
Conference | 29th Annual Conference on Intelligent User Interfaces, IUI 2024 |
---|---|
Country/Territory | United States |
City | Greenville |
Period | 18/03/24 → 21/03/24 |
Bibliographical note
Publisher Copyright: © 2024 Owner/Author.
Keywords
- comics
- dense video captioning
- image description
- people with visual impairment