Enhancing the Podcast Browsing Experience through Topic Segmentation and Visualization with Generative AI

Jimin Park, Chaerin Lee, Eunbin Cho, Uran Oh

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Podcasts present challenges in information retrieval due to their non-visual nature and extended length. To understand these challenges, we conducted interviews with 12 podcast users and identified difficulties in grasping the overall podcast content with metadata alone, highlighting the necessity of navigating to specific segments. Based on this finding, we propose a browsing method that utilizes Large Language Models (LLMs) and image generation models to segment podcast content, integrating visual cues to support efficient navigation. To investigate how this new method differs from conventional approaches and to evaluate its effectiveness, we conducted another user study with 12 participants. The results revealed that keyword search is ineffective when dealing with unfamiliar or inaccurate keywords. Additionally, it requires thorough examination of the script to comprehend the overall content of each episode. On the other hand, segmenting the content and labeling the topic of each segment was found to be helpful for understanding the overall content, enabling easy navigation to desired topics. Furthermore, we found that providing an image enabled participants to easily distinguish one segment from another, which participants preferred. This multimodal browsing approach is expected to establish a foundational framework for the effective browsing and comprehension of audio content, extending its applicability beyond podcasts to various forms of audio files.

Original language: English
Title of host publication: IMX 2024 - Proceedings of the 2024 ACM International Conference on Interactive Media Experiences
Publisher: Association for Computing Machinery, Inc
Pages: 117-128
Number of pages: 12
ISBN (Electronic): 9798400705038
State: Published - 7 Jun 2024
Event: 2024 ACM International Conference on Interactive Media Experiences, IMX 2024 - Stockholm, Sweden
Duration: 12 Jun 2024 to 14 Jun 2024

Publication series

Name: IMX 2024 - Proceedings of the 2024 ACM International Conference on Interactive Media Experiences

Conference

Conference: 2024 ACM International Conference on Interactive Media Experiences, IMX 2024
Country/Territory: Sweden
City: Stockholm
Period: 12/06/24 to 14/06/24

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).

Keywords

  • Audio Content Browsing
  • Content Visualization
  • Generative AI
  • Podcast
