Multimodal question answering for mobile devices

Tom Yeh, Trevor Darrell

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Scopus citations


This paper introduces multimodal question answering, a new interface for community-based question answering services. By offering users an extra modality, photos, in addition to text for formulating queries, multimodal question answering overcomes the limitations of text-only input when users ask questions about visually distinctive objects. Such an interface is especially useful when users become curious about an interesting object in their environment and want to learn about it: they simply take a photo and ask a question in a situated (from a mobile device) and intuitive (without describing the object in words) manner. We propose a system architecture for multimodal question answering, describe an algorithm for searching the database, and report the findings of two prototype studies.

Original language: English
Title of host publication: Proceedings of the 13th International Conference on Intelligent User Interfaces 2008, IUI'08
Number of pages: 4
State: Published - 2008
Event: 13th International Conference on Intelligent User Interfaces 2008, IUI'08 - Maspalomas, Gran Canaria, Spain
Duration: 13 Jan 2008 to 16 Jan 2008

Publication series

Name: International Conference on Intelligent User Interfaces, Proceedings IUI


Conference: 13th International Conference on Intelligent User Interfaces 2008, IUI'08
City: Maspalomas, Gran Canaria


  • Information retrieval
  • Mobile application
  • Pattern matching
  • Question answering


