Associating the visual representation of user interfaces with their internal structures and metadata

Tsung Hsiang Chang, Tom Yeh, Rob Miller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

41 Scopus citations

Abstract

Pixel-based methods are emerging as a new and promising way to develop new interaction techniques on top of existing user interfaces. However, in order to maintain platform independence, other available low-level information about GUI widgets, such as accessibility metadata, was neglected intentionally. In this paper, we present a hybrid framework, PAX, which associates the visual representation of user interfaces (i.e. the pixels) and their internal hierarchical metadata (i.e. the content, role, and value). We identify challenges to building such a framework. We also develop and evaluate two new algorithms for detecting text at arbitrary places on the screen, and for segmenting a text image into individual word blobs. Finally, we validate our framework in implementations of three applications. We enhance an existing pixel-based system, Sikuli Script, and preserve the readability of its script code at the same time. Further, we create two novel applications, Screen Search and Screen Copy, to demonstrate how PAX can be applied to development of desktop-level interactive systems.

Original language: English
Title of host publication: UIST'11 - Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology
Pages: 245-255
Number of pages: 11
State: Published - 2011
Event: 24th Annual ACM Symposium on User Interface Software and Technology, UIST'11 - Santa Barbara, CA, United States
Duration: 16 Oct 2011 - 19 Oct 2011

Publication series

Name: UIST'11 - Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology

Conference

Conference: 24th Annual ACM Symposium on User Interface Software and Technology, UIST'11
Country/Territory: United States
City: Santa Barbara, CA
Period: 16/10/11 - 19/10/11

Keywords

  • Accessibility API
  • Graphical user interfaces
  • Pixel
  • Text detection
  • Text segmentation
