Sikuli: Using GUI screenshots for search and automation

Tom Yeh, Tsung Hsiang Chang, Robert C. Miller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

290 Scopus citations

Abstract

We present Sikuli, a visual approach to search and automation of graphical user interfaces using screenshots. Sikuli allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box) and query a help system using the screenshot instead of the element's name. Sikuli also provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events. We report a web-based user study showing that searching by screenshot is easy to learn and faster to specify than keywords. We also demonstrate several automation tasks suitable for visual scripting, such as map navigation and bus tracking, and show how visual scripting can improve interactive help systems previously proposed in the literature.

Original language: English
Title of host publication: UIST 2009 - Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology
Pages: 183-192
Number of pages: 10
State: Published - 2009
Event: 22nd Annual ACM Symposium on User Interface Software and Technology, UIST 2009 - Victoria, BC, Canada
Duration: 4 Oct 2009 to 7 Oct 2009

Publication series

Name: UIST 2009 - Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology

Conference

Conference: 22nd Annual ACM Symposium on User Interface Software and Technology, UIST 2009
Country/Territory: Canada
City: Victoria, BC
Period: 4/10/09 to 7/10/09

Keywords

  • Automation
  • Image search
  • Online help
