TY - GEN
T1 - Searching documentation using text, OCR, and image
AU - Yeh, Tom
AU - Katz, Boris
PY - 2009
Y1 - 2009
N2 - We describe a mixed-modality method to index and search software documentation in three ways: plain text, OCR text of embedded figures, and visual features of these figures. Using a corpus of 102 computer books with a total of 62,943 pages and 75,800 figures, we empirically demonstrate that our method achieves better precision/recall than do alternatives based on single modalities.
AB - We describe a mixed-modality method to index and search software documentation in three ways: plain text, OCR text of embedded figures, and visual features of these figures. Using a corpus of 102 computer books with a total of 62,943 pages and 75,800 figures, we empirically demonstrate that our method achieves better precision/recall than do alternatives based on single modalities.
KW - Computer vision
KW - Content-based image retrieval
KW - Multimodal search
UR - http://www.scopus.com/inward/record.url?scp=72449170558&partnerID=8YFLogxK
U2 - 10.1145/1571941.1572123
DO - 10.1145/1571941.1572123
M3 - Conference contribution
AN - SCOPUS:72449170558
SN - 9781605584836
T3 - Proceedings - 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009
SP - 776
EP - 777
BT - Proceedings - 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009
T2 - 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009
Y2 - 19 July 2009 through 23 July 2009
ER -