SF Bay ACM Data Mining SIG - Mining Quotations for Links and...

Submitted by Rick G. Villarreal

2008-05-14 18:30:00 - 2008-05-14 00:00:00

Mining Quotations for Links and Ideas

Presented by Okan Kolak and Bill Schilit, researchers, Google Inc.

Cost: Free and open to all who wish to attend, but membership is only $10/year.

Abstract:

Scanning books, magazines, and newspapers has become a widespread activity because people believe that much of the world's information still resides off-line. In general, after these works are scanned they are indexed for search and processed to add links. In this talk we will describe a new approach to automatically adding links by mining repeated passages. Our technique connects elements that are semantically rich, so strong relations are made. Moreover, link targets point within a work rather than to the entire work, facilitating navigation. Our system has been run on a digital library of over 1 million books, has been used by thousands of people, and has generated the world's largest collection of quotations. We will also present a follow-on project based on the theory that authors copy passages from book to book because these quotations capture an idea particularly well: Jefferson on liberty; Stanton on women's rights; and Gibson on cyberpunk. Our Key Ideas prototype provides an interaction model where readers fluidly explore the library by viewing popular quotations on a particular key term and following links to quotations on related key terms.
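
For readers unfamiliar with the idea of mining repeated passages, the following is a minimal illustrative sketch, not the presenters' system. It assumes a simple word n-gram ("shingle") index, a common way to spot text shared between documents; the function names `shingles` and `find_repeated_passages`, the window size `n`, and the sample inputs are all hypothetical choices made for this example.

```python
from collections import defaultdict


def shingles(text, n=5):
    """Yield (word_offset, n-word tuple) for every run of n consecutive words.

    Assumption: whitespace tokenization and lowercasing only; punctuation
    is left attached to words, which a real system would normalize.
    """
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield i, tuple(words[i:i + n])


def find_repeated_passages(books, n=5):
    """Return shingles that occur in more than one book.

    `books` maps a book id to its text. The result maps each shared
    shingle to a list of (book_id, word_offset) locations, so a link can
    point inside a work rather than at the whole work.
    """
    index = defaultdict(list)
    for book_id, text in books.items():
        for pos, shingle in shingles(text, n):
            index[shingle].append((book_id, pos))
    # Keep only shingles seen in at least two distinct books.
    return {
        shingle: locs
        for shingle, locs in index.items()
        if len({book_id for book_id, _ in locs}) > 1
    }


if __name__ == "__main__":
    sample = {
        "a": "he said that the price of liberty is eternal vigilance indeed",
        "b": "remember the price of liberty is eternal vigilance always",
    }
    for shingle, locs in find_repeated_passages(sample).items():
        print(" ".join(shingle), "->", locs)
```

Each shared shingle carries the word offset at which it occurs in every book, which is what would let a link target a position within a work. Merging overlapping shingles into full quotations and scaling the index to a large library are left out of this sketch.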

SAP Labs, LLC
3410 Hillview Avenue Palo Alto, California 94304

Education Event