A brief foray into vectorial semantics

An Article by James Somers
jsomers.net

One of the best (and easiest) ways to start making sense of a document is to highlight its “important” words, or the words that appear within that document more often than chance would predict. That’s the idea behind Amazon.com’s “Statistically Improbable Phrases”:

    Amazon.com’s Statistically Improbable Phrases, or “SIPs”, are the most distinctive phrases in the text of books in the Search Inside!™ program. To identify SIPs, our computers scan the text of all books in the Search Inside! program. If they find a phrase that occurs a large number of times in a particular book relative to all Search Inside! books, that phrase is a SIP in that book.
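
To make the idea concrete, here is a minimal sketch of one way to score words by how much more often they appear in a document than in a background corpus. It is not Amazon's actual method, which the quote above does not spell out; the function names, the add-one smoothing, and the `min_count` cutoff are all assumptions chosen for illustration, and it works on single words rather than phrases.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def improbable_words(document, corpus, top_n=5, min_count=2):
    """Rank words in `document` by their rate relative to `corpus`.

    `corpus` is a list of texts (it may include `document` itself).
    The score is (rate in document) / (smoothed rate in corpus), an
    illustrative stand-in for "more often than chance would predict".
    """
    doc_counts = Counter(tokenize(document))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(tokenize(text))

    doc_total = sum(doc_counts.values())
    corpus_total = sum(corpus_counts.values())
    vocab_size = len(set(doc_counts) | set(corpus_counts))

    scores = {}
    for word, count in doc_counts.items():
        if count < min_count:
            continue  # skip words too rare in the document to judge
        doc_rate = count / doc_total
        # Add-one smoothing keeps the denominator nonzero for unseen words.
        corpus_rate = (corpus_counts[word] + 1) / (corpus_total + vocab_size)
        scores[word] = doc_rate / corpus_rate

    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

A common word like “the” is frequent everywhere, so its ratio stays close to 1, while a word concentrated in a single document gets a much higher score. Extending the sketch to phrases would mean counting n-grams instead of single tokens.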