where the term w0, which represents a uniform prior distribution, is most often chosen to be 1. This uniform prior expectation is also often referred to as the Laplace correction. By changing the above formula appropriately, arbitrary prior distributions can be incorporated.
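As a rough illustration of the Laplace correction, the sketch below adds w0 to every category count before normalising. The formula this passage refers to is not reproduced here, so the common form (count + w0) / (N + w0 * K) is an assumption, and the function name `smoothed_probabilities` and the sample data are hypothetical.

```python
from collections import Counter


def smoothed_probabilities(observations, categories, w0=1.0):
    """Estimate category probabilities with a uniform prior weight w0.

    Assumed form: (count + w0) / (N + w0 * K), where N is the number of
    observations and K the number of categories. With w0 = 1 this is the
    usual Laplace correction.
    """
    counts = Counter(observations)
    total = len(observations) + w0 * len(categories)
    return {c: (counts[c] + w0) / total for c in categories}


# Hypothetical data: category "c" is never observed, yet it still
# receives a non-zero probability because of the prior term.
probs = smoothed_probabilities(["a", "a", "b"], ["a", "b", "c"])
```

Replacing the single w0 with a per-category weight would be one way to incorporate a non-uniform prior, as the text suggests.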
Table 4-1: Resources for better email
Project implementation
and a rule might be PLURAL ⟨x⟩ → ⟨x⟩ + s. In practice, a real grammar would be more complex than this. (It might, for instance, be able to handle the fact that the plural of sheep is still sheep, not sheeps, as posited by our example.) Nevertheless, even the simple example has some power. Note that three possibilities are given for the part of speech of dog: a good dog (noun), to dog the footsteps of someone (verb), and a dog rose (adjective). The probabilities of occurrence of these are given in brackets, based on measurements across a large number of texts whose parts of speech have been marked up by humans.
Other statistics can also be gathered about words and classes of words, for example about parts of speech, and embodied in statistical rules. For instance, the word to very seldom comes directly before a noun, but very frequently before a verb. So, since dog occurs reasonably frequently as a verb (15%), it is very likely to be one in the phrase to dog someone's footsteps.
Comparatively simple NLP grammars of this type have been used to analyse business data, allowing highly accurate extraction of company names, locations, etc., and to analyse emails to determine whether or not they are complaints.
A rather different application is the parsing of queries put into a search engine which queries a database. Here the parser uses NLP techniques to correct any errors made by the person making the query. The parser uses a grammar which allows it to check whether queries have the correct structure and vocabulary, and it may also hold semantic knowledge about the database that allows further interpretation. One example given by one vendor [72] is the parsing of 6 Pak 12oz Diet Cola, where it is claimed that semantic knowledge is required to separate bottle size from the packaging.
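The two ideas in this passage, a plural rule with exceptions and a statistical choice of part of speech, can be sketched in a few lines. This is a minimal illustration and not the grammar of any real system: only the 15% verb figure for dog comes from the text, while the other probabilities, the context weights, and the names `pluralise` and `best_tag` are hypothetical.

```python
# Lexical probabilities P(tag | word). Only the 15% verb figure comes from
# the text; the noun and adjective values are hypothetical placeholders.
LEXICAL = {"dog": {"noun": 0.80, "verb": 0.15, "adjective": 0.05}}

# Hypothetical context weights: how plausible each tag is directly after
# a given preceding word. "to" rarely precedes a noun, often a verb.
AFTER_WORD = {"to": {"noun": 0.02, "verb": 0.90, "adjective": 0.08}}

# Exceptions to the simple "append s" plural rule.
IRREGULAR_PLURALS = {"sheep": "sheep"}


def pluralise(word):
    """Apply the rule PLURAL x -> x + s unless an exception is listed."""
    return IRREGULAR_PLURALS.get(word, word + "s")


def best_tag(previous_word, word):
    """Pick the tag maximising lexical probability times context weight."""
    lexical = LEXICAL[word]
    context = AFTER_WORD.get(previous_word, {})
    # Unknown contexts contribute a neutral weight of 1.0.
    scores = {tag: p * context.get(tag, 1.0) for tag, p in lexical.items()}
    return max(scores, key=scores.get)
```

With these assumed numbers, `best_tag("to", "dog")` selects the verb reading even though noun is the more common tag for dog in isolation, which is the effect the passage describes.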
Connect! A Guide to a New Way of Working from GigaOM's Web Worker Daily
Create context by reading around the edges
C y Yx y cos
