Last Updated on March 11, 2021
I am one of those English speakers who wishes they were more fluent in another language.
The English Language
I find the applications of NLP (natural language processing) technology quite exciting; however, the challenges of developing great NLP sure give me a wry smile. This is an amusing summary of some of the trickier parts of English:
It is worthwhile to spend a few moments on some of the inherent limitations of English.
Our words are polymorphous; their meanings change depending on the context in which they occur. Word polymorphism can be used for comic effect (e.g., “Both the martini and the bar patron were drunk”). As humans steeped in the culture of our language, we effortlessly invent the intended meaning of each polymorphic pair in the following examples: “a bandage wound around a wound,” “farming to produce produce,” “please present the present in the present time,” “don’t object to the data object,” “teaching a sow to sow seed,” “wind the sail before the wind comes,” and countless others.
Words lack compositionality; their meaning cannot be deduced by analyzing root parts. For example, there is neither pine nor apple in pineapple, no egg in eggplant, and hamburgers are made from beef, not ham. You can assume that a lover will love, but you cannot assume that a finger will “fing.” Vegetarians will eat vegetables, but humanitarians will not eat humans. Overlook and oversee should, logically, be synonyms, but they are antonyms.
Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information
For many words, their meanings are determined by the case of the first letter of the word. For example, Nice and nice, Polish and polish, Herb and herb, August and august.
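To make the capitalization point concrete for NLP, here is a minimal Python sketch of what happens when text is lowercased, a common normalization step. The function name case_fold_tokens is hypothetical and chosen only for illustration; the word pairs come from the passage above.

```python
# Capitalized/lowercase pairs taken from the passage above.
pairs = [
    ("Nice", "nice"),
    ("Polish", "polish"),
    ("Herb", "herb"),
    ("August", "august"),
]


def case_fold_tokens(tokens):
    """Lowercase every token (hypothetical helper for illustration)."""
    return [token.lower() for token in tokens]


# Before folding, each pair holds two different strings.
original_vocab = {word for pair in pairs for word in pair}

# After folding, each pair collapses into a single vocabulary entry,
# so the place, nationality, name, and month are no longer distinguishable
# from the ordinary words by spelling alone.
folded_vocab = set(case_fold_tokens(sorted(original_vocab)))

for proper, common in pairs:
    same_after = case_fold_tokens([proper]) == case_fold_tokens([common])
    print(f"{proper} / {common}: identical after lowercasing = {same_after}")

print(len(original_vocab), "distinct tokens before,", len(folded_vocab), "after")
```

Whether to lowercase is a design choice: it shrinks the vocabulary, but as the pairs above suggest, it can also discard information that a downstream model might need.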