Notes on machine learning and exceptions

Statistical machine learning has been the de facto standard in NLP research and practice. However, its very success might be hiding its problems. One such problem is exceptions.

Natural language is full of exceptions: idiomatic phrases that defy compositionality, irregular verbs and other exceptions to grammatical rules, or unexpected events that, though not linguistic phenomena themselves, happen to be communicated via language. So far, statistical NLP has treated them as an inconvenient oddity and, in most cases, swept them under the rug, hoping that they wouldn’t reduce the F-score.

But a system doesn’t really understand language without handling exceptions, and I will argue that (not) handling exceptions has important consequences for machine learning.