Institute of Information Theories and Applications FOI ITHEA
Abstract:
A malapropism is a semantic error that is hard to detect because it usually preserves the syntactic links
between words in the sentence but replaces one content word with a similar word of quite different meaning. A
method of automatic detection of malapropisms is described, based on Web statistics and a specially defined
Semantic Compatibility Index (SCI). To correct the detected errors, special dictionaries and heuristic rules
are proposed that retain only a few correction candidates with the highest SCI ranks for the user to choose from.
Experiments on Web-assisted detection and correction of Russian malapropisms are reported, demonstrating
the efficacy of the described method.