The Power of Language Music: Arabic Lemmatization through Patterns

Ayah Zirikly
Mona Diab
Proceedings of the Workshop on Cognitive Aspects of the Lexicon, Osaka, Japan (2016), pp. 40-50

Abstract

Patterns play a pivotal role in Arabic morphological processing, whether in derivation or inflection, yet they have not been fully exploited in computational processing of the language. The novel contribution of this paper is to perform lemmatization, a high-level lexical processing task, without relying on a lookup dictionary. We use a machine learning classifier to predict the lemma pattern for a given stem, then apply mapping rules to convert each stem to its lemma.
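
The two-step idea in the abstract (predict a lemma pattern, then map the stem onto it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the simplified Latin transliteration, the numbered-slot pattern notation, the function names, and above all `predict_lemma_pattern`, which is a hard-coded stand-in for the trained classifier rather than the authors' model.

```python
# Toy illustration of pattern-based lemmatization.
# Assumption: a simplified transliteration where a/i/u are short vowels
# and A/I/U are long vowels; every other character is treated as a root consonant.
VOWELS = set("aiuAIU")


def stem_to_pattern(stem):
    """Replace each consonant with a numbered root slot and keep the vowels."""
    out, slot = [], 0
    for ch in stem:
        if ch in VOWELS:
            out.append(ch)
        else:
            slot += 1
            out.append(str(slot))
    return "".join(out)


def extract_root(stem):
    """Return the consonants of the stem in order (assumed to be the root)."""
    return [ch for ch in stem if ch not in VOWELS]


def apply_pattern(root, pattern):
    """Mapping rule: fill the numbered slots of a pattern with root consonants."""
    return "".join(root[int(ch) - 1] if ch.isdigit() else ch for ch in pattern)


def predict_lemma_pattern(stem):
    """Hypothetical stand-in for the classifier described in the abstract.

    A real system would predict the lemma pattern from features of the stem;
    here two predictions are hard-coded so the sketch runs end to end.
    """
    toy_predictions = {
        "1u2u3": "1i2A3",  # e.g. a broken plural mapped to its singular pattern
        "1i2A3": "1i2A3",  # already in lemma form
    }
    return toy_predictions[stem_to_pattern(stem)]


def lemmatize(stem):
    """Predict the lemma pattern, then map the stem's root onto it."""
    return apply_pattern(extract_root(stem), predict_lemma_pattern(stem))


print(lemmatize("kutub"))  # broken plural "books" mapped to singular "book"
```

The sketch treats every consonant as a root letter, which does not hold for stems that carry affixal consonants, and a single stem pattern can correspond to more than one lemma pattern. That ambiguity is the reason a learned classifier, rather than a fixed table like the one above, is needed for the prediction step.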