Farhan Ar Rafi

Geonames Classification for Bengali Geowordnet

Published on December 2015 · 15 min read
Figure: Photo by Sharafat Raheb on Unsplash


Bengali Geo-Wordnet focuses on creating a system that integrates Wordnet, Geonames, and the Bengali language to provide a growing vocabulary set and crowdsourced data for linguistic and natural language processing systems. The Geonames data used in the system was classified manually under the supervision of leading linguists and professors from related departments. Validation data shows that 98.8% of terms were classified properly, while the remaining 1.2% do not exist in the Bengali language. The underlying system has a scalable design, so it can be extended and replicated with ease. It opens up wide possibilities for developing Bengali linguistic systems, natural language processing systems, and optical character recognition systems.

Read the full paper →