bgMWE – a tool for MWE recognition

bgMWE is a tool for corpus processing and for the recognition and tagging of multiword expressions (MWEs). It is written in Java and is therefore platform-independent. bgMWE comprises a set of modules that can be applied to particular NLP tasks. It is largely language-independent and can run either in a resource-light mode or, for better performance, with additional lexical resources. The system includes the following modules:
- Web crawler for Wikipedia;
- Extraction of lexical data – lists of words and MWEs;
- Converter between formats – vertical format, XML, etc.;
- Pre-processing module – applying a chunker, a tagger, etc.;
- Collection of frequency data;
- MWE recognition and tagging.
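
The description above does not document bgMWE's interfaces, so the following is only an illustrative sketch of what lexicon-based MWE tagging can look like in Java. Everything in it is an assumption made for the example: the class name MweTaggerSketch, the tag method, the B-MWE/I-MWE/O tag labels, the greedy longest-match strategy and the toy English lexicon. None of it is taken from bgMWE itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: these names are illustrative and are not part of bgMWE.
class MweTaggerSketch {

    // Tags each token as B-MWE (first token of a match), I-MWE (continuation)
    // or O (outside any match), preferring the longest lexicon entry that
    // starts at the current position. Lexicon entries are assumed to be
    // lower-cased, space-separated token sequences.
    static List<String> tag(List<String> tokens, Set<String> lexicon, int maxLen) {
        List<String> tags = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            // Try the longest candidate span first, down to two tokens.
            for (int len = Math.min(maxLen, tokens.size() - i); len >= 2; len--) {
                String candidate = String.join(" ", tokens.subList(i, i + len))
                        .toLowerCase(Locale.ROOT);
                if (lexicon.contains(candidate)) {
                    matched = len;
                    break;
                }
            }
            if (matched == 0) {
                tags.add("O");
                i++;
            } else {
                tags.add("B-MWE");
                for (int k = 1; k < matched; k++) {
                    tags.add("I-MWE");
                }
                i += matched;
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // Toy lexicon and sentence, used only to exercise the sketch.
        Set<String> lexicon = Set.of("take part", "in spite of");
        List<String> tokens = List.of("They", "take", "part", "in", "spite", "of", "rain");
        List<String> tags = tag(tokens, lexicon, 3);
        // Print one token per line with its tag, similar to a vertical format.
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i) + "\t" + tags.get(i));
        }
    }
}
```

A lookup of this kind corresponds to the mode that uses lexical resources. A resource-light variant would presumably have to derive its candidates from corpus frequency data instead of a prepared lexicon.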


Languages: Bulgarian (bg)