PELCRA WebLign crawler

WEBLIGN

A customizable, site-specific crawler for multilingual websites. The tool provides a general crawling infrastructure and several site-specific parsers. Crawl results are stored in a simple relational database; the database schema is provided along with the code.
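
The split between a general crawling loop, per-site parsers, and a relational store can be illustrated with a minimal Python sketch. This is not WebLign's actual code or schema: the names `ParsedPage`, `PARSERS`, `example_org_parser`, `init_db`, and `crawl`, the `pages` table, and the convention that the language code is the first URL path segment are all assumptions made for illustration. Fetching is passed in as a function so the sketch runs without network access.

```python
import sqlite3
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse


@dataclass
class ParsedPage:
    """Hypothetical result of parsing one page."""
    url: str
    lang: str
    text: str
    links: List[str] = field(default_factory=list)


# A site-specific parser maps (url, raw content) to a ParsedPage.
Parser = Callable[[str, str], ParsedPage]


def example_org_parser(url: str, raw: str) -> ParsedPage:
    # Assumed convention for this imaginary site: the language code is the
    # first path segment, e.g. http://example.org/en/about -> "en".
    segments = [s for s in urlparse(url).path.split("/") if s]
    lang = segments[0] if segments else "und"
    return ParsedPage(url=url, lang=lang, text=raw.strip())


# Registry of site-specific parsers, keyed by host name (illustrative).
PARSERS: Dict[str, Parser] = {"example.org": example_org_parser}


def init_db(conn: sqlite3.Connection) -> None:
    # Illustrative single-table schema, not the schema shipped with the tool.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, lang TEXT NOT NULL, body TEXT NOT NULL)"
    )


def crawl(seeds: Iterable[str], fetch: Callable[[str], str],
          conn: sqlite3.Connection) -> int:
    """Generic crawl loop: fetch, dispatch to a site parser, store the result."""
    queue = deque(seeds)
    seen = set()
    stored = 0
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        parser = PARSERS.get(urlparse(url).netloc)
        if parser is None:
            continue  # no parser registered for this site; skip without fetching
        page = parser(url, fetch(url))
        conn.execute(
            "INSERT OR REPLACE INTO pages (url, lang, body) VALUES (?, ?, ?)",
            (page.url, page.lang, page.text),
        )
        stored += 1
        queue.extend(page.links)
    conn.commit()
    return stored
```

In this sketch, supporting a new site means registering one more parser function in `PARSERS`; the crawl loop and the storage code stay unchanged.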