CURLICAT Bulgarian corpus 
The Bulgarian CURLICAT corpus consists of texts from different sources, all provided under licences that permit distribution. With regard to metadata extraction, we used three general types of sources: texts from the Bulgarian National Corpus (where their licensing terms permit redistribution); public repositories with open, copyright-free data; and blogs with redistributable licences, open-content websites, and similar sources. The Bulgarian CURLICAT collection contains 113 087 documents, distributed across seven thematic domains: Culture, Education, European Union, Finance, Politics, Economics, and Science. For more information, see the CURLICAT website (http://curlicat-project.eu/deliverables).