CURLICAT Bulgarian corpus
The Bulgarian CURLICAT corpus consists of texts from different sources, all provided with appropriate licences for distribution. With regard to metadata extraction, we used three general types of sources: the Bulgarian National Corpus (only texts with redistributable licensing terms); public repositories with open and copyright-free data; and blogs with redistributable licences, open content websites, etc. The Bulgarian CURLICAT collection contains 113 087 documents across seven thematic domains: Culture, Education, European Union, Finance, Politics, Economics, and Science. For more information, see the CURLICAT website (http://curlicat-project.eu/deliverables).