CURLICAT Romanian corpus 
The Romanian corpus contains 26,477 files, which represent our contribution to the CURLICAT project. The texts cover 7 domains: science, politics, culture, economy, health, education, and nature. Each file carries multiple levels of annotation: tokenization, lemmatization, morphological annotation, dependency parsing, named entities, nominal phrases, IATE terms, and automatically identified domain-specific terms. All processing tools are available within the RELATE platform.