Croatian and English monolingual corpus from Croatian web resources
"Croatian and English monolingual corpus from Croatian web resources", compiled from the corpora listed in the ReadMe file by the Consortium of the National Language Technology Platform (NLTP) Project (Action number: 2018-EU-IA-0082). Published under the CC-BY-SA-4.0 license.
Monolingual corpus of Croatian web resources collected during the NLTP project.
Croatian: 1131719 sentences, 24303220 words
English: 218436 sentences, 5711041 words