MARCELL Croatian-English Parallel Corpus of Legislative Texts
The MARCELL Croatian-English Parallel Corpus of Legislative Texts contains all Croatian legislative documents that have been translated into English (1,563 documents) together with a set of Croatia’s international treaties (253 documents), for a total of 1,816 documents. It comprises 14,379,657 tokens in Croatian and 17,673,788 tokens in English. The corpus was split into paragraphs and sentences and aligned at the segment level, and each of the 396,984 translation units (TUs) was manually checked for alignment. The file format is TMX (v1.4), and the header stores additional metadata: document type, year of production, the assigned EUROVOC descriptor or descriptors, and domain.
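As a rough illustration of how a TMX 1.4 file of this kind might be read, the following Python sketch uses only the standard library. It rests on several assumptions that are not stated in the description above: that the language codes on the `<tuv>` elements are `hr` and `en`, and that the header metadata is stored in `<prop>` elements. The inline sample, its `year` property, and its sentence pair are invented for the demonstration and are not taken from the corpus; the helper names `read_header_props` and `iter_translation_units` are likewise hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# TMX 1.4 marks the language of each <tuv> with the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented miniature TMX document, used only to exercise the functions below.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="hr" datatype="plaintext">
    <prop type="year">2010</prop>
  </header>
  <body>
    <tu>
      <tuv xml:lang="hr"><seg>Ovaj Zakon stupa na snagu.</seg></tuv>
      <tuv xml:lang="en"><seg>This Act enters into force.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_header_props(source):
    """Return (type, text) pairs for every <prop> found in the TMX <header>."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "header":
            return [(p.get("type"), (p.text or "").strip())
                    for p in elem.iter("prop")]
    return []


def iter_translation_units(source):
    """Yield one {language code: segment text} dict per <tu> element."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tu":
            pair = {}
            for tuv in elem.iter("tuv"):
                # Fall back to a plain "lang" attribute used by older TMX versions.
                lang = tuv.get(XML_LANG) or tuv.get("lang")
                seg = tuv.find("seg")
                if lang is not None and seg is not None:
                    pair[lang.lower()] = "".join(seg.itertext()).strip()
            yield pair
            elem.clear()  # release the segment text once the unit is consumed


if __name__ == "__main__":
    print(read_header_props(io.BytesIO(SAMPLE.encode("utf-8"))))
    for unit in iter_translation_units(io.BytesIO(SAMPLE.encode("utf-8"))):
        print(unit.get("hr"), "|", unit.get("en"))
```

To read a downloaded file instead of the sample, pass its path (or an open binary file object) to either function. Streaming with `iterparse` is a design choice aimed at large files: translation units are handled one at a time rather than loading the whole document tree into memory first.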
People who looked at this resource also viewed the following:
- MARCELL Croatian legislative subcorpus
- Bilingual hr-en parallel corpus from the National and University Library in Zagreb website (Processed)
- Bilingual hr-en parallel corpus from Croatian National Bank website (Processed)
- Croatian-English corpus with Acts on Biological and Landscape Diversity and Environmental Protection (Processed)