Multilingual (EN, MK, SQ) corpus from websites of government of North Macedonia v.1.04 
Multilingual dataset (EN, MK, SQ) based on the content of websites of the government of North Macedonia. It contains 307146 Translation Units (TUs) in total. It was generated by crawling the websites in February 2021, detecting pairs of parallel documents, identifying parallel sentence pairs, and filtering the results. The number of TUs per language pair is:
en-mk 138805
en-sq 41486
mk-sq 126855
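
As a quick consistency check, the per-pair counts can be summed and compared with the stated total of 307146 TUs. The following is a minimal Python sketch; the names `tu_counts` and `shares` are illustrative and not part of the dataset or its tooling.

```python
# Per-language-pair Translation Unit counts, copied from the listing above.
tu_counts = {"en-mk": 138805, "en-sq": 41486, "mk-sq": 126855}

# Sum of all pairs; this should match the total stated in the description.
total = sum(tu_counts.values())

# Share of the corpus contributed by each pair, as a percentage.
shares = {pair: round(100 * n / total, 1) for pair, n in tu_counts.items()}

print(total)  # 307146
for pair, pct in shares.items():
    print(f"{pair}: {pct}%")
```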