Bilingual (EN-MK) corpus from websites of government of North Macedonia v.1.0
Bilingual dataset (EN, MK, SQ) based on the content of websites of the government of
North Macedonia. It was generated by crawling the websites in
February 2021, detecting pairs of parallel documents, identifying parallel sentence pairs and filtering the
results.
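The description does not say which filters were applied in the last step. As a rough illustration only, the sketch below shows the kind of heuristic filtering commonly used on crawled sentence pairs: dropping near-empty segments, untranslated (identical) pairs, pairs with an implausible length ratio, and exact duplicates. The function name `filter_pairs` and the thresholds `min_chars` and `max_ratio` are hypothetical and are not taken from this resource.

```python
def filter_pairs(pairs, min_chars=3, max_ratio=3.0):
    """Return plausible (en, mk) sentence pairs, in input order.

    Hypothetical heuristics, not the corpus authors' actual filters.
    """
    seen = set()
    kept = []
    for en, mk in pairs:
        en, mk = en.strip(), mk.strip()
        # Drop segments that are empty or too short to be a sentence.
        if len(en) < min_chars or len(mk) < min_chars:
            continue
        # Identical source and target usually means untranslated text.
        if en == mk:
            continue
        # A very lopsided character-length ratio suggests misalignment.
        if max(len(en), len(mk)) / min(len(en), len(mk)) > max_ratio:
            continue
        # Keep only the first occurrence of each exact pair.
        if (en, mk) in seen:
            continue
        seen.add((en, mk))
        kept.append((en, mk))
    return kept
```

A character-length ratio is a crude check; it is used here only because it needs no language-specific resources.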