Bilingual (EN-MK) corpus from websites of government of North Macedonia v.1.0

Bilingual dataset (EN-MK) based on the content of websites of the government of
North Macedonia. It was generated by crawling the websites in
February 2021, detecting pairs of parallel documents, identifying parallel sentence pairs and filtering the
results.