English-Albanian corpus from websites of national Agencies v.1.0

Bilingual English-Albanian (EN-SQ) dataset built from the content of national agency websites. It contains 84,747 translation units. It was generated by crawling the websites in January 2021, detecting pairs of parallel documents, identifying parallel sentence pairs, and filtering the results.
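
The description does not say which filters were applied in the last step. As a rough illustration only, the Python sketch below shows the kind of filtering commonly applied to crawled sentence pairs: dropping empty segments, untranslated copies, pairs with implausible length ratios, and duplicates. The function name filter_pairs, the max_ratio threshold, and the sample sentences are all hypothetical and are not taken from the dataset or its actual pipeline.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (English sentence, Albanian sentence)


def filter_pairs(pairs: Iterable[Pair], max_ratio: float = 3.0) -> List[Pair]:
    """Keep plausible sentence pairs. All rules and thresholds are illustrative."""
    seen = set()
    kept = []
    for en, sq in pairs:
        # Collapse runs of whitespace and trim both sides.
        en, sq = " ".join(en.split()), " ".join(sq.split())
        # Drop pairs where either side is empty.
        if not en or not sq:
            continue
        # Drop pairs where the "translation" is a copy of the source.
        if en == sq:
            continue
        # Drop pairs whose character lengths differ too much (assumed threshold).
        ratio = max(len(en), len(sq)) / min(len(en), len(sq))
        if ratio > max_ratio:
            continue
        # Drop case-insensitive duplicates, keeping the first occurrence.
        key = (en.lower(), sq.lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append((en, sq))
    return kept


if __name__ == "__main__":
    sample = [
        ("Good morning.", "Mirëmëngjes."),
        ("Good  morning.", "Mirëmëngjes."),  # duplicate after normalization
        ("Contact", "Contact"),              # untranslated copy
    ]
    print(filter_pairs(sample))
```

A character-length ratio is a crude but cheap proxy for misalignment; a real pipeline would more likely combine it with language identification and an alignment score.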