SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in TMX format. 
Collection of 31 bilingual TMX files for EN-X language pairs, where X is BG, CS, DE, EL, EN, ES, ET, FI, FR, HR, HU, IS, IT, LT, LV, MK, NB, NN, PL, PT, RU, SK, SL, SQ, SV. It also contains small collections for a few more language combinations. It was generated by processing abstracts of Bachelor,…
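As a rough illustration of how bilingual TMX files like the ones described above can be read, here is a minimal sketch that streams aligned segment pairs using only the Python standard library. It assumes the usual TMX layout (`tu` elements containing `tuv` elements with an `xml:lang` attribute and a `seg` child). The function name `read_tmx_pairs`, the inline sample, and the EN-DE pair are hypothetical and not taken from the SciPar distribution; the actual files may use different language tags or carry extra metadata.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="en", tgt_lang="de"):
    """Yield (source_text, target_text) tuples from a TMX path or file object.

    Hypothetical helper: language codes are compared on their primary
    subtag only, so "en-GB" is treated as "en".
    """
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segs = {}
        for tuv in elem.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested in inline markup.
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]
        elem.clear()  # free the translation unit once it has been consumed


# Tiny made-up sample standing in for a real file (assumption, not SciPar data).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" o-tmf="none" creationtool="example"
          creationtoolversion="0"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>This thesis studies parallel corpora.</seg></tuv>
      <tuv xml:lang="de"><seg>Diese Arbeit untersucht Parallelkorpora.</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = list(read_tmx_pairs(io.BytesIO(SAMPLE)))
for en, de in pairs:
    print(en, "|||", de)
```

Passing a file path instead of `io.BytesIO(SAMPLE)` works the same way, and because `iterparse` streams the document, large files do not need to be loaded into memory in one piece.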
People who looked at this resource also viewed the following:
- English-Albanian corpus from websites of national Agencies v.1.0
- Compilation of Bulgarian-Spanish; Castilian parallel corpora resources used for training of NTEU Machine Translation engines. Tier 3.
- Web-acquired data related to Scientific research (Part I). Multilingual (BG, CS, DA, DE, EN, ES, ET, FR, GA, HR, IT, LT, LV, NB, NL, PL, PT, RU, SK, SV, UK) collection of files in TMX format.
- Montenegrin web corpus MaCoCu-cnr 1.0
People who downloaded this resource also downloaded the following:
- Web-acquired data related to Scientific research (Part I). Multilingual (BG, CS, DA, DE, EN, ES, ET, FR, GA, HR, IT, LT, LV, NB, NL, PL, PT, RU, SK, SV, UK) collection of files in Moses format.
- Multilingual content acquired from advocacy and law associations/firms, conciliation/arbitration/co-operation institutes, dispute prevention and resolution agencies (part1, v.0).
- SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in MOSES format.
- Multilingual content acquired from advocacy and law associations/firms, conciliation/arbitration/co-operation institutes, dispute prevention and resolution agencies (part 1, v.1).