7 Language Resources

Order by:

 Bilingual (EN, AL) corpus v.1.02 based on WikiMatrix
Number of downloads 2 Number of views 17
  • Albanian
  • English
  • CC-BY-SA-3.0
 Bilingual (EN, AL) corpus v.1.05 based on WikiMatrix
Number of downloads 2 Number of views 16
  • Albanian
  • English
  • CC-BY-SA-3.0
 COVID-19 - HEALTH Wikipedia dataset. Bilingual (EN-SQ)
Number of downloads 3 Number of views 35
  • Albanian
  • English
  • CC-BY-SA-3.0
 COVID-19 - HEALTH Wikipedia dataset. Multilingual (52 EN-X language pairs)
Number of downloads 103 Number of views 191
  • Afrikaans
  • Albanian
  • Arabic
  • Azerbaijani
  • Basque
  • Belarusian
  • Bengali
  • Bosnian
  • Bulgarian
  • Catalan; Valencian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Esperanto
  • Estonian
  • Finnish
  • French
  • Galician
  • German
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Italian
  • Korean
  • Latvian
  • Lithuanian
  • Macedonian
  • Malay (macrolanguage)
  • Malayalam
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swahili (macrolanguage)
  • Swedish
  • Tagalog
  • Tamil
  • Telugu
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese
  • CC-BY-SA-3.0
 HRW dataset v1. Multilingual (EN, AR, BG, BN, CS, DA, DE, EL, ES, FA, FI, FR, HR, HU, IN, IT, KO, LV, NB, NL, PL, PT, RU, SK, SQ, SV, TH, TL, TR, UK, UR, Vi, ZH)
Number of downloads 16 Number of views 41
  • Albanian
  • Bengali
  • Bulgarian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Filipino; Pilipino
  • Finnish
  • French
  • German
  • Hungarian
  • Indonesian
  • Italian
  • Korean
  • Latvian
  • Modern Greek (1453-)
  • Norwegian Bokmål
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Slovak
  • Spanish; Castilian
  • Swedish
  • Tai languages
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese
  • CC-BY-NC-ND-3.0
 SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in MOSES format.
Number of downloads 5 Number of views 22
  • Albanian
  • Bulgarian
  • Croatian
  • Czech
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hungarian
  • Icelandic
  • Italian
  • Latvian
  • Lithuanian
  • Macedonian
  • Modern Greek (1453-)
  • Norwegian Bokmål
  • Norwegian Nynorsk
  • Polish
  • Portuguese
  • Russian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • CC-BY-NC-SA-4.0
 SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in TMX format.
Number of downloads 40 Number of views 92
  • Albanian
  • Bulgarian
  • Croatian
  • Czech
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hungarian
  • Icelandic
  • Italian
  • Latvian
  • Lithuanian
  • Macedonian
  • Modern Greek (1453-)
  • Norwegian Bokmål
  • Norwegian Nynorsk
  • Polish
  • Portuguese
  • Russian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • CC-BY-NC-SA-4.0