21 Language Resources (Page 1 of 2)
COVID-19 CDC dataset v2. Multilingual (EN, ES, FR, PT, IT, DE, KO, RU, ZH, UK, VI)
18
26
- Chinese
- English
- French
- German
- Italian
- Korean
- Philippine languages
- Portuguese
- Russian
- Spanish; Castilian
- Ukrainian
- Vietnamese
- Public Domain
COVID-19 Government of Canada dataset v2. Multilingual (EN, FR, DE, ES, EL, IT, PL, PT, RO, KO, RU, ZH, UK, VI, TA, TL)
10
27
- Chinese
- English
- French
- German
- Italian
- Korean
- Modern Greek (1453-)
- Philippine languages
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Spanish; Castilian
- Tamil
- Ukrainian
- Vietnamese
- CC-BY-NC-4.0
COVID-19 - HEALTH Wikipedia dataset. Multilingual (52 EN-X language pairs)
103
193
- Afrikaans
- Albanian
- Arabic
- Azerbaijani
- Basque
- Belarusian
- Bengali
- Bosnian
- Bulgarian
- Catalan; Valencian
- Chinese
- Croatian
- Czech
- Danish
- Dutch; Flemish
- English
- Esperanto
- Estonian
- Finnish
- French
- Galician
- German
- Hebrew
- Hindi
- Hungarian
- Indonesian
- Italian
- Korean
- Latvian
- Lithuanian
- Macedonian
- Malay (macrolanguage)
- Malayalam
- Modern Greek (1453-)
- Norwegian
- Persian
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Serbian
- Slovak
- Slovenian
- Spanish; Castilian
- Swahili (macrolanguage)
- Swedish
- Tagalog
- Tamil
- Telugu
- Thai
- Turkish
- Ukrainian
- Vietnamese
- CC-BY-SA-3.0
COVID-19 Parallel Global Voices dataset. Multilingual (EN, ES, FR, IT, EL, RU, AR, MG, NL, SR, BN, PT, PL, DE, RO, CS)
23
98
- Arabic
- Bengali
- Czech
- Dutch; Flemish
- English
- French
- German
- Italian
- Malagasy
- Modern Greek (1453-)
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Serbian
- Spanish; Castilian
- CC-BY-3.0
COVID-19 POLISH-GOV v2 dataset. Multilingual (EN, PL, FR, DE, VI, RU, UK)
6
13
- English
- French
- German
- Polish
- Russian
- Ukrainian
- Vietnamese
- Open Under-PSI
COVID-19 Voltaire dataset v1. Multilingual (EN, AR, CS, DE, EL, ES, FA, FR, IT, NB, NL, NN, PL, PT, RO, RU, TR)
2
15
- Arabic
- Czech
- Dutch; Flemish
- English
- French
- German
- Italian
- Modern Greek (1453-)
- Norwegian Bokmål
- Norwegian Nynorsk
- Persian
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Spanish; Castilian
- Turkish
- CC-BY-NC-ND-4.0
COVID-19 Voltaire dataset v2. Multilingual (EN, AR, CS, DE, EL, ES, FA, FR, IT, NB, NL, NN, PL, PT, RO, RU, TR)
11
29
- Arabic
- Czech
- Dutch; Flemish
- English
- French
- German
- Italian
- Modern Greek (1453-)
- Norwegian Bokmål
- Norwegian Nynorsk
- Persian
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Spanish; Castilian
- Turkish
- CC-BY-NC-ND-4.0
COVID-19 WIPO dataset v1. Multilingual (EN, ES, FR, DE, PT, RU)
1
18
- English
- French
- German
- Portuguese
- Russian
- Spanish; Castilian
- CC-BY-3.0
COVID-19 WIPO dataset v2. Multilingual (EN, ES, FR, DE, PT, RU, AR, ZH)
4
11
- English
- French
- German
- Portuguese
- Russian
- Spanish; Castilian
- CC-BY-3.0
HRW dataset v1. Multilingual (EN, AR, BG, BN, CS, DA, DE, EL, ES, FA, FI, FR, HR, HU, IN, IT, KO, LV, NB, NL, PL, PT, RU, SK, SQ, SV, TH, TL, TR, UK, UR, VI, ZH)
17
42
- Albanian
- Bengali
- Bulgarian
- Chinese
- Croatian
- Czech
- Danish
- Dutch; Flemish
- English
- Filipino; Pilipino
- Finnish
- French
- German
- Hungarian
- Indonesian
- Italian
- Korean
- Latvian
- Modern Greek (1453-)
- Norwegian Bokmål
- Persian
- Polish
- Portuguese
- Russian
- Slovak
- Spanish; Castilian
- Swedish
- Tai languages
- Turkish
- Ukrainian
- Urdu
- Vietnamese
- CC-BY-NC-ND-3.0
Multilingual corpus in HEALTH (COVID-19) domain part_1b (v.1.05) in TMX format.
9
25
- Arabic
- English
- French
- German
- Russian
- CC-BY-4.0
Multilingual corpus in HEALTH (COVID-19) domain part_1b (v.1.05) in TSV/Moses-like format.
3
10
- Arabic
- English
- French
- German
- Russian
- CC-BY-4.0
Multilingual corpus in HEALTH (COVID-19) domain part_1b (v.1.0) in TMX format.
1
11
- Arabic
- English
- French
- German
- Russian
- CC-BY-4.0
Multilingual corpus in HEALTH (COVID-19) domain part_1b (v.1.0) in TSV/Moses-like format.
1
9
- Arabic
- English
- German
- Russian
- CC-BY-4.0
OpenEdition culture-related publications. Multilingual (AR, DE, EL, EN, ES, FR, HR, IT, NL, PL, PT, RO, RU, SL, SV) collection of TMX files.
7
35
- Arabic
- Croatian
- Dutch; Flemish
- English
- French
- German
- Italian
- Modern Greek (1453-)
- Polish
- Portuguese
- Romanian; Moldavian; Moldovan
- Russian
- Slovenian
- Spanish; Castilian
- Swedish
- CC-BY-NC-ND-4.0
SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in MOSES format.
5
22
- Albanian
- Bulgarian
- Croatian
- Czech
- English
- Estonian
- Finnish
- French
- German
- Hungarian
- Icelandic
- Italian
- Latvian
- Lithuanian
- Macedonian
- Modern Greek (1453-)
- Norwegian Bokmål
- Norwegian Nynorsk
- Polish
- Portuguese
- Russian
- Slovak
- Slovenian
- Spanish; Castilian
- Swedish
- CC-BY-NC-SA-4.0
SciPar: A collection of parallel corpora from scientific abstracts (v. 2021) in TMX format.
40
95
- Albanian
- Bulgarian
- Croatian
- Czech
- English
- Estonian
- Finnish
- French
- German
- Hungarian
- Icelandic
- Italian
- Latvian
- Lithuanian
- Macedonian
- Modern Greek (1453-)
- Norwegian Bokmål
- Norwegian Nynorsk
- Polish
- Portuguese
- Russian
- Slovak
- Slovenian
- Spanish; Castilian
- Swedish
- CC-BY-NC-SA-4.0
Web-acquired data related to culture (Part I). Multilingual (BG, CS, DA, DE, EL, EN, ET, FI, FR, HR, IS, IT, LT, LV, MK, MT, RU, SK, SV) collection of files in Moses format.
5
12
- Bulgarian
- Czech
- Danish
- English
- Estonian
- Finnish
- French
- German
- Icelandic
- Italian
- Latvian
- Lithuanian
- Macedonian
- Maltese
- Russian
- Slovak
- Swedish
- Open Under-PSI
Web-acquired data related to culture (Part I). Multilingual (BG, CS, DA, DE, EL, EN, ET, FI, FR, HR, IS, IT, LT, LV, MK, MT, RU, SK, SV) collection of files in TMX format.
4
12
- Bulgarian
- Czech
- Danish
- English
- Estonian
- Finnish
- French
- German
- Icelandic
- Italian
- Latvian
- Lithuanian
- Macedonian
- Maltese
- Russian
- Slovak
- Swedish
- Open Under-PSI
Web-acquired data related to Scientific research (Part I). Multilingual (BG, CS, DA, DE, EN, ES, ET, FR, GA, HR, IT, LT, LV, NB, NL, PL, PT, RU, SK, SV, UK) collection of files in Moses format.
7
18
- Bulgarian
- Croatian
- Czech
- Danish
- Dutch; Flemish
- English
- Estonian
- French
- German
- Irish
- Italian
- Latvian
- Lithuanian
- Norwegian Bokmål
- Polish
- Portuguese
- Russian
- Slovak
- Spanish; Castilian
- Swedish
- Ukrainian
- Open Under-PSI