COVID-19 Parallel Global Voices dataset. Multilingual (EN, ES, FR, IT, EL, RU, AR, MG, NL, SR, BN, PT, PL, DE, RO, CS)

The "Covid Parallel Global Voices" dataset was created for the European Language Resources Coordination Action (ELRC) (http://lr-coordination.eu/) by researchers at the NLP group of the Institute for Language and Speech Processing (http://www.ilsp.gr/). The primary data is copyrighted by Global Voices (https://globalvoices.org/), and the dataset is licensed under "CC-BY 3.0" (https://creativecommons.org/licenses/by/3.0/).

Multilingual (EN, ES, FR, IT, EL, RU, AR, MG, NL, SR, BN, PT, PL, DE, RO, CS) COVID-19-related corpus acquired from the website of Global Voices (https://globalvoices.org/) (28th April 2020). It contains 25755 translation units (TUs) in total, all paired with English, distributed across language pairs as follows:
5459 EN-ES
4840 EN-FR
4056 EN-IT
3204 EN-EL
3127 EN-RU
1779 EN-AR
1045 EN-MG
675 EN-NL
434 EN-SR
384 EN-BN
276 EN-PT
193 EN-PL
178 EN-DE
66 EN-RO
39 EN-CS
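
The per-pair counts above can be checked against the stated total with a short script. The following is a minimal sketch in Python; the names PAIR_COUNTS, STATED_TOTAL, total_tus and pair_share are illustrative assumptions and not part of the dataset distribution. The counts are copied by hand from the list above.

```python
# Per-pair TU counts, copied from the list above.
# All names in this sketch are illustrative, not part of the dataset.
PAIR_COUNTS = {
    "EN-ES": 5459,
    "EN-FR": 4840,
    "EN-IT": 4056,
    "EN-EL": 3204,
    "EN-RU": 3127,
    "EN-AR": 1779,
    "EN-MG": 1045,
    "EN-NL": 675,
    "EN-SR": 434,
    "EN-BN": 384,
    "EN-PT": 276,
    "EN-PL": 193,
    "EN-DE": 178,
    "EN-RO": 66,
    "EN-CS": 39,
}

# Total number of TUs as stated in the description.
STATED_TOTAL = 25755


def total_tus(counts):
    """Return the sum of TU counts over all language pairs."""
    return sum(counts.values())


def pair_share(counts, pair):
    """Return the fraction of all TUs that belong to one language pair."""
    return counts[pair] / total_tus(counts)


if __name__ == "__main__":
    # Report whether the per-pair counts add up to the stated total.
    print("sum of pairs:", total_tus(PAIR_COUNTS))
    print("matches stated total:", total_tus(PAIR_COUNTS) == STATED_TOTAL)
    # Show each pair's share of the corpus, largest first.
    for pair, n in sorted(PAIR_COUNTS.items(), key=lambda kv: -kv[1]):
        print(f"{pair}: {n} TUs ({pair_share(PAIR_COUNTS, pair):.1%})")
```

The same dictionary can be reused to decide which pairs are large enough for a given task, since the distribution is heavily skewed toward a few pairs.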

DSI Relevance: eHealth