COVID-19-related multilingual corpus from EU press Corner 2020 v.0.9 in TMX format
Multilingual dataset (CEF languages) based on the press releases published on the ec.europa.eu portal during 2020. For example, https://ec.europa.eu/commission/presscorner/detail/en/ip_20_1680 and https://ec.europa.eu/commission/presscorner/detail/el/ip_20_1680 are the EN and EL versions of the same press release. It contains 276 TMX files with 2,514,613 translation units in total.
DSI Relevance: eHealth
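Since the corpus is distributed as TMX files, a minimal sketch of reading translation units from one such file may be useful. It assumes the standard TMX layout (`tu` elements containing `tuv` elements with an `xml:lang` attribute and a `seg` child) and uses only the Python standard library. The function name `iter_translation_units` and the inline sample are illustrative and are not taken from the dataset itself.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_translation_units(source):
    """Yield one {language code: segment text} dict per <tu> element.

    `source` is a file path or a binary file object. The file is parsed
    incrementally so that large TMX files need not be held in memory.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        unit = {}
        for tuv in elem.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute instead.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                unit[lang.lower()] = "".join(seg.itertext()).strip()
        yield unit
        elem.clear()  # free the children of the processed unit


# Hypothetical two-language sample, written by hand for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example"
          creationtoolversion="0" o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Press release</seg></tuv>
      <tuv xml:lang="el"><seg>Δελτίο Τύπου</seg></tuv>
    </tu>
  </body>
</tmx>
"""

units = list(iter_translation_units(io.BytesIO(SAMPLE.encode("utf-8"))))
print(len(units), units[0]["en"])
```

To read a downloaded file instead of the sample, pass its path to `iter_translation_units`. Which language codes appear in a given file depends on the language pair it covers.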
People who looked at this resource also viewed the following:
- COVID-19 OSHA-EUROPA dataset v1. Multilingual (CEF languages plus IS and NB but not Irish)
- COVID-19-related multilingual corpus from EU press Corner 2020 v.0.9 in Moses-like format
- Transitional Protocol for Working Safely from the Department of Enterprise Trade and Employment January 2022
- COVID-19 Government of Canada dataset v2. Multilingual (EN, FR, DE, ES, EL, IT, PL, PT, RO, KO, RU, ZH, UK, VI, TA, TL)
People who downloaded this resource also downloaded the following:
- COVID-19 OSHA-EUROPA dataset v1. Multilingual (CEF languages plus IS and NB but not Irish)
- COVID-19 - HEALTH Wikipedia dataset. Multilingual (52 EN-X language pairs)
- ELRC3.0 Multilingual corpus made out of PDF documents from the European Medicines Agency (EMEA), https://www.ema.europa.eu, (February 2020).
- Multilingual corpus from the European Vaccination Information Portal