Multilingual medical-domain corpus from the Publications Office of the EU

This dataset was generated in April 2020 from public content available through the Publications Office of the European Union (OP Portal), https://op.europa.eu/en/home.

161845 new sentence pairs, spread over 23 EN-X language pairs, extracted from medical-domain content of the Publications Office of the EU. The source documents include laws, studies and EC announcements, labelled with concepts such as epidemiology, epidemic, disease surveillance, health control and public hygiene.

Number of sentence pairs per language pair (ISO 639-3 codes):
7419 eng-bul
7437 eng-ces
7378 eng-dan
7430 eng-deu
7401 eng-ell
7418 eng-est
7402 eng-fin
7311 eng-fra
1289 eng-gle
7323 eng-hrv
7383 eng-hun
7336 eng-ita
7448 eng-lav
7343 eng-lit
5578 eng-mlt
7408 eng-nld
7253 eng-pol
7368 eng-por
7371 eng-ron
7362 eng-slk
7419 eng-slv
7309 eng-spa
7459 eng-swe
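
As a quick consistency check, the per-pair counts listed above can be parsed and summed to confirm the stated totals. The following is a minimal sketch; the names `COUNTS` and `parse_counts` are illustrative only and are not part of the dataset or of any distributed tooling.

```python
# Per-language-pair counts, copied verbatim from the list above.
COUNTS = """\
7419 eng-bul
7437 eng-ces
7378 eng-dan
7430 eng-deu
7401 eng-ell
7418 eng-est
7402 eng-fin
7311 eng-fra
1289 eng-gle
7323 eng-hrv
7383 eng-hun
7336 eng-ita
7448 eng-lav
7343 eng-lit
5578 eng-mlt
7408 eng-nld
7253 eng-pol
7368 eng-por
7371 eng-ron
7362 eng-slk
7419 eng-slv
7309 eng-spa
7459 eng-swe
"""


def parse_counts(text):
    """Map each language pair (e.g. 'eng-bul') to its sentence-pair count."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        number, pair = line.split()
        counts[pair] = int(number)
    return counts


counts = parse_counts(COUNTS)
total = sum(counts.values())
smallest = min(counts, key=counts.get)

print(len(counts), total)  # 23 161845
print(smallest, counts[smallest])  # eng-gle 1289
```

The sum matches the stated total of 161845 across 23 language pairs. Irish (eng-gle) and Maltese (eng-mlt) are noticeably smaller than the other pairs, which may matter when sampling or balancing training data.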

DSI Relevance: eHealth