Multilingual corpus from the Publications Office of the EU on the medical domain v.2

This dataset was generated from public content available through the Publications Office of the European Union (OP Portal), https://op.europa.eu/en/home, in April 2020.

The corpus contains 272748 sentence pairs in total, across 23 EN-X language pairs, extracted from medical-domain content of the Publications Office of the EU. The pairs are sourced from laws, studies, EC announcements, etc., labelled with concepts such as epidemiology, epidemic, disease surveillance, health control, public hygiene, freedom of movement, distance learning, etc.
Number of sentence pairs per language pair:
12845 en-bg
12848 en-cs
12930 en-da
12990 en-de
12776 en-el
12882 en-es
12710 en-et
12635 en-fi
12839 en-fr
1799 en-ga
12530 en-hr
12718 en-hu
12750 en-it
12285 en-lt
12739 en-lv
3055 en-mt
12883 en-nl
12536 en-pl
12837 en-pt
12847 en-ro
12617 en-sk
12903 en-sl
12794 en-sv
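
The per-pair counts above can be checked against the stated total with a short script. This is a minimal sketch, not part of the dataset distribution: the names LISTING and parse_counts are hypothetical, and the listing is assumed to keep the "<count> <pair>" layout shown above.

```python
# Counts copied verbatim from the listing above, one "<count> <pair>" per line.
# LISTING and parse_counts are illustrative names, not part of the dataset.
LISTING = """\
12845 en-bg
12848 en-cs
12930 en-da
12990 en-de
12776 en-el
12882 en-es
12710 en-et
12635 en-fi
12839 en-fr
1799 en-ga
12530 en-hr
12718 en-hu
12750 en-it
12285 en-lt
12739 en-lv
3055 en-mt
12883 en-nl
12536 en-pl
12837 en-pt
12847 en-ro
12617 en-sk
12903 en-sl
12794 en-sv
"""


def parse_counts(text):
    """Map each language pair (e.g. 'en-fr') to its sentence-pair count."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        count, pair = line.split()
        counts[pair] = int(count)
    return counts


counts = parse_counts(LISTING)
total = sum(counts.values())
smallest = min(counts, key=counts.get)

print(len(counts), total)  # 23 272748
print(smallest, counts[smallest])  # en-ga 1799
```

The sum agrees with the 272748 sentence pairs and 23 language pairs stated in the description. en-ga and en-mt are markedly smaller than the other pairs.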

DSI Relevance: eHealth