Multilingual corpus from the Publications Office of the EU on the medical domain v.2
This dataset was generated in April 2020 from public content available through the Publications Office of the European Union (OP Portal), https://op.europa.eu/en/home.
The corpus contains 272748 sentence pairs (across 23 EN-X language pairs in total) extracted from Publications Office of the EU content in the medical domain. The pairs are sourced from laws, studies, EC announcements, etc., labelled with concepts such as epidemiology, epidemic, disease surveillance, health control, public hygiene, freedom of movement, and distance learning.
DSI Relevance: eHealth
People who looked at this resource also viewed the following:
- Multilingual corpus from the Publications Office of the EU on the medical domain
- Multilingual corpus from the European Vaccination Information Portal
- Multilingual corpus made out of PDF documents from the European Medicines Agency (EMEA), https://www.ema.europa.eu (February 2020)
- COVID-19 EC-EUROPA v1 dataset. Multilingual (CEF languages)