Post-editing corpus English to Dutch/French/Portuguese, legal domain – ELRC-SHARE

4 Last update: 2020-04-27

APE-QUEST postedition tuples

https://ape-quest.eu

ID: https://ape-quest.eu

Training data created for building quality estimation and automatic post-editing models (Ive et al. 2020). The data consist of tuples (source sentence, machine translation output, manual post-edit, independent reference translation) for three European language pairs. The data cover the following domains: online dispute resolution, procurement and justice.
Number of tuples per language pair:
- English-Dutch: 11249
- English-French: 9989
- English-Portuguese: 10165

The machine translation output was produced with neural MT systems. All data were anonymized by replacing person names and contact information. The tuples were randomized. The data are distributed as a set of four files per language pair, one for each element of the tuple.
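Since each language pair is distributed as four files, one per tuple element, the tuples can be rebuilt by reading the files in parallel. The following is a minimal sketch, not the project's own tooling. It assumes the files are line-aligned (one sentence per line), UTF-8 encoded, and named `<pair>.<suffix>`; the suffixes `src`, `mt`, `pe` and `ref`, the pair label `en-nl` and the function name `read_tuples` are hypothetical, since this record does not state the actual file names.

```python
from pathlib import Path
from typing import NamedTuple


class PostEditTuple(NamedTuple):
    source: str
    mt: str
    post_edit: str
    reference: str


# Hypothetical file suffixes; the record does not state the real file names.
SUFFIXES = ("src", "mt", "pe", "ref")


def read_tuples(directory, pair, suffixes=SUFFIXES):
    """Read four line-aligned files for one language pair into tuples."""
    columns = []
    for suffix in suffixes:
        path = Path(directory) / f"{pair}.{suffix}"
        # Assumption: UTF-8 text, one tuple element per line.
        with path.open(encoding="utf-8") as handle:
            columns.append([line.rstrip("\n") for line in handle])
    # All four files must have the same number of lines to stay aligned.
    lengths = {len(column) for column in columns}
    if len(lengths) != 1:
        raise ValueError(f"files for {pair} are not line-aligned: {sorted(lengths)}")
    return [PostEditTuple(*row) for row in zip(*columns)]
```

Calling `read_tuples("data", "en-nl")` would then return a list whose items expose `.source`, `.mt`, `.post_edit` and `.reference`. The length check fails early if one file was truncated, which would otherwise silently misalign the tuples.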

DSI Relevance: OnlineDisputeResolution, eJustice, eProcurement

Distribution

Availability: Available

Licences

Distribution Details

Distribution Medium: Data Downloadable

Contact Person

Tom Vanallemeersch

text

Bilingual text corpus

Languages

Dutch; Flemish (nl)

English (en)

Portuguese (pt)

French (fr)

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Text Format

Plain Text

Size

31,403 Translation Units
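The size given here agrees with the per-pair tuple counts in the description. A quick check, where the dictionary keys are illustrative labels rather than identifiers taken from the record:

```python
# Tuple counts per language pair, as listed in the description.
counts = {"en-nl": 11249, "en-fr": 9989, "en-pt": 10165}

total = sum(counts.values())
print(total)  # 31403

# Share of each language pair in the whole corpus, in percent.
shares = {pair: round(100 * n / total, 1) for pair, n in counts.items()}
print(shares)
```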

Resource Creation

Anonymized

Funding Project

APE-QUEST, 2017-EU-IA-0151 (APE-QUEST)

URL: https://ape-quest.eu

Funding Type: EU Funds

Funder: CEF Telecom

Funding Country: European Union (EU)

Metadata

Created: 26/02/2020

Last Updated: 26/02/2020

Metadata Language: English (en)

Version

Version: 1.0
