CEF Data Marketplace second multilingual benchmark for the evaluation of cleaning tools
Five parallel corpora (En-Bg, En-Da, En-El, En-Hu, En-Ro) in the legal domain, manually annotated by professional translators. Each translation unit (TU) in the datasets is annotated as "clean" (the translation is correct and fully equivalent to its source text), "partially clean", or "not clean". The resulting gold standards were used in the second evaluation cycle of the CEF project to evaluate the Cleaning service offered by the CEF Data Marketplace platform.
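As an illustration of how a gold standard with these three labels might be used to score a cleaning tool, here is a minimal sketch. It assumes the tool returns a keep/discard decision per TU and that the labels are available as the plain strings shown; the function name `score_cleaner`, the `partial_is_clean` switch, and the label encoding are hypothetical and do not come from the dataset documentation or the CEF evaluation itself.

```python
from collections import Counter

# Hypothetical label strings; the actual encoding in the datasets may differ.
GOLD_LABELS = ("clean", "partially clean", "not clean")


def score_cleaner(gold, kept, partial_is_clean=False):
    """Compare a tool's keep/discard decisions with gold annotations.

    gold: list of labels from GOLD_LABELS, one per translation unit.
    kept: list of booleans, True where the tool kept the unit.
    partial_is_clean: whether "partially clean" units count as worth keeping
        (an assumption the evaluator has to make explicitly).
    """
    if len(gold) != len(kept):
        raise ValueError("gold and kept must have the same length")

    counts = Counter(tp=0, fp=0, tn=0, fn=0)
    for label, was_kept in zip(gold, kept):
        if label not in GOLD_LABELS:
            raise ValueError(f"unknown label: {label!r}")
        should_keep = label == "clean" or (
            partial_is_clean and label == "partially clean"
        )
        if was_kept and should_keep:
            counts["tp"] += 1
        elif was_kept and not should_keep:
            counts["fp"] += 1
        elif not was_kept and should_keep:
            counts["fn"] += 1
        else:
            counts["tn"] += 1

    kept_total = counts["tp"] + counts["fp"]
    clean_total = counts["tp"] + counts["fn"]
    return {
        # Share of kept units that the annotators consider worth keeping.
        "precision": counts["tp"] / kept_total if kept_total else 0.0,
        # Share of units worth keeping that the tool actually kept.
        "recall": counts["tp"] / clean_total if clean_total else 0.0,
        "counts": dict(counts),
    }


if __name__ == "__main__":
    gold = ["clean", "partially clean", "not clean", "clean"]
    kept = [True, True, False, False]
    print(score_cleaner(gold, kept))
    print(score_cleaner(gold, kept, partial_is_clean=True))
```

How "partially clean" units are counted changes the scores noticeably, so it is worth reporting results under both settings.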