PRINCIPLE Foras na Gaeilge parallel translation memory dataset (evaluated)
Aligned parallel corpus based on translation memory data from Foras na Gaeilge. The data was originally supplied in an aligned format and has since been normalized and cleaned. The cleaned content was then scanned automatically for obvious errors and spot-checked manually for quality. This dataset has been used in the development of an MT system for the EN-GA language pair and is therefore considered to be of high quality.
Languages: English-Irish
Domain: mixed (general-purpose with some eProcurement)
Size: 54141 translation units
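
The description above mentions normalization, cleaning, and an automated search for obvious errors, but does not say how these were carried out. The following is only a minimal sketch of what such a pass over English-Irish translation units might look like. The function names (`normalize`, `find_obvious_errors`, `clean`), the specific checks, and the length-ratio threshold are all assumptions for illustration and are not taken from the dataset's actual pipeline.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def find_obvious_errors(source: str, target: str, max_ratio: float = 3.0) -> list[str]:
    """Return hypothetical error flags for one translation unit.

    The checks and the max_ratio threshold are illustrative assumptions.
    """
    flags = []
    if not source or not target:
        # One side is missing, so the unit cannot be a valid pair.
        flags.append("empty_side")
        return flags
    if source == target:
        # Identical sides often indicate an untranslated segment.
        flags.append("untranslated")
    ratio = max(len(source), len(target)) / min(len(source), len(target))
    if ratio > max_ratio:
        # Very different lengths suggest a misalignment.
        flags.append("length_mismatch")
    return flags


def clean(units):
    """Normalize (en, ga) pairs, then drop duplicates and flagged units."""
    seen = set()
    kept = []
    for en, ga in units:
        pair = (normalize(en), normalize(ga))
        if pair in seen or find_obvious_errors(*pair):
            continue
        seen.add(pair)
        kept.append(pair)
    return kept
```

Units flagged by a pass like this would normally go to manual review rather than being discarded outright, which matches the manual spot-check mentioned in the description.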