An tAonad Aistriúcháin agus Ateangaireachta ÓEG/NUIG Translation Unit dataset (evaluated)

Aligned parallel corpus built from translated material from NUI Galway. The source data were originally unaligned. The following processing was performed: automatic text extraction from the raw documents, normalization, translation unit (TU) alignment, cleaning, automated error detection, and a manual spot-check for quality. The dataset was prepared for the development of a machine translation (MT) system for the English-Irish (EN-GA) language pair and is therefore considered to be of high quality. The data are contributed exclusively for use by DGT for eTranslation development.
Domain: mixed (general-purpose with some eProcurement)
Size: 16450 translation units
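
The normalization and cleaning steps listed in the description could look roughly like the sketch below. This is an illustration only, not the pipeline actually used for this dataset: the function names `normalize` and `clean_units`, the `max_ratio` threshold, and the specific filters (empty segments, identical source and target, extreme length ratio, exact duplicates) are all assumptions.

```python
import unicodedata


def normalize(text: str) -> str:
    """NFC-normalize the text and collapse whitespace runs to single spaces."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def clean_units(units, max_ratio=3.0):
    """Yield normalized (en, ga) pairs, skipping suspicious ones.

    Hypothetical filters: empty segments, source identical to target,
    character-length ratio above max_ratio, and exact duplicates.
    """
    seen = set()
    for en, ga in units:
        en, ga = normalize(en), normalize(ga)
        if not en or not ga:
            continue  # one side is empty
        if en == ga:
            continue  # probably left untranslated
        ratio = max(len(en), len(ga)) / min(len(en), len(ga))
        if ratio > max_ratio:
            continue  # lengths too different, likely misaligned
        if (en, ga) in seen:
            continue  # exact duplicate of an earlier unit
        seen.add((en, ga))
        yield en, ga
```

For example, `list(clean_units(pairs))` over an iterable of `(English, Irish)` string pairs returns only the pairs that pass every filter, in their original order.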