PAH_Oxfam Dataset (Processed)

Source English texts provided by OXFAM International. Polish translations provided by the Polish Humanitarian Action.

The PAH/Oxfam dataset is a 100K-token Polish-English parallel resource in XLIFF format, composed of translations of Oxfam reports by Polish Humanitarian Action (PAH) and web documents from the PAH website. The dataset comprises:
1. Busan in a nutshell (6674 tokens)
2. Growing disruption (10968 tokens)
3. Growing a better future (41296 tokens)
4. PAH Strategy (6028 tokens)
5. PAH: Where we work (2028 tokens)
6. The European Union's Development Policy (38422 tokens).
It was converted into a Polish-English parallel resource in TMX format containing 2708 translation units (TUs).
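
For readers who want to load the TMX version, the following is a minimal sketch of extracting aligned segment pairs using only the Python standard library. It is not part of the dataset's tooling. The function name read_tmx_pairs and the language codes "en" and "pl" are assumptions for illustration; check the xml:lang values actually used in the file before relying on them.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="en", tgt_lang="pl"):
    """Yield (source_text, target_text) pairs from a TMX file.

    `source` is a path or a binary file-like object. The language codes
    are assumptions; region suffixes such as "en-GB" are reduced to "en".
    """
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # TMX 1.4 uses xml:lang; some older files use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        # Skip translation units that lack either language.
        if src_lang in segments and tgt_lang in segments:
            yield segments[src_lang], segments[tgt_lang]
```

Calling list(read_tmx_pairs(path)) on the TMX file, where path is wherever you saved it, should give one pair per translation unit that contains both languages, so its length can be compared against the TU count stated above.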