Parallel Global Voices (English - Romanian)
Parallel Global Voices (English - Romanian) was created for the European Language Resources Coordination Action (ELRC) (http://lr-coordination.eu/) by researchers at the NLP group of the Institute for Language and Speech Processing (http://www.ilsp.gr/). The primary data is copyrighted by Parallel Global Voices (https://globalvoices.org/), and the resource is licensed under "CC-BY 3.0" (https://creativecommons.org/licenses/by/3.0/).
Parallel Global Voices EN-RO is a parallel corpus generated from the Global Voices multilingual group of websites (http://globalvoices.org/), where volunteers publish and translate news stories in more than 40 languages. The original content from the Global Voices websites is made available by its authors and publishers under a Creative Commons Attribution license. The content was crawled in July-August 2015 by researchers at the NLP group of the Institute for Language and Speech Processing. Documents that are translations of each other were paired on the basis of their link information. After document pairing, segment alignments were extracted automatically. The results of the automatic alignment at document and segment level are distributed under a Creative Commons Attribution license.
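The document pairing step described above can be illustrated with a minimal sketch. It is not the tooling used to build the corpus: it assumes each crawled page has already been reduced to its URL, a language code, and the set of URLs it links to as translations. The function name pair_documents, the dictionary layout, and the example.org URLs are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def pair_documents(
    links: Dict[str, Set[str]],
    lang: Dict[str, str],
    src: str = "en",
    tgt: str = "ro",
) -> List[Tuple[str, str]]:
    """Pair source- and target-language pages that link to each other.

    links maps a page URL to the URLs it lists as translations (assumed input).
    lang maps a page URL to its language code (assumed input).
    A pair is kept when a link in either direction connects the two pages.
    """
    pairs = set()
    for url, outgoing in links.items():
        url_lang = lang.get(url)
        for other in outgoing:
            other_lang = lang.get(other)
            if url_lang == src and other_lang == tgt:
                pairs.add((url, other))
            elif url_lang == tgt and other_lang == src:
                # Store every pair as (source, target) regardless of link direction.
                pairs.add((other, url))
    return sorted(pairs)


# Hypothetical crawl data, for illustration only.
example_lang = {
    "https://example.org/en/story-1": "en",
    "https://example.org/ro/story-1": "ro",
    "https://example.org/en/story-2": "en",
    "https://example.org/fr/story-2": "fr",
}
example_links = {
    "https://example.org/ro/story-1": {"https://example.org/en/story-1"},
    "https://example.org/en/story-2": {"https://example.org/fr/story-2"},
}
example_pairs = pair_documents(example_links, example_lang)
```

Here only the first story yields an English-Romanian pair, because the second story links to a French translation. Segment alignment would then run on each paired document, which this sketch does not cover.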
DSI Relevance: Europeana