Dutch Parallel Corpus (Processed)
NTU - Nederlandse Taalunie
The Dutch Parallel Corpus (DPC) is a 10-million-word, high-quality, sentence-aligned parallel corpus for the language pairs Dutch-English and Dutch-French, with Dutch as the central language. Part of the corpus is trilingual: a number of Dutch texts have translations in both English and French.
Linguistic annotation consists of lemmatization and part-of-speech (PoS) tagging of the DPC data. All text material in the corpus is also annotated with additional metadata at different levels, which lets users retrieve relevant information from the corpus.
The Dutch Parallel Corpus can be searched through the DPC Web interface, which was developed for Linux/Apache/MySQL/PHP.