Detecting semantic overlap: A parallel monolingual treebank for Dutch. Erwin Marsi and Emiel Krahmer.
CLIN, 2007
From the DAESO corpus: comparable text and paraphrases from manually generated clustered headlines of similar news articles, standing in an "intersects" relation. Contains 4693 TMX pairs (NL-NL), consisting of longer sequences (phrases and paragraphs).