Creation mode details: The ILSP Focused Crawler was used to acquire bilingual data from multilingual websites and to normalize, clean, (near-)deduplicate and identify parallel documents. The Maligna sentence aligner was used to extract segment alignments from the crawled parallel documents. As a post-processing step, the alignments were merged into one TMX file. The following filters were applied: TMX files generated from document pairs that were identified by non-aupdih methods were discarded; TMX files with a zeroToOne_alignments/total_alignments ratio larger than 0.16 were discarded; alignments of non-[1:1] type(s) were discarded; alignments with a TUV (after normalization) that has fewer than 3 tokens were discarded/annotated; alignments with an l1/l2 TUV length ratio smaller than 0.6 or larger than 1.6 were discarded/annotated; alignments in which different digits appear in each TUV were discarded/annotated; alignments with identical TUVs (after normalization) were removed; alignments with only non-letters in at least one of their TUVs were removed; duplicate alignments were discarded. There are 12509 TUs with no annotation, containing 275190 words and 32012 lexical types in el and 288100 words and 17270 lexical types in en. The mean value of the aligner's scores is 5.977708886744488 and the standard deviation is 1.1596291501448004.
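
The filters listed above can be illustrated with a minimal Python sketch. It is not the ILSP Focused Crawler's own code. The function names, the normalization (Unicode NFC, lowercasing, whitespace collapsing), whitespace tokenization, length measured in characters, and the comparison of digits as a sorted list are all assumptions made for illustration; only the thresholds (3 tokens, 0.6 to 1.6, 0.16) come from the description. The sketch reports which filters a pair triggers and leaves the discard-or-annotate decision to the caller.

```python
import re
import unicodedata

# Thresholds taken from the description above.
MIN_TOKENS = 3
MIN_RATIO, MAX_RATIO = 0.6, 1.6
MAX_ZERO_TO_ONE_RATIO = 0.16


def normalize(text):
    # Assumed normalization: NFC, lowercase, collapsed whitespace.
    text = unicodedata.normalize("NFC", text).lower()
    return " ".join(text.split())


def has_letter(text):
    return any(ch.isalpha() for ch in text)


def digits(text):
    # Assumption: digits are compared as a sorted list of single characters.
    return sorted(re.findall(r"\d", text))


def check_pair(l1, l2):
    """Return the names of the filters that the (l1, l2) pair triggers."""
    n1, n2 = normalize(l1), normalize(l2)
    reasons = []
    # Assumption: tokens are whitespace-separated strings.
    if min(len(n1.split()), len(n2.split())) < MIN_TOKENS:
        reasons.append("short")
    # Assumption: TUV length is measured in characters.
    ratio = len(n1) / len(n2) if n2 else float("inf")
    if ratio < MIN_RATIO or ratio > MAX_RATIO:
        reasons.append("length_ratio")
    if digits(n1) != digits(n2):
        reasons.append("digits")
    if n1 == n2:
        reasons.append("identical")
    if not has_letter(n1) or not has_letter(n2):
        reasons.append("non_letters")
    return reasons


def keep_file(zero_to_one, total):
    # File-level filter on the zeroToOne_alignments/total_alignments ratio.
    return total > 0 and zero_to_one / total <= MAX_ZERO_TO_ONE_RATIO


def filter_pairs(pairs):
    """Drop duplicates (after normalization) and pairs that trigger any filter."""
    seen = set()
    kept = []
    for l1, l2 in pairs:
        key = (normalize(l1), normalize(l2))
        if key in seen:
            continue
        seen.add(key)
        if not check_pair(l1, l2):
            kept.append((l1, l2))
    return kept
```

For example, `check_pair("there are 3 rooms here", "there are 4 rooms here")` reports only the digit mismatch, while a pair such as `("page 12", "page 12")` triggers both the minimum-token and the identical-TUV filters.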