CURLICAT Slovak corpus v1.0

This is the Slovak-language subcorpus of the collection of curated and analysed language data compiled by the CURLICAT project. It consists of 4.8 million sentences and 66.8 million tokens, linguistically analysed and enriched with IATE and domain-specific terminology extracted from the subcorpus. For more information, see delivery reports D1.1 and D2 on the CURLICAT website: