This is the Hungarian-language subcorpus of the collection of curated and analysed language data compiled by the CURLICAT project. It consists of over 2.75 million sentences and 61.2 million tokens, linguistically analysed and enriched with IATE and domain-specific terminology extracted from the subcorpus. In terms of sources, the corpus is dominated by longer texts from book publications covering the following domains: culture, economy, science and social issues. For more information, see the delivery reports D1.1 and D2 on the CURLICAT website (http://curlicat-project.eu/deliverables).