Tagged Icelandic Corpus - MIM


Mörkuð íslensk málheild (MÍM)


The Tagged Icelandic Corpus (MÍM) is a morphosyntactically tagged corpus of about 25 million tokens of contemporary Icelandic text, collected from varied sources during the years 2006 to 2010. It is intended for use in language technology projects and for linguistic research. The corpus can be searched through a web interface and downloaded in TEI-conformant XML format. Each text in the corpus is accompanied by metadata (http://www.malfong.is/index.php?lang=en&pg=mim).
No processing was performed on the corpus prior to uploading to ELRC-share.
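
Because the download is TEI-conformant XML, a file can be read with any standard XML parser. The following is a minimal sketch, not the corpus's documented schema: it assumes word tokens are encoded as TEI `w` elements and punctuation as `c` elements, with the lemma and morphosyntactic tag held in `lemma` and `type` attributes. The inline sample, its tag values (`NOUN-TAG`, `VERB-TAG`) and the function names are hypothetical, so check the element and attribute names against the actual files before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the TEI namespace. The element names (w, c), the
# attribute names (lemma, type) and the tag values are assumptions made for
# illustration; real corpus files may be structured differently.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p><s>
    <w type="NOUN-TAG" lemma="hestur">Hesturinn</w>
    <w type="VERB-TAG" lemma="hlaupa">hljóp</w>
    <c type="punctuation">.</c>
  </s></p></body></text>
</TEI>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def read_tokens(xml_text):
    """Return (form, lemma, tag) tuples for every w and c element, in order."""
    root = ET.fromstring(xml_text)
    tokens = []
    for elem in root.iter():
        if local_name(elem.tag) in ("w", "c"):
            form = (elem.text or "").strip()
            tokens.append((form, elem.get("lemma"), elem.get("type")))
    return tokens


if __name__ == "__main__":
    for form, lemma, tag in read_tokens(SAMPLE):
        print(form, lemma, tag, sep="\t")
```

Matching on the local tag name rather than a fixed namespace keeps the sketch usable whether or not the files declare the TEI namespace. For full-size corpus files, `ET.iterparse` would avoid loading a whole document into memory at once.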