MARCELL Slovak legislative subcorpus v2

The Slovak corpus (33 million tokens) contains legally binding acts issued from 1993 onward. The starting year follows the minor orthography reform of 1991 and coincides with the independence of Slovakia. The data was obtained from the archive of acts approved by the Slovak Parliament, published on the Slov-Lex legislative and information portal. It was converted from the original HTML format, filtered by date and document length, then tokenized, lemmatized, and morphologically annotated with the Slovak MorphoDiTa model, and dependency parsed with UDPipe.
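
The date and length filtering step could look roughly like the sketch below. It is an illustration only, not the code used to build the corpus: the `Document` class, the `keep` and `filter_documents` helpers, and the `MIN_TOKENS` threshold are all assumptions, and only the 1993 start year comes from the description above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    """Hypothetical container for one act after HTML conversion."""
    doc_id: str
    issued: date
    text: str


# Assumed thresholds: the description gives only the start year (1993).
# The minimum length is a placeholder, not the value used for the corpus.
START_DATE = date(1993, 1, 1)
MIN_TOKENS = 10


def keep(doc, start=START_DATE, min_tokens=MIN_TOKENS):
    """Return True if the document is recent enough and long enough."""
    # Whitespace splitting is a rough stand-in for real tokenization.
    return doc.issued >= start and len(doc.text.split()) >= min_tokens


def filter_documents(docs):
    """Keep only the documents that pass both the date and length checks."""
    return [d for d in docs if keep(d)]
```

Filtering before tokenization and parsing keeps the more expensive annotation steps from running on documents that would be discarded anyway.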

DSI Relevance: eJustice