Croatian Lemmatization Server – ELRC-SHARE


2 Last update: 2020-02-14

Croatian Lemmatization Server

http://hml.ffzg.hr/hml/info.php?show=hlp

Croatian Lemmatization Server is a unique web service for retrieving lexical entries from the Croatian Morphological Lexicon and using them in two computational linguistic processes:
1) generation of all Croatian word-forms (all cases in singular and plural for nouns, all persons and all tenses for verbs, all cases of all genders for adjectives, etc.)
2) analysis of all Croatian word-forms, i.e. converting them to a base form (lemma). For now, lemmatization is done at the unigram level, without reference to the left or right context. As a result, for each token, all lemmas it could belong to are retrieved.
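Context-free, unigram-level lemmatization can be pictured as a plain table lookup that returns every candidate lemma for a token. The following is only a minimal sketch under stated assumptions: the names `TOY_LEXICON` and `lemmatize_unigram` and the three-entry table are hypothetical illustrations, not the server's actual interface or data.

```python
# Hypothetical toy table mapping a word-form to all lemmas it may belong to.
# It stands in for the Croatian Morphological Lexicon for illustration only.
TOY_LEXICON = {
    "mora": {"more", "morati", "mora"},  # "sea" (gen. sg.), "must" (3rd sg.), "nightmare"
    "sam": {"biti", "sam"},              # "am" (from "to be"), "alone"
    "glavi": {"glava"},                  # "head" (dat./loc. sg.)
}


def lemmatize_unigram(token):
    """Return all candidate lemmas for one token, ignoring its context."""
    return sorted(TOY_LEXICON.get(token.lower(), set()))


def lemmatize_text(text):
    """Lemmatize each whitespace-separated token independently."""
    return [(token, lemmatize_unigram(token)) for token in text.split()]
```

Because no left or right context is consulted, an ambiguous token such as "mora" comes back with all three candidates, and choosing among them is left to the caller.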

Since Croatian is a highly inflected language, web-page retrieval using only the base word-form (lemma) or wildcard characters (e.g. glav* for glava) gives inadequate results. Croatian Lemmatization Server enables automatic generation of queries that cover all word-forms, and only the word-forms, of Croatian words, thus serving as a starting point for precise and thorough retrieval of Croatian web pages with Google.
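The difference between a pattern such as glav* and an exact list of word-forms can be sketched as follows. This is an illustrative example only: the paradigm table `TOY_FORMS` and the functions `build_or_query` and `wildcard_matches` are hypothetical, and the way the server itself formats its queries is not documented here. The sketch assumes a search engine that accepts quoted terms joined with `OR`.

```python
from fnmatch import fnmatch

# Hypothetical paradigm table: the distinct word-forms of the noun "glava".
TOY_FORMS = {
    "glava": ["glava", "glave", "glavi", "glavu", "glavo", "glavom", "glavama"],
}


def build_or_query(lemma, forms=TOY_FORMS):
    """Join all known word-forms of a lemma into one OR query string."""
    word_forms = forms.get(lemma, [lemma])
    unique = list(dict.fromkeys(word_forms))  # drop duplicates, keep order
    return " OR ".join(f'"{form}"' for form in unique)


def wildcard_matches(pattern, words):
    """Return the words that a shell-style wildcard pattern would match."""
    return [word for word in words if fnmatch(word, pattern)]
```

For instance, `wildcard_matches("glav*", ["glavom", "glavni", "glavica"])` keeps all three words, although "glavni" and "glavica" are different lexemes, whereas `build_or_query("glava")` lists only forms of "glava" itself.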

Distribution

Availability: Available

Licences

Non-standard/ Other Licence/ Terms

Distribution Details

Distribution Medium: Web Executable

Execution location: http://hml.ffzg.hr/h...

IPR Holders

Croatian Morphological Lexicon

Contact Person

toolService

Service (Lemmatization)

Language Dependent

Input

Media type: Text

Languages: Croatian (hr)

Resource Creation

Funding Project

Not Applicable (N/A)

Funding Type: Other

Metadata

Created: 26/03/2019

Last Updated: 26/03/2019

Metadata Language: English (en)
