Apache OpenNLP

3 Last update: 2019-06-06

https://opennlp.apache.org/

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.

The Apache OpenNLP library contains several components that together make it possible to build a full natural language processing pipeline: a sentence detector, tokenizer, name finder, document categorizer, part-of-speech tagger, chunker, parser, and coreference resolution. Each component includes parts for executing its natural language processing task, for training a model, and often also for evaluating a model. Each of these facilities is accessible through the library's application programming interface (API). In addition, a command line interface (CLI) is provided for convenient experimentation and training.
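To illustrate what chaining components into a pipeline means, here is a minimal sketch of the first two stages (sentence detection, then tokenization) in plain Java. It deliberately does not use OpenNLP classes: the class name PipelineSketch and the methods detectSentences and tokenize are hypothetical, and the rule-based logic (java.text.BreakIterator plus a regular expression) is only a stand-in for the trained models that OpenNLP components would load.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: these are NOT OpenNLP classes or methods.
class PipelineSketch {

    // Runs of word characters form one token; any other non-space character is its own token.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    // Stage 1: split raw text into sentences (rule-based stand-in for a trained sentence detector).
    static List<String> detectSentences(String text) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE; start = end, end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Stage 2: split one sentence into tokens (regex stand-in for a trained tokenizer).
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(sentence);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "OpenNLP splits text. It also tags tokens.";
        // The output of the sentence stage is the input of the token stage.
        for (String sentence : detectSentences(text)) {
            System.out.println(tokenize(sentence));
        }
    }
}
```

Later stages such as part-of-speech tagging or chunking would consume the token lists in the same way, which is why the components are typically run in the order listed above.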

Distribution

Availability: Available

Licences

Apache-2.0

Distribution Details

Download locations: https://github.com/a..., https://opennlp.apac...

Distribution Medium: Data Downloadable

IPR Holders

Apache Software Foundation

Contact Person

OpenNLP mailing lists

toolService

Suite Of Tools (Chunking, Co Reference Annotation, Language Identification, Named Entity Recognition, Parsing, PoS Tagging, Sentence Splitting, Tokenization)

Language Independent

Resource Creation

Funding Project

Not Applicable (N/A)

Funding Type: Other

Metadata

Created: 17/04/2019

Last Updated: 17/04/2019

Metadata Language: English (en)

Metadata Creator

Kanella Pouli
