Adrien Lardilleux and Yves Lepage. Sampling-based multilingual alignment. International Conference on Recent Advances in Natural Language Processing (RANLP 2009), Borovets, Bulgaria, September 2009.
Anymalign is a multilingual sub-sentential aligner. It extracts lexical equivalences from sentence-aligned parallel corpora. Its main advantage over similar tools is that it can align any number of languages simultaneously.

Characteristics:
- Truly multilingual: any number of languages can be aligned simultaneously.
- Fast: running time affects the coverage of the results, not their quality. The longer Anymalign runs, the more results it produces, and the program can be stopped at any time.
- Easy to use: a single command should suffice for most purposes, though it is still a command-line tool.
- Easy to parallelize: just run the very same command on several machines. Their results can be merged with a single command.
- Easy to integrate: simple one-file input and output formats, with no intermediary step.
- Portable: written in the Python programming language, which is available for most systems.
- Open source: released under the terms of the GPL.
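
To illustrate why results from independent runs can be combined cheaply, here is a minimal sketch in Python. It assumes, purely for illustration, that each run yields a table mapping an alignment (one phrase per language) to the number of times it was observed. The function name `merge_runs`, the in-memory representation, and the toy counts are all hypothetical; they do not reflect Anymalign's actual file format or its merge command.

```python
from collections import Counter


def merge_runs(*runs):
    """Sum per-alignment counts from independent runs.

    Each run is assumed (hypothetically) to be a mapping from a tuple of
    phrases, one per language, to an occurrence count.
    """
    merged = Counter()
    for run in runs:
        # Counter.update adds counts instead of overwriting existing ones.
        merged.update(run)
    return merged


# Two hypothetical runs over the same trilingual corpus (made-up counts).
run_a = {("house", "maison", "Haus"): 3, ("cat", "chat", "Katze"): 1}
run_b = {("house", "maison", "Haus"): 2, ("dog", "chien", "Hund"): 4}

merged = merge_runs(run_a, run_b)
```

In this simplified picture, merging is plain addition: alignments found by several runs are reinforced, and alignments found by only one run extend the coverage.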