MeMAD lidbox: spoken language identifier

MeMAD lidbox is a spoken language identifier by Mathias Lindgren of Aalto University, with a pipeline by Limecraft. It is used under the MIT licence.

Spoken language identifier for the languages fi, sv, fr, de, and en, plus x-nolang (which denotes that no language was detected). The pipeline works in two scenarios: 1) If the request contains only an audio file, the API splits the input audio into 2-second chunks and predicts the spoken language of each chunk. 2) If the request contains an audio file and a corresponding annotation/diarization JSON, the API returns the prediction results and reports the classification metrics.
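
As a rough illustration of the two scenarios, the sketch below computes 2-second chunk boundaries for an audio file of a given duration and scores a list of per-chunk predictions against reference labels. It is not the lidbox or Limecraft implementation: the function names `chunk_boundaries` and `accuracy` are hypothetical, the handling of a shorter final chunk is an assumption, and accuracy is only one example of a classification metric (the description does not say which metrics the API reports).

```python
import math
from typing import List, Sequence, Tuple

# Labels and chunk length as listed in the description above.
LABELS = ("fi", "sv", "fr", "de", "en", "x-nolang")
CHUNK_SECONDS = 2.0


def chunk_boundaries(duration: float,
                     chunk: float = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds that cover [0, duration).

    Assumption: a trailing remainder shorter than `chunk` is kept as its
    own, shorter chunk. The real pipeline may pad or drop it instead.
    """
    if duration < 0 or chunk <= 0:
        raise ValueError("duration must be >= 0 and chunk must be > 0")
    count = math.ceil(duration / chunk)
    return [(i * chunk, min((i + 1) * chunk, duration)) for i in range(count)]


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of chunks whose predicted label equals the reference label.

    Hypothetical stand-in for the metrics of scenario 2.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty sequence")
    unknown = set(predicted) | set(reference)
    unknown -= set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


if __name__ == "__main__":
    # Scenario 1: a 5-second file is split into three chunks.
    print(chunk_boundaries(5.0))
    # Scenario 2: compare made-up predictions with made-up reference labels.
    print(accuracy(["fi", "sv", "x-nolang"], ["fi", "en", "x-nolang"]))
```

The labels passed to `accuracy` in the usage lines are invented for the example and do not come from any real run of the identifier.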