awesome-new-languages-in-machine-translation

A list of initiatives for adding new languages to open-source machine translation models.

Single-language projects / Ainu

Model: https://huggingface.co/TwentyNine/nllb-jpn-ain-v1

Single-language projects / Amis

Model: https://huggingface.co/dylan-leddy/nllb-eng-ami-reverse

Single-language projects / Aromanian

Code and description: https://github.com/lolismek/AroTranslate
Paper: https://arxiv.org/abs/2410.17728
Interface: https://arotranslate.com/

Single-language projects / Awajun

Model: https://huggingface.co/hectordiazgomez/nllb-spa-awa-v2

Single-language projects / Bambara

Paper: https://aclanthology.org/2023.loresmt-1.9/

Single-language projects / Buryat

Press (in Russian): https://burunen.ru/news/society/107048
Interface: https://translate-bur.ru/
Model: https://huggingface.co/SaranaAbidueva/nllb-200-bxr-ru

Single-language projects / Circassian (Kabardian)

Interface: https://www.zedzek.com/en

Single-language projects / Erzya

Interface: https://lango.to/
Paper (for an old version): https://aclanthology.org/2022.fieldmatters-1.6/

Single-language projects / Fula

Model: https://huggingface.co/flutter-painter/nllb-fra-fuf-v2

Single-language projects / Interslavic

Model: https://huggingface.co/Salavat/nllb-200-distilled-600M-finetuned-isv_v2
Demo: https://huggingface.co/spaces/Salavat/Interslavic-Translator-NLLB200
Presentation: https://www.youtube.com/watch?v=BiNrza83Gvw

Single-language projects / Karakalpak

Interface: https://tahrirchi.uz/uz/translator

Single-language projects / Lezgian

Model: https://huggingface.co/leks-forever/nllb-200-distilled-600M
Code: https://github.com/leks-forever/nllb-tuning
Demo: https://huggingface.co/spaces/leks-forever/lezghian-nllb-200-distilled-600M
Description (in Russian): in a Telegram channel

Single-language projects / Mansi

GitHub: https://github.com/anyasarybaeva/Mansi-Legends

Single-language projects / Ngambay

Paper: https://aclanthology.org/2023.nlp4tia-1.6.pdf

Single-language projects / Qarachay Malqar

GitHub: https://github.com/TBSj/Qarachay_Malqar_translator
Model: https://huggingface.co/TSjB/NLLB-201-600M-QM-V1
Blog post (in Russian): https://habr.com/ru/articles/829248/

Single-language projects / Tyvan

Blog: https://cointegrated.medium.com/a37fc706b865
Interface: https://tyvan.ru/

Single-language projects / Zarma

Paper: https://arxiv.org/abs/2406.05888

Multilingual projects / Finno-Ugric languages (tartuNLP)

Paper (an early one): https://aclanthology.org/2022.wmt-1.33/
Paper: https://aclanthology.org/2023.nodalida-1.77.pdf
Interface: https://translate.ut.ee/
Model: https://huggingface.co/tartuNLP/smugri3_14-finno-ugric-nmt

Multilingual projects / Indigenous languages of the Americas (AmericasNLP Shared Tasks)

Paper: https://aclanthology.org/2023.americasnlp-1.19.pdf
Paper: https://aclanthology.org/2024.americasnlp-1.22.pdf
Paper: https://aclanthology.org/2024.americasnlp-1.2.pdf

Multilingual projects / Hundreds of diverse languages (Apertium)

Code: https://github.com/apertium
Interface (with only a subset of the most stable language pairs): https://www.apertium.org/