fastText_multilingual
Multilingual word embeddings
A repository providing aligned multilingual word vectors in 78 languages, using an SVD-based alignment method.
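The SVD-based alignment the repository describes can be sketched as an orthogonal Procrustes problem: given row-aligned dictionary vectors from a source and a target language, the orthogonal map minimizing the Frobenius distance between them is recovered from one SVD. A minimal sketch, assuming the paired vectors are stacked row-wise as NumPy arrays (all names here are illustrative, not the repository's API):

```python
import numpy as np

def align_embeddings(src, tgt):
    """Learn an orthogonal matrix W such that src @ W approximates tgt.

    Orthogonal Procrustes solution: if U S Vt = svd(src.T @ tgt),
    then W = U @ Vt minimizes ||src @ W - tgt|| over orthogonal W.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Usage: recover a known random rotation from paired "dictionary" vectors.
rng = np.random.default_rng(0)
src = rng.normal(size=(1000, 50))          # 1000 source-language vectors, dim 50
q, _ = np.linalg.qr(rng.normal(size=(50, 50)))  # random orthogonal "true" map
tgt = src @ q                               # target vectors are rotated sources
w = align_embeddings(src, tgt)
assert np.allclose(src @ w, tgt)            # learned map reproduces the rotation
```

Because W is orthogonal, it preserves dot products and cosine similarities within each language while aligning the two spaces.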
1k stars
55 watching
121 forks
Language: Jupyter Notebook
last commit: almost 2 years ago
Linked from 4 awesome lists
Topics: distributed-representations, machine-learning, machine-translation, natural-language-processing, nlp, word-vectors
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An implementation of a probabilistic FastText model for multi-sense word embeddings | 148 |
| | Generates Spanish word embeddings using fastText on large corpora | 9 |
| | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks | 2,216 |
| | Aligns word embeddings across multiple languages for cross-lingual text analysis and machine learning | 99 |
| | Large language models designed to perform well across multiple languages, addressing weaknesses of current multilingual models | 476 |
| | Multi-sense word embeddings learned from visual co-occurrences | 25 |
| | Provides access to pre-trained word embeddings for NLP tasks | 81 |
| | A collection of pre-trained subword embeddings in 275 languages for NLP tasks | 1,189 |
| | Demonstrates Indonesian word embeddings using pre-trained Word2vec models | 20 |
| | Provides pre-trained ELMo representations for multiple languages | 1,462 |
| | A collection of precomputed Spanish word embeddings derived from different corpora and computational methods | 354 |
| | A benchmarking framework and dataset for evaluating large language models on text-to-image tasks | 30 |
| | An open-source multilingual large language model for understanding and generating content across diverse languages and cultural contexts | 92 |
| | A Jekyll plugin enabling multi-language support and internationalization for blogs | 425 |
| | Pre-trained models and code for multilingual understanding and generation tasks | 89 |