fastText_multilingual
Multilingual word embeddings
A repository providing fastText word vectors for 78 languages, aligned into a single shared vector space using an SVD-based method.
Multilingual word vectors in 78 languages
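As a rough illustration of what SVD-based alignment can look like, the sketch below solves the orthogonal Procrustes problem with NumPy: given vectors for translation pairs in two languages, it finds the orthogonal matrix that best maps one set onto the other. This is a minimal sketch under stated assumptions, not the repository's own code. The function name `learn_alignment`, the variable names, and the synthetic data are all hypothetical, and the row normalization step is an assumed preprocessing choice.

```python
import numpy as np

def learn_alignment(source_vecs, target_vecs):
    """Return an orthogonal matrix W that minimizes ||src @ W - tgt||_F.

    Row i of source_vecs and row i of target_vecs are assumed to be the
    vectors of a translation pair (hypothetical input format).
    """
    # Normalize each row to unit length (an assumed preprocessing step).
    src = source_vecs / np.linalg.norm(source_vecs, axis=1, keepdims=True)
    tgt = target_vecs / np.linalg.norm(target_vecs, axis=1, keepdims=True)
    # Orthogonal Procrustes: take the SVD of the cross-covariance matrix
    # and discard the singular values.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Synthetic stand-in for a bilingual dictionary: the "target language" is
# the "source language" rotated by a known orthogonal matrix.
rng = np.random.default_rng(0)
dim, n_pairs = 5, 50
source_vecs = rng.normal(size=(n_pairs, dim))
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
target_vecs = source_vecs @ true_rotation

W = learn_alignment(source_vecs, target_vecs)
aligned = source_vecs @ W  # source vectors mapped into the target space
```

Because `W` is orthogonal, applying it preserves vector lengths and cosine similarities within the source language, so monolingual neighbor structure is unchanged after alignment.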
1k stars
55 watching
121 forks
Language: Jupyter Notebook
last commit: over 1 year ago
Linked from 4 awesome lists
Topics: distributed-representations, machine-learning, machine-translation, natural-language-processing, nlp, word-vectors
Related projects:
Repository | Description | Stars |
---|---|---|
benathi/multisense-prob-fasttext | An implementation of a probabilistic FastText model for multi-sense word embeddings | 149 |
botcenter/spanishwordembeddings | This project generates Spanish word embeddings using fastText on large corpora. | 9 |
kyubyong/wordvectors | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks | 2,215 |
talschuster/crosslingualcontextualemb | Enables alignment of word embeddings across multiple languages to facilitate cross-lingual text analysis and machine learning tasks | 98 |
eleutherai/polyglot | Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. | 475 |
bigredt/vico | Multi-sense word embeddings learned from visual co-occurrences | 25 |
juliatext/embeddings.jl | Provides access to pre-trained word embeddings for NLP tasks. | 81 |
bheinzerling/bpemb | A collection of pre-trained subword embeddings in 275 languages, useful for natural language processing tasks. | 1,184 |
galuhsahid/indonesian-word-embedding | Demonstrates word embeddings for the Indonesian language using pre-trained Word2vec models | 20 |
hit-scir/elmoformanylangs | Provides pre-trained ELMo representations for multiple languages to improve NLP tasks. | 1,463 |
dccuchile/spanish-word-embeddings | A collection of precomputed word embeddings for the Spanish language, derived from different corpora and computational methods. | 356 |
uw-madison-lee-lab/cobsat | Provides a benchmarking framework and dataset for evaluating the performance of large language models in text-to-image tasks | 28 |
neulab/pangea | An open-source multilingual large language model designed to understand and generate content across diverse languages and cultural contexts | 91 |
untra/polyglot | A plugin for Jekyll blogs that enables support for multiple languages and internationalization. | 417 |
microsoft/unicoder | This repository provides pre-trained models and code for understanding and generation tasks in multiple languages. | 88 |