aravec

Arabic embeddings

Provides pre-trained word embedding models for Arabic text analysis

AraVec is an open-source project that provides pre-trained distributed word representations (word embeddings). Its aim is to give the Arabic NLP research community free-to-use, powerful word embedding models.
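To make concrete what a word embedding model is used for, here is a minimal sketch of a nearest-neighbour lookup by cosine similarity. It is an illustration only: the three-dimensional vectors, the `toy_vectors` dictionary, and the helper names `cosine_similarity` and `most_similar` are all invented for this example and are not taken from AraVec. A real pre-trained model would normally be loaded through a library such as gensim, which exposes comparable similarity queries.

```python
import math

# Toy vectors, invented for illustration only (hypothetical values).
# Real pre-trained models use many more dimensions per word.
toy_vectors = {
    "ملك": [0.9, 0.1, 0.3],     # "king"
    "ملكة": [0.85, 0.2, 0.35],  # "queen"
    "تفاح": [0.1, 0.9, 0.2],    # "apple"
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("ملك", toy_vectors)
for other, score in ranking:
    print(other, round(score, 3))
```

With these made-up values, the word for "queen" ranks above the word for "apple" when querying "king", which is the kind of semantic closeness that embedding models are meant to capture.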

GitHub

394 stars
32 watching
78 forks
Language: Jupyter Notebook
last commit: over 3 years ago
Topics: arabic, embedded-models, gensim, nlp, text-mining, word2vec

Related projects:

Repository | Description | Stars
dfki-interactive-machine-learning/arasif | Provides sentence embeddings for Arabic languages using pre-trained word embeddings and Smooth Inverse Frequency algorithm | 5
alexandres/lexvec | An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks | 803
wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia text data | 940
galuhsahid/indonesian-word-embedding | Demonstrates word embedding in Indonesian language using pre-trained Word2vec models | 20
tca19/dict2vec | A framework to learn word embeddings using lexical dictionaries | 115
satwikkottur/visualword2vec | Learning word embeddings from abstract images to improve language understanding | 19
vefstathiou/so_word2vec | This is a word embedding model trained on Stack Overflow posts for use in natural language processing tasks | 40
botcenter/spanish-sent2vec | This project trains a machine learning model to generate sentence embeddings from Spanish text data using the sent2vec algorithm | 4
auspicious3000/contentvec | An implementation of a self-supervised speech representation model using PyTorch and disentangled speaker embeddings | 467
alexrutherford/arabic_nlp | Tools for normalizing and deriving sentiment from Arabic text | 26
hassygo/charngram2vec | A repository providing a re-implementation of character n-gram embeddings for pre-training in natural language processing tasks | 23
hit-scir/elmoformanylangs | Provides pre-trained ELMo representations for multiple languages to improve NLP tasks | 1,463
cod3licious/conec | A library for training and evaluating a type of word embedding model that extends the original Word2Vec algorithm | 20
artetxem/vecmap | An implementation of cross-lingual word embedding mappings using unsupervised learning methods | 645
botcenter/spanishwordembeddings | This project generates Spanish word embeddings using fastText on large corpora | 9