word2vec-slim
Word model reducer
Slims down a large pre-trained word2vec model to reduce size and improve loading time
word2vec Google News model slimmed down to 300k English words
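The slimming idea can be sketched roughly as follows: keep only the vectors whose words appear in a chosen vocabulary, then write the result in the plain-text word2vec format (a `count dim` header line followed by one word per line). This is a minimal illustration, not the repository's actual script. The function names `slim_vectors` and `write_word2vec_text`, the toy vectors, and the keep-list are all hypothetical; how the real project selects its 300k words is not described here.

```python
import io


def slim_vectors(vectors, keep_words):
    """Return a new dict holding only the words present in keep_words.

    `vectors` maps word -> list of floats (hypothetical in-memory layout).
    """
    keep = set(keep_words)
    return {word: vec for word, vec in vectors.items() if word in keep}


def write_word2vec_text(vectors, stream):
    """Write vectors in the plain-text word2vec format.

    First line: "<vocab_size> <vector_size>", then "<word> <v1> <v2> ..." per line.
    """
    dim = len(next(iter(vectors.values()))) if vectors else 0
    stream.write(f"{len(vectors)} {dim}\n")
    for word, vec in vectors.items():
        stream.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")


# Toy data for illustration only (real models have 300 dimensions).
toy_vectors = {
    "king": [0.1, 0.2, 0.3],
    "queen": [0.2, 0.1, 0.4],
    "xqzv": [0.9, 0.9, 0.9],  # stands in for a rare token to be dropped
}
toy_keep_list = ["king", "queen", "apple"]  # hypothetical vocabulary

slimmed = slim_vectors(toy_vectors, toy_keep_list)

if __name__ == "__main__":
    buffer = io.StringIO()
    write_word2vec_text(slimmed, buffer)
    print(buffer.getvalue(), end="")
```

Because the file size and load time of a word2vec model scale with the number of stored words, dropping words outside the keep-list is what shrinks the model; the vectors that remain are unchanged.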
212 stars
11 watching
38 forks
Language: Python
last commit: over 7 years ago
Related projects:
Repository | Description | Stars |
---|---|---|
mmihaltz/word2vec-googlenews-vectors | A repository hosting a pre-trained word vector model (3 million 300-dimensional English word vectors) trained on the Google News corpus. | 516 |
tca19/dict2vec | A framework to learn word embeddings using lexical dictionaries | 115 |
vefstathiou/so_word2vec | A word embedding model trained on Stack Overflow posts for use in natural language processing tasks. | 40 |
fanglanting/skip-gram-pytorch | A PyTorch implementation of the skip-gram model for learning word embeddings. | 188 |
refefer/word2vec-scala | A Scala implementation of the word2vec model representation. | 11 |
danieldk/go2vec | A package for reading and analyzing word embeddings from the word2vec format in Go. | 56 |
vyraun/half-size | An algorithm to reduce word embeddings to a specified size while maintaining performance | 128 |
wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia text data. | 940 |
auspicious3000/contentvec | An implementation of a self-supervised speech representation model using PyTorch and disentangled speaker embeddings | 468 |
alexandres/lexvec | An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks | 803 |
cod3licious/conec | A library for training and evaluating a type of word embedding model that extends the original Word2Vec algorithm | 20 |
wooorm/stmr.c | A C implementation of a stemming algorithm to reduce words to their base form | 39 |
seomoz/word2gauss | This implementation provides a way to represent words as multivariate Gaussian distributions, allowing scalable word embeddings. | 190 |
equinor/segyio | A fast Python library for reading and writing seismic data formats | 490 |
kyubyong/wordvectors | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks | 2,215 |