word2vec-slim

Word model reducer

Slims down a large pre-trained word2vec model to reduce its size and speed up loading

word2vec Google News model slimmed down to 300k English words
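
The idea of slimming a model can be illustrated with a minimal sketch. It is not the repository's actual script. It assumes the plain-text variant of the word2vec format (a header line with the vocabulary size and dimension, then one word and its vector per line), and the function name `slim_word2vec_text`, the toy vectors, and the whitelist are all hypothetical.

```python
from typing import Iterable, List, Set


def slim_word2vec_text(lines: Iterable[str], keep: Set[str]) -> List[str]:
    """Keep only the rows whose word is in `keep` and rewrite the header.

    Assumes word2vec text format: the first line is "<vocab_size> <dim>",
    and each following line is "<word> <v1> <v2> ...".
    """
    it = iter(lines)
    header = next(it)
    _, dim = header.split()  # the old vocabulary size is discarded

    kept = []
    for line in it:
        # The word is everything before the first space on the line.
        word, _, _ = line.partition(" ")
        if word in keep:
            kept.append(line.rstrip("\n"))

    # The new header reflects the reduced vocabulary size.
    return [f"{len(kept)} {dim}"] + kept


# Toy data (hypothetical): four words with 3-dimensional vectors.
toy_model = [
    "4 3",
    "king 0.1 0.2 0.3",
    "queen 0.2 0.1 0.4",
    "xqzv 0.9 0.9 0.9",
    "apple 0.5 0.6 0.7",
]
slimmed = slim_word2vec_text(toy_model, {"king", "queen", "apple"})
print("\n".join(slimmed))
```

Because the function streams over the input line by line, only the retained rows are held in memory, which matters when the source model has millions of entries.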

GitHub

212 stars
11 watching
38 forks
Language: Python
Last commit: over 7 years ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| mmihaltz/word2vec-googlenews-vectors | A repository hosting a pre-trained word vector model (3 million 300-dimension English word vectors) from the Google News corpus. | 516 |
| tca19/dict2vec | A framework to learn word embeddings using lexical dictionaries. | 115 |
| vefstathiou/so_word2vec | A word embedding model trained on Stack Overflow posts for use in natural language processing tasks. | 40 |
| fanglanting/skip-gram-pytorch | A PyTorch implementation of the skip-gram model for learning word embeddings. | 188 |
| refefer/word2vec-scala | A Scala implementation of the word2vec model representation. | 11 |
| danieldk/go2vec | A package for reading and analyzing word embeddings from the word2vec format in Go. | 56 |
| vyraun/half-size | An algorithm to reduce word embeddings to a specified size while maintaining performance. | 128 |
| wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia text data. | 940 |
| auspicious3000/contentvec | An implementation of a self-supervised speech representation model using PyTorch and disentangled speaker embeddings. | 468 |
| alexandres/lexvec | An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks. | 803 |
| cod3licious/conec | A library for training and evaluating a type of word embedding model that extends the original Word2Vec algorithm. | 20 |
| wooorm/stmr.c | A C implementation of a stemming algorithm to reduce words to their base form. | 39 |
| seomoz/word2gauss | An implementation that represents words as multivariate Gaussian distributions, allowing scalable word embeddings. | 190 |
| equinor/segyio | A fast Python library for reading and writing seismic data formats. | 490 |
| kyubyong/wordvectors | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks. | 2,215 |