Top2Vec

Topic modeling library

A Python library that provides a deep learning-based approach to topic modeling and semantic search by jointly embedding topics, documents, and words.

Top2Vec learns jointly embedded topic, document and word vectors.
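
The idea of a joint embedding can be illustrated with a small sketch. This is not the Top2Vec API: the 2-D vectors, the cluster assignments, and the helper names (`topic_vectors`, `nearest_words`, `search_documents`) are all invented for illustration. The library itself learns the vectors from text and finds dense document clusters automatically. The sketch only shows why a shared space is useful: a topic vector can be taken as the centroid of a document cluster, and the words or documents closest to any vector can be read off by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Toy vectors (hypothetical): words and documents live in the same space.
word_vectors = {
    "goal": [0.9, 0.1],
    "match": [0.8, 0.2],
    "election": [0.1, 0.9],
    "senate": [0.2, 0.8],
}
doc_vectors = {
    "doc_sports_1": [0.85, 0.15],
    "doc_sports_2": [0.95, 0.05],
    "doc_politics_1": [0.15, 0.85],
    "doc_politics_2": [0.05, 0.95],
}
# Cluster membership is assumed to be given here (hypothetical assignments).
clusters = {
    "topic_0": ["doc_sports_1", "doc_sports_2"],
    "topic_1": ["doc_politics_1", "doc_politics_2"],
}

def topic_vectors(doc_vectors, clusters):
    """One topic vector per cluster: the centroid of its document vectors."""
    return {
        topic: centroid([doc_vectors[d] for d in docs])
        for topic, docs in clusters.items()
    }

def nearest_words(vector, word_vectors, k=2):
    """The k words whose vectors are most similar to the given vector."""
    ranked = sorted(word_vectors, key=lambda w: cosine(vector, word_vectors[w]), reverse=True)
    return ranked[:k]

def search_documents(query_word, word_vectors, doc_vectors, k=1):
    """Semantic search: the k documents closest to the query word's vector."""
    query = word_vectors[query_word]
    ranked = sorted(doc_vectors, key=lambda d: cosine(query, doc_vectors[d]), reverse=True)
    return ranked[:k]

topics = topic_vectors(doc_vectors, clusters)
topic_words = {t: nearest_words(v, word_vectors) for t, v in topics.items()}
hits = search_documents("senate", word_vectors, doc_vectors)
```

Because topics, documents, and words share one space, the same similarity function serves both topic description (`nearest_words`) and search (`search_documents`).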

GitHub

3k stars
37 watching
373 forks
Language: Python
last commit: 7 days ago
Topics: bert, document-embedding, pre-trained-language-models, semantic-search, sentence-encoder, sentence-transformers, text-search, text-semantic-similarity, top2vec, topic-modeling, topic-modelling, topic-search, topic-vector, word-embeddings

Related projects:

Repository | Description | Stars
tca19/dict2vec | A framework to learn word embeddings using lexical dictionaries | 115
wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia text data | 940
vefstathiou/so_word2vec | A word embedding model trained on Stack Overflow posts for use in natural language processing tasks | 40
materialsintelligence/mat2vec | Unsupervised word embeddings capture latent knowledge from materials science literature | 619
dselivanov/text2vec | An R package providing efficient tools for text analysis and natural language processing tasks | 853
cemoody/lda2vec | A framework for creating interpretable natural language models by combining word embeddings and topic modeling | 3,149
zhezhaoa/ngram2vec | A toolkit for learning high-quality word and text representations from ngram co-occurrence statistics | 846
danieldk/go2vec | A package for reading and analyzing word embeddings from the word2vec format in Go | 56
benedekrozemberczki/diff2vec | A reference implementation of Diffusion2Vec, a method for learning node embeddings from graph data | 126
dalinvip/cw2vec | A software framework for learning Chinese word embeddings with stroke n-gram information | 274
tmikolov/word2vec | A tool for training word vectors using distributed neural network architectures | 1,527
benedekrozemberczki/graph2vec | A parallel implementation of graph representations using distributed learning techniques | 902
explosion/sense2vec | A Python library that generates contextually-keyed word vectors from text data using a variation of the Word2Vec algorithm | 1,625
kostyaev/sentence2vec | A tool for creating deep sentence embeddings using Sequence-to-Sequence learning | 22
inejc/paragraph-vectors | A PyTorch implementation of a model for generating dense vector representations of paragraphs from text data | 412