gensim

Topic modeling library

A Python library for topic modeling and document analysis with large corpora, providing efficient algorithms and easy integration.

Topic Modelling for Humans

GitHub

16k stars
430 watching
4k forks
Language: Python
Last commit: 3 months ago
Linked from 5 awesome lists

data-mining, data-science, document-similarity, fasttext, gensim, information-retrieval, machine-learning, natural-language-processing, neural-network, nlp, python, topic-modeling, word-embeddings, word-similarity, word2vec

Related projects:

Repository | Description | Stars
--- | --- | ---
piskvorky/gensim-data | A repository of pre-trained NLP models and corpora for text processing. | 988
rdspring1/pytorch_gbw_lm | Trains a large-scale PyTorch language model on the 1-Billion Word dataset | 123
gmftbygmftby/science-llm | A large-scale language model for scientific domain training on redpajama arXiv split | 122
gregversteeg/corex_topic | An implementation of an unsupervised topic modeling algorithm that leverages domain knowledge to generate informative topics from sparse count data. | 627
pysam-developers/pysam | A lightweight Python package for reading and manipulating genomics data in the SAM/BAM format. | 785
dongwookim-ml/python-topic-model | An implementation of various topic modeling algorithms in Python | 369
wse-research/loris-llm-generated-representations-of-sparql-queries | Generates natural language representations of SPARQL queries to facilitate understanding and mitigate errors in large knowledge graphs | 3
sergioburdisso/pyss3 | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336
nvlabs/edm | This project provides a set of tools and techniques to design and improve diffusion-based generative models. | 1,399
dswah/pygam | Provides an implementation of generalized additive models in Python for building flexible semi-parametric models | 875
shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591
drckf/paysage | An unsupervised learning and generative models library for Python, focusing on probabilistic models and efficient computation. | 119
pgularski/pysm | A versatile Python State Machine library for building flexible and scalable state-based systems | 73
ghiggi/gpm_api | Provides a Python interface to download and analyze GPM data from NASA's Precipitation Processing System | 60
thealgorithms/python | A collection of algorithm implementations in Python | 194,305