text2vec

Text analytics library

An R package providing efficient tools for text analysis and natural language processing tasks.

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

GitHub

853 stars
54 watching
136 forks
Language: R
Last commit: 3 months ago
Linked from 2 awesome lists

glove, latent-dirichlet-allocation, natural-language-processing, text-mining, topic-modeling, vectorization, word-embeddings, word2vec

Related projects:

Repository | Description | Stars
facebookresearch/vizseq | Analyzes text generation tasks and provides visual insights | 443
danieldk/go2vec | A package for reading and analyzing word embeddings from the word2vec format in Go. | 56
kostyaev/sentence2vec | This is a tool for creating deep sentence embeddings using Sequence-to-Sequence learning. | 22
josephwilk/rsemantic | A Ruby document vector search system with support for flexible matrix transforms | 146
tmikolov/word2vec | A tool for training word vectors using distributed neural network architectures | 1,527
satwikkottur/visualword2vec | Learning word embeddings from abstract images to improve language understanding | 19
vefstathiou/so_word2vec | This is a word embedding model trained on Stack Overflow posts for use in natural language processing tasks. | 40
bmschmidt/wordvectors | An R package for building and exploring word embedding models | 282
vanvalenlab/deepcell-tf | A deep learning library for analyzing biological images at the single-cell level | 424
indicodatasolutions/passage | A Python library for text analysis with recurrent neural networks. | 531
ucarlab/iasva | An R package that identifies hidden factors in data to uncover sources of variation | 8
dtm2451/dittoseq | A suite of functions for analyzing and visualizing RNA sequencing data | 190
cpsievert/ldavis | An R package for visualizing and exploring topic models from text data | 556
raduionescu/vlawe-boswe | Software providing representations for text classification based on word embeddings and clustering | 10
wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia text data. | 940