GloVe

Word Vector Library

Provides pre-trained word vector representations and an implementation of the GloVe model for learning word embeddings.

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings

GitHub

7k stars

229 watching

2k forks

Language: C

Last commit: 8 months ago

Linked from 2 awesome lists

Related projects:

Repository	Description	Stars
cemoody/lda2vec	A framework for creating interpretable natural language models by combining word embeddings and topic modeling.	3,152
plasticityai/magnitude	A fast and efficient package for using vector embeddings in machine learning models.	1,635
alexandres/lexvec	An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks.	803
embedding/chinese-word-vectors	Provides pre-trained vectors with various properties for downstream tasks in natural language processing.	11,874
jwieting/paragram-word	Trains word embeddings from a paraphrase database to represent semantic relationships between words.	30
ynqa/wego	An open-source Go library for learning and manipulating vector representations of words.	476
google/sentencepiece	An unsupervised text tokenizer that segments input text into subwords and detokenizes output based on a predefined vocabulary size.	10,366
jwieting/iclr2016	Code for training universal paraphrastic sentence embeddings and models on semantic similarity tasks.	193
stanfordnlp/stanza	A Python library for natural language processing tasks in many human languages.	7,315
princeton-nlp/simcse	An open source framework for learning sentence embeddings using contrastive learning.	3,457
piskvorky/gensim-data	A repository of pre-trained NLP models and corpora for text processing.	990
codertimo/bert-pytorch	An implementation of Google's 2018 BERT model in PyTorch, allowing pre-training and fine-tuning for natural language processing tasks.	6,251
bigscience-workshop/promptsource	A toolkit for creating and using natural language prompts to enable large language models to generalize to new tasks.	2,718
huggingface/tokenizers	A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages.	9,156
giuseppemarra/char-word-embeddings	An unsupervised approach to learning character-aware word and context embeddings.	0