DeCLUTR

Sentence embedding toolkit

A tool for training and evaluating sentence embeddings using deep contrastive learning

This repository contains the code accompanying our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

GitHub

379 stars
12 watching
33 forks
Language: Python
last commit: over 1 year ago
allennlp, contrastive-learning, metric-learning, natural-language-processing, pytorch, representation-learning, self-supervised-learning, semantic-search, semantic-text-similarity, sentence-embeddings, sentence-similarity, transformers

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| jwieting/para-nmt-50m | A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets. | 102 |
| xiaoqijiao/coling2018 | Provides training and testing code for a CNN-based sentence embedding model. | 2 |
| voidism/diffcse | An unsupervised contrastive learning framework for learning sentence embeddings sensitive to differences between original and edited sentences. | 291 |
| jwieting/iclr2016 | Code for training universal paraphrastic sentence embeddings and models on semantic similarity tasks. | 193 |
| zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041 |
| jwieting/acl2017 | A codebase for training and using models of sentence embeddings. | 33 |
| nlprinceton/text_embedding | A utility class for generating and evaluating document representations using word embeddings. | 54 |
| gink03/alt-i2v | An implementation of a deep learning-based image representation learning approach using a modified fully connected layer and transfer learning from VGG16. | 34 |
| malllabiisc/wordgcn | A deep learning model that generates word embeddings by predicting words based on their dependency context. | 290 |
| hit-scir/elmoformanylangs | Provides pre-trained ELMo representations for multiple languages to improve NLP tasks. | 1,463 |
| lajanugen/s2v | An implementation of a neural network model for learning efficient sentence representations from text data. | 205 |
| jwieting/charagram | A tool for training and using character n-gram based word and sentence embeddings in natural language processing. | 125 |
| fursovia/geometric_embedding | An implementation of a non-parameterized approach for building sentence representations. | 19 |
| antoine77340/mixture-of-embedding-experts | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks. | 118 |
| davidnemeskey/embert | Provides pre-trained transformer-based models and tools for natural language processing tasks. | 2 |