DeCLUTR

Sentence embedding toolkit

A tool for training and evaluating sentence embeddings using deep contrastive learning

This repository contains the code for our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

GitHub

380 stars

12 watching

33 forks

Language: Python

last commit: over 2 years ago

allennlp, contrastive-learning, metric-learning, natural-language-processing, pytorch, representation-learning, self-supervised-learning, semantic-search, semantic-text-similarity, sentence-embeddings, sentence-similarity, transformers


aclanthology.org/2021.acl-long.72/

Related projects:

Repository	Description	Stars
jwieting/para-nmt-50m	A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets.	102
xiaoqijiao/coling2018	Provides training and testing code for a CNN-based sentence embedding model	2
voidism/diffcse	An unsupervised contrastive learning framework for learning sentence embeddings sensitive to differences between original and edited sentences.	292
jwieting/iclr2016	Code for training universal paraphrastic sentence embeddings and models on semantic similarity tasks	193
zhanghang1989/pytorch-encoding	A Python framework for building deep learning models with optimized encoding layers and batch normalization.	2,044
jwieting/acl2017	A codebase for training and using models of sentence embeddings.	33
nlprinceton/text_embedding	A utility class for generating and evaluating document representations using word embeddings.	54
gink03/alt-i2v	An implementation of a deep learning-based image representation learning approach using a modified fully connected layer and transfer learning from VGG16	34
malllabiisc/wordgcn	A deep learning model that generates word embeddings by predicting words based on their dependency context	291
hit-scir/elmoformanylangs	Provides pre-trained ELMo representations for multiple languages to improve NLP tasks.	1,462
lajanugen/s2v	An implementation of a neural network model for learning efficient sentence representations from text data.	205
jwieting/charagram	A tool for training and using character n-gram based word and sentence embeddings in natural language processing.	125
fursovia/geometric_embedding	An implementation of a non-parameterized approach for building sentence representations	19
antoine77340/mixture-of-embedding-experts	An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks.	118
davidnemeskey/embert	Provides pre-trained transformer-based models and tools for natural language processing tasks	2