para-nmt-50m

Sentence embedding training toolkit

A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets.

Pre-trained models, code, and data for training and using the models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations".

102 stars

5 watching

21 forks

Language: Python

Last commit: over 1 year ago

Related projects:

Repository	Description	Stars
jwieting/acl2017	A codebase for training and using models of sentence embeddings.	33
jwieting/paragram-word	Trains word embeddings from a paraphrase database to represent semantic relationships between words.	30
jwieting/iclr2016	Code for training universal paraphrastic sentence embeddings and models on semantic similarity tasks	193
nlprinceton/text_embedding	A utility class for generating and evaluating document representations using word embeddings.	54
jwieting/charagram	A tool for training and using character n-gram based word and sentence embeddings in natural language processing.	125
johngiorgi/declutr	A tool for training and evaluating sentence embeddings using deep contrastive learning	380
xiaoqijiao/coling2018	Provides training and testing code for a CNN-based sentence embedding model	2
binwang28/sbert-wk-sentence-embedding	A method to generate sentence embeddings from pre-trained language models	178
antoine77340/howto100m	Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset	254
neulab/word-embeddings-for-nmt	An open source project that provides pre-trained word embeddings and a dataset for evaluating their usefulness in neural machine translation.	121
davidnemeskey/embert	Provides pre-trained transformer-based models and tools for natural language processing tasks	2
zhanghang1989/pytorch-encoding	A Python framework for building deep learning models with optimized encoding layers and batch normalization.	2,044
microsoft/mpnet	Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning.	288
zhuiyitechnology/pretrained-models	A collection of pre-trained language models for natural language processing tasks	989
huggingface/setfit	A framework for efficient few-shot learning with Sentence Transformers	2,267