para-nmt-50m
Sentence embedding training toolkit
A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets.
Pre-trained models, code, and data for training and using models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations".
102 stars
5 watching
21 forks
Language: Python
last commit: 12 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
jwieting/acl2017 | A codebase for training and using models of sentence embeddings. | 33 |
jwieting/paragram-word | Trains word embeddings from a paraphrase database to represent semantic relationships between words. | 30 |
jwieting/iclr2016 | Code for training universal paraphrastic sentence embeddings and models on semantic similarity tasks. | 193 |
nlprinceton/text_embedding | A utility class for generating and evaluating document representations using word embeddings. | 54 |
jwieting/charagram | A tool for training and using character n-gram based word and sentence embeddings in natural language processing. | 125 |
johngiorgi/declutr | A tool for training and evaluating sentence embeddings using deep contrastive learning. | 379 |
xiaoqijiao/coling2018 | Provides training and testing code for a CNN-based sentence embedding model. | 2 |
binwang28/sbert-wk-sentence-embedding | A method to generate sentence embeddings from pre-trained language models. | 177 |
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset. | 250 |
neulab/word-embeddings-for-nmt | An open source project that provides pre-trained word embeddings and a dataset for evaluating their usefulness in neural machine translation. | 121 |
davidnemeskey/embert | Provides pre-trained transformer-based models and tools for natural language processing tasks. | 2 |
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041 |
microsoft/mpnet | Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning. | 288 |
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks. | 987 |
huggingface/setfit | A framework for efficient few-shot learning with Sentence Transformers. | 2,236 |
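A core idea behind paraphrastic sentence embeddings is that averaging word vectors over a sentence yields a surprisingly strong similarity representation. The sketch below illustrates that averaging approach in plain NumPy; the toy vocabulary and random vectors are illustrative only, not part of this repository's released models.

```python
import numpy as np

def embed_sentence(sentence, word_vectors, dim=300):
    """Average word vectors to form a sentence embedding
    (the simple averaging baseline behind paraphrastic models)."""
    vecs = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not vecs:
        return np.zeros(dim)  # out-of-vocabulary sentence -> zero vector
    return np.mean(vecs, axis=0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vocabulary with random vectors (illustrative placeholder for
# real pre-trained embeddings such as those released with this repo).
rng = np.random.default_rng(0)
vocab = {w: rng.standard_normal(300)
         for w in "the cat sat on mat a feline rested rug".split()}

s1 = embed_sentence("the cat sat on the mat", vocab)
s2 = embed_sentence("a feline rested on a rug", vocab)
print(s1.shape)               # (300,)
print(round(cosine(s1, s1)))  # a sentence is maximally similar to itself
```

With real pre-trained vectors, `cosine(s1, s2)` would score paraphrase pairs higher than unrelated pairs; with the random toy vectors here, only the self-similarity is meaningful.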