word-embeddings-for-nmt

NMT Dataset

An open-source project that provides pre-trained word embeddings and a dataset for evaluating their usefulness in neural machine translation.

Supplementary material for the NAACL 2018 paper "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?"

GitHub: 121 stars, 9 watching, 19 forks

Language: Python

Last commit: over 5 years ago

Related projects:

Repository	Description	Stars
jwieting/para-nmt-50m	A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets.	102
nlprinceton/text_embedding	A utility class for generating and evaluating document representations using word embeddings.	54
lmthang/nmt.matlab	Training software for neural machine translation systems using attention mechanisms and multi-layer encoder-decoder models.	105
jonsafari/nmt-list	A comprehensive catalog of various neural machine translation implementations using different deep learning frameworks.	359
novakat/nytk-nerkor-cars-ontonotespp	A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats.	1
davidnemeskey/embert	Provides pre-trained transformer-based models and tools for natural language processing tasks.	2
neulab/compare-mt	A tool for comparing the performance of different language generation systems.	467
namisan/mt-dnn	A PyTorch package implementing multi-task deep neural networks for natural language understanding.	2,238
harvardnlp/seq2seq-attn	An implementation of a sequence-to-sequence model with an attention mechanism, using LSTMs and character embeddings for neural machine translation.	1,263
embeddings-benchmark/mteb	Provides tools and benchmarks for evaluating text embedding models.	2,021
microsoft/neuronblocks	A toolkit for building and deploying neural network models for natural language processing tasks.	1,448
elbayadm/attn2d	A PyTorch implementation of 2D convolutional neural networks for sequence-to-sequence prediction in machine translation.	502
moses-smt/nplm	A toolkit for training neural network language models.	14
karthikncode/nlp-datasets	A curated list of Natural Language Processing datasets used to train and evaluate NLP models.	919
blackrockneurotech/npmk	A MATLAB-based toolkit for loading and processing data from Blackrock Microsystems' neuroscientific files.	46