word-embeddings-for-nmt
NMT Dataset
An open-source project that provides pre-trained word embeddings and a dataset for evaluating their usefulness in neural machine translation.
Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018
121 stars
9 watching
19 forks
Language: Python
last commit: almost 5 years ago
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets. | 102 |
| | A utility class for generating and evaluating document representations using word embeddings. | 54 |
| | Training software for neural machine translation systems using attention mechanisms and multi-layer encoder-decoder models. | 105 |
| | A comprehensive catalog of various neural machine translation implementations using different deep learning frameworks. | 359 |
| | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1 |
| | Provides pre-trained transformer-based models and tools for natural language processing tasks. | 2 |
| | A tool for comparing the performance of different language generation systems. | 467 |
| | A PyTorch package implementing multi-task deep neural networks for natural language understanding. | 2,238 |
| | An implementation of a sequence-to-sequence model with an attention mechanism, using LSTMs and character embeddings for neural machine translation. | 1,263 |
| | Provides tools and benchmarks for evaluating text embedding models. | 2,021 |
| | A toolkit for building and deploying neural network models for natural language processing tasks. | 1,448 |
| | A PyTorch implementation of 2D convolutional neural networks for sequence-to-sequence prediction in machine translation. | 502 |
| | A toolkit for training neural network language models. | 14 |
| | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
| | A MATLAB-based toolkit for loading and processing data from Blackrock Microsystems' neuroscientific files. | 46 |