word-embeddings-for-nmt
NMT Dataset
An open-source project that provides pre-trained word embeddings and a dataset for evaluating their usefulness in neural machine translation.
Supplementary material for the NAACL 2018 paper "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?"
121 stars
9 watching
19 forks
Language: Python
Last commit: over 5 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A collection of pre-trained models and code for training paraphrastic sentence embeddings from large machine translation datasets. | 102 |
| | A utility class for generating and evaluating document representations using word embeddings. | 54 |
| | Training software for neural machine translation systems using attention mechanisms and multi-layer encoder-decoder models. | 105 |
| | A comprehensive catalog of various neural machine translation implementations using different deep learning frameworks. | 359 |
| | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1 |
| | Provides pre-trained transformer-based models and tools for natural language processing tasks. | 2 |
| | A tool for comparing the performance of different language generation systems. | 467 |
| | A PyTorch package implementing multi-task deep neural networks for natural language understanding. | 2,238 |
| | An implementation of a sequence-to-sequence model for neural machine translation, using LSTMs with an attention mechanism and character embeddings. | 1,263 |
| | Provides tools and benchmarks for evaluating text embedding models. | 2,021 |
| | A toolkit for building and deploying neural network models for natural language processing tasks. | 1,448 |
| | A PyTorch implementation of 2D convolutional neural networks for sequence-to-sequence prediction in machine translation | 502 |
| | A toolkit for training neural network language models | 14 |
| | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
| | A MATLAB-based toolkit for loading and processing data from Blackrock Microsystems' neuroscientific files. | 46 |