bandit-nmt
NMT framework
A framework for integrating policy gradient methods into neural machine translation models and evaluating their performance under simulated human feedback.
136 stars
13 watching
26 forks
Language: Python
last commit: over 6 years ago
Related projects:
Repository | Description | Stars |
---|---|---|
jonsafari/nmt-list | A comprehensive catalog of various neural machine translation implementations using different deep learning frameworks. | 359 |
kai-yue/ntk-fed | A framework for federated learning that leverages the neural tangent kernel to address statistical heterogeneity in distributed machine learning. | 3 |
harvardnlp/seq2seq-attn | An implementation of a sequence-to-sequence model with an attention mechanism, using LSTMs and character embeddings for neural machine translation. | 1,260 |
ethanyanjiali/minchatgpt | This project demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2. | 213 |
mustafaturan/omnicat | A framework providing a generalized strategy holder for text classification. | 11 |
harshakokel/kigb | An open-source software framework that integrates human advice into gradient boosting decision trees for improved performance in machine learning tasks. | 8 |
kefirski/bytenet | A PyTorch implementation of a neural network model for machine translation. | 47 |
namisan/mt-dnn | A PyTorch package implementing multi-task deep neural networks for natural language understanding. | 2,238 |
taolei87/rcnn | An implementation of neural network components and optimization methods for text analysis, including rationales for neural predictions. | 355 |
benedekrozemberczki/m-nmf | An implementation of Community Preserving Network Embedding using deep learning and matrix factorization techniques. | 120 |
google-deepmind/meltingpot | A suite that assesses how well multi-agent reinforcement learning algorithms generalize to novel social situations. | 620 |
yaodongyu/tct | An approach to training and optimizing machine learning models in a decentralized setting by convexifying the optimization process. | 4 |
kkuette/tradzqai | An environment and framework for training reinforcement learning agents to make trading decisions on cryptocurrency markets. | 165 |
elbayadm/attn2d | A PyTorch implementation of 2D convolutional neural networks for sequence-to-sequence prediction in machine translation | 501 |
sjtu-marl/malib | A framework for parallel population-based reinforcement learning. | 497 |