electra

Language model training

A method for pre-training transformer networks to learn language representations from unlabeled text

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
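The title's "discriminators rather than generators" refers to replaced-token detection: a small generator proposes tokens for some positions, and the encoder is trained to tell, for every position, whether the token is original or replaced. The sketch below only illustrates how such labels could be built. The function name and arguments are hypothetical and are not taken from this repository, and the generator's samples are passed in by hand rather than produced by a model.

```python
from typing import List, Sequence, Tuple


def build_rtd_example(
    tokens: Sequence[str],
    masked_positions: Sequence[int],
    generator_samples: Sequence[str],
) -> Tuple[List[str], List[int]]:
    """Return (corrupted_tokens, labels) for replaced-token detection.

    Hypothetical helper for illustration only.
    labels[i] is 1 if position i now holds a different token than the
    original input, and 0 otherwise.
    """
    if len(masked_positions) != len(generator_samples):
        raise ValueError("need exactly one generator sample per masked position")

    # Substitute the generator's proposals at the selected positions.
    corrupted = list(tokens)
    for pos, sample in zip(masked_positions, generator_samples):
        corrupted[pos] = sample

    # A position where the generator happened to propose the original
    # token is labeled 0 (original), not 1 (replaced).
    labels = [int(new != old) for new, old in zip(corrupted, tokens)]
    return corrupted, labels


tokens = ["the", "chef", "cooked", "the", "meal"]
corrupted, labels = build_rtd_example(tokens, [0, 2], ["the", "ate"])
print(corrupted)
print(labels)
```

Because every position receives a label, the training signal covers the whole input rather than only the positions that were selected for replacement.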

GitHub

2k stars
59 watching
355 forks
Language: Python
last commit: 8 months ago
Topics: deep-learning, nlp, tensorflow

Related projects:

Repository | Description | Stars
cluebenchmark/electra | Trains and evaluates a Chinese language model using adversarial training on a large corpus | 140
ymcui/chinese-electra | Provides pre-trained Chinese language models based on the ELECTRA framework for natural language processing tasks | 1,403
openai/finetune-transformer-lm | This project provides code and model for improving language understanding through generative pre-training using a transformer-based architecture | 2,160
codertimo/bert-pytorch | An implementation of Google's 2018 BERT model in PyTorch, allowing pre-training and fine-tuning for natural language processing tasks | 6,222
gram-ai/radio-transformer-networks | An implementation of a machine learning-based communications system using deep learning techniques | 127
minimaxir/textgenrnn | A Python module for creating character-level or word-level neural networks for text generation and training on various datasets | 4,943
eleutherai/pythia | Analyzing knowledge development and evolution in large language models during training | 2,280
maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574
lucidrains/imagen-pytorch | Implements Google's Text-to-Image Neural Network in PyTorch using a cascading DDPM architecture with dynamic clipping and noise level conditioning | 8,088
dair-ai/ml-papers-explained | An explanation of key concepts and advancements in the field of Machine Learning | 7,315
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations | 6,941
lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning | 3,166
labmlai/annotated_deep_learning_paper_implementations | Implementations of various deep learning algorithms and techniques with accompanying documentation | 56,215
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects | 135,022
nrel/sup3r | Creates synthetic high-resolution spatiotemporal data for renewable energy resources using generative adversarial networks | 87