TaCL

Token representation refinement

Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations.

[NAACL'22] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
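The idea behind TaCL can be sketched as a token-level contrastive (InfoNCE-style) objective: each token representation from the model being trained is pulled toward the corresponding token representation from a frozen teacher copy, and pushed away from the other tokens in the sequence, which encourages the isotropic, discriminative distribution described above. The sketch below is illustrative only; function names, shapes, and the NumPy setting are assumptions, not the repository's actual API.

```python
import numpy as np

def token_contrastive_loss(student, teacher, tau=0.07):
    """Illustrative token-level InfoNCE loss.

    student, teacher: (seq_len, dim) token representations. Each student
    token s_i treats its teacher counterpart t_i as the positive and all
    other teacher tokens t_j (j != i) in the sequence as negatives.
    """
    # L2-normalise so dot products become cosine similarities
    s = student / np.linalg.norm(student, axis=-1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=-1, keepdims=True)
    sims = (s @ t.T) / tau                       # (seq_len, seq_len)
    sims -= sims.max(axis=-1, keepdims=True)     # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=-1, keepdims=True))
    # positives sit on the diagonal of the similarity matrix
    return -float(np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
student = rng.normal(size=(8, 16))   # hypothetical 8-token sequence
teacher = student.copy()             # frozen teacher outputs (assumed)
loss = token_contrastive_loss(student, teacher)
print(loss)
```

With a perfectly matched teacher the loss is small but non-zero, since the negatives still contribute to the softmax denominator; during real pre-training this term is combined with the usual masked-language-modelling loss.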

GitHub

92 stars
5 watching
6 forks
Language: Python
last commit: over 2 years ago
Topics: bert, contrastive-learning, language-model, ner, nlp, pretraining, text-classification

Related projects:

Repository | Description | Stars
ymcui/pert | Develops a pre-trained language model to learn semantic knowledge from permuted text without mask labels | 354
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 645
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models | 804
dinghanshen/swem | Implements word embedding-based models for text classification tasks and provides pre-trained embeddings and evaluation scripts | 284
pku-yuangroup/languagebind | Extends pre-trained models to multiple modalities by aligning language and video representations | 723
ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks | 202
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243
tal-tech/edu-bert | A pre-trained language model designed to improve natural language processing tasks in education | 187
proycon/python-ucto | A Python binding to an advanced, extensible tokeniser written in C++ | 29
yinwenpeng/scitail | Reproducible code and pre-trained model for an ACL 2018 paper on textual entailment via deep exploration of inter-sentence interactions | 16
sinovation/zen | A pre-trained BERT-based Chinese text encoder with enhanced N-gram representations | 643
tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 827