TaCL

Token representation refinement

Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations.

[NAACL'22] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
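The idea behind TaCL can be sketched as a token-level contrastive (InfoNCE-style) objective: each token representation from the model being trained is pulled toward the corresponding token representation from a frozen teacher copy, and pushed away from the other tokens in the sequence, which encourages the isotropic, discriminative distribution described above. The sketch below is illustrative only; function names, shapes, and the NumPy setting are assumptions, not the repository's actual API.

```python
import numpy as np

def token_contrastive_loss(student, teacher, tau=0.07):
    """Illustrative token-level InfoNCE loss.

    student, teacher: (seq_len, dim) token representations. Each student
    token s_i treats its teacher counterpart t_i as the positive and all
    other teacher tokens t_j (j != i) in the sequence as negatives.
    """
    # L2-normalise so dot products become cosine similarities
    s = student / np.linalg.norm(student, axis=-1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=-1, keepdims=True)
    sims = (s @ t.T) / tau                       # (seq_len, seq_len)
    sims -= sims.max(axis=-1, keepdims=True)     # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=-1, keepdims=True))
    # positives sit on the diagonal of the similarity matrix
    return -float(np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
student = rng.normal(size=(8, 16))   # hypothetical 8-token sequence
teacher = student.copy()             # frozen teacher outputs (assumed)
loss = token_contrastive_loss(student, teacher)
print(loss)
```

With a perfectly matched teacher the loss is small but non-zero, since the negatives still contribute to the softmax denominator; during real pre-training this term is combined with the usual masked-language-modelling loss.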

GitHub

92 stars
5 watching
6 forks
Language: Python
last commit: over 2 years ago
Topics: bert, contrastive-learning, language-model, ner, nlp, pretraining, text-classification

Related projects:

Repository | Description | Stars
ymcui/pert | Develops a pre-trained language model to learn semantic knowledge from permuted text without mask labels | 354
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 645
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models | 804
dinghanshen/swem | Implements word embedding-based models for text classification tasks and provides pre-trained embeddings and evaluation scripts | 284
pku-yuangroup/languagebind | Extends pre-trained models to multiple modalities by aligning language and video representations | 723
ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks | 202
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243
tal-tech/edu-bert | A pre-trained language model designed to improve natural language processing tasks in education | 187
proycon/python-ucto | A Python binding to an advanced, extensible tokeniser written in C++ | 29
yinwenpeng/scitail | Reproducible code and pre-trained model for an ACL 2018 paper on textual entailment via deep exploration of inter-sentence interactions | 16
sinovation/zen | A pre-trained BERT-based Chinese text encoder with enhanced N-gram representations | 643
tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 827