ChineseBert

Chinese character understanding model

A deep learning model that incorporates visual and phonetic features of Chinese characters to improve its ability to understand Chinese language nuances

Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"

GitHub

545 stars

6 watching

93 forks

Language: Python

last commit: about 2 years ago

Related projects:

Repository	Description	Stars
cluebenchmark/cluepretrainedmodels	Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models.	806
cluebenchmark/electra	Trains and evaluates a Chinese language model using adversarial training on a large corpus.	140
soloice/chinese-character-recognition	This project demonstrates how to build and train a convolutional neural network (CNN) to recognize Chinese characters.	200
ymcui/chinese-mobilebert	An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks.	81
ethan-yt/guwenbert	Pre-trained language model for classical Chinese texts using RoBERTa architecture	511
ymcui/macbert	Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks	646
sww9370/rocbert	A pre-trained Chinese language model designed to be robust against maliciously crafted texts	15
clue-ai/chatyuan-7b	An updated version of a large language model designed to improve performance on multiple tasks and datasets	13
yunwentechnology/unilm	This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese.	439
cluebenchmark/cluecorpus2020	A large-scale Chinese corpus for pre-training language models.	927
brightmart/xlnet_zh	Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks	230
clue-ai/chatyuan	Large language model for dialogue support in multiple languages	1,903
zhuiyitechnology/wobert	A Word-based Chinese BERT model trained on large-scale text data using pre-trained models as a foundation	460
taosir/cnn_handwritten_chinese_recognition	A Python-based web application that recognizes handwritten Chinese characters using a Convolutional Neural Network (CNN), allowing users to input text via an online writing board and receive recognition results.	511
shawn-ieitsystems/yuan-1.0	Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing	591