ChineseBert

Chinese character understanding model

A deep learning model that incorporates visual and phonetic features of Chinese characters to improve its ability to understand Chinese language nuances

Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"

GitHub

542 stars
6 watching
92 forks
Language: Python
last commit: over 1 year ago

Related projects:

Repository Description Stars
cluebenchmark/cluepretrainedmodels Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. 804
cluebenchmark/electra Trains and evaluates a Chinese language model using adversarial training on a large corpus. 140
soloice/chinese-character-recognition This project demonstrates how to build and train a convolutional neural network (CNN) to recognize Chinese characters. 200
ymcui/chinese-mobilebert An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks. 80
ethan-yt/guwenbert A pre-trained language model for classical Chinese based on RoBERTa and ancient literature. 506
ymcui/macbert Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks 645
sww9370/rocbert A pre-trained Chinese language model designed to be robust against maliciously crafted texts 15
clue-ai/chatyuan-7b An updated version of a large language model designed to improve performance on multiple tasks and datasets 13
yunwentechnology/unilm This project provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture. 438
cluebenchmark/cluecorpus2020 A large-scale pre-training corpus for Chinese language models 925
brightmart/xlnet_zh Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks 230
clue-ai/chatyuan Large language model for dialogue support in multiple languages 1,902
zhuiyitechnology/wobert A pre-trained Chinese language model that uses word embeddings and is designed to process Chinese text 458
taosir/cnn_handwritten_chinese_recognition A Python-based web application that recognizes handwritten Chinese characters using a Convolutional Neural Network (CNN), allowing users to input text via an online writing board and receive recognition results. 508
shawn-ieitsystems/yuan-1.0 Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing 591