ChineseBert
Chinese character understanding model
A deep learning model that incorporates visual (glyph) and phonetic (pinyin) features of Chinese characters to improve its understanding of Chinese text
Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"
545 stars
6 watching
93 forks
Language: Python
Last commit: over 1 year ago
Related projects:
Repository | Description | Stars |
---|---|---|
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. | 806 |
cluebenchmark/electra | Trains and evaluates a Chinese ELECTRA language model, pre-trained with replaced-token detection on a large corpus. | 140 |
soloice/chinese-character-recognition | This project demonstrates how to build and train a convolutional neural network (CNN) to recognize Chinese characters. | 200 |
ymcui/chinese-mobilebert | An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks. | 81 |
ethan-yt/guwenbert | Pre-trained language model for classical Chinese texts using RoBERTa architecture | 511 |
ymcui/macbert | Improves pre-trained Chinese language models by using a correction-style masked language modeling task to reduce the mismatch between pre-training and downstream fine-tuning | 646 |
sww9370/rocbert | A pre-trained Chinese language model designed to be robust against maliciously crafted texts | 15 |
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance on multiple tasks and datasets | 13 |
yunwentechnology/unilm | This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese. | 439 |
cluebenchmark/cluecorpus2020 | A large-scale Chinese corpus for pre-training language models. | 927 |
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,903 |
zhuiyitechnology/wobert | A word-based Chinese BERT model trained on large-scale text data, initialized from existing pre-trained models | 460 |
taosir/cnn_handwritten_chinese_recognition | A Python-based web application that recognizes handwritten Chinese characters using a Convolutional Neural Network (CNN), allowing users to input text via an online writing board and receive recognition results. | 511 |
shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591 |