ZEN
Chinese text encoder
A pre-trained BERT-based Chinese text encoder with enhanced N-gram representations
A BERT-based Chinese Text Encoder Enhanced by N-gram Representations
643 stars
22 watching
104 forks
Language: Python
last commit: over 2 years ago Related projects:
Repository | Description | Stars |
---|---|---|
zminghua/sentencoding | A software package providing tools to encode and process text data using a specific neural network architecture. | 16 |
soloice/chinese-character-recognition | This project demonstrates how to build and train a convolutional neural network (CNN) to recognize Chinese characters. | 200 |
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. | 804 |
yxuansu/tacl | Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations. | 92 |
taosir/cnn_handwritten_chinese_recognition | A Python-based web application that recognizes handwritten Chinese characters using a Convolutional Neural Network (CNN), allowing users to input text via an online writing board and receive recognition results. | 508 |
zhangxiann/skip-gram | A Python implementation of a neural network model for learning word embeddings from text data | 6 |
xujiajun/gotokenizer | A tokenizer based on dictionary and Bigram language models for text segmentation in Chinese | 21 |
lonepatient/nezha_chinese_pytorch | An implementation of a Chinese language model using PyTorch and transformer architecture. | 262 |
bootphon/phonemizer | Converts text to phonetic transcriptions in multiple languages using various backends and algorithms | 1,231 |
lxneng/xpinyin | A Python library for translating Chinese characters to pinyin | 823 |
sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 76 |
zhegan27/convsent | Trains an autoencoder to learn generic sentence representations using convolutional neural networks | 34 |
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
zhuiyitechnology/wobert | A pre-trained Chinese language model that uses word embeddings and is designed to process Chinese text | 458 |
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653 |