genius

Chinese segmenter

A Python library implementing a Conditional Random Field (CRF)-based segmenter for Chinese text processing

A Chinese segmenter based on CRF
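
CRF-based segmenters commonly treat word segmentation as per-character tagging: each character is labelled B (begin), M (middle), E (end) or S (single-character word), and words are read off the tag sequence. The sketch below illustrates only that final decoding step. It assumes the BMES scheme and hand-written tags; the function name bmes_to_words is hypothetical, and none of this is taken from genius's actual API or internals.

```python
from typing import List


def bmes_to_words(chars: str, tags: List[str]) -> List[str]:
    """Group characters into words from per-character B/M/E/S tags.

    Illustrative only: a real CRF model would predict `tags`;
    here they are supplied by hand.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[str] = []
    current = ""  # characters of the word currently being built
    for ch, tag in zip(chars, tags):
        if tag == "S":
            # Single-character word; flush any unfinished word first.
            if current:
                words.append(current)
                current = ""
            words.append(ch)
        elif tag == "B":
            # Start of a new word; flush any unfinished word first.
            if current:
                words.append(current)
            current = ch
        elif tag == "M":
            current += ch
        elif tag == "E":
            words.append(current + ch)
            current = ""
        else:
            raise ValueError(f"unknown tag: {tag!r}")

    # Tolerate a sequence that ends without a closing E tag.
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    print(bmes_to_words("我爱北京", ["S", "S", "B", "E"]))
```

In a trained segmenter the tag sequence would come from the model's most likely labelling of the sentence; only the grouping logic is shown here.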

GitHub

234 stars
26 watching
65 forks
Language: Python
last commit: almost 6 years ago
Linked from 2 awesome lists


Related projects:

| Repository | Description | Stars |
|---|---|---|
| xujiajun/gotokenizer | A tokenizer based on dictionary and Bigram language models for text segmentation in Chinese | 21 |
| muyang0320/tensorflow-deeplab-resnet-crf | An implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation | 18 |
| thunlp/thulac-python | An efficient Chinese lexical analyzer with morphological analysis capabilities | 2,023 |
| juntang-zhuang/shelfnet | An implementation of a lightweight semantic segmentation model with real-time performance capabilities | 252 |
| zhuiyitechnology/t5-pegasus | Chinese generation model based on T5 architecture, trained using PEGASUS method | 555 |
| cn/gb2260.py | A Python implementation of the Chinese administrative division codes | 126 |
| jiahuadong/fiss | Implementations of federated incremental semantic segmentation in PyTorch | 33 |
| taosir/cnn_handwritten_chinese_recognition | A Python-based web application that recognizes handwritten Chinese characters using a Convolutional Neural Network (CNN), allowing users to input text via an online writing board and receive recognition results | 508 |
| fangpenlin/loso | An implementation of a Chinese segmentation system using the Hidden Markov Model algorithm | 83 |
| sinovation/zen | A pre-trained BERT-based Chinese text encoder with enhanced N-gram representations | 643 |
| mrkiven/pyzh | Gathers and translates Python articles from Readthedocs into Chinese | 1,419 |
| ddddxxx/swiftyopencc | A Swift library for converting between Traditional and Simplified Chinese text | 200 |
| chaojie/comfyui-dynamicrafter | Dynamic content generation model trained on image and text data | 126 |
| chenjiandongx/github-spider | A Python-based web crawler for scraping GitHub user and repository data | 264 |
| fukuball/jieba-php | A PHP module for Chinese text segmentation and word breaking | 1,323 |