 jieba
 Chinese tokenizer
 A comprehensive Python library for Chinese text segmentation and word extraction.
Jieba Chinese word segmentation
33k stars
 1k watching
 7k forks
 
Language: Python 
Last commit: about 1 year ago
Linked from 5 awesome lists
 Related projects:
| Repository | Description | Stars | 
|---|---|---|
|  | An Android implementation of the Chinese word segmentation algorithm jieba, optimized for fast initialization and tokenization | 153 | 
|  | A PHP module for Chinese text segmentation and word breaking | 1,331 | 
|  | Provides a Ruby port of the popular Chinese language processing library Jieba | 8 | 
|  | A Google Colab setup for cracking hashes using multiple tools | 929 | 
|  | Automates real-time market data retrieval and storage from Huobi exchange, publishing updates to Redis for use in backtesting and analysis. | 39 | 
|  | A JavaScript library designed to simplify Georgian keyboard layout support | 57 | 
|  | Analyzes and prints useful information from IPA files used in iOS app development. | 10 | 
|  | Generates stubs for interfaces in code completion tools | 60 | 
|  | A tool to compress and remove unnecessary migration history from database schema | 1,499 | 
|  | Converts packet capture files to usable hashes for Hashcat or John the Ripper analysis. | 2,039 | 
|  | Optimizes internationalization text files by reducing bundle size through code substitution | 14 | 
|  | Generates G-Code files for laser cutting solder paste stencils in KiCAD PCBs. | 16 | 
|  | Analyzes and dumps memory to extract sensitive information from running processes | 582 | 
|  | An enhancement tool for Ghidra's binary analysis capabilities | 289 | 
|  | A tool to scan and discover Homematic devices on a network. | 6 |