JWE
Chinese word embedding trainer
JWE trains and evaluates embeddings for Chinese words, characters, and fine-grained subcharacter components.
Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components
99 stars
9 watching
33 forks
Language: C
Last commit: over 5 years ago
Related projects:
Repository | Description | Stars |
---|---|---|
dalinvip/cw2vec | A software framework for learning Chinese word embeddings with stroke n-gram information | 274 |
ray1007/gwe | A software implementation of a word embedding method using character glyphs, enhancing traditional Chinese language processing | 30 |
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. | 804 |
hkust-knowcomp/r-net | An implementation of R-Net, a machine reading comprehension model using TensorFlow. | 578 |
vzhong/embeddings | Provides fast and efficient word embeddings for natural language processing. | 223 |
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925 |
leonard-xu/cwe | Improves word embeddings by considering internal character structures in Chinese words | 299 |
cluebenchmark/electra | Trains and evaluates a Chinese ELECTRA language model on a large corpus. | 140 |
jwieting/charagram | A tool for training and using character n-gram based word and sentence embeddings in natural language processing. | 125 |
arleyguolei/wx-words-pk | A set of tools and components for building Chinese input methods, focusing on character prediction and suggestion algorithms. | 886 |
hanzhenlei767/nlp_learn | A comprehensive collection of NLP-related code snippets and notes on various models and techniques, including pre-trained language models and Chinese text processing methods. | 25 |
jwieting/paragram-word | Trains word embeddings from a paraphrase database to represent semantic relationships between words. | 30 |
malllabiisc/wordgcn | A deep learning model that generates word embeddings by predicting words based on their dependency context | 290 |
zhezhaoa/ngram2vec | A toolkit for learning high-quality word and text representations from ngram co-occurrence statistics | 846 |
hassygo/charngram2vec | A repository providing a re-implementation of character n-gram embeddings for pre-training in natural language processing tasks | 23 |