JWE

Chinese word embedding trainer

JWE trains and evaluates embeddings for Chinese words, characters, and fine-grained subcharacter components.

Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components

GitHub

99 stars

9 watching

33 forks

Language: C

Last commit: about 6 years ago

Related projects:

Repository	Description	Stars
dalinvip/cw2vec	A software framework for learning Chinese word embeddings with stroke n-gram information	274
ray1007/gwe	An implementation of a word embedding method that uses character glyphs to improve Traditional Chinese language processing	30
cluebenchmark/cluepretrainedmodels	Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models.	806
hkust-knowcomp/r-net	An implementation of R-NET, a machine reading comprehension model using scaled multiplicative attention and variational dropout.	578
vzhong/embeddings	Provides fast and efficient word embeddings for natural language processing.	223
cluebenchmark/cluecorpus2020	A large-scale Chinese corpus for pre-training language models.	927
leonard-xu/cwe	An approach that improves Chinese word embeddings by incorporating the internal character information of each word	299
cluebenchmark/electra	Trains and evaluates a Chinese language model using adversarial training on a large corpus.	140
jwieting/charagram	A tool for training and using character n-gram based word and sentence embeddings in natural language processing.	125
arleyguolei/wx-words-pk	A set of tools and components for building Chinese input methods, focusing on character prediction and suggestion algorithms.	895
jwieting/paragram-word	Trains word embeddings from a paraphrase database to represent semantic relationships between words.	30
malllabiisc/wordgcn	A deep learning model that generates word embeddings by predicting words based on their dependency context	291
zhezhaoa/ngram2vec	A toolkit for learning high-quality word and text representations from ngram co-occurrence statistics	848
hassygo/charngram2vec	A repository providing a re-implementation of character n-gram embeddings for pre-training in natural language processing tasks	23