Chinese-Word-Vectors
Word vectors
Provides pre-trained vectors with various properties for downstream tasks in natural language processing
100+ Chinese Word Vectors (hundreds of pre-trained Chinese word vectors)
12k stars
285 watching
2k forks
Language: Python
Last commit: about 1 year ago
Topics: chinese, chinese-word-segmentation, embedding, embeddings, vectors-trained, word-embeddings
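As a rough illustration of how pre-trained vectors like these are typically consumed downstream, the sketch below parses a word2vec-style text file and compares two words by cosine similarity. This page does not state the file format, so the layout (an optional "count dimension" header line, then one word and its values per line) is an assumption. The helper names `load_text_vectors` and `cosine` are hypothetical, and the tiny in-memory sample stands in for a real downloaded file.

```python
import io
import math


def load_text_vectors(handle):
    """Parse word2vec-style text (assumed format) into a dict of word -> list of floats."""
    vectors = {}
    for line_no, line in enumerate(handle):
        parts = line.split()
        # Assumption: a first line with exactly two fields is the "count dim" header.
        if line_no == 0 and len(parts) == 2:
            continue
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy data for illustration only; real files hold far more words and dimensions.
SAMPLE = "3 2\n猫 1.0 0.0\n狗 0.8 0.6\n车 0.0 1.0\n"

vectors = load_text_vectors(io.StringIO(SAMPLE))
similarity = cosine(vectors["猫"], vectors["狗"])
print(f"{len(vectors)} words loaded, similarity = {similarity:.2f}")
```

For a real file, the same function could be given `open(path, encoding="utf-8")` in place of the `io.StringIO` object. Full-size vocabularies are usually better served by a dedicated library than by plain Python lists.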
Related projects:
Repository | Description | Stars |
---|---|---|
dalinvip/cw2vec | A software framework for learning Chinese word embeddings with stroke n-gram information | 274 |
zhezhaoa/ngram2vec | A toolkit for learning high-quality word and text representations from ngram co-occurrence statistics | 846 |
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. | 804 |
vzhong/embeddings | Provides fast and efficient word embeddings for natural language processing. | 223 |
hkust-knowcomp/jwe | Trains and evaluates embeddings for Chinese words, characters, and fine-grained subcharacter components. | 99 |
chengyuegongr/frequency-agnostic | Improves word embeddings by using adversarial training to make them less dependent on word frequencies | 118 |
uhh-lt/sensegram | Tools and techniques for analyzing word meanings from word embeddings | 212 |
brightmart/text_classification | An NLP project offering various text classification models and techniques for deep learning exploration | 7,861 |
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925 |
hit-scir/chinese-mixtral-8x7b | An implementation of a large language model for Chinese text processing, focusing on MoE (Mixture-of-Experts) architecture and incorporating a vast vocabulary. | 641 |
plasticityai/magnitude | A fast and efficient utility package for utilizing vector embeddings in machine learning models | 1,627 |
kyubyong/wordvectors | Provides pre-trained word vectors for multiple languages to facilitate NLP tasks | 2,215 |
malllabiisc/wordgcn | A deep learning model that generates word embeddings by predicting words based on their dependency context | 290 |
xiaoqijiao/coling2018 | Provides training and testing code for a CNN-based sentence embedding model | 2 |
dccuchile/spanish-word-embeddings | A collection of precomputed word embeddings for the Spanish language, derived from different corpora and computational methods. | 356 |