GPT2-Chinese

GPT2 trainer

Training code for Chinese GPT-2 language models, using either a BERT tokenizer or a BPE model.

GitHub

7k stars
161 watching
2k forks
Language: Python
Last commit: 7 months ago
Tags: chinese, gpt-2, nlp, text-generation, transformer

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| idea-ccnl/fengshenbang-lm | A comprehensive, user-centered ecosystem of pre-trained NLP models for the Chinese language | 4,022 |
| ourongxing/chatgpt-vercel | An open-source chatbot project built on Solid.js and OpenAI's GPT technology, with features like PWA support and customizable prompts | 3,194 |
| mushan0x0/ai0x0.com | An AI-powered desktop application that enables users to query and generate text, images, audio, and video content across various applications | 3,719 |
| kaqijiang/auto-gpt-zh | An experimental application showcasing GPT-4's capabilities through automation and AI-driven workflows | 2,404 |
| doggy8088/learn-git-in-30-days | Teaches Git version control in 30 days through a tutorial and personal experience | 3,982 |
| imcaspar/gpt2-ml | A collection of pre-trained GPT2 models and training scripts for multiple languages, including Chinese | 1,716 |
| skyworkaigc/skytext-chinese-gpt3 | An AI-powered text generation model trained on Chinese data to perform various tasks such as conversation, translation, and content creation | 419 |
| lc1332/luotuo-chinese-llm | A large language model based on the Chinese LLaMA architecture, designed to support complex conversations and applications | 3,637 |
| xx-net/xx-net | A proxy tool designed to bypass internet censorship and restrictions by disguising traffic as ordinary network activity | 33,063 |
| thu-coai/cdial-gpt | A large-scale Chinese conversation dataset and pre-trained dialog models for text generation | 1,782 |
| memochou1993/gpt-ai-assistant | An AI-powered chat application using OpenAI and LINE APIs | 7,428 |
| amfe/article | A collection of articles and tutorials on mobile e-commerce front-end development, covering topics such as performance optimization, filtering, and dynamic updates | 7,587 |
| synlp/chimed-gpt | A Chinese medical large language model trained on extensive medical data to perform information extraction, question answering, and multi-round dialogue tasks | 74 |
| esbatmop/mnbvc | Collects and provides access to a vast corpus of Chinese text data from various sources | 3,520 |