GPT2-Chinese

GPT2 Trainer

A repository providing Chinese-language GPT2 training code that uses the BERT tokenizer.

7k stars
161 watching
2k forks
Language: Python
Last commit: 9 months ago
Topics: chinese, gpt-2, nlp, text-generation, transformer

Related projects:

Repository | Description | Stars
idea-ccnl/fengshenbang-lm | A comprehensive, user-centered ecosystem of pre-trained NLP models for the Chinese language | 4,049
ourongxing/chatgpt-vercel | An open-source chatbot project built on Solid.js and OpenAI's GPT technology, with features like PWA support and customizable prompts | 3,200
mushan0x0/ai0x0.com | An AI-powered desktop application that enables users to query and generate text, images, audio, and video content across various applications | 3,764
kaqijiang/auto-gpt-zh | An experimental application showcasing GPT-4's capabilities through automation and AI-driven workflows | 2,410
doggy8088/learn-git-in-30-days | Teaches Git version control in 30 days through a tutorial and personal experience | 4,004
imcaspar/gpt2-ml | A collection of pre-trained GPT2 models and training scripts for multiple languages, including Chinese | 1,717
skyworkaigc/skytext-chinese-gpt3 | An AI-powered text generation model trained on Chinese data to perform various tasks such as conversation, translation, and content creation | 418
lc1332/luotuo-chinese-llm | A large language model based on the Chinese LLaMA architecture, designed to support complex conversations and applications | 3,641
xx-net/xx-net | A proxy tool designed to bypass internet censorship and restrictions by disguising traffic as ordinary network activity | 33,104
thu-coai/cdial-gpt | A large-scale Chinese conversation dataset and pre-trained dialog models for text generation | 1,799
memochou1993/gpt-ai-assistant | An AI-powered chat application leveraging OpenAI models and LINE APIs for conversational interfaces | 7,491
amfe/article | A collection of articles and tutorials on mobile e-commerce front-end development, covering topics such as performance optimization, filtering, and dynamic updates | 7,583
synlp/chimed-gpt | A Chinese medical large language model trained on extensive medical data to perform information extraction, question answering, and multi-round dialogue tasks | 76
esbatmop/mnbvc | A massive corpus of Chinese text data covering various forms and styles | 3,581